Contemporary Ergonomics 2000
Edited by
P.T.McCabe, Centre for Human Sciences, DERA, Farnborough, UK
M.A.Hanson, Institute of Occupational Medicine, Edinburgh, UK
and
S.A.Robertson, Centre for Transport Studies, University College London, UK
The Ergonomics Society
First published 2000 by Taylor & Francis, 11 New Fetter Lane, London EC4P 4EE
Simultaneously published in the USA and Canada by Taylor & Francis Inc, 29 West 35th Street, New York, NY 10001
Taylor & Francis is an imprint of the Taylor & Francis Group
This edition published in the Taylor & Francis e-Library, 2005.
© 2000 Taylor & Francis except:
The cognitive cockpit: operational requirement & technical challenge, R.M.Taylor, H.Howells, D.Watson © British Crown Copyright 2000/DERA
Tasking interface manager: affording pilot control of adaptive automation and aiding, M.C.Bonner, R.M.Taylor, C.A.Miller © British Crown Copyright 2000/DERA
Adaptive automation: who has control? I.R.Craig, S.G.Russell, E.K.Flood © British Crown Copyright 2000/DERA
Influence of packing methods on musculoskeletal problems among brick packers, A.D.J.Pinder © British Crown Copyright 2000/HSL
The development of physical selection procedures for the British Army. Phase 3: validation, M.Rayson, H.Pynn, A.Rothwell, A.Nevill © British Crown Copyright 2000/MOD
The global implicit measure: evaluation of metrics for cockpit adaptation, M.Vidulich, G.McMillan © 2000 US Government
Gender differences in primary and secondary performance during simulated driving, N.M.H.Brook-Carter, T.C.Lansdown, T.M.Kersloot © 2000 Transport Research Laboratory
Road sign angularity, T.M.Kersloot, B.R.Cooper © 2000 Transport Research Laboratory
The British Crown Copyright papers detailed above are published with the permission of the Controller of Her Majesty’s Stationery Office.
Publisher’s note
This book is produced from camera-ready copy supplied by the editors.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.
Every effort has been made to ensure that the advice and information in this book is true and accurate at the time of going to press. However, neither the publisher nor the authors can accept any legal responsibility or liability for any errors or omissions that may be made. In the case of drug administration, any medical procedure or the use of technical equipment mentioned within this book, you are strongly advised to consult the manufacturer’s guidelines.
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging in Publication Data
A catalogue record for this book has been requested
ISBN 0-203-30536-1 Master e-book ISBN
ISBN 0-203-34381-6 (Adobe eReader Format)
ISBN 0-748-40958-0 (Print Edition)
CONTENTS
Preface
AIR TRAFFIC CONTROL
A HAZOP analysis of a future ATM system R.Kennedy, H.Jones, S.Shorrock, B.Kirwan
The future implementation of datalink technology: the controller-pilot perspective S.Harris, T.Lamoureux
Eye point-of-gaze, EEG and ECG measures of graphical/keyboard interfaces in simulated ATC H.David, F.Caloo, R.Mollard, P.Cabon, B.Farbos
Future system state prediction by novice and expert air traffic controllers D.Forrest, T.Lamoureux
Psychophysiological measures of adaptation to unfamiliar HMI in real-time ATC simulation H.David, R.Mollard, P.Cabon, B.Farbos
What the cognitive task analysts don’t tell you T.Lamoureux
ANTHROPOMETRY
Improving the usability of an anthropometric man-model program I.A.Ruiter
Anthropometric measurements in adolescents living at an intermediate altitude: the relationship between height, weight, head circumference and socioeconomic status M.D.Kaya, H.Yeşilyurt, B.Özkan, I.Çapoğlu, R.Akdağ
Relationship of upper limb postures to anthropometric variables L.W.O’Sullivan, T.J.Gallwey
COCKPIT DESIGN
Usability testing of a user interface for aircraft taxi guidance T.J.J.Bos, H.Kanis, A.J.C. de Reus, W.S.Green
The cognitive cockpit: operational requirement and technical challenge R.M.Taylor, H.Howells, D.Watson
Situation assessor support system: a knowledge-based systems approach to pilot aiding N.R.Shadbolt, J.Tennison, N.Milton, H.Howells
Cognition monitor: a system for real time pilot state assessment K.Pleydell-Pearce, B.Dickson, S.Whitecross
Tasking interface manager: affording pilot control of adaptive automation and aiding M.C.Bonner, R.M.Taylor, C.A.Miller
The global implicit measure: evaluation of metrics for cockpit adaptation M.Vidulich, G.McMillan
DRIVERS & DRIVING
Brave new world: the vehicle autopia of the 21st century? M.S.Young, N.A.Stanton
Gender differences in primary and secondary performance during simulated driving N.Brook-Carter, T.C.Lansdown, T.Kersloot
Using observation of one traffic violation to predict an immediate second violation T.Wilson, C.Arsenault
ERROR & SYSTEMS
Analysis of shift change in the aircraft maintenance environment: findings and recommendations A.K.Gramopadhye, K.Kelkar
Consistency in HRA and impacts on human factors analysis R.Kennedy, B.Kirwan, B.Summersgill, K.Rea
GENERAL ERGONOMICS
A pilot study exploring the design of roles based on manufacturing process knowledge C.E.Siemieniuch, M.A.Sinclair
Long days and short weeks—the benefits and disadvantages K.J.N.C.Rich
Ergonomics needs of smallholder farmers in Mozambique D.H.O’Neill, E.J.Fraqueza
Are profiling beds better? Evidence from users and records J.Mitchell, J.Bennington, N.Jones, J.McClenahan
Human factors associated with escape from side-floating helicopters D.W.Jamieson, S.R.K.Coleshaw, I.J.Armstrong, C.Sellar, D.Howson
Ergonomic evaluation of work and environmental stresses on technicians working in a multimedia chip manufacturing industry in Malaysia R.N.Sen, Y.-H.Quek
The development of physical selection procedures for the British army. Phase 3: validation M.P.Rayson, H.Pynn, A.Rothwell, A.Nevill
The ergonomic design of London Underground Limited’s incident reporting forms A.Whitlock, S.Layton, M.Sinclair-Williams, J.Parham
HCI & IT SYSTEMS
Research on cultural factors and interface metaphors in internet applications C.-H.Chen, C.-C.Hsu
Design and evaluation of a direct manipulation object for application in the postproduction special effects domain M.Hicks, J.Long, C.Borras
Older adults’ use of public technology M.Sheard, J.Noyes, T.Perfect
An assessment of the rationale & effectiveness of accelerator keys in computer applications C.C.H.Wong, K.Y.Lim
Can sound output enhance graphical computer interfaces? W.Morrissey, M.Zajicek
Adaptive automation: who has control? I.R.Craig, S.G.Russell, E.K.Flood
Research on Chinese computer users’ mental models in software interface design C.-H.Chen, C.-H.Chen
User requirements analysis for decision support systems: the question approach C.Parker
Why do IT systems fail to live up to expectations? a case study A.Bairsto, S.Harker
Development and trialling of user access to an information system for architects S.Meltzer, W.S.Green
Supporting universal access to information technology M.P.Zajicek, A.G.Arnold
Consumer acceptance of internet services M.Maguire
LEGISLATION
Ergonomics in Irish legislation V.Kelly
Public transport and the Disability Discrimination Act 1995 F.Bellerby
The DSE directive: what does it mean? N.Heaton, A.Baird
METHODOLOGY
How many participants: a simple statistic with some limitation H.Arisz, H.Kanis, M.J.Rooden
Psychophysical methods for quantifying opinions and preferences J.Engström, P.C.Burns
Using the web to support geographically dispersed, longitudinal usability evaluations M.Beard, C.Parker
The practice of triangulation I.S.MacLeod, L.Wells, K.Lane
MANUAL HANDLING
Work performed in three different modes of dynamic lifting A.D.J.Pinder, M.P.Rayson, D.W.Grieve
Teaching the neuromuscular approach to efficient handling and moving C.Donnelly
Influence of packing methods on musculoskeletal problems among brick packers A.D.J.Pinder
Use of human expertise in evaluating manual lifting tasks A.Genaidy, J.Beltran, A.Alhemoud, S.Yeung
Managing a manual handling risk assessment process J.Crowhurst, B.Catterall, G.Smyth
HANDS & WRISTS
The measurement of range of movement of the wrist: man or machine? G.E.Torrens, A.Newman
Hand function tests for workers exposed to hand-transmitted vibration B.M.Haward, M.J.Griffin
The relationship of wrist posture to discomfort during repetitive exertions E.J.Carey, T.J.Gallwey
Improving utensil and implement handle design through enhanced rotation and tilt G.Heavenor
Risk assessment of manual tipping of letter trays C.Parsons, A.Truelove
The evaluation of gloved and ungloved hands G.E.Torrens, A.Newman
MUSCULOSKELETAL DISORDERS
Evaluating the use of single disc floor cleaners S.Hide, W.Morris, C.M.Haslegrave, O.O.Okunribido, S.C.Nichols
Health risks from mice and other non-keyboard input devices S.Hastings, V.Woods, R.A.Haslam, P.Buckle
Reducing risks for work-related musculoskeletal disorders in school nurseries C.Coole, C.M.Haslegrave
Psychosocial and physical factors and musculoskeletal illness in taxi drivers D.M.Anderson, R.K.Raanaas
Black Hawk helicopter loadmaster ergonomics P.Blanchonette, R.King, D.Crone, P.Simpson
Organisational issues as obstacles to intervention for musculoskeletal complaints C.G.Lawton, R.A.Haslam
Evaluating the risk of upper limb disorders for operators in a company using sanding and polishing equipment P.D.Bust, C.M.Haslegrave
PERSONAL PROTECTIVE EQUIPMENT
Specification of footwear for postal workers C.Parsons, A.Wray
What do British soldiers want from their gloves? D.McDonagh-Philp, G.E.Torrens
PRODUCT & WORKPLACE DESIGN
trends and product design P.W.Jordan
Sensory encounter: the codification of ‘soft’ qualities A.S.MacDonald
Usecues in the Delft design course H.Kanis, M.J.Rooden, W.S.Green
Design issues and visual impairment K.M.Stabler, S.van den Heuvel
Autonomy for disabled consumers: the need for systematic choice and innovation J.Mitchell, J.Bennington
Post office counter customer interface: a design challenge R.Ellis, C.Parsons
Revealing and responding to the needs of wheelchair consumers J.Mitchell, J.Bennington
Addressing pleasure in consumer products through ergonomics J.Simon, R.Benedyk
SEATING
Seating in the real world A.Baird, V.Malyon, N.Heaton
Drivers’ spinal responses to the effects of sitting posture T.J.Hadley, C.M.Haslegrave
The influence of automobile seat backrest angle and lumbar support on low back muscle activity M.Kolich, S.M.Taboun, A.I.Mohamed
TRAINING
An investigation of the effect of night vision goggles on cockpit task performance F.L.K.Tey, K.Y.Lim, Y.P.Chui
A redefinition of personal knowledge and a testing method to implement it D.P.Hunt
VISUAL DISPLAYS
The effect of two- & three-dimensional displays on remote crane control performance R.S.M.Quek, K.Y.Lim, Y.P.Chui
Airport baggage inspection—just another X-ray image? A.G.Gale, M.D.Mugglestone, K.J.Purdy, A.McClumpha
VIRTUAL REALITY
Immersive virtual reality and elderly users N.Karlsson, J.Engström, K.Johansson
Application of virtual reality to enhance user experience of electronic commerce (e-commerce) transactions H.Xu, K.Y.Lim, S.C.Fok
WARNINGS
The perceived hazardousness, urgency and attention-gettingness of fluorescent and nonfluorescent colours E.J.Tomkinson, R.B.Stammers
Increasing the conspicuity of food contents warnings E.A.Hoodless, R.B.Stammers
Road sign angularity T.M.Kersloot, B.R.Cooper
Author Index
Subject Index
PREFACE
Contemporary Ergonomics 2000 is the proceedings of the Annual Conference of the Ergonomics Society, held in April 2000 at Stoke Rochford Hall. The conference is a major international event for Ergonomists and Human Factors Specialists and attracts contributions from around the world. Papers are chosen by a selection panel from abstracts submitted in the autumn of the previous year, and the selected papers have the opportunity to be published in Contemporary Ergonomics. Papers are submitted as camera-ready copy prior to the conference. Details of the submission procedure may be obtained from the Ergonomics Society.
The Ergonomics Society is the professional body for Ergonomists and Human Factors Specialists. Based in the United Kingdom, it attracts members throughout the world and is affiliated to the International Ergonomics Association. It provides recognition of competence of its members through the Professional Register. For further details contact:
The Ergonomics Society, Devonshire House, Devonshire Square, Loughborough, Leics. LE11 3DW, United Kingdom
Tel./Fax: +44 1509 234904
E-mail: [email protected]
Web: http://www.ergonomics.org.uk
Air traffic control
A HAZOP ANALYSIS OF A FUTURE ATM SYSTEM
Richard Kennedy1, Helen Jones2, Steve Shorrock1 & Barry Kirwan1
1Air Traffic Management Development Centre, National Air Traffic Services, Bournemouth Airport, Christchurch, Dorset, BH23 6DF, UK
2Industrial Ergonomics Group, School of Manufacturing & Mechanical Engineering, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
The introduction of new technology within Air Traffic Management (ATM) has led to different types of human error being generated as the controller interacts with the system. This paper describes an approach that attempts to identify what human errors can arise and how they can be addressed by the design team. Based upon the Hazard and Operability (HAZOP) study approach, a technique was developed which could be applied to a prototype of a future ATM system. The HAZOP approach generated 87 recommendations for design improvements and the use of the technique was considered to be valuable for this and future projects in the ATM environment.
Introduction
Over the past few years, the aviation community has begun to recognise the growing need for updating the Air Traffic Management (ATM) system. The system itself has remained largely unchanged over the past two decades and has coped effectively with the increases in traffic ‘year after year’. However, the ability of the system to process aircraft movements, and the pressure placed upon the air traffic controller, have almost reached their threshold levels. A number of projects are being initiated that will help support the controller and reduce their workload. One way of freeing up controller resources is to reduce the amount of RT (radio telephony) time that the controller has to devote to each aircraft. A number of projects are underway to develop future ATM systems and this paper looks at a specific example of one of these projects.
Assessing potential hazards at the design stage
Currently, an Air Traffic Controller will use ‘flight progress strips’ in conjunction with a radar screen to control and monitor aircraft through their sector. Many new systems shift the focus away from one form of information presentation and usage (paper) to a completely different presentation medium (computer screen). Also, new systems may present additional system functionality, change the manner in which functions are implemented or actually remove some functions available to the controller. It is therefore essential that the system is evaluated whilst it is still at the design stage, so that potential operability problems and opportunities for human error can be identified and designed out of the system. There are specialist methods to assess the operability and human error potential of Human Computer Interaction (HCI), with most approaches being checklist-based and requiring human factors experts to perform the analysis. This study describes a different approach and reports the use of a modified Hazard and Operability (HAZOP) study approach. HAZOP has been widely used in the process industries and recently has been extended to address other types of system (e.g. programmable electronic systems, safety management systems etc.). The variants on the HAZOP approach are described in detail in Kennedy and Kirwan (1998).
Hazard and Operability (HAZOP) study approach
The modified HAZOP approach requires a team of experts to apply guide words (no, more, less etc.) to a visual and realistic representation of the system (e.g. screen dumps and task analysis). The HAZOP guide words that were developed for the study are shown in Table 1. A detailed description of the approach is given in Jones (1999).
Table 1. HAZOP guide words and definitions
The purpose of HAZOP is to identify deviations from the intended functioning of the system. For instance, if the guide word ‘no’ were applied to the selection of a ‘menu’, a deviation such as ‘no heading entered into system’ would be identified. For each deviation, the group would then identify the consequences of the error for the system, indications that the error had occurred, system defences, and ways in which such an error could be recovered from or reduced.
HAZOP findings
The HAZOP team was made up of three designers, one air traffic controller and two human factors specialists. The team spent a total of 16 hours, spanning three separate HAZOP sessions, interrogating the prototype system. Part of the study output is shown in Table 2. Overall, the HAZOP team identified a number of ‘vulnerabilities’ in the prototype system and ‘opportunities’ for error that needed to be designed out or worked around (e.g. via procedures and training). A total of 87 recommendations were generated from the three HAZOP sessions and these are classified as follows:
• Changes to interface design and menus (34%)
• Improvements in user feedback on actions/inputs (25%)
• Training/procedures recommendations (16%)
• Modifications to aircraft status on screen (13%)
• Hardware/equipment changes (9%)
• Further study/future research ideas (3%)
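As a rough worked example of the classification above, the reported percentages can be converted back into approximate numbers of recommendations. This is only a sketch: the category labels are shortened here for illustration, and because the published percentages are rounded, the resulting counts are indicative and need not sum to exactly 87.

```python
TOTAL_RECOMMENDATIONS = 87  # total reported for the three HAZOP sessions

# Percentages as reported in the list above (labels shortened for illustration).
SHARES = {
    "Interface design and menus": 34,
    "User feedback on actions/inputs": 25,
    "Training/procedures": 16,
    "Aircraft status on screen": 13,
    "Hardware/equipment": 9,
    "Further study/future research": 3,
}


def approximate_counts(total, shares):
    """Convert percentage shares into rounded, approximate counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


counts = approximate_counts(TOTAL_RECOMMENDATIONS, SHARES)
for name, n in counts.items():
    print(f"{name}: about {n} recommendations")
```

On this reading, roughly a third of the recommendations concern interface design and menus, and well over half concern either the interface or user feedback.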
Discussion of HAZOP approach
The HAZOP approach proved surprisingly useful and productive in generating changes to the interface design. Moreover, since the designers were not only present but actively involved, any design changes they thought necessary were simultaneously accepted for implementation. The HAZOP therefore had a very effective impact on the design process. It was also clear from a number of the problems identified that these would have been difficult to detect without a hybrid team present. In particular, each member of the hybrid team brought a unique perspective to the group process:
• the designer had complete detail on how the system worked down to the software description level, what inputs were ‘afforded’ and the consequences of those inputs;
• the controller could tell and show the designers how he or she would interpret the interface at any time, and recount scenarios where something else could happen that the designers had perhaps not thought about;
• the Human Factors specialist could suggest error modes, possible frequencies of errors, and human error and system failure detection/recovery likelihoods, and advise on HF aspects of the solutions developed following the identification of a problem.
All this information was shared effectively, leading to a rich, multiple-perspective view of the system design and its strengths and weaknesses. The HAZOP process allowed the different parties to see the system design through other parties’ eyes, in a fairly direct way, giving a shared system understanding. On the limitations side, the approach is resource-intensive, since only a small part of the system was analysed during the three sessions. It is also, as a process, fairly dependent on good chairmanship and a collaborative attitude between the participants. Lastly, it is fair to say that this pilot study of HAZOP applied to interface design was not as structured as a conventional HAZOP would have been, but such structuring and proceduralisation of the HAZOP process in this context can be built into later HAZOP sessions. There are currently plans to continue the HAZOP work and apply it to other systems, as it was found to be of significant value by the design team.
Conclusion
This paper has described an approach that can be applied to the design of new systems in order to identify the potential human errors in the operation of the system. The method, based on the Hazard and Operability (HAZOP) approach, was applied to a prototype ATM system. A number of recommendations for improvements to the prototype system were made from the HAZOP and use of the approach is now being planned for other ATM projects.
Acknowledgements: The authors would like to thank the design team members who took part in the HAZOP sessions, namely: Andy Webb, Tony Goodship, Stephen Pember, Brian Young, John Levesley and Andy Kilner. Also, thanks to Huw Gibson at the University of Birmingham for his supervision of the MSc project.
Table 2. Example of HAZOP analysis outputs
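The kind of worksheet row that a HAZOP session of this sort produces can be sketched as a simple record. The sketch below is illustrative only: the field names follow the columns mentioned in the text (deviation, consequences, indications, system defences, recovery), the guide words are limited to the three named in the paper, and the interface elements are hypothetical rather than taken from the prototype system.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List

# Only the guide words named in the text; the full set used in the study is not assumed.
GUIDE_WORDS = ("no", "more", "less")


@dataclass
class HazopRow:
    """One worksheet row, to be filled in by the team during a session."""
    element: str              # interface element or task step under review
    guide_word: str           # guide word applied to that element
    deviation: str = ""       # departure from intended functioning
    consequences: str = ""    # effect of the error on the system
    indications: str = ""     # how the error would become apparent
    defences: str = ""        # existing system defences
    recovery: str = ""        # how the error could be recovered or reduced
    recommendations: List[str] = field(default_factory=list)


def blank_worksheet(elements, guide_words=GUIDE_WORDS):
    """Create one empty row for every (element, guide word) pair."""
    return [HazopRow(element=e, guide_word=g) for e, g in product(elements, guide_words)]


# Hypothetical interface elements, used purely for illustration.
rows = blank_worksheet(["heading menu", "level menu"])

# The example given in the text: 'no' applied to the selection of a menu.
rows[0].deviation = "no heading entered into system"

print(len(rows))  # 6 rows: 2 elements x 3 guide words
print(rows[0].element, "/", rows[0].guide_word, "->", rows[0].deviation)
```

Generating the blank rows up front is one way to make sure every guide word is considered for every element; the judgement about which deviations are credible still rests with the team.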
The opinions in this paper are those of the authors, and do not necessarily represent those of NATS or other companies involved in this research.
References
Jones, H. (1999) A modified HAZOP analysis of a future air traffic control system. Confidential MSc Thesis, Industrial Ergonomics Group, School of Manufacturing and Mechanical Engineering, University of Birmingham, September.
Kennedy, R. and Kirwan, B. (1998) Development of a HAZOP-based method for identifying safety management vulnerabilities in high risk systems. Safety Science, 30, 3, 249–274.
THE FUTURE IMPLEMENTATION OF DATALINK TECHNOLOGY: THE CONTROLLER-PILOT PERSPECTIVE
Sarah Harris1 & Tab Lamoureux2
1College of Aeronautics, Cranfield University, Cranfield, Bedfordshire MK43 0AL
2Air Traffic Management Development Centre, National Air Traffic Services, Bournemouth Airport, Christchurch, Dorset BH23 6DF
*Current address: Avionic Systems Engineering, BAE Systems, York House, PO Box 87, Farnborough Aerospace Centre, Farnborough, Hampshire GU14 6YU
In the near future, Air Traffic Control and a majority of aircraft will be equipped with datalink technology. However, in addition to its anticipated benefits, some concern has been raised regarding the use of datalink within certain contexts. For this reason, the perspectives of controllers and pilots on five main datalink tools and their use within different flight phases and control areas were gathered. Perspectives were based on the potential impact datalink could have on their workload, situation awareness, human error and crew resource management, compared to present operations. The method included the use of self-completion questionnaires, developed through preliminary datalink interviews.
Introduction
Due to a continued increase in air traffic, current radio telephony (R/T) channels have become overcrowded and thus inefficient in coping with the existing demand for information transfer. The development of datalink (DL) technology has therefore focused on reducing the burden placed on R/T channels and enhancing the overall effectiveness of the communications, surveillance and navigation network. DL will involve the implementation of digitised communication facilities via Very High Frequency transmitters, Secondary Surveillance Radar and satellite, which will be accessible to both the controller and the pilot through some kind of visual display unit. This will enable the controller and pilot to digitally send and receive flight information, Air Traffic Control (ATC) instructions and other general pre-formatted communication messages between themselves, airline offices and other ATC Centres, with the press of a button.
In general, DL implementation is anticipated to improve overall human performance compared to present R/T procedures. However, numerous studies (both in an existing DL context and in trials for European DL) have found that specific DL tools and facilities have had an adverse impact on human performance, compared to present operations. For example, problems are anticipated with the use of DL in certain flight phases and ATC environments, such as the busy Terminal Manoeuvring Area (TMA). Other difficulties may exist concerning inherent problems with automation in general. During DL trials, Shingledecker (1992) reported that controllers were forced to revert to R/T during the final approach sector (where tasks were significantly more time constrained and demanding), as DL was deemed too slow or complex to maintain control in such a highly tactical airspace. In addition, one study showed that pilots also found DL to be operationally unacceptable, due to high task densities and small task completion windows in a terminal context (Reynolds and Neumeier, 1991). This was further supported in a review by Kerns (1991), where DL operations were found to increase workload and thus be unacceptable to pilots during both the departure phase (from take off until 2,000 ft) and the arrival phase (from 10,000 ft and below).
Most of the difficulties and anticipated benefits associated with DL are due to the potential impact it may have on various human factors issues. One such example is the change in input modality from auditory to visual channels. This may significantly impact a controller’s capacity by overloading their perceptual resources and introducing the possibility of resource interference (see Table 1).
Table 1. Interference Caused by Similar Input Modalities (Nijhuis, 1993)
Consequently, this study focuses on four significant human factors pertinent to DL research: workload, situation awareness, human error and crew resource management. The five main types of European DL included in the research were the Aircraft Communications Addressing and Reporting System (ACARS), Automatic Dependent Surveillance (ADS), Controller-Pilot Datalink Communications (CPDLC), Mode Select (Mode-S) and the Flight Information Service (FIS). Other more general aspects of DL were also covered, such as interface options and usage variability. An analysis of the potential impact of the DL items on the human factors across tasks (controllers and pilots) and contexts (TMA and ‘cruise’ phase) allowed a clear indication of which DL items need further research and development before their implementation.
Methodology
A self-completion questionnaire was developed to evaluate controller opinions across different types of control areas (i.e. oceanic, terminal and area control) and pilot opinions across different types of flight phases (i.e. from a ‘cruise’ perspective and a ‘TMA’ perspective). These questionnaires were formulated by compiling lists of DL items, which were segregated into ‘pilot only’ items (12), ‘controller only’ items (9) and ‘joint controller-pilot’ items (40). These were then transformed into questions, along with an appropriate answering structure that required participants to comment on whether they thought an item would have a negative effect, a positive effect or no effect at all on the four human factors, compared to their current operations. These questions were initially used in preliminary semi-structured interviews, then administered as draft questionnaires and, following feedback, amended and sent out as final self-completion questionnaires. The final questionnaires were a ‘controller only questionnaire’, a ‘pilot only questionnaire’ (to be answered from a TMA perspective) and a ‘pilot only questionnaire’ (to be answered from a cruise perspective). The controller only questionnaire consisted of ‘controller only’ DL items and mixed items (applicable to controllers and pilots), whereas both pilot questionnaires contained the same mixed items, along with ‘pilot only’ items. A total of 100 ‘controller questionnaires’ were sent to Oceanic/Airways Control, Area Control and Terminal Control, and 600 ‘pilot questionnaires’ (300 TMA, 300 Cruise) were delivered to British Airways Captains, Senior First Officers and First Officers.
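The questionnaire composition described above can be laid out as a small worked example. This is a sketch under stated assumptions: it assumes the final questionnaires retained every item in the compiled lists and that each item was judged once against each of the four human factors, neither of which is stated explicitly; the dictionary keys are labels chosen here for illustration.

```python
# Item counts as reported in the Methodology.
ITEM_COUNTS = {
    "pilot only": 12,
    "controller only": 9,
    "joint controller-pilot": 40,
}

# The four human factors against which each item was judged.
HUMAN_FACTORS = (
    "workload",
    "situation awareness",
    "human error",
    "crew resource management",
)

# Item groups making up each final questionnaire, following the description above.
QUESTIONNAIRES = {
    "controller": ("controller only", "joint controller-pilot"),
    "pilot (TMA perspective)": ("pilot only", "joint controller-pilot"),
    "pilot (cruise perspective)": ("pilot only", "joint controller-pilot"),
}


def item_total(questionnaire):
    """Number of DL items in a questionnaire (assumes no items were dropped)."""
    return sum(ITEM_COUNTS[group] for group in QUESTIONNAIRES[questionnaire])


def judgement_total(questionnaire):
    """Number of judgements, assuming one per item per human factor."""
    return item_total(questionnaire) * len(HUMAN_FACTORS)


for name in QUESTIONNAIRES:
    print(f"{name}: {item_total(name)} items, {judgement_total(name)} judgements")
```

Under these assumptions a controller would face 49 items and a pilot 52, which gives a sense of the response burden involved.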
Discussion
Responses were analysed between controllers and pilots in general and also between those who work (or were allocated) in the ‘cruise’ phase (cruise pilots and oceanic/area controllers) and those in the ‘TMA’ phase (terminal controllers and TMA pilots). Overall, results indicated no significant differences between the ‘cruise’ and ‘TMA’ controllers on any of the ‘controller only’ DL items, and significant differences were only found between pilot perspectives on two of the ‘pilot only’ DL items.
Table 2. Consensus Opinions towards Certain Datalink Items
n/a: Not applicable to that group (i.e. either a pilot only or a controller only question)
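The consensus entries in Table 2 rest on a simple coding of responses (+1 for a positive effect, 0 for no effect, −1 for a negative effect), with only group means above +.5 or below −.5 being described. A minimal sketch of that screening step follows; the function name and the response data are invented for illustration and are not results from the study.

```python
from statistics import mean

THRESHOLD = 0.5  # only means above +.5 or below -.5 are described


def consensus(scores, threshold=THRESHOLD):
    """Label a list of +1/0/-1 responses as 'pos', 'neg' or None (not described)."""
    m = mean(scores)
    if m > threshold:
        return "pos"
    if m < -threshold:
        return "neg"
    return None


# Invented responses, for illustration only.
responses = {
    "hypothetical item A": [1, 1, 1, 0, 1],    # mean 0.8
    "hypothetical item B": [-1, -1, 0, -1],    # mean -0.75
    "hypothetical item C": [1, 0, -1, 0],      # mean 0.0
}

for item, scores in responses.items():
    print(item, round(mean(scores), 2), consensus(scores))
```

Note that a mean of exactly +.5 or −.5 is not labelled here, since the text refers to means above or below those values.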
However, some of the descriptive group means were also examined, despite non-significant results. This was deemed imperative as a non-significant result often indicated a ‘consensus’ in opinion towards certain DL items. Only the means which were either above +.5 or below −.5 were described, in order to highlight any extreme preferences or dislikes by controllers and pilots (respondents indicated +1 for a positive effect, 0 for no effect and −1 for a negative effect). Table 2 includes some of the DL items towards which controllers and pilots had either extremely negative (neg) or positive (pos) opinions (a consensus), which may explain why no significant differences were found.
As illustrated in Table 2, the ability to use DL for transparent transfers, multiple clearances, and strategic and non-control information was considered a particular benefit by all controllers and pilots. However, the loss of the controller’s ability to hear the intonations of the pilot’s voice, and the pilot’s lack of awareness of other aircraft, were considered to have a potentially negative effect on their situation awareness and human error rate. Despite these disadvantages, most of the controllers and pilots shared a consensus that DL would be beneficial to their workload, due to improvements to non-critical aspects of their work.
With reference to the joint controller-pilot sections, significant differences were found across and between both TMA and cruise groups and controllers and pilots. Table 3 includes the DL items where significant differences across groups were found. However, in the joint sections, only the outcomes with significant main effects are reported. These are summarised by indicating which of the groups was significantly more positive towards that item with respect to one or more human factors (a p would mean the main effect was due to pilots being significantly more positive towards that item). As with Table 2, overall extremely positive or extremely negative responses are also included.
Table 3. Significant Differences in Opinions across All Groups
c-p: controller or pilot; t-c: TMA or cruise phase; n/a: Not applicable to that group (i.e. either a pilot only or a controller only question)
It was apparent that a majority of the main effects found related to the positive responses by pilots compared to controllers, especially pilots from the TMA phase (see Table 3). Results indicated that pilots were favourable towards a message transaction time of 6–12 seconds, especially with reference to the cruise phase. However, terminal controllers thought this time limit was unacceptable. As TMA operations are restricted by small task completion windows, it was anticipated that the TMA group would be less favourable towards this time frame compared to the cruise group (Shingledecker, 1992). With respect to automatically triggered messages with no human input, pilots also found this facility to be beneficial (regardless of flight phase). Controllers, however, thought this would have a particularly negative impact on situation awareness, especially in the TMA, as this is when they would normally speak with the aircraft. Ultimately, this would reduce the ‘perception of elements in the current situation’ (Endsley, 1988), thus impacting the first level of situation awareness. Additionally, controllers were extremely negative concerning the issue of having to retry if a message failed or having to notice if no response was received from a message sent, especially with respect to workload. The prospect of having to retry due to automation failure would add to the complexity of the task, thus increasing the workload of the controller. Items of particular benefit were the pre-notification idea, the transfer of data authority facility, the Departure Clearance Service and the use of DL for clearances in general.
Although pilots were significantly more positive towards these items compared to controllers (on most of the human factors), controllers also deemed these items to have a potentially beneficial effect, especially with respect to workload and crew resource
management within terminal control. However, one DL issue that both pilots and controllers considered to have a negative impact on operations was that pilots using DL will not be able to hear ATC conversations with other aircraft. Both pilots and controllers thought this would have an extremely negative impact on human error and situation awareness, especially within the TMA. Even without DL, situation awareness errors are often associated with being unable to comprehend the ‘big picture’. Therefore, if DL reduces the ability to comprehend the ‘big picture’ (as the respondents’ opinions suggest), then these types of errors may increase.
To conclude, pilots from the terminal context thought DL could have a potentially positive impact overall, but mainly on non-critical operations. Conversely, controllers expressed particular concern over the use of DL in a terminal context. As DL will have to be used by both controllers and pilots in certain contexts (such as the TMA), this could pose problems for its future flexibility. Therefore, as well as researching the actual impact DL may have on critical and routine terminal control operations, increasing awareness of the benefits of using DL within terminal control is of particular importance before its implementation.
References
Endsley, M.R. 1988, Design and Evaluation for Situation Awareness Enhancement. Proceedings of the Human Factors Society 32nd Annual Meeting, 1, 97–101
Kerns, K. 1991, Data-Link Communication Between Controllers and Pilots: A Review and Synthesis of the Simulation Literature. International Journal of Aviation Psychology. (Lawrence Erlbaum Associates), 1 (3), 181–204
Nijhuis, H.B. 1993, Workload in Air Traffic Control Communication. In E.J.Lovesey (Ed.) Contemporary Ergonomics: Proceedings of the Ergonomics Society’s 1993 Annual Conference, Edinburgh, Scotland. (Taylor and Francis), 284–289
Reynolds, M.C. and Neumeier, M.E. 1991, Mode-S Data Link Pilot-System Interface: A Blessing in De Skies or a Beast of a Burden? Sixth International Symposium on Aviation Psychology, 1, 154–159
Shingledecker, C.A. 1992, Controller Evaluations of ATC Data Link Services. Society of Automotive Engineers Technical Paper Series. Report Number 922027
EYE POINT-OF-GAZE, EEG AND ECG MEASURES OF GRAPHICAL/KEYBOARD INTERFACES IN SIMULATED ATC
Hugh David1, F.Caloo1, R.Mollard2, P.Gabon2 & B.Farbos2
1Eurocontrol Experimental Centre, 91222 Bretigny-sur-Orge, Cedex, France
2Laboratoire de l’Anthropologie Applique, 45 Rue des Saints-Peres, 75270 Paris, France
To assess the utility of eye movement recording for the assessment of different ATC operating methods, its relation to other electro-physiological measures and their sensitivity to task difficulty, 8 controllers carried out four TRACON II exercises using a graphic and a keyboard interface in light and heavy traffic. An iView head-mounted eye-tracking device was used. EEG/EOG and ECG were also measured, and on-line observations recorded using the Noldus Observer System. Significant events during the exercises were also identified for detailed analysis.
Introduction
A series of small-scale Real Time (RT) simulations of Air Traffic Control (ATC) has been carried out, as reported previously (Cabon et al, 1997, 1998; David et al, 1998, 1999), to evaluate the use of psychophysiological measures to assess the effects of performing ATC on controllers. These simulations used a simple Wesson International TRACON II Autonomous ATC simulator. On the basis of the findings of these simulations, selected psychophysiological and self-assessment measures were applied to a RT simulation, as described elsewhere in this volume. Continuing the Eurocontrol Experimental Centre policy of preliminary small-scale investigations, attention was turned to the measurement of eye-movement (Point of Gaze). This technique has been used on other occasions for investigations in real life ATC (Bouju and Sperandio, 1979; Leguillou et al, 1981) and in RT simulations (David, 1985), but without corresponding measures of psychophysical strain.
Aims
This study was undertaken in order to:
1. Provide experience in applying Eye Point-of-Gaze measurement in a Real-time simulation.
2. Provide experience of the simultaneous application of electrophysiological measures.
3. Investigate the sensitivity of these measures to different control devices in light and heavy traffic.
4. Investigate the utility of the Noldus Observer system for ATC observation.
5. Investigate controllers’ eye-movements prior to undesirable events.
Method Eight experienced but not currently practising Air Traffic Controllers carried out simulation exercises using the TRACON II Autonomous Air Traffic Control (ATC) simulator at Eurocontrol Experimental Centre, (EEC), Bretigny, France. Each controller undertook four exercises, controlling 15 and 30 aircraft entering in 30 minutes, using graphic (trackball/pointer/windows) and coded keyboard input methods. The two lighter loaded exercises were carried out on one afternoon, following some preliminary training exercises, and the more heavily loaded exercises were carried out on the following afternoon, to minimise circadian rhythm problems. The orders of presentation of samples and input methods were permuted to minimise overall bias. The traffic level for the heavier sample was deliberately chosen to exceed the controllers’ expected capacity, in order to produce a significant number of ‘errors’: missed approaches (where an aircraft is not correctly positioned for landing), missed hand-offs (where an aircraft is not transferred to the next sector in time), conflicts (where aircraft approach within 3 NM without 1000 feet vertical separation) and even collisions. Measures The controllers’ left/right frontal Electroencephalogram (EEG) and Electrocardiogram were recorded using a Vitaport psychophysiological recorder. The point of gaze was recorded using a Sensomotoric Instruments iView head-mounted eye-tracking device. Point-of-Gaze video recordings were obtained for all exercises. (No interference was experienced between the EEG and the eye-tracking helmet—in fact the lightweight cycle helmet helped to secure the EEG electrodes firmly in place.) A preliminary analysis was carried out using the Noldus Observer on-line, an observer recording major shifts of attention. Significant events during the exercise were noted. The TRACON II simulator provided an overall score for each exercise, with numbers of the “errors” mentioned above. 
(Unfortunately, the TRACON II simulator stops if a collision occurs, and destroys the records of the controller concerned.) The Controllers filled in fatigue measurement instruments before and after exercises, and completed the NASA-TLX instrument after each exercise. They also completed a post-simulation questionnaire after the last exercise. Analysis Eye Movements Aircraft entered the simulation over a 30-minute period, and took about 15 minutes to complete their flights. There was an initial rise in activity, a busy period and a final tailing-off of activity. Eye-movement analysis was therefore confined to the busiest 20 minutes of each simulation. Initially, the location, duration and frequency of eye-movements were analysed on a minute-by-minute basis. The TRACON II screen (400 mm×300 mm) is divided into a Radar display (Top left, 275 mm square), flanked by a strip bay (Right, 125 mm wide), containing strips (15 mm deep). Pending strips, relating to aircraft not yet under the controller’s control, appear above Active strips. Strips appear in the Pending list, are transferred to the Active list automatically on acceptance, and are removed automatically from that list when the aircraft lands or leaves the area. A communications window (25 mm deep), below the radar, shows in text form the verbal messages generated by the system from the controller’s input and simulated pilots and adjacent controllers. When operating in the track-ball/pointer (Graphic) mode, pop-up windows
of varying sizes and shapes (about 50–80 mm in either dimension) appear, providing choices of instructions and complementary information. The mean number of fixations was approximately 15 per minute for the graphic mode, and 25 per minute for the keyboard mode. In keyboard mode, the controllers switched frequently between the keyboard, the active strip area and the radar. Controllers using the graphic interface spent about 57 percent of their time looking at the radar, 17 percent looking at the active strips and 17 percent looking at ‘pop-ups’. Controllers using the keyboard interface spent about 47 percent of their time looking at the radar, 20 percent looking at active strips and 20 percent looking at the keyboard. Surprisingly, the traffic load made no significant difference to the duration or number of fixations, for either control mode.
Electroencephalography
The estimated theta-rhythm power rose for higher traffic load in the keyboard mode, as might be expected. In the graphic mode, however, it fell.
ECG - Mean Rate
For both modes, the variability of heart rate (sinus arrhythmia) fell for higher task loads. There was a significant negative correlation between sinus arrhythmia and the number of fixations for radar, active strips, and keyboard, and between sinus arrhythmia and the time spent looking at the radar.
NASA-TLX
There were significantly higher scores for the mental, physical and temporal demand and effort sub-scales of the NASA-Task Load Index for heavier traffic load.
Performance
The overall TRACON score was higher for higher traffic load, although the deliberate overloading of controllers in the high traffic load produced variable scores. Specific error frequencies showed a more complex pattern. There were more separation losses in the keyboard mode, suggesting less situational awareness. There were more handover errors in heavy task load conditions, suggesting that controllers may decide to ‘shed’ this task under time stress, and more missed approaches in the graphic control mode, which may be attributed to the lower precision of the graphic methods for height and speed allocation.
Controller Orders
There were no significant differences in the numbers of orders per minute, even between high and low traffic load conditions.
Discussion
This was an initial feasibility study, which should be repeated with larger numbers of subjects. The observed results can only be regarded as tentative, but are indicative.
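The percentages of time spent on each screen area, reported in the Analysis, depend on mapping each gaze sample to a region of the TRACON II screen. The sketch below illustrates that mapping using the dimensions given in the text. It assumes gaze coordinates are already available in screen millimetres with the origin at the top-left corner, which is not how the video-based records in this study were analysed; the function names and area labels are hypothetical. Pop-up windows and the keyboard cannot be located from this geometry alone.

```python
from collections import Counter

# Screen geometry in millimetres, taken from the Analysis section.
# Placing the origin at the top-left corner is an assumption of this sketch.
SCREEN_W, SCREEN_H = 400, 300
RADAR_SIDE = 275   # radar display, top left, 275 mm square


def area_of_interest(x_mm, y_mm):
    """Return a (hypothetical) area label for one gaze sample."""
    if not (0 <= x_mm < SCREEN_W and 0 <= y_mm < SCREEN_H):
        return "off-screen"      # e.g. a glance at the keyboard
    if x_mm >= RADAR_SIDE:
        return "strips"          # strip bay, 125 mm wide, right of the radar
    if y_mm < RADAR_SIDE:
        return "radar"
    return "comms"               # communications window, 25 mm deep


def dwell_percentages(samples):
    """samples: iterable of (x_mm, y_mm) gaze points taken at a fixed rate."""
    counts = Counter(area_of_interest(x, y) for x, y in samples)
    total = sum(counts.values())
    return {area: 100.0 * n / total for area, n in counts.items()}
```

Because samples are assumed to be taken at a fixed rate, the share of samples in an area stands in for the share of time spent looking at it.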
The only problem encountered with the head-mounted iView equipment was that tracking was lost when the controller looked downwards, as the upper eyelid tends to fall as he does so. This problem is exacerbated when bifocal or progressive lenses are worn, since these force the controller to tilt his head back to read print. EEG electrodes were applied with collodion, and the converted bicycle helmet used by the iView system helped to hold them in good contact. Only one controller expressed discomfort at this combination.
Calibration of the equipment presented some problems. A blank panel with a grid of reference points was presented to the controller, who was asked to look at these points while the computer-based calibration was carried out. Normally, this required about five minutes, but on some occasions calibration had to be repeated several times. It was important that the controller adopted his true working posture, rather than leaning back from the screen. (Controllers when working tend to lean towards the screen when solving problems, and to lean back after finding solutions.)
The iView system presents the point-of-gaze as a point on a video image taken from a head-mounted camera. This permits free head movement but requires costly and slow manual analysis. The video records incorporating the controller’s eye movements provide the opportunity to examine in detail exactly where the controller’s attention was directed before significant events, such as failures to maintain separation, failures to hand over aircraft to the next sector and so on. These analyses are particularly laborious, and are currently being completed. An initial hypothesis, that controllers did not see that an aircraft was due to be handed over to the next sector because they were looking at another part of the screen, does not appear to be supported. It appears that either they were too busy, or they had not realised that an attempt to hand over had been rejected.
The integration of psychophysiological, eye-movement and operational data on a minute-by-minute basis was practical and effective. It is not yet practical to identify which aircraft image or strip is being looked at, or to separate EEG signals according to the direction of gaze, so no direct information on the lateralisation of brain functions has been obtained.
Conclusions
1. Eye Point-of-Gaze measurement was successfully applied in this Real-time simulation.
2. Electrophysiological measures were applied successfully at the same time as these measures. Results of ECG, EEG and Eye Movement were combined (on a minute-by-minute basis).
3. Eye Movement measurements were sensitive to the interface mode. Psychophysiological measures were sensitive to the interface mode, and to traffic load.
4. The Noldus Observer system can be usefully applied to the direct observation of a single controller, and to the analysis of video records. Some method of time-sharing or sampling should be developed for the observation of many controllers.
5. The investigation of controllers’ eye-movements prior to undesirable events is continuing, but initial observations do not support the hypothesis of ‘over-concentration’.
References
(Paper copies of EEC Notes and Reports are available from the address above. Recent Notes and Reports are available at the EEC Web-site (www.eurocontrol.fr))
Bouju, F. and Sperandio, J-C, 1979, Analyse de l’activite visuelle des controlleurs d’approche. Rapport C.O. 7911 R59. INRIA France
Cabon, P., Mollard, R., Cointot, B., Martel, A. and Beslot, P. 1997, Elaboration of a method for the psychophysiological states of Air Traffic Controllers in Simulation. EEC Report No. 323 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
Cabon, P., Farbos, B., Bourgeois, S., R., Cointot, B. and Mollard, R. 1998, Objective evaluation of the learning process of controllers adapting to a new HMI for ATC. EEC Note No. 16/98 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
David, H., 1985, Measurement Of Air Traffic Controllers’ Eye Movements in Real Time Simulation. EEC Report No. 187 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
David, H., Cabon, P., Bourgeoise-Bougrine, S. and Mollard, R., 1998, Psychophysiological Measures of Fatigue and Somnolence in Simulated Air Traffic Control, In M.A.Hanson (ed.) Contemporary Ergonomics 1998, (Taylor and Francis, London) 427–433
David, H., Farbos, B., Bourgeois, S., Cabon, P., and Mollard, R., 1999, Psychophysiological Measures of Adaptation to an unfamiliar HMI in Simulated Air Traffic Control, In M.A.Hanson, E.J.Lovesey and S.A.Robertson, (eds.) Contemporary Ergonomics 1999, (Taylor and Francis, London) 12–16
Jasper, H.H., 1958, The 10–20 electrode system of the International Federation, Electroencephalography and Clinical Neurophysiology, 10, 371–376
Kramer, A.F., Donchin, E. and Wickens, C.D. 1987, Event-Related Potentials as indices of mental workload and attentional allocation, In Electrical and Magnetic Activity of the Central Nervous System: Research and Clinical Applications on Aerospace Medicine. AGARD Conference Proceedings No. 432, pp 14–1 to 14–14
Leguillou, M., Halliez, B. and Nobel, J., 1981, Etude du Travail du controleur Organique au CRNA/Nord Par Analyse de la saisie visuelle, CRNA R 81/22 (DNA, Paris)
Future System State Prediction by Novice and Expert Air Traffic Controllers
Damien Forrest1 & Tab Lamoureux2
1Psychology Department, University College London, Gower Street, London, WC1E 6BT
2Air Traffic Management Development Centre, National Air Traffic Services, Bournemouth Airport, Christchurch, Dorset, BH23 6DF
Abstract
The purpose of this study was to examine how well Novice and Expert Air Traffic Controllers could predict future conflict situations. Measures of conflict detection and confidence were established using signal detection methodology and calibration statistics. Significant differences were found between groups in their ability to identify conflict situations and both groups exhibited levels of over-confidence in their judgments. The type of conflict also showed a significant impact upon the controllers’ judgments, with side-on conflicts being the most difficult to predict. The results have implications for the training of the controllers and a number of recommendations are made. Introduction As the amount of air traffic continues to escalate, Air Traffic Management (ATM) systems endeavour to maintain their established records of safety and efficiency. Despite the increasing use of technology-based support systems within ATM, it is still the individual operators of the system that maintain primary control and who are ultimately responsible for the safe management of air traffic. Hence, the more macro-goals of the ATM system, i.e. the provision of a safe and expeditious air traffic control or advisory service (MATS— Part I), is reflective of the level to which the more micro-goals of the system are achieved, i.e. the skills, abilities and limitations of the individual operators; irrespective of the system’s technological power. The task success is contingent upon many factors. Controllers must be able to detect, identify and assimilate large amounts of data from competing sources of information, to make accurate judgments and timely decisions within this highly complex and dynamic task environment (Lamoureux, Cox and Kirwan, 1999; Smolensky, 1999). The dynamic nature of the task has a considerable impact upon the judgments and decisions made by the controllers (Kerstholt, 1993). 
With a task environment that continually changes, one of the largest dangers to the system is therefore the failure of the controllers to estimate and prepare for future system states. For example, based upon the controller’s awareness of the current situation, it is their skill in identifying future potential problems before they occur and to provide appropriate solutions to these problems, which makes this predictive ability one of the most important aspects of the integrity of the system. Notwithstanding the technological support, the importance of the training for this ability is evident. There is increasing use in ATM of technology-based support systems to help maintain this awareness and to aid in the prediction of future system states (e.g. the Short Term Conflict Alert system). However, there
are still concerns about how these systems currently (and in the future will) affect the controllers (Kirwan, 1999). Hence, (for the moment) future system predictions are still made at the ‘sharp-end’ by the controllers themselves. It has been suggested that errors in future state prediction are linked to a failure of the controller to maintain awareness of the task environment, with failures in state awareness forming the largest category of error in aviation (Jones and Endsley, 1996). One form of error identified is where the controller has full awareness of what is going on (e.g. they have detected and identified all the relevant aircraft) but, according to Jones and Endsley, have possessed a poor mental model for predicting the consequences of any actions taken into the future time-frame; which may result in a loss of separation (i.e. a potential conflict situation). Another prediction-associated error investigated within this study is proposed. This is termed an operational side-effect error (OSEE): the failure of the controller to predict the unforeseen and undesired consequences of their actions, resulting in a loss of separation. This is not, as Jones and Endsley perhaps would suggest, because of an inappropriate mental model—it is the prediction judgment itself that is unsound. For example, a controller has identified a potential conflict between aircraft A and aircraft B, who are both travelling at the same flight-level. Before taking any action, the controller re-assesses the situation in light of their planned actions. After which instructions are given to aircraft A to drop to a lower level, thus, avoiding the potential conflict situation with aircraft B. However, shortly after aircraft A loses separation with aircraft C, which was climbing through the flight level given to aircraft A. Although aware of the existence of aircraft C, the controller failed to identify it as being a potential problem. 
The question is, if the controller was aware of aircraft C and its intentions, why did they fail to identify the secondary conflict? In order to answer these types of questions, it would be useful to establish a measure of the controller’s predictive ability. Previous research within ATM has arguably been unsuccessful in identifying the underlying judgmental aspects of this predictive ability. Although it is suggested that failures in prediction may be associated with a deficit in the controller’s state awareness, the global awareness measures often used to establish this tell us very little about the quality of the prediction judgment itself (Endsley, 1988; Jones & Endsley, 1996). SA is typically measured by the controller’s perception and recall of factors within the task environment, not by their ability “…to know what the hell to do about it” (Smolensky, 1999). Hence, as a tool for research or training, global measurement techniques are limited in their contribution to an understanding of these prediction errors. The aims of this research were twofold: first, to measure the ability of the controllers to predict future system states; and second, to investigate the possible factors surrounding the associated errors in future system state prediction.
Methodology
Six male and one female student air traffic controllers (SATCs) and five male instructors were presented with a series of six static experimental scenarios. Each scenario was based on an area control task and contained 5, 6, 7 or 8 aircraft on the radar display and between 5 and 7 aircraft on the flight progress strips (i.e. aircraft due to enter the sector). Each scenario had a number of potential conflicts. These were either (a) primary conflicts, in which aircraft would lose separation if no action were taken; or (b) secondary conflicts, in which a secondary loss of separation would occur if inappropriate actions were taken in dealing with a primary conflict.
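To make the notion of a loss of separation concrete, the following sketch computes the closest point of approach (CPA) between two aircraft, the quantity participants were asked to estimate, and flags a conflict when it falls below the 5 nm figure used in this paper. It is an illustration only: constant velocity, level flight at a common flight level and flat two-dimensional geometry are assumptions, and the function names are hypothetical rather than part of the study.

```python
import math

SEPARATION_NM = 5.0  # horizontal separation figure quoted in the paper


def closest_point_of_approach(pos_a, vel_a, pos_b, vel_b):
    """Return (time_h, distance_nm) at closest approach.

    Positions are (x, y) in nautical miles, velocities (vx, vy) in knots.
    Constant velocity and a common flight level are assumptions here.
    """
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]   # relative position
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]   # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t = 0.0   # identical velocities: the distance never changes
    else:
        # Time minimising |r + v t|, clamped so only the future is considered.
        t = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return t, math.hypot(rx + vx * t, ry + vy * t)


def is_conflict(pos_a, vel_a, pos_b, vel_b):
    """True if the predicted CPA is inside the separation minimum."""
    _, distance = closest_point_of_approach(pos_a, vel_a, pos_b, vel_b)
    return distance < SEPARATION_NM
```

A secondary conflict can be explored with the same functions by changing one aircraft's velocity to represent the instruction given and re-checking it against every other aircraft in the scenario.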
A semi-structured interview was used to establish primary and secondary conflict identification data. Further, for each pair of aircraft discussed, the participant was asked to give an estimate of the closest point of approach (CPA) between the aircraft, together with a measure of confidence in that decision.
Results
Conflict identification was measured using multidimensional detection theory (see Swets, 1996). From these data the mean group d-prime and beta scores were computed (see table 1). Experts showed a greater ability than students to identify conflict situations overall (t(8)=−2.541, p<.05). Further, the lower beta scores shown by students indicated a greater bias (i.e. willingness) to identify a conflict situation when there was none.
Table 1: Mean d-prime and beta scores for each group
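For readers unfamiliar with the two indices, the sketch below shows how d-prime and beta are conventionally derived from hit and false-alarm counts under the equal-variance Gaussian signal detection model. Whether the authors used exactly this model is an assumption; the function name and the example counts are hypothetical and are not the study's data.

```python
import math
from statistics import NormalDist


def d_prime_and_beta(hits, misses, false_alarms, correct_rejections):
    """Equal-variance Gaussian signal detection indices from raw counts.

    Rates of exactly 0 or 1 have no finite z-score and would need a
    correction before calling this function (not handled in this sketch).
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf              # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    d_prime = z_hit - z_fa                # sensitivity
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2)   # likelihood-ratio bias
    return d_prime, beta


# Hypothetical counts: 8 of 10 conflicts detected, 2 of 10 false alarms.
sensitivity, bias = d_prime_and_beta(8, 2, 2, 8)
```

A beta below 1 indicates a liberal criterion, that is, a readiness to report a conflict, which is the direction of bias described for the students above.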
Figure 1: Calibration curves for instructors and students
No significant difference was established between groups in identifying primary conflicts. However, the actions taken in dealing with these conflicts resulted in a number of potential secondary conflict situations for both instructors and students. Although some of these were subsequently identified, others resulted in OSEEs for both instructors and students (9% and 38% respectively). For calibration values of CPA estimates, a 5 nm error margin was used. Partition scores were calculated separately for each participant and then averaged for each group (see Lichtenstein, Fischhoff & Philips, 1982). An ideal calibration score would be 0 and would be indicated by data points along the central calibration line, indicating that the CPA estimates of the controllers are representative of the true distances between the aircraft. All controllers exhibited levels of over-confidence in their estimations, shown by data points falling below the ideal calibration line (see figure 1). However, instructors did show greater accuracy than students (.00389 and .01823 respectively).
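The calibration score referred to above can be illustrated with the calibration term of the partitioned probability score described by Lichtenstein et al.: judgments are grouped by stated confidence, and the squared gap between each confidence level and the proportion correct at that level is averaged, weighted by group size. The sketch below assumes confidence is expressed as a probability and that a judgment counts as correct when the CPA estimate falls within the error margin; the function names are hypothetical.

```python
from collections import defaultdict


def calibration_score(judgments):
    """judgments: list of (confidence, correct) pairs.

    confidence is a probability in [0, 1]; correct is a bool (assumed
    here to mean the CPA estimate was within the error margin).
    A score of 0 means perfect calibration.
    """
    groups = defaultdict(list)
    for confidence, correct in judgments:
        groups[confidence].append(1.0 if correct else 0.0)
    total = len(judgments)
    return sum(
        len(outcomes) * (confidence - sum(outcomes) / len(outcomes)) ** 2
        for confidence, outcomes in groups.items()
    ) / total


def over_confidence(judgments):
    """Mean confidence minus proportion correct; positive = over-confident."""
    mean_conf = sum(c for c, _ in judgments) / len(judgments)
    prop_correct = sum(1 for _, ok in judgments if ok) / len(judgments)
    return mean_conf - prop_correct
```

On a calibration curve, a positive over-confidence value corresponds to points lying below the diagonal, as reported for both groups.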
Figure 2: Percentages of hits/misses across conflict types
Post hoc analyses revealed a significant difference in the conflict type identified by controllers (t(3)=6.854, p<.01). As the angle of convergence between the aircraft increased, controllers showed greater difficulty in identifying the conflicts, with side-on conflicts being the most difficult to judge (see figure 2).
Discussion
The results indicate superiority in primary and secondary conflict identification by instructors, as would be expected, with a number of secondary conflicts still present after action is taken. Instructors clearly showed greater skill in identifying the future positions and the relative distance between two aircraft. How does this support our understanding of future conflict identification? The importance of these judgments is perhaps more evident retrospectively. If the identification of a positive conflict reflects the controller’s judgment of the distance between the aircraft (i.e. it is believed that the aircraft will lose safe separation by coming within 5 nm of each other), then it is assumed that the controller will give greater weight to the relationship between the aircraft and act accordingly. Alternatively, if the controller’s judgment is erroneous (i.e. the distance between the aircraft is believed to be greater than 5 nm when it is not), the result is less favourable. An important implication also arises when no conflict is actually present. If the controller judges the distance between the aircraft to be less than 5 nm when it is not, valuable mental resources will be needlessly allocated away from possibly more important factors. Results also indicated that the graphical representation of these cues (i.e. via the radar display) might have an impact on the accuracy of these judgments: the distances in certain conflict types were more difficult to judge. This is possibly because controllers have greater difficulty in visualising the pathways of certain conflict types.
The mental rotation required for conflicts with greater angles of convergence may be more difficult for the controllers to judge (Shepard and Metzler, 1971). Both the difficulty associated with and the important role of visual imagery has clearly been identified in the literature (Isaac, 1999). Further research is clearly required. On the basis of this research a number of recommendations have been made. Firstly, that training methods incorporate a greater awareness for these secondary conflict situations; perhaps through feedback of judgmental accuracy and confidence using a similar and ideally dynamic task. Secondly, further
examination of the role of imagery in conflict discrimination, specifically the impact that the varying types of conflicts have on this ability. Lastly, the development of a tool aimed at providing performance-based cognitive measures of future system state prediction, which include conflict detection and judgment.
References
Endsley, M.R. (1988). Situation Awareness Global Assessment Technique (SAGAT). In Proceedings of the National Aerospace and Electronics Conference (NAECON), 789–795. New York: IEEE
Isaac, A.R. (1999). Air Traffic Control: Human Performance Factor. Aldershot: Ashgate.
Jones, D.G., & Endsley, M.R. (1996). Sources of Situation Awareness Errors in Aviation. Aviation, Space, and Environmental Medicine, 67 (6), 507–512.
Kerstholt, J.H. (1994). The effect of time pressure on decision-making behaviour in a dynamic environment. Acta Psychologica, 86, 89–104.
Kirwan, B. (1999). Cognitive error analysis of future automation options in Air Traffic Management. Contemporary Ergonomics 1999. London: Taylor & Francis.
Lamoureux, T., Cox, M. & Kirwan, B. (1999). Cognitive Task Analysis in Training System Redesign. Contemporary Ergonomics 1999. London: Taylor & Francis.
Lichtenstein, S., Fischhoff, B. & Philips, L.D. (1982). Calibration of probabilities: the state of the art to 1980. In D.Kahneman, P.Slovic & A.Tversky (eds), Judgment under uncertainty: Heuristics and Biases. Cambridge University Press, Cambridge.
Manual of Air Traffic Services, Part I. National Air Traffic Services, London, 1999.
Shepard, R.N. & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701–703.
Swets, J.A. (1996). Signal Detection Theory and ROC Analysis in Psychology and Diagnostics. Mahwah, NJ: Lawrence Erlbaum Associates
PSYCHOPHYSIOLOGICAL MEASURES OF ADAPTATION TO UNFAMILIAR HMI IN REAL-TIME ATC SIMULATION
H.David1, R.Mollard2, P.Cabon2 & B.Farbos2
1Eurocontrol Experimental Centre, 91222 Bretigny-sur-Orge, Cedex, France
2Laboratoire de l’Anthropologie Applique, 45 Rue des Saints-Peres, 75270 Paris, France
Selected psychophysiological and self-assessment measures were applied to a Real-Time simulation of a revised interface for adjacent Swedish and Danish Air Traffic Control (ATC) sectors. (Because the Real Time simulation involved radical change in displays and operating methods, it was important to establish that the controllers had completed learning the system.) EEG studies showed high Theta-band activity, dropping off about the tenth day of simulation. Sleep-logs showed increasing duration of sleep throughout the simulation. Self-assessments of fatigue and sleepiness showed significantly higher levels in early morning and late evening sessions, and were significantly affected by age, experience and gender.
Introduction
A series of small-scale Real Time (RT) simulations of Air Traffic Control (ATC) has been carried out, as reported previously (Cabon et al, 1997, 1998; David et al, 1998, 1999), to evaluate the use of psychophysiological measures to assess the effects of performing ATC on controllers. These simulations used a simple TRACON II Autonomous ATC simulator (Wesson International, 1990) as used in Brookings et al (1996). On the basis of the findings of these simulations, selected psychophysiological and self-assessment measures were applied to a RT simulation. The simulation selected (SweDen98; Eriksen and Harvey, 1999) was of a revised interface for adjacent Swedish and Danish ATC sectors.
Aims
This study was undertaken in order to:
1. Give experience in applying psychophysiological methods in a Real-time simulation.
2. Evaluate the time needed to familiarise the controllers with a new interface.
3. Measure perceived (and, where possible, objective) strain on the controllers.
4. Make recommendations on the applications of psychophysical methods to Real-time simulation.
Method
The SweDen98 Real-Time Simulation took place from 6–24th June 1998 at the Eurocontrol Experimental Centre. The purpose of the simulation itself was to simulate a revised Controller-System interface, to allow
controllers to familiarise themselves with a ‘stripless’ environment, and to make a safety assessment of the proposed system. The area simulated was the entire Kobenhavn Flight Information Region/Upper Information Region (FIR/UIR) and parts of the Malmo and Stockholm FIR/UIRs, with adjacent ‘feed’ sectors. Two organisations were simulated, differing in the aspects of Danish airspace simulated. (A Real-Time simulation is not a scientific experiment. It is intended to examine a close approximation to an existing or planned system, and to explore its limits. One major purpose of all simulations is to convince controllers that the system will work. The simulation plan is adapted in response to the requirements of the participants and organisers. If it is clear that one potential organisation is unworkable, no further exercises will be carried out with it. Formal experimental design is not considered important, and may not always be applicable.)

Thirty-nine Swedish and Danish Air Traffic Controllers and one Danish Air Defence Controller took part in the simulation. Fifty-two simulation exercises, each involving 20 measured working positions and lasting about 90 minutes, were carried out over a period of three weeks.

Measures, with Significant Results

Electroencephalogram (EEG) and Electro-oculogram (EOG)

EEG and EOG were recorded on one controller per simulation session. Four different controllers were recorded in each of the two organisations. Sites corresponding to the frontal lobes (FP1, FP2) and median sites (Fz, Pz, Cz) (Jasper, 1958) were employed. Data were initially recorded using a Vickers Medical Discovery system, and subsequently analysed using a DEC Microvax system to remove EOG artefacts from the EEG recordings and to perform spectral analysis for 30-second intervals, with a final visual inspection to remove remaining artefacts. A high level of theta rhythm was observed, dropping off only after the tenth day of simulation.
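The spectral analysis over 30-second intervals described above can be illustrated with a minimal sketch. It is not the Microvax procedure used in the study: the theta band limits (4 to 8 Hz), the sampling rate, and the function names `theta_fraction` and `theta_by_epoch` are all assumptions made for illustration, and no EOG artefact removal is attempted.

```python
import math

def theta_fraction(epoch, fs, f_lo=4.0, f_hi=8.0):
    """Fraction of an epoch's variance in [f_lo, f_hi) Hz, via a direct DFT.

    Assumes f_hi lies below the Nyquist frequency (fs / 2).
    The 4-8 Hz default is an assumed theta band, not taken from the paper.
    """
    n = len(epoch)
    mean = sum(epoch) / n
    x = [v - mean for v in epoch]            # remove the DC offset
    variance = sum(v * v for v in x) / n
    if variance == 0.0:
        return 0.0
    k_lo = math.ceil(f_lo * n / fs)          # first DFT bin inside the band
    k_hi = math.ceil(f_hi * n / fs)          # first DFT bin above the band
    band = 0.0
    for k in range(k_lo, k_hi):
        w = 2.0 * math.pi * k / n
        re = sum(v * math.cos(w * i) for i, v in enumerate(x))
        im = sum(v * math.sin(w * i) for i, v in enumerate(x))
        # Factor 2 accounts for the mirrored negative-frequency bin.
        band += 2.0 * (re * re + im * im) / (n * n)
    return band / variance

def theta_by_epoch(samples, fs, epoch_s=30):
    """Split a recording into consecutive epochs and score each one."""
    n = int(fs * epoch_s)
    return [theta_fraction(samples[i:i + n], fs)
            for i in range(0, len(samples) - n + 1, n)]
```

Tracking the per-epoch value across simulation days would be one way to look for the kind of decline in theta activity reported here, though the choice of normalisation (band power relative to total variance) is itself an assumption of this sketch.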
Event-Related Potential (ERP)

ERP was to be measured before and after each exercise, using the ‘odd-ball’ paradigm (Kramer et al, 1987), on the controller whose EEG was measured. It was found to be too cumbersome in the RT simulation environment, and was discontinued.

Cortisol

Salivary cortisol was measured before and after each exercise on the eight controllers for whom EEG was measured. No significant differences were observed, probably due to technical problems in sample transportation.

Sleep Logs

All controllers participating in the exercise filled in self-reporting sleep logs for the onset and duration of sleep, and self-assessed fatigue and sleepiness on sleeping and rising, for each night of the simulation and for two weeks after the end of the simulation. (Sleep logs for the period following the simulation were returned by post. All controllers returned their logs.) Sleep time rose significantly during the simulation,
and fatigue at rising time decreased correspondingly. Sleep times during the two weekends spent in Paris were significantly lower than those during weekends at home after the simulation.

NASA-TLX

All controllers completed the NASA-TLX after each simulation. There was significantly more ‘frustration’ and lower ‘self-assessed performance’ with heavier workload. Self-assessed performance increased with time. Older controllers showed significantly higher ratings, particularly of ‘physical difficulty’.

Self-rated fatigue and sleepiness

A self-rating of fatigue and sleepiness, with a four-point self-assessment of mental fatigue, physical fatigue, sensorial fatigue and mood, was carried out by all controllers before and after each exercise. Fatigue was significantly higher in early morning and late afternoon sessions. Older controllers, male controllers and those with no previous experience of the system were significantly more fatigued.

Effects of Traffic Load

Although systematic traffic variations were not part of the simulation strategy, a 20% increase in traffic load, even in the early stages of the simulation, produced significant changes in fatigue, NASA-TLX, and theta-band EEG.

Discussion

This simulation afforded a first opportunity to deploy the methods previously tested using the TRACON II simulator in a Real-Time simulation. The transition from a near-laboratory to a near-reality setting implied a significant loss of statistical control, and required negotiated compromises. For example, it would have been preferable to follow a single controller working at the same position throughout the simulation. Operational requirements, however, dictated that there should be the maximum variation in the positions worked by each controller. The compromise reached involved the restriction of EEG/EOG measures to four controllers who worked one specific position, and who agreed to the EEG/EOG measures.
Considerable time was needed to convince the controllers of the value of these and the more general measures employed. Once convinced, the controllers co-operated whole-heartedly. (The 100% return rate, by busy controllers, of sleep logs by post is an example of this attitude.)

The use of these methods enabled the experimenters to conclude that the learning of an unfamiliar ATC HMI, even where the traffic and organisation were familiar, took up to ten days. This is considerably longer than expected, and longer than might be deduced from the controllers’ own opinions, as recorded in operational questionnaires and debriefings. The transfer from shift-work to daytime operation (and the celebrations in Paris associated with the winning of the World Cup by France) resulted in accumulated sleep debt, and significantly affected controllers’ performance and subjective fatigue ratings. This simple technique points to a significant source of unconscious bias if subjective opinions of workload are taken at face value. Evidence from sleep logs, NASA-TLX and cortisol levels suggests that it is undesirable to use early morning and late afternoon exercises, particularly following a weekend in Paris, for critical measurements.
A more detailed discussion from a human factors point of view, with technical descriptions and copies of the paper instruments employed, is available in Cabon et al (1999), which complements the ATC-oriented Eriksen and Harvey (1999).

Conclusions

1. Psychophysiological methods can be applied in a Real-Time simulation, provided that the controllers’ informed assent has been obtained.
2. The time needed to familiarise the controllers with a new interface was considerably greater than normally expected: about 10 days.
3. ERP is too cumbersome, but EEG and ECG can be applied to one or two key working positions. Paper-based self-assessment methods (fatigue/sleepiness questionnaires, NASA-TLX, sleep logs) should be applied to all controllers participating in the simulation. Although cortisol analysis showed few significant differences in this study, owing to technical problems, it merits further evaluation.
4. Observation of this and subsequent simulations suggests that the disruption of working patterns arising from the change from shift work to regular day work, and stresses from an unfamiliar life style, significantly affect performance, particularly in early Monday morning and late Friday exercises, which should therefore be avoided for measured comparisons.

References

(Paper copies of EEC Notes and Reports are available from the address above. Recent Notes and Reports are available at the EEC Web-site (www.eurocontrol.fr))

Brookings, J.B., Wilson, G.F. and Swain, C.R. 1996, Psychophysiological responses to changes in workload during simulated Air Traffic Control, Biological Psychology, 42, 361–377
Cabon, P., Mollard, R., Cointot, B., Martel, A. and Beslot, P. 1997, Elaboration of a method for the psychophysiological states of Air Traffic Controllers in Simulation, EEC Report No. 323 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
Cabon, P., Farbos, B., Bourgeois, S., R., Cointot, B. and Mollard, R. 1998, Objective evaluation of the learning process of controllers adapting to a new HMI for ATC, EEC Note No. 16/98 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
Cabon, P., Farbos, B., Bourgeois-Bougrine, S. and Mollard, R. 1999, Objective Evaluation of Air Traffic Controllers’ Adaptation to an Unfamiliar Human-Computer Interface, EEC Report No. 334 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
David, H., Cabon, P., Bourgeois-Bougrine, S. and Mollard, R. 1998, Psychophysiological Measures of Fatigue and Somnolence in Simulated Air Traffic Control. In M.A.Hanson (ed.) Contemporary Ergonomics 1998, (Taylor and Francis, London), 427–433
David, H., Farbos, B., Bourgeois, S., Cabon, P. and Mollard, R. 1999, Psychophysiological Measures of Adaptation to an unfamiliar HMI in Simulated Air Traffic Control. In M.A.Hanson, E.J.Lovesey and S.A.Robertson (eds.) Contemporary Ergonomics 1999, (Taylor and Francis, London), 12–16
Eriksen, P. and Harvey, A. 1999, SweDen98 Real-Time Simulation, EEC Report No. 335 (Eurocontrol Experimental Centre, Bretigny-sur-Orge, France)
Jasper, H.H. 1958, The 10–20 electrode system of the International Federation, Electroencephalography and Clinical Neurophysiology, 10, 371–376
Kramer, A.F., Donchin, E. and Wickens, C.D. 1987, Event-Related Potentials as indices of mental workload and attentional allocation. In Electrical and Magnetic Activity of the Central Nervous System: Research and Clinical Applications on Aerospace Medicine, AGARD Conference Proceedings No. 432, 14–1 to 14–14
Wesson International Inc. 1990, The TRACON II Multi-player ATC Simulator, (Wesson International, Dallas, Texas)
WHAT THE COGNITIVE TASK ANALYSTS DON’T TELL YOU Tab Lamoureux Air Traffic Management Development Centre, National Air Traffic Services Bournemouth Airport, Christchurch, Dorset, BH23 6DF
With the advent of complex computerised systems, the tasks that the human operator carries out have become more and more cognitive. In response to this, the human factors analyst has had to change the focus of an analysis of tasks from the physical processes to the mental processes. A review of the literature shows that many authors have conducted Cognitive Task Analyses (CTAs), but few of these have described the methodology used. Those who have described the methodology often fail to describe how the data is used to generate findings and make recommendations. This paper attempts to explain, with reference to selected examples, how findings and recommendations are implicitly based on the understanding the analyst has of widely known and accepted theories of cognition.

Introduction

The role of the Air Traffic Controller is to provide instructions to aircraft that will enable them to get to their destination as efficiently as possible without eroding specified safety margins. In order to do this effectively, Controllers must bring a great deal of knowledge to bear regarding airspace, routes, equipment and aircraft. The breadth of this knowledge is reflected in the fact that the initial training of a Controller lasts from 12 to 18 months, and achieving a full qualification can take up to two years longer. The training syllabus of the controller has developed over the last 50 years, with new topics being added to the syllabus with little consideration of either the existing syllabus or contemporary theories of learning. The nature of the Controller’s task has also become more complex, as problem solving now takes account of many dynamic variables instead of a few, slowly changing variables. This change in the task requires a great deal of expertise from the controller, the development of which is not reflected in the training.
As a result, some Air Traffic Control (ATC) authorities are engaged in Cognitive Task Analysis (CTA) of the Controller’s task in order that the cognitive bases of the task can be better trained. One of the best known of these has been conducted by Seamster and his colleagues (Seamster et al, 1993) in support of the redesign of the training curriculum for Controllers in the Federal Aviation Administration (FAA) in the United States. Further examples of CTA in ATC include the Integrated Task Analysis (ITA) project of Eurocontrol (Dittman, Kallus and Van Damme, 1999) and Lamoureux and his colleagues in the United Kingdom (Lamoureux, Cox and Kirwan, 1999). Of these three examples, the work in the United States and the United Kingdom has been used to generate recommendations for training in ATC. Both projects have described the methodologies they use to investigate cognitive processes and generate data. Neither project, however, has described how they make
the leap from data to observations and recommendations relevant to training. Indeed, in the Seamster work simple definitions have been used to identify decision-making points and goal-directed behaviour, but few conclusions have been drawn regarding the requirements underpinning successful decision-making and goal-directed behaviour. Previous research attempting to make clearer the process of conducting a CTA (Militello, Hutton and Chrenka, 1998; Seamster, Redding and Kaempf, 1997) seems to have focussed on experiences that should be expected when conducting a CTA, rather than the theory that underpins the analysis itself. Traditional task analyses (e.g. Kirwan and Ainsworth, 1992) have been successful tools in presenting human factors ideas to non-human factors audiences, in part because they are clear about the assumptions being made and the information on which recommendations are based. CTA, in order to be as successful, must also be as clear. The analyst should be able to show his data and explain how it has been used to generate the findings and recommendations reported. This paper will use examples from Lamoureux and Cox (1999) to illustrate how this is possible.

Fundamentals

A CTA should be grounded in contemporary theories of cognition, which provide a link between observations or data and eventual findings and recommendations. Important aspects of cognition that should be accounted for include a model of information processing, an understanding of working memory and long term memory, and types of behaviour exhibited by the job incumbent. The exact manner in which these theories are used will depend upon the application of the CTA. When considering working memory, it is also important to be aware of the ‘chunking’ strategies adopted and the half-life of material in working memory. Long term memory should be considered with respect to types of memory, triggers, item strength and association, networks, schema and models, and knowledge type.
The behaviour exhibited by the incumbent should be considered with respect to whether it is bottom-up or top-down in how it processes information. The person may also exhibit behaviour that is knowledge-, rule- or skill-based. Finally, it is important to be aware of the relationship between information processing, memory and behaviour. For instance, an expert is more likely to have a better mental model than a novice, leading to stronger associative memories, so that a less strong trigger is needed to initiate a behaviour. Responding to this trigger may itself be an example of skilled behaviour, which will not require all of the expert’s mental resources. As a further effort-saving strategy, the expert will likely work in a more top-down manner, which is likely a skill-based behaviour, and which requires a strong mental model and, by implication, schema and semantic network.

Deconstruction

The following paragraphs introduce findings or recommendations from Lamoureux and Cox (1999). They will be briefly related to the concepts introduced in ‘Fundamentals’, above. The generation of findings starts from the data gathered from the participants. This data is then compared with contemporary theories of cognition to generate findings, and these findings then form the basis for recommendations. It is these recommendations that are then fed back into training system redesign. When conducting this study, participants were interviewed and their responses analysed. The analyst was interested in the nouns and verbs given in reply. The sequence in which they were mentioned provided an indication of what the main components of the mental model are, what their constituent elements are, and what the relationships between elements are.
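The idea of using the order in which terms are mentioned can be sketched in code. This is only an illustration and not the procedure used in the study: the function name `first_mention_ranks`, the input format (one list of already extracted terms per participant), and the choice of mean rank of first mention as a summary are all assumptions.

```python
from collections import defaultdict

def first_mention_ranks(responses):
    """Summarise when each term is first mentioned across participants.

    responses: one list of terms per participant, in spoken order
    (a hypothetical format; extracting nouns and verbs is not shown).
    Returns {term: (participants_mentioning, mean_rank_of_first_mention)}.
    """
    ranks = defaultdict(list)
    for terms in responses:
        seen = []
        for term in terms:
            if term not in seen:             # keep only the first mention
                seen.append(term)
        for rank, term in enumerate(seen, start=1):
            ranks[term].append(rank)
    return {term: (len(r), sum(r) / len(r)) for term, r in ranks.items()}

# Invented example data, for illustration only.
example = [
    ["callsign", "location", "level"],
    ["location", "callsign", "callsign", "level"],
    ["callsign", "level"],
]
summary = first_mention_ranks(example)
```

Terms mentioned by most participants and with a low mean rank would be candidates for the main components of the mental model; deciding what counts as a component or a relationship remains the analyst's judgement.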
“All participants asked for aircraft location and callsign, trail dots, SSR information (including current level and intention code), and requested level. Subsequently, all participants referred to aircraft by their company only (i.e. not the full callsign). From the intention code, most participants could infer the routing of the aircraft.”

This information allowed the construction of a model of how information is arranged in the controller’s mind. The ‘hub’ around which this information is structured is the company identifier. This model must be reinforced such that the controller can call it to mind readily and make effective use of it.

“When describing the strip display, students are more likely to describe it more comprehensively than instructors, but are also slower in recounting this information. Students are most likely still building their mental model of the task, and have not determined (through experience) what information is essential to task execution and what information is secondary. As a consequence students consider systematically all of the information at their disposal. Instructors consider certain essential items of information, and only consider less-important information if the essential items do not clarify/resolve the scenario adequately.”

Instructors have a strong mental representation of the information required to do the job, and can navigate through their mental model quickly. Students, on the other hand, are less confident in moving through their store of information. The passage above also illustrates the difference between the bottom-up processing of novices and the top-down processing of experts.

“Students showed more reliance on visual cues than did instructors. When asked what the trigger is to bring a strip from the pending bay to the live bay, students responded that it was the appearance of the aircraft on the radar.
Instructors expanded upon this by saying that the strip for any aircraft that could affect the planned route of the recently appeared aircraft would also be brought across. This also shows that the students are using a more simplistic mental model.”

This passage also illustrates that students are working at a more rule-based level of behaviour than instructors. This is because students require a trigger to bring the strip across, while instructors just ‘know’ when to bring a strip across.

“It was noted that occasionally instructors would interrupt students’ thought processes to correct an error made earlier. Also, instructors occasionally appeared to be premature in their prompting of students to do something. The nature of the students’ (and instructors’) mental processes mean that they must recognise the situation according to the information presented. In the case of students, this recognition is likely to take longer due to the systematic and explicit consideration of more information than instructors before comprehending and acting upon a situation. Decision-making usually follows a logical sequence, and interruption can disturb this sequence and hinder the learning process. Therefore instructors must judge whether students require more thinking time or additional guidance.”

Not only do the student’s mental processes suffer from being interrupted, but the route that information takes in being reinforced in the mind is also disrupted. The information in working memory is forced out by incoming information (the instructor’s comment) or else is compelled to remain in working memory longer than the ideal half-life, leading to inefficient formation of a memory in long term memory.

“The use of different coloured holders could represent a method by which to help students access their bottom-up plan at a less explicit level.
Students currently have to read the strips and assimilate information to do with route and level (correlating this with aircraft callsign and location on the radar). Interpretation of the strip information (i.e. understanding) may take time. By presenting strips in coloured holders, students will have expectations about the information the strips contain (presuming they have been taught what the colours mean). This can potentially save time devoted to understanding and thus reduce workload and hesitation. The difference in colour also represents a high intensity stimulus which can create a
‘distance’ between aircraft in the student’s mind and further reduce their time spent ‘thinking’ about which aircraft wants what, etc.”

The use of different colours creates a ‘cognitive distance’ in the mind of the controller, with associated expectations that are quite different, meaning that few mistakes are made involving ‘right action, wrong object’. It is important, however, that the information is learned correctly, and this may take time. It would also be important that such theory is reinforced soon after teaching by practice. Forcing the novice to apply the theory will reinforce the information in the mental model and form appropriate links between existing and new knowledge.

“That students may be constructing inappropriate links is further illustrated by the learning of Terrain Clearance rules. The Terrain Clearance rules dictate that the controller is responsible for ensuring that an aircraft is a certain height above ground within certain parameters (the ‘keyhole’ rule). However…most traffic is in airways, which, by definition, is clear of terrain. Also, the airspace maps…make no reference to terrain, so it is assumed to be 0 feet at all points. This knowledge does not necessarily allow the student to make appropriate links. Indeed, it is possible that the student will consider terrain clearance while acting as an airways controller, adding complexity and workload to his/her task.”

The teaching of extraneous information can place a further burden on the working memory of the student. During learning, working memory is stretched because the mental model is not able to minimise the load. Thus, with the presence of information that the student believes they should remember, but will never use, the entrenchment of good mental models and development of skill will be hindered.

Conclusion

It is clear from this narrow look at the CTA literature that practitioners base their findings and recommendations implicitly on the current theory regarding cognition.
To non-human factors people, however, it may look as if many of these findings and recommendations are pulled from thin air. As CTA attempts to systematically support the redesign of training curricula, it must open itself up to scrutiny. A presentation of the data, and an explanation of how the data is combined and then transformed by the theory to develop insightful findings and recommendations with impact, must be made to training managers. Having won their support by demonstrating that CTA is not black magic and can have a valuable influence, more distant horizons, such as interface design and ab initio selection, can be pursued.

References

Seamster, T.L., Redding, R.E., Cannon, J.R., Ryder, J.M. and Purcell, J.A. 1993, Cognitive Task Analysis of Expertise in Air Traffic Control, International Journal of Aviation Psychology, 3(4), 257–283
Seamster, T.L., Redding, R.E. and Kaempf, G.L. 1997, Applied Cognitive Task Analysis in Aviation, (Avebury, Aldershot)
Lamoureux, T.M. and Cox, M.J. 1999, Cognitive Task Analysis of CATC Radar Skills Course, NATS Report no. 9904, (NATS, London)
Lamoureux, T.M., Cox, M.J. and Kirwan, B.I. 1999, Cognitive Task Analysis in Training System Redesign. In M.A.Hanson, E.J.Lovesey and S.A.Robertson (eds.) Contemporary Ergonomics 1999, (Taylor and Francis, London), 17–21
Kirwan, B.I. and Ainsworth, L.K. 1992, A Guide to Task Analysis, (Taylor and Francis, London)
Dittman, A., Kallus, K.W. and van Damme, D. 1999, Integrated Task Analysis – Phase 3: Baseline Reference of Air Traffic Controller Tasks and Cognitive Processes in the ECAC Area, Eurocontrol Report no. HUM.ET1.ST01.1000-REP-05, (Eurocontrol, Brussels)
Militello, L.G., Hutton, R.J.B. and Chrenka, J.E. 1998, You Can’t Teach What You Can’t Describe: One Experience in Developing CTA Instruction. In Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, 390–394
Anthropometry
IMPROVING THE USABILITY OF AN ANTHROPOMETRIC MAN-MODEL PROGRAM Iemkje A.Ruiter Delft University of Technology, SubFaculty of Industrial Design Engineering, Jaffalaan 9 2628 BX Delft, The Netherlands
Most research effort in computer aided anthropometric assessment is put into the development of anthropometric man-models. Publications about the actual use and usability of man-model programs are rare. At the SubFaculty of Industrial Design Engineering of Delft University of Technology we work with the ADAPS program (Anthropometric Design Assessment Program System). This paper presents an overview of the way Industrial Design students use ADAPS. Here we focus on ‘what goes wrong’ during the entire process of using an anthropometric man-model program. Based on this overview we present some suggestions for the improvement of the usability of man-model programs.

Introduction

From the very beginning of the development of 3D anthropometric computer man-models, researchers involved in this development process have published their work. Their main interest, the development of a ‘perfect’ man-model, is reflected in these publications. Over the years, publications on man-model development came to be accompanied by case studies reporting the results of using anthropometric man-models in designing workspaces and products. Up to now almost no publications have appeared about the actual use (and usability) of anthropometric man-models. At the SubFaculty of Industrial Design Engineering of Delft University of Technology we have been working with ADAPS (Anthropometric Design Assessment Program System) for almost 20 years. The ADAPS program enables its user to visualise a 3D anthropometric man-model together with a workspace or product. Thus it is possible to assess how well the dimensions of this (future) product or workspace are adapted to its users. Besides being a research project, ADAPS has been used in education from the very beginning. We offer students a 40-hour (elective) course in ADAPS. When using a man-model program, students make mistakes. This is no problem at all in an educational setting.
We have our doubts however, whether they will be able to use man-models the way they should in their professional life. This paper presents an overview of the way Industrial Design students use ADAPS. Here we focus on ‘what goes wrong’ during the entire process of using the program. Based on this overview we present some suggestions about what could be done to improve the usability of man-model programs.
What goes wrong?

The use of a man-modelling program can be divided into three phases: the preparation (what should you do before you start using the program), the actual use of the program, and the assessment (what conclusions do you reach, based on the results you got). Students encounter problems in all three of these phases.

Preparation

During the design process a designer might decide to use an anthropometric man-model. Man-models can be used in at least two stages of the design process. To start with, they are a useful tool when setting up a program of requirements for the dimensions of a future product or workspace. When design concepts have been generated, man-models can be used to assess the dimensions of these concepts. We find that our students use man-models almost exclusively for the assessment of concepts. Even though we stress the possibility of using the man-models for setting up requirements when we introduce the ADAPS program to the students, they seldom use them for this purpose.

A product or workspace will be designed for a target group. The more exactly this target group is defined, the better the designer will be able to select a man-model that represents this target group as closely as possible. Giving an exact definition of the target group appears to be difficult. Defining the target group in terms of age and sex seems to be easiest; nationality (especially in mixed populations) usually gets less attention. One of the most common mistakes is known at our Faculty as the ‘P5/P95 syndrome’: students tend to go automatically for a selection of 90% of the total population, not realising what effect this may have.

The next step in the preparation is to focus on the possible user-product interaction: what postures does the product require from the users, what (human as well as product) dimensions are important, and what problems are to be expected? Students are able to make an overview of the postures that the product handling requires from the users.
However, these overviews are often incomplete and the postures superficially described. Students who look at a product that they are familiar with tend to perform much better at this part of the preparation.

The last, crucial part of the preparation is formulating the assessment criteria. As the man-model cannot tell you whether it is comfortable or not, the designer needs to know beforehand whether a (model) posture is allowed or not. Sometimes this is rather easy to see (if you cannot reach the pedals of your bike, cycling will be difficult), but mostly it requires experience or (literature) research to determine what is right or wrong. Students tend to skip most of this part of the preparation (they think about it, but do not specify it). If they do set up assessment criteria they usually do not get much further than the basic knowledge of ergonomics they acquired during the first years of the curriculum. It appears to be difficult to translate ergonomic data, found in literature, to the specific situation they are looking at. For example: guidelines for computer workspaces are often based on 8-hour working days. These guidelines need not be applied so strictly when you are looking at a situation where computer use is limited to a few minutes, a few times a day.

Using the man-model program

When using a man-model, you need to have the model assume the functional postures necessary to assess the dimensions of a product or workspace. The number of model-postures you need to look at has an optimum, based on what dimensions are likely to cause problems and what part of the target population will encounter these problems. To illustrate this: when looking at the height of a letterbox, it is important to look if small
persons are able to post their mail. If they are, tall persons will be too. Students tend either to look at too many options per posture (different man-models like men, women, elderly, and a variety of percentile values) in order to be sure they do not forget anything, or to omit postures that are important.

Postures assumed by the man-models are always a translation of the postures of real users of a product or workspace. The user of the man-model decides how these human postures are translated into man-model postures. Students usually are not aware that translating a posture involves a number of assumptions and personal interpretations. They do not realise that there may be rather large differences in interpretation between model-users.

Man-model programs each offer their own features. Even though the features offered by the ADAPS program are very limited, students who use ADAPS usually do not make optimum use of the possibilities of the program. Some examples: they do not use the ranges of motion of the joints of the man-model to their full extent (the ranges of motion of the shoulder joints are very often omitted in the assessment of maximum reach of the model); they do not realise that the possibility of displaying two models at the same time also makes it possible to display two different percentiles of the stature of a model in a certain posture; and they do not use the possibility of rotating the sight-axis of the model in relation to the head (thus simulating eye movement), which may result in rather large rotations of the joints of the neck.

Assessment

The use of anthropometric man-models results in either requirements for the dimensions of a (future) product or conclusions about how well the dimensions of product concepts will match the functional dimensions of the future users of this product. Students show a tendency to go for ‘absolute’ answers.
They try to formulate requirements for product dimensions by using a single value (this dimension should be x cm) instead of presenting it as a range (this dimension should be between x and y cm). Like other man-models, the ADAPS model is a representation of (a group of) human beings (a representation, based not only on data, but also on a number of assumptions) and thus has its limitations. In the introduction to the ADAPS course, students are told to be critical about the man-models. However, the way they use the models to assess the dimensions of their product concepts usually does not show much of a critical attitude towards the man-models. They tend to take the dimensions and ranges of motion of the models for granted, not taking the effect of possible differences between model and human dimensions into account when formulating the conclusions of their assessment. To illustrate this: when a person sits on a chair, the shape of the upper leg differs from the shape it has when the person is standing. Students show they are aware of this by having the leg ‘go through’ the surface of the seat; however, they seldom mention they made an assumption for the impression of the leg, and never give arguments for the assumptions they made. Not only should the man-models be subject to a critical analysis, but the process should be too. When formulating conclusions you should take into account what assumptions have been made during the preparation, the use of the model and the assessment, and what the effect of these assumptions may be. To look critically at the process appears to be rather difficult. Students seem hardly to be aware of what assumptions they make during the process. They seldom mention there are limitations to their predictions for the dimensioning of the product they are looking at.
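The point about presenting a requirement as a range rather than a single value can be illustrated with a minimal sketch. It reuses the letterbox example and assumes, purely for illustration, that the relevant reach dimensions are normally distributed; all means and standard deviations below are hypothetical placeholders, not ADAPS data, and the function names are invented for this sketch.

```python
from statistics import NormalDist


def percentile_value(mean_mm, sd_mm, percentile):
    """Value below which `percentile` percent of a normally distributed dimension falls."""
    return NormalDist(mean_mm, sd_mm).inv_cdf(percentile / 100.0)


def acceptable_range(lowest_ok_for_tall_mm, highest_ok_for_small_mm):
    """Return (min, max) that suits both extremes, or None if the limits conflict."""
    if lowest_ok_for_tall_mm > highest_ok_for_small_mm:
        return None
    return (lowest_ok_for_tall_mm, highest_ok_for_small_mm)


# Hypothetical figures (NOT measured data): highest comfortable reach and
# lowest comfortable reach without stooping, both in millimetres.
MAX_REACH_MEAN_MM, MAX_REACH_SD_MM = 1900.0, 80.0
MIN_REACH_MEAN_MM, MIN_REACH_SD_MM = 700.0, 50.0

# The upper limit is set by small users (5th percentile of maximum reach),
# the lower limit by tall users (95th percentile of lowest comfortable reach).
upper = percentile_value(MAX_REACH_MEAN_MM, MAX_REACH_SD_MM, 5)
lower = percentile_value(MIN_REACH_MEAN_MM, MIN_REACH_SD_MM, 95)

slot_range = acceptable_range(lower, upper)
if slot_range is None:
    print("No slot height suits both extremes; the criteria need revisiting.")
else:
    print(f"Slot height should be between {slot_range[0]:.0f} and {slot_range[1]:.0f} mm")
```

The result is only as good as the assumptions behind it (the choice of percentiles, the distribution and the posture criteria), which is exactly the kind of limitation that should be stated alongside the conclusion.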
Improving the usability

The foregoing has presented an overview of what could go wrong in all three phases of the use of a man-model program. What can we learn from the mistakes that are made, and how can we use this information to improve the usability of these programs?

To improve the usability of a man-model program, sufficient information needs to be offered to compensate for the model-users’ lack of experience. This could be done in two ways: a tutor could assist the model-user, or the information could be offered by the man-model program itself.

Tutoring by an expert user could be achieved by presenting courses to the novice users. It is preferable to spread the contact with the tutors of the course over a rather long period of time, so that the novice users can bring the experience gained between meetings to the next meeting. In our specific case (teaching the use of anthropometric man-models to industrial design students) we would prefer all students to learn to work with man-models from their first year on, instead of teaching an elective course in the last years of the curriculum. By using man-models as part of their design exercises, the students will gain experience with the use of models at different levels, in different stages of the design process and in a variety of designs. Feedback will obviously be crucial to improving their use of man-models.

As human tutoring will not always be possible, another option is to build the ‘tutor’ into the man-model program. The ideal situation is that the program contains the information and guidance the user needs to make optimum use of the man-models. To find out how close you can get to this ideal, a great number of questions need to be answered. First of all, you need to know what information and guidance users need: what could and may be expected of the users?
As an example: ADAPS users do not always use the possible ranges of motion of the shoulder joint of the man-model in situations that require shoulder motion. What information do the users need to prompt them to move the shoulder? The next step is to find out what information and guidance could be implemented in the man-model program. Is it, for example, possible to let the program check all aspects of the preparation phase, such as whether the man-model user selected the right postures (the postures the product requires from its future users)? Once the first two questions are answered, the problem arises of how the information should be presented by the man-model program, and at what moment. What information would be better provided as feedback, and what information needs to be given as feed-forward?

We will use one of the aspects of the preparation phase, the selection of the target group of the product to be designed, to illustrate the above. We found that design students find it difficult to give an exact definition of the target group. Thus the man-model program should not only make the user aware of the importance of a careful selection of the target group but also guide the man-model user through the selection process. One of the ways this may be done is by presenting the man-model user not with a list of available models but with a questionnaire. The man-model user will be asked to specify the target group (sex, age etc.). When the specification is finished, the program not only selects the best-fitting model it has in store, but also presents information about restrictions on the use of this specific man-model.

Selecting a target group is only one of the many aspects of the use of man-models. It will be clear that there is still a long way to go towards the ideal man-model program, a program that offers not only ‘perfect’ man-models but also the guidance the man-model user needs.
Research is needed to determine what information should be presented to the man-model users and how this information has to be presented. A thorough analysis of the way man-model programs should be used (what is the ‘right’ use) and are used (what goes ‘wrong’) is a first step towards the ideal program.
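The questionnaire idea described above (specify the target group, then receive the best-fitting model together with its restrictions) could be sketched roughly as follows. This is not how ADAPS works; the model library, the field names and the matching rule (same sex, smallest age gap) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ManModel:
    """A hypothetical library entry; not an actual ADAPS model definition."""
    name: str
    sex: str
    age_min: int
    age_max: int
    restrictions: list = field(default_factory=list)


def age_gap(model, age):
    """Years by which `age` falls outside the model's age range (0 if inside)."""
    if age < model.age_min:
        return model.age_min - age
    if age > model.age_max:
        return age - model.age_max
    return 0


def select_model(models, sex, age):
    """Pick the closest model for the questionnaire answers and list its restrictions."""
    candidates = [m for m in models if m.sex == sex]
    if not candidates:
        raise ValueError(f"no model available for sex={sex!r}")
    best = min(candidates, key=lambda m: age_gap(m, age))
    notes = list(best.restrictions)
    gap = age_gap(best, age)
    if gap > 0:
        # Warn the user instead of silently substituting a poorly fitting model.
        notes.append(f"target age {age} lies {gap} years outside the model's age range")
    return best, notes


# Hypothetical model library, for illustration only.
library = [
    ManModel("adult female", "female", 20, 60, ["based on a single national survey"]),
    ManModel("adult male", "male", 20, 60, ["based on a single national survey"]),
    ManModel("girl", "female", 6, 12, ["joint ranges of motion copied from adults"]),
]

model, notes = select_model(library, sex="female", age=70)
print(model.name)
for note in notes:
    print("restriction:", note)
```

Returning the restrictions together with the model is the design choice that matters here: the user sees the limitations at the moment of selection (feed-forward) rather than having to discover them afterwards.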
ANTHROPOMETRIC MEASUREMENTS IN ADOLESCENTS LIVING AT AN INTERMEDIATE ALTITUDE: THE RELATIONSHIP BETWEEN HEIGHT, WEIGHT, HEAD CIRCUMFERENCE AND SOCIOECONOMIC STATUS

M.Dursun Kaya1, Hakki Yeşilyurt2, Behzat Özkan3, İlyas Çapoğlu4 & Recep Akdağ5

1Assistant Professor, Computer Science and Research Center
2PhD, MD, Faculty of Medicine, Department of Anatomy
3Assistant Professor, MD, Faculty of Medicine, Department of Pediatric Endocrinology
4Assistant Professor, MD, Faculty of Medicine, Department of Endocrinology
5Professor, MD, Faculty of Medicine, Department of Pediatrics
Ataturk University, Erzurum, Turkey
This study investigated 387 high school students living at an altitude of about 2000 meters. The educational and income levels of their parents were determined, and anthropometric measurements (height, weight and head circumference) were taken. The students were divided into male and female groups, and the effects of their parents’ educational and income levels on the anthropometric measurements were evaluated. In the female group, no relationship was established between their parents’ educational and income levels and the anthropometric measurements. In the male group, the anthropometric measurements increased in parallel with their parents’ levels of education only. It was observed that the income level of the family contributed to the increase in head circumference, weight and height in the male adolescents.

Introduction

Anthropometric measurements such as weight-for-age, height-for-age and head circumference-for-age have been widely used to follow up physical growth in children (e.g. Nebigil et al., 1997). Children living in developing countries under poor socio-economic and hygienic conditions are often exposed to nutritional deprivation (e.g. Jonge et al., 1996). Malnutrition is not very widespread, but marginal protein-energy deficiency affects more than 50% of the children of pre-school age in developing countries (e.g. de Onis et al., 1993). Marginal undernutrition has been shown to result mainly in a retardation of physical growth (e.g.
Frisancho et al., 1980; Spurr et al., 1983; Spurr and Reina, 1989), the affected children exhibiting smaller anthropometric characteristics than their well-fed counterparts of the same age (e.g. Jonge et al., 1996). Altitude is associated with reduced anthropometric measurements (e.g. Yip et al., 1988). In this study, we aimed to determine the anthropometric measurements of adolescents living at intermediate altitude, and to evaluate the effects of their parents’ educational status and income levels on the anthropometric measurements.

Methods

This study was conducted on 387 children aged between 15 and 17 years from high schools (182 females and 205 males). The location of the study was the city of Erzurum, situated at an altitude of approximately 2000 meters in Eastern Turkey. The schools were selected by a stratified sampling method and subdivided on the basis of geographical location and socioeconomic status. From each of these schools, children were randomly selected and measured (e.g. Kayis and Özok, 1991). The educational and income levels of their parents were determined by using a specific questionnaire. The children were examined and the healthy ones were included in the study. Data were collected on the selected anthropometric measurements: weight, height and head circumference. A Harpenden anthropometer, a digital weighing scale and a metal tape were used to obtain the adolescents’ height, weight and head circumference respectively. During the measurements, students were barefoot and dressed in underwear (e.g. Kayis, 1992). The digital weighing scale was calibrated regularly against a spring balance. Anthropometric measurements were taken in the morning (e.g. Yeşilyurt, 1999). The head circumference was measured in the fronto-occipital plane. Subjects were divided into male and female groups.
Each of these groups was divided according to their parents’ educational status (classified as primary, secondary, high school and university) and income levels (classified as low, medium and high, as determined from State Institute of Statistics data). Whether mean anthropometric measurements differed statistically according to age, parents’ educational status and income level was investigated by using one-way analysis of variance (ANOVA, SPSS statistical package). In addition, percentiles and descriptive values of the anthropometric measurements of girls and boys were calculated according to age groups. The pubertal maturation status was evaluated by using the Tanner stages (e.g. Tanner and Whitehouse, 1976).

Results

Descriptive values and percentiles of the anthropometric measurements of girls and boys are shown in Table 1, Table 2 and Table 3, respectively. In the female group, there was no statistical difference between the age groups, their parents’ educational status, their income levels and the means of anthropometric measurements of the subjects (Table 2, Table 4 and Table 6). In the male group, only the mean value of height was statistically different depending on their parents’ level of education (F=2.85; p<0.05; Table 5). The income level of the family contributed to the increased head circumference, weight and height in male adolescents, as shown in Table 7, and the differences were statistically significant (F=3.09, p<0.05; F=3.06, p<0.05; F=7.02, p<0.01, respectively).

Discussion

In this preliminary study, our findings showed that heights and weights of the children aged 15 to 17 years were influenced by intermediate altitude, exhibiting smaller values than their sea-level counterparts of
the same age (Table 8). Our results support the conclusions reached by Yip et al. (1988), whose major implication was that altitude, even at moderate levels, can affect childhood growth. For nutritional surveys or clinical assessments at altitudes above 1500 m, it seems necessary to obtain reference values for children in all age groups. A further study is already underway in the region.

In our study we observed that socioeconomic status affected physical growth in boys, with height and weight measurements increasing between 15 and 17 years. However, this was the case in the boys’ group only. Statistically, there was no difference in the anthropometric measurements of girls in terms of socioeconomic status. It was also observed that the girls seemed to have almost completed their physical growth by fourteen years of age, whereas the physical growth of boys continued after this age. This was partly due to the continuing pubertal development in males and the gradual termination of growth in females in the 15–17 years age group in the region.

References

De Onis, M., Monteiro, C., Akré, J. and Clugston, G. 1993, The worldwide magnitude of protein-energy malnutrition: an overview from the WHO global database on child growth, Bull WHO, 71, 703–712
Frisancho, A.R., Guire, K., Babler, W., Borkan, G. and Way, A. 1980, Nutritional influence on childhood development and genetic control of adolescent growth of Quechuas and Mestizos from the Peruvian lowlands, Am J Phys Anthropol, 52, 367–375
Gönen, E., Kalınkara, V. and Özgen, Ö. 1991, Anthropometry of Turkish women, Applied Ergonomics, 22(6), 409–411
Hauspie, R.C., Vercauteren, M. and Susanne, C. 1997, Secular changes in growth and maturation: an update, Acta Paediatr Suppl, 423, 19–27
Jonge, R., Bedu, M., Fellmann, N., Blonc, S., Spielvogel, H. and Coudert, J. 1996, Effect of anthropometric characteristics and socio-economic status on physical performances of pre-pubertal children living in Bolivia at low altitude, Eur J Appl Physiol, 74, 367–374
Kayis, B. 1992, Steps towards establishing anthropometric data bank in Turkey, In M.Mattila and W.Karwowski (ed.), Computer Applications in Ergonomics, Occupational Safety and Health 1992, (Elsevier Science Publishers B.V.), 181–190
Kayis, B. and Özok, A.F. 1991, Anthropometric survey among Turkish primary school children, Applied Ergonomics, 22, 55–56
Kroemer, K., Kroemer, H. and Kroemer-Elbert, K. 1994, Ergonomics: How to design for ease & efficiency, (Prentice Hall, New Jersey)
Lindsay, R., Feldkamp, M., Harris, D., Robertson, J. and Rallison, M. 1994, Utah growth study: growth standards and the prevalence of growth hormone deficiency, The Journal of Pediatrics, 125(1), 29–35
Nebigil, I., Hizel, S., Tanyer, G., Dallar, Y. and Coskun, T. 1997, Heights and weight of primary school children of different social background in Ankara, Turkey, Journal of Tropical Pediatrics, 43, 297–300
Neyzi, O., Yalçındağ, A. and Alp, H. 1973, Heights and weights of Turkish children, Journal of Tropical Pediatrics, 19, 5–13
Spurr, G.B., Reina, J.C., Dahners, H.W. and Barac-Nieto, M. 1983, Marginal malnutrition in school-aged Colombian boys: functional consequences in maximum exercise, Am J Clin Nutr, 37, 834–847
Spurr, G.B. and Reina, J.C. 1989, Maximum oxygen consumption in marginally malnourished Colombian boys and girls 6–16 years of age, Am J Hum Biol, 1, 11–19
Tanner, J.M. and Whitehouse, R.H. 1976, Clinical longitudinal standards for height, weight, height velocity, weight velocity, and the stages of puberty, Arch. Dis. Child., 51, 170–179
Wright, C.M., Aynsley-Green, A., Tomlinson, P., Ahmed, L. and MacFarlane, J.A. 1992, A comparison of height, weight and head circumference of primary school children living in deprived and non-deprived circumstances, Early Human Development, 31, 157–162
Yip, R., Binkin, N.J. and Trowbridge, F.L. 1988, Altitude and childhood growth, The Journal of Pediatrics, 113, 486–489
Yeşilyurt, H. 1999, 2000 metre yükseklikte yaşayan ilköğretim çağındaki çocukların antropometrik ölçümleri ve değerlendirilmeleri (Anthropometric measurements and assessments in primary school children living at an altitude of 2000 meters), Atatürk Üniversitesi Sağlık Bilimleri Enstitüsü, Erzurum, Turkey, (Yayınlanmamış doktora tezi, unpublished doctoral thesis)

Table 1. Descriptive values regarding the anthropometric measurements and variance analysis of the data from girls according to age groups
Table 2. Descriptive values regarding the anthropometric measurements and variance analysis of the data from boys according to age groups
***: p<0.001
Table 3. Percentile values from the anthropometric measurements of adolescents according to age and sex
* “G” denotes girls and “B” denotes boys.
Table 4. Anthropometric measurements and variance analysis of the data from girls according to parents’ educational status
Table 5. Anthropometric measurements and variance analysis of the data from boys according to parents’ educational status
*: p<0.05
Table 6. Anthropometric measurements and variance analysis of the data from girls by family income
Table 7. Anthropometric measurements and variance analysis of the data from boys by family income
*: p<0.05, **: p<0.01
Table 8. Mean height and weight of girls and boys in Erzurum compared with their sea-level counterparts in Turkey
a: Neyzi, O., Yalçındağ, A. and Alp, H. 1973, Heights and weights of Turkish children, Journal of Tropical Pediatrics, 19, 5–13
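The one-way analysis of variance used in the Methods of this paper compares the variation between group means with the variation within groups. The sketch below computes the F statistic in plain Python to show the arithmetic; it is not the SPSS procedure the authors ran, and the example numbers are made up rather than taken from the study's tables.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their own group mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up heights (cm) for three income levels; NOT data from the study.
low_income = [165.0, 168.0, 170.0, 166.0]
medium_income = [167.0, 169.0, 171.0, 170.0]
high_income = [170.0, 172.0, 174.0, 171.0]

f_value, df_b, df_w = one_way_anova([low_income, medium_income, high_income])
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

Turning F into a p-value requires the F distribution; a statistics package (for example `scipy.stats.f_oneway`) would normally be used for that step.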
RELATIONSHIP OF UPPER LIMB POSTURES TO ANTHROPOMETRIC VARIABLES

L.W.O’Sullivan & T.J.Gallwey

Ergonomics Research Center, University of Limerick, Plassey Technological Park, Limerick, Ireland
This study investigated inter-individual differences in the induced joint angles at the elbow and shoulder for some industrial tasks. The first part of the study involved a survey of the upper limb postures in an electronics company and two metal fabrication companies. The mean elbow and shoulder flexion angles and the Coefficient of Variation (COV) values for tasks were obtained. For the second part of the study, ten subjects completed a laboratory simulation of the tasks in the electronics company. Significant regression equations (p<0.05 to p<0.001), which used on average between five and seven anthropometric dimensions, were computed to explain differences in elbow and shoulder joint angles for each of the 17 task elements. Some dimensions were significant in a number of the equations, e.g. stature was contained in 71% of the elbow equations. Two of the dimensions, elbow-hand length and arm length, were not significantly related to either the elbow or shoulder joint angles.

Introduction

Predicting joint postures for virtual workstations at the pre-production planning stage of workplace design can be used to prevent Work-related Musculoskeletal Disorders (WMSDs). Landau (1999) discussed the importance of determining body angles in Man-Modelling Procedures and how workplace evaluations must indicate whether or not the task can be completed by the person. This is supported by the view that for a computer man-model to be truly useful, it must be closely integrated with an accurate posture prediction model (Jung and Choe, 1996). Some man-models, given detailed task descriptions, can generate induced joint angles through human movement simulation at virtual workstations, e.g. ERGOMAN (1997). Caution is needed, however, as the predicted joint angles are often generated from “average” data values.
As demonstrated by O’Sullivan and Gallwey (1999) the power of an evaluation can be diminished due to large errors when average data values are used to represent a variety of users. Individual differences have been promoted by ergonomists as an essential concern when planning workplaces to accommodate anthropometric features and when completing biomechanical evaluations of industrial tasks. However, little information appears to be documented on the effects of varying anthropometrics on the induced joint angles of the upper limbs for tasks. Sengupta (1995) related a number of body dimensions to normal and maximum reach envelopes. Their equations clearly identify the variation in reach envelopes as a function of individual differences. It follows that the induced joint angles for the upper limb would also vary between different sizes of people for a particular workstation.
It would be beneficial to quantify inter-individual differences in induced joint angles for industrial tasks and to investigate whether the differences between people can be explained by the differences in anthropometric dimensions.

Method

Industrial postures

The first part of the study involved collecting data on upper limb postures in industry. Participants were one electronics company and two metal fabrication (job shop) companies. The six tasks analysed in the electronics company involved the complete assembly of on average 1000 components for the automobile industry per day. The second company, Job shop A, processed components of various sizes from stainless steel, mainly for use in clean room environments. Job shop B manufactured industrial furniture for assembly plants. Video recordings were made of operators in each of the companies using a Panasonic AG455 video camera while they completed their work. The joint angle data for the elbow and shoulder were measured manually from the video recordings using a goniometer.

Laboratory experiment

The second part of the study involved analysing the differences in joint angles between people in detail by replicating some of the industrial tasks in the laboratory under controlled conditions, so that detailed joint angle measurements could be made. It was decided to simulate the tasks in the electronics company as the tasks were repetitive and therefore easier to control. Ten subjects completed the experiment, five female and five male. The average age was 23.5 years and all were right handed. The simulated task consisted of a 3-pin plug assembly. The subjects completed a total of 15 plug assemblies, the first five being practice runs. Each of the eight components of the plugs was positioned in a bin on an arc of 300 mm on the table surface. A jig was positioned in front of the subject for holding the components during the assembly. The table surface was set at 790 mm and the seat height at 600 mm.
Subjects were positioned on a chair with 25 mm clearance between their abdomen and the bins at the front of the table. The task was completed with the right hand only. A Penny and Giles Biometrics electrogoniometer (model XM 110) was used to measure elbow flexion. The signals were amplified and passed through a 16-bit analog-digital converter. LabVIEW software (National Instruments Corp., Austin, Texas) and a 330 MHz PC were used to collect the signals from the goniometers. Ten anthropometric measurements were recorded on each subject, i.e. stature, body mass, bi-deltoid width, elbow-hand length, elbow-shoulder distance, upper extremity length (arm length), abdominal depth, chest depth, seated stature and lumbar-table distance. Five of these were used previously by Sengupta (1995) for predicting reach envelopes.

Results

Industrial data

The cycle time for the tasks in the electronics assembly company ranged from 8 seconds to 19 seconds. The tasks in both job shops involved cycle times between 12 seconds and 28 minutes.
Table 1. Average elbow and shoulder joint angles for the industrial tasks
As shown in Table 1, the elbow flexion results in the electronics company were similar between tasks (mean 18°, COV 0.12). This was not so for the shoulder joint angles, as the COV value was almost double that of the elbow (COV 0.21). Table 2 indicates that, for the job shops, the mean elbow flexion values ranged between 61° and 136° while the shoulder flexion values ranged between 16° and 61°.

Simulated assembly task

As shown in Table 3, the induced joint angles varied considerably between the subjects for each of the task elements, as described by the COV values. For example, the mean elbow angle for task element 1 (picking up the base) had a COV value of 0.24. The average elbow joint angle for the elements ranged from 42° to 62° while the COV values ranged from 0.06 to 0.30. The mean shoulder angles ranged from −4° to 79°. The COV values ranged between 0.40 and 2.64.

Regression analysis of the joint angle data was used to establish the relationships between anthropometry and the variation in joint angles between subjects. The regression analysis was run using backward elimination of variables in SPSS V9. The highest adjusted R² value was used as the criterion for selecting equations from the statistical analysis. Regression equations for both sets of joint angles (i.e. elbow and shoulder) were computed separately for each of the 17 task elements using the anthropometric dimensions collected on each subject. For the elbow, the lowest R² value was 0.894 (p<0.05) and the next lowest was 0.934 (p<0.01). The remainder of the R² values ranged from 0.95 to 0.999 (all significant at p<0.01). For the shoulder, the R² values ranged from 0.889 (p<0.01) to 0.967 (p<0.05) for three of the task elements. The remainder of the R² values ranged from 0.981 to 0.999 (all significant at p<0.01).
Table 4 (ordered by elbow proportion values) shows the percentage of equations for which each of the anthropometric variables was significant (p<0.05 or better) in explaining the between-subject differences. Two of the variables, elbow-hand length and arm length, were not significant in any of the equations for either the shoulder or elbow.
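The COV values reported above are the ratio of the standard deviation to the mean of the measured angles. A minimal sketch of the calculation follows; it assumes the sample standard deviation and takes the absolute value of the mean (both are assumptions, as the paper does not state its exact formula), and the angle values are invented for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean.

    Assumption: sample (n-1) standard deviation; the paper does not say
    which variant was used.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("COV is undefined when the mean is zero")
    return stdev(values) / abs(m)


# Invented elbow flexion angles (degrees) for one task element; NOT study data.
elbow_angles = [48.0, 52.0, 61.0, 44.0, 55.0]
print(f"COV = {coefficient_of_variation(elbow_angles):.2f}")
```

Because the mean is in the denominator, a mean angle close to zero inflates the ratio, which is worth keeping in mind when reading COV values for joint angles that are measured around a neutral posture.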
Table 2. Summary statistics for elbow and shoulder joint angle data for both Job Shops
* Tasks were completed at a machine
Table 3. Summary statistics for the elbow and shoulder for the simulated task
Table 4. Percentage of the seventeen task elements for which each variable was significant
Discussion

Industrial data

The repetitive nature of the tasks in the electronics company is reflected in both the short cycle times and the generally low COV values. The tasks in both job shops can similarly be separated into repetitive and non-repetitive tasks using the COV values. For example, sand-blasting, deburring and guillotine work all had low COV values for the elbow, i.e. 0.04, 0.02 and 0.02 respectively, while the non-repetitive long cycle tasks largely had high COV values for the elbow, e.g. spraying (COV 0.34, 28 minute cycle).

Simulated task

Even though the mean joint angle data for the simulated task were not the same as the data for the electronics industry, it is suggested that the simulated task was similar to a real industrial task. There is a difference of 28° in the average elbow flexion values but the COV values are similar. The mean and COV values were similar for the shoulder angles with the exception of a small number of tasks.

There was considerable variation in the joint angle data between the subjects. The regression analysis indicates that the individual differences were very strongly related to the anthropometric variables. Elbow-shoulder distance was significant in a number of equations, more notably for the shoulder flexion angles. However, it was surprising to note that elbow-hand length was not significant in any of the equations, as this was one of the variables used by Sengupta (1995) to predict reach envelopes. In general, the equations consisted of a mix of both upper-limb dimensions and gross body dimensions, e.g. body mass. This indicates that gross body features can affect induced joint angles considerably and therefore should be included in posture prediction models for the upper limb.

Conclusions

1. Coefficient of Variation (COV) values can assist in identifying repetitive tasks.
2. The simulated task did not induce joint angles exactly the same as those in the electronics company, but it was representative of an industrial task.
3. Anthropometric dimensions explained a lot of the differences in the joint angles between subjects at highly significant levels.
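The Results mention that the highest adjusted R² was the criterion for selecting regression equations. As a hedged illustration of why the adjustment matters when the number of predictors is large relative to the number of subjects, the sketch below applies the standard adjusted R² formula. The subject count (10) and predictor counts (5 and 7) come from the text; the raw R² of 0.90 is a hypothetical value, not one reported by the authors.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    residual_df = n_observations - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / residual_df


# Hypothetical raw R^2 of 0.90 with ten subjects, as in the laboratory experiment.
for k in (5, 7):
    adj = adjusted_r_squared(0.90, n_observations=10, n_predictors=k)
    print(f"{k} predictors: adjusted R^2 = {adj:.3f}")
```

The penalty grows quickly as the residual degrees of freedom shrink, which is why an adjusted value is a more cautious selection criterion than the raw R² in small samples.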
Acknowledgements

The research in this paper is part of the BRITE-EURAM III Project BE96–3568 IDEA funded by the European Union.

References

Anon. 1997, ERGOPLAN Version 4.0, Delta Industrie Informatik GmbH, Schaflandstraße 2, D-70736 Fellbach, Germany.
Jung, E.S. and Choe, J., 1996, Human reach posture prediction based on psychophysical discomfort, International Journal of Industrial Ergonomics, 18, 173–179.
Landau, K., 1999, Introduction to the Man-Modelling job design Procedure, In Proceedings of the International Conference on Computer-Aided Ergonomics and Safety, May 19th–21st, Barcelona, Edited by Mondello, P., Mattila, M. and Karwowski, W.
O’Sullivan, L.W. and Gallwey, T.J., 1999, Wrist posture evaluations using individual range of motion values V’s averages, In Proceedings of the 15th International Conference on Production Research, August 9th–12th, University of Limerick, Ireland, 1059–1062.
Sengupta, A.K., 1995, Anthropometric Modelling and Evaluation of Workspace for Industrial Workstation Design, Ph.D. Dissertation, Technical University of Nova Scotia.
Cockpit design
USABILITY TESTING OF A USER INTERFACE FOR AIRCRAFT TAXI GUIDANCE

T.J.J.Bos1, H.Kanis, A.J.C. de Reus1 & W.S.Green

Technical University Delft, 1National Aerospace Laboratory NLR
The Netherlands
In the design of a stand-alone route guidance system for taxiing in airports, two successive user trials were carried out with commercial airline pilots. Despite a thorough analysis of tasks, users and context of use of the new product, the first user trial provided new in-depth insight into its operational use. The trial was interrupted half-way through in order to perform a design iteration. This decision led to a shift in the emphasis of the design effort and to a more efficient product, as was demonstrated by the second trial. This project illustrates the importance of user trials in identifying difficulties both in operating the product itself and in its usage in its intended environment. By continuously taking in behavioural and expressed user reactions, a user trial can and perhaps should be cut short once enough interaction information has been collected.

Introduction

The increase in air traffic is especially noticeable in airports. Where runway capacity used to be the bottleneck in the air traffic flow, today it is the capacity of the taxiways. Pilots are confronted with the ever-changing layout of airports, increasing in size and complexity, with higher traffic density. Other temporary circumstances, such as low visibility conditions, cause delays and add to the workload. The purpose of this graduation project was to design a stand-alone taxi guidance system. The work was performed as a co-operation between the Man-Machine Integration Department of the National Aerospace Laboratory (NLR) in Amsterdam and the Department of Industrial Design of the Technical University of Delft.

Design mission

The taxi guidance system should primarily provide route guidance to the flight crew in order to facilitate the taxi task, giving more comfort while keeping up the taxi speed. As a secondary objective it should provide global awareness. The system was dubbed Taxi Assistant.
The Taxi Assistant was designed as a stand-alone system, consisting of a Differential Global Positioning System receiver and airport maps. Other traffic is not displayed and taxi clearances provided by ground control will, after being entered by the flight crew, provide route information. The assistant was designed as a portable system, suitable for retrofit. For the user
interface of the Taxi Assistant, Schiphol airport was used as the example. The design method applied was the ISO Standard on human-centred design processes for interactive systems (ISO, 1997). In this method the involvement of intended users and the iteration of design solutions are important principles, together with a thorough analysis of the users, tasks and environment preceding the actual design process.

Concept 1

Visual guidance was presented on a display, and a combination of soft keys and hard keys was used to operate the Taxi Assistant. Underneath the display, a group of keys is presented with the following functions: OK, cancel (c), up, down, left and right. Two more keys are present for navigating to other modes. The different modes are visualised in Figures 1 to 3. All fields reachable with the cursor are presented as outlined frames. ‘Taxi’ indicates that the taxi application is currently running and ‘SPL’ is an abbreviation for Schiphol airport. Both can be reached with the cursor, in order to change the application or to look at the layout of a freely selected airport.
Figure 1. Input mode
Figure 2. Guidance mode
Figure 3. View mode
In Figure 1, the destination, here a runway (19L), is named in the clearance first and presented at the top. Second is the route identification, here the outer taxiway. ‘D44’ is the gate number at which the aircraft is currently located. This order is consistent with the track-up orientation in the guidance mode. Guidance is presented in graphical representations of the route (Figure 2), in a track-up orientation, with a bar at the side to indicate roughly the distance to the next turn. Speed (15 kts.) and the estimated time to the destination (3:15) are also presented. Audio guidance is optional. Identifications of the taxiways and intersections are presented (outer, L6). Figure 3 presents the overview of the route, which has three zoom levels. The map is presented in a north-up orientation. ‘D’ stands for the identification of the apron.

User trial 1

Method

Three airline pilots (of the six originally planned) participated in the evaluation, preceded by two pilot trials with general aviation pilots. None of them had seen the design before, but the airline pilots were very familiar with Schiphol airport. As a test environment a mock-up of a cockpit was used (Figure 4). A software prototype of the device was displayed on a touchscreen on which the hard keys of the design were simulated by soft keys. The pilots were to enter a total of 16 taxi clearances, provided by simulated radio telephony. Before every clearance a situation was described, serving as the setting for the task. During the first ten clearances the pilots were invited to think aloud (Rooden, 1998) about their perceptions, their expectations and their considerations of what they were doing. The last set of clearances had to be entered as quickly as possible. The time taken to enter a clearance was measured, in order to get an indication of the efficiency, as well as to start a discussion on the acceptable time. Pen and paper were present in case the pilot needed to write down the clearance.
The user trial was video recorded. At the end, questions were asked concerning the use of the device in real operations.
Figure 4. Research setting, mock-up of cockpit
Results

The navigation to different modes and to the index display presenting other applications was not obvious to all participants. Some of the use cues, textual presentations of the functions, were ambiguous. For example, the ‘taxi’-frame, indicating that the taxi application is the program presently running, was once mistakenly activated in order to start taxiing. These usage problems raised the question of whether the information density could be decreased and whether cues could be introduced to provide more commonality between the modes. In certain circumstances the participants did not distinguish between the ‘>’ and the ‘OK’ key, which was purposely designed this way. This led to ideas on how to increase the efficiency in entering the route, as well as the awareness that the major design effort had gone into intelligibility for the first-time user.

One of the design objectives was that a clearance could be entered as fast as it is written down. Surprisingly, the airline pilots first verified that the cleared route was possible, either mentally or on a map, before reading back the clearance. The prototype did not provide this facility before the clearance was entered. Most often the clearance was written down, verified and read back, after which it was entered, lengthening the overall input time of the clearance considerably. The participants considered the acceptable input time to be quite critical when taxiing in. In difficult circumstances, however, such as low visibility conditions, they would be happy to take the time. One pilot mentioned he would be very happy to just be able to trace the current position after landing!

After three of the six airline pilots planned for the user trial had participated, it was decided to stop the trial for a design iteration, because inspiration on possible improvements had been gained. As radical design changes were anticipated, continuation of the first trial did not seem useful.
Concept 2

One of the major design alterations made in the second concept was a reduction of information by application of other use cues. Pictograms were introduced for closing the application, and others to link the different modes, by presenting destination, route and current position as coloured shapes in all three modes. For entering clearances, possibilities for including some specifications were added. The mode keys are now located directly next to each other (see Figures 5 to 7). The left key is used to get in and out of the input mode and the right key to toggle between guide and view mode. In the input mode the left key’s nominated function is ‘enter’; in the other modes the clearance is located in the bar above the key. ‘Guide’ or ‘View’ is presented
above the other key and there is no indication of the present mode in order to reduce the density of information.
Figure 5. Input mode, concept 2
Figure 6. Guidance mode, concept 2
Figure 7. View mode, concept 2
The taxiway menu and the link menu are now combined. Also the possibility of using the exit and entry menu is visualised. By presenting the menus together the user can quickly switch from one menu to the
other by pressing the arrow keys pointing to the left or the right. This saves one key-stroke per item, and the information on the display does not change all the time, which should give the user the feeling of having full control. All fields reachable with the cursor are presented as outlined frames. In the second concept the entered items cannot be reached with the arrow keys but only with the ‘OK’ key, which is a consequence of the change in the via-menu presentation. The second prototype did present its current position before the clearance was entered. Unfortunately this was not implemented soon enough before the second user trial: only one of the scenarios provided it, so verification of the route using the global awareness mode could not be tested.

User trial 2

Method

Three different airline pilots participated in the user trial, preceded by a pilot trial with one general aviation pilot. A similar set of clearances and the same set-up were used in the trials.

Results

Now the word ‘enter’ was understood immediately, although one of the four participants (including the pilot trial) thought ‘enter’ had to be selected with the cursor. The clearance in the bar above the ‘enter’ soft key should indicate that this key is to be pressed to access the clearance again. The word ‘enter’ is not visible in the other modes, because it does not have the same meaning there (see Figures 5 to 7). This was not immediately clear to all participants, but they all managed to find it out by themselves. The effect of the information reduction was hard to measure in this small sample. From the opinions expressed by the participants, it was concluded that the combined route menu required more thought in the first tasks before it was understood. The time taken to enter a clearance still did not meet its objective, but the overall opinion was that it was reasonably short. The participants found it worthwhile, especially in difficult circumstances.
Suggestions were made that, after writing down the clearance, other instruments could be set before subsequently entering the clearance. One pilot complained about the order of the clearance items. In the FMS (Flight Management System) the order happens to be ‘origin’, ‘route’ and ‘destination’, from top to bottom. This was the first time a participant compared the taxi device to this aircraft system, which again underlined the complexity of the intended operational environment.

Discussion

In the design of innovative products many uncertainties must be overcome. For products that are intended to support tasks that are currently being carried out by the user, thorough analysis of the current situation is necessary. Especially when designing for specialist users, in this case airline pilots, gaining full insight into the tasks is complicated. This project stresses the importance of an evaluation of the design by the intended user at as early a stage as possible. Valuable behavioural and expressed information collected from user trials can, at any point during the trials, trigger the question of whether continuation of the trials is useful. This requires a designer’s viewpoint, as solutions are to be anticipated, as well as an open-minded attitude towards the participants’ feedback. The project demonstrates the benefit of concurrent monitoring
of the observations in a user trial, in order to be able to stop when appropriate, instead of rigid adherence to the completion of a predetermined schedule.

References

Rooden, M.J. 1998, Thinking about thinking aloud. In M.Hanson (ed.) Contemporary Ergonomics (Taylor & Francis), 328–332
ISO 13407, 1997, Human centred design processes for interactive systems, ISO, Geneva
THE COGNITIVE COCKPIT: OPERATIONAL REQUIREMENT AND TECHNICAL CHALLENGE
R.M.Taylor1, H.Howells2 & Sqn. Ldr. D.Watson RAF3
1Centre for Human Sciences, Manpower Integration Department, DERA Farnborough, Farnborough GU14 0LX, UK
2Aircraft Sector, Systems Integration Department, DERA Farnborough, Farnborough GU14 0LX, UK
3RM SANS, Ministry of Defence, Main Building, Whitehall, London SW1A 2HB, UK
MOD has established a programme of research on the development and application of cockpit adaptive automation and decision support. This paper describes the background and scope of the resultant DERA Cognitive Cockpit project. It outlines the operational requirement and describes the technical approach, with inputs from human sciences, computing, and cognitive systems engineering.

Introduction

Advances in computing, coupled with the need for increased cost-effectiveness, mission effectiveness, and safety in military aircraft operations, have increased the need for more intelligent adaptive automation and decision support systems. Eurofighter will enter service in an air defence role with conventional cockpit automation. The pilot’s role involves cognitive decision-making tasks of situation assessment and mission management. Upgrading Eurofighter with intelligent pilot aiding will need full justification. For the future, to 2030 at least, there will be a role for manned platforms in future European air forces. Manned aircraft are expected to be a key part of a Future Offensive Air System (FOAS) working with uninhabited air vehicles (UAV). Levels of automation are increasing, but aircrew involvement in decision-making will continue to warrant consideration of intelligent aiding, such as for assisting weapons deployment and helping eliminate casualties, fratricide and collateral damage (Cafferky, 1999). To assist future pilots with primarily cognitive tasks, it is believed that technology is needed for automated decision support that is adaptive or ‘context-sensitive’, so as to be responsive to changing mission requirements, in particular for in-flight situation assessment and re-planning. Technology also needs to be considered that is influenced by the aircrew’s physiological and behavioural state, adaptively responding to an individual’s indications of overload, distraction and incapacitation.
Implementation will need to be based on sound human factors (HF) cognitive engineering principles, keeping the aircrew in control of the system, rather than the system controlling the aircrew. DERA has been tasked with developing a “cognitive cockpit”. This is to allow the FOAS pilot, either airborne or on the ground controlling a UAV, “to concentrate his skills towards the relevant critical mission event, at the appropriate time, to the appropriate level”. The resultant DERA Cognitive Cockpit (COGPIT) programme provides HF research on intelligent aiding systems in which the relationship between the pilot and the system is flexible and context dependent. This flexibility is derived from a functional architecture that couples on-line mission analysis with on-line monitoring of the pilot’s functional state, deriving
information to mediate the timing, saliency and autonomy of the aiding. The potential system benefits include the following:
• Real-time pilot functional state assessment for cockpit task adaptation
• Real-time support for situation assessment, task prioritisation and decision making
• Real-time idiosyncratic and bespoke cockpit ergonomics
• Real-time safety net, with potential to recover to base an incapacitated pilot.

Background
Historically, the aircraft pilot and cockpit systems have had a master-slave relationship, with full pilot authority for aircraft control functions. This relationship changed with the introduction of computer control technology, with the pilot acquiring systems monitoring and supervisory roles. In the late 1970’s, ideas arose for more intelligent cockpit systems, with an interactive and synergistic pilot-system relationship (Reising 1979). The crew-adaptive cockpit proposed sensors for monitoring the pilot’s state, artificial intelligence (AI) software enabling the computer to learn, and pictorial displays allowing efficient presentation of cockpit information. This developed into a form of ‘R2D2’ intelligent agent co-operating with the pilot as a Human-Electronic Crewmember (HEC) team. Developments in advanced computer technology now make this realisable, including real-time data acquisition, fusion and processing, and computer modelling and AI inferencing techniques, such as expert systems, knowledge-based systems (KBS) and neural nets (Taylor and Reising, 1999). Beginning with the USAF/DARPA’s Pilot’s Associate (PA) programme (1985–1992), expert systems showed the potential of AI to support the pilot’s problem analysis and solution generation. PA research identified HF issues of adaptive automation, dynamic function allocation, levels of system autonomy and trust, and introduced goal-plan tracking for inferencing pilot intent. The USAF SBIR Hazard Monitor provided a real-time KBS for supporting system malfunction management in transport aircraft. Now, the US Army’s Rotorcraft PA provides a Cognitive Decision Aiding System and Cockpit Information Manager. In Europe in the 1990’s, AI efforts on pilot aiding have centred on the French “Co-pilote Electronique” (CE), and on the German civil and military Cockpit Assistant Systems. In contrast to PA, the CE program focused on AI support for problem recognition and situation assessment. 
The German CASSY project provided flight test of flight management KBS for re-routing of civil aircraft. Situation assessment modules provided perception, diagnosis, decisions and communication management, with pilot intent and error recognition functions. CAMMA has extended this application of KBS to military missions. In the UK in the late 1980’s, the joint Industry/MOD Mission Management Aid (MMA) project applied conventional computer techniques to sensor fusion, situation assessment and dynamic planning. Using deterministic, rule-based, event driven logic MMA found positioning (re-routing) and EW functions more difficult to assist and automate reliably than fuel and time management tasks. Lessons-learnt have been applied to Eurofighter to reduce pilot workload. MMA identified the need for context-sensitive prioritisation of interrupts in high workload phases. Subsequently, industry research has used AI model-based reasoning with multiple-goals to provide context-sensitive prioritisation for intelligent warning systems for civil cockpit applications. Applying KBS to safety critical functions poses certification problems. At Farnborough in the 1990’s, MOD Navy has sponsored AI research by DERA Aircraft Sector (AS) focusing on KBS for aiding aircrew mission decision making in helicopter anti-surface warfare and airborne early warning. This has led to development of real-time multi-agent KBS software, and new methodologies for knowledge acquisition and management. Other MOD RAF sponsored HF research at DERA CHS on
adaptive automation and decision support, in collaboration with Sweden FOA, identified the need for a ‘cognitive cockpit’ approach based on principles for control of cognitive systems (Taylor, 1997). Coupled with the confidence-building DERA KBS work, and encouraging research on monitoring cognitive load, this led in 1998 to an enhanced MOD RAF HF program on automated decision support, aimed at influencing the FOAS cockpit (Banbury et al. 1999).

Technical Challenges

The COGPIT program has to consider novel, high-risk technical options for automated decision support. The system should select and digest information for consideration; adapt the level of automation to the tasks being performed, the operational context and the crew responses; and facilitate HEC teamwork. The work focuses on advanced technologies for pilot state monitoring and KBS situation assessment and decision-aiding, with ideas for tasking interfaces coupling requirements for context-sensitivity and control in cognitive systems (e.g. Hollnagel, 1997). Significant technical challenges include:
• Improving adaptiveness without increases in workload or unpredictable automation.
• Providing real-time, context-sensitive aiding with sufficient accuracy and precision to be useful and trustworthy, with tracking of the operator’s goals and plans to infer intent.
• Building an integrated KBS for prioritising pilot tasks and aiding decisions.
• Supporting adaptiveness in skill, rule and knowledge-based levels of performance, critiquing performance, preventing cognitive bias, and aiding error rectification.
• Providing useful functional state information for task adaptations and interruptions.
• Providing quantitative, scientific assessment of a broad set of aiding options using measures of effectiveness based on mission task performance.
• Providing a blend of automation levels and pilot cognitive control strategies with:
  ➢ Pilot executive authority for controlling the system
  ➢ Stability through feed-back (reactive) and feed-forward (proactive) control
  ➢ Strategically planful pilot control at the knowledge-based level (feed-forward)
  ➢ Automation of reactive skill and rule-based responses (feed-back).
• Focusing system design on functional purpose, control allocation and information utility using cognitive work analysis methodologies.
COGPIT Agents Functional Architecture

The COGPIT systems under development involve the interacting agents and communications shown in Figure 1, with the functionality summarised in Table 1.

Cognition Monitor (COGMON): This module is concerned with on-line analysis of the psychological, physiological and behavioural state of the pilot. Primary system functions include continuous monitoring of workload, and inferences about current attentional focus, ongoing cognition and intentions. It also seeks to detect dangerously high and low levels of arousal. Overall, this system provides information about the objective and subjective state of the pilot within a mission context. This information is used in order to optimise pilot performance and safety, and provides a basis for the implementation of pilot aiding (see Pleydell-Pearce and Dickson, 2000).
Situation Assessment Support System (SASS): This module is concerned with on-line mission analysis, aiding and support provided by real-time, multi-agent KBS software. This system is privy to the current mission, aircraft (e.g. heading, altitude and threat) and environmental status, and is also invested with extensive a priori tactical, operational and situational knowledge. Overall, this system provides information about the objective state of the aircraft within a mission context, and uses extensive KBS to aid and support pilot decisions (see Shadbolt et al, 2000).

Tasking Interface Manager (TIM): This module is concerned with on-line analysis of higher-order outputs from COGMON and SASS, and other aircraft systems. A central function for this system is maximisation of the goodness of fit between aircraft status, ‘pilot-state’ and tactical assessments provided by the SASS. These integrative functions enable this system to influence the prioritisation of tasks and, at a logical level, to determine the means by which pilot information is communicated. Overall, this system allows pilots to manage their interaction with the cockpit automation, by context-sensitive control over the allocation of tasks to the automated systems (see Bonner et al 2000).

COGPIT Simulation Test Environment (COGSIM): COGSIM is concerned with the specification and provision of a proof-of-concept simulation test environment for pilot aiding, including the form and function of a cockpit in which the COGPIT modules will be implemented, tested and validated. It will use aiding taxonomies and existing HF analysis methods and human-computer interaction guidelines. Computer application tools are used for prototyping, simulation and scenario management (VAPS, VEGA, Stage).

FY00/01 COGPIT Programme Status

Work to date has provided mission and human system analyses, functional specifications, knowledge documents and some initial development of the COGPIT modules.
A baseline conventional ‘EF22’ cockpit has been built, with initial scenario scripting for a partial prototype proof-of-concept demonstration. Future work in FY00/01 will extend the system functionality and scenarios, provide integration of sub-systems for evaluating candidate cockpit options, and consider wider applications. A programme of empirical validation testing is planned, with collaboration on assessment methodologies from the USAF Adaptive Interfaces program (see Vidulich and McMillan, 2000).
Figure 1. COGPIT Agents
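The interaction between the three agents can be illustrated with a minimal sketch. It is not the COGPIT implementation: the class names, fields, thresholds and the three autonomy levels are all hypothetical, chosen only to show how a TIM-like component might combine a COGMON-style pilot state with a SASS-style recommendation while keeping the pilot's chosen ceiling on automation, except when the pilot is incapacitated (the safety-net case).

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    """Hypothetical levels of aiding, lowest to highest."""
    ADVISE = 1    # system only suggests; pilot performs the task
    ASSIST = 2    # system prepares the action; pilot confirms
    AUTOMATE = 3  # system performs the task and reports


@dataclass
class PilotState:
    """Stand-in for a COGMON output (fields are assumptions)."""
    workload: float       # 0.0 (idle) to 1.0 (overloaded)
    incapacitated: bool


@dataclass
class Recommendation:
    """Stand-in for a SASS output (fields are assumptions)."""
    task: str
    urgency: float        # 0.0 (routine) to 1.0 (critical)


def tim_allocate(state: PilotState, rec: Recommendation,
                 pilot_ceiling: Autonomy) -> Autonomy:
    """Choose an autonomy level for one recommended task.

    The thresholds 0.8 and 0.5 are illustrative only.
    """
    if state.incapacitated:
        # Safety net: the only case where the ceiling is overridden.
        return Autonomy.AUTOMATE
    if state.workload > 0.8 and rec.urgency > 0.5:
        wanted = Autonomy.AUTOMATE
    elif state.workload > 0.5:
        wanted = Autonomy.ASSIST
    else:
        wanted = Autonomy.ADVISE
    # Pilot executive authority: never exceed the pilot's ceiling.
    if wanted.value > pilot_ceiling.value:
        return pilot_ceiling
    return wanted
```

In this sketch the pilot's ceiling stands in for the "pilot executive authority" requirement; a real tasking interface would presumably let the pilot set it per task and per mission phase.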
Table 1. Summarised COGPIT Functional Decomposition
References

Cafferky, P. 1999, Tomorrow’s Cockpit, Friend or Foe? Air Clues, 1, 1999, 34–37.
Reising, J. 1979, The crew adaptive cockpit: Firefox here we come. In Proceedings of the 3rd Digital Avionics conference, Fort Worth, Texas.
Taylor, R.M. and Reising, J. 1998, The human-electronic crew: Human-computer collaborative team working. In RTO Meeting Proceedings 4, (NATO Advisory Group for Aerospace Research and Development, Neuilly sur Seine CEDEX).
Taylor, R.M. 1997, Human electronic crew teamwork: Cognitive requirements for compatibility and control with dynamic function allocation. In M.J.Smith et al (eds), Design of Human-computer Systems, (Elsevier: Amsterdam), 21B, 247–250.
Banbury, S., Bonner, M., Dickson, B., Howells, H. and Taylor, R.M. 1999, Application of adaptive automation in FOAS (Manned Option). DERA/CHS/MID/CR990196/1.0
Hollnagel, E. 1997, Control versus dependence: Striking the balance in function allocation. In M.J.Smith et al (eds), Design of Human-computer Systems, (Elsevier: Amsterdam), 21B, 243–246.
Pleydell-Pearce, K. and Dickson, B. 2000, Cognition Monitor: A System for Real-time Functional State Assessment. In M.A.Hanson et al (eds.) Contemporary Ergonomics 2000 (Taylor and Francis, London).
Shadbolt, N.R., Tennison, J., Milton, N. and Howells, H. 2000, Situation Assessor Support System: A Knowledge-Based Systems Approach to Pilot Aiding. In M.A.Hanson et al (eds.) Contemporary Ergonomics 2000 (Taylor and Francis, London).
Bonner, M., Taylor, R.M. and Miller, C. 2000, Tasking Interface Manager: Affording Pilot Control of Adaptive Automation and Aiding. In M.A.Hanson et al (eds.) Contemporary Ergonomics 2000 (Taylor and Francis, London).
Vidulich, M. and McMillan, G. 2000, The Global Implicit Measure: Evaluation of Metrics for Cockpit Adaptation. In M.A.Hanson et al (eds.) Contemporary Ergonomics 2000 (Taylor and Francis, London).
© British Crown Copyright 2000/DERA Published with the permission of the Controller of Her Majesty’s Stationery Office
SITUATION ASSESSOR SUPPORT SYSTEM: A KNOWLEDGE-BASED SYSTEMS APPROACH TO PILOT AIDING
N.R.Shadbolt1, J.Tennison2, N.Milton2 & H.Howells3
1Department of Electronics and Computer Science, University of Southampton, Highfield, Southampton SO17 1BJ, UK
2Epistemics Ltd, Strelley Hall, Nottingham NG8 6PE, UK
3DERA Farnborough, Farnborough GU14 0LX, UK
The Situation Assessment Support System (SASS) seeks to demonstrate a knowledge-based subsystem that will provide a dynamic assessment of the operational context and generate recommendations to support COGPIT tactical decision making. Extensive knowledge acquisition and validation have been undertaken with appropriate experts over four vignettes, leading to the production of the knowledge base document. This encapsulates all relevant expertise, for integration and aiding pilot tactical decision making in the proposed COGPIT simulation test environment. Prioritised areas of support agreed with MOD focus on plan assessment, system-health checks, DAS, rerouting and target-attack vignettes. A series of knowledge-acquisition sessions was conducted to build the knowledge base with the involvement of RAF and RN aircrew. The individual task decompositions and detailed knowledge captured during this phase provide the basis for future architectural and software-design processes. By exploiting software and toolkits developed under MOD CRP funding, the work seeks to define, design and construct a decision support sub-system prototype to operate in scenarios associated with FOAS and FCBA, using Real Time Multi Agent Software. This paper will describe the concept of operation and technical development of SASS.

Introduction

Knowledge-based decision support systems are becoming a recognised technology in the defence industry, with situational assessment and awareness recognised as a key capability in military decision-support systems. This paper describes the current state of development for the knowledge-based system component of the Cognitive Cockpit (COGPIT) programme. In the Background section, we give some background to this project by describing the work we have carried out on other knowledge-based decision support systems involving situation assessment. In the section on Situation Assessment for FOAS, we describe how the Situation Assessment Support System (SASS) fits into the COGPIT, and in the Methodology section, the structured methodology we are using to develop it.
Background

Previous collaborations between DERA and Epistemics Ltd have included two projects which developed real-time Knowledge-Based Systems with a major emphasis on situation assessment. These projects were Helicopter Aircrew Decision Support (HADS) and Future Organic Airborne Early Warning (FOAEW).

Helicopter Aircrew Decision Support (HADS)

In collaboration with Cambridge Consultants Ltd, this project developed a helicopter-based decision-support system for anti-surface warfare. The system provides automated support for the key decisions in the principal mission tasks. It interprets available sensor data to determine the identity of each surface vessel, then plans optimum routes for helicopters to move closer to vessels to confirm their identity and analyse any threat that they may pose. Route planning takes into account the speed and direction of vessels, while prioritising according to their possible threat. A knowledge-based approach allowed the informal reasoning involved in the task to be described and used in a flexible manner. In such tasks, no conclusions can be certain, and they depend upon other information that is similarly uncertain. The known features of a particular contact are matched with typical descriptions of certain types of vessel: for example, a contact with a high speed is likely to be a warship or merchant ship, rather than a fishing vessel. In this application, a knowledge-based system provides an extra level of support and supervision to increase operational efficiency. The underpinning real-time, multi-agent software required for the HADS system is described by Martin and Howells (1995).

Future Organic Airborne Early Warning (FOAEW)

This project successfully demonstrated the feasibility of a knowledge-based decision support system to aid helicopter-based Airborne Early Warning (AEW) crew in detecting and eliminating enemy aircraft.
The system performs such key tasks as placement of the helicopter barrier, identification of hostile aircraft, management of Combat Air Patrol (CAP) aircraft, and fuel/position management. Without such a system, it is expected that future AEW operator workload will increase to levels likely to have a detrimental effect on the performance of AEW operations. Epistemics Ltd performed all knowledge acquisition for the system using the PC PACK software toolkit (Schreiber et al., 2000, Chapter 8), and facilitated the implementation carried out by Cambridge Consultants Ltd. The structure of the knowledge models constructed in PC PACK was replicated in the system architecture to aid in the validation, upgrading and maintenance of future systems (Zanconato and Davies, 1997). During knowledge acquisition, extensive use was made of generic, reusable models of problem solving, which are supported within the GDM tool in PC PACK. A full description of the GDM tool and the use of this method is given in O’Hara, Shadbolt and Van Heijst (1998). As Zanconato and Davies point out, the system developed was not intended as an autonomous system with which the FOAEW operator has minimal interaction. Instead, it was required to be a co-operative system in which the system and operator are able to utilise the skills most appropriate to their capabilities. As such, the design of the MMI was crucial to successful operation. Hence, the system was designed to interface with the Royal Navy’s latest AEW MMI. Using this system configuration in a concept demonstrator, operators’ confidence in the accuracy and reliability of the advice provided increased significantly. The dynamic filtering of information, coupled with the MMI displays implemented, was felt to provide temporal and consistency gains in achieving overall situation assessment (Davies, 1999).
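The contact-matching idea described for HADS, where the known features of a contact are compared with typical descriptions of vessel types and no conclusion is certain, can be sketched as below. This is only an illustration of the general technique, not the HADS knowledge base: the feature names, the numeric ranges and the simple "fraction of features that fit" score are all invented for the example.

```python
# Hypothetical typical descriptions: feature -> (min, max).
# All values are invented for illustration.
TYPICAL = {
    "warship": {"speed_kts": (15, 35), "length_m": (80, 200)},
    "merchant ship": {"speed_kts": (10, 25), "length_m": (100, 400)},
    "fishing vessel": {"speed_kts": (0, 12), "length_m": (10, 60)},
}


def rank_contact(observed):
    """Rank vessel types by the fraction of observed features that fit.

    Features missing from `observed` are ignored, so the ranking
    degrades gracefully when sensor data is incomplete.
    """
    scores = {}
    for vessel, ranges in TYPICAL.items():
        known = [f for f in ranges if f in observed]
        if not known:
            scores[vessel] = 0.0
            continue
        fits = sum(
            ranges[f][0] <= observed[f] <= ranges[f][1] for f in known
        )
        scores[vessel] = fits / len(known)
    # Highest score first; ties keep the order of TYPICAL.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_contact({"speed_kts": 22})` leaves warship and merchant ship tied and ranks fishing vessel last, mirroring the high-speed example in the text; adding a length observation would then separate the two remaining candidates.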
Situation Assessment for FOAS

Part of the COGPIT Technical Demonstrator described in Taylor, Howells & Watson (2000) will be a knowledge-based decision support system, termed the situation-assessment support system (SASS). The COGPIT Technical Demonstrator is intended to showcase the role of future technologies within the cockpit of the Future Offensive Air System (FOAS). As such, the SASS is one of three initiatives:
• Cognition Monitor (COGMON): a module that monitors the pilot’s physiology and behaviour
• Situation Assessment Support System (SASS): a module that monitors the situation and recommends actions
• Tasking Interface Manager (TIM): a module that manages the interface the pilot is presented with

As with our previous approaches to situation assessment, the SASS handles situation assessment on a task-by-task basis, with no separate module or agent performing situation assessment. We believe this integrated approach is best suited to such applications, since the knowledge used by human operators when performing situation assessment is best acquired and modelled within the context of the task being performed. In other words, expert human operators conceptualise situation assessment in a task-specific way and not as a separate activity (Klein, 1995). This approach still allows specific information on situation assessment to be requested from the knowledge-based decision support system, for example for explanation to the human operator or use in another automated module, without the need for a specific situation-assessment module.

Methodology

The development of the SASS follows the CommonKADS model for the development of knowledge-based systems (KBSs) (Schreiber et al., 2000). CommonKADS is a development methodology that is the result of a number of research and applied projects on knowledge engineering over the past 16 years and has been used in a wide variety of business contexts.
CommonKADS describes a number of knowledge-level models that should be developed prior to the implementation of a KBS. These models are:
• Organisational model: organisational analysis to identify the opportunities for knowledge-intensive systems within it
• Task model: identification of the major tasks involved within the organisation
• Agent model: modelling of the agents (humans, information systems and other entities) that carry out tasks within the organisation
• Knowledge model: an implementation-independent description of the knowledge components involved in carrying out a task
• Communication model: a description of the interactions between the various agents involved in a task
• Design model: a technical system specification that indicates how the knowledge model and communication model will be implemented within a specific environment
Figure 1 shows how the CommonKADS models are combined: the organisational, task and agent models provide information for the knowledge and communication models, which themselves provide information for the design model. The resulting models are then implemented according to structure-preserving design principles: the implemented code should retain the organisation and structure of the antecedent models (knowledge model, communication model etc.).
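The flow of information between the CommonKADS models described above can be written down as a small dependency graph. The following Python sketch is an illustration only: the dictionary FEEDS and the function development_order are names assumed here, not part of CommonKADS or PC PACK, and the ordering is just one reading of "provides information for".

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each model maps to the set of models that feed information into it,
# following the textual description of Figure 1 (an assumed encoding).
FEEDS = {
    "organisational": set(),
    "task": set(),
    "agent": set(),
    "knowledge": {"organisational", "task", "agent"},
    "communication": {"organisational", "task", "agent"},
    "design": {"knowledge", "communication"},
}


def development_order(feeds):
    """Return one order in which every model comes after all of its inputs."""
    return list(TopologicalSorter(feeds).static_order())


order = development_order(FEEDS)
print(order)
```

Any order produced this way places the design model last, which matches the statement that the knowledge and communication models jointly inform the design model.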
Figure 1: The CommonKADS models
The development of the organisational model used a structured approach to examine the organisation and assess the feasibility of knowledge-based solutions for the problems that are identified. This scoping procedure uncovered four main areas that could benefit from knowledge-based decision support: plan assessment, system health checks, the attack phase of the mission, and the DAS/reroute task. Initial knowledge acquisition has been performed on each of these task areas. For the purpose of the Technical Demonstrator, future work will focus on the DAS/reroute task, which involves the use of the Defensive Aids Suite and rerouting to counter problems caused by threats and weather.
The development of the knowledge model was substantially aided through the reuse of models, structure and content used in the development of decision-support systems for HADS and FOAEW, as described earlier in this paper. While those systems were used within helicopters, and with different tasks, a number of concepts could be reused because all of these systems are to be deployed in a military airborne context.
The knowledge acquisition involved in the development of the CommonKADS models for the SASS has utilised a number of KA techniques, including structured interviews, laddering, repertory grid analysis, card sorts and 20 questions. We conduct knowledge acquisition in parallel with knowledge modelling, in consultation with the experts, which improves the validity of the models. The PC PACK and MetaPACK toolsets (see footnote 1), developed by Epistemics Ltd, have been essential in supporting the acquisition and modelling processes. The results of the knowledge acquisition are, firstly, a number of scenarios with which the SASS, and the COGPIT as a whole, can be demonstrated and evaluated, and, secondly, knowledge documents giving implementation-independent models of the knowledge involved in the relevant tasks.
The implementation of the SASS will involve three stages.
The first stage is a conceptual implementation, using the CLIPS expert-system shell, in which SASS will give advice on the best course of action given static situations. The second stage involves the integration of the SASS with the other modules of the COGPIT, involving the dynamic exchange of information between them. The final stage will involve the implementation of decision support for the other tasks. In this and other projects, we are seeking to establish the power and utility of an incremental and structured knowledge-oriented development methodology. We observe that using this approach improves the efficiency of knowledge acquisition, a classic bottleneck in system development. Moreover, we are now
demonstrating that it leads to substantial reuse of knowledge that has been elicited at great cost in previous projects. Finally, we also aim to demonstrate the enhanced maintainability of systems developed in this way. Together, these developments should decrease the risk associated with knowledge-intensive system development.
References
Davies, A.J. (1999). Assessment of FOAEW decision support concept demonstrator and advice to DOR(Sea) concerning route to Staff Target definition (U). Technical report, DERA, Farnborough: DERA/AS/SID/CR990117/1.0.
Hoffman, R., Shadbolt, N.R., Burton, A.M. & Klein, G. (1995). Eliciting Knowledge from Experts: A Methodological Analysis. Organizational Behavior and Human Decision Processes, 62, 129–158.
Klein, G. (1995). Naturalistic Decision Making. Ohio: CSERIAC.
Martin, S. & Howells, H. (1995). Real Time Software for Knowledge Based Systems. IEEE Colloquium on Real Time Systems. London, 1995.
O’Hara, K., Shadbolt, N.R. & Van Heijst, G. (1998). Generalised Directive Models: Integrating Model Development and Knowledge Acquisition. International Journal of Human-Computer Studies, 49, 497–522.
Schreiber, A.Th., Akkermans, J., Anjewierden, A., De Hoog, R., Shadbolt, N., Van De Velde, W. & Wielinga, B. (Forthcoming). Knowledge Engineering and Management: The CommonKADS Methodology. The MIT Press.
Taylor, R.M., Howells, H. & Watson, D. (2000). The Cognitive Cockpit: Operational Requirement and Technical Challenge. In Proceedings of the Ergonomics Society Annual Conference, Grantham, 4–6 April 2000.
Zanconato, R. & Davies, A. (1997). Design for a Knowledge-Based Decision Support System to Assist Near Littoral Airborne Early Warning (AEW) Operations. Proceedings of the 4th Joint GAF/RAF/USAF Workshop in Human-Computer Teamwork. Kreuth, Germany.
1 For more information, see http://www.epistemics.co.uk/products/
COGNITION MONITOR: A SYSTEM FOR REAL TIME PILOT STATE ASSESSMENT
Kit Pleydell-Pearce1, Blair Dickson2 & Sharron Whitecross1
1Burden Neurological Institute, Stoke Lane, Stapleton, Bristol BS16 8QT, UK
2F138 Building, Centre for Human Sciences, DERA Farnborough, Farnborough GU14 0LX, UK
Cognition Monitor (COGMON) is a system designed to provide real time information about the cognitive-affective state of pilots. It derives data from four principal sources: physiology, behaviour, context and subjective states. Data from these sources are combined in order to update a real time model of pilot state. This model can then be used as a basis for optimising pilot performance, enhancing safety and for the implementation of various on-board cockpit aiding systems. This paper provides an overview of the architecture of COGMON, its underlying theoretical basis and ends with a discussion of the nature and uses of its outputs. Introduction One of the basic principles underlying COGMON is the view that the term ‘workload’ is too limited and should be replaced by the more embracing concept of ‘operator state’. With regard to aircraft environments we view ‘pilot state’ as a multidimensional concept. It includes, for example, levels of stress and alertness, current physical and mental demand, current locus of attention, nature of cognitive activity, current context as well as higher-order concepts such as pilot intent and situational awareness. At present there is no single measure which even remotely provides information about these various aspects of pilot state. For this reason, COGMON continuously samples a range of variables in order to provide a real time model of pilot state. The data sources upon which COGMON relies can be divided into four general classes and these are now discussed in turn. Physiological measures A full review of COGMON physiological recording and analytical facilities is beyond the present scope. However, the system includes measurement of heart rate, respiration rate, electromyogram, electrodermal activity, skin temperature, electro-oculogram and electroencephalographic (EEG) activity. These measures provide information concerning levels of autonomic reactivity (e.g. stress) as well as information about current levels of alertness. 
Measurements of eye-movement activity and blink rate provide an index of visual workload and recent improvements in biosensor technology and signal processing have allowed a dramatic improvement in locus of gaze detection (when head position is known). However, an optical solution to gaze location is presently seen as most promising. Recordings of brain electrical activity from the scalp also provide information about workload. For example, COGMON is capable of recording slow cortical potentials within the EEG which have been shown to be sensitive to fluctuations in cognitive demand (e.g. Pleydell-Pearce et al., 1995) and capable of differentiating load imposed upon distinct cognitive systems (e.g. Pleydell-Pearce, 1994). COGMON also employs spectral decomposition and coherence analysis of EEG in order to differentiate levels of cognitive load. It is worth noting that many physiological measures are correlated. For example, heart rate and electrodermal activity are both influenced by respiration rate (e.g. Berntson et al., 1997) and many biosensors are sensitive to thermal and vibratory artefact. For this reason, COGMON uses various mathematical tools aimed at uncoupling correlations between its incoming physiological variables. Finally, physiological sensors can be time consuming to apply. The development of COGMON therefore includes the design of fast-fit biosensors, including helmet-mounted non-polarising EEG electrodes.
Behavioural Measures
While physiological measures provide a wide range of useful information, they are presently poor at providing fine-grained information about specific forms (i.e. contents) of cognitive activity. For this reason, behavioural data, and in particular interactions with cockpit controls, provide a rich database which can be used in order to make inferences about cognitive state. Interactions with controls are monitored by COGMON for two general purposes. First, such measures permit strong inferences about the nature of ongoing cognitive activity. For example, manual interaction with a visually-guided cockpit control which uses an onscreen cursor typically indicates visuo-spatial workload and permits the inference that visual, somatosensory and motoric attention are invested in that task. A second major aspect of COGMON is based on the view that a great deal of pilot behaviour can be decomposed into separate largely encapsulated procedures or algorithms. A crucial aspect of COGMON function is therefore the facility to recognise when these specific procedures/algorithms are being performed.
Such inferences rely heavily upon interpretation of interactions with aircraft controls, although other measures taken by COGMON can supply additional information. COGMON refers to a database in order to detect the onset and track the progress of specific procedures. When a particular procedure is detected, COGMON uses a stored functional taxonomy to provide information about affective and cognitive states such as stress and workload that are likely to accompany the procedure. This kind of information derives in part from a priori subjective measures (see below). It also depends upon a deconstruction of procedures into components based upon logical analyses. The database can also indicate many other factors, such as whether the procedure is one which, once started, must be taken rapidly towards completion or can be left to ‘idle’ in the background. The database also contains information about which distinct procedures can be combined without mutual interference on both logical and empirical grounds. It is worth noting, though, that novel or unusual procedures adopted by pilots may not be correctly recognised. Under such circumstances, COGMON can still gain some information based upon lower level monitoring of interactions with controls. For example, COGMON monitors all vocalisations from, as well as auditory inputs to, pilots. While this information may not be analysed to the level of meaning, it does provide useful information about ongoing cognitive processes. Specific combinations of particular procedures indicate more global goals and permit inferences about pilot intent. At this more macroscopic level, COGMON attempts to infer pilot intent using a pre-existing database in which the probable significance of particular procedural combinations is stored. Analysis at this level may also be guided by pertinent contextual information (see below).
However it is important to note that at this level, novel or unusual combinations of particular procedures may be enacted in the pursuit of complex unknown goals. Finally, the interpretation of some interactions with controls can be ambiguous. However, such sources of ambiguity can be minimised in carefully designed cockpits.
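The database-driven detection and tracking of procedures described in this section can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the Procedure record, the two database entries, the control event names and the ratings are hypothetical, and the matching rule (steps seen in order, other events ignored) is a simplification rather than COGMON's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Procedure:
    name: str
    steps: tuple         # ordered control interactions that define the procedure
    workload: str        # a priori rating stored with the procedure
    must_complete: bool  # True if the procedure cannot be left to idle


# Hypothetical database entries; real entries would come from pilot interviews.
DATABASE = [
    Procedure("radar_mode_change",
              ("select_radar_page", "move_cursor", "confirm"), "medium", False),
    Procedure("countermeasure_release",
              ("arm_das", "select_chaff", "release"), "high", True),
]


def track(events, database=DATABASE):
    """Report every procedure whose steps have started to appear, in order."""
    progress = {p.name: 0 for p in database}
    for event in events:
        for p in database:
            i = progress[p.name]
            # Advance only when the event is the next expected step.
            if i < len(p.steps) and event == p.steps[i]:
                progress[p.name] = i + 1
    report = {}
    for p in database:
        seen = progress[p.name]
        if seen:  # procedures with no matched step are not reported
            report[p.name] = {
                "steps_seen": seen,
                "complete": seen == len(p.steps),
                "workload": p.workload,
                "must_complete": p.must_complete,
            }
    return report


status = track(["arm_das", "move_cursor", "select_chaff"])
print(status)
```

In this sketch an event stream that matches no stored procedure yields an empty report, which mirrors the point that novel or unusual procedures may go unrecognised.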
Subjective Measures Subjective measures of pilot state are those provided by the pilot. In conventional settings these are often paper and pencil tasks (e.g. the NASA task load index—Hart and Staveland, 1988). COGMON makes use of two kinds of subjective measure. ‘Prospective’ measures can be signalled by the pilot to COGMON at any time and include communications such as “I am—drowsy, bored, stressed or experiencing high levels of workload.” We call this system the Pilot Load Indicator (PLI). Communication is currently made via pushbuttons. The direct communication of subjective states to COGMON provides useful additional information although the use of this system is currently seen as an issue of pilot preference. It is also clear that under conditions of high stress and high workload the PLI could constitute an extra source of load although it does have a single prominent “emergency” button to signal such states. Furthermore, incorporating such measures within COGMON gives the pilot a direct link with on-board flight systems, and does not therefore treat the individual as passive and ‘out of the loop.’ For similar reasons we are considering the possibility of providing direct though simplified pilot feedback concerning current levels of pilot state inferred by COGMON. A priori subjective measures are those which have been collected on the basis of interviews with pilots. Identifiable algorithms and procedures (defined above) are rated in terms of factors such as probable degrees of accompanying workload and stress. Thus when any actual task is detected, COGMON can make use of this existing knowledge. Furthermore, pilots can supply a priori information about the ease with which various separate tasks can be combined and the kinds of load that are associated with tasks and their components (e.g. visual/auditory/somatosensory, spatial/verbal or estimates of task time pressure). 
Finally, information concerning the stress and workload consequences of failures of various cockpit systems, as well as influences of contextual measures (see the next section), is contained within the database.
Contextual Measures
Context provides a powerful basis for interpreting pilot state data. COGMON has access to contextual information which includes factors such as altitude, speed, levels of threat and whether aircraft controls are functioning normally. This provides COGMON with a context for interpreting incoming data. COGMON also collects low-level contextual information. Examples of this include ambient noise, luminance, vibration and temperature, which are all factors known to influence pilot performance and outputs from biosensors.
Bespoke Systems
A characteristic feature of human performance is that there are widespread differences in behavioural and physiological responses to similar situations. This means that conclusions based upon average findings from a group of individuals may only correlate weakly with the behaviour of a particular individual. However, scientific approaches to problems such as mental workload are usually based upon data averaged across subjects. In contrast, less research has attempted to identify unique but reproducible changes within single individuals. A major feature of COGMON is that it is designed to learn about the behaviour of individuals, and look for predictable regularities in their particular responses to changing patterns of workload. This means that COGMON holds a database for each pilot, which is activated when that pilot is identified. This is seen as a supplement to other aspects of COGMON, because in the absence of such a database, it would rely upon its non-bespoke systems.
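The idea of a per-pilot database with a fallback to non-bespoke data can be sketched as follows. This is a hedged illustration in Python, not a description of COGMON's implementation: the population norms, the minimum sample count and the use of a simple standard score are all assumptions introduced here.

```python
from statistics import mean, stdev

# Hypothetical group-level norms as (mean, standard deviation).
POPULATION_NORMS = {"heart_rate": (72.0, 12.0)}


def baseline(pilot_history, measure, norms=POPULATION_NORMS, min_samples=5):
    """Prefer the pilot's own baseline; otherwise fall back to group norms."""
    samples = pilot_history.get(measure, [])
    if len(samples) >= min_samples:
        sd = stdev(samples)
        if sd > 0:  # a constant record gives no usable spread
            return mean(samples), sd, "bespoke"
    mu, sd = norms[measure]
    return mu, sd, "non-bespoke"


def deviation(value, pilot_history, measure):
    """Express a new reading as a standard score against the chosen baseline."""
    mu, sd, source = baseline(pilot_history, measure)
    return (value - mu) / sd, source


known_pilot = {"heart_rate": [60, 62, 58, 61, 59]}
z_bespoke, src_bespoke = deviation(84, known_pilot, "heart_rate")
z_generic, src_generic = deviation(84, {}, "heart_rate")
print(src_bespoke, round(z_bespoke, 2), src_generic, round(z_generic, 2))
```

The same reading of 84 is a mild deviation against the group norms but a very large one against this pilot's own record, which is the motivation given above for bespoke analysis.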
Convergent Processing The previous sections indicate that COGMON processes a large amount of data. Although the various forms of data can be treated as separate variables, the relationships between different data sources will contain valuable information. For example, the absence of an arousal reaction to a mild threat, such as a low altitude warning, may indicate that the pilot is confident and in control. However, it might instead indicate a loss of situational awareness caused by dangerously low levels of arousal. In recognition of the importance of convergent processing, COGMON is capable of performing complex on- and off-line multivariate analysis in order to improve inferences about pilot state. These routines include the facility to look for redundancy within measures. In other words, if two COGMON measures provide near identical information then it makes sense to select the measure which is easiest to collect and process. A further benefit of convergent processing is that hidden predictive trends can often be discovered in the relations between data sets which cannot be obtained from either data set alone. COGMON research has also employed artificial neural networks in order to search for ‘hidden’ patterns within data. A Model of Pilot State Broadly speaking COGMON provides an estimate of sleep-wakefulness, relaxation-stress, cognitive load (including an assessment of load imposed upon distinct modalities), an index of currently active procedures (algorithms) and an assessment of current intents and some specification of longer term goals. COGMON outputs may also permit some estimates of situational awareness. For example, failure to have performed any (or recent) actions which might signal awareness of a particular threat would constitute grounds for inferring a deficit in situational awareness. Similarly, sustained focus of attention on a single task serves to warn that situational awareness may have decreased. 
Taken together, these various goals of COGMON processing constitute our multidimensional model of pilot state.
The Nature and Uses of COGMON Outputs
At present, COGMON is one component of the Cognitive Cockpit Research Program being developed at DERA/CHS Farnborough. This program is aimed at the production of a cockpit which can monitor pilot state and implement automation and various forms of aiding as and when appropriate. In this system various aspects of aircraft control can be taken over by a Situation Assessment Support System (SASS), for example, when the pilot is heavily overloaded. Decisions about which tasks will be automated are taken by a third system called the Tasking Interface Manager (TIM), which is supplied with a constantly updated model of pilot state by COGMON. The TIM system uses this information to maintain pilot performance at optimal levels. For example, it can determine which tasks might benefit from automation, or how and where warnings should be displayed. Similarly, COGMON can warn TIM if the pilot is dysfunctionally stressed, overloaded or even underloaded and drowsy. Another function of COGMON is its capacity to store data for later off-line analysis. This allows it to examine patterns of performance in detail, improve prediction on future flights and update the precision of its bespoke analyses. This facility also provides a useful tool for flight training and debriefing, and a basis for improving various aspects of flight management. More generally, COGMON architecture employs computational principles that mean its individual components can function in isolation from the whole. This is even true of the systems that interpret interactions with cockpit controls, which will work in conjunction with any suitably specified functional taxonomy. For this reason the system can be easily adapted to other platforms (in part or in entirety) and also constitutes a stand-alone research tool.
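The redundancy check mentioned under Convergent Processing, where the easier of two near-identical measures is retained, might be sketched as below. This Python example is an assumption-laden illustration: the use of Pearson correlation, the 0.95 threshold, the cost table and the measure names are all choices made here, and the paper does not state which multivariate routines COGMON actually uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (neither constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def prune_redundant(series, cost, threshold=0.95):
    """Keep the cheaper measure of any pair that is almost perfectly correlated."""
    kept = []
    for name in sorted(series, key=cost.get):  # consider cheapest measures first
        if all(abs(pearson(series[name], series[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical recordings and collection costs (arbitrary units).
series = {
    "blink_rate": [1, 2, 3, 4, 5],
    "eeg_index": [2.1, 4.0, 6.2, 7.9, 10.1],  # tracks blink_rate closely
    "skin_temp": [5, 1, 4, 2, 3],
}
cost = {"blink_rate": 1, "eeg_index": 5, "skin_temp": 2}

selected = prune_redundant(series, cost)
print(selected)
```

A greedy pass of this kind is only one possible design; it favours cheap measures and says nothing about predictive trends that appear only in the relation between two data sets.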
References
Berntson, G.G., Bigger, J.T., Eckberg, D.L., Grossman, P., Kaufman, P.G., Malik, M., Nagaraja, H.N., Porges, S.W., Saul, J.P., Stone, P.H. and Van Der Molen, M.W. 1997, Heart rate variability: Origins, methods and interpretive caveats. Psychophysiology 34, 623–648.
Hart, S.G. and Staveland, L. 1988, Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P.A.Hancock and N.Meshkati (Eds) Human Mental Workload. (Elsevier, Amsterdam, The Netherlands) 139–183.
Pleydell-Pearce, C.W. 1994, DC potential correlates of attention and cognitive load. Cognitive Neuropsychology 11(2) 149–166.
Pleydell-Pearce, C.W., McCallum, W.C. and Curry, S.H. 1995, DC shifts and cognitive load. In G.Karmos, M.Molnar, V.Csepe, I.Czigler and J.E.Desmedt (Eds) Perspectives of Event Related Potentials Research. Supplement 44 to Electroencephalography and Clinical Neurophysiology. (Elsevier, Amsterdam) 302–311.
TASKING INTERFACE MANAGER: AFFORDING PILOT CONTROL OF ADAPTIVE AUTOMATION AND AIDING
M.C.Bonner1, R.M.Taylor1 & C.A.Miller2
1DERA Farnborough, Farnborough GU14 0LX, UK
2Honeywell Technology Centre, 3660 Technology Drive, Minneapolis MN 55418, USA
The Tasking Interface Manager (TIM) seeks to demonstrate real-time adaptive automation and real-time task, interface and timeline management to support pilot operations in the Cognitive Cockpit (COGPIT). The intended TIM application is to enable the pilot to concentrate his/her cognitive capabilities on the tactical aspects of the mission and off-load the routine activities to automation. Ideally, this would allow the pilot to remain in a feed-forward activity, whilst most, if not all feedback requirements are met by decision aiding and automation. The TIM utilises output from the Situation Assessment Support System (SASS) and the Cognition Monitor (COGMON) to adaptively present information and adaptively automate tasks according to the situational context and the pilot’s internal state. The main features of a tasking interface are a shared mental model, the ability to track goals, plans and tasks, and the ability to communicate intent about the mission plan. The paper will describe the concept of operation and the technical development of the TIM. Introduction The complexity of the military aviation task domains is such that without considerable computerised assistance aircrew would not be able to cope with the very large number of potentially relevant features and a vast number of possible responses. Perceiving and interpreting all of the relevant features and choosing an appropriate response within the tight temporal constraints of the domain will challenge any intelligent agent —whether human or machine (Banbury et al, 1999). One method of reducing the task and cognitive load on aircrew is the provision of intelligent decision aids coupled with adaptive automation that is capable of assisting aircrew decision-making and selectively off-loading tasks. 
This paper describes the current state of development of a Tasking Interface component of the COGPIT programme, that allows aircrew to retain executive control of aircraft and mission parameters, whilst benefiting from such computerised assistance. Background The military aviation domain is characterised by being uncertain and by having shifting goals, dynamic evolution, time stress, action feedback loops, high stakes and multiple players. While operators may wish to remain in charge, and it is critical that they do so, today’s complex systems no longer permit them to be fully in charge of all system operations at all times; at least not in the same way as in earlier cockpits and
workstations (Miller et al., 1999). Cockpit automation has grown, and will continue to grow, more intelligent and more sensitive to context and mission objectives. But no one seriously believes that cockpit automation and decision aids can or should replace pilot control. Instead, they must free up pilot resources to concentrate on the most important tasks and must create in the pilot a situation awareness that allows him to make decisions correctly and very quickly. This emerging situation raises questions about the appropriate roles for pilot and smart automation in future military aircraft. Functional integration is an important characteristic of advanced Intelligent Aiding systems, in that the required behaviour can be shared across many functional components, including the user (Geddes, 1997). That is, several functional components can collectively perform many of the same behaviours as the pilot, because they are aware of each other, capable of sharing information, aware of overall mission goals and capable of integrating their behaviours in the same way the pilot would. Functional integration of cockpit duties provides for a more robust and flexible integrated system when compared to systems based upon stricter function allocation to individual and unique components. As the integrated automation systems in an adaptive cockpit become more aware and capable of augmenting or even replacing pilot activities in some cases, new forms of interaction between human and automation become both possible and necessary. Our goal is the creation of an adaptive or “tasking” interface that allows aircrew to pose a task for automation in the same way that they would task another skilled crewmember. It affords aircrew the ability to retain executive control of tasks whilst delegating their execution to the automation.
A tasking interface will necessitate the development of a cockpit control/display interface that allows the pilot to change the level of automation in accordance with mission situation, pilot requirements and/or pilot capabilities. It is necessary that both the pilot and the system operate from a shared task model, affording the communication of tasking instructions in the form of desired goals, tasks, partial plans or constraints that accord with the task structures defined in the shared task model.
Adaptive Interface Management for FOAS
The COGPIT Technical Demonstrator will consist of three main initiatives, plus the cockpit itself, to showcase the role of adaptive automation and intelligent decision aiding in the Future Offensive Air System (FOAS):
• A Cognition Monitor (COGMON) that monitors the pilot’s physiology and behaviour to provide an estimation of pilot state.
• A Situation Assessment Support System (SASS) that recommends actions based on the status of the aircraft and the outside environment.
• A Tasking Interface Manager (TIM) that tracks goals and plans and manages the pilot/vehicle interface and system automation.
• A Cockpit (COGPIT) that interprets and initiates display and automation modifications upon request.
The central feature of the COGPIT is to afford the pilot the capability to concentrate his skills towards the relevant critical mission event, at the appropriate time and to the appropriate level. This does not necessarily imply the exclusion of all other data from the pilot; rather, mission critical information will be of primary focus and other temporally non-critical but mission important data will be presented at a lower level of salience. In order to achieve this, the COGPIT will monitor three aspects of the situation: the environment, covering both the world external to the aircraft and the aircraft systems; the pilot, to take account of his physiological and cognitive state; and the mission plan, to indicate current and future pilot actions.
Figure 1. Flow of information across functions (legend: primary, secondary)
The work will exploit the lessons learnt from the U.S. Army’s Rotorcraft Pilot’s Associate (RPA) programme (Miller et al., 1999) through consultancy with the US developer of the tasking approach. The aim will be to produce a solution that is tailored to the requirements of the Cognitive Cockpit project and compatible with the outputs of the SASS and COGMON work. The TIM will utilise the monitoring and analysis of the mission tasks provided by the SASS combined with the pilot state monitoring of the COGMON to afford adaptive automation, adaptive information presentation and task and timeline management.
Functional Requirements
The functional requirements for the TIM are being developed by Honeywell Technology Centre. The overall architecture of the adaptive cockpit we are working with involves twelve functions, with a natural flow of information and control across the functions as loosely illustrated by Figure 1.
Implementing TIM
Shared Task Model
In order to develop a tasking interface, it is essential to be able to code, track and dynamically modify users’ goals and plans. The use of a “task model” format shared by both the operator and the knowledge-based planning system affords a high level of co-ordination between the human and the supporting system (Miller et al., 1999). In order to support a tasking interface, a task model must be organised via functional decomposition, wherein there are alternative methods to achieve each task or goal. These tasks must be representative of the way pilots think of their domain and use operator-based labelling conventions (Miller et al., 1999). The task model used for the COGPIT uses three task categories: generic tasks that are constant for a particular task for any mission, mission specific tasks that are constant for a particular task within a particular mission and specific tasks that differ for each instance of a particular task.
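A task model organised by functional decomposition, with alternative methods for each task and the three task categories named above, could be represented along the following lines. This Python sketch is illustrative only: the class names, the example tasks and the rule that the first listed method is the default are assumptions, not details of the COGPIT task model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    GENERIC = "generic"                    # constant for any mission
    MISSION_SPECIFIC = "mission specific"  # constant within one mission
    INSTANCE_SPECIFIC = "specific"         # differs for each task instance


@dataclass
class Task:
    name: str
    category: Category
    # Each method name maps to the ordered subtasks that realise it.
    methods: dict = field(default_factory=dict)

    def expand(self, choices=None):
        """Decompose into leaf task names, using chosen or default methods."""
        choices = choices or {}
        if not self.methods:
            return [self.name]
        method = choices.get(self.name, next(iter(self.methods)))
        leaves = []
        for sub in self.methods[method]:
            leaves.extend(sub.expand(choices))
        return leaves


def leaf(name):
    """Helper for hypothetical leaf tasks used in this example."""
    return Task(name, Category.INSTANCE_SPECIFIC)


# Hypothetical fragment: two alternative ways of countering a threat.
counter_threat = Task(
    "counter_threat",
    Category.GENERIC,
    methods={
        "reroute": [leaf("plan_route"), leaf("fly_route")],
        "use_das": [leaf("select_countermeasure"), leaf("deploy_countermeasure")],
    },
)

default_plan = counter_threat.expand()
pilot_plan = counter_threat.expand({"counter_threat": "use_das"})
print(default_plan, pilot_plan)
```

Passing a choices mapping stands in for the pilot constraining how a task is to be done, while leaving the remaining decomposition to the automation.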
TIM’s Task Tracking Capabilities The goal plan tracking (GPT) system is intended to take the form of a three-pass assessment, with the first pass taking cockpit manipulanda and interface information to infer a goal, a plan/objective and a task (for example pilot stick inputs might imply SAM avoidance or acceptance of a new target or need to abort the mission). The second pass would use contextual information provided by the Situation Assessment Support System to disambiguate the first pass (for example a SAM site in search mode has been located 20° on the right at approximately 20km). The final pass, which is pilot direct input, would only be used if the assessment was incorrect (for example in this situation the pilot would agree with the assessment and the TIM would then act upon this assessment to request interface modifications and automation requirements from the cockpit). Communication about Intent One of the goals of TIM is to allow the pilot to interact with advanced automation flexibly at a variety of levels. This allows the pilot to smoothly vary the ‘amount’ of automation used depending on such variables as time available, workload, criticality of the decision, degree of trust, etc.—variables known to influence human willingness and accuracy in automation use (Riley, 1996). It further allows the human to flexibly act within the limitations imposed by the capabilities and constraints of the equipment and the world—a strategy shown to produce superior aviation plans and superior human understanding of plan considerations (Layton, et al, 1994). There are three primary challenges involved in the construction of a tasking interface: (1) A shared vocabulary must be developed, through which the operator can flexibly pose tasks to the automation and the automation can report how it intends to perform those tasks. This challenge was discussed above. 
(2) Sufficient knowledge must be built into the interface to enable it to make intelligent choices within the tasking constraints imposed by the user. This is the role of the information and automation needs interpreters illustrated in Figure 1.
(3) One or more interfaces must be developed which will permit inspection and manipulation of the tasking vocabulary to pose tasks and review task elaborations in a rapid and easy fashion. This final challenge is one that will have to be undertaken for the FOAS fighter domain.
The goal is to allow the human operator to communicate tasking instructions in the form of desired goals, tasks, partial plans or constraints in accordance with the task structures defined in the shared task model. These are, in fact, the methods used to communicate commander’s intent in current training approaches for US battalion-level commanders (Shattuck, 1995).
Usage Scenario
The intended TIM application is to enable the pilot to concentrate his/her cognitive capabilities on the tactical (knowledge-based) aspects of the mission and to off-load the routine (rule-based and skill-based) activities to automation. In effect, this will allow the pilot to remain in a feed-forward loop whilst most, if not all, feedback requirements are met through decision aiding and automation.
• The SASS provides rule-based decision-aiding information according to the situational context, for example by progressively providing avoid, evade and defeat action requirements against ground and air threats as the scenario develops.
• The COGMON provides pilot state information (cognitive capability) according to the pilot’s physiological condition, for example by informing the TIM that the pilot is high on visual and cognitive workload, with high alertness and high arousal but low activity.
• The TIM affords the ability to adaptively provide information according to the situational context and either selectively (pilot controlled) or adaptively (TIM controlled) off-load tasks to automation in accordance with the mission plan. For example, the TIM could adaptively increase the automation level on aspects of the Defensive Aid System and aircraft defensive manoeuvres to allow the pilot to concentrate on the ramifications of the threat avoidance for mission completion.
References
Banbury, S.P., Bonner, M.C., Dickson, B., Howells, H. and Taylor, R.M. (1999) Application of Adaptive Automation in FOAS (Manned Option) (UKR). DERA/CHS/MID/CR990196/1.0.
Geddes, N.D. (1997) Associate systems: A framework for human-computer co-operation. In M.J. Smith, G. Salvendy and R.J. Koubek (Eds.), Design of Human-computer systems: Social and ergonomic considerations. Elsevier: Amsterdam, 21B, 237–242.
Layton, C., Smith, P. and McCoy, E. (1994) Design of a cooperative problem solving system for enroute flight planning: An empirical evaluation. Human Factors, 36(1), 94–119.
Miller, C.A., Guerlain, S. and Hannen, M.D. (1999) The Rotorcraft Pilot’s Associate Cockpit Information Manager: Acceptable Behaviour from a New Crewmember. Presented at the American Helicopter Society 55th Annual Forum, Montreal, Quebec, May 25–27, 1999.
Riley, V. (1996) Operator reliance on automation: Theory and data. In R. Parasuraman and M. Mouloua (Eds.), Automation and Human Performance: Current Theory and Applications. Lawrence Erlbaum: Hillsdale, NJ, 19–36.
Shattuck, L. (1995) Communication of Intent in Distributed Supervisory Control Systems. Unpublished dissertation. The Ohio State University, Columbus, OH.
© British Crown Copyright 2000/DERA. Published with the permission of the Controller of Her Majesty’s Stationery Office.
THE GLOBAL IMPLICIT MEASURE: EVALUATION OF METRICS FOR COCKPIT ADAPTATION
Michael Vidulich & Grant McMillan
Crew System Interface Division, Human Effectiveness Directorate
Air Force Research Laboratory, Wright-Patterson AFB, Ohio, USA
Real-time measurement of pilot situation awareness (SA) is key for guiding adaptive aiding. The Global Implicit Measure (GIM) approach was developed to be a performance-based metric to guide adaptation. A simulator study of the effectiveness of GIM measurement in an air-to-air scenario was conducted. Two cockpit configurations (Conventional and Candidate) were used. Dependent measures included mission outcomes, subjective ratings, and GIM scores. Preliminary analyses have demonstrated the Candidate cockpit to be more effective and efficient. Subjective ratings of mental workload showed a lower demand, while GIM scores were higher across the mission phases. GIM diagnosticity is now under evaluation to determine whether the GIM can provide a real-time tool for identifying when and what type of adaptive aiding would benefit the pilot.
Introduction
As the number of sensor and weapon systems in military aircraft has proliferated over the years, the military pilot’s task has become increasingly complex and difficult. In many settings automation has been used as a means of keeping the task demands at a reasonable level. However, wide use of automation has introduced difficulties of its own. Pilots often find automation doing unexpected things for no obvious reason. The lack of understanding might be caused by the pilot being unaware of the current mode of the automation, or possibly not having a good mental model of what the automation in a given mode is going to do under various circumstances. Such problems will often be attributed to pilot error when they occur and may be addressed with additional training or alarms to “assist” future pilots. On the other hand, it is equally reasonable to suspect that some of the problems in modern aircraft might be due to the tendency of the automation to effectively ignore the pilot’s apparent goals and activities.
A good human crewmember, particularly a co-pilot, not only assesses the situation surrounding the aircraft, but also assesses what the pilot appears to be trying to do. Attempts to aid the pilot would be guided by knowledge of the objective situation and an assessment of the pilot’s current goals and performance. For example, a pilot who was being overwhelmed by simultaneous high-priority tasks might have a fine understanding of the relevant goals for the moment, but be unable to accomplish them all. A responsive co-pilot might assist by identifying sub-tasks that could be off-loaded from the pilot. However, a co-pilot who detected a fundamental mismatch between a situation and a pilot’s current activity might just try to bring the pilot’s attention to the situation.
One way to characterise the challenge of designing effective automated aiding is to consider it an example of the crew situation awareness issue that confronts any team. An effective team includes team members who react not only to the environment but also to each other. If everyone reacted perfectly to the objective situation and team member roles were perfectly defined, then there would be no need for any situation awareness of the other team members. But if one team member is not reacting appropriately to the situation (perhaps due to faulty information or confusion), it is up to the other team members to become aware of this and correct it. A pilot attempts to keep track of the aircraft’s “situation awareness” by monitoring the situation and the current operating mode of the automation. If the pilot correctly perceives the situation and the automation mode, and also possesses a good understanding of the rules implemented in the automation, then the pilot can be said to have good team situation awareness in regard to the automated team member. What about the automated system’s team situation awareness? The automation is presumably provided with appropriate information about the situation to guide its activity. However, it is also presumably susceptible to “mode error” if it does not understand the operating “mode” of the pilot. An understanding of the pilot should logically be a piece of the information provided to the automated aiding system. This is a serious challenge for situation awareness measurement. To be useful in correcting the lack of situation awareness sometimes displayed by automated aids, it is necessary to develop a situation awareness measure that can be assessed in real time and used by the computerised algorithm guiding the application of the automated aid.
Situation awareness measurement
Situation awareness has most often been measured by memory probes or subjective ratings.
Both of these approaches can be effective in the test and evaluation of competing interface designs or perhaps as an aid to guide training programs. Neither of these approaches is amenable to being used as a continuous real-time situation awareness metric. However, given the assumption that the pilot is attempting to accomplish known goals with various known priority levels, it is possible to consider the momentary progress towards accomplishing these goals as a performance-based measure of situation awareness. The Global Implicit Measure (GIM; Brickman et al., 1995; Brickman et al., 1999; Vidulich, 1995) is an attempt to develop an approach for creating real-time situation awareness measurement that could guide effective automated pilot aiding. In the GIM approach a detailed task analysis is used to link measurable behaviours to the accomplishment of mission goals. The goals will vary depending upon the current mission phase. For each phase, measurable behaviours that logically affect goal accomplishment are identified and scored. The scoring is based on contribution to goal accomplishment (0=performance inadequate, no contribution to goal accomplishment; or 1=performance successful and contributing to goal accomplishment). The proportion of 1’s in the GIM algorithms related to a specified mission phase identifies how well the pilot is accomplishing the goals of that phase (leaving aside the issue of weighting behavioural components of different priorities, for simplicity’s sake). More importantly, the behavioural components scored as 0’s should help to identify the portions of the task that the pilot is either unaware of or unable to perform at the moment. Thus, the GIM score can potentially provide a real-time indication of the quality of task performance and a diagnosis of the problem if task performance deviates from the ideal, as specified by the GIM task analysis and scoring algorithms.
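The unweighted scoring rule just described can be sketched in a few lines. This is an illustrative sketch only, not the authors’ software: the function name `gim_score` and the rule names in the example are hypothetical, and weighting of behavioural components is left out, as in the text.

```python
from typing import Dict, List, Tuple


def gim_score(results: Dict[str, int]) -> Tuple[float, List[str]]:
    """Return (proportion of rules scored 1, names of the rules scored 0).

    `results` maps each GIM rule for the current mission phase to 0 or 1.
    """
    if not results:
        raise ValueError("at least one GIM rule must be scored")
    if any(value not in (0, 1) for value in results.values()):
        raise ValueError("each GIM rule is scored 0 or 1")
    failed = [name for name, value in results.items() if value == 0]
    score = sum(results.values()) / len(results)
    return score, failed


# Hypothetical rules for a CAP phase; the real algorithms are not listed in the text.
cap_phase = {
    "within_cap_pattern": 1,
    "radar_in_search_mode": 1,
    "altitude_in_band": 0,
    "airspeed_in_band": 1,
}
score, failed = gim_score(cap_phase)
print(score, failed)
```

The first return value corresponds to the phase score, and the second to the diagnostic list of components the pilot may be unaware of or unable to perform.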
The current study created and tested a set of GIM algorithms for scoring pilot performance in a simulated air-to-air combat task. The goal of the study was to examine the suitability of the GIM data as a guide to automated aiding in combat aircraft.
Method
Subjects
Seven current or former US military personnel served as subjects in the experiment. Subjects were either trained pilots or weapons systems officers. All subjects possessed considerable expertise applicable to real-world air combat.
Equipment
The simulation was conducted in the Fusion Interfaces for Tactical Environments (FITE) portion of the Synthesized Interface Research Environment (SIRE) facility at Wright-Patterson Air Force Base. FITE combined 6 projectors to give a wrap-around outside view for the pilot in the cockpit. The outside view wraps around the subject to both the right and left sides, and both above and slightly below the cockpit, to a point just behind the subject’s seat in the F-16 shell that contains the cockpit. Although this does not provide a view directly behind the cockpit, the look-up and look-down capabilities do provide a compelling world scene for guiding air-to-air combat. The cockpit also incorporated a projected HUD, a head-mounted display, head-down LCD displays within the cockpit, and an F-16 throttle and stick for control. Two cockpit configurations were used in the simulation. The two cockpits differed mostly in terms of the information portrayed on the head-down displays. A “Conventional” cockpit used traditional independent gauges and displays of flight and tactical information. A more advanced “Candidate” cockpit used a more integrated, virtually augmented display format. The FITE simulator was also connected to two simpler manned threat stations. These stations were used by the pilots of the enemy fighters in the simulation.
Scenario
Every trial in the data-collection portion of the experiment used variations of the same basic scenario.
The scenario began with the subject’s aircraft flying an oval-shaped Combat Air Patrol (CAP) pattern near the home airfield. The subject’s task was to monitor a designated portion of airspace. At some point, 4 enemy bombers and 2 enemy fighters would violate this airspace. The bombers were computer controlled and would always fly directly towards the home airfield. Unless shot down, the bombers would bomb the airfield and then turn around to depart. The enemy fighters were controlled by human operators trying to protect the bombers and shoot down the subject. There was also a computer-controlled friendly fighter and a computer-controlled Airbus airliner flying within the simulation. The rules of engagement required the subject to focus on trying to shoot down the bombers before they could reach the airfield. Failing that, shooting down the bombers after they had bombed was still worthwhile. The subjects were only to deal with the fighters if they became an immediate threat. After destroying all bombers, or after any surviving bombers had escaped the designated airspace, the subject would egress back to friendly territory, at which point the subject was declared “Safe” and the trial ended.
The GIM task analysis identified 5 unique mission phases: CAP, Intercept (trying to shoot down bombers), Defensive (forced to deal with the fighters; can be subdivided into 4 levels defined by the level of threat), Egress, and Safe. Each phase had between 13 and 19 individual GIM algorithms associated with it. Again, the GIM algorithms scored the pilot behaviours critical to accomplishing the goals of each phase.
Experimental Design
A within-subject design was used. Each subject was first put through several sessions of training that emphasised complying with the rules of engagement. After training, each subject participated in three sessions. Each session contained two blocks. One block contained three trials using the Conventional cockpit, and the other block contained three trials using the Candidate cockpit. The order of the blocks within sessions was random across subjects and sessions.
Dependent Measures
Mission outcome results, raw performance and raw GIM algorithm scores were collected during each trial. Following each block of three trials with a cockpit, subjective ratings of mental workload and situation awareness were collected. Following completion of all three sessions, a retrospective measure of mental workload and a qualitative questionnaire were filled out during a debriefing session.
Results and Discussion
The manipulation of the cockpit design was included in this experiment to provide a test case for judging the sensitivity of the GIM in a test and evaluation environment. In order for it to be an interesting test case it is necessary to demonstrate that there was a legitimate difference between the cockpits for the GIM to evaluate. Therefore the first series of data analyses involved evaluating the non-GIM data to determine if there was a cockpit effect. Following the evaluation of the cockpit effect on non-GIM data, the GIM scoring will be investigated for consistency with the other data.
Non-GIM data analyses
Two major forms of non-GIM data will be discussed in this paper. First, several measures of mission effectiveness will be reviewed to determine if the cockpits influenced the pilots’ success in the air-to-air combat mission. Second, ratings data will be reviewed to determine the pilots’ subjective reactions to the two cockpit designs. Unless otherwise noted, all mentioned effects were statistically significant at the 0.05 level. One of the main statistics used to evaluate success in operational air combat settings is the exchange ratio. The exchange ratio is the number of enemies shot down for each of our own aircraft shot down. It was not possible to analyse the exchange ratio as a metric in an ANOVA because numerous pilots were never shot down in any of the nine trials in a given cockpit, and a ratio cannot be calculated with a zero denominator. Nevertheless, the overall group exchange ratios were calculated and revealed a striking difference between the cockpits. In the Conventional cockpit the exchange ratio versus enemy bombers was 8.7 to 1, but in the Candidate cockpit the exchange ratio was nearly twice as good, 15.2 to 1. The main goal during a trial was to shoot down bombers, and this certainly suggests that the pilots were more effective at doing so in the Candidate cockpit. There was no appreciable difference in the exchange ratios against enemy fighters (Conventional, 1.8 to 1; Candidate, 2.0 to 1).
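The zero-denominator problem, and why pooling across pilots sidesteps it, can be illustrated with a short sketch. The function names and the kill/loss counts below are invented purely for illustration; they are not the study’s data.

```python
from typing import List, Optional, Tuple


def exchange_ratio(kills: int, losses: int) -> Optional[float]:
    """Enemies shot down per own aircraft lost; None when there were no losses."""
    if losses == 0:
        return None  # the ratio is undefined with a zero denominator
    return kills / losses


def group_exchange_ratio(per_pilot: List[Tuple[int, int]]) -> Optional[float]:
    """Pool kills and losses over all pilots before dividing."""
    total_kills = sum(kills for kills, _ in per_pilot)
    total_losses = sum(losses for _, losses in per_pilot)
    return exchange_ratio(total_kills, total_losses)


# Invented (kills, losses) pairs for three pilots in one cockpit.
pilots = [(25, 0), (20, 2), (22, 3)]
individual = [exchange_ratio(kills, losses) for kills, losses in pilots]
pooled = group_exchange_ratio(pilots)
print(individual, pooled)
```

In this invented example the first pilot has no defined individual ratio, which is what rules out a per-pilot ANOVA, while the pooled ratio remains defined as long as at least one aircraft was lost in the group.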
The exchange ratio effect against the bombers was statistically supported by an ANOVA of the bombers shot down before and after reaching the pilot’s airfield. Not only was there a main effect of Cockpit (Conventional, 2.6 bombers; Candidate, 3.1 bombers), there was also an interaction showing that pilots in the Candidate cockpit tended to destroy more bombers prior to bombing (2.59 before bombing vs. 0.54 after) than did pilots in the Conventional cockpit (1.97 vs. 0.67). There were no cockpit-related effects detected in the number of fighters shot down. Happily, fratricides (i.e., killing the friendly fighter or the Airbus) were so rare that analysing the data in an ANOVA was impractical. However, the trend regarding cockpits and fratricides was intriguing. With two possible fratricide targets in each of the 63 trials performed by the pilots in each cockpit, there were a total of 126 fratricide opportunities in each cockpit. In the Conventional cockpit there were 5 fratricide events (3.9%), but in the Candidate cockpit there was only 1 (0.7%). The post-block ratings of subjective mental workload showed a significant advantage for the Candidate cockpit. The ratings of situation awareness showed a promising trend favouring the Candidate cockpit, but the effect was not significant. Overall, the analyses of the non-GIM data support the contention that the Candidate cockpit was a successful redesign. Despite the lack of statistical significance in the subjective situation awareness ratings, the pilots were more effective and efficient in performing the task in the Candidate cockpit. These results, combined with a lower level of reported mental workload and a trend towards higher rated situation awareness, suggest that the cockpit redesign did achieve its goal of improving the pilot’s situation awareness and performance.
GIM data analysis
As a measure of the GIM’s sensitivity to the cockpit effect and its potential utility as a guide for adaptive aiding, the average GIM scores for each of the five possible phases in a trial were calculated and analysed. Not surprisingly, the average GIM score varied as a function of phase (e.g., highest in CAP, 0.935; lowest in Defensive, 0.716; out of a maximum score of 1.000). More importantly, there was a significantly better average GIM score associated with the Candidate cockpit (0.822) than with the Conventional cockpit (0.797). This suggests that the GIM scores were a reliable measure that reflected the same effects detected in the data analyses reviewed in the previous section.
Conclusions
So far, the results of the GIM analyses are encouraging. However, much work remains to be done. Further analyses of the current data set are underway to determine if the individual GIM algorithms that the pilot fails on in the various phases are consistent over time and if they suggest reasonable aids that could be activated by the real-time GIM calculations. If so, the next step would be a test implementation of GIM-activated aiding to determine if it does improve mission performance.
References
Brickman, B.J., Hettinger, L.J., Roe, M.M., Stautberg, D., Vidulich, M.A., Haas, M.W., and Shaw, R.L. 1995, An assessment of situation awareness in an air combat simulation: The Global Implicit Measure approach. In D.J. Garland and M.R. Endsley (eds.) Experimental analysis and measurement of situation awareness, (Embry-Riddle Aeronautical University Press, Daytona Beach), 339–344.
Brickman, B.J., Hettinger, L.J., Stautberg, D., Haas, M.W., Vidulich, M.A., and Shaw, R.L. 1999, The Global Implicit Measurement of situation awareness: Implications for design and adaptive interface technologies. In M.W. Scerbo and M. Mouloua (eds.) Automation technology and human performance: Current research and trends, (Lawrence Erlbaum Associates, Mahwah), 160–164.
Vidulich, M.A. 1995, The role of scope as a feature of situation awareness metrics. In D.J. Garland and M.R. Endsley (eds.) Experimental analysis and measurement of situation awareness, (Embry-Riddle Aeronautical University Press, Daytona Beach), 69–74.
Drivers & driving
BRAVE NEW WORLD: THE VEHICLE AUTOPIA OF THE 21ST CENTURY?
Mark S. Young¹ & Neville A. Stanton²
¹Department of Psychology, University of Southampton, Highfield, Southampton SO17 1BJ, UK
²Department of Design, Brunel University, Runnymede Campus, Coopers Hill Lane, Egham, Surrey TW20 0JZ, UK
With the new millennium upon us, vehicle automation devices such as Adaptive Cruise Control are being offered by major motor manufacturers. Over the last five years, the development of these systems has been reflected by the increasing number of publications in technical journals. However, there does not seem to have been a similar effort in the ergonomics literature devoted to the effects of vehicle automation on driving performance. The current paper investigates whether driving performance with automation changes across levels of driver skill. This issue raises substantial practical concerns. As vehicle automation becomes commonplace, the demographics of the driving population that has access to it will become increasingly variable. Therefore, the results are interpreted with respect to issues of litigation and training for inexperienced drivers with automation.
Introduction
As we enter the new millennium, new vehicle automation devices are being offered by major motor manufacturers. Adaptive Cruise Control (ACC) has been released in the last year, offering total longitudinal control of the vehicle. Soon, we will see lateral control devices such as Active Steering (AS) taking to the roads. During the development of these devices, a number of papers have been published detailing the technology, control strategies, and modelling techniques involved (e.g., Richardson et al., 1997). However, it seems that the ergonomics community has not kept pace with its engineering counterparts, with few publications about the effects of vehicle automation on the driver (for exceptions see Bloomfield and Carroll, 1996; Stanton et al., 1997; Young and Stanton, 1997). In particular, none of the research to date has investigated whether driver skill is an important factor in determining the impact of automation in future vehicles. Driving is a classic example of a skilled activity.
The role of experience and skill in driving has led many to the conclusion that at least some elements of the driving task represent automatic behaviour (Stanton and Marsden, 1996). The advantages of automaticity are realised in areas of driving such as vehicle control (Blaauw, 1982), choice of driving strategy (Coyne, 1994), and brake reaction time (Nilsson, 1995). However, automaticity can also lead to certain accidents. Hale et al. (1988) found that drivers approaching a familiar crossroads had strong expectations that it would be clear, such that they failed to perceive oncoming traffic.
It will eventually become the case that any driver may step into a vehicle equipped with automated systems, regardless of their experience. Initially, novel technologies are fitted to prestige models only, implying that the drivers who have access to them are highly experienced. However, just as with power-assisted steering, anti-lock brakes, and even automatic transmission, these new devices will eventually filter down to become widely available. It is conceivable that a newly qualified driver with basic training could immediately use a vehicle equipped with ACC or, in the future, AS. The interaction of skill and automation is important for a number of reasons. It is posited here that all operators, novices and experts alike, essentially satisfy the criteria for automaticity when faced with automation. In discussing a theory of automaticity as knowledge, Bainbridge (1978) makes the point that increased demand essentially transforms an expert into a novice. It is surely plausible to assume that the reverse would be true in a situation of unusually low demand. However, whereas the expert has an enhanced knowledge base and can anticipate events, the novice is deprived of this ability. Thus novices will not react as experts do in critical situations, for instance with the overlearned braking response (e.g., Nilsson, 1995). The present paper presents driving performance data from a large-scale experiment conducted in the Southampton Driving Simulator. Although the results are of theoretical significance in terms of skill acquisition, the practical applications are highlighted here. Therefore, the primary task results are analysed in detail, with respect to issues of litigation and training for inexperienced drivers with automation.
Method
A mixed design was used.
Level of automation constituted the within-subjects variable, with four levels: manual (i.e., the participant controls speed, headway, and steering), ACC (i.e., longitudinal control is automated), AS (i.e., lateral control is automated), and ACC+AS (i.e., both longitudinal and lateral control are automated). The latter condition essentially constitutes fully automated vehicle control. Order of presentation of these conditions was randomised to counterbalance practice effects. Driver skill level was the between-subjects factor, again with four levels: novice (i.e., never driven before), learner (i.e., currently learning but does not hold a full licence), expert (i.e., held a full licence for at least one year), and advanced (i.e., member of the Institute of Advanced Motorists in the UK). The latter group was chosen as a high-level skill group because these drivers have undertaken further training based on police driving skills, and are considered to be 50–70% less likely to be involved in an accident than other drivers without such training. There were 23 novice drivers in this experiment, and 30 participants in each of the learner, expert, and advanced conditions. A 15-minute practice run was followed by the four experimental conditions, each lasting 10 minutes. In each of the experimental trials, participants were instructed to follow a lead vehicle travelling at 70 mph for the entire duration. The simulated road was a mixture of straight and curved sections.
Data reduction and analysis
Given the demands of the primary task (i.e., maintain a consistent speed and headway), it was felt that a measure of location (i.e., mean) and dispersion (i.e., instability) would suffice as dependent variables for speed and headway. Instability refers to the standard deviation of the regression line of speed/headway against time, and was recommended as a driving performance measure by Bloomfield and Carroll (1996).
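One plausible reading of the instability measure is the standard deviation of the residuals about a least-squares line fitted to speed (or headway) against time. The sketch below follows that reading; it is an assumption rather than the published computation, and the function name `instability` and the sample values are hypothetical.

```python
import math
from typing import Sequence


def instability(times: Sequence[float], values: Sequence[float]) -> float:
    """Standard deviation of residuals about the least-squares line of values on times."""
    n = len(times)
    if n != len(values) or n < 3:
        raise ValueError("need at least three paired samples")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time samples must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    sse = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    # Divide by n - 2 because two parameters (slope, intercept) were fitted.
    return math.sqrt(sse / (n - 2))


# Hypothetical speed samples (mph) at one-second intervals.
steady_drift = instability([0, 1, 2, 3], [70, 71, 72, 73])
oscillating = instability([0, 1, 2, 3], [70, 72, 70, 72])
print(steady_drift, oscillating)
```

Under this reading a driver who drifts steadily in one direction scores close to zero, whereas one who oscillates about the trend scores higher, so the measure separates variability from gradual drift, which the mean would capture instead.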
Dependent variables for lateral position were simpler: time out of lane, and absolute number of lane excursions. It was felt that these would be fairer estimates of performance than mean and instability, as
‘good’ driving performance is not necessarily characterised by driving in a perfect straight line in the centre of the lane (see Coyne, 1994, for further details). All dependent variables were subjected to repeated measures analyses of variance (ANOVAs), with repeated contrasts or post hoc t-tests where appropriate. Only significant results are reported.
Results
There were main effects of automation on average headway (F(3,327)=17.4; p<0.001) and average speed (F(3,327)=8.36; p<0.001). For average headway, repeated contrasts showed no difference between manual and ACC conditions; however, significant reductions in the AS condition (F(1,109)=11.5; p<0.005) and the ACC+AS condition (F(1,109)=39.5; p<0.001) were observed. A similar pattern emerged for average speed, with no difference between manual and ACC conditions, but significant increases in the AS (F(1,109)=5.64; p<0.05) and ACC+AS (F(1,109)=23.6; p<0.001) conditions.
For headway instability, there were main effects of automation (F(3,327)=14.8; p<0.001) and skill (F(3,109)=2.83; p<0.05). Repeated contrasts reveal that headway instability does not differ between manual and ACC conditions, but decreases in the AS (F(1,109)=13.2; p<0.001) and ACC+AS (F(1,109)=5.60; p<0.05) conditions. Post hoc t-tests show that instability in the expert group is lower than in each of the novice (t(210)=−2.60; p<0.05), learner (t(238)=−4.05; p<0.001), and advanced groups (t(238)=−4.62; p<0.001). There was also a significant interaction between skill and automation on headway instability (F(9,327)=2.74; p<0.005). For the purposes of this paper, only interactions in the manual and AS conditions are reported, because the remaining two conditions use ACC, which automates headway control. In the manual condition, the expert group exhibited significantly greater headway than the novice (t(51)=−2.22; p<0.05), learner (t(58)=−3.09; p<0.005) and advanced groups (t(58)=−2.56; p<0.05). In the AS condition, there was more instability in the advanced group than in the expert group (t(58)=−2.08; p<0.05).
Speed instability showed main effects of automation (F(3,327)=48.0; p<0.001) and skill (F(3,109)=28.8; p<0.05). Further investigations reveal no difference between the manual and ACC conditions; however, speed instability decreases with AS (F(1,109)=27.0; p<0.001) and again with ACC+AS (F(1,109)=36.3; p<0.001). In addition, the expert group shows less instability than each of the novice (t(210)=−3.01; p<0.005), learner (t(238)=−3.44; p<0.005), and advanced groups (t(238)=−1.98; p<0.05). An interaction between skill and automation was also found for speed instability (F(9,327)=3.05; p<0.005). Again, only interactions in the manual condition are reported, as speed is automated in each of the ACC conditions (there were no significant interactions in the AS condition). Under manual driving, then, instability for expert drivers is lower than for novices (t(51)=−2.31; p<0.05) and learners (t(58)=−2.63; p<0.05). Also, the advanced drivers showed less instability than the learners (t(58)=2.05; p<0.05).
The lateral position measure of time out of lane showed main effects for automation (F(3,327)=303.0; p<0.001) and skill (F(3,109)=5.43; p<0.005). Post hoc contrasts revealed that time out of lane is reduced when AS is engaged compared to non-AS conditions (F(1,109)=328.5; p<0.001). Advanced drivers spend less time out of lane than novices (t(210)=3.24; p<0.005), learners (t(238)=2.97; p<0.005), and experts (t(238)=2.25; p<0.05). There was an interaction between skill and automation for time out of lane (F(9,327)=5.02; p<0.001). Only significant results in the manual and ACC conditions are reported, as lateral control is automated in the remaining two conditions. In the manual condition, the advanced group spent less time out of lane than the novice (t(51)=3.29; p<0.005), learner (t(58)=3.31; p<0.005), and expert groups (t(58)=2.66; p<0.05). In the ACC condition, the advanced group spent less time out of lane than the novices (t(51)=3.59; p<0.005) and learners (t(58)=3.27; p<0.005); also, the experts spent less time out of lane than the novices (t(51)=−2.14; p<0.05).
CONTEMPOARY ERGONOMICS 2000
87
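The degrees of freedom quoted in these results are consistent with a mixed design in which skill (four groups) is a between-subjects factor and automation (four conditions) is a within-subjects factor. The sketch below shows that bookkeeping under those assumptions. The function name is hypothetical, and the participant total of 113 is inferred here from the error terms rather than quoted from the text.

```python
def mixed_anova_df(n_participants, n_groups, n_conditions):
    """Degrees of freedom for a one-between, one-within mixed ANOVA.

    Assumes the standard split-plot partition with no missing cells.
    Returns (effect df, error df) for each term.
    """
    between_error = n_participants - n_groups   # subjects within groups
    within = n_conditions - 1                   # within-subjects effect
    within_error = within * between_error       # condition x subjects within groups
    return {
        "skill": (n_groups - 1, between_error),
        "automation": (within, within_error),
        "skill x automation": ((n_groups - 1) * within, within_error),
    }


# Assumed inputs: 4 skill groups, 4 automation conditions, and an inferred
# total of 113 drivers (not a figure stated in the text).
df = mixed_anova_df(n_participants=113, n_groups=4, n_conditions=4)
for term, (effect, error) in df.items():
    print(f"{term}: F({effect},{error})")
```

Under these assumptions the three terms carry the same subscripts as the F ratios reported for the skill, automation and interaction effects.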
Finally, the number of lane excursions exhibited main effects for automation (F3,327=662.1; p<0.001) and skill (F3,109=7.23; p<0.001). Lane excursions are reduced when comparing AS conditions to non-AS conditions (F1,109=742.3; p<0.001). Lane excursions are lower for the advanced group compared to the novice (t210=2.81; p<0.01) and learner groups (t238=2.51; p<0.05). An interaction was found between automation and skill for number of lane excursions (F9,327=6.21; p<0.001). Again, only relevant results in the non-AS conditions are presented. In manual driving, advanced drivers made fewer lane excursions than novices (t51=3.25; p<0.005) and learners (t58=3.12; p<0.005). When driving with ACC, the advanced group made fewer lane excursions than the novices (t51=4.40; p<0.001), the learners (t58=4.11; p<0.001), and the experts (t58=2.22; p<0.05), whilst experts made fewer lane excursions than novices (t51=−2.30; p<0.05).
Discussion
The results of this experiment are quite complex, but can be summarised as follows. In general, longitudinal control was at its most stable in the expert driver group, and stability increased with more levels of automation. However, this was tempered by automation, such that when AS was used, the advantage of experts largely disappeared. For lateral control, there is a clear difference between AS and non-AS conditions. Now, though, it is the advanced group who are generally better at keeping the vehicle in lane. Automation reveals a further difference between skill groups, to the extent that when ACC is used, the experts are more stable than novice drivers.
There are a number of implications arising from these results. It seems that increasing levels of automation do indeed attenuate observable differences in performance between skill groups, as suggested above. That is, the performance of inexperienced drivers on longitudinal control resembles that of experts only when lateral control is automated.
On the face of it, this is a promising finding: everybody will drive better with more automation. But what happens when the driver has to take over again? Presumably the automatic behaviour of experts will quickly resume, save for a little skill degradation. Those with less experience, though, may have more trouble recalling stored routines and responding in a controlled manner. So, would increased driver training solve this problem? It is not possible to train for unique and unexpected events, such as automation failure, so this would be a limited solution. This also raises the issue of litigation: if a system failure results in an accident, who would be held responsible?
An investigation into skill differences when coping with automation failure is currently being carried out in the Southampton Driving Simulator. This study will shed more light on the question of whether the knowledge base of experts provides an advantage over less experienced drivers in critical situations.
Perhaps one day, when vehicles become fully automated, these questions will not be so important. In the meantime, while the human is still an integral part of the system, it is necessary to consider their capabilities and limitations. The automated utopia, or ‘autopia’, of the future may very well be mired in questions of litigation and acceptance (cf. Hancock et al., 1996). It is up to ergonomics to ensure that our millennium highways do not turn out to be as dystopian as Huxley’s (1970) own Brave New World.
References
Bainbridge, L. 1978, Forgotten alternatives in skill and work-load, Ergonomics, 21, 169–185.
Blaauw, G.J. 1982, Driving experience and task demands in simulator and instrumented car: a validation study, Human Factors, 24, 473–486.
Bloomfield, J.R., and Carroll, S.A. 1996, New measures of driving performance. In S.A.Robertson (ed.) Contemporary Ergonomics 1996, (Taylor and Francis, London), 335–340.
Coyne, P. 1994, Roadcraft: The police driver’s handbook, (HMSO, London).
Hale, A.R., Quist, B.W., and Stoop, J. 1988, Errors in routine driving tasks: a model and proposed analysis technique, Ergonomics, 31, 631–641.
Huxley, A. 1970, Brave new world, (Chatto and Windus, London).
Nilsson, L. 1995, Safety effects of adaptive cruise control in critical traffic situations, Proceedings of the second world congress on intelligent transport systems: Vol. 3.
Richardson, M., Barber, P., King, P., Hoare, E., and Cooper, D. 1997, Longitudinal driver support systems. Proceedings of Autotech ’97, (IMechE, London).
Stanton, N.A., and Marsden, P. 1996, From fly-by-wire to drive-by-wire: safety implications of automation in vehicles, Safety Science, 24, 35–49.
Stanton, N.A., Young, M. and McCaulder, B. 1997, Drive-by-wire: the case of driver workload and reclaiming control with adaptive cruise control, Safety Science, 27, 149–159.
Young, M.S. and Stanton, N.A. 1997, Automotive automation: investigating the impact on drivers’ mental workload, International Journal of Cognitive Ergonomics, 1, 325–336.
GENDER DIFFERENCES IN PRIMARY AND SECONDARY PERFORMANCE DURING SIMULATED DRIVING
Nikki Brook-Carter, Terry C.Lansdown & Tanita Kersloot
Transport Research Laboratory, Crowthorne, Berkshire RG45 6AU, UK
This paper presents findings from experimental work investigating driver conflicts from multiple in-vehicle information sources. Drivers were presented with visual, auditory, and combined visual and auditory secondary information whilst undertaking a primary tracking task. Findings are presented which suggest a gender-specific disparity in task performance. Female performance in both primary and secondary tasks was found to be significantly lower than that of males. The implications of gender differences in primary and secondary task performance are discussed.
Introduction
The introduction of intelligent transportation systems within vehicles has raised numerous safety concerns. The workload imposed by the primary task of driving is highly variable and the driver is typically left with spare capacity (Rockwell, 1988; Dingus, Antin, Hulse and Wierwille, 1989). This capacity is increasingly being used for additional tasks involving interaction with in-vehicle systems. However, the distractions and interruptions caused by these secondary tasks may threaten the primary task of vehicle control, particularly when these additional systems are not integrated and the driver is interrupted by more than one system simultaneously.
Few studies have reported gender differences in driving or secondary sub-task performance. Dorn et al. (1991) suggested that females find driving attentionally more demanding in comparison to males. Storie (1977) found that males are more likely to have been travelling too fast before an accident, whereas females are more likely to have shown cognitive errors. Findings from these and other studies imply that although males deploy their driving-related information processing abilities more effectively during vehicle operation, females do not exhibit sensation seeking or excessive risk taking to the same degree as their male counterparts.
Methodological problems in research on gender and driving have potentially confounded some of the results of these studies. Over (1998) stressed that when considering gender differences in driving it is essential to consider factors such as driving exposure. For example, in Storie’s (1977) study the females may have had less driving experience than the males, which may account for the errors observed. Studies that have attempted to control for relevant factors have presented interesting results. Evans (1994) considered the apparently higher accident occurrences for male and particularly young drivers,
finding that when the accident occurrences were controlled for exposure (in this case, mileage) the significant differences between gender and age distributions disappeared. Similarly, it could be suggested that exposure to computer interfaces might affect performance on interactions with computer simulations. Levin (1994) reported that among adults, eighteen percent of men and only nine percent of women use a home PC regularly.
This study investigated the impact of multiple secondary tasks on driving performance. During the investigation surprising gender differences emerged, which are presented and discussed below.
Method
Twenty-two participants were involved in this study. The age of the participants ranged from 20 to 58 years (mean=42). The sample consisted of 12 males and 10 females. The level of driving experience of the males ranged from 9 to 30k miles pa (mean=16k) and the level of experience of the females ranged from 7.5 to 30k miles pa (mean=13k).
The primary task was presented on a central screen (see Figure 1). During the Training, Control and Experimental conditions, participants were asked to maintain the inner ‘white’ bar within the randomly moving ‘black’ bar using the steering wheel. When the inner white bar moved outside the edge of the moving black bar a lane exceedence was noted. The primary tracking task was designed to represent pursuit (driver following the road) and compensatory (vehicle drifting in lane) aspects of lateral control of the vehicle.
In the secondary task conditions participants were required to respond to even numbers and ignore odd numbers that were simultaneously presented on screens on either side of the central display, or through headphones in the auditory condition. Participants responded by pressing the corresponding Left and Right buttons. In the visual & auditory condition, stimuli appeared in both visual and auditory displays, but always one in the left and one in the right.
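The secondary task described above amounts to a simple scoring rule: a button press to an even number is a hit, and a press to an odd number is a false alarm. A minimal sketch of that rule follows. The `Trial` structure, the function name, and the treatment of wrong-side presses as incorrect are assumptions made for illustration, and the example trials are made up rather than taken from the study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    number: int               # digit presented on a side screen or headphone channel
    side: str                 # "L" or "R": side on which it was presented
    response: Optional[str]   # button pressed ("L" or "R"), or None if no press
    rt_s: Optional[float]     # reaction time in seconds, None if no press


def score_secondary_task(trials: List[Trial]) -> dict:
    """Count hits and false alarms and average their reaction times."""
    hit_rts, false_alarm_rts = [], []
    for t in trials:
        if t.response is None:
            continue  # no press: neither a hit nor a false alarm
        is_target = t.number % 2 == 0
        if is_target and t.response == t.side:
            hit_rts.append(t.rt_s)
        else:
            # Assumption: a press to an odd number, or on the wrong side,
            # counts as an incorrect response.
            false_alarm_rts.append(t.rt_s)

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "hits": len(hit_rts),
        "false_alarms": len(false_alarm_rts),
        "mean_hit_rt_s": mean(hit_rts),
        "mean_false_alarm_rt_s": mean(false_alarm_rts),
    }


# Made-up trials for illustration only; these are not data from the study.
example = [
    Trial(4, "L", "L", 0.80),   # even, correct side: hit
    Trial(7, "R", "R", 1.00),   # odd, pressed: false alarm
    Trial(2, "R", None, None),  # even, no press: ignored here
    Trial(8, "R", "R", 0.90),   # even, correct side: hit
]
result = score_secondary_task(example)
```

Missed targets are not counted in this sketch, since the paper reports only hits and false alarms.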
After each condition, the NASA Task Load Index (TLX) subjective mental workload assessment (Hart and Staveland, 1988) was administered.
Figure 1. Primary tracking task screen display
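The lane exceedence measure can be sketched as an edge-crossing count on sampled bar positions. The version below assumes that bar centres are sampled at regular intervals, that both bars have fixed widths, and that one exceedence is counted per excursion rather than per sample. None of these details is given in the paper, and the function name and sample values are hypothetical.

```python
def count_exceedences(white_pos, black_pos, white_width, black_width):
    """Count the times the white bar crosses outside the black bar.

    white_pos and black_pos are sequences of bar-centre positions sampled
    at the same instants. An exceedence is counted on each transition from
    'inside' to 'outside'.
    """
    # Largest centre offset at which the white bar still fits inside the black bar.
    margin = (black_width - white_width) / 2
    count = 0
    was_outside = False
    for white, black in zip(white_pos, black_pos):
        is_outside = abs(white - black) > margin
        if is_outside and not was_outside:
            count += 1
        was_outside = is_outside
    return count


# Made-up samples for illustration: the black bar is held at 0 for simplicity.
white_samples = [0.0, 0.5, 1.2, 1.3, 0.2, -1.5, 0.0]
black_samples = [0.0] * len(white_samples)
n_exceedences = count_exceedences(white_samples, black_samples,
                                  white_width=1.0, black_width=3.0)
```

Counting per excursion keeps a single long drift from being scored as many separate exceedences; a per-sample count would instead approximate time spent out of lane.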
Results
Mental Workload
A Mann Whitney U-test showed that females reported the mental workload sub-scale ‘performance’ to be significantly poorer (mean=72.33) in comparison to males (mean=50.00) in the auditory condition (U=19, Z=−2.49, p<0.05). Performance was reverse scored, i.e., a higher score means poorer perceived performance.
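The relation between a Mann Whitney U statistic and its Z value can be sketched with the usual large-sample normal approximation. This is an illustration under stated assumptions, not the authors' computation: the function names are hypothetical, no tie or continuity correction is applied, and the group sizes are the overall sample sizes from the Method, which may differ from the number of usable responses in any one test.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation to Mann-Whitney U (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def two_sided_p(z):
    """Two-sided p-value for a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2))


# U is taken from the text; n1 and n2 are the overall group sizes (assumed).
z = mann_whitney_z(u=19, n1=12, n2=10)
p = two_sided_p(z)
```

With these inputs the approximation need not reproduce the reported Z exactly, since any ties, corrections or missing responses in the original analysis are not known here.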
Primary Tracking Task
Analysis of variance was carried out on these data. Figure 2 illustrates that significantly more lane exceedences were observed for females than males (F=17.19, df=1, p<0.001) during the separate conditions. Significantly fewer lane exceedences were observed in the control condition in comparison with the conditions involving a secondary task (F=10.34, df=3, p<0.001). There were no significant interactions between gender and condition.
Figure 2. Mean lane exceedences by condition and gender
Figure 3. Tracking performance by condition and gender
Significantly worse tracking performance was found for females in comparison to males in all four conditions (F=8.38, df=1, p<0.05); see Figure 3. A significant difference for steering deviations between conditions was not observed. There were no significant interactions between gender and condition.
Secondary Task Responses and Reaction Times
A Mann Whitney U-test revealed that females made significantly more incorrect responses (mean=4.64) in the visual and auditory condition than males (mean=1.45), (U=20.0, df=1, p<0.05). Friedman tests found significant gender differences in mean reaction time in the visual condition for both hits (correct responses), (χ2=4.42, df=1, p<0.05) and false alarms (incorrect responses), (χ2=4.28, df=1, p<0.05). Females were found to respond more quickly (mean=0.82s) when making hits than males
(mean=1.05s), whereas females were significantly slower (mean=1.03s) in making false alarms than males (mean=0.93s).
Discussion
This paper has presented gender-related findings from experimental work investigating driver conflicts from multiple in-vehicle information sources. During primary task performance females made more lane exceedences and greater steering deviations than males. In the subjective assessments females correctly reported their performance to be inferior in comparison to males. Steering deviations were observed to increase for males during the introduction of secondary tasks. However, although the relative level of primary task performance was lower for females, their lane tracking performance in the primary task was not significantly affected by the addition of secondary tasks. It is therefore not surprising that, in general, female performance in secondary tasks was found to be poorer than that of males.
Females made more false alarms than males in the visual and auditory condition and had quicker false alarm reaction times in the auditory condition. These results could be interpreted as females expressing less caution in responding to the task stimuli than males, or females experiencing increased difficulty in distinguishing the conflicting visual and auditory information. In the visual condition female reaction times for hits were significantly quicker, and for false alarms were significantly slower, than those of males. In this condition the number of lane exceedences was found to be highest and females were found to make significantly more lane exceedences than males. It therefore seems reasonable to suggest that females were giving higher priority to the visual task than the primary task.
In some gender studies results may be questioned because of methodological problems, e.g., driving exposure (Over, 1998).
Familiarity with driving was controlled in this study; therefore an experience differential is not thought to be responsible for the differences identified. It is possible that there was a gender difference in the base skill level of participants during the secondary task, due to differences in exposure to computers.
The findings from this study provide evidence that some gender differences exist when interacting with primary or primary and secondary tasks. However, caution is urged in generalising findings from this study to public road use. The tracking task employed may at best be considered to represent some lateral control aspects of vehicle control.
Conclusions
This paper presents findings indicating that female performance during a primary tracking task was inferior to that of males. Females were shown to deviate in lane and make lane exceedences more frequently than males when conducting the experimental tasks. Females correctly reported lower task performance than males. During the secondary tasks involving an auditory aspect, females were found to make more false alarms and react more slowly than males. However, when correctly undertaking secondary tasks, females generally reacted faster in their decision making than males.
Acknowledgements
This experiment was undertaken as part of the UK Department of the Environment, Transport and the Regions (DETR) Traffic, Management and Tolls (TMT) Division funded project UG140 ‘Optimise’. The authors would like to thank DETR for their support during this undertaking.
References
Dingus, T.A., Antin, J.A., Hulse, M.C., and Wierwille, W.W. 1989, Attentional demand requirements of an automobile moving-map navigation system. Transportation Research (a) 23 (4): 301–315.
Dorn, L., Glendon, A.I., Hoyes, T.W., Matthews, G., Davies, D.R., and Taylor, R.G. 1991, Group differences in driving performance. Behavioural Research in Road Safety II. Proceedings of a seminar at Manchester University. Edited by G.B.Grayson. (TRL, Crowthorne) (PA2/93/92) (IRRD 853020, pp. 68–78).
Evans, L. 1994, The older driver problem: an epidemiological overview. Paper presented at the 14th Enhanced Safety of Vehicles Conference, Munich, Germany, May.
Hart, S.G., and Staveland, L.E. 1988, Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In P.Hancock & N.Meshkati (Eds.), Advances in psychology: human mental workload, pp. 139–183. Elsevier Science, North Holland.
Levin, C. 1994, Survey reveals psyche of wired society. PC Magazine, Vol 13, Issue 14, p30. Ziff-Davis Publishing Co., New York.
Over, R. 1998, Women and cars: emancipation, enrichment and efficiency. The Public Policy department, the Automobile Association, Hampshire.
Storie, V.J. 1977, Male and female car drivers: differences observed in accidents. Transport and Road Research Report 761. Transport and Road Research Laboratory, Crowthorne.
Copyright © Transport Research Laboratory 2000
USING OBSERVATION OF ONE TRAFFIC VIOLATION TO PREDICT AN IMMEDIATE SECOND VIOLATION
Tay Wilson & Céline Arsenault
Psychology Department, Laurentian University, Ramsey Lake Road, Sudbury, Ontario, Canada P3E 2C6
Tel: (705) 675–1151
One on-road driving practice that has been insufficiently studied is the probability that a driver, having committed one traffic violation, will commit a second traffic violation on the immediately next independent traffic interaction. Data collected over eight years on vehicles going through a stop sign without stopping immediately after improperly overtaking a vehicle travelling at the speed limit yielded a probability of an immediate second traffic violation approaching unity. The extension of the results of this study to other driving situations is examined. The application of results, in this context, for advanced driver training is developed. It is contended firstly, that it is the attainment of specific (e.g., knowing that only a very small percentage of drivers attempt a particular manoeuvre in particular circumstances) rather than general “social” skills (e.g., “drive carefully”) which is most likely to reduce collisions (Wilson, 1991; Wilson, 1996); and secondly, that particularly among inexperienced drivers, poor notions of the probabilities of specific driving actions lead to risky driving manoeuvres.
One on-road driving practice that has been insufficiently studied is the probability that a driver, having committed one traffic violation, will commit a second traffic violation on the immediately next independent traffic interaction. In this paper, this driving practice will be examined by a re-analysis of data from some dozen earlier studies followed by the presentation of new data bearing on failing to stop at a stop sign immediately after having overtaken a vehicle at a speed above the posted limit.
Data from the following on-road driving style observation studies, spanning fifteen years, in which the first author was involved, were re-examined to search for evidence of relative frequency of immediately subsequent traffic violations: Wilson and Best (1982), Wilson, Postans, and Garrod (1983), Postans and Wilson (1983), Wilson and McArdle (1992), Wilson and Godin (1993; 1994), Wilson and Ng’Andu (1994), Wilson and Neff (1995), Wilson and Chisel (1997), Wilson (1996; 1997; 1999). In only 3 of these studies (Wilson, Postans, & Garrod, 1983; Wilson & Best, 1982; Wilson & McArdle, 1992) were such data found; moreover, the results were meager. It is to be emphasized that the data of interest here are traffic transgressions which occur at the immediately next opportunity after a first traffic transgression has occurred. The results appear below.
Successive Motorway Overtaking Transgressions
Wilson, Postans, and Garrod (1983) provided data on 1129 lorry or coach overtaking incidents on a Bedfordshire section of the M1 motorway in England. Overtaking was classified as: remaining in the inside lane (RIIL) and either overtaking vehicles in outside lanes or indicating desire to overtake by tailgating (following gap of less than 1 sec); remaining in the centre lane (RICL) while overtaking vehicles in the inside lane; remaining in the outside lane (RIOL) while overtaking vehicles in the inside two lanes; two thirds early procedure (2/3E), in which the overtaking vehicle moves out to an outside lane to overtake but fails to return to the original lane after overtaking; two thirds late procedure (2/3L), in which the overtaking vehicle begins the overtaking manoeuvre in an outside lane but returns to an inside lane after overtaking; or full overtaking procedure (FOP), in which the overtaking vehicle begins and ends in the same lane. In addition, three types of encroachment were observed: tailgating (following with a gap of less than one second), cutting in (forcing a vehicle to slow down or move over when the passing vehicle changes lanes) and lane sharing (failing to keep the entire vehicle in one lane while overtaking).
Are there any data in this study which might bear upon the issue of relative rate of second traffic transgression? For lorries, if one considers remaining in the centre lane (RICL) after overtaking to be inappropriate relative to carrying out a two-thirds late or a full overtaking manoeuvre, then the probability of immediately subsequent tailgating (p=130/396=0.33) when remaining in the centre lane is significantly higher (χ2=127.3, df=2, p<0.001) than the probability of such behaviour under two thirds late (p=0/320=0) or full overtaking procedure (p=17/132=0.13). Coaches provide a similar result albeit with a much smaller data base.
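The lorry comparison above can be sketched as a Pearson chi-square test on the three tailgating proportions. The code below is an illustration using only the counts quoted in the text. The function names are hypothetical, no continuity correction is applied, and the p-value helper covers only one and two degrees of freedom, where closed forms exist.

```python
import math


def chi_square_proportions(events, totals):
    """Pearson chi-square for k groups with a yes/no outcome.

    events[i] is the number of 'yes' outcomes in group i and totals[i] the
    group size. Returns (statistic, degrees of freedom).
    """
    pooled = sum(events) / sum(totals)
    stat = 0.0
    for e, t in zip(events, totals):
        expected_yes = t * pooled
        expected_no = t * (1 - pooled)
        stat += (e - expected_yes) ** 2 / expected_yes
        stat += ((t - e) - expected_no) ** 2 / expected_no
    return stat, len(totals) - 1


def upper_tail_p(stat, df):
    """Upper-tail chi-square probability; closed forms for df 1 and 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch handles df = 1 or df = 2 only")


# (tailgating incidents, overtakings) for lorries, as quoted in the text.
tailgating = {"RICL": (130, 396), "2/3L": (0, 320), "FOP": (17, 132)}
proportions = {k: round(e / t, 2) for k, (e, t) in tailgating.items()}
events = [e for e, _ in tailgating.values()]
totals = [t for _, t in tailgating.values()]
stat, df = chi_square_proportions(events, totals)
p_value = upper_tail_p(stat, df)
```

A statistic computed this way may differ somewhat from the published value, since the original analysis may have used a correction or a different grouping; the conclusion of a highly significant difference is the same.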
The probability of immediately subsequent tailgating for RICL coaches (p=6/38=0.16) is significantly higher (χ2=5.1, df=1, p<0.02) than that for two thirds late or full overtaking procedure coaches (p=0/28=0). No other significant differences were noted.
Successive Two-lane Overtaking Transgressions
Wilson and Best (1982) assessed 422 overtaking manoeuvres of flying overtakers (the overtaker does not attempt to follow the overtaken vehicle but maintains speed and overtakes the vehicle ahead) and accelerative overtakers (involving a period of speed reduction and following before overtaking) on a two-lane A-road in England in terms of, inter alia, “sudden braking to follow”, “piggy backing” (overtaking a vehicle in a platoon of two or more cars), “cutting-in” (forcing the overtaken vehicle to take such action as braking or altering course when the overtaking vehicle returns to the travelling lane) and “lane-sharing” (the overtaking vehicle fails to completely cross the central line and so passes the overtaken vehicle partially in both lanes). If using a small gap (less than 400 metres, the Crawford (1963) distance required to pass safely at a speed of 60 km/hr), sharing a lane during overtaking and cutting in are considered driving transgressions, then the following findings appear to be relevant to the relative frequency of immediately successive transgressions. The probability of immediately subsequent “cutting-in” following braking to follow (5/11=0.45) was significantly greater (χ2=6.8, df=1, p<0.01) than the probability for those who did not brake to follow (54/411=0.13). Accelerative overtakers using small gaps for overtaking performed more cutting in (27/48 vs 7/353, χ2=163, df=1, p<0.001) and more lane sharing (21/48 vs 25/353, χ2=49.2, df=1, p<0.001). The proportion of accelerative overtakers using small gaps for overtaking who lane shared without cutting in (15/21) was greater than that of those (6/27) who lane shared and cut in together (χ2=9.7, df=1, p<0.002).
Piggybackers using small gaps performed more cutting in (4/13 vs 0/17) (χ2=4.6, df=1, p<0.05) and more lane sharing (10/13 vs 5/17) than when using large gaps (χ2=6.6, df=1, p<0.05). Similarly, non-piggybackers using small gaps performed more cutting in (27/95 vs 7/347) (χ2=164, df=1, p<0.001) and more lane sharing (18/45 vs 26/347) (χ2=42.8, df=1, p<0.001) than when using large gaps for overtaking. Overtakers using small gaps without having braked to follow before overtaking cut in (29/52 vs 7/359) (χ2=148, df=1, p<0.001) and lane shared (23/52 vs 31/359) (χ2=48.5, df=1, p<0.001) more than those who used large gaps.
Successive Transgressions at Pedestrian Crossings
Finally, Wilson and McArdle (1992) observed pedestrian-driver incidents at a set of pedestrian crosswalks in northern London, England. One incident involved the recording of a traffic transgression plus the description of resulting behaviour at the immediately next traffic interaction opportunity. A lorry behind a car waiting for pedestrians to cross at a crossing honked loudly, “forcing” the car to proceed around pedestrians who were well into the roadway. The lorry then turned the corner and failed to give way to the pedestrians, who were forced to run across the road to complete their journey. With this small N of one, the probability of transgression on the immediately next opportunity is unity.
Driving through a Stop Sign Immediately After Overtaking above the Speed Limit
More data on the probability of committing a second driving transgression on the immediately next opportunity are clearly needed. In the present study, attention was focussed on the probability of driving through a stop sign without stopping immediately after having overtaken a driver at a speed well above the posted speed limit. The road chosen was a 900-metre stretch of South Bay Road leading from Ramsey Lake Road to the Laurentian University turn-off in Sudbury, Ontario. The road winds downhill in such a manner that over sections of the road there is insufficient sight line distance for overtaking. The posted speed limit is 50 km/h. White lines are painted on both sides of the road, demarcating a one to two metre strip of pavement used extensively by pedestrians and cyclists for commuting, jogging and recreational walking.
Since, uphill, the road “dead ends” at the university on the right and at a small residential subdivision on the left, the traffic is overwhelmingly local and largely from the university. Therefore it is safe to conclude that at least 95% of all drivers on the road are aware of the geometry of the road, the existence of the stop sign at the bottom of the road and its frequent multi-purpose use by pedestrians. Indeed, it is likely that most of the drivers work or study alongside these pedestrian and cycle road users. As is usually the case in city driving, there is little to be gained by driving fast on this stretch. Even travelling at 50% above the speed limit (clearly a dangerous option on such a road) over the complete stretch, blind area and all, would only reduce one’s trip time from sixty-five to forty-three seconds. Consequently, there is not a huge number of overtakings on the road. In order to collect a reasonable number of on-road observations of the behaviour in question, the first author was forced to drive down the road at the speed limit at various times of the day during clear traffic conditions over a five year period. Every time a vehicle overtook the experimenter’s vehicle on this stretch, the overtaking vehicle was subsequently observed to see if it then stopped at the stop sign at the bottom of the road and whether the intersecting road was clear of traffic. Some thirty-one cases were observed. In all but one case the overtaking vehicle was judged to gain at least 10 secs, or about 150 metres or more, on the experimenter’s vehicle in half of the road distance, and hence to be travelling at least 25% faster, and generally closer to 50% faster, than the experimenter’s vehicle and hence the speed limit. At the stop sign just under half of the fast overtakers (14 out of 30) had a clear road ahead and all of them drove straight through without stopping at the stop sign.
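The trip-time figures quoted above follow from the 900-metre road length and the 50 km/h limit. A short check is sketched below; the function and constant names are hypothetical, and a constant speed over the whole stretch is assumed.

```python
def trip_time_s(distance_m, speed_kmh):
    """Time in seconds to cover distance_m at a constant speed_kmh."""
    return distance_m / (speed_kmh / 3.6)  # km/h divided by 3.6 gives m/s


ROAD_LENGTH_M = 900       # stretch of road described in the text
SPEED_LIMIT_KMH = 50      # posted limit

at_limit = trip_time_s(ROAD_LENGTH_M, SPEED_LIMIT_KMH)
at_150_percent = trip_time_s(ROAD_LENGTH_M, SPEED_LIMIT_KMH * 1.5)
saving = at_limit - at_150_percent
print(f"{at_limit:.0f} s at the limit, {at_150_percent:.0f} s at 50% over, "
      f"saving about {saving:.0f} s")
```

Rounded to whole seconds these match the sixty-five and forty-three seconds given in the text, so the gain from speeding on this stretch is a little over twenty seconds.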
The one slow overtaker who gained considerably less than ten seconds on the experimenter after overtaking came to a slow rolling stop at the stop sign. Over a five year period then the observed probability of failing to stop at the next available stop sign when the road is clear for high speed overtakers is unity. These results were compared with an observation of 175
vehicles over four hours across two days arriving at the stop sign in the absence of any following or cross traffic. Stops were classified as full or virtually full stop (74), slow roll (50), fast roll (48) and complete failure to stop (3). Among these vehicles the proportion completely failing to stop (3/175) is significantly lower than the proportion of overtaking vehicles failing to stop (14/14=1) as observed above (χ2=145, df=1, p<0.001). Even if the fast rolling stops are combined with complete failures, the proportion of such (51/175) is significantly lower than that of overtaking drivers (χ2=27.8, df=1, p<0.001).
Discussion
Given the emphasis upon developing specific concrete social driving skills as opposed to general “be careful” skills, how can data such as these be used in advanced driver improvement instruction? First, with regard to lorry and coach overtaking on the motorway, one might instruct drivers to be prepared for a high probability of immediately subsequent tailgating when the overtaking lorry or coach remains in the centre lane relative to when the overtaking lorry or coach returns to an inside lane. Second, regarding overtaking on a two-lane A-road, one might instruct drivers first to be prepared for a relatively high probability of cutting in after overtaking when the following vehicle brakes hard to follow just before overtaking or when the overtaking vehicle uses a small gap for overtaking; further, to be prepared for a relatively high probability of overtaking with lane sharing (passing very close to the overtaken vehicle) when small overtaking gaps are used; moreover, to be particularly prepared for a relatively high proportion of piggy backers (two or more vehicles overtaking together) who cut in and lane share. Finally, with regard to Sudbury city overtaking and stop signs, drivers might be instructed to be prepared for a relatively high probability of stop sign running immediately after being overtaken at relatively high speed by a vehicle.
All of these specific points allow drivers to take prior precautionary measures which should lower the probability of a subsequent collision. There is clearly a need for, and great opportunity in, further studying and applying knowledge of driving transgressions which immediately follow a first transgression.
References
Crawford, A., 1963, The overtaking driver, Ergonomics, 6(2), 153–169.
Postans, R.L. and Wilson, W.T., 1983, Close following on the motorway, Ergonomics, 2(4), 317–327.
Wilson, T. and Best, W., 1982, Driving strategies in overtaking, Accident Analysis and Prevention, 14(3), 179–185.
Wilson, T., Postans, R., and Garrod, G., 1983, Lorry and coach overtaking on the motorway, Traffic Engineering and Control, June/July, 311–314.
Wilson, T. and Smith, T., 1983, Driving after stroke. International Rehabilitation Medicine, 5, 170–177.
Wilson, T., 1991, Locale driving assessment: a neglected base of driver improvement interventions. In E.J.Lovesey (ed.) Contemporary Ergonomics, (Praeger, London), 388–393.
Wilson, T. and McArdle, G., 1992, Driving style caused pedestrian incidents at corner and zebra crossings. In E.J.Lovesey (ed.) Contemporary Ergonomics, (Praeger, London), 250–255.
Wilson, T. and Godin, M., 1993, A study of co-operation extended to trapped merging drivers. In E.J.Lovesey (ed.) Contemporary Ergonomics, (Praeger, London), 387–391.
Wilson, T. and Godin, M., 1994, Pedestrian/Vehicle crossing incidents near shopping centres in Sudbury, Canada. In Contemporary Ergonomics (Praeger, London), 186–192.
Wilson, T. and Ng’Andu, B., 1994, Trip time estimation errors for drivers classified by accident and experience. In S.A.Robertson (ed.) Contemporary Ergonomics, (Praeger, London), 217–222.
Wilson, T. and Neff, C., 1995, Vehicle overtaking in the clear-out phase after overturned lorry has closed a highway. In S.A.Robertson (ed.) Contemporary Ergonomics, (Praeger, London), 299–303.
Wilson, T., 1996, Normal traffic flow usage of purpose built overtaking lanes: A technique for assessing need for highway four-laning. In S.A.Robertson (ed.) Contemporary Ergonomics, (Praeger, London), 329–333.
Wilson, T. and Chisel, C., 1997, On-campus pedestrian crossings: Opportunity for locale based driver improvement. In S.A.Robertson (ed.) Contemporary Ergonomics, (Praeger, London), 86–91.
Wilson, T., 1997, Overtaking on the Trans-Canada highway: Conventional wisdom revised. In S.A.Robertson (ed.) Contemporary Ergonomics, (Praeger, London).
Wilson, T., 1999, Driving the aftermath of collision-closed highways: Road rage and advanced driver hints. Engineering Psychology and Cognitive Ergonomics, 3, 345–349.
Error & systems
ANALYSIS OF SHIFT CHANGE IN THE AIRCRAFT MAINTENANCE ENVIRONMENT: FINDINGS AND RECOMMENDATIONS
Anand K.Gramopadhye & Kuldeep Kelkar
Department of Industrial Engineering, Clemson University, Clemson SC 29634, USA
Shift change has been widely reported as a cause of errors/accidents in the aircraft maintenance industry. To alleviate this situation, the industry has developed ad-hoc measures and general guidelines to assist various personnel involved in the shift change process. This approach has resulted in various aircraft maintenance organizations developing their own internal procedures, which vary in their level of instruction/detail. As a result, shift change procedures are often not standardized. Moreover, they are not based on sound principles of human factors design. In response to this need, this research looked at the shift change process at representative aircraft maintenance sites. A detailed task analysis of the shift change process led to the development of a taxonomy of errors. This analysis, along with the taxonomy of errors, was then used to identify human factors interventions to develop a standardized shift change process to minimize shift change errors.
Introduction
Task analysis of maintenance activities has revealed aircraft inspection to be a complex activity requiring above average coordination, communication and cooperation between inspectors, maintenance personnel, supervisors and various other sub-systems (e.g., planning, stores, clean-up crew, shops) to be effective and efficient. A large portion of the work done by inspectors and maintenance technicians is accomplished through teamwork. The challenge is to work autonomously but still be a part of the team. In a typical maintenance environment, the inspector first looks for defects and reports them. The maintenance personnel then repair the reported defects and work with the original inspector or the buy-back inspector to ensure that the job meets predefined standards. During the entire process, the inspectors and maintenance technicians work with their colleagues from the same shift and the next shift as well as personnel from planning, stores, etc.
as part of a larger team to ensure that the task gets completed (FAA 1991). Thus, in a typical maintenance environment, the technician has to learn to be a team member, communicating and coordinating activities with other technicians and inspectors. One of the areas requiring the use of effective team skills is shift change, but this procedure has been widely reported as a cause of several errors/accidents in the aircraft maintenance industry (see FAA 1991, FAA 1993, Hobbs et al. 1995 and the recent Continental Express crash). This can be attributed to a lack of well-defined shift change procedures for use by the aircraft maintenance industry. In response to this need, industry has developed ad-hoc measures and general guidelines to assist various personnel involved in the shift change process. This has resulted in
various organizations developing their own internal procedures, which vary in their level of instruction/detail. Because of this situation, shift change procedures are not standardized across the industry. Moreover, they are often not based on sound principles of human factors design. Hence, there exists a need to look at the shift change process. In response to this need, this research looked at the entire shift change process to identify human factors interventions that can be applied to develop a standardized shift change process which will minimize shift change errors.
Analysis of the Shift Change Operation
A detailed task analysis (Gramopadhye and Thaker 1998) of the operations was conducted with data collected using shadowing, observation, and interviewing techniques. The tasks were analyzed using HTA (Hierarchical Task Analysis) and column formats (see Figures 1 and 2). Following the analysis, a comprehensive error classification scheme was developed to classify the potential errors by expanding each step of the task analysis into sub-steps and then listing all the failure modes for each sub-step using the Failure Modes and Effects Analysis approach (Hobbs et al. 1995) (see Figure 3). Following this, a classification scheme for errors was developed based on Rouse and Rouse’s (1983) human error classification scheme.
For the analyses, the team partners provided the research team with access to their facilities, personnel, and documentation and allowed the research team to analyze their existing shift change protocol. The team analyzed shift change at three different maintenance sites at different times of the shift. The research team worked with the manager, line supervisor/shift foreman, inspectors, and aircraft maintenance technicians. The research team visited sites that had both light and heavy inspection and maintenance work.
During a typical site visit, the research team followed one or more inspectors and maintenance technicians, attended shift meetings, and asked probing questions, where necessary, during direct observations. Following this step, the researchers conducted follow-up interviews with the various personnel involved to ensure that all aspects of the shift change process were covered. These interviews covered issues concerning the tasks they were undertaking or had just performed and general issues concerning their work environment, both physical and organizational. All data was contributed anonymously, and system participants were honest, motivated to assist the research team, and concerned about improving aviation safety.
Findings
Following observations and discussions with various shift teams and a detailed task analysis of the shift change processes, the following general observations were made about the shift hand-over procedures between an outgoing and an incoming shift. These observations were in addition to those identified using the error taxonomy (see Gramopadhye and Kelkar 1999).
Shift Protocol Related Issues: In general, the shift hand-over procedures did not follow any defined protocol. The procedures were informal and often ad hoc. The discussions relied primarily, and in some cases heavily, on oral communication. The level of detail and discussion was dependent on the inspectors, maintenance technicians, and supervisors. Although companies have outlined basic shift change procedures, these often were not strictly adhered to. Moreover, these procedures are often difficult to locate. Detailed procedures need to be developed for situations where continuing work is transferred from one shift to the next: for example, when (a) work is started on one shift but has to be stopped and continued on the next one because of various circumstances such as personnel availability, non-availability of parts or equipment,
Figure 1. Hierarchical description of the shift change process
Figure 2. Task analysis of the shift change process
Figure 3. Error Taxonomy
parallel work, reassignment of work; (b) work is started but partially completed, with some items completed but not signed off; and (c) work is started and partially completed, with all completed items signed off.
Awareness and Enforcement Related Issues: Discussion with personnel revealed that they were not fully aware of the company’s written procedures on shift hand-over, nor consistent in reporting them, although all emphasized the importance of a proper shift hand-over. Although personnel were aware of the need for face-to-face debriefings during shift change, these often did not take place. Moreover, the nature of the debriefing between individual personnel at work sites for work-in-progress was left to individual personnel.
Information Related Issues: Written communication on work in progress is not standardized. Personnel provide different levels of detail on work completed and work in progress. There exists a need for an efficient and effective system that will facilitate the transfer of information on work in progress from one shift to the next. Often personnel have to retrieve written information on work in progress from various sources and follow an involved and complicated route of procedures.
Training: Shift change training on the use of correct shift change procedures and the importance of following correct protocol is not a part of regular training at most facilities.
Organizational Support: A critical missing component was management support for a standardized shift change protocol. In the absence of an industry-wide standard, organizations have developed their own standards. Moreover, enforcement by management of the existing shift change protocol was often found to be lacking. The protocol was not communicated to various personnel involved in shift change. In the absence of such communication, individuals had developed their own internal procedures. Thus, there was much variability in the way shift change was accomplished.
MRM Related Issues: After analysis, it was clear that personnel need training on MRM-related issues such as communication, interpersonal relationships, leadership, and decision-making. These skills are critical for facilitating a smooth shift change, but most organizations do not have programs in place to train personnel on them.
Lack of Useful Job-Aids: Shift change tasks can be aided through the provision of decision support tools and job aids. Often, supervisors had to rely on memory, experience, and judgment to decide on work assignments, organize shift meetings, and estimate work status.
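The three carry-over situations (a) to (c) and the call for a standardized record of work in progress can be illustrated with a small data structure. The following Python sketch is not the protocol developed in the study; the class names, the `task_card` field, and the rule that maps a record to situation (a), (b) or (c) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CarryOver(Enum):
    """The three hand-over situations listed under Shift Protocol Related Issues."""
    STOPPED = "a"                 # work started, then stopped and passed on
    PARTIAL_UNSIGNED = "b"        # some completed items not yet signed off
    PARTIAL_ALL_SIGNED = "c"      # every completed item signed off


@dataclass
class WorkItem:
    description: str
    completed: bool = False
    signed_off: bool = False


@dataclass
class HandoverRecord:
    task_card: str                          # hypothetical identifier
    items: List[WorkItem] = field(default_factory=list)
    reason_stopped: str = ""                # e.g. parts not available

    def situation(self) -> CarryOver:
        # Assumed mapping: no completed item -> (a); otherwise (b) or (c)
        # depending on whether any completed item still lacks a sign-off.
        done = [i for i in self.items if i.completed]
        if not done:
            return CarryOver.STOPPED
        if any(not i.signed_off for i in done):
            return CarryOver.PARTIAL_UNSIGNED
        return CarryOver.PARTIAL_ALL_SIGNED

    def briefing_points(self) -> List[str]:
        """Items the incoming shift must be told about, in a fixed order."""
        unsigned = [f"SIGN-OFF PENDING: {i.description}"
                    for i in self.items if i.completed and not i.signed_off]
        open_items = [f"OPEN: {i.description}"
                      for i in self.items if not i.completed]
        return unsigned + open_items


record = HandoverRecord(
    task_card="TC-001",
    items=[
        WorkItem("remove access panel", completed=True, signed_off=True),
        WorkItem("inspect bracket", completed=True, signed_off=False),
        WorkItem("refit access panel"),
    ],
    reason_stopped="end of shift",
)
print(record.situation().name)
print(record.briefing_points())
```

Writing the same fields down for every task is one way to reduce the variation in detail that the findings describe; the written record would support, not replace, the face-to-face debriefing.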
Conclusions
The research reported here represents the results of task analysis of shift change operations conducted at representative aircraft maintenance facilities. Although the sample was restricted to these representative aircraft maintenance organizations, the findings can be used and applied by other organizations. The development of the error taxonomy, followed by the identification of human factors interventions, has led to the development of a standardized shift change protocol (see Gramopadhye and Kelkar 1999). It is anticipated that the adoption and use of the protocol by the aviation industry will ultimately lead to a safer and more effective and efficient shift change. It is important that the research team and the FAA work closely with the organizations to implement and measure the effectiveness of these changes.
Acknowledgements
This research was funded by a grant through the Office of Aviation Medicine, Federal Aviation Administration, Human Factors in Aviation Maintenance Program (Program Manager: Jean Watson) to the first author.
References
Drury, C.G. (1989) The information environment in aircraft inspection. In Proceedings of the Second International Conference on Human Factors in Aging Aircraft. Biotechnology, Inc. Falls Church, Virginia.
FAA (1991) Human Factors in Aviation Maintenance—Phase One Progress Report, DOT/FAA/AM-91/16, Washington, DC: Office of Aviation Medicine.
FAA (1993) Human Factors in Aviation Maintenance—Phase Three, Volume 1 Progress Report, DOT/FAA/AM-93/15.
Gramopadhye, A.K. and Thaker, J.P. (1998) Task Analysis. Chapter 17 in The Occupational Ergonomics Handbook. (Editors: Karwowski, W. and Marras, W.S.) CRC Press: New York.
Gramopadhye, A.K. and Kelkar, K.P. (1999) Analysis of Shift Change in the Aircraft Maintenance Environment: Findings and Recommendations. Technical Report submitted to Office of Aviation Medicine, Federal Aviation Administration, 1999.
Hobbs, A. and Williamson, A. (1995) Human Factors in Airline Maintenance: A Preliminary Study. Proceedings of the Eighth International Symposium on Aviation Psychology, 461–465.
Rouse, W.B. and Rouse, S.H. (1983) Analysis and Classification of Human Errors. IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-13, No. 4.
CONSISTENCY IN HRA AND IMPACTS ON HUMAN FACTORS ANALYSIS
Richard Kennedy1, Barry Kirwan1, Bob Summersgill2 & Keith Rea3
1Air Traffic Management Development Centre, National Air Traffic Services, Bournemouth Airport, Christchurch, Dorset, BH23 6DF, UK
2Amey Vectra Limited, Europa House, 310 Europa Boulevard, Gemini Business Park, Westbrook, Warrington, WA5 5YQ, UK
3British Nuclear Fuels, Safety Department, Risley, Warrington, Cheshire, WA3 6AS, UK
Human Reliability Assessment (HRA) approaches determine what human errors can occur within engineered systems, and Human Factors specialists specify how those errors may be prevented. The type of Human Factors intervention made to address a potential human error depends on how the assessor models the error scenario. This modelling is subject to great variance, and the paper outlines several sources of this inconsistency. It then describes work that has been carried out to improve the consistency of HRA.
Introduction Human Reliability Assessment (HRA) is the approach that aims to identify what human errors can occur and how likely they are to occur in a given system (e.g. control room tasks, driver operations, maintenance team activities). Human Factors recommendations are then applied to the human error assessment in an attempt to reduce the error likelihood (e.g. improvements to HMI, training recommendations, improvements to procedures etc.). Therefore HRA identifies what is wrong and Human Factors specifies approaches for how the problem should be addressed. There have been a number of recent studies that have validated HRA techniques (e.g. Kirwan et al, 1997). Although assessors derived similar Human Error Probabilities (HEPs) for given scenarios, the way in which they used the HRA techniques varied and thus there was inconsistency in HRA technique
Table 1. Variance in modelling a human error scenario using HEART
application. This is an important finding for the Human Factors analyses, as variance in technique application will lead to differences in ‘modelling’ of the scenario. The derivation of error reduction measures is not based on the HEP that the assessor calculates but on the identification of the ‘psychological error mechanisms’ that underpin the error and the ‘performance shaping factors’ that affect the likelihood of error occurrence. This is exemplified in Table 1, which shows that although two analysts derived similar HEPs for a human error scenario using the Human Error Assessment and Reduction (HEART) technique (Williams, 1992), their modelling of the task differed considerably. The assessors chose different ‘Generic Tasks’ for assessment of the scenario and also interpreted differently which ‘Error Producing Conditions’ (EPC) had most influence on the error. Assessor 1 therefore derives recommendations to reduce human error likelihood that are targeted at training/quality assurance factors, whereas the set of recommendations from Assessor 2 is targeted at design/cultural factors. The development of appropriate Human Factors recommendations to reduce human error potential in the system therefore relies on an accurate modelling of the human error scenario.
Sources of inconsistency in HRA
In an area as complicated as predicting human performance there will be a number of sources of inconsistency and these will be difficult to eradicate. The objective of research to improve consistency should be to deal with the major inconsistencies, i.e. the ones that will have an impact on the HEP, and those that would give rise to false error reduction measures which would not actually reduce risk. Both these focuses are aimed at ensuring that risk is not mis-represented and, in particular, is not under-assessed. The sources of inconsistency in HRA are summarised in Figure 1. The first source of inconsistency is the tool itself.
Thus, there may be flexibility built into the tool, which means that different users may use the technique slightly differently. Additionally, certain aspects of the tool may be more generic or even abstract, so that it can adapt to new tasks or new task environments. This increases the range of application
Figure 1. Sources of inconsistency in HRA
of the technique. However consistency of usage will inevitably be more at risk than with a technique that is focused closely on a relatively homogeneous industrial context. Alternatively the technique could be underspecified in terms of training or guidance, which will be a natural source of inconsistency. The assessor is the second main source of inconsistency. Each assessor has an experience base that is important, especially given the general scarcity of real HEP data. The assessor must therefore utilise judgement frequently. If an assessor believes he or she knows the real HEP, there will be a natural tendency, whether consciously or unconsciously directed, to arrive at that number. The assessor may therefore ‘steer’ the technique towards that number. This is an aspect of ‘calibration’ of the assessor, and also one of ‘self-confidence’. The third main source of inconsistency is the task information available and accessible to the assessor. This aspect was ‘controlled’ in the validation experiment by giving all assessors the same information and ensuring the content was not favouring any particular technique. However when carrying out HRA in the ‘field’, information availability can have a large impact on what Performance Shaping Factors (PSF) are utilised and consequently what errors are identified as being possible. It is the responsibility of the assessor to unearth the requisite information via plant visits, review of operational experience, and interviews etc., but the very open-ended nature of such information acquisition and assimilation activities means that inconsistencies are likely to arise. The fourth main source of inconsistency is the context within which the HRA is carried out. Therefore factors such as the general organisational culture, employee commitment and attitudes towards safety, the view of the purpose of the safety case etc. may influence the results of the HRA. 
The HRA assessments need to be objective, but if only limited modelling and consideration of the human error is carried out because of project pressures then assessments may become inaccurate and unreliable. The assessor does not function in an organisational vacuum and will be subject to a number of project pressures that could affect the HRA.
Derivation of consistency improvement measures for HRA
This paper now describes a study to develop a number of consistency improvement measures that could be applied, in the short term, to HRA techniques (Kennedy et al, 2000). A number of potential improvement measures were identified by an expert group of HRA assessors and these were put into a guidance package that practitioners could consult. Insight into the effectiveness of the guidance was obtained by measuring the consistency of assessor modelling of scenarios via a structured trial with fifteen professional assessors and
Table 2. Consistency improvement measures for HRA techniques
also through a number of ‘debrief’ interviews with the assessors after they had completed the formal trial. The final consistency improvement measures for HRA techniques are divided into the following categories:
• Technique Measures: specific to particular techniques and aim to make usage of the technique components more uniform across different assessors
• Quality Assurance (QA): generic across techniques and aim to improve consistency through application of QA measures and robust acceptance criteria
• Qualifications and Training: specific and generic across techniques and aim to ensure that suitably qualified and experienced individuals carry out the assessments
Table 3. Double counting matrix for HEART Error Producing Conditions (EPC)
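To make the effect of double counting concrete, the sketch below applies the standard HEART calculation, in which the nominal error probability of the Generic Task is multiplied by ((multiplier − 1) × assessed proportion of affect + 1) for each selected EPC. Only one exclusion rule is encoded, namely the paper's own example that EPC1 already covers EPC9 and EPC15; the rest of the matrix is not reproduced. The nominal probability, multipliers and proportions used in the example are illustrative placeholders, not values from the study.

```python
from math import prod
from typing import Dict, Iterable, List, Set, Tuple

# Exclusions taken from the paper's example only: choosing EPC1 (unfamiliarity)
# rules out EPC9 (technique unlearning) and EPC15 (operator inexperience).
OVERLAPS: Dict[int, Set[int]] = {1: {9, 15}}


def double_counted(selected: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (covering EPC, redundant EPC) pairs present in the selection."""
    chosen = set(selected)
    return sorted(
        (epc, other)
        for epc, covered in OVERLAPS.items()
        if epc in chosen
        for other in covered & chosen
    )


def heart_hep(nominal: float, epcs: Dict[int, Tuple[float, float]]) -> float:
    """HEART estimate: nominal x product of ((multiplier - 1) x proportion + 1).

    `epcs` maps an EPC number to (maximum multiplier, assessed proportion 0..1).
    The result is capped at 1.0 because it is a probability.
    """
    factors = [(mult - 1.0) * proportion + 1.0 for mult, proportion in epcs.values()]
    return min(1.0, nominal * prod(factors))


# Illustrative placeholders (assumptions, not data from the trial).
nominal = 0.003
with_overlap = {1: (17.0, 0.4), 15: (3.0, 0.5)}
without_overlap = {1: (17.0, 0.4)}

print(double_counted(with_overlap))            # flags the EPC1/EPC15 pair
print(heart_hep(nominal, with_overlap))        # inflated by the redundant EPC
print(heart_hep(nominal, without_overlap))
```

In this made-up case the redundant EPC doubles the estimate, so a check of this kind matters for the number as well as for which error reduction measures the assessor ends up recommending.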
As an example of a ‘technique specific’ consistency improvement measure, Table 3 shows guidance in the application of HEART Error Producing Conditions (EPC). This matrix was used by assessors to ensure that they did not double count in their selection of EPC. For instance, if the analyst chooses EPC1 (unfamiliarity), then EPC9 (technique unlearning) and EPC15 (operator inexperience) should not be chosen, as the concepts are already included within EPC1.
Conclusion
This paper has described various measures that can be applied to human reliability assessment techniques and the overall HRA process in an attempt to improve consistency of assessor modelling and thus derive appropriate human factors interventions.
Acknowledgements: This work was carried out for the Nuclear Industry Management Committee Contract Reference HO200116 ‘Consistency in HRA’. Appreciation is extended to the project officers for the work (Keith Rea and Trevor Waters) and the 15 HRA assessors that took part in the study. The opinions in this paper are those of the authors, and do not necessarily represent those of NATS or other companies involved in this research.
References
Kennedy, R., Kirwan, B., Summersgill, R., Rea, K. (2000) Making HRA a More Consistent Science. Paper to be Presented at Foresight and Precaution—ESREL 2000 and SRA Europe, May 14–17, Edinburgh.
Kirwan, B., Kennedy, R., Taylor-Adams, S., and Lambert, B. (1997) A validation study of three Human Reliability Quantification Techniques: THERP, HEART, and JHEDI: Part II—Results of Validation Exercise. Applied Ergonomics, 28, 1, 17–25.
Williams, J.C. (1992) Toward an Improved Evaluation Tool for Users of HEART. In Proceedings of the International Conference on Hazard Identification, Risk Analysis, Human Factors and Human Reliability Process Safety, Orlando, February, Chemical Centre for Process Studies (CCPS).
General ergonomics
A PILOT STUDY EXPLORING THE DESIGN OF ROLES BASED ON MANUFACTURING PROCESS KNOWLEDGE
C.E.Siemieniuch1 & M.A.Sinclair2
1HUSAT Research Institute, Loughborough University, UK
2Department of Human Sciences, Loughborough University, UK
Customarily, within the manufacturing domain, ergonomists design roles based on socio-technical principles, whereas process engineers and managers usually use the ‘left-over’ principle. Both of these are essentially ‘one-off’ methods, with little connection to modelling and simulation environments. Clearly, a link to Enterprise Modelling within the IT domain would be very useful (e.g. CIM-OSA, GERAM). We report on a completed 1-year pilot study investigating an approach to link job design to enterprise modelling technology. This is based on the concept of the organisation as a ‘knowledge engine’. The approach was tested on job design in two locations, a manufacturing cell in an SME and a process line in an LE. We report on the latter.
Introduction
Customarily, within the manufacturing domain, ergonomists design roles based on socio-technical principles (Cherns 1976; Davis 1982; Eason 1988). Process engineers and managers usually use the ‘left-over’ principle, where the roles are an accumulation of operations left over from automation, guided by the principle of minimal change from role descriptions that existed before (Chapanis 1970). Within the IT domain, there is an Enterprise Modelling initiative (Vernadat 1996), to permit reasoning about the structure and behaviour of an organisation, to assist in the procurement of equipment and systems, and, eventually, to create enactable models that can be used for operational control of the organisation. A particular advantage of this approach lies in the steady development of Reference Architectures, that is, standardised ontologies for modelling purposes. These can then be used for specification of organisational IT systems which are, by definition, modular and interoperable. However, Enterprise Modelling has a paucity of organisational content (Vernadat, op. cit.), and the socio-technical approach is frequently equally short of technical content (Clegg and Ulich 1989; Clegg 1993).
This paper reports one attempt to close this gap, utilising the concept of the organisation as a ‘knowledge engine’. In the next sections we outline the approach, describe the case studies, and list some conclusions. Enterprise modelling An excellent outline of the field will be found in Vernadat (op. cit), supplemented by the web sites for the Intelligent Manufacturing Technology Roadmap Vol 4 (http://imtr.ornl.gov) and for GERAM (Generic
Enterprise Reference Architecture for Manufacturing, http://www.cit.gu.edu.au/~bernus/clearinghouse.html). The intention of these efforts is to provide both manufacturing organisations and their IT suppliers with ‘Reference Architectures’ or ontologies by which they can model the organisation, simulate its operation, and choose between alternative configurations of resources to maximise the likelihood of achieving their stated vision and goals. From an IT perspective, this would allow a company to select the ‘best in class’ applications or hardware from whichever supplier they choose, in the confident knowledge that, not only will it ‘plug and play’, it will also interface with the other resources of the company (e.g. humans, machines) with minimal interruption. Unfortunately, we are still some way from this ideal. Enterprise modelling is acknowledged to be particularly weak with regard to organisational issues, the very area where ergonomics could make its biggest contribution. However, if ergonomics is to make this contribution, there is a need for concepts to join the usual socio-technical approach to the virtual world of enterprise modelling. One approach, based on knowledge, was developed in the ‘Modelling Human Resources’ (MHR) project (EPSRC GR/M11721), a follow-on project from SIMPLOFI (EPSRC GR/J40348). The SIMPLOFI project explored two particular aspects: how the organisation’s context could influence the design of roles (Brookes & Backhouse, 1998), and how roles could be designed based on the knowledge required to execute processes (Siemieniuch, Sinclair et al. 1999). The MHR project extended this work, firstly by creating UML representations of many of the concepts, suitable for inclusion in enterprise modelling exercises, and secondly by exploring the inclusion of workload concepts and of management structures into the process. It is the latter aspect which is reported in this paper.
Basing job and role design on knowledge requirements Figure 1 provides an overview of the MHR pilot demonstrator. Starting assumptions for this are that the organisation has elucidated its mission, and based on this the organisation has developed a coherent set of policies, performance criteria, etc; itemised in pale boxes on the left of Fig. 1. Furthermore, it has formalised its processes, as indicated by the boxes down the middle of the figure. On the right hand side are the contributions developed within SIMPLOFI and MHR, suitably tailored for the company in question. Generic knowledge classes are selected for the activities in a given process, and an allocation of functions process is undertaken, as outlined in (Siemieniuch, Sinclair et al. 1999), following (Mital, Motorwala et al. 1994; Mital, Motorwala et al. 1994). Levels of expertise for these classes are specified, together with workload estimates, covering both physical and cognitive aspects. Sets of Grouping Rules and Stopping Rules are then utilised to group the human activities into roles, using as criteria the coherence of knowledge, and the avoidance of ‘excessive’ workload (Fairclough 1999). The outcomes of this are definitions of roles, together with skills and knowledge specifications and workload estimates. It should be noted that alternative sets of these can be developed. These roles can then be combined to form jobs for individuals and/or teams.
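The grouping step can be pictured with a small sketch. The paper does not list the actual Grouping Rules or Stopping Rules, so the two rules below are stand-ins chosen for illustration: activities are grouped when they share a knowledge class (coherence of knowledge), and a role stops accepting activities once a workload ceiling would be exceeded (avoidance of ‘excessive’ workload). The activity names, the 0 to 1 workload scale and the ceiling of 1.0 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    name: str
    knowledge_class: str
    workload: float          # hypothetical scale: fraction of one person's capacity


@dataclass
class Role:
    knowledge_class: str
    activities: List[Activity] = field(default_factory=list)

    @property
    def workload(self) -> float:
        return sum(a.workload for a in self.activities)


def group_into_roles(activities: List[Activity], max_workload: float = 1.0) -> List[Role]:
    """Greedy grouping under the two assumed rules described in the lead-in."""
    roles: List[Role] = []
    for act in activities:
        # Grouping rule (assumed): reuse a role with the same knowledge class.
        # Stopping rule (assumed): only if the ceiling would not be exceeded.
        target = next(
            (r for r in roles
             if r.knowledge_class == act.knowledge_class
             and r.workload + act.workload <= max_workload),
            None,
        )
        if target is None:
            target = Role(act.knowledge_class)
            roles.append(target)
        target.activities.append(act)
    return roles


activities = [
    Activity("activity A", "process control", 0.5),
    Activity("activity B", "process control", 0.4),
    Activity("activity C", "process control", 0.3),
    Activity("activity D", "maintenance", 0.6),
]
roles = group_into_roles(activities)
for role in roles:
    print(role.knowledge_class, [a.name for a in role.activities], role.workload)
```

Changing the ceiling or the order in which activities are presented produces a different set of roles, which is one way to read the remark that alternative sets of role definitions can be developed.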
Figure 1: Overview of the MHR Model
Paste Functions
Whatever roles/jobs are formed, they will operate within an organisational context (left hand side of Fig. 1) which will have implications for the nature of role interactions that will be required. For example, team working and extensive peer-to-peer interaction will require a certain type of enterprise culture in order to be successful. To capture these elements the concept of ‘Paste Functions’ has been developed (they are pasted in between roles, to mark their boundaries and classify the interactions between them). For example, roles need to communicate or integrate or co-ordinate their output; one role may be in a position to delegate authority to another within a specific set of time or target constraints; etc.
Exploration of the usefulness of Paste Functions
It became possible to conduct a brief case study in a steel mill. The management of the mill was carrying out a transition to a new, empowered-team structure, and wished for some reassurance that the new structure would be successful. The shopfloor teams were already fixed, as was the managerial superstructure. Accordingly, the case study concentrated on role interactions, to establish that the roles were structurally sound. As a test of the paste functions concepts, these were used to map the role interactions. A total of 9 stakeholders were interviewed, from the two teams at the ‘Hot Mill’ end of the process and their management team. These comprised: 2 Finishers, a Floor Roller, a Shift Manager, a UPM Operator, the Hot Mill Manager, a representative of the Day Support Team, a representative of Maintenance Engineering, and the Plant Manager. Interviews concentrated on the nature of the role, and the role interactions necessary to maintain plant operations. A set of conclusions and recommendations from these were reported to the management. In addition, five Scenarios were generated to explore the use of Paste Functions:
• Scenario 1: On-going operations (no problems)
• Scenario 2: On-going operations (fixing problems e.g. slab not hot enough)
• Scenario 3: On-going operations (fixing problems e.g. QC problem with rollers)
• Scenario 4: Executing first-line maintenance
• Scenario 5: On-going operations (major problem e.g. trapped sheet of hot metal)
Relationships between the roles in the Hot Mill for these scenarios were plotted using a ‘Paste Functions Grid’, to show the relationships for operational control (see Fig. 2). Transitions across grid lines, as shown by the directed arcs, imply the insertion of paste functions; depending on which grid lines are involved, an identified subset from the set Hand-over, Delegation, Targeting, Co-ordination, Control, Integration, and Congruence is inserted. This subset will then affect personal management activities within the role itself: Negotiation, Configuration, Scheduling, Tracking, Recovery, Information seeking, and Diagnosis.
Fig. 2 Paste Functions Grids, showing locations of roles on two axes of discretion. Left-hand picture shows paste functions; right hand picture shows roles positioned on grid.
Conclusions
For the company:
• The shifts of role positions in the grids for each scenario indicated that teams rather than crews are better equipped for the operational context.
• The near total overlap in role positions for both teams for all scenarios indicated that similar policies, rewards and recognition for teams and individuals would be necessary.
• Good communications and co-operation between the teams is critical to process efficiency, and policies, rewards and management procedures should support this.
For Paste Functions:
• The Paste Functions Grid enables paste functions to be invoked according to inherent rules of the grid.
• It appears that the grid rules for allocating paste functions are plausible; the grid captures the role management aspects, but not the information-passing aspects.
• The grid shows different role management structures for different scenarios.
• It is possible to visualise relationships between teams, by appropriate colour-coding.
• Clearer indications are required on how to use the Grid and interpret the output.
References
Brookes, N.J. and C.J.Backhouse (1998). “Understanding concurrent engineering implementation: a case-study approach.” International Journal of Production Research 36(11): 3035–3054.
Chapanis, A. (1970). Human factors in systems engineering. Systems Psychology. K.B.De Greene. New York, McGraw-Hill: 51–78.
Cherns, A.B. (1976). “Principles of socio-technical design.” Human Relations 29: 783–792.
Clegg, C. and E.Ulich (1989). Job design, MRC/ESRC Social & Applied Psychology Unit, University of Sheffield.
Clegg, C.W. (1993). “Social systems that marginalise the psychological and organisational aspects of Information Technology.” Behaviour and Information Technology 12(5): 261–266.
Davis, L.E. (1982). Organisational Design. Handbook of Human Factors. G.Salvendy. New York, Wiley: 433–452.
Eason, K.D. (1988). Information technology and organisational change. London, Taylor & Francis.
Fairclough, S.H. (1999). Modelling mental workload and human resources.
Mital, A., A.Motorwala, et al. (1994). “Allocation of functions to humans and machines in a manufacturing environment: Part 1 - Guidelines for practitioners.” International Journal of Industrial Ergonomics 14(1 and 2): 3–31.
Mital, A., A.Motorwala, et al. (1994). “Allocation of functions to humans and machines in a manufacturing environment: Part 2 - Scientific basis (knowledge basis) for the Guide.” International Journal of Industrial Ergonomics 14(1 and 2): 33–49.
Siemieniuch, C.E., M.A.Sinclair, et al. (1999). “A method for decision support for the allocation of functions and the design of jobs in manufacturing, based on knowledge requirements.” International Journal of Computer-Integrated Manufacturing 12(4): 311–324.
Vernadat, F.B. (1996). Enterprise modelling and integration. London, Chapman & Hall.
LONG DAYS AND SHORT WEEKS—THE BENEFITS AND DISADVANTAGES Karl J.N.C.Rich Human Engineering Ltd, Shore House, 68 Westbury Hill, Westbury-On-Trym, Bristol BS9 3AA, UK
The desire for flexibility from employers and employees has led to a wide range of industries experimenting with Compressed Working Weeks. The main advantage for employees is increased leisure time and for employers, a move away from an overtime culture and reduced costs. Research in this area reveals equivocal results, with some early successes leading to later failure. Fatigue is a major problem for some workers and moonlighting is prevalent, raising concerns regarding exposure to physical hazards and toxins. Work schedules must take account of legislation such as the EU Working Time Directive and Health and Safety Law.
Introduction
The 24 hour society, globalisation and industrial deregulation have led to immense changes in working practices. Shiftworking, part-time activity, flexible working hours and self-employment are perhaps the most obvious. The inherent conflict within all of these practices is the drive for increased productivity and efficiency versus the workers’ need for domestic and professional fulfilment.
Table 1. Examples of CWW Schedules from Pierce et al where Type A=hours constant, longer days and Type B=shorter hours, fewer days
Harmonisation of these criteria is often difficult and trade-offs are inevitable. One such trade-off is the compressed working week (CWW), where a set number of contracted hours are worked but they are delivered in a non-standard week. CWW has been defined as a trade-off between the number of days worked per week and the number of hours worked per day (Ronen, 1984). There are many variants in use across the world (Ronen, 1984; Pierce et al, 1989) and some common examples are described in Table 1. Most of the schedules employed fall into two categories: those with constant hours and longer working days and those with decreased hours and fewer working days. CWWs present a number of apparent benefits for both workers and management but may also lay traps for the unwary.
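As a rough illustration of the trade-off between days worked per week and hours worked per day, the sketch below describes a schedule by its days and weekly hours and derives the length of the working day. The `Schedule` class, the function names and the five-day, 40-hour baseline are assumptions made for this illustration only; they are not taken from Ronen (1984) or Pierce et al (1989), and the contents of Table 1 are not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """A weekly work schedule (hypothetical helper, not from the paper)."""
    days: int            # days worked per week
    weekly_hours: float  # contracted hours per week

    @property
    def hours_per_day(self) -> float:
        return self.weekly_hours / self.days


# Assumed baseline: a conventional five-day, 40-hour week.
STANDARD = Schedule(days=5, weekly_hours=40)


def cww_type(s: Schedule, baseline: Schedule = STANDARD) -> str:
    """Classify a schedule using the two categories named in the text."""
    if s.days >= baseline.days:
        return "not compressed"
    if s.weekly_hours >= baseline.weekly_hours:
        return "Type A"  # hours constant, longer days
    return "Type B"      # shorter hours, fewer days


def relative_day_length(s: Schedule, baseline: Schedule = STANDARD) -> float:
    """Length of the working day as a multiple of the baseline day."""
    return s.hours_per_day / baseline.hours_per_day


if __name__ == "__main__":
    for s in (Schedule(4, 40), Schedule(3, 36)):
        print(s, cww_type(s), s.hours_per_day, relative_day_length(s))
```

Under these assumptions a four-day, 40-hour week gives a ten-hour day, 25% longer than the eight-hour baseline day, while a three-day, 36-hour week gives a twelve-hour day.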
Uptake
These work systems have been utilised principally in North America (Maric, 1977) with 3% of the total workforce of Canada being involved (BoLaC, 1994). Despite early optimism, CWWs have not broken through into larger proportions of the workforce for reasons which will be discussed below.
Flexibility
The desire to change to a less traditional and more flexible working week has come from both employers and their workforces. Employers may need to replace expensive arrangements such as overtime and shift pay with cheaper alternatives such as employing part-timers or annualising working hours. Extending operating hours may also reduce unit costs. Employees’ reasons for wanting change are twofold: demographic (more women workers, students financing their studies, more diverse life styles) and a rejection of night and weekend work by many workers (OECD, 1996). Most (American) Trade Unions were against the idea of CWWs as they had historically opposed long working days. Some however saw it as an opportunity to reduce working time. The Canadian Labour Congress and UK unions have also sought a shorter working week and fewer hours (Ronen, 1984; Maric, 1977).
User profile
In the 1970s the pioneers of CWW were usually small (mostly non-union, non-urban) manufacturing firms and service retail companies. At that time the trend appeared to be towards urban-centred organisations: hospitals, insurance companies and police departments. Since then industries including manufacturing, telecommunications, banking and printing have utilised CWWs (Ivancevich, 1974).
Examples
Two major Canadian companies introduced innovative work systems with varying degrees of success (BoLaC, 1994). The Bank of Montreal introduced a range of work options for all staff, including a 3 day compressed working week (3–36), job sharing, part time, teleworking and special arrangements for workers having difficulty balancing the needs of their families with their jobs.
Interestingly the Bank reported an uptake lower than anticipated. Bell CEP negotiated an agreement to reduce working hours and bring in a four day week rather than make staff redundant. Staff worked a nine hour, four day week. This proved popular but the company were concerned that most days off were being taken on Fridays and Mondays. Hardly surprising! An electrical engineering company adopted 3–36 with a voluntary night shift during the 1973–74 miners’ strike. This schedule proved popular with the workers but was discontinued. The company was taken over and the new owners brought in a 4–40 system with a night shift. This resulted in reports of increased production, recruitment of more staff (paid for by reduced overtime payment), reduced overheads (lighting, heating etc.), easier access for maintenance, reduced start-up/shut-down cost and improved recruitment potential. The employees incurred lower travelling expenses and less commuting time (Lapping, 1983).
Benefits & Disadvantages
Analysis of CWW programmes produces equivocal results. Early studies reported reduced absenteeism, higher levels of production, increased job satisfaction and decreased start-up and shut-down costs. However, in the
longer term there was evidence of a return to higher rates of absenteeism and tardiness, increased work related accidents, reports of fatigue and higher employee turnover (Ivancevich & Lyon, 1977; Tepas, 1985). Despite the mostly contradictory evidence, two major issues stand out unequivocally: leisure and fatigue. CWWs provide workers with larger blocks of leisure time but expose them to longer working days. There are no inherent reasons why CWWs should enhance performance but there are very good reasons why they should cause performance decrements: fatigue and circadian effects.
Reduced absenteeism may not be a result of greater satisfaction with CWWs. The economic consequence of a lost day under, for example, a 4–40 system is 25% greater than under a traditional schedule (ten hours are lost rather than eight). Alternatively, the additional day off allows workers to attend to their personal and domestic needs in their own and not the firm’s time (Pierce et al, 1989). Whilst job satisfaction is generally higher in CWWs, one cannot assume a direct causal link. There may be indirect causes such as unanticipated changes in the job (enrichment through increased learning, greater autonomy or responsibility) or activities outside the workplace.
Generally speaking, CWWs are popular with workers and because of this it may be extremely difficult to go back once a company has implemented a scheme. They should therefore only be implemented after careful diagnosis of the feasibility and the consequences of change (Pierce et al, 1989).
Tepas (1985) cites many potential disadvantages and advantages of CWWs. Some issues are double-edged, demonstrating the ambiguity or conflicting views on these schedules. Commuting, for example, may favour those with their own transport since urban trips during busy periods are reduced. Workers relying on public transport, however, may have problems with off-peak travel. Compressed working may suit some activities and not others.
Vigilance or inspection tasks may be made more difficult, as would physiologically demanding manual work. Changes in working schedules may require a concomitant change in the tasks themselves, i.e. job redesign.
Fatigue & Circadian effects
Sleep deprivation and fatigue may be a problem if workers are exposed to nightwork, although compressed hours may eliminate or reduce the need for night hours. Even without nightwork, however, physical and mental fatigue reduce physiological and psychological performance, resulting in increased error rates, accidents etc. As the working day lengthens, the probability of including circadian troughs rises, with concomitant decrements in performance.
Moonlighting
Some American studies demonstrate that employees working a four day week are twice as likely to hold a second job (Ronen, 1984). Regulatory standards of exposure to toxic materials or physical hazards (such as noise) may be based on time weighted averages or dose levels, assuming that the normal day is eight hours (ISO, 1990). There is a risk of over-exposure, and moonlighting in this context presents a very serious threat to the employee.
To implement or not
To succeed, CWW scheduling must undergo careful analysis in terms of the nature of the production process or service, the pattern of customer requirements and worker involvement. A major hurdle to wider uptake is the number of hours worked (Blyton, 1985). CWW scheduling is most popular amongst those
engaged in light work and who are not fatigued by heavy domestic responsibilities. A substantial reduction in hours worked may make these schedules more popular (Owen, 1979). CWWs will affect each company differently and there can be no one implementation method that will suit all. However, there are a number of methods available and these are discussed elsewhere (Ronen, 1984; Pierce et al, 1989; Wedderburn, 1989; Rich, 1998).
The EU Working Time Directive & UK Health and Safety Legislation
The Working Time Directive sets limits on the number of hours that can be worked per week (48) within a given reference period and entitles workers to breaks during their working day, daily and weekly rest periods and paid annual leave (DTI, 1998). This could interfere with some CWW schedules. Additionally, the onus on employers to deal with moonlighting is explicit and could also cause problems. The UK Health and Safety at Work Act places a duty upon employers to ensure, so far as is reasonably practicable, the health and safety at work of employees and other persons who may be affected by the work activity. The UK Management of Health and Safety at Work Regulations state that every employer should carry out an assessment of risk of injury or ill health arising from the working activity. Any significant change in work schedule which may affect the level of risk must be assessed prior to implementation. A risk based approach will greatly assist managers when considering such factors as exposure to hazards, toxins etc.
Conclusions
CWW schedules are popular with employees but their benefits to the business may be unsustainable in the long term. Some of the findings in the literature are equivocal and there are many advantages and disadvantages, some of which appear contradictory. Worker fatigue and moonlighting are major problems, as is the risk of uncontrolled and excessive exposure to toxins or physical hazards.
Benefits to some businesses include reductions in overtime, greater worker flexibility, reduced start-up and shut-down costs and decreased overheads. CWWs must not be viewed in isolation and the global effects on the business should be considered. Careful analysis should be undertaken prior to the implementation of any new schedule, and changes in work schedules must take account of the Working Time Directive, Health and Safety and other legislation.
REFERENCES
Blyton, P. 1985. Changes in working time: an international review. (Croom Helm, London and Sydney)
Bureau of Labour Information and Communications (Programs) of Human Resources Development Canada. 1994. Report of the Advisory Group on Working Time and the Distribution of Work.
Department of Trade and Industry 1998. Guidelines on The Working Time Directive, DTI Website
ISO 1999. 1990. Acoustics: Determination of Occupational Noise Exposure and Estimation of Noise Induced Hearing Loss. International Organization for Standardization, Geneva.
Ivancevich, J.M. 1974. Effects of the shorter working week on selected satisfaction and performance measures. Journal of Applied Psychology. 59, No 6, 717–721
Ivancevich, J.M., and Lyon, H.L. 1977. The shortened workweek: A field experiment. Journal of Applied Psychology. 62, No 1, 34–37.
Lapping, A. 1983. Working time in Britain and West Germany. Anglo-German Foundation for the Study of Industrial Society. (Belmont Press, Northampton)
Maric, D. 1977. Adapting working hours to modern needs. International Labour Organisation, Geneva
OECD 1996. Working Hours, Working Paper No 82, Vol. 4
Owen, J.D. 1979. Working hours. (D.C. Heath and Company, U.S.A.)
Pierce, J.L., Newstrom, J.W., Dunham, R.B. and Barber, A.F. 1989. Alternative work schedules. (Allyn and Bacon Inc., Boston, London, Sydney and Toronto)
Rich, K.J. 1998. The benefits and disadvantages of the compressed working week. Shiftworking and Rostering Conference, IIR Ltd, The Euston Plaza, London WC1, 29–30 June 1998
Ronen, S. 1984. Alternative work schedules: Selecting, Implementing and Evaluating. Dow Jones-Irwin, Illinois.
Tepas, D. 1985. Flexitime, Compressed Workweeks, Other Work Schedules. Ch 13 In Hours of Work: Temporal Factors in Work-Scheduling (Eds. Folkard, S., and Monk, T.H.). (J.Wiley and Sons, Chichester, New York, Brisbane, Toronto, Singapore)
Wedderburn, A. 1989. Negotiating shorter working hours in the European Community. Bulletin of European Shiftwork Topics, EF/89/29/EN. European Foundation for the Improvement of Living and Working Conditions.
ERGONOMICS NEEDS OF SMALLHOLDER FARMERS IN MOZAMBIQUE
D.H.O’Neill¹ & E.J.Fraqueza²
¹Silsoe Research Institute, UK
²World Vision, Quelimane, Mozambique
This survey is one of the first to take an integrated approach, incorporating all aspects of smallholder family farming enterprises, rather than discrete tasks or farming activities separately. The sample of 197 households was stratified into four wealth categories and differences in needs between the categories are revealed. Interventions to address poverty, based on the findings of the survey, were identified, including, as the prime need, increasing agricultural productivity through the use of better hoes.
Introduction
Mozambique, with almost 70% of its population living in absolute poverty (ie less than US$1 per day), is one of the poorest countries in the world. A poverty alleviation project¹ has been set up in Zambezia Province and addresses poverty primarily by aiming to increase food self-sufficiency of rural families. It is estimated that only about half of the families achieve this and the situation is exacerbated by the families’ desperate scarcity of resources. There is negligible use of fertiliser, agricultural tools (except hoes) or draught animal power, thereby making human labour particularly critical for agricultural production. Shortage of credit and lack of access to markets prevent families from obtaining food items to supplement their own production and whatever nature provides in the environment. For most in Zambezia, survival depends on establishing and harvesting their staple crops and, if the opportunity arises, generating income from agricultural, domestic or other activities to cover the purchase of supplementary food and any other essential items. Zambezia is reasonably well endowed with the biophysical resources for crop production (although the quality of the soil varies considerably) so the key component in the survival strategy is human labour. It is essential to know how people spend their time and energy so that opportunities to raise production or expand income-generating activities can be identified.
A participatory survey, followed by focus group meetings, collected this information.
¹Zambezia Agricultural Development Project, managed by World Vision Mozambique, supported by funding from the UK Department for International Development. The views expressed are not attributable to the organisations involved but are the responsibility of the authors.
Participatory survey
A total of 197 households in the three Districts was surveyed. There are differences between these Districts, particularly regarding the type of farming system, topography and infrastructure. Gurué is upland whilst Namacurra and Nicoadala are coastal lowland, with Namacurra having the best infrastructure. Households were selected to represent the different status of household heads (eg married man, widow etc) at four levels of wealth/poverty (very poor, poor, medium, rich), according to the findings of the wealth ranking exercise previously undertaken within the project. It was not possible to include equal numbers for each ranking as they were not necessarily distributed appropriately in the communities (eg in some communities there were no rich widows).
The survey elicited, through semi-structured interviews, information on tasks, tools and equipment, together with associated problems, for the three main areas of household enterprise: agricultural, extra-agricultural (ie beyond crop production) and domestic activities. For each household, four database tables were compiled, one for each of the three areas given above and one containing any constraints reported concerning manual labour. The three areas of activity included many different tasks (16 agricultural, 8 domestic and 63 extra-agricultural), but these could be reduced to 11, 8 and 20 respectively by combining those that were very similar and by disregarding those (mainly extra-agricultural) that were pursued by fewer than five households (eg tailoring).
It was possible to create an inventory of the household ownership of tools and how they are used. This helped identify shortages and inadequacies, which could be subsequently confirmed at focus group meetings, and which might indicate opportunities for interventions to raise agricultural productivity. It was also possible, using the database, to examine interactions between production constraints, problems experienced and tool ownership.
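The kind of database described above, and the cross-tabulation of constraints against wealth ranking that it allows, might be sketched as follows. This is a minimal illustration, not the project's actual database: the `Household` record, its field names and both functions are hypothetical, the four tables are simplified to four lists per household, and the three sample records are invented purely to exercise the code (they are not survey data).

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Household:
    """One surveyed household (hypothetical structure)."""
    household_id: int
    district: str
    head_status: str   # eg "married man", "widow"
    wealth: str        # "very poor", "poor", "medium" or "rich"
    # The four per-household tables, simplified to lists of task names:
    agricultural: List[str] = field(default_factory=list)
    extra_agricultural: List[str] = field(default_factory=list)
    domestic: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)
    # Tool inventory: item name -> number owned
    tools: Dict[str, int] = field(default_factory=dict)


def constraints_by_wealth(households: List[Household]) -> Dict[str, Counter]:
    """Count, per wealth ranking, how many households cite each constraint."""
    table: Dict[str, Counter] = defaultdict(Counter)
    for h in households:
        # set() so that a household counts once per constraint
        table[h.wealth].update(set(h.constraints))
    return dict(table)


def citation_share(households: List[Household], constraint: str) -> float:
    """Fraction of the given households citing a constraint."""
    if not households:
        return 0.0
    citing = sum(1 for h in households if constraint in h.constraints)
    return citing / len(households)


# Invented records for illustration only; NOT data from the survey.
SAMPLE = [
    Household(1, "Gurué", "widow", "very poor",
              constraints=["cultivation"], tools={"small hoe": 1}),
    Household(2, "Namacurra", "married man", "rich",
              constraints=["weeding", "harvesting"], tools={"large hoe": 2}),
    Household(3, "Nicoadala", "married man", "poor",
              constraints=["weeding"], tools={"large hoe": 1, "small hoe": 1}),
]

if __name__ == "__main__":
    print(constraints_by_wealth(SAMPLE))
    print(citation_share(SAMPLE, "weeding"))
```

Keeping the tool inventory on the same record as the constraints is one simple way to relate tool ownership to reported constraints; a relational database with separate tables keyed on the household would serve equally well.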
Results and discussion
The most frequently cited constraint on agricultural production for all households in the sample was weeding (29%), followed by cultivation (22%); the least frequent was planting (2%). However, if these results are analysed according to status and wealth ranking, a slightly different pattern emerges, as shown in Fig 1. From Fig 1 it can be seen that for the very poor cultivating, rather than weeding, is the most frequently cited constraint. For the rich, weeding is by far the most commonly cited constraint, followed by harvesting, which does not appear to present any constraint to widows or the very poor. It would, therefore, seem that poorer families face their greatest difficulties in preparing their land for cropping, which is the most energy-demanding and tiring task. Better resourced families have greater access to labour and better tools and equipment, so would be less constrained by land preparation and would be likely to crop larger areas. Constraints then arise at weeding and harvest times in managing these larger areas.
Ownership of agricultural tools and equipment in the participating households was limited. Eleven types were identified but, for most of these, ownership was not widespread. For the more common items (eg hoes), size was also a consideration. Only large hoes (n=272), small hoes (n=255) and sacks (n=220) averaged more than one per household. The household use of tools and equipment for agricultural tasks is summarised in Table 1. As can be seen from Table 1, the three items used most were the large hoe, the small hoe and the large cutlass. As is shown in Table 2, the households which cultivated and reported cultivation as a constraint had fewer large hoes and more small hoes than the households which did not. A similar finding on hoe size did not apply to households reporting weeding to be a constraint.
Table 2 also shows a breakdown of how labour is provided by the households reporting constraints or not with these two tasks. It may be significant that cultivation is done by women alone in a greater proportion of the households reporting cultivation to be a constraint.
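The statement that only large hoes, small hoes and sacks averaged more than one per household can be checked with simple arithmetic. The sketch below assumes that the reported counts (n=272, 255 and 220) are totals across all 197 surveyed households; the variable and function names are illustrative only.

```python
# Assumption: each n is the total number of items owned across the sample.
HOUSEHOLDS = 197
ITEM_COUNTS = {"large hoe": 272, "small hoe": 255, "sack": 220}


def mean_per_household(count: int, households: int = HOUSEHOLDS) -> float:
    """Average number of an item owned per surveyed household."""
    return count / households


if __name__ == "__main__":
    for item, count in ITEM_COUNTS.items():
        print(f"{item}: {mean_per_household(count):.2f} per household")
```

On that assumption the averages are roughly 1.4 large hoes, 1.3 small hoes and 1.1 sacks per household.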
Figure 1 Incidence of agricultural constraints for selected status
The 20 most common extra-agricultural activities and their distribution according to the four wealth rankings are given in Fig 2. These activities are undertaken primarily for income generation and it can be readily seen from Fig 2 that families of different wealth ranking take advantage of different opportunities. The rich, for example, are carpenters, administer traditional medicine and sell rice (which they have grown). The very poor sell wood, drinks and charcoal, all of which they can do with a minimum investment in equipment and by using raw materials freely available in the environment. The households in between tend to generate income by growing and selling cash crops, such as tomatoes, and commodities that they can harvest from the environment such as coconuts and the products of hunting and fishing.
Conclusions
The survey revealed that the constraints on agricultural production and the opportunities for income generation depend on the wealth ranking of the household. The poorest cite cultivation as their main constraint, and their efforts to generate income are restricted by their own limited resources. This survey has enabled interventions to be better targeted to the needs of different households. The importance of the hoe for agricultural production was confirmed at focus group meetings and led to an intervention aimed at increasing the availability of locally fabricated, large hoes of the design preferred by the farmers (with sockets rather than tangs).
Table 2 Ownership of hoes and sources of labour for households reporting the main agricultural constraints.
Table 1. The numbers of households using various items of tools and equipment for agricultural tasks.
(* indicates making and selling)
Figure 2 Distribution of extra-agricultural activities by wealth ranking
ARE PROFILING BEDS BETTER? EVIDENCE FROM USERS AND RECORDS
John Mitchell¹, Jude Bennington², Norman Jones³ & John McClenahan⁴
¹Director of the Wheelchair Lifemaps User Trials, RICAbility and The Essex Rivers Bed Project, The King’s Fund
²Researcher on the Wheelchair Lifemaps User Trials, RICAbility and The Essex Rivers Bed Project, The King’s Fund
³Director of Operations, Essex Rivers Healthcare Trust
⁴Fellow, Leadership Programme, The King’s Fund
Difficulties in finding out how healthcare products affect their users and resources may well delay the adoption of better, safer working methods. The King’s Fund and Essex Rivers Healthcare Trust therefore investigated the changes that accompanied the introduction of higher performance beds on the West Bergholt Acute Trauma Ward in Colchester General Hospital. Though the benefits of the new beds were obvious to their users, hospital records and budgets were initially unable to detect any corresponding changes. To explore the discrepancy, individual case-notes were analysed to reveal that, in fact, the incidence of pressure sores had fallen by more than two thirds with consequent benefits to consumers and resources.
Powered and unpowered healthcare beds
Ergonomists are frequently concerned with the usability of products and whether they will improve or impede the functions, work and lives of the people who use them. Improved performance in healthcare products could enable NHS consumers and staff to live, recover and work more comfortably and effectively. This, in turn, could improve the quality of healthcare and make better use of resources. Healthcare beds exemplify these concerns and opportunities. They are essential components in the care, diagnosis and treatment of sick and dependent people in hospitals, in residential and nursing homes and in
the community. If their performance is unsatisfactory, it could hamper both recovery and staff effectiveness. Hospitals have used unpowered, ‘King’s Fund’ beds almost exclusively since their introduction in the 1960s (King’s Fund, 1967). They are now being challenged by the arrival of powered, ‘profiling’ beds. Staff have found that adjusting unpowered beds can be difficult, demanding and often dangerous while consumers generally find it quite impossible (Mitchell et al, 1998a). Powered beds are demonstrably easier to adjust for staff and for most consumers, even those who are in pain or very weak. The USA and other developed countries have been quick to use powered, profiling beds in hospitals and in the community. The NHS, on the other hand, has been relatively slow. According to some manufacturers, the sales of ‘King’s Fund’ beds continue to outstrip those of profiling beds.
The reluctance to try out non-standard beds could reflect a number of factors including the difficulty of revealing and responding to users’ needs, the surprising lack of evidence of the benefits of improved performance and the price differential between powered and unpowered beds (Mitchell et al, 1998a). The King’s Fund addressed the first problem by publishing guides to choosing beds for hospitals (Jones et al, 1998), nursing and residential homes (McNair et al, 1998) and the community (Mitchell et al, 1998b and 1998c). The second problem is addressed by the study reported here (Mitchell et al, in press).
Testing the Benefits of Profiling Beds
To begin gathering a fuller account of the human and economic effects of these beds, the King’s Fund and Essex Rivers Healthcare Trust studied the changes which occurred when profiling beds were used to replace ‘King’s Fund’ beds on the West Bergholt Acute Trauma Ward in Colchester General Hospital. The Trust hoped that the beds might benefit its operations in the areas shown in Table 1 below.
Table 1 Possible benefits of ‘profiling’ beds
The aims of the six-month study were to:
• reveal the effects of the new beds on their users
• find out if such effects are easy to detect in hospital records
• find out if any savings exceed the cost of acquiring the beds
Methods
Interviews and focus group discussions were used to collect qualitative data from both consumers and staff on the functions they carried out and how these might have changed since the introduction of the new beds.
Table 2 Participants
Consumers were asked about their activities including ‘life support’ (eg eating, drinking and keeping warm), resting, getting comfortable, interacting with other people, recreation and using other equipment (chairs, commodes, bedside lockers, over-bed tables, lights, TV, radio and telephones). Staff were asked about the activities they undertook in helping consumers, providing treatments, tests, moving, cleaning and maintaining beds. All users were asked if the new beds had affected these activities in any way and, in particular, if they now functioned more or less safely, effectively, comfortably, independently and easily.
At the same time, quantitative data was collected from hospital records and budgets for the period of the study and the previous eighteen months. This was used to establish a baseline for the ward’s performance and to detect any changes that occurred.
Results
With the exception of the phlebotomist who had not been informed about the new beds, consumers and staff universally, but not uncritically, welcomed the new beds and reported that they:
• were able to function more easily
• were more comfortable
• needed less help from others
For example, consumers were able to adjust their beds without help whenever they wished and kept going until they had got comfortable. Staff found the beds much easier to use and felt noticeably less tired at the end of shifts. Nurses could work alone during meal rounds instead of needing one to adjust the beds and make consumers comfortable and another to hand out the trays. These results might well be expected to affect the ward’s recovery, dependence and productivity rates. However, no unusual changes whatever could be detected in either the hospital records or budgets.
Exploring the Discrepancy
It seemed unlikely that users had seriously misreported the effects of the new beds on themselves and their activities. Participants had no difficulty in authenticating and demonstrating the changes they had reported and both consumers and staff independently confirmed one another’s accounts. The team therefore chose a single area, the incidence of pressure sores, to test the quality of hospital records. A hand search of individual case notes was carried out and compared with the data from hospital records. This showed that:
• acquired pressure sore rates had fallen by more than two thirds
• 290 out of 5500 bed-days had been released
• the ward had notionally saved £20,700
• the hospital had notionally saved a further £15,800
• savings more than covered costs of acquiring the beds
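The figures in the list above can be combined with a little arithmetic, sketched below. The constant and function names are illustrative assumptions, and the acquisition cost of the beds is not reported here, so it appears only as a hypothetical parameter.

```python
# Figures as reported in the list above.
BED_DAYS_TOTAL = 5500
BED_DAYS_RELEASED = 290
WARD_SAVING_GBP = 20_700
HOSPITAL_SAVING_GBP = 15_800


def released_fraction(released: int = BED_DAYS_RELEASED,
                      total: int = BED_DAYS_TOTAL) -> float:
    """Share of bed-days released over the study period."""
    return released / total


def total_notional_saving(ward: int = WARD_SAVING_GBP,
                          hospital: int = HOSPITAL_SAVING_GBP) -> int:
    """Ward saving plus the further hospital saving, in pounds."""
    return ward + hospital


def savings_cover_cost(acquisition_cost_gbp: float) -> bool:
    """True if the notional savings meet a given (hypothetical) bed cost."""
    return total_notional_saving() >= acquisition_cost_gbp


if __name__ == "__main__":
    print(f"Bed-days released: {released_fraction():.1%}")
    print(f"Total notional saving: £{total_notional_saving():,}")
```

On these figures about 5.3% of bed-days were released and the combined notional saving was £36,500.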
The team followed up this analysis by examining the methods used in the remaining areas of hospital records and suggested that three main factors could account for the observed discrepancies, namely:
• the study was too short to affect long-term trends eg accidents
• high ward performance can leave little scope for improvement
• reporting systems for dependence, productivity, bed-occupancy and incidence of pressure sores may be insensitive to the reported changes
Implications of the Study
Routine hospital records are an essential means of evaluating decision-making. It is difficult for managers to develop more effective strategies for improving healthcare and the use of resources if they do not know precisely how well their previous decisions have worked. It is a matter of some concern that at least one of the hospital’s recording systems was unable to detect changes which were obvious and striking to those who were directly involved. If the benefits of better products cannot be detected it is very unlikely that they will be widely adopted, particularly if they are more expensive to purchase. It is therefore recommended that:
• existing recording methods should be carefully evaluated for their ability to detect such changes and better methods developed wherever necessary
• studies of the human and economic effects of healthcare products should be extended to improve the quality of future choice and innovation
References
King’s Fund (1967) Design of Hospital Bedsteads. London: King Edward’s Hospital Fund for London
Mitchell, J., Jones, J., McNair, B. and McClenahan, J. (1998a) Better Beds for Health Care. London: King’s Fund
Jones, J., McNair, B. and Mitchell, J. (1998) Choosing Beds for Hospitals. London: King’s Fund
McNair, B., Jones, J. and Mitchell, J. (1998) Choosing Beds for Nursing and Residential Homes. London: King’s Fund
Mitchell, J., McNair, B. and Jones, J. (1998b) Choosing Health Care Beds for Use at Home. London: King’s Fund
Mitchell, J., McNair, B. and Jones, J. (1998c) Who Needs a Better Bed? London: King’s Fund
HUMAN FACTORS ASSOCIATED WITH ESCAPE FROM SIDE-FLOATING HELICOPTERS

D.W.Jamieson, S.R.K.Coleshaw, I.J.Armstrong, C.Sellar & D.Howson1

Centre for Health and Safety Sciences, RGIT Limited, 338 King Street, Aberdeen AB24 5BQ, UK
1 Civil Aviation Authority Safety Regulation Group, Aviation House, Gatwick Airport South, West Sussex RH6 0YR, UK
Helicopters which are forced to ditch in water are inherently unstable due to their high centre of gravity. The Civil Aviation Authority (CAA) have, inter alia, identified the stability of ditched helicopters as an area for improvement. A flotation system was tested which allowed a helicopter model to come to rest on its side at an angle of around 150° after capsize. Before this system could be considered for operational use, it was important to assess the human factors associated with escape from this position and confirm it to be easier and safer than from a fully inverted helicopter. Trials were undertaken which required naive subjects to make several escapes from a helicopter simulator. Results showed that most subjects found it easier to escape when the simulator was floating on its side.

Introduction

In one year (1991) offshore helicopter operators in the UK sector of the North Sea flew over two million passengers on over 300,000 flights. Over the preceding 10 years there were 12 accidents involving UK registered helicopters engaged in service for the offshore industry, with a total of 73 fatalities. These statistics indicate that a very high level of flight safety exists, although the only satisfactory record is one in which no fatal accidents occur at all (Bycroft, 1992). In 1984 the Civil Aviation Authority (CAA) set up the Helicopter Airworthiness Review Panel (HARP) which, inter alia, identified the stability of ditched helicopters as an area for improvement. In 1995, the CAA’s Report on the Review of Helicopter Offshore Safety and Survival (RHOSS) re-emphasised the importance of the helicopter’s emergency flotation system to survivability following a ditching or water impact, and added impetus to the ongoing research. Model tests were commissioned to investigate systems intended to mitigate the consequences of a capsize by preventing the total inversion of a helicopter.
The scheme, developed and successfully model tested in a wave tank (Jackson & Rowe, 1997), allowed the helicopter to come to rest on its side at an angle of 150°, retaining an air pocket inside the cabin and maintaining at least some of the exits above the water surface. Before such a flotation system is put into operation, it is essential to ensure that it will improve the survival chances of the crew and passengers. In theory, there is great benefit in having one side of the helicopter, and thus one set of exits above water level. Accident reports and research have indicated that locating and opening an exit from a fully inverted position underwater are two of the most critical factors in escape from a capsized helicopter. Poor visibility, disorientation and in-rushing water may seriously hamper an individual’s attempts to reach an exit (Ryack, 1986). Furthermore, inadequate ergonomic design of the
jettison mechanism renders them extremely difficult to operate from an inverted position underwater, due to factors such as poor depth perception and magnification effects (Brooks, 1994). The air gap in a side-floating helicopter is another feature which is likely to increase the chances of escape after capsize. This should make it easier to locate and operate an exit and also means that occupants can surface inside the helicopter after shorter breath-hold times underwater. This is crucial if the helicopter ditches in water colder than 15°C, since the occupants will experience cold shock which has been shown to dramatically reduce breath hold time. Tipton et al (1988) found that in water temperatures of 10°C, breath hold time may be as low as 10 seconds in some subjects.

A research programme was developed to investigate the human factors issues associated with escape from a side-floating helicopter. Of particular concern was to highlight unforeseen difficulties associated with escape from the side-floating position. A helicopter simulator was used for the trials. The main objectives were:
• to develop appropriate techniques and procedures for escape from a side-floating helicopter;
• to determine the overall benefits and/or disadvantages of the scheme by direct comparison with escape from a fully inverted helicopter.

Method

Thirty subjects were recruited from three age ranges (18–30, 31–40, 41–50 years) to represent the profile of the offshore population as far as possible. Subjects were naive in that they had not been through helicopter underwater escape training before. Each subject completed two trials. In one, three escapes were carried out from a fully inverted helicopter simulator following a 180° capsize. In the other, the subjects carried out three escapes from a side-floating helicopter simulator following capsizes of either 150° or 210° (the latter being a reverse roll to the same end position).
A cross-over protocol was used, with half of the subjects performing the fully inverted 180° trial first and half completing the side-floating trial first. This was done to remove any order and training effect which might affect subjects’ preferences and perceptions of difficulty.

Feasibility studies had shown that escape from the side-floating simulator could cause problems, because occupants who release their harnesses from a position mostly above the water may fall onto others who are surfacing from seats mostly underwater. Whilst minor injuries caused by such contact would not be life-threatening in a real incident, there was a need to reduce the risk during the trials. It was therefore decided that the capsizes would be conducted with only two subjects plus one member of the safety team in the helicopter simulator at any one time.

Each trial consisted of five exercises, building up the skills and confidence of the subjects at each step. In both trials, the first exercise was a controlled landing on water, with the subjects leaving the cabin by the door and stepping into a heliraft. The second exercise involved a partial submersion of the helicopter simulator, with subjects making an underwater escape from a window next to their seat. The three capsize exercises were: 1) escape from the exit next to the seat; 2) as 1 but following a reverse direction capsize; 3) cross-cabin escape through a distant exit. Subjects were given a full briefing in helicopter underwater escape training (HUET) before each trial by an approved Training Officer. Physiological stress was measured throughout the trials by collecting saliva samples in order that cortisol levels could be measured and by recording heart rate.
Subjects also filled in the State/Trait Anxiety Inventory (Spielberger et al, 1983) to measure psychological stress and completed a series of questionnaires in which they were asked about the factors which made escape difficult and about their confidence and coping in the trials.
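The cross-over allocation described in the Method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper says that half of the subjects did each trial first but does not say how the halves were chosen, so the random shuffle, the seed, the function name `assign_crossover` and the trial labels are all hypothetical.

```python
import random


def assign_crossover(subject_ids, seed=0):
    """Split subjects evenly between the two possible trial orders.

    Random allocation is an assumption made for this sketch; the paper
    only states that half of the subjects performed each trial first.
    """
    ids = list(subject_ids)
    rng = random.Random(seed)   # fixed seed so the allocation is repeatable
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for sid in ids[:half]:
        orders[sid] = ("fully_inverted", "side_floating")
    for sid in ids[half:]:
        orders[sid] = ("side_floating", "fully_inverted")
    return orders


if __name__ == "__main__":
    allocation = assign_crossover(range(1, 31))   # thirty subjects
    first_inverted = sum(1 for o in allocation.values() if o[0] == "fully_inverted")
    print(first_inverted, len(allocation) - first_inverted)
```

Balancing the two orders in this way means that any learning carried over from the first trial affects both conditions equally.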
Results

The 27 male and 3 female subjects who took part in the study were aged from 18 to 49. Their average height was 1.75m and their average weight was 82.4kg. The majority rated themselves as being moderately fit, good swimmers and confident about helicopter transport, and had no previous knowledge of helicopter underwater escape training.

Perceived Difficulty, Confidence And Coping

The vast majority of subjects (90%) indicated that they preferred escape from the side-floating position because it was easier than in the fully inverted trial. In addition, subjects were significantly more satisfied with their coping in the side-floating trial (p=0.019). Subjects’ confidence was moderate to high in the study and this was not greatly affected by either of the trials. However, after the side-floating capsizes, more subjects had greater confidence about coping in a real accident and fewer had less confidence than was the case after the fully inverted trial. Table 1 shows how difficult subjects found each capsize. It is clear that the fully inverted cross cabin escape was found much more difficult than the side-floating equivalent. Only 29% of subjects found this escape moderately or very difficult from the side-floating simulator compared to 89% in the fully inverted exercise. This difference was highly significant (p=0.0001).

Table 1. Subjects’ difficulty rating of each capsize
Closer analysis showed that swimming, disorientation, locating an exit and using an exit were all rated as significantly more difficult in cross cabin escape from the fully inverted helicopter simulator than in the side-floating equivalent. Only 15 of the 30 subjects managed to complete the fully inverted cross cabin escape as instructed. The other 15 either left by the exit nearest to them or made use of the air-pocket which was always present in the simulator for safety reasons. Other significant differences included breath holding being rated as significantly more difficult in the sit/escape same side fully inverted capsize (p=0.04), and releasing the harness and locating the exit being rated as more difficult in escape from the side-floating simulator after a reverse capsize (p=0.05 and 0.03 respectively).

Breath Holding Times

The average breath hold times in the side-floating trials were all shorter than in the corresponding fully inverted trial. Statistical analysis showed that subjects’ breath holding time was significantly less in the side-floating cross cabin escape and the side-floating sit/escape same side than in the equivalent fully inverted exercises (p=0.0001 in both cases). The most striking difference was between the breath hold times for the cross-cabin exercises. Only half as many subjects completed the fully inverted cross cabin escape and on average they had to hold their breath for twice as long. The air-pocket inside the side-floating helicopter simulator seems to have allowed shorter breath hold times than was possible after the fully inverted capsizes.

Psychological and physiological stress

Heart rate was sampled before, during and after the escape exercises. Saliva samples were taken at the start of the trial and after the exercises had finished. When the average heart rates were compared between the trials no significant differences were found. There were no significant differences between the salivary cortisol levels in the trials. State anxiety was recorded before the trials and before and after each set of exercises. It was found that in both trials, pre-exercise anxiety was significantly higher than it was after the exercises (p=0.0005 in both cases) and these scores were significantly higher than the control scores (p=0.0005 in both cases). Taken as a whole, these results show that neither physiological nor psychological stress differed depending on the escape scenario.

Discussion

It has been shown that helicopter underwater escape training can be adapted to teach people how to escape from a side-floating helicopter. It can be concluded that the vast majority of subjects in this study found it easier to escape from a side-floating helicopter simulator than from a fully inverted one without finding it any more stressful. In the side-floating trial, more subjects were satisfied with how they coped and more were instilled with greater confidence in their ability to deal with a real helicopter ditching.
These findings suggest there could be significant benefit in training people to escape from helicopters which were designed to float on their side after capsize. The results of this study indicate why subjects found the side-floating trial easier than the fully inverted one. The provision of an air-pocket and exits above the water were important factors in making escape easier. The air-pocket in particular helped to make disorientation less of a problem and meant that subjects did not need to hold their breath for so long. This latter point is important considering that in a real helicopter accident, occupants may have to overcome the effects of cold shock which has been shown to reduce breath hold time to as little as 10 seconds. In two of the three escape scenarios subjects found locating and using the exit less difficult during escape from the side-floating simulator as a result of exits being above the water on one side. This was especially true when the cross cabin escapes were compared. In a real capsize, this would mean that occupants will be less hampered by poor visibility and their inherent buoyancy in their attempts to reach and jettison an exit. In the case of a 210° roll causing extra disorientation, as suggested by the reverse side-floating capsize, the presence of an air pocket will provide occupants with extra time to make their escape. There were a few problems with escape from the side-floating simulator which were identified from the feasibility trials and/or the naive subject trials. The most serious problem identified was the potential for an occupant on the upper side to release their harness, which may be difficult due to the uneven load on the buckle, and fall with force onto someone rising to the air pocket from the lower side. It is thought that such a collision carries a high risk of injury. However, in a real capsize, it is likely that the risk of collision will
be similar if the helicopter ends up fully inverted since some of the occupants may have released their harness before capsize. None of the problems with escape from a side-floating helicopter which were identified in this study are thought to be life-threatening. They do not outweigh the advantages that such a scenario has over escape from a fully inverted aircraft. On the contrary, the evidence suggests that the occupant of a side-floating helicopter has a much better chance of escape than someone inside a fully inverted aircraft.

References

Brooks CJ, Bohemier AP, Snelling GR, 1994, The ergonomics of jettisoning escape hatches in a ditched helicopter. Aviation, Space and Environmental Medicine, 65:387–395
Bycroft DH, 1992, Evacuation procedures. In the proceedings of Helicopter Survival, London, UK
Civil Aviation Authority, 1984, Review of helicopter airworthiness. Report of the Helicopter Airworthiness Review Panel (HARP) of the Airworthiness Requirements Board, Civil Aviation Authority, London, Report no. CAP 491
Civil Aviation Authority, 1995, Report of the Review of Helicopter Offshore Safety and Survival (RHOSS), Civil Aviation Authority, London, Report no. CAP 641
Jackson GE, Rowe SJ, 1997, Devices to prevent helicopter total inversion following a ditching. CAA Paper 97010
Ryack BL, Luria SM, Smith PF, 1986, Surviving Helicopter Crashes at Sea: A review of studies of underwater egress from helicopters. Aviation, Space and Environmental Medicine, 57:603–609
Spielberger CD, Gorsuch RL, Lushene R, Vagg PR and Jacobs GA, 1983, Manual for the State-Trait Anxiety Inventory STAI (Form Y) (“Self-Evaluation Questionnaire”). Consulting Psychologists Press Inc, Palo Alto
Tipton MJ, Vincent MJ, 1988, Submerged helicopter escape and survival. Robens Institute Report, University of Surrey, Surrey, UK
ERGONOMIC EVALUATION OF WORK AND ENVIRONMENTAL STRESSES ON TECHNICIANS WORKING IN A MULTIMEDIA CHIP MANUFACTURING INDUSTRY IN MALAYSIA

Rabindra Nath Sen & Yeow-Hwa Quek

Ergonomics Centre, Faculty of Management, Multimedia University (MMU), Jalan Multimedia, 63100 Cyberjaya, Selangor, Malaysia
Stresses on personnel working as executives, in the medical field, as vehicle drivers, etc., have been intensively investigated. However, there has been practically no such investigation of the environmental and work stresses on personnel working in the multimedia electronic chip manufacturing (MMCM) industry, especially in Malaysia. The present paper aims to identify the causes and effects of the stresses faced by these technicians in Malaysia. The technicians faced critical mental loads, especially when they had to attend to repetitive and often simultaneous machine errors and stoppages. One way of resolving this was found to be the use of different alarm tones with different rates of interruption and different coloured alarm lights. These differences depend on the severity of the machine errors in each of the 11 production assembly lines, and are intended to draw the immediate attention of the specific technician so as to improve production output and quality.

Introduction

Stress has become a widely used and yet poorly understood term. It is on the rise and is gaining attention as a costly workplace hazard that causes occupational illness and costs industry millions of dollars each year (Farren, 1999). In Stress at Work, a fifth of companies attributed up to 50% of all days off sick to stress-related illness. An estimated USD 200 billion or more, and up to 20 billion pounds sterling, are lost each year as a result of stress (Warren and Toll, 1993). The word “stress” has been defined differently by different researchers. Selye (1930) defined stress as the reaction of the organism to a threatening or oppressing situation (Selye, 1978; Kroemer and Grandjean, 1997). Others have defined it as a challenge to a person’s capacity to adapt to inner and outer demands that are physiologically arousing and emotionally and psychologically disturbing, and that call for cognitive or behavioural responses (Westen, 1996).
There is a positive stress, termed “eustress”, that can be beneficial and may be required to raise productivity, quality and performance. But people frequently face negative stress, or distress, which has negative effects on individuals as well as on organisations. Job-related stressors arise in a variety of manifestations and effects, depending on the individual concerned; on the workload, whether it is quantitative or qualitative overload or underload; on the situation encountered in working conditions, work patterns, work roles, productivity improvement, rejection reduction, etc.; and on the combination of stressors therein (Sen and Yeow, 1999). The present investigation focused on establishing the ergonomic factors that are the root causes of the environmental and work stresses faced by the technicians working in the MMCM industry.
Ergonomic improvements and modifications were suggested to reduce the environmental and work stresses in general, and to minimize rejections in order to increase the productivity and quality of the memory chip products in particular.

Method

In the present study, the existing workload as subjectively perceived by the technicians (quantitative or qualitative work overload or underload, their work roles or tasks, working conditions or environment, etc.) is being studied by direct observation as an ongoing project. The possible stresses are identified, analysed and categorised into three main groups: Environmental, Physical and Mental stresses. These possible stresses are being evaluated by assigning a score to each of the answers given to each of the questions in a Questionnaire prepared and administered for the purpose. Subjective rating of workload using the NASA Task Load Index (TLX), other subjective and objective assessments such as the Strain Index, RULA, the Subjective Workload Assessment Technique (SWAT), physiological and psycho-physiological responses, etc., and follow-up studies are ongoing. A total of 28 technicians successfully completed the Questionnaire study. Some of the information is classified by the factory and hence could not be provided. An Assmann Hygrometer was used to measure the Dry-bulb Temperature (DBT) and Wet-bulb Temperature (WBT) at the workplace of the Technicians. A Vernon’s Copper Globe Thermometer was used to determine the Mean Radiant Temperature and the Wet-bulb Globe Temperature Index.

Results and Discussion

In each of the 11 production lines of the factory, there is one Die Bond Machine with four to five Wire Bond Machines. One Die Bond Machine can produce enough chips to serve four to five Wire Bond Machines. Die Bond is the process of attaching each wafer chip to the pad of a frame, while Wire Bond is the process that follows Die Bonding. In this process, gold wires are attached at the respective locations on a chip to complete the electrical circuitry. In each of the three shifts, 11 Technicians are assigned to run the 11 production assembly lines, with 4 Operational Specialists (OS) and 3 other Technicians as support whenever a particular line needs assistance or cover for a technician who is taking the regular 15-minute tea break or 30-minute lunch break, etc.

Environmental Stresses

The results of typical environmental data are presented in Table 1. Sixteen Technicians felt chills, especially when working the night shift. Thirteen thought that working in the air-conditioned environment led to a dry mouth. Nine considered the environment noisy, especially when the alarms or buzzers sounded simultaneously because of machine errors, stoppages, etc.
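For readers unfamiliar with the Wet-bulb Globe Temperature Index mentioned in the Method, the sketch below shows the widely used indoor form of the index (no solar load), which weights natural wet-bulb temperature at 0.7 and globe temperature at 0.3. The paper does not state which formula was applied, so this is an assumption for illustration only. Note also that an aspirated instrument reports a psychrometric wet-bulb temperature, which is not identical to the natural wet-bulb temperature the index is defined on. The function name and the example temperatures are hypothetical and are not the factory's measurements.

```python
def wbgt_indoor(natural_wet_bulb_c: float, globe_c: float) -> float:
    """Indoor WBGT (no solar load): 0.7 * natural wet-bulb + 0.3 * globe.

    Both inputs and the result are in degrees Celsius. Using this form is
    an assumption; the paper does not give the formula it applied.
    """
    return 0.7 * natural_wet_bulb_c + 0.3 * globe_c


if __name__ == "__main__":
    # Hypothetical readings for an air-conditioned workplace.
    print(round(wbgt_indoor(18.0, 24.0), 1))   # 19.8
```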
Table 1. Environmental Stresses and Strains Faced by the Technicians
Physical/Physiological Stress

Table 2 shows the different types of stress as perceived by the Technicians. Visual Inspections or 100% Visual Inspections caused eye problems for 25 Technicians (89.3%). Visual Inspections have to be carried out on one frame every hour in the Die Bond Process. In the Wire Bond Process, one frame per magazine has to be inspected every 20 minutes, and this has to be done on four to five Wire Bonders in each production line. The 100% Visual Inspections have to be carried out only when the Technicians find reject tags put on them. Seven Technicians felt low back pain and stiffness in the neck region because they frequently have to bend to take, load and unload the magazines, use the Visual Display Terminals (VDTs), perform the ordinary Visual Inspections or 100% Visual Inspections, attend to machine errors and rectify stoppages. Fifteen Technicians also felt pain in the legs due to long hours of continuous standing (as the management prefers not to give them chairs) and to the many movements within the assembly line needed to attend to machine errors and stoppages; stand by, change and settle the lot; stand by the frame and magazine; take frequent recordings; etc.

Table 2. Physical or Physiological Stresses and Strains Faced by the Technicians
Mental Stress and Strains

The mental stress and strains of the Technicians are presented in Table 3. In general, 28.6% (8 Technicians) of the total 28 Technicians frequently experience work and mental stress in the form of arguments with their superiors, especially regarding the different devices, packages or products to be processed in the assembly lines, control of the lots to be processed or planning, and the line outputs. Thirteen Technicians frequently have to shorten their regular breaks to address machine problems in order to keep the machines running continuously and prevent loss of production. Eighteen Technicians (64.3%) mentioned that it would help them to do their job much better if more intensive training on the operational process and machine troubleshooting were arranged for them and if more monetary incentives were given to them. Reject or quality problems left 15 Technicians (53.6%) unable to sleep well because of anxiety about explaining the root cause of the rejects, etc., to the top management. Eight Technicians felt a very high mental load and wished to have more support or assistance, especially in handling the mechanical and electronic problems of the machines and repetitive machine errors or stoppages. Four Technicians felt bored by the repetitive and monotonous tasks and preferred to leave the job. Five Technicians were of the opinion that the top management did not show any appreciation of their work when they solved machine errors, completed the work or tasks in time and repaired the major rejects.

Table 3. Mental Stresses and Strains Faced by the Technicians

Subjective Assessment

A total of 12 Technicians were involved in providing further information. Whenever an alarm sounded, 11 out of 12 Technicians mentioned that they could not easily identify its source. They would raise their heads to check whether the error was in the machines under their charge or in others. Eight said that there was no specific meaning attached to continuous or intermittent sounds. Seven said that every machine was supposed to have an audio alarm and a light to indicate machine errors. The two most frequent machine errors in the Die Bonders are Chips out of Tolerance (12 persons) and Too many material thrown away or frame cannot be picked up by the loader (6 persons) (Table 4). In the Wire Bonders, Ball Non-stick (12 persons) and Wire Break (6 persons) are the two most frequently occurring machine errors (Table 5).
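The difficulty of telling alarms apart relates to the suggestion, stated in the abstract, of using different tones, interruption rates and coloured lights according to error severity. A minimal Python sketch of such a coding scheme follows. It is purely illustrative: the severity levels, tone frequencies, pulse rates and colours are hypothetical values chosen for the example, and none of them come from the paper or from the factory.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    MINOR = 1
    MAJOR = 2
    STOPPAGE = 3


@dataclass(frozen=True)
class AlarmSignal:
    tone_hz: int               # pitch of the audio alarm
    pulses_per_second: float   # interruption rate of the tone
    light_colour: str          # colour of the alarm light


# Hypothetical mapping: each severity gets a distinct tone, rate and colour.
SIGNALS = {
    Severity.MINOR: AlarmSignal(500, 1.0, "yellow"),
    Severity.MAJOR: AlarmSignal(1000, 2.0, "orange"),
    Severity.STOPPAGE: AlarmSignal(2000, 4.0, "red"),
}


def signal_for(line_number: int, severity: Severity) -> tuple[int, AlarmSignal]:
    """Return the line to alert and the signal to raise for a machine error."""
    if not 1 <= line_number <= 11:
        raise ValueError("the factory described has production lines 1 to 11")
    return line_number, SIGNALS[severity]


if __name__ == "__main__":
    print(signal_for(3, Severity.STOPPAGE))
```

Keeping every attribute distinct across severity levels is the point of the design: a Technician could then recognise the urgency from either the sound or the light alone.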
The targeted output of the Die Bonder is around 15k chips per shift per machine, while for each Wire Bonder it is around 3.35k per machine. The two most frequently processed devices are the 64M SDRAM-SS4 DRESDEN-DT8SR4HSG and 64M SDRAM-SS4 PROMOS AG-DT8SR4DSG.

Table 4. Machine Errors at Die Bonder, their Frequency and Time to Fix Them
Table 5. Machine Errors at Wire Bonder, their Frequency and Time to Fix Them
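The stated output targets are consistent with the line layout of one Die Bonder feeding four to five Wire Bonders, as a short calculation shows. The sketch below is illustrative only; it assumes that the Wire Bonder target of 3.35k is also per shift, which the text does not state explicitly, and the function names are hypothetical.

```python
DIE_BONDER_TARGET = 15_000    # chips per shift per machine (stated)
WIRE_BONDER_TARGET = 3_350    # chips per machine (per shift is assumed)


def wire_bonders_needed() -> float:
    """Number of Wire Bonders needed to absorb one Die Bonder's output."""
    return DIE_BONDER_TARGET / WIRE_BONDER_TARGET


def wire_bond_capacity(n_wire_bonders: int) -> int:
    """Combined target output of n Wire Bonders."""
    return n_wire_bonders * WIRE_BONDER_TARGET


if __name__ == "__main__":
    print(round(wire_bonders_needed(), 2))   # 4.48
    print(wire_bond_capacity(4))             # 13400
    print(wire_bond_capacity(5))             # 16750
```

Under that assumption roughly 4.5 Wire Bonders match one Die Bonder, so four machines fall slightly short of the Die Bonder target and five slightly exceed it.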
Conclusions and Recommendations

Under the new management system applied to the assembly line, the Technicians felt that they alone were unable to handle a whole line with one Die Bonder and four to five Wire Bonders, especially when there were so many simultaneous machine errors and stoppages. The machines’ minor errors and problems that occur frequently or repetitively therefore have to be solved or reduced to a minimum, e.g. chips out of tolerance and lead frame pickup errors at the Die Bonders; ball non-stick, wire breaks and wire too short at the Wire Bonders. As much of the data was classified, it could not be presented in this paper. The main recommendations are:
• The audio alarms for machine errors should be set at different tones with different interruption rates, depending on the severity of the errors and stoppages that cause loss of production and quality, so that each specific Technician on the line can identify them and take the necessary action. However, this may increase the ambient noise level, in which case the alternative ergonomic approach would be to use a specific radio frequency to feed information through the headphones of each Technician.
• Further study is necessary, especially on the effects of shift work on the Technicians, since the factory is implementing 8-hour and 12-hour shifts.
• Economical redesign improvements to the jumpsuit are necessary.
• The morale of the Technicians should be improved and their frustration reduced through the Management’s appreciation of their good performance.
• To reduce continuous standing work and consequent fatigue, the Technicians should be allowed to sit and record the data for at least 5 minutes every hour, excluding the tea and lunch breaks.
• Because of glare from the unshaded fluorescent ceiling lights on the VDT monitors, the Technicians had much difficulty in reading the displays. The fluorescent lights should be shaded to rectify this.

References

Farren, C. 1999, Stress and Productivity: What Tips the Scale?, (Strategy & Leadership, Chicago), 27, 36
Kroemer, K.H.E. and Grandjean, E. 1997, Fitting the Task to the Human: A Textbook of Occupational Ergonomics, (Taylor and Francis, London), 211–217
Sen, R.N. and Yeow, P.H.P. 1999, An Ergonomic Study on the Processes in a Medium-sized Printed Circuit Assembly Factory for Productivity and Rejection Reduction, In: Contemporary Ergonomics 1999, ed. Hanson, M.A., Lovesey, E.J. and Robertson, S.A. (Taylor and Francis, London), ISBN 074840872X, 38–42
Selye, H. 1978, The Stress of Life, (McGraw-Hill, New York)
Warren, E. and Toll, C. 1993, The Stress Work Book, (Nicholas Brealey Publishing, London), 184pp
Westen, D. 1996, Psychology: Mind, Brain and Culture, (John Wiley & Sons Inc., New York), 755pp
THE DEVELOPMENT OF PHYSICAL SELECTION PROCEDURES FOR THE BRITISH ARMY. PHASE 3: VALIDATION

Mark Rayson1, Harvey Pynn1, Anne Rothwell1 & Alan Nevill2

1 Optimal Performance Ltd, Old Chambers, 93–94 West Street, Farnham, Surrey GU9 7EB, UK
2 Liverpool John Moores University, Sport and Exercise Sciences, Byrom Street, Liverpool, UK
This paper describes a study to validate the British Army’s new Physical Selection Standards for Recruits (PSS(R)) as predictors of job performance. PSS(R) was validated against four discrete Job Performance Criteria in 1009 recruits (770 men and 239 women). The PSS(R) scores correctly predicted outcomes on all 4 Representative Military Tasks in 75% of recruits; recruits who passed their PSS(R) tests lost fewer days through injury and sickness than recruits who failed (0 vs 2 days: p<0.01), were more likely to pass out of training first time (74% vs 63%: p<0.05), and tended to have higher Job Performance Ratings (6.7 vs 6.0: p>0.05). The PSS(R) tests are deemed to be valid and useful predictors of job performance, and if implemented appropriately should improve operational effectiveness, reduce absenteeism during training, and improve first-time pass rates.

Introduction

This paper is the third in a series of three which describe the development and application of a systematic approach to setting and validating occupation-related physical selection standards, using the British Army as an example. The first paper (Rayson 1998) described the process of identifying and quantifying the most physically-demanding tasks within each occupation in the army. A variety of techniques were used including questionnaires, interviews, observation, and physiological, biomechanical and psychophysical measurement techniques. Four criterion tasks, which were referred to as Representative Military Tasks (RMTs), were defined (a single lift, carry, repetitive lift and loaded march), and all occupations in the army were allocated to one of three levels of performance on each RMT. The second paper (Rayson, Holliman and Belyavin in press) determined which combination of physical performance and anthropometric tests could best be used to predict RMT performance, preferably using ‘gender-free’ and ‘gender-unbiased’ models.
The objectives were met by analysing performance data on the RMTs and a large battery of physical performance tests collected from 379 trained soldiers. Ten preferred models were selected to predict the RMTs; these contained, in total, nine physical performance tests, but the errors of prediction varied markedly between models. One model was gender-free, three were gender-related (i.e. contained ‘gender’ explicitly in the model), and six were gender-specific (i.e. were appropriate for men or for women). Due to both a lower accuracy of prediction in women’s scores and a greater tendency for the women’s scores to be distributed around the pass standards, a greater percentage of women than men were misclassified as passing or failing, resulting in indirect discrimination.
This paper describes a study to validate the new selection tests and standards, known as Physical Selection Standards for Recruits (PSS(R)), as predictors of job performance, thereby confirming the findings from the initial phases of the project. The army’s intention was to replace the existing tests (pull-ups, sit-ups and 1.5 mile run) with PSS(R), to provide a rational and scientifically valid method of selecting men and women for jobs to which they were physically suited. This improved match between the physical capability of the soldier and the demands of the job should avoid irrational discrimination and result in greater operational effectiveness and reduced injury and attrition rates.
The objectives of the study were addressed by validating PSS(R) against 4 Job Performance Criteria. The RMTs were selected as the primary Job Performance Criterion. Three indirect measures of Job Performance were also assessed: number of days lost to Injury and Sickness during basic training; Training Outcome coded as 1: never started, 2: never completed, 3: completed after ‘back-squadding’, and 4: passed out first time; and Job Performance Ratings conducted by self, peer and supervisor.
Method
Two batteries of tests were administered to recruits: the 9 PSS(R) tests and the 4 RMTs. The PSS(R) tests, comprising measures of body mass, body fat and fat free mass, static lift strength, back extension strength, dynamic lift strength, pull-ups, static arm endurance, and the Multistage Fitness Test (Rayson, Holliman and Belyavin in press) were administered at the Recruit Selection Centres. The PSS(R) test scores were used to predict RMT scores using equations that were developed on independent samples from previous studies, and to determine Selection Outcome (i.e., pass or failure to achieve the required Levels of RMT performance for their occupation).
Actual RMTs were administered in the final week of basic training: they comprised a Single Lift, Carry, Repetitive Lift and Loaded March (Rayson, Holliman and Belyavin in press). The Selection Outcomes were compared with actual RMT performance using 2×2 classification tables for each RMT. Tabulations were produced for all 3 Levels of the 4 RMTs; chi-square tests were performed to determine whether the proportions of men and women correctly and incorrectly classified as pass or fail differed. The tables classified the recruits into true positive, true negative, false positive and false negative.
The Injury and Sickness data (total number of days off during basic training) and Selection Outcome were analysed using non-parametric tests (Mann-Whitney and Kruskal-Wallis tests). The median number of days lost due to Injury and Sickness was used because the distribution was non-normal. Selection Outcomes were analysed against Training Outcome using a chi-square test. Job Performance Ratings were measured using a continuous scale from 0 (very poor) to 10 (very good). The reliability of the Job Performance Ratings was verified: all fell within statistically acceptable levels of reliability (CoF 10.9 to 15.4%). The three scores resulting from the Job Performance Ratings were analysed separately using independent t-tests.
1009 recruits (770 men and 239 women) were tested on PSS(R), representing a shortfall in the targeted number of women but a surplus of men. 624 recruits (512 men and 112 women), representing 61.8% of the initial sample, commenced basic training. 315 recruits (271 men and 44 women), representing 31.2% of the sample, completed basic training and were tested on the actual RMTs. Job Performance Ratings on 72 recruits (61 men and 11 women), representing 7.1% of the sample, were returned for analysis.
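The 2×2 classification described in the Method can be illustrated with a minimal Python sketch. The function names and the ten example recruits are hypothetical and are not taken from the study data; the sketch only shows how a predicted Selection Outcome and an actual RMT result map onto the four cells of the table.

```python
from collections import Counter

CELLS = ("true positive", "true negative", "false positive", "false negative")

def classify(predicted_pass: bool, actual_pass: bool) -> str:
    """Return the 2x2 cell for one recruit on one RMT."""
    if predicted_pass and actual_pass:
        return "true positive"    # predicted to pass, did pass
    if not predicted_pass and not actual_pass:
        return "true negative"    # predicted to fail, did fail
    if predicted_pass:
        return "false positive"   # predicted to pass, actually failed
    return "false negative"       # predicted to fail, actually passed

def classification_table(pairs):
    """Percentage of recruits in each cell, from (predicted, actual) pairs."""
    counts = Counter(classify(p, a) for p, a in pairs)
    total = sum(counts.values())
    return {cell: 100.0 * counts[cell] / total for cell in CELLS}

# Hypothetical example data: ten recruits on a single RMT.
recruits = ([(True, True)] * 6 + [(False, False)] * 2
            + [(True, False)] + [(False, True)])
table = classification_table(recruits)
hit_rate = table["true positive"] + table["true negative"]
```

The 'hit' rate is simply the sum of the two correct cells; the two incorrect cells are kept separate because a false negative (a capable recruit rejected) and a false positive (an unsuited recruit accepted) have different consequences.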
Results Selection Outcome and Representative Military Task Performance Table 1 illustrates, by RMT and by Level, the percentages of recruits who were correctly and incorrectly classified as passing and failing the RMT Levels required for their chosen occupation. Table 1. Predicted versus actual RMT outcome by RMT and Level
Table 2. Percentage of recruits passing the Selection Tests and Training Outcome
Selection Outcome and Injury and Sickness during basic training Medical data were available for 282 recruits. Recruits who failed their Selection Outcome (n=73) lost a median of 2 days of basic training to injury or sickness compared to recruits who passed their Selection Outcome (n=209) who lost 0 days (p<0.01). Selection Outcome and Training Outcome Table 2 shows the relationship between Selection and Training Outcome (n=799). A significant (p<0.05) relationship was found between the probability of passing the Selection Tests and Training Outcome. Post hoc analyses indicated that recruits who did not complete basic training first time
Table 3. Selection Outcome and Job Performance Ratings
(Training Outcomes=1, 2 and 3) had a lower pass rate on PSS(R) than recruits who did (Training Outcome=4). Selection Outcome and Job Performance Ratings Table 3 illustrates the mean rating scores for those who failed and those who passed PSS(R). The relationship between the rating scores and Selection Outcome was not significant (p>0.05). However, there was a trend for recruits who failed the Selection Tests to have a lower mean score on all three Job Performance Ratings than recruits who passed the Selection Tests. The mean difference in rating score was 0.7. Discussion In this study we set out to validate PSS(R) against four Job Performance Criteria. The primary Job Performance Criterion was performance on the RMTs at the end of basic training. On average, PSS(R) correctly classified recruits in 89.3% of RMTs over all Levels—85.7% of which were true positives (recruits who passed the PSS(R) (i.e. predicted RMT success) and went on to pass the RMTs at the end of basic training) and 3.6% were true negatives (recruits who failed PSS(R) (i.e. predicted RMT failures) and went on to fail the RMTs at the end of basic training). These individuals would have been correctly accepted into or correctly rejected from the army, on the basis of their subsequent performance on key physically-demanding tasks. This ‘hit’ rate compares favourably with many other psychosocial employment selection procedures that are widely deployed, where ‘hit’ rates of around 70% are more typical. Furthermore, given the predictive nature of these selection procedures (predicting RMT performance typically some 4 to 6 months after performance of PSS(R)), it is all the more impressive. However, if implemented as proposed, recruits will be permitted or refused entry to the army on the basis of their predicted capability to perform each of four RMTs, rather than any single RMT. 
Consequently, the overall classification rate, which considers correct and incorrect classification on all 4 RMTs combined, may be a more appropriate statistic. Overall, when classification rates on the 4 RMTs were pooled, PSS(R) correctly predicted 74.9% of the 289 recruits to pass or fail for their occupation at the end of basic training. 58.7% of the recruits were classified as true positives and 16.2% were true negatives. The remainder (25.1%) was misclassified into false positives (15.5%) and false negatives (9.6%). The 15.5% false positives constitute recruits who would be accepted by the army but who in reality would be unsuited for their occupation as they failed to achieve the Levels required on the actual RMTs. The false negatives constitute the group that warrants greatest concern as these recruits would be rejected by the army but, in fact, would achieve the Levels required on the actual RMTs (i.e. these recruits would be incorrectly rejected from the Army). However, this group represents only 9.6% of the sample. Further investigation of the individual RMTs and the Levels within them provides an insight into potential areas for concern, for example the 17% false negatives for Single Lift Level 1. The Level 2 and 3
data produced the lowest incorrect classification rates. This would be expected, as the standards required for a Level 2 or 3 are well within the physical capability of the majority of recruits. The higher incorrect classification rates for Level 1 RMTs were due to recruits only just passing or failing the standard required for their occupation. A minor alteration to the standards could have a marked effect on the figures for Level 1 RMTs.
The three indirect measures of Job Performance provided further supportive evidence of the validity of PSS(R). PSS(R) was found to be related to Injury and Sickness during basic training, with those who failed their Selection Outcome losing a median of 2 days more than those who passed. A difference of 2 days of absenteeism may not sound much, but when one considers that the army trains around 14,000 new recruits every year, the total number of working days lost and the financial implications of this absenteeism become more obvious. Similarly, PSS(R) was related to Training Outcome. Whilst there was no difference between PSS(R) success rates in those who never commenced training, never completed training and completed training only after ‘back-squadding’, those recruits who passed out first-time had around 10% higher PSS(R) success rates than the remainder. Although the response rate for the Job Performance Ratings was disappointing and the resulting data set was small, a consistent trend was apparent for recruits who passed their PSS(R) to have higher self, peer and supervisor ratings than those who failed.
The female sample size was too low to be able to draw any firm conclusions about gender bias, but some observations were noted. Although misclassification rates were higher in women for some RMTs, the majority of these misclassifications were false positives, resulting in these women being incorrectly accepted rather than incorrectly rejected from the army.
Gender accounted for most of the relationship between Selection Outcome and Injury and Sickness: men and women had a median of 0 and 5 days off, respectively. Women were also less likely than men both to achieve the Selection Outcome and to pass out of basic training. However, both men and women who passed out first time had the highest success rates on PSS(R).
If we consider the impact that the introduction of PSS(R) may have on aspects of recruitment and manning levels in the British Army we find that whilst PSS(R) failed a lower percentage of recruits (10.9%) than the current BFL system (24.8%) for the minimum standard to the Army, overall, PSS(R) failed a higher percentage of recruits (35.3%) than BFL (24.8%), for chosen occupations. It seems likely therefore that recruitment numbers to the physically-demanding occupations will decrease, though improved pass-out rates may mitigate potential reductions in manning levels. Amongst the women, the Selection Tests failed a higher percentage than BFL tests, both for the minimum standard to the Army (45.4% versus 38.6%), and more especially for chosen occupation (66.4% versus 38.6%). It seems likely therefore that recruitment of women will decrease. Although the implementation of PSS(R) will impact on recruitment, recruits who are successful will be of a higher physical calibre and may suffer less injury and attrition.
In summary, a relationship was found between 3 of the Job Performance Criteria and PSS(R) and trends were also apparent in the fourth. PSS(R) correctly predicted outcomes on all 4 RMTs in 75% of recruits. Recruits who passed PSS(R) lost fewer days through injury and sickness and were more likely to pass out of basic training first time. There was a tendency for recruits who passed PSS(R) to have higher Job Performance Ratings.
This work was supported by the Ministry of Defence.
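The pooled classification figures reported in the Discussion can be re-added as a quick consistency check. The Python sketch below uses only the four percentages given in the text; the two conditional rates at the end are derived here for illustration and are not reported in the paper.

```python
# Pooled classification percentages on all 4 RMTs, as reported in the Discussion.
true_pos, true_neg = 58.7, 16.2
false_pos, false_neg = 15.5, 9.6

correct = true_pos + true_neg       # overall 'hit' rate
incorrect = false_pos + false_neg   # overall misclassification rate

# Derived for illustration only (not reported in the paper):
# share of recruits who met the actual RMT Levels but were predicted to fail,
# and share of recruits predicted to pass who did not meet the actual RMT Levels.
capable = true_pos + false_neg
accepted = true_pos + false_pos
rejected_among_capable = 100.0 * false_neg / capable
unsuited_among_accepted = 100.0 * false_pos / accepted

print(f"correct {correct:.1f}%, incorrect {incorrect:.1f}%")
print(f"capable but rejected {rejected_among_capable:.1f}%")
print(f"accepted but unsuited {unsuited_among_accepted:.1f}%")
```

The four cells sum to 100%, and the correct and incorrect totals agree with the 74.9% and 25.1% quoted in the text.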
References
Rayson, M.P. 1998. The development of physical selection procedures. Phase 1: job analysis. In M.A. Hanson (ed.) Contemporary Ergonomics 1998, (Taylor and Francis, London) 393–397.
Rayson, M.P., Holliman, D.E.H. and Belyavin, A. in press. The development of physical selection procedures for the British Army. Phase 2: the relationship between physical performance tests and criterion tasks. Ergonomics.
THE ERGONOMIC DESIGN OF LONDON UNDERGROUND LIMITED’S INCIDENT REPORTING FORMS
Adam Whitlock1, Simon Layton1, Mike Sinclair-Williams2 & Julie Parham2
1Human Engineering Limited, Shore House, 68 Westbury Hill, Westbury-On-Trym, Bristol BS9 3AA, UK
2London Underground Limited, Albany House, 55 Broadway, London SW1H 0BD, UK
Abstract
This paper describes the methodology used for re-designing London Underground Limited’s (LUL) standard incident reporting form (INF). Human Engineering were required to conduct a programme of work to review the use and layout of the existing form, and provide options for a re-designed form. The work involved familiarisation with form users’ environments and the facilitation of workshops to identify re-design requirements. The form delivered to LUL had fields grouped according to the type of information captured and clearly laid out sections. In addition, the form was designed to provide safety managers with relevant and reliable data for analysis purposes. The re-designed INF is currently being trialled on one of LUL’s Operating Lines.
Introduction
Human Engineering were instructed by LUL to re-design their Incident Notification Form (INF). The previous INF was in A3, landscape format, and was regarded by line managers as unwieldy; it also included sections that were not relevant to their requirements. As a result, some managers had developed their own forms for incident notification. The use of several different forms compromised the reliability of the information recorded on the forms, and the subsequent incident statistics generated by Safety Managers in Loss Control Information Section (LCIS). The purpose of Human Engineering’s work was to:
• Identify where INFs were being used and by whom
• Identify problems associated with the layout and content of the existing form
• Provide several concept designs for a new form
• Validate the designs and select the form that best met the requirements of the stakeholders and improved LUL’s business efficiency
Human Engineering were required to deliver a final version of the re-designed form, plus any necessary guidance material.
Methodology The work was carried out in four stages; familiarisation, problem identification and requirements capture, concept design generation and design validation. Stage 1—Familiarisation The aims of this stage were to identify the different INF stakeholders and to discuss the general issues related to the existing INF form (shown in Figure 1).
Figure 1. The existing INF
During communications with LUL, the INF stakeholders were identified as follows:
• Duty Line Managers
• Duty Station Managers
• Station Supervisors
• LCIS
• Line Controllers
Human Engineering visited a sample of these stakeholders at their place of work to informally discuss issues associated with the existing form’s purpose, content, layout, and ease and convenience of use. The results of the discussions were used to structure the subsequent INF workshop. Stage 2—Formal problem identification and requirements capture The aims of this stage were to formally identify all the problems associated with the existing form, and compile a list of requirements for the proposed re-designed form. This was achieved through guided workshop discussions, facilitated by Human Engineering. Representatives from each of the INF
stakeholders attended the workshop. In addition, various LUL safety officials were present to provide an input into the requirements capture process. The structure of the workshop was as follows:
• Discuss and differentiate long and short term goals for the INF
• Identify problems with the existing form
• Determine ‘high level’ requirements for the re-designed form
The long-term goals included improved training in completing INFs, feedback mechanisms, computerised forms and the investigation of the effects of privatisation. These goals would provide direction for future INFs. The short-term goals included:
– Re-designing the form, but maintaining a similar appearance to the previous form
– Designing the form in A4 format, on one sheet of paper
– Agreeing that the re-designed form should become the accepted standard for incident notification
– Development of guidance notes, as an interim to full training
– Separating the form into standard, train and station sections
– Ensuring that compulsory sections are completed
Workshop discussions suggested that a radically re-designed form would meet resistance from staff. However, it was agreed that a single page, A4 form would be welcomed by staff. Guidance notes would be produced and distributed to address the issue of staff not completing compulsory sections (or providing excess detail). The notes would prompt staff as to the information that should be recorded and the level of detail required. In addition, the notes would give the procedures for the filing and distribution of the completed form. Finally, it was agreed that some of the information was standard to both the trains and stations departments of LUL. This was to be included in one section. Train and station specific data were to be recorded in separate sections.
Stage 3—Concept generation and initial form design The aim of this stage was to produce a range of concept designs for the INF that fulfilled the parameters for the design of the form, as specified during the workshop. Ergonomic principles of sequence of use and functional grouping were applied to the concept designs. To ensure legibility, the minimum font size used was 8 point. A sans serif, black font was used on a white background. To keep the same format as the previous form, compulsory sections were given a dark border. Noncompulsory sections were given a light border. In addition, the titles for each of the sections remained the same, where appropriate. The four concept designs are shown on the following page.
Stage 4—Design validation
The aims of this stage were to discuss the concept designs and select a design that would be used in trials. This was achieved through guided discussion during a second workshop, attended by those LUL staff that were present at the first workshop. Human Engineering facilitated the discussions. The structure of the workshop was as follows:
• Presentation of four re-designed forms to the workshop
• General discussion of the layout and structure of the four options
• Selection of a preferred form and detailed discussion for amendments
• Discussion of the requirements for the guidance notes
• Methods of carrying out the trials and implementing the form throughout LUL
The workshop opted for the design that:
• Described the free text fields according to the fundamental incident questions of ‘What happened?’, ‘Why did it happen?’, ‘What was done about it?’ and ‘What could have been done to prevent it?’
• Arranged these free text fields so they would be completed sequentially
Final Design and Conclusion
Following the workshop, minor changes were made to the concept design. These included adjusting the size of the free text fields to accommodate the necessary level of data capture. In addition, the ‘Train Specific Information’ and ‘Station Specific Information’ fields were to be moved to improve the logic of the layout. The subsequent final design is shown on the following page. The guidance notes were designed using the same layout as the INF, to aid mapping. Simple section descriptors such as ‘Why?’, ‘What?’, ‘Where?’ and ‘When?’ were used to prompt form users.
In conclusion, the update of the paper version of the INF provided LUL with an interim solution to the issue of incident reporting. However, a paper based INF is not the ultimate solution. The next phase of work will investigate the feasibility and design of an electronic version of the INF, where users type the data directly into a computer database.
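As a purely illustrative sketch of how an electronic INF might carry over the sequential free text fields and the short-term goal of ensuring that compulsory sections are completed, the Python fragment below models a report as a simple record. All field names are hypothetical, and the choice of which fields are compulsory is an assumption made for the example; the paper does not specify either.

```python
from dataclasses import dataclass

@dataclass
class IncidentReport:
    """Hypothetical electronic INF record (field names are illustrative)."""
    # Free text fields, in the order they would be completed on the form.
    what_happened: str = ""
    why_it_happened: str = ""
    what_was_done: str = ""
    how_to_prevent: str = ""
    # Department-specific sections.
    train_specific: str = ""
    station_specific: str = ""

# ASSUMPTION: these three fields are treated as compulsory for the example only.
COMPULSORY = ("what_happened", "why_it_happened", "what_was_done")

def missing_compulsory(report: IncidentReport) -> list:
    """Names of compulsory fields that are empty or contain only whitespace."""
    return [name for name in COMPULSORY if not getattr(report, name).strip()]

# Example: a partially completed report.
report = IncidentReport(what_happened="Door fault reported", why_it_happened="  ")
print(missing_compulsory(report))
```

A check of this kind could be run before a report is saved, which is one way an electronic form could enforce what the paper form can only signal with a dark border.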
Figure 2. The re-designed INF
HCI & IT systems
RESEARCH ON CULTURAL FACTORS AND INTERFACE METAPHORS IN INTERNET APPLICATIONS
Chien-Hsiung Chen & Chia-Ching Hsu
Graduate School of Industrial Design, Tatung University, 40 Chungshan N. Road, 3rd Sec., Taipei 10451, Taiwan
This paper explores the concepts of cultural factors that influence Website design and interface metaphors in the context of Internet applications. A Web shopping store was created based on the culture of a department store in Taiwan for the experimental purpose. This study successfully demonstrated that the appropriate use of a culturally dependent interface metaphor can actually improve a user’s Web browsing and searching tasks. The authors hope that the appropriate use of interface metaphors can also be used to facilitate other types of interaction tasks to improve users’ task performance. Introduction In recent years, due to the rapid progress of computer technology in Internet applications, users from different parts of the world have been able to utilize the Internet for various interaction purposes, such as information acquisition (search and download), information storage (Web hard disk), electronic commerce (shopping), communication (e-mail, Web phone and digital imaging), as well as entertainment (games). The Internet can be viewed as a virtual information space. It is an interactive computer-generated environment within which users can acquire various types of information in a digital form. In fact, the Internet provides “artificial reality” within which users can travel from one Website to another and conduct various types of interaction tasks. The planning and design of a Website can be influenced by the designer’s own cultural background and strategies used. Among various types of strategies for designing effective Websites, the application of an interface metaphor can facilitate users’ Internet navigation and Web browsing purposes. The Design of Effective Websites An effective Website should be self-explanatory and easily navigated by all of its potential users. A useful Web interface should also make it possible for users with different levels of Web experience to achieve their browsing objectives easily. 
Thus, Website designers should incorporate both cultural factors and an interface metaphor in the design process in order to create an interface that users can interact with easily.
Cultural factors
Culture can be viewed as “shared patterns of behavior” (Mead, 1953). That is, within one culture, all members are able to interact with each other based on similar cultural patterns. Culture provides an emotional space in which a set of beliefs, values, and behaviors are commonly shared by members of the same society. Cultural traditions (i.e., patterns) must be generally agreed upon by the majority of the members of the culture, not just by an individual alone. Hall (1969) organized cultures by the amount of information implied by the setting or context of the communication itself, regardless of the specific words spoken. He argued that cultures differ on a continuum ranging from high to low context. In high-context cultures, the communication is implied by a physical setting or by an individual’s beliefs and values. In low-context cultures, the communication among culture members is expected to be explicit, and everyone has equal access to available information. In the context of human-computer interaction, the communication between a human and a computer is moving from low-context to high-context interaction. Because of the progress of advanced computer technology, traditionally rigid command languages are no longer necessary for a user to interact with a computer. Instead, the concept of direct manipulation has provided multiple and flexible interaction styles to facilitate the user’s interaction tasks.
Interface metaphor
Using an interface metaphor entails applying existing well-known concepts as an analogy to a new design concept. Gentner and Stevens (1983) view the user’s understanding of a computer system as high-level explanations between users and interface metaphors. In fact, an interface metaphor can be viewed as a representational model used to help design a Website. Research has also demonstrated that the selection of appropriate metaphors is crucial to facilitating users’ interaction tasks (Carroll & Thomas, 1982).
Therefore, an interface metaphor can work as a facilitator to help users establish an initial understanding of Websites.
The culture of a department store in Taiwan
A department store is usually a large retail outlet where customers can purchase various types of merchandise in one location. In Taiwan, department stores are usually located in a tall building with several floors. Though the total number of floors can be different among various department stores, the function of each particular floor remains similar from one store to another. For example, the supermarket and food bazaar are usually located in the basement. The functions of higher floors are often in the sequence of cosmetics and ladies’ shoes, young ladies’ fashions, jewelry and ladies’ shirts, designer fashions, women’s nightwear, men’s fashions, children’s clothing and toys, sports and leisure, electrical home appliances, furniture and house decorations. Because of the difference in the total number of the floors, some functions can be combined and presented on the same floor.
Research on using the culture of a department store as an Internet metaphor
Objective
Because almost every Internet user in Taiwan knows how to purchase things from a department store, it is possible that using the culture of a department store as a metaphor can benefit users when shopping from a
Figure 1. The Webstore created by using the culture of department store as an interface metaphor
Webstore. Therefore, the objective of this study was to investigate whether or not the application of the culture of a department store as a metaphor for a Website interface could actually facilitate users’ shopping performance when shopping on the Web.
Experiment
For the purpose of the experiment, a Website was created based on the metaphor of an 11-floor department store, including a basement (see Figure 1). Each floor was assigned suitable functions according to the culture of a department store in Taiwan. In addition, an existing shopping Website was also chosen for comparison. A total of 26 participants (16 males and 10 females) were recruited for the experiment. They were all student volunteers from the Department of Industrial Design at Tatung University. None was paid to participate in the study. Nonetheless, some students received class credit for participating in the experiment. Each participant was randomly assigned to visit one of the two Websites and perform a shopping activity. The participant was asked to buy two items from a list and four items of their own choosing. The equipment used in the experiment was a Pentium II PC using Windows 98 as the operating system. All the participants used Internet Explorer (version 5.0) as the Web browser in this experiment. Each participant’s actions and task times were video recorded for statistical analysis.
Results and discussions
By analyzing the videotape produced during the experiment, the researchers calculated each participant’s task time and number of steps to complete the experiment. The results show that on average, the participants who visited the Webstore created using the department store as the interface metaphor took 207.3 seconds (SD=79.2) and 27.5 steps (SD=7.1) to complete the experiment. In addition, the participants who visited the existing Webstore created without any specific interface metaphor took 367.1 seconds (SD=180.4) and 24.4 steps (SD=4.9) to complete the experiment.
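The group means and standard deviations above are enough to recompute a pooled-variance two-sample t statistic, which is a useful check on the reported result of t(24)=2.92. The Python sketch below assumes that the 26 participants were split evenly, 13 per Website; the paper does not state the group sizes, so this split and the function name are assumptions made for illustration.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance two-sample t statistic and degrees of freedom
    computed from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df

N = 13  # ASSUMPTION: equal groups of 13; not stated in the paper

# Task time in seconds: existing Webstore versus metaphor-based Webstore.
t_time, df = pooled_t(367.1, 180.4, N, 207.3, 79.2, N)

# Number of steps: metaphor-based Webstore versus existing Webstore.
t_steps, _ = pooled_t(27.5, 7.1, N, 24.4, 4.9, N)

print(f"time:  t({df}) = {t_time:.2f}")
print(f"steps: t({df}) = {t_steps:.2f}")
```

Under this assumption the task-time statistic comes out at about 2.92 on 24 degrees of freedom, which exceeds the two-tailed 5% critical value of roughly 2.06, whereas the statistic for the number of steps is about 1.3 and falls well below it.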
A t-test revealed a significant difference between
these two groups in terms of the time spent to complete the task, t(24)=2.92, P<0.01. Nonetheless, no statistically significant difference existed in the number of steps taken to complete the task. The statistical results indicate that the participants did spend less shopping time when visiting the Webstore created with the interface metaphor. However, this time reduction was not caused by a decrease in the number of shopping steps. In fact, the participants took almost 3 more steps shopping in the metaphor designed Webstore than in the non-metaphor designed store, though no statistically significant difference was found. Therefore, the time reduction can be due to the fact that the department store metaphor did facilitate the participants’ shopping technique. This study supports the finding that the appropriate use of an interface metaphor for Web design based on cultural considerations can improve users’ browsing and searching techniques. Conclusions This study was intended to explore the potential of a culturally sensitive interface metaphor in a user interface design. The authors have successfully demonstrated that the appropriate use of an interface metaphor can actually improve a user’s performance on Web browsing and searching tasks as long as the metaphor is easy to understand, i.e., designed with cultural considerations. The authors hope that the application of interface metaphors can also be used to facilitate other types of interaction tasks to improve users’ task performance. Acknowledgements Financial support of this research by National Science Council under the grant NSC 89–2213-E-036–009 is gratefully acknowledged. References Carroll, J.M. and Thomas, J.C. 1982, Metaphor and the cognitive representation of computing systems. IEEE Transactions on Systems, Man, and Cybernetics, 12, 107–116 Gentner, D. and Stevens, A.L. 1983, Mental Models, (Lawrence Erlbaum, Hillsdale, NJ) Hall, E.T. 1969, The Hidden Dimension, (Doubleday, New York) Mead, M. 
1953, Coming of Age in Samoa, (Modern Library, New York)
DESIGN AND EVALUATION OF A DIRECT MANIPULATION OBJECT FOR APPLICATION IN THE POSTPRODUCTION SPECIAL EFFECTS DOMAIN
Martin Hicks1, John Long1 & Clare Borras2
1Ergonomics & HCI Unit, University College London, 26 Bedford Way, London WC1H 0AP, UK
2Sony Broadcast & Professional Europe, Jays Close, Viables, Basingstoke, Hampshire RG22 4SB, UK
Commercial postproduction special effects software typically incorporates multiple windows and an arbitrary combination of visual representations. Such graphical conventions can potentially distract the special effects editor, hence a requirement for a simplified user interface (UI) design. This paper adopts an innovative approach to the design of graphical representations of postproduction tools, based on research conducted by Sony Broadcast and Professional Europe (‘EditWorld’). The design specification incorporates a 3D direct manipulation object (DMO) in the form of a camera tool. It was postulated that editing effectiveness could be enhanced by the inclusion of a 3D DMO camera tool. Printed screen layouts of the design were evaluated by adopting a semi-structured interview, based on a usability assessment questionnaire. Due to the subjective nature of the assessment, the evaluation objectives could not be fully realised. However, useful feedback was derived to inform future design decisions. Introduction In both television and film production, postproduction consists of editing which may include graphics and visual special effects, and sound dubbing. Postproduction is typically situated at the end of the production cycle, but may represent different meanings to different postproduction houses, projects and project types, according to time and budget constraints. Within the postproduction domain, there are a number of commercial postproduction software products, available to the editor, which serve as postproduction tools to perform edits on 3D image contents, between image clips, graphics and text. Based on a UI comparison of various current commercial special effects and editor packages, ‘EditWorld’ (1999a) identified a number of areas in extant 3D special effects applications, which may result in ineffective editing performance. 
Accordingly, the ‘EditWorld’ UI design problem, as identified by the initial market analyses conducted by ‘EditWorld’ (1999a), may be characterised as: ‘an ineffective editing performance arising from the editor being distracted or alienated from the editing task by the complex UI design incorporating multiple windows on the screen desktop, and from an arbitrary combination of visual representations (i.e., textual, graphical, iconic, animation and video)’. To promote effectiveness in postproduction special effects editing, ‘EditWorld’ (1999a) state that a main requirement is to provide a design solution which minimises display complexity by rationalising display layout, while maintaining compatibility with 3D postproduction special effects UI representations. From their initial market analyses, ‘EditWorld’ (1999a) have identified that certain commercial packages have incorporated graphic representations in the form of direct manipulation objects (DMOs). The examples of
DMO UI representations provided by ‘EditWorld’ (1999a), show a lack of dialogs and windows to conserve screen real estate. With respect to the design problem stated, it is expected that the effectiveness in editing performance will be enhanced by a simplified UI design resulting from the inclusion of DMOs, and by minimising combinations of visual representations.

Graphical representations of a DMO

According to ‘EditWorld’ (1999b), using graphical representations of DMOs is well suited to creative spatial tasks, as users become acquainted with the creative and exploratory aspects involved. Furthermore, well-designed graphic representations of DMOs are seen to adhere to certain human factors principles. Principles of virtuality, whereby a representation of reality can enhance manipulations, and transparency, whereby skills can be directly applied to tasks, are detailed in Shneiderman (1998). ‘EditWorld’ (1999b) propose that problem-solving and learning skills are enhanced when problems and solutions have a suitable visible representation, as memory is better equipped to retain physical spatial representations. The incorporation of DMOs in UI representations contributes to efficient interactions due to the following attributes:
• Combined location of data display with data entry
• Immediate feedback reduces logical errors
The design properties of a graphical representation incorporating DMOs have been suggested by Card et al. (1999), who state that such designs should reflect the way people manipulate objects in the real world, and should include the following aspects:
• Continuous representations of objects and actions (maintains visual momentum)
• Objects be manipulated by physical actions or directional keys
• Operations be visible, rapid and reversible
• DMOs be designed as recognisable objects and reflect real world actions

Enhanced graph metaphor

The ‘enhanced graph metaphor’ proposed by ‘EditWorld’ (1999b), refers to ‘our ability to perceive and understand relationships from an analogue understanding of the world, whereby larger spatial differences are more important than smaller ones’. Thus, an enhanced graph design needs to adequately represent spatial distance. The enhanced graph metaphor intends to make use of the visual field by presenting information at an appropriate level of detail, whilst mapping a detailed view within the external view in the 3D environment. This metaphor is expanded to propose a solution to the integration of summary and detailed information, and to exploit the visual field in a UI representation. For example, the UI representation may include a magnifier, as selected areas can be examined in detail, with external information providing a summary view. This functionality may be incorporated, for example, in a 3D-camera tool, with the selected lens view overlaying the 3D external view. Additionally, temporal continuity is supported if the 3D external view is unaltered to sustain visual momentum, allowing the changes occurring to a 3D object to be readily perceived. This outcome is made possible by maintaining visual momentum and temporal continuity, which support the design properties of graphical representations incorporating DMOs (Card et al., 1999).
As a result, the benefits of dynamic and flexible operations can be realised in spatial editing tasks within the postproduction special effects domain.

Scope of design solution

Specifically, the scope of the design solution is limited to the specification of a 3D DMO camera tool as a subset of a postproduction tool set, to further incorporate:
• the specification of a concept demonstrator of a portable Sony studio camera, incorporating the appearance, and where possible, some basic functionality (zoom control) of a studio camera (‘EditWorld’, 1999b)
• the enhanced graph metaphor, with selected lens view overlaying 3D external view (‘EditWorld’, 1999b)
• the facility to reflect changes in the attributes of external 3D objects in the selected lens view, and to maintain temporal continuity and visual momentum in object manipulations (‘EditWorld’, 1999b; Card et al., 1999)
• the specific properties, semiotics, and guidelines of successful icon/direct manipulation interface design to support design practice
The specification of a 3D DMO concept demonstrator, as a graphic representation of a real world object overlaid on an image, may provide benefits to the editor by:
• simplifying the UI design by including a 3D DMO and reducing display complexity
• minimising visual representations (incorporating DMOs with successful icon/direct manipulation interface design guidelines)
• incorporating a 3D DMO with the additional functionality of a real world metaphor
According to ‘EditWorld’ (1999b), these benefits can be rationalised to promote effective performance in editing tasks by allowing the following attributes:
• immediate feedback, allowing editors to monitor actions in relation to goals. Where actions are counterproductive, editors can easily change direction through inverse actions. In this way, operations are visible, rapid and reversible (Card et al., 1999)
• rapid manipulation of objects of interest
• both data input and output are in the same screen location
• physical and visual representations are easier to manipulate and retain in memory
• easier navigation through complex scenarios
• intermittent users retain operational concepts more easily

Method for Usability Engineering (DMO)

MUSE (Method for Usability Engineering) is a structured method for usability engineering (for an overview of the MUSE method, see Lim & Long, 1994). As the design specification is limited to a subsystem (i.e., a 3D DMO camera tool as a subset of a postproduction toolset), the MUSE methodology was configured with a limited scope in terms of the design products realised at each stage. Accordingly, MUSEDMO is a novel configuration of MUSE, which is limited to the production of a design
Figure 1: Target subsystem design for 3D camera tool and selected lens view
specification for a 3D DMO camera postproduction tool. The following summary characterises MUSEDMO, which adopts the three phases of the original method. The configuration of MUSEDMO includes:
• Information Elicitation and Analysis Phase: including the development of both extant and target Generalised Task Models (GTMs), derived from the Statement of Requirements (for example, initial functionality as characterised by ‘EditWorld’ (1999b)), an extant editing graphical representation system (Discreet Logic’s ‘Effect’), and task decomposition.
• Design Synthesis Phase: including a Statement of User Needs (SUN), expanded to include results from literature reviews (including icon/direct manipulation interface design guidelines) in a summarised format. A Domain of Design Discourse (DODD) identifies particular characteristics and concepts of the editing domain. The Composite Task Model (CTM) highlights the conceptual design of the target DMO editor interaction. The CTM is decomposed further to identify subtasks of the target DMO editor interaction and so derive a System Task Model (STM), including allocation of function between human and computer behaviours.
• Design Specification Phase: human behaviours identified in the STM are decomposed further to derive a device-level specification of the interaction of the DMO in an Interaction Task Model (also informed by the SUN and DODD). The computer behaviours identified in the STM are decomposed further to derive a set of Interface Models (note: as only one DMO is represented, the definition of screen objects is limited). Pictorial Screen Layouts (linked to the Interface Models) for both the 3D camera design iterations, and for the target subsystem design, were composed directly using Amapi Studio 3.0, and converted to bitmap files for illustration purposes.
Figure 1 shows the Pictorial Screen Layouts for Screens S1 and S2, illustrating the target subsystem design for a 3D camera tool and selected lens view.
Evaluation of DMO Design Specification

A semi-structured interview, for the purposes of the evaluation of the DMO design specification, was adapted from a ‘Perceived Usefulness and Ease of Use’ usability questionnaire (based on Davis, 1989). The usability questionnaire is particularly suited to the assessment of design solutions incorporating paper-based screen layouts. Notably, the questions were amended to directly reference the 3D DMO camera tool design. However, due to limited access to working editors within postproduction, the editor involved in the Extant Systems Analysis stage of MUSEDMO was recruited as the sole subject for evaluation purposes.
Although the evaluation conducted had compromised external validity, it can be considered an informal usability assessment, suitable for informing subsequent design decisions. The evaluation procedure entailed presenting the editor with three Pictorial Screen Layouts (including two Screen Layouts in Figure 1) in printed form. It was emphasised to the editor that these paper-based screen layouts should be viewed together as a format for the design specification, and functionality (albeit limited), for a 3D DMO camera tool. Despite the degree of subjectivity involved, useful feedback in the form of suggested camera functionality was derived, which can be used to inform further design decisions. The majority of the editor’s responses were dependent upon additional factors, which could not be properly represented by the paper-based screen layouts.

Conclusions

The design specification of a 3D DMO camera tool represents an innovative approach to the development of postproduction tools. This approach represents a departure from the common windows-based UI design, as characterised by ‘EditWorld’ research. ‘EditWorld’ (1999a, 1999b) propose that graphical representations incorporating DMOs may be beneficial to the editor when performing special effects edits. A review of extant commercial special effects packages has identified special effects editing tasks where the 3D DMO camera tool could potentially be applied. In its present form, the 3D DMO camera tool is at an early design stage. While the usability assessment of the design solution is inconclusive, the general response to the UI representation was favourable. On reflection, the initial functionality in the present design specification is representative of high level DMO camera functionality. The additional functionality proposed by the editor during the evaluation offers scope for the future development of a 3D DMO camera postproduction tool.

References

Card, S.K., Mackinlay, J.D., and Shneiderman, B. (1999). Readings in Information Visualisation: Using Vision to Think. Morgan Kaufmann Publishers Inc.
Davis, F.D. (1989). Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, 13(3), 319–340.
EditWorld (1999a). Special Effects Research Team, Project: EditWorld. Unpublished document. Sony Corporation.
EditWorld (1999b). Human Factors and the Design of an Enhanced GUI to Support Post-Production Tasks. In S.Brewster, A.Casey, and G.Cockton (eds), Human-Computer Interaction INTERACT ’99 (Volume II).
Lim, K.Y., and Long, J.B. (1994). The MUSE Method for Usability Engineering. Cambridge University Press: Cambridge.
Shneiderman, B. (1998). Designing the User Interface: Strategies for Effective Human-Computer Interaction (Third Edition). Addison Wesley Longman Inc.
OLDER ADULTS’ USE OF PUBLIC TECHNOLOGY
Mary Sheard1, Jan Noyes1 & Tim Perfect2
1University of Bristol, Department of Experimental Psychology, 8 Woodland Road, Bristol BS8 1TN, UK
2University of Plymouth, Department of Psychology, Drake Circus, Plymouth, Devon PL4 8AA, UK
In Western society, the proportion of the population aged over 60 is on the increase, and is expected to continue growing for the foreseeable future. The use of computers, and the necessity to interact with them, is becoming more widespread as they are employed for a wide variety of public access functions from cashpoint machines to library databases. The study reported here aims to examine the performance of older adults in their interactions with a computer-based public library database. Older and younger adults were trained to use the system and asked to carry out a series of searches for a variety of library material. As expected, older adults were slower in carrying out these searches, and also made more errors than younger adults. It is hoped that the findings from this study can be applied to improve the design of public access technology for older adult users.

Introduction

In 1996 the number of people aged over 65 years in the UK population was 9.3 million, an increase of nearly 50% since 1961. Of these people, 1.1 million were aged over 85 in 1996, nearly a threefold increase since 1961. It is projected that by 2021 there will be 12 million people aged over 65 in the UK (source: Social Trends 28). Computer technology is becoming more widely available in the home, at work and for public access functions, such as cashpoint machines and library computer databases. In 1998, 2997 of 4607 public library service points were using online public access catalogues (OPACs), and in many cases this technology is replacing, rather than supplementing, existing methods of information retrieval (source: Public Library Statistics 1998-9). It is important that people can access the information with minimal training and computer experience. Older adults may find themselves at a disadvantage when learning about and using computers, due to the physical and cognitive changes that occur with ageing.
Declines in perceptual processing, memory, problem solving skills and psychomotor co-ordination, coupled with a general cognitive slowing and resource reduction, may disrupt both learning and general performance when using computers. Stereotypes of older adults describe them as being unwilling to use technology, although this view is unsupported in the literature. For example, Dyck and Smither (1994) found that older adults’ attitudes towards computers were more positive than those of younger adults, although this was not mirrored in their confidence in their own ability to use computers. Mead et al. (submitted) suggest that computer software design changes may improve the usability of OPACs for older adults. They asked novice library database users (young and old) to carry out a series of
OPAC searches following a basic training course. Young users were found to be able to use the system successfully to carry out searches, although their underlying understanding of it was low. A comparison of the older and younger adults’ performance found older adults to be significantly less able to access information held on the system, especially when advanced searching, e.g. the use of Boolean operators, was required. Joyce (1989) found that older adult computer users had particular difficulties with the use of function keys relative to menus. This is suggested to be due to higher working memory loads in the former system. The system examined in this study required the use of both types of interface, and it was predicted that a similar pattern of results would be replicated here.

Method

Design and participants

In this study, older and younger adults were trained in how to use a library database, and then carried out twelve searches. Their performance was examined for its speed, response accuracy, and errors. There were 10 participants in each of the two age groups. Older participants had a mean age of 74.9 years (range=63–82) and younger participants had a mean age of 24.8 years (range=21–36). All were regular library users. The older group had a mean error score on the National Adult Reading Test (NART) of 10, and the younger group of 14.7. This is indicative of a higher mean intelligence for the older group. Two members of the ‘old’ condition and one member of the ‘young’ condition had used the OPAC under examination before, although none of these judged themselves to be experienced users. As such, all participants were judged to be novice OPAC users. Two additional members of the older adult condition were excluded from the study. The first of these did not meet the training criterion. The second met the training criterion, but took over an hour to do so, leaving insufficient time to complete the search tasks.
Materials

This study was carried out using an OPAC in a local library. The system used is a library catalogue system on general release, which can be found in over 100 library services in the UK, including university, college and local authority libraries. It is customised for each client to suit the particular needs of their library service. The system is operated using a custom-made keyboard interface with a standard QWERTY layout and a monochromatic text display. The system is menu driven and uses the function keys, cursors and enter key to select options and carry out searches.

Procedure

First the NART was administered using the standard procedure, followed by completion of the pre-test questionnaire. In the training, all five search types required for the search tasks were carried out by the participant, and all key functions that would be required were explained and practised. After five prespecified search examples had been completed, questions were asked from the training criterion sheet. All had to be answered correctly prior to commencement of the search tasks. If the criteria were not met, further training searches were carried out. These were not standardised; rather, they aimed to revise those functions that the participant had not understood to a satisfactory level. The time taken for the participant to reach the training criterion was recorded.
The search task consisted of twelve searches carried out by all participants in the same order. All twelve differed from each other on some features, including what search type would need to be used, e.g. author or keyword, and what information was required, e.g. the full title, publication date or branches holding copies. Each question was read to the participant from a card, which was then placed on the table beside the keyboard to be referred to as required. Each search was subject to a three-and-a-half minute time limit. This was in place to ensure that a participant who was not making progress on a question did not spend a long time working on it. If at the time limit the participant was judged to be close to the correct answer they were permitted to carry on working either to completion, or until they made negative progress. This part of the experiment was video recorded for later analysis.

Results

Time and accuracy data

There was a significant difference (F=11.411, df=18, p=0.003) between the ‘old’ (mean=32.1 minutes) and ‘young’ (mean=10.2 minutes) participants on the time taken to meet the training criterion. The first time measure taken from the search tasks was the time for the participant to select a search type. This was taken as being the time between hearing the question and the participant reaching a search screen with a typing prompt. There was a significant difference between the young and old participants for 10 of the 12 questions, with the older participants taking longer. No significant differences were found for the remaining two questions. The second time measure was the time taken for the participants to complete the searches, regardless of whether their responses were correct. Here again the older participants were significantly slower than the younger participants on 11 of the 12 questions. The accuracy of responses was significantly different between the two age groups (t=2.5, df=20, p=0.02). Members of the ‘young’ group reached the correct answer in 107 out of a possible 120 searches while members of the ‘old’ group did so in 81 searches.

Error classification

The errors made by the participants in the search tasks were categorised into eight classes. Some errors fell into more than one class. A typing error (T) refers only to errors made in typing the search term, not to any other keyboard error. Wrong key presses (W) have been classified separately, and include inappropriate choices of function keys (including enter and cursors). Mistakes (M) are taken as being errors due to problems in the planning stage of the solution, such as the selection of an inappropriate search type or strategy. If the correct response to the search question, or information clearly indicating where it is to be found, was on the screen, but the participant did not indicate and act on recognition, this has been classified as error type B. Menu errors (Me) are where the participant did not make correct use of the pull down menu system in the software. An ‘inadvertent’ error (I) is made where the participant made an error due to an ambiguity in the system, or in the question. If any type of error was repeated during one trial where its first use was not fruitful, this is classified as a perseveration (P). The final error classification is a non-optimum action (NO). This is taken as being when the participant did not use all the information available to them, and thus reached the correct answer, but by a less efficient route. Also classified in the qualitative analysis are verbal prompts from the experimenter to the participant, including both those prompted by a question, and those initiated by the experimenter.
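The eight error classes described above, together with the rule that a single error may fall into more than one class, can be sketched as a small tagging scheme. The Python below is only an illustrative sketch: the class codes (T, W, M, B, Me, I, P, NO) come from the text, but the names `ErrorClass` and `tally` and the sample observations are hypothetical and are not data from the study.

```python
from collections import Counter
from enum import Flag, auto


class ErrorClass(Flag):
    """Error classes from the text; Flag lets one error carry several classes."""
    T = auto()    # typing error in the search term
    W = auto()    # wrong key press (function keys, enter, cursors)
    M = auto()    # mistake in planning (wrong search type or strategy)
    B = auto()    # answer on screen but not recognised or acted on
    ME = auto()   # menu error ("Me" in the text)
    I = auto()    # 'inadvertent' error caused by an ambiguity
    P = auto()    # perseveration: repeating an unfruitful error
    NO = auto()   # non-optimum action: correct answer, inefficient route


def tally(observations):
    """Count how often each single class occurs across coded errors."""
    counts = Counter()
    for coded in observations:
        for cls in ErrorClass:
            if cls in coded:
                counts[cls.name] += 1
    return counts


# Hypothetical coded errors for one participant (not data from the study).
sample = [
    ErrorClass.W,
    ErrorClass.M | ErrorClass.P,   # a repeated planning mistake
    ErrorClass.T,
    ErrorClass.W | ErrorClass.P,   # a repeated wrong key press
]
print(tally(sample))
```

Because an error can carry several classes at once, the per-class counts may sum to more than the number of errors observed, which is worth bearing in mind when comparing class frequencies between age groups.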
Figure 1. Graph showing occurrence of the error categories by age
It was found that inadvertent and typing errors did not occur with significantly different frequencies between the two age conditions. In each of the remaining categories, older adults’ performance was significantly poorer.

Discussion

This study examined the performance of older and younger adult novice users in using an OPAC. Older adults’ overall performance on the tasks was poorer than that of younger adults, with more time being taken for searches. This is as predicted by the cognitive ageing literature, where a general cognitive slowing with ageing has been observed. Studies concerning older adults’ use of computers have shown them to make more errors than younger adults, and to have particular difficulties with certain aspects of computer operation. This is confirmed here. The NART scores showed the older participants to have a higher mean intelligence score than the younger participants. As such, any poorer performance in the older adult group cannot be attributed to their intelligence. Younger adults were more accurate in their responses to the search questions, and took less time to carry out the searches. These differences are explained in part by the error analyses discussed below. The effects observed here are consistent with the literature. When carrying out the searches the older participants made more errors in total, as well as requiring more verbal prompting from the experimenter. Typing mistakes and ‘inadvertent’ errors were the only two types which occurred with equal frequency in both age groups. The typing errors appeared to be made under different circumstances. Many of the younger participants were highly familiar with the keyboard layout and worked very quickly, with typing errors arising as a result of haste. The older participants were less familiar with the keyboard and not confident using it, and made errors in typing for this reason. Inadvertent errors, by definition, are to be expected to be equally common in both age groups.
However, some perseveration of errors occurred in relation to inadvertent errors, and these perseverations were significantly more frequent for older than for younger adults. The most commonly occurring type of error for both age groups was mistakes. These errors occurred in planning how to carry out the search, and were more frequent for the older adults. This may be rectifiable either by giving more training prior to using the system, or by providing more and higher quality on screen help. The former is not an appropriate solution for most public access systems, as many users will be occasional and not wish to spend a long time learning about the system (Rowley and Slack, 1998). The latter would therefore be more appropriate in this case. This could also help in the reduction of inadvertent errors.
Errors in using the menu system occurred only in the older age condition, and accounted for less than three percent of the older adults’ errors. Problems with using the function keys accounted for sixteen percent of older adults’ errors. These data confirm the findings of Joyce (1989) and suggest that a system that does not involve function keys would be more accessible to older adults. Public access technology should aim to rely minimally on training its users, instead utilising good design in facilitating successful use. This study has highlighted issues in the use of public access technology by older adults. The use of function keys is a hindrance to older adult users in particular and should be avoided in favour of a menu based interface. In addition to this, clearer and more detailed on screen instructions and information about system functioning would be an appropriate means of assisting users in planning and carrying out searches. This would be of particular use in conjunction with error messages to suggest alternative courses of action to the system user.

Acknowledgements

This research was supported by the Research into Ageing and AgeNet joint studentship number 187s.

References

Dyck J.L. and Smither J.A. 1994, Age differences in computer anxiety: the role of computer experience, gender and education, Journal of Educational Computing Research, 10, 239–248
Joyce B.J. 1989, Identifying differences in learning to use a text-editor: the role of menu structure and learner characteristics, unpublished Masters thesis submitted to the State University of New York at Buffalo
Mead S.E., Sit R.A., Rogers W.A., Jamieson B.A. and Rousseau G.K. 1998 (submitted), Influences of general computer experience and age on library database search performance, Behaviour and Information Technology
Office for National Statistics, 1998, Social Trends 28, (The Stationery Office, London)
Public Library Statistics 1998–1999: Actuals, (Chartered Institute of Public Finance and Accountancy, London)
Rowley J. and Slack F. 1998, Designing Public Access Systems, (Gower Press, Aldershot)
AN ASSESSMENT OF THE RATIONALE & EFFECTIVENESS OF ACCELERATOR KEYS IN COMPUTER APPLICATIONS
Cyprian Cin Howe Wong & Kee Yong Lim
Design Research Centre, School of Mechanical and Production Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798
This paper examines the provision of accelerator keys in popular computer applications on both PC and Macintosh platforms. A total of fourteen versions of six applications were investigated. The accelerator keys were categorised into specific groups to facilitate user testing. Two tests were conducted: a questionnaire test to evaluate users’ ability to recall and recognise accelerator keys, and a hands-on performance test. The results of the questionnaire tests showed that users were better at recognising accelerator keys than at recalling them. However, their performance on both tests was rather poor. Users recognised and used mainly generic accelerator keys, and appeared reluctant to release the mouse. From the test results, it is recommended that, other than generic accelerator keys, the number of accelerator keys provided should be minimised.

Background

The evolution of the computer is nothing short of amazing. Computers of yesteryear filled a large room and needed a team of trained technicians to make them work. Advances in the microchip industry and the Microsoft Disk Operating System (DOS) made computers more accessible to the masses. DOS was useful and powerful, but its text entry system was cumbersome and proved to be quite intimidating for the novice user. This situation was rectified in large part by the advent of the Graphical User Interface (GUI), which made its debut in the early 1980s and was popularised by Apple Computer’s Macintosh Operating System (OS). GUIs appealed to users as they did not require them to learn a set of commands to use the computer. However, GUIs also have limitations. They do not have the flexibility of a command-based operating system. For instance, users of GUI systems have to navigate through several layers of menus before accessing the function they desire. In contrast, users of command-based applications need only enter a command line to access the same function.
To give users more flexibility, GUI programmers thus began to include accelerator keys for certain functions. Since then, such keys have proliferated. Some applications have up to fifty visible accelerator keys. In many cases, accelerator keys were assigned poorly relative to the functions to which they were meant to facilitate access, e.g. by failing to provide users with meaningful mnemonics. It is unlikely then that users will remember all fifty of the accelerator keys, much less use them. Any attempt to use the keys would add undesirably to the cognitive workload of the user. Thus, time and money are wasted in encoding these accelerator keys into the application (Lim, 1996).
To assess the extent of difficulties encountered by users, a series of tests was conducted to gauge users’ knowledge of accelerator keys within and across common applications, and across application versions, vendors, and platforms. It is hoped that the data gathered in the study will inform and motivate programmers to apply a more rational basis for the assignment and inclusion of accelerator keys in future applications.

A taxonomy of accelerator key groups

Many OS companies have proposed guidelines for assigning accelerator keys. For instance, the Open Software Foundation (OSF) has published styleguides to help programmers design applications, and these include guidelines on assigning accelerator keys. Similarly, companies such as Apple Computer and Microsoft have defined a list of generic accelerator keys for common functions in their styleguides (Apple Computer, 1984; Microsoft Corporation, 1995). These styleguides are usually applied to ensure a consistent “look and feel” for an application on a specific platform (Apple Computer, 1992, 1997). However, they do not give programmers sufficiently specific instructions on how to assign accelerator keys. They specify only that an assigned accelerator key should not be an existing generic accelerator key, should not be the combination of a standard key and the Shift modifier key (e.g. Shift P or Shift Semi-Colon), and should not include a combination which may activate a system level function (e.g. Control Alternate Delete). To aggravate matters, studies have shown that programmers do not follow even these guidelines closely when assigning accelerator keys (see Tetzlaff and Schwartz, 1991).

Generic accelerator keys are not the only keys found in modern computer applications. Accelerator keys can be divided into the following categories:

• Generic accelerator keys. These accelerator keys are predefined by the OS company and can be found in all applications on the platform, for example “Control X” for “Cut” and “Command V” for “Paste”.
• Alphabetic accelerator keys. Accelerator keys which have one or more modifier keys (e.g. Command or Alternate) and an alphabet key (e.g. A, G or Y) are considered alphabetic accelerator keys. They can be divided into simple alphabetic accelerator keys (e.g. “Control H” for “Replace”) and complex alphabetic accelerator keys (e.g. “Shift Command S” for “Save As…”).
• Function accelerator keys. Function accelerator keys combine one or more modifier keys with a non-alphabetic key (e.g. Semi-Colon, 5, F7). They can be divided into simple function accelerator keys (e.g. “F7” for “Spell Check”) and complex function accelerator keys (e.g. “Command Control Semi-Colon” for “Lock Guides”).

This categorisation of accelerator keys was applied in the present study.

Details of the study

Fourteen versions of six popular applications were investigated. The applications were divided into word processing and graphic applications, across the Windows and Macintosh platforms. The applications investigated are listed in Table 1. PC systems were operating under Windows 3.1 and 95, and Macintosh systems were operating on System 7.5. Since the availability of subjects varied across these applications, the number of subjects per application ranged from 10 to 50. Subjects were recruited largely from tertiary institutions.
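The taxonomy above can be sketched as a small classifier. This is an illustrative sketch only, not the procedure used in the study. It assumes that “simple” means at most one modifier key and “complex” means two or more (an inference from the examples given, since the text does not state the boundary), and the GENERIC set is a hypothetical stand-in for a platform's published list.

```python
# Modifier keys named in the text; treated here as the full set (an assumption).
MODIFIERS = frozenset({"Control", "Command", "Alternate", "Shift"})

# Hypothetical generic list for illustration; a real list would come from
# the platform styleguide.
GENERIC = frozenset({
    frozenset({"Control", "X"}),   # "Cut"
    frozenset({"Command", "V"}),   # "Paste"
})


def classify(combo: str) -> str:
    """Classify a space-separated key combination, e.g. 'Shift Command S'."""
    keys = combo.split()
    modifiers = [k for k in keys if k in MODIFIERS]
    others = [k for k in keys if k not in MODIFIERS]
    if len(others) != 1:
        raise ValueError("expected exactly one non-modifier key")
    if frozenset(keys) in GENERIC:
        return "generic"
    base = others[0]
    # A single letter is an alphabet key; anything else (5, F7, Semi-Colon)
    # is treated as a non-alphabetic key.
    kind = "alphabetic" if len(base) == 1 and base.isalpha() else "function"
    # Assumed boundary: zero or one modifier is simple, two or more is complex.
    complexity = "simple" if len(modifiers) <= 1 else "complex"
    return f"{complexity} {kind}"


for example in ("Control X", "Control H", "Shift Command S",
                "F7", "Command Control Semi-Colon"):
    print(example, "->", classify(example))
```

Checking the generic list first reflects the styleguide rule that generic keys are reserved; the remaining combinations are then separated only by key type and number of modifiers.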
Table 1. List of applications surveyed
The number of accelerator keys in each successive version of an application was charted using the accelerator key categories described above. This survey of applications was intended to uncover the industry trend in the provision of accelerator keys.

For user testing, subjects were asked to complete a questionnaire test and a hands-on performance test. The questionnaire test was divided into two sections to assess the subjects’ ability to recall and to recognise accelerator keys. A summary of the scope of the questionnaire tests follows:

• Basic test. This basic questionnaire test was intended to assess the users’ ability to recall and recognise accelerator keys for the applications with which they were most familiar.
• Cross version test. This test was included to assess the subjects’ knowledge of accelerator keys across different versions of the same application (e.g. between MS Word 97 and MS Word 7 for Windows 95). This was done to determine the extent of difficulty users might experience when upgrading to a newer version of the application.
• Cross platform test. This test was conducted to assess subjects’ knowledge of accelerator keys for the same application offered across different platforms (e.g. MS Word 97 for Windows and MS Word 6.0 for Macintosh). It was intended to determine the extent of users’ confusion when they try to use accelerator keys across different platforms.

The hands-on performance test was conducted for all the applications listed in Table 1. In the test, subjects were required to complete a set of tasks which they were likely to perform when using the application. The tasks could be performed in two or three ways, namely via pull-down menus, accelerator keys and, in some cases, the toolbar. The means of interaction used by the test subjects were observed and recorded.
These observations were collated to determine user preference (if any) for the menu bar, toolbar or accelerator keys, notwithstanding their knowledge of the accelerator keys required to perform the task.

Results of the study

Application Review

For most of the application versions surveyed, the number of accelerator keys in each successive version decreased. However, there were a few exceptions. For example, the number of accelerator keys for Adobe Photoshop for the Macintosh increased steadily from versions 2 to 4. In many of the applications, generic accelerator keys made up the bulk of the accelerator keys provided, in accordance with the recommendation by styleguides. However, in the case of Photoshop, there were more simple and complex alphabetic accelerator keys than in any other category of accelerator keys. This is a worrying trend when one considers the results of the questionnaire and hands-on tests (see later).
Results of the Questionnaire Tests

The results of the basic questionnaire test showed that subjects recognised more accelerator keys than they could recall. The majority of the accelerator keys recognised in the test were generic accelerator keys. However, the general performance of the subjects in the test was surprisingly poor for all categories of accelerator keys. Less than 5% of non-generic accelerator keys were recognised correctly in the test, as opposed to about 20% for generic accelerator keys. Macintosh users generally performed better in the test than their PC counterparts. This could be due to Apple’s strict guidelines for consistency across all applications running under the Macintosh platform.

For the cross application version test, the results also showed that subjects recognised generic accelerator keys more than any other category of accelerator keys. As before, the subjects were better at recognising accelerator keys than they were at recalling them. However, the percentage of correct responses was generally lower than that of the basic test for all accelerator key groups. When interviewed, the subjects revealed that they were uncertain whether the accelerator keys remained unchanged across different versions of the application. In this test, Macintosh users again performed significantly better than their PC counterparts.

For the cross platform test, the results were rather discouraging. The subjects performed very poorly. Specifically, they were only able to recognise a small number of generic accelerator keys, and hardly any from the other categories of accelerator keys. The recall section of the test elicited almost no response for all categories of accelerator keys. One reason for this poor result could be the use of different modifier keys across the platforms. In particular, the primary modifier key for PC systems is the “Control” key, while for the Macintosh it is the “Command” key.
Another reason, other than the different modifier keys, might be that few of the subjects used both platforms on a regular basis.

In general, the subjects were able to recognise and recall generic accelerator keys significantly better than any other category of accelerator keys. From the poor performance with the other categories of accelerator keys, it may be concluded that the trend of providing an increasingly large number of alphabetic and function accelerator keys might prove to be misguided and a waste of programmers’ time and effort.

Results of the Hands-On Performance Tests

In the hands-on performance test, subjects were observed to prefer the toolbar to pull-down menus and accelerator keys. On average, the subjects used accelerator keys less than 10% of the time. This observation may be expected considering the poor performance of the subjects in recalling accelerator keys. As before, subjects also used more generic accelerator keys than any other category of accelerator keys. This observation was consistent with the results of the questionnaire tests. Macintosh users again performed better in this test than PC users.

For reasons of space, a more detailed account of subject performance in the tests above cannot be given in this paper. Such an account will be reported at a later date.

Conclusions and design recommendations

Generic accelerator keys are an integral part of any computer application. Since the introduction of GUIs, OS vendors such as Apple and Microsoft have emphasised the need for a consistent group of accelerator keys which users can access regardless of application class, version or vendor. In this respect, the better performance by Macintosh users in the tests indicated indirectly that Apple’s strict guidelines for consistent assignment of generic accelerator keys across application class and versions may have been successful in encouraging greater user awareness of the keys.
As for the hands-on test, the low usage of accelerator keys may be attributed to a number of factors. First, subjects might not have known which keys to use, as their performance in the recall section of the questionnaire test was rather poor. Since the use of accelerator keys involved recall in many cases, it was to be expected that few of the keys would be used. Second, the provision of a constantly visible toolbar made it unnecessary for users to remember the accelerator keys. Further, interaction with a toolbar was far more convenient and attractive than typing in accelerator key combinations. A third possible explanation of the results of the hands-on tests was that, in some cases, accelerator key input required subjects to use both hands. This requirement would mean that they would have to release the mouse, which was their preferred means of interaction. In these cases, subjects would have shied away from using accelerator keys. Finally, subjects might prefer not to use accelerator keys because the need to recall them from memory might draw on cognitive resources (minor or otherwise) which would otherwise be used to perform their actual task. In this respect, the keys might reduce the transparency of the user interface and so interfere with the subjects’ performance of their task.

To ensure the usability of an application, programmers should take note of the findings reported here. In particular, they should provide fewer accelerator keys and limit them to functions which are frequently used. In addition, since subjects appeared unwilling to release the mouse, it might be better to assign accelerator keys which can be activated by one hand.

References

Apple Computer Inc. 1984, Apple Human Interface Guidelines, Apple Computer.
Apple Computer Inc. 1992, Macintosh Human Interface Guidelines, Addison-Wesley Publications. Available from http://developer.apple.com/techpubs/mac/pdf
Apple Computer Inc. 1997, Macintosh OS8 Human Interface Guidelines, Technical Publications. Available from http://developer.apple.com/techpubs/mac/pdf
Lim, K.Y. 1996, Command/Shortcut Keys in WIMP User Interfaces: A Lost Cause? In Human Computer Interaction, Interact 1997, Chapman and Hall Publishers.
Microsoft Corp. 1995, The Windows Interface Guidelines, A Guide for Designing Software, Microsoft Corporation.
OSF/Motif. 1993, Style Guides, Open Software Foundation, PTR, Prentice Hall, New Jersey.
Tetzlaff, L. and Schwartz, D.R. 1991, Use of Guidelines in Interface Design. In CHI ’91 Conference Proceedings Reaching Through Technology, ACM Press.
CAN SOUND OUTPUT ENHANCE GRAPHICAL COMPUTER INTERFACES?

W.Morrissey & M.Zajicek

Speech Project, School of Computing and Mathematical Sciences, Oxford Brookes University, Gipsy Lane, Headington, Oxford OX3 0BP
Tel: +44 1865 483709 email: [email protected]
This paper is concerned with enhancing a graphical computer interface with the use of non-speech sound to facilitate browsing the World Wide Web for visually impaired users. We are using a web browser called BrookesTalk, which uses synthetic speech to read out a web page, as the platform for our experimentation. We are currently undertaking experimentation utilising non-speech sound to help users navigate the web more easily. An evaluation was conducted by testing sonically enhanced navigation buttons. Experimental results showed that sound enhanced buttons improved usability and decreased task times. Most visually impaired participants felt that the use of non-speech sound helped them to navigate back and forth through pages on the World Wide Web more quickly and confidently.

Introduction

With the emergence of multimedia computer interfaces there is considerable interest in exploiting the use of sound to increase the ‘bandwidth’ of computer output. Bandwidth here describes the amount of information that can be retrieved from a computer. This includes the graphical user interface structure, icons, text, graphics, speech and non-speech sound. The modalities utilising sound are of particular interest for blind and visually impaired users.

BrookesTalk

BrookesTalk is a speech output web browser for blind and visually impaired users, which facilitates searching the World Wide Web for information (details on http://www.brookes.ac.uk/speech). It provides functions which enable the user to scan web pages in the same way that sighted users do. It reads out the web page using speech synthesis in word, sentence or paragraph mode, and ‘quick’ views of web pages are provided using information retrieval and natural language processing techniques (Zajicek et al, 1999). BrookesTalk is function key driven for blind and visually impaired users, providing accessibility via the keyboard.
It also provides a configurable large text window for partially sighted users, which displays synthetic speech output as text, and a standard visual browser so that visually impaired users can work together with sighted users.

The research under discussion in this paper is that of enhancing the BrookesTalk multimodal interface with the use of non-speech sound to complement the speech output. To put this in context, users of standard graphical computer interfaces require keen eyesight to read small type, a good memory to remember
sequences of commands and the ability to control a mouse to make high-precision clicks on small interaction objects. The prerequisite abilities to perform such tasks exclude a significant proportion of the population from computer use, including the World Wide Web, through visual impairment or physical impairment. In addition, speech output provides information in the form of text strings, which may be of considerable length, requiring the user to listen to a whole sentence to elicit the information required. We are seeking to enhance the usability of BrookesTalk with non-speech sound to provide feedback to the user on the successful execution of mouse button clicks to navigate back and forth through web pages.

Background

We investigated current research in the area of auditory interfaces. We are interested in the use of accessory sound. This could take one of the following forms:

• Auditory Icons: ‘Real World’ sounds
• Earcons: abstract sounds

Auditory Icons

Sighted users often supplement the graphical user interface with sounds that represent certain events carried out by their computer, for example, emptying the recycle bin (which sounds like paper being screwed up and thrown away). These sounds become familiar relatively quickly because they tend to sound similar to the action they represent and are often accompanied by a visual prop (warning message or dialogue box). This is intuitive, but does not work if there is no sound which is representative of a computer operation. Myra Bussemakers (Bussemakers et al, 1999) has undertaken studies using a categorisation paradigm to study the multimodal integration processes that take place when working with an interface. Her results showed that integrating sound output does not always lead to faster responses. However, it seems that the type of sound and its congruency with visual information can influence its usability.
She has also reported that the use of accessory sound can actually decrease reaction times of subjects carrying out tasks, and that it can cause annoyance to some users.

Earcons

Earcons are defined as “non-verbal audio messages that are used in the computer/user interface to provide information to the user about some computer object, operation or interaction” (Blattner et al, 1989). Stephen Brewster (Brewster et al, 1995) has successfully implemented earcons for the sonification of specific interface elements, such as graphical buttons and scrollbars. He has also tested the use of sound on a Personal Digital Assistant (PDA) to help users to select buttons accurately and has again found favourable results (Brewster, 1999).

Incorporating non-speech sound into BrookesTalk

The use of sound in the computer interface must be judicious, since the aim is to provide a mechanism for improving usability and the user’s performance. It has been recognised that non-speech sound can provide a source of output that can either help or hinder users’ performance.
Figure 1. BrookesTalk screen showing sonically enhanced navigation buttons
Members of the Speech Project are currently working on a new version of BrookesTalk, which integrates the use of non-speech sound with synthetic speech. We have piloted the sonification of the browser navigation buttons, as shown in Figure 1. When using graphical buttons the mouse cursor may accidentally slip off the edge during the process of selection, either whilst pressing down the mouse button or by moving the mouse before selection. In either case the feedback given to the user is the same as if the graphical button had been successfully selected. The sounds added to the buttons provide feedback as to whether or not the button is selected. The sound was incorporated in the form of earcons using Stephen Brewster’s Audio Buttons as a model. Three different sounds were implemented to denote three different states, as follows:

• Over Button Sound: confirms to the user that the mouse cursor is over the button
• Button Down Sound: indicates that the button is being pressed down
• Selection Sound: confirms that the button has been successfully selected

Experimentation

Our experimentation seeks to find out whether the addition of non-speech sound helps users, firstly, to execute tasks more quickly and, secondly, to navigate the web with greater confidence.

Method

Twelve subjects, aged between 33 and 65, volunteered to undertake the experiment. The subjects were evaluated in respect of their sight capacity and IT experience. Most subjects considered themselves to be novice users. Two subjects had normal sight whilst wearing spectacles; the remaining subjects wore glasses
but still considered their sight to be ‘poor’ when looking at a graphical computer interface. None had used BrookesTalk before. Subjects were first trained to use BrookesTalk and were informed of what the three different sounds signified when using the sonically enhanced version. Each subject carried out the experiment under the two conditions, the order of which was randomised:

a) without sonically enhanced buttons
b) with sonically enhanced buttons

Subjects were given their instructions to undertake the experiment. They were asked to visit a number of pages on the site in a given order, using the mouse to click on the links, which were all placed on the left hand side of the pages (in the form of a table designed to imitate the style of a framed page). Having reached the terminal point of the tour, subjects were reminded of what was expected of them and were told that they would be timed for the next part of the experiment. Subjects were then asked to navigate through the visited pages in a given order using the ‘back’ and ‘forward’ buttons as appropriate. The process involved a minimum of 21 button depressions to complete the navigation exercise. Timings and errors were recorded. On completion of the physical experiment, subjects were asked whether or not they found the sound enhancement helpful or annoying and, if so, why.

Results and analysis

The data are paired, allowing an analysis of the difference in times to be undertaken. The distribution of differences was successfully tested for normality, thus validating the use of a paired t-test. Summary data are shown in Table 1.

Table 1. Summary of Results
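The paired analysis described above works on the per-subject difference between the two conditions. The following is a minimal sketch of that calculation, not the authors' analysis; the timing values are hypothetical and are not the experimental data. It returns the mean difference, the t statistic and the degrees of freedom; the P value would then be read from the t distribution with a statistics package or table.

```python
import math
from statistics import mean, stdev


def paired_t(condition_a, condition_b):
    """Paired t statistic for two equal-length lists of per-subject times."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_error = stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / std_error, n - 1


# Hypothetical task times in seconds for four subjects (illustration only).
without_sound = [10, 12, 14, 16]
with_sound = [9, 10, 13, 13]
mean_diff, t_stat, df = paired_t(without_sound, with_sound)
print(f"mean difference = {mean_diff:.2f} s, t = {t_stat:.3f}, df = {df}")
```

Because each subject acts as their own control, only the spread of the differences enters the standard error, which is why the normality check is applied to the differences and not to the raw times.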
In the sonified condition the subjects navigated through the web pages, on average, significantly more quickly than in the non-sonified condition (one-tailed hypothesis, P=0.046). The difference between means was 8.42 seconds, which represents an 8.98% improvement.

Conclusion

This study has shown that there is a statistically significant improvement in performance when using sonically enhanced buttons, in line with Stephen Brewster’s findings (Brewster et al, 1995). Questioning our subjects about whether they found the sound enhancement helpful yielded positive results. Most subjects said they found the sonification helpful, particularly the ‘Selection Sound’, as it confirmed that they had successfully clicked the graphical button. Most said they did not find the sound enhancements annoying since the ‘Over Button Sound’ and ‘Button Down Sound’ were at a low volume
(probably too low, if there were other background sounds in the room). Some subjects felt that it would be more useful if the back and forward buttons were allocated different sounds; one subject suggested a ‘vehicle reversing’ type sound for going backwards. A few of the subjects who found it more difficult to see objects on the screen found it difficult to distinguish between the back and forward buttons and would have preferred them to be bigger. In addition, it was interesting to note that some users employed the sound enhancement to adopt strategic techniques for navigating back and forth through the web pages. For example, users found they could easily navigate to the ‘home’ page by repeatedly selecting the back button until the sound was disabled, thereby indicating that ‘home’ had been reached, without having to wait for the pages to load.

Future Work

We are ultimately aiming to design an interface for blind and visually impaired users which provides the user with feedback and produces an application that is easy to use and learn, thereby helping users to build up a conceptual model of the World Wide Web (Zajicek et al, 1999). Further work will involve the investigation of integrating non-speech sound to improve usability of the navigation buttons and function keys. In addition, non-speech sound could be integrated to enable users to orientate themselves whilst navigating the web.

References

Blattner, M., Sumikawa, D. & Greenberg, R. 1989, Earcons and icons: their structure and common design principles. Human Computer Interaction, 4(1), 11–44
Brewster, S.A. 1994, Providing a structured method for integrating Non-Speech Audio into HCI. Doctoral Dissertation, University of York
Brewster, S.A., Wright, P.C., Dix, A.J. & Edwards, A.D.N. 1995, The sonic enhancement of graphical buttons. In K.Nordby, P.Helmersen, D.Gilmore & S.Arnesen (eds.) Proceedings of Interact ’95, Lillehammer, Norway: (Chapman & Hall), 43–48
Brewster, S.A. 1999, Sound in the Interface to a Mobile Computer. In H-J Bullinger, J Ziegler (eds.) Human Computer Interaction: Ergonomics and User Interfaces, 2, (Lawrence Erlbaum Associates, London), 43–47
Bussemakers, M.P., de Haan, A.E. and Lemmens, P.M.C. 1999, The effect of auditory accessory stimuli on picture categorisation; implications for interface design. In H-J Bullinger, J Ziegler (eds.) Human Computer Interaction: Ergonomics and User Interfaces, 1, (Lawrence Erlbaum Associates, London), 436–440
Zajicek, M., Powell, C. and Reeves, C. 1999, Evaluation of a World Wide Web scanning interface for blind and visually impaired users. In H-J Bullinger, J Ziegler (eds.) Human Computer Interaction: Communication, Cooperation, and Application Design, 2, (Lawrence Erlbaum Associates, London), 980–984
ADAPTIVE AUTOMATION: WHO HAS CONTROL?

I.R.Craig, S.G.Russell & E.K.Flood

Defence Evaluation and Research Agency (DERA), Centre for Human Sciences (CHS), Farnborough, Hampshire, GU14 0LX, United Kingdom
This paper outlines an experiment to investigate the use of Direct Voice Input (DVI) for control of an adaptive automation system through forced switching of automation of a tracking task. A within-subjects design was used with 12 participants. The experimental conditions consisted of automatic, DVI, part manual, and full manual. The root mean squared error deviations were significantly larger (p<0.001) during the initial 6 seconds in the automatic condition. The target acquisition times suggested that the part manual condition, during periods of manual control, produced significantly slower reaction times (p<0.007). The dials’ monitoring acquisition times indicated a significantly slower reaction time in the full manual condition (p<0.02). The DVI condition did not highlight any major disadvantages with any of the measures.

Introduction

The introduction of new technology will ideally reduce operator workload. Conversely, inadequate design of new technology may actually increase operator workload. An approach often taken to reduce workload is to automate those functions that can be automated and leave the remainder to the human operator. This approach may not be the most efficient, as an operator may be left with only the high workload tasks or with a purely monitoring role, which may lead to a loss of situational awareness and de-skilling. One way to address this problem is through adaptive automation. Adaptive automation exploits the advantages of both human and machine abilities through a strategy that allows changes in task allocation to occur. As described by Hancock and Scallen (1997), adaptive automation is when the control decisions concerning the onset, the offset, and the degree of automation are shared between the human and machine. However, when adaptive automation is in place it is important for the operator to remain aware of those tasks that are automated and those under manual control.
Control through Direct Voice Input (DVI) is one method that could be used to provide such awareness. Past research at DERA (Russell et al. 1999) has investigated the use of Direct Voice Output for feedback of an adaptive automation system. The aim of the present experiment was to investigate the use of DVI for control of an adaptive automation system. Most operator control in an adaptive automation system is through manual control. However, the use of DVI may provide an alternative. Direct Voice Input offers the advantage of utilising another modality for control to operators who are heavily loaded by manual tasks. The use of DVI will have particular relevance for time-critical events in current and future military systems.
Method

Participants

Twelve subjects took part in the experiment: 6 males (mean age 26 years, range 23–29) and 6 females (mean age 23 years, range 20–28), with an overall mean age of 24 years.

Apparatus

The ‘Strategic Task Adaptive: Ramifications For Interface Relocation Experimentation’ (STARFIRE) program was used to measure the Root Mean Squared Error (RMSE) of a flight task and reaction times to targets and dials’ monitoring. Scallen (1997) developed STARFIRE to measure the effects of adaptive automation with a program consisting of 3 typical flight tasks: tracking, target detection and dials’ monitoring. The program was run using an Indigo2 Silicon Graphics (SG) machine with keyboard and joystick. The DVI system used for the experiment was developed by the Defence Evaluation and Research Agency (DERA), Malvern.

Experimental design

The experiment was based on a within-subjects design. Of the STARFIRE tasks, only the tracking task was semi-automated. The semi-automation of the tracking task refers to periods of time when the tracking task is controlled by the system. Participants completed a practice session of all tasks, with frequent pauses for instructions from the experimenter. There were 4 experimental conditions for the tracking task:

(1) Part manual control: changes in automation status on the delivery of a signal involved pressing the ‘return’ key on the keyboard.
(2) DVI control: changes in automation status on the delivery of a signal involved speaking the word ‘return’.
(3) Automatic control: changes in automation were controlled by the system.
(4) Full manual: the tracking task was under manual control.

The reaction times for each automation change were measured for all participants for the DVI and part manual control conditions. Participants received all the experimental conditions in a random order with the following measurements recorded:
Procedure

Participants were given written and oral instructions before each condition. Each condition, except full manual, was separated into 10 randomly allocated 1-minute intervals, half under manual control (5 minutes in total) and half automated (5 minutes in total). The first minute of each condition was randomly assigned to begin either under manual or automatic control and continued to change each minute in a systematic pattern thereafter. Automation status was pre-programmed, with the participant responding to a signal (except in the full manual and automatic conditions) produced by the computer. The signal was one of two different computer generated noises, one to indicate a change from manual to automation, and another a change from automation to manual. For all conditions each minute contained two randomly allocated dials’ monitoring events and two targets. Participants were given 5 seconds to respond to a dials’ monitoring event or target. If no response
was made in the time allotted, the deviation was corrected or the target disappeared, and a miss was recorded. Deviations or targets did not appear during the first and last 10 seconds of each one-minute interval. The tracking task remained the same through all conditions.

Results

Reaction time for each automation change

For both the DVI and part manual conditions participants received either four or five changes from automation to manual or vice versa, depending on the starting automation position; nine changes in total were made. The data were analysed using a two-tailed paired t-test. A significant difference was found (t=7.188; df=11; p<0.001), with the part manual condition producing significantly faster reaction times than the DVI condition. There was, however, a lag in the computer of approximately one second with the DVI commands, and this would have affected the reliability of the result.

Initial root mean squared error deviations

The mean was taken of the initial sixteen seconds of root mean squared error (RMSE) deviations following a switch from automation to manual for three conditions (DVI, part manual, and automatic). The data were analysed using a repeated measures one-way Analysis of Variance (ANOVA). No significant result was found (p>0.05). A further two ANOVAs were undertaken to examine the periods between 1–6 seconds and 7–16 seconds. No significant result was recorded between 7–16 seconds, although a significant interaction (F=1.169; df=10,110; p<0.001) between condition and time was recorded for the period between 1–6 seconds. Figure 1 shows the distribution of the data, illustrating the higher RMSE deviations for the automatic condition compared with either the DVI or part manual condition.
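The 1–6 second and 7–16 second periods above can be illustrated with a windowed RMSE calculation. This is a sketch under stated assumptions: it supposes one tracking-error sample per second after the switch, which the paper does not specify, and the sample values are hypothetical. It is not a description of how STARFIRE computes its measures.

```python
import math


def rmse(errors):
    """Root mean squared error of a list of tracking deviations."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def windowed_rmse(errors_per_second, start, end):
    """RMSE over seconds start..end inclusive, counting the first second as 1."""
    return rmse(errors_per_second[start - 1:end])


# Hypothetical deviations for the 16 seconds after a switch to manual control
# (assumed one sample per second; illustration only).
samples = [2.0] * 6 + [1.0] * 10
print("seconds 1-6: ", windowed_rmse(samples, 1, 6))
print("seconds 7-16:", windowed_rmse(samples, 7, 16))
print("seconds 1-16:", windowed_rmse(samples, 1, 16))
```

Splitting the window in this way shows how a short-lived rise in error immediately after a switch can be diluted when the whole 16 seconds is summarised as a single value.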
Figure 1: Distribution of the data for Root Mean Squared Error (RMSE)
Figure 2: Mean target acquisition time (seconds) for all conditions
Figure 3: Mean dials’ monitoring acquisition time for all conditions
Target acquisition time

The mean was taken of target acquisition times for full manual and during manual and automation periods for all other conditions. Significant differences were recorded using a repeated measures one-way ANOVA (F=3.750; df=6,30; p<0.007). Post hoc analysis using pairwise comparisons indicated that the part manual condition, during manual periods, produced significantly slower reaction times than all other conditions. Figure 2 shows the mean and standard error of target acquisition time for all conditions.

Dials’ monitoring acquisition time

The mean was taken of dials’ monitoring acquisition times for full manual and during manual and automation periods for all other conditions. Significant differences were recorded using a repeated measures one-way ANOVA (F=4.443; df=6,30; p<0.02). Post hoc analysis using pairwise comparisons indicated significantly slower reaction times for the full manual condition compared to the automatic (automated) (p<0.009), automatic (manual) (p<0.047), DVI (automated) (p<0.028), and part manual (automated) (p<0.001) conditions. Figure 3 shows the mean and standard error of dials’ monitoring acquisition time for all conditions.
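A repeated measures one-way ANOVA of the kind reported above partitions the total variation into a condition component, a participant component and a residual. The sketch below shows that partition for illustration only. It is not the authors' analysis, the scores are hypothetical, and it does not compute the p value or apply any sphericity correction.

```python
from statistics import mean


def rm_anova(scores):
    """One-way repeated measures ANOVA.

    scores[i][j] is the score of participant i under condition j.
    Returns (F, df_condition, df_error).
    """
    n = len(scores)        # participants
    k = len(scores[0])     # conditions
    grand = mean(x for row in scores for x in row)
    cond_means = [mean(row[j] for row in scores) for j in range(k)]
    subj_means = [mean(row) for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Residual: what remains after removing condition and participant effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error


# Hypothetical acquisition times: 3 participants x 3 conditions (illustration only).
example = [[1, 2, 4],
           [2, 3, 4],
           [3, 4, 7]]
f_ratio, df_cond, df_error = rm_anova(example)
print(f"F = {f_ratio:.3f}; df = {df_cond},{df_error}")
```

Removing the participant component from the error term is what distinguishes the within-subjects design from a between-subjects ANOVA: consistent differences between individuals do not inflate the residual.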
Discussion The results from the experiment show that for reaction time, during automation changes, part manual control is faster than DVI control. This may be attributed to computer lag and it is envisaged that future DVI technology will improve sufficiently to reduce this concern. The results from the tracking task show that the automatic condition produced larger deviations during the initial 6 seconds following changes from automation to manual than the DVI and part manual conditions. This may be due to difficulties encountered by participants in transferring from automation to manual control without any warning or control of the system. Slower reaction times were generated for target acquisition in the part manual condition, during manual periods, suggesting that an increase in manual workload through the pressing of the return key on the keyboard may have inhibited performance. The extra task in the full manual condition may have contributed to slower reaction times for dials’ monitoring. Conclusion The DVI condition did not highlight any disadvantages with any of the measures, apart from reaction time to automation changes. It is therefore recommended that, if a dynamic form of allocation is used within a system, the operator is given control to change a task(s) using DVI, thus retaining situational awareness and helping to prevent de-skilling. References Hancock, P.A. and Scallen, S.F. 1997, The performance and workload effects of task re-location during automation. Displays, 17, pp 61–68. Russell, S.G., Craig, I.R. and Flood, E.K. 1999, Evaluation of Feedback Mechanisms for an Adaptive Automation System, poster presented at the Human Factors and Ergonomics Society 43rd Annual Meeting, Houston, Texas. Scallen, S.F. 1997, Performance and workload effects for full versus partial automation in a high fidelity multi-task system. Unpublished doctoral dissertation, University of Minnesota, Minneapolis, MN.
This work was carried out as part of Technology Group 5 of the MoD Corporate Research Programme. It was carried out under contract to DERA. “© British Crown copyright 2000. Published with the permission of the Defence Evaluation and Research Agency on behalf of the controller of HMSO”
RESEARCH ON CHINESE COMPUTER USERS’ MENTAL MODELS IN SOFTWARE INTERFACE DESIGN Chien-Hsiung Chen & Che-Hui Chen Graduate School of Industrial Design, Tatung University, 40 Chungshan N. Road, 3rd Sec., Taipei 10451, Taiwan
The purpose of this research is to investigate Chinese computer users’ mental models by measuring their understanding of interface icons. Because most interface icons have been created by foreign software companies, Chinese computer users may interpret these interface icons in different ways. The research findings reveal that Chinese computer users were able to assign various meanings or associations to each of the 57 interface icons generated from Adobe Photoshop software. However, based on their mental models, they rated only 15 out of the 57 icons as easy-to-understand icons. Introduction The term “model” is used frequently as an explanatory tool to facilitate the explanation and representation of a specific concept or a certain type of construct. Similarly, the term “mental models” can be viewed as a type of human knowledge representation derived from the interaction between a human and his/her surrounding environment. The formation of mental models is one of the most important concepts in facilitating the design of effective user interfaces. Norman (1983) argues that a user’s mental model is an important cognitive aspect regarding the interaction between a user and a computer system. Gott, Lajoie, and Lesgold (1991) also emphasize that individuals often need to construct a mental model in order to understand and solve a task and that expertise is generally guided by several kinds of mental models. In fact, conducting research on users’ mental models can help interaction designers predict users’ behavior, assess users’ preferences and needs, facilitate users’ interactions, provide interface consistency among various interface designs, and evaluate different types of user interfaces. Using mental models to facilitate interaction design Users’ mental models are culturally dependent models. That is, users from different cultures may possess different mental models towards the same interface. 
Although changes in users’ mental models often occur through unconscious processes, there is no doubt that users are upgrading their internal representations persistently through different interaction strategies. To an interaction designer, the purpose of conducting research on users’ mental models is to investigate how a user communicates with a computer based on his/her existing mental models, and the research findings, in turn, will be analyzed and utilized to help design a better user interface.
The design of a user-friendly interface is by far one of the most challenging design tasks that an interaction designer can encounter. Theoretically, this user-friendly interface should satisfy the majority of users’ needs and preferences even though they possess different mental models. Nonetheless, there exist neither concrete nor implementable formal methods to guarantee the creation of a universal interface for this ultimate interaction. It may even be impossible to generate a set of detailed guidelines to cover every relevant design issue in creating a universal interface. However, based on the existing design knowledge, an interaction designer may be able to apply some important interaction principles to facilitate the design of a useful interface. Furthermore, if the interface is equipped with some degree of adaptability and flexibility, the user will be able to dedicate more effort to carrying out the computer task without spending too much time learning to use the computer. Research on users’ mental models and interface icons Objective Currently, most software applications used in Taiwan are created by foreign companies. Though some of them have been translated into Chinese, the interface icons remain the same. The objective of this study was to investigate Chinese computer users’ understanding of interface icons developed by foreign software companies. Experiment In order to investigate the users’ mental models pertinent to their understanding of interface icons, a survey was conducted by using questionnaires. The chosen software was Adobe Photoshop software (version 5.0) because it is a common image processing software used in Taiwan. A total of 46 student volunteers from the Department of Industrial Design at Tatung University were recruited as participants. In this experiment, no participant was paid to participate in the study. However, some students received class credit for taking part in the experiment. 
All participants were asked to complete three different questionnaires. The first questionnaire was used to obtain demographic data. Participants were then asked to fill out the second questionnaire, which required them to provide meanings or associations for 57 icons generated from the software. Finally, a third questionnaire that provided all 57 icons and their Chinese meanings was given to the participants. They were required to indicate the degree of agreement between the developer prescribed meanings for the icons and their personally ascribed meanings by using a Likert scale of 7 equal intervals (i.e., from −3 to +3). Results and discussions Forty-six student volunteers (29 males and 17 females) participated in this study. Their ages ranged from 19 to 24 years old, and most of them (44 out of 46) possessed their own personal computers. Based on the results from the first questionnaire, participants used computers for various purposes, such as word processing (100%), Web browsing (93.4%), CAD drawing (87%), image processing (78.3%), listening to music (78.3%), watching movies (67.4%), playing games (63%), and statistical analysis (15.2%). Furthermore, from the results of the second questionnaire, participants assigned various meanings or associations to each of the 57 Photoshop icons. Among them, the meanings and associations of 5 commonly used icons are presented in this paper (see Table 1) to help explain the diverse meanings of interface icons.
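The rating step in the third questionnaire lends itself to a small worked example. The sketch below, in Python, summarises agreement ratings on the −3 to +3 scale for each icon and flags icons whose mean rating is higher than 2, the cut-off used in the results that follow. It is an illustration under assumptions, not the authors' procedure: the names `summarise_icon` and `easy_icons`, the icon labels and the ratings are all hypothetical, and the use of the sample standard deviation is an assumption because the paper does not say which form was used.

```python
from statistics import mean, stdev

SCALE_MIN, SCALE_MAX = -3, 3  # 7-interval Likert scale described in the text
EASY_THRESHOLD = 2            # icons with a mean rating above this count as easy


def summarise_icon(ratings):
    """Return mean, standard deviation and an 'easy' flag for one icon."""
    if not ratings:
        raise ValueError("no ratings supplied")
    for r in ratings:
        if r != int(r) or not SCALE_MIN <= r <= SCALE_MAX:
            raise ValueError(f"rating {r!r} is not on the -3..+3 scale")
    m = mean(ratings)
    # Assumption: sample SD; a single rating has no spread to estimate.
    sd = stdev(ratings) if len(ratings) > 1 else 0.0
    return {"mean": m, "sd": sd, "easy": m > EASY_THRESHOLD}


def easy_icons(ratings_by_icon):
    """Names of icons whose mean agreement rating is higher than the cut-off."""
    return sorted(
        name
        for name, ratings in ratings_by_icon.items()
        if summarise_icon(ratings)["easy"]
    )


# Hypothetical ratings from four participants for two made-up icons.
sample = {"icon_a": [3, 3, 2, 3], "icon_b": [1, 0, -1, 2]}
print(easy_icons(sample))
```

A mean of exactly 2 is not flagged, since the criterion is a score higher than 2.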
Though the majority of the participants could correctly identify the icons’ meanings, there still existed other possible associations. The results from the third questionnaire indicate how closely the meanings that participants ascribed to each of the 57 icons agreed with the developer-prescribed meanings. The mean scores and standard deviations (SD) are shown in Table 2. Only 15 out of the 57 icons had scores higher than 2, which meant that participants understood these 15 icons better than other icons. Table 1. Users assign diverse meanings and associations to interface icons
Note: Two participants did not answer the question for icon 2; one participant did not answer the question for icon 4. Table 2. The mean and standard deviation of each icon based on a 7-equal-interval Likert scale
* Participants’ rating scores are higher than 2. Conclusions This research demonstrates that users may possess diverse mental models even when they interact with the same interface. In the experiment, Chinese computer users were able to assign various meanings or associations to each of the 57 interface icons. Nonetheless, Chinese computer users rated only 15 out of the 57 icons as easy to understand. The authors hope that future interaction designers can develop better interface icons by incorporating the users’ mental models into the interface design process. Acknowledgements Financial support of this research by Tatung University, Taipei, Taiwan, R.O.C., under the grant B88-1500-02 is gratefully acknowledged. References Gott, S.P., Lajoie, S.P., and Lesgold, A. 1991, Problem solving in technical domains: How mental models and metacognition affect performance. In R.F.Dillon and J.W. Pellegrino (Eds.), Instruction: Theoretical and Applied Perspectives, (Praeger, New York) Hollnagel, E. 1991, The influence of artificial intelligence on human-computer interaction: Much ado about nothing? In J.Rasmussen, H.B.Anderson, and N.O.Bernsen (Eds.), Human-Computer Interaction: Research Directions in Cognitive Science, European Perspectives Vol 3, (Lawrence Erlbaum, Hillsdale, NJ) Norman, D.A. 1983, Some observations about mental models. In D.Gentner and A.L. Stevens (Eds.), Mental Models, (Lawrence Erlbaum, Hillsdale, NJ)
USER REQUIREMENTS ANALYSIS FOR DECISION SUPPORT SYSTEMS: THE QUESTION APPROACH Carolina Parker HUSAT Research Institute, The Elms, Elms Grove, Loughborough, Leics LE11 1RG, UK
This paper uses agricultural decision support systems experience to illustrate problems faced by Decision Support System (DSS) developers trying to analyse and specify user requirements, and to describe how an approach developed in management science provides a viable solution. The notion of task, central to many system development methods, is inappropriate to DSS. Decisions can be broken down into sub-components but these are highly volatile. Thus analysis and specification tools developed for systems with static task components are not appropriate. The paper describes a solution to the problem based on user enquiries and its successful application to DESSAC, a flagship agricultural project. Introduction It has been suggested that the main contributor to the poor uptake of agricultural DSS is the failure of developers to pay attention to users, their requirements, and to the environment in which they operate (Parker, 1999). In fact, DSS developers in UK agriculture have not found it easy to adopt a truly user-centred approach to design and development. Two factors are of particular importance. First, many agricultural DSS developers are scientists and their DSS are developed primarily in the context of research. The typical agricultural DSS developer is someone without a software development background with little or no knowledge of commercial software development methods. Unfortunately, alerting people to the problems associated with a user-free design process is not sufficient, in itself, to improve matters. Second, even developers embracing user-centred design wholeheartedly find little practical support: there are few truly user-centred methods suitable for the small-scale development that characterises agricultural DSS. Those methodologies which are of appropriate scale (e.g. DSDM, 1996) fail because they do not support the capture of user requirements for the decision making task.
This paper examines the problem of user requirements capture for DSS in more detail and describes a potential solution, drawing case material from experiences with agricultural DSS. The problem of task analysis for decision support systems It has been stated (Preece, 1993) that one of the primary characteristics of a user-centred design approach is the use of task analysis to gather user requirements. In the context of human-based tasks, task analysis is the means by which details of the user’s tasks, and information about the task environment, are collected so that the users’ needs are well understood. Task analysis methods are concerned with formal ways of collecting
information, organising and using it as the basis for design decisions, and for ensuring that tasks and functions are appropriately allocated within a new system (Kirwan & Ainsworth, 1993). A large number of techniques fall under the task analysis heading. Many of them share a common approach to the collection and organisation of information about the task environment: they describe a situation, process or task by breaking it down into smaller and smaller sub-units, for example by a process of ‘hierarchical decomposition’, until it is considered to be fully represented. The type of task analysis which might be considered to be of most relevance to the decision task is cognitive task analysis (CTA). CTA concerns itself with the knowledge that people have, or need to have, in order to complete a task. Its approach is to describe and represent the cognitive elements that underlie decision making, goal generation, judgements etc. There are a number of methods which fall under the CTA heading (e.g. TAKD and GOMS), the underlying aim of which is to identify the ‘mental model’ employed by the user in carrying out the task, and to use an understanding of this model to improve the system design. CTA is considered to be ‘appropriate for tasks that are cognitively complex (requiring an extensive knowledge base, complex inferences and judgement) and which take place in a complex, dynamic, uncertain, real-time environment’ (O’Hare, 1998). This description would seem to make it a highly appropriate choice for use in a method for DSS developers. Decision making in agriculture can certainly be described as cognitively complex. The chaotic nature of weather, the interactions between it and crops and the myriad things that impact on crop or animal growth create a complex, dynamic, and uncertain environment. Unfortunately, there are a number of reasons why Cognitive Task Analysis is not suitable for use in a DSS methodology.
This is best seen from the context of the expected output of a cognitive task analysis, i.e.
• a description of knowledge required to carry out the task (declarative knowledge)
• a representation of mental models used by individuals when carrying out the task
• a description of how the user carries out the task (‘how to’ knowledge)
These outputs do little to inform the design of a DSS for the following reasons:
A description of knowledge required to carry out the task (declarative knowledge)
This type of data is very useful as it describes the information the system will have to provide to support the decision process. It does not, however, tell the designer anything about how to present the information.
A representation of the mental model used by individuals when carrying out the task
Obtaining a mental model of a physical system may be possible and this same model might be used by a number of people; in this case, such a model might usefully be identified and employed e.g. as a training aid or as part of a software navigation device. Within the context of decision making this approach is not so useful because:
• Models can vary considerably between individuals, e.g. in crop production, decisions are based on different levels of scientific knowledge and on the degree to which factors (cost, environment etc) are considered important. One person may have an accurate perception of a crop development cycle and of pest
development, and another may base decision making on a model of seasonal tasks handed down from his or her parents.
• Models are subject to constant change. In many complex decision environments, including agriculture, the decision process is based on incomplete and constantly changing information. Crop development, for example, is only partially understood and as new science appears, mental models have to change to accommodate it. The degree to which the new material is accommodated by individuals will also vary.
A description of how the user carries out the task (‘how to’ knowledge)
Decision-making does not lend itself to the type of analysis which yields useful ‘how to’ knowledge, as many real-life decisions are for ill-defined problems with several and conflicting goals. In such circumstances there is simply no single correct ‘how to’ and probably as many examples of ‘how to’ as there are decision-makers. Identifying an optimum way of arriving at the best decision solution relies on knowing what the best solution will be, and that may not always be obvious. For example, a good financial outcome for one field/crop may have negative impact on another more important one, or it may be personally inconvenient to apply the optimum. The use of cognitive task analysis techniques to identify an optimum ‘how to’ for decision making would therefore seem to be futile.
Cognitive Task Analysis is also unsuitable for agricultural DSS as it is not really suitable for use by non-specialists. As Militello & Hutton (1998) point out, ‘While a wide range of powerful methods of cognitive task analysis have been developed and applied over the last 10 years, few have become accessible to…the engineering community designing systems’ (op cit., p 619). CTA has also been criticised for not being compatible with software engineering methods (Diaper et al, 1998).
If cognitive task analysis methods can’t provide the information required by the DSS developer to develop a requirements specification, how can developers hope to engage in user-centred design? The question approach During early research in the agricultural sector it became apparent to the author that some DSS were not adopted because they did not answer ‘the right questions’. The models they contained were capable of producing information the user required but the way the system had been designed prevented the user from accessing it. The observation that system developers needed to pay more attention to the actual questions their users wanted to ask of the system was reinforced by discussions with users in other projects. Investigation of the potential of this approach led to the discovery of a method based on user questions formulated within management science by Arinze (1989). Arinze (op cit.) notes that the difference between DSS and traditional programs, and by implication the task analysis methods associated with them, is directly related to the nature of the problems which DSS and other systems address. Mainstream methods are concerned with designing for ‘relatively static, well-defined tasks, with low uncertainty, and slow changing user requirements and data structures.’ (op cit. p 166). They cope less well with decision based activities which are by their nature uncertain and volatile in terms of their information structures and user requirements. This argument suggests that decision support systems are in need of methodologies tailored to their particular attributes, a view supported by a number of other authors (e.g. Sprague, 1986). In support of the question oriented approach Arinze states that the DSS role is oriented towards knowing rather than doing, and that DSS outputs are more of an indirect spur to action than the other types of system. It therefore follows, he feels, that a complete taxonomy of the enquiry types involved in ‘knowing’ will be
key constituents of any proposed formalism. The user enquiry method can be seen as a form of task analysis. It is a formal means of gathering information about the task, of organising it and using it as the basis for design decisions and can be used to ensure that tasks and functions are appropriately allocated within the DSS. One of the most useful facets of Arinze’s approach is the grouping of decision enquiries into one of three main types: state, action or projection. State enquiries are made when the user is seeking information about the state of the world (or a model of it). Action enquiries are requests for a plan of action to achieve a specified end state. This is a reverse ‘what if’ question, i.e. instead of asking what will happen if I do this, an action enquiry asks how do I get to this pre-specified end state. In this type of query, it is the function of the DSS to generate actions in response to the user’s goal setting. Projection enquiries are more commonly known as ‘what if’ enquiries. They are requests for an indication of outcome given a set of defined conditions, e.g. ‘How much will I lose if I delay the application of this spray for 3 days?’ This type of enquiry also involves the assignment of probabilities to estimated outcomes, and will require risk and sensitivity analyses to be performed on potential solutions. State enquiries relate to the first, ‘intelligence’ or data gathering, phase of decision making; they indicate the data that the DSS must contain to support the decision process. Defining the set of important questions concerning the state of the ‘world’ will automatically define the key database requirements for a DSS. Action and Projection enquiries are used mainly to assist the decision-maker in the process of formulating potential decision solutions. These questions require a model of the ‘world’ before they can be answered.
It is not possible to answer ‘what if’ and reverse ‘what if’ type questions without some description of the mechanics of the relationships between the objects in the ‘what if’ world. A full set of Action and Projection enquiries will therefore describe the types of models the DSS will have to contain to support the decision process. By identifying the ‘enquiries’ or questions inherent in a decision making process it becomes possible to state the user’s requirements for data and for ‘models’, two main components of the DSS. Because the designer knows that the system has to support the posing and answering of specific questions, this knowledge also guides the development of interface functions and features. The availability of a comprehensive set of decision enquiries also permits developers to test the emerging system against its requirements. Practical use of question approach As a result of the author’s interest in user questions and their apparent importance to decision making, a large number of queries were gathered during a user requirements analysis for a large agricultural project. At this stage the utility of the taxonomy proposed by Arinze was unknown. As part of an exercise to generate smaller, more manageable requirements from over a hundred questions, the requirements team attempted to group questions on a subjectively measured like-with-like basis. At this stage the Arinze categories were not consciously considered because there was no evidence that they would be particularly useful (the author, however, did have knowledge of them and this may have influenced her grouping strategy). As an additional exercise the groupings were compared with the headings proposed by Arinze. The match between the groupings generated by the exercise and the headings proposed by Arinze was striking. Very little reshuffling was required to fit all the questions into the categories.
As a direct result of this discovery the project was able to specify very clearly what information was needed in the system and what results were required from models within it.
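The mapping from enquiry types to requirements described above can be sketched as a small data structure. The Python below is an illustrative sketch under stated assumptions, not part of the Arinze method or of the DESSAC project: the names `EnquiryType`, `Enquiry` and `derive_requirements` are hypothetical, the enquiries are assumed to have been classified by hand, and only the spray-delay question comes from the text; the other two questions are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class EnquiryType(Enum):
    STATE = "state"            # what is the state of the world (or a model of it)?
    ACTION = "action"          # reverse 'what if': how do I reach this end state?
    PROJECTION = "projection"  # 'what if': what outcome follows these conditions?


@dataclass(frozen=True)
class Enquiry:
    text: str
    kind: EnquiryType


def derive_requirements(enquiries):
    """Split enquiries into data requirements and model requirements.

    State enquiries indicate data the DSS must hold; action and
    projection enquiries indicate models the DSS must contain.
    """
    requirements = {"data": [], "models": []}
    for enquiry in enquiries:
        if enquiry.kind is EnquiryType.STATE:
            requirements["data"].append(enquiry.text)
        else:
            requirements["models"].append(enquiry.text)
    return requirements


questions = [
    # Hypothetical state enquiry.
    Enquiry("What growth stage has the crop in this field reached?",
            EnquiryType.STATE),
    # Hypothetical action enquiry.
    Enquiry("What must I do to keep disease below a given level by harvest?",
            EnquiryType.ACTION),
    # Projection enquiry quoted in the text.
    Enquiry("How much will I lose if I delay the application of this spray "
            "for 3 days?", EnquiryType.PROJECTION),
]

for category, items in derive_requirements(questions).items():
    print(category, len(items))
```

Keeping the original question text attached to each requirement also gives developers a ready-made checklist for testing the emerging system against its requirements.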
Discussion This paper has attempted to show that DSS developers are faced with problems when attempting to adopt user-centred design, as existing task analysis techniques cannot cope with the volatile and uncertain nature of the decision task, and that another approach is needed. The aim of this paper has been to highlight the importance of the user question as the basic task element for decision making and to bring the Arinze method to the attention of anyone engaged in the specification of requirements for DSS. The use of questions and of the Arinze taxonomy is not, however, by itself sufficient to meet the needs of the agricultural DSS developer or of similar people in domains as diverse as medicine and production engineering (i.e. those working in small teams to tight budgets whose primary interest is within the domain itself). These people, like their agricultural counterparts, are specialists only in their own domain and need structured support if they are to get the benefits of user-centered design. The approach described within this paper has been incorporated within a simple and practical methodology for use, initially within the agricultural sector (it is currently being promoted within MAFF for use in LINK projects). The development of the methodology and its introduction to other sectors is also planned and funding to support this development is sought. References Arinze, B. (1989) Developing Decision Support Systems from a Model of the DSS/User Interface In Knowledge-based Management Support Systems (Ed, Doukidis, G.I., Land, F. & Miller, G.) Ellis Horwood Ltd, Chichester, pp. 166–182. Diaper, D., McKearney, S. & Hurne, J. (1998) Integrating task and data flow analyses using the pentanalysis technique Ergonomics, 41(11), 1553–1582. DSDM Consortium, (1996) Dynamic Systems Development Method, Tesseract, Ashford, Kent. Kirwan, B. & Ainsworth, L.K. (1993) A Guide to Task Analysis, Taylor & Francis, London. Militello, L.G. 
& Hutton, R.J.B (1998) Applied Cognitive Task Analysis (ACTA): A Practitioner’s Toolkit for Understanding Cognitive Task Demands. Ergonomics, 41 (11), 1618–1641. O’Hare, D., Wiggins, M., Williams, A. & Wong, W. (1998) Cognitive Task Analysis for Decision Centred Design and Training Ergonomics, 41 (11), 1698–1718. Parker, C.G. (1999) Decision Support Systems: Lessons from Past failures Farm Management, 10 (5), 273–289. Preece, J. (Ed) (1993) A Guide to Usability: Human Factors in Computing, The Open University. Addison Wesley. Sprague, R.H. & Watson, H.J. (1986) Decision Support Systems: Putting Theory into Practice, Prentice-Hall.
WHY DO IT SYSTEMS FAIL TO LIVE UP TO EXPECTATIONS? A CASE STUDY Andrew Bairsto & Susan Harker Department of Human Sciences, Loughborough University, UK
This paper reports a longitudinal case study of the development of a replacement for an existing information technology system within an organisation. The case study provides evidence of how design processes may impact upon success and failure in human and organisational terms. The central method of research used was one of participant observation coupled with a review of the documents generated over the lifetime of the project and an evaluation of the implemented system. Examination of the processes of requirements generation, choice of a solution and implementation strategy, indicate that a complex array of factors were involved even within this comparatively small scale project. Internal politics, financial constraints and the role of the end users influenced the relative success of the end solution. Introduction and Background There is ample evidence both from the literature, (Kearney, KDE, Moshowitz) and from the popular press (accounts of the failure of the London Ambulance Service system and the Passport Office) that IT systems often fail to live up to the expectations which are placed on them. The more spectacular failures are often the subject of detailed analysis and reporting but accounts of events are usually based on entirely retrospective analysis. The more typical cases which are a mixture of success and failure and which are delivered late or only in part, are rarely subject to the same level of detailed analysis to enable the causal links and implications of the development to be more fully understood. Thus the case reported here provides an opportunity to understand the complex and dynamic processes which affect development and the implications this has for the relative successes and failures of the delivered system. The approach taken in carrying out this research was centred upon participant observation. The first author was one of the projected end users prior to the initiation of the system development process and, as the I.T. 
co-ordinator for the department, played a role as part of the project development team, until leaving the organisation in September 1998. This involvement allowed a particular insight into the processes that happened whilst the author was employed within the organisation and provided a basis for analysis of events which occurred after his departure based on the documentation associated with the project. This analysis was supplemented by interviews with relevant staff involved in development. An evaluation of the implemented system and the reactions of end users based on interviews was carried out in the summer of 1999. The case study focuses on an organisation of approximately 600 staff. The core business of the organisation is to deliver, process and award examinations to schools and colleges. Within this organisation
exists a department of eighteen staff whose chief concern is the provision and administration of training courses associated with the examination processes. The department processes in the order of 18,400 attendees per annum on about 1,500 courses. Of the eighteen staff, six are dedicated trainers. Of the remaining twelve staff, there is a department manager, a secretary, an administrative manager and a Word Processing operator. The remaining eight members of staff carry out the day to day administrative functions of the department, which were the focus of the IT development. At the start of the development the key tasks were carried out using a combination of manual processes and two existing computer systems. One was based around an IBM Mainframe AS400 computer system which also served a significant number of other key departments within the wider organisation. The other computer system was PC-based, running an internally developed bespoke software application. The Development Process The chief point of discussion in the initial stages of development was whether to develop one of the existing computer systems or to invest in an ‘off the shelf’ package solution. While the AS400 system was regarded as meeting most of the requirements, there were key areas of functionality which were not available and would have to be developed if this was to provide an overall solution (i.e. trainer allocation, letter generation and course evaluation). A dominant concern for the development team and, perhaps more so, for the department’s manager and staff, was the time-scale available for the implementation of the I.T. solution. Following the initiation of the development process (June/July 1997) things moved relatively slowly so that by September 1997 a decision had still to be made about which route to follow. The cyclical nature of the department’s work meant that there was an optimum period within the year during which a new or revised system could be installed.
Installing a new system in the period after April would result in a level of disruption which was regarded as unacceptable by all concerned. It could require both the old and new systems to be run in parallel for the rest of the coming cycle of events with the financial costs and pressures on staff being considerable. Delaying installation until the following year was not an option that either management or the development team wanted to consider at this stage. The decision to buy in a new package was made by the I.T. department. Specifications of requirements (drawn up by the business analyst in consultation with the department) were sent out to vendors during December and it was mid January before these firms were brought in to provide demonstrations of their software solutions. None of the products on show matched the requirements for a possible solution and each vendor readily acknowledged that customisation was necessary. A decision was made to work with one of the vendors and over the next month further demonstrations were arranged to clarify what customisation was needed and to arrive at an estimate of the total cost. By the middle of March a detailed list of user requirements had been drawn up and a significant number of necessary changes to the existing product had been identified by the end users. At this point a decision was made not to proceed with the vendor. At the time it was by no means clear as to why the decision came to drop the vendor, particularly as the user department felt that significant progress had been made towards achieving a solution. Again the decision was made by the I.T. department rather than by the user department. There is little evidence in the documentation of reasons for the decision and while financial factors may have played a part the most likely reason appears to have been a desire to use the project as part of a wider strategic move in the IT procurement policy. Inevitably the April deadline had to be abandoned. 
A new vendor was suggested by the I.T. department where once again the product was an off the shelf package which only broadly met the requirements and therefore needed a large amount
of customisation. In addition the company did not yet have a Windows version of the application and was in the process of testing an automatic conversion program to produce this for them. The department could not afford to delay any further and therefore had no choice but to agree to proceed with this option. The vendor undertook to deliver the system progressively, providing the essential functions in time for the key dates and delivering the customised features later. This had the effect of locking the training department into the relationship with the supplier. The customisation took longer than had been predicted and required considerably more expenditure than had been budgeted for, thus eliminating the differential with the originally chosen vendor. To date, the total cost (not including internal costs) has tripled and far outweighs the potential cost savings that had been identified. A phased installation took place at the end of May. All this delay had, of course, meant that the department was very much behind in its preparation for the coming season of courses (the earliest of which was due to take place in July). Initial entry of essential data concerning the courses would be well behind the usual schedule. The courses had already been advertised to customers and bookings were already being made for places on courses. Since bookings could not yet be entered on the new system, booking forms were stockpiled. The decision had been taken not to run the old and new systems in parallel with each other and so there would be no time for testing the new system. The new computer system would therefore be put to the test using ‘live’ data, and the knock-on effect of having a new system was that new procedures were devised to deal with different circumstances. This was a true representation of a “big-bang” implementation as has been described by Eason (1988).
The difference here of course was that there had been no real time for the end users to effectively test each part of the system. Staff had had time to prepare for its introduction but were still taken off balance by the unknown. There were a significant number of errors and bugs in the software after its installation, which can hardly have been surprising given the minimal testing. Quite often the bugs were serious enough to prevent effective use of the system, which created greater delays and pressure on the staff and, at the time, a very negative opinion of the system. Also following installation, staff began to find areas of the software which either did not live up to their expectations or made their tasks more time consuming. Tasks which could have been carried out using two input screens on their old system now required four or more input screens in the new system. In the beginning, unfamiliarity with the interface was certainly a hindrance, but staff also found that the information required in order to achieve a particular task was far greater and often unnecessary in the new system. This was a result of the software, sold as an off the shelf package, having functionality which was not actually required. In general the staff soon developed a culture whereby they had to simply accept the new system, warts and all, because they had jobs to carry out that simply could not be put off any longer. It appears very much as though the staff were having to fit around the new I.T. solution rather than the solution matching the task and organisational requirements. A considerable amount of customisation took place following the installation of the system. The first phase of implementation (that of user training, installation, data input and agreed customisation) was completed by November.
At this point the business analyst who had been the main driver and champion of the development left the organisation and plans for further up-take of the system in regional offices were not pursued. However, more positively for the department, one of the end-users had become a “local expert” on the system within the department and thus became the new champion. In November 1998 the software was reviewed and further development requirements were identified. These changes in requirements came about for a number of reasons. One was that staff were making direct comparisons with the old system and thus viewed the new system as lacking in certain task areas. A fundamental tenet of design should be that the usability of a new system should be at least as good as any that it replaces. The original list of requirements simply was not detailed enough and did not allow the effective mapping of tasks. The training department
user evaluation of the new system was very unfavourable, particularly when related to the systems it replaced. To date, from its onset through to its development and implementation, the project has taken over two years. The customisation is on-going, as are some of the problems. One of the most recent of these is a problem with Year 2000 compliance. From the moment that the vendor entered the project, the customer was assured of Year 2000 compliance, and yet when attempting to print letters for courses taking place in January, the customer got a response that the starting date was 1900. If this truly is a software problem then it points towards a certain ineptness on the part of the vendor and places a question mark over their work to date.

Discussion

The events and outcomes of this development are probably not untypical. That the system has been implemented in a way that is regarded as partially successful is in accordance with many other developments. It certainly did not meet either its timescale or budgetary targets. However, a range of complex factors influenced or had an impact upon the process. From the point at which the vendor was changed, there were two people trying to control the destiny of the project (the training department manager and the I.T. manager). Internal politics thus had an important influence on the process. Notwithstanding this conflict, the roles of the department manager and the business analyst were key to the partial success which was achieved. Without these two champions of the project, the manager who was keen to implement some kind of solution and the business analyst who kept the whole process moving, nothing might have been implemented. Money is another factor and, as in most cases, acted as a constraint on achieving the best solution. However, this case re-emphasises the fact that experienced developers (and experienced users) know that estimates of cost are rarely realistic.
Another influencing factor is the role of the end users. In this case study there was a strong element of user participation, although its initial effectiveness is questionable when looking at how well their needs were communicated. In particular, the failure to recognise that a much more comprehensive task analysis was needed in order to specify the full requirements for customisation lies at the heart of the failure to provide the functionality which the users required to perform effectively. There are of course lessons to be learned from this case study. Not least of these is a need to improve the general approach to the whole development process in terms of a formal, structured design method which incorporates appropriate human-centred design processes. Events happened on an ad hoc basis without sufficient planning and without affording enough time to areas such as requirements generation. A clear statement of user requirements is highlighted as one of the most important factors in achieving project success (The Standish Group, 1995) and should be treated as such. Eliciting requirements from users can, and indeed should, take a considerable amount of time and should make use of various tools (Harker et al, 1993) in order to achieve something that will be a true reflection of needs and save development time later on. It is also apparent that there was inadequate attention to evaluating and checking solutions prior to implementation. In terms of organisational structures, the role of an I.T. department in commissioning systems from third parties requires further attention. In these circumstances IT specialists should facilitate communication between their internal clients, the end-user department, and the vendor. They should be responsible for providing the technical support which the users may not have and ensuring that they understand the implications of the technical and financial constraints which are operating.
In general, the case study probably illustrates the type of practices that are apparent in a great many similar sized organisations. Issues of human factors do not seem to be highly prioritised in the development process and this study highlights the fact that solutions are still technically driven. In short, there needs to be
a re-addressing of priorities within such areas of development. Greater emphasis needs to be placed on creating solutions that, at their very conception, incorporate key considerations like adequate user requirements generation. In the case of the training department, the financial cost and stress on staff were a lot greater than they would have been had there been more focus on these kinds of issues.

References

Eason, K., 1988, Information Technology and Organisational Change. London: Taylor and Francis.
Harker, S.D.P., Eason, K.D. and Dobson, J.E., 1993, The change and evolution of requirements as a challenge to the practice of software engineering. In: IEE International Symposium on Requirements Engineering, 4–6 January 1993. Los Alamitos, California: IEEE Computer Society Press, 266–272.
Mowshowitz, A., 1976, The Conquest of Will: Information Processing in Human Affairs. Reading: Addison-Wesley.
The Standish Group, 1995, Chaos, http://www.standishgroup.com/chaos.html
DEVELOPMENT AND TRIALLING OF USER ACCESS TO AN INFORMATION SYSTEM FOR ARCHITECTS

Stefanie Meltzer & Bill Green

Delft University of Technology, Faculty of Design, Engineering and Production
Jaffalaan 9, 2628 BX Delft, The Netherlands
This paper discusses the development and evaluation of an information system for architects and in particular the human interface of the system. It introduces a functional model referring to the architectural design process. This functional model provides a framework for the structure of the information system and constitutes the basis of the design of a new database visualisation.

Koninklijke Hoogovens is a major producer and supplier of building products in steel and aluminium. This project was initiated by Hoogovens Research & Development and was a Master of Industrial Design Engineering graduation project. The information system gives users access to a database of information about the building products and materials of the Hoogovens business units.

How architects design

Decisions on materials and products are often made early in the building process, mainly during the design phase by architectural designers and engineers. Architects play a dominant role in the decision making process, especially when it involves high end products such as cladding and roofing systems, which have a high added value and characterise the concept. Providing architects with the right information during their design process can have a major influence on this decision making. Research on applying design knowledge during a design process (Dave 1994, also Demirkan 1998) states that architects frequently extract their relevant design knowledge from previous design processes. This way of reasoning by reference to past experiences is known as case based reasoning. Case based reasoning is often applied in so-called ill-structured, experience-rich domains where knowledge is incomplete and evidence is sparse. Every problem is complex and unique, and fixed solution principles cannot be used. The domain of architectural design is a typical domain where knowledge of previous cases is applied to a new situation.
Functional model

A functional model has been developed in order to fit the information system logically and usefully into the architectural design process. It provides a framework for the structure of the information system and constitutes the basis of a new database visualisation. This functional model refers to a diamond-shaped representation of a design process containing two phases: a divergence phase and a convergence phase. The first phase is characterised by a diverging process in order to generate a large number of possible solutions to the design problem. At the end of this phase the convergence phase starts and reduces all these variants to one best solution. Almost every design process contains several of these diamond-shaped parts, placed in successive or iterative sequence. In this case the model accepts one generic diamond and integrates it into the information system.

Figure 1. Diamond-shaped representation of a design process

The model presents the user with several levels of information characterised by differences in abstraction. These levels refer to the architectural designer’s different information needs in the successive stages of a design process. For example, an architect designing technical details of the roof needs more detailed information than an architect starting a new design problem. The user can choose to enter the system at the different levels in the divergence phase of the model. By navigating through the successive levels the system supports the users in making design decisions. Each level is progressively less abstract and provides the user with more specific information customised to the user’s requirements. The search through the system can finally result in finding a specific piece of information, or contact with an expert. In addition, a user can enter the system from an external information source, for example the Internet. This approach results in an information system which can provide architects with information even in their earliest design stages by presenting them with a reference framework of visual images, descriptive words and reference projects.

Database visualisation

Based on the functional model a human interface has been developed offering architects a tool to navigate through the system and select information according to their own way of thinking and acting during a design process. The database visualisation serves as a tool for associative and exploratory navigation as well as for selecting criteria. In the first level the system shows the user all kinds of images surrounded by descriptive words linking the images by association.
The user can navigate through the images and words by clicking on one of them. After that this image or word appears in the centre of the screen surrounded by its relative images and words. In this way the architect is offered the possibility to extract design knowledge from the database based on visual representations. In this process of collecting knowledge by designers importance is attached to using collages and characteristics of visual representations. The system creates a possibility for the user to form his own concepts on the basis of the shown information, without imposed restrictions and fixed concept formulation.
Figure 2. Human interface with a descriptive word surrounded by images
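The click-to-recentre navigation described above can be pictured as a walk over a graph of associations. The following Python fragment is only an illustrative sketch and not the implementation used in the project: the class `AssociationGraph`, its methods and the item labels (`img:...`, `word:...`) are hypothetical names, and treating associations as symmetric links is an assumption made here for simplicity.

```python
from collections import defaultdict


class AssociationGraph:
    """Hypothetical store of associations between images and descriptive words."""

    def __init__(self):
        # Maps each item to the set of items it is associated with.
        self._links = defaultdict(set)

    def associate(self, a, b):
        """Link two items; assumed symmetric for this sketch."""
        self._links[a].add(b)
        self._links[b].add(a)

    def view(self, centre):
        """Return the screen state after the user clicks on `centre`:
        the clicked item in the middle, its associated items around it."""
        neighbours = self._links.get(centre, set())
        return {"centre": centre, "surrounding": sorted(neighbours)}


# Made-up example content, purely for illustration.
graph = AssociationGraph()
graph.associate("img:steel_facade", "word:industrial")
graph.associate("img:curved_roof", "word:industrial")
graph.associate("img:curved_roof", "word:lightweight")

# The user clicks a word, then one of the images shown around it.
first_screen = graph.view("word:industrial")
second_screen = graph.view(first_screen["surrounding"][0])
```

Each click simply re-runs `view` on the chosen item, which is one plausible way to let a user explore without a fixed hierarchy or predefined search terms.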
In this database visualisation it is possible to show relations between objects in a dynamic and interactive way. These relations can be both hierarchical and semantic. An object referring to another hierarchic order is shown in a vague way and placed in the background. An example of this way of displaying words is the application Thinkmap (http://www.thinkmap.com).

Associative values and filtering

To every image a set of associative values has been attached. These associative values contain important terms used by architectural designers and found in the literature. Every image used in the database has been evaluated by a group of architects and architectural students. By selecting images that meet the requirements of the design problem and dragging them into the collecting place, the user creates a personal set of criteria, which builds up a filter. This filter can then be applied to a database of reference projects in a subsequent level of the system. By creating this filter the architect forces the system to show him only those projects he is actually interested in. The architect can get access to the information on specific products used in these projects by pointing at one of the hotspots applied in the project images. Finally, authorised users can access a closed part of the system where they have the possibility to communicate with experts and to download specific information like CAD-details, textures and specifications.

User testing

Method

There were three stages of user involvement in the design process. Stage one involved the evaluation of four different conceptual solutions for the interface. This evaluation was informal and demanded only a series of opinions from practitioners and architectural students. The results of this evaluation were primarily negative (subjects demonstrated aversion to some interface presentations), but this is valuable in what is essentially an elimination process.
It facilitated the choice of a conceptual group for further development. The selected concept of the information system and its user interface was implemented in a demonstration model. In stage two this demonstration model was evaluated and tested by five architects.
This group of five was assembled as a representative section of the total group of architects. Different ages, working experience and amounts of experience in computer usage were represented. Each was given an assignment divided into five tasks resembling different tasks in the design process. The actions of the subjects were recorded on videotape and observed by the researcher. Thinking aloud was also used as a supporting technique. After each assignment the subjects were asked to evaluate the tasks and the interface. The outcomes of the tests were used to optimise the concept. A final trial was conducted on the modified model. There were also five subjects in this trial.

Assignments

The five tasks constituting the total assignment were chosen to provide a scan of the tasks done by architects during their design process. This results in an assignment containing a wide range of the possible situations they may face and the problems that may occur. Each architect was given the five tasks in a different sequence to avoid order effects. The tasks refer to different stages of the design process and different ways of searching for information. For example, the task ‘Getting started’ refers to the very early phase where an architect has some requirements but no firm concept. Other tasks such as ‘Direct search’ or ‘Jumping to related subjects’ refer to a stage when the user already knows what s/he is looking for.

Task example: ‘Getting Inspiration’
The aim of the task is to search for starting points for concept forming. The architect is asked to start a new assignment after a first meeting with the client in which a new company headquarters with a high-tech and industrial image is requested. The user was asked to use the demonstration model to (a) get inspiration and (b) search for examples of relevant construction modes. The main problems encountered by subjects were in the initial stages of the process when building up a filter. Icons were sometimes misunderstood and require initial textual support, and some problems of dragging and dropping images were encountered. Changes made as a result included a short introduction for first time users on selection of images, and pop-up words associated with icons. In general, subjects were successful when they employed a trial and error approach, and the relationship of the navigation to the design process was clear.

Discussion

That user trialling is a valuable, indeed we believe indispensable, tool in the design process is no news, and hardly worth repeating. What is significant in this study is the provision of access to a database which is radically different from the normal process, in that it allows a visual, and indeed emotional, approach which corresponds much more closely to the early stages of architectural design. The value of the functional model of the design process as a starting point was supported by the trials, which indicate an easy acceptance of what could be regarded as a strange and different system of database access. The results of the user testing show that the information system offers the architect a real possibility to search for information in a way which supports their own designerly way of handling images during the design process. The database visualisation serves both as a navigation tool and as a selecting tool. The
functional model makes it possible to enlarge the functionality and information content of the system in order to offer users maximum convenience.

Conclusions

This information system possesses high commercial potential and shows how information can be made more accessible to particular user groups by integrating the possibilities of new information technology and (in this case) traditional design processes. While much is known about the relationship of visual and verbal information in other contexts, and there is some information on this in a design context (Muller and Pasman 1996), further work needs to be done in using modifiable visual imagery and its mutation into textual information, which current information technology is making possible.

References

Dave, B., et al. 1994, Case based design in architecture, Artificial Intelligence in Design ’94, 145–162
Demirkan, H. 1998, Integration of reasoning systems in architectural modeling activities, Automation in Construction, 7, 229–236
Muller, W., and Pasman, G. 1996, Typology and the organization of design knowledge, Design Studies, 17, 111–130
SUPPORTING UNIVERSAL ACCESS TO INFORMATION TECHNOLOGY

Mary P.Zajicek1 & Albert G.Arnold2

1School of Computing and Mathematical Sciences, Oxford Brookes University, Oxford OX3 0BP, UK
2Faculty of Technology, Policy and Management, Delft University of Technology, Delft, The Netherlands
This paper addresses the challenges involved when attempting a ‘technology push’ for groups in the community who have no points of reference with Information Technology. A broadening of the definition of user requirements is suggested, to include off-line support for those members of the community who do not have the personal capital to enter easily into the Information Society. These individuals are often reluctant to become involved in Information Technology and lack confidence. They find it threatening and difficult and frequently lack the resources to see ahead to the benefits that will accrue. These issues should be addressed if we are to ‘push’ the use of Information Technology into these previously excluded populations. If they are not, these groups in society will become further isolated.

Introduction

This paper explores the problems involved when attempting a ‘technology push’ for groups in the community who up until now have not used information technology. It is based upon experience of working with two non-technology enabled user groups: elderly visually impaired people and those running small family hotels in Oxfordshire, UK. Although different, both groups were involved in making the transition from non-computer literacy into the world of Information Technology and represented individuals with varying degrees of commitment to change and motivation. The groups were found to require the same forms of support in order to participate in the Information Environment even though their personal resources for dealing with learning were different. This paper proposes guidelines for supporting the transition of non-technology enabled people into the Information Environment. The guidelines aim to increase the users’ confidence to the point at which they feel capable of using Information Technology. A widening of the scope of user requirements is suggested to include the technical, emotional and physical support of new computer users.
Evaluating Technology Acceptance Models

User confidence, described above, is an important parameter in Technology Acceptance Models (TAMs) as proposed by Davis (1993). Related concepts are acceptability, usefulness, or utility. According to Davis’ TAM, actual system use is dependent on the users’ behavioural intention to use the system. This intention is created by a positive attitude towards the system, which stems from a cognitive evaluation process based on beliefs and norms.
According to Davis ‘perceived usefulness’ and ‘perceived ease of use’ are strong beliefs in the attitude forming process. Perceived usefulness is defined as ‘the prospective user’s subjective probability that using a system will increase his or her job performance within an organisational context’. Perceived ease of use is defined as ‘the degree to which the prospective user expects the target system to be free of effort’. We found that although this model is valid for interactive technology, which is used in organisational contexts, it appears to be too simple for the general use of information technology for the users under discussion in this paper. Other factors come into play. For example, prospective users of information technology have more freedom to choose between various applications, than those working in organisations. They are interacting with information technology in a wider range of contexts. In their cognitive evaluation process various beliefs and norms will play a role. For example, an important assessment has to be made about expected task demands and their coping capabilities. Does the user feel that the job can be done using Information Technology? Of course personal user needs (e.g. the need for achievement) should also be taken into consideration. Furthermore, the characteristics of the prospective application should be evaluated with regard to the task in hand, e.g. accessibility, usability, security, and reliability should be weighed up. Many users’ expectation of their capacity to perform a task is based on a peer-related evaluation. If a friend or colleague who they rate as having comparable capabilities to themselves can perform a task satisfactorily, then they feel that they will be able to. 
User hypotheses are represented as follows:
▪ the effort and the use of the application will result in the desired performance
▪ the desired performance will lead to the rewards expected
▪ I am/am not surrounded by people who are already using information technology

Two Non-Technology Enabled User Groups

This section describes the problems encountered by our two sample user groups in getting started with Information Technology.

Those running small family hotels

One author was employed (Zajicek et al, 1998) to investigate reasons for the disappointing uptake of Information Technology among people running small hotels, hereafter referred to as operators, in the Oxford area. The aim was to increase the effectiveness of the hospitality industry in Oxfordshire by the introduction of Information Technology to small family run hotels. It was motivated by the belief that increased use of Information Technology, for example gathering customer information and manipulating it in order to target groups with special room rates etc., would increase room occupancy in these establishments. The number of guests staying in the Oxfordshire area would be increased, thereby generating more tourist based income for the whole area. Operators were offered PCs at a subsidised rate and encouraged to use standard office software: database, spreadsheet and word processor. They were interviewed in their hotels to determine their attitudes to Information Technology, what problems they faced in coping with it and to what extent they had adopted it. Operators expressed interest in training for the use of Information Technology, with 60% showing a positive attitude. Most felt, however, that they could not leave their place of work to take up training. In many cases operators had actually purchased computing equipment and had attended courses on office software familiarisation. A major problem was that the information acquired by attending courses was
difficult to relate to the computerisation of their own business. They were taught how to use software on computers set up in the training establishment. They became proficient in the use of the software on the machines provided, but did not develop a sufficiently strong framework of the general concepts behind computer organisation to enable them to function on their own. The user group comprised mostly individuals with very low levels of formal education. They did not possess the analytical/learning skills needed to organise their own learning of software and use of the computer. They lacked confidence in learning from their mistakes and had no conceptual framework in which to work.

The following hotel based support was welcomed by operators:
▪ The use of step-by-step videos, Web materials and home tutorials to introduce Windows, the concepts behind computers and computerisation of their business.
▪ Teaching material supported by home visits by PC experts to see how operators are using their computers and to suggest ways forward.

The following centrally based support was welcomed by operators:
▪ Help line manned by PC experts open at set times for advice and step through instructions.
▪ Stepped courses with a clear definition of the skills that will be learnt, the skills required to benefit from the course and expected progression.
▪ Central organisation of a network of operators from similar establishments to contact from work and develop supportive relationships with.

Elderly visually impaired people

A study was performed to evaluate the use of BrookesTalk (Zajicek et al, 1999), a Web Browser for the blind and visually impaired, by non-technology enabled elderly people with a serious visual impairment. Evaluation of their use of BrookesTalk was performed on-line by means of email based questionnaires and by telephone. It was found that 65% of the group were unable to get up and running with BrookesTalk.
They found it difficult to conceptualise the workings of a computer application. The nearest model to computer software that they could find, in order to draw comparisons, was a VCR. Many users assumed that you just had to know which button to press and it would ‘work’. The concept of having a dialogue with the computer and learning to use its language was new to many participants. Many problems encountered were also due to a lack of conceptual models of the World Wide Web (Zajicek et al, 1999) and a lack of understanding of the relationship between function keys and functions. Researchers interviewing elderly users over the telephone found that they did not have the skill or confidence to try out functions to see how they work. Impaired memory is a disadvantage in exploratory learning. Elderly users were therefore not able to employ the usual suck-it-and-see method for finding out how things work.

BrookesTalk has limited functionality compared with standard visual browsers, operating as it does with twelve function keys. The Microsoft Corporation has recently funded a project to integrate BrookesTalk’s web summarising capabilities with Microsoft Internet Explorer. This will provide increased functionality, such as bookmarks and email, but will also increase the complexity of use. This will benefit young technology-enabled blind people, who will gain access to the functionality of Internet Explorer through BrookesTalk. However, this approach compounds the problems of the elderly, who require less functionality in order to learn. As a result of this study a simplified version of BrookesTalk called BrookesSimple is under development which, at any
point, tells the user the options available and allows them to select one, thus removing the need for the user to map their task requirements onto function keys, or to rely on memory of past actions.

Guidelines for supporting new Information Technology users

The users described above do not possess the necessary personal capabilities to enable them to benefit from the Information Technology Environment as other users do. Their learning skills in this medium are underdeveloped and they do not have a strategy for learning in a concurrent software environment. As a consequence their confidence is low. We argue that this group of users requires a technology bridge integrating familiar learning support mechanisms such as help lines, mentors, stepped learning or peer support.

The users described above appeared to be disadvantaged for different reasons, but were all lacking the necessary capabilities to cope with information technology. Therefore the remedy for getting them going was similar. Currently, user-support requirements for the ‘technology push’ are not well understood and more study is required. However, as a result of our experience, preliminary guidelines for support are listed below.
• Impose a sequential structure on the concurrency of the system being learnt, albeit temporarily, in order that the main concepts can be absorbed.
• Provide a sequence of topics/functions to learn with manageable steps between them and insist on total familiarity with one topic/function before the learner moves on to the next.
• Plan the acquisition of competencies to reinforce users’ developing conceptual models of the system and of their tasks.
• Facilitate user-driven learning. Provide flexibility so that wherever possible users can follow their own learning trail.
• Provide off-line support, including peer support, on the users’ own terms wherever practicable, at the time and place of their choice.

Conclusions

We see that those who use Information Technology successfully have the confidence to explore the system they are learning and uncover complex functionality by, for example, cruising menus and trying things out. In order to ‘push’ technology use onto individuals who are currently excluded, we must increase their confidence by providing reinforcement of competencies in training and interaction with peers. User requirements capture and system design should be interpreted in the widest sense to include outreach and different support measures. In addition, current technology acceptance measures do not, at first sight, appear to include sufficient contextual information to be able to contribute to an understanding of this group. The sphere of application in which they are useful should be extended.

The challenge is to achieve a greater understanding of the needs of non-technology enabled people, so that we can support them in the use of Information Technology. It is important that those who are becoming marginalised through their lack of understanding of Information Technology should be included, and those who are reluctant to join in must be met on their terms. A significant problem is that the groups under discussion see themselves as outside the main groups in society who are using Information Technology effectively.
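The interaction style planned for BrookesSimple, described earlier in this paper (announce the available options at every point and let the user pick one), might be sketched roughly as below. This is only an illustration under stated assumptions: the `MENU` states, option labels and function names are hypothetical and are not taken from BrookesTalk or BrookesSimple.

```python
# Hypothetical option tree: each state maps a key to (spoken label, next state).
# None of these states or labels come from BrookesSimple itself.
MENU = {
    "start": {1: ("open a web page", "open"), 2: ("hear a page summary", "summary")},
    "open": {1: ("go back to the start", "start")},
    "summary": {1: ("hear the headings", "summary"), 2: ("go back to the start", "start")},
}


def announce(state):
    """Build the prompt that lists every option available in this state."""
    options = sorted(MENU[state].items())
    return " ".join(f"Press {key} to {label}." for key, (label, _) in options)


def select(state, key):
    """Return the next state; an unrecognised key leaves the user where they are."""
    options = MENU[state]
    if key not in options:
        return state
    return options[key][1]


if __name__ == "__main__":
    state = "start"
    print(announce(state))
    state = select(state, 2)
    print(announce(state))
```

Because every prompt is generated from the current state, the user never has to recall a function-key mapping or remember what they did previously, which is the property the paper asks of the simplified browser.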
References

Davis, F.D. 1993, User acceptance of information technology: system characteristics, user perceptions and behavioural impacts. International Journal of Man-Machine Studies, 38, 475–487.
Zajicek M., Wheatley B., Winstone-Partridge C. 1998, Improving the Performance of the Tourism and Hospitality Industry in the Thames Valley, Technical report no. CMS-TR-99–04, School of Computing and Mathematical Sciences, Oxford Brookes University
Zajicek M., Powell C. and Reeves C. 1999, Evaluation of a World Wide Web scanning interface for blind and visually impaired users, HCI International ’99, Munich
CONSUMER ACCEPTANCE OF INTERNET SERVICES

Martin Maguire

HUSAT Research Institute, The Elms, Elms Grove, Loughborough, Leics LE11 1RG, UK. Email:
[email protected]
This paper addresses people’s acceptance of the many new electronic and Internet services now available. Consumers can now benefit from a myriad of information resources, buying and banking opportunities, and public information. Yet there are still many people who are reluctant to take up such services. This paper discusses problems of acceptance of Internet services and approaches for encouraging their use by a wider range of society.

Introduction

There is now a vast range of Internet services available to consumers in the areas of ecommerce (e.g. Internet shopping and banking), broadcasting (e.g. digital TV and radio), social contact (e.g. email and online chat) and government services (e.g. tax filing and welfare benefit information). Consumers can receive such services at any time and in their chosen setting. Suppliers are now moving rapidly to offer services over the Internet because of the potential savings that may be achieved from their retail outlets and the potential world-wide market that is available to them. Typical trends are as follows:
• Government and public service agencies will, in future, offer more information and services on-line (Hoare, 1999; Rumbelow, 1999).
• Supermarket shopping services are now available across the whole country.
• Items such as holidays, consumer goods, books and cars will be sold in greater quantities and more cheaply on-line.
• Banks are encouraging customers to take up on-line banking, which will be cheaper and more flexible.
• Email is now a rapid, convenient and popular form of communication.

While the future is exciting for those consumers most interested and motivated to use these new electronic services, less technically confident consumers may be slower to take them up, thinking that they lack the technical know-how to understand them, and seeing them as remote and impersonal.
Older or disabled people, in particular, may feel that they are falling too far behind these technological developments to benefit from them.
Trends in Internet use

There are signs that sections of the public may need persuading to take up the concept of electronic services. It is reported that a £28m project in the UK, piloting government services over the Internet via public kiosks, may be halted because of public indifference (Phillips, 1999). This could be a setback to the Government’s plan (presented in the white paper Modernising Government) to deliver 100% of public services electronically by 2008. The lack of take-up is reported to be due to poor marketing and technical problems, which have undermined public confidence in the system. These difficulties can be avoided but the debate continues about public attitudes to electronic government. A nine-month study of 4,000 citizens and business owners published by the Cabinet Office in Britain in November 1998 found that 20% were “antagonistic” toward electronic public services and a “significant hardcore” rejected the idea of smartcards.

While many articles in IT journals and newspapers predict that the vast majority of the population will be using the Internet and similar electronic services in the near future, it appears that there will remain a group of people unlikely to use even well established services. For example, it is reported that after 30 years, one in four bank customers still do not use bank machines (Derbyshire, 1999). Reasons for non-use are: distrust of computers, anxiety about becoming targets for muggers, and forgetting their PINs or secret access numbers. It has also been said that electronic shopping will be the normal method of buying goods in the future. Yet even by 2010, electronic shopping is only predicted to account for 7% of all retail sales (Weathers, 2000).

Market research companies find it helpful to categorise users and non-users of technology in order to determine which groups companies should focus on.
The Henley Centre categorises the UK adult population into five types of Internet users, from advanced users to those who have not yet used it (Ward, 1999). They are:
• The @home group (7% of the population): active users of the Internet at home and work.
• The @ll group (14%): use the Internet at work but are very likely to become home users.
• The Ne@rly group (30%): yet to use the Internet at all but may do so soon.
• The M@ybe group (19%): have less confidence using interactive media and little experience. Probably won’t use the Internet at home but may do so at work in the next 5 years.
• The Not @lls (30% of the population): unlikely to use the Net at all.
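As a quick arithmetic check on the segment shares listed above, the short sketch below tallies them. The percentages are those quoted from Ward (1999); the dictionary name, the helper function and the grouping of segments into "current users" and "slow or unlikely adopters" are illustrative assumptions made here, not categories used by the Henley Centre.

```python
# Segment shares (percent of UK adults) as quoted in the list above.
SEGMENTS = {
    "@home": 7,
    "@ll": 14,
    "Ne@rly": 30,
    "M@ybe": 19,
    "Not @ll": 30,
}


def combined_share(names):
    """Return the summed percentage share of the named segments."""
    return sum(SEGMENTS[name] for name in names)


total = combined_share(SEGMENTS)                      # all five segments
current_users = combined_share(["@home", "@ll"])      # illustrative grouping
slow_adopters = combined_share(["M@ybe", "Not @ll"])  # illustrative grouping

print(f"All segments: {total}%")
print(f"Current users (@home + @ll): {current_users}%")
print(f"Slow or unlikely adopters (M@ybe + Not @ll): {slow_adopters}%")
```

On these figures the five segments cover the whole adult population, with roughly a fifth already using the Internet and just under half falling into the two least engaged groups.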
The Henley Centre predicts that the Ne@rly group is going to be of the greatest significance to companies in the future, as they represent a huge section of the population and have already shown their willingness to use computers at work and at home. Although the Ne@rly group actively use computers, almost half say they don’t understand new technology. They prefer to wait and see what other people are doing and then follow their lead. It is of interest to note that 49% of the sample (the M@ybe group and the Not @ll group) will take much longer to use the Internet, if they use it at all.

At the time of writing, one in four adults in the UK has access to the Internet, a proportion that will undoubtedly increase. In every age group from 15 to 54, there are more Internet users than would be expected compared with the population as a whole (Schofield, 1999). But this is not the case with the over 65s, who make up 20% of the population but only 4% of domestic Internet users. It is now realised that unless older people are able to enjoy the benefits of the connected future, with information available electronically on tap and convenient and inexpensive shopping on-line, this will be a lost opportunity both for suppliers and for older people themselves. According to Oftel (the UK telephone watchdog organisation) and
disability campaign groups, “A growing grey market containing millions of potential customers is being ignored in the telecoms boom” (Dawe, 1998).

Issues of acceptance of Internet services

The reluctance of certain groups of users to access new Internet services indicates that the issue of acceptance needs to be considered alongside the design of services which are functional and easy to use. A number of stages of user acceptance can be considered. Initially people will have pre-conceptions about new technologies when they hear about them. A good example is that of biometric techniques for personal identification. When this topic was studied at HUSAT by holding a number of discussion groups with the public, fewer than half thought that the techniques of voice recognition, automatic signature recognition, retina scanning, and hand shape recognition would work well as identification techniques. Even iris recognition, which has been shown in practice to be an effective method of identification for bank machines (after testing by the Nationwide Building Society), was thought to be effective by only just over 50% of the focus group sample.

Security of the Internet for transactions is another area of concern to the public and is still a major barrier to online shoppers. This is shown by a survey carried out by NFO Interactive (Usability Lab, 2000) based on a sample of online customers. It was found that 31% of the sample had concerns about security, compared to other barriers: prefer to go to shops (21%), not got round to it (14%), like to see things before buying (11%), concerns about data protection (7%). An edition of the BBC’s Money Programme in November 1999 also found that 93% of British consumers do not feel secure when submitting credit card details over the Net. This level of concern is perhaps due to the fact that while many argue that Net shopping is generally safe (e.g.
Baguley, 1999), examples of security failure still appear, such as when one building society was forced to pull the plug on its Internet-based share-dealing service after a technical glitch allowed customers to get into other people’s accounts (Hinde, 1999).

Reports from America (Vernon, 1999) also show a concern over lack of security and privacy which is deterring people from submitting their income tax returns via their PCs. As with online shopping, America leads the world in online tax preparation and filing as part of the strategy (common in both the US and Europe) to develop electronic government. Yet research conducted by Jupiter Communications (an analysis company in the US) shows that fewer than 2% of American Internet users plan to prepare their tax returns over the Net, which is less than 0.5% of all expected tax returns. The main reasons for consumer reluctance to file their taxes over the Web are security, fear of error and privacy. While consumers are becoming more comfortable with electronic filing in general (e.g. completing on-line forms), they are hesitant to rely on the Web for tax returns, which are regarded as a far more important transaction. The use of firewalls will improve security but unfortunately will degrade the performance of the system. To encourage online filing of tax returns, the Inland Revenue in the UK is offering discounts on tax bills if returns are filed over the Internet (Hopegood, 1999).

Seeing a system or service for the first time is the next stage in the process of user acceptance. A first look at a system that is being used by a salesperson or friend is of course a good way to show the user what it does, how it might be useful to them, and that they can trust it. Yet there can be negative aspects to this process. A demonstration of a system may reveal too much complexity, which deters potential users.
For example, if a buyer sees an expert user operating the system too quickly, this may make them feel unable to use it themselves. When using a system or service for the first time, consumers will often compare it with existing processes they perform, e.g. comparing an on-screen TV programme guide with newspaper or magazine listings. If by
comparison the new system seems problematic, this will discourage them from continuing to use it. These barriers may not be apparent when the system is first demonstrated to them.

First use of Internet shopping can also be disappointing. Arends (1999) reports that “despite the hype, so little Christmas shopping is being done electronically. Retail experts Verdict reckon it will be less than 1% of total sales this year”. Arends then observes, “What no one measures is how many people tried to do their Christmas shopping over the net but gave up”. He then reports problems encountered when using shopping sites provided by leading UK stores, such as:
• Having to update a browser to access the site.
• Passing through too many administration screens to provide personal details, credit card check, and selecting a delivery date, even before deciding to buy anything.
• Being given complex access codes that are easy to forget.
• Forcing the user to register for a loyalty card.
• Not offering enough goods to buy online, and too little information about them.
• Not allowing goods to be searched for quickly by name.
• The results of a search not being presented in an appropriate order.
• The site not being available at night!

Arends compares these problems with the slick and effective sites of online start-up companies such as amazon.co.uk.

Users may see an Internet service as a challenge and will gain satisfaction in using it successfully. However, if, after a period of continued use, they find that it does not match their needs, they will use it less. In terms of home shopping, problems such as not updating the site frequently enough, not allowing customisation (e.g. to set up a regular order for goods) and not making information easily accessible will discourage continued use.

Encouraging user acceptance of Internet services

In order to address the problems of acceptance listed above, this section presents some initial ideas to encourage wider use of electronic services.
Pre-conceptions

As previously stated, users will have preconceptions about services that may or may not be accurate. It is important that such pre-conceptions are based on correct information by providing more details about the service and possibly demonstrations of it. For example, in the HUSAT study of biometric techniques for user identification, subjects had several misconceptions about iris scanning. Firstly, they confused this technique, which involves photographing the iris, with retina scanning, which involves the use of a low power laser to read the back of a person’s eye (a process which seemed more dangerous). Secondly, they expected to have to bend down and look into a tube. It was thought that this would require an awkward posture and would be unhygienic. In fact they only had to stand in front of a camera that took a photograph of their eyes. It is important then to overcome inaccurate pre-conceptions by presenting clear explanatory information about a new service when it is being promoted.
First look

People’s acceptance of a service will also be influenced by their first viewing of it. It is believed that when non-users see others successfully using the Internet, they soon realise they can use it too (Ward, 1999). Demonstrations of the system or service should perhaps be a simple overview, followed by a step-by-step description of the main functions. This will help users understand the concept of the system without overwhelming them with detail. After a demonstration, on the BBC’s Tomorrow’s World programme, of a bank machine which identified customers using iris scanning, members of the public found it quite acceptable. While a salesperson can be briefed about how to demonstrate a service effectively, this process is hard to control. However, the use of introductory videos or a demonstration mode within the user-interface can provide a standard means of introduction to a system or service, allowing the user to judge its value to them objectively.

First use

Users should be guided to explore a service so they gain an understanding of it before using it. For example, to overcome the problem of home shopping registration, it would be better to allow users to browse through goods or a selection of them before requiring registration. Users should not be forced to focus in too quickly on a particular product to buy. As in a traditional shop, they should be allowed to compare products side-by-side and to browse, before committing to buy. Value-added features should be used when developing an existing service. For example, in many home shopping sites, use may be made of features such as personalisation (e.g. clothes size, colour preferences, likes and dislikes) and extra information such as interviews with book authors, samples of music, etc. Feedback is also important: receiving an email to confirm that an order has been accepted is a good way to boost confidence.
Showing the shop’s refund and return policy up front on the Internet site is also important to building trust. Other aspects of first experience are showing the postage and packing costs and allowing the user to take items out of a trolley as well as putting them in.

Continued use

The service should match the users’ needs in the longer term if they are to continue to use it. For example, for an Internet based service, it is important to keep the site up to date, allow customer personalisation and make information easily accessible to encourage continued use. As users become familiar with a service, they will also expect to be able to use short cuts to reach the information they require quickly and easily. Efficient fast paths should be provided for such users, as demonstrated by mature search engine sites.

Summary and conclusion

This paper has tried to show the importance of considering acceptance issues for new Internet services. Table 1 lists acceptance barriers and ideas for addressing them. The key message of this paper is to ensure that the whole community can benefit from Internet services in the future. To quote Ruth Lea, head of policy unit, Institute of Directors (Tozer and Palmer, 2000): “Ecommerce will revolutionise our lives, but I do hope it benefits everybody. If those who are ‘on the net’ pull ahead in leaps and bounds from those who aren’t, that would exacerbate social inequality, which I would very much regret”.
Table 1. Overcoming barriers to user acceptance of services
References

Arends, B., 1999, Christmas shoppers shun the World Wide Wait, Daily Mail, 20 Dec, p 51
Baguley, R., 1999, Go shopping now!, Internet Magazine, April, 37–40
Dawe, T. 1998, Big market is there for the disabled, Times ‘Interface’ - Telecoms extra, 7 October, p T4
Derbyshire, D. 1999, Cash machine phobia, Daily Mail, 14 July, p 31
Hinde, S., 1999, We’re afraid the ‘e’ stands for ‘easily cheated’, Sunday Express, 28 Nov, p 2
Hoare, S. 1998, City hall opens 24-hour hotline, Times Telecoms, 29 Sept, p 9
Hopegood, 1999, Taxman to offer online discount, Daily Mail, 1 Dec
Phillips, S. 1999, Public snubs e-government, Computer Weekly, 6 May, p 1
Rumbelow, H. 1999, Online NHS heralds interactive healthcare, Times, 8 Dec, p 12
Schofield, J. 1999, Older hands weave the web, Guardian Online, 10 June, 2–3
Tozer, J. and Palmer, A., 2000, What I want for the future, Daily Mail, 3 Jan, p 11
Usability Lab, 2000, Convenience stores, PC Magazine, Ziff-Davis, Jan, 9, 1, 162–177
Vernon, M. 1999, Form-filling on the web - Filing tax returns over the Internet, Financial Times, FT-IT Review, 2 June, p 6
Ward, C. 1999, Hoi polloi are next on the net, Times Interface Supplement, 12 May, p 2
Weathers, H. 2000, Shopping? So different but we will still have an eye for a bargain, Daily Mail, Millennium Mail Supplement, 1 Jan, p VI
Acknowledgement: This paper has been developed with the support of the EC TEN-Telecom TUSAM project and Esprit EMMUS project. The author also thanks his HUSAT colleagues Kathy Phillips and Colette Nicolle, for providing useful material for the paper.
Legislation
ERGONOMICS IN IRISH LEGISLATION

Vincent Kelly

Robens Centre for Health Ergonomics, EIHMS, University of Surrey, Guildford, Surrey, GU2 5XH
The objective of this paper is to identify and consider the areas of Irish Legislation where the principles of ergonomics are mentioned and set an acceptable level of conduct by employers. The relevant legislation, which resulted from the enactment into Irish Law of various EU Directives on health and safety in the work place, covers participatory ergonomics, personal fitness, manual handling, use of hand tools and machinery and display screen work.

Introduction

The Safety, Health & Welfare at Work (General Application) Regulations 1993 incorporated into Irish law Council Directive 89/654/EEC, on the minimum health and safety requirements for the work place. Under Regulation 5 it is the duty of the employer to ensure that measures are taken for the safety and health and protection of employees, these measures to take account of changing circumstances and the general principles specified in the First Schedule. The First Schedule at item (d) reads “the adaptation of work to the individual, especially as regards to the design of the place of work, the choice of work equipment and the choice of systems of work, with a view, in particular, to alleviating monotonous work and work at a predetermined work rate and to reducing the effect on health”. This means a primary duty of the employer is the adaptation of work to the individual. This is another way of saying fitting the task to the human, an approach defined within the scope of ergonomics (Kramer and Grandjean, 1997). Consequently, ergonomics has a central role in Irish Health & Safety legislation. This role covers participatory ergonomics, personal fitness, manual handling, use of tools and machinery and display screen work. The work factors stated in the First Schedule of Regulation 5 at item (d) are considered potential risk factors for ergonomic injuries.
Ergonomic injuries result from a mismatch between the demands of the working task and the capacity of the working person to meet these demands, generally when the former exceeds the latter or the person is placed in a situation of overload (Pheasant 1996).
Participatory Ergonomics

Participatory ergonomics got statutory recognition through the Safety Health & Welfare at Work Act 1989, which gave the employee the right to make representations on health and safety in the work place (Section 13 (2)), and obligated the employer to take account, as far as is reasonably practicable, of representations (Section 13 (1)). The General Application Regulations 1993 advanced this recognition in Regulation 12, which dealt with consultation and participation of employees. Regulation 12 (1) (a) places an obligation on the employer to consult employees on any measure proposed in the work place which substantially affects the safety, health and welfare of employees. Regulation 12 (1) goes on to provide for, in the event of a lack of competent personnel at the place of work, the obtaining of the services of a competent person (whether under contract of employment or otherwise) for the purpose of ensuring, in so far as is reasonably practicable, the safety and health at work of employees.

The need for competent personnel is exemplified by Devereux et al. (1998), who found that neglecting professional ergonomics input in manual handling tool design by a participatory group resulted in a poor design solution that also had the potential to increase the risks due to manual handling. In a description of a participative strategy to reduce the risk of musculoskeletal disorders, Devereux & Buckle (1999) considered that expert ergonomics input provided objective and subjective data to support many of the intervention strategies that included equipment redesign, safer work system training and risk behaviour change.
Personal Fitness and Suitability

Regulation 15 of the 1993 Regulations sets out the duties of the employer in relation to health surveillance and defines health surveillance as the periodic review of the health of employees so that adverse variations in their health which may be related to working conditions are identified as early as possible. Health surveillance should include inquiry into the employee’s state of health and any changes in that state of health, the work and work station, and objective and subjective consideration of other factors including social factors and the influence of factors related to the working environment. Regulation 13 deals with the issue of training and provides for adequate safety and health training including, in particular, information and instruction relating to the particular task and work station involved and the protection of particularly sensitive risk groups of employees against dangers which specifically affect them.

Manual Handling

The Factories Act 1955 dealt with lifting excessive weights and read “a person shall not be required to lift, carry or move any load so heavy as to be likely to cause injury to him”. This indicates an ergonomic approach almost 50 years ago. It is unambiguous in that each person has to be treated as an individual and within his/her capacity. This contrasted with the general thrust of the Common Law, where the overriding standard was custom and practice. A retrograde step was taken when the Factories Act, 1955 (Manual Labour) (Maximum Weights and Transport) Regulations 1972 were introduced, which set maximum acceptable limits for males (55 kg/121 lbs) and females (16 kg/35.2 lbs). These were revoked 20 years later in light of advancing ergonomic knowledge.
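The metric and imperial figures quoted above for the 1972 limits can be cross-checked with a short conversion. The sketch below is illustrative only: the function and variable names are assumptions introduced here, and the only data used are the limits quoted in the text and the standard definition of the pound.

```python
# Exact definition of the international avoirdupois pound in kilograms.
KG_PER_LB = 0.45359237

# Limits as quoted in the text: (kilograms, pounds as printed).
QUOTED_LIMITS = {"male": (55, 121.0), "female": (16, 35.2)}


def kg_to_lb(kg):
    """Convert a mass in kilograms to pounds."""
    return kg / KG_PER_LB


def discrepancy_lb(kg, quoted_lb):
    """Difference between the exact conversion and the printed pound figure."""
    return kg_to_lb(kg) - quoted_lb


for group, (kg, quoted_lb) in QUOTED_LIMITS.items():
    exact = kg_to_lb(kg)
    gap = discrepancy_lb(kg, quoted_lb)
    print(f"{group}: {kg} kg = {exact:.2f} lb (quoted {quoted_lb} lb, gap {gap:.2f} lb)")
```

The printed pound values match a rounded factor of 2.2 lb per kg, and differ from the exact conversion by well under half a pound in both cases.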
Council Directive 90/269/EEC on Minimum Health & Safety requirements for the manual handling of loads, where there is risk particularly of back injury, was introduced into Irish Law through Regulations 13, 27 and 28 of the 1993 Regulations. In the definition of manual handling of loads given in Regulation 27 the expression “unfavourable ergonomic conditions” is used. This EU Directive was based on the report of the ad hoc Committee (President Professor P.Davis) on lumbar hazards at work, and this was the first time that manual handling legislation was given a scientific basis.

Regulation 28 of the 1993 Regulations sets out the duties of the employer in relation to manual handling, the first of which (Reg. 28 (a)) is to take appropriate organisational measures, in particular mechanical means, to avoid the need for manual handling of loads where practicable. Regulation 28 (b) and (c) deal with situations where the need for manual handling of loads cannot be avoided. They provide for appropriate means to reduce the risk and for the organisation of the work station in such a way as to make manual handling as safe as possible, having regard in both instances to the Eighth Schedule, which sets out the reference factors for the manual handling of loads under the headings: characteristics of the load, physical effort required, characteristics of the work environment and requirements of the activity. Regulation 28 (d) says that the employee should be advised, where possible, of the precise weight of the load and the centre of gravity of the heaviest side when the package is eccentrically loaded.

Manual handling is specifically mentioned in Regulation 15 on Health Surveillance and also Regulation 13, which deals with training. In both Regulations the employer has to take account of the Ninth Schedule.
The Ninth Schedule says that an employee may be at risk if he:
• is physically unsuited to carry out the task in question
• is wearing unsuitable clothing, footwear or other personal effects
• does not have adequate or appropriate knowledge or training

Whilst not referred to in the Eighth and Ninth Schedules, social factors do have statutory recognition in relation to manual handling under Regulation 5 and cannot be ignored. Devereux (1997) assessed whether an interaction between physical and psychosocial work factors was associated with an increased risk of low back disorders and other work related musculoskeletal disorders. It was concluded that the combination of high physical and high psychosocial exposure increased the magnitude of the risk to greater than the sum of the risk for both exposures acting relatively independently, indicating an interaction effect for the low back and wrist/hands. This was, in effect, a finding of a synergic effect between physical and psychosocial work factors in relation to the lower back (Devereux et al, 1999).

Plant and Machinery

The Safety, Health & Welfare at Work (Construction) Regulations 1995 give effect to Council Directive 92/57/EEC. Regulation 124, dealing with installations, machinery and equipment, reads: “all installations, machinery and equipment, including hand tools whether power operated or not shall be properly designed and constructed taking account, as far as possible, of the principles of ergonomics”. Sanders & McCormack (1993), in dealing with the principles of hand tool and device design, indicated three primary principles: (a) the maintenance of a straight wrist, (b) avoidance of tissue compression stress and (c) avoidance of repetitive finger action. In terms of machinery, particularly on a building site to which these Regulations apply, the primary exposure is vibration. In a review on health effects of long term occupational exposure to whole body vibration, Wickstrom et al. (1994) concluded that there was a possibility of an excess risk of injuries and/or
disorders in the lower back where whole body vibrations are combined with unsuitable postures and prolonged sitting without pauses. The options open to an employer to reduce the exposure of workers to hand arm vibration are to: (1) select tools with the lowest level of vibration, (2) properly maintain the tools and keep cutting tools sharpened, (3) use vibration reduction (damping) gloves, (4) minimise the grip needed to hold and control the tool, (5) alternate tasks, (6) limit the daily use of vibrating tools, (7) provide long breaks while using vibrating tools and (8) limit the number of days per week when vibrating tools can be used.
Display Screen Work
Regulation 31 (giving effect to Council Directive 90/270/EEC) deals with the duties of the employer in relation to display screen equipment. The general duties involve the analysis of the work station, particularly as regards possible risk to eyesight, physical problems and problems related to stress, the taking of appropriate measures to avoid these risks and the planning of employees' activities in such a way that daily work on a display screen is periodically interrupted by breaks or a change in activity which reduce the work load at the display screen. In addition the Tenth Schedule attached to this Regulation sets out the minimum requirements for all display screen equipment under two broad headings, (1) equipment and (2) environment. The Eleventh Schedule sets out the minimum requirements for workstations under three headings, equipment, environment and employee/computer interface. Equipment requirements are set out under display screen, keyboard, work desk or work surface and work chair. Environmental considerations are reflections and glare, noise, heat and humidity.
In designing, selecting, commissioning and modifying software and in designing tasks using display screen equipment the employer shall take into account the following principles: (a) software shall be suitable for the task, (b) software shall be easy to use and where appropriate adaptable to the employee's level of knowledge and experience, (c) systems shall provide feedback to employees on their performance, (d) systems shall display information in a format and at a pace which are adapted to employees and (e) the principles of software ergonomics shall be applied, in particular to human data processing.
Conclusion
As set out above, ergonomics has general recognition in health and safety legislation by virtue of the “adaptation of work to the individual”. Specific mention is made of ergonomics in the manual handling regulations and the construction regulations dealing with machinery and hand tools. There are as yet no reported judgements which would give an indication of the attitude the Courts will take to the interpretation of the Regulations in practice. This is of particular interest to ergonomists as it will indicate the approach to scientifically based regulations by a common law system which for centuries has been based on custom and practice. This may not be as traumatic as feared by some because custom and practice of its nature changes with time and generally adapts to advancing knowledge, in this particular instance advancing ergonomic knowledge.
References
Council Directive 90/269/EEC. O.J. 21.6.90, No. L 156/13.
Council Directive 90/270/EEC. O.J. 21.6.90, No. L 156/14.
Council Directive 92/57/EEC. O.J. 26.8.92, No. L 245/6.
Devereux, J. (1997) The Study of interactions between work risk factors and work related musculoskeletal disorders. Ph.D. Thesis, University of Surrey.
Devereux, J., Buckle, P. (1999) A Participative Strategy to reduce Risks of Musculoskeletal Disorders. In: Hanson, M.A., Lovesey, E.J., & Robertson, S.A., Eds. Contemporary Ergonomics 1999. London: Taylor & Francis.
Devereux, J.J., Buckle, P., & Haisman, M.F. (1998) The evaluation of a hand-handle interface tool (HHIT) for reducing musculoskeletal discomfort associated with the manual handling of gas cylinders, International Journal of Industrial Ergonomics, 21, 23–24.
Devereux, J., Buckle, P., Vlachonikolis, I. (1999) Interactions between physical and psychosocial risk factors at work increase the risk of back disorders: an epidemiological approach, Occupational and Environmental Medicine, 56, 343–353.
Factories Act 1955. Dublin Stationery Office.
Kroemer, K., Grandjean, E. (1997) Fitting the Task to the Human, 5th edn, London: Taylor & Francis.
Pheasant, S. (1996) Body Space: Anthropometry, Ergonomics and the Design of Work, 2nd edn, London: Taylor & Francis.
Report of the ad hoc Committee on lumbar hazards at work to the Director General, Division V, Commission of the European Communities (1986). Doc. No. 2080/86 EN.
Safety Health & Welfare at Work Act, 1989. Dublin Stationery Office.
Safety Health & Welfare at Work (Construction) Regs. 1995. Dublin Stationery Office.
Safety Health & Welfare at Work (General Application) Regs. 1993. Dublin Stationery Office.
Sanders, M.S., McCormick, E.J. (1992) Human Factors in Engineering and Design, 7th edn, New York: McGraw-Hill.
Wickstrom, B., Kjellberg, A., Landstrom, U. (1994) Health Effects of Long Term Occupational Exposure to Whole Body Vibration: A Review, International Journal of Industrial Ergonomics, 14, 273–292.
PUBLIC TRANSPORT AND THE DISABILITY DISCRIMINATION ACT 1995 Fiona Bellerby (formerly of Loughborough University), Ergonomics Consultant, Davis Associates Ltd, Wyllyotts Place, Potters Bar, Herts EN6 2JD, UK
The Disability Discrimination Act 1995 (DDA) has received various criticisms since entering Parliament as a Bill in January 1995 for doing little to end discrimination against disabled persons. This project aimed, therefore, to consider the advantages of the Transport section and Goods, Facilities and Services section of the Act and highlight the shortfalls of air, rail, bus and coach travel for those with special needs. Five focus group discussions and three individual interviews were conducted with disabled and elderly people to determine which of their special needs are not being met. To balance the argument, interviews were conducted with service providers to determine their disability policies and the actions that are being taken, or proposed for the future, to increase access to their services, as well as the constraints under which they must operate.
Introduction
The DDA came into force in November 1995. Since then doubts have been raised as to whether it will be fully implemented, and the Act has been criticised for its lack of depth and completeness. This project’s aim is to study the Transport sections of the DDA and discuss its relevant components with service providers and users. The outcome of the research is a set of recommendations to be included in proposed changes to transport systems. This can limit later costs to service providers, improve their public image and enable more people to travel on public transport vehicles.
Methodology
Twelve representatives, including directors of operations, of 10 Leicestershire operating companies (an airport, 5 rail operators, 3 urban bus operators and 1 inter-urban coach operator) were interviewed individually. In addition, 49 elderly and disabled people were interviewed in groups or individually. They ranged from 16 to over 75 years old; 35 were physically disabled, 7 were visually impaired and 1 was hearing impaired.
The remaining 7 users were elderly people who did not like to term their ailments as a disability. Interviews with service providers and users took place at their premises or meeting place, and with their permission the interviews were recorded on audio tape. The service provider interviews covered the DDA with regard to transport and the user interviews followed a typical journey using public transport to enable group
members to visualise their actions. The user group members also completed a questionnaire to record their personal details. The audio tapes were transcribed and the information provided by the users was categorised and summarised into tables with the relevant reply from the service providers. Finally, recommendations were produced either from the literature, or by devising a compromise solution between the needs of the users and the restrictions imposed upon the service providers.
Literature Review
In Britain in 1991 there were 2.1 million people over the age of 80. This will increase to 3.2 million by 2021 (Stewart-David, 1996). In 1997 there were almost 1.1 million people who were economically active and who had a long-term illness (Great Britain, 1997). Before transport systems become unacceptable to users we need to address the requirements of an ageing population as well as meeting the needs of fringe members of society (Geehan & Suen, 1993). If people with special needs are to be fully integrated members of society, they need access to public transport (ECMT, 1989; Geehan et al, 1992). Without it they cannot access education and business, which in the UK are the greatest reasons for using public transport (Great Britain, 1997). To travel, people need information for pre-planning, in the terminal and in the vehicle. However, there are no consistent methods by which information is communicated by operators to users. The amount of information is often insufficient, in-vehicle information is confusing, and signage is too small, inappropriately placed, of little contrast and poorly maintained. Information is provided visually or auditorily, rarely in both forms, and usually excludes tactile information (Arnold & Wallersteiner, 1994). Information systems need to harness users’ capabilities and be compatible with the users’ perceptual, cognitive and behavioural characteristics.
Regarding physical access, two of the most common complaints about buses are the high steps and the distance between the vehicle and the kerb (ECMT, 1989; Petzall, 1993). The Petzall study determined that uniform steps were needed, as they require fewer and smaller trunk movements. It is also preferable for the bottom step to be in line with the pavement, as it reduces leg movement, and the pavement edge marks the starting line of steps for visually impaired people. It is important for service providers to communicate with users to determine their actual needs and identify their perceptions of public transport. When solutions have been proposed it is important to carry out user testing of transportation technologies to avoid designing with one group in mind and presenting barriers to others (Abdel-Aty and Jovanis, 1997).
Disability Discrimination Act 1995
The Act defines a disability as a physical or mental impairment which has a substantial and long-term adverse effect on [a person’s] ability to carry out normal day-to-day activities (1(1)) and therefore a disabled person is a person who has a disability (1(2)).
Part III Goods, Facilities and Services
This part of the Act came into force in October 1999 and covers access to, and use of, communication and information services. It is unlawful for a service provider to discriminate against a disabled person by “refusing to provide, or by not providing, any service which he provides, or is prepared to provide to
members of the public”. It must not be impossible or unreasonably difficult for the disabled person to make use of such service. The standard of service, the manner in which, and terms on which it is provided must not be discriminatory. It therefore follows in Section 21 that if it is impossible or unreasonably difficult for disabled persons to make use of a service it is the service provider’s duty to take all reasonable steps in order to change the practice, procedure or policy so that it no longer has that effect.
Part V Public Transport: Public Service Vehicles (PSVs)
Section 40(1), and sections 46 and 47b which refer only to rail vehicles, state that regulations will be produced to ensure that it is possible for disabled persons: (a) to get on and off regulated public service vehicles in safety and without unreasonable difficulty (and in the case of disabled persons in wheelchairs to do so while remaining in their wheelchairs); and (b) to be carried in such vehicles in safety and in reasonable comfort.
Mapping user needs to service provision
Access to check-in and information desks
There are few check-in and service desks in terminals at a height suitable for a seated wheelchair user. Airports are unable to provide these as the desks are designed to protect staff and equipment. Railway stations will provide lower desks as and when stations are refurbished, but in the meantime service providers are willing to serve passengers in the waiting area.
Access to vehicles
Bus steps are too high for many passengers, and low floor buses will only be introduced as older buses need replacing. One alternative is “castle kerbs” which raise the level of the pavement to that of the bus. However, low floor buses and “castle kerbs” become redundant if the bus cannot park flush with the kerb, and a good run-in to the parking bay is necessary to achieve this. Therefore saw-tooth layouts need to be installed to provide a greater run-in.
Access to stations and platforms
Physically disabled users stated that they were unable to access stations. Information provided by a station leaseholder showed that, of their stations, 70% are unstaffed and 70% are wheelchair accessible from the main entrance; however, only 60% allow access to all platforms and 15% do not afford access to any part of the station. These problems should be overcome by Railtrack’s £300 million plan to upgrade and refit stations between 1999 and 2020.
Facilities: Toilets
Disabled toilets are often provided but they generally only accommodate people in wheelchairs no larger than the government’s standard wheelchair. The designers of trains find the positioning of disabled toilets
difficult because they take up a lot of room and receive a great deal of criticism from groups representing disabled people.
Personal Assistance
The availability of personal assistance can mean the difference between a person being able to travel or not. However, many people are not aware of the services that are available, the costs, or whether they will be subjected to degrading manual handling practices or forgotten. All operators have staff training schemes in operation, but the operators state that it is not possible to ensure that individual staff members put this into practice.
Space and Seating
The physically disabled users agreed that there was not enough leg room and too few spaces for wheelchair users. However, more spaces and increased leg room will reduce seating capacity, causing problems for ambulant disabled persons travelling at peak times, and cause economic problems for the service providers.
Colour coding and tactile cues
Three of the visually impaired users complained that the bright yellow used for colour coding causes them eye strain and headaches. As an alternative red would be preferable. The colour choice coincides with the Rail Vehicle Accessibility Regulations. Tactile cues are often provided for visually impaired persons; however, none of those interviewed were aware of the existence of many of the cues, because nobody has told them! Users need more appropriate information to make them aware of the presence of such cues.
In-terminal information
Users with a visual impairment found that too much audible information was presented at once and that announcements are indistinguishable from background noise. One way around this can be found in new information sources such as telephone-information kiosks which provide information, e.g. the next service from that platform. However, the unreliability of services makes in-terminal information redundant.
Bus and train operators are aware of, and are experimenting with, Real-Time scheduling, which provides accurate information transmitted via satellite from the vehicle to the terminal.
In-vehicle information
Travellers with visual impairments or those in an unfamiliar place are unsure of their location, especially on buses and trains, where information is not always provided and is often inaudible or illegible. On buses and trains users require visual and audible information that is clear, timely and accurate.
Conclusion
There are many aspects of urban, inter-urban and international travel that make it difficult or impossible for disabled and elderly people to travel independently and there are many feasible solutions to these problems. However, the solutions cannot be implemented overnight and many operators will not provide the solutions without guidance from the Government. Therefore this report acknowledges that whilst the needs
of disabled and elderly travellers are not being met, the service providers need time and guidance to implement the relevant improvements. The DDA covers many of the areas highlighted by the users for improvement; however, some important areas, such as personal assistance, are not included in present regulations. The inclusion of the recommendations in this paper which are not already covered by the DDA will increase the usability and accessibility of transport systems for those with special needs, making the British transport system truly public.
Recommendations
▪ The provision of lower service desks for wheelchair users.
▪ Improved access to vehicles with castle kerbs, low floor buses and sawtooth access to bus stops.
▪ Toilet facilities need to be more accessible to those in larger wheelchairs.
▪ More flexible seating needs to be introduced to allow spaces for wheelchairs to be created with ease.
▪ Audio-visual information that is distinguishable and readily available is needed for all journey stages from pre-planning to the destination for people of all abilities.
References
Abdel-Aty, M.A. and Jovanis, P.P., 1997, Using new technologies to meet the transportation demand of the special needs travellers - a framework, Proceedings of the 30th international symposium on automotive technology and automation.
Arnold, A.K. and Wallersteiner, U., et al, 1994, Human factors evaluation of information systems on board public transportation vehicles: implications for travellers with sensory and cognitive disabilities, Proceedings of the 12th triennial congress of the International Ergonomics Association, 15th–19th August.
ECMT, 1989, Transport for People with Mobility Handicaps, European Conference of Ministers of Transport Seminar, Dunkirk, 19th November.
Geehan, T., Arnold, A.K., Suen, L., 1992, An ergonomic assessment of assistive listening devices for travellers with hearing impairments, Proceedings of the 6th international conference.
Geehan, T., Suen, L., 1993, User acceptance of advanced traveller information systems for elderly and disabled travellers in Canada, Proceedings of the IEEE-IEE vehicle navigation and information system conference, Ottawa, Canada, 12–15 October.
Great Britain, 1997, Transport: Social Trends 27, Office for National Statistics.
Great Britain, 1998, The Rail Vehicle Accessibility Regulations, London: HMSO.
Petzall, J., 1993, Traversing step obstacles with manual wheelchairs, Applied Ergonomics, 24(5), 313–326.
Stewart-David, D., 1996, Planning transport for octogenarians, Global Transport, 66–68.
THE DSE DIRECTIVE: WHAT DOES IT MEAN? Nigel Heaton & Andrew Baird Human Applications, 139 Ashby Road, Loughborough LE11 3AD, UK
The European Directive 90/270/EEC “on the minimum safety and health requirements for work with display screen equipment” has now been with us for seven years, but there remains some confusion over its implementation. The European Court of Justice (ECJ) has recently delivered a judgement which may have far reaching implications for companies using DSE. Given the significant recent changes in the nature of DSE work (including homeworking, hot desking, hoteling, etc.) many organisations will need to undertake a radical re-think of the way in which they manage DSE-related health & safety issues. A user centred approach is needed and not the workstation driven approach which the letter of the law suggests.
Introduction
On 1 January 1993 the UK introduced new Regulations under the Health & Safety at Work etc. Act 1974. The Health & Safety (Display Screen Equipment) Regulations 1992 (the ‘DSE Regs’) represented the UK’s response to a Directive promulgated by the European Commission. These regulations were part of a ‘six pack’ which included the framework “Management of Health & Safety at Work Regulations 1992”. The DSE Regs were supported by Guidance issued by the Health & Safety Executive (HSE). Seven years on, the DSE Regulations are still misunderstood, poorly applied or simply ignored by many organisations. For example, the HSE published a report (Honey et al, 1997) indicating that 39% of respondents to their study had undertaken risk assessments of workstations. As this is the main requirement of the DSE Regs, on the basis of this study around 61% of the respondents are in breach of the Law! So what do the Regulations mean, and how can they be applied effectively in the modern working environment?
Background
Essentially, there are five critical elements to the process:
• The need to assess, reduce and monitor risk (risk management)
• The definition of a user and how users are to be involved in the assessment process
• The definition of a workstation and the impact of the Schedule of Minimum Requirements
• Planning work routines and supervising their implementation
• Training and informing users
Modern health & safety law revolves around the concept of risk management and this is covered within the DSE Regs as Regulation 2. In essence, you are required to assess the risks posed by DSE, reduce them as far as is reasonably practicable and monitor the situation into the future. In theory, employers must carry out assessments of ‘workstations’, but in reality, the Regulations only make sense if the risk is assessed in relation to the user of the DSE. We do not see broken chairs becoming injured or suing organisations; it is people who have the potential to become injured, therefore it is people we must protect. So the first concept associated with any sensible interpretation of the Regulations is to make your approach user-centred (i.e. the only workable approach to meeting the requirements of the regulations is to take a classic ergonomics approach). We have seen many organisations spend many man-months trying to define which employees and contractors are covered by the regulations. Regulation 1 describes two types of individual:
• a user
• an operator (who is a self-employed user)
Both are covered by the Regulations, provided they habitually use DSE for a significant part of their work. The HSE offers guidance on who a user/operator is by providing a list of task descriptions aimed at helping organisations decide and examples of those who are, might be and are not users. Our experience is that many organisations fail to understand some genuine problems associated with defining users. Firstly, any definition will be arbitrary. There is no legal, scientific or medical case to be made for a definition based, say, on simple exposure (e.g. you are a user if you use DSE for 2 or more hours per day). By implication, one second under and you no longer qualify. Secondly, the nature of most DSE usage is that it varies in time. Our exposure one day is different from our exposure the following day.
Job flexibility and constant change mean that we can no longer be confident that yesterday’s ‘non-user’ will not be today’s ‘user’. Thirdly, we must examine the purpose of Risk Assessment. That is to control the risks to individuals as far as is reasonably practicable. The difficulty of defining a ‘user’ is that in the event of an injury to a ‘non-user’, the only defence as to the absence of a risk assessment is our arbitrary definition. Furthermore, under the Management of Health & Safety at Work Regulations 1992, we are required to have an assessment at least covering the activity in question. No DSE Risk Assessment will ever be wasted, as it will act as a general assessment, even if DSE related risks are negligible (e.g. it should still pick up any general problems). The solution is simple: organisations should define a ‘user’ as any person who uses DSE. This inevitably causes organisations concern. However, if managed properly, such a definition can yield substantial benefits. If everybody who uses DSE is a user it rapidly becomes apparent that not all users are equally exposed to risk. Indeed, if you use DSE for a few minutes a day, it is virtually inconceivable that you would suffer an injury as a result of your usage. However, if such a user is given basic information about the need to report any increase in usage and to report any problems early, and if such information is recorded, then we have the beginning of an audit trail which will protect both user and organisation into the future. It is apparent that requesting users to flag problems and changes is a reasonable request (indeed both Section 7 of the Health and Safety at Work etc. Act and the Management of Health and Safety at Work Regulations explicitly recognise the employee’s role in this respect).
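As a minimal sketch of how such an audit trail might be kept, the following Python fragment treats every person who uses DSE as a user and stores dated, self-reported entries on usage and problems. The class names, the fields and the review rule (flag a review when a problem is reported or when reported usage rises) are illustrative assumptions; they are not laid down in the Regulations or in HSE guidance.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class UsageReport:
    """One self-reported entry in a user's audit trail (hypothetical structure)."""
    reported_on: date
    hours_per_day: float
    problem: str = ""  # free-text note of any discomfort; empty if none reported


@dataclass
class DseUser:
    """Anyone who uses DSE at all is a user; no exposure threshold is applied."""
    name: str
    reports: List[UsageReport] = field(default_factory=list)

    def record(self, report: UsageReport) -> None:
        # Every report is kept, so the history itself forms the audit trail.
        self.reports.append(report)

    def needs_review(self) -> bool:
        """Flag a review on a reported problem or a rise in reported usage."""
        if not self.reports:
            return True  # nothing recorded yet, so nothing has been assessed
        latest = self.reports[-1]
        if latest.problem:
            return True
        if len(self.reports) >= 2:
            return latest.hours_per_day > self.reports[-2].hours_per_day
        return False


# Example: an occasional user whose reported usage later increases.
user = DseUser("A. Example")
user.record(UsageReport(date(2000, 1, 10), hours_per_day=0.2))
user.record(UsageReport(date(2000, 3, 1), hours_per_day=3.0))
review_due = user.needs_review()
```

The sketch deliberately avoids any cut-off such as "2 or more hours per day"; it only reacts to change and to reported problems, which is the behaviour the text asks users to flag.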
Furthermore, we create an audit trail which establishes that we are assessing all workstations, all users and managing risks. In 1995 the European Court of Justice (ECJ) was asked to rule on two linked cases in Italy. The Italian interpretation of the EC Directive was questioned; in particular, clarification was sought as to who was a user, whether the eye sight provision was appropriate (and legal) and which workstations were covered by the technical annexe to the Directive (in the UK this is a Schedule of Minimum Requirements). The ECJ judgements (C-74/95 and C-129/95) ruled:
• A user cannot be defined. The court issued no definitive ruling on who would constitute a user.
• An eye sight policy which was based on age, gender, race, etc. was in contravention of existing European Law and as such policy should be independent of those factors (e.g. not one rule for the over 50s and another for everyone else).
• All workstations, whether they are used by a user or not, must meet the minimum requirements laid down in the technical annexe.
The implications of this ECJ ruling may be quite profound and far reaching. In the UK, the Schedule of Minimum Requirements to the DSE Regulations is unusual, both in terms of health and safety law and in general UK law. It represents a series of mandatory minimum requirements, i.e. not covered by the concept of reasonable practicability. Although the HSE have prefaced the Statutory Instrument with a series of clauses relating to the presence of the component, the need to consider the health, safety and welfare of the person and the need to consider the inherent characteristics of the task, in broad terms, failure to meet the minimum requirements is a criminal offence and could be used in a civil court as evidence of a breach of duty of care. The extension by the ECJ to all workstations means that organisations are required to show that every workstation used meets these minimum requirements.
This includes home workers, temporary workstations (e.g. hotel rooms) and hot desking environments. However, under UK Law, a workstation must meet the minimum requirements only as it affects the health, safety and welfare of the person at work. The only way to evaluate this is to assess risk, not against the Schedule but against RISK. This neatly deals with a second problem, namely that the presence (or absence) of a particular feature tells you nothing about risk. For example, a monitor might be capable of being configured to display information and characters appropriately, as well as being free from reflections and glare. However, once a user starts using it, the choice of font, colour and positioning of the monitor may cause significant problems. The main pitfall to avoid here is checklist assessment. We do not care about the features; what we care about is actual usage! This is ergonomics: who, doing what, with what and where. It is also useful to ask “why?”. However, since the Directive was first drafted and since the HSE issued their guidance, ‘work’ has changed. For example, we now see an enormous number of workers within “call centres” which were relatively rare 10 years ago. The Regulations are written in a robust way (e.g. see the definition of Display Screen Equipment), but their interpretation must reflect modern ways of working and the risks associated with them. This brings us neatly to the issue of daily work routines (Regulation 4). Whilst most organisations are coming to terms with the ‘physical’ elements of the DSE Regs (at least in terms of routine procurement), much confusion still exists regarding task issues. Indeed, the recent high profile litigation involving the “Frimley Five” hinged to a large extent on work routines and the way in which they were monitored. In data processing and call centre environments, there is often little variation in the job. In these environments, in
particular, we must explicitly engineer a suitable system of breaks and changes of activity. In more varied environments (e.g. where a user is also involved in telephone calls, meetings, clerical work, etc.) it is often considered that natural variation will suffice. It may well, but you cannot simply assume that without looking at the task in question and ensuring adequate supervision. Whilst there are duties upon employees, employers have an absolute duty to “manage” work patterns and that includes protecting workers from their own keenness, stupidity or fear of redundancy. If we cheerfully allow workers to work through breaks whilst suggesting it is their choice, we are in straightforward breach of the law. It is not uncommon for organisations to balk at this as unreasonable, suggesting that it is almost impossible to manage what can be a well hidden problem. The solution to this is the involvement of staff at all levels, from the board, through all levels of management to “end users”. The risk assessment process must be an integral part of management and quality systems, i.e. performance in relation to health & safety issues must be part of all reviews and audits. It should be impossible for managers to be rewarded (e.g. in relation to budget cuts) for scrimping on employee health, safety and welfare. It’s a cliché, but health & safety is everybody’s problem. Following seven years of experience with a wide variety of clients, we have found that the most effective way to provide information and training is as an integral part of the risk assessment process. Risk assessors become educators and facilitators rather than simple box tickers. Clearly risk assessors need support and users benefit greatly from additional reminder materials such as user guides (in paper or on-line form), but essentially it is the risk assessor who should be the local champion. The process then should be characterised by dialogue and participation.
Our experience has been that the risk assessment process has proved an invaluable dialogue tool which has reaped benefits way beyond technical compliance with legislation. Improved understanding all round also helps deal with the fact that DSE related problems are generally multi-factorial. Problems tend to arise due to a combination of work patterns, workstation constraints, lack of understanding (often complicated by misinformation), etc. We would refer to these as “usage” problems which can easily occur with compliant furniture and within caring organisations. Workstations are often cluttered, poorly laid out and inappropriately adjusted. The fact that a chair offers height adjustment is not the end of the story. By definition, adjustability allows users to get it wrong and therefore they must understand how to get it right (workstation set-up) and work in an environment where poor postures will be picked up by colleagues and managers. Poor postures are a major source of musculoskeletal disorders and yet little thought is given to educating users in the importance of posture and how they can work with healthy postures. It is our experience that poor postures are more common in many offices than acceptable ones and the problem is being compounded by technological changes which result in more time being spent at the desk.
Conclusions
Experience with the interpretation and implementation of the DSE Regs has shown the need for a classic ergonomics approach, i.e. user centred. Despite the wording of the regulations, which focuses on workstations, the spirit of the regulations concentrates on users. To meet successfully the requirements of the regulations organisations must involve users in the risk assessment process. The assessment must not focus on just the physical attributes of the workstation, but must also consider task factors, individual needs, environmental issues and organisational influences.
Not only must these issues be assessed, they must be understood well enough throughout the organisation that they can be successfully supervised and that user behaviour can be relied upon as a risk management tool rather than constituting a risk in itself. DSE is not inherently dangerous, but like any tool it can be misused. The challenge for organisations is not to feel a constant drive to buy more sophisticated furniture and newfangled ‘gadgets’ but to manage real-world use and then procure against genuine task and user needs.
Reference
Honey S, Hillage J, Frost D and LaValle I, 1997 “Evaluation of the DSE Regulations 1992” HMSO, ISBN 07176 1334 8
Methodology
HOW MANY PARTICIPANTS: A SIMPLE STATISTIC WITH SOME LIMITATION H.Arisz, H.Kanis & M.J.Rooden School of Industrial Design Engineering, Delft University of Technology, Jaffalaan 9, 2628 BX Delft, the Netherlands
Prediction of hitherto unobserved usability problems would constitute a contradiction in terms. Estimating the number of unobserved usability problems, such as during a users’ trial, seems feasible, based on the ratio of diminishing returns, or the increasing overlap in findings with more participants. On the basis of theoretical considerations and empirical data, a simple statistic, introduced in a previous paper, is discussed as to the inevitable, though limited, bias of its estimates and the insight it gives into the tenability of the ‘five-subjects-is-enough’ rule of thumb.
Introduction
In Arisz and Kanis (1999) a statistical approach is proposed as a possible means to monitor the number of participants in a user trial concurrently. In this approach, the total number of distinctive phenomena of interest found after an unlimited number (‘∞’) of participants, $F_\infty$, is estimated on the basis of the findings after $n$ participants. This insight can be used to assess the expected payoff (new information) of the extra effort (time, money) to involve more participants. In the present paper, the underlying mechanism of the statistical approach is discussed, especially as to the emergence of biased estimates. In addition, application of the approach to empirical data sheds light on the tenability of the rule of thumb that 4–5 participants would be enough to gather 80% of $F_\infty$. In the rest of this paper, the term usability problems is used as a short-hand indication of all kinds of possible phenomena of interest.
The statistic
The findings after $n$ participants consist of
– a number of distinctive usability problems, $F_n$, and
– the frequency of each problem.
In the previous paper, it is argued that current statistical approaches are not readily applicable to estimate $F_\infty$, as the sequential sampling of usability problems does not fit into neat statistical procedures. Each new participant may provide zero observations (e.g. no problems), repeat previous observations (rendering
overlap) and/or produce new observations (but never multiples of a single observation, as one problem is never observed twice in the same person). Finally, there is no condition for $\sum p_i$, with $p_i$ the probability of coming across usability problem $i$ in subsequently sampled participants ($i = 1 \ldots F_n$ after $n$ participants); $\sum p_i$ may be $\leq 1$ but may very well be $> 1$. Getting more confident estimates of separate $p_i$ the more participants are involved is uninformative for estimating $(F_\infty - F_n)$, the number of unseen usability problems after $n$ subjects. There is just no way to link observed $p_i$ with unobserved problems, unless extra knowledge is supposed to be available, such as in terms of interdependencies between different types of problems. But presumptions of that type require specifications which would render observational research into $F_\infty$ at least partly superfluous.
Diminishing returns
A rationale for the estimation of $F_\infty$ is the phenomenon of diminishing returns the more participants are involved. The simplest description of this phenomenon is the well-known formula
\[ F_n = F_\infty \left(1 - (1 - \bar{p})^n\right) \qquad (1) \]
with $\bar{p}$ as the average probability of coming across usability problems, $n$ the number of participants, and $F_n$ and $F_\infty$ as defined above. In generalising a suggestion of Nielsen and Landauer (1993), by the elimination of $\bar{p}$, the following expression can be derived for estimating $F_\infty$ on the basis of findings after $n$ consecutively involved participants:
\[ F_\infty = \frac{\bar{F}_1 \, \bar{F}_{n-1}}{\bar{F}_1 + \bar{F}_{n-1} - F_n} \qquad (2) \]
with $\bar{F}_1$ as the average of the $n$ observations (one for each participant) and $\bar{F}_{n-1}$ as the average of the $n$ combinations of the findings over $n-1$ participants, in order to produce the best empirical estimates for $F_1$ and $F_{n-1}$. Of note is that in expression (2) the denominator gives, on average, the overlap between the number of the different usability problems found after $n-1$ participants and the number of usability problems found with an extra participant (the $n$th).
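As an illustration only, expression (2) can be sketched in a few lines of Python. The function name `estimate_f_inf`, the representation of each participant's findings as a set of problem labels, and the three example participants are assumptions made for this sketch; they are not data or code from the studies cited. The expression follows from formula (1) because, with the same $\bar{p}$ for every problem, $F_n = F_{n-1} + F_1 - F_1 F_{n-1} / F_\infty$.

```python
def estimate_f_inf(findings):
    """Estimate the total number of problems from per-participant findings.

    findings: list of sets, one set of problem labels per participant
    (a hypothetical data layout chosen for this sketch).
    """
    n = len(findings)
    if n < 2:
        raise ValueError("at least two participants are needed")

    # F_n: number of distinct problems seen by all n participants together.
    f_n = len(set().union(*findings))

    # Average number of problems found by a single participant.
    f1_bar = sum(len(s) for s in findings) / n

    # Average number of distinct problems over the n ways of leaving one
    # participant out (the "n combinations over n-1 participants").
    f_prev_bar = sum(
        len(set().union(*(s for j, s in enumerate(findings) if j != i)))
        for i in range(n)
    ) / n

    # Denominator of (2): average overlap between one participant and the rest.
    overlap = f1_bar + f_prev_bar - f_n
    if overlap <= 0:
        # No overlap at all: no basis for a finite estimate (sketch's choice).
        return float("inf")
    return f1_bar * f_prev_bar / overlap


# Hypothetical example: three participants, five distinct problems observed.
example = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "C", "E"}]
print(estimate_f_inf(example))  # 39/7, about 5.57
```

In this made-up example five problems have been seen and the estimate is about 5.6, so roughly one further problem would be expected. The variable `overlap` is the average number of problems that one participant shares with the other $n-1$.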
In terms of diminishing returns, on average this overlap, as the number of common usability problems, asymptotically approaches $\bar{F}_1$ as the maximum attainable overlap for $n \to \infty$; that is, ultimately, with no new usability problems detected as new participants are involved. With $\bar{p}$ eliminated, the estimate of $F_\infty$ according to (2) is still based on formula (1), as if each usability problem has the same probability $\bar{p}$ of being found. This averaging of $p_i$ to a mean $\bar{p}$ needs further consideration as to its consequences.
Biased estimates due to averaging $p_i$
In Figure 1, the proportion of usability problems found according to formula (1), $F_n/F_\infty$, is plotted against the number of participants for two cases: (a) all usability problems have the same probability of being discovered, say $p_{ai} = 0.3$, and (b) two classes with an equal number of usability problems, the one with $p_{bj} = 0.5$, the other with $p_{bk} = 0.1$, so the overall mean $\bar{p}$ is the same for (a) and (b). Figure 1 illustrates that $F_\infty$ is underestimated by the adoption of an average, $\bar{p}$, while, actually, $p_i$ will always vary in practice. The reverse, estimation of $F_n$ given $F_\infty$, would consequently end up in an overestimation of $F_n$. In mathematical terms, the general rule describing the emergence of this type of difference is called Jensen’s Inequality. The previous paper discusses that biased estimates reflect the underestimation of the overlap between participants if an average $\bar{p}$ is adopted; for two participants, this underestimation of overlap turns out to be proportional to the variance of $p_i$, i.e. the more dispersion in $p_i$, the more the estimate of $F_\infty$ will be biased downward. It appears difficult to identify analytically a general tendency of the extent
Figure 1. The proportion of usability problems—Fn/F∞—found after n subjects
of bias, with a rising number of participants involved. In a hypothetical example given in the previous paper, the initial underestimation of $F_\infty$, computed according to (2), turns out to disappear quickly with increasing $n$. Note that the denominator in expression (2) is not biased, based as it is on empirically observed $\bar{F}_1$, $\bar{F}_{n-1}$ and $F_n$. It is the (implicit) mediation by formula (1) which renders an underestimated $F_\infty$. For that matter, the adoption of some $\bar{p}$ as an average of unknown $p_i$ is not the only source of a biased estimate of $F_\infty$ in practice; see the empirical example below. First, another consequence of the averaging of $p_i$ is examined: its effect on the proportion of usability problems found after $n$ participants, i.e. $F_n/F_\infty$.
Four/five participants enough for 80% of $F_\infty$?
This rule of thumb, as the operationalisation of the idea of diminishing returns (see above), seems to result from studies in which first $F_\infty$ is adopted on the basis of the observation of some tens of participants, followed by the conclusion that, in retrospect, knowing $F_4$ or $F_5$ would have been enough to gather some 80% of $F_\infty$ (cf. Virzi, 1992; Nielsen, 1994). In these studies, ‘neat’ retrospective estimates of $F_n$ are computed according to formula (1). Figure 1 illustrates that the five-is-enough rule may go astray in two ways. First, though $F_n/F_\infty \geq 0.8$ (the shaded area) for $p_a = 0.3$ has been reached after five participants, this is only after nine participants for case (b). In other words, in averaging $p_i$ as if all usability problems have the same probability of occurrence, the amount of information provided by a specific number of participants is proportionally overestimated. Secondly, Figure 1 shows that $F_n/F_\infty$ only amounts to approximately 0.4 for $p_{bk} = 0.1$ after five participants. Thus, the inherent difficulty of coming across operational problems with a relatively small $p$ (cf. Lewis, 1994) tends to be obscured by modelling the sequential identification of usability problems with an average for $p_i$.
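The two cases plotted in Figure 1 can be reproduced numerically with a short Python sketch. The function names are hypothetical, and each case is represented simply by a list of discovery probabilities (one entry per equally sized class of problems); the expected proportion found is the mean of $1 - (1 - p_i)^n$ over the problems, which is the quantity behind formula (1) when the $p_i$ are allowed to differ.

```python
def proportion_found(probabilities, n):
    """Expected share of problems seen at least once after n participants."""
    return sum(1 - (1 - p) ** n for p in probabilities) / len(probabilities)


def participants_needed(probabilities, target=0.8, limit=1000):
    """Smallest n for which the expected share reaches the target, if any."""
    for n in range(1, limit + 1):
        if proportion_found(probabilities, n) >= target:
            return n
    return None


CASE_A = [0.3, 0.3]  # case (a): every problem has p = 0.3
CASE_B = [0.5, 0.1]  # case (b): two equal classes, same mean of 0.3

print(participants_needed(CASE_A))  # 5
print(participants_needed(CASE_B))  # 9
print(round(proportion_found([0.1], 5), 2))  # 0.41 for the low-p class alone
```

Both cases share the same mean probability, yet the 80% level is reached four participants later when the probabilities are dispersed.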
All this is further dealt with in the next example.
An empirical example
In a study by Rooden et al. (1999) twenty practitioners predicted usability problems with a coffeemaker with extended functionality, including filtering the water and a timer function. Expert knowledge and information from users’ trials were used to make the predictions under different conditions; these conditions are not relevant for the sake of this example. One of the issues studied is to what extent the twenty ‘samples’ of predicted usability problems can be seen to be comprehensive. Figure 2 shows that the estimated $F_\infty$ after twenty participants is fairly well predicted after a handful of participants, with $F_{20} = 86$. In addition, $F_\infty$ tends to increase slightly the more participants are involved, which
Figure 2. Prediction of usability problems by experts in a study by Rooden et al. (1999). $F_\infty$ scaled on right vertical axis; $F_n/F_\infty$ and $\bar{p}$ on left vertical axis
goes together with a tiny decrease of $\bar{p}$. These tendencies reflect the fact that usability problems with the highest $p$, that is, those indicated most frequently by the experts, will, by definition, mainly be discovered first. Or, in other words, usability problems mentioned for the first time at the end will lower $\bar{p}$, which therefore cannot escape from overestimation in practice. This is the second reason, next to the averaging of $p_i$ (see above), why the estimation of $F_\infty$ based on expression (1) will always tend towards some underestimation. Both reasons thrive on dispersion in $p_i$. In the example of Rooden’s study, a few usability problems were mentioned by almost all experts, while no less than 31 usability problems were put forward by only one expert. Due to this relatively large number of low-$p$ usability problems, $F_n/F_\infty$ only amounts to 0.6 after five participants, in defiance of the 80% rule of thumb. Obviously, the very fact of monitoring the information gathered so far by computing $F_n/F_\infty$ would help prevent unthinking application of the rule of thumb.
Discussion
There is no magic means to delineate the unpredictable. The simple statistic we propose may help to counterbalance unwarranted optimism such as misguidance by the ‘five-is-enough’ rule. The limitation of
our statistic, the underestimation of $F_\infty$, needs further analytical exploration, with new empirical examples as anchors for the actual problems that have to be tackled, rather than drifting away into mathematical generalities which tend to be difficult to digest in the application-oriented area of design.
References
Arisz, H. and Kanis, H. 1999, Towards concurrent monitoring of the number of subjects in user trials. In M.A.Hanson, E.J.Lovesey and S.A.Robertson (eds.) Contemporary Ergonomics (Taylor & Francis, London), 417–421
Lewis, J.R. 1994, Sample Sizes for Usability Studies: Additional Considerations, Human Factors, 36, 368–378
Nielsen, J. and Landauer, T.K. 1993, A Mathematical Model of the Finding of Usability Problems. In Interchi ’93, 206–213
Nielsen, J. 1994, Estimating the number of subjects needed for a thinking aloud test, International Journal of Human-Computer Studies, 41, 385–397
Rooden, M.J., Green, W.S. and Kanis, H. 1999, Difficulties in usage of a coffee maker predicted on the basis of design models, Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting, 476–480
Virzi, R.A. 1992, Refining the Test Phase of Usability Evaluation: How many subjects is enough?, Human Factors, 34, 457–468
PSYCHOPHYSICAL METHODS FOR QUANTIFYING OPINIONS AND PREFERENCES Johan Engström & Peter C.Burns Department of Human-Systems Integration, Volvo Technological Development, Gothenburg, Sweden
This paper describes psychophysical scaling as a method for obtaining quantitative data on opinions and preferences, and compares it to standard, questionnaire-based, category scaling.
Introduction
Measuring opinions and preferences is of great importance in ergonomics as well as for product design in general. The most common method for obtaining subjective quantitative data is category scaling (CS), which usually takes the form of standard questionnaires where subjects express their views by choosing one of a fixed number of options. Despite its popularity, there are several problems with category scaling, including loss of information due to limited resolution of the categories, truncation of the response region by the questionnaire format and unequal spacing of the categories. Thus, obtaining quantitative data by assigning numerical values to the categories is problematic since ratio or interval properties of the category scales cannot be guaranteed. Quantitative statements about the magnitude of opinions (e.g. “design A is twice as preferred as design B”) are not theoretically justified on the basis of category scaling data, although the practical significance of these problems is still hotly debated. At least some researchers (e.g. Lodge, 1980) argue that data obtained from category scaling should be strictly treated as ordinal, thus excluding not only the possibilities of magnitude scaling, but also hypothesis testing by means of parametric statistics.
An alternative to category scaling is psychophysical scaling (PS) which, at least in theory, makes possible validated ratio scales of opinions, thus justifying hypotheses about the magnitude of opinions and the use of parametric statistics for testing them. These techniques are generally more elaborate than category scaling, and this is probably an important reason why they are rarely used in practical applications. Still, PS has been successfully applied in a variety of domains of opinion measurement, e.g. the measurement of public political support (Lodge et al., 1975) and formative evaluation of interior designs in trains (Han et al., 1998). In the former study, a comparison was made between psychophysical and category scales with labelled categories, which clearly showed that the CS data were not properly scaled for quantification.
PS builds on classical work in psychophysics, which originally was mainly concerned with the relation between sensory perceptions and physical stimuli. As proposed by Stevens (1957), in a wide range of situations this relationship can be modelled by a power function:
\[ R = k S^{b} \qquad (1) \]
where $R$ is the perceived magnitude of the physical intensity $S$, and $k$ and $b$ are constants. In essence, this relationship, known as Stevens’ law, states that equal stimulus ratios produce equal response ratios. While $k$ can be regarded as merely a scaling factor, the exponent $b$ is characteristic of a particular sensory-stimulus relationship. Characteristic exponents have been empirically verified for a wide range of sensory continua, e.g. heaviness ($b = 1.45$) and loudness ($b = 0.67$) (Lodge, 1980). If the logarithms are taken of both sides of (1) we obtain
\[ \log R = b \log S + \log k \qquad (2) \]
Thus, $b$ corresponds to the slope of the straight line obtained when the logarithm of $R$ is plotted against the logarithm of $S$.
The psychophysical scale obtained for a particular sensory-stimulus continuum can be validated by means of a technique known as cross-modality matching (CMM). The basic idea of CMM is that, instead of just expressing the sensory magnitude numerically, subjects produce the response in a different sensory modality. In this case, the perceived magnitude of the stimulus $R_i$ equals the perceived magnitude of the response $R_j$. Thus, provided that Stevens’ law holds for both continua, using (1), the following relation is established:
\[ k_i S_i^{b_i} = k_j S_j^{b_j} \qquad (3) \]
where $S_i$ refers to the magnitude of the physical stimulus, and $S_j$ to the magnitude of the physical response. We can now write (3) as
\[ \log S_i = \frac{b_j}{b_i} \log S_j + \frac{1}{b_i} \log \frac{k_j}{k_i} \qquad (4) \]
Thus, if the psychophysical relationship to be validated ($S_i \to R_i$) obeys Stevens’ law (and provided that the corresponding relationship for the response modality, $S_j \to R_j$, does as well), when the logarithms of $S_i$ are plotted against the logarithms of $S_j$, a linear relationship with the slope $b_j/b_i$ should be obtained. Since this version of CMM, sometimes called direct CMM, is only applicable in cases where the stimulus magnitude ($S_i$) is known, it cannot be used for the scaling of opinions or preferences where the stimuli generally lack metric properties.
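The direct CMM check can be illustrated with a small Python sketch under idealised, noise-free assumptions. The helper names, the stimulus values and the scaling constants (both set to 1) are hypothetical; only the two exponents are those quoted above for heaviness and loudness. Simulated subjects adjust a loudness response until its perceived magnitude equals that of a weight stimulus, and the log-log slope is then compared with the ratio of exponents.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def matched_response(s_i, k_i, b_i, k_j, b_j):
    """Physical response S_j whose perceived magnitude equals that of S_i."""
    return (k_i * s_i ** b_i / k_j) ** (1 / b_j)


B_HEAVINESS = 1.45  # stimulus modality i (exponent quoted in the text)
B_LOUDNESS = 0.67   # response modality j (exponent quoted in the text)

stimuli = [1, 2, 4, 8, 16]  # hypothetical stimulus magnitudes, arbitrary units
responses = [
    matched_response(s, 1.0, B_HEAVINESS, 1.0, B_LOUDNESS) for s in stimuli
]

# Plot log S_i against log S_j: the slope should equal b_j / b_i.
slope = ols_slope(
    [math.log(r) for r in responses], [math.log(s) for s in stimuli]
)
print(round(slope, 3), round(B_LOUDNESS / B_HEAVINESS, 3))
```

With real subjects the responses would scatter around this line. The sketch also shows why the check presupposes numerical stimulus magnitudes, which opinion stimuli do not have.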
However, this problem can be overcome by a slight extension of the CMM paradigm known as indirect CMM, where both $S_i$ and $S_j$ are obtained from responses to a non-metric (e.g. social or aesthetic) stimulus $S_0$. Thus, for these three variables, a relationship similar to (3) is established and the psychophysical scale can be validated using (4) as before (for a more detailed description of the CMM methodology, see Lodge, 1980).
The objective of the present study was to investigate the utility of psychophysical scaling and to compare it to standard category scaling. For this purpose, a simple experiment was designed, where subjects were asked to assess the safety of different car models, employing both PS and CS. Furthermore, two different regimes for presenting the stimuli in PS were tested.
Method
Twenty subjects, eight females and twelve males, 23–55 years of age, participated in the study. Their main task was to express their opinions of the traffic safety of ten common car models, presented in a series of pictures. The subjects were explicitly instructed to judge the cars from their general knowledge about the models, rather than from just the appearance of the pictures. The ratings were performed by means of PS as well as CS. The former was validated by indirect CMM, with line-production (LP) and number estimation (NE) as response modalities. For the category scaling, a seven-point scale was used with only the endpoints labelled.
In order to compare PS to CS, a within-group design was employed where all subjects performed the safety judgements using both techniques. To account for order effects, half of the group began with CS while the other half started with PS. The procedure for PS generally followed that employed by Tanenhaus and Murphy (1981), where the subjects were given detailed written instructions for each scaling task. According to these instructions, the judgements should be performed in a strictly sequential fashion where subjects are not allowed to go back and check previous responses. Furthermore, a fixed reference is given to the subjects by assigning a response value (e.g. a line of a certain length) to a reference stimulus. However, since pilot testing revealed that subjects tended to feel uncomfortable with this strict way of presenting the stimuli, more liberal conditions were employed for half of the subjects, where they were free to go back to earlier stimuli as well as to choose their own reference. The two experimental groups are henceforth referred to as Group A (strict conditions) and Group B (liberal conditions). For CS, the same conditions applied to all subjects.
Results
Before performing the actual psychophysical scaling, subjects were given a practice task, where they performed line-estimations from given numbers, and vice versa. This also functioned as a test of their understanding of the basic principles of magnitude scaling. All subjects passed this test. In PS, the total scores $T_i$ for each car model $i$ were calculated according to (5), where $NE_i$ and $LP_i$ are the geometric means, per car stimulus, of the subjects’ ratings expressed in the two response modalities respectively (i.e. number estimations and line lengths). For Group B, where the reference differed between subjects, the score for each subject was linearly re-scaled in order to be of a similar order of magnitude (unit variance and a mean of 2).
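Two of the data-handling steps just described, the geometric mean of ratings and the linear re-scaling to unit variance and a mean of 2, can be sketched as follows. The function names and the example ratings are hypothetical, and the sketch assumes the population variance is meant; the text does not say whether the population or the sample form was used.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean of strictly positive ratings."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def rescale(scores, target_mean=2.0, target_sd=1.0):
    """Linearly re-scale scores to the given mean and standard deviation.

    Assumption of this sketch: the population standard deviation is used.
    """
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        raise ValueError("scores must not all be identical")
    return [target_mean + target_sd * (s - mean) / sd for s in scores]


# Hypothetical line lengths (mm) given by three subjects for one car model.
print(geometric_mean([20, 40, 80]))

# Hypothetical raw scores of one subject across four car models.
print(rescale([10, 20, 40, 80]))
```

Because the transformation is linear, it changes neither the rank order of the car models nor the correlation of the scores with any other measure.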
The total scores for the safety judgements obtained for the two groups are shown in Figure 1.
Figure 1. The total scores for the safety judgements of each car model for the two experimental groups. The results for Group A (strict conditions) are shown to the left and for Group B (liberal conditions) on the right.
The obtained psychophysical scale was validated by means of indirect CMM, where the response data from the first modality (NE), averaged for each car model, were plotted against the response data from the other (LP). The characteristic exponents for NE and LP are both 1.0 (Lodge, 1980) and thus, according to (4), the slope predicted from the indirect CMM validation is $b = 1.0/1.0 = 1$. Since both the variables NE and LP are potentially subject to error, the error-in-both-variables regression model was used to calculate $b$ (Lodge, 1980). The results from the indirect CMM are summarised in Table 1.
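The text does not spell out which error-in-both-variables estimator was applied. Purely as an assumption for illustration, the sketch below uses one simple symmetric choice, the reduced major axis (geometric mean) slope, computed on log-transformed values; it should not be read as the procedure of Lodge (1980). The function name and the per-model averages are hypothetical.

```python
import math
import statistics


def rma_slope(x, y):
    """Reduced major axis slope: ratio of standard deviations, signed by
    the covariance. Stand-in estimator chosen for this sketch only."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)
    sd_x = statistics.pstdev(x)
    sd_y = statistics.pstdev(y)
    if sd_x == 0 or sd_y == 0 or cov == 0:
        raise ValueError("slope is undefined for these data")
    return math.copysign(sd_y / sd_x, cov)


# Hypothetical per-model averages in the two response modalities.
ne_means = [12, 25, 48, 95, 210]   # number estimation
lp_means = [10, 28, 45, 105, 190]  # line production, mm

log_ne = [math.log(v) for v in ne_means]
log_lp = [math.log(v) for v in lp_means]
print(rma_slope(log_lp, log_ne))
```

Unlike ordinary least squares, this estimator treats the two variables symmetrically: swapping them yields the reciprocal slope, which suits a case where neither modality is an error-free predictor.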
Table 1. Summary of the results from the indirect cross-modality matching
As is clear from Table 1, for both groups, the predicted regression coefficient (1.0) is within the 95% confidence limits of the obtained exponents. For the comparison between PS and CS, all PS data were linearly re-scaled to unit variance and a mean of 2. It turned out that the PS and CS data were approximately linearly related, and the product-moment correlation coefficient was high (r=0.99).
Discussion
First a general word of caution. There are many controversial methodological issues concerning the use of PS for quantifying opinions, in particular for the simple pencil and paper technique employed here. However, there is no space to go any further into this debate here; see references in Dawson (1974) for some early criticisms of the method.
As is clear from Figure 1, the scores from the two experimental groups were very similar, so the mode of presentation did not significantly influence the results. Since subjects generally favoured the liberal method, it seems preferable in most situations. Both groups passed the indirect CMM validation of the PS data, which indicates that subjects performed properly ratio-scaled safety judgements using PS. Moreover, since CS was highly correlated with PS, this holds for the CS data as well. This contrasts with the results obtained for labelled category scales (Lodge et al., 1975), suggesting that the endpoint-labelling format is more appropriate for quantification.
Closer examination of the PS data plotted in Figure 1 suggests that subjects judged the safety of the cars by grouping them into three main categories: “safe” (Mercedes S-class, Audi A4, Saab 9-5), “medium safe” (VW Golf, Renault Megane, Mazda 626, Toyota Corolla, Mitsubishi Carisma) and “less safe” (Nissan Micra, Opel Corsa). Such results tell us something about how a target audience judges a product, which is obviously important information for the manufacturer. Making such inferences on the basis of CS is problematic, since it could be argued that the categorisation occurred as a result of the categories being forced upon the subjects by the questionnaire format. Thus, in this respect, PS has an advantage, since it imposes no such constraints. This spontaneous categorisation may also explain the high correlation between CS and PS. From a practical point of view, it could be argued that, given that the subjects perform categorical judgements, CS is preferable due to its lower labour requirements. However, it is important to notice that this argument relies on the PS validation being done in the first place. It is possible that, for ratings that cover more subtle differences, the limitations of CS will become more apparent, but further research is needed to clarify this.
In conclusion, the results show that psychophysical scaling is a feasible method for obtaining quantitative data on opinions and preferences, with some important advantages over standard category scaling. These advantages may become most prominent when investigating new questions or developing a rating system for general use. A main drawback of the method is the extra amount of work required, and thus in some cases categorical scaling techniques may be sufficient to quantify subjective ratings.
References
Dawson, W.E. 1974, An assessment of ratio scales of opinion produced by sensory modality matching. In H.R.Moskowitz, B.Scharf and J.C.Stevens (eds.) Sensation and measurement: Papers in honor of S.S.Stevens, (Reidel, Dordrecht)
Han, S.H., Jung, E.S., Jung, M., Kwahk, J. and Park, S. 1998, Psychophysical methods and passenger preferences of interior designs, Applied Ergonomics, 29, 499–506
Lodge, M. 1980, Magnitude scaling, (Sage, London)
Lodge, M., Cross, D., Tursky, B. and Tanenhaus, J. 1975, The psychophysical scaling and validation of a political support scale, American Journal of Political Science, 19
Stevens, S.S. 1957, On the psychophysical law, The Psychological Review, 64, 153–181
Tanenhaus, J. and Murphy, W. 1981, Patterns of public support for the supreme court: a panel study, Journal of Politics, 43, 324–339
USING THE WEB TO SUPPORT GEOGRAPHICALLY DISPERSED, LONGITUDINAL USABILITY EVALUATIONS Matthew Beard1 & Carolina Parker2
1UNISYS, The Octagon, Brunel Way, Slough SL1 1XW, UK
2HUSAT Research Institute, The Elms, Elms Grove, Loughborough, Leics LE11 1RG, UK
This paper examines the problem of managing a longitudinal usability evaluation process in a highly distributed environment. It evaluates the tools available for remote evaluation and the potential of new and existing technology to support this process. The focus for the research is DESSAC, a flagship program for the arable sector in agriculture. DESSAC’s users are geographically dispersed and the system can only be evaluated over a full farming season. Standard usability tools may not be effective or practical in these circumstances. The paper examines the best alternatives for this situation and describes the development of an on-line evaluation mechanism.
Introduction
DESSAC is an agricultural computer-based decision support system currently in its final year of development. The evaluation of the system is the responsibility of the HUSAT Research Institute at Loughborough University. The software has been through successive design iterations and is now ready for longitudinal testing ‘on-site’. Previous on-site trials revealed a reluctance on the part of users to complete paper-based reports, and face-to-face interviewing proved particularly costly. The problem posed by DESSAC and taken by the author as a research topic was to:
a) Identify the best approach for usability monitoring for the HUSAT DESSAC usability trials in year five.
b) Design and create an example of the chosen monitoring method.
c) Evaluate its suitability and effectiveness.
d) Provide recommendations for HUSAT on the basis of this evaluation process.
Initial research led to the identification of five potential solutions: automated monitoring, postal surveys, telephone interviews, regular group meetings and remote online reporting. After examining each it was concluded that the ideal solution would be automated monitoring software. Unfortunately this technique was well beyond the technical and financial abilities of the project.
The author therefore had to identify a less optimal but cost and resource-effective technique that could be rapidly developed and deployed.
Table 1. Summary of cost-benefit assessment
Method of selection
The author carried out a cost-benefit analysis to decide on the most viable option from those remaining; a summary of the assessment is shown in Table 1. As Table 1 indicates, the best solution was remote on-line reporting. Its potential is, however, overshadowed by the problem of maintaining use. Kraut et al. (1996) assessed the longitudinal impact of residential internet use in 1994/95 and found that of 157 participants receiving an Internet account during a trial, 78% used it more than once a week, but usage deteriorated, halving after 22 weeks. Bearing this in mind, the main goals for the development of a site are summarised below:
1. Capture usability data in day-to-day task situations.
2. Ensure data capture is cost-effective.
3. Ensure data is of sufficient quality to identify usability problems.
4. Ensure that users use the system regularly.
Method
Identifying requirements for the web-site
In order to identify detailed requirements for the web-site the author applied the Human Factors in Information Technology Planning Analysis Specification Tools (HUFIT PAS) to build a profile of the users and map their characteristics to the typical tasks they would be managing for remote reporting. A Functionality Matrix was used as a brainstorming tool. The results of this analysis were as follows:
▪ Provide structured feedback: on-line web forms could be used to support the collection of structured data while reducing the amount of input required from the user.
▪ Support interactivity: users like to communicate with each other (i.e. WebChat, discussion forums) and this provides a source of additional usability data.
▪ Provide support for DESSAC help services: an appointments diary could be used to book help slots and interviews with the DESSAC support team.
▪ Provide access to help support: the web page can also be used to answer common problems by providing updateable FAQ pages.
▪ Give added value information: inform users through updateable News pages.
▪ Provide access to product downloads: the identification of problems often leads to new releases or patches, to which web pages can provide easy access.
▪ Provide incentives for visiting: one of the ways of encouraging this was seen to be advertising, i.e. providing free space to clients for self-promotion/classifieds.
▪ Provide an easily usable and identifiable user interface: the provision of a usable and consistent graphic identity scheme was required.
Tips from other web-site developments
Before engaging in design work the author researched web site studies; a summary of the most significant discoveries is as follows:
1. Web sites are not like software. In software and hardware testing users choose the product they are most successful with; they don’t always choose the website they are most successful with, often preferring the sites that are most interesting (Spool et al., 1999).
2. Do not assume users know the domain the site is based in; users don’t form mental models of web sites (Spool et al., 1999).
3. Users don’t read on the web, they scan text. Web pages should highlight keywords, use bulleted lists, keep one idea per paragraph, halve the conventional word count and use an inverted pyramid style of writing, i.e. state the main points and conclusions in the first paragraph, then build the supporting evidence (Nielsen, 1997).
Technology used to build the web-site
The author applied three main technologies in order to provide the general structure and layout and the software engines that run the interactive elements.
• HTML (Hypertext Markup Language) to code the framework of the site, providing a consistent layout and graphic identity scheme.
• JavaScript for providing minor interactive elements such as screen resolution displays and 3D buttons.
• PERL (Practical Extraction and Report Language) to provide the main interactive elements such as the bookable appointments diary, discussion forums, WebChat, feedback forms and upload/download facilities.
The whole development process was completed within 5 weeks using one programmer.
Method for evaluating the web-site
The author designed a set of realistic task-based activities, i.e. locating specific information, downloading files, booking appointments and posting messages in discussion forums. Both quantitative and qualitative
methods were used. The WAMMI (Website Analysis MeasureMent Inventory) was unfortunately unavailable for this project, so Maguire’s SAQ (Software Acceptance Questionnaire) (Maguire, 1998) was employed as a substitute. A cognitive workload measure was used to ascertain which specific tasks were harder to complete than others. Automatic logging was used to compute the overall usage of the site and failure rates, which suggested potential usability faults and technical discrepancies. Fifteen users were recruited from the DESSAC user database to complete the trials of the web site. They were invited via email and asked to go to the site and to print out and complete the tasks listed there. Their responses and comments were logged, and additional comments sent via email were recorded.

Results
Table 2 Results of SAQ and Task performance
The results of the SAQ and task performance analysis can be found in Table 2. There is insufficient space here to describe the results in full, but they were largely encouraging. All SAQ measures of usability scored above the acceptable minimum and three of the tasks were performed easily by most users. Two of the main tasks were performed poorly, and specific note was taken of the reasons for this, which largely related to poor onscreen instruction. The users provided a lot of useful information on improvements.

Conclusions
The provision of a web site is a viable solution to the problem of geographically dispersed longitudinal usability trials. It is fast, cheap and accessible, and provides an extremely efficient means of collating data and disseminating it to usability practitioners. However, because the system gives total discretion to users, it needs work to keep them interested in re-visiting. Time and effort are required to keep it regularly updated and attractive. The initial evaluation of the web site was encouraging and it provided sufficient information to produce a second, improved version. However, the real test of the web site as a usability collection tool cannot be made until it has been used ‘in anger’ over a full season on-site. This is the basis for future research. The author aims to have the results of the full trial run available during the winter of 2000. The Internet is rapidly becoming the new dynamic method for exchanging the most diverse forms of information socially and commercially. As the author foresees, remote usability testing, particularly for
software, is likely to become standard for many software-engineering practices. More research therefore needs to be done to assess the most effective techniques of remote evaluation. Now that it is becoming so cheap to do, longer trial runs are likely to increase, so we must be ready to harness this opportunity to its greatest effect.

References
Eason, K.D. 1988, Information Technology and Organisational Change, (Taylor & Francis)
Kraut, R., Scherlis, W., Mukhopadhyay, T., Manning, J. and Kiesler, S. 1996, HomeNet: A field trial of Residential Internet Services, http://www.acm.org/pubs/articles/proceddings/chi/238386/p284-kraut/p284-kraut.html
Maguire, M. Software Acceptance Questionnaire (SAQ), HUSAT, The Elms, Loughborough University, Loughborough, Leics.
Nielsen, J. 1996, Inverted Pyramids in Cyberspace, Last updated: 05/05/99, Last viewed: 02/06/99. http://www.useit.com/alertbox/9606.html
Nielsen, J. 1997, How Users Read on the Web, Last updated: 16/08/99, Last viewed: 22/08/99. http://www.useit.com/alertbox/9710a.html
Spool, J.M., Scanlon, T., Schroeder, W., Snyder, C. and DeAngelo, T. 1999, Web Site Usability: A Designer’s Guide, (Morgan Kaufmann, San Francisco, CA)
THE PRACTICE OF TRIANGULATION Iain S.MacLeod, Linda Wells & Karen Lane Aerosystems International, West Hendford, Yeovil BA20 2AL, UK
Triangulation in the practice of geographical Navigation is a way of combining positional data, based on the assessed accuracy of that data, to derive the most probable position of an object. In Ergonomics, ‘Triangulation’ has little to do with triangles but much to do with the analyses of mires of collected qualitative and quantitative information. It involves the combination of information in such a way as to give substance and rigour to the results of ergonomic investigations. This is especially true if the study is concerned with real work situations under strong cost and time constraints and using only a few test subjects. The facets of work that can be examined through this form of analysis include system cognitive functions (MacLeod, 1998), design consistency, operators’ work rules, their methods of system management & supervision, and their approaches to system control. Introduction The practice of triangulation involves the use of diverse and mainly subjective information obtained through the application of a number of investigative approaches to the examination of work. That information is usually immersed in the noise of the ‘real-life’ work environment. In Signal Processing the environmental noise has to be understood but is subdued to allow the best appreciation of the signal. However, in the examination of work practices the noise must be merely differentiated from the sought information, as an understanding of the influences of context on work is also important. Triangulation practice includes methods for the handling and classification of information, rules for the comparison of the information, for iteration on these comparisons, and for arguing the threads of connectivity between the diverse information forms. Important throughout is an explicit consideration of the assumptions used, the issues involved, and the implications of the findings with relation to these issues. 
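As a rough illustration of combining positional data according to its assessed accuracy, the sketch below uses inverse-variance weighting. This is a common statistical technique chosen here as an assumption, not necessarily the rules used in air navigation practice or by the authors. The function name `combine_fixes`, the units and the example values are all hypothetical.

```python
from math import sqrt


def combine_fixes(fixes):
    """Combine position fixes by inverse-variance weighting.

    Each fix is a tuple (x, y, sigma): a position estimate and its
    assumed standard error, in the same hypothetical units.
    Returns the weighted position and its combined standard error.
    """
    # A more accurate fix (smaller sigma) receives a larger weight.
    weights = [1.0 / sigma ** 2 for _, _, sigma in fixes]
    total = sum(weights)
    x = sum(w * fx for w, (fx, _, _) in zip(weights, fixes)) / total
    y = sum(w * fy for w, (_, fy, _) in zip(weights, fixes)) / total
    return x, y, sqrt(1.0 / total)


# Illustrative values only: one accurate fix and one less accurate fix.
position = combine_fixes([(0.0, 0.0, 1.0), (4.0, 0.0, 2.0)])
```

The combined position lies closer to the more accurate fix, which is the behaviour the navigation analogy relies on.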
This article will briefly outline and discuss some principles of the practice of triangulation founded on many years of practice.

Triangulation: Substance, Probabilities, and Rules
Before the advent of accurate Navigation positional aids, for example the Global Positioning System (GPS), Air Navigation was performed using rules for the optimum combination of positional information with diverse probabilities on its accuracy. The rules were the result of good practice, whereas the determination of probabilities was governed by the form of the positional aid, the aircraft situation, and the
Figure 1. Example of Triangulation in Air Navigation
flying environment (see illustrative example in Figure One). The safe application of the derived positional information was also prescribed. In Ergonomics, the practice of triangulation has been performed for many years (Richardson, 1996), mainly as a ‘behind the scenes’ investigative and qualitative form of methodology. It is possibly not to be compared to the more rigorous application of quantitative experimental methods, which is usually performed in a laboratory or a strongly contrived setting. In reality, the use of qualitative and quantitative methods has always co-existed (MacLeod, 1997). Furthermore, the form of triangulation under discussion does not involve geometry. Rather, it involves the simultaneous analysis of dissimilar forms of data under an explicit set of rules. The results are then involved further in the triangulation process through an interactive analysis of all the results. The analysis may be performed several times from different perspectives. Moreover, the complexity of modern systems means that their test, evaluation, and acceptance, to be effective, must be performed in an incremental, timely, and iterative manner at different stages of system development and build. Apart from the concept stage of design, laboratory style experimentation is now frequently replaced by Ergonomic assessments of the developing system. Prototyping of a system, not the ‘one off’ rapid prototype as might be performed by an engineer to assist his work during system development, is now frequently replaced by Ergonomic evaluations of the actual system on software/hardware integration rigs and simulators, depending on the stage of system development.

Experimental Design and Performance
It is important that an experiment is designed from the outset to consider the number of subjects, the equipment specification/functionality, and the form of analyses to be used, to name but a few relevant points.
Further, a good understanding of system concept and tasks must be made explicit in the experimental design. The build, conduct, and analyses associated with an experiment can be represented in pyramid form, as depicted in Figure Two. In general, the broader the base of the pyramid, the more likely the experiment will support both qualitative and quantitative analysis methods. There is of course an optimum base for each
Figure 2. The Build and Span of the Experimentation Pyramid
case. Too narrow a base on the pyramid should suggest that the experiment is probably not worth conducting. Below the optimum, especially where experimental conditions do not support detailed quantitative methods of analysis, the use of qualitative methods of analysis should increasingly come to the fore.

Triangulation and Subjects’ Expertise
The fewer subjects used, the more important it is to determine their qualification as Subject Matter Experts (SMEs) and any major biases that they may possess. For example, an SME with a lot of experience of operating previous equipment in their field may be unduly critical of any new equipment design if it contravenes their previous work practices, regardless of the efficacy of those practices. In addition, it is important that the test scenarios are as authentic as possible and involve the SMEs in the diverse approaches to work allowed by the equipment being used in the test. Figure Three illustrates one way of depicting SMEs’ expertise so that any associated weightings applied in the experiment results can be made explicit.

Uses of Different Forms of Data in Triangulation
The data used in triangulation must exist in at least two forms. For convenience these are discussed as having ‘Top Down’ and ‘Bottom Up’ forms. As a general rule, SMEs’ comments can be considered as data related to specific critiques from their performance of task activities (Bottom Up), whereas questionnaire and rating scale results are more indicative of the SMEs’ overall impressions of the system’s fit to specified requirements or of its fitness for purpose (Top Down). Other Top Down data forms arise from Group Debriefs and checks of system functionality against specified system requirements. Bottom Up data forms could also be obtained using methods such as Verbal Protocol Analysis, the analysis of video recordings of SME work, or through the information contained in a system based data logger.
Rules are needed to equate influences on the results, influences such as might arise from varying SME skill levels. Rules are also needed to allow an effective combination of the experimental data and to make explicit the methods of combination used. Figure Four illustrates. SMEs provide the core of the results from the tests in question. These results have to be initially collated and placed into pertinent categories. It is often convenient to initially break down results into the categories
Figure 3. Subjects’ Expertise in Context (Line slopes are illustrative only—the 2 symbols depict possible scores from 2 subjects)
Figure 4. The Substance of Triangulation
Table One: Rule Examples to Initiate Triangulation Analysis
of ‘Very Good’, ‘Good’, ‘Reasonable’, ‘Poor’, and ‘Unacceptable’. Table One gives two examples of category placement rules pertaining to experimental data.

Conclusion: Issues and Implications
It is probable that some experimental data will conflict, and rules must be made explicit on how such data should be handled. For example, Top Down results generally tend to be more positive in nature than Bottom Up results; comments are usually concerned with specific constructive criticisms and not praise. It takes a very critical comment, or a number of more minor critical comments, before a direct inference can be made between comments and the scores obtained from instruments such as questionnaires and rating scales. Moreover, it is possible that a category, such as Positive Results, may contain few if any comments of praise or satisfaction. Similarly, it is possible that other categories, such as Negative Results, may encompass a great number of critical comments but few Top Down scores. In these cases it is necessary to argue association across categories with relation to the overall levels of the scores, and the individual scores on specific topics, especially when these scored topic areas encompass areas covered by specific critical comments. The argument on association must also consider the number and categories of responses by participating SMEs, and their level of agreement. Generally, if fewer than half the SMEs contribute comments to a particular topic area, then individual SME biases must be considered and investigated later through an SME group debrief. The triangulation process is completed by examining both the consistencies and inconsistencies of collated data in each of the assigned categories and arguing the resultant issues and implications primarily from a Bottom Up perspective. This particular section of the experimental report is important in that it sets the ground for the report recommendations.
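The rule of thumb above, that topic areas commented on by fewer than half of the SMEs warrant a check for individual bias, can be sketched as follows. The function name, the data layout and the topic names are hypothetical and serve only to make the rule explicit.

```python
def topics_needing_debrief(comments_by_topic, n_smes):
    """Return topic areas on which fewer than half of the SMEs commented.

    comments_by_topic maps a topic area to the identifiers of the SMEs
    who commented on it (hypothetical layout). Flagged topics would be
    followed up for individual bias in an SME group debrief.
    """
    flagged = []
    for topic, commenters in comments_by_topic.items():
        # Count each SME once, however many comments they made.
        if len(set(commenters)) < n_smes / 2:
            flagged.append(topic)
    return flagged


# Illustrative data only: four SMEs, two topic areas.
example = {
    "display layout": ["SME1", "SME2", "SME3"],
    "alerting": ["SME4"],
}
flagged = topics_needing_debrief(example, n_smes=4)
```

Here only the "alerting" topic is flagged, because a single SME out of four commented on it.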
The issues must all arise from the experimental data, though they can then be supported by the results of previous tests, and their implications for operator work and mission/flight effectiveness argued with relation to both the system performance requirements and its ‘fitness for purpose’. Frequently, a particular triangulation exercise will raise questions and issues that can only be fully addressed through an iteration on these issues at a later test session. As a last step, observations and professional advice from Ergonomic observers of the test(s) are then explicitly included, in association with the evidence obtained from the SMEs, to formulate a set of recommendations from the overall assessment process.

References
MacLeod, I.S. (1997), Some Benefits and Pitfalls of Qualitative Investigation as Conducted in the ‘Real-World’ of Aircrew Work, Proceedings of 9th International Symposium of Aviation Psychologists, Columbus, Ohio.
MacLeod, I.S. (1998), A Case for the Consideration of System Related Cognitive Functions, in Proceedings of 8th International Symposium of the International Council of Systems Engineering, Vancouver, September.
Richardson, T.E. (Ed) (1996), Handbook of Qualitative Research Methods for Psychology and the Social Sciences, BPS Books (The British Psychological Society), Leicester, UK.
Manual handling
WORK PERFORMED IN THREE DIFFERENT MODES OF DYNAMIC LIFTING
A.D.J.Pinder¹, M.P.Rayson² & D.W.Grieve³
¹ Health and Safety Laboratory, Broad Lane, Sheffield, S3 7HQ
² Optimal Performance Ltd, Old Chambers, 93–94 West Street, Farnham, Surrey, GU9 7EB
³ Institute for Human Performance, RNOHT, Brockley Hill, Stanmore, Middx, HA7 4LP
A total of 256 subjects (193 male, 63 female) performed maximal dynamic lifts in three modes. These were freestyle maximal incremental box lifts to 1.7 and 1.45 m, freestyle lifts on an Incremental Lift Machine to 1.7 and 1.45 m, and a maximal exertion on a hydrodynamometer to at least 2 m. Comparisons of the different devices were made using work done to 1.7 and 1.45 m. More work was done in the box lifts (mean to 1.7 m=835 J, SD 257 J) and ILM lifts (mean to 1.7 m=783 J, SD 241 J) than in the hydrodynamometer lifts (mean to 1.7 m=643 J, SD 155 J). Linear regression showed that the three devices had 65.5–71.2% of the variance in work done in common. When split by gender, the correlations for males ranged from 0.585 to 0.671, and for females from 0.421 to 0.558.

Introduction
Measurement of dynamic strength
There is no universally accepted single measure of dynamic strength, and various devices, as well as free weights, have been used for dynamic strength testing. These devices include the Incremental Lift Machine (ILM) (Stevenson et al., 1990) and a hydrodynamometer (Pinder and Grieve, 1997). Fixed loads and isoinertial devices such as the ILM have constant resistance to acceleration. In a hydrodynamometer, motion itself is resisted by a drag force caused by the movement of a body through water. In such a device the relationship between velocity and effort can be preset. The purpose of this study was to compare three methods of assessing dynamic lifting strength: maximal box lifting, lifting on an ILM, and lifting on a hydrodynamometer.

The Incremental Lift Machine
McDaniel et al. (1983) devised the ILM, which consists of a weight stack constrained to move vertically. The subject uses a pair of handles to lift the weight stack from a starting position to a specified finish position. The weight is incremented and the procedure repeated until the subject chooses to stop, is unable to lift the handles to the finishing point, or reaches the maximum weight.
Numerous studies have subsequently been
carried out using the ILM with only slight modifications of the basic design (e.g. Stevenson et al., 1990, Duggan and Legg, 1993). Stevenson et al. (1990) compared ILM performance with box lifting performance. For a ‘straight back, bent knees’ lift, correlations between box lift and ILM performance were consistent across gender and up to 50% of the variance could be predicted.

The hydrodynamometer
This device (Pinder and Grieve, 1997) consists of a 2 m high water-filled tube containing a leaky piston connected via a wire rope to a handle with a start height of 0.4 m and a maximum height of 2.2 m. Force in the rope is measured at 12.5 kHz using strain gauges, and displacement of the rope is measured synchronously with a resolution of 0.278 mm. The hydraulic drag force is a function of the square of the velocity. Duggan and Legg (1993) measured performance of 384 male army recruits on a series of strength tests, including an earlier version of the hydrodynamometer. They found a mean power output of 431 W (SD 119 W) over the 0.7 to 1.0 m height range, with a linear correlation of 0.67 between hydrodynamometer power and maximum ILM performance to 1.52 m. They used multiple regression to show that, when combined with measures of height and weight, the hydrodynamometer and an isometric upright pull at a height of 380 mm were equally good predictors of ILM performance (r=0.77, R²=59%–60%). They concluded that the hydrodynamometer had a high level of criterion-related validity and reasonable face validity. In a preliminary study of 69 male and 9 female soldiers, Rayson et al. (1995) examined the relationship between work done on the ILM and the hydrodynamometer to a height of 1.7 m and found a correlation of 0.80. When they split the analysis between males and females they found correlations of 0.58 and only 0.06 respectively. A linear regression to predict work done on the ILM from hydrodynamometer work and gender had an R² value of 75%.
Methods Both the ILM and the hydrodynamometer were used in a series of studies examining the use of a battery of physical performance tests to predict performance on a number of ‘Representative Military Tasks’ (RMTs) (Rayson et al., 1995, Rayson, 1997, Rayson et al., 2000). This paper further explores some of the data from one of these studies. The study was approved by the Ethics Committee of the Centre for Human Sciences of the Defence Research Agency. Prior to testing all subjects gave informed consent and were medically screened. They wore PT clothing during all fitness testing and appropriate military clothing while performing the RMTs. They were asked to perform all tasks to their individual safe maximum. The lifting tests All test procedures were explained and demonstrated to subjects, and practice attempts were made. Subjects were advised on safe lifting techniques, but essentially the lifts were freestyle. The box lift (referred to as a ‘single lift’ by Rayson et al., 2000) involved lifting a weighted ammunition box from the ground to two heights (1.7 m and 1.45 m) using an incremental protocol. The initial load was 10 kg, and after each successful attempt 5 kg (or 4 kg after 40 kg) was added until the subject could not safely achieve the lift within ten seconds, or until a load of 72 kg was achieved. This limit was due to the size of the box.
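The incremental box lift protocol described above can be sketched as a load sequence. The reading assumed here is that 5 kg is added while the load is below 40 kg and 4 kg is added from 40 kg onwards; under that reading the sequence ends exactly at the 72 kg limit. The function name and parameters are hypothetical.

```python
def box_lift_loads(start=10, limit=72, switch_at=40):
    """Return the sequence of box loads (kg) offered in the protocol.

    Assumed reading of the text: add 5 kg while the current load is
    below switch_at, then add 4 kg, stopping at the upper limit.
    """
    loads = [start]
    while loads[-1] < limit:
        step = 5 if loads[-1] < switch_at else 4
        loads.append(loads[-1] + step)
    return loads


loads = box_lift_loads()
```

A subject who succeeds at every load would therefore attempt 15 lifts at each height before reaching the limit imposed by the size of the box.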
The maximal weights that subjects could lift to 1.7 m and 1.45 m on the ILM were determined using an incremental protocol. The initial weight of the carriage was 18.1 kg. After each successful lift the weight was increased by 4.5 kg, to a maximum of 90.7 kg. Testing continued until the subject chose to stop or failed to reach 1.7 m. After the last unsuccessful lift the weight was reduced by 2.3 kg and the lift repeated. After failure at 1.7 m the subject rested for one minute, then attempted the last unsuccessful weight to a height of 1.45 m and continued until failure at that height. Subjects started the lift on the hydrodynamometer with an overhand grip and pulled as hard and as fast as possible on the handle from the start height to a height of at least 2 m. A change of grip in the shoulder region from an overhand lift to an underhand upward push was required.

Results
A total of 379 subjects (304 males, 75 females) were entered into the main study. Useable data from all three types of lift were obtained from 256 subjects (193 males and 63 females). The anthropometric characteristics of the subjects are given in Table 1.
Table 1. Mean (SD) values of personal data of 193 male and 63 female subjects
On the ILM seven subjects reached the upper limit of 93 kg for the lift to 1.45 m and three reached this limit for the lift to 1.7 m. For the box lift 85 subjects reached the maximum value of 72 kg for the lift to 1.45 m, and 32 reached it for the lift to 1.7 m. The start heights and the distances moved in the three lifts were not identical. The hydrodynamometer started at 0.4 m, and the ILM at 0.3 m. The box was lifted from the ground, but subjects normally grasped the handles attached to the top of the box at approximately 0.3 m. As subjects usually changed grip part way to grasp the bottom of the box, the distance the hands moved was less than the distance the load moved, but on the ILM and hydrodynamometer the load and the hands moved the same distance. Work done (the product of force and distance) to the two heights was calculated for each of the tests. The results are shown in Table 2. Due to the different physical laws governing the devices, work was a linear function of height for the ILM and the box lift, but on the hydrodynamometer work varied as instantaneous force varied. The larger values for the work done on the ILM and in box lifting partly reflect the greater distances through which these lifts occurred. The masses lifted on the ILM to 1.7 and 1.45 m were 57.0 kg (SD 17.5 kg) and 62.8 kg (SD 16.8 kg) respectively. Males lifted 64.2 kg (SD 13.0 kg) and 69.8 kg (SD 12.5 kg); females lifted 34.9 kg (SD 8.9 kg) and 41.7 kg (SD 8.5 kg). For the box lifts the loads lifted to 1.7 and 1.45 m were 46.1 kg (SD 15.4 kg) and 54.4 kg (SD 14.7 kg) respectively. Males lifted 53.0 kg (SD 10.3 kg) and 61.6 kg (SD 7.2 kg); females lifted 24.9 kg (SD 6.3 kg) and 32.3 kg (SD 8.7 kg). Linear regression was used to examine the relationships between the work done in the three modes of lifting, including taking into account the effects of gender. The results are shown in Tables 3 and 4.
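The two ways of obtaining work done can be sketched as follows. For a fixed mass, work is weight multiplied by the vertical distance moved; for a force that varies through the lift, work is the integral of force over displacement, approximated here by the trapezoidal rule. The value of g, the function names and the sampled force values are assumptions for illustration; the paper does not state how the integration was carried out.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def isoinertial_work(mass_kg, start_m, end_m):
    """Work (J) to raise a fixed mass between two heights."""
    return mass_kg * G * (end_m - start_m)


def sampled_work(forces_n, positions_m):
    """Work (J) from sampled force and displacement (trapezoidal rule)."""
    work = 0.0
    for i in range(1, len(positions_m)):
        mean_force = 0.5 * (forces_n[i] + forces_n[i - 1])
        work += mean_force * (positions_m[i] - positions_m[i - 1])
    return work


# Mean ILM mass to 1.7 m (57.0 kg) raised from the 0.3 m start height.
ilm_work = isoinertial_work(57.0, 0.3, 1.7)

# Illustrative only: a constant 500 N pull from 0.4 m to 1.7 m.
constant_pull = sampled_work([500.0, 500.0, 500.0], [0.4, 1.0, 1.7])
```

Under these assumptions the ILM figure comes to roughly 783 J, which agrees with the mean ILM work to 1.7 m quoted in the abstract.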
Table 2. Work done to 1.7 and 1.45m on the hydrodynamometer (H170, H145), the ILM (ILM170, ILM145), and in box lifting (BL170, BL145) (n=256)
Table 3. Linear regressions between work done in the three modes of lifting
Table 4. Effect of gender on linear regressions between work done in the three modes of lifting. Gender=0 for males, 1 for females
Discussion
The correlation of 0.844 between ILM and hydrodynamometer work to 1.7 m is close to the value of 0.80 found by Rayson et al. (1995). The combined model to predict ILM work to 1.7 m from hydrodynamometer work to 1.7 m and gender had an R² value of 72.5%, which is also close to the previous value of 75%. The correlation of 0.671 for males is slightly greater than the 0.58 previously found, but the value of 0.558 for females is very different from the earlier value of 0.06. The very small number of females (n=9) in the earlier study must have resulted in a spuriously small correlation. The correlation for the combined group is larger than for either gender because gender differences spread the range over which the correlation is calculated. The correlations between work to 1.7 m on the three devices for the combined male and female data ranged between 0.810 and 0.844. The correlations for lifts to 1.45 m were very similar. The correlations with box lifting performance may have been reduced since a large number of subjects achieved the upper limit on the box lift. It therefore appears that performance in the three modes of lifting related equally well to each other and that the three tests can be used to measure an underlying dynamic strength factor, even though they require different techniques and muscle activations. However, decisions on test selection will also consider other factors such as test safety.
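Two points in the discussion above lend themselves to a short numerical sketch. Squaring a correlation gives the proportion of variance two measures share, and pooling two groups with different means can produce a larger correlation than exists within either group. The data below are synthetic and chosen only to show the effect; they are not the study data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Shared variance implied by the reported correlation of 0.844.
shared_variance = 0.844 ** 2  # about 0.712, i.e. 71.2%

# Synthetic groups with the same within-group pattern but different means.
group_a_x, group_a_y = [1, 2, 3, 4], [2, 1, 4, 3]
group_b_x, group_b_y = [11, 12, 13, 14], [12, 11, 14, 13]

r_a = pearson(group_a_x, group_a_y)
r_b = pearson(group_b_x, group_b_y)
r_pooled = pearson(group_a_x + group_b_x, group_a_y + group_b_y)
```

In this synthetic case each group has a correlation of 0.6 while the pooled correlation is about 0.98, which mirrors the pattern of a combined-gender correlation exceeding both within-gender values.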
The work done, i.e. an integrated measure over the range of the lift, is clearly a useful parameter for comparing performance on devices which use different physical principles.

Bibliography
Duggan, A. and Legg, S.J. 1993, Prediction of maximal isoinertial lift capacity in army recruits. In R.Nielsen and K.Jorgensen (eds.) Advances in Industrial Ergonomics and Safety V, (Taylor & Francis, London), 209–216
McDaniel, J.W., Skandis, R.J. and Madole, S.W. 1983, Weight lift capabilities of Air Force basic trainees, (AFAMRL, WPAFB, Dayton, Ohio) AFAMRL-TR-83–0001, AD A129 543
Pinder, A.D.J. and Grieve, D.W. 1997, Hydro-resistive measurement of dynamic lifting strength, Journal of Biomechanics, 30, 399–402
Rayson, M.P., Holliman, D.E., Pinder, A.D.J., Grieve, D.W. and Bell D.G. 1995, A comparison of work performed using an incremental lifting machine and a prototype lifting hydrodynamometer, Medicine and Science in Sports and Exercise, 27, S153
Rayson, M.P. 1997, The development of selection procedures for physically demanding occupations, PhD Thesis, University of Birmingham
Rayson, M.P., Holliman, D.E. and Belyavin, A. 2000, Development of physical selection procedures for the British Army. Phase 2: The relationship between physical performance tests and criterion tasks, Ergonomics, 43, 73–105
Stevenson, J., Bryant, J., Greenhorn, D., Smith, T., Deakin, J. and Surgenor, B. 1990, The effect of lifting protocol on comparisons with isoinertial lifting performance, Ergonomics, 33, 1455–1469
Acknowledgements This work was carried out as part of the Technology Group 5 (Human Sciences and Synthetic Environments) component of the MOD Corporate Research Programme.
TEACHING THE NEUROMUSCULAR APPROACH TO EFFICIENT HANDLING AND MOVING Christine Donnelly Lecturer/Efficient Handling and Moving Coordinator, Faculty of Health Studies, Napier University, Edinburgh EH10 4TB
This paper outlines the developments in the teaching of the Neuromuscular Approach (NMA) to Efficient Handling and Moving (EHM), particularly in nursing. A recent, small-scale survey indicates that teaching of the NMA is becoming more widespread throughout Britain. A review of current handling and moving courses identifies a discrepancy between the length and content of courses and teaching strategies. Current teaching strategies appear to lack effectiveness in terms of developing proficiency and enhancing recall of the NMA principles. This research is investigating the effectiveness of different teaching strategies in relation to teaching and learning the NMA in order to develop an effective training package that will improve the cost-effectiveness of training programmes and promote proficiency in the performance of neuromuscular patterning conditioning.

Introduction
The NMA to EHM has been taught for many years throughout Scotland and more recently in England, Wales, Northern and Southern Ireland. It has been developed over the last six decades or so from the original work of McClurg-Anderson in relation to kinetic lifting. Vasey and Crozier (1983) further developed his concepts and changed the terminology from kinetic lifting to the Neuromuscular Approach as more knowledge and insight were gained from applied physiology research. Although the biggest group of users of the approach is associated with healthcare professions, many other professions and occupations are adopting the approach. The NMA is a principle-based approach rather than a taught technique. A principle-based approach enables nurses to problem solve each situation as it arises, thereby reducing the risk of injury to themselves and/or their clients. The NMA may also offer some protection against unexpected spinal loading, which is commonplace in nursing as well as other professions (Owen and Damron, 1984).
Although there has been an increase in the amount and variety of equipment available to move patients, healthcare professionals are caught between safety for themselves and their patients, and promoting independence and rehabilitation. Manual handling therefore remains a large part of nursing. Those who have learned and practise the NMA in its entirety find it beneficial both for themselves and their patients. It feels right. However, some teachers admit to not including specific conditioning in their training programmes because of lack of time, and many students of the NMA admit to not practising specific and patterning conditioning. They report that some courses are of such short duration that they can neither assimilate nor integrate the information sufficiently to enable recall following the training session.
Recently a small-scale survey was conducted through a one-page questionnaire that was sent out to 40 members of the NMA Association and at random to 85 manual handling coordinators throughout the UK (N=125), with a response rate of 77%. Of the 96 respondents, 78 (81%) had attended courses in the NMA. When asked if they continued to practise both specific and patterning conditioning, 76% of respondents reported that they practised specific conditioning and 79% reported that they practised patterning conditioning. The results were further analysed both by profession and by attendance at courses developed and taught by Crozier and Cozens. The results of the Chi-Squared test indicated that physiotherapists (N=42) who attended Crozier and Cozens’ courses were more likely to practise specific and patterning conditioning than those who had attended NMA courses run by other people. Although a similar pattern was shown for nurses (N=40), the results were less significant. The group who identified themselves as ‘other’ behaved slightly differently, in that there was a higher uptake of specific and patterning conditioning following both types of courses. The number in this group was small (N=13). It is worth noting that Crozier and Cozens (1997) spend more time teaching conditioning and patterning than other trainers do.

What is the NMA?
In its simplest terms, the NMA is a way of moving that allows the operator to move in balance and reduce reflex postural stiffening to a minimum, thereby reducing muscle effort and potential cumulative strain. Sustained tension in muscles affects the quality of the tissue, leading to cumulative strain and usually shortening and thickening of the tendons and stiffening of both deep and superficial fascia. The development of adaptive tissue reduces the range of movement available at joints, which in turn affects handling and moving ability.
When individuals habitually move in a top-heavy fashion, useful energy is wasted in trying to keep their balance. In relation to EHM, an operator who is not balanced will have increased difficulty dealing with an unbalanced patient or load, leading to a potential accident and/or injury. Practising neuromuscular patterning conditioning allows the new core pattern of movement to be learned and applied both in daily life and in load handling situations. However, neuromuscular patterning conditioning should not be taught in isolation, because of the amount of tissue adaptation that most individuals have developed as a result of the top-heavy movement patterns used throughout life. Specific neuromuscular conditioning is a vital step in the re-educative phase of the learning process. It is a series of small stretching movements that are performed slowly and within the trainee’s own range of movement. Specific conditioning works muscles at their innermost range of movement and increases both sensitivity and awareness of muscle actions. Over time, specific conditioning has a therapeutic effect on both the deep and superficial fascia, releasing the tethering effects of adaptive tissue and allowing greater range of available movement at the joints.

Comparison of the effectiveness of current teaching strategies

Learning and training are different. The literature suggests that different teaching strategies are required to ensure that a skill is learned to the point that it can be repeated in all types of circumstances (Reference). Skills training needs to be repetitive, performance needs to be monitored, and feedback needs to be given at appropriate intervals so that learning is enhanced rather than inhibited. The amount of time given over to handling and moving training is much less than the time taken to learn to type, perform sports, play a musical instrument or acquire any other complex skill.
However, handling and moving is a complex skill, particularly as students are being asked to apply principles of handling and moving to many different situations. Their movement pattern needs to become reflex or automatic, so that they can effectively ‘time-swap’ decision-making when working in complex situations (Holding, 1989). They need to know when to think about their own movement pattern and that of the patient as well as dealing with all the other information they are trying to process at the same time. It would appear that when there is information overload, an individual’s movement pattern is the last thing to be thought about. By that time an injury may have occurred. However, not all training programmes are designed to enable skill development.

Over a period of 18 months three groups of students who were learning the NMA had their movement patterns analysed by video at the beginning, during and at the end of their courses. Each group of students experienced different teaching strategies although the content was similar. All the students were volunteers. The groups of students are identified as:

Group 1. Students were recruited from a new intake of diploma nursing students. These 14 students had been randomly assigned to either the Control Group (G1C) or the Experimental Group (G1E). The control group received only the 25 hours of training already written into the curriculum and delivered at intervals over an eighteen-month period. The experimental group was offered an extra 1 hour of teaching, for six weeks, at the beginning of their course, in addition to the 25 curricular hours.

Group 2. Students were recruited from a training course lasting 5 consecutive days. All were volunteers but differed from Group 1 in that they were all registered practitioners in either nursing or physiotherapy. There were 10 students in this group. These students had chosen to attend the course; some were self-financing while others were seconded by their employers.

Group 3. Students were recruited from a nursing degree course. The 9 students received 21 hours of handling and moving input, delivered in 3-hour sessions over a seven-month period. These hours were included within the curriculum. No extra teaching was offered.
Each of the above groups of students received similar content, with the following topics being covered: current legislation, review of the structure and function of the spine, specific neuromuscular conditioning, patterning conditioning, use of equipment in patient handling and manual handling. The difference between the courses lay in the amount of student contact hours during the initial stages of learning and the emphasis on specific and patterning conditioning. Two of the courses allowed students to see recordings of themselves performing the neuromuscular pattern, while one of the courses did not have access to equipment to allow this style of feedback.

A checklist identifying all the necessary elements of the NMA was developed and used throughout the video analysis. During each of the data collection sessions, the students were video-recorded carrying out the same task: picking a pillow up from the floor and placing it on a chair. The students were asked to stand in front of a large portable grid made from six A1 sheets of white cardboard taped together and marked out with black insulating tape into rectangles measuring 15 cm high × 30 cm wide. The grid enabled the analysers to see movement of the volunteer’s back, knees, feet and head more easily.

Analysis of the data identified a statistically significant improvement in performance in all groups from the beginning of the NMA courses to the end. The group that performed least well was G1C, which had no extra input at the beginning of the course. However, the student numbers were insufficient to show whether the difference in performance between groups having their input spaced out over months and the group receiving it in a 5-day session was significant. It was interesting to note that both G1C and G1E lost some of their expertise between recordings 3 and 4, a period of about 12 months, and this loss was statistically significant.
Overall, results implied that those students who had spent a greater duration of time practising specific and patterning conditioning included more elements of the NMA in their movement pattern, even though the performance of the pattern was not proficient. Those students who received video feedback of their movement pattern performed better than those students who did not.
Development of an effective training package for the NMA

In order to try to bridge the gap between education and training in relation to EHM, a teaching strategy is being researched which allows regular short periods of practice of specific and patterning conditioning prior to the introduction of patient handling. Currently, a group of 25 student volunteers is being exposed to a variety of different teaching strategies for 1 hour per week for 15 weeks, not including holidays. Concurrently, the students also attend their normal curricular classes in EHM: a total of 11 hours delivered in two sessions about 12 weeks apart. The teaching strategies employed focus on fostering a positive mental approach to learning the new movement pattern. They include centering and visual imagery techniques that may enhance performance of the neuromuscular pattern through increasing awareness.

The class content for each session focuses on practising specific and patterning conditioning. The students are given a little information about each specific conditioning move through discussion and demonstration and are then given the opportunity to practise and experience the effects of the conditioning for themselves. This is followed by visual imagery of patterning conditioning prior to practice. To assist students who have no knowledge of anatomy and physiology, wall charts of the body musculature and half torso models are displayed in the classroom for reference. Throughout the session a video of the ideal performance of the neuromuscular patterning conditioning is played with no volume. Each student has his/her own copy of the video, with a voice-over explaining the elements of patterning conditioning, for use at home and during vacations. The students have each been given a copy of the checklist that will be used in the analysis of their movement patterns.
At intervals throughout the 15 weeks, each student is video-recorded and allowed to analyse his/her own pattern of movement, as well as receiving both verbal and written feedback from the tutor on performance. The students are allowed to practise picking objects up from the floor but are encouraged not to move out of their range of movement. If they begin to feel unbalanced, they are asked to come out of the move. Practice of the neuromuscular pattern is applied to everyday situations like tying shoelaces, rolling up mats that have been on the floor, picking up rucksacks, sitting, standing and moving chairs. The aim of this type of teaching is to allow the students repetitive practice of the NMA in order to facilitate skill development prior to the introduction of patient handling.

Each student has been asked to keep a reflective diary of his/her experience in relation to specific and neuromuscular conditioning. Some students find this difficult while others seem to find the reflection helpful. At the beginning of the research the students were given a questionnaire to complete about their height, weight, physical activity and ethnic origin. These data will be analysed in relation to the reflective diaries and videotapes to see whether particular physical characteristics either enhance or impede the development of neuromuscular specific and patterning conditioning. Early indications are that the students experiencing this variety of teaching strategies are performing better and showing greater proficiency in the performance of neuromuscular patterning conditioning. Students are reporting a preference for the short sessions rather than a 1-day session.

Conclusion

Factors such as large numbers of trainees, time, trainees with previous musculoskeletal injury and the relationship between education and training create dilemmas for trainers.
Fear of litigation results in most ‘trainers’ now running short ‘awareness’ sessions rather than training sessions. However, raising awareness of staff is not providing them with the necessary skills and experience they require while performing their daily work. The challenge is to find a strategy for teaching the NMA that is effective in terms of both cost and staff safety and to provide update sessions at appropriate intervals rather than a standard annual update.
Using a variety of different teaching strategies for short periods at regular intervals seems to enhance learning and may be more effective than 1-day courses separated by several months or even a year. Students state a preference for this type of training. The quality of training of the trainers is also an important factor. Trainers who themselves do not practise specific and patterning conditioning and who do not receive regular, appropriate feedback on their own movement pattern are unlikely to motivate other staff to practise.

References

Crozier L. and Cozens S. 1997, The Neuromuscular Approach to Efficient Handling and Moving. In The guide to the handling of patients: introducing a safer handling policy 1997, (NBPA and RCN, Middlesex) 58–62
Holding D.H. 1989, Human Skills: studies in human performance, Second Edition, (John Wiley & Sons, Chichester) 95–105
Owen B.E. and Damron C.F. 1984, Personal characteristics and back injury among hospital nursing personnel, Research in Nursing and Health, 7, 305–313
Vasey J.R. and Crozier L. 1982, A move in the right direction, Nursing Mirror, 1982, April
INFLUENCE OF PACKING METHODS ON MUSCULOSKELETAL PROBLEMS AMONG BRICK PACKERS

A.D.J.Pinder

Health and Safety Laboratory, Broad Lane, Sheffield S3 7HQ, UK
A field survey examined the problems caused by manual sorting and packing of bricks. In total, 139 packers from 12 plants completed the Nordic Musculoskeletal Questionnaire. Heart rates were recorded over a shift for 45 workers, and their postures were videoed. Rates of musculoskeletal trouble were found to be very high, particularly in the wrists/hands and low back, and were higher in completely manual systems (‘hand packing’) than in semi-mechanised systems (‘monorails’). Hand packing produced higher heart rates and required more bending and twisting. Where the task cannot be mechanised, action should be taken to reduce the risks.

Introduction

Bricks are fired in kilns at temperatures of about 1200°C in stacks with spaces for hot gases to circulate. After firing and cooling they are transferred, outside the kiln, to despatch packs which are tightly packed. As they are packed, bricks are inspected for defects such as excessive colour variations and cracks. Normal defect rates range from over 10% to under 1%. Where problems in firing occur, reject rates can be very high. In automated packing, manual handling has been eliminated except for removal of seconds and rejects or in the event of mechanical breakdown. In some manual packing systems mechanised jigs (‘monorails’) are used for building despatch packs. The jigs are indexed between packing workstations at fixed intervals and each worker is expected to place a set number of bricks into each jig before it moves on. In hand packing, bricks are packed from a fixed kiln pack to a fixed despatch pack, usually in a fixed jig. Hand packers are typically given a set number of bricks to pack and work at their own pace.

Standard bricks are 100 mm × 65 mm × 210 mm and range in weight from about 1.8 kg to more than 3.0 kg. Inspection policies may require a packer to handle only two bricks at once (one per hand) or may permit handling five or more bricks at once (usually held between the hands).
‘Maximum Brick Limits’ (MBLs) may also be used to attempt to control the risks of manual handling. Loads handled can be in the region of 12.5–13 kg. Total loads may exceed 30 tonnes per man per day. Kiln packs are typically up to 1.5 m high and four brick lengths deep. Jigs are typically 8 bricks high. Monorail jigs are two brick lengths deep, but hand packing jigs are usually five brick lengths deep.

The British Ceramics Confederation (BCC) (1998) estimated that 650 workers were employed by member firms in hand sorting/packing. In 1996–7, 13 sites using monorails reported 21 three-day accidents under RIDDOR 95 and 17 sites using hand packing reported 16 accidents. The mean numbers of bricks per worker per shift were 14178 on monorails and 14167 for hand packing. The annual injury rates per million daily bricks ranged from 0 to 16.7 (mean 4.31, SD 5.73) on monorails and from 0 to 28.9 (mean 3.71, SD 6.81) on hand packing. These rates are not significantly different and suggest that in unfavourable circumstances severe problems can arise in both systems.

Ferreira and Tracy (1991) compared work practices in two plants with monorails with different injury rates and suggested that differences in work organisation and methods of handling could be influencing the injury rates. They described workers in the plant with more injuries as having handling techniques and work organisation which were characterised by lack of variety, whereas the other plant was characterised by versatility, and used a wide variety of handling techniques.

The BCC (1998) described two questions as unresolved: 1) the relative risks of hand packing and packing on monorails; and 2) the relative risks of handling 5 bricks at a time and of handling 2 bricks at a time and therefore lifting 2.5 times as often. A field investigation was therefore undertaken to address these issues.

Methods

The study was approved by the HSE Research Ethics Committee. The survey aspect of the study was granted ministerial approval under the Survey Control procedures for government departments which wish to undertake statistical surveys in industry. All subjects gave informed consent before participating.

The aim had been to study six sites using monorails and six using hand packing, three of each with MBLs of 4/5, and three of each with MBLs of 2, in a factorial design. It was found that no monorails had an MBL of 2, so three more monorails with MBLs of 4/5 were included. Examining MBLs solely within hand packing was also impossible as only two plants with MBLs of 2 could be identified. A plant with an MBL of 3 was found, but this limit was ignored by the packers, who handled up to 7 bricks at once.
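As a rough check on the handling figures quoted in the Introduction and on the second BCC question, the sketch below works through the arithmetic of lifts per shift, load per lift and total daily load. The function names are hypothetical, and the brick mass of 2.5 kg is an assumed mid-range value (the text gives about 1.8 kg to more than 3.0 kg); only the shift output of 14167 bricks comes from the BCC figures.

```python
# Illustrative arithmetic only. Function names are hypothetical and the brick
# mass is an assumed mid-range value, not a measurement from the study.

def lifts_per_shift(bricks_per_shift: int, bricks_per_lift: int) -> float:
    """Lifts needed to pack one shift's bricks at a given handful size."""
    return bricks_per_shift / bricks_per_lift


def load_per_lift_kg(bricks_per_lift: int, brick_mass_kg: float) -> float:
    """Mass handled in a single lift."""
    return bricks_per_lift * brick_mass_kg


def daily_load_tonnes(bricks_per_shift: int, brick_mass_kg: float) -> float:
    """Total mass handled by one packer over a shift, in tonnes."""
    return bricks_per_shift * brick_mass_kg / 1000.0


BRICKS_PER_SHIFT = 14167  # mean for hand packing reported by BCC (1998)
BRICK_MASS_KG = 2.5       # assumption: mid-range brick mass

lifts_at_2 = lifts_per_shift(BRICKS_PER_SHIFT, 2)
lifts_at_5 = lifts_per_shift(BRICKS_PER_SHIFT, 5)
frequency_ratio = lifts_at_2 / lifts_at_5  # how much more often a 2-brick packer lifts

print(f"lifts per shift: {lifts_at_2:.0f} (2 bricks) vs {lifts_at_5:.0f} (5 bricks)")
print(f"frequency ratio: {frequency_ratio:.1f}")
print(f"load per 5-brick lift: {load_per_lift_kg(5, BRICK_MASS_KG):.1f} kg")
print(f"daily load: {daily_load_tonnes(BRICKS_PER_SHIFT, BRICK_MASS_KG):.1f} tonnes")
```

Under these assumptions a five-brick lift comes to 12.5 kg and the daily total exceeds 30 tonnes, consistent with the figures in the text. The trade-off in the second BCC question is then visible directly: an MBL of 2 lowers the load per lift but multiplies the number of lifts by 2.5.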
HSE has used the Nordic Musculoskeletal Questionnaire (NMQ) to survey the prevalence of musculoskeletal ‘trouble’ (‘ache, pain, discomfort or numbness’) across a number of work forces (Dickinson et al., 1992; Dickinson, 1998). Questions are asked for each of nine body regions to establish the annual prevalence, the weekly prevalence and the annual disability. All available packers were asked to complete the NMQ, normally during a morning break.

Heart rates were measured over a normal shift for up to four workers at each plant using Polar Heart Rate Monitors (Polar Electro Oy, Finland). Resting heart rate was the minimum in part of a rest break with a standard deviation of less than 5 beats per minute. Working heart rate was the mean heart rate over a one-hour period that did not include a break. Heart rate reserve was the difference between maximum (220 minus age; Astrand and Rodahl, 1986) and resting heart rates. Work pulse was the difference between the working and resting heart rates.

Video recordings of the activities of the packers whose heart rates were being measured were made over the shift. Tapes from four plants were coded at 1-minute intervals using the Observer Pro video analysis system (Noldus Information Technology BV, The Netherlands) to control the video tape. The WinOWAS software (Tampere University of Technology, 1996) was used to assess the postures. This relates time-sampled postures to ‘Action Categories’ linked to recommendations of the urgency of remedial action (Karhu et al., 1977; Vedder, 1998). It assigns Action Categories (Table 1) from the percentage of time that a body part is in a particular posture.
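The heart rate measures defined in the Methods can be written out as a short sketch. The class and function names are hypothetical, and the example values (age 30, resting 70 bpm, working 110 bpm) are invented purely to show the calculation; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class HeartRateSummary:
    max_hr: float      # age-predicted maximum, 220 minus age
    hr_reserve: float  # maximum minus resting heart rate
    work_pulse: float  # working minus resting heart rate


def summarise_heart_rate(age_years: float, resting_hr: float,
                         working_hr: float) -> HeartRateSummary:
    """Apply the definitions given in the text to one worker's values (bpm)."""
    max_hr = 220 - age_years
    return HeartRateSummary(
        max_hr=max_hr,
        hr_reserve=max_hr - resting_hr,
        work_pulse=working_hr - resting_hr,
    )


# Hypothetical worker, for illustration only.
example = summarise_heart_rate(age_years=30, resting_hr=70, working_hr=110)
print(example)
```

Because heart rate reserve and work pulse are both differences from the resting rate, two groups that differ in working heart rate but not in resting heart rate will tend to differ in work pulse as well.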
Table 1. OWAS Action Categories
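Once each time-sampled posture has been given an Action Category, the share of observations in each category is a simple tally. The sketch below shows only that tally; it does not reproduce the OWAS classification rules or the WinOWAS software, the function name is hypothetical, and the ten sample values are invented for illustration.

```python
from collections import Counter


def action_category_shares(sampled_categories):
    """Percentage of time-sampled observations falling in each Action Category (1 to 4)."""
    total = len(sampled_categories)
    if total == 0:
        raise ValueError("no observations to summarise")
    counts = Counter(sampled_categories)
    return {ac: 100.0 * counts.get(ac, 0) / total for ac in (1, 2, 3, 4)}


# Hypothetical coding of ten 1-minute samples (not data from the study).
samples = [1, 1, 1, 1, 1, 1, 2, 2, 2, 3]
shares = action_category_shares(samples)
print(shares)
```

With observations taken at fixed intervals, the percentage of samples in a category serves as an estimate of the percentage of working time spent in it.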
Results

Survey of musculoskeletal trouble

Excluding unavailable workers, the overall response rate for the NMQ was 82%. Basic anthropometric and personal data are reported in Table 2. The only statistically significant difference (P<0.05) between the two groups of packers was that the hand packers had on average almost three years more experience.

Table 2. Anthropometric and work duration data
Annual and weekly prevalences of musculoskeletal trouble are given in Table 3 for hand and monorail packers, together with data collected by HSE from bricklayers and the 1985 Nordic Reference Data set (Foundation for Occupational and Environmental Medical Research and Development, Orebro, 1985). Table 3 also gives the suggestions of Dickinson (1998) for ‘high’ action levels for annual prevalences. The levels of trouble reported were very much higher than both the Nordic data and the ‘high’ levels of Dickinson (1998). The highest rates were in the wrists/hands and the lower back.

Three significant differences were found between the two packing methods, with hand packing worse in each case. The annual prevalences in the lower back were 87% for hand packers and 72% for monorail packers (χ²=4.33, P<0.05). The weekly prevalence among hand packers was nearly twice that among monorail packers in the wrists/hands (48% and 26%, χ²=7.22, P<0.01) and more than twice in the upper back (32% and 15%, χ²=5.29, P<0.05).

While bricklayers also have high prevalences, the problems are different and less severe than in packers. Their weekly prevalence in the wrists/hands was 13%, compared with 48% and 26% for hand and monorail packers respectively. In the lower back the weekly prevalences were 26% for bricklayers and 64% and 47% for hand and monorail packers.

Heart rates in hand packing and monorail packing

A significant difference of 8.8 bpm (P<0.05) was found between the working heart rates of the two types of packing, but no significant difference was found in resting heart rate (Table 4). As a result, significant differences existed in heart rate reserve (P<0.05) and work pulse (P<0.05).
Table 3. Annual/weekly prevalence data
Table 4. Heart rate data (bpm) from monorails and hand packing
Table 5. Percentage of postures assigned to the different OWAS Action Categories at the different plants
Posture analysis

The posture analysis shows (Table 5) that at most 13% of the time was spent in postures in AC 3 and AC 4. For the two monorails approximately two-thirds of the postures were AC 1. However, for the hand packing sites, under 50% of the postures were AC 1, and over 50% were AC 2. Therefore, in terms of gross postures, hand packing is worse than monorail packing. Bending and twisting of the trunk reached AC 3 in three plants and AC 2 in the other, and twisting by itself also reached AC 2 in one plant. It is therefore necessary to reduce the amount of bending and/or twisting, which will be best achieved by redesign of packing workstations and of kiln and despatch packs. Reducing bending would best be done by increasing minimum heights of lift.

Discussion

This study indicates that manual sorting and packing of bricks is a high risk activity for musculoskeletal disorders. The very high levels of musculoskeletal trouble found among packers, particularly in the wrists/hands and the lower back, are far in excess of mean levels in working populations in both Scandinavia and the UK and are very high when compared to bricklaying. The heart rate data revealed that packing falls into the broad categories, as defined for men aged 20–30 (Astrand and Rodahl, 1986), of ‘moderate work’ on monorails and ‘heavy work’ in hand packing. Hand packing is therefore worse than monorail packing, as it has more reported musculoskeletal problems, is more strenuous, and involves more bending and stooping. It can thus be seen that the method of packing adopted has an effect on the musculoskeletal hazards of a job requiring large amounts of manual handling.

The current trend within the industry is to mechanise packing and this is clearly the most effective method of reducing the risks from this task. However, manual sorting and packing will need to continue in some circumstances, particularly where waste rates are high or where production volumes are low. A proactive approach to management of the associated risks to musculoskeletal health will therefore be essential.

Bibliography

Astrand, P.O. and Rodahl, K. 1986, Textbook of Work Physiology: Physiological Bases of Exercise, Third Edition (McGraw-Hill Book Company, New York)
British Ceramics Confederation 1998, Manual Handling of Loads, (BCC, Stoke)
Dickinson, C. 1998, Interpreting the extent of musculoskeletal disorders. In M.Hanson (ed.) Contemporary Ergonomics 1998, (Taylor & Francis, London), 36–40
Dickinson, C.E., Campion, K., Foster, A.F., Newman, S.J., O’Rourke, A.M.T. and Thomas, P.G. 1992, Questionnaire development: An examination of the Nordic Musculoskeletal Questionnaire, Applied Ergonomics, 23, 197–201
Ferreira, D.P. and Tracy, M.F. 1991, Musculo-skeletal disorders in a brick company. In E.J.Lovesey (ed.) Contemporary Ergonomics 1991, (Taylor & Francis, London), 475–480
Karhu, O., Kansi, P. and Kuorinka, I. 1977, Correcting working postures in industry: A practical method for analysis, Applied Ergonomics, 8, 199–201
Vedder, J. 1998, Identifying postural hazards with a video-based occurrence sampling method, International Journal of Industrial Ergonomics, 22, 373–380
© Crown copyright (2000)
USE OF HUMAN EXPERTISE IN EVALUATING MANUAL LIFTING TASKS

Ash Genaidy1, Jose Beltran1, Ali Alhemoud2 & Simon Yeung3

1Industrial Engineering, University of Cincinnati, Cincinnati OH 45221–0116, USA
2Kuwait Institute for Scientific Research, PO Box 24885, 13109 Safat, Kuwait
3Department of Rehabilitation Sciences, Hong Kong Polytechnic University, Hong Kong
The objective of this study was to determine whether human expertise can be reliably used to evaluate manual handling activities. Five health and safety professionals “HSP” and five graduate students “GS” participated in the study. The findings indicated no significant differences between two trials separated by a one-week period. Based on the results, however, it is felt that the amount of information evaluated may have a significant impact on the reliability of the data collected. An alternative approach and other results are discussed.

Introduction

Occupational lower back disorders “OLBD” are still considered the most frequent injuries in the workplace and the most costly in terms of worker compensation costs. Many techniques have therefore been reported in the scientific literature for evaluating the risk associated with these injuries and eventually devising interventions to control the incidence of OLBD. Unfortunately, none of these techniques takes into account the active role of workers in the evaluation and control of OLBD. It is our opinion that scientific methods are urgently needed to explore this issue, which may help in reducing the frequency and severity of OLBD. This is particularly important because: 1) industrial personnel are not usually kept abreast of the major findings published in scientific journals and therefore lack the knowledge and expertise to interpret the results obtained; 2) industrial managers usually look for quick solutions to health and safety issues, and analyzing lifting tasks with these methods is usually too time consuming for practical purposes; 3) in some cases, the cost of the equipment used in some of the evaluation techniques is prohibitive.

Recently, we have explored the use of human expertise in the evaluation of lifting tasks.
The preliminary findings indicated that: 1) the evaluation of lifting tasks by health and safety professionals, as well as industrial workers, is consistent with the results in the published literature; 2) the perceived risk of injury and weight of load are major determinants of the perceived physical effort “PPE” (Yeung et al., 1999; Karwowski et al., 1999). The study reported herein is an extension of our earlier investigations to continue exploring whether human expertise can be used to evaluate manual handling activities. In particular, we wanted to determine whether human expertise is reliable with respect to the evaluation of manual lifting tasks. Furthermore, the following additional hypotheses were tested: 1) the effects on PPE of lifting conditions combining a “light” load with a “far” horizontal distance are not different from those combining a “heavy” load with a “close” horizontal distance; 2) the effects on PPE of lifting conditions combining a “light” load with a “high” frequency are not different from those combining a “heavy” load with a “low” frequency; 3) there is no difference among the input variables in terms of the effects on PPE; and 4) there is no difference between HSP and GS in terms of their evaluations.

Experimental Methods and Procedures

Subjects

Five HSP and five GS volunteered to participate in the study. A summary of the highest degree attained, level of physical fitness, years of professional experience, and age is provided in Table 1 for all study participants.

Lifting activity and variable definition

In this study, a lifting activity was defined as handling objects vertically with both hands from a lower position to a higher position. The following was assumed about the lifting activity: 1) the lifting object is a rectangular box with no sharp edges, uniformly distributed solid material, and no cutout handles; 2) the lifting activity is performed while the human body is in a standing position; 3) the task requires lifting loads from at or near the floor to a height 10 inches above this level over an eight-hour period (including a one-half hour lunch break and two 15-minute coffee breaks); 4) physical environmental conditions are moderate (for example, comfortable temperature, adequate lighting, moderate level of noise); 5) comfortable clothing is worn; and 6) no incentives are provided. The lifting variables examined in the study (i.e., weight of load “WL”, horizontal distance “HD”, twisting angle “TW”, and frequency of lift “FL”) were defined in a similar way to those described in the NIOSH lifting equation (Waters et al., 1993). In addition, physical effort was defined as “the amount of energy” required to perform a lifting activity.

Table 1. Summary of subjects’ characteristics
Note: HSP = health and safety professionals; GS = graduate students
Procedures

Each study volunteer was asked to evaluate, based on his or her knowledge and experience, the PPE associated with the performance of 108 lifting activities described in combinations of WL, HD, TW, and FL. Everyday language, such as “heavy” for weight of load, was used to quantify these lifting conditions. The specific descriptors used for the lifting task variables were: 1) WL: “light”, “moderate”, and “heavy”; 2) HD: “close”, “moderate”, and “far”; 3) TW: “small”, “moderate”, and “large”; and 4) FL: “negligible”, “low”, “moderate”, and “high”. PPE was quantified by one of the following descriptors: “extremely low”, “very low”, “low”, “moderate”, “high”, “very high”, and “extremely high”. Each study volunteer performed the evaluation twice, with the two trials separated by a one-week period. The order of presentation of lifting condition combinations was not randomized and was given in a standardized format to all study participants in each of the two trials. The participants were only briefed about the purpose of the study and no formal training was given to them about the knowledge generated in the scientific literature.

Results and Discussion

The results of statistical analyses conducted on the data collected in this study revealed the following:

1) The two trials were not significantly different from each other in terms of the linguistic evaluation of PPE for both HSP and GS.
2) There were significant differences between the effects on PPE of “light” WL with “far” HD lifting conditions and those of “heavy” WL with “close” HD lifting conditions for both HSP and GS. Heavy WL conditions were consistently more demanding than light WL conditions in terms of PPE.
3) There were significant differences between the effects on PPE of “light” WL with “high” FL conditions and those of “heavy” WL with “low” FL conditions for both HSP and GS. Heavy load lifting conditions continued to be more demanding in terms of PPE than light load conditions.
4) There were significant differences between HSP and GS.
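The figure of 108 lifting activities in the Procedures follows from crossing the descriptor levels: three for weight of load, three for horizontal distance, three for twisting angle and four for frequency of lift. A minimal sketch of that enumeration is given below; the variable names are illustrative, and the ordering produced here is not claimed to be the standardized order presented to the participants.

```python
from itertools import product

# Linguistic levels for each lifting variable, as listed in the Procedures.
LEVELS = {
    "WL": ("light", "moderate", "heavy"),            # weight of load
    "HD": ("close", "moderate", "far"),              # horizontal distance
    "TW": ("small", "moderate", "large"),            # twisting angle
    "FL": ("negligible", "low", "moderate", "high"), # frequency of lift
}

# One dictionary per lifting condition: the full factorial crossing.
conditions = [dict(zip(LEVELS, combo)) for combo in product(*LEVELS.values())]

print(len(conditions))  # 3 * 3 * 3 * 4
print(conditions[0])
```

Each participant rated every one of these conditions on the seven-point PPE scale in both trials, which gives a sense of the amount of information each evaluator was asked to process.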
Logistic regression analyses indicated that, for the entire data set, WL was the most important lifting variable affecting PPE (odds ratio “OR”=6.76), followed by FL (OR=2.99), HD (OR=2.0), then TW (OR=1.67). These results supplement the aforementioned statistical analyses. The findings clearly demonstrate, as previously shown in our preliminary studies (Yeung et al., 1999; Karwowski et al., 1999), that WL is the most important lifting variable affecting PPE. WL is, at a minimum, two times more important than the other lifting variables. In additional logistic regression analyses, it was found that the logic of GS is comparable to that of HSP, which provides additional support to the analysis of the entire data set. Differences were only observed in terms of the magnitude of the effects on PPE.

The participants were asked to estimate the numerical values associated with the linguistic values of lifting variables. The results showed the following: 1) WL for HSP: light=9 lb, moderate=25 lb, heavy=55 lb; 2) WL for GS: light=17 lb, moderate=40 lb, heavy=75 lb; 3) FL (lifts per minute) for HSP: low=2.5, moderate=8, high=16; and 4) FL for GS: low=8.5, moderate=14, high=21. These findings indicate that the younger GS gave consistently higher numerical estimates than the older HSP. It is further observed that the numerical evaluation of lifting variables is not as reliable as the linguistic evaluation.
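One simple way to read the statement that WL is at least two times more important than the other variables is to compare the reported odds ratios directly. The sketch below does only that; treating the ratio of two odds ratios as a measure of relative importance is an assumption made here for illustration, not a method described by the authors.

```python
# Odds ratios reported for the entire data set.
ODDS_RATIOS = {"WL": 6.76, "FL": 2.99, "HD": 2.0, "TW": 1.67}

# Rank the lifting variables from largest to smallest odds ratio.
ranked = sorted(ODDS_RATIOS, key=ODDS_RATIOS.get, reverse=True)

# Ratio of the WL odds ratio to each of the others (illustrative reading only).
relative_to_wl = {
    name: ODDS_RATIOS["WL"] / value
    for name, value in ODDS_RATIOS.items()
    if name != "WL"
}

print(ranked)
for name, ratio in relative_to_wl.items():
    print(f"WL / {name} = {ratio:.2f}")
```

On this reading the smallest ratio is WL against FL, at a little over 2, which matches the "at a minimum, two times" wording.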
The participants estimated the perceived risk of lifting injury “PRLI” (defined linguistically in this study as the degree of harm that a person perceived as a consequence of performing lifting activities, and rated on a scale similar to that of PPE) associated with each level of PPE. On average, the findings showed that: 1) the “extremely low” and “very low” PPE values were equivalent to “extremely low” PRLI for GS and HSP; 2) the “low”, “moderate”, and “high” PPE values corresponded to “very low”, “low”, and “moderate” PRLI values, respectively, for both GS and HSP; 3) the “very high” and “extremely high” PPE values were identical to those of PRLI for GS and were comparable to “high” and “very high” PRLI for HSP.
Conclusions
The major conclusions drawn from this study are: 1) Although the results of this preliminary investigation demonstrated the reliability of linguistic evaluations, it is clear to us that the amount of information evaluated via the use of human expertise may have a significant impact on the reliability of the data collected. Thus, an alternative approach is to conduct a large scale study in different workplaces in order to establish the relationships between a wide range of lifting workloads and the risk of OLBD. The workers would then be required to evaluate only their own jobs. Consequently, the amount of information requested for evaluation may greatly diminish and the reliability of data gathering may markedly improve. 2) The weight of load is the most important lifting variable in terms of its effects on PPE. 3) The linguistic evaluations of manual lifting tasks may be more reliable than the numerical evaluations. 4) The linguistic values of perceived risk of lifting injury do not correspond on a one-to-one basis with those of perceived physical effort. In general, the linguistic value of perceived risk of lifting injury is one level lower than that of perceived physical effort.
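The reported correspondence between PPE and PRLI levels can be written as a small lookup table, which makes the "one level lower" pattern in the fourth conclusion easy to inspect. The following is a minimal sketch built only from the average mappings stated above; the names `LEVELS`, `PPE_TO_PRLI` and `level_shift` are hypothetical.

```python
# Seven-point linguistic scale shared by PPE and PRLI, lowest to highest.
LEVELS = [
    "extremely low", "very low", "low", "moderate",
    "high", "very high", "extremely high",
]

# Average PPE -> PRLI correspondence reported for both groups.
_COMMON = {
    "extremely low": "extremely low",
    "very low": "extremely low",
    "low": "very low",
    "moderate": "low",
    "high": "moderate",
}

PPE_TO_PRLI = {
    # GS: the two highest PPE levels map to identical PRLI levels.
    "GS": {**_COMMON, "very high": "very high",
           "extremely high": "extremely high"},
    # HSP: the two highest PPE levels map one level down.
    "HSP": {**_COMMON, "very high": "high",
            "extremely high": "very high"},
}


def level_shift(group, ppe):
    """Return PRLI level index minus PPE level index (negative = lower risk)."""
    prli = PPE_TO_PRLI[group][ppe]
    return LEVELS.index(prli) - LEVELS.index(ppe)


if __name__ == "__main__":
    for group in ("GS", "HSP"):
        print(group, [level_shift(group, ppe) for ppe in LEVELS])
```

For HSP the shift is one level down at every PPE level except the lowest, where no lower PRLI level exists; for GS the shift disappears at the two highest levels.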
References
Karwowski, W., Genaidy, A., Huston, R., Yeung, S., and Beltran, J., 1999, Development of the quantitative model for application of workers’ expertise in evaluating manual lifting tasks. In Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting, 647–651.
Waters, T.R., Putz-Anderson, V., Garg, A., and Fine, L.J., 1993, Revised NIOSH equation for the design and evaluation of manual lifting tasks, Ergonomics, 36(7), 749–776.
Yeung, S., Genaidy, A., Karwowski, W., Huston, R., and Beltran, J., 1999, Application of the human expertise-based model for evaluation of manual lifting tasks in the Hong Kong worker population. In Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting, 652–655.
MANAGING A MANUAL HANDLING RISK ASSESSMENT PROCESS Janet Crowhurst, Bernie Catterall & Glyn Smyth Human Applications, 139 Ashby Road, Loughborough, Leics LE11 3AD, UK
This paper outlines the experiences of the authors in implementing a manual handling risk assessment process in a number of large multi-site distribution companies. It discusses the limited real-world value of generic risk assessments and comments on problems with the application of the HSE numerical guidelines in the distribution industry. The role of the ergonomics consultant and the position of manual handling risk assessments within the wider context of an overall risk assessment programme are also considered. The practical problems of implementing and managing a risk assessment process across several multi-site organisations with high levels of manual handling are discussed, and a number of positive conclusions are drawn. The quality of the risk management systems in place was found to be as important as the quality of individual assessments in addressing the myriad of handling risks within these complex work environments.
Introduction
The Manual Handling Operations Regulations 1992 have been in place since 1st January 1993, and yet there are numerous companies and organisations that are not only unaware of the content (or even existence) of these regulations, but are also ignorant of their potential implications. Recently, a manual handling injury combined with a lack of risk assessment resulted in a civil claim for personal injury at work. The case, against the Metropolitan Police, followed a back injury sustained by an employee whose duties were changed (to involve lifting boxes of stationery). The Force was shown to have been negligent in its expectations of the claimant’s work and, what is more, could not produce a risk assessment of the revised manual handling duties in question. The claimant was awarded £384,497. This example of negligence highlights a handling-related problem that is apparent throughout many companies. There is also evidence of more cases reaching the courts.
Judges are not only taking a harsh line on companies who cannot produce risk assessments, but are also questioning the quality and competence of any assessments made. Most significantly, they are questioning the adequacy of generic assessments in dealing with real-world issues. It is the duty of the employer to complete a competent risk assessment programme under the Management of Health & Safety at Work Regulations 1992.
Risk Assessment Process Risk assessment is not an end in itself: it exists as an analysis tool specifically to enable prioritisation of the management actions required to deal with the range of risks faced. The concept of risk, in this context, is composed of two separate components: an estimation of the reasonably foreseeable consequences if a hazard is realised (in terms of injuries or ill-health); and secondly, an estimation of the likelihood of realisation of the identified hazard. A simple but effective formula can then be used to calculate the level of risk for a given task. In a simple 3-point scale, ‘consequence of hazard’ and ‘likelihood’ are given a rating of low (1), medium (2) or high (3) according to agreed criteria. The two figures are then multiplied together, giving a figure which is translated into a level of overall risk, namely low (1 or 2), medium (3 or 4), or high (6 or 9). This then enables management to identify those tasks which should be addressed in a risk control programme and with what urgency. We were asked by a number of large distribution companies to implement a risk management process. The risk process we implemented began with a comprehensive breakdown of the work activities at each distribution depot e.g. goods are delivered, unpacked, sorted and stored, re-loaded on lorries and distributed to sites across the country. It was then possible to divide further the different tasks into logical groups, such as warehousing, site services, transport, etc and, from this, to generate a separate task listing of the manual handling tasks. A clear coding system was applied for easier and more accurate assessment referencing, and to facilitate the future review and audit processes. The codes were specific to each of the tasks in question and a digit was added to denote each depot/location where the tasks occurred. Once these initial task analyses were complete, the next step was to perform initial walkthrough assessments. 
These identified the high risk concerns and put them in order of priority for later detailed assessment. The assessments were carried out to identify the specific issues which contribute to the hazard or its likely realisation. Risk controls were then determined and local management action plans identified and implemented. Implicit within the regulations is the idea that assessment programmes must also be auditable, i.e. open to formal review. The need to be able to track assessments against an evolving risk management plan (and against change) is, in our experience, not well considered in most assessment programmes.
Manual Handling Risk Assessments
The aim of the manual handling aspect of the project was to instigate a manual handling risk management programme that could be undertaken and managed at a local level, whilst being overseen by experienced external consultants. Despite the presence of large amounts of mechanical equipment to perform the distribution work process and an earlier attempt to undertake handling assessments, there remained a significant number of manual handling tasks requiring further or more detailed assessment. Our initial reviews found that each of the companies had completed a number of manual handling risk assessments but these were generally of a poor quality and failed to reflect the full range of manual handling risks faced. In addition, the assessments lacked a structure for managing the required control actions. The first step taken to rectify this involved the task analyses described above. This relatively straightforward process showed a large number of similarities between tasks, which could then be tackled as “chunks” of a more manageable nature. Starting from the initial task analysis, we then generated more complete and ‘localised’ lists of manual handling activities specific to each of the depots.
These then formed a management tool which allowed the different organisations and locations to get a feel for the scale and nature of the manual handling-related issues at both central and local levels. Many of the manual handling tasks identified had not been assessed by any of the companies involved. Many handling tasks had simply ‘slipped through the net’ and had inadvertently (but often surprisingly) been overlooked. The task
analysis was therefore vital in ensuring sufficient coverage of manual handling concerns in each organisation. Our experience with these distribution companies was that the handling-related aspects of the risk assessment process could not realistically be tackled in isolation from general health & safety issues, specifically in terms of prioritising time and resources to deal with the risks identified. Also, as important as identifying key risk concerns was the clear identification of relatively ‘trivial’ handling risks which could be assigned a much lower priority. Identifying those manual handling tasks which pose lesser risk is an important output from an effective risk management system. Prioritisation of the full range of handling risks faced at each depot proved much more effective than previous systems in prompting managers (both at local and central levels) to act. Task analyses were followed by production of a comprehensive set of initial ‘walkthrough’ assessments (describing the full range of handling activities identified at each location, in order to conduct a brief initial comparative risk ranking exercise). These ‘walkthroughs’ defined a schedule for the conduct of detailed risk assessments within time-scales appropriate to the levels of initial risk identified. Prior to our involvement, all the companies involved had moved directly to detailed risk assessment, to comply with legislation, without a clear plan or method of prioritisation. Generic versus Individual Risk Assessments In compliance with the European legislation on manual handling and closely following the HSE guidelines on such matters, we were able to implement a methodology for managing the risks throughout the various locations. Our first consideration was whether generic risk assessments might suffice for the similar tasks across the different depots. This was soon rejected, however, owing to the considerable real-world variety of local workplace conditions. 
For example, each depot specialised in handling different goods, whether fresh produce, frozen goods, tins and bottles, etc. The question was whether the resulting control measures would differ for each, or could be generalised across all depots. As a rule, generic risk assessments were, in our experience, surprisingly inappropriate. The number of variables involved in a manual handling assessment was such that even if the tasks were seemingly “identical”, the variety in the physical design, the individuals performing the tasks, or the systems of work in place at different sites meant that we had little confidence in a generic assessment of handling-related ‘risk’. This was a significant problem with such a large number of assessments and where there seemed to be considerable overlap between sites and tasks. For example, the task of ‘picking’ (the job of removing goods from storage racks or shelving and placing them into roll cages) is performed at each depot, but there were many significant variations between the sites, such as the size and weight of the goods, the temperature in the warehouse, the numbers performing the task and the different systems of work in place. Initial walkthrough assessments quickly demonstrated significant differences in many apparently similar tasks at the various depots. The quality of these initial summary assessments was, we found, critical both to the effective prioritisation of the handling risks at a given depot and to aiding a recognition that the tasks were not in fact ‘generic’.
The Numerical Guidelines in Practice
One of the difficulties we experienced concerned the application of the HSE Numerical Guidelines to the workplaces under assessment. Their usage is intended only as a filter and they are not to be regarded as safe weight limits for lifting, but they clearly highlight the fact that many manual handling tasks performed in
the workplace are inherently putting people at a risk of injury. In many areas of the distribution industry, weights are handled and forces applied to move objects that far exceed the guidelines. This was found for a significant number of handling tasks performed in the depots. For example, items of stock weighing up to 15 kg were retrieved from racks up to 1.8 m high, whereas the HSE recommended ‘limit’ for men may be interpreted as 5 kg at arm’s length at head height. As such, this and many other manual handling tasks which constitute ‘normal’ activities in these companies seem (to managers who are not handling experts) to be so far out of line with the guidelines that there seems to be no realistic or viable starting point except, it appears, to ‘ban’ everything, which is clearly ludicrous. Our solution to this was to improve prioritisation, using the ‘walkthrough’ process to enable managers to determine which activities were indeed most ‘risky’. This was made possible since hazard estimation (always high in these circumstances) was often tempered with additional data indicating a lower likelihood of exposure for specific tasks, and thus a lower level of comparative risk. Risk control remained urgent for these tasks but at least priorities could be established. An additional benefit of the walkthrough process is that it provides evidence (to a court of law in a worst case scenario) that, where risk assessments may not yet be available for all tasks, there is a system in place that justifies the order in which the issues are being considered.
Risk Assessor Competency
As part of the projects, we trained a number of employees to become competent risk assessors, both in manual handling and general risk assessment. They were overseen locally by the depot Health & Safety Officers and the overall process was monitored by ourselves.
The aim was to implement a standardised approach to the overall risk assessment process, yet to ensure the different assessments were completed by individuals who were familiar with the specific tasks. The risk assessors could be deemed to be demonstrably competent in their role since they had undergone a training course providing not only the theoretical knowledge but the practical skills. The courses were externally accredited and moderated by the Institution of Occupational Safety and Health. This external validation provided the companies with a standardised approach to risk assessment across their numerous depots. Local task knowledge was found to be invaluable and enabled the process of monitoring and reviewing to be undertaken on site by the same trained and experienced individuals. The disadvantage, however, was the problem of staff availability to perform the risk assessments, a common difficulty when employing in-house personnel. Risk Management It was not difficult to convince the employers of the importance of handling safety, the threat of handling accidents at work or the potential cost of compensation claims. Under health & safety law it is the employer’s responsibility to consider the ‘reasonably foreseeable’ risks and to reduce such risks ‘so far as is reasonably practicable’. Since there are no hard and fast rules for doing this, managers differ in the extent to which they are prepared or able to manage the process. We found that a rigorous and auditable assessment system overcame many of these difficulties. Arising from this risk assessment process, significant capital investment has been made at all of the depots concerned in controlling or reducing the handling risks faced. The provision of additional or more suitable mechanical-handling equipment (MHE) was a common solution. 
However, the ever-present pressure of workloads and problems with manning-levels at peak periods (particularly in using ‘agency’ or other temporary staff) presented continuing difficulties. The need for well-supervised safe systems of work for handling tasks and the provision of suitable and locally-appropriate safe handling training (as well as
training in the safe use of MHE) continue to be essential components of effective local risk control. Some handling tasks (e.g. arising from the need to pick from ‘split-racking’) can only realistically be solved if company-wide handling-related risk reduction initiatives are implemented (e.g. resulting in wholesale redesign/repositioning of existing or future racking systems). Initiatives to address these problems are underway at each of the companies concerned.
Conclusion
Manual handling risk assessment is not an end in itself. It should be considered as a management resource for identifying and prioritising risk management requirements. This is the approach not only of ergonomists but also that implicit within the European health & safety legislation of 1992. What this means in practice is that employers should set up and maintain a simple but effective process whereby a complete handling task listing is first created from a task analysis. ‘Walkthrough’ assessments are then carried out to gather initial data about each handling task and to assign an initial risk priority. Detailed risk assessments are then completed, and control measures to reduce the risks are identified and set out in a clear management action plan. Only once this process has been instigated will managers have confidence in what they are doing to reduce handling risks. There is a temptation to amalgamate tasks that appear to be similar by performing generic risk assessments, but experience taught us that this was not an effective way of managing handling-related risks since there were too many local variables that were then not appropriately addressed. It is also important to develop a strategy to manage all aspects of risk assessment concurrently, whether general safety concerns, manual handling or others, so that the full spread of risks (particularly high risks) can be dealt with as priorities as well as in relation to one another.
Once an effective prioritisation system is in place, it is then a matter of regularly monitoring and reviewing the process of risk control, and maintaining an effective audit trail in an ongoing quality assurance programme. Only then can improved handling safety at work be achieved, legislation be complied with and productivity be improved.
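The 3-point risk rating described in the Risk Assessment Process section, on which the prioritisation above depends, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the ratings and bands are those given in the text (low = 1, medium = 2, high = 3; product 1 or 2 = low, 3 or 4 = medium, 6 or 9 = high), while the function names and the example tasks are hypothetical and do not come from the assessments themselves.

```python
# Ratings on the simple 3-point scale described in the text.
RATING = {"low": 1, "medium": 2, "high": 3}


def risk_level(consequence, likelihood):
    """Multiply the two ratings and translate the product into a risk band.

    Possible products are 1, 2, 3, 4, 6 and 9:
    low (1 or 2), medium (3 or 4), high (6 or 9).
    """
    score = RATING[consequence] * RATING[likelihood]
    if score <= 2:
        level = "low"
    elif score <= 4:
        level = "medium"
    else:
        level = "high"
    return score, level


def prioritise(tasks):
    """Sort (name, consequence, likelihood) tuples, highest risk score first."""
    scored = [(name,) + risk_level(c, l) for name, c, l in tasks]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical walkthrough entries, for illustration only.
    walkthrough = [
        ("picking from high racking", "high", "high"),
        ("moving empty roll cages", "low", "medium"),
        ("unloading mixed pallets", "high", "low"),
    ]
    for name, score, level in prioritise(walkthrough):
        print(f"{name}: score {score}, {level} risk")
```

Note that a high-consequence task with a low likelihood scores 3 and falls in the medium band, which mirrors the point made earlier that a lower likelihood of exposure can temper a high hazard estimate without removing the task from the control programme.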
Hands & wrists
THE MEASUREMENT OF RANGE OF MOVEMENT OF THE WRIST: MAN OR MACHINE? George E.Torrens & Anne Newman Hand Performance Research Group, Department of Design and Technology, Loughborough University of Technology, Loughborough Leicestershire LE11 3TU, UK
This paper documents a pilot study undertaken to compare the effectiveness and efficiency of manually measuring the range of movement at the wrist using both a traditional goniometer and a motion capture technique (using a CODA system). Fifteen subjects, male and female 18–22 years old were measured using both techniques. The results indicate a variation of adduction and abduction values at the wrist, beyond that stated in existing references. The manual goniometric data collection was quicker than using the CODA system. A longer time period was spent in preparation and post processing of the results from the CODA system into a spreadsheet. However, the CODA system provided a more detailed description of the ROM of a wrist, highlighting the compound angle made by the wrist during the measurement of adduction and abduction which is not shown through manual goniometry. Introduction It has been established that position of forearm, wrist and hand during manual handling or manipulative tasks can affect an individual’s ability to grip an object (Pryce, 1980). However, the authors do not know of any work being undertaken to investigate the relationship between wrist mobility and an individual’s ability to perform a manual handling task. This investigation is part of a programme of work to define the characteristics and performance of the hands using a sample of the United Kingdom population, sponsored by the Defence Clothing and Textiles Agency. The tests documented within this paper describe the use of two methods of goniometric measurement, conventional goniometric measurement by an operator using a goniometer, and the angular measurement taken using a motion capture machine. The aim of this trial was to review the effectiveness and efficiency of the motion capture machine, Cartesian Optoelectronic Dynamic Anthropometer (CODA), in recording Range Of Motion (ROM) at the wrist, against manual measurement of the same angles. 
It was also to identify any relationships between conventional measurements of hand characteristics, anthropometrics and grip strength. The appropriateness of using a motion capture system to record anthropometric characteristics of a subject within a trial is also discussed. Measurements were taken from a population of 157 first year undergraduate students (of whom 76 were female) from the Department of Design and Technology, Loughborough University and the Department of Textiles, Loughborough School of Art and Design, Loughborough University. They were aged between
eighteen and twenty-two years old. No student had any impairment or abnormality in his or her hands. The trial ran between January and June 1999. Method The subjects were invited to take part in the trial which was undertaken in two phases, measurement of anthropometric data, followed by measurement of other aspects of hand performance such as grip strength, finger compliance and finger friction, detailed in previous references (Torrens and Gyi, 1999). The series of measurements to be taken were discussed with subjects and their written approval given before taking part in the trial. All measurements were taken in a room with ambient temperature, humidity and away from direct sunlight. In phase one, anthropometric measurements were taken from all 157 subjects. The measurements taken were of the right hand only, for speed of measurement, and the subject’s dominant hand recorded. BS 7231 (1990) was used as a guide to method of anthropometric measurement. In phase two, grip strength, finger friction, finger compliance and ROM measurements were taken from thirty-five subjects, fourteen of whom were female. Finger friction and compliance were also measured but are not discussed in this paper. Thirty-five subjects from the sample population had the ROM of their wrists measured and were chosen to be representative of the anthropometric percentile range of the population. Two operators processed subjects in groups of five. Due to time constraints, each measurement was taken only once, manual and machine measurements were taken in the same order each time and by the same operator. A consultant ergonomist who was experienced in taking anthropometric measurements was one of the operators. Data was recorded by hand. The finger friction and compliance measurements required a rest period of at least five minutes between each measurement and so the wrist goniometric (ROM) measurements were done between each finger test. 
Grip strength measurement followed the protocol described by Mathiowetz et al. (1985) and Torrens and Gyi (1999). ROM of the hand about the wrist was measured using a modified conventional clinical joint motion protocol (Rowe et al., 1965). The flexion and extension of the hand about the wrist were measured with the hand in a mid-supinated (or neutral) position, i.e. at 90° to the horizontal. The authors were aware this would make the comparison of collected data with existing ROM references difficult, but would not affect the validity of the comparison between manual and machine measurements of the same subject. The chair in which the subject sat was fixed to avoid the need to adjust the CODA field of view for each new subject. It was necessary to constrain the limb position. A single CODA system was used and so the positioning of the arm in relation to the field of view of the CODA scanner was critical to ensure all the markers remained in the field of view. Each subject sat upright in a conventional steel contract chair (seat height 400 mm), with their right forearm held at a right angle, horizontal to the floor. The subject was shown the sequence of hand movements required. These were flexion (into the body), extension (away from the body), adduction (upward), and abduction (downward). All hand movements were performed with the subject keeping their arm resting on the armrest. Manual measurements were taken using a clear plastic, 300 mm long, clinical goniometer, as supplied by Nottingham Rehab Supplies Limited, Nottingham, U.K. The operator asked each subject to hold their maximum comfortable hand movement in each direction whilst they took the measurement from the back of the hand (for flexion/extension), and from the axis of the forearm through the centre of the wrist to the envisaged line running through the centre of the third digit (middle finger) for adduction/abduction.
Motion capture measurements were taken using a CODA mpx30, supplied by Charnwood Dynamics Limited, Leicester, U.K. This system uses infrared emitters (markers) attached to the body, via surgical double-sided tape, and stereo sensors within a scanner unit to define an anatomical location on the body.
Each subject was fitted with five markers to locate the anatomical references of the humerus (marker one) at its proximal point, ulnar (marker two) and radial (marker three), next to the wrist processes. The proximal side of the metacarpophalangeal joints of the second (marker four) and fifth digits (marker five) were also marked. With the markers in position, the subject was asked to follow the sequence of movements previously shown to them. The CODA system had been set to sample at 200 Hz over a ten second period. Earlier pretrial tests had shown that this was a sufficient amount of time within which the movements could be completed. The goniometric data from each motion capture was reviewed and the maximum angles taken from a graph generated by the CODA system software that enables windows-based views. The data was manually transferred on to a spreadsheet following completion of the trial. Only complete sets of data were processed in full. The data sets within the completed spreadsheet were compared through correlation analysis and descriptive statistics. Correlation analysis was used to help identify where the measurements taken manually and by machine were comparable. This form of comparison also highlighted the interrelationships between each measurement for further evaluation. Due to the small number of subjects, no further evaluation was undertaken on the data.
Results and discussion
Once processed, fifteen complete sets of data from nine males and six females provided the basis for correlation analysis. The lost data was due to operator and computer error. The data gathered from subjects, documented in Table 1, show that the stature of the subjects ranged from 1922 mm (Subject 4, male) to 1522 mm (Subject 11, female). The final sample group was not fully representative of the largest stature percentile in the original sample population, but did include the smallest female percentile. Weight ranged from 87.7 kg (Subject 8, male) to 51.5 kg (Subject 11, female).
These results correspond to equivalent existing data sets for a U.K. population accessed using the computer-based anthropometric database PEOPLESIZE (Open Ergonomics, 1999). Grip strength ranged from 58 kgf (569 N) (Subject 2, male) to 23 kgf (226 N) (Subject 15, female). There were strong correlations between stature and limb length segments. The ROM results, shown in Table 2, highlighted discrepancies between the measured adduction and abduction values and those published (Rowe et al., 1965). Whilst reviewing the motion capture files through the CODA software, it was noted that subjects did not hold their wrists vertically during the motion performance and that the angle made between the forearm, wrist and hand was a compound angle even when the hand appeared to be in line with the forearm. A reason for the high abduction value of the female subject 11, shown in Table 2, may be seen in the motion recording. Analysis of her wrist using a graph plot and stick-figure diagram showed that her wrist had rotated over 36° to the vertical, giving the motion a component of extension rather than adduction alone. There were strong negative correlations between abduction and anthropometric values in the female group; the corresponding correlations within the male group were not significant. Following the completion of this study the authors and Charnwood Dynamics Limited have each developed methods to avoid the laborious location of markers on anatomical references on a subject, which should significantly reduce preparation time and increase repeatability. The authors are experimenting with a glove with markers attached; Charnwood Dynamics Limited have developed a software solution where the markers are grouped to define body segments.
Manual goniometric measurement of wrist motion seems the most cost-effective method of data collection. However, using the CODA system has highlighted a number of issues, such as the need to reduce the compound angle between hand, wrist and forearm during measurement when using a manual goniometer. Taking multiple measurements of the whole upper limb at the same time would
Table 1. Age, Weight and Anthropometric measurements in millimetres of fifteen subjects (1–9 male, 10–15 female)
Table 2. Goniometric measurements, in degrees, of fifteen subjects (1–9 male, 10–15 female)
Table 3. A summary of issues when using manual and motion capture techniques for goniometric measurement.
greatly reduce the overall time taken when compared with undertaking the individual goniometric measurements manually. The relationships between measured hand characteristics highlighted in this pilot warrant further study.
References
British Standards Institution 1990, BS 7231, Part 1, Body measurements of boys and girls from birth up to 16.9 years.
Mathiowetz, V., Kashman, N., Volland, G., Weber, K., Dowe, M., Rogers, S. 1985, Grip and Pinch Strength: Normative Data for Adults. Archives of Physical Medicine and Rehabilitation, 66, 2
Open Ergonomics Limited 1999, PEOPLESIZE software, Loughborough.
Pryce, J.C. 1980, The wrist position between neutral and ulnar deviation that facilitates the maximum power grip strength. Journal of Biomechanics, 13, 505–511
Rowe, C.R., Heck, C.V., Hendryson, I.E. 1965, Joint Motion: Method of measuring and recording. American Academy of Orthopaedic Surgeons, United States of America.
Torrens, G.E., Gyi, D. 1999, Towards the integrated measurement of hand and object interaction. 7th International Conference on Product Safety Research, European Consumer Safety Association, U.S. Consumer Product Safety Commission, Washington D.C., p 217–226, ISBN 90–6788–251–8
HAND FUNCTION TESTS FOR WORKERS EXPOSED TO HAND-TRANSMITTED VIBRATION Barbara M.Haward & Michael J.Griffin Human Factors Research Unit, Institute of Sound and Vibration Research University of Southampton, Highfield, Southampton, SO17 1BJ
Occupational exposure to hand-transmitted vibration can initiate vascular, neurological and musculoskeletal problems, resulting in reductions in grip strength and manual dexterity. Changes can be measured using grip strength and manual dexterity tests, as part of workplace health surveillance. The interpretation of such tests requires normative data from healthy individuals and an understanding of factors that influence measurements. Hand dimensions, grip strength and manual dexterity were measured for a group of 18 male subjects (mean age 20.1 years). There were few consistent associations between hand dimensions and grip strength or between hand dimensions and manual dexterity. Order of testing hands did not affect grip strength or manual dexterity. Further work is needed to assess the repeatability of the tests and investigate the extent to which their performance depends on age and occupation.
Introduction
From a recent UK postal questionnaire survey, Palmer et al. (1999) estimated that 4.2 million men and 667,000 women were occupationally exposed to hand-transmitted vibration in a one-week period. There is evidence from experimental and epidemiological studies that vibration transmitted to the upper limbs can initiate vascular changes, neurological disorders and musculoskeletal problems. Workers who have vibration-induced hand and upper limb injuries often have reduced tactile sensitivity and impaired musculoskeletal function. These may be manifested as changes to manual dexterity and strength in the hands. To assess musculoskeletal function and manual dexterity, several tests are required. A literature review of available tests indicates that few of the current tests have comprehensive testing protocols, normative data and test reliability data. Grip strength testing can be used to assess changes in muscle force, as an indicator of musculoskeletal function, while manual dexterity can be measured using the Purdue Pegboard.
Both tests are easy to administer, are of short duration and use standardised equipment, which makes them suitable for use as part of health surveillance procedures in the workplace. Some normative data are available for both tests, but these do not give complete information relating to variables that may influence test results, such as subject characteristics, hand dimensions or the order in which the hands are tested.
Aims of the research
When recommending tests for use with those occupationally exposed to hand-transmitted vibration, it is important to have baseline data from healthy individuals for comparison. Using a sample of healthy subjects, the purpose of this initial experiment was to investigate the relationships between:
1. Hand dimensions and grip strength
2. Hand dimensions and manual dexterity
3. The order in which hands are tested (e.g. dominant hand first or second).
Methods
Safety and Ethics
Prior to commencement of the experiment, approval was gained from the University of Southampton, Institute of Sound and Vibration Research, Human Experimentation Safety and Ethics Committee.
Subjects
The UK postal questionnaire survey (Palmer et al. 1999) indicated that the number of male workers exposed occupationally to hand-transmitted vibration exceeded the number of female workers almost tenfold. It was therefore decided to restrict the study to male subjects. Some normative data studies have shown age effects (e.g. Mathiowetz et al. 1985), especially for subjects older than 50 years. For this study, subject age was restricted to 20±2 years to eliminate any confounding effects of age.
Hand dimensions
Nine measurements were taken for each of the subjects’ hands. These were hand length, palm length, hand breadth, metacarpal hand breadth, thumb length, index finger length, middle finger length, index finger grasp and middle finger grasp. Measurement positions were taken as illustrated by Pheasant (1986). Measurements were obtained with a tape measure along, or around, the finger or hand. Grasp size measurements were taken by asking the subject to form an ‘O’ shape with their thumb and finger, and then measuring the diameter of the ‘O’ shape.
Hand volume
In addition to linear dimensions of the hands, volume measurements were taken to investigate effects of body composition or musculature that vary between subjects. A simple displacement technique was used (based on that reported by Chau et al.
1997), whereby the subject immerses the test hand into a pre-weighed, full to the top, container of water up to the proximal crease of the wrist. The volume of the hand is determined by the weight of water displaced from the container.
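The displacement technique described above amounts to dividing the mass of displaced water by the density of water. The following is a minimal sketch, assuming the container is re-weighed after the hand is withdrawn and that the water is close to the stated room temperature; the function name, the example masses and the density value are illustrative assumptions, not details taken from the paper.

```python
# Approximate density of water at about 23 °C in g/cm^3 (assumed value;
# the paper does not report the water temperature or density used).
WATER_DENSITY_G_PER_CM3 = 0.9975


def hand_volume_cm3(mass_full_g: float, mass_after_g: float,
                    density: float = WATER_DENSITY_G_PER_CM3) -> float:
    """Estimate hand volume from the mass of water displaced.

    mass_full_g  -- mass of the container when full to the top
    mass_after_g -- mass of the container after the hand has been
                    immersed to the wrist crease and withdrawn
    """
    displaced_g = mass_full_g - mass_after_g
    if displaced_g < 0:
        raise ValueError("container cannot gain mass during immersion")
    return displaced_g / density


if __name__ == "__main__":
    # Hypothetical weighings, chosen only to illustrate the calculation.
    print(f"{hand_volume_cm3(5000.0, 4598.0):.1f} cm3")
```

Because the density of water is so close to 1 g/cm³, the displaced mass in grams is nearly equal to the volume in cubic centimetres, which is why a simple weighing is sufficient for this kind of measurement.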
Grip strength and Purdue Pegboard tests
Subjects sat on a height-adjustable office chair to perform the hand function tests on a bench of 760 mm height. Grip strength was measured using a Jamar 5030J1 hydraulic dynamometer set at the second handle position. Subjects sat with their elbows flexed to 90°, wrist in a neutral position and forearm supported on the bench, and were instructed to squeeze the dynamometer 3 times with each hand with a 10 second interval between attempts. The Purdue Pegboard was placed 5 cm from the edge of the bench to avoid any overstretching of the upper limbs to reach the required pins. The subject sat directly in front of the pegboard. Using each hand separately, the subject was instructed to pick up pins from the cup on the corresponding side of the board to the hand being tested and was timed using a stopwatch to see how many pins could be placed in 30 seconds. A practice test was carried out for each hand prior to the timed test. The test was completed once for each hand and then for both hands together.
Order of testing
Half the subjects were tested using their dominant hand first and the other half using their non-dominant hand first. All subjects performed the tests in the following order: hand dimension measurement, Purdue Pegboard, grip strength, hand volume. Hand volume was measured last to ensure that any skin cooling effects from placing hands in water did not influence performance of the function tests.
Temperature
All measurements were carried out in a test room at 23±1°C. Subjects acclimatised in the test room for a period of 15 minutes prior to the commencement of function tests. Subject finger skin temperature was measured using a thermocouple pinched between the index finger and thumb of each hand, to ensure that subjects’ hands were not unduly cold prior to the start of the experiment, as a temperature in excess of 22.0°C is required to perform the function tests satisfactorily.
The median measured finger skin temperatures were 32.6°C and 31.5°C, with ranges of 22.8°C to 35.2°C and 22.4°C to 34.8°C, for the right and left hands respectively.
Results
Subject characteristics
Eighteen subjects participated in the experiment (mean age 20.1±1.2 years). Fourteen subjects were right-handed and four were left-handed.
Hand dimensions and grip strength
Over the three grip strength trials performed with each hand, the median grip strength varied from 456 to 485 N for the dominant hand and from 441 to 451 N for the non-dominant hand. Median hand volumes were 403 cm³ and 380 cm³ for dominant and non-dominant hands, respectively.
Spearman’s rank order correlation coefficient (two-tailed) was used to examine relationships between hand dimensions and grip strength. Significant associations were found only between grip strength and hand breadth (Rs=0.623, p=0.06) in non-dominant hands, between grip strength and volume of the dominant hand in the four left-handed subjects (Rs=1.00, p<0.001) and between grip strength and volume of the non-dominant hand in the 14 right-handed subjects (Rs=0.593, p=0.025).
Hand dimensions and manual dexterity
The median numbers of pegs placed in the Purdue Pegboard were 16.0 (interquartile range 14.8 to 17.3) for the dominant hand and 14.5 (interquartile range 13.0 to 16.0) for the non-dominant hand. There were no significant associations between manual dexterity and either hand dimensions or hand volume for either hand.
Height, weight, grip strength and manual dexterity
Spearman’s rank order correlation coefficient was used to examine relationships between height and weight and both grip strength and manual dexterity. A significant association was found only between grip strength and weight (Rs=0.753, p<0.001) in non-dominant hands.
Order of testing
Test scores from subjects whose dominant hand was tested first were compared with those whose non-dominant hand was tested first, using the Mann-Whitney U test (two-tailed), to identify any significant differences due to test order effects. For grip strength, values of p=0.796 and p=0.605 were obtained for the dominant and non-dominant hands, respectively. For manual dexterity, values of p=0.666 and p=0.118 were obtained for the dominant and non-dominant hands, respectively. It is concluded that there were no significant effects arising from the order of hand testing in either test.
Discussion and conclusions
Median grip strength was lower than that reported by Mathiowetz et al. (1985) (523 N for dominant and 479 N for non-dominant hands), although Mathiowetz et al. used subjects with a wider age range (20–54 years).
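The Spearman rank order correlations reported in the Results can be illustrated with a short calculation. This is a minimal sketch using only the standard library: values are replaced by their ranks (tied values share the average rank) and the Pearson correlation of the ranks is taken. The function names and the example measurements are hypothetical and are not data from this study; the sketch also does not compute the p-values quoted in the text.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical hand breadths (mm) and grip strengths (N).
    breadth = [82, 85, 88, 90, 79]
    grip = [430, 470, 455, 500, 420]
    print(f"Rs = {spearman_rs(breadth, grip):.3f}")
```

A rank-based coefficient suits small samples such as the 18 subjects here because it makes no assumption that the measurements are normally distributed.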
Previous studies of the effects of hand size on grip strength have given mixed results: Firrell and Crain (1996) found no relationships, while Fiebert et al. (1998) found that grip strength was correlated with hand length in female subjects (Rs=0.33, p<0.001). The current data only show significant correlations between grip strength and hand breadth and between grip strength and hand volume in a subgroup. Normative values quoted in the Purdue Pegboard instruction manual (17.9 and 16.8 pegs for dominant and non-dominant hands respectively) were higher than scores obtained with test subjects. Again, these values were obtained with a broader range of ages (17–65 years). No previous report of a relationship between hand size and manual dexterity (as indicated by the Purdue Pegboard) could be found. The current data indicate no significant relationship. No relationships were found between stature and grip strength. For weight, a significant correlation was found only for the non-dominant hand. Some previous research provides different results. Fiebert et al. (1998) found a positive correlation between grip strength and stature in females (Rs=0.42, p<0.001) but not in males, and also a correlation with weight in females (Rs=0.66, p<0.001). Schmidt and Toews (1970)
found a positive association between grip strength and both height and weight in male subjects (less than 32 years of age), when using a modified handle of the Jamar dynamometer to improve grip. No previous studies could be found which examined relationships between manual dexterity and stature or weight. The current data indicate no significant relationship. The finding that order of hand testing does not affect results is useful: it simplifies procedures when assessing hand function and removes a potential source of error. It is unclear why grip strength and pegboard scores in this study were lower than those reported in previous research. The lower grip strength might be due to the subjects not yet having developed their peak grip strength: Schmidt and Toews (1970) recorded maximum grip strength in subjects aged 30±2 years. The lack of association between test performance and hand dimensions, subject height and weight strengthens the application of these tests to health surveillance. If associations had been observed, confounding effects of these factors would need to be taken into account when assessing hand function in workers who have had occupational exposure to hand-transmitted vibration. Further work is needed to investigate the two tests in different age ranges and in different worker groups. It is also necessary to assess the repeatability of the tests and the extent to which their results are related to neurological and musculoskeletal symptoms reported by workers exposed to hand-transmitted vibration in the workplace.
References
Chau, N, Perry, D, Bourgkard, E, Hugenin, P, Remy, E, Andre, J 1997. Comparison between estimates of hand volume and hand strengths with sex, age with and without anthropometric data in healthy working people. European Journal of Epidemiology, 13, pp 309–316.
Fiebert, IM, Roach, KE, Fromdahl, JW, Moyer, JD, Pfeiffer, FF 1998. Relationship between hand size, grip strength, and dynamometer position in women. Journal of Back and Musculoskeletal Rehabilitation, 10, pp 137–142.
Firrell, JC, Crain, GM 1996. Which setting of the dynamometer provides maximal grip strength? Journal of Hand Surgery, 21A, pp 397–401.
Mathiowetz, V, Kashman, N, Volland, G, Weber, K, Dowe, M, Rogers, S 1985. Grip and pinch strength: normative data for adults. Archives of Physical Medicine and Rehabilitation, 66, pp 69–72.
Palmer, KT, Coggon, D, Bendall, HE, Pannett, B, Griffin, MJ, Haward, BM 1999. Hand-transmitted vibration: Occupational exposures and their health effects in Great Britain. CRR 232/99. London: Health and Safety Executive, HSE Books.
Pheasant, ST 1986. Bodyspace. (Taylor and Francis, London).
Schmidt, RT, Toews, JV 1970. Grip strength as measured by the Jamar dynamometer. Archives of Physical Medicine and Rehabilitation, 51, pp 321–327.
Tiffin, J, Asher, EJ 1948. The Purdue Pegboard: Norms and studies of reliability and validity. Journal of Applied Psychology, 32, pp 234–47.
THE RELATIONSHIP OF WRIST POSTURE TO DISCOMFORT DURING REPETITIVE EXERTIONS E.J.Carey & T.J.Gallwey Ergonomics Research Centre, Department of Manufacturing & Operations Engineering, University of Limerick, Limerick, Ireland
This study investigated the effect of combined flexion/extension and radial/ulnar deviation of the wrist on discomfort for repetitive exertions in order to develop regression equations for the prediction of relative discomfort. Sixteen subjects participated in the study. The wrist positions were defined relative to the range of motion in each plane. Radial deviation and flexion caused significantly more discomfort than the other simple types of deviation. Extreme combined deviations gave more discomfort than extreme simple deviations. Contours of iso-discomfort were developed which can be used to assess the relative stress of wrist positions.
Introduction
Not enough is presently known of the dose-response relationships involved in the development of Work Related Musculoskeletal Disorders (WMSDs) to be able to make quantitative predictions of the probability of injury. Laboratory based experiments can be conducted, however, to examine the relationship between task factors and the subjective feeling of discomfort. Statistically significant equations for the prediction of wrist discomfort have been developed for a number of specific motions and exertions. Lin et al (1997) developed strata of equal discomfort which related the level of exertion, frequency and wrist flexion to wrist discomfort. Carey and Gallwey (1999) developed equations relating the amount of wrist deviation and the elapsed time to discomfort for repetitive motions. The effect of different levels of combined flexion/extension (F/E) and radial/ulnar deviation (R/U) on discomfort for repetitive exertions, however, has not been considered. In this study, 49 levels of wrist posture (7 in F/E and 7 in R/U) were examined. Each level of deviation was related to the Range of Motion (ROM) of the individual, based on the zone definitions of Drury (1987).
One objective of the study was to develop equations for the prediction of discomfort, which would identify the most uncomfortable wrist postures for this type of exertion. Another objective was to investigate the validity of the postural zones used which were arbitrarily defined based on expert experience, and have not been experimentally validated. The zones provide a great amount of detail and a quantitative means of postural analysis.
Method
Experimental Design
Sixteen right-handed male subjects participated in the experiment. The 49 wrist postures considered were based on the zone definitions of Drury (1987). There were 7 zones in each plane of deviation of the wrist (R/U and F/E): the neutral zone (10% of the ROM in one direction to 10% of the ROM in the opposite direction), and zones 1 (10–25% of the ROM), 2 (25–50% of the ROM) and 3 (>50% of the ROM) in each direction. The neutral zone was assigned to subjects separately from the other zones. The 48 non-neutral zones were divided into 4 groups of 12 zones each, where the zones in each group were expected to show similar discomfort levels. The zones were then ordered such that a “difficult” position, such as Zone 3 in F/E combined with Zone 3 in R/U, was followed by an easier position, such as Zone 0 in F/E combined with Zone 1 in R/U. This allowed the subject to recover from difficult positions, and reduced the confounding effect of performing the same type of deviations consecutively. The level of repetitiveness of the exertions was held constant at 15 exertions per minute. The level of force was 10N±1N. The dependent variable was subjective discomfort from a 10 cm Visual Analogue Scale (VAS).
Apparatus
The wrist deviation in two planes was measured using a twin-axis Penny & Giles Biometrics electrogoniometer, model XM65. The level of force was measured using a Biometrics P100 force gauge. The signals were amplified and passed to a PC using a National Instruments data acquisition board, model PCI-MIO-16XE-50, which contained an A/D converter. A rig consisting of a fixed flat horizontal surface on which the forearm rested, and a hinged surface to control the degree of flexion and extension of the wrist, was used to stabilise the upper limb and to allow full deviation of the wrist. The rig also provided a surface against which the force was applied.
A LabVIEW Virtual Instrument (VI) was written to control the communications protocol between the PC and the interface devices.
Procedure
Initially, the maximum angular deviation of the dominant wrist in each direction of movement was measured using the electrogoniometer. For the main experiment, a graph on the PC screen indicated the percentage of the ROM in F/E plotted against the percentage of the ROM in R/U. Ulnar deviation and extension were the positive deviations, and this resulted in a virtual wrist position on the PC screen which could be easily controlled. The neutral and desired positions were also shown on the graph. The subject moved the wrist to the desired position at the start of each treatment. A beeping sound from the PC signalled the start of the data acquisition. Two beeps, separated in time by 1 second, occurred every 4 seconds. The subject exerted the required force on the first beep, and released it on the second beep. After 5 minutes the subject rated the discomfort level on the VAS on the PC screen. This was followed by a rest period of 1 minute, and then the next wrist position was displayed. This cycle of events continued for 24 zones, which lasted for 148 minutes, or approximately 2½ hours. There was then a break of 1 hour. At the end of the break, another 24 zones were presented to the subject, again resulting in 148 minutes of experimentation. The neutral zone was presented to subjects during one of the sessions. In total, each subject exerted the 10N±1N force 3750 times over the course of the experiment.
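The zone scheme described in the Method can be sketched as a small classifier. This is a minimal illustration, assuming that a deviation is already expressed as a signed percentage of the ROM in one plane (ulnar deviation and extension positive, as in the Procedure) and that a value falling exactly on a boundary belongs to the lower zone; the paper does not state how boundaries were handled, and the names used here are hypothetical.

```python
from itertools import product


def signed_zone(percent_rom: float) -> int:
    """Map a signed % of ROM in one plane to a zone from -3 to 3.

    0 is the neutral zone (within 10% of ROM either side of neutral);
    1, 2 and 3 cover 10-25%, 25-50% and over 50% of the ROM. The sign
    follows the input, so negative zones are radial deviation or flexion.
    """
    magnitude = abs(percent_rom)
    if magnitude > 100:
        raise ValueError("deviation cannot exceed 100% of the ROM")
    if magnitude <= 10:
        return 0
    zone = 1 if magnitude <= 25 else 2 if magnitude <= 50 else 3
    return zone if percent_rom > 0 else -zone


# Every (F/E zone, R/U zone) pair: 7 zones in each plane.
ALL_POSTURES = list(product(range(-3, 4), repeat=2))
# All pairs except neutral in both planes.
NON_NEUTRAL_POSTURES = [p for p in ALL_POSTURES if p != (0, 0)]

if __name__ == "__main__":
    # 60% of ROM in flexion combined with 15% of ROM in ulnar deviation.
    print(signed_zone(-60.0), signed_zone(15.0))
    print(len(ALL_POSTURES), len(NON_NEUTRAL_POSTURES))
```

Enumerating the pairs reproduces the counts given in the text: 49 postures in total, of which 48 are non-neutral.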
Results
The discomfort values were recorded from 0 (no discomfort) to 10 (extreme discomfort). The data were standardised using a min-max standardisation procedure as follows:
standardised discomfort = 10 × (discomfort rating − minimum value) / (maximum value − minimum value)
where minimum value is the minimum value of each subject’s discomfort rating and maximum value is the maximum value of each subject’s discomfort rating.
The maximum average standardised discomfort occurred for extreme ulnar deviation combined with extreme flexion (5.62), followed by extreme radial deviation combined with extreme flexion (5.39). The lowest discomfort occurred for Zone 1 of ulnar deviation (1.22), followed by the neutral zone (1.32). Wilcoxon signed ranks tests were performed on each pair of zones. The difference between the neutral position and Zone 1 of ulnar deviation was not statistically significant (p>0.10). The neutral zone gave significantly lower discomfort than all positions involving radial deviation to Zone 2 or 3 (p<0.05). The neutral zone was not significantly different from any Zone 1 simple deviation. Zone 1 in flexion combined with Zone 1 in radial deviation was significantly different from neutral (p<0.05). Only two positions involving ulnar deviation caused significantly greater discomfort than the neutral zone. Both positions also involved flexion. There were significant differences between Zone 3 simple deviations and Zone 3 combined deviations for radial deviation, ulnar deviation and extension, e.g. extreme ulnar deviation caused significantly less discomfort when it occurred in neutral flexion/extension than when it occurred with Zone 3 in either flexion or extension (p<0.05).
Table 1. Regression equations for each simple and combined deviation
A=deviation as a percentage of the ROM in that plane D=standardised wrist discomfort (0–10) Simple regression equations were developed for each simple deviation and each combined deviation, predicting the amount of standardised discomfort as a function of the percentage of the ROM. The equations are shown in Table 1. The quadrants are indicated using the initial letters of each type of deviation, e.g. flexion combined with radial deviation is ‘FR Quadrant’. Contours of iso-discomfort were developed from the equations, as shown in Figure 1. Ulnar deviation and extension are positive deviations in the figure. Discussion The results suggest that a certain proportion of the maximum radial deviation or flexion causes more discomfort than the same proportion of ulnar deviation or extension, and that the EU quadrant results in less
Figure 1. Contours of wrist discomfort based on percentage of ROM
discomfort for a given percentage of maximum deviation than the other quadrants. The maximum deviation values used for the combined deviations are based on the ROM values in simple deviation, due to the difficulty of quantifying and controlling a proportion of the true deviation for combined motions. The contours of Figure 1 are only valid for the specific task examined, i.e. exertion of a 10N force in a downwards direction at a rate of 15 exertions per minute and for a duration of 5 minutes. The results do, however, support and extend the findings of other researchers such as Marras and Schoenmarklin (1993) who reported the maximum wrist deviation values of operators regarded as being at high and low risk of developing WMSDs. Radial deviation was found to have the lowest value of the wrist motions in terms of average population ROM, followed by flexion, extension and ulnar deviation. This study confirms that this is also the order of decreasing discomfort of the wrist motions, and places the combined deviation positions ahead of all of the simple deviations. The equations and contours developed are not intended to be used for the prediction of absolute discomfort levels. Since standardised discomfort is the dependent variable, the equations and contour lines predict the relative magnitude of the discomfort at each position where a value of 10 is assigned to the most extreme discomfort experienced by each individual. Some of the wrist positions investigated were quite large, and though subjects were instructed to keep the wrist in the centre of each zone, deviations from this position occurred. Also, the zones were not of equal size, and for certain adjacent zones, the angular difference between the zone centres was as small as 6°. The fact that the neutral zone did not yield the lowest discomfort score was a somewhat unexpected result, though previous studies have reported that the maximum gripping strength occurs in ulnar deviation (Hazleton et al., 1975). 
Further experiments are being conducted for greater deviations of the simple motions, and for different levels of force and repetitiveness in order to obtain a true picture of wrist discomfort.
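The min-max standardisation applied to each subject's ratings in the Results can be sketched as follows. This is a minimal illustration, assuming that each subject's ratings are rescaled so that the subject's own minimum maps to 0 and maximum maps to 10, which matches the statement that a value of 10 is assigned to the most extreme discomfort experienced by each individual. The function name and the example ratings are hypothetical.

```python
def standardise_ratings(ratings, scale=10.0):
    """Rescale one subject's ratings so that min -> 0 and max -> scale."""
    lowest, highest = min(ratings), max(ratings)
    if highest == lowest:
        raise ValueError("ratings are constant; standardisation is undefined")
    return [scale * (r - lowest) / (highest - lowest) for r in ratings]


if __name__ == "__main__":
    # Hypothetical VAS ratings from one subject across several postures.
    subject_ratings = [2.0, 4.0, 6.0]
    print(standardise_ratings(subject_ratings))
```

Standardising within each subject removes differences in how individuals use the scale, which is why the resulting equations and contours describe relative rather than absolute discomfort.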
Conclusions
• The lowest average standardised discomfort occurred for 10–25% of ulnar deviation and the highest for combined flexion with ulnar deviation of over 50% and combined flexion with radial deviation of over 50%.
• The radial deviation and flexion positions caused significantly greater discomfort than the ulnar deviation and extension positions.
• For deviations of less than 10% in each simple plane, the average standardised discomfort was less than 1 out of 10.
• Combined deviations resulted in significantly higher discomfort scores than the same proportion of simple deviations.
• More studies need to be conducted at different levels of force and repetitiveness.
Acknowledgement
The research documented in this paper is part of the BRITE-EURAM III Project BE96–3568 IDEA funded by the European Union.
References
Carey, E.J. and Gallwey, T.J. 1999, Discomfort Prediction from Postural Deviations of the Wrist, in M.A.Hanson (Ed.) Contemporary Ergonomics 1999, 296–300
Drury, C.G. 1987, A biomechanical evaluation of the repetitive motion injury potential of industrial jobs, Seminars in Occupational Medicine, 2, 41–49
Hazleton, F.T., Smidt, G.L., Flatt, A.E. and Stephens, R.I. 1975, The influence of wrist position on the force produced by the finger flexors, Journal of Biomechanics, 8, 301–306
Lin, M.L., Radwin, R.G., Snook, S.H. 1997, A single metric for quantifying biomechanical stress in repetitive motions and exertions, Ergonomics, 40, 543–58
Marras, W.S. and Schoenmarklin, R.W. 1993, Wrist motions in industry, Ergonomics, 36, 341–351
IMPROVING UTENSIL AND IMPLEMENT HANDLE DESIGN THROUGH ENHANCED ROTATION AND TILT Glen Heavenor Managing Director Rotilt Technology Limited, 118 Queen’s Drive, Glasgow, G42 8BJ
This paper presents a case study of a handle innovation which may solve the problem of developing and maintaining the neutral wrist position when applied to a wide range of implements and utensils. The author, a practising dentist, noted widespread reluctance among his patients to make more use of their wrists in order to improve the effectiveness of toothbrushing. Readily accessible data on population hand sizes and greater choice of materials in recent years have led to handles that are more ergonomic with regard to size, shape and texture. However, postural improvements have remained largely elusive. This may be because changes in head/handle alignment that improve implement comfort for use in one position lead to discomfort when the implement is used in a second position.
Introduction
Patent literature searches reveal many innovations for improving the usability of the toothbrush. Recently, the emphasis has been on changing the bristles, where length, diameter, alignment and configuration have all been altered in the attempt to design a toothbrush which removes plaque from the spaces between the teeth and under the gum margins without the need for highly user-sensitive techniques. However, such alteration in toothbrush head design may cause the application of excessive force to the gum attachment, with the resultant risk of increased gum recession. Various attempts have been made to increase bristle penetration by controlling the angle of presentation of the brush head through its alignment with the handle, as a means of improving usability through improved arm posture. An early innovation (Fell, 1934) describes a brush where the head is rotated in relation to the handle to form a tilted head.
The main disadvantage of such innovations described to date is that changes in head/handle alignment to improve comfort during angled presentation for use in one position tend to lead to discomfort when angled presentation is used in a second position. This type of problem is not unique to toothbrushes and dentistry but exists for almost any type of implement which is used in more than one position. A second and more widely applied innovation known as the “Bennett Bend” (Emanuel et al, 1980), based on bending the tool rather than rotating the handle, has been applied to implements such as hammers, brooms and toothbrushes. Schoenmarklin and Marras (1989a, 1989b) found that among novices, the use of hammers with handles bent 20 or 40 degrees resulted in less total ulnar deviation than hammers with conventional head/handle alignment. Unfortunately, as with the first
innovation, the postural improvements can only be demonstrated over a limited range of hammering activities.
Background
Between 1993 and 1995, approximately 750 adult patients from my dental practice opted to undergo their treatment under a system of forecasting where each patient was given a forecast for the year ahead containing the likely items of treatment required and the timing of this treatment, together with the recommended steps in home care to minimise the need for invasive treatment. As patients started to return for follow-up visits it soon became clear that the general standard of plaque control was inadequate if the hoped-for reduction in fillings and gum disease was to be realised. At this point it seemed that the most logical solution was to ensure that patients were observed brushing their teeth on each visit to the practice, to make toothbrushing instruction more like learning a sport or a musical instrument. The observed improvement in levels of plaque and reduction in bleeding of the gums was almost immediate, with many adult patients from 22 to 80 years of age volunteering that no-one had ever really shown them how to brush their teeth. However, even with this improvement, there still remained some degree of difficulty in persuading patients to angle the head of their toothbrushes to effect better cleaning between the teeth and into the small “gutter”, the gingival sulcus, between the gum and the tooth (Figure 1).
Figure 1. Degree of bristle penetration
This problem was present in all toothbrush designs observed during the coaching sessions, where ulnar deviation of the wrist is present when cleaning upper and lower front teeth and dorsiflexion is present when cleaning the outer aspects of the upper and lower right molars and premolars (Figure 2). Indeed, with some designs there is extreme ulnar deviation present irrespective of whether a tilted presentation is used or not. Unfortunately, encouraging patients to use an angled presentation of bristles increases the degree of ulnar deviation and dorsiflexion of the wrist. However, if handle design allowed wrist movements to be centred on a more neutral position, the additional ulnar deviation and dorsiflexion might be within a comfortable range. It is known that the greater the amplitude of a posture, the less it is tolerated for a prolonged time (Bhatnager et al 1985; Boussena et al, 1982). A number of patients adopting a pen-type grip for brushing had particular difficulty in working the bristles into the spaces between the teeth, due to the difficulty in applying enough pressure with this type of grip. In addition, patients with excellent manual dexterity balanced their fingers on the longitudinally aligned edges of the handle, creating substantial gaps between what the manufacturer intended as the gripping portion and the palmar surfaces of their hands. At this point I decided to modify a standard
Figure 2. Conventional alignment
Figure 3. Tilted alignment
toothbrush highly recommended by dentists and dental hygienists, the Sensodyne Search 3.5. This brush fulfils the design parameters for the head in that it is gentle on the gingivae and the bristles can penetrate into the approximal spaces and hard-to-reach areas. In addition, the filaments are smooth, have rounded ends and are flexible enough to achieve penetration into the subgingival areas without undue pressure (Beiswanger, 1995).
Method

The neck region of the standard toothbrush described was heated and bent in a number of different ways to discover the effect, if any, on the movements of the holding wrist when brushing different parts of the mouth. Wrist posture was studied when cleaning the outer aspects of the upper and lower teeth by dividing the arches into one front and two side sections corresponding to the changes in brush position when moving from the incisors to the molars and premolars. Of all the bends and twists applied to the softened neck, the only noticeable improvement was obtained through tilting the head by rotation in the same manner as described by Joseph Fell in 1934 (Figure 3). Tilting the head by 15 degrees appeared to eliminate dorsiflexion of the wrist when brushing the upper and lower outer side teeth and considerably reduced the extent of ulnar deviation when brushing the upper incisors, but introduced a small element of dorsiflexion in its place. The only area which seemed to be adversely affected by this modification was the outer aspect of the lower incisors, where an abrupt increase in upper arm elevation was required to align the brush effectively. This awkward posture may be the reason why the tilted toothbrush has not been more popular. If the tilted head improves wrist posture through a number of holding positions, what would happen if this one problem area could be eliminated? Perhaps it would be possible to combine the tilted handle with the conventional handle by providing a notched area in the handle for use as a thumb rest, on the same plane as the ends of the bristles, when brushing the lower incisors. Would the remaining part of the handle feel comfortable when the notch was engaged?
Figure 4. Modified tilted head/handle alignment
Results

Carving a notch in the handle immediately reduced the need for upper arm elevation when brushing the lower incisors. However, patients were confused by the notch and were unsure when to place their thumb in it. The notch was modified by the addition of resin which spiralled to a position which could be engaged by the adducted thumb yet provided a rest on a plane parallel to the tips of the bristles. The resulting wrist
positions are shown (Figure 4). Unexpectedly, for an asymmetrical handle, the modified handle fits left-handed use equally well and provides the same degree of neutral wrist positions throughout the range of positions under consideration. Instead of observing left-handers using the right-configured modified toothbrush, a left-configured mirror-image toothbrush was supplied to right-handers in order to make comparison easier.

Discussion

It comes as no surprise that a simple spiral combined with an offset should appear to address long-standing postural issues for a simple utensil, considering that human and animal musculoskeletal form and function consists of muscle fibres spiralling across long bones which are spiral in shape, with offsets at the ends for better leverage and more powerful rotation when muscles contract. This arrangement is also used to create alternative postures. This improved posture and more efficient rotation in the toothbrush has given rise to the term enhanced rotation and tilt, or rotilt. It may be that the development of a more ergonomic toothbrush handle, providing comfort throughout a wide range of possible brushing positions, has wider implications for many consumer and industrial hand-held implements and utensils.

References

Beiswanger, B. 1995, Design parameters for an effective, safe toothbrush. FDI Symposium
Bhatnager et al, 1985, Posture, postural discomfort and performance. Human Factors, 27, 189–199
Boussena et al, 1982, The relation between discomfort and postural loading at the joints. Ergonomics, 25, 315–322
Emanuel et al, 1980, In search of a better handle. Proceedings of the Symposium: Human Factors and Industrial Design in Consumer Products, Medford, MA, Tufts University
Fell, J.G. 1934, US Pat No. 2,056,447
Schoenmarklin, R. and Marras, W. 1989a, Effect of handle angle and work orientation on hammering: I. Wrist motion and hammering performance. Human Factors, 31(4), 397–411
Schoenmarklin, R. and Marras, W. 1989b, Effect of handle angle and work orientation on hammering: II. Muscle fatigue and subjective ratings of body discomfort. Human Factors, 31(4), 413–420
RISK ASSESSMENT OF MANUAL TIPPING OF LETTER TRAYS
Corinne Parsons & Anne Truelove
Post Office Consulting, Royal Mail Technology Centre, Wheatstone Road, Dorcan, Swindon SN3 4RD, UK
Manual tipping of letter trays is a common task within Royal Mail and is associated with extreme wrist postures. Analysis of accident and health data, postural analysis using freeze frame video and mathematical modelling of forces were used to assess the risk of injury from this task. The task was discussed with staff, occupational health personnel and managers. The study identified several techniques for tipping trays with marked differences in the forces acting on the wrist. Recommendations were made for the best method of tipping trays and factors to consider when designing associated equipment.

Introduction

The use of trays to carry letters in Royal Mail has increased over the past 5 years due to a containerisation programme. Mail used to be carried exclusively in mail sacks, but plastic trays, weighing up to 10 kg when loaded, have been introduced to improve the efficiency of mail processing and distribution and to reduce damage to mail in transit. Currently the majority of trayed mail is generated within mail processing centres, but an increasing proportion of mail from business customers is collected already in trays. Trays have major benefits for staff because they are easier to lift and carry than sacks and because the orientation of the mail is restored during transit, which reduces the amount of handling needed on arrival. Despite the benefits of the introduction of trays, concern was raised by the reporting of two suspected scaphoid wrist fractures that had occurred whilst letters were being tipped out of trays. As a result of the accidents an investigation was carried out to quantify the problem nationally and to make recommendations to reduce the risks. Scaphoid fractures most commonly occur in men of working age. The fracture is usually caused by a fall onto an outstretched hand, resulting in a force extending and radially deviating the wrist. Fractures are suspected if there is tenderness at the base of the thumb and pain when the hand is extended.
Fractures are frequently not visible on X-rays immediately after the injury, so if there is any doubt the injury is treated as a fracture and the X-rays are repeated after 10 days (Pearson and Austin, 1979). The main risk factors known to be associated with the development of upper limb disorders are a combination of high force, high repetition rates and extremes of joint posture. A number of activities and jobs have been linked to specific disorders. Those relating to the hand and wrist include packing and production line work, meat cutting and playing musical instruments. These tasks are characterised by repeated forceful wrist movements (Putz-Anderson, 1988).
Awkward postures pose significant stresses on the joints and surrounding tissues and research has established that posture is a significant factor in the development of upper limb disorders. Specific postures are associated with particular conditions. De Quervain’s tenosynovitis is the most common condition that involves the tendons crossing the wrist. It is associated with activities that require forceful ulnar and radial deviation of the wrist (Moore, 1991). Pain occurs at the base of the thumb. Force is also known to be a critical factor for the development of upper limb disorders. High forces increase the workload of muscles, resulting in fatigue, and if recovery times are insufficient a build up of waste products will occur. High force coupled with extreme posture is likely to increase friction on tendons and tendon sheaths (Marklin and Monroe, 1998).

Accident, injury and office survey data

Royal Mail has a national accident database and also an upper limb disorder database. Both were interrogated to identify any accidents or problems that had resulted in injuries to the arm, hand or wrist linked to tipping letter trays over the previous 5 years. Only 7 accidents due to tray tipping had been reported over the 5 year period; these included the two suspected scaphoid fractures and four sprains. They had occurred at six different offices. In addition, one confirmed case of tenosynovitis had been reported. Wrist discomfort that resolved when staff were moved from tray tipping duties was reported in one area. A survey of tray handling at 23 offices was carried out. All offices handled trays transferred from other offices, but the proportion of mail received directly from customers in trays and the proportion of mail carried in trays between processes in the mail centre varied. Trays were emptied at a rate of between 4 and 40 per hour. Wrist discomfort was more often reported in offices where the usage of trays was higher.
Posture and method analysis

Employees at four offices were filmed whilst tipping trays onto machines or sorting frames. Two distinct methods of tray tipping were observed:
• the majority of employees tipped trays towards them, as shown in Figure 1;
• less commonly, trays were tipped away from the body, as shown in Figure 2.
Both methods required the trays to be tipped quickly to avoid spilling the letters.

Method 1: Tray tipped towards the body. This was the most common method observed. Video analysis showed that the tipping action started by swinging the tray upwards and away from the body to build up speed before pivoting the tray around the fingers in the handle, and flipping the tray upside down onto the machine feed or sorting frame, as shown in Figure 1. This method of tipping resulted in extreme ulnar deviation at the beginning of the cycle as the movement was initiated (see Figure 1A) and sometimes extreme radial deviation at the end of the cycle as the tray was lowered onto the conveyor.
Figure 1. Method 1: Tipping the tray towards the body
Figure 2. Method 2: Tray tipped away from the body
Method 2: Tray tipped away from the body. This method was less commonly observed (see Figure 2) and there were several subtle variations. Generally the tray was pivoted on its edge on the work surface and tipped away from the employee. Variations in the hand position were observed, sometimes holding the handles and sometimes using a friction grip against the sides. The tipping was performed either as one smooth action or in stages, changing the grip on the tray during the tipping action. The wrist postures observed were more variable than when tipping the tray by Method 1 and tended to be less extreme.

Force Analysis

The forces in the arm during tray tipping were modelled using Pro/MECHANICA MOTION software. This is a design analysis product that focuses on the kinematic and dynamic aspects of mechanism designs, helping to analyse the loads and drivers within them. Both methods of tray tipping described above were modelled. The motion was simulated according to that observed on the video footage, assuming a 7.5 kg tray and a 0.5 second cycle time. The model was based on a 95th percentile male. The results are described below and shown graphically. The forces are expressed as axial and transverse. The axial force is always aligned parallel to the forearm, and the transverse force is perpendicular to this.

Method 1: Tipping the tray towards the body

The maximum transverse force of 200 N occurs at the beginning of the cycle when the wrist is in its most extreme position of ulnar deviation. The maximum tensile axial force rises to its peak early in the cycle due to the acceleration forces as the tray attempts to move away from the body. Once the tray is inverted the
Figure 3. Graph showing the forces in the arm during tray tipping.
base of the tray does not support the weight of the mail and consequently the forces are much lower. This is the part of the cycle where radial deviation is observed.

Method 2: Tipping the tray away from the body supported on the worksurface

Pivoting the tray with its edge supported on the worksurface empties the tray. The forces for pivoting and pushing the tray are very low, typically less than 5 N in axial compression and rising to 5 N in the transverse direction.

Discussion

Analysis of accident statistics, surveys of mail centres and Employee Health Service data showed that injuries related to tray tipping were not widespread. Only 7 wrist injuries had been reported nationally over the past 5 years that were specifically related to tray tipping. Two of the injuries were suspected but unconfirmed fractures of the scaphoid bone. As this fracture cannot always be identified on X-rays in the first 10 days following the injury, the accidents would have been reported before the injury had been confirmed. The area of tenderness would have been similar to that in De Quervain’s tenosynovitis. One confirmed case of tenosynovitis had been reported. Analysis of the methods of tray tipping has shown that there are a number of different ways of tipping a tray. The most common method is to tip the tray towards the body. This results in forces of around 200 N when the wrist is in extreme ulnar deviation. High forces repeatedly applied in this posture have been associated with De Quervain’s tenosynovitis. Force and posture analysis did not show any evidence of high forces in a direction or posture that would cause a fracture of the scaphoid bone. (This fracture is usually caused by a fall on an outstretched hand resulting in forceful extension and radial deviation of the hand with a force of around 750 N.)
A second method of tray tipping, where the tray is tipped away from the body whilst its weight is supported on the worksurface, resulted in a significant reduction in the forces and more variation in the postures observed. This would significantly reduce the risk of tenosynovitis.
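The axial and transverse components defined in the force analysis can be illustrated with a short calculation. The following is a minimal sketch, not the Pro/MECHANICA MOTION model used in the study: it resolves a planar force into components parallel and perpendicular to the forearm. The function name, the angle convention and the static load case are assumptions made for illustration; only the 7.5 kg tray mass comes from the text.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def decompose_force(force_xy, forearm_angle_deg):
    """Resolve a planar force (N) into axial and transverse components.

    forearm_angle_deg is the direction of the forearm measured
    anticlockwise from the horizontal x-axis (an assumed convention).
    """
    theta = math.radians(forearm_angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector along the forearm
    fx, fy = force_xy
    axial = fx * ux + fy * uy        # component parallel to the forearm
    transverse = -fx * uy + fy * ux  # component perpendicular to the forearm
    return axial, transverse


# Static weight of the 7.5 kg tray, acting vertically downwards.
tray_weight = 7.5 * G
# Hypothetical posture: forearm held horizontal.
axial, transverse = decompose_force((0.0, -tray_weight), forearm_angle_deg=0.0)
```

With the forearm horizontal, the static weight of the tray (about 74 N) acts entirely in the transverse direction. The modelled peak of 200 N reported for Method 1 is a dynamic value and is not reproduced by this static sketch.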
Conclusions

Mathematical modelling of the forces acting on the arm during tray tipping, coupled with postural analysis, was successful in identifying the risk of injury from the tray tipping task. The results show that scaphoid wrist fractures will not occur as a result of tipping trays, but that there is a risk of the development of tenosynovitis, which can be greatly reduced by supporting the tray on the worksurface as it is tipped. This has implications for staff training and for the design of sorting machines, to ensure that there is space to support the trays on the worksurface while they are being tipped.

References

Ayoub, M.M. and Mital, A. 1989, Manual Materials Handling, 62–65. Taylor and Francis
Putz-Anderson, V. 1988, Cumulative Trauma Disorders. Taylor and Francis
Pearson, J.R. and Austin, R.T. 1979, Accident Surgery and Orthopaedics for Students. Lloyd-Luke
Hagberg, M. 1995, Work Related Upper Limb Disorders. Taylor and Francis
Moore, A. 1991, Quantifying exposure in occupational manual tasks with CTD potential. Ergonomics, 34(12), 1433–1453
Marklin, R.W. and Monroe, J.E. 1998, Quantitative biomechanical analysis of wrist motion in bone trimming jobs in the meat packing industry. Ergonomics, 41(2), 227–237
THE EVALUATION OF GLOVED AND UNGLOVED HANDS
George E.Torrens & Anne Newman
Hand Performance Research Group, Department of Design and Technology, Loughborough University, Loughborough, Leicestershire, LE11 3TU
This paper reports a pilot study carried out to compare the performance characteristics of subjects with gloved and ungloved hands. The aim of the study was to assess any reduction in functional performance due to wearing gloves and whether there was any relationship between the measurements taken. Twelve 18 to 22 year old male subjects had their anthropometric dimensions, pinch and power grip strength, finger friction and finger compliance measured and subsequently performed a pegboard dexterity test. The Nuclear Biological and Chemical (NBC) glove and a British Army glove (CS95) were found to increase the pegboard test times by an average of 87% and 53% respectively. This related to a reduced grip strength of 13% when using the NBC glove and 28% using the CS95 glove from that achieved with bare hands. There were some correlations between anthropometric and performance characteristics.

Introduction

Through the course of investigating the characteristics of an individual’s hand and their subsequent performance of specified tasks, the authors have developed a model of hand and object interaction (Torrens and Gyi, 1999). This paper focuses upon the comparison of physical characteristics, such as anthropometrics and grip and pinch strength, and the performance of individuals when wearing two different types of gloves and when ungloved. The Personal Protective Equipment (PPE) handwear, or gloves, used in this pilot study were leather combat gloves (CS95) and Nuclear, Biological and Chemical (NBC) protection gloves, with their liners, as issued to the British Army. Correlation analysis was undertaken on the data to see if there were any relationships between the measured hand characteristics and subsequent performance. Because of the small sample, no further statistical analysis was undertaken. This paper is part of a long-term programme of work, sponsored by the Defence Clothing and Textiles Agency, United Kingdom.
The long-term aim is to provide evidence for the enhancement of tactility and dexterity within military PPE handwear. The choice of young males reflected those who might enter the British Army and use the type of gloves being tested within the Infantry Regiments. The battery of measurements and tests related to different aspects of the physical characteristics or performance of the hand and the individual being measured. The anthropometric measurements define the size of object that may be handled by an individual. Grip strength and pinch strength indicate the ability of the subject to apply force, i.e. pressure, to the surface of an object. Finger friction and finger compliance provide an indication of the subject’s ability to convert the applied force into a static contact with the surface material and features. Range of Motion was not measured in this study due to time constraints.
Method

Twelve male subjects, aged 18 to 22 years old, took part in the trials, which were undertaken over a four-hour period during one day in autumn 1999. Each subject was measured over a period of twenty minutes. Due to time constraints, no measurements were repeated and the same sequence of measurements was followed for each subject. The rest period between finger friction and finger compliance measurements, used in previous trials, was reduced from five minutes to two minutes. There were two researchers working together, one of whom took all the anthropometric measurements for consistency. No subject had any hand injuries, impairments or deformities. The measurements to be taken were discussed with each subject and their written permission was given before the trial commenced. Each subject was asked to wash their hands prior to being measured, to reduce the level of contaminants on the hand. The measurements and tests were conducted in two parts: (1) anthropometric measurements and (2) performance measurements. The measurements taken whilst subjects had bare hands were repeated with the subjects wearing leather military combat gloves, reference CS95, and then NBC gloves. The CS95 gloves are fabricated from cow leather, with a synthetic fibre liner. The NBC gloves consisted mainly of a dip moulded neoprene material, with a cotton synthetic knitted liner. Both NBC and CS95 gloves are current British Army issue. There was a range of glove sizes from which each subject could choose. Their choice was based upon best fit, where the glove was close fitting but did not feel too tight or restrict finger movement. Each subject’s shoe size was noted. It is a common practice for British Army quartermasters to use shoe size as an indicator for glove fit. Each subject filled in a questionnaire about lifestyle and background that took between ten and twenty minutes to complete.
The questionnaire allowed each subject to reach a state of rest before commencing the second part of the trial. Each subject’s anthropometric measurements were taken using a stadiometer, anthropometer and digital callipers. The right hand and second digit only were measured; the subject’s dominant hand was noted. The method of taking the measurements followed BS 7231 (1990) except for the fingertip measurement. A set of engineering grade digital callipers was used to measure the fingertip length from the tip of the finger to the first crease of the distal phalanx, second digit, with the hand supinated and finger slightly extended. The fingertip width was taken over the distal interphalangeal (DIP) joint with the hand pronated. Fingertip depth was measured, with the hand mid-supinated, proximally from the base of the nail bed to the soft tissue of the proximal ungual pulp on the opposite side, in front of the first crease of the DIP joint. Care had to be taken to obtain a repeatable ‘feel’ to the pressure applied over the soft tissue. Measurements were recorded manually. Following the anthropometric measurements, finger compliance was the first measurement to be taken. The measurement of compliance or stiffness of the finger provides an indication of how the soft tissue on the glabrous surface of the palmar fascia of a hand will interact with a surface feature, i.e. ridges, grooves or dimples on the object surface. The test comprises two measurements: the vertical displacement of the finger volume and the area of the fingerprint on a flat surface. The subject held on to a handle that orientated the hand to present the distal phalanx of the second digit (index finger), pronated, between two platens, with the fingertip located over the lower platen distal to the first distal interphalangeal (DIP) joint. The finger was then squashed between them over a period of ten seconds.
The vertical displacement of the fingertip will be the only measurement presented here. The finger compliance test was followed by measurement of grip and pinch strength, following the method used in an earlier trial (Torrens and Gyi, 1999). A pegboard test was used to indicate the level of dexterity of each subject. The pegboard used was 200 mm wide by 170 mm long and had a holding tray attached at the back of the board that was 200 mm long, 60 mm wide and 50 mm deep. The peg holes were set at intervals of 25 mm, to accommodate six pegs in each row with five rows available.
Only the two front rows, 12 pegs in total, were used. The pegs were 40 mm long and 10 mm in diameter. Each subject was asked to pick up the pegs one at a time and place them into the holes in the board, forming two completed parallel rows. They were asked to complete the task as quickly as possible and the time taken to do this was measured using a digital stopwatch. The equipment and methods used to test finger friction were a modification of an earlier protocol and equipment (Torrens, 1997). The change made was the introduction of a known downward force. This was achieved by the application of a sliding carriage with a total weight of 1 kg (9.81 N). The reason for doing this was that earlier trials had shown that when subjects were asked to self-regulate their application of a downward force they found it difficult to maintain a constant pressure. This was thought to be due to the shock absorbing qualities of the ungual pulp, which created a constant correction of pressure by the subject. This over-correction by the subject could be compared with driving a car with very soft suspension. The pull back component of the test, which generates friction, was also automated to improve the usability of the equipment and repeatability of the test. This was done using a powered screw thread that pulled the forearm, hand and finger in a direction proximal to the body, along the axis of the forearm, and away from the fixed carriage onto which the finger was applying the downward pressure. The sample surface was a 25 mm square acrylic block that had a 5 mm castellated section at 10 mm pitch. This had previously been identified as the optimum interface for friction using a bare finger (Torrens, 1997).

Results and discussion

The stature of the 12 subjects, shown in Table 1, ranged from 1700 mm to 1878 mm; these measurements correspond to the 24th and 97th percentiles for adult British males (calculated using PEOPLESIZE, Open Ergonomics, 1999).
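With a known downward force, a coefficient of friction follows directly from the measured pull back force. The following is a minimal sketch, assuming the coefficient is taken as the ratio of the pull force to the normal force supplied by the 1 kg carriage; the function name and the pull force value are hypothetical and are not taken from the trial data.

```python
G = 9.81  # gravitational acceleration, m/s^2


def friction_coefficient(pull_force_n, carriage_mass_kg=1.0):
    """Return pull force divided by the normal force from the carriage."""
    if carriage_mass_kg <= 0:
        raise ValueError("carriage mass must be positive")
    normal_force_n = carriage_mass_kg * G  # 9.81 N for the 1 kg carriage
    return pull_force_n / normal_force_n


# Hypothetical reading: a pull force of 4.905 N under the 1 kg carriage.
mu = friction_coefficient(4.905)
```

Fixing the normal force in this way means that differences in the pull force translate directly into differences in the coefficient, which is the reason given above for replacing self-regulated pressure with a carriage.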
The mean power grip strength and pinch strength of the subjects when using the CS95 gloves, shown in Table 2, were reduced by 28% and 13% respectively compared with bare hands. The reductions in power grip and pinch strength when using the NBC gloves with liner, compared with the results using bare hands, were 19% and 13% respectively. The pegboard times were increased by 87% and 53% respectively when using a CS95 and NBC glove, when compared with the performance obtained from bare hands. The mean coefficient of finger friction values shown in Table 3 indicate that the CS95 glove has a better friction value (0.7) than bare hands or NBC gloves (both 0.5). One reason for this can be found in the finger compliance rating of the CS95 glove. The mean vertical displacement recorded from the finger in a CS95 glove (5.89 mm) is nearly twice that of the mean displacement of a bare finger (3.13 mm), with the mean displacement of the NBC glove between these values (5.13 mm), providing a better mechanical interlock between finger, glove and object. The results indicate that the CS95 glove does provide a good frictional interface, but reduces power grip and pinch strength when compared with the measurements taken from a bare hand. It also reduces dexterity task performance compared with that achieved with a bare hand. The performance of the NBC glove fell between that of the CS95 glove and the bare hand. As a crude indicator, the difference in physical characteristics and performances of bare hands compared with the two gloves may be attributed to glove thickness. The thicker the glove, the less movement at each finger joint, increasing resistance to taking up grip patterns and applying pressure to an object’s surface.
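The percentage changes quoted above are all expressed relative to the bare-hand condition. A minimal sketch of that calculation follows; the function name and the example values are hypothetical and do not reproduce the data in Table 2.

```python
def percent_change(baseline, value):
    """Percentage change of value relative to baseline (negative = reduction)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (value - baseline) / baseline


# Hypothetical means: a strength measure falling from 50.0 to 40.0 units,
# and a task time rising from 20.0 s to 30.0 s, when a glove is worn.
strength_change = percent_change(50.0, 40.0)
time_change = percent_change(20.0, 30.0)
```

A negative result corresponds to a reduction in strength and a positive result to an increase in task time, matching the way the gloved results are reported against bare hands.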
A correlation analysis provided a grouping of correlations that indicate a relationship between stature and upper limb length and a relationship between power grip and pinch grip strengths between the NBC and CS95 gloves. The relationships highlighted in this pilot study warrant further investigation.
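The kind of correlation analysis referred to here can be sketched as follows. The source does not name the coefficient used, so the Pearson product-moment coefficient is an assumption, and the function name and the paired values are hypothetical rather than taken from Table 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements (mm) for four subjects.
stature = [1700.0, 1750.0, 1800.0, 1850.0]
upper_limb_length = [740.0, 765.0, 780.0, 810.0]
r = pearson_r(stature, upper_limb_length)
```

With only twelve subjects, coefficients of this kind are best read as pointers for further investigation, in line with the authors' decision not to undertake further statistical analysis.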
Table 1. Anthropometric measurements (in mm) of 12 male subjects 18–22 years old.
Table 2. Power grip, pinch grip and pegboard dexterity test time (in seconds) of 12 male subjects 18–22 years old when using bare hands, wearing CS95 gloves and wearing NBC gloves with liners.
Table 3. Finger coefficient of friction, finger temperature and finger compliance of 12 male subjects 18–22 years old when measured with bare hands, wearing CS95 gloves and NBC gloves with liner.
References

British Standards Institute 1990, BS 7231, Part 1, Body measurements of boys and girls from birth up to 16.9 years.
Open Ergonomics Limited 1999, PEOPLESIZE software, Loughborough.
Torrens GE, Gyi D 1999, Towards the integrated measurement of hand and object interaction. 7th International Conference on Product Safety Research, European Consumer Safety Association, U.S. Consumer Product Safety Commission, Washington D.C., p 217–226, ISBN 90–6788–251–8
Torrens GE 1997, What is the optimum surface feature? A comparison of five surface features when measuring the digit coefficient of friction of ten subjects, Contemporary Ergonomics, (Ed) Robertson SA, Taylor & Francis, London, Annual conference of the Ergonomics Society, University of Lincoln
Musculoskeletal disorders
EVALUATING THE USE OF SINGLE DISC FLOOR CLEANERS
Sophie Hide, Wendy Morris, Christine M.Haslegrave, Olanrewaju O.Okunribido & Sarah C.Nichols
Institute for Occupational Ergonomics, University of Nottingham, Nottingham NG7 2RD, UK
Cleaning work can be physically demanding and a need has been identified to develop methods for systematic ergonomics evaluation of new products. Upon a manufacturer’s initiative an evaluation programme was devised to appraise a modified design of single disc floor cleaner and compare it with a range of current designs. Subjective, observational and empirical data were obtained. Analysis of results provided feedback to the manufacturer on issues relating to usability, musculo-skeletal loading during use, and design feature preferences. It is suggested that the methodology might also be used to identify cleaner training requirements and for iterative development of future product re-design initiatives.

Introduction

Results of national and international surveys indicate that cleaners (typically older females) have a high incidence of reporting musculo-skeletal disorders (Woods et al, 1999). Cleaners’ tasks involve a combination of mopping, vacuuming and the use of industrial cleaners such as buffing machines. Ergonomics evaluations of such buffing machines have indicated that there are genuine concerns about user interactions with the design and handling of cleaning machines (Woods et al, 1999; Haslam and Williams, 1999). Multidisciplinary European research has previously called for the selection of cleaning machines with the technical properties to allow long-term use without unnecessary strain on the users (Kruger et al, 1997). Although Kruger et al recognised that technical advances are being made, they also identified a need for methods for systematic ergonomics evaluation of new products on the market. In order to address these issues in the present study, ergonomics evaluation of a range of new and original designs of single disc industrial floor cleaners (SDFCs) was undertaken by user trials with habitual users of SDFCs.
This provided both subjective data from users (to evaluate preferences for a variety of SDFC design features), and observational data of the different techniques used and problems experienced by subjects.
Methods

Development of test programme

Expert appraisal was undertaken, using anthropometric analysis and task analysis to highlight the range of user interactions with the client products, and this was used as baseline data from which to develop the criteria for the user trials. A preliminary biomechanical analysis was made to investigate the musculo-skeletal loading which might be experienced by cleaners while operating the machines during certain tasks.

Equipment and subjects

Eight single disc floor cleaners were used; four were SDFCs for wet scrubbing and four were SDFCs for dry buffing tasks. Each group of four comprised the modified design SDFC and three existing products for either wet scrubbing or dry buffing use. Five female subjects (age range 49–59 years) were recruited from the University cleaning staff to undertake the trials. Three subjects undertook both trials and the remaining two subjects undertook either the wet or dry mode study. All subjects were experienced in the use of SDFCs.

Procedure in user trials

Each subject was asked to follow the sequence of setting up the machine, five minutes’ use and then storage, for each of the four machines in turn. These procedures were divided into a series of sub-tasks and the subject was asked to describe the action they were intending to make in anticipation of effecting each aspect of the procedure. This was (1) to ensure that any potentially unsafe acts could be anticipated, and (2) to identify where problems or misunderstandings were occurring.

Data collection

A number of evaluative methods were used and the following are reported in the present paper. A ‘user evaluation’ questionnaire was developed to record each subject’s perceptions of the product after its use. The questions related to the cleaner design features and to the operation of the machine in either dry buffing or wet scrubbing mode.
A visual analogue scale was provided for the responses, with text descriptors of the condition extremes at the end points of each scale, for example ‘extremely easy/comfortable’ and ‘extremely uncomfortable/difficult’. Where appropriate, a mid-point descriptor ‘correct’ was also included. Video recordings were made to obtain data for the postural and biomechanical analysis, and to record the techniques used by subjects.

Results

Summary data from the rating scales showed that, overall, there tended to be greater consistency and agreement of responses among the subjects for ‘the design of the cleaner’ than for ‘using the cleaner’. ‘The design of the cleaner’ covered fixed features of the products, such as handle height and dimensions, reaches to levers or triggers and the operational forces on them. For ‘using the cleaner’, the questions related to a range of operations, such as applying the cleaning pad and disc to the machine, selecting an operational handle height, controlling the machine and managing the flex.
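As an illustration of how visual analogue scale responses of this kind might be summarised, the following Python sketch converts mark positions into scores and compares the spread of responses for two questions. It is a minimal sketch under stated assumptions, not the authors' analysis: the 100 mm scale length, the function names and all of the mark positions are hypothetical, and the paper does not state which summary statistics were used.

```python
from statistics import mean, pstdev

SCALE_LENGTH_MM = 100.0  # assumed length of the printed line; not stated in the paper


def vas_score(mark_mm: float, scale_length_mm: float = SCALE_LENGTH_MM) -> float:
    """Convert a mark position (mm from the left end point) to a score from 0 to 1."""
    if not 0.0 <= mark_mm <= scale_length_mm:
        raise ValueError("mark lies outside the scale")
    return mark_mm / scale_length_mm


def summarise(marks_mm: list[float]) -> dict[str, float]:
    """Mean score and spread across subjects for one question on one machine."""
    scores = [vas_score(m) for m in marks_mm]
    return {"mean": mean(scores), "spread": pstdev(scores)}


# Hypothetical marks (mm) from four subjects; these are not data from the trials.
design_question = [48.0, 52.0, 50.0, 46.0]  # tightly grouped around the mid-point
use_question = [15.0, 80.0, 45.0, 62.0]     # widely scattered responses

design = summarise(design_question)
use = summarise(use_question)
print("design of the cleaner:", design)
print("using the cleaner:", use)
```

A smaller spread corresponds to the greater consistency of responses described for the design questions; a mean close to 0.5 would sit near a ‘correct’ mid-point descriptor where one was provided.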
Subjective feedback: dry mode (buffing) machines

Subjects’ responses were most consistent for ‘reach from the handle to the height adjustment lever’ and ‘force to hold down the trigger’, finding them acceptable features on all machines. Responses relating to the ‘force to pull the height adjustment lever’ and the ‘bulk of the handle for gripping’ were fairly consistent for three of the machines, but subjects showed some dissatisfaction with one or other of these features on two of the products, indicating too great a handle width or too much force to pull the height adjustment lever. Subjects reported a good range of handle height adjustment positions yet, for three of the products, found the handle height too high when vertical.

There was much less grouping of responses for ‘using the cleaner’, and subjects all appeared to have had different experiences. Most comments on ‘ease of use’ were positive, with the most positive responses (for all the machines) for ‘selecting the correct handle height before start-up’, ‘control at start-up’ and ‘flex management during use’. The mean ratings indicated that, in the few instances where more than one machine received a negative response, subjects had difficulty with ‘manoeuvring the cleaner with the handle up’ and ‘control during use’.

Subjective feedback: wet mode (scrubbing) machines

As with the dry buffing mode, there was greater consistency of responses for ‘the design of the cleaner’ than for ‘using the cleaner’. Overall, however, subjects’ ratings were less consistent than with the ‘dry mode’ cleaners, although some general trends can still be seen. The greatest consistency of responses and closeness to the scale mid-point among these results was in rating the bulk of the handle for gripping.
Ratings of the ‘range of handle height adjustment positions’ were generally good, although on this occasion there were negative reports about reach to the height adjustment lever and about the forces to control this and the power trigger. Responses for the additional wet mode design features tended to be positive for most of the questions.

There was again much less grouping of responses for ‘using the cleaner’, and subjects all appeared to have had different experiences. However, comments on ‘ease of use’ were mostly positive and, among the machine operation aspects, disc handling, tilting the cleaner to the floor, disc application, manoeuvring the cleaner with the handle up and management of the flex during use received consistently positive reports. The few instances where more than one machine was negatively rated concerned ‘fitting the pad to the disc’ and ‘control during use’.

Observational data

Observations of subject trials showed that some of the subjects experienced problems but did not always report them in the rating scales. Additional problems (not necessarily rated by subjects at the end of the trial) were also described spontaneously as subjects were undertaking the trials. These were for the most part observable, but features such as force application or grasp perception can only be addressed through subjective feedback.

On many occasions, the excessive handle height appeared to impose additional effort on subjects when they tried to upend the machines (necessary when wheeling the equipment to and from the area of operation, and in order to apply the pad and disc). Subjects also spontaneously described some problems with the forces required to operate levers and triggers, or awkwardness in controlling the machines. Attachment of the pad to the disc was on many occasions problematic (due to poor adherence), and securing the disc to the machine was also sometimes awkward.
General observations made by the researchers indicated that, despite the skills which had been expected of the subjects, there were two fundamental shortcomings in the techniques they used. Subjects required formal instruction in methods of controlling the movement of the machines, despite the fact that SDFCs all operate on the same principle of raising and lowering the handle to effect left or right movement. In addition, subjects described various methods (of twisting the handle to the left or right) which they used to control their own work machines. Although these techniques were not observed, they could have influenced the subjects’ choice of handle height and may explain why they preferred it to be above the suggested guide height for optimum biomechanical advantage.

Video analysis

Postures seen on the video records of the SDFC trials were assessed to provide an indication of musculoskeletal loading during various tasks. Most subjects operated the machines in a similar fashion, taking a central position and swinging the machine to either side, in an arc, around them. There was only a small amount of foot movement with this style, and much side-to-side machine motion was effected through upper body rotation. The preferred height at which the machine handle was held generally appeared to be just below waist level, and hence the posture was supported with some static loading across the shoulders and upper arms. The machine handles were held with a palmar grasp and with the forearms pronated. Analysis of the wrist postures indicated that the wrist was frequently bent away from the neutral posture, most acutely as the machine was swung to the extremes of the arc.
The greatest force exertions appeared to be on raising the handle upwards to effect a movement to the right (since the direction of disc rotation naturally draws the machine to the left). Consequently, most of the load bearing activity appeared to be in counterbalancing the pull to the left and in moving the machine to the right.

Biomechanical analysis

The observed actions, swinging the machine in an arc from left to right and back, were simulated by a subject standing on a force plate. The foot forces measured were equivalent to the forces exerted by the hands on the handle of the machine. The posture and force were input into a three-dimensional biomechanical model in order to estimate the load on the spine and the torques at the shoulder and elbow joints. A larger study would be necessary to analyse these for the range of tasks performed and the variation in individual anthropometry and technique, but the simple simulation indicated that shoulder and elbow loadings were significant, as were extension and lateral flexion moments at the lower back. Compression and shear forces on the spine were not, however, high.

Discussion

Even from this small group of subjects, a disparity between subjective feedback and observational data is apparent. Using the rating scales, subjects gave quite positive feedback about product design features and usability. However, at times these ratings appeared to contradict the comments made during the trials or the difficulties observed by the researchers. Whether this is related to questionnaire design, a reluctance to complain or an inherent degree of tolerance of poor product design among our subject group requires deeper investigation. One possible explanation is that the machines used in the trials were found to be better than those used by the cleaners in their regular work.
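The idea behind the joint torque estimates described under Biomechanical analysis can be illustrated with a single static moment calculation, M = r × F, where r runs from the joint centre to the hand and F is the force acting on the hand. The Python sketch below is a deliberately simplified illustration and not the three-dimensional biomechanical model used in the study: it ignores segment weights and dynamics, the axis convention (x forward, y to the subject's left, z up) is an assumption, and the joint position, hand position and force are hypothetical values rather than measurements from the trials.

```python
import math


def cross(r, f):
    """Cross product r x f of two 3-D vectors given as (x, y, z) tuples."""
    return (
        r[1] * f[2] - r[2] * f[1],
        r[2] * f[0] - r[0] * f[2],
        r[0] * f[1] - r[1] * f[0],
    )


def joint_moment(joint_pos_m, hand_pos_m, hand_force_n):
    """Static moment (N m) about a joint due to a force acting on the hand."""
    # Lever arm from the joint centre to the point where the force is applied.
    r = tuple(h - j for h, j in zip(hand_pos_m, joint_pos_m))
    return cross(r, hand_force_n)


# Hypothetical posture: handle held below and in front of the shoulder.
shoulder = (0.0, 0.0, 1.5)       # metres
hand = (0.5, 0.0, 1.0)           # metres
pull_to_left = (0.0, 40.0, 0.0)  # newtons; the machine drawing the hand to the left

moment = joint_moment(shoulder, hand, pull_to_left)
magnitude = math.sqrt(sum(c * c for c in moment))
print("moment components (N m):", moment)
print("moment magnitude (N m):", round(magnitude, 1))
```

Even this simple calculation shows why a lateral pull at the handle loads the shoulder: the moment grows with both the force and the distance of the hand from the joint.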
Postural and biomechanical analysis indicates that the use of SDFCs involves various awkward postures and manipulations during set-up and operation of the machines, on occasions with concurrent force application. The method of laying down the cleaner to apply the pad and disc manually (a technique adopted only in the UK) appeared especially awkward and cumbersome for subjects. Elsewhere, cleaners lay the pad and disc on the floor and coupling is achieved by holding the upright machine over the point of attachment during start-up. No comparative data were found with which to appraise critically the merits of either technique.

The trials unexpectedly showed that even experienced cleaners cannot be assumed to have a full understanding of SDFC operational technique. This may also have implications for posture and performance, and strengthens the argument for appropriate supervision and training of staff.

Conclusions

The evaluation has shown how the use of multiple assessment techniques can provide a comprehensive appraisal of the design, usability and musculoskeletal loading upon the operator. Trials with a larger number of subjects would certainly strengthen the conclusions, yet even with the small numbers involved it was possible to provide feedback to the manufacturer on the range of new and original features of the SDFCs. Following the conclusion of Kruger et al (1997) that there is a need for methods to evaluate new cleaning equipment, the methodology developed in the current study would seem appropriate for incorporation into the iterative design process used by product developers.

Acknowledgement

The authors would like to acknowledge DiverseyLever for the funding of this work.

References

Haslam, R.A. and Williams, H.J., 1999, Ergonomics considerations in the design and use of single disc floor cleaning machines, Applied Ergonomics, 30, 391–399
Kruger, D., Louhevaara, V., Nielsen, J. and Schneider, T., 1997, Risk Assessment and Preventative Strategies in Cleaning Work, (Wirtschaftsverlag NW, Hamburg)
Woods, V., Buckle, P. and Haisman, M., 1999, Musculoskeletal health of cleaners, (HSE Books, Sudbury)
HEALTH RISKS FROM MICE AND OTHER NON-KEYBOARD INPUT DEVICES
S.Hastings1, V.Woods2, R.A.Haslam1 & P.Buckle2
1Health and Safety Ergonomics Unit, Department of Human Sciences, Loughborough University, Loughborough, Leicestershire LE11 3TU, UK
2Robens Centre for Health Ergonomics, EIHMS, University of Surrey, Guildford, Surrey GU2 5XH, UK
This paper describes research in progress, which is attempting to identify the extent of use of non-keyboard input devices (NKID) and the problems associated with them. Details are given of the first two stages of the study: a survey of organisations, carried out to collect data on the NKID used and the scale of problems experienced, and in-depth interviews with NKID users at their workplace, collecting more detailed individual information. Of the organisations responding to the survey, 22% reported some experience of upper limb discomfort associated with NKID use. In the workplace interviews, pain and discomfort relating specifically to use of the mouse was reported by 36% of the interviewees, with reports of weakness in the wrists and stiffness and discomfort in the hands and wrist after using the device for long periods.

Introduction

There have been suggestions since the mid-eighties that the use of non-keyboard input devices (NKID) may have health implications for users (Abernethy and Hodes, 1987). Examples of NKID include mice, trackballs, touch pads, joysticks and touchscreens. As graphical computer interfaces requiring NKID manipulation have become more widespread, there have been increasing indications of the potential for musculoskeletal discomfort and harm. However, most reports are anecdotal and there is only limited objective evidence as to the true extent of the problem. The research described here is a major HSE-funded study, with the aim of assessing patterns of NKID use and the extent of symptoms among operators. The paper discusses methodological considerations and presents early findings.

Literature Review

Fogleman and Brogmus (1995) reported that mouse-related worker compensation claims were increasing rapidly, with claims most frequently involving the hand, lower arm and upper arm (including clavicle and scapula).
Hagberg (1995) analysed self-reported musculoskeletal symptoms within a study group of 751 computer operators, with intense mouse users reporting higher levels of discomfort in the shoulder-scapular, wrist and hand-finger regions than a comparison group. These findings are consistent with other studies that have found mouse use to be associated with raised levels of muscle activity in the shoulder region, arm abduction, and ulnar deviation of the wrist (Karlqvist et al, 1994; Harvey and Peper, 1997). It seems likely that finger symptoms relate to the frequency and forces involved in device button operation.
With regard to mouse position, there is evidence that placement allowing a near neutral posture of the wrist is preferable (Karlqvist et al, 1996, 1997). However, keyboards incorporating a numeric pad may impede such an arrangement for right-sided mouse users (Cook and Kothiyal, 1996). Use of a mouse mat has been found to affect wrist posture, with higher pad surface height leading to increased flexion and ulnar deviation (Damann and Kroemer, 1995). Wrist/arm supports have been found to be beneficial (Damann and Kroemer, 1995; Paul et al, 1996; Karlqvist et al, 1999).

Comparing devices, it seems that a trackball reduces the loading on shoulder muscles, while increasing the demands on the wrist (Harvey and Peper, 1997; Karlqvist et al, 1999). However, in both these studies, and in Haward (1998), subjects did not seem able to detect this difference subjectively, and considerable individual differences have been observed (Burgess-Limerick et al, 1999).

Both Armstrong et al (1995) and ISO/DIS 9241–9 (1998) provide guidance on NKID design and application. In the case of Armstrong et al, this is based on a theoretical analysis of mouse use. Both provide broad recommendations regarding posture and more specific advice with respect to features such as button operation forces and grip characteristics. Beyond this, there is very little information available on NKID with the content needed to guide DSE users or their managers. Few manufacturers making claims regarding the ‘ergonomic’ design of their equipment provide much by way of ergonomics advice on its use.

In summary, there is fairly consistent evidence indicating a health and safety problem with at least some NKID, although the extent of this is unknown. Almost all research attention has been directed at the mouse, although there have been comparisons with trackballs. Little is known about the health and comfort advantages and disadvantages of devices such as joysticks, touch pads and touchscreens.
The principal advice arising from the literature is the desirability of maintaining a neutral posture of the user’s wrist and arm, although there has been no thorough validation of the practical application of this. The guidance that does exist is not in a form that can be easily implemented by users.

Methodological Issues

There are significant difficulties with undertaking field studies in this area. Health effects from use of NKID interact with those arising from other aspects of DSE work (e.g. keyboard input, constrained posture, possible psychosocial influences). Also, for field investigation to be informative, it is important to have good measures of exposure, i.e. periods and proportion of time spent using NKID. Unfortunately, users find it difficult to estimate variables such as these, and direct measurement is resource intensive. Also, different NKID tend to be used with particular types of DSE (e.g. mouse with desktop computers, touch pad with portable machines), further complicating survey design and analysis.

Studies under controlled laboratory conditions are constrained by the cost and ethical considerations of running trials of extended duration. This limits the scope of laboratory results, as findings from short term exposure may not be a good predictor of musculoskeletal complaints after prolonged use.

To address these problems, a combination of methods is being used. Presented here are details of the research methodology and initial results from the first two stages of the investigation, namely a survey of IT and Health and Safety managers, and user observations and interviews.

Manager survey

A questionnaire survey of IT and Health and Safety Managers was conducted to collect data on the types of NKID currently in use in organisations and to identify the various applications for which these devices are used. The questionnaire survey was also designed to investigate problems with the use of NKID and to collect information on health problems and sickness absence associated with computer and NKID usage. In total, 128 IT/Health and Safety Managers, representing 102 different organisations, responded. Organisations covered a wide range of industrial sectors including manufacturing, education, construction, finance and business. The total number of employees encompassed by the questionnaire was 124,500.

Results

In order to investigate the types of devices used, managers were asked to report the NKID currently used with both desktop and laptop computers in their organisations. The mouse was used with desktop computers at 97% of organisations surveyed; at 73% of these, all desktop users used the mouse. Trackballs were used by 20% of organisations, touch pads at 9%, touchscreens at 8% and joysticks at 2%. The mouse was used with laptop computers at 64% of organisations; at 31% of these, all users used the mouse with their laptop. Touch pads were used in 31% of organisations, 28% used trackballs and 6% reported use of a touchscreen to operate laptops. Other NKID were reported to be in use but to a lesser extent, including barcode wands, scanners, voice recognition and the tablet and puck.

Respondents were asked about the applications NKID were used for, the range of which was extensive for all types of device. The mouse appeared to be in high use at the majority of organisations for word processing (95%) but also for a range of other applications including accounting and computer aided design (CAD). Touchscreens were used for tasks including accessing control operation and telephone call handling. Of the organisations surveyed, 20% said they envisaged using new NKID in the future, such as voice activated software, alternative types of mouse and trackballs.

Information was collected from organisations on problems experienced with the use of various NKID.
A total of 38% of organisations mentioned problems with the mouse, including pain and discomfort in the fingers, hands and wrists after prolonged use (17%), maintenance issues (11%), poor workstation set-up (6%) and issues relating to the size and shape of the mouse (6%). Skin infections were reportedly related to the use of touchscreens, touch pads and the laptop mini-joystick. Other problems associated with the use of the touch pad and mini-joystick included difficulties in controlling the device and lack of precision. Approximately 1 in 5 (22%) of organisations responding to the survey reported some experience of upper limb discomfort associated with NKID use.

User observation and interviews

To gain further insight into how users arrange their workstations and use NKID in connection with different tasks, 25 user observations and interviews have been undertaken with both intensive and non-intensive NKID users within 5 organisations. A further 25 interviews (in 5 more organisations) are planned. The sites for the workplace assessments have been selected to be representative of the wide range of industrial sectors covered in the manager survey. Observational data have also been collected on posture and workstation set-up, with video recordings made at all sites. These will be used to determine whether or not people can accurately estimate the amount of time spent using various NKID; diaries were also completed by subjects to investigate this issue.
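One way the comparison between self-estimated and observed NKID use could be made is sketched below in Python. This is an illustrative sketch only, not the study's analysis plan: the record structure, the choice of a signed relative error as the accuracy measure, and all of the durations are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class ExposureRecord:
    user_id: str
    diary_minutes: float  # NKID use as estimated by the user in the diary
    video_minutes: float  # NKID use timed from the video recording


def estimation_error(record: ExposureRecord) -> float:
    """Signed relative error; positive values mean the user over-estimated."""
    if record.video_minutes <= 0:
        raise ValueError("observed duration must be positive")
    return (record.diary_minutes - record.video_minutes) / record.video_minutes


# Hypothetical records for three users; these are not data from the study.
records = [
    ExposureRecord("U01", diary_minutes=120.0, video_minutes=100.0),
    ExposureRecord("U02", diary_minutes=45.0, video_minutes=60.0),
    ExposureRecord("U03", diary_minutes=200.0, video_minutes=200.0),
]

errors = {r.user_id: estimation_error(r) for r in records}
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

for user_id, error in errors.items():
    print(f"{user_id}: {error:+.0%}")
print(f"mean absolute error: {mean_abs_error:.0%}")
```

Keeping the sign of the error separates systematic over-estimation from under-estimation, while the mean absolute error gives a single figure for how far diary estimates stray from the observed durations.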
Results

Of the interviewees reporting pain and discomfort, 46% had not experienced these problems before using their input device. Pain and discomfort related specifically to use of the mouse was reported by 9 of the interviewees (36%), with reports of weakness in the wrists and stiffness and discomfort in the hands and wrist after using the device for long periods of time.

Configurations of furniture (the desk, chair etc) varied considerably and appeared to constrain the position of the input device in relation to the user. Workplace limitations, such as the height of shelves above the desk, limited space and the size of the screen, affected where the user positioned the visual display unit (VDU), central processing unit (CPU) and keyboard, which in turn affected the placement of the input device. Few users had received any training in the use of input devices; where training had been given, this was usually in the form of advice from colleagues or supervisors.

Conclusion

Results from the manager survey and initial interviews have demonstrated the wide range of NKID in use within organisations and the applications for which they are being used, including programming, graphic design and secretarial work. These initial findings will be supplemented by results from a cross-sectional survey of user symptoms and exposure and a laboratory investigation.

Acknowledgement

We wish to acknowledge the support of the HSE who are funding this research. The views expressed, however, are those of the authors and do not necessarily represent those of the HSE.

References

Abernethy C N and Hodes D G, 1987, Ergonomically determined pointing device (mouse) design, Behaviour and Information Technology, 6, 311–314.
Armstrong T J, Martin B J, Franzblau A, Rempel D M, and Johnson P W, 1995, Mouse input devices and work-related upper limb disorders. In: Work with Display Units 94, edited by Grieco A, Molteni G, Piccoli B and Occhipinti E (Elsevier: Amsterdam), pp 375–380.
Burgess-Limerick R, Shemmell J, Scadden R and Plooy A, 1999, Wrist posture during computer pointing device use, Clinical Biomechanics, 14, 280–286.
Cook C and Kothiyal K, 1996, Does a keyboard numeric pad adversely affect muscular activity and symptoms in the neck, shoulder and arm in computer mouse users? In: Ergonomics: Enhancing Human Performance, Proceedings of the 32nd Annual Ergonomics Society of Australia and the Safety Institute of Australia National Conference.
Damann E A and Kroemer K H E, 1995, Wrist posture during computer mouse usage. In: Designing for the Global Village, Proceedings of the Human Factors and Ergonomics Society 39th Annual Meeting (Human Factors and Ergonomics Society: Santa Monica, California), 625–629.
Fogleman M and Brogmus G, 1995, Computer mouse use and cumulative trauma disorders of the upper extremities, Ergonomics, 38, 2465–2475.
Hagberg M, 1995, The “mouse-arm syndrome”: concurrence of musculoskeletal symptoms and possible pathogenesis among VDU operators. In: Work with Display Units 94, edited by Grieco A, Molteni G, Piccoli B and Occhipinti E (Elsevier: Amsterdam), pp 381–385.
Haward B, 1998, An evaluation of a trackball as an ergonomic intervention. In: Contemporary Ergonomics 1998, edited by Hanson M A (Taylor & Francis: London), pp 135–139.
ISO/DIS 9241–9, 1998, Ergonomic requirements for office work with visual display terminals (VDTs), Part 9: Requirements for non-keyboard input devices. International Organisation for Standardisation.
Karlqvist L, Bernmark E, Ekenvall L, Hagberg M, Isaksson A, and Rostö T, 1997, Position of the computer mouse: a determinant of posture, muscular load and perceived exertion? In: From Experience to Innovation, IEA’97 Proceedings, edited by Seppälä P, Luopajärvi T, Nygård C and Mattila M, volume 4, pp 61–63, Finnish Institute of Occupational Health, Helsinki.
Karlqvist L, Bernmark E, Ekenvall L, Hagberg M, Isaksson A, and Rostö T, 1999, Computer mouse and track-ball operation: similarities and differences in posture, muscular load and perceived exertion, International Journal of Industrial Ergonomics, 23, 157–169.
Karlqvist L K, Hagberg M, Köster M, Wenemark M and Ånell R, 1996, Musculoskeletal symptoms among computer-assisted design (CAD) operators and evaluation of a self-assessment questionnaire, International Journal of Occupational and Environmental Health, 2, 185–194.
Karlqvist L, Hagberg M and Selin K, 1994, Variation in upper limb posture and movement during word processing with and without mouse use, Ergonomics, 37, 1261–1267.
Paul R, Lueder R, Selner A and Limaye J, 1996, Impact of new input technology on design of chair armrests: investigation on keyboard and mouse. In: Human Centered Technology: Key to the Future, Proceedings of the Human Factors and Ergonomics Society 40th Annual Meeting (Human Factors and Ergonomics Society: Santa Monica, California), volume 1, 380–384.
REDUCING RISKS FOR WORK-RELATED MUSCULOSKELETAL DISORDERS IN SCHOOL NURSERIES
Carol Coole & Christine M.Haslegrave
Institute for Occupational Ergonomics, University of Nottingham, Nottingham NG7 2RD, UK
This study investigated factors relevant to the risk of work-related musculoskeletal disorders (WMSD) among nursery staff employed by a local education authority. The aims of the study were to assess the WMSD risk factors present and the level of reported prevalence of pain and discomfort, to find what determined exposure for individuals, and to identify feasible generic solutions. This included studies of three representative workplaces, a questionnaire survey of nursery staff and postural analysis of selected nursery activities. A high prevalence of reported back and neck trouble was found. Recommendations to reduce risk included improvements to work heights and seating, greater consideration of the organisation of nursery activities, provision of sufficient space for activities and for storage of equipment, and reduction in manual handling.

Introduction

A number of studies have reported a high incidence of work-related musculoskeletal disorders, especially in the back, neck and shoulders, among staff working in child care nurseries (Crawford and Lane, 1998; Grant et al, 1995; Shimaoka et al, 1998). Some risk factors are obvious, particularly for those working with babies and toddlers, who are more likely to need lifting and carrying than three and four year old children. Provision of child friendly furniture also results in excessive and frequent bending for the adults. However, removal of these risk factors is difficult. The design of furniture can be improved, as has been addressed in Scandinavia where compatibility for both staff and children has been considered in new designs, but a wide scale change would be expensive in the short/medium term for existing nurseries.

The present study investigated conditions in nurseries in a local authority area, with the aim of determining factors which affect exposure for individual staff and of identifying generic solutions which could reduce risk.

Methodology

The tasks performed by staff were investigated in three nursery units. These represented over 50 nursery units located within primary schools as well as separate nursery schools. Activity sampling was used to identify the wide variety of tasks performed in the units, and posture analysis using OWAS (Karhu et al, 1977) was carried out for a range of representative tasks.

A questionnaire was sent to approximately 200 nursery staff in over 50 schools in the local authority. The questions covered a range of physical, psychosocial and individual aspects of the work. The responses received (a 41% response rate), from 34 teachers and 53 nursery nurses, were analysed.
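The way posture samples can be summarised into action level percentages per activity, as in an OWAS analysis, is sketched below in Python. It is a minimal sketch under stated assumptions: each sampled posture is taken to have already been assigned an OWAS action category from 1 (no corrective measures needed) to 4 (corrective measures needed immediately), the function and variable names are hypothetical, and the sample values are invented for illustration rather than taken from the study.

```python
from collections import Counter

# OWAS action categories: 1 = no action needed, 2 = action in the near future,
# 3 = action as soon as possible, 4 = action immediately.
ACTION_CATEGORIES = (1, 2, 3, 4)


def action_level_percentages(samples: list[int]) -> dict[int, float]:
    """Percentage of sampled postures falling in each OWAS action category."""
    if not samples:
        raise ValueError("no posture samples supplied")
    counts = Counter(samples)
    unknown = set(counts) - set(ACTION_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown action categories: {sorted(unknown)}")
    total = len(samples)
    return {c: 100.0 * counts[c] / total for c in ACTION_CATEGORIES}


# Hypothetical samples: one action category per observed posture.
observations = {
    "sand/water play": [1, 2, 2, 3, 2, 4, 2, 1, 3, 2],
    "outdoor play": [1, 1, 1, 2, 1, 1, 1, 2, 1, 1],
}

profile = {
    activity: action_level_percentages(samples)
    for activity, samples in observations.items()
}
for activity, percentages in profile.items():
    print(activity, percentages)
```

Expressing the results as percentages rather than counts allows activities sampled for different lengths of time to be compared directly.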
Results

Apart from one male teacher, the staff involved in the study were female. The majority worked 32.5 hours per week over five days during school terms. Only two respondents were over 55 years, suggesting that most nursery staff have either retired or changed their job by the time they reach their mid-fifties.

Prevalence of pain and discomfort

This was assessed using the Nordic Questionnaire (Kuorinka et al, 1987), which showed that 85% of staff had experienced low back trouble and 57% had experienced neck trouble during the previous year. There was also a high prevalence of reported trouble in shoulders (55%), knees (49%), hips (35%) and ankles (31%).

Statistical tests carried out on the questionnaire data showed that reported low back trouble was significantly associated with the lack of suitable work heights and furniture, high manual handling demands, and low levels of social support and decision authority. Reported neck trouble was significantly associated with frequency of lifting, perceived stress, age of staff and short stature. The length of time that staff had previously worked with children under 3 years was found to be a significant factor in both back and neck trouble. Unexpectedly, those who were shorter in stature were also more likely to report low back trouble and neck trouble.

Work heights, seating and workplace layout

The presence of children’s furniture was a significant problem. Children’s chairs, desks and sinks were used frequently by the staff, even when they were not directly supervising the children (e.g. when preparing work, completing reports/registers, during breaks). Adult easy chairs were often used when reading/talking/singing with groups of children seated on the carpet, but these tended to be very low and soft with inadequate back rests, leading to a slumped posture. During many activities, staff were frequently seen to be bending, stooping, kneeling and squatting in order to interact with the children.
Some activities were more posturally demanding than others, as illustrated in Figure 1.

Figure 1. Percentage OWAS Action Levels for different nursery activities.

Manual handling

There was little routine lifting or carrying for the age group (3–5 yrs) in these nurseries, but it was required occasionally in particularly difficult circumstances, for example when a disruptive child had to be removed quickly from a group. Staff caring for children with special educational needs or challenging behaviour may be at greater risk of manual handling injuries.

It was also found that there were various heavy manual handling tasks required in moving (and storing) play and physical education equipment, and furniture. Equipment tends to be purchased with regard to play value for the children, child safety, durability, stability and cost. Less regard seems to be given to weight, storage requirements and manoeuvrability. Wooden equipment, for example, is often durable and stable, but can be very heavy. The majority of staff had not received training in manual handling.

Work organisation

Work tasks were generally shared and rotated equally between different members of the nursery staff. However, the way that activities are rotated can lead to an uneven balance of daily postural demands. For example, many nurseries operated a weekly rota system in which each member of staff was responsible for one main activity area. So one staff member might be responsible for creative or sand/water activities for a week, where the demands of bending and squatting are high, and another would be supervising outdoor play, where the postural demands are much less (although manual handling demands may be greater).

The ways in which the activities were supervised also varied. Some activities were more structured than others, depending on how the staff chose to implement the ‘desirable outcomes’ and on the personal style of interaction. Some staff made more effort to involve the children in ‘housekeeping’ duties such as washing up and clearing away, either by approaching the duties as play activities or by rewarding helpful behaviour, while others spent more time and energy in preparing and clearing away for the children. These different approaches also depended on the layout and storage provision in the nursery. Larger nurseries may be able to store toys and materials in such a way that the children can help themselves. Nurseries with limited space may require the staff to tidy away activities on higher shelving.

Psychosocial factors

Most staff reported high levels of job satisfaction and social support. Nursery teachers reported higher levels of stress than nursery nurses but also higher levels of decision authority. In contrast, nursery nurses reported dissatisfaction at having fewer opportunities for progression and promotion.
Conclusions and Recommendations Each nursery had its own advantages and disadvantages in terms of layout, facilities, storage and activity space. Staffing levels differed, as did individual approaches to educating and caring for the children. Equipment and furniture provision also varied. Therefore each nursery needs to be directly involved in establishing its own priorities to improve work conditions and reduce risks of work-related musculoskeletal disorders, based on the following recommendations. Work heights and seating Staff need access to suitable and appropriately placed adult work surfaces (for use in both sitting and standing) and seating, for those times when they are not in direct contact with the children. This includes break times and lunchtimes, staff/team meetings, administrative duties and activity preparation. When purchasing new equipment, arrangements should be made with suppliers for ‘trial periods’ prior to purchasing. Further considerations are the weight of any furniture which needs to be moved and smooth/ rounded edges so that the staff are at less risk of bruising their legs. Seating guidelines suggested by Pheasant (1996) should be used when selecting adult chairs for use by staff who are supervising groups of children seated on the floor. Staff should experiment with various heights to see whether a higher seat would be acceptable for interacting with the children, or whether this would increase the angle of neck flexion. The feasibility of children using adjustable seating and work surfaces should be considered, as should the possibility of their using high stools or standing at an adult work-surface for some activities, through fitting trials or by using modelling techniques. Care is needed to ensure that children will be safe when using high stools. Manual handling Manual handling risk assessments should consider the range of activities in nurseries, and all staff should receive manual handling training. 
This should include the use of behavioural strategies to manage situations where the lifting of children becomes necessary. Staff should consider their storage arrangements so that handling of equipment and materials can be kept to a minimum. This particularly applies to outdoor play equipment, and to the practice of moving heavy equipment and furniture outside in good weather. The involvement of children in fetching and putting away toys etc. should be encouraged as a normal part of nursery activities. Organisation of activity rotas would allow handling and moving of equipment to be evenly distributed amongst the staff. Work organisation Exposure levels to excessive postural demands could be reduced by altering the activity rotas in nurseries in order to improve the balance of activities over each working day. There should be a more even distribution of unsupported seated activities, creative/sand and water play (which involves the highest levels of bending, squatting and kneeling) and outdoor play (which is less posturally demanding). The postures of both staff and children would benefit from engaging in daily ‘stretches’ as a group activity. Work should be organised in such a way that staff are able to make use of break times and lunchtimes in order to regain either an upright erect posture or a supported posture in an adult seat.
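The rebalancing of activity rotas recommended above can be pictured as a rotation in which each member of staff moves to a different activity area every day rather than every week. The following is a minimal sketch only: the staff labels, the list of activity areas and the function name `daily_rota` are hypothetical and are not taken from the study, and the same idea could be applied to sessions within a single day.

```python
from collections import Counter

# Hypothetical activity areas, staff labels and days; not taken from the study.
ACTIVITIES = ["creative", "sand/water", "outdoor play", "story corner", "table-top"]
STAFF = ["A", "B", "C", "D", "E"]
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def daily_rota(staff, activities, days):
    """Round-robin rota: each person moves one activity along every day."""
    rota = {}
    for d, day in enumerate(days):
        rota[day] = {
            person: activities[(i + d) % len(activities)]
            for i, person in enumerate(staff)
        }
    return rota


rota = daily_rota(STAFF, ACTIVITIES, DAYS)

# Count how many days each person spends in each activity area over the week.
exposure = {p: Counter(rota[day][p] for day in DAYS) for p in STAFF}
for person, counts in exposure.items():
    print(person, dict(counts))
```

With equal numbers of staff, activity areas and days, every person covers every area once per week, so no one spends a whole week in the most posturally demanding area.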
Consideration needs to be given to meeting the employment needs of nursery staff as they become older and less able to manage the physical demands involved in nursery education. Psychosocial factors Action should be taken to address the effect of increased managerial demands and educational expectations on nursery staff, and to create working conditions which can enable staff to increase their levels of support for each other. This would aim to lessen the impact of the high levels of ‘stress’ perceived by nursery staff, which were found to be linked to reported back and neck trouble. References Crawford, J.O. and Lane, R.M. (1998) Posture analysis and manual handling in nursery professionals. In Contemporary Ergonomics 1998, edited by M.A.Hanson, (London: Taylor & Francis). Grant, K.A., Habes, D.J., Tepper, A.L. (1995) Work activities and musculoskeletal complaints among pre-school workers. Applied Ergonomics, 26, 6, 405–410. Karhu, O., Kansi, P., Kuorinka, I. (1977) Correcting work postures in industry: A practical method for analysis. Applied Ergonomics, 22, 1, 43–48. Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sorensen, F., Andersson, G., and Jorgensen, K. (1987) Standardised Nordic questionnaires for the analysis of musculoskeletal symptoms. Applied Ergonomics, 18, 3, 233–237. Pheasant, S. (1996) Bodyspace: Anthropometry, Ergonomics and the Design of Work, (London: Taylor & Francis). Shimaoka, M., Hiruta, S., Ono, Y., Nonaka, H., Wigaeus Helm, E., and Hagberg, M. (1998) A comparative study of physical work load in Japanese and Swedish nursery school teachers. European Journal of Applied Physiology, 77, 10–18.
PSYCHOSOCIAL AND PHYSICAL FACTORS AND MUSCULOSKELETAL ILLNESS IN TAXI DRIVERS Donald Anderson & Ruth Kjærsti Raanaas Centre for Occupational and Environmental Medicine, National Hospital, University of Oslo, N-0027, Norway
This survey was based on a hypothesis that working conditions may be responsible for ill-health amongst taxi drivers in Norway, especially musculoskeletal complaints. Factors considered included design of vehicles, baggage handling, traffic, working hours, lifestyle and psychosocial stress caused by violence and domestic stress. The data were intended to permit a characterisation of taxi driving and quantification of existing problems, and were collected via two self-reporting questionnaires. The incidence of musculoskeletal problems was established using the Nordic Questionnaire for the analysis of musculoskeletal symptoms, complemented by a so-called Workload Questionnaire. Taxi drivers completing the questionnaires were selected from drivers’ associations and taxi companies across Norway. Results showed that taxi drivers have an elevated incidence of musculoskeletal complaints compared with the general population. An important concern for many drivers is a fear of violence, actual or threatened. Analyses of these data support the contention that taxi driving may contain elements that contribute to reduced health. Introduction Taxi drivers have been shown to be vulnerable to health problems in a number of studies, in Norway, other Scandinavian countries and elsewhere (e.g. Gustavsson et al., 1996; Tüchsen, 1997). The aim of this questionnaire survey was to examine an aspect not previously studied, reported musculoskeletal pain among taxi drivers in Norway, and to relate it to different possible physical and psychosocial factors, both job-related and connected with private lifestyle. The number of hours spent behind the wheel has been shown to be connected to self-reported back pain for salesmen (Skov et al., 1996), police officers (Gyi and Porter, 1998) and public transport operatives (Krause et al., 1997). The latter study also demonstrated a connection between lack of breaks, inadequate rest possibilities and back pain.
In a study of Sydney taxi drivers, lack of breaks during the shift was also shown to be connected to an increased accident risk (Dalziel and Job, 1997). Exposure to whole-body vibration and lifting was shown to be related to back pain in a study of truck drivers in the USA and Sweden (Magnusson et al., 1996). Independent effects of a psychosocial nature have also been reported in association with musculoskeletal problems among professional drivers. In a recent study, both physical and psychosocial factors and back pain were investigated in different occupations, including drivers, where the two factors were found to
interact and jointly cause a greater effect on back pain than the simple sum of the two (Devereux et al., 1999). Survey population and questionnaires Taxi drivers in the whole of Norway fall into two classes: owner-drivers and employed drivers. Nearly all the former are registered with the Norwegian Taxi-drivers Association (covering about 5000 issued licences), of whom 20% were randomly selected for the survey (N=960). A cut-off of 60 years was applied to prevent general ageing factors from confounding the data of interest. A further 197 employed drivers under 60 years are registered as members of a drivers’ union and were added to the total, together with an additional selected group of 331 non-union employed drivers, to give a total subject pool of 1488 drivers for the survey. Questionnaires The primary questionnaire survey used the Standardised Nordic Questionnaire (NMQ) for the analysis of musculoskeletal symptoms (Kuorinka et al., 1987). This questionnaire contains nine screening questions, covering 12-month prevalence of musculoskeletal problems in different body areas (neck, shoulders, elbows, hands, upper back, lower back, hips, knees and feet), point prevalence (seven days) and pain severity in the same body areas, and 27 detailed questions about neck, shoulder and lower back pain. The NMQ has been tested for reliability and validity and was found acceptable for the purpose of examining or screening in occupational health. It is also the only practical and economic way of collecting pain data in a study such as that reported here, from so many subjects spread across the whole of Norway. In order to collect data on drivers’ working conditions, required for assessment of possible connections with reported pain, a separate questionnaire was specially developed for the purpose. This contained 57 questions, including items about the frequency and duration of driving, shiftworking, and routes driven.
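As a quick check on the subject pool described under ‘Survey population and questionnaires’, the three reported group sizes can be summed and expressed as shares of the pool. This is a minimal sketch: the group sizes are those given in the text, while the dictionary labels and variable names are illustrative only.

```python
# Group sizes as reported in the text; the labels are illustrative.
pool = {
    "owner-drivers (random sample)": 960,
    "union-registered employed drivers": 197,
    "non-union employed drivers": 331,
}

total = sum(pool.values())

# Share of each driver class in the mailed subject pool, in percent.
shares = {group: 100 * n / total for group, n in pool.items()}

print("total subject pool:", total)
for group, share in shares.items():
    print(f"{group}: {share:.1f}%")
```

The sum reproduces the stated pool of 1488 drivers, of which owner-drivers make up roughly two thirds.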
Some psychosocially related questions were included, e.g. waiting time and violence, as well as relevant personal details such as age, weight, height, eyesight, diet and health. Additional information about the vehicle driven was recorded, but is not discussed in this paper. The contents of this questionnaire were discussed and agreed with a reference group before being distributed. The reference group included representatives from each of the classes of driver. Conduct of survey A covering letter was prepared to explain the purpose of the survey and, together with the two questionnaires, mailed to the whole group of 1488 drivers. After five weeks and one reminder, 63.4% had responded. A sample of non-responders was randomly selected and contacted by telephone to find out why the questionnaires were not completed and returned. No reasons were given which were relevant to the content of the questionnaire and this suggested that little or no personal bias was present in the answers. Results All data were analysed using SPSS. No differences were found between full and part time drivers in the amount of pain over the last 12 months in any of the body areas, and no information was available about other physical activities apart from taxi driving for the part time group. Therefore, the analysis was
Table 1: Reported pain last 12 months by gender
*Reference group data from Natvig et al. (1995)
restricted to full time drivers only. Because of the very large amount of data, only some highlights of the results are discussed. Pain prevalence reported for different body parts, by gender, is shown in Table 1. For comparison purposes a Norwegian reference group with similar data has been identified, based on responses from 2726 subjects aged 20–72 years in a local community, around 50% of whom were women (Natvig et al., 1995). Population variables Pain prevalence during the last 12 months was higher among the employed drivers than the owner drivers for the neck and lower back, but owners appeared to have more daily pain than the employed drivers. The study sample contained 14.3% women, who reported neck and shoulder pain in the last 12 months more often than men. Four ethnic groups are present in the sample, made up of 92.2% Norwegian, 0.4% other Scandinavians, 1.1% other European, and 6.3% non-European (all men). Pain prevalence is higher for non-Europeans in the neck and shoulders. Neither age nor driving experience was significantly related to reported pain. Workload Mean hours driven per week was 53.8 and per shift, 10.1 for the whole population. A relationship exists between number of hours driven per week and pain reported for the lower back. This is not linear, however, as more of those driving 20–39 hours per week report pain than those driving 40–79 hours. A clear linear relationship was found between number of hours driven per shift and reported pain in the neck and shoulders. A question was included about carrying passengers’ luggage and accompanying them to their
Table 2: Reported pain last 12 months by experience of violence or threats
door; while no relationship to reported pain was found for the former, those who daily follow passengers to the door report more pain in the neck than those who seldom do so. Waiting time between passengers was assumed to be a psychosocial workload factor, and a range of times from 0.5 to 240 minutes was reported (mean 20.6 minutes; mode 10 minutes). A relationship was found between waiting time and reported neck pain. Experience of violence Violence, actual or threatened, was another psychosocial factor often reported by the drivers, of whom 27.6% had experienced actual violence and no less than 52.9% threats of violence (see Table 2). Fear of violence was the factor most strongly related to pain in the whole study, with a higher prevalence of pain in the neck, shoulders and low back amongst those most fearful. Women reported more fear than men, even though they experienced less violence. An interaction effect was also observed between fear and ethnic origin. Diet Drivers were asked about eating habits, specifically whether they took a packed lunch (traditional in Norway), ate a hot meal at home or snacked on fast food. Table 3 gives some of the results. For Norwegian men, a lower prevalence of pain for neck and lower back is reported among those eating a hot meal at home, but an opposite effect was observed for non-Europeans and Norwegian women.
Table 3: Reported pain the last 12 months by some diet questions
Smoking Smokers in the group amounted to 54.7%. More women than men smoked, as did more employed drivers than owners. A positive relationship was found between smoking and pain in the neck and shoulders. Exercise No relationship was found between frequency of exercise or sport and pain prevalence, but among regular exercisers the prevalence of shoulder pain was lower for joggers or walkers than for those pursuing other forms of activity. Other results A number of other factors were significantly related to pain, including taking a nap in the car (neck, shoulders and lower back), stature (lower back) and weight (lower back). A further 15 factors were not significant, including number of breaks per shift, days off per week, getting in and out of the car and type of car. Discussion Few studies have been reported on pain prevalence in the general population. Two studies in Norway allow a comparison of the data from this study with such reference data. Quite high levels of pain were reported in the reference data of Natvig et al. (1995), but they are far exceeded by those of the
taxi drivers. Higher pain prevalence was reported by the employed drivers than by owners, although a greater number of owners reported daily pain. This suggests that the two driver groups may have different strategies for reporting pain, possibly influenced by the different terms for sickness insurance applied to the groups. That shift length was related to musculoskeletal pain was not surprising, although no significant effects were measured from the organisation of shifts or breaks and time off, contrary to other studies (e.g. Krause et al., 1997). However, possible stress factors were examined in a bus driver study (Meijman and Kompier, 1998), which found a clear relation between musculoskeletal complaints and the opposed factors of time demands versus safety and social activities. Violence may be considered an extreme form of psychosocial stress, as found in research on workplace violence (Hewett and Levin, 1997), where taxi drivers were considered a typical example of what the authors called ‘violence related to interaction with the public’. It was interesting to note that, in the Norwegian study, fear of violence was so strongly related to musculoskeletal complaints. That diet (or eating habits) was found to be related to musculoskeletal pain was interesting, as it seemed not to be a question of nutrition, since weight was only so related in women. This suggests that it may have something to do with organisation and quality of private life and satisfaction. Conclusion Several interesting questions are raised by the data from this study, especially questions of coping with daily psychosocial stress situations and examining the question of social support, an aspect being investigated further in a follow-up study. It is also possible that some influence may be related to the ergonomics aspects of the vehicle. It is clear that some factors of both a physical and a psychosocial nature contribute to reports of pain, and that violence is especially significant.
A re-appraisal of all the data after the follow-up study will take into account all these factors when considering possible recommendations for intervention. References Dalziel, J. and Job, R.F.S. 1997. Motor vehicle accidents, fatigue and optimism bias in taxi drivers. Accident Analysis and Prevention, 29 (4), 489–494. Devereux, J.J., Buckle, P.W. and Vlachonikolis, I.G. 1999. Interaction between physical and psychosocial risk factors at work increases the risk of back disorders: an epidemiological approach. Occupational and Environmental Medicine, 56, 343–353. Gustavsson, P., Alfredsson, L., Brunnberg, H., Hammar, N., Jakobsson, R., Reuterwall, C. and Östlin, P. 1996. Myocardial infarction among male bus, taxi and lorry drivers in middle Sweden. Occupational and Environmental Medicine, 53 (4), 253–40. Gyi, D.E. and Porter, J.M. 1998. Musculoskeletal problems and driving in police officers. Occupational Medicine, 48 (3), 153–160. Krause, N., Ragland, D.R., Greiner, B., Syme, L. and Fisher, J.M. 1997. Psychosocial job factors associated with back and neck pain in public transit operators. Scandinavian Journal of Work, Environment & Health, 23, 179–186. Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sørensen, F., Andersson, G. and Jørgensen, K. 1987. Standardised Nordic questionnaires for the analysis of musculoskeletal symptoms. Applied Ergonomics, 18 (3), 233–237. Magnusson, M.L., Pope, M.H., Wilder, D.G. and Areskoug, B. 1996. Are occupational drivers at an increased risk for developing musculoskeletal disorders? Spine, 21 (6), 710–717.
Natvig, B., Nessiøy, I., Bruusgaard, D. and Rutle, O. 1995. Musculoskeletal symptoms in a local community. European Journal of Gen. Practice, 1, March. Skov, T., Borg, V. and Ørhede, E., 1996. Psychosocial and physical risk factors for musculoskeletal disorders of the neck, shoulders, and lower back in salespeople. Occupational and Environmental Medicine, 53, 351–356. Tüchsen, F. 1997. Stroke morbidity in professional drivers in Denmark 1981–1990. International journal of epidemiology, 26 (5), 989–994.
BLACK HAWK HELICOPTER LOADMASTER ERGONOMICS Peter Blanchonette, Robert King, David Crone & Peter Simpson Air Operations Division, Aeronautical and Maritime Research Laboratory, Defence Science and Technology Organisation, 506 Lorimer Street, Fishermans Bend, Vic 3207, Australia
For the past ten years the Black Hawk helicopter has been providing the Australian Army with a utility helicopter which is in high demand for operational use. The Black Hawk has four crew, consisting of two pilots and two loadmasters. Musculoskeletal problems are emerging in the loadmaster population, putting at risk these aircrew, who represent a substantial capital investment in terms of training and on-the-job experience. The Defence Science and Technology Organisation was asked by the Australian Army to investigate the impact of workstation design and job requirements on loadmaster health and the safety and effectiveness of the whole airborne system. This paper discusses the systems approach we are taking to minimise the musculoskeletal problems experienced by the loadmasters. Introduction The S-70A-9 Black Hawk helicopter provides the Army with a utility helicopter capability which is in high demand for transporting troops and equipment. A crew of four, comprised of two pilots and two loadmasters, operates the Black Hawk. The pilots occupy the front seats in the cabin and face forward while the loadmasters occupy seats situated directly behind the pilots and face outward. A typical mission may last for up to three hours, and the crew may be airborne for 20 hours a week. The two loadmasters play a range of roles during missions, including surveillance (updating the pilot on the air picture around the helicopter), clearance (updating the pilot on the air and ground picture while landing), troop management, loading and unloading equipment and gunnery. The musculoskeletal problems experienced by helicopter pilots have been reported in the literature since the late 1960s. The incidence of lower back pain in helicopter pilots is high (up to 95%), with the pain lasting for up to 48 hours after the flight (Bowden, 1987). 
Long-term exposure results in chronic ache in the lumbar area with episodes of acute pain and spasm also occurring in around 50% of cases (Delahaye, Auffret, Metges, Poirier & Vettes, 1982). However, while the incidence of musculoskeletal pain in helicopter pilots has been the subject of research for several decades, the problems of non-piloting aircrew have been largely ignored. Recent research suggests that non-pilot crew have more than three times the risk of developing back problems than pilots (Simon-Arndt, Yuan & Hourani, 1997). Black Hawk loadmasters are often required to adopt extreme postures in order to fulfil functions critical for mission safety and success. Musculoskeletal problems are emerging in this population putting at risk these aircrew who represent a substantial capital investment in terms of training and on-the-job experience. The Army is conscious of its duty-of-care and wishes to ensure that the loadmasters do not suffer chronic
disability as a result of their work environment. The Defence Science and Technology Organisation (DSTO) was asked to investigate the impact of workstation design and job requirements on loadmaster health and the operational effectiveness of the whole airborne system. This paper describes the approach we have taken to identify, classify, analyse and resolve loadmaster postural problems. Method and Results Survey of Pilot and Loadmaster Pain and Discomfort A survey of Black Hawk loadmasters was conducted by Air Operations Division in 1996 (Foran & Zalevski, 1999). The main result of this survey is that loadmasters report experiencing higher levels of pain across more body locations than pilots. Loadmasters experience pain predominantly in the knees (90%), lower back (83%) and neck (90%). Loadmasters considered their tasks and the space limitations of their workspace in the Black Hawk to be the major causes of the reported pain. Overall, the results of the survey indicate that loadmasters are at a greater risk of developing musculoskeletal problems when compared to pilots. Video Analysis of Typical Missions In order to identify loadmaster postures and the activities associated with them, aircrew activity during typical missions was videotaped. Analysis showed that loadmasters assume a restricted number of postures (Crone, King & Blanchonette, 1999). Three postures were adopted for more time than the others, occupying 82% of the averaged total mission time. These postures consisted of sitting upright, sitting forward and kneeling (see Figure 1).
Figure 1. Typical loadmaster postures taken from mission videos. From left to right, sitting upright, sitting forward and kneeling.
Loadmasters spent more time kneeling than sitting upright or sitting forward, spending 39% of the total mission time kneeling, performing primarily the visual tasks of surveillance and clearance (Crone, King & Blanchonette, 1999). Biomechanical analysis shows that kneeling is the most potentially harmful posture of the three (Ackland, Lloyd & Skoss, 1999). One way to minimise the injury risks to loadmasters would be to reduce their kneeling time. However, the loadmaster’s field-of-regard (FOR) when performing the visual tasks is a function of their eyepoint, which varies as a function of their posture. Changing the loadmaster’s posture to a less stressful position (such as sitting upright or forward) will affect the loadmasters’ FOR, and also the combined FOR (for all aircrew) around the aircraft. Reducing the risk of musculoskeletal injury to loadmasters by changing posture could well increase the risk to aircrew and aircraft safety by altering the loadmasters’ FOR.
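The time shares reported above imply a few further figures that are useful when weighing kneeling against the two sitting postures. The sketch below is simple arithmetic on the reported percentages. The weekly estimate is an assumption: it applies the mission-time share of kneeling to the upper figure of 20 airborne hours a week mentioned in the Introduction, which the source itself does not do.

```python
# Percentages of averaged total mission time, as reported in the text.
three_postures_pct = 82.0   # sitting upright + sitting forward + kneeling
kneeling_pct = 39.0

# Time left for the two sitting postures combined, and for all other postures.
sitting_combined_pct = three_postures_pct - kneeling_pct
other_pct = 100.0 - three_postures_pct

# Kneeling as a share of the time spent in the three main postures.
kneeling_share_of_three = 100.0 * kneeling_pct / three_postures_pct

# ASSUMPTION: the mission-time share also holds across a week with
# 20 airborne hours (the upper figure given in the Introduction).
airborne_hours_per_week = 20.0
kneeling_hours_per_week = kneeling_pct / 100.0 * airborne_hours_per_week

print(f"sitting (upright + forward): {sitting_combined_pct:.0f}%")
print(f"other postures: {other_pct:.0f}%")
print(f"kneeling share of the three postures: {kneeling_share_of_three:.1f}%")
print(f"kneeling hours per week (assumed): {kneeling_hours_per_week:.1f}")
```

On these figures, kneeling accounts for just under half of the time spent in the three main postures.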
Figure 2. MQPro mannequin used to determine the loadmaster eyepoint. From left to right: (a) sitting upright (b) sitting forward and (c) kneeling. In this case the mannequin is a 50th percentile (stature) male from the US Army 1988 database.
A computer-based human modelling approach was employed to systematically assess the visual coverage of the pilots and the loadmasters in the three most commonly observed postures. The pilot and loadmaster workstations and the associated cockpit transparencies were digitised and the CAD model was then imported into the MQPro (Humancad) human-modelling package. In order to assess the effect of posture and stature on the loadmasters’ FOR, 5th, 50th and 95th percentile (stature) male mannequins were positioned in the sitting upright, sitting forward and kneeling postures at the loadmaster workstation (see Figure 2). A visual model based on Harrington (1964) was used to measure the visual field of each mannequin relative to a common reference point. Combined Visual Coverage around Aircraft Figure 3(a) shows the combined FOR of the pilots and loadmasters (in this case 50th percentile stature) with the loadmasters sitting upright, sitting forward and kneeling. The overlap between the loadmasters’ FOR and the pilots’ view through the cockpit door and overhead transparency is shown in Figure 3(b). Kneeling Kneeling provides the greatest coverage towards the front of the aircraft for 5th, 50th and 95th percentile loadmasters. Kneeling also produces the greatest overlap with the pilots’ vision through the cockpit door and overhead transparency. Shorter loadmasters have an advantage over taller loadmasters in that they can see significantly higher than the taller loadmasters. Sitting Forward Sitting forward significantly reduces the FOR of the loadmasters towards the front of the aircraft compared to the kneeling posture. Sitting forward increases the upward vision for tall and mid-size loadmasters, while decreasing the downward view for mid-size and short loadmasters. Tall loadmasters have an advantage over shorter loadmasters in this posture as they have greater visual coverage in all directions.
Sitting Upright Loadmasters of all statures have the most restricted FOR when sitting upright. Taller loadmasters have a slightly greater FOR than shorter ones. However, while this is the most visually restricted posture it is also the least biomechanically stressful posture for the loadmaster (Ackland, Lloyd & Skoss, 1999).
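The idea of a combined FOR can be illustrated in one dimension as the union of the azimuth sectors visible to each crew member. The sketch below is not the method used in the study, which measured three-dimensional visual fields in MQPro with a visual model based on Harrington (1964). The sector angles are hypothetical values chosen only to show the calculation, the function names are illustrative, azimuth is measured clockwise from the nose, and each sector is assumed to span less than a full circle.

```python
def split_sector(sector):
    """Split a (start, end) azimuth sector in degrees at the 0/360 wrap."""
    start, end = sector[0] % 360, sector[1] % 360
    if start <= end:
        return [(start, end)]
    return [(start, 360), (0, end)]


def combined_coverage(sectors):
    """Merge overlapping sectors; return merged sectors and total degrees."""
    parts = sorted(p for s in sectors for p in split_sector(s))
    merged = []
    for start, end in parts:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous sector: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged, sum(end - start for start, end in merged)


# HYPOTHETICAL sectors (degrees clockwise from the nose), for illustration only.
pilots = (300, 60)
kneeling = [pilots, (200, 330), (30, 160)]   # loadmasters reach further forward
upright = [pilots, (220, 320), (40, 140)]    # loadmasters more restricted

merged_kneeling, total_kneeling = combined_coverage(kneeling)
merged_upright, total_upright = combined_coverage(upright)
print("kneeling:", merged_kneeling, total_kneeling)
print("upright: ", merged_upright, total_upright)
```

Because overlapping sectors are merged before summing, overlap between the pilots' view and a loadmaster's view is counted only once, which is why a change of posture can alter both the total coverage and the amount of shared coverage.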
While loadmasters of all heights have their greatest visual coverage towards the front of the aircraft while kneeling, biomechanical analysis has shown that this is the most harmful of the three most common postures. By either sitting upright or forward, the loadmaster loses a portion of his overlap with the pilot, but the postural stresses are reduced. Concluding Remarks In this paper we have discussed the approach we are taking to the occupational health and safety issues onboard the Black Hawk helicopter. Based on the task-postural analysis of typical missions, the detailed biomechanical assessment of the commonly observed postures and the visual coverage analysis, recommendations will be made to Army for future operational procedures. References Ackland, T., Lloyd, D. & Skoss, R. (1999) A psychophysical and EMG investigation of aircrew working postures. University of Western Australia. Report under DSTO contract number 654335. Crone, D., King, R. & Blanchonette, P. (1999) Task and postural analysis of Black Hawk helicopter aircrew. Air Operations Division Client Report 99/12, AOD-CR 99/12 (in preparation). Delahaye, R.P., Auffret, R., Metges, P.J., Poirier, J.L. & Vettes, B. (1982) Backache in helicopter pilots, in Delahaye, R.P. and Auffret, R. (Eds), Physiopathology and Pathology of Spinal Injuries in Aerospace Medicine (2nd Ed.). AGARD, Neuilly-Sur-Seine, France, 211–263. Foran, D.A. & Zalevski, A. (1999) A survey of musculoskeletal pain and discomfort experienced by S-70A-9 aircrew. Air Operations Division Client Report 98/27, AOD 98/27, Defence Science and Technology Organisation, Melbourne, Australia. Harrington, D.O. (1964) The visual fields. (St. Louis: Mosby). Simon-Arndt, C.M., Yuan, H. & Hourani, L.L. (1997) Aircraft type and diagnosed back disorders in U.S. Navy pilots and aircrew. Aviation, Space, and Environmental Medicine, 68, 1012–1018.
Figure 3. The combined FOR around the aircraft with loadmasters sitting upright, sitting forward and kneeling. The combined FOR is shown as (a) a plan view in the horizontal plane and (b) a projection of the combined FOR of the left-hand pilot and loadmaster in the horizontal and vertical directions.
ORGANISATIONAL ISSUES AS OBSTACLES TO INTERVENTION FOR MUSCULOSKELETAL COMPLAINTS Clare Lawton1 & Roger A.Haslam Health and Safety Ergonomics Unit, Department of Human Sciences, Loughborough University, Loughborough, Leicestershire, LE11 3TU
1 now at Health & Safety Laboratory, Ergonomics Section, Broad Lane, Sheffield, S3 7HQ
It is important for ergonomists to ensure that their interventions and recommendations are actually effective. This paper illustrates how subtle organisational issues can be a significant obstacle when dealing with seemingly straightforward ergonomics problems. A case study is presented illustrating various workplace concerns, which on the surface appeared uncomplicated to correct. However, at a deeper level, it is argued that a root cause of the problems, and of the obstacles encountered in attempting to resolve them, lay in deficiencies in organisational structures regarding communication, responsibility and awareness. This study highlights the benefit of making an assessment of the macroergonomics context when addressing ergonomics issues within organisations. Intervention strategies can then be tailored accordingly. Introduction There is general agreement among researchers that work-related musculoskeletal disorders (WMSD) arise from a combination of repetition, force, posture, individual and psychosocial factors. However, despite the widespread recognition of these primary risk factors, ergonomics interventions often achieve only limited success in changing work practices and reducing operator exposure to WMSD risks. Ergonomists are frequently able to identify elementary flaws in task and workstation designs, and yet rectifying these obvious problems regularly proves to be difficult in practice. A growing body of research demonstrates that, despite the potential utility of ergonomics for firms and employees, all too rarely are guidance and recommendations actually implemented (Liker et al, 1984; Urlings et al, 1990; Hendrick, 1991; Alexander and Orr, 1999). An analogy with economics has been drawn in making the distinction between microergonomics, the first generation of ergonomics theory, and macroergonomics, dealing with the wider context (Hendrick, 1991).
Studies to identify the reasons for the failings of the traditional ‘micro’ approach were conducted by Liker et al (1984), with a follow-up study (Liker et al, 1991). Their findings highlight several generic organisational factors that inhibit the application of ergonomics in the workplace. Similarly, Urlings et al (1990) suggested that reasons for lack of success may relate to the attitudes and behaviours of the managers and employees of an organisation. These authors highlight the importance of considering organisational ‘macro’ factors when dealing with more specific workplace problems.
Study context
This study was conducted in an electronics manufacturing company that assembles circuit boards for television and video tuners. Work is conducted along a ‘modern’ assembly line, with tasks that are highly repetitive and specialised in nature. The study was initiated in response to an HSE inspection that had raised concerns regarding the presence of high risk factors associated with WMSDs. The instigation for ergonomics input thus arose from an HSE ‘strong recommendation’ rather than a company-driven desire to embrace ergonomics. The study was split into four main components encompassing two main aims: (i) the marketing of ergonomics and (ii) the development of a long-term ergonomics programme, employing a macroergonomics approach (figure 1).
Figure 1. Study components
Gaining commitment: marketing ergonomics
The indication from HSE that a problem existed was insufficient by itself in gaining shared recognition of this among management and employees. The severity, cause and broader implications of WMSDs were not acknowledged or fully understood. To gain commitment in the early stages of the study, it was considered necessary to identify a marketing scheme aimed at increasing awareness of the problems, their broader implications and the need for ergonomics intervention and its benefits.
An assessment of the current situation
An assessment of the current extent of the problem was undertaken in which results from task analysis, postural analysis and workplace questionnaires confirmed that operators were exposed to high levels of risk. The main areas of concern were combinations of task invariability, repetition and poor posture. The results of the postural analysis and workplace questionnaires highlighted the severity of the situation and the potential for escalation in WMSD cases. A high percentage of the workforce reported WMSD symptoms and over 50% had been sufficiently affected to be incapacitated in some form within the last 12 months. Assessing the current situation provided data to persuade the company that they had a matter that needed attention. However, in accordance with Simpson (1989), it is important that ergonomics is presented as a solution and not solely a problem if management commitment is to be gained. Therefore, the strategy pursued was to raise awareness of the extent of the current problems while also presenting possible
solutions as a way forward. This was achieved in two ways: working on a small individual workstation project (microergonomics) and developing an ergonomics package (macroergonomics).
The microergonomics project
Working on a small workstation project allowed demonstration of user-centred design practices, time line analysis, table top discussions and mock-ups. In addition, it provided the opportunity to introduce line engineers and designers to information sources such as anthropometric data tables. The project helped promote ergonomics design practices, provided an illustrative example to management, and generated respect on both sides.
Stakeholder analysis
Activities such as informal discussions, structured interviews and attending meetings proved of value in understanding company functioning and culture. The analysis suggested that although, on the surface, problems appeared primarily due to shortcomings in job and workstation design, at a deeper level the root causes were deficiencies in management processes. Significant weaknesses in the implementation of design interventions and follow-up evaluations were symptomatic of this, illustrated by poor communication and poor acceptance of responsibility. Although these problems were acknowledged by individuals within the company, this had not been sufficient to improve the situation.
Examples
Figure 2 provides an example of the rudimentary nature of workplace problems found within the company. In their simplicity and obviousness they also illustrate the severity and extent of inadequacies with communication and responsibility. In this instance a new jig had been placed onto the line to increase output. However, in doing so the production line engineers failed to consider the implications of incorporating an additional operator. The distance between operators was significantly reduced, confining the freedom of movement and personal space of the operator and neighbours.
In addition, the jig was placed in close proximity to a table leg. This forced the operator to adopt a twisted trunk posture to operate the jig, a position sustained throughout much of the 8 hour working day. The situation had been drawn to the attention of the engineers, whose solution had been to bandage the table leg. Despite the health and safety officer highlighting that this situation was still unsatisfactory, it remained in this condition for a further 2 months.
A second case is illustrated in figure 3, showing problems with the layout and design of the task and equipment. Machine operating height is at mid-chest level and out to the side from body mid-line, placing stress on the right shoulder and causing twisting and flexion of the neck and trunk. The right hand position places postural stress on the wrist, requiring excessive flexion. This task element has a cycle time of approximately 4 seconds, with high repetition. The operator sought medical attention regarding shoulder problems; subsequently the equipment was lowered, leading to some improvement. Unfortunately, an identical workstation that had been the subject of a similar complaint was left unchanged in its original poor configuration. An opportunity to propagate a solution throughout the rest of the process had not been recognised.
The underlying cause of these examples was judged to be a failure to involve key stakeholders, such as operators and health and safety personnel, in production design decisions. This was made worse by the
Figure 2
Figure 3
absence of organisational procedures for responding to operator complaints and taking remedial action. A part of the solution to this would be improvement of communication and accountability between the occupational health, health and safety, and engineering departments within the plant.
Macroergonomics: formulation of an ergonomics package
An ergonomics package was compiled with the aims of:
• improving understanding of WMSDs and ergonomics
• bridging the gap between health and safety and production engineers
• encouraging, sustaining and evaluating ergonomics interventions by providing systematic and traceable assessment procedures
As part of the package, risk assessment tools were developed with a view to improving communication between health and safety personnel and line engineers. Previous risk assessments had been ineffectual in achieving change. The revised risk assessment documentation considered engineers as potential users, and was specifically designed to promote the transition from risk assessment to solution generation. This was achieved by categorising risks into high, medium and low, while also providing cross-referencing to appropriate guidelines or design considerations. The completion of the risk assessments occurs within a wider framework, providing structure to the actual assessment, the identification of solutions, and subsequent implementation and evaluation. The package also draws together staff from various divisions, promoting communication and supporting clearer delegation of responsibility for dealing with issues.
Conclusion
Organisational issues can be a significant obstacle when addressing more specific work-related problems such as WMSDs and should ideally be considered from the outset. A macroergonomics approach encourages total quality management rather than isolated and one-off improvements at the micro level. It is argued that such a strategy enables individual problems to be tackled more effectively, and instigates an adaptable and evolutionary basis from which future problems can be addressed.
References
Alexander, D.C. and Orr, G.B. 1999, Development of ergonomics programs. In: W.Karwowski and W.S.Marras (eds) The Occupational Ergonomics Handbook (CRC Press), 79–96
Hendrick, H.W. 1991, Ergonomics in organisational design and management, Ergonomics, 34, 743–756
Lawton, C.G. 1999, The implementation of ergonomics into an electronics manufacturing company in relation to work related musculoskeletal disorders. MSc Ergonomics Dissertation, Loughborough University
Liker, J.K., Joseph, B.S. and Armstrong, T.J. 1984, From ergonomic theory to practice: Organisational factors affecting the utilisation of ergonomic knowledge. In: Human Factors in Organisational Design and Management. Proceedings of the first symposium held in Honolulu, Hawaii
Liker, J.K., Joseph, B.S. and Ulin, S.S. 1991, Participatory ergonomics in two US automotive plants. In: K.Noro and A.S.Imada (eds) Participatory Ergonomics (Taylor & Francis, London), 97–138
Simpson, G.C. 1989, Costs and benefits in occupational ergonomics. Keynote address at the International Conference on Marketing Ergonomics, Noordwijk, Netherlands, 5–8 June
Urlings, I.J.M., Nijboer, I.D. and Dul, J. 1990, A method for changing the attitudes and behaviour of management and employees to stimulate the implementation of ergonomic improvements, Ergonomics, 33, 629–637
EVALUATING THE RISK OF UPPER LIMB DISORDERS FOR OPERATORS IN A COMPANY USING SANDING AND POLISHING EQUIPMENT Philip D.Bust & Christine M.Haslegrave Institute of Occupational Ergonomics, University of Nottingham, Nottingham NG7 2RD, UK
A risk assessment was carried out at a company manufacturing automotive components to determine the likely risk of work related upper limb disorders. An initial walk-through highlighted vibration, repetitive work and prolonged standing as potential risk factors. The investigation included questionnaire guided interviews, and measurements of work stations and the environment. A literature search of the effects of vibration, repetitive work, psychosocial factors, prolonged standing, work-rest schedules and thermal comfort was also carried out. Potential risks with regard to holding of small work pieces, workstation design, precision work and prolonged standing were identified. Recommendations included jigs to support small work pieces, adjustable work stations, training of staff at all levels to identify risk factors, an improved work-rest schedule and continued monitoring of vibration and noise levels.
Introduction
The company where the investigation was carried out had approximately 300 employees and produced veneered interior parts for cars. The main process at the factory required the application of thin wood veneers onto plastic substrates before coating with a polyester film. The finish of the articles is of prime importance, so preparation and finishing are critical. The substrate, veneer and polyester coating each require sanding with progressively finer abrasives, completed by waxing and polishing of the polyester coating. An initial walk-through with the health and safety manager highlighted the areas which might pose a risk of injury to the work force.
Methods
A questionnaire was drafted in order to obtain the views of a sample of the workforce regarding their work and leisure activities, and any physical discomfort they were experiencing in their hands (related specifically to vibration) and their bodies (using body maps). There were also questions relating to their general health and well being (taken from the General Well-being questionnaire, Cox and Griffiths, 1995) and personal details. In all, 28 interviews were conducted.
In order to assess any undesirable work practices and poor adopted postures, videos of the tasks were examined, and a breakdown of tasks and a posture assessment using RULA (McAtamney and Corlett, 1993) were carried out. RULA scores were checked against RULA action levels.
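The check of RULA scores against action levels can be sketched as follows. This is a minimal illustration, assuming the four action levels published with the method by McAtamney and Corlett (1993); the function name and the example scores are hypothetical and are not data from this study.

```python
# Sketch only: maps a RULA grand score (1-7) to the action level
# described by McAtamney and Corlett (1993). The function name and
# the example scores below are illustrative assumptions.

ACTION_LEVEL_MEANING = {
    1: "posture acceptable if not maintained or repeated for long periods",
    2: "further investigation needed; changes may be required",
    3: "investigation and changes required soon",
    4: "investigation and changes required immediately",
}


def rula_action_level(grand_score: int) -> int:
    """Return the RULA action level (1-4) for a grand score (1-7)."""
    if not 1 <= grand_score <= 7:
        raise ValueError("RULA grand score must be between 1 and 7")
    if grand_score <= 2:
        return 1
    if grand_score <= 4:
        return 2
    if grand_score <= 6:
        return 3
    return 4


if __name__ == "__main__":
    # Hypothetical scores for three task elements.
    for score in (2, 4, 7):
        level = rula_action_level(score)
        print(score, level, ACTION_LEVEL_MEANING[level])
```

Because a RULA score applies to a single posture, a task made up of several elements would normally be scored element by element, with the duration of each posture noted alongside the action level.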
To identify any failings in the workplace design, a brief dimensional survey was carried out. Dimensions of the work stations were compared to published recommendations and assessed in relation to anthropometric measurements for the worker population. A sound pressure level meter and indoor climate analyser were used to measure the relevant sound and thermal levels. Hand-arm vibration levels for the sanding and polishing equipment were measured using a vibration meter.
Results
Questionnaire/interviews
Of the sample of 28 workers, 16 had 1–5 years' service at the company, 6 had less than one year and 6 had more than 5 years' experience. Self report results for hand discomfort showed little sign of numbness and blanching. Results for body discomfort were generally low, with the hands and lower arms and the feet and lower legs recording higher values than the rest of the body. The workers reported general satisfaction with their work stations, although there were a small number of reports of difficulty working with small items (accident records also showed a number of incidents where small items were involved). Stress levels indicated by the self report questions compared favourably with published results of the General Well-being questionnaire.
Observation
The four tasks (orbital, hand and belt sanding, and polishing) all involved transmission of vibration from machinery to the operators' hands, either by direct contact with the machinery or via the piece being worked. All shop floor work required the workers to stand at their work stations. All of the operations kept the worker in a relatively fixed position, required either by the location of the machine or by the support of the piece being worked. The polishing operation required the most force and the operatives used their body to assist in applying that force, except with small pieces, when the force was applied more with the hands. Belt sanding could only be carried out by a small number of skilled operatives. A great deal of practice and training was needed with this task in order to achieve the fine finishes required.
Dimensional measurements
The workstation dimensions for the sanding operations would be suitable for approximately 1/3 of a male worker population and almost entirely excluded a female worker population. One workstation, however, successfully incorporated flexibility in positioning of the work pieces by allowing them to be fitted anywhere on a pyramid surface. The workstation dimensions suited the current workforce (predominantly male). There were no females in the sample and none working in these operations. There were, however, females working on other operations in the factory.
Environmental measurements
The noise exposure assessment results showed that four of the areas under consideration were between the first and second action levels, so hearing protection had to be made available for all the operators performing sanding and polishing tasks. Conditions in the area where the polishing operations (most force required)
took place were the warmest in the factory due to the location of adjacent presses and the distance from suitable ventilation. Self report from the workers in this area gave the most complaints of being too warm.
Vibration measurements
Readings for the orbital and hand sanding equipment were below statutory action levels. It was not possible to obtain reliable readings for the other operations due to static feedback affecting the equipment.
Discussion
When considering the risk factors for upper limb disorders, Colombini (1998) proposes a general analytical model with the four main risk factors identified as repetitiveness (frequency), force, awkward postures and movements, and lack of recovery time. Factors that increase the risks are vibrations, velocity and acceleration of movement, and precision, among others.
Repetition
The work was repetitive. Silverstein et al (1987) define high repetition as cycles lasting less than 30 seconds or where over 50% of the cycle time is spent executing the same type of action. In the sanding and polishing tasks, many actions had cycles of less than 30 seconds but were separated by other actions within the overall cycle time of 45 seconds. However, work pieces had to be repositioned, with consequential variation in posture, at least once within the 45 second cycles.
Force
Physically, none of the tasks under assessment required heavy manual work or high force exertion.
Posture
None of the 28 operators interviewed reported problems with the work benches. Thus, if they were working in poor postures, they were not conscious of any effects. The video analysis showed the operators changing position often, which may have countered any ill effects of static loading. At the moment workstations are comfortable for the majority of the workers using them, possibly due to chance, as there was no indication that the company selected workers to suit the tasks.
Future changes in personnel may lead to postural problems with these tasks should the workers not fit the physical constraints of the equipment. The RULA scores revealed poor postures adopted in most tasks. However, these were usually only for a small part of the overall cycle and there were frequent variations in posture. The fact that all work on the tasks under investigation required the operatives to stand at a fixed workstation will lead to static loading in the legs. However, some of the operations had a rocking motion which would assist circulation. The reports of leg discomfort in the interviews nevertheless indicate that further attention is needed to reduce the amount of time spent in fixed standing postures.
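The repetition criterion quoted above from Silverstein et al (1987) can be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, and apart from the 45 second overall cycle mentioned in the text, the timings used in the example are assumptions rather than measurements from the study.

```python
# Sketch only: applies the two-part definition of high repetition
# attributed in the text to Silverstein et al (1987):
#   (a) cycle time under 30 seconds, or
#   (b) more than 50% of the cycle spent on the same type of action.
# Function name and example timings are illustrative assumptions.


def is_high_repetition(cycle_time_s: float, same_action_time_s: float) -> bool:
    """Return True if a task meets either high-repetition condition."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    if not 0 <= same_action_time_s <= cycle_time_s:
        raise ValueError("time on one action must lie within the cycle")
    short_cycle = cycle_time_s < 30
    dominated_by_one_action = same_action_time_s / cycle_time_s > 0.5
    return short_cycle or dominated_by_one_action


if __name__ == "__main__":
    # 45 s overall cycle with 20 s on one type of action: not high repetition.
    print(is_high_repetition(45, 20))
    # Same cycle, but 30 s on one type of action: high repetition.
    print(is_high_repetition(45, 30))
```

Under this reading, a 45 second cycle escapes the first condition, so the classification of the sanding and polishing tasks depends on how much of each cycle is spent on one type of action.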
Recovery
Efficiency studies carried out at the company have led to the removal of unnecessary travel of the work pieces within the factory and to combining separate operations into cells. The effect of this on the workers is a reduction in walking within the factory and thus fewer opportunities for the legs to recover from static loading. Colombini (1998) says that if work is carried out through the day with 50 minutes of work followed by a 10 minute break, lack of recovery does not represent an additional risk, but that longer work periods increase the risk. This concurs with the recommendations of Murrell (1965) and Rohmert (1973a & 1973b). The schedule at the factory required 2 hour periods of work with two 10 minute breaks and a 30 minute lunch during the day. An increase in rest breaks would allow recovery from load on the workers' legs as well as relief from any constrained posture they may have adopted to carry out the work.
Conclusions and Recommendations
The operations, the equipment used and the environment they are used in did contain potential risks which needed to be addressed. Of the main risk factors (repetition, force, posture and recovery time), recovery time was the main concern: none of the operations allowed adequate recovery time. Of the additional factors (vibration, prolonged standing and precision), operatives had to stand all day on a concrete floor and virtually all the procedures required precision work. Measurements showed the workstations to be suitable for the operatives using them and thermal conditions to be unsuitable in some areas; both findings were confirmed by self report in the questionnaires.
Prevention starts when the job or task is created, as can be seen from approaches such as Putz-Anderson (1988), Bergamasco et al (1998) and Muggleton et al (1999). This can be the responsibility of anyone in the workplace (management, engineering or workforce), so education and training are required at all levels to make everyone aware of the benefits (reduced work related upper limb disorders) of good ergonomic design.
It was recommended that attempts should be made to provide support for small pieces, as these seemed to be causing the operators problems on some operations. The move towards workstations like the pyramids should continue, as these provide the opportunity for the workers to adopt a comfortable posture. Training of staff at all levels was recommended to ensure that new operations are free from risk factors. A move is also recommended towards an improved work-rest schedule to provide better recovery time and therefore reduce potential problems with repetitive tasks. Vibration, sound and lighting levels should continue to be monitored to identify hazards.
Acknowledgements
The authors would like to thank Lawrence Automotive Ltd., of Nottingham, for the opportunity to carry out this study.
References
Bergamasco, R., Girola, C. and Colombini, D. (1998). Guidelines for designing jobs featuring repetitive tasks. Ergonomics 41(9):1364–1383.
Colombini, D. (1998). An observational method for classifying exposure to repetitive movements of the upper limbs. Ergonomics 41(9):1261–1289.
Cox, T. and Griffiths, A. (1995). The nature and measurement of work stress: theory and practice. In Evaluation of Human Work, Eds J.R.Wilson and E.N.Corlett. Taylor and Francis, London.
McAtamney, L. and Corlett, E.N. (1993). RULA: a study method for the investigation of work related upper limb disorders. Applied Ergonomics 24(2):91–99.
Muggleton, J.M., Allen, R. and Chappell, P.H. (1999). Hand and arm injuries associated with repetitive manual work in industry: a review of disorders, risk factors and preventive measures. Ergonomics 42(5):714–739.
Murrell, K. (1965). Human performance in industry. Reinhold, New York.
Putz-Anderson, V. (Ed) (1988). Cumulative trauma disorders: A manual for musculoskeletal diseases of the upper limbs. Taylor and Francis, London.
Rohmert, W. (1973a). Problems in determining rest allowances. Part 1: Use of modern methods to evaluate stress and strain in static muscular work. Applied Ergonomics 4(2):91–95.
Rohmert, W. (1973b). Problems in determining rest allowances. Part 2: Determining rest allowances in different human tasks. Applied Ergonomics 4(3):158–162.
Silverstein, B.A., Fine, L.J. and Armstrong, T.J. (1987). Occupational factors and carpal tunnel syndrome. American Journal of Industrial Medicine 11:343–358.
Personal protective equipment
SPECIFICATION OF FOOTWEAR FOR POSTAL WORKERS Corinne Parsons & Amanda Wray Post Office Consulting, Royal Mail Technology Centre, Dorcan, Swindon SN3 4RD, UK
A review of uniform footwear was carried out due to concern over high numbers of slip, trip and fall accidents and non-compliance with wearing uniform shoes and safety shoes. The study involved questionnaire surveys, user workshops, and discussions with postal workers and manufacturers. Footwear trials and analysis of accident statistics were also carried out. Functional specifications were developed based on the user requirements identified. The footwear has now been issued and its acceptance by staff and accident statistics are being monitored.
Introduction
A high number of employees were observed to wear their own footwear in preference to the uniform and safety footwear issued. Three types of footwear were provided as part of the uniform, including two shoes that met Occupational Footwear Standard prEN 347 and a training shoe. A safety shoe, boot and trainer were also available, which all met the requirements of the Occupational Footwear Standard and the Safety Footwear Standard prEN 345. Uniform footwear was provided to staff in three main sections within Royal Mail. These were delivery work (mainly outdoor walking on a wide range of surfaces, usually carrying a bag of mail), processing work (indoor) and distribution work (loading vehicles with wheeled containers, manoeuvring pallet trucks and driving). Most distribution employees required safety footwear.
Method
A study was carried out to find out why staff did not wear the footwear provided, which included the following activities:
• Questionnaire survey of uniformed employees
• Workshops
• Review of accident statistics
• Trials of footwear by delivery and distribution staff
Questionnaire
Questionnaires were completed by 115 delivery employees and 79 processing employees. Information was obtained relating to shoe preferences and features of the footwear including: comfort, grip, protection, support, durability, sole flexibility, breathability, style and suitability for the job. A comparison was made between summer and winter. A five point scale of very poor, poor, ok, good and very good was used. For delivery work, the results showed that Shoe 1 scored as good for almost all features in the summer and Shoe 2 as ok or good, but durability was inadequate for both shoes. In winter the footwear was not adequately waterproof and grip was only ok or poor for both shoe types. Only 5% of the staff considered that trainers were suitable for delivery work. All of the footwear types were considered suitable for processing work, with most features being rated as good. The only feature scoring as poor was breathability and this was mainly related to the training shoes.
Workshops
Four workshops were held, with staff from processing, distribution and delivery being represented. Information was obtained using a mixture of open discussion, syndicate exercises and individual work. The workshops identified the good and bad points of the footwear issued, including areas for improvement, the features of footwear most important to the staff, and the conditions under which the footwear was used. Shoe 1 was clearly the most popular type of footwear, with staff generally agreeing that it was very comfortable but that the soles became smooth and slippery very quickly and that the shoes were not waterproof. Trainers were generally found unsuitable for delivery. Some of the staff liked Shoe 2 because it was more durable, more waterproof and smarter in appearance than the other shoes. It had the major drawback of being uncomfortable and difficult to wear in. It was suggested that boots should be provided for winter wear.
Current footwear was considered satisfactory for processing work, but the safety footwear provided for distribution work was found to be uncomfortable and poorly styled. Half sizes and width fittings were suggested to improve fit. Important features of the footwear for delivery and distribution work were: good arch support, comfort, weatherproofing, durability, protection, good grip, flexibility, breathability, ankle support and style.
Review of Accident Statistics
1734 outdoor fall accidents reported by delivery postmen/women in the Midlands during 1993/4 were analysed, showing that 60% of the accidents occurred in snow or ice whilst walking on flat ground (Bentley et al, 1995). In nearly 50% of cases the fall-initiating event was a slip. Of the slips, over 60% occurred on ice or snow. The ankle was the most common part of the body to be injured. A survey of footwear revealed that the tread was completely or virtually worn down on the majority of shoes within 2 months. Similar results were found by analysis of 154 outdoor fall accidents in London during 1993/5. The main difference was a higher proportion of accidents in wet conditions and fewer accidents related to ice and snow. This may well be due to a difference in the weather conditions encountered. The conclusions of this study were:
• The footwear issued was inappropriate for the work, particularly for the winter conditions.
• The grip was inadequate and the tread wore rapidly, particularly in the training shoes.
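To give a sense of scale, the percentages reported for the Midlands data can be converted into approximate accident counts. The sketch below is a rough worked calculation only: the source gives "nearly 50%" and "over 60%", so the first figure is treated as an upper bound and the second as a lower bound, and the helper name is hypothetical.

```python
# Sketch only: rough counts implied by the percentages quoted in the
# text for the 1734 Midlands fall accidents. The shares are approximate
# in the source ("nearly 50%", "over 60%"), so the results are
# indicative rather than exact. The helper name is an assumption.


def approx_count(total: int, share: float) -> int:
    """Return the nearest whole number of cases for a given share."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return round(total * share)


TOTAL_FALLS = 1734          # reported outdoor falls, Midlands, 1993/4
SLIP_SHARE = 0.50           # "nearly 50%" of falls began with a slip
ICE_SNOW_SLIP_SHARE = 0.60  # "over 60%" of slips were on ice or snow

slips = approx_count(TOTAL_FALLS, SLIP_SHARE)
slips_on_ice_or_snow = approx_count(slips, ICE_SNOW_SLIP_SHARE)

if __name__ == "__main__":
    print("slip-initiated falls (at most about):", slips)
    print("of which on ice or snow (roughly, at least):", slips_on_ice_or_snow)
```

On these assumptions, somewhat fewer than 870 of the falls began with a slip, and in the region of 500 of those slips occurred on ice or snow, which is consistent with the emphasis on winter grip in the conclusions above.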
Trials of Footwear by Delivery Staff
As a result of the questionnaire survey and workshops, a three month trial of winter boots and shoes was undertaken involving 198 participants. The trial took place between December 1995 and March 1996 at four offices. These offices are representative of urban/rural localities, differing terrains and delivery types. The footwear was used on foot, cycle and driving deliveries. Four types of footwear were tested using two sole compositions, in a boot and a shoe style. The two sole types had a deep tread and were constructed from double-density polyurethane; one had a rubber tread. The four footwear types were evenly distributed amongst participants. Questionnaires were used at the beginning and end of the trial to obtain feedback on the performance of the footwear for the following features: fit for purpose, comfort, grip, fit, support, weight, flexibility, cushioning, style, waterproofing, breathability, warmth, protection and durability. Information on the conditions under which the footwear was worn was also sought. Interviews took place at the end of the trial so that more detail could be obtained, and comments were invited. As before, a five point scale from very poor to very good was used on the questionnaire. All of the footwear types were worn in dry, wet, icy and snowy conditions. All of the footwear catered adequately for the width fittings encountered. The trial footwear was seen as an improvement over the issue footwear, scoring at least ok for all features and good or very good for most. Grip in all conditions was good, as was waterproofing. Slip, trip and fall accidents were lower in the trial offices than in previous years, but this followed a downward trend already occurring and would have been influenced by other initiatives that had also been introduced.
Discussion
The main findings of the study showed that the footwear supplied was considered suitable for indoor work but was unsuitable for outdoor work, particularly in the winter, despite meeting the requirements of the European Occupational Footwear standard. Inadequate grip and poor durability were the main problems; comfort and styling were also issues, particularly for safety footwear. Trials of winter footwear were successful, showing an improvement in grip and waterproofing.
Slip resistance
The slip resistance of footwear will influence the likelihood of accidents. This is dependent on a number of factors, e.g. sole material, tread pattern, temperature, floor surface and lubrication. Of the two main footwear types worn outdoors, one had a shallow tread pattern that became smooth very quickly and the other had a PVC sole. PVC wears quickly and becomes very hard in cold conditions, reducing its grip. Icy conditions tend to result in poor grip even for the best soles. Tests by Bruce et al (1986) found the worst solings to have only half the grip of the best.
Comfort
Comfort of shoes is related to factors such as fit, shock absorption, flexibility and breathability. A single range of shoes cannot be expected to fit everyone, although some clearly fit a wider range of people than others. Comfort can be greatly increased by the inclusion of a resilient foam insole that will increase the
shock absorbency, provide support and absorb sweat. The American Podiatry Association recommend a polyurethane foam insole 3mm thick (La Fortune et al 1992). In order to provide a satisfactory fit for more than 50% of the population, footwear should be issued in half sizes. This will increase the fit to 70%; offering width fittings or different depth footbeds would increase the fitting range to 80% (Wilson, 1990).
Conclusions
Footwear provided to postal workers was found to be suitable for indoor work but had inadequate durability, grip and waterproofing for outdoor work, despite meeting the European Occupational Footwear standard, prEN 347. Safety footwear lacked style and was considered uncomfortable by many employees. As a result of the findings of the study, functional specifications based on user requirements were produced for the Post Office by the Shoe and Associated Trades Research Association (SATRA). Durability, grip in cold, wet and dry conditions and a comfort assessment are important parts of the specification. Separate specifications were produced for indoor work, delivery shoes, winter boots and safety shoes and boots. The footwear is being supplied in half sizes and width fittings. The footwear is currently being distributed to staff on a rolling programme and its effectiveness will be reviewed by a national comparison of accident statistics before and after introduction, by user opinion and by monitoring wear.
Acknowledgements
We would like to thank the postmen and postwomen who took part in the workshops and the trials and the other members of the study team for their contributions.
References
Audemars, P. 1978, Underfoot Cushioning in Working Footwear. Protection 15 9 11–14
Bentley, T.A. and Haslam, R.A. 1995, Slip, trip and fall accidents occurring during the delivery of mail. Internal report of Post Office.
Bruce, M. et al 1986, Slip Resistance on Icy Surfaces. Journal of Occupational Accidents 7 4 273–283
Bruce, M. and Jones, C. 1984, Many a Slip. Occupational Health & Safety. 14 3 16–21.
Gronquist, R.G. and Hironen, M. 1994, Pedestrian Safety on Icy Surfaces. Advances in Industrial Ergonomics and Safety. Taylor and Francis 315
La Fortune, M.A. and Henning, E.M. 1992, Cushioning Properties of Footwear During Walking. Clinical Biomechanics 7 181–184.
Pfauth, M.J. and Miller, J.M. 1976, Work Surface Friction Coefficients. Journal of Safety Research 8 2 77–89
Strandberg, L. 1980, The Mechanics of Slipping Accidents. Ergonomics in Action Conference Proceedings 20–22
Tisserand, M. 1985, Prevention of Falls Caused by Slipping. Ergonomics 28 7 1027–1042.
Wilson, M.P. 1990, SATRA Slip Test. Slips Stumbles and Falls. B Everett Gray ASTM 113
Wilson, M.P. and Perkins, P.J. 1985, Evaluation of Slip Resistance Tests for Shoes. Ergonomics 28 7 1080
WHAT DO BRITISH SOLDIERS WANT FROM THEIR GLOVES? Deana McDonagh-Philp & George E.Torrens Hand Performance Research Group, Department of Design and Technology, Loughborough University of Technology, Loughborough Leicestershire, LE11 3TU
A pilot study was undertaken to find out what British Army infantry soldiers want from their current-issue personal protective equipment (PPE) combat gloves, CS95. It had previously been found that evidence to support the product design specification of a new glove is locked within tacit knowledge (personal experience) and practice-based programmes of military training (i.e. instructor demonstration). The soldiers mainly use CS95 gloves only when cold in a cold environment or on sentry duty, and found that the gloves did not dry out as quickly as required. The consensus of the two sample groups interviewed (n=16 and n=28) was that the CS95 gloves restricted movement, and the soldiers preferred not to wear them when firing weapons or undertaking fine manipulative tasks. The results indicate that more care is required when matching issued gloves to the soldier and that glove thickness should be reduced.

Introduction

This paper discusses a pilot study undertaken to find out what the needs and aspirations of infantry soldiers are for their current-issue personal protective equipment (PPE) handwear, combat gloves CS95. The authors had already found the available evidence to support the product design specification of a new glove to be locked within tacit knowledge. This knowledge was in the form of personal experience in the field and practice-based programmes of military training (i.e. instructor demonstration, video and practice sessions). The techniques used to access relevant information from soldiers (such as a soldier’s daily activities, the frequency of specific tasks relating to wearing gloves, and the equipment used when wearing gloves) were focus group activities and interviews. This study is part of a larger programme of work sponsored by the Defence Clothing and Textiles Agency to develop PPE handwear with improved dexterity and haptic performance over the current issue.

Two groups of eight soldiers were interviewed. One group consisted of eight Loughborough University Officer Training Corps (UOTC) undergraduate and postgraduate male students, aged 18–22 years. The second group consisted of eight serving male and female infanteers from a range of British Army regiments, aged 19–38 years. The infanteers were all soldiers under sentence (SUS), i.e. military prisoners. The UOTC group was interviewed in the Department of Design and Technology, Loughborough. The SUS group was interviewed in the Correction Facility, Colchester. Both sessions were undertaken in the autumn of 1999. Questionnaires were used to validate some of the issues highlighted in the focus group sessions. The questionnaires were given to 15 serving soldiers of the Highland Regiment stationed at the barracks in Colchester and 16 UOTC students at Loughborough University.
Method

Focus group methodology does not offer a representative view of the population. It offers an insight into particular user experience, increasing the designer’s awareness and so supporting a more efficient design process (Krueger, 1988). The method relies upon participants interacting and generating a synergetic effect (Kitzinger, 1994). The focus groups concentrated on eliciting qualitative data. Two discussion groups took place, each involving 8 soldiers, and were of one hour and 90 minutes’ duration. A moderator, scribe and research assistant were also present.

The focus groups were recorded using camcorders, cameras and audiotape, and notes were taken during the sessions by a scribe and moderator. Recording the discussions enables evaluation of the activity, especially of the relationship between the verbal and non-verbal communication. It also allows further triangulation by making it possible for other researchers to evaluate the activities. Two video cameras were set to record just before the arrival of the subjects and left running for the duration of the session. Photographs were taken during the discussion to record issues highlighted that were difficult to describe verbally. Though the aim was to access qualitative data, brief questionnaires provided the opportunity to collect quantitative data as well.

The guiding objectives for this activity were:
• to elicit user perceptions of current PPE gloves
• to become familiar with user experience, language and forms of expression
• to gain experience in managing group discussion with this type of user

The outline of each session was discussed with the subjects and their written consent obtained before commencing the session. Discussion was initiated to encourage the perception of an informal and non-judgemental environment. The choice of subject seating arrangements was based upon the senior moderator’s experience of balancing the group dynamic to enable all subjects to contribute to the discussion. A senior member of personnel was present during the SUS focus groups.

The subjects were asked to consider the following:
• which duties and tasks are affected, or perceived to be affected, by the use of current PPE handwear
• suggestions for improving the PPE handwear

During this period the subjects were given examples of the current issue combat gloves to help them describe, in detail, the issues relating to impairment of their performance. Scribes, and one of the moderators using digital photographs, recorded these actions and descriptions. A summary of the issues and concerns raised during the focus group activity is given in Table 1.

The questionnaire was distributed to the two sample groups. The content focused upon tasks that are commonly experienced by soldiers. The task list was drawn from the experience of the authors and of military and scientific personnel within the Defence Clothing and Textiles Agency. The format of the questionnaire was a series of tick boxes on a Likert scale to elicit frequency of task (1=always, 5=never) and level of difficulty (1=very easy, 5=very difficult). There were 39 tasks identified. The questions related to two aspects of the use of PPE handwear: (i) how often do they use their combat gloves within the identified tasks, and (ii) how difficult is that task to perform wearing combat gloves?
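The two Likert ratings described above lend themselves to a simple prioritised ranking of tasks. The following is a minimal sketch of one way such a ranking could be computed, not the authors' analysis: the task names, the ratings, and the function names `summarise` and `prioritise` are all illustrative assumptions, and the use of mean scores is a common convention rather than something the paper states.

```python
from statistics import mean

# Hypothetical ratings, NOT data from the study.
# Each task maps to a list of (frequency, difficulty) pairs, one per respondent.
# Frequency:  1 = always wear gloves ... 5 = never wear gloves.
# Difficulty: 1 = very easy ... 5 = very difficult.
responses = {
    "sentry duty": [(1, 4), (2, 5), (1, 4)],
    "rifle handling": [(2, 5), (2, 4), (3, 5)],
    "map reading": [(4, 2), (5, 3), (4, 2)],
}


def summarise(ratings_by_task):
    """Return (task, mean frequency score, mean difficulty score) per task."""
    summary = []
    for task, ratings in ratings_by_task.items():
        freq = mean(f for f, _ in ratings)
        diff = mean(d for _, d in ratings)
        summary.append((task, freq, diff))
    return summary


def prioritise(summary):
    """Order tasks so that those where gloves are worn most often come first
    (lowest frequency score), breaking ties by highest difficulty."""
    return sorted(summary, key=lambda row: (row[1], -row[2]))


ranked = prioritise(summarise(responses))
for task, freq, diff in ranked:
    print(f"{task}: frequency {freq:.2f}, difficulty {diff:.2f}")
```

Because 1 means "always" on the frequency scale but "very easy" on the difficulty scale, a low frequency score and a high difficulty score together mark a task as a priority; the sort key reflects that asymmetry.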
Results

The following tables represent a summary of the outcomes from the focus group activities with both groups and the questionnaire feedback.

Table 1. Summary of issues raised during focus group activity
Table 2. Suggestions put forward during focus group activities
Table 3. Prioritised list of the frequency of wearing PPE handwear and level of difficulty in performing identified tasks whilst wearing PPE handwear
Table 2 highlights the soldiers’ recognised needs and aspirations for PPE handwear. Table 3 represents the questionnaire feedback from 28 subjects. Due to time constraints only 13 subjects from the Highland Regiment completed the questionnaire.

Discussion and conclusions

When considering what British soldiers want from their gloves, Table 3 highlights the tasks where combat gloves are most frequently used. What is interesting is that the tasks found to be most difficult to perform when wearing gloves are very similar to those where gloves are worn, i.e. sentry and patrol duty, and rifle and ammunition handling. The table also clearly identifies manipulative tasks involving pinch-based grips as the most difficult actions to perform, including those involving ammunition pouches and communication devices. This is supported by Table 1, which identifies poor fitting of the fingertip within the finger of the glove as an important factor in task impairment. Critical tasks such as weapons and munitions handling feature high in both frequency of wearing gloves and level of difficulty in performing tasks when wearing gloves. When asked for solutions to the problems associated with PPE handwear, soldiers suggested options that mainly related to fit, weight, thermal insulation and the ability to customise the gloves (e.g. take out the fingers from the glove).
The authors found during this pilot study that applying focus groups to a military sample group highlighted the bias of the groups towards consensus seeking. This affects the discussion in that the group’s needs are constantly expressed over the individual’s needs. This group dynamic makes it difficult to go beyond generalised consensus. The aim of focus groups is to encourage the expression of individual needs and aspirations; it is therefore worthwhile considering further studies that make use of single-gender groups (male or female) whose members have met previously. The presence of senior personnel further dampened the discussion.

Based on the outcomes of this study, the questionnaire can be refined and made easier to complete within a shorter time scale (i.e. 15 minutes), reflecting the often-brief opportunity soldiers have during their hectic daily schedule to complete such forms. The findings justify further research and development into optimising hand and object interaction involving gloves. Although military personnel were interviewed within this study, many of the issues raised are applicable to other professions where PPE handwear is worn (e.g. emergency services, fire and police).

References

Kitzinger, J. 1994, The methodology of focus groups: The importance of interaction between research participants. Sociology of Health and Illness 16, 103–21.
Krueger, J. 1988, Focus Groups: A Practical Guide for Applied Research, (Sage: London).
Product & workplace design
CONTEMPORARY TRENDS AND PRODUCT DESIGN Patrick W.Jordan Manager Human Factors/Manager Aesthetic Trends Group, Philips Design Building W, P.O. Box 225, Damsterdiep 267, 9700 AE Groningen The Netherlands
Eight contemporary trends, likely to have a major influence between 2000 and 2005, are described. Not all of the trends will affect all of society; indeed, some of the trends run counter to one another. Nevertheless, it is expected that each will affect enough people to be significant. The trends have been identified on the basis of professional judgement and cross-referenced against the predictions of other trends specialists¹. Possible consequences of each of these trends for design are described in the context of their wider implications for commerce, manufacturing and society as a whole.

Feminization

It is predicted that by the end of the year 2005 four out of ten US businesses will be run by women. This is indicative of the increasing influence that women are having in all areas of life, from business to politics and from sport to entertainment: a ‘feminization’ of society. In the workplace this trend emphasizes cooperation and good relationships between colleagues. Firms which promote a relaxed, informal way of working will flourish, whilst those which work according to strict hierarchical command structures will find it increasingly difficult to hold on to their most capable employees.

This trend is having a strong effect on male lifestyles. Increasingly, men are rejecting the stereotypical male role. In the early days of the contemporary feminist movement much lip-service was paid to the idea of men sharing in child rearing and homemaking. Now this is becoming a reality, with many major Western firms offering paternity leave to their male employees. Men are also starting to pay a lot more attention to their bodies. This is reflected in the success of cosmetics designed especially for men and in the sharp increase in the number of men visiting health clubs and fitness centers. Sadly, it is also reflected in the appearance of traditionally ‘female’ illnesses, such as anorexia and bulimia, amongst young men.
An early example of a design whose success reflects the feminization trend is the Philips ‘Billy’ bar blender. The styling of this product brings postmodern irony to a product which is designed to be used in a traditionally ‘feminine’ domain—the kitchen. The fun styling has proved popular with a new generation of men and women who enjoy working in the kitchen but don’t take their homemaking tasks too seriously. The product aesthetics are a humorous challenge to the idea of the housewife slaving away using kitchen tools.
¹ For example Faith Popcorn, Trends Research Institute, Brand Futures Group, Megatrends, Studio Edelkort.
Hedonism

The hedonism trend is appearing partly as a backlash to the health conscious and ‘correctness’ trends of the 1990s. Many people are getting fed up with being told what they can and can’t eat, what they can and can’t say, and about the sort of entertainment that they can and can’t enjoy. Nevertheless, people have also understood many of the positive benefits that the health and correctness trends brought with them. Hedonism is about guilt-free indulgence: not necessarily as a whole way of life, but as a treat, a special moment of self-pampering. So, whilst people may understand the benefit of a healthy diet, they may also enjoy special treats, such as rich chocolates, fine cigars or a good bottle of wine.

Another reflection of this trend is ‘growing old disgracefully’. Many of the baby boomer generation, now in their sixties, have retired and are spending their (often considerable) savings on having a good time. For example, sales of sports cars and motorbikes have increased sharply amongst this age group.

Aesthetic aspects of products may become increasingly important as a result of this trend. People will want designs that radiate quality through and through. In particular, this is likely to have an influence on the materials used in design. For example, there may be a move away from plastics towards ‘noble’ materials such as woods and metals. The compact and beautifully designed Canon Elph photo-camera is a good early example of a design in tune with this trend.

Spirituality

Spirituality is a post-materialist trend. It reflects a desire to rise above merely consuming to experiencing. During the materialistic 1980s many people thoroughly enjoyed conspicuous consumption. Being the envy of the neighbors was quite the thing. Spirituality is more about loving your neighbor. The consumer boom may have brought prosperity into people’s lives, but it hasn’t necessarily brought meaning. Spirituality is a search for that meaning.
Perhaps the most obvious reflection of this trend is the increasing influence of religion, both in the West and the East. Nine out of ten Americans regard religion as important and seven out of ten pray every day. People are also looking beyond the Judeo-Christian traditions to the mystical religions of the East.

However, this search for meaning is also reflected in other ways. Increasingly, when making purchase choices, people are considering not only the quality, but also the ethics and behavior of the company supplying the product or service. People are becoming increasingly sophisticated in their approach to purchase choices, and companies which continue to feed their potential customers on a diet of mealy-mouthed hyperbole will soon find that their customers start to look elsewhere.

In terms of consequences for design, products that carry ‘meaning’, or onto which meaning can be projected, will be appreciated. Loud aesthetics, those which ‘scream out’ about the product’s functionality or monetary value, will give way to quieter, more restful aesthetics, helping to make the home a peaceful visual landscape.

Spirituality is also about doing things well and doing them simply. Single function products which perform this function excellently will be appreciated. People will be prepared to pay a lot for a product provided that they can be sure that it will perform well for a long time. Global knives are a contemporary example of single function products which have been designed to the highest standards. They are made from a single metal extrusion, the weight of the handle balancing the blade. The surface of the handle has been textured to give a good grip and the blade has been hardened and is very sharp. The result is a simple, high quality product which performs one function excellently.
Downsizing

People have been getting busier and busier, and they are growing sick and tired of it! Stress levels are increasing and people are starting to turn their backs on the rat-race. When asked whether they would rather have more money or more free time, over half of Americans say that they would choose the free time. People are increasingly choosing to work at home, taking advantage of the opportunities provided by information technology, in particular the internet. Another reflection of this trend is the move out of the cities and into the countryside. Over four million Americans have left the cities for the countryside in the last four years and the trend looks set to continue.

The leisure industry can hope to benefit from this trend. As people make more free time for themselves, they will look for exciting, fun or relaxing things to do alone or with their friends and families.

An aspect of downsizing which has implications for design is the blurring of the distinction between the home and the workplace: increasingly, the workplace is in the home. Even when people do go to another place to work, they may enjoy workplaces that are more than merely professional environments. Creating a cozy or fun atmosphere is appreciated. A result of this may be a blurring of the distinctions between the aesthetics of ‘professional’ products and the aesthetics of ‘household’ products. The use of colors and materials on the iMac computer is an early example of a professional product with a fun aesthetic.

Tribalism

It is often said that we are living in a ‘global village’. The internet and cheap air travel are the prime movers behind this trend. So are military and political developments which have led to the increasing Americanization of the world and the increased integration of Europe, arguably at the expense of national identity. The main symptom of tribalism is the search for membership of groups that give a feeling of collective identity.
For some this search for identity takes the form of joining groups of like-minded people, for example through groups dedicated to common interests such as sport, music, culture or politics. Increasingly, such groups are being facilitated by the internet. For others the search has taken the form of a reassertion of national identity. An example of this within popular culture can be seen within music. A few years ago, the European music charts were totally dominated by songs sung in English. Recently, however, there has been an upsurge in the fortunes of bands who sing in their own national language, many of whom are scoring hits in their national charts.

Another side of tribalism is fusion. Fusion is about understanding other cultures and mixing and matching the best of these with the best of the domestic culture. This trend has already been noticeable for a number of years in Eastern cultures. In countries such as Hong Kong and Singapore, people will dress in the Western style and fill their houses with the latest Western gadgetry. However, many will eat in restaurants serving superb Asian food and work in companies which embody the values and practices of the Asian work ethic.

The increasing importance of branding may be seen as a reflection of tribalism. If a company is strongly branded, then buying a product from this company can indicate a sense of belonging, an identification with the values promoted by the brand image. Design can play a major role in establishing a brand identity. For example, the Apple Macintosh range of computers has a number of common design elements, notably the look and feel of the interface and the use of sound, that help to create a strong identity. Macintosh has gained a strong and loyal following, particularly amongst people in ‘creative’ professions, such as design. For many years such people have associated Macintosh with fun, creativity and user-friendliness. They may see being a Mac user as reinforcing their own regard for these values.

Fear

There is an increasing mistrust of governments and large corporations. One area in which this trend shows up is the food industry. For example, the British government’s handling of the ‘mad cow disease’ epidemic sowed the seeds of mistrust amongst many British consumers. People felt betrayed and misled. This is now having an influence in the context of scares over genetically modified foods. Once again, the government, albeit of a different political hue, is trying to convince people that there is nothing to worry about, but now these reassurances are falling on deaf ears: a case of once bitten, twice shy.

A more extreme symptom of this trend, one that is particularly prevalent in the USA, is the rise of anti-government militia groups. Many of these groups fear that the government is plotting to undermine the rights of individuals. Because of this they believe that it is important to be ready for armed struggle against their own national leaders. The Waco tragedy and the Atlanta bombing are two examples of the potentially horrific consequences of this trend.

Technofear is also on the rise. For example, as the new millennium approached many people became worried about the effects of the millennium bug, believing that it could be potentially catastrophic. One reflection of this was the huge increase in the sale of tinned foods during 1999. People were concerned that difficulties in the transportation and storage of food might lead to severe food shortages, and insured against this by stockpiling tinned foods in their homes.

A consequence of this trend is the need for manufacturers to create products that are honest and responsible, and which are seen to be honest and responsible, in order to win back trust from their customers.
This means, for example, that the product aesthetics should be straightforward, revealing how the product is constructed, what it does and how it works. Environmental responsibility and sustainability, both in terms of materials and manufacturing processes, are also important here. An example of a product whose design fits with many of these criteria is the Dyson vacuum cleaner. Its design reveals the way in which the product works, thus enhancing people’s understanding of the product. Because no dustbag is required, the product is seen as being environmentally friendly. Because Dyson is a relatively small manufacturer, it may avoid some of the mistrust sometimes associated with its multinational competitors.

Staying Alive

Whilst the hedonism trend represents something of a backlash against the health concerns of the 1990s, the staying alive trend might be seen as a legacy of these concerns. This trend reflects people’s desire to live long and healthy lives and the belief that particular ways of living can help in achieving this.

People are paying more and more attention to what they eat and drink. One manifestation of this is the increasing number of food manufacturers printing nutritional information on food labeling. People are also exercising more, particularly the middle-aged. Health clubs and fitness centers have reaped the benefits of this trend, with membership of fitness clubs increasing by 64% for middle-aged Americans. Use of alternative medicines is another manifestation of this trend, with a sharp rise in the sale of homeopathic remedies over the last five or six years.

An effect of this trend may be a move towards a ‘sub-medical’ aesthetic for products whose use may be associated with health consequences. This has already been seen, for example, in the case of vacuum cleaners with dust filters. Miele manufacture a vacuum cleaner colored white with a green cross. This emphasizes the health benefits of filtered air to the user.

Individuality

This is about people’s desire to assert their individuality in an increasingly impersonal world. The relentless drive towards computerization of services over the last few years has left many people with the impression of being just a number, processed rather than served. Service institutions that show an understanding of their customers’ individual needs will flourish in the coming years.

One way in which people are asserting their individuality is through fashion. People are increasingly mixing and matching in order to develop their own style. People may wear an expensive Rolex watch along with a cheap pair of sneakers and, more and more, are refusing to be dictated to by designers and fashion gurus. This suggests that manufacturers will have to offer customers a wider range of styles more sharply focussed on the tastes and lifestyles of different people or groups. Manufacturing technology is making it increasingly feasible to produce products in comparatively small runs at a reasonable cost.

Another possible response to the trend is to give people the chance to personalize products. For many years motorists have been offered a series of optional extras, color choices etc., when choosing a new car. However, Mercedes and Swatch have taken this a step further with the Smart Car. For example, owners can alter the appearance of the car by swapping the external panels.
SENSORY ENCOUNTER: THE CODIFICATION OF ‘SOFT’ QUALITIES Alastair S.MacDonald Course Leader, Product Design Engineering, Glasgow School of Art, 167 Renfrew Street, Glasgow G3 6RQ, Scotland
Dreyfuss’ Environmental Tolerance Zone (ETZ) set the paradigm for ergonomists’ approach to defining environmental stressors for the body and its senses, using units of measurement that describe physical (or ‘hard’) phenomena. During the process of ‘sensorial encounter’ with a product there is a complex layering of innate, individual, and cultural responses, which reveals that we are sensitive to other ‘soft’ phenomena, the description of which has been more elusive. This paper proposes an approach to designing which extends the range of criteria ergonomists currently acknowledge, to articulate values given to qualities perceived through the senses. By using a ‘scenario of sensory encounter’ method, one can acknowledge qualities in products with which customers may feel a greater measure of empathy.

Environmental tolerance zone (ETZ)

In human factors engineering, the impact of the environment on the senses has played a key role in determining safe or optimum human operating conditions. Dreyfuss’ (1966) Environmental Tolerance Zone (ETZ) set the paradigm for quantifying the robustness of the body and its senses measured on a ‘comfort-to-tolerance’ scale. His 1966 diagram shows a schematic individual within two concentric circles delineating the comfort and tolerance zones for a range of environmental factors. He describes the first circle as the ‘bearable zone limit’ and the band between the two circles as indicating the zone from the ‘comfort to the tolerance limit’. ‘Outside this limit great discomfort or physiological harm is encountered’. The units of measurement used to discuss these environmental factors are quantitative: lumens, parts per million, degrees Fahrenheit, pounds per square inch, and decibels. The ETZ crucially alerts us to environmental factors which may, at best, reduce our efficiency in the workplace or, at worst, cause us harm or even death.
Dreyfuss’ schematic model has been adopted, extended and refined, for instance by Finch and Stedmon (1998), who have developed a taxonomy of stressors in response to the complexities of stress in the operational military environment. There now exists a substantial body of knowledge and legislation to ensure that optimum environmental conditions prevail and safeguard us from physical stress and harm.

The scenario of sensory encounter

If one were driving a car, the level of noise from the exhaust, the temperature and humidity of the interior, the level of vibration as the car travels along the road surface, and the glare of light entering the driver’s field of vision or from the instrument panel would be embraced by Dreyfuss’ model. When one considers the following scenario, however, the ETZ model appears lacking in its description of the engagement of the senses.

Approach a car and your initial impression, formed visually, will be either attraction, indifference or dislike. As you open its door, your tactile and auditory senses come into play. You are subconsciously judging its weight and quality by feel and sound: does the door-hinge feel secure, and does the door catch make the right sound as it closes? As you sit comfortably, or uncomfortably, in the car there is the smell of the material: of leather, or is it a sharper smell of leatherette? Finally, as you drive away there is the sensation of acceleration, of how the car handles, and the sound of the exhaust.

What has just occurred is a process of ‘sensory encounter’, a process in which value judgments are made, consciously or subconsciously, about information perceived through the senses. It is interesting to note in more detail just what is happening during this process.

A map of the senses

‘Man has no Body distinct from his Soul; for that called Body is a portion of Soul discerned by the five senses, the chief inlets of Soul in this age.’ (Blake c. 1790).

Penfield’s ‘homunculus’ (Blakemore 1976) helps us understand the relative importance of each of the senses and maps out the proportional representation of sensations obtained from different parts of the body onto the surface of the right cerebral cortex. It offers us a blueprint for designing for the senses: if one were a rabbit, a cat, or a monkey, the sensory ‘homunculus’ would reveal a peculiar predisposition in terms of the acuity of each sense, e.g. enlarged nose, ears, or eyes (Kandal 1991) (Figure 1). In a culture which presumes that vision is the largely dominant sense, it perhaps comes as a surprise that, in the human, the mouth and the thumb appear so large in this form of mapping.
To chart what is happening during ‘sensory encounter’, each of the senses is now explored in turn.
Figure 1. Penfield’s human homunculus compared to animal homunculi
Sight and empathy

The first sense to be engaged in our scenario is sight. Restak (1995) describes a four level cognitive process involving the visual and association cortexes during the process of visual perception. When one sees a product, one enters into a process of empathy (the attribution to an object of one’s own emotional or intellectual feelings about it), and Suri (1997) has given this issue more prominence recently. The reality of empathy, a process of personification, is evident if one analyses the language used to discuss the physical features of a car (Macdonald 1999a). The visual features may also embody codes, manifest as styles, which act as shorthand for sets of social or cultural values (Smets 1989).

Touch and shifts in cultural value

The next sense to engage with the car is touch. In European culture, there has been a distinct correlation between weight and value, where we have become used to associating the weight of a particular type of object with a particular value (Macdonald 1999). Whereas the value standard with reference to cars used to be ‘weight=strength+safety’, one’s perceptions require re-educating when shifts in material performance and manufacturing technology allow, for example, one car manufacturer to pose the question ‘how can light be strong?’ (Toyota 1999) and then to justify that its new model is strong due to a tough new body structure. This example serves as an illustration that a new cultural norm may be in the process of being established.

There is also the need to consider demographic issues: one generation may have its own set of associations and corresponding values for the material properties of products, distinct from another. The differing levels of acuity which accompany the ageing process (Pirkl 1994) will also condition one’s perception of a product. As Penfield’s homunculus reveals, tactile qualities have been largely undervalued in design, and this particular territory offers good opportunities for enhancing the design and attractiveness of products.

Hearing and smell: emotional cues

In our car scenario, we mentioned sound: the sound of the door opening, or the sound of the exhaust.
Recent research into the treatment of tinnitus has made great progress by recognising the ‘emotional label’ associated with every sound we hear and learn the meaning of, a label ‘which may change from time to time according to how we feel in ourselves and the context in which we hear it’ (Hazell 1999). This follows from the discovery that patterns of sound are detected by subconscious filters in the hearing pathways, and that the conditioned response triggers activity outside the auditory system, where there are large numbers of connections with the limbic system (concerned with emotion and learning). The design and engineering specification of a computer keyboard can result in either a pleasant or an irritating ambient clatter of keys in the office environment, despite alternatives being economically correct and possessing parity of functional performance. Kansei engineering recognises the importance of attractively tuning sound emissions in car exhausts (Nagamachi 1995).
In a similar vein, recent research at the Defence Evaluation and Research Agency (DERA 1999) has suggested that ‘odour cues’ might improve the recall of material learnt in a particular environment: ‘It is believed that memory will be significantly improved if there is a match between the context in which it is learnt and the context in which it is to be retrieved.’ Smell differs from the other senses in that it goes straight into the brain’s limbic system, which suggests that strong emotional connections are made with smells. There is a need to consider the importance of these emotional labels, which arise during the perception and cognition of products. While, on the one hand, these may be associated more with an individual’s particular experiences than with broadly held values, on the other there may be odours which are broadly attractive or repulsive through their associations.
Speed and acceleration
As we move away in our car, the rate of acceleration will affect the adrenaline levels in the body, increasing alertness and speed of reaction, a residual ‘fight or flight’ effect. Voluntary or involuntary movement, and experience of ‘G’ forces, will add to the thrill of the chase. Involuntary movement, caused by poor suspension jolting the body, will affect the ear’s balance mechanism and equilibrium. Responsive controls add to the pleasure.
Layers
As we have seen, during the process of ‘sensory encounter’ with the car there is a complex layering of intuitive or innate, individual and cultural correspondences (Hofstede 1991) with the object:
a) the way our brains and senses have evolved biologically and influence the particular meanings and values we give to information perceived through our senses;
b) personal experiences and associations which are the result of attaching an emotional label to a particular sensory event; and
c) social and cultural factors which create values shared with a broader sector of the population.
If these were tabulated, they would provide a useful matrix to be used as a checklist to map the complexity of considerations discussed in the above scenario (Table 1).
Table 1. Sensory encounter of a car: three levels of sensory response for each of the senses discussed in the car scenario
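The matrix described above can be sketched as a simple data structure. The following is a minimal illustration only, assuming the five sensory headings of the car scenario as rows and the three layers (a, b, c) as columns; the function names and the two example notes are hypothetical and are not the contents of Table 1.

```python
# Senses and response levels are taken from the text; everything else
# (function names, the example notes) is illustrative only.
SENSES = ["sight", "touch", "hearing", "smell", "speed and acceleration"]
LEVELS = ["innate", "personal", "cultural"]


def blank_checklist():
    """Return an empty senses-by-levels matrix of design notes."""
    return {sense: {level: [] for level in LEVELS} for sense in SENSES}


def add_note(checklist, sense, level, note):
    """Record a design consideration in one cell of the matrix."""
    checklist[sense][level].append(note)


def unaddressed(checklist):
    """List the (sense, level) cells that have no notes yet."""
    return [(s, l) for s in SENSES for l in LEVELS if not checklist[s][l]]


checklist = blank_checklist()
add_note(checklist, "touch", "cultural", "weight associated with strength and safety")
add_note(checklist, "hearing", "personal", "emotional label attached to the door sound")
print(len(unaddressed(checklist)), "cells still to consider")
```

Used in this way, the empty cells act as the checklist: they point to combinations of sense and layer that a design review has not yet considered.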
Conclusions
Relevance to ergonomists and product excellence
Dreyfuss’ ETZ provided us with a useful, though partial, model of the impact of the environment on our bodies and our senses. This now needs to be extended to embrace a range of ‘softer’ issues which influence our responses to products. The scenario approach to sensory encounter would allow a greater understanding of the role and importance of designing for each of the senses, for our ‘aesthetic intelligence’ (Macdonald 1999b). Some of the sensorial considerations for design, such as tactile qualities, have been undervalued and offer worthwhile areas for more in-depth consideration. This is especially important when considering, for example, the design of products for an ageing population, where the acuity of the senses changes as one ages (Pirkl 1994). The trend in industrial production is for companies to increase their range of products which are more tailored to people’s different needs and desires, products with which individuals can more easily empathise.
The ‘homunculus’ in Figure 1 gives us a visual reminder of the relative sensitivity of each of our senses. Table 1 helps to remind us of the ‘scenario of sensory encounter’ in which the role of each of the senses is discussed in terms of innate response, personal association, and cultural value judgments. If ergonomists used these two tools alongside Dreyfuss’ ETZ, they would be able to consider a far broader range of human factors with which to enhance the appropriateness and attractiveness of products, tasks, interfaces and environments.
REFERENCES
Blake, W. (c. 1790) The Marriage of Heaven and Hell, Oxford University Press, London 1975.
Blakemore, C. (1976) Mechanics of the mind, 79–80, BBC Reith lectures 1976, Cambridge University Press, Cambridge.
DERA (1999) website Nov 99 http://www.deva.gov.uk/html/news/devanews/smell.htm
Dreyfuss, H. (1967) The measure of man: human factors in design. Whitney Library of Design, New York.
Finch, I.M., and Stedmon, A.W. ‘The complexities of stress in the operational military environment.’ in Contemporary Ergonomics 1998, 388–392 ed Hanson, M.A. Taylor and Francis, London.
Hazell, J. (1999) Tinnitus retraining therapy based on the Jastreboff model, March 1999 http://www.ucl.ac.uk/~rmjp101/tin2.htm
Kandel, E.R. et al. (1991) The principles of neural science (Third ed) Elsevier, New York (illustrations of animal homunculi adapted from figures in Chap 26).
Hofstede, G. (1991) Cultures and organisations, McGraw-Hill International.
Macdonald, A.S. (1999a) ‘Aesthetic intelligence: a cultural tool’ in Contemporary Ergonomics 1999, 95–99 ed Hanson, M.A., Lovesey, E.J., & Robertson, S.A. Taylor and Francis, London.
Macdonald, A.S. (1999b) ‘Developing aesthetic intelligence as a cultural tool for engineering designers’ in Proceedings of the International Conference on Engineering Design (ICED) Munich, 1999, 297–300 eds Lindeman, U. et al. Technische Universität München, Munich.
Nagamachi, M. (1995) Kansei engineering: a new ergonomic consumer-orientated technology for consumer development. In International Journal of Industrial Ergonomics, eds M.Nagamachi and A.S.Imada, 15, 3–11.
Pirkl, J.J. (1994) Transgenerational design: products for an aging population, 41–51, Van Nostrand Reinhold, New York.
Restak, R. (1995) Brainscapes, 22–23, Hyperion, New York.
Smets, G.J.F. (1989) Perceptual meaning. Design Issues Vol. V, No 2.
Toyota (1999) advert for the Toyota Yaris.
USECUES IN THE DELFT DESIGN COURSE H.Kanis, M.J.Rooden & W.S.Green School of Industrial Design Engineering, Delft University of Technology, Jaffalaan 9, 2628 BX Delft, The Netherlands
For designers, featural and functional characteristics of a product are the obvious means to express its functionalities. Usage as intended in a design may be conceived as being mediated by product semantics or affordances. These concepts primarily involve scientific generalisations. Usecues are introduced as a pragmatic design tool. There is a brief discussion of experience with the notion of usecues in the Delft design course.
Introduction
In user-product interaction, user activities (perception, cognition, actions, effort) involve the formulation of meanings for featural and functional product characteristics. Apart from documentation on paper, these featural and functional product characteristics are the obvious means for designers to express (a) what functionalities a product has, i.e. what possibilities to support, protect, replace or extend human activities, and also, in as far as desirable, (b) how these functionalities can be activated. Observational studies (e.g. Kanis, 1998) show that, whatever the effort of designers, the intended communication frequently fails; that is, meanings in product characteristics preconceived by designers are not properly recognised by users. See Table 1 for an overview of reasons found in empirical studies why perception/cognition may be inadequate. In such studies, it has also been found that perception/cognition as intended in the design does not necessarily accommodate anticipated use actions: users may prefer their own way of operating or postpone the action at issue. Occasionally, use actions anticipated in a design are carried out smoothly without users noticing or understanding designed characteristics of a product (equally so in using a product for the first time). Obviously, better insight into the mediation between users and products by designed product characteristics could be of great help for designers in anticipating future usage.
This subject is considered by dealing with the possible role of the concepts of product semantics and affordances in describing user-product interaction, and by discussing the notion of usecues.
Product semantics
In terms of semantics, products often tend to be discussed as ‘wholes’, representing cultural, aesthetic or general functional values and information. In summarizing a study by Klöcker, Vihma (1995) points to the contrasting tendencies in product design: “the optimization towards a reduced and easily perceivable form
Table 1 Perceptions/cognitions by users different from intended by the design (source: Kanis, 1998)
and the informative tendency with various details added to the form.” (p. 35). “‘Messages’ for the user must be designed”, this author adds (p. 38) in discussing “semantics of product language”. Product semantics can be thought of as meanings associated with product characteristics, e.g. form, dimensions, colour, graphics, texture, transparency, fragility, grouping of product parts etc. In observational studies carried out at Delft (Kanis, op.cit.), graphics frequently occur as product semantics, indicating product functions (a above) rather than ways of usage (b); compare the references to icons and words in Table 1. In studies like Vihma’s, attribution of meaning tends to be discussed on a general, more theoretical level, rather than empirically, on the basis of the observation of user activities.
One way to think of the role of product semantics in design is in terms of information processing. A combination of product characteristics, supposed by designers to have a particular meaning, is encoded in a design, subsequently to be decoded by users. In this view, communication primarily consists of the exchange of mental representations in a perceptive/cognitive process, which somehow thrives on experience and learning. Vihma points to the alternative of ‘self-explanation’ of a product, exhibiting its practical function in relation to a user (p. 39). Comparing this to the absence of cognitive mediation between users and products referred to in the introduction leads towards the concept of affordance.
Affordances
This notion denotes self-evident environmental possibilities/opportunities for living organisms (animals, humans) in being supported, protected or threatened, whilst these possibilities/opportunities are directly perceivable on the scale of an organism involved, i.e. by a direct coupling between this organism and its environment in [… acting perceiving acting …], without a specified mental mediation. A key characteristic of the concept of affordance as introduced by Gibson is its simultaneous foothold in the agent
as well as in the environment. To some extent, this ‘linking character’ appears to ‘explain away’ interaction between constituents which are distinguished as separated entities, i.e. as agent and environment in their own right. This may be one reason why it appears to be so difficult to come up with elucidating examples of affordances. Another reason seems the claimed self-evidence of the direct coupling between agent and environment, working out in smooth, automatic human-environment behaviour, that is: in essence pre- or non-linguistic. Compare new terms such as ‘walk-onableness’ of surfaces, or ‘sit-onableness’ of chairs, which often feature in attempts to clarify what an affordance is. A third reason for the absence of good examples may be the evolutionary character of the notion, addressing general human behaviour in natural environments, rather than activities of users interacting with artifacts. Why would a concept, which is operationally so evasive, have become so popular, at least in some design circles, see e.g. Amant (1999) and Norman (1999)? Is it the claim that affordances specify actions (e.g. Michaels & Carello, 1980)? For sure, a notion shedding light on the diversity of users’ actions and reducing their unpredictability would be of great help for designers. Affordances as such tend not to be seen as sufficient for this job, since usually human characteristics, in terms of effectivities or capabilities (Michaels & Carello, op.cit.), are resorted to in order to ‘co-explain’ variety in user activities. However, our studies have shown (Kanis, op.cit.) that the relevance of human limitations and capacities is largely constrained to setting boundary conditions. How users act within these boundaries may have little to do with their limitations and capacities (cf. Green et al., 1997). 
Whatever the reason for its attractiveness, the term affordance has surmounted its questionable conceptualisation, not unexpectedly by being given alternative interpretations. Norman (op.cit.) complains about the misuse of the term in the graphical world, for featural characteristics such as icons on products; that is, on a one-sided basis with the agent (user) unrecognised. An extreme and opposite interpretation is given by Vera & Simon (1993), who view affordances as “carefully and simply encoded internal representations of complex configurations of external objects,…” (p. 41). This appears some way off Gibson’s mark.
Usecues
The origin of both product semantics and affordances primarily involves theoretical concepts expressed as scientific generalisations, rather than pragmatic notions as design tools. The popularity of the notion of affordance in particular, despite its resistance to operational demonstration, suggests that designers could do with a conceptual anchor to monitor ongoing design efforts in terms of possible future user activities. The term ‘usecue’, introduced in the Delft curriculum some years ago, seems to work in this way for industrial design students. Usecues are conceived as meanings, given to product characteristics, in terms of what functionalities a product has (see a above) and how these possibilities can be activated (b). Usecues involve primarily a pragmatic, bottom-up notion, rather than departing ‘top-down’ from cognitive, ecological or other ‘fundamental’ processes. Usecues can be seen to resemble what Vihma calls ‘indices’ (op.cit., p. 114). Whether conceived as product semantics or as affordances, usecues are more ‘down to earth’: the actual ‘voice’ of a product in practice in terms of its functionalities; see Figure 1 for an example.
The following featural characteristics can be indicated as presumed usecues: contrasting colours of the controls; the on/off sign; the terms ‘auto’, ‘speed setting’; the graphics ‘1’, ‘2’ and ‘turbo’ under the leds as a scale (whatever ‘turbo’ may mean); the position of the leds above the controls, which simultaneously may be a source of confusion since the scale is addressed by two controls. Functional characteristics as presumed usecues may be the highlighting of the leds, and the blinking of led 2, followed by its switching to steady green or red.
Figure 1 Part of the control panel of an aircleaner
As can be seen in this example, the identification of presumed usecues goes along with the indication of possible deficiencies and flaws, e.g. the meaning of ‘turbo’, the scaled leds addressed by two controls, and the meaning of the 3 min. blinking of led 2. User trials usually make very clear that usecues should not be seen in a positivistic way, as radiated messages just waiting to be discovered and understood by the user. There is no ‘the user’. There are many users, known to vary greatly in their perceptual and cognitive processes, dependent upon expectations and experience in different situations. In this respect, the designed ‘voices’ can best be seen as opportunities, to be realised conditionally in relation to the individual, situated predispositions of people. Users may have already learned the messages from these ‘voices’, may accept them as naturally self-evident, may be unaware of some or all of the processing, may be unknowingly guided by a fortuitous combination of the message and circumstance, or may actually recognise and accept the conscious attempt at guidance. The notion of usecues is meant not to be burdened by theory-focused dialectic which has no significance or impact on designers, hence this ‘new’ term.
Usecues in the Delft design course
What may make the term usecues attractive to industrial design students is its articulation of something from which designers cannot escape: the creation of featural and functional product characteristics which are or can be transmitters of messages (voices) for users. Even if only used in hindsight, thinking in terms of usecues has been found to facilitate the identification of possible deficiencies in a design underway, such as lacking ‘directions for use’ in a prototype, or ambiguous or misleading cues (compare Figure 1). The popularity of the term has its other side.
Once recognised, the articulation of usecues as possible meanings of design characteristics may sometimes trigger an unwarranted feeling among design students of being ‘in charge’ of directing usage, which is then no longer a complete gamble, since the design has been ‘usecued’! In extreme cases, any distinguishable featural or functional product characteristic is denoted as a designed usecue (in the case of a self-developed model/prototype), or a presumed usecue (in the case of an unknown design history). Then, the notion of usecues tends to degenerate into a panacea, and this degeneration may accompany and reinforce the misconception of a reductionistic user-product interaction: this cue for this, that one for something else. Such thinking in terms of isolated cues may give some indication of reasons why user activities differ from those anticipated in a design (see Table 1). This approach exhibits a bias against the way in which users may actually attribute meaning, namely contextually, rather than in a behaviouristic way with a product reduced to the sum of a series of distinct usecues, which may end up in misguided design remedies. A neat, ‘one-to-one’ picture is further blurred by the difficulty of delineating what is, and what is not, a usecue. Clearly, speaking of a usecue seems to be warranted when a particular
featural or functional characteristic results in the rejection of alternative ‘messages’ by a new design. However, such a ‘design story’ is no prerequisite. Users, in deciding what to do or not to do next, make sense of obvious characteristics, e.g. the sound a product makes when functioning, or that it has become warm. Such characteristics, which are usecues by definition, can be seen as passive incorporations: not obscured (as opposed to deliberately introduced) by designers, who may accommodate their designs implicitly to (presumed) current habits, practice, customs and cultural conventions. It appears inevitable that designers will give their own delineation of the term. Unacceptable as this may seem in a scientific context, there is nothing wrong with it, provided that usecues (designed, presumed) can be made explicit, i.e. capable of articulation, and can turn out to be a pragmatic and effective tool for drawing due attention to the consequences of all kinds of decisions made during the design process.
References
Amant, R.S. 1999, User Interface Affordances in a Planning Representation, Human-Computer Interaction, 14, 317–354
Green, W.S., Kanis, H. and Vermeeren, A.P.O.S. 1997, Tuning the design of everyday products to cognitive and physical activities of users. In S.A.Robertson (ed.) Contemporary Ergonomics, (Taylor & Francis, London), 175–180
Kanis, H. 1998, Usage centred research for everyday product design, Applied Ergonomics, 29, 75–82
Michaels, C. and Carello, C. 1980, Direct perception, (Englewood Cliffs, New Jersey, USA)
Norman, D.A. 1999, Affordances, Conventions, and Design. Interactions (May/June), 38–42
Vera, A.H. and Simon, H.A. 1993, Situated action: A symbolic interpretation, Cognitive Science, 17, 7–48
Vihma, S. 1995, Products as representations, (University of Art and Design, Helsinki)
DESIGN ISSUES AND VISUAL IMPAIRMENT Katie M.Stabler & Sabine van den Heuvel Royal National Institute for the Blind, Product Development Department, Bakewell Road, Orton Southgate, Peterborough PE2 6XU, UK
There are about 1.7 million people in the UK with a serious sight problem. 90% of these are over 60 years of age, and this is the only age group which is growing. Some estimate that by 2001 over 20 million people in the UK will be aged 50 or over, and this segment will hold around 75% of the nation’s wealth. In purely commercial terms it therefore makes good sense for designers and manufacturers to produce products that are suitable for older people, yet today little account has been taken of the characteristics of this group in mainstream product design. This paper aims to highlight the key design issues which must be addressed in order to make products more accessible to visually impaired people; namely, the nature and quality of visual, tactile and auditory product information. Introduction There are many myths about blindness and partial sight. Here are some of them:
What is Visual Impairment?
Many people find it hard to see even after having an eye test and wearing the right spectacles or contact lenses. There are 1.7 million people in the UK who have a serious sight problem which significantly affects the way they live, from reading the daily newspaper and cooking at home to getting out shopping and socialising. Generally, ‘blindness’ is regarded as a substantial and permanent lack of sight. ‘Partial sight’ is a less severe loss of vision. A person can register as partially sighted if they can only see the top letter of the eye chart at a distance of six metres or less, wearing corrective spectacles. Sight loss is one of the commonest causes of disability in the UK.
Blind and partially sighted people come from all sorts of backgrounds. They go to school and university, get jobs, bring up families, watch TV, and enjoy holidays, friends and hobbies, but they may need help to do some or all of these things. Only 8% of blind and partially sighted people are born with impaired vision. Most visually impaired people have gradually lost their sight in later life as a result of the ageing process. In fact, four out of every five people with impaired vision are over retirement age (see Figure 1), many of whom may also have other disabilities or illnesses such as hearing loss or arthritis.
Figure 1—Percentage of blind and partially sighted people in the UK
The different eye conditions may have a variety of effects on individuals. Very few people see nothing at all. The four most common eye diseases causing low vision in the UK are: macular degeneration, which results in a loss of central vision; diabetic retinopathy, which can result in ‘patchy’ vision; glaucoma, which can cause loss of peripheral/side vision; and cataracts, which often cause ‘misty’ vision. So some people with impaired vision can see enough to read this article, although they might have difficulty crossing the road.
Design issues and visual impairment
When designing a product it is important to focus on all product-related issues, and not just on the product itself. For example, even a well designed product can be badly let down if the user is unable to open the packaging, or to read or understand the instructions. The following is a list of aspects to take into consideration when designing a product. The list is not complete, but it should prompt thinking about other aspects of product accessibility for visually impaired people.
• Information/advertising: How do people know about the product in the first place? Is the advertising appropriate? (e.g. is a TV advert purely visual, or does it have a verbal commentary?).
• Packaging: How easy is it to find where and how to open the box, where are the top and bottom, etc.? (consider people with, for example, arthritis or dexterity problems).
• Instructions: Are the instructions easy to read (e.g. large enough print size, Braille or audio tape)? Are they understandable? (lots of complicated diagrams and schematic drawings do not help). Is there a customer help line?
• Assembly: Does the product require assembly by the customer? (e.g. does a table lamp need to be attached to its base or have a bulb inserted before it can be used?).
• Guarantee/return information: Is it clear what to do if the product does not work, where to send it, who to contact etc.?
• Cleaning: Is the product easy to clean? Bear in mind that visually impaired people might find it more difficult than a sighted person to know whether a product is properly cleaned (this is particularly important with equipment which will come into contact with food, or for young children’s toys).
• Safety: Even if a product complies with all the British and European safety regulations, this does not ensure that it is necessarily safe for visually impaired people to use (consider warning lights, sharp edges, finger traps and moving parts). As an extreme example, consider a fully compliant motorcar: it would most likely be unsafe for a blind or partially sighted person to drive on the public road.
• Replacement parts: How easy is it to get replacement parts such as batteries? Are they readily available, and how expensive are they?
• Stigma: It is important that a product does not scream out ‘for use by visually impaired people only’. Some people, especially in the younger age groups, do not like being labelled by their products as blind or partially sighted. They have just as much right to attractive ‘sexy’ products as their sighted peers.
How do you make a product accessible?
In this paper, we address how to make products more accessible for people with a visual impairment only, but in real life it is important to take other age-related disabilities into account for this group of people (e.g. hearing loss, arthritis, dexterity problems, etc.). The following three areas should be taken into consideration when designing for visually impaired people.
1. Visual information
As most visually impaired people have some useful residual vision (72% are able to read large print: 14 point Arial), attention should be paid to the visual aspects of the product’s design. The following points are important to make a product ‘easy to see’.
• Effective colours: Yellow seems to be a particularly easy colour to see and is often the last to be ‘lost’. It stands out in many situations, especially against black. Red stands out in good lighting conditions, though it soon disappears as light fades.
• Contrast in tone and colour is essential. A difference in tone is usually more effective than a colour difference. For example, bright red and blue contrast greatly in terms of colour, but very little in tone. Very light grey and very dark grey have no colour contrast but very good tonal contrast.
• As a general rule for lettering, use dark characters against a light background, although the reverse can work well on larger signs, e.g. yellow letters on a matt black background.
• Glare should be avoided at all times. Shiny surfaces reflect light in a way that can be confusing or uncomfortable to the eye.
• Single upper case letters are easier to read than single lower case.
• Words and sentences in mixed case lettering are easier to read than WORDS AND SENTENCES WHERE THE WORDS ARE ALL UPPER CASE.
• Use a plain font with open letters, such as Arial (don’t use fancy fonts).
• Don’t leave overly large spaces between words, and don’t use justified text, as both make documents difficult to read for some people who have a limited field of vision and/or who may be using low vision aids (e.g. a high power magnifier, which might focus on just a few letters at a time).
• A minimum of 14 point font size is recommended.
• Touch screens and membrane key pads are very difficult for visually impaired people to use, as it is not always possible to see or feel exactly where to press, and there is sometimes no tactile feedback. It is, therefore, important that the size and contrast of the characters on the screen/keypad are maximised to make them more suitable for visually impaired people.
2. Auditory information
• It is useful for a click to be heard/felt as confirmation of pressing a button.
• Use varied volume, pitch and duration in auditory signals to distinguish between the product’s various functions (e.g. on=1 bleep, off=2 bleeps).
• For hearing impaired people it is important to use signal frequencies that can be heard. As a result of the ageing process, people tend to lose the ability to hear higher frequencies first; a male voice is therefore generally preferred.
• When using speech, it is important to consider accents and languages in terms of understanding and personal preference (e.g. some English people find it difficult to understand a synthetic voice which has an American or Oriental accent).
• A volume control and/or headphones are recommended for privacy (consider talking bathroom scales, or using a talking watch in a meeting).
3. Tactile information
The number of Braille readers in Europe is less than 0.02% of the population. So although useful for some blind users, Braille is not a total solution for visually impaired users (Gill, 1998). In the UK only around 13,000 visually impaired people read Braille. Aside from achieving the standard Braille profile, it is important to get tactile features right, particularly when considering the needs of elderly visually impaired people. With age, tactile sensitivity may be reduced due to loss of feeling in the fingertips (common age-related causes of which are diabetes, strokes and circulation problems).
• Tactile markings need to be much bigger than their printed equivalent.
• Orientation cues are crucial (to ensure you have the product the right way up/round).
• If two markings have to be distinguished from one another, it is important to make them feel as different as possible. Shape, size, height and texture can be used to differentiate between markings.
• It is important that tactile information does not cover up visual information, as people may use both.
• Vibratory output can be used for deaf-blind people.
Each of the above areas is key to ensuring that a product is accessible for as many people as possible. A good rule of thumb, when designing for visually impaired people, is to use a combination of different types of feedback. Nevertheless, feedback given by all sources must be identical (e.g. the analogue hands on a clock should read exactly the same time as its speech output).
Conclusion
This paper has outlined some simple and cost effective design solutions to make products more accessible for visually impaired people. The RNIB Product Development team has had a decade of experience in designing and user testing specialist products to meet the needs of visually impaired and elderly people. We appreciate that there will always be a need for specialist products, but our long-term goal is to reduce the need for RNIB in-house design, by advising mainstream designers and manufacturers of the needs of visually impaired and older people. By actively promoting a ‘design for all’ approach, we hope to improve the lives of older, disabled and visually impaired people by making mainstream products affordable and accessible. Indeed, many mainstream products can easily be made more accessible without necessitating major design changes or costs. Take, for example, the buttons on a phone: simply by adding some colour contrast with the background, upping the print size, and using a mix of upper and lower case lettering, buttons are immediately easier to see for visually impaired people. If mainstream products are more accessible for visually impaired people, it follows that they will also be more accessible for everyone. This is especially true for those of us who, having lost our reading glasses, might find it a strain to find a button on a remote control, or read a small printed label.
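The guideline under ‘Visual information’ that a difference in tone matters more than a difference in colour can be checked numerically. The sketch below is one possible way to do so and is not part of the RNIB guidance: it assumes the relative-luminance and contrast-ratio formulas published in the W3C Web Content Accessibility Guidelines (WCAG 2.x), and the RGB values chosen for ‘bright red’, ‘blue’, ‘yellow’ and the greys are illustrative assumptions.

```python
def _linear(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Approximate tone (lightness) of a colour, from 0 (black) to 1 (white)."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """Tonal contrast between two colours, from 1 (none) to 21 (black on white)."""
    la, lb = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Illustrative colour values (assumptions, not taken from the paper).
RED, BLUE = (255, 0, 0), (0, 0, 255)
LIGHT_GREY, DARK_GREY = (230, 230, 230), (40, 40, 40)
YELLOW, BLACK = (255, 255, 0), (0, 0, 0)

for name, pair in [("red on blue", (RED, BLUE)),
                   ("light grey on dark grey", (LIGHT_GREY, DARK_GREY)),
                   ("yellow on black", (YELLOW, BLACK))]:
    print(f"{name}: {contrast_ratio(*pair):.1f}:1")
```

With these assumed values, red against blue scores far lower than the two greys, and yellow on black scores close to the maximum, which agrees with the examples given in the guidelines above.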
For more detailed design guidelines, and specific information on product design and evaluation, please do not hesitate to contact us at the RNIB Product Development Department in Peterborough. Tel: 01733–375168/5155
For general information on visual impairment see our web page: www.rnib.org.uk
References
Gill, J. 1998, Access prohibited, (Royal National Institute for the Blind, on behalf of Include)
RNIB, 1990, General needs survey, (Royal National Institute for the Blind)
RNIB website, 1999, www.rnib.org.uk
AUTONOMY FOR DISABLED CONSUMERS: THE NEED FOR SYSTEMATIC CHOICE AND INNOVATION
John Mitchell1 & Jude Bennington2
1Director of the Wheelchair Lifemaps User Trials, RICAbility and The Essex Rivers Bed Project, The King’s Fund
2Researcher on the Wheelchair Lifemaps User Trials, RICAbility and The Essex Rivers Bed Project, The King’s Fund
This paper discusses the development of integrated, consumer-centred methods of revealing and responding to the priorities and problems of disabled and elderly consumers who have lost autonomy in important areas of their lives. Autonomy can often be restored by choosing effectively from available products, facilities and services. There are many stakeholders in this area, including consumers, service providers, information providers and manufacturers. At present there is little collaboration between these groups and, as a result, consumers can often miss out on services that would be of use to them. In response, it is proposed that a Forum for Autonomy, Choice and Innovation be set up to lead the development of methods for: revealing consumers’ needs and problems; responding effectively to these needs; revealing the effects of dependence.
The Effects of Design on Autonomy and Dependence
Around fifteen percent of the UK population are estimated to have significant disabilities of one kind or another (Martin et al, 1988). Since the modern world contains many barriers for people with disabilities, their lives and activities can be considerably restricted. Making choices, accepting responsibility and taking opportunities are essential and desirable parts of life and its activities. However, people can only do this if they can use the systems (including facilities and products) they require. Unless systems are designed and chosen to match their users’ requirements and capacities, some people will be unable to use them fully, effectively, easily, safely or comfortably. Some individuals can make up for ‘designed-in’ problems by using extra physical, sensory or cognitive capacity. However, these problems ‘filter out’ many elderly and disabled people who may not have extra capacities. The user-population represents the full spectrum of physical, sensory and cognitive capacities in the United Kingdom and includes those with high and low capacity.
The effect of systems such as town centres, supermarkets, photocopiers or computer programmes that impose demands on their users is to reduce the percentage of the total population that can use each system. Highly demanding systems impede or exclude large numbers of their potential users. Those that place low demands on their users, on the other hand, can be safely and effectively used by virtually the entire population.
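The relationship between a system's demands and the share of the population able to use it can be illustrated with a small sketch. It assumes, purely for illustration, that each person's capacity and each system's demand can be expressed as a single score on the same scale; the scores and the function name `usable_fraction` are hypothetical and are not taken from any survey.

```python
def usable_fraction(capacities, demand):
    """Return the share of people whose capacity meets or exceeds the demand."""
    if not capacities:
        raise ValueError("capacities must not be empty")
    able = sum(1 for capacity in capacities if capacity >= demand)
    return able / len(capacities)


# Hypothetical capacity scores (0-10) for ten people; illustrative only.
population = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]

# A low-demand system is usable by everyone in this sample.
low_demand = usable_fraction(population, demand=2)

# A high-demand system filters out everyone scoring below 8.
high_demand = usable_fraction(population, demand=8)

print(f"Low-demand system:  {low_demand:.0%} of the sample can use it")
print(f"High-demand system: {high_demand:.0%} of the sample can use it")
```

In practice capacity is multi-dimensional (physical, sensory and cognitive), so a single score is a simplification; the sketch only shows how raising demand shrinks the usable population.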
Stakeholders in Autonomy
The penalties of dependence affect a wide range of stakeholders, including:
• Consumers (disabled and elderly people who are dependent in important areas of their lives, together with their formal and informal carers)
• Providers (including frontline benefit, education, health, housing, social support and transport services and the planners and policy makers who plan and innovate responses to dependence)
• Information Providers (who provide information and evidence about available responses to dependence and their effectiveness)
• Regulators, Evaluators and ‘Standard Makers’ (who monitor and oversee the quality, value and safety of responses to dependence)
• Innovators (who develop responses to dependence in the form of new or improved products, services and facilities).
Approaches to restoring Autonomy
Dependence is caused by unsuccessful interaction between consumers and the systems they wish to use. Stakeholders have therefore attempted to restore autonomy by increasing consumers’ capacities and making systems easier to use. Rehabilitation is aimed at improving consumers’ power, flexibility and control so that they can cope with demanding systems, such as bathrooms. Assistive technology is focused on the interface between consumers and systems. It uses products, such as hoists or wheelchairs, to effectively increase consumers’ capacities. Aids and adaptations, such as raised toilet seats and ramps, are also used to reduce the demands of ‘unfriendly’ systems to a point at which consumers can manage independently. ‘Inclusive design’ aims to produce ‘mainstream’ products and systems that can be used by the full spectrum of consumers. The approach originated in the USA as ‘barrier-free design’ and is also referred to as ‘low handicap technology’, ‘user-friendly design’ and ‘universal design’.
A particular feature of this approach is that it maximises autonomy without stigmatising consumers as being disabled and that it reduces the need for rehabilitation and assistive technology. The starting point for all these approaches is a clear understanding of what consumers want to do and what impedes them. For example, a consumer might want to go upstairs but have difficulty in negotiating the stairs easily and safely. Rehabilitation might focus on recovering strength, flexibility and balance. Assistive technology might solve the problem by providing a stair-lift. Inclusive design would seek to avoid the problem by ensuring ‘step-free’ access throughout homes.
Restoring Autonomy through Choice and Innovation
Each of the approaches outlined above offers options for responding to dependence, and it is necessary to select those that are likely to be successful and to develop better options in the future. Consumers vary widely in the complexity of their disabilities, needs and circumstances, and each of these options carries differing levels of feasibility, cost and effectiveness for individual consumers. Dependence can place considerable penalties on both consumers and providers and it is in their interest to minimise these penalties by restoring autonomy wherever possible.
Figure 1 The Cycle of Choice and Innovation
However, choosing suitable options is a complex process that requires good quality evidence to inform it. Effective innovation from manufacturers and designers also requires sound evidence on the nature, extent and costs of dependence and on the relative effectiveness of available options.
Essential Information Links between Stakeholders in Autonomy
Each of the stakeholders requires information from at least one of the other stakeholders to enable them to use their resources to best effect. For example, before providers and innovators can use their resources effectively in restoring autonomy they need evidence, largely from consumers, on:
• the nature, extent and penalties of dependence
• which interventions provide the best response to a particular aspect of dependence
• what aspects of dependence still require effective responses to be developed.
Similarly, before consumers and providers can use their resources to best effect in restoring autonomy, they need to find out from consumers, providers and innovators/suppliers:
• what consumers need
• what is available to restore their autonomy
• the human and economic penalties of their dependence
• the likely human and economic effects of the chosen response.
Once the consumer has used the chosen response, these stakeholders also need to find out how well it worked so that they can weigh up its human and economic effects for future planning and provision. Information from consumers and providers is also needed by regulators, information providers and innovators to enable them to prioritise and focus their activities.
Gaps in the Information Network
The information systems that are currently used for revealing and responding to dependence were developed by the individual agencies to meet their own immediate requirements. There is no common, integrated, upgradable system that can collect, analyse and provide the information that each stakeholder requires. There is also no combined stakeholder group that could undertake this task on behalf of all its members. The result is a series of fragmented and incompletely developed procedures that produce inconsistent data that cannot be ‘pooled’, analysed or used for joint action, planning or policy making. For example, Mitchell and Bennington (in preparation) found that the various needs of wheelchair consumers can be separately assessed by benefit, health, housing, social support and transport agencies. There is also a multiplicity of different assessments for wheelchairs by local NHS wheelchair centres (approximately 180), retailers (approximately 200), Disabled Living Centres (approximately 40) and Mobility Centres (approximately 11). Marks (1998) and Winchcombe (1998) both found that disabled consumers needed better information about equipment. As Neuberger (1998) pointed out, information that is made available for service users is of use not only to consumers but also to professionals and can be used by all the different stakeholding agencies.
The Way Forward
The penalties of dependence are shared by a large number of different stakeholders and it is desirable that they combine to develop mutual solutions to their shared problems. Since the needs of dependent consumers provide the focus for their own actions as well as those of the other agencies, these needs provide the obvious starting point for developing integrated responses to dependence. The three initial priorities are to develop effective, consumer-centred, integrated methods for:
• revealing consumers’ needs, problems and priorities
• responding effectively, consistently and reliably to these needs
• revealing the human/economic effects of dependence/autonomy.
In order to achieve this it would be necessary to set up a Forum for Autonomy, Choice and Innovation. This would be run by an independent agency and would bring together all the stakeholders working in the area. It would provide a comprehensive one-stop shop where users can access information, and it would collate and analyse information for the development of services and products to meet the diverse and changing needs of disabled consumers.
References
Marks, O. (1998) Equipped for Equality. London: Scope
Martin, J., Meltzer, H. and Elliott, V. (1988) The Prevalence of Disability Among Adults. London: OPCS
Mitchell, J. and Bennington, J. (in preparation) Report on the user trials of the Wheelchair Lifemap System
Neuberger, J. (1998) ‘Information for Health: whose information is it?’, Journal of Information Science 24(2), 67–73
Winchcombe (1998) Community Equipment Services…Why should we care? London: DLCC
POST OFFICE COUNTER CUSTOMER INTERFACE: A DESIGN CHALLENGE
Robin Ellis & Corinne Parsons
Post Office Consulting, Royal Mail Technology Centre, Wheatstone Road, Dorcan, Swindon SN3 4RD, UK
The Post Office’s counter design has been reviewed in response to new and conflicting demands. Most prominent is the need to meet the requirements of the Disability Discrimination Act (DDA) (1995) to improve the accessibility of services for disabled customers. It is the customer side of the counter that is dealt with here. Different customers have different needs and these can conflict, for example the counter height for a wheelchair customer compared to a standing customer. The user centred design process included a review of the implications of the legislation, workshops with counter staff and prototype testing with a range of customers, including people with disabilities, elderly people and parents with children. This report outlines the process of balancing these issues and determining the optimum design taking account of all the relevant factors.
Introduction
A review of the customer interface of the Post Office counter was carried out as part of a project to improve accessibility to Post Office Counters services for disabled customers, as a requirement of the Disability Discrimination Act (1995). The Act requires service providers to take reasonable steps to remove, alter or provide reasonable means of avoiding physical features that make it impossible or unreasonably difficult for disabled people to use the service by 2004, and for new premises to meet the requirements from October 1999. People with disabilities account for 11% of the population and the incidence of disability increases with age (50% of people with disabilities are over 70 years of age). Additionally, 69% of disabled people are unemployed. The payment of benefits and pensions is amongst the main services provided by Post Offices and so people with disabilities form a large proportion of Post Office customers. Disability includes mobility difficulties, visual impairment, hearing impairment, dexterity limitations and learning difficulties.
It follows that the requirements of disabled customers are diverse and numerous. Requirements often conflict, e.g. working surface heights suitable for wheelchair users would be unsuitable for tall people with inflexible spines; and improving reach from wheelchairs by allowing access below the writing surface would make it harder for those with visual impairment to identify the edge of the counter. A survey carried out by Disability Matters Ltd (1998) showed that there were specific difficulties relating to most designs of Post Office Counters. The findings were:
• The current counter top is too high for wheelchair customers
• There are communication difficulties across the counter screen, particularly for those with hearing impairments
• The dimpled counter surface was awkward to write on
• Customers experienced difficulty picking up coins from the counter, particularly customers with dexterity problems.
The findings were supported by a user workshop carried out with counter clerks which included questions to explore the difficulties that they had serving disabled customers, and their observation of the problems that disabled people had accessing the services. Clearly it would be impossible to optimise the counter to suit all disabled customers but specific features were identified that could be incorporated into the design that would offer significant advantages:
Recommendations for physical environment
• A writing height of maximum 800 mm for wheelchair customers with sufficient space under the writing surface to allow wheelchair customers close enough to the counter (at least 750 mm).
• A writing surface for standing customers.
• Pick-up points and scales within easy reach of wheelchair customers.
• Counter edges with a raised, profiled edge to help customers with dexterity impairment pick up coins and stamps etc.
Recommendations for visual environment
• An easy to read visual environment, including colour-contrasting edges on the counter and pick-up areas that provide a high contrast with coins and stamps.
• Non-reflective surfaces.
Good lighting, communication across security screens, clear signage and access to the counter are also very important but are being investigated by separate studies.
Method and Results
Once the basic requirements of the counter design had been established the outline design for the counter was developed in four main phases:
• Mock-up trials
• Initial prototype trials
• Edge profile evaluations
• Pre-production prototype manufacture
Mock-up trials
The research and user workshops led to concept designs for the new counter; these were mocked up full size in Dexion and card. Three different concepts were modelled (illustrated in Figure 1). Eight counter staff
Figure 1. Details of the mock-up counters used in the trial
participated along with twelve customers including representatives from the local Disabled Access Action Group. Two of the customers were elderly, one was a wheelchair user and another used crutches due to arthritis. The counter staff and customers undertook simulated transactions. Mock-up assessment involved obtaining opinion using questionnaires, observation and structured discussion. The main outcomes of the mock-up trials were recommendations to:
• Increase the height of the counter top to 975 mm to provide a comfortable writing height for standing customers.
• Allow some counter top space in the centre of the counter on the customer side of the screen to allow standing customers to undertake writing and manipulation tasks without having to twist away from the clerk.
• Allow more space underneath the lower surface for wheelchair customers.
Initial Prototype trials
The mock-up trials allowed the general layout of the counter to be established. Once the recommendations from the mock-up trials had been fed into the design process, three options were identified for further testing. A test rig was built (Figure 2) using material thickness and construction methods that represented suitable methods for final manufacture. Nine counter staff operated the counters over a period of 2 days using dummy transactions and live computer equipment so that the trials closely resembled real counter operation. The transactions were selected to match the actual transaction mix of a typical office. In addition to the counter clerks, who also tested the workstations as customers, there were 7 other non-disabled customers, two disabled representatives from the Local Access Action Group (one a wheelchair user and one using crutches), three senior citizens and three mothers with children ranging between 1 and 7 years old.
The main outcomes from the initial prototype trial were:
• The deeper top central writing shelf was preferred for standing customers, which also made the reach to the pick-up point easier for wheelchair customers.
• The deeper low writing shelf was preferred by wheelchair customers.
• The fully clad shelf was preferred by both wheelchair and standing customers, who thought they might lose things off the separate shelf concept.
• The indented front surface clearly identified the serving points on the counter.
Figure 2. Details of the prototype counter used for the trial
During the prototype testing the layout of queue barriers in front of the counter was also considered. It was found that at least 1500 mm from the base of the counter was needed to allow wheelchair customers to manoeuvre and turn, whilst still allowing standing customers to walk past. This is greater than recommended by the Centre for Accessible Environment (1998).
Edge Profile
An additional activity to determine the optimum edge profile to assist the task of picking up coins and stamps was undertaken at a day centre. Ten senior citizens each tried picking up coins, stamps and forms from five different edge profiles presented at the height of the counter and in random order. The profiles had a sloping surface varying in length between 10 mm and 25 mm and in height between 3 mm and 7 mm. The results showed a preference for 5 mm height with a 25 mm length. It was observed that those with impaired dexterity cupped one hand under the counter edge whilst the other hand scooped the items over the edge and into the cupped hand.
Final Outline Counter Design
The final counter design is shown in Figure 3. Additional features to those previously mentioned are:
• Thick lower shelf and radiused underside to minimise injury to young children who bang their heads on the shelf.
• The edge of all the top facing surfaces on the customer interface will be profiled according to the outcome of the edge profile trials.
• The top central writing area features an overhang, allowing customers to cup one hand and scoop with the other as described above.
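The dimensional findings reported in this paper can be gathered into a simple checklist. The following is a minimal sketch rather than the authors' method: the class and function names are hypothetical, the thresholds are the figures quoted in the text (800 mm maximum lower writing height, at least 750 mm of space under the writing surface, at least 1500 mm to the queue barrier), and the direction in which the 750 mm is measured is not stated in the paper, so it is treated here as a single clearance value. The slope of the preferred edge profile is worked out from its 5 mm height and 25 mm length.

```python
import math
from dataclasses import dataclass


@dataclass
class CounterDesign:
    """Hypothetical record of the customer-side dimensions of a counter."""
    lower_writing_height_mm: float
    under_surface_clearance_mm: float  # direction of measurement is an assumption
    queue_barrier_distance_mm: float
    edge_height_mm: float
    edge_length_mm: float


def check_counter(design):
    """Return a list of dimensions that miss the figures quoted in the paper."""
    problems = []
    if design.lower_writing_height_mm > 800:
        problems.append("lower writing surface above 800 mm")
    if design.under_surface_clearance_mm < 750:
        problems.append("less than 750 mm of space under the writing surface")
    if design.queue_barrier_distance_mm < 1500:
        problems.append("queue barrier closer than 1500 mm to the counter base")
    return problems


def edge_slope_degrees(design):
    """Angle of the sloping edge profile, from its height and length."""
    return math.degrees(math.atan2(design.edge_height_mm, design.edge_length_mm))


# Example using the preferred edge profile from the trial (5 mm by 25 mm).
example = CounterDesign(800, 750, 1500, edge_height_mm=5, edge_length_mm=25)
print(check_counter(example))                 # no problems reported
print(round(edge_slope_degrees(example), 1))  # slope of roughly 11 degrees
```

The check covers only the dimensions given as numbers in the text; features such as colour contrast and non-reflective surfaces would need a separate assessment.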
Discussion and Conclusions
Despite the varied and conflicting requirements of disabled customers the study has shown that significant benefits can be achieved for disabled customers, without compromising use by non-disabled customers. This was achieved by providing twin height writing surfaces in front of the clerk, wheelchair access below
Figure 3. Final counter design
the lower writing surface, and minimising the reach to the pick up point. Sloped edge profiles with strong colour contrast make it easier for those with impaired dexterity to pick items up and for visually impaired customers to identify the edge and serving point. A pre-production prototype of the counter is now being produced for further evaluation by a wider range of disabled customers.
Acknowledgements
We would like to thank the members of Swindon Access Action Group, Nythe Senior Citizens, Age Concern (Swindon Branch) and the counter clerks who took part in the development process and provided valuable feedback.
References
Centre for Accessible Environment. 1998, Designing for Accessibility: an introductory guide, 18–21.
Disability Discrimination Act. 1995, Code of Practice Part III
Disability Matters Ltd. 1998, Post Office Counters Limited: Disabled Customers Accessibility Report
Post Office publication, A design guide: Access and facilities for people with disabilities
Royal Mail Property Holdings, Building Policy Unit. 1994, Access and Facilities for people with disabilities: A Design Guide, (Royal Mail, London)
REVEALING AND RESPONDING TO THE NEEDS OF WHEELCHAIR CONSUMERS
John Mitchell1 & Jude Bennington2
1Director of the Wheelchair Lifemaps User Trials, RICAbility and The Essex Rivers Bed Project, The King’s Fund
2Researcher on the Wheelchair Lifemaps User Trials, RICAbility and The Essex Rivers Bed Project, The King’s Fund
Products and systems are developed and provided to meet consumers’ needs and it follows that they can only be effectively chosen, evaluated and developed with knowledge of consumers’ needs, priorities and circumstances. This paper reports the background to the development and testing of a new system, ‘the Wheelchair Lifemap’, which is intended to support and enable these functions. This system, which was developed for the NHS Executive, is now being trialled for the Department of Health. It comprises a series of documents that aim to find out: what consumers want to do from their wheelchair, what improvements they want over any previous chairs and what the main barriers are that impede them. The system aims to provide information not only for consumers and assessors, but also for manufacturers for the future development of wheelchairs.
Four interfaces for the wheelchair
The choice and innovation of wheelchairs is a demanding process because wheelchairs are potentially highly complex products. Able-bodied people engage in many different activities and pursuits and there is no obvious reason why wheelchair consumers should have lower aspirations. When surveyed (Ohras, 1997), wheelchair consumers reported a wide range of activities that they currently undertook or wanted to undertake. Analysis of these activities and of the international literature on wheelchairs (Mitchell, 1997) suggests that wheelchairs must successfully provide at least four interfaces for their consumers, as shown in Table 1 below.
Table 1 Four Key Interfaces for Wheelchairs
Analysing the Process of Choice
Mitchell (1976) found that deficiencies in revealing consumer needs and obtaining their feedback could hamper the development and supply of powered wheelchairs. In a survey of 143 wheelchair consumers, Ohras (op cit) found that each felt that they would be able to do more independently if their needs had been fully recognised and if they had the use of a suitable wheelchair. These views could be explained by a failure to reveal their needs fully, to choose effectively from available wheelchairs, or to develop better wheelchairs. During user-trials of the ‘Wheelchair Lifemap’, the project research team found that ‘wheelchair assessments’ are carried out by a wide range of different agencies, each of which tends to use methods that they have developed themselves. Amongst these are approximately 185 NHS Wheelchair Centres, 200 wheelchair retailers, 40 Disabled Living Centres and 11 Mobility Centres. Consumers’ needs were also assessed by benefit, education, social support and transport agencies. In order to find out if there was a ‘common process’ underlying these assessments and if they would reveal consumers’ needs in the four ‘interface’ areas, the research team analysed assessments from the statutory, commercial and voluntary sectors in the UK and North America as shown in Table 2 below.
Table 2 Assessments Analysed
The Process of Choice
The analysis exposed a process of choice into which each assessment and their constituent parts could be fitted, as shown in Table 3 below. Stages 1 and 2 focus on consumers to reveal their needs and their evaluations of their chosen wheelchairs. Stages 2 and 3 focus on responding to these needs by finding wheelchairs with performance specifications that match consumers’ profiles of need. None of the assessments covered the entire process of choice and none covered all aspects of any one stage in the process. For example, at Stage 1 clinical needs were often well covered but consumers were rarely asked where they wanted to go or what they wanted to do, or whether there were any particular improvements they needed over their previous wheelchair. Only one assessment (Canadian Occupational Performance Measure) asked consumers to prioritise their needs. None of the assessments included any form of integral or routine evaluation at Stage 4.
Table 3 Stages in the Process of Choice
Linking the Processes of Choice and Innovation
Richardson et al (1996) and Urban and Hauser (1993) agree that innovation should be based on an understanding of consumers’ needs and problems and include their evaluations of existing and prototype solutions. Analysis of consumers’ profiles of need and their evaluations of their chosen wheelchair would therefore provide a factual evidence base for improving the choice of available wheelchairs and the innovation of better wheelchairs in the future.
Integrating Choice and Innovation with the ‘Wheelchair Lifemap’
The prototype Lifemap was originally developed for the NHS Executive as a holistic, consumer-centred tool to help consumers, voluntary agencies, retailers and the NHS to choose and innovate wheelchairs more effectively. It consists of a way of logging consumers’ needs and aspirations in order to produce recommendations for a wheelchair that would provide each individual with optimum mobility. To do this the Lifemap asks consumers and carers:
• what they want to do from their new wheelchair
• what improvements they want over any previous wheelchairs
• what are the main barriers that impede them.
Their answers provide the basis for:
• choosing wheelchairs that match their needs and circumstances
• evaluating their effectiveness
• joint planning with other services
• ‘pooling’ and analysing data on consumers’ needs/dependence
• pinpointing products and services that require improvement
• innovating more effective wheelchairs and services in the future.
The initial prototype (Mitchell et al, 1998) was developed from interviews with consumers about their needs (Ohras et al, 1997) and a review of the international wheelchair literature (Mitchell, 1997). The prototype has been refined during user-trials sponsored by the Department of Health. These have involved consultations with assessors in the statutory, commercial and voluntary sectors and the analysis of existing methods of revealing consumers’ needs. The Lifemap System currently under trial has the following four sections:
‘Wheelchair Request’
This collects the information which is needed either to supply a wheelchair immediately or to carry out a detailed assessment.
‘Me and My Wheelchair’
This will be sent in advance to consumers/carers to help them consider their problems/priorities before assessment.
‘Wheelchair Assessment’
This summarises and records consumer priorities and clinical, environmental and administrative details, the actions taken and any joint planning with related services such as housing or social support.
‘Lifemap Logbook’
This contains the consumer’s profile, the actions agreed and space to record:
• changes in their profile
• how well the chair satisfies their needs
• durability, reliability and repairs
• useful numbers and contacts
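As an illustration of how the contents of the ‘Lifemap Logbook’ might be held in an IT version, the sketch below models the consumer's profile and the items the logbook leaves space to record. It is only a sketch under stated assumptions: the Lifemap is described here as a paper system, and every class, field and method name below is hypothetical rather than part of the Lifemap itself.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerProfile:
    """Hypothetical record of the three questions the Lifemap asks."""
    wanted_activities: list
    wanted_improvements: list
    barriers: list


@dataclass
class LifemapLogbook:
    """Hypothetical logbook: profile, agreed actions and follow-up records."""
    profile: ConsumerProfile
    agreed_actions: list = field(default_factory=list)
    profile_changes: list = field(default_factory=list)
    satisfaction_notes: list = field(default_factory=list)
    repairs: list = field(default_factory=list)
    contacts: dict = field(default_factory=dict)

    def unmet_activities(self, achieved):
        """List wanted activities that the consumer has not yet achieved."""
        return [a for a in self.profile.wanted_activities if a not in achieved]


# Illustrative entries only; not data from the user trials.
logbook = LifemapLogbook(
    profile=ConsumerProfile(
        wanted_activities=["shopping", "visiting friends"],
        wanted_improvements=["lighter chair"],
        barriers=["steps at front door"],
    )
)
logbook.repairs.append("replaced tyre")
print(logbook.unmet_activities(achieved=["shopping"]))
```

Keeping the profile separate from the follow-up records mirrors the distinction the text draws between revealing needs and evaluating how well the chosen wheelchair meets them.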
The report on the user trials will be presented to the Department of Health in March 2000. It is hoped that a launch of the Lifemap will follow in the summer of this year. The Lifemap system is currently in paper version, but the research team are now in the process of bidding for funding in order to develop an interactive IT version of the Lifemap system. It is intended that this system would eventually be extended to cover all types of disability equipment.
References
Mitchell, J. (1977) The Development, Manufacture and Supply of Powered Wheelchairs. Loughborough University of Technology: ICE Ergonomics
Mitchell, J. (1997) International Wheelchair Bibliography 1985–95. London: RICAbility
Mitchell, J., Bennington, J., Harrison, J. (1998) Choosing your wheelchair by mapping your life. Sheffield: Health Research Institute
Ohras, A., Mitchell, J., Yelding, D. (1997) Consumers and their Wheelchairs. London: RICA/Sheffield Hallam University
Richardson, S. (1996) User Fit, TIDE, EC, DG XIII
Urban, G. and Hauser, J.R. (1993) Design and Marketing of New Products, 2nd Edition. London: Prentice Hall
ADDRESSING PLEASURE IN CONSUMER PRODUCTS THROUGH ERGONOMICS
Julien Simon & Rachel Benedyk
Ergonomics and HCI Unit, University College London, 26 Bedford Way, London WC1H 0AP, UK
Traditionally, ergonomics has tended to concentrate on making features in consumer products (such as automobiles) ‘usable’, focusing on utilitarian, functional product benefits. The objective of this study was to look beyond usability, to the positive emotional and hedonic benefits that such products can bring to their users. Techniques extrapolated from Kansei Engineering were applied to particular control features in car interiors, allowing for the evaluation of the emotive and subjective aspects of their use. The data gathered in this study support the idea that product pleasurability involves more than usability alone. It is concluded that, in order to optimise the experience of product use, ergonomists should look both at and beyond usability, to ensure that products are a positive pleasure to use.
Introduction
The role of ergonomics professionals within the creation of consumer products is to advise designers and engineers of how best to match a product to user needs. However, arguably, few designed artefacts succeed by giving the user a purely ‘utilitarian’ experience. The majority of outstanding designs succeed not only because of their utility, but because they also arouse in the user gratifying experiences that go far beyond this. Yet it is evident in ergonomics research that the emphasis remains strongly on improving functional aspects of products rather than focusing on the psychological “experience needs” of the users. People buy products not only for what those products ostensibly do but also for what they represent. Consumers buy into a symbolic world that both differentiates them from other people and reinforces their sense of belonging (Crampton Smith and Tabor, 1995). Decreasingly useful, then, is the idea that the functionality of products is separable from, and takes precedence over, their appearance and styling.
There is value in identifying specific product attributes that mediate the consumer’s qualitative response and designing these into the product. Creating pleasurable products requires an understanding of people, not just as physical and cognitive processors, but as rational and emotional beings with values, tastes, hopes and fears. It also requires an understanding of how people respond to particular elements of a product’s design: not just its functional elements, but also aspects such as the form, language and aesthetics and the ethos reflected in the design. The appeal of a product on an emotional level has increasingly been recognised as a powerful influence (Jordan and Servaes, 1995), as a product may work well on a rational cognitive level but fail to inspire or excite, and thus not sell. The area of emotive, pleasurable (hedonic) issues, although acknowledged as important, has not so far been addressed in any depth by ergonomics (Dandavate et al., 1996), and in practice is largely addressed through
the art of the industrial designer. Research has confirmed the importance of emotive responses to designs, and has highlighted the need for a basis on which design teams can quickly acquire an improved understanding of these issues (Taylor, 1999). This is by no means easy. The hedonic values associated with the person-product relationship, although central to industrial design, inevitably present a somewhat elusive area for research, being subject to behavioural, socio-cultural and psychological influences as well as task-oriented responses to a product.

Existing techniques in product evaluation fall short of assessing the hedonic component of ‘pleasurability’ of products. Usability-based approaches have undoubtedly brought huge benefits to users of products. However, in order to represent the user fully in the product creation process, ergonomists must take a wider view of person-centred design and look both at product use and at those using and experiencing products in a more holistic context. An approach is therefore needed which can move products beyond ‘usable’ to the stage where they are not only usable but also enjoyable, exciting and pleasurable.

Kansei Engineering, a Japanese customer-oriented product development technique, was developed to meet these ends. It is defined as “a technology of translating consumer’s feeling and emotional needs (Kansei) of a product into design elements in the product development process” (Nagamachi, 1995). In other words, Kansei Engineering is an empirical technique aimed at linking the design characteristics of a product to the users’ responses to the product. Kansei Engineering’s multi-disciplinary, person-centred approach goes beyond traditional ergonomics involvement in product development. It tries to address the more complicated psychological issues of consumers’ emotional expectations and satisfaction in perceiving the product.
The Kansei includes the customer’s feeling about the product design, size, colour, mechanical function, feasibility of operation, and price as well. This study therefore proposed to apply Kansei techniques to a consumer product environment, where ergonomists traditionally address functionality. It was important for this investigation to find such a consumer product, which holds strong functionality aspects but also could be seen to have hedonic characteristics. The automobile sector seemed to fit these demands and was deemed to be a good illustrative example for addressing pleasurability. Since the invention of cars, emotional attachment has been associated with almost every advancement of automotive engineering. For many, driving a vehicle has to be a pleasure which is in addition to its function as a commuting tool. In the early days of its history, the automobile was simply a valuable means of transportation. Then it was given additional value as a comfortable vehicle for transporting people. But now that the car has become indispensable for daily life, people are no longer satisfied with the comfort it can provide. They want to use it for another purpose, which may be described as a place to stage their individuality. In other words, consumers are not buying cars just because of their inherent utility but also because of their subjective values. They are moved to buy a car, because “it fits them well”, “it has a good styling” or “it just feels right” (Jindo and Hirasago, 1997). Automobile manufacturers market different models to appeal to different customer bases. Some models offer a sporty appearance whereas others are more conservative. Some are luxurious and project an image of quality whereas others are more economical. The appearance of the automobile interior is as much a component in mediating the look and feel of the vehicle as the exterior. 
It is therefore important that various interior components elicit a qualitative response that is consistent with the styling goals of the interior package. Three features of cars were chosen for investigation: the door release handle, the seat track control and the seat recliner control. These all have functional uses, are not linked directly to any driving tasks and vary in design from car to car.
The present exploratory study sought to identify the emotional and sensory feelings associated with the particular design aspects of these controls. While controlling for functionality, many aspects of the visual and sensorial attributes (e.g. styling, tactile surfaces, acoustic feedback, functional ease) that impact qualitative responses to products could be identified, and users’ level of pleasure experienced with these features could be investigated.

Method

The method of data collection and analysis, derived from Kansei Engineering, consisted of (1) the selection of adjective words, (2) the evaluation of the design components using a semantic differential questionnaire, (3) the use of open-ended questions as an exploratory means of probing participants’ experiences and responses to stimuli, and (4) multivariate analysis of the evaluated data. Multivariate analysis was used to disclose the implicit relations between the adjectives and the products or the physical attributes (e.g. size, shape) of each design component. It should be possible for the obtained relations among a component’s design, its features and the adjectives to be made into inference rules for future design recommendations and low-level guidelines.

Twenty participants took part in the study, ten females and ten males. The stimuli were three new models of cars. The makes and models of the cars could not be named for legal reasons, so the cars will be referred to as C1, C2 and C3. Comparisons between these cars were thought to be interesting as they all represent different manufacturers but all belong to the same marketing category. Further, they represent slightly different price ranges. The style, quality and ingenuity of each company’s image are represented through its products: C3 denotes quality engineering as well as status, C1 is renowned for longevity and safety, while C2 is considered as representing value and quality in mass design.
In order to control the ‘pleasure’ variable, and keep other variables constant, the cars belonged to the same category and targeted the same user profile. However, within the same bracket of cars, the various price ranges implied very different extremes and would enable differences in styling, luxury and other aspects of the car interiors to be subjectively tested. All three cars were brand new and parked in a row. They were completely covered so as to avoid any preconceived ideas of the quality, prestige and other characteristics which participants may have attributed to the particular makes and models of the cars. Large plastic cover sheets were used as covers, underneath which articles were placed to cover up particular features of the car (i.e. distinctive large circular headlights, protruding front bumper…). Inside, all labels were covered up using tape. Any ‘tell-tale’ signs (i.e. tax disc holder, foot mats, car stereo make…) were either removed or covered up. Participants, therefore, could neither recognise the car from its exterior appearance nor guess the make or model from the inside. Each feature was thus viewed and manipulated by participants in complete isolation, without any notion of what car he or she was testing, in order to avoid bias in the data. Each participant was asked to consider the emotive and sensorial feelings they experienced as they familiarised themselves with each feature in every car. All participants were told to take as long as they wanted ‘testing’ the feature until they had formed an appreciation of it and were able to make a judgement about it. A questionnaire survey was administered to participants for each feature in each of the cars. It consisted of three open-ended questions to solicit responses about (1) what they like, (2) what they dislike, and (3) how the design could be improved and why. Open-ended questions were thought to best represent the exploratory nature of the study. 
Semantic differential scales were chosen to represent, as thoroughly as possible, the participants’ judgements about the emotional, practical and hedonic benefits associated with the features (Osgood et al., 1957). Perceptions were rated on a scale of 1 to 7, and the results
were tallied. The scores of each category were analytical sensory and emotional evaluations, without allowance for individual preferences. The overall ratings, however, were taken as evaluations of preferences, on a scale from “good” to “poor”. A factor analysis was then performed on the results to obtain the relationship between the adjectives and to identify the chief underlying dimensions of the set of attributes. Rankings and significances were tested via one-way repeated measures ANOVA and Tukey HSD post-hoc analysis.

Results and Analysis

Two factors were extracted from the loadings of the scales for all three features: factor 1, ‘design’ (appearance, styling, aesthetic aspects), and factor 2, ‘ease of use’.

Feature 1: Door Release Handle

The door release handle on C3 seemed to give more pleasure to users than that of C1, which in turn gave more joy to participants than the one on C2, despite C2’s having the best subjective ratings for smoothness and excitement. In designing for ‘pleasurability’, the more integrated strap design of C3 and of C1 were seen as being more elegant, stylish and expensive-looking compared to the strap design of C2. The door release handles of C3 and of C1 were also seen as being firmer and smoother. Interestingly, however, the paddle design of C2 was seen as being more exciting. In summary, participants were more concerned with usability issues than aesthetic issues. However, concerns about styling and harmony represent more than one-third of the design issues, indicating that aesthetic issues should not be overlooked.

Feature 2: Seat Track Control

The seat track control in C2 seemed to give marginally more pleasure to users than the one in C3, which in turn was preferred to the one in C1. The novel seat track control used in C2 was seen as more enticing, expensive and modern compared to the lever mechanisms. Although many concerns dealt with ‘usability’ issues, the material and texture were deemed very important aesthetically.
Preference was clearly for a softer, more padded tactile feel. In summary, concerns about styling and harmony (e.g. “nice noise when moving the seat”, “could be hidden more and made to look better”) represent more than one-third of the design issues, indicating that aesthetics and harmony issues should not be overlooked. The indication is, therefore, that users regard usability as being important, but that there are other issues that need to be addressed in order to create pleasure in these features.

Feature 3: Seat Recliner Control

The seat recliner stands out as being considerably more pleasurable to use in C3 than in both of the other cars, although the seat recliner control in C2 was seen as being marginally faster than the one in C3. All three are hand wheels, but those in C1 and C2 are cog-shaped and made of plastic while the one in C3 is circular and covered in rubber with tactile bumps. All three hand wheel controls were, however, found to be too stiff and unresponsive. Pleasure associated with the seat recliner control would be much increased if the operation was made smoother, i.e. lighter action and reduced torque. In summary, concerns about styling and harmony represent a fraction of the design issues, indicating that ultimately, functionality is a priority and that if these basic standards are not met, there is no point in providing good aesthetics. The indication
here is that basic usability and requirements for minimal comfort and efficiency are not met and that, therefore, other issues such as harmony and aesthetics are not addressed.

Conclusions

The data gathered in this study support the idea that product pleasurability involves more than usability alone. It is suggested, therefore, that in order to optimise the experience of product use, those involved in user-centred design should look both at and beyond usability, to ensure that products are a positive pleasure to use.

In considering the findings of this study it is important to consider the limitations inherent in the method. The incomplete representation of the semantic space of design attributes in the semantic scales, the problem of false negatives, the diverse descriptions of the same emotion terms, and the fact that reports of emotions depend on an individual’s particular conditioning history all point to the need for systematic follow-up studies (Plutchik, 1962). Such techniques, however, seem to have enormous potential in supporting the design for the creation of products that are a pleasure to use.

Accepting and accommodating such techniques might mean a wider role for the ergonomist in product creation than simply being involved in the design of products, e.g. working with marketing and those involved with the technical aspects of products. It may also mean that ergonomists will have to evaluate a wider range of issues than they have traditionally. There is clearly a desire to articulate issues which have not historically been within the professional field. The objective nature of traditional ergonomics methods has not provided practitioners with a language with which to discuss subjective values at ease; in fact, very little language exists to explore and express such ideas.
There is a need, therefore, within the field of ergonomics, to develop a range of tools which help to discuss and embody more of the aesthetic dimensions of a product, to take account of the sensorial and the cultural, as well as the physical and cognitive, and so to develop that qualitative sense. This exploratory study into a relatively novel concern in the field of ergonomics serves to underline the importance for ergonomists of adopting a more holistic approach when designing products with people in mind.

References

Crampton Smith, G. and Tabor, P. 1996, The role of the artist-designer. In T. Winograd (ed.), Bringing Design to Software, (New York: ACM Press)
Dandavate, U., Sanders, E.B-N. and Stuart, S. 1996, Emotions matter: user empathy in the product development process. In Proceedings of the Human Factors and Ergonomics Society 40th Annual Meeting 1996 (Santa Monica), 415–418
Jindo, T. and Hirasago, K. 1997, Application studies to car interior of Kansei engineering. International Journal of Industrial Ergonomics, 19, 105–114
Jordan, P. and Servaes, M. 1995, Pleasure in product use: beyond usability. In S.A. Robertson (ed.), Contemporary Ergonomics 1995, (Taylor and Francis, London), 341–346
Nagamachi, M. 1995, Kansei Engineering: a new ergonomics consumer-oriented technology for product development. International Journal of Industrial Ergonomics, 15, 3–11
Osgood, C.E., Suci, G.J. and Tannenbaum, P.H. 1957, The Measurement of Meaning, (Urbana, IL: University of Illinois Press)
Plutchik, R. 1962, The Psychology and Biology of Emotions, (Harper-Collins, London)
Taylor, A.J. 1999, The relationship between ergonomics and industrial design in new product development. In M.A. Hanson, E.J. Lovesey and S.A. Robertson (eds.), Contemporary Ergonomics 1999, (Taylor and Francis, London)
Seating
SEATING IN THE REAL WORLD
Andrew Baird, Vicky Malyon & Nigel Heaton
Human Applications, 139 Ashby Road, Loughborough LE11 3AD, UK
We have seen significant growth in the office-bound service industries in recent years with many office workers spending longer periods at their desks. We have also seen a subsequent increase in the reported incidence of posture-related musculo-skeletal problems. The issue of chair design has therefore come under the spotlight. With sponsorship from a large (office) furniture manufacturer, interviews and direct observations were carried out to determine how people actually use their chairs and to pose questions about their working posture in light of recent ISO standards and the DSE Regulations. It was found that the majority of people chose not to use the adjustments available. It was concluded that the answer to the problems of seated posture lies not in the provision of ever more sophisticated seating, but in addressing other critical ergonomics issues.

Introduction

As the service sector expands and more employees become office bound, organisations are increasingly realising the importance of seating and office ergonomics. Seven years on from the Health and Safety (Display Screen Equipment) Regulations (1992), the DSE Regs, and in light of the recently published ISO 9241 Part 5, it is time to review how people actually use their chairs in the real world. This paper will outline the variability in seating controls and discuss the use of seats. The paper is based on a study carried out for a large chair manufacturer into how people use their chairs, their preferences and their perceived requirements.

The growth in service industries has been rapid. In addition, technology has advanced to the point at which many jobs can be done just with a PC, such that we rarely have to leave our desks. Consequently, many of us have become Homo sedens (after Mandal, 1981), yet without any awareness of how to sit or the consequences of inappropriate sitting.
Thus, whilst using the PC might be efficient for both our working and social life, it can have serious consequences for our longer term health. The issue is wider than just poor occupational health. As a society we are becoming more litigious. In the past we might have accepted a bad back as natural wear and tear, with no-one to blame; now, people are looking for causes of their discomfort and identifying who might be to blame. This inevitably brings the focus onto the chair. It has long been thought that a seated posture is “best”. One of the earliest pieces of legislation dealing with seating (the Factories Act) stated that if a job could be done sitting it should be. This general feeling
still remains. Even as late as 1992, there is an implicit assumption within the DSE Regulations that we should sit to use the computer and that it is the features of chairs which matter. Manufacturers have recognised that seating is neither simple nor trivial. Many seat designers have provided and promoted a huge range of features. These are supposed to promote and support good posture and ensure compliance with relevant legislation.

The Legal Context

It is not only the manufacturers who have obligations to provide appropriate equipment. Every ‘user’ of display screen equipment and every ‘workstation’ (whether it is used by a ‘user’ or not) should be provided with a suitable chair. The chair, per se, must meet the minimum requirements laid down in the Schedule to the DSE Regulations. The user must also be provided with the appropriate training, information and supervision. The Regulations provide a framework aimed at helping employers to ensure that they are providing the best situation for their employees “as far as is reasonably practicable”. The Schedule to the DSE Regs places mandatory requirements on organisations. It states that the work chair should be stable, provide easy freedom of movement and a comfortable position. It specifically states that the seat should be adjustable in height, the seat back should be adjustable in height and tilt and that a footrest should be provided to any user who wants one.

The Standards Context

EN ISO 9241:1999 is a multi-part International Standard related to the use of visual display terminals. Part 5 was produced in March 1999 and came into effect on 15th July 1999. It concerns workstation layout and postural requirements. A number of issues are addressed by this part of the standard. These include the philosophy that work organisation, job content and furniture design should encourage user movement, ensuring that prolonged static posture is minimised and that voluntary adjustments in posture can be made.
More specifically the standard includes the following requirements with regards to chairs and posture in particular:
• the seat should be adjustable in height where the appropriate height is that of the popliteal height plus the thickness of the footwear
• seat depth should be adjustable or suitable for the intended user population
• seat width should be wider than the width of the hips
• seat angle should allow users to vary their posture forward or rearward
• seat pan and back support should be independently adjustable
• castors should allow freedom of movement over short distances
• ability to swivel should allow users to rotate their body without rotating their spine or twisting the torso
• back rest should provide support for the back in all sitting positions and particularly for the lumbar region
• arm support should not restrict the preferred working posture or ease of access to the workplace

ISO 9241–5 also states that users should be informed why and how the furniture and other devices should be adjusted and that it is desirable to design furniture to minimise the need for training and information.
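The seat height requirement listed above (popliteal height plus the thickness of the footwear) can be illustrated with a short calculation. The following is only a sketch: the function names, the 25 mm footwear allowance and the popliteal heights are assumptions chosen for illustration, and none of the figures are taken from ISO 9241–5 or from the study.

```python
# Illustrative sketch of the seat height rule: popliteal height plus footwear.
# All numeric values below are hypothetical, not figures from the standard.

DEFAULT_FOOTWEAR_MM = 25.0  # assumed footwear allowance, for illustration only


def required_seat_height_mm(popliteal_height_mm, footwear_mm=DEFAULT_FOOTWEAR_MM):
    """Return the seat height implied by the rule for a single user."""
    return popliteal_height_mm + footwear_mm


def required_adjustment_range_mm(popliteal_heights_mm, footwear_mm=DEFAULT_FOOTWEAR_MM):
    """Return the (lowest, highest) seat height needed to suit a group of users."""
    heights = [required_seat_height_mm(p, footwear_mm) for p in popliteal_heights_mm]
    return min(heights), max(heights)


def chair_suits_user(chair_min_mm, chair_max_mm, popliteal_height_mm,
                     footwear_mm=DEFAULT_FOOTWEAR_MM):
    """Check whether a chair's height adjustment range covers a user's target height."""
    target = required_seat_height_mm(popliteal_height_mm, footwear_mm)
    return chair_min_mm <= target <= chair_max_mm


if __name__ == "__main__":
    users_popliteal_mm = [370.0, 415.0, 460.0]  # hypothetical users
    print(required_adjustment_range_mm(users_popliteal_mm))
    print(chair_suits_user(420.0, 510.0, 370.0))
```

A check of this kind only addresses seat height; the other requirements in the list (seat depth, width, back support and so on) would need their own anthropometric comparisons.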
The Use of Chairs

We were approached by a large chair manufacturer interested in knowing how chairs are really used. They wanted to know what sort of features users like, make use of, understand the point of, etc. and to understand how users actually sit whilst working. We had evidence that the quality of seating provision, particularly amongst large organisations, had significantly increased in recent years. However, as recently as 1999 musculo-skeletal disorders were highlighted as a major source of occupational ill health within Europe, with many of the sufferers employed in sedentary work.

The Study

Some manufacturers appear to be of the opinion that increased functionality is required by users. We were concerned that, rather like the development of pocket calculators and advanced telephone systems, the functionality might be there without actually meeting user requirements. How many of a chair’s features are redundant from a user’s perspective and how important is this? We determined to study users in a variety of work contexts, to review their actual use of the chair and to determine the real issues associated with the provision of seating.

The study was carried out using a combination of interviews and observations. Twenty-five participants from 6 different organisations were studied. A one hour video recording was taken of each participant during their normal working day. These recordings were analysed by identifying the posture adopted (against a simple set of six options), the task being performed and the posture duration, in the hope of providing an understanding of how frequently postures and tasks are changed and exactly how chairs are utilised. Questionnaires were also completed in order to establish how the users felt about their chair, how frequently they made adjustments and their knowledge about the adjustments.
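The video analysis described above (coding posture, task and posture duration) could be tallied along the following lines. This is a minimal sketch rather than the authors' procedure: the `Episode` record, the `summarise` function, the posture and task labels and the timings are all hypothetical, since the paper does not list its six posture options.

```python
# Sketch of tallying coded video observations: time per posture, time per
# task/posture pairing, and the number of posture changes in a recording.
# Labels and timings are invented for illustration.
from collections import defaultdict
from typing import NamedTuple


class Episode(NamedTuple):
    start_s: int   # start of the coded episode, in seconds from recording start
    end_s: int     # end of the coded episode, in seconds
    posture: str   # one of the coded posture options (hypothetical labels here)
    task: str      # task being performed during the episode


def summarise(episodes):
    """Return (time per posture, time per (task, posture), posture changes)."""
    time_by_posture = defaultdict(int)
    time_by_task_posture = defaultdict(int)
    changes = 0
    previous_posture = None
    for ep in sorted(episodes):  # tuples sort by start time first
        duration = ep.end_s - ep.start_s
        time_by_posture[ep.posture] += duration
        time_by_task_posture[(ep.task, ep.posture)] += duration
        if previous_posture is not None and ep.posture != previous_posture:
            changes += 1
        previous_posture = ep.posture
    return dict(time_by_posture), dict(time_by_task_posture), changes


if __name__ == "__main__":
    recording = [
        Episode(0, 600, "upright_supported", "typing"),
        Episode(600, 780, "leaning_forward", "telephone"),
        Episode(780, 1500, "upright_supported", "typing"),
        Episode(1500, 1800, "reclined", "reading"),
    ]
    by_posture, by_task_posture, n_changes = summarise(recording)
    print(by_posture)
    print(by_task_posture)
    print(n_changes)
```

Keeping the task alongside each posture episode is what would allow any association between task and posture to be examined, rather than only the overall time spent in each posture.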
Findings

Once the data had been analysed we drew a number of conclusions based on the general findings about chair use and more specific comments about the features available. It was found that 64% of participants spent over 70% of their working day sitting, which served to emphasise the importance of adopting an appropriate range of postures. One of the most surprising findings of the study was that over a quarter of participants had more than one desk but only one chair, which meant that they had to physically move their chair. ‘Hotdesking’, i.e. a desk and chair used by a number of people, was also common, e.g. an Internet machine which is used by a number of people each day. It was found that desks or chairs that were shared were generally of poorer quality.

The majority of participants were engaged in a number of tasks which, even within a one hour period, involved them changing posture. More importantly, it was noticed that there was a correlation between the task and the posture adopted whilst carrying it out. For example, whilst using the telephone the majority of participants leant forward and leant on the desk, taking their back away from the back support and losing lumbar lordosis.

Many of the participants had chairs with a great number of features; however, most features were not used. Users reported that on initial receipt of a chair they would make adjustments as necessary, but usually just seat height. After that the adjustments were rarely used unless they had been noticeably changed by somebody else. Approximately half of the participants had chairs with armrests, and although some reported their importance others stated they were a hindrance. It was recognised that inappropriate arm rests can prevent good posture.
Several participants had seat depth and seat tilt adjustment but only one participant reported using the feature more than once. This could either be interpreted as the adjustment not being necessary or that the users were not aware of how or when to make such adjustments. Footrest use was high and it was noted that the provision of such equipment was good and, if anything, over prescribed. Another interesting result was that people do not question their chair set-up even when they are experiencing discomfort! Indeed, users typically ask for a “better” chair before exploring the options on the existing chair.

Conclusion

Sophisticated seating is not the simple answer to improved working postures. Instead, a number of related issues need to be addressed. A multitude of features are of no benefit unless the user is aware of how to use them, which highlights the importance of education and training. Manufacturers must either provide simple instructions or, preferably, more obvious functionality. However, this does not ensure that the user actually utilises them. Most importantly, emphasis must be placed on encouragement of postural adjustment both in terms of seating adjustments and movement away from the workstation. This can be either by physical reminders, incorporated as part of a daily routine, or as a result of educating the user in how and why adjustments should be made.

Within procurement, user trialling should be used to ensure that the proposed seating option genuinely meets both user and task requirements and is compatible with other workstation elements. The impact of working postures must not be seen simply as a function of furniture provision and use. The duration and nature of adopted postures will have a direct link with task design and work allocation. Regulation 4 of the DSE Regs makes it an explicit requirement to manage users’ work routines. This is not simply to limit time at the keyboard, but time at the workstation generally.
Ultimately, appropriate use of a chair relies on a number of factors which need to be considered and understood by the user and promoted, managed or brought to their attention by their employer (line manager). Not only is this sound ergonomics, it is the law! To return to the functionality of the chair itself, limited adjustment may be preferable until such time as users are capable of using more sophisticated features (i.e. until general awareness improves). This reflects not only on the need for education within the workplace, but also in schools, etc. In an ideal world we would have infinitely adaptable chairs utilised by fully aware workers, but in reality a compromise must be sought. Employers must ensure that costly features are not simply redundant and that users are actually able to utilise them. When making purchasing decisions it may well make sense to look at higher quality, lower specification chairs than multi-featured but lower quality chairs at the same price point. There is a school of thought that cheap, less comfortable chairs may actually be beneficial in reducing static load by encouraging fidgeting. We would dispute this however as discomfort leads to postural and other behavioural changes, together with muscle tension which can lead to musculo-skeletal problems. It would seem that a change of approach is required on seating in general. Given that users do not utilise what they currently have, the answer is clearly not ever more sophisticated furniture. It is imperative that the users of the chairs are not only aware of the implications of good posture but are also involved in seating selection (via controlled user trials) and training in chair use. Posture related MSDs will continue as long as we tolerate anatomically poor postures and the prolonged static postures often associated with modern office work. 
Education is vital, but it must be backed up with effective job design and appropriate supervision both on a day-to-day level and within the risk assessment process. Human Applications acknowledge the co-operation of Steelcase Strafor in the production of this paper.
References

Health and Safety (Display Screen Equipment) Regulations 1992. Guidance. HMSO. ISBN 0 11 886331 2
Management of Health and Safety at Work Regulations 1992. Approved Code of Practice. HMSO. ISBN 0 11 886330 4
ISO 9241, Ergonomic requirements for office work with visual display terminals (VDTs)
Mandal, A.C. (1981) The Seated Man (Homo sedens). The seated work position, theory and practice, Applied Ergonomics, 12, 19–26
DRIVERS’ SPINAL RESPONSES TO THE EFFECTS OF SITTING POSTURE
Tina J.Hadley* & Christine M.Haslegrave
Institute for Occupational Ergonomics, University of Nottingham, Nottingham NG7 2RD
*Now at Sandwell Healthcare NHST, West Bromwich, B71 4HU
Back pain in drivers remains a problem despite apparent advances in vehicle seat design and back care advice for the use of a lumbar support to maintain good spinal posture. Spinal responses to three different sitting postures were investigated, to study the effects of lumbar support compared with upright and slouched postures. Changes of the upper spine posture (in the sagittal plane) were recorded along with some preliminary measurements of stature change using a seated precision stadiometer. The results showed that the posture of the neck and upper thorax altered with sitting posture. The locations of points on the cervical spine (external occipital protuberance, fourth cervical vertebra and first thoracic vertebra) were significantly higher and further rearward when sitting with a lumbar support than when sitting slouched. The findings have implications for the positioning of headrests, vehicle seating development and spinal health.

Introduction

Epidemiological studies have shown that vehicle drivers have a high prevalence of low back pain, and one of the main factors associated with this is a poor sitting posture held for long durations and its influence on the spinal structures. Protecting the seated spine by using a lumbar support to assist the maintenance of good spinal posture is based upon several suggested biophysical properties of the spine. This is not a new concept and has long been the focus for seating development and evaluation.

Keegan (1953) investigated alterations of the lumbar curve (lordosis) in relation to posture and seating. This was reported to change when extension of the knees caused a pull on tight hamstring muscles, which subsequently rotated the pelvis posteriorly, flexing the lumbar spine and flattening the lumbar lordosis. This was said to cause anterior wedging of intervertebral discs, which produced a posterior shift of the internal disc material (the nucleus pulposus).
It was postulated that this posteriorly bulging disc material would place pressure upon pain sensitive structures in the low back resulting in low back pain. The use of a lumbar support to prevent the loss of the lumbar lordosis in the seated posture was subsequently recommended. Intradiscal pressure changes were measured in living subjects, in a series of investigations by Andersson et al (1974b); higher pressures were reported in sitting than in standing postures. Pressures were also higher in the straight or kyphotic postures than in the physiologically healthy lordotic posture. This was attributed to the deformation of the disc by flattening of the lumbar lordosis on sitting, together with an increase in the trunk load moment as the pelvis was rotated backwards and the lumbar spine and trunk rotated forwards.
Andersson et al (1974) also found that lumbar supports of depth 5cm at the level of the third lumbar vertebra demonstrably reduced intradiscal pressure, reportedly due to the support changing the posture of the lumbar spine towards a lordosis and reducing the deformation of the lumbar discs. In further radiographic studies, Andersson et al (1979) showed that lumbar support up to the experimental limit of 4cm made the lumbar curve closely resemble that of the standing posture.
Therefore, historically, the purpose of providing lumbar support has been to maintain the natural lumbar lordosis by preventing the spine from flexing excessively, and its development was initially influenced by physiological and biomechanical studies of loading on the lumbar spine.
More recent studies of seating and lumbar support have investigated the preferences and behaviour of the sitter. Porter and Norris (1987) developed a standardised method for recording the external profile of the spine, whilst sitting or standing, to investigate the effects of posture and seat design on lumbar lordosis. They found that, when a lumbar support was adjusted for individual comfort, the lordosis was only half that when standing. This seems to conflict with the research findings already discussed, which indicate that lumbar support should be designed to minimise loss of lordosis on sitting. However, the assumption that perceived comfort is associated with a healthy posture, and a consequential lack of tissue loading, damage or pathological changes, is not necessarily true. To the authors’ knowledge, humans cannot perceive intradiscal pressure or microtrauma to the annulus.
A further problem is whether drivers actually use lumbar supports as intended by their designers. Incorrect use may occur through lack of knowledge on the driver’s part, but Reed et al (1995) considered whether it might also be due to the design of the lumbar support contour.
They suggested that lumbar supports alter the lumbar spinal contour by only approximately one-third of the depth of the support, due to the prominence of the support causing the subject to sit forward on the seat. McIlwraith (1996) conducted a similar study, investigating the loss of the lumbar curve on sitting in the driving seat of a car. A loss of the lordosis was demonstrated for each subject and it was noted that the position of the lumbar support did not appear to coincide with the lumbar curve in any of the subjects. It was concluded that the seat did not provide adequate support. However, in addition to improved design of the contour, greater adjustability might address this problem. Therapists have long recommended the use of the portable lumbar support as an adjunct to back pain treatment.
In summary, research evidence provides conflicting guidance for the development and evaluation of vehicle seating and lumbar support. Recent investigations have supported the use of subject comfort ratings and based recommendations upon subject preferences, but these can conflict with the earlier biophysical recommendations. Other gaps exist in the literature and one of these is the effect of altering the posture of the lumbar spine on the rest of the trunk posture, and specifically on the upper spine. This led to the present investigation, with the aim of determining the effect of introducing lumbar support on the posture of the neck and shoulders. Some preliminary measurements were also made at the same time of spinal shrinkage when seated for a period of time, in order to evaluate the change in spinal loading due to the lumbar support, but the full results are not reported here.
Method
Subjects
Six healthy males free from a history of low back or neck pain volunteered to participate in the investigation. Their mean age was 23.2 years (range 22–25 years), mean height 1828.3mm (range 1770–1930mm), and mean weight 77.7kg (range 68.2–99.2kg).
Figure 1 Driving simulator with postural alignment measuring tool attached
Procedure
The experiment was conducted in a driving simulator which incorporated a seated precision stadiometer, as shown in Figure 1. The seat back was positioned at 100° to a horizontal wooden seat. Lumbar support was provided by an Original McKenzie Lumbar Roll of 10cm diameter, which was attached to the seat back at the level of the third lumbar vertebra.
The effects on the upper spine of sitting for 30 minutes were measured by means of a specially developed tool with three postural alignment measurement probes, which were used to locate three anatomical reference points on the cervical spine: the external occipital protuberance (EOP), and the spinous processes of the fourth cervical (C4) and first thoracic (T1) vertebrae. The device was easily detachable and adjustable in the planes parallel and perpendicular to the seat back. The measurement accuracy of the tool was within 0.2cm horizontally and 0.5cm vertically.
At the start of each trial, the subject was asked to sit with their thighs and lower legs in a standardised position. Three different trunk postures were adopted for the three experimental conditions: (a) sitting “upright”, well back in the seat and supported by the seat back, with a hip angle of 100°, (b) sitting slouched with a hip angle of 120°, and (c) sitting well back in the seat, with a hip angle of 90° and with a lumbar support. During the trials reported here, the subjects sat with their hands on their thighs and were asked to look straight ahead. They did not simulate the actions of driving. During the 30 minute trial, measurements of the cervical spinal alignment (locations of the three reference points EOP, C4 and T1) and of stature change were taken at ten minute intervals.
Results
The postural alignment results are given in Figure 2, where locations of EOP, C4 and T1 are indicated and show how the cervical posture (in the sagittal plane) changed with lumbar curvature.
All subjects showed similar patterns of change in posture: the neck posture, when sitting upright, was intermediate between the postures when sitting slouched and when provided with the lumbar support. The change, however, was greater for some subjects than for others (and was smallest for subject 5).
Figure 2 Postural alignments for each subject in the experimental conditions
Statistical analysis (one-way related analysis of variance and post hoc Newman-Keuls test) revealed that the location of EOP was significantly higher (p≤0.01) when sitting with the lumbar support than when slouched or upright in the seat alone (by 3.9cm and 2.1cm respectively). Additionally, C4 was significantly higher when sitting with a lumbar support or upright than when sitting slouched (by 2.7cm and 1.8cm respectively). The vertical location of T1 was less affected by the lumbar postural changes (being 1.5cm higher with lumbar support or sitting upright). With respect to the fore/aft locations, EOP, C4 and T1 were all significantly further forward from the seat back when sitting slouched than when sitting with a lumbar support (by 3.8cm to 4.0cm).
Discussion
The fact that only small changes occurred in the vertical location of T1 suggests that, in response to lumbar spine changes, vertical changes occur in isolation in the cervical spine and head position. The changes in fore/aft location could be explained by the forward displacement of the upper spine in the slouched posture being an attempt to balance the spinal/trunk posture. The more upright alignment observed with the use of the lumbar
support could be due to its tendency towards normalisation of the spinal curves so that compensatory forward displacement is unnecessary. The improvement in cervical spine posture on using the lumbar support may be important for cervical spine health and perhaps for longer-term degenerative changes. It needs to be considered when designing headrests intended to protect the seat occupant against injuries such as whiplash. The addition of lumbar supports to vehicle seating and the consequent effect on seated height of the occupant may have implications, particularly for taller subjects who might then have to sit more slouched due to inadequate headroom or require different adjustments to achieve a good view through the driving mirrors.
A few comments should be added about the results obtained for individual subjects. Subject 5 was found to gain little benefit from the lumbar support (as seen in Figure 2) because his weight (99kg) almost fully compressed it. This may be a more common problem than is realised for support provided in vehicle seats. Subject 3 by contrast never lost his lumbar lordosis because of very tight paraspinal musculature. It is interesting to note, from the stadiometer measurements, that he also experienced very little stature change in comparison to other subjects. It is possible that this may be one of the reasons for the high individual variability which has been found in stadiometer measurements.
Conclusions
The addition of a lumbar support, to maintain lumbar lordosis when sitting, appears to affect the posture of the upper spine, increasing EOP height (and so probably eye and head height) by around 3.9cm and moving it some 3.8cm rearward from the position in a slouched posture. This may have important consequences in vehicle design with respect to headroom, positioning of headrests and visual field. It is also a healthier neck posture.
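As an illustration of the one-way related (repeated-measures) analysis of variance used in the Results, the following is a minimal Python sketch of how the F ratio for the three postures could be computed. The function name and the EOP heights are hypothetical illustrative values, not the study's measurements, and the post hoc Newman-Keuls test is not implemented. With six subjects and three postures the degrees of freedom would be 2 and 10.

```python
def repeated_measures_anova(data):
    """One-way related ANOVA.

    data: one row per subject, one value per condition.
    Returns (F, df_conditions, df_error).
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions (postures)
    grand = sum(sum(row) for row in data) / (n * k)

    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total sum of squares into conditions, subjects and error.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error


# Hypothetical EOP heights (cm): columns are slouched, upright, lumbar support.
eop_height = [
    [60.1, 62.0, 64.2],
    [58.4, 60.1, 62.0],
    [61.0, 62.9, 65.1],
    [59.2, 61.3, 63.0],
    [62.5, 63.0, 63.4],
    [57.8, 59.5, 61.9],
]

f_ratio, df_cond, df_error = repeated_measures_anova(eop_height)
print(df_cond, df_error)  # prints "2 10"
```

Removing the between-subject sum of squares from the error term is what distinguishes the related design from an unrelated one: consistent differences between subjects in overall seated height do not mask the effect of posture.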
References
Andersson, G.B.J., et al., 1974, Lumbar disc pressure and myoelectric back muscle activity during sitting IV. Scandinavian Journal of Rehabilitation Medicine, 6, 128–133
Andersson, G.B.J., et al., 1974b, Lumbar disc pressure and myoelectric back muscle activity during sitting I. Studies on an Experimental Chair. Scandinavian Journal of Rehabilitation Medicine, 3, 104–114
Andersson, G.B.J., et al., 1979, The Influence of backrest inclination and lumbar support on lumbar lordosis. Spine, 4, 1, 52–58
Jafry, T., and Haslegrave, C.M., 1992, The development of a precision seated stadiometer for measuring the effects of vibration on the human spine. In Contemporary Ergonomics 1992, (Ed. E.J.Lovesey), Taylor and Francis: London, 79–83
Keegan, J.J., 1953, Alterations of the lumbar curve related to posture and seating. The Journal of Bone and Joint Surgery, 35A, 3, 589–603
McIlwraith, B., 1996, Loss of the lumbar curve in the driving seat: a twenty person study. British Osteopathic Journal, XIX, 19–23
Porter, J.M., and Norris, B.J., 1987, The effects of posture and seat design on lumbar lordosis. In Contemporary Ergonomics ’87, (Ed. E.D.Megaw), Taylor & Francis: London, 191–196
Reed, M.P., et al., 1995, Some effects of lumbar support contour on driver seated posture, SAE Paper No. 950141, Society of Automotive Engineers: Warrendale, 9–20
THE INFLUENCE OF AUTOMOBILE SEAT BACKREST ANGLE AND LUMBAR SUPPORT ON LOW BACK MUSCLE ACTIVITY
Mike Kolich, Salem M.Taboun & Ali I.Mohamed
University of Windsor, Department of Industrial and Manufacturing Systems Engineering, Windsor, Ontario, Canada N9B 3P4
Six male subjects volunteered for a study into the effects of automobile seat backrest angle (110° and 120°) and lumbar support prominence (0 mm and 50 mm). There were 2×2 possible factor-level combinations in this experimental design. Each subject participated in each experimental session twice. Each session lasted 1 hr. The root mean square (RMS) variation of the EMG was used to assess the stress imposed on the low back musculature. The dependent variable was the change in RMS (∆RMS) over time. By definition, the ∆RMS value becomes more positive as low back muscle activity decreases. Backrest angle was found to have a statistically significant main effect (p<.05). For the selected vehicle package, a 120° backrest angle was optimal. Lumbar support prominence was not found to affect low back muscle activity.
Introduction
In the context of automotive seating, it is rather obvious that traditional lumbar support recommendations are failing the consumer. To combat this problem, new features are constantly being developed to address the muscle activity common in sitting postures. Massaging lumbar mechanisms are an example. Backrest angle and lumbar support prominence are two factors that, independent of feature, affect the occupant.
Andersson et al (1974) found that an increase in automobile seat backrest angle was accompanied by a decrease in myoelectric activity. The explanation is simple. When the backrest angle is increased, a larger proportion of the occupant’s body mass is transferred to the backrest and thus the stress on the back musculature is reduced. Even though this rationale is fairly well understood, there is, to date, no universally accepted research that definitively outlines an optimal backrest angle. Vehicle package is, obviously, the limiting factor. More specifically, the backrest angle is restricted by the need for a good field of view.
That is, the eyes must be suitably placed in relation to the automobile body so that vision is not obscured. When the backrest angle is too large the head must be flexed to enable the driver to see the road.
The appropriate design of a lumbar support, in terms of prominence, is one of the most widely discussed issues in the ergonomics of seating. A lumbar support is a structure that contacts the lower back in the area of the lumbar spine during sitting. In traditional automotive seats, the lumbar support is integrated into the backrest contour. The general purpose of the lumbar support is to stabilize the occupant’s torso and, thereby, improve postural stability. This is accomplished by restricting the rearward rotation of the pelvis that normally accompanies sitting while at the same time reducing flexion (forward bending) of the lumbar spine. Rearward rotation leading to flexion causes the lumbar spine to move from lordosis towards kyphosis.
Automobile seat designers have, for a long time, attempted to preserve or induce, to the extent possible, a lordotic lumbar spine curvature by providing a firm, longitudinally convex lumbar support in the lower part of the backrest. The deflected contour of such a support, based on general design practice, should mate with the lordosis of the occupant’s lower back, providing relatively even contact pressure behind the pelvis and lumbar spine. Conventional design wisdom states that if the design of the lumbar contour does not induce lordosis, there is often a mismatch between the occupant’s back and the seat. According to Reed et al (1991) this mismatch may produce uncomfortable pressure concentrations or a lack of support in the lower levels of the lumbar spine (i.e., the region where discomfort is most frequently reported). In addition to creating discomfort, it is also possible to infer that this mismatch may lead to increased muscle activity.
By the mid-1970s, most lumbar support recommendations were strongly influenced by physiological studies of the load on the lumbar spine. Andersson et al (1974) found the lowest level of myoelectric activity with an automobile seat lumbar support prominence of 50 mm. Based on the assumption that low myoelectric activity is favorable, Andersson et al (1974) recommended a lumbar support prominence of 50 mm.
In view of this body of work, one might question the need for further research into lumbar support design. However, some recent investigations have suggested that current lumbar support recommendations based on physiological considerations do not adequately take into account the behavior of the occupant in the driving environment (Reed et al, 1991).
As an example, Porter and Norris (1987), noting that the lumbar support specifications in the literature are based primarily on physiological rationales, constructed a wooden laboratory seat to compare the lumbar support specifications recommended by Andersson et al (1974) with occupant preferences. Porter and Norris (1987) found that people preferred postures with substantially less lordosis (i.e., 20 mm). More drastically, some researchers have even questioned whether a lordotic lumbar spine posture is desirable when seated. Adams and Hutton (1985) argue that the advantages of a flexed spine posture outweigh the disadvantages. They cite increased transport of disc metabolites with changing pressure levels as a factor in favor of flexed-spine postures.
In summary, questions have started to surface regarding the role of lumbar support in automotive seating. With the quantity and quality of research done in the area of automobile seat backrest design, the lack of consensus is surprising. This study was conducted with the purpose of attempting to establish, for a specific vehicle package and experimental protocol, the most advantageous combination of backrest angle and lumbar support prominence (assuming that low myoelectric activity is favorable).
Method
Experimental set-up
In order to investigate the effect of backrest angle and lumbar prominence on low back muscle activity, six healthy male subjects volunteered to sit (for a series of one hour sessions) in an experimental, luxury-level automobile seat (leather trim and power adjusters) that was mounted on a wooden base. The experimentation was spread over a period of a few months. At the beginning of the experiment, each subject signed a consent form to indicate that he did not have any musculoskeletal disorders (particularly with regard to the lower back) that would make participation in the study inadvisable.
The muscle group of interest was the erector spinae (sacrospinalis).
This muscle group stretches from the sacrum to the base of the skull. Since it is the most superficial muscle of the back, it is best suited to surface EMG evaluation methodologies.
The erector spinae was targeted by placing six pairs of 10 mm diameter bipolar surface electrodes at the L3, L4, and L5 levels on the right and left sides of the subject’s back at a distance of approximately three centimetres from the centre of the spine. Each pair of electrodes corresponded to a channel. The exact attachment sites were determined based on the level of the palpable part of the spinous processes. To ensure that the EMG signal was free from noise, the attachment sites were carefully cleaned. Where hair covered the intended sites, it was first removed. In order to achieve better conductivity, an electrolyte paste was used between the surface of the electrodes and the subject’s back. The electrodes were secured to the subjects using tape.
The subjects were always seated so that the sacrum, lumbar, and thoracic spine contacted the backrest. The subjects were instructed to keep their heads directed forward and to fix their eyes straight ahead. The approximate angles for the ankles, knees, and elbows were 90°, 120°, and 90°, respectively. A cushion angle of 12° was adopted. This set-up is typical of a luxury car package.
Data were collected, from each channel, at 15-minute intervals. Although subjects were asked to refrain from any strenuous physical activity prior to their participation in a particular test session, a reading was not taken at time zero because it was assumed that subjects would arrive with varying levels of muscle activity. In this way, the first 15 minutes of the session (plus the minimal set-up time) were used to stabilize the subject’s muscle activity to some normal, resting level. In summary, data were collected at four distinct times (i.e., the 15, 30, 45, and 60 minute marks).
Experimental design
There were two main factors in this experiment.
They were backrest angle (measured as the angle between the horizontal and the front surface of the backrest) and lumbar support prominence (measured perpendicular to the backrest). The backrest angle was set to two levels: 110° and 120°. The lumbar support prominence was also set to two levels: 0 mm (i.e., flat or full-off) and 50 mm (full-on). The amount of lumbar prominence was varied using an adjustable lumbar support mechanism. As a result, there were four (i.e., 2×2) different experimental conditions. Each subject participated in each condition twice, making this a full factorial, repeated measures design.
Root mean square (RMS) values were used in the analysis. The dependent variable was the difference between the maximum RMS value obtained during the first 30 minutes and the minimum RMS value obtained during the last 30 minutes. This measure will, from this point on, be referred to as ∆RMS. At each time interval the RMS values were averaged across all six channels.
Results and discussion
Demographics and anthropometry
The subjects were from 25 to 35 years of age. The mean standing height was 176.17 cm (SD=4.07) and mean body weight was 79.50 kg (SD=16.16).
Main effects and interaction
A two-factor ANOVA revealed that (1) backrest angle had a statistically significant effect on ∆RMS values [F (1, 44)=5.860, p<.05], (2) lumbar support prominence did not produce a statistically significant effect on ∆RMS values, and (3) there was no statistically significant interaction.
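The ∆RMS measure defined above can be illustrated with a minimal Python sketch. The function names and the RMS values are hypothetical and chosen only for illustration; they are not the study's data. The sketch assumes that the channel average is taken at each time mark before the maximum and minimum are selected. The last function shows that 6 subjects × 4 conditions × 2 repetitions give 48 observations and, for a two-factor ANOVA with four cells, 44 error degrees of freedom, which is consistent with the reported F (1, 44); treating every observation as a separate case in this way is an inference from that value, not something the text states.

```python
from math import sqrt


def rms(samples):
    """Root mean square of a sequence of EMG amplitude samples."""
    return sqrt(sum(x * x for x in samples) / len(samples))


def delta_rms(rms_by_mark):
    """Compute delta-RMS for one session.

    rms_by_mark maps each minute mark (15, 30, 45, 60) to the RMS values
    of the six channels recorded at that mark.
    """
    # Average across channels at each time mark.
    mean_at = {t: sum(ch) / len(ch) for t, ch in rms_by_mark.items()}
    first_half_max = max(mean_at[15], mean_at[30])
    second_half_min = min(mean_at[45], mean_at[60])
    # Positive when muscle activity falls over the session.
    return first_half_max - second_half_min


def error_df(subjects, levels_a, levels_b, repeats):
    """Error degrees of freedom when every observation is a separate case."""
    observations = subjects * levels_a * levels_b * repeats
    cells = levels_a * levels_b
    return observations - cells


# Hypothetical session: six channel RMS values at each time mark.
session = {
    15: [0.010] * 6,
    30: [0.012] * 6,
    45: [0.009] * 6,
    60: [0.011] * 6,
}
print(round(delta_rms(session), 6))  # prints 0.003
print(error_df(6, 2, 2, 2))          # prints 44
```

Because ∆RMS subtracts a late minimum from an early maximum, a session in which erector spinae activity declines yields a larger positive value, which is the sense in which a larger mean ∆RMS indicates a greater reduction in muscle activity.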
In particular, a 120° backrest angle (mean ∆RMS value=0.002740) was found to produce a larger decrease in erector spinae muscle activity over time than a 110° backrest angle (mean ∆RMS value=0.000196).
Explanation of study results
It is acknowledged that, in this investigation, when the backrest angle was increased from 110° to 120° there was a small change in torso, hip, knee, and foot angles. This was accepted because its influence on the results was probably limited. With this said, the decrease in erector spinae muscle activity observed with a 120° backrest angle can be attributed to the increasing transfer of body weight to the backrest. In other words, the amount of support needed to balance the trunk was minimized as part of the body weight was transferred to the back support. This supports previous findings with other automotive seats (Andersson et al, 1974). As previously mentioned, the limiting factor on backrest angle is the need for a good field of view.
The fact that lumbar support prominence did not affect erector spinae muscle activity can be attributed to the influence of the hamstrings. The hamstring muscles connect the pelvis and leg across the knee and hip joints and produce a restriction on pelvis orientation that varies according to knee angle (Stokes and Abery, 1980). When the knees are extended beyond 90°, as was the case in this study, the erect pelvic angle necessary to produce substantial lordosis with a reasonable thoracic orientation is not possible for many occupants without hamstring discomfort. In other words, hamstring tension resulting from the extended knee angle restricted forward pelvis rotation, which reduced the possibility of achieving a substantially lordotic spine posture. As a result, erector spinae muscle activity was relatively unaffected.
The absence of a significant effect of lumbar prominence implies that automobile backrests should be designed for drivers’ preferred postures rather than for postures with a large degree of lordosis, which are typically prescribed. In this context, Reed et al (1995) showed that lordotic lumbar curvatures are not prevalent even when the seat is designed to accommodate them. If this is indeed the case, then the purpose of lumbar supports in automobile seats needs to be reconsidered because the apparent physiological benefits of lumbar lordosis cannot be realized if occupants do not select such postures. The findings from this study suggest that backrests with fixed lumbar supports should provide support for nearly flat spine profiles, rather than for the standing spine curvature typically recommended. Providing a four-way (up-down and in-out) adjustable lumbar support can accommodate those people who prefer to sit with substantial lordosis.
Recommendations for future work
Rather than arbitrarily adopting the backrest angle and lumbar support prominence recommendations of an existing study as control variables in future studies of new lumbar support innovations (which will use the same experimental set-up), it was decided that a separate investigation was warranted. It was felt that this prefatory study would lend credibility to the planned lumbar support research by arriving at backrest angle and lumbar prominence recommendations that can confidently be applied to the selected vehicle package and experimental protocol.
The planned research, using this work as the starting point, will (1) evaluate two different types of lumbar support mechanisms separately with hopes of identifying optimal settings for control system variables, (2) compare the two different types of systems to determine if there is a measurable difference in muscle activity, and (3) compare EMG results to subjective perceptions of comfort.
Acknowledgement
Joe Benson and Gayle Litrichin of Schukra of North America supported this research by providing (1) the necessary funding, (2) the automobile seat with an adjustable lumbar support, and (3) insight, support, and assistance throughout the course of this research.
References
Adams, M.A. and Hutton, W.C. (1985). The effect of posture on the lumbar spine. Journal of Bone and Joint Surgery, 67(4), 625–629.
Andersson, B.J.G., Ortengren, R., Nachemson, A., and Elfstrom, G. (1974). Lumbar disc pressure and myoelectric back muscle activity during sitting. IV. Studies on a car driver’s seat. Scandinavian Journal of Rehabilitation Medicine, 6, 128–133.
Porter, J.M. and Norris, B.J. (1987). The effects of posture and seat design on lumbar lordosis. In E.D.Megaw (ed.). Contemporary Ergonomics (pp. 191–196). Taylor & Francis, New York.
Reed, M.P., Saito, M., Kakishima, Y., Lee, N.S., and Schneider, L.W. (1991). An investigation of driver discomfort and related seat design factors in extended-duration driving. SAE Technical Paper 910117, 1–30.
Reed, M.P., Schneider, L.W., and Eby, A.H. (1995). Some effects of lumbar support contour on driver seated posture. SAE Technical Paper 950141, 9–20.
Stokes, I.A.F. and Abery, J.M. (1980). Influence of the hamstring muscles on lumbar spine curvature in sitting. Spine, 5 (6), 525–528.
Training
AN INVESTIGATION OF THE EFFECT OF NIGHT VISION GOGGLES ON COCKPIT TASK PERFORMANCE
Frederick Lian Kheng Tey, Kee Yong Lim & Yoon Ping Chui
Design Research Centre, School of Mechanical & Production Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798
Night missions in military operations involving high performance vehicles require an increased dependence on night vision devices such as night vision goggles (NVGs). In this operational scenario, an area of human factors concern is the compatibility between the NVG and horizon display with respect to the capabilities of human vision (http://www.alhra.af.mil/alhra/efforts/nvd.html). Landmark work reported by the US Air Force involved subjects looking at an alphanumeric stimulus through an NVG followed by naked eye viewing of a horizon display. This paper reports a follow-up of the work and extends the scope of investigation in several directions; namely the introduction of a more ecologically valid visual stimulus in the NVG view, a wider survey of the implications for other cockpit display devices, a profiling of the effect of elapsed time on viewing performance after transiting between the NVG and cockpit devices, and a study of potential blinding effects in bi-directional transitions between the NVG and cockpit displays.
Background
Military forces around the world have long sought to operate as effectively by night as they do by day. For instance, replenishment of logistics and redeployment of troops may be better conducted under the cover of darkness. Similarly, air strikes are conducted in darkness and at dawn and dusk, to avoid retaliation from enemy interceptors and air defence systems. In addition, it is vital that operational capability should not be limited or hampered by bad weather conditions and long hours of darkness during winter. These operational requirements have motivated the development of night vision devices (NVDs). Today, NVDs can effectively turn night into day. The operational capability of military forces is now extended effectively to cover all 24 hours of the day.
In parallel, advancements in vehicular and avionics technology have introduced faster and more sophisticated systems that are able to operate in the most demanding environment and weather conditions. These developments in advanced technology often imply operation at or near the tolerance limits of a human being. Thus, the human component often constitutes a weak link in the human-machine system. Consequently, with advanced systems, there is an increasingly vital need to address Human Factors during their design, development and implementation. NVDs used in military aircraft to support optimal decision making in the shortest possible time are no exception. For instance, the type of NVD selected (biocular, monocular or binocular) will limit the width of the visual field of the user and may thus affect task performance. Commanders must therefore understand fully both operational and user requirements in order to procure the
Figure 1. A simulation of an NVG view from the cockpit (extracted from http://www.alhra.af.mil/alhra/efforts/nvd.html)
right type of NVD. In the case of pilots, appropriate compatibility between NVDs and the lighting of aircraft instruments, panels, controls, indicators, and displays is essential. The lighting must be compatible over a wide range of ambient conditions, during dawn or dusk transition and at night.
A class of NVDs, namely night vision goggles (NVGs), has been introduced into the cockpit together with other displays (see Figure 1), which have been optimised previously for night operations by naked eye. This operational set-up raises a number of questions concerning compatibility in luminance levels, and so in the visibility of displays, when a pilot transits between looking through photo-amplified images in the NVG and at other cockpit displays with the naked eye. In this respect, little data is available on the implications for visual performance, as past research has focused largely on effects in terms of minutes of adaptation rather than the seconds required by the fighter pilot. Thus, data on these effects is urgently needed to support performance and equipment design assessments.
Rabin and Wiley (1994) studied the transitory effects on visual resolution when switching from a forward-looking infrared (FLIR) device to an NVG, to determine whether display luminance produces an adverse effect on visual resolution when switching from a higher luminance (i.e. the FLIR) to a lower luminance (i.e. NVG) display. A significant reduction in letter recognition was observed in the first second after switching from the FLIR display to a simulated NVG display. With a brighter luminance, the visual loss lasted up to 4 seconds, including a 2X reduction in visual acuity and a 3X reduction in contrast sensitivity. By varying size, contrast, lighting condition and exposure duration, it was possible to estimate the magnitude and duration of visual loss after switching from a very bright FLIR display.
This transitory reduction after switching from a FLIR display to NVGs could interfere with object recognition during critical periods of aircraft control, target acquisition, and firing. It is recommended that large differences in luminance be avoided to optimise visual performance and pilot safety. Since the study was conducted with simulated FLIR and NVG displays, care should be taken when extrapolating the findings to real aviation performance, as the study did not take into account the dynamic imagery experienced in flight. Notwithstanding the limitations of the preceding studies, it is clear that visual performance can be affected by transitions between different NVDs. However, it remains to be determined whether similar effects on visual performance would apply when users transit between NVDs and cockpit displays. Landmark work in this direction has been initiated by the US Air Force (http://www.alhra.af.mil/alhra/efforts/nvd.html). The work involved subjects looking at a static alphanumeric stimulus through an NVG followed by a horizon display with the naked eye. Although the results of this work indicated that visual performance
is affected, it suffers from a number of limitations in respect of ecological validity and the narrow scope of its investigation. This paper reports a follow-up of the work and extends the scope of investigation in several directions, namely:
• the introduction of a more ecologically valid visual stimulus in the NVG view, including both static and moving stimuli that are also more representative of the luminance levels and target shapes and sizes associated with various operational contexts (e.g. terrain contours, and tanks and aircraft during engagement). The introduction of moving stimuli would address in part the need to take into account the dynamic imagery experienced in real flight, so that the findings would be more applicable to actual aviation performance. In this way, some aspects of the pilot’s tasks may be accounted for as appropriate;
• a more comprehensive survey of the implications for cockpit display devices other than the horizon display. Devices that may be assessed include airspeed, angle-of-attack, attitude, direction indicator, revolutions per minute and temperature gauges, as suggested by Pinkus (1988);
• a characterisation of the recovery profile with respect to elapsed time effects on visual performance following a transition between the NVG and cockpit devices. This part of the work is similar to that done by Rabin and Wiley (1994) for transitions between different NVDs; and
• a study of potential blinding effects in bi-directional transitions between the NVG and cockpit displays. The USAF study addressed only the performance effects of a pilot moving from the NVG to the cockpit displays, but not vice versa.
By understanding the performance implications of transiting between NVDs and cockpit devices, a more appropriate set of user requirements for cockpit display devices may then be defined for their re-design or future development.
As the research project is ongoing, the results will be reported at the time of paper presentation.
References
http://www.alhra.af.mil/alhra/efforts/nvd.html
Pinkus A.R., 1988, Night Lighting and Night Vision Goggle Compatibility, NATO Advisory Group for Aerospace Research & Development Lecture Series 156, 7–1/17.
Rabin J., Wiley R., 1994, Switching from Forward-Looking Infrared to Night Vision Goggles: Transitory Effects on Visual Resolution, Aviation, Space, and Environmental Medicine, 65:327–9.
A REDEFINITION OF PERSONAL KNOWLEDGE AND A TESTING METHOD TO IMPLEMENT IT Darwin P.Hunt Human Performance Enhancement, Inc. Executive Center II, 345 North Water Street Las Cruces, New Mexico, 88001 USA
Based upon the traditional concept of knowledge as a correct justified belief, current testing methods in training and education infer that a person does or does not know some subject matter based simply upon whether the person selects the correct answer. However, the correct answer may be given by the test taker due to a guess or by employing strategies designed to select the correct answer without knowing the material. Even worse, an incorrect answer may be due to the person strongly believing that the wrong answer selected is correct. These deficiencies can be remedied by obtaining a self-assessment (SA) by the test takers of their certainty about the correctness of their answers. An analysis of the answers and SA responses allows test takers to be classified on each item as informed, misinformed, uninformed, or guess/test strategy. Such test results are useful in directing training and selecting personnel for critical tasks.
Introduction
This paper discusses the topic of people’s certainty about the correctness of (1) their perception of a situation and (2) their decisions and alternative actions intended to accomplish some purpose, as it is related to knowledge and performance. Our research over 20 years or so on the ability of people to assess the correctness of their own knowledge has led in many directions. For example, if, during learning, you require people to assess the correctness of each of their answers, they learn the material significantly faster than if they simply answer (Hunt, 1982). Also, we and others have observed that 5–20% of the answers about whose correctness people are extremely sure are, in fact, wrong (Shuford, 1993; Hunt and Hassmén, 1997). This latter observation is of especial concern to education and training, which help establish the knowledge base for people’s behavior and performance.
Being sure of something which is incorrect not only produces performance errors on the job, at home and in play, but also creates a foundation which is counterproductive for more advanced learning. Furthermore, the methods now employed in schools, professional licensing examinations, etc., for testing a person’s knowledge allow such wrong-but-sure beliefs to remain concealed. The most widely used testing method is the multiple choice test, in which it is inferred that the person knows (is informed about) or does not know (is uninformed about) the material based only upon whether the person selects the correct answer. Either of these inferences is unacceptably and unnecessarily flawed. The correct answer may be selected by the test taker due to a lucky guess or by employing test taking strategies designed to produce the correct answer without knowing the material in question. And an incorrect answer
may be due to the person being misinformed and strongly believing that the wrong answer selected is correct, which is much worse than being uninformed. As Colton (Seldes, 1985) said more than 170 years ago, “Ignorance is a blank sheet, on which we may write; but error is a scribbled one, from which we must first erase.” Both of the testing deficiencies mentioned above can be remedied by obtaining a self-assessment (SA) by the test takers of the certainty that their answer is correct. Of course, the contingencies (payoffs and costs) of the SA must be arranged so as to encourage the test takers to give as accurate an SA as they can, e.g., they will receive the highest test scores by telling the “truth”. The correctness of the answer along with the SAs allows each item to be placed in one of four categories: correct-and-sure (informed), wrong-but-sure (misinformed), wrong-and-unsure (uninformed), and correct-but-unsure (guess, partially informed, good test strategy). I will discuss, first, the concept of “knowledge” or what it means to know something; then an expanded definition of knowledge which includes the person’s certainty; and finally, a method of certainty testing that employs the expanded definition.
The concept of knowledge
This paper is restricted to what can be called personal knowledge, i.e., what a person knows, in contrast to scientific knowledge. It is important to make this distinction because the processes and rules by which most people acquire their beliefs are quite different from those by which scientific knowledge is acquired. These beliefs provide a person with an orderliness and an ability to predict, e.g., that particular consequences more-or-less regularly follow certain events. It is helpful to consider personal knowledge as a set of beliefs related to how things are perceived, what alternative actions are considered, etc. (Figure 1).
Of course, knowledge is a concept, like gravity. It is not directly measured or observed, but instead is inferred from observing performance on a knowledge test. The concept of knowledge has been discussed for more than 2000 years, leading to the definition that knowledge is a belief that is true (or correct) and justified. A belief is what a learning psychologist would call an association between one thing and another, e.g., between a perceived situation and a response, between an action and the expected results of that action, or between a verbal statement and its inferred meaning. The term ‘true’ is a complex concept (Fernández-Armesto, 1997). To be true means, here, that the belief of a person is in accordance with the way in which the objects, people, processes and events exist and behave in the real world. For purposes of testing in education, training, etc., one can avoid unnecessary complexities by employing the term ‘correct’ rather than ‘true’. To be correct means that there are criteria, or operational definitions, to determine whether a response is correct. Thus, to be called knowledge, the belief must be correct. Otherwise it is an erroneous belief. Furthermore, to be called knowledge, a correct belief must be justified. Exactly what justification or evidence is necessary and sufficient to allow a correct belief to be called knowledge has been a major topic of discussion (largely by philosophers) for more than 2000 years and we will say little about it. For most training purposes we rely on the experience and knowledge of experts on the subject matter being taught to justify its correctness. Plotkin’s (1994) elaboration is helpful: “If I say that I know it is raining, then, for this to be a claim of real and certain knowledge, (1) it must be raining, (2) I must believe it to be raining (merely to say that it is, out of whim, and for it to be raining at the time of whimsy,
would not constitute knowledge that it is raining), and (3) I must be justified in having that true belief. By justify, epistemologists mean that the claim must be justified as reasonable rather than not. For example, I might genuinely believe it to be raining, and it is
Figure 1. An eight-factor representation of personal knowledge
raining, but my belief may be based on what someone else has told me and that person may be none too reliable. I may even know that my informant is sometimes economical with the truth. ‘How do you know that it is raining?’ I am asked. ‘Why,’ I answer, ‘because so-and-so told me.’ ‘Well,’ say the philosophical judges on this matter, ‘it is indeed raining, and you clearly believe it to be so doing, but your informant is unreliable and therefore you are not justified in your claim. You don’t really know with any certainty that it is raining.’” (p. 12).
Knowledge as a correct, justified and certain belief
The relation of a person’s knowledge to a person’s certainty of its correctness has been discussed for centuries by scientists and philosophers (Aristotle, c. 300 B.C. in Auden, 1970; Russell, 1948; Quine, 1987). Even earlier (c. 500 B.C.) Confucius (Streep, 1995) observed: “When you know a thing, to recognize that you know it, and when you do not know a thing, to recognize that you do not know it. That is knowledge.” Indeed, most people infer that a comment, ‘I know it’, means that I am certain of its correctness. The main benefit of including a certainty component in testing is that it allows a determination of (a) whether a correct response indicates that the person knows or does not know the material in question and (b) whether an incorrect response indicates that the person is misinformed or simply uninformed. These relationships are represented in Figure 2. However, there is a problem of how sure a person must be of the correctness of a belief in order for it to qualify as knowledge. Quine (1987) discusses this “boundary” problem more extensively. For practical purposes it seems that there is no single certainty level required to qualify as knowledge; instead, the level may depend on such factors as the costs of being incorrect and the benefits of being correct (Hunt and Hassmén, 1997).
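The four-way classification implied by these relationships can be sketched in a few lines of code. The fragment below is an illustrative sketch only and is not the SACAT implementation: the function name `classify_item`, the 0 to 1 certainty scale and the `sure_threshold` parameter are assumptions introduced here. The threshold is left adjustable because, as noted above, no single certainty level qualifies a belief as knowledge.

```python
from collections import Counter


def classify_item(is_correct: bool, certainty: float, sure_threshold: float = 0.8) -> str:
    """Place one test item in one of the four categories described in the text.

    `certainty` is the test taker's self-assessed certainty on a 0-1 scale.
    `sure_threshold` is a hypothetical cut-off for counting a response as "sure".
    """
    is_sure = certainty >= sure_threshold
    if is_correct and is_sure:
        return "informed"      # correct-and-sure
    if not is_correct and is_sure:
        return "misinformed"   # wrong-but-sure
    if not is_correct:
        return "uninformed"    # wrong-and-unsure
    return "guess"             # correct-but-unsure


# Hypothetical responses: (answer correct?, self-assessed certainty)
responses = [(True, 0.95), (False, 0.90), (False, 0.30), (True, 0.40)]
summary = Counter(classify_item(c, s) for c, s in responses)
print(summary)
```

Raising or lowering `sure_threshold` moves items between the "sure" and "unsure" categories, which is one way to reflect the costs of being incorrect for a particular task.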
Figure 2. The relationship among correctness, sureness and usability (adapted from Hunt and Furustig, 1989)
A method of certainty testing
The common multiple choice test, which is widely used to measure people’s knowledge, has many advantages, which include objectivity, ease and economy of administration and scoring, and reliability. It also has the ability to measure simple and complex knowledge in most content areas at most levels of knowledge. However, the knowledge of a person has more characteristics than are represented by the percentage correct score on a multiple choice test. Getting a correct answer is not enough. We have developed the Self Assessment Computer Analyzed Testing¹ (SACAT) method, which incorporates the component of personal certainty and is aimed at providing test scores which are more representative of the ways in which knowledge influences a person’s everyday performance. Paper-and-pencil SACAT answer sheets and methods for scoring and interpreting the results were developed based on our own and others’ findings so that the test takers mark their answer and then mark “how sure” they are that the answer they selected is correct. These responses are used to calculate a variety of scores including the conventional Percentage Correct Score and a Percentage Self Assessment (SA) Score, which is an index of how accurately the test takers have assessed the correctness of their answers. For the %SA Score the number of points gained (if the answer is correct) or lost (if the answer is wrong) is a logarithmic function of the certainty level marked. The intent is to construct scoring contingencies so that the expected number of points will be maximized by telling the “truth”. Namely, if test takers mark a certainty level that is higher than they feel, they will lose more points when they are wrong than they should; and if test takers mark a certainty that is lower than they feel, they will not get as many points when they are
¹ SACAT is proprietary property and copyrighted by Dr. Darwin P.Hunt, Human Performance Enhancement, Inc., Las Cruces, NM 88001, USA; email: [email protected]; fax: 505–541–8141.
correct as they deserve. A related idea is that if their feelings are poorly calibrated, then the post-test feedback may help the test takers to recalibrate their process. Other test results provided for individual test takers, the group and specific items are the %Correct-and-Sure, %Wrong-but-Sure, %Correct-but-Unsure and %Wrong-and-Unsure. Also, a summary of the test results lists the test items on which the group may be misinformed. Such wrong-but-sure beliefs are the roots of performance errors. So it is important during learning to detect, identify and correct misinformation before it produces errors. At present, wrong-but-sure beliefs are not addressed during training and education; they remain hidden. As a result, the number of performance errors attributable to misinformation is unknown.
References
Auden, W.H. 1970, The portable Greek reader, (The Colonial Press, Clinton, MA)
Fernández-Armesto, F. 1997, Truth, (Bantam Press, London)
Hunt, D.P. 1982, Effects of human self-assessment responding on learning, Journal of Applied Psychology, 67, 75–82
Hunt, D.P. 1993, Human self assessment: Theory and application to learning and testing. In D.Leclercq and J.Bruno (eds.) Item banking: Interactive testing and self-assessment, (Springer-Verlag, Berlin), 177–189
Hunt, D.P. and Furustig, H. 1989, Being informed, being misinformed and disinformation: A human learning and decision making approach, Technical Report PM 56:238, (National Defence Research Institute, Karlstad, Sweden)
Hunt, D.P. and Hassmén, P. 1997, What it means to know something, Report No. 835 (Department of Psychology, Stockholm University)
Plotkin, H.C. 1994, Darwin machines and the nature of knowledge, (Harvard University Press, Cambridge, MA)
Quine, W.V. 1987, Quiddities: An intermittently philosophical dictionary, (The Belknap Press of Harvard University Press, Cambridge, MA)
Russell, B. 1948, Human knowledge: Its scope and limits, (Simon and Schuster, New York)
Seldes, G. 1985, The Great Thoughts, (Ballantine Books, New York)
Shuford, E.H. 1993, In pursuit of fallacy: Resurrecting the penalty. In D.Leclercq and J.Bruno (eds.) Item banking: Interactive testing and self-assessment, (Springer-Verlag, Berlin), 76–98
Streep, P. 1995, Confucius: The wisdom, (Little and Brown, New York)
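The scoring contingency described in the certainty testing method of the preceding paper (points gained or lost as a logarithmic function of the marked certainty, arranged so that telling the "truth" maximises expected points) can be illustrated with a generic logarithmic scoring rule. The paper does not give the SACAT formula, so the rule below is the textbook logarithmic rule for a two-alternative item, assumed purely for illustration; the function names and the 0.50 to 0.99 grid of certainty levels are likewise hypothetical.

```python
import math


def log_points(reported: float, is_correct: bool) -> float:
    """Points for one two-alternative item under a logarithmic rule.

    Reporting 0.5 (a pure guess) scores 0. Higher reported certainty gains
    more when the answer is correct and loses more when it is wrong.
    """
    p = reported if is_correct else 1.0 - reported
    return math.log2(2.0 * p)


def expected_points(reported: float, felt: float) -> float:
    """Expected points when the real chance of being correct is `felt`."""
    return (felt * log_points(reported, True)
            + (1.0 - felt) * log_points(reported, False))


def best_report(felt: float) -> float:
    """Certainty level on a 0.50-0.99 grid that maximises expected points."""
    levels = [i / 100 for i in range(50, 100)]
    return max(levels, key=lambda q: expected_points(q, felt))


# A test taker who is really 80% sure does best by marking 80%.
print(best_report(0.80))
# Overstating certainty is costly: the loss when wrong exceeds the gain when correct.
print(log_points(0.9, True), log_points(0.9, False))
```

Under this assumed rule, marking a certainty higher or lower than the one actually felt always lowers the expected score, which is the property the text asks of the scoring contingencies.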
Visual displays
THE EFFECT OF TWO- & THREE-DIMENSIONAL DISPLAYS ON REMOTE CRANE CONTROL PERFORMANCE Roy Siew Ming Quek, Kee Yong Lim & Yoon Ping Chui Design Research Centre, Nanyang Technological University, School of Mechanical and Production Engineering, Nanyang Avenue, Singapore 639798
A local freight company has implemented a camera-based system to pick up and land containers from a yard onto a prime mover and vice versa. However, a two-dimensional (2-D) camera-based system deprives the operator of the depth perception required for container handling. In particular, the operator may encounter difficulties in judging the speed of container landing, and the distance of the container to the prime mover chassis and to another container. To support depth perception, a three-dimensional (3-D) camera and display system may be developed to enable stereoscopic vision. A mock-up of such a system that includes basic crane controls is being built, to enable an assessment of the efficacy of the proposed 3-D set-up. The assessment will involve performance testing with subjects. The results of the tests will be reported at a later date. This paper reviews the shortcomings of the existing 2-D system and describes the set-up constructed to enable performance tests on the 3-D system proposed.
Background
A local freight company has implemented a new remote controlled camera-based system for container handling. Container handling jobs are allocated successively by a central computer to the operator, whose task is to position and land:
• an empty spreader (i.e. the device used to grab a container) onto a container to be picked up; or
• a loaded spreader onto a prime mover or another container.
To perform these tasks, an operator needs to adjust the position of the spreader with or without a container, and occasionally dampen its sway before lowering it. By removing the operator from physically being in the crane (with its attendant poor working posture and vibrations), the new system promises to improve the working conditions of operators and enhance their efficiency and productivity (see Figure 1).
However, a 2-D camera-based remote control system deprives the operator of the depth perception required for efficient pick-up and landing of containers from a yard onto a prime mover and vice versa. Further, with cameras mounted directly above the spreader, the existing remote control display provides the operator with only a planar view of the 4 corners of the spreader and container (see Figure 2). Although this display may be useful for positioning the container relative to the prime mover chassis, the operator would find it difficult to judge the speed of container landing and the distance of the container to the prime mover chassis
Figure 1. Existing remote crane control workstation (Lim, 1998)
(dispatch operation) and to another container (stacking operation). Close monitoring of these container-handling parameters would be necessary to maximise throughput, and to avoid damage to container contents and/or the prime mover chassis due to an unduly heavy landing of a container. In view of the above considerations, a system that can provide stereoscopic vision should be examined to ascertain its potential for enhancing remote crane control operation. Stereoscopic vision is the ability to judge depth or position of objects accurately in a 3-D field. Humans perceive depth stereoscopically via the optic nerves from the two eyes, which come together at the optic chiasma near the brain (Pedrotti et al., 1998). The slightly different images that fall onto the left and right eyes are sent to the brain, which integrates them into a single image. The fusion of the images is referred to as binocular vision. The slight differences in the images (i.e. binocular disparity; Schiffman, 1996) provide additional information on the depth or distance of an object, and this forms the basis of stereoscopic vision. The working principle of depth perception by the eye may be imitated by camera systems. The difference between a 2-D and a 3-D display system is that the latter uses a pair of cameras to simulate the eyes. A 3-D system set-up also includes a real-time image decoder and a pair of stereoscopically ‘active’ glasses or a ‘passive’ display screen with monitor. The system simulates how the eyes perceive depth by alternately blocking the images projected into each eye synchronously with the monitor. Although a 3-D stereoscopic display system appears to be better for the container-handling task, the extent to which task performance would be enhanced is unclear. Since monocular vision and 2-D display systems also provide indirect cues for depth perception, it is uncertain to what extent operators could adapt to such displays.
In this respect, indirect visual cues that could possibly support some depth perception include parallax, shadows, relative size, and perspective views. Such visual cues are commonly used in computer games and CAD applications. To better argue a case for a 3-D system, it is therefore vital to establish the extent to which such a system could enhance operator performance. An experimental rig is being constructed to collect data for such an assessment.
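The way binocular disparity encodes distance can be made concrete with the standard pinhole model of a rectified two-camera rig, in which depth equals focal length times camera separation divided by disparity. This model and every number below (focal length in pixels, camera separation, disparities) are assumptions chosen for illustration; they do not describe the cameras used in the mock-up.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Distance in metres to a point seen by two parallel, rectified cameras.

    Z = f * B / d, where f is the focal length in pixels, B the separation
    between the cameras in metres and d the horizontal disparity in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 800 px focal length, cameras 0.10 m apart.
for disparity in (40.0, 20.0, 10.0):
    print(disparity, depth_from_disparity(800.0, 0.10, disparity))
```

Halving the disparity doubles the estimated distance, so in this model a fixed disparity error matters more for far objects than for near ones.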
Figure 2. Planar display of the 4 corners of the spreader and container
Mock-Up of a Stereoscopic Crane Control System
To assess the effect of the loss of stereoscopic vision on the performance of an operator, a crane model system comprising hardware and software components needs to be developed. The set-up includes a mock-up crane control rig with two pairs of stereoscopic cameras, one located above the spreader to support positioning and another located at the top of the crane to provide an overview of the working platform. For reasons associated with the real operation scheme and transfer of learning effects, the remote controls in the mock-up follow the layout of manually operated cranes. In particular, critical controls (joysticks) for hoisting and lowering the spreader are located on the right, while controls to move the crane chassis are located on the left. To simulate remote control, test subjects are located in a room separated from the experiment rig, and crane control interactions are performed via the 2-D or 3-D display and control systems. The performance factors examined include the:
• time taken to land a container,
• accuracy of container landing,
• impact force of container landing.
Subject performance under different lighting levels, camera positions and container heights will be investigated for both 2-D and 3-D systems. As investigations are ongoing, the results will be reported at the time of paper presentation.
References
Lim T.C.K., 1998, Human Factors Study of Manual and Automated Yard Cranes, Final Year Project Report, School of Mechanical and Production Engineering, Nanyang Technological University.
Pedrotti L.S., Pedrotti F.L., et al., 1998, Optics and Vision, Prentice Hall International, Inc. 194–223.
Schiffman H.R., 1996, Sensation and Perception, An Integrated Approach, 4th edition, John Wiley & Sons, Inc. 215–245.
AIRPORT BAGGAGE INSPECTION—JUST ANOTHER X-RAY IMAGE? Alastair G.Gale1, Mark D.Mugglestone1, Kevin J.Purdy1 & Andrew McClumpha2
1 Institute of Behavioural Sciences, University of Derby, Derby DE22 3HL, UK
2 Centre for Human Sciences, DERA, Farnborough, Hants, UK
Studies of observer performance in examining X-ray images have led to a conceptual model which has been used successfully to understand diagnostic errors. The model stresses the three processes of visual search, detection and interpretation with most errors being due to the latter two factors. A similar inspection situation would appear to be that of the airport security screener who examines X-ray images of passenger baggage. There is, however, little overlap between the two areas in terms of research. An initial study is reported on baggage inspection, using brief image presentations, to examine the applicability of the medical model to this domain.
Introduction
Medical imaging involves producing a two-dimensional image of some part of the three-dimensional human form. Studies of observer performance in medical imaging have led to a theoretical model (Gale, 1993) stressing the importance of an initial fast global inspection stage followed by sequential detailed visual search of particular image areas. The processes of search, detection and interpretation as separate sources of observer error have been identified. Detailed eye movement research has demonstrated that, whilst visual search is an essential component of the medical image inspection situation, most diagnostic errors arise due to errors of detection and interpretation (Kundel et al., 1978). The delineation of errors into these three classes is based upon determining whether an observer has fixated sufficiently close to, or on, a particular potential target area (i.e. such that the target has fallen within the observer’s useful field of view, UFOV). Each fixation near a potential target, which the observer has failed to identify, is then examined using this UFOV to determine the error classification. The UFOV is empirically determined in such research and has variously been found to be some 2.5° visual angle radius around the point of fixation with different classes of medical image.
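The UFOV-based delineation of errors can be sketched as a simple rule. The text states only that classification rests on whether a fixation fell within the UFOV of the missed target; separating detection from interpretation errors by cumulative dwell time, and the 1000 ms cut-off used below, are assumptions added here for illustration, as are the function and parameter names.

```python
import math


def classify_miss(fixations, target, ufov_deg=2.5, dwell_ms=1000.0):
    """Classify a missed target as a search, detection or interpretation error.

    fixations: list of (x_deg, y_deg, duration_ms) in visual-angle coordinates.
    target:    (x_deg, y_deg) centre of the missed target.
    ufov_deg:  radius of the useful field of view around each fixation.
    dwell_ms:  hypothetical cumulative dwell separating detection from
               interpretation errors (not a value given in the text).
    """
    tx, ty = target
    dwell = sum(duration for x, y, duration in fixations
                if math.hypot(x - tx, y - ty) <= ufov_deg)
    if dwell == 0:
        return "search"          # the target never fell within the UFOV
    if dwell < dwell_ms:
        return "detection"       # the target was fixated only briefly
    return "interpretation"      # the target was fixated at length yet not reported


# Hypothetical scan path over an image with a missed target at (10, 5) degrees.
scan_path = [(0.0, 0.0, 250.0), (9.0, 4.5, 700.0), (10.5, 5.5, 600.0)]
print(classify_miss(scan_path, target=(10.0, 5.0)))
```

With these made-up fixations the two fixations near the target sum to 1300 ms of dwell, so the miss is labelled an interpretation error under the assumed cut-off.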
A similar inspection situation would appear to be that of airport baggage inspection, where the observer typically examines a monitor displaying a 2D image of hand baggage (a 3D object). Images of hand baggage (bags, coats etc.) typically vary in shape, size and orientation with respect to the observer, and can be viewed as static or dynamic images which can be either grey scale or false coloured. Typically any baggage item will be examined within circa 6–10s, an inspection time similar to that found in the medical imaging areas of mammography and chest radiography. In the UK we have the highest possible standards of aviation security and there are a number of strict guidelines and requirements that operate within a baggage screening environment.
Despite the apparent similarities between these two X-ray image inspection situations there is, however, little interdisciplinary research. The present work begins to address this and extends medical imaging research to the airport security situation. Over the last two years a series of experiments has been performed at several UK airports in which the visual search behaviour of inspectors was examined and their performance on test series of images, some of which contained potential targets, was detailed. Initial results of this research are presented here. In order to determine whether the same model could be applied to the baggage situation, such images first need to be examined to see whether parameters that have been determined in the medical imaging domain (e.g. the UFOV) can be applied to the task of baggage inspection. With X-ray images, tachistoscopic techniques have been employed to examine the amount of detailed information available to the observer in various brief presentation times. Chest radiographs have been studied with a presentation time of 200ms, shorter than a typical eye fixation (Kundel and Nodine, 1975; Gale et al., 1983). Surprisingly, the data indicated that the observer was sometimes able to attend visually to targets some distance away from the initial fixation point. Broadly similar results have been reported with mammographic images (Mugglestone et al., 1995, 1996, 1997). In such medical image research the observer’s task is typically to identify the presence or absence of a small target (e.g. simulated cancer or an actual lesion). In baggage inspection the inspector searches for various types of potential targets which may pose a threat to safe air travel. In the present experiment the task was to determine whether or not an IED (Improvised Explosive Device) was present in a baggage image which was viewed for short time periods.
The objective was to determine how much baggage information could be identified in such short times when fixation location was controlled. Eye movements were recorded to allow a classification of the observers’ errors to be made and also the UFOV in this task to be determined.
Method
Images of passenger baggage were shown to observers for three periods of time, namely 200ms, 1s and 6s. Their saccadic eye movements were recorded using an ASL 5000 remote oculometer, although these data are not reported in detail here. The observer’s task was to identify whether an IED was present. Each image was viewed twice by all participants: once in one of the brief presentations (200ms or 1s) and once for 6s.
Stimuli
Fifty images of passenger baggage were used; 20 contained IED targets and 30 did not. All images were presented on a 17” PC monitor using specialised psychophysical software to control presentation times accurately. Participants viewed the monitor from a set distance (800mm), representative of normal working practice. The images were presented in the normal false coloured format.
Participants
Fourteen airport security officers from a UK airport took part in the study. All were routinely involved in baggage inspection.
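To relate a visual angle, such as the 2.5° UFOV radius mentioned in the Introduction, to a physical distance on the monitor at the 800mm viewing distance given under Stimuli, basic trigonometry suffices. The helper names below are hypothetical and the sketch assumes a flat screen viewed head-on, with the offset measured from the point directly in front of the eye.

```python
import math


def angle_to_offset_mm(angle_deg: float, viewing_distance_mm: float) -> float:
    """On-screen distance subtended by a visual angle measured from fixation."""
    return viewing_distance_mm * math.tan(math.radians(angle_deg))


def offset_to_angle_deg(offset_mm: float, viewing_distance_mm: float) -> float:
    """Visual angle subtended by an on-screen distance measured from fixation."""
    return math.degrees(math.atan2(offset_mm, viewing_distance_mm))


# A 2.5 degree radius viewed from 800 mm covers roughly 35 mm on the screen.
radius_mm = angle_to_offset_mm(2.5, 800.0)
print(round(radius_mm, 1))
```

Converting fixation and target positions to a common unit in this way is a prerequisite for any distance-based comparison between them.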
Procedure
Participants were first familiarised with the procedure and calibrated for the eye movement system, and practice trials were run. The overall method was as follows. For each image presentation a visual noise mask was first presented. This was then replaced by a visual noise mask with a central fixation cross, and a tone sounded indicating that the participant was to fixate the cross. In turn, this was replaced by the stimulus image for a predetermined time period of either 200ms, 1s or 6s. At the end of the set stimulus time the noise mask was again displayed. Each participant was presented with 30 images for 200ms, followed by 20 images for 1s, and then all 50 images were shown for 6s. For each image the participant had to state whether an IED was present or not and their confidence in their decision on a four point scale. If an IED was detected, subjects also had to indicate its location in the baggage image.
Results
Using the known position of target IEDs to localise responses, all observers’ responses were classified as: true positive (TP; the image contained an IED which was identified in the appropriate image area); false negative (FN; the image contained an IED and the observer did not report it or reported an IED in an inappropriate image area); false positive (FP; the image contained no IED but the observer reported an IED present); or true negative (TN; the image did not contain an IED and the observer correctly identified the image as normal). Comparing overall TP data for the same images viewed in the 200ms and 6s conditions, only some 27% of the images containing IEDs that were correctly identified in the 6s presentation were identified in the brief 200ms presentation. Comparably, in the 1s presentation this percentage rose to about 55% of the 6s value. The false negative responses decreased as the viewing time increased.
These data indicate that increasing the time available to examine the images did not alter the observers’ response bias (which would have affected the TN/FP response proportions as well) but did increase the detectability of IEDs. The data also demonstrate the complexity of the IED identification task: the true positive responses in the brief presentations were not limited to a small number of ‘easy’ images; rather, virtually every IED was identified by at least one observer. The TP performance levels at 6 s for the images used in the 200 ms and 1 s presentations were comparable, indicating that the two image sets were similar in difficulty. The TN data were similar for all three stimulus time periods, as were the FP data; the different presentation times affected only the TP and FN responses. The TN data from both the 200 ms and 1 s times, together with reports from the observers, indicated that participants were able to identify various ‘normal’ baggage items correctly within the images in such brief times, even when these items were some distance from the fixation point.
ROC (Receiver Operating Characteristic) analysis was performed on the participants’ rating data to determine their accuracy (Az values) in identifying IEDs. Not surprisingly, individual differences were found. By examining the locations of visual fixations in relation to the IED targets, the UFOV was estimated to be 2.5° of visual angle. The eye movement data were examined to determine the nature of the false negative errors in the 6 s (normal viewing) condition. These errors were found to be due to visual search (14%), detection (19%) and interpretation (67%).
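The response classification and the rating-based accuracy measure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the example responses and the ratings are invented, and the quantity computed is the non-parametric (empirical) ROC area, used here as a simple stand-in for the Az values reported in the study. It also assumes that the yes/no decision and the four-point confidence judgement have been combined into a single rating in which a higher value means greater confidence that an IED is present.

```python
from collections import Counter


def classify(has_ied, reported_ied, location_correct=False):
    """Return 'TP', 'FN', 'FP' or 'TN' for one observer response."""
    if has_ied:
        # A report in the wrong image area is scored as a miss, as in the text.
        return "TP" if (reported_ied and location_correct) else "FN"
    return "FP" if reported_ied else "TN"


def roc_area(target_ratings, nontarget_ratings):
    """Empirical ROC area: the proportion of (target, non-target) pairs in
    which the target image received the higher rating, ties counting half."""
    wins = 0.0
    for t in target_ratings:
        for n in nontarget_ratings:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(target_ratings) * len(nontarget_ratings))


# Hypothetical responses: (image has IED, observer reported IED, location correct)
responses = [
    (True, True, True),
    (True, True, False),
    (True, False, False),
    (False, True, False),
    (False, False, False),
    (False, False, False),
]
counts = Counter(classify(*r) for r in responses)

# Hypothetical ratings for four target and four non-target images
area = roc_area([4, 3, 3, 1], [1, 2, 1, 3])
print(dict(counts), area)
```

An area of 0.5 corresponds to chance performance and 1.0 to perfect separation of target and non-target images, which is how individual differences between observers would show up in such an analysis.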
CONTEMPORARY ERGONOMICS 2000
Discussion and Conclusion
The mean TP figure for IED identification in the 200 ms presentation was lower than anticipated from the medical imaging data, indicating the complexity of the baggage inspection task as compared to identifying an abnormality in medical images. Clearly the conspicuity of a target in either inspection scenario is an important factor in the ease of identifying that target. IED conspicuity is potentially more complex than that of a medical abnormality, as the detection of an IED is dependent upon identifying a number of different components that can be widely dispersed throughout an image, with some components being hidden. Also, when examining medical images, observers can use anatomical features that are always present on a particular image type as reference structures. Using these reference structures enables the observers to direct attentional resources quickly to the informative areas of the image. Images of passenger baggage have no such reference structure, which makes any initial parsing of the image more difficult.
Examination of the visual search records enabled a figure for the UFOV to be determined. This value is similar for the two inspection tasks, suggesting that this parameter may be domain independent. Considering the false negative errors made in the 6 s condition (comparable to normal inspection time in the airport), most were found to be errors of interpretation, i.e. the location of an IED target was visually attended to (often for significant periods of time) but a failure of decision-making processes led to the area being misclassified as a non-target. This was a higher percentage than has typically been found in medical imaging. It is argued that this difference again indicates the greater cognitive complexity of IED identification.
It could be hypothesised that the rise in TP responses with increased viewing time indicates that, in real life, simply increasing the viewing time of each image should ensure correct identification of possible IEDs. Eye movement data from medical imaging and other domains, however, indicate clearly that above a particular threshold, increased viewing time does not result in more of an image being foveally examined in detail; rather, the observer simply returns more often to complex image areas already examined. Overall the results illustrate the difficulty of extrapolating from one domain to another, apparently similar one. It is concluded that the results broadly support the utility of extending the theoretical model of observer performance in medical imaging to the airport security inspection task. It is argued that doing so also increases the understanding of observer performance in imaging itself.

Acknowledgement
This research was performed under contract to the Centre for Human Sciences, Defence Evaluation and Research Agency.

References
Kundel H.L. and Nodine C.F. 1975. Interpreting chest radiographs without visual search. Radiology, 116, 527–532.
Kundel H.L., Nodine C.F. and Carmody D. 1978. Visual scanning, pattern recognition and decision making in pulmonary nodule detection. Investigative Radiology, 13, 175–181.
Gale A.G., Vernon L., Millar K. and Worthington B.S. 1983. Interpreting radiographs in a single glance. Radiology, 149, 253.
Gale A.G. 1993. Human response to visual stimuli. In: W.R.Hendee and P.N.T.Wells (eds.) The perception of visual information. (Springer Verlag, New York).
Mugglestone M.D., Gale A.G., Cowley H.C., Wilson A.R.M. Diagnostic performance on briefly presented mammographic images. In H.L.Kundel (ed.) Medical Imaging 1995: Image Perception (SPIE Washington).
Mugglestone M.D., Gale A.G., Cowley H.C., Wilson A.R.M. 1996. Defining the perceptual processes involved with mammographic diagnostic errors. In H.L.Kundel (ed.) Medical Imaging 1996: Image Perception (SPIE Washington).
Mugglestone M.D., Gale A.G., Wilson A.R.M. 1997. Perceptual processes involved in mammographic film interpretation. In H.L.Kundel (ed.) Medical Imaging 1997: Image Perception (SPIE Washington).
Virtual reality
IMMERSIVE VIRTUAL REALITY AND ELDERLY USERS
Nina Karlsson1, Johan Engström1 & Kurt Johansson2
1Department of Human System Integration, Volvo Technological Development, Gothenburg, Sweden
2Traffic Medicine Center, Karolinska Institutet, Stockholm, Sweden
The main objective of this study was to investigate the feasibility of elderly people using immersive virtual reality (IVR). The results show that elderly people are able to use IVR, at least for simple tasks and limited exposure times.

Introduction
During the past decade, Virtual Reality (VR) has become a useful research and development tool with many practical applications including entertainment, product prototyping, training, and assessment. There are various kinds of VR systems, ranging from relatively simple desktop systems to more advanced systems where the virtual environment is projected on the walls of a room or in a head mounted display (HMD). The present paper deals only with the latter category, also known as immersive virtual reality (IVR) (Draper, 1998). Recently, the possibilities of using VR as a clinical tool for rehabilitation and assessment of cognitive impairments have been increasingly exploited (e.g. Pugnetti et al., 1994). In theory, IVR systems should be very useful for testing and evaluating behaviour in potentially dangerous situations, as the situation can be repeated and mistakes will not cause accidents or other harm. One potential application of this paradigm is the testing and evaluation of elderly automobile drivers’ behaviour in hazardous traffic situations.
However, there are several potential obstacles associated with using IVR systems with elderly people. First, there might be visual problems. Since the HMD cannot readily compensate for any visual defects, glasses have to be worn inside the HMD, which could cause practical problems. Furthermore, since the light is rather intense inside the helmet, there could be problems with glare. It is known that older people are more sensitive to glare and need more time to recover than younger persons (Johansson and Winblad, 1991). Second, some mechanical problems might occur, affecting the comfort of the HMD.
For example, due to stiffness associated with age, elderly people may have difficulties with head movements or may find the HMD too heavy. Third, there are potential physiological problems. In particular, simulator sickness, a phenomenon related to motion sickness, is known to be a major obstacle when using IVR systems (e.g. Burns and Saluäär, 1999). However, although the relation between simulator sickness and age has not been thoroughly investigated, the common view regarding motion sickness is that sensitivity decreases with age (Reason and Brand, 1975).
Finally, there might be psychological problems involved, such as claustrophobia or a general reluctance among elderly people to use this kind of technology. It is clear that before IVR can be used as an effective research tool, its feasibility for applications involving elderly people must be demonstrated. The principal aim of this initial study was to investigate the possibility of elderly people using HMD-based IVR systems, and to identify the general problems involved.

Method
Subjects
Twenty-six subjects participated in the study. Twelve of these (five women and seven men) were elderly persons (65 years or more) recruited from a local society for retired people in Gothenburg, Sweden. The ages of the elderly ranged from 65 to 84 years (mean 71.9). In order to isolate age-specific issues, a control group of fourteen persons (six women and eight men) represented the younger population, aged between 15 and 50 years. The ages in the control group were rather evenly distributed within this interval (mean 30.4). All subjects were physically healthy with no known eye diseases, severe heart problems or history of epilepsy.

Equipment
The HMD used was the n-vision Datavisor 80, which represents the current state-of-the-art in the field. The graphics were generated by a Silicon Graphics Onyx computer, and the head movements were tracked with an Ascension Flock of Birds system.

Experimental set-up
The subjects were seated in a detached front seat of a car wearing the HMD. The virtual environment used depicted a Volvo S-80 placed in a large hall. The subjects were placed in the driver’s seat and could, from this position, explore the environment in all possible directions. In order to keep the task simple, no manual interaction with the environment was possible. The subjects were given the task of finding certain target objects in the virtual environment by looking around.
The objects were cartoon images of fruits and vegetables placed both inside and outside the vehicle. Some objects were moving and some were static. There were 10 different scenarios, each containing 5 objects in different positions. For each scenario, the subject’s task was to find all these objects. When all 5 objects in a scenario had been found, 5 new objects appeared, replacing the previous ones. The scenarios were started by the experimenter and, for each, an upper limit of 45 s was used (i.e. if all the objects were not found within this time, the subject was asked to continue with the next scenario), implying a maximum exposure time of 7 min 30 s. All subjects experienced exactly the same set of scenarios.

Measures
Three general dependent variables were measured: task performance, simulator sickness, and satisfaction with the system. Task performance was measured as (I) the number of objects not found, and (II) the time to complete the task.
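The two task performance measures, and the exposure limit implied by the scenario design, can be illustrated with a short Python sketch. The constants come from the description above (10 scenarios, 5 objects each, 45 s limit); the function name and the example session are hypothetical and are not data from the study.

```python
SCENARIOS = 10
OBJECTS_PER_SCENARIO = 5
TIME_LIMIT_S = 45.0


def score_session(trials):
    """Score one subject.

    trials: one (objects_found, seconds_elapsed) pair per scenario.
    Returns (objects not found, total time in seconds).
    """
    not_found = 0
    total_time = 0.0
    for found, elapsed in trials:
        not_found += OBJECTS_PER_SCENARIO - found
        # A scenario is stopped at the limit even if objects remain.
        total_time += min(elapsed, TIME_LIMIT_S)
    return not_found, total_time


# Upper bound on exposure: every scenario runs to its limit.
max_exposure_s = SCENARIOS * TIME_LIMIT_S
minutes, seconds = divmod(max_exposure_s, 60)

# Hypothetical session: eight scenarios completed in 30 s each,
# two scenarios timed out with 3 and 4 objects found.
trials = [(5, 30.0)] * 8 + [(3, 45.0), (4, 45.0)]
not_found, total_time = score_session(trials)
print(max_exposure_s, minutes, seconds, not_found, total_time)
```

The bound of 450 s is the 7 min 30 s quoted in the text, so any reported completion time has to fall at or below it.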
Simulator sickness was measured by means of the standard simulator sickness questionnaire (SSQ) (Kennedy et al., 1993). The questionnaire contains 28 symptoms rated on a four-point scale and yields a total severity score as well as subscores for three principal factors: nausea and neurovegetative complaints (N), oculomotor disturbances (O), and disorientation (D). The subjects’ satisfaction with the system was measured with a post-exposure questionnaire (PEQ) containing 36 statements, which were rated on a 7-point scale ranging from “strongly disagree” to “strongly agree”. The PEQ was subdivided into three principal, somewhat overlapping themes: Hardware Ergonomics, referring to the comfort of the HMD; Perceived Quality, how the subjects rated the authenticity of the virtual environment; and Subjective Experience, how the subjects experienced being immersed in the virtual environment. A final subcategory covered the Attitude towards VR technology and its use for rehabilitation and assessment. For each subcategory the total score was obtained by summing the ratings, where positive ratings were assigned numbers at the higher end of the scale and vice versa.

Results
Task performance
The elderly group found significantly fewer target objects during the trials than the control group (t(24)=7.18, p<0.01). The mean number of objects not found was 11.9 for the elderly group and 3.2 for the control group. The elderly were also significantly slower, requiring on average 426 seconds to complete the task, compared to 322 seconds for the control group (t(24)=5.02, p<0.01).

Simulator sickness
The SSQ scores were generally higher for the control group, where several persons experienced rather severe symptoms. None of the elderly subjects experienced any serious problems and the majority were not affected at all. The median was zero on all sub-scores for the elderly group, while it was non-zero on all sub-scores for the control group.
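Because SSQ sub-scores are ordinal and, for the elderly group, heavily tied at zero, a rank-based comparison such as the Mann-Whitney test is a natural choice for comparing the groups. The following minimal Python sketch computes the U statistic with mid-ranks for tied values. The function names and the scores are invented for illustration and are not the study's data; the tie-corrected z-value and the p-value are deliberately left out.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U statistics."""
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical sub-scores: one group mostly at zero, the other spread out.
elderly = [0, 0, 0, 1]
control = [0, 2, 3, 4]
u = mann_whitney_u(elderly, control)
print(u)
```

A small U relative to the product of the group sizes indicates that one group's scores tend to lie below the other's; judging significance additionally requires the sampling distribution of U or its tie-corrected normal approximation.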
However, a Mann-Whitney test, with the z-value corrected for ties, revealed that only the difference on the disorientation sub-score was statistically significant (U=48, p<0.05), although strong trends in the same direction were observed for the nausea sub-score (U=50, p<0.06) as well as for the total severity score (U=55, p<0.119). It can also be noted that the obtained order of magnitude of the sub-scores, D>N>O, corresponds to a pattern that has been shown to be characteristic of HMD systems (Stanney and Kennedy, 1997), although these differences were not statistically significant.

Satisfaction
The subjects’ general satisfaction with the system was measured by summing the subscores in the post-exposure questionnaire (hardware ergonomics, perceived quality, subjective experience and attitude). The median for the elderly group was 6.5 and for the control group 6.4. There was no significant difference between the elderly and the control group concerning hardware ergonomics. Many subjects experienced the HMD as rather uncomfortable and there were large individual differences concerning the source of discomfort. However, no age-specific factors could be isolated. A simple vision test indicated that all elderly subjects had presbyopia. This could be roughly compensated for in the HMD by reading glasses, although several elderly subjects still reported experiencing the image in the HMD as blurry. However, this did not seem to affect how they rated the
quality of the system, and both groups reported a high level of presence in the virtual environment. There were no significant differences between the groups in this respect. Some practical problems occurred for subjects wearing glasses in the HMD. First, the glasses sometimes prevented the HMD from being properly tightened, and second, they sometimes misted over due to the heat inside the HMD. However, these problems were equally common among elderly and younger subjects wearing glasses. No tendencies towards negative stress, claustrophobia, frustration or boredom were observed in either group. Rather, all subjects reported that their experience of being immersed in the virtual environment was positive. Overall, the attitude towards virtual reality and its applications was very positive in both groups. The vast majority of the subjects could imagine participating in similar studies again. Furthermore, they believed that these kinds of VR systems could be useful for rehabilitation and training and they were positive about the idea of using the system themselves on a regular basis, e.g. for training purposes.

Discussion
The principal objective of the study was to investigate the possibilities of elderly people using HMD-based IVR. The results show that the elderly group detected significantly fewer objects and was slower than the control group in performing the task. Observation revealed that one important factor was the restricted movement ability of the elderly subjects, which made it harder for them to find objects placed behind them in the virtual environment. Most importantly, however, all elderly subjects were clearly able to perform the task, detecting on average 80% of the target objects. The main physiological problem predicted, simulator sickness, did occur for several of the younger subjects, but hardly at all among the elderly, thus supporting the common hypothesis that sensitivity to these symptoms decreases with age.
However, the task performed in the present study was comparatively simple and the exposure time was limited. Thus, simulator sickness may still be an important obstacle for elderly people in more demanding experimental situations, and further research is clearly needed. Moreover, although the effects of simulator sickness are relatively harmless to healthy subjects, there are obviously ethical difficulties associated with experiments where these symptoms are deliberately provoked, especially for elderly people. There were some visual problems observed for elderly people, mainly due to their presbyopia. However, this was not critical for their ability to use the HMD, and in most cases their own reading glasses were able to roughly compensate for the deficit. In further work, it would be useful to provide a set of reading glasses of different strengths to better accommodate the image distance in the HMD. No mechanical problems prevented the elderly from using the HMD, although age-related stiffness was observed to be a major factor affecting task performance. The subjects were generally satisfied with the IVR system and the experimental conditions. Negative opinions mainly concerned the ergonomics of the HMD (e.g. discomfort), but no differences were observed between the groups in this respect. Both elderly and younger subjects reported a high degree of presence in the virtual environment and the groups did not differ with respect to stress or any other psychological aspects. Both groups were very positive towards VR technology in general and its possible applications for rehabilitation. However, it must again be stressed that longer exposure times may induce problems that did not occur in the present experiment. For example, elderly people may be more sensitive than younger people to eyestrain resulting from prolonged exposure to computer displays, because of the decline in tear production in old age.
In conclusion, the prospects of using IVR in applications with elderly users seem promising. A final caveat should be mentioned, however. Since the elderly subjects volunteered to participate in the study,
there are reasons to expect that these subjects are both healthier and more technologically interested than the average elderly person. Nevertheless, it seems safe to conclude from the present results that current IVR technology is accessible to elderly people, at least as long as the exposure time is limited and the task kept simple.

Acknowledgments
The study was supported by grants from Volvo Research Foundation. The authors would also like to thank Peter C.Burns and Emil Knabe at VTD for much valuable support during the project.

References
Burns, P. and Saluäär, D. 1999, Intersections between driving in reality and virtual reality (VR), In Gauriat, P. and Kemeny, A. (eds.), Proceedings of the driving simulation conference ’99 (DSC’99), Paris
Draper, M. 1998, The adaptive effects of virtual interfaces: Vestibulo-ocular reflex and simulator sickness, Doctoral dissertation, University of Washington, USA
Johansson, K. and Winblad, B. 1991, De äldre bilförarnas ökade olycksrisk i trafiken sett i perspektivet av demenssjukdom: En litteraturstudie (The increased crash risk of older drivers seen in the perspective of dementing illness: A literature review), Trafiksäkerhetsverket, Rapport 1991:2
Kennedy, R.S., Lane, N.E., Berbaum, K.S. and Lilienthal, M.G. 1993, Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness. The international journal of aviation psychology 3, 203–220.
Pugnetti, L., Mendozzi, L., Motta, A., Cattaneo, A., Barbieri, E. and Brancotti, A. 1995, Evaluation and retraining of adults’ cognitive impairments: Which role for virtual reality? Computational Biological Medicine, 25
Reason, J.T. and Brand, J.J. 1975, Motion sickness, (Academic Press, London)
Stanney, K.M. and Kennedy, R.S. 1997, The psychometrics of cybersickness. Communications of the ACM, 8, 66–68
APPLICATION OF VIRTUAL REALITY TO ENHANCE USER EXPERIENCE OF ELECTRONIC COMMERCE (E-COMMERCE) TRANSACTIONS
Hui Xu, Kee Yong Lim & Sai Cheong Fok
Design Research Centre, School of Mechanical and Production Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798
One of the main concerns of current electronic commerce systems is how to provide users with crucial information and yet maintain their desirable experience of real world transactions when business activities are conducted online. This paper introduces an analytic model that facilitates the mapping of crucial characteristics of real world transactions onto electronic world transactions. By cross mapping these characteristics, a basis could be established to examine how virtual reality technology could be recruited to enhance users’ experience of e-commerce transactions. To advance the model, a cyber shopping case study is examined, involving activities associated with the manipulation of virtual products and the optimization of webpage design.

Background
Electronic commerce (e-commerce) is the application of technology to the automation of business transactions and workflows, e.g. the buying and selling of products and information on the Internet and other online services (Kalakota & Whinston, 1997). The main goals of e-commerce are to increase the speed of business transactions and processes, to reduce product and service costs, to make business processes simpler and more efficient, and to expand the horizon of business participants. E-commerce began in the 1970s with electronic funds transfer (EFT) between banks over secure private networks. Today, e-commerce is used widely in many areas, changing the traditional way of performing business transactions. As one of the key components of e-commerce, home shopping systems are being developed by companies such as US West, Bell Atlantic, Viacom, etc. (Santos, 1994). Extensive studies of cyber shopping have also been undertaken in Japan. As an Internet-based application, cyber shopping changes the conventional way of shopping. It helps people avoid traffic jams, bustling shopping malls, long queues at counters, etc. It allows people to shop at any time and anywhere via an Internet-enabled computer.
For suppliers, cyber shopping reduces overhead costs through lower investments in physical stores and distribution channels. With these advantages, cyber shopping is becoming more and more popular. As a relatively new application, however, cyber shopping shows weaknesses in both its functionality and usability. In terms of functionality, current cyber shopping applications are poor in interactivity. In terms of usability, website designs may not be user friendly and may fail to support effectively users’ requirements for product searching and/or information retrieval. A survey conducted by Singapore Press Holdings (1999) revealed the following frustrations commonly reported by users of online shopping: slow downloading
speed, unclear navigation paths, difficulty in locating the desired product, and complex online ordering processes. To promote online retail transactions, a cyber shopping case study involving the customer transactor role is examined. Two cyber shopping activities, namely the manipulation of virtual products and the optimization of webpage design, are examined to facilitate the development of an analytic model to support the cross mapping of user requirements for real world transactions onto electronic world transactions. In this way, desirable aspects of real world transactions may be ported over and maintained in e-commerce transactions. The strengths and weaknesses particular to e-commerce technology may thus be inferred. Advanced technology such as virtual reality (VR) may then be considered for recruitment to carry any desirable aspect of real world transactions through to e-commerce transactions. For instance, the intense interactivity and visualization capabilities of VR may be used to provide users with a 3-D view of the virtual product, which they can then manipulate more realistically in a virtual 3-D world. Any inadequacy in e-commerce transactions from the users’ viewpoint may thus be bridged, and their use may then be encouraged.

A preliminary model of real & electronic world transactions for shopping
Figure 1 shows part of a high level model of real and electronic world transactions that could support a cross mapping comparison between the activities and attributes of the two worlds of interest. Conventional shopping activities are described in the left branch, while corresponding activities for the electronic world are indicated on the right. These activities may be decomposed further to derive a lower level description, as shown in Figure 2 for the purchase of video compact discs (VCDs).
In this case, a ‘conventional’ way of shopping may be to ask friends for a VCD recommendation or to read/view an advertisement, before proceeding to a physical store to make a purchase. Along the way, customers may encounter problems such as traffic jams, crowded stores, and long queues. In the case of cyber shopping, customers may stay at home and place orders online in a totally different physical and social environment, since Internet-based technology allows users to transcend physical space. Although problems associated with the physical world do not occur, new problems may arise; for instance, desirable social interactions and product manipulations associated with real world shopping are precluded in the electronic world. Thus, by decomposing the analytic model to a lower level of description, specific attributes associated with the activities in each of the transaction worlds may be thought through systematically to establish a more comprehensive set of user requirements. Key objects and activities may then be described in greater detail. Alternatively, a lower level description may be used to support an assessment of the strengths and weaknesses of existing designs of electronic shopping applications. For illustration purposes, a review of a sample of these descriptions follows.

Defining a Virtual Shopping Environment
Existing virtual shopping environments are usually simple, e.g. a one-room-store virtual shopping environment. Multi-store virtual shopping environments are also usually limited to a one-level shop. To provide users with an experience closer to ‘real’ shopping, the virtual environment may be constructed as a typical shopping center with multiple levels and multiple stores on each level. Users may be guided to specific stores by a general directory that may be sorted by store name and category. Since this is a virtual
Figure 1. A partial model of real & electronic worlds for shopping
Figure 2. A model of manipulating virtual product (VCD) shopping
world (as opposed to a physical world), users may either jump directly to a store by clicking its name in the directory, or enter a store by browsing through the shopping mall. In other words, the virtual shopping environment should provide users with a ‘short-cut’ as well as a ‘conventional’ way of accessing the stores. The virtual shopping mall may consist of two kinds of objects, namely fixed objects (e.g. walls, stairs, decorations, etc.) and temporary objects (virtual products), which are modelled separately. Fixed objects are generated from geometry data, while temporary objects are modelled by 3-D images. Virtual reality modeling language (VRML) may be used to program the virtual environment.

User Requirements for Virtual Product Manipulation
Presently, few cyber shopping systems provide the function of manipulating virtual products. Product information is typically given as a text description supported by graphics, animation, and/or video clips. Thus, customers can only derive product information by reading and watching, and not by touching and trying out the products as they would usually do in real world shopping. To provide users with an experience closer to ‘real’ shopping, the provision of richer product interactions should be considered. For instance, in the real world, customers can touch and rotate the VCD to get a better view of its packaging. In addition, they can preview the VCD to test its video and audio quality. A cyber shopping application should also allow customers to pick up VCDs from either a virtual display shelf or an electronic catalogue (e-catalogue). Each time a VCD is selected, a fully interactive model of the item should be loaded from a VCD database and placed in the virtual environment. Users should be able to play the VCD on a virtual VCD player and preview its contents as in the physical world. In allowing users to play the VCD, a large volume of video and audio data will have to be downloaded. Thus, the transient response time could be long. To solve this problem, a scheme for pre-loading relevant information will have to be developed. For example, when users preview videos or listen to music or songs, the data package for the next track to be played could be extracted in advance to avoid data lag. This scheme may also be applied to loading a virtual environment. For example, when users enter a VCD store, the data for the next store could be loaded while they are browsing through the present store. However, this scheme has its drawbacks.
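The pre-loading scheme described above, in which the data for the next track is fetched while the current one is being played, can be sketched as follows. This is a minimal illustration using a background worker thread from the Python standard library. The class name, the method names and the `fetch_track` stand-in are all hypothetical, since the paper does not specify an implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_track(track_id):
    """Hypothetical stand-in for downloading one track's media data."""
    return f"data-for-{track_id}"


class Prefetcher:
    """Fetches items in the background so they are ready when requested."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = {}  # track_id -> Future for a download in progress

    def prefetch(self, track_id):
        # Start a background download unless one is already under way.
        if track_id not in self._pending:
            self._pending[track_id] = self._pool.submit(self._fetch, track_id)

    def get(self, track_id):
        future = self._pending.pop(track_id, None)
        if future is None:
            return self._fetch(track_id)  # not pre-loaded: fetch now
        return future.result()  # wait for the background download to finish

    def close(self):
        self._pool.shutdown()


def play_all(track_ids, prefetcher):
    """Play tracks in order, pre-loading the next one during playback."""
    played = []
    for i, track_id in enumerate(track_ids):
        data = prefetcher.get(track_id)
        if i + 1 < len(track_ids):
            prefetcher.prefetch(track_ids[i + 1])
        played.append(data)  # stand-in for actually playing the track
    return played


prefetcher = Prefetcher(fetch_track)
played = play_all(["t1", "t2", "t3"], prefetcher)
prefetcher.close()
print(played)
```

The same pattern would apply to stores in the virtual mall, with a store identifier in place of the track identifier. The sketch does not cancel a download that turns out to be unwanted.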
For example, if the volume of the pre-loaded data is very large and the user moves to the next store or visits a different store before the data has been downloaded completely, the situation will be worse than without pre-loading. Solutions to these problems are being developed.

Requirements for a User Friendly Web Page Design
Attracting users to visit business websites is the first step in online retailing. Thus, ensuring the design of a user friendly website is crucial for a successful cyber shopping application. In this respect, a number of concerns need to be addressed. First, the transaction speed needs to be improved. In existing cyber shopping applications, it is common for the same web page to be presented to all customers. For customers who log onto the website for the first time and/or those who do not have a specific store in mind, a web page design with a default presentation is fine. However, for customers who frequent particular stores in the virtual mall, being presented with the same default web page display is boring and unfriendly. Thus, a second user requirement would be to mimic the way the staff in physical stores might treat known customers. To this end, a website should be ‘personalised’ to recognise frequent customers and to react to their preferences based on a historical database of their previous transactions and visits.
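As a rough illustration of this personalisation requirement, the sketch below chooses a landing page from a customer's visit history: new or occasional visitors get the default page, while customers who frequent a particular store are taken to it directly. The function name, the page identifiers and the threshold of three visits are assumptions made for the example, not part of the paper.

```python
from collections import Counter

DEFAULT_PAGE = "mall-directory"  # hypothetical identifier for the default page


def landing_page(visit_history, min_visits=3):
    """Pick a landing page from a list of previously visited store names.

    min_visits is an assumed threshold for treating a customer as a
    frequent visitor of a store.
    """
    if not visit_history:
        return DEFAULT_PAGE  # first-time customer: default presentation
    store, visits = Counter(visit_history).most_common(1)[0]
    if visits >= min_visits:
        return f"store:{store}"  # known customer: open their usual store
    return DEFAULT_PAGE


print(landing_page([]))
print(landing_page(["vcd", "vcd", "books", "vcd"]))
print(landing_page(["vcd", "books"]))
```

A fuller system would draw on the transaction database mentioned above, for example by weighting recent purchases more heavily than old visits.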
Future Work
This paper describes preliminary work on the development of an analytic model to support a more user centered design of e-commerce applications. To this end, a preliminary model that maps out crucial characteristics of real world and electronic world transactions for cyber shopping is derived. The model is then used to facilitate a systematic review of user requirements associated with specific cyber shopping activities, such as the personalisation of web page presentation by tracking customer transactions, and the manipulation of virtual products. Future work will involve the development of a complete analytic model to support a comprehensive cross mapping comparison between real world and electronic world transactions. A wider range of case studies will also be undertaken to test the efficacy of the analytic model in facilitating the identification of key user requirements and desirable aspects of real world transactions that should be ported to the electronic world. In addition, the analytic model will be applied to identify particular user requirements for physical world product interactions that may be bridged by recruiting VR technology to support electronic world transactions. The results of these developments will be reported at a later date.

References
Kalakota, R. & Whinston, A.B. 1997, Electronic Commerce: A Manager’s Guide, (Addison-Wesley)
Santos, K. 1994, Home Shopping on the Broadband Network: Performance Considerations, IEEE, 67–70
Singapore Press Holdings 1999, Lian He Zao Bao, http://www.zaobao.com.sg.
Warnings
THE PERCEIVED HAZARDOUSNESS, URGENCY AND ATTENTION-GETTINGNESS OF FLUORESCENT AND NON-FLUORESCENT COLOURS E.J.Tomkinson & R.B.Stammers Centre for Applied Psychology, University of Leicester, Leicester LE1 7RH, UK
Research has demonstrated the value of fluorescence in improving conspicuity, but there is little information on its perceived characteristics. Experiment 1 used two groups of 18 participants; one group rated fluorescent colours, the other non-fluorescent colours. Each participant rated red, orange, yellow and green on a 9-point scale for the amount of hazard and urgency each indicated and for how attention-getting each was. The results showed that the rank order of colours was in line with previous work, but there was no effect of fluorescence. In Experiment 2, 12 participants from Experiment 1 were used, but were asked to rate all eight colours. Here significantly higher ratings (p < 0.01) were given to fluorescent colours. In Experiment 3, new participants repeated Experiment 2 and the findings were replicated. These results suggest that fluorescent material has the potential benefit of giving emphasis to warning signs.

Introduction
There has been important research on the value of fluorescent material for improving the visibility of safety clothing (Alferdinck and Padmos, 1990; Isler et al., 1997). Another active area of research is in traffic sign conspicuity (Burns and Pavelka, 1995). However, there is little research on the general perceived characteristics of fluorescent colours in comparison with standard colour material. Two recent general reviews of warning signs and their characteristics make no mention of the use of this property of colour (Edworthy and Adams, 1996; Parsons et al., 1999). However, the most recent text in the area does draw attention to the need for more research on this topic in relation to warning labels (Wogalter et al., 1999: 126). Fluorescent materials not only reflect light in the way that normal materials do, but also, because of their chemical properties, emit extra light when stimulated. Fluorescent colours are increasingly used on warning signs because of their high conspicuity.
Whilst this remains an important aspect of fluorescent colour, it is also important to explore how individuals gain impressions of such colours. Colour has been found to aid comprehension of signs that are not adequately understood from the image alone and stereotypes have been found to exist for colour coding in safety signs. Various studies have found that different colours are associated with different levels of hazard. For example, Dunlap et al. (1986) asked participants to rate colour words for the amount of “personal hazard” they thought was implied by each. The order from highest to lowest was: red, orange, yellow, blue, green, white. Apart from yellow, this order was consistent across the language groups of English, German/Austrian, Scandinavian and Spanish/Portuguese.
A number of studies have explored this area and although different colours have been used in different experiments, it seems that red is always regarded as indicating the most hazard. This is usually followed by orange and then yellow, with green and blue coming lower. Given that colour has a very powerful influence on the interpretation of a sign in terms of its hazardousness, it seems worthwhile to explore whether the fluorescent properties of colour enhance the perceived hazardousness and whether they add an impression of urgency to the situation. In addition, it was thought interesting to see whether such hues are also seen as being more “attention-getting”.

Method
A series of experiments has been conducted to explore the rating of fluorescent and non-fluorescent colours using a variety of experimental designs and participants.

Experiment 1
In this study there were two independent variables, colour and fluorescent status. Colour had four states: red, orange, yellow and green. Fluorescent status had two states: fluorescent and non-fluorescent. In this first experiment, an independent groups design was used, with each participant viewing four colours in either their fluorescent or non-fluorescent form. The participants were 36 Aston University undergraduates. Their ages ranged from 18 to 45 (mean 21.66); 25 were female and 11 were male. One participant was colour blind with a red/green deficiency. Each participant used three rating scales, one each for hazardousness, urgency and attention-gettingness. Each scale had an explanation of how to use it. Participants were shown pieces of coloured card (23.4×22.5 cm) mounted on black card. Participants were asked to view each of the colours, which were presented in random order, and then to rate them on the scales mentioned above. ANOVAs were conducted on the data for each of the rating scales separately.
For Hazardousness, it was found that colours in the fluorescent condition were not rated significantly differently to those in the non-fluorescent condition (F=1.78; df 1, 34; p>0.05) and there was no interaction between colour and fluorescence (F=1.44; df 3, 102; p>0.05). There was, however, a significant effect of colour (F=46.03; df 3, 102; p<0.001). Individual tests show that the order of means was as might be predicted, i.e. red, orange, yellow and green. The ratings given to the different colours were all significantly different from each other (Tukey’s HSD test, p<0.01) apart from red and orange. For Urgency, it was found that colours in the fluorescent condition were not rated significantly differently to those in the non-fluorescent condition (F=2.99; df 1, 34; p>0.05) and there was no significant interaction between colour and fluorescence (F<1). There was a significant effect of colour (F=44.44; df 3, 102; p<0.001). Analysis of individual colour comparisons (using Tukey’s HSD test) showed that all pairs differed significantly from each other, with the smallest difference being between red and orange. In the analysis of the Attention-gettingness results, it was found that fluorescent colours were not rated significantly differently to those in the non-fluorescent condition (F=2.25; df 1, 34; p>0.05). There was no interaction between colour and fluorescence (F=2.16; df 3, 102; p>0.05) but there was a significant effect of colour (F=12.77; df 3, 102; p<0.01). In this case, only green was found to be rated significantly lower than each of the other colours (Tukey’s HSD test, p<0.01).
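The degrees of freedom reported above follow from the design of Experiment 1: 36 participants in two independent groups, each rating four colours. The sketch below shows the arithmetic for a two-factor mixed ANOVA. The function name and the dictionary keys are illustrative only and do not come from the original analysis.

```python
def mixed_design_df(n_participants, n_groups, n_levels):
    """Degrees of freedom for a two-factor mixed ANOVA.

    One between-groups factor (n_groups levels) and one repeated-measures
    factor (n_levels levels), with n_participants in total.
    Returns a dict mapping each effect to (effect df, error df).
    """
    between_error = n_participants - n_groups
    within_effect = n_levels - 1
    within_error = within_effect * between_error
    return {
        "between": (n_groups - 1, between_error),
        "within": (within_effect, within_error),
        "interaction": ((n_groups - 1) * within_effect, within_error),
    }


# Experiment 1: 36 participants, 2 fluorescence groups, 4 colours.
print(mixed_design_df(n_participants=36, n_groups=2, n_levels=4))
```

For Experiment 1 this gives 1 and 34 for fluorescence, and 3 and 102 for both colour and the colour by fluorescence interaction, matching the values quoted with each F ratio.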
Table 1. Tukey comparisons for Hazardousness, Exp 2
**=p<0.01
Experiment 2
Consideration of the above results led to an alternative experiment being designed where all participants saw both versions of the colours and rated them all using the scales. Accordingly, an attempt was made to recall participants from the previous study and to get them to take part in a within-subject design experiment. Twelve of the original participants took part. The materials used were the same as before, but in this case participants rated all eight colours. For Hazardousness, there was a significant main effect of colour (F=17.54; df 7, 77; p<0.001). The rank order of ratings for the colours was: fluorescent red (FR), fluorescent orange (FO), equal ranking of fluorescent yellow (FY) and orange, red, fluorescent green (FG), yellow and green. In Table 1 the results of individual comparisons using Tukey’s HSD test show a complex outcome, with FR and FO standing out as most strongly associated with hazard and green differing from most other hues at the lower end. For Urgency, there was a significant main effect of colour (F=15.68; df 7, 77; p<0.001). The rank order of ratings was the same as for Hazardousness, but with orange being rated below red. The pattern of differences revealed in the Tukey tests in Table 2 shows FR and FO to have the strongest ratings for urgency, but less impact than was found for hazardousness. Green differed from most other hues at the low end and yellow was also lower than a number of others. The analysis of the Attention-gettingness results again showed a significant effect of colour (F=18.55; df 7, 77; p<0.001). The rank order of ratings was different here, being: FO, FR, FY, FG, a tie between red and yellow, then orange and lastly green. The results of the Tukey tests in Table 3 show that whilst the fluorescent colours did not differ from each other, they all were significantly different from the non-fluorescent hues. None of the latter colours differed from each other.
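In the within-subject design of Experiment 2, every participant rated all eight colours, so colour is a single repeated-measures factor. The sketch below shows how the reported degrees of freedom arise from the number of participants and conditions. The function name is illustrative and is not part of the original analysis.

```python
def repeated_measures_df(n_participants, n_conditions):
    """Degrees of freedom for a one-way repeated-measures ANOVA.

    Returns (effect df, error df), where the error term is the
    condition-by-participant interaction.
    """
    effect = n_conditions - 1
    error = effect * (n_participants - 1)
    return effect, error


# Experiment 2: 12 participants, 8 colours (4 hues x fluorescent or not).
print(repeated_measures_df(n_participants=12, n_conditions=8))
```

With 12 participants and 8 colours this yields 7 and 77, the values quoted with each F ratio above. The same function applies to any group size run under the same design.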
Experiment 3
A final study was conducted repeating the design of Experiment 2. The results of Experiment 2 were very striking, but it was possible that the outcome could have been influenced by the participants having taken part in Experiment 1. A new group of 20 participants was used. They were office workers and had not previously taken part in an experiment of this nature.
Table 2. Tukey comparisons for Urgency, Exp 2
**=p<0.01, *=p<0.05
Table 3. Tukey comparisons for Attention-gettingness, Exp 2
**=p<0.01, *=p<0.05
Significant effects of colour were found for Hazardousness (F=17.71; df 7, 133; p<0.001), Urgency (F=27.12; df 7, 133; p<0.001) and Attention-gettingness (F=59.78; df 7, 133; p<0.001). In the rank ordering of ratings for Hazardousness, orange dropped below red; otherwise the order was the same as that for Exp. 2. The Urgency rank order was the same as Exp. 2 apart from orange dropping below FG. With the Attention-gettingness rank order, FR was higher than FO and red dropped below orange.

Discussion
Experiment 1 confirmed the results of previous studies on the ratings of colour, in as much as the red, orange, yellow, green order was confirmed. It appeared, however, that ratings of fluorescent colours in isolation from alternatives did not yield results that suggested any particular effect associated with this colour form. However, when rated within a batch of alternative colours, a strong effect of fluorescence emerged. Experiment 2 could be criticized for using participants who had already rated the colours
individually. However, its findings were effectively replicated by a further group of participants. It seems that adding a fluorescent dimension to colour does increase associations of hazard, urgency and attention-gettingness. These results emerged only when fluorescent colours were rated in comparison with others, but this does not diminish the clear finding that they are relatively more powerful in the way they are perceived. Whilst there were similarities between the scales, subtle differences are present. For example, FG gets a low rating for hazardousness and urgency but a high one for attention-gettingness. It is suggested therefore that this is an area that would repay more detailed study. The technology for using fluorescent material in warning signs is now more generally available, offering durable and low cost printing and painting.

References
Alferdinck, J.W.A.M. and Padmos, P. 1990, Conspicuity of fluorescent colours for safety garments: A literature review. (TNO Institute for Perception, Soesterberg, The Netherlands) TNO Rep. No. IZF 1990 C-21/E
Burns, D.M. and Pavelka, L.A. 1995, Visibility of durable fluorescent material for signing applications. Color Research and Application, 20, 108–116
Dunlap, G.L., Granda, R.E. and Kustas, M.S. 1986, Observer perceptions of implied hazard: Safety signal words and colour words. (IBM, Poughkeepsie, NY) Tech. Rep. No. TR00.3428
Edworthy, J. and Adams, A. 1996, Warning Design: A Research Prospective. (Taylor and Francis, London)
Isler, R., Kirk, P., Bradford, S.J. and Parker, R.J. 1997, Testing the relative conspicuity of safety garments for New Zealand forestry workers. Applied Ergonomics, 28, 232–329
Parsons, S.O., Seminara, J.L. and Wogalter, M.S. 1999, A summary of warnings research. Ergonomics in Design, 7, 21–31
Wogalter, M.S., Dejoy, D.M. and Laughery, K.R. 1999, eds, Warnings and Risk Communication. (Taylor and Francis, London)
INCREASING THE CONSPICUITY OF FOOD CONTENTS WARNINGS
E.A.Hoodless1 & R.B.Stammers2
1 Industrial Ergonomics Group, School of Manufacturing and Mechanical Engineering, University of Birmingham, Birmingham B15 2TT, UK
2 Centre for Applied Psychology, University of Leicester, Leicester LE1 7RH, UK
The increased prevalence of anaphylaxis, or acute allergic reaction, has put an emphasis on food labelling. This experiment examined factors influencing the optimal positioning of content warnings. Nineteen participants indicated whether a target ingredient was present on a food label. There were three variables: the position of a target (before the first ingredient, after the last ingredient or below the ingredients), the font style (bold/normal) and the letter case (upper/lower). Response times indicated a significant main effect of position, with the warning before the main ingredients giving the fastest responses. Additionally, advantages were found for the use of bold and of upper case letters. The results suggest that an optimal format for an ingredient warning would be to place it before the ingredient list and in a different style of text to that of the main list.

Introduction
The problem of indicating food contents to potential purchasers is an interesting one for ergonomics. A balance has to be drawn between the accurate indication of content and the potentially negative image of giving warnings of possible side effects. For some consumers the objective is to avoid certain ingredients for health, personal preference or religious reasons, e.g. high sugar or animal fat. For others, it is a question of a serious health risk if certain ingredients are eaten. In the case of acute allergenic reaction or anaphylaxis, the result of eating certain ingredients can be fatal (Steinman, 1996; Williams et al., 1997). There is evidence of increased prevalence of such allergies (Hourihane et al., 1996) and Governmental bodies and the food industry have responded with the provision of indications of certain ingredients. The consumer is faced with the problem of scanning food ingredients to determine content.
There is, however, some official advice on this: “Labelling particulars must be easy to understand, clearly legible and identifiable and…must be in a conspicuous place so as to be clearly visible” (Ministry of Agriculture, Fisheries and Food, 1996:35–39). One approach to the problem would be to display clear warning panels on packages. Not surprisingly, the food industry could object to such an approach as it might tend to give a negative impression to the majority of consumers about a product. A solution that has been reached is to give an indication, for example, “this product may contain nut compounds”. Questions then arise as to the effectiveness of such warnings and
where such a warning message should be placed. On the question of effectiveness, there is support that, in general, warnings are found to be effective (Cox et al., 1997). In addition, allergy sufferers are very alert to the problem. On the question of positioning, the general consensus seems to be that warning material should appear in the same area of the label as the ingredients. The current study was planned in order to help determine the optimum positioning of such a warning. In addition to this, aspects of typography were explored in order to investigate whether detection performance could be enhanced through the use of typographical coding.

Method
Task and Equipment
High definition, computer generated versions of a food label were presented to participants on a computer monitor. The labels were based on a popular brand of chocolate bar. The name of the product, the ingredients list, quality guarantee and other details were present on the label in a realistic level of detail. Three aspects of the target item were varied: position, style and case of the target word. Position had three levels: above, within and below the ingredients list. Style had two levels: either in bold or normal print, and case was either all letters in upper case or all letters other than the initial letters in lower case. Three target words, which were similar in visual length, “Cocoa Mass”, “Nut Traces” and “Wheat Flour”, were used to prevent participants being sensitized to a single target item. A 3×2×2 factorial design was used with all combinations of position, style and case, together with the 3 target items, yielding 36 different food labels. In addition, 10 “filler” labels were used which contained no target word. These were included to prevent a “yes” response set. All graphics on the labels appeared in white on a black background.

Participants
Twenty Psychology undergraduates at Aston University volunteered to participate in the experiment.
There were 9 males and 11 females, their ages ranging from 19 to 34 (mean=22.55, sd=4.01). All 20 reported no vision or hearing problems. Participants received course credits in return for their involvement.

Procedure
Each participant completed a consent form and then carried out the task in a small quiet cubicle. Participants were given standardized instructions to follow, which were also read aloud to them. Following an opportunity to ask questions, there were two practice trials. The participants were then asked to begin the task when they felt ready and were reminded they should work as quickly and accurately as possible. On each trial, they were instructed to move a marker on the screen with a mouse to a box marked “go” and to click when they were ready to continue. They were then presented with a question on the screen, which asked, for example, “Is wheat flour present?”. The participant then indicated whether the specified ingredient was present in the displayed label by clicking the mouse marker in a “yes” or “no” box positioned below the label. The computer logged the response time in hundredths of seconds. No overall time limit was imposed on participants. They worked through the 46 trials, which were presented in a random order for each participant. The participants were debriefed at the end of the experiment.
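The trial structure described in the Method can be sketched as follows: the 3×2×2 factorial combinations crossed with three target words give 36 target labels, and the 10 fillers bring the total to 46 trials, shuffled afresh for each participant. This is only an illustration. The original program is not described in detail, so the names, the dictionary layout and the use of a seeded random generator are assumptions.

```python
import itertools
import random

POSITIONS = ["above", "within", "below"]
STYLES = ["bold", "normal"]
CASES = ["upper", "lower"]
TARGETS = ["Cocoa Mass", "Nut Traces", "Wheat Flour"]
N_FILLERS = 10  # labels containing no target word


def build_trials(rng):
    """Return one participant's trial list in a fresh random order."""
    trials = [
        {"position": p, "style": s, "case": c, "target": t, "target_present": True}
        for p, s, c, t in itertools.product(POSITIONS, STYLES, CASES, TARGETS)
    ]
    # Filler labels: which word was asked about on these trials is not
    # stated in the paper, so it is left unspecified here.
    trials += [
        {"position": None, "style": None, "case": None, "target": None,
         "target_present": False}
        for _ in range(N_FILLERS)
    ]
    rng.shuffle(trials)
    return trials


trials = build_trials(random.Random(1))
print(len(trials))  # 46
```

Passing a separately seeded generator for each participant keeps every presentation order reproducible while still differing between participants.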
Results
The data from one participant were eliminated due to a high percentage of errors, particularly in one condition. Analysis of the data indicated that the distribution of response times approximated a normal distribution and variances were sufficiently homogeneous. An ANOVA revealed a significant main effect of position (F=14.45; df 2, 36; p<0.001). Post hoc individual comparison tests (Tukey’s HSD test) showed a significant difference between positions 1 and 2 only (p<0.01). The means for the positions were as follows:
1. above ingredients list: 1.57 secs
2. after last ingredient: 2.06 secs
3. below ingredients list: 1.60 secs
There was a significant effect for style (F=16.27; df 1, 18; p<0.001), and this reflected a difference between the following means:
1. bold: 1.67 secs
2. normal: 1.82 secs
In addition, there was an effect for case (F=6.31; df 1, 18; p<0.05) indicating a significant difference between the following means:
1. upper: 1.70 secs
2. lower: 1.82 secs
However, the picture was a little more complicated as significant two-way interactions were found to exist between position and style (F=3.71; df 2, 36; p<0.05) and position and case (F=9.16; df 2, 36; p<0.001). For the first interaction, the source seems to be a long mean response time for position 2 when normal text was used. For the interaction between position and case, again position 2 is implicated, with a long mean response time when lower case text was used. There was no three-way interaction.

Discussion
In isolation the mean response times indicated that target words were located most quickly when they preceded the first ingredient, were presented in bold and in upper case. However, the order of means for the individual combinations of conditions presents a complex picture. Whilst position 1 was significantly better than position 2, it was only marginally better than position 3. The advantages of the bold font style are clear from the ANOVA main effect, but this outcome is somewhat influenced by the long response times found for normal font in position 2. If the rank order of means for the individual combinations of conditions is used, it is found that the mean rank for bold conditions is 5.80 and for normal conditions it is 7.17. The effect for upper case over lower was not so strong in the overall ANOVA and again it was influenced by some long response times for position 2. The rank order of means for the individual combinations of conditions supports the less strong effect with the mean rank for upper conditions being 6.83 and for lower it is 6.17.
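The mean-rank comparison used above can be sketched in a few lines: the 12 condition means are ranked from fastest to slowest and the ranks are then averaged within each level of a factor. The cell means in this example are HYPOTHETICAL values invented purely for illustration, since the individual condition means are not reported here, and the function name is likewise an assumption.

```python
def mean_rank_by_level(cell_means, factor_index):
    """Rank cells from fastest (rank 1) to slowest, then average the
    ranks of the cells that share each level of one factor.

    cell_means maps a condition tuple (position, style, case) to a mean
    response time. Ties are not handled; the cell means are assumed to
    be distinct.
    """
    ordered = sorted(cell_means, key=cell_means.get)
    ranks = {cell: rank for rank, cell in enumerate(ordered, start=1)}
    by_level = {}
    for cell, rank in ranks.items():
        by_level.setdefault(cell[factor_index], []).append(rank)
    return {level: sum(r) / len(r) for level, r in by_level.items()}


# HYPOTHETICAL cell means in seconds, not the study's data.
hypothetical_means = {
    (1, "bold", "upper"): 1.50, (1, "bold", "lower"): 1.55,
    (1, "normal", "upper"): 1.60, (1, "normal", "lower"): 1.62,
    (2, "bold", "upper"): 1.90, (2, "bold", "lower"): 2.05,
    (2, "normal", "upper"): 1.95, (2, "normal", "lower"): 2.35,
    (3, "bold", "upper"): 1.52, (3, "bold", "lower"): 1.58,
    (3, "normal", "upper"): 1.64, (3, "normal", "lower"): 1.66,
}

style_ranks = mean_rank_by_level(hypothetical_means, factor_index=1)
print(style_ranks)
```

Because the ranks 1 to 12 always sum to 78, the two mean ranks for any two-level factor with six cells per level must sum to 13, which offers a quick check on reported values.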
Whilst there is a need for further experiments, the overall results support the presentation of warnings before the first ingredient, in bold and/or in upper case. The results of the present study and the existing literature have important implications for food packaging. At present, ingredients tend to be listed in order of descending weight, with the presentation of any allergenic warnings being separated out from the ingredients. The suggestion is that it is best to present allergenic warnings before the first ingredient, in a font that is clearly separable from that of the ingredients themselves. It would appear that the current positioning of such warnings below the ingredients list and in a standard font, whilst not being poor, might not be the optimal format. Whatever decisions are made, it is suggested that a standard format is chosen for all such allergenic warnings.

Acknowledgement
The authors would like to thank Mr Ray Taylor for writing the computer program used in this experiment.

References
Cox, E.P., Wogalter, M.S., Stokes, S.L. and Murff, E.J.T. 1997, Do product warnings increase safe behavior? A meta-analysis. Journal of Public Policy and Marketing, 16, 195–204
Hourihane, J.O’B., Dean, T.P. and Warner, J.O. 1996, Peanut allergy in relation to heredity, maternal diet and other atopic diseases: Results of a questionnaire survey, skin prick testing and food challenges. British Medical Journal, 313, 518–521
Ministry of Agriculture, Fisheries and Food, 1996, Food Labelling Regulations. SI 1996/1499
Steinman, H.A. 1996, “Hidden” allergens in foods. Journal of Allergy and Clinical Immunology, 98, 241–250
Williams, D., Williams, A. and Croker, L. 1997, Life-Threatening Allergic Reactions: Understanding and Coping with Anaphylaxis. (Piatkus, London)
ROAD SIGN ANGULARITY Tanita Kersloot & Bryan Cooper Transport Research Laboratory, Old Wokingham Road, Crowthorne, Berkshire RG45 6AU, UK
Currently many signs in lit areas are required to be internally illuminated or to be provided with external lighting. It is possible that some of the more effective retro-reflective materials may be used to produce signs of sufficient luminance without additional lighting. This paper compares the performance of different sheetings and discusses how this varies with the geometry of illumination, observation and the orientation of the material.

Introduction
The performance of retro-reflective sheeting depends on its angularity, i.e. its performance in relation to the different angles between the driver’s eyes, the vehicle headlamps and the sign position. The basic principle of retro-reflectivity is that light coming from a source is returned in a nearly parallel direction. To describe angularity fully, four angles need to be taken into account. There are a number of four-angle systems and one of these is the α, β1, β2 and ε system recommended by the CIE (1982). The entrance angle β is the angle formed between a light beam striking a sign surface and a line perpendicular to that surface (see Figure 1) and is determined from two components, β1 and β2. A full explanation of β1 and β2 is given in Retroreflection Definition and Measurement by CIE (1982). A sign placed at the exit of a roundabout, for example, would create a larger entrance angle than a sign placed on a straight road (Guest and Huddart, 1996). The observation angle α is formed by the displacement of the driver’s eyes in comparison with the vehicle headlamps (see Figure 1). An HGV would, for example, create a larger observation angle than a small car. The fourth angle is the rotation angle ε, indicating the orientation of the sign material.
Figure 1 : Entrance and observation angle
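The dependence of the observation angle on vehicle type can be illustrated with simple geometry. The sketch below treats the observation angle as the angle subtended at the sign by the separation between headlamp and driver's eye, and assumes that this separation is perpendicular to the line of sight. The separations and the viewing distance are hypothetical round figures chosen for illustration and are not measurements from this study.

```python
import math


def observation_angle_deg(eye_to_headlamp_m, distance_to_sign_m):
    """Angle at the sign between the headlamp and the driver's eye.

    Assumes the eye-to-headlamp separation is perpendicular to the
    line from the sign to the vehicle.
    """
    return math.degrees(math.atan2(eye_to_headlamp_m, distance_to_sign_m))


# Hypothetical separations (metres) viewed from a hypothetical 100 m.
car = observation_angle_deg(eye_to_headlamp_m=0.6, distance_to_sign_m=100.0)
hgv = observation_angle_deg(eye_to_headlamp_m=1.8, distance_to_sign_m=100.0)
print(f"car: {car:.2f} deg, HGV: {hgv:.2f} deg")
```

Under these assumptions the higher seating position of an HGV driver gives roughly three times the observation angle of a car at the same distance, and the angle for either vehicle grows as it approaches the sign.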
Method
Measurements were made of the retro-reflective coefficient (Ra), in units of cd/lux/m2, for three different white retro-reflective sheetings: Diamond Grade, both Long Distance Performance (LDP) and Visual Impact Performance (VIP), and High Intensity (HI). The retro-reflective coefficient is an expression of the amount of light that is returned from the retro-reflective material relative to the amount of light coming from the light source. The higher the Ra value, the brighter the material appears to motorists (McGee and Paniati, 1998). Measurements of the retro-reflective coefficient were carried out using different combinations of angles (BS 873 part 6, 1983). The entrance angles used were 5°, 15°, 30° and 40°. The observation angles used were 0.2°, 0.33°, 1.0°, 1.5° and 2°. To investigate the relationship between β1 and β2, β2 measurements were carried out in steps of 5° (from 0° to 50°) with four different angles of β1 (5°, 15°, 20° and 40°). Some materials, especially micro prismatics, are sensitive to rotation in their own plane, so those materials were rotated by 45°, 90° and 135°.

Results
Performance of different types of sheeting
A comparison of different types of sheeting was made and is illustrated in Figure 2. The top row of the x-axis identifies the observation angle, while the bottom row identifies the entrance angle. The y-axis shows the retro-reflective coefficient in cd/lux/m2. The following conclusions can be drawn:
• LDP sheeting performs significantly better than the HI or VIP at small observation angles (0.2° and 0.33°). At high observation angles its performance falls in comparison to the HI and VIP.
• VIP generally performs better than LDP and HI at larger observation angles (1.0°, 1.5° and 2°).
• HI performs better than VIP and LDP with an observation angle of 2° and entrance angle of 40°.
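The quantities in the Method can be made concrete with a short sketch. It assumes the standard definition of the retro-reflective coefficient, which the paper gives only in words: the luminous intensity returned towards the observer divided by the illuminance on the sign and by the area of the material. The function name and the numerical inputs are hypothetical. The angle lists are those stated in the Method.

```python
import itertools

OBSERVATION_ANGLES_DEG = [0.2, 0.33, 1.0, 1.5, 2.0]
ENTRANCE_ANGLES_DEG = [5, 15, 30, 40]


def returned_intensity_cd(ra_cd_per_lux_m2, illuminance_lux, area_m2):
    """Luminous intensity (cd) returned by a retro-reflective patch.

    Rearranges the assumed definition Ra = I / (E * A).
    """
    return ra_cd_per_lux_m2 * illuminance_lux * area_m2


# Every pairing of observation and entrance angle measured per sheeting.
grid = list(itertools.product(OBSERVATION_ANGLES_DEG, ENTRANCE_ANGLES_DEG))
print(len(grid))  # 20

# Hypothetical values: Ra = 250 cd/lux/m2, 0.5 lux on a 0.5 m2 sign face.
print(returned_intensity_cd(250.0, 0.5, 0.5))  # 62.5
```

Since the returned intensity scales linearly with Ra for a fixed illuminance and sign area, ratios between Ra values for two sheetings at the same geometry translate directly into ratios of apparent brightness.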
Thus if LDP and VIP are compared, LDP’s performance appears relatively better for cars (small observation angle), while VIP performs better for HGVs which are associated with higher observation angles. HI appears to perform better in the more ‘extreme’ situations where, for example, a sign at the exit of a roundabout is viewed from an HGV.

Epsilon
Epsilon (ε) is the rotation angle which indicates the orientation of the sign material. To identify the effect of ε on the retro-reflective coefficient, various measurements were carried out on VIP and LDP using four different angles. For the HI, two different ε angles were used. Figure 3 shows the results of measurements for VIP material taken at ε angles of 0, 45, 90 and 135 degrees. For each value of ε twenty measurements were made (combinations of five observation angles and four entrance angles). The higher observation angles are shown here, as the lower observation angles did not show a specific trend. The results were compared and the main outcomes were as follows:
Figure 2: Comparison between retro-reflective coefficients of VIP, LDP and HI
Figure 3: The effect of ε on retro-reflective coefficients for VIP material at high observation angles.
• Although the material orientated in a horizontal position (0°) gives the best overall performance (highest average score over all observation angles), other orientations might be more beneficial in certain circumstances.
• When comparing high observation angles (1.5° and 2°), the 135° orientation scored the highest for entrance angles up to 30°. Material orientated at a 90° angle gives the lowest performance for these angles.
• The 90° orientation gives the best performance for high entrance angles (40°).
The LDP results were similar to the results for the VIP sheeting:
• The 90° orientation gives the best performance for high entrance angles (40°).
• The 135° orientation scores the best performance in seven of the twenty situations, although the situations in which it scores best are very variable.
• The 45° orientation gives the best performance for small entrance angles (5°).
Figure 4: The relationship between the retro-reflective coefficient and ε for two values of β2
The change of orientation angle has much less influence on HI sheeting. Different orientations have not been found to give a statistically significant difference in performance.

Beta 1 and Beta 2
The entrance angle β is the angle between the light falling on the sign and the reference axis of the material. In order to completely specify the angle, it can be characterised by two components, β1 and β2. β1 ranges from −180° to 180°. β2 ranges from −90° to 90°. When the entrance angle β alone is specified without reference to its components, β2=0° and β1=β (ASTM E808–94, 1994). To investigate how Ra varies with β1 and β2, some measurements were carried out on VIP material. The results show, as expected, that the material performs less well when β1 and β2 increase. Between the two values of β2 of 15° and 20° there was a large fall in the retro-reflective performance.

Epsilon and Beta 2
Further measurements were carried out to determine the effect of ε on the retro-reflective coefficient at two values of β2, 15° and 20°. β1 was kept constant at 5° and α was kept constant at 0.2°. The results can be seen in Figure 4 and illustrate that the performance of the sheeting varied significantly. With an ε angle of 180°, for example, the performance varied by a factor of approximately 1.8 between the two values of β2.

Discussion and conclusion
The luminance of a retro-reflective sign depends strongly upon the location and mounting angle of the sign relative to the driver of the vehicle. The retro-reflective coefficient is dependent on the position of the driver relative to the direction of light reaching the sign and the orientation of the material. The performance of HI is not affected by rotation in its own plane, whereas that of both VIP and LDP is certainly affected by such rotation. Some measurements were made which indicated that the retro-reflective coefficient could vary by a factor of at least 1.5.
The nature of the relationship of retro-reflectivity with rotation was dependent on the values of the other angles describing the geometry of the situation. The relationship between retro-reflection and observation angle means that LDP sheeting would in general be more suitable for drivers of cars, while VIP material would be better for HGV drivers. Making a decision on the material to be used depends on specific situations. For example, an HGV driver viewing a sign that is positioned at the exit of a roundabout, would benefit from a sign made of VIP sheeting, orientated at a 90° angle. A small car in the same position would benefit from a sign made of LDP
sheeting, orientated at a 45° angle. These calculations do not take into account the two components of beta (the entrance angle). This might make a difference in performance of up to 40%. Signs made of HI material may show more uniform illumination in various situations. Further research is needed to investigate why the orientation angle had such a great impact on the performance of VIP and LDP material.

Acknowledgement
This research was part of a project commissioned and funded by the Traffic, Systems and Signing Division of the Highways Agency.

References
ASTM. 1994, Standard Practice for Describing Retroreflection. Designation: E808–94.
British Standard Institution. 1983, British Standard Road traffic signs and internally illuminated bollards. BS 873 Part 6: Specification for retroreflective and non-retroreflective signs.
CIE International Commission on Illumination. 1982, Retroreflection definition and measurement. Publication CIE No. 54, (Bureau Central de la CIE).
Guest P. and Huddart K. 1996, A study in London of traffic sign angularity (Traffic Engineering+Control Printer Hall Ltd. London).
McGee H.W. and Paniati J.A. 1998, An implementation guide for minimum retroreflectivity requirements for traffic signs. Publication no. FHWA-RD-97–052 (US Department of Transportation, Virginia).
Copyright © Transport Research Laboratory 2000
Author index
Akdağ R. 38 Alhemoud A. 265 Anderson D.M. 322 Arisz H. 229 Armstrong I.J. 129 Arnold A.G. 202 Arsenault C. 92
Baird A. 223, 396 Bairsto A. 192 Beard M. 239 Bellerby F. 218 Beltran J. 265 Benedyk R. 390 Bennington J. 124, 375, 385 Blanchonette P. 328 Bonner M.C. 70 Borras C. 156 Bos T.J.J. 50 Brook-Carter N.M.H. 87 Buckle P. 312 Burns P.C. 234 Bust P.D. 338
Cabon P. 12, 22 Caloo F. 12 Çapoğlu I. 38 Carey E.J. 286 Catterall B. 270 Chen C.-H. 151, 182 Chen C.-H. 182 Chui Y.P. 412, 422 Coleshaw S.R.K. 129 Coole C. 317 Cooper B.R. 452
Craig I.R. 177 Crone D. 328 Crowhurst J. 270 David H. 12, 22 de Reus A.J.C. 50 Dickson B. 65 Donnelly C. 255
Ellis R. 380 Engström J. 234, 432 Farbos B. 12, 22 Flood E.K. 177 Fok S.C. 437 Forrest D. 17 Fraqueza E.J. 119 Gale A.G. 426 Gallwey T.J. 44, 286 Genaidy A. 265 Gramopadhye A.K. 98 Green W.S. 50, 197, 365 Grieve D.W. 250 Griffin M.J. 281
Hadley T.J. 401 Harker S. 192 Harris S. 7 Haslam R.A. 312, 333 Haslegrave C.M. 307, 317, 338, 401 Hastings S. 312 Haward B.M. 281 Heaton N. 223, 396 Heavenor G. 291
Hicks M. 156 Hide S. 307 Hoodless E.A. 448 Howells H. 55, 60 Howson D. 129 Hsu C.C. 151 Hunt D.P. 416 Jamieson D.W. 129 Johansson K. 432 Jones H. 2 Jones N. 124 Jordan P.W. 355 Kanis H. 50, 229, 365 Karlsson N. 432 Kaya M.D. 38 Kelkar K. 98 Kelly V. 213 Kennedy R. 2, 103 Kersloot T.M. 87, 452 King R. 328 Kirwan B. 2, 103 Kolich M. 406
Lamoureux T. 7, 17, 27 Lane K. 244 Lansdown T.C. 87 Lawton C.G. 333 Layton S. 145 Lim K.Y. 166, 412, 422, 437 Long J. 156
MacDonald A.S. 360 MacLeod I.S. 244 Maguire M. 207 Malyon V. 396 McClenahan J. 124 McClumpha A. 426 McDonagh-Philp D. 349 McMillan G. 75 Meltzer S. 197 Miller C.A. 70 Milton N. 60 Mitchell J. 124, 375, 385 Mohamed A.I. 406 Mollard R. 12, 22 Morris W. 307 Morrissey W. 172
Mugglestone M. 426 Nevill A. 140 Newman A. 276, 301 Nichols S.C. 307 Noyes J. 161 Okunribido O.O. 307 O’Neill D.H. 119 O’Sullivan L.W. 44 Özkan B. 38 Parham J. 145 Parker C. 187, 239 Parsons C. 296, 344, 380 Perfect T. 161 Pinder A.D.J. 250, 260 Pleydell-Pearce K. 65 Purdy K.J. 426 Pynn H. 140 Quek R.S.M. 422 Quek Y.-H. 134
Raanaas R.K. 322 Rayson M.P. 140, 250 Rea K. 103 Rich K.J.N.C. 114 Rooden M.J. 229, 365 Rothwell A. 140 Ruiter I.A. 33 Russell S.G. 177
Sellar C. 129 Sen R.N. 134 Shadbolt N. 60 Sheard M. 161 Shorrock S. 2 Siemieniuch C.E. 109 Simon J. 390 Simpson P. 328 Sinclair M.A. 109 Sinclair-Williams M. 145 Smyth G. 270 Stabler K.M. 370 Stammers R.B. 443, 448 Stanton N.A. 82 Summersgill B. 103
Taboun S.M. 406 Taylor R.M. 55, 70 Tennison J. 60 Tey F.L.K. 412 Tomkinson E.J. 443 Torrens G.E. 276, 301, 349 Truelove A. 296 van den Heuvel S. 370 Vidulich M. 75 Watson D. 55 Wells L. 244 Whitecross S. 65 Whitlock A. 145 Wilson T. 92 Wong C.C.H. 166 Woods V. 312 Wray A. 344 Xu H. 437 Yeşilyurt H. 38 Yeung S. 265 Young M.S. 82 Zajicek M.P. 172, 202
Subject index
3-D stereoscopic display 422 accelerator keys 166 adaptive automation 70, 177 air traffic control 2, 7, 12, 17, 22, 27 aircraft 50 aircraft maintenance 98 airport security inspection 426 allergic reaction 448 anthropometry 33, 38 ATM 2 automation 82 automotive seating 406
brick packing 260 Brookes Talk 172 certainty testing 416 cockpits 60, 65, 70, 75 cognitive cockpit 55 cognitive task analysis 27 compressed working week 114 computer-based trainer 412 consumer products 355 consumers 385 culture 360
datalink 7 decision support system 187 depth perception 422 design for all 370 developing countries 119 development process 192 direct manipulation object 156 direct voice input 177 disability 380
disability discrimination act 218 disabled consumers 375 discomfort 286 display screen equipment 223 distribution 270 driving 82, 87, 92, 401 dynamic strength 250 E-commerce 437 earcons 172 elderly users 161 electromyography 406 electronic services 207 environmental stress 134 error taxonomy 98 escape 129 evaluation 244 eye-movement 12
false information 416 fatigue 114, 286 fluorescence 443 food labels 448 footwear 344 form design 145
gender 87 hand 276, 349 hand function 281 handles 291 handling 255 helicopters 129 home-working 223 hot desking 223
HRA 103 incident reporting 145 incremental lift machine 250 individual differences 44 information technology 192, 197 input devices 312 interface metaphor 151 Internet 239 job design 109 judgement 17 knowledge 109 learning 22 legislation 213 life-maps 375 lifting 255 lumbar support 406 macroergonomics 333 manual handling 260, 270 manual lifting tasks 265 mental models 151, 182 mice 312 military personnel 140 military training simulator 412 modelling 33, 103 motion capture 276 musculoskeletal disorders 260 musculoskeletal loading 307 MUSE 156 neck/low back trouble 317 neuromuscular approach 255 night vision goggles 412 non-technology enabled 202 number of subjects 229 observer performance 426 older adults 161 operability 2 organisational design 333 organisational learning 109 performance 87 personnel selection 140
pilot aids 60 pilot functional state 65 pleasurability 390 pleasure 355 posture 396 PPE 301, 344, 349 prevention 338 product design 365 profiling beds 124 psychophysics 12, 22, 234 psychosocial 322 public technology 161 public transport 218 qualitative 244 quantitative measurements 234 range of movement 276 reach envelopes 44 rigour 244 risk assessment 270, 296 road signs 452 scaling 234 school nurseries 317 seating 396, 401 sense 360 shift-work 114, 322 side floating 129 situational awareness 75 social trends 355 task analysis 27 training 338 universal access 202 usability evaluation 239 usecues 365 user centred design 187 user interface 166, 422 user needs 218 user satisfaction 390 user trialing 197, 229 vibration 281 virtual reality 432, 437 visual impairment 172, 370 visual search 426
warnings 443, 448 wheelchairs 385 WIMP user interface 166 work stress 134 workload 87 workplace design 260, 380 wrists 286, 291, 296 WRMSD 317