FROM THE EDITOR
Editor in Chief: Steve McConnell ■ Construx Software ■ [email protected]
Who Needs Software Engineering?
Steve McConnell
The traditional distinction between software and hardware was that software was easily changeable and therefore “soft,” whereas hardware was captured on a physical medium like a chip, was hard to change, and was therefore “hard.” This traditional distinction is breaking down today. Software delivered via the Internet is clearly “soft” in the traditional sense, but software delivered via CD or DVD is hardly “soft” in the sense of being “easy to change.” We now commonly see software being delivered on EPROMs; the electronic control module that controls my car’s fuel injection is an example. I can take my car to my dealer to have the chip reprogrammed, so in some sense the program on the chip is soft, but is it software? Should the chip developers be using software engineering?

Computer chip designers are now doing much of their chip development using software-engineering-like tools. Only at the last minute is the code committed to silicon. Do we really think that committing code to a CD-ROM makes it software but committing it to a silicon wafer makes it hardware? Have we arrived at a point where even computer hardware is really software? If software and hardware are totally different, then electrical engineers designing computer chips don’t need to know about software engineering. But if modern chip design involves a significant amount of programming, then perhaps electrical engineers should know something about software engineering. Should computer hardware be designed using software engineering?

Throw a few other disciplines into the mix, such as Web programming and games development, and I think a fundamental question lurks here: What is software? This question is important because it leads to a second question: What is software engineering today, and who needs it? I recently posed these questions to several IEEE Software board members.

Blurred distinctions
Wolfgang Strigel: No doubt, the distinction between software and hardware has blurred. When the term software was coined, there was a clearer distinction, or nobody cared because it sounded good and made intuitive sense. Moreover, it is not important that software is modifiable (or “soft” once it is completed). Software does not change its nature by being turned into something “hard” or unmodifiable. After all, we have accepted the concept of selling software on CDs. And RAM can also be write-protected. What matters is whether there is a “program” that can be executed by a device. The project plan for building a high-rise is a set of complex instructions, decision points, and so on that could be interpreted as a software program. But it is not executed by a device. But how about multimedia, say an animated movie? It has instructions for movement, rendering, and so on, and is executed by a device. It can also be modified. How about a set of MIDI instructions that produce music if delivered to an electronic
instrument? This does not fundamentally differ from a ladder diagram that controls the execution of a programmable logic controller. Larry Graham: I agree that the line between software and hardware is blurry. Patent law has a fairly rich tradition that equates the two—virtually every hardware device can be described in terms of a function it performs and vice versa. Is software engineering invariant? Annie Kuntzmann-Combelles: I think the basic practices needed to develop software properly are always the same: get clear and complete requirements from customers; manage changes to requirements; estimate, plan, and track the work to be done; select an adequate life cycle; define and perform QA activities; and maintain product integrity. The software architecture and coding might differ from one application to the other, but the process aspects are invariant. Tomoo Matsubara: I don’t think software development is always the same. My recommendation for improving software process is to apply domain-specific methodologies and tools and conduct problem-focused process improvement. The levels of importance and priorities are different between domains. For example, one of the most critical processes for commercial software is fixing data flow. For scientific software, a key to success is choosing the right algorithm. For COTS, it’s designing a good human–machine interface. For embedded systems, it’s pushing instructions into the fewest memory chips. For maintenance, it’s rigorous testing with regression tests. Software development practices should vary accordingly. Grant Rule: Think about the games industry, where the “software” is delivered on CD-ROM or game cartridges. Game development can take vast amounts of schedule. Teams can be quite large—25 to 30 people from a variety of disciplines including analysts, designers, coders, testers, QA staff, and project man-
agement—and lots of nontraditional software personnel such as writers, artists, and so on. Schedules must be managed carefully to meet holiday sales seasons. If a game misses its marketing window, it might be a commercial failure; there’s no second chance. Reliability is important: the game is mass-produced on CDROM, and from that point forward there is no real chance to correct it— it is infeasible to recall hundreds of thousands of copies to fix a defect. The result seems to be that, in some cases at least, game developers take a more rigorous approach to “engineering” their software than do some developers of commercial data-processing applications. All this seems to be “software engineering” to me. What’s unique about software? Robert Cochran: I use the following definition to describe what is unique or special about software: 1. Software is intangible (which I think is true even if it gets embedded). 2. It has high intellectual content (many other intangibles have low intellectual content). 3. It is generally not recognized as an asset by accountants and so is off the balance sheet. 4. Its development process is labor intensive, team based, and project based. We forget sometimes how little of the rest of the world regards projects as the normal way to work. 5. Software doesn’t exhibit any real separation between R&D and production. 6. Software is potentially infinitely changeable. Once you make a physical widget, there are severe limits on how you can change it. In principle, we can keep changing software forever. Is there any real distinction between printing out a source code listing, creating a binary, or burning software onto a chip or CD-ROM? In all these cases, we are just “fixing” a copy of the software in some form that cannot
directly or easily be modified. That does not have any relevance to the nature of the software as such. Grant: That makes software just like any written work, art book, or design drawing. The medium (technology) might differ—wax tablets, canvas, paper—but anything that can have multiple copies made in a mutable medium sounds like it could be this thing called “software.” Martin Fowler: Robert’s Number 5 seems to be the key point. Until you deploy you can build, modify, and evolve the software, regardless of whether you eventually deploy to PROMs or CD. That ability to evolve while building is a key aspect of software that has no equivalent in disciplines where you must separate design from construction. This question of how “soft” is software is quite an important point and one that gels particularly with me. One of the reasons that I’m so much in favor of light methods is that they try to make software softer while many methodologies try to make it harder. In the information systems world, softness is an important and valuable asset. Making software softer Steve McConnell: That does seem to be the age-old challenge: How do you keep software from becoming brittle? How do you keep it soft? Whether you’re creating software, or a computer chip, or even a building, it seems as though it would be advantageous to keep the thing you’re building “soft” as far as possible into the project. Terry Bollinger: This overall question helps to deal with trying to understand the baffling diversity of production styles in the software marketplace. “Hard” software such as that found in processor chips is catastrophically expensive to fix after fielding, and so drives the entire software design process to be very conservative and validation intensive. “Fluid” software that can be changed automatically over the Internet drives the opposite behavior. I think that understanding these kinds of issues is quite fundamental to software management.
I do think we need to have a better overall definition of software. The very fact that I had to mangle together a phrase as awful as “hard” software to describe algorithms encoded into silicon shows that the industry has become more complex than it was in the days when you had vacuum tubes and bits on punched cards, and not much in between. I think the distinction needs to focus on the idea of “information machines”—what we have traditionally called software—versus the particular method of distribution and update of such machines. Those are two separate dimensions, not one. A bunch of gears is not an information machine, because it relies primarily on physical materials for its properties, even if it happens to do a little information processing at the same time (for example, odometers in cars). A structured algorithm masked into an addressable array that is an integral part of a processor chip most emphatically is an information machine, because its complete set of properties can be represented as structured binary information only, without any reference to the chip’s physical properties. Don Bagert: In defining what’s really software, my thought is to look at what the object code does (rather than where it resides), as well as the source code. I would define it as follows: “Software is a set of instructions that are interpreted or executed by a computer.” The source code is definitely “soft” not in the sense of the deployment medium but in that it is a nonphysical entity. It consists of a series of instructions, by definition nonphysical, that can be translated into object code or interpreted by a computer processor. My colleague Dan Cooke has an interesting view: a Universal Turing machine corresponds to computer hardware, while an individual Turing machine defined to solve a particular problem corresponds to software. What is software without programming? Steve: My original focus on the medium on which the software hap-
pens to be deployed seems to have been a red herring. I think software engineering applies broadly to creating complex “instructions,” regardless of the target media. Years ago, people argued that the need for software engineering had passed because Fortran had been invented. People didn’t have to write programs anymore; they could just write down formulas! Thirty years later, we now think of Fortran programming as comparatively low-level software engineering—“writing down formulas” is harder than it looks. As the years go by, we see the same argument repeated time and time again. The early claims about creating programs using Visual Basic were reminiscent of the early press on Fortran—“No more writing code! Just
drag-and-drop buttons and dialogs!” And today more new programs are being written in Visual Basic than any other language. Ten years from now, we’ll probably see programming environments that are much higher-level than Visual Basic; people working in those environments will benefit from software engineering. Martin: This is an important point. Often people talk about things “without programming” (one of my alarm-bell phrases). You find this in phrases like, “Buy our ERP system and you can customize it for your business without programming.” So you end up with people who are experts in customizing XYZ’s ERP system. Are they doing software engineering? I would say yes, even though their language is putting con-
figuration information into tables. They still need to know about debugging and testing. Indeed, whenever we talk about writing tools so that “users can configure the software without programming,” we run into the same problems. If you can enter programs through wizards, you still have to be able to debug and test the results. Wolfgang: At the risk of being simplistic, I would define software as follows: “Software is a set of instructions that are interpreted (or executed) by a computer.” I did not say “electronic computer” because it could well be a chemical computer or one that operates with light. That delegates the problem to defining the term “computer”—and that’s a hardware problem!
MANAGER
Editor: Donald J. Reifer ■ Reifer Consultants ■ [email protected]
Which Way, SQA?
Emanuel R. Baker
What is software quality assurance, and where is it headed? Has it become passé as firms embrace rapid application development techniques, modern tools, and spiral development models? These are interesting questions to ask in light of SQA’s history, quality models, and technological developments. There is, of course, the old question: is SQA a discipline or an organization? Matthew Fisher and I discussed this issue most recently in a chapter in the Handbook of Software Quality Assurance.1 Past and current practices contradict its status as a discipline; object-oriented design is a discipline, but it is difficult to cast SQA in the same light. The fact alone that many different organizations practice SQA in many different ways makes it difficult to characterize it as a discipline.

Discipline or organization?
If we look at organizations in the commercial world, especially those in the information systems arena, SQA consists of testing—primarily, system testing. In many cases, due to poor project planning or project management, little time is available to test adequately. Documentation of the requirements is often unavailable, so the test program’s ability to detect software defects is suspect. Testing as SQA is like locking the barn door after the horse has escaped; it is hardly “assuring product quality,” nor is it the discipline of SQA.

Members of organizations that have adopted the Capability Maturity Model (CMM) often view SQA as the “process police.”
In this role, SQA determines whether the developers (and project management) conform to the process policies, standards, and procedures. Work products are checked for compliance with templates defining their content and format. In these organizations, those that have a separate SQA function might have people with little software background performing SQA. Deviations from the process might not be as apparent to them as to more knowledgeable SQA personnel.

The product assurance department I managed several years back did a number of things. It defined the organization’s process, checked for compliance with it, evaluated work products for content quality and conformance to format and content requirements, moderated inspections, and did some testing. Many other companies practiced SQA this way as well—very differently from those I described earlier. Today, a set of practices that tend to vary quite a bit emerges, hardly fitting the description of a discipline.

SQA is not always an organization. Nor, when it is an organization, is it necessarily an independent one—a characteristic necessary to ensure objective evaluations. In many organizations, developers time-share their activities. They spend part of the time doing development work and part doing whatever has been defined as SQA for that organization. Clearly, there is ambiguity on that score, as well.

Technological developments
What effect does technology have on the practice of SQA? In an article on software management, Walker Royce cites assessing quality with an independent team and exhaustive inspection as principles of conven-
tional software management.2 This approach is an outgrowth, he asserts, of the waterfall model of software development. Using different approaches, such as the spiral model and other iterative development approaches, lets us use more modern software management techniques. Royce states, Modern software development produces the architecture first, followed by usable increments of partial capability, and then completeness. Requirements and design flaws are detected and resolved earlier in the life cycle, avoiding the big-bang integration at the end of a project. Quality control improves because system characteristics inherent in the architecture (such as performance, fault tolerance, interoperability, and maintainability) are identifiable earlier in the process where problems can be corrected without jeopardizing target costs and schedules. He later argues that the conventional principle of a separate quality assurance results in “projects that isolate ‘quality police,’” and that “a better approach is to work quality assessment into every activity through the checks and balances of organizational teams focused on architecture, components, and usability.” Another fault of conventional software management, he suggests, is slavish insistence on requirements traceability to design. Clearly, one of SQA’s objectives is to ensure that the design and resultant code implements the agreed-upon requirements. The problem is that demanding rigorous problem-tosolution traceability is frequently counterproductive, forcing the design to be structured in the same manner as the requirements. Good component-based archi-
tectures have chaotic traceability to their requirements. Tight problem-to-solution traceability might have been productive when 100% custom software was the norm; those days are gone.
The QA role that emerges from these discussions is one of evaluations embedded in the development process. The process Royce describes de-emphasizes separate SQA and process policing, rightly viewing them as counterproductive.

E-commerce SQA
It’s interesting to look at Royce’s approach in contrast to development practices followed by e-commerce software houses. We often hear about doing things “at Internet speed.” In e-commerce (and the practices of many shrink-wrap software houses), that often involves getting applications out the door as quickly as possible to beat the competition. Process is of little concern to many e-commerce organizations. Few e-commerce firms have demonstrated any interest in adopting quality models such as the CMM. (For that matter, few shrink-wrap software houses have, either.) Consequently, the consumer often gets flawed software.

For a long time, liability was not an issue; the consumer merely put up with flawed software. However, with the advent of the Internet and e-commerce, liability is more of a concern. An e-commerce application’s failure to provide adequate protection against credit card number theft, for example, raises serious liability issues. Various studies indicate that 30% to 40% of software projects fail, that up to 50% of the failures are due to inadequate requirements definition, and that up to 40% of the failures recorded during a project’s life are attributable to requirements problems. Yet, e-commerce firms develop applications at Internet speed. Undoubtedly, there is little time for adequate requirements definition or for testing. These organizations risk incurring significant liability losses. Undoubtedly, they have much to learn about adopting some of the quality principles discussed above.

CMM and ISO models
What about quality models such as the CMM or ISO 9000? In their current versions, they impose QA oversight, with substantial reviews, audits, and record keeping. The CMM is probably used more widely in the US than ISO 9001 or the companion guideline, ISO 9000-3. The CMM requires that an organization establish an SQA function to achieve Level 2. The role that it establishes for SQA is essentially that of the “process police.” Don Reifer characterized a potential problem in that role: slavish enforcement of the process, whether it makes sense or not.3 Adversarial personnel relationships are still another potential problem. A policing posture can undermine team spirit.

The CMM also does not provide a road map for SQA. As noted before, it sets up SQA as the process police. In general, the SQA Key Process Area specifies that SQA performs reviews and audits of process and product compliance and identifies specific reviews and audits under the Verifying Implementation common feature ap-
pearing in the remaining KPAs. The SQA role should expand as the organization matures, because there are more codified practices that the organization should follow as it advances up the maturity ladder. In the CMM, there is little sense of an evolving role. The character of SQA’s role as the organization matures might be different. For example, if process becomes more a way of life as the organization moves up the ladder, fewer reviews and audits should be necessary, and SQA might add value at Levels 4 and 5 in other ways.

Does the Integrated CMM provide better guidance? It does define an evolving role. It does, for example, specify measurement and quantitative process improvement roles at Levels 4 and 5 for SQA (referred to as Product and Process Quality Assurance in the CMMI). However, the CMMI’s framework makes it appear to pertain only to measurement and improvement of the SQA process. Undoubtedly, to avoid too much compartmentalization of roles, one organization could accomplish many of the measurement and quantitative assessments of process specified at Levels 4 and 5 for all the KPAs.

SQA’s future
What, then, would be a meaningful role for SQA in the future? It is clearly desirable to embed reviews and evaluations in the process. However, such practices probably could not be carried out in a consistent and effective manner at the CMM’s lower levels; there is just too little process culture in most organizations at the lower levels. What I would propose, then, is the scenario that follows, based on the CMM.

To achieve Level 2, the level at which a project management discipline is institutionalized, typically requires a major shift in the organization’s attitude to acceptance of process discipline. Because organizations tend to ignore process as much as possible and focus on getting the software product out the door, the SQA role might need to emphasize the process police and peacekeeper roles to help instill a process-oriented culture within
the organization. This is probably less true in going from Level 2 to Level 3, the level at which a defined process exists at the organizational level. Because Level 2 organizations sometimes backslide easily, SQA might still need to pay a great deal of attention to process compliance. In moving toward Level 3, the organization has come to appreciate that process discipline has, in fact, made life easier; SQA would begin to shift, requiring the process police and peacekeeper roles to a lesser degree. In achieving Level 3, the organization establishes a process database, and SQA can begin assuming the responsibility for collecting and analyzing process metrics. Once an organization consistently functions at Level 3 and is moving ahead to Level 4, the need for process police diminishes considerably. Process discipline should be a way of life, and verifying compliance should be less necessary. Periodic random checks could verify that process conformance is still being maintained. The SQA role could focus primarily on measuring the process (and identifying corrective action when the process goes awry) and monitoring the attainment of numerical quality goals. At Level 5, an even further shift to measurement occurs. The SQA role at both Levels 4 and 5 can be that of metrics collector and first-level analysis performer. The first-level analysis would focus on detection of trends to be analyzed for identification of the need for process correction. At Level 5, SQA would perform the first-level analysis in assessing (undoubtedly, in pilot applications) the efficacy of proposed changes in technology or the process. Those who are more expert in specific processes or technology would have the responsibility for finally determining the acceptability of the changes. Such a role would not have to be totally independent. Evaluators could act in a part-time SQA capacity and embed evaluations in the process. Those doing the metrics analyses could also be developers acting in a part-time SQA capacity. However,
those defining and organizing the program and monitoring its performance should be independent to ensure that it functions properly. This group of people would not have to be large at all. The role I’ve described would result in cost-effective, unobtrusive, value-added SQA. The big picture But this discussion still leaves another major issue not addressed— something that might be referred to as “big picture QA.” Embedding peer reviews, inspections, audits, and so forth into the process tends to create a microscopic view of quality. When we do this, we are reviewing perhaps the design or the source code of single units of code. We still need a macroscopic view of quality. There are times when we must step back and see the whole picture to assess whether all the pieces fit together properly. All the units might be coded in accordance with the design, but does the integrated design satisfy the user’s or customer’s needs? Is our drive to embed quality evaluation processes creating other kinds of quality problems? We still need the more global project level reviews at major decision points or milestones in the project. Embedding quality evaluations into the development process is certainly a desirable way to go. Let us hope that the motivation for that practice is not purely overhead reduction and that balanced views of both macroscopic and microscopic quality are taken into consideration in structuring SQA programs. References 1. E. Baker and M. Fisher, “Software Quality Program Organization,” Handbook of Software Quality Assurance, Schulmeyer and McManus, eds., Prentice Hall, Upper Saddle River, N.J., 1999, pp. 115–145. 2. W. Royce, “Software Management Renaissance,” IEEE Software, vol. 17, no. 4, July 2000, pp. 116–19. 3. D. Reifer, “A Tale of Three Developers,” Computer, vol. 32, no. 11, Nov. 1999, pp. 128–30.
Emanuel R. Baker is president of Software Engineering Consultants. Contact him at 10219 Briarwood Drive, Los Angeles, CA 90077; [email protected].
focus
guest editors’ introduction
Introducing Usability
Natalia Juristo, Universidad Politécnica de Madrid
Helmut Windl, Siemens AG
Larry Constantine, Constantine & Lockwood
Certainly many of you have had enough frustrating experiences using software to acknowledge that usability strategies, models, and methods are often not applied adequately during software construction. Usability is not a luxury but a basic ingredient in software systems: People’s productivity and comfort relate directly to the usability of the software they use.
One definition of usability is quality in use.1 In simple terms, it reflects how easy the software is to learn and use, how productively users will be able to work, and
how much support users will need. A system’s usability does not only deal with the user interface; it also relates closely to the software’s overall structure and to the concept on which the system is based. Usability is a difficult attribute to embed in any system—not only software—and it requires specific knowledge and a lot of awareness about the user’s likings, requirements, and limitations. However, many software developers would rather work with machines than with people; they show little interest in issues such as how much data should appear on the screen at one time. Additionally, many designers do not realize that their perception of their creation does not provide much information about how others will react to it. That is why we get all those “perfectly obvious to the designer” creations. We should regard usability as one more quality attribute for consideration during software construction. Of course, we shouldn’t concentrate on just a single quality attribute when designing systems: combining software characteristics poses the real challenge.

Usability and software development
Integrating usability into the software development process is not easy or obvious. Even the companies that have usability departments have difficulty resolving conflicts between usability staff and software developers, based on the groups’ different perspectives. In the cases where software practitioners have applied usability techniques, they traditionally have done it late in the development cycle, when fixing major problems is costly. Ensuring a certain degree of usability based on the intended user’s work practices is very important when designing a good system concept or metaphor and must be integrated early in the design process. Interaction design can greatly affect the application’s overall architecture. If you consider usability too late in the life cycle, there is no time left to really make a difference—you can’t just toss it in at the last minute, any more than you could a good database schema. However, when you do introduce usability concepts into your organization, you can cost-justify the investment,2,3 reduce development time,3,4 increase sales, improve users’ productivity, and reduce support and maintenance costs. You can avoid conflicts with usability staff either by integrating usability experts into the development team or by making some team members usability experts. Depending on how much money you can invest, you can also avoid having to build costly usability labs. As IBM has stated, usability “makes business effective. It makes business efficient. It makes business sense.”5

In This Issue
Our aim in this special issue is to encourage software developers to listen more carefully to usability engineers and to give usability a more meaningful place in the overall software process. Usability is not an abstract idea; applications already exist that demonstrate it well. This issue tries to promote the application of that knowledge to a wider range of companies and systems by emphasizing existing techniques rather than research on new methods or models. Most of the articles following deal with real experiences. The seven articles we have selected fall into four categories.

First, because of the low level of usability awareness and skills among software engineers, we include a tutorial, “Usability Basics for Software Developers” by Xavier Ferre, Natalia Juristo, Helmut Windl, and Larry Constantine. This will help some readers better understand the rest of the articles.

Second, the links between usability and business competition or market domains is an especially interesting topic. In “Usability and the Bottom Line,” George M. Donahue discusses usability cost-effectiveness and describes how to perform a cost–benefit analysis. A company will often make this the first step toward integrating usability into its processes.

Two articles discuss industrial experiences with usability and software development. In “Usability in Practice: Three Case Studies,” Karla Radle and Sarah Young present three real cases of introducing usability engineering into an organization. Their article points out common obstacles and describes some lessons learned. In “Integrating Usability Techniques into the Software Development Process,” Kathi Garrity, Francie Fleek, Jean Anderson, and Fred Drake describe how two groups in their company, the Software Engineering Group and the User Performance Group, came to understand each other’s processes, vocabulary, and approaches. The article discusses the challenges they faced and the development process that resulted.

Because of the Web’s rapidly increasing significance in software development, the fourth group of articles addresses the role of usability engineering for Web applications. The usability of Web sites and applications continues to be worse than that of more traditional software. However, to be competitive in e-business, usability is a must. In “A Global Perspective on Web Site Usability,” Shirley Becker and Florence Mottay discuss how to assess Web application usability. Molly Hammar Cloyd’s article “Designing a User-Centered Web Application in Web Time” reports on a company’s experience transforming its development process from a traditional to a user-driven process.

Beyond usability, we have other issues to consider. As an example, the one research article in this issue, “Engineering Joy” by Marc Hassenzahl, Andreas Beu, and Michael Burmester, talks about the joy of use—the task-unrelated aspects of quality called hedonic quality. After a brief overview of research on the relationship between enjoyment and software, this article shows that traditional usability engineering methods are not suited to analyzing and evaluating hedonic quality. These authors present promising new approaches.
References
1. ISO/IEC 14598-1, Software Product Evaluation: General Overview, Int’l Org. for Standardization, Geneva, 1999.
2. G. Bias and D. Mayhew, Cost-Justifying Usability, Academic Press, New York, 1994.
3. T. Gilb, Principles of Software Engineering Management, Addison Wesley Longman, Boston, 1988.
4. M. Keil and E. Carmel, “Customer-Developer Links in Software Development,” Comm. ACM, vol. 38, no. 5, May 1995, pp. 33–44.
5. IBM, “Cost Justifying Ease of Use,” www3.ibm.com/ibm/easy/eou_ext.nsf/Publish/23 (current 2 Jan. 2001).
About the Authors: Natalia Juristo, Helmut Windl, and Larry Constantine’s biographies appear on page 29.
focus
usability engineering
Usability Basics for Software Developers
Xavier Ferré and Natalia Juristo, Universidad Politécnica de Madrid
Helmut Windl, Siemens AG, Germany
Larry Constantine, Constantine & Lockwood
This tutorial examines the relationship between usability and the user interface and discusses how the usability process follows a design-evaluate-redesign cycle. It also discusses some management issues an organization must face when applying usability techniques.

In recent years, software system usability has made some interesting advances, with more and more organizations starting to take usability seriously.1 Unfortunately, the average developer has not adopted these new concepts, so the usability level of software products has not improved.
Contrary to what some might think, usability is not just the appearance of the user interface (UI). Usability relates to how the system interacts with the user, and it includes five basic attributes: learnability, efficiency, user retention over time, error rate, and satisfaction. Here, we present the general usability process for building a system with the desired level of usability. This process, which most usability practitioners apply with slight variation, is structured around a design-evaluate-redesign cycle. Practitioners initiate the process by analyzing the targeted users and the tasks those users will perform.

Clarifying usability concepts
According to ISO 9241, Part 11, usability is “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”2 This definition ties a system’s usability to specific conditions, needs, and users—it requires establishing certain levels of usability based on the five basic attributes.
Usability engineering defines the target usability level in advance and ensures that the software developed reaches that level. The term was coined to reflect the engineering approach some usability specialists take.3 It is “a process through which usability characteristics are specified, quantitatively and early in the development process, and measured throughout the process.”4 Usability is an issue we can approach from multiple viewpoints, which is why many different disciplines, such as psychology, computer science, and sociology, are trying to tackle it. Unfortunately, this results in a lack of standard terminology. In fact, the term usability engineering is not universally accepted—other terms used include usage-centered design, contextual design, participatory design, and goal-directed design. All these philosophies adhere to some extent to the core issue of usability engineering: evaluating usability with real users from the first stages of development.

Usability attributes
We can’t define usability as a specific aspect of a system. It differs depending on the intended use of the system under development. For example, a museum kiosk must run a software system that requires minimum training, as the majority of users will use it just once in their lifetime. Some aspects of usability—such as efficiency (the number of tasks per hour)—are irrelevant for this kind of system, but ease of learning is critical. However, a bank cashier’s system would require training and would need to be highly efficient to help reduce customer queuing time. Because usability is too abstract a term to study directly, it is usually divided into the attributes we mentioned at the beginning of the article:5

■ Learnability: How easy it is to learn the main system functionality and gain proficiency to complete the job. We usually assess this by measuring the time a user spends working with the system before that user can complete certain tasks in the time it would take an expert to complete the same tasks. This attribute is very important for novice users.
■ Efficiency: The number of tasks per unit of time that the user can perform using the system. We look for the maximum speed of user task performance. The higher the system usability is, the faster the user can perform the task and complete the job.
■ User retention over time: It is critical for intermittent users to be able to use the system without having to climb the learning curve again. This attribute reflects how well the user remembers how the system works after a period of nonusage.
■ Error rate: This attribute contributes negatively to usability. It does not refer to system errors; on the contrary, it addresses the number of errors the user makes while performing a task. Good usability implies a low error rate. Errors reduce efficiency and user satisfaction, and they can be seen as a failure to communicate to the user the right way of doing things.
■ Satisfaction: This shows a user’s subjective impression of the system.
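To make the attributes above a bit more concrete, here is a minimal sketch (ours, not the authors’) of how two of them, efficiency and error rate, could be computed from a single logged test session; the session figures and function names are invented for illustration.

```python
# Hypothetical sketch: deriving two usability attributes from one test-session log.
# The session format and the figures below are invented; real studies use richer data.

def efficiency(tasks_completed: int, session_minutes: float) -> float:
    """Efficiency: tasks completed per hour of use."""
    return tasks_completed / (session_minutes / 60.0)

def error_rate(user_errors: int, tasks_attempted: int) -> float:
    """Error rate: user errors per attempted task (lower is better)."""
    return user_errors / tasks_attempted

if __name__ == "__main__":
    # One participant: 9 tasks attempted, 8 completed, 5 user errors, 25 minutes of work.
    print(f"Efficiency: {efficiency(8, 25):.1f} tasks/hour")
    print(f"Error rate: {error_rate(5, 9):.2f} errors/task")
```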
One problem concerning usability is that these attributes sometimes conflict. For example, learnability and efficiency usually influence each other negatively. A system must
be carefully designed if it requires both high learnability and high efficiency—for example, using accelerators (a combination of keys to perform a frequent task) usually solves this conflict. The point is that a system’s usability is not merely the sum of these attributes’ values; it is defined as reaching a certain level for each attribute. We can further divide these attributes to precisely address the aspects of usability in which we are most interested. For example, performance in normal use and advanced feature usage are both subattributes of efficiency, and first impression is a subattribute of satisfaction. Therefore, when analyzing a particular system’s usability, we decompose the most important usability attributes down to the right detail level. Usability is not only concerned with software interaction. It is also concerned with help features, user documentation, and installation instructions.
Usability and the user interface
We distinguish between the visible part of the UI (buttons, pull-down menus, checkboxes, background color, and so forth) and the interaction part of the system to understand the depth and scope of a system’s usability. (By interaction, we mean the coordination of the information exchange between the user and the system.) It’s important to carefully consider the interaction not just when designing the visible part of the UI, but also when designing the rest of the system.

For example, if a system must provide continuous feedback to the user, the developers need to consider this when designing the time-consuming system operations. They should design the system so it can frequently send information to the UI to keep the user informed about the operation’s current status. The system could display this information as a percentage-completed bar, as in some software installation programs.

Unfortunately, it is not unusual to find development teams that think they can design the system and then have the “usability team” make it usable by designing a nice set of controls, adding the right color combination, and using the right font. This approach is clearly wrong. Developers must consider user interaction from the beginning of the development process. Their understanding of the interaction will affect the final product’s usability.
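As a rough illustration of the feedback example above (a long-running operation that keeps the UI informed through a percentage-completed bar), the following sketch shows one common way to structure it: the operation accepts a progress callback rather than updating any widget directly. The design and all names are our own assumptions, not something the article prescribes.

```python
# Hypothetical sketch: a time-consuming operation designed so the UI layer can show
# a percentage-completed bar. The operation reports progress through a callback.
from typing import Callable, Sequence

def install_files(files: Sequence[str], report: Callable[[int], None]) -> None:
    """Processes each file (simulated here) and reports percent complete."""
    total = len(files)
    for done, name in enumerate(files, start=1):
        # ... the real copying or processing work would happen here ...
        report(int(done * 100 / total))

def console_progress_bar(percent: int) -> None:
    """A stand-in for a real UI widget: prints a simple text bar."""
    filled = percent // 5
    print(f"\r[{'#' * filled}{'.' * (20 - filled)}] {percent:3d}%", end="")

if __name__ == "__main__":
    install_files([f"file{i}.dat" for i in range(10)], console_progress_bar)
    print()
```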
Usability in software development The main reason for applying usability techniques when developing a software system is to increase user efficiency and satisfaction and, consequently, productivity. Usability techniques, therefore, can help any software system reach its goal by helping the users perform their tasks. Furthermore, good usability is gaining importance in a world in which users are less computer literate and can’t afford to spend a long time learning how a system works. Usability is critical for user system acceptance: If users don’t think the system will help them perform their tasks, they are less likely to accept it. It’s possible they won’t use the system at all or will use it inefficiently after deployment. If we don’t properly support the user task, we are not meeting user needs and are missing the main objective of building a software system. For a software development organization operating in a competitive market, failure to address usability can lead to a loss of market share should a competitor release a product with higher usability. Also, a software product with better usability will result in reduced support costs (in terms of hotlines, customer support service, and so forth). Even if a system is being used, it does not necessarily mean it has a high level of usability. There are other aspects of a software product that condition its usage, such as price, possibility of choice, or previous training. In addition, because users are still more intelligent than computers, it is usually the human who adapts to the computer in human–computer interaction. However, we shouldn’t force the user to adapt to software with poor usability, because this adaptation can negatively influence efficiency, effectiveness, and satisfaction. Usability is a key aspect of a software product’s success. The usability process As we mentioned, a system’s usability depends on the interaction design. Therefore, we must deal with system usability throughout the entire development process. Usability testing alone is not enough to output a highly usable product, because usability testing uncovers but does not fix design problems. Furthermore, usability testing has been viewed as similar to other types of software quality assurance testing, so developers often apply the techniques late in the develop-
ment cycle—when major usability problems are very costly, if not impossible, to fix. Therefore, it is crucial to evaluate all results during the product development process, which ultimately leads to an iterative development process. A pure waterfall approach to software development makes introducing usability techniques fairly impossible. All software applications are tools that help users accomplish certain tasks. However, before we can build usable software tools—or, rather, design a UI—we need information about the people who will use the tool:

■ Who are the system users?
■ What will they need to accomplish?
■ What will they need from the system to accomplish this?
■ How should the system supply what they need?
The usability process helps user interaction designers answer these questions during the analysis phase and supports the design in the design phase (see Figure 1). There are many usability methods—all essentially based on the same usability process—so we have abstracted a generic usability process from the different approaches to usability mentioned earlier. We hope this makes it easier for the reader to understand the different usability techniques we will be describing. Usability analysis phase First, we have to get to know the users and their needs, expectations, interests, behaviors, and responsibilities, all of which characterize their relationship with the system. User analysis. There are numerous approaches for gathering information about users, depending on each individual system under development and the effort or time constraints for this phase. The main methods are site visits, focus groups, surveys, and derived data. The primary source for user information is site visits. Developers observe the users in their working environment, using the system to be replaced or performing their tasks manually if there is no existing tool. In addition, developers interview the users to understand their motivation and the strategy behind their actions. A well-known method
Figure 1. The usability process. (The original diagram shows an analysis phase comprising user analysis, task analysis, usability benchmarks, and evaluation, followed by a design phase comprising conceptual design, visual design, and evaluation.)
for doing user analysis jointly with task analysis is contextual inquiry.6 This method provides a structured way for gathering and organizing information. A focus group is an organized discussion with a selected group of users. The goal is to gather information about their views and experiences concerning a topic. It is well suited for getting several viewpoints about the same topic—for example, if there is a particular software product to discuss—and gaining insight into people’s understanding of everyday system use. In a survey, the quality of the information depends on the quality of the questions. Surveys are a one-way source, because it is often difficult or even impossible to check back with the user. Don A. Dillman’s book Mail and Internet Surveys provides a structured method for planning, designing, and conducting surveys.7 Derived data includes hotline reports, customer complaint letters, and so forth. It can be a good source of usability implications but is often difficult to interpret. The most important limitation is that such sources are one-sided. They report only problems and say nothing about the features that users liked or that enabled efficient use. The most important thing about user analysis is to record, structure, and organize the findings. Task analysis. Task analysis describes a set
of techniques people use to get things done.8 The concept of a task is analogous to the concept of a use case in object-oriented software development; a task is an activity meaningful to the user. User analysis is taken as input for task analysis, and both are sometimes performed jointly. We analyze tasks because we can use the located tasks to drive and test UI design throughout the product development cycle. Focusing on a small set of tasks helps rationalize the development effort. Therefore, we suggest prioritizing the set of tasks by
importance and frequency to get a small task set. This approach guarantees that you’ll build the most important functionalities into the system and that the product will not suffer from “featuritis.” These tasks should be the starting point for developing the system. One approach to analysis is to build a task model within the Usage-Centered Design method, a model-driven approach for designing highly usable software applications, where tasks, described as essential use cases, are the basis for a well-structured process and drive UI design.9

Task analysis ends when we evaluate the discovered task set, which is best done collaboratively with users. When the user population is already performing a set of tasks, we perform task analysis during user analysis to apprehend the tasks the user performs routinely and how the user perceives these tasks. After the optional first analysis, we identify the tasks our system will support, based on a study of the goals the user wants to attain. Then, we break the tasks into subtasks and into particular actions that the user will perform and take the identified tasks as the basis for building the usability specifications. We then instantiate them to real-world examples and present them to test participants in a usability test.

Usability benchmarks. We set usability bench-
marks as quantitative usability goals, which are defined before system design begins.10 They are based on the five basic usability attributes or their subattributes. We need these benchmarks because, if we want to assess the value of the usability attributes for the system under development, we need to have a set of operationally defined usability benchmarks. We establish usability benchmarks by defining a set of benchmarks for each usability attribute we want to evaluate—that is, for each usability attribute we consider important for our system. We must define the benchmarks in a way that makes them calculable in a usability test or through a user satisfaction questionnaire.
Table 1. A sample usability specification table.4

Usability attribute       | Measuring instrument   | Value to be measured                                                         | Current level | Worst acceptable level | Planned target level | Best possible level | Observed results
---------------------------|------------------------|------------------------------------------------------------------------------|---------------|------------------------|----------------------|---------------------|------------------
Performance in normal use  | “Answer request” task  | Length of time taken to successfully perform the task (minutes and seconds) | 2 min, 53 sec | 2 min, 53 sec          | 1 min, 30 sec        | 50 sec              |
First impression           | Questionnaire          | Average score (range –2 to 2)                                                | —             | 0                      | 1                    | 2                   |
Table 1 shows the format of a usability specification table. (The “Observed results” column is filled with the data gathered during the usability tests.) We take task analysis as an input for this activity, because most usability benchmarks are linked to a task specified in task analysis.
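The following sketch (ours, not part of the article) shows how a usability specification in the spirit of Table 1 might be represented in code and checked against observed results; the attribute names and levels mirror the table, while the observed value and all identifiers are invented for illustration.

```python
# Hypothetical sketch: a usability specification with a check of observed results
# against the planned target level. Figures mirror Table 1 (times in seconds).
from dataclasses import dataclass
from typing import Optional

@dataclass
class UsabilityBenchmark:
    attribute: str
    measuring_instrument: str
    worst_acceptable: float
    planned_target: float
    best_possible: float
    higher_is_better: bool
    observed: Optional[float] = None

    def meets_target(self) -> Optional[bool]:
        if self.observed is None:
            return None  # no usability-test data gathered yet
        if self.higher_is_better:
            return self.observed >= self.planned_target
        return self.observed <= self.planned_target

spec = [
    UsabilityBenchmark("Performance in normal use", '"Answer request" task',
                       worst_acceptable=173, planned_target=90, best_possible=50,
                       higher_is_better=False, observed=118),  # observed value invented
    UsabilityBenchmark("First impression", "Questionnaire",
                       worst_acceptable=0, planned_target=1, best_possible=2,
                       higher_is_better=True),                 # score in the range [-2, 2]
]

for b in spec:
    print(f"{b.attribute}: meets planned target -> {b.meets_target()}")
```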
Usability design
Once we have analyzed the tasks our system will support, we can make a first attempt at the UI’s conceptual design, which we will evaluate and possibly improve in the next iteration.

Conceptual design. During the conceptual design phase, we define the basic user–system interaction and the objects in the UI and the contexts in which interaction takes place. The findings of the user and task analysis are the basis for the conceptual design. The deliverables from this phase are typically paper prototypes, such as pencil drawings or screen mockups, and a specification, which describes the UI’s behavior. Conceptual design is the most crucial phase in the process, because it defines the foundation for the entire system.

Unfortunately, design is a very creative process, and it can’t be automated with a method. There is a set of design principles and rules that we must creatively adapt for a certain design problem. (A good reading for any designer—not just software designers—is The Design of Everyday Things,11 which presents general design principles by evaluating the design of everyday objects.) The main principles of UI design cover feedback, reuse, simplicity, structure, tolerance, and visibility in UIs. Knowing usability design principles is the basis for good design. Compare this to an adult drawing class. Not everyone will be Picasso by the end of the course, but the students will be able to paint reasonable pictures if they use the principles they learned. Another way to improve design ability is to examine UIs. Analyzing the UIs of every software application you can access is very helpful and can sometimes be a source of inspiration for finding innovative, alternative solutions.

The conceptual design phase also ends with evaluating the results. It is a good idea to test the paper prototypes against the defined task set to check that all the prioritized tasks can be enacted. The last test in this phase is run together with users as a usability test or usability inspection of the paper prototype.

Visual design. Having completed the concep-
tual design, the final step in our process is visual design, where we define the UI’s appearance. This covers all details, including the layout of screens and dialog boxes, use of colors and widgets, and design of graphics and icons. There are also rules and principles for visual design, addressing use of color, text, screen layout, widget use, icon design, and so forth. It pays to have a professional screen designer, especially in this phase. Recommended readings about visual and conceptual design are About Face12 and Software for Use,9 which both include numerous design tips. Designing Visual Interfaces focuses on screen design and graphics design in the context of UIs, as well as the underlying principles of visual design.13 The deliverables of this phase are prototypes that must be tested, an exact specification of the UI appearance, and behavior plus the specification for new widgets that must be developed. Prototyping Prototypes are not exclusive to UI design, but they are valuable for performing usability testing in early development phases. We need to build prototypes because abstract technical specifications are not a good way of
Prototyping
Prototypes are not exclusive to UI design, but they are valuable for performing usability testing in early development phases. We need to build prototypes because abstract technical specifications are not a good way of communicating when we want to involve users in the design process—users understand tangible system prototypes much better.5 Some prototyping techniques help perform usability testing and require little implementation effort. We create prototypes to test them on the user through usability evaluation techniques. The prototyping techniques with which software developers usually are not familiar include

■ Paper mock-ups: At the beginning of the design process, the designer creates paper prototypes—usually pencil drawings or printouts of screen designs—for the user. The designer will act as the computer, showing the user the next element when a transition between graphical elements occurs.8
■ "Wizard of Oz" technique:8 A human expert acts as the system and answers the user's requests, without the user's knowledge. The user interacts normally with the screen, but instead of using software, a developer sits at another computer (network-connected to the user's computer) answering the queries. The user gets the impression of working with a real software system, and this method is cheaper than implementing a real software prototype.
■ Scenarios, storyboards, and snapshots: A scenario describes a fictional story of a user interacting with the system in a particular situation; snapshots are visual images that capture the interaction occurring in a scenario; and storyboards8 are sequences of snapshots that focus on the main actions in a possible situation. They make the design team think about the appropriateness of the design for a real context of use, and they help make the process user-centric.
Usability evaluation
Usability evaluation is a central activity in the usability process. It can determine the current version's usability level and whether the design works.

Usability testing. The term usability testing describes the activity of performing usability tests in a laboratory with a group of users and recording the results for further analysis. We can't predict a software system's usability without testing it with real users. First, we must decide which groups of users we want to use to test the system and how many from each group we will try to recruit as test participants. Then, we must design the test tasks we'll ask the participants to perform. We usually take them from the results of the task analysis activity and apply them to hypothetical real-life situations. Some characteristics of the test require consideration, such as

■ whether the participant can ask the evaluator for help;
■ whether two participants should jointly perform each test task so we can observe the remarks they exchange in the process;
■ what information participants will receive about the system prior to the test; and
■ whether to include a period of free system access after completing the predefined tasks to get the user's overall impression of the system.
After we prepare the test and recruit test participants, we run the tests, optionally recording them with video cameras or audio recorders, and log the users' actions in the system for further analysis (also optional). Once we have performed all the tests, we analyze the data and gather results to apply them in the next iterative cycle.
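If the optional action log is captured electronically, even a very small amount of tooling turns it into task times that can be checked against the usability benchmarks defined earlier. The Python sketch below is our own illustration, not part of the process described in this article; the event names and file name are hypothetical.

    import csv
    import time

    LOG_FILE = "session_01.csv"   # hypothetical log file, one per test session

    def log_event(task_id: str, event: str) -> None:
        """Append one timestamped user action to the session log."""
        with open(LOG_FILE, "a", newline="") as f:
            csv.writer(f).writerow([time.time(), task_id, event])

    def task_duration_seconds(task_id: str) -> float:
        """Elapsed time between the first and last logged event of a task."""
        with open(LOG_FILE, newline="") as f:
            stamps = [float(row[0]) for row in csv.reader(f) if row[1] == task_id]
        return max(stamps) - min(stamps) if stamps else 0.0

    # During the session, the facilitator (or an instrumented UI) records events:
    log_event("answer_request", "task_started")
    log_event("answer_request", "opened_search_dialog")
    log_event("answer_request", "task_completed")

    # Afterward, the measured duration can be checked against the planned
    # target level from the usability specification table.
    print(task_duration_seconds("answer_request"))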
Thinking aloud. Formative evaluation seeks to learn which detailed aspects of the interaction are good and how to improve the interaction design.5 This opposes summative evaluation, which is performed at the end of the development process, after the system has been built. The results of summative evaluation do not help shape the product. Thinking aloud helps perform formative evaluation in usability tests. We ask the test participant to think aloud while using the system in a usability test,8 to verbalize his or her actions so we can collect the remarks. For example, a participant might say, "First, I open the file, and I click once on the file icon. Nothing happens. I don't know why this is not working like the Web. I press the Enter key, and it opens. Now I want to change the color of the label, so I search in the Tools menu, but I can't find any option for what I want to do." User remarks obtained in usability tests can provide significant insight into the best way of designing the system interaction. By detailing their mental process, test participants can uncover hidden usability problems. Formative evaluation is the usual form of evaluation in a usability process, combining qualitative data gathered from user comments with quantitative data to check against previously defined usability benchmarks.
Further Reading
The following are books about human–computer interaction (HCI) and usability that are most likely to interest software practitioners with little or no knowledge of the field (the information for each book is in the References section of this article).

B. Shneiderman, Designing the User Interface: Strategies for Effective Human–Computer Interaction—This book summarizes all aspects related to interactive systems from a serious scientific viewpoint, although some readers might prefer a more engineering-focused approach. It includes a valuable set of guidelines for designing the user interface. The third edition adds interesting material on issues such as hypermedia, the Web, and Computer-Supported Cooperative Work.

D. Hix and H.R. Hartson, Developing User Interfaces: Ensuring Usability Through Product and Process—Despite its title, this book is not just about the user interface; it focuses on the process of user interaction design. Written in a very practical style, it provides a hands-on approach to designing the interactive part of a software system. Software practitioners might want to skip the chapters devoted to the User Action Notation—a technique for representing interaction designs—which is too formal for non-HCI experts.

L.L. Constantine and L.A.D. Lockwood, Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design—The most recent of the books reviewed here, it presents a process for designing usable software systems based on one of the current trends in software engineering: use cases. The book is written in a practical style that is likely to appeal to software practitioners.

J. Nielsen, Usability Engineering—This book provides a good introduction to usability engineering. It is easy to read and includes stories of real situations. It deals with a wide variety of issues related to usability engineering—but none are addressed in depth.
Heuristic evaluation. A usability expert can perform a heuristic evaluation of the system to make some development iterations shorter and to perform more iterations in the development process. The expert will make a critique founded both on his or her interaction design experience and on generally accepted usability guidelines, like the ones by Ben Shneiderman14 and Jakob Nielsen.5 Experts provide a different kind of feedback than final users do through usability testing. Expert suggestions for modification are
usually more applicable, and they are more precise about the underlying usability problems, such as a lack of consistency or poor navigation. On the other hand, usability testing must be performed with real users to identify specific usability problems. Heuristic evaluation can complement but not replace usability testing. Collaborative usability inspection. A collabo-
rative usability inspection is a systematic examination of a finished system, design or prototype from the end user’s viewpoint.9 A team of developers, end users, application or domain experts, and usability specialists collaboratively perform the review. Collaborative usability inspections (CUIs) use features and techniques from heuristic evaluation, pluralistic usability walkthroughs, and expert evaluations and are less expensive and faster than usability testing. Behind this technique is a set of strict rules to avoid the problems that typically arise if end users discuss their work together with designers or developers. CUIs uncover more—albeit different—usability defects (up to 100 defects per hour) than usability testing. Apart from efficiency, one advantage is that people with multiple perspectives and expertise examine the test object. Another advantage is that the participating developers build skills and know-how about how to make software more usable. Management and organizational issues When introducing usability, an organization must first commit management to the ideas behind the usability process and convince them of its benefits.4,5 The newest concepts they need to accept include creating conceptual design in the first stages of development and evaluating usability throughout the development process. CostJustifying Usability presents cost-benefit arguments in favor of performing usability practices, which can be used when trying to get management commitment.15 Another option to convince management is to take a recently developed system or one that is currently being developed and to perform videotaped usability tests with a few users who are novel to the system. Showing the results to management and the development team can produce a change of attitude to-
ward usability testing, as the results will probably show that the system is not as good in usability terms as expected. Integrating UI designers into the development team isn’t always easy, especially if they are assigned to several projects at the same time. One approach to applying usability techniques in some projects is to promote one member of each development team to usability champion,5 similar to process improvement champions. Usability champions learn the basic usability skills and are coordinated by a user interaction designer. The user interaction designer then acts as a consultant in several projects but can interact with the usability champion in each group.4 Don’t try to do a full-scale usability process from the beginning. You can start by setting a small set of usability specifications with a simple task analysis of the most prominent tasks, some conceptual design with paper prototypes and simple usability tests to be carried out with a small set of users. You can also act as usability expert performing heuristic evaluation on the system using the guidelines we mentioned (by Shneiderman14 and Nielsen5). Starting with modest objectives will contribute more firmly to the final success of your endeavor.
Despite increasing usability awareness in software development organizations, applying usability techniques in software development is not easy. Software engineers and usability engineers have different conceptions of software development, and conflicts can arise between them due to differences in terminology and procedures. To create acceptable usability concepts, the software engineering community must integrate usability techniques into a software engineering process that is recognizable from both fields. Use cases offer a good starting point, as they are the software engineering construct closest to a usable software development approach.

References
1. L. Trenner and J. Bawa, The Politics of Usability, Springer-Verlag, London, 1998.
2. Ergonomic Requirements for Office Work with Visual Display Terminals, ISO 9241-11, ISO, Geneva, 1998.
3. M. Good et al., "User-Derived Impact Analysis as a Tool for Usability Engineering," Proc. CHI Conf. Human Factors in Computing Systems, ACM Press, New York, 1986, pp. 241–246.
4. D. Hix and H.R. Hartson, Developing User Interfaces: Ensuring Usability Through Product and Process, John Wiley & Sons, New York, 1993.
5. J. Nielsen, Usability Engineering, AP Professional, Boston, Mass., 1993.
6. H. Beyer and K. Holtzblatt, Contextual Design: A Customer-Centered Approach to Systems Design, Morgan Kaufmann, San Francisco, 1997.
7. D.A. Dillman, Mail and Internet Surveys: The Tailored Design Method, John Wiley & Sons, New York, 1999.
8. J. Preece et al., Human-Computer Interaction, Addison-Wesley Longman, Reading, Mass., 1994.
9. L.L. Constantine and L.A.D. Lockwood, Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design, Addison-Wesley Longman, Reading, Mass., 1999.
10. J. Whiteside, J. Bennett, and K. Holtzblatt, "Usability Engineering: Our Experience and Evolution," Handbook of Human-Computer Interaction, Elsevier North-Holland, Amsterdam, 1988.
11. D.A. Norman, The Design of Everyday Things, Doubleday, New York, 1990.
12. A. Cooper, About Face: The Essentials of User Interface Design, IDG Books Worldwide, Foster City, Calif., 1995.
13. K. Mullet and D. Sano, Designing Visual Interfaces: Communication Oriented Techniques, Prentice Hall, Upper Saddle River, N.J., 1994.
14. B. Shneiderman, Designing the User Interface: Strategies for Effective Human-Computer Interaction, Addison-Wesley Longman, Reading, Mass., 1998.
15. R.G. Bias and D.J. Mayhew, Cost-Justifying Usability, Academic Press, Boston, Mass., 1994.
About the Authors Xavier Ferré is an assistant professor of software engineering at the Universidad Politécnica de Madrid, Spain. His primary research interest is the integration of usability techniques into software engineering development practices. He has been a visiting PhD student at CERN (European Laboratory for Particle Physics) and at the HCIL (Human–Computer Interaction Laboratory) at the University of Maryland. He received an MS in computer science from the Universidad Politécnica de Madrid. He is a member of the ACM and its SIGCHI group. Contact him at
[email protected]. Natalia Juristo is a full professor in the Computer Science Department at the Universidad Politécnica de Madrid, where she directs master’s-level courses in knowledge engineering and software engineering. She is also an editorial board member of IEEE Software and the International Journal on Software Engineering and Knowledge Engineering. She has a BS and PhD in computer science from the Technical University of Madrid. She is a senior member of the IEEE Computer Society and a member of the ACM, the American Association for the Advancement of Science, and the New York Academy of Sciences. Contact her at the Facultad de Informática UPM, Campus de Montegancedo, s/n, Boadilla del Monte, 28660 Madrid, Spain;
[email protected]. Helmut Windl leads the User Interface Design Group for Simatic Automation Software at Siemens’ Automation & Drives Division, where he has helped define and implement a structured usability process within the software development process. He is an experienced userinterface and visual designer for large-scale software applications and a project leader for usability-focused products. He is also a trainer and presenter in Siemens AG and with Constantine & Lockwood. He received a diploma in electrical engineering from the University of Applied Sciences Regensburg. Contact him at Siemens AG, A&D AS S8, PO Box 4848, D-90327 Nuremberg, Germany;
[email protected]. Larry Constantine is director of research and development at Constantine & Lockwood, a training and consulting firm. He is also an adjunct professor in the School of Computing Sciences at the University of Technology, Sydney, where he teaches software engineering and managing organizational change, and he is on the faculty of the Cutter Consortium. He has authored or coauthored 10 books, including Software for Use: A Practical Guide to the Methods and Models of Usage-Centered Design (Addison Wesley Longman, 1999). Contact him at Constantine & Lockwood, 58 Kathleen Circle, Rowley, MA 01969;
[email protected]; www.foruse.com.
focus
usability engineering
Usability and the Bottom Line George M. Donahue, Sapient
There's little debate that usability engineering benefits end users, but its benefit for companies and the people who work for them is less widely known. The author discusses these broader usability benefits and also how to use a cost-benefit analysis to demonstrate the value of usability to your company's bottom line.

Usability engineering benefits end users. Few people disagree with that idea. However, usability's beneficiaries also include system developers and the companies they work for. Improving usability—whether of IT systems, e-commerce Web sites, or shrink-wrapped software—is not only highly cost-effective, but it can also reduce development, support, training, documentation, and maintenance costs.
Additionally, it can shorten development time and improve a product's marketability. Here, I will show how you can use a cost-benefit analysis to sell usability engineering based on the bottom line. I'll then discuss the broader benefits a company can realize from usability engineering.

Cost-benefit analysis
Although usability's broad benefits are impressive, a cost-benefit analysis might be a necessary first step in introducing usability into your organization or a particular project. In usability cost-benefit analyses, the goal is to estimate the costs and benefits of specific usability activities—such as prototyping, usability testing, heuristic evaluation, and so on—and contrast them with the likely costs of not conducting the activities. The analysis has four steps:

■ selecting a usability technique,
■ determining the appropriate unit of measurement,
■ making a reasonable assumption about the benefit's magnitude, and
■ translating the anticipated benefit into a monetary figure.

In a usability cost-benefit analysis, it's also important to focus on the techniques and benefits likely to yield the most value for, and seem most persuasive to, the people you're trying to persuade. As Deborah Mayhew and Marilyn Mantei put it, you should "decide the relevant audience for the analysis and then what the relevant categories of benefits are for that audience, because not all potential benefits are relevant to all audiences."1 For example, a commercial software company might be more interested in a cost-benefit analysis that focuses on usability's potential for reducing development costs and increasing customer satisfaction
than in an analysis that focuses on its potential to improve end-user productivity. To illustrate, I offer the following scenario, based on a method developed by Mayhew and Mantei. While the scenario’s anticipated benefit is a productivity improvement on an internally developed IT system, you can use the same methodology to perform usability cost-benefit analyses for other benefits and in different organizations, including e-commerce and commercial software companies. For this scenario, assume that you work at Pretty Good Systems. You are aware that the human resources department has lodged complaints against PGS’s internally developed human resource system, Getting In Good (GIG). You’re also aware that development on GIG Version 2 is about to begin. You suspect that improving GIG’s usability would not only make GIG users’ lives easier, it could save the company money—and PGS management is very interested in reducing costs. Before you broach the subject of improving GIG’s usability, you wisely decide to do some preliminary usability work followed by a cost-benefit analysis. Preliminary work To start, you interview the HR director and several GIG users. They all say that GIG is too complicated. They can’t understand why there’s one screen for entering new applicant data, another for entering data about applicants who have been interviewed, another if an applicant is hired, and so on. “There’s simply not that much applicant data,” a GIG user tells you. “I don’t see why we need so many different screens. It takes so long to go through them all.” Next you spend an afternoon observing HR staff using the system. You then sketch out some low-fidelity, paper prototypes of how a single GIG data-entry screen might look and try out the prototypes on a few HR staffers in an informal usability test. Your preliminary usability work supports your hypothesis: GIG’s user interface is inefficient. Usability-aware person that you are, you also interview other significant GIG stakeholders—namely, the GIG development manager and some GIG developers. You then discuss the option of entering applicant data on a single screen, as opposed to several. The developers tell you that, from a technical standpoint, it’s easier to “modu-
larize” applicant information according to the applicant’s status in the hiring process. Nonetheless, you ask if it’s possible to use a one-screen-per-applicant approach in GIG’s next version. The development manager says that they could do it, but that it would take at least 30 more person-hours. Estimating costs You’re now ready to do the cost-benefit analysis. First, you estimate how much it costs to process a job application in GIG’s current version. Based on your interview with the HR manager, you know that it takes an average of four hours per applicant and that the average loaded salary of a GIG data-entry person is $25 an hour. You multiply $25 by four to get the cost of processing a single application: $100. On average, PGS receives about 1,000 job applications a year. Multiplying $100 by 1,000 gives you the current average annual cost of processing job applications at PGS: $100,000. To be on the safe side, you assume that making the programming change to GIG will take 40 hours, rather than the 30 hours estimated by the development manager. You then multiply 40 hours by the loaded average salary of a developer at PGS ($60 an hour) to determine the cost of making the change: $2,400. Estimating benefits Your preliminary usability work suggests that adopting a one-screen-per-applicant approach would cut application-processing time in half. This is your unit of measurement for usability. However, to be cautious, you assume a 25-percent reduction in processing time. Given this, on average, a person could process an application in three hours rather than four. Given that the average loaded salary of a data entry person is $25 per hour, you estimate a cost of $75 to process a single job application in the new system. To estimate the overall savings in the average annual cost of processing job applications at PGS, you multiply $75 by 1,000. The result is $75,000—$25,000 a year less than the current average processing costs. You then factor in the cost of making the changes ($2,400) and subtract it from the anticipated first-year savings, giving you a first-year benefit of $22,600.
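The arithmetic above is simple enough to script, which makes it easy to rerun the analysis under different assumptions (a different reduction in processing time, a different loaded salary, and so on). The following Python sketch merely restates the figures from this scenario; the variable names are illustrative only and are not part of the original analysis.

    # Restatement of the PGS scenario's arithmetic; all figures come from the text above.
    HOURLY_RATE_DATA_ENTRY = 25     # loaded salary of a GIG data-entry person, $/hour
    HOURLY_RATE_DEVELOPER = 60      # loaded salary of a PGS developer, $/hour
    APPLICATIONS_PER_YEAR = 1000

    hours_per_application_current = 4
    hours_per_application_new = 3   # cautious 25% reduction, not the 50% the prototype suggested
    change_effort_hours = 40        # padded from the development manager's 30-hour estimate

    current_annual_cost = HOURLY_RATE_DATA_ENTRY * hours_per_application_current * APPLICATIONS_PER_YEAR  # $100,000
    new_annual_cost = HOURLY_RATE_DATA_ENTRY * hours_per_application_new * APPLICATIONS_PER_YEAR          # $75,000
    change_cost = HOURLY_RATE_DEVELOPER * change_effort_hours                                             # $2,400

    annual_savings = current_annual_cost - new_annual_cost   # $25,000
    first_year_benefit = annual_savings - change_cost        # $22,600
    print(current_annual_cost, new_annual_cost, change_cost, annual_savings, first_year_benefit)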
However, you're not quite finished. The typical lifespan of a PGS system is three years. Because the cost of making the changes will be incurred once, you need only deduct that cost for one year, whereas the benefit will be realized each year that the new version is used. Thus, you add the first-year benefit ($22,600) to the benefit for the second and third years ($50,000), and you get a total lifetime usability benefit of $72,600. The overall result is a cost-benefit ratio of 1:30.25; that is, $72,600 ÷ $2,400 = $30.25. In this example, the benefits to the HR department are different from the benefits to PGS as a whole. Because HR's data-entry process will be streamlined, GIG2's users should be able to get more work done, and their main benefits are higher productivity and better job satisfaction. For PGS's development managers (and presumably also its executives and shareholders), the main benefit is decreased costs. Although managers, executives, and shareholders may be happy to hear that usability engineering will improve job satisfaction, your cost-benefit analysis should focus on the cost savings, because that's the most relevant benefit to development managers—and their buy-in is crucial.

Broader usability benefits
According to the above analysis, every dollar spent on usability offers a return of $30.25. That's a nice return on investment. Nonetheless, considering usability solely from an ROI perspective does not give it its full due. Usability costs are typically seen as additional; that is, if development doesn't include any formal usability activities—such as usability testing or prototyping—there won't be any usability costs. That assumption is wrong. Every software product has a user interface, whether it's usability-engineered or not. The interface takes time to build, regardless of whether it's consciously engineered for usability. Time is money. In other words, user-interface costs are inevitable and intrinsic to development. Many systems also have support, documentation, maintenance, and other costs. These are also usability expenditures, and regardless of how such costs appear on the company's books, usability engineering can help manage them. Although the ROI argument is compelling,
your usability case can be bolstered by pointing out that the company always spends money on usability, even though it may not see it this way. In his article, Arnold M. Lund cost-justifies the existence of a permanent usability group within an organization over carrying out discrete usability activities.2 I'll now examine several broader usability benefits in more detail.

Reduced development and maintenance costs
Software development projects typically overrun their budgets and schedules. Such overruns are often caused by overlooked tasks and similar problems, which techniques such as user analysis and task analysis can address. When you focus on real user needs and understand the people you're designing for, the result is often fewer design arguments and fewer iterations. Usability techniques, such as those described in the sidebar "Basic Usability Engineering," are also highly effective in helping you detect usability problems early in the development cycle, when they're easiest and least costly to fix. By correcting usability problems in a project's design phase, for example, American Airlines reduced the cost of those fixes by 60 to 90 percent.1 One frequently referenced study found that correcting a problem once a system is in development costs 10 times as much as fixing the same problem in the design stage. Once a system is released, the cost to fix a problem increases to 100 times that of a design-stage fix. This study also found that 80 percent of software life-cycle costs occur during the maintenance phase; many maintenance costs are associated with user requirements and other problems that usability engineering can prevent.3 Whether your company does usability testing or not, your customers will, in effect, usability-test the system. Ultimately, of course, relying on such "usability testing by default" risks angering customers, and, as the studies above show, post-release problems cost much more to fix. A printer manufacturer, for example, released a printer driver that many users had difficulty installing. More than 50,000 users called the support desk for assistance, costing the company nearly $500,000 a month. To correct the situation, the manufacturer sent out letters of apology and patch diskettes (at a cost of $3 each). In all, they spent $900,000 on the problem. The company ran no user testing of the driver before its release. As one researcher put it, "The problem could have been identified and corrected at a fraction of the cost if the product had been subjected to even the simplest of usability testing."1
Basic Usability Engineering Applying usability techniques—even if only in an informal, “guerilla” manner—can offer many usability benefits. Here, I outline a few basic techniques that are fundamental to usability engineering. For more information, see books such as Jakob Nielsen’s Usability Engineering (Academic Press, Boston, 1993), Ben Shneiderman’s Designing the User Interface (third edition, Addison Wesley Longman, Reading, Mass., 1997), or Mark Pearrow’s Web Site Usability Handbook (Charles River Media, Rockland, Mass., 2000). There are also many good usability Web sites, including www.stc. org/pics/usability/topics/index.html, www.useit.com, and www.usableweb. com.
User and task analysis The focus in this technique is on interviewing the actual or intended users. If the system does not yet exist, you can ask your marketing department for customer profiles and use them to guide your recruitment effort. (Because marketing departments typically think of people as customers, rather than users, it’s important that you ask specifically for “customer profiles, as asking for “user profiles” will most likely produce only quizzical looks.) Once you’ve recruited users, ask them to explain what they use the system for, what they most like about it, what they don’t like about it, and so on. If there is no digital system in place, ask users questions about the current, manual process of completing the tasks. Next, observe them using the system (or completing the tasks manually). You can then ask them questions based on your observations, such as, “When you were using the XYZ Screen, you said, ‘I always get mixed up here.’ Could we go back to that screen now so you can show me exactly where it is you get mixed up?”
Low-fidelity prototyping With this technique, the focus is on user interaction with designs, screens, or pages. You begin by sketching system screens or site pages, preferably on paper. The less “finished-looking” your designs are, the better. Users are typically more candid with rough, high-level interaction designs that look as if they didn’t take much work. Focus your prototypes on the screens or pages that are commonly used or that you think users might find difficult to work with. It’s unlikely that you’ll be able to prototype and test the entire user interface.
Usability-testing the prototype Once you’ve designed your low-fidelity prototype, usability-test it with three to five users or intended users. To do this, you first write a usability test script with tasks for the test participants to perform using the prototyped pages. Although the prototype is low-fidelity, ask users to interact with it as if it were a functional system. For example, if it’s a paper prototype, have them use their finger for the mouse and say “I’d click here to display my most recent transactions,” and so on. If they have problems completing the task, note what the problems are. Once you’ve completed testing, review the findings for patterns and to understand why people had the problems they did. Next, think of alternative designs that might eliminate the problems discovered in testing. Do this as many times as necessary or possible until you have a design that facilitates good user performance. Once you’ve done low-fidelity prototype usability testing, you might want to conduct usability testing with an interactive, higher fidelity system prototype. However, the main idea is to try to identify usability bugs as soon as possible, which is why low-fidelity prototyping— which you can do prior to coding—is so important.
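If you want to keep the test script and the observed problems in a lightweight, analyzable form, something as simple as the following Python sketch is enough. It is an illustration of the sidebar's advice rather than anything it prescribes; the task wording and field names are purely hypothetical.

    # A minimal low-fidelity test script plus a per-participant findings log.
    test_script = [
        {"task_id": "T1", "prompt": "Find last month's transactions and print them."},
        {"task_id": "T2", "prompt": "Change the mailing address on the account."},
    ]

    findings = []   # one entry per observed problem

    def note_problem(participant: str, task_id: str, observation: str) -> None:
        findings.append({"participant": participant, "task": task_id, "problem": observation})

    # Noted while participants work through the paper prototype:
    note_problem("P1", "T1", "Looked for Print inside the transaction list; it is on a separate screen.")
    note_problem("P2", "T1", "Did not recognize the calendar icon as a date filter.")

    # After three to five participants, group the notes by task to spot patterns.
    for task in test_script:
        problems = [f for f in findings if f["task"] == task["task_id"]]
        print(task["task_id"], "-", len(problems), "problem(s) noted")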
Improved productivity and efficiency
People tend to be more productive using usability-engineered systems. This benefit can be especially important in the context of IT software systems. For example, a major computer company spent $20,700 on usability work to improve the sign-on procedure in a system used by several thousand people. The resulting productivity improvement saved the company $41,700 the first day the redesigned system was used. On a system used by more than 100,000 people, for a usability outlay of $68,000, the same company recognized a benefit of $6,800,000 within the first year of the system's implementation. A cost-benefit analysis of such figures results in a cost-benefit ratio of $1:$100.1 Working with systems that have not been usability engineered is often stressful. Alan Cooper, "the father of Visual Basic," worked on a project to improve the usability of an airline in-flight entertainment system. IFEs are devices connected through an onboard local area network that provide movies and music to travelers on transoceanic routes. One airline's IFE was so frustrating for the flight attendants that many of them were bidding to fly shorter, local routes—which are usually considered highly undesirable—to avoid having to learn and use the difficult systems. "For flight attendants to bid for flights from Denver to Dallas just to avoid the IFE indicated a serious morale problem."4 When possible, people avoid using stressful systems; if people must use such systems, stress tends to undermine their productivity. And, as Cooper's anecdote illustrates, poor usability can undermine morale.

Reduced training costs
Usability-engineered systems can reduce training needs. When user interface design is informed by usability data and expertise, the resulting interfaces often facilitate and reinforce learning and retention, thereby reducing training time. At one company, only one hour of end-user training was needed on a
usability-engineered internal system, in contrast to the full week of training for a predecessor system that had no usability work. At AT&T, usability improvements saved the company $2.5 million in training expenses.1 Lower support costs Providing telephone support for computer software is estimated to cost companies between $12 and $250 per call, depending on the organization.5 Such support costs can add significantly to a system’s total cost of ownership and erode profits for both the developing company and purchaser alike. When a software product is understandable and easy to learn, users don’t need to call support as often. As a result, commercial software companies may need fewer people to work the support lines (and perhaps fewer DJs to entertain those on hold). At Microsoft several years ago, Word for Windows’s print-merge feature was generating a lot of lengthy support calls (45 minutes each, on average). As a result of usability testing and other techniques, the user interface for the feature was adjusted. In the next release, support calls dropped dramatically, and Microsoft’s savings were substantial.1 Reduced documentation costs Because usability-engineered systems tend to have predictable and consistent interfaces, they are relatively easy to document. As a former technical writer, I can attest that user manuals and online help for such systems are completed more quickly and are less susceptible to inaccuracies than those of difficult-to-document systems. Also, usability-engineered systems often require less documentation, and that documentation tends to cost less to produce than documentation for systems developed without usability engineering. For example, one company saved $40,000 in a single year when usability work eliminated the need to reprint and distribute a manual.1 Litigation deterrence Although software makers aren’t necessarily subject to the same sorts of litigation as, for example, a manufacturer of medical equipment might be, poor usability is a potential element in lawsuits and other litigation. For example, in July 2000, a US-based
Web consultancy was sued by a client company that accused it of creating a site-component interface that was “unusable.”6 Even if the lawsuit was spurious, as the Web consultancy contends, the situation points to a new liability for software development firms. Although usability engineering may not prevent such lawsuits, companies that can demonstrate that they applied usabilityengineering techniques during product development might be less vulnerable should such litigation occur. The US government’s recent case against Microsoft hinged on a usability question: Are users well-served when the browser and operating system are closely integrated? Although no usability experts were called in to testify, they are likely to be included in the future as usability awareness increases.
Increased e-commerce potential Though usability can benefit all development organizations, perhaps nowhere is the relationship between usability and profitability as direct as in e-commerce, as Forrester Research suggests: Usability goals are business goals. Web sites that are hard to use frustrate customers, forfeit revenue, and erode brands. Executives can apply a disciplined approach to improve all aspects of ease-of-use. Start with usability reviews to assess specific flaws and understand their causes. Then fix the right problems through action-driven design practices. Finally, maintain usability with changes in business processes.7
Usability-engineered sites let users be more efficient and productive. However, it's important that you interpret efficiency and productivity in relation to online shopping. Ideally, online shopping should be enjoyable, rather than frustrating: Users should not have to waste time searching for merchandise or figuring out how to buy it; nor should they have any doubt that their credit-card numbers and other personal information are secure. Buying a product or service online should be superior to making a purchase in a brick-and-mortar shop. More than 44 million people in the US have made online purchases; 37 million more say they expect to do so soon.8 However, many of these would-be online shoppers won't succeed in making a Web
purchase, because e-commerce sites are, for the most part, too difficult for the average user to navigate. Moreover, the experience of some online shoppers has been so bad that they don’t want to buy online again.3,7 Finally, online shoppers spend most of their time and money at sites with the best usability.9 Good navigation and site design make it easier for users to find what they’re looking for and to buy it once they’ve found it. Usability can significantly improve the ecommerce bottom line: According to Jakob Nielsen, usability efforts can increase sales by 100 percent.10 Competitive edge Users always cite ease-of-use as high on their list of software system demands.1 Thus, giving users usability is giving them what they want. Users appreciate software and Web sites that don’t waste their time or try their patience with complicated user interfaces. Building usability into your software tells users that you value their time and don’t take them for granted. Usability is important for all Web sites, but for e-commerce sites, having a competitive edge in usability is crucial. Such sites commonly drive away nearly half of repeat traffic by making it hard for visitors to find what they need.11 And, repeat customers are the most valuable: New users at one e-commerce site spent an average of $127 per purchase, while repeat users spent nearly twice that.12 Usable e-commerce sites also build goodwill. Users recognize the effort put into making their e-commerce experience easy and efficient by returning to usable sites. Moreover, one of the biggest obstacles to ecommerce is trust. Consumers must trust a site before they will disclose the personal and financial information typically required for online purchases. A study of e-commerce trust found that navigation and presentation—both usability concerns—were essential for creating trust.13 Advertising advantages High usability can garner attention for your company’s Web site and help distinguish it from other sites. Improved usability can also help differentiate commercial software applications. Compaq, Microsoft, and
Lotus have all made usability part of their advertising campaigns, for example.5 More recently, a Swedish consultancy announced a “usability guarantee” for sites it develops, requiring large-scale projects to undergo testing in a usability lab before they are released. If the project fails the test, the consultancy promises to “improve the solution without any additional costs to the client.”14 As another example, MacroMedia recently issued a press release describing its “usability initiative.” Though the initiative seems to consist of posting basic usability tips on its Web site, the fact that a large software company took such action suggests that companies might be realizing the advantage of being perceived as usability-aware. These examples notwithstanding, and despite its great potential, usability’s advertising value remains largely unexploited. This seems especially so in e-commerce, where users are increasingly nontechnical consumers who won’t suffer technical difficulties gladly. It may behoove usability proponents to try to increase advertising departments’ awareness of usability’s value. Better notices in the media People in the media have discovered the connections among usability, productivity, and cost-effectiveness, especially on the Internet. Companies are regularly taken to task about usability in business publications and on e-business sites. For example, CIO Business Web Magazine pointed out, “On a corporate intranet, poor usability means poor employee productivity; investments in making an intranet easier to use can pay off by a factor of 10 or more, especially at large companies.”15 The question arises: If those in the media see increased productivity and cost-effectiveness, can shareholders be far behind? In 1993, Nielsen studied the coverage of usability issues in trade press reviews of new software products and found that approximately 18 to 30 percent of the accounts were usability-related.16 A good review in an industry publication can be worth millions in advertising. Such reviews increasingly include usability as a criterion. One of Internet Week’s most popular columns features userinterface design and usability specialists discussing the relative usability of various e-commerce and e-business sites, for example.
Performing usability cost-benefit analyses in your company could be a first step toward introducing usability engineering techniques, and thus the first step toward realizing the benefits I outlined. Working to improve usability can bring significant economic benefits to companies that develop IT applications, e-commerce sites, and commercial software. Of course, users of these systems benefit as well.
References 1. R.G. Bias and D. J. Mayhew, eds., Cost-Justifying Usability, Harcourt Brace & Co., Boston, 1994. 2. A.M. Lund, “Another Approach to Justifying the Cost of Usability,” Interactions, vol. 4, no. 3, May/June 1997, pp. 48–56. 3. R.S. Pressman, Software Engineering: A Practitioner’s Approach, McGraw Hill, New York, 1992. 4. A. Cooper, The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity, SAMS, Indianapolis, 1999. 5. M.E. Wiklund, Usability In Practice: How Companies Develop User-Friendly Products, Academic Press, Boston, 1994. 6. B. Berkowitz and D. Levin, “IAM Sues Razorfish for Poor Design,” The Standard, 14 July 2000, www. thestandard.com/article/display/0%2C1151%2C16831
%2C00.html (current 3 Jan. 2001). 7. H. Manning, “Why Most Web Sites Fail,” Forrester Research, Sept. 1998; view an excerpt or purchase the report at www.forrester.com/ER/Research/Report/ Excerpt/0,1338,1285,FF.html (current 3 Jan. 2001). 8. S. Wildstrom, “A Computer User’s Manifesto,” Business Week, Sept. 28, 1998. 9. J. Nielsen, “The Web Usage Paradox: Why Do People Use Something This Bad?” Alertbox, 9 Aug. 1998; www.useit.com/alertbox/980809.html (current 3 Jan. 2001). 10. J. Nielsen, “Web Research: Believe the Data.” Alertbox, 11 July 1999; www.useit.com/alertbox/990711.html (current 3 Jan. 2001). 11. H. Manning, “The Right Way To Test Ease-Of-Use.” Forrester Research, Jan. 1999; view an excerpt or purchase the report at www.forrester.com/ER/Research/ Brief/Excerpt/0,1317,5299,FF.html (current 3 Jan. 2001). 12. J. Nielsen, “Loyalty on the Web,” Alertbox, 1 Aug. 1997, www.useit.com/alertbox/9708a.html (current 3 Jan. 2001). 13. Cheskin Research and Studio Archetype/Sapient, “E-commerce Trust Study,” 1999; www.studioarchetype.com/ cheskin/index.html (current 3 Jan. 2001). 14. “Icon Medialab Announces Usability Guarantee,” Icon Media Lab, 29 Nov. 2000; www.iconmedialab.se/ default/news/press_releases/right.asp?id=310 (current 3 Jan. 2001). 15. S. Kalin, “Mazed and Confused,” CIO Web Business Magazine, 1 Apr. 1999; www.cio.com/archive/ webbusiness/040199_use.html (current 3 Jan. 2001). 16. J. Nielsen, “Is Usability Engineering Really Worth It?” IEEE Software, vol. 10, no. 6, Nov. 1993, pp. 90–92.
About the Author George M. Donahue is a senior
experience modeler at Sapient. In addition to conducting scores of usability tests of e-commerce, financial, interactive television, and new media Web sites, he has designed user interfaces and written user interface design guides, user manuals, and online help guides. His current research interests include strategic usability and cross-cultural user interface design. Before joining Sapient, Donahue was a senior usability specialist at the Compuware Corporation. He holds degrees from the University of Delaware and Clemson University and is a member of the ACM Special Interest Group on Computer-Human Interaction. He is also a member of the Usability Professionals’ Association. Contact him at Sapient, 250 Williams St., Ste. 1400, Atlanta, GA 30303;
[email protected]; www. sapient.com.
BENCHMARKING SOFTWARE ORGANIZATIONS
CALL FOR ARTICLES
As the market for software and related services becomes increasingly competitive and global, organizations must benchmark themselves—their practices and performance—against other organizations. Yet benchmarking is not widely practiced in software engineering, so neither its potential nor problems are well understood. Moreover, some have questioned the quality of reported benchmarks. This special issue will discuss alternative benchmarking approaches, their strengths, and their weaknesses. Topics of interest include
• Case studies of partner-based benchmarking
• Cross-industry comparisons of organizational or project performance
• Descriptions, evaluations, critiques of different approaches
• Lessons learned, benefits and uses of benchmarking results
• Techniques for ensuring objective and accurate results
SEPT./OCT. 2001 Authors of articles describing results of quantitative benchmarking must be willing to make their data and methods available for review (but not necessarily published). In addition to regular papers, we will also consider short news articles describing unique benchmarking resources. We also invite software benchmarking product and service providers to participate in a survey, whose results will be published. Guest Editors: David N. Card
Software Productivity Consortium
[email protected] (contact for regular articles) Dave Zubrow
Software Engineering Institute
[email protected] (contact for news articles and survey participation) Submit articles by 28 February 2001 to IEEE Software 10662 Los Vaqueros Circle, P.O. Box 3014 Los Alamitos, CA 90720-1314
[email protected]
focus
usability engineering
Partnering Usability with Development:
How Three Organizations Succeeded Karla Radle, iXL Sarah Young, NCR
Improving product usability enhances an organization’s productivity, competitiveness, and profitability. However, integrating usability practice into an organization is challenging. These case studies examine how three organizations succeeded.
Most people using the computer as a tool are more interested in accomplishing their objective than getting to know their computer. The essence of usability is the degree to which people can accomplish their goals without the tool hindering them. The science of engineering usability into a product is human factors engineering.
As today’s business environment focuses increasingly on productivity, companies that make their products easy to use—and quick to benefit the user—gain a key advantage. By identifying the users and determining their needs and expectations, companies can incorporate usability factors early in the product’s life cycle. The result: a company can increase its competitiveness, productivity, customer and user satisfaction, and profitability while decreasing support costs. But, how do you begin to address usability, especially in an organization that might not have a basic understanding of this discipline or the human factors engineering resources required? There isn’t one best way to initiate usability, or HFE, into an organization because each situation is unique. There are, however, common obstacles and issues that challenge any HFE practice. The following case studies describe successful introductions of HFE into organizations. While the different organizations shared common problems, each creative solution
met the needs of specific circumstances. (The "Situations" sidebar describes the overall conditions facing each organization.)

Organization A
In this situation, the information products department manager believed the department's future was tied to usability. She believed that all IP departments should try to put themselves out of business, because many manuals are a byproduct of poor usability. The IP department was a service center to all development organizations. None of the employees had formal or industry training in usability.

How did it start?
The IP manager educated her team and earned their support. She brought in an external resource and provided user-centered design training to her staff. Next, she created a presentation explaining how usability relates to creating publications. She looked for opportunities to present the mission statement to
other managers. Her approach was not to lecture but rather ask: “What do you think? Is this the right direction for our company? How would it affect your department?” What were the obstacles? There was a general lack of awareness about HFE or how it could improve products. HFE expertise did not yet exist internally. Time was a driving factor because of extreme pressure to be first to market, and meeting delivery dates was the primary criterion for performance reviews and bonuses. Developers were concerned about adding time to the schedule for HFE activities. Their current source of feedback was from testing and occasional customer visits. No executive support existed for the development of HFE activities, so this had to be a grassroots effort. What strategy was used? Each January, the IP department scoped the projects and resource needs for the year. Our consulting team contacted development teams to obtain product roadmaps and release dates. We began assessing which products should receive HFE resources, focusing first on highrevenue products, activities within the scope of the current HFE skill set, product managers that were most open to HFE participation, and potential impact on products.
Situations

Organization A
Company size: One site within a global Fortune 500 company
Originating department: The information products department
Management support: Front-line management, but not executive, support
Existing HFE skills: No formal or industry training in usability
Types of products: Middleware, front- and back-office software, Intranet and e-commerce Web sites
Third-party products: Partners' products were occasionally incorporated into deliverables

Organization B
Company size: Global Fortune 500 company developing consumer products in a highly competitive market
Originating department: Newly created HFE department
Management support: Direct management; some executive support
Existing HFE skills: Formal and industry training in usability
Types of products: Software for recovery, CD computer hardware, desktops, laptops, and convergence systems, DVD/CD-ROM, corporate Web site, diagnostics, and setup
Third-party products: Included in the product offerings

Organization C
Company size: One division within a global Fortune 500 company
Originating department: An application software development group
Management support: Some management support
Existing HFE skills: A few seasoned human factors engineers
Types of products: Hardware, peripherals, and point-of-sale systems
Third-party products: Many software partners; a few hardware partners
become a part of each development team. We attended the process and development team meetings. We listened for developers’ pain, which helped us formulate a strategy to use when negotiating changes later. For example, one team was coding in a language for which few of them had expertise. Another team was meeting an extremely tight deadline driven by executive management. Still another team was trying to provide additional deliverables in order to obtain a bonus. Attending meetings helped us identify opportunities to add value with HFE, provided openings for impromptu explanations of HFE, and helped us understand the team’s dynamics. Aim for awareness. We did not slam developers’ heads on the table every time they refused to change the interface. In fact, we greatly soft-pedaled the results of our HFE activities—at first. We took a long-term approach of raising awareness and gaining buy-
in. Although it was painful at times, we did not insist on perfect or even tolerable usability of any one product. After the first few projects, momentum began to build and we gained greater acceptance of our suggestions. Start with parallel activities. For each product, we reviewed development and testing schedules, listened for key pain points, and considered our own HFE resources. We chose the most appropriate activities (such as usability testing, heuristic review, or needs analysis) and looked for a way to do them in parallel with the existing development schedules. Becoming the critical path in a schedule-driven environment would not help us gain acceptance. At first, this meant that many recommendations were held until a later release. Meanwhile, we were honing our skills and teams began to respect our methodology; within a short time, we began to have an impact on product releases. January/February 2001
Common Obstacles
Most organizations face similar obstacles when first trying to incorporate HFE activities and guidelines.

■ Awareness level. How many people on site have heard of "user-centered design" or "human factors engineering"? Is there a lack of awareness and knowledge about these activities and their benefits?
■ HFE resources. How much HFE expertise exists internally? How many people and what range of skills do they have? Are they formally trained or do they have work experience in HFE?
■ Performance measurements. What performance measures are in place for developers? Are they rewarded for meeting schedules, achieving quality, or both? These measures have a profound impact on the processes used and decisions made.
■ Feedback sources. What current sources of feedback exist for the development teams? Is feedback limited to comments from the testing department?
■ Management support and communication. How many managers support the HFE effort? Are they front-line managers or executives? How are HFE efforts communicated within the organization?
■ Market positioning. Which marketing pressures are influencing the products most? Is the product functionality greatly needed and so new that users will overlook usability issues at first? Are third-party products an integral part of the deliverable?
Educate the campus. We took every opportu-
nity to talk about what HFE is and why it is needed. We spent most of this time one-onone with project team leads before meetings. We asked to be on the agenda occasionally, to explain what we were doing and why. We invited developers to observe usability tests, which makes the feedback personal and concrete in a way that no presentation can replicate. Develop the HFE skill set. Our manager
brought HFE training in-house throughout the year. We attended conferences, read books, and learned by doing. For example, one of our usability evaluations lasted four hours per participant. We learned that even the most well-intentioned user would hit the wall after two hours. It was slow going at first, but as we built expertise in some core areas—such as usability testing and heuristic reviews—we routinely were able to provide those services while learning new techniques. Did you have a lab? We began to conduct short usability evaluations at our workstations, but quickly moved to a conference room. Later, we found space in an existing lab where we installed chairs and a table. Our equipment was an old PC and a box of paper prototyping supplies. The biggest advantage was that we could control the room schedule and conduct studies on short notice. 40
IEEE SOFTWARE
January/February 2001
When the building was remodeled, we found a one-way mirror and installed it between our lab and conference room. We wheeled the PC into the conference room and used the lab for recording and observation. Our equipment consisted of a video camera (borrowed from the graphics lab) and a $100 microphone threaded through the wall. What techniques were used? We started by conducting usability tests of our own publications, both printed and CD. This gave us a low-pressure opportunity to develop our skills and obtain useful feedback. We conducted short, 15-minute usability tests at our own workstations and gave chewing gum as a thank you gift! Over time, we developed a strong list of HFE services that we could offer (see the “Usability Techniques” sidebar) and found ways around common obstacles. For example, we asked the quality group to incorporate usability objectives into the formal quality requirements. If circumstances prevented us from contacting actual users, we would document assumptions about the users and their tasks. It became obvious to developers how many assumptions were being made and the need to at least start validating them. We began creating personas (see the “Usability Techniques” sidebar) for every project. We found a picture in a magazine that “looked like” our persona user, and printed color copies of this one-page description for the development team. Whenever design discussions digressed into personal opinions, we would simply ask “what would {persona’s name} want?” This was very effective at returning focus to pleasing the primary user group. What were the results? Developers were skeptical at first, but they quickly recognized the benefits of our services. We discovered some developers who had usability concerns but did not know how to support them within their own groups. There are several reasons why developers want HFE assistance, once they understand the HFE role: ■ ■ ■
HFE provides feedback unavailable through common software testing. The feedback is usually based on external users. HFE uncovers details that the require-
Usability Techniques
■
■
ments document leaves out. HFE does not compete with developers. Our job is to define the user needs and interaction requirements; the developers still handle the functional and architectural requirements. We make the developer’s decisions easier by providing adequate information. Their job is to write good code, and all too often they find themselves involved in speculative discussions instead. We provide the information that reduces those discussions and allows them to do their jobs more effectively.
Human factors engineers and usability specialists use many different techniques to analyze, evaluate, design, and test. The needs of the product and situation dictate the techniques chosen. This is not a comprehensive list but an overview of the breadth of techniques and examples of deliverables. ■
■
■
■
After a few months, we began to see a shift in the developers’ mindset, and within a year we started experiencing the following: ■ ■ ■
■
■
Developers sought us out to evaluate their interface designs. Teams included our activities in their development schedules. Developers asked us to log our findings as regular bug reports and created special usability categories. Teams incorporated recommendations from previous usability evaluations into the requirements document. Requests for HFE assistance began to outnumber resources. We started rotating our assistance among the key revenue-generating products.
■
■
■
■
■
Organization B In the competitive computer industry, this global, Fortune 500 company decided that the usability of their products would provide a competitive advantage. The company hired a director to create the Customer Experience Department. Its mandate was to ensure that the consumer market would find the company’s products usable. This department was responsible for the usability of the product’s hardware, software, and documentation. How did it start? The director of customer experience developed a department of human factors engineers, usability specialists, and experimental psychologists. Some of the professionals had been successful in their field for several years, while others were just beginning their careers in the usability field. The vice president of product devel-
■ ■ ■
User needs analysis. Accumulate raw data on users and their experiences, mental models, backgrounds, skill levels, work activities, and expectations for the product. (This data serves as a basis for many of the other techniques mentioned below.) Competitive evaluation. Evaluate and compare usability of competitive products. This information helps identify usability gaps, refine usability objectives, prevent inclusion of inadequate third-party products, and identify alternatives users might seek if product usability is poor. Focus groups. Conduct moderated group discussions with users to determine features, functionality, and user experiences with the usability of competitive products. Usability objectives. Create measurable objectives for product usability. This requires an in-depth understanding of the business objectives and user objectives for the product. User profiles. Define user groups and identify demographics and skill levels that will affect how users interact with the product. Profiles are essential for making design decisions. User personas. Create a short description of a fictitious representative of a user group, including a name and a photograph. Serves as a concrete reference point when making design decisions, and provides a mechanism to bypass opinionated discussions. User task analysis. Identify a list of prioritized user tasks. This should be developed from in-depth interviews and observations with actual users. At a minimum, it can be helpful to document what assumptions the development staff is making regarding user tasks. A clear understanding of user tasks and priorities is essential for making design decisions. Interface prototyping. Create interface prototypes with storyboards and flowcharts. These serve as first drafts for interface design discussions, rapid prototyping, and usability testing. Rapid paper prototyping. Conduct rapid prototyping sessions with users. Working with paper designs allows rapid prototype evolution without an investment in coding. Heuristic reviews. Review products against design standards. This gives insight into interface design flaws and how to correct them. Task scenarios. Create user scenarios to test product features. Scenarios reflect common user tasks for the product evaluated. Usability testing. Conduct usability tests with users via paper or online, interactive prototypes. Gives specific feedback on design flaws or validates design approaches. This can be used for benchmarking purposes against usability objectives.
opment requested that these usability professionals be integrated into the product development teams. Management clearly communicated that usability would be the key differentiator between the products of this company and the competition’s products. However, management gave no formal presentation to explain the role of the January/February 2001
IEEE SOFTWARE
41
Members of the Customer Experience Department received blank stares at the product meetings because the teams had no knowledge of who we were or what we did.
42
IEEE SOFTWARE
Customer Experience Department. Although there was executive- and directorlevel support, the product teams were uninformed about the existence of our human factors group. As a result, members of the Customer Experience Department received blank stares at the product meetings because the teams had no knowledge of who we were or what we did. What were the obstacles? Although there was support at an executive level, there still were several obstacles to overcome in the integration of usability. The most problematic issue to overcome was the lack of awareness by the company as a whole. There was no formal process for integration into the development teams. Management gave no formal presentations or other communication that HFE professionals were to become part of the team. Most teams felt the integration would increase the length of the development cycle. This company also felt the “first-to-market” time pressure. There was no time to do usability studies of any kind. The only feedback came from test engineering and as long as the product functioned, the attitude was “ship it.” One last obstacle to overcome was developing the “fresh-out-of-school” professionals. There were no formal structures, guidelines, or mentor programs to guide the new professionals. They were on their own to fight the political battles in this corporate environment. Many opportunities were lost because of inexperience, lack of support and guidance, and the need to get the product out the door on schedule. In addition to these internal obstacles, there were external obstacles to face when working with third-party vendors. HFE professionals commonly worked with third-party vendors to complete a software project because this particular organization lacked the resources to design and develop all the software for their products. The obstacles experienced in the external projects were unique in some ways; for example, the development team often was not on-site. It took several email messages, faxes, or conference calls to agree on design changes, which increased the time it took for feedback. The developers would make the changes and send the revised product, and then the HFE professional would review it and send feedback. Had development been on-site, the HFE professional
January/February 2001
could have sat down with the developer to make and approve changes in real time. The last obstacle the HFE professionals faced was dealing with software developed by certain third-party vendors where the company had no control or interaction during the design phase. Products shipped with software over which this organization had no control. This software often had major usability problems, which reflected poorly on the organization’s products. Unfortunately, we had little ability to recommend or control any changes to these particular third-party vendor products. What strategies were used? The HFE professionals were assigned to specific projects or product lines to address any software need. From there, strategies varied depending on the type of project and the amount of awareness of human factors each individual team had. Integration into the team. We were proactive in
contacting the project leads, obtaining product roadmaps, meeting schedules, and attending design reviews. This integration allowed the HFE professionals to determine the skills needed in the design of the product. It also let us demonstrate how HFE could make the product more usable through discussions, examples, prototypes, and usability tests. Integration into the development process. Based
on the meetings and design reviews, the HFE professional identified what aspects of the product could be improved and worked with the development team to integrate these changes. We delayed implementation for some areas until future releases, but placed them on the requirements list immediately. Awareness. The best way we made the com-
pany and a product team aware of our services was to get involved in every aspect of the product. This included product development, marketing, documentation, and technical support meetings. We told all individuals about our existence and the value we could add to the team. We usually achieved this awareness by introducing who we are, what we do, why we do it, and what value we add for the user. This was informal. Beyond the initial introduction, we increased awareness by speaking out about issues or
designs that we could easily enhance. This might be as simple as the placement of icons or menu items to more complex issues, such as actual navigation of the system. Human factors skill development. Management wanted to keep the HFE group abreast of all the latest technologies, software packages, coding languages, or whatever else required to do their job. With management’s support and an annual budget, the HFE professionals took classes, went to training sessions, and attended conferences. Each individual determined what skills or educations he or she needed to succeed.
What techniques were used? We started usability testing by identifying the user population. This user profile defined demographic information such as age, sex, education, computer experience, profession, skill level, previous experience, and training. This information helped us determine and complete usability objectives. The production of benchmark data on how long it should take a user to complete a particular task helped in the definition. We captured this information during several trials with many users, through competitive analyses, and definition of the user’s perceptions of successful completion of a task. Another objective was whether a particular task required instructions for successful completion. Once we defined the study’s users and objectives, we defined the tasks the users would perform. With this list of tasks, we created user scenarios for use in the usability study. HFE also was involved in the identification of features. Based on focus sessions conducted in conjunction with the marketing group, HFE created a features list for the product. This list influenced design decisions and provided feedback about the usability of current and competitive product features. We regularly used heuristic evaluations to measure existing software products against design standards or usability objectives. These evaluations were helpful in competitive analyses and investigation of third-party products as well. They helped determine what features to include and what user interface changes to make to produce a more usable and competitive product. The thirdparty evaluations assisted in selecting the products we would sell to our customers and
determining whether they met the usability objectives we defined. If not, we would not sell the third-party product with our products. The heuristic evaluations also identified whether the usability problem related solely to the software from the third-party vendor or resulted in combination with the hardware and software solution. Competitive evaluations helped us determine how usable our product was compared to our direct competitors. We identified task scenarios, and each user tested all the products. The result was a comparison of completion time, accuracy, and if and how the task was completed across all products tested. This again was useful in determining product features and usability changes to the interface.
Figure 1. Organization B’s usability lab: from within the executive viewing room, the testing room appears in the background (yellow lighting).
Did you have a lab? We had a lab built to our specifications, including a testing room, control room, and executive viewing room (see Figure 1). The testing room had a one-way mirror, two moveable cameras mounted on large tripods, a couch, coffee table, large-screen TV, and two PCs on adjustable tables. The controls consisted of high-end video and mixing equipment, microphones, a scan converter, speakers, and several TV monitors to view the different camera angles. A wall and a glass window directly behind the control room separated the executive viewing room. This room included speakers, a monitor, and high, barlike chairs with a writing surface. The visitors in the executive viewing room could hear and see into the testing room through the clear glass and one-way mirror in the control room, or by looking at the monitor. What were the results? Over time, the product teams saw the January/February 2001
IEEE SOFTWARE
43
value that HFE could provide, especially those teams that were heavily concentrated on hardware. The HFE could take the lead and run with the development of the user interface that would operate that hardware piece. This was particularly true of the development of the DVD user interface. As the product team concentrated on the hardware, the HFE worked closely with the vendor to develop the user interface. The HFE provided an update at each product meeting and showed prototypes of the user interface as well as the functionality it supported. The HFE became the expert on DVD and consultant to the product team when questions arose. In addition to the product team integration, HFE began to play a key role in competitive studies. The HFE group proposed the value of competitive analysis and how it could improve the products delivered. The studies identified what users found easy and difficult to use, desired features, where we fell short, and where the competition excelled. As time went on, the HFE group gained many more successes. The product teams began integrating HFE into the product life cycle, asking questions when they felt HFE expertise was necessary for product development, and even requesting usability studies be performed on their product and the competition. This led to the development of a competitive “petting zoo” comprised of the competition’s hardware and software solutions for review. Organization C Human factors engineers and psychologists worked at the corporate office in primarily a research capacity. During a 10-year period, the company decentralized this group to create pockets of expertise without any central directive or support for HFE activities. Three of these specialists wound up in a software development group together. They conducted software user interface design and testing for products but did not hold the title “human factors engineer.” How did it start? An experienced HFE manager relocated to the division for the purpose of starting an HFE department. He brought key customer contacts where he had been engaged in external HFE consulting and a number of marketing techniques aimed at understand44
IEEE SOFTWARE
January/February 2001
Each of these three organizations took a different approach, and yet they were all successful in initiating HFE to address product usability.
ing business cultures in decision making. He pulled the existing HFE experts into the new department. What were the obstacles? There was some management support and a limited departmental budget. Insufficient funding existed to support HFE efforts for all product lines. There was a small pool of very experienced and skilled HFEs. Their job descriptions, however, did not accurately reflect their interface design activities. The group needed a lot of motivation to reenergize into a dynamic new department. There was a wide gap between the experienced engineers and new graduates. What strategy was used? The group worked with marketing to conduct HFE activities with external customers. This provided financial backing for advanced development activities with internal development groups. We developed marketing collateral for external engagements, such as success stories, formal brochures of services, and templates for contracts and statements of work. We wrote white papers given to—and often requested by—customers. Our access to and successes with customers gave us credibility with development teams and product marketing. Interpersonal skills were critical to creating relationships with the development staff. We created a web site to explain our purpose, individual backgrounds, and how to request assistance. We focused on developing tools to test with customers instead of testing in a laboratory. We reclassified jobs to reflect HFE expertise and added performance measures to reflect continued external customer contact while maintaining internal HFE activity levels. The HFE department had a substantial impact on a few product lines through its advanced development work. We leveraged the results of ergonomic and usability studies to assist with some new product designs. These designs eventually became very successful product lines for the company, resulting in greater credibility for the HFE department. Did you have a lab? We typically worked at the customer site, so we studied real environments in lieu of using a lab. We had access to many university labs, and after a few years we gained ac-
cess to a state-of-the-art usability lab that a nearby sister organization had recently installed. Our focus from the beginning, however, was to develop tools for use at the customer site instead of in the lab. What techniques were used? To make the HFE department successful, we developed new tools that showed value to the customer. One such tool was a time-andmotion study to analyze client performance. In addition, the opportunity to work with and receive support from advanced development research helped us develop prototypes. Prototypes defining a product and addressing a particular need continued to drive next-generation product designs and support for HFE. The HFE group also generated support by producing white papers and journal and conference publications based on the research the group conducted. This visibility within the company and among other HFE professionals and other companies in the industry helped spread the word about the HFE work’s value. HFE expertise became an asset in sealing the deal for many sales individuals because it provided proven results and was a key differentiator from the competition. What were the results? Internal development teams began to set aside annual funding for HFE services. Each year, they planned the general services that should apply and then the HFE specialist refined those plans as work proceeded. We had a growing base of external customers and large corporations were soliciting our services. The HFE department had a business manager and high visibility with upper management because of our customer involvement. The department was growing rapidly and published papers each year, both internally and externally. Lessons learned As we reviewed the different approaches taken by these three organizations, we learned some lessons about partnering HFE with development. These lessons will serve as valuable reminders to any organization that is just beginning to address usability: ■
Excellent interpersonal skills are crucial to developing relationships with development teams.
About the Authors
■
Karla Radle is a human factors engi-
neer at iXL. Karla has over six years of experience designing and conducting usability studies on hardware, software, and documentation. She has worked in computer, Internet, and retail industries. She earned a BA in psychology and sociology and an MS in industrial engineering from the University of Wisconsin, Madison. Contact her at iXL, 1600 Peachtree St., NW, Atlanta, GA 30309;
[email protected] or
[email protected]. Sarah Young is a human factors
engineer in NCR’s Retail Solution Division. She conducts usercentered development for kiosks, Web applications, and point-of-sale systems. She earned a BS in statistics from the University of South Carolina. Contact her at NCR Corporation, 2651 Satellite Blvd., Duluth, GA 30096;
[email protected].
■
■
■
■
■
Applying the results of HFE activities to thought leadership in product development makes the company more successful and raises HFE’s credibility. Working directly with the customer creates high visibility with management, marketing, and product teams. Even when schedule pressures are intense, HFE is possible. At a minimum, HFE activities will raise awareness and understanding, and will set the stage for the future. Expensive labs and equipment are not necessarily a prerequisite for HFE involvement; this depends on the product and what is measured. Most resistance to HFE comes from other pressures (such as schedule) and a lack of information. There is no substitute for observing user interactions first hand.
E
ach of these three organizations took a different approach, and yet they were all successful in initiating HFE to address product usability. Even if your organization faces common obstacles, such as a lack of management support, HFE skills, or customer focus, you can still begin to address the usability issues of your product. Consider the situation, assess the temperament of the group, and choose a course of action that is appropriate to the existing conditions. Of course, each success story raises the global awareness of the value of HFE. However, there is more to do. As businesses address procedures and standardization, such as ISO, the Capability Maturity Model, and Six Sigma, the challenge is to clearly define the most effective role for HFE within each process. And as this engineering practice of addressing the human need matures, we must strive to constantly improve our methods and research. We will find better ways to integrate the different aspects of user needs, such as physical and cognitive, and to deal with the constantly changing dynamics of the human population. References 1. R.G. Bias and D.J. Mayhew, Cost-Justifying Usability, Academic Press, San Diego, 1994. 2. J. Nielsen, Usability Engineering, Morgan Kaufmann, San Francisco, 1994. 3. D.A. Norman, The Design of Everyday Things, Currency Doubleday, New York, 1998.
January/February 2001
IEEE SOFTWARE
45
focus
usability engineering
Integrating Usability Techniques into Software Development Jean Anderson, Francie Fleek, Kathi Garrity, and Fred Drake, Shared Medical Systems
ow merged into Siemens Medical Solutions Health Services, our company, formerly called Shared Medical Systems, creates clinical, financial, and administrative software for the healthcare industry. Like other medium to large companies, SMS had reached a scale and maturity level that required the development process to be documented, predictable, repeatable, measurable, and usable by the development groups.1
N Focusing on the user early in the development process goes a long way toward improving product quality and eliminating rework. In this article, the authors show how their company is working toward this goal. 46
IEEE SOFTWARE
After much study and consideration, senior management committed to implementing a universal OO development methodology. Senior management recognized the need to improve customer satisfaction, which had always been high but needed to be better in an increasingly competitive market. Management saw the introduction of usability practices as a prime means to achieve this objective. So they began to place greater emphasis on usability—even to the point of building and staffing dual state-of-the-art usability labs. Our goal throughout the projects we describe here was to combine the best OO analysis and design practices and usability techniques to create a powerful, unified way to develop software. We wanted usercentered design and evaluation to be a core component of the development process instead of an afterthought. Given the diversity and number of entrenched methods used
January/February 2001
within the company, implementing a universal methodology presented quite a challenge. The company chartered a process coordination group to create the best possible process and to act as a change agent by educating and consulting with the various development groups. The process coordination group included the software engineering team responsible for the OO development process and tools. It also included the usability team responsible for the usability labs and evaluation processes as well as the company’s user interface standards. Although the usability team made significant progress in introducing their evaluation process, they were frustrated to see it continually left until the end of the development process—at which point most changes are too costly to implement. The team wanted to bring usability into the earliest possible phases where it could have the most impact by improving initial design and eliminating rework. 0740-7459/01/$10.00 © 2001 IEEE
The team members realized they had to integrate usability testing fully into the software development process rather than continue to support it as a complementary process. A small group of representatives from the software engineering and usability teams took on the task of integrating user-centered design and evaluation into the company’s new OO development process. This task proved more challenging than we expected. Identifying the challenges The usability team promoted a usercentered design process that combined the contextual-inquiry design techniques developed or made popular by Hugh Beyer and Karen Holtzblatt2 and the usage-centered design processes derived from the work of Larry Constantine and Lucy Lockwood.3 An overall design framework created by Charles B. Kreitzberg and Whitney Quesenbery guided the usability process through the software development phases.4 The software engineering team chose the Rational Unified Process as the knowledge base to represent the company’s development processes. RUP is a comprehensive tool that provides information, guidance, templates, and examples for software engineering development activities. Unfortunately, although it acknowledges usability as a component of good software development, it inadequately supports such activities as the collection of usability requirements or usability evaluations. So, one of our first challenges was to represent the usability activities within RUP. RUP’s process guide is customizable. However, because it employs extensive cross-referencing and provides for frequent updates, we had to execute modifications carefully to avoid difficult maintenance later on. For the two teams involved, understanding each other’s processes, vocabulary, tools, and perspective was crucial. We spent considerable time trying to bridge the gaps, with each team participating in education classes sponsored by the other team. We couldn’t make real progress until we started to focus on the intent of each activity in the two processes. This let us move beyond both terminology and sequence differences. By happy coincidence, the focus on intent is also a key usability technique for identifying usability requirements. To further complicate the situation, not all the development groups began using RUP
immediately. In particular, some strategic projects used Princeton Softech Enterprise as the case tool for entering use cases and other OO artifacts. The developers used a variety of other tools, such as Visio and Caliber, for capturing the models, requirements, and diagrams created during the process. Aligning these tools from a technological perspective and reconciling the differences in terminology presented further challenges. Both teams needed to be able to address developers, regardless of their background and tool experience. The usability team particularly needed to be able to explain how to integrate user-centered design into any of the development processes. Dealing with organizational and cultural changes within the company proved to be a challenge equal to—if not greater than—the technical ones. Our older systems encouraged analysts to be more system-focused than user-focused. Business process modeling, or understanding our customers’ needs, although always a priority, really hadn’t been well structured. So, design decisions tended to put the burden on the user to learn a complex system rather than on the development team to produce an easy-to-use system. Initially, these development groups became concerned that user-centered design in the early stages of development would lengthen the development process. Because user-centered design was so new to most people at the company, many had difficulty trusting this approach’s ability to reduce the overall development time by eliminating rework and improving design specifications. Clearly, user-centered design was not a natural way of thinking among many of the numerous development groups that shared responsibility for software products. Until the development groups became accustomed to and proficient with these processes, they resisted implementing them. We determined that education and management support for the transition was vital.
Dealing with organizational and cultural changes within the company proved to be a challenge equal to—if not greater than— the technical ones.
Agreeing on process Figures 1 and 2 illustrate the various activities in the software development process and the software development phases using the RUP nomenclature. The phases and steps don’t completely synchronize but are a general guideline. These figures also illustrate several key elements of the process: January/February 2001
IEEE SOFTWARE
47
Figure 1. A high-level view of requirements definition during software development.
Inception phase Create concept • Create concept statement • Define goals • Set objectives • Start business terms glossary Gather requirements • Plan and perform site visits • Define user profiles • Define user models (workflow, artifacts, cultural, and physical)
Elaboration phase
Analyze requirements • Create affinity diagram • Consolidate workflow, artifacts, cultural, and physical models
Design product vision • Create vision picture • Build vision storyboards or business use cases • Create user environment design
■
■
■
The usability activities and their deliverables should emerge during the requirements gathering, analysis, and visioning steps. The better you execute these techniques, the better the product will be. Avoid the tendency to jump too early to coding. You gather requirements, analyze them, and design the product vision iteratively during the inception and elaboration phases. The first iteration, during the inception phase, is a high-level requirements analysis for the project business plan. Later iterations, during the elaboration phase, are performed in a more detailed manner to produce the product design. Throughout all iterations of requirements definition, potential users provide insight. The requirements and visioning activities feed both the system and user interface design.
Agreeing on a requirements process All software development processes emphasize the importance of gathering requirements. However, many processes do not describe this step in detail. Ad hoc methods of defining requirements abound. Development teams commonly spend considerable time discussing and analyzing requirements. They also commonly argue about requirements late in the development process. We recognized that the requirements process was the stage where both the OO development process and user-centered design process could have a major impact on 48
IEEE SOFTWARE
January/February 2001
improving the development process. So, we made requirements a priority and researched and analyzed requirements-gathering methods to select a best-practice direction.5–10 The user role models—or profiles—described by Constantine and Lockwood are of great importance for understanding and interpreting the project requirements. Contextual inquiry provides an effective, detailed, and structured requirements-gathering technique. Defining the user and learning about the work process at the place where the work occurs, and then analyzing the findings in a structured way, can considerably reduce the time a company normally takes to define requirements. We have divided requirements definition into four steps (see Figure 1): 1. Creating concepts. This step is the initial business-modeling process. High-level requirements gathering becomes necessary at a project’s inception to define the business plan properly, because the business plan is the main deliverable for this first step. To the business plan, we add a concept statement for the product that includes an outline of the intended end user, what the product should do, and the initial usability goals. 2. Gathering requirements. This step is best done through site visits to see the actual work taking place. After each visit, we capture the findings and collect them in detailed user profiles and a variety of models illustrating the observed workflow sequence, the communication pat-
Elaboration phase System path
Design product vision • Build vision storyboards and/or business use cases • Create user environment design
System modeling • Build system use cases • Map to domain model/business objects
Working together...
Analyze and design • Build class diagrams • Build collaboration diagrams • Build sequence diagrams • Build state diagrams • Create software design model
Implement design • Refine software design models • Create implementation model • Create component diagram • Build source code • Test
Evolution phase
Design UI • Build prototype storyboards or use system use cases • Create paper prototype • Evaluate the design/ refine/evaluate/refine • Create design spec
Implement UI • Create UI • Conduct evaluation
Construction phase
Transition phase
User interface path
Figure 2. A high-level view of design and execution during software development.
During build • Participate in all design reviews • Implement changes • Perform final UI evaluation
Transition project • Produce software • Package • Distribute • Install/implement at sites
Support product environment • Support, train • Improve processes • Perform studies at production sites
terns, the artifacts used (such as documents and equipment), the cultural climate, and the physical environment. Beyer and Holtzblatt’s contextualinquiry technique uses five models. We condensed these models down to a more flexible set that we can use as needed, depending on the particular product’s scope and activity. 3. Analyzing requirements. After an appropriate number of site visits, the requirements team interprets the data and compiles a user profile that represents the common traits of all the users observed and interviewed. The team also consolidates the workflow, artifact, cultural, and physical models. In addition, the team defines detailed usability objectives for the project and creates an affinity diagram of all the issues still needing investigation. 4. Designing the product vision. In this step, the team develops a model envisioning the final product’s strategy for meeting the requirements.
Beyer and Holtzblatt call the product vision the user environment design model, Cognetics calls it a roadmap, and Constantine and Lockwood call it the contextual model. The product vision provides the structural blueprint for the product and how the end user will interact with and navigate through the system. This model is a key component of the user-centered design process. Developing essential-use cases with extended-use cases to illustrate requirements is part of this vision model; this enables evaluation of all stages of the design. By agreeing that development teams could use a combination of context diagramming, use cases, and storyboarding to create this vision model, we were able to integrate this key component into the software development process. Agreeing on design and execution processes With a good product vision well documented—both visually and with use cases—the rest of the software development process falls into place. Four steps remain (see Figure 2): January/February 2001
IEEE SOFTWARE
49
All software development process activities are potentially iterative. To mitigate project risk, iteration is critical to the project’s success.
1. Designing the user interface. Building UI prototypes lets the team test its design with potential end users. Iteratively testing the design, refining it, and retesting it until the team is certain the design works ensures the product’s future success.11 You accomplish the testing in this step with a series of prototypes, starting with rough, hand-drawn paper sketches and ending with detailed mock-ups simulating the functionality. 2. Modeling the system—analysis and design. You map use cases with system responses to domain models; the business objects lead to OO models, including class, collaboration, sequence, and state diagrams.12,13 3. Implementing UI design. Now, and only now, is the product coded. All the requirements analysis, as well as the previous iterative testing and refinement of the design, ensures that the design specifications have greater detail and that this stage involves less rework. At this point we hope to see the development teams recognize and appreciate the process’s cost-effectiveness, because they find that coding is more efficient than it has traditionally been. 4. Transitioning the project and supporting the product environment. These steps cover the product’s rollout and production. During production is the time to perform usability studies at user sites to both complete the cycle and begin the next cycle. All software development process activities are potentially iterative. To mitigate project risk, iteration is critical to the project’s success. Some iteration must occur even within a cycle. Revisiting and reconciling the high-level models and design concepts created during a project’s initial steps is vital as a project moves into the construction phase. This activity keeps the project focused on its vision, within the scope of its requirements, and on track with its budget. Flexibility in project development is also essential. The software development process we developed might seem a generic solution. But good judgment in adapting the process and choosing activities appropriate for any specific project and its time frames is always necessary. Usability roles While clarifying the process changes for our company’s developers, we identified
50
IEEE SOFTWARE
January/February 2001
some new roles and modified some existing ones on our development teams. A role doesn’t necessarily equate to a person. More than one person can perform a single role; a single person can perform more than one role. We based these roles loosely on those defined by Deborah Mayhew.14 The usability engineer has primary responsibility for gathering and analyzing user requirements and for expressing those requirements in the product vision and use cases. These responsibilities extend beyond those of the typical business analyst. The business analyst gathers requirements to determine what the product should do. The usability engineer studies the potential end users to determine how the product should do those things. This means collecting details about end users, such as their typical level of education and level of experience with computers, as well as details about their work, such as the average turnover rate and their criteria for performance evaluations. This role’s goal is to capture the end users’ mental model of the work. The user interface designer sculpts the end users’ interactions with the product, developing the early prototypes of the product design. The UI designer also evaluates the usability of the design prototypes. In addition, this role creates the design specifications. This role’s goal is to express the end users’ mental model as closely as possible within the constraints of the information and technical architecture. The usability evaluator has primary responsibility for testing the product design, analyzing and documenting the results, and presenting the results to the development team. The bulk of this testing takes place during the initial stages of design before you code the product. The user has no primary responsibilities, but is key to the product’s success. Users help define the requirements and design the user interface. They also evaluate the UI during usability testing. Workflow models Along the way, we attempted to simplify the process. While consulting on a development project with extremely tight deadlines, one of the authors, a UI designer, found the number of models necessary for reporting field observations too time consuming. Because she still needed to document and share the knowledge gained from her field
Figure 3. A workflow model for interpreting a field study.
Task 1 Trigger: Patient is present to receive care
Interviews patients and measures vital signs 1 Documents vitals and chief complaint
Intent: Obtain Med information assistant to aid physician in diagnosis.
2
Patient Places patient chart and facesheet on door
3
4
Patient chart Facesheet
Informs physician patient is ready Physician
Task 2 Intent: Provide diagnosis and treatment of problem to Physician patient.
Patient Chart
Picks up and reviews patient chart 1 Interviews and examines patient 2 Diagnoses patient; explains diagnosis 3 Patient
Task 3 Intent: Obtain patient instructions; document visit; fill out charge info. Physician
1
2
3
Goes to preceptor room
Preceptor room
Issue: Distance between rooms Patient instructions
Obtains instruction sheet
Completes physician portion of facesheet
Facesheet
Task 4 Intent: Provide billing information.
1
Physician
2
Returns to give instructions and facesheet to patient
Takes patient chart to office
Patient
Office Patient chart
Facesheet Front desk clerk
Task 5 Intent: Record the visit.
1
Completes documentation before next visit or at end of day
Physician
studies with Unified Modeling Languagetrained analysts on her team, she created a workflow model that condensed the Beyer and Holtzblatt sequence and communication models. She also designed it to resemble UML so that the analysts would understand it more easily. It was a success. We incorporated the new workflow model into our interpretation session models, which
Patient chart Issue: Paper charting is difficult to share information and organize it
drive the analysis following site visits. Within one model, we show both the communication relationships (or transfer of data) and the activity’s sequence, which helped us reduce the number of models that we created to represent the users’ work. Figure 3 shows a simple workflow model from a site visit. In Figure 3, stick figures represent the user being studied (in boldface) and other January/February 2001
IEEE SOFTWARE
51
Task 1 Trigger: Today's OR schedule
Patient registration log
Who needs a bed today? 1
Intent: Bed Have room coordinator assignment and artifact ready when patient arrives.
2
I'm looking for a bed for these patients.
May I give them these beds?
Here's the bed info. Use these beds. Unit clerk Writes bed # on paper log 3
Revises patient's registration info 4
Task 2 Trigger: Patient presents self at window Intent: Admitting Complete all clerk necessary patient information.
Task 3 Trigger: Admitting clerk Intent: Record start time of room Bed occupancy. coordinator
Figure 4. A consolidated workflow model compiled from a series of field studies.
52
IEEE SOFTWARE
Nurse manager Patient registration log
Patient's record
• Encounter facesheet Prints • ID bracelet • Patient ID card
sign models—artifact, cultural, and physical—are consolidated as well, and we build an affinity diagram of the issues and insights from the field studies. Figure 4 shows a consolidated workflow model from another set of field studies pertaining to patient bed location.
Reaching agreement early Reaching a high-level agreement on the essential goes to room processes was the first step toRoom/bed Greets patient 1 is taken is taken ward integrating usability into back to OR to room Completes registration info, our company’s process. Once gets signatures we achieved this agreement, 2 Recovery Operating room Patient working through the process Directs patient to room 3 details became easier. We is taken to recovery room chose to use key strategic projNotifies that patient arrived ects to incorporate the main 4 Bed coordinator techniques before rolling the process out to more established development teams. Patient We have kept in mind registration that rolling out a companylog Records room/bed assignment, arrival time wide process needs to be done over time. Upper management must stand behind Patient's the process, and the develrecord opment teams must also buy in. Maintaining the big picture is critical to a successful implementation. human actors involved in the task. CommuWe found that teams need to focus on nication arrows extend from the principal several steps early in the process: actor to the other people (and artifacts) with whom that person interacts. We label each ■ Refine the product’s user profiles to enarrow with an action or message. The activsure that the entire team has a thorough ity is broken down into tasks and task steps understanding of the users. that we number to show the activity se- ■ Prioritize site visits to gather usability quence. Each task’s principal actor is on the and functional requirements. left, letting the model show where the user ■ Have key usability engineers provide is a recipient of an interaction. At least one thorough and structured interpretation intent must be identified for each task. of the data collected from site visits. When we move to the next stage of analy- ■ Build the software blueprints—the sis, in which we consolidate the lessons of each user environment designs—and provide field study (in preparation for designing the roadmaps to guide the rest of the process. product vision), we study the intents from all the workflow models. Then we create a conThe user interface designers must be versed solidated workflow model to depict the task in usability principles and employ these princistructure and the strategies common to our ples as they work. Also, usability testing must various customers. The other contextual de- take place during the early stages of design.
January/February 2001
W
e embrace the observations of Alistair Cockburn,15 who stresses the people side of software development. He points out that the human factors have dominance over any other factor and that the development process must consider the human factors within the development team, as well as those of the end users. No matter what our company’s ultimate software development process turns out to be, it must address Cockburn’s very real 14 tenets, including these four: ■ ■ ■ ■
People act according to their reward. Recognize the presence of dominant personalities. People work in certain ways better than others. The communications load can soon dominate the project.
Until a company understands, accepts, and finds ways to address Cockburn’s tenets, every software development process will face challenges that could be more easily solved by applying some of the principles we’ve discussed here. We’re not suggesting that usability techniques are a panacea to every software development ill, but we are making an appeal to implement some of the principles we’ve discussed. Focusing on the user early in the process—indeed, throughout every development stage—goes a long way toward achieving two of the holy grails of development: improved product quality and elimination of rework. No product will ever be “perfect.” And eliminating rework completely will of course never be possible, because—after all—software development is almost endlessly iterative, with shifting user needs and the resulting requisite upgrades. But we can still measure several benefits in very tangible terms: user satisfaction. Injecting the user’s voice early in the process is our main objective.
4. “How to Stop Computer Waste: An Interview with Dr. Charles B. Kreitzberg, President, Cognetics Corp.,” Leaders Magazine, vol. 20, no. 4, 1998. 5. D.C. Gause and G.M. Weinberg, Exploring Requirements: Quality before Design, Dorset House Publishing, New York, 1989. 6. J.T. Hackos and J.C. Redish, User and Task Analysis for Interface Design, John Wiley & Sons, New York, 1998. 7. IEEE Std. 830-1993, Recommended Practice for Software Requirements Specifications, IEEE, Piscataway, N.J., 1993. 8. L.C. Kubeck, Techniques for Business Process Redesign: Tying It All Together, John Wiley & Sons, New York, 1995. 9. D. Leffingwell and D. Widrig, Managing Software Requirements: A Unified Approach, Addison-Wesley, Reading, Mass., 2000. 10. D.A. Norman, The Invisible Computer: Why Good Products Fail, the Personal Computer Is So Complex, and Information Appliances Are the Solution, MIT Press, Cambridge, Mass., 1998. 11. J. Rubin, Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests, John Wiley & Sons, New York, 1994. 12. S.H. Spewak, Enterprise Architecture Planning: Developing a Blueprint for Data, Applications and Technology, QED Publishing Group, Boston, Mass., 1993. 13. E. Yourdon et al., Mainstream Objects: An Analysis and Design Approach for Business, Prentice-Hall, Upper Saddle River, N.J., 1995. 14. D.J. Mayhew, The Usability Engineering Lifecycle, Morgan Kaufmann, San Francisco, 1999. 15. A. Cockburn, “Growth of Human Factors in Application Development,” http://members.aol.com/ acockburn/papers/adchange.htm (current 17 Jan. 2001).
About the Authors Jean Anderson is a senior usability analyst and user interface designer at Siemens Health
Services (formerly Shared Medical Systems). She authored the first edition of SMS’s User-Centered Design Handbook. Her interests include designing Web application interfaces and helping development teams commit to the use of usability principles. Anderson is a member of ACM SIGCHI. Contact her at Siemens Health Services, XO6, 51 Valley Stream Parkway, Malvern, PA 19355; jean.anderson@ smed.com. Francie Fleek is a usability consultant. While working at SMS, she pioneered the creation and establishment of the usability processes. Her interests include clinical and security software for hosipitals and ambulatory care. She received an MS in technical and science communication from Drexel University. She is a member of the IEEE, the Usability Professionals’ Association, and ACM SIGCHI. Contact her at
[email protected].
Kathi Garrity is a lead systems analyst at the Vanguard Group.
While at SMS, she was instrumental in the effort to develop a company-wide software process. Her interests include software development, all forms of methods and requirements analysis, usability testing, and business applications. Contact her at
[email protected].
References 1. Software Eng. Inst., Carnegie Mellon Univ., The Capability Maturity Model: Guidelines for Improving the Software Process, Addison-Wesley, Reading, Mass., 1995. 2. H. Beyer and K. Holtzblatt, Contextual Design: Defining Customer-Centered Systems, Morgan Kaufmann, San Francisco, 1998. 3. L.L. Constantine and L.A.D. Lockwood, Software for Use: A Practical Guide to the Models and Methods of Usage Centered Design, ACM Press, New York, 1999.
Fred Drake is a management consultant. While at SMS, he led
the usability effort. His interests include improving the creation and quality of product information and moving it from an extrinsic support to a part of the product. He also is interested in technical documentation, online help, and technical training. He received his BS and MS in aerospace engineering from the University of Virginia. Contact him at
[email protected].
January/February 2001
IEEE SOFTWARE
53
focus
usability engineering
A Global Perspective on Web Site Usability Shirley A. Becker and Florence E. Mottay,
Florida Institute of Technology
online business failures are increasing as customers turn away from Online “There is a widening customer experience gap online. Companies who bridge this gap will win.”1
Online business failures are increasing as customers turn away from unusable or unfriendly sites. From a global perspective, usability requires cultural sensitivity in language translation, along with the appropriate use of color, design, and animation. 54
Although many companies have succeeded in developing online business applications, numerous others have failed. Many of the failures resulted from a lack of corporate vision by not taking Web usability into account. A study by Deloitte and Touche stated that approximately 70 percent of retailers lack a clearly articulated e-commerce strategy and considered their site as testing the waters for online demand.2 This corporate "build it and they will come" mentality has led to
the demise of e-commerce sites that are too late, too buggy, or too complex to use easily. Many Internet analysts correctly predicted that a significant number of business-to-consumer sites would fail during the year 2000 due to a lack of customer retention and repeat sales. Webmergers estimated that 150 dot-coms failed during 2000 and more will follow this year.3 Those sites that continue to succeed have expended, and will continue to expend, significant resources modifying their sites to improve customer retention. Many of the dot-com statistics do not take into account the global aspect of online marketing. The potential for financial
gain in a global market is great, yet little is known about global ventures' success rates in terms of meeting customer needs on a local level. On a global scale, we could argue that cultural diversity and sensitivity must be considered to ensure that the online shopping experience is the same for each customer regardless of locality. The fierce online competition that has led to the demise of poorly designed online sites nationally may occur globally if nothing is done to address global usability. What can be done strategically to reach out to a global market? We propose the use of a Web-based usability assessment model that promotes customer satisfaction as an
integral part of online business application development. This usability assessment model is an outgrowth of our collaboration with industry in the pursuit of more effective online development efforts. From a global perspective, our work is in an exploratory phase. However, with the current expansion of online business applications in the global market, we believe our assessment findings can be useful.

Strategic usability factors
Thomas Powell4 formally describes Web usability as allowing the user to manipulate the site's features to accomplish a particular goal. The targeted customer assesses usability for simplicity, understandability, and ease of use. The perception of usability is influenced by user characteristics, such as gender, age, educational level, and technology skills. Usability perception is also affected by cultural differences associated with, for example, design layout, use of color and animation, and information content. We developed the usability assessment model, which Figure 1 shows, to identify and measure usability factors that impact a customer's online experience. We've expanded these factors into more than 100 usability elements, not shown for space reasons, that have been used during usability assessments of commercial sites.5 The following usability factors are briefly defined.

Page layout
Page layout is the visual presentation of the Web page by means of background color, white space, horizontal and vertical scrolling, font size and color, and other design elements. The layout affects ease of use and quick identification of page components. Layout can be influenced by cultural differences in usability, such as the significance of a particular color, use of graphics (for example, country flags or symbols), or textual organization (left to right or top down).

Navigation
Navigation is the navigational schema in terms of breadth and depth of search paths and traversal mechanisms. Simplicity is promoted through the effective use of links, frames, buttons, and text. Navigational
considerations, from a global perspective, include ready access to other country sites from a home page (understandable in any native language) or via a navigational schema on each page. Figure 2 illustrates global aspects of navigation on a Web site.

Design consistency
Design consistency is the consistent location of page components within and across pages. Various components requiring consistency include textual descriptions, labels, prompts, and messages. Consistency of color is required for links, background, and text, among others. Design consistency promotes ease of use by applying a common look and feel to each page in a particular site or across global sites. Figure 3 shows a high level of design consistency in Yahoo's various country Web sites.

Figure 1. The usability assessment model incorporates usability factors as well as the user profile and computing environment. All of these affect a customer's perception of Web site usability. (The model relates the organization's strategic goals (customer satisfaction, financial, business process effectiveness, learning and innovation), the user profile (age, gender, computing skills, native language), localization factors (reading, language, custom), and the computing environment (browser, monitor, modem) to usability factors such as design layout, navigation, design consistency, information content, performance, customer service, reliability, and security, each broken down into usability elements such as clearly labeled fields and facilitation of data entry.)
Information content
Information content includes timely and correct error messages, prompts, button labels, textual descriptions, help, and customer service information. From a global perspective, information translated from one language to another should be grammatically correct, not archaic, and appropriate for cultural differences. Local terminology for a shopping cart, for example, includes shopping trolley and shopping bag. Figure 4 shows an example of effective information content with buttons appropriately labeled for local use.
Figure 2. Illustration of navigational aspects of global usability. The world map supports global navigation by showing available country Web sites for a selected area on the map. The second Web page illustrates inconsistent global navigation. In terms of global usability, not all country Web sites navigate consistently to other country Web sites. (It’s possible that the Web sites cited in this article have since changed.)
Navigational schema: This page allows a user to highlight a region to display a list of countries for that region.
Navigation inconsistency: There is a link from the Swedish to the German and US pages but no link from the German to the Swedish page.
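The navigation inconsistency called out above (a link from the Swedish page to the German page with no link back) is the kind of property an assessment can check mechanically across a company's country sites. The sketch below is only an illustration of that idea and is not part of the assessment model itself; the site names and link data are hypothetical.

# Hypothetical cross-links between country home pages, recorded during an assessment
# (each site maps to the set of country sites it links to).
links = {
    "us": {"de", "se", "fr"},
    "de": {"us"},            # links to the US page but not back to the Swedish page
    "se": {"de", "us"},
    "fr": {"us"},
}

def find_asymmetric_links(links):
    """Return (a, b) pairs where site a links to site b but b does not link back to a."""
    problems = []
    for site, targets in links.items():
        for target in targets:
            if site not in links.get(target, set()):
                problems.append((site, target))
    return problems

for source, target in find_asymmetric_links(links):
    print(f"{source} links to {target}, but {target} does not link back")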
Performance
Performance is measured according to consumer wait and system response times. Currently, there is significant global disparity in terms of modem speed and personal access to the Internet. Cultural sensitivity translates into sensitivity concerning download time. Performance-related cultural insensitivity is demonstrated by the high use of animation in many Asian and South American Web sites affiliated with US companies (we found animation disparity for European and Japanese-based companies as well). Yet their North American and European sister sites, where Internet access with
higher modem speeds is more readily available, minimize the use of animation.

Figure 3. Illustration of design consistency. Note that the German and English site designs look very similar.

Customer service
Customer service is additional information and support mechanisms that are readily available from the organization to enhance the shopping experience. This includes, for example, email and mail addresses, phone numbers, and interactive chat rooms. It can also mean that help is available in a native language.

Reliability
Reliability is defined in terms of site
crashes, downtime, error messages, and consistent response times. A common usability problem related to reliability results when SQL, JavaScript, and other cryptic error boxes are displayed to the end user. Another common problem results from a miscalculation in the number of hits during peak periods of Web use. On a global scale, these problems will have a major effect on customer usability.
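The cryptic SQL and JavaScript error boxes mentioned above can usually be avoided by catching low-level failures at the application boundary and substituting a plain message in the customer's own language. The sketch below only illustrates that idea; the data-access function, error text, and message table are hypothetical and are not part of the assessment model.

# Per-locale messages shown to customers instead of raw error text (illustrative only).
ERROR_MESSAGES = {
    "en-US": "We are unable to show this page right now. Please try again in a few minutes.",
    "fr-FR": "Cette page est momentanément indisponible. Veuillez réessayer dans quelques minutes.",
}

def fetch_report(report_id):
    # Stand-in for a real database or back-end call that may fail under load.
    raise RuntimeError("ORA-12541: TNS:no listener")

def render_report(report_id, locale="en-US"):
    try:
        return fetch_report(report_id)
    except Exception as error:
        # Log the raw error for support staff; show the customer a localized, non-technical message.
        print("logged for support staff:", error)
        return ERROR_MESSAGES.get(locale, ERROR_MESSAGES["en-US"])

print(render_report(42, locale="fr-FR"))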
Security
Security is concerned with privacy and limited access to personal information. The security issues facing American consumers extend to customers worldwide regarding the misuse and unauthorized distribution of credit card numbers, addresses, phone numbers, income, and other personal data.

Other usability components
Our usability assessment model includes a user profile of the targeted customer base and the customer's computing environment, which is important in ensuring that modem speed, browser type, and screen size are taken into account during the assessment process. A usability assessment also considers other environmental factors. Moreover, the user profile and environment data might need to be localized based on a particular country's or region's characteristics. The usability assessment model also includes the organization's strategic goals to ensure that these are weighed during usability decision making. Typically, strategic goals require a balance of financial, customer, business process, and internal learning perspectives.6 Strategic goals will dictate whether cultural sensitivity (driven chiefly by customer satisfaction goals) or cultural insensitivity (driven chiefly by financial, time-to-market goals) takes priority in the development of online business applications.

Figure 4. Button labels are appropriate for local use of a given Web site. (The user has the option of an English or Chinese version of this Web site. In either case, the button label for the other selection is written in the appropriate language for ease of use.)

Figure 5. Several international sites that are in English. These sites illustrate the reliance on the English language for international sites. The user would have to understand English, for example, to ask for directions in a native language. (Panel notes: The France site is in English. Another site is in English; directions provided for European travel may be selected in many different languages.)

Country-centricity and usability
As a result of our study of usability associated with US companies, we discovered that organizations tend to develop country-centric sites to support their global market. (We limited our study, and so our discussion, to US-centric usability, although the usability concept could apply to any country.) US-centricity is imposing a Web usability look and feel from an American perspective onto localized Web sites. The result might be an emphasis of English as a primary language on all international Web sites with little regard for native-language support. The result might also include a lack of concern (or awareness) for grammatical inconsistencies or incorrect translations to a native language. US-centricity can come about unknowingly, for example, when an English-language Web site is directly translated into native-language Web sites. Other possible reasons for US-centricity are when a com-
pany deems it economically feasible to maintain only English-supported country sites, translates one US-based Web design into many international sites, or uses implied design standards regardless of cultural differences. Figure 5 illustrates this concept of cultural insensitivity whereby site pages for global use are written in English. Usability problems that we encountered range from simple grammatical mistakes to the overuse of animation, which severely slows download time. A number of UScentric usability issues can negatively affect a local customer’s online experience:
■ The use of culture-specific icons may be inappropriate, confusing, or unknown at a local level. A common example is the shopping cart icon. Other countries use different terminology to represent the shopping container, such as a trolley or a bag.
■ The use of a particular color for backgrounds, error messages, or textual information may be inappropriate, confusing, or misleading. A color might have different meanings in different countries. The color red means error or warning in the US although this isn't the case in Asian countries. One or more colors might represent nationalism for a country. Yellow, for example, is found on many German sites, as this is a national color.
■ Commonly used English words and phrases, as well as trademarks, are often not translated into the native language. Locally, these words might be misunderstood, difficult to pronounce, or their meaning might be unknown (see Figure 6).
■ Direct translation of English to a native language can result in unintuitive or confusing labels and instructions. On one particular site, the English word "map" was translated directly into the French word "plan," which is not self-explanatory in French. Plan du site—plan of the site—would have been a better phrase for improved readability. Figures 7 and 8 show examples of Web sites in which the direct translation might affect local usability.
■ A main or home page for accessing country or regional sites is in English. The user must select a country or option from a list of English words with no translation support for the native language. (Some sites have remedied this in part by providing a visual map of the world, as Figure 2 shows.)
■ The use of animation varies by country site. For several US companies, their Asian and Central and South America sites have significant animation when compared to North American and European sites. For several European and Japanese sites, the US site contained more animation. Figure 9 shows an example of a European company with varying degrees of animation associated with its country sites.
■ Navigational schema varies by country site. Inconsistencies in navigation make it difficult to traverse consistently across sites. Some country sites allow access to a home page; others allow access to a particular region of the country, while others access all countries (Figure 2 shows this limitation).

Figure 6. An example of a site with potential for confusion: Selected English words are not translated into the native language. In this case, the English words are difficult to pronounce and may not be understood in French.
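A lightweight way to avoid the untranslated labels and mismatched terminology described above is to keep user-visible terms in per-locale resource tables rather than in the page markup, so each country site can substitute its own term (shopping trolley, Plan du site, and so on). The sketch below is only illustrative; the locale codes, keys, and translations are hypothetical and are not taken from the sites we studied.

# Hypothetical per-locale label table (illustrative only).
labels = {
    "en-US": {"cart": "Shopping cart", "go": "Go", "site_map": "Site map"},
    "en-GB": {"cart": "Shopping trolley", "go": "Go", "site_map": "Site map"},
    "fr-FR": {"cart": "Panier", "go": "Aller", "site_map": "Plan du site"},
}

def label_for(locale, key, default_locale="en-US"):
    """Look up a label for the user's locale, falling back to the default locale."""
    return labels.get(locale, labels[default_locale]).get(key, labels[default_locale][key])

print(label_for("fr-FR", "site_map"))   # Plan du site
print(label_for("de-DE", "cart"))       # no German table here, so this falls back to the US label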
Usability strategies
In pursuing a global market, organizations should be sensitive to cultural differences that might impact usability. Several strategies are available that can help with usability, depending on the organization's goals.

Common design
A general design layout, with little or no customization for particular country sites, might reduce the cost of upgrades and maintenance associated with multiple sites. For customers accessing more than one country site, it provides design consistency for ease of use. It is also easier to enforce global design standards in terms of the site's look and feel. Figure 3 illustrates this concept for Yahoo sites, which have a high level of design consistency. The risk associated with this strategy is that usability can be degraded when grammatical mistakes, missing translations, and inappropriate colors, for example, are introduced during site construction and maintenance. Usability assessments uncover these problems before they reach the customer.

Customization
A lot could be learned about cultural sensitivity, concerning global site deployment, from the international marketing strategies of McDonald's and Coca-Cola. When visiting a McDonald's in Aruba, for example, there is a localized food item—barbeque chicken—not found on the North American menu. Similarly, Coca-Cola localizes the flavor of its products to maximize global sales. This localization concept could be applied in the development of global online business applications to enhance global usability. The downside to developing customized Web sites for each country, however, includes higher development and maintenance costs when each site is built and maintained separately.
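One way to contain the maintenance cost that makes full customization expensive, in the spirit of the combined strategy discussed next, is to describe each country site as a shared base design plus a small set of local overrides. The sketch below is our own illustration of that idea; the settings and country data are hypothetical.

# Shared look and feel, defined once (hypothetical settings).
BASE_DESIGN = {
    "logo": "corporate_logo.gif",
    "nav_bar": ["Home", "Products", "Support", "Contact"],
    "background_color": "white",
    "language": "en-US",
}

# Per-country overrides limited to what genuinely differs locally.
LOCAL_OVERRIDES = {
    "de": {"language": "de-DE"},
    "jp": {"language": "ja-JP", "nav_bar": ["ホーム", "製品", "サポート", "お問い合わせ"]},
}

def site_design(country):
    """Merge the shared base design with a country's overrides."""
    design = dict(BASE_DESIGN)
    design.update(LOCAL_OVERRIDES.get(country, {}))
    return design

print(site_design("jp")["nav_bar"])   # localized navigation labels
print(site_design("br")["logo"])      # no overrides defined, so the base design is used as-is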
Figure 7. Direct translation with potential negative connotation. In this case, the French translation of the English word has a negative connotation. Though its meaning is explained to the user, there may still be a negative impression. (The US name "Escrow" may be interpreted as "fraud" in French. Though the French version of escrow is escroc, it is pronounced the same. The company explains the meaning of the word, but one still has to question whether this will overcome the negative connotation of the word.)

Figure 8. English words may cause confusion when interpreted in a native language. (Academic initiative is English but is also composed of two French words in reverse order. For a non-English-speaking French person, academic initiative can be understood to have meaning but is grammatically incorrect.)

Combined common and customized design
This middle-of-the-road strategy supports design consistency across all Web sites while customizing a particular Web site to meet the locality's cultural needs. By standardizing corporate logos, nav bars, graphics, and other standard look-and-feel components across all sites, companies support the usability goals of simplicity and ease of use. By customizing colors, icons, graphics, and other Web components to meet a given country's needs, companies promote understandability and ease of use. Perhaps most important, however, is the appropriate use of the native language for each respective Web site. Applying the customization or the combined strategy instead of a common design results in higher development and maintenance costs. The higher costs are justified, however, by customer satisfaction achieved with culturally sensitive sites. Although more research is needed, the national fallout of business-to-consumer Web sites to date tells us that fierce competition and customer satisfaction both play a critical role in online success.

Table 1. Comparison of country Web sites for a software company. The study was conducted using a 56K modem, a 15-inch monitor on a notebook computer, and Microsoft's Internet Explorer browser.

Country     Use of animation (scale 1-5)¹   Horizontal scrolling (yes/no)   Oversized graphics (yes/no)²   English content (scale 1-5)¹
US          No animation                    Yes                             No                             Not applicable
Australia   No animation                    Yes                             No                             Not applicable
Sweden      3                               No                              No                             1
France      3                               No                              No                             1
Japan       No animation                    No                              No                             2
China       3                               No                              Yes                            1 (button label GO)
Brazil      1                               Yes                             Yes                            4

¹ Likert scale where 1 is the lowest point of allocation and 5 is the highest. A 1 indicates low significance; 5, high significance.
² Oversized graphics waste valuable information space and require more vertical or horizontal scrolling to find information.

Usability assessments: A study
Much of our work on usability assess-
ments of US sites has focused on user profile data that included age, gender, computing skills, and other commonly used marketing data. When profiling the consumer for a particular country, however, there is additional information that would assist in developing an effective online business application. From a global standpoint, a user profile for a country should include the level of understanding (or popularization) of commonly used icons (such as a shopping cart), words (such as the GO button label), and colors (such as red). A usability assessment, based on the model in Figure 1, can uncover this information.

Figure 9. Animation and performance issues. These two examples illustrate country centricity in terms of animation and the impact on performance. In each case, the country-of-origin Web site has less animation than the other country site. (The download time for the US site versus the Argentina site is significant; the Argentina site has significant animation, which the US site does not have. US site navigation is complex because of the extensive use of animation and frames; the Dutch site has little animation and is very simple in design.)

To illustrate the importance of usability assessments in uncovering design flaws, we compared seven country sites for a US-based, global software company. Table 1 summarizes the results. The usability elements included animation, horizontal scrolling, graphics, and English content. It's interesting that although these sites were customized, each had usability problems. The US and Australian sites did not have animated components, thus minimizing download time. However, both sites made use of horizontal scrolling, which negatively impacted readability. The China and Brazil sites had oversized graphics, which wasted valuable information content space. All non-English sites had various amounts of English embedded in the text. The company that we studied and summarized in the table is a large, well-established software company selling multiple products in an international market. Common aspects of all the company's sites included consistent use of background colors, fairly consistent page design, mixed English with native language, good use of vertical white space, and the use of the folder design standard (popularized by Amazon.com design).
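An assessment like the one summarized in Table 1 can be captured as simple per-site records and then filtered or aggregated by usability element. The sketch below shows one possible representation; the field names are ours, and the values are copied from Table 1 (None stands for "no animation" or "not applicable").

# Per-country assessment records following Table 1 (illustrative field names).
assessments = [
    {"country": "US",        "animation": None, "horizontal_scrolling": True,  "oversized_graphics": False, "english_content": None},
    {"country": "Australia", "animation": None, "horizontal_scrolling": True,  "oversized_graphics": False, "english_content": None},
    {"country": "Sweden",    "animation": 3,    "horizontal_scrolling": False, "oversized_graphics": False, "english_content": 1},
    {"country": "France",    "animation": 3,    "horizontal_scrolling": False, "oversized_graphics": False, "english_content": 1},
    {"country": "Japan",     "animation": None, "horizontal_scrolling": False, "oversized_graphics": False, "english_content": 2},
    {"country": "China",     "animation": 3,    "horizontal_scrolling": False, "oversized_graphics": True,  "english_content": 1},
    {"country": "Brazil",    "animation": 1,    "horizontal_scrolling": True,  "oversized_graphics": True,  "english_content": 4},
]

# Flag the sites whose download time or readability is most likely to suffer.
flagged = [a["country"] for a in assessments
           if (a["animation"] or 0) >= 3 or a["horizontal_scrolling"] or a["oversized_graphics"]]
print("Sites with animation, scrolling, or graphics problems:", ", ".join(flagged))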
The number of non-US customers using online business applications continues to increase very rapidly. To take advantage of this opportunity, companies must understand the target market in terms of localized and common online needs. In this respect, we have only just begun to understand the usability issues that influence short- and long-term use of online business applications. We are developing a tool, an automated environment, that will let users enter their assessment of a particular Web page or site. The tool implements the usability model shown in Figure 1. It supports data entry for one or more selected usability elements in order to analyze the user's perspective on
Web site usability. The tool’s report generator allows for data analysis based on user profile or environmental selection criteria. Our future endeavors will expand our tool to incorporate our findings on global usability for more effective assessments.
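Because the tool pairs each rating with the user profile and computing environment from Figure 1, its report generator can slice results by those criteria. A minimal sketch of such a record and one profile-based filter follows; all field names and values here are our own illustration, not the tool's actual schema.

# One assessment entry as such a tool might store it (illustrative fields and values).
entry = {
    "site": "example country site",
    "element": "button labels in native language",   # one of the 100-plus usability elements
    "rating": 2,                                      # 1 (poor) to 5 (good)
    "user_profile": {"age": 34, "gender": "F", "computing_skills": "intermediate", "native_language": "fr"},
    "environment": {"browser": "IE 5.5", "monitor_inches": 15, "modem_kbps": 56},
}

def select(entries, **criteria):
    """Return entries whose user profile matches every given criterion, as a report filter might."""
    return [e for e in entries if all(e["user_profile"].get(k) == v for k, v in criteria.items())]

matches = select([entry], native_language="fr")
print(len(matches), "entry from French-speaking users")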
Acknowledgment
We would like to thank Anthony Berkemeyer and Natalie Roberts for their usability expertise and their invaluable assistance in uncovering global usability issues.
References
1. M. Hurst and E. Gellady, "Building a Great Customer Experience to Develop Brand, Increase Loyalty and Grow Revenues," www.creativegood.com/creativegood-whitepaper.pdf (current 16 Jan. 2001).
2. R. Spiegel, "Report: 70 Percent of Retailers Lack E-Commerce Strategy," E-Commerce Times; www.ecommercetimes.com/news/articles2000/000126-1.shtml (current 16 Jan. 2001).
3. J. Weisman, "E-Commerce 2000: The Year of Living Dangerously," E-Commerce Times, 29 Dec. 2000; www.ecommercetimes.com/perl/story/6380.html (current 16 Jan. 2001).
4. T. Powell, Web Design: The Complete Reference, Osborne McGraw-Hill, Berkeley, Calif., 2000.
5. S. Becker, A. Berkemeyer, and B. Zou, "A Goal-Driven Approach to Assessing the Usability of an E-commerce System," Cutter Information Technology J., Apr. 2000, pp. 25–34.
6. R.S. Kaplan and D.P. Norton, The Balanced Scorecard—Translating Strategy into Action, Harvard Business School, Boston, 1996.
About the Authors Shirley A. Becker is a professor of computer science at the Florida Institute of Technology, Melbourne, and codirector of its Software Engineering Research Center. Her funded research includes Web usability and testing, Web-enabling tools and technologies, e-commerce systems development, and database systems. She recently served as editor of the Journal of Database Management and serves on several editorial review boards. Becker received her MS and PhD in information systems from the University of Maryland, College Park. She is a member of the IEEE, the ACM, and the Association for Women in Computing.
Florence E. Mottay is a graduate student in software engineering and a research assistant at the Center for Software Engineering Research, Florida Institute of Technology, Melbourne. Her research interests are in software testing, formal languages, mathematical models, and e-commerce. She was awarded for excellence in mathematics by the United States Achievement Academy (1997) and for academic excellence by the American Association of University Women (1998). Mottay received a BS in applied mathematics from Florida Institute of Technology.
Contact the authors at the Florida Inst. of Technology, 150 West University Blvd., Melbourne, FL 32901;
[email protected];
[email protected].
focus
usability engineering
Designing User-Centered Web Applications in Web Time Molly Hammar Cloyd, Broadbase Software
As designers struggle to develop Web applications "in Web time," they are under the added pressure of delivering usability. This author describes her company's successful transformation to user-driven processes for designing e-commerce applications. She also offers strategies for introducing human factors methods into a reluctant development organization.
Usability has moved from a "nice to have" to a "must have" component of e-commerce application design.1 In the past, customers purchased desktop applications and then struggled to learn how to use them or called for technical support. Now, they shop in a try-before-you-buy model. If they can't navigate your site, they are a few clicks away from your competition. Even if you're in an industry with few competitors, users' time and attention are at a premium.2
The growth of e-commerce and business-to-business applications has created an unprecedented emphasis on knowing our users and designing usable applications. However, backing corporate commitments to usability with user-driven development processes is a challenge. Designers struggle to design new applications, defining Web user interface standards as they go, all the while under pressure to deliver applications faster—in "Web time." These problems are compounded in many start-ups, which have little design process infrastructure, much less human factors methodology, in place. With little historical data about Web application user interface and usability standards, human factors engineers are searching for ways to balance three different approaches to Web-based usability engineering: transferring traditional application design techniques to the Web environment, relying on emerging Web design standards, and conducting new research into what
Web application users want and need. Developers frequently ask usability professionals, "What's the difference between a Web site and a Web application?" "Should I conform to Web site standards or Windows standards when designing Web application screens?" and "What should the Cancel button on a Web form do?" To complicate matters, the focus on getting Web applications to market in Web time often means that developers cut back on planning and design in the development process. Overall, the use of software engineering processes is in decline.3 The result: human factors engineers are pressured to provide unprecedented usability in a fraction of the time they need. This article presents a case study of how Decisionism, an analytic-applications company, redefined its software development process to design usable Web applications in Web time. In the midst of these process changes, Broadbase Software acquired the company. The development process that
Decisionism pioneered is now the basis for the user-centered design group at Broadbase Software.

Our challenge
Our organization's decision to enter the B2B Web application arena shifted us from being a traditional software developer to a Web application provider. Specifically, we were faced with these challenges:
■ Shifting the development organization's mindset from a feature-driven approach to a user-goal-driven one.4 Rather than generate lists of product features (what our product would do), we wanted to set requirements based on what users would be doing with our product.
■ Changing the organization's view of human factors methods. Prior to entering the Web application arena, Decisionism did not have a human factors group, so its addition represented a change in the corporate culture.
■ Introducing a design process in an organization in which team members were reluctant to be bound by procedures or heavy project documentation requirements. Our challenge was to design a process comprehensive enough to be repeatable and to support introducing new team members and technologies, without being cumbersome.
■ Defining an all-new product, starting with very little knowledge about potential users and no concrete information about how users would perform tasks with the new application.
■ Having a limited design, development, and quality assurance staff along with a corporate goal to be first to market with a B2B analytic application.
■ Designing a development process that would allow for the thorough investigation of users' characteristics and goals yet would facilitate a rapid application development life cycle.
■ Making an architectural shift from a user interface that is tightly bound with functional components to a flexible one that could be changed with minimal impact to the underlying code.

Our Web application design process
Decisionism redefined its development process by placing human factors methods at the core. We stripped away our existing software development process and started over. In a matter of days, we outlined the human factors methods and deliverables that would be required to
■ determine who our new application's users would be, what their goals are, and how they work;
■ establish overall Web application user interface standards;
■ identify usability goals for the new application;
■ communicate usability architecture requirements to the developers;
■ determine the application's overall flow;
■ design the user interface, including site maps, prototypes, and usability tests; and
■ produce a user interface specification that supports developers in programming the interface but does not take months to write.
With this process outlined, we asked the development, quality assurance, documentation, and marketing leads to add their pieces to the process. We sequentially added each functional group's tasks until all the people on the team were satisfied that the process met their needs. The resulting development process is centered on human factors methods. Every human factors deliverable is a critical input to other functional teams' work. We defined five phases for the requirements and design process:

1. Condensed user and user-goal analysis.
2. Proof of concept (prototyping).
3. Combined site maps and storyboard content.
4. Use cases with screen mockups.
5. Hand-off of use cases and screen mockups to development.

Condensed user and user-goal analysis
Prior to the design phase, the business development group completed a market analysis of prospective customer companies. Using this information, we spent one week sketching out a preliminary picture of our prospective users, identifying such factors as their goals, skill level, and measures of job success. Everyone on the team—the lead architect, programmers, services members, the quality
assurance and documentation lead, and the vice presidents of engineering and business development—participated in creating user profiles. This initial look at users gave the team a starting point for identifying the tasks that users would perform with the application and for creating a prototype to use in subsequent user analysis and feedback sessions.

Figure 1. We used a paper prototype to convey our initial product vision to team members and from this developed a PowerPoint demo screen to test with users.

In Mastering the Requirements Process, James and Suzanne Robertson describe the requirements process as determining "the business problem to be solved ... and what the product will do to contribute to a solution."5 Unfortunately, in many companies, this process is abbreviated because of tight deadlines. The functional requirements document becomes merely a shopping list of features that engineers prioritize and identify trade-offs to determine which features can be implemented in a given release. In taking a user-centered approach, we steered away from feature lists and focused on a handful of real-life user problems or goals that our product would accomplish. For example, instead of listing requirements such as "Display of multiple analytics on a single page," one of our requirements was "Enable users to determine the best auction starting price for a commodity." This requirement led to an Offer Optimizer software module that not only displayed multiple analytics on one page but also supported users in making smart buying and selling decisions in a B2B market. With our preliminary user profiles and user goals in place, we started an ongoing process of meeting with potential customers, watching them work, and asking for their feedback on user interface prototypes. In addition to soliciting feedback from a variety of
B2B companies, we created a close development partnership with 20tons.com, a marketplace information provider for the plastics industry. They acted as subject matter experts and provided us with ongoing feedback and input into user profiles and use cases throughout our design and development processes.

Proof of concept
Because we were introducing a brand-new product idea and starting with so little user information, we created a paper prototype to convey our initial product vision to team members. This served as a starting point for gathering requirements and usability feedback from prospective users. We chose paper prototyping rather than functional prototyping for three reasons:
It was faster to mock up and revise designs than coding screens. The designs clearly had not yet been coded, so reviewers did not hesitate to suggest changes. Developers were not tempted to use already written code.
Once we were satisfied with our initial paper prototypes, we created PowerPoint slides of the proposed user interface (see Figure 1). We used these to gather feedback about our overall product requirements and interface design approach. The PowerPoint prototype conveyed our overall vision for the product yet was general enough to spur design conversations with users. Because we wanted to gather requirements as well as usability feedback, we used cognitive walkthrough to evaluate our prototype design. In a cognitive walk-
through, prospective users tell the facilitator what their goals would be for using the product, and then they guess where each navigation path will take them and explain how they would expect to perform certain tasks using the prototype design.6 These methods expose the users' goals and expectations and identify potential navigation pitfalls in user interface designs. Cognitive walkthrough proved to be a valuable technique for gathering usability data on prototypes that were not fully functional. We used it to evaluate our prototype with five users at two net market companies. Ideally, we would have gathered feedback from a larger sample of users. However, with our time constraints and the difficulties we had in finding users, we collected as much information as we could before moving on to the next phase.

Figure 2. Combined site map and storyboards made it easier for reviewers to conceptualize. (The excerpt covers an administrator home page with usage reports and navigation to Usage, Reports, Documents, Users, and Desktops; a Manage Reports page, a sortable grid of reports with view, change, delete, import, export, and add actions; a report window that opens in its own browser window for drill-down work; and an Import List of Reports flow in which the user browses for a spreadsheet of report attributes, specifies a delimiter and an error file, confirms the import, and is notified of any records that failed.)
Combined site maps and storyboard content
Armed with a better understanding of our users, we were ready to build a site map, an aerial view of the application showing how the user interface screens would flow from a user's perspective. To save time and make the site map easier for reviewers to conceptualize, we built storyboarding components directly into our site map. Whereas many site maps only contain representations of each screen and the navigation between screens, our site map included lists of each screen's content. By presenting user goals, navigation, and screen content in the context of the overall application flow, the site map was the converging point for user-driven and technical product requirements (see Figure 2). We conducted a series of intensive review sessions to get input and approval from every
member of the development, marketing, services, and executive teams. We also gathered feedback from our 20tons.com development partner. This feedback and approval process was critical to our ability to develop the application quickly. As a group, we walked through every screen of the application, considering the task flow and functional requirements from a user’s perspective. The process, though tedious, ensured that everyone involved in the application’s design, development, and marketing was in full agreement about its scope and flow. This process also identified and forced us to resolve contradictory visions of the application scope or flow early in the design process.7 After the development, marketing, and business development teams signed off on the prototype, we created detailed designs for each screen and included them in a modified use case document. Together, the site map and the modified use case document took the place of the traditional user interface specification document. Use cases with screen mock-ups We expanded each user goal identified in the condensed user and user-goal analysis phase to include use case information. The lead architect, lead developer, and human factors engineer jointly contributed to use case documents. With slight modifications of the Rational Unified Process use case template,8 our use cases embodied users’ goals and motivations and functioned as developers’ guidelines for implementation. Our use case document was organized by user task (for example, “Viewing a Report”). For each process, the document provided details about the look and feel, task flow, and technical requirements for implementing the use case in the application. For each user goal or task, the document included the following: ■
66
IEEE SOFTWARE
the users’ goals and, if applicable, how users would know when they met each goal; frequency and criticality of tasks; usability requirements of the user interface supporting each use case; a picture of the screen (this was a placeholder section in early versions of the document, later filled in with a design diagram); a list of data elements (such as buttons,
January/February 2001
■ ■ ■
links, or display-only items) and how they would respond to users’ actions; descriptions of how interactions with the data elements would be validated; requirements for entering and exiting each screen; and requirements for future releases that might affect how a use case is implemented.
Figure 3 shows an example use case and screen mock-up that we developed using the modified use case template. Critics of use cases argue that it is a timeconsuming, arduous task that can delay implementation. Others argue that there is no way of knowing when the set of use cases is complete.9 However, our team subscribed to the view that in rapid development environments, designers should select a small number of users and use cases that represent the entire product and then develop a user interface architecture that can extend to the whole product.10 We questioned the costeffectiveness of creating an exhaustive set of use cases with such limited time. Our aim was to identify the users’ most important goals and then develop an application that would enable users to meet those goals, meet the product requirements, and be extensible to outlying goals and tasks. We generated use cases for each product requirement. We focused on the activities users would perform most frequently with the application and activities most critical to the users’ success with the product. These use cases gave us the framework we needed to develop the application’s core functionality. Combining use cases, screen mock-ups, and screen descriptions into a single document saved time and also ensured that use cases, user interface designs, and functional requirements were kept in sync. Jointly developing use cases put human factors’ influence into a context that was already familiar to developers. Using the modified use case template ensured that user goals were viewed as integral to every use case. This meant that technical requirements in use cases were driven by the flow of events from a user’s perspective. Also, efficiencies were gained by creating a use case model that incorporated both developers’ and users’ needs instead of embarking on separate activities to define human factors requirements and development requirements.
Hand-off of use cases and screen mock-ups to development The design and development process was iterative. Once we identified the big picture of the application flow in the site map, we created use case documents for specific areas of functionality. When a use case or group of use cases was complete, we handed those off to engineers for development. For each set of use cases, the director of development produced an architecture design for that iteration of the product. As engineers coded one set of use cases, the design team created the next set of use cases. If engineers encountered implementation issues that required user interface changes, we responded by quickly mocking up alternative screen designs. Having the overarching site map in place made it possible for us to hand off sections of the user interface to be coded without losing continuity across incremental designs. The entire product design process, from user analysis to hand-off of the design to developers for coding, took about 12 weeks. Implementing human factors processes in a reluctant organization Several members of our development team were reluctant to adopt development processes, let alone one grounded in human factors methods. Some had come from large companies where they’d had negative experiences with ISO or slow-moving waterfall processes. Others were concerned that the human factors engineer would design the application in isolation and hand down designs that the developers would have no control over. Most were concerned that following a process would prevent us from meeting our time-to-market goal. With these concerns in mind, we worked to create a process that would help, rather than hinder, developers. Strategies included the following: ■
Combining phases of traditional human factors processes to ensure that our process required minimal documentation and was not cumbersome.
Figure 3. An enhanced use case and screen mock-up.
Completing design phases in parallel and handing off designs for coding incrementally. We emphasized that our process is an iterative process, not a traditional waterfall process.11 Focusing on getting the developers’ buyin to the new design process. We did this by involving them in every step of the design. In The Elements of User Interface Design, Theo Mandel discusses the importance of creating a multidisciplinary product design team.10 Involving a wide range of people not only provides the full spectrum of skills needed for good design, but it also increases the team’s buy-in to the design. Creating a shared vision among all team members.7 Our proof-of-concept prototype produced early in the design process conveyed the product vision to the entire company. This gave every team member a vision of what the product would do to help B2B users. It took the product from seeming like something too large and impossible to produce to something we could actually design and build within our time constraints. Distributing articles and Web site information to developers pertaining to Web application usability and design. This increased the developers’ awareness of the need for usability in Web-based products.
Accruing benefits from the process Involving the entire development team in the design phases had a number of benefits. It gave developers a say in what they would be developing, and it showed them the volume of work that had to be done before coding could begin. It gave us more complete requirements and designs because of collaborative input from multiple disciplines. It also shortened the calendar time spent on each design phase. This enabled us to do user analysis and detailed design while staying on schedule. We were also able to demonstrate to the whole team the importance of identifying our users and understanding their experiences. This was the beginning of a user-centered culture at Decisionism. Our process helped dissolve communication barriers between human factors and development personnel. Because team members were involved in developing user profiles and 68
IEEE SOFTWARE
January/February 2001
Released in December 2000, our application, called E-Marketplace, was the first B2B analytic application of its kind in its market.
task analyses from the start, we lost no time communicating user research findings and convincing developers of what users needed. Team members stopped viewing user requirements as something imposed on them and started viewing them as the purpose for the project. Design meetings emphasized how the application should work from a user’s perspective. Finally, by distributing the human factors workload, we were able to accomplish human factors activities in the time permitted. Instead of the classic problem of not enough human factors people to do the work, one human factors engineer was able to oversee all human factors activities and keep a big-picture perspective of working toward usable design.
R
eleased in December 2000, our application, called E-Marketplace, was the first B2B analytic application of its kind in its market. When Broadbase Software approached Decisionism about acquisition, Decisionism illustrated the viability of getting a B2B product to market using the proof-of-concept’s prototypes, user profiles, site maps, and enhanced use cases. Since Decisionism had not yet released a B2B analytic product, this demonstration enhanced our appeal as an acquisition candidate. The human factors and user interface design team, now part of Broadbase Software, is implementing the processes described in this article for Broadbase, along with the three other software companies Broadbase recently acquired. Beyond the business benefits, the development team reported several positive results from this process. Team commitment improved in getting the product to market. The marketing, human factors, and development teams worked closely together to create a product vision and design. Creating the project plan was greatly simplified. Developers saw the product as a whole instead of focusing only on the individual features or components they were coding. They also understood the interdependencies between features and worked together to make a cohesive product.
Developers had time to focus on solving implementation issues and coding the product. Having clear site maps and screen designs meant that they didn’t have to spend time deciphering requirements documents or worrying about details of screen flow and layout. The application flow in the site map expedited identifying and resolving business and presentation logic issues in the technical architecture and made it easy for them to identify dependencies among features. Identifying navigation and application flow problems at the site map phase minimized the number of defects that our QA engineer found during final testing. The QA engineer used the site map as a reference in planning test cases. Moreover, the detailed site map and use case documents controlled scope creep by clearly outlining what needed to be developed. Developers accepted human factors as a key part of the design process and began seeking out human factors and user interface design team members for design guidance. At the time of our acquisition by Broadbase, the development team was required to completely change the underlying technologies, development language, and third-party components. The technology-independent nature of the site maps and use cases made this change possible. In fact, the development team was able to make the required changes and still deliver the product three weeks before the deadline. Most importantly, E-Marketplace hit the mark with B2B net markets and their customers. While we have not yet completed formal usability tests, we gathered subjective feedback and cognitive walkthrough data throughout our design and development process. We responded to customer problems and suggestions, and customers successfully navigated our user interface during cognitive walkthroughs. Ultimately, we provided customers with a targeted analytic application for doing business in online markets. The first release of E-Marketplace was a stake in the ground that redefined our design processes and development culture. Future plans for E-Marketplace include formal usability testing and integration into the Broadbase analytic application suite. Like the user interface, our Web application development process will be iterative. We also plan to inte-
About the Author Molly Hammar Cloyd is the hu-
man factors engineer at Broadbase Software. She joined the company after starting the human factors function at Decisionism, an analytic applications company recently acquired by Broadbase. She is now building a user-centered design group at Broadbase. Contact her at Broadbase Software, 4775 Walnut St., Ste. 2D, Boulder, CO 80301;
[email protected]; www.broadbase.com.
grate a few more techniques into our process for future product releases: ■
While group design and storyboarding sessions helped us generate a broad range of design ideas, we plan to experiment with parallel design, in which designers sketch screens separately before coming together to combine efforts. We hope this will expedite the initial screen mock-ups and facilitate generating more design options for the team to choose from. We plan to conduct formal usability testing at multiple points along the design process. We have received funding for usability testing equipment and resources so we can gather quantitative usability data, identify specific areas for design improvements, and measure improvements against baseline usability results. We will iterate user interface designs based on usability test results, user feedback, market requirements, and new Web application technologies.10
References
1. S. Ward and P. Kroll, "Building Web Solutions with the Rational Unified Process: Unifying the Creative Design Process and the Software Engineering Process," www.rational.com/products/whitepapers/101057.jsp (current 2 Jan. 2001).
2. J. Nielsen, Designing Web Usability: The Practice of Simplicity, New Riders, Indianapolis, Ind., 2000.
3. R. Reddy, "Building the Unbreakable Chain," Intelligent Enterprise Magazine, vol. 3, no. 3, www.intelligententerprise.com/000209/feat3.shtml (current 2 Jan. 2001).
4. A. Cooper, The Inmates Are Running the Asylum, Macmillan USA, Indianapolis, Ind., 1999.
5. S. Robertson and J. Robertson, Mastering the Requirements Process, ACM Press, New York, 1999.
6. J.S. Dumas and J.C. Redish, A Practical Guide to Usability Testing, Ablex, Norwood, N.J., 1993.
7. J. McCarthy, Dynamics of Software Development, Microsoft Press, Redmond, Wash., 1995.
8. G. Schneider and J.P. Winters, Applying Use Cases: A Practical Guide, Addison-Wesley Longman, Reading, Mass., 1998.
9. B.L. Kovitz, Practical Software Requirements, Manning, Greenwich, Conn., 1999.
10. D. Mayhew, The Usability Engineering Lifecycle, Morgan Kaufmann, San Francisco, 1999.
11. T. Mandel, The Elements of User Interface Design, John Wiley & Sons, New York, 1997.
focus
usability engineering
Engineering Joy Marc Hassenzahl, Andreas Beu, and Michael Burmester, User Interface Design GmbH
Joy of use has become a buzzword in user interface design although serious attempts at defining it remain sparse. The authors propose systematic methods of taking into account one of its main determinants, hedonic quality, and its complex interplay with usability and utility as a step toward truly engineering the user experience.
Over the last 30 years, usability has become an acknowledged quality aspect of a wide variety of technical products, ranging from software to washing machines. The concept of usability has been accompanied by the assumption that usability can be engineered. Clearly, the aim of usability engineering is to devise processes to assure that products are usable.1 Although usability
engineering is still a first-generation field, some of its basic ideas are widespread and reach back to practices in industrial design2 and the “golden rules” of John D. Gould and Clayton Lewis.3 The key principles include the analysis of a product’s intended context of use (user skills and needs, task requirements, and physical, social, and organizational context) at the beginning of development; user participation throughout the development process; early prototyping; usability evaluation; and continuous revision based on evaluation data.4 As the field’s methods have evolved, they have changed the concept of usability from a narrow product-oriented quality attribute to the broad concept of quality of use, that is, “that the product can be used for its intended purpose in the real world.”5 However broad the latest definition of usability is, it recently acquired a new associate, the so-called joy of use. The notion of
joy of use is instantly appealing, though its actual meaning is hard to grasp. In 1997, Bob Glass said, "If you're still talking about ease of use then you're behind. It is all about the joy of use. Ease of use has become a given—it's assumed that your product will work." However, joy of use is extremely hard to define. As Glass said, "You don't notice it, but you're drawn to it."6 The way the term joy of use is employed in general computer and human–computer-interaction literature reveals three perspectives on the issue:
■ Usability reductionism supposes that joy of use simply results from usable software and that the answer to the question of how to design for enjoyment is already known. The only problem is how to put usability engineering into practice. So, joy of use appears to be just a natural consequence of excellent usability.
This perspective discounts the qualitative differences between simply doing a job and enjoying doing a job. Design reductionism reduces joy of use to a quality that graphical and industrial designers add to software. Designers “possess the ... skills that combine science and a rich body of experience with art and intuition. Here is where ‘joy’ and ‘pleasure’ come into the equation: joy of ownership, joy of use.”7 This perspective assumes that joy of use is concerned more with superficial than with deeper qualities, such as interaction style and functionality. Therefore, it fails to acknowledge the complex interplay of visual, interactional, and functional qualities. Marketing reductionism reduces joy of use to a simple marketing claim. This opinion is comparable to the perception of usability at its advent: user-friendliness. It is mainly a claim with no substance.
None of these perspectives seems satisfactory. Given that our aim is to design enjoyable software systems, we should take the analysis of joy of use as seriously today as we took ease of use yesterday. Why consider enjoyment in software design? The most basic reason for considering joy of use is the humanistic view that enjoyment is fundamental to life. Glass said, “I believe that products of the future should celebrate life! They should be a joy to use. I predict that joy of use will become an important factor in product development and product success.”8 Although some might readily agree with this view, others object on the grounds that there is a radical difference between leisure and work. The former calls for idle enjoyment, the latter for concentrated work. Erik Hollnagel has voiced a perspective against connecting emotions (such as enjoyment) with software design.9 He argues that human–computer interaction is basically about efficiency and control, and that emotions interfere with these attributes. For example, one might make decisions based on highly subjective, emotional criteria not suitable for the rational work domain. Somewhat cynically he states, “Affective interfaces may serve a therapeutic purpose, [to] make the
user feel better.”9 We, on the other hand, believe that the users’ well-being always matters, especially in a work domain. Technology acceptance research has demonstrated the positive effects of perceived enjoyment or fun in work settings. For example, in one study, when people enjoyed a software product, their acceptance and satisfaction increased.10 The impact of user-perceived enjoyment on acceptance (one important determinant of productivity) nearly equaled that of user-perceived usefulness. In another example, providing an enjoyable workplace for call center agents was assumed to sustain the quality of customer service and even increase it throughout the day.11 So, in certain work positions (those requiring “emotion work,” such as a call center agent or hotel receptionist), enjoyment might have an important effect on work quality instead of solely serving a therapeutic purpose. There are other cases where joy or fun plays a role as a software requirement, for example, where learning is the main system function.12 Acknowledging the positive effects of enjoyment does not necessarily imply knowledge of how to design enjoyable software. The primary question is: What do we actually have to do to design for joy of use? Advocates of usability reductionism would answer: “Nothing! Just provide useful functionality so the users can easily operate the software.” This view emphasizes software’s role as a tool for accomplishing a task and focuses on task-related qualities (usability and utility). It has been shown, however, that hedonic qualities, that is, task-unrelated qualities, can also play a role. For example, including hedonic components (task-unrelated graphics, color, and music) increased an information system’s enjoyment and usage.13 Similarly, the perception of hedonic quality (task-unrelated aspects such as novelty or originality) substantially contributed to the overall appeal of software prototypes for process control tasks14 and different visual display units—a standard CRT, an LCD flat screen, and a computer image projected on the desktop.15 Both studies demonstrate that task-related and -unrelated quality aspects seem to compensate for each other from the user’s perspective. In other words, extremely usable but tedious software might be as appealing to a user as an extremely
unusable but thrilling one. Thus, exploring and understanding the nature of hedonic quality as a software requirement and, furthermore, the dependence between hedonic quality and task-related quality (utility and usability) is a valuable road toward designing for joy of use. The driving forces behind hedonic quality The definition of hedonic quality as taskunrelated quality is clearly too broad to guide design. The driving forces behind the scene might be more specific qualities such as the need for novelty and change and the need to communicate and express oneself through objects.16 Though these may not be the only needs, they exemplify the two-edged nature of the forces behind hedonic quality. One part is directed inward, concerning the individual’s personal development or growth; the other part is directed outward, concerning social and societal issues. If users perceive a software product as potentially capable of satisfying the need for personal growth and status, it has hedonic quality. The perception of hedonic quality (or lack of it) will affect the user’s preference for a given software product. Need for novelty and change Several areas of research have found evidence of a general human need for novelty and change. Daniel Berlyne, for example, states that our central nervous system is designed to cope with environments that produce a certain rate of stimulation and challenge to its capacities.17 We reach best performance at a level of optimal excitement, where neither overstimulation nor monotony are present. The same notion exists in Mihaly Csikszentmihalyi’s optimalexperience concept.18 Optimal experience or flow describes the state when somebody is completely wrapped up in an activity. The crucial determinant is the certainty that the activity is challenging but attainable—it has the optimal level of excitement. In a home automation system evaluation study,19 we found that individuals with a technical job background reported the system to be of low hedonic quality compared to individuals with a nontechnical job background. We suppose that technically educated individu-
als are more likely to possess knowledge about existing home automation system functionality, so they don’t find the functionality excitingly new. Conversely, for individuals with nontechnical job backgrounds, the system provided the means to do things they could not do before. Indeed, the focus during system design was on usability and visual design rather than on adding exciting new functionality. In this aspect, the experiment did not address the technically oriented users’ need for challenge and stimulation. Strikingly, taking the need for novelty and change into account might unavoidably imply a reduction of usability. Usability and joy of use might be partially incompatible, because the former requires consistency and simplicity, whereas the latter requires surprise and a certain amount of complexity.20 Designers need to introduce novelty with care. User interfaces that are too novel and unfamiliar are likely to evoke strong adequacy concerns instead of hedonic quality perceptions.21 What is needed is a way to determine an optimal level of novelty. Need to communicate and express oneself through objects This need addresses the social dimension of using software. Robinson states that the artifacts people choose to use can be interpreted as statements in an ongoing “dialog” people have with other people in their environment.22 We should not underestimate the fact that using a product can evoke a state of importance. Being an expert at something that others do not understand, being able to afford something that others cannot afford, or possessing something that others desire are strong driving forces. To give an anecdotal example from our experience: The home automation system mentioned earlier had a user interface designed to be as nonintimidating as possible in order to encourage use by people with low computer expertise. The strategy succeeded. Usability tests with elderly, non-computer-literate individuals showed an astonishingly low number of severe usability problems. However, one participant with a more sophisticated technical background complained about the visual design. He said it looked like a “children’s book” and that his friends would laugh at the system’s appar-
ent lack of professionalism. Thus, designers need to develop user interfaces with status needs in mind.
Techniques for engineering hedonic quality
There is an explicit difference between knowing that hedonic quality could play a role in designing interactive systems and actively accounting for it. The latter requires practical methodical support for both design (techniques for gathering and analyzing hedonic requirements) and evaluation (metrics and techniques to measure hedonic quality). As long as you understand their advantages and disadvantages, the following techniques can fit into a design process for interactive systems.
A semantic differential for measuring perceived hedonic quality
A well-known technique for measuring how people perceive and evaluate objects is the semantic differential. The differential we employ comprises seven pairs of adjectives that characterize hedonic quality's presence or absence, evaluated on a seven-point rating scale. Each pair of extremes corresponds to opposing adjectives, such as good–bad, interesting–boring, or clear–confusing. Once the participants rate the software on each characteristic, we calculate a hedonic quality "value" by summing or averaging the ratings. Figure 1 shows the semantic differential we typically use for measuring hedonic quality14,15,19 (note that the verbal anchors used in these studies were originally in German). We can apply the differential throughout the design process for interactive systems, from the evaluation of early mock-ups or prototypes to fully operational systems. It has various advantages: the usability engineer does not require special training for using the differential, the participants can quickly and easily fill it in, and the statistical analysis is straightforward. The characteristics are high-level and deal with subjective user perceptions—that is, that "quality is in the eye of the beholder." This makes the differential applicable to various software products without needing to adjust it to the product's special features.
Figure 1. Semantic differential for measuring hedonic quality. The seven adjective pairs, each rated on a seven-point scale, are: outstanding–second-rate, exclusive–standard, impressive–nondescript, unique–ordinary, innovative–conservative, exciting–dull, and interesting–boring.
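As a minimal sketch of the scoring step described above: the adjective pairs come from Figure 1, while the numeric coding (7 marking the hedonic pole) and the sample ratings are assumptions made purely for illustration.

```python
# Illustrative sketch: averaging one participant's semantic-differential ratings
# into a single hedonic quality value. A rating of 7 is assumed to mark the
# hedonic pole (outstanding, exclusive, ...) and 1 the opposite pole.

PAIRS = [
    ("outstanding", "second-rate"),
    ("exclusive", "standard"),
    ("impressive", "nondescript"),
    ("unique", "ordinary"),
    ("innovative", "conservative"),
    ("exciting", "dull"),
    ("interesting", "boring"),
]

def hedonic_quality(ratings):
    """Average the seven ratings (each 1..7) into one hedonic quality value."""
    if len(ratings) != len(PAIRS):
        raise ValueError("expected one rating per adjective pair")
    if not all(1 <= r <= 7 for r in ratings):
        raise ValueError("ratings must be on the seven-point scale")
    return sum(ratings) / len(ratings)

# One participant's ratings for a prototype, given in the order of PAIRS:
print(hedonic_quality([6, 5, 6, 4, 7, 6, 5]))  # about 5.57 on the 1..7 scale
```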
The differential's general applicability is also one of its major disadvantages. Although it can show the extent to which users regard a piece of software as hedonic, the underlying reasons (determinants of hedonic quality or lack thereof) remain unknown. However, it is exactly the understanding of the underlying reasons that proves to be most important for stimulating and improving a software product design, especially when it comes to a premature construct such as hedonic quality. Another problem associated with the nature of hedonic quality is the solely operational definition that the differential provides. Without a theoretically solid definition, there is always the danger of missing an important facet of hedonic quality.
Repertory grid technique
A way to overcome the differential's problems is the repertory grid technique (RGT).23,24 George Kelly assumes that individuals view the world (persons, objects, events) through personal constructs. A personal construct is a similarity–difference dimension comparable to a semantic differential scale. For example, if you perceive two software products as being different, you might come up with the personal construct "too colorful—looks good" to name the opposed extremes. On the one hand, this personal construct tells something about you, namely that too many colors disturb your sense of aesthetics. On the other hand, it also reveals information about the products' attributes. From a design perspective, we are interested in differences between software products rather than differences in individuals, so we focus on what the personal constructs of a group of users might tell us about the products they interact with. RGT deals with systematically extracting personal constructs. It consists of two steps: construct extraction and product rating. For construct extraction, we present individuals with a randomly drawn triad from a software products set, marking the "design space" we are interested in.
Table 1. Example Constructs (Pole A – Pole B)
1. Does not take the problem seriously – Takes the problem seriously
2. Inappropriately funny – Serious
3. Non-expert-like – Technically appropriate
4. All show, no substance – Technology-oriented
5. Playful – Expert-like
6. Has been fun – Serious (good for work)
They must answer in what way two of the three products are similar to each other and different from the third. This procedure produces a construct that accounts for a perceived difference. The people then name the construct (for example, playful–serious, two-dimensional–three-dimensional, ugly–attractive) indicating which of the two poles they perceive as desirable (having positive value). We repeat the process until no further novel construct arises. The result is a semantic differential solely based on each individual's idiosyncratic view. In the product rating step, we ask people to rate all products on their personal constructs. The result is an individual-based description of the products based on perceived differences.
Designers can apply RGT in various forms throughout the user-centered design process for interactive systems. A promising application might be "diagnostic benchmarking," for example, comparing your current, future, and competitors' Web sites.25 We recently used RGT to explore the differences between design studies for control room software resulting from a parallel design session.21 Table 1 shows some example constructs from this study. These constructs illustrate that the participants were concerned about the adequacy of some of the designs for a work domain. At least two different views became apparent: some participants believed that control room software must look serious (constructs 1–4), maybe to induce trustworthiness (constructs 1 and 4) and perceived control (construct 3). Other participants acknowledged the hedonic quality of some designs (and the enjoyment they derived from them) (construct 5) but emphasized the dichotomy between leisure and work (constructs 5 and 6). This illustrates the rich information that we can obtain by RGT. In the study just mentioned, we extracted 154 constructs from 10 participants, covering topics such as quality of interaction and presentation, hedonic quality, and adequacy concerns (participants' belief about the extent to which the prototype is suitable for the task).
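The outcome of the two RGT steps can be kept in a simple grid of products rated on each elicited construct, which is then easy to inspect for the constructs that separate the designs most strongly. The sketch below is illustrative only, not the authors' tooling: the construct names are the examples mentioned in the text, and the products and ratings are invented.

```python
# Illustrative sketch: one participant's repertory grid as a constructs-by-products
# rating matrix (1 = left pole applies fully, 7 = right pole applies fully).
# Products and ratings are invented; construct names follow the text's examples.

grid = {
    ("playful", "serious"):                   {"Design A": 2, "Design B": 6, "Design C": 5},
    ("two-dimensional", "three-dimensional"): {"Design A": 1, "Design B": 2, "Design C": 6},
    ("ugly", "attractive"):                   {"Design A": 4, "Design B": 5, "Design C": 3},
}

def construct_spread(grid):
    """Rank constructs by how strongly they separate the rated products."""
    spreads = {
        construct: max(ratings.values()) - min(ratings.values())
        for construct, ratings in grid.items()
    }
    return sorted(spreads.items(), key=lambda item: item[1], reverse=True)

for (left, right), spread in construct_spread(grid):
    print(f"{left} - {right}: spread {spread}")
```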
RGT has a number of advantages:
■ It is a theoretically grounded23 and structured approach, but nevertheless open to each participant's individual view. The focus on individual (personally meaningful) constructs is a clear advantage over the semantic differential. The differential can only measure what we define to be hedonic quality. In other words, the participants must use our constructs (the scales we provide), regardless of whether they are meaningful to them and cover the topics relevant to them.
■ RGT is more efficient than comparable open approaches such as unstructured interviews. Focusing on the personal constructs as data denotes a significant reduction in the amount of data to be analyzed compared to transcribing and analyzing unstructured interviews. This is especially important in the context of parallel design, or benchmarking, when many alternatives are under study.
■ Personal constructs have the potential to be design-relevant. The whole approach is likely to generate different views on software products, embodying various individual needs and concerns in relation to the product and its context of use. This again is something the semantic differential neglects.
■ The basic method can be applied to almost any set of software products.
The method’s main disadvantage is the amount of effort invested. While we can use the semantic differential as an add-on to a regular usability test or as an online questionnaire, an RGT study is a self-contained method in which the experimenter needs considerable training. Another disadvantage is that RGT relies on comparisons and its application is therefore confined to situations where at least four alternatives are available. Shira interviewing Rainer Wessler, Kai-Christoph Hamborg (University of Osnabrück), and Marc Hassenzahl have recently developed a new analysis method that avoids at least the multiple-alternative problem associated with RGT. Structured hierarchical interviewing for requirement analysis (Shira)26 is an in-
terviewing technique that seeks to explore the meaning of product attributes such as “controllable,” “simple,” “impressive,” or “innovative” for a specific software application in a specific context of use. Shira starts from a pool of attributes covering usability aspects (such as “controllable”) and hedonic qualities (such as “innovative”). We first introduce participants to a possible software application and its intended context of use—for example, a home automation system or software for writing one’s diary. In a second step, we ask the participants to select an attribute from the pool that is important to them with regard to the software (Figure 2 shows an example dealing with the attribute “simple”). Starting from the attribute, they then list software features that would justify attaching that attribute. By repeatedly answering questions such as “what makes a home automation system seem innovative to you,” they will generate a list of features that contain context and the attribute’s software-specific determinants (for example, “user-friendly” and “not patronizing”). The resulting list comprises the context level. In the third step, the participants must produce recommendations for each entry in the context level suggesting how the actual design could address the feature (for example, “adaptive, learning, intelligent system that works more or less independently and requires little attention from the user”). We call this the design level. The result is a hierarchical, personal model of attributes that are important to the participants with regard to a specified software product, what these attributes actually mean to them, and how they can be addressed by the design. Shira is a systematic way to get in-depth data and detailed insights into an individual’s expectations of a specified software system. Its hierarchical representation facilitates getting a better idea of central and peripheral aspects (attributes, features, or design recommendations). In particular, Shira has the power to gather hedonic requirements by using hedonic attributes as stimulation. By integrating personal models into a group model, we obtain a rich body of information about the system’s design space. Shira is especially suited to gather information at early stages of the design process for interactive systems. However, it might
also be possible to evaluate software at a later stage regarding how it fits the user’s expectations. Shira is still at an early development stage. It is too early to assess advantages and disadvantages. However, from our preliminary experience with the technique, it seems to provide detailed design-relevant data in a structured form that facilitates interpretation and integration of multiple personal perspectives.
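Because a Shira interview yields a hierarchy of attribute, context, and design levels, its output maps naturally onto a nested structure. The sketch below is illustrative only; its entries are taken from the Figure 2 example for the attribute "simple" in a home automation context.

```python
# Illustrative sketch: one participant's Shira personal model as a nested
# structure (attribute level -> context level -> design level). The entries
# follow the Figure 2 example for a home automation system.

personal_model = {
    "simple": {
        "user-friendly": [
            "System remembers previous interactions",
            "System adapts to my habits",
            "I only have to specify exceptions to the rule",
            "Common sense (for example, exclude Saturdays and Sundays "
            "from the daily morning wake-up call)",
        ],
        "not patronizing": [
            "The system may remind me but must not order me",
            "I do not want to feel like the system knows everything and I know nothing",
        ],
    },
}

# Walking the hierarchy yields a flat list of design recommendations per attribute,
# which is convenient when merging several personal models into a group model.
for attribute, contexts in personal_model.items():
    for context, recommendations in contexts.items():
        for recommendation in recommendations:
            print(f"{attribute} -> {context}: {recommendation}")
```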
Usability and utility are basically about how well software supports people in getting their jobs done. However, task-unrelated qualities can play a crucial role. Traditional usability engineering methods are not adequate for analyzing and evaluating hedonic quality and its complex interplay with usability and utility. The techniques we have suggested might significantly broaden usability engineering practices by shifting the focus to a more holistic perspective on human needs and desires.
Figure 2. Portion of a personal model gathered by using Shira. Attribute level: Simple. Context level: User-friendly, with design-level recommendations: System remembers previous interactions; System adapts to my habits; I only have to specify exceptions to the rule; Common sense (for example, system automatically excludes Saturdays and Sundays from the daily morning wake-up call). Context level: Not patronizing, with design-level recommendations: The system may remind me but must not order me; I do not want to feel like the system knows everything and I know nothing.
In the future, we might see usability engineering evolving toward more complete user experience design—one that encompasses the joy of use.
Acknowledgments This article was partly funded by the German Ministry for Research (BMBF) in the context of INVITE (01 IL 901 V 8). See www.invite.de for further information. We are grateful to Uta Sailer for her helpful comments on an earlier draft of the article.
References 1. J. Nielsen, Usability Engineering, Academic Press, Boston, San Diego, 1993. 2. H. Dreyfuss, Designing for People, Simon & Schuster, New York, 1995. 3. J.D. Gould and C.H. Lewis, “Designing for Usability: Key Principles and What Designers Think,” Comm. ACM, vol. 28, no. 3, Mar. 1985, pp. 300–311.
About the Authors Marc Hassenzahl works at User Interface Design GmbH in Munich. He is involved in projects ranging from usability evaluation of automation software to user interface design for computer chip design tools. His research interests are appealing user interfaces, especially hedonic qualities, and related new analysis and evaluation techniques. He studied psychology and computer science at the Technical University, Darmstadt. Contact him at User Interface Design GmbH, Dompfaffweg 10, 81827 Munich, Germany;
[email protected]. Andreas Beu works at User Interface Design GmbH in Mu-
nich. He is interested in user interface design for small displays, wearable computers, and augmented reality systems. He studied mechanical engineering at the University of Stuttgart. Contact him at User Interface Design GmbH, Dompfaffweg 10, 81827 Munich, Germany;
[email protected]. Michael Burmester is head of the Munich office of User Interface Design GmbH, a software and usability consultancy company. Results and experiences of his research and consultancy work are published in over 40 scientific and technical papers. He studied psychology at the University of Regensburg in southern Germany. Contact him at User Interface Design GmbH, Dompfaffweg 10, 81827 Munich, Germany;
[email protected].
4. ISO-13407 Human-Centred Design Processes for Interactive Systems,, Int’l Organization for Standardization, Geneva, 1999. 5. N. Bevan, “Usability is Quality of Use,” Proc. HCI Int’l 95, Lawrence Erlbaum Associates, Mahwah, N.J., 1995, pp. 349–354. 6. R. Glass, “The Looking Glass,” www.sun.com.au/news/ onsun/oct97/page6.html (current 15 Jan. 2001). 7. D.A. Norman, The Invisible Computer, MIT Press, Cambridge, Mass., 1998. 8. B. Glass, “Swept Away in a Sea of Evolution: New Challenges and Opportunities for Usability Professionals,” Software-Ergonomie ’97. Usability Engineering: Integration von Mensch-Computer-Interaktion und Software-Entwicklung, R. Liskowsky, B.M. Velichkovsky, and W. Wünschmann, eds., B.G. Teubner, Stuttgart, Germany, 1997, pp. 17–26. 9. E. Hollnagel, “Keep Cool: The Value of Affective Computer Interfaces in a Rational World,” Proc. HCI Int’l 99, vol. 2, Lawrence Erlbaum Associates, Mahwah, N.J., 1999, pp. 676–680. 10. M. Igbaria et al., “The Respective Roles of Perceived Usefulness and Perceived Fun in the Acceptance of Microcomputer Technology,” Behaviour & Information Technology, vol. 13, no. 6, 1994, pp. 349–361. 11. N. Millard et al., “Smiling through: Motivation at the User Interface,” Proc. HCI Int’l ’99, vol. 2, Lawrence Erlbaum Associates, Mahwah, N.J., 1999, pp. 824–828. 12. S.W. Draper, “Analysing Fun as a Candidate Software Requirement,” Personal Technology, vol. 3, no. 1, 1999, pp. 1–6. 13. N. Mundorf et al., “Effects of Hedonic Components and User’s Gender on the Acceptance of Screen-Based Information Services,” Behaviour & Information Technology, vol. 12, no. 5, 1993, pp. 293–303. 14. M. Hassenzahl et al., “Hedonic and Ergonomic Quality Aspects Determine a Software’s Appeal,” Proc. CHI 2000 Conf. Human Factors in Computing Systems, ACM Press, New York, 2000, pp. 201–208. 15. M. Hassenzahl, “The Effect of Perceived Hedonic Quality on Product Appealingness,” Int’l J. Human–Computer Interaction, submitted for publication. 16. R.J. Logan et al., “Design of Simplified Television Remote Controls: A Case for Behavioral and Emotional Usability,” Proc. 38th Human Factors and Ergonomics Soc., Santa Monica, Calif., 1994, pp. 365–369. 17. D.E. Berlyne, “Curiosity and Exploration,” Science, vol. 153, 1968, pp. 25–33. 18. M. Csikszentmihalyi, Beyond Boredom and Anxiety, Jossey-Bass, San Francisco, 1975. 19. M. Hassenzahl et al., “Perceived Novelty of Functions—A Source of Hedonic Quality,” Interfaces, vol. 42, no. 11, 2000, p. 11. 20. J.M. Carroll and J.C. Thomas, “Fun,” ACM SIGCHI Bull., vol. 19, no. 3, 1988, pp. 21–24. 21. M. Hassenzahl and R. Wessler, “Capturing Design Space from a User Perspective: The Repertory Grid Technique Revisited,” Int’l J. Human-Computer Interaction, vol. 12, no. 3–4, 2000, pp. 441–459. 22. L. Leventhal et al., “Assessing User Interfaces for Diverse User Groups: Evaluation Strategies and Defining Characteristics,” Behaviour & Information Technology, vol. 15, no. 3, 1996, pp. 127–137, and references therein. 23. G.A. Kelly, The Psychology of Personal Constructs, vols. 1–2, Norton, New York, 1955 (reprinted by Routledge, 1991). 24. F. Fransella and D. Bannister, A Manual for Repertory Grid Technique, Academic Press, London, 1977. 25. M. Hassenzahl and T. Trautmann, Analysis of Web Sites with the Repertory Grid Technique, submitted for publication. 26. R. Wessler et al., Orientation, Understanding and Decision-Making—a User-Centred Approach to Guide the Design of Prototypes, submitted for publication.
country report
Editor: Deependra Moitra ■ Lucent Technologies ■ [email protected]
India’s Software Industry Deependra Moitra
On my way to New Jersey recently, I was seated next to a businessman. To avoid boredom, I grabbed the earliest opportunity to introduce myself. When I told him that I was from India traveling to the US on business, he exclaimed, "India! You must be a software engineer." Such is the reputation of India's software industry that the world has taken notice, and India today has a distinct identity as a software superpower. With the world's second largest pool of English-speaking scientific and technical professionals, India boasts a US$5.7 billion software industry with an annual growth rate of more than 50 percent. As the software industry increasingly becomes a major driver of the nation's economy and policymakers devise ways to fuel its growth, India's software industry is poised for massive expansion. As a matter of fact, policymakers and industry leaders envision this industry's growing to more than US$80 billion by 2008 (with US$50 billion worth of software exports).
India: The land of contrasts
Is India a developing nation with more than 40 percent illiteracy and a large population living below the poverty line, or is it the land of world-class technical brains and entrepreneurs? The answer is both, and much more. India is a land of such contrasts, paradoxes, and inconsistencies that the answer really depends on the viewer's perspective. It is one of the world's oldest civilizations, with a
very rich cultural heritage. Home to more than one billion people, it is the fifth largest economy and the largest democracy in the world. In the last two decades, India's image has transformed from the land of snake charmers to one of top-notch software soldiers. India has been primarily an agrarian economy but, because of the fast pace of growth in the high-tech sector, software and IT are fast becoming a critical component of India's economic growth. Historically, the Indian software industry began by providing onsite contracting services ("body shopping") to US and European organizations. Gradually, the trend shifted from onsite services to offshore development, which accounts for approximately 50 percent of the total revenue. One reason for this shift relates to restrictions on US visas. The US government used to limit the number of visas for foreigners and require that foreign nationals working in the US be paid comparable wages to those of their American counterparts. It also imposed a local tax for contracting foreign nationals, resulting in serious cost implications for onsite services and thus giving birth to the offshore development model as the only economically viable option.
India's competitive advantage
The low cost of onsite services was the original growth driver for the Indian software industry. As the offshore model became increasingly popular, the industry has experienced rapid growth and global recognition—so much so that the market capitalization of Indian software companies has skyrocketed from approximately US$4 billion in January
A New Department As software becomes more and more critical to products and services, and as most businesses become software dependent, developing and capitalizing on a strong software capability is increasingly the mantra for competing in the new economy. As a result, different nations are competing for huge software business opportunities to fuel their economic growth. IEEE Software will publish a column periodically to bring its readers a comprehensive account of the software industry in various countries. These columns will cover all the dimensions of the software industry in both major and emerging “software nations,” along the lines of this article. The next issue will carry an account of the Irish software industry—another strongly emerging software nation gaining prominence on the world map. We’d like to hear from you about this new department. Please send your comments and suggestions to
[email protected]. —Deependra Moitra
1999 to nearly US$55 billion at the beginning of June 2000. Today, India exports software to about 100 countries around the globe, with North America being the major market (62 percent). Nearly 200 Fortune 500 companies either have their development centers in India or outsource development to India. So, what really is India's competitive advantage? Five factors contribute to the growth of India's software industry:
■ availability of a highly competent and large talent pool,
■ world-class quality and high process maturity,
■ competitive cost structures,
■ rapid delivery capability, and
■ no language barrier (English-speaking resource pool).
People—the raw material
As mentioned earlier, India has the world's second largest pool of English-speaking scientific and technical professionals. The Indian software industry employs approximately 300,000 people, 80 percent men and 20 percent women. However, to reach US$80 billion by 2008, the industry must add approximately 200,000 people a year. Indian universities annually churn out nearly 90,000 engineering graduates, with many private training houses producing similar numbers (with varying degrees of skill and quality of output). Although the government has set up a new generation of institutes, called Indian Institutes of Information Technology, in response to the pressure for high-quality resources, it still needs to
do much to address the projected gap between supply and demand. A real war for talent is underway. With increasing competition for resources, the ability to attract, develop, and retain high-quality employees is indeed the name of the game. Companies are using a plethora of approaches: market hiring (hiring experienced people from other companies), advertisements, job portals, campus recruitment, headhunters, joint industry–academia programs, employee referrals, and retaining freelance headhunters to work exclusively for a particular company. Still, no single approach adequately meets the need. Many companies use an approach that could be termed “catch them young,” creating mind share even at the high school level. Some companies hire qualified housewives to work from home on a parttime basis. Interestingly, reverse “brain drain” is also taking place; increasingly, qualified Indians abroad are returning home to be part of an exciting and happening industry. However, the competition for talent has also given rise to a disturbingly high employee turnover. With rising internal competition and continued overseas demand, the average industry attrition ranges between 12 and 35 percent, leading to a very high cost of hiring and employee development. For knowledge-intensive activities, such as high-tech product development, attrition means not only losing people to competitors but also knowledge walking out of the organization. The competitive landscape for the talent pool is worsening, with
many European countries opening their gates to Indian software professionals. One recent example is the German government's move to open the IT sector to Indian software professionals to meet their widening supply–demand gap for qualified human resources. Clearly, talent acquisition and retention is fast becoming the centerpiece of companies' competitive strategy. Many firms are recruiting based on learnability, and many large companies—such as TCS, Wipro, Satyam, and Infosys—have established their own training hubs and learning centers. Then there are those such as NIIT and Pentafour that follow a "horse and hay approach"—running expensive software and IT training programs and spotting talented people whom they eventually absorb into their workforce. To combat the retention issue, they use many strategies, including providing
■ continuous learning opportunities and increasing employability,
■ high quality of work and work life,
■ overseas assignments,
■ competitive compensation and pay for performance,
■ perks, loans, and recreation facilities,
■ wealth creation opportunities such as employee stock options,
■ support for distance learning, and
■ career progression and management.
The key to solving the retention issue, however, is capitalizing on the existing emotional reservoir in each organization and effectively managing
employee expectations. With the Indian software professional’s median age being only 26.5 years, managing the industry’s raw material—the people—is indeed a complex undertaking. Younger employees have different expectations and priorities from their older counterparts and are often unsure as to their goals. To stay competitive, many companies have established human-resources differentiators who go beyond the work environment and look at employees’ personal, social, and family needs. The quality bandwagon As the overseas companies contracting work to India needed some assurance of the quality of deliverables, Indian firms responded by launching major quality initiatives to boost confidence. Over time, as the number of players competing for a share of the global software business multiplied and competition among the software development houses became cutthroat, companies increasingly focused on quality as a way to create a strategic differentiator in the marketplace. In addition, as some level of quality certification became a prerequisite for doing business, the trend toward model-based quality and process improvement began. Today, however, the situation is changing. As Indian software organizations face stiff competition from other fastemerging software nations such as Ireland, China, and Israel, a focus on quality is a competitive necessity as opposed to a strategic advantage— and that is for the better. Today, more than 175 software companies have ISO 9001 certification and nearly 50 boast of an SWCMM level 3 or higher ranking. Interestingly, more than 55 percent of the CMM level 5 companies in the world are in India, and many companies have embraced P-CMM and Six Sigma. Unfortunately, a focus on quality and certification is still being used as a marketing instrument. The drive for attaining a particular certification or CMM level has led to a predominantly compliance-based approach instead of really driving busi-
ness excellence and innovation through quality and processes. With the average business productivity per employee so low (US$35K–50K per person per year) and the same probably true of software development productivity, we need a strategic focus on processes combined with improved technology integration. India’s software competence The Indian software industry has focused primarily on providing software services in almost all possible areas: systems software, telecommunications, e-commerce, medical systems, automotive software, Webbased development and multimedia applications, and applications software for the insurance, banking, and retail industries. The focus on developing products has almost been nonexistent, and only recently have a new breed of high-tech entrepreneurs and multinationals launched major product-based organizations in India. While software design and development skills are really on par with the best by global standards, India’s capabilities in project management and new product innovation must grow to retain its hard-earned competitive edge. Enabling initiatives In May 1998, the government of India formed an IT task force to provide thrust to India’s software sector.
This thrust has become more strategic with the formation of the Ministry of Information Technology this year. With continued efforts to improve the infrastructure and simplify policies and procedures, the government is helping accelerate the industry’s growth. It uses such measures as the establishment of software technology parks and export processing zones offering tax holidays on software exports, new working capital guidelines, liberalization of telecom policy, and relaxing the taxation policy for employee stock options. India’s software cities Bangalore is by far the hottest of all the software cities in India and is often termed India’s software hub or the Silicon Valley of India. Hyderabad—also popularly called Cyberabad—is another city with a lot of action. Its chief minister wooed Bill Gates to open a Microsoft development center there, and it earned prominence as a high-tech city when US President Bill Clinton visited it on his trip to India this year. Mumbai (formerly Bombay) and Pune in the western part of India and Delhi, India’s capital in the north, are also teeming with software professionals. Chennai (formerly Madras) and Calcutta are quickly catching up and attracting investments from entrepreneurs and large companies.
India has come a long way toward prominence as a software superpower, but it must address certain issues to continue to enjoy its current reputation. Innovative thinking and practices to retain knowledge workers will be critical to the industry's success in the long run. Also, a catalyst for the industry's growth clearly has been low costs; this advantage must be sustained through continued productivity and infrastructure improvement. The rising cost of salaries, the cost of attrition, and the lack of a world-class infrastructure seem to be weakening the cost advantage. Other important issues requiring immediate
attention include brand-building, significantly improving the quality of training, securing global parity in telecom infrastructure, and creating an ideal regulatory framework. Also, in the last two years the industry’s real “cash cows” were projects in Y2K and Euro conversion, but now it must reposition itself with a renewed set of capabilities to maintain its growth rate. More focus on product development to move up the value chain and ensure higher revenue generation requires a serious and immediate effort. Many multinational companies such as Texas Instruments, Novell, Oracle, and Parametric Technologies are getting their strategic products developed at their R&D centers in India, but Indian companies lag far behind in R&D. The latter must significantly increase R&D spending to about 3 percent of the total budget from the current approximate of 1.5 percent. Huge opportunities lie in such areas as e-commerce, Web-based technologies, convergence technologies, mobile Internet devices, and application service providers—but to exploit them the industry must focus continuously on upgrading skills, especially in project execution and management and high-quality, rapid delivery. The ability to innovate and respond
to changing market needs and everincreasing customer expectations will certainly be a key factor. Also, innovative outsourcing models and a significantly improved ability to organize and execute geographically distributed software engineering will distinguish India and keep it ahead as other nations compete to become software superpowers. The Indian software industry has come a long way from being a small US$50 million industry in 1990 to nearly a US$5.7 billion industry in the financial year 1999–2000 (see Table 1). With several Indian companies globalizing and some already listed on NASDAQ, and with a continuous flow of venture capital investments, the sector is really hot. The recent visits of Jack Welch (General Electric) and Bill Gates, during which they announced their megaplans for India, have given further impetus to the industry. Similarly, many other high-profile CEOs and entrepreneurs are bullish on India. However, to be able to reciprocate and capitalize on such opportunities, the Indian software industry must revitalize itself and innovate relentlessly to meet the expectations of people as demanding as Welch and Gates. Indian software community, are you listening?
For More Information The best source of information on the Indian software industry is the Web site of the National Association of Software and Services Companies (NASSCOM), www.nasscom.org. It provides a comprehensive account of the industry, ways of doing business in India, regulatory mechanisms and government policies, and so forth. It also offers a database of Indian software companies and growth areas and an executive summary of the NASSCOM-McKinsey 1999 report on India’s software industry. The Web site for Dataquest magazine (www.dqindia.com), currently under construction, will be another good place to find information about what is happening in the Indian software and IT sector. This magazine publishes a comprehensive analysis of trends and performance each year in July and August. The December 2000 issue of CIO magazine (www.cio.com) carries a detailed account of the Indian software industry based on a study and series of interviews done by senior editors Tom Field and Cheryl Bentsen. The diary pertaining to their India visit can be found at www.cio.com/forums/global/edit/ 082100_indialetter.html.
Table 1. Financial Snapshot of India's Software Powerhouses (export revenues)
Company – Software exports in 1999–2000 (in million US$, assuming US$1 = Rs. 44.00)
Tata Consultancy Services – 413.72
Wipro Technologies – 237.30
Infosys Technologies – 197.66
Satyam Computer Services – 150.67
HCL Technologies – 143.85
NIIT – 125.41
Silverline Technologies – 98.83
Cognizant Technology Solutions – 94.15
Pentamedia Graphics – 88.42
Pentasoft Technologies – 80.23
Patni Computer Systems – 67.16
IBM Global Services India – 61.62
DSQ Software – 59.63
Mastek – 54.80
Mahindra British Telecom – 53.45
HCL Perot Systems – 48.28
i-Flex Solutions – 43.78
Tata Infotech – 43.39
Zensar Technologies – 42.12
Birlasoft – 37.31
Acknowledgments Most of the data presented here is based primarily on NASSCOM (National Association of Software and Services Companies) reports and its Web site, www.nasscom.org. I have made every effort to ensure the accuracy of the information presented here, but neither I nor IEEE Software is responsible for any error. The views expressed in this report are mine and do not in any way reflect the views of my organization. I thank Asit Pant for his help in preparing this report.
Deependra Moitra is general manager of engineering
at Lucent Technologies’ India R&D Program in Bangalore and an adjunct faculty member at the Indian Institute of Information Technology. His current interests are in software engineering management, business and technology strategy, management of technology and innovation, new product innovation management, knowledge management, and entrepreneurship in software and the high-tech industry. He has a BTech in instrumentation and control engineering from the University of Calcutta. He serves on the editorial boards of several international journals, including Research & Technology Management Journal, Technology Analysis and Strategic Management Journal, The International Journal of Entrepreneurship and Innovation, The Journal of Knowledge Management, Knowledge and Process Management Journal, and IEEE Software. Contact him at
[email protected].
feature software testing
Validating and Improving Test-Case Effectiveness Yuri Chernak, Valley Forge Consulting
Effective software testing before release is crucial for product success. Based on a new metric and an associated methodology for in-process validation of test-case effectiveness, the author presents an approach to improving the software testing process.
Managers of critical software projects must focus on reducing the risk of releasing systems whose quality is unacceptable to users. Software testing helps application developers manage this risk by finding and removing software defects prior to release. Formal test methodology defines various test types, including the function test.1 By focusing on a system's functionality and looking for as many defects as possible, this test bears most direct responsibility for
ultimate system quality. Implementing the function test as a formal process lets testers cope better with the functional complexity of the software application under test. Testplan documents and test-case specifications are important deliverables from the formal test process.2 For complex systems, test cases are critical for effective testing. However, the mere fact that testers use test-case specifications does not guarantee that systems are sufficiently tested. Numerous other factors also determine whether testers have performed well and whether testing was effective. Ultimately, testers can only evaluate complete testing effectiveness when a system is in production. However, if this evaluation finds that a system was insufficiently tested or that the test cases were ineffective, it is too late to benefit the present project. To reduce such a risk, the project team can assess testing effectiveness by performing in-process evaluation of test-case effectiveness. This way, they can identify problems
and correct the testing process before releasing the system. This article describes a technique for in-process validation and improvement of test-case effectiveness. It is based on a new metric and an associated improvement framework. These work together to improve system quality before its release into production.
Evaluation = verification + validation
The evaluation process certifies that a product fits its intended use. For software project deliverables in general, and test cases in particular, evaluation commonly consists of verification and validation tasks.3
Verification
To start, a project team must first verify test-case specifications at the end of the test-design phase. Verifying test cases before test execution is important; it lets the team assess the conformance of test-case specifications to their respective requirements. However, such
conformance does not mean that the test cases will automatically be effective in finding defects. Other factors also determine whether test cases will be effective in the test cycle. These include design of test cases using incomplete or outdated functional specifications, poor test-design logic, and misunderstanding of test specifications by testers. Verification activities commonly used include reviews or inspections and traceability analysis. Reviews or inspections let us evaluate test-case specifications for their correctness and completeness, compliance with conventions, templates, or standards, and so forth. Traceability matrices or trees, on the other hand, let testers trace from the functional specifications to the corresponding test-case specifications, which ensures that all functional requirements are covered by the given test cases. Nevertheless, test cases that passed verification could have weak failure-detecting ability and, therefore, should be required to pass validation as well. Validation Validation can proceed as soon as testers have executed all test cases, which is at the end of the test process’s test-execution phase. As its main objective, test-suite validation determines whether the test cases were sufficiently effective in finding defects. If a test suite was effective, the system under test will likely be of high quality and users will be satisfied with the released product. But, if a test suite was not effective, there is a high risk that the system was not sufficiently tested. In such cases, the users will likely be dissatisfied with the system’s quality. If test-case effectiveness has not proved satisfactory, it is not too late to analyze the reasons and correct the test process. Using the proposed improvement framework, testers can revise and improve the test suite, and then execute the tests again. This, in turn, can help them find additional defects and thus deliver a better software product. The test-case effectiveness metric To perform validation objectively, testers need a metric to measure the effectiveness of test cases. When testing online mainframe systems, and especially client–server systems, a certain number of defects are always found as a side effect. By side effect, I mean the situation where testers find defects by executing
some steps or conditions that are not written into a test-case specification. This can happen either accidentally or, more frequently, when the tester gets an idea on the fly. When defining a metric for test-case effectiveness, we can assume that the more defects test cases find, the more effective they are. But, if test cases find only a small number of defects, their value is questionable. Based on this logic, I propose a simple test-case effectiveness metric, which is defined as the ratio of defects found by test cases (Ntc) to the total number of defects (Ntot) reported during the function test cycle:

TCE = (Ntc / Ntot) × 100%

More precisely, Ntot is the sum of defects found by test cases and defects found as a side effect. The proposed TCE metric might resemble Jones' defect removal efficiency (DRE) metric,4 which is defined as the ratio of defects found prior to production to the total number of reported defects. The important distinction is that DRE has the purpose of evaluating user satisfaction with the entire test process. It measures the test-process effectiveness and reflects a production or users' perspective on the test process. In contrast, my TCE metric serves specifically to validate the effectiveness of functional test cases. Unlike DRE, the TCE metric evaluates test cases from the test-cycle perspective, which provides in-process feedback to the project team on how well a test suite has worked for testers. As I've discussed, validation serves primarily to determine whether a test suite was sufficiently effective in the test cycle. We can conclude this by comparing the actual TCE value, calculated for the given test cycle, with a baseline value. The project team selects the latter in advance, possibly obtaining it by analyzing previous successful projects that are considered appropriate as models for current and future projects. My experience with successful client–server projects delivering business applications suggests 75 percent to be an acceptable baseline value. However, the goal for test-case effectiveness can be different for various application categories, such as commercial, military, or business applications.
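To make the calculation concrete, suppose the defect-tracking system records, for each defect logged in the function test cycle, whether it was found by a planned test case or as a side effect. The following minimal sketch computes TCE and checks it against the 75 percent baseline discussed above; the field names and data layout are assumptions for illustration, not part of any particular tool.

```python
# Illustrative sketch: computing the test-case effectiveness (TCE) metric
# for one function test cycle. Field names are hypothetical.

def tce(defects, baseline=75.0):
    """Return (TCE value in percent, True if it meets the baseline).

    `defects` is a list of dicts, each with a 'found_by' key whose value is
    either 'test case' or 'side effect'.
    """
    n_total = len(defects)
    if n_total == 0:
        raise ValueError("no defects reported; TCE is undefined")
    n_tc = sum(1 for d in defects if d["found_by"] == "test case")
    value = n_tc / n_total * 100.0
    return value, value >= baseline

# Example: 80 of 100 reported defects were found by planned test cases and
# 20 as a side effect, so TCE = 80% and the 75% baseline is met.
defects = [{"found_by": "test case"}] * 80 + [{"found_by": "side effect"}] * 20
value, acceptable = tce(defects)
print(f"TCE = {value:.1f}%  meets baseline: {acceptable}")
```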
When the TCE value is at the baseline level or above, we can conclude that the test cases have been sufficiently effective in a test cycle. In this case, the project team can anticipate user satisfaction with the system in production. But, the further the TCE value falls below the baseline level, the higher is the risk of user dissatisfaction. In such cases, the project team can correct the test process based on the framework, as I'll describe.
Improving test-case effectiveness
If in-process validation finds test-case effectiveness to be less than acceptable, the project team should analyze the causes and identify areas for test process improvement. My proposed improvement framework stems from the defect-prevention concept developed at IBM.5 The IBM approach improves the test process on the basis of causal analysis of defects, so-called test escapes, that were missed in testing. Test escapes are "product defects that a particular test failed to find, but which were found in a later test, or by a customer [in production]." I further evolve IBM's concept and suggest that the analysis of defects missed by test cases can help us improve test-case effectiveness. Hence, my improvement framework is based on test-case escapes, defined as software defects that a given suite of test cases failed to find but that were found as a side effect in the same test cycle. Once they are found by chance, we can view test-case escapes as a manifestation of deficiencies in the formal test process. Therefore, their causal analysis can help us identify areas for test process improvement. In brief, the proposed improvement framework consists of the following steps:
1. Understand and document the test process used by the project team.
2. Make assumptions about the factors affecting test-case effectiveness.
3. Gather defect data and perform causal analysis of test-case escapes.
4. Identify the main factors.
5. Implement corrective actions.
Following these steps, either a revised part or the entire test suite (as I'll discuss later) should be executed again. As a result, testers should find additional defects that justify the improvement effort. Below I discuss each of the five steps in detail. Clearly, my approach relies entirely on the analysis of defects missed by test cases. Consequently, it requires that a sufficient number of such defects be available. This fact can limit the applicability of the approach for
some projects, for example, in the testing of mainframe batch systems. Here, testers generally exercise only preplanned conditions, and the number of defects found as a side effect is usually very low in the test cycle. But, for client–server projects that implement formal testing, the share of such defects could be from 20 to 50%, which provides a valuable source of information for test-suite validation and test-process improvement. Let's look at the five steps.
Step 1. Understand, document the test process
When a project team uses written test-case specifications and focuses on their evaluation and improvement, this already indicates that a certain test process has been established and followed. The test process should be planned at the beginning of the software project and documented in a test plan. Commonly, testers define the test process in terms of the following phases: test planning, test design, test preparation and execution, and test evaluation and improvement.6–8 Each phase should be planned and defined in terms of tasks and deliverables. For example, we can define the test process as follows:
■ Test planning. In this phase, the main tasks are the definition of the scope, objectives, and approach to testing. The main deliverable is a test-plan document.
■ Test design. This involves the design of test cases, with the main deliverables being the test-case specifications.
■ Test preparation and execution. In this phase, preparation of the test environment, executing test cases, and finding defects are the necessary tasks, and the main deliverables are defect reports.
■ Test evaluation and improvement. Here, the main task is analyzing the results of testing and the main deliverable is a test summary report.
In all phases, except the last, there are a number of factors that determine the effectiveness of functional test cases in a given project. Hence, the following steps of my improvement framework focus on identifying and evaluating these factors.
Step 2. Make assumptions
Once it understands and documents the test process, the project team should analyze each phase and identify factors that can affect test-case effectiveness.
Figure 1. Factors affecting test-case effectiveness (a cause-effect diagram of the factors, grouped by the test-planning, test-design, and test-execution phases: incomplete functional specifications, incorrect functional specifications, incorrect test specifications, incomplete test design, incomplete test suite, and test-execution problems).
Test planning. The main deliverable of the test-planning phase is a test-plan document that, among other things, defines the scope and objectives of testing. We can define test objectives as features to be tested2 that, in turn, should be traced to functional specifications. If the functional specifications do not completely define functional features, the test-plan document will not be complete either. Hence, the test cases will not completely cover a system's functionality, thereby reducing their effectiveness.

Test design. When writing a test-case specification, we usually begin by understanding and analyzing the corresponding business rule that is the object of the test. Then, we consider the test logic required for testing this functional feature. To identify necessary test cases, we can use test design techniques such as decision tables, equivalence partitioning, boundary analysis, and so forth.1,7,8 The test-design phase can give rise to other factors that affect test-case effectiveness. First, the test suite might be incomplete and some of the business rules in the functional specifications might not be covered by test cases. Second, test-design logic could be incomplete and some of the necessary test conditions could be missing in test-case specifications. A common example of this situation is a lack of negative test cases. By definition, a test case is negative if it exercises abnormal conditions by using either invalid data input or the wrong user action. Finally, a third factor is that the test-case specifications could simply be incorrect. For example, a source document—a corresponding functional specification—could be incorrect or unclear, or there might be an error in the test-case specification itself. All the deficiencies identified in the test-planning and test-design phases will ultimately require addition of new and revision of existing test-case specifications and their retesting.

Test preparation and execution. The test-
execution phase itself can be a source of factors that reduce test-case effectiveness. For example, some test cases might not be executed
or might be executed incorrectly. In addition, a tester might overlook defects, especially when the verification of expected results is not straightforward. Based on our experience, only a small number of test-case escapes stem from test-execution factors. Therefore, these factors will probably not be central to the test-case effectiveness improvement effort. However, further analysis at Step 4 might show that the proportion of defects in this category is significant. In such cases, a detailed evaluation of test-execution factors should be performed. Figure 1 shows these factors in the form of a cause–effect diagram. I have grouped the factors according to the test-process phases in which they originate. However, at this point, they are just assumptions that should be evaluated using the following steps to identify the factors that are mostly responsible for insufficient test-case effectiveness. Step 3. Gather defect data and perform causal analysis To perform causal analysis of test-case escapes at the end of the test-execution phase, testers must select the defects missed by test cases. This requires the use of a defect-tracking system. Also, the testers must identify which defects were found as a result of test-case execution and which were found as a side effect—that is, as test-case escapes. Once identified and selected, the test-case escapes should be classified according to one of the factors based on the causal analysis logic shown in Figure 2. This analysis is used to evaluate each test-case escape and understand why the test suite missed the corresponding defect during test execution. We can begin causal analysis by verifying that a functional specification has a business rule related to the given defect. If it does not, we have determined that the cause of this test-case escape is an incomplete functional specification. However, if it does, we need to check whether the test suite has a test specification that should have found this test-case escape. If a test-case specification does not exist, this means that the test suite does not cover all business rules. Therefore, an incomplete test suite is the reason this defect was missed. If a test specification does exist, we need to check the defect against test cases in the specification. If none of them were de-
signed to catch such a defect, this indicates that the test specification is incomplete. Indeed, all test inputs and expected results in the test-case specification might be correct. However, the specification might include, for example, only positive test cases. A lack of negative test cases in test specifications is a common cause of missed defects. This is a case of deficiency in test design that was used to derive test cases. Hence, we can specify that the cause of such test-case escapes is incomplete test design. But, if the test specification includes conditions related to a given defect, we need to verify that these test condi-
tions and the corresponding expected results are correct. If they are correct, we should conclude that test-execution problems are the likely reason that the defect was missed. If the test conditions or expected results were not correct, we need to understand why the test specification is incorrect. First, we should check the source document and see if the corresponding business rule is also incorrect. If this is the case, we should classify the cause of this test-case escape as an incorrect functional specification. Otherwise, the cause is incorrect test specification. As a result of defect causal analysis, all January/February 2001
Figure 2. Test-case escape classification logic (a decision flowchart that asks in turn whether a business rule exists, whether a test specification exists, whether the test specification is complete, whether it is correct, and whether the business rule is correct, and assigns one of the causes: incomplete functional specification, incomplete test suite, incomplete test design, test-execution problems, incorrect test specification, or incorrect functional specification).
test-case escapes should be classified according to one of the possible causes presented in Figure 1. Step 4. Identify the main factors At this point, all test-case escapes have been classified according to their respective causes. We now need to identify those “vital few” factors that are responsible for the majority of the defects being missed by test cases. For this, we can build a Pareto chart,9 which displays frequency bars in descending order, convenient for analyzing types of problems. Once identified, the most important causes will be the focus of the next step—implementation of corrective actions.
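To make the classification and Pareto steps concrete, here is a rough Java sketch (illustrative only, not from the article; the class, field, and cause names are hypothetical). It walks the decision logic of Figure 2 for each reported test-case escape and then tallies the causes in descending order of frequency:

import java.util.*;

// Sketch: classify each test-case escape per Figure 2, then order causes for a Pareto chart.
public class EscapeAnalysis {

    enum Cause {
        INCOMPLETE_FUNCTIONAL_SPEC, INCOMPLETE_TEST_SUITE, INCOMPLETE_TEST_DESIGN,
        INCORRECT_TEST_SPEC, INCORRECT_FUNCTIONAL_SPEC, TEST_EXECUTION_PROBLEM
    }

    // Answers a reviewer records for one escaped defect (hypothetical field names).
    static class EscapeReview {
        boolean businessRuleExists;
        boolean testSpecExists;
        boolean testSpecComplete;   // does any test case target this kind of defect?
        boolean testSpecCorrect;    // are the conditions and expected results right?
        boolean businessRuleCorrect;
    }

    static Cause classify(EscapeReview r) {
        if (!r.businessRuleExists) return Cause.INCOMPLETE_FUNCTIONAL_SPEC;
        if (!r.testSpecExists)     return Cause.INCOMPLETE_TEST_SUITE;
        if (!r.testSpecComplete)   return Cause.INCOMPLETE_TEST_DESIGN;
        if (r.testSpecCorrect)     return Cause.TEST_EXECUTION_PROBLEM;
        // The test specification is wrong: was its source (the business rule) also wrong?
        return r.businessRuleCorrect ? Cause.INCORRECT_TEST_SPEC
                                     : Cause.INCORRECT_FUNCTIONAL_SPEC;
    }

    // Tally causes and list them in descending order of frequency (the Pareto order).
    static List<Map.Entry<Cause, Integer>> paretoOrder(List<EscapeReview> reviews) {
        Map<Cause, Integer> counts = new EnumMap<>(Cause.class);
        for (EscapeReview r : reviews) {
            counts.merge(classify(r), 1, Integer::sum);
        }
        List<Map.Entry<Cause, Integer>> ordered = new ArrayList<>(counts.entrySet());
        ordered.sort((a, b) -> b.getValue() - a.getValue());
        return ordered;
    }
}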
About the Author Yuri Chernak is president and
founder of Valley Forge Consulting, Inc., a consulting firm that specializes in the field of software quality assurance and systems testing. He has over 20 years of experience in the software industry. As a consultant, he has worked for various clients, primarily for the brokerage firms in New York. He has a PhD in computer science and is a member of the IEEE. His research interests cover systems test methodology, software metrics, and process improvement. He has been a speaker at international conferences on software quality. Contact him at
[email protected].
Step 5. Implement corrective actions
After identification of the main causes of test-case escapes, the project team should implement corrective actions and repeat the test execution cycle. For the factors shown in Figure 1, corrective actions could be any of the following:

■ Incomplete or incorrect functional specifications—inspect and rework functional specifications, then rework test-case specifications.
■ Incomplete test suite—use a traceability matrix to ensure complete coverage of business rules by test cases.
■ Incomplete test design—implement training of testers on test-design techniques; use checklists or templates to design test-case specifications; rework test-case specifications.
■ Incorrect test-case specifications—inspect and rework test-case specifications.
■ Test-execution problems—implement training of testers on test execution, develop and use procedures for test execution and verification of test results.
When functional specifications or test cases must be corrected, the project team should revise the test suite and execute the revised part again. However, if correction is required only due to the test-execution problems, the same test can be used for retesting. The main objective of retesting is to find additional defects. If additional ones are found, this fact can justify the whole improvement effort. The "Case Study" box illustrates how my proposed approach to test-case effectiveness validation and in-process improvement was implemented in a client–server project.

Case Study
This project was a banking application intended for external clients—financial institutions. The system had a three-tier client–server architecture with a Windows NT front-end developed in Visual Basic and Visual C++. The second tier was implemented in a Unix environment with Oracle 7 as a database engine. The third tier was a data feed from a mainframe COBOL/DB2 system. The project team consisted of 10 developers and three testers. Because the application was intended for external clients, software quality was of great importance to project management. To ensure high quality, the project team implemented a formal test process with a focus on functional testing. The development team was responsible for functional specifications, and the test team was responsible for the test-plan document and test-case specifications. Management defined the functional testing exit criteria as follows:

■ 100% of test cases are executed.
■ No defects of high and medium severity remain open.
■ Test-case effectiveness not less than 75%.

By the end of the test-execution phase, testers had executed all test specifications and reported 183 defects. Defects were managed using the PVCS-based defect tracking system. In reporting defects, testers classified them either as test-case escapes or as being found by conditions in test-case specifications. Testers reported 71 test-case escapes and 112 defects found by test cases. Based on these numbers, the calculated TCE metric value was 61%, which was considerably lower than the acceptable level of 75%. As a result, the project team concluded that functional testing did not pass the exit criteria and the system was likely not sufficiently tested. Hence, test-process correction and system retesting were needed. The project team performed the test process improvement according to the framework described above. First, they analyzed all test-case escapes and classified them by appropriate causes. Next, they built a distribution of causes (see Figure A). Analysis of the distribution showed incomplete test design and incomplete functional specifications to be the main factors causing missed defects by test cases. To improve the test process, the project team began by correcting and completing the functional specifications and reviewing them with the users. A subsequent review of test-case specifications showed that the main deficiency of the test design was a lack of negative test cases. Therefore, the existing test-case specifications were completed with negative test cases. By definition, negative test cases focus on abnormal workflow and are intended to break a system. However, the test suite initially used by the testers was not sufficiently "destructive." A significant number of defects were found as side effects as opposed to being found by conditions in test specifications. In addition, the team created a number of new test-case specifications to completely cover the business rules in the revised functional specifications. To verify test suite completeness, this time the project team used a traceability matrix, which was not done in the first test cycle. Test suite incompleteness was one of the factors that reduced test-case effectiveness (see Figure A). After these corrections, the testers executed the revised part of the test suite. As a result, they found 48 additional defects that otherwise would have been released into production. At this point, the number of defects found during the test cycles had grown to 231. After two months in production, the rate of defects, reported by users, had noticeably declined. By the end of the second month the number of production defects was 23. The DRE metric calculated at this time was 91%, which is 231/(231+23) = 0.91, and indicated sufficient effectiveness of the test process.4 Indeed, none of the defects reported from production by the users were of critical severity, and the users were fairly satisfied with the system quality.

Figure A. A Pareto chart of the causes of test-case escapes (number of test-case escapes by cause).
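For readers who want to check the case study's arithmetic, here is a small illustrative Java sketch (not part of the original study; it assumes TCE is the percentage of all defects found in the test cycle that were caught by planned test cases, and that DRE compares pre-release defects with production defects, consistent with the numbers above):

// Sketch: recomputing the case study's effectiveness metrics.
public class EffectivenessMetrics {

    static double testCaseEffectiveness(int foundByTestCases, int foundAsEscapes) {
        return 100.0 * foundByTestCases / (foundByTestCases + foundAsEscapes);
    }

    static double defectRemovalEfficiency(int foundBeforeRelease, int foundInProduction) {
        return 100.0 * foundBeforeRelease / (foundBeforeRelease + foundInProduction);
    }

    public static void main(String[] args) {
        // First test cycle: 112 defects found by test cases, 71 test-case escapes.
        System.out.printf("TCE = %.0f%%%n", testCaseEffectiveness(112, 71));   // about 61%
        // After retesting and two months in production: 231 vs. 23 defects.
        System.out.printf("DRE = %.0f%%%n", defectRemovalEfficiency(231, 23)); // about 91%
    }
}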
This technique for in-process validation of test cases is intended to give project teams better visibility into test-process effectiveness before their systems are released into production. The proposed technique can be applied within any project management model, including incremental or evolutionary models, where it can be used for assessment of test-process effectiveness and its tuning from one incremental build to another. A project team has to decide in advance what level of test-case effectiveness is acceptable for their project. Such a requirement can vary depending primarily on the project's criticality. Future work will focus on developing a formal approach to selecting a baseline value for the TCE metric.

Acknowledgments
I am grateful to Vladimir Ivanov for his help in preparing this material. I thank Richard Reithner for editing the article. Finally, I am grateful to the IEEE Software reviewers for their helpful feedback and comments.
References
1. G. Myers, The Art of Software Testing, John Wiley & Sons, Upper Saddle River, N.J., 1979.
2. IEEE Std. 829-1983, Software Test Documentation, IEEE, Piscataway, N.J., 1983.
3. IEEE Std. 1012-1986, IEEE Standard for Software Verification and Validation Plans, IEEE, Piscataway, N.J., 1986.
4. C. Jones, Applied Software Measurement, McGraw-Hill, New York, 1991.
5. R. Mays et al., "Experiences with Defect Prevention," IBM Systems J., vol. 29, no. 1, 1990, pp. 4–32.
6. Y. Chernak, "Approach to the Function Test Decomposition and Management," Proc. 15th Pacific Northwest Software Quality Conf., PNSQC/Pacific Agenda, Portland, 1997, pp. 400–418.
7. E. Kit, Software Testing in the Real World, Addison-Wesley Longman, Reading, Mass., 1995.
8. P. Goglia, Testing Client–Server Applications, QED Publishing Group, Wellesley, Mass., 1993.
9. L.J. Arthur, Improving Software Quality, John Wiley & Sons, Upper Saddle River, N.J., 1993.
feature estimation
Improving Subjective Estimates Using Paired Comparisons Eduardo Miranda, Ericsson Research Canada
Despite the existence of structured methods for software sizing and effort estimation, the so-called "expert" approach seems to be the prevalent way to produce estimates in the software industry. The paired-comparisons method offers a more accurate and precise alternative to "guesstimating."
Most practitioners and project managers still produce estimates based on ad hoc or so-called "expert" approaches, even though several software sizing methods—counting source lines of code,1 function points,2 full function points,3 and object points, to name a few—are well known and have been available for a long time. Among the most common explanations given for not adopting more formal
estimation practices are the lack of necessary information at the beginning of the project, the specificity of the domain addressed, the effort and time required, and the need to introduce a vocabulary foreign to stakeholders without a software background. However, as Figure 1 shows, ad hoc size estimates have problems of their own. Their accuracy (the closeness of a measured value to the true one) and precision (indicating how repeatable a measurement is) leave much to be desired. The problem is not academic: inaccurate size estimates automatically translate into questionable project budgets and schedules. This article presents a method based on paired comparisons, which social science researchers use for measuring when there is no accepted measurement scale or when a measurement instrument does not exist. Although not new, the idea has received little attention in the literature. Earlier work in-
cludes Target Software’s software sizing method4 and more recent articles by Focal Point AB5 and by Bournemouth University’s Empirical Software Engineering Research Group,6 which uses the analytic hierarchical process to prioritize requirements relative to their cost and estimate effort respectively.7 Overall approach The idea behind paired comparisons is to estimate the size of n entities by asking one or more experts to judge the entities’ relative largeness rather than to provide absolute size values. (Entities can be requirements, use cases, modules, features, objects, or anything else relevant to all stakeholders and for which it is possible to know the number of lines of code, hours, or any other magnitude that could later be used for planning purposes.) By requiring multiple and explicit decisions about the relative size of every two entities and by using easily available historical January/February 2001
Figure 1. The accuracy and precision of different estimation approaches (results of a study involving over 30 software professionals and graduate students): actual measurements, ad hoc estimates, and paired-comparison estimates. (The chart plots lines of code for entities such as a stack, a queue, a binary tree, two linked lists, a reference string, a balanced tree, and a hash table.)
data—rather than a single comparison to some vague notion of size buried in the estimator’s mind—the paired-comparisons method improves both the accuracy and the precision of estimates, as shown in Figure 1. These findings are consistent with the conclusions of Albert L. Lederer and Jayesh Prasad’s study, which shows that using historical data and documented comparisons produce better estimates than those based on intuition and guessing.8 As Figure 2 shows, with the proposed approach we start by arranging the entities to be sized according to their perceived largeness. We then assess the relative size of each one with respect to all the others and record this information in what is called a judgment matrix. From the judgments made, we derive a ratio scale using a simple mathe-
matical procedure and then calculate the absolute size of the entities using the ratio scale and a reference value. Should the need arise, judgments can be reviewed for internal consistency. The method is independent of the type of entities chosen. It is important, however, that the sizes of the entities being estimated do not differ by more than one order of magnitude, because our ability to accurately discriminate size diminishes as the difference between the entities becomes larger.7,9,10

Figure 2. The paired-comparisons estimation process (a flow from the artifacts to be sized, through ranking them from largest to smallest, comparing them pairwise to build a judgment matrix, reviewing internal inconsistencies, and calculating the ratio scale and inconsistency index, to the sized artifacts, with an optional verbal scale and one or more reference values as inputs).

Judgment matrices
A judgment matrix is a square matrix of size n, where n is the number of entities being compared; and each element aij captures the relative size of entity i with respect to entity j. The elements of the matrix are defined as
A_{n \times n} = [a_{ij}], \quad a_{ij} = s_i / s_j, \quad a_{ii} = 1, \quad a_{ji} = 1 / a_{ij},   (1)

where a_{ij} = s_i / s_j expresses how much bigger (smaller) entity i is with respect to entity j; a_{ii} = 1 states that every entity has the same size as itself; and a_{ji} = 1 / a_{ij} states that if entity i is a_{ij} times bigger (smaller) than entity j, then entity j is 1/a_{ij} times smaller (bigger) than entity i.
In practice, as Table 1 shows, the judges must estimate only the relative sizes of the upper
diagonal elements of the matrix, because all the other values can be derived from them. The element “a12 = 4” in the example in Table 1 expresses the fact that entity D has been judged four times bigger than B. Notice that as shown by the relations D / C = 6, D / A = 7.5, and C / A = 2, the judgments recorded in the matrix do not need to be perfectly consistent. After all, who knows which is the true value? Remember that we are estimating things that have not been built yet. Although not a mandatory step, arranging the entities in descending order according to their size makes the rest of the process much easier. As Table 1 shows, when we sort the rows of a judgment matrix in descending order, the comparisons flow in one direction only. For example, entity D will be either equal to or larger than any of the other entities against which it is compared; it will never be smaller. Notice also that within a row, the values to the left of any given column are always smaller than or equal to those to the right. While these properties are irrelevant from the mathematical point of view, they diminish the strain put on the judges by the large number of decisions they must make. The paired-comparisons method requires the existence of at least one reference entity whose size is known, for example, from a previous development project. First, we rank this entity, as we would any other, by comparing it to every other entity to be sized. Later, we use its size to calculate the absolute size of the entities being estimated. The choice of a reference entity is an important decision. Entities with sizes in either extreme of the scale tend to amplify any bias that might affect the judgments. To minimize this risk, it is better to choose as reference an
entity that will divide the population being estimated into halves or to use two or more references instead of one.
Table 1. Judgment Matrix Example

Entities | D | B | C | A
D        |   | 4 | 6 | 7.5
B        |   |   | 1.5 | 2
C        |   |   |   | 2
A        |   |   |   |
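As an illustration of how Table 1 and Equation 1 fit together, the following Java sketch (hypothetical code, not from the article) fills in a complete judgment matrix from the upper-diagonal judgments, using a_ii = 1 and a_ji = 1/a_ij:

// Sketch: completing a judgment matrix from upper-diagonal judgments.
public class JudgmentMatrix {

    static double[][] complete(double[][] upper) {
        int n = upper.length;
        double[][] a = new double[n][n];
        for (int i = 0; i < n; i++) {
            a[i][i] = 1.0;                       // every entity equals itself
            for (int j = i + 1; j < n; j++) {
                a[i][j] = upper[i][j];           // the judged value
                a[j][i] = 1.0 / upper[i][j];     // its reciprocal
            }
        }
        return a;
    }

    public static void main(String[] args) {
        // Entities in the order D, B, C, A; only the upper diagonal is judged (Table 1).
        double[][] judged = {
            {0, 4.0, 6.0, 7.5},
            {0, 0,   1.5, 2.0},
            {0, 0,   0,   2.0},
            {0, 0,   0,   0  }
        };
        double[][] a = complete(judged);
        System.out.println("D judged " + a[0][1] + " times bigger than B");
    }
}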
The verbal scale
Using a verbal scale simplifies and speeds up the estimation process without jeopardizing the accuracy of the results. Although not an essential part of the methodology, having a shared understanding of how small is "smaller" and how big is "bigger" helps the participants reach consensus in the sizing process. A predefined value scale keeps us from wasting time discussing values down to the second decimal when our judgment error is one or two orders of magnitude bigger than that. Quoting an earlier work from Ernest H. Weber, Thomas L. Saaty proposes using a scale from 1 to 9 and their reciprocals to pass judgment on the entities being evaluated.7 Table 2 lists the equivalence between verbal expressions and relative sizes. Suspecting that the values proposed by Saaty could be different for the software domain, I conducted an informal survey among colleagues; 30 people from different countries and from both industry and academia provided input for the scale. The results suggest that the correspondence between size and verbal description in the software domain is closer to the one shown in Table 3 than to Saaty's.
Table 2. Saaty's Verbal Scale

Definition | Explanation | Relative value | Reciprocal
Equal size | The two entities are roughly the same size. | 1 | 1.00
Slightly bigger (smaller) | Experience or judgment recognizes one entity as being somewhat bigger (smaller). | 3 | .33
Bigger (smaller) | Experience or judgment recognizes one entity as being definitely bigger (smaller). | 5 | .20
Much bigger (smaller) | The dominance of one entity over the other is self-evident; very strong difference in size. | 7 | .14
Extremely bigger (smaller) | The difference between the entities being compared is of an order of magnitude. | 9 | .11
Intermediate values between adjacent scales | When compromise is needed. | 2, 4, 6, 8 | .5, .25, .16, .12
Table 3. Verbal Scale for the Software Domain

Definition | Explanation | Relative value | Reciprocal
Equal size | Ei/Ej ≤ 1.25 (0–25%) | 1.00 | 1.00
Slightly bigger (smaller) | 1.25 < Ei/Ej ≤ 1.75 (26–75%) | 1.25 | .80
Bigger (smaller) | 1.75 < Ei/Ej ≤ 2.275 (76–275%) | 1.75 | .57
Much bigger (smaller) | 2.275 < Ei/Ej ≤ 5.75 (276–575%) | 4.00 | .25
Extremely bigger (smaller) | 5.75 < Ei/Ej ≤ 10 (576–1000%) | 7.00 | .13
Calculating a ratio scale and an inconsistency index
A ratio scale is a vector [r1, r2, …, rn] in which each number ri is proportional to the size of entity i. An inconsistency index is a number that measures how far away our judgments are from being perfectly consistent. (A perfectly consistent judgment matrix is one in which all its elements satisfy the condition aij × ajk = aik for all i, j, k.) There are several ways to derive ratio scales and inconsistency indexes from paired-comparisons data, among them Saaty's eigenvalues,7 averaging over normalized columns,7 and Gordon Crawford and Cindy Williams' geometric mean procedure.11 Here, I use Crawford and Williams's approach because of its simplicity and good results. I first calculate the geometric mean of the matrix's rows as

v_i = \left( \prod_{j=1}^{n} a_{ij} \right)^{1/n},   (2)

then I calculate the ratio scale as

r_i = v_i \Big/ \sum_{l=1}^{n} v_l,   (3)

and finally the inconsistency index as

\frac{2}{(n-1)(n-2)} \sum_{i=1}^{n} \sum_{j>i} \bigl( \ln a_{ij} - \ln (v_i / v_j) \bigr)^2.   (4)

Thus, given the ratio scale [r1, r2, …, rn], we can calculate the absolute sizes of the entities being estimated using the expression

Size_i = \frac{r_i}{r_{reference}} \times Size_{reference}.   (5)

If more than one reference value is stipulated, the regression line of the references provided can replace the reference size.

A numerical example
Let's look at a complete numerical example using as the departure point the judgments stated in Table 1. First, using the rules for creating a judgment matrix and the relative size of the entities given earlier as examples, we derive values for the matrix:

  | D   | B  | C   | A
D | 1   | 4  | 6   | 7.5
B | .25 | 1  | 1.5 | 2
C | .16 | .7 | 1   | 2
A | .13 | .5 | .5  | 1

Applying Equation 2, we calculate the vector of the row's geometric means: [3.6, .93, .68, .42]. We sum the geometric means, ∑ v_i = 5.7, and then normalize the vector just calculated by dividing it by the sum of the means: [.64, .16, .12, .07]. Assuming that entity C is the reference point and its size is 1.7 KLOC, we can calculate the absolute size of the other entities using the relationship in Equation 5: .64/.12 × 1.7, .16/.12 × 1.7, .12/.12 × 1.7, and .07/.12 × 1.7.
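A short Java sketch (illustrative only, not from the article) reproduces this computation: the geometric means, the normalized ratio scale, the inconsistency index, and the absolute sizes derived from the 1.7-KLOC reference:

// Sketch: the geometric-mean procedure applied to the numerical example.
public class PairedComparisons {

    static double[] geometricMeans(double[][] a) {
        int n = a.length;
        double[] v = new double[n];
        for (int i = 0; i < n; i++) {
            double product = 1.0;
            for (int j = 0; j < n; j++) product *= a[i][j];
            v[i] = Math.pow(product, 1.0 / n);     // Equation 2
        }
        return v;
    }

    static double[] ratioScale(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        double[] r = new double[v.length];
        for (int i = 0; i < v.length; i++) r[i] = v[i] / sum;   // Equation 3
        return r;
    }

    static double inconsistencyIndex(double[][] a, double[] v) {
        int n = a.length;
        double sum = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) {
                double d = Math.log(a[i][j]) - Math.log(v[i] / v[j]);
                sum += d * d;
            }
        return 2.0 * sum / ((n - 1) * (n - 2));    // Equation 4
    }

    public static void main(String[] args) {
        // Judgment matrix for entities D, B, C, A from the example.
        double[][] a = {
            {1.0,       4.0,       6.0, 7.5},
            {0.25,      1.0,       1.5, 2.0},
            {1.0 / 6,   1.0 / 1.5, 1.0, 2.0},
            {1.0 / 7.5, 0.5,       0.5, 1.0}
        };
        double[] v = geometricMeans(a);
        double[] r = ratioScale(v);
        double sizeC = 1.7;                        // reference size in KLOC
        for (int i = 0; i < r.length; i++) {
            System.out.printf("entity %d: size = %.2f KLOC%n", i, r[i] / r[2] * sizeC);  // Equation 5
        }
        System.out.printf("inconsistency index = %.0f%%%n", 100 * inconsistencyIndex(a, v));
    }
}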
The absolute sizes—SizeD = 9.07 KSLOC, SizeB = 2.3 KSLOC, and SizeA = 1.06 KSLOC—with an inconsistency index of 3% are the final outputs of the process.

Implementation
Successful implementation of the paired-comparisons method requires the selection of qualified judges and a tool capable of automating the calculations. When the number of entities to evaluate is large, you can divide the work among multiple judges. You can also use this approach to minimize the bias introduced by a single judge and to get buy-in to the results. The number of judges used to evaluate n entities
should not exceed n / 3, otherwise the advantage of the method will be lost because each judge will not get the opportunity to make multiple comparisons for a given entity. A simple way to allocate comparisons to judges is to assign every other comparison to a different judge in a sequential fashion. At Ericsson, we use the home-grown tool MinimumTime to support the paired-comparisons method. Figure 3a shows MinimumTime’s interface, which was designed to reduce the strain put on judges by the large number of comparisons required by the method. MinimumTime displays all the completed decisions in a matrix structure using a symbolic or numeric format, according to the user’s preferences. In keeping with the idea of providing range rather than point estimates, the tool calculates a confidence interval based on the scale dispersion. The tool also provides an analysis capability, shown in Figure 3b, to detect judgment inconsistencies and thus to iteratively refine the initial estimate. The sensibility of this tool can, and should, be adjusted to find only the major discrepancies. Since the true value of the relation is unknown, a certain degree of inconsistency could be considered beneficial.
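As a rough illustration of the allocation rule just described, the following Java sketch (hypothetical code, not part of MinimumTime) deals the n(n - 1)/2 comparisons out to judges in round-robin fashion, so every judge still makes multiple comparisons per entity:

import java.util.*;

// Sketch: round-robin allocation of pairwise comparisons among judges.
public class ComparisonAllocation {

    static Map<Integer, List<int[]>> allocate(int entities, int judges) {
        Map<Integer, List<int[]>> plan = new HashMap<>();
        for (int j = 0; j < judges; j++) plan.put(j, new ArrayList<>());
        int next = 0;
        for (int i = 0; i < entities; i++) {
            for (int k = i + 1; k < entities; k++) {
                plan.get(next).add(new int[] {i, k});   // judge 'next' compares entity i with k
                next = (next + 1) % judges;
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        int entities = 10;
        int judges = Math.max(1, entities / 3);         // the article's n/3 guideline
        Map<Integer, List<int[]>> plan = allocate(entities, judges);
        plan.forEach((judge, pairs) ->
            System.out.println("judge " + judge + " makes " + pairs.size() + " comparisons"));
    }
}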
Figure 3. (a) The MinimumTime tool's graphical interface. (b) The consistency analyzer. (The screenshots show a matrix of pairwise comparisons among requirements, the derived ratio scale and estimated values with a plus/minus range, the total, the inconsistency index, and, in panel (b), a diagnostic that explains a detected inconsistency, for example: "REQ A is equal to REQ C. REQ C is 0.8 times smaller than REQ I. So REQ A should be 0.8 times smaller than REQ I, but its value is 1.3.")

Software sizing using the paired-comparisons method is especially well suited to the early stages of a development project, when the knowledge available to project team members is mostly qualitative. The mathematics behind the method are foolproof, but the judgments on which the calculations are based are not. For the method to work, those making the comparisons must understand both the functional and the technological dimensions of the things being sized. Although not conclusive, the results observed so far are promising. Further experimentation is necessary to establish the validity of the verbal scale for the software domain and to verify that the method scales up when used with larger and more complex entities.

Acknowledgments
I thank Tamara Keating and Gaetano Lombardi from Ericsson, Alain Abran from the Université du Québec à Montréal, Norma Chhab-Alperin from Statistics Canada, and Raul Martinez from RMyA for their valuable comments.

References
1. R. Park, Software Size Measurement: A Framework for Counting Source Statements, tech. report CMU/SEI-92-TR-20, Software Eng. Inst., Carnegie Mellon Univ., Pittsburgh, 1992.
2. A. Albrecht and J. Gaffney, "Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation," IEEE Trans. Software Eng., vol. SE-9, no. 6, 1983, pp. 639–648.
3. COSMIC—Full Function Points, Release 2.0, Software Engineering Management Research Lab, Montreal, Sept. 1999; www.lrgl.uqam.ca/cosmic-ffp/manual.html (current 11 Dec. 2000).
4. G. Bozoki, "An Expert Judgment Based Software Sizing Model," Target Software, www.targetsoftware.com (current 8 Jan. 2001).
5. J. Karlsson and K. Ryan, "A Cost-Value Approach for Prioritizing Requirements," IEEE Software, vol. 14, no. 5, Sept./Oct. 1997, pp. 67–74.
6. M. Shepperd, S. Barker, and M. Aylett, "The Analytic Hierarchy Processing and Almost Dataless Prediction," Proc. 10th European Software Control and Metrics Conf. (ESCOM-SCOPE '99), Shaker Publishing, Maastricht, The Netherlands, 1999.
7. T. Saaty, Multicriteria Decision Making: The Analytic Hierarchy Process, RWS Publications, Pittsburgh, 1996.
8. A. Lederer and J. Prasad, "Nine Management Guidelines for Better Cost Estimating," Comm. ACM, vol. 35, no. 2, Feb. 1992, pp. 51–59.
9. E. Miranda, "Establishing Software Size Using the Paired Comparisons Method," Proc. 9th Int'l Workshop Software Measurement, Université du Québec à Montréal, 1999, pp. 132–142; www.lrgl.uqam.ca/iwsm99/index2.html (current 11 Dec. 2000).
10. E. Miranda, "An Evaluation of the Paired Comparisons Method for Software Sizing," Proc. 22nd Int'l Conf. Software Eng., ACM, New York, 2000, pp. 597–604.
11. G. Crawford and C. Williams, The Analysis of Subjective Judgment Matrices, tech. report R-2572-1-AF, Rand Corp., Santa Monica, Calif., 1985, pp. xi, 34; www.rand.org/cgi-bin/Abstracts/ordi/getabbydoc.pl?doc=R-2572-1 (current 12 Dec. 2000).
About the Author Eduardo Miranda is
a senior specialist at Ericsson Research Canada and an industrial researcher affiliated with the Research Laboratory in Software Engineering Management at the Université du Québec à Montréal. He is in charge of investigating new management techniques for planning and tracking projects. He received a BS in system analysis from the University of Buenos Aires and an MEng. from the University of Ottawa. He is a member of the IEEE Computer Society and the ACM. Contact him at Ericsson Research Canada, 8400 Decaire Blvd., Town of Mount Royal, Quebec H4P 2N2, Canada;
[email protected].
feature
requirements engineering
Exploring Alternatives during Requirements Analysis John Mylopoulos, University of Toronto Lawrence Chung, University of Texas, Dallas
Goal-oriented requirements analysis techniques provide ways to refine organizational and technical objectives, to more effectively explore alternatives during requirements definition. After selecting a set of alternatives to achieve these objectives, you can elaborate on them during subsequent phases to make them more precise and complete.
Stephen Liao and Huaiqing Wang, City University of Hong Kong Eric Yu, University of Toronto
Traditionally, requirements analysis consisted of identifying relevant data and functions that a software system would support. The data to be handled by the system might be described in terms of entity-relationship diagrams, while the functions might be described in terms of data flows. Or you could use object-oriented analysis techniques that offer class, use case, state chart, sequence, and other diagrammatic notations for modeling.
While such techniques1 form the foundation for many contemporary software engineering practices, requirements analysis has to involve more than understanding and modeling the functions, data, and interfaces for a new system. In addition, the requirements engineer needs to explore alternatives and evaluate their feasibility and desirability with respect to business goals. For instance, suppose your task is to build a system to schedule meetings. First, you might want to explore whether the system should do most of the scheduling work or only record meetings. Then you might want to evaluate these requirements with respect to technical objectives (such as response time) and business objectives (such as meeting effectiveness, low costs, or system usability). Once
you select an alternative to best meet overall objectives, you can further refine the meaning of terms such as "meeting," "participant," or "scheduling conflict." You can also define the basic functions the system will support. The need to explore alternatives and evaluate them with respect to business objectives has led to research on goal-oriented analysis.2,3 We argue here that goal-oriented analysis complements and strengthens traditional requirements analysis techniques by offering a means for capturing and evaluating alternative ways of meeting business goals. The remainder of this article details the five main steps that comprise goal-oriented analysis. These steps include goal analysis, softgoal analysis, softgoal correlation analysis, goal correlation analysis, and evaluation of alter-
natives. To illustrate the main elements of the proposed analysis technique, we explore a typical scenario that involves defining requirements for a meeting scheduling system. Goal analysis Let’s suppose that the meeting-scheduling task is a generic goal we want to achieve. We can then use AND/OR decompositions to explore alternative solutions. Each alternative reflects a potential plan for satisfying the goal. Figure 1 presents the decomposition of the Schedule Meeting goal. We mark the AND decompositions with an arc, which indicates that satisfying the goal can be accomplished by satisfying all subgoals. We mark the OR decompositions, on the other hand, with a double arc; these decompositions require only one of their subgoals to be satisfied. In Figure 1, the goal Schedule Meeting is first AND-decomposed to two subgoals: Collect Constraints and Generate Schedule. The Generate Schedule subgoal is OR-decomposed into three subgoals that find a schedule manually, automatically (by the system being designed), or interactively. Other decompositions explore alternative ways of fetching the necessary information, including timetable information, which might or might not be publicly available for each potential participant. Even in this simple example, there are literally dozens of alternative solutions. Generally, AND/OR diagrams systematize the exploration of alternatives within a space that can be very large. These diagrams make it possible to conduct a simple form of analysis. In particular, suppose we choose three leaf nodes—marked with a checkmark in Figure 1—as requirements for the system. With a simple, well-known algorithm, we can explore whether we’ve achieved the root goal or not. In particular, the algorithm propagates a check upwards if all the subgoals of an AND-goal are checked or one of the subgoals of an OR-goal is checked. Softgoal analysis Unfortunately, basic goal analysis can only help us with goals that can be defined crisply, such as having or not having a meeting scheduled. Not all goals, however, can be so clearly delineated. Suppose we want to represent and analyze less clear-cut requirements such as “the system must be highly usable” or “the system must improve
meeting quality.” Such requirements, often called qualities or nonfunctional requirements, can’t be defined generically nor can they have clear-cut criteria to determine whether they’ve been satisfied. For these requirements, we need a looser notion of goal and a richer set of relationships so that we can indicate, for example, that a goal supports or hinders another one without being limited to strict AND/OR relationships. To model this looser notion of goal, we use the notion of softgoal, presented in detail in Nonfunctional Requirements in Software Engineering.4 Softgoals represent illdefined goals and their interdependencies. To distinguish softgoals from their hard goal cousins, we declare a softgoal as being satisfied when there is sufficient positive evidence and little negative evidence against it; a softgoal is unsatisfiable when there is sufficient negative and little positive evidence. Figure 2 illustrates an example that focuses on the quality “highly usable system,” which might be as important an objective as any of the functional goals encountered earlier. The softgoal Usability in Figure 2 represents this requirement. Analyzing this softgoal consists of iterative decompositions that involve AND/OR relationships or other more loosely defined dependency relations. Figure 2 shows a number of such relationships labeled with a + sign, which indicates that one softgoal supports—or “positively influences”—another. In Figure 2, for instance, User Flexibility is clearly enhanced not only by the system quality Modularity, which allows for module substitutions, but also by the system’s ability to allow setting changes. These factors, however, are not necessarily sufficient to satisfy User Flexibility. This is why we use + and – labels to describe relationships instead of AND/OR labels. Figure 2 provides only a partial decomposition of the softgoal. In particular, the
Figure 1. An AND/OR decomposition that illustrates alternatives for achieving the meeting-scheduling goal.
Softgoal correlation analysis
Figure 2. A partial softgoal hierarchy for Usability. We adopted this diagram from coursework by Lisa Gibbens and Jennifer Spiess, prepared for a graduate course taught by Eric Yu.
softgoals Error Avoidance, Information Sharing, and Ease of Learning have their own rich space of alternatives, which might be elaborated through further refinements. What is the relevant knowledge for identifying softgoal decompositions and dependencies? Some of the relevant knowledge might be generic and related to the softgoal being analyzed. For instance, general software system qualities—such as reliability, performance, and usability—all have generic decompositions into a number of finer-grain quality factors. There can also be generic rules for decomposing finer-grain softgoals. Here is one example: “A Speed performance softgoal can be AND-decomposed into three softgoals: Minimize User Interaction, Use Efficient Algorithms, and Ensure Adequate CPU Power.” In certain situations, however, you can use project-specific or task-specific decomposition methods after agreement among the project’s stakeholders. For any given software development project, you will always initially set several softgoals as required qualities. Some of these might be technical—such as system performance—because they refer specifically to qualities of the future system. Others will be more organizationally oriented. For instance, it is perfectly reasonable for management to require that introduction of the new system should improve meeting quality (by increasing average participation and/or effectiveness measured in some way) or that it should cut average cost per meeting (where costs include those incurred during the scheduling process). Softgoal analysis calls for each of these qualities to be analyzed in terms of a softgoal hierarchy like the one shown in Figure 2.
You can build softgoal hierarchies by repeatedly asking what can be done to satisfy or otherwise support a particular softgoal. Unfortunately, quality goals frequently conflict with each other. Consider, for example, security and user friendliness, performance and flexibility, high quality and low costs. Correlation analysis can help discover positive or negative lateral relationships between these softgoals. You can begin such analysis by noting top-level lateral relationships, such as a negatively labeled relationship from Performance to Flexibility. This relationship can then be refined to one or more relationships of the same type from subgoals of Performance (such as Capacity or Speed) to subgoals of Flexibility (such as Programmability or Information Sharing). You repeat this process until you reach the point where you can’t refine relationships any further down the softgoal hierarchies. As with the construction of softgoal hierarchies, you can discover correlations by using generic rules that state conditions under which softgoals of one type—say, Performance—can positively or negatively influence softgoals of another type. For example, a correlation rule might state that complete automation can prevent users from customizing or modifying output. This rule negatively correlates certain automation-related performance softgoals to certain flexibility ones. Figure 3 shows diagrammatically the softgoal hierarchy for Security, with correlation links to other hierarchies developed during the softgoal analysis process. Goal correlation analysis We next need to correlate the goals shown in Figure 1 with all the softgoals identified so far, because we propose to use the latter in order to compare and evaluate the former. For example, alternative subgoals of the goal Schedule Meeting will require different amounts of effort for scheduling. With respect to the softgoal of Minimal Effort, automation is desirable, while doing things manually is not. On that basis, we can set up positively or negatively labeled relationships from subgoals to choose the schedule Automatically or Manually, as shown in Figure 4. On the contrary, if we determine that meeting quality will be
the criterion, doing things manually is desirable because, presumably, it adds a personal touch to the scheduling process. Figure 4 shows a possible set of correlation links for a simplified version of the Schedule Meeting goal in terms of the softgoals Minimal Effort and Quality of Schedule. The goal tree structure in the lower right half of the figure shows the refinement of the Schedule Meeting goal, while the two quality softgoal trees at the upper left of the figure represent the softgoals that are intended to serve as evaluation criteria. Figure 4 illustrates a major advantage of distinguishing between goals and softgoals: It encourages the separation of the analysis of a quality sort (such as Flexibility) from the object to which it is applied (such as System). This distinction lets you bring relevant knowledge to bear on the analysis process—from very generic knowledge (“to achieve quality X for a system, try to achieve X for all its components”) to very specific (“to achieve effectiveness of a software review meeting, all stakeholders must be present”). Knowledge structuring mechanisms such as classification, generalization, and aggregation can be used to organize the available know-how for supporting goal-oriented analysis. Evaluating of alternatives The final step of goal-oriented analysis calls for an evaluation of alternative functional goal decompositions in terms of the softgoal hierarchies we’ve already constructed. You can evaluate alternatives by selecting a set of leaf goals that collectively satisfy all given goals. For our single given goal, Schedule Meeting, we might want to choose the two leaf goals labeled with checkmarks in Figure 4. These leaf goals clearly satisfy the Schedule Meeting goal. In addition, we might want to choose sets of leaf softgoals that collectively either satisfy or at least provide the best overall support for top-level softgoals. Satisfying softgoals might be impossible because of conflicts. Accordingly, our search for alternatives might involve finding a set of leaf softgoals that maximize their positive support for top softgoals while minimizing negative support. The result of this softgoal evaluation will lead to additional design decisions, such as using passwords or allowing setting changes.
Of course, generally there will be several possible solutions to the satisfaction of given goals and satisfying or finding the best overall support for softgoals. These alternatives all need to be evaluated and compared to each other.
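As a rough sketch of the evaluation machinery the article describes, propagating a goal's satisfaction up an AND/OR decomposition (as for Figure 1) and tallying positive and negative softgoal support, consider this illustrative Java fragment (the class and method names are hypothetical, not part of the authors' framework):

import java.util.*;

// Sketch: label propagation over an AND/OR goal tree plus a crude softgoal score.
public class GoalAnalysis {

    enum Kind { AND, OR, LEAF }

    static class Goal {
        String name;
        Kind kind;
        List<Goal> subgoals = new ArrayList<>();
        boolean selected;                    // a checkmarked leaf

        Goal(String name, Kind kind) { this.name = name; this.kind = kind; }
    }

    // A leaf is satisfied if selected; an AND-goal if all subgoals are; an OR-goal if any is.
    static boolean satisfied(Goal g) {
        switch (g.kind) {
            case LEAF: return g.selected;
            case AND:  return g.subgoals.stream().allMatch(GoalAnalysis::satisfied);
            default:   return g.subgoals.stream().anyMatch(GoalAnalysis::satisfied);
        }
    }

    // A crude net score for one alternative: +1 per supporting link, -1 per hindering link.
    static int softgoalSupport(Map<String, Integer> contributions) {
        return contributions.values().stream().mapToInt(Integer::intValue).sum();
    }
}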
Figure 3. A partial softgoal hierarchy for Security, with correlations to and from other hierarchies.
Putting together the pieces In summary, goal-oriented analysis amounts to an intertwined execution of the different types of analysis we’ve outlined in this article. The following steps summarize the process: ■
■ ■ ■ ■
■
Input: A set of functional goals, such as Schedule Meeting, and also a set of qualities, such as Improve Meeting Participation or Reduce Meeting Costs. Step 1: Decompose each functional goal into an AND/OR hierarchy. Step 2: Decompose each given quality into a softgoal hierarchy. Step 3: Identify correlations among softgoals. Step 4: Identify correlations between goals and softgoals. Select a set of leaf softgoals Figure 4. The result that best satisfy all input qualities. of goal correlation Step 5: Select a set of goals and softgoals analysis for Schedule Meeting. Quality of schedule Minimal Minimal effort conflicts Matching effort
Degree of participation
–
Collection effort
+
+
– Schedule meeting
+ + – By person
– Collect timetables
Choose schedule By system
By all means By email Have updated timetables
Collect them
Manually Automatically
January/February 2001
IEEE SOFTWARE
95
About the Authors John Mylopoulos is a professor of computer science at the University of Toronto. His
research interests include requirements engineering, databases, knowledge-based systems, and conceptual modeling. He received a PhD from Princeton University. He was a corecipient of the “most influential paper” award at the International Conference on Software Engineering, is the president of the VLDB Endowment, and is a fellow of the AAAI. He currently serves as coeditor of the Requirements Engineering Journal and is a member of the editorial board of the ACM Transactions on Software Engineering and Methodology. Contact him at
[email protected]. Lawrence Chung is an associate professor of computer science at the University of Texas, Dallas. His research interests include requirements engineering, software architecture, electronic business systems, and conceptual modeling. He received a PhD in computer science from the University of Toronto and serves on the editorial board of the Requirements Engineering Journal. He is the principal author of NonFunctional Requirements in Software Engineering. Contact him at
[email protected].
Huaiqing Wang is an associate professor at the Department of Information Systems,
City University of Hong Kong. His interests include intelligent systems, Web-based intelligent agents, and their e-business applications, such as multiagent supported financial monitoring systems and intelligent agent-based knowledge management systems. He received a PhD in computer science from the University of Manchester. Contact him at
[email protected].
Stephen Liao is an assistant professor in the Information Systems Department, City University of Hong Kong. His research interests include OO modeling, systems, and technology; user profiling in e-business; and data mining techniques and applications. He received a PhD in information systems from Aix Marseille III University. Contact him at
[email protected].
Eric Yu is associate professor in the faculty of Information Studies at the University of
Toronto. His research interests include conceptual modeling, requirements engineering, knowledge management, and organizational modeling. He has served as cochair of workshops on agent-oriented information systems held at the Conference on Advanced Information Systems Engineering (CAiSE) and the AAAI conference. Contact him at
[email protected].
■
that satisfy all given functional goals and best satisfy all given qualities. Output: A set of functions to be performed by the system that collectively meet the functional goals set out.
Output can also reflect a set of design decisions, such as to use back-ups that are intended to address particular quality requirements. In addition, output can be a set of requirements on the system’s operational environment, such as the requirement that their owners update timetables at least once a week.
W
e hope you agree that the steps we’ve outlined here can precede and augment conventional requirements analysis. Indeed, defining data and functions can begin with the functional requirements produced by a goal-oriented analysis. For our example, once we select a set of leaf nodes for satisfying the Schedule Meet-
96
IEEE SOFTWARE
January/February 2001
ing goal, we can proceed to define how meeting scheduling will be done and what the role of the new system will be. In other words, conventional requirements analysis assumes that we have already settled on a particular solution that meets predefined organizational and technical objectives through the introduction of the new system. The framework we presented here is essentially a simplified version of a proposal we’ve already fleshed out in detail elsewhere.4 Moreover, this framework is only one sample among many proposals. For example, the KAOS methodology employs a rich and formal notion of goal-identification as the central building block during requirements acquisition.2 There is also empirical evidence that goal-oriented analysis leads to better requirements definitions.5 Different conceptions of meeting a goal have led to different ways of handling conflict and evaluating alternatives.6 In addition, goals have been used as an important mechanism for connecting requirements to design. The “composite systems” design approach, for instance, used goals to construct and later prune the design space.7 While there are indeed several different approaches to goal-oriented requirements analysis, we believe the technique proposed here systematizes the search for a solution that can characterize early phases of software development, rationalizes the choice of a particular solution, and relates design decisions to their origins in organizational and technical objectives. References 1. A. Davis, Software Requirements: Objects, Functions, and States, Prentice Hall, Old Tappan, N.J. 1993. 2. A. Dardenne, A. van Lamsweerde, and S. Fickas, “Goal-Directed Requirements Acquisition,” Science of Computer Programming, vol. 20, 1993, pp. 3–50. 3. J. Mylopoulos, L. Chung, and E. Yu, “From ObjectOriented to Goal-Oriented Requirements Analysis,” Comm. ACM, vol. 42, no. 1, Jan. 1999, pp. 31–37. 4. L. Chung et al., Non-Functional Requirements in Software Engineering, Kluwer Publishing, Dordrecht, the Netherlands, 2000. 5. A.I. Anton, “Goal-based Requirements Analysis,” Proc. 2nd IEEE Int’l Conf. Requirements Engineering, CS Press, Los Alamitos, Calif., Apr. 1996, pp. 136–144. 6. A. van Lamsweerde, R. Darimont, and P. Massonet, “Goal Directed Elaboration of Requirements for a Meeting Scheduler: Problems and Lessons Learnt.,” Proc. 2nd IEEE Int’l Symp. Requirements Engineering, CS Press, Los Alamitos, Calif., Mar. 1995, pp. 194–203. 7. M. S. Feather, “Language Support for the Specification and Development of Composite Systems,” ACM Trans. on Programming Languages and Systems, vol. 9, no. 2, Apr. 1987, pp. 198–234.
design Editor: Martin Fowler
■
ThoughtWorks
■
[email protected]
Avoiding Repetition Martin Fowler
Software design is not easy—not easy to do, teach, or evaluate. Much of software education these days is about products and APIs, yet much of these are transient, whereas good design is eternal—if only we could figure out what good design is.
Searching for design principles
One of the best ways to capture and promulgate good design is to learn from the patterns community. Their work, especially the famous book Design Patterns (E. Gamma et al., Addison Wesley, Reading, Mass., 1994), has become a cornerstone for many designers of object-oriented software. Patterns are not easy to understand, but they reward the effort of study. We can learn from the specific solutions they convey and from the thinking process that leads to their development. The thinking process is hard to grasp, but understanding it helps us discover principles that often generate these patterns. Over the last year, I've been struck by one of the underlying principles that leads to better designs: remove duplication. It's also been highlighted by mantras in a couple of recent books: the DRY (don't repeat yourself) principle in the Pragmatic Programmer (A. Hunt, and D. Thomas, Addison Wesley, 1999) and "Once and Only Once" from Extreme Programming Explained: Embrace Change (K. Beck, Addison Wesley, 1999). The principle is simple: say anything in your program only once. Stated blandly like that, it hardly bears saying. Yet identifying and removing repetition can lead to many interesting consequences. I have an increasing sense that a pig-headed determination to remove all repetition can lead you a long way toward a good design and can help you apply and understand the patterns that are common in good designs.
A simple case: subroutine calls
Take a simple example: subroutine calls. You use a subroutine when you realize that two blocks of code, in different places, are or will be the same. You define a subroutine and call it from both places. So, if you change what you need to, you don't have to hunt down multiple repetitions to make the change. Granted, sometimes the duplication is just a coincidence, so you wouldn't want a change in one to affect the other—but I find that is rare and easy to spot.

So what if the blocks are similar but not identical? Maybe some data is different in the two cases. In that case, the answer is obvious: you parameterize the data by passing in arguments to a subroutine. One block of code that multiplies by five and another that multiplies by 10 become one block that multiplies by x, and you replace x with the right number. That's a simple resolution but it illustrates a basic principle that carries over into more complicated cases. Identify what is common and what varies, find a way to isolate the common stuff from the variations, then remove the redundancy in the common stuff. In this case, separating the commonality and the variability is easy. Many times it seems impossible, but the effort of trying leads to good design.
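To make the parameterization concrete, here is a minimal sketch of the multiply-by-five and multiply-by-ten case; the names (units, rate, charge) are mine, not the column's.

// Before: two blocks that differ only in a constant.
int weekdayCharge = units * 5;
int weekendCharge = units * 10;

// After: one routine, with the varying value passed in as an argument.
int charge(int units, int rate) {
  return units * rate;
}

// Callers now write charge(units, 5) and charge(units, 10); a change to the
// shared calculation happens in exactly one place.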
What's the same and what's different?
What if two routines have the same basic flow of behavior but differ in the actual steps (see Figure 1)? These two routines are similar, but not the same. So, what is the same and what is different? The sameness is the routine's overall structure, and the differences are in the steps. In both cases, the structure is

■ print some header for the invoice,
■ loop through each item printing a line, and
■ print a footer for the invoice.

class Invoice...

  String asciiStatement() {
    StringBuffer result = new StringBuffer();
    result.append("Bill for " + customer + "\n");
    Iterator it = items.iterator();
    while (it.hasNext()) {
      LineItem each = (LineItem) it.next();
      result.append("\t" + each.product() + "\t\t" + each.amount() + "\n");
    }
    result.append("total owed:" + total + "\n");
    return result.toString();
  }

  String htmlStatement() {
    StringBuffer result = new StringBuffer();
    result.append("<p>Bill for " + customer + "</p>");
    result.append("<table>");
    Iterator it = items.iterator();
    while (it.hasNext()) {
      LineItem each = (LineItem) it.next();
      result.append("<tr><td>" + each.product() + "</td><td>" + each.amount() + "</td></tr>");
    }
    result.append("</table>");
    result.append("<p>total owed:" + total + "</p>");
    return result.toString();
  }

Figure 1. Two similar routines with different steps.

As Figure 2 shows, we can separate the two by coming up with some kind of printer notion with a common interface for header, footer, and lines and an implementation for the ASCII case. Figure 3a shows that the common part is then the looping structure, so we can wire the pieces together as shown in Figure 3b. There's nothing earth-shattering about this solution; just apply a polymorphic interface—which is common to any OO or component-based environment that lets you easily plug in multiple implementations of a common interface. Design Patterns junkies will recognize the Template Method pattern. If you are a good designer and are familiar with polymorphic interfaces, you could probably come up with this yourself—as many did. Knowing the pattern just gets you there quicker. The point is, the desire to eliminate duplication can lead you to this solution.

(a)
interface Printer {
  String header(Invoice iv);
  String item(LineItem line);
  String footer(Invoice iv);
}

(b)
static class AsciiPrinter implements Printer {
  public String header(Invoice iv) {
    return "Bill for " + iv.customer + "\n";
  }
  public String item(LineItem line) {
    return "\t" + line.product() + "\t\t" + line.amount() + "\n";
  }
  public String footer(Invoice iv) {
    return "total owed:" + iv.total + "\n";
  }
}

Figure 2. (a) A common interface for header, footer, and lines and (b) an implementation for the ASCII case (try the HTML case as an exercise on your own).

(a)
class Invoice...

  public String statement(Printer pr) {
    StringBuffer result = new StringBuffer();
    result.append(pr.header(this));
    Iterator it = items.iterator();
    while (it.hasNext()) {
      LineItem each = (LineItem) it.next();
      result.append(pr.item(each));
    }
    result.append(pr.footer(this));
    return result.toString();
  }

(b)
class Invoice...

  public String asciiStatement2() {
    return statement(new AsciiPrinter());
  }

Figure 3. (a) The common part of the routine and (b) the pieces wired together.

Duplication and patterns
Thinking in terms of duplication and its problems also helps you understand the benefits of patterns. Framework folks like patterns because they easily let you define new pluggable behaviors to fit behind the interface. Eliminating duplication helps because as you write a new implementation, you don't have to worry about the common things that need to be done. Any common behavior should be in the template method. This lets you concentrate on the new
behavior rather than the old. The principle of duplication also helps you think about when to apply this pattern. As many people know, one of the problems with people who have just read a pattern is that they insist on using it, which often leads to more complicated designs. When you insist on using a pattern, ask, “What repetition is this removing?” Removing repetition makes it more likely that you’re making good use of the pattern. If not, perhaps you shouldn’t use it. Often, the hard part of eliminating duplication is spotting it in the first place. In my example, you can spot the two routines easily because they are in the same file and located close to each other. What happens when they are in separate files and written by different people in different millennia? This question leads us to think about how we construct our software to reduce the chances of this happening. Using abstract data types is a good way of doing this. Because you have to pass data around, you find that people are less likely to duplicate data structures. If you try to place routines next to their data
structures, you are more likely to spot any duplication in the routines. Much of the reason why objects are popular is because of this kind of social effect, which works well when reinforced by a culture that encourages people to look around and fix duplications when they do arise.
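Figure 2's caption leaves the HTML case as an exercise. One possible answer is sketched below; the class and method names (HtmlPrinter, htmlStatement2) and the particular tags are mine, not the column's. The point to notice is that the new behavior plugs in behind the Printer interface, so the looping code in Figure 3a is neither touched nor repeated.

static class HtmlPrinter implements Printer {
  public String header(Invoice iv) {
    // Open a paragraph for the customer line and start the item table.
    return "<p>Bill for " + iv.customer + "</p><table>";
  }
  public String item(LineItem line) {
    // One table row per line item.
    return "<tr><td>" + line.product() + "</td><td>" + line.amount() + "</td></tr>";
  }
  public String footer(Invoice iv) {
    // Close the table and print the total.
    return "</table><p>total owed:" + iv.total + "</p>";
  }
}

class Invoice...

  public String htmlStatement2() {
    // Wired up as in Figure 3b, with the HTML implementation plugged in.
    return statement(new HtmlPrinter());
  }

Only the varying markup is new; the common structure stays in statement, where it is said once.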
So, avoiding repetition is a simple principle that leads to good design. I intend to use this column to explore other simple principles that have this effect. Have you noticed simple principles like this in your work? If so, please contact me—I'm always happy to repeat good ideas.
Martin Fowler is the chief scientist for ThoughtWorks, an Internet systems delivery and consulting company. For a decade, he was an independent consultant pioneering the use of objects in developing business information systems. He’s worked with technologies including Smalltalk, C++, object and relational databases, and Enterprise Java with domains including leasing, payroll, derivatives trading, and healthcare. He is particularly known for his work in patterns, UML, lightweight methodologies, and refactoring. He has written four books: Analysis Patterns, Refactoring, Planning Extreme Programming, and UML Distilled. Contact him at
[email protected].