Process-Based Knowledge Management Support for Software Engineering
Dissertation approved by the Department of Computer Science (Fachbereich Informatik) of the Universität Kaiserslautern for the award of the academic degree Doktor der Naturwissenschaften (Dr. rer. nat.)
by
Dipl.-Inform. Harald Holz
Chair:
Prof. Dr. Otto Mayer
Reviewers:
Prof. Dr. Michael M. Richter, Prof. Dr. Frank Maurer
Dean:
Prof. Dr. Jürgen Avenhaus
Date of the scientific defense: 14 November 2002
(D 386)
Holz, Harald: Process-Based Knowledge Management Support for Software Engineering / Harald Holz. Printed as manuscript. Berlin: dissertation.de – Verlag im Internet GmbH, 2003. Also: Kaiserslautern, Univ., Diss., 2002. ISBN 3-89825-728-2
Bibliographic information of Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet.
Copyright dissertation.de – Verlag im Internet GmbH 2003. All rights reserved, including the rights of reprinting in part, of reproduction in part or in full, of storage in data processing systems, on data media or on the Internet, and of translation. Only chlorine-free bleached paper (TCF) conforming to DIN-ISO 9706 is used. Printed in Germany.
dissertation.de - Verlag im Internet GmbH, Pestalozzistraße 9, 10625 Berlin
URL: http://www.dissertation.de
Abstract
The primary goal of Process-Oriented Knowledge Management (POKM) is to establish, run and maintain an organizational environment that provides process participants with the information needed to successfully perform their activities as defined in the process model. In this work, we present a detailed life-cycle model for POKM that is specific to software development processes. This life-cycle model for Software Engineering Process-Oriented KM (SE-POKM) is integrated into the life-cycle model performed by the organization’s Process Group (PG), and becomes an essential part of a continuous organizational learning process. SE-POKM distinguishes between three stakeholders: the PG (focusing on process quality and improvement), the Knowledge Department (KD; responsible for knowledge management within the organization), and process participants (performing activities and requiring access to available knowledge).
The SE-POKM model encompasses the following:
• an explicit representation of dynamic (i.e. situation-specific) information needs that typically arise for process participants during software development activities; this representation also covers potential ways to satisfy those information needs. Furthermore, we present a process-oriented organization scheme for information needs that is based on standard process modeling concepts.
• a specification of the information need retrieval during process enactment. Depending on a characterization of their current situation (i.e. current activities, individual preferences and skills etc.), process participants are provided with modeled information needs that are expected to arise for them during their activities; in particular, corresponding selected information items are retrieved for each of these information needs, which are assumed to satisfy the information needs in the required detail.
• a guideline for the experience packaging phase based on feedback from process participants; during this phase, the initial model of relevant and useful information needs is updated to better reflect the participants’ actual information needs.
In order to automate the retrieval of information needs and corresponding information items, we present the Process-oriented Information resource Management Environment (PRIME). PRIME provides a technical infrastructure for knowledge distribution and feedback communication, and is designed to be coupled with the organization’s Process-Centred Software Engineering Environment (PSEE); in this work we demonstrate its integration into the MILOS PSEE. We argue that the light-weight integration of PRIME with the organization’s PSEE improves the efficiency of the process participants and the KD in their every-day work; at the very least, participants are provided with answers to standard questions (especially those asked by new participants), so that human experts in the KD need only be consulted for new, more difficult problems.
Thus, the approach presented in this thesis facilitates an easy, incremental phase-in of Knowledge Management techniques into a software organization.
Acknowledgments
If it took me somewhat longer to write this thesis, this has been mostly due to the fact that working in Prof. Dr. Michael Richter’s research group has been a privilege I have been very reluctant to part with. It will be difficult to find another place that is equal in its inspiring atmosphere, supportiveness of colleagues, and freedom to pursue interesting research topics. I am sorry for the next generation of computer science students in Kaiserslautern, for whom the "AG Richter" will only be a rumour, this, alas, being the last year of the group’s existence.
My thanks go to:
• My thesis advisors, Prof. Dr. Michael Richter and Prof. Dr. Frank Maurer, for their great support as well as for many stimulating discussions and useful suggestions concerning my research.
• Prof. Dr. Ralph Bergmann, Sigrid Goldmann, Boris Kötting, Dr. Charles Petrie, Markus Roggenbach, and Martin Schaaf, for proof-reading and commenting on earlier versions of this thesis. In particular, Sigrid Goldmann’s detailed comments helped to make the thesis much more readable.
• Sigrid Goldmann, Thomas Sauer, Martin Schaaf, and Jochen Schäfer, for a final night shift that helped me to meet the deadline.
• Arne Könnecker, Guido Mayer, and Jochen Schäfer, for interesting discussions and implementation work that made PRIME a running system.
• Empolis GmbH, for providing me with their knowledge management tool suite "orenge".
• Sigrid Goldmann, Jochem Hüllen, Boris Kötting, Armin Stahl, and Sascha Schmitt, for keeping work away from me by shouldering it themselves, while I was writing this thesis.
• The MILOS team: Fawsy Bendeck, Barbara Dellen, Boris Kötting, Prof. Dr. Frank Maurer, Martin Schaaf, and all those many students who helped make the project a success (you know who you are, guys. Your names are listed in the system’s "About" window). It has been a great pleasure working with you.
• Christoph Globig, Willi Klein, Martin Schaaf, Armin Stahl, Ivo Vollrath, and Frank Weberskirch, for keeping the group’s IT infrastructure up and running, and for providing excellent troubleshooting support.
• Edith Hüttel and Petra Homm, for always removing administrative burdens with a smile, and for turning the (secretary’s) office into a refuge. Edith Hüttel has been the "soul of the AG Richter" for most of the years I have worked there.
• My parents, Helen and Heinz Holz, and grandparents, Gisela and Eric von Åkerman, for supporting me during my studies without putting pressure on me to "finish" university.
• Doris and Dietrich Sturm, for numerous invitations to write this thesis undisturbed in their beautiful house in France.
• Anja Fährmann, Markus Roggenbach, and Gabriel Zachmann, for their friendship and very, very long telephone conversations that encouraged me and cheered me up.
• Stefan Besling, for first raising my interest in A.I. back in 1982; never again has pair programming been so much fun (of course, we were younger, then... ;-)
• my first computer science teacher, Ulrich Seitz, for the excellent course he gave at the Lessing-Gymnasium at a time when other so-called computer science teachers had difficulties using their programmable calculators.
• the people at Infocom, Inc., for their unforgettable text adventures.
Kaiserslautern, September 2003
To E. F., without whom this thesis might have never been finished.
Contents
1. Introduction
1.1 Motivation
1.2 Contributions of this Thesis
1.3 Assumptions Underlying the Approach
1.4 Structure of Dissertation
2. Knowledge Management and Software Engineering
2.1 A Life-Cycle Model for Knowledge Management
2.2 Knowledge Management for Software Engineering
2.2.1 Process Modeling
2.2.2 Process-Centred Software Engineering Environments
2.3 Shortcomings of SE Knowledge Management Approaches
2.3.1 Shortcomings of Experience Base Approaches
2.3.2 Shortcomings of Process-Centred SE Environments
3. Process-Oriented Knowledge Management
3.1 What is Process-Oriented Knowledge Management?
3.1.1 Process-Oriented Knowledge Capture
3.1.2 Process-Oriented Knowledge Organization
3.1.3 Process-Oriented Knowledge Formalization
3.1.4 Process-Oriented Knowledge Distribution
3.1.5 Process-Oriented Knowledge Application
3.1.6 Process-Oriented Knowledge Evolution
3.2 A Model for Software Engineering Process-Oriented KM
3.2.1 Capturing Process-Specific Information Needs
3.2.2 Process-Oriented Information Resource Organization
3.2.3 Formalization
3.2.4 Information Distribution
4. Integrating POKM into a Process-Centred SEE
4.1 MILOS
4.2 PRIME
4.2.1 System Status
4.3 Example
5. Process-Oriented Knowledge Evolution
6. Discussion
6.1 Limitations
6.2 Case Study Outline
6.2.1 Quality Aspect
6.2.2 Efficiency Aspect
6.2.3 Convenience Aspect
7. Summary & Outlook
7.1 Outlook
References
List of Figures
List of Tables
List of Clarifications
CHAPTER 1
Introduction
"An immense and ever-increasing wealth of knowledge is scattered about the world today; knowledge that would probably suffice to solve all the mighty difficulties of our age, but it is dispersed and unorganized. We need a sort of mental clearing house for the mind: a depot where knowledge and ideas are received, sorted, summarized, digested, clarified and compared." (H.G. Wells, 1940)
In order to meet the challenges of today’s economy, companies are beginning to realize the potential of Knowledge Management (KM). A Conference Board survey of 200 senior executives from leading global companies revealed that 80 percent of them have some KM efforts under way [Hac00]. Measurable business benefits are being attributed to the innovative use of KM in the areas of R&D, sales, marketing and customer service: e.g., BP Amoco reports gains of more than $600 million as a result of its knowledge and learning initiatives.
Central to Knowledge Management is the concept of an Organizational Memory (OM), alternatively called Corporate Memory. In essence, an OM is a logical or physical unit (not necessarily automated) that is responsible for
• converting knowledge, i.e. knowledge that is created both within and outside the organization needs to be collected, prepared and organized in an appropriate form, and
• connecting knowledge, i.e. knowledge has to be disseminated within the organization to employees who require it for their work [OLe98a].
Problems with OMs range from an organizational level (e.g.: How to assure management support?) to a cultural level (e.g.: How to establish a culture of information sharing?) to a technical level (e.g.: How to cope with heterogeneous data representations?). Studies and experience reports (see e.g. [Hac00]) stress the importance of the first two levels and the need to address them as part of an organization’s KM initiative. However, problems on these levels lie outside the scope of computer science, whose role in Knowledge Management research so far has mainly been to provide solutions on a technical level, e.g. in the form of Organizational Memory Information Systems (OMIS) [SZ95] as a technical infrastructure for OMs. Existing OMIS approaches (see e.g. [Ack93], [WWT98]) encompass various information sources and formats, ranging from informal text documents to semi-formal documents to formal data and knowledge representations.
passive vs. active OM
In general, we can distinguish between passive and active OMs (see e.g. [RMS00]). An OM is called passive if it expects employees to explicitly query it for relevant information whenever they have a specific information need. In contrast, an OM is called active if it distributes information to employees whenever it is necessary for their work. In order to realize active OMs that provide employees with relevant information, it is essential to keep track of the activities they are about to perform. The importance of active Organizational Memories has been illustrated by empirical studies, showing that a crucial problem with reusing existing information is to know that certain knowledge is available for reuse [MR97]. Thus, the question of how to make sure that each employee is putting the wealth of knowledge stored in the OM to its best possible use forms a major issue that needs to be addressed.
1.1 Motivation
In this work, we will focus on information system-supported Knowledge Management for software organizations (1). In our opinion, Knowledge Management is particularly relevant for software organizations for the following reasons:
• Dealing with almost daily advances in technology, software organizations have to ensure that any new information is spread quickly to all employees whose work might benefit from it. At the same time, care must be taken to avoid information overload.
• With the emergence of virtual software organizations, e-market-places for software development tasks, and open-source projects, the tendency towards geographically dispersed software development has increased. The limited opportunities for face-to-face communication impede knowledge exchange between different sites, restricting it mostly to e-mail or telecommunication. This makes it more important to ensure that either knowledge bearers (i.e. people who have knowledge about a specific subject) can be identified and contacted immediately, or that knowledge on a specific subject has been made explicit and can be accessed whenever the need for it arises.
• In the still growing software industry, companies are confronted with the problem of providing new employees quickly and efficiently with the knowledge required to successfully perform their tasks.
(1) For the remainder of this work, we will use the terms ’organization’ and ’software organization’ synonymously.
Knowledge Management has a long tradition in Software Engineering (SE) research, the most prominent approach being the Experience Factory concept proposed by Basili et al. [BCR94]. Compared to standard KM approaches, the Experience Factory corresponds to the concept of an Organizational Memory; the Experience Factory’s main technical infrastructure, called Experience Base, corresponds to an OMIS that is dedicated to maintaining Software Engineering-related, packaged experiences. These experiences can be regarded as a special kind of knowledge.
At the time of writing this thesis, only a few implementations of this approach are found in current software organizations. In our opinion, one of the main reasons for this reluctance is that the approach lacks a systematic integration into the every-day activities performed by employees: the Experience Factory is not tuned towards functioning as an active OM. Companies will not be willing to invest resources in an Experience Factory that is in danger of becoming an ’experience cemetery’. The initial scenario that we consider in this work is depicted in Figure 1.1.
Fig. 1.1: Motivational scenario.
Information Needs
On the one hand, the software organization’s employees are performing activities within various software projects. While they are performing certain activities, information needs arise that need to be satisfied in order to successfully perform the activities. These information needs range from simple questions (e.g. “Who has working experience with EJB in this company?”) to questions that usually are more complicated to answer (e.g. “What issues need to be addressed when using Serialization together with RMI?”).
Information Source
On the other hand, typically there are several information sources available to the employees, which potentially contain information that can be used to satisfy the employees’ information needs. These information sources can be either human subject-matter experts (e.g. experienced colleagues) that employees can contact, or any electronic information system that is accessible by employees. Furthermore, information sources might either be maintained outside the company (e.g. newsgroups, mailing list archives, tool vendor websites, etc.), or they might be internally maintained within the organization (e.g. the company’s document management system (DMS), bug tracking systems, lessons learned systems, etc.). For software organizations, the existence of external information sources is an important factor, as a considerable amount of relevant up-to-date technical knowledge (in the form of documents, newsgroup postings, etc.) is created and made available outside the organization via Internet technology. A complete replication of this knowledge by converting it into organization-specific information sources would be either too expensive or technically infeasible.
Query-Return Model
For the remainder of this work, we will consider the organization’s OMIS as only one of several information sources available to the organization. Typically, the information sources maintained by the OM follow the standard passive query-return model that expects employees to access an appropriate information source whenever they want to satisfy an information need (2) that arose during one of their activities. As pointed out above, this entails the danger of relevant information being overlooked, because employees
• might be unaware that they should have an information need (e.g. when they are using technology for which relevant lessons-learned are available),
• might be unaware of a relevant information source they could use to satisfy an information need,
• might be unable to query an information system appropriately, or
• will not be notified when new relevant information becomes available.
(2) In the field of information retrieval, an information need denotes what a user really wants to know, typically expressed in natural language; a query is an approximation of the information need.
Therefore, the query-return model seems to be impractical for software development activities.
Information Filtering
As an alternative to the passive query-return model, information filtering approaches for active information delivery have been developed [BC92]. Typically, these approaches are based on long-term queries or user profiles that facilitate the filtering of newly created documents, alerting the user to new documents only if they meet certain relevance criteria.
However, in the context of software organizations, these information filtering approaches also seem to be inadequate for the task of distributing relevant documents: during software development projects, employees typically are continuously starting new activities, changing roles during the activities (e.g. a designer of a certain component might be the reviewer of another component), or dropping out of activities. Furthermore, the characteristics of these activities change frequently, e.g. because of changes to the products handled, or changes to the set of tools being used during the activity. As a consequence, information retrieved by a long-term query that an employee once specified for an already completed activity is likely to have become irrelevant to him, as has the information retrieved by a long-term query that no longer reflects the characteristics of the employee’s current activities (3). Even though approaches exist that can adapt to users’ changing interests via learning strategies (see e.g. [MMP96]), it seems doubtful to us whether an employee’s changing interests, caused by the diverse characteristics of the activities that he is assigned to, allow these learning algorithms to converge quickly enough.
(3) Wolverton has made similar observations for the domain of battle planning [Wol99].
Process-Oriented Knowledge Management
In this work, we argue that, in order for a software organization to utilize its Organizational Memory Information System effectively, knowledge should be structured around process models (i.e. abstract activity descriptions). Taking into account the fact that, in many cases, a diverse set of information systems is already in use by employees within the organization, we propose an approach where process models serve as entry points (or query triggers) to different information sources. In particular, process models will define views on information sources by specifying a particular context in which process participants should be provided with relevant information that can be retrieved from available information sources.
Thus, this work contributes to the problem of knowledge dissemination in organizations by means of Process-Oriented Knowledge Management (POKM).
1.2 Contributions of this Thesis
In this thesis, we present an approach that addresses the problem of how to facilitate a distribution of information available within the organization’s OM to its software engineers such that the information distribution is:
• (pro)active, i.e. the organization does not rely on its engineers requesting information from the OM, but aims at providing relevant information without their explicit request;
• activity- and situation-specific, i.e. the proactive information distribution is based on the activity a software engineer is currently working on, as well as on the engineer’s personal preferences and/or skills;
• systematic, i.e. it is possible to ensure that the OM regularly provides an engineer with relevant information whenever a certain situation arises;
• automated, i.e. as far as possible, appropriate tools relieve members of the OM from proactively providing engineers with relevant information for their current activities.
Conceptual Contributions
In particular, the conceptual contributions of the approach presented in this thesis to KM support within software organizations consist of:
• A detailed life-cycle model for Process-Oriented Knowledge Management with regard to Software Engineering:
Based on a standard Knowledge Management life-cycle model [NKS00], we present a detailed life-cycle model for Software Engineering Process-Oriented Knowledge Management (SE-POKM). The phases of this model are assumed to occur in the context of an organization that is willing to deploy (or already has deployed) a Process-Centred Software Engineering Environment (PSEE) [GJ96]. Typically, the PSEE is a means for the organization’s Software Engineering Process Group (PG) to capture best practices in the form of abstract process models, and to disseminate these best practices as the basis for concrete software projects. Thus, the PSEE already stands at the centre of a life-cycle model operated by the PG in order to capture and distribute knowledge on how development processes should be performed. The phases of the SE-POKM life-cycle model presented in this thesis are integrated into the life-cycle model operated by the Process Group; these phases are concerned with the collection and distribution of knowledge on what information is required to successfully perform development activities, and where as well as how it can be found. In particular, our SE-POKM life-cycle model presents an approach for bootstrapping the cycle and for continuous knowledge evolution.
• A representation formalism for recurrent information needs:
We introduce an explicit, structured representation of information needs that typically arise for process participants during software development activities. In addition to a textual question that expresses such a recurrent information need, the representation allows the formal specification of:
  • a set of preconditions that state when the information need is expected to arise,
  • for whom it is expected to arise (i.e. information needs are expected to arise only for process participants with a certain skill profile and/or role assignment),
  • information sources that potentially contain information that can be used to satisfy the information need, as well as how to retrieve this information automatically (whenever possible).
In particular, the representation may contain parameters in order to specify a generic information need. A generic information need represents a class of (concrete) information needs that uniformly arise and can uniformly be satisfied. For example, the information need "What similar activities have been performed before?" might be expected to occur for any concrete, current activity; it can be satisfied by a similarity-based search in the organization’s activity case-base, using the representation of the current activity as the query case (i.e. as a search parameter).
• An organization scheme for recurrent information needs:
The entities defined in the process model that is maintained within the Process-Centred Software Engineering Environment serve as the basis for the organization of recurrent information needs. In general, a recurrent information need will be associated with a certain model entity if the entity’s presence in the situational context of a concrete activity gives rise to the information need. Thus, the organization scheme of recurrent information needs can be regarded as an extension and refinement of the organization’s process model.
• A specification of situation-specific, systematic retrieval of information:
Based on a characterization of their current situation (i.e. current activities, individual preferences and skills etc.), process participants are provided with modeled information needs that are expected to arise for them during their activities; in particular, certain selected information items are retrieved for each of these information needs, which are assumed to satisfy the information needs in the required detail. The approach presented in this thesis aims at turning any passive OMIS into an active one that distributes relevant information to employees "just-in-time" [CFS97]. Rather than enforcing the construction of a new central knowledge repository in which all knowledge is maintained, our approach takes into account the existence of information sources already in use within the organization. Instead of attempting to automatically retrieve all information that might be relevant for a process participant during his activity, we present a two-phase, interactive retrieval model: first, the set of information needs that are likely to arise for the process participant during a given activity is determined and presented to him. Second, the information retrievable from one of the information sources as specified in these information needs is presented to the process participant on demand. The explicitly represented information need also serves as an explanation of why a retrieved information item is relevant to the process participant.
Practical Contributions
The concepts presented in this thesis have been implemented in a system called PRIME (4), which can provide users of the organization’s Process-Centred Software Engineering Environment (PSEE) with activity- and situation-specific information. As a proof-of-concept implementation, PRIME has been coupled with the MILOS (5) PSEE [HKM01][MH02].
Instead of having to access available information sources by formulating explicit, source-specific queries to satisfy information needs that repeatedly occur during their activities, software engineers are provided by PRIME with a list of pre-defined information needs expected to arise for them during their activity. From this list, they can choose the one that best corresponds to their actual information need, and trigger an automatic retrieval of information from available sources that is assumed to satisfy the information need.
In order to facilitate an easy adoption of Knowledge Management support within an organization, it is important that knowledge modeling effort can be kept low in the beginning, and that an immediate benefit can be achieved. Our approach promotes this by providing process participants with the functionality of activity-specific, personal lists of information items ("favorites") considered to be useful/relevant during the activity, as well as activity-specific forums for information request postings. A later formalization of these information item preferences and requests into explicit information needs by Knowledge Engineers at enactment time facilitates an incremental, bottom-up strategy of introducing Knowledge Management into an organization.
(4) PRocess-oriented Information resource Management Environment.
(5) Minimally-Invasive Long-term Organizational Support [MDB+00].
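The representation of recurrent information needs and the two-phase retrieval model described above can be illustrated with a small sketch. It is not taken from PRIME: all class, function and data names (Situation, InformationNeed, expected_needs, satisfy, the sample activities and colleagues) are hypothetical, the word-overlap ranking is only a stand-in for a similarity-based case-base search, and the rule that the EJB question arises only for participants without EJB skills is an assumed example of a skill-based precondition.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Situation:
    """Hypothetical characterization of a participant's current situation."""
    activity: str                 # the concrete, current activity
    entities: List[str]           # process model entities in the activity's context
    role: str                     # role of the participant in this activity
    skills: List[str] = field(default_factory=list)


@dataclass
class InformationNeed:
    """Hypothetical representation of a recurrent information need."""
    question: str                               # textual question
    entity: Optional[str]                       # model entity it is attached to (None: any activity)
    precondition: Callable[[Situation], bool]   # when and for whom the need is expected to arise
    retrieve: Callable[[Situation], List[str]]  # how to query an information source


# Toy information sources (assumptions, for illustration only).
PAST_ACTIVITIES = ["Design login component", "Test billing module", "Design report component"]
EXPERTS = {"EJB": ["colleague A", "colleague B"]}


def similar_activities(situation: Situation) -> List[str]:
    # Stand-in for a similarity-based search: the current activity is the
    # query case, and past activities are ranked by the number of shared words.
    words = set(situation.activity.lower().split())
    scored = [(len(words & set(a.lower().split())), a) for a in PAST_ACTIVITIES]
    return [a for score, a in sorted(scored, reverse=True) if score > 0]


NEEDS = [
    # Generic need: expected for any activity, parameterized by the activity itself.
    InformationNeed(
        question="What similar activities have been performed before?",
        entity=None,
        precondition=lambda s: True,
        retrieve=similar_activities,
    ),
    # Need attached to the model entity "EJB"; assumed to arise only for
    # participants who do not list EJB among their skills.
    InformationNeed(
        question="Who has working experience with EJB in this company?",
        entity="EJB",
        precondition=lambda s: "EJB" not in s.skills,
        retrieve=lambda s: EXPERTS["EJB"],
    ),
]


def expected_needs(needs: List[InformationNeed], situation: Situation) -> List[InformationNeed]:
    """Phase 1: determine the information needs expected to arise in this situation."""
    return [n for n in needs
            if (n.entity is None or n.entity in situation.entities)
            and n.precondition(situation)]


def satisfy(need: InformationNeed, situation: Situation) -> List[str]:
    """Phase 2: on demand, retrieve information items for one selected need."""
    return need.retrieve(situation)


if __name__ == "__main__":
    current = Situation("Design payment component", ["EJB"], "designer")
    for need in expected_needs(NEEDS, current):
        print(need.question, "->", satisfy(need, current))
```

In this sketch the question text doubles as the explanation of why a retrieved item is offered, and retrieval runs only for the need the participant actually selects, which mirrors the interactive, on-demand character of the second phase.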
1.3 Assumptions Underlying the Approach
For an organization, we expect the following main benefits from a deployment of PRIME:
1. Relevant information is not overlooked (quality aspect): Users are actively provided with access to available information. As a consequence, the danger of additional costs or a loss of quality caused by information being overlooked is reduced.
2. Time spent on searching is reduced (efficiency aspect): Posting queries is time-consuming; it interrupts the users’ primary work, forcing them to spend time on searching and browsing available information sources. Providing users with preformulated queries that automatically retrieve the information relevant to them reduces their search efforts.
3. Searching for information is more convenient (convenience aspect): Because query generation can be automated, users need not be bothered with formulating queries in proprietary, information source-specific query languages. In particular, the user does not need to repeatedly enter the same activity characteristics (or certain project characteristics perhaps unknown to him) to search for relevant information.
Initial proposals on corresponding measurable success criteria will be presented in Section 6.2.
1.3 Assumptions Underlying the Approach
The following is a list of assumptions on the organization that must hold as a prerequisite to the approach presented here:
1. The main organizational and cultural issues of KM [Hac00] have been successfully addressed and are taken care of; in particular, senior management is committed to the organization’s KM initiative, and a culture of knowledge sharing has been or is being established.
2. The organization is willing to assign resources to an organizational unit (in the following called Knowledge Department) that is responsible for the management of knowledge related to the organization’s software engineering domain.
3. The organization is willing to deploy (or already has deployed) a Process Support Environment that provides the basic functionality of a workflow engine [Hol95]; in particular, employees are provided with to-do lists that contain their individual tasks (activities).
1.4 Structure of Dissertation
The remainder of this work is structured as follows: Chapter 2 contains a short introduction to Knowledge Management and reviews existing work on Knowledge Management in the field of Software Engineering. In Chapter 3, a life-cycle model is presented for Process-Oriented Knowledge Management (POKM). Based on this model, we introduce a refined model for Software Engineering Process-Oriented Knowledge Management (SE-POKM) that takes into account the specific demands encountered within software organizations. In Chapter 4, we present the Process-oriented Information resource Management Environment (PRIME). PRIME provides a technical infrastructure to support Software Engineering Process-Oriented KM, and is designed to be coupled with the organization’s Process-Centred Software Engineering Environment. The integration of MILOS and PRIME is used as an example to demonstrate how SE-POKM is supported by a tool environment, in particular, how users of a PSEE benefit from services provided by PRIME(6). Chapter 5 outlines a complete knowledge utilization cycle, covering incremental knowledge evolution based on continuous user feedback. Chapter 6 discusses the work presented here, followed by a summary and an outlook on future work in the concluding chapter.
(6) In fact, we believe that the provision of this service will become one of the main reasons for introducing a PSEE into an organization.
CHAPTER 2
Knowledge Management and Software Engineering
"Those who cannot remember the past are condemned to repeat it." (George Santayana)
In the so-called information age in which the knowledge society has been proclaimed, where companies face the difficulties of high staff turnover rates and a lack of skilled people for positions that demand higher education, there is an increased tendency towards regarding knowledge as a valuable resource for modern organizations. Motivated by this insight, Knowledge Management (KM) has been proposed as a means to systematically manage the experience and know-how created and used within an organization. In an analogy to physical resources, knowledge is viewed as a special kind of raw material that needs to be acquired or created, stored and distributed just-in-time to specific locations where it is needed by certain employees during their tasks or activities. Consequently, management of knowledge not only involves the identification and maintenance of valuable knowledge, but also managing the logistic processes that prepare and distribute this knowledge. Thus, an organization not only needs to "know what it knows", but also needs to know "who needs what knowledge, and when?"
Knowledge Management is an interdisciplinary topic that addresses economic, organizational, cultural, psychological and technological issues. Hence, various alternative definitions of Knowledge Management exist, reflecting the particular focus of the respective research disciplines (see e.g. [Lie99]). In this work, we will focus on the Computer Science view on KM, and start from a clarification of the term provided by O’Leary [OLe98b].
Clarification 2.1: Knowledge Management
Knowledge Management (KM) is the formal management of knowledge for facilitating creation, access, and reuse of knowledge, typically using advanced technology.
Knowledge that is or might be valuable to an organization typically resides in many different places and in various formats, e.g. inside people’s heads, books, filing cabinets, file systems, databases, knowledge bases, information systems, etc. Motivated by this observation, O’Leary further characterizes Knowledge Management as "a process of converting knowledge from the sources accessible to an organization and connecting people with that knowledge" [OLe98a]. Figure 2.1 illustrates this view on Knowledge Management, where we assume that a separate organizational unit, called the Knowledge Department (KD), has been established, which is responsible for the organization’s knowledge converting and connecting processes.
[Figure 2.1: diagram; labels recovered from the original: Organization, Knowledge Department, Internal Information Sources (information systems, employees), External Information Sources (information systems, external experts); arrows "capture & convert knowledge", "connect to knowledge items/bearers", "knowledge transfer".]
Fig. 2.1: External view on Knowledge Management. The Knowledge Department is responsible for capturing, converting and storing available knowledge from internal and external information sources (subject matter experts, books, databases, etc.). In order to facilitate knowledge transfer to employees with information needs, it connects stored knowledge items or experts with employees to satisfy their information needs.
In the literature, this Knowledge Department is also often called an Organizational Memory (see e.g. [SZ95]) or Corporate Memory. In the context of this thesis, we will use the term "Knowledge Department", hoping to avoid any confusion between the organizational unit responsible for Knowledge Management activities, and any technical infrastructure that is operated by the Knowledge Department to support or automate these activities.
Clarification 2.2: Knowledge Department
The Knowledge Department (KD) is a logical or physical unit that is responsible for knowledge connecting and converting activities. The KD is headed by a Chief Knowledge Officer (CKO), who is supported by a group of Knowledge Engineers and Knowledge Brokers. Knowledge Engineers perform or automate the tasks of capturing, formalizing, and maintaining knowledge (i.e. their focus lies on "converting"). Knowledge Brokers are responsible for distributing captured knowledge that is relevant to other employees (i.e. their focus lies on "connecting"); typically, Knowledge Brokers act on explicit requests from other employees. Knowledge Brokers might be supported by automated tools provided and maintained by Knowledge Engineers.
2.1 A Life-Cycle Model for Knowledge Management
In order to achieve a systematic Knowledge Management, several life-cycle models for Knowledge Management have been proposed in the literature (see e.g. [DP98], [DC99], [CHL+94], [Nis99]). As Nissen [NKS00] points out, these models differ mainly in their naming of phases, but otherwise share many similarities. In the following, we mostly adhere to the Nissen model [Nis99][NKS00], which distinguishes between six phases:
1. Capture: Valuable knowledge is elicited and externalized(1)
2. Organize: Captured knowledge is systematically stored for later access
3. Formalize: Whenever necessary, captured knowledge is formalized, e.g. for the sake of clarity or an intended automation
4. Distribute: Captured knowledge is made available to employees
5. Apply: Employees make use of the distributed knowledge during their work
6. Evolve: During knowledge application, new knowledge is created that might be valuable to the organization (or other employees)
As Nissen points out, “knowledge evolution leads in turn to further knowledge creation, thereby completing the cycle” [NKS00]. Thus, the resulting model might best be depicted in analogy to the Spiral Model [Boe87], to better illustrate the model’s iterative nature (see Figure 2.2).
(1) Later, Nissen renamed the phase “Capture” to “Create” [NKS00]. However, we prefer to use his original notation.
[Figure 2.2: cycle of the six phases Capture, Organize, Formalize, Distribute, Apply, Evolve.]
Fig. 2.2: Nissen life-cycle model.
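The cyclic ordering of the six phases can be made explicit with a small Python sketch. The names `Phase`, `next_phase` and `iterate` are hypothetical and serve only to illustrate that Evolve leads back to Capture; the sketch makes no claim about how any KM tool represents the cycle.

```python
from enum import Enum


class Phase(Enum):
    CAPTURE = 1
    ORGANIZE = 2
    FORMALIZE = 3
    DISTRIBUTE = 4
    APPLY = 5
    EVOLVE = 6


def next_phase(phase: Phase) -> Phase:
    """Return the phase that follows `phase`; Evolve wraps around to Capture."""
    members = list(Phase)
    return members[(members.index(phase) + 1) % len(members)]


def iterate(start: Phase, steps: int):
    """Yield `steps` consecutive phases beginning with `start`."""
    phase = start
    for _ in range(steps):
        yield phase
        phase = next_phase(phase)


# Starting in the Apply phase, the cycle continues into a new Capture phase.
print([p.name for p in iterate(Phase.APPLY, 4)])
# ['APPLY', 'EVOLVE', 'CAPTURE', 'ORGANIZE']
```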
In Chapter 3, we will refine this life-cycle model into a model for Process-Oriented Knowledge Management with a focus on software organizations. The next section reviews existing Knowledge Management research in the context of Software Engineering. Further details on Knowledge Management in general can be found in [Lie99][DP98].
2.2 Knowledge Management for Software Engineering
In Software Engineering, Knowledge Management is addressed by the Experience Factory (EF) approach proposed by Basili et al. [Bas89][BR91][BCR94]. In order to support comprehensive reuse of experiences (software products, software processes, lessons learned, etc.), the operation of a separate organizational unit (called the Experience Factory) is described:
Clarification 2.3: Experience Factory
"The Experience Factory is a logical and/or physical organization that supports project developments by analyzing and synthesizing all kinds of experience, acting as a repository for such experience, and supplying that experience to various projects on demand. It packages experience by building informal, formal or schematized, and productized models and measures of various software processes, products, and other forms of knowledge via people, documents, and automated support." [BCR94]
As the name suggests, the EF focuses primarily on experiential knowledge, i.e. knowledge that reflects concrete experience gained by individuals, as opposed to generalized knowledge (e.g. rules, heuristics, etc.). The repository deployed as a technical infrastructure to store the experience is called an Experience Base (EB). The Experience Base stores knowledge that is related to software development (e.g. process models, software products, available resources) in the form of experience packages. Each experience package consists of an artifact (i.e. a piece of information that can be processed electronically) and an artifact characterization. The characterization is intended to facilitate a decision on whether the artifact should be used in future situations [BCR94]. Figure 2.3 illustrates the Experience Factory concept.
[Figure 2.3: diagram of the Project Organization (Characterize, Set Goals, Choose Process, Execution Plans, Execute Process) and the Experience Factory (Project Support, Analyze, Package with Generalize, Tailor, Formalize, and the Experience Base). Flows between the two: Environment Characteristics; Goals, Processes, Tools, Products, Resource Models, Defect Models, ...; Data, Lessons Learned; Project Analysis.]
Fig. 2.3: Experience Factory Organization (adapted from [BCR94]).
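The structure of an experience package (an artifact plus its characterization) can be sketched in Python as follows. The attribute names, the example packages and the naive attribute-matching score are assumptions chosen for brevity; they are not taken from [BCR94] and merely stand in for whatever characterization language and retrieval technique a concrete Experience Base uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperiencePackage:
    artifact: str           # e.g. an identifier of the stored artifact
    characterization: dict  # attribute -> value describing the artifact


def similarity(query: dict, characterization: dict) -> float:
    """Fraction of query attributes whose values match the characterization."""
    if not query:
        return 0.0
    hits = sum(1 for key, value in query.items()
               if characterization.get(key) == value)
    return hits / len(query)


def retrieve(experience_base, query, threshold=0.5):
    """Return packages ranked by similarity, dropping those below threshold."""
    scored = [(similarity(query, p.characterization), p) for p in experience_base]
    ranked = sorted((sp for sp in scored if sp[0] >= threshold),
                    key=lambda sp: sp[0], reverse=True)
    return [p for _, p in ranked]


experience_base = [
    ExperiencePackage("lesson-017.txt",
                      {"kind": "lesson learned", "phase": "design", "domain": "banking"}),
    ExperiencePackage("effort-model.xls",
                      {"kind": "resource model", "phase": "planning", "domain": "banking"}),
]
query = {"phase": "design", "domain": "banking"}
print([p.artifact for p in retrieve(experience_base, query)])
```

The characterization is what makes the decision "should this artifact be reused here?" computable; the artifact itself is never inspected by the retrieval function.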
However, the responsibilities of the Experience Factory exceed those of setting up and running the Experience Base: "The Experience Factory can be used to consolidate and integrate activities, such as packaging experience, consulting, quality assurance, education and training, process and tool support, and measurement and evaluation." [BCR94] Consequently, the EF is also responsible for a continuous improvement of the organization’s software development process. This responsibility is made explicit by the Quality Improvement Paradigm (QIP) (see Figure 2.4) that underlies the operation of the Experience Factory [BCR94].
[Figure 2.4: cycle of the six steps Characterize, Set Goals, Choose Artifacts, Execute, Analyze, Package.]
Fig. 2.4: Quality Improvement Paradigm cycle (adapted from [BCR94]).
In general, the QIP cycle can be applied at different granularity levels, e.g. at an organizational, project or activity level (see e.g. [Tau00]); Table 2.1 shows an example on the project level.
QIP Step              Project Level
1. Characterize       Characterize project and identify relevant models to be reused
2. Set Goals          Define project goals in measurable terms and derive related measures
3. Choose Artifacts   Choose appropriate processes and develop project plan
4. Execute            Perform project according to plan, collect data, and provide online feedback for project guidance
5. Analyze            Analyze project and collected data, and suggest improvements
6. Package            Package analysis results into improved reusable artifacts
Tab. 2.1: The QIP cycle instantiated on a project level (adapted from [Tau00]).
Several successful realizations of the Experience Factory concept have been reported from major software development-intensive organizations, e.g. NASA [BCR94], DaimlerChrysler [HSW98] or sd&m [Schie01]. In particular, various tools that aim at supporting the Experience Factory have been and are being developed (see e.g. [ABT98][WSM+99][Fel00]). In general, these tools encompass mechanisms that help to capture experience, and make it accessible and reusable via appropriate interfaces. Retrieval of experience packages is either based on standard information retrieval approaches using statistical document analysis techniques, or deploys more advanced, knowledge-intensive techniques for similarity-based or ontology-based retrieval. For a recent overview on Experience Factory research, the reader is referred to [Tau00].
Because of the holistic reuse scenario outlined by the Experience Factory framework, other reuse-oriented approaches can be interpreted as (necessary!) framework refinements that focus on particular software engineering activities or artifacts. For example, Domain Analysis techniques [Ara93] will have to be applied in order to identify organization-specific types of reusable artifacts and appropriate characterization languages. Furthermore, various research disciplines have developed specific reuse concepts for several of these artifact types, e.g. Product Line Approaches [BJM+00], Process Patterns (e.g. [Mün01]), Analysis Patterns [Fow97], Design Patterns [GHJ+96], or source code reuse [PF87]. In the following, we will look in more detail at process models, a specific kind of artifact typically maintained by the EF. More information on software reuse in general can be found in [SPM94][SCK95][JGJ97].
2.2.1 Process Modeling
Process modeling as a way to capture and reuse knowledge about software development processes has a long tradition in SE research [RV95]. Process modeling is concerned with the specification and evolution of explicit representations (process models) of real-world processes occurring within software organizations. Process models mainly serve to support human communication about processes, or to facilitate optimization of processes, e.g. in terms of throughput or quality. To achieve the latter, process models serve
• as a basis for reasoning about real-world processes (e.g. simulation, critical path analysis, deadlock detection, etc.),
• as domain knowledge or ’best-practice’ patterns (see e.g. [Mün01]) during project planning,
• as a means to support the enactment of real-world processes, ranging from non-restrictive guidance (see e.g. [BW98]) to restrictive process automation by workflow engines (see e.g. [Fer93]).
So far, three main approaches have been distinguished [Law97][CHR+98]:
• communication-based modeling describes processes as communication acts between performers and customers [WF87].
• artifact-based modeling is centred around the products (i.e. documents) on which specific operations (e.g. creation, read-access, modification) can be performed as the artifacts pass through a series of activities (see e.g. [CG99]).
• activity-based modeling focuses on the activities that occur during the process, along with the flow of information (product flow) between these activities and the order in which they occur (control flow) (see e.g. [SO97]).
For each of these approaches, various representation techniques have been proposed, including Petri nets [BFG93], several languages inspired by different programming language paradigms (e.g. procedural [SO97], object-oriented [CHL+94][JPL98], and rule-based [TKP94][PSW92] languages), description logics [SV99], and also multi-view concepts [Ver94]. The classification scheme mentioned above provides only a coarse differentiation, i.e. there are modeling languages that facilitate both artifact-based and activity-based modeling. Whereas several different process modeling languages have been proposed, a certain core set of concepts can be identified (see [STO95]):
• Activity descriptions: to define the steps that occur as part of the processes; for example, activity descriptions might consist of a name, goals to be achieved, or guidelines on how to perform the activities
• Artifact (or product) descriptions: to define the various products that are handled during activities
• Resource descriptions: to specify human or computer resources (e.g. tools) that can be assigned to activities or used during activities
• Relationships among entities: to define different kinds of associations and dependencies between activities, artifacts and resources; for example, to specify decompositions of activities into sub-activities, to define which products have to be created during certain activities, or to specify the roles in which a resource can participate in activities
• Constraints: to specify conditions over activities, products, and resources that have to be satisfied (e.g. entry/exit conditions that must hold when the assigned agent attempts to start/finish an activity).
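The five core concepts listed above can be pictured as a small data model. The following Python sketch is only one possible reading; every class, field and example value in it (`ActivityDescription`, `required_role`, "Alice", and so on) is a hypothetical name introduced for illustration and does not correspond to any particular process modeling language.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    name: str
    status: str = "incomplete"


@dataclass
class Resource:
    name: str
    roles: set = field(default_factory=set)  # roles this resource may take


@dataclass
class ActivityDescription:
    name: str
    goal: str
    sub_activities: list = field(default_factory=list)  # decomposition
    produces: list = field(default_factory=list)        # artifact names
    required_role: str = ""


@dataclass
class Constraint:
    description: str
    holds: Callable[[dict], bool]  # evaluated over a dict of artifacts by name


def assignable(resource: Resource, activity: ActivityDescription) -> bool:
    """A resource may be assigned if it can act in the required role."""
    return activity.required_role in resource.roles


implementation = ActivityDescription(
    name="Implementation", goal="Realize the design in code",
    sub_activities=[ActivityDescription("Coding", "Write source code"),
                    ActivityDescription("Unit test", "Test each component")],
    produces=["code"], required_role="programmer")

alice = Resource("Alice", roles={"programmer", "reviewer"})
design_complete = Constraint(
    "entry condition: design document is complete",
    holds=lambda artifacts: artifacts["design"].status == "complete")

artifacts = {"design": Artifact("design", status="complete")}
print(assignable(alice, implementation), design_complete.holds(artifacts))
```

Relationships appear here as plain fields (`sub_activities`, `produces`, `required_role`); a real modeling language would typically give them their own first-class representation.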
For process models, it is important to distinguish between a type level and an instance level (see e.g. [JPL98]). The type level represents classes of (real-world) activities (e.g. the class of all implementation activities), products, and resources. The instance level represents concrete occurrences of (real-world) activities (e.g. a concrete implementation activity currently being performed by an agent), products, and resources. In this work, the term "process model" always denotes a type level model, in which process types, product types and resource types represent classes of activities, products and resources, respectively; Figure 2.5 shows a typical example of a process type definition.
process type definition
  name Implementation Process
  goal Transformation of the Detailed Design representation of a software product into a programming language realization
  products
    consume
      reqdoc: Requirements Document
      desdoc: Design Document
      ...
    produce
      codedoc: Code Document
  conditions
    precondition reqdoc.status = "complete" and desdoc.status = "complete".
  ...
end process definition
(Annotations in the figure: name: "What is the name of the activity class?"; goal: "What is the goal to be reached?"; consume: "What products are accessed?"; produce: "What products are created?"; conditions: "When can activities of this class be started?")
Fig. 2.5: Excerpt from a process type definition in the process modeling language MILOS (for details, see [VBM+96]).
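The distinction between type level and instance level can be made concrete with a short Python sketch that mirrors the excerpt in Figure 2.5. This is not MILOS syntax or MILOS code: the classes `ProcessType`, `Activity` and `Product` and the way the precondition is encoded as a function are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Product:
    """Instance-level product with a status attribute."""
    type_name: str
    status: str = "incomplete"


@dataclass
class ProcessType:
    """Type-level description of a class of activities."""
    name: str
    consume: dict                          # parameter name -> product type name
    produce: dict
    precondition: Callable[[dict], bool]   # evaluated over product instances


@dataclass
class Activity:
    """Instance-level occurrence of a process type within a project."""
    process_type: ProcessType
    products: dict                         # parameter name -> Product

    def can_start(self) -> bool:
        return self.process_type.precondition(self.products)


implementation = ProcessType(
    name="Implementation Process",
    consume={"reqdoc": "Requirements Document", "desdoc": "Design Document"},
    produce={"codedoc": "Code Document"},
    precondition=lambda p: p["reqdoc"].status == "complete"
                           and p["desdoc"].status == "complete",
)

activity = Activity(implementation, {
    "reqdoc": Product("Requirements Document", "complete"),
    "desdoc": Product("Design Document", "in progress"),
})
print(activity.can_start())   # False: the design document is not complete
activity.products["desdoc"].status = "complete"
print(activity.can_start())   # True
```

The process type is defined once and stays unchanged, while each project creates its own activities and products that are checked against the type's precondition.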
The terms "activity", "product", and "agent" denote an individual occurrence (or instance); it will be clear from the context whether the real-world occurrence or its corresponding (instance level) representation is meant. In Software Engineering, the elicitation, maintenance, and distribution of process models is typically considered to be the responsibility of a specific organizational unit, called Software Engineering Process Group (or Process Group (PG) for short), as proposed by various software process improvement initiatives (e.g. Capability Maturity Model [PCC+93], or SPICE [EDM98]). The Process Group can be regarded as a subgroup of the Experience Factory staff.
2.2.2 Process-Centred Software Engineering Environments
In the literature, tool support for Knowledge Management that focuses on Software Development Processes has been proposed in the form of Process-Centred Software Engineering Environments (PSEEs) [GJ96]. The support provided encompasses several process-relevant areas such as modeling, planning, managing, executing, and monitoring software processes.
Garg and Jazayeri distinguish between three different areas [GJ96]:
• Process engineering: The definition and maintenance of software process models.
• Software engineering: The development and maintenance of a software product (by following a software process).
• Project management: The coordination and monitoring of the activities performed during software engineering.
A PSEE provides tool support for these three areas, integrated into a single environment (see Figure 2.6).
[Figure 2.6: diagram relating the PSEE and the process to the three areas Software Engineering, Process Engineering and Project Management; support functions shown: Status, Automation, Guidance, Enforcement, Analysis, Definition, Status Monitoring, Simulation, History, Measurement, Improvement, Modification, Controlling.]
Fig. 2.6: Software Development Support by a PSEE [GJ96].
Again, it is important to distinguish between the generic (type level) process models defined during process engineering, and the project-specific (instance level) process model also called the project plan. The latter might be modified as part of Project Management due to external or internal reasons. Figure 2.7 illustrates this distinction by outlining an idealized dataflow between process engineering, project management and software engineering. The dataflow has been annotated with the corresponding QIP steps (see Table 2.1).
[Figure 2.7: dataflow diagram. Process engineering takes past experience and process requirements as input and yields the process model. Project planning (QIP 1-3) takes the process model, project requirements and product requirements and yields the project plan. Project management & software engineering (QIP 4) work from the project plan and yield the product. Learning (QIP 5-6) feeds experience back to process engineering.]
Fig. 2.7: Dataflow between the three areas supported by a PSEE (adapted from [GJ96]).
An initial process model is created during process engineering; this process model is supposed to be independent of particular projects. For a specific project, the process model (or part of it) is instantiated to reflect a given project’s characteristics and requirements. This instantiated (or tailored) process model, the initial project plan, is the result of a project planning step. The project plan specifies the activities that are to be performed during software engineering. The PSEE supports this plan enactment by enforcing or automating activities, guiding process performers (e.g. designers, reviewers, programmers, etc.) through the process, or providing status information. Based on experience acquired during specific projects, the process model might need to be adapted to better meet the organization’s goals, resulting in a continuous process (model) improvement.
2.3 Shortcomings of SE Knowledge Management Approaches
2.3.1 Shortcomings of Experience Base Approaches
At the time of writing this thesis, most existing Experience Bases are based on the passive query/retrieve model, where users are assumed to express their information need in the form of a query(2), and are provided with the result set determined by the query. As argued in Chapter 1, such an approach is inadequate to effectively provide software process participants with up-to-standard, relevant information. In particular, systematic reuse is difficult to ensure, because the passive query/retrieve model:
• relies on the user to decide when to search for relevant experiences,
• requires that users know about the existence of an appropriate information source (which, in the case that several information systems are available in addition to the Experience Base, might not be trivial), know how to access it, and know how to specify queries correctly, and
• forces users to repeat queries regularly in order to obtain newly available information.
Furthermore, current Experience Base implementations typically do not maintain the context of a user’s queries: as a consequence, users are repeatedly asked to enter project and activity characteristics in order to specify their current situation. The next time a user logs in to search for experiences related to his current activity, he will have to specify the activity characteristics again.
(2) Note that, according to Clarification 2.3, the EF supplies experiences "on demand".
2.3.2 Shortcomings of Process-Centred SE Environments
While process modeling has been accepted by several software organizations(3) (even if only in the form of informal documentation), for various reasons the use of PSEEs by process performers during their enactment of individual software engineering activities has not become state-of-practice. In our opinion, two main reasons for this lack of acceptance can be identified:
1. the highly dynamic nature of SE projects (e.g. as opposed to handling standard business processes), and
2. the inherently knowledge-intensive, creative nature of software development activities.
While several approaches address the first issue by facilitating change management (see e.g. [BT96] [CHL+94]), only few address the issue of knowledge-intensive processes [DJB97]; even recent comprehensive standardization initiatives (e.g. PSL [SGT+00]) do not take this issue into account.
(3) For example, see the recent (non-exhaustive) list of all CMM Level 3 (or higher) certified companies published by the Software Engineering Institute (http://www.sei.cmu.edu/sema/pub_ml.html)
In particular, most existing approaches lack the concept of situation-specific information delivery. During their enactment of software engineering activities, process performers typically need access to a variety of information that is maintained either inside (e.g. by the Experience Factory) or outside the organization in order to successfully perform their activities. However, the support provided by a PSEE is mainly focused on coordinating the different activities within a project, e.g. by providing process performers with to-do lists, and notifying them when new activities have been assigned to them, or when changes in products have occurred. What is desirable here is the concept of an automated assistant that "looks over the process performer’s shoulder" and (pro)actively provides him with available information that is relevant for the activity currently being performed. For example, during an implementation activity that involves the use of Remote Method Invocation (RMI) technology and Serialization, the programmer might be provided with the RMI specification document, a set of relevant lessons learned on RMI, as well as known issues and problems with using RMI and Serialization technology at the same time. If, however, a change in the project plan results in the substitution of RMI with another technology during the implementation activity, the information on RMI should no longer be offered to the performer, but should be replaced by information on the new technology. In particular, any information about known conflicts between the new technology and Serialization should now be offered to the programmer. An analysis of the shortcomings identified for Experience Bases and Process-Centred Software Engineering Environments reveals the possibility to address them by an appropriate integration of both approaches.
Such an integration would aim at using the process model as a means to structure the Experience Base, in such a way that it serves as a kind of index for experience packages. As a consequence, a proactive, situation-specific delivery of relevant experience packages (as well as other knowledge items) to performers of the activities as defined in the process model might be facilitated. The following chapter will present this integrated approach, also called Process-Oriented Knowledge Management, in more detail.
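The RMI scenario above can be turned into a small Python sketch of situation-specific delivery. The class `ExperienceIndex`, the idea of indexing items by sets of technologies, and the replacement technology "CORBA" are all assumptions introduced for illustration; the source only speaks of "another technology" and does not prescribe any index structure.

```python
from collections import defaultdict


class ExperienceIndex:
    """Indexes knowledge items by the technologies they relate to (assumed key)."""

    def __init__(self):
        self._by_topic = defaultdict(list)  # frozenset of technologies -> items

    def add(self, item: str, *technologies: str):
        self._by_topic[frozenset(technologies)].append(item)

    def relevant(self, technologies: set) -> list:
        """Items whose technologies are all used in the current activity."""
        return sorted(item
                      for topic, items in self._by_topic.items()
                      if topic <= technologies
                      for item in items)


index = ExperienceIndex()
index.add("RMI specification", "RMI")
index.add("Lessons learned on RMI", "RMI")
index.add("Known issues: RMI with Serialization", "RMI", "Serialization")
index.add("CORBA specification", "CORBA")                    # hypothetical item
index.add("Known issues: CORBA with Serialization", "CORBA", "Serialization")

activity = {"name": "Implement remote access",
            "technologies": {"RMI", "Serialization"}}
print(index.relevant(activity["technologies"]))

# A change in the project plan replaces RMI with another technology.
activity["technologies"] = {"CORBA", "Serialization"}
print(index.relevant(activity["technologies"]))
```

Because the offered items are computed from the current state of the activity rather than stored with it, a plan change automatically withdraws the RMI items and brings in those for the new technology, including known conflicts with Serialization.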
CHAPTER 3
Process-Oriented Knowledge Management
In this chapter, we provide a clarification of Process-Oriented Knowledge Management (POKM), and present a detailed life-cycle model for Software Engineering Process-Oriented KM (SE-POKM). This model is based on an explicit representation of information needs that potentially arise for process participants during their activities.
3.1 What is Process-Oriented Knowledge Management?
In the literature, a distinction is made between Product-Oriented Knowledge Management (i.e. focus on knowledge as artifacts) and Process-Oriented Knowledge Management (i.e. focus on knowledge as a resource to support human work processes) [SAD99]. Typical examples of the latter are the management of process descriptions (e.g. process modeling), or the management of experiences on performing these processes (e.g. traces, design rationales, or lessons learned). The primary goal of POKM is to establish, run and maintain an organizational environment that provides process participants with the information needed to successfully perform their activities. In the context of this work, we call Knowledge Management process-oriented when the Knowledge Department maintains an activity-based process model as its primary means to convert knowledge and to connect process participants with available knowledge. The following clarifications specify this in more detail.
Clarification 3.1: Process-Oriented Knowledge Management
Process-Oriented Knowledge Management (POKM) is a specialized form of Knowledge Management, in which a process model is the primary means to:
• capture
• organize
• formalize
• distribute
• apply
• evolve
knowledge on what information might be useful to successfully perform the classes of activities specified in the process model.
In the following, we explain what we mean by this process (model)-orientation of the KM phases.
3.1.1 Process-Oriented Knowledge Capture
Among the knowledge to be captured are the classes of activities performed by employees within the organization. We assume that these activity classes are explicitly represented in the form of process types (see Figure 3.1).
[Figure 3.1: UML diagram relating “Real World” activities to the process model. The process model contains the process type Testing with the sub-types Black-Box Testing and Glass-Box Testing. Legend: activity; process type t is sub-type of process type s.]
Fig. 3.1: Simplified process model example (depicted as a UML diagram). Classes of (real-world) activities are represented by process types.
Clarification 3.2: Process Type
A process type is an abstract description of a class of activities, ranging from activities ’in-the-large’ (e.g. "Create a system design") to activities ’in-the-small’ (e.g. "Run the Test-Driver for a component"). Process types can be represented either informally (e.g. in textual form) or formally (e.g. as a concept class in description logics).
Clarification 3.3: Activity of Type T
We say that an activity act is of type T, if act is a member of the class that is represented by T.
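The process types of Figure 3.1 and the notion of an activity being "of type T" can be sketched in Python as follows. The sketch assumes that an activity of a sub-type is also treated as an activity of every super-type, that each type has at most one direct super-type, and that the sub-type relation is acyclic; these are assumptions for illustration, and the class `ProcessModel` with its methods is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessModel:
    process_types: set = field(default_factory=set)
    sub_type_of: dict = field(default_factory=dict)  # sub-type -> super-type

    def add_type(self, name, super_type=None):
        self.process_types.add(name)
        if super_type is not None:
            self.sub_type_of[name] = super_type

    def is_of_type(self, activity_type, t):
        """True if an activity classified as `activity_type` is also of type `t`."""
        current = activity_type
        while current is not None:
            if current == t:
                return True
            current = self.sub_type_of.get(current)  # climb to the super-type
        return False


model = ProcessModel()
model.add_type("Testing")
model.add_type("Black-Box Testing", super_type="Testing")
model.add_type("Glass-Box Testing", super_type="Testing")

print(model.is_of_type("Black-Box Testing", "Testing"))            # True
print(model.is_of_type("Testing", "Black-Box Testing"))            # False
print(model.is_of_type("Glass-Box Testing", "Black-Box Testing"))  # False
```

Under these assumptions, anything attached to the type Testing would also apply to a concrete black-box testing activity, but not the other way round.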
Process types are maintained in the organization’s process model, which can be regarded as a part of the organization’s enterprise ontology [UG96]. Depending on the representation formalism for process types, different relationships can be defined between process types, e.g. specialization, decomposition, or ordering relationships (see e.g. [JPL98]).
Clarification 3.4: Process Model
A process model consists of a set PT of process types, and a set of relations defined on PT.
For the specialization relation, we assume the following interpretation:
Clarification 3.5: Process Type Specialization
during activities of this class?
• Where can this information be found?
• How should this information be represented?
We assume that information is represented in the form of knowledge items. A knowledge item can be any document (e.g. a MS-Word document, a web page, or an e-mail) that can be made electronically available to an agent(1) by accessing/querying an information source. We say that information is provided to an agent if a set of knowledge items is presented to him.
(1) In the remainder of this work, we will use the terms "agent" and "process participant" synonymously.
CHAPTER 3 Process-Oriented Knowledge Management
Clarification 3.7: Knowledge Item, Information Item
A knowledge item is a document that can be presented electronically to an agent, who will have to interpret it. A knowledge item that is presented to an agent is also called an information item.

Knowledge items can be obtained from information sources that are available to the organization (see Section 1.1); typical examples of information sources are experienced colleagues, databases, Document Management Systems (DMS), or Web search engines (e.g. Google, AltaVista, etc.).

Clarification 3.8: Information Source
An information source is an entity from which knowledge items can be obtained via an interface to access/retrieve information. Entities can be either humans or electronic information systems.

It should be noted that a knowledge item retrieved from an information source can reference another knowledge item or even an information source. An example of such a knowledge item would be an e-mail from a colleague in which he recommends consulting a particular database.

A knowledge item should be offered to an agent during an activity if it is useful for the successful enactment of the activity by the agent. In general, the usefulness of a knowledge item depends on several factors, including activity characteristics, the agent’s background (e.g. skills, experiences, preferences etc.) as well as the organization’s cost/benefit considerations. Because of the latter, a local/global dichotomy exists for usefulness that, in principle, has to be taken into account: from the organization’s perspective, a knowledge item might be useful for an agent during an activity with regard to the local cost/benefit considerations, yet not useful at all with regard to the global cost/benefit. An example of such a knowledge item might be a former design document that is reused during the design process of a new project. Whereas the reuse might accelerate the design process considerably (resulting in a good local cost/benefit ratio), the reuse of the former design might cause considerable difficulties in succeeding phases of the project (i.e. the global cost/benefit ratio might be unacceptable).

In this thesis, we will use a simplified model of usefulness that is sufficient for illustration purposes. The model will have to be refined and instantiated by any organization that decides to adopt Process-Oriented Knowledge Management.

Typically, there are several different ways in which an agent can enact an activity. The goal of the Knowledge Department is to provide the agent with a set of knowledge items such that the agent enacts the activity in sufficient quality.
We assume that the quality of a specific enactment $e_{act,agt}$ of an activity act by an agent agt is measured by some quality metric $qual(e_{act,agt})$ with:

$qual(e_{act,agt}) = f(m_1(e_{act,agt}), m_2(e_{act,agt}), \ldots, m_n(e_{act,agt})) \in [0,1]$,

where the functions $m_i(e_{act,agt})$, $i = 1, \ldots, n$, denote different cost or benefit enactment measures that are amalgamated by a function f. Typical examples of such measures $m_i$ are the time required for the enactment (e.g. measured in number of days), or the number of errors detected in the products that were created during activity enactment.

Clarification 3.9: Successful Enactment
An activity act has been enacted (or performed) successfully by an agent agt if the enactment $e_{act,agt}$ is of sufficient quality, i.e.

$qual(e_{act,agt}) \geq \eta, \quad 0 \leq \eta \leq 1$,

where $\eta$ denotes some quality threshold value.
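The quality metric and the success criterion of Clarification 3.9 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration only: the `Enactment` record, the two normalised measures with their budgets, and the choice of a weighted mean as the amalgamation function f. The text above fixes none of these; an adopting organization would have to supply its own.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Enactment:
    """Hypothetical record of one enactment e_{act,agt}."""
    days_required: float
    errors_detected: int


# An enactment measure m_i; assumed here to be normalised to [0, 1].
Measure = Callable[[Enactment], float]


def time_measure(e: Enactment, budget_days: float = 10.0) -> float:
    """Assumed normalisation: 1.0 for zero days, 0.0 at or beyond the budget."""
    return max(0.0, 1.0 - e.days_required / budget_days)


def error_measure(e: Enactment, max_errors: int = 20) -> float:
    """Assumed normalisation: 1.0 for no errors, 0.0 at or beyond max_errors."""
    return max(0.0, 1.0 - e.errors_detected / max_errors)


def qual(e: Enactment, measures: Sequence[Measure],
         weights: Sequence[float]) -> float:
    """Amalgamation f, here a weighted mean of the measures.

    With non-negative weights (positive sum) and measures in [0, 1],
    the result also lies in [0, 1].
    """
    weighted = sum(w * m(e) for m, w in zip(measures, weights))
    return weighted / sum(weights)


def is_successful(e: Enactment, measures: Sequence[Measure],
                  weights: Sequence[float], eta: float) -> bool:
    """Success criterion: qual(e) >= eta for a threshold eta in [0, 1]."""
    return qual(e, measures, weights) >= eta
```

For instance, an enactment that took 5 days and produced 5 detected errors is rated 0.625 under equal weights, so it counts as successful for a threshold of 0.6 but not for 0.7.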
The goal of a process-oriented knowledge capture task is to capture knowledge on what knowledge items are useful for a class of activities a priori, i.e. either before or during the enactment of the activity. Hence, we will only consider the probability that a knowledge item has a positive influence on the enactment quality.

Clarification 3.10: Useful Knowledge Item
A knowledge item k is called useful for an agent agt during an activity act, if reading, understanding and (as far as possible) applying the contents of k during act increases the probability that the agent agt successfully performs act.

More formally: Let A be the event that agent agt has successfully performed activity act. Let B be the event that agent agt has read and interpreted k during his enactment of act. Then knowledge item k is useful for an agent agt during an activity act, if:

P(A | B) > P(A),

where P(A | B) denotes the conditional probability. A knowledge item that is not useful will be called useless.

Note that, in general, the events A and B are not independent, since reading a knowledge item (a document) usually has an influence on the quality of the activity enactment (if only because reading the document takes time).
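The criterion P(A | B) > P(A) can be made concrete with a small Python sketch that replaces both probabilities by relative frequencies over past enactments. This is only an illustration under assumptions: the record format (a pair of booleans per enactment) and the function names are hypothetical, and frequencies observed after the fact are at best an estimate of the a priori probabilities the clarification refers to.

```python
from typing import Iterable, Optional, Tuple

# One past enactment: (agent read and interpreted k, enactment was successful)
Record = Tuple[bool, bool]


def estimate_probabilities(records: Iterable[Record]) -> Optional[Tuple[float, float]]:
    """Return (estimate of P(A | B), estimate of P(A)).

    Returns None when there are no records at all, or when no agent
    read the item, because P(A | B) cannot be estimated then.
    """
    records = list(records)
    successes_of_readers = [success for read, success in records if read]
    if not records or not successes_of_readers:
        return None
    p_a = sum(success for _, success in records) / len(records)
    p_a_given_b = sum(successes_of_readers) / len(successes_of_readers)
    return p_a_given_b, p_a


def is_useful(records: Iterable[Record]) -> bool:
    """Empirical counterpart of the usefulness criterion P(A | B) > P(A)."""
    estimate = estimate_probabilities(records)
    return estimate is not None and estimate[0] > estimate[1]
```

With six recorded enactments of which three read the item (two of them successfully) and three did not (one of them successfully), the estimates are 2/3 for P(A | B) and 1/2 for P(A), so the item would be rated useful on this evidence.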
Clarification 3.11: Useful Set of Knowledge Items
A set K of knowledge items is called useful for an agent agt during an activity act, if reading, understanding and (as far as possible) applying the contents of all elements of K during act increases the probability that the agent agt successfully performs act.

More formally: Let A be the event that agent agt has successfully performed activity act. Let B be the event that agent agt has read and interpreted all elements of K during his enactment of act. Then the set K is useful for an agent agt during an activity act, if:

P(A | B) > P(A).

In order to be able to compare two sets of knowledge items with regard to their usefulness, we provide the following

Clarification 3.12: More Useful
Let K1, K2 be two sets of knowledge items. Let A be the event that agent agt has successfully performed activity act. Furthermore, let Bi be the event that agent agt has read and interpreted all elements of Ki during his enactment of act, i=1,2. Then the set K1 is more useful than K2 for an agent agt during an activity act, if:

P(A | B1) > P(A | B2).

Based on the terminology introduced above, we give the following clarification of a process-oriented knowledge capture phase:

Clarification 3.13: Process-Oriented Knowledge Capture
Process-Oriented Knowledge Capture is concerned with capturing two kinds of knowledge:
1. a process model,
2. knowledge on what information might be useful to successfully perform the classes of activities described in the process model.
Knowledge on what information might be useful can be considered as meta-knowledge, as it makes a statement about the knowledge items stored in the information sources.
Example: The sentence "During implementation activities of type ’ImplementationProcess’ in which EJB technology is used, tutorials on EJB are useful for agents who have not used EJB before." captures meta-knowledge on the usefulness of EJB tutorials (i.e. certain knowledge items).

3.1.2 Process-Oriented Knowledge Organization
The organization of available knowledge aims at supporting an efficient mapping from activities to the knowledge items that are considered to be useful for agents performing these activities. To this end, corresponding meta-knowledge is associated with each process type (see Figure 3.2).

[Figure 3.2: the process types Testing, Black-Box Testing and Glass-Box Testing of the process model, associated with the meta-knowledge chunks MKTesting, MKBB-Testing and MKGB-Testing, which refer to items in the Information Universe. Legend: process type t is a sub-type of process type s; meta-knowledge MKt is a specialization of meta-knowledge MKs.]

Fig. 3.2: Meta-knowledge organization based on process types. The type-specific meta-knowledge allows useful information items to be selected from the Information Universe.
For each process type specified in the process model, the meta-knowledge chunk captured during the process type-specific knowledge capture task is associated with that process type. These chunks refer either directly (e.g. via a static link) or indirectly (e.g. via a query statement to a database) to certain knowledge items from the organization’s information universe. Thus, for a process type S, the associated meta-knowledge $MK_S$ can be used to determine a useful set of knowledge items for agents performing activities of type S.

Clarification 3.14: Information Universe $IU_t$
The Information Universe $IU_t$ is the set of all information items that can be retrieved from the information sources available to the organization at time t.
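The two ways in which a meta-knowledge chunk can refer to knowledge items, directly via static links and indirectly via queries, can be sketched in Python as follows. The class and field names (`KnowledgeItem`, `MetaKnowledgeChunk`, `static_links`, `queries`) are hypothetical, and a query is modelled simply as a function over the current information universe rather than as a statement sent to a real database.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Set


@dataclass(frozen=True)
class KnowledgeItem:
    """Hypothetical knowledge item with an identifier and some keywords."""
    item_id: str
    keywords: FrozenSet[str] = frozenset()


# An indirect reference: selects items from the current universe IU_t.
Query = Callable[[Set[KnowledgeItem]], Set[KnowledgeItem]]


@dataclass
class MetaKnowledgeChunk:
    """Meta-knowledge MK_S associated with one process type S."""
    process_type: str
    static_links: Set[str] = field(default_factory=set)   # direct references
    queries: List[Query] = field(default_factory=list)    # indirect references

    def resolve(self, universe: Set[KnowledgeItem]) -> Set[KnowledgeItem]:
        """Determine the referenced items that exist in the given universe."""
        direct = {k for k in universe if k.item_id in self.static_links}
        indirect: Set[KnowledgeItem] = set()
        for query in self.queries:
            # Guard against queries returning items outside the universe.
            indirect |= query(universe) & universe
        return direct | indirect
```

Because `resolve` is evaluated against the universe passed in, a static link to an item that is no longer available simply yields nothing, while a query picks up matching items that were added after the chunk was captured.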
Since typically new knowledge items are continuously created (as well as deleted) both within and outside the company, the set of available items evolves over time. If the process model defines a specialization relation on the set of process types, then the knowledge organization has to reflect this relation by a corresponding specialization relation on the associated meta-knowledge (see Figure 3.2). Given our interpretation of process type specialization (see Clarification 3.5), it seems natural that meta-knowledge should be ’inherited’ along the specialization relation: a knowledge item that is considered useful for activities of a given process type is, by default, also considered useful for activities of subtypes of that process type. On the other hand, given T
6. "For activities of type ’Glass-Box Testing’, a guideline on how to determine when code reading should be preferred over automated tests is useful for planners of the activity."
7. "For activities of type ’Glass-Box Testing’, a definition of the term ’cyclomatic complexity’ is useful for all agents enacting the activity."

Considering the example above, it seems desirable that some meta-knowledge items should be inherited along

3.1.3 Process-Oriented Knowledge Formalization
Process-oriented knowledge formalization is concerned with the formalization of the meta-knowledge that defines mappings from activities to the knowledge items that are useful for the agents during their enactment of these activities. The objectives of this formalization can range from increased precision of the description of these mappings (e.g. to ensure that human knowledge brokers can correctly determine all useful information items that should be distributed; see Section 3.1.4) to complete automation of these mappings. Corresponding to these objectives, activities and agents require an explicit representation that ranges from informal textual descriptions to formal specifications (e.g. using predicate logic). For an automation of the mappings, it is required that the explicit representation of an activity and an enacting agent facilitates:
• an automatic identification of the meta-knowledge chunks that should be interpreted (i.e. the identification of the appropriate process types), and
• an automatic interpretation of this meta-knowledge in order to determine the set of captured useful knowledge items for the agent.

Example: An implementation activity act might have the following informal representation: "Testing a graphical user interface for a rulebase." Its formal specification (in predicate logic-like syntax) might be:

    activity(act).
    type(act, blackBoxTesting).
    component(ruleBaseGUI).
    objectToTest(act, ruleBaseGUI).

to denote the facts that act is an activity of type ’Black-Box Testing’, ruleBaseGUI is a component, and ruleBaseGUI is the object being tested during act. The formalization of the meta-knowledge MKBB-Testing for the process type ’Black-Box Testing’ might be (see the example in Section 3.1.2):

    ∀A: type(A, blackBoxTesting) → useful(A, guidelineForTestCases)

to denote the rule that during all activities A of type ’Black-Box Testing’, the knowledge item guidelineForTestCases is useful for all agents.

Clarification 3.17: Process-Oriented Knowledge Formalization
Let $IU = IU_t$ at time t. Let P(IU) denote the powerset of IU. Let RACT be the set of activity representations. Let RAGT be the set of agent representations.
Process-oriented knowledge formalization is concerned with formalizing a mapping $infos_T: RACT \times RAGT \to P(IU)$ for every process type T in the process model, such that the following holds: For all activities act with representation $r_{act}$, and all agents agt with representation $r_{agt}$ who participate in the enactment of act, the set $infos_T(r_{act}, r_{agt})$ is the set of knowledge items that
• the Knowledge Department has captured as useful for agent agt during enactment of act, according to the interpretation of the meta-knowledge associated with type T with regard to $r_{act}$ and $r_{agt}$, and
• are available from IU.

In general, the Knowledge Department will also have to define the representation languages RACT and RAGT. However, the focus of formalization lies on the set of mappings $infos_T: RACT \times RAGT \to P(IU)$ (i.e. the meta-knowledge) for every process type $T \in PT$, rather than on the formalization of activities or the knowledge items contained in the set IU. The latter is primarily considered as a means to achieve the former.

3.1.4 Process-Oriented Knowledge Distribution
The primary objective of process-oriented knowledge distribution is to systematically provide agents with the information needed to successfully perform their activities. Given a representation $r_{act}$ of an activity act that is an instance of type T, and a representation $r_{agt}$ of an agent agt who is performing act, the set $infos_T(r_{act}, r_{agt})$ of knowledge items serves as an approximation of this information. Typically, the appropriate process type has to be deduced from the activity representations. In general, an activity can be an instance of more than one process type (see Clarification 3.6), i.e. the Knowledge Department has to solve a classification task in order to determine from $r_{act}$ the process types that an activity act is an instance of.

Clarification 3.18: Activity Classification Function types(RACT)
Let PT be the set of process types defined in the process model. Let RACT be the set of activity representations. Then for all activities act with representation $r_{act} \in RACT$, the set $types(r_{act}) \subseteq PT$ is the set of process types that act is classified by the Knowledge Department to be an instance of.

Based on this classification function, the Knowledge Department can evaluate the meta-knowledge chunks associated with the identified process types, in order to retrieve useful knowledge items.
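As a rough illustration of the last step, the following Python sketch classifies an activity representation and then evaluates the type-specific mappings of all identified process types. It rests on several assumptions that are not part of the text: representations are plain dictionaries, the activity representation already lists candidate type names under a `"types"` key (so classification reduces to filtering against PT), and knowledge items are identified by strings. Only `guidelineForTestCases` is taken from the example in Section 3.1.3; `testPlanTemplate` and the `"role"` attribute are made up.

```python
from typing import Any, Callable, Dict, Mapping, Set

Rep = Mapping[str, Any]                   # assumed form of r_act and r_agt
InfosT = Callable[[Rep, Rep], Set[str]]   # type-specific mapping infos_T


def types(r_act: Rep, process_types: Set[str]) -> Set[str]:
    """Assumed classification: keep the declared types that belong to PT."""
    return set(r_act.get("types", ())) & process_types


def retrieve(r_act: Rep, r_agt: Rep, process_types: Set[str],
             mappings: Dict[str, InfosT]) -> Set[str]:
    """Evaluate infos_T for every identified type T and collect the results."""
    items: Set[str] = set()
    for t in types(r_act, process_types):
        if t in mappings:
            items |= mappings[t](r_act, r_agt)
    return items


PT = {"Testing", "Black-Box Testing", "Glass-Box Testing"}

MAPPINGS: Dict[str, InfosT] = {
    # Rule from the example: useful for all agents in black-box testing.
    "Black-Box Testing": lambda r_act, r_agt: {"guidelineForTestCases"},
    # Hypothetical rule that depends on the agent's role.
    "Testing": lambda r_act, r_agt: (
        {"testPlanTemplate"} if r_agt.get("role") == "planner" else set()
    ),
}
```

For an activity represented as `{"types": ["Black-Box Testing", "Testing"], "objectToTest": "ruleBaseGUI"}`, a tester would receive only the test-case guideline, whereas a planner would additionally receive the (hypothetical) test plan template.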
Clarification 3.19: Knowledge Item Retrieval Function infos
Let RACT be the set of activity representations. Let RAGT be the set of agent representations. For all activities act with representation $r_{act}$, and all agents agt with representation $r_{agt}$ who participate in the enactment of act, the set

$infos(r_{act}, r_{agt}) = \bigcup_{T \in types(r_{act})} infos_T(r_{act}, r_{agt})$

is the set of knowledge items considered to be useful by the Knowledge Department for agent agt during activity act.

In correspondence to the distinction between passive and active Organizational Memories (see Chapter 1), we distinguish between passive and active knowledge distribution. Passive knowledge distribution relies on explicit requests for information from agents during activity enactment.

Clarification 3.20: Passive Process-Oriented Knowledge Distribution
Let $IU = IU_t$ be the Information Universe at time t. Whenever the Knowledge Department receives a request for useful knowledge items concerning an activity representation $r_{act} \in RACT$ and agent representation $r_{agt} \in RAGT$ at time t, the Knowledge Department provides the agent represented by $r_{agt}$ with the set $infos(r_{act}, r_{agt})$.

In contrast, active knowledge distribution assumes that the Knowledge Department keeps track of certain activities (typically those that are currently being enacted) by maintaining a list of corresponding activity and agent representations. Whenever a new knowledge item becomes available that is considered to be useful by the KD for the enacting agents of one of these activities, the KD distributes the new item to these agents. Similarly, whenever the representation of one of the activities or agents changes (e.g. a change in the technology being handled, or an update of an agent’s skill profile), the KD determines whether any knowledge items have become useful to the agent that were not distributed before.

Clarification 3.21: Active Process-Oriented Knowledge Distribution
Let $IU_t$ be the Information Universe at time t. Let $Track^t = \{(r^t_x, r^t_y) \mid r^t_x$ is the representation of an activity x that is enacted by an agent y represented by $r^t_y\}$ be the set of activity/agent representation tuples for all activities that the Knowledge Department is supposed to track at time t.
Whenever a non-empty set of new (2) knowledge items K is added to $IU_t$ at time t', and

$newInfos = infos^{t'}(r^{t'}_{act}, r^{t'}_{agt}) \setminus infos^{t}(r^{t}_{act}, r^{t}_{agt}) \neq \emptyset$

for some tuple $(r^{t}_{act}, r^{t}_{agt}) \in Track^{t}$, then the Knowledge Department distributes the set newInfos to agent agt.

3.1.5 Process-Oriented Knowledge Application
The knowledge items provided by the Knowledge Department to agents during their activities enable them to read, understand and (as far as possible) apply the information contained in the knowledge items during activity enactment. Thus, knowledge application is not necessarily automated. This is especially true for the highly creative software engineering activities considered in this work, which are performed by human agents. Typically, the application of the information provided results in the creation of new knowledge items, either in the form of products that were expected to be created during the activity (immediate work results) or additional, non-required knowledge items (e.g. lessons learned during the activity).

3.1.6 Process-Oriented Knowledge Evolution
For sufficiently complex domains, process-oriented knowledge capture cannot be completed in one step. Therefore, a learning (or evolution) process has to be established to successively elicit new knowledge and update the knowledge that has already been captured. In addition, knowledge evolution is necessary whenever
• the set of activities that have to be performed changes in a way that requires a change to the process model, or
• new knowledge items are continuously being created that need to be taken into account for the successful performance of current or future activities, or
• the educational background of new agents (employees) requires additional knowledge items to be provided.

(2) A change to an existing knowledge item k is also interpreted as causing the creation of a new item, namely "the item k has been changed in the following manner:..."

An important input for the learning process is the direct feedback from agents who were provided (or should have been provided) with knowledge items during their activities. Feedback regarding the knowledge distribution can be obtained by asking agents a set of standard questions for every activity that has been performed, e.g.:
• Did you miss useful information items to help you with performing the activity? If yes, how can we make sure that this information is provided in the future?
• Did we provide you with irrelevant knowledge items? How can we prevent this in the future?
• Concerning the knowledge provided: is there a better way to provide it (e.g. by restructuring documents)?
• Did you find new information that you think is worth capturing?

The collection of feedback to these questions corresponds to a new process-oriented knowledge capture phase, which triggers the next iteration through all the phases of the KM life-cycle (see Section 2.1).

During the evolution phase, the Knowledge Department’s performance is evaluated based on the information it provided during each individual activity. Shortcomings identified by this evaluation will trigger changes to the set PT of process types, the set of captured knowledge items, or the set of mappings $\{infos_T: RACT \times RAGT \to P(IU),\ T \in PT\}$.
For a controlled learning process that aims at continuous improvement, the Knowledge Department should define a learning goal that it tries to achieve and against which it can be measured.

Example: "After a finite number of evaluation steps (= $m_0$), the KD can provide for all activities being performed by an agent a useful set of knowledge items such that the following holds: the probability that the activities are successfully performed when the set of provided knowledge items is read and interpreted during the activities is greater than or equal to some given threshold."

Let $A^t_{agt}$ be the set of activities performed at time t by agent agt.

Learning Goal: Find a sequence $(PM_t, KB_t, \{infos^t_T: RACT \times RAGT \to P(IU_t),\ T$ a process type defined in $PM_t\})$ such that:

$\exists m_0 \in \mathbb{N}: \forall m \geq m_0: \forall act \in A^m_{agt}$, with representations $r_{act} \in RACT$ and $r_{agt} \in RAGT$:

$P(\text{agt successfully performs activity act} \mid \text{agt reads and interprets } infos^m(r_{act}, r_{agt}) \text{ during his enactment of act}) \geq \delta$

for some threshold $\delta$, $0 \leq \delta \leq 1$.

The learning goal specified in the above example is probably difficult to achieve for $\delta$ near 1, as activity characteristics might vary greatly over time. This will pose a problem to the Knowledge Department, e.g. when no useful knowledge items on a new technology are available. In contrast, the following example illustrates a more moderate learning goal.

Example: "If the same activity has to be performed again by another agent at some later time, then a more useful set of knowledge items can be provided for this agent than during the former enactment of the activity."

Let $A^t_{agt}$ be the set of activities performed at time t by agent agt.

Learning Goal: Find a sequence $(PM_t, KB_t, \{infos^t_T: RACT \times RAGT \to P(IU),\ T$ a process type defined in $PM_t\})$ such that:

$\forall m: \forall act \in A^m_{agt}$: if $act \in A^n_{agt'}$ for some $n > m$, then, with representations $r_{act}, r'_{act} \in RACT$ and $r_{agt'} \in RAGT$:

$infos^n(r'_{act}, r_{agt'})$ is more useful than $infos^m(r_{act}, r_{agt})$.
In practice, the Knowledge Department’s performance (including evolution) has to be evaluated from the organization’s perspective in terms of global cost/benefit considerations. These will have to be defined by any organization that intends to introduce Process-Oriented Knowledge Management, but lie outside the scope of this thesis. In the following section, we extend and refine the life-cycle for Process-Oriented Knowledge Management to software engineering processes.

3.2 A Model for Software Engineering Process-Oriented KM
Before the generic life-cycle model for Process-Oriented Knowledge Management presented in the previous section can be utilized, its central concepts (e.g. process model, meta-knowledge, etc.) as well as the steps in the life-cycle model need to be further refined. In the following, we present such a refinement for software engineering processes, resulting in
a detailed life-cycle for Software Engineering Process-Oriented Knowledge Management (SE-POKM). The refinement is based on existing process modeling concepts for software processes. Furthermore, the formalization of meta-knowledge presented here relies on a flexible annotation of activities, which takes into account that software processes are generally weakly structured. This annotation is used solely for the determination of useful knowledge items, rather than being intended to formalize activities.

As has been noted by Tautz [Tau01], current process modeling languages are still ill-suited for knowledge-intensive processes such as software development. In the context of this thesis, one of the main problems with current modeling languages is that the model only provides an initial, static, idealized view of the activities. In practice, their descriptions as well as the set of products expected to be handled during their enactment will have to be adapted in order to reflect events and changes that occur during the project. Furthermore, the products and tools specified for an activity must be considered as defaults only: the performer should be allowed, and in fact supported, to inspect additional documents, launch additional tools etc., depending on his/her preferences, skills and experience.

Consequently, a static list of predefined documents, which many process modeling languages provide as the means to distribute activity-specific information, is inadequate for software development processes. Instead, a flexible mapping from activities to available, useful documents is required that is based on concrete activity/project characteristics. Whereas modeling languages have been developed that provide the means to define events and event-handling routines for activities (see e.g. [SO97]), they are focused on control issues between activities; there are no separate means to describe meta-knowledge on where or how certain information, depending on certain activity and agent characteristics, might be found.

For knowledge-intensive processes such as software development, activities should be expected to consist of three kinds of interlinked subactivities: (i) knowledge gathering activities, (ii) product-generation activities, and (iii) knowledge creation activities. Standard process modeling approaches are mainly concerned with the modeling of product-generation activities. The concepts introduced in the following sections will extend a standard process modeling approach by the means to explicitly support knowledge gathering activities (by meta-knowledge modeling) and knowledge creation activities (by feedback mechanisms during enactment).
The following sections present the individual phases of a life-cycle model for Software Engineering Process-Oriented Knowledge Management in detail.

3.2.1 Capturing Process-Specific Information Needs
As described in Clarification 3.13, process-oriented knowledge capture subsumes the capture of (i) a process model, and (ii) meta-knowledge on what information might be useful during the activity classes described in the process model. Process type-specific meta-knowledge is elicited during type-specific knowledge capture tasks that try to clarify the following questions (see Section 3.1.1):
• What information should be offered to agents performing a certain role during activities of this class?
• Where can this information be found?
• How should this information be represented?

Initial answers to these questions can be elicited from two sources: the Process Group and agents who participated in the enactment of former activities that are considered instances of the process type.

Seemingly, the set of core concepts for process modeling listed in Section 2.2.1 provides no explicit means to represent type-specific meta-knowledge. At best, the constructs for artifacts or resources might be abused for meta-knowledge representation. In the following, we argue that the set of core concepts needs to be extended by additional concepts to represent knowledge about available information items that might be useful for agents during activities of a certain class; we propose an extension to the modeling language by constructs for representing information source recommendations and expected information needs. Information source recommendations and expected information needs correspond to the meta-knowledge items maintained in type-specific meta-knowledge chunks (see Section 3.1.2).

In order to provide a template for the representation of available information sources, we refine the former Clarification 3.8 in the following way:

Clarification 3.22: Information Source (IS)
An information source (IS) is an entity from which knowledge items can be obtained via an interface to access/retrieve information. Entities can be either human or electronic information systems.
An IS is represented by the following aspects:
• name: the name by which the information source is commonly referred to within the organization
• contents description: a short text that explains what information is stored here, and how the information source can be used
• access: specifies where the information source can be found or accessed (e.g. a URL in case of an online resource, or contact information about a colleague/expert)
• query interface: specifies the information source’s interface for automated retrieval, if available (e.g. a CGI script to retrieve items from the information source). This interface will be used to execute queries that have been specified within information source usage recommendations (see below)
• quality/cost aspects: a set of aspects describing quality and cost aspects of the information source (e.g. access cost in case of commercial information services)

Table 3.1 shows an example for the representation of an information source, a Java JDK1.2 language specification document that can be browsed by humans or searched automatically.

IS aspect            | Value
---------------------+--------------------------------------------------
name                 | Java JDK1.2 language specification
contents description | Official language specification for Java JDK 1.2
access               | http://java.sun.com/products/jdk/1.2/docs/api/
query interface      | http://search.java.sun.com/search/java/
quality/cost aspects | high reliability, freely available

Tab. 3.1: Information source example.
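The five aspects of an information source map naturally onto a record type. The following Python sketch is one possible encoding, not a prescribed one: the class and field names are hypothetical, the optional `query_interface` reflects the remark "if available", and the second instance (a human expert) is entirely made up to show a source without an automated interface.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InformationSource:
    """One information source, described by the five aspects listed above."""
    name: str
    contents_description: str
    access: str                               # URL or contact information
    query_interface: Optional[str] = None     # None if no automated retrieval
    quality_cost_aspects: Tuple[str, ...] = ()

    def supports_automated_retrieval(self) -> bool:
        return self.query_interface is not None


# The example of Table 3.1.
JAVA_SPEC = InformationSource(
    name="Java JDK1.2 language specification",
    contents_description="Official language specification for Java JDK 1.2",
    access="http://java.sun.com/products/jdk/1.2/docs/api/",
    query_interface="http://search.java.sun.com/search/java/",
    quality_cost_aspects=("high reliability", "freely available"),
)

# Hypothetical human information source (all values are made up).
TESTING_EXPERT = InformationSource(
    name="Testing expert",
    contents_description="Colleague with experience in test case design",
    access="contact via the internal phone directory",
)
```

A distribution component could use `supports_automated_retrieval()` to decide whether a recommended source can be queried on the agent's behalf or can only be pointed to.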
It should be noted that an individual document can also be represented as a special case of an information source (i.e. an information source that contains only one document).
3.2 A Model for Software Engineering Process-Oriented KM
Which information sources contain useful information for an agent during the enactment of an activity typically depends on certain activity and agent characteristics. Furthermore, software engineering activities often undergo changes during their enactment (e.g. schedule changes, product feature changes etc.). Hence, the set of information sources that contain useful information changes during an activity’s enactment, in correspondence to the activity’s changing characteristics. Consequently, a static list of information sources as a means to provide agents with useful knowledge items is inadequate for software engineering activities. This leads to the following concept of situation-specific information source recommendations, where a situation is described in terms of a process type and a set of conditions.
Clarification 3.23: Information Source Recommendation
An information source recommendation (IS recommendation) for a given process type captures the meta-knowledge that a certain information source might be useful to agents during activities of that type. An IS recommendation is represented by the following aspects:
• information source: the information source being recommended
• process type: the process type representing the class of activities for which the information source might contain useful information
• activity constraints: specify conditions on activity characteristics. The information source is only recommended if the conditions hold
• role constraints: restrict the recommendation to agents performing a certain role in the activity
• skill constraints: specify conditions concerning the agent’s skill profile that must hold in order for the information source to be recommended.
Table 3.2 shows an example of an information source recommendation: the information source "Java JDK1.2 language specification" (see Table 3.1) is considered useful for all agents taking part in an implementation activity (see Figure 2.5) in the role of a "programmer", but only if the implementation language is Java 1.2 and the agent is not already known to be a Java 1.2 expert.
IS recommendation aspect    Value
information source          Java JDK1.2 language specification
process type                Implementation Process
activity constraints        programming language is Java 1.2
role constraints            useful for role ’programmer’
skill constraints           programmer is not a Java 1.2 expert
Tab. 3.2: Information source recommendation example.
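The aspects of an IS recommendation can be read as a small data structure with an applicability test. The following Python sketch is purely illustrative: the class and field names (`Activity`, `Agent`, `ISRecommendation`, `applies_to`), the encoding of constraints as predicates, and the attribute keys are assumptions introduced here and are not part of the model's definition. It encodes the example of Table 3.2.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Activity:
    process_type: str
    characteristics: Dict[str, str] = field(default_factory=dict)


@dataclass
class Agent:
    name: str
    role: str                                   # role performed in the activity
    expert_in: Set[str] = field(default_factory=set)


@dataclass
class ISRecommendation:
    information_source: str
    process_type: str
    activity_constraints: List[Callable[[Activity], bool]] = field(default_factory=list)
    role_constraints: Set[str] = field(default_factory=set)   # empty set: any role
    skill_constraints: List[Callable[[Agent], bool]] = field(default_factory=list)

    def applies_to(self, activity: Activity, agent: Agent) -> bool:
        """True if the source should be recommended in this situation."""
        return (
            activity.process_type == self.process_type
            and all(c(activity) for c in self.activity_constraints)
            and (not self.role_constraints or agent.role in self.role_constraints)
            and all(c(agent) for c in self.skill_constraints)
        )


# Table 3.2, encoded with the hypothetical structure above.
java_spec = ISRecommendation(
    information_source="Java JDK1.2 language specification",
    process_type="Implementation Process",
    activity_constraints=[
        lambda a: a.characteristics.get("programming language") == "Java 1.2"
    ],
    role_constraints={"programmer"},
    skill_constraints=[lambda ag: "Java 1.2" not in ag.expert_in],
)

activity = Activity("Implementation Process", {"programming language": "Java 1.2"})
novice = Agent("novice", role="programmer")
expert = Agent("expert", role="programmer", expert_in={"Java 1.2"})

print(java_spec.applies_to(activity, novice))  # True
print(java_spec.applies_to(activity, expert))  # False
```

Because the constraints are evaluated against the current activity characteristics, re-running `applies_to` after a change (e.g. a switch of programming language) yields an updated recommendation, which is the behaviour a static list cannot provide.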
So far, information source recommendations only describe which information sources are generally considered to be useful during an activity; they do not describe explicitly for what purpose they are considered to be useful, i.e. what information needs might be satisfiable by their contents. In order to capture this knowledge, we introduce the concept of information needs.
Information Need
CHAPTER 3 Process-Oriented Knowledge Management
An information need (IN) encompasses a situation where an agent requires certain information in order to successfully carry out a given activity. We assume that information needs are expressed in the form of a question (e.g. "Where can I find a tutorial on EJB?"). These questions are supposed to be of the kind "Where can I find pieces of information on ..., because it might help me to solve problem x?", rather than "What is the solution to problem x?". In that way, information needs describe knowledge goals that, when achieved, enable agents to successfully perform their activities, which in turn are intended to achieve a certain project goal (e.g. realization of a software functionality).
Information needs might arise during both planning and technical activities. During planning, typical information needs are concerned with the identification of agents that are available (according to calendar information), qualified (according to skill profiles), or admissible (i.e. allowed to perform the activity according to the organization’s regulations). Other information needs might address the question of known problems or experiences with similar activities; Figure 3.3 lists examples of information needs that might arise for a planner while assigning an agent to activity ’DesignReview’.
Fig. 3.3: Examples of information needs during planning.
During technical activities, information needs are often triggered by the characteristics of current products (e.g. the concepts and technologies used therein) that an agent must handle. Figure 3.4 shows examples of information needs that might arise for the agent who enacts activity ’Design Review’. The information needs concerning the ’Factory-Pattern’, problems with RMI and Serialization, as well as former designs of RMI applications are triggered by the contents of the Design Document.
Fig. 3.4: Examples of information needs during enactment.
Whether a certain information need arises for an agent will depend on certain activity and agent characteristics (e.g. the technologies that have to be used, the agent’s experience, skills etc.). Hence, a captured, expected information need should include a specification of the situations in which it typically occurs, as introduced for the capture of information source recommendations.
Information needs can potentially be satisfied by accessing one or more information sources via their interface (e.g. sending an e-mail to a colleague, launching a tool to open a document, or querying an information system); as a result, the information source returns one or more information items (e.g. a human answers by e-mail, or the Document Management System returns a set of documents). The interpretation of these information items is supposed to either satisfy the information need directly, or help to satisfy it by referring to another information source that might contain the information required to satisfy the information need. In order to provide a template for describing a way to access an information source to potentially satisfy an information need under certain conditions, we introduce the concept of an information source usage recommendation (IS usage recommendation).
Information Source Usage Recommendation
Clarification 3.24: Information Source Usage Recommendation
An information source usage recommendation (IS usage recommendation) is represented by the following aspects:
• information source recommendation: specifies the information source that potentially contains information to satisfy an information need, as well as the conditions (in terms of activity, skill and role constraints) when the information source is recommended to be accessed to satisfy the information need.
• usage direction: either a short text explaining in natural language where to find the desired information, or a query specification. In the latter case, the query is specified by the following aspects:
  • comment: a short text explaining the query’s semantics to the human reader
  • queryCommand: contains a query expression that can be sent to the information source via its query interface (see above).
In addition to accessing an information source to retrieve information items that directly satisfy an information need, the question expressing the information need can often be decomposed into a set of sub-questions; the answers to these sub-questions are assumed to facilitate finding an answer for the original question. This motivates the concept of sub-information needs.
Sub-Information Needs
Example 1: To find an answer to the question "Where do I find a tutorial on EJB?", answering the following sub-questions can provide useful information:
• "Where can I find documents on Java distribution technology?"
• "Which colleagues have developed EJB applications before?"
Example 2: To find an answer to the question "To whom could task x be assigned?", answering the following sub-questions can provide useful information:
• "Whose schedule allows them to work on task x?"
• "Who has the skills required for task x?"
• "Who has done something similar before?"
Example 3: The information need "Who has worked with VAJ before?" can be ’decomposed’ in correspondence to the decomposition of the tool VAJ, i.e. each sub-question refers to a particular part of the VisualAge for Java IDE:
• "Who has worked with VAJ’s Visual Composition Builder?"
• "Who has worked with VAJ’s RMI-Compiler?"
• "Who has worked with VAJ’s Bean Editor?"
To summarize these observations, we provide the following
Clarification 3.25: (Expected) Information Need
An (expected) information need describes a recurrent information need that is expected to arise for agents during their enactment of certain activities under certain conditions. An (expected) information need is represented by the following aspects:
• question: a textual representation that describes the information need
• information source usage recommendations: a list of information source usage recommendations, describing alternative ways to potentially satisfy the information need under certain conditions.
• process type: the process type representing the class of activities during which the information need is expected to arise
• activity constraints: specifies conditions on activity characteristics. The information need is only expected to arise if the conditions hold
• role constraints: describes for which roles the information need is expected to arise
• skill constraints: specifies conditions concerning the skill profile of an agent participating in the activity. The information need is only expected to arise if the conditions hold.
• sub-information needs: references a set of sub-information needs; satisfying these information needs is assumed to provide information that helps in satisfying the referencing "parent" information need.
Figure 3.5 shows an example of a captured, expected information need "ejb_tutorial_in". For this information need, two information source usage recommendations have been specified, as well as a sub-information need "java_tech_docs_in".
Information source recommendations and information needs are used to differentiate conceptually between two strategies:
• providing access to an information item or source, without stating explicitly what information need(s) it is intended to satisfy
• presenting explicitly formulated information needs in the form of a question, together with information items that potentially provide answers to the question. Thus, the question denotes the purpose for offering the information items.
In summary, information source recommendations are used whenever (i) it is obvious what the corresponding information source is used for (e.g., a language specification will always be used for reference), or (ii) there are so many different ways of usage that it would be too cumbersome to explicitly list all of them. In contrast, the explicit representation of information needs makes it possible to capture in more detail
• what information might be useful (expressed as a question)
• where and how this information can be found (i.e. a list of information sources that potentially contain the information, together with directions on how to access them)
• when it might be useful (i.e. constraints on certain activity characteristics available at enactment time)
• to whom it might be useful (i.e. constraints on performers’ roles and skills)
In the following, we will use the term information resource to denote both information source recommendations and expected information needs.
Clarification 3.26: Information Resource
An information resource is either an information source recommendation or an (expected) information need.
Information Source Recommendations vs. Information Needs
information need ejb_tutorial_in
  question Where do I find a tutorial on EJB?
  IS usage recommendations
    IS usage recommendation
      IS recommendation
        information source "Using VAJ and EJB" document
        activity constraints development tool is VisualAge for Java
        useful for programmer
        skill constraints programmer is familiar with VisualAge for Java
      usage directions See Chapter 3 of tutorial
    IS usage recommendation
      IS recommendation
        information source Sun’s Javasoft domain
        activity constraints none
        useful for programmer
        skill constraints none
      usage directions
        query
          comment Search for phrase "EJB Tutorial"
          queryCommand search(?keywords="EJB"+"Tutorial")
  end IS usage recommendations
  process type Implementation Process
  activity constraints System design is based on EJB architecture
  useful for programmer
  skill constraints programmer is an EJB novice
  sub information needs java_tech_docs_in
end
information need java_tech_docs_in
  question Where can I find documents on Java distribution technology?
  ...
Fig. 3.5: Example information need "ejb_tutorial_in".
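The nesting of an expected information need, its IS usage recommendations, and its sub-information needs can be sketched as plain data classes. The sketch below is an illustration under assumptions: the Python class and field names are hypothetical, the activity, role and skill constraints of Fig. 3.5 are omitted for brevity, and the check that a usage direction is either a text or a query is one possible reading of Clarification 3.24.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Query:
    comment: str          # explains the query's semantics to a human reader
    query_command: str    # expression sent to the source's query interface


@dataclass
class ISUsageRecommendation:
    information_source: str
    text_direction: Optional[str] = None   # natural-language usage direction
    query: Optional[Query] = None          # alternatively, a query specification

    def __post_init__(self) -> None:
        # A usage direction is either a short text or a query, not both.
        if (self.text_direction is None) == (self.query is None):
            raise ValueError("specify exactly one of text_direction or query")


@dataclass
class InformationNeed:
    name: str
    question: str
    process_type: Optional[str] = None
    usage_recommendations: List[ISUsageRecommendation] = field(default_factory=list)
    sub_needs: List["InformationNeed"] = field(default_factory=list)

    def questions(self, depth: int = 0) -> List[str]:
        """This question followed by all sub-questions, indented by depth."""
        lines = ["  " * depth + self.question]
        for sub in self.sub_needs:
            lines.extend(sub.questions(depth + 1))
        return lines


java_tech_docs_in = InformationNeed(
    name="java_tech_docs_in",
    question="Where can I find documents on Java distribution technology?",
)

ejb_tutorial_in = InformationNeed(
    name="ejb_tutorial_in",
    question="Where do I find a tutorial on EJB?",
    process_type="Implementation Process",
    usage_recommendations=[
        ISUsageRecommendation('"Using VAJ and EJB" document',
                              text_direction="See Chapter 3 of tutorial"),
        ISUsageRecommendation("Sun's Javasoft domain",
                              query=Query('Search for phrase "EJB Tutorial"',
                                          'search(?keywords="EJB"+"Tutorial")')),
    ],
    sub_needs=[java_tech_docs_in],
)

print("\n".join(ejb_tutorial_in.questions()))
```

Keeping sub-information needs as references to other needs (rather than copies) means that a sub-need such as `java_tech_docs_in` can be shared by several parent needs and maintained in one place.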
3.2.2 Process-Oriented Information Resource Organization
So far, we have introduced two constructs for representing type-specific meta-knowledge items: information source recommendations and (expected) information needs. In the following, we propose an organization scheme for these information resources. In order to achieve an efficient mapping from activities to knowledge items assumed to be useful by the Process Group, the information source recommendations and expected information needs are organized by the process type for which they have been captured; Figure 3.6 illustrates this organization scheme.
[Figure 3.6: diagram not recoverable from the extracted text. Legend: D(isr) denotes the documents contained in the recommended information source; D(isur) denotes the documents retrieved from the referenced information source via usage directions; isr1 ... isrn: information source recommendations; ein1 ... einm: expected information needs; isuri.1 ... isuri.ni: IS usage recommendations.]
Fig. 3.6: Process-oriented organization scheme: the process type "implementation process" refers to information source recommendations (isr1 ... isrn) and expected information needs (ein1 ... einm) captured for that type. Each information need eini refers to a set of information source usage recommendations (isuri.1 ... isuri.ni).
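A minimal way to picture this organization scheme is an index keyed by process type. The Python sketch below is an assumption-laden illustration, not the representation used in this work: the function names (`new_index`, `register`, `resources_for`) and the kind labels "isr" and "ein" are hypothetical, and resources are reduced to their names.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

# Hypothetical index: process type -> resource kind -> resource names.
# "isr" = information source recommendations, "ein" = expected information needs.
ResourceIndex = DefaultDict[str, Dict[str, List[str]]]


def new_index() -> ResourceIndex:
    return defaultdict(lambda: {"isr": [], "ein": []})


def register(index: ResourceIndex, process_type: str, kind: str, resource: str) -> None:
    """Associate a captured resource with the process type it was captured for."""
    if kind not in ("isr", "ein"):
        raise ValueError(f"unknown resource kind: {kind}")
    index[process_type][kind].append(resource)


def resources_for(index: ResourceIndex, process_type: str) -> Dict[str, List[str]]:
    """Look up resources for a process type without creating an empty entry."""
    if process_type not in index:
        return {"isr": [], "ein": []}
    return index[process_type]


index = new_index()
register(index, "Implementation Process", "isr", "Java JDK1.2 language specification")
register(index, "Implementation Process", "ein", "ejb_tutorial_in")

print(resources_for(index, "Implementation Process"))
print(resources_for(index, "Design Process"))
```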
Depending on the constructs offered by the process modeling language under consideration, this organization scheme can and should be further refined. Extending the scheme in conformance to language constructs (e.g. type decomposition or specialization) facilitates a modularization of the captured information resources that will make maintenance easier. We will discuss the effects of typical language features in the following sections.
3.2.2.1 Process Type Decomposition
The definition of process type decompositions makes it possible to divide processes further into individual sub-processes, thereby describing the process with finer granularity (see Figure 3.7).
[Figure 3.7: diagram not recoverable from the extracted text. Information needs shown in the figure: "What is the effort distribution of the ’implementation-first’ decomposition?", "What test-driver tools are available?", "When must Component Development be completed?", "What former projects used what decompositions?"]
Fig. 3.7: Process type decomposition example: two (alternative) decompositions are defined for process type "Component Development". The decomposition "implementation-first" represents a refinement into the two sub-process types "Implement Component" and "Test&Debug Component", the former being performed before the latter. In contrast, the second decomposition "test-drivers-first" refines the development into a sequence of three sub-process types, in which the implementation of test-drivers is performed before the component is implemented. Also shown are information needs that are associated with process types or decompositions.
The definition of a decomposition affects the organization scheme in the following way:
• information resources with regard to the decision between alternative decompositions should be associated with the process type that is being decomposed.
• information resources that are concerned with a particular decomposition (i.e. a certain way of performing the process) should be organized by that decomposition (rather than by the super-process type).
• information resources that are concerned with a particular sub-process type should be organized by that sub-process type (rather than the super-process type or the decomposition).
As we will see in Section 3.2.4, the consequence of associating an information resource with a particular process model element (i.e. a process type, decomposition, product type etc.) will be that this information resource is considered for distribution (i.e. is ’activated’) in all situations in which an instance of the process model element is present.
Typically, a decomposition’s information resources are intended to provide useful information to the planner of the decomposed activity, whose task will be to set up the sub-activities introduced by the decomposition in such a way that the activity’s goals are achieved. In contrast, the activity’s execution has already been refined by the decomposition into the (planning and) execution of the individual sub-activities, with which specific sets of information resources will be associated. Thus, the decomposition is the primary place for maintaining information resources concerning dependencies between sub-activities (e.g. the effort distribution, or former projects in which the decomposition was chosen).
However, the information resources associated with the decomposed process type might also be useful during the activities described by the sub-process types. On the instance level, an activity that has been decomposed establishes a particular context (or goal) in which (for which) the sub-activities occur. As this context is ’active’ during each of these sub-activities, so are the information needs that have been predicted to arise in it.
Example: The information need "When must Component Development be completed?" depicted in Figure 3.7 should be considered for distribution both during activities of type "Component Development" and during sub-activities into which the activity might be decomposed.
Hence, in addition to information resources associated with a given activity’s process type(s), the information resources associated with the activity’s context will also be considered for distribution. An activity’s context is defined by the path from the decomposition the activity is contained in, via the decomposed super-activity and the decomposition the super-activity is contained in, and so on up to the ’root’ activity. It should be noted that the context is determined by traversing ’upwards’ (i.e. from the sub-activities’ types to the decomposition). In contrast, the information resources associated with decompositions or sub-types defined for a process type will not be considered for distribution during activities of that type.
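The upward traversal that determines an activity's context can be sketched as follows. This Python fragment is a simplified illustration: the `Activity` class, its `parent` and `decomposition` fields, and the dictionary that maps model elements to information needs are assumptions made for the example, and each activity is assumed to have exactly one process type.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Activity:
    process_type: str
    parent: Optional["Activity"] = None    # decomposed super-activity, if any
    decomposition: Optional[str] = None    # decomposition this activity is contained in
                                           # (assumed set whenever parent is set)


def context(activity: Activity) -> List[str]:
    """Model elements on the path from the activity up to the root activity."""
    path: List[str] = []
    current = activity
    while current.parent is not None:
        path.append(current.decomposition)        # decomposition containing `current`
        path.append(current.parent.process_type)  # decomposed super-activity
        current = current.parent
    return path


def activated_needs(activity: Activity, needs_by_element: Dict[str, List[str]]) -> List[str]:
    """Needs of the activity's own type plus those of its (upward) context."""
    elements = [activity.process_type] + context(activity)
    return [need for element in elements for need in needs_by_element.get(element, [])]


needs = {
    "Component Development": ["When must Component Development be completed?"],
    "implementation-first": [
        "What is the effort distribution of the 'implementation-first' decomposition?"
    ],
}

root = Activity("Component Development")
implement = Activity("Implement Component", parent=root,
                     decomposition="implementation-first")

print(activated_needs(implement, needs))  # needs of the decomposition and the super-activity
print(activated_needs(root, needs))       # only the needs of "Component Development"
```

Note that `activated_needs(root, needs)` does not include the decomposition's need: the traversal only moves upwards, so resources attached to decompositions or sub-types are not activated for the decomposed activity itself.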
Activity Context
3.2.2.2 Process Type Specialization
The definition of a process type specialization facilitates a refinement by means of subtyping (see Figure 3.8), similar to the definition of subclasses in object-oriented programming languages.
[Figure 3.8: diagram not recoverable from the extracted text. It shows the process type "Testing" with its sub-types "Black-Box Testing" and "Glass-Box Testing", together with the information needs "How should I report bugs?", "What tools are available for testing?" and "What tools are available for glass-box testing?".]
Fig. 3.8: Process type specialization example: the process types "Black-Box Testing" and "Glass-Box Testing" are declared to be specializations (sub-types) of type "Testing". The relation
Specializations affect the information resource organization in the following way:
• information resources defined in a process type are inherited by its process sub-types (similar to the inheritance of methods in OO languages)
• each information resource should be organized by the most general process type, such that it is expected to be useful for activities of this type and all its specializations.
This last point leads to the issue of sensitivity to type refinement (see Clarification 3.15), which we address by introducing three specialization relations
are to be interpreted as a refinement of the constraints defined for y, i.e.:
• y’s activity constraints are added (conjunctively) to x’s activity constraints
• y’s role constraints are added (conjunctively) to x’s role constraints
• y’s skill constraints are added (conjunctively) to x’s skill constraints
By default, x is assumed to recommend the same information source as y. Optionally, a new value for x’s information source can be specified. In this case, the new value should be a refinement of y’s information source, e.g. only a subset of the former document set, a set of more detailed documents, etc. Before we can clarify the specialization
Furthermore, x inherits the set of IS usage recommendations from y, but also extends and refines this set. More specifically, the set of x’s IS usage recommendations is the union of three disjoint subsets:
• a set of newly defined usage recommendations
• a set of usage recommendations that are specializations of y’s usage directions
• all non-specialized usage recommendations from y
Figure 3.8 depicts an example of an information need specialization: the information need x = "What tools are available for glass-box testing?" (associated with process type "Glass-Box Testing") is declared to be a specialization of the information need y = "What tools are available for testing?" (associated with type "Testing"). The specialization relations
added because of the information resources associated with T
• knowledge items associated with S via an information resource can be replaced by more useful items via a specialization of this information resource that is associated with T (see Figure 3.8 for an example illustrating an information need specialization)
• knowledge items associated with S via an information resource can be excluded for T, e.g. by means of specializations that add further constraints (3), further refine the information source, or refine usage directions.
Note that, in contrast to the sub-information needs introduced in Section 3.2.1, specializations of an information need are intended to replace that information need in the context of an activity, whereas sub-information needs do not replace it, but are intended to help in satisfying that information need.
(3)
including unsatisfiable constraints, which will prevent the information resource from ever being offered
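The refinement semantics described in this section (y's constraints are added conjunctively to those of its specialization x, and x recommends y's information source unless it specifies its own) can be sketched as follows. This is a simplified illustration under assumptions: the `Recommendation` class and its methods are hypothetical, activity, role and skill constraints are merged into a single list of predicates over a situation, and the example sources and constraint values are invented for the demonstration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Situation = Dict[str, str]
Constraint = Callable[[Situation], bool]


@dataclass
class Recommendation:
    information_source: Optional[str] = None          # None: use the source of y
    constraints: List[Constraint] = field(default_factory=list)
    specializes: Optional["Recommendation"] = None    # the specialized resource y

    def effective_source(self) -> str:
        if self.information_source is not None:
            return self.information_source
        if self.specializes is None:
            raise ValueError("no information source specified")
        return self.specializes.effective_source()

    def effective_constraints(self) -> List[Constraint]:
        # y's constraints are added conjunctively to x's own constraints.
        inherited = self.specializes.effective_constraints() if self.specializes else []
        return inherited + self.constraints

    def applies(self, situation: Situation) -> bool:
        return all(c(situation) for c in self.effective_constraints())


# Hypothetical example values, for illustration only.
y = Recommendation("testing tool list", [lambda s: s.get("language") == "Java"])
x = Recommendation(constraints=[lambda s: s.get("technique") == "glass-box"],
                   specializes=y)
# A specialization with an unsatisfiable constraint is never offered (cf. footnote 3).
never = Recommendation(constraints=[lambda s: False], specializes=y)

situation = {"language": "Java", "technique": "glass-box"}
print(x.effective_source())                # testing tool list
print(x.applies(situation))                # True
print(x.applies({"language": "Java"}))     # False
print(never.applies(situation))            # False
```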
3.2.2.3 Domain Ontology Entities
A certain kind of information need is likely to arise regardless of the activity’s type: for example, the question "How do I start tool <x>?" can be expected to arise whenever tool x has to be used during an activity. In order to avoid redundant definitions, such an information need should be organized more naturally by a corresponding model entity for tool x. In addition to other entities from the Software Engineering domain model (e.g. SE techniques and methods, programming languages, architecture, technologies, OS platforms etc.), all software products (applications, components, etc.) handled within the organization are candidates for maintaining their specific set of information resources, assuming that they have been represented as explicit model entities. Since all model entities are connected to activities, information needs maintained by such entities will be managed in relation to processes, thereby still adhering to the principles of POKM. Figure 3.9 depicts an excerpt from such an organization-specific computer science domain model with associated information resources.
[Figure 3.9: diagram not recoverable from the extracted text. It shows a specialization hierarchy below "CS Thing" with the entities Tool (Rational Rose, JDK 1.2) and Language (Modeling Language: UML, SDL; OO-Programming Language: ST-80, Java), together with the information need "How do I start RationalRose?" and the information source recommendation "GoF Design Pattern Catalogue". Legend: entity t is a specialization of entity s; information source recommendation isr’ is a specialization of isr.]
Fig. 3.9: Computer science domain example.
The main intentions behind organizing information resources around model entities other than processes and their decompositions are:
• a more natural organization that allows entity-specific information sources to be maintained locally for that entity
• with regard to the distribution phase (see Section 3.2.4.3), entity-specific information resources can be considered for recommendation (i.e. are activated) whenever the entity is referenced by an actual activity (see Figure 3.10) (4).
[Figure 3.10: diagram not recoverable from the extracted text. It shows an activity of type "Design" (IRSDesign, e.g. "Who performed similar design tasks before?") that is linked via uses_tool to the tool "RationalRose" (IRSRationalRose, e.g. "How do I load JDK 1.2 into RR?"), which in turn is linked via is_tool_for to the language "UML" (IRSUML, e.g. "UML Specification?").]
Fig. 3.10: Spreading activation example. Information resources (IRS) that are associated with domain entities "RationalRose" and "UML" become activated because these entities are directly or indirectly referenced by the activity of type "Design".
Whether an activated, entity-specific information resource is actually recommended to a given agent is decided by the resource’s activity, role, and skill constraints.
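The activation of entity-specific information resources along direct and indirect references can be sketched as a graph traversal. The following Python fragment is only an illustration: the function names, the adjacency-list encoding of references, and the choice to follow references transitively without a depth limit are assumptions made here. The data mirrors the situation of Fig. 3.10.

```python
from collections import deque
from typing import Dict, List, Set


def activated_entities(referenced: List[str], links: Dict[str, List[str]]) -> Set[str]:
    """Entities referenced directly by the activity or indirectly via other entities."""
    seen: Set[str] = set(referenced)
    queue = deque(referenced)
    while queue:
        entity = queue.popleft()
        for neighbour in links.get(entity, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def activated_resources(referenced: List[str],
                        links: Dict[str, List[str]],
                        resources: Dict[str, List[str]]) -> List[str]:
    """Resources of all activated entities, in a stable (sorted-by-entity) order."""
    result: List[str] = []
    for entity in sorted(activated_entities(referenced, links)):
        result.extend(resources.get(entity, []))
    return result


# Situation of Fig. 3.10: the Design activity uses the tool RationalRose,
# which is a tool for the language UML.
links = {"RationalRose": ["UML"]}
resources = {
    "RationalRose": ["How do I load JDK 1.2 into RR?"],
    "UML": ["UML Specification?"],
}

print(activated_resources(["RationalRose"], links, resources))
```

The result of this step is only a candidate set; the constraints of each activated resource still decide whether it is recommended to a particular agent.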
Required topics
However, certain information needs will only arise in the presence of more than one entity. Therefore, we introduce the additional information resource aspect required topics, intended to capture the set of entities that have to be present in the context of an activity. In the example situation depicted in Figure 3.10, this aspect would allow us to declare the information need "How do I load JDK 1.2 into Rational Rose?" associated with "RationalRose" to be useful only if both entities "Java" and "RationalRose" are present in the activity’s context. In contrast, it will not be expected to arise for users of Rational Rose who work in a Smalltalk80 project. In order to avoid forcing an arbitrary decision as to which of its required topics an information resource should be associated with, it must be possible to associate it with more than one entity; Figure 3.11 depicts an example situation in which an information need is associated with two entities, both of which are listed as the information need’s required topics. The problem addressed by the information need stems from the combination of two technologies; the question to which technology the problem "naturally" belongs cannot be answered objectively.
(4) The alternative would have been to explicitly define a set of generic information resources that are being recommended for all process types, e.g.: FORALL x . TOOLS: add infoNeed "How do I start <x>?" to all activities. Experience from building expert systems has shown that this kind of global knowledge representation is difficult to maintain; whenever possible, knowledge should be divided into small chunks that can be maintained locally (see e.g. approaches for concept hierarchy-oriented configuration vs. rule-based systems).
[Figure 3.11: diagram not recoverable from the extracted text. It shows the entities "Serialization" and "RMI", both associated with the information need with question “How are stubs serialized?” and required topics Serialization and RMI.]
Fig. 3.11: Two required topics: the information need is associated with both "Serialization" and "RMI". Note that both entities have to be listed as required topics, because the information need can be triggered by each of these entities; as a consequence, the presence of the remaining entity needs to be tested.
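The effect of the required topics aspect can be illustrated with a subset test over the entities present in an activity's context. The sketch below is a hypothetical rendering: the `InformationNeed` class, the `candidate_needs` function and the dictionary layout are assumptions, and the de-duplication step reflects the fact that a need associated with several entities may be reached more than once.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class InformationNeed:
    question: str
    required_topics: FrozenSet[str]


def candidate_needs(context_entities: Set[str],
                    needs_by_entity: Dict[str, List[InformationNeed]]) -> List[InformationNeed]:
    """Needs triggered by a present entity whose required topics are all present."""
    result: List[InformationNeed] = []
    for entity in sorted(context_entities):
        for need in needs_by_entity.get(entity, []):
            all_present = need.required_topics <= context_entities
            if all_present and need not in result:   # offer each need only once
                result.append(need)
    return result


stubs = InformationNeed("How are stubs serialized?",
                        frozenset({"Serialization", "RMI"}))
load_jdk = InformationNeed("How do I load JDK 1.2 into Rational Rose?",
                           frozenset({"Java", "RationalRose"}))

needs_by_entity = {
    "Serialization": [stubs],   # the need is associated with both entities
    "RMI": [stubs],
    "RationalRose": [load_jdk],
}

print(candidate_needs({"RMI"}, needs_by_entity))                       # []
print(candidate_needs({"RMI", "Serialization"}, needs_by_entity))      # the stubs need, once
print(candidate_needs({"RationalRose", "Smalltalk80"}, needs_by_entity))  # []
```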
In summary: an information resource (IRS) is associated either with a process model entity (process type or type decomposition), or with one or more domain entities. In the former case, the IRS is considered to be useful for all activities of that type or all activities that have been decomposed accordingly (as long as the constraints are satisfied). In the latter case, an IRS associated with a given domain entity is considered useful for all activities (regardless of their type) during which this entity appears (as long as the constraints are satisfied).
Allowing information resources to be associated with entities other than process types raises the question of how to decide whether a given information resource should be associated with a process type or a domain entity. For example, the information need "How do I implement test cases with JUnit?" that might be likely to arise during activities of process type "Implement Test Cases" could be associated either with that process type or with the testing tool "JUnit" (5). In order to address this question, we recommend applying the following heuristic:
• by default, associate information resources with process types.
• if, over time, information resources are associated with a process type whose constraints refer to several alternative entities for a certain process aspect, then move these resources to their corresponding entities.
(5) The problem is similar to that of assigning responsibilities among a set of candidate objects during object-oriented analysis.
[Figure 3.12: diagram not recoverable from the extracted text. It shows the process type "Implement Test Cases" (parameter tool : JavaTestingTool) and the entities JavaTestingTool, JUnit and TestMentor, with the information resource groups IRS1.x, IRS2.x and IRS3.x and constraints such as "tool is JUnit", "tool is TestMentor" and "tool is JavaTestingTool", before and after applying the heuristic.]
Fig. 3.12: Illustration of the heuristic’s effect on the organization via a domain ontology.
Figure 3.12 illustrates the effect of this heuristic on an example model: it prevents the accumulation of a large set of information resources being associated with one process type, which will become difficult to maintain. Instead, a kind of delegation is introduced, dividing the set of information resources into smaller chunks that can be distributed over several entities.
3.2.2.4 Product Parameters
Most process modeling languages provide the means to describe for each process type the products that are consumed, produced or modified [STO95], usually in the form of corresponding parameter declarations; Figure 3.13 illustrates this graphically for the process type specification shown in Figure 2.5.
[Figure 3.13: diagram not recoverable from the extracted text. It shows the process type "Implementation Process" with the parameters reqdoc: Requirements Document, desdoc: Design Document, and codedoc: Code Document.]
Fig. 3.13: Process type with parameters.
Similar to our utilization of domain entities, in certain situations information resources should be associated with a parameter’s product type rather than the process type. Information resources that are associated with a product type are considered to be useful whenever an instance of this type appears during activities of arbitrary types. However, an information resource that is associated with a product type might still not be considered useful for all types of activities, but only for activity types that share a specific view on their product. A view represents a way of looking at a product from a certain angle (i.e. focusing only on a subset of product aspects) that is determined by the role played by the product during the activity that handles the product. In the context of this work, we use views as a means to organize information resources around product types. Figure 3.14 shows an example of a product model (i.e. a set of product type declarations) with information resources associated with appropriate product types via explicitly represented views.
[Figure 3.14: diagram not recoverable from the extracted text. It shows a specialization hierarchy of product types (Product, Executable, Document, Requirements Document, Design Document, EJB-Design Document, DJB-Design Document) with information resources attached via the views "general" and "design view": IN “When is the product due?”, IN “What similar products exist?”, ISR Checklist for Designs, ISR EJB-Design Example, ISR Checklist for EJB-Designs. Legend: product type t is a specialization of product type s; information resource irs is associated with product type t via view v.]
Fig. 3.14: Product model example.
A default view "general" is used to maintain all information resources that are considered to be useful whenever no specific view is specified (e.g. as for the Design Document parameter in Figure 3.13); an information resource might belong to more than one view. Figure 3.15 gives an example of how views are referenced by parameter declarations. In this example, different review checklists and example design documents are considered to be useful for instances of "Design Process" and "Design Review Process" (but not for instances of "Implementation Process"), depending on the two different types of design documents.
Views
[Figure 3.15: diagram not recoverable from the extracted text. It shows the process type "Design Process" with the parameters reqdoc: Requirements Document and desdoc: Design Document {design view}, and the process type "Design Review Process" with the parameters desdoc: Design Document {design view} and revdoc: Review Report.]
Fig. 3.15: Parameter declarations referencing the view "design view".
Because the usefulness of information resources that are associated with a product type might depend on certain product characteristics, we need a means to restrict an information resource’s recommendation to situations in which these characteristics are present. In correspondence to the activity constraints introduced in Section 3.2.1, we extend the characterization of information resources that are referenced by a product type by the following aspects:
• recommended for products of type specifies a product type with which the information resource should be associated
• product constraints describe under what conditions regarding a product of this type the information source might be useful during enactment time.

Table 3.3 shows an example of an information source recommendation that is associated with product type "Design Document": the review report for a given design document is considered useful for all team members in the role of a "designer" or "planner" who handle the design document during any of their activities.

IS recommendation aspect | Value
information source | Review Report for the design document concerned
recommended for products of type | Design Document
product constraints | design contains flaws
role constraints | useful for roles ’designer’ and ’planner’
skill constraints | -

Tab. 3.3: Information source recommendation organized by a product type.
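The recommendation in Table 3.3 can be read as a small predicate over a product, a role and a view. The following Python fragment is only an illustrative sketch, not part of the formalization used in this work: the class names, the attribute `contains_flaws`, the hierarchy excerpt and the assumption that a recommendation also applies to instances of product subtypes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical excerpt of a product type hierarchy (subtype -> supertype).
PRODUCT_SUPERTYPE: Dict[str, Optional[str]] = {
    "Product": None,
    "Document": "Product",
    "Design Document": "Document",
    "EJB-Design Document": "Design Document",
}


@dataclass
class Product:
    type_name: str
    # Illustrative product characteristics, e.g. {"contains_flaws": True}.
    characteristics: Dict[str, bool] = field(default_factory=dict)


@dataclass
class ProductRecommendation:
    information_source: str
    recommended_for_type: str
    product_constraint: Callable[[Product], bool]
    useful_for_roles: List[str]
    # Views under which the resource is filed; "general" is the default view.
    views: List[str] = field(default_factory=lambda: ["general"])


def is_of_type(type_name: str, ancestor: str) -> bool:
    """Walk up the hierarchy to check whether type_name specializes ancestor."""
    current: Optional[str] = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = PRODUCT_SUPERTYPE.get(current)
    return False


def is_useful(rec: ProductRecommendation, product: Product,
              role: str, view: str = "general") -> bool:
    """True if type, view, role and product constraints are all satisfied."""
    return (
        is_of_type(product.type_name, rec.recommended_for_type)
        and view in rec.views
        and role in rec.useful_for_roles
        and rec.product_constraint(product)
    )


# The recommendation from Table 3.3, expressed with the sketch above.
review_report_rec = ProductRecommendation(
    information_source="Review Report for the design document concerned",
    recommended_for_type="Design Document",
    product_constraint=lambda p: p.characteristics.get("contains_flaws", False),
    useful_for_roles=["designer", "planner"],
)

flawed_design = Product("EJB-Design Document", {"contains_flaws": True})
clean_design = Product("Design Document")
```

Under these assumptions, the review report would be offered to a designer handling `flawed_design`, but not to a programmer, and not for `clean_design`.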
Information resources that are associated with a product type do not specify activity constraints; they are considered useful for all types of activities, so that only role and skill constraints need to be specified. However, a restriction to activities of certain types can be achieved via views. Concerning the question whether to associate an information resource with a process type or a product type, we recommend applying a heuristic similar to that proposed for entities (see Section 3.2.2.3):
• by default, associate information resources with process types
• if, over time, information resources are associated with a process type whose constraints refer to alternative product subtypes of a parameter’s type, then move these resources to their corresponding product types under appropriate views. Select the corresponding view in the parameter declarations of those process types for which the information resource is considered useful
• if an information resource whose constraints reference a product type is considered useful for more than one process type, then associate the information resource with the product type by adding it to appropriate product type views. Select the corresponding view in the parameter declarations of those process types for which the information resource is considered useful

Figure 3.16 illustrates the effect of this heuristic, which helps to prevent the accumulation of a large set of information resources associated with one process type.

[Figure 3.16: before applying the heuristic, "Design Process" (parameters reqdoc: Requirements Document, desdoc: Design Document) carries the resources IRS1.1, IRS1.2, … and IRS2.1, IRS2.2, … with constraints such as "desdoc is EJB-Design Document" and "desdoc is DJB-Design Document". After applying the heuristic, the parameter is declared as desdoc: Design Document {v}, and the resources (IRS1.1, IRS1.2, …, IRS3.1, IRS3.2, …) are attached to the product types Design Document, EJB-Design Document and DJB-Design Document under view v.]

Fig. 3.16: Illustration of the heuristic’s effect on the organization via a product model.
If the process-modeling language under consideration supports the representation of domain entities as introduced in Section 3.2.2.3, product types should also be used as a source for additional activity key topics; for example, any design pattern contained within a software design might serve as a key topic of the design document.

3.2.2.5 Information Resource Categories
In order to help the Knowledge Engineers working in the KD to maintain and organize the information resources, we must provide them with the means to group resources into categories, e.g. with respect to themes or issues that a subset of resources address. For example, typical, recurring issues for an agent who is responsible for planning an activity are: anticipating potential difficulties, finding a skilled team member with free capacities, estimating the time effort, and, depending on the task’s granularity and the team member’s expertise, developing an outline of how to perform the task. Table 3.4 lists examples of typical planning information needs that have been grouped into categories reflecting different issues.

Category | Information Needs
Risk Management | What problems occurred in similar activities? How were they solved? What relevant problems/shortcomings/bugs with the tools used in the current activity are known?
Agent Assignment | Which agents match the skills required for the activity? Which agents have performed a similar activity before? Which agents are available at the time period in question?
Effort Estimation | What quality models exist for the activity? What was the time effort of similar activities?
Task Refinement | What standard refinements exist for the activity? How were similar activities performed in the past?

Tab. 3.4: Example categories for information needs during planning.
In addition to containing information resources, a category might be divided into sub-categories to refine the grouping further. Figure 3.17 shows an excerpt from a category graph example: the category for information resources related to the DBMS "GemStone" has sub-categories concerning the installation as well as the two distribution architectures DJB and EJB that are supported by GemStone. Information resources that are specific to either of these architectures are grouped under the corresponding (sub-)categories.

[Figure 3.17: category graph with the category DBMS, its sub-category GemStone, and the sub-categories Installation, DJB and EJB; further sub-categories shown include Transactions and Deployment. Indexed resources include ISR: GemStone Homepage and the information needs “How do I install GemStone?”, “How do clients start a transaction?” and “How can I enable chained transactions?”.]

Fig. 3.17: Information resources indexed by category graph.
3.2.3 Formalization
So far, we have introduced a semi-structured representation of information resources. In order to allow for a more precise description of information resources (e.g. with respect to describing constraints), as well as to be able to provide semi-automated support (see Section 3.2.4), we need a formal representation to specify information resources; in particular, we have to provide a language for defining decidable information resource constraints. In the following, we use F-Logic [KLW95] in Florid syntax [May00] to formally represent information resources. We choose F-Logic because we consider it to be more readable than predicate logic (i.e. a decidable subset of it) with regard to structured entities due to its object-orientation. In addition, if we want to be able to associate information resources with process types and additional model entities, we must also provide a basic formal representation of process model elements; frame/object-oriented approaches to activity-based process modeling can easily be extended (e.g. without changes to syntax) and have already been discussed in the literature [SV99][JPL98].
Process types will be represented in F-Logic as subclasses of a base class process (see Figure 3.18); instances of these subclasses will represent process instances (i.e. activities).

    process[
        name => string;
        scheduledStartDate => date;
        scheduledEndDate => date;
        estimated_effort => effort;
        role@(string) =>> agent].

    FORALL PT PT:process_type :- PT::process.

Fig. 3.18: F-Logic excerpt from the definition of the base class process. Instances of process will have a name and scheduling information. The multi-valued method role allows several agents to perform a certain role (denoted by the method’s argument) during an activity. The rule states that all subclasses of process are to be considered instances of a class process_type.
Figure 3.19 shows an example definition of a process type implementation_process.

    implementation_process::process.
    implementation_process[
        prog_language => programming_language].

Fig. 3.19: F-Logic representation of the process type implementation_process. Instances of implementation_process will have an attribute prog_language, whose values are restricted to instances of programming_language.
As the example shows, type-specific attributes can be defined for each process type by specifying methods for the corresponding F-Logic class. The definition of additional attributes becomes necessary whenever activity constraints that refer to specific activity characteristics (e.g. see Table 3.2) need to be formalized; Figure 3.20 contains an example representation of an implementation activity.

    impl_act:implementation_process.
    impl_act[
        name -> "Implement an editor for ECA rules";
        prog_language -> java_1_2].

Fig. 3.20: F-Logic representation of an activity impl_act as an instance of process type implementation_process.
3.2.3.1 Information Sources
Figure 3.21 shows a formal representation of information sources in F-Logic; members of class iSource describe individual information sources.

    iSource[
        name => string;
        contents_descr => string;
        access => url;
        query_interface => url;
        quality_costs@(string) => string].

Fig. 3.21: F-Logic representation for class iSource.

Figure 3.22 lists the representation for the information source example shown in Table 3.1.

    java_1_2_spec:iSource[
        name -> "Java JDK1.2 language specification";
        contents_descr -> "language specification for Java JDK 1.2";
        access -> "http://www.sun....";
        quality_costs@(reliability) -> "high";
        quality_costs@(availability) -> "free"].

Fig. 3.22: F-Logic representation for the information source example from Table 3.1; an instance java_1_2_spec of class iSource is defined.

3.2.3.2 Information Source Recommendations
Figure 3.23 shows a formal representation of information source recommendations; members of class iSourceRec describe individual information source recommendations.

    iSourceRec::iResource.
    iSourceRec[
        infoSource => iSource;
        rec_for_type => process_type;
        categories =>> string].

Fig. 3.23: F-Logic representation for class iSourceRec. Instances of this class refer to the process type (i.e. a subclass of process) with which they are associated via their method rec_for_type. The multi-valued method categories specifies a set of category paths under which the recommendation will be grouped. A category path is a string that describes a list of category names, separated by the character ">".
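To illustrate how category paths of this form could be turned into a category graph, the following Python fragment gives a minimal sketch. The function names and the nested-dictionary layout are assumptions made for this illustration only; they are not part of the F-Logic representation.

```python
from typing import Dict, List


def parse_category_path(path: str) -> List[str]:
    """Split a category path such as "Specifications > Java" into its names."""
    return [name.strip() for name in path.split(">") if name.strip()]


def new_node() -> dict:
    """Create an empty category node (hypothetical layout)."""
    return {"subcategories": {}, "resources": []}


def build_category_tree(categories_by_resource: Dict[str, List[str]]) -> dict:
    """Index each resource under every category path it declares."""
    root = new_node()
    for resource_name, paths in categories_by_resource.items():
        for path in paths:
            node = root
            for name in parse_category_path(path):
                # Descend into the sub-category, creating it on first use.
                node = node["subcategories"].setdefault(name, new_node())
            node["resources"].append(resource_name)
    return root


tree = build_category_tree({"java_spec_rec": ["Specifications > Java"]})
java_node = tree["subcategories"]["Specifications"]["subcategories"]["Java"]
```

Here `java_node["resources"]` contains `java_spec_rec`, i.e. the recommendation is grouped into the sub-category "Java" of category "Specifications".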
The recommendation’s constraints are not specified on the type level, because they usually refer to characteristics of an actual process instance (e.g. the activity constraint shown in Table 3.2 refers to the actual programming language used during an implementation activity). However, F-Logic does not provide the means to refer to instances from the type level (i.e. there are no pseudo variables similar to "this" or "self" as known from other OO languages). Therefore, we formalize recommendation constraints via a uniform set of rules for each recommendation specified; Figure 3.24 shows the rule patterns used to represent both information source recommendation and information need constraints.

    FORALL ACT act_constr_sat(<ir>, ACT) :- <body>.
    FORALL ACT, AGT role_constr_sat(<ir>, ACT, AGT) :- <body>.
    FORALL ACT, AGT skill_constr_sat(<ir>, ACT, AGT) :- <body>.

Fig. 3.24: Rule patterns used to represent activity, role, and skill constraints via the predicates act_constr_sat, role_constr_sat, and skill_constr_sat, respectively. The predicates are true if the corresponding constraints are satisfied. The non-terminal <ir> is replaced by the name of the F-Logic object that represents the information resource; <body> is replaced by an appropriate F-Logic rule body.
Figure 3.25 lists the representation for the information source recommendation example shown in Table 3.2.
    java_spec_rec:iSourceRec[
        infoSource -> java_1_2_spec;
        rec_for_type -> implementation_process;
        categories ->> {"Specifications > Java"}].

    /* rule for representing activity constraints */
    /* the predicate is true if ACT is an implementation process
       and the programming language is Java */
    FORALL ACT act_constr_sat(java_spec_rec, ACT) :-
        ACT:implementation_process AND
        ACT[prog_language -> java_1_2].

    /* rule for representing role constraints */
    /* the predicate is true if agent AGT performs the role of a
       ’programmer’ for activity ACT */
    FORALL ACT, AGT role_constr_sat(java_spec_rec, ACT, AGT) :-
        ACT[role@(programmer) ->> AGT].

    /* rule for representing skill constraints */
    /* the predicate is true if agent AGT is not a Java 1.2 expert */
    FORALL ACT, AGT skill_constr_sat(java_spec_rec, ACT, AGT) :-
        NOT AGT[skillLevelFor@(java_1_2) -> expert].

Fig. 3.25: F-Logic representation for the information source recommendation example from Table 3.2; the recommendation java_spec_rec has been grouped into the subcategory "Java" of category "Specifications".
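For readers less familiar with F-Logic, the three constraint rules for java_spec_rec can be paraphrased as ordinary boolean functions. The Python fragment below is a rough sketch under stated assumptions: the classes `Agent` and `Activity`, the agent names, and the idea that a recommendation is offered only if all three predicates hold are illustrative choices, not the distribution mechanism of this work.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Agent:
    name: str
    # Maps a topic (e.g. "java_1_2") to a skill level (e.g. "expert").
    skill_levels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Activity:
    process_type: str
    prog_language: str
    # Maps a role name to the names of the agents performing it.
    roles: Dict[str, Set[str]] = field(default_factory=dict)


def act_constr_sat(act: Activity) -> bool:
    """Activity constraint: a Java 1.2 implementation activity."""
    return (act.process_type == "implementation_process"
            and act.prog_language == "java_1_2")


def role_constr_sat(act: Activity, agt: Agent) -> bool:
    """Role constraint: the agent performs the role 'programmer'."""
    return agt.name in act.roles.get("programmer", set())


def skill_constr_sat(act: Activity, agt: Agent) -> bool:
    """Skill constraint: the agent is not known to be a Java 1.2 expert.

    A missing skill entry counts as 'not an expert', mirroring the NOT
    in the rule body.
    """
    return agt.skill_levels.get("java_1_2") != "expert"


def java_spec_recommended(act: Activity, agt: Agent) -> bool:
    """Assumed combination: all three constraints must be satisfied."""
    return (act_constr_sat(act)
            and role_constr_sat(act, agt)
            and skill_constr_sat(act, agt))


# Hypothetical agents and an activity resembling impl_act.
novice = Agent("novice_programmer", {"java_1_2": "beginner"})
expert = Agent("expert_programmer", {"java_1_2": "expert"})
impl_act = Activity("implementation_process", "java_1_2",
                    {"programmer": {"novice_programmer", "expert_programmer"}})
```

With these sample objects, the language specification would be recommended to the novice programmer but not to the expert.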
This representation of constraints allows us to refer to an activity’s actual characteristics: whether an information resource is considered useful depends on the value of predicate act_constr_sat. Note that, in combination with the specifications listed in Figure 3.20, the predicate act_constr_sat(java_spec_rec, impl_act) holds for activity impl_act. In order to be able to associate information source recommendations (or, more generally, information resources) with process types, we extend the definition of class process_type by a multi-valued method info_resources (see Figure 3.26). Because of the rule shown in Figure 3.18, all subclasses of process are instances of process_type.

    process_type[
        info_resources =>> iResource].

Fig. 3.26: Method info_resources for instances of process_type.

As a consequence, every subclass of process can refer to several information resources. Figure 3.27 gives an example in which the information resource java_spec_rec (see Figure 3.25) is associated with process type implementation_process.

    implementation_process[
        info_resources ->> {java_spec_rec}].

Fig. 3.27: Process type implementation_process with associated information resource java_spec_rec.

3.2.3.3 Expected Information Needs
Figure 3.28 shows a formal representation of expected information needs; as already mentioned in Section 3.2.3.2, their constraints are represented via a set of rules, following the rule patterns listed in Figure 3.24.

    infoNeed::iResource.
    infoNeed[
        question => string;
        rec_for_type => process_type;
        categories =>> string;
        isurs_local =>> iSourceUsageRec;
        sub_infoNeeds =>> infoNeed].

    iSourceUsageRec[
        infoSource_rec => iSourceRec;
        usage_direc => query].

    query[
        description => string;
        queryCommand => string].

Fig. 3.28: F-Logic representation for class infoNeed. Usage recommendations defined locally for an information need are referred to via the method isurs_local. For formalization purposes, usage directions are always query objects; if no queryCommand can be specified for automated query execution, the method description captures any textual explanations on how to find the desired information.
Figure 3.29 lists the formalization of the information need example shown in Figure 3.5.

    ejb_tutorial_in:infoNeed[
        question -> "Where do I find a tutorial on EJB?";
        rec_for_type -> implementation_process;
        categories ->> {"EJB > Documentations"};
        isurs_local ->> {vaj_and_ejb_isur, javasoft_isur};
        sub_infoNeeds ->> {java_tech_docs_in}].

    vaj_and_ejb_isur:iSourceUsageRec[
        infoSource_rec -> vaj_and_ejb_iSourceRec;
        usage_direc -> query[
            description -> "See Chapter 3 of VAJ handbook"]].

    javasoft_isur:iSourceUsageRec[
        infoSource_rec -> javasoft_iSourceRec;
        usage_direc -> query[
            description -> "Search for phrase ’EJB Tutorial’";
            queryCommand -> "search(?keywords=’EJB’ +’Tutorial’)"]].

    java_tech_docs_in:infoNeed[
        question -> "Where can I find documents on Java technology?"].

Fig. 3.29: F-Logic representation for the information need example from Figure 3.5 (excerpt).

3.2.3.4 Parameterized Query Commands
Similar to activity constraints, the query of an information source usage recommendation often depends on actual activity characteristics, e.g. when querying for team members who are available at the time for which the activity under consideration has been scheduled. Consequently, query commands must be parameterizable with regard to an activity’s characteristics. Similar to the representation of information resource constraints, we introduce a rule-based representation of query commands; Figure 3.30 shows the corresponding rule pattern (see Figure 3.31 for an example).

    <query>:query.
    FORALL ACT, AGT, QC
        <query>[queryCommand@(ACT, AGT) -> QC] :- <body>.

Fig. 3.30: Rule pattern for a parameterized query command. The query command string QC has to be bound by the expressions contained in <body>. These expressions can make use of activity and agent characteristics by using ACT and AGT as method host objects.
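The idea of a parameterized query command can be sketched as a function that builds the query string from the activity at hand. The Python fragment below is a hedged illustration: the query prefix mirrors the string used in Figure 3.31, whereas the class `Activity`, the function name and the ISO date format are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Activity:
    name: str
    scheduled_start_date: date
    scheduled_end_date: date


def available_agents_query(act: Activity, agt: Optional[str] = None) -> str:
    """Build the query command string for the given activity.

    The agent argument is accepted to match the (ACT, AGT) parameter list
    of the rule pattern, but it is not needed for this particular query.
    """
    return (
        "searchAvailableAgents?fromDate="
        + act.scheduled_start_date.isoformat()  # assumed date format
        + "+?dueDate="
        + act.scheduled_end_date.isoformat()
    )


editor_act = Activity(
    name="Implement an editor for ECA rules",
    scheduled_start_date=date(2002, 11, 14),
    scheduled_end_date=date(2002, 11, 30),
)
query_string = available_agents_query(editor_act)
```

Because the string is computed per activity, the same usage recommendation yields different queries for activities with different schedules.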
    my_query:query.
    FORALL ACT, AGT, QC
        my_query[queryCommand@(ACT, AGT) -> QC] :-
            QC is concat("searchAvailableAgents?fromDate=",
                         ACT.scheduledStartDate,
                         "+?dueDate=",
                         ACT.scheduledEndDate).

Fig. 3.31: Example for a parameterized query command.

3.2.3.5 Type Decompositions and Information Resources
In order to be able to associate information resources with type decompositions, we introduce a basic representation of the latter (see Figure 3.32): type decompositions are subclasses of a new base class process_decomposition. Instances of these subclasses will represent decompositions of process instances (i.e. activities); they refer to the activity that is being decomposed and to the subactivities into which the activity is being decomposed. Activities refer to their decomposition via the method decomp, and to the decomposition they are contained in via the method containing_decomp. The rule in Figure 3.32 states that all subclasses of process_decomposition are to be considered instances of a class processTypeDecomp.

    process_decomposition[
        decomposed_process => process;
        sub_processes =>> process].

    process[
        decomp => process_decomposition;
        containing_decomp => process_decomposition].

    FORALL PD PD:processTypeDecomp :- PD::process_decomposition.

Fig. 3.32: F-Logic excerpt from the definition of class process_decomposition.
While class process_decomposition represents decompositions on the process instance level, decompositions on the process type level are represented by class processTypeDecomp (see Figure 3.33).
    processTypeDecomp[
        name => string;
        decomposed_type => process_type;
        sub_processTypes =>> process_type;
        info_resources =>> iResource].

    process_type[decomps =>> processTypeDecomp].

Fig. 3.33: Formal representation of type decompositions. Note that instances of processTypeDecomp refer to process types, and not to activities (i.e. process instances). Instances of class process_type refer to explicit type decompositions via a method decomps.

In order to be able to associate information resources with type decompositions, we define a multi-valued method info_resources for class processTypeDecomp. As a consequence, every subclass of process_decomposition can refer to several information resources. Figure 3.34 shows a representation excerpt of the (type-level) decomposition example depicted in Figure 3.7.

    component_development::process.
    implement_component::process.
    test_and_debug_component::process.

    impl_first::process_decomposition.
    test_drivers_first::process_decomposition.

    component_development[
        decomps ->> {impl_first, test_drivers_first}].

    impl_first[
        name -> "implementation-first";
        decomposed_type -> component_development;
        sub_processTypes ->> {implement_component,
                              test_and_debug_component};
        info_resources ->> {impl_first_effort_in}].

Fig. 3.34: Excerpt from the formal representation of the decomposition example shown in Figure 3.7.
Figure 3.35 contains an example representation of a corresponding instance-level decomposition.

    comp_dev_act:component_development.
    impl_comp_act:implement_component.
    t_and_d_comp_act:test_and_debug_component.

    aDec:impl_first.
    aDec[
        decomposed_process -> comp_dev_act;
        sub_processes ->> {impl_comp_act, t_and_d_comp_act}].

    comp_dev_act[decomp -> aDec].
    impl_comp_act[containing_decomp -> aDec].
    t_and_d_comp_act[containing_decomp -> aDec].

Fig. 3.35: Excerpt from the formal representation of an activity decomposition (instance level) based on the type decomposition shown in Figure 3.34.
Information resources that are associated with a type decomposition are considered to be useful during activities for which a corresponding decomposition of that type has been selected. Hence, the rule patterns defined in Figure 3.24 are instantiated for activities of the type for which the decomposition is defined. However, the activity constraints now can also refer to the sub-activities introduced by the decomposition. As the example in Figure 3.36 shows, the activity constraints of the information need "What is the effort distribution of the ’implement-first’ method?" state that it is only considered to be useful as long as there is still a subactivity whose effort has not been estimated yet.
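The activity constraint just described, namely that the information need stays useful while some subactivity still lacks an effort estimate, can be paraphrased as follows. This Python fragment is only a sketch: the classes and the use of `None` for an unspecified effort are assumptions, while the type and instance names follow Figures 3.34 and 3.35.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Activity:
    name: str
    process_type: str
    estimated_effort: Optional[float] = None  # None = not yet estimated
    decomp: Optional["Decomposition"] = None


@dataclass
class Decomposition:
    decomposition_type: str
    sub_processes: List[Activity] = field(default_factory=list)


def estim_effort_unspecified(dec: Optional[Decomposition]) -> bool:
    """True if an 'impl_first' decomposition has an unestimated subactivity."""
    return (
        dec is not None
        and dec.decomposition_type == "impl_first"
        and any(sub.estimated_effort is None for sub in dec.sub_processes)
    )


def act_constr_sat(act: Activity) -> bool:
    """Activity constraint for the effort-distribution information need."""
    return (act.process_type == "component_development"
            and estim_effort_unspecified(act.decomp))


impl_comp_act = Activity("impl_comp_act", "implement_component",
                         estimated_effort=40.0)
t_and_d_comp_act = Activity("t_and_d_comp_act", "test_and_debug_component")
comp_dev_act = Activity(
    "comp_dev_act", "component_development",
    decomp=Decomposition("impl_first", [impl_comp_act, t_and_d_comp_act]),
)
```

As long as `t_and_d_comp_act` has no estimate, `act_constr_sat(comp_dev_act)` is true; once all subactivities are estimated, the information need is no longer offered.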
    impl_first_effort_in:infoNeed[
        question -> "What is the effort distribution of the
                     ’implement-first’ method?"].

    /* rule for representing activity constraints */
    /* the predicate is true if ACT is of type ’component_development’
       and there is a subactivity introduced by ACT’s decomposition
       for which no effort estimation has been specified. */
    FORALL ACT act_constr_sat(impl_first_effort_in, ACT) :-
        ACT:component_development AND
        estim_effort_unspecified(ACT.decomp).

    FORALL DEC, SACT estim_effort_unspecified(DEC) :-
        DEC:impl_first AND
        DEC[sub_processes ->> SACT] AND
        SACT[estimated_effort -> nil].

    /* rule for representing role constraints */
    /* the predicate is true if agent AGT is the ’planner’
       for activity ACT */
    FORALL ACT, AGT role_constr_sat(impl_first_effort_in, ACT, AGT) :-
        ACT:component_development AND
        ACT[role@(planner) ->> AGT].

    /* rule for representing skill constraints */
    /* no constraints */
    FORALL ACT, AGT skill_constr_sat(impl_first_effort_in, ACT, AGT) :-
        ACT:component_development AND
        AGT:agent.

Fig. 3.36: Information need associated with type decomposition as shown in Figure 3.7.

3.2.3.6 Formalization of type specializations
A formal representation of the specialization relation between process types facilitates inheritance of information resources along this relation, i.e. process subtypes inherit information resources from their supertype(s). We formalize the specialization relation by means of F-Logic’s subclass-relationship: if process type s is a subtype of t, then we declare s to be a subclass of t (i.e. s::t). Figure 3.37 shows an excerpt from the formal representation of the type specialization example depicted in Figure 3.8.
    testing::process.
    black_box_testing::testing.
    glass_box_testing::testing.

Fig. 3.37: Formal representation of the type specialization example depicted in Figure 3.8.

Inheritance of information resources can now be formalized by making use of F-Logic’s concept of inheritable methods: instead of using the arrow type "->>" when assigning values to the method info_resources for a class, we use the arrow type "*->>" (see Figure 3.38). As a consequence, the method’s results are propagated to every subclass (and instance) of that class.

    testing[info_resources *->>
        {bug_report_in, avail_test_tools_in}].

    bug_report_in:infoNeed.
    bug_report_in[
        question -> "How do I report bugs?"].

    avail_test_tools_in:infoNeed.
    avail_test_tools_in[
        question -> "What tools are available for testing?"].

    glass_box_testing[info_resources *->>
        {avail_gb_test_tools_in}].

    avail_gb_test_tools_in:infoNeed.
    avail_gb_test_tools_in[
        question -> "What tools are available for glass-box testing?"].

Fig. 3.38: Inheritable multi-valued method info_resources.

For example, from the specifications listed in Figure 3.37 and Figure 3.38, we can derive the following information:

    black_box_testing[info_resources *->> {bug_report_in}].
    black_box_testing[info_resources *->> {avail_test_tools_in}].
    glass_box_testing[info_resources *->> {bug_report_in}].
    glass_box_testing[info_resources *->> {avail_test_tools_in}].
    glass_box_testing[info_resources *->> {avail_gb_test_tools_in}].
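The propagation of inheritable method values along the subclass relation can be reproduced with a few lines of Python. The sketch below uses the type and resource names from Figures 3.37 and 3.38; the dictionaries and the function name are illustrative assumptions and do not model F-Logic’s inheritance semantics in full (e.g. overriding is not considered here).

```python
from typing import Dict, Optional, Set

# Subclass relation from Figure 3.37 (subtype -> supertype).
SUPERTYPE: Dict[str, Optional[str]] = {
    "process": None,
    "testing": "process",
    "black_box_testing": "testing",
    "glass_box_testing": "testing",
}

# Locally declared (inheritable) information resources from Figure 3.38.
LOCAL_INFO_RESOURCES: Dict[str, Set[str]] = {
    "testing": {"bug_report_in", "avail_test_tools_in"},
    "glass_box_testing": {"avail_gb_test_tools_in"},
}


def info_resources(process_type: str) -> Set[str]:
    """Collect local resources plus everything inherited from supertypes."""
    collected: Set[str] = set()
    current: Optional[str] = process_type
    while current is not None:
        collected |= LOCAL_INFO_RESOURCES.get(current, set())
        current = SUPERTYPE.get(current)
    return collected


black_box = info_resources("black_box_testing")
glass_box = info_resources("glass_box_testing")
```

The two result sets correspond to the five derived facts listed above: two resources for black_box_testing and three for glass_box_testing.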
The following sections introduce a means to represent the fact that the information resource avail_gb_test_tools_in is a specialization of the inherited information resource avail_test_tools_in, which, consequently, should be overridden by its specialization during distribution (see Section 3.2.4.2).

3.2.3.7 Formalization of
The signature definition for iSourceRec is extended by the method specializationOf (see Figure 3.39).

    iSourceRec[specializationOf => iSourceRec].
    // default is ’nil’ as the root object.
    iSourceRec[specializationOf *-> nil].

Fig. 3.39: Signature extension for class iSourceRec.

Figure 3.40 lists the pattern for the formalization of the specialization relation.

    <isr_sub>[specializationOf -> <isr_super>].

    /* uncomment if no new info source has been specified
       for <isr_sub> (=default). */
    // <isr_sub>[infoSource -> <isr_super>.infoSource].

    /* uncomment if a new info source has been specified. */
    // <isr_sub>[infoSource -> <info_source>].

    /* <isr_sub> inherits <isr_super>’s categories */
    FORALL CAT <isr_sub>[categories ->> CAT] :-
        <isr_super>[categories ->> CAT].

Fig. 3.40: Pattern for a specialization relation element relating <isr_sub> to <isr_super>.
The inheritance of constraints along the specialization relation
3.2.3.8 Formalization of

The signature definition for infoNeed is extended by the method specializationOf (see Figure 3.41).

    infoNeed[specializationOf => infoNeed].
    // default is ’nil’ as the root object.
    infoNeed[specializationOf *-> nil].
    // ’nil’ will make sure that the recursion
    // in the rule body (see below) will terminate.

Fig. 3.41: Signature extension for class infoNeed. The constraints on the root info need nil have already been defined in Figure 3.39.

Figure 3.42 lists the pattern for the formalization of this specialization relation.

    <in_sub>[specializationOf -> <in_super>].

    /* uncomment if no new question has been specified
       for <in_sub> (=default). */
    // <in_sub>[question -> <in_super>.question].

    /* uncomment if a new question has been specified. */
    // <in_sub>[question -> <question>].

    /* <in_sub> inherits <in_super>’s categories */
    FORALL CAT <in_sub>[categories ->> CAT] :-
        <in_super>[categories ->> CAT].

Fig. 3.42: Pattern for a specialization relation element relating <in_sub> to <in_super>.
The inheritance of constraints along the specialization relation
    /* IN[isurs ->> ISUR] if ISUR is locally defined,
       OR it is inherited and not overwritten */
    FORALL IN, ISUR IN[isurs ->> ISUR] :-
        IN:infoNeed AND ISUR:iSourceUsageRec AND
        IN:infoNeed[isurs_local ->> ISUR].

    FORALL IN, ISUR, IN_sup IN[isurs ->> ISUR] :-
        IN:infoNeed AND ISUR:iSourceUsageRec AND
        IN[specializationOf -> IN_sup] AND
        // cause ’nil’ to fail, to stop recursion
        IN_sup:infoNeed AND
        IN_sup[isurs ->> ISUR] AND
        NOT specialized_by_local_isur(ISUR, IN).

    FORALL ISUR, ISUR_loc, IN specialized_by_local_isur(ISUR, IN) :-
        IN[isurs_local ->> ISUR_loc] AND
        ISUR_loc[specializationOf -> ISUR].

Fig. 3.43: Rules for inheritance of information source usage recommendations along the specialization relation.
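The effect of these inheritance rules can be traced with a small recursive function: a usage recommendation belongs to an information need if it is defined locally, or if it is inherited from the specialized information need and not overridden by a local specialization. The Python fragment below is a sketch only; the classes and all object names (`in_general`, `isur_a`, etc.) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class UsageRec:
    name: str
    # Name of the usage recommendation this one specializes, if any.
    specialization_of: Optional[str] = None


@dataclass
class InfoNeed:
    name: str
    isurs_local: List[UsageRec] = field(default_factory=list)
    # None plays the role of the root object 'nil' and stops the recursion.
    specialization_of: Optional["InfoNeed"] = None


def isurs(need: InfoNeed) -> Set[str]:
    """Names of local usage recommendations plus non-overridden inherited ones."""
    result = {rec.name for rec in need.isurs_local}
    if need.specialization_of is not None:
        # Inherited recommendations that a local one specializes are dropped.
        overridden = {rec.specialization_of for rec in need.isurs_local}
        inherited = isurs(need.specialization_of)
        result |= {name for name in inherited if name not in overridden}
    return result


in_general = InfoNeed("in_general", [UsageRec("isur_a"), UsageRec("isur_b")])
in_special = InfoNeed(
    "in_special",
    [UsageRec("isur_a_special", specialization_of="isur_a")],
    specialization_of=in_general,
)
```

In this example, `in_special` keeps its local `isur_a_special`, inherits `isur_b`, and no longer offers the overridden `isur_a`.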
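What remains now is the formalization of the specialization relation for information source usage recommendations; Figure 3.44 shows the corresponding signature extension.

    iSourceUsageRec[specializationOf => iSourceUsageRec].
    // default is ’nil’ as the root object.
    iSourceUsageRec[specializationOf *-> nil].

Fig. 3.44: Signature extension for class iSourceUsageRec.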
Figure 3.45 lists the pattern for the formalization of this relation.

    <isur_sub>[specializationOf -> <isur_super>].

    /* uncomment if no new infoSourceRec has been specified
       for <isur_sub> (=default). */
    // <isur_sub>[infoSource_rec -> <isur_super>.infoSource_rec].

    /* uncomment if a specialization of <isur_super>’s infoSourceRec
       has been specified
       (i.e. isr[specializationOf -> <isur_super>.infoSource_rec]). */
    // <isur_sub>[infoSource_rec -> <isr>].

    /* uncomment if no new usage direction has been specified
       for <isur_sub> (=default). */
    // <isur_sub>[usage_direc -> <isur_super>.usage_direc].

    /* uncomment if a specialization of <isur_super>’s usage direction
       has been specified (query specialization not formalized). */
    // <isur_sub>[usage_direc -> <query>].

Fig. 3.45: Pattern for a specialization relation element relating <isur_sub> to <isur_super>.

3.2.3.9 Formalization of Domain Model Entities
For the formalization of domain model entities, we choose an approach similar to the formalization of process types and their specialization relationship: entity types are represented in F-Logic as subclasses of a root class cs_thing, or as a subclass of one of cs_thing’s subclasses; the latter is used to represent a specialization of an entity type. However, in contrast to the process type formalization, no instances of these classes will be created: the classes are also used to represent concrete entities (e.g. programming languages, tools etc.). This approach has two advantages:
1. Representability aspect: treating entity types as first-order objects allows us to use ’abstract’ entities like "OO-Programming Language" as a topic that can be associated with process types (e.g. with type "OO-Implementation Process") or activities.
2. Maintenance aspect: over time, entities frequently develop sub-entities (e.g. the programming language Java can be considered to have the "sub-entities" Java 1.1, Java 1.2, Java 1.3, etc.).

Figure 3.46 shows a formal representation of the example entity model from Figure 3.9.
    tool::cs_thing.
    rational_rose::tool.
    jdk_1.2::tool.

    language::cs_thing.
    modeling_language::language.
    oo_programming_language::language.
    java::oo_programming_language.
    st_80::oo_programming_language.

    technology::cs_thing.
    java_technology::technology.
    rmi::java_technology.
    serialization::java_technology.

Fig. 3.46: Formal representation of the entity hierarchy example from Figure 3.9, extended by entities for rmi and serialization from Figure 3.11.

In order to express that information resources can be associated with entity types, we define a class entity_type as shown in Figure 3.47.

    entity_type[
        info_resources =>> iResource].

    FORALL ET ET:entity_type :- ET::cs_thing.

Fig. 3.47: Specification of the class entity_type. The rule states that all subclasses of cs_thing are to be considered instances of class entity_type.
Figure 3.48 extends the formalization example from Figure 3.46 by the specification of information resources that are associated with entity types as shown in Figure 3.9.
    oo_programming_language[
        info_resources *->> dp_catalog_isr].

    dp_catalog_isr:iSourceRec[
        infoSource -> dp_catalog].

    dp_catalog:iSource[
        name -> "WikiWikiWeb Design Pattern Catalog";
        access -> "http://c2.com/cgi-bin/wiki?PortlandPatternRepository"].

    java[
        info_resources *->> java_dp_catalog_isr].

    java_dp_catalog_isr:iSourceRec[
        infoSource -> java_dp_catalog].

    java_dp_catalog:iSource[
        name -> "Design Patterns Java Companion";
        access -> "http://www.patterndepot.com/put/8/JavaPatterns.htm"].

    java_dp_catalog_isr[
        specializationOf -> dp_catalog_isr].

Fig. 3.48: Example: Information resources attached to entity types (see Figure 3.9).
Formalized computer science domain entities are intended to represent key topics of activities. To that aim, we extend the definition of the root process type process and class process_type by a new method key_topics (see Figure 3.49).

    process[
        key_topics =>> entity_type].

    process_type[
        key_topics =>> entity_type].

Fig. 3.49: Method key_topics.

Whenever a topic is characteristic for all activities of a certain process type, this topic should be associated with the type; from there, it is inherited by all sub-types and instances of this type (see Figure 3.50 for an example).
3.2 A Model for Software Engineering Process-Oriented KM
    implementation_process::process.
    implement_with_VAJ_process::implementation_process.
    implement_with_VAJ_process[ key_topics *->> {vaj}].
    impl_act:implement_with_VAJ_process.
    impl_act[
        name -> "Implement an editor for ECA rules";
        key_topics ->> {rmi, serialization}].
Fig. 3.50: Process type implement_with_VAJ_process refers to the entity vaj as an (inheritable) key topic. As a consequence, it can be inferred that the instance impl_act also refers to entity vaj, in addition to the other two key topics that are specified explicitly for it.
In addition to the inheritance of key topics from process type(s), topics of an activity can often be inferred from other activity characteristics; in the example of an implementation activity, the chosen programming language usually is a good candidate for a key topic. Such dependencies can be expressed in a straightforward manner via rules (see Figure 3.51 for an example).
    FORALL ACT, TOP
    ACT[key_topics ->> TOP] :-
        ACT:implementation_process AND
        ACT[prog_language -> TOP].
Fig. 3.51: Example rule concerning activities of type implementation_process: the programming language specified for an activity is inferred to be one of the activity’s key topics.
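The combined effect of Figures 3.50 and 3.51 can be illustrated outside F-Logic. The following Python sketch is only an approximation under stated assumptions: the classes ProcessType and Activity, the method all_key_topics, and the value "java" for prog_language are hypothetical and do not appear in the formalization above.

```python
# Illustrative sketch only; class names, method names and the
# prog_language value are assumptions, not part of the thesis model.

class ProcessType:
    def __init__(self, name, parent=None, key_topics=()):
        self.name = name
        self.parent = parent                # super-type, or None for the root
        self.key_topics = set(key_topics)   # inheritable topics (the *->> facts)

    def ancestors(self):
        """Yield this type and all of its super-types, most specific first."""
        t = self
        while t is not None:
            yield t
            t = t.parent

    def inherited_topics(self):
        """Union of the topics declared on this type and its super-types."""
        topics = set()
        for t in self.ancestors():
            topics |= t.key_topics
        return topics


class Activity:
    def __init__(self, name, ptype, key_topics=(), prog_language=None):
        self.name = name
        self.ptype = ptype
        self.key_topics = set(key_topics)   # topics stated for the instance (->>)
        self.prog_language = prog_language

    def all_key_topics(self, implementation_type):
        """Explicit topics, inherited topics, and the rule of Figure 3.51."""
        topics = self.key_topics | self.ptype.inherited_topics()
        is_impl = any(t is implementation_type for t in self.ptype.ancestors())
        if is_impl and self.prog_language is not None:
            topics.add(self.prog_language)
        return topics


process = ProcessType("process")
implementation_process = ProcessType("implementation_process", process)
implement_with_vaj = ProcessType(
    "implement_with_VAJ_process", implementation_process, {"vaj"})

impl_act = Activity(
    "Implement an editor for ECA rules",
    implement_with_vaj,
    key_topics={"rmi", "serialization"},
    prog_language="java",  # assumed value, not given in Figure 3.50
)

inferred_topics = impl_act.all_key_topics(implementation_process)
print(sorted(inferred_topics))
```

The sketch computes the topics on demand rather than materializing them as facts, which is a simplification of what the deductive rules do.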
As mentioned in Section 3.2.2.3, some information needs can be expected to arise during an activity only in the presence of several specific key topics. As a consequence, we define a new method required_topics for class iResource (see Figure 3.52) to reference these required topics. During the distribution phase (see Section 3.2.4.3), the entities referenced via this method are required to be specified as an activity’s key topics; an information resource is only considered to be useful during an activity if all of these entities appear among the activity’s key topics, in addition to the other constraints that have to be fulfilled.
    iResource[ required_topics =>> entity_type].
Fig. 3.52: Method required_topics.
Required Topics
Figure 3.53 shows an example formalization to describe an information need that is considered to be useful only if the two entities rmi and serialization are present during an activity (also see Figure 3.11).
    stub_serialization_in:infoNeed.
    stub_serialization_in[
        question -> "How are stubs serialized?";
        required_topics ->> {rmi, serialization}].
    rmi [info_resources *->> stub_serialization_in].
    serialization [info_resources *->> stub_serialization_in].
Fig. 3.53: Information need stub_serialization_in associated with the entities rmi and serialization.
3.2.3.10 Formalization of Product Types and Parameters
In order to be able to associate information resources with product types, we introduce a basic representation of the latter (see Figure 3.54): product types are represented in F-Logic as subclasses of a new root class product. Instances of these subclasses will represent actual products created during activities.
    product[
        name => string;
        author => person;
        created => date;
        lastModified => date].
Fig. 3.54: F-Logic excerpt from the definition of the base class product.
Figure 3.55 shows an example definition of a product type design_document. As the example shows, type-specific attributes can be defined for each product type by specifying methods for the corresponding F-Logic class. The definition of additional attributes will become necessary whenever product constraints that refer to specific product characteristics (e.g. see Table 3.3) need to be formalized.
    design_document::document.
    design_document[
        design_language => modeling_language;
        contains_flaws => boolean].
Fig. 3.55: F-Logic representation of the product type design_document. Instances of design_document will have a method design_language, whose values are restricted to instances of modeling_language, and a method contains_flaws, whose boolean value indicates whether known flaws exist in the design.
Figure 3.56 shows a representation excerpt of the product model example depicted in Figure 3.14.
    executable::product.
    document::product.
    requirements_document::document.
    design_document::document.
    ejb_design_document::design_document.
    djb_design_document::design_document.
Fig. 3.56: Formal representation of the product type hierarchy shown in Figure 3.14.
Whereas class product represents products on the process instance level, all product types will be represented by the class product_type (see Figure 3.57).
    product_type[ info_resources@(view) =>> iResource].
    FORALL PT PT:product_type :- PT::product.
Fig. 3.57: F-Logic excerpt from the definition of the base class product_type. The rule states that all subclasses of product are to be considered instances of a class product_type.
In order to be able to associate information resources with product types, we define a multi-valued method info_resources for class product_type; the method takes instances of a class view as arguments. As a consequence, every subclass of product can refer to several information resources with respect to different views (see Figure 3.58 for a formalization of the example from Figure 3.14).
    ejb_design_checklist_rec:iSourceRec[ infoSource -> checklist_for_ejb_designs].
    djb_design_checklist_rec:iSourceRec[ infoSource -> checklist_for_djb_designs].
    design_view:view. // declaration of a view
    /* associate information resource with product types under this view */
    ejb_design_document[ info_resources@(design_view) ->> {ejb_design_checklist_rec}].
    djb_design_document[ info_resources@(design_view) ->> {djb_design_checklist_rec}].
Fig. 3.58: Product types associated with information resources.
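A view-parameterized method such as info_resources@(view) behaves like a lookup keyed by the pair (product type, view). The short Python sketch below illustrates this reading with the facts from Figure 3.58; the helper names associate and resources_for are hypothetical, and inheritance along the product type hierarchy is deliberately left out.

```python
from collections import defaultdict

# (product type, view) -> names of the associated information resources.
# Hypothetical stand-in for the parameterized method info_resources@(view).
info_resources = defaultdict(set)


def associate(product_type, view, resource):
    """Attach an information resource to a product type under one view."""
    info_resources[(product_type, view)].add(resource)


def resources_for(product_type, views):
    """Collect the resources of a product type for a set of relevant views."""
    found = set()
    for view in views:
        found |= info_resources.get((product_type, view), set())
    return found


associate("ejb_design_document", "design_view", "ejb_design_checklist_rec")
associate("djb_design_document", "design_view", "djb_design_checklist_rec")

print(resources_for("ejb_design_document", ["design_view"]))
print(resources_for("ejb_design_document", ["general"]))
```

Under this reading, the same product type can yield different recommendations depending on which views a parameter declaration marks as relevant.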
In order to formalize an information resource’s constraints, we use the same set of rule patterns introduced in Figure 3.15, but replace the pattern for act_constr_sat by a pattern for prod_constr_sat (see Figure 3.59).
    FORALL PROD prod_constr_sat(, PROD) :.
Fig. 3.59: Rule pattern for representing product constraints via predicate prod_constr_sat. The predicate is true if the corresponding constraints are satisfied. The non-terminal is replaced by the name of the F-Logic object that represents the information resource; is replaced by an appropriate F-Logic rule body that can reference the product PROD and its method values.
Figure 3.60 shows a representation of the information source recommendation example from Table 3.3.
    review_report_rec:iSourceRec[ infoSource -> PROD.review_report].
    /* Useful if PROD contains design flaws */
    FORALL PROD
    prod_constr_sat(review_report_rec, PROD) :-
        PROD.contains_flaws = true.
    /* Useful if AGT performs the role "designer" or "planner". */
    FORALL PROD, ACT, AGT
    role_constr_sat(review_report_rec, ACT, AGT) :-
        ACT[role@(designer) ->> AGT].
    FORALL ACT, AGT
    role_constr_sat(review_report_rec, ACT, AGT) :-
        ACT[role@(planner) ->> AGT].
    /* no skill constraints */
    FORALL ACT, AGT
    skill_constr_sat(review_report_rec, ACT, AGT).
    /* Associate with "design document" under view "general" */
    design_document[ info_resources@(general) ->> {review_report_rec}].
Fig. 3.60: Formal representation of the information source recommendation example from Table 3.3.
On the type level, product types are referenced by the parameter declarations of process types. Since these declarations are used to select a set of views for a parameter (see Figure 3.15), we also need to formalize parameter declarations as well as concrete parameters. Parameter declarations are represented as subclasses of a new root class process_parameter (see Figure 3.61 for an example); instances of these subclasses represent parameters of concrete activities.
    requirements_doc_para::process_parameter.
    design_doc_para::process_parameter.
    design_doc_para[
        name -> "desdoc";
        type -> design_document;
        value => design_document;
        relevant_views ->> {design_view}].
Fig. 3.61: Formalization of the "desdoc" parameter declaration from Figure 3.15. The class design_doc_para can be used for both process types (design_process and design_review_process), because they share the same relevant views.
Note that the parameter declaration for "desdoc" of process type implementationProcess (see Figure 3.14) requires a new subclass of process_parameter (see Figure 3.62), because its relevant view is the default view general.
    design_doc_general_para::process_parameter.
    design_doc_general_para[
        name -> "desdoc";
        type -> design_document;
        value => design_document;
        relevant_views ->> {general}].
Fig. 3.62: Formalization of the "desdoc" parameter declaration from Figure 3.13.
In order to describe the structure of parameter declarations, we declare them to be instances of a new class processTypeParameter (see Figure 3.63).
    FORALL PP PP:processTypeParameter :- PP::process_parameter.
    processTypeParameter[
        name => string;
        type => product_type;
        relevant_views =>> view].
Fig. 3.63: F-Logic excerpt from the definition of class processTypeParameter.
Parameters
Process types reference their parameter declarations by a new method declared_parameters (see Figure 3.64). For our purpose of associating information resources with process parameters, there is no need to distinguish between consumed and created products.
    process_type[ declared_parameters =>> processTypeParameter].
Fig. 3.64: Multi-valued method declared_parameters.
Figure 3.65 shows an example of a process type with parameter declarations.
    design_process::process.
    design_process[ declared_parameters ->> {requirements_doc_para, design_doc_para}].
Fig. 3.65: Formalization of the process type design_process from Figure 3.15.
In order to allow activities to reference the products handled by them, we extend the definition of class process by a new method parameters (see Figure 3.66); again, there is no need here to distinguish between consumed and created products.
    process[ parameters =>> process_parameter].
Fig. 3.66: Method parameters for instances of process.
Figure 3.67 shows an (instance-level) example of a concrete activity with parameters.
    design_act:design_process.
    aReq_doc_para:requirements_doc_para.
    aDesign_doc_para:design_doc_para.
    design_act[ parameters ->> {aReq_doc_para, aDesign_doc_para}].
    /* the current design document is based on an EJB design. */
    eca_rule_editor_design:ejb_design_document.
    aDesign_doc_para[ value -> eca_rule_editor_design].
Fig. 3.67: Excerpt from the formal representation of an activity with parameters, based on the process type formalized in Figure 3.65.
If domain entities have been formalized, product types as well as concrete products should also be used as a source for additional activity key topics. Therefore, we extend the definition of product_type and product by a new method key_topics (see Figure 3.68), in analogy to the extensions for process_type and process (see Section 3.2.3.9). Whenever a topic is characteristic for all products of a certain product type, then this topic should be associated with the type; from there, it is inherited by all product sub-types and instances of these types (see Figure 3.69 for an example).
    product[ key_topics =>> entity_type].
    product_type[ key_topics =>> entity_type].
Fig. 3.68: Method key_topics.
    ejb_design_document[ key_topics *->> {ejb}].
    eca_rule_editor_design[
        name -> "Design Document of an ECA rule editor.";
        key_topics ->> {mvc_architecture, command_designPattern}].
Fig. 3.69: Product type ejb_design_document refers to the entity ejb as an (inheritable) key topic. As a consequence, it can be inferred that the product instance eca_rule_editor_design also refers to entity ejb, in addition to the other two key topics that are specified explicitly for it.
3.2.4 Information Distribution
According to Clarification 3.1.iv, it is the Knowledge Department’s responsibility to provide agents with knowledge items that are assumed to be useful with regard to individual activities. More specifically, given an activity description act and an agent agt who performs a certain role during the activity, it is the task of the KD’s Knowledge Brokers to provide agt with the set of all captured knowledge items believed to be useful for agt concerning act. Rather than retrieving a set of knowledge items for a given activity directly, we make use of our explicit representation of information needs as a means to group the knowledge items to be retrieved with regard to a specific purpose (i.e. the question they are intended to help answer). More specifically, Knowledge Brokers in the KD perform the following steps:
Product Key Topics
1. Given an activity specification act for an agent agt, determine the set of all information resources iResources(act,agt) that have been associated with act’s type(s).
2. If act refers to additional entities (products, topics, etc.), then the information resources that have been associated with these entities are included in iResources(act,agt).
3. Remove all information resources from iResources(act,agt) whose constraints are not satisfied.
4. Present the remaining information resources in iResources(act,agt) to agent agt. On request from agt, present the knowledge items encapsulated by a chosen information resource to agent agt:
   • for an information source recommendation, present the referenced information source to agt (e.g. open the URL in a browser).
   • for an information need, determine the subset of specified information source usage recommendations whose constraints are satisfied, and present these usage recommendations to agt. On request from agt, either execute a usage recommendation’s queryCommand (in case it has been specified) and present the retrieval result to agt; or present the referenced information source to agt, together with the textual description of how to access it. In both cases, also present the information need’s sub-information needs to agt on request.
In the following, we will show how these steps can be automated, in order to establish the scenario depicted in Figure 3.70. Here, information distribution is integrated into the process enactment service provided by the organization’s workflow management system (WFM system). One of the WFM system’s components, the Activity List Manager, allows agents to access their individual to-do lists. For our information distribution scenario, we provide each human agent with an additional software component called Information Assistant (IA).
Whenever an agent focuses on a particular activity (1), the IA determines and presents the set of information resources considered to be useful for this agent (i.e. with regard to his role and skills) concerning the selected activity ((2) and (3)). The categories specified for the retrieved information resources are used to organize and index these resources. The IA allows the agent to browse this set of information source recommendations and expected information needs (as well as their sub-information needs). On demand (4), the IA accesses the information source referenced by a selected (usage) recommendation (5). This yields a set of retrieved knowledge items which is presented to the agent (6).
[Figure 3.70 (diagram not reproduced). Components shown: Knowledge Department – Technical Infrastructure (Information Resource Manager; Activity KB, Product KB, Agent KB; process model, product model and domain entity model, each with IRS); WFMS (WFE, Activity List Manager with To-Do List); Information Assistant; Agent; internal and external information sources. Labelled interactions: update activity characteristics, update request, iResources(act,agt), choose, launch queryCommand, present retrieval result/document.]
Fig. 3.70: Information distribution coupled with a WFMS. The process model, product model and domain entity model have been enriched with information resources (IRS).
The two main purposes of the Information Assistant are:
• to provide the agent systematically with immediate access to useful information items that satisfy his information needs (i.e. he neither has to wait for the Knowledge Brokers to find information items for him, nor does he need to search for them himself).
• to ease the burden on Knowledge Brokers of answering questions that are raised repeatedly (e.g. standard questions asked by new employees, or by employees working with a particular piece of technology for the first time).
If the agent has an information need that does not correspond to an expected information need offered by the IA, the agent can:
1. characterize the activity under consideration more specifically by filling in activity aspects in its characterization. As a consequence, the IA will update the set of information resources being offered in accordance with the new activity specification,
2. post his question in the form of a new information need to the KD, or
3. attempt to find the desired information by himself (e.g. by trying to find someone who might be able to answer his question, or by searching for useful documents).
In the second case, the IA serves as the agent’s "interface" to the KD, i.e. it allows the agent to post a question to the KD for which he would like to have an answer. The KD will attempt to provide the agent with information items that satisfy his information need (i.e. the KD searches for useful information items and provides the agent with the search result via his IA). The IA supports this communication between the KD and the agent by providing both of them with the activity’s formal characterization that establishes the context in which the information need occurs:
• the KD can use this characterization to determine the appropriate Knowledge Brokers with regard to their expertise in the activity, its products, key topics, etc.
• the agent does not have to describe his current work context, project information, etc., assuming that this information has already been specified as part of the characterization. Only where the KB considers this necessary for automated information distribution might he ask the agent for additional activity aspects in case they have been left unspecified in the characterization.
Before the KB tries to find information items that might answer the agent’s question, he needs to check whether the question corresponds to one of the already existing expected information needs whose constraints were not satisfied and which, hence, was not offered. If such an expected information need exists, then this information need should be offered to the agent despite the unsatisfied constraints; these will have to be updated later during a review phase (see Chapter 5). In any case, the IA will keep track of new information needs posted by the agent for later review (see Chapter 5).
The formalizations introduced in Section 3.2.3 facilitate a computable mapping infostT(T)-> P(K) from activities to a set of knowledge items assumed to be useful: given a process instance p_act that formally represents an activity act of team member agt, the set of knowledge items assumed to be useful for agt can be determined by the following query:
    ?- useful_iResource_from_type(p_act, agt, IRS).
The set of useful information resources consists of the variable bindings for IRS; the definition of the predicate useful_iResource_from_type is shown in Figure 3.71 and Figure 3.72.
    /* */
    FORALL ACT, AGT, IRS
    useful_iResource_from_type(ACT, AGT, IRS) :-
        ACT[info_resources_from_type ->> IRS] AND
        constraints_sat(IRS, ACT, AGT).
    FORALL IRS, ACT, AGT
    constraints_sat(IRS, ACT, AGT) :-
        act_constr_sat(IRS, ACT) AND
        role_constr_sat(IRS, ACT, AGT) AND
        skill_constr_sat(IRS, ACT, AGT).
Fig. 3.71: Rule for predicate useful_iResource_from_type(ACT, AGT, IRS): determines the set of all useful information resources for an agent that are associated with an activity’s type(s), and whose constraints are satisfied.
    /* The info resources for ACT are determined by info resources associated with ACT’s process type(s). */
    FORALL ACT, IRS, T
    ACT[info_resources_from_type ->> IRS] :-
        ACT:T AND
        T[info_resources ->> IRS].
Fig. 3.72: Rule for method info_resources_from_type: determines the set of all information resources that are associated with an activity due to its process type(s). Note that, if activities can be instances of several process types, the information resources associated with each of these process types are considered for distribution.
In order to abstract from the fact that information resources were collected from the type, we introduce the predicate useful_iResource as shown in Figure 3.73.
    /* */
    FORALL ACT, AGT, IRS
    useful_iResource(ACT, AGT, IRS) :-
        useful_iResource_from_type(ACT, AGT, IRS).
Fig. 3.73: Rule (1) for predicate useful_iResource(ACT, AGT, IRS). An information resource IRS is useful for an agent AGT during activity ACT, if an appropriate specification has been associated with the activity’s type.
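The rules in Figures 3.71 and 3.72 amount to a collect-then-filter procedure: gather the resources attached to each of the activity's process types, then keep those whose three constraint predicates hold. The Python sketch below mirrors this under assumptions: the classes, the example resources design_checklist_rec and coding_style_rec, and the agents "alice" and "bob" are hypothetical and serve only as illustration.

```python
def always_satisfied(*_args):
    """Default constraint: no restriction."""
    return True


class InfoResource:
    def __init__(self, name, act_constr=always_satisfied,
                 role_constr=always_satisfied, skill_constr=always_satisfied):
        self.name = name
        self.act_constr = act_constr      # callable(act) -> bool
        self.role_constr = role_constr    # callable(act, agt) -> bool
        self.skill_constr = skill_constr  # callable(act, agt) -> bool


class Activity:
    def __init__(self, name, types, roles):
        self.name = name
        self.types = list(types)  # an activity may have several process types
        self.roles = roles        # role name -> set of agents


def constraints_sat(irs, act, agt):
    """Counterpart of constraints_sat in Figure 3.71."""
    return (irs.act_constr(act)
            and irs.role_constr(act, agt)
            and irs.skill_constr(act, agt))


def info_resources_from_type(act, type_resources):
    """Counterpart of Figure 3.72: union over all process types of act."""
    found = []
    for t in act.types:
        for irs in type_resources.get(t, []):
            if irs not in found:
                found.append(irs)
    return found


def useful_iresource_from_type(act, agt, type_resources):
    return [irs for irs in info_resources_from_type(act, type_resources)
            if constraints_sat(irs, act, agt)]


def designer_or_planner(act, agt):
    return any(agt in act.roles.get(role, set())
               for role in ("designer", "planner"))


design_checklist_rec = InfoResource("design_checklist_rec",
                                    role_constr=designer_or_planner)
coding_style_rec = InfoResource("coding_style_rec")
type_resources = {"design_process": [design_checklist_rec, coding_style_rec]}

design_act = Activity("design_act", ["design_process"],
                      {"designer": {"alice"}, "reviewer": {"bob"}})

names_for_alice = [r.name for r in
                   useful_iresource_from_type(design_act, "alice", type_resources)]
names_for_bob = [r.name for r in
                 useful_iresource_from_type(design_act, "bob", type_resources)]
print(names_for_alice, names_for_bob)
```

Unlike the F-Logic query, which yields variable bindings for IRS, the sketch returns an ordered list; the order carries no meaning.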
Note that the formalization of activities is mainly motivated by the aim of automating the distribution of information resources. The degree of activity formalization directly affects the effort the KD will have to spend on determining the set of useful information resources, in addition to its influence on recall and precision. In the worst case, any missing formalization of an activity’s characteristic might have as its consequence that available, useful information resources are not offered automatically.
3.2.4.1 Decompositions and Information Distribution
If activity representations can optionally specify a decomposition that has been selected for the activity, then the information resources attached to that selected decomposition should also be considered for recommendation during the activity. Given the definitions shown in Figure 3.74 and Figure 3.75, the set of all knowledge items assumed to be useful can be determined by the following query:
    ?- useful_iResource_from_decomp(p_act, agt, IRS).
Here, p_act is a process instance that formally represents an activity act of team member agt; the variable bindings for IRS constitute the set of useful information resources.
    /* */
    FORALL ACT, AGT, IRS
    useful_iResource_from_decomp(ACT, AGT, IRS) :-
        ACT[iResources_from_decomp ->> IRS] AND
        constraints_sat(IRS, ACT, AGT).
Fig. 3.74: Rule for predicate useful_iResource_from_decomp(ACT, AGT, IRS): determines the set of all useful information resources for an agent that are associated with an activity’s decomposition, and whose constraints are satisfied. Remember that, as mentioned in Section 3.2.3.5, the constraints refer to the activity being decomposed, and not to the decomposition.
    /* */
    FORALL ACT, IRS
    ACT[iResources_from_decomp ->> IRS] :-
        ACT[iResources_from_decomp@(ACT.decomp) ->> IRS].
    FORALL ACT, DEC, IRS
    ACT[iResources_from_decomp@(DEC) ->> IRS] :-
        DEC[info_resources ->> IRS].
    /* Activity decompositions receive their iResources from their process type decomposition */
    FORALL DEC, DEC_TYPE, IRS
    DEC[info_resources ->> IRS] :-
        DEC:DEC_TYPE AND
        DEC_TYPE[info_resources ->> IRS].
Fig. 3.75: Rule for method iResources_from_decomp: determines the set of information resources that are associated with an activity’s decomposition due to the decomposition’s type.
In order to abstract from the fact that information resources were collected from the decomposition, we extend the definition of the predicate useful_iResource by the additional rule shown in Figure 3.76.
    /* */
    FORALL ACT, AGT, IRS
    useful_iResource(ACT, AGT, IRS) :-
        useful_iResource_from_decomp(ACT, AGT, IRS).
Fig. 3.76: Rule (2) for predicate useful_iResource(ACT, AGT, IRS). An information resource IRS is useful for an agent AGT during activity ACT, if an appropriate specification has been associated with the activity’s decomposition.
An issue that remains to be addressed here is how the information resources associated with a process type are related to the sub-process types introduced by the decomposition. In contrast to the inheritance of a process type’s information resources along the specialization relation, a decomposed process type’s information resources cannot simply be inherited by its individual sub-process types, if only for the reason that, in general, they do not share the particular characteristics referred to by the super-process type’s resource constraints. As explained in Section 3.2.2.1, the information resources associated with the activity’s context should also be considered for recommendation. Figure 3.77 shows the corresponding definition of a predicate useful_iResource_from_context for the identification of an activity’s context-induced information resources.
    /* useful context iResources for an activity are all useful iResources associated with the super-activity */
    FORALL ACT, AGT, IRS
    useful_iResource_from_context(ACT,AGT,IRS) :-
        useful_iResource(
            ACT.containing_decomp.decomposed_process, AGT, IRS).
Fig. 3.77: Rule for predicate useful_iResource_from_context(ACT, AGT, IRS): determines the set of all information resources that are introduced by an activity’s super-activity. In combination with the rule shown in Figure 3.78, this rule also covers the iResources from the super-activity’s context, i.e. the activity’s complete context is taken into account.
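The recursion described in the caption of Figure 3.77 can be made explicit with a small Python sketch. It assumes a simplified model in which each activity knows its super-activity directly (standing in for ACT.containing_decomp.decomposed_process) and in which useful resources are the union of an activity's own resources and its context resources; the activity and resource names are hypothetical.

```python
class Activity:
    def __init__(self, name, resources=(), super_activity=None):
        self.name = name
        # (resource name, constraint) pairs; constraint is callable(act, agt)
        self.resources = list(resources)
        # hypothetical shortcut for ACT.containing_decomp.decomposed_process
        self.super_activity = super_activity


def useful_own(act, agt):
    """Resources attached to act itself whose constraints hold for act."""
    return {name for name, constraint in act.resources if constraint(act, agt)}


def useful_from_context(act, agt):
    """Counterpart of Figure 3.77: the useful resources of the super-activity."""
    if act.super_activity is None:
        return set()
    return useful(act.super_activity, agt)


def useful(act, agt):
    """Own resources plus context resources; recursion reaches the root."""
    return useful_own(act, agt) | useful_from_context(act, agt)


def no_constraint(act, agt):
    return True


def only_for_bob(act, agt):
    return agt == "bob"


project = Activity("project", [("project_plan_rec", no_constraint)])
design = Activity("design",
                  [("design_guidelines_rec", no_constraint),
                   ("bob_only_rec", only_for_bob)],
                  super_activity=project)
review = Activity("design_review",
                  [("review_checklist_rec", no_constraint)],
                  super_activity=design)

print(sorted(useful_from_context(review, "alice")))
print(sorted(useful(review, "bob")))
```

Note that a context resource's constraints are evaluated against the super-activity to which it is attached, not against the sub-activity, which matches the first argument passed to useful_iResource in the rule.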
In order to abstract from the fact that information resources were collected from the context, we extend the definition of the predicate useful_iResource by the additional rule shown in Figure 3.78.
    /* */
    FORALL ACT, AGT, IRS
    useful_iResource(ACT, AGT, IRS) :-
        useful_iResource_from_context(ACT, AGT, IRS).
Fig. 3.78: Rule (3) for predicate useful_iResource(ACT, AGT, IRS). An information resource IRS is useful for an agent AGT during activity ACT, if an appropriate specification has been associated with the activity’s context.
3.2.4.2 Specialization and Information Distribution
In case the process-modeling language under consideration facilitates process type specialization, all information resources associated with a process type should be inherited by its process sub-types, except those that are explicitly overridden by a specialized information resource associated with the sub-type.
    /* ACT collects the IRS from all(!) the classes along the inheritance path of process types, as long as the IRS are not specialized along the way */
    FORALL ACT, IRS, T
    ACT[info_resources_from_type ->> IRS] :-
        ACT:T AND
        T[info_resources *->> IRS] AND
        NOT iRes_specialized_by_subtype(ACT,T,IRS).
    // checks whether IRS is specialized by a subtype S of T, such that ACT is also an instance of S.
    FORALL ACT, T, IRS, S, IRS_spec
    iRes_specialized_by_subtype(ACT,T,IRS) :-
        ACT:S::T AND
        S[info_resources *->> IRS_spec] AND
        IRS_spec[specializationOf -> IRS].
Fig. 3.79: Rules for method info_resources_from_type reflecting specialization: determine the set of all information resources that are associated with an activity due to its process type(s). This includes resources that are inherited from the process’ super-types, as long as they have not been explicitly specialized.
To reflect this, the rule presented in Figure 3.72 is replaced by the rules shown in Figure 3.79. Revisiting the example from Figure 3.38, these new rules allow us to infer the following facts:
    glass_box_testing[info_resources_from_type ->> {bug_report_in}].
    glass_box_testing[info_resources_from_type ->> {avail_gb_test_tools_in}].
In particular, the inherited (but specialized) information need avail_test_tools_in has been excluded, i.e. we cannot infer:
    glass_box_testing[info_resources_from_type ->> {avail_test_tools_in}].
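The override behaviour of Figure 3.79 can be approximated in a few lines of Python. The sketch assumes single inheritance and a two-level type hierarchy; the type names testing_process and glass_box_testing_process and the placement of the resources on them are assumptions, since Figure 3.38 is not reproduced here. Only the three information need names are taken from the text above.

```python
# Assumed type hierarchy and resource placement (hypothetical).
SUPERTYPE = {
    "glass_box_testing_process": "testing_process",
    "testing_process": None,
}
TYPE_RESOURCES = {
    "testing_process": ["bug_report_in", "avail_test_tools_in"],
    "glass_box_testing_process": ["avail_gb_test_tools_in"],
}
# specializationOf relation between information resources
SPECIALIZATION_OF = {"avail_gb_test_tools_in": "avail_test_tools_in"}


def type_path(process_type):
    """The type itself followed by its super-types up to the root."""
    path = []
    while process_type is not None:
        path.append(process_type)
        process_type = SUPERTYPE[process_type]
    return path


def info_resources_from_type(process_type):
    """Collect resources along the inheritance path, dropping every
    resource that is specialized by another collected resource."""
    collected = [irs
                 for t in type_path(process_type)
                 for irs in TYPE_RESOURCES.get(t, [])]
    overridden = {SPECIALIZATION_OF[irs]
                  for irs in collected if irs in SPECIALIZATION_OF}
    return [irs for irs in collected if irs not in overridden]


print(info_resources_from_type("glass_box_testing_process"))
print(info_resources_from_type("testing_process"))
```

An activity of the general testing type still receives avail_test_tools_in, because the specializing resource is not visible on its inheritance path.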
The inheritance of constraints along the specialization relations
3.2.4.3 Domain Model Entities and Information Distribution
If the representation of activities includes a specification of key topics in order to further characterize an activity, then the information resources associated with an activity’s key topics should also be considered for recommendation (see Figure 3.81 and Figure 3.82).
    /* */
    FORALL ACT, AGT, IRS
    useful_iResource_from_topics(ACT, AGT, IRS) :-
        ACT[iResources_from_topics ->> IRS] AND
        constraints_sat(IRS, ACT, AGT).
Fig. 3.81: Rule for predicate useful_iResource_from_topics(ACT, AGT, IRS): determines the set of all useful information resources for an agent that are associated with an activity’s key topics, and whose constraints are satisfied.
    FORALL ACT, TOP, IRS
    ACT[iResources_from_topics ->> IRS] :-
        has_key_topic(ACT,TOP) AND
        TOP[info_resources *->> IRS] AND
        NOT iRes_specialized_by_subtopic(ACT,TOP, IRS).
    FORALL ACT, TOP
    has_key_topic(ACT,TOP) :-
        ACT[key_topics *->> TOP].
    // checks whether IRS is specialized into IRS_spec by a subtopic TOP_spec of TOP, such that TOP_spec is (also) a key topic of ACT.
    FORALL ACT, TOP, IRS
    iRes_specialized_by_subtopic(ACT,TOP,IRS) :-
        ACT[key_topics *-> TOP_spec] AND
        TOP_spec::TOP AND
        TOP_spec[info_resources *->> IRS_spec] AND
        IRS_spec[specializationOf -> IRS].
Fig. 3.82: Rules for method iResources_from_topics: determine the set of information resources that are associated with the activity’s topics. Note on the definition of predicate iRes_specialized_by_subtopic: Florid’s closure properties ensure that "o::o" holds for all active objects o. Hence, a topic is always a subclass of itself, and the information resources associated with a particular topic are also taken into account when testing for a more specialized information resource associated with that topic.
In order to take into account the additional constraint represented by an information resource’s set of required topics, the predicate constraints_sat(IRS,ACT,AGT) needs to be adapted as shown in Figure 3.83.
    FORALL ACT, AGT, IRS
    constraints_sat(IRS, ACT, AGT) :-
        act_constr_sat(IRS, ACT) AND
        role_constr_sat(IRS, ACT, AGT) AND
        skill_constr_sat(IRS, ACT, AGT) AND
        all_required_topics_present(IRS, ACT) AND
        constraints_sat(IRS.specializationOf, ACT, AGT).
    // ’nil’ will make sure that the recursion
    // in the rule body above will terminate
    FORALL ACT, AGT
    constraints_sat(nil, ACT, AGT).
Fig. 3.83: Rule for predicate constraints_sat(IRS, ACT, AGT) reflecting required topics. The constraint inheritance rules introduced in Figure 3.79 are adapted to include a test (via predicate all_required_topics_present) whether the required topics for information resource IRS are present in activity ACT.
The entities referenced via method required_topics are required to be specified as an activity’s key topics; an information resource is only considered to be useful during an activity if all of these entities appear among the activity’s key topics, in addition to the other constraints that have to be fulfilled.
    /* is there an element TOP in the "set" IRS.required_topic such that no element TOP_spec in the "set" of ACT’s key_topics exists with: TOP_spec is subsumed by TOP? */
    FORALL TOP, ACT, TOP_spec
    topic_present_in_act(TOP, ACT) :-
        has_key_topic(ACT, TOP_spec) AND
        TOP_spec::TOP.
    ?- sys.strat.doIt.
    FORALL ACT, IRS
    not_all_topics_present(IRS, required_topics, ACT, key_topics) :-
        IRS[required_topics ->> TOP] AND
        ACT[key_topics ->> {}] AND
        not topic_present_in_act(TOP, ACT).
    /* is each element "in the set" IRS.required_topic subsumed by an element "in the set" of ACT’s key_topics? */
    ?- sys.strat.doIt.
    FORALL ACT, IRS
    all_required_topics_present(IRS, ACT) :-
        IRS[required_topics ->>{}] AND
        ACT[key_topics ->>{}] AND
        NOT not_all_topics_present(IRS, required_topics, ACT, key_topics).
Fig. 3.84: Rule for predicate all_required_topics_present(IRS, ACT). The Florid command "?- sys.strat.doIt." partitions a program into several strata to facilitate the inference of intended results in the presence of negation. Note on the definition of predicate topic_present_in_act: Florid’s closure properties ensure that "o::o" holds for all active objects o. Hence, a topic is always subsumed by itself.
Given the definitions shown in Figure 3.81 through Figure 3.84, the set of all knowledge items assumed to be useful because of an activity’s topics can be determined by the following query:

?- useful_iResource_from_topics(p_act, agt, IRS).

where p_act is a process instance that formally represents an activity act of team member agt; the variable bindings for IRS constitute the set of useful information resources.
CHAPTER 3 Process-Oriented Knowledge Management
In order to abstract from the fact that information resources were collected from activity topics, we extend the definition of the predicate useful_iResource by the additional rule shown in Figure 3.85.

/* */
FORALL ACT, AGT, IRS
  useful_iResource(ACT, AGT, IRS) :-
    useful_iResource_from_topics(ACT, AGT, IRS).

Fig. 3.85: Rule (4) for predicate useful_iResource(ACT, AGT, IRS). An information resource IRS is useful for an agent AGT during activity ACT, if an appropriate specification has been associated with at least one of ACT’s key topics.

3.2.4.4 Products and Information Distribution
In the case that activity representations reference handled products as parameter values, the information resources associated with those products’ types should also be considered for recommendation during an activity. The appropriate views on its products are obtained from the corresponding declaration of the activity’s parameter that references the product. The new four-argument predicate constraints_sat checks the information resource’s product constraints (instead of the activity constraints); however, it still requires the activity to test the role constraints (see Figure 3.86).

FORALL PROD, ACT, AGT, IRS
  constraints_sat(IRS, PROD, ACT, AGT) :-
    prod_constr_sat(IRS, PROD) AND
    role_constr_sat(IRS, ACT, AGT) AND
    skill_constr_sat(IRS, ACT, AGT) AND
    constraints_sat(IRS.specializationOf, PROD, ACT, AGT).

// ’nil’ will make sure that the recursion
// in the rule body above will terminate
FORALL PROD, ACT, AGT
  constraints_sat(nil, PROD, ACT, AGT).

Fig. 3.86: Rule for predicate constraints_sat(IRS, PROD, ACT, AGT). The predicate constraints_sat(IRS, PROD, ACT, AGT) is true if the constraints defined locally for the information resource hold and the constraints defined for the resource that IRS specializes hold as well.
Given the definitions shown in Figure 3.86 and Figure 3.87, the set of all knowledge items assumed to be useful with respect to the products handled by an activity can be determined by the following query:

?- useful_iResource_from_paras(p_act, agt, IRS).

where p_act is a process instance that formally represents an activity act of team member agt; the variable bindings for IRS constitute the set of useful information resources.

/* */
FORALL ACT, AGT, IRS
  useful_iResource_from_paras(ACT, AGT, IRS) :-
    ACT[iResources_from_paras ->> IRS].

FORALL ACT, AGT, IRS, PAR, VIEW
  ACT[iResources_from_paras ->> IRS] :-
    ACT[parameters ->> PAR] AND
    PAR[relevant_views ->> VIEW] AND
    useful_iResource_from_product(PAR.value, VIEW, ACT, AGT, IRS).

FORALL PROD, ACT, AGT, IRS, PAR, VIEW
  useful_iResource_from_product(PROD, VIEW, ACT, AGT, IRS) :-
    PROD[info_resources_from_type@(VIEW) ->> IRS] AND
    constraints_sat(IRS, PROD, ACT, AGT).

/* PROD collects the IRS in the specified view from all(!) the classes
   along the inheritance path of product types, as long as the IRS
   are not specialized along the way */
FORALL PROD, PT, IRS, VIEW
  PROD[info_resources_from_type@(VIEW) ->> IRS] :-
    PROD:PT AND
    PT[info_resources@(VIEW) *->> IRS] AND
    NOT iRes_specialized_by_subtype(PROD, PT, VIEW, IRS).

// checks whether IRS is specialized by a subtype S of T,
// such that PROD is also an instance of S.
FORALL PROD, T, IRS, S, IRS_spec
  iRes_specialized_by_subtype(PROD, T, VIEW, IRS) :-
    PROD:S::T AND
    S[info_resources@(VIEW) *->> IRS_spec] AND
    IRS_spec[specializationOf -> IRS].

Fig. 3.87: Rule for predicate useful_iResource_from_paras(ACT, AGT, IRS): determines the set of all useful information resources for an agent that are associated with the product types of the products handled during an activity, and whose constraints are satisfied. This includes resources that are inherited from the products’ super-types, as long as they have not been explicitly specialized.
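The collection of information resources along the inheritance path of product types, with specialized resources shadowing the ones they specialize, may be easier to follow in a small Python sketch. The sketch rests on several assumptions that are not taken from the thesis: a product is represented only by its most specific type, each type has at most one supertype, the classes Resource and ProductType as well as the view name "author" and all example resources are invented, and the constraint check of Figure 3.86 is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    specialization_of: Optional["Resource"] = None


@dataclass
class ProductType:
    name: str
    parent: Optional["ProductType"] = None
    # view name -> resources attached directly to this type
    info_resources: Dict[str, List[Resource]] = field(default_factory=dict)


def type_path(product_type: ProductType) -> List[ProductType]:
    """The type itself followed by its supertypes, most specific first."""
    path = []
    current = product_type
    while current is not None:
        path.append(current)
        current = current.parent
    return path


def info_resources_from_type(product_type: ProductType, view: str) -> List[Resource]:
    """Collect resources for a view along the type path, skipping any resource
    that is specialized by a resource attached to the same or a more specific type."""
    result = []
    seen = []  # resources attached to the types visited so far
    for current_type in type_path(product_type):
        local = current_type.info_resources.get(view, [])
        seen.extend(local)
        for resource in local:
            if not any(s.specialization_of is resource for s in seen):
                result.append(resource)
    return result


# Invented example data.
template = Resource("document template")
uml_template = Resource("UML design template", specialization_of=template)
checklist = Resource("design checklist")

document = ProductType("Document", info_resources={"author": [template]})
design_document = ProductType(
    "DesignDocument",
    parent=document,
    info_resources={"author": [uml_template, checklist]},
)

print([r.name for r in info_resources_from_type(design_document, "author")])
# ['UML design template', 'design checklist']
print([r.name for r in info_resources_from_type(document, "author")])
# ['document template']
```

For a product of the more specific type, the generic template is suppressed because a specialization of it is attached to the subtype; for a product of the general type, the generic template is still offered.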
In order to abstract from the fact that information resources were collected from the products, we extend the definition of the predicate useful_iResource by the additional rule shown in Figure 3.88.

/* */
FORALL ACT, AGT, IRS
  useful_iResource(ACT, AGT, IRS) :-
    useful_iResource_from_paras(ACT, AGT, IRS).

Fig. 3.88: Rule (5) for predicate useful_iResource(ACT, AGT, IRS). An information resource IRS is useful for an agent AGT during activity ACT, if an appropriate specification has been associated with the product types of the products that are assigned to an activity’s parameters.
Product Key Topics
In case the characterization of products allows for a specification of key topics, any information resources associated with these product-induced key topics should also be considered for recommendation. To that aim, we define an additional rule for the predicate has_key_topic, previously introduced in Figure 3.82; this new rule extends the set of an activity’s key topics by all key topics associated with the products handled during the activity (see Figure 3.89).

FORALL ACT, PAR, TOP
  has_key_topic(ACT, TOP) :-
    ACT[parameters ->> PAR] AND
    PAR.value[key_topics *->> TOP].

Fig. 3.89: Rule for predicate has_key_topic(ACT,TOP). Key topics that are used to characterize the products handled during an activity are considered to be key topics during the activity.
As a consequence, the key topics introduced by products are also considered when the presence of all topics required by an information resource’s constraints is tested (see Figure 3.84), because the corresponding rule also makes use of the predicate has_key_topic.
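The effect of the additional has_key_topic rule can be pictured as a simple set union. The following Python sketch assumes a dictionary-based representation of activities, parameters, and products that is invented for illustration; key topics inherited from product types (the "*->>" part of the rule) are not modelled.

```python
from typing import Dict, Set


def key_topics_of_activity(activity: Dict) -> Set[str]:
    """Key topics of the activity itself plus those of the products
    bound to its parameters."""
    topics = set(activity.get("key_topics", ()))
    for parameter in activity.get("parameters", ()):
        product = parameter.get("value")
        if product is not None:
            topics |= set(product.get("key_topics", ()))
    return topics


# Invented example data.
activity = {
    "key_topics": {"RMI"},
    "parameters": [
        {"name": "design", "value": {"key_topics": {"Serialization"}}},
        {"name": "test plan", "value": None},  # parameter without a product yet
    ],
}

print(sorted(key_topics_of_activity(activity)))  # ['RMI', 'Serialization']
```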
CHAPTER 4
Integrating POKM into a Process-Centred SEE
In this chapter, we show how the SE-POKM model presented in this thesis has been implemented in the form of a Process-Oriented Information Resource Management Environment and integrated into the MILOS PSEE.
In order to illustrate the SE-POKM model presented in this thesis, we have constructed a system, called the Process-Oriented Information Resource Management Environment (PRIME). PRIME is designed to be coupled with a Process-Centred Software Engineering Environment that the organization either is already using, or is willing to deploy (see Figure 4.1). Given an activity that appears on an agent’s to-do list managed by the PSEE, PRIME provides the agent with a set of information resources considered to be useful by the Process Group for this activity.

[Figure 4.1 is not reproduced here. Labels recoverable from the diagram: the PSEE (Process Model, Project Plans, WFE, Plan Enactment Data, To-Do List); within PRIME, the Characterization System (Characterization Modeler, Characterization Manager, Characterization Updater, Characterization Editor, Characterization KB), the Information Resource Manager, the Information Resource Retriever, the Information Resource KB, the Information Source Manager, the Information Assistant, the Information Assistant Monitor, and the Information Request Forum; internal and external information sources; and the actors Agent and KD member. Labelled interactions include change events, read, update request, iResources(act,agt), choose, launch queryCommand, and present retrieval result/document.]

Fig. 4.1: PRIME architecture.
Characterization
To facilitate the retrieval of these information resources, PRIME supports the characterization of activities and activity-related entities (e.g. products, tools, and technologies). Here, a characterization consists of two parts: (i) a classification (i.e. a mapping to one of the characterization classes), and (ii) an instance of that characterization class, called a characterization object. The characterization classes correspond to the F-Logic classes introduced in Section 3.2.3 (e.g. for process types, product types, etc.); they are mainly used to maintain type-specific information resources. The characterization objects correspond to F-Logic instances of those F-Logic classes; they are used to evaluate the constraints and query command expressions of information resources.

All information resources will be associated with characterization classes and objects that are maintained within PRIME, instead of being directly associated with entities defined within the PSEE. There are two reasons for this:
1. The representation of the entities provided by the PSEE will usually not be powerful enough to define information resource constraints such that recommendations can be restricted to useful information resources (e.g. available entity attributes or predicates defined on them are insufficient).
2. The modeling language provided by the PSEE may not cover all entities or constructs that are desirable for the organization and maintenance of information resources (e.g., modeling of technologies employed during an activity or type specialization may not be supported).

As far as possible, characterization classes should correspond to the model entities (e.g. process types, product types, resource types etc.) that have already been defined within the PSEE. However, additional characterization classes might have to be defined in PRIME, especially if the expressive power of the process modeling language provided by the PSEE does not cover a particular class of entities with which certain information resources need to be associated.

In the following, we will explain the main components of PRIME and their responsibilities:
Characterization Modeler: Allows users to define and maintain a hierarchy of characterization classes. Currently, we are using the commercially available tool tengerine (1) for this purpose (see Figure 4.2).
Fig. 4.2: Snapshot from tengerine. The left window panel shows a characterization class hierarchy, currently expanded to display the characterization classes for process types (subclasses of SEDMProcess) and product types (subclasses of SEDMProduct). SEDMProcess and SEDMProduct are the root classes for all user-defined process types and product types, respectively (2). The right window panel lists the attributes defined for the selected class Implement_SystemComponent.
(1) www.empolis.com
(2) SEDM: Software Engineering Domain Model

Characterization KB: Stores characterization classes and characterization objects. The characterization KB is logically partitioned into the following subdomains:
• Process KB stores characterization classes and objects that correspond to process model entities (process types, type decompositions etc.) and process instances (i.e. activities), respectively.
• Product KB stores characterization classes and objects that correspond to product model entities (product types) and product instances, respectively.
• Resource KB stores characterization classes and objects that correspond to resource model entities (resource types) and resource instances (e.g. agents, tools, technologies etc.), respectively.
Currently, PRIME uses the commercially available tool ORENGE (3), a middleware for object-oriented, case-based systems, to store characterization classes and objects.

Characterization Manager: Allows users to define and maintain mappings from PSEE entities to corresponding characterization entities. PRIME distinguishes between two kinds of mappings:
1. Model entities (e.g. process types, product types, type decompositions etc.) are mapped to their corresponding characterization classes.
2. Activities and activity-related entities are mapped to their corresponding characterization objects.
Mappings of the first kind have to be established manually; Figure 4.3 shows a snapshot from the Characterization Manager’s user interface to define them.
Fig. 4.3: Snapshot from the Characterization Manager interface to define mappings between model entities defined in the PSEE and PRIME characterization classes. PSEE model entities appear in the tree displayed on the left-hand side. For a selected entity, an appropriate characterization class can be chosen from the combo-box.
(3) www.empolis.com

Mappings of the second kind are created automatically by the Characterization Updater (see below). These mappings need to preserve any specialization and instance-of relationships defined in the PSEE, i.e.: if actPE is an instance of process type ptPE within the PSEE, and actPE and ptPE are mapped to the characterization object actChar and class ptChar, respectively, then actChar has to be an instance of ptChar (or of a subclass of ptChar) within PRIME.

Characterization Updater: Listens to change events/messages sent by the PSEE (e.g. to signal activity creation, scheduling changes, state changes etc.) that need to be reflected in the characterizations; the component updates the corresponding characterization objects of the modified PSEE entities.

Information Resource Manager: Allows users to define and maintain information resources that are associated with characterization classes and characterization objects (see Figure 4.4).
Fig. 4.4: Snapshot from the Information Resource Manager interface to define information needs. From the tree in the upper-left part of the window, the user has selected the process object Implement_SystemComponent from a process model. The tree in the upper-right part displays the associated information needs, grouped under their categories. The attribute values of the selected information need are shown in the lower part of the window.
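Returning to the Characterization Updater and the requirement that automatically created mappings preserve instance-of relationships, the following Python sketch shows one way such a component could behave. It is a sketch under assumptions, not the PRIME implementation: the event dictionaries, their keys ("kind", "entity_id", "type_id", "attributes"), the identifiers, and the classes CharClass, CharObject, and CharacterizationUpdater are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CharClass:
    name: str
    parent: Optional["CharClass"] = None

    def is_subclass_of(self, other: "CharClass") -> bool:
        """Reflexive subclass test along the single-parent hierarchy."""
        current = self
        while current is not None:
            if current is other:
                return True
            current = current.parent
        return False


@dataclass
class CharObject:
    char_class: CharClass
    attributes: Dict[str, object] = field(default_factory=dict)


class CharacterizationUpdater:
    """Keeps characterization objects in step with (hypothetical) PSEE change events."""

    def __init__(self, class_mapping: Dict[str, CharClass]):
        # PSEE model entity id -> characterization class (defined manually)
        self.class_mapping = class_mapping
        # PSEE entity id -> characterization object (created automatically)
        self.object_mapping: Dict[str, CharObject] = {}

    def on_event(self, event: Dict) -> None:
        entity_id = event["entity_id"]
        if event["kind"] == "created":
            char_class = self.class_mapping[event["type_id"]]
            attributes = dict(event.get("attributes", {}))
            self.object_mapping[entity_id] = CharObject(char_class, attributes)
        elif event["kind"] == "changed":
            self.object_mapping[entity_id].attributes.update(event["attributes"])

    def preserves_instance_of(self, entity_id: str, type_id: str) -> bool:
        """If actPE is an instance of ptPE, actChar must be an instance of
        ptChar or of one of its subclasses."""
        act_char = self.object_mapping[entity_id]
        return act_char.char_class.is_subclass_of(self.class_mapping[type_id])


# Invented example data.
sedm_process = CharClass("SEDMProcess")
implement = CharClass("Implement_SystemComponent", parent=sedm_process)
updater = CharacterizationUpdater({"pt-implement": implement})

updater.on_event({"kind": "created", "entity_id": "act-42",
                  "type_id": "pt-implement", "attributes": {"state": "planned"}})
updater.on_event({"kind": "changed", "entity_id": "act-42",
                  "attributes": {"state": "started"}})

print(updater.object_mapping["act-42"].attributes)              # {'state': 'started'}
print(updater.preserves_instance_of("act-42", "pt-implement"))  # True
```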
Information Resource KB: Stores the information resources defined via the Information Resource Manager.

Information Resource Retriever: Retrieves the set of useful information resources for a given activity and agent characterization, i.e.: let actChar be the characterization object of an activity actPE within the PSEE; let agtChar be the characterization object of an agent agtPE within the PSEE. Then the component retrieves the set of information resources iResources(actChar, agtChar) considered to be useful for agent agtPE during activity actPE. The retrieval result is displayed to the agent agtPE via his personal Information Assistant (see below).

Information Assistant (IA): Interacts with a PSEE user via a graphical interface; each user is provided with his personal IA that he can launch for a selected activity on his to-do list (4). In particular, the IA:
• triggers the Information Resource Retriever to determine the set of information resources iResources(actChar, agtChar), where actChar and agtChar are the characterization objects of the selected activity actPE and user agtPE, respectively. The IA obtains these characterization objects from the Characterization Manager.
• displays the set of information resources returned by the Information Resource Retriever for an activity that the agent selected on his to-do list;
• allows the agent to browse this set of information source recommendations and expected information needs (as well as their sub-information needs);
• on demand, accesses the information source referenced by a selected (usage) recommendation. This yields a set of retrieved knowledge items which is presented to the agent (typically via a web browser);
• allows the agent to maintain personal, activity-specific lists of information items (i.e. bookmarks) that he considers to be useful. From these lists, the agent can recommend items for capture and/or association with a particular model entity to the Knowledge Department;
• allows the agent to post a question or feedback on an offered information resource to the Knowledge Department with regard to his current activity. This starts an activity-specific communication thread that is maintained within the Information Request and Feedback Forum (see below).

(4) For workflow engines that conform to the WFMC standard [Hol95], the IA can be launched via the engine’s tool interface, with the selected activity and the current user as parameters.
Figure 4.5 depicts a snapshot from an agent’s Information Assistant interface.
Fig. 4.5: Snapshot from an agent’s IA interface for activity "Implement ECA rule editor": the left-hand panel lists the agent’s personal, activity-specific information items, as well as questions posted to the Knowledge Department during the activity. The right-hand side lists the information resources retrieved by the Information Resource Retriever.
Information Request and Feedback Forum (IRFF): maintains the activity-specific requests for information and feedback comments on information resources that agents can post to the KD via their Information Assistant (see Figure 4.6).
Fig. 4.6: Posting an information need. The Information Assistant (a) allows agents to post their request (b) directly to a forum (c). For each task, a message forum is provided that maintains the agents’ information requests and the replies posted by colleagues. The agents’ requests are posted to the corresponding forum by the Information Assistant, after having been extended by a link to the activity during which the information need occurred.
Currently, we are using the commercially available discussion forum software Jive (5) as the platform for our IRFF. Knowledge Brokers are alerted to incoming new requests and react to these requests by posting a reply that suggests one or more information items, or information source usage recommendations (see Figure 4.7).
(5) http://www.jivesoftware.com/
Fig. 4.7: Accessing the thread for a posted information need. The Information Assistant (a) maintains a link to the request posted by the agent in the context of a particular task (cf. Figure 4.6). Thus, the agent is provided with direct access to the communication thread (b), i.e. to answers posted by colleagues.
Alternatively, Knowledge Brokers can ask the agent for more details concerning the agent’s current request; this communication is maintained for later analysis by the KD (see Chapter 5).

Characterization Editor: allows agents to browse and edit the characterization objects for PSEE entities that characterize activities on their to-do lists (see Figure 4.8). In addition, agents can use the Characterization Editor to edit their personal characterization, i.e. their skill profile. The class and attribute values specified for the characterization objects of activity and activity-related entities directly influence the set of information resources determined by the Information Resource Retriever. Thus, an appropriate characterization of activities and agents is essential for offering useful information resources to the agent during his activities.
Fig. 4.8: Snapshot from the Characterization Editor, opened for the implementation activity example from Figure 3.50. The user has specified Serialization and RMI as the activity’s key topics by selecting corresponding entities from the domain ontology.
Information Assistant Monitor: allows members of the KD to monitor the set of information resources offered by the IA of a certain agent for his activities. This provides KD members with the means to verify whether certain information resources are being offered to the agent; this is necessary whenever the agent posts a question to the KD which it considers should already have been covered by a corresponding information resource.

Information Source Manager: allows members of the KD to define and maintain wrappers for all information sources available to the organization (see Figure 4.9). Each wrapper encapsulates an information source and must provide the means to access it, i.e. the wrapper is responsible for establishing a connection to the information source as well as for sending an executable query command to it via one of its interfaces.
Fig. 4.9: Snapshot from the Information Source Manager. The selected information source ’Sun Java Domain’ is accessible via a CGI script that can be accessed via the URL specified in the attribute ’Query Execution Access’.
An information source wrapper is represented by the following attributes:
• representation: specifies the name of the information source
• interface type: determines how queries are sent to the information source automatically
• query engine access: specifies a URL that provides users with access to the information source for browsing or launching queries manually
• query execution access: specifies a URL of the interface provided by the information source for automatic query execution
• example query: an example query command
• query syntax: specifies the query language supported by the interface.
When an agent issues a "Show" command on a selected IS usage recommendation (see Figure 4.5), its queryCommand is evaluated in the current activity context, i.e. the variables contained therein are replaced by appropriate characterization object attribute values. This evaluation depends on the wrapper of the information source concerned, as the source’s query syntax typically requires data in a fixed format (e.g. the date format might differ from the one used within characterization objects). Further details on the wrapper implementation can be found in [Kön00].
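The evaluation of a queryCommand against a characterization object, including source-specific value formatting, can be sketched in Python as follows. The sketch makes several assumptions that are not taken from PRIME or [Kön00]: variables are written in the "$name" style of Python's string.Template, the wrapper is reduced to two of its attributes plus a table of per-type formatters, and the URL, the "q" request parameter, and all example values are invented.

```python
from dataclasses import dataclass, field
from datetime import date
from string import Template
from typing import Callable, Dict
from urllib.parse import quote_plus


@dataclass
class SourceWrapper:
    representation: str            # name of the information source
    query_execution_access: str    # URL of the automatic query interface
    # Source-specific formatting per value type (e.g. the date format).
    formatters: Dict[type, Callable[[object], str]] = field(default_factory=dict)

    def format_value(self, value: object) -> str:
        formatter = self.formatters.get(type(value), str)
        return formatter(value)

    def evaluate(self, query_command: str, characterization: Dict[str, object]) -> str:
        """Replace the variables in the query command by formatted attribute values."""
        values = {name: self.format_value(v) for name, v in characterization.items()}
        return Template(query_command).substitute(values)

    def request_url(self, query_command: str, characterization: Dict[str, object]) -> str:
        """Build the HTTP request URL; the 'q' parameter name is an assumption."""
        query = self.evaluate(query_command, characterization)
        return self.query_execution_access + "?q=" + quote_plus(query)


# Invented example data.
wrapper = SourceWrapper(
    representation="Example source",
    query_execution_access="http://example.org/cgi-bin/search",
    formatters={date: lambda d: d.strftime("%d.%m.%Y")},
)
act_char = {"key_topic": "RMI", "start": date(2002, 11, 14)}

print(wrapper.evaluate("$key_topic tutorial since $start", act_char))
# RMI tutorial since 14.11.2002
print(wrapper.request_url("$key_topic tutorial since $start", act_char))
# http://example.org/cgi-bin/search?q=RMI+tutorial+since+14.11.2002
```

Keeping the formatters inside the wrapper reflects the point made above: the same characterization attribute may have to be rendered differently for each information source.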
As a proof of concept, PRIME has been implemented and coupled with the MILOS PSEE [MDB+00]. MILOS has been chosen for the following main reasons:
• It clearly separates process modeling, planning, and enactment, as proposed by [GJ96].
• Its modeling language ranks relatively low in expressive power when compared to other recent languages like Spearmint [BW98] or E3 [JPL98]; e.g. it does not provide means to specify user-defined attributes or product types. This illustrates the necessity of providing a separate characterization vocabulary, and allows a clear distinction between PSEE functionality and PRIME functionality. As a consequence, PRIME has been designed to make only a few assumptions on the PSEE’s modeling language, helping us to ensure that PRIME’s design will allow it to be coupled with other PSEEs in the future.
• MILOS supports the planning and enactment of weakly-structured, dynamic processes, allows on-the-fly plan changes and provides a powerful change notification mechanism. Thus, PRIME complements the existing, ’internal’ information delivery concept with a new, compatible paradigm.

4.1 MILOS
The MILOS system (6) provides software organizations with the means to define generic process models and to set up concrete project plans based on these models. Furthermore, the MILOS workflow engine supports the enactment of project plans, providing team members with individual to-do lists and relevant documents. During plan enactment, MILOS allows on-the-fly plan changes, notifying team members affected by those changes and updating the workflow engine accordingly.

(6) MILOS (Minimally Invasive Long-term Organizational Support) [MDB+00] is a PSEE developed at the University of Kaiserslautern (Germany) and the University of Calgary (Canada).

Process Model
A process model in MILOS consists of a set of process types and decomposition relationships between these types. More specifically, each process type is specified by the following aspects:
1. a textual goal description,
2. a list of input, output and modify parameters which represent the products that are consumed, created or altered during instances of the process type (i.e. activities),
3. entry/exit conditions that must hold for an instance of the process type before its execution can be started or finished, respectively,
4. a set of references to predefined documents (e.g. guidelines, manuals, checklists etc.), and
5. a set of alternative methods; each method represents a process type decomposition that describes a way to execute instances of the process type.
Figure 4.10 shows a snapshot from the MILOS Process Model Editor; for a more detailed description of the process modeling language provided by MILOS, see [VBM+96].
Fig. 4.10: MILOS Process Modeling Editors: the window labeled "Process Model" lists the process types and the methods defined for them. For the process type "Component Development", two methods have been defined that correspond to the type decomposition example from Figure 3.7. A detail window has been opened for process type "Implement Component", showing the type’s input parameters.
Project Plan
In MILOS, a project plan consists of a set of process instances (activities) and decomposition relationships between these activities. An activity can be declared to be an instance of a certain process type defined in the process model. In that case, the process type serves as a template for the activity in that:
• the type’s parameter declarations are added to the activity’s parameter sets,
• the type’s entry/exit conditions as well as the references to predefined documents are transferred to the activity,
• during planning, the (human) planner can select one of the type’s alternative methods as a possible decomposition of the activity. The planner is free to decide whether to choose from this set, or to define a new decomposition. However, selecting one of the type’s methods results in the automatic creation of all sub-activities defined by the method (including the document flow between them).
In addition to the attributes obtained from its type, each activity specifies aspects concerning scheduling (earliest/latest start/end dates etc.), resource assignments and state information (e.g. "planned", "started", "finished" etc.); also, an activity can reference its decomposition into subactivities. That way, a planner can select process types and type decompositions from the generic process model as templates for (parts of) a project plan, and tailor them to the needs of the concrete project. Figure 4.11 shows snapshots from the MILOS Project Plan Editor.
Fig. 4.11: MILOS Project Plan Editors: For the activity ’Implement ECA rule editor’ of type ’Implement Component’, the panes for activity scheduling, agent assignment, and specifying a list of input parameters are shown.
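The way a process type serves as a template for an activity, as described above for MILOS project plans, can be illustrated by the following Python sketch. It is a simplification under assumptions and not the MILOS data model: the classes ProcessType and Activity, the method name "design-then-implement", the sub-process type "Design Component", and the example parameters and documents are invented, and the document flow between sub-activities is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProcessType:
    name: str
    parameters: List[str] = field(default_factory=list)
    entry_condition: str = ""
    exit_condition: str = ""
    documents: List[str] = field(default_factory=list)
    # method name -> process types of the sub-activities it defines
    methods: Dict[str, List["ProcessType"]] = field(default_factory=dict)


@dataclass
class Activity:
    name: str
    process_type: Optional[ProcessType] = None
    parameters: List[str] = field(default_factory=list)
    entry_condition: str = ""
    exit_condition: str = ""
    documents: List[str] = field(default_factory=list)
    state: str = "planned"
    sub_activities: List["Activity"] = field(default_factory=list)


def instantiate(name: str, process_type: ProcessType) -> Activity:
    """Create an activity and copy the template information from its type."""
    return Activity(
        name=name,
        process_type=process_type,
        parameters=list(process_type.parameters),
        entry_condition=process_type.entry_condition,
        exit_condition=process_type.exit_condition,
        documents=list(process_type.documents),
    )


def apply_method(activity: Activity, method_name: str) -> None:
    """Selecting a method creates all sub-activities the method defines."""
    for sub_type in activity.process_type.methods[method_name]:
        activity.sub_activities.append(instantiate(sub_type.name, sub_type))


# Invented example data.
design = ProcessType("Design Component")
implement = ProcessType("Implement Component",
                        parameters=["component design"],
                        documents=["coding guidelines"])
development = ProcessType("Component Development",
                          methods={"design-then-implement": [design, implement]})

root = instantiate("Develop ECA rule editor", development)
apply_method(root, "design-then-implement")
print([a.name for a in root.sub_activities])
# ['Design Component', 'Implement Component']
```

Copying the parameter and document lists, rather than sharing them with the type, mirrors the idea that a planner can tailor an activity without changing the generic process model.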
Workflow Engine
The MILOS workflow engine interprets project plans in order to actively guide human users in their work. It manages the state of the activities contained in the plan, individual to-do lists for users, the products created during activity enactment, and traceability relationships between activities and products [Del00]. Figure 4.12 shows snapshots from a to-do list provided by the MILOS workflow engine.
Fig. 4.12: MILOS to-do list of team member "Barbara".
4.2 PRIME
PRIME has to be tailored to the particular PSEE whose users are to be provided with Information Assistants. We have conducted this tailoring for MILOS by performing the following steps:

I.) Technical set-up
1. Using the Characterization Modeler, create base characterization classes for activities and activity-related entities that the PSEE presents on the users’ to-do lists (and plan editors). A characterization class should be created for all entity types whose presence is likely to introduce or trigger information needs. The attributes defined in each characterization class are induced from the attributes provided by the PSEE for these entities; they constitute the base set of variables that can be referenced in constraints and query command expressions of information resources. For MILOS, we have created characterization classes for activities, products, methods, and agents (see Figure 4.13); they form the basis of the organization- and PSEE-specific Software Engineering Domain Model (SEDM), that has to be maintained by the KD.
Fig. 4.13: UML diagram of the basis SEDM created for MILOS (adapted from [May01]). The class SEDMMILOSElement is the root class of all MILOS-specific characterization classes: SEDMProcess for activities, SEDMProduct for products etc. The additional layer of SEDM<entityType>Base classes has been introduced for technical reasons to specify all MILOS-specific attributes for the characterization classes.
2. Provide read-access on the PSEE’s process model data to PRIME’s Characterization Manager. In order to facilitate the definition of mappings between model entities and characterization classes, the Characterization Manager needs to have read-access to the set of model entities defined in the PSEE. Technically, the Manager needs to be provided with a unique identifier for each model entity that fulfils the following property:
• whenever an activity or activity-related entity actPE in the PSEE references a model entity as an indication of its type, then PRIME can determine that type’s unique identifier from actPE.
This property is required by the Characterization Updater during the initial creation of characterization objects (see below). Ideally, the unique identifier can also be used to determine an appropriate graphical/textual representation in order to display the PSEE’s model entities within the Characterization Manager (see Figure 4.3).
3. Establish a connection from the PSEE to PRIME’s Characterization Updater. This connection has to provide the following functionality:
• it has to ensure that for every activity or activity-related entity that is created within the PSEE, a corresponding characterization object is generated and mapped within PRIME.
• it has to ensure that changes to activities and activity-related entities that occur within the PSEE trigger updates to the corresponding characterization objects maintained within PRIME.
For MILOS, this connection has been realized by registering the Characterization Updater as an observer of the MILOS system’s Audit Manager [Sau01].
4. Provide activity-specific access to the Information Assistant from the to-do lists (and plan editors) provided by the PSEE. The PSEE has to be configured in a way such that it allows PSEE users to launch an IA for selected activities. For activities that appear on an agent’s to-do list, PSEEs typically facilitate activity-specific access to external tools via a tool interface (as defined by the WFMC’s reference model [Hol95]). That is how we provide agents with access to the IA from MILOS to-do lists (see Figure 4.14).
Fig. 4.14: MILOS to-do list with access to the IA.
However, as we have seen from the examples in Section 3.2, information needs arise for both project planners and enactors during their activities; therefore, an Information Assistant that provides planners with useful information resources usually is highly desirable. Yet, the activities to be performed by a project planner (e.g. resource assignment, scheduling, decomposition of activities) typically are not maintained on to-do lists. Since the WFMC reference model does not define a standard for tool integration during build-time, access to the IA for selected activities from the plan editors provided by the PSEE might require code adaptation of the PSEE. For MILOS, the plan editor component has been modified accordingly (see Figure 4.15).
Fig. 4.15: MILOS Plan Editor, modified to facilitate access to the IA.
5. Provide read access to the information systems available in the organization. In order to automatically launch query commands in an information source as specified in information needs, PRIME currently requires the information sources to be accessible via HTTP requests. Hence, any information system that has to perform a search needs to provide an interface (e.g. a CGI script) to which PRIME can send the query via an HTTP request.
After the technical set-up has been successfully completed, tailoring proceeds with "content set-up", i.e. knowledge engineering steps.
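The requirement in step 5 can be illustrated with a minimal sketch of how a query might be turned into an HTTP request address for a search interface such as a CGI script. The host name, the script path and the parameter names (`keywords`, `max`) are assumptions made for illustration only; the text does not specify how PRIME encodes its queries.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryUrlSketch {

    /** Appends URL-encoded query parameters to the address of a search interface. */
    static String buildQueryUrl(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = baseUrl.contains("?") ? '&' : '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey()))
               .append('=')
               .append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical information source and parameter names.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("keywords", "EJB Tutorial");
        params.put("max", "10");
        System.out.println(
            buildQueryUrl("http://skills.example.org/cgi-bin/search", params));
        // prints http://skills.example.org/cgi-bin/search?keywords=EJB+Tutorial&max=10
    }
}
```

The resulting address could then be handed to a standard web browser, which is where query results are assumed to be displayed.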
II.) Knowledge Engineering
1. Using the SEDM created in step (I.1) as a basis, define appropriate characterization classes for each process type, product type, or other model entity. Process types that have already been defined in the PSEE provide a natural starting point for classifying and/or characterizing activities: they identify certain classes of activities and describe how they should be performed, albeit typically on a high level of abstraction. Also, it can be assumed that the process types already captured in the process model represent classes of activities that are of prime importance to the organization, since the Process Group made the effort to capture them. The snapshot from tengerine (see Figure 4.2) shows an excerpt of the characterization class hierarchy that corresponds to the process model example depicted in Figure 4.10. The attributes defined in the classes are derived from the activity characteristics referred to by constraints and query command expressions of information resources that are elicited from the PG (see below).
2. Using the Characterization Manager, define a mapping from the PSEE’s model entities to corresponding characterization classes maintained in PRIME. For MILOS, we have to define the following mappings:
• process types to (subclasses of) SEDMProcess
• process parameters to (subclasses of) SEDMProduct
• methods to (subclasses of) SEDMMethod
Figure 4.16 sketches an example of the mappings for a process type; in PRIME, these mappings can be defined via the Characterization Manager (see Figure 4.3).
3. Specify relevant information sources. Using the Information Source Manager (see Figure 4.9), define the set of information sources that is available to the organization (7), and which either should be consulted by PSEE users during their activities, or can be accessed in order to satisfy information needs that typically arise during certain activities.
(7) Currently, PRIME requires that information sources are accessible via a URL.
[Figure 4.16: diagram. PRIME side: characterization classes SEDMProcess, Implementation Process, Migration Process, Testing Process, Document, Design Document, EJB-Design Document, DJB-Design Document, Requirements Document, SourceCode. MILOS side: process type Implementation Process with the parameters reqdoc: Requirements Document, desdoc: Design Document, codedoc: Code Document.]
Fig. 4.16: Mappings from a process type and its parameter definition within MILOS to corresponding characterization classes within PRIME.
4. Formalize captured information resources and associate them with appropriate characterization classes. Using the Information Resource Manager (see Figure 4.4), members of the KD need to formalize the information resources elicited from the PG. As described in Section 3.2, the information resources should be associated with those model entities (more precisely: the characterization classes of those entities) whose instances should activate them.
4.2.1 System Status
Most of the concepts presented in this thesis either have been implemented in PRIME, or have been realized by an integration of third-party tools (8). An exception is the definition of specialization relations on information resources. Furthermore, characterization classes and objects, as well as information resources, are represented in PRIME as Java objects rather than F-Logic declarations. In order to specify information resource constraints, we use standard Java expressions, with several extensions that allow us to reference characterization objects; these expressions are evaluated using Java reflection.
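To make the idea of reflection-based constraint evaluation more concrete, the following sketch checks a single equality constraint against a characterization object by looking up a bean-style getter at run time. It is a strongly simplified illustration, not PRIME's expression language: the class `ImplementationActivity`, the attribute `programmingLanguage` and the helper `attributeEquals` are hypothetical names introduced here.

```java
import java.lang.reflect.Method;

final class ReflectionConstraintSketch {

    /** Hypothetical characterization object for an implementation activity. */
    public static final class ImplementationActivity {
        private final String programmingLanguage;

        public ImplementationActivity(String programmingLanguage) {
            this.programmingLanguage = programmingLanguage;
        }

        public String getProgrammingLanguage() {
            return programmingLanguage;
        }
    }

    /** Returns true if the getter for the named attribute yields a value equal to expected. */
    static boolean attributeEquals(Object characterization, String attribute, Object expected) {
        String getter = "get" + Character.toUpperCase(attribute.charAt(0))
                + attribute.substring(1);
        try {
            Method m = characterization.getClass().getMethod(getter);
            return expected.equals(m.invoke(characterization));
        } catch (ReflectiveOperationException e) {
            // Assumption: an attribute that cannot be read does not satisfy the constraint.
            return false;
        }
    }

    public static void main(String[] args) {
        Object activity = new ImplementationActivity("Java");
        System.out.println(attributeEquals(activity, "programmingLanguage", "Java"));
        System.out.println(attributeEquals(activity, "estimatedDuration", "2 years"));
    }
}
```

Because attributes are resolved by name at run time, a constraint of this kind can be written without compile-time knowledge of the characterization class it will be evaluated against.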
(8) See the descriptions provided above for the Characterization Modeler and the Information Request and Feedback Forum.
As mentioned above, the current realization of PRIME requires that information sources are accessible via a URL; query results are assumed to be displayed in a standard web browser. PRIME is implemented in Java as a Distributed Java Beans (DJB) application, using the OODBMS GemStone (9) for client/server communication and persistence services.
4.3 Example
In the following, we illustrate how we set up PRIME for MILOS for use in our research group environment; we present an example scenario motivated by our own project's software development process. Information sources that are used in this context and can be accessed by PRIME include:
• an activity case-base from which "similar" former activities (or projects) can be retrieved via Orenge
• a local bug-tracking system (10)
• an Experience Base maintained at our University [Fel00]
• various newsgroups dedicated to software tools used in our research group
• standard web search engines (e.g. Google (11), AltaVista (12))
In our example, the project planner starts the MILOS PSEE to set up a new project, i.e. to define a project plan. Using the Characterization Editor, he provides a characterization for his current project by specifying the project’s name ('Distributed workflow management system'), the system architecture type ('Distributed'), and the estimated duration ('2 years'). Before creating a new project plan, he tries to remember similar projects conducted in the past that might help him with his planning. Since the Information Assistant (integrated into the MILOS PSEE) offers a corresponding expected information need "Which similar projects exist?", he selects the information source usage recommendation defined for this information need, and asks the IA to launch the predefined query. The retrieval results are presented to him in a browser, listing similar MILOS project plans which are stored in the activity case-base (see [May01] for details).
(9) www.gemstone.com
(10) based on Bugzilla (http://www.mozilla.org/projects/bugzilla/)
(11) www.google.com
(12) www.altavista.com
Inspecting the plan for the most similar project, he can see that this plan used the process model 'Development of distributed systems' from the MILOS process model. Hence, he browses the process model and selects this process model as a basis for his project plan. As a result, the processes from the process model now define new tasks (13) within the project plan. Besides the specification of these tasks, their parameters, and the possible methods, the corresponding characterization objects for project plan entities refer to the information resources via their characterization classes. The chosen process model maintains knowledge about the development method 'Iterative enhancement', which the project planner thinks is appropriate especially because of the long project duration and the component-oriented architecture.
Next, the planner wants to make a rough estimate of the time effort required for the task 'Requirements analysis'. This can be done by consulting a quality model that describes the effort distribution with respect to process steps in the chosen process model. Using the Information Assistant (see Figure 4.17), he accesses the information associated with the question "What is the effort distribution when using Iterative Enhancement?" which is listed in the information category 'Project scheduling'. A corresponding diagram retrieved from the experience base for quality models is displayed in a window as shown in Figure 4.17. According to the effort distribution and the estimated total time of 2 years for his project, he can now provide rough estimates for all tasks.
Furthermore, the planner has to assign agents to each of the tasks specified in the project plan. Let us assume that he wants to do this for a design task for which he has not yet chosen a method. Existing alternatives are 'UML' and 'SDL'. He accesses the agent assignment UI within the MILOS project planner component for the design task.
(13) In the following, we will use the terms 'task' and 'activity' synonymously.
As the project planner is fairly new to his job and the department, he requires information on whether any agents with experience in one of these methods are working in the department. The Information Assistant provides an information resource within the information category 'agent assignment' which models this information need, represented as "Which agents have experience with 'UML' or 'SDL'?". The question is coupled with a retrieval on a skill database for agents, which yields a list of agents with the required skills. He can see that agents with experience in either method are employed in the department. He decides to use 'UML', as he has personally used this method before and is convinced of its quality. Due to this decision, his current work context changes, as the method decision for the design task gets the value 'UML'. This is propagated to the Information Assistant, which updates the planner’s list of information resources. The general question about experienced agents for both methods is no longer required, as a decision has been made.
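The effect of such a context change on the list of offered information resources can be sketched as follows. This is only an illustration under stated assumptions: the context is reduced to a simple attribute map, the attribute name `method` and both constraints are hypothetical, and the second question is a paraphrase rather than a resource defined in PRIME.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class ContextUpdateSketch {

    /** Questions mapped to the (hypothetical) constraint under which they are offered. */
    static final Map<String, Predicate<Map<String, String>>> RESOURCES = new LinkedHashMap<>();
    static {
        // Offered only while no design method has been chosen.
        RESOURCES.put("Which agents have experience with 'UML' or 'SDL'?",
                ctx -> !ctx.containsKey("method"));
        // Offered once 'UML' has been chosen as the design method.
        RESOURCES.put("Which agents with 'UML' experience are available?",
                ctx -> "UML".equals(ctx.get("method")));
    }

    /** Re-evaluates all constraints against the current work context. */
    static List<String> offered(Map<String, String> context) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Predicate<Map<String, String>>> r : RESOURCES.entrySet()) {
            if (r.getValue().test(context)) {
                result.add(r.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> context = new HashMap<>();
        System.out.println(offered(context)); // before the method decision
        context.put("method", "UML");
        System.out.println(offered(context)); // after the method decision
    }
}
```

Re-running `offered` after every change to the context is the simplest possible update strategy; an observer-based notification would avoid re-evaluating constraints that cannot have changed.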
Fig. 4.17: PRIME usage scenario during planning. The project planner launches the Information Assistant from the scheduling tab within the MILOS project planner to get support in his scheduling task (1). The IA presents him with the currently relevant and available information resources, and he selects the question "What is the effort distribution for the process model 'Development of distributed systems'?" (2). After he issues the "Show" command for a corresponding IS usage recommendation, a browser opens on an effort distribution diagram (3).
Since he has already provided start and end dates for the design task, he is now interested in agents who have experience with 'UML' and are available in the given time frame. A corresponding expected information need offered by the Information Assistant facilitates querying the schedule database for available agents, checking only those agents that are known to have UML skills.
The project planner knows from formerly planned projects that a similar design methodology is 'OMT'. Apart from the planning for the current project, he is interested in whether any agents in the new department have experience with this method. As 'OMT' is not a defined method in the process model he has received from the PM experience base, the Information Assistant does not offer support for this question. But as the retrieval for 'UML' and 'SDL' has been done on a skill database, he uses the Assistant to access the query engine interface of this information source. Now he defines his own query to search for agents with 'OMT' experience and launches it. As he thinks this information need might occur again in the future (maybe the 'OMT' method can even be modeled as an additional method in the process model), he posts a corresponding modeling request to the Information Request and Feedback Forum. As with the agent assignment for the 'Design' task, the project planner proceeds with the remaining tasks in the project plan.
Agents that participate in the project can access their individual workspace provided by the MILOS workflow engine. During project planning, the project planner has assigned a task 'Implement ECA rule editor' to the agent 'Barbara' and, accordingly, this task appears in her to-do list (see Figure 4.18(a)).
[Figure 4.18: screenshots in three panels (a), (b) and (c), with callouts 1 and 2.]
Fig. 4.18: Query execution for an information need to find an EJB tutorial.
When she starts working on the task 'Implement ECA rule editor', she runs into a problem while trying to implement an EJB. Using the Characterization Editor, she specifies 'EJB' as one of the key topics for her current task. She launches the Information Assistant (Figure 4.18(b)) and identifies her question "Where can I find a tutorial on EJB?" in the presented list, which is exactly what she is looking for. When she issues the "Show" command on a corresponding IS usage recommendation (Figure 4.18(2)), a browser opens (Figure 4.18(c)) and presents her with a list of links which have been retrieved from the Javasoft Developer Domain using "EJB" and "Tutorial" as search keywords. She follows the hyperlink to the EJB tutorial, in order to refresh her knowledge while following the tutorial steps.
A further look in the Information Assistant window guides her to the question of available experts on EJB technology. She did not even know before that any EJB experts were available in her group, but the information sources associated with the information need point them out to her. However, before she contacts them, she browses the information resources offered to her for any links to documented experience with EJB implementation. As it seems, either no documented experience concerning Java/EJB implementations exists, or information needs requesting this documented experience have not yet been captured. Therefore, she uses the posting functionality of the Information Assistant to post a new information request "Which experience has been documented for Java/EJB implementation?" to the Information Request and Feedback Forum. The Information Assistant adds this new information need to the list of personal information resources for task 'Implement ECA rule editor'.
[...] experience is documented related to that, but seemingly these documents exist as otherwise this information need would not appear in the Information Assistant. Since there are documented experiences, there must also be agents who have made these experiences. As she cannot find a corresponding information need in the presented list of offered information resources, she uses [...]
In summary, besides having used the provided information need support via the execution of modeled information resources within the Information Assistant, the project planner has posted at least one information need modeling request about the 'OMT' method, and Barbara has posted an information request which has been associated as a personal information need with the task's characterization object. The definition of appropriate information resources that reflect these requests lies in the responsibility of the Knowledge Department; the following chapter will discuss this evolution of meta-knowledge in more detail.
CHAPTER 5
Process-Oriented Knowledge Evolution
This chapter outlines a complete knowledge utilization cycle, covering incremental knowledge evolution based on continuous user feedback. Such feedback is created via activity-specific forums, and evaluated by the Knowledge Department.
In order to obtain first-hand data on what knowledge is actually needed during concrete activities, access to and requests for certain knowledge have to be traced and maintained. Therefore, a feedback loop has to be established that completes the cycle through the five KM phases (see Section 2.1). This feedback loop has to facilitate the evolution of the knowledge stored, aiming at an organizational learning process.
[Figure 5.1: cycle diagram. Phases: (1) Process Modeling, (2) Project Planning, (3) Plan Enactment, (4) Analyze Project Trace, Knowledge Requests and Feedback, (5) Information Resource Modeling, (6) Provision of Information Resources. Further labels: Experience Packaging, Project Support Environment, "basis for", "requests for information", "suggest refinements", "initiate remodeling". Legend: phase performed by the KD; phase in which the KD participates.]
Fig. 5.1: Knowledge Utilization and Evolution Cycle: capture happens during the analysis phase, based on data and feedback from agents collected during project planning and enactment (e.g. in the form of posted questions to the KD). Organization and formalization happen during experience packaging. Distribution happens when planners make use of process types, and through the provision of information resources during activities.
In order to support a continuous organizational learning process, the POKM model presented here tightly integrates the four knowledge management phases with process modeling, project planning, and enactment support. More precisely, evolution is integrated into the feedback loop operated by the Process Group in the presence of a process-centred SE environment (see Figure 5.1).
PG-operated feedback loop
The feedback loop operated by the PG consists of the phases (1) through (4) (see footnote 1). Process modeling (phase 1) is concerned with the capture, organization and formalization of best practices for software development in the company. The resulting process model serves as a template library from which project planners can select processes and decompositions to create an initial project plan (phase 2). Typically, this tailoring of a process model to specific project characteristics also includes scheduling and resource assignments. The project plan is the basis for plan enactment, usually supported by a workflow engine (phase 3). In general, enactment has to be interleaved with project planning to accommodate intermediate workflow results (i.e. specific product characteristics), deviations, unanticipated problems, environmental changes, etc. At any time during or after project enactment, members of the PG can analyze a project trace in order to identify reusable best practices (i.e. process models) from the current project plan (phase 4); these will have to be integrated into the generic process model (loop into phase 1).
KD-operated feedback loop
The feedback loop operated by the Knowledge Department partly overlaps with the PG’s feedback loop: in phase 5, it uses the process model generated in phase 1 as the basis to capture and model information resources that should be associated with certain process model elements (see Section 3.2.1). The PG and KD have a different focus on the process model:
• the PG focuses on the composition of processes, their ordering, and the document flow between them.
• the KD has a fine-granular focus on the individual activities: for each process type, it tries to elicit from the PG what information resources might be useful for agents involved in activities of that type.
Because of the KD’s focus on individual activities, the process model might need to be refined during the elicitation of information resources from the PG. Likewise, changes to the process model generally trigger changes to the information resource model.
(1) adapted from [GJ96]
Based on the information resource model, the KD can actively provide agents with information resources considered by the PG to be useful for them with regard to their individual activities (phase 6). In addition to this automatic information resource provision via the Information Assistant (see Section 3.2.4), agents can post explicit questions to the KD, for which it will attempt to provide information items that help to find an answer to the question. Also, agents are given the means to mark offered information resources as useless for them, in order to prevent them from being bothered with useless information resources during the remainder of their activity.
Both during and at the end of an activity, members of the KD monitor and analyze the provision of information resources (phase 4), in order to identify new useful information resources that should be captured, and to identify problems with the current model (e.g. indicated by the provision of irrelevant or useless information resources).
Information Request and Feedback Forum
Monitoring and evolution are supported actively by PRIME via the Information Request and Feedback Forum (IRFF) component. Any communication between agents and the KD is maintained for later analysis by the KD, with the objective to update the information resource model in such a way that the Information Assistant will provide agents with all useful information during future activities, relieving agents from the burden of having to query information systems or the KD for required information explicitly. The IRFF is organized in a tree-like hierarchy, where each tree node maintains one or more node-specific forums (see Figure 5.2). The inner nodes correspond to the characterization class hierarchy stored in the Process KB. The tree’s leaf nodes correspond to the instances (i.e. characterization objects) that are referenced as the child nodes of each characterization class node.
[Figure 5.2: tree diagram. Characterization classes: SEDMProcess, Implementation Process, Migration Process, Testing Process, each with a forum "Modeling Requests (from PG)". Characterization objects: impl_act1, impl_act2, test_act1, test_act2, each with the forums "Information Requests" and "Feedback". Legend: act = characterization object; forum; relations instance-of and has-forum.]
Fig. 5.2: Forum structure based on the characterization class hierarchy.
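The forum structure of Figure 5.2 can be sketched as a small tree data structure. The sketch is illustrative only: the class and method names are hypothetical, and it assumes that every characterization class node carries exactly one "Modeling Requests (from PG)" forum, while every characterization object carries the two forums shown in the figure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ForumTreeSketch {

    /** A tree node with its node-specific forums and its child nodes. */
    static final class Node {
        final String name;
        final List<String> forums;
        final List<Node> children = new ArrayList<>();

        Node(String name, String... forums) {
            this.name = name;
            this.forums = Arrays.asList(forums);
        }

        Node add(Node... nodes) {
            children.addAll(Arrays.asList(nodes));
            return this;
        }
    }

    /** Inner node: a characterization class with its type-specific forum. */
    static Node characterizationClass(String name) {
        return new Node(name, "Modeling Requests (from PG)");
    }

    /** Leaf node: a characterization object with its two activity-specific forums. */
    static Node activity(String name) {
        return new Node(name, "Information Requests", "Feedback");
    }

    /** Counts all forums maintained in the subtree rooted at the given node. */
    static int countForums(Node node) {
        int total = node.forums.size();
        for (Node child : node.children) {
            total += countForums(child);
        }
        return total;
    }

    static Node exampleTree() {
        return characterizationClass("SEDMProcess").add(
                characterizationClass("Implementation Process")
                        .add(activity("impl_act1"), activity("impl_act2")),
                characterizationClass("Testing Process")
                        .add(activity("test_act1"), activity("test_act2")));
    }

    public static void main(String[] args) {
        System.out.println(countForums(exampleTree())); // prints 11
    }
}
```

In this example, three class nodes contribute one forum each and four activity nodes contribute two forums each, which gives eleven forums in total.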
For each activity, two activity-specific forums are maintained:
• Information Requests: stores the questions posted by agents to Knowledge Brokers during the activity, as well as their replies (see Figure 4.7).
• Feedback: stores feedback postings of agents performing the activity, as well as any replies from KD members.
Agents can post activity-specific feedback messages via their Information Assistant. The IA supports the following different kinds of feedback postings by providing appropriate preconfigured template messages:
1. Modeling Requests: An agent can request that a certain information resource which he found useful during his activity be offered to him in the future under certain conditions. Requests can be posted from the agent’s Information Assistant by issuing a "Post Modeling Request" command for a selected information resource. The request is formulated by filling in a template provided by the Information Assistant as shown in Figure 5.3.
    Information resource that should be offered:
    --------------------------------------------
    JDK 1.2 language specification
    In what situation should it be offered?
    -------------------------------------
    whenever I am performing a Java implementation activity.
Fig. 5.3: Example for an activity-specific modeling request. The agent fills in a template to indicate when he would like to be offered the information source "JDK 1.2 language specification". The template (already containing the name of the selected information resource) is provided by the Information Assistant. The Information Assistant will send the posting to the activity-specific forum.
It will be the KD’s responsibility to define a corresponding information source recommendation.
2. New Information Source Notification: Whenever an agent adds a new information source (e.g. the URL of a web page that he found useful during his activity) to the personal information source list maintained by his Information Assistant, the IA posts an automatic message to the activity-specific feedback forum (2) (see Figure 5.4). This message informs the KD about a newly found information source that might be useful to other agents as well. Together with the PG and subject matter experts, it has to be decided whether the information source should also be offered during other activities (see list of questions below).
    Newly added Information resource
    --------------------------------------------
    JDK Tech Tipp: Swing Help Package (http://....)
    Added to activity
    --------------------------------------------
    Implement ECA rule editor... (http://....)
Fig. 5.4: Example for an automatic new information source recommendation posting.
(2) For privacy reasons, the IA can alternatively be set up in such a way that newly added personal information sources are not posted automatically. Instead, the agent has to issue an explicit "publish" command in order to trigger the IA to post the message.
3. Resources Marked Useless: Whenever an agent marks an information resource offered to him by the IA as "useless", the IA provides him with a predefined template. Using this template, the agent can (optionally) enter a short textual explanation why he considered the information resource useless for his current activity. After the template has been completed, the IA sends the posting to the activity-specific feedback forum (2) (see Figure 5.5).
    Information resource considered useless:
    --------------------------------------------
    JDK Tech Tipp: Handling Time and Date in Java
    Considered useless for activity
    --------------------------------------------
    Implement ECA rule editor... (http://....)
    If possible, please provide a short explanation why it is useless:
    ----------------------------------------------
    Tech Tipp is specific to Java 1.1, but is no longer useful for Java 1.2 (deprecation).
Fig. 5.5: Example for a "marked useless" posting.
4. Restructure Proposals: An agent can propose to restructure the information provided by the IA for a selected information resource. Proposals can be posted from the agent’s Information Assistant by issuing a "Propose Restructuring" command for a selected information resource. The proposal is formulated by filling in a template provided by the Information Assistant as shown in Figure 5.6.
    Information resource for which restructuring is proposed:
    --------------------------------------------
    EJB Handbook for VisualAge for Java
    Proposal:
    ----------------------------------------------
    Extract the chapter "Introduction to EJB". This is a good general introduction to EJB that is not specific to VisualAge.
Fig. 5.6: Example for a "restructure proposal" posting.
5. Missing Knowledge (Identified Knowledge Gaps): Whenever agents recognize in hindsight that certain information would have been useful for an activity, but was not offered to them, it is important that the KD is alerted to this problem. Often, these kinds of "knowledge gaps" will be identified by agents who review the activity results of a colleague. Hence, the IA supports the posting of activity-specific "Missing Information Reports". The report is formulated by filling in a template provided by the Information Assistant as shown in Figure 5.7.
    Missing Information Feedback ("If I had only known this before ...")
    ==============================================
    Missing information for activity:
    -------------------------------------------
    What information should have been provided?
    --------------------------------------------------------------
    Potential source of this information? (Where could it have been found?)
    --------------------------------------------------------------
    Information would have affected (please check):
    --------------------------------------------------------------
    O Planning:
        O Time Scheduling
        O Agent Assignment
        O Tool Selection
        O Method Selection
    O Enactment:
        O Tool/Technology usage (known limitations, problems...):
        O Best practice ("How to do it"...):
        O Work result (e.g. possibility to reuse a solution...):
    When should the information be offered in the future? (If possible, please try to refine the situation):
    --------------------------------------------------------------
Fig. 5.7: Template for a "Missing Information Feedback" posting.
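The five kinds of feedback postings and their preconfigured templates can be summarized in a short sketch. The enum constants and the `render` helper are hypothetical names; the headings are taken from Figures 5.3 to 5.7, with the "Missing Information Feedback" template abridged to its first three fields.

```java
import java.util.Arrays;

final class FeedbackTemplateSketch {

    /** The five posting kinds, each with the (partly abridged) headings of its template. */
    enum Kind {
        MODELING_REQUEST("Information resource that should be offered:",
                "In what situation should it be offered?"),
        NEW_SOURCE_NOTIFICATION("Newly added information resource",
                "Added to activity"),
        MARKED_USELESS("Information resource considered useless:",
                "Considered useless for activity",
                "If possible, please provide a short explanation why it is useless:"),
        RESTRUCTURE_PROPOSAL("Information resource for which restructuring is proposed:",
                "Proposal:"),
        MISSING_INFORMATION("Missing information for activity:",
                "What information should have been provided?",
                "Potential source of this information?");

        final String[] headings;

        Kind(String... headings) {
            this.headings = headings;
        }
    }

    /** Renders heading, dashed rule and answer per field; missing answers stay blank. */
    static String render(Kind kind, String... answers) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < kind.headings.length; i++) {
            char[] rule = new char[kind.headings[i].length()];
            Arrays.fill(rule, '-');
            out.append(kind.headings[i]).append('\n')
               .append(rule).append('\n')
               .append(i < answers.length ? answers[i] : "").append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(Kind.MODELING_REQUEST,
                "JDK 1.2 language specification",
                "whenever I am performing a Java implementation activity."));
    }
}
```

Pre-filling the first answer with the name of the selected information resource, as described for Figure 5.3, corresponds to passing only that first argument and leaving the remaining fields blank for the agent.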
In addition to these activity-specific forums, a type-specific forum for Modeling Requests from the PG is maintained for each process type (i.e. its corresponding characterization class). Whenever the Process Group has decided that certain information should be offered to agents under certain conditions, it posts a corresponding message to an appropriate process type-specific forum (see Figure 5.8). Requests are posted by a Process Group member by navigating to the type-specific forum of the appropriate process type (i.e. characterization class) and filling in a template provided by the Forum component as shown in Figure 5.8.
    Information that should be provided:
    -------------------------------------
    New Java version (JDK 1.5) language specification has been released (http://www.....). We should ensure that our source code is compatible with the new standard
    For activities of type:
    --------------------
    Implementation Process
    Activity constraints?
    --------------------
    programming language is Java
    Role constraints?
    --------------------
    planners, programmers and reviewers
    Skill constraints?
    --------------------
    none
    Required topics?
    --------------------
    none
Fig. 5.8: Process type-specific Modeling Request Forum. Requests should be posted to the forum of the most general type of activities whose participating agents should be provided with the information specified. If necessary, members of the KD can reply to postings by asking for more detailed information.
The type-specific forums will also be used by the PG to inform the KD when the organization has purchased a new tool to be used during certain activities; supported by the PG, it will be the KD’s responsibility to define appropriate model entities in PRIME’s Characterization KB, as well as related information sources (e.g. tool-specific newsgroups, the vendor’s web site, the tool’s technical support site etc.) that should be disseminated to agents performing certain activities. Likewise, the type-specific forums will be used to inform the Knowledge Department about changes to the process model.
By means of the IRFF, members of the Knowledge Department are actively notified when the possibility to evolve and improve its information resource model should be considered. Evolution is initiated by e-mail notifications sent to the Knowledge Department by the IRFF whenever:
• an agent posts a question to an activity-specific information request forum: in addition to posting an answer to the question, KD members should consider whether the information need expressed by the question might also arise during other activities.
• an agent posts a feedback message to an (activity-specific) feedback forum.
• a Process Group member posts a request to a (process-type-specific) Modeling Requests forum.
However, as explained in Clarification 3.1, the final decision on whether a new information resource should be provided during all activities of a certain type under certain conditions has to be made by the Process Group: it is the PG that takes into account the organization’s perspective on usefulness, i.e. its best practice process descriptions (3) are based on cost/benefit considerations. In this work, we will use the term "useful" in its intuitive meaning. However, it will have to be defined more precisely in terms of a cost model specified by the organization that operates the Knowledge Department.
In the following, the events and reactions by the KD as sketched above are listed more systematically in the form of questions and guidelines, looking more closely at the analysis conducted by the KD in phase 4 (see Figure 5.1). Based on the postings stored in the IRFF, the KD analyses the feedback from agents, attempting to find answers to the following questions:
Question 1: Did you miss any useful information items to help you with performing the activity? If yes, how can we make sure that this information is provided in the future?
We can distinguish the following situations:
Case 1.1: The agent posted a question to the KD that expressed his information need. Because the IA tracks all posted questions to the KD concerning a concrete activity, the set of these new information needs indicates that certain useful information has not been offered to the agent.
(3) In general, the process model has to provide an answer to the question "What process should be followed for a project with certain characteristics?"
CHAPTER 5 Process-Oriented Knowledge Evolution
Case 1.1.1: A corresponding expected information need already exists, but was not offered because its constraints were not satisfied. In a review conducted in cooperation with the PG, it has to be clarified whether the information need should have been offered during the activity. If not, no further steps are taken by the KD; if so, the KD has to identify which of the following cases occurred:
• the constraints are correct, but the referenced activity or agent characteristics either have not been specified, or have been specified incorrectly, or have not been updated to reflect the current situation (e.g. the agent’s skill profile is outdated: he has lost a certain degree of expertise)
• the constraints are incorrect (e.g. too strict)
In the first case, the KD should update the corresponding activity and/or agent characteristics; in the second case, the KD has to clarify in cooperation with the PG how the constraints should be modified so that the information need will be offered during future comparable activities of that type.
Case 1.1.2: A corresponding expected information need already exists and was offered by the KD (more specifically: the IA), but the agent did not make use of it (e.g. because it was overlooked, or considered to be useless by the agent). The agent did not recognize the offered information resource as a means to satisfy his information need. The KD should consider restructuring the information resource (see Question 4), in order to prevent the information resource from being overlooked in the future.
Case 1.1.3: A corresponding expected information need does not exist. In a review conducted in cooperation with the PG, the following issues have to be clarified:
• when should this new information need be offered in the future? (That is, the information need has to be associated with appropriate model entities, and constraints have to be identified.)
• where and how can information items be found that satisfy the information need? (That is, a set of information source usage recommendations has to be identified. The information items that were found by the Knowledge Brokers in order to satisfy the agent’s information need can be used as a starting point. Also, any information items found by the agent himself should be taken into account.)
If the PG considers the information need not useful for future activities, the information need should still be kept, together with an explanation (stored as an information item "satisfying" the information need) of why the question is considered useless. Such an information need will never be offered by the IA, unless an agent posts the question again during a future activity. Only if the KD finds the corresponding expected information need (marked as useless) while checking for existing expected information needs that were not offered because of unsatisfied constraints, will it be presented to the agent. The explanation of why it is not useful will prevent the agent and the KD from spending further time on searching for information items to satisfy the information need.
Case 1.2: The agent’s information need is reflected by a corresponding expected information need being offered, but the information items provided for it did not satisfy the information need. The agent contacts the KD to ask for information items that actually satisfy his information need. A Knowledge Broker from the KD searches for useful information items and provides the agent with the retrieval results.
Case 1.2.1: An information source usage recommendation that would have satisfied the information need already existed for it, but was not offered. In cooperation with the PG, it has to be clarified if and how the usage recommendation’s constraints and/or the agent’s skill profile have to be updated, so that it is offered during future activities. This case corresponds to Case 1.1.1.
Case 1.2.2: An information source usage recommendation that should have satisfied the information need already exists and was offered, but its query command is incorrect. The KD (i.e. one of its Knowledge Engineers) needs to correct the query command.
Case 1.2.3: The usage recommendations are correct; to the best of the KD’s knowledge, no better items can be offered. In cooperation with the PG, it has to be clarified if and how the information need can be satisfied in the future (e.g. by establishing a new internal information source/system).
Case 1.2.4: The usage recommendations are correct and the retrieved items actually satisfy the agent’s information need, but the agent misinterprets (or does not understand) them. The KD should alert the agent to this misunderstanding.
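The distinction between Cases 1.1.1 to 1.1.3 can be read as a small decision procedure. The following Python sketch illustrates it under simplifying assumptions: the class and function names (`ExpectedNeed`, `classify`, `FOLLOW_UP`) are hypothetical and not part of PRIME, and the lookup of a matching expected information need is assumed to have been done by a KD member beforehand.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Case(Enum):
    CONSTRAINTS_NOT_SATISFIED = "1.1.1"  # need exists, but was not offered
    OFFERED_BUT_UNUSED = "1.1.2"         # need exists and was offered
    NO_EXPECTED_NEED = "1.1.3"           # no corresponding need is modeled


@dataclass
class ExpectedNeed:
    """Hypothetical record of a modeled (expected) information need."""
    question: str
    was_offered: bool  # True if the IA offered it during the activity


def classify(match: Optional[ExpectedNeed]) -> Case:
    """Map a posted question to Case 1.1.1, 1.1.2 or 1.1.3.

    `match` is the expected information need that corresponds to the
    posted question, or None if the KD found no such need.
    """
    if match is None:
        return Case.NO_EXPECTED_NEED
    if match.was_offered:
        return Case.OFFERED_BUT_UNUSED
    return Case.CONSTRAINTS_NOT_SATISFIED


# Follow-up actions, paraphrased from the case descriptions above.
FOLLOW_UP = {
    Case.CONSTRAINTS_NOT_SATISFIED:
        "review with PG; update characteristics or modify constraints",
    Case.OFFERED_BUT_UNUSED:
        "consider restructuring the information resource (Question 4)",
    Case.NO_EXPECTED_NEED:
        "review with PG; model the new need or keep it marked as not useful",
}

need = ExpectedNeed("Which coding conventions apply?", was_offered=False)
case = classify(need)
print(f"Case {case.value}: {FOLLOW_UP[case]}")
```

The sketch covers only Case 1.1; the sub-cases of Case 1.2 would require additional information about the usage recommendations and the retrieved items.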
Question 2: Did you find new information that you think is worth capturing?
Without consulting the KD, the agent might have found useful information items on his own that satisfy one of his information needs. These information items should be captured in the form of information source (usage) recommendations for the corresponding information need, so that they can be offered automatically by the IA during future activities. One way to collect these information items during project enactment is to provide agents with the means to maintain activity-specific, personal lists of information items, e.g. a private workspace to which the agent can add newly found information items during his activity. From this workspace, the agent can release selected items that he considers worth capturing. In order to determine which of these released items are useful for future activities, the PG is involved in the same review procedure as in Case 1.1.
Question 3: Did we provide you with useless knowledge items? If yes, how can we prevent this in the future?
In order to support immediate feedback to the KD concerning this question, the IA provides its agent with the means to mark offered information resources, as well as information items retrieved for an information resource, as useless. The IA hides all information resources marked as useless by its agent from him, i.e. they are no longer offered to him during the remainder of the activity.
Case 3.1: The agent considers an information resource to be useless for him. In cooperation with the PG, it has to be clarified whether the information resource is actually useless to the agent during his activity.
Case 3.1.1: The information resource is actually useless for the agent. This means that the KD should prevent the information resource from being offered during comparable situations in the future. In correspondence to Case 1.1.1, we can distinguish the following reasons why the information resource has been offered:
• the constraints are correct, but the referenced activity or agent characteristics either have been specified incorrectly, or have not been updated to reflect the current situation (e.g. the agent’s skill profile is outdated, because he has acquired new capabilities)
• the constraints are incorrect (e.g. too general)
In the first case, the KD should update the corresponding activity and/or agent characteristics in such a way that the information resource is no longer offered to the agent by the IA during activities comparable to the one under consideration. In the second case, the KD has to clarify in cooperation with the PG how the constraints should be modified so that the information resource is no longer offered during future comparable activities of that type when it would be useless.
Case 3.1.2: The information resource is not useless for the agent. In this case, the agent is not aware of the information resource’s usefulness. The PG should explain to the agent why the offered information resource should be consulted by him.
Case 3.2: The agent considers an information item to be useless for him that was retrieved for an information need (considered to be useful). In cooperation with the PG, it has to be clarified whether the information item is actually useless for the agent during his activity.
Case 3.2.1: The information item is actually useless for the agent. The KD has to determine which information source (usage) recommendation retrieved the useless information item. The remainder of this case corresponds to Case 3.1.1, i.e. the recommendation’s constraints (or query) are either correct or incorrect.
Case 3.2.2: The information item is not useless for the agent. This case corresponds to Case 3.1.2.
So far, the questions focused on when certain knowledge is required and hence involved the PG, whose responsibility it is to determine what knowledge is required during what activity classes. The next question focuses on how this knowledge is represented.
Question 4: Concerning the knowledge provided: is there a better way to provide it (e.g. by restructuring documents)?
Feedback concerning this question is used by the Knowledge Engineers to:
• reformulate the textual representation of information resources (e.g. questions and textual usage descriptions of information needs)
• restructure the set of existing information resources, e.g. by updating the (sub-)categories so that agents find the resources more quickly, or by adding new sub-information need references between information needs, etc.
• create additional information resources that provide immediate access to useful information items which were offered only indirectly during the activity.
• restructure the contents of certain information items.
The last point corresponds to the standard evolution phase of KM, whereas the first three points are characteristic of the process-oriented approach presented here.
So far, the feedback from agents to the questions listed above with regard to the performance of their own activities only reflects their subjective information needs. As a consequence, the above questions will not identify relevant captured information items that should have been used but of which the agent is unaware (e.g. when known issues have not been addressed by the agent, or reusable artifacts already available in the organization have not been considered by the agent). In order to identify such knowledge gaps, agents who reviewed the activity results of a colleague are also asked for feedback on the following question.
Question 5: Concerning the activity results you have reviewed: have useful information items that you know of been neglected? What information resource provision could prevent this in the future?
Case 5.1: The information resource suggested by the reviewer already exists, but was not offered by the KD (more specifically: the IA) because its constraints were not satisfied. This corresponds to Case 1.1.1; the same procedure should be applied here.
Case 5.2: The information resource suggested by the reviewer already exists and was offered by the KD (more specifically: the IA), but the agent did not make use of it (e.g. because it was overlooked, or considered to be useless by the agent). Apart from restructuring the information resource in order to prevent it from being overlooked, the issue lies outside the KD’s responsibilities; instead, it needs to be resolved by the agent and the reviewer.
Case 5.3: The information resource suggested by the reviewer does not yet exist.
This corresponds to Case 1.1.3; the same procedure should be applied here.
In summary, the feedback loop operated by the KD provides a means to bridge the gap between (i) what information resources the PG considers to be useful during likely future activities, and (ii) what information needs actually arise during concrete activities. Whereas an initial information resource model elicited from the PG can be used to bootstrap the feedback loop, analysis of requests to the KD and feedback from agents during their activity performance provides the basis for continuous organizational learning: new information resources discovered by an agent are captured and, if the PG considers them useful for a certain class of activities, are distributed via their IAs to other agents that perform activities of that class.
CHAPTER 6
Discussion
"Direct exchange between people is far more relevant than the capture of knowledge, collaboration is more important than documentation." (Christian Kurtzke, Founder of the Siemens ShareNet, Die Zeit, Nr. 4/02)
In the following, we conduct a qualitative analysis of the approach presented in this thesis and compare it to related work. Furthermore, we outline a set of case studies that could be conducted in order to substantiate the expected benefits listed in Section 1.2, and discuss limitations of our approach.
Process-Oriented Knowledge Management has recently received increasing attention from both industry and research (see e.g. [SSS+01], [AHM+02]). Even though few reports on empirical evaluations exist (e.g. [RMS00]), the integration of business process modeling/reengineering and Knowledge Management is generally acknowledged to have high synergy potential. It is argued that, by focusing on the organization’s core activities as defined in its business process model, KM initiatives obtain a well-defined objective (namely: cost reduction and quality improvement with regard to core activities) that makes a pay-off more likely. Concerning the level of integration, Abecker et al. distinguish between three different integration types [AHM+02]:
Levels of Integration
1. Business Process Management as the Basis for Knowledge Management
Integration is achieved on a management level. A life-cycle model for Business Process Management is extended by activities that describe how knowledge that is relevant for the business process is identified, structured, stored, exchanged and applied. The process model stands in the centre of an integrated Business Process/Knowledge Management life-cycle model.
2. Knowledge Management and Process Enactment
During process enactment, activity-specific access to relevant knowledge is provided, i.e. knowledge is retrieved (and stored) during enactment. Ideally, enactment support is provided by a Workflow Management System, whereas knowledge retrieval is supported by an Organizational Memory Information System.
3. Business Processes as the Primary Object of Knowledge Management
Knowledge Management approaches are applied in order to achieve continuous process improvement. Processes are reengineered based on experience gained during enactment, resulting in updates to the process models as well as to the knowledge being provided during process enactment to better meet the organization’s goals.
The approach presented in this thesis describes a Type 2 integration, but also covers Type 3 aspects of continuous process improvement. The improvement addressed in our approach is concerned with the information delivery for agents participating in the process. Instead of aiming at an evolution of the process itself, our approach aims at "making the best" of the current process by predicting the information needs of process participants, and providing participants with information that (potentially) satisfies these information needs.
In our opinion, the need for such continuous learning of what information is useful/relevant during process planning and enactment is characteristic of weakly-structured, dynamic processes like software engineering processes. In business processes, which so far have been the main focus of Process-Oriented Knowledge Management, the necessity of a continuous capture of information needs does not arise: compared to the dynamic environment of software engineering, business processes (including their domain) remain fairly static, whereas for software development processes the domain changes frequently, at least in terms of technological advances and product modifications. Because this issue did not arise before, almost all systems that support POKM for business processes assume a static set of modeled information needs that is not supposed to change during process enactment (see below).
Process Improvement Initiatives
In the domain of Software Engineering, several approaches have been developed that can be regarded as "fully-fledged" approaches for Type 3 integration for continuous process improvement, e.g. the Capability Maturity Model (CMM) [PCC+93], or SPICE [EDM98]. Our work complements these approaches by presenting a concept for refining (instead of replacing) the current process model. Using a rough analogy, one could say that our approach attempts to achieve a "local optimum" with the current process model by evolving the associated meta-knowledge on what
information is needed to successfully perform the process. In contrast, an organization that has reached CMM level 5 would attempt to change the process itself, trying to find a "global optimum". Correspondingly, our approach focuses on systematic information delivery support for fine-granular, individual activities, whereas process improvement initiatives like CMM typically take a coarse-grained, project-level view on software engineering processes.
Experience Factory
Process Improvement is also subsumed by the general Quality Improvement Paradigm (QIP) that underlies the Experience Factory (EF) approach [BCR94]. In principle, the QIP can be applied both on the project level and on the level of individual activities (see e.g. [Tau00]); the Experience Factory is designated for comprehensive reuse of various kinds of artifacts, including process models, project plans, former activities, and software products. However, the approach does not detail how the packaged experiences are stored or organized in the Experience Base. Even though a distinction is made between an organization-wide section and project-specific sections of the Experience Base, there is no explicit means for a process-oriented representation of what information is relevant for activities that occur within the process. Accessing the Experience Base in order to find information that is relevant for the current activity lies in the responsibility of the project organization (e.g. the individual process participants). Hence, the approach presented in this thesis complements the EF approach by proposing a process model-oriented knowledge organization scheme, together with a technology that turns a formerly passive EF into an active Organizational Memory.
PSEEs
In contrast, providing participants with relevant information during process enactment is a major issue in research on Process-Centred Software Engineering Environments (PSEEs) [GJ96]. In general, the focus has been mostly on process coordination, process enactment, change management, and modeling languages; an integrated support concept for Knowledge Management has so far been neglected. At best, PSEEs can be regarded as tools to support Knowledge Management that is restricted to knowledge on how to enact processes, what products have been created during the process, and information on what events occurred during the process. None of the first-generation process support systems (e.g. EPOS [LC98], SPADE [BDF96], Merlin [PSW92], Endeavors [BT96], or ProcessWeaver [Fer93]) addresses the topic of active information delivery from information systems available to the organization. In more recent systems (see e.g. [GH98]), active, situation-specific information delivery is restricted to notifications about changes that occurred to process-specific entities; there is no support for representing information needs that can be satisfied by queries to general information systems.
At the time of writing this thesis, most work on integrating Knowledge Management and process support has been done in the field of business processes (see [AHM+02] for a recent overview of Business Process-Oriented Knowledge Management). In the following, we discuss and compare PRIME to related state-of-the-art approaches for Type 2 integration with regard to the aspects listed below; these aspects present a refinement of the integration possibilities listed in [ABN+01] for Type 2 integration approaches:
1. Process-Oriented Knowledge Archive: The organization’s process model serves as an index into the Organizational Memory (Information System). During process enactment, the agent enacting an activity is provided with the means to manually browse (or specify queries to) the contents of the OMIS as defined in the model for this activity.
2. Active Information Delivery: During workflow engine-supported process enactment, predefined queries that correspond to information needs can be sent to the OMIS for an automatic information retrieval during an activity.
3. Freely-Definable Information Needs: Information needs can be defined independently from the activities’ structure; i.e., they are not necessarily intended to determine the value of a specific activity or product attribute. Instead, modeled information needs can correspond to any information need that a user might have during the enactment of the activity.
4. Information Need Satisfaction Alternatives: If there is more than one information source or query command that can be utilized to retrieve information that satisfies an information need, corresponding alternatives can be specified.
5. Generic Information Needs: The query commands specified in information needs can contain parameters that correspond to activity- and agent-specific characterization attributes. During activity enactment, these parameters are replaced by the current activity/agent attribute values before the query is launched. This facilitates the retrieval of context-specific information.
6. Situation-Specific Information Needs: Conditions can be specified for information needs, such that their queries are only triggered depending on the current situation, which is defined by the current activity’s and agent’s attribute values (i.e. process state information).
7. Information Needs for Different Roles: In addition to (6.), conditions make it possible to distinguish between different roles that an agent can play during an activity. For example, the planner of an activity will have different information needs than the performer of the activity.
8. Domain-Oriented Information Need Organization: As a means to structure the set of modeled information needs, they can be associated with any domain entity. Restricting the organization scheme to process model entities poses the danger that the scheme does not scale up to complex domains (measured in terms of the number of different products, tools and technologies that are handled by the participants during activities of a given type).
9. Separate Characterization Vocabulary: A separate vocabulary can be defined for activities, activity-related entities, and agents from an information delivery viewpoint. This becomes necessary when the vocabulary provided by the workflow engine is restricted to an enactment/coordination viewpoint, and cannot be extended to an organization-specific domain ontology.
10. Contextualized Information Storage: Artifacts/products created during activity enactment are stored together with references to their creation context, e.g. the corresponding activity, the agent who created the product, etc.
11. Activity-Specific Forums: For every activity, discussion or feedback forums are maintained to stimulate immediate and focused feedback in the context of the activities.
12. External Information Sources: Queries can be sent to information sources maintained outside the organization, and are not restricted to a centralized OMIS. For software engineering processes, this is especially important whenever they have to cope with innovative technology, because knowledge on such technology is mostly available in external information sources (e.g. newsgroups, mailing lists etc.).
13. Continuous Information Need Evolution: New information needs can continuously be captured and modeled during process enactment, while already existing information needs can be updated according to agent feedback. This is especially important for flexible, weakly-structured processes that occur in a frequently changing environment, where no complete set of information needs can be specified in advance.
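Aspects 4, 5 and 7 can be illustrated with a small Python sketch. It is not PRIME's implementation: the names `Recommendation`, `InformationNeed` and `instantiate`, the source names, and the query syntax are hypothetical, and the standard-library `string.Template` merely stands in for whatever parameter mechanism a concrete system uses.

```python
from dataclasses import dataclass
from string import Template
from typing import Dict, FrozenSet, List, Tuple


@dataclass
class Recommendation:
    """One alternative way to satisfy a need (aspect 4)."""
    source: str      # hypothetical information source name
    query: Template  # query command with $parameters (aspect 5)


@dataclass
class InformationNeed:
    question: str
    roles: FrozenSet[str]  # roles the need is relevant for (aspect 7)
    alternatives: List[Recommendation]


def instantiate(need: InformationNeed, role: str,
                attributes: Dict[str, str]) -> List[Tuple[str, str]]:
    """Replace query parameters by the current activity/agent attribute values.

    Returns one (source, concrete query) pair per alternative, or an
    empty list if the need does not apply to the given role.
    """
    if role not in need.roles:
        return []
    return [(r.source, r.query.substitute(attributes))
            for r in need.alternatives]


need = InformationNeed(
    question="Which known issues exist for the tool in use?",
    roles=frozenset({"performer"}),
    alternatives=[
        Recommendation("bug-tracker", Template("product:$product AND tool:$tool")),
        Recommendation("newsgroup-archive", Template("$tool problems")),
    ],
)
context = {"product": "billing", "tool": "javac"}
queries = instantiate(need, "performer", context)
for source, query in queries:
    print(source, "<-", query)
```

Keeping the template separate from the attribute values is what makes the need generic: the same modeled need yields different concrete queries for different activities and agents.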
Table 6.1 summarizes the results of the comparison, which we restricted to related work that addresses workflow-embedded Knowledge Management support. According to this analysis, only a few approaches exist that provide a functionality roughly comparable to PRIME when coupled to a flexible PSEE like MILOS.
Aspects                                        TIDE   EULE   OntoBroker/SGML  WorkBrain  KnowMore  DECOR  PRIME
Process-Oriented Knowledge Archive             +      +      +                +          +         +      +
Active Information Delivery                    +      +      +                -          +         +      +
Freely-Definable Information Needs             +      +      -                -          -         (+)a   +
Information Need Satisfaction Alternatives     -      +      -                -          +         (+)a   +
Generic Information Needs                      +      +      +                -          +         +      +
Situation-Specific Information Needs           (+)b   +      +                -          +         +      +
Information Needs for Different Roles          -      -      -                -          +         +      +
Domain-Oriented Information Need Organization  -      -      -                -          -         ?      +
Separate Characterization Vocabulary           +      +      +                -          +         +      +
Contextualized Information Storage             -      -      -                +          +         -      +
Activity-Specific Forums                       -      -      -                +          -         -      +
External Information Sources                   -      -      -                +          +         +      +
Continuous Information Need Evolution          -      -      -                -          -         -      +

+: aspect supported   -: aspect not supported   ?: unknown
a. unclear (because work still in progress), but very likely
b. via Bayesian Network
Tab. 6.1: Comparison of Process-Oriented Knowledge Management environments.
TIDE
TIDE [Wol99] is a web-based system that facilitates task-based retrieval of documents. A task in TIDE describes a yes/no question that a user is trying to answer; it is represented by a set of weighted slot/value pairs that characterize the question, where all values are terms (words or word stems). The weight of an attribute "loosely ... represents the importance or frequency of the value of that slot in relevant documents" [Wol99]. Furthermore, a task (question) references a set of sub-questions that provide evidence towards answering the parent. This task hierarchy "corresponds to a Bayesian Network, which encodes the probabilistic relationships
between questions" [Wol99]. The weights and the task hierarchy are maintained in a task model that users instantiate for their concrete tasks. The task representation is used to retrieve a set of task-specific relevant documents via the vector space model [SM83]. Each document is characterized by a vector of weighted terms. A weighted keyword query is derived from a user’s task (question) representation by recursively collecting the terms from the question’s sub-questions and computing weights for these terms according to the sub-questions’ importance to the parent question. TIDE’s method of query derivation allows the relevance criterion to be adapted to reflect changes in the users’ opinion on relevance by appropriate weight modifications.
Compared to PRIME, two main differences can be identified: first, TIDE restricts the notion of a user’s task to answering yes/no questions, whereas activities in PRIME reflect arbitrary tasks. Second, relevance in TIDE is computed by a weighted term query approach; in PRIME, relevance computation is a two-step process: (i) determination of relevant information needs based on a symbolic activate/trigger model (1) using boolean F-Logic expressions, and (ii) launching a well-formed query command to an appropriate information source as defined by the information needs. Thus, in TIDE relevance can only be expressed heuristically in terms of a probabilistic model; in PRIME, relevance can be formulated as a logical fact, which makes it possible to enforce that certain important documents are always retrieved in specific situations. Also, TIDE’s weight-based relevance model will be difficult to maintain, as it is not trivial to identify which weights have to be changed in what way for an intended update of the relevance criterion.
However, the TIDE system might be an interesting extension to PRIME, as TIDE’s notion of a task actually corresponds more closely to PRIME’s notion of an information need (formulated as a yes/no question): by mapping TIDE’s tasks to PRIME’s information needs, the information need’s query command could be used to trigger TIDE’s retrieval mechanism.
(1) Information resources are activated by entities present in the activity representation; activated information resources trigger if their constraints are satisfied. Only triggered resources are presented to the agent performing the activity.
EULE
EULE [RMS00] is a system that provides computer-based guidance for office workers at Swiss Life. It introduces a formal knowledge representation language that covers data and process aspects, as well as legislation and company regulations relevant for office tasks dealing with life insurance. Users are guided through a sequence of activities to perform their tasks, being provided with access to relevant documents (contracts, letters, client data etc.); for each activity, users are requested to enter task-specific data into forms that are presented to them by EULE. Depending on the data entered, new activities might be triggered because of certain laws or regulations. EULE uses deduction to create appropriate instances of rights and obligations, which are represented as concepts; its inference engine couples description logic and deductive database technology. For each activity, EULE can present an explanation to the user why the activity has to be performed. Furthermore, letters that have to be created during certain activities can be generated automatically from the user’s data (in combination with the company’s databases).
The system went into production at Swiss Life in mid-1999, and is reported to be highly accepted by employees. Perhaps most interestingly for the approach presented in this thesis, a field study with EULE has been conducted at Swiss Life with positive results: team heads "noticed a considerable relief from the support they usually need to give their team members whenever they encounter a situation they do not know how to deal with" [RMS00].
Because of its inflexible workflow enactment model (build/compile/execute life-cycle), EULE is inadequate to support software development processes (2). Furthermore, EULE is not designed to provide users with information from external information sources. In addition to the documents related to an activity (contracts, letters, etc.), users are given access to textual representations of laws and regulations that are relevant to the user’s current activity. In particular, relevance of information is determined strictly deductively in EULE. In PRIME, relevance can also be computed deductively (by means of information need preconditions formalized in F-Logic); but additionally, information can be retrieved via soft-matching mechanisms (e.g. standard information retrieval approaches [SM83], similarity measures [RA01] etc.). Depending on the query command specified within an information need and the retrieval mechanisms supported by available information sources, soft-matching can be used to find relevant information whenever this seems appropriate.
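The first, deductive step of PRIME's relevance computation (the activate/trigger model referred to in the comparison with TIDE) can be sketched as follows. This is only an illustration: PRIME formalizes constraints as F-Logic expressions, whereas the sketch substitutes plain Python predicates, and all names and example resources are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Situation = Dict[str, int]  # simplified stand-in for activity/agent attributes


@dataclass
class InformationResource:
    name: str
    activated_by: str  # entity whose presence activates the resource
    constraint: Callable[[Situation], bool]  # stand-in for an F-Logic condition


def triggered_resources(resources: List[InformationResource],
                        entities: Set[str],
                        situation: Situation) -> List[str]:
    """Return the names of resources to present to the agent.

    Step 1 (activate): keep resources whose activating entity occurs in
    the activity representation.
    Step 2 (trigger): of those, keep resources whose constraint holds in
    the current situation.
    """
    activated = [r for r in resources if r.activated_by in entities]
    return [r.name for r in activated if r.constraint(situation)]


resources = [
    InformationResource("Known compiler issues", "JavaCompiler", lambda s: True),
    InformationResource("Java tutorial", "Java",
                        lambda s: s.get("java_skill", 0) < 2),
    InformationResource("UML style guide", "UML", lambda s: True),
]
entities = {"Java", "JavaCompiler"}  # entities in the activity representation

novice = triggered_resources(resources, entities, {"java_skill": 1})
expert = triggered_resources(resources, entities, {"java_skill": 3})
print(novice)
print(expert)
```

Because triggering is a boolean decision, a resource whose constraint is always true is guaranteed to be presented whenever it is activated; the second step, launching the query command, may then use either exact or soft-matching retrieval, depending on the information source.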
OntoBrokerBased Reactive Agent Support
Schnurr et al. describe an approach based on OntoBroker³ for Reactive Agent Support [SSS99][SS00]. OntoBroker is used to define a domain ontology and to manage an archive of ontology-annotated documents. In addition, OntoBroker scans the documents for facts, stores them in a database, and infers facts from the database using a built-in inference engine. Business processes are defined as SGML nets (a special kind of Petri net). Processes are represented by transitions; predicates based on document contents define when a transition may be executed. Queries (called context-based views) to the database can be associated with transitions and places.

The approach focuses on strongly-structured processes, as the planning of activities and remodeling of SGML nets is not supported. Furthermore, the approach is restricted to F-Logic-based queries to one central repository of annotated documents. For PRIME, F-Logic-based queries to an OntoBroker repository form only one of many possibilities to retrieve information; alternatively, it can provide information retrieved from standard information retrieval systems, relational databases, or case-based reasoning systems⁴. The latter in particular are considered to be of prime importance for experience management within software organizations [Tau00]. Furthermore, the proposed SGML net-based approach does not facilitate an explicit representation of activities. Rather, the activity states are implicitly defined by the state of document attributes. As a consequence, queries can only reference attributes of the document currently being modified by an activity (i.e. transition).

² In fact, EULE is a single-user system and does not support workflows performed by project teams.
³ OntoBroker is a commercially available F-Logic interpreter (www.ontoprise.com).

WorkBrain
In [WWT98], Wargitsch et al. present the OM-based flexible WFMS WorkBrain that provides integrated access to different information and communication services. These include a CBR system (storing former workflow cases), a workflow issue-based information system (WIBIS), a mechanical design database, an electronic product catalogue, a know-how database for engineering solutions as well as a traditional DMS.
WorkBrain supports both structural planning and enactment tasks: e.g. workflow construction is supported by retrieving similar former workflow cases, whereas enactment tasks are supported by retrieving documents created in former workflows. However, only the WIBIS is process-oriented in the sense that processes are used to organize issue threads. Access to the other information systems cannot be tailored to specific tasks; in particular, generic queries that are instantiated for concrete tasks cannot be modeled. While the OM comprises different information sources that have been made available, no process-specific usage is supported and no automatic query execution takes place, i.e. the OMIS is passive.

⁴ For an example integration with a CBR system, see [May01].

KnowMore
The KnowMore framework [ABS99] outlines a three-step deployment process for their workflow-enabled information delivery system. First, a commercial business process modeling tool is used to define a process
representation that can be enacted by a workflow engine. Second, knowledge-intensive tasks (KIT) within this process model are identified; these are enriched with KIT variables and with conditional, generic queries. KIT variables represent slots that have to be filled during process enactment, whereas queries represent potential information needs. During workflow enactment, the generic queries are instantiated with workflow parameters in the context of concrete tasks. After instantiation, the queries are executed by computer agents which encapsulate knowledge on how to retrieve information from a particular information source. The results can automatically be integrated into document templates that specify the input fields that have to be filled with retrieved information. In addition, KnowMore users can be presented with explanatory information on the values chosen/retrieved for the template’s input fields. The retrieval results are updated whenever the context in which they have been retrieved changes.

Like OntoBroker/SGML, the KnowMore approach focuses on strongly-structured processes and the automated integration of retrieved information, both of which are inadequate for software development processes. With KnowMore, only the enactors (but not the planners) of workflows are supported by the automated information retrieval, and the set of information needs is defined statically in the process model. KnowMore and PRIME also differ in their main strategy for knowledge delivery. The KnowMore system always automatically executes the whole set of information needs currently regarded as relevant, and then postprocesses the results. In PRIME, the agent is given the possibility to choose, from a set of offered information needs, the one that she considers relevant in her current situation.

The approach implemented in KnowMore is intended to support (automatically) filling in the structured document template, whereas PRIME is intended to support creative processes handling informally specified documents. In particular, the objective of information needs in PRIME is not to fill in the slots (i.e. attributes) of a document’s characterization object. On the contrary, the attributes are used to retrieve information items that help a human agent to successfully perform a creative activity.

DECOR
DECOR⁵ [ABN+01] builds upon the KnowMore framework, but addresses weakly-structured, knowledge-intensive processes which cannot be planned fully in advance. Similar to PRIME, an Information Assistant is proposed that observes the workflow and interprets modeled information needs specified in the process model in order to offer relevant information. The main focus of the DECOR project is to provide a practice-driven, "total solution" for the integration of information retrieval into workflow-embedded, knowledge-intensive tasks. To this end, the project utilizes available, consolidated modeling methods and information technology in combination with research results from the KnowMore approach. However, continuous information need evolution as facilitated with PRIME is not reported to be addressed by DECOR.

⁵ Delivery of context-sensitive organizational knowledge

6.1 Limitations
In general, the applicability of the approach presented in this thesis is restricted to software organizations with one or more of the following characteristics:
• direct communication (i.e. "coffee room-based" knowledge exchange) between the project team members is inhibited, either because their number is too large or because the team members are geographically dispersed
• frequent loss of expert knowledge because of a high staff turnover rate
• shared problem domains (e.g. concepts, tools, technologies etc.) can be identified for a sufficient number of employees, such that a shared vocabulary can be agreed upon

While software organizations with at least one of these characteristics exist (and have also realized their need for Knowledge Management, see e.g. [Schie01]), our approach is probably less useful for small organizations (e.g. staff size < 10).

Furthermore, the approach presented in this thesis demands a modeling effort on the part of the organization’s Knowledge Department that must not be underestimated. The approach is based on the construction of an explicit, activity-oriented process model that identifies different types of activities. In addition to these process types, product types and other domain-related entities (e.g. components, technologies, tools, etc.) are assumed to be defined whenever this seems appropriate for the capture and maintenance of new, recurrent information needs. Whether there is a pay-off for these efforts depends on several factors, e.g.:
• The quality of information sources available to the organization (both internal and external sources), including the quality of the individual information items contained in these sources.
• Characteristics of the retrieval mechanisms (e.g. recall and precision) provided by these information sources, limitations of the query languages provided, etc.
• Properties of the software organization’s domain (e.g. how quickly do domain concepts become irrelevant?).
• Inclination of the team members to build and use a shared vocabulary.
• The frequency with which modeled information needs are triggered and accessed by users, or the influence of the information items provided by a triggered information need during an activity on its successful enactment.

Only the last point can be influenced by the Knowledge Department, as it depends on the information needs that have been modeled by its knowledge engineers. If the recurrence frequency is sufficiently high, agents save time by not having to search for information on their own during their development activities; at the same time, knowledge brokers do not waste time repeatedly answering the same set of standard questions. If the information items provided by a set of modeled information needs are highly relevant for the successful enactment of an activity, the modeling effort can be justified by a corresponding quality improvement.

We based our approach on the assumption that the software organization is willing to deploy (or already has deployed) a Process-Centred Software Engineering Environment (PSEE). Since PSEEs had not been generally accepted by the software industry at the time of writing this thesis, this assumption is probably the biggest obstacle to a practical deployment of the approach. However, we argue that the demands made by our approach on the way in which the PSEE has to be used are comparatively small, as the PSEE is basically only supposed to maintain the personal lists of scheduled activities for every team member.
Considering the wide-spread use of "Personal Digital Assistants" to electronically manage to-do lists, the reluctance of software engineers to use this functionality to manage their list of development activities might vanish over time, especially if a noticeable advantage can be gained by being actively provided with useful information for each activity on this list.

6.2 Case Study Outline

To show how one could determine whether the benefits listed in Section 1.2 are achieved by the approach presented in this work under certain conditions, we propose the following reformulations of the expected benefits, together with an outline of a corresponding case study.

6.2.1 Quality Aspect
The original benefit formulation "Relevant information is not overlooked" is reformulated in the following way: The number of useful information items provided by the Information Assistant to the user is greater than the number of information items found by the user when he accesses the available information sources on his own. This hypothesis might be tested by the following case study:
1. Choose a software project that:
   • runs sufficiently long to give the Knowledge Department enough time for the identification and modeling of recurrent information needs
   • has a sufficient number of team members that can be split into two groups with comparable capabilities/expertise and responsibilities with regard to the type and number of activities that have to be performed throughout the project.
2. Let all team members of the project use PRIME to bootstrap the capture of information needs and the team members’ Information Assistants.
3. When a certain milestone is reached (e.g. estimated project half-time, or when the Knowledge Department signals that it has gathered a sufficient number of information needs):
   • Split the team members into two groups A and B with comparable capabilities/responsibilities
   • Disable the Information Assistants for one of these groups (without loss of generality, Group A)
   • Group B will continue to use their Information Assistants (possibly causing the KD to capture new information needs or to update existing ones).
4. Information that would have been offered by the Information Assistants of Group A members during their activities is silently traced by PRIME. Let $\mathit{Info}_{\text{system}}$ be the set of information items that would have been offered.
5. Whenever members of Group A search for information items during their current software engineering activity, they mark those information items that they consider as useful for their activity. Let $\mathit{Info}_{\text{group}}$ be the set of information items marked useful by members of Group A.
6. Conduct a post-mortem analysis with (a representative subset of) all team members at the end of the project. The set $\mathit{Info}_{\text{system}}$ of information items that would have been offered by the Information Assistants of Group A members is analyzed for each of their activities.
   • Let $n_{\text{system}}$ be the number of information items in $\mathit{Info}_{\text{system}}$ for which Group A members agree⁶ that they would have been useful for them during their respective activities
   • Let $n_{\text{group}}$ be the number of information items in $\mathit{Info}_{\text{group}}$ for which the Group A member who originally marked the item as useful is still convinced that it was useful for him during his activity.
   • Hypothesis: $n_{\text{system}} > n_{\text{group}}$
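Once the traces and the post-mortem judgments are available, the counting in Step 6 is mechanical. The following is a minimal sketch, assuming that information items are identified by strings and that the post-mortem analysis yields, for each of the two sets, the subset of items confirmed as useful; the function name and the sample identifiers are hypothetical.

```python
def quality_hypothesis(info_system: set, confirmed_system: set,
                       info_group: set, confirmed_group: set):
    """Return (n_system, n_group, hypothesis_holds) for the quality aspect.

    info_system      -- items PRIME would have offered to Group A (Step 4)
    confirmed_system -- items Group A members agree would have been useful
    info_group       -- items Group A members marked as useful (Step 5)
    confirmed_group  -- items still considered useful in the post-mortem
    """
    n_system = len(info_system & confirmed_system)
    n_group = len(info_group & confirmed_group)
    return n_system, n_group, n_system > n_group


# Hypothetical post-mortem data for a single activity.
result = quality_hypothesis(
    info_system={"i1", "i2", "i3", "i4"},
    confirmed_system={"i1", "i2", "i4", "i9"},   # "i9" was never offered, so it is ignored
    info_group={"i2", "i5"},
    confirmed_group={"i2"},
)
```

Intersecting with the original sets ensures that only items that were actually traced or marked during the project can contribute to the counts.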
Use Group B as a "control group" in order to validate stable system behaviour (i.e. monitor and compare the number of relevant information items offered by their Information Assistants).

6.2.2 Efficiency Aspect
The original benefit formulation "Time spent on searching is reduced" is reformulated in the following way: Relevant information is found faster when using PRIME than without it. This hypothesis might be tested by the following extensions to the case study outlined in Section 6.2.1:
• In addition to Step (4), measure the cumulative time $t_{\text{system}}$ needed by PRIME to retrieve the information items for Group A members during their activities.
• In addition to Step (5), measure the time $t_{\text{group}}$ that members of Group A spent on searching for information items.
⁶ Whether an information item would actually have been useful might be subject to some discussion. However, the final decision should be left to the team member who performed the activity, as the Information Assistant is primarily intended to satisfy individual information needs.
Then the post-mortem analysis in Step (6) is intended to validate the following hypothesis:

$$\frac{t_{\text{system}}}{n_{\text{system}}} < \frac{t_{\text{group}}}{n_{\text{group}}}$$
i.e., the time required by PRIME for information delivery per useful information item is shorter than the time spent on searching by Group A members per actually useful information item found by them during their software engineering activities. Use Group B as a "control group" to compare the amount of time spent on searching by them (in addition to monitoring system behaviour in terms of the number of relevant information items offered by their Information Assistants).

6.2.3 Convenience Aspect
The expected benefit "Searching for information is more convenient" might be analyzed by conducting a survey among the members of Group B at the end of the case study outlined in Section 6.2.1. The survey could be based on a questionnaire that allows Group B members to express their subjective opinion on whether they found working with the Information Assistant preferable to the way in which they had to search for information during their former software engineering projects. It is important that the questionnaire distinguishes between the initial "bootstrapping phase" and the phase in which PRIME is supposed to be "fully operational", starting with the disabling of the Information Assistant for Group A members. Apart from the usual "tool mastery burden" caused by any new tool that is introduced into an organization, the benefit of PRIME during bootstrapping is likely to be limited to providing team members with personal, activity-specific information items ("bookmarks"). As this requires that the team members first find these items on their own, the convenience gain will probably be considered to be comparatively low during this phase.

As the approach presented in this thesis is intended for sufficiently large software organizations (see the influencing factors listed in Section 6.1), the case studies outlined above would have to be conducted in an appropriate industry context, which would have exceeded the boundaries of this work. Conducting the case studies in a smaller setting (e.g. with a group of students at our university) might have a psychological benefit, but carries a high risk of not yielding statistically significant results for this type of work, as required for a scientifically sound evaluation. Hence, we decided against a statistical evaluation of the experience gained with our approach so far as part of this thesis. However, the need for an active information delivery as provided by PRIME within the context of a PSEE has been acknowledged by an industry study conducted with MILOS at DaimlerChrysler (Ulm).
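Should such a study be carried out, the efficiency hypothesis of Section 6.2.2 reduces to comparing two ratios of measured time to the number of useful items. The sketch below is one possible way to evaluate it; the function name, the units (seconds) and the sample values are assumptions made for illustration only. Cross-multiplication is used so that no division is needed, and the check is refused when either count is zero, because the ratio is then undefined.

```python
from typing import Optional


def efficiency_hypothesis(t_system: float, n_system: int,
                          t_group: float, n_group: int) -> Optional[bool]:
    """Check t_system / n_system < t_group / n_group.

    Returns None when either count is zero, since the time per useful
    item is undefined in that case.
    """
    if n_system <= 0 or n_group <= 0:
        return None
    # For positive counts, a/b < c/d is equivalent to a*d < c*b.
    return t_system * n_group < t_group * n_system


# Hypothetical measurements in seconds: PRIME needed 12 s in total for
# 3 useful items; Group A searched 900 s in total for 1 useful item.
holds = efficiency_hypothesis(t_system=12.0, n_system=3, t_group=900.0, n_group=1)
undefined = efficiency_hypothesis(t_system=12.0, n_system=0, t_group=900.0, n_group=1)
```

A result of this kind only states whether the inequality holds for the measured values; it says nothing about statistical significance, which is the concern raised above.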
CHAPTER 7
Summary & Outlook
This chapter recapitulates the main contributions of this thesis and concludes with an outlook on future work.
For the successful enactment of software development processes, it is essential that process participants are provided just-in-time with all information available that is relevant and useful for their current activities. The difficulty is rooted in the continuous change of both process characteristics and process environment:
1. participants are continuously starting new activities, changing roles during the activities, or dropping out of activities.
2. the characteristics of activities change frequently, e.g. because of changes to relevant input products, or changes to the set of tools being used.
3. the process environment changes frequently: both inside and outside the organization in which the process occurs, new information is continuously created that might be highly relevant to the process participants.

Standard information filtering/retrieval approaches are inadequate in such situations: the passive query-return model typically leads to under-utilization of available information [MR97], e.g. because it assumes agents to know about the existence of relevant information sources, as well as when and how to query them appropriately. While information filtering approaches based on long-term queries or user profiles address issue (3), they fail to distribute relevant documents because of (1) and (2), unless the long-term queries are continuously updated in order to reflect the changes.

In order to address these problems, we decided to utilize the organization’s process model as the basis to coordinate the Knowledge Management activities required for a systematic, active delivery of information. The thesis presented a detailed life-cycle model for Software Engineering Process-Oriented Knowledge Management (SE-POKM). This model specifies the activities and responsibilities of three different stakeholders: the Process Group (PG; focusing on process quality and improvement), the Knowledge Department (KD; responsible for knowledge management within the organization), and process participants (performing activities and requiring access to available knowledge).

At the basis of the life-cycle model for SE-POKM lie (i) an explicit representation of the organization’s process model, and (ii) an explicit representation of meta-knowledge on what information might be useful for participants to successfully perform the classes of activities described in this process model. In this work, we introduced a representation formalism for this kind of meta-knowledge in the form of situation-dependent, recurrent information needs that typically arise during software development activities. Essential components of this information need representation are:
• a question (or question template) in natural language that denotes the information need
• a structured precondition that specifies the situation in terms of constraints on the activity, activity-related entities (e.g. products), the participant’s role in this activity, as well as the participant’s profile (e.g. his skills, preferences, etc.)
• alternative ways to potentially satisfy the information need in the form of information source usage recommendations that contain parameterized, executable query commands to available information sources.

The organization’s process model serves as a starting point to elicit information needs that are likely to arise for each process (activity) type; in addition, it is used to maintain the captured information needs by associating them with appropriate process model entities. In the presence of process type-specialization, the organization scheme presented in this work allows inheritance of information needs along the specialization relation. Together with the concept of generic (i.e. parameterized) information needs, this facilitates reuse of information need definitions in sub-types. In order to prevent the inheritance of information needs that should be refined in correspondence to the level of detail reflected by a sub-type, our approach provides an explicit specialization relation on information needs.

Depending on the complexity of the organization’s domain (measured in terms of the number of different products, tools and technologies that are
handled by the participants during activities of a given type), the large number of information needs associated with one process model entity is likely to become difficult to maintain over time. In order to address this scaling problem, we extended the basic process-oriented organization scheme to encompass additional domain entities (e.g. product types, tools, technologies, etc.) that can be referenced by activity representations, giving rise to an incremental construction of a software organization-specific domain ontology.

Based on this information need organization scheme, we presented a retrieval mechanism in order to automatically determine the set of information needs that are likely to arise in a specific situation. Given the characterizations of a process participant and an activity that he has been assigned to perform in a certain role, the mechanism determines the set of information needs via an activate/trigger strategy: first, the domain entities present in the object-oriented activity characterization activate all information needs associated with these entities. In a second step, the preconditions of activated information needs are evaluated: only those whose precondition holds in the current situation are triggered, i.e. supposed to be presented to the process participant.

Thus, in contrast to other Process-Oriented Knowledge Management approaches, our approach does not attempt to directly retrieve all information items that might be useful/relevant for the process participant during his activity. Instead, we assume that the triggered information needs are presented to the participant in the form of a list of corresponding textual questions. From this list, the participant is assumed to choose the one that corresponds best to his current information need. For this chosen information need, the predefined information source usage recommendations are executed (i.e.
the specified query commands are instantiated and sent to appropriate information systems, or contact information for human subject-matter experts is displayed) to provide the process participant with information items that potentially satisfy his information need.

Because of the dynamic nature of software processes and their changing environment, it seems unrealistic to assume that a fixed set of information needs can be identified and modeled in advance that covers all information needs that will actually arise. Therefore, our life-cycle model for SE-POKM specified an evolution phase in which the information need model is continuously updated to better reflect the participants’ actual information needs.

In order to capture actual information needs that currently arise for participants during process enactment, the life-cycle model presented in this thesis proposed the deployment of a forum that is operated by the Knowledge Department. Participants can post requests for information to this forum or provide feedback on information needs presented to them automatically. In addition to replying to information requests with appropriate information items, members of the Knowledge Department are assumed to identify information requests that are likely to recur. The formalization of these recurrent requests as information needs (i.e. including potential ways to automatically retrieve information to satisfy them) should be in the interest of both the process participants and the Knowledge Department: on the one hand, participants can be provided automatically during future activities with information that satisfies their information needs, without having to contact the Knowledge Department or to formulate queries to information systems manually. On the other hand, the Knowledge Department will be spared from repeatedly answering standard questions (especially those of new participants), so that human experts in the KD need only be consulted for new, more difficult problems.

The concepts presented in this thesis have been implemented in a system called PRIME (PRocess-oriented Information resource Management Environment). PRIME provides tool support for process participants and members of the Knowledge Department. PRIME is designed to be coupled with the organization’s Process-Centred Software Engineering Environment (PSEE), such that it can provide users with activity- and situation-specific information from information sources available to the organization. As a proof-of-concept implementation, PRIME has been coupled with the PSEE MILOS [HKM01].
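The information need representation and the activate/trigger strategy summarized in this chapter can be sketched in a few lines. This is an illustrative sketch only and not PRIME's implementation: preconditions are modeled as plain Python predicates rather than F-Logic formulas, query commands are simple string templates, and all class, function and entity names (`Situation`, `InformationNeed`, `ModelEntity`, `triggered_needs`, "OODesign", etc.) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Situation:
    """Characterization of a participant performing an activity in a role."""
    activity_type: str
    role: str
    entities: set                      # domain entities referenced by the activity
    skills: set = field(default_factory=set)


@dataclass
class InformationNeed:
    question: str                                 # natural-language question (template)
    precondition: Callable[[Situation], bool]     # stand-in for a structured precondition
    query_templates: list                         # parameterized query commands
    specializes: Optional["InformationNeed"] = None

    def instantiate(self, s: Situation) -> list:
        """Fill the query templates with values from the current situation."""
        return [q.format(activity=s.activity_type, role=s.role)
                for q in self.query_templates]


@dataclass
class ModelEntity:
    """A process type or other domain entity that information needs are attached to."""
    name: str
    parent: Optional["ModelEntity"] = None
    needs: list = field(default_factory=list)

    def all_needs(self) -> list:
        """Own needs plus inherited ones, minus inherited needs refined locally."""
        inherited = self.parent.all_needs() if self.parent else []
        refined = {id(n.specializes) for n in self.needs if n.specializes is not None}
        return [n for n in inherited if id(n) not in refined] + self.needs


def triggered_needs(s: Situation, model: dict) -> list:
    # Step 1 (activate): gather the needs attached to the activity type and
    # to every domain entity referenced by the activity characterization.
    activated = []
    for name in [s.activity_type, *sorted(s.entities)]:
        if name in model:
            activated.extend(model[name].all_needs())
    # Step 2 (trigger): keep only needs whose precondition holds in this situation.
    return [n for n in activated if n.precondition(s)]


# Hypothetical model: a general design need, refined for the sub-type "OODesign".
general = InformationNeed("Which design guidelines apply?", lambda s: True,
                          ["search guidelines for {activity}"])
refined = InformationNeed("Which OO design guidelines apply?", lambda s: True,
                          ["search OO guidelines for {activity} as {role}"],
                          specializes=general)
expert = InformationNeed("Who is an expert for UML?", lambda s: "UML" not in s.skills,
                         ["find expert for UML"])
design = ModelEntity("Design", needs=[general])
oo_design = ModelEntity("OODesign", parent=design, needs=[refined])
uml = ModelEntity("UML", needs=[expert])
model = {e.name: e for e in (design, oo_design, uml)}

novice = Situation("OODesign", "designer", {"UML"})
offered = triggered_needs(novice, model)
questions = [n.question for n in offered]   # the list shown to the participant
chosen = offered[0]                         # the participant picks one question
queries = chosen.instantiate(novice)        # only now are queries instantiated
```

Note that queries are instantiated only for the need the participant has chosen, which mirrors the offer-and-choose strategy described above; a participant whose profile already lists UML as a skill would not be offered the expert question at all.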
Bootstrapping Knowledge Management with PRIME
In particular, PRIME allows a smooth introduction of Knowledge Management services into the every-day work practice of process participants. To begin with, the system can be used by participants to maintain activity-specific bookmarks (i.e. URL links to favourite documents), providing participants with an activity-oriented way to organize and quickly access their documents. In addition, PRIME’s Information Request Forum serves as a platform for activity-specific communication with the Knowledge Department. Alternatively, if a KD has not yet been established within the organization, requests for information can also be posted to a forum of collaborators that support each other by answering other persons’ questions. At this stage, no modeling effort is required other than the effort already spent on providing participants with to-do lists; hence, a Knowledge Department is not a prerequisite for this level of service. Only when
• users start to express an interest in being provided with certain bookmarks on a systematic basis during a certain class of activities, or
• the Process Group requests that users should be provided with certain information during a class of activities (captured by them in the form of a corresponding process type),
the need for a KD arises to capture and formalize these requests in the form of explicit information resources, and to provide access to the requested information from appropriate information sources.

7.1 Outlook
Future work has to be conducted mainly in two directions: (i) the empirical evaluation of the approach presented in this thesis, and (ii) conceptual extensions based on the results of this evaluation. An evaluation of the approach will have to be based on appropriate tool support, as the life-cycle model focuses mainly on the automation of situation-specific information distribution. At the time of writing this thesis, the implementation of PRIME has reached a stage where students have started to use it during their implementation activities on MILOS. An empirical evaluation of the benefits expected from a deployment of PRIME within software organizations has to be the object of future research.

In anticipation of a likely result of such an empirical evaluation, future extensions of the work presented here will need to address the high effort required for information need modeling. As an alternative to the explicit representation of logical preconditions, techniques known from Collaborative Filtering (see e.g. [Lie95]) or Case-Based Reasoning (see e.g. [RA01][Ric01]) should be adapted and integrated with our mechanism for situation-specific information need retrieval. Using this technology could make it possible to provide participants with information items that other participants with ’similar’ information needs found useful, or with former information needs that (other) participants had during ’similar’ situations. It is to be hoped that such extensions could narrow the current gap between the low effort required to maintain the participants’ personally preferred information resources during their activity, and the relatively high effort required to model fully-specified, generic information needs.
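As a rough illustration of the second direction, the following sketch recommends information items that were marked useful in ’similar’ past situations, without any explicitly modeled precondition. It is a deliberately simple stand-in for the Collaborative Filtering and Case-Based Reasoning techniques cited above: situations are reduced to sets of feature labels, similarity is the Jaccard coefficient, and the function names, the threshold and the sample history are all hypothetical.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Share of features that two situations have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(current: set, history: list, min_sim: float = 0.5) -> list:
    """Rank items that were marked useful in sufficiently similar situations.

    history is a list of (situation_features, item_id) pairs, one per
    usefulness mark recorded for some participant.
    """
    scores = defaultdict(float)
    for features, item in history:
        sim = jaccard(current, features)
        if sim >= min_sim:
            scores[item] += sim
    # Highest accumulated similarity first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical usefulness marks collected from earlier activities.
history = [
    ({"design", "UML", "Java"}, "uml-style-guide"),
    ({"design", "UML"}, "uml-style-guide"),
    ({"design", "UML", "Java"}, "java-naming-conventions"),
    ({"testing", "JUnit"}, "junit-howto"),
]
recommended = recommend({"design", "UML", "Java"}, history)
```

Such a scheme trades the precision of modeled preconditions for a much lower modeling effort, which is the gap the outlook hopes to narrow.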
References [Aar98] R. J. Aarts. A CBR Architecture for Project Knowledge Management. In B. Smyth, P. Cunningham (Eds.): Advances in Case-Based Reasoning, Proceedings of the 4th European Workshop on Case Based Reasoning (EWCBR-98), Dublin, Ireland, September 1998. LNCS 1488, Springer, 1998. [ABM+00] A. Abecker, A. Bernardi, H. Maus and C. Wenzel. Information Support for Knowledge-Intensive Business Processes - Combining workflow with document analysis and information retrieval. AAAI Workshop on Bringing Knowledge to Business Processes, 20-22 March, 2000. [ABN+01] A. Abecker, A. Bernardi, S. Ntioudis, G. Mentzas, R. Herterich, C. Houy, S. Müller and M. Legal. The DECOR Toolbox for Workflow-Embedded Organizational Memory Access, In: Proc. of the 3rd Int. Conf. on Enterprise Information Systems (ICEIS 2001), Portugal, July 7-10 2001, Vol. 1, 2001. [ABS99] A. Abecker, A. Bernardi and M. Sintek. Enterprise Information Infrastructures For Active, Context-Specific Knowledge Delivery. ECIS’99 - The 7th European Conference on Information Systems, Copenhagen, Denmark, June 1999. [ABT98] K.-D. Althoff, F. Bomarius and C. Tautz. (1998). Using Case-Based Reasoning Technology to Build Learning Software Organizations. In Proceedings of the 1st Interdisciplinary Workshop on Building, Maintaining, and Using Organizational Memories (OM-98), 13th European Conference on AI (ECAI’98), Brighton, http://SunSITE.Informatik.RWTH-Aachen.de/Publications/CEUR-WS/Vol-14. [Ack93] Mark S. Ackerman. Answer Garden: A Tool for Growing Organizational Memory. Massachusetts Institute of Technology, Ph. D. Thesis, 1993. [ADE+99] D.F.J. Angele, S. Decker, M. Erdmann, H. Schnurr, S. Staab, R. Studer and A. Witt. On2broker: Semantic-Based Access to Information Sources at the WWW. Proceedings of the World Conference on the WWW and Internet (WebNet 99), Honolulu, Hawaii, USA, October 1999. [AH99] M. S. Ackerman and C. Halverson. Organizational Memory: Processes, Boundary Objects, and Trajectories. 
Proceedings of the IEEE Hawaii International Conference of System Sciences (HICSS 99), January, 1999. [AHM+02] A. Abecker, K. Hinkelmann, H. Maus and H.-J. Müller (Hrsg.). Geschäftsprozessorientiertes Wissensmanagement, Springer, 2002. [AM95] Mark S. Ackerman and E. Mandel. Memory in the Small: An Application to Provide Task-Based Organizational Memory for a Scientific Community. Proceedings of the IEEE Hawaii International Conference of System Sciences (HICSS 95), January, 1995, vol. IV, pp. 323-332.
[AM99] M. S. Ackerman and E. Mandel. Memory in the Small: Combining Collective Memory and Task Support for a Scientific Community. Journal of Organizational Computing and Electronic Commerce, 1999, 9(2-3), pp. 105-127.
[Ara93] G. Arango. Domain Analysis Methods. In Software Reusability, Ellis Horwood, 1993.
[Bas89] V.R. Basili. The Experience Factory: packaging software experience. In Proceedings of the Fourteenth Annual Software Engineering Workshop, NASA Goddard Space Flight Center, Greenbelt, MD 20771, 1989.
[BC92] N. J. Belkin and W. B. Croft. Information filtering and information retrieval: Two sides of the same coin? Communications of the ACM 35(12): 29-38, 1992.
[BCR94] V.R. Basili, G. Caldiera and H. D. Rombach. Experience Factory. In Encyclopedia of Software Engineering, volume 1, pp. 469-476, John Wiley & Sons, 1994.
[BDF96] S. Bandinelli, E. Di Nitto and A. Fuggetta. Supporting Cooperation in the SPADE-1 Environment. IEEE Transactions on Software Engineering, Vol. 22, No. 12, December 1996, pp. 841-865.
[BFG93] S. Bandinelli, A. Fuggetta and S. Grigolli. Process Modeling-in-the-large with SLANG. In IEEE Proceedings of the 2nd International Conference on the Software Process, Berlin (Germany), 1993.
[BGH+98] F. Bendeck, S. Goldmann, H. Holz and B. Kötting. Coordinating Management Activities in Distributed Software Development Projects, Proceedings of the 7th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WET ICE '98), IEEE Computer Society Press, ISBN 0-8186-8751-7, pp. 33-38, 1998.
[BJM+00] D. S. Batory, C. Johnson, B. MacDonald, and D. von Heeder. Achieving extensibility through product-lines and domain-specific languages: A case study. In International Conference on Software Reuse, Springer, LNCS 1844, pp. 117-136, 2000.
[Boe87] B. Boehm. A Spiral Model of Software Development and Enhancement. Computer, 20(9), pp. 61-72, 1987.
[BR91] V.R. Basili and H. D. Rombach. Support for comprehensive reuse. Software Engineering Journal, September 1991.
[BT96] G.A. Bolcer and R.N. Taylor. Endeavors: A Process System Integration Infrastructure. Proceedings IEEE Computer Soc. International Conference on Software Process (ICSP4), Los Alamitos, California, USA, 1996, pp. 76-89.
[BW98] U. Becker-Kornstaedt and R. Webby. Towards a Comprehensive Schema Integrating Software Process Modelling and Software Measurement. IESE Report No. 021.97/E, Fraunhofer IESE, 1998.
[CFH+99] A. Cushman, M. Fleming, K. Harris, R. Hunter and B. Rosser. The Knowledge Management Scenario: Trends and Directions for 1998-2003, Gartner Group Strategic Analysis Report R-07-7706, 1999.
[CFS97] K. Cole, O. Fischer and Ph. Saltzman. Just-in-Time Knowledge Delivery, Communications of the ACM, Vol. 40, No. 7, 1997.
[CG99] G. Cugola and C. Ghezzi. Design and implementation of PROSYT: a distributed process support system. In Proceedings of the 8th International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, California, USA, June 1999.
[CHL+94] R. Conradi, M. Hagaseth, J. O. Larsen, M. Nguyen, G. Munch, P. Westby and W. Zhu. EPOS: Object-Oriented and Cooperative Process Modeling. PROMOTER book: A. Finkelstein, J. Kramer and B. A. Nuseibeh (Eds.). Software Process Modeling and Technology, 1994, pp. 33-70. Advanced Software Development Series, Research Studies Press Ltd. (John Wiley).
[CHR+98] A. Cichocki, A. S. Helal, M. Rusinkiewicz and D. Woelk. Workflow and Process Automation: Concepts and Technology. Kluwer, 1998.
[DC99] C. J.-N. Despres and D. Chauvel. Mastering Information Management: Part Six Knowledge Management, Financial Times, March 8, 1999, pp. 4-6.
[Del00] B. Dellen. Change Impact Analysis Support for Software Development Processes, Ph.D. Thesis, Fachbereich Informatik, Universität Kaiserslautern, 2000.
[DHM+97] B. Dellen, H. Holz, F. Maurer and G. Pews. Knowledge-Based Techniques to Improve the Flexibility of Workflow Systems (in German), Software Management 1997, Reihe Wirtschaftsinformatik, B.G. Teubner Verlagsgesellschaft, Germany, 1997.
[DHP97] B. Dellen, H. Holz and G. Pews. Knowledge Management in CoMo-Kit (in German), Proceedings of the KI-97, 1997.
[DJB97] T. Davenport, S.L. Jarvenpaa and M.C. Beers. Improving Knowledge Work Processes. Sloan Management Review, Vol. 37, No. 4, pp. 53-65, 1997.
[DM95] B. Dellen and F. Maurer. Integrating planning and execution in software development processes. In Proceedings of the Workshop on Enabling Technologies: Infrastructures for Collaborative Enterprises (WET ICE ’96). IEEE CS Press, June 1996.
[DM98] B. Dellen and F. Maurer. Change Impact Analysis Support for Software Development Processes. Journal of Applied Software Technology, International Academic Publishing, 1998.
[DMM+97] B. Dellen, F. Maurer, J. Münch and M. Verlage. Enriching Software Process Support by Knowledge-based Techniques. In International Journal of Software Engineering and Knowledge Engineering, Volume 7, No. 2, 1997, pp. 185-215.
[DP98] T. Davenport and L. Prusak. Working Knowledge: How organizations manage what they know, Harvard Business School Press, Boston, 1998.
[EDM98] K. El Emam, J.-N. Drouin and W. Melo. SPICE: The Theory and Practice of Software Process Improvement and Capability Determination. IEEE Computer Society, Los Alamitos, CA, 1998.
[EGR91] C. A. Ellis, S. J. Gibbs and G. L. Rein. Groupware - Some Issues and Experiences. Communications of the ACM, 34:1, pp. 38-58, 1991.
[Fel00] R.L. Feldmann. On Developing a Repository Structure Tailored for Reuse with Improvement. In G. Ruhe, F. Bomarius (eds.): Learning Software Organization - Methodology and Applications. Lecture Notes in Computer Science #1756, Springer-Verlag, 2000.
[Fer93] C. Fernström. Process WEAVER: Adding Process Support to UNIX. Proceedings of the 2nd International Conference on Software Process, IEEE Computer Society Press, March 1993.
[Fow97] M. Fowler. Analysis Patterns. Addison-Wesley, 1997.
[GH98] J. C. Grundy and J. G. Hosking. Serendipity: integrated environment support for process modelling, enactment and work coordination. Automated Software Engineering: Special Issue on Process Technology 5(1), Kluwer Academic Publishers, pp. 27-60, January 1998.
[GHJ+96] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-oriented Software. Addison-Wesley, Reading, 1996.
[GJ96] P. K. Garg and M. Jazayeri. Process-centered Software Engineering Environments. IEEE Computer Society Press, 1996.
[GMH99a] S. Goldmann, J. Münch and H. Holz. A Meta-Model for Distributed Software Development, Proceedings of the 8th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WET ICE '99), IEEE Computer Society Press, ISBN 0-7695-0365-9, pp. 48-53, 1999.
[GMH99b] S. Goldmann, J. Münch and H. Holz. MILOS: A Model of Interleaved Planning, Scheduling, and Enactment, presented at the ICSE '99 Workshop on Software Engineering over the Internet. Web proceedings only: http://sern.cpsc.ucalgary.ca/~maurer/ICSE99WS/Program.htm.
[GMH00] S. Goldmann, J. Münch and H. Holz. Distributed Process Planning Support with MILOS. Int. Journal of Software Engineering and Knowledge Engineering, Vol. 10, No. 4, pp. 511-525, 2000.
[Hac00] B. Hackett. Beyond Knowledge Management: New Ways to Work, The Conference Board Inc, March 2000.
[HKM01] H. Holz, A. Könnecker and F. Maurer. Task-Specific Knowledge Management in a Process-Centred SEE, in K.-D. Althoff, R.L. Feldmann and W. Müller (eds.): Advances in Learning Software Organizations, Proc. of the Third International Workshop (LSO 2001), Kaiserslautern, Germany, September 12-13, 2001. Springer, LNCS 2176, ISBN: 3-540-42574-8.
[Hol95] D. Hollingsworth. Workflow Management Coalition: The Workflow Reference Model. Workflow Management Coalition, Document Number TC00-1003, January 1995, http://www.wfmc.org/standards/docs/tc003v11.pdf.
[HSW98] F. Houdek, K. Schneider, and E. Wieser. Establishing Experience Factories at Daimler-Benz: An Experience Report, In Proc. of 20th International Conference on Software Engineering, Kyoto, Japan, April 1998, pp. 443-447.
[JGJ97] I. Jacobson, M. Griss, and P. Jonsson. Software Reuse: Architecture, Process and Organization for Business Success, ACM Press, 1997.
[JPL98] M. L. Jaccheri, G. P. Picco and P. Lago. Eliciting Software Process Models with the E3 Language, ACM Transactions on Software Engineering and Methodology, Vol. 7(4), 1998.
[KB97] B. Krulwich and C. Burkey. The InfoFinder Agent: Learning User Interests through Heuristic Phrase Extraction, IEEE Intelligent Systems, Vol. 12, No. 5, Sept./Oct. 1997, pp. 22-27.
[KLW95] M. Kifer, G. Lausen, and J. Wu. Logical Foundations of Object-Oriented and Frame-Based Languages. Journal of the ACM, 42:741-843, 1995.
[Kno96] C. A. Knoblock. Building a Planner for Information Gathering: A Report from the Trenches. AIPS 1996.
[Kön00] A. Könnecker. Extending a Process-Centred SEE by Context-Specific Knowledge Delivery, Diploma Thesis, University of Kaiserslautern, 2000.
[LAB+99] M. Liao, A. Abecker, A. Bernardi, K. Hinkelmann and M. Sintek. Ontologies for Knowledge Retrieval in Organizational Memories. Workshop on Learning Software Organizations (LSO) at SEKE’99, Kaiserslautern, Germany, June 1999.
[Law97] P. Lawrence (Ed.). WfMC Workflow Handbook 1997. John Wiley & Sons, 1997.
[LBH+99] D. Leake, L. Birnbaum, K. Hammond, C. Marlow and H. Yang. Task-based Knowledge Management. Technical Report on Exploring the Synergies of Knowledge Management and Case-based Reasoning, AAAI-99 KM/CBR Workshop, October 1999, pp. 35-39.
[LC98] M. Letizia and R. Conradi. Techniques for Process Model Evolution in EPOS. IEEE Transactions on Software Engineering, Vol. 19, No. 12, December 1998, pp. 1145-1156.
[LHR95] C. M. Lott, B. Hoisl and H. D. Rombach. The use of roles and measurement to enact project plans in MVP-S. In W. Schäfer, editor, Proceedings of the Fourth European Workshop on Software Process Technology, pages 30-48, Noordwijkerhout, The Netherlands, April 1995.
[Lie95] H. Lieberman. Letizia: An Agent that assists web browsing. In Proc. of the 13th Int. Joint Conference on Artificial Intelligence, San Francisco, CA, Morgan Kaufmann, 1995.
[Lie99] J. Liebowitz. Knowledge Management Handbook, CRC Press, Boca Raton, FL, 1999.
[LR00] F. Leymann and D. Roller. Production Workflow - Concepts and Techniques. Prentice Hall, 2000.
[LSR+97] S. Luke, L. Spector, D. Rager and J. Hendler. Ontology-based web agents. Proceedings of First International Conference on Autonomous Agents, 1997.
[Mae94] P. Maes. Agents that Reduce Work and Information Overload. Communications of the ACM 37(7): 31-40, 1994.
[May00] W. May. How to Write F-Logic Programs in Florid, Internal Report, Institut für Informatik, University of Freiburg, Germany, 2000.
[May01] G. Mayer. Erweiterung einer prozeßorientierten Software-Entwicklungsumgebung um ähnlichkeitsbasiertes Knowledge-Management, Diploma Thesis, Universität Kaiserslautern, 2001.
[MDA98] D. W. McDonald and M. S. Ackerman. Just Talk to Me: A Field Study of Expertise Location. Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW ’98), November 1998, pp. 315-324.
[MDB+00] F. Maurer, B. Dellen, F. Bendeck, S. Goldmann, H. Holz, B. Kötting and M. Schaaf. Merging Project Planning and Web-Enabled Dynamic Workflow Technologies. IEEE Internet Computing, May/June 2000, pp. 65-74.
[MDH99] F. Maurer, B. Dellen and H. Holz. Process Support for Virtual Software Organizations, Proceedings of the 11th International Conference on Software Engineering and Knowledge Engineering (SEKE '99), Knowledge Systems Institute, ISBN 1-891706-01-2, 1999.
[MH99a] F. Maurer and H. Holz. Process-Oriented Knowledge Management for Learning Software Organizations, Proceedings of the 12th Knowledge Acquisition Workshop (KAW '99), Banff, Canada, 1999.
[MH99b] F. Maurer and H. Holz. Process-Centered Knowledge Organization for Software Engineering, In D.W. Aha and H. Munoz Avila (Eds.). Exploring Synergies of Knowledge Management and Case-Based Reasoning, A 1999 AAAI Workshop (Working Notes). (Technical Report AIC-99-008). Washington, DC, Naval Research Laboratory, Navy Center for Applied Research in Artificial Intelligence, 1999.
[MH02] F. Maurer and H. Holz. Integrating Process Support and Knowledge Management for Virtual Software Development Teams, Annals of Software Engineering, Vol. 14, Kluwer Academic Publishers, 2002 (in press).
[Mil] The project web page can be found at http://wwwagr.informatik.uni-kl.de/~milos/.
[MK99] D. E. Mahling and R. C. King. A goal-based workflow system for multiagent task coordination. Journal of Organizational Computing and Electronic Commerce, 1999, 9(1), pp. 57-82.
[MMP96] S. Mukhopadhyay, J. Mostafa and M. Palakal. An Adaptive Multi-level Information Filtering System. Proceedings of the Fifth International Conference on User Modeling, 1996, pp. 21-28.
[MR97] S. Mahe and C. Rieu. Towards a Pull-Approach of KM for Improving Enterprise Flexibility Responsiveness: A Necessary First Step for Introducing Knowledge Management in Small and Medium Enterprises. In Proceedings of the International Symposium on Management of Industrial and Corporate Knowledge (ISMICK ‘97), Compiegne, 1997.
[MSH+99] F. Maurer, G. Succi, H. Holz, B. Kötting, S. Goldmann and B. Dellen. Software Process Support over the Internet, Proceedings of the 21st International Conference on Software Engineering (ICSE '99), ACM Press, ISBN 1-58113-074-0, pp. 642-645, 1999.
[Mün01] J. Münch. Muster-basierte Erstellung von Software-Projektplänen, Ph.D. Thesis, Fachbereich Informatik, Universität Kaiserslautern, 2001.
[NFS98] F. Naumann, Ch. J. Freytag and M. Spiliopoulou. Quality-driven Source Selection using Data Envelopment Analysis, In Proc. of the 3rd Conference on Information Quality (IQ), Cambridge, MA, 1998.
[NHM00] P. Nour, H. Holz and F. Maurer. Ontology-based Retrieval of Software Process Experiences. Proceedings of the ICSE 2000 Workshop on Software Engineering over the Internet. Web proceedings only: http://sern.cpsc.ucalgary.ca/~maurer/icse2000ws/submissions/Submissions.html.
[Nis99] M. E. Nissen. Knowledge-Based Knowledge Management in the Reengineering Domain, Decision Support Systems (27)1-2, 1999.
[NKS00] M. E. Nissen, M. Kamel and K. Sengupta. Integrating Knowledge Management, processes and systems, AAAI Workshop on Bringing Knowledge to Business Processes, 20-22 March, 2000.
[OLe98a] D. E. O’Leary. Knowledge Management Systems: Converting and Connecting, IEEE Intelligent Systems, May-June, 1998, pp. 30-33.
[OLe98b] D. E. O’Leary. Using AI in Knowledge Management: Knowledge Bases and Ontologies, IEEE Intelligent Systems, May-June, 1998, pp. 34-39.
[Ost87] L. Osterweil. Software Processes are Software Too. In Proc. of the Ninth Int. Conf. of Software Engineering, Monterey CA, 1987, pp. 2-13.
[PCC+93] M.C. Paulk, B. Curtis, M. Chrissis and C.V. Weber. Capability Maturity Model, Version 1.1. IEEE Software, 10(4), pp. 18-27, 1993.
[PF87] R. Prieto-Diaz and P. Freeman. Classifying Software for Reusability. IEEE Software, 4(1), pp. 6-16, 1987.
[PSW92] B. Peuschel, W. Schäfer and S. Wolf. A Knowledge-based Software Development Environment Supporting Cooperative Work. International Journal on Software Engineering and Knowledge Engineering, 2(1), pp. 79-106, 1992.
[RA01] M. M. Richter and K.-D. Althoff. Similarity and Utility in Non-Numerical Domains, Mathematische Methoden der Wirtschaftswissenschaften, Physica-Verlag, pp. 403-413, 2001.
[Rem00] U. Remus. The role of process modeling in designing process-oriented knowledge management systems, AAAI Workshop on Bringing Knowledge to Business Processes, 20-22 March, 2000.
[Ric01] M. M. Richter. CBR: Past and Future - A Personal View. Invited Talk, International Conference on Case-Based Reasoning (ICCBR-2001), Vancouver, British Columbia, Canada, 30 July - 2 August 2001. http://wwwagr.informatik.uni-kl.de/~richter/
[RMS00] U. Reimer, A. Margelisch and M. Staudt. A Knowledge-Based Approach to Support Business Processes, AAAI Workshop on Bringing Knowledge to Business Processes, 20-22 March, 2000.
[RV95] H.-D. Rombach and M. Verlage. Directions in software process research. M. V. Zelkowitz (editor), Advances in Computers, vol. 41, pages 1-63. Academic Press, 1995.
[SAD99] R. Studer, A. Abecker and S. Decker. Informatik-Methoden für das Wissensmanagement, In Festschrift zum 60. Geburtstag von Prof. Dr. Wolffried Stucky, Teubner, 1999.
[Sau01] Th. Sauer. MILOS Project Trace. Project Thesis, Universität Kaiserslautern, 2001.
[Schie01] K. von Schierstedt. Praktischer Einsatz von Wissensmanagement von sd&m. Invited talk at the 1. Konferenz Professionelles Wissensmanagement - Erfahrungen und Visionen, Baden-Baden, 14-16 March 2001.
[SCK95] G. Sindre, R. Conradi, and E.-A. Karlsson. The REBOOT approach to software reuse. Journal of Systems and Software, 30, pp. 201-212, 1995.
[SFB] The web page of the "Sonderforschungsbereich 501" is http://wwwsfb501.informatik.uni-kl.de/.
[SGT+00] C. Schlenoff, M. Gruninger, F. Tissot, J. Valois, J. Lubell and J. Lee. The Process Specification Language (PSL): Overview and Version 1.0 Specification. NISTIR 6459, National Institute of Standards and Technology, Gaithersburg, MD, 2000. http://www.mel.nist.gov/psl/pubs/PSL1.0/paper.doc
[SM83] G. Salton and M. McGill. Introduction to Modern Information Retrieval, McGraw-Hill, 1983.
[SO97] M. Sutton, Jr. and L. Osterweil. The Design of a Next-Generation Process Language. In M. Jazayeri and H. Schauer (eds.), Software Engineering - ESEC/FSE'97, Proceedings, LNCS 1301, Springer, 1997, pp. 142-158.
[SPM94] W. Schaefer, R. Prieto-Diaz and M. Matsumoto. Software Reusability, Ellis Horwood, 1994.
[SS99] S. Staab and H. Schnurr. Knowledge and Business Processes: Approaching an Integration. Proceedings of the International Workshop on KM and OM (IJCAI-99), Stockholm, Sweden, 1999.
[SS00] S. Staab and H.-P. Schnurr. Smart Task Support through Proactive Access to Organizational Memory. Journal of Knowledge-based Systems 13(5). Elsevier, 2000.
[SSS99] H.-P. Schnurr, S. Staab and R. Studer. Ontology-based Process Support. Workshop on Exploring Synergies of Knowledge Management and Case-Based Reasoning (AAAI99). Technical Report, Menlo Park: AAAI.
[SSS+01] H.-P. Schnurr, S. Staab, R. Studer, G. Stumme and Y. Sure (Eds.). Professionelles Wissensmanagement - Erfahrungen und Visionen, Beiträge der 1. Konferenz Professionelles Wissensmanagement - Erfahrungen und Visionen, Baden-Baden, 14.-16. März 2001, Shaker Verlag, 2001.
[STO95] S. M. Sutton, Jr., P. L. Tarr and L. J. Osterweil. An Analysis of Process Languages, CMPSCI Technical Report 95-78, University of Massachusetts, 1995.
[SV99] W. Scacchi and A. Valente. Developing a Knowledge Web for Business Process Redesign. In Proceedings of the 12th Knowledge Acquisition Workshop (KAW '99), Banff, Canada, 1999.
[SZ95] E. W. Stein and V. Zwass. Actualizing Organizational Memory with Information Technology, Information Systems Research, 6(2), pp. 85-117, 1995.
[Tau00] C. Tautz. Customizing Software Engineering Experience Management Systems to Organizational Needs, Ph.D. thesis, Universität Kaiserslautern, 2000.
[Tau01] C. Tautz. Traditional Process Representations are Ill-Suited for Knowledge-Intensive Processes, In Proc. of the Int. Conf. on Case-Based Reasoning (ICCBR-01), Workshop on Processes and KM, Position Paper, 2001.
[TKP94] A. Tong, G. Kaiser and S. Popovich. A Flexible Rule-Chaining Engine for Process Based Software Engineering. 9th Knowledge-Based Software Engineering Conference, September 1994.
[UG96] M. Uschold and M. Gruninger. Ontologies: Principles, Methods, and Applications, The Knowledge Engineering Review, Vol. 11:2, pp. 93-136, 1996.
[UW91] G. R. Ungson and J. P. Walsh. Organizational Memories. Academy of Management Review, 16(1), pp. 57-91, 1991.
[VBM+96] M. Verlage, B. Dellen, F. Maurer and J. Münch. A synthesis of two software process support approaches. Proceedings of the 8th Conference on Software Engineering and Knowledge Engineering (SEKE-96), USA, June 1996.
[Ver94] M. Verlage. Multi-view modeling of software processes. B. C. Warboys (ed.), Proc. Third European Workshop on Software Process Technology, Springer Verlag, 1994, pp. 123-127.
[WF87] T. Winograd and F. Flores. Understanding Computers and Cognition, Addison-Wesley, 1987.
[Wii93] K.M. Wiig. Knowledge Management: Foundations. Schema Press, Arlington, Texas, 1993.
[Wol99] M. Wolverton. Task-Based Information Management, ACM Computing Surveys, Vol. 31, Number 2es, 1999.
[WSM+99] R. Webby, C. Seaman, M. Mendonca, V. Basili and Y.-M. Kim. Implementing an Internet-Enabled Software Experience Factory: Work in Progress. International Conference on Software Engineering (ICSE '99), 2nd Workshop on Software Engineering over the Internet, Los Angeles, CA, USA, 16-22 May, 1999. Web proceedings only: http://sern.ucalgary.ca/~maurer/ICSE99WS/Submissions/Webby/Webby.html
[WWT98] C. Wargitsch, T. Wewers and F. Theisinger. An Organizational-Memory-Based Approach for an Evolutionary Workflow Management System - Concepts and Implementation, Proceedings of the 31st Annual Hawaii International Conference on System Sciences, 1998, pp. 174-183.
List of Figures
Fig. 1.1: Motivational scenario
Fig. 2.1: External view on Knowledge Management
Fig. 2.2: Nissen life-cycle model
Fig. 2.3: Experience Factory Organization (adapted from [BCR94])
Fig. 2.4: Quality Improvement Paradigm cycle (adapted from [BCR94])
Fig. 2.5: Excerpt from a process type definition
Fig. 2.6: Software Development Support by a PSEE [GJ96]
Fig. 2.7: Dataflow between the three areas supported by a PSEE
Fig. 3.1: Simplified process model example
Fig. 3.2: Meta-knowledge organization based on process types
Fig. 3.3: Examples of information needs during planning
Fig. 3.4: Examples of information needs during enactment
Fig. 3.5: Example information need "ejb_tutorial_in"
Fig. 3.6: Process-oriented organization scheme
Fig. 3.7: Process type decomposition example
Fig. 3.8: Process type specialization example
Fig. 3.9: Computer science domain example
Fig. 3.10: Spreading activation example
Fig. 3.11: Two required topics:
Fig. 3.12: Illustration of the heuristic’s effect on the organization via a domain ontology
Fig. 3.13: Process type with parameters
Fig. 3.14: Product model example
Fig. 3.15: Parameter declarations referencing the view "design view"
Fig. 3.16: Illustration of the heuristic’s effect on the organization via a product model
Fig. 3.17: Information resources indexed by category graph
Fig. 3.18: F-Logic excerpt from the definition of the base class process
Fig. 3.19: F-Logic representation of the process type implementation_process
Fig. 3.20: F-Logic representation of an activity impl_act
Fig. 3.21: F-Logic representation for class iSource
Fig. 3.22: F-Logic representation for the information source example from Table 3.1
Fig. 3.23: F-Logic representation for class iSourceRec
Fig. 3.24: Rule patterns used to represent activity, role, and skill constraints
Fig. 3.25: F-Logic representation for the information source recommendation example
Fig. 3.26: Method info_resources for instances of process_type
Fig. 3.27: Process type implementation_process with associated information resource
Fig. 3.28: F-Logic representation for class infoNeed
Fig. 3.29: F-Logic representation for the information need example from Figure 3.5 (excerpt)
Fig. 3.30: Rule pattern for a parameterized query command
Fig. 3.31: Example for a parameterized query command
Fig. 3.32: F-Logic excerpt from the definition of class process_decomposition
Fig. 3.33: Formal representation of type decompositions
Fig. 3.34: Excerpt from the formal representation of the decomposition example shown in Figure 3.7
Fig. 3.35: Excerpt from the formal representation of an activity decomposition
Fig. 3.36: Information need associated with type decomposition as shown in Figure 3.7
Fig. 3.37: Formal representation of the type specialization example depicted in Figure 3.8
Fig. 3.38: Inheritable multi-valued method info_resource
Fig. 3.39: Signature extension for class iSourceRec
Fig. 3.40: Pattern for specialization relation element
Fig. 3.41: Signature extension for class infoNeed
Fig. 3.42: Pattern for specialization relation element
Fig. 3.43: Rules for inheritance of information source usage recommendations along
Fig. 3.46: Formal representation of the entity hierarchy example from Figure 3.9
Fig. 3.47: Specification of the class entity_type
Fig. 3.48: Example: Information resources attached to entity types (see Figure 3.9)
Fig. 3.49: Method key_topics
Fig. 3.50: Process type implement_with_VAJ_process refers to the entity vaj
Fig. 3.51: Example rule concerning activities of type implementation_process:
Fig. 3.52: Method required_topics
Fig. 3.53: Information need stub_serialization_in
Fig. 3.54: F-Logic excerpt from the definition of the base class product
Fig. 3.55: F-Logic representation of the product type design_document
Fig. 3.56: Formal representation of the product type hierarchy shown in Figure 3.14
Fig. 3.57: F-Logic excerpt from the definition of the base class product_type
Fig. 3.58: Product types associated with information resources
Fig. 3.59: Rule pattern for representing product constraints via predicate prod_constr_sat
Fig. 3.60: Formal representation of the information source recommendation example from Table 3.3
Fig. 3.61: Formalization of the "desdoc" parameter declaration from Figure 3.15
Fig. 3.62: Formalization of the "desdoc" parameter declaration from Figure 3.13
Fig. 3.63: F-Logic excerpt from the definition of class processTypeParameter
Fig. 3.64: Multi-valued method declared_parameters
Fig. 3.65: Formalization of the process type design_process from Figure 3.15
Fig. 3.66: Method parameters for instances of process
Fig. 3.67: Excerpt from the formal representation of an activity with parameters
Fig. 3.68: Method key_topics
Fig. 3.69: Product type ejb_design_document refers to the entity ejb
Fig. 3.70: Information distribution coupled with a WFMS
Fig. 3.71: Rule for predicate useful_iResource_from_type(ACT, AGT, IRS)
Fig. 3.72: Rule for method info_resources_from_type
Fig. 3.73: Rule (1) for predicate useful_iResource(ACT, AGT, IRS)
Fig. 3.74: Rule for predicate useful_iResource_from_decomp(ACT, AGT, IRS)
Fig. 3.75: Rule for method iResources_from_decomp
Fig. 3.76: Rule (2) for predicate useful_iResource(ACT, AGT, IRS)
Fig. 3.77: Rule for predicate useful_iResource_from_context(ACT, AGT, IRS)
Fig. 3.78: Rule (3) for predicate useful_iResource(ACT, AGT, IRS)
Fig. 3.79: Rule for method info_resources_from_type reflecting specialization
Fig. 3.80: Constraint inheritance rules for the specialization relations
PRIME architecture
Fig. 4.2: Snapshot from tengerine
Fig. 4.3: Snapshot from the Characterization Manager interface
Fig. 4.4: Snapshot from the Information Resource Manager interface
Fig. 4.5: Snapshot from an agent’s IA interface
Fig. 4.6: Posting an information need
Fig. 4.7: Accessing the thread for a posted information need
Fig. 4.8: Snapshot from the Characterization Editor
Fig. 4.9: Snapshot from the Information Source Manager
Fig. 4.10: MILOS Process Modeling Editors
Fig. 4.11: MILOS Project Plan Editors
Fig. 4.12: MILOS to-do list of team member "Barbara"
Fig. 4.13: UML diagram of the basis SEDM created for MILOS
Fig. 4.14: MILOS to-do list with access to the IA
Fig. 4.15: MILOS Plan Editor, modified to facilitate access to the IA
Fig. 4.16: Mappings from a process type and its parameter definition
Fig. 4.17: PRIME usage scenario during planning
Fig. 4.18: Query execution for an information need to find an EJB tutorial
Fig. 5.1: Knowledge Utilization and Evolution Cycle
Fig. 5.2: Forum structure based on the characterization class hierarchy
Fig. 5.3: Example for an activity-specific modeling request
Fig. 5.4: Example for an automatic new information source recommendation posting
Fig. 5.5: Example for a "marked useless" posting
Fig. 5.6: Example for a "restructure proposal" posting
Fig. 5.7: Template for a "Missing Information Feedback" posting
Fig. 5.8: Process type-specific Modeling Request Forum
List of Tables
Tab. 2.1: The QIP cycle instantiated on a project level (adapted from [Tau00])
Tab. 3.1: Information source example
Tab. 3.2: Information source recommendation example
Tab. 3.3: Information source recommendation organized by a product type
Tab. 3.4: Example categories for information needs during planning
Tab. 6.1: Comparison of Process-Oriented Knowledge Management environments
List of Clarifications
2.1. Knowledge Management
2.2. Knowledge Department
2.3. Experience Factory
3.1. Process-Oriented Knowledge Management
3.2. Process Type
3.3. Activity of Type T
3.4. Process Model
3.5. Process Type Specialization
3.6. 3.7. 3.8. 3.9. 3.10. 3.11. 3.12. 3.13. 3.14. 3.15. 3.16. 3.17. 3.18. 3.19. 3.20. 3.21. 3.22. 3.23. 3.24. 3.25. 3.26. 3.27. 3.28. 3.29.
Curriculum Vitae
Personal Data
Name: Harald Holz
Address: Werderstraße 6, 67655 Kaiserslautern
Date of birth: 26 April 1968
Place of birth: Düsseldorf
Nationality: German
Marital status: single
School Education
1974 - 1978: Henri-Dunant-Grundschule in Düsseldorf
1978 - 1982: Schloß-Gymnasium in Düsseldorf
1982 - 1987: Lessing-Gymnasium in Düsseldorf
June 1987: Abitur
Civilian Service
1987 - 1988: Caregiver for the elderly and the sick in Düsseldorf
University Studies
1988 - 1991: Studies in computer science with a minor in mathematics at the Universität Karlsruhe
1991 - 1996: Continuation of studies in computer science with a minor in mathematics at the Universität Kaiserslautern; graduated as Diplom-Informatiker
Employment
09/1996 - present: Research associate in the Sonderforschungsbereich 501 (collaborative research center) "Entwicklung großer Systeme mit generischen Methoden", Universität Kaiserslautern.
01/2000 - 04/2000: "Visiting Professor" at the University of Calgary, Canada; Department of Computer Science.
05/1998 - 06/1999: Consultant to the software development department at Markant-Südwest Software und Dienstleistungs GmbH, Kaiserslautern.
08/1992 - 07/1996: Student research assistant in the research group "Wissensbasierte Systeme - Künstliche Intelligenz", Universität Kaiserslautern.