SEMANTIC WEB TECHNOLOGIES FOR E-LEARNING
The Future of Learning

Learning is becoming more and more important as one of the indispensable tools to ensure future prosperity and well-being. This is the case not only for the individual, alone or as a member of a group, but also for organisational structures of all kinds. New learning paradigms and pedagogic principles, new learning environments and conditions, and new learning technologies are being tested in order to find the right combination of parameters that can optimise the outcome of the learning process in a given situation.

This book series presents to all stakeholders the latest advances in this important area, based on a sound foundation. Schools, higher education, industrial companies, public administrations and other organisational structures, including providers of learning and training services and life-long learning, plus all the individuals involved (researchers, students, pupils, citizens, teachers, professors, instructors, politicians, decision makers, etc.), contribute to and benefit from this series. Pedagogic, economic, structural and organisational aspects, the latest technologies, and the influence of changing attitudes and globalisation are treated in this series, providing sound and up-to-date information that can be used to further improve the learning process in both formal and informal contexts.

Series Editors:
N. Balacheff, J. Breuker, P. Brna, K.-E. Chang, J.C. Cherniavsky, J.P. Christensen, M. Gattis, M. Gutiérrez-Díaz, P. Kommers, C.-K. Looi, C.J. Oliveira, M. Schlager, M. Selinger, L. Steels and G. White
Volume 4

Recently published in this series:

Vol. 3. E. McKay, The Human-Dimensions of Human-Computer Interaction – Balancing the HCI Equation
Vol. 2. S. Salerno et al. (Eds.), The Learning Grid Handbook – Concepts, Technologies and Applications
Vol. 1. M. Pivec (Ed.), Affective and Emotional Aspects of Human-Computer Interaction – Game-Based and Innovative Learning Approaches

Related publications by IOS Press:

M. Tokoro and L. Steels (Eds.), The Future of Learning: Issues and Prospects
M. Tokoro and L. Steels (Eds.), A Learning Zone of One’s Own: Sharing Representations and Flow in Collaborative Learning Environments
P. Kommers (Ed.), Cognitive Support for Learning: Imagining the Unknown
T. Hirashima, U. Hoppe and S. Shwu-Ching Young (Eds.), Supporting Learning Flow through Integrative Technologies
R. Mizoguchi, P. Dillenbourg and Z. Zhu (Eds.), Learning by Effective Utilization of Technologies: Facilitating Intercultural Understanding
ISSN 1572-4794
Semantic Web Technologies for e-Learning
Edited by
Darina Dicheva Department of Computer Science, Winston-Salem State University, USA
Riichiro Mizoguchi The Institute of Scientific and Industrial Research, Osaka University, Japan
and
Jim Greer ARIES Laboratory, Department of Computer Science, University of Saskatchewan, Canada
Amsterdam • Berlin • Tokyo • Washington, DC
© 2009 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-60750-062-9
Library of Congress Control Number: 2009937771

Publisher
IOS Press BV, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands
fax: +31 20 687 0019; e-mail: [email protected]

Distributor in the USA and Canada
IOS Press, Inc., 4502 Rachael Manor Drive, Fairfax, VA 22032, USA
fax: +1 703 323 3668; e-mail: [email protected]
LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved.
Preface Recent research on web-based educational systems attempts to meet the growing needs and expectations of the education community concerning e-learning efficiency, flexibility, and adaptation by employing ontologies and Semantic Web standards and paradigms. These advanced technologies allow for more intelligent access and management of web information and semantically richer modelling of content, applications, and users. Within the educational field, they motivate efforts to achieve semantically rich, well-structured, standardised, and verified learning content and learning activities that can be shared and reused by others. Conceptualizations, ontologies, the available W3C standards such as XML, RDF(S), OWL, OWL-S and educational standards such as LOM, SCORM, and IMS-LD allow specification of components in a standard way. The standards-based machine-processable semantic descriptions of web resources provide the necessary ground for achieving reusability, shareability, and interoperability of educational web resources and better personalization in educational hypermedia and web-based applications. The notion of Social Semantic Web describes an emerging design approach for building Semantic Web applications which employs Social Software approaches. Social Semantic Web systems usually support collaborative creation, usage and continuous refinement of Semantic Web structures by communities of users. Typically they elicit domain knowledge through semi-formal ontologies, taxonomies or folksonomies. Semantic Web and Social Semantic Web techniques offer new perspectives on intelligent educational systems by supporting more adequate and accurate representations of learners, their learning goals, learning material and contexts of its use, as well as more efficient access and navigation through learning resources. 
They advance the state-of-the-art in intelligent educational systems development, so as to achieve improved e-learning efficiency, flexibility and adaptation for single users and communities of users (learners, instructors, courseware authors, etc.). Within this context, this book attempts to outline the state-of-the-art in research on the application of ontologies and Social and Semantic Web technologies in e-Learning. It presents a view of the latest theoretical and technological advances, various perspectives on the application of Semantic Web and Web 2.0 technologies in e-Learning, and showcases major achievements in this area. Most of the chapters present research and applications stemming from work reported at recent editions of the International Workshop on Ontologies and Semantic Web in e-Learning (SWEL).1 The book is intended as a guide for researchers and developers seeking an understanding of present and future trends in research in this field. It consists of three parts, the first concentrating on Ontologies, the second on Technologies, and the third on the emerging Social Semantic Web. Within these sections of the book, viewpoints and research findings of various authors are organized. The book cannot claim to cover the full breadth of issues in the SWEL domain, but it opens up a number of interesting issues and leaves many open problems for future researchers to pursue.
1 http://compsci.wssu.edu/iis/swel/.
In the first part, ontologies in support of e-Learning are examined, stretched, evaluated, and applied. Rogozan and Paquette tackle the challenging problem of ontology evolution, explaining how ontologies change over time and providing a mechanism and an ontology for describing this evolution. Dicheva and Dichev attack the practical problem of scaling up learning content repositories, pushing the limits of ontological representation schemes. Lillian Cassel investigates the “ontology of all computing” and the efforts of the ACM and others in the process of curriculum mapping based upon a comprehensive ontology of concepts. Three chapters investigate the practical problems of applying ontologies directly to authoring instruction for learners. Mizoguchi et al. look at ontologies underlying instructional and learning theories, formulating such theories into representational and reasoning engines suitable for authoring content. Suraweera et al. focus on ontology support for authoring constraint-based tutors, demonstrating the generality of an ontological approach in automating the development of domain models. Finally, Soldatova and Mizoguchi apply ontologies to the development of assessment examinations. The second part of this book surveys selected areas among the vast set of possibilities for application of Semantic Web technologies to e-learning. Jovanovic et al. demonstrate how instructor feedback can be enhanced with Semantic Web technologies. Libbrecht and Desmoulins improve content annotation, representation and searching in a Geometry teaching domain. Melis et al. describe how semantic technologies have been incorporated in ActiveMath, an intelligent learning platform. Radenković et al. present enhancements to generalized testing and assessment systems, while Pasin and Motta present a Semantic Web tool tightly bound to the discipline of Philosophy. 
And finally, Dzbor and Rajpathak present a Semantic Web-enhanced general platform for search and aggregation of information about authors and content topics. The third and final part of the book speaks to the developing technologies related to the Social Semantic Web. Jovanovic et al. survey this emerging area. Brooks et al. present a number of projects and experiences that broadly explore Semantic Web technologies in social learning contexts. To conclude this volume, Loll and Pinkwart offer a new approach to collaborative filtering that relies on Semantic Web technologies. Current research on the application of ontologies and Semantic Web technologies in e-Learning covers an even greater scope than this diverse set of articles might suggest. Though we provide a selective view of the emerging research, we want to convey a sense of today’s cutting edge in the design, implementation, and evaluation of ontology-aware web-based educational environments and community-centred educational social applications. We hope that this book will provide some new insights and serve as a catalyst to encourage others to investigate the potential of the application of ontologies and Social and Semantic Web technologies for their organisational needs and research endeavours.

Darina Dicheva, Winston-Salem State University, USA
Riichiro Mizoguchi, Osaka University, Japan
Jim Greer, University of Saskatchewan, Canada
Contents

Preface
Darina Dicheva, Riichiro Mizoguchi and Jim Greer
v
Part 1. Ontologies for e-Learning

Part 1.1. Ontologies as Enabling Technologies

Ontology Evolution and the Referencing of Resources in Semantic Web Context
Delia Rogozan and Gilbert Paquette
5
Authoring and Exploring Learning Content: Share Content by Sharing Concepts Darina Dicheva and Christo Dichev
24
Using a Computing Ontology in Curriculum Development Lillian Cassel
44
Part 1.2. Ontologies for Authoring Instructional Systems

Inside a Theory-Aware Authoring System
Riichiro Mizoguchi, Yusuke Hayashi and Jacqueline Bourdeau
59
Using Ontologies to Author Constraint-Based Intelligent Tutoring Systems Pramuditha Suraweera, Antonija Mitrovic, Brent Martin, Jay Holland, Nancy Milik, Konstantin Zakharov and Nicholas McGuigan
77
An Ontology-Based Test Generation System Larisa N. Soldatova and Riichiro Mizoguchi
96
Part 2. Semantic Web Technologies for e-Learning

Part 2.1. Instructional Support and Adaptation

Using Semantic Web Technologies to Provide Contextualized Feedback to Instructors
Jelena Jovanovic, Dragan Gasevic, Carlo Torniai and Vladan Devedzic
117
A Cross-Curriculum Representation for Handling and Searching Dynamic Geometry Competencies Paul Libbrecht and Cyrille Desmoulins
136
Part 2.2. Semantic Web-Based Intelligent Learning Environments Architectures

ActiveMath – A Learning Platform with Semantic Web Features
Erica Melis, Giorgi Goguadze, Paul Libbrecht and Carsten Ullrich
159
An Intelligent Framework for Assessment Systems Sonja D. Radenković, Vladan Devedžić and Nenad Krdžavac
178
PhiloSurfical: An Ontological Approach to Support Philosophy Learning Michele Pasin and Enrico Motta
197
Comparative Evaluation of ASPL, Semantic Platform for e-Learning Martin Dzbor and Dnyanesh G. Rajpathak
219
Part 3. Social Semantic Web Applications

E-Learning and the Social Semantic Web
Jelena Jovanovic, Dragan Gasevic and Vladan Devedzic
245
Lessons Learned using Social and Semantic Web Technologies for e-Learning Christopher Brooks, Scott Bateman, Jim Greer and Gord McCalla
260
Disburdening Tutors in e-Learning Environments via Web 2.0 Techniques Frank Loll and Niels Pinkwart
279
Subject Index
299
Author Index
301
Part 1 Ontologies for e-Learning
Part 1.1 Ontologies as Enabling Technologies
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-5
CHAPTER 1
Ontology Evolution and the Referencing of Resources in Semantic Web Context

Delia ROGOZAN 1 and Gilbert PAQUETTE
LICEF Research Center, TELUQ, Québec, Canada
Abstract. Because ontologies evolve over time, their evolution needs to be managed. Therefore, in this chapter, we propose a framework composed of two main systems: ChangeHistoryBuilder, which tracks and manages the history of ontology changes, and SemanticAnnotationModifier, which provides support for maintaining the integrity of the ontology-based referencing of resources after ontology evolution. Both systems are based on a formal specification of the types of possible changes in OWL-DL ontologies. In concrete terms, this specification is an ontology of ontology changes.

Keywords. Ontology evolution, ontology of ontology changes, tracking changes, managing the ontology-based referencing of resources
Introduction

Evolution is a fundamental requirement for useful ontologies. Since ontologies are knowledge theories of a precise domain, they need to evolve when the domain changes or when problems in the original domain conceptualization have to be resolved [1]. Moreover, in open and dynamic environments such as the Semantic Web, ontologies need to evolve because domain knowledge evolves continually [2] or because ontology-oriented software agents must respond to changes in users’ needs [3]. Consequently, ontology evolution is an essential part of research in ontology engineering and in the application of ontologies in Semantic Web environments. This chapter explores some important issues of ontology evolution. Three research questions structure this chapter: 1) What is ontology evolution, and what types of changes are possible in OWL ontologies? 2) How can we manage the evolution history by logging changes brought to ontologies? 3) What are the effects of changes on the ontology-based referencing of resources, and how can we resolve them?
1 Corresponding Author: Delia Rogozan, LICEF Research Center, 100, rue Sherbrooke Ouest, Montréal, Qc., Canada, H2X 3P2; E-mail: [email protected].
D. Rogozan and G. Paquette / Ontology Evolution and the Referencing of Resources
1. Ontology Evolution and Ontology Changes

1.1. Definition of the Ontology Evolution Notion

Current research is far from defining the notion of ontology evolution in a consensual way. For the authors of [4, 5], ontology evolution signifies the process of applying changes to a single ontology, while the authors of [6, 7] consider it rather as the building and management of multiple ontology versions. Both interpretations are pertinent in the distributed and dynamic context of the Semantic Web. Consequently, we consider ontology evolution to be the timely modification of an ontology by applying changes to an ontology version (V_N) in order to obtain a new ontology version (V_N+1), while preserving the ontology's consistency and roles.

The ontology role refers to the service provided by the ontology and to its usage. For example, in the Semantic Web context, the ontology is used to assure semantic referencing so that resources can be found by the knowledge they contain [8, 9]. The ontology consistency designates the state where all structural and axiomatic constraints of the ontology model are respected.

An ontology change is a modification brought to an ontology during the evolution from a version V_N to a new version V_N+1. Changes can be elementary or complex. An elementary change is a simple and non-composite change (i.e. addition or deletion of ontology elements). A complex change is a collection of elementary changes, which together form a logical entity whose signification is unique and clearly defined (cf. Table 1).

Table 1. Examples of complex changes

Complex change                              | Collection of elementary changes
MergeClasses (C_1 … C_N) into class C       | DeleteClass (C_1), …, DeleteClass (C_N); AddClass (C)
SplitClass C into classes (C_1 … C_N)       | DeleteClass (C); AddClass (C_1), …, AddClass (C_N)
ModifySuperClass of C, from class A to B    | DeleteSuperClass (A) from (C); AddSuperClass (B) to (C)
MoveDisjointClass (C), from class A to B    | DeleteDisjointClass (C) from class A; AddDisjointClass (C) to class B
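To make the decomposition in Table 1 concrete, it can be sketched in code. The sketch below is ours, not part of the chapter's tooling: the class names and the merge_classes helper are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ElementaryChange:
    """A simple, non-composite change (an addition or a deletion)."""
    operation: str   # e.g. "DeleteClass", "AddClass"
    target: str      # name of the ontology element affected

@dataclass
class ComplexChange:
    """A collection of elementary changes forming one logical entity."""
    name: str
    parts: List[ElementaryChange] = field(default_factory=list)

def merge_classes(sources: List[str], merged: str) -> ComplexChange:
    # MergeClasses (C_1 ... C_N) into C decomposes into
    # DeleteClass(C_1), ..., DeleteClass(C_N) followed by AddClass(C).
    parts = [ElementaryChange("DeleteClass", c) for c in sources]
    parts.append(ElementaryChange("AddClass", merged))
    return ComplexChange(f"MergeClasses({', '.join(sources)}) into {merged}", parts)

change = merge_classes(["Tutor", "Instructor"], "TeachingStaff")
```

The point of the sketch is that the complex change keeps its own identity (its name and unique signification) even though it is fully expressible as an ordered collection of elementary changes.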
Application of changes can induce inconsistencies in other parts of the ontology [10]. For example, merging two classes will cause subclasses and properties to become inconsistent. Resolving that problem can be treated as a request for new additional changes, e.g. subclasses and properties can be either deleted or attached to some other classes. Thus, a primary change is not a consequence of any other change, while an additional one is caused by another change (named its parent-change).

1.2. An Ontology of Ontology Changes

The notion of “change” is central to ontology evolution. Indeed, to describe evolution it is necessary to formally specify all types of changes that can be applied to ontologies. Regarding change specification, current research proposes only taxonomies of elementary changes, although complex changes have richer semantics [11].
In this section, we propose an ontology of changes that can be applied to OWL-DL ontologies. This ontology expands the taxonomies described in [12, 13] by adding a typology of complex changes, as well as a number of properties and axioms. We have built this ontology with the MOWL 2 graphical editor developed by a LICEF team. The following table presents some of the basic graphical symbols used by MOWL to represent ontologies.

Table 2. An example of some of the graphical symbols used by MOWL to represent ontologies
1.2.1. Classification of Changes in the Change Ontology

We present here an extract of our ontology; more details can be found in [14]. The Change Ontology consists of two principal hierarchies. The ChangeObject hierarchy specifies the ontology objects that can be changed, i.e. the elements used to build OWL ontologies, such as classes, properties or axioms. The ChangeOperation hierarchy specifies the types of changes in OWL-DL ontologies. It consists of two taxonomies, one of elementary changes and one of complex changes, both described below.

1.2.1.1. Operations of Elementary Changes

The taxonomy of elementary changes contains the generic changes Add_Change and Delete_Change. The conceptual structure of these generic changes is similar; for that reason, Figure 1 illustrates only the classification of additions. The changes that add elements are classified according to their application object: Add_To_Ontology, Add_To_Class, Add_To_Property. From the ‘ontology’ point of view, there are two main changes: Add_Class and Add_Property. From the ‘class’ point of view there are multiple changes: additions of logical axioms (i.e. intersectionOf, complementOf, unionOf), additions of class axioms (i.e. superClass, equivalentClass, disjointWith), or additions of property restrictions that characterize classes. From the ‘property’ point of view, the main changes operate on the property domain and range, as well as on the property axioms (i.e. superProperty, equivalentProperty, inverseProperty).

1.2.1.2. Operations of Complex Changes

The taxonomy of complex changes contains the main types of complex changes: those that merge, split, modify or move elements of ontologies (Merge_Change, Split_Change, Modify_Change, Move_Change). Other types of complex changes are those that add, delete or modify sub-hierarchies of OWL elements (Subtree_Change). 
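For illustration only, the two taxonomies could be mirrored as a small class hierarchy. The identifiers follow the chapter's naming (Add_To_Class, Merge_Change, etc.), but the Python rendering is our own sketch, not the MOWL ontology itself.

```python
# Root of the second principal hierarchy: the types of change operations.
class ChangeOperation: pass

# Taxonomy of elementary (simple, non-composite) changes.
class Elementary_Change(ChangeOperation): pass
class Add_Change(Elementary_Change): pass
class Delete_Change(Elementary_Change): pass

# Additions classified by their application object.
class Add_To_Ontology(Add_Change): pass   # e.g. Add_Class, Add_Property
class Add_To_Class(Add_Change): pass      # e.g. superClass, equivalentClass, disjointWith
class Add_To_Property(Add_Change): pass   # e.g. domain, range, superProperty

# Taxonomy of complex changes (collections of elementary changes).
class Complex_Change(ChangeOperation): pass
class Merge_Change(Complex_Change): pass
class Split_Change(Complex_Change): pass
class Modify_Change(Complex_Change): pass
class Move_Change(Complex_Change): pass
class Subtree_Change(Complex_Change): pass
```

In the real Change Ontology these are OWL classes related by subsumption; the Python subclass relation plays the same structural role here.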
2 MOWL is a tool for editing OWL ontologies and exporting them to XML files compliant with OWL-DL (http://www.cogigraph.com/Produits/OWLDLOntologyEditors/tabid/1100/language/en-US/Default.aspx).
Given the number of concepts represented by this taxonomy (more than 50), we present in Figure 2 only the classification of the Modify_Change type.
Figure 1. Classification of elementary changes that add elements to OWL-DL ontologies
Figure 2. Classification of complex changes that modify elements in OWL-DL ontologies
1.2.2. Change Characterization in the Change Ontology

To allow a richer characterization of changes, we also defined some properties.

1.2.2.1. Properties of Changes

Figure 3 introduces the general properties of change operations. The appliedOn property connects change operations to ontology objects. The properties haveSource and haveTarget describe the source and the target of change operations. Both properties have as domain a class of type ChangeOperation and as range a class of type ChangeObject or a value of rdfs:Datatype. Other properties are introduced as well: haveChangeNumber, which specifies a reference number that indicates the application order of a change, and haveParentChangeNumber, which declares the reference number of the parent-change.

Figure 3. Properties of change operations

1.2.2.2. Characterization of Changes by means of Property Restrictions

Restrictions on these general properties may be associated to each change operation in order to characterize it formally. Figure 4 shows part of the characterization of Add_Change, the same method being followed for all other changes.
Figure 4. Part of the characterization of addition changes
Thus, any operation of type Add_Change is characterized by two general restrictions: an exact cardinality of 0 for the haveSource property, which declares that there is no element as the source of an addition change, and a minimal cardinality of 1 for the haveTarget property, which declares that the target of any addition change comprises at least one element. The application object is defined by adding restrictions on the appliedOn property. These restrictions characterize changes that add elements to the ontology structure (Add_To_Ontology), to a class definition (Add_To_Class) or to a property definition (Add_To_Property).

In this section we described an ontology of ontology changes that extends previous classifications and adds clear definitions of change operations by means of properties. We can now use this formal theory to support the development of tools for managing ontology changes. We address this objective in the following sections, where we propose two interlinked systems for managing (1) the history of ontology changes and (2) the ontology-based referencing of resources after ontology evolution.
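As a rough illustration, the two general restrictions on Add_Change can be read as a validity check on change instances. The function below is hypothetical and only approximates the OWL cardinality semantics in procedural form.

```python
def valid_add_change(sources, targets, applied_on):
    """Check the two general restrictions on Add_Change:
    - exact cardinality 0 on haveSource (an addition has no source element);
    - minimum cardinality 1 on haveTarget (at least one target element);
    plus the appliedOn restriction naming one of the three application objects."""
    return (len(sources) == 0
            and len(targets) >= 1
            and applied_on in {"Add_To_Ontology", "Add_To_Class", "Add_To_Property"})

# An addition never has a source element and targets at least one element:
ok = valid_add_change([], ["Tutor"], "Add_To_Ontology")
# A change with a source element cannot be an addition:
bad = valid_add_change(["Tutor"], [], "Add_To_Class")
```

In the actual Change Ontology these constraints are stated declaratively as OWL property restrictions and checked by a reasoner rather than by hand-written code.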
2. Managing the History of Ontology Changes with ChangeHistoryBuilder (CHB)

Although the management of ontology changes is one of the key issues in successful applications of evolving ontologies, methods and tools to support it are largely missing [15]. As we underlined in [14], very little research concerning tools for keeping track of ontology changes has been carried out. However, these tools are important for the ontology-based referencing of resources, since changes affect the way that resources should be handled and interpreted by means of new ontology versions.

There are two major approaches for tracking and managing ontology changes. The first one logs changes during ontology evolution [4, 13]. Even if this approach facilitates later retrieval of all performed changes, it presents an important problem: the log-files are stored independently from the ontology versions and are formalized in a tool-specific language. Consequently, these log-files are difficult to identify, access, and interpret by Semantic Web agents. For that reason, the second approach relies only on a comparison between ontology versions to identify changes [12, 16]. However, it presents a problem as well: it can identify only some elementary changes and therefore cannot provide complete information about evolution processes 3.

The ChangeHistoryBuilder (CHB) system overcomes these two problems: it combines access to a log that captures the entire semantics of the ontology evolution with the ability to identify changes starting only from the ontology versions. It can also deal with complex changes, in addition to elementary ones. To track and manage the history of ontology changes, the CHB system supports a four-step process, as illustrated in Figure 5.
3 Knowing that two classes were deleted from V_N does not tell us that these classes were merged in V_N+1.
Figure 5. The four-step functioning of the CHB system
Legend of the graphical formalism [17]: procedure (oval shape), input/output resource of a procedure (rectangular shape), and actor that carries out the underlying procedure (hexagonal shape); links: composition (C), specialization (S), precedence (P), input-output (I/P) and regulation (R).
2.1. Capturing Changes during Ontology Evolution (Step 1)

The first step aims at logging in a log-file all changes applied during the evolution from V_N to V_N+1. To resolve the interpretation problems of log-files generated by different editors, the CHB provides ontology editors with a uniform and common model for logging changes. The CHB model is a set of metadata that aggregates all changes in a common structure [18]. Based on the change ontology, these metadata allow ontology editors to capture specific information about elementary and complex changes, in addition to general information about the ontology version. Ontology editors can use the CHB model as a plug-in and thereby generate log-files presenting a normalized and rich description of the applied changes. A log-file example of this sort is presented in Figure 6.
Figure 6. Example of a log-file based on the CHB model
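A toy sketch of such a log may help: each entry carries a change number and, for additional changes, the number of its parent-change, so every primary change forms a tree with its additional changes. The field names and change values below are illustrative, not the actual CHB metadata vocabulary.

```python
# Hypothetical change log in the spirit of the CHB model.
log = [
    {"changeNumber": 1, "parentChangeNumber": None,
     "operation": "MergeClasses", "target": "TeachingStaff"},
    {"changeNumber": 2, "parentChangeNumber": 1,
     "operation": "DeleteSubClass", "target": "SeniorTutor"},
    {"changeNumber": 3, "parentChangeNumber": 1,
     "operation": "MoveProperty", "target": "supervises"},
]

def primaries(log):
    """Primary changes are not a consequence of any other change."""
    return [c for c in log if c["parentChangeNumber"] is None]

def children_of(log, number):
    """Additional changes caused by the change with the given number."""
    return [c for c in log if c["parentChangeNumber"] == number]
```

Here the merge is the only primary change, and the two entries that repair its side effects hang off it as additional changes, exactly the tree shape the CHB model records.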
Despite the figure above, the CHB model is not a linear one. It is organized so that every primary change is represented under a tree shape formed by its additional changes. Moreover, this change tree is generated in a flexible way according to the evolution strategies applied by ontologists during evolution.

2.2. Formalizing Changes using the OC+OWL Language (Step 2)

The second step of the logging process supported by the CHB system is the formalization of the changes captured during the previous step. For this purpose, we developed a formalization language, named OntologyChange (OC), which is based on a minimal number of constructs, labelled oc. When combined with those of OWL [19], these constructs formally describe all types of changes in OWL-DL ontologies. Table 3 shows a concise summary of the OC language constructs, and Figure 7 illustrates how CHB uses these constructs to formalize changes. Consequently, all Semantic Web agents or software components that are able to manipulate the OC+OWL language can also interpret and reason with the trace of formalized changes that were logged using the CHB model.

Table 3. OC language constructs to formalize changes
2.3. Archiving Formalized Changes in the New Ontology Version (Step 3)

The third step consists in archiving the previously formalized changes. The solution proposed by the CHB system is to append to the new ontology version the trace of changes formalized with the OC+OWL language. The expression V_N+1^Change thus denotes this new
ontology version with an integrated trace of changes. In this way, V_N+1^Change contains, in addition to the underlying domain conceptualization, all information about the evolution from V_N to V_N+1. Figure 7 presents an example of a V_N+1^Change version. In order to preserve the interpretation of V_N+1^Change by all OWL-compliant tools, the formalized changes are declared inside the owl:versionInfo statement. According to the OWL language, this statement gives information about ontology versions without contributing to the logical meaning of the ontology. The resulting V_N+1^Change version thus conforms to the OWL language, while offering information about all applied changes.
Figure 7. An example of a V_N+1^Change ontology version
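The archiving idea can be sketched in a few lines: build a minimal RDF/XML ontology header whose owl:versionInfo annotation carries a serialized change trace. The trace text below is invented for illustration and is not actual OC+OWL syntax; only the use of owl:versionInfo as a logically inert annotation reflects the mechanism described above.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def with_change_trace(ontology_uri, trace):
    """Build a minimal RDF/XML ontology header whose owl:versionInfo
    carries the serialized change trace. owl:versionInfo is annotation-only
    in OWL, so the trace does not alter the ontology's logical meaning."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    onto = ET.SubElement(root, f"{{{OWL}}}Ontology",
                         {f"{{{RDF_NS}}}about": ontology_uri})
    info = ET.SubElement(onto, f"{{{OWL}}}versionInfo")
    info.text = trace
    return ET.tostring(root, encoding="unicode")

# Invented trace text for illustration; not actual OC+OWL syntax:
doc = with_change_trace("http://example.org/eLearningOntology/V2",
                        "oc:MergeClasses(Tutor, Instructor) -> TeachingStaff")
```

Any OWL-compliant parser will read such a document as an ordinary ontology header, which is precisely why V_N+1^Change remains interpretable by standard tools.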
2.4. Identifying Changes Starting from the New Ontology Version (Step 4)

The fourth step concerns the interpretation of changes after the ontology evolution. The CHB system is able to identify all applied elementary and complex changes, together with their primary-additional relationships, by simply reading the OC+OWL trace contained in the V_N+1^Change ontology version. Furthermore, any software agent able to interpret the OC+OWL language can also identify changes starting only from V_N+1^Change.
3. Managing Semantic Referencing with SemanticAnnotationModifier (SAM)

Ontology evolution can give rise to side effects on resource referencing and can thus hamper one of the most important features of the Semantic Web: the ontology-based referencing of resources, which formally describes resource content. Consider the example of an ontology evolution from V_N to V_N+1 and a resource R_1, which is referenced by the class PedagogicalDesigner belonging to V_N. During evolution, this class is merged with another class and consequently no longer exists in the new ontology version. This makes resource R_1 no longer accessible to requests of the type “Give me a resource which is a PedagogicalDesigner”: the access to R_1 is broken via V_N+1. Consider furthermore a resource R_2 that is referenced by two classes, Tutor and Researcher. If a disjunction axiom is added between these two classes, then the interpretation of R_2 becomes inconsistent via the new ontology version.

However, despite the necessity of managing the effects of ontology changes on resource referencing, little research has tackled this issue. For example, it was demonstrated in [2] that additions do not affect the access to referenced data, while changes that delete entities hamper it. The authors of [20] analyzed the effects of elementary changes on the class hierarchy. The authors of [21] analyzed and proved that modifications made to an ontology whose concepts are used to generate metadata may disrupt the metadata semantics. In [22], the CREAM annotation model was proposed together with some recommendations regarding the modification of resource referencing, yet without proposing any concrete solution for that purpose. The authors of [23] presented a rule-based approach to detect and correct inconsistencies in ontology-based semantic annotations. 
Finally, let us underline that, even though a wide range of referencing tools exists nowadays, none of them is able to support an evolving ontology-based referencing of resources. In this context, the second system that we propose in this chapter is as innovative as it is fundamental. The SemanticAnnotationModifier (SAM) system provides support for managing the ontology-based referencing of resources after ontology evolution. In order to present SAM, we start by explaining the notion of semantic referencing on which the system is based. Then, we discuss the operation model of SAM and explain it through examples.

3.1. Semantic Referencing of Resources by means of UKIs

Semantic referencing denotes the description of resource content by means of formal semantic descriptors. These descriptors, named semantic references, are generally knowledge entities, i.e. classes according to the OWL terminology, belonging to different ontologies.

To specify the semantic references, we use the general URI syntax. A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies all kinds of objects, whether physical (e.g. images, documents, services, actors) or abstract (e.g. concepts in an ontology). It consists of a hierarchical sequence of common components: scheme, authority, path and fragment [24]. In addition, to assert that semantic references identify solely ontology concepts, we introduce the term Uniform Knowledge Identifier (UKI) and define it as a URI with two restrictions: the first three components must identify a unique version of an ontology, and the last component must identify a unique class inside this ontology version. Thus, as
D. Rogozan and G. Paquette / Ontology Evolution and the Referencing of Resources
illustrated in Figure 8, a UKI is composed of two principal elements: the URI of the ontology version and the name of a class within this version.
Figure 8. An UKI that specifies the reference PedagogicalDesigner within the second version of eLearningOntology
In conclusion, the semantic referencing consists of one or several semantic references associated to resources to describe their content formally, each reference being specified by means of a UKI (cf. Figure 9).
Figure 9. The semantic referencing of a resource
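To make the decomposition concrete, the following sketch splits a UKI into its two elements using Python's standard URI parser. The ontology URL and class name are hypothetical, not the actual identifiers used by SAM.

```python
from urllib.parse import urlsplit

def split_uki(uki):
    """Split a UKI into the URI of the ontology version (scheme, authority,
    path) and the class name carried by the fragment component."""
    parts = urlsplit(uki)
    if not parts.fragment:
        raise ValueError("a UKI must identify a class in its fragment")
    version_uri = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return version_uri, parts.fragment

# Hypothetical UKI in the spirit of Figure 8
version, cls = split_uki(
    "http://www.example.org/eLearningOntology/v2#PedagogicalDesigner")
```

The two restrictions of the UKI definition map directly onto the two return values: everything before the fragment identifies one ontology version, and the fragment identifies one class inside it.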
3.2. Operation Model of the SAM System

We present the operation model of the SAM system in Figure 10. This model underlines the two main services that SAM offers to users. The first one analyses the changes applied to V N to obtain V N+1. The purpose here is to inform users about changes that hinder the access to referenced resources or that modify their interpretation. The second service modifies the semantic referencing (i.e. the UKIs) affected by ontology changes. The purpose here is to allow access to all resources via the new ontology version, as well as a consistent interpretation of them. Both services are based on data provided by the CHB system, consisting of complete and semantically rich information about elementary and complex changes, together with the causality relations existing among them.
Figure 10. The operation model of the SAM system (see legend of Figure 5)
3.3. Exemplifying the Operation Model of SAM

In this section, we exemplify the operation model of SAM. We start by illustrating how SAM analyses the change effects on resource referencing. Next, we present how SAM assists users in modifying this resource referencing.

3.3.1. SAM Analyses the Change Effects on Resource Referencing

3.3.1.1. Users send UKIs to SAM in order to analyse them

Let us consider a user who wants to verify whether the semantic referencing of a resource collection is affected by the evolution from an ontology version V N to a new version V N+1. To that end, the user sends to SAM a file containing the UKIs (i.e. references) associated to these resources. For this first prototype of SAM, we imposed some constraints on the file format: the UKIs file must stem from the same owner; it must be organized as a list; and all UKIs must refer to the same ontology version.
3.3.1.2. SAM interprets UKIs

To interpret the UKIs file, SAM decomposes every UKI in order to identify the URI of the V N ontology version together with the name of the class used as reference (cf. Figure 11).
Figure 11 (a). User UKIs file
Figure 11 (b). Decomposed UKIs
Then, it asks CHB for the V N+1 Change and extracts all ontology changes that were applied to V N to obtain V N+1. Because SAM can interpret the OC+OWL language, it can also ‘understand’ the trace of changes appended to V N+1 Change. Finally, SAM links UKIs to changes by matching each class name specified by the UKIs to its corresponding pair in the change trace.

3.3.1.3. SAM analyses change effects on UKIs and the user requests UKIs modification

Based on the UKIs interpretation, SAM presents to the user an analysis of changes (cf. Figure 12).
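The linking step can be illustrated as follows. The dictionary format for the change trace is our own simplification for this sketch; the actual trace is expressed in the OC+OWL language appended to V N+1 Change.

```python
def link_ukis_to_changes(decomposed_ukis, change_trace):
    """Map each class name referenced by a UKI to the changes mentioning it.
    decomposed_ukis: list of (version_uri, class_name) pairs.
    change_trace: list of dicts such as {"op": ..., "classes": [...]}."""
    return {cls: [ch for ch in change_trace if cls in ch["classes"]]
            for _uri, cls in decomposed_ukis}

# Hypothetical trace of two changes applied to V_N
trace = [
    {"op": "MergeClasses",
     "classes": ["PedagogicalDesigner", "ContentPresenter"]},
    {"op": "AddEquivalentClass", "classes": ["Tutor"]},
]
links = link_ukis_to_changes(
    [("http://www.example.org/eLearningOntology/v1", "PedagogicalDesigner")],
    trace)
```

A UKI whose class name appears in no change is simply mapped to an empty list, i.e. it is unaffected by the evolution.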
Figure 12. Change visualization and change analysis with SAM
Firstly, SAM highlights: (1) in red, changes that break the access to resources; (2) in yellow, changes that give rise to an inconsistent interpretation of resources; (3) in blue, changes that modify the interpretation of resources (e.g. by modifying the parent class of the class used as semantic reference). Secondly, SAM provides the user with an analysis of effects for each highlighted change. This analysis consists of three panels. The first two deal with the effect of changes on the access to referenced resources or on the consistency of their interpretation. The third one indicates the relation existing between a class belonging to V N and the same or another class belonging to V N+1, according to criteria such as identity, equivalence, inclusion, generalization, specialization or conceptual difference. This last panel is particularly useful for understanding how the meaning of a class used as reference was modified during the ontology evolution. Starting from the change analysis, the user has the possibility to request the modification of the resource referencing (i.e. the UKIs) affected by ontology changes.

3.3.2. SAM Modifies the Resource Referencing

3.3.2.1. SAM modifies the resource referencing affected by non-problematic changes

This modification concerns the UKIs affected by changes that cause neither a loss of access to resources nor an inconsistent interpretation of them. Changes of this type are, for example, AddEquivalentClass or ModifySuperClass. To allow access to resources via the new ontology version, SAM modifies only the URI of the ontology version inside the UKIs (cf. example below). The user has to validate this modification, even though it could be processed automatically.
Figure 13. Modification of the UKI referring to Designer_IMS_LD (cf. evolution example from Figure 15)
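For non-problematic changes the update is mechanical: only the version part of the UKI is rewritten and the class name is preserved. A minimal sketch, with hypothetical ontology URLs:

```python
def update_version(uki, old_version, new_version):
    """Rewrite the ontology-version URI of a UKI, preserving the class
    name. Applicable only to UKIs affected by non-problematic changes."""
    version, sep, cls = uki.partition("#")
    if not sep or version != old_version:
        raise ValueError("UKI does not refer to the expected ontology version")
    return f"{new_version}#{cls}"

new_uki = update_version(
    "http://www.example.org/eLearningOntology/v1#Designer_IMS_LD",
    "http://www.example.org/eLearningOntology/v1",
    "http://www.example.org/eLearningOntology/v2")
```

The guard on `old_version` reflects the file constraint stated earlier: all UKIs submitted to SAM must refer to the same ontology version.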
3.3.2.2. SAM identifies several solutions for the modification of the resource referencing affected by problematic changes

This situation concerns especially the UKIs affected by changes that hamper the access to resources via the new ontology version (e.g. MergeClasses, DeleteClass, SplitClass). In this case, most of the classes used as references in UKIs are no longer available in V N+1. To restore access to the resources, SAM should then modify, besides the URI of the ontology version, the class name in each affected UKI (cf. example below).
Figure 14. Modification of the UKI referring to PedagogicalDesigner (cf. evolution example from Figure 15)
However, this modification cannot be processed automatically because several solutions are possible. To detect them, SAM exploits two identification algorithms that we developed in [14]. Since these algorithms are based on the information provided by the V N+1 Change, they are able to deal with all problematic changes. (As we focus here on the general functioning of SAM, we do not discuss the change analysis in detail; details can be found in [14].) Consider, for example, the
change MergeClasses illustrated in Figure 15. SAM is able to detect several classes that can be pertinent for the UKIs modification:
• Classes that are semantically “close”, such as ContentPresenter, because it includes the meaning of PedagogicalDesigner.
• First-level subclasses of the classes whose name must be modified in UKIs. In our example, these subclasses are Designer_MISA and Designer_IMS_LD; they were transferred to another class in V N+1 after the removal of their parent class.
• Classes to which the first-level subclasses were transferred, i.e. CourseManager.
Figure 15. Identification of pertinent classes for the modification of UKIs affected by problematic changes: the MergeClasses example.
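These criteria can be sketched on a toy representation of a MergeClasses change. The real identification algorithms are those of [14]; the dictionary shape and the ordering below are only illustrative.

```python
def candidates_for_merge(merge):
    """merge: {"removed": ..., "merged_into": ...,
               "transfers": {subclass: new_parent}}.
    Return candidate classes for UKI modification, following the three
    criteria: semantically close class, transferred first-level
    subclasses, and classes that received those subclasses."""
    close = [merge["merged_into"]]
    subclasses = sorted(merge["transfers"])
    receivers = sorted(set(merge["transfers"].values()))
    return close + subclasses + receivers

merge = {"removed": "PedagogicalDesigner",
         "merged_into": "ContentPresenter",
         "transfers": {"Designer_MISA": "CourseManager",
                       "Designer_IMS_LD": "CourseManager"}}
cands = candidates_for_merge(merge)
```

For the Figure 15 example this yields ContentPresenter first, then the two transferred designer subclasses, then CourseManager as the class that received them.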
3.3.2.3. SAM assists users in modifying UKIs affected by problematic changes

Choosing among the solutions identified by SAM is the user's privilege; only the user may decide which solution is the most appropriate to his context. However, SAM can guide him during the modification process. Figure 16 presents the interaction between the SAM system and a user who wants to modify UKIs affected by problematic changes (the example of MergeClasses is considered). As shown in this figure, the SAM interface consists of four principal sections. Section 1 indicates the UKIs affected by the change whose analysis was previously explored by the user. Section 2 presents the classes identified as pertinent for the modification of the indicated UKIs. The classes are enumerated in decreasing order of pertinence. Section 3 consists of comments and specific characteristics of the classes listed in Section 2. For each class, the “Comments” panel describes the reason why the class was considered pertinent by SAM. The other panels indicate the subclasses, axioms and properties that were deleted from, transferred or added to the selected class. Finally, Section 4 presents the modified UKIs.
Figure 16. Interaction between SAM and users during the modification of UKIs
3.3.2.4. SAM generates the file of modified UKIs

Once the user has validated the modification of all UKIs, SAM generates a file containing these modified UKIs and sends it to the user.
4. Evaluation and Deployment of CHB and SAM in eLearning Contexts

4.1. Evaluation of CHB and SAM Systems

Regarding the evaluation of the systems, we carried out a technical validation of CHB and SAM with ontologies of small and average size, a diversified set of changes, and UKIs files respecting the specified constraints. We also conducted a qualitative evaluation of the CHB and SAM systems according to the utility criterion, i.e. a criterion allowing one to identify, for a given context, the interest and the degree of relevance of the systems' features [25]. In our case, the general target context was the Semantic Web. The specific context was that of eLearning systems based on ontologies and on the semantic referencing of resources. We used several techniques during the evaluation: the thinking-aloud method, qualitative questionnaires, interviews and a focus group. Six participants were selected, all with knowledge of OWL ontologies as well as experience in the eLearning field. The evaluation took place in the LORIT laboratory for the observation, testing and experimentation of instructional technologies. In order to draw valid meaning from the qualitative data collected during the evaluation, we based our analysis on the method proposed in [26]. Some outcomes of this data analysis are illustrated in Table 4. Other results may be found in [14].

LORIT (http://www.licef.teluq.uquebec.ca/lorit/eng/Index.htm) stands for Research Laboratory-Observatory in Tele-learning Engineering.
Table 4. Qualitative evaluation of CHB and SAM systems: some outcomes

The utility was validated for both systems, especially for users who are responsible for the ontology management and for the ontology-based referencing of resources.

The evaluation participants demonstrated…
• (CHB) A better understanding of the ontology evolution after the visualization of changes, especially in the case of complex changes.

The evaluation participants appreciated…
• (SAM) The fact that SAM allows users to control the referencing modification (for problematic changes).
• (SAM) The relevance of the multiple solutions proposed by SAM.

The evaluation participants underlined orientations for future work, such as…
• (CHB) Use the integrated trace of changes as a support for the collaborative modification of an ontology.
• (CHB) Connect a change viewer with an ontology viewer.
• (SAM) Customize the assistance (or automation) level of the modification of resource referencing with SAM.
• (SAM) Make available new means to define new references.
4.2. Deployment of CHB and SAM Systems in eLearning Contexts

With the evaluation of CHB and SAM completed, the next step is to deploy these systems in eLearning contexts. To that effect, we have selected the TELOS project, designed and developed by a LICEF team within the LORNET research network [27]. The Technology Enhanced Learning Operating System (TELOS) aims to enable pedagogical technologists to develop, modify or use eLearning resources within a service-oriented framework. In TELOS, all types of ‘content provider’, e.g. multimedia documents, learning objects, learning designs, knowledgeable persons, are eLearning resources. All these resources are referenced using specific knowledge defined in domain ontologies. The goal here is to allow the search for relevant eLearning resources, the aggregation of resources according to their semantic description, and the creation of consistent learning scenarios based on a semantic equilibrium among resources [28]. The ontology-based referencing layer is thus a foundational element of the TELOS framework. Consequently, the CHB and SAM systems are necessary to manage the referencing of resources, given that domain ontologies are not fixed entities: at any moment, they may be modified by TELOS users according to their needs. Table 5 illustrates the services that will be provided by CHB and SAM once these modules are integrated into TELOS.

Table 5. Services provided by CHB and SAM in the TELOS system
CHB:
• Track changes during the modification of TELOS domain ontologies using the MOWL editor.
• Help distant ontologists to see all changes made on a shared ontology.
• Draw attention to the potential effect of a change in order to allow users to approve or cancel it during ontology evolution/modification.
• Allow the exploration of the change history after ontology evolution.
• Automatically highlight the change effects on resource referencing, given that all resources are stored in TELOS repositories.

SAM:
• Automatically update the resource referencing affected by non-problematic changes.
• Support users in modifying the resource referencing affected by problematic changes.
5. Conclusion

We proposed in this chapter a framework for managing ontology changes and for resolving some of their problematic effects. This framework is composed of three major components: an ontology of changes, a system that tracks changes during ontology evolution (CHB), and a system that supports users in maintaining the semantic referencing of resources (SAM). Building an ontology of ontology changes is an emergent preoccupation in our research domain. We therefore developed a representation of the elementary and complex changes that can be applied to OWL-DL ontologies (more than 60 operations were identified and characterized by means of properties). Based on this representation, we conceived and built a first version of the CHB and SAM systems. Concerning the CHB system, we underlined our three principal contributions. Firstly, based on the change ontology, we developed a model that allows ontology editors to capture elementary and complex changes in a uniform manner. Secondly, we proposed the OC language for change formalization. Using a minimal number of constructs, together with those of OWL, this language can formally represent all types of changes in OWL-DL ontologies. Thirdly, we offered a solution to the problems of tool-oriented log-file access and interpretation: the trace of formalized changes is appended to the new ontology version in a manner that keeps this version OWL compliant. Regarding the SAM system, our principal contribution consisted in the exploration of new and essential ideas in the ontology-based referencing domain, i.e. an appropriate modification of resource referencing in order to allow access to all resources by means of the new ontology version. For this purpose, the SAM system offers solutions and guides users during the process of referencing modification. It maps between the referencing of resources (i.e. the set of UKIs) and the ontology changes in order to identify the affected UKIs.
It analyses the change effects on the access to and the interpretation of resources. For UKIs affected by problematic changes, it identifies a set of concepts belonging to the new ontology version that can be pertinent for the UKIs modification. Finally, it allows users to choose among these different solutions by giving them information about the appropriateness of each identified concept. Having completed the evaluation of the prototypes of both the CHB and SAM systems, we currently aim to improve them so that they can treat all types of elementary and complex changes, as well as different representation formats of semantic referencing. We are also working on a project to integrate them into the TELOS system for eLearning and knowledge management.
References

[1] N. Noy and M. Klein, Ontology evolution: Not the same as schema evolution, Knowledge and Information Systems 5 (2003).
[2] J. Heflin and J. Hendler, Dynamic Ontology on the Web, 17th National Conference on Artificial Intelligence (AAAI), 2000.
[3] L. Stojanovic, A. Maedche, N. Stojanovic, and R. Studer, Ontology Evolution as Reconfiguration-Design Problem Solving, Second International Conference on Knowledge Capture, 2003.
[4] A. Maedche, B. Motik, and L. Stojanovic, Managing Multiple and Distributed Ontologies in the Semantic Web, VLDB Journal, Special Issue on Semantic Web 12 (2003), 286-302.
[5] L. Stojanovic and B. Motik, Ontology Evolution within Ontology Editors, Knowledge Acquisition, Modeling and Management (EKAW), Sigüenza, Spain, 2002.
[6] M. Klein, Y. Ding, D. Fensel, and B. Omelayenko, Ontology management: Storing, aligning and maintaining ontologies, in Towards the Semantic Web: Ontology-Driven Knowledge Management, J. Davies, D. Fensel, and F. van Harmelen, Eds., Wiley, 2002, 47-69.
[7] N. Noy and M. Musen, Ontology Versioning as an Element of an Ontology-Management Framework, IEEE Intelligent Systems (2003).
[8] T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web, Scientific American 5 (2001), 34-43.
[9] J. Hendler, Agents and the Semantic Web, IEEE Intelligent Systems 3/4 (2001), 30-37.
[10] L. Stojanovic, A. Maedche, B. Motik, and N. Stojanovic, User-driven Ontology Evolution Management, 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW02), Sigüenza, Spain, 2002.
[11] M. Klein and N. Noy, A component-based framework for the ontology evolution, Workshop on Ontologies and Distributed Systems, IJCAI 2003, Acapulco, Mexico, 2003.
[12] M. Klein, Change Management for Distributed Ontologies, Vrije Universiteit Amsterdam, 2004.
[13] L. Stojanovic, Method and tools for ontology evolution, University of Karlsruhe, Germany, 2004.
[14] D. Rogozan, Management of the ontology evolution: methods and tools for an evolving semantic referencing based on analysis of changes applied to ontology versions (in French), PhD thesis, LICEF Center, Université du Québec à Montréal (UQAM)/Télé-université (TELUQ), Montréal, 2008.
[15] P. Haase and Y. Sure, State-of-the-Art on Ontology Evolution, Technical report, SEKT informal deliverable 3.1.1.b, Institute AIFB, University of Karlsruhe, 2004.
[16] N. Noy, S. Kunnatur, M. Klein, and M. Musen, Tracking Changes During Ontology Evolution, 3rd International Semantic Web Conference (ISWC2004), Hiroshima, Japan, 2004.
[17] G. Paquette, Modélisation des connaissances et des compétences, pour concevoir et apprendre, Presses de l'Université du Québec, 2002.
[18] D. Rogozan and G. Paquette, Managing Ontology Changes on the Semantic Web, IEEE/WIC/ACM International Conference on Web Intelligence (WI'05), Compiègne, France, 2005.
[19] W3C WebOnt Working Group, OWL Web Ontology Language Guide and Reference, 2004.
[20] H. Stuckenschmidt and M. Klein, Integrity and change in modular ontologies, 18th International Joint Conference on Artificial Intelligence, Acapulco, Mexico, 2003.
[21] P. Ceravolo, A. Corallo, G. Elia, and A. Zilli, Managing Ontology Evolution Via Relational Constraints, 8th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES), Wellington, New Zealand, 2004.
[22] S. Handschuh, Semantic Annotation of Resources in the Semantic Web, in Semantic Web Services, R. Studer, S. Grimm, and A. Abecker, Eds., Springer, Berlin Heidelberg, 2007, 135-155.
[23] H. Luong and R. Dieng-Kuntz, A rule-based approach for semantic annotation evolution, Computational Intelligence 23 (2007), 320-338.
[24] T. Berners-Lee, R. Fielding, and L. Masinter, Uniform Resource Identifier (URI): Generic Syntax, RFC 3986, Network Working Group, 2005.
[25] J. Nielsen, Usability Engineering, Academic Press, Boston, 1993.
[26] M. Miles and A. Huberman, Qualitative Data Analysis (2nd edition), Sage Publications, Thousand Oaks, CA, 1994.
[27] G. Paquette, I. Rosca, S. Mihaila, and A. Masmoudi, TELOS, a service-oriented framework to support learning and knowledge management, in E-Learning Networked Environments and Architectures: A Knowledge Processing Perspective, S. Pierre, Ed., Springer-Verlag, 2007.
[28] G. Paquette and F. Magnan, Learning Resource Referencing, Search and Aggregation at the eLearning System Level, IODE Workshop, ECTEL-07 Conference, Crete, September 18-21, 2007.
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-24
CHAPTER 2
Authoring and Exploring Learning Content: Share Content by Sharing Concepts Darina DICHEVA1 and Christo DICHEV Computer Science Department, Winston-Salem State University Winston Salem, NC 27110, USA
Abstract. We propose an environment that enables authors to create learning repositories by collecting and annotating learning content using a consensually agreed vocabulary, and learners to explore the repositories based on relevant starting points for exploration. The authors' support includes: (i) tools for creating an ontological structure, partially populated with learning resources, to be used as a skeleton for structuring and organizing course-related resource repositories, and (ii) help in selecting names for new concepts/topics combined with their subject identification. Besides the conventional querying and browsing support for learners, the focus is on tasks that imply exploratory search requiring extensive navigation on the part of the user. In this context we propose a method for finding good starting points for navigation, designed to assist learners in performing open-ended search tasks in learning repositories.

Keywords. Ontology-based courseware, metadata harvesting, ontology mapping
Introduction

The volume of publicly available information is growing drastically. This phenomenon opens new avenues and brings new challenges to instructors and learners. Although the Web offers an abundance of learning resources, finding learning materials is a difficult task with unpredictable results. The critical issue hampering the effective use of e-learning resources is not the lack of information, but its poor structure and the lack of adequate tools for efficient information retrieval. E-learning repositories, such as Merlot2, SMETE3, and CAREO4, were proposed as key enablers for facilitating the access to and utilization of learning resources. An important goal motivating their development was to boost the sharing and reuse of learning materials. Despite the significant efforts though, much of the learning content provided by such repositories
1 Corresponding Author: Darina Dicheva, Winston-Salem State University, 601 S. Martin Luther King Jr. Drive, Winston Salem, NC 27110, USA; E-mail: [email protected].
2 http://www.merlot.org
3 http://www.smete.org/smete/
4 http://careo.ucalgary.ca
D. Dicheva and C. Dichev / Authoring and Exploring Learning Content
failed to attract a sizable and enduring audience. On the other hand, there is growing interest in the new generation of community-centered online repositories [1,2] that demonstrate many useful applications. This implies a need for addressing the impacting factors. Since the aim of this work was to demonstrate the viability of Topic Maps based e-learning repositories, an important objective was to address the factors contributing to the low rate of use of current e-learning repositories. Although general repositories have extensive breadth of coverage, the depth of the material is typically insufficient and the information is frequently not up to date. Given the large amount and the variable quality of learning content, potential users face the daunting problem of how to find what they need. As a result, discipline-specific repositories that are more targeted and offer a higher concentration of relevant materials are emerging. One of the factors hindering the success of current repositories is that they rely on authors to provide both the learning materials and the related metadata. Typically, they don't provide support for automatic metadata creation or ontology extraction. Furthermore, in many repositories the published learning resources lack a cohesive structure, and when they do have one, it reflects a particular perspective. This arbitrary structuring, often combined with arbitrariness in selecting topic names, contrasts with the emerging tendency of using domain ontologies and classification systems that offer standard vocabularies and organization mechanisms. Thus an important missing element in current educational repositories is support for organizing the content based on agreed-upon principles and naming conventions. For a learner facing a repository with a massive volume of resources coming from many individual and institutional collections, an obvious question is: where does one begin?
If learners cannot locate learning materials and determine their relevance quickly, the repository is unlikely to be used. Therefore, assistance with selecting a starting point for repository exploration is another feature impacting repository usability. A further problem is the plethora of duplicate content on the web, published on different sites. This in turn leads to the problem of co-reference, where different URLs are used to address the same resource. For example, ACM, IEEE, Citeseer and DBLP have different URIs for the same papers and authors. Co-reference presents a problem when there is a need to merge learning content from disparate information providers while eliminating duplicates; identifying duplicates requires comparing resources. Similar problems arise with merging topics. Efficient comparison implies some sort of identifiers, which in turn raises the question of how to create effective subject and resource identifiers. These problems and the related factors indicate the need for new principles for building e-learning repositories. In this chapter we take an integrated approach to addressing the hindering factors in the context of domain-specific learning repositories. The central focus is on the support for harvesting existing structures and for subject identification (as a means for merging and reusing existing repository components), which also enables modular content creation. Though the key ideas are exemplified in an authoring tool extending TM4L [3], their significance is independent of the selected framework. TM4L is an integrated e-learning environment providing authoring and browsing support for creating and using Topic Maps-based e-learning repositories. It utilizes topic maps as overlay semantic structures that encode domain knowledge and connect it to learning resources considered relevant to a specific domain.
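The role of shared identifiers in duplicate elimination can be sketched as follows. The subject-identifier strings and provider URLs are invented for illustration; the point is only that merging keys on the identifier, not on the provider-specific URL.

```python
def merge_collections(*collections):
    """Merge resource lists from different providers, treating two entries
    as the same resource when they share a subject identifier (their URLs
    generally differ from provider to provider)."""
    merged = {}
    for collection in collections:
        for resource in collection:
            merged.setdefault(resource["subject_id"], resource)  # keep first copy
    return list(merged.values())

# Two providers publishing the same resource under different URLs
provider_a = [{"url": "http://repo-a.example.org/items/17",
               "subject_id": "subj:os/deadlocks"}]
provider_b = [{"url": "http://repo-b.example.org/r/9041",
               "subject_id": "subj:os/deadlocks"}]
merged = merge_collections(provider_a, provider_b)
```

Without an agreed subject identifier the two entries would be indistinguishable from genuinely different resources, which is exactly the co-reference problem described above.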
In this subject-centered architecture, each concept (topic) is a hub for resources possibly
grouped by additional characteristics, such as resource type (e.g. definitions, code examples, PPT slides, lecture notes, quizzes, articles, etc.) or other LOM attributes. This enables instructors to organize course-related learning resources effectively and students to search them efficiently. From an instructor's perspective, the emphasis of the work presented here is on the shareability and reusability of learning content (existing or being created by the instructor). We propose a framework that supports the collecting, structuring, and exploration of learning content organized in ontological structures limited to the main concepts covered by a specific course (course ontology) and partially populated with learning resources. Such semantic structures can be used as a skeleton for organizing and merging course-related resources. From a learner's perspective, the emphasis is on enhancing users' navigation support and assisting users to quickly find an appropriate starting point for exploring relevant information. More specifically, the presented framework provides means for assisting authors and learners in:
• extracting conceptual structures from existing online documents with a usable degree of accuracy,
• collecting free online learning content associated with an existing conceptual structure,
• selecting consensual names and machine-processable subject identifiers for new concepts/topics to be added to a conceptual structure (course ontology),
• finding a good starting point for content exploration.
Accordingly, we propose an approach for the modular creation and reuse of ontology-based course material, including a set of heuristics for extracting semantic information from course-related HTML documents (transformable into a Topic Map format), an approach to using Wikipedia in the construction of a course ontology as a mediating ontology and source of subject identifiers, and an ontology mapping technique for stepwise course ontology creation (based on an already created conceptual structure and a proposed new concept). The chapter is organized as follows. In Section 1 we present a view on a course ontology structure that supports the proposed modular creation of learning repositories. In Section 2 we present our heuristics for extracting course ontology components in Topic Maps (TM) format from HTML documents (specified by the author). In Section 3 we present our view on subject identity and its implications for finding and merging relevant information. In Section 4 we present an approach for extracting consensual information from Wikipedia. Section 5 describes our methods for providing users with starting points for their exploration. Section 6 presents an evaluation of the entire approach. Section 7 discusses relevant work and Section 8 offers a concluding discussion.
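As a flavour of such extraction heuristics, the sketch below treats HTML headings as candidate course topics nested by heading level. This is a deliberately minimal stand-in, not the actual TM4L heuristics of Section 2, and the sample HTML is invented.

```python
from html.parser import HTMLParser

class TopicExtractor(HTMLParser):
    """Collect (level, title) pairs from h1-h3 headings as candidate topics."""
    def __init__(self):
        super().__init__()
        self.topics = []
        self._level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._level, self._buffer = int(tag[1]), []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.topics.append((self._level, "".join(self._buffer).strip()))
            self._level = None

extractor = TopicExtractor()
extractor.feed("<h1>Operating Systems</h1><h2>Processes</h2><h2>Scheduling</h2>")
```

The resulting (level, title) pairs form exactly the kind of topic/subtopic skeleton that the course-ontology pattern of Section 1 relies on.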
1. Integrating Learning Repositories with Identifiable Topics Online course resources are frequently organized and rendered around a structure of course topics (course syllabus). Typically, different courses on the same subject exhibit different structures, though with a sizable overlapping of their topics. For example, Table 1 shows course topics from only four (of a larger group) websites found by using Google with the query “Operating Systems” course notes. The high level topics
Processes, Scheduling, Synchronization, Deadlocks, Memory Management, Virtual Memory, File Systems, and File Structures are overlapping. The listed course syllabi are based on different textbooks: Operating System Concepts by Silberschatz, Galvin, and Gagne; Operating Systems: Internals and Design Principles by Stallings; and Operating Systems by Tanenbaum and Woodhull. Such course sites are typically accompanied by links to course support materials. Despite the substantial proportion of overlapping topics, however, the supporting instructional material for the individual courses (including slides, exercises, examples, and tests) is usually diverse. These course topic structures are typically expanded into several levels of topics and subtopics, which can be interpreted as a lightweight ontology of the course domain, which we call a course ontology. Course-related instructional resources can then be linked to the topics of this ontology. We view the course ontology as a communication language between instructors and between instructors and learners. This view is in line with the digital library trend of organizing resources in the form of concept-based, subject-specific repositories. We propose to use the observed “course ontology” pattern for structuring learning resources and particularly for consolidating (merging) free course materials in a controlled manner. In particular, when creating or extending a course resource repository, harvesting methods such as those discussed in Section 2 can exploit the adopted course structure pattern.

Table 1. Sample TOC of OS courses.
Course 1: Operating Systems: Processes, Threads, Scheduling, Synchronization, Deadlocks, Memory Management, Virtual Memory, File Systems, File Structures
Course 2: Operating Systems: Processes, Scheduling, Process Synchronization, Deadlocks, Memory Management, Virtual Memory, File Systems, File System Implementation, I/O Systems
Course 3: Operating Systems: Processes, Scheduling, Concurrency, Threads, CPU Scheduling, Deadlocks, Memory Management, Virtual Memory, File Systems, File Structures, Distributed Systems
Course 4: Operating Systems: Processes, Threads, Scheduling, Synchronization, Deadlocks, Memory Management, Virtual Memory, File Systems, File Structures, Security, Distributed Structures, Distributed File System
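The degree of topical overlap visible in Table 1 can be quantified. A minimal sketch using the Jaccard similarity of two of the listed syllabi (topic lists copied from the table; the function name is our own):

```python
def jaccard(a, b):
    """Jaccard similarity of two topic sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# High-level topics of the first two courses in Table 1.
course1 = ["Processes", "Threads", "Scheduling", "Synchronization",
           "Deadlocks", "Memory Management", "Virtual Memory",
           "File Systems", "File Structures"]
course2 = ["Processes", "Scheduling", "Process Synchronization",
           "Deadlocks", "Memory Management", "Virtual Memory",
           "File Systems", "File System Implementation", "I/O Systems"]

print(jaccard(course1, course2))  # six shared topics out of twelve distinct
```

The measure is symmetric and ignores topic order, which matches the observation that syllabi share topics while differing in structure.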
Course ontologies do not always exhibit the stability and topical similarity illustrated by Table 1. Table 2 lists the course topics from four websites found by using Google with the query “Web programming” Spring 2007. In this case, the topics listed under the four courses have little in common besides the course titles. This result demonstrates why learners and instructors sometimes cannot benefit from instructional resource repositories such as Merlot and SMETE. It also illustrates the need for mapping the variety into a manageable and predictable space of topic names. Thus choosing appropriate topic names and relating topics meaningfully is an essential issue in course ontology creation. Our primary insight here is that Wikipedia can play the role of a shared context between course topic maps’ authors and users and a mediator for subject identification. To incorporate this type of functionality and assist authors in identifying, naming, and relating subjects, TM4L has been extended with means for harvesting consensus information from Wikipedia. Wikipedia, indeed, can provide a rich pool of consensual topic names, topic subject indicators, and topic subject identifiers that can support modular development of domain-specific ontologies (in our case topic maps) and simplify the organization of e-learning repositories. Our
work on exploiting Wikipedia is focused on three related aspects: mediation in topic naming, subject identification, and course ontology consolidation. Since names are the primary mechanism for denoting subjects in human/machine discourse, we need sharable topic names. Deciding what concepts/topics to include in a course ontology, and which names for the selected topics are widely agreed upon, is a substantial challenge for authors [4,5]. We address this challenge by using Wikipedia as a source of “standard” topic names, its articles as descriptive resources (subject indicators), and the article addresses as subject identifiers (see Sections 3 and 4).

Table 2. Sample TOC of Web Programming courses.
To support modular and distributed creation of learning repositories we provide two types of merging: local and interoperable. Local merging can be used for creating repositories by combining course ontologies of the type discussed in relation to Table 1; it assumes that the collections to be merged are available in a Topic Maps format and that the merging is based on topic names. The second type of merging is based on subject identifiers in URI format; it assumes that the naming and subject identification mediated by Wikipedia are completed prior to the merging. One possible scenario for modular learning repository creation is harvest-reconcile-merge, where harvesting is based on the methods described in Section 2, while reconciling is based on the approaches described in Sections 3 and 4.
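The two merge modes might be sketched as follows; the dict-based topic representation here is a simplification for illustration, not the actual TM4L or XTM data model:

```python
def merge_topics(maps, key):
    """Merge topic collections, unifying topics that share `key`:
    'name' for local (name-based) merging, 'sid' for subject-identifier
    (URI-based) merging."""
    merged = {}
    for tm in maps:
        for topic in tm:
            k = topic[key]
            if k in merged:
                # Same subject: pool the attached resources.
                merged[k]["resources"] |= set(topic["resources"])
            else:
                merged[k] = {key: k, "resources": set(topic["resources"])}
    return merged

# Two toy topic maps covering the same subject.
tm_a = [{"name": "Deadlocks",
         "sid": "https://en.wikipedia.org/wiki/Deadlock",
         "resources": ["slides1.pdf"]}]
tm_b = [{"name": "Deadlocks",
         "sid": "https://en.wikipedia.org/wiki/Deadlock",
         "resources": ["notes.pdf"]}]

result = merge_topics([tm_a, tm_b], key="name")   # local merging
print(sorted(result["Deadlocks"]["resources"]))
```

Switching `key="name"` to `key="sid"` gives the identifier-based mode, which presupposes that Wikipedia-mediated naming has already been completed.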
2. Digging the Web for Course Ontology Sketches

In the first stage of our scenario, a ‘draft’ topic map is built automatically and offered to the author, so that they don’t have to start from scratch but from the proposed draft. The author can accept or delete any of the proposed topic map objects and continue building the topic map by adding new topics, relationships, and resources.
This aspect of our support for authors in creating course topic maps was motivated by the fact that there is a significant amount of semi-structured information on the web. HTML documents, for instance, are structured for rendering purposes; however, the structures can also be used for extracting some semantic information (by simply parsing the HTML files). For example, list items can be transformed into members of “whole-part” relationships; information items (topics) marked up as “bold” or “italic” can be considered semantically important; etc. Thus our goal was to find out what semantic information can be extracted from the existing HTML markup of web pages and included in a ‘draft’ of the intended course topic map. Since there are no formal rules for extracting semantic information, heuristic approaches are practically feasible. Observations and experiments with various types of semi-structured information in HTML format led us to propose the following heuristic rules for draft TM fragment extraction (the rules are not ordered).

2.1. Defining “Page” Topics

A ‘draft’ topic map consists of topics and relationships. These objects are extracted by crawling a specified website. In the extraction, we differentiate between two types of topics: topics that reflect the website topology and topics extracted from the text of the web pages.

Rule 1: A new topic is created in the topic map for each web page visited by the crawler. We call these topics “page” topics.

Rule 2: All the topics created in the process of parsing a specific web page are subtopics of the “page” topic for that page.

Rule 3: Naming “page” topics: The “page” topic corresponding to the entry page for the site is named with the theme of interest, provided by the user; all other “page” topics are named using the text in the corresponding HTML anchor elements (the anchor tag defines a hyperlink destination).

2.2. Information Extraction from Heading Tags, List Element Tags, and Table Tags

Rule 4: A heading represents a topic that is more general than the topics extracted from the text below it (if any).

Rule 5: The topics extracted from headings of different levels can be organized in a taxonomy reflecting the headings’ levels (1, 2, etc.).

Rule 6: Heading tags on a referenced (through an anchor element) web page are considered structurally related to the “page” topic of the referencing page. Thus the result of parsing heading tags consists of a set of topics named by the text enclosed in the corresponding heading elements and connected to the “page” topic of the referencing page with a relationship specified by the user.

Rule 7: The topics extracted from list item tags are more detailed sub-topics of the topics contained in the enclosing list tags. The list–list item relationship between two topics is modeled by a “child-parent”-type relationship. A table of contents presented as a bulleted list in HTML has an isomorphic image in terms of a “whole-part” tree in the topic map.

Rule 8: The topics extracted from the cells of one column in a table are related, since they can be considered values of the same attribute (represented by the column header).
Rule 9: The topics extracted from the cells of one column in a table are subtopics of the topic corresponding to the column header.

Extracting relationships from text is recognized as a hard problem, but we can at least try to capture the relatedness of topics that appear in the same HTML elements:

Rule 10: Group the topics extracted from the same HTML element together, since this grouping indicates some kind of relatedness of the topics.

2.3. Information Extraction from Course Web Pages

Rule 11: Extraction of course topics: Find an anchor element with a name belonging to the set {Course Syllabus, Syllabus, Course Schedule, Course Outline, Course Material, Course Notes, Class Schedule, Class Notes, Material, Lectures, Lecture Schedule, Schedule, Reading List and Schedule, Lecture Notes, Lectures and Reading, Handouts, Description, Logistics}. On the referenced page, find a table with a column whose name belongs to {Topics, Lectures, Lecture Topics, Sections}. The topics extracted from that table column are related to the “page” topic of the referencing page with a “whole-part” relationship.

Rule 12: Extraction of course resources: If the element from which a topic has been extracted includes an anchor tag containing a URL of a file (PDF, PPT, graphic/video format), extract the URL as a resource for the topic.

Rule 13: Extraction of course resources from a table: In the table on the referenced page (see Rule 11), look for a column whose name belongs to {Reading, Reading material, Material, Notes, Comments}. If such a column is found and a column cell includes an anchor tag containing a URL of a file (PDF, PPT, graphic/video format, etc.), extract the URL as a resource for the topic extracted from the same table row.

The proposed rules have been used to create a TM4L plug-in for topic map fragment extraction (see Fig. 1).
Figure 1. Results screens with: anchor tag extraction circled (left) and list tag extraction circled (right).
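A few of the heuristic rules (headings as topics with levels, Rules 4–5; list items as sub-topics, Rule 7) can be illustrated with Python's standard html.parser. This is a toy sketch; the actual TM4L plug-in is considerably richer:

```python
from html.parser import HTMLParser

class DraftTopicExtractor(HTMLParser):
    """Collect heading topics (with their levels) and list-item sub-topics."""
    def __init__(self):
        super().__init__()
        self.topics = []      # (level, name) pairs from h1-h3: Rules 4-5
        self.subtopics = []   # list-item texts: Rule 7
        self._tag = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "li"):
            self._tag = tag   # remember which element we are inside

    def handle_data(self, data):
        text = data.strip()
        if not text or self._tag is None:
            return
        if self._tag.startswith("h"):
            self.topics.append((int(self._tag[1]), text))
        elif self._tag == "li":
            self.subtopics.append(text)
        self._tag = None

page = ("<h1>Operating Systems</h1><h2>Processes</h2>"
        "<ul><li>Threads</li><li>Scheduling</li></ul>")
ex = DraftTopicExtractor()
ex.feed(page)
print(ex.topics)      # heading topics with taxonomy levels
print(ex.subtopics)   # list-item sub-topics of the preceding topic
```

A real crawler would also record anchor texts for “page” topics (Rule 3) and attach PDF/PPT links as resources (Rules 12–13).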
3. Subject Identity as a Key to Merging

As we already mentioned, choosing appropriate topic names is an essential issue in the topic map construction process. Since selecting names is directly related to subject identification, in this section we present our view on subject identity and its implications for finding and merging relevant information.
Identification of a subject is involved when one wants to say something about that subject or when one tries to comprehend what was said about it. An example of this type of duality can be seen in the information world, where content creators and content consumers need to communicate. In the area of learning content authoring, we view a topic map as a form of communication between a content author and learners. From this viewpoint, we attempt to analyze the different roles that subject identities, and their names in particular, can play in organizing e-learning repositories. The focus is on the interchange of information between humans through machines. In this context we address both sides of the dual system and propose some solutions intended to assist content creators as well as content consumers in dealing with problems typical of digital repositories.

Topic maps [6,7] as a means for encoding knowledge use special symbols known as topics to represent the subjects of interest in the world. The topics act as proxies for the subjects of the real world, allowing statements to be asserted about them. However, different topics may refer to the same thing. Topic maps are designed to be merged together, so it must be possible to say that two topics coming from different sources refer to the same thing and can therefore be merged. In the Topic Maps model this is done with Published Subject Indicators and Published Subject Identifiers (PSIs). Technically, a URI can be used as a reference to a resource acting as a subject indicator, which unambiguously identifies to a human being the subject represented by a topic. In effect, the referred resource describes the represented topic. A subject identifier is a locator that refers to a subject indicator and is assumed to uniquely identify a topic (subject). The implicit assumption is that equal symbols (URIs) represent the same subject; as a result, if two subjects share the same URI they are considered identical.
However, it is unrealistic to assume that everyone can be made to use exactly the same URI to refer to equivalent entities, particularly in a web environment; such a goal is unlikely to be achieved by administrative and organizational measures. In the following we will try to justify our claim based on the semiotic triangle [8,9]. The semiotic triangle consists of the following three elements (see Fig. 2):
• referent - the specific object of the real or abstract world we want to talk about;
• concept - the idea of the referent that a human has in his or her mind;
• symbol - an expression of the concept that is used to communicate with others.
Figure 2. The semiotic triangle.
Every referent possesses a number of individual characteristics. However, a user considers only the relevant characteristics and ignores the rest. The sum of the relevant characteristics of a referent is called the concept. The concept is the user's idea of the referent, an inner image of the relevant object that every human creates unconsciously. It is subjective and depends on the context of the user. For example, for a manager in a
company producing dairy products, a cow is a source of milk, but for a biologist, cows are domesticated ungulates, members of the subfamily Bovinae of the family Bovidae.

Considering the relationships between these elements, we note that there is a direct relationship between the referent and a concept, because the concept is a subset of the overall characteristics of the referent. There is a less direct relationship between the concept and the symbol: in the end, an arbitrary symbol can be chosen which (from the perspective of the creator) is suitable to encode the concept. Such a process is thus inherently indeterminate and yields an unpredictable result. Furthermore, there is only an indirect relationship between the symbol and the referent. This implies that it is impossible to determine how any given symbol refers to any given object of relevance.

In the topic maps context, because the symbol (PSI) selection is not independent of its creator and because the relation between the referent and the PSI is indirect, it is uncertain whether two subject identifiers represent the same referent (subject). Pragmatically, since we cannot describe a subject completely, it is impossible to make a subject and its description absolutely identifiable. There will always be cases where subject descriptors have to be interpreted in a particular context to decide on the subject's meaning. On the other hand, the responsibility for giving a PSI to a resource lies with its creator, who will typically assign PSIs based on the web domain over which they have control. Without authoritative control, this may result in a proliferation of “synonymous” PSIs. Taking into account the computational cost of storing different PSIs and of determining when they refer to the same resource, we propose to use Wikipedia articles as subject descriptors and their URIs as subject identifiers.
According to the Topic Maps specification, a subject indicator is a resource that is intended by the topic map author to provide an unambiguous indication of the identity of a subject. Thus, topic maps' subject indicators can be viewed as concept descriptors in semiotic terms. This viewpoint suggests the following conceptual strategy for choosing a PSI: before deciding on your subject descriptor, check the bank of potential subject descriptors for a match with your concept descriptor. If we assign Wikipedia the role of such a bank of potential subject descriptors, this strategy translates into the following simple rule: if a Wikipedia article matches your concept descriptor, then select this article as a subject indicator and its URI as a subject identifier. The idea here is to facilitate and unify the process of creating subject identifiers, without any authoritative oversight, by providing established sources of potential subject descriptors that also play the role of carriers of subject identifiers. The benefit lies in minimizing the unnecessary proliferation of PSIs, reducing the number of subject identifiers for the same subject, and consequently facilitating topic map merging.
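The rule above might be sketched as follows. The function merely forms the canonical English Wikipedia article URL for a topic name; whether the page exists and actually matches the intended concept still has to be verified by the author (the function name is our own):

```python
from urllib.parse import quote

def wikipedia_psi(topic_name):
    """Return a candidate subject identifier (URI) for a topic name."""
    # Wikipedia titles use underscores for spaces.
    title = topic_name.strip().replace(" ", "_")
    # The first letter of a title is case-insensitive; canonicalize it.
    title = title[:1].upper() + title[1:]
    return "https://en.wikipedia.org/wiki/" + quote(title)

print(wikipedia_psi("virtual memory"))
```

Two authors who independently apply this convention to the same topic name obtain the same URI, which is exactly what identifier-based merging requires.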
4. Using Wikipedia for Course Ontology Construction and Subject Identification

We suggest that Wikipedia, due to its fast-growing reputation as a universal knowledge repository and its rapidly expanding usage [10,11], is well suited to play the role of the bank of common descriptors, that is, of a source of consensual subject names and definitions. Wikipedia page titles can be considered consensual topic names. Relevant concepts can be extracted from a page's text (for example, corresponding to terms with hyperlinks). Relationships between topics can be found in some layout elements, such as tables and bulleted lists, at the price of some further analysis. In
particular, summary tables that list key facts about the subjects can yield meaningful relationships.

Ideally, the topic map designer would enter a tentative topic name, which TM4L transforms into a Wikipedia search query, and the result would be a single page whose title matches the query exactly (which means that the page describes the target topic). As a side effect, the topic becomes part of Wikipedia's broader-than/narrower-than lattice. This approach, however, fails when the topic name does not match any page title. In such a case, a standard solution would be to take the top-ranked page returned by a Wikipedia keyword-based search. This, however, is not always the best choice, since the page corresponding to the intended subject may be ranked well below the top. To eliminate the incidental candidate pages, we propose to apply an NLP-inspired disambiguation technique [12]. It assumes that the graph structures provided by both Wikipedia and the topic map will sustain matching-based disambiguation reasoning similar to what is now accepted practice in ontology mapping [13]. Our intuition is that the target page from Wikipedia (the optimal hit) will be surrounded by other pages of high relevance to the overall subject of the topic map. Therefore, when assessing the relevance of a particular page we should also consider its neighborhood, for example by counting the relevant pages lying in the vicinity of a candidate. However, in strict graph-theoretic terms, the neighborhood of a page, that is, the set of pages connected to it by incoming or outgoing links, might be large and will inevitably comprise semantically unrelated pages (for example, the Operating Systems page has a link to the US Government page). Thus, exhaustive neighborhood exploration would be expensive. For the same reason, the relevant neighbor pages should not be expected to appear themselves in the candidate list of the query targeting the initial topic (the focus of the search).
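A minimal sketch of this vicinity-counting idea, assuming candidate lists and a page-to-outlinks map are given (all page names and the data structures are hypothetical):

```python
def score_candidates(focus_candidates, peripheral_lists, links):
    """Score each candidate page for the focus topic by how many pages
    from the peripheral (neighbor-topic) candidate lists it links to."""
    neighbor_pool = set().union(*peripheral_lists)
    scores = {}
    for page in focus_candidates:
        connected = links.get(page, set()) & neighbor_pool
        scores[page] = len(connected)
    # The best-connected candidate plays the role of the optimal hit.
    return max(scores, key=scores.get), scores

# Toy data: disambiguating "Deadlock" within an OS topic map.
links = {
    "Deadlock": {"Mutual exclusion", "Resource allocation"},
    "Deadlock (film)": {"Film director"},
}
best, _ = score_candidates(
    ["Deadlock", "Deadlock (film)"],        # candidates for the focus
    [{"Mutual exclusion"}, {"Banker's algorithm"}],  # peripheral lists
    links)
print(best)
```

The real technique additionally weighs category-based relatedness, as described below.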
Since a neighbor will score well with respect to a query focused on its own topic, we propose to launch a limited number of peripheral queries to capture the pages relevant to the neighbor topics from the topic map (ancestor, sibling, and descendant topics). Thus, we hypothesize that a good discriminator between the optimal hit and the irrelevant candidates will be the number of neighbor pages listed under the candidates by peripheral queries. Our hypothesis is rooted in the broad coverage of Wikipedia: we expect that within an average topic map a substantial number of topics will have an equivalent Wikipedia page. Thus, the optimal hit of the focus should be recognizable as a strong hub in the Wikipedia graph restricted to the pages in the candidate lists for the focus and side queries. The basic claim supporting our approach is that performing topic search within Wikipedia neighborhood-wise rather than page-wise substantially increases the chances of detecting the optimal hit. We use graph-matching techniques to establish a correspondence between the two neighborhood structures. Intuitively, the right page to return, the optimal hit, will be relevant with respect to the keywords (though not necessarily the most relevant one) and will also be linked by hyperlinks to some pages of higher relevance to the focus topic or to neighbor topics. The optimal hits for neighbor topics should be “in the semantic vicinity” of the optimal hit of the focus topic; that is, there will be either direct links or hyperlink paths of small size (a few links). One may also hypothesize that the optimal hits of two neighbor topics will belong to the same category, or to two different categories that rapidly join into a common category above them (typically in one upward move). The goal of the proposed algorithm is to find the optimal hit among the candidates in the candidate page list. The task tackled is a specific case of ontology mapping, i.e.,
mapping the topic map ontology onto the rich and complexly structured Wikipedia knowledge repository. In this case the mapping has a focus: a specific topic that limits its scope to a small neighborhood in the topic map. The images of the topics, i.e., the Wikipedia articles, however, need not be neighbors, due to the immensely richer structure of Wikipedia compared to an educational topic map. Hence, instead of a subgraph isomorphism between the two structures, we look for the best matching of a page cloud from the candidate lists against the topic neighborhood (seen as a pattern). The quality of the matching depends on intra-cloud similarity rather than on topic-to-page scores. Indeed, the optimal hit is the page maximizing its overall semantic relatedness to other pages from the side lists, which reflects the intuition that the neighborhoods in topic maps are semantically cohesive. With regard to semantic relatedness of pages, we exploit the category structure of Wikipedia. Thus, the semantic relatedness of two pages is a function of the shortest path lengths to a common semantic category. Formally, let P1 and P2 be two distinct Wikipedia pages; then their relatedness ratio is:

Rel(P_1, P_2) = 1 - \min_{C \in Cat_{WP}} \frac{path_c(P_1, P_C) + path_c(P_2, P_C)}{2\, depth(P_C) + path_c(P_1, P_C) + path_c(P_2, P_C)}

where Cat_WP is the set of all categories, P_C is the page of the category C, depth(P_C) is the length of the shortest path from the page of the root category in Wikipedia Categories to P_C, and path_c() is the length of the shortest path between two pages comprising only category links. The above measure is further constrained by excluding links to/from non-semantic categories, i.e., those related to Wikipedia management (for example, Categories requiring diffusion). A threshold on distances is used to force zero relatedness whenever the distances between pages and common categories are too high; thus the absence of a good candidate matching the focus topic can be explicitly communicated to the user. Details about the proposed algorithm can be found in [12].
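A sketch of the relatedness measure, assuming the shortest category-link distances and category depths have been precomputed and are supplied as plain dictionaries (the names and the distance threshold are illustrative):

```python
def relatedness(p1, p2, categories, path, depth, limit=6):
    """Rel(P1,P2) = 1 - min over categories C of
    (path(P1,C) + path(P2,C)) / (2*depth(C) + path(P1,C) + path(P2,C)).
    Pages farther than `limit` from every common category get relatedness 0."""
    best = None
    for c in categories:
        d1, d2 = path.get((p1, c)), path.get((p2, c))
        if d1 is None or d2 is None or d1 + d2 > limit:
            continue  # no common category within the threshold
        ratio = (d1 + d2) / (2 * depth[c] + d1 + d2)
        best = ratio if best is None else min(best, ratio)
    return 0.0 if best is None else 1.0 - best

# Toy category data: both pages sit directly under one category.
path = {("Paging", "Memory management"): 1,
        ("Virtual memory", "Memory management"): 1}
depth = {"Memory management": 4}
print(relatedness("Paging", "Virtual memory", ["Memory management"], path, depth))
```

Deep, specific common categories (large depth) pull the ratio down and the relatedness up, matching the intent of the formula.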
5. Finding a Starting Point for Content Exploration

Finding a good starting point is a critical step for successful browsing. It is desirable to start navigating a repository from a place that allows reaching the relevant learning content in a few clicks. Indeed, search can be improved if we switch from keyword search to a more semantics-driven search combined with subsequent browsing. This implies not just returning a set of resources containing the search keywords but placing the user in a relevant location, i.e., at a starting point for further exploration for resources. A topic map version of such query-initiated navigation is implemented in TM4L. In the proposed approach, querying is seen as a means to identify starting points for navigation, and navigation is guided by information supplied in the query. Note that topic maps based on consensual naming conventions, such as the one described in the previous section, facilitate and improve the accuracy of the query process. Our motivation came from studies reporting that users often supplement querying with extensive manual navigation [14] and observations that open-ended search tasks entail a significant amount of manual navigation [15]. Moreover, many users prefer to
navigate rather than “jump” to a target document, as doing so enables them to understand the surrounding context, a process known as orienteering. An additional motivation came from our view of subject identity as formed by the (unique) set of a subject's relationships with other subjects. For example, if we hear the terms “Process Management”, “Memory Management”, and “File Systems”, a reasonable prediction would be that the discussion is about Operating Systems. But if we perceive the terms “Paging”, “Resident Set”, “Page Replacement”, “Demand Paging”, and “Thrashing”, we would guess that the discussion is on Virtual Memory. Exploratory search typically entails browsing resources grouped around related subjects. Therefore, when users are able to describe their exploratory interest in terms of related subjects, the latter can be used for finding a promising area for exploration.

5.1. Aiding Repository Exploration

Loosely speaking, good starting points for exploration are groupings of documents that permit easy navigation to many documents matching the user's need via one or more short paths. In the next section we discuss an algorithm for selecting a starting point for exploration. It takes as input a set of topics (entry topics) and outputs a collection of topics qualified as starting topics for a topic map exploration; the latter are found through their relationships with the entry topics. There are different ways to specify a subject. In our case, we are interested in information objects (articles, tutorials, handouts, etc.) describing the subject. When users are able to describe their exploratory interest in terms of related subjects, it is helpful to provide them with assistance in the form of a navigational strategy for the area of exploration, as illustrated in Fig. 3. Assume that, in an interactive mode, the user submits a sequence of topics intended as an initial entry for computing the starting point for browsing.
For the input list {“Critical Sections”, “Mutual Exclusion”} the user will be presented with the segment containing the topics “Synchronization”, “Critical Sections”, and “Mutual Exclusion” as a starting point for exploration, that is, with the minimal sub-graph containing the topics from the input list. For the list of topics {“Scheduling Criteria”, “Synchronization”, “Critical Sections”} the user will be presented with the starting list {“Scheduling Criteria”, “Scheduling”, “Processes”, “Synchronization”, “Critical Sections”}, etc.

Figure 3. Partial topical structure of Processes: Processes has sub-topics Scheduling (with Scheduling Criteria and Scheduling Algorithms), Synchronization (with Critical Sections and Mutual Exclusion), and Deadlock.
5.2. Identifying Starting Points

A topic map can be represented as a graph G(T, A) of topics T and associations A. The task of finding a starting point for exploration based on a user's entry topic list can then be formulated as finding the minimal sub-graph containing the entry topics. More precisely, given a graph G(T, A), the aim is to identify a sub-graph Gm of G that meets the following conditions:
1. Gm contains all nodes from the Entry list (the user input).
2. Gm is minimal, that is, it contains as few nodes as possible.
3. Gm is connected (if possible).

In the following description of the algorithm we denote by Trv = Traversed(T) = (T1, T2, ..., Tk) the set of all topics Ti directly associated by any association A to topic T; that is, each Ti is a neighbor of T with respect to an association A. The algorithm maintains the following data structures:
- Path(start_node, end_node, length, path) is an object that stores a path between two nodes, along with its length, start node, and end node.
- Input is a list that holds the user input and remains unchanged throughout the execution of the algorithm.
- Entry stores a modifiable copy of Input.
- Open is a list of topics in the topic map that are yet to be examined.
- Closed is a list of topics that have already been examined.

The Open list acts like a queue, so the search space may become too large; a depth limit is therefore imposed. Accordingly, d(j) denotes the depth of node j and p(j) denotes the predecessor of node j within the breadth-first search (BFS) tree.

1. FOR each topic ti in Input (1 <= i <= #Input) DO:
   1.1. Initialization. Copy all entries {ti} from Input to Entry. Set Open = {ti} and Closed = {ti}. Initialize p(ti) = ti and d(ti) = 0. For all other topics (nodes) tj set their depths and predecessors to undefined: p(tj) = d(tj) = undefined.
   1.2. Algorithm body. WHILE Entry and Open each contain at least one element, DO:
        a. Take the head th of Open and delete th from Open.
        b. IF d(th) > depth limit THEN EXIT.
        c. FOR all topics tj in Traversed(th) and not in Closed: set d(tj) = d(th) + 1, p(tj) = th, and add tj to Open and Closed. If tj is in Entry, delete tj from Entry and create a Path object Path_i,j that contains all nodes on the path from topic ti to topic tj; the path can be determined by following the predecessors of tj.
2. Out of all paths Path_i,j, create a weighted graph G = (V, E) with V = Input and e_i,j ∈ E iff Path_i,j exists (i, j ∈ V). The weight of edge e_i,j is the length, given by the number of nodes, of Path_i,j. Thus an edge between two nodes represents the shortest path between these two nodes, and the weight of an edge is its length within the original graph (topic map); if there is no edge, no path has been found between the two nodes.
3. Finally, calculate the minimal spanning tree Gmin of G with Kruskal's algorithm and replace each edge of Gmin with the path given by its corresponding Path_i,j object. Gmin is the output of the algorithm; it represents the minimal topic map structure based on the user's input.
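The algorithm can be sketched compactly; the adjacency-dict topic map and the union-find shortcut below are illustrative simplifications of the Path/Open/Closed bookkeeping described above:

```python
from collections import deque
from itertools import combinations

def bfs_path(graph, start, goal, depth_limit=10):
    """Shortest path from start to goal (list of nodes), or None."""
    prev, frontier = {start: None}, deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:       # walk predecessors back to start
                path.append(node)
                node = prev[node]
            return path[::-1]
        if d < depth_limit:
            for nb in graph.get(node, ()):
                if nb not in prev:
                    prev[nb] = node
                    frontier.append((nb, d + 1))
    return None

def minimal_subgraph(graph, entry):
    """Kruskal-style: add pairwise shortest paths in order of length,
    skipping any path whose endpoints are already connected."""
    paths = [p for a, b in combinations(entry, 2)
             if (p := bfs_path(graph, a, b))]
    paths.sort(key=len)
    comp = {t: t for t in entry}          # tiny union-find over entry topics
    def find(x):
        while comp[x] != x:
            x = comp[x]
        return x
    nodes = set()
    for p in paths:
        ra, rb = find(p[0]), find(p[-1])
        if ra != rb:                      # this path joins two components
            comp[ra] = rb
            nodes.update(p)
    return nodes

# Toy topic map mirroring Fig. 3 (undirected adjacency).
g = {"Processes": ["Scheduling", "Synchronization"],
     "Scheduling": ["Processes", "Scheduling Criteria"],
     "Synchronization": ["Processes", "Critical Sections"],
     "Scheduling Criteria": ["Scheduling"],
     "Critical Sections": ["Synchronization"]}
print(sorted(minimal_subgraph(
    g, ["Scheduling Criteria", "Synchronization", "Critical Sections"])))
```

On this toy map the output reproduces the starting list given earlier for the same entry topics.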
Note that Kruskal's algorithm [16] is a greedy algorithm: starting from a new, empty graph, edges are added in order of increasing cost. The output of the algorithm is a set of nodes. If the goal is to present a single node as output, the algorithm requires one more phase, in which one of the nodes from the output set is selected as the starting node based on some criteria. The adopted approach is based on the notion of a center of a graph: a node C such that the maximum distance from C to any other node Ti is minimized. When the center is unique, the algorithm terminates by returning the center of the output graph as the starting point for exploration. In general, however, the center of a graph is not unique; moreover, the set of output nodes may not form a single graph. In such cases additional selection criteria must be applied. Our implementation is based on the following algorithm:
1. If Gmin is a forest with k trees, select the tree trj (0 < j <= k) containing the maximum number of topics ti ∈ Input. The tree trj connects a maximal subset of topics out of all trees within Gmin generated from the user's input; the intuition is that trj captures the topic that the user had in mind.
2. FOR all topics tk in trj DO:
   a. Accumulate the hop distances from tk to all other nodes within trj.
   b. Determine the node(s) tmin with the shortest accumulated distance to all other nodes.
3. IF tmin is unambiguous, return tmin.
4. IF tmin contains more than one node, return the node(s) from tmin that maximize the number of nodes reachable within two hops.
5. IF the center is still not unique, return the node from tmin that maximizes the number of nodes reachable within one hop.
6. IF the center is still not unique, return the first node from tmin.
Thus the algorithm ends by identifying the center topic of the output structure of topics.
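The accumulated-hop-distance criterion (steps 2a–2b) might be sketched as follows; for brevity, ties are resolved here simply by taking the first minimum, omitting the two-hop and one-hop tie-breakers:

```python
from collections import deque

def hop_distances(graph, source):
    """BFS hop distances from source to every reachable node."""
    dist = {source: 0}
    q = deque([source])
    while q:
        n = q.popleft()
        for nb in graph.get(n, ()):
            if nb not in dist:
                dist[nb] = dist[n] + 1
                q.append(nb)
    return dist

def center(graph):
    """Node with the smallest accumulated hop distance to all others."""
    totals = {n: sum(hop_distances(graph, n).values()) for n in graph}
    return min(totals, key=totals.get)

# The output tree for the earlier example (undirected adjacency).
tree = {"Processes": ["Scheduling", "Synchronization"],
        "Scheduling": ["Processes", "Scheduling Criteria"],
        "Synchronization": ["Processes", "Critical Sections"],
        "Scheduling Criteria": ["Scheduling"],
        "Critical Sections": ["Synchronization"]}
print(center(tree))
```

For this tree the accumulated distances single out one node, so the tie-breaking steps never fire.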
From an implementation viewpoint, the algorithm terminates based on the user's preference: a set of starting nodes or a single starting node. If the user selects the first option, the algorithm terminates by returning the Output set. If the user selects the second option, the algorithm completes the second phase and terminates by returning a single node as output. If the user does not have any other preferences, this node is suggested as a starting topic for browsing (see Fig. 4).

When we cannot evoke directly the identity of a particular entity, we often recall that entity by recalling some other things related to it. For example, we can ask for “that bolt about 4 inches long, with a black top” when we cannot remember “part P734-9”. Similarly, in the course of conversation we may fail to remember the name ML but may still remember that it is a functional programming language, developed by Robin Milner, and that it influenced some newer languages such as Haskell. This information might be enough to convey the subject identity to the other participants in the conversation. These observations can be expressed in terms of necessary conditions for unique identification. Such facts suggest the possibility of using a set of subjects as a means of determining the identity of another subject uniquely related to them. This intuitive observation about subject identity can be expressed in terms of the minimal graph used in the above algorithm, satisfying additional constraints.
D. Dicheva and C. Dichev / Authoring and Exploring Learning Content
Figure 4. For the Operating System TM and the list of topics {Conditions for Deadlock, Safe and Unsafe State}, the user is presented with the starting list {Deadlocks, Deadlock Characterization, Conditions for Deadlock, Deadlock Prevention, Banker's Algorithm, Safe and Unsafe States}.
Let T1, T2, ..., Tk be a list of entry topics. If the minimal graph containing the entry topics T1, T2, ..., Tk contains a topic T that is uniquely related to all the topics in the entry list, then the entry list T1, T2, ..., Tk can be used to confer the identity of topic T. A particular perspective on this definition allows an interesting interpretation of the multiple identifiers phenomenon, in terms of different groups of topics related to the same topic. This approach allows multiple identifiers, i.e., the construction of different sets for a topic's identification. For example, the subject "ML" can be identified by two different lists of topics: {"functional programming", "year of creation 1973"} and {"functional programming", "creator Robin Milner"}.
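This definition admits a direct, if naive, sketch: given a map from each topic to the set of subjects related to it, an entry list identifies a topic exactly when that topic is the unique one related to every entry. The function and the relation map below are illustrative assumptions built from the ML example in the text:

```python
def identified_by(relations, entry_topics):
    """Return the topic uniquely related to all entry topics, or None
    when the entry list does not confer identity on a single topic."""
    candidates = [topic for topic, related in relations.items()
                  if all(entry in related for entry in entry_topics)]
    return candidates[0] if len(candidates) == 1 else None

relations = {
    "ML": {"functional programming", "creator Robin Milner",
           "year of creation 1973"},
    "Haskell": {"functional programming", "lazy evaluation"},
}

identified_by(relations, ["functional programming", "creator Robin Milner"])
# → "ML"
identified_by(relations, ["functional programming"])
# → None (ambiguous: both ML and Haskell qualify)
```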
6. Evaluation

The evaluation of the proposed approaches (implemented in TM4L) consists of two parts: an evaluation of the approach for building a 'draft' topic map, that is, extracting topical structures and resources, and an evaluation of the use of Wikipedia as a provider of topic names and subject identifiers. In order to evaluate our approach to topic and resource extraction, we applied it to 70 web pages related to instruction, selected from the following three representative categories: Wikibooks web pages (30); web pages from the ACM/IEEE Computing Curricula 2001 recommendation (10); and web pages of selected course sites (30). The assessment of the performance of our crawler, which uses the heuristic rules proposed in Section 3, is based on the standard measures from Information Retrieval, recall and precision. For our task, recall is interpreted as the number of relevant topics returned in a particular grouping, divided by the total number of relevant topics in that grouping. Precision is interpreted as the number of relevant topics returned, divided by the total number of topics returned by the crawler using the proposed heuristics. The evaluation results are shown in Table 3 in terms of recall and in Table 4 in terms of precision.
Table 3. Recall values for the three categories of websites.

  Web Pages                      # Relevant topics returned   Total # of relevant topics   Recall
  Course pages                   1463                         1563                         0.936
  ACM/IEEE Computing curricula    672                          672                         1.0
  Wikibooks pages                1471                         1471                         1.0

Table 4. Precision values for the three categories of websites.

  Web Pages                      # Relevant topics returned   Total # topics returned   Precision
  Course pages                   1463                         1903                      0.7687
  ACM/IEEE Comp. curricula        672                          815                      0.8245
  Wikibooks pages                1471                         2497                      0.5891
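The figures in Tables 3 and 4 follow directly from these counts; the snippet below recomputes them (counts copied from the tables):

```python
# Topic counts per category (from Tables 3 and 4)
relevant_returned = {"course": 1463, "acm_ieee": 672, "wikibooks": 1471}
total_relevant    = {"course": 1563, "acm_ieee": 672, "wikibooks": 1471}
total_returned    = {"course": 1903, "acm_ieee": 815, "wikibooks": 2497}

# recall = relevant topics returned / total relevant topics in the grouping
recall = {k: relevant_returned[k] / total_relevant[k] for k in relevant_returned}
# precision = relevant topics returned / total topics returned by the crawler
precision = {k: relevant_returned[k] / total_returned[k] for k in relevant_returned}

print(round(recall["course"], 3))       # 0.936
print(round(precision["acm_ieee"], 4))  # 0.8245
```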
Our assessment demonstrated an acceptable accuracy for the proposed rules. Interpreting the topic extracted from an anchor element as the root for all topics extracted from the referenced page generally resulted in a natural hierarchical structuring of the topics in the extracted tree, and thus in a reasonable level of precision and recall. The text extracted from the anchor elements was in most cases appropriate for naming the "page" topics. Heading tags allowed the extraction of topics that were indeed related by a "child-parent" relationship to the root topic. In a similar way, concepts extracted from headings were linked by a taxonomic relation to the concepts extracted from the subordinate sections. Topic extraction from HTML list elements produced the best results in terms of precision, combined with a good recall. The hierarchical topic structure captured from lists (see Fig. 1, right) reflects the nested nature of the corresponding list structure, thus preserving the intended relationships between the list items in the document.
We carried out an assessment of our tool for topic-to-Wikipedia matching, which is now entering a more comprehensive stage. The initial step of the performance study involved a reasonably sized topic map (consisting of 65 topics), partly covering the Operating System field. In this experiment, we compared our tool to the basic service universally available to authors, i.e., the Wikipedia search engine. The performance of both was evaluated in terms of precision and recall. Since the goal in this case was to match each topic in the topic map to the most relevant Wikipedia page, recall is interpreted as the number of correctly matched Wikipedia pages divided by the total number of relevant ("matching") Wikipedia pages for the considered set of topics. Table 5 summarizes the results on recall.

Table 5. Recall scores for both search methods on the Operating System topic map.

  Algorithm          # Relevant topics returned   Total # matching Wikipedia pages   Recall (%)
  Topic matcher      46                           51                                 90.19
  Wikipedia search   36                           51                                 70.58
Table 6. Precision scores for both search methods on the Operating System topic map.

  Algorithm          # Exact matches   # Relevant matches   Total # matches   Precision (exact, %)   Precision (relevant, %)
  Topic matcher      33                46                   59                55.93                  77.96
  Wikipedia search   30                36                   65                46.15                  55.38
Regarding precision, we consider two forms of matching, exact match and relevant match, as it is possible that for a particular topic there is no exactly matching article in Wikipedia; we correspondingly calculate two precision scores. These scores are interpreted as the number of exact (respectively, relevant) matches divided by the total number of matched Wikipedia pages. The results are presented in Table 6. This initial assessment shows that the proposed algorithm clearly outperforms the Wikipedia search. In 25 out of the total 65 searches, our algorithm produced better results: in 4 cases it reported "Wikipedia page not found" instead of giving a wrong match; in the remaining 21 cases it suggested better matches than the ones suggested by the Wikipedia search. The Wikipedia search outperformed the proposed algorithm in 14 cases. Only 18 of the total 65 searches returned identical results. The remaining 8 searches gave the following results:
• 1 case – neither algorithm found a matching page;
• 5 cases – both algorithms gave wrong results (i.e., matched unrelated Wikipedia articles to the topics);
• 2 cases – both algorithms found pages that are somewhat related, but not closely related.
An interesting observation was that in the case where neither of the tools found a matching page, it turned out that the name of the topic in the TM had been misspelled; and in most of the cases where both algorithms gave wrong results, the names of the corresponding topics in the TM were not well chosen (e.g., they were too general, such as "case" or "process").
7. Related Work

There is substantial work on integrating information from different sources, including consolidating information using common identifiers. The TAP platform [17] tries to overcome the problem of using different names for the same concept through "semantic negotiation". SemTag [18] has a similar motivation; it uses an algorithm to disambiguate entities, combined with a large-scale effort at tagging and assigning identifiers (topic URIs) to web pages. The approach of performing object consolidation on FOAF data based on values of inverse functional properties, as described in [19], handles data from a large number of sources obtained from the Web. Heuristic ontology and schema mappings have been studied in systems such as PromptDiff [20]. In all these systems, already created repositories need to be integrated. In contrast, our focus is on an infrastructure for publishing learning resources, where resource integration is not an a posteriori concern but a major aspect of the TM4L authoring strategy. In addition, our interest is in instance-level integration based on schema-level integration.
In relation to recent efforts to tackle the subject identity problem, Pepper and Schwab advocate the concept of the Public Resource Identifier [21]. Newcomb introduces the Versavant project [22], which provides a topic map application bus acting as a "subject addressing engine". Bouquet et al. [23] motivate the problem of (re)using common identifiers as one of the pillars of the Semantic Web, and provide a framework for fusing identifiers. Our focus was not on general methods for solving the subject identity crisis, but on a pragmatic strategy for subject identity, applied to identifying subject representations in a particular domain (e-learning) in order to promote reusability. A substantial number of methods have been proposed for text extraction, including text mining and NLP approaches [24, 25]. The classic supervised learning methods are not suitable for our goal, since they require initial training. Classical NLP approaches are also unsuitable, as their language analysis is specialized to a specific topical area. Close to our approach is OntoLT [26], a plug-in for the ontology editor Protégé; however, it proposes automatic extraction of concepts and relationships from previously annotated text collections. Fortuna et al. describe in [10] a system for semi-automatic ontology extraction from a document collection using text mining. While our goal is close to theirs, we aim at the initial creation of a topic map. Another difference is that our tool extracts not only concepts (topics) but also resources related to them. A major difference between the above approaches and ours is that the extracted data is intended for human consumption. Thus the novelty is in the collaborative building of the subject structure by the human author and the agent. Therefore, the user interface is as important as the accuracy of concept extraction. We chose Wikipedia as a mediating ontology since it provides a broad source of general knowledge.
It has considerable depth in many specialized fields, and its development model allows it to keep up with a rapidly changing world [27]. At design time, the alternatives showed some weaknesses. DBpedia [28] is a lightweight ontology created by a straightforward extraction of information from the Wikipedia infoboxes; while the amount of information is significant (close to 100 million triples), important ontological knowledge is still missing. WordNet [29] also deals with lightweight semantic structures; however, it was not designed to be an ontology but rather a "language map". Ontology mapping is an active area with a rich literature (see [13] for a survey). The majority of the methods in the literature boil down to a comparison of ontology entities along two axes: terminological, i.e., comparison of entity names, and structural, i.e., matching of entities with respect to the way they are linked to each other in the ontology graph. More specifically, Hatala et al. explored ontology mapping for mediating between subject structures in e-learning [30]. In all these cases, comparable ontologies are confronted, whereas the structures we compare are incommensurable in all aspects; the latter requires an asymmetric approach like the one we propose.
8. Conclusion

We identified several problems related to the reusability and sharability of e-learning resources. The solutions to these problems that we propose have been incorporated into TM4L, a system that serves as a platform for creating task-oriented information support for learners.
When creating a course repository, authors actually perform two separate tasks: they specify a course ontology, and then "populate" the ontology with instances (learning resources). The approach for "digging the web" proposed here semi-automates and combines the two tasks. It is based on the observation that online learning resources are typically published in some sort of structure. Therefore, when gathering resources one can gather them in their (reusable) structures, and vice versa. With the increasing popularity of ontology-driven learning applications, the ability to expand an existing shared vocabulary into a larger shared vocabulary will become very important. We presented here an approach to Wikipedia-mediated vocabulary coordination for tackling this problem. We believe that the proposed approach will be beneficial for authors of learning resources. One of its main advantages is that repositories that are Wikipedia-mappable are also mergeable: authors can merge their course resources with other course resources. Finally, gathering documents with information about subjects implies a possibility for identifying and representing the subjects. We presented our view on subject identity and discussed some implications of recognizing identity as an abstraction capturing a subject's relationships with other subjects. One implication of this view is that related subjects can be interpreted as a weak form of identity and thus used as a good starting point for browsing exploration. Part of the work presented here addresses this issue from a practical perspective by suggesting a step towards solving the problem.
Acknowledgements

We would like to thank Petko Valchev for contributing the mapping algorithm, and Jan Fisher, Steven Roberson and Jean Francois Djoufak Kengue for their contributions to the implementation of the TM4L Extractor and Context-browsing plug-ins.
References

[1] T. Hammond, T. Hannay, B. Lund, J. Scott, Social Bookmarking Tools (I), D-Lib Magazine 11(4), April 2005.
[2] G. Macgregor, E. McCulloch, Collaborative tagging as a knowledge organization and resource discovery tool, Library Review 55(5) (2006).
[3] D. Dicheva and C. Dichev, TM4L: Creating and Browsing Educational Topic Maps, British Journal of Educational Technology – BJET 37(3) (2006), 391-404.
[4] J. Greenberg and W.D. Robertson, Semantic Web Construction: An Inquiry of Authors' Views on Collaborative Metadata Generation, in: Proc. of the Int'l Conf. on Dublin Core and Metadata for eCommunities, Florence, Italy, 2002, 45-52.
[5] D. Dicheva and C. Dichev, Authoring Educational Topic Maps: Can We Make It Easier?, in: 5th IEEE Int'l Conf. on Advanced Learning Technologies, Kaohsiung, Taiwan, 2005, 216-219.
[6] M. Biezunski, M. Bryan, and S. Newcomb, ISO/IEC 13250:2000 Topic Maps: Information Technology. http://www.y12.doe.gov/sgml/sc34/document/0129.pdf
[7] XML Topic Maps (XTM) 1.0. http://www.topicmaps.org/xtm
[8] C.K. Ogden and I.A. Richards, The Meaning of Meaning: A Study of the Influence of Language Upon Thought and of the Science of Symbolism, Routledge & Kegan Paul, London, 1923.
[9] F. de Saussure, Nature of the Linguistic Sign, in: C. Bally and A. Sechehaye (Eds.), Cours de linguistique générale, McGraw Hill Education (1916).
[10] B. Fortuna, D. Mladenic, and M. Grobelnik, System for Semi-Automatic Ontology Construction, in: Proc. of the 3rd European Semantic Web Symposium, 2006.
[11] M. Hepp, D. Bachlechner, and K. Siorpaes, Harvesting Wiki Consensus – Using Wikipedia Entries as Ontology Elements, in: Proc. of the 1st Workshop SemWiki'06, Budva, Montenegro, 2006.
[12] D. Dicheva, C. Dichev, and P. Valchev, Integrating Metadata Harvesting with Semantic Search, in: IEEE/WIC/ACM Int'l Conference on Web Intelligence and Intelligent Agent Technology, Sydney, Australia, Dec 9-12, 2008.
[13] P. Shvaiko and J. Euzenat, A survey of schema-based matching approaches, Journal on Data Semantics 4 (2005), 146-171.
[14] S. Pandit and C. Olston, Navigation Aided Retrieval, in: Proceedings of WWW 2007, 391-400.
[15] R.W. White and S.M. Drucker, Investigating Behavioral Variability in Web Search, in: Proceedings of WWW 2007, 21-30.
[16] J.B. Kruskal, On the shortest spanning subtree of a graph and the traveling salesman problem, Proceedings of the American Mathematical Society 7 (1956), 48-50.
[17] R. Guha and R. McCool, TAP: a Semantic Web platform, Computer Networks 42(5) (2003), 557-577.
[18] S. Dill et al., SemTag and Seeker: bootstrapping the semantic web via automated semantic annotation, in: WWW'03 (2003), 178-186.
[19] A. Hogan, A. Harth, and S. Decker, Performing Object Consolidation on the Semantic Web Data Graph, in: Proceedings of the WWW2007 Workshop I3: Identity, Identifiers, Identification, Banff, 2007.
[20] N. Noy and M. Musen, PromptDiff: a fixed-point algorithm for comparing ontology versions, in: Proc. of the 18th National Conference on AI (AAAI'02), Edmonton, Canada, 2002.
[21] S. Pepper and S. Schwab, Curing the Web's Identity Crisis: Subject Indicators for RDF, Technical Report, Ontopia, 2003. http://www.ontopia.net/topicmaps/materials/identitycrisis.html
[22] S.R. Newcomb and P. Durusau, Multiple Subject Map Patterns for Relationships and TMDM Information Items, in: Proceedings of Extreme Markup Languages, Montreal, Canada, 2005.
[23] P. Bouquet, H. Stoermer, and D. Giacomuzzi, OKKAM: Enabling a Web of Entities, in: Proceedings of the WWW2007 Workshop I3: Identity, Identifiers, Identification, Banff, 2007.
[24] E. Gabrilovich and S. Markovitch, Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis, in: Proceedings of the 21st National Conference on AI (AAAI'06), Boston, MA, 2006.
[25] M. Strube and S. Ponzetto, WikiRelate! Computing Semantic Relatedness Using Wikipedia, in: Proceedings of the 21st National Conference on AI (AAAI'06), Boston, MA, 2006.
[26] P. Buitelaar, D. Olejnik, and M. Sintek, A Protégé Plug-In for Ontology Extraction from Text Based on Linguistic Analysis, in: 1st European Semantic Web Symposium, 2004.
[27] D. Ahn, V. Jijkoun, G. Mishne, K. Müller, M. de Rijke, and S. Schlobach, Using Wikipedia at the TREC QA Track, in: Proceedings of TREC, 2004.
[28] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, DBpedia: A nucleus for a web of open data, in: Proceedings of the 6th International Semantic Web Conference (ISWC), Lecture Notes in Computer Science 4825, Springer (2007), 722-735.
[29] C. Fellbaum (Ed.), WordNet: An Electronic Lexical Database, MIT Press, 1998.
[30] D. Gasevic and M. Hatala, Ontology mappings to improve learning resource search, British Journal of Educational Technology, Special Issue on The Semantic Web for E-learning 37(3) (2006).
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-44
CHAPTER 3
Using a Computing Ontology in Curriculum Development Lillian CASSEL 1 Villanova University, USA
Abstract. The computing disciplines have a long history of curriculum recommendations developed and disseminated by the education committees of relevant societies. The process has become unmanageable as the disciplines become more diverse and the field expands. Recent work on an ontology of all computing offers a possibility of a new approach to curriculum development, both at the society level and for individual departments, by providing a comprehensive and objective view of the entire discipline. This chapter presents the motivation for an ontology-driven curriculum process and a preliminary plan for its use. Keywords. Curriculum development, ontology, computer science education, computing disciplines.
Introduction

The computing disciplines are relatively new entries into the university curriculum. They began nearly simultaneously in several different places including Mathematics, Engineering, and Business. Each place of origin brought its own perspective to the field. As early as the 1960s, a need was recognized for a recommended curriculum for departments that were struggling to define what the new field entailed. Curriculum '68 [1] provided the first formal recommendation for computer science curricula and contributed substantially to defining the new field. Computer science curriculum recommendations followed in 1978 [2], in 1991 [3] and in 2001 [4]. Similarly, curriculum recommendations emerged for programs in Information Systems [5, 6, 7] and in Software Engineering [8, 9]. By the time of the 2001 curriculum recommendation effort, it was clear that the number of distinct computing areas requiring specification had grown. Eventually, a separate volume appeared for each of Computer Science [4], Information Systems [7], Computer Engineering [10], Software Engineering [9] and Information Technology [11]. Each recommendation includes a body of knowledge (BOK). Significant overlap makes the boundaries fuzzy at best. Other areas, such as Information Science and Artificial Intelligence, develop without these official definitions. The large number of
1 Corresponding Author: Lillian Cassel, Department of Computing Sciences, Villanova University, 800 Lancaster Avenue, Villanova PA 19085-1699; E-mail: [email protected].
separate reports, and the expectation that more will develop in the future, motivated production of a roadmap to the various computing disciplines [12]. These reports serve a very important function. They provide an authoritative, stable reference point in a shifting universe. They define a core and a set of electives for each type of program. Program designers, instructors, textbook authors, and accreditation agencies rely on them. The issues are important both within the computing disciplines and among those who look to computing educators to provide important concepts to merge with other disciplines, defining new fields called Computational-X or Y-informatics. The process of defining a separate curriculum recommendation for each possible subfield of computing and for every combination of computing and another discipline is simply not practical. There needs to be a way to see the big picture, to have a common frame of reference on which the many variations can be built. "In the context of computer and information sciences, an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse" [13]. The computing ontology takes as its domain the breadth of the computing and information topic areas. In addition to concern about the growing number of related fields that require definition and curriculum recommendations, there is a concern that parts of the field may be missed and that relationships between the fields may not be well understood. Because the computing disciplines are still relatively young, a comprehensive ontology presents several opportunities:
• It may be possible to coordinate the development and dissemination of knowledge about the various components of the discipline in an organized way.
• It may also be possible to keep experts aware of each other's work, avoiding unnecessary duplication and highlighting missing or underserved areas.
Achieving these goals requires a complete representation of the entirety of the discipline in all its manifestations and of all its component parts. The representation must be dynamic, responding to changes as the field evolves, and it must be accessible to all who can use it. This chapter presents the motivation and guiding principles of an ontology of all of the computing disciplines and discusses its application in generating computing curricula, both at the level of society recommendations and at the level of individual program curriculum review and development. Section 1 provides a brief survey of relevant work in the area. Section 2 describes the current situation, the goals, and some of the challenges of this work. Section 3 provides some considerations for the use of the computing ontology in developing curriculum plans and curriculum recommendations.
1. Some Related Work

A number of papers have considered ways in which an ontology can serve the purposes of education. That is a theme of this book. For example, Chi considers sequencing of content to address specific learning objectives [14]. Fok looks at the role of ontologies in personalized education [15]. Gupta et al. look at the use of established ontologies of science topics as a tool for indexing content in the National STEM Digital Library [16]. Khan and Hardas have considered ways that an ontology can help relate topics and
course resources [17]. The literature on the application of ontologies for the organization of knowledge applied to learning is growing. In this chapter, the specific interest is in the use of an ontology in the development of curriculum recommendations for the computing community.
2. The Computing Ontology

2.1. A Goal

To address the needs of the disciplines, specifically in terms of supporting curriculum development, we suggest a need for an interactive structure for the representation and exploration of the unified body of knowledge of all of the computing and information related disciplines. The result should be a facility to:
• maintain a current classification scheme for computing and information related work,
• address the challenges of updated curriculum recommendations,
• support curriculum development for creative new types of programs of study, and
• ease the path toward accreditation for non-standard programs of study.
Although our current focus is on support of curriculum development, the computing ontology has other potential uses. For example, it should replace the ACM CCS for tagging research work and should facilitate groupings and support searching. If properly done, it could lead to more effective categorization of related research.

2.2. Starting Places

Each of the curriculum recommendations mentioned in the Introduction includes a body of knowledge. These documents come from the ACM, IEEE-CS, AIS, IFIP, the Australian Computer Society, German Accreditation for Informatics Programs, the British Computer Society and other organizations concerned with documenting the content of computing curricula. These documents list topics from the perspective of educational needs. Relationships between topics express core educational requirements, topics that lead to learning outcomes, and topic dependencies. In addition to the curriculum documents, the Association for Computing Machinery maintains a Computing Classification System (ACM CCS), originally developed in 1964. An entirely new system appeared in 1982 and updates followed in 1983, 1987, 1991, and 1998. The 1998 version undergoes regular review and update [18]. Although every effort has been made to keep the ACM CCS up-to-date and relevant to the classification of computing literature, many who must use it to categorize their work find it inadequate. It is not easily modified to reflect the changing dimensions of the discipline. The focus of the ACM CCS is the classification of research and publications in the computing disciplines. The relationships coded include subtopic and a few "isRelatedTo" connections. These documents provide a wealth of information for defining the computing fields. However, the lists are not consistent, nor are they complementary. They
overlap and sometimes contradict each other. In some cases, there are contradictions among publications from the same source.

2.3. Ontology Structure

Many candidates for a structure of a computing ontology exist. A few serve as examples of the similarities and differences. The ACM Computing Classification System has 11 top-level categories:
• General Literature
• Hardware
• Computer Systems Organization
• Software
• Data
• Theory of Computation
• Mathematics of Computing
• Information Systems
• Computing Methodologies
• Computer Applications
• Computing Milieux
The Australian Computer Society Body of Knowledge has 13 top-level categories:
• Computer Organisation and Architecture
• Conceptual Modeling
• Database Management
• Data Communications and Networks
• Discrete Mathematics
• Ethics/Social Implications/Professional Practice
• Interpersonal Communications
• Program Design and Implementation
• Project Management and Quality Assurance
• Security
• Software Engineering and Methodologies
• Systems Analysis and Design
• Systems Software
The German Accreditation gives us another view:
• Automata and formal languages
• Algorithms and data structures
• Databases
• Operating systems
• Communication systems (particularly networks)
• Computer architecture
• Programming engineering
• Software engineering (particularly modeling) and project management
• Projects with high software engineering content
There are more. All attempt to provide a categorization of the areas that comprise the computing disciplines. The views are different, but none is wrong. Each sees the same whole from a different perspective. What we seek to do with the computing ontology is to bring the many views together in order to provide a common base to which all can refer. Each will emphasize different things, perhaps, but all will do so with an awareness of the parts they are including and the parts they are leaving out. A useful join of the many views is a multidimensional problem. It must include:
• basic elements,
• multiple views,
• applications of views, and
• unrestricted combinations.
An ontology that serves the purposes of curriculum development in the computing disciplines must include all the relevant terms of the field and must represent the relationships between and among the terms and the clusters of terms that define subspecialties. The purpose of the multiple views, for example, is to allow the developer of a new software engineering curriculum to have a comprehensive view of the entirety of the topic domain from which the relevant material will come. That will allow the program designer to make informed decisions about what should be included in the new program and what other topics are closely tied to the ones that the program requires. Further, a view into the ontology will allow the program designer to explain to university administrators and to potential students just what part of the computing landscape is included in the new program, how it differs from other programs, and what the successful student will have accomplished on completion of the program.

2.4. Topics and Outcomes

Just listing all the topics and all the combinations is not sufficient. In the education domain the topic space serves as a tool to be incorporated into applications that address educational issues. It cannot stand on its own. As identified by an International Federation for Information Processing (IFIP) working group, curriculum development involves seven aspects [19]:
• Body of Knowledge
• Foundational Material
• Application Context
• Social Context
• Breadth and Depth
• Thematic Coherence
• Outcomes
The ontology provides a representation of the Body of Knowledge. Rules in the ontology connect elements in the body of knowledge with other aspects of the overall scheme. An application that references the ontology for educational purposes would classify some elements in the body of knowledge as foundational material. The classification would differ between a computer science curriculum and an information systems curriculum or between a computer engineering curriculum and a software engineering curriculum. The connections made would reflect the goals of the type of curriculum. Some elements provide an application context that gives meaning to items
in the body of knowledge. For example, marketing might provide an application context for the study of databases. Medical records might serve that role for a study of privacy issues. The social context provides a dimension in the study of technology that cannot be neglected. Cryptography is a subject of interest in itself, but the need to protect the intellectual property rights or the privacy rights of an individual provides a different view of its significance. Designing a curriculum requires decisions with respect to the breadth and depth of coverage of the body of knowledge. Some programs are deliberately broad, so that the student has some knowledge of many areas. Others delve deeply into a narrower subarea, so that the student gains a field of expertise. Most programs have some breadth and some depth; making those choices is what distinguishes one approach from another. A program that has bits and pieces of many things without any thematic coherence would leave a student confused about what they have achieved and where their learning can take them. Finally, modern education requires stated outcomes for student learning. Outcomes require more than topics: they require specific types of activities that help achieve learning and also demonstrate that learning has occurred. Even here, however, a meaningful collection of topics forms an important part of the package. The body of knowledge for a given program of study derives from the entirety of the related disciplines. All the other aspects of a complete curriculum development activity combine with the body of knowledge to form a viable program.
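Connecting an outcome's directly targeted topics to everything they transitively depend on is, in graph terms, a simple reachability computation over the ontology. The sketch below assumes a "depends on" adjacency map; the topic names and edges are illustrative assumptions, not drawn from any published ontology:

```python
def topics_for_outcome(ontology, seed_topics):
    """Collect the seed topics together with every topic they
    transitively depend on, following depends-on edges."""
    needed, stack = set(), list(seed_topics)
    while stack:
        topic = stack.pop()
        if topic not in needed:
            needed.add(topic)
            stack.extend(ontology.get(topic, []))
    return needed

# Hypothetical fragment of depends-on edges around "hashing"
ontology = {
    "security": ["hashing"],
    "hashing": ["algorithms", "discrete structures"],
    "algorithms": ["complexity"],
}

topics_for_outcome(ontology, ["security"])
# → {'security', 'hashing', 'algorithms', 'discrete structures', 'complexity'}
```

A curriculum designer targeting an outcome on "security" would thus see the non-obvious prerequisites ("complexity", "discrete structures") surfaced automatically.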
3. Applying the Computing Ontology to Curriculum Development

Imagine that we have a comprehensive representation of all of a discipline, specifically the computing discipline in this case. Picture the topics all shown in such a way that a specific curriculum can be mapped onto the overall representation. The mapping clearly shows where this curriculum addresses the subject matter of the field. It shows what parts of the field have not been covered in this curriculum. This is not a fault of the curriculum; no one curriculum can cover the entirety of the field. This representation shows what has been chosen for inclusion and what has been excluded. Perhaps the curriculum designers look at this representation and realize that something they intended to include has been left out. Perhaps they see that their program has a much greater emphasis in one area than they intended. This clear and objective view allows conscious choices to be made about what to include and what to exclude. Further, the distinctions between a proposed new curriculum and an existing one become visible. The overlaps and the differences are clear. Perhaps the new proposal supports a specialization that is missing from the existing one but builds on the same core. With a clear view of the role of the proposed curriculum, faculty and students easily see the advantages of each option. Learning outcomes can link to specific related topics. However, the topic that is most directly related to the desired outcome may have dependencies on other topics that are not entirely obvious. The relationships built into an ontology make these interconnections clear. Figure 1 illustrates the place of the topic “hashing” in a section of a computing ontology. The picture shows that this topic is related to many others, including algorithms, complexity, discrete structures, language translation, information management, and security. All connect to hashing at some level. A curriculum or
course planner sees the position of this topic relative to others of interest and makes a conscious choice to include or exclude parts of the diagram. The important point is that the planner makes a conscious choice with full awareness of the relationships among these topics. The choice of how much detail to cover is determined by the goals of the course and the desired learning outcomes, not by a lack of understanding about these connections.
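The planner's view of a topic's neighborhood can be sketched as a small graph query: given topic-to-topic relations, list everything reachable from "hashing" within a chosen number of hops and let the planner decide the cut-off. The relations below are a hypothetical fragment echoing Figure 1, not the ontology's actual encoding.

```python
from collections import deque

# Hypothetical fragment of topic relations around "hashing" (echoing Figure 1);
# the real ontology encodes these as typed relationships.
related = {
    "hashing": ["algorithms", "complexity", "discrete structures",
                "language translation", "information management", "security"],
    "algorithms": ["complexity"],
    "information management": ["security"],
}

def connected_topics(start: str, max_depth: int) -> set:
    """Topics reachable from start within max_depth hops (breadth-first)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        topic, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in related.get(topic, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen - {start}

print(sorted(connected_topics("hashing", 1)))
```

The `max_depth` argument is exactly the planner's conscious choice of how far to follow the connections.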
Figure 1. Visual representation of the "hashing" topic
The computing subject domain changes as technology changes and as research advances our understanding. Whether we use the ontology as a tool in developing curriculum recommendations and programs of study or as a tool for classifying research, the field is highly dynamic. That means that any representation of the structure must be open to insertion of new concepts and new relationships.
Figure 2. An automatic visualization showing just 241 nodes
4. The Computing Ontology

The top-level categories of the Computing Ontology are these:

• Algorithms & Theory
• Computer Hardware Organization
• Computer Network Systems
• Computing Education
• Discrete Structures
• Ethical and Social Issues
• History of Computing
• Graphics, Visualization, Multimedia
• Information Topics
• Intelligent Systems
• Mathematical Connections
• Programming Fundamentals
• Programming Languages
• Security
• Systems Development
• Systems & Project Management
• User Interface
It would be helpful to reduce this number, but the overall goal is completeness and clarity of presentation. Experience and time may allow collapsing some categories or a more meaningful grouping. The ontology project appears on the Web at http://what.csc.villanova.edu/twiki/bin/view/Main/OntologyProject, where visitors can also download the ontology as an OWL file.
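As a rough illustration of what consuming an OWL export can look like, the sketch below parses a tiny hand-written OWL/RDF-XML fragment with Python's standard library and lists its classes. The fragment and its class names are invented stand-ins for the real file; serious work with the downloaded ontology would normally use an OWL-aware library rather than raw XML.

```python
import xml.etree.ElementTree as ET

# Minimal OWL/RDF-XML fragment standing in for the downloadable ontology file;
# the class names are illustrative, not taken from the actual OWL export.
owl_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Class rdf:ID="Security"/>
  <owl:Class rdf:ID="Hashing">
    <rdfs:subClassOf rdf:resource="#Security"/>
  </owl:Class>
</rdf:RDF>"""

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

root = ET.fromstring(owl_xml)
# Collect the rdf:ID of every owl:Class element, in document order.
classes = [c.get(f"{{{RDF}}}ID") for c in root.findall(f"{{{OWL}}}Class")]
print(classes)
```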
4.1. Visualization

The ontology must be made visible to be useful for general reference. Figure 1 illustrates one small section of the ontology. Figure 2 shows what happens when all the nodes are shown. Clearly, a controlled presentation of relevant parts must be provided. Figure 3 shows part of an effort to do this. The aim is to provide an interactive interface that allows exploration of the topic space. A search box aids in finding an area of interest; examining what the area contains then helps establish context. Links to related areas also need to be visible. Figure 4 shows the general idea. The immediate context shows as large spheres, with smaller spheres showing connections to other areas. Clicking on a small sphere brings that area of the ontology to the forefront and sends the other area to the back.
Figure 3. Visualizing a level in the computing ontology with its links to the rest of the scheme
4.2. Educational Applications

The Computing Ontology is a centerpiece of plans for changing the way computing curriculum recommendations are produced. As we have seen, computing curriculum recommendations have appeared regularly since 1968. What is not obvious is what it takes to produce these documents. The task has grown substantially since the early efforts, reflecting the complexity of the field and the subdivisions to be included. The 2001 effort identified a need for five separate volumes to provide descriptions of the most widely used variations on computing curricula. Even so, it did not produce recommendations for Information Science, Artificial Intelligence, or Computer Games curricula. These are all existing computing majors. Many more appear in university catalogs and new ones emerge regularly. The cost of doing a proper curriculum recommendation is great – in terms of time, money, and human
effort. As more types of programs demand more definitions and as the field grows, something more efficient must take the place of the old ways. The goals of the new approach are to be faster, more agile, and more flexible and to smooth the process to avoid cataclysmic change; to exploit state-of-the-art technology; to expand the community that can participate in curriculum development and revision; to increase the ways that an interested member of the community can contribute; and to be less authoritarian and more consensual. A new approach would be community-based, supported online, moderated by a committee, and operated continuously. This raises new issues relative to the nature of curriculum recommendations.
Figure 4. Visualizing relationships between parts of the computing ontology
Historically, these recommendations have been relatively stable. They become the basis for new program design, for textbook production, and for accreditation criteria. If the recommendation is constantly changing, how can these needs be met? Figure 5 illustrates a hybrid plan. An existing recommendation remains stable for as long as it serves the community needs. However, there are mechanisms to allow discussions of the current state of the recommendation and of the nature of the field and to capture those discussions in the form of recommended modifications of the curriculum recommendation. The entire structure is moderated by a committee formed by the education committees of the relevant computing groups. Discussions are converted into recommended changes. The recommended changes appear immediately and are available for consideration by curriculum designers and accreditation bodies. When the community or the education committee determines that the updates are sufficiently substantial, the existing curriculum is revised to reflect the accumulated recommendations. The process is incremental, with the intermediate steps clearly visible before they become part of the more stable recommendation. The role of the Curriculum Committee remains important. The committee seeds and steers discussion, decides when and how changes are accepted and commits the next version when that is appropriate. It is important to note that the recommendations
are just that – they are not prescriptive. Though they are used to drive accreditation criteria, they provide overall guidance, not strict requirements. A curriculum recommendation must be responsive to the changing needs of institutions, faculty and students. The new plan will be put to the test with the next major curriculum recommendations. Its success will depend on the will and interest of the computing education community. The ontology will provide a vehicle for an objective view of what options are available, allowing designers to make conscious choices about what to include and what to leave out.
Figure 5. Proposed role of the computing ontology in the curriculum recommendation scheme
4.3. Ontology Connections to Outcomes-Based Curriculum

Figure 6 suggests another way in which the ontology can serve education. Imagine a system in which a drop-down list of learning outcomes appears. When a particular outcome is selected, its related concepts in the ontology are highlighted. More than that, however, concepts related to those that are directly connected to the outcome are also displayed. The course or module developer now has a clear view of all that is involved in the topic at hand. The relationships are explored and a conscious choice is made about how far to follow the connections. Perhaps this is a minor topic in the module and only the most relevant concepts will be presented. Perhaps this is a key component of the program and the related concepts must be explored in depth. The developer sees what is involved from the beginning and makes choices and appropriate plans for including as much of the related material as is appropriate for the needs of this specific case.

4.4. Distinguishing Programs of Study

With the large and growing number of computing programs, how can a student or employer know what each is about? There are several potential uses of the ontology for describing and distinguishing among computing programs. One way is to look at the learning outcomes associated with a program. By selecting all the learning outcomes at the program level (as opposed to the module level illustrated in Figure 6), a person can see sections of the ontology highlighted and know that those sections characterize the goals of that program. By comparing the sections highlighted by several programs, the user can see the overlaps and the distinctions between programs.
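Both the outcome-driven highlighting and the program-level comparison reduce to simple operations over the ontology graph: gather the concepts linked to each selected outcome, optionally expand one hop along ontology relations, then intersect and difference the resulting sets. All of the outcome names and mappings below are invented for illustration; the real mappings would live in the ontology itself.

```python
# Hypothetical outcome-to-concept links and concept relations.
outcome_concepts = {
    "apply hashing to data storage": ["hashing", "information management"],
    "design a software process": ["systems development", "security"],
}
concept_relations = {
    "hashing": ["algorithms", "complexity"],
    "information management": ["security"],
    "systems development": ["systems & project management"],
}

def highlighted(outcomes, expand=True):
    """Concepts to highlight for selected outcomes; expand adds one hop of relations."""
    direct = {c for o in outcomes for c in outcome_concepts[o]}
    if not expand:
        return direct
    return direct | {n for c in direct for n in concept_relations.get(c, [])}

cs = highlighted(["apply hashing to data storage"])  # a hypothetical CS profile
se = highlighted(["design a software process"])      # a hypothetical SE profile
print("shared:", sorted(cs & se))
print("cs only:", sorted(cs - se))
```

Comparing two programs is then a matter of set intersection (the overlap) and set difference (what distinguishes each).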
Figure 6. Ontology-related learning outcomes
Alternatively, a person might look at the curriculum recommendations and the sections of the ontology that are associated with each of those. Hopefully, the curriculum recommendations will also be based on expected outcomes. Now, the user can see the characteristics of those types of programs that have documented recommended curricula. This would allow a student or employer to see the overlaps and the differences between programs in computer science and software engineering, for example.

4.5. The Ontology in Digital Library Reference

Finally, a complete reference ontology for all of computing provides an ideal way to categorize the entries in a digital library. When one categorization scheme is used to design curriculum and to classify resources, the connections between the two are immediate and obvious. Work is currently under way to import the computing ontology into the Computing and Information Technology Interactive Digital Education Library (CITIDEL 2) as the classification system for all entries. CITIDEL is implemented in DSpace and is part of the NSF’s National STEM Digital Library (NSDL 3).
5. Conclusion

The Computing Ontology project has a number of goals, and the diversity of intended uses complicates its development. However, a significant factor in its design and implementation has been the application to education. Implementations are at various stages of development, but a great deal of thought has been given to the use of a computing ontology for purposes ranging from indexing the content of a digital library to serving as a reference point for curriculum development. Effective visualization of the ontology is key to a number of its proposed uses, and that remains a challenge.
2 www.citidel.org
3 www.nsdl.org
Acknowledgements

Major contributors to this project have been Jim Cross, Gordon Davies, Reza Kamali, Eydie Lawson, Rich LeBlanc, Andrew McGettrick, Russ Shackelford, Bob Sloan, and Heikki Topi. Also contributing were Fred Mulder, Anneke Hacquebard, and Maarten van Veen. The design of the revised curriculum recommendation process comes from Larry Snyder. The work has been supported by the National Science Foundation, the Association for Computing Machinery, and the IEEE Computer Society.
References

[1] W. F. Atchison, S. D. Conte, et al., Curriculum 68: Recommendations for academic programs in computer science: a report of the ACM curriculum committee on computer science, Communications of the ACM 11(3) (1968), 151-197.
[2] R. H. Austing, B. H. Barnes, et al., Curriculum '78: recommendations for the undergraduate program in computer science — a report of the ACM curriculum committee on computer science, Communications of the ACM 22(3) (1978), 147-166.
[3] Computing Curricula 1991, Communications of the ACM 34(6) (1991), 68-84.
[4] Computing Curricula 2001, Journal of Educational Resources in Computing (JERIC) 1(3) (2001).
[5] Curriculum recommendations for graduate professional programs in information systems, Communications of the ACM 15(5) (1972), 363-398.
[6] G. B. Davis, J. T. Gorgone, et al., Model curriculum and guidelines for undergraduate degree programs in information systems, New York, 1997, 103.
[7] IS 2002: Curriculum Guidelines for Undergraduate Programs in Information Systems (2002), from http://www.acm.org/education/curricula.html
[8] G. Ford, The SEI undergraduate curriculum in software engineering, Technical Symposium on Computer Science Education, San Antonio, TX, ACM (1991).
[9] SE 2004: Curriculum Guidelines for Undergraduate Degree Programs in Software Engineering (2004), from http://www.acm.org/education/curricula.html
[10] CE 2004: Curriculum Guidelines for Undergraduate Degree Programs in Computer Engineering (2004), from http://www.acm.org/education/curricula.html
[11] B. M. Lunt, J. J. Ekstrom, et al., Curriculum Guidelines for Undergraduate Degree Programs in Information Technology, New York, ACM, 2008, 139.
[12] Curriculum 2005: The Overview Report, ACM Press, 2005.
[13] T. Gruber, Ontology (2008). Retrieved 30 March 2009, from http://tomgruber.org/writing/ontology-definition-2007.htm
[14] Y.-L. Chi, Ontology-based curriculum content sequencing system with semantic rules, Expert Systems with Applications: An International Journal 36(4) (2009), 10.
[15] A. W. P. Fok, Ontology-driven content search for personalized education, 13th Annual ACM International Conference on Multimedia, Singapore, ACM (2005), 1033-1034.
[16] A. Gupta, B. Ludäscher, et al., Ontology services for curriculum development in NSDL, 2nd ACM/IEEE-CS Joint Conference on Digital Libraries, Portland, Oregon, ACM, 2002, 219-220.
[17] J. I. Khan and M. S. Hardas, Observing knowledge clustering for educational resources using a course ontology, 4th International Conference on Knowledge Capture, Whistler, BC, Canada, ACM, 2007, 193-194.
[18] The 1998 ACM Computing Classification System, ACM, 1998.
[19] L. N. Cassel, G. Davies, and D. Kumar, The shape of an evolving discipline, in Informatics Curricula and Teaching Methods, 2003.
Part 1.2 Ontologies for Authoring Instructional Systems
This page intentionally left blank
Semantic Web Technologies for e-Learning
D. Dicheva et al. (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-062-9-59
CHAPTER 4
Inside a Theory-Aware Authoring System

Riichiro MIZOGUCHI a,1, Yusuke HAYASHI a and Jacqueline BOURDEAU b
a The Institute of Scientific and Industrial Research (ISIR), Osaka University, Japan
b LICEF Research Center, TÉLUQ-UQAM, Montreal, Canada
Abstract. At one time, computers that could understand learning and instructional theories seemed only a dream, yet recent advances in ontological engineering have enabled this dream to come true. We first envisioned such a goal in 2000, to be realized in 2010 [16]. Since then, we have tackled this problem and devised a theory-aware authoring system named SMARTIES, based on a comprehensive ontology of learning and instructional theories named OMNIBUS. This chapter discusses the philosophy behind the research as well as its technological details. Keywords. Ontology of learning and instructional theories, theory-aware authoring system, ontological engineering
Introduction

The authors have been building a theory-aware authoring system. Consider, by contrast, an expert system that simulates the performance of human experts in instructional design. Such a system would have a rule base for interpreting and emulating human experts’ behavior, and it would end up as a so-called “hard-wired” performance system that is not very intelligent, because it does not “know” the learning/instructional theories left implicit behind the rules. This suggests that conventional expert systems technology could not be utilized for realizing our goal because it does not help in making the computer understand such theories. The primary technological difficulty we had to overcome to achieve our goal was a declarative representation of learning and instructional theories that enables computers to interpret these theories. Because declarative representation has been one of the central techniques of AI, this challenge itself is not new. The application of ontological engineering [3][4] to this problem, however, is innovative and quite new, even in the Semantic Web era, in which most research on ontology applications is devoted to metadata creation. Another difficulty was how to make computers utilize the ontology for realizing their intelligent behavior in authoring. The authors thought it was task ontology that
1 Corresponding Author: Riichiro Mizoguchi, Department of Knowledge Systems, The Institute of Scientific and Industrial Research (ISIR), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan; E-mail: [email protected].
R. Mizoguchi et al. / Inside a Theory-Aware Authoring System
would help us implement human experts’ behavior more elegantly than rule-based systems. The idea of task ontology was proposed by one of the authors in 1995 [15] and has been used world-wide since then. It is a powerful methodology for formulating procedural knowledge declaratively. In fact, in the first several years we tackled the problem of building a theory-aware authoring system with task ontology by building an ontology of authoring tasks. Although ontology building itself went smoothly and we developed a task ontology-driven authoring system [7], we learned that it was not easy for us to utilize “ontologized” theories to improve the behavior generated by the task ontology. This difficulty arose because we did not know how to make computers interpret theories procedurally. We could code a procedural program for a theory; in fact, such systems do exist [20]. These systems, however, are so-called hard-wired systems, and hence, cannot be called “intelligent” because they neither know what they are doing nor why. This difficulty, as is already known, comes from procedural representation of how theories apply to authoring problems. Furthermore, such systems tend to follow only one theory, limiting their utility. To achieve our goal, we needed to enable computers to flexibly interpret multiple theories in order to generate intelligent behavior in helping authors develop theory-compliant instructional/learning scenarios. After a few years of struggle, we finally thought of applying functional decomposition to overcome this difficulty. Functional decomposition was devised by our group in the industrial engineering domain to represent the functional structure of any artifact; it has now been deployed in a few manufacturing companies in Japan [14]. By employing a functional decomposition methodology together with the above-mentioned ontology, we succeeded in developing a multiple theory-aware authoring system.
To date, we have published several papers about OMNIBUS/SMARTIES [8, 9, 10, 11, 12, 17]; readers interested in the details of the research may refer to the literature. This research is still at a preliminary stage, ahead of a substantial evaluation, and its practical benefits have not yet been fully realized. However, we believe that the current results of this study address high-level technical challenges from the viewpoint of the current state of the art. This chapter concentrates on ontological engineering issues of learning and instructional theories and is structured as follows: in order to make this chapter self-contained, the next section summarizes the OMNIBUS/SMARTIES project. Further, we give an overview of our ontology, named OMNIBUS, and explain how declarative representation of learning/instructional theories is realized, thanks to four underlying policies. The third section discusses how prescriptive aspects of theories can be captured in OMNIBUS. The key issue is the above-mentioned functional decomposition methodology. In order to demonstrate the feasibility of theory awareness of authoring tools using ontological engineering and the effectiveness of the OMNIBUS ontology, a theory-aware authoring system named SMARTIES has been implemented. Section 4 summarizes the characteristics of SMARTIES, and section 5 discusses the descriptive aspects of theories in OMNIBUS, i.e., the quality of their organization, followed by concluding remarks.
1. Summary of the OMNIBUS/SMARTIES Project The OMNIBUS ontology has been developed to build a theory-aware authoring tool named SMARTIES that complies with standards technology. By a theory-aware tool,
we mean one that can explain and utilize multiple instructional and learning strategies specified in theories to help authors build theory-compliant instructional/learning scenarios. By a standards-compliant tool, we mean a tool that can generate IMS-LD-compliant learning scenarios that run on IMS-LD-compliant players such as the Learning Design player 2 developed in the Reload project. SMARTIES can currently generate waterfall-type instructional/learning sequences with learning objects (LOs) associated with each action. This is equivalent to the type of scenario at level A of the IMS-LD specification. Furthermore, SMARTIES-generated scenarios correspond to the sequence of leaves of an I_L event decomposition tree shown in Fig. 9. In other words, they consist of a number of I_L events, each of which is composed of instructional and learning actions discussed in 2.3. The length of a scenario is roughly that of a lecture. As discussed below, OMNIBUS consists of descriptive and prescriptive ontologies. While it currently covers many theories belonging to the behaviorism, cognitivism, and constructivism paradigms in the descriptive ontology, it covers only nine instructional theories and two instructional models, selected according to our preference. OMNIBUS can help authors blend several theories in a scenario. SMARTIES suggests applicable instructional strategies at each decision point to help authors select the most relevant one according to their knowledge or preference. Selection is totally author-dependent, and hence, blending multiple theories is the authors’ decision. Although SMARTIES has been built as a feasibility study of such an ambitious goal, it has been opened for trial use at http://edont.qee.jp/omnibus/doku.php. A lot of work remains before it can be used by practitioners. The OMNIBUS ontology is available at the above website, from which documents may also be downloaded.
Demo slides available at http://www.ei.sanken.osaka-u.ac.jp/pub/miz/AIED07WS_Invited.pps explain how SMARTIES works.
2. How to build an ontology of learning and instructional theories

2.1. Overview

There are many hurdles to overcome while building an ontology of learning and instructional theories:
(1) What an ontology of a theory is.
(2) How to reconcile theories in different paradigms whose theorists fight each other.
(3) How to capture a variety of educational actions.
(4) How to decide which concepts should be included in the ontology.
(5) How to guide the building process.
Apparently, these have been a real challenge from the beginning. Fortunately, however, we have considerable experience in building ontologies and know-how with Hozo [13], a powerful ontology building tool, as well as a sophisticated upper ontology named YATO [19]. YATO provides us with useful guidelines for tackling such a challenging task.
2 http://www.reload.ac.uk/new/ldplayer.html
Learning and instructional theories have descriptive and prescriptive aspects, both of which must be captured by our ontology. Representation of the former is straightforward; we can use declarative representation, a typical method used in AI and in ontologies. One of the most difficult issues is the latter, i.e., how to represent the prescriptive content of theories without using a procedural representation scheme. This issue is discussed later with a solution using the function decomposition method. Contrary to what many people understand, an ontology is not a taxonomy of the target thing. If it were so, building an ontology of learning and instructional theories would have been far easier than it really was. An ontology consists of a system of necessary and sufficient concepts that characterize the target things as well as a taxonomy of them. Therefore, in our case, the ontology should comprise concepts explaining what happens in learning and instructional practice in the real world, because learning and instructional theories are developed to explain them. The ontology should cover theories in different paradigms, such as behaviorism, cognitivism, constructivism, and socio-constructivism. The difficulty here is that theories in each paradigm dramatically differ from each other, in spite of the fact that they use the same term “learning.” We need to devise a certain common conceptual platform on which we can represent these theories. The very top level of the OMNIBUS ontology consists of common world, learning world, instructional world, instructional design (systems) world, world of cognition, and theory & model, as shown in Fig. 1. This is a problem-dependent ontology configuration in the sense that it is not compliant with an ordinary ontology, whose top-level categories are compliant with one of the existing upper ontologies such as Dolce [5], BFO [1], etc.
Rather than building such an ontology, we decided to build a more domain-expert-friendly ontology by lifting these domain-specific categories up to the top level. The decision has shortcomings (e.g., attribute of learning is not organized under attribute defined in Common world.) Upper ontology compliance is realized in Common world, which reflects the real world with
Figure 1. Top-level categories.
Figure 2. Upper categories of OMNIBUS.
constraints in learning and instructional aspects. Figure 2 shows the next lower categories. Common world consists of concepts existing in the real world articulated from learning/instructional points of view. It is compliant with YATO, an upper ontology compatible with Dolce and BFO. States, actions, and events are the three major classes in OMNIBUS and will be discussed in detail in 2.2. World of cognition consists of a knowledge model and its attributes. Theory and model contains a taxonomy of theories and models in which each theory and model is described in terms of attributes and organized in an is-a hierarchy. Learning world is composed of learning, learning entity, and attribute of learning. Because learning goal is defined as a role played by state, it is invisible at this level of description. Instructional world is composed of instruction, instructional goal (purpose), instructional attribute, I_L scenario, and instructional entity. In both learning and instructional worlds, neither theories nor actions/events are defined. They are defined in theory and model.

Underlying policy 1: We build the OMNIBUS ontology based on the role theory [18] using Hozo [13], which is the only tool that can deal with roles properly.

The essence of the role theory is seen in the example of the front wheel role of a bike. The front wheel role is defined in the context of a bike. A wheel can play such a role and thereby becomes a front wheel. A wheel can play other roles, such as rear or driving wheel roles, depending on the situation (context). This is represented in Hozo as shown in Fig. 3. We exploit the theory of roles to clearly capture such roles, role players, and the role-playing things (role holders) in learning and instructional problems.

Figure 3. Hozo legend.
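The wheel example can be rendered as a minimal role model in plain code: a context-neutral player (a wheel) becomes a role holder by playing a role (front wheel) in a context (a bike). This is only our paraphrase of the idea for illustration; Hozo's actual representation of roles is considerably richer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Wheel:
    """A context-neutral entity that can play roles."""
    serial: str

@dataclass(frozen=True)
class RoleHolder:
    """A player while it plays a role in a context (the 'role holder')."""
    player: Wheel
    role: str
    context: str

def play(player: Wheel, role: str, context: str) -> RoleHolder:
    return RoleHolder(player, role, context)

w = Wheel("W-42")                        # just a wheel, no role yet
front = play(w, "front wheel", "bike")   # the same wheel as a front wheel
rear = play(w, "rear wheel", "bike")     # ...or as a rear wheel, context permitting
print(front.role, rear.role)
```

The same `Wheel` instance appears in both role holders, which is the point: the role belongs to the context, not to the player.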
2.2. States, Actions and Events

Figure 4 shows the ontology of actions and states. Following Policy 1 described above, we clearly separate actions and events by dealing with actions as context-neutral entities and events as context-specific entities in which actions are required to play a specific role, because an action can be used in many different events, independently of whether it is a learning or instructional event. An action performed in an instructional event is called an instructional action, and the same action performed in a learning event is called a learning action. In other words, an action is defined as a lean entity and has only actor, operand, and resulting state. On the other hand, an event is defined as a rich entity and has goal and context to reflect the meaningful context around it by referring to actions. Events are discussed in detail in 2.3. States are critical in OMNIBUS for the following two reasons:
(1) They provide a common and fundamental conceptual platform on which all existing theories can be captured in a consistent manner.
(2) They enable us to represent all phenomena in a uniform manner to allow application systems, including authoring systems, ITSs, etc., to work smoothly.
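The lean-action/rich-event distinction of 2.2 can be sketched as two data structures: an action carries only an actor, an operand, and a resulting state, while an event wraps an action with a goal and a context. The field names below are our paraphrase of the text, not OMNIBUS's exact slot names.

```python
from dataclasses import dataclass

@dataclass
class Action:
    """Lean, context-neutral: only actor, operand, and resulting state."""
    actor: str
    operand: str
    resulting_state: str

@dataclass
class Event:
    """Rich, context-specific: wraps an action with a goal and a context."""
    action: Action
    goal: str
    context: str

explain = Action("teacher", "the concept of hashing", "has been presented")
# The same action acquires its meaning only inside an event:
i_event = Event(explain, goal="learner has recognized hashing", context="lecture")
print(i_event.goal)
```

Because `Action` has no goal or context of its own, the same instance could equally be referenced from an instructional event or a learning event, mirroring the text's point about context neutrality.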
As discussed above, one of the greatest difficulties is how to capture the variety of theories in a computer-understandable manner. Although it is not perfect, a state-based approach to the modeling of theories would be a solution. The authors are aware of possible disagreement on this by theorists, who are far more sensitive to differences among theories, and hence, it is very hard for them to accept their theories being modeled in terms of states that capture the phenomena only to a limited extent. We have to agree that our modeling of theories is imperfect. However, we would claim that we need to introduce engineering approximation in order to achieve our goal. In the engineering community, we have learned that we need to compromise to realize usable products, which must be improved step by step to reach the ultimate goal. We can summarize this as follows:

Underlying policy 2: We introduce the idea of engineering approximation to capture instructional and learning theories.

The next issue is how to collect necessary and sufficient states to capture all the possible phenomena of interest. It was not hard for us to collect initial candidate states because we have been involved in AI in education for many years. It was also easy to organize them into an ontologically sound is-a hierarchy because we could follow YATO. The same applies to actions. Distinguishing between actions and events might not
Figure 4. Is-a hierarchy of actions and states.
R. Mizoguchi et al. / Inside a Theory-Aware Authoring System
65
have been an easy job, but the above policy resolved it in advance. The states and actions were refined when we defined theories that often required unexpected states and actions. In principle, it is impossible to prove the completeness of the collection of states. All that we can say is that they are necessary and sufficient for representing the theories we currently address. States are classified into agent state and object state. By object state, we mean the state of the focused object, such as whether the object has been used or not. Agent state is divided into internal state and external state. The former is further divided into cognitive process state, attitudinal state, procession state, and developmental state. External state is divided into communicative state and physical state. Cognitive process
Figure 5. Is-a hierarchy of states.
66
R. Mizoguchi et al. / Inside a Theory-Aware Authoring System
state for working memory has, for example, have assessed oneself as a meta-cognitive process state and have recognized, have recalled, etc. as cognitive process states. Details of the states are shown in Fig. 5. External state is necessary for understanding learners' observable behaviors, in order to identify what has been done in the course of learning/instruction. How these states contribute to modeling theories is discussed in Section 3.

2.3. Definition of Events

Another critical concept in OMNIBUS is event, which mainly means educational event, composed of learning event, instructional event, and I_L event, as shown in Fig. 6. The first two are rather straightforward, because they naturally and independently capture what a learning event and an instructional event are. In reality, however, instructional and learning events happen in tight conjunction: when a teacher teaches, learners are taught; if John talks to Mary, then she listens to him. This "duality" of actions must be captured appropriately. To do so, we introduced I_L event, composed, roughly speaking, of a pair of instructional and learning events as shown in Fig. 7, and divided into three kinds: simple I_L event, reciprocal I_L event, and influential I_L event. Reciprocal event represents tell&listen-like events. Influential I_L event is the major kind and is divided into progressing event, preparing learning condition event, and developmental event.

Figure 6. Is-a hierarchy of events.

I_L event is the key to the conceptual platform on which OMNIBUS is based. As discussed in Section 3, it allows us to capture the complexity of phenomena in learning/instruction. As shown in Fig. 7, it is composed of one instructional event and one or more learning events. Influential I_L event is defined as a subclass of I_L event by adding an instructional action that influences the learning action. Preparing learning condition event is defined as a subclass
of influential I_L event and is composed of one or more effect of learning events and preparing learning condition events, in which the former prepares for the latter.

Figure 7. Definition of I_L event.

2.4. Theory and Model

Although theory and model are the central concepts for capturing learning and instructional theories in OMNIBUS, defining them and organizing them in an is-a hierarchy is less hard than the other tasks. Figure 8 shows the top-level categories of theory and model. Note that there is no class representing Learning paradigm under Theory and model, for a technical reason of ontology building. As shown in Fig. 2, Theory and model is separated from the three worlds (Learning world, Instructional world, etc.). This might look strange, because one might think Learning theory should be placed under Learning world. But if we did so, each of the three kinds of theories would have to be located separately in a different world, so that they could not share the same super class Theory and model, which is not good. Readers might also think that, had Learning theory been classified under Learning world, we could have appropriately merged the Learning paradigm and Learning theory hierarchies. Unfortunately, that is not the case; the reason, similarly to the above, is a bit technical. At first glance, it seems we could make classes such as Cognitivism learning theory, Constructivism learning theory, etc., instead of Cognitivism, which is subsumed by Learning paradigm, as is done in the current ontology. But if we did so, there would be no way to make a class Learning paradigm that subsumes these theories, since Learning paradigm cannot be a kind of Learning theory.

Figure 8. Theory and model.
Figure 9. I_L event decomposition tree.
This consideration explains the current ontology’s design.
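The composition of an I_L event, one instructional event paired with one or more learning events, might be modeled as below. This is a simplified reading of Fig. 7; the Python names are illustrative, not Hozo definitions:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class InstructionalEvent:
    action: str           # e.g. "tell"

@dataclass
class LearningEvent:
    action: str           # e.g. "listen"
    attained_state: str   # e.g. "have heard"

@dataclass
class ILEvent:
    """One instructional event plus one or more learning events."""
    instructional: InstructionalEvent
    learning: List[LearningEvent]

    def __post_init__(self):
        # The duality requires at least one learning counterpart.
        if not self.learning:
            raise ValueError("an I_L event needs at least one learning event")

# The tell/listen duality from the text, as a single I_L event:
duality = ILEvent(InstructionalEvent("tell"),
                  [LearningEvent("listen", "have heard")])
```

The validity check in `__post_init__` encodes the "one instructional event and one or more learning events" cardinality stated in the chapter.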
3. How to capture prescriptive aspects of theories

Capturing prescriptive aspects is the heart of OMNIBUS. All that we have discussed thus far is preparation for capturing the prescriptive aspects of learning and instructional theories as I_L event decomposition.

Underlying policy 3: Representation of procedural knowledge should separate what to achieve from how to achieve it.

This policy, which we proposed in our research on functional ontology, has been deployed in industry [14]. An explanatory example of this policy is the function to weld. In mechanical engineering, domain experts had long believed that to weld is a function. However, we found that it is better interpreted as a composite of to join and a way of fusion; i.e., to weld can be decomposed into to join as the what-to-do and the fusion way as the how-to-do. We claim that only the what-to-do portion should be identified as a function, and that the how-to-do portion should be detached from the function and called a way. By doing so, functions become generic, because the way portion carries most of the domain-specific properties. To join can be achieved by other ways, such as the bolt & nut way, the glue way, etc. The benefit of this decomposition is that it is not specific to functions but applicable to all actions. To walk is decomposed into to move and a walking way; this decomposition suggests other ways to move, such as a crawling way, a running way, etc. As these examples suggest, there are two kinds of actions: those that imply how to do (achieve) something and those that do not. A typical example of the latter is to kill, which is way-neutral in the sense that it has no implication of how to kill. To educate is another example and can be realized by a behaviorism way, a cognitivism way, a constructivism way, etc. This example of to educate has tremendous implications.
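The what-to-do/how-to-do split can be made concrete with a toy lookup of alternative ways. The entries paraphrase the chapter's welding and walking examples, while the representation itself is purely our sketch:

```python
# A way-neutral action ("what to do") maps to the alternative ways
# ("how to do") that can achieve it: an OR choice among ways.
WAYS = {
    "to join": ["fusion way (welding)", "bolt & nut way", "glue way"],
    "to move": ["walking way", "crawling way", "running way"],
}

def achieve(what: str) -> list:
    """Return the alternative ways to achieve a generic what-to-do."""
    return WAYS.get(what, [])

# 'to weld' is not primitive: it is 'to join' achieved by the fusion way.
to_weld = ("to join", "fusion way (welding)")
assert to_weld[1] in achieve(to_weld[0])
```

Keeping the what-to-do separate makes it generic: listing `achieve("to join")` immediately suggests the bolt & nut and glue alternatives that the composite concept "to weld" hides.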
Underlying policy 4: We interpret all instructional theories as sharing the same what-to-do, namely to help learners learn. They differ only in the how-to-do.

A way should be structured by two major components: what it is for and what particular actions to perform. Once a way is chosen, subactions are automatically specified according to it. Thus, a what-to-do is decomposed into several sub what-to-do's, and each sub what-to-do is further decomposed through other ways. This decomposition can be repeated until primitive actions/states are reached. Note that there are two kinds of decomposition: (1) decomposition of an action into what to do and how to do, which extracts the generic concept embedded in the original action concept and conceptualizes the way to achieve it; and (2) decomposition of the generic what-to-do into a few sub what-to-do's using a particular way. The latter decomposition becomes possible thanks to the first one and, as we will see below, is the precious fruit of ontological engineering.

We interpret a what-to-do as an I_L event. As described above, an I_L event is composed of an instructional action, a learning action, and the learner's state obtained by the learning action. The decomposition thus ends up as I_L event decomposition. Repeating this decomposition, we obtain a tree that we call the I_L event decomposition tree. An I_L event has two interpretations: an action-based interpretation and a state-based interpretation. The former corresponds to the what-to-do, which has no implication of how to do it, and the latter corresponds to the resulting state after performing the action.

Figure 10. Definition of instructional strategy as I_L event decomposition.

Figure 9 shows an example of I_L event decomposition. We call the decomposed event a macro I_L event and the resulting events micro I_L events. The root node in Fig. 9 reads as "an instructor wants to introduce a content expecting the learner to recognize it, and then he/she is in the state of recognizing it." In order to achieve that state, there are two ways. WAY1 is based on Gagne and
Briggs's theory [6], which first presents what to learn and then gives guidelines. The other, WAY2, is based on Collins' theory [2]; it gives only demonstrations and no explanation. In this case, the macro I_L event is not decomposed but concretized. These ways can be thought of as having the same goal, achieved in different manners. Such a relation between ways is described by an OR relation, as between WAY1 and WAY2.

Figure 10 shows the Hozo implementation of WAY1, though details are omitted. It is implemented as a relation between a macro I_L event and two micro I_L events. Its name is Presentation, and it has five slots: a macro, two micros, theory for reference, and style. The first three slots have subslots, and all the slots are defined by referring to concepts defined elsewhere. Each of the first three slots represents an I_L event: the macro slot represents a Preparing learning condition event and the two micro slots represent Guiding events. It thus expresses that the Preparing learning condition event is decomposed into the two Guiding events. As seen in Fig. 9, the Preparing learning condition event, as an I_L event, says that the instructor makes the learner recognize (the content), and that the learner recognizes it and attains the have recognized state in preparation for the succeeding learning action. When WAY1 is applied, the micro I_L events supposed to occur are: (1) the learner is informed of the learning item, and (2) the learner is informed of how to learn it.

Let us summarize how we captured the prescriptive characteristics of learning and instructional theories. Ways are defined as relations between a macro I_L event and multiple micro I_L events, and are used to decompose each macro I_L event repeatedly to form an I_L event decomposition tree. The sequence of decomposition operations corresponds to the design process of an instructional/learning scenario.
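A way, as a relation between one macro I_L event and its micro I_L events, together with the OR relation among alternative ways such as WAY1 and WAY2, can be sketched as a small AND/OR tree. This is illustrative Python; the names only loosely follow the Hozo description:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Way:
    """Decomposes one macro I_L event into micro I_L events (AND),
    with a reference to the theory it comes from."""
    name: str
    theory: str
    micros: List["ILNode"]

@dataclass
class ILNode:
    """A node of the I_L event decomposition tree; the applicable
    ways form an OR choice among alternatives."""
    label: str
    ways: List[Way] = field(default_factory=list)

    def leaves(self, chosen: Dict[str, Way]) -> List[str]:
        """Leaf sequence (the scenario) under a choice of way per node."""
        way = chosen.get(self.label)
        if way is None:
            return [self.label]          # primitive: a scenario unit
        return [leaf for m in way.micros for leaf in m.leaves(chosen)]

root = ILNode("introduce content / recognize it")
way1 = Way("Presentation", "Gagne & Briggs",
           [ILNode("inform learning item"), ILNode("inform how to learn")])
way2 = Way("Demonstration", "Collins", [ILNode("give demonstrations only")])
root.ways = [way1, way2]                 # OR relation between WAY1 and WAY2

scenario = root.leaves({"introduce content / recognize it": way1})
# scenario == ["inform learning item", "inform how to learn"]
```

Choosing `way2` instead yields the single-leaf scenario of demonstrations only, mirroring how the OR branch selects among theories.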
Figure 11. Is-a hierarchy of strategies.

An I_L event is defined by referring to instructional and learning events, which are defined by
referring to actions, and these actions are defined by referring to states. Ways are thus defined on the basis of layered concepts that depend on one another. Although the rationale behind the organization of the ways is also an interesting topic, it is omitted here (see [11] for details).

In summary, we built an ontology of theories and models that covers both their descriptive and prescriptive aspects. For the former, we described the characteristics of theories and models in a straightforward manner, using features derived from paradigmatic differences. For the latter, we introduced the technique of functional decomposition to represent the instructional/learning strategies found in theories and models. As a result, each theory/model has been modeled as a set of ways, each of which is composed of a macro I_L event and a few micro I_L events. In other words, as far as their prescriptive aspects are concerned, theories and models are modeled as sets of strategies of I_L event decomposition. This approach is rather innovative, because instructional and learning theories and models are modeled in terms of their strategies in a computer-understandable manner. This result is the major benefit of using ontological engineering as a content technology. The current version of OMNIBUS contains 99 ways obtained by analyzing nine theories and two models, organized as shown in Fig. 11.

Table 1. Theory & model vs. states.

Details are discussed in
[10], [12]. The names of the theories and models we dealt with are shown in Table 1, together with the number of ways extracted from each theory/model. The first, second, and seventh rows show, for each category of theory/model, the number of theories/models analyzed, the number of ways, and the number of I_L events. Rows three to six list the theories/models considered and the number of ways extracted from each. The last six rows give the percentages of states used for each category of ways; this is discussed in Section 5.
4. SMARTIES: A Theory-Aware Authoring System

SMARTIES has been implemented to demonstrate the feasibility of a theory-aware authoring tool built with ontological engineering, and the effectiveness of the OMNIBUS ontology. Figure 12 shows its system architecture, which exploits I_L event decomposition as prescriptive knowledge. The basic idea of SMARTIES is based on what is shown in Fig. 9: an I_L event decomposition tree is a kind of AND/OR goal tree whose leaf nodes correspond to a sequence of fine-grained I_L events, each of which embodies an executable scenario unit. SMARTIES proposes possible ways of decomposing a given I_L event into a sequence of finer-grained I_L events by consulting the OMNIBUS ontology for ways organized as shown in Fig. 11. Thus, it can help authors develop theory-compliant instructional/learning scenarios. SMARTIES has the following remarkable features:
(1) It is aware of instructional and learning theories and can help authors author theory-compliant scenarios using them.
(2) It is a next-generation expert system: intelligent, but with no rule base.
(3) It can blend multiple theories if requested by authors.
(4) It can explain theories in terms of the concepts defined in OMNIBUS.
(5) It can generate instructional/learning scenarios with justifications supported by the theories that OMNIBUS covers.
(6) It can generate IMS-LD-compliant code for the resulting scenario, which runs on IMS-LD tools such as the Reload LD player. All functions, except scenario editing, are available on a Reload LD player.
Figure 12. SMARTIES: system architecture.
Figure 13. An example scenario.
In spite of its sophisticated functionality, SMARTIES' implementation is rather simple. Although we spent a lot of time building the OMNIBUS ontology, SMARTIES itself was built rather quickly, since its main operation mechanism is a simple loop composed of (a) reading the ontology, (b) getting a necessary piece of it, and (c) applying that piece. Hence, SMARTIES can be called a next-generation expert system, based on an ontology rather than on rules. In SMARTIES, a scenario is described as the sequence of leaf nodes of the I_L event decomposition tree, and the rest of the tree accounts for the design intention behind the scenario. SMARTIES is not an automatic authoring system but an interactive tool: it proposes possible alternative ways to decompose any I_L event (i.e., to expand the partial tree), from which authors are asked to choose. When it proposes alternatives, it can show all the ways applicable to the I_L event of interest from the various theories it knows. SMARTIES can therefore come up with ways from multiple theories and propose blending two or more theories in one scenario. Besides its academic significance, SMARTIES bridges the gap between academia and practice by generating IMS-LD-compliant code for the designed scenario. It is a thoroughly ontology-aware and ontology-based system, so its code needs no adaptation when the ontology is updated, as long as the change is not major. Figure 13
shows how a scenario generated by SMARTIES looks. The scenario is automatically converted to HTML files that can run on a Reload LD player. When a user runs it on a Reload LD player, interactive learning events occur with appropriate LOs (learning objects), as shown in the figure.
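The simple loop described above, (a) read the ontology, (b) get a necessary piece, (c) apply it, can be caricatured in a few lines. This is purely illustrative: the real SMARTIES consults Hozo-defined ways and interacts with a human author, whereas here both are stubbed out:

```python
def author_scenario(ontology, event, choose):
    """Expand an I_L event into a scenario (leaf sequence).

    ontology: maps an I_L event to the applicable ways, each way
              being a list of finer-grained I_L events (OR over ways,
              AND over each way's micro events).
    choose:   stands in for the author's choice among proposed ways.
    """
    applicable = ontology.get(event, [])          # (a)+(b) read & fetch
    if not applicable:
        return [event]                            # primitive scenario unit
    way = choose(event, applicable)               # author picks one way
    return [leaf for sub in way                   # (c) apply it, recurse
            for leaf in author_scenario(ontology, sub, choose)]

ontology = {"introduce content": [["present item", "give guidelines"],
                                  ["demonstrate"]]}
first = lambda event, ways: ways[0]   # a trivially automatic "author"
print(author_scenario(ontology, "introduce content", first))
```

Swapping `first` for an interactive callback is where the authoring dialogue would live; the ontology itself never changes shape.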
5. Discussion

The validity of OMNIBUS with respect to the prescriptive aspects of theories is partially supported by the successful implementation of SMARTIES. Another issue, however, is the descriptive aspects of theories: the quality of the organization of theories should also be evaluated to some extent. To meet this goal at least partially, we examined all 99 ways in terms of the states used by ways belonging to four different categories of theories. As discussed in Section 3, states are the most fundamental ingredient in defining ways, so revealing how the ontology of states contributes to organizing theories is of interest. As mentioned above, the first, second, and seventh rows in Table 1 show the number of theories/models analyzed, the number of ways, and the number of I_L events for each category of theory/model; rows three to six list the theories/models considered and the number of ways extracted from each; and the last six rows show the percentages of states used for each category of ways. For example, the second column gives, for all 30 ways of the Cognitivist category (which include 77 I_L events), the percentages of the states referred to; cognitive process states account for 61.9%. The bold figures denote the highest percentage in each row. It is very interesting that each highest percentage in a column coincides with the highest in its row. We interpret this as follows: the most dominant state in each category of ways represents its intrinsic property, in the sense that it differentiates that category from the others. For example, cognitive process state is used extensively in most categories of ways, but its usage in the Cognitivist category is both the highest within that category and higher than in any other category.
In summary, the major state used in each category of ways represents its characteristics very well: learning stage for cross-paradigm, cognitive process state for cognitivist, meta-cognitive process state for constructivist, and attitudinal state for instruction management. This result partially demonstrates the appropriateness of the ontology of states and of the definitions of ways in terms of I_L events.
6. Concluding Remarks

Ontology building is hard, but interesting and beneficial, and such a challenging enterprise is especially rewarding when it succeeds. The OMNIBUS ontology is unique in that it is not a light-weight ontology and that it covers the knowledge in learning and instructional theories in addition to the ontology proper. Hozo, the tool used to build OMNIBUS, supports the representation of "wholeness concepts," which correspond to ordinary things, and "relational concepts," which correspond to relationships; both are represented declaratively. The former are the main content of an ontology, whereas the latter are mostly auxiliary concepts used to modify the concepts defined in the main ontology. OMNIBUS, however, differs from such ordinary ontologies in that the latter concepts also play a crucial role: they represent the knowledge extracted from theories as strategies. These are called "ways" in OMNIBUS and constitute the core knowledge of how I_L events are decomposed
(concretized) into finer granularity. All the ways are defined by referring to the concepts defined as wholeness concepts. Although not fully discussed in this chapter, the achievements of the OMNIBUS project can be summarized as follows (see [12] for details):
(a) A prototype multiple-theory-aware authoring tool, named SMARTIES, has been developed.
(b) The gap between the theoretical and practical worlds has been bridged, thanks to the generation of IMS-LD-compliant scenarios.
(c) The feasibility of next-generation intelligent systems based on ontologies rather than rules has been demonstrated.
(d) An innovative manner of organizing instructional and learning theories in terms of strategies, as ways of I_L event decomposition, has been developed, partly for a better understanding of OMNIBUS.
(e) The exhaustive declarative representation of all the concepts necessary for discussing instructional and learning theories provides a common engineering platform for analyzing theories and models from a new perspective.
We have scarcely discussed the evaluation of the OMNIBUS ontology in this chapter. In general, evaluating an ontology is a non-trivial task, especially for a large one. Frankly speaking, it is not realistic to expect domain experts to read and evaluate OMNIBUS directly, because doing so requires knowledge of ontology engineering. SMARTIES has been developed partly to help people understand the content of OMNIBUS: it can explain all the scenarios and theories in English. As explained already, SMARTIES is now open for trial so that users can see what it is and how it works, as presented in 5.2. A visit to the demo slides of SMARTIES is also highly recommended. We understand the importance of community acceptance of an ontology and hope that part of SMARTIES' functionality contributes to it. Furthermore, topic (d) of our future work is intended to facilitate community acceptance of OMNIBUS.
The first results of the research on (d) are discussed in [11]. As discussed at the end of 5.4, we plan to investigate in depth the structure of the organization of ways as strategies.
References
[1] Basic Formal Ontology (BFO), http://www.ifomis.org/bfo/home.
[2] A. Collins, J.S. Brown, S.E. Newman, Cognitive apprenticeship: Teaching the crafts of reading, writing & mathematics, in L.B. Resnick (Ed.), Knowing, learning, & instruction: Essays in honor of Robert Glaser, Lawrence Erlbaum Associates, Hillsdale, NJ, 1989, 453-494.
[3] D. Dicheva, O4E: Ontologies for Education, http://compsci.wssu.edu/iis/nsdl/.
[4] V. Devedzic, Semantic Web & Education, Springer-Verlag, 2006.
[5] DOLCE: a Descriptive Ontology for Linguistic and Cognitive Engineering, http://www.loa-cnr.it/DOLCE.html.
[6] R.M. Gagne, L.J. Briggs, Principles of Instructional Design (2nd Ed.), Holt, Rinehart & Winston, New York, 1979.
[7] Y. Hayashi, M. Ikeda, R. Mizoguchi, A Design Environment to Articulate Design Intention of Learning Contents, International Journal of Continuing Engineering Education & Life Long Learning 14(3) (2004), 276-296.
[8] Y. Hayashi, J. Bourdeau, R. Mizoguchi, Ontological Support for a Theory-Eclectic Approach to Instructional and Learning Design, Proc. of the First European Conference on Technology Enhanced Learning (EC-TEL2006), 2006, 155-169.
[9] Y. Hayashi, J. Bourdeau, R. Mizoguchi, Ontological Modeling Approach to Blending Theories for Instructional and Learning Design, Proc. of the 14th International Conference on Computers in Education (ICCE2006), 2006, 37-44.
[10] Y. Hayashi, J. Bourdeau, R. Mizoguchi, Structurization of Learning/Instructional Design Knowledge for Theory-aware Authoring Systems, Proc. of the 9th International Conference on Intelligent Tutoring Systems (ITS'08), 2008, 573-582.
[11] Y. Hayashi, J. Bourdeau, R. Mizoguchi, Structuring Learning/Instructional Strategies through State-based Modeling, Proc. of the 14th International Conference on Artificial Intelligence in Education (AIED2009), 2009, 215-222.
[12] Y. Hayashi, J. Bourdeau, R. Mizoguchi, Using Ontological Engineering to Organize Learning/Instructional Theories and Build a Theory-Aware Authoring System, International Journal of Artificial Intelligence in Education, 2009, 138-147.
[13] Hozo: ontology editor, http://www.hozo.jp/.
[14] Y. Kitamura, Y. Koji, R. Mizoguchi, An Ontological Model of Device Function: Industrial Deployment and Lessons Learned, Journal of Applied Ontology, Special issue on "Formal Ontology Meets Industry" 1(3-4) (2006), 237-262.
[15] R. Mizoguchi, J. Vanwelkenhuysen, M. Ikeda, Task Ontology for Reuse of Problem Solving Knowledge, Knowledge Building & Knowledge Sharing (KB&KS'95), 1995, 46-59.
[16] R. Mizoguchi, J. Bourdeau, Using Ontological Engineering to Overcome Common AI-ED Problems, International Journal of Artificial Intelligence in Education 11(2) (2000), 107-121.
[17] R. Mizoguchi, Y. Hayashi, J. Bourdeau, Inside Theory-Aware & Standards-Compliant Authoring System, Proc. of SWEL'07: Ontologies & Semantic Web Services for Intelligent Distributed Educational Systems, 2007, 1-18.
[18] R. Mizoguchi, E. Sunagawa, K. Kozaki, Y. Kitamura, A Model of Roles within an Ontology Development Tool: Hozo, Journal of Applied Ontology 2(2) (2007), 159-179.
[19] R. Mizoguchi, Yet Another Top-level Ontology: YATO, Proc. of the Second Interdisciplinary Ontology Meeting, 2009, 91-101. (http://www.ei.sanken.osaka-u.ac.jp/pub/miz/YATO%20revised.pdf)
[20] T. Murray, S. Blessing, S. Ainsworth, Authoring Tools for Advanced Technology Learning Environments: Toward Cost-Effective Adaptive, Interactive & Intelligent Educational Software, Springer, 2003.
Semantic Web Technologies for e-Learning
D. Dicheva et al. (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-062-9-77
CHAPTER 5
Using Ontologies to Author Constraint-Based Intelligent Tutoring Systems

Pramuditha SURAWEERA a,b, Antonija MITROVIC a,1, Brent MARTIN a, Jay HOLLAND a, Nancy MILIK a, Konstantin ZAKHAROV a and Nicholas McGUIGAN c
a Intelligent Computer Tutoring Group, University of Canterbury, New Zealand
b Monash University, Melbourne, Australia
c Lincoln University, Lincoln, New Zealand
Abstract. In this chapter we focus on the role of ontologies in developing constraint-based tutors, a special class of Intelligent Tutoring Systems (ITSs). Domain models for ITSs are extremely difficult to develop, and therefore efforts devoted to automatic induction of the necessary knowledge are of critical importance for widening the real-world impact of ITSs. We conducted an initial study which showed that ontologies were useful for the manual composition of domain models for constraint-based tutors, as they allow authors to reflect on their understanding of the domain and organize the domain model better. Starting from these encouraging results, we developed ASPIRE, an authoring system for constraint-based tutors, which automates many of the tasks in domain-model generation and serves the produced ITSs. The domain ontology plays a central role in the authoring procedure deployed in ASPIRE. We present one of the ITSs produced in ASPIRE, as well as the experiences of authors in using ASPIRE.

Keywords. Intelligent Tutoring Systems, constraint-based tutors, authoring support, ontologies, domain models
Introduction

Intelligent Tutoring Systems (ITSs) have shown much promise to revolutionize education, due to their high effectiveness in supporting learning (e.g. [1, 2, 3, 4]). Mainstream adoption of ITSs has nevertheless remained an elusive goal, due to the challenges of developing them. The development of an ITS requires much time and effort, most of which is consumed in encoding the domain knowledge in the chosen student-modeling representation [5].

1 Corresponding Author: Antonija Mitrovic, Department of Computer Science and Software Engineering, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand; E-mail: [email protected].

Our approach to building ITSs is based on Ohlsson's theory of learning from performance errors [6], which resulted in the methodology known as Constraint-Based
Modeling (CBM) [2, 7, 8]. Ohlsson proposed that knowledge be represented in the form of constraints, which specify what ought to be so; domain knowledge (i.e. constraints) is thus used to prescribe abstract features of correct solutions. Constraints support evaluation and judgment, not inference, and are used to represent both domain and student knowledge. Although representing domain knowledge in the form of constraints makes the authoring process easier [8, 9], building a knowledge base still remains a major challenge: developing a constraint-based tutor, like any other ITS, is a labor-intensive process that requires expertise in both CBM and programming. We believe that domain ontologies can play a significant role in the domain-model authoring process. We conducted an initial study at the University of Canterbury which confirmed the usefulness of ontologies for the manual development of domain models, which we report in Section 1. Based on this result, we developed an authoring system named ASPIRE that assists in composing domain models for constraint-based tutors and automatically serves the resulting tutoring systems on the web. ASPIRE guides the author through building the domain model, automating some of the tasks involved, and seamlessly deploys the resulting domain model to produce a fully functional web-based ITS. The author is expected to model an ontology of the domain and provide sample problems and their solutions; from these, ASPIRE automatically generates a domain model that can be deployed to instantiate a web-based tutoring system. We present ASPIRE in Section 2. Finally, Section 3 presents an ITS for capital investments developed using ASPIRE, together with the author's experiences in using ASPIRE, focusing on the process of modeling an ontology.
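In CBM, a constraint is commonly structured as a relevance condition (when the constraint applies) and a satisfaction condition (what ought to be so when it does). The chapter gives no concrete syntax, so the following is only an indicative sketch in Python, with a toy constraint from the adjective domain used later in the study:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Constraint:
    """If the relevance condition holds for a solution, the
    satisfaction condition ought to hold as well."""
    message: str
    relevant: Callable[[dict], bool]
    satisfied: Callable[[dict], bool]

def diagnose(solution: dict, constraints) -> list:
    """Return the messages of all violated constraints (empty = correct)."""
    return [c.message for c in constraints
            if c.relevant(solution) and not c.satisfied(solution)]

# Hypothetical constraint; the rule text and solution fields are ours.
c1 = Constraint(
    "One-syllable adjectives form the comparative with '-er'.",
    relevant=lambda s: s["form"] == "comparative" and s["syllables"] == 1,
    satisfied=lambda s: s["answer"].endswith("er"),
)
print(diagnose({"form": "comparative", "syllables": 1, "answer": "more wise"}, [c1]))
```

Note how this scheme supports "evaluation and judgment, not inference": a missing constraint simply means a wrong solution slips through as correct, which is exactly the incompleteness risk discussed below.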
1. Initial Study: the Use of Ontologies for Authoring Domain Models Manually

It is widely accepted that the quality of the knowledge base is the determining factor for the quality of diagnosis and instruction in ITSs. The knowledge base of a constraint-based tutor consists of a collection of constraints that formalize the syntactic and semantic rules of the domain. A knowledge base is incomplete if it is missing one or more constraints that account for significant problem states; incomplete knowledge bases result in student solutions being falsely diagnosed as correct. Therefore, careful engineering and iterative testing have to be performed in order to avoid producing incomplete constraint bases. We believe it is highly beneficial for the author to develop a domain ontology prior to composing constraints, as this helps the author reflect on the domain. Such an activity enhances the author's understanding of the domain and is helpful when identifying constraints. We also believe that categorizing constraints according to the ontology assists the authoring process: the author is only required to focus on the constraints related to a single concept at a time, which reduces their working-memory load. The use of ontologies may thus help authors produce more complete constraint bases.

1.1. Are Ontologies Useful for Developing Constraints?

We hypothesized that the task of composing an ontology and organizing the constraints according to its concepts would assist in manually composing constraints. To evaluate
our hypothesis, we conducted an empirical study [10] in which the participants were assigned the task of composing a domain model using a tool (called the domain model composition tool) centered on a domain ontology. The author starts by creating a domain ontology, and then develops constraints according to the concepts of the ontology. One of our goals was to minimize the training required for novices in ontology modeling; to achieve it, the environment had to be easy and intuitive to use. Consequently, the ontology workspace was designed so that ontologies are composed in a manner analogous to using a drawing program. The domain model composition tool was designed to encourage the use of a domain ontology as a means of visualizing the domain and organizing the knowledge base. As shown in Figure 1, the tool represents concepts using rectangles and generalization/specialization relationships using arrows. As there are no restrictions on placing concepts within the workspace, the author can position concepts to resemble a hierarchical structure. The author can define properties of concepts and relationships (in addition to generalization/specialization) between concepts. The bottom panel displays the properties (under the Details tab), relationships (the Relationships tab) and constraints (the Constraints tab) related to the currently selected concept. The screenshot in Figure 1 shows an ontology2 developed for the comparison-of-adjectives task.
Figure 1. Interface of the Domain Model Composition Tool
2 The ontology shown in Figure 1 was developed by one of the study participants.
The root of the hierarchy is the Adjectives concept, for which it is necessary to define both syntax and semantics aspects. The Ideal Scenario concept deals with regular comparisons, such as turning “pretty” into “prettier” and “prettiest”. The other concepts on the same level of the ontology deal with adjectives containing more than one syllable and with irregular comparisons. The other aspect is the correct use of a particular form of an adjective within a sentence. The bottom panel in Figure 1 lists the three constraints developed for the currently selected concept (Ending with “y”), and shows all components of the first constraint.

The domain model composition tool also contains two text editors for composing syntax and semantic constraints, which the author can use to specify the constraints needed for the ITS. The text area of the editors is divided based on the concepts of the ontology, clearly showing the set of constraints that correspond to each ontology concept. The editors also assist constraint composition by providing syntax highlighting, automatically coloring keywords. The text editors and the ontology view are automatically synchronized: constraints added in the text editor are automatically displayed in the Constraints tab of the ontology view's bottom panel, and vice versa.

We conducted a study with 18 students enrolled in the graduate course on Intelligent Tutoring Systems at the University of Canterbury in 2003. They were assigned the task of building a domain model for teaching adjectives in the English language. This tutor would present sentences to the student, to be completed by providing the correct form of a given adjective, such as “My sister is much ________ than me (wise)”, with the correct answer “wiser”.
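For illustration, the regular comparison rules targeted by such a tutor can be sketched in code. This sketch is ours, not part of the study materials, and assumes the standard consonant+“y”, silent-“e” and consonant-doubling rules for short adjectives:

```python
# Illustrative model of regular English comparative formation, as taught by
# the adjectives tutor described above. This is NOT the study's constraint
# language; it is a simplified sketch of the domain rules themselves.

VOWELS = set("aeiou")

def comparative(adj: str) -> str:
    """Form the regular comparative of a short adjective."""
    if adj.endswith("y") and len(adj) > 1 and adj[-2] not in VOWELS:
        return adj[:-1] + "ier"              # pretty -> prettier
    if adj.endswith("e"):
        return adj + "r"                      # wise -> wiser
    if (len(adj) >= 3 and adj[-1] not in VOWELS
            and adj[-2] in VOWELS and adj[-3] not in VOWELS):
        return adj + adj[-1] + "er"           # big -> bigger (doubling)
    return adj + "er"                         # tall -> taller

print(comparative("pretty"))   # prettier
print(comparative("wise"))     # wiser
```

Each branch of this function corresponds to one group of adjectives, which is exactly the kind of grouping the participants were expected to capture as separate ontology concepts.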
The participants used the domain model composition tool to develop the constraint set, and were provided with facilities for loading completed domain models into WETAS, a tutor authoring shell [11], to instantiate a web-based ITS. The participants were allocated a total period of three weeks to complete the task. They had attended 13 lectures on ITSs, including five on CBM, prior to being assigned the task. They had also attended a 50-minute presentation on the constraint language supported by WETAS, and were given a description of the task, instructions on how to write constraints, and the section on adjectives from the Clutterbuck English text book [12] for vocabulary. They were free to explore the domain model of LBITS [11], a tutor that teaches simple vocabulary skills.

1.2. Results of the Initial Study

Seventeen students completed the task satisfactorily. One student lost his entire work due to a technical problem; that student’s data was not included in the analysis. The same problem did not affect the other students, as it was eliminated before they experienced it. Table 1 gives statistics about the remaining students, including interaction times, numbers of constraints developed, and marks for constraint sets and ontologies.

The participants took an average of 37 hours to complete the task, spending 12% of the time in the ontology view, with a minimum of 1.2 and a maximum of 7.2 hours. This spread can be attributed to different styles of developing the ontology. Some students may have developed the ontology on paper before using the system, whereas others developed the entire ontology online. Furthermore, some students also used the
ontology view to add constraints 3, which increased the total time spent in this view. However, the logs showed that this was not a popular option, with most students composing constraints in the constraint editors. One factor that may have contributed to this behavior is the restrictive nature of the Constraints tab of the ontology view, which displays only a single constraint at a time.

In constraint-based tutors, constraints are classified as semantic (checking whether the solution answers the question) or syntactic (checking whether the solution obeys the syntax rules of the domain). In the domain of adjectives this distinction is sometimes subtle. For example, in order to determine whether the submitted adjective form is correct, the student’s solution would be compared to the ideal solution. This involves checking whether the correct rule for determining the ending has been applied (semantics) as well as whether the resulting word is spelled correctly (syntax). Such constraints might be classified into either category, as is evident in the numbers of constraints for each category reported in Table 1. The averages of the two categories are similar (9 semantic and 11 syntax constraints), but some participants classified most of their constraints as semantic while others did the opposite. Note that this classification does not affect the tutor’s behavior: all constraints are used to diagnose students’ solutions. The participants composed a total of 20 constraints on average.

Table 1. Some statistics about the produced domain models
3 Participants were able to add constraints in two ways: either directly in the ontology view, as shown in Figure 1, or via the syntax/semantic editor.
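In CBM, a constraint pairs a relevance condition with a satisfaction condition [2]: a solution violates the constraint when the relevance condition holds but the satisfaction condition does not. The following sketch illustrates this diagnosis scheme for the adjectives domain; the conditions and data layout are hypothetical illustrations of ours, not WETAS constraints:

```python
# Minimal sketch of constraint-based diagnosis: each constraint is a pair of
# (relevance condition, satisfaction condition). The example constraint below
# is a hypothetical illustration for the "Ending with y" concept.

def c_y_ending_relevant(ideal, student):
    # Relevant when the problem's adjective ends in consonant + "y"
    adj = ideal["adjective"]
    return adj.endswith("y") and adj[-2] not in "aeiou"

def c_y_ending_satisfied(ideal, student):
    # Satisfied when the student replaced the final "y" with "ier"
    return student["answer"] == ideal["adjective"][:-1] + "ier"

constraints = [(c_y_ending_relevant, c_y_ending_satisfied)]

def diagnose(ideal, student):
    """Return the indices of constraints that are relevant but violated."""
    return [i for i, (rel, sat) in enumerate(constraints)
            if rel(ideal, student) and not sat(ideal, student)]

# Student answers "prettyer" instead of "prettier": constraint 0 is violated.
print(diagnose({"adjective": "pretty"}, {"answer": "prettyer"}))   # [0]
```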
We assessed the participants’ domain models by comparing them to one produced by an expert, the resulting marks being given under Completeness (the last two columns in Table 1). The expert’s knowledge base consisted of 20 constraints. The Constraints column gives the number of constraints from the expert’s set that are accounted for in the participants’ constraint sets. Note that the mapping between the ideal and the participants’ constraints is not necessarily 1:1 (i.e. the participants might use multiple constraints to represent a concept that the expert encoded in a single constraint, and vice versa). Two participants accounted for all 20 constraints; on average, the participants covered 15 constraints. The quality of the constraints was high.

The ontologies produced were given a mark out of five (the Ontology column in Table 1). All students scored highly, which was expected as the ontology was simple. Almost every participant specified a separate concept for each group of adjectives according to the given rules [12]. However, some students constructed a flat ontology, which contained only the six groupings corresponding to the rules. Five students scored full marks for the ontology by including the degree (comparative or superlative) and syntax considerations such as spelling.

Fourteen participants categorized their constraints according to the concepts of the ontology, while the others chose not to. For the former group, there was a significant correlation between the ontology score and the constraints score (0.679, p<0.01). However, there was no significant correlation between the ontology and constraints scores when all participants were considered. This strongly suggests that the participants who made use of the ontology developed better constraint bases. An alternative explanation for this finding may be that more able students both produced better ontologies and produced a complete set of constraints.
To test this hypothesis, we determined the correlation between each participant’s final grade for the course (which included other assignments) and the ontology/constraint scores. There was indeed a strong correlation (0.840, p<0.01) between the grade and the constraints score. However, there was no significant correlation between the grade and the ontology score. This lack of a relationship might be due to a number of factors. Since the task of building ontologies was novel for the participants, they may have found it interesting and performed well regardless of their ability. Another factor is that the participants had had more practice at writing constraints (in another assignment for the same course) than at building ontologies. Finally, the simplicity of the domain could also be a contributing factor.

The participants spent 2 hours per constraint on average (sd = 1 hour). This is twice the time reported in [13], but the participants were neither knowledge engineers nor domain experts, so the difference is understandable. In particular, the students were new to the constraint language.

The participants reported that building an ontology made constraint identification easier. For example, the following comments were extracted from their reports: “Ontology helped me organize my thinking”, “The ontology made me easily define the basic structure of this tutor”, “The constraints were constructed based on the ontology design”, “Ontology was designed first so that it provides a guideline for the tasks ahead.”

The results indicate that ontologies do assist constraint acquisition: there is a strong correlation between the ontology score and the constraints score for the participants who organized the constraints according to the ontology. Subjective reports confirmed that the ontology was used as a starting point when writing constraints. As expected, more able students produced better constraints. In contrast, most participants
composed good ontologies, regardless of their ability, showing that ontology building is feasible for novice authors.
2. ASPIRE: An Ontology-Centric Authoring System

As the development of the domain model is the bottleneck of ITS development [5], our long-term research goal has been to investigate ways of supporting the authoring process by automating as many tasks as possible. The result of our work is ASPIRE 4 [14], an authoring and deployment environment that assists in the process of composing domain models for constraint-based tutors and automatically serves the resulting tutoring systems on the web.

Domain ontologies play a central role in the authoring process of ASPIRE. The author defines an ontology containing the concepts relevant to the instructional task. The ontology is used to define the structure of solutions expected by the intended tutoring system. The author is then required to provide sample problems and solutions based on this solution structure; consequently, the example solutions provided by the author are instantiations of ontological concepts. ASPIRE automatically generates syntax and semantic constraints by analyzing the ontology and the sample solutions. These constraints are categorized based on concepts of the ontology, grouping them into meaningful categories.

The authoring process in ASPIRE consists of eight phases. Initially, the author specifies general features of the chosen instructional domain, such as whether it consists of sub-domains focusing on specific areas, and whether or not the task is procedural. For procedural tasks, the author describes the problem-solving steps. The author then develops the domain ontology using ASPIRE’s ontology workspace. In the third phase the author defines the problem structure and the general structure of solutions, expressed in terms of concepts from the ontology. The author then adds sample problems and their correct solutions using the problem solution interface. The interface enforces the solutions to adhere to the structure defined in the previous step.
The author is encouraged to provide multiple solutions for each problem, demonstrating different ways of solving it. ASPIRE then generates syntax constraints by analyzing the ontology and the solution structure. The semantic constraint generator analyzes problems and their solutions to generate semantic constraints. Finally, the author can deploy the developed ITS. We provide details of these phases in the following subsections.

2.1. Specifying the Domain Characteristics

The first authoring phase requires the author to identify the problem-solving steps for the chosen instructional task. This is not a trivial task, as the author needs to decide on the approach to teaching the task. The decisions would depend on the author’s teaching approach and/or the target population of students. The author also needs to decide on how to structure the student interface: will the task be presented on the same page or on multiple pages?

As an example, consider the procedural task of adding fractions. The problem-solving procedure can be broken down into four steps. Initially, it is necessary to check whether the two fractions have the same denominator; if that is not the case, the lowest
4 ASPIRE is freely available at http://aspire.cosc.canterbury.ac.nz/
common denominator must be found. Step two involves modifying the two fractions to have the lowest common denominator (when needed). After that, the two fractions are added, which may result in an improper fraction. Finally, the result is to be simplified, if necessary. Note that a teacher may prefer a different set of steps; for example, steps three and four may be combined into one. In the rest of this section we assume the four-step procedure.

2.2. Modeling an Ontology of the Domain

In the second phase the author develops an ontology of the chosen instructional domain, which plays a central role in the authoring process. ASPIRE-Author provides an ontology workspace for visually modeling ontologies, as shown in Figure 2. The ontology specifies the hierarchical structure of the domain in terms of sub- and super-concepts. Each concept might have a number of properties, and may be related to many other domain concepts.
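A hypothetical sketch of how such an ontology might be represented is shown below: concepts form a generalization hierarchy and carry properties and (non-hierarchical) relationships. The class layout is our illustration, not ASPIRE's internal format; the concept and property names follow the fraction-addition ontology of Figure 2:

```python
# Illustrative representation of an ASPIRE-style domain ontology: concepts in
# a generalisation/specialisation hierarchy, each with properties and
# relationships. Not ASPIRE's actual data model.
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    parent: "Concept | None" = None                      # super-concept
    properties: dict = field(default_factory=dict)       # name -> type name
    relationships: dict = field(default_factory=dict)    # name -> [Concept]

    def all_properties(self) -> dict:
        """Local properties plus those inherited from super-concepts."""
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}

fraction = Concept("Fraction",
                   properties={"Numerator": "Integer", "Denominator": "Integer"})
reduced = Concept("Reduced Fraction", parent=fraction,
                  properties={"Whole number": "Integer"})

print(sorted(reduced.all_properties()))
# ['Denominator', 'Numerator', 'Whole number']
```

The inheritance of Numerator and Denominator by Reduced Fraction mirrors the example discussed next.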
Figure 2. ASPIRE’s Ontology workspace
A property is described by its name and the type of values it may hold. Properties can be of type ‘String’, ‘Integer’, ‘Float’, ‘Symbol’ or ‘Boolean’. For example, the currently selected concept (Reduced Fraction) in the ontology shown in Figure 2 has three properties: two of them (Numerator and Denominator) are inherited from the Fraction concept, while Whole number is a local property defined for this concept. New properties and relationships can be added using the interface shown in Figure 3, which allows the specification of a default value for ‘String’ and ‘Boolean’ properties. It also
allows the range of values for ‘Integers’ and ‘Floats’ to be specified in terms of a minimum and maximum. When creating a property of type ‘Symbol’ the list of valid values must be specified. Property values may be specified as unique, optional and/or multivalued. In the latter case the number of values that a property may hold is specified using the ‘at least’ and ‘at most’ fields of the property interface; the ‘at least’ field specifies the minimum number of values a property may hold while the ‘at most’ field specifies the maximum number of such values.
Figure 3. Adding a new property
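The property restrictions just described (type, value range, and ‘at least’/‘at most’ cardinality for multivalued properties) can be sketched as a small validator. The class below is our hypothetical illustration of these checks, not ASPIRE's property format:

```python
# Hypothetical sketch of a property specification with the restrictions
# described above: a value type, an optional min/max range, and cardinality
# bounds for multivalued properties.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PropertySpec:
    name: str
    type_: type                      # e.g. int for 'Integer'
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    at_least: int = 1                # minimum number of values
    at_most: int = 1                 # maximum number of values

    def validate(self, values: list) -> list[str]:
        errors = []
        if not (self.at_least <= len(values) <= self.at_most):
            errors.append(f"{self.name}: expected {self.at_least}..{self.at_most} values")
        for v in values:
            if not isinstance(v, self.type_):
                errors.append(f"{self.name}: {v!r} has wrong type")
            elif self.minimum is not None and v < self.minimum:
                errors.append(f"{self.name}: {v} below minimum {self.minimum}")
            elif self.maximum is not None and v > self.maximum:
                errors.append(f"{self.name}: {v} above maximum {self.maximum}")
        return errors

denominator = PropertySpec("Denominator", int, minimum=1)
print(denominator.validate([0]))   # ['Denominator: 0 below minimum 1']
print(denominator.validate([4]))   # []
```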
The ontology workspace visualizes only generalization/specialization relationships. Other types of relationships, such as part-of relationships between concepts, can be specified, but are not shown visually. To add a relationship, the author defines its name and then selects the other concept(s) involved. The resulting relationship holds between the concept initially selected in the graphical representation of the ontology and the concept(s) chosen in the relationship-composing interface. In some cases, a relationship may involve one of a set of concepts. For example, when specifying an assignment (a statement that assigns a value to a variable), the author may specify that the allowed concepts on the right-hand side are constants (e.g. "x = 1"), variables (e.g. "x = y"), functions (e.g. "x = max(a,b,c)") or arithmetic expressions (e.g. "x = y + 3"). The List box allows the author to specify such a case; the corresponding concepts can then be added to the container by selecting the appropriate concept from the drop-down list and clicking the + button. Figure 4 shows the assigned value relationship after the first related concept (the Number concept) has been added.
Figure 4. Adding a new relationship
We decided to design and implement an ontology editor specifically for ASPIRE after evaluating a variety of commercial and research ontology development tools. Although tools such as Protégé [15], OilEd [16] and SemanticWorks [17] are sophisticated and can represent complicated syntactic domain restrictions in the form of axioms, they are not intuitive to use: they were designed for knowledge engineering experts, and consequently novices in knowledge engineering
would struggle with their steep learning curve. One of the goals of this research is to enable domain experts with little knowledge engineering background to produce ITSs for their courses. To achieve this goal, the system should be easy and intuitive to use. Furthermore, the ontology editor should be tightly integrated with the other phases of the authoring process to ensure a seamless experience. We therefore decided to use an enhanced version of the ontology editor from the domain model composition tool (see Section 1 for details).

ASPIRE’s ontology editor achieves ease of use by restricting its expressive power: in contrast to Protégé, where axioms can be specified as logical expressions, the ASPIRE ontology workspace only requires a set of syntactic restrictions to be specified through its interface. We believe this is sufficient for the purpose of generating syntax constraints for most instructional domains. The ontology workspace does not offer a way of specifying restrictions between different properties of a given concept, such as requiring that the number of years of work experience be less than the person’s age. It also does not contain functionality to specify restrictions on properties from different concepts, such as requiring that a manager’s salary be higher than the salaries of the employees they are responsible for. Such arbitrary restrictions can be specified in Protégé using the Protégé Axiom Language (PAL). However, these limitations are not an obstacle to generating the constraint set, as ASPIRE generates constraints not only from the ontology, but also from sample problems and their solutions.

2.3. Defining the Structure of Problems and Solutions

The structure of problems and solutions is specified in the third phase of the authoring process. ASPIRE assumes that each problem contains a problem statement. A problem may also contain a collection of sub-components that add more information to the problem statement.
Each component can be either textual or graphical. The solution structure for a non-procedural task consists of a list of solution components, each described in terms of the ontology concept(s) it represents. The task of modelling the solution structure therefore involves decomposing a solution into components and identifying the type of elements (in terms of ontology concepts) each component may hold. The author specifies the label for each solution element, selects one or more concepts from the ontology, and specifies the number of elements it may hold. Additionally, a component can be marked as “free-text” if the student is allowed to type its content freely; free-text components are displayed in the student problem-solving interface as text areas.

For procedural tasks, each problem-solving step might require several parts, and therefore the author needs to specify the solution components for each step. Consequently, the solution structure for procedural domains consists of a collection of solution component lists, one for each problem-solving step. For example, in the first step of adding fractions, the student needs to specify the lowest common denominator, which is a single number. The author would therefore specify that the solution for this step is an instance of the LCD concept.

2.4. Modeling the Student Interface

Starting from the problem and solution structures, ASPIRE automatically generates an HTML interface, which is shown to the author in the fourth phase. The default interface shows the problem statement (with any additional components), and consists
of an input area for each component defined in the solution structure. The structure of the input area for each component is determined from the ontology, using the properties and relationships of the concept relevant to that solution component. For example, a component that accepts a Fraction (from the ontology in Figure 2) will have two input boxes: one for the numerator and the other for the denominator.

The default HTML interface expects the student to type in the components of the solution. However, in some domains this is not a realistic expectation. For example, in a Mechanics tutor the student may draw a force diagram, so textual input is not appropriate. In such cases the author can provide a domain-specific applet to support the student in performing one or more steps. Note that we do not expect the author to develop the applet, as such development requires programming expertise; the applet would need to be developed by software professionals.

2.5. Adding Problems and Solutions

In the fifth phase the author adds problems and their solutions. The interface for this is similar to the default student interface (i.e. the HTML interface generated by ASPIRE from the domain definition). The Problem Editor provides the author with the necessary interface widgets based on the problem structure, and the author populates them. While adding a problem, the author is required to specify several general problem features. The system automatically assigns a unique problem number to each new problem. The author may specify an optional name as a descriptor for the problem, and assigns a difficulty to each new problem (ranging from 1 for the simplest problems to 9 for the most complex). The author also has to populate the problem statement and the problem components, if defined in the problem structure. The author can add one or more solutions to each problem, depicting different ways of solving the same problem.
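As a worked example, the four problem-solving steps from Section 2.1 can be sketched as the kind of step-by-step sample solution an author would enter for a fraction-addition problem. The data layout below is our illustration, not ASPIRE's solution format:

```python
# Worked sketch of the fraction-addition procedure from Section 2.1,
# producing the solution components for each of the four steps.
from math import gcd

def solve(a, b, c, d):
    """Add a/b + c/d following the four problem-solving steps."""
    lcd = b * d // gcd(b, d)                                 # step 1: find LCD
    f1, f2 = (a * (lcd // b), lcd), (c * (lcd // d), lcd)    # step 2: convert
    total = (f1[0] + f2[0], lcd)                             # step 3: add
    g = gcd(total[0], total[1])                              # step 4: simplify
    return {"LCD": lcd, "converted": [f1, f2],
            "sum": total, "simplified": (total[0] // g, total[1] // g)}

# 1/4 + 1/6: LCD is 12, converted to 3/12 and 2/12, sum 5/12 (already reduced)
print(solve(1, 4, 1, 6))
```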
For procedural tasks, if there are multiple steps for solving a problem, the solution workspace allows the author to enter all the steps simultaneously, rather than navigating through them one at a time as students would. This eliminates the navigation effort between steps, making it quicker and easier for the author to add and inspect the full solution to a problem. Each step is displayed along with its name and the description that students would see, and steps are separated by borders to make a clear distinction between them. The author needs to specify the solution components for each problem-solving step.

In domains where there are multiple solutions per problem, the author should enter all practicable alternative solutions. The solution editor reduces the effort required to do this by allowing the author to transform a copy of the first solution into the desired alternative. This feature significantly reduces the author’s workload, because alternative solutions often have a high degree of similarity.

2.6. Constraint Generation

Syntax constraints are generated from the domain ontology at the author’s request. The ontology contains a lot of information about the syntax of the domain; the syntax constraint generation algorithm extracts all useful syntactic information from the ontology and translates it into constraints. Syntax constraints are generated by analyzing the relationships between concepts and the concept properties specified in the ontology [18]. For example, any restrictions specified on relationships, such as
minimum and maximum cardinalities, will be translated into constraints automatically. The same happens with data types and value ranges specified for properties. For procedural tasks, an additional set of constraints (called path constraints) is generated to ensure that the student performs the problem-solving steps in the correct order. For example, in the domain of fraction addition a constraint is generated that verifies that the lowest common denominator (LCD) part of the student’s solution is not empty; this constraint becomes relevant when the student is working on that step, and prevents them from moving on to the following step before satisfying it.

After problems and their solutions have been added, ASPIRE’s semantic constraints generator can be invoked to generate semantic constraints. These constraints check that the student’s solution has the desired meaning (i.e. it answers the question). Constraint-based tutors determine semantic correctness by comparing the student solution to a single correct solution to the problem; however, they are still capable of identifying alternative correct solutions, because constraints are encoded to check for equivalent ways of representing the same semantics [2, 8, 19]. Such constraints are generated by ASPIRE by analysing the similarities and differences between alternative correct solutions for the same problem supplied by the author. The multiple alternative solutions specified by the author enable the system to generate constraints that will accept alternative solutions, comparing the student’s solution to the stored (ideal) solution in a manner that is not sensitive to the particular approach the student used. A detailed discussion of the constraint-generation algorithms is available in [20, 21].

2.7. Deploying the Domain

Once the author has completed all the authoring steps, he/she may wish to see the tutoring system running.
This allows the author to interact with the final tutoring system, solving problems and receiving feedback in a manner similar to students. The task of starting a tutoring system (to run on ASPIRE-Tutor) is called deployment. At the author’s request, ASPIRE-Author performs a number of checks on the domain to verify that the information supplied by the author is consistent and that the domain model is complete. When all the domain checks are satisfied, the author can deploy the domain by simply clicking the Deploy Domain button, and can then try the tutoring system on ASPIRE-Tutor.
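The idea behind syntax-constraint generation described in Section 2.6 can be illustrated with a simplified sketch that turns one kind of ontology restriction (a property value range) into checkable constraints. The real algorithm [18] also handles relationship cardinalities and other restrictions; the ontology encoding below is hypothetical:

```python
# Simplified illustration of syntax-constraint generation: each value-range
# restriction recorded in the ontology is translated into a checkable
# constraint. The dictionary encoding of the ontology is hypothetical.

def generate_range_constraints(ontology):
    """ontology: {concept: {property: (min, max)}} -> list of (label, check)."""
    constraints = []
    for concept, props in ontology.items():
        for prop, (lo, hi) in props.items():
            def check(value, lo=lo, hi=hi):    # bind bounds at definition time
                return lo <= value <= hi
            constraints.append((f"{concept}.{prop} in [{lo}, {hi}]", check))
    return constraints

ontology = {"Fraction": {"Denominator": (1, 999)}}
cs = generate_range_constraints(ontology)
print(cs[0][0])        # Fraction.Denominator in [1, 999]
print(cs[0][1](0))     # False: a zero denominator violates the constraint
```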
3. Developing an ITS for Capital Investment Decision Making with ASPIRE

Capital investment decision making plays a crucial role in the financial evaluation of non-current assets within contemporary organizational practice. Teaching experience shows that capital investment evaluation techniques, namely the accounting rate of return, net present value and the internal rate of return, are problematic for students to master. Students find the principles of capital investment decision making difficult to comprehend, and lack the ability to translate them from theory to practice. It was envisaged that a Capital Investment Tutor (CIT) would enable students to apply theoretical financial decision making to ‘real-life’ simulated business environments. It was with this in mind that CIT was developed by Nicholas McGuigan (the last author of this paper) in consultation with the ASPIRE team.
Nicholas teaches capital investment decision making as part of his first-year “Accounting and Finance for Business” course. He was a novice in the area of Intelligent Tutoring Systems, with no previous experience in ITSs or knowledge engineering, and no programming experience. In the rest of the paper, we refer to Nicholas as the author of CIT.

CIT was developed over a period of approximately six months, from initial conception through to the finished system [22]. Note that the development of CIT overlapped with the development of ASPIRE itself; the ASPIRE documentation (i.e. the authoring manual) was therefore not available at the time. This necessitated meetings with the ASPIRE team in order to familiarize the author with ASPIRE’s functionality.

3.1. Capital Investment Tutor

The author designed the task the student needs to perform as a procedural one, consisting of seven steps. In the first step the student constructs a timeline of project costs from the information given in the problem statement; this step is shown on its own on the first page. In step two (shown on its own on a new page) the student identifies the relevant problem type, in terms of which variable needs to be calculated. Step 3 requires the student to select the formula corresponding to the chosen variable; they then enter the parameters for the formula in step 4. In steps 5 and 6 the student enters the known values into the selected formula and then specifies the computed value. Based on this computed value, the student makes the final decision regarding the capital investment in step 7. CIT contains only a single problem set, although ASPIRE allows multiple problem sets to be defined for a domain.

The domain ontology for CIT is illustrated in Figure 5, showing the important domain concepts and their relationships. The ontology contained a total of 30 concepts, each with one or two properties.
Due to the nature of the task, the ontology was modeled as a set of trees rather than a single tree. Each tree described the concepts relevant for one or more problem-solving steps. For example, the ontology contains a Cash Flow concept that is specialized into initial, operating and terminal cash flows; these concepts are only relevant for the first problem-solving step of constructing a timeline.

The solution structure for problems presented to students is given in Table 2. It outlines the list of components expected for each problem-solving step. Some steps, such as the first, required the student to enter multiple values, while steps 2 and 3 required selecting the correct answer from a list of choices.

Table 2. Solution structure for CIT

Problem-solving step             Solution components
1. Construct a timeline          Cash flows (initial, operating and terminal)
2. Identify problem type         Choice of type
3. Select the formula            Choice of formulae
4. Specify formula parameters    n, k
5. Complete the formula          All components of the Net Present Value (NPV) formula
6. Enter the NPV value           NPV value
7. Make the final decision       Decision
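The Net Present Value calculation that students complete in steps 5 and 6 can be sketched as follows. The cash flows and discount rate below are made-up illustrative numbers, not taken from an actual CIT problem:

```python
# Sketch of the NPV calculation underlying steps 5-7 of CIT:
# NPV = -initial outlay + sum of cash flows discounted at rate k per period.

def npv(initial_outlay, cash_flows, k):
    """cash_flows[t-1] is the net cash flow at the end of period t."""
    return -initial_outlay + sum(cf / (1 + k) ** t
                                 for t, cf in enumerate(cash_flows, start=1))

# Illustrative project: outlay 10,000; 4,000 per year for 3 years; k = 10%
value = npv(initial_outlay=10_000, cash_flows=[4_000, 4_000, 4_000], k=0.10)
print(round(value, 2))                           # -52.59
decision = "accept" if value > 0 else "reject"   # step 7: final decision
print(decision)                                  # reject
```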
Figure 5. The ontology for the Capital Investment Decision domain
ASPIRE was able to generate the default HTML student interface from the ontology and the solution structure. However, the author wanted the final student interface to use applets, rather than the text boxes supplied by the default interface, to make CIT more visually attractive. Note that the applets themselves were not developed by the author, as the expertise required for developing applets is far beyond the normal expectations of ASPIRE authors; the applets were developed by the ASPIRE team.

In phase 5 the author entered twelve problems and their solutions. In this domain there is only one correct solution per problem. Syntax and semantic constraints were generated, and the author modified the automatically generated feedback messages to be more helpful to students. Finally, the author deployed the domain, which resulted in the domain information being transferred to ASPIRE-Tutor. The author/teacher also defined a group consisting of the students who would have access to the system, and tailored the behavior of the system by specifying the feedback and problem selection options available to the students 5.

5 Please note that in this paper we only discuss the authoring side of ASPIRE. For information about the tutoring server side of ASPIRE, please see the ASPIRE manual [23].

Figure 6 shows the student interface with the applet for the first step in CIT. The top area of the page provides controls for selecting problems, obtaining help, and changing/leaving the ITS. The problem statement is shown together with a photo describing the situation. The problem-solving area for this step consists of an applet
that visualizes the timeline. The student needs to label the periods on this timeline and enter the amounts corresponding to the various types of cash flows. In the situation illustrated in Figure 6, the student has incorrectly labelled the initial time as period 1 on the timeline, entered an incorrect value for the initial cash flow, and failed to specify the rest of the timeline. The feedback shown in the right-hand panel corresponds to the All Errors level: the first hint discusses the operating cash flows that are missing, the second discusses the initial cash flow for which the student supplied the wrong value, while the last one discusses the terminal cash flow. The student can change the solution based on the feedback provided, and submit the solution again.

3.2. Experiences in developing CIT

The authoring process implemented in ASPIRE is relatively straightforward for an average computer user. The CIT author has an academic background in business and education and therefore limited knowledge of computer applications. Familiarity with computer programming or software applications is not necessary for using ASPIRE; however, belonging to a younger generation, the author has had exposure to computer technology. The author found that in constructing CIT the overall ASPIRE system was easy to use, with a clear mapping from the course learning material to the system's application. ASPIRE was able to meet the requirements for the design of the Capital Investment Decision Making tutor, including a visual image of the case study, a clear student interface and templates, and progressive feedback.
Figure 6. Student interface showing the first step of the procedure (the timeline)
The author initially experienced difficulties in modeling the ontology for CIT. This was partly due to an inadequate understanding of ontologies, as it can be difficult for
content experts to map out what they are teaching in a structured and detailed manner. In the case of CIT, it was difficult for the author to clearly ascertain what was required of him and what purpose the ontology served. It may prove beneficial to develop a partial ontology, follow the authoring process through in its entirety, and complete the remaining ontology requirements along the way. This would allow the author to obtain a holistic understanding of the process. The manual uploading of problems and suggested solutions was straightforward and required little development time, proving to be one of the advantages of using the ASPIRE system. The feedback provided to students proved to be one of the most useful elements of CIT, as it could be tailored to the student group based on the learning outcomes of the course and the task at hand. The manual modification of generated feedback messages was conducted towards the end of the system's development and was jointly achieved by the ASPIRE team and the author. Not having a computer programming background, the author found this stage relatively difficult, relying on face-to-face assistance to complete the task.

3.3. Evaluation of CIT

We conducted an evaluation study of CIT in a summer school course at Lincoln University in February 2008. The participants were 21 students enrolled in ACCT102 (Accounting and Finance for Business), an introductory business decision-making course. Prior to the study the students had attended lectures covering the relevant material. The course had two scheduled tutorial streams, and we randomly selected one of them to serve as the experimental group, while the other served as the control group. The length of the tutorials was 90 minutes. During this time the students took a short pre-test, interacted with the system (experimental) or solved the problems individually and then discussed them with a human tutor (control), and then took a post-test.
Both groups spent 45 minutes on problem-solving. The experimental group participants also filled in a user questionnaire, which solicited their impressions of CIT. There was no significant difference between the pre-test scores, indicating that the prior knowledge of the two groups of participants was comparable. Both groups improved on the post-test, with the control group having a significant (p<0.04), and the experimental group a marginally significant improvement (p=0.066). There was no significant difference between the learning gains for the two groups. We attribute the relatively low results on the post-test to the short session length. We also analyzed how students acquire constraints; the resulting learning curve is shown in Figure 7. The x-axis shows the opportunity to use a given constraint during the session, on any attempt (please note that an attempt is a partial solution submitted for a particular step of the task). The average number of attempts was 78 (sd = 31). The y-axis shows the error probability. The data points were averaged over all constraints and all students who interacted with CIT. The minimal number of attempts per student is 42, and therefore the graph shown in Figure 7 represents all students from the experimental group. The initial probability was computed for 649 constraints, and the cut-off point for the graph is attempt 15, when the number of constraints used was 2/3 of the initial number. The data exhibits an excellent fit to a power curve, thus showing that students do learn effectively in CIT. Furthermore, the learning rate (i.e. the exponent of the power curve equation) is very high, showing that students acquired the necessary knowledge quickly. The initial error probability of 0.26 dropped by more than 50% to 0.12 after only four attempts.
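The power-curve fit reported above can be reproduced with a standard least-squares regression on log-log data. The sketch below uses synthetic, noise-free data generated from the published equation y = 0.3508x^-0.7445 rather than the actual constraint logs, which are not available here:

```python
import math

def fit_power_curve(xs, ys):
    """Least-squares fit of y = a * x**b via linear regression in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the log-log regression line gives the exponent b;
    # the intercept gives log(a).
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
    a = math.exp(my - b * mx)
    return a, b

# Synthetic data following the curve reported in the paper.
occasions = list(range(1, 16))
errors = [0.3508 * x ** -0.7445 for x in occasions]
a, b = fit_power_curve(occasions, errors)
# On this noise-free data the fit recovers a ≈ 0.3508 and b ≈ -0.7445.
```

With real, noisy error-probability data the same procedure yields the R² value quoted in the paper as a goodness-of-fit measure.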
The questionnaire responses were also analyzed. For the questions we discuss below the students selected a response on a scale ranging from 1 (not at all) to 5 (very much). The students enjoyed interacting with the system (mean=3.5, sd=1.1), and believed that their understanding of the domain improved as a consequence of using CIT (mean=3.8, sd=1). Eleven of the fourteen experimental group participants indicated they would recommend CIT to other students. The students rated CIT's interface as fairly easy to use (mean=3.7, sd=1.1). Most of the students found the feedback useful (mean=3.3, sd=1.3), although a few pointed out that they would prefer a little more information. The results obtained from this initial evaluation of CIT are encouraging, and are similar to the results we have obtained in previous studies with (manually developed) constraint-based tutors. However, the amount of collected data was limited, due to the small class size. We plan to repeat the evaluation study in 2009, when more students are expected to take the same course. We will then extend the length of the study to a couple of sessions.

[Figure 7 plot: error probability (y-axis) versus learning occasion 1–15 (x-axis), with fitted power curve y = 0.3508x^-0.7445, R² = 0.8565]
Figure 7. Learning curves for CIT
4. Conclusion and Future Work

Ontologies are playing increasingly important roles in many facets of computer-based systems. In this paper we focused on the role domain ontologies can play in authoring intelligent tutoring systems. Such systems are very difficult to develop, due to the need for a detailed domain model. Within ICTG we developed a methodology for building constraint-based tutors, which are typically easier to build than some of the other prevalent types of ITSs. However, even with this advantage, constraint-based tutors require constraint sets that capture the basic domain principles. In order to develop a constraint set, the author needs not only domain expertise, but also familiarity with constraint-based modeling and the constraint language used.
We reported on a study which examined the role a domain ontology can play in the manual development of constraint sets. Our novice ITS authors were required to develop a domain ontology first, before developing constraints manually. The results showed that the authors who centered constraint generation around the ontology produced higher-quality domain models. We believe this effect comes from the ontology enabling the author to reflect on their understanding of the domain and to structure the constraint set better, which was also confirmed by the comments the authors made about the process. Starting from this finding, we designed and developed ASPIRE, an authoring and deployment system for constraint-based ITSs. The authoring process in ASPIRE is organized around the domain ontology. The author develops the domain ontology, and uses it to specify the solution structure. ASPIRE uses the ontology to induce syntax constraints automatically. The ontology also provides useful information for the induction of semantic constraints, although the main source of information for this phase is the set of problems and solutions specified by the author. Finally, ASPIRE uses the information provided in the ontology and the solution structure to generate the default student interface. In the paper, we presented CIT, one of the ITSs developed in ASPIRE by a novice author. The evaluation of CIT showed that it supported learning effectively. The author's experience shows that ontology development is still a difficult task, but one that is doable by a novice author with support from more experienced computer professionals. CIT was developed before full documentation and support were available for ASPIRE; we believe that even this difficult part is now better supported. Furthermore, ontology development in ASPIRE is a much easier task than manual development of constraints.
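To make the idea of inducing syntax constraints from an ontology concrete, here is a deliberately simplified sketch. The concept name and slot restrictions are illustrative only, and ASPIRE's actual constraint language is not reproduced here:

```python
# Toy ontology fragment: a concept with slot (property) restrictions.
# Names and restrictions are hypothetical, for illustration only.
ONTOLOGY = {
    "Cash Flow": {"slots": {"amount": {"type": float}, "period": {"type": int, "min": 0}}},
}

def check_syntax(concept, instance):
    """Return violation messages derived from the concept's slot restrictions."""
    errors = []
    for slot, rules in ONTOLOGY[concept]["slots"].items():
        value = instance.get(slot)
        if not isinstance(value, rules["type"]):
            errors.append(f"{slot} must be of type {rules['type'].__name__}")
        elif "min" in rules and value < rules["min"]:
            errors.append(f"{slot} must be at least {rules['min']}")
    return errors

ok = check_syntax("Cash Flow", {"amount": -5000.0, "period": 1})       # no violations
bad = check_syntax("Cash Flow", {"amount": "x", "period": -1})         # two violations
```

The point of the sketch is the direction of the derivation: each restriction stated once in the ontology yields a reusable syntax check, which is why the author never writes such constraints by hand.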
The important achievement is that novice authors can develop their ITSs with much less effort; they do not require a detailed understanding of the underlying student modeling approach, the constraint language or programming experience in order to develop a working ITS. We plan to keep improving ASPIRE and its functionality. One of our current goals is to evaluate ASPIRE with various profiles of authors. ASPIRE is freely available and we invite the research community to test it by applying it to various instructional domains.
Acknowledgements

The work on the role of domain ontologies in manually composing constraints was supported by the University of Canterbury Grant U6532. The ASPIRE project was supported by eCDF grants 502 and 592 from the Tertiary Education Commission of New Zealand. We gratefully acknowledge the support of all members of ICTG.
References

[1] K.R. Koedinger, J.R. Anderson, W.H. Hadley, M.A. Mark, Intelligent Tutoring Goes to School in the Big City. Artificial Intelligence in Education 8 (1997), 30-43.
[2] A. Mitrovic, S. Ohlsson, Evaluation of a Constraint-based Tutor for a Database Language. Artificial Intelligence in Education 10 (1999), 238-256.
[3] K. VanLehn, et al. The Andes Physics Tutoring System: Lessons Learned. Artificial Intelligence in Education 15 (2005), 147-204.
[4] A. Mitrovic, B. Martin, P. Suraweera, Intelligent Tutors for All: Constraint-based Modeling Methodology, Systems and Authoring. IEEE Intelligent Systems 22 (2007), 38-45.
[5] T. Murray, Expanding the Knowledge Acquisition Bottleneck for Intelligent Tutoring Systems, Artificial Intelligence in Education 8 (1997), 222-232.
[6] S. Ohlsson, Learning from performance errors. Psychological Review 103 (1996), 241-262.
[7] S. Ohlsson, Constraint-based Student Modelling, in Student Modelling: the Key to Individualized Knowledge-based Instruction, 1994, 167-189.
[8] S. Ohlsson, A. Mitrovic, Fidelity and Efficiency of Knowledge Representations for Intelligent Tutoring Systems. Technology, Instruction, Cognition and Learning 5 (2007), 101-132.
[9] A. Mitrovic, K.R. Koedinger, B. Martin, A Comparative Analysis of Cognitive Tutoring and Constraint-based Modeling. In: Brusilovsky, P., Corbett, A., and de Rosis, F. (Eds.) Proc. User Modelling, 2003, 313-322.
[10] P. Suraweera, A. Mitrovic, B. Martin, The role of domain ontology in knowledge acquisition for ITSs. In: Lester, J., Vicari, R.M. and Paraguacu, F. (Eds.) Proc. Intelligent Tutoring Systems, 2004, 207-216.
[11] B. Martin, A. Mitrovic, Authoring Web-Based Tutoring Systems with WETAS. In: Kinshuk, Lewis, R., Akahori, K., Kemp, R., Okamoto, T., Henderson, L. and Lee, C.-H. (Eds.), Proc. ICCE, 2002, 183-187.
[12] P.M. Clutterbuck, The art of teaching spelling: a ready reference and classroom active resource for Australian primary schools. Longman Australia Pty Ltd, Melbourne, 1990.
[13] A. Mitrovic, Experiences in Implementing Constraint-Based Modelling in SQL-Tutor. In: Goettl, B.P., Halff, H.M., Redfield, C.L. and Shute, V.J. (Eds.), Proc. Intelligent Tutoring Systems, 1998, 414-423.
[14] A. Mitrovic, P. Suraweera, B. Martin, K. Zakharov, N. Milik, J. Holland, Authoring constraint-based tutors in ASPIRE. In: Ikeda, M., Ashley, K. and Chan, T.-W. (Eds.), Proc. Intelligent Tutoring Systems, 2006, 41-50.
[15] The Protégé Ontology Editor and Knowledge Acquisition System, http://protege.stanford.edu, 2006.
[16] A. Bechhofer, I. Horrocks, C. Goble, R. Stevens, OilEd: a Reason-able Ontology Editor for the Semantic Web, in 14th Int. Workshop on Description Logics, 2001, 396-408.
[17] Altova – XML, Data Management, UML, and Web Services Tools (2005), http://www.altova.com.
[18] P. Suraweera, A. Mitrovic, B. Martin, The use of ontologies in ITS domain knowledge authoring. Workshop on Applications of Semantic Web for E-learning SWEL'04, 2004, 41-49.
[19] P. Suraweera, A. Mitrovic, An Intelligent Tutoring System for Entity Relationship Modelling. Artificial Intelligence in Education 14 (2004), 375-417.
[20] P. Suraweera, A. Mitrovic, B. Martin, A Knowledge Acquisition System for Constraint-based Intelligent Tutoring Systems. In: Looi, C.-K., McCalla, G., Bredeweg, B., Breuker, J. (Eds.) Proc. Artificial Intelligence in Education, 2005, 638-645.
[21] P. Suraweera, A. Mitrovic, B. Martin, Constraint Authoring System: An Empirical Evaluation. In: Luckin, R., Koedinger, K. and Greer, J. (Eds.), Proc. Artificial Intelligence in Education, 2007, 451-458.
[22] A. Mitrovic, N. McGuigan, B. Martin, P. Suraweera, N. Milik, J. Holland, Authoring Constraint-based Tutors in ASPIRE: a Case Study of a Capital Investment Tutor. ED-MEDIA, 2008, 4607-4616.
[23] A. Mitrovic, B. Martin, P. Suraweera, N. Milik, J. Holland, K. Zakharov, ASPIRE User Manual. http://aspire.cosc.canterbury.ac.nz, 2008.
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-96
CHAPTER 6
An Ontology-Based Test Generation System

Larisa N. SOLDATOVA a,1 and Riichiro MIZOGUCHI b
a Department of Computer Science, Aberystwyth University, UK
b ISIR, Osaka University, Japan
Abstract. This chapter proposes an ontology for the development of tests, the architecture of a test generation module (as part of a learning management system), and a test generation system (as a separate information system). As an example of the proposed ontology engineering approach to the development of test generation systems we describe an implemented semi-automatic system capable of designing and generating tests, assisting a user in defining input test parameters (for example a test goal or a scoring schema), explaining the test generation process, and producing test specifications. The system has been tested in two universities: Aberystwyth (UK) and Vladivostok (Russia).

Keywords. Ontology, test generation, learning management system
Introduction

Tests have played an important role in social life since ancient times. There is evidence of tests being used five thousand years ago in Babylon to check the knowledge of scribes about arithmetic, materials, plants, and writing ability [9]. In ancient Egypt only people who passed difficult exams were allowed to become priests. Pythagoras is believed to have passed tests himself and to have thoroughly checked the mental ability of his students through the use of difficult mathematical problems [4]. In the third century B.C. in China there were exams for government officials, with the emperor as chief examiner [10]. Francis Galton was the first to consider objective assessment methods for mental abilities [10]. "An Introduction to the Theory of Mental and Social Measurements" by Edward L. Thorndike, a leading educational psychologist from Columbia University, is considered to be the first textbook on test theory; it was published in 1904. We argue that the measurement of academic performance by tests should be as scientific and professional as possible. However, careful preparation of a test and analysis of the test results is costly in time and requires knowledge from pedagogy, psychology, sociology, philosophy, mathematics, and computer science. There are now many theories of and approaches to test construction [1, 4, 16], and classical test theory (CTT) has recently been significantly extended by item response theory,
1 Corresponding author. Email: [email protected]
L.N. Soldatova and R. Mizoguchi / An Ontology-Based Test Generation System
97
generalizability theory, adaptive testing, and methods for criterion-referenced tests [5, 7, 13, 14, 16]. In the CTT (Classical Test Theory) model, which is applicable to fixed-length tests, the observed score x_ij of a learner i on measure j is

    x_ij = τ_i + ε_ij,

where τ_i is the true (unknown) score of the learner and ε_ij is the error of the observation. A measurement is any procedure, such as a test of knowledge or an expert rating, that measures characteristics by means of questions. The purpose of test construction is to minimize the error and to evaluate a learner's knowledge from the observed score. The quality of a test can be estimated by statistical characteristics such as reliability, the precision with which x_ij is measured, and validity [1]. CTT remains the basis for standard achievement, attitude and aptitude tests. Despite this, in many ways test development is still "an art". There is no agreed vocabulary, standard, or technology for test composition. "As a general rule, the process by which psychological constructs have been translated into a specific set of test items has remained private, informal, and largely undocumented" [8]. The process of professional test construction is complicated and consists of many steps [8]:
- identification of the primary purpose for which a test score will be used;
- preparation of a test specification;
- construction of an initial pool of test items;
- item review;
- preliminary item tryouts;
- field-testing of the items on a large sample representative of the examinee population;
- determination of the statistical properties of item scores and improvement of the test;
- proof of reliability and validity of the final test;
- development of guidelines for the administration, scoring, and interpretation of the test scores.
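As an illustration of the CTT model x_ij = τ_i + ε_ij, the following sketch simulates repeated observed scores for a single learner; all numbers (true score, error spread) are made up for the example:

```python
import random

def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Simulate CTT observed scores x_ij = tau_i + eps_ij for one learner,
    with normally distributed, zero-mean observation error."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_administrations)]

# Hypothetical learner with true score 70 and error sd 5.
scores = simulate_observed_scores(true_score=70.0, error_sd=5.0, n_administrations=1000)
mean_observed = sum(scores) / len(scores)
# Because the error has zero mean, the average observed score
# converges to the true score as the number of administrations grows.
```

This is exactly why reliability is defined over repeated administrations: a single x_ij mixes the true score with error, but the error averages out.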
Generally, educational tests are developed by teachers, who are not specialists in test technology, and the result is that these tests often do not meet quality requirements. Even tests developed by professionals may not meet the criteria. There is therefore an increasing need for effective test technology in order to:
1. Make test theory knowledge available to non-specialists in test technology, teachers and developers of LMS (learning management systems).
2. Reduce the time and financial expense of test construction and analysis of test results.
3. Provide intelligent assistance to test developers.
4. Design intelligent test generation systems incorporated into learning environments and compatible with other standard modules of LMS.
There are many information systems that facilitate test composition and management, for example, Test Central (http://www.test.com/), WebCT
(www.WebCT.com), Blackboard (http://www.blackboard.com/), Sigma (http://www.gosigma.com/), TPC TestMaster™ (http://www.tpctraining.com/), Easy Test Maker (http://www.easytestmaker.com), and Test Pro (http://www.atrixware.com). Such systems usually have an interactive user interface, online and database support with high-level security, and interconnections between several subsystems. These systems allow the creation, editing, and storing of tests; the proofreading, searching, sorting, and re-arranging of sets of test items; the calculation of test scores; the administration and managing of tests; and the delivery of tests on local and global sites. Such systems solve many problems and provide:
- assistance in the technical processes of test development;
- online support of test applications;
- calculation of statistical characteristics of test results;
- assistance in the management and administration of tests.
These systems automate the routine processes in test making, but they do not provide the necessary assistance to an inexperienced test developer in designing a test as a whole or in creating test content. The main disadvantage of such systems is that they do not contain enough procedural knowledge about test development. In this chapter we argue that an ontology engineering approach can significantly contribute to solving this problem. The chapter demonstrates how ontological support can facilitate the development of a test generation system capable of: designing and generating tests, assisting an author in defining input test parameters (such as a test goal, a strategy, a scoring schema, etc.), explaining the test generation process, and producing test specifications. Section 1 describes an ontology for test design (OTD). OTD formalizes major terms for the test design process. Sections 2 and 3 present specifications of possible test generation systems: Section 2 discusses the components and functions of an intelligent test generation module and how such a module should be incorporated in an intelligent LMS, and Section 3 presents a test generation system as a separate information system. In Section 4 we demonstrate a prototype of a test generation system. We use italic font to indicate ontological classes where appropriate.
1. An Ontology for Test Design

OTD (Ontology for Test Design) defines the major terms for test development required to provide computer support for the test design process [11]. OTD provides a structured way to accumulate and formalize not only theories for test design, but also facts and beliefs. Figure 1 shows a fragment of the upper-level structure. All entities of the class proposition (some portion of information) are divided into two subclasses: representation-neutral propositions (independent of the way the information is represented; for example, a fact is independent of its representation in the form of logic or text) and representation-oriented propositions (dependent on the way the information is represented; for example, the representation of a test item may affect the results of a test) [17, 18]. The class representation is composed of a representation form (for example, a representation in the form of logic) and content (meaning). All the existing contents are subclasses of proposition. The class role (not shown in Fig. 1) is used to define entities in some context. It is possible for the same person to play
different roles in the educational context, for example to be a learner and a teacher, or a teacher and an administrator. To express this, such entities as learner and teacher are defined as roles played by a human. The class qualitative attribute (not shown in Fig. 1) is used to define qualities or characteristics of entities. OTD formalizes the major terms of CTT, but it is open for extensions.
Figure 1. A representation of propositional classes in OTD.
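The upper-level split in Figure 1 can be sketched as a minimal class hierarchy. This is a hypothetical rendering for readability only; OTD itself is expressed in OWL, not in a programming language:

```python
class Proposition:
    """Some portion of information."""

class RepresentationNeutralProposition(Proposition):
    """Independent of how the information is represented, e.g. a fact."""

class RepresentationOrientedProposition(Proposition):
    """Dependent on the representation, e.g. the wording of a test item."""

class Representation:
    """A representation form (e.g. logic, text) paired with content (meaning).
    All contents are propositions."""
    def __init__(self, form: str, content: Proposition):
        self.form = form
        self.content = content

# A fact rendered as text: the content stays the same proposition
# regardless of the form chosen.
fact = RepresentationNeutralProposition()
rep = Representation("text", fact)
```

The design point the hierarchy captures is the separation of content from form: the same representation-neutral proposition can be paired with different representation forms without changing its meaning.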
The class test is defined as a subclass of the class product proposition (a proposition that is the output of some production process), and has the following characteristics (attributes):
- acceptability: a test is acceptable when all interested parties (candidates, parents, teachers, institutions of higher education, employers, and other end-users) regard the test results as trustworthy and use them accordingly;
- efficiency: a test is efficient when it does not take up more resources than is necessary;
- score: an observed score x_ij = τ_i + ε_ij, where τ_i is the true (unknown) score of the learner and ε_ij is the error of the observation.
The most important test characteristics for a conventional test designed according to CTT are the following:
- reliability, the degree to which individuals' deviation scores remain relatively consistent over repeated administrations of the same test or alternate test forms;
- validity, comprising content validation (how well test items represent a problem domain), criterion-related validation of the criteria used in making inferences from the test result, and validation of the test construction;
- precision of decisions made on the basis of test results.
The test characteristics are defined as subclasses of the upper-level class qualitative attribute [12]. OTD represents knowledge assessment as part of an educational process, which determines goals, strategies, and time points for the assessment [19]. Goals and methods to achieve those goals are key OTD classes. The class educational goal is represented as a role played by a statement (any statement can have several roles, for example a hypothesis, an assumption, a goal) and the class test goal is a subclass of the class teaching goal (see Fig. 2).
Figure 2. A representation of goals and tasks in OTD.
OTD classifies test goals as follows (see Fig. 2):
- progress check – a (usually regular) check of the learner's current achievement within the specified course;
- diagnostics of difficulties – identification of differences between a learner model and an ideal model;
- qualification – a determination of whether learners have reached the required level;
- ranking – placement of learners on the continuum of a group.
Test goals are determined by educational tasks, such as admission, evaluation (including certification), diagnostics, and mastering (not shown in Fig. 2). An educational task is used to select appropriate test tasks in order to build a test that achieves the corresponding assessment and other educational goals. The class test (a subclass of the class proposition) is described through its parts and attributes, which are represented via part-of and attribute-of relations (see Fig. 3). A test has a test goal, a test object – a learner or a group of learners – and a test subject, which might be a teacher, a test developer (an author), a test reviewer, etc. Test administrative information contains, for example, information about where (educational institution or company) and when (year, semester) the test is going to be used. It also includes administrative information about the subjects and objects of the test. A test has statistical characteristics (for example reliability, validity, precision) and an observed test score. A test consists of test directions and test sections. An ordered sequence of sections with a defined number of test items forms the test structure. A test section is composed of a number of test items and is characterized by a test item form and a scaling system (nominal, ordinal, interval, or ratio).
Figure 3. A representation of a test in OTD, where a/o is an attribute-of relation and p/o is a part-of relation.
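The part-of decomposition in Figure 3 might be rendered as plain data classes like these. The field names are assumptions made for the sketch; OTD itself is expressed in OWL, and this is not its actual encoding:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TestItem:
    stem: str
    correct_answer: str
    distractors: List[str]      # incorrect response alternatives
    understanding_level: int    # Bloom level, 1-6

@dataclass
class TestSection:
    item_form: str              # e.g. "multiple choice"
    scaling_system: str         # nominal, ordinal, interval, or ratio
    items: List[TestItem] = field(default_factory=list)

@dataclass
class Test:
    goal: str                   # e.g. "progress check"
    directions: str
    sections: List[TestSection] = field(default_factory=list)

# A test is built part by part, mirroring the p/o relations of Figure 3.
test = Test(goal="progress check", directions="Answer all questions.")
test.sections.append(TestSection(item_form="multiple choice", scaling_system="nominal"))
```

The nesting mirrors the part-of relations: a test contains sections, a section contains items, and attributes such as goal and scaling system hang off the level they qualify.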
Each test item has a stem and response alternatives (a correct answer and distractors), and is characterized by a level of understanding, which is defined by Bloom's model (see below). OTD classifies test items as follows:
- objective item form:
  - multiple choice items, which OTD divides according to the number of answer options (2, 3, etc.) and the logic of the answer (select one correct answer, select the most correct answer, or mark all correct answers). The best known examples are the true-false question (is the statement true or false) and the SAN question (select whether the statement is correct sometimes, always, or never);
  - matching problems with different numbers of columns;
- text-requiring item form: sentence completion, giving a short answer, writing an essay [3], problem solving, and simulation.
OTD defines tests in the form of paper, computer, listening, presentation, or oral. It is necessary to select a correct approach to scaling in order to obtain good test characteristics. Three broad approaches are considered in CTT [8]:
- subject-centered, which focuses on measurement of an individual and his/her place in the ranking of the examinee group;
- response-centered, which focuses on measurement of the individual's correspondence to some criteria;
- stimulus-centered, which focuses primarily on locating the position of the item on a psychological continuum.
A test–set is formed by the test documents: a test specification with a detailed description of the test for potential users, test blanks, answer sheets, etc. A test as an event is part of an instructional event, and describes a place of testing, a time of testing, etc. OTD includes the popular pedagogic six-level model of understanding suggested by Bloom [1]. The six-level model gives more information about a learner's stage and the area of his/her difficulties than a standard "flat" model of the learner's knowledge. Each test item corresponds to a definite level of understanding:
Level 1: knowledge in a narrow sense – knowing facts and definitions of notions;
Level 2: comprehension – understanding the meaning of notions, objects, and abstractions, and knowing simple rules;
Level 3: application – the ability to apply known rules;
Level 4: analysis – understanding of relationships between elements, and the ability to select and combine different rules;
Level 5: synthesis – the ability to generalize knowledge;
Level 6: evaluation – the use of meta-knowledge.
Levels 1–3 correspond to domain knowledge, while the upper levels mostly reflect domain-independent knowledge. OTD is expressed in the W3C standard ontology language OWL [24] using the Hozo ontology editor [15]. Our current version of OTD contains 348 concepts for the description of test development, including upper-level classes. In this section we briefly described the key OTD classes; the next sections show how these classes are used in test generation systems.
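As a toy illustration of how the Bloom levels above could be used to tag test items, consider a simple lookup (the function name and split are assumptions for the sketch, following the domain/domain-independent division stated above):

```python
BLOOM_LEVELS = {
    1: "knowledge", 2: "comprehension", 3: "application",
    4: "analysis", 5: "synthesis", 6: "evaluation",
}

def knowledge_kind(level):
    """Levels 1-3 assess domain knowledge; levels 4-6 mostly assess
    domain-independent knowledge."""
    if level not in BLOOM_LEVELS:
        raise ValueError(f"unknown Bloom level: {level}")
    return "domain" if level <= 3 else "domain-independent"
```

Tagging items this way is what lets a test generator balance a test across levels of understanding rather than across topics alone.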
2. A Test Generation Module

Knowledge assessment is often closely related to other teaching and learning processes [23, 25]. We consider such a case in this section. In many situations knowledge assessment can be treated as separate from the teaching process, for example entrance exams to a university, or aptitude tests. We consider that case in the next section.
If we consider, for example, an instructional event corresponding to teaching a course, all the educational processes involved usually have the same subject, object, and domain. The goals, methods, and strategies of these processes should be compliant. In particular, the goal of the assessment depends on the educational goals and strategies. Thus it is technologically reasonable to analyze and provide support for test generation in the context of, and in connection to, other instructional processes. In [2] we proposed a reference model of an Intelligent Educational System (IES). An IES typically includes the following modules:
- Instructional strategy;
- Sequencing strategies;
- User model;
- Adaptation;
- Test generation;
- Domain model;
- Learning resource model.
The Test Generation Module (TGM), as a subsystem of the IES, can request from the other modules all the necessary information about a learner (a test object), the domain, the available resources, including a test bank and a database of test items, and the instructional process. The instructional design and sequencing modules inform the TGM when to test, for what purpose, and, if necessary, what the requirements are: test duration, number of test items, test item forms, etc. The TGM communicates with users via the interface of the IES. An intelligent TGM should be able to determine the teaching goal corresponding to the test goal, an appropriate strategy for the test design, and, if they are not provided, the best test parameters (test duration, item forms, level of understanding, number of items) to achieve the goal; to import a suitable test, or to create the test if it is not available from the test bank; and to give explanations about the test composition. The module also has to support the application regime of the system: to deliver tests to a learner (via the system's interface), to evaluate and analyze the learner's answers, and to store the results. An ontology provides a systematic way to define all input and output terms for the TGM, such as test goal, target group, and test item, as well as the TGM functions related to the test design process, such as test design, test generation, and test improvement (but not functions related to data interchange with other modules of the IES, such as import, export, and administration) (see Fig. 4 and the appendix for a list of functions). Each module of the IES has its own ontology, designed following the same principles and using the same upper-level classes [2]. This ensures full compliance and interoperability between the system modules. For example, if the TGM requests data about a target group from the User module, the latter knows exactly what it is and what data to provide.
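The information the TGM receives from the instructional design and sequencing modules, with optional parameters it must fill in itself, might be mirrored in code roughly as follows; this is an illustrative sketch, not the authors' implementation, and all field names and default values are invented:

```python
# Illustrative sketch of a TGM request record; names and defaults are assumptions.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TestRequest:
    teaching_goal: str
    target_group: str
    duration_min: Optional[int] = None       # may be left for the TGM to determine
    item_forms: List[str] = field(default_factory=list)

def fill_defaults(req: TestRequest) -> TestRequest:
    """Stand-in for the TGM determining unspecified test parameters itself."""
    if req.duration_min is None:
        req.duration_min = 45                # invented default, for illustration
    if not req.item_forms:
        req.item_forms = ["multiple-choice"]
    return req
```

A request such as `TestRequest("ranking", "CS-1")` would thus be completed by the module before test design begins.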
The class target group, characterized by a specialty, is defined as a subclass of the class group, and each group has members; in particular, a target group consists of learners. A learner, characterized by an ID, is defined as a role played by a human, and the class human has a name. So the User module will give the TGM: the specialty name, the specialty code, and the list of learners (the test objects) with their names and IDs. These data are used for the generation of test blanks and for making decisions about the difficulty of a test (depending on the specialty). If the TGM gives test results to the Sequencing Strategies module, the latter knows how to interpret and use these data for the evaluation of the user's state, and how to find suitable learning tasks. The modules' ontologies facilitate knowledge sharing and reusability, and they have proven useful in synchronizing and managing many educational tasks. Thus OTD is essential not only in formalizing a test generation subsystem along with its functionality, but also in guaranteeing reliable communication with other subsystems of the IES and with users.
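The ontology fragment just described (a target group with a specialty; a learner as a role played by a human) might be mirrored in code roughly as follows; the class and field names are ours, chosen for the sketch, not OTD's:

```python
# Sketch of the target-group/learner fragment of the ontology; names are invented.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Human:
    name: str

@dataclass
class Learner:              # a role played by a human, per the ontology
    human: Human
    learner_id: str

@dataclass
class TargetGroup:          # a group characterized by a specialty
    specialty_name: str
    specialty_code: str
    members: List[Learner]

def user_module_reply(group: TargetGroup) -> Tuple[str, str, List[Tuple[str, str]]]:
    """What the User module hands to the TGM for generating test blanks."""
    return (group.specialty_name, group.specialty_code,
            [(m.human.name, m.learner_id) for m in group.members])
```

The reply tuple corresponds to the data enumerated in the text: specialty name, specialty code, and the list of learners with their names and IDs.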
Figure 4. The test tasks in OTD.
3. A Test Generation System

In this section we consider test generation as a separate information system. A Test Generation System (TGS), as distinct from a TGM, is not supported by other IES modules with the necessary input information about a learner and resources, such as a domain model and a database of available tests. A TGS must communicate directly with a user (a teacher or a learner) in order to request input information, deliver a service, and output results. A teacher or a test developer has to tell the TGS what the test tasks, the requirements for the test, and the test time points are. However, an intelligent system should be capable of assisting the user in entering this information, for example by letting the user select a test task from a list of available tasks, providing information on recommended parameters, and providing explanations for an author during the process of test generation. Additionally, the system must be ready to determine the necessary parameters automatically. A TGS also needs to simulate some of the functions of other modules, for instance the Instructional design module, in order to determine an appropriate test strategy. A TGS also has to include an intelligent interface for end-users (test subjects and objects), a knowledge base (KB) with rules for the inference of test parameters, and a database (DB) for permanent storage of tests, test items and test results. The interface should support the following regimes:

- Application: assessment of learners;
- Development: test generation and test review;
- Knowledge acquisition: extending OTD and the KB with new knowledge for test development;

and should provide different levels of user support:

- Novice: maximum automation of test processes with minimum explanations;
- Experienced: a level of automation and explanation moderated by the user;
- Expert: full control of test processes by the user with maximum explanations from the system.
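The three support levels could be encoded as a small policy table; the values below are invented placeholders, not the system's actual settings:

```python
# Hypothetical encoding of the three user-support levels described above.
SUPPORT_LEVELS = {
    "novice":      {"automation": "maximum",  "explanations": "minimum"},
    "experienced": {"automation": "user-set", "explanations": "user-set"},
    "expert":      {"automation": "minimum",  "explanations": "maximum"},
}

def interface_policy(user_level: str) -> dict:
    """Return the automation/explanation policy for a given user level."""
    if user_level not in SUPPORT_LEVELS:
        raise ValueError(f"unknown user level: {user_level}")
    return SUPPORT_LEVELS[user_level]
```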
Figure 5 shows the flow diagram of the TGS (without the interface) for a test generation process, in the case where the user has selected the test task test design. The flow diagram shows that this task is decomposed into subtasks: test structure design, section design, test item selection, test item generation, calculation of test characteristics, test improvement and test specification preparation (see Fig. 5). These tasks are defined in OTD, and the system has the corresponding functions: input, design, generate, select, calculate, improve, prepare and output (see the appendix).

Figure 5. The flow diagram of TGS.
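The decomposition in Fig. 5 can be sketched as a control loop. This is illustrative only: the generator, selector and analysis functions passed in are stand-ins for the real OTD-defined functions, and the data format is invented:

```python
# Control-flow sketch of the test-design task in Fig. 5 (illustrative only).
def design_test(sections, generate_item, select_item, calculate, improve):
    test = []
    for section in sections:                      # Design: section i
        items = []
        for spec in section["item_specs"]:        # for each item ij
            item = generate_item(spec)            # Generate: item ij
            if item is None:                      # not possible to generate?
                item = select_item(spec)          # Select: item ij from the DB
            items.append(item)
        test.append({"section": section["name"], "items": items})
    stats = calculate(test)                       # Calculate: test characteristics
    return improve(test, stats)                   # Improve: the test
```

The generate-else-select fallback mirrors the "Possible to generate?" decision node in the diagram.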
4. Implementation

In order to demonstrate how OTD can be used for designing a TGS, we have developed a prototype TGS that has most of the functions discussed above: to design and generate a test, to assist in defining input test parameters, to inform about and explain the test generation process, and to produce the requested test documentation [6]. Our prototype is a separate information system. The system follows the flow diagram of the TGS based on OTD. Additionally, OTD was used in the system's interface to define the input/output information for a user and how to interpret it. For example, Figure 6 shows the list of test tasks available to a user, and Figure 7 shows a screen with the test parameters. All the terms are defined in OTD.
Figure 6. A screen shot of TGS with the test tasks.
If a user decides to skip these options, the system determines the test parameters automatically according to the rules from the KB [21, 22]. The KB has generic rules for test design following CTT. The rules tell the system the optimal duration of the test given a goal and a target group, what level of difficulty to use, how to calculate the number of test items at each level of difficulty, what types of test items to use, etc. In any situation the system tries to generate a test with the best possible characteristics.
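A toy stand-in for such KB rules might look like the following; the thresholds and parameter values are invented for illustration, whereas the real rules follow CTT:

```python
# Toy rule base inferring test parameters from the goal and target group.
# All numeric values and category names are invented assumptions.
def infer_parameters(goal: str, group_size: int) -> dict:
    params = {"duration_min": 45, "n_items": 20}
    if goal == "ranking":
        # a ranking test spreads items over difficulty levels to separate learners
        params["difficulty_mix"] = {"easy": 0.3, "medium": 0.4, "hard": 0.3}
    else:
        params["difficulty_mix"] = {"easy": 0.2, "medium": 0.6, "hard": 0.2}
    if group_size > 100:
        params["item_forms"] = ["multiple-choice"]   # cheaper to score at scale
    else:
        params["item_forms"] = ["multiple-choice", "open-ended"]
    return params
```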
Figure 7. A screen shot of TGS with the test parameters.
In our example, the test was generated for the course "Data base management". Figure 8 presents test items for the 1st level of understanding, which were generated automatically by the system. Note: the text is generated fully automatically using test item templates and might contain grammatical errors. The knowledge about the course has been formalized using the same ontology methodology and the same upper-level classes: proposition, fact, event, qualitative attribute, etc. (see Section 1). The system can use instances of these classes to fill in the templates. The user can edit the items' text if necessary, re-generate them, or import ready-made test items. Ready-made and generated test items are stored in the DB.
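Template-based generation of this kind can be sketched in a few lines; the templates and concept names below are invented for illustration and are not the system's actual templates:

```python
# Sketch of template-based item generation from ontology instances.
# Templates keyed by level of understanding; both templates are assumptions.
TEMPLATES = {
    1: "What is the definition of {concept}?",
    2: "Which of the following statements about {concept} is true?",
}

def generate_items(level, concepts):
    """Fill the level's template with concept instances taken from the ontology."""
    template = TEMPLATES[level]
    return [template.format(concept=c) for c in concepts]
```

For example, `generate_items(1, ["primary key", "foreign key"])` produces two level-1 definition questions.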
Figure 8. Test items.
Figure 9. The generated test documents (a fragment).
Figure 9 shows a fragment of the generated test specification, with detailed information about the test and each test item, including the correct answer and how to score the item. The system provides the user with explanations of all the steps of the test generation during the session, and at the end of the session it produces a final text explaining how the test was created (see the option "System's explanations" in the screenshot in Fig. 9). This TGS has been tested at Aberystwyth University, UK, for the CS10610 Databases course, and at Vladivostok State University of Economy and Service, Russia, for a course on Calculus. The TGS was highly rated by the lecturers as a useful tool for preparing tests, and liked by the students because of the clear test directions and the variety of test questions. Figure 10 compares the results of two tests: a teacher-made test and the test automatically generated by our system (marked as generated 1 and generated 2, where generated 2 is the same test but with a different scoring system, giving more weight to more difficult questions). The theme of the tests is "Integration of functions", and the goal is ranking; both tests were used for the assessment of 68 students of management and accounting specialties. The results of such a test should clearly separate the learners, because the goal of the assessment is "to place learners on the continuum of a group". Such tests can be used, for example, to identify the most appropriate candidates for a particular task, or the most advanced learners. Figure 10 shows clearly that the generated test better separates the students into the corresponding ranks.

Figure 10. Frequency of the test scores, where 'generated 1' is the test automatically generated by the TGS, 'generated 2' is the same test but with a different scoring system, and 'teacher' is a teacher-made test.
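The ranking argument can be made concrete: a test that separates learners better produces a wider spread of summed scores. A minimal sketch, using invented score data (not the actual results from Fig. 10) and the population standard deviation as a crude separation measure:

```python
# Comparing how well two tests separate a group of learners.
from collections import Counter
from statistics import pstdev

def score_frequency(scores):
    """Frequency of each summed test score, as plotted in Fig. 10."""
    return dict(sorted(Counter(scores).items()))

def separation(scores):
    """Crude separation measure: population standard deviation of the scores."""
    return pstdev(scores)

teacher   = [4, 4, 5, 5, 5, 6, 6]    # invented data: scores bunched together
generated = [2, 3, 4, 5, 5, 6, 7]    # invented data: scores spread over the scale
assert separation(generated) > separation(teacher)
```

A wider spread means the scores place learners at more distinct points of the group continuum, which is exactly what a ranking test is for.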
5. Conclusion

We have proposed an ontology, OTD, that formalizes the key terms in test development and defines the functionality of test generation systems and their components. We have verified this approach by implementing a prototype TGS and applying it to real test generation processes. The results show that an ontology engineering approach can provide a formal description of the procedural knowledge for the test development process and improve the design and implementation of test generation systems.
Acknowledgements

We would like to thank Professor R.D. King, Aberystwyth University, UK, for valuable assistance in writing the chapter. We thank Sergey Ishkov, a student at Ecole Nationale Supérieure de Techniques Avancées, Paris, France, who programmed the code of the TGS. We thank Dr Natalia Bazhanova, lecturer at Vladivostok University, Russia, for the pedagogical experiments with the TGS.
References

[1] L.W. Anderson, D.R. Krathwohl, A Taxonomy for Learning, Teaching, and Assessing, Addison Wesley Longman, 2001.
[2] L. Aroyo, A. Inaba, L. Soldatova, R. Mizoguchi, EASE: Evolutional Authoring Support Environment, Proceedings of ITS'04, 2004.
[3] D. Shermis, Jill C. Burstein (Eds), Automated Essay Scoring, 2003.
[4] V.S. Avanesov, Modern Methods of Learning and Control of Knowledge, DVGTU, Vladivostok, 1999 (in Russian).
[5] F. Baker, The Basics of Item Response Theory, ERIC Clearinghouse on Assessment and Evaluation, University of Maryland, College Park, MD, 2001.
[6] J. Brank, M. Grobelnik, D. Mladenic, A Survey of Ontology Evaluation Techniques, Proceedings of Data Mining and Data Warehouses, 2005.
[7] Sh. Cheng, Y. Lin, Y. Huang, Dynamic question generation system for web-based testing using particle swarm optimization, Expert Systems with Applications 36 (2008), 616-624.
[8] L. Crocker, J. Algina, Introduction to Classical and Modern Test Theory, Wadsworth Group, Thomson Learning, 1986.
[9] M.A. Dandamaev, Babylonian Scribes, Moscow: Science, 1983 (in Russian).
[10] P.H. DuBois, A History of Psychological Testing, Boston: Allyn & Bacon Inc, 1970.
[11] T.R. Gruber, Toward principles for the design of ontologies used for knowledge sharing, Int. J. of Human-Computer Studies, Special issue on Formal Ontology in Conceptual Analysis and Knowledge Representation, 1993.
[12] N. Guarino, Some Ontological Principles for Designing Upper Level Lexical Resources, First International Conference on Language Resources and Evaluation (eds. Rubio, Gallardo, Castro, Tejada), 1998, 527-534.
[13] E. Hirata, C. Snae, M. Brueckner, E-Assessment System for Young Learners (EASY), Proceedings of WorldCALL, Fukuoka, 2008.
[14] C.L. Hulin, F. Drasgow, C.K. Parsons, Item Response Theory: Application to Psychological Measurement, Homewood, IL: Dow-Jones Irwin, 1983.
[15] K. Kozaki, Y. Kitamura, M. Ikeda, R. Mizoguchi, Hozo: An Environment for Building/Using Ontologies Based on a Fundamental Consideration of "Role" and "Relationship", Knowledge Engineering and Knowledge Management, 2002, 213-218.
[16] D.L. McArthur (Ed.), Alternative Approaches to the Assessment of Achievement, Kluwer Academic Publishers, Boston, 1987.
[17] R. Mizoguchi, Tutorial on Ontological Engineering, Part 1: Introduction to Ontological Engineering, New Generation Computing, OhmSha & Springer 21 (2003), 365-384.
[18] R. Mizoguchi, Tutorial on Ontological Engineering, Part 3: Advanced course of ontological engineering, New Generation Computing, OhmSha & Springer 22(2) (2004), 193-220.
[19] R. Mizoguchi, K. Sinitsa, Task Ontology Design for Intelligent Educational/Training Systems, Proceedings of the Workshop on Architectures and Methods for Designing Cost-Effective and Reusable ITSs, ITS'96, Montreal, 1996, 1-21.
[20] OWL. http://www.w3.org/TR/owl-guide/
[21] L. Soldatova, R. Mizoguchi, Generation of test based on test ontology, Proceedings of the International Conference on Computers in Education (ICCE03), Hong Kong, 2003, 1147-1150.
[22] L. Soldatova, R. Mizoguchi, Ontology of tests, Proceedings of Computers and Advanced Technology in Education, Greece, 2003, 175-180.
[23] IMS Question and Test Interoperability. http://www.imsproject.org/question/
[24] OWL. http://www.w3.org/TR/owl-guide/
[25] The International Test Commission. http://www.intestcom.org/
Appendix

1. Functions of TGM as a part of IES:

Design: to determine the test goal, the approach for the test composition, the test structure, and the method of scaling.
Input: a teaching goal, a target group, requirements for the test.
Output: a test goal, a recommended test structure and a scoring scheme.

Import: to select and import a ready test with the determined structure from the test bank, if available.
Input: the test structure.
Output: a test.

Select: to import test items from a data base of test items, if available.
Input: (from the test structure) the item form, the level of difficulty for each test item.
Output: test items.

Generate: to generate test items.
Input: (from the test structure) the item form, the level of difficulty of each test item.
Output: test items.

Calculate: to calculate test statistics characteristics.
Input: the test, the test results.
Output: test statistics characteristics.

Export: to export the generated test items to the data base; to export the generated test and its documentation to the test bank.
Input: test items; a test, test statistics characteristics.
Output: test items; the test, test statistics characteristics, test documentation.

Explain: to provide explanations about the test design process or the analysis of the test results.
Input: the test, the test results.
Output: text of explanations.

Prepare: to prepare the required test documents.
Input: the test; test administrative information.
Output: the test documents.

Supply: to send a test to the interface module for delivery to a learner.
Input: the test.
Output: the test.

Observe: to get information about the learner's knowledge.
Input: the test.
Output: the learner's responses.

Evaluate: to analyze the learner's responses.
Input: the learner's responses.
Output: quantitative and qualitative characteristics.

Improve: to enhance the quality of the test on the basis of the analysis of the test result characteristics.
Input: the test, the test characteristics.
Output: an upgraded test.

Administrate: to provide storage of and access to the test results.
Input: commands.
Output: the test results information.

2. Functions of TGS, additional to the TGM functions:

Input: to get input data from a test subject.
Input: a teaching goal, a target group, a domain model, requirements for the test.
Output: a test goal, a test structure, a scoring scheme.

Simulate: to infer the parameters of the test requirements.
Input: a teaching goal, a target group, a domain model.
Output: test duration, item forms, item difficulty, a scoring scheme.

Assist: to help the user determine a test goal, an approach for the test composition, a test structure, and a method of scaling.
Input: a list of test tasks, test goals, recommendations for test parameters.
Output: a teaching goal, a target group, a domain model.

Inform: to inform the user about the test generation process.
Input: test parameters.
Output: information, alert messages.

Access: to provide access to test items and tests for editing and creating new ones.
Input: test subject admin information; test items, tests.
Output: test items, tests.

Deliver: to provide an interface for delivering a test to a learner.
Input: a test, the learner's admin information.
Output: the test.

Output: to give the output data of the test generation process to the user.
Input: a test, the test subject admin information.
Output: a test and the test documentation.
Part 2 Semantic Web Technologies for e-Learning
Part 2.1 Instructional Support and Adaptation
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-117
CHAPTER 7
Using Semantic Web Technologies to Provide Contextualized Feedback to Instructors Jelena JOVANOVIC a,1 , Dragan GASEVIC b, Carlo TORNIAI c and Vladan DEVEDZIC a a University of Belgrade, Serbia b Athabasca University, Canada c University of Southern California, USA
Abstract. The chapter presents our research aimed at providing teachers with fine-grained and contextualized feedback about students' activities in online learning environments. The rationale is that teachers provided with such advanced feedback are able to better organize/revise the course content and customize that content to the students' needs. Our approach is based on semantic technologies, namely ontologies and semantic annotation, which enable the integration of data about students' interactions with e-learning environments, as well as the interlinking of learning artifacts that were used or produced during those interactions. We have implemented this approach in a tool called LOCO-Analyst and first applied it in Learning Content Management Systems (LCMSs), as today's most often used e-learning environments. Subsequently, we further enhanced this approach (as well as the LOCO-Analyst tool) to leverage the features of advanced e-learning settings, i.e. LCMSs extended with tools for an enhanced learning experience. In particular, we have studied the benefits for educational feedback provisioning that stem from integrating tools for collaborative content annotation (such as tagging, highlighting and commenting) into LCMSs. This has enabled us to provide teachers with course ontology maintenance and evolution features.

Keywords. E-learning, instructor-directed feedback, ontologies, folksonomies, semantic annotation
Corresponding Author: Jelena Jovanovic, Faculty of Organizational Sciences, University of Belgrade, Jove Ilica 154, 11000 Belgrade, Serbia; E-mail: [email protected]

Introduction

The most typical form of e-learning on the Web today is through Learning Content Management Systems (LCMSs), such as Moodle (http://moodle.org) and Blackboard & WebCT (http://www.blackboard.com). It is a widely adopted technology that enables setting up online courses and managing the students' activities. In particular, LCMSs provide instructors with substantial support
for numerous activities indispensable for securing high-quality e-learning processes, such as the preparation of learning content, the structuring and organization of the content in accordance with the chosen teaching strategy, and interactions with and coordination of students' activities using online communication tools. Aiming to provide learners with advanced learning experiences, some universities have started enriching their LCMSs with tools for learning content annotation: highlighting, tagging, and commenting. Some tools have gone a step further and, in line with social constructivist learning theory, made these annotation activities collaborative. For example, the Open Annotation and Tagging System (OATS, see Chapter 14) [1] is an open-source tool, currently integrated into the iHelp Courses LCMS (http://ihelp.usask.ca/), that allows learners to collaboratively create and share knowledge by adding highlights, tags and notes to learning content. The information coming from peers, for instance how other students have tagged or commented on a piece of learning content, is seen as an important factor in increasing students' interest in course topics. Even though state-of-the-art LCMSs successfully support a huge set of online teaching/learning activities, their support for the adaptation of learning courses is very scarce. This is due to the fact that support for the adaptation of e-learning materials is much trickier and less straightforward; hence widely used LCMSs enable only simple content editing features for this purpose. However, instructors need much better support, since they have to almost constantly adapt their courses both in terms of the included materials and the applied instructional design. When doing this, instructors have to take into account students' performance and interaction with the learning content in order to better address the specific needs and requirements of each particular group of students, and thus secure high levels of student performance and learning efficiency.

The main problem regarding the adaptation of e-learning courses stems from the instructors' need for qualitative and reliable feedback about the students' usage of learning materials. Most LCMSs provide only simple statistics about the technology the students have used, and only a high-level view of their interactions with the learning content (e.g. page views) and with other students (e.g. the number of posts sent/received). However, this superficial and partitioned view of the learning process cannot provide instructors with insight into and understanding of the students' learning behavior. The reason is that the causes of any particular learning behavior and/or outcome are context dependent, and they can only be recognized if an integrated, but also fine-grained, view of the learning process is made available. To reconstruct the context of any particular learning situation, one has to be able to capture and integrate data about the learning activity (e.g., reading or discussing) that was performed, the learning content that was used or produced during that activity, and the learner(s) who were involved in the activity. Having such data integrated, one can generate more fine-grained feedback about any particular learning situation. For example, for students who performed poorly on quizzes, one can identify the learning paths they followed, to whom and about what they talked in chat rooms or discussion forums, and how their messages are related to the learning content being taught. Providing instructors with such advanced feedback results in a better organization/revision of the course content and increases instructors' capability of customizing that content to the students' needs. In this chapter, we present how we have addressed the instructors' need for increased awareness of learners' activities in today's online learning environments by providing them with fine-grained and contextualized feedback. To achieve that, we
have developed a framework for educational feedback provisioning that is based on semantic technologies, namely ontologies and semantic annotation. The proposed solution exploits the capability of ontologies to integrate various sources of information covering the same domain knowledge. In our case, we use ontologies to interrelate information about learning objects, learning activities and learners captured from various systems and tools (e.g., LCMSs and the various educational tools and environments that learners use today). In particular, we employ the Learning Object Context Ontology (LOCO) framework [2], which enables the abstraction and integration of the usage tracking data of different online learning systems and tools. Furthermore, we use domain ontologies to semantically interrelate different kinds of learning artifacts (e.g., lessons, tests, and messages exchanged during online interactions) by exploiting semantic annotation techniques [3] [4]. Finally, by reasoning over integrated learning context data and semantically interlinked learning artifacts, we are able to derive meaningful information to be presented to instructors. We have integrated the proposed solution into Reload Editor (http://www.reload.ac.uk/editor.html), an open-source, state-of-the-art course authoring tool, and named this feedback-augmented authoring tool LOCO-Analyst. LOCO-Analyst provides feedback at diverse levels of content granularity, feedback about different types of learning content (e.g., lessons and tests), as well as feedback about students' interactions in an online learning environment. In that way, an instructor is provided with relevant information that can help him/her better distinguish what can be improved (if anything) in his/her course. Furthermore, LOCO-Analyst provides instructors with rich feedback about each individual student: the student's interactions with the learning content as well as interactions with other students.

Due to the complexity of the problem addressed and the fact that most of today's universities are still using 'ordinary' LCMSs, that is, LCMSs that do not provide tools (such as OATS) for improving the students' learning experience, we first devised and implemented a set of heuristics that enable the automatic creation of useful feedback for instructors working with 'ordinary' LCMSs. In the second phase of the research, we have been analyzing whether and how feedback provisioning can be enhanced by leveraging the features of 'increased' learning experience environments. In particular, we use data about students' annotation activities (highlighting, tagging and commenting on the course content) to provide instructors with enhanced feedback that increases their awareness of the students' way of thinking and comprehension of the course content. In addition, we investigate how the collective intelligence of these enhanced learning environments can be leveraged for domain ontology maintenance and evolution. This is still an open research question, because current approaches and tools assume a background in knowledge engineering, or familiarity with ontology languages; this is true even when a (semi-)automatic approach is proposed. In general, such tools are too complex to be used by most instructors. To address this problem, we are pursuing an approach based on the folksonomies which result from the students' collaborative tagging of the course content. Besides being beneficial for instructors, this approach promises to be advantageous for students as well. In particular, it increases students' intrinsic motivation by making them aware of the fact that their activities contribute to improvements of the course they are enrolled in. When students are involved in the learning process, they feel some ownership over their work; when they feel ownership, they accept responsibility; and the feeling of responsibility increases motivation [5]. In addition, by leveraging data
coming from students' interaction with the course content (such as tagging, highlighting, and commenting), one can provide students with more appropriate tools for searching/browsing that content. For instance, a contextual tag cloud can act as a search engine for the course content and lead students to useful information in related course material.

The chapter is organized as follows: the next section gives an overview of related work in order to provide some background information and better position the research that we present. Then, in Section 2, we present our LOCO-Analyst tool and the functionalities it offers to e-learning environments (typically in the form of LCMSs) that are not equipped with tools for an enhanced learning experience. Section 3 describes the semantic technologies that our work is based upon and their role in feedback provisioning. In the second part of the chapter (Sections 4, 5 and 6), we present how we leverage the features of LCMSs offering a rich user experience to improve and extend LOCO-Analyst's feedback provisioning capabilities. In particular, we base our approach on data about students' collaborative annotation activities (such as tagging and commenting) enabled by the integration of tools like OATS into LCMSs. Section 7 concludes the chapter by pointing out research issues and challenges that have to be addressed to further enhance the support for instructors.
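A contextual tag cloud of the kind mentioned above could be computed from collaborative tagging data roughly as follows; the annotation record format is invented for the sketch, not taken from OATS:

```python
# Sketch: weight each tag by how often students applied it to one piece of content.
from collections import Counter

def tag_cloud(annotations, content_id):
    """Return relative tag weights in (0, 1] for the given content item."""
    tags = [a["tag"] for a in annotations if a["content"] == content_id]
    counts = Counter(tags)
    if not counts:
        return {}
    top = counts.most_common(1)[0][1]
    return {tag: n / top for tag, n in counts.items()}
```

The resulting weights can then drive font sizes in the cloud, so the tags most students agree on stand out.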
1. Literature Review

We consider our research as bringing together two research areas that, despite their common focus on e-learning, have been evolving separately: the provisioning of instructor-directed feedback and the empowering of e-learning environments with Semantic Web technologies. In the following two subsections, we present the related work in these two areas.

1.1. Provisioning of Instructor-directed Feedback

Classroom Sentinel is a Web service aimed at improving day-to-day instructional decision-making by providing instructors with timely and fine-grained patterns of students' behavior in classrooms [6]. In particular, it mines electronic sources of students' data to detect critical teaching/learning patterns. Once a pattern is detected, the instructor is informed about it in the form of an alert, which consists of the observed pattern, a set of possible explanations, and a set of possible reactions. In that way, the instructor is enabled to take a timely corrective action. Unlike this system, which targets learning in traditional classrooms, our approach focuses on Web-based learning environments, where student-instructor and student(s)-student(s) interactions are more complex to correctly detect, follow and analyze. Kosba and his associates have developed the Teacher ADVisor (TADV) framework, which uses LCMS tracking data to elicit student, group, and class models, and uses these models to help instructors gain a better understanding of their distance students [7]. It relies on a repertoire of predefined conditions to recognize situations that require the instructor's intervention; when such a condition is met, TADV generates advice for the instructor, as well as a recommendation for what is to be sent to the students. Whereas TADV is focused on the instructors' day-to-day activities, LOCO-Analyst aims at helping them rethink the quality of the employed learning content and learning design.
J. Jovanovic et al. / Using Semantic Web Technologies to Provide Contextualized Feedback
Zinn & Scheuer have developed Teacher Tool, a tool which analyzes and visualizes usage-tracking data in order to help instructors learn more about their students in distance learning environments. The development of the tool was preceded by a user study aimed at identifying the information that, on the one hand, is valuable for instructors and, on the other hand, can be generated from user-tracking data. However, unlike Teacher Tool, which is bound to ActiveMath 6 (a Web-based, user-adaptive learning environment for mathematics) and iClass 7 (an intelligent cognitive-based open learning system), our solution, thanks to its ontological foundation, is tool-independent.

Some researchers have applied Information Visualization (IV) techniques to help teachers understand what is happening in their classes. An example of such approaches is CourseViz, which works with the WebCT LCMS to produce various graphical representations of student tracking data [8]. Another one is GISMO, which takes a similar approach for the Moodle LCMS [9]. GISMO aims at helping teachers examine social, cognitive, and behavioral aspects of students enrolled in Web-based courses. In particular, GISMO uses students’ tracking data (e.g., access to resources and results on assignments and quizzes) as source data, and applies diverse IV techniques to graphically represent it. Whereas these systems exploit IV to present raw data, LOCO-Analyst goes a step further in that it analyzes the data and provides teachers with qualitative feedback.

Our work is also related to the research done in the area of Web mining, which is about nontrivial extraction of potentially useful patterns and trends from large Web access logs. For example, Zaïane & Luo (2001) applied advanced data mining techniques to access logs of an LCMS in order to extract patterns useful for evaluating and interpreting on-line course activities [10].
Instructors can tailor the data mining process to their needs by expressing them as constraints on the mining process (e.g., they can select a desired student or study group and the desired time period). The discovered patterns are presented in the form of charts and tables. TADA-Ed (Tool for Advanced Data Analysis in Education) is another data mining platform which integrates various visualization and data mining facilities to help instructors discover pedagogically relevant patterns in students’ online exercises [11]. Unlike these and similar systems that focus on a single learning activity (reading and exercises, respectively, in the aforementioned systems), LOCO-Analyst analyzes the diverse kinds of learning activities typically occurring in today’s LCMSs. In addition, it is easy to use (as our user study has demonstrated [12]), which, in general, is not the case with data mining based tools.

1.2. Empowering E-learning Environments with Semantic Web Technologies

The last two decades have witnessed a lot of research effort focused on applying Semantic Web technologies to different aspects of e-learning. This is an expected development, since many aspects of the e-learning process affect the educational outcomes, including (but not limited to) domain knowledge, knowledge artifacts, pedagogical models, user behavior and characteristics, social interactions, and the platforms for delivery. If advanced educational services are to be provided, all these e-learning aspects need to be captured and represented in an integrated way – in a unified knowledge space [13]. This is precisely the reason why Semantic Web
6 http://www.activemath.org
7 http://www.iclass.info
technologies have been recognized as one of the major directions for next-generation e-learning environments. Several special issues of international journals [14], [15], [16], along with the SWEL 8 (Semantic Web for E-Learning) workshop series, nicely illustrate the topics of interest and the state of the art in this area. However, to the best of our knowledge, no one has tried to use Semantic Web technologies to facilitate integration and interpretation of actual interaction data by contextualizing and formalizing the relationships among students, learning artifacts, learning activities, and instructors. Hence, our work brings novelty to this research area by demonstrating how Semantic Web technologies enable a generic provision of feedback for content authors and instructors based on students’ actual activities in diverse online learning contexts.

The major obstacle to the wider acceptance and deployment of e-learning environments based on Semantic Web technologies lies in the difficult and expensive process of domain ontology development and maintenance: the development of these ontologies is expensive for universities (given the number of online courses offered) and difficult for content authors and instructors, who typically lack the required knowledge engineering expertise. Our solution (presented in Section 5) addresses this issue by leveraging a synergy of collaborative tagging and ontologies. Even though in the last couple of years there has been an upsurge of interest in using folksonomies for ontology development, maintenance and evolution, to the best of our knowledge, this is the first research that addresses this issue from the perspective of instructors and learning content authors. Current research approaches towards the integration of ontologies and folksonomies can be classified into two major groups.
The common point of the first group of approaches is their reliance on altering the collaborative tagging process so that it creates ‘semantically rich’ tags. Tags become semantically rich either by being disambiguated by users (i.e., tags are mapped to concepts in an upper-level ontology) [1], or through community-defined tag relationships [17]. Neither method has proven very successful. We attribute this to the fact that the additional effort required for tag disambiguation compromises the inherent simplicity of the tagging process (the very reason it became so widespread and widely adopted) and repels typical taggers. The second group of approaches relies on leveraging different existing techniques, such as basic text processing, statistical analysis, social network analysis and clustering, to enable automatic and semi-automatic linking of collaborative tags and ontologies [18]. Even though these approaches are increasingly prevalent and have shown some promising results, they have not yet offered a general-purpose and reliable solution [19].

The approach that we suggest (and present in Section 5) does not aim at defining a new method for deriving an ontology out of a folksonomy. Rather, our aim is to help instructors easily identify the folksonomy tags that are relevant for a particular concept of a given ontology. Our approach takes into account the context of concepts in the ontology they belong to (i.e., the relationships of each particular concept with other concepts of the ontology) to provide a ‘contextualized’ usage of already available measures of relatedness. Besides helping instructors to constantly improve course ontologies, this solution also makes them aware of the emergent feedback of their students.
8 http://compsci.wssu.edu/iis/swel/
2. Instructor-directed Contextualized Feedback

In this section, we first present how we have identified the kinds of feedback relevant for instructors and then focus on LOCO-Analyst from the perspective of its use with current LCMSs.

2.1. Defining Feedback Requirements

In order to determine how learning context data can address the unsatisfied requirements of instructors of Web-based courses, we conducted a small-scale survey of the current practices and requirements of educators who are engaged in online learning. The survey was conducted in July and August 2006 and included instructors and designers from three Canadian universities, as well as members of the International Forum of Educational Technology & Society 9 mailing list, a well-known group of developers, researchers, and instructional designers from around the world. The survey was completed by 15 participants, and each of them provided highly informative comments. Among other important findings, a particularly interesting one was that all survey participants reported a lack of feedback about the learning process. We consider feedback to be a piece of information about observed learners’ interactions either with learning content or with other participants in the learning process. Based on a detailed analysis of the collected responses, we distilled the following kinds of feedback as the most relevant:

- Recognition of problems at different levels of content granularity (from the level of a single lesson to the entire learning module);
- Recognition of differences between successful and unsuccessful learning trajectories;
- Detection of content (i.e., lessons) that was hard for students to comprehend;
- Identification of students’ difficulties at a topic level;
- Identification of frequently discussed topics;
- Identification of the students’ level of engagement in online interactions.
The correctness of these findings was confirmed by comparison with the findings of two recent empirical studies that investigated instructors’ needs when teaching at a distance using course management systems [8], [20]. In addition, our findings are in accordance with the study conducted through the Kaleidoscope 10 European Network of Excellence, which had the aim of defining patterns for recording and analyzing interactions in online learning environments. The aforementioned types of feedback are implemented in LOCO-Analyst.

2.2. LOCO-Analyst

LOCO-Analyst is an educational tool aimed at providing instructors with feedback on the relevant aspects of the learning process taking place in a Web-based learning environment, and thus helping them improve the content and the structure of their Web-based courses. It provides instructors with feedback regarding:
9 http://ifets.ieee.org
10 http://lp.noe-kaleidoscope.org/
- the kinds of activities their students performed and/or took part in during the learning process,
- the usage and the comprehensibility of the learning content they had prepared and deployed in the LCMS,
- contextualized social interactions among students (i.e., social networking) in the virtual learning environment.
The generation of feedback in LOCO-Analyst is based on analysis of user tracking data. These analyses are based on the notion of Learning Object Context, which is about a student (or a group of students) interacting with learning content by performing a certain activity (e.g., reading, quizzing, chatting) with a particular purpose in mind. The purpose of learning object context is to facilitate the abstraction of relevant concepts from the user-tracking data of various e-learning systems and tools.

LOCO-Analyst is implemented as an extension of the Reload Content Packaging Editor, an open-source tool for creating courses compliant with the IMS Content Packaging 11 (IMS CP) specification. By extending this tool with the functionalities of LOCO-Analyst, we have ensured that instructors effectively use the same tool for creating e-learning courses, receiving and viewing automatically generated feedback about their use, and modifying the courses accordingly (these interactions are sketched on the left-hand side of Figure 1).

LOCO-Analyst has been tested with the user tracking data of iHelp Courses, an open-source, standards-compliant LCMS [21]. This LCMS captures fine-grained interactions between learners and content (e.g., the time and duration of a visit to a piece of content, links clicked on, and videos watched) and between learners (e.g., the content and time of messages sent in chat rooms along with the participant list, and the times learners read one another’s discussion messages). The right-hand side of Figure 1 indicates these interactions. LOCO-Analyst acquires users’ activity data from iHelp Courses (as depicted with an arrow directed towards LOCO-Analyst in Figure 1) and transforms it into an ontological representation compliant with the ontologies of the LOCO framework (see Section 3.1). As illustrated in Figure 1, LOCO-Analyst and iHelp Courses share learning content and domain ontologies.
Figure 1. The framework for feedback generation
11 http://www.imsglobal.org/content/packaging/
Note that LOCO-Analyst is not coupled to any specific LCMS. Despite differences in the format of the tracking data provided by various LCMSs, there are commonalities in their content and structure (e.g., history of pages visited, marks students received on quizzes, and messages posted in online discussions). These commonalities can be captured in the form of learning object context data and formally represented in accordance with the ontologies of the LOCO framework. Since LOCO-Analyst works with an ontological representation of learning object context data (stored in the Repository of Learning Object Contexts (LOCs), Figure 1), it is fully decoupled from any specific e-learning system/tool and can be considered a generic feedback generation tool applicable to diverse distance learning environments. The only thing that needs to be adjusted is the mapping between the tracking data format and the learning object context ontology of the LOCO framework (see Section 3.1 for more details about the ontologies of the LOCO framework). Videos illustrating the different kinds of feedback that LOCO-Analyst provides are available at the project’s web site 12.
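To make the required mapping concrete, the following is a minimal sketch of how one raw tracking-log record could be turned into triples shaped after the Learning Context ontology. This is not LOCO-Analyst's actual mapping code; all field names, property names, and the namespace URI are illustrative assumptions.

```python
# Minimal sketch: mapping one hypothetical LCMS tracking-log record into
# (subject, predicate, object) triples shaped after the Learning Context
# ontology. Names and the namespace are illustrative, not the real ones.

LC = "http://example.org/loco#"  # placeholder namespace

def record_to_triples(record, ctx_id):
    """Map a raw tracking record to RDF-style triples."""
    ctx = f"{LC}ctx/{ctx_id}"
    return [
        (ctx, "rdf:type", "lc:LearningContext"),
        (ctx, "lc:userRef", record["user"]),          # the learner involved
        (ctx, "lc:contentRef", record["content"]),    # the content unit used
        (ctx, "lc:activityRef", record["activity"]),  # the activity performed
    ]

log_entry = {"user": "um:student42",
             "content": "alocom:lesson-3-pseudocode",
             "activity": "lc:Reading"}

for triple in record_to_triples(log_entry, 1):
    print(triple)
```

Adapting LOCO-Analyst to a new LCMS would then amount to rewriting only such a record-to-triples mapping, leaving the analysis layer untouched.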
3. Semantic Web Technologies for Feedback Provisioning

LOCO-Analyst relies on an ontological framework; semantic annotation is used to enable the integration of diverse learning artifacts.

3.1. The LOCO framework

LOCO-Analyst relies on the LOCO ontological framework. The framework was initially aimed at facilitating reusability of learning objects and learning designs [22], and was later extended to also provide support for personalized learning [23]. The Learning Context ontology is the central component of the framework and serves as an integration point for other types of learning-related ontologies, such as the user model ontology, content structure ontology, and domain ontology [23]. These ontologies allow one to formally represent all the details of any given learning context, thus preserving its semantics in a machine-interpretable format and allowing for the development of context-aware learning services.

The focal point of the Learning Context ontology is the LearningContext class (Figure 2) which is, in accordance with the given definition of learning context (see Section 2), related to the activity (Activity) that a learner or an instructor (um:User 13) undertook while interacting with a learning content (alocom-core:ContentUnit 14). An instance of LearningContext is always related to exactly one Activity instance as well as one alocom-core:ContentUnit instance. However, it can be related to more than one um:User instance in the case of a collaborative activity engaging multiple users (e.g., various forms of online discussions).

The alocom-core:ContentUnit class formally represents any learning content that was used or produced during a learning process (e.g., a lesson, a blog post, or a discussion forum post). For example, to represent messages exchanged via online
12 http://iis.fon.rs/LOCO-Analyst/
13 The um prefix indicates that the User class comes from the User Model ontology
14 The alocom-core prefix indicates that the ContentUnit class comes from the ALOCoM Core (content structure) ontology
collaboration tools, we introduced the Message class as a subclass of both the alocom-core:ContentUnit and sioc:Post 15 classes. In addition, to allow for differentiating between messages exchanged in discussion forums and chat rooms, we defined the ChatMessage and Posting classes as subclasses of the Message class. We have also defined a number of properties for describing exchanged messages in greater detail. All kinds of messages, as well as other kinds of content units, are semantically annotated with concepts from one or more domain ontologies (see Section 3.2).
Figure 2. The Learning Context ontology: the basic concepts
The Activity class represents any kind of activity typically occurring in a virtual learning environment (e.g., an LCMS). A few basic kinds of learning activities are recognized and modeled as subclasses of the Activity class – for example, students are either reading some learning content (Reading), doing an assessment (Quizzing), or interacting with other participants in the learning process (Discussing). Each activity can comprise one or more events (instances of the Event class). For example, we have recognized various kinds of discussion events, such as events related to the students’ (and occasionally the instructors’) activities in discussion forums. These events are represented in the ontology as subclasses of the DiscussionEvent class (which in turn subclasses the Event class). Each recognized kind of activity and event is further formally described through a number of classes and properties. Of course, the Activity and Event classes can be further extended if an e-learning system has some specific types of activities/events. For example, we have already developed extensions of the LOCO-Cite ontology which integrate classes and properties required for representing specific kinds of activities and events typically occurring within software programming tools [24] and collaborative software development environments [25].

Besides the Learning Context ontology, LOCO-Analyst also makes use of the user model ontology and domain ontologies of the LOCO framework. Here, we give just a brief overview of these two kinds of ontologies. The concepts of the domain ontology are used for semantically annotating learning content units (i.e., instances of the alocom-core:ContentUnit class) and thus establishing semantic links between different kinds of learning content that students use or generate while learning (e.g., lessons of e-learning courses, blog posts, online messages).
To annotate the learning content semantically, we use tools of the Knowledge & Information Management (KIM) platform [3] (see the next subsection). This determined the format of our domain ontologies (i.e. ontologies that formally describe the subject matter of learning content). In particular, we model domain
15 The sioc prefix indicates that the class Post originates from the SIOC (Semantically Interlinked Online Communities) ontology
concepts and their relations by instantiating appropriate classes and properties of the PROTON upper-level ontology 16, as KIM requires.

To formally describe a participant in the learning process, we use the User class originating from the user model ontology that we had developed in our previous work, in the scope of the TANGRAM project 17. The ontology is described in detail in [2]. However, to make the ontology fully applicable for the purposes of LOCO-Analyst, we needed to slightly extend it with a few classes and properties that enable tracking of some additional instructors’ data, such as instructors’ feedback requests.

Ontologies of the LOCO framework are developed following the Linked Data best practices 18. In particular, linkages were established with well-known Web ontologies, namely FOAF 19 (Friend-Of-A-Friend), SIOC 20 (Semantically Interlinked Online Communities) and the DC 21 (Dublin Core) vocabulary. For example, the um:User class from LOCO’s User Model ontology is defined as a subclass of the foaf:Agent class; the sioc:Forum class is chosen as the common superclass of all the classes of the Learning Context ontology which represent different kinds of online communication channels (e.g., lc:DiscussionForum and lc:ChatRoom), etc. The SKOS 22 (Simple Knowledge Organization System) Core ontology was originally used as the meta-model for domain ontologies of the LOCO framework. All the ontologies of the LOCO framework are publicly available 23.

3.2. Semantic Annotation in LOCO-Analyst

Semantic annotation of learning artifacts is about annotating (i.e., describing) their content with semantic information from domain ontologies [4]. Semantic annotation of learning content has proven highly beneficial for feedback provision since it enables establishing semantic relations among all kinds of learning artifacts – lessons, quizzes, forum postings and chat messages.
For example, semantic annotation of quiz questions and lessons enables linkage of questions and lessons that are semantically related (i.e., questions and lessons discussing the same or similar domain concepts). Furthermore, semantic annotation of students’ messages exchanged via online communication tools (chat rooms and discussion forums) makes it possible to identify whether and how often the students have been discussing certain domain topics.

In LOCO-Analyst, semantic annotation is performed using the annotation capabilities of the KIM platform [3]. In order to apply KIM’s annotation facilities to content from a specific subject domain, KIM has to be extended with knowledge about that domain 24. During the annotation process, each learning artifact (e.g., lesson, quiz, forum posting or chat message) is assigned zero or more semantic annotations (i.e., concepts from the domain ontology). In terms of ontological representation, each instance of the alocom-core:ContentUnit class is related, via the hasDomainTopic property (a sub-
16 http://proton.semanticweb.org/
17 http://iis.fon.rs/TANGRAM/
18 http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
19 http://xmlns.com/foaf/0.1/
20 http://sioc-project.org/
21 http://dublincore.org/documents/dcmes-xml/
22 http://www.w3.org/2004/02/skos/
23 http://iis.fon.rs/LOCO-Analyst/
24 Detailed instructions on how to extend the KIM platform to cover a new domain can be found in KIM’s online documentation: http://www.ontotext.com/kim/doc/sys-doc/index.html
property of the dc:subject 25 property), to zero or more domain concepts, that is, URIs of concepts from the domain ontology. Figure 3 illustrates a content unit annotated with the ‘pseudocode’ concept of the domain ontology.
Figure 3. A content unit annotated with the domain concept of pseudocode 26, which is integrated into KIM’s knowledge base
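To illustrate how such annotations establish links among artifacts, consider the following minimal sketch: two artifacts count as semantically related when their sets of domain-concept URIs overlap. The annotation table and all identifiers below are invented for illustration; the real system stores these links as RDF via hasDomainTopic.

```python
# Invented annotation table: artifact -> set of domain-concept URIs
annotations = {
    "lesson/sorting": {"dom:pseudocode", "dom:quicksort"},
    "quiz/q7":        {"dom:quicksort"},
    "forum/post-113": {"dom:recursion"},
}

def related_units(unit, annotations):
    """Return artifacts sharing at least one domain concept with `unit`."""
    concepts = annotations[unit]
    return sorted(other for other, cs in annotations.items()
                  if other != unit and cs & concepts)

print(related_units("lesson/sorting", annotations))  # ['quiz/q7']
```

The same overlap test works across artifact kinds, which is what lets a quiz question be linked to the lesson that covers it, or a forum thread to the topic it discusses.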
3.3. Searching the Semantic Repository

LOCO-Analyst integrates a Repository of LOCs that holds learning object context (LOC) data represented as instances of the Learning Context ontology. The repository relies on Sesame 27, an open-source Java framework for storing and querying ontological data. For querying the repository we use the SeRQL (Sesame RDF Query Language) query language. Figure 4 shows a SeRQL query which we use to retrieve the learning context data required for generating feedback regarding students’ performance on a specific quiz.

SELECT question, questionTxt, correct
FROM {quizContext} rdf:type {lc:LearningContext},
     {quizContext} lc:contentRef {quiz},
     {quizContext} lc:activityRef {q},
     {q} lc:result {quizRes},
     {quizRes} lc:questionResultRef {questionRes},
     {questionRes} lc:questionRef {question},
     {question} quiz:questionItem {questionItem},
     {questionItem} quiz:content {questionTxt},
     {questionRes} lc:isCorrect {correct}
WHERE localname(quiz) LIKE quizID
USING NAMESPACE lc = , quiz =
Figure 4. SeRQL query for retrieving data about students’ performance on a specific quiz (identified with quizID), at the level of individual questions 28.
25 dc stands for the Dublin Core metadata schema (http://purl.org/metadata/dublin-core)
26 kim-wkb stands for the namespace of KIM’s ‘working knowledge base’, i.e. the repository of ontological instances (http://www.ontotext.com/kim/2005/04/wkb)
27 http://www.openrdf.org/
28 The prefix quiz refers to the namespace of a tiny ontology that we developed to formally represent an assessment instrument (i.e. a quiz)
Specifically, this query is relevant for informing instructors about the students' performance on each individual question of the quiz.
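The rows returned by such a query are then aggregated before being shown to the instructor. The following sketch (with fabricated result rows) shows the kind of post-processing involved: computing the share of correct answers per question so that problematic questions stand out.

```python
from collections import defaultdict

# Fabricated (question, isCorrect) rows, as the query would return them
rows = [("q1", True), ("q1", False), ("q1", True),
        ("q2", False), ("q2", False)]

def success_rates(rows):
    """Per-question fraction of correct answers across all students."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for question, is_correct in rows:
        totals[question] += 1
        correct[question] += int(is_correct)
    return {q: correct[q] / totals[q] for q in totals}

for question, rate in success_rates(rows).items():
    print(f"{question}: {rate:.0%} correct")
```

A question answered correctly by few students is a candidate for rewording, or a hint that the lesson covering its topic needs revision.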
4. Leveraging Students’ Collaborative Activities

In this and the following two sections, we present our work on exploring the benefits of learning environments offering rich user experience for the provisioning of educational feedback. First, we have explored how we can make use of folksonomies that emerge from students’ tags assigned to the learning content during the learning process. To this end, we leveraged folksonomies which resulted from the students’ collaborative tagging activities in OATS (Open Annotation and Tagging System). OATS is a tool which allows learners to collaboratively create and share knowledge by adding highlights, tags and notes to HTML-based learning content [1] (see Chapter 14). OATS has been integrated into the iHelp Courses LCMS and has been used in several e-learning courses deployed at the University of Saskatchewan.

To leverage data about students’ interactions within this enhanced learning environment, we have developed an extension of LOCO-Analyst in which folksonomies, representing annotations of the course content from the students’ perspective, can be used by an instructor for revising and updating domain ontologies. This research direction was motivated by the fact that the major obstacle to the wider adoption of LOCO-Analyst, or of any other ontology-based tool, is the difficulty of creating and maintaining the ontologies that such tools depend on. Actually, the major problems are related to the creation and maintenance of domain ontologies, since, unlike other kinds of ontologies relevant for e-learning, domain ontologies: (i) cannot be reused across different subject domains, but have to be created anew for each domain; and (ii) need to constantly evolve, so that the semantics they capture do not lag behind the courses they are aimed to support.
Despite current efforts to increase the availability and reusability of ontologies through the development of online ontology libraries (e.g., Swoogle 29) and search engines (e.g., Sindice 30) or (semi-)automatic ontology development tools (e.g., Text2Onto [26]), the usage of these libraries, search engines and tools still requires a level of technical knowledge that instructors and content authors often lack.

Figure 5 presents an overview of our framework (Figure 1) extended with the OATS tool. In the extended framework, students collaboratively tag the learning content in the iHelp Courses LCMS using OATS. The results of their tagging activities are accessed by LOCO-Analyst, which in turn makes them inspectable by instructors. Specifically, data about students’ collaborative tagging activities in OATS are used to inform instructors about a student’s comprehension of the course content, i.e., to help the instructor evaluate the student’s conceptualization of the course content (see Section 6 for a detailed explanation). LOCO-Analyst also makes use of the students’ collaborative tags created in OATS to facilitate the instructors’ task of maintaining the domain ontologies of their courses (explained in Section 5).

We have developed an algorithm for computing context-based relatedness between students’ tags and ontological concepts, which we use to further assist the instructors’
29 http://swoogle.umbc.edu/
30 http://sindice.com/
task of ontology maintenance by suggesting the most relevant tags for a particular concept. The algorithm is based on the idea that the ontology itself defines a ‘context’ for its concepts. Thus, when computing the relatedness between a concept and a tag, the surrounding concepts (forming the ‘context’ of the concept in question) must also be taken into account.
Figure 5. The original framework extended to support and leverage students’ collaborative tagging activities
Our algorithm uses the ‘context’ of a concept to provide a ‘contextualized’ usage of already available measures of semantic relatedness (e.g., Normalized Search Similarity based on Wikipedia [27]) for a concept-tag pair. In particular, we have defined a weighted measure of semantic relatedness between a concept C and a tag T as the average of the measures of semantic relatedness between the tag T and the concepts related to C (relations along the concept hierarchy are considered, i.e., super-classes and subclasses of C). This measure is then generalized by taking into account the whole ontology (i.e., all the weighted measures of semantic relatedness for the concepts surrounding C), providing our final contextualized measure of semantic relatedness between C and T. A detailed explanation of the algorithm and its evaluation can be found in [28]. The following section presents in more detail our approach to folksonomy-driven ontology maintenance, and how we have implemented it in LOCO-Analyst.
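The weighted measure just described can be sketched as follows. Here `base_rel` stands in for a real relatedness measure such as Wikipedia-based Normalized Search Similarity, and the toy hierarchy and scores are invented; the actual algorithm, including the generalization over the whole ontology, is described in [28].

```python
# concept -> (super-classes, subclasses): a toy fragment of a course ontology
hierarchy = {
    "Algorithm": ([], ["Sorting"]),
    "Sorting":   (["Algorithm"], ["Quicksort"]),
    "Quicksort": (["Sorting"], []),
}

def weighted_relatedness(concept, tag, hierarchy, base_rel):
    """Average base relatedness between `tag` and the concepts
    hierarchically related to `concept` (its super- and subclasses)."""
    supers, subs = hierarchy[concept]
    neighbours = supers + subs
    if not neighbours:                  # isolated concept: fall back to
        return base_rel(concept, tag)   # the plain pairwise measure
    return sum(base_rel(n, tag) for n in neighbours) / len(neighbours)

# Toy stand-in for a real semantic relatedness measure
toy_rel = lambda concept, tag: 1.0 if tag in concept.lower() else 0.2

print(weighted_relatedness("Sorting", "sort", hierarchy, toy_rel))
```

The key point is that a tag scores well against a concept only if it also fits the concept's hierarchical neighbourhood, which filters out tags that merely happen to resemble the concept label.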
5. Folksonomy-driven Ontology Maintenance In Figure 6, we present the user interface of the LOCO-Analyst extension that enables instructors to refine the domain ontologies of their courses using the students’ collaborative tags created with OATS. The extension includes a tag cloud visualizing the students’ tags (B) and a graph-based visual representation of the course domain ontology (C).
J. Jovanovic et al. / Using Semantic Web Technologies to Provide Contextualized Feedback
131
The domain ontology is presented using an interactive graph that we have implemented using Prefuse 31 , an open-source Java visualization framework. An instructor can explore the ontology by zooming in and out and/or changing the focus of the graph view by clicking and dragging nodes.
Figure 6. An extension of the LOCO-Analyst tool for the ontology maintenance task
The tag cloud employs the size and color of tags to convey to instructors information about the tags’ popularity and relevance, respectively. We have found these two feedback variables relevant for supporting the instructors’ task of enriching domain ontologies. The size of a tag reflects its popularity, which is calculated as the number of times that tag was used to annotate a particular piece of learning content. The saturation of a tag’s color reflects its relatedness to the ontological concepts encapsulated in the content of the currently selected lesson (Figure 6A), as computed by our algorithm – darker colors denote more relevant tags.

We performed a pilot study of three alternative tag cloud configurations mapping our variables to different visual characteristics of the tag cloud [29]. The goal was to inform ourselves about the type of folksonomy visualization that would work best for instructors. The three alternatives used were font size, color, and a ranked vertical list. The most popular alternative was selected: it displays popularity using the size of the tag, and semantic relatedness using the saturation of the tag color (the higher the score, the darker the tag appears). This is consistent with typical tag cloud displays.

An instructor’s interaction with LOCO-Analyst’s extension for ontology maintenance can be described as follows: as the instructor selects a lesson (or a complete learning module) from the tree-like representation of the course structure (Figure 6A):

1. The tag cloud (Figure 6B) is populated with tags related to the selected lesson.
2. The visual representation of the ontology (Figure 6C) changes to emphasize the concepts relevant to the selection being made; the colors of relevant concepts become darker as their relevance to the explored content increases.
The instructor then selects a tag (from the tag cloud view) that (s)he wants to add to the ontology and simply drags that tag towards the ontology view panel (Figure 6C). When the tag is 'over' the ontological concept it should be related to (in the instructor's opinion), the instructor 'drops' it. Instantly, a pop-up menu appears offering various options for establishing a connection between the selected concept and the tag (e.g., adding the tag as a sub-concept or as a related concept). As the instructor selects one of the available options for ontology enrichment, the ontology view is updated to reflect the changes to the ontology. In addition, the instructor has the option to postpone the decision about the tag's relation; LOCO-Analyst supports this by adding both the tag and the concept to the instructor's notes for later consideration.
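The visual encoding of the tag cloud can be sketched in a few lines. The chapter does not give the exact formulas, so the linear scaling, the pixel range, and the function name below are illustrative assumptions, not the actual LOCO-Analyst implementation.

```python
def tag_style(count, max_count, relatedness, min_px=11, max_px=28):
    """Map a tag's popularity to a font size and its concept-relatedness
    to a colour saturation (darker = more relevant), as in the tag cloud."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    # Linear scaling of popularity onto the font-size range (an assumption).
    size = min_px + round((max_px - min_px) * count / max_count)
    # Clamp the relatedness score to [0, 1] and use it as saturation.
    saturation = max(0.0, min(1.0, relatedness))
    return size, saturation
```

A tag used as often as the most popular one would render at the maximum size, while an unused tag stays at the minimum size regardless of its relatedness score.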
6. Enhanced Feedback Provisioning Based on Students' Collaborative Activities

Students' collaborative tagging activities promise to be beneficial not only for facilitating (domain) ontology maintenance, but also for providing instructors with new kinds of contextualized feedback. In particular, we have been investigating how students' collaborative tags can be used to provide instructors with enhanced feedback about students' comprehension of the course content. In the rest of this section, we present the extension of LOCO-Analyst that uses data about students' collaborative tagging activities in OATS to inform instructors about a student's comprehension of the course content. The usage of social activities such as tagging goes in the direction of leveraging Social Web approaches in technology-enhanced learning, which we have already shown to be beneficial for a wider adoption of Semantic Web technologies in e-learning environments [30].

To present the instructor with this new kind of feedback, we have extended the dialog that is used in LOCO-Analyst for displaying feedback about one particular student. The dialog's "Annotations" panel (Figure 7) provides feedback based on students' collaborative tagging activities. On the left-hand side of the dialog (Figure 7A), there is a tag cloud presenting the tags that all students used for annotating the course content. We make a visual distinction between the tags that the selected student used and those that other students used but the selected student did not: only the tags that the student used are active and painted blue (the mouse pointer turns into a hand, indicating a clickable tag), whereas the other tags are not clickable and are painted grey. This allows the instructor to easily identify to what extent the student's perception of the course content overlaps with that of his/her fellow students.
After the instructor selects one of the student’s tags from the tag cloud, the course content annotated with that tag is presented in the form of a tree as shown in Figure 7B. The tree root represents the course, branches are lessons annotated with the selected tag and tree leaves are parts of the lesson’s content annotated with the selected tag. After the instructor selects one annotation (i.e., a tree leaf), the part of the lesson forming its ‘context’ is presented in the Annotation Preview panel and the student’s notes related to that annotation (if available) are listed in the Notes Preview panel (Figure 7C).
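The course → lesson → annotation tree of Figure 7B can be sketched as a simple grouping over flat annotation records. The record fields below are illustrative assumptions, not the actual LOCO-Analyst/OATS data model.

```python
from collections import defaultdict

# Hypothetical flat annotation records (lesson, annotated fragment, tags).
annotations = [
    {"lesson": "Lesson 1", "fragment": "intro paragraph", "tags": {"recursion"}},
    {"lesson": "Lesson 1", "fragment": "code example", "tags": {"recursion", "stack"}},
    {"lesson": "Lesson 3", "fragment": "exercise 2", "tags": {"stack"}},
]

def annotation_tree(annotations, tag):
    """Group the content fragments annotated with `tag` by lesson,
    mirroring the course -> lesson -> annotation tree of Figure 7B."""
    tree = defaultdict(list)
    for a in annotations:
        if tag in a["tags"]:
            tree[a["lesson"]].append(a["fragment"])
    return dict(tree)
```

Selecting the tag "stack" would then yield the lessons (branches) and fragments (leaves) annotated with it.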
The idea behind the suggested interaction is to help the instructor evaluate the student’s conceptualization of the course content. The assumption is that the tags that the student used for annotating the content reflect his/her perception (or even comprehension) of the content. The suggested visualization would also help instructors easily spot all parts of the course that the tags were used with, and thus help them reveal some of the students’ misconceptions.
Figure 7. A screenshot of the interface providing feedback based on students' collaborative tagging
We are currently working on further visual indicators to be added to the content annotation tree (Figure 7B) to indicate the level of agreement between the instructor's and the student's conceptualization of the content. To accomplish this, we will make use of the semantic annotations of the course content with concepts of the domain ontology, the students' tags, and a context-based measure of relatedness between ontology concepts and tags (a variant of the one used for ontology maintenance). Since the domain ontology reflects (or at least should reflect) the instructor's conceptualization of the course content, using our measure of relatedness between ontology concepts and tags we can identify where the instructor's and the student's conceptualizations overlap and where they diverge.
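This agreement indicator is future work, and the actual relatedness measure is not specified here. As a rough sketch of how such a context-based comparison could behave, assume a simple word-overlap (Jaccard) proxy over the context words surrounding concepts and tags; every name and threshold below is an illustrative assumption.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sets of context words."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def agreement_report(concept_contexts, tag_contexts, threshold=0.3):
    """For each ontology concept, report 'overlap' when some student tag is
    sufficiently related to it, and 'divergence' otherwise."""
    report = {}
    for concept, ctx in concept_contexts.items():
        best = max((jaccard(ctx, tctx) for tctx in tag_contexts.values()), default=0.0)
        report[concept] = "overlap" if best >= threshold else "divergence"
    return report
```

A concept whose context shares most of its words with some student tag would be reported as overlapping; a concept the student never tagged anything related to would surface as a point of divergence.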
7. Conclusion

This chapter has presented how some semantic technologies, namely ontologies and semantic annotation, can be used to improve the state of the art in generating feedback in today's typical e-learning settings (i.e., LCMSs). We have also presented our research aimed at leveraging the features of advanced learning environments (i.e., LCMSs extended with tools for an enhanced learning experience) for improving existing kinds of educational feedback and enabling new ones. In particular, we have so far studied LCMSs extended with tools for collaborative content annotation and recognized the benefits of this integration in two areas: the facilitation of course ontology maintenance and the provision of enhanced feedback for instructors.
We have presented the LOCO-Analyst tool, developed as a working prototype of the proposed approach. Aware that a typical instructor might not have time to explore the feedback generated by the system, when developing LOCO-Analyst we focused on the provision of qualitative summaries that can be further expanded to provide more detailed information when needed (e.g., when an instructor wants to learn more about a particular piece of information). The tool also relies on visual cues to give instructors quick insights into the students' progress. The initial evaluation study, which focused on LOCO-Analyst's feedback provisioning functionalities for current LCMSs (i.e., LCMSs without support for an enhanced learning experience), gave generally very positive results [12]. Our future work will deal with the evaluation of the LOCO-Analyst extensions that leverage students' collaborative annotation activities within an LCMS. In particular, we intend to conduct an evaluation study in real e-learning settings in order to gather users' (i.e., instructors') opinions about the proposed interfaces and modes of interaction. We expect that this study will not only inform us how to further adapt the proposed forms of interaction, but will also enable us to gather feedback on what additional types of user interaction might be valuable. We also intend to further investigate the integration of this approach into the domain-specific tools students are dealing with, as a way to increase awareness of students' interactions with these tools and provide instructors with even more precise feedback [25]. Last but not least, we intend to investigate how data about students' interactions with the learning tools can be used to directly enhance their learning experiences.
References

[1] S. Bateman, R. Farzan, P. Brusilovsky, G. McCalla, OATS: The Open Annotation and Tagging System, Proceedings of the 3rd Annual International Scientific Conference of the Learning Object Repository Research Network, Montreal, 2006.
[2] J. Jovanović, D. Gašević, V. Devedžić, Dynamic Assembly of Personalized Learning Content on the Semantic Web, Proceedings of the 3rd European Semantic Web Conference, Budva, Montenegro, 2006, 545-559.
[3] B. Popov, A. Kiryakov, A. Kirilov, D. Manov, D. Ognyanoff, M. Goranov, KIM – Semantic Annotation Platform, Proceedings of the 2nd International Semantic Web Conference, 2003, 834-849.
[4] V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta, F. Ciravegna, Semantic annotation for knowledge management: Requirements and a survey of the state of the art, Journal of Web Semantics 4(1) (2006), 14-28.
[5] M.A. Theobald, Increasing Student Motivation: Strategies for Middle and High School Teachers, Corwin Press, Thousand Oaks, California, 2006.
[6] M.K. Singley, R.B. Lam, The Classroom Sentinel: Supporting Data-Driven Decision-Making in the Classroom, Proceedings of the 14th World Wide Web Conference, Chiba, Japan, 2005, 315-322.
[7] E. Kosba, V. Dimitrova, R. Boyle, Using Student and Group Models to Support Teachers in Web-Based Distance Education, Proceedings of the 10th International Conference on User Modeling, Edinburgh, UK, 2005, 124-133.
[8] R. Mazza, V. Dimitrova, Visualising student tracking data to support instructors in web-based distance education, Proceedings of the 13th World Wide Web Conference, New York, USA, 2004, 154-161.
[9] R. Mazza, C. Milani, Exploring Usage Analysis in Learning Systems: Gaining Insights from Visualizations, Proceedings of the Workshop on Usage Analysis in Learning Systems at the 12th Int'l Conference on Artificial Intelligence in Education, Amsterdam, The Netherlands, 2005, 65-72.
[10] O.R. Zaïane, J. Luo, Towards Evaluating Learners' Behaviour in a Web-Based Distance Learning Environment, Proceedings of the IEEE Int'l Conference on Advanced Learning Technologies, Madison, USA, 2001, 357-360.
[11] A. Merceron, K. Yacef, TADA-Ed for Educational Data Mining, Interactive Multimedia Electronic Journal of Computer-Enhanced Learning 7(1) (2005). [Online]. Available at: http://imej.wfu.edu/articles/2005/1/03/index.asp.
[12] J. Jovanović et al., LOCO-Analyst: Semantic Web Technologies in Learning Content Usage Analysis, Int'l Journal of Continuing Engineering Education and Life-Long Learning 18(1) (2007), 54-76.
[13] D. Gašević, J. Jovanović, V. Devedžić, Ontology-based Annotation of Learning Object Content, Interactive Learning Environments 15(1) (2007), 1-26.
[14] D.G. Sampson, M.D. Lytras, G. Wagner, P. Diaz (Eds.), Special issue on Ontologies and the Semantic Web for E-learning, Educational Technology & Society 7(4) (2004).
[15] D. Dicheva, L. Aroyo (Eds.), Special Issue on Application of Semantic Web Technologies in E-learning, Int'l Journal of Continuing Engineering Education and Life-Long Learning 16(1/2) (2006).
[16] A. Naeve, M.D. Lytras, W. Nejdl, N. Balacheff, J. Hardin (Eds.), Special issue on Advances of the Semantic Web for e-learning: expanding learning frontiers, British Journal of Educational Technology 37(3) (2006).
[17] R. Lachica, Towards holistic knowledge creations and interchange. Part 1: Socio-semantic collaborative tagging, Talk at TMRA 2007. [Online]. Available at: http://www.informatik.uni-leipzig.de/~tmra/2007/slides/lachica_TMRA2007.pdf (2007).
[18] L. Specia, E. Motta, Integrating Folksonomies with the Semantic Web, Proceedings of the 4th European Semantic Web Conference, Innsbruck, Austria, 2007, 624-639.
[19] H.S. Al-Khalifa, H.C. Davis, Exploring the Value of Folksonomies for Creating Semantic Metadata, International Journal on Semantic Web and Information Systems 3(1) (2007), 13-39.
[20] C. Zinn, O. Scheuer, Getting to Know Your Student in Distance-Learning Contexts, Proceedings of the 1st European Conference on Technology Enhanced Learning, Crete, Greece, 2006, 437-451.
[21] C. Brooks, L. Kettel, C. Hansen, Building a Learning Object Content Management System, Proceedings of the 10th World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education (E-Learn 2005), Vancouver, Canada, 2005, 2836-2843.
[22] C. Knight, D. Gašević, G. Richards, Ontologies to integrate learning design and learning content, Journal of Interactive Media in Education 7 (2005).
[23] J. Jovanović, C. Knight, D. Gašević, G. Richards, Learning Object Context on the Semantic Web, Proceedings of the 6th IEEE Int'l Conference on Advanced Learning Technologies (ICALT 2006), Kerkrade, The Netherlands, 2006, 669-673.
[24] J. Jovanović, S. Rao, D. Gašević, M. Hatala, V. Devedžić, An Ontological Framework for Educational Feedback, Proceedings of the 5th Int'l Workshop on Ontologies and Semantic Web Services for Intelligent Distributed Educational Systems at the 13th Int'l Conference on Artificial Intelligence in Education, Marina Del Rey, California, USA, 2007, 54-64.
[25] Z. Jeremić, J. Jovanović, D. Gašević, A Semantic-rich Framework for Learning Software Patterns, Proceedings of the 8th IEEE Int'l Conference on Advanced Learning Technologies, Santander, Spain, 2008, 120-122.
[26] P. Cimiano, J. Völker, Text2Onto – A Framework for Ontology Learning and Data-driven Change Discovery, Proceedings of the 10th Int'l Conference on Applications of Natural Language to Information Systems, Alicante, Spain, 2005, 227-238.
[27] V.D. Veksler, A. Grintsvayg, R. Lindsey, W.D. Gray, A proxy for all your semantic needs, Proceedings of the 29th Annual Meeting of the Cognitive Science Society (CogSci 2007), 2007.
[28] C. Torniai, J. Jovanović, S. Bateman, D. Gašević, M. Hatala, Leveraging Folksonomies for Ontology Evolution in E-learning Environments, Proceedings of the 2nd IEEE International Conference on Semantic Computing, Santa Clara, CA, USA, 2008, 206-215.
[29] S. Bateman, J. Jovanović, C. Torniai, D. Gašević, M. Hatala, Combined Usage of Ontologies and Folksonomies in E-learning Environments, Proceedings of the Semantic Web User Interaction Workshop at CHI 2008, Florence, Italy, April 2008.
[30] C. Torniai, J. Jovanović, D. Gašević, S. Bateman, M. Hatala, E-learning Meets the Social Semantic Web, Proceedings of the 8th IEEE Int'l Conference on Advanced Learning Technologies, Santander, Spain, 2008, 389-393.
Semantic Web Technologies for e-Learning
D. Dicheva et al. (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-062-9-136
CHAPTER 8
A Cross-Curriculum Representation for Handling and Searching Dynamic Geometry Competencies

Paul LIBBRECHT a,1 and Cyrille DESMOULINS b
a DFKI Saarbrücken, Germany
b Laboratoire d'Informatique de Grenoble, France
Abstract. Interactive geometry is becoming part of the curriculum in many European countries; sharing the files of interactive geometry, the constructions, is, however, difficult because the communities are scattered across the many software systems and the many curriculum differences. The Intergeo project addresses this issue by offering a platform where cross-curriculum search and annotation can be done. The annotation language is an ontology and is made easily accessible to users; this ontology describes elementary competencies and topics and their relationships. The search functions, the management, and the access are all empowered by the semantic nature of this ontology together with the various names attached to each ontology element. This paper describes the ontology and the infrastructure that provides utility, usability, and interoperability to this knowledge corpus.

Keywords. Competencies, topics, ontologies, learning resources, information retrieval, cross-curriculum search, internationalisation, multilinguality, authoring.
Introduction

Interactive geometry resources are in wide use in many educational institutions to teach mathematics. Their adoption, however, is often difficult, as is often the case with information technologies at school. More convergence is required; the Intergeo project intends to approach it through three aspects:

- Define a common file format enabling interactive geometry constructions to cross the software borders, which currently often prevent neighbours from reusing each other's resources.
- Create a web-based platform where learning resources with interactive geometry constructions are visible, exchangeable, and searchable; this should cross the borders of national curricula.
1 Corresponding Author: Paul Libbrecht, DFKI, Stuhlsatzenhaus 3, D-66123 Saarbrücken, Germany; E-mail: [email protected].
- Allow the users of the platform to annotate the resources with quality statements so that interactive geometry resources are validated, in particular, for their appropriateness in a particular educational context.
This chapter is focused on the Intergeo web-based platform where teachers can share interactive geometry resources. The sharing mechanism is based on competencies taken from European curricula. The hypothesis is that teachers and educational experts refer to competencies in their everyday work and that competencies are the best way to link their needs, both pedagogical and mathematical, to corresponding resources. The subject of this chapter is the cross-curriculum representation of competencies in the Intergeo web platform. With an approach mixing, on the one hand, linguistic and mathematical semantic commitments and, on the other hand, techniques from classical database management, the Semantic Web, and information retrieval, handling and searching competencies is made at once useful, usable, scalable, and interoperable. This chapter starts with motivating examples and follows with an explanation of the GeoSkills ontology that was developed to represent competencies and to solve the limitations of existing approaches. The CompEd editor, which makes it possible to edit the competencies and topics of this ontology, is then described, followed by the usage of competencies in management and search. We conclude with perspectives on users' manipulation of GeoSkills and extensions to other domains.
1. Motivating Examples and Related Works

To motivate our approach, let us compare it to two other approaches using an example. Consider the competency of constructing the division of a segment into equal parts. It is described in the French national programme of study [1] as "utiliser le théorème de Thalès ou sa réciproque" ("use the intercept theorem or its converse"), whereas the English curriculum only mentions the operation of "Enlargement". In English, the French "Théorème de Thalès" is called "Intercept theorem" (in Spanish, "Teorema de Tales"; in German, "Vierstreckensatz"; in Dutch, "Stelling van Thales"). However, "Thales' theorem" in English or in German ("Satz des Thales") refers to another theorem (a right triangle inscribed in a circle).

1.1. Keyword Approach

When searching for "Thales" competencies across European curricula with a keyword approach, for example in the GNU Edu system [2], mismatching competencies will be retrieved, some referring to the intercept theorem and others to Thales' theorem, with no means to distinguish them. Handling competencies is also hard with this approach: there is no way to find similar competencies unless they happen to share the same keyword, with possible mismatches. Even within the same language or country, the links between competencies are loose, and browsing competencies is only possible through keyword input.

1.2. Thesaurus Approach

Some existing systems propose, in addition to a keyword approach, an "advanced search" based on age/level and a subject classification.
Searching for "Thales" competencies on such systems (Department for Children 2009) is simply not possible, as the concept of "Thales theorem" and the competencies to use it are too fine-grained for these thesaurus approaches. Classification is, at best, as fine as "Elementary Geometry" and, at worst, "Mathematics". Competency management in such systems is very poor: many competencies are merged into the same generic category.

1.3. The Intergeo Approach: Mixing Linguistic and Mathematical Semantics

With the Intergeo approach, searching for "Thales" competencies across European curricula by keyword is also possible. It also retrieves both types of competencies, but the mathematical semantics makes it possible to distinguish those referring to the intercept theorem from those referring to Thales' theorem. Additionally, names provided in the user's own language for competencies help the user tell them apart. The search is also extended by relying on inferred knowledge, thanks to an ontology of competencies and topics. For example, consider the competency of being able to use binomial identities to solve equations ("utiliser les identités remarquables pour résoudre une équation").2 This should of course be matched by queries using strings such as "identity", "equation", "résoudre", etc. But it should also be matched by queries using strings such as "equality", because mathematically an equality is a kind of identity. "Utiliser les identités remarquables pour résoudre une équation" matches "use equality" because the ontology defines an equality as an identity, and the platform can then infer that this competency matches "equality". In the Intergeo approach, search can also be performed by browsing the competency hierarchy, which comes from declared or inferred knowledge, or by browsing the contents of an official curriculum or a textbook table of contents.
Mixing linguistic and mathematical semantics provides great flexibility and accuracy in the management of competencies. Finding similar competencies is easy, using the semantic relations between competency categories. A new competency can be created without a precise mathematical classification, which is performed automatically afterwards.
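The inference described above can be sketched with a toy is-a hierarchy. The dictionary below is an illustrative fragment, not the actual GeoSkills encoding; it captures the chapter's example that an equality is a kind of identity, while keeping the intercept theorem and Thales' theorem unrelated.

```python
# Toy is-a fragment (child -> parent); illustrative only.
IS_A = {
    "equality": "identity",
    "thales theorem": "theorem",
    "intercept theorem": "theorem",
}

def ancestors(term):
    """Super-concepts reachable by following is-a edges upward."""
    out = []
    while term in IS_A:
        term = IS_A[term]
        out.append(term)
    return out

def matches(query_term, competency_topics):
    """A competency matches when the query term equals one of its topics or is
    directly related to it through the is-a hierarchy (inferred knowledge)."""
    for topic in competency_topics:
        if (query_term == topic
                or query_term in ancestors(topic)
                or topic in ancestors(query_term)):
            return True
    return False
```

A query for "equality" thus matches a competency whose topic is "identity", while a query for "thales theorem" does not match a competency about the intercept theorem, even though both are theorems.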
2. Representing Competencies: the GeoSkills Ontology

Representing competencies within the Intergeo platform should meet two major semantic commitments taken from the previous examples: a fine-grained mathematical semantics and competency names taken from various contexts (educational regions and languages). This section starts with a survey of competency representations in learning object repositories. It then presents the GeoSkills rationale and details, before explaining the means offered for managing and populating the ontology.
2 Throughout this paper we provide hyperlinks to the CompEd user-readable representation of GeoSkills nodes when they are referenced.
2.1. Related Works

In order to approach the representation of competencies for the Intergeo platform, we survey the state of learning object repositories, which are the closest to what the Intergeo platform should be. Topical information forms an important part of a curriculum description, the other part being the competencies themselves, in the strict sense of the word, that is, the ability to perform actions concerning a given topic. When expressed in a curriculum, a given topic implicitly means "mastering this topic". For this reason, in the remainder of this text and in the GeoSkills ontology, we generalise the competency notion to cover both competencies and topics.

As far as we could observe, learning object repositories all classify learning objects of a highly variable nature using a certain amount of bibliographic information augmented by some pedagogical and topical information. Unfortunately, there is rarely enough information to allow fine-grained search. Topical information is, at most, encoded in broad taxonomies such as the Mathematics Subject Classification (MSC) of the American Mathematical Society [3]. The most fine-grained is the WebALT repository [4], which attempts to refine the MSC to a level close to a curriculum standard. Other approaches that tend to be fine-grained are tag-based approaches, where any person providing content can freely attach any sequence of words (see 1.1) as annotations. While this approach works well for statistical similarity and within communities that share a language, it does not work as well for providing similarity measures of concepts in a well-managed fashion: it could only offer translation capabilities if mostly used by multilingual users and users that bridge several communities, and we have not yet found such users to be common.
A learning object repository that provides topical information directly within the curriculum is GNU Edu [2]: this platform catalogues learning objects according to the skills described in a curriculum, split into years and chapters. GNU Edu allows the skills to be annotated with keywords, which can be used to access the skills directly. The keywords are translated, and this is how GNU Edu achieves cross-curriculum search: a query matches a set of keywords, each matching skills from each curriculum. GNU Edu does not, however, rank the results or generalise a query so that related keywords also match. The emergent TELOS repository from the LORNET research network and its associated competency framework [5] also represents competencies. It has been considered, but rejected because of its main focus on the design and organisation of coherent courses or evaluations; on the contrary, Intergeo resources are aimed at being used as building blocks by more elaborate Learning Content Management Systems. Several approaches to linking resources to curricula are available. Curriculum Online by the British Educational Communications and Technology Agency [6] is a concerted effort between the Education Board of England and several publishers to present the curriculum standard of England associated with resources that schools may purchase. Microsoft Lesson Connection is a joint effort of Microsoft [7] and a publisher. The ExploreLearning enterprise does the same for the curricula of the USA [8]. With a smaller scope, Key Curriculum Press indexes Sketchpad dynamic geometry resources with curriculum elements [9]. Most of these approaches seem to be based on directly and manually associating resources to lines in curricula, something which is clearly not
an avenue for us, since we want the resources to cross curriculum barriers, ready to welcome new curriculum standards as they emerge. Since the start of the Intergeo project in October 2007, we have seen the rise of at least one such repository and the halt of two. The CALIBRATE project, as explained in [10] and further detailed in private communications, appeared to be a first-class provider of annotated curricula. Unfortunately, its intent did not seem to converge with cross-curriculum search, and its coverage intent appeared to be weak. The argument of [10], noticing the rise of competencies as an essential curriculum ingredient and formalising competencies by a process and topics, is what we follow in this work, bringing it to the ontological world.

2.2. Rationale of GeoSkills

In order to get a precise mathematical semantics, the approach is to rely on well-defined semantics, decidable knowledge representation, and widely interoperable languages. OWL-DL meets such requirements. It is an interoperable format provided by the W3C [11]. Its underlying logic is a Description Logic that has been proved to be decidable. Widely used OWL editors are available, such as Protégé [12] or Swoop [13,14]. Additionally, several inference engines, such as those of [15], [16], or [17], are available. This contrasts with the topic-maps standard: there exists a standardised language [18] for topic maps and an editing tool [19], but this editor is less widely used than Protégé and, more importantly, there are no results about the decidability of algorithms on topic maps.
Figure 1. Protégé editing.
On the linguistic level, in order to enable users to identify ontology elements with the names and descriptions they are used to, each element of the OWL ontology (classes, instances, and properties) can be described by names in each language. This is done using dedicated datatype properties for instances, and rdfs:label annotations for classes and properties. On the practical level, the idea is to use tools providing enough affordance for non-computer-scientists, such as curriculum experts from several countries, and to ask them to collaboratively construct the ontology and benchmark it with instances taken from the localised geometry curricula they master. The affordance is needed both at the tool level and at the conceptual level: the ontology has been kept rather simple in order for curriculum encoders to be able to handle it easily. For this reason, we gave up on representing relationships between competencies and between topics beyond the is-a and subclass relationships.

Protégé was chosen as the editing tool (see Figure 1), both for the design of the ontology and for its first validation by a small group of experts. It offers OWL-DL editing, which is coherent with the theoretical requirements. In our experience, and as seems to be reported by many, its stable versions are usable by non-computer-scientists and non-OWL-specialists while allowing specialists to perform deep editing tasks; we had success with the majority of our curriculum experts. It is the most widely used ontology editor at this moment. The new version (March 2009) also integrates DL reasoners such as Pellet and FaCT++.

2.3. The GeoSkills Ontology in Detail

For each GeoSkills ingredient, a set of names is provided, at least one per language, as explained in 2.2. This allows elements of the ontology to be presented to the user, but also lets her search for them following an auto-completion mechanism described below; finally, it supports the search engine when nodes are queried for by text.
Because names vary in their frequency of usage, they have four different degrees (common, uncommon, rare, false-friend), which are taken into account when a word is matched against them. These names must not be mistaken for identifiers, which are ASCII names expected to be used in references such as URIs or URLs (e.g., when browsing about a topic). Names are encoded as follows: for classes and properties, rdfs:label annotations are used, while for instances, the dedicated datatype properties commonName, unCommonName, rareName, and falseFriendName are used, each describing a name with its particular commonality. The essential GeoSkills ingredients are topics, competencies, pathways, levels, and programmes.

Topic is organised as a taxonomy (see Figure 2), that is, a hierarchy of abstract classes, each representing mathematical topics and objects. Multiple inheritance is possible thanks to OWL and is of great use in this case. Because OWL-DL properties only relate instances, each class has a single representative individual. Property attached to Topic (other than names):

- belongsToCurriculum, with range Programme.

Examples of topics include isosceles triangle or ApproximationProcess_for_roots.
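The name degrees can feed a simple weighted matcher for auto-completion and search. The weights below are illustrative assumptions (the chapter does not specify how the platform weights each degree); only the four property names come from the ontology.

```python
# Hypothetical weights per name degree; the platform's actual weighting
# is not given in the chapter.
WEIGHTS = {"commonName": 1.0, "unCommonName": 0.6, "rareName": 0.3, "falseFriendName": 0.1}

def name_score(node_names, query):
    """Best weighted match of `query` against one node's names, given as
    {"commonName": [...], "unCommonName": [...], ...}; case-insensitive."""
    q = query.strip().lower()
    best = 0.0
    for degree, names in node_names.items():
        for name in names:
            if q == name.lower():
                best = max(best, WEIGHTS.get(degree, 0.0))
    return best

# Illustrative node: the intercept theorem and its (false-friend) English name.
intercept = {
    "commonName": ["Intercept theorem", "Théorème de Thalès"],
    "falseFriendName": ["Thales theorem"],
}
```

A query matching a common name would rank the node highest, while a match on a false-friend name would still surface the node, but with a much lower score, so the user can tell the two theorems apart.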
Figure 2. Extract of the topics hierarchy of GeoSkills.
Competency is becoming the major entity of assessment and learning-plans. In GeoSkills, just as in [10] or [20], competencies are made of a verb and a set of topics. The class hierarchy of competencies represents the specialisation hierarchy of verbs, that is the cognitive process of the competency. Examples of competencies include Calculate_trigonometric_ratio, Reproduce an isosceles triangle, or Identify_square_ numbers. In the first case, calculate trigonometric ratio, the OWL individual is of the class Calculate and contains the topic trigonometric ratio. Properties attached to Competency (other than names): -
- belongsToCurriculum with range EducationalProgramme
- hasTopic with range Topic.
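The verb-plus-topics structure of a competency can be sketched as follows (an illustrative Python stand-in, not the OWL encoding; the verb hierarchy shown is hypothetical): the verb is an instance's class, sitting in a specialisation hierarchy, and the topics are its hasTopic values.

```python
# Illustrative sketch: a competency is an instance of a verb class
# (the cognitive process) together with a set of topics, as in
# Calculate_trigonometric_ratio above. The parent links are made up.
VERB_HIERARCHY = {        # child verb -> parent verb (is-a)
    "Calculate": "Apply",
    "Apply": "Competency",
}

def ancestors(verb: str) -> list:
    """Walk the specialisation hierarchy of verbs upwards."""
    chain = []
    while verb in VERB_HIERARCHY:
        verb = VERB_HIERARCHY[verb]
        chain.append(verb)
    return chain

class Competency:
    def __init__(self, verb: str, topics: set):
        self.verb, self.topics = verb, topics

calc_trig = Competency("Calculate", {"TrigonometricRatio"})
```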
EducationalRegion is an administrative educational region such as GermanyBerlin or France. EducationalPathway is a series of educational contexts such as elementary-school or Secondaire_de_Qualification_Technique_Artistique. Property attached to EducationalPathway (other than names):
- inEducationalRegion with range EducationalRegion.
EducationalLevel is an element of a pathway, for example one of its years, such as Gymnasium_Saarland_7te or Bachillerato_Ciencias_y_Tecnologia_2. Properties attached to EducationalLevel (other than names):
- belongsToEducationalPathway with range EducationalPathway
- age with range integer
- hasTopic with range Topic.
EducationalProgramme is the concrete plan of a level within a pathway; it is bound to curriculum standards. A programme can contain a list of competencies or the URL of an HTML page where they are referenced. Properties attached to EducationalProgramme (other than names):
- hasLevel with range EducationalLevel
- hasSubject with range Subject
- hasURI with range string.
The ingredients of this ontology are among the ingredients of the metadata structure that the Intergeo platform manipulates [21]. Thanks to their name-ability and mathematical semantics, they can enter an information retrieval process for both the auto-completion paradigm and the search tool paradigm (as explained below).

2.4. Ontology Maintenance Practice

In order to guarantee the long-term quality of the competencies and topics, the management tasks have to be taken seriously. Aside from the many possible handcrafted error-reporting rules that will be written as needs arise, the OWL-DL nature of the ontology provides several tools ready to be used:

OWL-DL axioms to constrain properties: several axioms can be encoded as part of the ontology, constraining the properties of the individuals. For example, one axiom stipulates that at least one topic must be an argument of a competency individual.

OWL-DL class membership by extension: OWL-DL allows axioms to state sufficient, or necessary and sufficient, conditions for an individual to be part of a class. This allows, for example, the competency individuals that are instances of the Construct competency class and have the Ruler and Compass topics to be inferred as instances of the Construct_figure_with_ruler_and_compass class as well. Better still, this allows competencies to be automatically classified as members of a class, for example the Use_equality class, which is itself automatically considered a subclass of Use_identity (because an equality is an identity).

Abstraction for similar individuals: OWL-DL axioms can automatically inject property values based on class membership. This is of particular use in countries with many educational regions, such as Germany (each of the 16 Bundesländer has its own set of educational programmes) or Switzerland (each of the 26 Cantons has its own set of educational programmes). This abstraction allows speaking of a ninth class of German Gymnasium with the same common names (9. Gymnasium, neunte, 9.), which then gets specialised per Bundesland. All these abstractions take advantage of the ontological nature of our list of competencies and topics. The fact that the description logics formalism is used provides reasoners with a decidable algorithm. Our current usage is based on the Pellet reasoner [16] version 1.5, an open-source, Java-based OWL-DL reasoner.

2.5. Populating the Ontology

Populating the ontology has two purposes. The first is to provide a first level of validation of the ontology structure. The second is to provide the Intergeo web platform with a first set of encoded curricula. In the first phase, the curriculum experts of the Intergeo group were closely involved; their work was accompanied by touches to the ontology structure, advice on the best ways to encode, and adjustments to the editing interface. The Protégé editor was the main editing tool.
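The cardinality constraint mentioned in section 2.4 ("at least one topic per competency") can be mimicked outside of OWL with a plain check; the sketch below is illustrative only, since the real system expresses it as a DL restriction verified by the reasoner, and the individual names here are hypothetical.

```python
# Sketch of the "at least one topic" axiom as an ordinary validation pass.
# In OWL-DL this would be a minCardinality restriction on hasTopic.
def violations(individuals: dict) -> list:
    """Return the names of competency individuals with no topic."""
    return [name for name, topics in individuals.items() if not topics]

competencies = {
    "Calculate_trigonometric_ratio": {"TrigonometricRatio"},
    "Broken_competency": set(),   # hypothetical individual violating the axiom
}
```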
We have made use of the Protégé client-server3, which allowed team members to work synchronously on the ontology from remote places provided they were equipped with a very good network connection; only universities have met this requirement thus far. For other members, in particular the companies involved in the Intergeo project, it was necessary to allow exclusive work on a local copy. Another limitation lies in the generic ontology-editor nature of Protégé, which makes it able to perform all sorts of changes, many of which should be reserved to ontology experts.
3. The Competency Editor - CompEd

For the reasons enumerated above, a platform to edit the ontology was needed: a web-based editing tool that allows curriculum experts throughout Europe to contribute by translating and editing competencies and topics. Even before the editing actions, a first important aspect is to allow web-based navigation of the nodes of the ontology so as to allow the annotation of curriculum texts and textbooks: both of these features require topics, competencies, and levels to be addressable through URLs which can also be presented in a browser. The annotations edited in the Intergeo platforms use these links as part of their presentation, as in Figure 3.

Figure 3. Rendered annotations of a resource.

The web-based editing tool is called CompEd. Its objective is to edit the topic and competency individuals of GeoSkills as well as the topic and competency sub-classes. Editing includes altering names and relation properties (such as the generalisation/specialisation and instantiation relationships, or the involvement relationship of a topic in a competency).

3.1. CompEd Features

CompEd offers the browsing and editing of individual topics, competencies, and competency processes. Individuals can be reached by tracking recent activity; by browsing the alphabetic list view or hierarchical tree view; by navigating the relationships; by keyword searching; or by an external URL. Items are displayed in a consistent way. As depicted by Figure 4, which shows the "solve similarity problems" competency individual, the display is divided into three parts:
- The first part provides general information, which includes the name of the URI, the URI itself, and the creation and modification dates. Below, the names in the user's language for the particular item are displayed. Names are grouped by type (common, uncommon, rare, false-friend). If desired, the user can click on the "more languages" link to get the names in other languages.
3 The Protégé client-server setting is based on Java RMI and is documented at http://protegewiki.stanford.edu/index.php/Protege_Client-Server_Tutorial.
- The topic part provides the list of topics that are connected to the competency item. The list items are links, which simplifies navigation to the topic. Note there is no topic part in the view for topic items (only competencies are linked to topics).
- Finally, the structural part shows a hierarchical tree, which represents the generalisation/specialisation/instantiation path down to the competency item. In the case of competency classes (called Competency processes in the English GUI and Catégories de compétences in the French one, corresponding to the underlying cognitive process or verb), the tree shows all super-classes, sub-classes, and individuals on the path through the competency process node.
Adding and editing names, as in Figure 4, includes the provision of a textual name, a language, and a type. The type can be one of: common, uncommon, rare, or false friend. While the latter two pieces of information have default values displayed in the Intergeo tools (the common-name type and the native language of the user), validation through OWL axioms guarantees that a name is provided.
Figure 4. Presentation of a competency in CompEd (for curriculum encoder role)
Editing of competencies includes:
- changes, additions, and deletions of competencies
- alterations of the competency's URI
- making connections to competency processes
- referencing topics
- provision of a default common-name in any language.
Editing of competency classes is very similar except that connections are established to other competency classes (which denotes a subclass relation) and to competency instances (which denotes a membership relation). CompEd supports the user in altering data as much as possible, i.e., it suggests default values and signals errors in a user-friendly way. The remaining input not covered by CompEd usage is left for the ontology experts; it includes adding or deleting extra properties, defining a class with a necessary and sufficient restriction, and adding or deleting axioms about the ontology. Currently, editing of educational levels is also left to them, basically by using the Protégé editor. They work informed by the curriculum encoding community, based on a public forum where users of the curriculum knowledge, curriculum encoders, and ontology experts discuss.4

3.2. CompEd Architecture

The CompEd server software has been designed with high usability in mind, based on widely spread web technologies. Thus the AppFuse framework5 is at its core and its memory management is supported by the RDBMS persistence engine MySQL through the widespread Java persistence framework Hibernate.6 These choices make CompEd a long-lasting, responsive editing framework. The decision not to use an OWL persistence engine is due to the apparent lack of a persistence framework for these technologies that scales long term, and to the ongoing need to load the complete ontology in RAM for most forms of reasoning.
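The relational storage choice can be sketched as follows; the real system uses MySQL through Hibernate in Java, so this sqlite3 stand-in and its single-table schema are purely illustrative.

```python
# Sketch of keeping ontology names in ordinary relational tables rather
# than an OWL store. The schema (uri, lang, text, degree) is hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE names (uri TEXT, lang TEXT, text TEXT, degree TEXT)")
db.execute("INSERT INTO names VALUES "
           "('IsoscelesTriangle', 'en', 'isosceles triangle', 'common')")
db.execute("INSERT INTO names VALUES "
           "('IsoscelesTriangle', 'fr', 'triangle isocèle', 'common')")

def names_for(uri: str, lang: str) -> list:
    """Fetch the stored names of an element in a given language."""
    rows = db.execute("SELECT text FROM names WHERE uri = ? AND lang = ?",
                      (uri, lang))
    return [text for (text,) in rows]
```

The design trade-off matches the paper's rationale: a relational store gives up OWL expressivity but scales and stays responsive under massive collaborative editing.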
4. Management of the Competencies and Topics
Having described the editing framework, we turn to the larger problem of the maintenance of the knowledge in the GeoSkills ontology. It is done by the assertion of axioms following the OWL semantics and by its accessibility to users, either by browsing or by searching. These two aspects provide ways to make sure the utility of the knowledge of GeoSkills is maintained.

4.1. Ontological Management

Two tools, CompEd and Protégé, can edit the GeoSkills ontology. Protégé 3.3.1 was the first editing tool, used by two curriculum experts for creating a first GeoSkills version. It offers all the possible OWL expressivity. The normal tool to be used by curriculum experts is CompEd, but it offers an expressivity reduced to instances, hyponymy (the is-a relation), links between competencies and topics, and names.
4 The curriculum encoders' online community is being built at http://curriculum.i2geo.net/
5 See http://www.appfuse.org/
6 See http://www.hibernate.org/
Because CompEd is unaware of axioms that have been expressed in OWL with Protégé, violations and new statements appear once the reasoning is invoked, nightly. The Pellet classifier, on a dedicated server, performs these ontological consistency checks. This classifier also provides automatic classification of Competency classes among themselves, and of Competency instances into classes. For example, a competency can be automatically classified into the Competency class Use formula, which is defined with a necessary and sufficient condition as "subclasses of Use that refer to a formula" (see the example of 1.3). We shall see below that this is done at synchronisation time.
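The nightly classification step can be mimicked in plain code; note that the real system delegates this to Pellet as a DL subsumption test, so the defined classes and individuals below are hypothetical stand-ins for the idea only.

```python
# Sketch of classification by necessary-and-sufficient conditions: a
# defined class such as Use_formula collects matching competency
# individuals. All names are made up; the real check is done by Pellet.
def classify(individuals: list) -> dict:
    """Assign each (name, verb, topics) individual to the defined classes it satisfies."""
    defined = {
        "Use_formula":
            lambda verb, topics: verb == "Use" and "Formula" in topics,
        "Construct_figure_with_ruler_and_compass":
            lambda verb, topics: verb == "Construct" and {"Ruler", "Compass"} <= topics,
    }
    result = {}
    for name, verb, topics in individuals:
        result[name] = [cls for cls, test in defined.items() if test(verb, topics)]
    return result

inferred = classify([
    ("Use_quadratic_formula", "Use", {"Formula", "QuadraticEquation"}),
    ("Construct_bisector", "Construct", {"Ruler", "Compass", "AngleBisector"}),
])
```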
Figure 5. The second step of the wizard add-a-resource in the Intergeo platform.
4.2. Access to Competencies by Typing: SkillsTextBox

To allow users to identify the competencies, topics, or levels they mean, we extend the familiar auto-completion: users can type a few words in the search field, which are matched to the terms of the names of the tokens; the auto-completion pop-up presents, as the user types, a list of matching tokens (as seen in Figure 5). This list presents, for each candidate, the default common name, the name found to match the user's input, the number of related resources, an icon of the type, and a link to browse the ontology at the node and around it. When a candidate is chosen, using either a click or a few presses of the down key followed by the return key, the choice action either triggers a search or adds the node to a list, e.g. for annotations.
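A much-simplified sketch of this auto-completion follows; the matching here is bare prefix matching and the ranking is by resource count only, whereas the real index also applies stemming and fuzzy matching, so treat the data and ranking as placeholders.

```python
# Minimal auto-completion sketch: typed fragments are matched against
# token names and the best candidates returned, capped at 20 as in
# SkillsTextBox. Token data is hypothetical.
def complete(fragment: str, tokens: list, limit: int = 20) -> list:
    """Tokens are (default_name, resource_count) pairs; rank by resource count."""
    hits = [(name, count) for name, count in tokens
            if name.lower().startswith(fragment.lower())]
    hits.sort(key=lambda hit: -hit[1])
    return [name for name, _ in hits[:limit]]

tokens = [("triangle", 40), ("trigonometric ratio", 12), ("trapezium", 7)]
```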
SkillsTextBox uses a simple HTML form equipped with a GWT script. This script submits the fragments typed to the index on the server, which uses all the retrieval matching capabilities (stemming, fuzziness through edit distance or phonetic matching) to find the tokens whose names start with the typed input, first in the languages supported by the user, then in any language. The index returns the 20 best matching tokens and the script renders them as an auto-completion list. More information is available at http://www.activemath.org/projects/SkillsTextBox/.

4.3. Designating by Pointing in a Text

Supplementary to letting users search for resources by explicitly typing the names of competencies and topics, we offer the possibility to do this implicitly by linking them from sections of curriculum standards or of textbooks that users know well (see for example Figure 6). Although we shall mostly not be able to offer whole textbooks to browse through, we expect it to be unproblematic to display their tables of contents and have already obtained the agreement of several publishers.
Figure 6. An annotated curriculum standard of England’s KeyStage 3 maths.
Linked curriculum-texts are obtained by letting curriculum experts edit HTML forms of the curriculum-texts, adding hyperlinks containing comma-separated URIs of the ontology. A small process then converts these hyperlinks to JavaScript calls: once in context with a skills-text-box instance, a click on such a link triggers a choice action which amounts to a search query or a set of added annotations. The idea is that a user can then browse through a table of contents, through pages he is graphically familiar with, and click on sections of interest. This click triggers the selection of the competencies and topics associated with these sections, triggering the search for the related terms or adding these nodes to the list.
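The link conversion step can be sketched as follows; the fragment identifiers in the example href are made up, and the real implementation does this in JavaScript inside the page.

```python
# Sketch of the hyperlink-to-choice-action conversion described above:
# each link in an annotated curriculum text carries comma-separated
# ontology URIs; a click selects all of them.
def uris_from_link(href: str) -> list:
    """Split a comma-separated href into individual ontology URIs."""
    return [uri.strip() for uri in href.split(",") if uri.strip()]

href = ("http://i2geo.net/ontologies/dev/GeoSkills.owl#IsoscelesTriangle,"
        "http://i2geo.net/ontologies/dev/GeoSkills.owl#Construct_triangle")
nodes = uris_from_link(href)
```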
4.4. CompEd, OWL, and the Term Index: Synchronisation

The competency editor, the Protégé editor, and the skills-text-box's term index all store a representation of the GeoSkills ontology; in this section we explain how the OWL ontology file is at the centre of the synchronisation, with incremental updates and regular resets. The architecture of these pieces is depicted in Figure 7: CompEd stores the contents of GeoSkills in a way made for massive collaborative editing but cannot allow editing of all facets of the ontology; on the other side, Protégé allows full editing of the OWL ontology but is not suitable for such collaborative editing; the ontology server stores the ontology in RAM and performs the reasoning, but it only receives the updates done by the CompEd users through update XML documents, which are then incorporated into the OWL file. Finally, the term index contains an index of the names, ready for retrieval in the auto-completion and search functions. The communication flows between the pieces are as follows:

CompEd updates: following the actions of a curriculum encoder or curriculum translator, CompEd modifies its RDBMS storage and also sends an update document to the ontology server and to the term index. The latter update their representations accordingly.

Regular resets: because the intent of the competency editing process is the GeoSkills ontology, the ontology is used to replace the contents of the RDBMS. This is done through a conversion from the OWL file, read through the reasoner, to the tabular format. These resets are applied every night and are the key to receiving the reasoner's results (such as the axioms that add properties or classes).

Ontology adaptation: from time to time, after coordinating among themselves, the ontology engineers will request to work at the ontology level, for example to add axioms, to add particular new classes, or to perform clean-up operations.
In order to do so, the CompEd server is made read-only and the work on the OWL file, in a text editor or using Protégé, can happen. It is followed by the regular reset, which re-imports the OWL file into the RDBMS.
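The combination of incremental updates and nightly resets can be sketched in a few lines; the structures below are simplified stand-ins (the real updates are XML documents and the reset reads the OWL file through the reasoner).

```python
# Sketch of the synchronisation scheme of section 4.4: CompEd edits are
# broadcast as incremental updates to every replica (ontology server,
# term index), while a nightly reset replaces everything with the
# reasoner's view of the OWL file.
class Replica:
    def __init__(self):
        self.names = {}

    def apply_update(self, update: dict):
        """Incremental update: a single name change from a CompEd edit."""
        self.names[update["uri"]] = update["name"]

    def reset(self, owl_snapshot: dict):
        """Nightly reset: replace everything with the reasoner's view."""
        self.names = dict(owl_snapshot)

term_index, ontology_server = Replica(), Replica()
update = {"uri": "IsoscelesTriangle", "name": "isosceles triangle"}
for replica in (term_index, ontology_server):
    replica.apply_update(update)
```

The reset path is what lets inferred facts (added properties and classes) reach the replicas, since the incremental updates only carry what users asserted.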
5. Usage of the Competencies Ontology in the Intergeo Platform
We have described the representation and technical choices that allow GeoSkills to be used as a language of annotation. In this section we present how the software pieces are used in the Intergeo platform, which is publicly available at http://i2geo.net/.

5.1. User Roles in the Intergeo Platform

The Intergeo platform's main goal is to allow the sharing of interactive geometry constructions and related materials. This material can take the form of interactive geometric constructions, with or without concrete learner tasks attached to them, as well as web-based materials that encompass these. We shall use the term resource here, as has often been done on the web, to denote any of these data types. Overall, the usage of the platform is the execution of the following roles:
- The annotator uses the editing front-end of the community platform in order to annotate resources as referencing the given competencies or topics and a given educational level, as well as many other information fields (such as authorship or license).
- The searcher uses text search, the ontology, or curriculum-text browsing to identify the correct term so as to search through the platform's database and find relevant resources to use in teaching, to edit, or to evaluate.
- The curriculum encoder identifies a curriculum-text of interest that could be shared among platform users, obtains an appropriate electronic version, browses through it, and creates, in the ontology, the needed competencies and topics. This may also require declaring a new learning pathway, region, or programme. He is in charge of uploading the document into the web application for further sharing and adding hyperlinks into this text.
- The competency translator uses the competency-editing tool in order to add or edit titles or descriptions of a competency or topic in his own language. Translators typically require knowledge of several languages but do not require understanding the data model of a competency (hence cannot change it).
- The platform translator translates the messages of the system, a large dictionary modelled after classical application internationalisation practice, while the web-content translator translates the pages of the Intergeo platform, which represent static texts. The XWiki infrastructure is used to this end and works well.
- The quality evaluator role is described in [22].
- The ontology engineer, together with the platform administrator, monitors the reset processes and operates changes on the ontology for any facet not covered by the CompEd tool, such as editing of the axioms or educational levels.
Thus the role of the annotator is to provide sufficiently detailed topical and educational context information so that all users can find resources using the language of their curriculum as well as using everyday language. For this to work, we have added two roles to this workflow: curriculum encoder and competency translator. They make sure that each competency and topic in the curriculum standard they are responsible for is properly listed and properly identifiable.

5.2. Contribute a Resource

The annotator actions are the classical form-based editing of the metadata. The process is part of the Curriki platform, whose metadata schema is relatively lightweight. Intergeo has adapted these forms to contain the fields for trained topics and competencies and intended educational level, both of which use the skills-text-box. This is depicted in Figure 5.

5.3. Find a Resource

We have explained in section 4.2 the retrieval process that searches through the labels of the ontology and proposes a completion list of tokens. Once chosen, the tokens can be used for annotation or represent particular query terms of the search tool. In this section we explain the search tool's query mechanism, which, again, relies on the knowledge of the ontology.
Figure 7. Architecture of CompEd and the terms-index.
In making the search tool, we rely on classical information retrieval principles, which stipulate an easy query-and-result process with a result list ranked by relevance, as described, e.g., in [23], and using the Lucene library [24]. The query is made of a set of terms, each made of a string (a set of words). Some strings represent a single node of the ontology, while others represent an arbitrary textual query. For each string, a query expansion is performed as follows:
- Each string is expanded to a query for the competencies, topics, or levels whose names closely match the string, along with a query for the resources whose text contains the string.
- Each query for a resource annotated with a competency is expanded to a query for resources annotated with this exact competency or, less importantly, resources annotated with topics of that competency, or resources annotated with parent competencies.
- Each query for a topic is expanded to match resources annotated with the topic or with its parent or child topics.
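The topic-expansion step can be sketched as follows; the topic hierarchy and the boost values are illustrative placeholders, not taken from GeoSkills.

```python
# Sketch of query expansion over the topic hierarchy: a query for a topic
# node also matches its parent and child topics, at a lower weight.
TOPIC_PARENT = {"IsoscelesTriangle": "Triangle", "Triangle": "Polygon"}

def expand_topic(topic: str) -> list:
    """Return (topic, boost) pairs: the exact node, then parent and children."""
    expanded = [(topic, 1.0)]
    if topic in TOPIC_PARENT:
        expanded.append((TOPIC_PARENT[topic], 0.5))   # parent topic
    for child, parent in TOPIC_PARENT.items():
        if parent == topic:
            expanded.append((child, 0.5))             # child topics
    return expanded
```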
This query expansion mechanism, which is detailed further in [25], is the key to the tolerance of the search tool, a fundamental criterion of search tools' acceptance. This tolerance is enabled by the knowledge stored in the competency ontology, transporting the distance between ontology nodes to a distance between resources and their annotations. For example, it will allow an English-speaking teacher to search for enlargement, as node or as text, in the platform and still find what a French teacher will have annotated with the competency of applying the intercepting lines theorem. This goes beyond a simple term-translation approach because of the semantic relationships between the nodes, which are more important than word similarity.
The query expansion is a process that adds weights to the queries it produces; this is called boosting. This enables, for example, a query for the textual content of the resource (its title, description, or body) to score lower than a query for a topic or competency annotation. Similarly, the query expansion mechanism involves the user's context by preferring resources that are marked appropriate for the user's preferred educational levels. This boosting is natively supported by the Lucene search engine used. Contrary to symbolic query engines (such as SQL or RDF storage engines), the default disjunction of a query is a weighted query where a match of several terms adds to the score. Being a retrieval engine, the index structure allows Lucene to return a search result listing documents ordered by the score of the match, which is considered a relevance ranking.
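The boosted weighted disjunction can be sketched as follows; the boost values and field names are illustrative only (in the real system Lucene computes the scores).

```python
# Sketch of boosting in a weighted disjunction: each matched sub-query
# adds its boost to a resource's score, so annotation matches outweigh
# plain-text matches. Boost values are made up.
BOOSTS = {"competency": 4.0, "topic": 3.0, "level": 2.0, "text": 1.0}

def score(resource: dict, query_terms: list) -> float:
    """query_terms are (field, value) pairs; matches accumulate their boosts."""
    total = 0.0
    for field, value in query_terms:
        if value in resource.get(field, ()):
            total += BOOSTS[field]
    return total

resource = {"competency": {"Construct_triangle"}, "text": {"triangle"}}
query = [("competency", "Construct_triangle"), ("text", "triangle")]
```

Matching both an annotation and the text thus scores higher than either alone, which is exactly the accumulative behaviour the paper contrasts with the all-or-nothing semantics of SQL or RDF query engines.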
6. Conclusion

Our mixed approach appears to succeed in the community-based creation of an interoperable language to annotate learning materials with fine-grained knowledge about competencies and topics. This knowledge seems to be fundamental for the Intergeo repository of interactive constructions so that teachers, our target users, are able to publish and find resources without being limited by the boundaries of their educational curriculum or their language. The semantic nature of the language helps its management from the validation as well as the accessibility point of view. Finally, the usability of the language is supported by easy input and browsing.

6.1. Implementation Status

The GeoSkills ontology is now stable in structure. The geometry parts of the topics and competencies of the curriculum of several years of the French collège and a few of England's years are fully encoded, as are parts of the curriculum of Cataluña. These parts have been done using the Protégé editor. The curriculum standards of the Czech Republic and the German state of Bayern are mostly encoded using CompEd. Further contributions, from Luxembourg and the Netherlands, are expected soon. At the time of writing, the GeoSkills ontology contains 120 competency processes, 749 instantiated competencies, and 427 topics. This ontology is available under either the Creative Commons Attribution-ShareAlike License [26] or the Apache License 2.0 [27] from http://i2geo.net/ontologies/dev/GeoSkills.owl. The Intergeo platform is available at http://i2geo.net/ and already contains about 1500 resources, with a preliminary collection of 3000 resources done in a first phase with a shallow metadata model. It is built as an adaptation of Curriki [28], both being delivered under the GNU General Public License [29]. Its installation is documented in [30]. The list of currently actionable curriculum-texts is at http://i2geo.net/xwiki/bin/view/Main/CurriculumTexts.
The simple user interface of the cross-curriculum search engine can be used there, allowing search by competencies, topics, and levels and, of course, plain text. More information about the platform can be found in [31].
6.2. Perspectives

Among the avenues to be explored deeper is a more synthesised and complete exploitation of the conclusions of the reasoner. While inherited property values are easily handled by the parsing infrastructure which uses the reasoner, the automated classification results have been ignored thus far because they would make any parent class a direct subclass of the node: at least in the competency editing process, this is wrong as it would flatten the whole tree of inheritance (e.g. as in Figure 4). We have to explore such avenues as taking the parent classes inferred by the reasoners and removing the asserted ancestor parent classes. Beyond parsing, there should also be the possibility for the ontology server to give feedback on changes done in the curriculum editing process, including indicating inconsistencies that have appeared. The XML encoding of the updates could be of use for this purpose. The commitment to encode the curriculum standards of mathematics of many European countries seems to be novel and starts on the strong basis of a usable editing tool and internationalisation infrastructure. The perspective of such a large coverage may uncover new cross-lingual issues. Among the practical issues we have encountered is the desire of curriculum encoders to adjust URI fragment identifiers to be more precise or more correct, especially when they display a typo or a wrong name. Such a change can break existing relationships and should thus be discouraged. Except in a closed world where all the references can be updated, we lack management practices that would allow long-term URI preservation while still allowing maintenance to bring external references up-to-date. It may be that the most adequate answer to this need is to abandon the practice of readable URIs, which would remove from them any expressivity and thus the need to adjust them.
Finally, a part of GeoSkills which seems to have large potential for reusability is the part about educational contexts, which catalogues educational regions, pathways, levels, and programmes within an ontology encoded in a standards-based knowledge representation. Based on reference texts, we seem to be able to provide coverage of the full set of European schools in a strongly structured way. It has been a surprise to us that such an ontology is not yet available, the closest being a thesaurus whose structure lacks a strong specification, such as that offered by http://www.eurydice.org/.
Acknowledgements

This work has been realised within the Intergeo eContentPlus project, which was partially funded by the European Community and by the participating institutions. The opinions in this paper are, however, those of the authors. We wish to thank the members of work-package 2 of the Intergeo project, in particular Martin Homik and Arndt Faulhaber (DFKI), Colette Laborde (CabriLog), Maxim Hendriks (TU/e), and Albert Creus-Mir (Maths4More).
References

[1] Ministère de l'Éducation Nationale, Programmes des classes de troisième des collèges, Bulletin Officiel de l'Éducation Nationale 10 (1998), 108.
[2] OFSET, GNU EDU, http://gnuedu.ofset.org/, Accessed 2008.
[3] American Mathematical Society, Mathematical Subject Classification, http://www.ams.org/msc/, Accessed 2009.
[4] J. Karhima, J. Nurmonen, M. Pauna, WebALT Metadata = LOM + CCD, in Proceedings of the WebALT 2006 Conference, The WebALT project, 2006.
[5] G. Paquette, An Ontology and a Software Framework for Competency Modeling and Management, Educational Technology & Society 10 (2007), 1-21.
[6] BECTA, British Educational Communications and Technology Agency, http://partners.becta.org.uk/index.php?section=rh&rid=13661, Accessed 2009.
[7] Microsoft, Microsoft Lesson Connection Launched At Technology + Learning Conference, http://www.microsoft.com/presspass/press/1999/nov99/lessonpr.mspx, Accessed 2008.
[8] Explore Learning, Correlation of gizmos by state and textbooks, http://www.explorelearning.com, Accessed 2008.
[9] Key Curriculum Press, Sketchpad Lesson Link, http://www.keypress.com/x22318.xml, Accessed 2009.
[10] F. Van Assche, Linking Learning Resources to Curricula by using Competencies, in First International Workshop on Learning Object Discovery & Exchange, Crete, 2007.
[11] D. L. McGuinness, F. van Harmelen, OWL Web Ontology Language Overview, W3C, http://www.w3.org/TR/owl-features/, Accessed 2007.
[12] National Library of Medicine, Protégé Editor version 3.3.1, http://protege.stanford.edu/, Accessed 2008.
[13] A. Kalyanpur, B. Parsia, E. Sirin, B. Cuenca-Grau, J. Hendler, Swoop: A 'Web' Ontology Editing Browser, Journal of Web Semantics 4 (2005).
[14] Mindswap, Swoop, http://www.mindswap.org/2004/SWOOP/, Accessed 2009.
[15] J. J. Carroll, I. Dickinson, C. Dollin, D. Reynolds, A. Seaborne, K. Wilkinson, Jena: implementing the semantic web recommendations, in International World Wide Web Conference, ACM, New York, NY, USA, 2004, 74-83.
[16] Clark & Parsia, Pellet, http://clarkparsia.com/pellet, Accessed 2008.
[17] HP Labs, Jena - a semantic web framework for Java version 2.5.5, http://jena.sourceforge.net/, Accessed 2007.
[18] S. Pepper, G. Moore, XML Topic Maps (XTM) 1.0 - TopicMaps.Org Specification, http://www.topicmaps.org/xtm/1.0/, Accessed 2009.
[19] D. Dicheva, Towards Reusable and Shareable Courseware: Topic Maps-based Digital Libraries, http://compsci.wssu.edu/iis/nsdl/, Accessed 2008.
[20] E. Melis, A. Faulhaber, A. Eichelmann, S. Narciss, Interoperable Competencies Characterizing Learning Objects in Mathematics, in Lecture Notes in Computer Science, Intelligent Tutoring Systems, 5091, Springer, Berlin, 2008, 416-425.
[21] M. Hendriks, P. Libbrecht, A. Creus-Mir, M. Dietrich, Deliverable D2.4: Metadata specification, Intergeo Project, eContentPlus Program, European Community, http://i2geo.net/files/D2.4-MetadataSpec.pdf, 2008.
[22] C. Mercat, S. Soury-Lavergne, J. Trgalova, Deliverable D6.1: Quality assessment, Intergeo Project, eContentPlus Program, European Community, http://www.inter2geo.eu/files/D6.1_060508.pdf, 2008.
[23] C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.
[24] E. Hatcher, O. Gospodnetić, Lucene in Action, Manning, 2004.
[25] A. Creus-Mir, C. Desmoulins, M. Dietrich, M. Hendriks, C. Laborde, P. Libbrecht, Internationalized Ontology, Intergeo Project, eContentPlus Program, European Community, http://i2geo.net/files/D2.3-Intl-Ontology.pdf, 2008, 52.
[26] Creative Commons, Namensnennung-Weitergabe unter gleichen Bedingungen 2.0 Deutschland, Creative Commons, http://creativecommons.org/licenses/by-sa/2.0/de/, Accessed 2008.
[27] Apache Software Foundation, Apache License 2.0, Apache Software Foundation, http://www.apache.org/licenses/, Accessed 2009.
[28] The Global Education Learning Community, Curriki, http://www.curriki.org/, Accessed.
[29] Free Software Foundation, GNU General Public License version 2.0, Free Software Foundation, http://www.gnu.org/licenses/old-licenses/gpl-2.0.html, Accessed 2009.
[30] S. Egido, P. Libbrecht, H. Lesourd, Platform's Administration Manual, Intergeo Project, eContentPlus Program, European Community, http://i2geo.net/files/D4.4-AdminManual.pdf, 2009, 50.
[31] P. Libbrecht, U. Kortenkamp, C. Mercat, I2Geo: a Web-Library of Interactive Geometry, in Digital Mathematics Library Workshop, Ed. P. Sojka, 2008, 3-15.
This page intentionally left blank
Part 2.2 Semantic Web-Based Intelligent Learning Environments Architectures
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-159
CHAPTER 9
ActiveMath – a Learning Platform With Semantic Web Features
Erica MELIS a,1, Giorgi GOGUADZE b, Paul LIBBRECHT a, and Carsten ULLRICH a
a DFKI GmbH, Saarbrücken, Germany
b Universität des Saarlandes, Saarbrücken, Germany
Abstract. ActiveMath is an intelligent e-Learning platform that exhibits a number of Semantic Web features. Its content knowledge representation is a semantic XML dialect for mathematics, semantic search is enabled, some of its components work as web services and, vice versa, it employs certain foreign web services, e.g., for diagnostic purposes.
Keywords. semantic e-Learning, web services, semantically annotated learning content
Introduction
ActiveMath was one of the first educational systems that seriously addressed the Semantic Web – such as semantic representation and metadata – in a realistic e-Learning application. It has been around for quite some time now and has evolved from a prototype to a full-blown platform that is used by an international community centred in Germany so far. ActiveMath has typical intelligent tutoring system (ITS) components such as a domain (expert) model, a student model, and pedagogical modules comprising a course generator, tutorial strategies, and feedback generators. They are organized as a client-server application with additional web services. What is rather untypical for traditional ITSs are the advanced features that make it a Semantic Web application, e.g., truly semantic markup and reuse of content, semantic search, interoperable content and components, distributed architecture, generation of web presentations from the representation of content, asynchronous event framework, etc. Hence, ActiveMath is also a workbench for studying benefits of combining ITS and semantic e-Learning technologies as suggested in [1].
1 Corresponding author: Erica Melis, German Research Center for Artificial Intelligence (DFKI), Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany; Email: [email protected].
Different from traditional ITSs, ActiveMath encodes the domain
model, i.e., the domain ontology, implicitly in the content stored in a knowledge base. Because of this encoding and an active community of authors, the content and thus the ontology/domain model can evolve and change over time. Hence, ActiveMath has to accommodate those changes, preferably automatically rather than through a constant re-engineering effort. In the following, we describe ActiveMath's relevant design parts including knowledge representation, architectural issues, communication, and the role of ontologies, as well as relevant features of some ActiveMath components, among them web services. In order to provide a rather self-contained chapter, we summarize some of ActiveMath's features which were described in more detail in [2].
1. Design Principles and Preliminaries
ActiveMath has been designed with web communication, interoperability, semantic knowledge representation and metadata standards in mind.
1.1. Architecture and Communication
From the beginning, ActiveMath had a client-server architecture [3] as depicted in Figure 1. Its components communicate either internally, via HTTP requests, or use an even more elaborate communication via a mediator or broker as described below. An event system (not depicted in Figure 1) enables an asynchronous communication of components (including external systems): components and external systems 'issue' event information to it and components can 'listen'. For instance, the exercise subsystem or an external player can send information about the student's performance in an exercise, and the student model may listen and use this information. The figure shows the server of the ActiveMath platform in the middle and its web-service communications with external servers on the left-hand side. Note that the employed ontologies are not included in Figure 1 because they are not handled as main components; their information is implicit in the representation of content (stored in content repositories) and in external content dictionaries of OpenMath, which are introduced below. However, the course generator uses an explicit, stable ontology of instructional objects, oio, as explained below. Several content knowledge bases can be accessed by the course generator by way of mapping their ontologies of instructional objects to the course generator's oio [4]. A domain ontology mapping is not (yet) used. The external diagnostic systems communicate with ActiveMath via a broker and exchange information in OpenMath, as detailed below. The search component communicates with the content base(s) via direct Java calls and initiates an indexing upfront.
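The asynchronous event framework described above can be illustrated with a minimal publish/subscribe bus. This is an illustrative sketch, not ActiveMath's actual API; all class, method, and event names are invented:

```python
from collections import defaultdict

class EventBus:
    """Minimal event system: components 'issue' event information and others 'listen'."""
    def __init__(self):
        self._listeners = defaultdict(list)

    def listen(self, event_type, callback):
        self._listeners[event_type].append(callback)

    def issue(self, event_type, payload):
        for callback in self._listeners[event_type]:
            callback(payload)

# E.g., the exercise subsystem reports the student's performance,
# and the student model listens to update its estimates.
bus = EventBus()
observed = []
bus.listen("exercise.finished", observed.append)
bus.issue("exercise.finished", {"exercise_id": "ex42", "success_rate": 0.8})
```

Because listeners are decoupled from issuers, external systems (such as an external exercise player) can participate in the same way as internal components.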
ActiveMath's course(ware) generator is called Paigos [5]. It uses information about learning objects, the learner, and his/her learning goals to generate an adapted sequence of learning objects that supports the learner in achieving these goals. Hence it communicates with the student model and the content base(s). Paigos is based on an extensive model of expert teaching knowledge – about 300 "rules" define how to assemble different types of courses. Paigos can function as
Figure 1. Coarse architecture of ActiveMath with services
a service and be accessed by other learning environments. Section 1.2.1 describes what is required of the knowledge representation to enable such a service, and Section 2.3 describes the service in detail. The student model provides information to the other components via an interface. The presentation subsystem (shown multiple times in Figure 1 to allow for a clearer depiction) takes IDs, fetches XML learning objects (LOs) from the content base(s) and generates the actual presentations which the client browser will present. The actual generation uses presentation information from the user profile/request, the course generator, the exercise system, and the system settings. This communication is not depicted in the architecture figure to keep it clear.
1.2. Semantic Knowledge Representation
The knowledge representation in ActiveMath is based on the OMDoc standard for mathematical documents [6]. It defines fine-grained learning objects (LOs) connected to each other by relations and annotated with metadata. In ActiveMath, we differentiate between two types of OMDoc LOs: (1) so-called concepts (also called 'knowledge components' in ITS publications) that are the main elements of the ontology, such as symbols representing abstract mathematical concepts, definitions of these concepts, axioms, theorems and proofs; (2) satellite elements such as examples, exercises and types of texts that elaborate on, explain, or train the related concepts. The OMDoc format itself uses OpenMath [7] as the embedded format for representing mathematical formulæ. OpenMath is a well-established standard for representing mathematical formulæ and, indeed, a semantic representation. Its main goal is the interoperability of formulæ and, thus, of the systems which process
them. It defines the mathematics symbols' semantics by the usage of so-called content dictionaries [8], which contain agreed-upon symbol declarations providing a hook to which symbol occurrences point when using an OpenMath symbol (OMS) element. This enables a semantic evaluation and interoperability of mathematical formulæ. The symbol declarations are complemented by a description in regular English and by formal properties, which are mathematical statements that should hold for the symbol to be interoperable. Mathematical documents contain not only formulæ but also several types of text, links, and potentially multimedia parts. Therefore, OMDoc extends OpenMath: all textual fragments can occur in multiple languages and be interleaved with formulæ; structural elements such as definitions, examples, exercises, etc. are added – as usual in XML – including text etc. OpenMath content dictionaries serve as reference for symbols rather than for document elements. Referencing the content dictionary or grouping to which a symbol belongs adds semantic information to mathematical expressions, e.g., whether + is an operation for real numbers or for matrices. This provides additional information for the expressions' semantic evaluation by external mathematical reasoning services and for the diagnosis of user input.
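The effect of referencing a content dictionary can be shown with a toy symbol table. The dictionary name arith1 and symbol name plus follow OpenMath conventions; the matrix entry and the descriptions are invented for illustration:

```python
# An OpenMath symbol (OMS) is identified by (content dictionary, name), so that
# "+" over numbers and "+" over matrices are distinct, machine-checkable symbols.
SYMBOLS = {
    ("arith1", "plus"): "addition of numbers",
    ("matrix1", "plus"): "addition of matrices",  # hypothetical dictionary entry
}

def describe(cd, name):
    """Resolve a symbol occurrence against its content dictionary."""
    return SYMBOLS[(cd, name)]
```

A reasoning service that receives ("arith1", "plus") thus knows which formal properties apply, without guessing from the surface notation.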
The domain ontology contains abstract concepts (symbols) and their definitions, theorems, axioms as concepts and describes the subject domain from a mathematical point of view, e.g., relations between concepts that indicate the equivalence of two definitions. The pedagogical ontological information is represented in our extension of OMDoc. It includes relations such as domain rerequisite and properties of LOs such as difficulty as, e.g., defined in the LOM standard. Moreover, the pedagogical information declares the type of a learning objects according to its instructional function and the oio ontology. This is independent of the specific (mathematical) subject domain. We had to develop the additional ontology oio because existing learning object metadata standards such as LOM [9] failed to describe learning objects sufficiently precise for intelligent components to integrate them automatically into the students’ learning work flow. For instance, LOM’s learningResourceType mixes pedagogical and technical or presentation information: while its values Graph, Slide and Table describe the format of a resource, other values such as Exercise, Simulation and Experiment cover the instructional type. They represent different dimensions, hence need to be separated for an improved decision making. Furthermore, several instructional objects are not covered by LOM (e. g.,
Figure 2. Overview of the Ontology of Instructional Objects
definition, example). As a result, LOM fails to represent the instructional types sufficiently precisely to allow for automatic usage of learning objects, in particular if this involves complex pedagogical knowledge necessary for effective learning support. For instance, LOM has no easy way to determine to what extent a resource annotated with Graph can be used as an example. The then novel ontology of instructional objects (oio) [5] contains this previously missing information. Its classes are shown in Figure 2. The root class of the ontology is instructionalObject. Central to the ontology is the distinction between the classes fundamental and auxiliary. The class fundamental subsumes instructional objects that describe the central pieces of domain knowledge (concepts). Auxiliary elements include instructional objects which contain additional information about the fundamentals as well as training and learning experience. The oio enables several of ActiveMath's advanced pedagogical features and it allows defining the course generation knowledge such that it is independent of the specific domain. Applications of the ontology in areas other than course generation were investigated in the European Network of Excellence Kaleidoscope and published in [10]. Moreover, the oio was used for a revised version of the ALOCoM ontology [11], in the e-Learning platform e-aula [12], and in the CampusContent project of the Distance University of Hagen [13]. The oio ontology facilitates the process of making third-party repositories available to the course generator, which can assemble a course from resources of different repositories. The oio helps to provide the course generator's functionality as a service to systems that register their repositories (described in Section 2.3).
In [14] we showed that it can also be applied to completely different domains (e.g., for work-flow-embedded e-Learning).
1.2.2. Metadata
Metadata used by ActiveMath can be divided into three main categories: general administrative metadata, mathematical metadata, and pedagogical metadata. For general annotations of LOs such as the title of the item, the date of its creation, name(s) of author(s), copyright information and so on, ActiveMath uses the standard Dublin Core metadata element set. The Rights element/values used for specifying copyright are replaced by the standard Creative Commons metadata. Mathematical metadata define mathematical types of LOs, such as definition, theorem, and exercise, and relations between them. There are several kinds of mathematical relations between LOs. The most frequently used are: (1) the domain prerequisite relation, which indicates that a concept is needed in order to introduce the current concept, and (2) the mixed mathematical and pedagogical for relation, which indicates that a LO relates to a concept to define, explain, illustrate, introduce, prove, or train it.2 Pedagogical metadata include some metadata imported from LOM, such as learning context, difficulty, field, and abstractness. They define parameters of 'auxiliary' LOs that help the components of ActiveMath to act intelligently and to model the student appropriately. Competency metadata are assigned to 'auxiliary' LOs. Following the approach of Anderson and Krathwohl [15], a competency is represented as a pair of a cognitive process and one or more domain concepts. This metadata defines a skill the LO addresses. ActiveMath can relate to several competency schemes, such as Bloom's Taxonomy of Learning Goal Levels, the PISA competencies [16], and an extension of Anderson and Krathwohl's scheme described in [17].
2. Web-Services and Components of ActiveMath
All components and communications in Figure 1 use the described knowledge representation, most importantly the exercise subsystem with its diagnosis, the course generator, and the student model.
2.1. Diagnostic Services
Diagnosis is a basis for generating feedback in interactive exercises, which is an efficient alternative to authoring (limited) feedback. ActiveMath has a generic framework for distributed diagnostic services. This framework implements interfaces for connecting different kinds of remote services using existing protocols supported by web applications. ActiveMath can query two kinds of services for diagnosing the student's input: (generic) computer algebra systems (CASs) and domain reasoner services, which rely on formalizations2 of human-like reasoning and possibly frequently used incorrect rules in specific mathematical domains. The semantic OpenMath markup for mathematical formulæ and a generic format for queries to the diagnostic services support the interoperability of different CASs and reasoners (see below). ActiveMath can communicate with CAS and domain reasoner services which have so-called phrasebooks translating OpenMath formulæ into the actual syntax of the service and, vice versa, translating computation results back to OpenMath for ActiveMath. Because the OpenMath format represents the semantics of mathematical formulæ, such phrasebooks can always be implemented and already exist for a number of CASs, e.g., for Maple [18], Mathematica [19], Maxima [20], Yacas [21], WIRIS [22]. Currently, ActiveMath integrates and communicates with the following CASs: YACAS, Maxima, and WIRIS. Figure 3 shows the different ways of connecting to CASs that we realized in our framework, depending on a CAS's implementation: the WIRIS server is connected to ActiveMath via XML-RPC and has an internal OpenMath phrasebook. YACAS has native support for OpenMath and communicates directly via an internal OpenMath protocol. The Maxima server communicates via WSDL and the queries are piped through an external phrasebook. Currently, the most frequently used CAS connected to ActiveMath is YACAS, since it is modular, easily extensible, and open source. New domains can be attached to YACAS by extending domains that are represented as modules in the form of scripts that can be attached as parameters to the YACAS process or loaded into the running system on the fly. CASs are very efficient and fast in providing the diagnoses needed for the generation of flag feedback (correct/incorrect feedback) as well as for a correct solution of a given problem. CAS services are also used for creating so-called randomized exercises, in which the complete solution of an interactive exercise is parametrized.
2 Typically, a cognitive task analysis should be the basis for introducing the for relation for exercises.
For every admissible instantiation of the parameters, a concrete exercise and its solution can be generated. The Randomizer of the exercise subsystem of ActiveMath generates exercises by instantiating the parameters with randomly chosen values from defined ranges over sets (of numbers or functions) and intervals. Since the solution of each step of a problem is represented as a mathematical expression, for each randomized exercise the student's answers can be diagnosed as correct or incorrect by a CAS. For more information about the Randomizer component see [23]. More diagnoses can be obtained when a domain reasoner is available for the mathematical domain(s) of the exercise, e.g., errors in a solution, the student's solution strategy, irrelevant steps. A domain reasoner can respond to queries which are used to generate hints for the learner, such as
• the next step
• the correct input for the current step
• the number of steps to the final solution, etc.
An example of a domain reasoner service is SLOPERT [24], which encapsulates expert and buggy human-like rules for the (mathematical) domain of symbolic differentiation. This service maintains an internal state and, thus, can trace
Figure 3. Diagnosis framework architecture
the (partial) solution of the student and diagnose his/her errors. Another domain reasoner connected to ActiveMath is MathCoach [25], which is, however, stateless and cannot trace a student's solution. A series of domain reasoners is being developed at the Open Universiteit Netherlands [26] using Haskell. They can respond to queries similar to those of ActiveMath (see Section 2.1.2). Currently, ongoing work is implementing rule-based domain reasoners in the form of YACAS modules that could provide more sophisticated stepwise diagnosis and answer the ActiveMath-specific queries described in Section 2.1.2.
2.1.1. Service Query Architecture
ActiveMath implements a novel service architecture for the diagnosis of a student's actions in mathematical problem solving. A broker architecture distributes queries to external diagnosis services as shown in Figure 3. The Query Broker accesses those services that are registered for the (mathematical) domain needed for the diagnosis. This domain is recognized by the 'context' parameter of the query to the semantic services. For instance, a domain reasoner for symbolic differentiation is only queried for (sub-)problems in symbolic differentiation. The subscribed mathematical services themselves can also send a query back to the Query Broker in case a subexpression belongs to another domain and has to be analyzed by another reasoner. For example, a domain reasoner for symbolic differentiation can send a query back to the broker if it needs to simplify an arithmetic expression. The Query Broker passes this new query to a CAS or an arithmetic domain reasoner. Few other systems try to make mathematical services and provers accessible through the web. Examples are the MONET services [27] and MathServe [28].
2.1.2. Queries
In ActiveMath, generic queries are used to access any diagnosis service. The queries include a number of dimensions, one of them being context, a novel construct. A context defines (sub-)sets of rules and functions that a domain reasoner or a CAS is allowed to use for a response to the query. The background for this restriction is that the student's learning situation determines which 'rules' and functions he/she is supposed to employ. That is, usually a student input such as Solve(expression) (where Solve is a CAS command) won't be accepted even though it is semantically equivalent to a correct result. For instance, suppose the task of the student is to differentiate the function f(x) = (x + 1) · x. If the student has not yet learned the product rule, a reasonable and correct next step would be an arithmetic transformation that removes brackets. Using the product rule would not be expected from the student. In this case, the evaluation of the student's answer needs to exclude the product rule from the context but include the arithmetic context. In order to formalize queries used for diagnosis, we defined a format for queries including:
• the action of the query, with the commands explained below
• a (list of) input expressions to be evaluated or compared with each other, depending on the action
• the context of the action, identifying the set of applicable rules, e.g., arithmetic, differentiation, logic
• the number of iterations, which defines how many atomic steps the domain reasoner should perform in the given context.
In the following, e, e1, e2 are OpenMath expressions, C is a context of a query, and N is the number of iterations. A solution path is a list of results of consecutive rule applications, which are annotated with rule identifiers.
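The notions of context and solution path can be sketched as a tiny string-rewriting system. The rules, rule names, and contexts are invented for illustration; the real domain reasoners operate on OpenMath terms, not strings:

```python
# Each rule maps an expression to a successor; a context restricts which rules
# may fire. A solution path is a list of (rule identifier, result) pairs.
RULES = {
    "expand":       lambda e: e.replace("(x+1)*x", "x*x+x"),
    "product_rule": lambda e: e.replace("d/dx[(x+1)*x]", "1*x+(x+1)*1"),
}
CONTEXTS = {
    "diff_no_product": ["expand"],                 # product rule excluded
    "diff_full":       ["expand", "product_rule"],
}

def solution_paths(expr, context, n):
    """All paths of length n starting at expr, using only rules of the context."""
    if n == 0:
        return [[]]
    paths = []
    for rule in CONTEXTS[context]:
        succ = RULES[rule](expr)
        if succ != expr:  # the rule actually applied
            for rest in solution_paths(succ, context, n - 1):
                paths.append([(rule, succ)] + rest)
    return paths
```

In the restricted context only the arithmetic transformation is available, mirroring the situation of a student who has not yet learned the product rule.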
Currently, the following queries to diagnostic services are used in ActiveMath:
• query(getResults, e, C, N) – returns the list of final nodes of all paths of length N starting at e in the context C
• query(compare, e1, e2, C, N) – returns true if there exists a path of length N from e1 to e2 in the context C, false otherwise
• query(getRules, e, C) – returns the list of the identifiers of expert rules applicable to e in context C
• query(getBuggyRules, e1, e2, C, N) – returns the list of the identifiers of all buggy rules that belong to a path from e1 to e2 in the context C. This query is possible for those domain reasoners that can reason with (typical) buggy rules and for some CASs, which can be extended to do so.
• query(getUserSolutionPaths, e1, e2, C, N) – returns the list of all paths of length N from e1 to e2 in the context C
• query(getExpertSolutionPaths, e, C, N) – returns the list of all paths of length N starting at e in the context C. In this query, C can consist of expert rules only.
• query(getNumberOfStepsLeft, e, C) – returns the number of steps left to reach the final node of the shortest expert solution path in context C
• query(getRelevance, e1, e2, C) – returns true if the expression e2 is closer than e1 to the actual solution in the context C.
For example, a query for information about the next two steps for computing the derivative of the function f(x) = (x + 1) · x using only arithmetic simplifications and differentiation rules except for the product rule looks as follows: query(getResults, (x + 1) · x, C, 2), where C is the composite context formed by concatenating the arithmetical context and the differentiation rules without the product rule.
2.2. Student Model
In most educational content, the metadata related to competencies refer to one of the (standard) taxonomies. For instance, the PISA specification for mathematics 'competencies' includes think, argue, model, solve, represent, language, tools. Hence, to achieve interoperability a student model needs to be able to adopt the different frameworks for competencies/skills which are used in educational contents/systems. ActiveMath's semantics-aware student model (SLM) is flexible enough to act upon contents using different competency frameworks. Currently, it can choose between the competency taxonomies used in PISA [16], Bloom's taxonomy [29], and the more recent two-dimensional taxonomies as described in [15,17]. Moreover, as a system that relies on evolving content produced by a community of authors, ActiveMath needs to adapt to (frequent) modifications of the domain model as well. Therefore, the structure of the SLM is dynamically generated from the metadata in the content representation. Because of the focus of this chapter, we will mainly describe the build process of the SLM rather than go into detail on the updating of competency values through Item Response Theory and the Transferable Belief Model [30]. Note, however, that metadata such as the difficulty and competency level of the recent exercise influence the updating process: difficulty provides a parameter of IRT. The generation of the student model is data-driven.
The structure of the SLM consists of nodes, one for each concept for which competencies are estimated. The SLM automatically creates a node for each concept k included in the current learning content of a student, e.g., the concept 'definition of fraction' or the rule 'addition of fractions with unlike denominators'. See Figure 4. The SLM stores each associated competency value m(k, p) within the node, where a competency is defined as a pair (k, p), in which p is a cognitive process, such as apply an algorithm or model a mathematical problem, that is applied to k. For one-dimensional competency frameworks such as PISA's or Bloom's, the SLM translates the competencies to the two-dimensional framework. Inter-node relations are dynamically extracted from the content metadata, most importantly the prerequisite relationship, which determines propagation in the SLM. See Figure 4 for an illustration. The student model communicates its results to ActiveMath's event system, to which every other component can listen.
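The data-driven build-up and prerequisite propagation of the SLM can be sketched as follows. The concept identifiers, the update rule, and the propagation factors are invented stand-ins for the actual IRT/Transferable-Belief-Model machinery:

```python
class SLM:
    """One node per concept; each node stores competency values m(k, p)
    for pairs of a concept k and a cognitive process p."""
    def __init__(self):
        self.nodes = {}          # concept id -> {process: value in [0, 1]}
        self.prerequisites = {}  # concept id -> prerequisite concept ids

    def add_concept(self, k, prerequisites=()):
        self.nodes.setdefault(k, {})
        self.prerequisites[k] = list(prerequisites)

    def update(self, k, p, evidence, weight=0.5, spread=0.2):
        """Move m(k, p) toward new evidence; propagate a weaker update
        along the prerequisite relation."""
        old = self.nodes[k].get(p, 0.0)
        self.nodes[k][p] = old + weight * (evidence - old)
        for pre in self.prerequisites[k]:
            pre_old = self.nodes[pre].get(p, 0.0)
            self.nodes[pre][p] = pre_old + spread * (evidence - pre_old)

slm = SLM()
slm.add_concept("def_fraction")
slm.add_concept("add_unlike_denominators", prerequisites=["def_fraction"])
slm.update("add_unlike_denominators", "apply", evidence=1.0)
```

Success on the fraction-addition rule raises the estimate for that concept and, more weakly, for its prerequisite.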
Figure 4. Structure of the student model and its propagation links
2.3. Course Generation Service
The course generator of ActiveMath (Paigos) is designed as a component whose services can be accessed by other learning environments, too. This requires that Paigos and those learning environments share a common understanding of the type of courses that are to be generated, as described in Section 2.3.1. Then, Section 2.3.2 describes the communication between Paigos and its clients.
2.3.1. Representations
Pedagogically sensible course generation requires reasoning about learning scenarios. In traditional course generation systems, scenarios consist only of learning objects, which represent the target content that is to be learned. Such an approach ignores that the selection and sequencing of LOs also depend on the course's purpose. For instance, the LOs in a course preparing for an exam may differ from the LOs in a guided tour. Furthermore, in the Web of today, where systems are no longer standalone but embedded in the eco-system of the Web, the representation of the scenarios should enable communication about and exchange of scenarios between different systems and services. Thus, the representation needs to contain sufficient semantic information to enable such functionality. Van Marcke [31] introduced the concept of an instructional task, which represents an activity that can be accomplished during the learning process. This helps to define scenarios more accurately since both the content and the instructional task are essential aspects of a learning goal. Therefore, we define scenarios as a combination of the two dimensions content and task. A scenario is a tuple t = (p, L), where p is an identifier of the instructional task and L is a list of learning object identifiers. L specifies the course's target concepts, and p influences the structure of the course and the learning objects selected.
Table 1. A selection of tasks used in Paigos

  Identifier                     Description
  discover                       Discover and understand concepts in depth
  rehearse                       Address weak points
  trainWithSingleExercise        Increase mastery using a single exercise
  illustrate                     Improve understanding by a sequence of examples
  illustrateWithSingleExample    Improve understanding using a single example
For instance, the instructional task to discover and understand content in depth is called discover. Let us assume that def slope and def diff are the identifiers of the learning objects that contain the definitions of the mathematical concepts "average slope of a function" and "definition of the differential quotient", respectively. The scenario for a learner who wants to discover and understand these two concepts is t = (discover, (def slope, def diff)). Table 1 contains a selection of tasks that Paigos can currently process. Tasks can be internal tasks that are of potential interest only for system components. Public tasks need to be described sufficiently precisely to enable communication between different components, services, and systems. Their description contains the following information:
• the identifier of the task
• the number of concepts the task can be applied to. A task can either be applied to a single concept (cardinality 1) or to multiple concepts (cardinality n)
• the type of learning object (as defined in the oio) that the task can be applied to
• the type of course to expect as a result. Possible values are either course, in case a complete course is generated, or section, in case a single section is returned. Even if the course generator selects only a single learning object, the resource is included in a section.
• an optional element condition that is evaluated in order to determine whether a task can be achieved. An example is ActiveMath's item menu for adding content. Its entries are displayed only if the corresponding tasks can be achieved. E.g., if there are no examples available for def slope, then the task (illustrate, (def slope)) cannot be achieved, so it won't be displayed.
• a concise description of the purpose that is used for display in menus, for instance the item menu.
Figure 5 displays the task trainWithSingleExercise.
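The scenario tuple t = (p, L) and the optional achievability condition can be sketched as follows. The repository contents and the helper name are invented, and identifiers use underscores in place of spaces:

```python
# A scenario pairs an instructional task with target learning-object identifiers.
# Toy repository: which examples exist "for" which concept.
EXAMPLES_FOR = {"def_slope": [], "def_diff": ["example_diff_1"]}

def achievable(scenario):
    """Sketch of the optional 'condition' element: an illustrate task is only
    offered if at least one example exists for every target concept."""
    task, concepts = scenario
    if task == "illustrate":
        return all(EXAMPLES_FOR.get(c) for c in concepts)
    return True
```

With this condition, (illustrate, (def_slope)) is not achievable (no examples exist for def_slope) and would be hidden in the item menu, while (discover, (def_slope, def_diff)) remains available.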
It is applicable to a single educational resource of the type fundamental and returns a result in case the condition holds.
2.3.2. Course Generation Interfaces and Communication
Paigos provides two main Web interfaces: the core interface, which contains the methods for course generation, and the interface that allows a client to register at Paigos. The core interface consists of the following methods:
171
[Figure 5 excerpt: a task representation containing (class Exercise), (relation isFor ?c), (property hasLearningContext ?learningContext), and a <description> element "Train the concept" (German: "Übe den Inhalt").]
Figure 5. A public task representation
1. The method getTaskDefinitions is used to retrieve the pedagogical tasks which the course generator can process.
2. The method generateCourse starts the course generation on a given task. The client can make information about the learner available either through a pointer to his/her student model or via a list of property-value pairs.
It is important to note the difference between the Web interface and the interface used internally in ActiveMath. In order to achieve interoperability, the Web interface returns an IMS Manifest consisting of references to LOs. The internal interface returns a table-of-contents-like structure whose leaves are references or a special kind of tasks called dynamic tasks, which are instantiated with specific LOs only at the time the user first visits the corresponding page. This approach has the advantage that the table of contents is generated immediately, and thus the student has an overview of the complete course.
The interface for repository registration consists of the following methods:
1. The method getMetadataOntology informs the client about the metadata structure used in Paigos. It returns the ontology of instructional objects.
2. The method registerRepository registers the repository that the client wants the course generator to use. The client has to provide the name and the location (URL) of the repository. Additional parameters include the ontology that describes the metadata structure used in the repository and the mapping of the OIO onto the repository ontology.
3. The method unregisterRepository cancels the registration of a repository.
The interface ResourceQuery is used to query the repository about properties of learning objects. It consists of the following methods:
1. queryClass returns the classes a specified resource belongs to;
2. queryRelation returns the set of identifiers of those learning objects the given resource is related to via the specified relation;
Figure 6. Sequence diagram of repository registration
3. queryProperty returns the set of property-value pairs the specified resource has.
The LearnerPropertyAPI makes the learners' properties accessible to Paigos in case the client has a learner model and wants the course generator to use it. In the current version of Paigos, this interface is not yet implemented. It would require a mediator architecture similar to the one used in ActiveMath for repository integration [4].
The result of the course generation is a structured sequence of learning objects represented in an IMS Manifest [32]. Since the result does not contain the resources themselves but only references, it is not an IMS CP or SCORM package. Because ActiveMath focuses on personal learning only, IMS Learning Design has not been considered useful yet.
A repository is registered in the following way (for a sequence diagram illustrating the registration, see Figure 6): in a first step, the client (LMS-Client in the figure) retrieves the metadata ontology used in Paigos. The ontology is then used to generate a mapping between the OIO and the ontology representing the client metadata (Step 2); the currently existing mappings were manually authored. Then, the repository is registered using the method registerRepository (Step 3). The repository is added to the list of available repositories and made known to the mediator, a component of Paigos that allows the integration of third-party repositories (Step 4). Subsequently, the mediator fetches the ontology mapping from the client and automatically generates a wrapper for querying the contentAPI of the client.
A client starts the course generation using the service method generateCourse. In a first step, Paigos checks whether the task is valid. If so, the course is generated. During the generation process, Paigos sends queries to the mediator, which passes them on to the repository.
After the course is generated, the omgroup (the element OMDoc uses for grouping elements) generated by Paigos is transformed into an IMS Manifest and sent to the client.
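The registration and query flow can be sketched as follows. This toy stand-in for the Paigos mediator is an assumption for illustration; method names mirror the text but not the real implementation, and query forwarding to the repository is omitted.

```python
class Mediator:
    """Toy stand-in for the Paigos mediator component (illustrative only)."""
    def __init__(self):
        self.repositories = {}

    def register_repository(self, name, url, oio_mapping):
        # Steps 3-4: make the repository known and keep its OIO mapping,
        # from which a query wrapper could be generated.
        self.repositories[name] = {"url": url, "mapping": oio_mapping}

    def query(self, name, oio_class):
        # During course generation, an OIO class is rewritten into the
        # repository's own metadata vocabulary before the query is forwarded.
        repo = self.repositories[name]
        return repo["mapping"].get(oio_class, oio_class)

mediator = Mediator()
# Step 2: a manually authored mapping of the OIO onto the client ontology
mediator.register_repository("lms-client", "http://example.org/repo",
                             {"Exercise": "lom:Problem"})
print(mediator.query("lms-client", "Exercise"))  # → lom:Problem
```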
3. Presentation and Management of Mathematical Expressions
For different users, mathematical formulæ can be presented in different formats (for print or browser) and forms, depending on the user's (cultural) context and preferences. Moreover, the diversity of the rendering forms also builds on the diversity of notations in mathematical practice, e.g., the fact that sin² x is written without brackets while sin² (x + y) is written with brackets, even though the mathematical symbol is the same.
Most solutions for rendering mathematics in browsers focus on a single presentation language which can be rendered in multiple browsers. For instance, jsMath³ or Wikipedia's texvc⁴ use a subset of LaTeX to allow for authoring of mathematical formulæ with presentation markup, i.e., usable only for rendering. This does not, however, solve the presentation problem for student interactions and search as needed in an e-Learning system.
In order not to confuse the student, to avoid cognitive overload, and to support smooth interaction with content and tools, the presentation of mathematical expressions should be the same in all tools of the learner's learning experience. For instance, whether the learner uses a curve plotter, the lexicon/search tool, the input editor, or a CAS service, he/she should see the same presentation of a symbol in any application. At first, this seems trivial, but it is not:
1. Interactions occur in interactive exercises for which the student's input is evaluated. Interactions also occur with (GUIs of) interactive tools such as computer algebra systems or function plotters. In this case, the semantic and computable nature of the mathematical object is required for consistency. Hence, the presentation in any of the tools cannot be hard-coded but needs to be generated. The generation of a presentation is also required because of the need for cultural adaptation, which requires culture-specific presentations for a number of symbols and expressions.
2. Search for mathematical formulæ needs to be independent of the actual rendering and should exclude mismatches such as x + y² when the user queried for x².
ActiveMath responds to the need to render formulæ consistently in the content as well as in interactions, and to search semantically, by processing formulæ in OpenMath as the semantic basis for presentation and management (search, input, copy and paste, etc.) of mathematical expressions.
3.1. Adaptive Rendering of Formulæ
Following common web practice, all interactions in ActiveMath occur in a web browser and applets. The browser's interactions with the web server involve the generation of presentation code (refer to Figure 1). As much as possible, ActiveMath uses its generic presentation architecture to produce the rendering of mathematical formulæ based on their semantic representation. Rendering of
³ jsMath is a JavaScript library that renders TeX within the browser; see http://www.math.union.edu/~dpvc/jsMath/
⁴ texvc is an add-on to MediaWiki, explained at http://en.wikipedia.org/wiki/Texvc.
formulæ is part of the presentation process of ActiveMath, which aims at delivering browser code of the content. This delivery depends on the context and preferences of the user, which include the following dimensions:
1. the format of delivery, which is mostly a choice of the user (currently HTML+CSS, TeX/PDF, and XHTML+MathML are supported);
2. the language of the user, which impacts the notations;
3. the educational context and field of study;
4. the course that is currently delivered.
The delivery converts OMDoc items into chunks of browser code based on the format, language, and notation. XSLT transformations are used to this end. The XSLT transformations are partially generated from a set of notations which associate OpenMath prototypes (expressions with variables as placeholders) with presentation templates. The resulting adapted rendering of mathematical content is in line with a user's cultural customs, while at the same time it keeps its meaning through the underlying OpenMath expressions.
3.2. Input of Mathematical Formulæ
An interaction for which OpenMath is crucial, too, is the input of formulæ. In ActiveMath, mathematical formulæ can be input in three ways:
Input Editor: The input editor of ActiveMath is palette-based and can be used on different platforms. It is easily accessible to a novice user for the input of symbols. It is implemented as a Java applet which internally edits an OpenMath expression. Its palettes are configurable by a skilled author. Its rules for transforming OpenMath expressions to rendering code employ internal rules and notations central to ActiveMath, thus achieving consistency for a student.
Textfields: Because not all students want to work with such an input editor, ActiveMath enables a linear input syntax as well. Its syntax resembles that of Maple, and its output is OpenMath.
Linear Input for Authors: In the authoring environment, mathematical formulæ are input with a linear syntax (OQMath), which is configurable by notation files.
These methods can be complemented by a copy-and-paste facility: a feature of ActiveMath's presentation is a reference to a URL at which the OpenMath term is available. Pasting the URL representing this term is interpreted by the input editor and other recipients (linear input, function plotter, etc.) as a request to fetch and insert the OpenMath term.
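The notation-driven rendering of Section 3.1, where a prototype is matched and a context-specific presentation template is applied, can be sketched as follows. The notation table and the expression encoding are simplified assumptions, not ActiveMath's actual notation format.

```python
# symbol -> {rendering context: presentation template}; purely illustrative
NOTATIONS = {
    "divide": {"en": "{0}/{1}", "fr_frac": "\\frac{{{0}}}{{{1}}}"},
    "power":  {"en": "{0}^{1}", "fr_frac": "{0}^{1}"},
}

def render(expr, context="en"):
    """expr is either a variable name or a tuple (symbol, arg, arg)."""
    if isinstance(expr, str):
        return expr
    symbol, *args = expr
    template = NOTATIONS[symbol][context]
    # Recursively render the arguments, then fill the template's placeholders
    return template.format(*(render(a, context) for a in args))

expr = ("divide", "x", ("power", "y", "2"))
print(render(expr))             # → x/y^2
print(render(expr, "fr_frac"))  # → \frac{x}{y^2}
```

The same semantic expression thus yields different renderings per context while its meaning stays fixed, which is the design point of keeping OpenMath underneath every presentation.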
4. Conclusion
The article described several Semantic Web features of the e-Learning platform ActiveMath. The backbone of many of these features is the semantic knowledge representation for mathematics, OMDoc, and its ActiveMath metadata extensions. This representation is processed by a number of components of the platform, whose architecture and communication are conveyed at a high level.
We explained relevant features of web services and components used for adaptive course generation, for the diagnosis of student input in exercises, as well as for the presentation and management of mathematical expressions.
4.1. Future Work
The OIO ontology has been adopted and extended by other groups. We hope this will also happen to the interoperable services which are currently used by ActiveMath. Currently, the ActiveMath group is in the process of reusing learning material originally devised for other learning environments. For this purpose, however, mathematical semantics and metadata have to be added to the content.
Currently, ActiveMath can exchange basic student profile information with other applications such as Moodle, but does not (yet) exchange detailed student model information. This is a future goal. Paigos was successfully used by two third-party systems: MathCoach (a learning tool for statistics [25]) and Teal (workflow-embedded e-learning at the workplace [14]). Future work on Paigos is necessary to realize a mediator-like architecture for the generic integration of, and communication with, student models of other applications.
The search tool of ActiveMath has almost been neglected in this article, even though it searches semantically by matching formulæ and their OpenMath trees, and its search for LOs can integrate metadata. Its matching of OpenMath trees is exact, i.e., no equivalent formulation is returned yet. That is, for x + y only x + y would be returned, but not y + x. Normalization is a first step to cope with the various equivalent encodings. Future work will deal with fuzzier search.
4.2. Acknowledgement
This publication has been supported by the projects LeActiveMath (FP6-507826), funded by the EU, and ATuF (ME 1136/5-1), funded by the German National Science Foundation (DFG). The authors are solely responsible for its content. The authors wish to thank Tianxiang Lu for the implementation of the work described in Section 2.3.
References
[1] C. Brooks, J. Greer, E. Melis, and C. Ullrich. Combining ITS and e-learning technologies: Opportunities and challenges. In M. Ikeda, K.D. Ashley, and T.-W. Chan, editors, Intelligent Tutoring Systems (ITS-06), volume 4053 of LNCS, Jhongli, Taiwan, Springer-Verlag, 2006, 278–287.
[2] E. Melis, G. Goguadze, M. Homik, P. Libbrecht, C. Ullrich, and S. Winterstein. Semantic-Aware Components and Services of ActiveMath. British Journal of Educational Technology, 2006, 37(3): 405–423.
[3] E. Melis, E. Andrès, J. Büdenbender, A. Frischauf, G. Goguadze, P. Libbrecht, M. Pollet, and C. Ullrich. ActiveMath: A Generic and Adaptive Web-Based Learning Environment. International Journal of Artificial Intelligence in Education, 2001, vol. 12, 385–407.
[4] P. Kärger, C. Ullrich, and E. Melis. Integrating learning object repositories using a mediator architecture. In W. Nejdl and K. Tochtermann, editors, Proceedings of EC-TEL'06, volume 4227, pages 185–197, Heraklion, Greece, Springer-Verlag, ISBN 9783540457770, 2006.
[5] C. Ullrich. Pedagogically Founded Courseware Generation for Web-Based Learning: An HTN-Planning-Based Approach Implemented in PAIGOS. Number 5260 in Lecture Notes in Artificial Intelligence. Springer, ISBN 978-3-540-88213-8, 2008.
[6] M. Kohlhase. OMDoc: An Open Markup Format for Mathematical Documents (version 1.2). LNCS 4180, Springer-Verlag, Berlin, 2006.
[7] O. Caprotti, D.P. Carlisle, and A.M. Cohen. The OpenMath Standard. The OpenMath Consortium, 2002.
[8] J. Davenport et al. OpenMath Core Content Dictionaries. 2004, http://www.openmath.org/.
[9] IEEE Learning Technology Standards Committee. 1484.12.1-2002 IEEE Standard for Learning Object Metadata, 2002.
[10] A. Merceron, C. Oliveira, M. Scholl, and C. Ullrich. Mining for Content Re-Use and Exchange: Solutions and Problems. In Poster Proceedings of the 3rd International Semantic Web Conference, ISWC2004, pages 39–40, Hiroshima, Japan, November 2004.
[11] C. Knight, D. Gašević, and G. Richards. An ontology-based framework for bridging learning design and learning content. Educational Technology and Society, 2006, 9(1): 23–37.
[12] P. Sancho, I. Martínez, and B. Fernández-Manjón. Semantic web technologies applied to e-learning personalization in e-aula. Journal of Universal Computer Science, 2005, 11(9): 1470–1481.
[13] B.J. Krämer. Reusable learning objects: Let's give it another trial. Forschungsberichte des Fachbereichs Elektrotechnik, ISSN 0945-0130, Fernuniversität Hagen, 2005.
[14] O. Rostanin, C. Ullrich, H. Holz, and S. Song. Project TEAL: Add adaptive e-learning to your workflows. In K. Tochtermann and H. Maurer, editors, Proceedings of I-KNOW'06, 6th International Conference on Knowledge Management, Graz, Austria, 2006, 395–402.
[15] L.W. Anderson, D.R. Krathwohl, P.W. Airasian, K.A. Cruikshank, R.E. Mayer, P.R. Pintrich, and M.C. Wittrock. A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. Longman, New York, 2001.
[16] OECD, editor. Learning for Tomorrow's World: First Results from PISA 2003. Organisation for Economic Co-operation and Development (OECD) Publishing, 2004.
[17] E. Melis, A. Faulhaber, A. Eichelmann, and S. Narciss. Interoperable competencies characterizing learning objects. In B. Woolf, E. Aïmeur, and R. Nkambou, editors, Proceedings of the International Conference on Intelligent Tutoring Systems, ITS-2008, volume 5091 of LNCS, Springer-Verlag, 2008, 416–425.
[18] http://www.maplesoft.com
[19] http://www.wolfram.com
[20] http://maxima.sourceforge.net
[21] http://yacas.sourceforge.net
[22] http://www.wiris.com
[23] M. Dudev and A. González Palomo. Generating Parametrized Exercises. Student Project at the University of Saarland, 2007, http://www.matracas.org/escritos/edtech_report.pdf
[24] C. Zinn. Supporting tutorial feedback to student help requests and errors in symbolic differentiation. In M. Ikeda and K.D. Ashley, editors, Proceedings of the 8th International Conference on Intelligent Tutoring Systems, ITS-2006, volume 4053 of LNCS, Springer-Verlag, 2006, 349–359.
[25] B. Grabowski, S. Gäng, J. Herter, and T. Köppen. MathCoach und LaplaceSkript: Ein programmierbarer interaktiver Mathematiktutor mit XML-basierter Skriptsprache. In K.P. Jantke, K.-P. Fähnrich, and W.S. Wittig, editors, Leipziger Informatik-Tage, volume 72 of LNI, GI, 2005, 211–218.
[26] B. Heeren, J. Jeuring, A. van Leeuwen, and A. Gerdes. Specifying Strategies for Exercises. In S. Autexier, J. Campbell, J. Rubio, V. Sorge, M. Suzuki, and F. Wiedijk, editors, AISC/Calculemus/MKM 2008, LNAI 5144, Springer-Verlag, 2008, 430–445.
[27] MONET Architecture Overview. The MONET Consortium, Deliverable D04, March 2003.
[28] J. Zimmer and S. Autexier. The MathServe System for Semantic Web Reasoning Services. In Proceedings of the 3rd International Joint Conference on Automated Reasoning (IJCAR'06), volume 4130, Springer-Verlag, 2006, 140–144.
[29] B.S. Bloom, editor. Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. Longmans, Green, New York, Toronto, 1956.
[30] A. Faulhaber and E. Melis. An efficient student model based on student performance and metadata. In M. Ghallab, C.D. Spyropoulos, N. Fakotakis, and N. Avouris, editors, 18th European Conference on Artificial Intelligence (ECAI-2008), volume 178 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2008, 276–280.
[31] K. Van Marcke. GTE: An epistemological approach to instructional modelling. Instructional Science, 1998, 26: 147–191.
[32] IMS Global Learning Consortium. IMS Content Packaging Information Model, 2003.
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-178
CHAPTER 10
An Intelligent Framework for Assessment Systems
Sonja D. RADENKOVIĆ¹, Vladan DEVEDŽIĆ and Nenad KRDŽAVAC
FON – School of Business and Administration, University of Belgrade, Serbia
Abstract. This chapter presents the development of a computer-assisted intelligent assessment system. The system is based on the IMS QTI standard and designed by applying the Model Driven Architecture (MDA) software engineering standards and artificial intelligence and description logic reasoning techniques based on the tableau algorithm. The chapter also shows the use of metamodel transformations between concrete languages. We propose a framework for assessment systems that is reusable and extensible and that facilitates interoperability between its component systems. The chapter also defines the consistency of tests. Keywords: Assessment systems, description logics, model driven architecture, semantic web.
Introduction
In modern assessment systems, the huge possibilities of the Semantic Web [1], in which computer software as well as people can find, read, understand, and use data over the World Wide Web to accomplish useful goals for users, are limited by the fact that the end user still needs to take care of the data. The main problem in creating modern assessment systems on the Semantic Web is how to exchange and enrich knowledge in heterogeneous systems, as well as how to provide a way for knowledge evaluation. Whenever assessment experts exchange assessments electronically, whatever software and hardware systems they use, interoperability enters the scene. Interoperability is the capability of software systems to use the same formats for storing and retrieving information and to provide the same service on different hardware and software platforms [2].
This chapter proposes a way to create a flexible, interoperable assessment system that is easy to maintain and reuse. It is based on the IMS Question and Test Interoperability (QTI) standard [3] and designed using the Model Driven Architecture (MDA) standard [4] that comes from software engineering. The core concept here is to change the system's specification rather than its implementation, using the Unified Modeling Language [5] as the standardized modeling language that most tools provide support for.
¹ Corresponding Author: Sonja Radenković, FON – School of Business and Administration, University of Belgrade, Serbia; Email: [email protected].
S.D. Radenkovi´c et al. / An Intelligent Framework for Assessment Systems
179
One of the main ideas the chapter proposes is using Description Logics (DLs) [6] reasoning techniques for intelligent analysis of students' answers and solutions to the problems they are working on during assessment sessions with the system. The chapter also examines the application of DL reasoning techniques in analyzing the consistency of assessment tests. The use of an MDA-based DL reasoner called LORD² [25] enables the processing of open-ended questions, which is a novelty that can be applied in the IMS QTI standard. Furthermore, this is a way of applying a framework for data sharing and reuse across heterogeneous applications, which is the core of the Semantic Web. Last but not least, designing and developing reliable, robust, well-architected, and extensible software applications or tools in any field requires conformance to sound principles and rules of software engineering. E-Learning systems are no exception to that rule. We have developed QTI-FAR, a Framework and an Architecture for designing and implementing QTI-based assessment systems (or just QTI systems, for short) and for semantic analysis and evaluation of students' answers and solutions acquired through the use of such a system.
The chapter is organized as follows. The next section describes the problem statement. Sections 2 and 3 describe the basic principles of the IMS QTI standard and the MDA standard, respectively. Section 4 presents the modeling of a QTI-based assessment system using the MDA standard. Section 5 describes reasoning with QTI models, whereas Section 6 shows reasoning with QTI tests. Section 7 presents the QTI-FAR framework and the generic architecture of QTI systems. The last section presents conclusions and indicates directions for future work.
1. Problem Statement
In order to create a flexible assessment system capable of presenting and analyzing the students' solutions and answers, it is necessary to define the system requirements precisely. These requirements are:
• The system should be based on the IMS QTI standard, a general specification created to facilitate interoperability between a number of subsystems of a modern assessment system in relation to the actors that use them [26];
• The system should be designed using the MDA standards in order to be flexible, interoperable, and easy to maintain and reuse, changing the system's specification rather than its implementation [26];
• It has to enable retesting until the student presents a critical amount of knowledge necessary to pass the course;
• It has to provide a well-documented content format for storing items, in a way independent of the authoring tool used to create them [3];
• It has to support the deployment of item banks across a wide range of learning and assessment delivery systems [3];
• It has to support the deployment of items and item banks from diverse sources in a single learning or assessment-delivery system [3];
• It should provide other systems with the capability to report the test results in an intelligent manner, using DL reasoning techniques such as concept classification and consistency checking [26].
By creating a system that accomplishes the above requirements, one obtains: a flexible and quickly developed assessment environment; high-level interoperability between various component systems; easy and intuitive testing of the student's knowledge; and an extensible system that can be improved by including new subsystems.
² Description LOgic Reasoner
2. IMS QTI Standard
The IMS QTI standard [3] specifies how to represent question (assessmentItem) and test (assessmentTest) data and the corresponding result reports. Items are the smallest exchangeable assessment objects within this specification [3]. An item is more than a 'question' in that it contains the question and instructions on how it is to be presented, the response processing to be applied to the candidate's response(s), and the feedback that may be presented (including hints and solutions). There are different forms of items, such as multiple-choice questions or fill-in-the-blank tasks [7].
The presentation provides the structure for defining several possibilities for the same question, in order to be able to present the same question in different ways. Each answer within this question can also have different wordings. The response processing allows the author of a test to predefine how the answers to the test will be evaluated. The results of a test can be expressed using the "result reporting"; this definition describes how the results of a test can be recorded so that other systems can make use of them. The feedback component of QTI consists of two types, modal and integrated [3]. Modal feedback is shown to the candidate after the response processing has taken place and before any subsequent attempt or review of the item [3]. Integrated feedback is only shown during subsequent attempts or during review.
There is an exchange of items, assessments, and results between authoring tools, item banks, learning systems, and assessment delivery systems. For interchange between these systems, an XMI binding is provided [8].
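The structure of an item, its response declaration, and its choices can be sketched by building a pared-down choice item as XML. Element names follow QTI 2.x, but attributes and the namespace are omitted here for readability, so this is illustrative rather than schema-valid QTI.

```python
import xml.etree.ElementTree as ET

# Minimal, simplified sketch of a QTI choice item (not valid QTI as-is).
item = ET.Element("assessmentItem", identifier="choice", title="Example")

# Declare the expected response: a single identifier, with the correct
# value recorded in the declaration itself.
decl = ET.SubElement(item, "responseDeclaration", identifier="RESPONSE",
                     cardinality="single", baseType="identifier")
correct = ET.SubElement(ET.SubElement(decl, "correctResponse"), "value")
correct.text = "ChoiceA"

# The item body presents the question and the choices to the candidate.
body = ET.SubElement(item, "itemBody")
interaction = ET.SubElement(body, "choiceInteraction",
                            responseIdentifier="RESPONSE", maxChoices="1")
for ident, text in [("ChoiceA", "Paris"), ("ChoiceB", "Rome")]:
    choice = ET.SubElement(interaction, "simpleChoice", identifier=ident)
    choice.text = text

print(ET.tostring(item, encoding="unicode").startswith("<assessmentItem"))  # → True
```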
3. Model Driven Architecture
Model Driven Architecture is defined as a realization of model-driven engineering principles proposed by the Object Management Group (OMG)³ [9]. It is a software design approach that provides a set of guidelines for structuring specifications expressed as models [10]. MDA defines three viewpoints (levels of abstraction) from which a system can be analyzed. According to [9], three models correspond to these points of view:
• Computation Independent Model – the CIM does not show the details of the system structure and is very similar to the concept of an ontology;
• Platform Independent Model – the PIM defines the system functionality using an appropriate domain-specific language;
• Platform Specific Model – the PSM is a system specification that uses concepts and constructs from a concrete domain-specific language, or a general-purpose language (like Java) that computers can run.
The central concept of the MDA standards is a four-layer modeling architecture (Fig. 1).
³ www.omg.org
Figure 1. Four-layer architecture of MDA [10]
The topmost layer (M3) is called the meta-metamodel layer and corresponds to the CIM. OMG has defined a standard at this layer: MOF (Meta Object Facility). According to [11], MOF is the language intended for defining metamodels at the M2 layer, which corresponds to the PIM. The next layer is the model layer (M1), the layer where we develop real-world models; it corresponds to the PSM. An important, recently defined MOF-based metamodel is the OWL metamodel called the Ontology Definition Metamodel (ODM) [12]. It covers common concepts of ontological engineering, such as classes, properties, resources, etc. To an extent, it is similar to the RDF Schema and OWL languages, commonly used for ontology development. However, since it is MDA- and MOF-based, it has the important advantage of enabling the use of the graphical modeling capabilities of the standardized UML modeling language for ontology development.
Model transformation is the process of converting one model to another model of the same system [13]. Model engineering uses the terms "representedBy" and "conformantTo" [10] to establish relations between the models in the MDA layers. It is possible to define the transformation of one model into another if the metamodels of the different models are made in the same language. This language is defined by the XML Metadata Interchange (XMI) standard [8], which defines how XML tags are used to represent serialized MOF-compliant models in XML. MOF-based metamodels are translated to XML Document Type Definitions (DTDs), and models are translated into XML documents that are consistent with their corresponding DTDs. Using XMI, it is possible to generate a "middleware" environment automatically, as well as to transform one "middleware" platform into another. Automated tools generally perform these translations, for example tools compliant with the OMG standard named QVT [14]. MDA is a very generic approach.
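The "conformantTo" relation between adjacent MDA layers can be illustrated with a tiny sketch: an M1 model conforms to an M2 metamodel if every model element instantiates a metaclass and carries exactly the slots that metaclass prescribes. The dict encoding below is an assumption made purely for illustration.

```python
# M2: a toy metamodel mapping each metaclass to its required slots
METAMODEL = {"Question": ["text"], "Answer": ["text", "correct"]}

def conforms_to(model, metamodel):
    """True if every M1 element instantiates an M2 metaclass correctly."""
    return all(el["metaclass"] in metamodel and
               set(el["slots"]) == set(metamodel[el["metaclass"]])
               for el in model)

# M1: a model whose elements instantiate the metaclasses above
m1_model = [{"metaclass": "Question", "slots": ["text"]},
            {"metaclass": "Answer", "slots": ["text", "correct"]}]
print(conforms_to(m1_model, METAMODEL))  # → True
```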
A more specific one is a widely used implementation of MDA called the Eclipse Modeling Framework (EMF) [15]. The meta-metamodel of EMF is called Ecore. It started as an implementation of MOF and has evolved into a meta-metamodel of its own; it essentially represents a simplified implementation of MOF. A well-known EMF-based language for developing model and metamodel transformations is ATL, the Atlas Transformation Language [16]. It has been defined to perform general transformations within the MDA framework proposed by the OMG. The natural application domain of ATL is to express MDA-style model transformations based on explicit metamodel specifications [27].
4. Modeling the QTI-based Assessment System Using MDA Standards
The main reason for applying the MDA standards [4] in the development of assessment systems is to make a clear difference between conceptual and concrete modeling, in order to automate the transfer and sharing of information and knowledge. The essence of the development of an assessment system based on the IMS QTI standard (the QTI system, for short) using MDA standards is a transformation from the system's platform independent model (PIM) to a platform specific model (PSM). In this case, the PSM is generated using Ecore classes of the Eclipse Modeling Framework (EMF), whereas the transformation from the PIM to the PSM is made using the Atlas Transformation Language (ATL) as the most recent implementation of the QVT standard.
4.1. QTI Metamodel
The first step in the development of a QTI system using the MDA standard is creating the metamodel. For this purpose, we have developed the QTI metamodel [17] that captures the main concepts of the QTI standard. The way the QTI metamodel was developed is illustrated in Figure 2.
Figure 2. The QTI metamodel development
Most of our development is based on the Eclipse development environment and EMF. That is the reason why we need the QTI metamodel, as well as QTI models, that can be used in the Eclipse development environment. Hence we transformed the MOF-based QTI metamodel into the Ecore-based QTI metamodel (Figure 2, middle). That transformation was done by Eclipse plug-ins⁴. The MOF-based QTI metamodel supports the XMI 1.2 standard, whereas the Ecore-based QTI metamodel (EMF repository) supports the XMI 2.0 standard. Because of that, it was necessary to use KM3 [18] as an intermediate model. KM3 is a domain-specific language for defining metamodels, and it is very similar to Java syntax. The MOF-based QTI metamodel was first transformed to KM3, and then further to the equivalent Ecore-based QTI metamodel. This metamodel represents the Platform Independent Model (PIM) for assessment systems based on the IMS QTI standard.
4.2. Creating the QTI Models Based on the QTI Metamodel
Having the Ecore-based QTI metamodel, which is located at the M2 level of the MDA hierarchy (Figure 1), we can create models that correspond to the given metamodel. There are many examples of assessmentItems proposed in the IMS QTI standard (see [3]). In order to illustrate the creation of models that correspond to the given QTI metamodel (Figure 2), we present an example, the simple choice item shown in Figure 3. We have chosen the simplest type of item in the IMS QTI standard. The system expects a single response from the candidate, because only one of the options presented to the candidate is correct. The Ecore-based model for this example is shown in Figure 4.
Figure 3. A simple choice item example 5
Figure 4. Ecore QTI model for the simple choice item
4 www.eclipse.org/gmt
5 http://www.imsglobal.org/question/qti_v2p1pd/examples/items/choice.xml
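The M2/M1 layering described above can be illustrated with a small sketch: the classes stand for a tiny, hypothetical fragment of the QTI metamodel, and the object instantiated at the bottom is a model conforming to it. All names and option texts are illustrative simplifications, not the actual Ecore-generated API.

```python
from dataclasses import dataclass, field

# A tiny, hypothetical fragment of the QTI metamodel (M2 level).
@dataclass
class SimpleChoice:
    identifier: str
    text: str

@dataclass
class AssessmentItem:
    identifier: str
    choices: list = field(default_factory=list)   # simpleChoice elements
    correct_response: str = ""                    # from the responseDeclaration

# A model at the M1 level, conforming to the sketched metamodel:
item = AssessmentItem(
    identifier="simpleChoiceItem",
    choices=[SimpleChoice("ChoiceA", "illustrative option A"),
             SimpleChoice("ChoiceB", "illustrative option B"),
             SimpleChoice("ChoiceC", "illustrative option C")],
    correct_response="ChoiceA",
)
```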
Response processing in this example is also simple. As shown in Figure 4, the candidate's response is declared at the top of the item to be a single identifier. The values this identifier can take are the values of the identifier attributes of the individual simpleChoices (ChoiceA, ChoiceB, ChoiceC). The correct answer is included in the declaration of the response (Figure 4).

4.3. Model Transformation in the QTI System

In order to use DL reasoning techniques in the process of analyzing students' solutions, it is necessary to transform the QTI-based models into the equivalent ODM-based QTI-OWL model (Figure 5). This process is automated using the Atlas Transformation Language (ATL). Details of the DL reasoning approach illustrated in Figure 5 are further elaborated in Section 5.
Figure 5. Analysis of Students’ Solutions
The result of the qti2owl.atl transformation is the OWL-based QTI metamodel, as well as the corresponding OWL-based QTI models, in Ecore (Figure 6).
Figure 6. OWL model of a question and correct answer transformed from a QTI model
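The actual qti2owl.atl transformation is written in ATL; purely as an illustration of the mapping idea, here is a Python sketch that turns a simple-choice item into OWL-style triples. The element and property names are our own, not the real metamodel identifiers.

```python
# Toy model-to-model transformation: a QTI-like source model (a dict) is
# mapped to an OWL-like target model (a list of subject-predicate-object triples).
def qti2owl(item):
    triples = [(item["identifier"], "rdf:type", "owl:Class")]
    for choice in item["choices"]:
        # each simpleChoice becomes a class related to the item concept
        triples.append((choice, "rdf:type", "owl:Class"))
        triples.append((choice, "rdfs:subClassOf", item["identifier"]))
    # the stored correct response becomes an individual of its choice class
    triples.append(("correctResponse", "rdf:type", item["correct"]))
    return triples

owl_model = qti2owl({"identifier": "SimpleChoiceItem",
                     "choices": ["ChoiceA", "ChoiceB", "ChoiceC"],
                     "correct": "ChoiceA"})
```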
5. Reasoning with QTI models

In this section, we focus on the intelligent analysis of the semantics of students' solutions to the problems they solve during assessment (students' solutions, for short). To this end, we examined using LORD, our DL reasoner based on MDA software engineering standards. We propose an architecture for intelligent analysis of students' solutions (Fig. 5). To elaborate the idea, we present two simple examples of using LORD in analyzing a student's answer to a simple choice question (Figures 3 and 4). The basic notions in DLs are concepts (unary predicates) and roles (binary predicates) [7]. Reasoning services in DLs are based on the tableau algorithm [19]. The tableau algorithm tries to prove the satisfiability of a concept term C by constructing a model in which C is satisfied [19]. For more information about DLs and their reasoning capabilities, see [7].

5.1. DL Reasoning in Intelligent Analysis of Students' Solutions

Many Semantic Web-based education environments use different reasoning techniques to help authors improve the course design (e.g., the case-based reasoning techniques explained in [20], or rule-based reasoning [21]), or for intelligent analysis of students' solutions. For example, Simic [21] used an XML format to represent the domain knowledge and generate a CLIPS file (*.clp) before using the reasoning mechanism. The inference engine of the Jess expert system shell (http://www.jessrules.com/jess/index.shtml) was used as the reasoning mechanism. However, some problems are difficult for Jess to solve, such as the following:

1. Reasoning about course material subsumed by other material (i.e., classification of learning material).
2. Reasoning about a student's answer that represents a model of the domain knowledge (in the sense the term "model" is used in DLs).
3. Intelligent analysis of the semantics of students' solutions.
Intelligent analysis of students' solutions using LORD may fulfill the following requirements:

1. Check whether the student's answer is satisfiable with regard to the question.
2. Find the student's mistakes (semantic inconsistencies) in the answer.
3. Find whether the student's answer can be described with an uncertainty, rather than just as a true answer.
4. Use different pedagogical strategies in the analysis of students' solutions, according to a hierarchy of answers.
These requirements may be satisfied using DL reasoning services like classification (subsumption) and consistency checking. Using DL reasoning techniques in this chapter, we focus on problem 1 from the above list. Classification is useful in checking the hierarchy of students' answers or teaching courses. A question put to the student may admit a few different answers, all of which may be true. In the case of several correct answers, LORD may find the most common answer and give positive (but different) marks to the student. The answer hierarchy can be calculated using the DL subsumption reasoning technique. This classification cannot be applied when there is only one answer to the question.
The benefit of applying LORD, i.e., DL consistency checking, is the capability of finding logical mistakes in the students' answers. Some existing DL reasoners [19], [22] may fulfill the above requirements, but there are a few problems in this regard, mostly related to the architectures of these reasoners. For example, the FACT reasoner [19], implemented in LISP, is difficult to integrate with Intelligent Semantic Web Based Education Systems (ISWBES), even though it is DIG compliant [19]. FACT may check the consistency of some of the students' answers (if they are compliant with an ontology), even when integrated into our framework, but it cannot pinpoint inconsistencies precisely. We assume that the students' answers are submitted to the reasoning machine as OWL models (Fig. 5), transformed from a QTI model (Fig. 4). The OWL model conforms to the OWL metamodel, which is defined in the Ontology Definition Metamodel (ODM) [12]. The tableau model is described using XML Metadata Interchange (XMI) [8]. XMI has a tree structure – every tableau model is a tree (graph) [19]. Using the interfaces generated for the tableau metamodel (Ecore classes), we can analyze such a tableau model, i.e., LORD can find the student's mistakes and return them to the Assessment Delivery System [3]. To transform OWL models of students' solutions to the corresponding tableau models, we have also used the Atlas Transformation Language (ATL) (http://www.eclipse.org/m2m/atl/).

5.2. Examples of Applying DL Reasoning in Intelligent Analysis of Students' Solutions

Let us now illustrate DL reasoning over a simple choice question such as the one in Figure 3 (compliant with the QTI standard). This example simultaneously illustrates the advantages of using LORD instead of the existing reasoners. We do not cover implementation issues in this section. Among a few possible answers, a student may choose one or more.
In DLs terms, the items correspond to a Tbox (or Rbox 6) (Table 2), where questions are represented as concepts using a DL language. These simple choice items (Figure 3) are represented as a part of the QTI model (Figure 4). We transformed the model into the equivalent OWL model (simple choice model – Figure 6), which conforms to the OWL metamodel (Figure 5). There are at least two ways of applying LORD in this example (Figure 6):

1. Instance checking (Abox query).
2. Checking the satisfiability of the student's answer.
Instance checking is applicable in cases when one or more OWL classes have a few instances. For example, the class STUDENT can have the following instances (individuals): John, Mary, and Philip. However, Mahesh is not a student. The multiple choice question can offer these names. Intuitively, the student's answer is an instance of the STUDENT class and can be checked using the instance checking reasoning technique. This is the simplest case; some cases can include more complex Abox formulas. Following the previous example, the assessment system can ask the student:
6 Rbox is the part of a DL knowledge base consisting of role inclusion axioms [6].
Student(Mary) ⊔ Student(John) ⊔ Student(Philip)    (1)
or, in plain English, "Are Mary or John or Philip students?". Mary, John, and Philip are instances of the class STUDENT. In the multiple choice question, the student must/may choose one of the four names. However, two different students may submit syntactically different but semantically equivalent answers (e.g., student A answers "MARY", student B answers "JOHN"). Both answers are true, but syntactically different. To illustrate the use of the satisfiability technique, consider an example from the IMS QTI standard document [3], illustrated in Figure 4. To check the satisfiability of a student's answer, LORD uses transformations from an OWL model to a tableau model (Figure 5). In this example, the questions are saved in the knowledge base (OWL model) as "SimpleChoice" class instances (Figure 6). If the student selects one among a few answers, LORD can check whether this answer is satisfiable w.r.t. the knowledge base (the OWL model). The correct answer in this case is "ChoiceA". The QTI standard suggests saving answers as instances of "ResponseDeclaration" [3], but in some cases the answers may not be saved as instances of this class, because LORD can check the satisfiability of the answer with respect to the axioms (Tbox/Rbox) in the knowledge base (in our case, the OWL model).

Table 2. DLs expressions corresponding to the OWL model (Figures 5 and 6)

  DLs expression                        Formula number
  (ChoiceB ⊓ ChoiceC) ⊑ ChoiceA         (1)
  ChoiceA ⊔ (¬ChoiceB ⊔ ¬ChoiceC)       (2)
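A naive sketch of the instance-checking service over the student example above can be written as a lookup; a real DL reasoner would also take the Tbox into account, while here the ABox is just a set of explicit concept assertions.

```python
# Minimal ABox as a set of concept assertions Concept(individual).
abox = {("Student", "Mary"), ("Student", "John"), ("Student", "Philip")}

def instance_of(abox, concept, individual):
    """Naive instance check: holds iff the assertion is explicit in the ABox."""
    return (concept, individual) in abox

# The disjunctive query Student(Mary) ⊔ Student(John) ⊔ Student(Philip):
query = any(instance_of(abox, "Student", x) for x in ("Mary", "John", "Philip"))
```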
Table 2 shows how the OWL model (Figures 5 and 6) can be described as a set of Tbox axioms (only the question, without the answer). Formula (1) means that the concept ChoiceA subsumes the intersection of the concepts ChoiceB and ChoiceC. The OWL model (Figure 6) is satisfiable if and only if (iff) formula (1) is satisfiable. Formula (1) is a subsumption relation in the Tbox, called a concept inclusion [6]. Formula (2) is a union of three concepts, and formula (1) is satisfiable iff formula (2) is. In this example, the correspondence between OWL models and DLs formulas means that LORD can also be used to test the satisfiability of questions w.r.t. OWL models. Further discussion of this issue is beyond the scope of this chapter. The first example (section 5.2.1) presents a satisfiable case, whereas the second one (section 5.2.2) presents an unsatisfiable student's answer, using the same simple choice item (Figure 6). As a solution, LORD generates tableau models (conformant to the tableau metamodel [23]) in both cases. The reasoning process is done during the model transformations, and this is the main difference between LORD and existing reasoners, like FACT [19]. The advantage of this methodology is that the generated tableau model can be analyzed later, using Ecore classes or JMI interfaces.7 This means that the tableau model saves implicit knowledge of the reasoning process, i.e., using the tableau model we may find out useful information about the student's semantic mistakes (like what was logically wrong in his/her answers and in the process of studying). We can use LORD to test how some pedagogical strategies can improve the learning progress.
7 http://mdr.netbeans.org/architecture.html
There are some disadvantages of using, for example, FACT [19] in this situation. The problems of using this reasoner can be divided into two parts:

1. The practical aspect of using FACT. We cannot use FACT as a plug-in. It may be called from our framework, but only through port 8080 [12]. It is very difficult to adapt FACT when the functionality of our assessment system is extended.
2. FACT answers only YES/NO in the case of satisfiability/unsatisfiability.
LORD can find out what the student's mistake is, and even why the student made the mistake, i.e., it can analyze his answer step-by-step using the tableau model saved in Ecore. The tableau model is basically a graph with nodes labeled by the names of concepts/classes. Contrary to that, FACT [19] answers only YES/NO.

5.2.1. Example of a Satisfiable Student's Answer

Suppose that a student submitted "ChoiceA" as the answer to the question presented in Figure 3. LORD takes this answer as an OWL model, calculates the (un)satisfiability of the model (Figure 6), and generates the equivalent tableau model. It is important to note that before the reasoning starts, the models are saved in negation normal form. The correct answer, in this example, reduces to a satisfiability check. To explain the reasoning mechanism with OWL models, we use the DLs notation:

ChoiceA ⊑ ChoiceA ⊔ (¬ChoiceB ⊔ ¬ChoiceC)    (2)
(ChoiceA ⊓ ¬ChoiceA) ⊓ (ChoiceB ⊓ ChoiceC)    (3)
Checking subsumption can be reduced to checking the satisfiability of concepts [19]. In this case, it means: "Does the question subsume the answer?", as written in formula (2). The constraint system as the starting point of the reasoning process can be presented as a finite set of constraints, as follows:

L(x) = { ChoiceA ⊓ ¬ChoiceA ⊓ (ChoiceB ⊓ ChoiceC) }    (4)
The individual "x" (see Figure 7) is an instance of all subconcepts in this set. Using the reasoning rules (in this case the intersection rule for ALC logic) [19], the constraint system described by formula (4) can be extended to a new one:

L(x) = {ChoiceA, ¬ChoiceA} ∪ L(x)    (5)
This constraint system contains a contradiction, so formula (3) is not satisfiable, which implies that the question subsumes the answer and the student has submitted a true answer. The unsatisfiable points of the starting model are represented in dark color in Figure 7.
Figure 7. Tableau model of non-SAT student’s answer
5.2.2. Example of an Unsatisfiable Student's Answer

Suppose that the student has submitted a wrong answer (in this case, "ChoiceB" or "ChoiceC"). Again, LORD first checks whether the answer is subsumed by the question. This can be described by formula (6):

(ChoiceB ⊔ ChoiceC) ⊑ ChoiceA ⊔ (¬ChoiceB ⊔ ¬ChoiceC)    (6)
The subsumption relation can be reduced to concept satisfiability as follows:

(ChoiceB ⊔ ChoiceC) ⊓ ¬ChoiceA ⊓ (ChoiceB ⊓ ChoiceC)    (7)
For the subsumption to hold, formula (7) would have to be unsatisfiable. The reasoning process creates the constraint system (formula 8):

L(x) = { (ChoiceB ⊔ ChoiceC) ⊓ ¬ChoiceA ⊓ (ChoiceB ⊓ ChoiceC) }    (8)
Applying the intersection reasoning rule to formula (8) results in:

L(x) = { (ChoiceB ⊔ ChoiceC), ¬ChoiceA, ChoiceB, ChoiceC }    (9)
Now LORD applies the second rule (union) to the last constraint system (formula 9) and gets:

L(x) = { ChoiceB, ¬ChoiceA, ChoiceB, ChoiceC }    (10)
It is easy to check that there is no contradiction here. This implies that the starting concept (formula 7) is satisfiable, and that the question does not subsume the answer. The individual "x" has the same meaning as in the previous example. The tableau model corresponding to this example is illustrated in Figure 8.
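The two runs above (formulas (3)-(5) and (7)-(10)) can be reproduced with a minimal tableau procedure for the role-free fragment of ALC. This is a sketch, not LORD's actual implementation: concepts are assumed to be in negation normal form with negation applied only to atoms, and only the intersection and union rules are implemented.

```python
# Sketch of a role-free ALC tableau: concept terms are atoms ("ChoiceA"),
# negated atoms ("not", "ChoiceA"), or nested ("and", ...) / ("or", ...) tuples.
def satisfiable(label):
    """True iff the constraint set `label` has a clash-free completion."""
    label = set(label)
    for c in list(label):
        if isinstance(c, tuple) and c[0] == "and" and not set(c[1:]) <= label:
            return satisfiable(label | set(c[1:]))               # intersection rule
        if isinstance(c, tuple) and c[0] == "or" and not any(d in label for d in c[1:]):
            return any(satisfiable(label | {d}) for d in c[1:])  # union rule: branch
    # no rule applies: check for a clash A, ("not", A)
    return not any(("not", a) in label for a in label if isinstance(a, str))

A, B, C = "ChoiceA", "ChoiceB", "ChoiceC"
# Correct answer, formulas (3)/(4): ChoiceA ⊓ ¬ChoiceA ⊓ (ChoiceB ⊓ ChoiceC)
# -> clash, unsatisfiable, so the question subsumes the answer.
sat_correct = satisfiable([("and", A, ("not", A), ("and", B, C))])
# Wrong answer, formulas (7)/(8): (ChoiceB ⊔ ChoiceC) ⊓ ¬ChoiceA ⊓ (ChoiceB ⊓ ChoiceC)
# -> no clash, satisfiable, so the subsumption fails.
sat_wrong = satisfiable([("and", ("or", B, C), ("not", A), ("and", B, C))])
```

The unsatisfiable first case corresponds to the clash {ChoiceA, ¬ChoiceA} of formula (5); the satisfiable second case corresponds to the clash-free label of formula (10).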
Figure 8. Tableau model of SAT answer
6. Reasoning with QTI Tests

Every teacher tries to prepare consistent tests in order to check students' knowledge. This consistency depends on what kind of student knowledge a teacher wants to check (for example, the level of the student's understanding of the materials being taught). Inconsistent tests may imply the following disadvantages during the process of testing students' knowledge:

1. It is difficult to analyze students' knowledge and discover how they understand the teaching materials.
2. It is difficult to apply different pedagogical strategies when preparing the tests.
This section shows how LORD can help teachers prepare consistent tests, with respect to the QTI standard. If LORD is to be applied here, then consistent tests should satisfy the following requirements:

1. They must conform to the definition of consistency in a description logic. This means that the tests should contain no logical contradictions. (This requirement does not include tests (questions) with just true/false answers.)
2. They must conform to the pedagogical strategies applied, i.e., to the strategies the teacher wants to enforce when testing the students' knowledge. If pedagogical strategies can be varied during test preparation, they can be saved in a DL-based knowledge base, for example as Tbox or Rbox axioms. Practical application of LORD or other theorem provers and DL reasoners in preparing consistent tests that satisfy pedagogical strategies is part of our ongoing work.
Figure 9 shows the interaction between the process of preparing tests and that of testing the students' knowledge. Test preparation starts with a set of questions (for example, simple choice or multiple choice). The consistency of this set can be checked by LORD, in order to obtain a consistent test (requirement 1). An example of an inconsistent test is one in which two questions are in contradiction. It is also important to stress that, for example, a test consistent for high school students may not be consistent for primary school students.
Figure 9. Process of preparing tests and testing using a DLs reasoner
As an illustration, consider an example of a test that is inconsistent with respect to the given teaching material. Suppose that a student should master a lecture modeled by the ontology (partially) shown in Figure 10. This example reflects the second requirement of test consistency shown above, i.e., it is important to know which group of students is capable of providing a satisfiable answer.
Figure 10. Part of the teaching material defined as a property chain
The part of the ontology shown in Figure 10 defines two object properties, hasMother and hasParent. Assume that two individuals, mother and person, are instances of the concepts Mother and Person. According to the property chain defined in this ontology (see Figure 10), if a person has a mother, it implies that the person has a parent. Also, there is an Abox assertion that defines that the person individual is in the hasMother relation with the mother individual. Suppose that the student has to answer the following two questions:

Question 1: (hasMother ⊔ hasParent)(Ian, Mary)? ("Is the person Mary a mother or a parent of the person Ian?")
Question 2: ¬hasParent(Ian, Mary)? ("The person Ian does not have a parent Mary")

This test, with only two questions, is not consistent with respect to the given
knowledge base. The inconsistency is a consequence of the unsatisfiable conjunction of these two questions with respect to the knowledge base (the simple lecture, in this case). The source of the inconsistency is the second question: there is no person that at the same time has a parent and does not have a parent.
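The clash between the two questions can be sketched over a hypothetical mini knowledge base; the property chain of the ontology is simplified here to the single implication hasMother(x, y) implies hasParent(x, y), and the individual and role names are illustrative.

```python
# Hypothetical mini knowledge base for the lecture of Figure 10.
abox = {("hasMother", "Ian", "Mary")}            # asserted role assertion

def has_parent(kb, x, y):
    """hasParent(x, y) holds if asserted, or derived from hasMother(x, y)."""
    return ("hasParent", x, y) in kb or ("hasMother", x, y) in kb

# Question 1, (hasMother ⊔ hasParent)(Ian, Mary), is entailed by the KB;
# question 2 asserts ¬hasParent(Ian, Mary), which clashes with the derived
# hasParent(Ian, Mary), so the conjunction of the two questions is inconsistent.
q2_negates_has_parent = True
test_consistent = not (has_parent(abox, "Ian", "Mary") and q2_negates_has_parent)
```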
7. QTI-FAR Framework and Architecture for QTI Systems

The QTI-FAR reference architecture is shown in Figure 11. The idea is similar to that implemented in the Eclipse platform for software development. There is an architecture called the Eclipse Rich Client Platform (Eclipse RCP) [24], which allows developers to use the Eclipse architecture to design flexible and extensible applications, re-using a lot of already existing functionality and coding patterns inherent in Eclipse. For more information about Eclipse RCP, see [24]. The core part of a QTI system is its Delivery Engine, which is a part of the assessmentDeliverySystem. All other parts can be developed as plug-ins, associated with the core by a corresponding manifest file and metadata. The manifest must contain a separate resource describing the plug-in. The metadata associated with a specific plug-in should conform to the model specified using the QTI metamodel [17].
Figure 11. QTI-FAR reference architecture
The Delivery Engine plays the role of a coordinator in the process of analyzing the students’ solutions in a QTI system. This means that the assessment items from one subsystem (Figure 11) have to be delivered to the Delivery Engine before transferring them to another subsystem, because every subsystem needs the assessment item in a specific form. This is necessary for the subsystem in order to process the assessment item in its own way, unknown to the Delivery Engine. This form is defined in the metadata of the specific subsystem. An important part of this system is the response-processing machine, as a part of the Delivery Engine. The response-processing machine plays an essential role in the process of creating the questions, as well as in processing the students’ answers.
For the purpose of this work, a QTI-FAR system prototype in the domain of driving licensing has been implemented. The system consists of the delivery engine as well as the test construction tool as a plug-in. Screenshots are shown in Figures 12 and 13. When the session starts, the candidate chooses an assessment item from the list on the left and answers the question shown in the central workspace. The items to select from are created with an authoringTool and managed using an itemBankTool. In order to simplify this scenario, in Figure 12 the items are grouped in a single assessment section. Each assessmentTest like the one presented in Figure 12 is created by a testConstructionTool, using the assessment items stored in the knowledge base.
Figure 12. An example of a simple choice question in a QTI-FAR-based assessment system
The candidate submits his answer to the question by clicking the Answer button. At that point, the item gets check-marked in the list on the right side of the screen; this list shows the candidate's progress in the test (the assessment items answered are check-marked). At the same time, modal feedback is generated at the bottom of the screen. It shows the current number of correct and incorrect answers, as well as the number of items answered. The candidate can answer each question just once, which means that he cannot change the answer to an already answered item. Upon answering all items, the candidate can run response processing to generate a detailed score report. This denotes the end of the assessment test. The QTI-FAR framework also enables the use of open-ended questions in the assessment process, as shown in Figure 13. The main problem with using this type of question in other assessment systems is the support for response processing. Response
processing in the IMS QTI specification is based on matching the student's response to the previously stored correct one. Processing of open-ended questions/responses is "beyond the scope of specification" in IMS QTI. The use of LORD in QTI-FAR makes it capable of processing open-ended questions, which is a novelty and a step beyond the IMS QTI specification. For example, the answer to the question shown in Figure 13 ("What is the meaning of this
sign?") may be "Stop sign ahead"; another one may be "Stop ahead". These are two syntactically different, but semantically equivalent, correct answers. Our DL reasoner can detect that both answers are correct by applying the DL reasoning technique called instance checking – it checks whether the two answers are instances of the same concept from the knowledge base (the traffic sign, in this case).
Figure 13. An example of an open-ended question in a QTI-FAR-based assessment system
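This instance-checking step over open-ended answers can be sketched as follows; the concept name and the stored answer strings are illustrative, not taken from the actual knowledge base.

```python
# Sketch of instance checking for open-ended answers: both phrasings are
# recorded in the knowledge base as instances of the same concept.
abox = {
    ("StopSignAheadWarning", "Stop sign ahead"),
    ("StopSignAheadWarning", "Stop ahead"),
}

def same_concept(abox, answer_a, answer_b):
    """True iff some concept has both answers as instances."""
    concepts_a = {c for (c, i) in abox if i == answer_a}
    concepts_b = {c for (c, i) in abox if i == answer_b}
    return bool(concepts_a & concepts_b)
```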
8. Conclusion

The implementation of the full QTI specification has proven to be difficult. A review of software applications that claim to support QTI [28] found that in almost all cases the support was restricted to the item layer, leaving the Assessment and Section layers aside. Applying the MDA standard in QTI-based assessment system development leads to a flexible, robust and interoperable assessment system because:

1. It is possible to provide interoperability in both homogeneous and heterogeneous QTI assessment systems by using the proposed framework and architecture supported by Eclipse RCP.
2. The use of metamodels and ontologies in assessment system development improves the system behavior and decision making.
The main idea proposed in this chapter is how Semantic Web-based knowledge representation and Model Driven Architecture (MDA) can be brought close together in designing assessment systems. The chapter also describes how to use DL reasoning techniques in intelligent analysis of students' solutions and in preparing consistent tests. Analysis of the semantics of the student's answer is the key to providing response processing of open-ended questions in the IMS QTI standard. The proposed way to realize that idea is to apply the QTI-FAR reference architecture in assessment systems.
References

[1] T. Berners-Lee, J. Hendler, O. Lassila, The Semantic Web, Scientific American 284 (5), (2001), 34-43.
[2] P. Miller, Interoperability. What is it and why should I want it? Ariadne, 24. Retrieved March 10, 2004. Online: http://www.ariadne.ac.uk/issue24/interoperability
[3] IMS Question and Test Interoperability Overview, Version 2.1 Public draft. Online: http://www.imsglobal.org/question/qti_v2p1pd/imsqti_oviewv2p1pd.html
[4] J.D. Poole, Model-Driven Architecture: Vision, Standards and Emerging Technologies, Workshop on Metamodeling and Adaptive Object Models, ECOOP, 2001.
[5] UML 2.0 Infrastructure Overview. Online: http://www.omg.org/issues/UMLSpec.pdf
[6] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, P. Patel-Schneider, The Description Logic Handbook – Theory, Implementation and Application, Cambridge University Press, 2003.
[7] ELENA: Creating a Smart Space for Learning. Online: www.elena-project.org
[8] OMG XMI Specification, Version 1.2, OMG Document Formal/02-01-01, 2002. Online: http://www.omg.org/cgi-bin/doc?formal/2002-01-01.pdf
[9] J. Miller, J. Mukerji (eds.), MDA Guide Version 1.0.1, OMG, 2003. Retrieved November 25, 2006. Online: http://www.omg.org/docs/omg/03-06-01.pdf
[10] J. Bezivin, In Search of Basic Principles for Model Driven Architecture, The European Journal for the Informatics Professional 5 (2), 2004.
[11] Meta Object Facility (MOF) Specification, Version 1.4. Online: http://www.omg.org/docs/formal/02-04-03.pdf
[12] Ontology Definition Metamodel, Preliminary Revised Submission to OMG RFP ad/2003-03-40, 2004. Online: http://codip.grci.com/odm/draft
[13] J. Bézivin, F. Jouault, J. Paliès, Towards Model Transformation Design Patterns, In: Proceedings of the First European Workshop on Model Transformation (EWMT 2005), Rennes, France, 2005.
[14] Request for Proposal: MOF 2.0 Query/Views/Transformations RFP, OMG Document ad/2002-04-10, 2002. Online: http://www.omg.org/docs/ad/02-04-10.pdf
[15] E. Litani, E. Merks, D. Steinberg, Discover the Eclipse Modeling Framework (EMF) and Its Dynamic Capabilities, 2008. Online: http://www.devx.com/Java/Article/29093
[16] F. Allilaire, J. Bézivin, F. Jouault, I. Kurtev, ATL: Eclipse Support for Model Transformation. Online: http://www.sciences.univ-nantes.fr/lina/atl/www/papers/eTX2006/14%20-%20FreddyAllilaire.pdf
[17] S. Radenković, N. Krdžavac, V. Devedžić, A QTI Metamodel, In: Proceedings of the International Multiconference on Computer Science and Information Technology, 2007, 1123-1132.
[18] KM3 (Kernel Meta Meta Model). Online: http://wiki.eclipse.org/index.php/KM3
[19] I. Horrocks, Optimising Tableaux Decision Procedures for Description Logics, PhD Thesis, University of Manchester, 1997.
[20] M. Ferrario, B. Smyth, Collaborative Knowledge Management & Maintenance, In: Proceedings of the German Workshop on Case Based Reasoning, Germany, 2001, 14-15.
[21] G. Simić, The Multi-courses Tutoring System Design, ComSIS 1 (1), 2004.
[22] E. Sirin, B. Parsia, An OWL DL Reasoner, In: Proceedings of the International Workshop on Description Logics (DL2004), British Columbia, Canada, 2004.
[23] N. Krdžavac, V. Devedžić, A Tableau Metamodel for Description Logics, In: Proceedings of the 13th Workshop on Automated Reasoning (ARW 2006), Bristol, UK, 7-9, 2006.
[24] J. McAffer, J. Lemieux, Eclipse Rich Client Platform: Designing, Coding, and Packaging Java™ Applications, Addison Wesley Professional, 2005.
[25] N. Krdžavac, D. Gašević, A Method for Implementation of a Description Logic Reasoner, School of Electrical Engineering, University of Belgrade 16 (2005), 119-130.
[26] S. Radenković, N. Krdžavac, V. Devedžić, Towards More Intelligent Assessment Systems, In: Technology Enhanced Learning, IGI Global, USA, 2008, 257-283.
[27] A. Gerber, M. Lawley, K. Raymond, J. Steel, A. Wood, Transformation: The Missing Link of MDA, ICGT, 2002.
[28] P. Gorissen, Quickscan QTI, Retrieved November 15, 2005. Online: http://www.digiuni.nl/digiuni//download/35303.DEL.306.pdf. Utrecht: De Digitale Universiteit.
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-197
CHAPTER 11
PhiloSurfical: an Ontological Approach to Support Philosophy Learning

Michele PASIN a,1 and Enrico MOTTA a
a Knowledge Media Institute, The Open University, UK
Abstract. As the Semantic Web is increasingly becoming a reality, the availability of large quantities of structured data brings forward new challenges. In fact, when the content of resources is indexed, not just their status as a text document, an image or a video, it becomes important to have solid semantic models which avoid as much as possible the generation of ambiguities in relation to the resources' meaning. Within an educational context, we believe that it is only thanks to such models that it is possible to organize and present resources in a dynamic and contextual manner. This can be achieved through a process of narrative pathway generation, that is, the active linking of resources into a learning path that contextualizes them with respect to one another. We are experimenting with this approach in the PhiloSurfical tool, aimed at supporting philosophy students in understanding a text, by presenting them 'maps' of relevant learning resources. An ontology describing the multiple aspects of the philosophical world plays a central role in this system. In this chapter we discuss some lessons learned during the modeling process, which have been crystallized into a series of reusable patterns. We present three of these patterns, showing how they can support different context-based reasoning tasks and allow a formal conceptualization of ambiguities that are primarily philosophy-related but can easily be found in other domains too. In particular, we describe a practical use of the ontology in the context of a classic work in twentieth century philosophy, Wittgenstein's Tractatus Logico-Philosophicus. Keywords. Philosophy, ontology, digital narratives, semantic web, Wittgenstein
Introduction The need to specify and separate the information about the context of usage of a learning resource from the resource itself is one of the main reasons behind the creation of various kinds of metadata schemas. In the past years, this work has focused on the notion of learning object (LO) [1], as the technology capable of guaranteeing interoperability to the rapidly growing number of Web-based educational applications. However, researchers are now increasingly arguing that LOs' metadata are not fine-grained enough to support non-trivial composition of resources, e.g. when constructing a curriculum [2]. As a result, as attested by a series of workshops held worldwide [3], the e-Learning research community has begun looking at the potential 1 Corresponding Author: Michele Pasin, Knowledge Media Institute, The Open University, UK; EMail: [email protected].
198
M. Pasin and E. Motta / PhiloSurfical: An Ontological Approach to Support Philosophy Learning
for e-Learning of the emerging Semantic Web technologies. In this context, ontologies [4] have been proposed by many [5,6] as a technology that can be used to complement the functionalities of traditional LOs metadata standards – for example, ontologies can be useful when it is important to describe precisely and unambiguously a specific domain of interest, since this is the main content an LO deals with; or for producing an extensive formalization of learning and teaching strategies, so as to have much more control over the context within which the learning resources are used [7]. Our approach, in compliance with this new research direction, supports the enhancement of LOs' metadata through the use of ontologies, so as to represent the content of learning resources at a finer level of detail. More precisely, in the PhiloSurfical tool this approach is realized through the formalization of a humanistic discipline, philosophy. An ontology to describe and organize theories, schools of thought, arguments, problems and their relations to other philosophical concepts will allow the annotation of the learning material, and, subsequently, its dynamic reorganization with a degree of accuracy and flexibility well beyond the one provided by standard LOs metadata. By doing so we aim at providing a platform that supports philosophy students in understanding key aspects of the discipline's discourse. This is achieved by means of a pathway creation process, i.e., an approach that gives students the means for contextualizing the resources they have found, so as to better analyze and interpret them in the light of the multiple roles they play in the world of philosophy. Our approach takes the notion of a learning pathway as a "system of specially stored and organized narrative elements which the computer retrieves and assembles according to some expressed form of narration" [8] and attempts to transpose it within the specific scenario made up of philosophical entities.
For example, we can think of a young philosopher trying to understand Wittgenstein’s picture theory of language. Our purpose is to let her discover the significance of this theory by putting it into different perspectives (i.e., the pathways) and autonomously exploring how it relates to other philosophical entities (e.g., theories, events or people). Among the available pathways our student could find the following ones:

- the critical explanation of a theory (a meta-historical learning path that highlights the opposing theories, and the problems on which they are focused);
- the historical contextualization of a theory (a learning path that shows associated information about an author, the historical period, or other contemporary important theories in different research areas);
- the description of the whole body of work of an author (a learning path that recollects all the activities and results of an author, and organizes them according to the user’s preferences);
- the intellectual lineage of a concept/theory (a learning path that follows the influence of ideas throughout the history of thought, across different areas and historical periods).
The aim of this chapter is twofold. Firstly, we consider three ontological lessons learned which emerged as fundamental in supporting philosophical learning pathways. In this respect, the modeling patterns we present are quite different from the patterns discussed in other works such as that of Gangemi and colleagues [9], where the focus is more on the architectural issues involved in the ontology creation process. In particular, the patterns we describe in the next sections represent some
modeling decisions that are meant to guide the interpretation of philosophical texts, so as to obtain formal models applicable to providing non-trivial navigation mechanisms. We believe that such modeling has a significance that goes beyond the specific domain of philosophy and can be reused in more generic areas of interest. Secondly, we show the reader a practical application of this ontology, the PhiloSurfical tool. This is a web application that supports users in learning about a philosophical text, Wittgenstein’s Tractatus Logico-Philosophicus [10]. By relying on the multiple representations of our philosophical ontology, PhiloSurfical’s learning pathways let students benefit from multiple perspectives on the text and on related resources. For example, they can reorganize the text according to the relevance of a single annotation, e.g. the concept of “logical-independence”, or they can adopt more complex strategies to retrieve other resources that are not directly related to the Tractatus. In particular, in the following sections we discuss the functioning of a specific type of pathway, the theoretical one. The rest of the chapter is organized as follows: section 1 provides some pedagogical background on philosophy learning; section 2 gives an overview of the ontology we built for representing the world of philosophy, and then discusses the details of three important modeling patterns that emerged during the work; section 3 describes PhiloSurfical, our narratology-inspired prototype application. Finally, the last section concludes the chapter with information about related work.
1. Constructive Learning in Philosophy

Let us think again about our philosophy student, while she is tackling Wittgenstein’s theory of language, as exposed in the Tractatus Logico-Philosophicus. She would probably read Wittgenstein’s text several times, analyze the language being used and deepen her understanding of a number of concepts the philosopher’s argument relies upon. She would also likely make use of other reference material on this topic, so as to gain insight into the historical and theoretical contexts the theory originated from. Is it addressing a long-standing problem, or does it raise a completely new one? Who has influenced Wittgenstein, and how much of his ideas can be related to other pre-existing philosophical work? These are the types of questions we expect our student to try to answer. At a more general level, we could say that our student is actively exploring this new philosophical ‘territory’. An active style of learning, rather than, for example, a passive reading and remembering of what is read, is regarded by many as the main cause of successful learning. In educational theory, this thesis (and others related to it) is one of the central tenets of doctrines such as constructivism [11] and situated cognition [12]. Their importance and academic relevance, beyond the various and inevitable debates, is widely acknowledged. For example, an active style of learning implies that, when facing a text, although a teacher’s explanation helps the learning process, he/she is not the main reason for it. In fact, according to this position teachers are more often viewed as ‘knowledge facilitators’, as opposed to the traditional figure of the ‘knowledge dispenser’. In general, students are advised to engage directly with a subject matter (e.g., an author’s text), in order to obtain their own understanding and actively construct a meaning out of it.
However, this picture is quite a simplified one. While an active style of learning is relatively easy to foster in natural, everyday situations (for example, when learning how to ride a bike or how to speak a language), this is not the case for the more artificial, academic learning. The learning and teaching of philosophy can be taken as an example of this difficulty. Philosophy, like other subjects such as theoretical physics, mathematics and logic, deals only with abstractions; that is, in Laurillard’s terms, with “descriptions of the world” [13]. As a consequence it is harder to situate its learning in a natural context, and it is also hard to apply constructivist approaches to its teaching. In such an academic and abstract context, the ideal student activities that can lead to a successful learning experience, and the best methods and situations to support them, are the object of much debate [14-16]. But even if general agreement on this matter is unlikely to be reached, we can still attempt to define some essential requirements to achieve in the context of philosophy teaching. More precisely, we agree with Carusi that the three most important skills to develop in a philosophy student are (a) analysis, (b) argument and (c) interpretation. As the author remarks, the “three skills are interwoven as analysis requires interpretation, and argument depends on the prior abilities to analyze and interpret correctly other philosophical positions” [17]. In table 1 we detail Carusi’s lengthier description of what each of the skills may entail, as far as the student is concerned.

Table 1. The three major philosophical skills (from Carusi, 2003)

Analysis:
• analyze a philosophical problem or position into its component parts and be able to tell how they are connected together;
• analyze an argument into premises and conclusions, and reconstruct the structure of the argument, filling in implicit premises where necessary;
• analyze philosophical texts into sections and be able to see the connections between sections.

Argument:
• understanding of the standard fallacies;
• being able to distinguish between inductive and deductive arguments, and being able to say what constitutes an acceptable argument of both kinds;
• understand the role of counter-examples and be able to use them;
• understand the role of analogies and be able to use them;
• understand the role of thought experiments and be able to use them.

Interpretation:
• interpretations should be coherent, in that they should not contain inconsistencies or contradictions;
• interpretations should be cogent, in that they should account for as much of the text as possible within a unified framework;
• interpretations should be informed by an understanding of the historical tradition in which the text is embedded and the meanings of concepts and terms as specified within that tradition. As a minimum, this should include some knowledge of the history of ideas in philosophy.
With the PhiloSurfical tool (see section 3) we aim especially to support the development of the (a) analysis and (c) interpretation skills, through an environment which allows constructing advanced strategies to present annotated resources to the user, in the form of browsing facilities and narrative generation. The active involvement of the student in a process of semi-structured navigation (the structure being provided by the
ontology) guarantees her engagement with the subject matter in a constructivist manner.
2. Engineering Philosophical Knowledge

2.1. Overview: an event-centered design

The specific approach used to realize the PhiloSurfical ontology has at its centre the decision to employ the CIDOC Conceptual Reference Model [18] as a starting point for our formalizations. The CRM ontology is a renowned ISO standard which aims at supporting semantic interoperability for museum data. In the following sections, we refer to version 4.2 of the ontology [19].
Figure 1. Example of an event-based representation
The choice of the CRM was motivated by two reasons. Firstly, its widely recognized status as a standard for interpreting cultural heritage data: by reusing and extending an existing and internationally recognized ontology, we give our tool’s users more chances to benefit from the emerging Semantic Web infrastructure. Secondly, its extensive event-centered design. This design rationale appeared to be appropriate also for organizing the history of philosophy: even if it is common to see it as a history of ideas, stressing the importance of the theoretical (i.e. meta-historical) dimension, this dimension cannot be examined without an adequate consideration of the historical dimension, that is, a history of the events related (directly or indirectly) to these ideas. As an example, in figure 1 we can see an event-centered representation in the PhiloSurfical ontology. The persistent-item class, which is one of the five classes composing CIDOC’s top layer (together with time-specification, dimension, place and temporal-entity), subsumes thing and actor. The two branches of the ontology
departing from them can have various instances, which are related by taking part (in various ways) in the same event (“1933-Prague-meeting”). This kind of modeling, in the context of the PhiloSurfical tool, is extremely useful because of the multiple navigational pathways it can support (e.g. we could move to another event having the same topic, or to another topic treated during the same event, etc.). Please note that in the figure some relations (e.g., has-worked-for) are graphical shortcuts for the actual and lengthier formalization of the relevant event (e.g., an event instance stating that an actor worked for an institution at some point in time). From the implementation point of view, the ontology has been prototyped using the Operational Conceptual Modelling Language (OCML) [20], which provides rich support for both specification and reference. Import/export mechanisms from OCML to other languages, such as OWL and Ontolingua, ensure symbol-level interoperability. Note that in the next sections we use different fonts depending on whether we refer to classes in the ontology (e.g., event) or to properties associated with them (e.g., has-duration). Instances are always double quoted (e.g., “the concept of will”). As regards the figures, classes are oval-shaped, rounded rectangles stand for instances and arrows represent relations. In particular, if not labeled otherwise, dashed arrows stand for the instance-of relation, while solid arrows stand for the subclass-of relation. At the time of writing, the ontology2 counts 348 classes, partly integrated from other relevant semantic models and partly identified through various knowledge acquisition techniques (formal and informal). In conclusion, it is worth noting that our ontology is, to our knowledge, the first and most ambitious attempt to provide a formal meta-language for describing the world of philosophy.
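To make the event-centered design concrete, the following is a minimal illustrative sketch in Python (not the OCML actually used by PhiloSurfical) of how distinct persistent items are connected only through the events they take part in. The participant and topic names are hypothetical stand-ins for the instances shown in figure 1.

```python
from dataclasses import dataclass, field

@dataclass
class PersistentItem:
    name: str

@dataclass
class Actor(PersistentItem):
    pass

@dataclass
class Thing(PersistentItem):
    pass

@dataclass
class Event:
    name: str
    participants: list = field(default_factory=list)  # Actor instances
    topics: list = field(default_factory=list)        # Thing instances

# Hypothetical instances on the two branches departing from persistent-item:
wittgenstein = Actor("Ludwig-Wittgenstein")
schlick = Actor("Moritz-Schlick")
topic = Thing("topic-of-the-meeting")

meeting = Event("1933-Prague-meeting",
                participants=[wittgenstein, schlick],
                topics=[topic])

def co_participants(actor, events):
    # Navigate from one actor to the others only through shared events,
    # which is what supports the multiple navigational pathways above.
    return [p.name for e in events if actor in e.participants
            for p in e.participants if p is not actor]

print(co_participants(wittgenstein, [meeting]))  # ['Moritz-Schlick']
```

The same event hub supports the other pathways mentioned in the text, e.g. moving to another event with the same topic, by filtering events on their topics list instead of their participants.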
Although we used it mainly in the context of Wittgenstein’s philosophy (see section 3), the ontology is very abstract and could easily be applied to other philosophical domains too. We provide an extensive description of all of its features in another publication [21]. In the next sections we present three ontological issues we encountered during the modeling process, together with the solutions we devised in order to solve them. As in the example above, the derived modeling patterns aim at taking advantage of the multiple meanings a philosophical entity (e.g. an idea, a text or an event) can have, by making these meanings explicit and employable when building novel exploration mechanisms. In other words, according to our approach ‘ambiguities are good’ because, if properly identified, they let us explore the domain in different and interesting ways.

2.2. Pattern #1: is rationalism a school of thought or an event?

The first pattern originates from the fact that in our everyday language we refer to belief groups, intellectual movements and schools of thought ambiguously, often using the same word. For example, let us consider the following three statements:

a) “Throughout history, the attacks of rationalism against empiricism have diminished”
b) “Descartes was one of the founders of modern rationalism”
c) “This theory is clearly a new and re-shaped rationalism”
2 The ontology is available online at http://philosurfical.open.ac.uk/onto.html
Initially, we set out to model concepts such as “rationalism” by adding a philosophy-specific subclass to CIDOC’s period. In fact, according to CIDOC, period (which is a direct subclass of temporal-entity) should subsume prehistoric or historic periods, or even artistic styles. This is motivated by the fact that "it is the social or physical coherence of these phenomena that identify a Period and not the associated spatio-temporal bounds" [19]. This seemed to apply quite neatly also to cultural and philosophical periods, so we added intellectual-movement and its subclass philosophical-movement to the hierarchy. However, on deeper ontological analysis, we came to the conclusion that in the sentences above we are using the same word to express three different meanings. Precisely, in a) “rationalism” is a label referring to a group of people, in b) we mean an event, while in c) we are probably referring to an abstract idea. A modeling pattern (figure 2) achieves the goal of expressing both the difference in meaning and the interrelations of the three senses implied by words such as “rationalism”. This pattern involves subclasses of actor, period and view (a type of abstract philosophical idea, as we shall see later, expressing a standpoint). The ambiguity of a term such as "rationalism" can thus be clarified, since the semantic model keeps the three different senses of the word within a consistent representation. By doing so, we provide a context of usage for such ambiguous terms, and a direct way to navigate coherently among entities that are ontologically quite distinct (i.e. from temporal-entity to actor and propositional-content, which belong to separate branches of the ontology). Moreover, such a context specification could be used by a reasoner to derive inferences from incomplete or inconsistent data sources, or for performing information extraction.
Figure 2. The actor-event-view modeling pattern
So, for example, we can describe the “enlightenment movement” in the following way3 (note that the temporal relations are specified here as slots, but are usually
3 Although OCML has a simple frame-like syntax, in order to facilitate readability here we are using an abstracted syntax.
inferred whenever the appropriate time specifications of the other periods are provided):

INDIVIDUAL Enlightenment
  instance-of : Intellectual-movement
  has-time-specification : 18th-century
  overlaps-in-time-with : scientific-revolution, renaissance
  meets-in-time-with : french-revolution, american-revolution, romanticism
  overlaps-with : age-of-Reason, neo-classical-art
  took-place-at : germany, france, britain, spain
  has-related-group-of-people : enlightenment-group-of-people
  is-typified-by : enlightenment-conception
The last two slots in the formalization above have a special importance, for they serve the purpose of interrelating the three different senses highlighted in the pattern. In particular, the slots has-related-group-of-people and is-typified-by link the “enlightenment” instance (an intellectual-movement) to the relevant instances of group-of-people and of school-of-thought.
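As a hedged illustration of how these linking slots resolve the ambiguity of a word such as “rationalism”, the following Python sketch (not the OCML actually used; all instance names are hypothetical) creates one instance per sense and interrelates them through the pattern’s slots.

```python
class Instance:
    """A bare-bones frame: a named instance of an ontology class with slots."""
    def __init__(self, name, cls, **slots):
        self.name, self.cls, self.slots = name, cls, slots

# Three ontologically distinct instances behind the single word "rationalism":
group = Instance("rationalism-group-of-people", "group-of-people")    # sense a)
view = Instance("rationalism-conception", "school-of-thought")        # sense c)
movement = Instance("rationalism-movement", "intellectual-movement",  # sense b)
                    has_related_group_of_people=group,
                    is_typified_by=view)

# The linking slots provide a direct way to navigate among the senses,
# even though they belong to separate branches of the ontology.
assert movement.slots["has_related_group_of_people"].cls == "group-of-people"
assert movement.slots["is_typified_by"].cls == "school-of-thought"
```

Starting from any one sense, a navigation mechanism can thus offer the other two as contextually related entities rather than as homonyms.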
2.3. Pattern #2: not all views are theories!

The second pattern is related to the fact that people often employ the term ‘theory’ in a loose manner, over-classifying views with different characteristics. Consequently, a thorough formalization of these entities proved to be an important meta-model, usable by students for ‘learning the differences’ among the various theory types and their relationships. In our ontology, view has been defined as a generic class referring to philosophical ideas expressing a viewpoint, that is, propositions picturing a perspective on the world in the form of more or less structured interpretations of things and events. Examples of view are "solipsism", "theory of evolution by natural selection", "philosophy of Plato" or "a name has a meaning only in the context of a proposition" (i.e. Frege's context principle). Because of their ‘categorical’ attitude, views usually define concepts and, in general, create the context for the definition of other meanings too (e.g. problem-areas, problems, methods etc.). A number of properties connect views to the other philosophical-ideas: e.g. views can use other ideas, tackle problems, influence and support/contrast each other, and be-supported by arguments. However, the feature we want to highlight here is that views can have different granularities: from our analysis of the literature, we identified four of them. This classification is mainly related to the degree of generality views exhibit, and the level of complexity they have. So, we can have (as shown in figure 3):
Figure 3. The view-types instantiation
- Thesis: the least structured view, as sometimes it consists only of a standpoint in the form of a statement (i.e. an assertion). For example, in the context of Wittgenstein's “picture theory of language”, a thesis can be the "independence of the state of things".
- Theory: a systemic conceptual construction with a coherent and organic architecture. A theory explains a specific phenomenon (or a class of phenomena) and typically answers an already existing problem. Examples are Darwin’s “theory of evolution” or Quine’s “verification theory”.
- Philosophical-system: it might appear as a theory at first sight, but it differs essentially in its generality; that is, it spans various problem-areas, while a theory is usually confined to one problem-area only. As a consequence, theories are usually part-of philosophical systems. We can therefore define a system as the set of a person’s views that are consistently connected to each other, in such a way as to form a unity (in a way, this class refers to what is normally called the "philosophy" of a thinker).
- School-of-thought: this class refers to the set of theory types, or generic standpoints, which in the history of thought have acquired a particular significance and, seemingly, a life of their own. They correspond to widely known conceptions, or standardized intellectual trends that hint at typical ways to answer a problem (or a set of problems). Examples are “pacifism”, “animism”, “expansionism”, "empiricism" or "monism". A school-of-thought, compared to the other views, is not as formalized and specific as a theory, and not as general and systematic as a philosophical-system.

Thanks to this quadruple classification, it is possible to specify all the hierarchical and mereological relationships among views with a good degree of precision.
From the learner’s point of view, this modeling pattern facilitates the creation of pathways that place a theory or school of thought within its larger theoretical context, showing how it relates to other intellectual entities.
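The nesting of view granularities described above can be sketched as follows; this is an illustrative Python rendering under our own naming assumptions (the instance names are hypothetical examples, not taken from the actual knowledge base), not the OCML formalization itself.

```python
# The four view types, all subclassing the generic View class.
class View: ...
class Thesis(View): ...
class Theory(View): ...
class PhilosophicalSystem(View): ...
class SchoolOfThought(View): ...

class ViewInstance:
    def __init__(self, name, view_type, part_of=None):
        # part_of records the mereological link: theses sit inside
        # theories, theories inside philosophical systems.
        self.name, self.view_type, self.part_of = name, view_type, part_of

# Hypothetical instances at three nested granularities:
system = ViewInstance("philosophy-of-the-early-Wittgenstein", PhilosophicalSystem)
theory = ViewInstance("picture-theory-of-language", Theory, part_of=system)
thesis = ViewInstance("independence-of-the-state-of-things", Thesis,
                      part_of=theory)

def enclosing_system(v):
    # Walk the part-of chain up to the most general enclosing view.
    while v.part_of is not None:
        v = v.part_of
    return v.name

print(enclosing_system(thesis))  # 'philosophy-of-the-early-Wittgenstein'
```

A pathway generator can traverse the same part-of chain in either direction, e.g. to show a student which system a given thesis ultimately belongs to.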
2.4. Pattern #3: ‘problematic’ problem areas

The third pattern we present aims to provide a way of expressing the distinctive features of ‘fields of study’. It is normal for philosophy learners to refer to the topics they are studying in terms of the area they belong to, e.g., metaphysics, logic, philosophy of language etc. Indeed, the way the discipline is organized reflects some common denominators of philosophical research: these can be some classic problems investigated in philosophy, or the generic approaches used to solve them. In order to provide explorative pathways that focus on these particular aspects of philosophical discourse, we created a modeling pattern centered around these notions. As we will see, one of the major difficulties here arises from the fact that we can interpret fields of study in at least two different ways: a generic one (e.g. the field of “physics”) and a specific one (e.g. “Newtonian physics”). The pattern models the relations between them. Our starting point is a problem-centered approach, that is, the decision to see the activity of philosophers as essentially an ongoing process of specifying and giving solutions to problems. Consequently, we consider any recognized area of study, of whatever type or dimensions, as a problem-area. In its simplest version, a problem-area is composed of a set of problems linked by different relational schemas, but in general tied together by a main theme. This theme, in our ontology, can be represented through a problem (has-central-problem property) or through a thesis functioning as a criterion (has-criteria property). For example, “psychology”, when treated as a problem-area, can gather problems tied to the “mind-definition” problem, to the problem of “relating human behavior to brain activities”, or to the thesis that "brain and mind can be investigated with the methods of natural sciences". Other features of problem-areas are that they can be related to each other (e.g.
“mathematics” and “philosophy of mathematics”) and that they can be organized into simple hierarchies (e.g. “internet-ethics” is a sub-area of “ethics”). However, we soon realized that "psychology" has a role and significance in our world that go beyond those of a mere problem area. In a similar fashion, "ethics" or "cognitive science" would not be properly characterized only as instances of problem-area, for they also refer to theories or methods which have become intrinsically related to the definition of the area. Moreover, if we consider the history of thought, the topic and description of problem areas have always been the subject of many debates: different views aspire to have the ultimate vision of what the central issues to look at are, or the right methods to take. In this respect, problem-areas are not very different from other ideas that can be defined by multiple views. For example, consider how different a sense was given to “philosophy of language” by Wittgenstein’s first philosophy and his second one. In order to capture these subtle differences, we defined the class field-of-study as a problem-area that has been socially and historically recognized as separate from the others (and from being a mere agglomerate of problems). In the ontology, this is reflected by the fact that a field-of-study is not just specified by a criterion, but is defined-by a view. It is also characterized by the fact that it collects not only problems, but also ways to solve or tackle them (i.e. theories and methods). The distinguishing properties are therefore defined-by-view, has-exemplar-theory and has-methodology. Finally, a last tricky issue regarding fields of study must be addressed. This does not emerge when treating relatively isolated entities such as “phrenology”, but it clearly
is an issue if we consider, say, “physics”. In our everyday language, and also in the organization of academic programs, we usually refer to “physics”, “psychology” or “philosophy of mind” as generic fields of study. What this means is not really clear. In fact, when we delve into them (or, even more so, if we ask a practitioner for clarification), we quickly discover that there are many “physics”, “psychologies” and “philosophies”, at least as many as the views defining them. From our ontological perspective, these would all be separate instance candidates of the field-of-study class. However, we also need to represent the fact that they are all part of a more generic (and, as regards its meaning, probably emptier) type of field of study.
Figure 4. Problem areas and fields of study
Our solution to this problem consists in the creation of a generic-field-of-study class, which has no defining view other than the views defining the specific fields-of-study that are claimed to be part of it. In other words, we are formalizing the fact that generic fields of study such as “physics” or “philosophy” can be defined only extensionally. So:

CLASS Generic-Field-of-Study
  subclass-of : Problem-area
  defined-by-view :
    :range View
    :range-constraint :
      [?GF defined-by-view ?V] =>
        Exists ?F:Field-of-Study
          [?GF has-sub-area ?F]
          [?F defined-by-view ?V]
In the formula, the variables ?GF, ?V and ?F refer respectively to generic-field-of-study, view and field-of-study. In this way we can maintain the interoperability between specific thinkers’ definitions of classic problem areas and the generic but useful ways to refer to them. In figure 4 we give a graphical overview of this modeling pattern, highlighting the important relationships among the classes involved. Please note that in this figure we used a graphical ‘shortcut’: when a relation is attached to a group of instances, this means that the relation is repeated over all
of those instances. For example, the generic-field-of-study instance “physics” exhibits the property has-sub-area three times, corresponding to the three instances of field-of-study we grouped together.
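The range constraint of the pattern can be read as a simple check: every view listed on a generic field must also define one of its specific sub-areas. The following Python sketch illustrates that reading under our own assumptions (plain dictionaries, hypothetical instance data); it is not the OCML reasoner’s actual mechanism.

```python
# Each field is a plain dict; defined_by_view on the generic field is a
# list, mirroring its purely extensional definition.
newtonian_physics = {"name": "newtonian-physics",
                     "defined_by_view": "newtonian-mechanics-view"}

physics = {"name": "physics",
           "sub_areas": [newtonian_physics],
           "defined_by_view": ["newtonian-mechanics-view"]}

def satisfies_constraint(generic_field):
    # For every view ?V on the generic field ?GF there must exist a
    # sub-area ?F defined by that same view (the Exists ?F clause).
    return all(any(sub["defined_by_view"] == v
                   for sub in generic_field["sub_areas"])
               for v in generic_field["defined_by_view"])

print(satisfies_constraint(physics))  # True
```

A knowledge base entry that listed a view with no corresponding sub-field would fail this check, which is exactly the situation the range constraint is meant to rule out.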
3. The PhiloSurfical Tool

3.1. Overview

In order to test the usage of the ontology within a specific philosophical scenario we created PhiloSurfical4 (see fig. 5). This is an application that supports learning about a classic work in twentieth century philosophy, Wittgenstein’s “Tractatus Logico-Philosophicus” [10].
Figure 5. Screenshot of the PhiloSurfical application
The PhiloSurfical tool’s functionalities and, in general, the envisaged context of usage which has guided the ontology engineering process are the following: the semantic model should support the reconstruction of the history of ideas, by relying on structured information about the practical and theoretical domains of thinkers. For example, within an educational scenario where young philosophers try to understand domain notions (in a wide sense, comprising ideas and events), these functionalities take the form of mechanisms for contextual navigation and linking of relevant resources. As a result, we expect such a service to facilitate the discovery of related, previously unknown resources, which can be used by students and scholars during the process of answering difficult problems. This methodology, which has been previously defined as ontology-based navigation [22], can be further developed by means of an approach modeled on

4 The application is available online at http://philosurfical.open.ac.uk/
narratology [23]. As already discussed in an earlier publication [24], following structuralist theorists we can sketch out the structure of a narrative as the union of a story (what is told) and a discourse (the ‘how’ of what is told, that is, the specific way in which the basic elements of a story are re-organized and conveyed to the listener, in order to create different effects). In our narratology-inspired approach, a formal ontology can be used to express the semantics of the different elements composing a story, so that it is also possible to formalize the way a discourse recomposes the same elements according to different criteria. So, for example, the same chosen set of ‘atomic’ philosophical events could be ordered following a historical perspective, a geographical one or even a theoretical one. Similarly, the same set of philosophical ideas could be organized differently if investigated under a problem-centered perspective, a theory-centered one, or simply one based on their historical succession.

3.2. Knowledge base creation

It is important to remember that, although one of the aims of the ontology was to facilitate data exchange among distributed resource providers, for bootstrapping purposes (as the availability of free and adequately encoded philosophical data on the web is still limited) PhiloSurfical relies heavily on an internal knowledge base of our own creation. As suggested by recent projects such as DBpedia [25] or the Discovery project [26], we envisage that in the near future this situation will change, as much more structured data about philosophy is made available. Our knowledge base was constructed in three phases: first, we transformed the text itself into a format compatible with our ontology, i.e., we instantiated classes representing the text and its paragraphs.
Second, we annotated the text in collaboration with a Wittgenstein scholar, Andrea Bernardi; in this phase we instantiated classes representing ideas and relations among ideas, and indexed the text using these representations. Third, we enlarged the knowledge base by ‘scraping’ philosophy-related information from various websites in the public domain; in particular, we created more instances of philosophers and philosophical schools of thought. At the end of this process, we had gathered a total of more than 20,000 instances connected to Wittgenstein and his philosophy. It was not our purpose to create an exhaustive resource about the Tractatus; accordingly, we stopped refining the knowledge base as soon as we thought we had reached a critical mass of data, usable for testing the ontology through our ‘learning pathways’ approach.

3.3. System description

From the technical point of view, PhiloSurfical is a Lisp web application running on the LispWorks environment [27]. It uses OCML [20] for its knowledge representation and storage functionalities, and Hunchentoot [28] as a web server.
M. Pasin and E. Motta / PhiloSurfical: An Ontological Approach to Support Philosophy Learning
Fig. 6. The ‘Browse the annotations’ tab in PhiloSurfical
The application is organized into five sections or tabs. We attempted to organize the tabs' sequence according to their increasing difficulty of usage (namely, the first tab requires less learning effort than the second, the second less than the third, etc.). By doing so, we wanted users to have a more gradual encounter with the software. This becomes especially important considering that not all Wittgenstein scholars are familiar with web-based educational tools. The five tabs can be briefly described as follows5:
1) The Welcome tab serves as a 'splash screen' and provides some contextual information and links to relevant resources.
2) The Browse the text tab (fig. 5) lets users browse the Tractatus, which is made available in three versions (the original German edition and the two major English translations). In order to facilitate this activity, a tree-like outline of the book on the left-hand side lets them jump quickly to a specific paragraph. Moreover, we make use of a simple mechanism for helping learners select which of the text's translations to visualize: when the mouse hovers over one of the paragraphs shown on the right side of the screen, the paragraph is highlighted and a contextual menu appears above the text. By clicking on one of the available options, it is possible to view more than one translation at the same time.
3) The Browse the annotations tab supports a different type of text navigation by means of a smart index of the topics associated with the Tractatus' fragments (fig. 6). For example, by clicking on a paragraph, it is easy to see all the annotations which have been associated with it (in the local panel). Similarly, by clicking on an annotation we can search for all the paragraphs related to it, which are displayed in the main central panel.
Users can also go through all the philosophical annotations available (by means of the categories panel), find out more information about an annotation in free-text form (describe panel), or look at what relations it entertains with other annotations (inspect panel).

5 A lengthier description can be found in the first author's PhD thesis [29].
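The two-way 'smart index' behind the Browse the annotations tab can be sketched as a pair of mappings, one from paragraphs to topics and one from topics to paragraphs. This is an assumption about the implementation, for illustration only; the paragraph numbers and topic names are examples.

```python
from collections import defaultdict

para_to_annotations = defaultdict(set)
annotation_to_paras = defaultdict(set)

def annotate(paragraph, topic):
    """Record a topic annotation in both directions of the index."""
    para_to_annotations[paragraph].add(topic)
    annotation_to_paras[topic].add(paragraph)

annotate("6.2", "philosophy of mathematics")
annotate("6.21", "philosophy of mathematics")
annotate("6.2", "logic")

# clicking a paragraph shows its annotations (local panel);
# clicking an annotation retrieves its paragraphs (main central panel)
topics_of_62 = para_to_annotations["6.2"]
paras_on_maths = annotation_to_paras["philosophy of mathematics"]
```

Keeping both directions materialized makes either kind of click a constant-time lookup rather than a scan over all annotations.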
4) The Browse the pathways tab lets users select topics of interest and explore related resources by means of the 'learning pathways' facility (see the next section).
5) Finally, the Browse the ontology tab visualizes the tree hierarchy of the ontological representations PhiloSurfical relies on. This tab does not have any specific learning functionality, but it has been added to the prototype mostly as a way for instructional technologists to inspect the underlying model of PhiloSurfical.

3.4. Ontology-enabled pathways for learning philosophy

The learning pathways are the most advanced navigation facility PhiloSurfical provides. By means of these pathways we aim at helping learners explore the world of philosophy actively and autonomously (see also section 1 above). In general, a 'pathway' is essentially a way to retrieve different instances stored in the knowledge base and organize them into a coherent whole. Pathway results are normally presented in the form of maps of connected ideas - e.g., a map of competing views on the same topic, or a map of the philosophical problems typical of a research area - thus helping a student analyze a particular concept and interpret its significance within the various existing philosophical contexts. So, for example, let us imagine a learning scenario where Lisa, a young philosopher, attempts to make sense of Wittgenstein's text. After having explored a number of the Tractatus' topics using the Browse the annotations tab described above (tab 3, fig. 6), Lisa develops more interest in the topic called "philosophy of mathematics". Thanks to the mechanisms available in tab 3, Lisa can see where this topic is dealt with by Wittgenstein in the text, and also how it is related to other topics. She then realizes that a key point to clarify concerns the significance of the so-called "problem of the foundations of mathematics".
In order to benefit from more perspectives on this topic, our student now moves to the Browse the pathways tab. Here she can search for and select the problem instance called "problem of the foundations of mathematics" and find out more about it by using the pathway called problem-centric map of the attempts to solve a problem. As shown in figure 7, this type of query produces a list of concurrent view instances which have been classified as attempting to solve that problem. Each view is presented together with other useful information (e.g., the values of the slots has-main-exponent, has-exemplar-theory, etc.). As a result, Lisa can now see which other authors have attempted to solve the "problem of the foundations of mathematics" - e.g., Plato and Frege. She also realizes that their respective theories have to be considered when trying to understand this problem. In order to do so, she starts by selecting Frege's "mathematical logicism" instance and explores what pathways are available for it. Finally, in order to find some other literature about the topic, Lisa selects a textual type of pathway and arrives at the Stanford Encyclopedia of Philosophy entry about 'Frege's mathematical logicism'.
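The problem-centric pathway used in this scenario can be sketched as a simple query over typed instances. Only the slot name has-main-exponent is taken from the text; the slot attempts-to-solve and the sample instances are illustrative assumptions, not the actual knowledge-base schema.

```python
# A toy knowledge base of typed instances with slot values
knowledge_base = [
    {"type": "view", "name": "mathematical logicism",
     "attempts-to-solve": "problem of the foundations of mathematics",
     "has-main-exponent": "Frege"},
    {"type": "view", "name": "mathematical platonism",
     "attempts-to-solve": "problem of the foundations of mathematics",
     "has-main-exponent": "Plato"},
    {"type": "view", "name": "picture theory of language",
     "attempts-to-solve": "problem of meaning",
     "has-main-exponent": "Wittgenstein"},
]

def problem_centric_map(problem):
    """Retrieve the competing view instances classified as tackling a problem."""
    return [i for i in knowledge_base
            if i["type"] == "view" and i["attempts-to-solve"] == problem]

views = problem_centric_map("problem of the foundations of mathematics")
exponents = {v["has-main-exponent"] for v in views}
```

Run on the toy data, the query returns the logicist and platonist views, so the learner sees Frege and Plato as the main exponents to explore next.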
Figure 7. Pathway representing the various ‘attempts to solve a problem’
At the interface level, these mechanisms can be described as follows. First of all, users select a content of interest as the starting point of a pathway (fig. 7, item in focus box). Learners may then choose one of the available options appearing in the pathways list panel (see figure 7, bottom-left). The pathways that are not available are dimmed out; the available ones come with a brief description explaining their meaning. Once triggered, the pathway's results are shown as a list of interrelated entities (fig. 7, results panel). Here, a number of important relations among the pathway's items are made explicit, so as to highlight their significance in the philosophical discourse. Moreover, by clicking on any of these items it is possible to put it into focus and use it as the starting point of new pathways. A recent items panel is used to keep track of all the items selected since the beginning; from here it is also possible to search for these topics elsewhere on the web (e.g., on philosophical portals, specialized search engines, etc.). Furthermore, by clicking on the see in a graph button learners can view the pathway's results using a graphical visualization. For example, in fig. 8 we can see the results of a theoretical pathway starting from the idea of "Frege's conception of logic". In this case the pathway selected is generic map of related ideas, which simply shows all information associated with an idea. We classified pathways according to the ontological type of their 'entry point' and, more generally, according to the types of the instances that are retrieved from the knowledge base. So, for example, by selecting instances of philosophical-idea we would usually trigger a theoretical pathway; if we selected instances of person, instead, we would probably trigger a textual or historical pathway. Because of space limitations, we cannot give here a complete description of all the pathways made available in PhiloSurfical6.
In table 2 below it is possible to see more information about a specific type of learning pathway, the theoretical ones.
6 A complete list of the pathways can be found in the first author's PhD thesis [29].
Figure 8. Graphical view of a theoretical pathway about Frege
Figure 9. Abstract representation of two pathways' algorithms
Table 2. The theoretical pathways available in PhiloSurfical

"Ideas having the same name" (input: propositional-content): This pathway retrieves ideas having the same name but a different meaning than the selected one. E.g., starting from the concept of 'fact' in Wittgenstein, we would find out about other authors who used the word 'fact' in a different sense (such as Frege and Russell).

"Generic and specific schools of thought" (input: school-of-thought): Starting from a school of thought, this pathway retrieves a set of related schools of thought that are all specializations of the same generic one. This pathway is related to the formalization presented in section 3.5.4: e.g., by focusing on 'atomism' we would be able to see the related contextual versions of it, such as 'logical atomism', 'metaphysical atomism', 'social atomism', etc.

"Influences among related views" (input: view): Starting from a view, this pathway is a recursive function showing information about other views that support or compete with the first one. E.g., starting from 'Wittgenstein's theory of language', we could go to 'Russell's theory of language' (which opposes it), then to 'Whitehead's theory of logic' (which supports Russell's), etc.

"Generic map of related ideas" (input: propositional-content): This pathway shows all the information an idea has been described with. This is a generic way to retrieve all the interpretations associated with an idea.

"Problem-centric map of the attempts to solve a problem" (input: problem): This pathway takes a problem instance and retrieves information related to the competing views (theories, schools of thought, philosophies) that tackle that problem.
Finally, it is important to mention that internally PhiloSurfical represents pathways as abstract procedures applicable to any ontology-compliant data repository. For instance, in figure 9 we reproduced the algorithms behind the 'influences among related views' and the 'problem-centric map of the attempts to solve a problem' pathways (cf. also table 2 above). In particular, notice that after a pathway is triggered we normally scan the knowledge base for instances of the interpretation class mentioning the item which has been selected by the user. This class serves as an abstraction mechanism for letting multiple annotators work together within PhiloSurfical; essentially, this means that every time an annotator formalizes a philosophical concept through the ontology, her activity is 'reified' by instantiating an interpretation object. A more detailed description of this feature can be found in another publication [21].
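The recursive 'influences among related views' procedure can be sketched as a graph traversal over support/compete links. The relation structure below is an assumption reconstructed from the examples given for this pathway, not the actual OCML code.

```python
# Toy relation structure: each view lists the views it competes with
# or is supported by (link names are illustrative)
relations = {
    "Wittgenstein's theory of language": {
        "competes-with": ["Russell's theory of language"]},
    "Russell's theory of language": {
        "supported-by": ["Whitehead's theory of logic"]},
    "Whitehead's theory of logic": {},
}

def influences(view, seen=None):
    """Recursively collect views that support or compete with the given one."""
    if seen is None:
        seen = set()
    if view in seen:          # guard against cycles of mutual influence
        return seen
    seen.add(view)
    for links in relations.get(view, {}).values():
        for linked in links:
            influences(linked, seen)
    return seen

result = influences("Wittgenstein's theory of language")
```

Starting from Wittgenstein's view, the traversal reaches Russell's opposing view and, transitively, Whitehead's supporting one, mirroring the example in table 2; the seen set makes the recursion terminate even on cyclic influence networks.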
4. Related Work

In general, we see two major contributions in our work. First, an extensive ontology to represent the philosophical world (and, in particular, philosophical ideas). Second, a Semantic Web oriented system for supporting learners in navigating interactively through philosophical resources. Accordingly, we will describe related research as regards both the formal representation of philosophical domains and their navigation through semantic technologies.
With reference to the first aspect, the most relevant (and, to our knowledge, unique) attempt to systematically formalize the philosophical domain is the one carried out in [30], as part of a digital library project aimed at building a dynamic ontological backbone for the online version of the Stanford Encyclopedia of Philosophy (SEP). Compared to our approach, this work is less focused on knowledge modeling and more targeted at finding useful information extraction techniques, which could benefit from the vast expert-reviewed SEP. For example, in their case the idea sub-branch of the ontology is populated according to the "semantic relevance" of ideas (based on word co-occurrence), instead of trying to model a hierarchy of types. Therefore, we see the two approaches as fundamentally complementary and likely to be used together in future work. As various publications suggest, the humanities computing community has recently become more interested in the usage of ontologies for facilitating data representation and exchange [31,32]. In this context, the Discovery project [26] stands out for its explicit goal of creating an ontology-centered infrastructure usable by philosophers for exchanging data on the Semantic Web. In particular, the authors plan to use a "network of ontologies" [33]. This seems very promising, but unfortunately at the time of writing there is still no publicly available ontology for the philosophical domain. We plan to investigate how our results compare with theirs as soon as they are made available. Regarding the formalization of 'abstract' ideas (and especially philosophical ideas), we found little evidence of relevant work in the knowledge representation research literature. Although models such as WordNet [34] and Cyc [35] have philosophy-related concepts in their knowledge bases, they present them in hierarchies that are either too flat (e.g., everything is a subclass of "doctrine") or not complex enough to support any navigation mechanism.
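The relevance-by-co-occurrence strategy attributed to the SEP project above can be illustrated with a minimal sketch: ideas that appear together in the same entries rank as more semantically related. The entries and terms below are invented, and this is not the actual algorithm used in [30].

```python
from collections import Counter
from itertools import combinations

# Each entry is the set of idea terms occurring in one encyclopedia article
entries = [
    ["logicism", "arithmetic", "Frege"],
    ["logicism", "Frege", "sense"],
    ["atomism", "metaphysics"],
]

cooccurrence = Counter()
for terms in entries:
    # count every unordered pair of terms occurring in the same entry
    for a, b in combinations(sorted(set(terms)), 2):
        cooccurrence[(a, b)] += 1

# 'Frege' and 'logicism' co-occur in two entries, so they rank
# as more semantically relevant to each other than any other pair
top_pair, top_count = cooccurrence.most_common(1)[0]
```

Note how this produces a weighted relatedness graph rather than a typed hierarchy, which is exactly the contrast drawn with our type-modeling approach.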
When compared to such models, our ontology proved to be much more suited to the task7. Two noteworthy exceptions must be mentioned here. First, the DnS module of DOLCE [36], which is "intended to provide a framework for representing contexts, methods, norms, theories, situations", and has strongly influenced us. However, our ontology appears to be much more specifically suited to represent philosophical entities, such as schools of thought or problems. In fact, such topics are only marginally treated by DnS, which focuses on the formalization of entities such as plans, laws and regulations (legal objects). Second, the research of Mizoguchi and colleagues: their ontology of 'representations' [37] includes a conceptual model which organizes propositional contents in two groups, product propositions and design propositions. The former "works as specification of the production of something", while the latter "is the product". We have found this distinction very useful and included it in our formalizations. Also, our modeling of philosophical theories (cf. section 2.3) can be compared to their formalization of learning theories in the OMNIBUS ontology [38]. The authors present a "theory-neutral" ontology that aims at expressing the similarities and differences of various instructional and learning theories. Their approach is based on the "working hypothesis that a sharable engineering approximation related to learning can be found in terms of the changes that are taking place in the state of the learners". Consequently, the authors' characterization of learning theories relies on an extensive description of learners' states and other important contextual elements of learning scenarios. In general, this approach seems to be an interesting alternative to ours. In fact, we only attempted to model theories or schools of thought according to their 'theoretical' features, i.e., without referring to their implications in the real world (e.g., the change in a learner's state). This might be a direct consequence of the fact that philosophical theories often have a much less 'pragmatic' attitude, especially when compared to learning and instructional theories. However, we think that this issue necessitates further research, so we plan to investigate it in future work. Finally, our formalization of fields of study (cf. section 2.4) can be related to the various works on subject classification in digital libraries. Although we come from a different perspective, we acknowledge that approaches such as the mereotopological one [39] could be well suited also for the philosophical domain.

The second contribution of our work regards the semantic navigation component of the PhiloSurfical tool. In this respect, the most relevant research work it can be compared to is Story Fountain [40]. This is an ontology-based application developed to support a community in the exploration of digital resources, specifically stories. Users ask questions about the domain (Bletchley Park, a Second World War heritage site) and receive as answers explicatory paths along the many annotated stories in the knowledge base. Our pathway-centered approach has been largely inspired by Story Fountain, although our application domain - philosophy - required a radical change of perspective. In fact, while Mulholland and colleagues create pathways that focus on stories' protagonists (e.g., an army colonel) and objects (e.g., a pistol), in our scenario those types of entities are often secondary. The paths we are dealing with usually center around abstract ideas, such as philosophical theories and problems.

7 The reader can find a more detailed analysis and comparison of the philosophical concepts in Cyc and other foundational ontologies in chapter 3 of the first author's PhD thesis [29].
Finally, it is worth mentioning recent research aimed at facilitating the semantic navigation of digital resource repositories, as it complements our learning-pathways approach. Faceted browsing systems usually provide generic architectures that aim at letting users explore potentially unfamiliar domains in a gradual and incremental manner. These approaches, inspired by the theory of faceted classification [41], have been tested in various humanities domains, such as classical music [42], visual arts [43], cultural heritage [44] and literature [45]. In general, by means of highly interactive visualization mechanisms controlled by the user's selection of facets, the structure of a domain can be disclosed in a very intuitive manner. The main limitation of these systems, in our opinion, is linked to their very best feature. That is, being largely non-domain-specific and allowing navigation based on 'small' and 'incremental' steps (i.e., selection of views/facets), their navigation mechanisms can hardly be tailored to specific learners' needs. For instance, it would not be possible to construct a 'view' which organizes resources in a way that mimics, or at least supports, the traditional ways a discipline is presented or taught. In conclusion, our narrative-inspired approach seems to be better targeted at an educational scenario.
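The 'small incremental steps' of faceted browsing can be sketched as successive filters over an item set, where each facet selection narrows the current results. The facet names and items below are illustrative assumptions.

```python
# A toy collection with two facets: period and area
items = [
    {"title": "Tractatus", "period": "20th century", "area": "logic"},
    {"title": "Begriffsschrift", "period": "19th century", "area": "logic"},
    {"title": "The Republic", "period": "antiquity", "area": "ethics"},
]

def select_facet(results, facet, value):
    """One incremental step: keep only items matching the chosen facet value."""
    return [i for i in results if i[facet] == value]

step1 = select_facet(items, "area", "logic")           # first narrowing step
step2 = select_facet(step1, "period", "20th century")  # second narrowing step
```

Each step is generic and domain-independent, which illustrates the limitation discussed above: the filtering mechanism itself encodes no disciplinary ordering of the material, unlike a learning pathway.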
5. Conclusion

In this chapter we summarized our work on the PhiloSurfical tool. This is an application built to support students in understanding a philosophical text, through contextual navigation mechanisms based on semantic technologies. The application is being prototyped with Wittgenstein's Tractatus Logico-Philosophicus, using a philosophical ontology we created and instantiated with the relevant data. The ontology modeling process has proved to be crucial to the aim of providing valuable and
non-naïve navigation mechanisms. In particular, we showed how the usage of solid modeling schemas can serve to resolve ambiguities in the philosophical domain, and possibly to tidy up poorly or wrongly structured data in the quickly improving Semantic Web. We are currently analyzing the data obtained from two separate evaluations, one of the application and one of the ontology. We plan to make these results available in a separate publication.
Acknowledgments

This work has been carried out under a grant provided by the EU-funded Knowledge Web project. We would like to thank all the people who provided feedback and support during the various stages of the research. In particular (in chronological order): Andrea Bernardi, Keith Frankish, Gordon Rugg, Marian Petre, Riichiro Mizoguchi and Martin Doerr.
References

[1] E. Duval, Learning Technology Standardization: Making Sense of it All. International Journal on Computer Science and Information Systems 1 (2004), 33-43.
[2] D. Gasevic, J. Jovanović and V. Devedzic, Enhancing Learning Object Content on the Semantic Web. IEEE International Conference on Advanced Learning Technologies (ICALT'04), 2004.
[3] International Workshop on Applications of Semantic Web technologies for E-Learning (SW-EL), official website: http://compsci.wssu.edu/iis/swel/index.html, retrieved on July 2009.
[4] T. R. Gruber, A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition 5 (1993), 199-220.
[5] D. Gasevic, J. Jovanović, V. Devedžić and M. Bošković, Ontologies for Reusing Learning Object Content. International Workshop on Applications of Semantic Web Technologies for E-Learning (SW-EL), ICALT-05 conference, 2005.
[6] J. Brase and W. Nejdl, Ontologies and metadata for eLearning, in Handbook on Ontologies (eds S. Staab & R. Studer), Springer-Verlag, 2004, 555-574.
[7] L. Stojanovic, S. Staab and R. Studer, Elearning Based on the Semantic Web. WebNet2001, 2001.
[8] K. M. Brooks, Do Story Agents Use Rocking Chairs? The Theory and Implementation of One Model for Computational Narrative. Fourth ACM international conference on Multimedia, 1997.
[9] A. Gangemi, Ontology Design Patterns for Semantic Web Content, International Semantic Web Conference, ISWC'05, 2005.
[10] L. Wittgenstein, Tractatus Logico-Philosophicus, Routledge & Kegan Paul, 1922.
[11] J. Bruner, The Process of Education, Harvard University Press, Cambridge, MA, 1960.
[12] J.S. Brown, A. Collins and P. Duguid, Situated Cognition and the Culture of Learning, Educational Researcher 18 (1989), 32-42.
[13] D. Laurillard, Rethinking University Teaching, Routledge, 1993.
[14] G. Kemerling, Teaching Philosophy on the Internet, 20th World Congress of Philosophy, 1998.
[15] W. Mays, The Teaching of Philosophy, 13th University Conference, 1965.
[16] T. Kasachkoff, Teaching Philosophy: Theoretical Reflections and Practical Suggestions, Littlefield Publishers, 2004.
[17] A. Carusi, Taking Philosophical Dialogue Online. Discourse: Learning and Teaching in Philosophical and Religious Studies 3 (2003), 95-156.
[18] M. Doerr, The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata. AI Magazine 24 (2003), 75-92.
[19] N. Crofts, M. Doerr, T. Gill, S. Stead and M. Stiff, CIDOC-CRM Version 4.2 - Reference Document, 2005.
[20] E. Motta, Reusable Components for Knowledge Modelling - Principles and Case Studies in Parametric Design Problem Solving, IOS Press, The Netherlands, 1999.
[21] M. Pasin and E. Motta, Ontological Requirements for Annotation and Navigation of Philosophical Resources. To appear in Synthese Special Issue: Representing Philosophy, 2009.
[22] M. Crampes and S. Ranwez, Ontology-Supported and Ontology-Driven Conceptual Navigation on the World Wide Web, 11th ACM Hypertext Conference, 2000.
[23] S. Chatman, Story and Discourse, Cornell University Press, 1978.
[24] M. Pasin and E. Motta, Semantic Learning Narratives, International Workshop on Applications of Semantic Web Technologies for E-Learning (SW-EL), KCAP'05, 2005.
[25] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak and Z. Ives, DBpedia: A Nucleus for a Web of Open Data, 6th International Semantic Web Conference (ISWC'07), 2007.
[26] Discovery project, official website: http://www.discovery-project.eu/home.html, retrieved on July 2009.
[27] LispWorks - The integrated cross-platform development tool for ANSI Common Lisp, http://www.lispworks.com/, retrieved on July 2008.
[28] E. Weitz, HUNCHENTOOT - The Common Lisp web server formerly known as TBNL, http://weitz.de/hunchentoot/, retrieved on July 2008.
[29] M. Pasin, Ontological Requirements for Supporting Smart Navigation of Philosophical Resources. Ph.D. Thesis, Knowledge Media Institute, The Open University, 2009.
[30] M. Niepert, C. Buckner and C. Allen, A Dynamic Ontology for a Dynamic Reference Work, Joint Conference on Digital Libraries (JCDL'07), 2007.
[31] G. Nagypál, R. Deswart and J. Oosthoek, Applying the Semantic Web: The Vicodi Experience in Creating Visual Contextualization. Literary and Linguistic Computing 20 (2005), 327-349.
[32] J. M. Vieira and A. Ciula, Implementing an RDF/OWL Ontology on Henry the III Fine Rolls, OWLED, European Semantic Web Conference ESWC'07, 2007.
[33] M. Nucci, S. David, D. Hahn and M. Barbera, Talia: A Framework for Philosophy Scholars, 4th Italian Semantic Web Workshop (SWAP'07), 2007.
[34] C. Fellbaum (ed.), WordNet: An Electronic Lexical Database, MIT Press, 1998.
[35] D. B. Lenat and R. V. Guha, Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project, Addison-Wesley, Boston, Massachusetts, 1990.
[36] A. Gangemi and P. Mika, Understanding the Semantic Web Through Descriptions and Situations, International Conference on Ontologies, Databases and Applications of Semantics (ODBASE), 2003.
[37] R. Mizoguchi, Tutorial on Ontological Engineering - Part 3: Advanced Course of Ontological Engineering, New Generation Computing 22 (2004), 198-220.
[38] Y. Hayashi, J. Bourdeau and R. Mizoguchi, Using Ontological Engineering to Organize Learning/Instructional Theories and Build a Theory-Aware Authoring System, International Journal of Artificial Intelligence in Education 18 (2008).
[39] C. Welty and J. Jenkins, Formal Ontology for Subject. Journal of Data and Knowledge Engineering 31 (1999), 155-181.
[40] P. Mulholland, T. Collins and Z. Zdrahal, Story Fountain: Intelligent Support for Story Research and Exploration, 9th International Conference on Intelligent User Interfaces, 2004.
[41] S.R. Ranganathan, Elements of Library Classification, South Asia Books, 1990.
[42] m.c. schraefel, D. Smith, A. Russel, A. Owens, C. Harris and M. Wilson, The mSpace Classical Music Explorer: Improving Access to Classical Music for Real People, V MusicNetwork Open Workshop: Integration of Music in Multimedia Applications, 2005.
[43] M. Hildebrand, J. van Ossenbruggen and L. Hardman, /facet: A Browser for Heterogeneous Semantic Web Repositories, International Semantic Web Conference (ISWC'06), 2006.
[44] E. Hyvönen, T. Ruotsalo, T. Haggstrom, M. Salminen, M. Junnila, M. Virkkila, M. Haaramo, T. Kauppinen, E. Makela and K. Viljanen, CultureSampo - Finnish Culture on the Semantic Web: The Vision and First Results, Information Technology for the Virtual Museum, LIT Verlag, 2008.
[45] B. Nowviskie, Collex: Facets, Folksonomy, and Fashioning the Remixable Web, Digital Humanities conference (DH'07), 2007.
Semantic Web Technologies for e-Learning
D. Dicheva et al. (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-062-9-219
CHAPTER 12
Comparative Evaluation of ASPL, Semantic Platform for e-Learning Martin DZBOR 1 and Dnyanesh G. RAJPATHAK Knowledge Media Institute, The Open University, Milton Keynes, UK
Abstract. The work reported in this chapter focuses on the learner's interaction with resources on the Semantic Web; in particular, with the semi-structured data that can be exposed to the user via domain-specific inference templates. We assessed the capability of the service-based ASPL-v2 framework to use information from multiple sources, and analyzed it in terms of assisting users with interpreting connections in the academic domain; for example, identifying leading experts, recognizing communities of practice, or associating research topics and issues with particular publication outlets. The outcomes of a user-based study are reported, with our semantic platform found to outperform other similar tools - including the generic search engine Ask and the semi-specialized Google Scholar.
Keywords. e-Learning, ASPL, knowledge access, knowledge exploration, tool evaluation, comparative analysis
Introduction

Education in general is taking advantage of the maturing Web to provide learning resources efficiently and effectively, and to tailor them to the needs of a learner. However, education has always relied on a strong interpretative component. In addition to recalling information from databases, searching document repositories or retrieving from information warehouses, education also relies on connecting information and resources - both at the level of individual learners and at a group level. Interpretation can be seen as the ability to link otherwise independent information, to make statements about it, and to make inferences from the available knowledge. Education is thus an interactive activity focusing on the learners and expecting their active participation. Our research has been motivated by the prevailing claim in many research papers arguing about so-called 'information overload' and pointing to the ever-increasing size of the information repositories learners need to cope with. We aimed to challenge1 this focus on data overload by returning to the early pioneering work in the domain of data and information interpretation. In this chapter we use the term 'interpretation' in the sense of data analysis and synthesis - that is, the ability to make sense of data chunks by meaningfully and transparently relating them to other data chunks. For instance, as early as the 1940s Bush [4] argued there was more information published than it was possible for humans to process. This processing bottleneck has not been overcome yet (and can hardly ever be overcome), but in the context of learning it is often addressed by teaching the learners smarter forms of information processing - for instance, relating one chunk of knowledge to another and then re-purposing this 'semantic' connection in new situations. In the pedagogical domain, similar arguments were proposed by Bloom in the 1960s [2]: information processing goes beyond recall, and advanced cognitive processes, such as synthesis and judgment, lead to longer-lasting knowledge. These processes share one feature: they are relational; i.e., they comprise associations between separate pieces of information. Despite the advances in eLearning, the support for associative thinking is not as mature and widespread as one would expect half a century after these issues were first identified. The reasons may be numerous; according to [11], (i) it is hard to formally capture all subtleties of a learning task in a tutoring system, and (ii) learner modeling is only approximate, so tutoring systems tend to be over-constrained closed worlds. Given these constraints, it is more valuable for the learner to see what can be done with a given piece of knowledge, rather than merely following a prescribed workflow for a learning task. In this chapter we discuss the performance of a semantic platform for learning we developed during the KnowledgeWeb project between 2004 and 2007.

1 Corresponding Author: Martin Dzbor, Knowledge Media Institute, The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK; E-mail: [email protected].
The main principle of our approach is to link the process of associative learning to the activity of web browsing. Browsing the web involves two main tasks: finding the right resource and making sense of its content. A significant amount of effort has gone into supporting the task of finding web resources, either by means of 'standard' information retrieval mechanisms or by means of semantics-enhanced search [13, 16]. Less attention has been paid to the second problem, supporting the interpretation of web pages. Annotation approaches [14, 17] allow users to associate meta-information with web resources, which can then be used to facilitate their interpretation. However, while annotation per se is useful to support some shared interpretation, it is nevertheless very limited. Annotation is normally carried out manually, which means that the quality of the sensemaking support depends on the willingness of stakeholders to provide annotations and their ability to provide valuable information. This is even more of a problem if a formal approach to annotation is assumed, based on Semantic Web technology [1]. In this chapter we start by linking our development work to the notions of exploratory learning of data analysis and synthesis skills on the Web in section 2. Part of this motivation is also a summary of one of our formative studies looking into what skills enable learners to perform better in complex knowledge synthesis tasks (see section 2.1), and a summary of the functional features of our platform ASPL (section 2.2). Then in section 3 we present the design of our user study alongside the tasks used in the study and their explanation. Section 4 presents and
M. Dzbor and D.G. Rajpathak / Comparative Evaluation of ASPL, Semantic Platform
221
discusses the main findings from our study in terms of the perceived usefulness of the three analyzed tools (ASPL, Google Scholar and Ask!) in four complementary tasks. In section 4 we try to draw generalized lessons from the studied tasks. Section 5 gives information on some of the related work, especially focusing on the theoretical foundations of exploratory learning skills. Finally, the chapter closes with a brief conclusion and a sketch of potential future challenges.
1. Advanced Semantic Platform for Learning

The goal of our activities in the KnowledgeWeb project 2 was to provide a platform supporting the delivery, combination and presentation of (a) the educational content stored in the internal learning material repository REASE 3, i.e., a portal where learning resources can be uploaded and annotated by their authors, and (b) other learning materials available widely on the Web, facilitating serendipitous learning by analyzing the existing materials, e.g., in the form of scientific publications, communities of practice, etc. ASPL (Advanced Semantic Platform for Learning) supports the user in interpreting texts related to Semantic Web Studies. This version of ASPL includes the Magpie semantic browser framework [8], which was chosen to manage the costs of developing ASPL and to strike an effective balance between research and implementation work. Magpie has been designed at The Open University as a generic platform on which more sophisticated and specialized infrastructures and applications can be built. ASPL was originally designed and prototyped as a Magpie-based application (see Figure 1 for reference), and it has been available as a plug-in for a number of web browsers, including Internet Explorer and Mozilla/Firefox. It operates by making use of domain ontologies to dynamically annotate web pages the user encounters while browsing or searching the Web. Users can make use of the ASPL web services, which have been associated with classes in the domain ontology, to access a range of relevant resources and activities. ASPL interacts with the user by highlighting the entities and concepts in web pages. These lexical keywords are derived and serialized from domain ontologies and transformed into a visual form to reduce the user’s ‘entry costs’.

2 http://knowledgeweb.semanticweb.org
3 REASE is one of the outcomes of the KnowledgeWeb project’s educational area; it stands for Repository of the European Association for Semantic Web Education, and is available at http://rease.semanticweb.org

1.1. Summary of the Initial Formative Analysis of ASPL

The first phase of the platform development concluded in 2005 by evaluating the application built on top of the platform. The evaluation was formative; i.e., we intended to identify the gaps in the current platform, which would help us to focus on and elaborate specific strengths of our approach [7]. In our previous publications [7, 9] we presented the theoretical underpinning of the process we intended to pursue to augment the prototype of the advanced semantic platform for learning (ASPL-v2), and here we sum up that study. In a nutshell, the first formative analysis of the design decisions about the future semantic platform for learning took place in November 2005 at four universities in Europe, referred to below as Sites 1 to 4. At each site we worked with 10 local students, thus having 40 participants in total. Two groups (Site 1 and Site 2) comprised undergraduate students; the rest were PhD candidates recruited from a range of IT-related disciplines (e.g., computer science, database systems, software engineering, or data modelling). Initially, we interviewed the participants to learn about their background and formal research skills training, especially the extent of their training in literature analysis and reviewing. This pre-experiment interviewing was used later to interpret the objective findings.
Figure 1. A screenshot showing a Magpie-enhanced web browser and a web page annotated using the lexicon derived for the Semantic Web domain; pointer (1) shows a user-selected ontology with several abstract categories of identifiable concepts (highlighted in different colours), and pointer (2) shows a sample menu with semantic services associated with a particular category of concepts.
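As a rough illustration of the ontology-driven highlighting described above, the following sketch wraps known lexicon terms in tagged spans. The lexicon entries, category names and CSS class are invented for illustration only; Magpie's actual implementation differs.

```python
import re

# Hypothetical lexicon serialized from a domain ontology: each surface form
# maps to an ontological category; in the real UI each category gets a colour.
LEXICON = {
    "ontology": "sw:Concept",
    "RDF": "sw:Technology",
    "Ora Lassila": "sw:Researcher",
}

def annotate(text):
    """Wrap every known lexicon term found in the text in a tagged span."""
    # Match longer terms first so multi-word names are not split up.
    for term, category in sorted(LEXICON.items(), key=lambda kv: -len(kv[0])):
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        text = pattern.sub(
            lambda m, c=category: f'<span class="magpie" data-cat="{c}">{m.group(0)}</span>',
            text,
        )
    return text

print(annotate("Ora Lassila co-authored early RDF specifications."))
```

In a browser plug-in the spans would then be styled per category and wired to the semantic-service menus shown in Figure 1.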
We aimed (i) to see how semantic annotations and services are used by students during an information-gathering task, and (ii) to identify ways of supporting more complex parts of the task of preparing a literature review using the Semantic Web. Participants in Group A used Google and thus formed our control group; participants in Group B used ASPL-v1.
The participants were asked to compile ten references which, in their view, addressed the task they were given. Their interactions with ASPL-v1 were recorded, and responses were marked by domain experts based on their subjectively perceived suitability for a literature review. A participant’s score was the sum of all the marks, and is shown in Table 1. The ‘Mean quality’ reflects only the marks from the experts, whereas the ‘Overall mean score’ also takes into account time-limit bonuses and penalties. The ‘quality’ measure was acquired by having the domain experts (here, the three local tutors and experiment facilitators) assign a mark to every resource proposed by a participant, complete with rationale and justification. A partial mark was given if answers were partially correct or complete. The overall score was adjusted so that if the task was finished on time, the overall score equalled the quality; if the allocated time was exceeded, an extra ‘penalty’ of one mark per 2 minutes was introduced. The means given in Table 1 took into account the answer sheets from the participants.

Table 1. Overall and per-location results of ASPL-v1 evaluation
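The scoring rule just described can be sketched as follows. Whether partial 2-minute blocks of overtime are penalized is not stated in the text, so rounding up is an assumption here:

```python
import math

def overall_score(quality_marks, time_taken_min, time_limit_min):
    """Sum of expert marks, minus one mark per 2 minutes of overtime
    (partial 2-minute blocks rounded up - an assumption, see above)."""
    quality = sum(quality_marks)
    overtime = max(0, time_taken_min - time_limit_min)
    return quality - math.ceil(overtime / 2)

# Ten marked references (full and partial marks), finished 4 minutes late:
marks = [1, 1, 0.5, 1, 0, 1, 1, 0.5, 1, 1]
print(overall_score(marks, time_taken_min=34, time_limit_min=30))  # 6.0
```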
The variance in performance was not statistically significant for the whole population (at p=5%), and only Site 3 showed a performance rise for the ASPL users. As we suspected, skilled users of Web search engines did not find much value in Magpie-annotated web pages when these were used solely for retrieving, rather than discriminating among, results in a manner similar to a generic search engine. The outlier, Site 3, however, comprised students with prior tuition in writing literature reviews. They relied on semantic annotations more frequently and tried to interpret and explore the retrievals through Magpie services. The annotations helped them filter and more cleverly navigate to the key references among the retrievals. To generalize this qualitatively: our outliers went beyond mere information retrieval, whether semantic or not. They showed expectations for more investigative navigation, which was what we aimed to capture and embed in the ASPL. For example, these outliers expected to obtain some guidance on what to do with partial solutions, assistance with query changes and re-formulations, etc. A flat list of semantically indiscriminate records provided in the initial demonstrator of ASPL was clearly insufficient in this respect. In our view, people who managed to replicate some aspects of investigative or exploratory navigation through the retrieved resources gained more from Semantic Web annotations. The challenge for the ASPL re-design was thus to facilitate more of the guided exploration in an otherwise large and ill-defined problem space. If we look at the task our participants tackled, its outcome is essentially an
argument comprising publications and rationale that can be conceptualized into a literature review ‘model’. As our user study showed, there may be many configurations of this conceptual task model. We call these configurations sensemaking paths. A sensemaking path, along which learning tasks can be organized, is one way to elaborate and formalize exploratory navigation. One form of exploration corresponds to the capability to refine an initial problem; in our case it helps reduce the number of retrieved papers if (say) teams of co-authors or broader communities of practice are explored as alternatives accessible from the original query. The benefits of the Semantic Web to a user’s interaction with learning resources lie in an improved interpretation of the navigational choices in a multi-dimensional space. At any point, there are alternative directions to take; each step triggers a specific outcome and has specific strengths and weaknesses. Some steps are about deepening knowledge in one direction (conceptually close to faceted views [19]); other steps are ‘lateral’, i.e., moving from one view of the domain to another (similar to horizontal navigation [3]). Unlike specialized faceted or navigational tools, Semantic Web technology may address both alternatives and offer more robust and flexible support for the learner.

1.2. Functional and Technological Overview of ASPL

The key argument from our past papers is that, for the purposes of learning, the interactions between a user/learner and the learning content amount to more than mere annotation of web pages, retrieval and subsequent browsing of semantic metadata. In our approach, ASPL has capabilities to mine for semantic relationships in a given domain and to wrap complex information retrieval and analysis sequences into semantic ‘queries’. These queries, in turn, not only retrieve the data needed, but also act as models for information synthesis from multiple distributed sources.
In other words, ASPL supports an exploratory, ‘combinatorial’ approach to developing learning strategies and skills by means of interacting with distributed data resources, focusing on creating analytic or synthetic pathways rather than merely retrieving simple data. From the pedagogic point of view, our approach offers one possible realization of what Laurillard calls learning by conversation between the problem and the available data resources [15] – a conversation that facilitates exploration of the relevant knowledge space and creates a rich set of different kinds of associations between the chunks in the explored knowledge space [7]. Thus, for example, our Task 1 can be seen as the first step in reviewing literature for a given domain. By supporting the identification of key personalities based on a user-specified topic, and subsequently by enabling the user to adjust the list of results, the learner acquires good practice in setting up the problem, tackling the task systematically and transparently justifying his or her choices. Our aim (for this first task, for example) is not to teach people that X, Y and Z are experts on topic T! Our aim is to support the acquisition of the skill to obtain such a list and to facilitate the use of skills to explore such a list at a knowledge level.
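The co-author pivot mentioned earlier, moving from a flat list of retrieved papers to the communities behind them, can be sketched as follows. The paper records and author names are invented for illustration:

```python
from collections import Counter
from itertools import combinations

# Hypothetical retrieval results: each record lists a paper's authors.
papers = [
    {"title": "Semantic Web Services", "authors": ["McIlraith", "Son", "Zeng"]},
    {"title": "OWL-S overview",        "authors": ["Martin", "McIlraith"]},
    {"title": "RDF primer",            "authors": ["Lassila", "Swick"]},
    {"title": "Adapting Golog",        "authors": ["McIlraith", "Son"]},
]

def coauthor_pairs(papers):
    """Count co-author pairs: a crude proxy for communities of practice."""
    pairs = Counter()
    for p in papers:
        for a, b in combinations(sorted(p["authors"]), 2):
            pairs[(a, b)] += 1
    return pairs

def narrow_to_community(papers, member):
    """A 'lateral' step: keep only the papers involving a chosen member."""
    return [p["title"] for p in papers if member in p["authors"]]

print(coauthor_pairs(papers).most_common(1))   # [(('McIlraith', 'Son'), 2)]
print(narrow_to_community(papers, "Lassila"))  # ['RDF primer']
```

The first step surfaces a community to pivot to; the second is the refinement that shrinks the result list, rather than a new keyword query.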
Specifically, we implemented two distinct modes of exploratory learning: (i) convergent, ‘spotlight-style’ [6] browsing of semantically enriched resources, and (ii) divergent, ‘serendipitous’ browsing into an open web space [3]. Together, the two helped us to introduce support for analytic and synthetic learning tasks, and the value of our approach has been corroborated in a user-based study – the majority of users liked the way ASPL-v2 helped them to navigate through the problem space in a structured way, which they could mimic and thus develop a skill in analyzing academic data. The notion of exploratory navigation is not novel [3-5, 11]. What our work brings to the state of the art is an open, extensible framework and the capability to achieve exploration by combining specialized services. Information retrieval services are rarely conceived as exploratory; their aim is to access material needed for a given task. The ASPL framework offers several entry points enabling exploration by linking such simple services and interpreting their outcomes semantically. For example, entities such as authors or research areas in ASPL provide the means to get started with a literature review. The role of ASPL is to enrich these ‘gateways’ by offering services that realize the filtering, deepening and lateral dimensions mentioned in the previous section. Applying Semantic Web technology to construct multiple exploratory paths, attending to different aspects of the exploration rather than to the individual nodes of the semantically enriched space, has several side effects. For instance, from the user-experience viewpoint, the application becomes more flexible. A semantically enriched application does not confine its user to one specific activity or role. Another side effect concerns the dynamics of the semantic application. Ontology-driven solutions are often brittle, as they are typically based on closed worlds that enable reasoning solely about the known concepts.
Linking the association discovery to the presentation overcomes this brittleness, and also avoids the knowledge acquisition bottleneck. In order to address this known issue we had to tackle two aspects to prove the viability of a semantic ‘solution’ to the eLearning problem: (a) implementing learning services for the revised ASPL-v2 framework, and (b) carrying out a comparative assessment of ASPL-v2 vis-à-vis other tools. As a proof of concept we used the DBLP data set 4, for which we developed an interactive service front end offering a rich, faceted interface to access the content of DBLP. The ASPL platform is essentially about associating web services with the concepts and instances from a particular ontology of interest to the user. Thus, a suite of web service end points for the DBLP data set was developed, and these were later complemented with a user-friendly front end – a simple, Google-style user interface for querying the content of DBLP and also for making knowledge-level inferences and connection interpretations (see Figure 2). In particular, the following web services were exposed from the DBLP data set as entry points for learning deeper connections within the domain of Semantic Web Studies:
4 DBLP is a well-known database of publications in computer science, focusing on an ongoing semi-automated capture of new publications in the domain (see http://dblp.uni-trier.de).
- Person’s publications and interests … a combination of retrieving publications with an interpretative inference based on the publication keywords, Semantic Web Topic Hierarchy matches, etc.
- Person’s interests … an interpretative inference based on the occurrence of keywords and phrases (also from the Semantic Web Topic Hierarchy)
- Person’s community characteristics … an interpretative inference based on the co-occurrence of co-authors, keywords and themes, allowing generalizations from the individuals (researchers) to their collections (communities)
- Person’s co-authors and communities … a combination of the retrieval function with an interpretative function as described above
- Leading experts on topic … an interpretative inference based on the statistical and semantic analysis of individuals’ profiles and other web-based data
- Main publication outlets for topic … an interpretative inference allowing the user to generalize from single nodes (publications and authors) to their collections (journals, conferences, etc.)

In this chapter we focus on the comparative assessment of ASPL-v2 vis-à-vis other tools that have a similar scope and are commonly used (Ask! and Google Scholar).
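An interpretative inference of the ‘person’s interests’ kind can be sketched as counting publication keywords and generalizing each one up a topic hierarchy. The hierarchy fragment and keywords below are invented and do not reproduce the actual Semantic Web Topic Hierarchy:

```python
from collections import Counter

# Hypothetical fragment of a topic hierarchy (child -> parent).
TOPIC_PARENT = {
    "owl-s": "semantic web services",
    "semantic web services": "semantic web",
    "rdf": "semantic web",
}

def infer_interests(publication_keywords, top_n=2):
    """Count keyword occurrences across a person's publications,
    crediting each keyword's ancestors in the topic hierarchy too."""
    counts = Counter()
    for kw in publication_keywords:
        topic = kw.lower()
        while topic is not None:
            counts[topic] += 1                 # credit this topic...
            topic = TOPIC_PARENT.get(topic)    # ...and walk up to its parent
    return [t for t, _ in counts.most_common(top_n)]

keywords = ["OWL-S", "RDF", "OWL-S", "semantic web services"]
print(infer_interests(keywords))  # ['semantic web', 'semantic web services']
```

Crediting ancestors is what turns a keyword count into a (crude) generalization from single publications to broader interests.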
2. User-based Evaluation of the Semantic Platform for eLearning

The main objectives of this study are: a) to identify whether ASPL-v2 is perceived as a useful tool for performing functions related to learning the practices needed in the academic domain; b) to compare and contrast the perceived performance of ASPL-v2 with sample tools from the domain, e.g., Google Scholar and Ask!; and finally, c) to identify the perceived scope of usefulness of ASPL/DBLP++ vis-à-vis similar tools, which may help users to decide when tools such as ASPL are appropriate and when other tools might be more suitable. Comparing tools can be difficult, particularly when some of them work in a specific domain while others are generic and can therefore be used to find information in any domain. We overcame this difficulty by designing five different tasks for the users to carry out. These tasks allowed us to evaluate the performance of the three tools on the same basis. In the study described next the users work with the three tested tools – ASPL-v2 (i.e., its lightweight web-based user interface shown in Figure 2), Google Scholar 5, and Ask! 6 – and they are asked to perform the study tasks using the three tools in a different and variable order.

5 Accessible from http://scholar.google.com
6 Accessible from http://uk.ask.com/?o=312&l=dir

2.1. User Study Setup

In total, 20 postgraduate students from two different universities and altogether four faculties were approached to participate in the user evaluation study. They were
selected in such a way that they represented different levels of skills, research background and research expertise, but remained ‘learners’. In particular, we had 10 research students in their first year with little experience in structuring and retrieving academic information, 6 more senior research students with a medium level of skill in dealing with and analyzing academic data, and 4 research assistants with higher levels of the same skill. The participants came from the fields of mathematics, new media, semiotics, electrical engineering and artificial intelligence.
Figure 2. A screenshot showing the web-based user interface to the ASPL web services, designed to offer a visual interaction on a par with the Google Scholar or Ask! engines (available from http://neonproject.org/aspl-v2).
The evaluation study consisted of five tasks the users performed with a subset of the tools: ASPL/DBLP++ 7, Google Scholar, and the Ask! engine.

7 Accessible from http://neon-project.org/aspl-v2

We asked the users to perform all the tasks involved in this evaluation study independently; i.e., to try not to let their impression of the first tool bias their assessment of the performance of the second and third. To counter this potential confounding of the study, we varied the order in which the tools were shown to the users. During the study, the participants could ask for assistance from a facilitator if any clarification was needed. The evaluation session lasted 80 minutes, starting with a short period for familiarizing oneself with the evaluation material and tools, helped by the facilitator, followed by performing all five tasks involved in the evaluation study. To begin with, the facilitator explained the reasons for carrying out the user study by giving some examples. Then the facilitator introduced the
tools and demonstrated the key functional properties that we were interested in evaluating as part of this user evaluation study. The tasks had a fixed duration, but in case participants ran out of time, they were asked to summarize the reasons they believed had hindered them. There was no explicit reward for an early finish, nor was there any penalty for an unfinished task. The key requirement for each task was underlined, and this specified the material that needed to be retrieved in the task. The reason we provided this information was to avoid confounding the study by people trying to find hidden catches in the task statements or spending time interpreting the natural-language sentences.

2.2. Overview of Evaluation Tasks

Next we summarize the setup of the evaluation study and the tasks designed for the evaluation. For each task we also give a rationale for including it in the study.

2.2.1. Task 1: Identifying Expertise on a Topic

In this task, the user was required to retrieve the names of the leading experts active in a specific research area. The users interacted with the tools in the following order: Google Scholar, ASPL/DBLP++, and Ask!. To retrieve the information described above they were expected to use the query below:

List 5 top researchers whose research work is closely associated with the research topic Semantic Web Services. Please explain the reasons why the specific researchers are included in the list.

The main purpose of this task was to evaluate how well the three tools handled a domain-specific search for a well-defined query. In particular for ASPL, this allowed us to evaluate whether the ‘leading experts on topic’ service, which has been designed to interpret the results in addition to merely listing them, satisfied the user needs.
The ‘leading experts on topic’ service took into account semantically interpreted annotations of the publications in our DBLP++ store and combined these with some statistical evidence to hypothesize the leading roles of particular individuals (authors) with regard to a particular topic. The input to this task was a phrase representing the research topic in question, as underlined above.

2.2.2. Task 2: From Individuals to Communities

In Task 1, the users identified the key experts, active and leading personalities in the research area of Semantic Web Services. Here we asked them to assume, for the purpose of this task, that Sheila McIlraith and Ora Lassila were two experts included in that list. In Task 2, the users performed the following activity using the tools in the following order: Ask!, ASPL/DBLP++, and Google Scholar:

(i) For both researchers identify their areas of expertise and research interests between 1990-2003 and prior to the current date. (ii) Having identified the expertise and research interests of the two researchers, please generalize these areas of expertise so that we
can describe the research communities these experts belong to. Please list these ‘community descriptions’ and state whether (in your opinion) it is clear from the tool output if these past research activities are related to Semantic Web Services research.

An important function of ASPL is to assist its users in exploring areas of expertise, not only by means of listing the individuals fitting the keywords. We aimed to also support analytic and synthetic processes whereby the user could infer the communities of interest for a particular individual, justify why a particular community and/or individual fits within a particular research interest, etc. Thus, Activity 2(i) was included to see how the three tools handle a domain-specific query to find research areas of the named individuals contextualized with a given temporal modifier. For ASPL in particular, it allowed us to evaluate the performance of the services ‘person’s publications & interests’ and ‘person’s interests’ that evolved from the initial simple listing of the DBLP records. The second part of Task 2 required the users to:

(iii) […] suggest 3-5 concrete publications of both experts that cover their top-ranked past research interests.

The rationale for this task is that a learner new to the domain tries to understand the scope of a research community and its relationships to other research communities. In order to evaluate how the tools support such users, Activity 2(ii) was included, allowing us to see how these tools handle this query-broadening, synthetic scenario. The context here is to take into account the research interests of the named researchers and their commonly occurring co-authors, and to try to generalize this into knowledge about communities of practice.
Our aim was to evaluate the performance of the following two services: ‘person’s community characteristics’ and ‘person’s co-authors and communities’, which are also an outcome of the ASPL re-engineering. Finally, Activity 2(iii) was included to get back to the core capability of the ASPL and DBLP framework – the retrieval of the actual publications for a given individual and topic. Here we looked for specific, detailed information: not merely a title, but something more like a bibliographic reference.

2.2.3. Task 3: Bibliographic Lists

In the previous task the users identified the top publications of the experts active in the research area of Semantic Web Services. Next we asked them to prepare a full list of those publications retrieved in Activity 2(iii), e.g., for the purpose of a literature review:

For all the publications that are collected in Task 2iii, please describe in detail the relevant places (incl. the name of a conference, workshop, or journal) to find the collected publications, along with complete bibliographical information of these publications.

As above, the purpose of this task was to test the tools’ abilities to ground their search results in additional information that can be readily reused, e.g., in
the literature review. This functionality was partially present in the earlier version of ASPL, so this task was mainly to see whether the redesigned ASPL maintained one of its key original capabilities.

2.2.4. Task 4: Constraining the Results of a Query

One important part of a critical literature review on a certain topic is the need to drill into the in-depth details of a research topic. As part of a literature review the users may need, for example, to compare and contrast different viewpoints that exist on a particular issue, say Ontology Alignment. For this purpose, the users were required to find only those publications about Ontology Alignment that are technical in nature (e.g., white papers or technical reports). To perform Task 4 they used the tools in the following order: Google Scholar, ASPL/DBLP, and Ask!:

Find up to three publications on the research topic Ontology Alignment, which provide detailed technical information on the topic.

In this task, a publication is considered technical if it contains such aspects as definitions, schemas, architectures, and the like. The motivation for this activity is to see what support the tools offer in terms of constraining the search boundaries, and to what extent the applications are aware of the different purposes that publications and papers may serve.
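The type-constrained filtering that Task 4 probes for can be sketched minimally as follows. The records and the set of ‘technical’ types are invented for illustration:

```python
# Hypothetical publication records with a publication-type facet.
pubs = [
    {"title": "Ontology Alignment API", "type": "technical report"},
    {"title": "Alignment survey",       "type": "journal article"},
    {"title": "Mapping architecture",   "type": "white paper"},
]

TECHNICAL_TYPES = {"technical report", "white paper"}

def constrain(pubs, wanted_types, limit=3):
    """Keep at most `limit` publications whose type facet matches."""
    hits = [p["title"] for p in pubs if p["type"] in wanted_types]
    return hits[:limit]

print(constrain(pubs, TECHNICAL_TYPES))
```

A generic keyword engine has no such facet to filter on, which is precisely the difference this task was designed to expose.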
3. Data Analysis and Qualitative Feedback from Participants

Having completed the user study, we analyzed the comments raised by the users, reflecting their experiences of interacting with all three tools in the tasks that involved different types of searches (as described briefly in the previous section). In order to have a benchmark, we carried out the experimental tasks ourselves and in collaboration with 2 additional research students. The purpose of these pre-experiments was to establish benchmark time intervals in which the tasks described in section 2.2 could be accomplished to an acceptable standard. These benchmarks were later used to decide whether a particular participant was ‘successful’ or not – we did not stop participants from concluding their task when the time expired; exceeding the benchmark was merely noted as an indication that the tools had not substantially improved, or could even have hindered, the users’ efficiency. Obviously, with no time limits, the tasks could eventually be accomplished, but we were also interested in observing efficiency, and for that some time benchmarks were needed. Prior to the study, we asked the users to indicate their level of experience (comfort) with the tools used, and explained to them that a mark for their answer would be given partly for a correct answer and partly for the justification of why and how the selection of answers was made (their transparency). Thus, participants filled in sheets with the answers to the respective tasks, and for each task filled in a series of structured questions where they were asked to record their perception of the tools’ performance, simplicity, strengths and weaknesses. The answers to each task were recorded while doing the task; the perception of usefulness questions
were asked right after each task was concluded. Each task finished with a choice question where participants were asked to choose the tool they would like to use in the future to carry out similar tasks – this response is also the basis of our wrap-ups for each of the tasks in the following sub-sections. The main aims of our analysis were: a) to see if the users managed to perform the tasks and the associated activities within a benchmark time; b) to compare and contrast the performance of all three tools, evaluating which is appropriate for a specific type of data retrieval and synthesis; c) to analyze whether any specific tool performed better than the other tools in general; and finally, d) based on this analysis, to determine the scope of ASPL/DBLP. In other words, we provide an indication about ASPL/DBLP which will help its users to determine the types of searching services that can be successfully handled by ASPL/DBLP when compared with more generic search tools.

3.1. Analysis of Task 1

The limit for performing this task using all three tools was 10 minutes. All 20 participants successfully managed to complete Task 1 within the allocated time, without having to extend the duration. Moreover, all the users were satisfied with the description of the task provided in the user evaluation material, and therefore no further assistance was required. 90% of the users found ASPL/DBLP useful for this kind of task. ASPL successfully managed to retrieve the top 5 experts on Semantic Web Services. In some cases the users decided to cross-check the information retrieved by ASPL, and therefore they changed the time interval given in the task to retrieve the names of the experts for a different time period. The users changed the time interval to 1980-2006 to get new results.
They were particularly happy with the results they received after changing the time interval and confirmed that ASPL retrieved the same set of results in both time intervals. One of the main reasons why ASPL successfully handled the change in the time duration was that the retrieval function embedded in ASPL was robust enough to handle scenarios where knowledge may be evolving and subject to temporal context. This can be particularly useful, as we will show later, when the temporal context becomes a key part of a query. Different users provided us with different names; thus the assessment of correctness related to the degree of overlap between what the participants provided and what we obtained by speaking to the researchers in that community. To account for different views, our ‘benchmark list’ contained 15 people, whereas participants were asked to list (and justify) 5 people 8 they found using a particular tool. In this analysis we have not grouped participants by their level of expertise (unlike in section 2.1) due to a more homogeneous and smaller population – yet, such an analysis across different types of user may reveal further patterns. For a more formal statistical analysis of performance across groups of users and/or tools (such as a chi-squared test), one would ideally need a larger population.
8 Note that we did not ask for a list of any 5 people, but a list of those 5 people they could get using a particular tested tool.
232
M. Dzbor and D.G. Rajpathak / Comparative Evaluation of ASPL, Semantic Platform
In addition to direct answers to the queries and subjective perception of the tools’ usefulness, we also observed and recorded how people achieved their goals; that is, how many queries they posted, how much query re-formulation was needed, and how they captured partial results and asked new queries. These observations were used to interpret our findings and suggest reasons for specific situations. When compared with ASPL, only 10% of the users perceived Google Scholar as a more useful tool in this task, finding it simpler and more straightforward to carry out this type of retrieval with Google. One important observation was that Google Scholar did not take into account the publishing dates associated with the publications, so the users had to look elsewhere (e.g., Google itself or the content of the link) to decide which individuals were active in a specific period. As a result, they spent more time looking for the correct information by performing several searches in generic search engines. The reason the users stressed publication dates as one of the indicators was that if a certain researcher published a higher number of publications during a specific period, then s/he can be considered more active in that period. In some other cases, the users also indicated that, in contrast with ASPL, the ranking mechanism used by Google Scholar was less transparent, because some important publications authored by the key researchers were placed lower in the list of results. Finally, the performance of Ask! was not satisfactory at all. Typically, the users stated that they had to look in multiple places to realize who might be the top experts in the given domain. When compared specifically with Google Scholar, almost all users stated that the search with Ask!
was less intuitive – mainly because the search with Ask failed to find the publications and researchers, while even a simple Google Scholar search was more accurate and quicker. Ask! failed to provide any indication of who the top experts in a specific research area were, as it merely pointed users to Citeseer9. It was difficult for the users to judge, based on the Citeseer entries, whether certain researchers had more influence than others in a certain time interval. The service ‘leading experts on topic’, embedded under the tab ‘search in topic domain’ in ASPL, was a key factor in why this tool outperformed the other two. This service not only helped the users retrieve the required results, but also saved their time, because they did not have to look elsewhere. Because the users could set a specific time period to work with, it helped them to identify the leading researchers in Semantic Web Services without having to process the time interval part of a query. Figure 3 shows a sample output produced by ASPL for the service ‘leading experts on topic’. About 10% of the users did not want to use the existing services associated with ASPL. In one case, the user stated that ASPL failed to retrieve the individuals who can be considered ‘gurus’ in the domain. In our view, this cannot be considered a limitation of ASPL, because determining who counts as a ‘guru’ in a certain domain without any explicit basis for selection is a subjective matter. Some users expected ASPL to consider the relevance of the conferences when determining the importance of the publications, and therefore of the authors of those publications.
9 http://citeseer.ist.psu.edu
Figure 3. A part of the output in ASPL for Task 1
Figure 4 shows two pie charts. The chart on the left shows the percentage of users who preferred each of the three tools for performing Task 1 (ASPL/DBLP 90%, Google Scholar 10%, Ask 0%). The chart on the right shows the percentage of users who preferred not to use a certain tool in the future for searches similar to the one performed in Task 1 (Ask 70%, Google Scholar 20%, ASPL/DBLP 10%).

Figure 4. Representation of user preferences for using the tools
In some cases, the users stated that the existing version of ASPL failed to take into account the number of citations of a specific publication, even though the authors of a highly cited publication can be considered key researchers in a research area. Other users suggested it would be useful if ASPL allowed them to set multiple types of searching focus, such as leading experts based on the number of publications, or based on the impact factor of the publications authored by the researchers, etc.

3.2. Analysis of Task 2

The total time allocated to this task was 20 minutes. We observed that 80% of the users took more time to complete this task – on average 5 minutes of extra time. The main reason why the users found it difficult to fit their search within the time limit was Activity 1ii – the search performed by generic search engines such as Google and Ask took several iterations to retrieve meaningful results, and the users were forced to re-formulate their criteria several times. We will discuss this point later in this section. 65% of the participants felt that ASPL was an appropriate tool for this task, as it provided them with the most assistance in finding out about the personal interests and research expertise of the named individuals. When looking for the past research interests of the researchers, the users voted ASPL the easiest and most straightforward tool to use, because they could set the time interval to find the necessary information explicitly. As a result, they did not have to worry about processing the time interval as a part of interpreting the query results. Furthermore, the results in ASPL were temporally ranked, which also helped to speed things up. The two ASPL services – ‘person’s publications & interest’ and ‘person’s interests’ – helped the users look for the required information in a very friendly way. With the help of these two services the users only had to provide the name of a researcher as input and set the time period to get the necessary information quickly. Moreover, ASPL not only retrieved the past research interests of the researchers in a given period, but also sorted these interests from the most recent to the oldest. Figure 5 shows a partial view of the output produced by ASPL for the service ‘person’s publications & interests’. The performance of Google Scholar turned out to be worse. All the users stated that Google Scholar was not the right tool when they were looking for information about the personal interests and research expertise of researchers when the time duration was a crucial part of a query. In contrast with ASPL, Google Scholar did not allow users to set a specific time interval to filter the information.
As a result, the users had to use several search query combinations to get results about the personal areas of interest and research expertise of the researchers. Having received this information, it was difficult for them to map it to the time interval within which the experts in a certain research area were active. In other words, they had to manually map the retrieved information onto the time interval. 35% of the users considered Ask! an appropriate tool for such a type of task. The main reason was that, when searching for the information, Ask! led users straight to people’s web pages, where research interests were listed explicitly. Once again, the two analytic services in ASPL – ‘person’s publications & interest’ and ‘person’s interests’ under the tab ‘search in people domain’ – helped ASPL to outperform Ask. In Activity 2iii, it turned out that 90% of the users thought ASPL helped them most to perform this part of Task 2 quickly and efficiently. Generally speaking, the users preferred ASPL because they only submitted the name of a researcher and then used the service called ‘person’s publications and interest’ to get the necessary information. However, in some cases the users reported that, in contrast with Google Scholar, the number of publications retrieved by ASPL was limited. More importantly, the users stated that ASPL failed to retrieve any information when the names of the researchers were submitted incomplete, e.g., ‘McIlraith’ or ‘Lasilla’. When the order of names and surnames was reversed, ASPL failed to find any publications, too. In contrast with this, both Google Scholar and
Ask provided the necessary information when the queries submitted by the users were incomplete. We saw this as one of the major weaknesses of the tested version of ASPL, and subsequently implemented an input disambiguation service within the framework, which allows users to submit the input in any form that suits them and lets ASPL help them complete it.
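The input disambiguation service added after the study can be illustrated with a rough sketch. The author index and matching strategy below are hypothetical (this is not ASPL's actual code); the point is simply that a surname alone, or a reversed 'surname, forename' input, can be resolved against known names:

```python
# Hypothetical sketch of input disambiguation: complete partial or
# reordered researcher names against a known author index.

AUTHOR_INDEX = [
    "Sheila McIlraith",  # example names drawn from the evaluation tasks
    "Ora Lassila",
]

def suggest_authors(query):
    """Return indexed names covering every token of the (partial) query."""
    q_tokens = {t.strip(",").lower() for t in query.split()}
    matches = []
    for name in AUTHOR_INDEX:
        n_tokens = {t.lower() for t in name.split()}
        # Order-insensitive: each query token must be a prefix of some
        # name token, so 'McIlraith' and 'McIlraith, Sheila' both resolve.
        if all(any(n.startswith(q) for n in n_tokens) for q in q_tokens):
            matches.append(name)
    return matches

print(suggest_authors("McIlraith"))          # ['Sheila McIlraith']
print(suggest_authors("McIlraith, Sheila"))  # ['Sheila McIlraith']
```

A production service would also need fuzzy matching to catch misspellings such as 'Lasilla'; this sketch only handles truncation and reordering.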
Figure 5. The output produced by ASPL for the service ‘person’s publications & interest’
In comparison with ASPL, the users found it difficult to perform this part of Task 2 using Google Scholar. As with Activity 2i, with Google Scholar they found it difficult to establish a relation between the research interests of the researchers and the publications that could cover those interests. While using Google Scholar, the users issued queries of the type ‘name of a researcher + research publications’ to retrieve the information. We observed that Google Scholar simply retrieved the list of publications in which a researcher was an author; however, no support was provided by Google Scholar to determine whether these publications covered the past research interests. About 10% of the users voted Ask! the most appropriate tool for Activity 2iii. Typically the users used the query ‘name of a researcher + research publications’ and Ask! led them to the individuals’ home pages, where they could access much of the relevant information. However, it was difficult for the users to relate the publications to the top-ranked past research interests of a researcher. The users preferred Ask! to Google Scholar because, for the same query, Ask gave them access to different types of information, whereas Google Scholar simply failed to retrieve any results. Figure 6 shows the charts where we represent the comparative preference for the three tools. In a nutshell, in the context of a search where the users are trying to establish a relationship between the submitted keywords, e.g. “Sheila McIlraith” + “Research Interest” + “1990-2003” – where the relevant information may be represented in different contexts and therefore distributed across different places – ASPL, with its services allowing information mashup, can be seen as the best choice.
Figure 6. User preferences for using the tools during Activity 2a/b and 2c (Activity 1: ASPL/DBLP 65%, Ask 35%, Google Scholar 0%; Activity 2: ASPL/DBLP 90%, Ask 10%, Google Scholar 0%)
3.3. Analysis of Task 3

The total duration for completing this task was 15 minutes. Because this task was a continuation of the previous one, we expected it to be quite straightforward for the users to finish in time, but it turned out that about 15% of the users failed to complete the task in the given time. The main reason was that the users found it difficult to retrieve complete bibliographic information for the publications using Google Scholar and Ask. In ASPL the users typically relied on the service ‘person’s publications and interests’ under the tab ‘search in people domain’. Having retrieved the publications as shown in Figure 7 for Sheila McIlraith, ASPL allowed users to navigate to the complete bibliographic information of a publication simply by clicking on an icon with the “BibTeX” logo in the column ‘Navigate’.
Figure 7. Navigation provided by ASPL to access complete bibliographic information; the pointer shows how the actual publication can be included in the result set by means of utilizing online link repositories such as Bibsonomy or DOI.
Generally speaking, 70% of the users preferred ASPL for this type of search, for it gave them very detailed results. The users typically considered ASPL a useful tool because it not only provided access to the complete bibliographic information of a publication, but the bibliographic information was also retrieved straight from DBLP, so the users had a higher level of trust in the result. As a result, the
users did not have to cross-check the authenticity of the bibliographic information elsewhere. This saved crucial time while performing this task. The users also liked the way ASPL presented the results to them in a simple, consistent format: authors, publications, sources (i.e., name of conference, workshop or journal), and publication years. The users preferred this type of post-processing performed directly by the ASPL services, because the structured results returned were easier to interpret. When they used the other two tools, they stated explicitly that the format of the retrieved results changed from one publication to the next, which made the results difficult to interpret. About 20% of the users preferred Google Scholar: they liked it because they were already familiar with its interface and the way it worked (thanks to its Google parent). In contrast with ASPL, Google Scholar provided no facility to retrieve research publications straight from an online repository. As a result, the users decided to cross-check the results of Google Scholar with DBLP, which contributed to their failing to complete the task in the given time. In contrast with ASPL, which presented the results in a consistent format as described earlier, no such post-processing service was implemented in Google Scholar. In some cases this caused the users to fail to recognize a retrieved result as the correct one. Finally, only 10% of the users preferred to use Ask! for performing the task. They stated that Ask! managed to retrieve the publications authored by both authors, but retrieving the bibliographic information was not straightforward. As with Google Scholar, no post-processing of the results was performed by Ask!, and the format of the results changed from one publication to the next. As before, there was no service implemented in Ask that would check whether the bibliographic information was compliant with a specific format.
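The consistent output format described above – authors, publication, source, year – amounts to normalizing heterogeneous bibliographic records into one fixed schema. The following sketch (with an invented sample record, not actual DBLP data or ASPL code) shows the kind of post-processing involved for a BibTeX-style entry:

```python
import re

# Normalize a BibTeX-style record into the fixed (authors, title, source,
# year) presentation described in the text. The sample entry is invented.

SAMPLE_BIBTEX = """@inproceedings{example07,
  author    = {Sheila McIlraith and Another Author},
  title     = {An Example Paper},
  booktitle = {Proc. of an Example Conference},
  year      = {2007}
}"""

def normalize(entry):
    """Extract and order the fields of a single BibTeX-style entry."""
    fields = dict(re.findall(r"(\w+)\s*=\s*\{([^}]*)\}", entry))
    return {
        "authors": [a.strip() for a in fields.get("author", "").split(" and ")],
        "title": fields.get("title", ""),
        "source": fields.get("booktitle") or fields.get("journal", ""),
        "year": fields.get("year", ""),
    }

record = normalize(SAMPLE_BIBTEX)
print(" | ".join([", ".join(record["authors"]), record["title"],
                  record["source"], record["year"]]))
```

Presenting every record through one such schema is what let the participants compare results at a glance, in contrast with the varying formats returned by the generic tools.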
Once again, ASPL performed reasonably well where academic-related information was required in a consistent format. Tools like ASPL can be seen as an acceptable resource for users interested in finding information from a specific domain (here, computer science) – mainly in balancing the complexity of queries with the precision of results. After completing the evaluation of the first three tasks, we concluded that ASPL effectively supports narrow needs of learners within such domains as information consultancy, information analysis, academic research in computer science, etc.

3.4. Analysis of Task 4

The total duration allocated to this task was 15 minutes, and all the users successfully managed to finish the task in the given time. Based on the data gathered from the users, about 25% of the users considered ASPL the best tool to perform a context-specific search; 75% preferred to use the generic search tools instead. Generally speaking, using ASPL the users managed to retrieve the results associated with ‘ontology alignment’ using the service ‘main publishing outlets’ under the ‘search in topic domain’ tab. However, ASPL provided no assistance that would allow its users to decide whether the results were good candidates for providing the technical information on the topic.
The users had to go through several publishing outlets – for example, finding the relevant conferences on a topic, then manually checking whether a conference held a session or a paper on ontology alignment, and finally deciding whether the conference and/or publications were technical in nature. More importantly, the users found the existing version of ASPL less flexible than the generic search tools. The main reason was that generic search tools allow the users to provide additional keywords, such as “ontology alignment + conference paper” or “ontology alignment + definition”, which take into account the full-text content rather than any background knowledge. Most users preferred Google Scholar for such a type of search. One of the reasons was that it allowed them to add multiple new keywords and that it based its retrieval on the full text of the document rather than its metadata. As a result, the users received a fairly accurate output with very little effort. The preferences for Ask and Google Scholar were similar in this task, but both were higher than for ASPL. When we analyzed their responses, the users said they preferred Ask because it suggested results from other search tools, e.g., Excite and Lycos, which were relevant to their query. In some cases, the users performed the search using the following sequence of keyword combinations to find the relevant information: ‘ontology alignment’ → ‘ontology alignment + technical papers’ → ‘ontology alignment + define [model, schema,…]’ in both Google Scholar and Ask. Google’s awareness of article recency was flagged by some users as a helpful feature, which helped to retrieve the most recent publications on the research topic. ASPL has support for interval restrictions to indicate temporal context; no such feature existed in Ask.
Also, the functionality of attaching a brief summary of the analyzed material next to or under its bibliographical identifier was considered helpful, as it gave the participants rich clues about the style of language in the paper, the type of target audience, etc. – all helpful for deciding about the nature of a publication without necessarily going into details. In ASPL, a helpful feature was the classification of the publishing outlets, which allowed the participants to base their decision about the nature of a paper on the type of conference/journal/magazine. About 40% of the users indicated that this would be an interesting feature, as the collections often give more reliable information than the individual items (papers, articles).
4. Discussion of Evaluation and Analysis

We now conclude the analysis of the data gathered during the user evaluation study of the performance of our semantic platform for learning compared to non-semantic search tools. Our ASPL turned out to be an appropriate tool when the users could make use of the specialized analytic services implemented in the tested version. The users clearly preferred ASPL to search for expertise and for communities of practice. However, they preferred the generic tools, i.e., Google Scholar or Ask, when the search was open-ended and they consequently had to form new query templates to look for the information. In such cases ASPL failed to retrieve the necessary information, because the existing version of ASPL did not allow users to add new
keywords or to formulate new analytic techniques. The combination of these two findings supports the design decisions for the ASPL architecture. Our original argument was to introduce more support for ‘higher forms’ of learning (see e.g., [7]) into the technological platforms. Whereas the majority of tools focus on retrieving the right information, the right learning material, etc., we set ourselves the challenge to actually facilitate learning of the skills and practices enabling learners to process, analyze and synthesize the information they come across on the Web. As we discussed in our past papers, e.g., in [7], in terms of learning, ASPL offers learners access not only to atomic resources (e.g., publications or learning units) but also to a range of relational associations. Our aim was to show that semantically associated entities enable the learner to draw more abstract analytic and/or synthetic conclusions, which are, in turn, beneficial in supporting an open-ended task of analyzing and reviewing the state of the art in a given domain. Intelligent Tutoring Systems [12] and other tools supporting learners are active applications – leading the learner through their content. Yet they largely operate on a closed set of resources and tend to use manually defined abstract relationships between concepts and resources [11]. The most popular link of this kind is ‘requires’ – as in “Study of ontology-based annotation requires knowledge of RDF.” Applying the ‘requires’ link transitively, it is possible to compute user paths through the resources and ensure each user follows a prescribed learning task. However, manual annotations are not scalable, and they assume one path fits all user needs. Unfortunately, as the number of links increases, this approach becomes less and less feasible. Rather than tying the learner into one specific learning task, we see learning tasks as an optional element in a semantic system supporting the learners.
Many learning tasks can be achieved by following distinct paths through the space of (learning) resources. It is nearly impossible to formalize any one of these paths as an ideal execution of the learning task. Instead, different paths can be triggered by associations that happen to be useful at a particular moment. The relationship between tasks and paths is many to many – one task can be achieved by following several paths, and vice versa.
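The transitive 'requires' computation criticized above can be sketched in a few lines. With a small hypothetical prerequisite graph (the topic names are illustrative), a topological sort produces exactly the kind of single prescribed path the text argues against:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical 'requires' links: topic -> its direct prerequisites,
# e.g. "Study of ontology-based annotation requires knowledge of RDF."
requires = {
    "ontology-based annotation": {"ontologies", "RDF"},
    "ontologies": {"RDF"},
    "RDF": {"XML"},
    "XML": set(),
}

# Applying 'requires' transitively via a topological sort yields one
# prescribed learning path: every prerequisite precedes its dependant.
path = list(TopologicalSorter(requires).static_order())
print(path)  # e.g. ['XML', 'RDF', 'ontologies', 'ontology-based annotation']
```

This illustrates the 'one path fits all' behaviour: the ordering is fixed by the link structure regardless of what an individual learner already knows, which is why treating learning tasks as optional, association-driven elements is more flexible.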
5. Related Work

The notion of exploratory user interaction with (learning) resources has been around for some time. In the pedagogical domain, for example, Laurillard [15] presents learning as ‘a conversation of the learner with the problem and available resources – a conversation that facilitates exploration of relevant knowledge spaces and creates a rich set of different kinds of associations between the nodes in the explored knowledge space’. Similar views were observed by cognitive scientists studying skill acquisition in design problems. Schön [18], for example, talks about ‘reflective conversations of a practitioner with a design situation’. These conversations allow alternative interpretations of the situation by relying on different associative frames. The theory of problem framing [10, 18] was developed at an abstract, conceptual level; nevertheless, its core is in recognizing and selecting associations to make sense of a situation. This, in turn, can be considered the basis for a more operational, technical approach, which we followed up in the ASPL scenario and the re-engineering of the underlying semantic platform. The Web community has attempted to address the issue of dynamic associations in standard hypermedia. For example, COHSE [5] and Magpie [8] re-introduced serendipity into navigation using dynamic links. In this case, the hyperlinks were independent of the actual web pages and could be ‘injected’ into a page as and when needed. In educational hypermedia, the benefits of horizontal (non-hierarchical) navigation in digital textbooks were analyzed, e.g., in [3]. A lack of support for horizontal links was noticed, as this mode of navigation is more resource-intensive than taxonomic classifications. Vertical, content-based links represent the plan; the horizontal, associative links are serendipitous and subjective to a learner. If the capabilities of ASPL to handle incompleteness by perusing the existing and inferred semantic relationships are improved, the tool is likely to gain good support within a narrow domain of users who need to search for and interpret academic literature as well as research community relationships. Thus, we believe that ASPL can be considered a successful – albeit still open to further improvements – realization of the initial objective we set for this research in the educational area.
6. Conclusion

Reducing the information overload caused by the growing Web is often cited as the premise for work on supporting eLearning and web-based learning. But finding relevant documents is only half of the story. Their interpretation involves a reader in understanding the surrounding context in which the document was created. In order to gain a full understanding, a reader requires knowledge of the specific terms mentioned and of the implicit relationships contained both within the document and between the document and other external knowledge sources. In this chapter we proposed an approach to address this issue by capturing context within exploratory relationships; this context is then used to enrich the user’s interaction with the underlying knowledge – analytic, service-based compositions within a semantic platform for learning expose relevant segments of the background context according to the user’s needs. Attention, as opposed to information, is now widely acknowledged to be the scarce resource in the Internet age. Consequently, tools that can leverage semantic resources to take some of the burden of the interpretation task from the human reader are going to be of enormous use. ASPL, the technology evaluated in this chapter, is one of the steps towards achieving this goal. As we summed up in section 5, the idea of exposing more complex information analyses to the user by means of dedicated services clearly led to ASPL outperforming retrieval-centred search engines. As a source of information, knowledge and guidance, increasingly supporting both formal and informal learning, the Web needs tools and middleware with the capacity to create semantic associations and to use such associations, e.g., to enrich more familiar user interaction processes such as search or data retrieval.
Applying the Semantic Web to construct multiple exploratory paths, and attending to different aspects of the exploration rather than to the individual nodes of the semantically enriched space, seems to be a promising way forward. In addition to opening up the analytic and synthetic phases of information processing to the user, the conscious support for data integration and interpretation has several side effects. For instance, from the user experience viewpoint, the application becomes more flexible: a semantically enriched application does not confine its user to one specific activity or role. Another side effect concerns the dynamics of the semantic application. Ontology-driven solutions are often brittle, based on closed worlds that enable reasoning solely about the known concepts. Linking the association discovery to the presentation overcomes this brittleness, and also avoids the knowledge acquisition bottleneck.
Acknowledgments

The work reported in this chapter has been partially supported by the following grants: climateprediction.net, sponsored by the UK Natural Environment Research Council and the UK Department of Trade and Industry’s eScience Initiative; AKT, an Interdisciplinary Research Collaboration (IRC) sponsored by the UK Engineering and Physical Sciences Research Council under grant GR/N15764/01; the KnowledgeWeb Network of Excellence, funded by the European Commission’s Sixth Framework Programme under grant FP6-507482; and the NeOn Integrated Project, funded under the same programme with grant FP6-027595. In addition to the funding sources, we also want to thank the anonymous reviewers for their valuable comments on this work and on its possible extensions.
References

[1] T. Berners-Lee, J. Hendler and O. Lassila, The Semantic Web. Scientific American 284(5) (2001), 34-43.
[2] B.S. Bloom, A Taxonomy of Educational Objectives, Handbook 1: Cognitive Domain. 2nd ed. New York, US, 1965.
[3] P. Brusilovsky and R. Rizzo, Map-Based Horizontal Navigation in Educational Hypertext. Journal of Digital Information 3(1) (2002), 156.
[4] V. Bush, As We May Think. The Atlantic Monthly 176 (July 1945), 101-108.
[5] L. Carr, S. Bechhofer, C. Goble, et al., Conceptual Linking: Ontology-Based Open Hypermedia. In Proc. of the 10th Intl. WWW Conf., Hong Kong, 2001.
[6] T. Collins, P. Mulholland and Z. Zdrahal, Semantic Browsing of Digital Collections. In Proc. of the 4th Intl. Semantic Web Conf., Ireland, 2005.
[7] M. Dzbor and E. Motta, Semantic Web Technology to Support Learning About the Semantic Web. In Proc. of the 13th Intl. Conf. on Artificial Intelligence in Education (AIED), California, US: IOS Press, 2007.
[8] M. Dzbor, E. Motta and J. Domingue, Magpie: Experiences in Supporting Semantic Web Browsing. Journal of Web Semantics 5(3) (2007), 204-222.
[9] M. Dzbor, A. Stutt, E. Motta, et al., Representations for Semantic Learning Webs: Semantic Web Technology in Learning Support. Journal of Computer Assisted Learning 23(1) (2007), 69-82.
[10] M. Dzbor and Z. Zdrahal, Design as Interactions between Problem Framing and Problem Solving. In Proc. of the 15th European Conference on AI (ECAI), Lyon, France, 2002.
[11] M. Eisenstadt, B.A. Price and J. Domingue, Software Visualization as a Pedagogical Tool. Instructional Science 21 (1983), 335-365.
[12] C. Frasson, G. Gauthier and G.I. McCalla, Intelligent Tutoring Systems. In Intl. Conf. on Intelligent Tutoring Systems (ITS), Springer-Verlag, Berlin, 1992.
[13] N. Guarino, C. Masolo and G. Vetere, OntoSeek: Content-Based Access to the Web. IEEE Intelligent Systems 14(3) (1999), 70-80.
[14] J. Kahan, M.-R. Koivunen, E. Prud'Hommeaux, et al., Annotea: An Open RDF Infrastructure for Shared Web Annotations. In Proc. of the 10th Intl. WWW Conf., Hong Kong, 2001.
[15] D. Laurillard, Rethinking University Teaching: A Conversational Framework for the Effective Use of Learning Technologies. 2nd ed., London, UK: RoutledgeFalmer, 2002.
[16] D.L. McGuinness, Ontological Issues for Knowledge-Enhanced Search. In Proceedings of Formal Ontology in Information Systems, 1998.
[17] I.A. Ovsiannikov, M.A. Arbib and T.H. McNeill, Annotation Technology. International Journal of Human-Computer Studies 50(4) (1999), 329-362.
[18] D.A. Schön, The Reflective Practitioner: How Professionals Think in Action. USA: Basic Books, Inc., 1983.
[19] M.C. Schraefel, D.A. Smith, A. Owens, et al., The Evolving mSpace Platform: Leveraging the Semantic Web on the Trail of the Memex. In Proceedings of the International Conference on Hypertext, Austria, 2005.
Part 3
Social Semantic Web Applications
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-245
CHAPTER 13
E-Learning and the Social Semantic Web Jelena JOVANOVIC a,1 , Dragan GASEVIC b and Vladan DEVEDZIC a a University of Belgrade, Serbia b Athabasca University, Canada
Abstract. The Social Semantic Web has emerged recently as a new paradigm for creating, managing and sharing information through the combined use of technologies and approaches from the Social Web (aka Web 2.0) and the Semantic Web. In this chapter, we first introduce the fundamental concepts of the Social Semantic Web. Subsequently, we focus on (some of) the benefits that the Social Semantic Web paradigm can bring to e-learning environments, such as effective and reliable knowledge management and sharing, advanced forms of interactivity and ubiquitous access to learning resources.
Keywords. E-Learning, Intelligent Learning Environments, Semantic Web, Social Web, Social Semantic Web
Introduction

In this chapter, we introduce the Social Semantic Web paradigm and present how it can be leveraged in e-learning to improve current e-learning practices and introduce new ones. We also show how it helps address some of the still open research issues related to Intelligent Learning Environments (ILEs), including (but not restricted to):

- Enabling effective and reliable mechanisms for managing (i.e., capturing, representing, and evolving) the various types of knowledge (e.g., domain, user, and pedagogical) relevant for providing personalized learning experiences in ILEs. Equally important is the ability to preserve the semantics of this knowledge while sharing it among the various learning systems and tools that students interact with during the learning process.
- Opening ILEs so that they can make use of open Web content instead of being restricted to a closed corpus of documents assembled at design time.
- Improving present and developing new forms of interaction along each dimension of the interactivity triangle [1]. The interactivity triangle (Figure 1) is a widely accepted model of interactivity in learning settings. It has students, teachers and content at its vertices. Each vertex is related to the other two and to itself, so that, for example, students interact with teachers and content, but they also interact among themselves.
- Providing support for interactivity across the diverse learning systems and tools that students turn to during the learning process. Interactivity in the context of learning is largely equivalent to students' social and creative engagement, that is, communication, collaboration, and authoring.
- Integrating, sharing and using interaction data to allow for advanced forms of adaptive and personalized learning.
- Enabling ubiquitous access to learning resources, that is, allowing access to relevant resources (both human and digital) regardless of the system/tool/service the user is currently interacting with.

1 Corresponding Author: Jelena Jovanovic, Faculty of Organizational Sciences, University of Belgrade, Jove Ilica 154, 11000 Belgrade, Serbia; E-mail: [email protected]
Figure 1. Interactivity triangle [1]
1. The Social Semantic Web

The Semantic Web has been introduced as a vision of the evolution of the current Web in which "information is given well-defined meaning, better enabling computers and people to work in cooperation" [2]. The building blocks of the Semantic Web are ontologies – formally described conceptualizations of shared domain knowledge. They are expressed through standard languages (such as RDF and OWL), which allows them to be combined, shared, easily extended and used to semantically annotate different kinds of resources, such as Web pages, documents, and multimedia content, to name but a few. Despite its many promising aspects [3] [4], the Semantic Web has not yet been widely adopted. This is mainly due to the difficulties of ontology creation and maintenance, and of the semantic annotation process. The development of ontologies is difficult and strenuous for domain experts, who typically lack the required knowledge engineering expertise. Despite current efforts to increase the availability and reusability of ontologies through the development of online ontology libraries (e.g., Swoogle 2) or (semi-)automatic ontology development tools, the usage of these libraries and tools still requires a high level of technical knowledge [5]. A new wave of so-called social applications, labeled the Social Web or Web 2.0, has emerged from a combination of new technologies and interaction techniques [6]. While much hype has surrounded these recent innovations, the uptake of software

2 http://swoogle.umbc.edu/
solutions has been significant. The Social Web transforms the "old" model of the Web – a container of information accessed passively by users – into a platform for social and collaborative exchange. On this platform, users meet, collaborate, interact and, most importantly, create content and share knowledge. Popular social websites, such as Facebook, Flickr and YouTube, enable people to keep in touch with friends and share content. Other services, such as blogs, wikis, and video and photo sharing, which together enable what has recently been defined as "lifestreaming" 3, allow novice users to easily create, publish and share their own content. Furthermore, users are able to easily annotate and share Web resources using social bookmarking and tagging, thus creating metadata for Web content commonly referred to as "folksonomies". However, Social Web technologies in general, and collaborative tagging in particular, suffer from ambiguity of meaning. For instance, collaborative tags are often ambiguous due to their lack of explicit semantics (e.g., the same tag may be used with different meanings, and different tags may be synonymous). Moreover, they lack a coherent categorization scheme, and require significant time and a sizeable community to be used effectively [7]. Despite the initial perception that the Social Web and the Semantic Web oppose each other, the two efforts are being used jointly to create a common space of semantic technologies. In fact, the Semantic Web cannot work alone. It requires society-scale applications (e.g., advanced collaborative applications that make use of shared data and annotations) [8]. Moreover, the paradigm of knowledge creation derived from the Social Web can be effectively used to refine/update ontologies generated according to Semantic Web standards and best practices. At the same time, the Social Web can benefit from the paradigm of structured knowledge, represented with the standard languages adopted in the Semantic Web vision. Such standards will make it easier for collective knowledge to be shared and to interoperate with any sort of application. The idea of merging the best of both worlds has converged in the concept of the Social Semantic Web, in which socially created and shared knowledge on the Web leads to the creation of explicit and semantically rich knowledge representations. The Social Semantic Web can be seen as a Web of collective knowledge systems, which are able to provide useful information based on human contributions, and which improve as more people participate [9]. Specific examples of the Social Semantic Web are being undertaken in a wide number of projects. For instance, DBpedia 4 is a large-scale semantic knowledge base which structures socially created knowledge from Wikipedia 5, a wiki-based encyclopedia. DBpedia takes advantage of the common patterns and templates used by Wikipedia authors to gather structured information into a knowledge base of socially created structured knowledge. The result is a huge database of shared knowledge which allows "intelligent" queries such as "List the 19th century poets from England" [10]. With its capability of answering very specific queries, DBpedia can serve as a very handy learning tool and is an excellent example of the advantages that the Social Semantic Web paradigm brings to the educational domain. Throughout the chapter, we provide many additional examples of the benefits that the Social Semantic Web paradigm brings to e-learning and education in general.
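DBpedia's ability to answer such structured queries can be illustrated with a toy, self-contained sketch: a handful of subject-predicate-object triples imitating the kind of data DBpedia extracts from Wikipedia infoboxes. All names and properties below are invented for the example; real DBpedia is queried with SPARQL against its public endpoint.

```python
# Toy triple store imitating the kind of structured data DBpedia extracts
# from Wikipedia infoboxes; names and properties are illustrative only.
triples = [
    ("Byron",   "type",    "Poet"),
    ("Byron",   "country", "England"),
    ("Byron",   "century", 19),
    ("Keats",   "type",    "Poet"),
    ("Keats",   "country", "England"),
    ("Keats",   "century", 19),
    ("Whitman", "type",    "Poet"),
    ("Whitman", "country", "USA"),
    ("Whitman", "century", 19),
    ("Chaucer", "type",    "Poet"),
    ("Chaucer", "country", "England"),
    ("Chaucer", "century", 14),
]

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def query_poets(country, century):
    """Answer 'list the <century> poets from <country>' over the toy store."""
    subjects = {s for s, p, o in triples if p == "type" and o == "Poet"}
    return sorted(s for s in subjects
                  if country in objects(s, "country")
                  and century in objects(s, "century"))

print(query_poets("England", 19))  # -> ['Byron', 'Keats']
```

The point is that once knowledge is structured as triples, answering "list the 19th century poets from England" is a simple conjunctive match, whereas over free text it would require natural language processing.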
3 According to WordSpy (http://www.wordspy.com/words/lifestreaming.asp), lifestreaming is "an online record of a person's daily activities, either via direct video feed, or via aggregating the person's online content such as blog posts, social network updates, and online photos."
4 http://dbpedia.org
5 http://www.wikipedia.org/
2. Benefits of the Social Semantic Web Paradigm for e-Learning

To analyze the benefits of the Social Semantic Web paradigm for ILEs, we use a bottom-up approach combined with qualitative observations of the features of Social Semantic Web concepts, standards, and state-of-the-art applications and systems. Our analysis is not based exclusively on existing research publications covering the exact topic of the Social Semantic Web in ILEs; we also refer to general-purpose solutions and research results that leverage the concepts of the Social Semantic Web. This is a natural decision, as the process of learning is not strictly bound to a single learning environment, but is a continuous process spanning different (social and/or learning) systems and contexts. Although we are aware of the large amount of research work on social, collaborative and situated approaches to e-learning that has been done in the mainstream e-learning research communities (outside of SWEL 6), we have decided to keep primarily a semantic-web-based perspective, as we find it more appropriate for the SWEL audience. Accordingly, in this section, we provide a number of examples that illustrate the benefits that Social Web and Semantic Web technologies offer to e-learning.

2.1. Development and Maintenance of Domain Ontologies

Every ILE (e.g., an intelligent tutoring system) requires a domain model – a model which integrates knowledge about the domain to be taught. Attention to detail in designing the domain model is essential for ensuring the desired levels of adaptation, learning path definition, feedback provisioning, etc. For this reason, the process of creating such a model is time-consuming and, in general, rather expensive. Therefore, it is important to allow for domain knowledge sharing, reuse, and exchange among different ILEs covering the same or similar domains.
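As a small illustration of why an explicit, machine-readable domain model pays off, the sketch below encodes a hypothetical set of concepts with prerequisite relations and derives a valid learning path from it automatically. The concept names and structure are invented; a real ILE would store such a model as an RDF/OWL ontology precisely so that it can be shared and reused across systems.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# A minimal, hypothetical domain model for an ILE: each concept is mapped
# to the set of concepts that must be learned before it.
domain_model = {
    "variables": set(),
    "loops": {"variables"},
    "functions": {"variables"},
    "recursion": {"functions", "loops"},
}

# One benefit of an explicit domain model: a valid learning path can be
# derived automatically as a topological order of the prerequisite graph.
path = list(TopologicalSorter(domain_model).static_order())
print(path)
```

Because the prerequisite structure is explicit data rather than implicit in course authoring, any ILE that imports the model can recompute learning paths, check prerequisites, or adapt sequencing without re-encoding the domain.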
For more than a decade, researchers have been using ontologies for domain knowledge modeling and representation in ILEs [11] [12]. However, these early endeavors were restricted to local ontologies that were usable only in the systems for which they were developed, and were not used for knowledge sharing among ILEs covering the same or similar knowledge areas. Therefore, the problem of enabling (semi-)automatic knowledge sharing, reuse, and exchange among different ILEs covering the same or similar domains remained open. One of the primary obstacles to fulfilling this goal was the fact that at the time the first solutions (e.g., MOBLIGE [13]) were proposed, the Semantic Web infrastructure was not mature enough to provide the required support. Since these first proposals, Semantic Web technology has made significant progress, and the next generation of Semantic Web applications [14] can take advantage of the vast amount of semantic data and ontologies available online. For instance, infrastructures such as Watson 7 and SWSE 8 now exist for collecting, indexing and providing access to semantic data and ontologies on the Web. Another problem related to the usage of ontologies for representing domain models is the constant need for ontology evolution (maintenance and updating). This is not a trivial task, because current approaches and tools assume a background in knowledge
6 SWEL stands for Semantic Web for E-Learning
7 http://watson.kmi.open.ac.uk
8 http://swse.org/
engineering, or familiarity with ontology languages. This is true even when a (semi-)automatic approach is proposed. In general, these tools are too complex to be used by most teachers and learning content developers. The Social Web paradigm as a means of facilitating ontology development and maintenance has been receiving steadily increasing interest from the Semantic Web research community. For example, Hepp et al. [15] have suggested a wiki-based infrastructure and culture as an environment for constructing and maintaining domain ontologies, using Wikipedia URIs as unique identifiers of ontology concepts. This seems to be an appealing solution from the perspective of end-users (i.e., teachers and instructors), as it would provide them with an easy-to-use working environment. However, this solution produces an "informal ontology", that is, a collection of named conceptual entities with natural language definitions, and such an ontology cannot address the specific requirements of e-learning environments. There is an increasing number of ontology editors that rely on the collaborative features of the Social Web paradigm to facilitate the task of ontology authoring and maintenance. For example, Neologism [16], a Web-based ontology editor and publishing system, aims to reduce the time required to create, publish and modify RDF Schema-based ontologies. Another example is Knoodl 9, a Semantic Web platform that allows for community-driven development and maintenance of OWL ontologies, as well as for the development and usage of RDF knowledge bases on top of these ontologies. In our recent research work, we have suggested a novel method of interactive visualizations that provides instructors with an intuitive and practical way to use the implicit feedback available from student folksonomies for evolving domain ontologies [17].
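A rough sketch of the general idea of mining student folksonomies as implicit feedback on a domain ontology might look as follows. The tag data and concept names are hypothetical, and this is not the actual method of [17]; it only illustrates how frequent student tags that the ontology lacks can surface as candidate labels or subconcepts for an instructor to review.

```python
from collections import Counter

# Hypothetical tagging data: tags students attached to resources already
# linked to a given domain-ontology concept. Counting tag frequencies per
# concept gives instructors implicit feedback on the ontology.
tags_per_concept = {
    "Inheritance": ["subclass", "extends", "oop", "subclass", "override"],
    "Polymorphism": ["oop", "overloading", "dynamic-dispatch", "oop"],
}
ontology_labels = {"inheritance", "polymorphism", "oop"}

def suggest_labels(concept, top_n=2):
    """Most frequent student tags for a concept that the ontology lacks;
    these are candidate labels/subconcepts for the instructor to review."""
    counts = Counter(t for t in tags_per_concept[concept]
                     if t not in ontology_labels)
    return [tag for tag, _ in counts.most_common(top_n)]

print(suggest_labels("Inheritance"))  # -> ['subclass', 'extends']
```

The instructor, not the algorithm, decides whether a suggested tag becomes part of the ontology; the folksonomy only supplies evidence about the vocabulary students actually use.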
In addition, we have developed a method which uses algorithms for computing semantic relatedness to further facilitate the teacher's task of ontology maintenance by suggesting tags that are relevant for a particular ontology concept. The method is based on the idea that the ontology itself defines a 'context' for its concepts. So, when computing the relatedness between a concept and a tag, the surrounding concepts (forming the 'context' of the concept in question) must also be taken into account. The initial evaluation has shown that our method is particularly useful in situations where (i) the chosen semantic relatedness measure cannot relate a high number of concept-tag pairs and (ii) fine-grained domain ontologies are available [18]. Ontology refinement is a continuous task, and to address it efficiently we combine several approaches that leverage student contributions. This combined approach allows the support given to be consistent with the course content, and with the conceptualizations that instructors and students have of that content. Furthermore, students' intrinsic motivation and trust in a system that derives knowledge from their activities is likely to increase, since they are aware that they are contributing to the system and that their contributions 'count'.

2.2. User Generated Content as Relevant Learning Resources

Almost all traditional ILEs work with a closed set of documents assembled at design time and fully known to the system. However, this approach does not fit the open nature of the Web. The major challenge is to empower these systems, so
9 http://www.knoodl.com/ui/home.html
that they can extract some meaning from an open corpus of documents and work with the open Web without the help of a human indexer [19]. Technologies such as RDFa [20], eRDF 10 and microformats 11 offer part of the solution, since they allow for embedding semantic annotations in Web (e.g., XHTML) documents in a standardized way. There are already tools for extracting semantics (i.e., RDF data) from Web pages enriched with RDFa/eRDF markup (e.g., Gleaning Resource Descriptions from Dialects of Languages – GRDDL [21]), as well as services harvesting and indexing this semantic data, such as Sindice 12 and SearchMonkey 13. Even though these technologies still require humans in the loop (to embed semantic markup in Web pages), there are more and more incentives for human participation 14 and tools that facilitate the process (such as Semantify 15 and the SearchMonkey API). In addition, services for automatic information extraction from Web resources, such as Zemanta 16 and SemanticProxy 17, which have recently started to emerge, promise to further facilitate the inclusion of open Web content in the repertoire of ILEs. Some initial research work has been done on using the above-mentioned technologies in the e-learning domain. For example, DERI Galway has developed a framework for extracting useful knowledge published online in an informal way (e.g., wikis, blog posts, and forum posts), structuring the acquired knowledge and putting it to use within Learning Management Systems [22]. The first implementation of this framework is IKHarvester, a Web service capable of capturing RDF data from Social Semantic Information Sources (e.g., semantic blogs and semantic wikis), and from resources with semantics embedded in the form of microformats. In addition, by scraping HTML pages, IKHarvester can generate RDF descriptions from non-semantic information sources such as Wikipedia.
The harvested resources, together with their semantic descriptions, are available for use in Learning Management Systems. IKHarvester has already been used within the Didaskon learning framework [23] and the initial evaluation of this service has provided positive results; a full usability survey of the service is under way. Tags are an important source of semantic markup. They are mainly used as descriptive metadata (i.e., tags often describe the content of the tagged resource). Still, they can also be used as administrative metadata (e.g., "creative-commons" to identify license issues). They can also identify the source/author of the tagged resource (e.g., a "w3c" tag to identify a document from the W3C website, or a "byTBL" tag to identify Tim Berners-Lee as the author) [24]. In the last couple of years, the Semantic Web research community has made a significant effort to disambiguate and formalize tags, that is, to bridge the gap between the needed level of semantic richness and the level offered by tags. Besides improving the semantic richness of tags, those approaches and techniques could also be applied to analyzing students' tags to identify student sub-communities based on shared interests: annotated lessons and/or tags used for annotation.
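The last point, identifying student sub-communities from shared tags, can be sketched with a toy grouping based on tag-set overlap (Jaccard similarity). All data below is hypothetical, and a real system would use more robust clustering than this greedy pass; the sketch only shows why overlapping tag vocabularies are a usable signal of shared interest.

```python
# Toy sketch: group students into sub-communities by the overlap
# (Jaccard similarity) of the tag sets they use; data is hypothetical.
students = {
    "ana":   {"sparql", "rdf", "ontology"},
    "boris": {"rdf", "sparql", "owl"},
    "clara": {"css", "html", "javascript"},
    "dunja": {"html", "javascript", "dom"},
}

def jaccard(a, b):
    """Overlap of two tag sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def communities(threshold=0.4):
    """Greedy grouping: a student joins the first community whose seed
    member shares at least `threshold` of their tag vocabulary."""
    groups = []
    for name, tags in students.items():
        for group in groups:
            if jaccard(students[group[0]], tags) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

print(communities())  # -> [['ana', 'boris'], ['clara', 'dunja']]
```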
10 http://getsemantic.com/wiki/ERDF
11 http://microformats.org/
12 http://sindice.com/
13 http://searchmonkey.sourceforge.net/
14 Yahoo has recently announced a new search engine that will index pages with embedded semantics (http://www.ysearchblog.com/archives/000523.html)
15 http://www.dapper.net/semantify/
16 http://www.zemanta.com/
17 http://semanticproxy.opencalais.com/
For semantic annotation of learning resources on the Web, the MOAT (Meaning Of A Tag) 18 project offers an interesting solution. Instead of trying to disambiguate and semantically enrich tags after their creation (as the majority of similar research efforts do), MOAT aims to empower users to define the meaning(s) of their tags – by relating them to the URIs of existing concepts from Semantic Web knowledge bases (such as DBpedia and GeoNames) – while they are annotating Web resources [25]. While users can still benefit from the simplicity of free tagging when annotating content, the linking to existing concepts (i.e., URIs) offers a way to resolve tagging ambiguity. Moreover, the relationships between the concepts that tags are linked to can be leveraged for deducing additional relationships among the tags themselves, as well as among tagged resources. There is also the MOAT ontology for the formal representation of tags, their meanings and the tagging context. A similar approach towards semantically rich tags is taken by some of the latest social bookmarking tools, such as Faviki 19 and Zigtag 20.

2.3. New and Improved Forms of Interaction

Besides being members of general social networks, like Facebook, MySpace and YouTube, many students are also participants in online social networks specifically focused on their studies, like stud.icio.us, NoteMesh 21 and CollegeRuled 22. These networks typically allow students of the same class to share notes with each other, and offer a message board and/or a discussion area where students discuss assignments with classmates, ask questions, work in groups, and the like. There are also online social networks aimed primarily at teachers and instructors, for their professional development as well as collaborative creation and exchange of learning content and instructional practices; examples include Curriki 23, EdTechTalk 24 and EduBlogs 25.
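The MOAT idea described above, attaching an explicit meaning to a free-form tag by linking it to a concept URI, can be illustrated with a small sketch. The record fields and data below are invented for the example and are not the actual MOAT ontology terms; the DBpedia URIs stand in for any knowledge-base concepts a user might link to.

```python
# Simplified, MOAT-inspired illustration: a free-form tag is disambiguated
# by linking it to the URI of a concept in an existing knowledge base.
# Field names here are made up; the real MOAT project defines its own
# ontology for taggings and meanings.
taggings = [
    {"user": "ana", "resource": "http://example.org/lesson42",
     "tag": "jaguar",
     "meaning": "http://dbpedia.org/resource/Jaguar_Cars"},
    {"user": "boris", "resource": "http://example.org/photo7",
     "tag": "jaguar",
     "meaning": "http://dbpedia.org/resource/Jaguar"},  # the animal
]

def resources_about(concept_uri):
    """Same tag, different meanings: filtering by the linked concept URI
    resolves the ambiguity that the plain tag string cannot."""
    return [t["resource"] for t in taggings if t["meaning"] == concept_uri]

print(resources_about("http://dbpedia.org/resource/Jaguar_Cars"))
# -> ['http://example.org/lesson42']
```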
Some online social networks are aimed at connecting students and teachers, like Schoopy 26 and BuddySchool 27. Finally, there are social networking systems that support the continuous development of portfolios. Through interactions within the social network, one's portfolios can be constantly developed and improved, while the social network can leverage various feedback, commenting and rating instruments to continuously evaluate the content published through portfolios. The use of these and similar tools and services can significantly facilitate interaction along the student-student, student-teacher, teacher-teacher, student-content, and teacher-content dimensions of the interactivity triangle [1]. Having recognized the importance of online social networking for education, traditional e-learning environments like Learning Management Systems have recently started to incorporate well-known social networking tools. Currently, the best example of this practice is the Haiku 28 Learning Management System, which already offers over 80 social networking tools ready to embed with a simple drag-and-drop. Since both
18 http://moat-project.org
19 http://www.faviki.com
20 http://zigtag.com/
21 http://www.notemesh.com/
22 http://collegeruled.com/
23 http://www.curriki.org/
24 http://www.edtechtalk.com/
25 http://edublogs.org/
26 http://www.schoopy.com/
27 http://www.buddyschool.com/
28 http://www.haikuls.com/
students and teachers are used to interacting via those tools and services in their daily practices, there are no barriers to adoption. In addition, the interaction data can be captured internally (i.e., by the system itself), and subsequently used for adaptation and personalization purposes, as we present in Section 2.5. By leveraging the integration of the Social Web and semantic technologies, new, advanced forms of social networking platforms have started to emerge. They allow for advanced forms of social interaction, as well as knowledge creation and exchange. Among these emerging systems, the most representative is probably Twine 29, a knowledge networking site where users are encouraged to connect with other people and to create, organize and share information and knowledge. Twine integrates facilities currently available in different Social Web tools, but also uses semantic technologies to enable sophisticated services, such as automatic tagging of users' bookmarks and recommendation of relevant content and people. Users can create topic-oriented communities, called twines. They can be members of different twines and exchange knowledge with other users both within a twine and across twines. It is expected that by the end of 2009 users will be able to easily pull their information and knowledge out of Twine to make them available in other systems and tools. Another interesting approach towards enabling advanced forms of social networking by utilizing semantic technologies is provided by Innoraise 30 – a social semantic network that enables one to easily find people knowledgeable in a certain domain and/or about a certain topic. It allows users to find out who knows about their topic of interest, and to start interacting, collaborating, and following the activities of their contacts. To enable this, the system aggregates content produced and consumed by users and, by employing semantic analysis, information retrieval and data mining technologies, assesses a person's knowledge. The first social network powered by this solution is the STI community 31, an international network of experts in semantic technologies. Twine, Innoraise and similar systems, like the forthcoming Qitera 32, can be leveraged for increasing the interactivity of online learning environments and facilitating collaborative learning approaches – students and teachers can create communities around course topics; develop and exchange knowledge within and across these communities; meet peers studying/teaching the same or similar subjects; more easily search for relevant content by leveraging semantic tags; and get recommendations about relevant resources (both human and digital). ILEs should communicate with these systems (via open APIs and/or data exchange protocols) in order to acquire interaction data that they can further leverage for improving students' learning experience. Some emerging software solutions that rely upon the Social Semantic Web paradigm promise to significantly improve end-users' interaction with content. For example, Parallax 33 offers a new way of browsing and exploring data stored in Freebase 34 – an open, semantically structured database of information of general interest. The tool leverages the faceted browsing paradigm to allow for seamless exploration of data. It also enables one to browse from one set of things to another
29 http://www.twine.com
30 http://innoraise.com/
31 http://sti.innoraise.com/
32 http://www.qitera.com/
33 http://mqlx.com/~david/parallax/index.html
34 http://www.freebase.com
related set of things (e.g., find the architects of skyscrapers in New York and all the structures that they have designed) – a novel and powerful mechanism for exploring data, much more efficient than browsing from one single thing to another single thing. Among the key representatives of the Social Web are mash-ups – Web applications allowing users to combine and integrate different types of data, often originating from different sources. Mapping mash-ups, in which maps are overlaid with information, may be the best-known example of this rapidly growing genre. Tools such as Google's Mashup Editor 35 or Yahoo Pipes 36 allow individuals to mix up data, find new meaning, and present it in interesting ways. The suite of tools developed in the scope of MIT's SIMILE 37 project (such as Exhibit [26], Potluck [27], and PiggyBank [28]) facilitates the creation of Semantic Web mash-ups – by leveraging Semantic Web technologies (primarily RDF and SPARQL), these mash-ups are more dynamic and flexible than those offered by Web 2.0 tools and services. Perhaps the most distinctive among these tools is Potluck, a tool that lets casual end-users (i.e., non-programmers) easily make mash-ups of structured, semantically rich data, often expressed in RDF or JSON 38 format. Potluck acknowledges the fact that real-world RDF is messy, "broken perhaps not just in syntax but also in semantics" [27], and empowers users to deal with this problem by providing them with visual editing facilities. In particular, the tool assumes an iterative process of data integration in which the user leverages the tool's rich visualization capabilities to explore the data, identify data of interest, and merge, align and/or clean up the data – all in an easy and intuitive manner. In education, tools like Potluck can be extremely valuable in helping students integrate previously disparate types of information and explore them from different perspectives and in more depth.
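The data-integration step at the heart of any mash-up can be reduced to joining records from different sources on a shared key, as in the minimal sketch below. All data and field names are hypothetical; real mash-up tools add source fetching, schema alignment and visualization on top of this core operation.

```python
# Minimal sketch of what a data mash-up does: join records from two
# hypothetical sources on a shared key, so the combined view carries
# information that neither source had alone.
architects = [  # source 1: who designed what
    {"building": "Chrysler Building", "architect": "William Van Alen"},
    {"building": "Flatiron Building", "architect": "Daniel Burnham"},
]
locations = [   # source 2: where the buildings are
    {"building": "Chrysler Building", "city": "New York"},
    {"building": "Flatiron Building", "city": "New York"},
]

def mashup(left, right, key):
    """Combine two record lists on `key` (an inner join, in database terms)."""
    by_key = {r[key]: r for r in right}
    return [{**l, **by_key[l[key]]} for l in left if l[key] in by_key]

for row in mashup(architects, locations, "building"):
    print(row["architect"], "->", row["city"])
```

The "new quality" the chapter speaks of is visible even here: neither source alone can answer "which architects built in New York", but the joined view can.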
Not only do mash-ups improve interactivity along the student-content and teacher-content dimensions, but they also introduce a novel form of content-content interaction. In particular, the mash-up resulting from the integration of disparate sources of data brings in a new quality (e.g., a new point of view, or a better understanding of some phenomenon) that is often more valuable than the pure sum of the integrated parts. In addition, mash-ups can be semantically annotated with data about the context of their creation (who created them, what data sources they used, for what purpose) and used as learning content in ILEs.

2.4. Supporting Interactivity across Different Systems and Tools

Currently, one of the major obstacles to collaborative creation and sharing of knowledge on the Social Web is the fact that online social networks are like isolated islands – knowledge can be exchanged within an island (i.e., a network) but not across islands, at least not without a lot of effort (i.e., manual copy-and-paste activities). For example, let us consider a student who is studying a certain domain topic and wants to acquire knowledge on that topic by leveraging the resources gathered by an expert or his/her peers. Unfortunately, the resources maintained by those people can be located on many different social networks. The student would spend a lot of time manually importing these resources, or might even abandon the operation in favor of using other, potentially less relevant or less trustworthy sources of knowledge.
35 http://code.google.com/gme/
36 http://pipes.yahoo.com/pipes/
37 http://simile.mit.edu/
38 http://json.org/
The integration of Semantic Web technologies into the Social Web paradigm promises to solve this problem. For example, Social Semantic Collaborative Filtering (SSCF) [29] allows users to easily share their knowledge with others within and across online social networks. For instance, one could easily import friends' bookmarks and utilise their expertise and experience in specific domains of knowledge. In addition, SSCF allows users to set fine-grained access rights for their resources. Access control is based on the distance and the friendship level between users, which is expressed using FOAFRealm 39. Semantic Web technologies are an important part of emerging efforts aimed at the decentralization of social networks. The common objective of projects like NoseRub 40 and DiSo 41 is to enable users to own and control their online profiles, including their contact lists and the streams of their online activities. A related project called Knowee 42, initiated by the Semantic Web Education and Outreach 43 group, is aimed at developing a Semantic Social Web address book – a distributed address book releasing users from the mundane task of maintaining their contact data. Instead, Knowee lets users subscribe to diverse Social Web applications and services, and the address book updates itself automatically; it also provides the user with his/her integrated social graph. All the above-mentioned efforts are important for e-learning, as they (seem to) offer a solution to the long-standing problem of integrating learner profiles from the different systems and tools that learners turn to during the learning process. They can also help to build comprehensive learner models and share them among these systems and tools. This can further contribute to surpassing the paradigm of "walled garden" learning environments (typical of traditional e-learning systems, like Learning Management Systems) and replacing it with the novel paradigm of Personal Learning Environments (PLEs) [30].
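The profile-integration problem mentioned above becomes almost trivial once the fragments are expressed as RDF-style triples that identify the learner by a shared URI: merging is then simply a set union, with duplicated statements collapsing automatically. A minimal sketch, with all data hypothetical and triples imitated by plain tuples:

```python
# Rough sketch of integrating learner-profile fragments held by different
# tools. With RDF (e.g., FOAF-based profiles) this merge is a graph union
# keyed on shared URIs; here triples are imitated with plain tuples.
ana = "http://example.org/people/ana"

lms_profile = {          # fragment from a Learning Management System
    (ana, "name", "Ana"),
    (ana, "completed", "Intro to RDF"),
}
forum_profile = {        # fragment from a discussion-forum tool
    (ana, "name", "Ana"),
    (ana, "knows", "http://example.org/people/boris"),
}

# Because both fragments identify Ana by the same URI, integration is a
# set union; the duplicated "name" statement collapses automatically.
merged = lms_profile | forum_profile
print(len(merged))  # -> 3
```

The hard part in practice is agreeing on shared identifiers across systems, which is exactly what the decentralization efforts above aim to provide.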
A PLE allows a learner to interact with diverse systems, tools and services to access content, assess his/her knowledge, collaborate with peers and the like. Another important project that utilizes the Social Semantic Web paradigm to provide support for interactivity at Web scale is Semantically-Interlinked Online Communities (SIOC) 44. SIOC is an initiative aimed at enabling the integration of user-generated content and information contained both explicitly and implicitly in Web discussion media such as blogs, forums and mailing lists. The cornerstone of this initiative is the SIOC ontology, which allows for a machine-readable and formal representation of all data relevant for keeping track of various kinds of Web discussions [31]. Applied in educational settings, the SIOC ontology enables the gathering of data about all kinds of interactions that a student has had on the Web, and allows for the inference of additional knowledge about the student that can be beneficial for improving his/her student model. For example, an ILE could analyze online discussions in which the student participated, and relate the messages that the student exchanged with his/her peers to the topics of domain ontologies in order to infer the student's level of mastery of some of the domain topics. In addition, this offers an additional knowledge base of unofficial content that can be recommended to students while studying related
39 See Section 2.5 for an explanation of FOAFRealm.
40 http://noserub.org/
41 http://diso-project.org/
42 http://knowee.net/
43 http://www.w3.org/blog/SWEO/2007/03/06/community_project_support
44 http://sioc-project.org/
J. Jovanovic et al. / E-Learning and the Social Semantic Web
concepts, or to educators for inclusion in the ‘official’ course content. However, in order to fully leverage the potential offered by SIOC, we need rules and heuristics that would allow for the interpretation of the interaction data and the inference of relevant knowledge about students.

2.5. Integrated Interaction Data for Adaptation and Personalization

Current learning practices are often based on the individual use of diverse learning systems, tools and services. One of the major problems with this ‘fragmented’ approach is its lack of means for enabling the exchange of data about the activities that students perform within individual learning systems/tools and the learning artifacts they produce during these activities. Besides, with such an approach it is very hard to provide support for context-aware learning services and to offer a personalized learning experience to students. To address this issue, in our recent research work we have developed a collaborative learning environment named DEPTHS (DEsign Patterns Teaching Help System) [32]. It relies on the Social Semantic Web paradigm to integrate several existing, proven learning systems and tools and provide students with context-aware learning services. In particular, DEPTHS integrates:

- an existing Learning Management System, which enables students to learn at the pace and in a place that best suit them, while providing them with a variety of learning activities and resources;
- a software modeling tool that enables students to experience pattern-based software development in the context of real-world problems;
- diverse collaboration tools supporting different kinds of collaborative activities, such as discussions, collaborative tagging, and commenting; and
- relevant online repositories of software design patterns, which provide students with plenty of important resources on design patterns, containing both valuable examples of design patterns in use and instructions on how they should be used.
The integration of all these components is achieved by leveraging the LOCO (Learning Object Context Ontologies) framework [33]. LOCO is a comprehensive ontological framework aimed at formally representing diverse kinds of learning situations (i.e., learning contexts), as well as the diverse kinds of interactions that occur during a learning process (e.g., students' mutual interactions and their interactions with the learning content). It allows one to formally represent all the particularities of a given learning context: the learning activity, the learning content that was used or produced, and the student(s) involved. Accordingly, the framework integrates a number of learning-related ontologies, such as a learning context ontology, a user model ontology, and domain ontologies. Together, these ontologies preserve the semantics of any given learning context in a machine-interpretable format, thus allowing for the development of context-aware learning services. DEPTHS currently makes use of two ontologies of the LOCO framework: a domain ontology is used for representing the domain of software patterns, whereas the learning context ontology was extended to allow for the capturing and unambiguous representation of learning contexts specific to the systems and tools that DEPTHS integrates. Context-aware learning services offered by DEPTHS are accessible to all systems and tools integrated in the DEPTHS framework and are exposed to end users (students)
as context-aware learning features. Based on the student’s current learning context, these services provide students with recommendations regarding: 1) relevant Web resources, 2) relevant internally produced resources (e.g., discussion threads, brainstorming notes, and project descriptions), and 3) peers, teachers, or experts as possible collaborators. These recommendations are based on the formally represented semantics of the student’s learning context and learning resources (both online resources and those internally produced). Even though DEPTHS was developed for the domain of software design patterns, with very slight modifications it can be applied equally well to any other learning domain. DEPTHS is yet another example of how the Social Semantic Web paradigm can be the key enabler in surpassing the paradigm of “walled garden” learning environments and replacing it with the novel paradigm of Personal Learning Environments. As shown above, the systems and tools that the student interacts with within DEPTHS also communicate with each other to exchange data about students’ interactions, and use that data for adaptation purposes, recommendation of content and peers, and generation of feedback for teachers (as suggested in [34]). The integrated interaction data can also be used for enhancing user models (not just student models, but also teacher models) with knowledge about their social relations. These can be expressed by using, for example, the FOAF (Friend-Of-A-Friend) ontology [35]. Due to its popularity and wide acceptance among Web users and communities (the number of FOAF profiles on the Web is already in the tens of millions), this ontology has become the basis for building domain/application-specific ontologies for user and group modeling. Moreover, in the context of ILEs, it offers the potential for seeking peer support while studying certain topics, as well as for indicating and/or creating successful learning paths of fellow students. 
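A collaborator-recommendation service of the kind DEPTHS exposes could, in its simplest form, rank candidates by the overlap between learner profiles. The sketch below is a minimal illustration in plain Python, not the actual DEPTHS implementation; the profile store, the topic names, and the overlap-based scoring rule are assumptions:

```python
# Hypothetical learner profiles: the set of domain topics each user has
# worked on, as an integrated context store might expose them.
PROFILES = {
    "studentB": {"Observer", "Strategy"},
    "studentA": {"Observer", "Strategy", "Factory"},
    "teacher1": {"Factory", "Singleton"},
}

def recommend_collaborators(profiles, user, current_topic):
    """Rank other users by topic overlap with the requesting user,
    considering only those who have touched the current topic."""
    own = profiles[user]
    candidates = [
        (len(own & topics), peer)
        for peer, topics in profiles.items()
        if peer != user and current_topic in topics
    ]
    return [peer for score, peer in sorted(candidates, reverse=True)]
```

For a student working on the Observer pattern, only peers with Observer experience are considered, ordered by how much of their history they share with the requester.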
Similarly, the interaction data can be used for inferring a user’s reputation, that is, how he/she is perceived by the other members of the community (e.g., how competent a particular student is in a particular subject area according to his/her peers). This knowledge can be represented using the FOAFRealm ontology [36] – an extension to the FOAF ontology which allows users to express how well one person knows, or trusts, another – and leveraged by an ILE for providing recommendations.

2.6. Ubiquitous Access to Learning Resources

The notion of context as an aggregate of spatial and temporal aspects of a user’s situation is becoming increasingly important with the constantly growing usage of smart phones and the emergence of mobile social networks. This nascent but constantly growing trend of location-based social networking is empowered by GPS technology and platforms like Yahoo!’s FireEagle 45 and Google Latitude 46 that enable one to share his/her location online. There are already a number of services that make use of these public location data to allow users to find friends located nearby, to discover and share what is happening in the vicinity, or to get contextualized search results. These kinds of services can also be highly beneficial for educational purposes, as suggested in [37]. In particular, the location data can be used for the ad-hoc detection of fellow students who are nearby and the organization of F2F meetings and assignment-based study groups. This possibility can be especially relevant for blended learning. For example, in the
45 http://fireeagle.yahoo.net/
46 http://www.google.com/latitude/
context of DEPTHS (see Section 2.5), its semantically-enabled peer discovery service might identify student A as the most relevant person for the current problem that student B is trying to solve. Accessing student A’s online presence data, the system learns that she is ‘away’ (from her online status), but also that she is in the same building as student B (from her current location data). Therefore, the system can offer student B the option to contact student A via SMS for an ad-hoc F2F meeting. Of course, the online sharing of one’s location and other context data has important privacy implications. To deal with them, Yahoo!, for example, allows users to turn FireEagle off when they want to keep their location private. However, this can be considered only an initial solution, since more fine-grained management of private data should be enabled (e.g., enabling one to define with whom he/she is willing to share his/her location data). The above issues call for various ways of regulating access to private data. To date, the most relevant solutions are based on the use of policy languages such as Ponder, KAoS, Rei, PeerTrust, and XACML [38]. Typically defined over ontologies, policy languages provide a reliable mechanism for (rule-based) reasoning in open environments, where relying on the roles and institutions users may belong to is not possible [39]. Current policy languages instead allow for context-based reasoning, where one can only leverage the knowledge coming from the shared vocabularies (i.e., ontologies) used by different communities and the reputation individuals have gained in different communities. However, the management of policies today requires a lot of technical knowledge, which in general hinders the wide adoption of policy-based approaches for privacy protection. Accordingly, there is a need for the development of user-friendly interfaces for policy management. 
Moreover, we cannot expect end-users to define a policy for each possible threat that may arise; we also need to develop mechanisms for the automatic, context-aware detection of privacy threats by leveraging ontology-based definitions of contexts. Similar to the relationship between ontologies and folksonomies, there is a need to investigate policy languages that allow for reasoning over socially constructed knowledge, in addition to formally defined ontologies.
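As a toy illustration of the rule-based policy evaluation discussed above (deliberately far simpler than Ponder, KAoS, Rei, PeerTrust, or XACML, whose actual syntaxes are not reproduced here), location-sharing policies can be modelled as an ordered list of condition/decision rules with a privacy-preserving default:

```python
# Hypothetical policy rules for sharing one's location: (resource,
# condition over requester attributes, decision). First match wins.
POLICIES = [
    ("location", lambda req: req.get("is_friend") and req.get("same_course"), "allow"),
    ("location", lambda req: req.get("role") == "teacher", "allow"),
    ("location", lambda req: True, "deny"),  # default: keep location private
]

def evaluate(policies, resource, requester):
    """Return the decision of the first policy rule matching the request."""
    for res, condition, decision in policies:
        if res == resource and condition(requester):
            return decision
    return "deny"  # deny when no rule matches at all
```

Real policy languages express the conditions over ontology terms (roles, relationships, resource types) rather than ad-hoc attribute flags, but the deny-by-default ordering shown here is the same privacy-protective pattern.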
3. Conclusion

As Tom Gruber stresses, it is a "popular misconception that the two worlds (Social Web and Semantic Web) are alternative, opposing ideologies about how the Web ought to be. Folksonomy vs. ontology. Practical vs. formalistic. Humans vs. machines. This is nonsense, and it is time to embrace a unified view" [9]. E-learning can make use of this unified view to a great extent. Since both the Social Web and the Semantic Web have advantages and deficiencies, why not take the best of both worlds and make a synergy of these technologies for the benefit of both students and teachers? In addition, why not identify and tackle e-learning problems that neither of the two technologies addresses properly, and make the synergy open for "third-party add-ons"? Note that both Social and Semantic Web technologies lack a sound software engineering foundation, and both would benefit from deploying advanced, personalized, and multimodal user interfaces for knowledge and data acquisition and sharing. Research efforts are already underway that indicate how to underpin e-learning applications relying on Social and Semantic Web technologies with stable software engineering methodologies, thus producing more robust and more useful e-learning
systems [40], [41]. More automation is certainly welcome in the area of semantic annotation of learning resources, where social tagging and folksonomies represent, at best, the first step on the ladder. In addition, there is a need for human-supported approaches, where the tools must be both highly usable by teachers and learners (well beyond current tools, which often require training to the level of a Computer Science degree) and also highly motivating or natural, in order to actually get learners and teachers to use them. A potential solution lies in dialog-based human-computer interaction and natural language interfaces, which are both very social and semantically very rich, so they can be investigated in e-learning systems as a natural extension of the synergy of the Semantic and Social Web. Last but not least, there are still very few e-learning applications that really reason over their data, resources, learner models, and the like. This creates a great challenge for future exploration and integration of the Social Semantic Web with more advanced technologies.
References
[1] T. D. Anderson, D. R. Garrison. Learning in a networked world: New roles and responsibilities. In Chère Campbell Gibson (Ed.), Distance learning in higher education: Institutional responses for quality outcomes. Madison, WI: Atwood Publishing, 1998, 65-76.
[2] T. Berners-Lee, J. Hendler, O. Lassila. The Semantic Web, Scientific American 284(5) (2001), 34-43.
[3] L. Feigenbaum, I. Herman, T. Hongsermeier, E. Neumann, S. Stephens. The Semantic Web in Action, Scientific American 297 (2007), 90-97.
[4] N. Shadbolt, T. Berners-Lee, W. Hall. The Semantic Web Revisited, IEEE Intelligent Systems 21(3) (2006), 96-101.
[5] D. Gašević, J. Jovanović, V. Devedžić. Ontology-based Annotation of Learning Object Content, Interactive Learning Environments 15(1) (2007), 1-26.
[6] T. O'Reilly. What Is Web 2.0 – Design Patterns and Business Models for the Next Generation of Software (2005). [Online]. Available at: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
[7] A. Mikroyannidis. Toward a Social Semantic Web, IEEE Computer 40(11) (2007), 113-115.
[8] J. Breslin, S. Decker. Semantic Web 2.0: Creating Social Semantic Information Spaces, Tutorial at the World Wide Web Conference, Edinburgh, Scotland, 2006. http://www2006.org/tutorials/#T13
[9] T. Gruber. Collective Knowledge Systems: Where the Social Web meets the Semantic Web, Journal of Web Semantics (2008).
[10] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, Z. Ives. DBpedia: A Nucleus for a Web of Open Data. In Proceedings of the 6th Int'l Semantic Web Conference (ISWC 2007), Busan, Korea, 2007.
[11] R. Mizoguchi, J. Bourdeau. Using Ontological Engineering to Overcome Common AI-ED Problems, International Journal of AI in Education 11 (2000), 1-12.
[12] T. Murray. Authoring Knowledge-Based Tutors: Tools for Content, Instructional Strategy, Student Model, and Interface Design, Journal of Learning Sciences 7(1) (1998), 5-64.
[13] A. Mitrović, V. Devedžić. A model of multitutor ontology-based learning environments, International Journal of Continuing Engineering Education and Life-Long Learning 14(3) (2004), 229-245.
[14] E. Motta, M. Sabou. Next Generation Semantic Web Applications, Asian Semantic Web Conference (keynote), China, 2006.
[15] M. Hepp, K. Siorpaes, D. Bachlechner. Harvesting Wiki Consensus: Using Wikipedia Entries as Vocabulary for Knowledge Management, IEEE Internet Computing 11(5) (2007), 54-65.
[16] C. Basca, S. Corlosquet, R. Cyganiak, S. Fernández, T. Schandl. Neologism – Easy Vocabulary Publishing. In Proceedings of the 4th Workshop on Scripting for the Semantic Web, Tenerife, Spain, June 2008.
[17] C. Torniai, J. Jovanović, D. Gašević, S. Bateman, M. Hatala. E-learning meets the Social Semantic Web. In Proceedings of the 8th IEEE Int'l Conference on Advanced Learning Technologies (ICALT 2008), Santander, Cantabria, Spain, 2008, 389-393.
[18] C. Torniai, J. Jovanović, S. Bateman, D. Gašević, M. Hatala. Leveraging Folksonomies for Ontology Evolution in E-learning Environments. In Proceedings of the 2nd IEEE International Conference on Semantic Computing, Santa Clara, CA, USA, 2008, 206-215.
[19] P. Brusilovsky, W. Nejdl. Adaptive Hypermedia and Adaptive Web. In M. P. Singh (Ed.), Practical Handbook of Internet Computing. Baton Rouge: Chapman Hall & CRC Press, 2005, 1.1-1.14.
[20] B. Adida, M. Birbeck, S. McCarron, S. Pemberton. RDFa in XHTML: Syntax and Processing, W3C Candidate Recommendation, 20 June 2008. Available at: http://www.w3.org/TR/2008/CR-rdfa-syntax-20080620/
[21] D. Connolly (Ed.). Gleaning Resource Descriptions from Dialects of Languages (GRDDL), W3C Recommendation, 11 September 2007. Available at: http://www.w3.org/TR/grddl/
[22] J. Jankowski, A. Westerski, S. R. Kruk, T. Nagle, J. Dobrzanski. IKHarvester – Informal eLearning with Semantic Web Harvesting. In Proceedings of the 2nd IEEE International Conference on Semantic Computing (ICSC 2008), Santa Clara, CA, USA, 2008.
[23] J. Jankowski, F. Czaja, J. Dobrzanski. Adapting informal sources of knowledge to e-learning. In Proceedings of the 5th Annual Teaching and Learning Conference (CELT 2007), 2007.
[24] H. Kim, A. Passant, J. Breslin, S. Scerri, S. Decker. Review and Alignment of Tag Ontologies for Semantically-Linked Data in Collaborative Tagging Spaces. In Proceedings of the 2nd International Conference on Semantic Computing, San Francisco, USA, 2008.
[25] A. Passant, P. Laublet. Meaning Of A Tag: A collaborative approach to bridge the gap between tagging and Linked Data. In Proceedings of the WWW 2008 Workshop Linked Data on the Web (LDOW2008), Beijing, China, 2008.
[26] D. F. Huynh, D. R. Karger, R. C. Miller. Exhibit: lightweight structured data publishing. In Proceedings of the 16th International Conference on World Wide Web, Banff, Alberta, Canada, 2007, 737-746.
[27] D. F. Huynh, R. C. Miller, D. R. Karger. Potluck: Data Mash-Up Tool for Casual Users. In Proceedings of the 6th International Semantic Web Conference, Busan, Korea, 2007, 239-252.
[28] D. F. Huynh, S. Mazzocchi, D. R. Karger. Piggy Bank: Experience the Semantic Web inside your web browser, Journal of Web Semantics 5(1) (2007), 16-27.
[29] S. R. Kruk, S. Decker, A. Gzella, S. Grzonkowski, B. McDaniel. Social semantic collaborative filtering for digital libraries, Journal of Digital Information, Special Issue on Personalization, 2006.
[30] G. Attwell. The Personal Learning Environments – the future of eLearning?, eLearning Papers 2(1) (2007). [Online]. Available at: http://www.elearningeuropa.info/files/media/media11561.pdf
[31] U. Bojārs, J. Breslin (Eds.). SIOC Core Ontology Specification (2008). [Online]. Available at: http://rdfs.org/sioc/spec/
[32] Z. Jeremic, J. Jovanovic, D. Gasevic. Towards a Semantic-rich Collaborative Environment for Learning Software Patterns. In Proceedings of the 3rd European Conference on Technology Enhanced Learning, Maastricht, The Netherlands, 2008, 155-166.
[33] J. Jovanović, C. Knight, D. Gašević, G. Richards. Ontologies for Effective Use of Context in e-Learning Settings, Educational Technology & Society 10(3) (2007), 47-59.
[34] J. Jovanović, D. Gašević, C. Brooks, V. Devedžić, M. Hatala, T. Eap, G. Richards. Using Semantic Web Technologies for the Analysis of Learning Content, IEEE Internet Computing 11(5) (2007), 45-53.
[35] D. Brickley, L. Miller. FOAF Vocabulary Specification (2005). [Online]. Available at: http://xmlns.com/foaf/spec/
[36] S. R. Kruk. FOAF-Realm – control your friends' access to the resource. In FOAF Workshop Proceedings, 2004.
[37] M. Siadaty, C. Torniai, D. Gašević, J. Jovanović, T. Eap, M. Hatala. m-LOCO: An Ontology-based Framework for Context-Aware Mobile Learning. In Proceedings of the 6th International Workshop on Ontologies and Semantic Web for Intelligent Educational Systems at the 9th Int'l Conf. on Intelligent Tutoring Systems, Montreal, Canada, 2008.
[38] P. Bonatti, C. Duma, B. E. Fuchs, W. Nejdl, D. Olmedilla, J. Peer, N. Shahmehri. Semantic Web Policies – A Discussion of Requirements and Research Issues. In Proceedings of the 3rd European Semantic Web Conference, Przno, Montenegro, 2006, 712-724.
[39] J. L. De Coi, P. Kärger, A. W. Koesling, D. Olmedilla. Control your elearning environment: Exploiting policies in an open infrastructure for lifelong learning, IEEE Transactions on Learning Technologies 1(1) (2008).
[40] S. Radenković, N. Krdžavac, V. Devedžić. MDA and Semantic Web Technologies for Assessment Systems. In Proceedings of the 6th International Workshop on Ontologies and Semantic Web for E-Learning, Montreal, Canada, 2008. [Online]. Available at: http://compsci.wssu.edu/iis/swel/SWEL08/Papers/Krdzavac.pdf
[41] T. Klobučar. iCamp Space – an environment for self-directed learning, collaboration and social networking, WSEAS Transactions on Information Science and Applications 5(10) (2008), 1470-1479.
Semantic Web Technologies for e-Learning
D. Dicheva et al. (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-062-9-260
CHAPTER 14
Lessons Learned using Social and Semantic Web Technologies for E-Learning

Christopher BROOKS 1, Scott BATEMAN, Jim GREER and Gord MCCALLA
Laboratory for Advanced Research in Intelligent Educational Systems (ARIES)
Department of Computer Science, University of Saskatchewan, Canada
Abstract. This paper describes work we have done over the past five years in developing e-learning applications and tools using Semantic Web and Web 2.0 technologies. It does not provide a comprehensive description of these tools; instead, we focus on the pitfalls we have encountered and attempt to share experiences in the form of design patterns for developers and researchers looking to use these technologies. Keywords. Semantic Web, E-Learning, Web 2.0, Social Semantic Web, Lessons Learned
Introduction

Over the past five years we have been involved in numerous research projects with the goal of building e-learning systems or components that contain "intelligent" features. These features are typically diverse, and include collaborative filtering, information visualization, data mining, and instructional planning. This research has resulted in a number of different tools that are used to deliver education at our institution (e.g. the iHelp Courses [1] learning content management system and the Recollect [2] video course casting system), as well as various research implementations used to study particularly interesting phenomena we have observed (e.g. the Open Annotation and Tagging System [3]). Through our work we have gained some insights into what works and what doesn't when dealing with Semantic Web and Web 2.0 technologies. Perhaps more interestingly, we have identified trade-offs between the two technologies which suggest that, while they may not be diametrically opposed to one another, there may be reasons to choose one and not the other. The educational domain we have been principally interested in is undergraduate-level Computer Science, but we believe that the lessons we have learned are largely independent of this domain.

This chapter will describe three different Semantic Web and Web 2.0 themes we have investigated. The first of these themes focuses on architecture, and is the direct result of our previous investigation of agents as brokers of semantic information in e-learning applications. The application we developed, the Massive User Modelling System (MUMS) [4], is a Semantic Web middleware research prototype. Its goal is to act as both an aggregator and a specialization system for semantic information being generated by e-learning applications. The information collected by MUMS can then be used to reason about and react to the choices a learner makes in a heterogeneous learning environment. This system was built in 2003 and used in conjunction with the iHelp Courses and iHelp Discussions learning environments until 2005.

The second theme that will be discussed is the form and semantics of metadata associated with learning content. Informed by our work implementing a learning content management system, we will focus specifically on the challenges of integrating metadata with learning objects using Semantic Web technologies. Metadata is a broad topic and a comprehensive treatment of the issues is outside the scope of this chapter. Instead, we will give particular treatment to the issue of why we need metadata, and argue that the creation of metadata should not happen without thought about how the content is likely to be used.

The discussion of metadata will dovetail with our third theme: how end users can explicitly create metadata. In this part of the paper we describe two systems we have deployed aimed at end-user-created metadata: CommonFolks and the Open Annotation and Tagging System (OATS). These systems use a mix of social and semantic web technologies, and experiences with user studies will be used to describe how social semantic web technologies are used in practice. Explicit metadata creation begs to be followed up by a discussion of implicit metadata creation. The fourth section of this paper will describe two data mining projects underway that look separately at the content and usage of learning resources to understand more about how they can be reused. We conclude with a synopsis of the main lessons we have learned, as well as a brief discussion of where we see social and semantic learning technologies heading in the next decade.

1 Corresponding Author: Christopher Brooks, Laboratory for Advanced Research in Intelligent Educational Systems (ARIES), Department of Computer Science, University of Saskatchewan, Canada; E-mail: [email protected].
1. Semantic Web Architecture

The last twenty years have seen an explosion in the diversity and scale of e-learning environments. Scientific and educational research in the fields of artificial intelligence in education, intelligent tutoring systems, computer-supported collaborative learning, and the learning sciences has resulted in a wide range of different pedagogical approaches and software solutions for increasing engagement and learning outcomes. The diversity of these solutions has been amazing, and as technologies have changed, so too have the frameworks that researchers use to communicate between the various components of their systems. Various paradigms have been used for component interaction, including agent-oriented programming, remote method invocation, and service-oriented architectures. Similarly, the advent of the Web has helped to distribute learning content at low cost to people all over the world. While content started off as simple text with accompanying graphics, it has very quickly grown to include online assessment, interactive applications, and even graphics-intensive virtual worlds. The increase in quantity and diversity of content has created a challenge for educational institutions to manage this content in effective ways. Learning Content Management Systems (LCMSs), such as WebCT and Moodle, are now commonplace at higher education institutions across the globe, and with the introduction of the Shareable Content Object Reference Model (SCORM) specifications [26] penetration into
corporate training environments is also on the rise. Despite the vast array of functionality required by different institutions, the LCMS market is controlled by a few major players (e.g. Blackboard), with more diversity in niche areas usually associated with a particular subject field (e.g. the health sciences). In addition to accessing research prototypes and content management systems, students are increasingly turning to web-enabled tools to assist in their learning. Computer-mediated communication tools such as discussion boards, blogs, microblogging, and wikis are often used to aid learning, even if not officially endorsed by the instructor or the institution. Tying into the data being created by these tools represents a significant opportunity for e-learning-oriented Semantic Web researchers, but the challenge of integrating both functionally and semantically with these tools is not simple. In the past [5] we have leveraged agent-based solutions to help mediate this process. Individual agents representing users, processes, or pieces of data can negotiate with one another and trade information as needed. At a metaphorical level this increases cohesion and data encapsulation, but has the negative effect of increasing the coupling and dependencies of agents to one another. At first consideration it may seem contradictory that there is an increase in dependency when agents are considered autonomous (as they often are). However, the dependency really exists at the level of semantics: as agents exchange information between themselves for different purposes, they need to have a shared vocabulary both for data transport (e.g. speech acts) and for metadata (for reasoning). Semantic Web technologies (notably RDF [28] and OWL [29]) are appropriate for creating semantically meaningful data markup, and there has been much work done (e.g. FIPA [30]) in creating semantics for speech acts. 
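The coupling problem this creates can be illustrated with a minimal sketch of a speech-act message exchange. The performative name follows FIPA-ACL, but the message structure, the vocabulary check, and all identifiers are illustrative assumptions rather than any agent framework's actual API:

```python
# Hypothetical FIPA-style message: a speech-act performative wrapping an
# RDF-like triple payload. Both agents must share the same vocabulary.
INFORM = "inform"  # one of FIPA-ACL's communicative acts

def make_message(sender, receiver, triple):
    return {"performative": INFORM, "sender": sender,
            "receiver": receiver, "content": triple}

def handle(message, vocabulary):
    """A receiving agent can only interpret predicates it already knows:
    any change to the shared conceptualization breaks this check."""
    s, p, o = message["content"]
    if p not in vocabulary:
        raise ValueError("unknown predicate: " + p)
    return (s, p, o)

VOCAB = {"hasRead", "postedTo"}
msg = make_message("tracker-agent", "modeller-agent",
                   ("alice", "hasRead", "lecture1"))
```

If the producing agent starts emitting a renamed predicate (say "hasViewed"), every consumer's `handle` fails until it is retooled, which is exactly the retooling cost discussed next.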
However, arriving at a shared conceptualization in a distributed system is not a trivial issue. Changes to this conceptualization often require a retooling of many of the applications in the learning environment to “speak” according to the new semantics. Changes also often increase the need for agents to depend on and negotiate with one another about the data being exchanged, as the semantics of how this data is represented may also have changed. Thus the semantics of the information being exchanged between agents couples those agents together. Classical agent systems often implement bidirectional agent communication, which further reduces the resiliency of the system to semantic changes. The blackboard architecture [31], originally developed for speech understanding research at Carnegie Mellon University and explored more widely in distributed artificial intelligence research, has similar problems. While the “knowledge sources” in a blackboard architecture are nominally independent, they often have implicit dependencies that must be understood by the designers if the resulting application system is to work effectively. What is needed is a distributed systems approach that keeps the needed modularity, yet puts structure on both the entities and the flow of information in the system, appropriate to the application domain, and makes the dependencies explicit. This is the goal of the Massive User Modelling System (MUMS) [4], designed for use in systems, such as e-learning systems, that must respond adaptively to differences in users and context. MUMS draws from the producer-consumer model to identify a higher-level information flow in Semantic Web systems, especially those used for e-learning. While MUMS is aimed at sharing data about users, a similar model could be used for sharing any form of data. In our model we separate components (or agents, provided their autonomy remains limited) into three categories (Figure 1):
1. Evidence Producers: observe user interaction with an application and publish opinions about the user. These opinions can range from direct observations of the interaction that has taken place, to beliefs about the user’s knowledge, desires, and intentions. While the opinions created can be of any size, the focus is on creating brief contextualized statements about a user, as opposed to fully modelling the user.

2. Modellers: are interested in acting on opinions about the user, usually by reasoning over these to create a user model. The modeller then interacts with the user (or other aspects of the system, such as learning materials) to provide adaptation. Modellers may be interested in modelling more than one user, and may receive opinions from more than one producer. Further, modellers are often situated and perform purpose-based user modelling by restricting the set of opinions they are interested in receiving.

3. Broker: acts as an intermediary between producers and modellers. The broker receives opinions from producers and routes them to interested modellers. Modellers communicate with the broker using either a publish/subscribe model or a query/response model. While the broker is a logically centralized component, different implementations may find it useful to distribute and specialize the services being provided for scalability reasons.

The principal data artifact that these components exchange is the opinion. Specifically, an opinion is a temporally grounded codification of a fact about a set of users from the perspective of a given event producer. Particular importance should be paid to the issue of perspective; different components may have different models of what a user is doing and why. Opinions are used to influence those models through reasoning, but not to replace them. In this way, the data exchange is more than a remote method invocation (e.g.
CORBA, or Web Services) and much more analogous to the kinds of jobs in which one expects agents to be used. This three-entity system purposefully supports the notion of active learner modelling [17]. In the active learner modelling philosophy, the focus is on creating a learner model situated for a given purpose, as opposed to creating a complete model of the learner. This form of modelling tends to be less intensive than traditional user modelling techniques, and focuses on the just-in-time creation and delivery of models instead of the storage and retrieval of models. The MUMS architecture supports this by providing both stream-based publish/subscribe and archival query/response methods of obtaining opinions from a broker. Both of these modes of event delivery require that modellers provide a semantic query for the opinions they are interested in, as opposed to the more traditional event system notions of channel subscription and producer subscription. This approach decouples the producers of information from the consumers of information, and leads to a more easily adaptable system where new producers and modellers can be added in an as-needed fashion. The stream-based method of retrieving opinions allows modellers to provide just-in-time reasoning, while the archival method allows for more resource-intensive user modelling to occur. All opinions transferred within the MUMS system include a timestamp indicating when they were generated, allowing modellers to build up more complete or historical user models using the asynchronous querying capabilities provided by the broker.

In our implementation of this architecture, we use the RDQL [27] query language for registering for opinions of interest, and RDF/OWL as the data format of opinions. This allows Modellers to ask for opinions that fit both schematic queries (e.g.
when a user has read some content) as well as queries that are instance specific (e.g. when a user has read the content on the artificial intelligence topic of language translation). By applying the adaptor pattern [6] to the system, a fourth entity of interest can be derived, namely the filter.

4. Filters: act as broker, modeller, and producer of opinions. By registering for and reasoning over opinions from producers, a filter can create higher-level opinions. This offloads some of the work a modeller must do to form a user model, but maintains the more flexible decentralized environment. Filters can be chained together to provide any amount of value-added reasoning that is desired. Finally, filters can be specialized within a particular instance of the framework by providing domain-specific rules that govern the registration, processing, and creation of opinions. Filters are not built-in components of the system, but are an example of how the system can be extended using well-known software engineering techniques. Designers can choose to use filters, if they like, to essentially allow the creation of “higher level opinions”.

Interactions among the MUMS entities are shown in Figure 1. Some set of evidence producers publish opinions, based on observations of the user, to a given broker. The broker routes these opinions to interested parties (in this case, both a filter and the modeller towards the top of the diagram). The filter reasons over the opinions, forms derivative statements, and publishes these new opinions back to the broker and any modellers registered with the filter. Lastly, modellers interested in retrieving archival statements about the user can do so by querying any entity which stores these opinions (in this example, the second modeller queries the broker instead of registering for real-time opinion notification).
Figure 1. A logical view of the MUMS architecture from [4].
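The interactions described above can be sketched in a few lines of Python. This is a deliberately simplified stand-in of our own devising: MUMS exchanges RDF/OWL opinions and takes RDQL queries as subscriptions, whereas here opinions are plain dictionaries and a "semantic query" is just a predicate function over their contents. A filter appears as a component that both subscribes to and publishes opinions.

```python
import time

class Broker:
    """Routes opinions from producers to modellers whose subscription matches."""
    def __init__(self):
        self.subscriptions = []   # (predicate, callback) pairs
        self.archive = []         # stored opinions for the query/response mode

    def subscribe(self, predicate, callback):
        # Stand-in for an RDQL subscription: a predicate over opinion contents.
        self.subscriptions.append((predicate, callback))

    def publish(self, opinion):
        # Every opinion is temporally grounded with a timestamp.
        opinion.setdefault("timestamp", time.time())
        self.archive.append(opinion)
        for predicate, callback in self.subscriptions:
            if predicate(opinion):
                callback(opinion)

    def query(self, predicate):
        # Archival query/response mode: return all matching stored opinions.
        return [o for o in self.archive if predicate(o)]

broker = Broker()

# A modeller registers only for reading events (purpose-based modelling).
model = []
broker.subscribe(lambda o: o["verb"] == "read", model.append)

# A filter subscribes to raw events and republishes higher-level opinions.
def filter_cb(opinion):
    broker.publish({"user": opinion["user"], "verb": "studied-topic",
                    "object": opinion["object"]})
broker.subscribe(lambda o: o["verb"] == "read", filter_cb)

# An evidence producer publishes an observation of user interaction.
broker.publish({"user": "alice", "verb": "read", "object": "html-tutorial"})
```

Note how the producer never learns who, if anyone, consumed its opinion: adding or removing modellers requires no change on the producer side, which is the decoupling the architecture is after.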
This architecture addresses concerns people have had in the past with distributed e-learning systems. First, the architecture clearly separates the duties of creating
semantic information and consuming it, and weights the reasoning in the system towards modellers and filters. This reduces the overhead required of application developers when integrating their systems with more research-focused software. This allows for both robustness and extensibility in e-learning environments, as a change in the evidence being created by a given evidence producer only impacts modellers that have explicitly registered for those messages. Further, the availability of a modeller to receive data does not impact the evidence producers, as there is no bidirectional contact between them.

While this architecture addresses the issues related to sub-optimal communication between components within an e-learning system, it does not address the issue of changes at the conceptual (e.g. schematic or ontological) level. New ontologies being introduced by designers of Evidence Producers need to be shared with Modeller designers who intend to use this data. However, unlike in a traditional agent system, these designers do not have to implement, test, and deploy their changes in a coordinated fashion to maintain the robustness of the overall e-learning solution. Instead, their communication can take place asynchronously, with the Evidence Producer designers (often traditional software engineers) adding new semantics when they are available to be captured, and Modeller designers (typically research-focused individuals) subscribing to these semantics when needed.
2. Learning Object Metadata

While Semantic Web and Web 2.0 technologies were being developed by information technology researchers and practitioners in the late 1990s, work was also beginning on standards for describing educational content. The most popular and comprehensive approach taken, the Learning Object Metadata (LOM) standard [7], was finalized in 2004 and has seen significant adoption by educational technology vendors. But the effectiveness of this standard has been questioned by some, including ourselves [8]. Principal among our issues with the LOM is its extremely broad goal “…to facilitate search, evaluation, acquisition, and use of learning objects, for instance by learners or instructors or automated software processes” [7]. Our experiences when using the LOM to categorize learning content showed that it is often very difficult to annotate content in a way that is both intuitive to humans and logically clear for software processes. For example, human annotators typically use free-form text, abbreviations, and local terminology when identifying the topics that are covered in a piece of content. Despite the creation of application profiles aimed at standardizing vocabularies to particular schemas (e.g. CanCore [9]), annotators often diverge from these schemes and use content-specific wording. While this may be reasonable for humans involved in the metadata process (the idea being that someone searching for a particular piece of content will have enough domain knowledge to know how that content was described), the lack of standardized vocabularies makes comparison between pieces of content by software processes very difficult. This becomes even more of an issue when fields in the LOM have a form that implies rigorous semantics to human users of the metadata but that may be misinterpreted by automated reasoning tools.
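The comparison problem can be illustrated with a small sketch (the vocabulary and synonym mappings here are invented for illustration): two annotators describing the same content with local terminology defeat a naive equality test, while mapping terms onto a shared controlled vocabulary, in the spirit of an application profile, restores comparability.

```python
# Free-form keywords from two annotators describing the same content.
annotator_a = ["AI", "prog. languages", "html"]
annotator_b = ["artificial intelligence", "programming languages", "HTML"]

# A naive comparison of free-form terms finds no overlap at all.
naive_overlap = set(annotator_a) & set(annotator_b)

# A controlled vocabulary (hypothetical, in the spirit of CanCore-style
# application profiles) maps local terms onto canonical concepts.
vocabulary = {
    "ai": "artificial intelligence",
    "artificial intelligence": "artificial intelligence",
    "prog. languages": "programming languages",
    "programming languages": "programming languages",
    "html": "hypertext markup language",
}

def normalize(terms):
    # Map each term onto its canonical concept; pass unknown terms through.
    return {vocabulary.get(t.lower(), t.lower()) for t in terms}

controlled_overlap = normalize(annotator_a) & normalize(annotator_b)
```

The maintenance burden, of course, moves into the vocabulary itself, which is exactly where annotators in our experience tended to diverge.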
In some of our earlier work [10] we had to modify the lifecycle element in the LOM to provide unambiguous versioning semantics similar to those available in software configuration management systems. This resulted in an immediate trade-off: in order to support functionality such
as calculating the differences between versions of (and thus observing the evolution of) learning content, the lifecycle element needed to include information about how the content had changed structurally and semantically. This metadata turned out to be fairly verbose and decidedly unreadable by humans, but was easy to manipulate to show both renderings of a particular version of content and visualizations of how content had changed over time.

Both of these experiences led us to conclude that the purpose and audience of metadata need to be considered in detail before designing schemas to handle such metadata. Overly broad metadata definitions make achieving specific purposes difficult, especially if the goal is to handle that purpose computationally instead of by human actors. Once the question of purpose and audience has been determined, ensuring the reliability of the form of metadata in deployed systems is key to achieving a given purpose. Unfortunately, most learning object metadata tools have limited validation abilities built in, a practice that has not changed significantly in the five years since the LOM was finalized. For instance, while LOM contributors can have a variety of different roles (e.g. author, publisher, instructional designer, etc.), they are represented using the vCard syntax, which was (again, broadly) “intended to be used for exchanging information about people and resources” [11]. In a study using five different cross-cultural learning object repositories, Friesen [12] was unable to find a single conformant piece of vCard information in more than 3,000 metadata instances. In response, there has been some research in how to automatically create metadata directly from learning object content (e.g. [13], [14]).
In the experiments run by [13], only a small set of fields (less than 25%) could be data mined from learning object content, and even then many of the automatically generated fields disagreed with the values set by content experts. We further explored this issue of the reliability of e-learning metadata through a collaborative tagging study. We used the Automatic Metadata Extractor application [14] to data mine learning content (a single page of HTML) and create a list of keywords that describes it. We also surveyed students (n=200) and instructors (n=2) to see what keywords they would associate with the content. After normalizing the results, we observed that human keywords differed in a couple of ways from those that were automatically data mined. Firstly, human annotators often used "tag phrases" instead of single words, where a set of words together describes the content (e.g. case based reasoning). Student keywords also had a much lower signal-to-noise ratio (sometimes being made up of seemingly random content), while instructor keywords were often based on the high-level concepts being described, and data mining keywords were based on specific occurrences of text in the web page. Surprisingly, inter-rater reliability was low between the two instructors, between subjects in the student group, and between students and instructors. Figure 2 explores the results of our analysis, and shows that while a few keywords showed relatively high agreement between students, only a few keywords were agreed upon by both of the instructors. Interestingly, many of the keywords that were automatically generated by the metadata extractor application were in either the student or the instructor keyword set.
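One simple way to quantify this kind of agreement is set overlap between keyword sets. The sketch below uses the Jaccard coefficient with made-up keyword sets; our actual analysis used the full survey data summarized in Figure 2.

```python
def jaccard(a, b):
    """Set-overlap agreement between two keyword sets (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b)

# Hypothetical normalized keyword sets, for illustration only.
instructor_1 = {"case based reasoning", "knowledge representation", "planning"}
instructor_2 = {"machine learning", "planning", "search"}
extractor    = {"planning", "algorithm", "figure"}

# Agreement between the two human experts...
agreement_instructors = jaccard(instructor_1, instructor_2)

# ...and between the pooled expert keywords and the data-mined ones.
agreement_with_mined = jaccard(instructor_1 | instructor_2, extractor)
```

Even in this toy example, two experts describing the same content agree on only one keyword in five, which mirrors the low inter-rater reliability we observed.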
Figure 2. Comparison of student keywords, subject matter expert keywords, and Automated Metadata Extractor keywords from [15].
This experience, along with the observations of Friesen and others, has demonstrated to us that agreement amongst metadata values is likely to be low in some circumstances. Disagreement can happen for a number of reasons, some of which are tolerable (e.g. a difference of opinion) while some are not (e.g. an incorrectly formatted data field; in this case the semantics of similar tags should be used instead). Collaborative tagging and related Web 2.0 technologies encourage diverse metadata values. Similar to other collaborative tagging studies, Figure 2 demonstrates a power curve when looking at the uniqueness of tags being generated by a larger population (in this case, the student population). When designing tools to make use of metadata collected in this manner, effort should focus on methods and interfaces for using a diversity of results instead of trying to distill the metadata into an authoritative set of “true” pieces of metadata. This runs counter to Semantic Web approaches, where software relies on first-order logic being present in the metadata schemas themselves.
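The power curve referred to above can be reproduced in miniature (the tags below are invented): counting how many annotators used each tag typically yields a few high-agreement tags and a long tail of tags used only once.

```python
from collections import Counter

# Tags assigned by a population of annotators (invented for illustration):
# a few popular tags and a long tail of one-off, idiosyncratic ones.
student_tags = (["case based reasoning"] * 12 + ["expert systems"] * 7 +
                ["ai"] * 5 + ["rules"] * 2 +
                ["lecture notes", "week 3", "midterm", "boring"])

# Frequency table sorted from most to least common.
frequencies = Counter(student_tags).most_common()

head = [tag for tag, n in frequencies if n >= 5]   # high-agreement tags
tail = [tag for tag, n in frequencies if n == 1]   # the long unique tail
```

Tools that keep and surface the tail (rather than discarding it in favour of an authoritative set) are what the discussion above argues for.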
3. End User Created Explicit Metadata

As discussed, standardized and comprehensive metadata schemas sometimes lack applicability to actual end use. At the same time, our own experiments have shown that there can be a sizeable disparity between how learners view learning content as compared to instructors (experts) or automatic methods. In contrast to traditional metadata and modelling approaches, the active learner model paradigm suggests that:
“… the emphasis [should be] on the modelling process rather than the global description. In this re-formulation there is no one single learner model in the traditional sense, but a virtual infinity of potential models, computed ‘just in time’ about one or more individuals by a particular computational agent to the breadth and depth needed for a specific purpose. Learner models are thus fragmented, relativized, local, and often shallow. Moreover, social aspects of the learner are perhaps as important as content knowledge.” [23]
These contrasting views, along with the movement towards social web applications, have led us to investigate the applicability of metadata explicitly created by end users. Our initial approach for learner-created metadata was done through the CommonFolks application, which allowed users to create “semantically rich” tags to label and organize learning content. Unlike traditional metadata schemes, socially created tag sets have the desirable property that they allow for many points of view to be represented (in the different tag labels) as well as encouraging consensus (as demonstrated in our tagging experiment, Figure 2). Tagging can be used by end-users as both a tool for organization and a tool for reflection based on how learning content is labeled. We had the further goal of being able to reason over the created tag metadata, but collaborative tags provided several obstacles to the reasoning process. The first obstacle is that collaborative tags typically do not provide a predicate relationship to identify how labels are related to the content they label. For instance, the tag “Christopher Brooks” might be used to identify an individual and relate that individual to a piece of content. But the details of the relationship are lost; is that person an author of the content, the publisher of the content, or a student who has used the resource previously? Second, tag labels are often ambiguous, so determining the exact meaning of a tag may be difficult. For example, consider a learning resource labeled simply with “python”. The content could equally be an introduction to the programming language, or an article describing a group of snakes. Extending this example, and assuming the resource is about snakes, without a “semantic” relationship the resource could be a picture of a snake (predicate: “picture of”) or an article describing something related to snakes (predicate: “about”).
Based on these two main constraints on the typical collaborative tagging process, we produced a prototype called CommonFolks that would allow the creation of semantic tags. Learners create bookmarks based on resource URLs in a central repository where they can be unambiguously described. Disambiguation happens through the use of base annotations on selected concepts from an extendable database (based on WordNet), and includes predicate relationships for typical tag labels. In a typical tagging system, a user might describe an HTML tutorial with the tags “tutorial”, “html”, and “intermediate”. In CommonFolks the user would first describe some pedagogical information about the resource (e.g. it is a “tutorial”, see Figure 3). Next, semantic tags would be added such as “[has] topic: hypertext markup language”, with each part of this description being disambiguated using the concept database described previously. Previous experiences with annotating content suggest that subject matter experts want to use local terms, so we included provisions that would allow the concept database to evolve through end-user additions. New concepts would need to be first provided with a definition, however, and the relationship between the new concept and existing WordNet concepts would need to be defined.
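The difference between a flat tag and a CommonFolks-style semantic tag can be sketched as follows. The predicate names and the tiny concept database are invented for illustration (CommonFolks drew its concepts from WordNet), but the structure shows what disambiguation buys: a machine can tell the snake from the programming language.

```python
# A flat collaborative tag: just a label, with no predicate and no sense.
flat_tag = ("http://example.org/snake-article", "python")

# A toy concept database in the spirit of WordNet synsets, with one entry
# per sense of the ambiguous label "python".
concepts = {
    "python#1": "mythological serpent slain by Apollo at Delphi",
    "python#2": "large nonvenomous constricting snake",
    "python#3": "general-purpose programming language",
}

# A semantic tag adds a predicate and a disambiguated concept sense.
semantic_tag = {
    "resource": "http://example.org/snake-article",
    "predicate": "about",       # vs. "picture of", "[has] topic", ...
    "concept": "python#2",      # the snake sense, not the language
}

# A reasoner can now recover an unambiguous meaning for the tag.
meaning = concepts[semantic_tag["concept"]]
```

The flat tag carries none of this information, which is precisely why reasoning over conventional tag sets proved so difficult.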
Figure 3. A resource that has been added and tagged in CommonFolks.
The motivation for a student (or end-user) to use CommonFolks is the same as with typical collaborative tagging: to enable a simple scheme for personal organization. The advantage of CommonFolks over typical tagging is that it would enable improved browsing and searching facilities based on semantically created tags. However, CommonFolks revealed one significant issue with such an approach, which was discovered after testing with end-users: compared to other collaborative tagging systems, we were introducing substantially more overhead in tag creation. In other systems, tags are typically created by simply typing the first words that come to mind without much regard for the semantic relationships these tags encode, whereas CommonFolks requires more in terms of the time and effort for creating each tag. The resistance to overhead in semantic tagging was also seen with Fuzzzy, created by Roy Lachica [22]. Fuzzzy is different from CommonFolks in that it does not require users to disambiguate tags from the outset; rather, users can explicitly relate tags to provide semantics whenever they wish. For example, a user can say the tag “chapter” is “part of” the tag “book”. The motivation for the user to engage in this semantic authoring task of relating tags is to get more specific search results, since tags can be explicitly disambiguated and related. However, Lachica found, after a community of users had formed around the website, that very few people employed the semantic authoring capabilities (< 1% of the community engaged in this type of authoring) [22]. We suspect this is because the motivation to engage in defining tags semantically is not strong enough to overcome the effort required. The problems with both the CommonFolks and Fuzzzy approaches are akin to well-known problems that have been found in structured messaging systems [32].
Structured messaging systems, like CommonFolks, enforce a sort of global structuring upon their users in an effort to more effectively manage collections of information. Such systems have been investigated as an alternative to email (which is largely unstructured). However, in this context they have never caught on – users seem to opt for evolving their own structure. Another example is that of wikis. In Wikipedia, articles about cities contain structured templates, including standard information such as country and population. These templates were not created by Wikipedia programmers; instead, they have evolved over time through user-created conventions in an ad hoc manner, and such templates now exist for many different categories of information. In this light, it may be
that CommonFolks is still too structured (requiring too much overhead) for its potential users. We may also draw further parallels between our experiences and the differences between Semantic Web and Web 2.0 technologies more generally, where the former in a way imposes structure while the latter does not (yet structure may evolve when needed). We anticipate that the CommonFolks or Fuzzzy approaches may be improvements for highly motivated annotators (such as content developers or instructional designers) in terms of usability, while maintaining ample expressivity for purposes of reasoning. However, the results of our studies suggested that the average learner is unlikely to engage in this new type of organizational method unless it is required by an instructor for some pedagogical purpose.

Similar approaches for authoring semantic data have been more widespread with semantic wikis, such as OntoWiki [17]. These systems extend wikis with the ability to reference and link data, based on an ontology. However, given our experiences we worry that these too will be short-lived; again, the effort required of casual authors is too high. Other more practical approaches are underway which pull data from social software sources and represent them directly in ontological forms. Most notably, dbpedia (dbpedia.org) [18] is a project that collects data from Wikipedia and keeps it in a semantic database. It is able to do this by directly scraping data from the consistently structured data templates that are created for certain types of articles (as described above). Once this data is aggregated and semantically represented in dbpedia, sophisticated queries are possible, such as “List all 19th century poets from England.” This approach seems to be successful because it is able to leverage the existing structure in an abundant source of human-created metadata, rather than imposing a certain style of metadata creation upon its authors.
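The scraping idea can be sketched as follows. The template text, field names, and query logic are our own simplified stand-ins: dbpedia itself extracts real Wikipedia infoboxes into RDF and answers such questions with structured query languages such as SPARQL, but the principle of harvesting triples from evolved, user-created structure is the same.

```python
# Simplified infobox-style templates, as they might appear in article source.
articles = {
    "Robert Browning": "{{Infobox person | occupation = poet | birth_year = 1812 | country = England}}",
    "Emily Dickinson": "{{Infobox person | occupation = poet | birth_year = 1830 | country = United States}}",
    "Ada Lovelace":    "{{Infobox person | occupation = mathematician | birth_year = 1815 | country = England}}",
}

def extract_triples(name, template):
    """Scrape key = value fields into (subject, predicate, object) triples."""
    fields = template.strip("{}").split("|")[1:]   # drop the template name
    for field in fields:
        key, _, value = field.partition("=")
        yield (name, key.strip(), value.strip())

triples = [t for name, tpl in articles.items() for t in extract_triples(name, tpl)]

# "List all 19th century poets from England" as a query over the triples.
def matches(name):
    facts = {p: o for s, p, o in triples if s == name}
    return (facts.get("occupation") == "poet"
            and facts.get("country") == "England"
            and 1801 <= int(facts.get("birth_year", 0)) <= 1900)

poets = [name for name in articles if matches(name)]
```

No author was ever asked to fill in a "19th century poet" field; the answer falls out of structure that the community evolved for its own purposes.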
Based on our experiences with CommonFolks, we began to focus on providing ample motivation for use by learners when organizing and sharing ideas about learning content. The Open Annotation and Tagging System (OATS) provides a set of tools inspired by related systems that have emerged from work in web annotations, social navigation, web service architectures, and e-learning, and is an extension of the work that had been started by others with the AnnotatEd system [21]. The principal goal of OATS is to study the benefits and problems associated with social annotation tools for learners and instructors in online education environments. We aim to provide tools to effectively organize and navigate learning content. Learners’ personal organization and reminders are made available both to themselves and to other users of the system. Part of our goal for OATS was to create a system that would allow us to post-process these explicitly created annotations, and to assess their applicability as a learner-created source of metadata. We stress an important distinction here between explicit annotations (those annotations created by a student for a particular purpose) and implicit annotations (those annotations resulting directly from usage or that are inferred by the system from usage).

The metaphor behind OATS is similar to highlighting text in a traditional book. Creating a highlight in OATS is achieved by a typical click-and-drag selection, and results in a request to create an explicit annotation. OATS automatically changes the selection to appear as a highlighted piece of text that will reappear when the learner revisits the page.
Figure 4. Highlight annotations created by OATS within iHelp Courses (highlighted text has a slightly darker background – appears yellow in the system).
We selected highlighting as the basis of annotations because it was the simplest explicit interaction available that could provide some benefit for learners. By selecting a piece of text, the learner is essentially reminding themselves that they found something of interest in the passage. We extended the highlighting metaphor to the group by displaying highlights based on a histogram. Both personal and group highlights can be turned on and off by the learner at any time (Figure 5).
Figure 5. OATS displaying the highlights of all learners, through a highlight histogram (originally pink, shown as a dark grey background of differing heights). Also displayed are the individual learner's own highlights (originally yellow, here the light grey background). For example the text “observed in MUDs and massively multiplayer online role-playing games”, has been highlighted by the user (light grey), and by other users (dark grey highlights of differing heights).
Group highlights allow users to get a quick view of what their peers thought were the most important passages. The group highlights are discretized into three levels showing the strength of interest in a particular part of the text; the higher the pink highlight, the more interesting a particular part of the text is to all learners. OATS also allows tags (keyword-based) and notes (longer free-text) to be added on a per-highlight basis. Any highlight, whether personal or group-based, may be selected. Selecting a highlight displays a popup that shows all highlight-associated tags and notes that have been added by the viewing learner and their peers (see Figure 6).
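The group histogram can be sketched as follows. This is a simplified model of our own (OATS operates on selections within rendered HTML rather than raw character offsets): each learner's highlight is a character range, per-character counts are accumulated, and the counts are discretized into the three display levels described above.

```python
# Each highlight is a (start, end) character range within the page text.
highlights = {
    "alice": [(10, 40)],
    "bob":   [(20, 50)],
    "carol": [(25, 35), (60, 70)],
}

# Accumulate how many learners highlighted each character position.
page_length = 80
counts = [0] * page_length
for ranges in highlights.values():
    for start, end in ranges:
        for i in range(start, end):
            counts[i] += 1

def level(count, max_count):
    """Discretize a raw count into the 3 display levels (0 = no highlight)."""
    if count == 0:
        return 0
    return -(-3 * count // max_count)   # ceil(3*count/max_count), in 1..3

max_count = max(counts)
levels = [level(c, max_count) for c in counts]
```

Rendering then only needs the `levels` array: level 0 is unhighlighted text, and levels 1 to 3 map to increasingly tall (or darker) group-highlight bands.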
Figure 6. The annotation interface, which is presented to a learner after clicking on a highlighted piece of text.
We performed a trial to assess the use of OATS as part of a senior undergraduate face-to-face class on ethics and computer science. This course discussed the implications and impacts of information technology on society. Part of student assessment was based upon reading a number of articles each week and discussing them within the iHelp Discussion forum. For two weeks during the course, students used OATS instead of the discussion forum to create annotations within the documents, and they used the notes facility to discuss points of interest. Students were also reminded that OATS would provide the added benefit of allowing them to organize their readings for review before the final exam. We captured student interactions with OATS and also provided a follow-up usability questionnaire after the end of the two-week period (we refer the interested reader to [20] for a more complete description of the results of this study).

Before the study we hypothesized that students would enjoy the time-savings of using a single system where they did not need to switch application contexts to read content and make comments. Further, because these notes and highlights could be organized using tags, we expected that students would find tags a useful means to organize discussion-based information and would use them when studying for the course. We anticipated that tag usage would be widespread and consistent enough to encourage future assessment as an appropriate source for learner-created metadata.

Overall, we found that the students found the system easy to use. The students liked being able to highlight text, and one even described it as being “enjoyable”. An even more popular feature was viewing group highlights, as students found it valuable to see how others were annotating content. One student noted that, “if other classmates highlighted something it helped me realize it's significance especially if i [sic] overlooked it.” Further, students found that highlights afforded targeted discussion. One student noted that, “… the iHelp [Discussion forum] lends itself to a broader array of unrelated topics, whereas OATS tends to result in more focused discussion on a narrower set of topics. One could see this as good or bad I suppose, depending on the circumstance.” We hypothesize that this view was a result of comments being made within the context of a specific passage of text; comments must be focused to make sense in terms of a passage. Students did find that the existing interface constrained their ability to find and read discussions. We believe this shortfall could be largely overcome with a redesign of the system, and is not surprising, as the current interface was designed with note-taking in mind rather than use as a discussion forum.

With regards to tagging behaviour, we hoped that the success of the other parts of OATS would lead users to tag abundantly and consistently. However, studying the usage data of the students, we found most learners engaged in “tire kicking” behaviour; it seemed learners tried tagging but quickly abandoned it. The most prolific tagger of all students (who accounted for over 40% of tags created) commented, “I wasn't able to utilize the tag function as well as I wanted. I found myself adding a lot of tags, as I'm experienced with tagging, but very rarely searching for tags. Although, this may reflect upon the nature of the course; I think other more technical courses could offer a lot more of a benefit to tagging.” This being said, we also got the impression from the comments of other students that the motivation for tagging was largely absent, despite the potential benefits of helping to organize themselves for the final exam. Our findings also suggested that for tagging to be applicable and widespread, it needs to be persistent and span some length of time, where recalling individual documents and passages without an organization scheme would suffer. For example, being able to apply tags over an entire course or over several academic years may help provide ample motivation for their use. In the case of our institution, most courses are offered in a face-to-face situation and online content is sporadically used and largely supplemental.
Our experiences in developing and using both OATS and CommonFolks have yielded mixed results. We have identified several new techniques that allow users to interact with learning content and create metadata (some ontological in nature) in a usable and straightforward manner. We can also draw several important lessons from these experiences. We put OATS into a situation where the only constraints were those that would be imposed by actual pedagogical use, and this allowed us to discover interesting implications of how learners interact with social software. For instance, we did not anticipate the finding that highlighted text would help focus discussion. We also did not know if or how students would use tags and, while we found tagging usage to tail off quickly, the insights provided by students could not have been garnered had we set a task that required them to tag. In so doing, we got a more accurate picture of how explicit annotations would be used by students. We feel that explicit annotations do have their place, and we have found some evidence that suggests students find them a valuable addition when they fit into the pedagogy of a class. However, particular types of explicit annotations may require too much effort if there is not ample motivation (whether for self-organization or interaction with peers). Based on these findings, we are also working to evaluate how implicit annotations of learning content can be used in coordination with explicit annotations to provide a rich view of how learning content is used, and perhaps to better characterize it.
4. Data Mining and Implicit Metadata

More recently we have begun to explore how to extract information that is already implicitly available within the web pages making up the content of an e-learning system. The possibilities include understanding some aspects of the content through text-mining of natural language pages and/or image processing of video segments, as well as finding patterns in how users (learners and teachers) interact with the content. These approaches have promise because they avoid the problem of motivating humans to attach appropriate metadata and the inconsistency that is often inherent when more than one person attaches metadata. These approaches are hard, however, because the problem of extracting appropriate information is extremely difficult (AI-complete in the general case) and often computationally intractable. Nevertheless, the e-learning domain has enough structure (tasks are known, users have knowable profiles, evaluation is often explicit) that such data mining approaches have considerable potential.

One of the specific content domains we have been looking at consists of recordings of face-to-face lectures. Large, static online video of lectures is often underwhelming and uninspiring for learners with just-in-time learning goals. Yet the success of video portal sites on the web (e.g. YouTube) suggests that learners would be interested in these kinds of materials if they could be contextualized and partitioned into appropriately sized segments. With this in mind, we have begun to mine the structure and content of video lectures. The simplest of the approaches we are taking is to apply optical character recognition (OCR) to screen captures of digitally projected slides in order to associate keywords appearing on the slides with those in the instructor's syllabus. While this research is ongoing, we have identified specific issues with text analysis of screenshots versus textual scans.
In particular, the extensive use of both non-traditional layout mechanisms and animations or graphics makes it difficult to determine when screen captures should be made. As it is impractical to apply OCR to each frame of a video, we have begun to focus our efforts on segmenting a video into stable "scenes", where each scene represents a small piece of content (roughly a single PowerPoint slide, although our approaches are not specific to PowerPoint). Our current work [2] has shown that there is a high degree of consensus amongst human raters on what constitutes a scene of a video for traditional PowerPoint lectures. We investigated how learners would sub-chapter video, and found that the majority of subjects do not create chapters based on the concepts within the video, but instead use characteristics of the stability of image content in the video to create appropriate chapters. Using a variety of image recognition techniques, we were able to create accurate decision tree rules for the automatic sub-chaptering of lecture content. While we have not yet deployed this system widely, we hypothesize that it will lead to more searchable video, as well as make automatic semantic annotation of video more viable. Further, the decomposition of video into sub-chapters provides a basis for associating usage metadata with video content.

In 2004 McCalla [24] proposed the ecological approach to e-learning, which essentially suggests that large-scale mining of log file data tracking learner interactions with an e-learning system can be used to find patterns that allow the e-learning system to adaptively support learners (or teachers). Since the learner's goals at a particular time can be known, it was hypothesized that it may be possible to mine for patterns of particular use in helping the learner achieve these goals.
In this sense, learner modelling is active and fragmented, reacting to the specific goals and needs of a particular learner at a particular time, rather than being a process aimed at maintaining a consistent and coherent learner model of that learner. Of course, if there is a consistent and coherent learner model, then the data mining algorithms should be able to take advantage of the structure inherent in the model to help them find and interpret the patterns. The ecological approach thus holds out the tantalizing promise of removing the need for creating explicit metadata. We elaborate on this further and suggest that instead of a structure of fields and tags, metadata is "...the process of reasoning over observed interactions of users with a learning object for a particular purpose" [8].

As with many of our other research projects, our experiments with trying to flesh out the ecological approach have been carried out in the context of the iHelp system, where many years' worth of student interaction data have been collected. Our early attempts to find patterns in this data foundered on its extremely fine-grained nature: clickstream-level data is simply too low level to be very useful. Thus, our first step was to transform the data into slightly coarser-grained abstractions describing more pedagogically relevant steps. Data mining algorithms still found too many patterns, most of them irrelevant for e-learning purposes. This led to concurrent top-down efforts to identify pedagogically useful metrics that can be calculated from this slightly interpreted low-level data. These metrics can then be computed in real time and used by an e-learning system as it interacts with learners. Another project underway is aimed at the group level: finding differences in behaviour between different groups of learners, for example learners in an on-line section of a course vs. learners in a classroom section of the same course (both of whom can interact with much e-content).
A technique that has not previously been used in educational data mining, a version of contrast set attribute-oriented generalization [33], is being used to find patterns that might be useful in distinguishing the behaviours of two sets of learners, for example the differences between on-line learners and in-class learners in the same course. It is still too early to be sure that these data mining approaches will be effective, but we are becoming more confident that they will be useful, at least in specific niches. This is echoed by other successful educational data mining research, for example work that detects students who are trying to game an e-learning system [25].
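To give a flavour of what such group-level contrast mining looks for, here is a deliberately simplified single-attribute sketch. The actual contrast set attribute-oriented generalization technique [33] additionally tests statistical significance and generalizes over attribute hierarchies; all names, records and thresholds below are our own illustration.

```python
from collections import Counter

def contrast_sets(group_a, group_b, min_diff=0.2):
    """Find attribute=value pairs whose relative frequency differs between
    two groups of learner records by at least min_diff.

    Each record is a dict of categorical attributes. A full contrast-set
    miner would also test significance and explore attribute conjunctions;
    this sketch compares single attribute-value pairs only.
    """
    def support(group):
        counts = Counter()
        for record in group:
            for attr, val in record.items():
                counts[(attr, val)] += 1
        return {k: v / len(group) for k, v in counts.items()}

    sup_a, sup_b = support(group_a), support(group_b)
    deviations = []
    for key in set(sup_a) | set(sup_b):
        diff = sup_a.get(key, 0.0) - sup_b.get(key, 0.0)
        if abs(diff) >= min_diff:
            deviations.append((key, round(diff, 2)))
    return sorted(deviations, key=lambda d: abs(d[1]), reverse=True)

# Hypothetical learner records from two course sections:
online = [{"quiz_attempts": "many"}, {"quiz_attempts": "many"}, {"quiz_attempts": "few"}]
in_class = [{"quiz_attempts": "few"}, {"quiz_attempts": "few"}, {"quiz_attempts": "many"}]
patterns = contrast_sets(online, in_class)
```

On this toy data the miner reports that frequent quiz attempts are over-represented in the online section and under-represented in the in-class section, which is exactly the kind of behavioural contrast described above.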
5. Conclusion

This brief survey of our experiences with social semantic web applications in e-learning documents a number of projects that were initiated and taken to completion of their research objectives, but did not manage to live on in deployed systems. While it is clear that adding intelligent inferencing to e-learning systems is non-trivial and requires a well-developed semantic framework, the effort required of learners to explicitly create semantically useful metadata is too great. Further, the laborious tuning and scripting that experts would need to do in sorting through semi-automatically generated metadata is also too demanding. Attempts to gather metadata and aggregate it appropriately seem to hold promise. Besides learning that semantic web technologies for e-learning metadata are not going to be a simple win, we do wish to focus on the successes we have had and look forward to extending and simplifying those technologies that are most promising. Of
these, OATS-style annotations are promising because they leverage the motivation of learners to make notes for themselves as they engage in learning activities. MUMS technology holds promise once a sufficient critical mass of raw data has been gathered so that statistical and data mining algorithms might be able to recognize patterns of learner behaviour that correspond to important learning events. We hope that our video-mining and usage-mining experiments may yet allow us to understand what kinds of algorithms work to inform e-learning systems in different situations for different pedagogical purposes.

In order to move to the next level of metadata for learning resources, it may be advisable to maintain more complex and less human-interpretable metadata. In the learner modeling community there is a movement toward open learner models – making learner models inspectable by learners and other human agents. There is a benefit for a learner in being able to interpret what the learning environment "knows" about him or her. This requires a translation from an internal model to a presentation-layer version that is suitable for human consumption. With learning object metadata, we may need to move in the other direction – from metadata tags that are human readable, through a translation process, into a standardized internal knowledge representation that may be quite opaque to learners or teachers but semantically clear to intelligent agents.

There are three main general lessons that we can draw from our research. The first is that there is no substitute for constantly trying to test techniques in the real world of student learning. Often, a technique that is good in theory simply fails to scale to actual learning situations, and apparently good ideas turn out to be unworkable in practice. The second lesson is that it is very useful to have a large amount of data tracking student behaviour, collected over many years, that can be explored for pedagogical patterns.
Our commitment to the iHelp system over the past decade has provided us with both the real-world situations required by the first lesson and the data required by the second. Indeed, thousands of undergraduate students from a variety of disciplines use iHelp every academic term, creating a significant source of interesting user modelling data. Our third lesson, more speculative, derives from the first two: perhaps the most promising approach to empowering the semantic web for e-learning applications is to find ways to exploit the information that is implicit in the content of web pages and the way they are used. The shift is from adding extra metadata to a web page to leveraging the information that is already there. There is actually quite a lot of information available: text, images, and video on the pages; various kinds of feedback from quizzes, problems, and activities undertaken by learners; links connecting pages to other pages; fine-grained keystroke-level tracking data of learner interactions with the web pages and with each other; specialized structural information inherent in some kinds of e-learning material (for example, discussion threads); etc. Making sense of all this data is difficult, but there is huge potential, especially in e-learning domains where there is more structure and a better chance of knowing users and their goals.
Acknowledgements

This work has been conducted with support from funds provided by the Natural Sciences and Engineering Research Council of Canada (NSERC) to Greer and McCalla for their "discovery research" and to the Canadian Learning Object Repositories Research Network (LORNET). Special thanks to the dozens of graduate students,
undergraduate students, technical support staff and research assistants who over the years have helped to develop, deploy, and evaluate these projects.
References

[1] C. Brooks, L. Kettel, C. Hansen, Building a Learning Object Content Management System, World Conference on E-Learning in Corporate, Healthcare, & Higher Education (E-Learn 2005), Vancouver, Canada, October 24-28, 2005.
[2] C. Brooks, K. Amundson, J. Greer, Detecting Significant Events in Lecture Video using Supervised Machine Learning, International Conference on Artificial Intelligence in Education (AIED 2009), Brighton, England, July 6-10, 2009.
[3] S. Bateman, R. Farzan, P. Brusilovsky, G. McCalla, OATS: The Open Annotation and Tagging System, Proceedings of the Third Annual International Scientific Conference of the Learning Object Repository Research Network, Montreal, November 8-10, 2006.
[4] C. Brooks, M. Winter, J. Greer, G. McCalla, The Massive User Modelling System, 7th International Conference on Intelligent Tutoring Systems (ITS 2004), Maceio, Brazil, Aug. 30 - Sep. 4, 2004.
[5] J. Greer, G. McCalla, J. Vassileva, R. Deters, S. Bull, L. Kettel, Lessons Learned in Deploying a Multi-Agent Learning Support System: The I-Help Experience, Proceedings of AI in Education (AIED 2001), San Antonio, IOS Press, Amsterdam, 2001, 410-421.
[6] E. Gamma, R. Helm, R. Johnson, J. Vlissides (eds.), Design Patterns, 1st edition, Addison-Wesley, 1995.
[7] IEEE, Inc., IEEE P1484.12.1-2002, Draft Standard for Learning Object Metadata, 2002.
[8] C. Brooks, G. McCalla, Towards flexible learning object metadata, International Journal of Continuing Engineering and Lifelong Learning 16(1/2) (2006), 50-63.
[9] CanCore Initiative, CanCore Learning Object Metadata: Metadata Guidelines, Version 1.1, 2002.
[10] C. Brooks, Supporting Learning Object Versioning, Master's thesis, Department of Computer Science, University of Saskatchewan, 2005.
[11] Versit Consortium, vCard: The Electronic Business Card, Version 2.1, September 18, 1996.
[12] N. Friesen, Final Report on the "International LOM Survey", Technical Report Document 36C087, Canadian Advisory Committee for ISO/IEC JTC1/SC36, 2004.
[13] K. Cardinaels, M. Meire, E. Duval, Automating metadata generation: the simple indexing interface, The 14th International World Wide Web Conference (WWW 2005), International World Wide Web Conference Committee (IW3C2), 2005.
[14] K. Hammouda, M. Kamel, Automatic Metadata Extractor, http://pami.uwaterloo.ca/projects/lornet/software/ame.php (accessed January 30, 2007).
[15] S. Bateman, C. Brooks, G. McCalla, P. Brusilovsky, Applying Collaborative Tagging to E-Learning, Proceedings of the Workshop on Tagging and Metadata for Social Information Organization, held in conjunction with the 16th International World Wide Web Conference, Banff, Canada, May 7, 2007.
[16] A.R. Doherty, D. Byrne, A.F. Smeaton, G.J. Jones, M. Hughes, Investigating Keyframe Selection Methods in the Novel Domain of Passively Captured Visual Lifelogs, ACM International Conference on Image and Video Retrieval (CIVR 2008), Niagara Falls, Canada, July 7-9, 2008.
[17] S. Auer, S. Dietzold, T. Riechert, A Tool for Social, Semantic Collaboration, Proceedings of the 5th International Semantic Web Conference, Athens, GA, USA, November 5-9, 2006, 736-749.
[18] S. Auer, C. Bizer, J. Lehmann, G. Kobilarov, R. Cyganiak, Z. Ives, DBpedia: A Nucleus for a Web of Open Data, Proceedings of the 6th International Semantic Web Conference, Busan, Korea, November 11-15, 2007, 722-735.
[19] S. Bateman, C. Brooks, G. McCalla, Collaborative Tagging Approaches for Ontological Metadata in Adaptive E-Learning Systems, Proceedings of the Fourth International Workshop on Applications of Semantic Web Technologies for E-Learning (SW-EL 2006), in conjunction with the 2006 International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems (AH 2006), Dublin, Ireland, June 20, 2006, 3-12.
[20] S. Bateman, Collaborative Tagging: folksonomy, metadata, visualization, e-learning, Master's thesis, Department of Computer Science, University of Saskatchewan, 2007.
[21] R. Farzan, P. Brusilovsky, AnnotatEd: A Social Navigation and Annotation Service for Web-based Educational Resources, Proceedings of E-Learn 2006 – World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, 2006.
[22] R. Lachica, Towards holistic knowledge creations and interchange Part 1: Socio-semantic collaborative tagging, Talk presented at the Third International Conference on Topic Maps Research and Applications, Leipzig, Germany, October 11-12, 2007. Slides available at http://www.informatik.uni-leipzig.de/~tmra/2007/slides/lachica_TMRA2007.pdf.
[23] G. McCalla, J. Vassileva, J. Greer, S. Bull, Active Learner Modelling, 5th International Conference on Intelligent Tutoring Systems (ITS 2000), Montreal, Canada, June 2000.
[24] G. McCalla, The Ecological Approach to the Design of E-Learning Environments: Purpose-based Capture and Use of Information About Learners, Journal of Interactive Media in Education 7 (2004). Special Issue on the Educational Semantic Web. ISSN 1365-893X [www-jime.open.ac.uk/2004/7].
[25] R.S. Baker, A. Corbett, K. Koedinger, I. Roll, Detecting When Students Game The System, Across Tutor Subjects and Classroom Cohorts, Proceedings of the Conference on User Modeling, 2005, 220-224.
[26] Various Authors, SCORM 2004 4th Edition, Documentation Suite, Advanced Distributed Learning, 2009. Available online at http://www.adlnet.gov/Technologies/scorm/SCORMSDocuments/2004 4th Edition/
[27] A. Seaborne, RDQL – A Query Language for RDF, W3C Member Submission, January 9, 2004. Available online at http://www.w3.org/Submission/2004/SUBM-RDQL-20040109/
[28] D. Beckett, RDF/XML Syntax Specification (Revised), W3C Recommendation, February 10, 2004. Available online at http://www.w3.org/TR/rdf-syntax-grammar/
[29] D. McGuinness, F. van Harmelen, OWL Web Ontology Language Overview, W3C Recommendation, February 10, 2004. Available online at http://www.w3.org/TR/owl-features/
[30] Various Authors, FIPA ACL Message Structure Specification, Document SC00061G, FIPA TC Communication, 2002. Available online at http://www.fipa.org/specs/fipa00061/SC00061G.html
[31] R. Engelmore, T. Morgan (eds.), Blackboard Systems, Addison-Wesley, Menlo Park, CA, 1988.
[32] A.J. Dix, J.E. Finlay, G.D. Abowd, R. Beale, Human-Computer Interaction, Second Edition, Prentice Hall Europe, 1998.
[33] R. Hilderman, T. Peckham, A statistically sound alternative approach to mining contrast sets, Proceedings of the 4th Australian Data Mining Conference (AusDM), Sydney, Australia, December 2005, 157-172.
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-062-9-279
CHAPTER 15
Disburdening Tutors in E-Learning Environments via Web 2.0 Techniques Frank LOLL 1 and Niels PINKWART Clausthal University of Technology, Germany
Abstract. Today, collaborative filtering techniques play a key role in many Web 2.0 applications. Currently, they are mainly used for business purposes such as product recommendation. Collaborative filtering also has potential for usage in "Social Semantic Web" e-learning applications, in that the quality of a student-provided solution can be heuristically determined by peers who review the solution, thus effectively reducing the workload of teachers and tutors. This chapter presents a collaborative filtering algorithm which is specifically adapted to the requirements of e-learning applications. An empirical evaluation of the algorithm showed that the results of the collaborative filtering were more accurate than the self-assessments of the participants, and that four peer evaluations were generally already enough to reach a satisfying accuracy. Based on these results, we developed a web-based e-learning system (CITUC), which was successfully used in a university course in summer 2008. This chapter describes an evaluation of CITUC based on surveys, interviews and a detailed analysis of the system's usage by students. Our conclusion is that Social Semantic Web applications such as CITUC, which enable learners to review and comment on peer solutions, have high potential as a support for classic academic teaching in larger classes.

Keywords. Peer review, collaborative filtering, educational technology
Introduction

The term "Social Semantic Web" describes an emerging design approach for building and using Semantic Web applications which employs Social Software and Web 2.0 approaches. In Social Semantic Web systems, groups of humans collaboratively build domain knowledge, aided by socio-semantic systems [1]. The collaboration process of the users in Social Semantic Web systems can have multiple purposes – among them are the group-based structuring of a domain (creation of domain ontologies) and the collaborative classification of content (determination of properties of ontology elements). Both of these are potentially valuable in educational settings. While the former can be a technique for collaborative knowledge building through jointly structuring an unknown knowledge domain, including the discussion of domain
Corresponding Author: Frank Loll, Julius-Albert-Str. 4, 38678 Clausthal-Zellerfeld, Germany, [email protected].
concepts and relations, the latter allows for jointly annotating or evaluating learning materials [2] and for heuristically determining the quality of task solutions through a collaborative effort. This chapter presents an example of the latter approach. We present a system which is based on collaborative filtering (CF) algorithms [3]. This family of algorithms, which provides an essential base for the Web 2.0, is characterized by associations between users and system artifacts which are determined by explicit or implicit user actions and which are used to provide system services. Prominent examples of those associations are book orders at amazon.com, the input of user profiles on online dating sites, and the tagging of pictures at flickr.com. All these applications have in common that the saved associations are used to recommend artifacts (books, potential partners, pictures, etc.). Although the calculation details vary between the systems, the underlying principle – the use of user information to assess or recommend artifacts in the system – is the same.

In educational Social Semantic Web systems, CF algorithms have application potential: the quality of a student's task solution can be determined heuristically by assessments of other students (peer reviews) via CF techniques. In this case, the objective of the CF algorithm lies less in the calculation of a potential fit between users and artifacts (as in classical application areas) than in the estimation of a solution's quality. Such an approach is not unproblematic. Typical points of critique concerning a peer review approach in education relate to the students' lack of knowledge and experience in assessing task solutions and to the risks of intentional manipulation [4], [5]. Yet, this approach also has a lot of advantages in practice.
It disburdens teachers and tutors from a lot of assessment tasks, and at the same time it provides the possibility for students to train their evaluating and critiquing skills by assessing other students' solutions. If there are tasks which allow for more than one correct solution, then students have a chance to get to know different acceptable approaches and perceptions and have to compare them, which is beneficial for learning. Also, students can potentially empathize with other learners' problems more easily and sometimes understand the reasons for wrong task solutions better than experts, which can make their reviews more valuable than those of experts [6]. In summary, CF algorithms have potential as a tool in Social Semantic Web e-learning systems: they can allow learners to evaluate and annotate peer solutions and to build a semantic system heuristics based on multiple peer reviews. Literature shows that the resulting, collaboratively built heuristics could even lead to better annotations and evaluations of solutions than one single expert could provide [7], [8]. In this chapter, we describe a CF heuristics which is especially designed for e-learning applications, and the CITUC e-learning system which implements the heuristics in a practical context.
1. Collaborative Filtering in Existing Frameworks

In spite of their potential, CF mechanisms have rarely been used in the e-learning sector until now, and there have only been a few empirical studies about the effectiveness of these methods. One of the few existing systems is the web-based PeerGrader (PG) [9]. The purpose of this tool is to help students improve their skills by blindly reviewing and grading solutions of their fellow students. PG works in the following way: first, the students get a task list and each student chooses a task. Next, the students submit their solutions to the system, where they are read by another student who then provides
feedback in the form of textual comments. After that, the authors modify their solutions based on the comments they have received, and re-submit their modified solutions to the system, where they are reviewed by other students. Then, the solutions' authors grade each review with respect to whether it was helpful or not. Finally, the system calculates grades for all student solutions. One of PG's strengths is to provide students with high-quality feedback also on ill-defined [10] homework tasks that do not have clear-cut gold-standard solutions (such as design problems). This kind of feedback could not be generated automatically. A disadvantage is the time required for the system to work effectively: due to the complexity of the reviewing process and the textual comments, the evaluation of a single student answer is very time consuming. This may cause student drop-outs and deadline problems [9]. Also, studies with PG revealed problems with getting feedback of high quality. An evaluation of subjective usefulness showed that the system was appreciated by its users [9], yet a systematic comparison of PG scores to expert grades has not been conducted.

A newer web-based collaborative filtering system is the Scaffolded Writing and Rewriting in the Discipline (SWoRD) system [8], [11]. SWoRD addresses the problem that in the writing discipline, homework solutions are often long texts, which cannot be reviewed in detail by a teacher for time reasons. Because of this, students often do not receive any detailed feedback on their solutions at all. Having such feedback would be beneficial for students, though, since they could use it to improve their future work. To address this problem, SWoRD relies on Social Semantic Web techniques (peer reviews). An evaluation showed that the participants benefited more from multiple peers' feedback than from a single peer's or a single expert's feedback [8].
A different approach is used by the LARGO system [12], where students create graphs of US Supreme Court oral arguments. Within LARGO, collaborative scoring is employed for a group-based assessment of the quality of "decision rules" students have included in their diagrams. Since this assessment involves the interpretation of legal arguments in textual form, it cannot reasonably be automated and is thus an ideal field for Social Semantic Web techniques. While the overall LARGO system has been tested in law schools and shown to help lower-aptitude learners [12], [13], empirical studies to test the educational effectiveness of the specific collaborative scoring components have not been conducted.

Another area where collaborative filtering has been used in educational technology systems is the recommendation of learning resources. The system Altered Vista (AV) [2] provides a database in which user evaluations of web-based learning resources are stored. Users can browse the reviews of others and can get personalized learning resource recommendations from the system. In contrast to the other systems mentioned before, AV does not aim to support learners directly by giving them feedback on their work. Instead, AV provides indirect learning support by recommending (presumably) suitable learning tools. A survey-based evaluation of AV showed predominantly positive feedback, but also identified issues with the system's incentives and with privacy [2].

In summary, the relatively few educational technology systems with collaborative filtering components all have an underlying algorithm to determine solution quality based on collaborative scoring. Yet, existing systems are often specialized for a particular application area such as legal argumentation (LARGO), writing skills training (SWoRD), or educational resource recommendation (AV), or they involve a rather complicated and long-term review process (SWoRD, PG).
2. Collaborative Filtering Heuristics

Based on the results of the existing e-learning systems reviewed above, we can state that the combination of CF and peer reviews promises benefits for classical learning environments. Yet, the existing systems have limitations in terms of generality and practical applicability. We therefore developed a heuristics which combines some of the features of PG, LARGO and SWoRD. Details of the CF heuristics will be described in the following; the main differences between our heuristics and the existing systems are:

- It is not constrained to a specific task area such as the education of writing skills (SWoRD) or the education of argumentation in law (LARGO).
- There are no time-consuming re-writing phases; there are only short quality assessments on a Likert scale, and no detailed textual reviews.
- In our heuristics, peer assessments have an impact both on the person who grades and on the solution that is graded.
2.1. Algorithm

The CF heuristics consists of two components – a base rating and an evaluation rating – which are finally merged into a quality rating. Figure 1 illustrates the workflow: in the first step, a student works on a task and provides a solution. After that, he assesses a couple of alternative solutions for the same task (three in our lab study). Based on his assessments, the heuristics calculates a first rating, the base rating, which results from the deviation between the alternative solutions' quality ratings and the student's assessments. In the third step, other students assess the new solution, which results in the evaluation rating. Finally, the heuristics calculates the quality rating of the new solution based on the base rating and the evaluation rating. The underlying formulae for the base rating, the evaluation rating and the final quality rating are described in more detail in the next subsections.
Figure 1. Heuristics’ workflow
2.1.1. Base Rating

Based on the assumption that a student who can correctly classify the quality of given alternative solutions is also able to provide a high-quality solution himself, the heuristics first calculates a base rating for a student's solution. Once a student has provided n assessments w_1, ..., w_n for n other students' solutions (which themselves have quality ratings, i.e. system classifications, q_1, ..., q_n), the base rating is calculated by:

b = 1 - \frac{1}{n} \sum_{i=1}^{n} |w_i - q_i|    (1)

Here it is important to note that we tested two variants of the heuristics: variant N (normal) allowed only coarse-grained assessments of 0 (bad) and 1 (good) for all elements w_i, while variant D (detailed) allowed a finer-grained assessment in steps of 0.1. Figures 2 and 3 illustrate the different ways of assessment in the two algorithm variants. An example of how the algorithm works: assume that, as shown in Figure 3, a student assesses the three given alternative solutions with scores w_1 = 0.5, w_2 = 0.1 and w_3 = 0.9. If the current system-internal quality ratings of these three solutions (see Section 2.1.3) are q_1 = 0.5, q_2 = 0.05 and q_3 = 0.95, then the base rating b for the student who makes the assessments is:

b = 1 - \frac{1}{3} \sum_{i=1}^{3} |w_i - q_i| = 1 - \frac{|0.5 - 0.5| + |0.1 - 0.05| + |0.9 - 0.95|}{3} \approx 0.97    (2)
This high base rating results from the fact that the student assessed the given solutions as correctly as possible (as compared to the quality ratings). According to the assumption, he is thus probably able to provide a high-quality solution himself.

2.1.2. Evaluation Rating

The second component of the heuristics is the evaluation rating. Once a student has provided his solution (and has assessed some peer solutions), it is presented to other students to be assessed. All assessments are collected and averaged, using a weighting in which assessments of better students get higher weights. Thus, given j assessments w_1, ..., w_j from students whose own solutions have quality ratings q_1, ..., q_j, the evaluation rating is calculated as:

e = \frac{1}{\sum_{i=1}^{j} q_i} \sum_{i=1}^{j} w_i q_i    (3)

To illustrate the weighting, here is another example: assume a solution gets four assessments w_1 = 0.9, w_2 = 0.2, w_3 = 0.4 and w_4 = 0.5 from students whose own solutions have internal system quality ratings of q_1 = 0.8, q_2 = 0.1, q_3 = 0.3 and q_4 = 0.7. Then, the evaluation rating e for the assessed solution is:

e = \frac{1}{\sum_{i=1}^{4} q_i} \sum_{i=1}^{4} w_i q_i = \frac{1}{1.9} (0.9 \cdot 0.8 + 0.2 \cdot 0.1 + 0.4 \cdot 0.3 + 0.5 \cdot 0.7) \approx 0.63    (4)
The first assessment gets a higher weight than the others because the student who provided it has a higher quality rating than the other assessors. His opinion is thus considered more important than the other students' opinions by the system heuristics.
Figure 2. Solution assessment (variant N) 2
Figure 3. Solution assessment (variant D)
2.1.3. Quality Rating

Finally, the base rating and the evaluation rating are combined into a quality rating. The evaluation rating is weighted depending on the number of received assessments p for a solution; its impact thus increases with an increasing number of assessments. The formula contains a constant c which corresponds to the number of alternative solutions presented in the assessment dialogs (see Figures 2 and 3). In our example, we have c = 3 and p = 4. Thus, the quality rating is calculated as:

q = \frac{c}{p+c} b + \frac{p}{p+c} e = \frac{3}{4+3} \cdot 0.97 + \frac{4}{4+3} \cdot 0.63 \approx 0.78    (5)

² The original user interface was in German (for this figure and all others).
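The combination of base rating and evaluation rating described above can be sketched as follows (function name and keyword defaults are ours):

```python
def quality_rating(b, e, p, c=3):
    """Combine base rating b and evaluation rating e into a quality
    rating. The weight of the evaluation rating grows with the number of
    received assessments p; c is the number of alternative solutions
    shown per assessment dialog."""
    return (c / (p + c)) * b + (p / (p + c)) * e

# Example values from the paper: b ≈ 0.97, e ≈ 0.63, p = 4 assessments
q = quality_rating(b=0.97, e=0.63, p=4, c=3)
```

With p = 0 (no peer assessments yet) the quality rating reduces to the base rating alone, which is exactly the cold-start behavior discussed in section 2.6.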
2.2. Implementation

We developed a Java and XML based system to test the CF heuristics. After an initial login to the system, students go through the following phases as they use the system:

1. Work on a task.
2. Assess three alternative solutions for the just completed task (Figures 2 and 3).
3. Repeat steps 1 and 2 as long as there are tasks to complete.
4. Self-assess the quality of their own solutions.
Figure 4 shows the user interface with a sample task (in this case a task on Java programming) from the lab study that we describe in more detail in section 2.4. The users were given a text, a question, and both a time limit and a character limit for their solution. The limits served as an orientation guide for what kind of solution was expected.
Figure 4. User interface
2.3. Research Questions

To test whether our CF heuristics algorithm actually works, we investigated the following research questions:
1. Does the heuristics correctly classify students' solutions in comparison to a manual grading by human experts? This point is of course fundamental.
2. Is the level of detail which is available in the peer assessments important? This point is interesting since it is probably easier for students to perform a coarse-grain assessment (like right or wrong) than a fine-grain assessment (like an assessment on a 10-point Likert scale).
3. Does the heuristics' performance dominate the participants' self-assessment in comparison to a manual expert assessment? This aspect is interesting because a participant's self-assessment is usually much easier to obtain than multiple peer ratings.
4. Will the estimation quality of the heuristics improve with a growing number of peer assessments? This is typical for CF algorithms in other domains, so we hypothesize that it will improve. The critical part of the question is the number of assessments needed to achieve sufficient quality. If the number is low, then the algorithm will also be applicable in small learning groups where only few peer assessments are possible.
5. Does the heuristics' estimation quality depend on the task type? While for well-defined tasks students only have to compare the right solution with the solution to be assessed (if the student knows the right solution), more work is required for ill-defined tasks where students have to think about other students' viewpoints, since multiple solutions could be acceptable. Thus it is a priori unclear whether the heuristics will also be suitable for those task types.
2.4. Study Description

To answer the research questions, we conducted a controlled lab study in May 2008 at Clausthal University of Technology with 45 participants, including 18 female and 27 male students. The participants were assigned randomly to the two system variants D and N, resulting in 7 female and 15 male participants in variant N and 11 female and 12 male participants in variant D. The participants were volunteers from different domains (e.g., computer science, physics or business) in different stages of their studies, i.e. there were first-semester bachelor students as well as advanced diploma and PhD students. All participants were recruited via public announcements on local newsgroups or e-mail lists and were paid for their time. The students had to work on 12 tasks from various knowledge areas. The tasks were of the following types:

1. Text summaries.
2. Text interpretations.
3. Knowledge tests without possibility to guess.
4. Knowledge tests with possibility to guess.
In the first task type (text summaries), the participants got articles dealing with different topics (e.g. a text about Second Life). These articles differed in their level of complexity and required, at least in parts, domain-specific knowledge to get the main points, which had to be summarized in a short text. The second task type (text interpretations) focused on a fact-based news article about the take-over of DoubleClick by Google. The participants were asked to mention and discuss possible concerns regarding customer privacy based on the facts in the text. The third task type (knowledge tests without possibility to guess) consisted of five tasks where guessing
was not possible (e.g. calculating the derivative of a function to determine the slope at a given coordinate). The fourth and final task type (knowledge tests with possibility to guess) consisted of problems which could at least be approximated by logical deduction even without specific domain knowledge. An example here was the estimation of the population of Austria by means of a text about the size of the country. The students had an overall time limit of 75 minutes. Furthermore, each task had a character limit as well as a time limit (see section 2.2). All participants were instructed to assess alternative solutions even if they did not know the correct solution for a task. To solve the cold-start problem [14] and offer alternative solutions also for the first participants who took part in the study, we provided 3 alternative solutions of different quality per task.

2.5. Results

To evaluate the results of the heuristics, all solutions were manually graded independently by two human experts (a professor of computer science and an advanced graduate student) on a scale from 0 to 10. To check whether the human graders' assessments were similar (if human graders disagree, then a realistic baseline for the heuristics is hard to define), we first calculated inter-rater reliability based on Cronbach's Alpha [15].

Table 1. Inter-rater reliability of human graders by means of Cronbach's alpha

Task Group                                         α
1. Text summaries                                  0.834
2. Text interpretations                            0.888
3. Knowledge tests without possibility to guess    0.982
4. Knowledge tests with possibility to guess       0.932
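Cronbach's alpha for k raters can be computed as sketched below. This is our own illustration, assuming population variances; statistics packages sometimes use sample variances and may yield slightly different values:

```python
def cronbach_alpha(ratings):
    """Cronbach's alpha for k raters. `ratings` is a list of k
    equal-length score lists, one per rater. Uses population variances."""
    k = len(ratings)
    n = len(ratings[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Variance of the per-solution total scores across raters
    totals = [sum(r[j] for r in ratings) for j in range(n)]
    return k / (k - 1) * (1 - sum(var(r) for r in ratings) / var(totals))

# Two raters in perfect agreement yield alpha = 1.0
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])  # → 1.0
```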
As Table 1 shows, the agreement between the two graders was acceptable (and even excellent for the knowledge tests). Therefore, we averaged both human graders' scores and used the resulting "human grading" as a baseline for comparisons to the self-assessment of the students and to the results of the system's quality rating. As a next step, we needed to define what we consider as an acceptable level of deviation between the system's quality rating and the human grading. In this context, it is important to note that despite their overall agreement, the human graders still had slight differences between their gradings, especially in the more ill-defined tasks. Thus it is not realistic to define the acceptable level of deviation as 0.0: this was achieved between the human graders only in 44.6% of the cases. Considering the fact that a random system assessment would have led to an expected difference of E[X] = 0.33, and that a static default value of 0.5 would have led to an expected difference of 0.25 in theory and to E[X] = 0.305 (in variant D) and E[X] = 0.29 (in variant N) for our data set, we set the maximum acceptable deviation to 0.2. This choice is supported by the agreement between the human graders, who differed by more than 0.1 in 21.2% of the cases, but by more than 0.2 in only 11.5% of the cases. Using a limit of 0.2 is thus, in our view, an acceptable compromise between being unrealistically strict (so that even humans would not agree to this extent) and overly relaxed (so that even guessing would fulfill the criteria).
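The theoretical baselines quoted above are easy to verify: for gradings and ratings both uniformly distributed on [0, 1], the expected absolute difference is 1/3, and for a constant rating of 0.5 it is 1/4. A quick Monte Carlo check (illustrative only; the variable names are ours):

```python
import random

random.seed(0)
N = 100_000
grades = [random.random() for _ in range(N)]       # stand-in human gradings
rand_scores = [random.random() for _ in range(N)]  # random system ratings

# Expected deviation of a random rating from the grading: E|U - V| = 1/3
e_random = sum(abs(g, s := 0) or abs(g - r) for (g, r) in zip(grades, rand_scores)) / N if False else \
           sum(abs(g - r) for g, r in zip(grades, rand_scores)) / N

# Expected deviation of a constant 0.5 rating: E|U - 0.5| = 1/4
e_const = sum(abs(g - 0.5) for g in grades) / N
```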
2.5.1. General Heuristics' Quality

To analyze the overall quality of the heuristics, we compared the average deviation between system score and human grading to the number of assessments. Figure 5 gives an overview of both system variants.
Figure 5. Heuristics quality measured by average difference between system score and human grading
The ordinate shows the average deviation between the system's quality rating and the human grading, while the abscissa shows the minimum number of assessments a solution received. An example: the average difference between the system's quality rating and the human grading was 0.25 in variant N when considering only those solutions which had been assessed at least once. Based on the quality threshold of 0.2 discussed above, Figure 5 shows that both variants of the heuristics provided acceptable results when a sufficient number of assessments was available. This was independent of the educational background (including major topic and semester) of the participants. Thus we can answer our first research question: the heuristics is able to classify solutions correctly. As expected, Figure 5 also shows that the average deviation between heuristics and human grading decreases continuously with an increasing number of available peer assessments, resulting in an improved prediction quality of the heuristics (research question 4). Four (variant D) to five (variant N) assessments were enough to achieve a prediction quality which differed from the human grade by no more than 0.2. Yet, Figure 5 also shows that the heuristics' quality is not sufficient if it is based only on the base rating, i.e. if there are no assessments from the participants. We will discuss this later in section 2.6.

2.5.2. Influence of Assessment Granularity

Next, we checked how the degree of granularity that was available for the peer assessments influenced the resulting heuristics' quality (research question 2). As Figure 5 suggests, there is no significant difference between the two variants. This was confirmed by an ANOVA: the differences between the system's quality rating and the human grading were statistically not significant, i.e. F(1, 538) = 2.69, p > 0.1 for all solutions and F(1, 31) = 0.71, p > 0.4 for solutions with six or more assessments.
Since both variants provided sufficient results and did not differ significantly, we used the combined results of both variants (D+N) in the following.
2.5.3. Heuristics Quality vs. Self-Assessment

To investigate whether the heuristics outperforms the participants' self-assessments (research question 3), we compared their average deviation to the experts' grades.
Figure 6. Average deviation between system’s quality heuristics and participants’ self-assessment
As shown in Figure 6, the heuristics outperformed the participants' self-assessments when three or more assessments were available for a solution. A t-test showed that this result is statistically significant (p < 0.05 for solutions with at least 4 assessments).

2.5.4. Task Group Dependency

Finally, we looked at the differences in the heuristics' quality between the four task types (research question 5). As Figure 7 illustrates, the system provided satisfying results in all task groups; however, it took more peer assessments for text summaries and for knowledge tests with possibility to guess. An ANOVA showed that the differences between the task types were not statistically significant (p > 0.5).
Figure 7. Results of system’s quality heuristics depending on task group
2.6. Discussion Overall, the pilot study confirmed our expectations. The collaborative filtering heuristics provided acceptable quality assessments for participants’ solutions when enough, i.e. four to five, assessments were available. Confirming findings in literature
[11], [16], the participants' self-assessments were qualitatively beaten by the peer assessments. The heuristics turned out to be adequate for different types of tasks, ranging from well-structured knowledge tests, where solutions could be checked automatically, to ill-defined tasks like interpretations of rather complicated texts. Contrary to our expectations, both variants (D and N) were on a similar quality level, so it did not make a difference whether peer grades were given on a coarse-grain scale or on a fine-grain scale. One possible explanation for this might be the fact that, while variant N did not allow for "medium" ratings, students in variant D tended to prefer less extreme scores (like 0.7 to 0.9 for good solutions and 0.1 to 0.3 for bad solutions). This led to a need for more assessments in variant D to achieve extreme scores of < 0.2 or > 0.8. One aspect of the heuristics that was not confirmed by our study is related to the base rating. In section 2.1.1, the heuristic's base rating was described: based on the assumption that a student who can classify the quality of a given solution correctly is able to provide a high-quality solution himself, the base rating assigns a first quality score to a student's solution even though it has not yet been reviewed by peers. Unfortunately, our analysis showed that this goal could not be fully achieved. Figure 8 shows the average deviation between the base rating and the human grading for both system variants. We compared it to a default initial value of 0.5, which results in an average difference to the human grading of 0.305 in variant D and 0.29 in variant N. As the diagram shows, the base rating delivered results comparable to a default initial value in variant N and even worse results in variant D. Thus, theoretically, the base rating formula could have been replaced by a constant to improve the system's quality.
Figure 8. Comparison between base rating quality and default initial values, measured by average deviation to the human grading
But the base rating formula can be improved. In the study it became apparent that the major weakness of the base rating lies in its inability to reach extreme (especially extremely low) scores. In variant D, there were 142 solutions which got a human grading of < 0.5, but only 14 solutions which got a base rating of less than 0.5. In variant N, this effect was less extreme but still observable (137 to 86). The main reason for this effect lies in the combination of alternative solutions. The problem is the following: assume a participant got three solutions with quality ratings of 1.0, 0.67 and 0.0. Based on the worst imaginable assessment, i.e. 0.0, 0.0 and 1.0, the base rating results in:
b = 1 - \frac{1}{3} \sum_{i=1}^{3} |w_i - q_i| = 1 - \frac{1}{3} \big( |0.0 - 1.0| + |0.0 - 0.67| + |1.0 - 0.0| \big) \approx 0.11    (6)
Thus it is not possible to achieve a base rating lower than 0.11 in this constellation, and the effect gets worse the closer the quality ratings of the solutions to be graded are to the middle of the scale. In variant D, this problem is amplified by the participants' tendency to avoid extreme assessments (as discussed before). Concretely, the lowest base rating achieved in our study was 0.14 in variant N and 0.31 in variant D. Hence, there is potential for improvement here.
3. The CITUC System

Based on the promising results of the lab study, our next step was to develop an e-learning system for practical use to test the heuristics in a more realistic setting. The resulting system, called CITUC (Collective Intelligence @ Technical University of Clausthal), was intended to support students in their preparation for a final exam without increasing the workload of tutors.

3.1. System Modifications

To improve the base rating (cf. section 2.6), the algorithm was changed in a way that allows for achieving extreme scores independently of the quality ratings of the alternative solutions to be graded (even if they are near 0.5). Thus we modified the base rating formula in the following way:

b_{new} = 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{|w_i - q_i|}{\max(q_i, 1 - q_i)}    (7)
The advantage here is that, due to the scaling of each deviation by its maximum possible value, extreme scores are always achievable: base ratings of 0.0 and 1.0 remain possible. To illustrate this: assume there are solutions with quality ratings of q_1 = 0.35, q_2 = 0.6 and q_3 = 1.0. The worst ratings a user might give here are w_1 = 1.0, w_2 = 0.0 and w_3 = 0.0. The old formula would have led to a base rating b_old of:

b_{old} = 1 - \frac{1}{3} \sum_{i=1}^{3} |w_i - q_i| = 1 - \frac{1}{3} \big( |1.0 - 0.35| + |0.0 - 0.6| + |0.0 - 1.0| \big) = 0.25    (8)
This user would thus have got a far too high base rating score (0.25) with respect to his poor assessments. The new base rating b_new corrects for this:

b_{new} = 1 - \frac{1}{3} \sum_{i=1}^{3} \frac{|w_i - q_i|}{\max(q_i, 1 - q_i)} = 1 - \frac{1}{3} \left( \frac{|1.0 - 0.35|}{\max(0.35, 0.65)} + \frac{|0.0 - 0.6|}{\max(0.6, 0.4)} + \frac{|1.0 - 0.0|}{\max(1.0, 0.0)} \right) = 0.00    (9)
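The revised base rating of formula (7) can be sketched in the same style as before (function name is ours, not from the CITUC implementation):

```python
def base_rating_new(assessments, qualities):
    """Revised base rating: each absolute deviation |w_i - q_i| is scaled
    by the maximum possible deviation max(q_i, 1 - q_i), so scores of
    0.0 and 1.0 are always reachable."""
    n = len(assessments)
    scaled = sum(abs(w - q) / max(q, 1 - q)
                 for w, q in zip(assessments, qualities)) / n
    return 1 - scaled

# Worst-case assessments from the paper's example
b_new = base_rating_new([1.0, 0.0, 0.0], [0.35, 0.6, 1.0])  # → 0.0
```

Perfect assessments (w_i = q_i) still give a base rating of 1.0, so the revision only stretches the lower end of the scale.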
Another starting point for improvements is to offer the option to skip tasks if a student is not able to provide at least a basic solution. This situation appeared repeatedly in the lab study for task type 3 (knowledge tests without possibility to guess). Here, it was
possible to get a high base rating by lucky guessing. This led to mistakes in the system's heuristics which lasted until enough peer assessments were available to filter out the failure. To exemplify: in some cases, solutions like "no idea" got a high base rating due to "good guessing". This then led to a low base rating for other participants who correctly assessed this solution as bad. This propagation of mistakes could have been avoided by giving students the possibility to skip tasks. In our concrete use case for CITUC (helping with the preparation for a final exam), requiring students to work through tasks in a fixed sequence would have been misplaced anyway. The problem described here was therefore solved by allowing students a free choice among the tasks to work on. Based on the results presented in section 2.5, the number of solutions which had to be assessed by students was set to 5 to get a more reliable quality rating. We also opted for system variant D because it provided slightly better results in ill-defined tasks (text interpretations) and results of similar quality in the other categories.

3.2. CITUC System Description

CITUC was implemented as a web-based system using PHP and a relational database for data storage. In addition to the "core functions" of entering and assessing solutions, the system offered facilities to comment on solutions, to exchange private messages (for private follow-ups to comments), and e-mail notifications as awareness messages once new tasks, messages or comments were available or if there were new solutions for tasks that a student had already completed. After logging in to the system with an anonymous identification number, the portal presented students with personalized awareness messages and a menu of options for what to do next (see Figure 9).
Figure 9. CITUC: User-interface with awareness information
The most important menu item is working on the tasks. After selecting it, the user gets a list of all tasks that the system offers (set up by a tutor or by other users) and can choose which task to work on. After providing a solution for the task (see Figure 10), the user sees alternative solutions from other students, presented anonymously. Analogous to the study's variant D, he has to assess these solutions on a scale from 0 (poor) to 10 (good). In addition, he has the possibility to add comments to each solution to help the respective author recognize possible mistakes (see Figure 11).
Figure 10. CITUC: Working on task
For each completed task, the user can take a look at all other solutions with their quality ratings and their comments for the respective task. Here, it is possible for the users to communicate via private messages or to add further comments.
Figure 11. CITUC: Assessment of alternative solutions
As indicated before, the system allows students to enter tasks themselves. This option was included so that students can enter problems they may have encountered during their exam preparation (to see how other students deal with these problems). Nevertheless, there is role management: the system differentiates between administrators, tutors and students. The first two groups have access to all tasks with their solutions and comments (see the tutor area in Figure 9).

3.3. Research Questions

In our research, we focused on the investigation of the following questions:

1. To what extent is the heuristics' quality rating ready for use in practice? Does usage in a real context confirm the results from the previous lab study?
2. Does the system have the potential to replace classical tutorials for exam preparation? Is the students' motivation to use the system on a voluntary basis sufficient (usage frequency), is CITUC considered helpful by the users (usage quality), and does it actually help students (effectiveness)?
3.4. Study Description

The CITUC system was used in the course "Business Information Systems II: Modeling of Information Systems" at Clausthal University of Technology in summer 2008. The course was attended by Business Information Systems students as well as Management and Economics students in their first semesters. The system was made available after a short introduction in the last course lecture. It was available for approximately six weeks until the day of the course exam. Participation was voluntary. To motivate the students to use CITUC, e-mail reminders were sent at intervals of 2 weeks. 98 users finally registered in the system; 85 students took part in the final exam. Overall, there were 50 tasks in the system: 22 of them were known to the students since they were taken from previous tutorials (they were included to familiarize the students with the system), and 27 new tasks were explicitly marked as exam preparation tasks. One task was entered by a student. A few days before the final exam, the participants were asked to fill out an online survey to assess the CITUC system. 29 of the 98 students participated in this.

3.5. Results

The following sections summarize the results of the system's evaluation, organized according to the research questions, i.e. (1) performance of the heuristics in real settings, (2a) usage frequency of the system, and (2b) system quality and effectiveness.

3.5.1. Performance of the Heuristics

To investigate the heuristics' classification performance, we looked at the 30 worst and 30 best solutions (according to the system's quality rating). Among the 30 worst solutions, 83% were "spam", i.e. solutions like "foo". These "spam" answers were given by students who apparently wanted to look at other students' solutions (and had to provide their own to do so). Therefore we can note that the heuristics is
capable of filtering out this kind of spam successfully. The remaining solutions in the "poorest 30" set were also correctly classified as being of low quality. Within this set, the mean quality rating was m = 0.087 (sd = 0.034) and the mean of the corresponding base ratings was m = 0.238 (sd = 0.179), which indicates that the base rating was improved as compared to the lab study. A similar picture emerged when looking at the top 30 solutions. Among them, there was a single spam solution with a high base rating, which could be ascribed to excellent guessing on the student's side; this solution did not receive any assessments from other students until the end (it was one of the last ones entered), so that the base rating was the only available score. The other 29 solutions in the "top 30" set were classified correctly and received 5 assessments each. The mean quality rating in this set was m = 0.914 (sd = 0.025); the mean base rating was m = 0.747 (sd = 0.139). Thus, the heuristics confirmed the results of the lab study, and now even the base rating provided very good classifications.

3.5.2. Frequency of Use

As Figures 12 and 13 show, the system was used mainly during the last 1.5 weeks before the final exam. The last day before the exam had the most logins (see Figure 12) and the most provided solutions (see Figure 13). The small peaks in the system's use during the first days as well as after two weeks can be explained by the reminder e-mails. We conclude from this usage pattern that purely voluntary use of the system was, at least within this course, sufficient motivation for the students to use the tool during the exam preparation phase (yet not throughout a longer period).
Figure 12. CITUC: Logins per day
Figure 13. CITUC: Provided solutions per day
Figure 13 also shows the main advantage of the system as compared to the existing approaches (PG, SWoRD): even solutions which were posted on the last day got feedback via comments and system ratings. Thus, nearly all students (except the very last one) received feedback until the "last minute".

3.5.3. Students' Opinions

The results of the online survey show that the students found the system useful. They graded it with m = 3.89 (sd = 0.766, n = 26) on a scale from 1 (very useless) to 5 (very useful). The question about the usefulness of the comment function drew a similar picture (m = 3.556, sd = 0.974, n = 27). To the question whether the CITUC system is a good preparation for the final exam, 18 students voted "yes", while 3 students voted "no". It is important to note that the latter did not use the system at all, i.e. they registered but did not work on a single task. A question about the usability of the system resulted in an average value of m = 3.704 (sd = 0.993, n = 27). We noticed that not all students understood the idea of the system. A few of them thought in the "traditional" pattern where students work on a task and afterwards a tutor corrects their solutions or at least presents sample solutions. These students repeatedly asked for sample solutions, even when there were solutions in the system with an excellent score and content. Only after written confirmation from a tutor that the online solution provided by another student was correct did they believe it; they wanted a clear sign that a solution counted as a sample solution.

3.5.4. Tutor's Opinion

An interview with the course tutor showed that he believed his workload was approximately equal to before (when he held classical tutorials instead of feeding tasks into CITUC), but the main advantage of the CITUC system was the possibility to handle more tasks than in a 90-minute tutorial. In the tutor's opinion, the utility of CITUC was confirmed.
Furthermore, he stated that the system allowed for addressing specific weaknesses "on the fly" during the course, which is not always possible in classical tutorial groups that have to be planned in advance. The tutor mentioned concerns with respect to solution assessment: he was not sure whether students would also provide high-quality assessments if the tasks were more complicated and the solutions longer. In our setting we could neither confirm nor refute this point, because most of the tasks were rather short.

3.5.5. System Effectiveness

Out of the 98 registered users of the CITUC system, 79 took part in the final exam. Overall there were 85 participants in the final exam, i.e. 6 participants had not registered in the system. The average score of all participants was 3.282³; the average result of the CITUC users was 3.266, and there were no significant differences between students majoring in different topics. The correlation between the number of logins to the system and the exam results was r = -0.1546, while it was r = -0.1504 between provided solutions in the system and exam results. Both values suggest a trend in the desired direction (a higher degree of use would lead to a better exam result), but are clearly not statistically significant.
³ 1 = A (very good), 2 = B (good), 3 = C (satisfying), 4 = D (sufficient), 5 = E (insufficient)
We investigated further and classified the users into active users (more than average use) and passive users (no usage or less than average usage) depending on their degree of activity. Furthermore, we divided the active users into three subgroups, i.e. low, medium and high, as shown in Table 2.

Table 2. CITUC user classification by means of their rate of use

Classification   Rate of Use   Characteristics                   #
passive use      –             < 7 solutions, < 4 logins         53
active use       low           ≥ 7 solutions, ≥ 4 logins         11
                 medium        ≥ 14 solutions, ≥ 8 logins        24
                 high          ≥ 28 solutions, ≥ 16 logins       10
Out of the 45 active users, 44 took part in the exam. 41 passive users participated in the exam (35 of the 53 with system logins, plus the 6 who never logged in). The average result of the active users was 2.993 (sd = 1.344), compared to the passive users' result of 3.57 (sd = 1.42). Thus the latter clearly achieved a worse result. Again, this difference is not statistically significant, but still a noteworthy trend. The failure rates were analogous: 20.4% of the active users failed the exam, as compared to 45.71% of the passive users. Clearly, these findings are of a correlational (not causal) nature, and the exam result depends on multiple factors beyond CITUC usage, but these results might be seen as an indication that the system has some educational value.
4. Conclusion

The CITUC system presented in this paper is an example of a system which allows a student group to collaboratively build knowledge by classifying and annotating various (student-provided) solutions to problems. CITUC uses CF algorithms in combination with peer reviews to address tutor workload issues in learning environments. In a controlled lab study, the CITUC heuristics provided ratings of sufficient quality (as compared to expert-provided grades) and significantly outperformed the participants' self-assessments when four or more assessments were available for each solution. The heuristics also proved its suitability for daily use beyond the limits of the lab study and provided persuasive classification results of student solutions in a field study. It thus has application potential for Social Semantic Web systems. Problems were identified in a lack of motivation to use the system among the students (apart from the last 2 weeks before the exam) as well as in the use of backdoors to get access to other students' solutions without providing content oneself. CITUC was assessed as helpful by the students and by the tutor, and active usage of CITUC was correlated with better exam results.
References

[1] P. Morville, Ambient Findability, O'Reilly Media, 2005.
[2] A. Walker, M. M. Recker, K. Lawless, D. Wiley, Collaborative Information Filtering: A Review and an Educational Application, International Journal of AIED 14(1) (2004), 1-26.
[3] D. Goldberg, D. Nichols, B. M. Oki, D. Terry, Using Collaborative Filtering to Weave an Information Tapestry, Communications of the ACM 35(12) (1992), 61-70.
[4] W. T. Dancer, J. Dancer, Peer Rating in Higher Education, Journal of Education for Business 67(5) (1992), 306-309.
[5] B. Mathews, Assessing Individual Contributions: Experience of Peer Evaluation in Major Group Projects, British Journal of Educational Technology 25(1) (1994), 19-28.
[6] P. J. Hinds, The Curse of Expertise: The Effects of Expertise and Debiasing Methods on Predictions of Novice Performance, Journal of Experimental Psychology: Applied 5(2) (1999), 205-221.
[7] J. Surowiecki, The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations, Doubleday, 2004.
[8] K. Cho, C. D. Schunn, Scaffolded Writing and Rewriting in the Discipline: A Web-Based Reciprocal Peer-Review System, Computers & Education 48(3) (2007), 409-426.
[9] E. F. Gehringer, Electronic Peer Review and Peer Grading in Computer-Science Courses, In Proceedings of the 32nd SIGCSE Technical Symposium on Computer Science Education, Charlotte, North Carolina, United States, 2001, 139-143.
[10] C. Lynch, K. Ashley, V. Aleven, N. Pinkwart, Defining Ill-Defined Domains: A Literature Survey, In V. Aleven, K. Ashley, C. Lynch, & N. Pinkwart (Eds.), Proceedings of the Workshop on Intelligent Tutoring Systems for Ill-Defined Domains at the 8th International Conference on Intelligent Tutoring Systems, Jhongli, Taiwan, 2006, 1-10.
[11] K. Cho, C. D. Schunn, R. W. Wilson, Validity and Reliability of Scaffolded Peer Assessment of Writing From Instructor and Student Perspectives, Journal of Educational Psychology 98(4) (2006), 891-901.
[12] N. Pinkwart, V. Aleven, K. Ashley, C. Lynch, Evaluating Legal Argument Instruction with Graphical Representations Using LARGO, In Proceedings of the 13th International Conference on Artificial Intelligence in Education, IOS Press, 2007, 101-108.
[13] N. Pinkwart, V. Aleven, K. Ashley, C. Lynch, Schwachstellenermittlung und Rückmeldungsprinzipien in einem intelligenten Tutorensystem für juristische Argumentation, In M. Mühlhäuser, G. Rößling, & R. Steinmetz (Eds.), GI Lecture Notes in Informatics - Tagungsband der 4. e-Learning Fachtagung Informatik, Bonn, Germany, Gesellschaft für Informatik, 2006, 75-86.
[14] D. Maltz, E. Ehrlich, Pointing the Way: Active Collaborative Filtering, In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1995.
[15] L. J. Cronbach, Coefficient Alpha and the Internal Structure of Tests, Psychometrika 16(3) (1951), 297-334.
[16] L. A. J. Stefani, Peer, Self and Tutor Assessment: Relative Reliabilities, Studies in Higher Education 19(1) (1994), 69-75.
Semantic Web Technologies for e-Learning D. Dicheva et al. (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved.
Subject Index

ASPL 219
assessment systems 178
authoring 136
authoring support 77
collaborative filtering 279
comparative analysis 219
competencies 136
computer science education 44
computing disciplines 44
constraint-based tutors 77
cross-curriculum search 136
curriculum development 44
description logics 178
digital narratives 197
domain models 77
educational technology 279
e-learning 117, 219, 245, 260
folksonomies 117
information retrieval 136
instructor-directed feedback 117
intelligent learning environments 245
intelligent tutoring systems 77
internationalisation 136
knowledge access 219
knowledge exploration 219
learning management system 96
learning resources 136
lessons learned 260
managing the ontology-based referencing of resources 5
metadata harvesting 24
model driven architecture 178
multilinguality 136
ontological engineering 59
ontology(ies) 44, 77, 96, 117, 136, 197
ontology evolution 5
ontology mapping 24
ontology of learning and instructional theories 59
ontology of ontology changes 5
ontology-based courseware 24
peer review 279
philosophy 197
semantic annotation 117
semantic e-learning 159
semantic web 178, 197, 245, 260
semantically annotated learning content 159
social semantic web 245, 260
social web 245
test generation 96
theory-aware authoring system 59
tool evaluation 219
topics 136
tracking changes 5
Web 2.0 260
web services 159
Wittgenstein 197
Author Index

Bateman, S. 260
Bourdeau, J. 59
Brooks, C. 260
Cassel, L. 44
Desmoulins, C. 136
Devedžić, V. 117, 178, 245
Dichev, C. 24
Dicheva, D. v, 24
Dzbor, M. 219
Gasevic, D. 117, 245
Goguadze, G. 159
Greer, J. v, 260
Hayashi, Y. 59
Holland, J. 77
Jovanovic, J. 117, 245
Krdžavac, N. 178
Libbrecht, P. 136, 159
Loll, F. 279
Martin, B. 77
McCalla, G. 260
McGuigan, N. 77
Melis, E. 159
Milik, N. 77
Mitrovic, A. 77
Mizoguchi, R. v, 59, 96
Motta, E. 197
Paquette, G. 5
Pasin, M. 197
Pinkwart, N. 279
Radenković, S.D. 178
Rajpathak, D.G. 219
Rogozan, D. 5
Soldatova, L.N. 96
Suraweera, P. 77
Torniai, C. 117
Ullrich, C. 159
Zakharov, K. 77