Advanced Technologies
Edited by
Kankesu Jayanthakumaran
In-Tech
intechweb.org
Published by In-Teh
In-Teh, Olajnica 19/2, 32000 Vukovar, Croatia

Abstracting and non-profit use of the material is permitted with credit to the source. Statements and opinions expressed in the chapters are those of the individual contributors and not necessarily those of the editors or publisher. No responsibility is accepted for the accuracy of information contained in the published articles. The publisher assumes no responsibility or liability for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained inside. After this work has been published by In-Teh, authors have the right to republish it, in whole or in part, in any publication of which they are an author or editor, and to make other personal use of the work.

© 2009 In-Teh
www.intechweb.org
Additional copies can be obtained from: [email protected]

First published October 2009
Printed in India

Technical Editor: Goran Bajac

Advanced Technologies, Edited by Kankesu Jayanthakumaran
p. cm.
ISBN: 978-953-307-009-4
Preface

This book, edited by the Intech committee, combines several hotly debated topics in science, engineering, medicine, information technology, environment, economics and management, and provides a scholarly contribution to their further development. In view of the topical importance of, and the great emphasis placed on, the emerging needs of a changing world, it was decided that this special book publication should comprise thirty-six chapters which focus on multi-disciplinary and inter-disciplinary topics.

Inter-disciplinary works have so far been limited in their capacity, so a more coherent and constructive alternative was needed. Our expectation is that this book will help fill this gap, because it crosses the disciplinary divide to incorporate contributions from scientists and other specialists. The aim is to amalgamate some recent theoretical and empirical contributions that reflect current changes in technology and to develop associations and interactions between a range of disciplines. In this sense the book contains both multi-disciplinary and inter-disciplinary papers, where multi-disciplinarity is the act of joining together two or more disciplines without integration, leaving a third-party observer to integrate, while inter-disciplinary studies target new knowledge by combining two or more disciplines in ways that go beyond any existing discipline. An inter-disciplinary approach is essential and needs more focus, so our hope is that this book will direct and encourage multi-disciplinary researchers to integrate two or more disciplines.

Each chapter has been extensively revised and updated, using papers which best address themes that are relevant and under-researched. It was not possible to deal adequately with all the multi-faceted aspects of the disciplines; nevertheless, we hope that this publication will be a timely and useful addition to the existing body of knowledge. We hope it will benefit policy makers by enhancing their knowledge of advanced technology, and readers from academia, industry, government, NGOs and mining communities. It will also be of interest to postgraduate students working on engineering and development issues, although it is not intended to be a substitute for existing books.

The Intech committee hopes that its book chapters, journal articles, and other activities will help increase knowledge across disciplines and around the world. To that end the committee invites readers to contribute ideas on how best this objective could be accomplished.
Dr Kankesu Jayanthakumaran
Senior Lecturer
School of Economics
University of Wollongong
NSW 2522, Australia
Contents

Preface  V

1. Multilateralism, Regionalism and Income Convergence: ASEAN and SAARC  001
Kankesu Jayanthakumaran and Shao-Wei Lee

2. Newton-Raphson State Estimation Solution Employing Systematically Constructed Jacobian Matrix  019
Nursyarizal Mohd Nor, Prof. Dr. Ramiah Jegatheesan and Ir. Perumal Nallagowden

3. Parallel Direct Integration Variable Step Block Method for Solving Large Systems of Higher Order Ordinary Differential Equations  047
Zanariah Abdul Majid and Mohamed Suleiman

4. Toward Optimal Query Execution in Data Grids  057
Reza Ghaemi, Amin Milani Fard, Md. Nasir Bin Sulaiman and Hamid Tabatabaee

5. Economic Analysis on Information Security Incidents and the Countermeasures: The Case of Japanese Internet Service Providers  073
Toshihiko Takemura, Makoto Osajima and Masatoshi Kawano

6. Evaluating Intrusion Detection Systems and Comparison of Intrusion Detection Techniques in Detecting Misbehaving Nodes for MANET  091
Marjan Kuchaki Rafsanjani

7. Graph Theory and Analysis of Biological Data in Computational Biology  105
Shih-Yi Chao

8. Harmonics Modelling and Simulation  119
Dr. Rana Abdul Jabbar Khan and Muhammad Junaid

9. Knowledge Management Mechanisms in Programmes  151
Mehdi Shami Zanjani and Mohamad Reza Mehregan

10. Heavy Metals and Their Impact on Environment at the Dump-Field Ľubietová-Podlipa (Slovakia)  163
Peter Andráš, Adam Lichý, Ivan Križáni and Jana Rusková

11. Health Technology Management  187
Roberto Miniati, Fabrizio Dori and Mario Fregonara Medici

12. Differential Sandwich Theorems with Generalised Derivative Operator  211
Maslina Darus and Khalifa Al-Shaqsi

13. Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems  225
Khair Eddin Sabri, Ridha Khedri and Jason Jaskolka

14. Energy Field as a Novel Approach to Challenge Viruses  251
S. Amirhassan Monadjemi

15. A Manipulator Control in an Environment with Unknown Static Obstacles  267
Pavel Lopatin and Artyom Yegorov

16. Realization of a New Code for Noise Suppression in Spectral Amplitude Coding OCDMA Networks  285
Hilal A. Fadhil, S. A. Aljunid and R. B. Ahmad

17. Machine Vision System for Automatic Weeding Strategy in Oil Palm Plantation using Image Filtering Technique  307
Kamarul Hawari Ghazali, Mohd. Marzuki Mustafa and Aini Hussain

18. Technologies to Support Effective Learning and Teaching in the 21st Century  319
Susan Silverstone, Jack Phadungtin and Julia Buchanan

19. Multiphase Spray Cooling Technology in Industry  341
Roy J. Issa

20. Web Technologies for Language Learning: Enhancing the Course Management System  357
Afendi Hamat and Mohamed Amin Embi

21. New Advances in Membrane Technology  369
Maryam Takht Ravanchi and Ali Kargari

22. Estimating Development Time and Effort of Software Projects by using a Neuro-Fuzzy Approach  395
Venus Marza and Mohammad Teshnehlab

23. Posture and Gesture Recognition for Human-Computer Interaction  415
Mahmoud Elmezain, Ayoub Al-Hamadi, Omer Rashid and Bernd Michaelis

24. Optimal Economic Stabilization Policy under Uncertainty  441
André A. Keller

25. New Model and Applications of Cellular Neural Networks in Image Processing  471
Radu Matei

26. Concentration of Heterogeneous Road Traffic  503
Thamizh Arasan Venkatachalam and Dhivya Gnanavelu

27. Implementation of Fault Tolerance Techniques for Grid Systems  531
Meenakshi B. Bheevgade and Rajendra M. Patrikar

28. A Case Study of Modelling Concave Globoidal Cam  547
Nguyen Van Tuong and Premysl Pokorny

29. Enhancing Productivity through Integrated Intelligent Methodology in Automated Production Environment  563
Algebra Veronica Vargas A., Sarfraz Ul Haque Minhas, Yuliya Lebedynska and Ulrich Berger

30. Graph-based Exploratory Analysis of Biological Interaction Networks  585
Maurizio Adriano Strangio

31. Intruder Alarm Systems: The Road Ahead  595
Rui Manuel Antunes and Frederico Lapa Grilo

32. Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach  611
Wudhichai Assawinchaichote, Sing Kiong Nguang and Non-members

33. Data Compression and De-compression by Causal Filters with Variable Finite Memory  631
A. Torokhti and S. Miklavcic

34. Feedback Control of Marangoni Convection with Magnetic Field  655
Norihan Md. Arifin, Haliza Rosali and Norfifah Bachok

35. BLA (Bipolar Laddering) Applied to YouTube: Performing Postmodern Psychology Paradigms in the User Experience Field  663
Marc Pifarré, Xavier Sorribas and Eva Villegas

36. Improving the Efficiency of Runge-Kutta Reintegration by means of the RKGL Algorithm  677
Justin S. C. Prentice
1

Multilateralism, Regionalism and Income Convergence: ASEAN and SAARC

Kankesu Jayanthakumaran and Shao-Wei Lee
School of Economics, University of Wollongong, Australia
1. Introduction

The complementary nature of regional trade agreements (discriminatory RTAs) and multilateralism (non-discriminatory) is widely discussed in the literature (Jayanthakumaran and Sanidas, 2007; Ornelas, 2005; Koopmann, 2003; Ethier, 1998). The argument is based on the fact that RTAs and multilateralism are interdependent and that both encourage trade creation (both world and intra-regional) and growth. The next step in this process is to test the hypothesis that multilateralism and RTAs are complementary and that regional income convergence is likely to occur among like-minded and committed RTAs that often have geographical and cultural links. Trade and investment reforms (whether through RTAs or multilateralism) tend to induce the resources within a region to be reallocated in response to the removal of quotas and tariffs from traditionally protected sectors, and allow income to flow from rich to poor nations. Catch-up occurs through involvement in newly emerging manufacturing sectors, in line with comparative advantage (at the expense of agriculture), and through a converging capital-labour ratio across countries in the region (Slaughter, 1997). Our expectation is that regional members are more likely to integrate because of their ethnic and cultural links and their lower transport and transaction costs. The existing literature on the trade-growth nexus (Lewer and Van den Berg, 2003) and the trade-income convergence/divergence nexus (Ben-David, 1996) provides a strong foundation for this hypothesis. In view of this, the hypothesis that 'openness' can lead to income convergence between rich and poor economies, and to relatively better economic growth by poor countries, has been widely tested (Dawson and Sen, 2007; Ghose, 2004; Slaughter, 1997).

The founder members of the Association of South East Asian Nations (ASEAN-5) and the South Asian Association for Regional Cooperation (SAARC-5) had dissimilar levels of integration in terms of RTAs (discriminatory) and multilateralism (non-discriminatory) and were therefore suitable for a comparative study. We used intra-ASEAN-5 historical data (for the 5 founding countries) to isolate the historically different policy interventions: the introduction of the Preferential Trade Agreement (PTA) in 1977 (regionalism), uni-lateral liberalisation following a severe recession in the mid-1980s (non-discriminatory multilateralism), the formation of the ASEAN Free Trade Area (AFTA) in 1992 (regionalism), and the ASEAN and other bi-lateral RTAs in the 2000s. (The authors wish to thank audiences at the WCSET conference in Bangkok, December 17-19, 2008, for insightful comments.) We also used intra-SAARC-5 historical data (for the 5 founding countries) to isolate the historical policy interventions: the formation of the South Asian Association for Regional Cooperation (SAARC) in 1985, the introduction of the South Asian Association for Regional Cooperation Preferential Trading Agreement (SAPTA) in 1995, the formation of the South Asian Free Trade Area (SAFTA) in 2006, more bi-lateral agreements inside and outside member countries since the late 1990s, and uni-lateral liberalisation by a majority of member countries in the 1990s (non-discriminatory multi-lateralism).

Empirical tests of the convergence hypothesis have used 'stochastic' convergence, which implies that shocks to the income of a given country relative to the average income across a set of countries will be temporary, and 'β-convergence', which implies that an initially poorer economy grows faster than a richer one (Dawson and Sen, 2007; Baddeley, 2006; Carlino and Mills, 1993). Unit-root tests have shown that the estimated dummies for breaks in intercept and trend, for countries in which a trend break is found, tend to be statistically significant. Various factors, including financial backwardness, human capital and education, and technological characteristics, have been considered as determinants of the existence of poverty traps (Baddeley, 2006). This means that a trend test for the pre- and post-break periods can be applied, although shortcomings arise if the analysis is limited to differences in GNI per capita without adequately revealing their links to international trade.

The association and causal relationship between trade per person (a proxy for government intervention in trade and foreign investment), income per capita (a proxy for efficiency) and Theil's (1979) value (a proxy for regional income convergence/divergence) in the founder members of the ASEAN and SAARC nations is examined in order to present an analysis of trade policy interventions in regional income convergence. This was undertaken in the light of the comparative advantage of allowing income to flow from a rich to a poor nation within the region through trickle-down effects. The intent of this method is to link international trade directly to the income convergence of the region as a whole. The Lumsdaine and Papell approach is used to detect two structural breaks over time (reflecting policy reforms, as indicated by Perron, 1997) in the unit root analysis. We adopted a residual-based cointegration test in the possible presence of an endogenously determined structural break. The Granger (1969) causality test is performed on a stationary basis to show the causal relationship between the variables.

Section 2 reviews the theoretical and empirical analysis of trade liberalisation and income convergence. Section 3 deals with trade liberalisation and regional income convergence in ASEAN-5 and SAARC-5 countries. Section 4 deals with methodology and section 5 with the results. Section 6 presents conclusions.
2. Trade liberalisation and regional income convergence

Solow's growth framework and the factor-price equalisation theorem suggest that free trade tends to equalise (or converge) factor prices and endowments across countries (Solow, 1956). In order to find evidence on this proposition, recent empirical studies focus on per-capita income convergence (Ben-David, 1993). International trade can influence per-capita income by enforcing the factor-price equalisation theorem, as mentioned above, by encouraging the international flow of technology, and through trade in capital goods (Slaughter, 1997). However, the factor-price
equalisation theorem describes outcomes in steady-state free trade equilibria, not the process of trade liberalisation itself. Empirical studies fill this gap in their analytical framework by considering proxies such as high levels of trade between countries and the removal of obstacles. Ben-David (1993, 1994, 1996) examined the effect of trade on income convergence on a regional basis (i.e. cross-country income differentials) and concluded that most sample countries among particular groups, for example the European Economic Community (EEC) and the European Free Trade Association (EFTA), exhibited significant income divergence during the pre-war (pre-trade-reform) period; this tended towards convergence when trade liberalisation was introduced. Empirical tests have produced mixed results, but there is strong evidence in favour of convergence (Dawson and Sen, 2007; Ghose, 2004; Slaughter, 1997). On the contrary, wealthier countries have grown faster than poor countries, and the current era of globalisation has not been associated with convergence in economic outcomes (Baddeley, 2006; Pritchett, 1997). Chotikapanich, Valenzuela and Rao (1997) show a very high degree of global inequality, but with some evidence of catch-up and convergence between regions.

The convergence hypothesis articulates three reasons why poor economies within a region (RTAs) are expected to grow faster than wealthier economies during regional integration. First, poor economies (latecomers) are more likely to adopt existing technologies which pioneers have already developed, and trade raises factor prices for poor economies and thus per-capita income. Second, growth theory assumes diminishing returns to factor inputs, and therefore capital productivity is higher among poor economies subject to scarce capital; the expectation is that the capital-labour ratio, and thus per-capita income, converges across the region. Third, workers are likely to move from low-productivity agricultural activities to various high-productivity manufacturing and service sectors where there are cost advantages. The critics argue that (a) wealthier countries in the region have accumulated experience of developing leading-edge technologies, (b) poor economies tend to adopt labour-intensive technology instead of capital-intensive technology and (c) wealthier countries tend to access increasing returns to factor inputs (Ghose, 2004; Slaughter, 1997).

The convergence hypothesis cannot be interpreted in terms of growth only, as mentioned above, but also in terms of distributional outcomes widely known as the income trickle-down effect. It is expected that the higher the integration across a region, the greater the trickle-down effect will be, as regionally-oriented trade and investment reforms tend to allocate resources internally in response to comparative advantages, and incomes then trickle down over time to the respective booming sectors.

The inter-links and complementary nature of regionalism (discriminatory RTAs) and multi-lateralism (non-discriminatory) have gained attention in the literature (Jayanthakumaran and Sanidas, 2007; Ornelas, 2005; Koopmann, 2003; Freund, 2000; Ethier, 1998). Jayanthakumaran and Sanidas (2007) concluded that intra-ASEAN-5 exports and national Gross Domestic Products doubled from the first stage of regionalism to the second stage of multi-lateralism, and doubled again from the second stage of multi-lateralism to the third stage of regionalism (the stages are defined in section 3). Although RTAs are discriminatory by nature, they are competent at deeper trade reforms because their members are more like-minded and dedicated, and are often connected culturally and geographically. Access to wider regional markets encourages deeper economic and institutional integration, and extra economic reforms enhance regional cost advantages which eventually allow a
region to reach global efficiency. Marginal reforms (the removal of protection) to regionally oriented trade and investment tend to allocate resources internally in response to the elimination of quotas and tariffs in traditionally protected sectors. On the other hand, global reform policies are likely to trigger regional economic activities and factor mobility by creating links between regional firms and industries, thanks to lower transaction and transport costs. Regional member countries are relatively competent at exploiting these advantages, mainly because of lower transportation costs, similar ethnic and cultural links, and lower transaction costs.

Recent studies that related RTAs to income convergence (Moon, 2006; Niebuhr and Schlitte, 2004; Lopez-Bazo, Vaya and Artis, 2004) revealed positive results. Convergence within a group of nations does not imply a reduction in international inequality, but it does imply a convergence within a group driven by population growth rates, investment rates, human capital, and policy intervention. For example, Niebuhr and Schlitte (2004) concluded that per-capita incomes in the 15 European Union countries converged between 1950 and 2000 at an estimated average rate of about 1.6%. Moon (2006) concluded that East Asia as a whole tended to converge through the period 1980-2000. Lopez-Bazo, Vaya and Artis' (2004) study of the externalities of production across the European region concluded that spillovers, far from being negligible, are robust and may cause non-decreasing returns at the spatial aggregate level. Studies attempting to relate trade to income convergence at a country level (Yao, Zhang and Feng, 2005; Silva and Leichenko, 2004) revealed mixed results. Silva and Leichenko (2004) investigated the effects of trade on income inequality across regions/states in the United States and concluded that the impact of globalisation was uneven. Yao, Zhang and Feng (2005) showed clear evidence of a divergence in per-capita rural (and urban) incomes and total expenditure.

Countries which embrace greater global and regional integration experience macroeconomic fluctuations such as business cycles, co-movement in sub-sets of countries, uncertainty in oil prices, and increasing costs of international transportation. The extent and variation of these fluctuations depend on the substitutability of domestic, regional and world goods. Introducing transportation costs into an international trade model can have large welfare costs and determines the substitutability of domestic, regional and world goods (Ravn and Mazzenga, 2004). A common world or regional factor such as a world price shock can be a pertinent source of volatility for aggregates in most countries, especially in open developing economies (Kose, 2002). During the oil shocks over the period from 1971 to 1989, the increased volatility in the terms of trade arose largely from increased volatility in the relative price of oil rather than from increased volatility of exchange rates (Backus and Crucini, 2000). The financial crisis in Thailand in 1997 eventually impacted on the majority of countries in the region. This evidence shows that the extent of vulnerability varies and turns on the substitutability of domestic, regional, and world goods. Greater regional integration can be an option in a situation where petrol prices and the costs of international transportation are rising. The co-movement of countries in subsets may demand greater international integration.
3. ASEAN-5 and SAARC-5 countries

3.1 Trade liberalisation

In 1967 Malaysia, Indonesia, Thailand, the Philippines, and Singapore formed the ASEAN-5 group to promote cooperation in economic, social, and cultural areas, and to promote
regional peace and stability. (Brunei joined ASEAN in January 1984; Burma, Cambodia, Laos, and Vietnam joined in the 1990s. Our research focuses on the founder members of ASEAN (ASEAN-5), mainly because of the continuous availability of data.) They introduced a Preferential Trade Agreement (PTA) in 1977, uni-lateral liberalisation following the severe recession of the mid-1980s, the formation of the ASEAN Free Trade Area (AFTA) in 1992, and a proliferation of RTAs in the 2000s, such as ASEAN + China + Korea + Japan, India + Malaysia, India + Singapore, ASEAN + India and Thailand + the United States. The formation of AFTA was a milestone, followed by the signing of a Common Effective Preferential Tariff (CEPT) Agreement that limited tariffs to 0-5% by 2002/2003. (The AFTA Council was made responsible for supervising, coordinating and reviewing the implementation of the CEPT agreement, which covered manufacturing and processed and unprocessed agricultural commodities.) The average CEPT tariff rate in the inclusion list was reduced from 12.76% in 1993 to 2.68% in 2003 (US-ASEAN Business Council, 2004). After 1992 an agreement on intra-ASEAN investment, non-tariff barriers, services, intellectual property, and customs and tourism was also reached. ASEAN's decision in 2003 to create an ASEAN Economic Community by 2020 was another important item on the agenda.

Seven South Asian countries (India, Pakistan, Bangladesh, Sri Lanka, Bhutan, Maldives and Nepal) agreed to commit to trade liberalisation under the umbrella of the South Asian Association for Regional Cooperation Preferential Trading Agreement (SAPTA). (In December 1985, in Dhaka, a Charter that established the South Asian Association for Regional Cooperation (SAARC) was adopted. In December 1991, in Colombo, an institutional framework under which specific measures for trade liberalisation between SAARC member countries could be advanced was agreed upon. The idea of forming SAPTA originated in 1991 and became operational in December 1995; see Paswan, 2003, pp. 346-49.) SAARC was formed in 1985, SAPTA began in 1995, the South Asian Free Trade Area (SAFTA) began in 2006, uni-lateral liberalisation by a majority of member countries began in the 1990s (non-discriminatory multi-lateralism), and there have been more bi-lateral agreements between inside and outside member countries since the 2000s: the India and Singapore Comprehensive Economic Cooperation Agreement (CECA) in 2003, India and Thailand in 2004, the India and Malaysia Comprehensive Economic Partnership (CEP) in 2004, India and China in 2004, India and Mercosur (constituting Argentina, Brazil, Paraguay and Uruguay) in 2004, Bolivia and Chile in 2004, and the ASEAN-India Regional Trade and Investment Area (RTIA) in 2003. India, Pakistan, and Sri Lanka, the wealthier member countries, agreed to reduce customs duties to 0-5 per cent by 2009, allowing differential treatment for the least developed members. The extent of globally oriented trade and investment reforms (non-discriminatory multi-lateralism) across SAARC countries has not been consistent and has varied over time (Panagariya, 1999). India is the largest nation, contributing about 80% of the regional GNI, and is the determining force in SAARC.

The present trade and investment regime in the ASEAN-5 countries is much more liberal towards globally-oriented multilateralism (Table 1). Following a severe recession in the mid-1980s and the steady fall in the price of oil, the
ASEAN-5 countries initiated important policy reforms (de-regulation, trade, finance, tax and foreign direct investment) at their own pace (Tan, 2004). The extent varied between countries and over time, but trade liberalisation, as the bottom line of all reform exercises, remained the same. Pre-1990s, import-weighted mean tariffs were much lower in the ASEAN-5 countries, and they have been reduced extensively since then. On the contrary, pre-1990s tariffs in the SAARC countries were very high, although some attempts were made to reduce them during this period. The current tariff rates in Sri Lanka are comparable to those in the ASEAN countries.

The economic performance of ASEAN-5 is remarkable, with the region's per-capita income in 2007 ranging from US$1,620 in the Philippines to US$32,470 in Singapore (World Bank, 2008). The ASEAN countries had homogeneous historical and cultural values and increasingly adopted the policy experiences of their neighbours. The regional (discriminatory) measures taken by the ASEAN-5 countries reduced inefficiencies and transaction costs in the system and accelerated economic growth, which in turn resulted in 'innovative and bold regional experiments' (Ariff, 1994). The ASEAN-5 countries are now integrated more than ever, partly due to regional economic cooperation initiated by them, and partly due to anonymous market forces initiated by global policies. There is evidence to show that the uni-lateral liberalisation undertaken in the late 1980s by the ASEAN-5 countries outside the ASEAN framework united the ASEAN members in economic cooperation and contributed to increased intra-ASEAN trade flows (Kettunen, 1998; Ariff, 1994; Imada, 1993).

SAARC countries are also comparable, as they have similar historical and cultural links with increasingly dissimilar policy experiences. The economic performance of the SAARC-5 countries is not impressive, with the region's per-capita income in 2007 ranging from US$340 in Nepal to US$1,540 in Sri Lanka (World Bank, 2008). This region accounts for approximately 20 per cent of total world population and generates less than 2 per cent of total world GNP. It disintegrated due to political differences, ethnic tensions, human rights abuses, and corruption (Wagle, 2007). Bandara and Yu (2003) used the GTAP model to argue that SAFTA would not benefit the region economically, noting that members cannot even meet at summits due to political conflicts.
Country           Year   Simple mean   Standard    Import-weighted   Share of tariff lines
                         tariff (%)    deviation   mean tariff (%)   above 15% (%)
SAARC
India             1990    79.0         43.6        49.6              97.0
                  1999    32.5         12.3        28.5              93.1
                  2001    32.3         13.0        26.5              n.a.
Sri Lanka         1990    28.3         24.5        26.9              51.7
                  2000     9.9          9.3         7.4              22.0
                  2001     9.2          9.3         6.6              n.a.
Pakistan          1995    50.9         21.5        46.4              91.4
                  1998    46.6         21.2        41.7              86.3
                  2003    17.1         10.9        14.4              n.a.
Bangladesh        1989   106.6         79.3        88.4              98.2
                  2000    21.3         13.6        21.0              51.8
                  2004    18.4         10.2        16.8              n.a.
Nepal             1993    21.9         17.8        15.9              58.9
                  2000    17.9         20.9        17.7              18.7
                  2003    13.6         10.9        16.8              n.a.
ASEAN
Indonesia         1989    21.9         19.7        13.0              50.3
                  2000     8.4         10.8         5.2              11.2
Malaysia          1988    17.0         15.1         9.4              46.7
                  1997     9.3         33.3         6.0              24.7
The Philippines   1989    28.0         14.2        22.4              77.2
                  2000     7.6          7.7         3.8               8.8
Thailand          1989    38.5         19.6        33.0              72.8
Table 1. Comparison of External Tariff Barriers: SAARC-5 and ASEAN-5 Countries (n.a. = not available). Source: Srinivasan and Tendulkar (2003), and Das (2008).

3.2 GNI per capita, trade per person and income convergence

The Theil value of inequality provides an indicator of the relative distribution of income across the nations. It does this by measuring the proportion of each nation's share of income to its share of population in the region, and then adding these figures over the region (Silva and Leichenko, 2004; Nissan and Carter, 1993). The Theil value enables one to decompose overall changes in the region into changes within the concerned groups of countries, which makes it suitable for regional analysis. One should note that the Theil value lacks a straightforward representation and appealing interpretation like that of the Gini coefficient.

Let the country i population share of the region be p_i, with the region R_g population share p_g = \sum_{i \in R_g} p_i, and let the country i income share of the region be y_i, with the region R_g income share y_g = \sum_{i \in R_g} y_i. The measure can be written as

J_r = (1/n_g) \sum_{i \in R_g} (p_i / p_g)(y_g / y_i),    (1)

where n_g is the number of countries in the region R_g, and the notation i \in R_g indicates that the country i is part of the region R_g. When J_r > 1, the region concerned has less than its proportional share of income; in other words, the share of population is larger than the share of income, which implies higher levels of inequality across countries. When J_r < 1, the region concerned receives a larger share of income than its share of population.
If there is a catch-up process, then it is expected that international trade (trade per person) and convergence (the Theil value) are associated with each other. For example, there will be a positive association if either (a) trade is increasing and the Theil value is approaching 1 (equality) from a higher value, or (b) trade is increasing and the Theil value is approaching 1 from a lower value. Trade per person is defined as the overall trade in the region (exports + imports) divided by the size of the region's population.
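To make the mechanics of equation (1) concrete, the short Python sketch below computes J_r for a hypothetical five-country region. The population and income figures are invented for illustration; they are not data from this chapter.

```python
# Minimal sketch of the regional inequality measure in equation (1).
# Raw population and income totals are used; the shares p_i/p_g and
# y_i/y_g are formed inside the function, so the two are equivalent.

def j_index(populations, incomes):
    """J_r = (1/n_g) * sum over i of (p_i / p_g) * (y_g / y_i)."""
    n = len(populations)
    p_g = sum(populations)   # regional population total
    y_g = sum(incomes)       # regional income total
    return sum((p / p_g) * (y_g / y)
               for p, y in zip(populations, incomes)) / n

# Hypothetical figures: population (millions) and GNI (US$ billions).
pop = [220.0, 25.0, 65.0, 80.0, 4.0]
gni = [150.0, 90.0, 120.0, 70.0, 85.0]
print(f"J_r = {j_index(pop, gni):.3f}")
# Values above 1 indicate that population shares exceed income shares
# on average (inequality); values close to 1 indicate equality.
```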
(Figure: line plot of trade per person, GNI per capita and Theil convergence values for the ASEAN-5 countries, 1967-2002.)
Fig. 1. (a): Trade per person, GNI per capita and Theil (convergence) values: ASEAN-5 and SAARC-5 countries
(Figure: line plot of trade per person, GNI per capita and Theil convergence values for the SAARC-5 countries, 1971-2003.)
Fig. 1. (b): Trade per person, GNI per capita and Theil (convergence) values: ASEAN-5 and SAARC-5 countries

Figure 1 (a and b) shows the trade per person, GNI per capita, and Theil values for the 5 founding ASEAN countries and the 5 major SAARC countries. It is evident that the major partners of ASEAN-5 experienced a relative boost in their trade per person and GNI per capita following major policy changes in (approximately) 1977, 1987, 1992, and 2002. The Asian financial crisis is also visible, with a sudden drop in 1997-1998. Another notable aspect
is that in 1992 trade per person exceeded income per capita, and the gap has widened further since then. The Theil values over the study period indicate that the region received a larger share of income, on average, than its share of population, and experienced regional income convergence alongside rising trade. The Theil values show that convergence (equality) across the member countries increased from 0.76 in 1970 to 0.89 in 1995 and remained the same thereafter, except in 1998. The SAARC-5 partners seem to have experienced a relative boost in their trade per person and GNI per capita after the approximate dates 1995 and 2004. However, there is quite a large gap between trade per person and GNI per capita: the SAARC-5 region receives less than its proportional share of income. The Theil values show that divergence (inequality) across the member countries remained stable, with small fluctuations around 1.18, from 1971 to 1990. During 1991-2001 the Theil values fluctuated between 1.09 and 1.13, reflecting convergence. More divergence was recorded after 2002.
4. Methodology

This study proposes the hypothesis that multi-lateralism and regionalism are complementary and that convergence of regional income under a like-minded and committed regionalism, with cultural and geographical links, is more than likely. In order to examine the impact of government intervention on trade and the convergence (or divergence) of regional income, the historical time series of trade per person (a proxy for government intervention in trade), GNI per capita (a proxy for efficiency), and Theil's values (a proxy for income convergence/divergence), covering 1967 to 2005 for the ASEAN-5 countries and 1971 to 2005 for the SAARC-5 countries, are measured and analysed separately.

Unit root tests such as the Augmented Dickey-Fuller (ADF) test can be applied to examine the stationarity of a uni-variate time series. It should be noted, however, that the conventional ADF test fails to detect structural breaks in time series data, and may be biased towards non-rejection of a unit root when the series is trend-stationary within each sub-period (Perron, 1997). Furthermore, structural breaks may occur reflecting, for example, a country's policy reforms or a slowdown in growth (Perron, 1997). Therefore, a unit root test in the presence of two endogenously determined structural breaks is carried out using the Lumsdaine and Papell (LP) approach. Lumsdaine and Papell (1997) argue that one endogenous break may not be sufficient, because it could lead to a loss of information if there is more than one break. Using the LP approach, the unit root analysis in the presence of structural breaks is formulated as follows:

\Delta y_t = \mu + \beta t + \theta DU1_t + \gamma DT1_t + \omega DU2_t + \psi DT2_t + \alpha y_{t-1} + \sum_{i=1}^{k} c_i \Delta y_{t-i} + \varepsilon_t    (2)

where \Delta represents the first-difference operator, y_t is the time series being tested, t = 1, ..., T is a time trend variable, and c(L) is a lag polynomial of known order k. The model includes enough lags k to ensure that the residual term \varepsilon_t is white noise; the optimal lag length k is based on the general-to-specific approach suggested by Ng and Perron (1995). DU1_t and DU2_t are dummy variables for a mean shift occurring at times TB1 and TB2 (1 < TB1 < TB2 < T): DU1_t = 1 if t > TB1 and zero otherwise; DU2_t = 1 if t > TB2 and zero otherwise; DT1_t = t - TB1 if t > TB1, DT2_t = t - TB2 if t > TB2, and zero otherwise. Equation (2) thus allows for two breaks in both the intercept and the slope of the trend function. The break dates are determined by the minimum value of the t-statistic for \alpha. Using an annual time series in this study, following the LP (1997) approach, we assumed kmax was up to 8 for the ASEAN-5 countries and 2 for the SAARC-5 countries. If the t-statistic of \alpha is higher than the critical value, then the null hypothesis of a unit root cannot be rejected.
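To illustrate how the regressors in equation (2) fit together, here is a minimal Python sketch that estimates the LP regression for one candidate break pair. In the full procedure this regression is re-estimated over all admissible (TB1, TB2) pairs and lag lengths, and the pair minimising the t-statistic on \alpha is retained. The series below is synthetic, not the chapter's data, and the code is an illustrative sketch rather than the authors' GAUSS implementation.

```python
import numpy as np

def lp_regression(y, tb1, tb2, k):
    """OLS estimate of equation (2) for one candidate break pair.
    y: series in levels; tb1 < tb2: candidate break times;
    k: number of lagged differences. Returns (alpha, t_alpha)."""
    T = len(y)
    dy = np.diff(y)                                # Δy_t for t = 1..T-1
    t_idx = np.arange(1, T, dtype=float)           # time trend
    du1 = (t_idx > tb1).astype(float)              # DU1_t: intercept shift
    dt1 = np.where(t_idx > tb1, t_idx - tb1, 0.0)  # DT1_t: trend shift
    du2 = (t_idx > tb2).astype(float)              # DU2_t
    dt2 = np.where(t_idx > tb2, t_idx - tb2, 0.0)  # DT2_t
    ylag = y[:-1]                                  # y_{t-1}

    base = [np.ones(T - 1), t_idx, du1, dt1, du2, dt2, ylag]
    lags = [dy[k - i:len(dy) - i] for i in range(1, k + 1)]  # Δy_{t-i}
    X = np.column_stack([v[k:] for v in base] + lags)
    Y = dy[k:]

    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    s2 = resid @ resid / (len(Y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta[6], beta[6] / se[6]   # column 6 holds y_{t-1}, i.e. alpha

# Synthetic trend-stationary series with level shifts at t = 15 and t = 28.
rng = np.random.default_rng(0)
t = np.arange(40)
y = 0.05 * t + 0.8 * (t > 15) + 0.6 * (t > 28) + rng.normal(0, 0.1, 40)
alpha, t_alpha = lp_regression(y, tb1=15, tb2=28, k=2)
print(f"alpha = {alpha:.3f}, t(alpha) = {t_alpha:.2f}")
# The t-statistic is compared against the LP (1997) critical values.
```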
Once the characteristics of the uni-variate time series data are identified, the Granger (1969) causality tests are performed on a stationary basis, within the framework of either the VAR model or the vector ECM. The Granger causality test provides information about whether changes in one variable precede changes in another. The decision rule for causality is that if the null of non-Granger causality from X to Y is rejected at the 5 per cent level, then it can be concluded that X Granger causes Y. If both tests reject the null hypothesis, then we can conclude that there is a lagged feedback effect, that is, a bi-directional causal relationship between the two variables. It should be noted that this model does not infer any 'cause and effect' relationship but only predictability between the two variables. The Granger causality model can be formulated as in (3)-(6):

LTRADE_t = \sum_{j=1}^{p} \phi_{11,j} LTRADE_{t-j} + \sum_{j=1}^{p} \phi_{12,j} LGNI_{t-j} + \varepsilon_{1t}    (3)

LGNI_t = \sum_{j=1}^{p} \phi_{21,j} LGNI_{t-j} + \sum_{j=1}^{p} \phi_{22,j} LTRADE_{t-j} + \varepsilon_{2t}    (4)

LTRADE_t = \sum_{j=1}^{p} \phi_{11,j} LTRADE_{t-j} + \sum_{j=1}^{p} \phi_{12,j} LTHEIL_{t-j} + \varepsilon_{1t}    (5)

LTHEIL_t = \sum_{j=1}^{p} \phi_{21,j} LTHEIL_{t-j} + \sum_{j=1}^{p} \phi_{22,j} LTRADE_{t-j} + \varepsilon_{2t}    (6)

where \varepsilon_{1t} and \varepsilon_{2t} are white noise, and p is the lag length. A test of the joint significance of the lagged cross terms (\phi_{12,j} = 0, j = 1, ..., p, and \phi_{22,j} = 0, j = 1, ..., p) constitutes a short-run Granger causality test. The possible situations showing whether two variables have any causal relationship are as follows:

a) One-way causality if \sum_{j=1}^{p} \phi_{22,j} \neq 0 and \sum_{j=1}^{p} \phi_{12,j} = 0, or \sum_{j=1}^{p} \phi_{12,j} \neq 0 and \sum_{j=1}^{p} \phi_{22,j} = 0.

b) Bi-directional causality if \sum_{j=1}^{p} \phi_{12,j} \neq 0 and \sum_{j=1}^{p} \phi_{22,j} \neq 0.

c) No causal relationship if \sum_{j=1}^{p} \phi_{12,j} and \sum_{j=1}^{p} \phi_{22,j} are not statistically significant.
Our analysis covers 39 years (1967-2005) for the ASEAN-5 and 34 years (1971-2005) for the SAARC-5 countries. (Data for Bangladesh is available, after the separation from Pakistan, only from 1971, and therefore the SAARC analysis covers the 34 years 1971-2005.) All the variables are in ratios and expressed in natural logs. We obtained the data for trade (exports plus imports), GNI, exchange rates and population from the World Bank DX spreadsheets (2006). GNI is converted into US$ using the corresponding exchange rates. GAUSS software was used to conduct the LP tests, while E-views was used to conduct the Granger causality tests.
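The chapter's causality tests were run in E-views; for readers who want to replicate the logic, a roughly equivalent check can be sketched in Python with statsmodels. This is a hypothetical stand-in with synthetic data, not the authors' code or data; statsmodels' grangercausalitytests tests the null that the series in the second column does not Granger cause the series in the first.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

# Synthetic stand-in series in which x leads y by one period; the
# chapter's actual series are LTRADE, LGNI and LTHEIL.
rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(0.0, 0.05, 40))
y = 0.6 * np.concatenate(([0.0], x[:-1])) + rng.normal(0.0, 0.02, 40)
df = pd.DataFrame({"y": y, "x": x})

# Null hypothesis: x does not Granger cause y (second column vs. first).
# maxlag plays the role of p in equations (3)-(6); the chapter selects
# the lag length by AIC.
results = grangercausalitytests(df[["y", "x"]], maxlag=4)
```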
5. Results

The results in Table 2 and Figure 2 show the two most significant structural breaks affecting the variables in the respective trade blocs (ASEAN-5 and SAARC-5), obtained using the Lumsdaine and Papell procedure. (The ASEAN results are from Jayanthakumaran and Verma, 2008.) TB1 and TB2 indicate the times of the structural breaks. The endogenously determined structural breaks for the majority of variables are significant at least at the five per cent level. The empirical results show that the t-statistics for the break dummy and trend coefficients (θ, γ, ω and ψ) are significant in most cases. Given that the estimated coefficients for the indicator and dummy trend variables are statistically significant for the majority of the series, it can be argued that the estimated dates of the structural breaks are indeed significant.

The test detected break points in trade per person for the ASEAN-5 countries in 1987 and 1999. These break points coincide with multi-lateral trade liberalisation by individual member countries of ASEAN and with recovery from the Asian crisis, respectively. The breaks for GNI occurred in 1989 (not significant at the 10% level) and 1998, the latter coinciding with recovery from the Asian crisis. The break points for the THEIL value (the proxy for convergence/divergence) occurred in 1981 and 1989, which coincided with the oil crisis and multi-lateral trade liberalisation respectively. Both events pushed the region to converge, such that it now closely approaches the point where the share of income equals the share of population. We observed that the dispersion gap in income widened in the Philippines and Indonesia and narrowed in Thailand; Malaysia and Singapore remained the same (Jayanthakumaran and Verma, 2008).

The results in Table 2 and Figure 3, also obtained using the Lumsdaine and Papell procedure, show the two most significant structural breaks affecting the variables in the SAARC-5 region. The test detected break points in trade per person in 1992 and 2002. These break points coincide with India's attempt at multi-lateral trade liberalisation (India is the largest nation in the region, contributing 80% of the overall GNI) and with more bi-lateral dealings among
individual SAARC member countries (for example, progress on the India-Sri Lanka CEPA), respectively. The break in GNI occurred in 1990, which indicates that a deterioration in the growth rates in the region coincided with a global recession, while an indication of positive growth in 2002 coincided with more bi-lateral trade agreements by individual countries. The break points for the THEIL values (the proxy for convergence/divergence) occurred in 1985 and 2000, which coincided with the oil crisis of 1985/86 and the enforcement of bi-lateral RTAs respectively. Both events pushed the region to diverge, moving it away from the point where the share of income equals the share of population. From 1990-1994 one can notice that the region converged, such that the index closely approached the point where the share of income equals the share of population.
Series           TB1    TB2    k    μ               β               θ               γ               ω               ψ               α
ASEAN-5 LGNI     1989+  1998   8    6.161 (5.23)    -0.565 (-4.52)  -1.528 (-2.08)  0.263 (1.52)    -2.590 (-3.17)  0.625 (1.52)    -1.731 (-5.73)
SAARC-5 LGNI     1990   2002   2    118.61 (6.57)   12.528 (5.77)   -51.66 (-4.67)  -0.066 (-0.04)  -31.387 (-1.78) 35.468 (4.96)   -0.916 (-6.05)
ASEAN-5 LTHEIL   1981   1989   5    3.082 (4.24)    -1.050 (-5.35)  -3.107 (-3.52)  1.600 (4.63)    2.832 (3.10)    -1.265 (5.06)   -1.653 (-7.44)*
SAARC-5 LTHEIL   1985   2000   7    5.990 (2.68)    0.015 (2.32)    0.119 (3.72)    -0.0340 (-3.88) -0.015 (-0.52)  0.0795 (5.99)   -2.417 (-6.01)
ASEAN-5 LTRADE   1987   1999   6    2.884 (5.37)    0.193 (4.742)   2.804 (4.18)    -0.963 (-6.32)  1.328 (1.82)    -0.299 (-1.92)  -2.760 (-6.76)
SAARC-5 LTRADE   1992   2002   8    85.073 (7.12)   7.134 (6.27)    -8.081 (-2.19)  6.402 (9.20)    -17.596 (-2.27) 32.467 (6.69)   -3.475 (-6.70)

Table 2. Estimating the Time of Structural Breaks by the Lumsdaine and Papell (LP) Approach,

\Delta y_t = \mu + \beta t + \theta DU1_t + \gamma DT1_t + \omega DU2_t + \psi DT2_t + \alpha y_{t-1} + \sum_{i=1}^{k} c_i \Delta y_{t-i} + \varepsilon_t    (2)

Note: * The critical value at the 5% level of significance for the coefficient α is -6.82. t-statistics are in parentheses. See equation (2) for details of the notation. + The break is not significant.
(Figure: plots of LTRADE, LGNI and LTHEIL for the ASEAN-5, 1970-2005, each shown with its change/growth series and the estimated break dates.)
Fig. 2. Plots of the ASEAN-5 Series and Endogenously Estimated Timing of Structural Breaks by the Lumsdaine and Papell Test
(Figure: plots of LTRADE, LGNI and LTHEIL for the SAARC-5, 1975-2005, each shown with its change/growth series and the estimated break dates.)
Fig. 3. Plots of the SAARC-5 Series and Endogenously Estimated Timing of Structural Breaks by the LP Test
The Granger causality tests for the ASEAN-5 countries (Table 3) show that the null hypothesis that trade does not 'Granger cause' the Theil value can be rejected at the 5 per cent level (p-value: 0.0021), whereas the null hypothesis that the Theil value does not 'Granger cause' trade can be rejected at the 1 per cent level (p-value: 0.0000). Based on these results for the ASEAN-5 countries, we conclude that there is a two-way causal relationship, running from trade to convergence and from convergence to trade. The unit root null hypothesis cannot be rejected for GNI and TRADE at the five per cent level, because the t-statistics are smaller in absolute value than the critical value of -6.82. However, the THEIL value (the proxy for convergence/divergence) was found to be stationary in the presence of two structural breaks at the five per cent significance level.

H0                                               p    Chi-sq    d.f.   prob.
ASEAN-5: LTRADE does not Granger cause LGNI      4    7.9510    4      0.0934
SAARC-5: LTRADE does not Granger cause LGNI      1    2.6921    1      0.1008
ASEAN-5: LGNI does not Granger cause LTRADE      4    28.5458   4      0.0000***
SAARC-5: LGNI does not Granger cause LTRADE      1    2.0502    1      0.1522
ASEAN-5: LTRADE does not Granger cause LTHEIL    4    16.7950   4      0.0021**
SAARC-5: LTRADE does not Granger cause LTHEIL    1    0.1483    1      0.7002
ASEAN-5: LTHEIL does not Granger cause LTRADE    4    36.8628   4      0.0000***
SAARC-5: LTHEIL does not Granger cause LTRADE    1    2.2396    1      0.1345
Table 3. Results of the Granger Causality Tests: ASEAN-5 and SAARC-5 Countries. ** significant at the 5% level; *** significant at the 1% level; p is the lag length for the causality model, selected by AIC; L denotes logs.

The results of the Granger causality tests for the SAARC-5 countries are also shown in Table 3. Firstly, the null hypothesis that trade does not 'Granger cause' GNI cannot be rejected at the 5% level (p-value: 0.1008). Similarly, the null hypothesis that GNI does not 'Granger cause' trade cannot be rejected at the 5% level (p-value: 0.1522). Therefore we may conclude that there is no causal relationship between trade and GNI in the SAARC-5 countries. Secondly, the null hypothesis that trade does not 'Granger cause' the Theil value cannot be rejected at the 5% level (p-value: 0.7002), and the null hypothesis that income inequality does not 'Granger cause' trade cannot be rejected at the 5% level (p-value: 0.1345). We cannot establish a causal relationship between trade and income inequality in the SAARC-5 countries.
6. Conclusion

This study demonstrated that multi-lateralism and RTAs are complementary and that regional income convergence is likely under a like-minded and committed regionalism with cultural and geographical links. The complexity of this link between openness and income convergence has not been fully captured in the existing literature, although our study sheds
some light by revealing the experiences of the ASEAN and SAARC countries. The expectation is that reforms (both multi-lateral and RTAs) tend to reallocate resources as the quotas and tariffs in traditionally protected sectors are removed, and to motivate the convergence of regional income in the light of the factor-price equalisation theorem. Regardless of multilateralism or RTAs, it is likely that countries within RTAs integrate more, due to the potential advantages of ethnic and cultural links and lower transport and transaction costs.

By applying the Lumsdaine and Papell (1997) model for detecting breaks in the trend function of the uni-variate trade performance time series (trade per person), we found significant trend breaks in 1987 and 1999, which coincided with economic reforms initiated by individual member countries of ASEAN-5 and with the recovery from the Asian crisis, respectively. The significant break in 1987 is an indication that multi-lateralism had the greater impact on trade in the region. A significant trend break occurred in GNI per capita in 1998, which coincided with recovery from the Asian crisis of 1997. The Asian crisis and recovery in 1997-1998 mirrors the co-movement properties of sectoral outputs in the ASEAN region, a result of intense integration within the region. It is relevant to note that Kose (2002) argued that world price shocks account for a significant proportion of business cycle variability in small, open, developing countries. Our results from the Granger causality test show that there is a one-way causal relationship flowing from GNI to trade. If causality is assessed at an early stage, then trade flows could lead to income, but this may be reversed at a later stage, when increases in income increase the capability of poor countries in the region to import and export.

The analysis showed that the break points for the Theil values in 1981 and 1989 coincided with the oil crisis and with economic reforms by individual countries, respectively. The results from the Granger causality test indicated that there is a two-way causal relationship, running from trade to convergence and from convergence to trade. The result deviates slightly from that of Ben-David (1996), who argued that trade liberalisation induces income convergence. Our view is that if causality is assessed at an early stage of development, then flows of trade could appear to be leading to income convergence; however, this could be reversed at a later stage, when income convergence increases the trade capability of poor countries in the region.

By applying the Lumsdaine and Papell approach for detecting breaks in the trend function of the uni-variate trade per person time series (a proxy for trade and foreign investment) for the SAARC-5 countries, we found significant break points in trade per person in 1992 and 2002. These break points coincide with India's attempt at multi-lateral trade liberalisation (India is the largest nation in the region and contributes 80% of the overall GNI) and with more bi-lateral dealings inside and outside SAARC member countries, respectively. The significant trend break in 1992 indicated that multi-lateralism had the greater impact on trade in the region and tended to unite the SAARC countries in economic cooperation. The global commitments can be viewed as a strengthening rather than a detrimental force for the region.
A significant trend break occurred in both income per capita and trade per person in 2002, which coincided with India concluding more bi-lateral trade agreements with Sri Lanka, Singapore, Thailand, Malaysia, Mercosur and the ASEAN. It is relevant to note that the World Bank (2006:1) described the India-Sri Lanka FTA of 2000 as 'the most effective free trade area in existence'. Econometric analysis of the SAARC-5 countries showed that the break points for the Theil value (representing convergence/divergence) in 1985 and 2000 coincided with the oil crisis in the
mid-1980s and with the engagement of more bi-lateral RTAs in the early 2000s, respectively. Both events increased income divergence among the individual SAARC-5 countries and in the region as a whole. It is relevant to note that Backus and Crucini (2000) argued that from 1972 to 1987 dramatic changes in the relative price of oil drove the terms of trade. Uncertainty in oil prices and increasing costs of international transportation in the mid-1980s can be associated with widening income gaps within the region. An immediate effect of the increased number of bi-lateral trade agreements in the early 2000s was a widening income gap in the region. India has entered into many bi-lateral trade agreements, and these coincided with regional income divergence, which is contradictory to the original objective of SAARC.

There is a two-way causal relationship in the ASEAN-5 countries between trade (both multi-lateral and regional) and regional income convergence, running in both directions. The global commitments of the ASEAN-5 countries can be viewed as a strengthening rather than a detrimental force for the region. The advantages of similar cultural values, low wages, low transaction and transport costs, and strong fundamentals promoted export-oriented foreign investment and exports, which led to increased regional efficiency (technical, allocative and trade efficiency) among the ASEAN-5 countries and eventually to income convergence in line with the Solow perspective. ASEAN is a unique case in this sense.

We cannot establish a similar causal relationship between increased trade and convergence in the SAARC-5 countries. Bandara and Yu (2003) used the GTAP model to argue that SAFTA would not benefit the region economically, noting that members cannot even meet at summits due to political conflicts. Regional economic and political integration among the SAARC member countries is not sufficient to exploit the advantages of similar cultural values, low wages, and low transaction and transport costs. In this sense the South Asian Association for Regional Cooperation (SAARC) needs a more radical approach to eliminating trade barriers at both the regional and the multi-lateral level in order to converge more successfully.

It is important to note that our tests were only concerned with two breaks in each series and were unable to detect more than two structural breaks. We acknowledge the limitations of the unit root test due to its low power in rejecting the null hypothesis of I(1), particularly when there are relatively few degrees of freedom; our analysis involved few degrees of freedom when estimating the equations. The findings are highly specific to the ASEAN-5 setting, so the general limitations of focused case-study research still apply. This sort of analysis rarely gives conclusive results, and the models need to be developed further to capture more of the impacts of RTAs and multi-lateralism.
7. References

Ariff, M. (1994). Open regionalism a la ASEAN, Journal of Asian Economics, 5 (1), 99-117.
Backus, D. K. and Crucini, M. J. (2000). Oil prices and the terms of trade, Journal of International Economics, 50 (1), 185-213.
Baddeley, M. (2006). Convergence or divergence? The impacts of globalisation on growth and inequality in less-developed countries, International Review of Applied Economics, 20 (3), 391-410.
Bandara, J. S. and Yu, W. (2003). How desirable is the South Asian Free Trade Area? A quantitative economic assessment, The World Economy, 26 (9), 1293-1323.
Ben-David, D. (1996). Trade and convergence among countries, Journal of International Economics, 40, 279-298.
Ben-David, D. (1994). Income disparity among countries and the effects of freer trade, in: L. L. Pasinetti and R. M. Solow, eds., Economic Growth and the Structure of Long Run Development, Macmillan, London, 45-64.
Ben-David, D. (1993). Equalizing exchange: Trade liberalization and income convergence, Quarterly Journal of Economics, 108, 653-679.
Carlino, G. and Mills, L. (1993). Are U.S. regional economies converging? A time series analysis, Journal of Monetary Economics, 32, 335-346.
Chotikapanich, D., Valenzuela, R. and Rao, P. D. S. (1997). Global and regional inequality in the distribution of income: estimation with limited and incomplete data, Empirical Economics, 22 (4), 533-546.
Das, D. K. (2008). The South Asian Free Trade Agreement: Evolution and Challenges, MIT International Review (Spring), http://web.mit.edu/mitir/2008/spring/south.html, accessed on 11 June 2008.
Dawson, J. W. and Sen, A. (2007). New evidence on the convergence of international income from a group of 29 countries, Empirical Economics, 33, 199-230.
Ethier, W. J. (1998). Regionalism in a multilateral world, The Journal of Political Economy, 106 (6), 1214-1245.
Freund, C. (2000). Multilateralism and the endogenous formation of preferential trade agreements, Journal of International Economics, 52, 359-376.
Ghose, A. K. (2004). Global inequality and international trade, Cambridge Journal of Economics, 28 (2), 229-252.
Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross-spectral methods, Econometrica, 37 (3), 424-438.
Imada, P. (1993). Production and trade effects of an ASEAN Free Trade Area, The Developing Economies, 31 (1), 3-23.
Jayanthakumaran, K. and Sanidas, E. (2007). The complementarity hypothesis of integration: regionalism, multilateralism and the ASEAN-5, The Asia Pacific Journal of Economics and Business, 11 (1), 40-60.
Jayanthakumaran, K. and Verma, R. (2008). International trade and regional income convergence: the ASEAN-5 evidence, ASEAN Economic Bulletin (accepted for publication in the August 2008 issue).
Kettunen, E. (1998). Economic integration of the ASEAN countries, in L. V. Grunsven (Ed.), Regional Change in Industrialising Asia, Ashgate, Sydney.
Koopmann, G. (2003). Growing regionalism: a major challenge to the multilateral trading system, Intereconomics, 38 (5), 237-241.
Kose, M. A. (2002). Explaining business cycles in small open economies 'how much do world prices matter?', Journal of International Economics, 56 (2), 299-327.
Lewer, J. J. and Van den Berg, H. (2003). How large is international trade's effect on economic growth?, Journal of Economic Surveys, 17 (3), 363-396.
Lopez-Bazo, E., Vaya, E. and Artis, M. (2004). Regional externalities and growth: evidence from European regions, Journal of Regional Science, 44 (1), 43-73.
Lumsdaine, R. L. and Papell, D. H. (1997). Multiple trend breaks and the Unit-Root hypothesis, The Review of Economics and Statistics, 79 (2), 212-218.
18
Advanced Technologies
Moon, W. (2006). Income convergence across nations and regions in East Asia, Journal of International and Area Studies, 13 (2), 1-16. Ng, S. and Perron, P. (1995) unit root tests in ARMA models with data dependent methods for the selection of the truncation lag, Journal of American Statistical Association 90 (429), 268-81. Niebuhr, A. and Schlitte, F. (2004). Convergence, trade and factor mobility in the European Union – implications for enlargement and regional policy, Intereconomics, 39 (3), 167 – 177. Nissan, E. and Carter, G. (1993). Income inequality across regions over time, Growth and Change, 24, 303-319. Ornelas, E. (2005). Trade creating free trade areas and the undermining of multilateralism, European Economic Review, 49, 1717-1735. Panagariya, A. (1999). Trade policy in South Asia: recent liberalisation and future agenda, The World Economy, 22 (3), 353-378. Paswan, N. K. (2003). Agricultural trade in South Asia: potential and policy options, APH Publishing Corporation, New Delhi. Perron, P. (1997). Further evidence on breaking trend functions in macroeconomic variables, Journal of Econometrics, 80 (2), 355-385. Pritchett, L. (1997). Divergence, big time, Journal of Economic Perspectives, 11 (3), 3-17. Ravn, M. O. and Mazzenga, E. (2004). International business cycles: the quantitative role of transportation costs, Journal of International Money and Finance, 23, 645-671. Silva, J. A. and Leichenko, R. M. (2004). Regional income inequality and international trade, Economic Geography, 80 (3), 261 – 286. Slaughter, M. J. (1997). Per-capita convergence and the role of international trade, The American Economic Review, 87 (2), 194-199. Solow, R. M. (1956). A contribution to the theory of economic growth, Quarterly Journal of Economics, 70(1): 65-94. Srinivasan, T.N. and S. D. Tendulkar (2003). Reintegrating India with the World Economy, Oxford: Oxford University Press. Tan, G. (2004). ASEAN: Economic Development and Cooperation. Eastern University Press, Singapore. Theil, H. (1979). World income inequality and its components, Economic letters, 2, 99-102. US-ASEAN Business Council. (2004). The ASEAN Free Trade Area and other Areas of ASEAN Economic Cooperation, Available: www.us-asean.org/afta.asp [accessed 14 February 2005]. Wagle, U. R. (2007). Are economic liberalisation and equality compatible? Evidence from South Asia, World Development, 35 (11), 1836-1857. World Bank (2006). World Tables, DX Database, Washington, DC. World Bank (2008). World Development Indicators Database, Washington, DC. Yao, S., Zhang, Z. and Feng, G. (2005). Rural-urban and regional inequality in output, income and consumption in China under economic reforms, Journal of Economic Studies, 32 (1), 4-24.
X2 Newton-Raphson State Estimation Solution Employing Systematically Constructed Jacobian Matrix Nursyarizal Mohd Nor, Prof. Dr. Ramiah Jegatheesan and Ir. Perumal Nallagowden
University Technology PETRONAS Malaysia
1. Introduction
State Estimation (SE) in power systems is considered the heart of any energy control center. It is responsible for providing a complete and reliable real-time database for analysis, control and optimization functions (A. Monticelli 2002). Since electric power system state estimation was introduced by F. C. Schweppe et al. 1970, it has remained an extremely active and contentious area. Nowadays, state estimation plays an important role in modern Energy Management Systems (EMS), providing a complete, accurate, consistent and reliable database for other functions of the EMS, such as security monitoring, optimal power flow, security analysis, on-line power flow studies, supervisory control, automatic voltage control and economic dispatch control (A. Monticelli 2002, F. C. Schweppe et al. 1970, Ali Abur & A. G. Exposito 2004). The energy control centers gather information and measurements on the status and state of a power system via the Supervisory Control and Data Acquisition (SCADA) system. Various methods for state estimation have been introduced in the past (A. Monticelli 2002, F. C. Schweppe et al. 1970, Holten et al. 1988 and A. Garcia et al. 1979). Among those methods, the Weighted Least Squares (WLS) algorithm is the most popular and finds applications in many fields. In the WLS method the measured quantities are represented as the sum of true values and errors:

z = h(x) + e   (1)

where z is the measurement vector, consisting of real and reactive power flows, bus injection powers and voltage magnitudes; x is the true state variable vector, consisting of bus voltage magnitudes and bus voltage angles; h(x) is the non-linear function that relates the states to the ideal measurements; e is a vector of measurement errors. A state estimate x̂ is to be obtained that minimizes the objective function f given by
f = Σ_{j=1}^{m} w_j e_j²   or   f = Σ_{j=1}^{m} e_j² / σ_j²   (2)

and this can be achieved when

∂f/∂x_n = Σ_{j=1}^{m} 2 w_j e_j (∂e_j/∂x_n) = 0   (3)
where w_j is the weighting factor for the respective measurement, j = 1, 2, …, m, m is the number of measurements and n = 1, 2, …, number of state variables. This non-linear least squares problem is usually solved iteratively as a sequence of linear least squares problems. At each step of the iteration, a WLS solution to the following noise-corrupted system of linear equations is sought:

ê = z − ẑ = z − Hx̂ = e − H(x̂ − x)   (4)

In Eq. 4, ê is the measurement residual vector, the difference between the actual measurement vector and the value of h(x) at the current iterate; x̂ − x is the difference between the updated state and the current state; H is the Jacobian of h(x) in Eq. 1 at the current iterate. The flow of the SE solution algorithm is shown in Figure 1. Although the function of a SE is well understood, there is much freedom of choice in its practical implementation. One of the important options is the statistical methodology used to filter the measured data. The basic Newton-Raphson WLS method, when used in power systems, has good convergence, filtering and bad data processing properties for a given observable meter placement with sufficient redundancy, and yields optimum estimates. However, the gain and Jacobian matrices associated with the basic algorithm require large storage and have to be evaluated at every iteration, resulting in very long computing times. The essential requirements for any on-line SE are reliability, speed and low computer storage. The computational burden associated with the basic WLS algorithm makes it unsuitable for on-line implementation in large scale power systems.
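To make the iteration of Eq. 1 – 4 concrete, the following is a minimal C sketch of one Gauss-Newton WLS step, assuming dense storage and a naive elimination solver; all names are illustrative, and sparsity, bad data processing and convergence control are omitted.

/* One WLS (Gauss-Newton) iteration for Eq. 1-4: solve (H'WH) dx = H'W (z - h(x)).
   Dense, illustrative sketch. m = number of measurements, n = number of states;
   H is m x n (row-major), w holds the diagonal weights of W, r = z - h(x) is the
   residual at the current state x. Work arrays G (n x n), g and dx (length n). */
void wls_iteration(int m, int n, const double *H, const double *w,
                   const double *r, double *x, double *G, double *g, double *dx)
{
    /* Gain matrix G = H' W H and right-hand side g = H' W r */
    for (int a = 0; a < n; a++) {
        g[a] = 0.0;
        for (int b = 0; b < n; b++) {
            double s = 0.0;
            for (int k = 0; k < m; k++)
                s += H[k*n + a] * w[k] * H[k*n + b];
            G[a*n + b] = s;
        }
        for (int k = 0; k < m; k++)
            g[a] += H[k*n + a] * w[k] * r[k];
    }
    /* Solve G dx = g by Gaussian elimination (G is symmetric positive definite
       for an observable network, so no pivoting is used in this sketch). */
    for (int p = 0; p < n; p++) {
        for (int q = p + 1; q < n; q++) {
            double f = G[q*n + p] / G[p*n + p];
            for (int b = p; b < n; b++) G[q*n + b] -= f * G[p*n + b];
            g[q] -= f * g[p];
        }
    }
    for (int p = n - 1; p >= 0; p--) {
        double s = g[p];
        for (int b = p + 1; b < n; b++) s -= G[p*n + b] * dx[b];
        dx[p] = s / G[p*n + p];
    }
    /* State update x <- x + dx */
    for (int a = 0; a < n; a++) x[a] += dx[a];
}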
2. State of the Art
One of the main steps in the WLS algorithm is creating and updating the Jacobian matrix H, which is a significantly time-consuming step requiring a large number of floating point multiplications. Fred Schweppe et al. modified the basic WLS algorithm specifically for real-time application in large scale power systems (F. C. Schweppe & E. J. Handschin 1974). In that paper, constant gain and Jacobian matrices are used in order to reduce the computational time. However, WLS processing could still need a long time for medium to large networks, and becomes even longer in the presence of multiple data with gross errors and the procedures for bad data processing. Since then, several different alternatives to the WLS approach have been investigated. Some of the algorithms developed and implemented in real time are sequential estimators, orthogonal transformation methods, the hybrid method and fast decoupled
estimators (A. Garcia et al. 1979 and A. Monticelli & A. Garcia 1990). In sequential state estimation, each measurement is processed sequentially, usually one at a time. Processing of the measurements avoids matrix procedures; thus, sequential state estimation is mainly intended to provide computational efficiency both in terms of computing time and storage requirements. Sequential estimators have so far been found to be practical for small networks, but not for medium to large networks. In orthogonal transformation methods (A. Simoes-Costa & V. H. Quintana Feb. 1981, A. Simoes-Costa & V. H. Quintana August 1981, Ali Abur & A. G. Exposito 2004 and A. Monticelli 2002), there is no need to compute the gain matrix. The measurements are transformed into virtual measurements that are functions of the state and of the original measurements. However, the concern with this method is the need to obtain the orthogonal matrix which, in spite of actually being expressed as the product of elementary matrices, is much denser than the gain matrix, which can slow the computation. Some ideas, such as the hybrid method (Slutsker I.W. Vempatin & W.F. Tinney 1992), have been proposed to speed up the orthogonal factorization. This method does not require the storage of the orthogonal matrix and it can easily be implemented in an efficient fast decoupled version. However, according to Holten, L. et al. 1988 and Slutsker I.W. Vempatin & W.F. Tinney 1992, this method is less stable than the orthogonal transformation method and also remains rather slow compared with the normal equation method. The WLS formulation may be decoupled by separating the measurement set into real and reactive power groups and by using the same simplifying assumptions as used in the fast decoupled load flow (B. Stott & O. Alsac 1974). An integrated SE like the fast decoupled state estimator may not meet the requirements of on-line state estimation in terms of computer storage and time for very large scale power systems containing a thousand or more buses. The reduction of the computational burden within the normal equation formulation of SE needs further investigation.
Fig. 1. State Estimation Solution (flow chart: measurement model ẑ_i = h_i(x), Jacobian H = ∂h/∂x, weight matrix W = R⁻¹ = diag(1/σ_i²), residual e = z − h(x)).
3. Contribution
In the Newton-Raphson State Estimator (NRSE) method, the process of computing the elements of the Jacobian matrix is a significantly time-consuming step which requires the evaluation of a large number of trigonometric functions. This is significant, especially in large scale power system networks. The Fast Decoupled State Estimator (FDSE) (A. Monticelli 2002, A. Garcia et al. 1979 and Ali Abur & A. G. Exposito 2004) is based on the observation that in practical power system networks under steady state, real power flows are less sensitive to voltage magnitudes and very sensitive to voltage phase angles, while reactive power flows are less sensitive to voltage phase angles and very sensitive to voltage magnitudes. Using these properties, the sub-matrices H_P,V, H_pij,V, H_pji,V, H_Q,δ, H_qij,δ and H_qji,δ are neglected. Because of the approximations made, the corrections to the voltages computed in each iteration are less accurate. This results in a poor convergence characteristic. The Newton-Raphson State Estimator (NRSE) method (Ali Abur & A. G. Exposito 2004, A. Monticelli 1999, John J. Grainger & William D. Stevenson Jr. 1994 and Stagg, G.W. & El-Abiad, A.H. 1968) that was subsequently introduced became more popular because of its exact problem formulation and very good convergence characteristic. In the NRSE method, elements of the Jacobian matrix are computed from standard expressions which are functions of bus voltages, bus powers and the elements of the bus admittance matrix. Nowadays, with the advent of fast computers, even huge amounts of complex calculations can be carried out very efficiently. The aim of this research is to reduce the time taken to construct the H matrix. A simple algorithm to construct the H matrix is presented in this chapter. This algorithm can easily be fitted into the Newton-Raphson State Estimation method. It is recognized that each element of the Jacobian matrix is contributed by the partial derivatives of the power flows in the network elements. The elements of the state estimation Jacobian matrix are obtained by considering the power flow measurements in the network elements. The proposed algorithm processes the network elements one by one and the elements of the H matrix are updated in a simple manner. The final H matrix thus constructed is exactly the same as that obtained in the available NRSE method. The details of the proposed algorithm are discussed in the following sections.
4. General Structure of the H matrix
The SE Jacobian H is not a square matrix. The H matrix always has (2N − 1) columns, where N is the number of buses. The number of rows in the H matrix is equal to the number of available measurements. For the full measurement set, the number of rows will be equal to (3N + 4B), where B is the number of lines. The elements of H represent the partial derivatives of bus voltage magnitudes, bus powers and line flows with respect to the state variables δ and V. As shown in Eq. 5, H_V,δ, H_V,V, H_pij,δ, H_pij,V, H_pji,δ, H_pji,V, H_qij,δ, H_qij,V, H_qji,δ, H_qji,V, H_P,δ, H_P,V, H_Q,δ and H_Q,V are the sub-matrices of the Jacobian matrix; the first suffix indicates the available measurement and the second suffix the variable with respect to which the partial derivatives are taken:

H = [ H_V,δ    H_V,V
      H_pij,δ  H_pij,V
      H_pji,δ  H_pji,V
      H_qij,δ  H_qij,V      (5)
      H_qji,δ  H_qji,V
      H_P,δ    H_P,V
      H_Q,δ    H_Q,V ]

with columns corresponding to the state variables δ_2, δ_3, …, δ_N and V_1, V_2, …, V_N. The constructional details of the SE sub-matrices are discussed in Section 5.

4.1 Power flows in network elements
The transmission network consists of transmission lines, transformers and shunt parameters. In the NRSE method the transmission network is represented by the bus
admittance matrix, and the elements of the H matrix are computed using the elements of the bus admittance matrix. Alternatively, in this chapter, the elements of the H matrix are obtained by considering the power flows in the transmission network elements. Consider the transmission network element between buses i and j, as shown in Figure 2.

Fig. 2. Transmission network element between buses i and j (series branch with shunt admittances g_sh,i + jb_sh,i and g_sh,j + jb_sh,j).
The transmission line is represented by the series impedance r_ij + jx_ij or by the corresponding admittance g_ij + jb_ij. A transformer with series impedance r_ij + jx_ij and off-nominal tap setting a, with the tap setting facility at bus i, is represented by the series admittance (1/a)(g_ij + jb_ij) and shunt admittances ((1 − a)/a²)(g_ij + jb_ij) and ((a − 1)/a)(g_ij + jb_ij) at buses i and j respectively. The half line-charging admittance and any external shunt admittance are added together and represented as g_sh,i + jb_sh,i and g_sh,j + jb_sh,j at buses i and j respectively. For such a general transmission network element, the real and reactive power flows are given by the following expressions.
p_ij = V_i² (g_ij/a² + g_sh,i) − (V_i V_j / a)(g_ij cos δ_ij + b_ij sin δ_ij)   (6)

p_ji = V_j² (g_ij + g_sh,j) − (V_i V_j / a)(g_ij cos δ_ij − b_ij sin δ_ij)   (7)

q_ij = −V_i² (b_ij/a² + b_sh,i) − (V_i V_j / a)(g_ij sin δ_ij − b_ij cos δ_ij)   (8)

q_ji = −V_j² (b_ij + b_sh,j) + (V_i V_j / a)(g_ij sin δ_ij + b_ij cos δ_ij)   (9)

where δ_ij = δ_i − δ_j.   (10)
All the line flows computed from Eq. 6 to 9 are stored in the real power and reactive power matrices as in Eq. 11 and 12, from which the bus powers can be calculated.
P = [ 0     p_12  p_13  …  p_1N
      p_21  0     p_23  …  p_2N
      p_31  p_32  0     …  p_3N      (11)
      …
      p_N1  p_N2  p_N3  …  0    ]

Q = [ 0     q_12  q_13  …  q_1N
      q_21  0     q_23  …  q_2N
      q_31  q_32  0     …  q_3N      (12)
      …
      q_N1  q_N2  q_N3  …  0    ]
The real and reactive power flows in line i-j depend on δ_i, δ_j, V_i and V_j. The partial derivatives of p_ij, p_ji, q_ij and q_ji with respect to δ_i, δ_j, V_i and V_j can be derived from Eq. 6 to 9.
5. Construction of the SE Jacobian Matrix, H
All the elements of the H matrix are partial derivatives of the available measurements with respect to δ and V. The elements of the sub-matrices H_V,δ and H_V,V are given by

H_Vi,δj = ∂V_i/∂δ_j = 0  for all i and j   (13)

H_Vi,Vj = ∂V_i/∂V_j = 0  for i ≠ j,   H_Vi,Vi = ∂V_i/∂V_i = 1   (14)
If at a particular bus the voltage meter is not available, the row corresponding to that bus is deleted. Using Eq. 6 to 9, the expressions for the partial derivatives of p_ij, p_ji, q_ij and q_ji with respect to δ_i, δ_j, V_i and V_j are obtained. Thus

∂p_ij/∂δ_i = (V_i V_j / a)(g_ij sin δ_ij − b_ij cos δ_ij)   (15)

∂p_ij/∂δ_j = −(V_i V_j / a)(g_ij sin δ_ij − b_ij cos δ_ij)   (16)

∂p_ij/∂V_i = 2V_i (g_ij/a² + g_sh,i) − (V_j / a)(g_ij cos δ_ij + b_ij sin δ_ij)   (17)

∂p_ij/∂V_j = −(V_i / a)(g_ij cos δ_ij + b_ij sin δ_ij)   (18)

∂p_ji/∂δ_i = (V_i V_j / a)(g_ij sin δ_ij + b_ij cos δ_ij)   (19)

∂p_ji/∂δ_j = −(V_i V_j / a)(g_ij sin δ_ij + b_ij cos δ_ij)   (20)

∂p_ji/∂V_i = −(V_j / a)(g_ij cos δ_ij − b_ij sin δ_ij)   (21)

∂p_ji/∂V_j = 2V_j (g_ij + g_sh,j) − (V_i / a)(g_ij cos δ_ij − b_ij sin δ_ij)   (22)

∂q_ij/∂δ_i = −(V_i V_j / a)(g_ij cos δ_ij + b_ij sin δ_ij)   (23)

∂q_ij/∂δ_j = (V_i V_j / a)(g_ij cos δ_ij + b_ij sin δ_ij)   (24)

∂q_ij/∂V_i = −2V_i (b_ij/a² + b_sh,i) − (V_j / a)(g_ij sin δ_ij − b_ij cos δ_ij)   (25)

∂q_ij/∂V_j = −(V_i / a)(g_ij sin δ_ij − b_ij cos δ_ij)   (26)

∂q_ji/∂δ_i = (V_i V_j / a)(g_ij cos δ_ij − b_ij sin δ_ij)   (27)

∂q_ji/∂δ_j = −(V_i V_j / a)(g_ij cos δ_ij − b_ij sin δ_ij)   (28)

∂q_ji/∂V_i = (V_j / a)(g_ij sin δ_ij + b_ij cos δ_ij)   (29)

∂q_ji/∂V_j = −2V_j (b_ij + b_sh,j) + (V_i / a)(g_ij sin δ_ij + b_ij cos δ_ij)   (30)
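A direct transcription of Eq. 15 – 30 into C may help; the sketch below evaluates all sixteen derivatives of one element at the current state, grouped four per flow (order: ∂/∂δ_i, ∂/∂δ_j, ∂/∂V_i, ∂/∂V_j). The function name and argument order are illustrative.

#include <math.h>

void flow_derivatives(double Vi, double Vj, double di, double dj,
                      double g, double b, double a,
                      double gshi, double bshi, double gshj, double bshj,
                      double dpij[4], double dpji[4],
                      double dqij[4], double dqji[4])
{
    double dij = di - dj, c = cos(dij), s = sin(dij);
    double VV = Vi * Vj / a;

    dpij[0] =  VV * (g*s - b*c);                              /* Eq. 15 */
    dpij[1] = -dpij[0];                                       /* Eq. 16 */
    dpij[2] =  2.0*Vi*(g/(a*a) + gshi) - (Vj/a)*(g*c + b*s);  /* Eq. 17 */
    dpij[3] = -(Vi/a)*(g*c + b*s);                            /* Eq. 18 */

    dpji[0] =  VV * (g*s + b*c);                              /* Eq. 19 */
    dpji[1] = -dpji[0];                                       /* Eq. 20 */
    dpji[2] = -(Vj/a)*(g*c - b*s);                            /* Eq. 21 */
    dpji[3] =  2.0*Vj*(g + gshj) - (Vi/a)*(g*c - b*s);        /* Eq. 22 */

    dqij[0] = -VV * (g*c + b*s);                              /* Eq. 23 */
    dqij[1] = -dqij[0];                                       /* Eq. 24 */
    dqij[2] = -2.0*Vi*(b/(a*a) + bshi) - (Vj/a)*(g*s - b*c);  /* Eq. 25 */
    dqij[3] = -(Vi/a)*(g*s - b*c);                            /* Eq. 26 */

    dqji[0] =  VV * (g*c - b*s);                              /* Eq. 27 */
    dqji[1] = -dqji[0];                                       /* Eq. 28 */
    dqji[2] =  (Vj/a)*(g*s + b*c);                            /* Eq. 29 */
    dqji[3] = -2.0*Vj*(b + bshj) + (Vi/a)*(g*s + b*c);        /* Eq. 30 */
}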
To construct the H matrix, initially all its elements are set to zero. Network elements are
considered one by one. For the element between buses i and j, the partial derivatives of the line flows with respect to δ_i, δ_j, V_i and V_j are computed using Eq. 15 to 30. These values are simply added to the corresponding elements of the sub-matrices H_pij,δi, H_pij,δj, H_pij,Vi, H_pij,Vj, H_pji,δi, H_pji,δj, H_pji,Vi, H_pji,Vj, H_qij,δi, H_qij,δj, H_qij,Vi, H_qij,Vj, H_qji,δi, H_qji,δj, H_qji,Vi and H_qji,Vj. The sub-matrices H_P,δ, H_P,V, H_Q,δ and H_Q,V are now considered. Partial derivatives of bus powers can be expressed in terms of partial derivatives of line flows. To illustrate this, let i-j, i-k and i-m be the elements connected at bus i. Then the bus powers P_i and Q_i are given by
P_i = p_ij + p_ik + p_im   (31)

Q_i = q_ij + q_ik + q_im   (32)

Therefore

∂P_i/∂δ_i = ∂p_ij/∂δ_i + ∂p_ik/∂δ_i + ∂p_im/∂δ_i   (33)

∂Q_i/∂δ_i = ∂q_ij/∂δ_i + ∂q_ik/∂δ_i + ∂q_im/∂δ_i   (34)
Similar expressions can be written for the other partial derivatives of P_i and Q_i with respect to δ_j, V_i and V_j. Likewise, considering the bus powers P_j and Q_j, the partial derivatives of P_j and Q_j can also be obtained in terms of the partial derivatives of the line flows in the lines connected to bus j. It is to be noted that the partial derivatives of the line flows contribute to the partial derivatives of the bus powers. Table 1 shows a few partial derivatives of line flows and the corresponding partial derivatives of bus powers to which they contribute.

Partial derivative of line flow:   ∂p_ij/∂δ_i   ∂p_ji/∂δ_j   ∂q_ij/∂V_i   ∂q_ji/∂V_j
Corresponding bus power:           ∂P_i/∂δ_i    ∂P_j/∂δ_j    ∂Q_i/∂V_i    ∂Q_j/∂V_j

Table 1. Partial Derivatives of Line Flows and the Corresponding Partial Derivatives of Bus Powers.
The partial derivatives ∂p_ij/∂δ_i, ∂p_ij/∂δ_j, ∂p_ij/∂V_i and ∂p_ij/∂V_j will contribute to ∂P_i/∂δ_i, ∂P_i/∂δ_j, ∂P_i/∂V_i and ∂P_i/∂V_j respectively. Similar results hold for p_ji, q_ij and q_ji. Those values are added to the corresponding elements of H_P,δ, H_P,V, H_Q,δ and H_Q,V. This process is repeated for all the network elements. Once all the network elements have been added, we get the final H matrix.
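A minimal sketch of this element-by-element accumulation, reusing the Element structure and the flow_derivatives() sketch given above and assuming, for simplicity, that every bus injection is measured and the slack column has not yet been removed:

/* Build the injection blocks H_P and H_Q by accumulation (Eq. 31-34).
   HP and HQ are N x 2N, row-major: columns 0..N-1 hold d/d(delta),
   columns N..2N-1 hold d/dV. Illustrative, full-measurement sketch. */
void accumulate_injections(int nelem, const Element *elems, const double *V,
                           const double *delta, int N, double *HP, double *HQ)
{
    for (int k = 0; k < nelem; k++) {
        const Element *e = &elems[k];
        double dpij[4], dpji[4], dqij[4], dqji[4];
        flow_derivatives(V[e->i], V[e->j], delta[e->i], delta[e->j],
                         e->g, e->b, e->a, e->gshi, e->bshi, e->gshj, e->bshj,
                         dpij, dpji, dqij, dqji);
        /* p_ij contributes to row P_i, p_ji to row P_j (Eq. 33) */
        HP[e->i*2*N + e->i]     += dpij[0];  HP[e->i*2*N + e->j]     += dpij[1];
        HP[e->i*2*N + N + e->i] += dpij[2];  HP[e->i*2*N + N + e->j] += dpij[3];
        HP[e->j*2*N + e->i]     += dpji[0];  HP[e->j*2*N + e->j]     += dpji[1];
        HP[e->j*2*N + N + e->i] += dpji[2];  HP[e->j*2*N + N + e->j] += dpji[3];
        /* q_ij contributes to row Q_i, q_ji to row Q_j (Eq. 34) */
        HQ[e->i*2*N + e->i]     += dqij[0];  HQ[e->i*2*N + e->j]     += dqij[1];
        HQ[e->i*2*N + N + e->i] += dqij[2];  HQ[e->i*2*N + N + e->j] += dqij[3];
        HQ[e->j*2*N + e->i]     += dqji[0];  HQ[e->j*2*N + e->j]     += dqji[1];
        HQ[e->j*2*N + N + e->i] += dqji[2];  HQ[e->j*2*N + N + e->j] += dqji[3];
    }
}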
6. Computing and recording only the required partial derivatives
The H matrix will have 3N + 4B rows if all possible measurements are available in the network. However, in practice, the number of available measurements will be much smaller.
Instead of computing the elements of rows corresponding to unavailable measurements and then deleting them, proper logic can be adopted to compute and record only the required partial derivatives. When line i-j is processed, it may not always be necessary to compute all the 16 partial derivatives given by Eq. 15 to 30. The partial derivatives ∂p_ij/∂δ_i, ∂p_ij/∂δ_j, ∂p_ij/∂V_i and ∂p_ij/∂V_j are to be computed only when p_ij or P_i or both p_ij and P_i are in the available measurement list. Thus the following three cases are possible.
CASE 1: p_ij is an available measurement. The four partial derivatives are entered in the row corresponding to p_ij.
CASE 2: P_i is an available measurement. The four partial derivatives are added to the previous values in the row corresponding to P_i.
CASE 3: p_ij and P_i are both available measurements. The four partial derivatives are entered in the row corresponding to p_ij and added to the previous values in the row corresponding to P_i.
The same logic is to be followed for ∂p_ji/∂δ_i, ∂p_ji/∂δ_j, ∂p_ji/∂V_i, ∂p_ji/∂V_j; ∂q_ij/∂δ_i, ∂q_ij/∂δ_j, ∂q_ij/∂V_i, ∂q_ij/∂V_j; and ∂q_ji/∂δ_i, ∂q_ji/∂δ_j, ∂q_ji/∂V_i, ∂q_ji/∂V_j also.
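The three cases can be folded into one small helper, sketched here in C for the p_ij derivatives; row_pij and row_Pi are illustrative bookkeeping (−1 marking a measurement not in the available list), not the authors' data structures.

/* Record d(p_ij)/d{delta_i, delta_j, Vi, Vj} according to the three cases of
   Section 6. H has ncols columns; col_* give the state columns of the element's
   terminal buses; d[4] comes from flow_derivatives(). */
void record_pij(double *H, int ncols, int row_pij, int row_Pi, const double d[4],
                int col_di, int col_dj, int col_Vi, int col_Vj)
{
    if (row_pij >= 0) {               /* CASE 1 and CASE 3: enter into the pij row */
        double *r = H + row_pij * ncols;
        r[col_di] = d[0]; r[col_dj] = d[1]; r[col_Vi] = d[2]; r[col_Vj] = d[3];
    }
    if (row_Pi >= 0) {                /* CASE 2 and CASE 3: add into the Pi row */
        double *r = H + row_Pi * ncols;
        r[col_di] += d[0]; r[col_dj] += d[1]; r[col_Vi] += d[2]; r[col_Vj] += d[3];
    }
    /* If both rows are -1, the four derivatives need not be computed at all. */
}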
6.1 Application example to illustrate the construction of the H Jacobian matrix
The three-bus power system (Ali Abur & A. G. Exposito 2004) shown in Figure 3 is used to illustrate the construction of the H Jacobian matrix. In this system bus 1 is the slack bus and the tap setting a for all lines is 1. With the network data as listed in Table 2 and the available measurements as listed in Table 3, the Jacobian matrix H is constructed as discussed in Section 5, taking the initial bus voltages as V_1 = V_2 = V_3 = 1.0∠0.

Fig. 3. Single-line diagram and measurement configuration of a 3-bus power system (voltmeter, power measurement and line flow measurement locations).

From Bus   To Bus   R (pu)   X (pu)   Total Line Charging Susceptance B (pu)
1          2        0.01     0.03     0
1          3        0.02     0.05     0
2          3        0.03     0.08     0

Table 2. Network data for the 3-bus system.
Measurement   Value (pu)   Weightage
V1            1.006        62500
V2            0.968        62500
p1-2          0.888        15625
p1-3          1.173        15625
q1-2          0.568        15625
q1-3          0.663        15625
P2            -0.501       10000
Q2            -0.286       10000

Table 3. Available measurements for the 3-bus system.

Noting that V1 and V2 are available measurements, the sub-matrices H_V,δ and H_V,V are obtained as

H_V,δ = [ 0  0  0
          0  0  0 ]      H_V,V = [ 1  0  0
                                   0  1  0 ]
where δ spans δ_1, δ_2, δ_3 and V spans V_1, V_2, V_3. To illustrate all the stages of constructing the other sub-matrices, the network elements are added one by one as shown below.

Iteration 1
Element 1-2 is added. The line flow measurements corresponding to this element are p_12, p_21, q_12 and q_21. All these measurements are categorized according to the three cases of Section 6. The measurement p_12 is categorized as CASE 1 since it is one of the available measurements and P_1 is not an available measurement. Similarly, q_12 is also categorized as CASE 1. However, p_21 and q_21 are categorized as CASE 2, since these flows contribute to P_2 and Q_2 respectively but are not themselves listed as available measurements. The newly constructed sub-matrices are:

H_p12,δ = [ 30  −30  0 ]         H_p12,V = [ 10  −10  0 ]
H_p13,δ = [ 0  0  0 ]            H_p13,V = [ 0  0  0 ]
H_q12,δ = [ −10  10  0 ]         H_q12,V = [ 30  −30  0 ]
H_q13,δ = [ 0  0  0 ]            H_q13,V = [ 0  0  0 ]
H_P2,δ = [ −30  30  0 ]          H_P2,V = [ −10  10  0 ]
H_Q2,δ = [ 10  −10  0 ]          H_Q2,V = [ −30  30  0 ]
Element 1-3 is added. The line flow measurements corresponding to this element are p_13, p_31, q_13 and q_31. Now, p_13 and q_13 are categorized as CASE 1, since these measurements are in the available measurement list while P_1 and Q_1 are not. It is not necessary to compute the partial derivatives of p_31 and q_31, as neither they nor P_3 and Q_3 are in the available measurement list. With this, the constructed sub-matrices are:
H_p12,δ = [ 30  −30  0 ]             H_p12,V = [ 10  −10  0 ]
H_p13,δ = [ 17.24  0  −17.24 ]       H_p13,V = [ 6.89  0  −6.89 ]
H_q12,δ = [ −10  10  0 ]             H_q12,V = [ 30  −30  0 ]
H_q13,δ = [ −6.89  0  6.89 ]         H_q13,V = [ 17.24  0  −17.24 ]
H_P2,δ = [ −30  30  0 ]              H_P2,V = [ −10  10  0 ]
H_Q2,δ = [ 10  −10  0 ]              H_Q2,V = [ −30  30  0 ]
Element 2-3 is added. Following similar logic, p_23 and q_23 fall under CASE 2, and the partial derivatives of p_32 and q_32 are not required. The constructed sub-matrices are:

H_p12,δ = [ 30  −30  0 ]             H_p12,V = [ 10  −10  0 ]
H_p13,δ = [ 17.24  0  −17.24 ]       H_p13,V = [ 6.89  0  −6.89 ]
H_q12,δ = [ −10  10  0 ]             H_q12,V = [ 30  −30  0 ]
H_q13,δ = [ −6.89  0  6.89 ]         H_q13,V = [ 17.24  0  −17.24 ]
H_P2,δ = [ −30  40.96  −10.96 ]      H_P2,V = [ −10  14.11  −4.11 ]
H_Q2,δ = [ 10  −14.11  4.11 ]        H_Q2,V = [ −30  40.96  −10.96 ]
The final H matrix is the combination of all the sub-matrices, with the column corresponding to the slack bus deleted. Thus the Jacobian matrix H constructed in the first iteration is (rows in the order of the measurements in Table 3; columns δ_2, δ_3, V_1, V_2, V_3)

H = [   0        0       1      0       0
        0        0       0      1       0
      −30        0      10    −10       0
        0     −17.24     6.89   0      −6.89
       10        0      30    −30       0
        0        6.89   17.24   0     −17.24
       40.96   −10.96  −10     14.11   −4.11
      −14.11     4.11  −30     40.96  −10.96 ]
Using the above H matrix, the state variables are updated as V_1 = 0.9997∠0, V_2 = 0.9743∠−0.021, V_3 = 0.9428∠−0.045. All the above stages are repeated until convergence is obtained in iteration 3, with the final state variable values V_1 = 0.9996∠0, V_2 = 0.9741∠−0.022, V_3 = 0.9439∠−0.048. These estimates are the same as those obtained with the NRSE method.
The suggested procedure has also been tested on the 5-bus system (Stagg, G.W. & El-Abiad, A.H. 1968) and the IEEE 14-bus, IEEE 24-bus, IEEE 30-bus and IEEE 57-bus systems, a local utility 103-bus system, and the IEEE 118-bus and IEEE 300-bus systems. The final results for all the test systems agree with those obtained using the NRSE method. The details are discussed in Section 7.
7. Simulation
In order to simulate the NRSE algorithm, data obtained from power flow studies is used. The initial states of the voltage vectors are commonly assumed to have a flat profile, that is, all the voltage phasors are 1 p.u. and in phase with each other, i.e. at 0 degrees. The convergence of the algorithm is tested using the following criterion:

max |Δx^k| ≤ ε   (31)

where ε is given by the user and in most cases is set to 0.001. The performance of the algorithm in the simulation studies is assessed by comparing the estimated values of the states and the measured values with the true values (from the power flow solution). The performance indices in Eq. 32 – 35 are used for this comparison:

J_meas = (1/m) Σ_{i=1}^{m} ((z_i − z_i^t)/σ_i)²   (32)

J_est = (1/m) Σ_{i=1}^{m} ((ẑ_i − z_i^t)/σ_i)²   (33)

R_ave = (1/m) Σ_{i=1}^{m} |(ẑ_i − z_i^t)/σ_i|   (34)

R_max = max_i |(ẑ_i − z_i^t)/σ_i|,  i = 1, 2, …, m   (35)

The level of uncertainty in the measurements is indicated by the performance index J_meas, while the performance index J_est shows how close the estimated values are to the measured values. The filtering process is considered good if the ratio J_est/J_meas is less than one. The performance indices R_ave and R_max indicate the average and maximum values of the weighted residuals from the true values, to complement the general information given by J_meas and J_est. On the other hand, the true and estimated values of the state variables are compared by

E_x = |x̂_i − x_i^t| / x_i^t × 100%   (36)
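A compact C sketch of Eq. 32 – 35, under the assumption that the measured, estimated and true values and the standard deviations are available as plain arrays; all names are illustrative.

#include <math.h>

/* Performance indices of Eq. 32-35: z = measured, zhat = estimated, zt = true
   (power flow) values, sigma = measurement standard deviations, m measurements. */
void performance_indices(int m, const double *z, const double *zhat,
                         const double *zt, const double *sigma,
                         double *Jmeas, double *Jest, double *Rave, double *Rmax)
{
    *Jmeas = *Jest = *Rave = *Rmax = 0.0;
    for (int i = 0; i < m; i++) {
        double rm = (z[i]    - zt[i]) / sigma[i];
        double re = (zhat[i] - zt[i]) / sigma[i];
        *Jmeas += rm * rm;                       /* Eq. 32 */
        *Jest  += re * re;                       /* Eq. 33 */
        *Rave  += fabs(re);                      /* Eq. 34 */
        if (fabs(re) > *Rmax) *Rmax = fabs(re);  /* Eq. 35 */
    }
    *Jmeas /= m;  *Jest /= m;  *Rave /= m;
    /* Filtering is considered good when Jest/Jmeas < 1. */
}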
Table 4 shows a summary of the general computational results for the tested networks. A convergence tolerance of 0.001 p.u. and a margin of error of 5% are assigned to all tested networks. For every tested network no bad data is detected, since the objective function J is found to be less than the chi-square value.
System Bus   Tolerance    Confidence level   Degrees of freedom N = m − n   Chi-square χ²_{N,α}   Objective function J   Iter.
5            0.001 p.u.   95%                8                              1.83E+01              4.37E-03               3
IEEE 14      0.001 p.u.   95%                26                             3.89E+01              1.19E+01               4
IEEE 30      0.001 p.u.   95%                57                             7.79E+01              7.45E+01               4
IEEE 57      0.001 p.u.   95%                192                            2.25E+02              1.92E+02               7
103-bus      0.001 p.u.   95%                574                            6.31E+02              4.60E+02               6
IEEE 118     0.001 p.u.   95%                289                            3.32E+02              3.00E+02               7
IEEE 300     0.001 p.u.   95%                954                            1.03E+03              1.00E+03               8

Table 4. Summary of computational results.
Table 5 shows the analysis of the state vector x for the 5-bus system, while Table 6 shows the analysis of the state vector for the IEEE 14-bus system.
Bus no.   V^t TRUE (p.u.)   V̂ Estimated (p.u.)   δ^t TRUE (deg)   δ̂ Estimated (deg)   Ex |V| (%)   Ex δ (%)
1         1.06              1.06                 0                0                   0            0
2         1                 1                    -2.061           -2.06115            0            0.00728
3         0.987             0.98725              -4.64            -4.6366             0.02533      0.07328
4         0.984             0.98413              -4.96            -4.95691            0.01321      0.0623
5         0.972             0.9717               -5.76            -5.76476            0.03086      0.08264

Table 5. State Vector for the 5-bus system.
Overall, these two networks show that the performance of the NRSE is good, since the average error of the state vector is less than one percent for both networks, as shown in Tables 5 and 6. To support the reliability of the NRSE, the performance indices for all tested networks are shown in Table 7. They show that the filtering process of the NRSE is acceptable, since the ratio J_est/J_meas is less than one. However, the most important performance measure for proving that the aim of the research is achieved is the computer time, which should remain as short as possible as the size of the network becomes larger. As with the NRSE method, the 5-bus system, IEEE 14-bus, IEEE 30-bus, IEEE 57-bus, local utility 103-bus, IEEE 118-bus and IEEE 300-bus systems were tested, and the results are compared with those obtained with the existing NRSE method. The final estimates of the proposed method, compared with the actual values and the NRSE method, are shown in Figures 4 to 7 for the 5-bus, IEEE 14-bus, IEEE 30-bus and IEEE 57-bus systems respectively. Due to space limitations, the final estimates for the local utility 103-bus, IEEE 118-bus and IEEE 300-bus systems are not furnished in this section. The final estimated results of the proposed method are exactly the same as the results obtained with the NRSE.
Bus no.   V^t TRUE (p.u.)   V̂ Estimated (p.u.)   δ^t TRUE (deg)   δ̂ Estimated (deg)   Ex |V| (%)   Ex δ (%)
1         1.06              1.06071              0                0                   0.06698      0
2         1.045             1.04551              -4.98            -4.99057            0.0488       0.21225
3         1.01              1.0105               -12.72           -12.75775           0.0495       0.29678
4         1.019             1.01249              -10.33           -10.2318            0.63886      0.95063
5         1.02              1.0165               -8.78            -8.75767            0.34314      0.25433
6         1.07              1.07042              -14.22           -14.4546            0.03925      1.64979
7         1.062             1.05997              -13.37           -13.25646           0.19115      0.84921
8         1.09              1.09076              -13.36           -13.25655           0.06972      0.77433
9         1.056             1.05337              -14.94           -14.83789           0.24905      0.68347
10        1.051             1.05239              -15.1            -15.04667           0.13225      0.35318
11        1.057             1.05759              -14.79           -14.87054           0.05582      0.54456
12        1.055             1.05389              -15.07           -15.30079           0.10521      1.53145
13        1.05              1.04744              -15.16           -15.33827           0.24381      1.17592
14        1.036             1.03124              -16.04           -16.08195           0.45946      0.26153

Table 6. State Vector for the IEEE 14-bus system.

System Bus   Jmeas.     Jest.      Rave       Rmax       Jest./Jmeas.
5            0.0115     0.0088     0.0274     0.3861     0.7638
IEEE 14      0.5713     0.2257     0.0881     1.5950     0.3952
IEEE 30      1.0942     0.6221     0.0668     2.0470     0.5685
IEEE 57      1.00E+05   4.76E+04   8.21E-01   1.65E+03   0.4748
103          1.1473     0.7844     0.0413     6.3426     0.6837
IEEE 118     1.12E+00   5.69E-01   6.14E-04   3.96E+00   0.5096
IEEE 300     1.09E+00   6.45E-01   3.09E-02   4.65E+00   0.5943

Table 7. Performance Indices.
Fig. 4. (a) & (b) The final estimate for the voltage vector of the 5-bus system ((a) |V| in p.u. and (b) voltage angle in degrees versus bus number; series: Actual, NRSE, Proposed).
Fig. 5. (a) & (b) The final estimate for the voltage vector of the IEEE 14-bus system ((a) |V| in p.u. and (b) voltage angle in degrees versus bus number; series: Actual, NRSE, Proposed).
Fig. 6. (a) & (b) The final estimate for the voltage vector of the IEEE 30-bus system ((a) |V| in p.u. and (b) voltage angle in degrees versus bus number; series: Actual, NRSE, Proposed).
Fig. 7. (a) & (b) The final estimate for the voltage vector of the IEEE 57-bus system ((a) |V| in p.u. and (b) voltage angle in degrees versus bus number; series: Actual, NRSE, Proposed).

On the other hand, the proposed method is clearly advantageous in terms of computational speed. Table 9 shows the average processor time taken for the overall process. The overall algorithm was developed using Matlab 7.0 and the programs were tested on a computer with 504 MB of RAM and a Pentium(R) 3.40 GHz CPU.
NRSE method (times in seconds):
System     H time     G time     BD time    Conv time   Total
5          3.13E-02   1.88E-01   9.38E-02   4.69E-02    3.28E-01
IEEE 14    3.13E-02   2.19E-01   9.38E-02   7.81E-02    3.75E-01
IEEE 30    3.13E-02   3.28E-01   1.09E-01   4.69E-02    5.63E-01
IEEE 57    1.09E-01   9.84E-01   1.56E-01   1.72E-01    1.38E+00
103        1.22E+00   8.16E+00   8.28E-01   1.02E+00    1.37E+01
IEEE 118   1.52E+00   1.13E+01   1.41E-01   4.69E-01    1.54E+01
IEEE 300   2.52E+01   2.28E+02   1.05E+01   3.48E+00    3.03E+02

Proposed method (times in seconds):
System     H time     G time     BD time    Conv time   Total
5          1.56E-01   5.16E-01   9.38E-02   4.69E-02    6.56E-01
IEEE 14    2.66E-01   6.56E-01   9.38E-02   3.13E-02    7.97E-01
IEEE 30    2.81E-01   7.66E-01   1.25E-01   3.13E-02    9.84E-01
IEEE 57    4.84E-01   1.33E+00   1.56E-01   1.41E-01    1.72E+00
103        5.00E-01   3.31E+00   8.44E-01   1.05E+00    7.84E+00
IEEE 118   2.97E-01   4.56E+00   4.69E-01   1.41E-01    7.44E+00
IEEE 300   4.25E+00   8.33E+01   9.75E+00   3.47E+00    1.37E+02

Note: H time is the time taken to build the Jacobian matrix; G time is the time taken to build the gain matrix; BD time is the time taken to process the bad data algorithm.

Table 9. Processor Time Comparison of NRSE vs the Proposed method.
For the sake of analysis, the tabulated data in Table 9 is plotted in Figures 8 to 10. They clearly show that the processor time increases considerably as the size of the network becomes larger; this applies to both methods. However, comparing the two, the proposed method uses significantly less processor time than the NRSE method when the size of the network becomes larger.
Fig. 8. Total processing time comparison of NRSE vs the Proposed method (seconds versus system size; small and large systems plotted separately).
Fig. 9. Formation of gain matrix time comparison of NRSE vs the Proposed method (seconds versus system size; small and large systems plotted separately).
Fig. 10. Formation of Jacobian time comparison of NRSE vs the Proposed method (seconds versus system size; small and large systems plotted separately).
8. Discussion
Figures 8 through 10 illustrate that the proposed method does not provide any advantage when applied to a small network. However, the advantages of the proposed method can be seen when it is applied to a large network: a processing time reduction is achieved when the network size is large. To understand this phenomenon, let us examine an N-bus network with the maximum number of measurements, where N is the number of buses and Nbr is the number of branches. The assumptions made for the case of full measurements are as follows:
- All active power (P) and reactive power (Q) flows are measured at the sending and receiving end of each transmission line
- Active and reactive power injections, as well as the voltage magnitude, are measured at each bus in the system
- The number of states (2N − 1) is approximately equal to 2N.
In the NRSE method, the sub-matrix H_V shown in Eq. 13 and Eq. 14 is constant, i.e. 0 or 1. Therefore it need not be computed at each iteration, resulting in a very small computation time. The sub-matrices H_pij, H_pji, H_qij and H_qji have exactly the same zero and non-zero structure as the network branches, because each line flow is incident to its two terminal buses. Thus, these sub-matrices are sparse and do not take long to compute. On the other hand, the sub-matrices H_P and H_Q involve derivatives of the real and reactive power injections at the buses. When the measurement set is full, each sub-matrix in H_P and H_Q will be of size N × N, with N diagonal terms and N² off-diagonal terms. For each diagonal term, N trigonometric functions are to be evaluated, and for each off-diagonal term, one trigonometric function is to be evaluated. Therefore, the total number of trigonometric functions to be evaluated is

8N²   (37)
In the proposed method, the sub-matrix H_V is the same as in the NRSE method. However, the rest of the sub-matrices depend entirely on the 16 partial derivatives of the line flows shown in Eq. 15 through Eq. 30. For each partial derivative of a line flow, 2 trigonometric functions are to be evaluated. Meanwhile, to compute the calculated powers, the 4 line flows shown in Eq. 6 through Eq. 9 are to be computed, and for each calculated power, 2 trigonometric functions are to be evaluated. Therefore the number of trigonometric functions to be evaluated is equal to 32 Nbr + 8 Nbr. Taking Nbr = 1.4N, the total number of trigonometric functions to be evaluated is

56N   (38)
Based on Eq. (37) and Eq. (38), we can conclude that the proposed method uses a significantly smaller number of mathematical operations, and hence takes less CPU time, than the NRSE method, particularly for larger networks.
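A small worked comparison of the two counts: the break-even point of 8N² = 56N is N = 7, so the element-wise count is smaller for any realistic network size.

#include <stdio.h>

/* Trigonometric evaluations per iteration: Eq. 37 (NRSE, 8*N*N) versus
   Eq. 38 (proposed element-wise construction, 56*N with Nbr = 1.4N). */
int main(void)
{
    int sizes[] = { 5, 14, 30, 57, 118, 300 };
    for (int k = 0; k < 6; k++) {
        long N = sizes[k];
        printf("N = %3ld: NRSE 8N^2 = %7ld, proposed 56N = %6ld\n",
               N, 8 * N * N, 56 * N);
    }
    return 0;
}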
9. Conclusion
The Newton-Raphson State Estimation (NRSE) method using the bus admittance matrix remains an efficient and most popular method to estimate the state variables. In this method,
elements of the Jacobian matrix H are computed from standard expressions which lack physical significance. The process of computing the elements of the Jacobian matrix is a significantly time-consuming step which requires the evaluation of a large number of trigonometric functions, especially in large scale power system networks. In order to reduce the computation time, a simple algorithm to construct the H matrix is presented in this chapter; it can easily be fitted into the NRSE method. It is recognized that each element of the H matrix is contributed by the partial derivatives of the power flows in the network elements. The elements of the state estimation Jacobian matrix are obtained by considering the power flow measurements in the network elements. Network elements are processed one by one and the H matrix is updated in a simple manner. The final H matrix thus constructed is exactly the same as that obtained in the available NRSE method. The systematically constructed Jacobian matrix H is then integrated with the WLS method to estimate the state variables. The final estimates and the time taken to converge are recorded and compared with results obtained from the NRSE method available in the literature. The results prove that the suggested method takes less computer time than the available NRSE method, particularly when the size of the network becomes larger. The suggested procedure has been successfully tested on IEEE standard systems and found to be efficient.
10. Acknowledgements
The authors gratefully acknowledge University Technology PETRONAS, Malaysia for providing the facilities and financial support to carry out the studies.
11. References
A. Monticelli (2002), Electric power system state estimation, Proceedings of the IEEE, Vol. 88, No. 2, pp. 262-282.
F. C. Schweppe; J. Wildes & D. B. Rom (1970), Power system static-state estimation, parts I, II and III, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-89, No. 1, pp. 120-135.
Ali Abur & Antonio Gomez Exposito (2004), Power System State Estimation: Theory and Implementation, Marcel Dekker, Inc., New York.
Holten, L.; Gjelsvik, A.; Aam, S.; Wu, F. F. & Liu, W.-H. E. (1988), Comparison of different methods for state estimation, IEEE Transactions on Power Systems, Vol. 3, No. 4, pp. 1798-1806.
A. Garcia; A. Monticelli & P. Abreu (1979), Fast decoupled state estimation and bad data processing, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-98, No. 5, pp. 1645-1651.
F. C. Schweppe & E. J. Handschin (1974), Static state estimation in electric power systems, Proceedings of the IEEE, Vol. 62, No. 7, pp. 972-983.
A. Monticelli & A. Garcia (1990), Fast decoupled state estimators, IEEE Transactions on Power Systems, Vol. 5, No. 2, pp. 556-564.
A. Simoes-Costa & V. H. Quintana (Feb. 1981), A robust numerical technique for power system state estimation, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-100, No. 2, pp. 691-698.
A. Simoes-Costa & V. H. Quintana (August 1981), An orthogonal row processing algorithm for power system sequential state estimation, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-100, No. 8, pp. 3791-3800.
Slutsker I. W. Vempatin & W. F. Tinney (1992), Orthogonal sparse vector methods, IEEE Transactions on Power Systems, Vol. 7, No. 2, pp. 926-932.
B. Stott & O. Alsac (1974), Fast decoupled load flow, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-93, No. 3, pp. 859-869.
A. Monticelli (1999), State Estimation in Electric Power Systems: A Generalized Approach, Kluwer Academic Publishers, New York.
John J. Grainger & William D. Stevenson, Jr. (1994), Power System Analysis, McGraw-Hill International Editions, New York.
Stagg, G.W. & El-Abiad, A.H. (1968), Computer Methods in Power System Analysis, McGraw-Hill Book Company, New York.
X3 Parallel direct integration variable step block method for solving large system of higher order ordinary differential equations Zanariah Abdul Majid and Mohamed Suleiman
Universiti Putra Malaysia Malaysia
1. Introduction
The performance of a two point block method designed for two processors, for directly solving non-stiff large systems of higher order ordinary differential equations (ODEs), has been investigated. The method calculates the numerical solution at two points simultaneously, producing two new equally spaced solution values within a block, and it is possible to assign the computational task at each time step to a single processor. The algorithm was developed in the C language and the parallel computation was carried out in a parallel shared memory environment. Numerical results are given to compare the efficiency of the developed method against the sequential timing. For large problems, the parallel implementation produced a 1.95 speed up and 98% efficiency on two processors.
2. Background
The ever-increasing advancement of computer technology has enabled many in the science and engineering sectors to apply numerical methods to solve mathematical models involving ODEs. The numerical solution of large ODE systems requires a large amount of computing power; users of parallel computing tend to be those with large mathematical problems to solve and the desire to obtain faster and more accurate results. In this paper, we consider directly solving the higher order IVPs for systems of ODEs of the form

y″ = f(x, y, y′),  y(a) = y₀,  y′(a) = y₀′,  x ∈ [a, b].   (1)

Equation (1) can be reduced to an equivalent first order system of twice the dimension and then solved using any numerical method. This approach is very well established, but it obviously enlarges the dimension of the system. The approach of solving systems of higher order ODEs directly has been suggested by several researchers, such as (Suleiman, 1989); (Fatunla, 1990); (Omar, 1999); (Cong et al., 1999) and (Majid & Suleiman, 2006). In the previous work of (Omar, 1999), a general r-block implicit multistep method for solving problems of the form (1) was investigated. The code used
a repetitive computation of the divided differences and integration coefficients, which can be very costly. The work in (Majid & Suleiman, 2007) presented a direct block method (2PFDIR) for solving higher order ODEs in variable step size which is faster in terms of timing and comparable or better in terms of accuracy than the existing direct non-block method in (Omar, 1999). The 2PFDIR method stores all the coefficients in the code, so there are no calculations involving the divided differences and integration coefficients. In this paper, we extend the discussion in (Majid & Suleiman, 2007) on the performance of the 2PFDIR method in a parallel environment, focusing particularly on the cost of computation time by comparing the execution times of the sequential and parallel implementations when solving large problems.
3. Formulation of the method
In Figure 1, the two values y_{n+1} and y_{n+2} are simultaneously computed in a block using the same back values. The computed block has step size h and the previous back block has step size rh. The idea of having the ratio r is for the variable step size implementation.
Fig. 1. Two Point Block Method (back block of width 2rh at x_{n−2}, x_{n−1}; current block of width 2h at x_{n+1}, x_{n+2}).

In Eqn. (1), f(x, y, y′) is replaced with the Lagrange interpolation polynomial, where the interpolation points involved are (x_{n−2}, f_{n−2}), …, (x_{n+2}, f_{n+2}). These polynomials are integrated once and twice over the intervals [x_n, x_{n+1}] and [x_n, x_{n+2}], and the following corrector formulae are obtained.

Integrate once
First point:
y′(x_{n+1}) = y′(x_n) + h/(240(r+1)(r+2)(2r+1)r²) [ −(2r+1)r²(3 + 15r + 20r²) f_{n+2} + 4r²(r+2)(18 + 75r + 80r²) f_{n+1} + (r+1)(r+2)(2r+1)(7 + 45r + 100r²) f_n − 4(2r+1)(7 + 30r) f_{n−1} + (r+2)(7 + 15r) f_{n−2} ]   (2)

Second point:

y′(x_{n+2}) = y′(x_n) + h/(15r²(2r+1)(r+2)(r+1)) [ r²(2r+1)(5r² + 15r + 9) f_{n+2} + 4r²(r+2)(10r² + 15r + 6) f_{n+1} + (r+2)(r+1)(2r+1)(5r² − 1) f_n + 4(2r+1) f_{n−1} − (r+2) f_{n−2} ]   (3)
Integrate twice
First point:

y(x_{n+1}) − y(x_n) − h y′(x_n) = h²/(240r²(r+1)(r+2)(2r+1)) [ −r²(2r+1)(1 + 6r + 10r²) f_{n+2} + 4r²(r+2)(4 + 21r + 30r²) f_{n+1} + (r+1)(r+2)(2r+1)(3 + 24r + 70r²) f_n − 4(2r+1)(3 + 16r) f_{n−1} + (r+2)(3 + 8r) f_{n−2} ]   (4)

Second point:

y(x_{n+2}) − y(x_n) − 2h y′(x_n) = h²/(15r(r+1)(r+2)(2r+1)) [ r(2 + 3r)(2r+1) f_{n+2} + 8r(r+2)(5r² + 6r + 2) f_{n+1} + (3 + 10r)(r+1)(r+2)(2r+1) f_n − 8(2r+1) f_{n−1} + (r+2) f_{n−2} ]   (5)
During the implementation of the method, the choice of the next step size is restricted to half, double or the same as the previous step size, and a successful step size remains constant for at least two blocks before being considered for doubling. This step size strategy helps to minimize the choices of the ratio r. In the code developed, when the next successful step size is doubled, the ratio r is 0.5; if the next successful step size remains constant, r is 1.0; and in case of a step size failure, r is 2.0. Substituting these ratios of r in (2) – (5) gives the corrector formulae for the two point block direct integration method; for details see (Majid & Suleiman, 2007). For example, taking r = 1 in Eqn. (2) – (5) produces the following corrector formulae:

Integrate once:

y′_{n+1} = y′_n − (h/720)(19 f_{n+2} − 346 f_{n+1} − 456 f_n + 74 f_{n−1} − 11 f_{n−2})   (6)

y′_{n+2} = y′_n + (h/90)(29 f_{n+2} + 124 f_{n+1} + 24 f_n + 4 f_{n−1} − f_{n−2})   (7)

Integrate twice:

y_{n+1} = y_n + h y′_n − (h²/1440)(17 f_{n+2} − 220 f_{n+1} − 582 f_n + 76 f_{n−1} − 11 f_{n−2})   (8)

y_{n+2} = y_n + 2h y′_n + (h²/90)(5 f_{n+2} + 104 f_{n+1} + 78 f_n − 8 f_{n−1} + f_{n−2})   (9)
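For one component of Eqn. (1), the r = 1 correctors of Eq. (6) – (9) transcribe directly into C; the sketch below uses illustrative names (fm2 … fp2 for f at x_{n−2} … x_{n+2}).

/* r = 1 correctors of Eq. 6-9 for one component of y'' = f(x, y, y').
   yn, dyn are y(x_n) and y'(x_n); outputs are the corrected values at
   x_{n+1} (y1, dy1) and x_{n+2} (y2, dy2). */
void correct_r1(double h, double yn, double dyn,
                double fm2, double fm1, double fn, double fp1, double fp2,
                double *y1, double *dy1, double *y2, double *dy2)
{
    *dy1 = dyn - h * (19.0*fp2 - 346.0*fp1 - 456.0*fn + 74.0*fm1 - 11.0*fm2) / 720.0;
    *dy2 = dyn + h * (29.0*fp2 + 124.0*fp1 + 24.0*fn + 4.0*fm1 - fm2) / 90.0;
    *y1  = yn + h*dyn
         - h*h * (17.0*fp2 - 220.0*fp1 - 582.0*fn + 76.0*fm1 - 11.0*fm2) / 1440.0;
    *y2  = yn + 2.0*h*dyn
         + h*h * (5.0*fp2 + 104.0*fp1 + 78.0*fn - 8.0*fm1 + fm2) / 90.0;
}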
This block method is applied in predictor-corrector mode. The developed method is the combination of a predictor of order 4 and a corrector of order 5. The predictor formulae were derived similarly to the corrector formulae, with the interpolation points x_{n−3}, …, x_n.
4. Parallelism Implementation
The code starts by finding the initial points in the starting block for the method. Initially we use the sequential direct Euler method to find the three starting points for the first block using a constant h. The Euler method is used only once, at the beginning of the code. Once the points for the first starting block are found, the block method can be applied until the end of the interval. Within a block of the parallel two point direct block method for two processors (P2PFDIR), it is possible to assign both the predictor and the corrector computations to a single processor and to perform the computations simultaneously in parallel. Each application of the block method generates a collection of approximations to the solution within the block. In Eqn. (2) – (5), each computed block consists of two steps, i.e. n+1 and n+2. The predictor equations depend on values taken from the previous block at the points n, n−1, n−2, n−3, and the corrector values depend on the current block at the points n+1 and n+2. The predictor and corrector equations at each point are independent of each other; thus, the equations can easily be mapped onto one processor each. On a shared memory machine this synchronisation point takes the form of a barrier. The parallel algorithm of the block method in the code is given below:
Processor 1 (P1)                      Processor 2 (P2)
Step 1: Prediction Y^p_{n+1}          Step 1: Prediction Y^p_{n+2}
Step 2: Evaluate F^p_{n+1}            Step 2: Evaluate F^p_{n+2}
              --- synchronisation point ---
Step 3: Correction Y^c_{n+1}          Step 3: Correction Y^c_{n+2}
Step 4: Evaluate F^c_{n+1}            Step 4: Evaluate F^c_{n+2}
              --- synchronisation point ---

Fig. 2. The parallel process of P2PFDIR.

In Figure 2, both processors have to exchange information after the evaluation of the terms F^p_{n+1} and F^p_{n+2} before continuing to Step 3. The same happens at Step 4 after the evaluation of the terms F^c_{n+1} and F^c_{n+2}. Parallelism is achieved when the code computes Steps 3 – 4, particularly Y^c_{n+m} and F^c_{n+m}, m = 1, 2. Steps 1 – 2 and Steps 3 – 4 can be done concurrently as they are independent of each other. Parallelization in P2PFDIR is achieved by sharing the f-evaluations. The sequential programs were executed under the DYNIX/ptx operating system. The parallel programs of the method were run on a shared memory Sequent Symmetry parallel computer at the Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. The choice of an implementation on a shared memory parallel computer is due to the fact that such a computer consists of several processors sharing a common memory with fast data access and requiring less communication time, which suits the features of the P2PFDIR method.
The algorithm for P2PFDIR is written in the C language. In order to see a possible speed up of the parallel code, the test problems in Section 5 should be expensive; therefore, the relatively small problems have been enlarged by scaling. The computational cost increases when solving large systems of higher order ODEs because the number of function evaluations continues to increase. Using two processors to do the work simultaneously can help to reduce the computation time when solving large problems.
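The original runs used the Sequent Symmetry's own shared memory primitives; as an illustration of the Fig. 2 schedule, the following sketch expresses the same two-processor pattern with POSIX threads, where pthread_barrier_wait stands in for the synchronisation points and the three step routines are empty placeholders.

#include <pthread.h>

static pthread_barrier_t bar;

static void predict_point(int m) { (void)m; /* compute Y^p at n+m (placeholder) */ }
static void evaluate_f(int m)    { (void)m; /* evaluate F at n+m (placeholder) */ }
static void correct_point(int m) { (void)m; /* compute Y^c at n+m via Eq. 2-5 (placeholder) */ }

static void *block_step(void *arg)
{
    int m = *(int *)arg;            /* m = 1 on processor 1, m = 2 on processor 2 */
    predict_point(m);               /* Step 1 */
    evaluate_f(m);                  /* Step 2 */
    pthread_barrier_wait(&bar);     /* exchange F^p(n+1) and F^p(n+2) */
    correct_point(m);               /* Step 3 */
    evaluate_f(m);                  /* Step 4 */
    pthread_barrier_wait(&bar);     /* exchange F^c(n+1) and F^c(n+2) */
    return 0;
}

int main(void)
{
    pthread_t t1, t2;
    int m1 = 1, m2 = 2;
    pthread_barrier_init(&bar, 0, 2);
    pthread_create(&t1, 0, block_step, &m1);   /* processor 1 */
    pthread_create(&t2, 0, block_step, &m2);   /* processor 2 */
    pthread_join(t1, 0);
    pthread_join(t2, 0);
    pthread_barrier_destroy(&bar);
    return 0;
}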
5. Results and Discussions
The following two problems were tested using the P2PFDIR code, comparing the sequential and parallel timings for N = 1000, 2000 and 4000 in Problem 1 and for N = 101 over the intervals [0, 20] and [0, 40] in Problem 2.

Problem 1: (Lagrange equation for the hanging spring)

y₁″ = K²(−y₁ + y₂)
y₂″ = K²(y₁ − 3y₂ + 2y₃)
y₃″ = K²(2y₂ − 5y₃ + 3y₄)
⋮
y_N″ = K²((N−1)y_{N−1} − (2N−1)y_N)

N = number of equations, 0 ≤ x ≤ b, b = end of the interval, K = 1; the initial values are y_i(0) = y_i′(0) = 0 except y_{N−2}(0) = y_{N−2}′(0) = 1.
Source: (Hairer et al., 1993).

Problem 2: (Moon – the second celestial mechanics problem)

x_i″ = γ Σ_{j=0, j≠i}^{N} m_j (x_j − x_i) / r_ij³

y_i″ = γ Σ_{j=0, j≠i}^{N} m_j (y_j − y_i) / r_ij³,   i = 0, …, N

where r_ij = ((x_i − x_j)² + (y_i − y_j)²)^{1/2},  i, j = 0, …, N,
γ = 6.672,  m₀ = 60,  m_i = 7 × 10⁻³,  i = 1, …, N.

Initial data: x₀(0) = y₀(0) = x₀′(0) = y₀′(0) = 0,
x_i(0) = 30 cos(2π/100i) + 400,  x_i′(0) = 0.8 sin(2π/100i),
y_i(0) = 30 sin(2π/100i),  y_i′(0) = −0.8 cos(2π/100i) + 1,
N = 101, 0 ≤ t ≤ b, b = end of the interval.
Source: (Cong et al., 2000).
The sequential and parallel execution times for each problem are shown in Tables 1 – 5, while Table 6 shows the speed up and efficiency for the problems. The notation is defined as follows:

TOL         Tolerance
MTD         Method employed
TS          Total number of steps
FS          Failure steps
FCN         Total function calls
MAXE        Magnitude of the global error, max |y_n − y(x_n)|
TIME(min)   Execution time in minutes
TIME(sec)   Execution time in seconds
S2PFDIR     Sequential implementation of the two point implicit block method
P2PFDIR     Parallel implementation of the two point implicit block method
In the code, we iterate the corrector to convergence. The convergence test employed was

|y^{(s+1)}_{n+2} − y^{(s)}_{n+2}| < 0.1 × TOL,  s = 0, 1, 2, …   (10)
where s is the number of iterations. After the successful convergence test of (10), local error estimation at the point x_{n+2} is performed to control the error for the block. The error control is at the second point of the block because, in general, this gave better results. The local error estimate is obtained by comparing the absolute difference between the corrector formula of order k and a similar corrector formula of order k−1. For these problems we recall that speed up is a measure of the relative benefit of parallelising a given application over a sequential implementation. The speed up ratio on two processors is defined as
S_p = T_s / T_p   (11)
where T_s is the time of the fastest serial algorithm for a given problem and T_p is the execution time of the parallel program on multiprocessors; in this research we used p = 2. The maximum speed up possible is usually p with p processors, normally referred to as linear speed up. The efficiency of the parallel algorithm, E_p, is defined as

E_p = (S_p / p) × 100   (12)
which is the ratio of the speed up to the number of processors used. In an ideal parallel system, the speed up is equal to the number of processors p being used and the efficiency is equal to 100%. In practice, the speed up is less than p and the efficiency is between 0% and 100%,
depending on the degree of effectiveness with which the processors are utilised. The speed up shows the speed gain of the parallel computation and describes the increase of performance in the parallel system. The two problems above were run without an exact reference solution in closed form, so we used the reference solution obtained by the same program using a tolerance two orders lower than the current tolerance. The tested problems were run without calculating the maximum error during the timed sequential and parallel executions; the values of the maximum errors were computed in a separate program. Tables 1 – 5 show the numerical results for the tested problems. For the sequential S2PFDIR only one processor was used, and two processors were employed for the parallel algorithm P2PFDIR. The numerical results show that the parallel execution time is faster than the sequential execution time for large ODE systems. In Tables 1 – 3, without loss of generality, we only compute the MAXE at TOL = 10⁻², since the execution time grossly increases with a finer tolerance.

N = 1000, [0, 5]
TOL      MTD       TS     FS   FCN     TIME (min)
10⁻²     S2PFDIR   239    0    1762    0.195781
         P2PFDIR                883    0.179703
         MAXE = 1.15083(-2)
10⁻⁴     S2PFDIR   570    1    3384    0.381467
         P2PFDIR                1698   0.354987
10⁻⁶     S2PFDIR   714    0    5584    0.609685
         P2PFDIR                2797   0.601066
10⁻⁸     S2PFDIR   1743   0    10402   1.167327
         P2PFDIR                5207   1.087472
10⁻¹⁰    S2PFDIR   4298   0    25722   2.882636
         P2PFDIR                12867  2.682821

Table 1. Numerical results of the 2PFDIR method for solving Problem 1 when N = 1000.
N=2000, interval [0, 5]
TOL      MTD       TS     FS   FCN     TIME(min)
10^-2    S2PFDIR   329    0    2300    0.649994
         P2PFDIR          0    1150    0.436222
         MAXE = 2.36506(-2)
10^-4    S2PFDIR   796    0    4734    1.360806
         P2PFDIR          0    2373    0.936770
10^-6    S2PFDIR   997    0    7642    2.128587
         P2PFDIR          0    3801    1.449660
10^-8    S2PFDIR   2451   0    14648   4.082667
         P2PFDIR          0    7330    2.874696
10^-10   S2PFDIR   6066   0    36330   10.672470
         P2PFDIR          0    18171   7.236281

Table 2. Numerical results of 2PFDIR Method for Solving Problem 1 When N=2000.
N=4000, interval [0, 5]
TOL      MTD       TS     FS   FCN     TIME(min)
10^-2    S2PFDIR   457    0    3070    1.278904
         P2PFDIR              1530    0.683906
         MAXE = 4.23114(-2)
10^-4    S2PFDIR   1116   1    6654    2.812752
         P2PFDIR              3323    1.488229
10^-6    S2PFDIR   1397   0    10032   4.127480
         P2PFDIR              5012    2.195468
10^-8    S2PFDIR   3459   0    20644   8.778264
         P2PFDIR              10318   4.524878
10^-10   S2PFDIR   8566   0    51328   21.953364
         P2PFDIR              25658   11.258135

Table 3. Numerical results of 2PFDIR Method for Solving Problem 1 When N=4000.
N=101, interval [0, 20]
TOL      MTD       TS    FS   FCN    TIME(sec)
10^-2    S2PFDIR   29    0    128    2.079400
         P2PFDIR             70     1.329228
         MAXE = 3.78992(-4)
10^-4    S2PFDIR   36    0    158    2.542150
         P2PFDIR             85     1.543892
         MAXE = 2.40610(-6)
10^-6    S2PFDIR   44    0    198    3.175919
         P2PFDIR             105    1.902361
         MAXE = 8.76138(-7)
10^-8    S2PFDIR   52    0    238    3.808737
         P2PFDIR             125    2.260861
         MAXE = 4.36732(-9)
10^-10   S2PFDIR   70    0    336    5.360910
         P2PFDIR             174    3.137757
         MAXE = 6.78177(-11)

Table 4. Numerical results of 2PFDIR Method for Solving Problem 2 When N=101, interval [0, 20].
N=101, interval [0, 40]
TOL      MTD       TS    FS   FCN    TIME(sec)
10^-2    S2PFDIR   31    0    136    2.201868
         P2PFDIR             74     1.395546
         MAXE = 1.77673(-4)
10^-4    S2PFDIR   39    0    176    2.835758
         P2PFDIR             94     1.704369
         MAXE = 1.53896(-6)
10^-6    S2PFDIR   48    0    224    3.674788
         P2PFDIR             118    2.133841
         MAXE = 1.39440(-7)
10^-8    S2PFDIR   62    0    296    4.720133
         P2PFDIR             154    2.779889
         MAXE = 1.29065(-8)
10^-10   S2PFDIR   94    0    478    7.599380
         P2PFDIR             245    4.409488
         MAXE = 1.54553(-10)

Table 5. Numerical results of 2PFDIR Method for Solving Problem 2 When N=101, interval [0, 40].
PROB  N     Interval  TOL: 10^-2  10^-4      10^-6      10^-8      10^-10
1     1000  [0, 5]    1.09 [55]   1.07 [54]  1.01 [51]  1.07 [54]  1.07 [54]
1     2000  [0, 5]    1.49 [75]   1.45 [73]  1.47 [74]  1.42 [71]  1.52 [76]
1     4000  [0, 5]    1.87 [94]   1.89 [95]  1.88 [94]  1.94 [97]  1.95 [98]
2     101   [0, 20]   1.56 [78]   1.65 [83]  1.67 [84]  1.68 [84]  1.71 [86]
2     101   [0, 40]   1.58 [79]   1.66 [83]  1.72 [86]  1.70 [85]  1.72 [86]

Table 6. The Speed Up and Efficiency of the 2PFDIR Method for Solving Problem 1 and 2. Note. For each tolerance the values in the square brackets give the efficiency in percentage.
In Table 6, the speed-up ranges between 1.87 and 1.95 for solving Problem 1 when N = 4000, and the efficiency is between 94% and 98%. Better speed-up and efficiency can be achieved by increasing the dimension of the ODE system. In Problem 2, the speed-up ranges between 1.58 and 1.72 as the interval increases for the same number of equations. The number of function evaluations in the parallel mode is almost half that of the sequential mode. In terms of accuracy, the numerical results are within the given tolerances. The performance of a parallel implementation of an integration method depends heavily on the machine, the size of the problem, and the cost of the function evaluations.
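As a worked example, the speed-up (11) and efficiency (12) entries of Table 6 can be recomputed from the timing columns of Tables 1–5. A minimal sketch, using the N=4000, TOL=10^-2 timings from Table 3:

def speedup(ts, tp):
    # Equation (11): Sp = Ts / Tp.
    return ts / tp

def efficiency(sp, p=2):
    # Equation (12): Ep = (Sp / p) * 100, in percent.
    return sp / p * 100.0

ts, tp = 1.278904, 0.683906   # TIME(min) for S2PFDIR and P2PFDIR, Table 3
sp = speedup(ts, tp)
print(f"{sp:.2f} {efficiency(sp):.1f}")   # about 1.87 and 93.5 (cf. Table 6: 1.87, [94])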
6. Conclusion
In this paper, we have considered the performance of the parallel direct block code (P2PFDIR) based on the two-point implicit block method of Adams-Moulton type. By using large problems and by implementing the code on a shared-memory computer, we have shown the superiority of the parallel code over the sequential code. P2PFDIR achieved better speed-up and efficiency as the dimension of the ODE system increased, and hence the parallel code developed is suitable for solving large systems of ODEs.
7. Acknowledgements
This research was supported by the Institute of Mathematical Research, Universiti Putra Malaysia, under RU Grant 05-01-07-0232RU.
8. Future Work
Instead of implementing the block method with variable step size only, it is possible to vary both the step size and the order of the method when solving higher order ODEs. The approximations for the tested problems would then be better in terms of accuracy and would require fewer steps. It would also be interesting to extend the parallel computation to new three- and four-point direct block methods.
9. References
Cong, N.H.; Strehmel, K.; Weiner, R. & Podhaisky, H. (1999). Runge-Kutta-Nystrom-type parallel block predictor-corrector methods, Advances in Computational Mathematics, Vol. 10, No. 2, (February 1999), pp. 115-133, ISSN 1019-7168.
Cong, N.H.; Podhaisky, H. & Weiner, R. (2000). Performance of explicit pseudo two-step RKN methods on a shared memory computer, Reports on Numerical Mathematics, No. 00-21, Dept. of Mathematics & Computer Science, Martin Luther University Halle-Wittenberg. (http://www.mathematik.uni-halle.de/reports/rep-num.html)
Fatunla, S.O. (1990). Block Methods for Second Order ODEs, Intern. J. Computer Math., Vol. 40, pp. 55-63, ISSN 0020-7160.
Hairer, E.; Norsett, S.P. & Wanner, G. (1993). Solving Ordinary Differential Equations I: Nonstiff Problems, Springer-Verlag, ISBN 3-540-17145-2, Berlin Heidelberg New York.
Majid, Z.A. & Suleiman, M. (2006). Direct Integration Implicit Variable Steps Method for Solving Higher Order Systems of Ordinary Differential Equations Directly, Jurnal Sains Malaysiana, Vol. 35, No. 2, (December 2006), pp. 63-68, ISSN 0126-6039.
Majid, Z.A. & Suleiman, M. (2007). Two point block direct integration implicit variable steps method for solving higher order systems of ordinary differential equations, Proceedings of the World Congress on Engineering 2007, pp. 812-815, ISBN 978-988-98671-2-6, London, July 2007, Newswood Limited, Hong Kong.
Omar, Z. (1999). Developing Parallel Block Methods For Solving Higher Order ODEs Directly, Ph.D. Thesis, Universiti Putra Malaysia, Malaysia.
Suleiman, M.B. (1989). Solving Higher Order ODEs Directly by the Direct Integration Method, Applied Mathematics and Computation, Vol. 33, No. 3, (October 1989), pp. 197-219, ISSN 0096-3003.
X4
Toward Optimal Query Execution in Data Grids

Reza Ghaemi 1, Amin Milani Fard 2, Md. Nasir Bin Sulaiman 3 and Hamid Tabatabaee 4

1 Academic Member, Department of Computer Engineering, Islamic Azad University, Quchan branch, Iran, and PhD Student, Faculty of Computer Science and Information Technology, Putra University of Malaysia. E-mail: [email protected]
2 School of Computing Sciences, Simon Fraser University, BC, Canada. E-mail: [email protected]
3 Associate Professor and Academic Member, Department of Computer Science, Faculty of Computer Science and Information Technology, Putra University of Malaysia. E-mail: [email protected]
4 Academic Member, Department of Computer Engineering, Islamic Azad University, Quchan branch, Iran, and PhD Student, Ferdowsi University of Iran. E-mail: [email protected]

1. Introduction
Nowadays Grid technology [Foster & Kesselman, 1999] provides us with simultaneous and effective use of distributed computational and informational resources. Three main types of this technological phenomenon are known as resource discovery grids, computational grids, and data grids. In data grids, distributed heterogeneous database systems play an important role in giving users easy access to information without knowing the resource position. Because of this heterogeneity, communication among subsystems has to be handled carefully, considering the different network structures, operating systems, and DBMSs. In such systems we are also interested in efficient search and retrieval mechanisms to speed up traditional relational database queries. Distributed systems can be thought of as a partnership among independent cooperating centralized systems. Building on this idea, a number of large-scale applications have been investigated during the past decades, among which distributed information retrieval (DIR) systems gained particular popularity due to high demand. The goal of DIR is to provide a single search interface that gives access to the available databases, involving building resource descriptions for each database, choosing which databases to search for particular information, and merging retrieved results into a single result list [Si & Callan, 2003]. A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. This resource distribution improves performance, reliability, availability and modularity, which are inherent in distributed systems. As with traditional centralized databases, distributed database systems (DDBS) must provide an
efficient user interface that hides all of the underlying data distribution details of the DDB from the users. The use of a relational query allows the user to specify a description of the data that is required without having to know where the data is physically located [Li & Victor, 1981]. Data retrieval from different sites in a DDB is known as distributed query processing (DQP). For example, the following query accesses data from the local database as well as the remote sales database; the first table (EMP) is found at site 1 and the second table (DEPT) at site 2:

SELECT ename, dname
FROM   company.emp e, [email protected] d
WHERE  e.deptno = d.deptno
So a distributed query is one that selects data from databases located at multiple sites in a network, while distributed processing performs computations on multiple CPUs to achieve a single result. Query processing is much more difficult in a distributed environment than in a centralized one because a large number of parameters affect the performance of distributed queries, relations may be fragmented and/or replicated, and, with many sites to access, query response time may become very high [Li & Victor, 1981]. It is quite evident that the performance of a DDBS is critically dependent upon the ability of the query optimization algorithm to derive efficient query processing strategies. DDBMS query optimization algorithms attempt to reduce the quantity of data transferred; minimizing the quantity of data transferred is a desirable optimization criterion. Distributed query optimization poses several problems related to the cost model, the larger set of queries, the optimization cost, and the optimization interval. The goal of DQP is to execute such queries as efficiently as possible in order to minimize the response time that users must wait for answers or the time application programs are delayed, and to minimize the total communication costs associated with a query, improve throughput via parallel processing, share data and equipment, and allow modular expansion of data management capacity. In addition, when redundant data is maintained, one also achieves increased data reliability and improved response time. In this work we propose a multi-agent architecture for distributed query processing. The structure of the paper is as follows: Section II describes an overview of the query optimization process; related works are reviewed in Section III; our proposed approach is presented in Section IV and simulation results in Section V; we conclude the work in Section VI.
2. Query Optimization Process
In a relational database all information can be found in a series of tables, so a query consists of operations on tables. The most common queries are Select-Project-Join queries. In this paper, we will focus on the join-ordering problem, since permutations of the join order have the most important effect on the performance of relational queries [Özsu & Valduriez, 1999]. The query optimization process, shown in Figure 1, consists of taking a query on n relations and generating the best Query Execution Plan (QEP).
[Figure 1: the input query passes through search space generation (guided by transformation rules) to produce equivalent QEPs; a search strategy, guided by a cost model, selects the best QEP.]
Fig. 1. Query optimization process

For a given query, the search space can be defined as the set of equivalent operator trees that can be produced using transformation rules. The example below illustrates three equivalent join trees, which are obtained by exploiting the associative property of binary operators. Join tree (c), which starts with a Cartesian product, may have a much higher cost than the other join trees.

SELECT ENAME, RESP
FROM   EMP, ASG, PROJ
WHERE  EMP.ENO = ASG.ENO
AND    ASG.PNO = PROJ.PNO
[Figure 2: three equivalent join trees for this query — (a) (EMP ⋈ENO ASG) ⋈PNO PROJ; (b) (PROJ ⋈PNO ASG) ⋈ENO EMP; (c) (PROJ × EMP) ⋈ENO,PNO ASG, which starts with a Cartesian product.]
Fig. 2. Query equivalent trees

Depending on the search space considered, join trees can take different shapes. In a linear tree, at least one operand of each operator node is a base relation. A bushy tree, however, might have operators both of whose operands are intermediate results. In a distributed environment, bushy trees are useful for exhibiting parallelism [Özsu & Valduriez, 1999].
[Figure 3: (a) a linear join tree over R1–R4, in which every join has a base relation as one operand; (b) a bushy join tree over R1–R4, in which both operands of the top join are intermediate results.]
Fig. 3. Linear vs. bushy join tree
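To make the two shapes concrete, the sketch below (illustrative only, not from the chapter) models a processing tree as nested pairs, where a string is a base relation, and tests whether a tree is linear:

# A join tree as nested tuples: a string is a base relation,
# a pair (left, right) is a join node.
linear_tree = ((("R1", "R2"), "R3"), "R4")   # left-deep
bushy_tree = (("R1", "R2"), ("R3", "R4"))    # both operands are intermediate results

def is_linear(tree):
    # Linear: every join node has at least one base relation as an operand.
    if isinstance(tree, str):
        return True
    left, right = tree
    return (isinstance(left, str) or isinstance(right, str)) \
        and is_linear(left) and is_linear(right)

print(is_linear(linear_tree), is_linear(bushy_tree))   # True False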
Considering new large-scale database applications such as deductive database systems and bioinformatics, it is necessary to be able to deal with larger queries. The search complexity constantly increases, making higher demands for better algorithms than traditional relational database queries require.

3. Related Works
The three most common types of algorithms for join-ordering optimization are deterministic, genetic and randomized algorithms [Zelenay, 2005]. The deterministic algorithm, also known as the exhaustive search dynamic programming algorithm, produces optimal left-deep processing trees with the big disadvantage of an exponential running time. This means that for queries with more than 10-15 joins, the running time and space complexity explode [Zelenay, 2005]. Because of the very large time and space complexity of this algorithm for plan enumeration, the iterative dynamic programming approach was proposed, which produces reasonable plans with reasonable running time for most network topologies; however, its complexity is still close to that of the classical DP algorithm. Genetic and randomized algorithms [Ioannidis & Kang, 2003; Steinbrunn et al., 1997], on the other hand, do not generally produce an optimal access plan, but in exchange they are superior to dynamic programming in terms of running time. Experiments have shown that it is possible to reach very similar results with both genetic and randomized algorithms depending on the chosen parameters; still, the genetic algorithm has in some cases proved to be slightly superior to randomized algorithms. The layers of distributed query optimization are depicted in Figure 4. There are a number of QEP techniques for DDBs, such as row blocking, multi-cast optimization, multi-threaded execution, joins with horizontal partitioning, semi-joins, and top-n queries [Kossman, 2000; Selinger et al., 1998; Stocker et al., 2001]. In this paper we propose a novel agent-based QEP generator for heterogeneous distributed database systems.
[Figure 4: layers of distributed query optimization — at the control site, a calculus query on distributed relations is decomposed (using the global schema) into an algebraic query, localized (using the fragment schema) into a fragment query, and globally optimized (using statistics on fragments) into an optimized fragment query with communication operations; local optimization (using the local schema) then takes place at the local sites.]
Fig. 4. Distributed query optimization

3.1 Distributed Cost Model
An optimizer cost model includes cost functions to predict the cost of operators, and formulas to evaluate the sizes of results. Cost functions can be expressed with respect to either the total time or the response time [Ceri & Pelagatti, 1984; Özsu & Valduriez, 1999]. The total time is the sum of all time components, and the response time is the elapsed time from the initiation to the completion of the query. The total time (TT) is computed as below, where TCPU is the time of a CPU instruction, TI/O the time of a disk I/O, TMSG the fixed time of
initiating and receiving a message, and TTR the time it takes to transmit a data unit from one site to another:

TT = TCPU * #insts + TI/O * #I/Os + TMSG * #msgs + TTR * #bytes

When the response time of the query is the objective function of the optimizer, parallel local processing and parallel communications must also be considered. The response time (RT) is calculated as below:

RT = TCPU * seq_#insts + TI/O * seq_#I/Os + TMSG * seq_#msgs + TTR * seq_#bytes

Most early distributed DBMSs designed for wide area networks ignored the local processing cost and concentrated on minimizing the communication cost. Consider the following example, in which Site 1 transfers x units of data to Site 3 and Site 2 transfers y units of data to Site 3:

TT = 2 * TMSG + TTR * (x + y)
RT = max {TMSG + TTR * x, TMSG + TTR * y}

In parallel transferring, response time is minimized by increasing the degree of parallel execution. This does not imply that the total time is also minimized; on the contrary, it can increase the total time, for example through more parallel local processing (which often includes synchronization overhead) and transmissions. Minimizing the total time implies that the utilization of the resources improves, thus increasing the system throughput. In practice, a compromise between the total and response times is desired.
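For illustration, the two cost functions and the example above can be written directly in code; the unit costs and data sizes below are hypothetical placeholders, not values from the chapter.

def total_time(insts, ios, msgs, nbytes, t_cpu, t_io, t_msg, t_tr):
    # TT = TCPU * #insts + TI/O * #I/Os + TMSG * #msgs + TTR * #bytes
    return t_cpu * insts + t_io * ios + t_msg * msgs + t_tr * nbytes

# Example: sites 1 and 2 send x and y data units to site 3 in parallel.
t_msg, t_tr, x, y = 1.0, 0.1, 500, 300
tt = 2 * t_msg + t_tr * (x + y)                  # both transfers count in total time
rt = max(t_msg + t_tr * x, t_msg + t_tr * y)     # parallel transfers overlap in response time
print(tt, rt)                                    # 82.0 51.0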
3.2 Database Statistics
The main factor affecting performance is the size of the intermediate relations produced during execution. When a subsequent operation is located at a different site, the intermediate relation must be transmitted over the network, so it is of prime interest to estimate the size of intermediate results in order to minimize the size of data transfers. The estimation is based on statistical information about the base relations and formulas to predict the cardinalities of the results of the relational operations.

Cartesian product: the cardinality of the Cartesian product of R and S is given by (1):

card(R × S) = card(R) × card(S)        (1)
Join: There is no general way to estimate the cardinality of a join without additional information. The upper bound of the join cardinality is the cardinality of the Cartesian
product. Some systems, such as Distributed INGRES [Stonebraker, 1986], use this upper bound, which is quite pessimistic. R* [Selinger & Adiba, 1980] uses this upper bound divided by a constant to reflect the fact that the join result is smaller than the Cartesian product. However, there is a frequently occurring case where the estimation is simple: if relation R is equi-joined with S over attribute A from R and B from S, where A is a key of relation R and B is a foreign key of relation S, the cardinality of the result can be approximated as (2):

card(R ⋈A=B S) = card(S)        (2)
In other words, the Cartesian product R × S contains nr × ns tuples, and each tuple occupies sr + ss bytes. If R ∩ S = ∅, then R ⋈ S is the same as R × S. If R ∩ S is a key for R, then a tuple of S will join with at most one tuple of R; therefore, the number of tuples in R ⋈ S is no greater than the number of tuples in S. If R ∩ S is a foreign key in S referencing R, then the number of tuples in R ⋈ S is exactly the number of tuples in S. The case of R ∩ S being a foreign key referencing S is symmetric. As discussed earlier, ordering joins is an important aspect of centralized query optimization; it is even more important in a distributed context, since joins between fragments may increase the communication time. Two main approaches exist to order joins in fragment queries: 1) direct optimization of the ordering of joins (e.g. in the Distributed INGRES algorithm), and 2) replacement of joins by combinations of semi-joins in order to minimize communication costs. Let R and S be relations stored at different sites, and let R ⋈ S be the join to compute. The obvious choice is to send the smaller relation to the site of the larger one:

R → site of S   if size(R) < size(S)
S → site of R   if size(R) > size(S)
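These rules translate into a small estimator. The following sketch is an illustration only, under the stated assumptions (cardinalities are known, and we know whether the join is a key/foreign-key equi-join):

def join_cardinality(card_r, card_s, key_foreign_key_join):
    # Equation (2): a key/foreign-key equi-join yields card(S) tuples;
    # otherwise fall back on the Cartesian upper bound (1).
    return card_s if key_foreign_key_join else card_r * card_s

def ship_smaller(size_r, size_s):
    # Send the smaller relation to the site of the larger one.
    return "R -> site of S" if size_r < size_s else "S -> site of R"

print(join_cardinality(10000, 50000, True))  # 50000
print(ship_smaller(120, 900))                # R -> site of S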
More interesting is the case where there are more than two relations to join. The objective of the join-ordering algorithm is still to transmit smaller operands; since join operations may reduce or increase the size of intermediate results, estimating the size of join results is mandatory, but difficult. Consider the following query expressed in relational algebra: PROJ ⋈PNO EMP ⋈ENO ASG, whose join graph is as follows: EMP at Site 1 joins ASG at Site 2 over ENO, and ASG joins PROJ at Site 3 over PNO.
This query can be executed in at least five different ways (where → denotes shipping a relation to the site of another):
1. EMP → ASG; EMP' = EMP ⋈ ASG; EMP' → PROJ; EMP' ⋈ PROJ
2. ASG → EMP; EMP' = EMP ⋈ ASG; EMP' → PROJ; EMP' ⋈ PROJ
3. ASG → PROJ; ASG' = ASG ⋈ PROJ; ASG' → EMP; ASG' ⋈ EMP
4. PROJ → ASG; PROJ' = PROJ ⋈ ASG; PROJ' → EMP; PROJ' ⋈ EMP
5. EMP → ASG; PROJ → ASG; EMP ⋈ PROJ ⋈ ASG
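A simple way to choose among these strategies, anticipating the size heuristic described just below, is to order the operand relations by increasing estimated size; the sizes here are hypothetical:

# Hypothetical relation sizes (e.g., in tuples) at their home sites.
sizes = {"EMP": 400, "ASG": 1000, "PROJ": 150}

# Order relations by increasing size; execution then follows this
# ordering along the join graph EMP - ASG - PROJ.
order = sorted(sizes, key=sizes.get)
print(order)   # ['PROJ', 'EMP', 'ASG'] -> ship the smallest operand first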
To select one of these strategies, the following sizes must be known or predicted: size(EMP), size(ASG), size(PROJ), size(EMP ⋈ ASG), and size(ASG ⋈ PROJ). Furthermore, if it is the response time that is being considered, the optimization must take into account the fact that transfers can be done in parallel with strategy 5. An alternative to enumerating all the solutions is to use heuristics that consider only the sizes of the operand relations, assuming, for example, that the cardinality of the resulting join is the product of the operand cardinalities. In this case, relations are ordered by increasing size, and the order of execution is given by this ordering and the join graph. For instance, the order (EMP, ASG, PROJ) could use strategy 1, while the order (PROJ, ASG, EMP) could use strategy 4.

3.3 Multi Agent Systems
Multi-agent systems (MASs), an emerging sub-field of artificial intelligence, concern the interaction of agents to solve a common problem [Wooldridge, 2002]. This paradigm has become more and more important in many aspects of computer science by introducing the issues of distributed intelligence and interaction, and it represents a new way of analyzing, designing, and implementing complex software systems. In multi-agent systems, communication is the basis for the interactions and social organizations which enable the agents to cooperate and coordinate their actions. A number of communication languages have been developed for inter-agent communication, of which the most widely used are KIF (Knowledge Interchange Format) [Genesereth & Fikes, 1992], KQML (Knowledge Query and Manipulation Language) [Finin et al., 1994], and ACL (Agent Communication Language) [Labrou et al., 1999]. KQML uses KIF to express the content of a message based on first-order logic; KIF is a language intended primarily to express the content part of KQML messages. ACL is another communication standard, emerging in competition with KQML since 1995. Nowadays, XML (Extensible Markup Language) has started to show its performance as a language for encoding the messages exchanged between agents, in particular in agent-based e-commerce, to support the next generation of Internet commerce [Korzyk, 2000].
4. Proposed System Architecture
Although the problem of distributed query processing in heterogeneous systems has been investigated before [Huang et al., 1981], a good solution for practical query optimization has not been studied well. To this end, we propose a new multi-agent system architecture based on the Java Agent DEvelopment (JADE) framework [Bellifemine et al., 2006]. JADE is a software development framework aimed at developing multi-agent systems and applications in
which agents communicate using FIPA [1] Agent Communication Language (ACL) messages and live in containers which may be distributed over several different machines. The Agent Management System (AMS) is the agent that exerts supervisory control over access to and use of the Agent Platform; only one AMS exists in a single platform, and each agent must register with an AMS in order to get a valid AID. The Directory Facilitator (DF) is the agent that provides the default yellow-page service in the platform. The Message Transport System, also called the Agent Communication Channel (ACC), is the software component controlling all the exchange of messages within the platform, including messages to/from remote platforms. JADE is capable of linking Web services and agents together to enable semantic web applications. A Web service can be published as a JADE agent service, and an agent service can symmetrically be published as a Web service endpoint. Invoking a Web service is just like invoking a normal agent service, and Web service clients can also search for and invoke agent services hosted within JADE containers. The Web Services Integration Gateway (WSIG) [JADE Board, 2005] uses a Gateway agent to control the gateway from within a JADE container. Interaction among agents on different platforms is achieved through the Agent Communication Channel. Whenever a JADE agent sends a message and the receiver lives on a different agent platform, a Message Transport Protocol (MTP) is used to implement lower-level message delivery procedures [Cortese et al., 2002]. Currently there are two main MTPs to support this inter-platform agent communication: CORBA IIOP-based and HTTP-based MTPs. Considering large-scale applications over separated networks, agent communications have to be handled behind firewalls and Network Address Translators (NATs); however, the current JADE MTPs do not allow agent communication through firewalls and NATs. Fortunately, the firewall/NAT issue can be solved by using the current JXTA implementation for agent communication [Liu et al., 2006]. JXTA is a set of open protocols for P2P networking that enable developers to build and deploy P2P applications through a unified medium [Gradecki, 2002]. JXTA is thus a suitable architecture for implementing MTPs for JADE, and consequently JADE agent communication across different networks can be facilitated by incorporating JXTA technology into JADE [Liu et al., 2006]. In this work, a query is submitted by the user to the Query Distributor Agent and is then distributed using the submitted terms in the available ontology. Our proposed architecture uses different types of agents, each having its own characteristics. The proposed system architecture is shown in Figure 5.
1 Foundation for Intelligent Physical Agents (http://www.fipa.org)
[Figure 5: the user submits a request through a search page to the QDA; the QDA and GOA coordinate local optimizer agents LOA #1 … LOA #n, each attached to its own database (Database #1 … Database #n).]
Fig. 5. Proposed system architecture

Query Distributor Agent (QDA): after receiving the user query, the QDA sends sub-queries to the responsible local optimizer agents. The QDA can also create search agents if needed.
Local Optimizer Agents (LOAs): these agents apply a genetic algorithm based sub-query optimization, which will be discussed later, and return a result table size to the Global Optimizer Agent.
Global Optimizer Agent (GOA): this agent is responsible for finding the best join order across the network. To do so, the GOA receives result table size information from the LOAs and, again using an evolutionary method, finds a semi-optimal join order; this time, however, the genetic algorithm fitness function is based on minimizing the communication rate among the different sites. The final result is then sent to the LOAs, which perform the operations and deliver the results to the GOA to be shown on the screen.

4.1 Genetic Algorithm based Optimization
The basic concepts of genetic algorithms were originally developed by Holland [Holland, 1975] and later revised by Goldberg [Goldberg, 1989]. Goldberg showed that genetic algorithms are independent of any assumption about the search space and are based on the mechanism of natural genetics. The first step in modelling this problem as a genetic algorithm problem is determining the chromosome, the genetic operators, and the fitness function.
Chromosome Design
In order to encode different access plans, we considered an ordered list scheme to represent each processing tree as an ordered list. For instance, ((((R1 ⋈ R5) ⋈ R3) ⋈ R2) ⋈ R4) is encoded as “15324”. This converts the join-order problem to the well-known traveling salesman problem (TSP). Possible query plans are encoded as integer strings, where each string represents the join order from one relation of the query to the next. This mechanism is also used within the PostgreSQL optimizer [PostgreSQL]. There may be other ways to encode processing trees, especially bushy trees; however, we have used the ordered list encoding explained above for implementation simplicity and faster runs.

Genetic Algorithm operations
For the crossover, one point in the selected chromosome is chosen along with a corresponding point in another chromosome, and then the tails are exchanged. Mutation causes some bits to invert and so produces new information; its only problem is that it may corrupt useful information. Therefore we used elitism, which means the best individual goes forward to the next generation without undergoing any change, to preserve the best information.

Fitness function
Defining the fitness function is one of the most important steps in designing a genetic algorithm based method, since it guides the search toward the best solution. In our simulation we used a simple random cost estimator as defined below in (3), where random(x) returns an integer between 0 and x.
fitness = random(R × S) + (R + S)   if (R + S) > (R × S)
fitness = random(R + S) + (R × S)   otherwise        (3)
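A minimal sketch of the encoding and of the estimator (3), taking random(x) to return an integer between 0 and x as stated in the text; this is an illustration, not the authors' implementation:

import random

def rand_int(x):
    # random(x): an integer between 0 and x.
    return random.randint(0, int(x))

def fitness(card_r, card_s):
    # Equation (3): randomised cost estimate for joining R and S.
    prod, tot = card_r * card_s, card_r + card_s
    if tot > prod:
        return rand_int(prod) + tot
    return rand_int(tot) + prod

chromosome = "15324"   # encodes ((((R1 join R5) join R3) join R2) join R4)
print(fitness(100, 200))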
Algorithm design
After calculating the fitness function value for each parent chromosome, the algorithm generates N children. The lower a parent chromosome's fitness value is, the higher the probability that it contributes one or more offspring to the next generation. After performing the operations, some chromosomes might not satisfy the fitness requirement; the algorithm discards these and keeps M (M ≤ N) children chromosomes. The algorithm then selects the N chromosomes with the lowest fitness values from the M + N chromosomes (M children and N parents) to be the parents of the next generation. This process repeats until a certain number of generations has been processed, after which the best chromosome is chosen. Figure 6 shows our genetic algorithm based approach.
input: Relation sizes on different sites
output: Semi-optimal join order
initialize gene with a random join order
produce N initial parent chromosomes
mincost = d[gene.answer[0]] + d[gene.answer[1]];
maxcost = d[gene.answer[0]] * d[gene.answer[1]];
if (mincost > maxcost)
    gene.value = random(mincost - maxcost) + maxcost;
else
    gene.value = random(maxcost - mincost) + mincost;
while done ≠ yes do
    produce N random children chromosomes
    pass the best individual to the next generation
    randomly mate and exchange parts of chromosomes
    mutate with rate = 2/100
    for (i = 1; i < N; i++) {
        // re-evaluate chromosome i
        if (mincost > maxcost)
            gene.value = random(mincost - maxcost) + maxcost;
        else
            gene.value = random(maxcost - mincost) + mincost;
    }
    if gene.value satisfied
        done = yes
    else
        produce next generation chromosomes
end while
Fig. 6. Our Genetic Algorithm based query optimization algorithm
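The figure can be read as the conventional GA loop below. This sketch is one interpretation of Fig. 6, not the authors' code: it uses the randomised pairwise cost of (3), one-point crossover and swap mutation on the permutation encoding, elitism, and a crude approximation of intermediate result sizes.

import random

def ga_join_order(relations, sizes, pop_size=100, generations=100,
                  crossover_p=0.7, mutation_p=0.02):
    def cost(order):
        # Accumulate the randomised pairwise cost (3) along the join order.
        total, left = 0, sizes[order[0]]
        for name in order[1:]:
            prod, tot = left * sizes[name], left + sizes[name]
            total += random.randint(0, min(prod, tot)) + max(prod, tot)
            left = tot  # crude stand-in for the intermediate result size
        return total

    pop = [random.sample(relations, len(relations)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        children = [pop[0]]                          # elitism: keep the best
        while len(children) < pop_size:
            a, b = random.sample(pop[:pop_size // 2], 2)
            if random.random() < crossover_p:        # one-point crossover
                cut = random.randrange(1, len(a))
                child = a[:cut] + [r for r in b if r not in a[:cut]]
            else:
                child = a[:]
            if random.random() < mutation_p:         # swap mutation
                i, j = random.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = children
    return min(pop, key=cost)

sizes = {"R1": 100, "R2": 400, "R3": 50, "R4": 800, "R5": 20}
print(ga_join_order(list(sizes), sizes))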
5. Results
Experiments were done on a PC with a Pentium 4 CPU at 2 GHz and 1 GB of RAM. The evolutionary process completed in less than a second, and as seen in Figure 7 a sample query with 20 joins converges to a near-optimal fitness after about 100 generations. Table 1 shows the parameter values we set for our implementation.

Parameter               Value
Population size         100
Mutation probability    0.02
Crossover probability   0.7
Elitism probability     0.5
Number of generations   100

Table 1. Parameter settings for the Genetic Algorithm based approach
[Figure 7: fitness (y-axis) versus generation (x-axis) for the 20-join query; the fitness decreases steadily over the 100 generations.]
Fig. 7. Optimization of a 20-joins query in 100 generations

There is no doubt that the dynamic programming method always gives us the optimal solution; however, since its time and space complexity is far higher than that of the genetic algorithm based optimization, it is not a practical approach for a large number of nested joins. Figure 8 shows TGA / TDP, the ratio of average run times for 10 queries, where TGA is the genetic algorithm based optimization average run time and TDP is the dynamic programming based optimization average run time for the same query. As expected, the results show that this ratio is always less than 1, which means the genetic algorithm approach has less overhead than the DP method.
[Figure 8: bar chart of the run-time ratio TGA / TDP for queries 1–10; all ratios lie below 1.]

Fig. 8. Rate of optimization average run time for 10 queries
A computational complexity analysis in Table 2 shows the superiority of the evolutionary method in comparison with the dynamic programming approach. The time and space complexity of the evolutionary method is linear with respect to G, the number of generations, and N, the number of chromosomes in the population, while it is exponential for the DP method, which is not efficient in handling more than 10 joins.

Method                 Time Complexity   Space Complexity
Evolutionary Method    O(G·N)            O(N)
Dynamic Programming    O(2^n)            O(2^n)

Table 2. Computational Complexity Comparison
6. Conclusion
An evolutionary query optimization mechanism for distributed heterogeneous systems has been proposed, using a multi-agent architecture and a genetic algorithm approach. Although the deterministic dynamic programming algorithm produces optimal left-deep processing trees, it has the big disadvantage of an exponential running time. Genetic and randomized algorithms, on the other hand, do not generally produce an optimal access plan, but in exchange they are superior to dynamic programming in terms of running time. Our practical framework uses a hybrid JADE-JXTA framework which allows agent communication through firewalls and NATs in heterogeneous networks.
7. References
Bellifemine, F.; Caire, G.; Trucco, T. & Rimassa, G. (2006). JADE Programmer's Guide, August 2006, published by the Free Software Foundation, USA.
Ceri, S. & Pelagatti, G. (1984). Distributed Databases: Principles and Systems, McGraw-Hill, ISBN-10: 0070108293, ISBN-13: 978-0070108295, New York, NY, USA.
Cortese, E.; Quarta, F.; Vitaglione, G. & Vrba, P. (2002). Scalability and Performance of the JADE Message Transport System, Analysis of Suitability for Holonic Manufacturing Systems, exp, 2002.
Finin, T.; Fritzson, R.; McKay, D. & McEntire, R. (1994). KQML as an Agent Communication Language, Proceedings of the 3rd International Conference on Information and Knowledge Management (CIKM'94), N. Adam, B. Bhargava & Y. Yesha (Eds.), pp. 456-463, ACM Press, Gaithersburg, MD, USA.
Foster, I. & Kesselman, C. (1999). The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, pp. 105-129.
Genesereth, M. R. & Fikes, R. E. (1992). Knowledge Interchange Format, Version 3.0, Technical Report 92-1, Stanford University, Computer Science Department, San Francisco, USA.
Goldberg, D. E. (1989). The Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, New York, USA.
Gradecki, J. D. (2002). Mastering JXTA: Building Java Peer-to-Peer Applications, John Wiley & Sons, ISBN 0471250848, 9780471250845, Canada.
Holland, J. H. (1975). Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, MI, USA.
Huang, Kuan-Tsae & Davenport, Wilbur B. (1981). Query Processing in Distributed Heterogeneous Databases, MIT Laboratory for Information and Decision Systems, Report LIDS-P-1212.
Ioannidis, Y. E. & Kang, Y. C. (2003). Randomized Algorithms for Optimizing Large Join Queries, Conference on Very Large Data Bases, pp. 488-499, September 2003, Berlin, Germany, ACM, New York, NY, USA.
JADE Board (2005). JADE WSIG Add-On Guide: JADE Web Services Integration Gateway (WSIG) Guide, published by the Free Software Foundation, March 2005, Boston, MA, USA.
Korzyk, A. (2000). Towards XML As A Secure Intelligent Agent Communication Language, The 23rd National Information Systems Security Conference, Baltimore Convention Center, Baltimore, October 2000, Maryland, USA.
Kossman, D. (2000). The State of the Art in Distributed Query Processing, ACM Computing Surveys, Vol. 32, Issue 4, December 2000, pp. 422-469, ISSN 0360-0300.
Labrou, Y.; Finin, T. & Peng, Y. (1999). The Current Landscape of Agent Communication Languages, IEEE Intelligent Systems, Vol. 14, No. 2, March/April 1999, IEEE Computer Society.
Li, Victor O. K. (1981). Query Processing in Distributed Data Bases, MIT Laboratory for Information and Decision Systems, Report LIDS-P-1107.
Liu, S.; Küngas, P. & Matskin, M. (2006). Agent-Based Web Service Composition with JADE and JXTA, Proceedings of the International Conference on Semantic Web and Web Services (SWWS 2006), June 2006, Las Vegas, Nevada, USA.
Özsu, T. M. & Valduriez, P. (1999). Principles of Distributed Database Systems, Second Edition, Prentice Hall, ISBN 0-13-659707-6.
PostgreSQL 8.3.0 Documentation, Chapter 49: Genetic Query Optimizer, http://www.postgresql.org/docs/8.3/interactive/geqo.html
Selinger, P. G. & Adiba, M. E. (1980). Access Path Selection in Distributed Database Management Systems, ICOD, pp. 204-215.
Selinger, P. G.; Astrahan, M. M.; Chamberlin, D. D.; Lorie, R. A. & Price, T. G. (1998). Access Path Selection in a Relational Database Management System, Morgan Kaufmann Series in Data Management Systems: Readings in Database Systems (3rd ed.), pp. 141-152, ISBN 1-55860-523-1.
Si, L. & Callan, J. (2003). A Semisupervised Learning Method to Merge Search Engine Results, ACM Transactions on Information Systems, Vol. 21, No. 4, October 2003, pp. 457-491.
Steinbrunn, M.; Moerkotte, G. & Kemper, A. (1997). Heuristic and Randomized Optimization for the Join Ordering Problem, The VLDB Journal, Vol. 6, Issue 3, August 1997, pp. 191-208, ISSN 1066-8888.
Stocker; Kossman; Braumandl & Kemper (2001). Integrating Semi-Join Reducers into State of the Art Query Processors, ICDE, ISSN 1063-6382, IEEE Computer Society, Washington, DC, USA.
Stonebraker (1986). The Design and Implementation of Distributed INGRES, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, ISBN 0-201-07185-1.
Wooldridge, M. (2002). An Introduction to MultiAgent Systems, John Wiley & Sons, February 2002, ISBN 0-471-49691-X.
Zelenay, K. (2005). Query Optimization, Seminar Algorithmen für Datenbanksysteme, June 2005, ETH Zürich.
X5
Economic Analysis on Information Security Incidents and the Countermeasures: The Case of Japanese Internet Service Providers

Toshihiko Takemura 1, Makoto Osajima 2 and Masatoshi Kawano 3

1 RISS, Kansai University
2 GITS, Waseda University
3 The Ministry of Internal Affairs and Communications
Japan
1. Introduction
Information and Communication Technology (Abbreviation, ICT), including the Internet, improves the productivity of firms and creates new businesses. Concretely, ICT has been found to have a positive impact on society and the economy (Brynjolfsson et al., 2002, and Takemura, 2008). On the other hand, according to the Information Security White Paper 2008 in Japan (Information-technology Promotion Agency, 2008), serious problems have occurred at the same time. These problems are caused by Internet threats such as illegal access, malware, Spam mails, and system troubles, and many accidents caused by these incidents are reported all over the world. These threats evolve minute by minute every day, and their number increases rapidly. Moreover, the number of vulnerabilities (chiefly in Web applications) has increased every year. Therefore, firms and individuals are exposed to these threats. Although secured networks and NGNs (Next Generation Networks or New Generation Networks) have appeared recently, information security countermeasures against these threats must be executed because the Internet has no security of its own. There is much academic research on information security technology in the natural sciences, such as cryptographic technology and secured networking [1]. This accumulated research has achieved constant results. However, Internet threats evolve minute by minute every day, and information security management tries to support workers against these threats. Unfortunately, research in the social sciences about how management can avoid these threats is still limited and exploratory. The majority of studies are theoretical ones using the framework of game theory, or studies on management systems. On the other hand, empirical research on information security is limited. We believe that many
scholars have not been interested in empirical research until now because of the scant data on information security countermeasures and/or investment [2]. Recently, the number of empirical studies has tended to increase as data are accumulated. In this chapter, we focus on firms in the telecommunication infrastructure, especially Japanese Internet Service Providers (Abbreviation, ISPs), which provide the Internet environment for users in Japan [3]. Because of the ISPs' influence on society and the economy, they must continue to maintain a high level of information security countermeasures. In other words, the level of information security countermeasures in critical infrastructures should be set higher than in general firms (Ebara et al., 2006, and Takemura, 2007a) [4]. In this paper, we investigate the relation among information security incidents, vulnerability, and the information security countermeasures of ISPs, and we show which countermeasures are effective. To achieve this purpose, we use the micro data of a questionnaire survey we conducted in 2007, and logistic regression analysis as the statistical method. Recently many countries, including Japan, have begun to gather and analyze data on information security incidents because they recognize that this kind of research is important. In Japan, the Information Security Measures Promotion Office was established in the IT strategy headquarters of the Cabinet Secretariat in 2001, and the office has worked on security policy. This policy was promoted further by the reestablishment of the NISC (National Information Security Center) in 2005. Every year, the NISC implements a new national information security policy package called “Secure Japan”. In National Information Security Center (2008), “accident assumption society” is one of the keywords, standing for accidents that happen in the Internet society, and what information security policy the government should implement is discussed. In addition, since 2007 there has been a project of the Cyber Clean Center (Abbreviation, CCC), which MIC (the Ministry of Internal Affairs and Communications) and METI (the Ministry of Economy, Trade and Industry) in Japan set up to coordinate countermeasures against bot threats [5]. The project gathers information on bots and suggests countermeasures. Information on Spam mails is gathered by the Japan Data Communications Association (Abbreviation, JADAC), and Spam mail countermeasures are discussed there.

1 For example, refer to Cook and Keromytis (2006) for details of recent research on cryptographic technology.
2 It seems that the majority of firms might not disclose their information security countermeasures and/or investment even if the data exist.
3 Refer to Ebara et al. (2006) and Yokomi et al. (2004) for investigation and research on Japanese ISPs.
4 Some studies exist on the layers of infrastructure and their interdependent relationships (Information-technology Promotion Agency, 2000, and Yamaguchi, 2007). In particular, the telecommunication infrastructure is the second most important critical infrastructure among all infrastructures. Critical infrastructure in Japan includes the following fields: telecommunications, finance, airlines, railways, electric power, gas, government and administrative service (including local public entities), medical treatment, water service, and distribution.
5 A bot (virus) is a malicious program whose purpose is the fraudulent use of a computer. Once a computer is infected with a bot, a malicious attacker remotely controls the computer from outside. This attack causes serious harm, such as sending numerous mails and attacking particular websites (public nuisances), as well as stealing information stored in the computer, i.e., spying activities.
The Information-technology Promotion Agency, Japan (Abbreviation, IPA) also provides various information analyses and enlightening activities on security, as do the Japan Vulnerability Notes (Abbreviation, JVN), the Japan Network Security Association (Abbreviation, JNSA), and the Japan Computer Emergency Response Team Coordination Center (Abbreviation, JPCERT/CC). In addition, the National Police Agency (Abbreviation, NPA) in Japan controls cybercrime. These organizations have achieved constant effects. Thus, in Japan, academic investigation is still rare although the accumulation of surveillance study data has advanced. Table 1 shows the Japanese organizations concerned with information security countermeasures and policies.

Organization  URL                        Remark
NISC          http://www.nisc.go.jp/     Security policy
MIC           http://www.soumu.go.jp/    Security policy
METI          http://www.meti.go.jp/     Security policy
CCC           https://www.ccc.go.jp/     Bot
JADAC         http://www.dekyo.or.jp/    Spam mail
IPA           http://www.ipa.go.jp/      Virus and worm
JVN           http://jvn.jp/             Vulnerability
JNSA          http://www.jnsa.org/       Network security
JPCERT/CC     http://www.jpcert.or.jp/   Security incident
NPA           http://www.npa.go.jp/      Cyber crime

Table 1. Japanese Organizations Concerned with Information Security Countermeasures and Policies

This chapter consists of the following sections. Section 2 introduces some related literature and topics on the economics of information security. In Section 3, we summarize ISPs in Japan. Section 4 explains the logistic regression analysis and the data used in this chapter. In Section 5, we show the estimation results. Finally, we present a summary and future work in Section 6.
2. Related Literatures
In this section, we briefly introduce related works on information security in the fields of social science. There has been much qualitative research on information security in the field of management science; concretely, on various management systems for information security such as ISMS (information security management system), ISO27000, and BCMS (business continuity management system). However, it seems that this research is insufficient to give individuals and/or firms the incentive to execute information security countermeasures. On the other hand, in the field of economics, the pioneering and representative studies on information security are Gordon and Loeb (2002) and Varian (2002). These include theoretical models of information security countermeasures and investment from the viewpoint of economics and management science, and they discuss the incentive to execute information security countermeasures. The former analyzes the economic effect of information security investment, and the latter discusses the free rider problem
by analyzing the information security system as a public good [6]. Under these frameworks, empirical analyses have accumulated since around 2000. We classify the related works into three main types.
1. The first type models the relations among information security incidents and information security countermeasures and/or investment. For example, there are a few empirical studies such as Tanaka et al. (2005), Liu et al. (2007), Takemura (2007b) and Takemura et al. (2008) in Japan. Our research in this chapter is of this type.
2. The second type models the relations between the value of the firm (market value) and information security countermeasures and/or investment. The representative model is the market value model of Brynjolfsson et al. (2002) applied to information security investment instead of ICT investment. For example, in Japan, Tanaka (2005), Takemura and Minetaki (2009a, 2009b) and Minetaki et al. (2009) carry out empirical analyses using this framework. In addition, Nagaoka and Takemura (2007) discuss a new type of model from the viewpoint of BCP (business continuity plan) [7]. Moreover, in recent years many firms have paid attention to information security reports based on this model.
3. The third type calculates the amount of damage and the influence on the economy and society. For instance, the JNSA calculates the amount of compensation when information leakage is caused (Japan Network Security Association, 2008). Ukai and Takemura (2007), Takemura and Ebara (2008a, 2008b, 2008c), and Japan Data Communications Association (2008) calculate the amount of GDP loss caused by Spam mails in Japan.
3. Summary on Japanese ISPs
In this section, we present a short summary of Japanese ISPs, referring to Ebara et al. (2006) and Yokomi et al. (2004) for investigation and research on Japanese ISPs. The commercial use of the Internet in Japan started around 1992, and then Japanese ISPs were born. ISPs are key movers that provide the Internet environment to individuals and firms (users), and they have helped Japan achieve the status of an advanced ICT society. In other words, ISPs have played an important role in developing the Internet into a business platform with a social infrastructure (critical infrastructure) in only about two decades. Originally, in Japan, ISPs were defined as businesses (companies) providing only a connectivity service to users. With time, the definition has evolved to businesses providing various services such as applications, operations, and support services. The characteristics of Japanese ISPs differ from those of overseas ISPs. Ebara et al. (2006) divide Japanese ISPs into local ISPs and nationwide ISPs [8]. The former are businesses that provide service for users in a limited area or local community, and the latter provide service for users in multiple areas or nationwide.
Economic Analysis on Information Security Incidents and the Countermeasures: The Case of Japanese Internet Service Providers
77
According to Yokomi et al. (2004), a polarization in management efficiency exists between the two classifications. We show that there are differences of financial health in the level of efficiency scores between them. They point out that the reasons for this difference may be found in the differences in the number of potential Internet users they can cover in a service area, and the scale of capital. The Internet Provider Association illustrates how there are only a few nationwide ISPs with 1% or more share of the market (Internet Provider Association, 2003). In addition to these providers, through a low-priced broadband strategy of Yahoo! BB in Japan, which diversified in 2000, Internet users in Japan have come to expect high quality service and low price. On the other hand, from the viewpoint of ISPs, this fact implies that the providers are taken off the market unless they provide high quality service and low price. Therefore, the financial health of many local ISPs has been deteriorating. Under such situations, Internet threats such as Spam mails, illegal access and malware has become more serious as social and economical problems. In addition, against the background, individuals, firms and governments demand various countermeasures and compliances to ISPs 9. Under such a situation in Japan, Takemura conducted a questionnaire survey on the situation of information security countermeasures against Japanese ISPs in February 2007 (Takemura, 2007a). He point out that in recent years ISPs have improved their attitudes to information security within an improvement of social consciousness. Although many ISPs still have a problem of capital and a human resource shortage, they recognize social corporate responsibility (Abbreviation, CSR) and act under this recognition 10. Nevertheless, some problems such as users' convenience and legal interpretations for role of ISPs in society still exist. For example, some ISPs are confused with guidelines concerned with information security, and these ISPs cannot execute information security countermeasures for the users' convenience. Though many ISPs execute various information security countermeasures, information security accidents still occur (some ISPs encounter information security incidents). We discuss the reasons for some of these incidents in the following sections. In the next section, based on the micro data of the questionnaire survey in Takemura (2007a), we statistically analyze the relationships in risk factors that ISPs encounter with information security incidents.
4. Framework 4.1 Model The purpose of this chapter is to investigate the relationships among information security incidents, information security countermeasures, and vulnerability by using a statistical method. Through the result, we can discuss which factors reduce the risk that ISPs
9 In Takemura et al. (2009a), the ratio of individuals who consider that only ISPs should take countermeasures is about 8.9% in Japan. They conducted a Web-based survey on the Internet.
10 CSR is a form of corporate (firm) self-regulation integrated into a business model. Ideally, CSR policy in each firm would function as a built-in, self-regulating mechanism whereby the business monitors and ensures its adherence to law, ethical standards, and international norms.
encounter information security incidents. In other words, we provide suggestions about which countermeasures are efficient. We adopt logistic regression analysis as the statistical method. For a long time, logistic regression (also called multiple logistic regression) has been widely used as a method to grasp the relationships between explanatory variables and an explained variable in various fields such as psychology, sociology, economics, and business administration. Generally, in logistic regression, the explained variable is the probability p that a certain event happens, and the explanatory variables are covariates that influence p. Note that p enters through the logit transformation, logit(p) = log(p/(1−p)). We build a model showing which factors, such as vulnerabilities and information security countermeasures, influence the risk that ISPs encounter information security incidents. The relationship is described by equation (1):
(1)
where pj represents the probability that ISPs encounter information security incident j, and XV, XC and ZC represent vulnerability, information security countermeasure, and characteristics of ISPs, respectively. The explained variable on the left side of equation (1) represents a logarithm of odds ratio. This can be interpreted as one of the risk indices 11. Also, the coefficient parameter of each explanatory variable on the right side represents a logarithm odds ratio when the explanatory variable increases one unit. For example, this increase implies that the risk that ISPs encounter information security incident j becomes Exp [bV] times when XV increases one unit. By using this model in (1), above mentioned, we can discuss which countermeasures ISPs can use to reduce the risks. At the same time, we can evaluate the risk that each ISP faces. Combining vulnerability with various threats creates the risk that users may encounter as an information security incident. That is, vulnerability is one of the factors raising the probability of risk that they encounter. This implies that the coefficient parameter of XV in equation (1) is positive; bV>0. Generally, information security countermeasures are executed to reduce the risk that users encounter an information security incident. Therefore, the coefficient parameter of XC in equation (1) is negative; bC<0. In this chapter, we roughly divide the information security countermeasures into two kinds of countermeasures; technical information security countermeasures and non-technical information security countermeasures. The former introduces and operates various information security systems and technologies, and the latter manages countermeasures such as information security education and reminder of information security incident to users. We are particularly interested in countermeasures concerned with management. We use the service area as an attribute of ISPs. The reason is that there are the differences in the financial health between local ISPs and nationwide ISPs, as discussed in Section 3. We set up a hypothesis that the possibility of differences causes the difference of the risks
11 An odds ratio is a statistical measurement indicating the odds of an event occurring in one group relative to the odds of it occurring in another group.
of encountering information security incidents 12.
Finally, we show the processes used to estimate the coefficient parameters in equation (1) by logistic regression and to evaluate the fitness of our model. To estimate each coefficient parameter, we use the general maximum likelihood estimation method based on a binomial distribution. Because the estimation is computationally complex, we use SPSS as the statistical software in this chapter 13. For variable selection, SPSS offers a) forced entry of the explanatory variables, b) a forward (backward) stepwise method based on the likelihood ratio, c) a forward (backward) stepwise method based on the Wald statistic, and d) a conditional forward (backward) stepwise method. From these, we apply the stepwise method based on the likelihood ratio, which is widely regarded as one of the most reliable selection criteria. Next, we run the Hosmer-Lemeshow test to evaluate the fitness of our model; note that the null hypothesis H0 of this test is that the model fits the data well 14. In addition, we evaluate the validity of the model using the positive distinction rate, the proportion of observations that the model classifies correctly 15.
4.2 Dataset
We conducted a mail questionnaire survey of ISPs in Japan from February to March 2007 16. We received answers from 63 ISPs (a response rate of about 10.3%). The purpose of this questionnaire was to investigate the current situation regarding information security countermeasures of Japanese ISPs. Overall, the contents covered the general business conditions of ISPs, the situation of information security countermeasures, the situation of information security incidents, and opinions toward the government. We use a part of the results (micro data) and analyze them below 17.
(a) Information Security Incidents
As information security incidents, we used the following: illegal access, computer viruses and worms, and system trouble. Although we set four outcomes on information security incidents in the questionnaire survey, we replaced them with binary outcomes (whether or not ISPs encounter information security incidents) as follows 18:
12 Takemura (2007b) points out that ISPs might not execute enough information security countermeasures because the financial health of local ISPs has deteriorated.
13 We use SPSS version 17.0J for Windows, SPSS, Inc.
14 Refer to Hosmer and Lemeshow (2000) for details of this test.
15 The higher the positive distinction rate, the more correctly the model classifies the observations, and thus the more preferable the model.
16 Strictly speaking, we conducted this questionnaire survey of ISPs in the Internet Service Provider Association, Japan. The questionnaire is available at http://www.kansai-u.ac.jp/riss/shareduse/data/07.pdf.
17 This questionnaire survey contains many kinds of information on ISPs. Refer to Takemura (2007a) for the full results.
18 The four outcomes are the following: 1) nothing, 2) there is a network fault, but servers are not down, 3) some servers are down, and 4) the entire system crashes.
For j = IA, CV, and ST,

pj = 1 if the ISP encounters information security incident j, 0 otherwise        (2)

where IA, CV, and ST denote illegal access, computer viruses and worms, and system trouble, respectively, and pj here is an indicator variable. Figure 1 shows the conditions on each information security incident.
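The chapter's estimation is carried out in SPSS, but the same pipeline, maximum likelihood estimation of equation (1) on a binary outcome such as (2) followed by a Hosmer-Lemeshow test, can be sketched in Python. This is a sketch only; the column names (x_v, x_c, z_c, incident) are hypothetical placeholders, not the survey's actual variable names:

    # Sketch: logistic regression of equation (1) and a Hosmer-Lemeshow test.
    # Column names are hypothetical placeholders.
    import pandas as pd
    import statsmodels.api as sm
    from scipy.stats import chi2

    def fit_incident_model(data: pd.DataFrame):
        """Fit log(p/(1-p)) = a + bV*XV + bC*XC + c*ZC by maximum likelihood."""
        X = sm.add_constant(data[["x_v", "x_c", "z_c"]])
        return sm.Logit(data["incident"], X).fit(disp=0)

    def hosmer_lemeshow(y, p_hat, groups=10):
        """H0: the model fits well; groups are deciles of predicted risk."""
        g = pd.DataFrame({"y": y, "p": p_hat})
        g["decile"] = pd.qcut(g["p"], groups, duplicates="drop")
        obs = g.groupby("decile")["y"].sum()    # observed events per group
        n = g.groupby("decile")["y"].count()    # group sizes
        exp = g.groupby("decile")["p"].sum()    # expected events per group
        hl = (((obs - exp) ** 2) / (exp * (1 - exp / n))).sum()
        return hl, chi2.sf(hl, len(obs) - 2)    # statistic and p-value

The exp[B] values reported later in Tables 6-10 would then correspond to numpy.exp(model.params).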
Fig. 1. Conditions on Information Security Incidents

From Figure 1, we see that the rate of ISPs that encounter system trouble is about 59%; that is, at least one system trouble occurred in more than half of the ISPs. According to Takemura (2007a), the rate of entire-system crashes among all ISPs was about 6%. For reference, we calculate the odds ratio between the risks (probabilities) that ISPs encounter each pair of information security incidents. The results are shown in Table 2. We see that the risks are not mutually independent. From these results, it is clear that we need to discuss efficient countermeasures against information security incidents.

[Table 2 reports the pairwise odds ratios among illegal access, computer viruses and worms, and system trouble; its layout was garbled in extraction, and only the values 9 and 3 (and possibly 19) survive, so the individual cells cannot be reconstructed.]
Table 2. Odds Ratio between Information Security Incidents
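As an illustration of how the odds ratios of Table 2 could be computed from the binary incident indicators of equation (2), here is a minimal sketch; the five-ISP indicators below are invented, not the survey's:

    # Sketch: odds ratio between two binary incident indicators (cf. Table 2).
    import numpy as np

    def odds_ratio(a, b):
        """(n11*n00)/(n10*n01) from the 2x2 contingency table of a and b."""
        a, b = np.asarray(a), np.asarray(b)
        n11 = np.sum((a == 1) & (b == 1))
        n00 = np.sum((a == 0) & (b == 0))
        n10 = np.sum((a == 1) & (b == 0))
        n01 = np.sum((a == 0) & (b == 1))
        return (n11 * n00) / (n10 * n01)  # assumes no empty cell

    # Hypothetical indicators: illegal access and system trouble for five ISPs.
    print(odds_ratio([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))  # -> 2.0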
(b) Vulnerability
In this chapter, we use the following two vulnerability indices: the number of users, as a proxy variable for the vulnerability caused by users, and the number of servers, as a proxy variable for the vulnerability caused by vulnerabilities of Web applications and/or computer software and programs 19.
19 We have data on the numbers of individual and firm users separately. We first tried to estimate the coefficient parameters in equation (1) with them, but could not find a significant result; therefore, in this chapter we use the total number of individual and firm users. The number of users can be considered a proxy variable not only for vulnerability but also for the scale of the ISP.
The numbers of users and servers vary greatly in scale among ISPs; that is, their distributions are highly dispersed. Therefore, we control for scale by taking their logarithms 20. Vulnerability XV,m is thus defined as follows: for m = U and S,

XV,m = log(m)        (3)
where U and S represent the number of users and the number of servers, respectively. Table 3 shows descriptive statistics on these log-scale indices.

         Mean    Standard Deviation   Skewness   Kurtosis
XV,U     8.121   1.917                -1.019     2.134
XV,S     2.814   1.314                0.056      -0.435
Table 3. Descriptive Statistics on the Number of Users and Servers
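The log transformation of equation (3) and statistics like those in Table 3 can be reproduced with a few lines of pandas; the counts below are invented for illustration:

    # Sketch: log-scale vulnerability indices and their descriptive statistics
    # (cf. equation (3) and Table 3). The counts are hypothetical.
    import numpy as np
    import pandas as pd

    d = pd.DataFrame({"users": [12000, 350, 90000, 4100, 800],
                      "servers": [25, 4, 120, 16, 7]})
    d["x_v_u"] = np.log(d["users"])    # XV,U = log(U)
    d["x_v_s"] = np.log(d["servers"])  # XV,S = log(S)

    # mean, standard deviation, skewness and (excess) kurtosis, as in Table 3
    print(d[["x_v_u", "x_v_s"]].agg(["mean", "std", "skew", "kurt"]).T)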
(c) Information Security Countermeasures
We roughly divide the information security countermeasures into two kinds: technical information security countermeasures and non-technical information security countermeasures. As the information security technology index, we use the number of introduced information security systems. We consider six kinds of systems: 1) a firewall (FW), 2) an Intrusion Detection System (IDS), 3) an Intrusion Prevention System (IPS), 4) a virus check on the Web, 5) setting a DMZ segment, and 6) others. Therefore, XC,NS ranges from a minimum of 0 to a maximum of 6. Table 4 shows descriptive statistics on the number of introduced security systems.

         Mean    Standard Deviation   Skewness   Kurtosis
XC,NS    2.480   1.424                -0.072     -0.811
Table 4. Descriptive Statistics on the Number of Introduced Security Systems
On the other hand, we use the following four indices as non-technical (management) information security countermeasure indices: 1) information security education, 2) reminder of information security incidents to users, 3) a penetration test, and 4) a system audit. The information security management indices are given by binary choices (whether or not the ISP executes the countermeasure) as follows: For k = EDU, RU, PT, and SA,

XC,k = 1 if the ISP executes countermeasure k, 0 otherwise        (4)
where EDU, RU, PT, and SA represent information security education, reminder of information security incidents to users, the penetration test, and the system audit, respectively.
20 Liu et al. (2007) use mail accounts as a proxy variable for vulnerability.
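Constructing the binary variables of equations (2) and (4) from raw survey answers is mechanical; here is a sketch with hypothetical column names and codings:

    # Sketch: binary coding of incidents (eq. (2)) and countermeasures (eq. (4)).
    import pandas as pd

    raw = pd.DataFrame({
        # 1 = nothing, 2 = network fault, 3 = servers down, 4 = entire crash
        "system_trouble_level": [1, 3, 2, 4],
        "executes_education":   ["yes", "no", "yes", "yes"],
    })
    raw["p_st"] = (raw["system_trouble_level"] > 1).astype(int)        # eq. (2)
    raw["x_c_edu"] = (raw["executes_education"] == "yes").astype(int)  # eq. (4)
    print(raw)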
Figure 2 shows the conditions on each information security countermeasure. From Figure 2, it is found that many ISPs (over 70%) execute management countermeasures on information security.
Fig. 2. Conditions on Information Security Countermeasures

(d) Attributes of ISPs
We use the service area as an attribute of ISPs; in other words, this index shows whether the ISP is local or nationwide. Concretely,

ZC = 1 if the ISP is nationwide, 0 if the ISP is local        (5)

where ZC is an indicator function. Figure 3 shows the ratios of local ISPs and nationwide ISPs.
Fig. 3. Local ISPs and Nationwide ISPs
5. Estimation Results
Before running the logistic regression, we examine the rank correlations among the explanatory variables: if the explanatory variables are strongly correlated with one another, the logistic regression becomes unreliable. Table 5
shows the rank correlation coefficients of the explanatory variables; rank correlations are used because many of the explanatory variables are discrete. Each rank correlation coefficient is far below 1, so we can use these variables in our analysis.

         XV,U     XV,S     XC,NS    XC,EDU   XC,RU    XC,SA    XC,PT    ZC
XV,U     -        0.216    -0.076   0.211    0.177    -0.035   0.168    -0.092
XV,S     0.216    -        0.223    0.123    0.268    0.189    0.173    0.281
XC,NS    -0.076   0.223    -        0.313    0.333    0.323    0.139    -0.042
XC,EDU   0.211    0.123    0.313    -        0.338    0.093    0.209    0.080
XC,RU    0.177    0.268    0.333    0.338    -        0.196    0.090    0.090
XC,SA    -0.035   0.189    0.323    0.093    0.196    -        0.365    0.093
XC,PT    0.168    0.173    0.139    0.209    0.090    0.365    -        0.145
ZC       -0.092   0.281    -0.042   0.080    0.090    0.093    0.145    -
Table 5. Correlation Coefficients of Explanatory Variables
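A rank correlation matrix like Table 5 can be obtained with Spearman's rho, which is appropriate for discrete variables; the data here are hypothetical:

    # Sketch: Spearman rank correlations of explanatory variables (cf. Table 5).
    import pandas as pd

    d = pd.DataFrame({"x_v_u":   [8.1, 7.4, 9.0, 6.9, 8.6],
                      "x_c_ns":  [2, 4, 1, 3, 2],
                      "x_c_edu": [1, 1, 0, 1, 0]})
    print(d.corr(method="spearman").round(3))  # off-diagonals far below 1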
Next, we estimate the coefficient parameters in equation (1) for several cases; the results are shown sequentially below. Unfortunately, we could not attain significant results when using the risk that ISPs encounter virus and worm incidents as the explained variable, so we omit those cases in this chapter.
5.1 Illegal Access
Tables 6-8 show the estimation results using the risk that ISPs encounter illegal access incidents as the explained variable. In Table 6, both the (log) numbers of users and servers are used as the vulnerability index; in Table 7 we use only the logarithm of the number of users, and in Table 8 only the logarithm of the number of servers. The chi-square in each table is used to run the Hosmer-Lemeshow test.
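The "steps" reported beneath each table come from the stepwise selection by likelihood ratio. A minimal sketch of forward selection by likelihood-ratio test, a simplification of the SPSS procedure with hypothetical inputs y (a 0/1 Series) and X (a DataFrame of candidate variables), might look like this:

    # Sketch: forward stepwise variable selection by likelihood-ratio test.
    import pandas as pd
    import statsmodels.api as sm
    from scipy.stats import chi2

    def forward_select_lr(y, X, alpha=0.05):
        """Greedily add the variable with the most significant LR improvement."""
        selected, remaining = [], list(X.columns)
        const = pd.Series(1.0, index=X.index, name="const")
        ll_base = sm.Logit(y, const.to_frame()).fit(disp=0).llf  # null model
        while remaining:
            best = None
            for var in remaining:
                ll = sm.Logit(y, sm.add_constant(X[selected + [var]])).fit(disp=0).llf
                p = chi2.sf(2 * (ll - ll_base), 1)  # LR test, 1 d.o.f.
                if p < alpha and (best is None or p < best[1]):
                    best = (var, p, ll)
            if best is None:
                break  # no remaining candidate improves the fit significantly
            selected.append(best[0])
            remaining.remove(best[0])
            ll_base = best[2]
        return selected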
                 Estimated Coefficient   Standard Error   exp[B]
bC,NS            1.755                   0.789            5.686
bC,EDU           -4.515                  1.966            0.011
Constant term    -2.108                  1.436            0.121
Chi-square(5)=2.556 [0.768], 7 steps; positive distinction rate: 80.6%
Table 6. Estimation Result I

                 Estimated Coefficient   Standard Error   exp[B]
bC,NS            1.373                   0.533            3.947
bC,EDU           -3.659                  1.555            0.026
Constant term    -1.971                  1.148            0.139
Chi-square(5)=2.059 [0.841], 6 steps; positive distinction rate: 76.3%
Table 7. Estimation Result II
                 Estimated Coefficient   Standard Error   exp[B]
bV,S             0.732                   0.423            2.079
bC,NS            0.893                   0.459            2.443
bC,EDU           -2.833                  1.254            0.059
Constant term    -3.428                  1.609            0.032
Chi-square(8)=2.990 [0.935], 5 steps; positive distinction rate: 82.9%
Table 8. Estimation Result III
In Tables 6-8, it is common to all the results that the estimated coefficient of the number of countermeasures, bC,NS, is positive and the estimated coefficient of information security education, bC,EDU, is negative; both parameters are statistically significant. Conversely, variables such as the logarithm of the number of users, the reminder of information security incidents to users, the system audit, the penetration test, and the service area were excluded during the stepwise variable selection of the logistic regression. In addition, the coefficient of the logarithm of the number of servers, bV,S, in Table 8 is positive. From the results of the Hosmer-Lemeshow test, the models fit reasonably well, since none is rejected at the 5% significance level. Moreover, because the positive distinction rate lies between 76.3% and 82.9%, we can conclude that our models are valid.
5.2 System Trouble
Tables 9 and 10 show the estimation results using the risk that ISPs encounter system trouble as the explained variable. In Table 9 we use both the (log) numbers of users and servers as the vulnerability index, and in Table 10 only the logarithm of the number of users. In the case using the logarithm of the number of servers as the vulnerability index, we could not obtain significant results, so we omit them. In Tables 9 and 10, it is again common that the estimated coefficient of the number of countermeasures, bC,NS, is positive and the estimated coefficient of information security education, bC,EDU, is negative; both parameters are statistically significant. Conversely, variables such as the logarithm of the number of servers, the reminder of information security incidents to users, and the system audit were excluded during the stepwise variable selection.
                 Estimated Coefficient   Standard Error   exp[B]
bV,U             0.534                   0.290            1.706
bC,NS            1.085                   0.587            2.959
bC,EDU           -3.562                  1.719            0.028
bC,PT            -1.915                  1.201            0.147
c                2.886                   1.303            17.918
Constant term    -5.110                  3.091            0.006
Chi-square(7)=3.730 [0.881], 3 steps; positive distinction rate: 80.6%
Table 9. Estimation Result IV
                 Estimated Coefficient   Standard Error   exp[B]
bC,NS            0.522                   0.292            1.685
bC,EDU           -1.968                  1.220            0.140
Constant term    0.877                   1.169            2.403
Chi-square(5)=7.659 [0.176], 6 steps; positive distinction rate: 73.7%
Table 10. Estimation Result V
In addition, the coefficients of the logarithm of the number of users, bV,U, the penetration test, bC,PT, and the service area, c, are selected in Table 9. From the results of the Hosmer-Lemeshow test, the models fit reasonably well, since neither is rejected at the 5% significance level. Moreover, because the positive distinction rate lies between 73.7% and 80.6%, we can conclude that our models are valid.
5.3 Review of Estimation Results
The estimation results in the previous section are interesting. First of all, the number of introduced information security systems and information security education have estimated coefficients that are statistically significant and of the same sign in all models: the former positive and the latter negative. The former means that the more information security systems ISPs introduce, the higher the probability of the risk that they encounter an information security incident 21. The latter means that the more positively ISPs execute information security education, the lower the risk becomes; if education is executed more positively, the risk can be reduced. Generally, many people think that the risk would be reduced if ISPs introduced various information security systems. Of course, when we discuss network security countermeasures against Internet threats, systems such as FWs and IDSs play an important role and are needed. However, the former result throws doubt on this thinking. We interpret the result as follows. First, ISPs may tend to hesitate to invest in information security countermeasures and to keep using old information security systems, because the required investment is vast; this fact is pointed out in Takemura (2007b). Therefore, there is a possibility that the old systems cannot cope with present threats. Second, even if ISPs introduce recent systems, the various systems may not be operated efficiently, because ISPs have few employees of a high technical level, such as system administrators 22. Third, for ISPs that had already encountered an information security incident, the causal relation might be reversed: the higher the risk that ISPs encounter information security incidents, the more ISPs introduce information
21 This result might represent an opposite causal relation; that is, the higher the risk, the more ISPs will introduce information security systems. We want to discuss this relation further in the future.
22 We believe that enough effects cannot be demonstrated unless the systems are operated by employees, such as SEs (system engineers), who have enough knowledge.
security systems. If these possibilities hold, a positive coefficient on the number of introduced systems is plausible. Moreover, the finding that executing information security education reduces the risk is highly meaningful. Although information security education is costly, its long-term cost-benefit performance is higher than that of introducing information security systems, because executing information security education is effective (it reduces the probability of the risk that ISPs encounter information security incidents). Takemura (2007a) reports that information security education covers etiquette on the Internet, knowledge not only about viruses and worms but also about security laws, and correspondence in emergencies; it is thus executed with not only ex ante countermeasures but also ex post ones in mind. Moreover, according to Takemura (2007a), the ratio of ISPs planning to execute information security education in the future is over 95%; in other words, many ISPs intend to execute information security education. We therefore expect that the risk of ISPs encountering information security incidents will be reduced in the future. Next, in part of the results, the estimated coefficients of the logarithms of the numbers of servers and users are positive. These results imply that these vulnerabilities heighten the risk that ISPs encounter information security incidents, which is consistent with the theoretical consideration in Section 4. Finally, we confirm that some countermeasures are not effective at present, because their coefficients are not statistically significant. In Table 9, the estimated coefficient of the service area is positive; that is, nationwide ISPs have a higher risk of encountering system trouble than local ISPs. The reason may be that the systems and networks they handle are more complex and widespread. Therefore, our results overrule our initial intuition (the hypothesis in Section 4), which assumed that local ISPs have a higher risk than nationwide ISPs.
6. Concluding Remarks and Future Work
The purpose of this chapter is to investigate the relations among information security incidents, information security countermeasures and vulnerability. We analyze these relations by applying logistic regression to data from a 2007 questionnaire survey of Japanese ISPs. As a result, we found statistically significant relationships between information security incidents and some information security countermeasures. Concretely, the relation between information security incidents and the number of introduced information security systems (resp. information security education) is positive (resp. negative). The results mean that the risk would rise if the number of introduced information security systems increased, and the risk would decrease if information security education were executed. These results are valid and provide important information when considering the financial health of ISPs. For heightening and maintaining the information security level of ISPs, it is efficient to execute information security education as a non-technological (management) countermeasure. This higher information security level cannot be achieved by the ISPs alone, and we suggest that the government needs to help them. For example, the government can hold seminars on information security: actually, MIC and METI in Japan cooperate with IPA, JNSA, JPCERT/CC and other organizations, and such seminars have been held several
times every year. The challenge seems to be one of efficient policies, and we strongly recommend that the seminars be held regularly. In addition, we suggest that the government should put out unified guidelines on information security: at present there are many guidelines on information security in Japan, and rearranging them is necessary for raising the Japanese information security level. We expect that NISC will play an important role in coordinating policies on information security among ministries (Takemura and Osajima, 2008). The idea of holding seminars on security countermeasures also applies to ordinary firms. Of course, it is also necessary for the NPA to strengthen control in cooperation with organizations in foreign countries. Finally, we discuss future work. Research on the "economics of information security" is not only meaningful in the social sciences, but also helps real business activities; therefore, such research needs to be accumulated further. We will continue to research the social and economic effects of information security countermeasures and investment quantitatively. Concretely, we will focus on the economic behaviours regarding information security of individuals, who are Internet users and employees, and of ordinary firms (Takemura and Minetaki, 2009a and 2009b; Minetaki et al., 2009; Takemura et al., 2009a and 2009b).
7. Acknowledgements
This work is supported in part by the Telecommunications Advancement Foundation in Japan; the Ministry of Education, Culture, Sports, Science and Technology, Japan: Grant-in-Aid for Young Scientists (B) (20730196); and the Japan Society for the Promotion of Science: Grant-in-Aid for Science Research (B) (21330061). The authors are grateful to Y. Ukai (Kansai University, Japan), K. Minetaki (Kansai University, Japan), T. Imagawa (The Ministry of Internal Affairs and Communications, Japan), A. Umino (The Ministry of Internal Affairs and Communications, Japan) and T. Maeda (Kansai University, Japan). We also express our gratitude to the ISPs who cooperated with our survey. The remaining errors are the authors'.
8. References
Brynjolfsson, E.; Hitt, L. & Yang, S. (2002). Intangible assets: how the interaction of computers and organizational structure affects stock market valuations. Brookings Papers on Economic Activity: Macroeconomics, Vol.1, 137-199
Takemura, T. (2008). Economic analysis on information and communication technology, Taga-shuppan, Tokyo
Information-technology Promotion Agency (2008). Information security white paper 2008, Jikkyo Shuppan, Tokyo
Cook, D. & Keromytis, A. (2006). Cryptographics: exploiting graphics cards for security, Springer, New York
Ebara, H.; Nakawani, A.; Takemura, T. & Yokomi, M. (2006). Empirical analysis for Internet service providers, Taga-shuppan, Tokyo
Yokomi, M.; Ebara, H.; Nakawani, A. & Takemura, T. (2004). Evaluation of technical efficiency for Internet providers in Japan: problems for regional providers. Journal of Public Utility Economics, Vol.56, No.3, 85-94
Takemura, T. (2007a). The 2nd investigation of actual conditions report on information security countermeasures for Internet service providers, Kansai University
Information-technology Promotion Agency (2000). Investigation report: case study of information security countermeasures in critical infrastructure, Online Available: http://www.ipa.go.jp/security/fy11/report/contents/intrusion/infrasec_pts/infrasec_pj.pdf
Yamaguchi, S. (2007). Expectation for academic society, JSSM Security Forum distributed material, Online Available: http://www.jssm.net/
National Information Security Center (2008). Secure Japan 2008: concentrated approach for strengthening information security base, Online Available: http://www.nisc.go.jp/active/kihon/pdf/sj_2008_draft.pdf
Gordon, L.A. & Loeb, M.P. (2002). The economics of information security investment. ACM Transactions on Information and System Security, Vol.5, 438-457
Varian, H.R. (2002). System reliability and free riding. ACM Transactions on Information and System Security, Vol.5, 355-366
Gordon, L.A.; Loeb, M.P. & Lucyshyn, W. (2003). Sharing information on computer systems security: an economic analysis. Journal of Accounting and Public Policy, Vol.22, No.6, 461-485
Gordon, L.A. & Loeb, M.P. (2006). Expenditures on competitor analysis and information security: a managerial accounting perspective, In: Management Accounting in the Digital Economy, Bhimani, A. (Ed.), 95-111, Oxford University Press, Oxford
Tanaka, H.; Matsuura, K. & Sudoh, O. (2005). Vulnerability and information security investment: an empirical analysis of e-local government in Japan. Journal of Accounting and Public Policy, Vol.24, No.1, 37-59
Liu, W.; Tanaka, H. & Matsuura, K. (2007). Empirical-analysis methodology for information-security investment and its application to reliable survey of Japanese firms. Information Processing Society of Japan Digital Courier, Vol.3, 585-599
Takemura, T. (2007b). Proposal of information security policy in telecommunication infrastructure, In: What is Socionetwork Strategies, Murata, T. & Watanabe, S. (Eds.), 103-127, Taga-shuppan, Tokyo
Takemura, T.; Osajima, M. & Kawano, M. (2008). Positive analysis on vulnerability, information security incidents, and the countermeasures of Japanese Internet service providers, Proceedings of World Academy of Science, Engineering and Technology, Vol.36, 703-710
Tanaka, H. (2005). Information security as intangible assets: a firm level empirical analysis on information security investment, Journal of Information Studies (The University of Tokyo), Vol.69, 123-136
Takemura, T. & Minetaki, K. (2009a). Strategic information security countermeasure can improve the market value: evidence from analyzing data on Web-based survey, The Proceedings of 6th International Conference of Socionetwork Strategies, 243-246
Takemura, T. & Minetaki, K. (2009b). The policies for strategic information security countermeasures improving the market value, The Proceedings of 66th Conference on Japan Economic Policy Association
Minetaki, K.; Takemura, T. & Imagawa, T. (2009). An empirical study of the effects of information security countermeasures, Mimeo, Kansai University
Nagaoka, H. & Takemura, T. (2007). A business continuity plan to heighten enterprise value, Proceedings of 55th National Conference, pp.149-152, Aichi-gakuin University, November 2007, Japan Society for Management Information, Nagoya
Japan Network Security Association (2008). Fiscal 2006 information security incident survey report (information leakage: projected damages and observations), Online Available: http://www.jnsa.org/en/reports/incident.html
Ukai, Y. & Takemura, T. (2007). Spam mails impede economic growth. The Review of Socionetwork Strategies, Vol.1, No.1, 14-22
Takemura, T. & Ebara, H. (2008a). Spam mail reduces economic effects, Proceedings of the 2nd International Conference on the Digital Society, pp.20-24, February 2008, IEEE, Martinique
Takemura, T. & Ebara, H. (2008b). Economic loss caused by spam mail in each Japanese industry, Selected Proceedings of 1st International Conference of Social Sciences, Vol.3, 29-42
Takemura, T. & Ebara, H. (2008c). Estimating economic losses caused by spam mails through production function approach. Journal of International Development, Vol.8, No.1, 23-33
Japan Data Communications Association (2008). Inspection slip of influence that spam mail exerts on Japanese economy, Online Available: http://www.dekyo.or.jp/
Internet Provider Association (2003). Actual conditions on investigation of nationwide Internet services 2003, Internet Provider Association, Tokyo
Takemura, T.; Umino, A. & Osajima, M. (2009a). Variance of analysis on information security conscious mind of Internet users. GITI Research Bulletin 2008-2009 (Waseda University), forthcoming
Hosmer, D.W. & Lemeshow, S. (2000). Applied logistic regression (2nd ed.), Wiley-Interscience, New York
Takemura, T. & Osajima, M. (2008). About some topics on countermeasures and policies for information security incidents in Japan. GITI Research Bulletin 2007-2008 (Waseda University), 163-168
Takemura, T.; Minetaki, K. & Imagawa, T. (2009b). Economic analysis on information security conscious mind of workers, Mimeo, Kansai University
6
Evaluating Intrusion Detection Systems and Comparison of Intrusion Detection Techniques in Detecting Misbehaving Nodes for MANET
Marjan Kuchaki Rafsanjani
Department of Computer Engineering, Islamic Azad University, Kerman Branch, Iran

1. Introduction
Mobile Ad hoc Networks (MANETs) are a new paradigm of wireless communication for mobile hosts. These networks need neither the costly infrastructure of wired networks nor the base stations and mobile switching centres of cellular wireless mobile networks. The absence of a fixed infrastructure requires mobile hosts in MANETs to cooperate with each other for message transmission. Nodes within radio range of each other can communicate directly over the wireless links, while those that are far apart use other nodes as relays. In MANETs, each host must also act as a router, since routes are mostly multi-hop. Nodes in such a network move arbitrarily, so the network topology changes frequently and unpredictably (Sun, 2004a). MANETs have different features from wired and even standard wireless networks. Their open and distributed nature, lack of fixed infrastructure, lack of central management, node mobility and dynamic topology enable intruders to penetrate the network in different ways. Moreover, the dependence on cooperation and the decentralized nature of MANETs allow an adversary to exploit new types of attacks that are designed to destroy the cooperative algorithms used in these networks (Farhan et al., 2008). Therefore, MANETs are vulnerable to different security attacks, such as distortion of routing data, exhaustion of node resources and malicious manipulation of data traffic. To secure a MANET in adversarial environments, an important and challenging problem is how to feasibly detect and defend against possible attacks, especially internal attacks (Yu et al., 2009). Prevention mechanisms, such as encryption and authentication, can be used in MANETs to decrease intrusions, but cannot eliminate them; hence these mechanisms are not enough for a secure MANET. So, Intrusion Detection Systems (IDSs) are used as one of the defensive means to detect a possible attack before the system can be penetrated. In general, if prevention mechanisms and intrusion detection systems are integrated, they can provide a highly survivable network. In this chapter, we first illustrate intrusion detection systems and then discuss why wired and cellular wireless IDSs are not suitable and applicable for MANETs. Then, the classification of IDSs is discussed and their strengths and weaknesses are evaluated. The architectures proposed so far for intrusion detection systems in MANETs are then classified,
since each is able to operate under different security situations and conditions. Misbehaving nodes in MANETs are considered, and then various intrusion detection techniques for detecting these nodes are introduced and compared. Finally, important future research directions are indicated.
2. Intrusion Detection Systems
Intrusion detection can be defined as a process of monitoring activities in a system, which can be a computer or a network. The mechanism that performs this task is called an Intrusion Detection System (IDS) (Zhou & Haas, 1999; Anantvalee & Wu, 2006). Studies show that prevention techniques such as encryption and authentication systems, which are the first line of defence, are not enough: as systems grow in complexity, their weaknesses grow, and network security problems grow with them. Intrusion detection can thus be considered a second line of defence for network security. If an intrusion is detected, a response can be generated to prevent the intrusion or minimize its effects. IDS development rests on several assumptions: first, that user operations and programs are visible; second, that normal and intrusive activities in a system behave differently. An IDS should therefore analyze system activities and determine whether or not an intrusion has occurred (Brutch & Ko, 2003; Kuchaki & Movaghar, 2009).
2.1 Comparison between Wired and Cellular Wireless IDSs and MANET IDSs
Unlike conventional cellular wireless mobile networks, which rely on extensive infrastructure to support mobility, MANETs do not need expensive base stations or wired infrastructure. Global trustworthiness of all network nodes is the fundamental security assumption in MANETs. However, this assumption is not always true in reality: the nature of MANETs makes them very vulnerable to attacks by misbehaving nodes (such as malicious attacks), ranging from passive eavesdropping to active interference. Most routing protocols focus only on providing efficient route discovery and maintenance functionality and pay little attention to routing security; very few of them specify security measures from the very beginning. The nature of MANETs makes them far more vulnerable to malicious attacks than traditional wired networks, because of the use of wireless links, the low degree of physical security of the mobile nodes, the dynamic topology, the limited power supply and the absence of a central management point. In a network with high security requirements, it is necessary to deploy intrusion detection techniques. Most of today's wired IDSs, which rely on real-time traffic parsing, filtering, formatting and analysis, monitor the traffic at switches, routers and gateways. The lack of such traffic concentration points makes traditional wired IDSs inapplicable on MANET platforms: each node can use only partial and localized communication activities as the available audit traces. There are also some characteristics of MANETs, such as disconnected operation, that seldom exist in wired networks. What is more, each mobile node has limited resources (such as limited wireless bandwidth, computation ability and energy supply), which means MANET IDSs should be lightweight. All of this implies the inapplicability of wired IDSs on the MANET platform. Furthermore, in MANETs it is very difficult for IDSs to tell the validity of some operations. For example, one node may send out falsified routing information because it is malicious, or because a link is broken due to the physical movement of the node. All
these suggest that an IDS of a different architecture needs to be developed to be applicable on the MANET platform (Zhang & Lee, 2000; Sun, 2004a). In general, the important differences between MANETs and wired and cellular wireless networks make it unsuitable to apply traditional wired and cellular wireless intrusion detection technologies directly to MANET intrusion detection systems.
3. Intrusion Detection Systems Classification
Intrusion detection can be classified by the audit data collection mechanism as host-based or network-based. A network-based IDS receives packets from the network and analyses them; a host-based IDS analyses the events taking place in application programs or the operating system. IDSs can also be divided into three groups by detection technique, with two main types and one hybrid model: anomaly-based intrusion detection (or behaviour-based detection), misuse-based intrusion detection (or knowledge-based detection) and specification-based intrusion detection (hybrid detection) (Brutch & Ko, 2003; Kuchaki et al., 2008b). These three broad categories can be used in both host-based and network-based intrusion detection systems. The host-based and network-based approaches each have their strengths and weaknesses, and they are complementary; a successful IDS would combine both. Table 1 compares network-based and host-based IDSs in terms of their strengths and weaknesses, to demonstrate how the two can work together to provide more effective intrusion detection and protection (Pahlevanzadeh & Samsudin, 2007).
Network-based IDS                                        | Host-based IDS
Broad in scope                                           | Narrow in scope; monitors specific activities
Examines packet headers and entire packets               | Does not see packet headers
Near real-time response                                  | Responds after a suspicious entry
Host independent                                         | Host dependent
Bandwidth dependent                                      | Bandwidth independent
No overload                                              | Overload
Slows down networks on which IDS clients are installed   | Slows down hosts on which IDS clients are installed
Detects network attacks, as payload is analyzed          | Detects local attacks before they hit the network
Not suitable for encrypted and switched networks         | Well suited for encrypted and switched environments
Does not normally detect complex attacks                 | Powerful tool for analyzing a possible attack, owing to the relevant information in its database
High false positive rate                                 | Low false positive rate
Lower cost of ownership                                  | Requires no additional hardware
Better for detecting attacks from outside; detects attacks that host-based IDS would miss | Better for detecting attacks from inside; detects attacks that network-based IDS would miss
Table 1. Evaluation of network-based and host-based IDSs versus their strengths and weaknesses
4. Intrusion Detection System Architectures in MANET
The network architecture of a MANET, with regard to its applications, is either flat or multi-layered; the optimum network architecture for a MANET therefore depends on its infrastructure. In flat network infrastructures, all nodes are considered equal, so they are suitable for applications such as virtual classes or conferences. In multi-layered infrastructures, nodes are considered different: nodes may be grouped into clusters, with a cluster-head node for each cluster. Within a cluster, nodes communicate with each other directly; communication between clusters is performed through the cluster-head nodes. This infrastructure is suitable for military applications (Anantvalee & Wu, 2006; Kuchaki et al., 2008b).
4.1 Stand-alone Intrusion Detection Systems
In this architecture, an IDS is executed independently on each node, and the necessary decisions for that node are based on the data it collects, because there is no interaction among network nodes and no data is exchanged. In addition, each node has no knowledge of the position of other nodes in the network, and no alert information crosses the network. Although such systems are not effective due to these limitations, they can be suitable for networks where not every node is capable of running an IDS or has an IDS installed. This architecture is also more suitable for flat network infrastructures than for multi-layered ones. Because information exclusive to a single node is not enough to detect intrusions, this architecture has not been chosen in many of the IDSs for MANETs (Farhan et al., 2008; Kuchaki et al., 2008b; Anantvalee & Wu, 2006).
4.2 Distributed and Cooperative Intrusion Detection Systems
MANETs are distributed by nature and require node cooperation. Zhang et al. (Zhang et al., 2003) put forward an intrusion detection system for MANETs that is both distributed and dependent on node cooperation. Every node takes part in intrusion detection through an IDS agent running on it. Each IDS agent is responsible for detecting, collecting data on and responding to local events independently, but neighbouring IDS agents cooperate in global intrusion detection when the local evidence is inconclusive. To handle such cases, each node runs an IDS agent comprising six modules, including local and global detection engines and response modules. To achieve better performance, they use an integration approach to analyze the attack scenario as a whole. However, this architecture is complex, since each node maintains local and global intrusion detection mechanisms, anomaly descriptions and response methods, thus independently storing a lot of information, which leads to memory overhead (Samad et al., 2005). This architecture, like the stand-alone IDS architecture, is more suitable for flat network infrastructures than for multi-level ones.
4.3 Hierarchical Intrusion Detection Systems
Hierarchical IDS architecture is a further development of the distributed and cooperative IDS architecture and has been proposed for multi-layered network infrastructures in which the network is divided into clusters. The cluster-head of each cluster has more
responsibilities than the other members, for example, sending routing packets between clusters. In this way, the cluster-heads behave just like the control points, such as switches, routers or gateways, in wired networks. The name "multi-layer IDS" is also used for the hierarchical IDS architecture. An IDS agent runs on every member node and is locally responsible for its node, i.e., monitoring and deciding on locally detected intrusions. Each cluster-head is locally in charge of its own node and globally in charge of its cluster, e.g., monitoring network packets and initiating a global reaction when an intrusion is detected (Kuchaki et al., 2008b; Huang & Lee, 2003b; Yingfang et al., 2007).
4.4 Mobile Agents for Intrusion Detection Systems
Mobile agents are intelligent, autonomous agents that can move through a heterogeneous network and interact with nodes. In order to employ mobile agents for intrusion detection in the network, a mobile agent platform must be installed on many hosts and network devices (Pahlevanzadeh & Samsudin, 2007). Mobile agents have been deployed in many IDS techniques for MANETs. Thanks to its ability to move through the network, each mobile agent is assigned just one special task, and one or more mobile agents are then distributed among the network nodes. This allows intrusion detection to be distributed across the system. Using mobile agents has several advantages (Mishra et al., 2004). Since not every responsibility is delegated to every node, energy consumption is reduced, which is an important factor in MANETs. Mobile agents also provide fault tolerance: if the network is segmented or some of the agents break down, the rest can continue to function. In addition, they can work in large and heterogeneous environments, because mobile agents can operate irrespective of the underlying architecture. However, these systems require a secure module in which the mobile agents can settle, and mobile agents must also be able to protect themselves from the secure modules on remote hosts. For instance, Li et al. (Li et al., 2004) used mobile agent technology for a coordinated IDS in ad hoc networks. This architecture uses the cluster-head as a manager that contains assistant and response mobile agents, while each node runs a host monitor agent to detect network, file and user intrusions using an intrusion analyzer and an interpretation base. The assistant agent is responsible for collecting data from the cluster-member nodes, while the response agent is used to inform the cluster-member nodes about a certain response. It uses neither the multi-layer detection approach nor the clustering approach to minimize intrusion response flooding (Samad et al., 2005). The main features of mobile agents that relate directly to the special challenging requirements of MANETs include (Hijazi & Nasser, 2005; Pahlevanzadeh & Samsudin, 2007; Kuchaki et al., 2008b):
- Robustness and fault-tolerant behaviour
- Bandwidth conservation
- Energy consumption reduction
- Load balancing improvement in the network
- Total task completion time reduction
- Working on a heterogeneous network
- Lightweight
These qualities make mobile agents a natural choice for a security framework in MANETs (Smith, 2001; Albers et al., 2002; Kachirski & Guha, 2002; Huang & Lee, 2003b). Data collection, data analysis, and alert and alarm messaging can all be achieved using mobile agents, which may reduce data transmission and save bandwidth in the MANET.
5. Misbehaving Nodes in MANET
Nodes that cause dysfunction in the network and damage other nodes are called misbehaving or critical nodes. Mobile Ad hoc Networks (MANETs), like other wireless networks, are liable to both active and passive attacks. In passive attacks, only eavesdropping on data happens, while active attacks include injecting packets to invalid destinations, deleting packets, changing the contents of packets and impersonating other nodes. Certain nodes in MANETs can produce attacks which cause congestion, distribute incorrect routing information, prevent services from operating properly or disable them (Karygiannis et al., 2006; Lima et al., 2009). Nodes that perform active attacks to damage other nodes and cause disconnection in the network are called malicious or compromised nodes, while nodes that do not forward received packets (in order to conserve battery life for their own communications) are called selfish nodes (Kong et al., 2002; Blazevic et al., 2001). A selfish node impacts normal network operations by not participating in routing protocols or by not forwarding packets. A malicious node may use the routing protocol to announce that it has the shortest route to the destination node; it then receives the packets but does not forward them. This operation is called a "black hole" attack (Zhang & Lee, 2000; Komninos et al., 2007). Malicious nodes can also disrupt the operation of a routing protocol by changing routing information or fabricating false routing information, as in the "wormhole" attack: two malicious nodes create a wormhole tunnel, connected to each other through a private link, and thereby obtain a detour route through the network. This allows a node to create an artificial route in the current network and short-circuit the normal flow of routing messages, so that the messages are controlled by the two attackers (Kyasanur & Vaidya, 2003; Hu et al., 2004). Malicious nodes can easily perform integrity attacks by changing protocol fields in order to disrupt the transport of packets or to deny access among legal nodes, and they can attack the routing computations. "Spoofing" is a special case of an integrity attack in which a malicious node, due to the lack of identity verification in particular routing protocols, forges the identity of a legal node. The result of such an attack by malicious nodes is a forged network topology, which creates network loops or partitions the network. The lack of integrity and authentication in routing protocols allows forged or false messages (Komninos et al., 2007; Papadimitratos et al., 2002; Sun et al., 2004b). Malicious nodes in a "selective forwarding" attack behave like normal nodes most of the time but selectively drop packets that are sensitive for the application; such selective dropping is difficult to detect. Selfish nodes can severely lower the efficiency of the network, since they do not readily participate in network operations: the aim of a selfish node is to enjoy the benefits of participating in the ad hoc network without having to expend its own resources in exchange (Lima et al., 2009).
6. Intrusion Detection Techniques for Misbehaving Nodes in MANET
As noted before, MANETs have no infrastructure, so each node depends on cooperation with other nodes for routing and forwarding packets. Intermediate nodes may agree to forward packets, but if they are misbehaving nodes, they can delete or alter the packets. The simulations performed by Marti et al. (Marti et al., 2000) show that only a few misbehaving nodes can degrade the efficiency of the entire system.
6.1 Watchdog and Pathrater
These two techniques were presented by Marti, Giuli, Lai and Baker (Marti et al., 2000) and were added on top of the standard routing protocol for ad hoc networks, the Dynamic Source Routing (DSR) protocol. Malicious nodes are recognized by eavesdropping on the next hop with the Watchdog technique; the Pathrater then helps find possible routes that exclude the misbehaving nodes. In the DSR protocol, the routing data is defined at the source node and is passed to the intermediate nodes in the form of a message until it reaches its intended destination; each intermediate node on the path must therefore know the node at the next hop. In addition, owing to the broadcast nature of wireless networks, it is possible to overhear the transmissions of the next hop: for example, if node A is in the vicinity of node B, node A can hear node B's communications. Figure 1 shows how the Watchdog technique operates.
S -> A -> B -> C -> D
Fig. 1. Watchdog operation

Assume that node S wishes to send a packet to node D, and that there exists a route from S to D via A, B and C. Suppose node A has received a packet on the route from S to D. When A forwards this packet to B, it keeps a copy in its buffer and then eavesdrops on node B to ensure that B forwards the packet to C. If A overhears B transmitting the packet and the overheard packet is identical to the copy in its buffer, this indicates that B has forwarded the packet to C, and the packet is then removed from the buffer. If, on the other hand, no matching transmission is overheard within a specific time, the Watchdog increments node B's failure counter. If this counter exceeds a threshold, node A concludes that node B is malicious and reports this to the source node S. The Pathrater technique calculates a path metric for every path: by keeping a rating for each node in the network, the path metric can be calculated by combining the node ratings with link reliability obtained from previous experience. After calculating the path metric for all accessible paths, the Pathrater selects the path with the highest metric; if no link reliability data is available, the path metric reduces to selecting the shortest path. In this way routes containing misbehaving nodes are avoided. Simulation results show that systems using these two techniques to find their routes are very effective in detecting misbehaving nodes, but they do not deal with or punish those nodes in any way.
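The bookkeeping just described (buffer a forwarded packet, match overheard transmissions against it, and accuse the next hop once its failure counter passes a threshold) can be sketched as follows. This is a minimal sketch: the timeout and threshold values are hypothetical, not those used by Marti et al.:

    # Sketch of Watchdog bookkeeping; constants are hypothetical.
    import time

    FAILURE_THRESHOLD = 5
    TIMEOUT = 2.0  # seconds to wait for the next hop to forward

    class Watchdog:
        def __init__(self):
            self.buffer = {}    # packet_id -> (next_hop, deadline)
            self.failures = {}  # next_hop -> failure count

        def sent(self, packet_id, next_hop):
            """Called after forwarding; keep a copy and start the clock."""
            self.buffer[packet_id] = (next_hop, time.time() + TIMEOUT)

        def overheard(self, packet_id):
            """Called when the node overhears the next hop re-transmitting."""
            self.buffer.pop(packet_id, None)  # matched: next hop behaved

        def tick(self):
            """Expire stale entries; return nodes now considered malicious."""
            now, malicious = time.time(), []
            for pid, (hop, deadline) in list(self.buffer.items()):
                if now > deadline:
                    del self.buffer[pid]
                    self.failures[hop] = self.failures.get(hop, 0) + 1
                    if self.failures[hop] > FAILURE_THRESHOLD:
                        malicious.append(hop)  # report to the source node
            return malicious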
Misbehaving nodes can thus continue to use network resources and carry on with their usual behaviour (Kuchaki et al., 2008b).
6.2 CONFIDANT
Buchegger and Le Boudec (Buchegger & Le Boudec, 2002) further developed the DSR protocol and devised a new protocol called CONFIDANT, which is similar to Watchdog and Pathrater. In this protocol, each node can observe the behaviour of all its neighbouring nodes within its radio range and learn from them. This protocol resolves the Watchdog and Pathrater problem of unpunished misbehaviour: misbehaving nodes are not used for routing and packets are not forwarded for them, so they are punished. Additionally, when a node discovers a misbehaving node, it informs all other nodes, and they too stop using that node. The CONFIDANT protocol consists of a monitoring system, a reputation system, a trust manager and a path manager, whose tasks are divided into two parts: processing the node's own observations and processing reports from trusted nodes. Since this protocol allows network nodes to send alarm messages to each other, it gives attackers a good opportunity to send false alarm messages accusing well-behaved nodes of misbehaving.
6.3 CORE
Michiardi and Molva (Michiardi & Molva, 2002) proposed CORE, a technique for detecting selfish nodes and forcing them to cooperate. Like CONFIDANT, this technique is based on a monitoring system and a reputation system, and each node receives reports from other nodes. The difference between CORE and CONFIDANT is that CORE only allows positive reports to pass through, whereas CONFIDANT also allows negative reports; CORE thus prevents false reports, and thereby a kind of DoS attack that CONFIDANT cannot prevent. When a node fails to cooperate, it is given a negative rating and its reputation decreases; in contrast, a node is given a positive rating when a positive report about it is received, and its reputation increases.
6.4 OCEAN
Bansal and Baker (Bansal & Baker, 2003) proposed a protocol called OCEAN (Observation-based Cooperation Enforcement in Ad hoc Networks), an enhanced version of the DSR protocol. OCEAN also uses a monitoring system and a reputation system. However, contrary to the previous methods, OCEAN relies only on its own observations, avoiding the vulnerability to false accusations inherent in second-hand reputation exchanges; OCEAN can therefore be viewed as a stand-alone architecture. OCEAN divides routing misbehaviour into two groups: misleading and selfish. If a node takes part in route finding but does not forward packets, it is a misleading node and misleads other nodes; if it does not participate in route finding at all, it is considered a selfish node (Anantvalee & Wu, 2006). In order to discover misleading routing behaviour, after a node forwards a packet to its neighbour, it saves the packet and monitors whether the neighbouring node attempts to forward it within a given time period. It then produces a positive or negative event as its monitoring result in order to update the rating of the neighbouring node. If the rating falls below the faulty threshold, the neighbouring node
is added to the list of problematic nodes, and this list is added to the RREQ as an avoid-list; as a result, all traffic avoids the problematic node. The node is given a specific time window in which to return to the network, since it may have been wrongly accused of misbehaving or, if it really is a misbehaving node, it may improve during this period.
6.5 Cooperative Intrusion Detection System
Huang and Lee (Huang & Lee, 2003b) proposed a cluster-based cooperative intrusion detection system similar to the system of Kachirski and Guha (Kachirski & Guha, 2003). In this method, the IDS is capable not only of detecting an intrusion but also of revealing the type of attack and the attacker. This is possible through statistical anomaly detection: identification rules for discovering attacks using statistical formulas have been defined, and these rules help to detect the type of attack and, in some cases, the attacking node (Huang et al., 2003a). In this technique, the IDS architecture is hierarchical, and each node has an equal chance of becoming a cluster-head. Monitoring is how the data to be analyzed for possible intrusions is obtained, but it consumes power. Therefore, instead of every node capturing all features itself, the cluster-head alone is responsible for computing traffic-related statistics. This is possible because the cluster-head overhears the incoming and outgoing traffic of all members of its cluster, being one hop away from them (a clique: a group of nodes where every pair of members can communicate via a direct wireless link). As a result, the energy consumption of the member nodes is decreased, while detection accuracy is only slightly worse than without clustering. Besides, the performance of the overall network is noticeably better, with decreases in CPU usage and network overhead (Anantvalee & Wu, 2006).
6.6 ExWatchdog
Nasser and Chen (Nasser & Chen, 2007) proposed an IDS called ExWatchdog, an extension of Watchdog. Its function is likewise to detect intrusions by malicious nodes and to report this information to the response system, i.e., Pathrater or RouteGuard (Hasswa et al., 2005). The Watchdog resides in each node and is based on overhearing: through overhearing, each node can detect the malicious actions of its neighbours and report them to other nodes. However, if the node that is overhearing and reporting is itself malicious, it can seriously impact network performance. The main feature of the proposed system is its ability to discover malicious nodes that try to partition the network by falsely reporting other nodes as misbehaving, and then to protect the network. Thus, ExWatchdog solves a fatal problem of Watchdog.
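Before comparing these techniques, the rating bookkeeping shared by reputation-based schemes such as OCEAN (positive and negative events, a faulty threshold, an avoid-list with a timed second chance) can be made concrete with a minimal sketch; all constants and names here are hypothetical, not values from the cited papers:

    # Sketch of OCEAN-style neighbour rating; constants are hypothetical.
    import time

    FAULTY_THRESHOLD = -40
    SECOND_CHANCE = 30.0        # seconds before a faulty node may rejoin
    POSITIVE, NEGATIVE = 1, -2  # negative events usually weigh more

    class NeighbourRatings:
        def __init__(self):
            self.rating = {}       # neighbour -> score
            self.avoid_until = {}  # neighbour -> timestamp of second chance

        def event(self, neighbour, forwarded):
            """Update the rating after observing whether a packet was forwarded."""
            delta = POSITIVE if forwarded else NEGATIVE
            self.rating[neighbour] = self.rating.get(neighbour, 0) + delta
            if self.rating[neighbour] < FAULTY_THRESHOLD:
                # join the avoid-list carried in RREQs, with a second chance
                self.avoid_until[neighbour] = time.time() + SECOND_CHANCE

        def avoid_list(self):
            """Neighbours currently excluded from route finding."""
            now = time.time()
            return [n for n, t in self.avoid_until.items() if t > now]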
7. Intrusion Detection Techniques Comparison for Detecting Misbehaving Nodes The Watchdog technique is used in all of the IDSs discussed, but it has several limitations: in the case of collisions it cannot work correctly and may lead to wrong accusations. When nodes have different transmission ranges or use directional antennas, the Watchdog cannot monitor the neighbouring nodes accurately. All the IDSs discussed so far can identify selfish nodes. CORE cannot detect the misbehaviours of malicious nodes, but the others can detect some of them,
such as unusually frequent route updates, or changes to the header or payload of packets (Anantvalee & Wu, 2006; Kuchaki et al., 2008b). Several mechanisms have been proposed for securing data forwarding. CORE and CONFIDANT are examples of reputation systems that provide information to distinguish between a trustworthy node and a misbehaving node. This information also encourages nodes to participate in the network in a trustworthy manner (Lima et al., 2009). The type of data collection in all the mentioned intrusion detection techniques is reputation-based, except in the cooperative IDS technique, where it is statistical. Table 2 presents the final comparison among the discussed IDSs.

ID Techniques                     Watchdog/   CONFIDANT   CORE   ExWatchdog   OCEAN   Cooperative
                                  Pathrater                                           IDS
Observation
  self to neighbour               yes         yes         yes    yes          yes     yes
  neighbour to neighbour          no          yes         no     no           yes     yes
Misbehavior detection
  malicious - routing             no          yes         no     yes          no      yes
  malicious - packet forwarding   yes         yes         no     yes          no      yes
  selfish - routing               no          yes         yes    no           yes     yes
  selfish - packet forwarding     yes         yes         yes    yes          yes     yes
Punishment                        no          yes         yes    no           yes     n/a
Avoid misbehaving node
  in route finding                no          yes         no     no           yes     n/a
Architecture                      distributed and cooperative for Watchdog/Pathrater,
                                  CONFIDANT, CORE and ExWatchdog; stand-alone for OCEAN;
                                  hierarchical for the cooperative IDS

Table 2. Intrusion detection techniques comparison
8. Future Research Directions
In general, security research in MANETs has focused on prevention, to avoid any type of attack, as the first line of defence; intrusion detection systems (IDS), to detect any intruder, as the second line of defence; and intrusion tolerance (IT) as the third line of defence. Systems that use techniques for tolerating intrusions and attacks are called Intrusion Tolerant Systems (ITS) (Lima et al., 2009). IDS research for MANETs requires a distributed architecture and the collaboration of a group of nodes to make accurate decisions. Intrusion detection techniques should also be integrated with existing MANET applications. This requires an understanding of the deployed applications and the related attacks in order to deploy suitable intrusion detection mechanisms; attack models must also be carefully established. On the other hand, solutions must consider resource limitations such as energy (Kuchaki & Movaghar, 2008a; Kuchaki & Movaghar, 2009). Sometimes attackers may try to attack the IDS itself, so protection against such attacks should be investigated further. In a broader sense, intrusion tolerance techniques can also be considered, so that they support the development of survivable systems.
9. Conclusion A Mobile Ad hoc Network (MANET) is a group of wireless nodes that can be dynamically organized as a multi-hop packet radio network. MANETs are an increasingly promising
area for research with many practical applications. However, MANETs are extremely vulnerable to attacks due to their dynamically changing topology, the absence of conventional security infrastructure and their open medium of communication, which, unlike in their wired counterparts, cannot be secured. Security is becoming a main concern in MANET applications. We considered the problem of misbehaving nodes and their detection by intrusion detection techniques in mobile ad hoc networks. Experience has shown that avoidance techniques such as cryptography and authentication are not enough; therefore, intrusion detection systems have grown popular. With respect to MANET features, nearly all of the IDSs are distributed and have a cooperative architecture. New attacks emerge quickly, and they have to be detected before any damage is caused to the system or data. The aim of an intrusion detection system is to detect attacks on mobile nodes or intrusions into the network. Intrusion detection systems, if designed well, can effectively identify misbehaving activities and help to offer adequate protection. Therefore, an IDS has become an indispensable component of defence-in-depth security mechanisms for MANETs. Some attacks are also categorized as misbehaviour attacks, being generated by network nodes whose actions cannot be trusted or do not conform to protocol specifications. Black hole, wormhole, flooding and selective forwarding are examples of misbehaviour attacks created by misbehaving nodes, such as malicious and selfish nodes, in the network. Accordingly, techniques have been proposed to detect and minimize misbehaving nodes in mobile ad hoc networks over the wireless channel. On the other hand, intrusion detection techniques used in wired networks cannot be directly applied to mobile ad hoc networks due to the special characteristics of these networks. Furthermore, most current MANET intrusion detection systems are still in the test phase.
10. References
Albers, P.; Camp, O.; Percher, J.; Bernard, J.; Ludovic, M. & Puttini, R. (2002). Security in ad hoc networks: A general intrusion detection architecture enhancing trust based approaches, Proceedings of the 1st International Workshop on Wireless Information Systems, pp. 3-6, Ciudad Real, Spain.
Anantvalee, T. & Wu, J. (2006). A survey on intrusion detection in mobile ad hoc networks, In: Wireless/Mobile Network Security, Xiao, Y.; Shen, X. & Du, D.-Z., page numbers (170-196), Springer.
Bansal, S. & Baker, M. (2003). Observation-based cooperation enforcement in ad hoc networks, Research Report cs.NI/0307012 v1, July 2003, Stanford University.
Blazevic, L.; Buttyan, L.; Capkun, S.; Giordano, S.; Hubaux, J. & Le Boudec, J. (2001). Self-organization in mobile ad-hoc networks: The approach of terminodes, IEEE Communications Magazine, Vol. 39, No. 6, June 2001, page numbers (166-174).
Brutch, P. & Ko, C. (2003). Challenges in intrusion detection for wireless ad-hoc networks, Proceedings of the Symposium on Applications and the Internet Workshops, pp. 368-373, ISBN: 0-7695-1873-7, January 2003.
Buchegger, S. & Le Boudec, J.-Y. (2002). Performance analysis of the CONFIDANT protocol: Cooperation of nodes - fairness in dynamic ad-hoc networks, Proceedings of the IEEE/ACM Workshop on Mobile Ad Hoc Networking and Computing (MobiHoc'02), pp. 226-236, Lausanne, Switzerland, June 2002.
Farhan, A.F.; Zulkhairi, D. & Hatim, M.T. (2008). Mobile agent intrusion detection system for mobile ad hoc networks: A non-overlapping zone approach, Proceedings of the 4th IEEE/IFIP International Conference on Internet (ICI), pp. 1-5, ISBN: 978-1-4244-2282-1, Tashkent, September 2008.
Hasswa, A.; Zulkernine, M. & Hassanein, H. (2005). Routeguard: an intrusion detection and response system for mobile ad hoc networks, Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communication, Vol. 3, pp. 336-343, ISBN: 0-7803-9181-0, August 2005.
Hijazi, A. & Nasser, N. (2005). Using mobile agents for intrusion detection in wireless ad hoc networks, Proceedings of the Second IFIP International Conference on Wireless and Optical Communications Networks (WOCN2005), pp. 362-366, ISBN: 0-7803-9019-9, March 2005.
Hu, Y.-C.; Perrig, A. & Johnson, D.B. (2003). Packet leashes: A defense against wormhole attacks in wireless networks, Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'03), Vol. 3, pp. 1976-1986, ISBN: 0-7803-7752-4, March/April 2003.
Huang, Y.; Fan, W.; Lee, W. & Yu, P. (2003a). Cross-feature analysis for detecting ad-hoc routing anomalies, Proceedings of the 23rd IEEE International Conference on Distributed Computing Systems (ICDCS'03), pp. 478-487, May 2003.
Huang, Y. & Lee, W. (2003b). A cooperative intrusion detection system for ad hoc networks, Proceedings of the 1st ACM Workshop on Security of Ad Hoc and Sensor Networks, pp. 135-147, ISBN: 1-58113-783-4, Fairfax, Virginia.
Kachirski, O. & Guha, R. (2002). Intrusion detection using mobile agents in wireless ad hoc networks, Proceedings of the IEEE Workshop on Knowledge Media Networking, pp. 153-158, ISBN: 0-7695-1778-1.
Kachirski, O. & Guha, R. (2003). Effective intrusion detection using multiple sensors in wireless ad hoc networks, Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS'03), pp. 57.1, ISBN: 0-7695-1874-5, January 2003.
Karygiannis, A.; Antonakakis, E. & Apostolopoulos, A. (2006). Detecting critical nodes for MANET intrusion detection systems, Proceedings of the 2nd International Workshop on Security, Privacy and Trust in Pervasive and Ubiquitous Computing, pp. 9-15, ISBN: 0-7695-2549-0, Lyon, June 2006.
Komninos, N.; Vergados, D. & Douligeris, C. (2007). Detecting unauthorized and compromised nodes in mobile ad hoc networks, Elsevier Ad Hoc Networks Journal, Vol. 5, No. 3, April 2007, page numbers (289-298), ISSN: 1570-8705.
Kong, J.; Luo, H.; Xu, K.; Gu, D.L.; Gerla, M. & Lu, S. (2002). Adaptive security for multilevel ad hoc networks, Wireless Communication and Mobile Computing Journal, Vol. 2, No. 5, September 2002, page numbers (533-547).
Kuchaki Rafsanjani, M. & Movaghar, A. (2008a). Identifying monitoring nodes in MANET by detecting unauthorized and malicious nodes, Proceedings of the 3rd IEEE International Symposium on Information Technology (ITSIM'08), pp. 2798-2804, ISBN: 978-1-4244-2327-9, Kuala Lumpur, Malaysia, August 2008.
Kuchaki Rafsanjani, M.; Movaghar, A. & Koroupi, F. (2008b). Investigating intrusion detection systems in MANET and comparing IDSs for detecting misbehaving
nodes, Proceedings of World Academy of Science, Engineering and Technology, Vol. 34, pp. 351-355, ISSN: 2070-3740, Venice, Italy, October 2008.
Kuchaki Rafsanjani, M. & Movaghar, A. (2009). Developing a hybrid method for identifying monitoring nodes in intrusion detection systems of MANET, Contemporary Engineering Sciences Journal, Vol. 2, No. 3, page numbers (105-116), ISSN: 1313-6569.
Kyasanur, P. & Vaidya, N.H. (2003). Detection and handling of MAC layer misbehavior in wireless networks, Proceedings of the International Conference on Dependable Systems and Networks (DSN'03), pp. 173-182, ISBN: 0-7695-1952-0, June 2003.
Li, C.; Song, Q. & Zhang, C. (2004). MA-IDS architecture for distributed intrusion detection using mobile agents, Proceedings of the 2nd International Conference on Information Technology for Application (ICITA), pp. 451-455, ISBN: 0-646-42313-4.
Lima, M.N.; Santos, A.L. & Pujolle, G. (2009). A survey of survivability in mobile ad hoc networks, IEEE Communications Surveys & Tutorials Journal, Vol. 11, No. 1, First Quarter 2009, page numbers (66-77), ISSN: 1553-877X.
Marti, S.; Giuli, T.J.; Lai, K. & Baker, M. (2000). Mitigating routing misbehavior in mobile ad hoc networks, Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MobiCom'00), pp. 255-265, ISBN: 1-58113-197-6, Boston, Massachusetts, United States, August 2000.
Michiardi, P. & Molva, R. (2002). CORE: A collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks, Proceedings of the IFIP TC6/TC11 Sixth Joint Working Conference on Communications and Multimedia Security, pp. 107-121, ISBN: 1-4020-7206-6, Deventer, The Netherlands, September 2002.
Mishra, A.; Nadkarni, K. & Patcha, A. (2004). Intrusion detection in wireless ad hoc networks, IEEE Wireless Communications Journal, Vol. 11, No. 1, February 2004, page numbers (48-60), ISSN: 1536-1284.
Nasser, N. & Chen, Y. (2007). Enhanced intrusion detection system for discovering malicious nodes in mobile ad hoc networks, Proceedings of the IEEE International Conference on Communication (ICC'07), pp. 1154-1159, ISBN: 1-4244-0353-7, Glasgow, June 2007.
Pahlevanzadeh, B. & Samsudin, A. (2007). Distributed hierarchical IDS for MANET over AODV+, Proceedings of the IEEE International Conference on Telecommunications and Malaysia International Conference on Communications, pp. 220-225, ISBN: 978-1-4244-1094-1, Penang, Malaysia, May 2007.
Papadimitratos, P.; Haas, Z.J. & Sirer, E.G. (2002). Path set selection in mobile ad hoc networks, Proceedings of the 3rd ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 1-11, ISBN: 1-58113-501-7, Lausanne, Switzerland.
Samad, K.; Ahmed, E.; Mahmood, W.; Sharif, K. & Chaudhry, A.A. (2005). Efficient clustering approach for intrusion detection in ad hoc networks, Proceedings of the Conference on Engineering Sciences and Technology, pp. 1-6, ISBN: 978-0-7803-9442-1, Karachi, August 2005.
Smith, A. (2001). An examination of an intrusion detection architecture for wireless ad hoc networks, Proceedings of the 5th National Colloquium for Information System Security Education, May 2001.
Sun, B. (2004a). Intrusion Detection in Mobile Ad Hoc Networks (Dissertation), ProQuest Information and Learning Company, UMI Number: 3172171, USA.
Sun, B.; Wu, K. & Pooch, U.W. (2004b). Towards adaptive intrusion detection in mobile ad hoc networks, Proceedings of the IEEE Global Telecommunications Conference
(GLOBECOM'04), Vol. 6, pp. 3551-3555, ISBN: 0-7803-8794-5, November/December 2004.
Yingfang, F.; Jingsha, H. & Guorui, L. (2007). A distributed intrusion detection scheme for mobile ad hoc networks, Proceedings of the 31st Annual International Computer Software and Applications Conference (COMPSAC), Vol. 2, pp. 75-80, ISBN: 0-7695-2870-8, Beijing, July 2007.
Yu, M.; Zhou, M. & Su, W. (2009). A secure routing protocol against byzantine attacks for MANETs in adversarial environments, IEEE Transactions on Vehicular Technology, Vol. 58, No. 1, January 2009, page numbers (449-460), ISSN: 0018-9545.
Zhang, Y. & Lee, W. (2000). Intrusion detection in wireless ad hoc networks, Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (ACM MobiCom'00), pp. 275-283, ISBN: 1-58113-197-6, Boston, Massachusetts, United States, August 2000.
Zhang, Y.; Lee, W. & Huang, Y.-A. (2003). Intrusion detection techniques for mobile wireless networks, Wireless Networks Journal (ACM WINET), Vol. 9, No. 5, September 2003, page numbers (545-556), ISSN: 1022-0038.
Zhou, L. & Haas, Z.J. (1999). Securing ad hoc networks, IEEE Network Magazine Special Issue on Network Security, Vol. 13, No. 6, November/December 1999, page numbers (24-30).
X7
Graph Theory and Analysis of Biological Data in Computational Biology
Shih-Yi Chao
Ching Yun University, 229 Chien-Hsin Rd., Jung-Li, Taiwan

1. Introduction
The theory of complex networks plays an important role in a wide variety of disciplines, ranging from communications to molecular and population biology. The focus of this article is on graph theory methods for computational biology. We survey methods and approaches in graph theory, along with their current applications in biomedical informatics. Within the fields of biology and medicine, potential applications of network analysis using graph theory include identifying drug targets and determining the role of proteins or genes of unknown function. There are several biological domains where graph theory techniques are applied for knowledge extraction from data. We have classified these problems into several different domains, which are described as follows.
1. Modeling of bio-molecular networks. This part presents modeling methods for bio-molecular networks, such as protein interaction networks, metabolic networks, and transcriptional regulatory networks.
2. Measurement of centrality and importance in bio-molecular networks. Identifying the most important nodes in a large complex network is of fundamental importance in computational biology. We introduce several studies that applied centrality measures to identify structurally important genes or proteins in interaction networks and investigated the biological significance of the genes or proteins identified in this way.
3. Identifying motifs or functional modules in biological networks. Most important biological processes, such as signal transduction, cell-fate regulation, transcription, and translation, involve more than four but far fewer than hundreds of proteins or genes. Most relevant processes in biological networks correspond to motifs or functional modules. This suggests that certain functional modules occur with very high frequency in biological networks and can be used to categorize them.
4. Mining novel pathways from bio-molecular networks. Biological pathways provide significant insights into the interaction mechanisms of molecules. Experimental validation of identified pathways in different organisms in a wet-lab environment requires monumental amounts of time and effort. Thus, there is a need for graph theory tools that help scientists predict pathways in bio-molecular networks.
Our primary goal in the present article is to provide as broad a survey as possible of the major advances made in this field. Moreover, we also highlight what has been achieved as well as some of the most significant open issues that need to be addressed. Finally, we hope that this chapter will serve as a useful introduction to the field for those unfamiliar with the literature.
2. Definitions and mathematical preliminaries
2.1 The concept of a graph
The concept of a graph is fundamental to the material to be discussed in this chapter. A graph G consists of a set of vertices V(G) and a set of edges E(G). In a simple graph, two of the vertices in G are linked if there exists an edge (vi, vj) ∈ E(G) connecting the vertices vi and vj in graph G such that vi ∈ V(G) and vj ∈ V(G). The number of vertices will be denoted by |V(G)|, and the set of vertices adjacent to a vertex vi is referred to as the neighbors of vi, N(vi). The degree of a vertex vi is the number of edges with which it is incident, symbolized by d(vi). Two graphs, G1 and G2, are said to be isomorphic (G1 ≅ G2) if a one-to-one transformation of V1 onto V2 effects a one-to-one transformation of E1 onto E2. A subgraph G′ of a graph G is a graph whose set of vertices and set of edges satisfy the relations V(G′) ⊆ V(G) and E(G′) ⊆ E(G); if G′ is a subgraph of G, then G is said to be a supergraph of G′. The line graph L(G) of an undirected graph G is a graph such that each vertex in L(G) represents an edge in G, and two vertices of L(G) are adjacent if and only if their corresponding edges share a common endpoint in G.

2.2 Directed and undirected graphs
A graph may be undirected, meaning that there is no distinction between the two vertices associated with each edge, or its edges may be directed from one vertex to another. Formally, a finite directed graph, G, consists of a set of vertices or nodes, V(G) = {v1, . . ., vn}, together with an edge set, E(G) ⊆ V(G) × V(G). Intuitively, each edge (u, v) ∈ E(G) can be thought of as connecting the starting node u to the terminal node v. An undirected graph, G, also consists of a vertex set, V(G), and an edge set E(G). However, there is no direction associated with the edges in this case. Hence, the elements of E(G) are simply two-element subsets of V(G), rather than ordered pairs as in directed graphs. As with directed graphs, we shall use the notation uv (or vu, as direction is unimportant) to denote the edge {u, v} in an undirected graph. For two vertices, u, v, of an undirected graph, uv is an edge if and only if vu is also an edge. We are not dealing with multi-graphs, so there can be at most one edge between any pair of vertices in an undirected graph. That is, we are discussing simple graphs. A simple graph is an undirected graph that has no loops and no more than one edge between any two different vertices. In a simple graph the edges of the graph form a set, and each edge is a pair of distinct vertices. The number of vertices n in a directed or undirected graph is the size or order of the graph.

2.3 Node-degree and the adjacency matrix
For an undirected graph G, we shall write d(u) for the degree of a node u in V(G). This is simply the total number of edges at u. For the graphs we shall consider, this is equal to the number of neighbors of u, d(u) = |N(u)|. In a directed graph G, the in-degree, d+(u) (out-
degree, d-(u)) of a vertex u is given by the number of edges that terminate (or start) at u. Suppose that the vertices of a graph (directed or undirected) G are ordered as v1, . . ., vn. Then the adjacency matrix, A, of G is given by
\[
a_{ij} = \begin{cases} 1 & \text{if } v_i v_j \in E(G) \\ 0 & \text{if } v_i v_j \notin E(G) \end{cases}
\]
Thus, the adjacency matrix of an undirected graph is symmetric, while this need not be the case for a directed graph.

2.4 Path, path length and connected graph
Let u, v be two vertices in a graph G. A sequence of vertices u = v1, v2, . . ., vk = v such that vi vi+1 ∈ E(G) for i = 1, . . ., k-1 is said to be a path of length k-1 from u to v. The geodesic distance, or simply distance, d(u, v), from u to v is the length of the shortest path from u to v in G. If no such path exists, then we set d(u, v) = ∞. If for every pair of vertices, (u, v), in graph G, there is some path from u to v, then we say that G is connected.
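These definitions map directly onto code. The following Python sketch is a self-contained illustration, not tied to any particular graph library; the small edge list is hypothetical. It builds the adjacency matrix of an undirected graph, computes node degrees, and evaluates the geodesic distance by breadth-first search.

```python
from collections import deque

# Hypothetical undirected graph: vertices 0..4, edge list as vertex pairs.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # vertex 4 is isolated

n = len(vertices)
# Adjacency matrix: a[i][j] = 1 iff vi vj is an edge (symmetric in the undirected case).
a = [[0] * n for _ in range(n)]
neighbors = {v: set() for v in vertices}
for (i, j) in edges:
    a[i][j] = a[j][i] = 1
    neighbors[i].add(j)
    neighbors[j].add(i)

# Degree d(u) = |N(u)|.
degree = {v: len(neighbors[v]) for v in vertices}

def distance(u, v):
    """Geodesic distance d(u, v) via breadth-first search;
    returns float('inf') when no path exists."""
    seen, queue = {u}, deque([(u, 0)])
    while queue:
        node, d = queue.popleft()
        if node == v:
            return d
        for w in neighbors[node]:
            if w not in seen:
                seen.add(w)
                queue.append((w, d + 1))
    return float('inf')

print(degree[2])        # 3
print(distance(0, 3))   # 2
print(distance(0, 4))   # inf -> the graph is not connected
```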
3. Modeling of bio-molecular networks
3.1 Introduction
Several classes of bio-molecular networks have been studied: transcriptional regulatory networks, protein interaction networks, and metabolic networks. In biology, transcriptional regulatory networks and metabolic networks are usually modeled as directed graphs. For instance, in a transcriptional regulatory network, nodes represent genes, with edges denoting the transcriptional relationships between them. This is a directed graph because, if gene A regulates gene B, then there is a natural direction associated with the edge between the corresponding nodes, starting at A and terminating at B. In recent years, attention has been focused on the protein-protein interaction networks of various simple organisms (Itzkovitz & Alon, 2005). These networks describe the direct physical interactions between the proteins in an organism's proteome, and there is no direction associated with the interactions in such networks. Hence, PPI networks are typically modeled as undirected graphs, in which nodes represent proteins and edges represent interactions. In the next sections, we introduce each of these bio-molecular networks.

3.2 Transcriptional regulatory networks
Transcriptional regulatory networks describe the regulatory interactions between genes. Here, nodes correspond to individual genes, and a directed edge is drawn from gene A to gene B if A positively or negatively regulates gene B. Networks have been constructed for the transcriptional regulatory networks of E. coli and S. cerevisiae (Salgado et al., 2006; Lee et al., 2002; Keseler et al., 2005) and are maintained in databases such as RegulonDB (Salgado et al., 2006) and EcoCyc (Keseler et al., 2005). Such networks are usually constructed through a combination of high-throughput genome location experiments and literature searches. Many approaches related to gene transcriptional regulation have been reported in the past. Their nature and composition are categorized by
several factors: whether gene expression values are considered (Keedwell & Narayanan, 2005; Shmulevich et al., 2002), the causal relationship between genes, e.g. with Bayesian analysis or dynamic Bayesian networks (Zou & Conzen, 2005; Husmeier, 2003), and the time domain, e.g. discrete or continuous time (Li et al., 2006; He & Zeng, 2006; Filkov et al., 2002; Qian et al., 2001). One of the limitations of graph theory applications in analyzing biochemical networks is the static quality of graphs. Biochemical networks are dynamical, and the abstraction to graphs can mask temporal aspects of information flow: the nodes and links of biochemical networks change with time. A static graph representation of a system is, however, a prerequisite for building detailed dynamical models (Zou & Conzen, 2005). Most dynamical modeling approaches can be used to simulate network dynamics while using the graph representation as the skeleton of the model. Modeling the dynamics of biochemical networks provides a closer-to-reality recapitulation of the system's behavior in silico, which can be useful for developing more quantitative hypotheses.

3.3 Protein interaction networks
Understanding protein interactions is one of the important problems of computational biology. Protein-protein interaction (PPI) networks are commonly represented in an undirected graph format, with nodes corresponding to proteins and edges corresponding to protein-protein interactions. The volume of experimental data on protein-protein interactions is rapidly increasing thanks to improvements in high-throughput techniques, which are able to produce large batches of PPIs. For example, yeast contains over 6,000 proteins, and currently over 78,000 PPIs have been identified between the yeast proteins, with hundreds of labs around the world adding to this list constantly. Humans are expected to have around 120,000 proteins and around 10^6 PPIs. The relationships between the structure of a PPI network and cellular function remain to be explored. Large-scale PPI networks (Rain et al., 2001; Giot et al., 2003; Li et al., 2004; Von Mering et al., 2002; Mewes et al., 2002) have been constructed recently using high-throughput approaches such as yeast two-hybrid screens (Ito et al., 2001) or mass spectrometry techniques (Gavin et al., 2002) to identify protein interactions. Vast amounts of PPI-related data that are constantly being generated around the world are being deposited in numerous databases. Data on protein interactions are also stored in databases such as the Database of Interacting Proteins (DIP) (Xenarios et al., 2000). We briefly mention the main databases, including nucleotide sequence, protein sequence, and PPI databases. The largest nucleotide sequence databases are EMBL (Stoesser et al., 2002), DDBJ (Tateno et al., 2002), and GenBank (Benson et al., 2002). They contain sequences from the literature as well as those submitted directly by individual laboratories. These databases store information in a general manner for all organisms. Organism-specific databases exist for many organisms. For example, the complete genome of yeast and related yeast strains can be found in the Saccharomyces Genome Database (SGD) (Dwight et al., 2002). FlyBase (Ashburner, 1993) contains the complete genome of the fruit fly Drosophila melanogaster. It is one of the earliest model organism databases. Ensembl (Hubbard et al., 2002) contains the draft human genome sequence along with its gene prediction and large-scale annotation.
SwissProt (Bairoch & Apweiler, 2000) and the Protein Information Resource (PIR) (McGarvey et al., 2000) are two major protein sequence databases. SwissProt maintains a high level of annotation for each protein, including its function, domain structure, and post-translational modification information.
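Since interaction databases such as DIP ultimately deliver lists of interacting protein pairs, often with some confidence estimate, a minimal working representation is an undirected graph with weighted edges. The sketch below is a toy illustration: the protein identifiers and confidence scores are invented for the example.

```python
# Hypothetical PPI records: (protein A, protein B, confidence score in [0, 1]).
ppi_records = [
    ("YFL039C", "YDR382W", 0.92),   # names and scores are made up for illustration
    ("YFL039C", "YLR249W", 0.41),
    ("YDR382W", "YLR249W", 0.77),
]

# Undirected weighted graph as a nested dict: node -> {neighbor: confidence}.
ppi = {}
for a, b, conf in ppi_records:
    ppi.setdefault(a, {})[b] = conf
    ppi.setdefault(b, {})[a] = conf

# Query: interaction partners of a protein above a confidence threshold.
def partners(protein, min_conf=0.5):
    return [p for p, c in ppi.get(protein, {}).items() if c >= min_conf]

print(partners("YFL039C"))   # ['YDR382W']
```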
Understanding the interactions between proteins in a cell may benefit from a model of the PPI network. A full description of protein interaction networks requires a complex model that would encompass the undirected physical protein-protein interactions, other types of interactions, the confidence level (or method and multiplicity) of an interaction, directional pathway information, temporal information on the presence or absence of PPIs, and information on the strength of the interactions. This may be achieved by designing a scoring function and assigning weights to the nodes and edges of a PPI network.

3.4 Metabolic networks
Metabolic networks describe the bio-chemical interactions within a cell through which substrates are transformed into products through reactions catalysed by enzymes. Metabolic networks generally require more complex representations, such as hyper-graphs, as reactions in metabolic networks generally convert multiple inputs into multiple outputs with the help of other components. An alternative, reduced representation for a metabolic network is a weighted bipartite graph. In such graphs, two types of nodes are used to represent reactions and compounds, respectively. The edges in a weighted bipartite graph connect nodes of different types, representing either substrate or product relationships. These networks can represent the complete set of metabolic and physical processes that determine the physiological and biochemical properties of a cell. Metabolic networks are complex: there are many kinds of nodes (proteins, particles, molecules) and many connections (interactions) in such networks. Even if one can define sub-networks that can be meaningfully described in relative isolation, there are always connections from them to other networks. As with protein interaction networks, genome-scale metabolic networks have been constructed for a variety of simple organisms, including S. cerevisiae and E. coli (Jeong et al., 2000; Overbeek et al., 2000; Karp et al., 2002; Edwards et al., 2000), and are stored in databases such as KEGG (Kanehisa & Goto, 2000) or BioCyc (Karp et al., 2005). A common approach to the construction of such networks is to first use the annotated genome of an organism to identify the enzymes in the network and then to combine biochemical and genetic information to obtain their associated reactions (Kauffman et al., 2003; Edwards et al., 2001). While efforts have been made to automate certain aspects of this process, there is still a need to manually validate automatically generated networks against experimental biochemical results (Segre et al., 2003). For metabolic networks, significant advances have also been made in modelling the reactions that take place on such networks. The overall structure of a network can be described by several different parameters, for example, the average number of connections a node has, or the probability that a node has a given number of connections. Theoretical work has shown that different models of how a network has been created will give different values for these parameters. Classical random network theory (Erdös & Renyi, 1960) states that, given a set of nodes, connections are made randomly between the nodes. This gives a network where most nodes have the same number of connections. Recent research has shown that this model does not fit the structure found in several important networks.
Instead, these complex networks are better described by a so-called scale-free model where most nodes have only a few connections, but a few nodes (called hubs) have a very large number of connections. Recent work indicates that metabolic networks are examples of such scale-free networks (Jeong et al., 2000). This result is important, and will probably lead to new insights into the function of metabolic and signaling networks, and into the evolutionary history of
the networks. Robustness is another important property of metabolic networks. This is the ability of the network to produce essentially the same behavior even when the various parameters controlling its components vary within considerable ranges. For example, recent work indicates that the segment polarity network in the Drosophila embryo can function satisfactorily with a surprisingly large number of randomly chosen parameter sets (von Dassow et al., 2000). The parameters do not have to be carefully tuned or optimized. This makes biological sense: a metabolic network should be tolerant of mutations and large environmental changes. Another important emerging research topic is to understand metabolic networks in terms of their function in the organism and in relation to the data we already have. This requires combining information from a large number of sources, such as classical biochemistry, genomics, functional genomics, microarray experiments, network analysis, and simulation. A theory of the cell must combine the descriptions of the structures in it with a theoretical and computational description of the dynamics of the life processes. One of the most important challenges for the future is how to make all this information comprehensible in biological terms. This is necessary in order to facilitate the use of the information for predictive purposes: to predict what will happen under some specific set of circumstances. This kind of predictive power will only be reached if the complexity of biological processes can be handled computationally.
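As a concrete illustration of the bipartite representation described in this section, the following sketch encodes a toy reaction set as reaction and compound nodes and tabulates the compound degree distribution, the quantity inspected in the scale-free analyses cited above. The reactions and their composition are invented for illustration.

```python
from collections import Counter

# Toy metabolic reactions (hypothetical): reaction id -> (substrates, products).
reactions = {
    "R1": ({"glucose", "ATP"}, {"G6P", "ADP"}),
    "R2": ({"G6P"}, {"F6P"}),
    "R3": ({"F6P", "ATP"}, {"F16BP", "ADP"}),
}

# Bipartite graph: compound and reaction nodes; each edge records whether the
# compound enters the reaction as a substrate or leaves it as a product.
edges = []
for rid, (subs, prods) in reactions.items():
    edges += [(c, rid, "substrate") for c in subs]
    edges += [(rid, c, "product") for c in prods]

# Compound degree = number of reaction edges a metabolite takes part in; in
# real genome-scale networks this distribution is reported to be scale-free.
compound_degree = Counter()
for u, v, kind in edges:
    compound = u if kind == "substrate" else v
    compound_degree[compound] += 1

print(compound_degree.most_common())   # currency metabolites like ATP act as hubs
```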
4. Measurement of centrality and importance in bio-molecular networks Biological function is an extremely complicated consequence of the action of a large number of different molecules that interact in many different ways. Genomic associations between genes reflect functional associations between their products (proteins) (Huynen et al., 2000; Yanai et al., 2001). Furthermore, the strength of the genomic associations correlates with the strength of the functional associations. Genes that frequently co-occur in the same operon in a diverse set of species are more likely to physically interact than genes that occur together in an operon in only two species (Huynen et al., 2000), and proteins linked by gene fusion or conservation of gene order are more likely to be subunits of a complex than are proteins that are merely encoded in the same genomes (Enright et al., 1999). Other types of associations have been used for network studies, but these focus on certain specific types of functional interactions, like subsequent enzymatic steps in metabolic pathways, or physical interactions. Elucidating the contribution of each molecule to a particular function would seem hopeless, had evolution not shaped the interaction of molecules in such a way that they participate in functional units, or building blocks, of the organism's function (Callebaut et al., 2005). These building blocks can be called modules, whose interactions, interconnections, and fault-tolerance can be investigated from a higher-level point of view, thus allowing for a synthetic rather than analytic view of biological systems (Sprinzak et al., 2005). The recognition of modules as discrete entities whose function is separable from those of other modules (Hartwell et al., 1999) introduces a critical level of biological organization that enables in silico studies. Intuitively, modularity must be a consequence of the evolutionary process. Modularity implies the possibility of change with minimal disruption of function, a feature that is directly selected for (Wilke et al., 2003). However, if a module is essential, its independence from other modules is irrelevant unless, when disrupted, its function can be restored either
by a redundant gene or by an alternative pathway or module. Furthermore, modularity must affect the evolutionary mechanisms themselves, so that both robustness and evolvability can be optimized simultaneously (Lenski et al., 2006). The analysis of these concepts requires both an understanding of what constitutes a module in biological systems and tools to recognize modules among groups of genes. In particular, a systems view of biological function requires the development of a vocabulary that not only classifies modules according to the role they play within a network of modules and motifs, but also captures how these modules and their interconnections are changed by evolution, for example, how they constitute units of evolution targeted directly by the selection process (Schlosser et al., 2004). The identification of biological modules is usually based on either functional or topological criteria. For example, genes that are co-expressed or co-regulated can be classified into modules by identifying their common transcription factors (Segal et al., 2004), while genes that are highly connected by edges in a network form clusters that are only weakly connected to other clusters (Rives et al., 2003). From an evolutionary viewpoint, genes that are inherited together, but not with others, often form modules (Snel et al., 2004; Slonim et al., 2006). However, the concept of modularity is not at all well defined. For example, the fraction of proteins that constitutes the core of a module and that is inherited together is small (Snel et al., 2004), implying that modules are fuzzy but also flexible, so that they can be rewired quickly, allowing an organism to adapt to novel circumstances (Campillos et al., 2006). A useful set of data is provided by genetic interactions (Reguly et al., 2006), such as synthetic lethal pairs of genes or dosage rescue pairs, in which a knockout or mutation of a gene is suppressed by over-expressing another gene. Such pairs are interesting because they provide a window on the cellular robustness and modularity brought about by the conditional expression of genes. Indeed, the interaction between genes, known as epistasis (Wolf et al., 2000), has been used to successfully identify modules in yeast metabolic genes (Segre et al., 2005). However, interacting pairs of genes often lie in alternate pathways rather than clustering in functional modules. These genes do not interact directly and thus are expected to straddle modules more often than lie within one (Jeong et al., 2000). In silico evolution is a powerful tool if complex networks can be generated that share the pervasive characteristics of biological networks, such as error tolerance, small-world connectivity, and scale-free degree distribution (Jeong et al., 2000). If, furthermore, each node in the network represents a simulated chemical or a protein catalyzing reactions involving these molecules, then it is possible to conduct a detailed functional analysis of the network by simulating knockdown or over-expression experiments. These functional data can then be combined with evolutionary and topological information to arrive at a sharper concept of modularity that can be tested in vitro when more genetic data become available.
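Although the measures themselves are not detailed in this section, the idea of ranking structurally important genes or proteins can be illustrated with the simplest centrality index, degree centrality; betweenness and closeness centrality follow the same score-and-rank pattern. The network below is hypothetical.

```python
# Degree centrality on a hypothetical interaction network: the score of a node
# is its degree divided by (n - 1), so values are comparable across networks.
interactions = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

nodes = sorted({n for pair in interactions for n in pair})
deg = {n: 0 for n in nodes}
for u, v in interactions:
    deg[u] += 1
    deg[v] += 1

n = len(nodes)
centrality = {node: d / (n - 1) for node, d in deg.items()}

# Rank candidate 'important' genes/proteins by structural importance.
for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 2))
# A scores highest (degree 3 of a possible 4), marking it as a hub.
```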
Previous work on the in silico evolution of metabolic (Pfeiffer et al., 2005), signaling (Soyer & Bonhoeffer, 2006; Soyer et al., 2006), biochemical (Francois et al., 2004; Paladugu et al., 2006), regulatory (Ciliberti et al., 2007), as well as Boolean (Ma'ayan et al., 2006), electronic (Kashtan et al., 2005), and neural (Hampton et al., 2004) networks has begun to reveal how network properties such as hubness, scaling, and mutational robustness, as well as short pathway lengths, can emerge in a purely Darwinian setting. In particular, in silico experiments testing the evolution of modularity, both in abstract (Lipson et al., 2002) and in simulated electronic networks, suggest that environmental variation is key to a modular organization of function. These networks are complex, topologically interesting (Adami, 2002), and
function within simulated environments with different variability that can be arbitrarily controlled.
5. Identifying motifs or functional modules in biological networks Biological systems viewed as networks can readily be compared with engineering systems, which are traditionally described by networks such as flow charts. Remarkably, when such a comparison is made, biological networks and engineered networks are seen to share structural principles such as modularity and the recurrence of circuit elements (Alon, 2003). Both biological and engineering systems are organized with modularity. Engineering systems can be decomposed into functional modules at different levels (Hansen et al., 1999): subroutines in software (Myers, 2003) and replaceable parts in machines. In the case of biological networks, although there is no consensus on the precise groups of genes and interactions that form modules, it is clear that they possess a modular structure (Babu et al., 2004). Alon proposed a working definition of a module based on comparison with engineering: a module in a network is a set of nodes that have strong interactions and a common function (Alon, 2003). A module has defined input nodes and output nodes that control the interactions with the rest of the network. Various basic functional modules are frequently reused in engineering and biological systems. For example, a digital circuit may include many occurrences of basic functional modules such as multiplexers (Hansen et al., 1999). Biology displays the same principle, using key wiring patterns again and again throughout a network. For instance, metabolic networks use regulatory circuits such as feedback inhibition in many different pathways (Alon, 2003). Besides basic functional modules, a small set of recurring circuit elements termed motifs has recently been discovered in a wide range of biological and engineering networks (Milo et al., 2002). Motifs are small (about 3 or 4 nodes) sub-graphs that occur significantly more frequently in real networks than expected by chance alone, and they are detected purely by topological analysis. This discovery kindled a lot of interest in the organization and function of motifs, and many related papers have been published in recent years. The observed over-representation of motifs has been interpreted as a manifestation of functional constraints and design principles that have shaped network architecture at the local level (Milo et al., 2002). Some researchers believe that motifs are basic building blocks that may have specific functions as elementary computational circuits (Milo et al., 2002). Although motifs seem closely related to conventional building blocks, their relation lacks adequate and precise analysis, and their method of integration into full networks has not been fully examined. Further, it is not clear what determines the particular frequencies of all possible network motifs in a specific network.
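A minimal version of motif detection can be sketched as follows: enumerate small ordered node tuples and test whether a given wiring pattern is present. The example counts feed-forward loops, one of the three-node motifs reported by Milo et al. (2002), in an invented directed network; a full analysis would also compare the count against an ensemble of randomized networks to establish over-representation.

```python
from itertools import permutations

# Hypothetical directed regulatory network (regulator -> regulated gene).
edges = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "D")}
nodes = {n for e in edges for n in e}

# Count feed-forward loops (FFLs): x -> y, y -> z and x -> z.
ffl_count = 0
for x, y, z in permutations(nodes, 3):
    if (x, y) in edges and (y, z) in edges and (x, z) in edges:
        ffl_count += 1

print(ffl_count)   # 2 for this edge set: (A, B, C) and (B, C, D)
```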
6. Mining novel pathways from bio-molecular networks In studying organisms at a systems level, biologists have recently raised (Kelley et al., 2003) the following questions: (1) Is there a minimal set of pathways that are required by all organisms? (2) To what extent are the genomic pathways conserved among different species? (3) How are organisms related in terms of the distance between pathways, rather than at the level of DNA sequence similarity? At the core of such questions lies the identification of pathways in different organisms. However, experimental validation of an
enormous number of possible candidates in a wet-lab environment requires monumental amounts of time and effort. Thus, there is a need for comparative genomics tools that help scientists predict pathways in an organism's biological network. Due to the complex and incomplete nature of biological data, fully automated computational pathway prediction is, at the present time, excessively ambitious. A metabolic pathway is a set of biological reactions where each reaction consumes a set of metabolites, called substrates, and produces another set of metabolites, called products. A reaction is catalyzed by an enzyme (or a protein) or a set of enzymes. There are many web resources that provide access to curated as well as predicted collections of pathways, e.g., KEGG (Kanehisa et al., 2004), EcoCyc (Keseler et al., 2005), Reactome (Joshi-Tope et al., 2005), and PathCase (Ozsoyoglu et al., 2006). Work to date on discovering biological networks can be organized under two main headings: (i) pathway inference (Yamanishi et al., 2007; Shlomi et al., 2006), and (ii) whole-network detection (Tu et al., 2006; Yamanishi et al., 2005). Even with the availability of a genomic blueprint for a living system and functional annotations for its putative genes, the experimental elucidation of its biochemical processes is still a daunting task. Though it is possible to organize genes by broad functional roles, piecing them together manually into consistent biochemical pathways quickly becomes intractable. A number of metabolic pathway reconstruction tools have been developed since the availability of the first microbial genome, Haemophilus influenzae (Fleischmann et al., 1995). These include PathoLogic (Karp & Riley, 1994), MAGPIE (Gaasterland & Sensen, 1996), WIT (Overbeek et al., 2000) and PathFinder (Goesmann et al., 2002). The goal of most pathway inference methods has generally been to match putatively identified enzymes with known or reference pathways. Although reconstruction is an important starting point for elucidating the metabolic capabilities of an organism based on prior pathway knowledge, reconstructed pathways often have many missing enzymes, even in essential pathways. The issue of redefining microbial biochemical pathways on the basis of missing proteins is important, since there are many examples of alternatives to standard pathways in a variety of organisms (Cordwell, 1999). Moreover, engineering a new pathway into an organism through heterologous enzymes also requires the ability to infer new biochemical routes. With more genomic sequencing projects underway and confident functional characterizations absent for many of the genes, automated strategies for predicting biochemical pathways can aid biologists in unraveling the complex processes in living systems. At the same time, pathway inference approaches can also help in designing synthetic processes using the repertoire of biocatalysts available in nature.
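As a toy illustration of pathway inference over the substrate/product representation defined above, the sketch below searches for a shortest chain of reactions transforming a source metabolite into a target by breadth-first search. Real tools add stoichiometry, atom tracking and scoring; the reaction set here is invented.

```python
from collections import deque

# Hypothetical reactions: id -> (substrates, products).
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"A"}, {"D"}),
    "R4": ({"D"}, {"C"}),
}

def find_pathway(source, target):
    """Return one shortest reaction chain turning `source` into `target`,
    or None when no chain exists."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        metabolite, path = queue.popleft()
        if metabolite == target:
            return path
        for rid, (subs, prods) in reactions.items():
            if metabolite in subs:
                for p in prods:
                    if p not in seen:
                        seen.add(p)
                        queue.append((p, path + [rid]))
    return None

print(find_pathway("A", "C"))   # ['R1', 'R2'] (a second equally short route exists)
```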
7. Conclusion The large-scale data on bio-molecular interactions that is becoming available at an increasing rate enables a glimpse into complex cellular networks. Mathematical graph theory is a straightforward way to represent this information, and graph-based models can exploit global and local characteristics of these networks relevant to cell biology. Moreover, the need for a more systematic approach to the analysis of living organisms, alongside the availability of unprecedented amounts of data, has led to a considerable growth of activity in the theory and analysis of complex biological networks in recent years. Networks are ubiquitous in Biology, occurring at all levels from biochemical reactions within the cell up to the complex webs of social and sexual interactions that govern the dynamics of disease
spread through human populations. Network graphs have the advantage that they are very simple to reason about and correspond, by and large, to the information that is globally available today at the network level. However, while binary relation information does represent a critical aspect of interaction networks, many biological processes appear to require more detailed models. A comprehensive understanding of these networks is needed to develop more sophisticated and effective treatment strategies for diseases such as cancer. Mathematical models of large-scale data sets may eventually prove valuable in medical problems, such as identifying the key players, and the relationships among them, responsible for multifactorial behavior in human disease networks. In conclusion, it can be said that biological network analysis is needed in the bioinformatics research field, and the challenges are exciting. It is hoped that this chapter will be of assistance to researchers by highlighting recent advances in this field.
8. References
Adami, C. (2002). What is complexity? BioEssays, Vol. 24, pp. 1085-1094.
Alon, U. (2003). Biological networks: the tinkerer as an engineer. Science, Vol. 301, No. 5641.
Ashburner, M. (1993). FlyBase. Genome News, Vol. 13, pp. 19-20.
Babu, M. M. (2004). Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol, Vol. 14, No. 3, pp. 283-291.
Bairoch, A. & Apweiler, R. (2000). The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Research, Vol. 28, pp. 45-48.
Benson, D. A. (2002). GenBank. Nucleic Acids Research, Vol. 30, pp. 17-20.
Callebaut, W. & Rasskin-Gutman, D. (2005). Modularity: understanding the development and evolution of complex systems. Mueller GB, Wagner GB, Callebaut W, editors. Cambridge (Massachusetts): MIT Press.
Campillos, M. et al. (2006). Identification and analysis of evolutionarily cohesive functional modules in protein networks. Genome Res., Vol. 16, pp. 374-382.
Ciliberti, S. et al. (2007). Robustness can evolve gradually in complex regulatory gene networks with varying topology. PLoS Comput Biol., Vol. 3.
Cordwell, S. (1999). Microbial genomes and missing enzymes: redefining biochemical pathways. Arch. Microbiol., Vol. 172, pp. 269-279.
Dwight, S. S. (2002). Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Research, Vol. 30, pp. 69-72.
Edwards, J. & Palsson, B. (2000). The Escherichia coli MG1655 in silico metabolic genotype: Its definition, characteristics and capabilities. Proceedings of the National Academy of Sciences, Vol. 97, No. 10, pp. 5528-5533.
Edwards, J. (2001). In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nature Biotechnology, Vol. 19, pp. 125-130.
Erdős, P. & Rényi, A. (1960). The Evolution of Random Graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl., Vol. 5, pp. 17-61.
Enright, A. J. et al. (1999). Protein interaction maps for complete genomes based on gene fusion events. Nature, Vol. 402, pp. 86-90.
Filkov, V. et al. (2002). Analysis techniques for microarray time-series data. J Comput Biol, Vol. 9, pp. 317-331.
Fleischmann, R. et al. (1995). Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, Vol. 269, pp. 469-512.
Francois, P. & Hakim, V. (2004). Design of genetic networks with specified functions by evolution in silico. Proc Natl Acad Sci U S A, Vol. 101, pp. 580-585.
Gaasterland, T. & Sensen, C. (1996). MAGPIE: automated genome interpretation. Trends Genet., Vol. 12, pp. 76-78.
Gavin, A. et al. (2002). Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature, Vol. 415, pp. 141-147.
Giot, L. et al. (2003). A protein interaction map of Drosophila melanogaster. Science, Vol. 302, pp. 1727-1736.
Goesmann, A. et al. (2002). PathFinder: reconstruction and dynamic visualization of metabolic pathways. Bioinformatics, Vol. 18, pp. 124-129.
Hampton, A.N. & Adami, C. (2004). Evolution of robust developmental neural networks. Pollack JB, Bedau MA, Husbands P, Ikegami T, Watson R, editors. Boston: MIT Press. pp. 438-443.
Hansen, M. C. et al. (1999). Unveiling the ISCAS-85 benchmarks: A case study in reverse engineering. IEEE Des. Test, Vol. 16, No. 3, pp. 72-80.
Hartwell, L.H. et al. (1999). From molecular to modular cell biology. Nature, Vol. 402, pp. C47-C52.
He, F. & Zeng, A.P. (2006). In search of functional association from time-series microarray data based on the change trend and level of gene expression. BMC Bioinformatics, Vol. 7, pp. 69-84.
Hubbard, T. (2002). The Ensembl genome database project. Nucleic Acids Research, Vol. 30, pp. 38-41.
Husmeier, D. (2003). Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics, Vol. 19, pp. 2271-2282.
Huynen, M. (2000). The identification of functional modules from the genomic association of genes. Genome Res., Vol. 10, pp. 1204-1210.
Ito, T. et al. (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proceedings of the National Academy of Sciences, Vol. 98, No. 8, pp. 4569-4574.
Itzkovitz, S. & Alon, U. (2005). Subgraphs and network motifs in geometric networks. Physical Review E, Vol. 71, pp. 026117-1-026117-9, ISSN 1539-3755.
Jeong, H. et al. (2000). The large-scale organization of metabolic networks. Nature, Vol. 407, pp. 651-654.
Joshi-Tope, G. et al. (2005). Reactome: a knowledgebase of biological pathways. Nucleic Acids Res. Database Issue, Vol. 33, pp. D428-32.
Kanehisa, M. & Goto, S. (2000). KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, Vol. 28, No. 1, pp. 27-30.
Karp, P. et al. (2002). The EcoCyc Database. Nucleic Acids Research, Vol. 30, No. 1, pp. 56-58.
Karp, P. et al. (2005). Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Research, Vol. 33, No. 19, pp. 6083-6089.
Kauffman, K. et al. (2003). Advances in flux balance analysis. Current Opinion in Biotechnology, Vol. 14, pp. 491-496.
Kashtan, N. & Alon, U. (2005). Spontaneous evolution of modularity and network motifs. Proc Natl Acad Sci USA, Vol. 102, pp. 13773-13778.
Kanehisa, M. et al. (2004). The KEGG resource for deciphering the genome. Nucleic Acids Res., Vol. 32, pp. D277-280.
Karp, P. & Riley, M. (1994). Representations of metabolic knowledge: pathways. In: Altman, R.; Brutlag, D.; Karp, P.; Lathrop, R. & Searls, D. (eds.), Second International Conference on Intelligent Systems for Molecular Biology, AAAI Press, Menlo Park, CA.
Keedwell, E. & Narayanan, A. (2005). Discovering gene networks with a neural-genetic hybrid. IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 2, pp. 231-242.
Kelley, P. et al. (2003). Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proc. of National Academy of Sciences USA, Vol. 100, No. 20, pp. 11394-11395.
Keseler, I. et al. (2005). EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Research, Vol. 33, No. 1.
Lee, T. et al. (2002). Transcriptional regulatory networks in Saccharomyces cerevisiae. Science, Vol. 298, pp. 799-804.
Lenski, R.E. et al. (2006). Balancing robustness and evolvability. PLoS Biol., Vol. 4.
Li, S. et al. (2004). A map of the interactome network of the metazoan C. elegans. Science, Vol. 303, pp. 540-543.
Li, X. et al. (2006). Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling. BMC Bioinformatics, Vol. 7, pp. 26-46.
Lipson, H. et al. (2002). On the origin of modular variation. Evolution, Vol. 56, pp. 1549-1556.
Ma'ayan, A. et al. (2006). Topology of resultant networks shaped by evolutionary pressure. Phys Rev E, Vol. 73, pp. 061912.
McGarvey, P. B. (2000). PIR: a new resource for bioinformatics. Bioinformatics, Vol. 16, pp. 290-291.
Mewes, H. et al. (2002). MIPS: a database for genomes and protein sequences. Nucleic Acids Research, Vol. 30, No. 1, pp. 31-34.
Milo, R. et al. (2002). Network motifs: simple building blocks of complex networks. Science, Vol. 298, No. 5594, pp. 824-827.
Myers, C. R. (2003). Software systems as complex networks: structure, function, and evolvability of software collaboration graphs. Physical Review E, Vol. 68.
Overbeek, R. et al. (2000). WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction. Nucleic Acids Research, Vol. 28, No. 1, pp. 123-125.
Ozsoyoglu, M. et al. (2006). Genomic Pathways Database and biological data management. Animal Genetics, Vol. 37, pp. 41-47.
Paladugu, S.R. et al. (2006). In silico evolution of functional modules in biochemical networks. IEE Proc Syst Biol., Vol. 153, pp. 223-235.
Pfeiffer, T. et al. (2005). The evolution of connectivity in metabolic networks. PLoS Biol., Vol. 3.
Qian, J. et al. (2001). Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions. J Mol Biol, Vol. 314, pp. 1053-1066.
Rain, J. et al. (2001). The protein-protein interaction map of Helicobacter pylori. Nature, Vol. 409, pp. 211-215.
Reguly, T. et al. (2006). Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. J Biol., Vol. 5, No. 11.
Rives, A.W. & Galitski, T. (2003). Modular organization of cellular networks. Proc Natl Acad Sci U S A, Vol. 100, pp. 1128-1133.
Salgado, H. et al. (2006). The comprehensive updated regulatory network of Escherichia coli K-12. BMC Bioinformatics, Vol. 7, No. 5.
Salgado, H. et al. (2006). RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions. Nucleic Acids Research, Vol. 34, No. 1.
Schlosser, G. & Wagner, G.P. (2004). Modularity in development and evolution. Chicago: University of Chicago Press.
Segal, E. et al. (2004). A module map showing conditional activity of expression modules in cancer. Nat Genet, Vol. 36, pp. 1090-1098.
Segre, D. et al. (2003). From annotated genomes to metabolic flux models and kinetic parameter fitting. Omics, Vol. 7, No. 3, pp. 301-316.
Segre, D. et al. (2005). Modular epistasis in yeast metabolism. Nat Genet, Vol. 37, pp. 77-83.
Shlomi, T. et al. (2006). QPath: a method for querying pathways in a protein-protein interaction network. BMC Bioinformatics, Vol. 7, No. 199.
Shmulevich, I. et al. (2002). From Boolean to probabilistic Boolean networks as models of genetic regulatory networks. Proceedings of the IEEE, Vol. 90, pp. 1778-1790.
Slonim, N. et al. (2006). Ab initio genotype-phenotype association reveals intrinsic modularity in genetic networks. Mol Syst Biol., Vol. 2.
Snel, B. & Huynen, M.A. (2004). Quantifying modularity in the evolution of biomolecular systems. Genome Res, Vol. 14, pp. 391-397.
Sprinzak, D. & Elowitz, M.B. (2005). Reconstruction of genetic circuits. Nature, Vol. 438, pp. 443-448.
Stoesser, G. et al. (2002). The EMBL nucleotide sequence database. Nucleic Acids Research, Vol. 30, pp. 21-26.
Soyer, O.S. & Bonhoeffer, S. (2006). Evolution of complexity in signaling pathways. Proc Natl Acad Sci U S A, Vol. 103, pp. 16337-16342.
Soyer, O.S. et al. (2006). Simulating the evolution of signal transduction pathways. J Theor Biol., Vol. 241, pp. 223-232.
Tateno, Y. (2002). DNA Data Bank of Japan (DDBJ). Nucleic Acids Research, Vol. 30, pp. 27-30.
Tu, Z. et al. (2006). An integrative approach for causal gene identification and gene regulatory pathway inference. Bioinformatics, Vol. 22, No. 14, pp. e489-96.
Von Dassow, G. (2000). The segment polarity network is a robust developmental module. Nature, Vol. 406, No. 6792, pp. 188-192.
Von Mering, C. et al. (2002). Comparative assessment of large-scale data sets of protein-protein interactions. Nature, Vol. 417, pp. 399-403.
Wilke, C. O. & Adami, C. (2003). Evolution of mutational robustness. Mutat Res., Vol. 522, pp. 3-11.
118
Advanced Technologies
Xenarios, I. et al. (2000). DIP: the database of interacting proteins. Nucleic Acids Research, Vol. 28, No. 1, pp.289–291. Wolf, J.B. (2000). Epistasis and the evolutionary process. Oxford: Oxford University Press. Yanai, I., et al. (2001). Genes linked by fusion events are generally of the same functional category: a systematic analysis of 30 microbial genomes. Proc. Natl. Acad. Sci. USA, Vol. 98, pp. 7940–7945. Yamanishi, Y. et al. (2005). Supervised enzyme network inference from the integration of genomic data and chemical information. ISMB (Supplement of Bioinformatics), pp. 468-477. Yamanishi, Y. et al. (2007). Prediction of missing enzyme genes in a bacterial metabolic network. FEBS J., Vol. 274, No. 9, pp. 2262-73. Zou, M. & Conzen, SD (2005). A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics, Vol. 21, pp. 71-79.
X8
Harmonics Modelling and Simulation
Dr. Rana Abdul Jabbar Khan and Muhammad Junaid
Rachna College of Engineering & Technology, Gujranwala, Pakistan

1. Introduction
In this modern era, it is necessary for regulated and deregulated power sectors to properly monitor power system signals in order to be able to access and maintain the quality of power according to the set standards. Harmonics are sinusoidal voltages or currents having frequencies that are integer multiples of the fundamental frequency (50 or 60 Hz) at which the supply system is designed to operate. The identification, classification, quantification and mitigation of power system harmonic signals is a burning issue for various stakeholders, including utilities, consumers and manufacturers worldwide. To accomplish this task, mathematical and computational tools like MATLAB and the Electrical Transient Analyzer Program (ETAP) have been used while conducting this research. Experimental work and simulation pertaining to harmonics will really help the scientific community to understand this phenomenon comprehensively and to gain the in-advance information required for remedial measures. This chapter comprises the following:
- Harmonics background and their analysis
- Harmonics modelling and simulation at high and low distribution voltage level
2. Harmonics background
Over recent years, there has been a considerable increase in the installation and use of electronic devices in electrical power systems, revealing non-linear behavior. They draw current which is non-sinusoidal in nature because of the rectification/inversion phenomena of their operation. The reason for this non-sinusoidal/distorted current is the presence of harmonic contents in the current waveform drawn by this electronic equipment.
3. Harmonics Modelling & Simulation at High Voltage (HV) level
In this chapter, harmonics modelling and simulation has been performed at power distribution voltage levels. For this purpose the chapter is sub-divided into two main parts. The first part deals with harmonics modelling and simulation at the High Voltage (HV) distribution level, which is 11 kV in most countries including Pakistan.
A practical case of an independent 11 kV furnace feeder has been discussed in this section. The modern induction furnace is an example of a large non-linear load. Its operational and economic impacts have been analyzed comprehensively.
4. Operational and Economic Impacts of Large distorted current drawn by modern induction furnaces
The modern induction furnace draws heavy current with considerable distortion in the current waveform. This section focuses on the indirect consequences caused by the distorted current waveform drawn by modern induction furnaces, in terms of operational and economic impacts. This heavy distorted current causes distortion in the system voltage as well. Owing to insulation limitations on an 11 kV line, it is very taxing for modern Power Quality Analyzers to capture the exact distorted voltage waveform. However, the distorted current waveform with high amplitude can be captured by these analyzers by virtue of the appropriate CT ratios of their current clamps. Using the Fast Fourier Transform (FFT) of the current waveform, a new mathematical approach using MATLAB has been developed for the exact modelling of the distorted voltage waveform. This approach has further been worked out to derive a mathematical relation to compute the THDv, which also shows its trend as a function of the distance between the supply (utility) and the induction furnace (load). The remaining impacts are the derivation of distortion power, extra active and reactive power in the line, the percentage increase in system losses, displacement and true power factor measurement and, finally, the calculation of the extra bill charged to consumers. The above-mentioned parameters have been derived mathematically and simulated in MATLAB, which also demonstrates their drift as a function of distance from grid to furnace at the 11 kV voltage level.
4.1 Block diagram of modern induction furnace
To illustrate the basic operation of the modern induction furnace, the generalized block diagram is given in Figure 1, which is self-explanatory.
Fig. 1. Block diagram of Induction Furnace
4.2 Single line diagram of case under study
The single line diagram of the 11 kV furnace feeder is given in Figure 2. The monitoring points (M1, M2, M3 and M4) have clearly been depicted in Figure 2. At these points the
waveform, RMS value, FFT and THD (following the IEEE standards) of the distorted current have been recorded using the power quality analyzer "Fluke 43B", making sure that the furnaces were running at considerable load. This type of analyzer is capable of recording the FFT of a distorted current waveform up to the 50th harmonic present in that waveform. Since all the furnaces are three phase and the load unbalance is negligible, only readings from the red phase of the 11 kV furnace feeder have been recorded for analysis. The data obtained at the monitoring points shown in Figure 2 is given in Table 1.

Monitoring Point          RMS value of Current (A)    (%) THDi
M-1 (Furnace-A)           62.20                       21.2
M-2 (Furnace-B)           160.9                       17.8
M-3 (Furnace-C)           136.5                       13.2
M-4 (Coupling point F)    358.1                       14.5

Table 1. RMS Current and (%) THDi at different Monitoring Points
Fig. 2. Single Line diagram of 11 kV furnace feeder
4.3 Description of captured current waveforms at various monitoring points
a. Monitoring point M1 (Furnace-A)
The current waveform captured at monitoring point M1, having an RMS value of 62.2 A, is shown in Figure 3. It is clearly visible that this current waveform is non-sinusoidal. The FFT spectrum of the distorted waveform of Figure 3 is given in Figure 4, which clearly indicates the significant presence of 5th, 7th, 11th and 13th harmonics. However, the higher order harmonics are considerably small as compared to the above mentioned harmonics.
Fig. 3. Current waveform at red phase of furnace A
Fig. 4. Current spectrum at red phase of furnace-A
b. Monitoring point M2 (Furnace-B)
The current waveform captured at monitoring point M2, having an RMS value of 160.9 A, is shown in Figure 5, which is also distorted.
Fig. 5. Current waveform at red phase of furnace-B
The FFT spectrum of the distorted current waveform of Figure 5 is shown in Figure 6, which shows the considerable presence of 5th, 7th, 11th, 13th and 17th harmonics. It is obvious that harmonics higher than the 17th order are significantly smaller in magnitude as compared to the above mentioned harmonics.
Fig. 6. Current spectrum at red phase of furnace-B
c. Monitoring point M3 (Furnace-C)
The current waveform captured at monitoring point M3, having an RMS value of 136.5 A, is shown in Figure 7. Again, it is clearly visible that this waveform is significantly distorted.
Fig. 7. Current waveform at red phase of furnace-C
The FFT spectrum of the distorted waveform of Figure 7 is given in Figure 8, which clearly indicates the significant presence of 7th, 11th, 13th, 17th, 21st, 23rd, etc. harmonics. However, harmonics of order higher than the 23rd are visibly small as compared to the above mentioned odd harmonics. The operating frequency of furnace-C is 650 Hz, while the other furnaces operate at 600 Hz. Due to this difference in frequency, the current waveform drawn by furnace-C is different from those of the other two furnaces.
Fig. 8. Current spectrum at red phase of furnace-C
d. Monitoring point M4 (common coupling point F)
The current waveform captured at monitoring point M4 (common coupling point F), having an RMS value of 358.1 A, is shown in Figure 9 and its FFT is shown in Figure 10.
Fig. 9. Current Waveform at coupling point F
The FFT spectrum of the distorted waveform of Figure 9 is given in Figure 10, which clearly indicates the significant presence of 5th, 7th, 11th and 13th harmonics. However, the presence of higher order harmonics seems to be insignificant, as they are smaller in magnitude in comparison with the previously mentioned odd harmonics. It is obvious that the reactance of the line increases directly with the frequency of the current passing through it. Therefore, these higher order harmonics also contribute significantly to voltage distortion. It is worth mentioning that the triplen harmonics (3rd, 9th, 15th, etc.) of all the waveforms shown above have been trapped by the delta winding of the line transformers of all the furnaces. The position of the line transformer has clearly been illustrated in the block diagram of the furnace shown in Figure 1.
Fig. 10. Current spectrum at coupling point F
Based upon the discussion carried out in previous sections, M4, which is the reference point (Common Coupling Point F), will be focussed on for modelling of the distorted voltage waveform in terms of the distance "d" from terminal pole T (starting point of the feeder) up to the coupling point F. In later sections, for mathematical analysis, the monitoring point M4 will be denoted by F, as shown in the single line diagram of Figure 2.
4.4 Impedance diagram for the case under study
For the derivation of the mathematical expressions, the impedance diagram corresponding to the single line diagram of Figure 2 is shown in Figure 11. The current at M4 is taken as the reference current. For simplification of the analysis, all three furnaces (loads) are replaced with an equivalent load F.
Fig. 11. Impedance diagram corresponding to single line diagram
Where:
Vs = Voltage at source (grid), pure sinusoid
VF = Voltage at load (F) terminal, to be calculated
IF = Common Coupling Point Current
Rcable = Resistance of 100 meters of 500 MCM cable (in Ohms)
Lcable = Inductance of 100 meters of 500 MCM cable (in Henry)
Rosp = Resistance of Osprey conductor (in Ohms/km)
Losp = Inductance of Osprey conductor (in Henry/km)
d = Length of Osprey conductor from terminal pole T to Common Coupling Point F
The impedance diagram will be solved using phasor algebra in two parts: first separately for the fundamental frequency current (IF1), and second for the harmonic frequency currents (IFn, where n = 2 to 50). Taking IF1 as the reference current, the phasor diagram for the fundamental frequency current is shown in Figure 12.
Fig. 12. Phasor diagram for fundamental frequency current
Here:
VF1X = Vs cos(θ) – VR1
VF1Y = Vs sin(θ) – VX1
where cos(θ) is the displacement power factor at the grid = 0.95, and
VR1 = IF1 (2Rosp d + 2Rcable)
VX1 = IF1 (2πf(2Losp d) + 2πf(2Lcable)), where f is the fundamental frequency, i.e., 50 Hz; therefore,
VF1X = Vs cos(θ) – IF1 (2Rosp d + 2Rcable)
VF1Y = Vs sin(θ) – IF1 (4πf Losp d + 4πf Lcable)
Using the Pythagoras theorem:
VF1 = \sqrt{VF1X^2 + VF1Y^2}   (1)
θF1 = tan^{-1} (VF1Y / VF1X)
By putting the values of VF1X and VF1Y:
VF1 = \sqrt{e1 d^2 + f1 d + g1}   (2)
Where:
e1 = [(2 IF1 Rosp)^2 + (4πf IF1 Losp)^2] = 1.1616×10^5
f1 = [(8 IF1^2 Rosp Rcable – 4 IF1 Vs cos(θ) Rosp) + (32 IF1^2 π^2 f^2 Losp Lcable – 8 Vs sin(θ) IF1 πf Losp)] = –6.7121×10^6
g1 = [(Vs cos(θ) – 2 IF1 Rcable)^2 + (Vs sin(θ) – 4 IF1 πf Lcable)^2] = 2.4135×10^8
For any nth harmonic current IFn, the source voltage Vs behaves as a short circuit because it is purely sinusoidal in nature, and the load F acts as a source of harmonic current. Taking IFn as the reference current and assuming Vs as zero, the required phasor diagram is shown in Figure 13.
Fig. 13. Phasor diagram for harmonic currents
Here:
θn = Phase angle of the nth harmonic current with respect to IF1, and
VFnX = IFn (2Rosp d + 2Rcable)
VFnY = IFn (2πnf(2Losp d) + 2πnf(2Lcable))
Using the Pythagoras theorem:
VFn = \sqrt{VFnX^2 + VFnY^2}   (3)
θvn = tan^{-1} (VFnY / VFnX)
θFn = Phase angle of VFn with reference to IF1 = θvn + θn
Putting the values of VFnX and VFnY in Equ. (3) and squaring both sides:
VFn^2 = [IFn^2 {(2Rosp)^2 + (4πnf Losp)^2}] d^2 + [IFn^2 {8 Rosp Rcable + 32 π^2 n^2 f^2 Losp Lcable}] d + [IFn^2 {4 Rcable^2 + 16 π^2 n^2 f^2 Lcable^2}]
Taking the square root of both sides:
VFn = \sqrt{an d^2 + bn d + cn}   (4)
Where:
an = IFn^2 {(2Rosp)^2 + (4πnf Losp)^2}
bn = IFn^2 {8 Rosp Rcable + 32 π^2 n^2 f^2 Losp Lcable}
cn = IFn^2 {4 Rcable^2 + 16 π^2 n^2 f^2 Lcable^2}
Here n = 2 to 50; in summation form:

\sum_{n=2}^{50} VFn = \sum_{n=2}^{50} \sqrt{an d^2 + bn d + cn}   (5)
Now the final value of the voltage VF(t) in the time domain is:

VF(t) = VF1 sin(2πft + θF1) + \sum_{n=2}^{50} VFn sin(2πnft + θFn)   (6)
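Equ. (6) lends itself directly to numerical evaluation. The following MATLAB sketch illustrates the procedure; the line parameters and the harmonic current magnitudes and phases in it are illustrative placeholder assumptions, not the measured case-study data:

% Illustrative reconstruction of the distorted voltage waveform, Equ.'s (1)-(6).
% All line parameters and harmonic currents below are assumed placeholder values.
f = 50; d = 5;                        % fundamental frequency (Hz), distance (km)
Rosp = 0.336; Losp = 1.2e-3;          % assumed Osprey parameters (Ohm/km, H/km)
Rcable = 0.02; Lcable = 0.05e-3;      % assumed 100 m cable parameters (Ohm, H)
Vs = 11e3*sqrt(2)/sqrt(3);            % source peak phase voltage (V)
th = acos(0.95);                      % displacement angle at the grid
IF1 = 358.1*sqrt(2);                  % fundamental current peak (A), cf. Table 1
n   = [5 7 11 13];                    % dominant harmonic orders (Figure 10)
IFn = IF1*[0.10 0.06 0.04 0.03];      % assumed harmonic current magnitudes
thn = zeros(size(n));                 % assumed harmonic phase angles (rad)
% Fundamental load voltage, Equ.'s (1)-(2)
VF1X = Vs*cos(th) - IF1*(2*Rosp*d + 2*Rcable);
VF1Y = Vs*sin(th) - IF1*(4*pi*f*Losp*d + 4*pi*f*Lcable);
VF1  = hypot(VF1X, VF1Y); thF1 = atan2(VF1Y, VF1X);
% Harmonic voltage drops, Equ.'s (3)-(4)
VFnX = IFn.*(2*Rosp*d + 2*Rcable);
VFnY = IFn.*(4*pi*n*f*Losp*d + 4*pi*n*f*Lcable);
VFn  = hypot(VFnX, VFnY); thFn = atan2(VFnY, VFnX) + thn;
% Time-domain voltage, Equ. (6)
t  = linspace(0, 0.04, 2000);         % two cycles at 50 Hz
vF = VF1*sin(2*pi*f*t + thF1);
for k = 1:numel(n)
    vF = vF + VFn(k)*sin(2*pi*n(k)*f*t + thFn(k));
end
plot(t, vF); xlabel('Time (s)'); ylabel('V_F (V)');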
4.5 Operational Impacts
a. Simulation of voltage waveforms using MATLAB
By implementing Equ. (6) in MATLAB, the distorted voltage waveform at any point along the distance d from terminal point T to common coupling point F can be generated. For instance, distorted voltage waveforms and their FFTs have been generated at different points between terminal pole T and coupling point F, as given below:
Fig. 14. Voltage waveform at terminal pole T
Fig. 15. FFT Spectrum of voltage waveform shown in Figure 14
Fig. 16. Voltage waveform at 2.5 KM from Terminal pole T
Fig. 17. FFT Spectrum of voltage waveform shown in Figure 16
Fig. 18. Distorted Voltage waveform at common coupling point F
Fig. 19. FFT corresponding to waveform shown in Figure 18
The above waveforms, along with their FFTs, signify the fact that the distortion in voltage increases as it travels from terminal pole (T) to common coupling point (F).
b. Calculation of (%) THDv in terms of distance
The expression for computing the THDv of VF is:
THDv = \sqrt{ \sum_{n=2}^{50} VFn^2 } / VF1

By putting the values of VF1 and VFn from Equ.'s (2) and (5) in the above expression, the following expression is obtained:

THDv = \sqrt{ \sum_{n=2}^{50} (an d^2 + bn d + cn) } / \sqrt{e1 d^2 + f1 d + g1}, where d ≤ 5 (km)   (7)
Equ. (7) expresses THDv as a function of the distance d in kilometers from terminal pole (T) to common coupling point (F). The computed values of % THDv have been tabulated in Table 2.

Distance d (km)    (%) THDv
0.0                0.2178
0.5                1.7163
1.0                3.2357
1.5                4.7759
2.0                6.3370
2.5                7.9191
3.0                9.5222
3.5                11.146
4.0                12.792
4.5                14.458
5.0                16.145

Table 2. Variation of (%) THDv with distance
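A short MATLAB sketch of Equ. (7) follows. The fundamental coefficients e1, f1 and g1 are the values quoted with Equ. (2); the harmonic coefficient vectors an, bn and cn would in practice be computed from the recorded spectrum, so the two-harmonic set used here is an assumed placeholder:

% Sketch of Equ. (7): (%) THDv as a function of the distance d.
e1 = 1.1616e5; f1 = -6.7121e6; g1 = 2.4135e8;   % values from Equ. (2)
an = [4e3 2e3]; bn = [3e4 1e4]; cn = [2e5 1e5]; % assumed harmonic coefficients
d  = 0:0.5:5;                                   % distance from T to F (km)
num = zeros(size(d));
for k = 1:numel(an)
    num = num + (an(k)*d.^2 + bn(k)*d + cn(k)); % accumulates the VFn^2 terms
end
THDv = 100*sqrt(num)./sqrt(e1*d.^2 + f1*d + g1);% Equ. (7), in percent
plot(d, THDv); xlabel('d (km)'); ylabel('% THD_v');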
In order to illustrate the trend of increase in THDv from terminal pole (T) to common coupling point (F), a graph has been drawn by using the values given in the above table. This graph shows that the trend of increase in THDv with distance is approximately linear.
Fig. 20. Graph showing trend of (%) THDv with increasing distance d
c. Distortion Power
Following the power triangle, P, Q and S are the active, reactive and apparent powers respectively. These are the powers present in the system in the case of a linear load. But in the case of harmonics in the system, another power is introduced, which is called the distortion power and is represented by D. Figure 21 shows the association among these powers in the case of a non-linear load.
Fig. 21. Change in power triangle in case of non-linear load
Now the apparent power S can be calculated as:

S = VFrms × IFrms   (8)

Where:
VFrms = VF1rms \sqrt{1 + THDv^2}, and VF1rms = VF1 / \sqrt{2}
IFrms = IF1rms \sqrt{1 + THDi^2}, and IF1rms = IF1 / \sqrt{2}
The total active power P and the reactive power Q are given by:

P = P1 + \sum_{n=2}^{50} Pn
Q = Q1 + \sum_{n=2}^{50} Qn

After simplification, the required solutions in terms of the distance d from starting point (T) to common coupling point (F) are given in Equ.'s (9) and (10) respectively:

P = (IF1 cos(θ)/2) \sqrt{e1 d^2 + f1 d + g1} + \sum_{n=2}^{50} [ (IFn cos(θFn)/2) \sqrt{an d^2 + bn d + cn} ]   (9)

Q = (IF1 sin(θ)/2) \sqrt{e1 d^2 + f1 d + g1} + \sum_{n=2}^{50} [ (IFn sin(θFn)/2) \sqrt{an d^2 + bn d + cn} ]   (10)
Now, the distortion power D is given by the following formula:

D = \sqrt{S^2 – P^2 – Q^2}   (11)
The expression derived in terms of the distance d from terminal pole (T) up to common coupling point (F) is for the measurement of the distortion power D, and it also describes the relation between the distortion power (D) and the distance (d). In this case the value of d is fixed at 5 km, but it can be generalized by putting in the value of the displacement power factor along with the rest of the single line diagram parameters. Simulating these Equ.'s in MATLAB, the results obtained at different values of the distance d are given in Table 3.

Distance d (km)    Distortion Power (VA)
0.0                16.86×10^5
0.5                16.44×10^5
1.0                16.23×10^5
1.5                16.24×10^5
2.0                16.46×10^5
2.5                16.88×10^5
3.0                17.50×10^5
3.5                18.28×10^5
4.0                19.22×10^5
4.5                20.29×10^5
5.0                21.46×10^5

Table 3. Distortion Power (D) trend from Point T to F
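Equ.'s (8)-(11) can be sketched in MATLAB as shown below. The sketch reuses the variables VF1, VFn, thF1, thFn, IF1, IFn and th from the earlier voltage-waveform sketch; the per-unit distortion values are taken from Tables 1 and 2 for d = 5 km:

% Sketch of Equ.'s (8)-(11): apparent, active, reactive and distortion power.
% VF1, VFn, thFn, IF1, IFn and th are assumed available from the earlier sketch.
THDv = 0.16145; THDi = 0.145;            % per-unit distortions at d = 5 km
VF1rms = VF1/sqrt(2); IF1rms = IF1/sqrt(2);
VFrms = VF1rms*sqrt(1 + THDv^2);         % Equ. (8) components
IFrms = IF1rms*sqrt(1 + THDi^2);
S = VFrms*IFrms;                         % apparent power
P = IF1*cos(th)/2*VF1 + sum(IFn.*cos(thFn)/2.*VFn);  % Equ. (9)
Q = IF1*sin(th)/2*VF1 + sum(IFn.*sin(thFn)/2.*VFn);  % Equ. (10)
D = sqrt(S^2 - P^2 - Q^2);               % Equ. (11), distortion power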
d. Extra Active & Reactive Power
The active and reactive powers due to the presence of harmonics can be derived in terms of the distance d from point T to F as:
\sum_{n=2}^{50} Pn = 3 ( \sum_{n=2}^{50} [ (IFn cos(θFn)/2) \sqrt{an d^2 + bn d + cn} ] )   (12)

\sum_{n=2}^{50} Qn = 3 ( \sum_{n=2}^{50} [ (IFn sin(θFn)/2) \sqrt{an d^2 + bn d + cn} ] )   (13)
Implementing the Equ.’s (12) and (13) in MATLAB, the resulted values in terms of distance are given in Table 4. Distance d (km’s) 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
Extra Active Power (W) 180.96 975.46 1769.99 2564.54 3.36×103 4.16×103 4.95×103 5.75×103 6.54×103 7.33×103 8.13×103
Extra Reactive Power (VAR) 2536.57 19.86 3.72×103 5.45×104 7.48×104 8.92×104 11.85×104 12.38×104 14.11×104 15.85×104 17.58×104
Table 4. Extra Active & Reactive Power in terms of distance
e. Percentage Increase in Line Losses
The losses due to the ohmic resistance of the line are known as line losses or copper losses (I^2R). In the case of harmonic currents, these losses can be calculated as:

WC = 3 [ IF1^2 (Rcable + Rosp d) + \sum_{n=2}^{50} IFn^2 (Rcable + Rosp d) ]

The percentage increase in line losses can be determined from the following relation:

% Increased Losses = (Itrue,rms^2 R – Ifund,rms^2 R) / (Ifund,rms^2 R) × 100
Itrue,rms^2 = Ifund,rms^2 (1 + THDi^2)
% Increased Losses = THDi^2 × 100   (14)
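A minimal MATLAB rendering of Equ. (14), which the numeric application below then follows:

% Percentage increase in line losses, Equ. (14)
THDi = 0.145;                 % measured current distortion at coupling point F
extra_losses = THDi^2 * 100   % = 2.1025 %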
THDi = 0.145
% Increased Losses = 2.1025 %
Compared with the allowable permissible limit of technical losses, i.e. 5%, this considerable increase in technical losses due to the presence of harmonic currents is really alarming for utilities and furnace owners.
f. Displacement and True Power Factors
The power factor of the fundamental frequency components of the voltage and current is known as the Displacement Power Factor (DPF), whereas the ratio of the total active power to the total apparent power, including harmonic contents, is called the True Power Factor (TPF). Mathematically:

DPF = P1/S1   (15)
TPF = P/S   (16)
The values of DPF and TPF obtained from MATLAB are given in Table 5.

Distance (km)    DPF       TPF       Difference
0.0              0.9504    0.9406    0.0098
0.5              0.9530    0.9431    0.0099
1.0              0.9556    0.9454    0.0102
1.5              0.9581    0.9474    0.0107
2.0              0.9606    0.9491    0.0115
2.5              0.9631    0.9505    0.0126
3.0              0.9655    0.9517    0.0138
3.5              0.9679    0.9525    0.0154
4.0              0.9702    0.9530    0.0172
4.5              0.9725    0.9532    0.0193
5.0              0.9747    0.9530    0.0217

Table 5. Displacement and True Power Factor
The graph in Figure 22 shows the behavior of DPF and TPF in terms of distance. Here the DPF follows a linear pattern, whereas the TPF follows a curved pattern. The reason for this curved pattern is the quadratic behavior of Equ.'s (15) & (16).
Fig. 22. Graph of DPF and TPF
4.6 Economic Impact
Table 4 shows the extra active power flow in the line; its value calculated at the Common Coupling Point (M4) is 8.13 KW. On the basis of this value, the considerable extra bill charged to the furnace owners can be calculated using the tariff of the utility. Table 6 shows the Sanctioned Load and Maximum Demand Indicator (MDI) of all three furnaces. These MDI's are of the same month in which the readings were taken during monitoring.

Furnace    Sanctioned Load (KW)    MDI (KW)
A          1160                    1008
B          3200                    2442
C          2500                    2083
TOTAL      6860                    5533

Table 6. Sanctioned Load and MDI of all the Furnaces
The tariff for independent industrial (furnace) consumers is given in Table 7:

Charges (Rs.)               Off Peak    Peak
Fixed (per KW per Month)    305.00      305.00
Variable (per KWh)          3.88        6.97

Table 7. Tariff for Furnace Consumers
Extra Active Power P = 8.13 KW
Off Peak Units (per month) = 8.13 × 20 × 30 = 4878 KWh
Peak Units (per month) = 8.13 × 4 × 30 = 975 KWh
Fixed Charges = 305 × MDI = Rs. 1,687,565
Variable Charges:
Off Peak (O) = 4878 × 3.88 = 18,927
Peak (P) = 975 × 6.97 = 6,796
TOTAL = (O) + (P) = Rs. 25,723
GST (15% of Variable Charges) = Rs. 3,858
Income Tax = Rs. 2,000
Additional Charges = Rs. 34,212
TOTAL EXTRA BILL = Rs. 17,53,358/-
This considerable extra bill charged by the utility will be divided among the furnaces according to their KWh and MDI.
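The billing arithmetic above can be reproduced directly; a MATLAB sketch using the figures of Tables 6 and 7 (the document rounds the peak units down to 975 kWh):

% Extra bill caused by the extra active power, using Tables 6 and 7
P    = 8.13;                         % extra active power (kW)
MDI  = 5533;                         % total MDI of the three furnaces (kW)
offU = P*20*30;                      % off-peak units per month (kWh) = 4878
pkU  = floor(P*4*30);                % peak units per month (kWh) = 975
fixed  = 305*MDI;                    % fixed charges (Rs.) = 1,687,565
varChg = offU*3.88 + pkU*6.97;       % variable charges (Rs.) ~ 25,723
gst    = 0.15*varChg;                % 15% GST on variable charges ~ 3,858
bill = fixed + varChg + gst + 2000 + 34212;   % plus income tax and additions
fprintf('Total extra bill: Rs. %.0f\n', bill) % ~ Rs. 1,753,358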
4.7 Conclusion of operational and economic impacts of distorted current drawn by modern induction furnaces
The novel approach presented here will really help power sector stakeholders to measure the voltage distortion at any desired location, irrespective of the voltage level. Moreover, this newly developed methodology can be generalized to solve any type of electric circuit involving non-linear loads. The above section reveals that the presence of large distorted currents in a power distribution network adversely affects the performance of the system both operationally and economically. Its indirect consequences result in network overloading even under normal operating conditions. The life of the power system is reduced, and excessive system losses can sometimes damage power/distribution transformers and furnace installations. This novel approach will really open the pathway for researchers and scientists in the mitigation of these impacts in future.
5. Harmonics Modelling & Simulation at Low Voltage (LV) level
This section deals with mathematical modelling and simulation at the low voltage (LV), i.e. 400 volts (secondary distribution voltage), level. Computer load has been selected as the case study for mathematical modelling and simulation.
6. Mathematical modelling of current harmonics caused by personal computers
Personal computers draw non-sinusoidal current, with the odd harmonics being the most significant. The power quality of distribution networks is severely affected by the flow of these generated harmonics during the operation of electronic loads. In this section, the mathematical modelling of the odd harmonics in current, like the 3rd, 5th, 7th and 9th, influencing the power quality has been presented. Live signals have been captured with the help of a power quality analyzer for analysis purposes. The interesting feature that the Total Harmonic Distortion (THD) in current decreases with the increase of nonlinear loads has been verified theoretically. The results obtained using the mathematical expressions have been compared with the practical results.
6.1 Methodology and instrumentation
In this case study, various computers were connected to the mains of the power supply one by one, and the effect of each computer on the current waveform of the mains was recorded. Figure 23 indicates the hardware arrangement and apparatus used during the experimental work. As is evident from Figure 23, inputs for the various computers under test are drawn one by one from the AC mains. The waveforms of the odd harmonics and the THD have been observed and recorded. This data has been used to make observations about the changes and effects of electronic loads. The following equipment has been used for the experimental work:
i) A power quality analyzer was used to record the current waveforms and THD.
ii) Personal Computer (PC) details are as under: Pentium (R) 4 CPU 2.40 GHz
ATX Power supply 220 to 230 Volts
Monitor 15 inch (100-240 V, 50/60 Hz, 0.8-1.5 A)
Fig. 23. Hardware arrangement
6.2 Results and discussions
PC's numbering from PC 1 to PC 23 were connected to the AC mains gradually, and then the waveforms of the odd harmonics and the THD in current were captured and recorded in real time for observation. Table 8 describes the results taken for the Total Harmonic Distortion (THD) in current and the individual magnitudes of the odd harmonics corresponding to different numbers of PCs connected to the mains.

No. of PC's    % Mag. of 3rd harmonic    % Mag. of 5th harmonic    % Mag. of 7th harmonic    % Mag. of 9th harmonic    % THDi
1              50                        45                        37                        23                        79.3
4              53                        42                        25                        13                        74.6
7              54                        40                        21                        5                         72.3
10             56                        38                        16                        3                         70.1
13             58                        35                        12                        2                         68.6
16             58                        33                        8                         5                         66.2
19             57                        29                        6                         7                         64.0
21             58                        27                        4                         7                         62.8
23             58                        25                        0                         9                         61.4

Table 8. Online recorded results
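The curve fits derived in the next section (Equ.'s (17)-(20)) can be reproduced from the Table 8 data with MATLAB's polyfit; a sketch:

% Curve fitting the Table 8 data (cf. Equ.'s (17)-(20) below)
N  = [1 4 7 10 13 16 19 21 23];          % number of PCs
I3 = [50 53 54 56 58 58 57 58 58];       % % magnitude of 3rd harmonic
I5 = [45 42 40 38 35 33 29 27 25];       % 5th
I7 = [37 25 21 16 12  8  6  4  0];       % 7th
I9 = [23 13  5  3  2  5  7  7  9];       % 9th
p3 = polyfit(N, I3, 2)       % 2nd-order polynomial, cf. Equ. (17)
p5 = polyfit(N, I5, 1)       % straight line, cf. Equ. (18)
p7 = polyfit(log(N), I7, 1)  % logarithmic fit, cf. Equ. (19)
p9 = polyfit(N, I9, 2)       % 2nd-order polynomial, cf. Equ. (20)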
6.3 Graphical representation of results
a. 3rd Harmonic Only
From the graph shown in Figure 24, it is clear that the magnitude of the 3rd harmonic increases up to a certain level and then remains constant with further increase of the electronic load. Mathematically, using the curve fitting technique, the relation between columns 1 and 2 of Table 8 can be written as:
Y = -0.0248X^2 + 0.935X + 49.228
In the form of current and number of PCs:

I3 = -0.0248NPC^2 + 0.935NPC + 49.228   (17)
Fig. 24. Graphical representation of 3rd harmonic only
b. 5th Harmonic Only
From the graph shown in Figure 25, the magnitude of the 5th harmonic decreases in a linear fashion as the number of PC's connected to the supply mains increases; ultimately the magnitude of this odd harmonic approaches zero.
Fig. 25. Graphical representation of 5th harmonic only
Mathematically, using the curve fitting technique, the relation between columns 1 and 3 of Table 8 can be written as:
Y = -0.8961X + 46.239
In the form of the magnitude of the harmonic current and the number of PCs connected to the supply mains:

I5 = -0.8961NPC + 46.239   (18)

Where -0.8961 is the slope of the line and 46.239 is its y-intercept.
c. 7th Harmonic Only
Figure 26 indicates that the magnitude of the 7th harmonic decreases in a logarithmic fashion, rather than linearly as in the case of the 5th harmonic, as the number of PC's increases, and consequently it becomes zero.
Fig. 26. Graphical representation of 7th harmonic only
Mathematically, using the curve fitting technique, the relation between columns 1 and 4 of Table 8 can be written as:
Y = -11.278 ln(x) + 39.85
In the form of the magnitude of the harmonic current and the number of PCs connected to the supply mains:

I7 = -11.278 ln(NPC) + 39.85   (19)

Where ln(x) is the natural logarithmic function.
d. 9th Harmonic Only
From the graph shown in Figure 27, it is observed that the magnitude of the 9th harmonic follows a polynomial trend line of order 2, in contrast with the other harmonics; as the number of PC's increases, the magnitude of the 9th harmonic decreases.
Fig. 27. Graphical representation of 9th harmonic only
Mathematically, using the curve fitting technique, the relation between columns 1 and 5 of Table 8 can be written as:
Y = 0.1188X^2 – 3.3403X + 25.159
In the form of the magnitude of the harmonic current and the number of PCs connected to the supply mains:

I9 = 0.1188NPC^2 – 3.3403NPC + 25.159   (20)
Where the expression on the right-hand side is a polynomial of 2nd order; geometrically it represents the characteristics of a parabolic curve.
e. THD in Current
The percentage of Total Harmonic Distortion (%THD) can be defined in two different ways: as a percentage of the fundamental component (the IEEE definition of THD) or as a percentage of the rms (used by the Canadian Standards Association and the IEC).

THD = \sqrt{ \sum_{n=2} I_{rms,n}^2 } / I_1
Where I_{rms,n} is the amplitude of the harmonic component of order n (i.e., the nth harmonic). The numerator gives the RMS current due to all the harmonics, and I_1 is the RMS value of the fundamental component of the current only. Given above is the mathematical form of the IEEE definition of THD. According to IEC standards, the mathematical form of THD is given below:
THD = \sqrt{ \sum_{n=2} I_{rms,n}^2 } / I_{rms}, and

I_{rms} = \sqrt{ \sum_{n=1} I_{rms,n}^2 }
Where I_{rms,n} is the amplitude of the harmonic component of order n (i.e., the nth harmonic) and I_{rms} is the rms value of all the harmonics plus the fundamental component of the current. The latter standard is the one referred to in this study, because the apparatus used for the analysis was based on IEC Standards. The 3rd, 5th, 7th and 9th harmonics being the most significant, the definition of THD may be modified and written as:

THD = \sqrt{ I_{rms,3}^2 + I_{rms,5}^2 + I_{rms,7}^2 + I_{rms,9}^2 } / I_{rms}   (21)
The value of THD may be calculated for any number of computers using the above formula. Figure 28 shows the magnitudes of the individual harmonics when 4 PCs were connected to the supply mains.
Fig. 28. FFT of current waveform
Irms = 3.28 A
RMS magnitude of 3rd Harmonic = 53% of 3.28 = 1.7384 A
RMS magnitude of 5th Harmonic = 42% of 3.28 = 1.3776 A
RMS magnitude of 7th Harmonic = 25% of 3.28 = 0.8200 A
RMS magnitude of 9th Harmonic = 13% of 3.28 = 0.4264 A

THD = \sqrt{1.7384^2 + 1.3776^2 + 0.8200^2 + 0.4264^2} / 3.28
(%) THD = 73.26 %
The above Equ. can be modified as:

THD = \sqrt{ I_3^2 + I_5^2 + I_7^2 + I_9^2 }   (22)
Where I3, I5, I7 and I9 are the percentage magnitudes of the 3rd, 5th, 7th and 9th harmonics respectively. In this case it can be calculated as: THD = 73.26 %. In Table 8, the online value of the THD is 74.6%. The difference between the calculated and experimental values is 1.34, which is only 1.8%. This negligible difference, caused by the other odd harmonics being neglected, proves the validity of the measurement and consequently plays a pivotal role in the accurate analysis of the odd harmonics under test in this research. Figure 29 explains the overall impact of the individual harmonics cumulatively: the Total Harmonic Distortion (THD) in current decreases with the increase in electronic loads. As discussed in previous sections, among the odd harmonics only the third harmonic plays an active role, whereas the impact of the other odd harmonics with increasing electronic load is negligible.
Fig. 29. Graphical representation of THD in current
The relation between the THD in current and the number of PCs is given below:
It = 80.11 – 0.81 NPCs
This relation was not previously justified, but the analysis of the individual harmonics performed in this chapter justifies it too, as all the individual harmonic components except the 3rd harmonic decrease with the increasing number of PCs.
6.4 Justification of mathematical models
Odd harmonic currents can be calculated for any desired number of PCs using Equ.'s (17) to (20) given in the previous section. The obtained values of the odd harmonics can be used to calculate the THD in current using Equ. (22). For 10 PCs, the calculated values of the 3rd, 5th, 7th and 9th harmonic currents are given below:
I3 = 56.09 %
I5 = 37.28 %
I7 = 13.88 %
I9 = 3.636 %
THD = \sqrt{56.09^2 + 37.28^2 + 13.88^2 + 3.636^2}
THD = 68.86 %
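A MATLAB sketch reproducing these calculated values, directly from Equ.'s (17)-(20) and (22) (cf. Table 9 below):

% Reproducing the calculated values of Table 9 for a given number of PCs
NPC = 10;
I3 = -0.0248*NPC^2 + 0.935*NPC + 49.228;   % Equ. (17) -> 56.09
I5 = -0.8961*NPC + 46.239;                 % Equ. (18) -> 37.28
I7 = -11.278*log(NPC) + 39.85;             % Equ. (19) -> 13.88
I9 =  0.1188*NPC^2 - 3.3403*NPC + 25.159;  % Equ. (20) -> 3.636
THD = sqrt(I3^2 + I5^2 + I7^2 + I9^2)      % Equ. (22) -> 68.86 %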
Following the same pattern, for any number of PCs connected to the supply mains the percentages of the odd harmonics and the THDi can be calculated as follows:

No. of PC's    Parameter    Calculated values    Experimental values    % Error
10             I3           56.09                56.00                  0.16
10             I5           37.28                38.00                  1.89
10             I7           13.88                16.00                  13.25
10             I9           3.636                3.000                  2.12
10             THDi         68.86                70.10                  1.80
13             I3           57.19                58.00                  1.40
13             I5           34.59                35.00                  1.17
13             I7           10.92                12.00                  9.00
13             I9           1.812                2.000                  9.40
13             THDi         67.75                68.60                  1.23

Table 9. Comparison of calculated and experimental values
The last column of Table 9 reveals negligible error values; all this confirms the authenticity of the developed mathematical models.
6.5 Conclusion of mathematical modelling of current harmonics caused by PC's
During the mathematical modelling, the individual assessment of the odd harmonics in the current waveform which are significant in magnitude proved theoretically, using IEC Standards, that the THDi decreases with the increase of electronic/nonlinear loads. Keeping in view the magnitudes predicted by virtue of the mathematical modelling, this innovative technique will certainly draw the attention of researchers, consumers, utilities and manufacturers to think about remedial measures for the mitigation of this undesired phenomenon for the smooth operation of the power distribution network.
7. Impacts of harmonics caused by personal computers on distribution transformers
As mentioned in the previous section, Personal Computers (PC's), being electronic loads, draw non-sinusoidal current. When this non-sinusoidal current passes through the impedance of the line/cable, it causes considerable distortion in the voltage. This distorted voltage, in a parallel connection scheme, appears at the LT/HT sides of the distribution transformer and has significant effects on the performance of equipment which is designed to operate at sinusoidal voltage and current only. The complete distribution network of Rachna College of Engineering & Technology (RCET), Pakistan, has been simulated using the Electrical Transient Analyzer Program (ETAP) software. For this purpose, an experiment has been performed in which the current waveform drawn by a PC, along with its spectrum, has been recorded using an oscilloscope at the RCET Research Lab as a prototype. This model of a single PC is injected into the harmonic library of ETAP for simulation of the RCET distribution network. The impacts of harmonics caused by PC's on distribution transformers have been completely analyzed. Moreover, the trend of the Total Harmonic Distortion (THD) with variation in different types of loads has been analyzed mathematically and graphically using IEEE Standards.
7.1 Experimental work
The current waveform drawn by a Personal Computer and its harmonic spectrum have been recorded using an oscilloscope. The description of the equipment and test unit is as under:
Digital Storage Oscilloscope: TEXIO 60 MHz, 1000 Ms/s, with voltage range 100-240 V ac.
Pentium 4.0 Computer: CPU 2.40 GHz, ATX power supply 115-230 V ac, 2/1.5 Ampere current rating.
Monitor: Philips, 100-240 V ac with 1.5/0.8 Ampere current rating.
Figure 30 shows the experimental setup, in which the current drawn by a single Personal Computer (Monitor & CPU), along with its FFT, has been recorded using the Digital Storage Oscilloscope. A resistor is a linear element in which the voltage and current waveforms are in phase with each other, so the voltage waveform recorded across it also represents the current waveform drawn by the single PC.
Fig. 30. Experimental setup
7.2 Software used for simulation
The software used for obtaining the results is the Electrical Transient Analyzer Program (ETAP), which is recognized software currently used for power system analysis worldwide. It has the capacity to perform analyses including Load Flow (LF), Harmonic Load Flow (HA), Harmonic Frequency Scan, Optimal Power Flow, Short-Circuit, Motor Starting and Transient Analysis, etc. For Harmonic Analysis (HA) this software has the provision to inject a user defined library. The results obtained at the RCET Research Lab for a single PC are inserted into the harmonic library of ETAP for simulation. Table 10 shows the percentage of Individual Harmonic Distortion (IHD) with reference to the fundamental, and the Total Harmonic Distortion (THD), in the current waveform drawn by a single PC.

Harmonic No.    % IHD
3rd             91.63
5th             86.61
7th             69.87
9th             44.76
11th            54.81
13th            46.44
15th            46.44
17th            33.05
19th            24.70
23rd            11.74
25th            7.900
29th            5.120
% THDi          178.97

Table 10. % IHD with reference to fundamental
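The % THDi entry of Table 10 is consistent with the individual harmonic distortions listed above it; a quick MATLAB check (IEEE definition, i.e. percent of the fundamental):

% Verifying % THDi of Table 10 from the individual harmonic distortions
ihd = [91.63 86.61 69.87 44.76 54.81 46.44 46.44 33.05 24.70 11.74 7.90 5.12];
thd_i = sqrt(sum(ihd.^2))   % ~ 178.97 %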
7.3 Single line diagram of RCET power distribution network
Fig. 31. Single line diagram of RCET power distribution network
Figure 31 shows the single line diagram of the RCET distribution network. The 11 kV line emanates from a 20/26 MVA power transformer at the 220 kV Grid Ghakkar, Pakistan. The details of the distribution transformers along with their loads are shown in Table 11.

ID      Location          Rated kVA    No. of PC's    Other Load (KW)    PC Load (KW)
T1      Old Building      200          30             93.21              22.5
T2      Independent T/F   25           -              11.18              0
T3      Hostel A,B        100          22             36.44              16.5
T4      New Building      100          25             33.89              18.75
T5      Staff Colony      100          15             25.42              11.25
T6      Mobile Tower      25           -              15.00              0
T7      Hostel E,F        50           13             12.71              9.75
TOTAL                     600          105            212.85             78.75

Table 11. Distribution T/F's rating and connected load
The lengths of the secondary distribution line (11 kV) are clearly mentioned in the single line diagram. The description of the 11 kV line conductors is shown in Table 12.
Cond. Name    Resistance (Ohms/km)    Reactance (Ohms/km)    Max. Current (A)
Dog           0.336                   0.378                  307
Rabbit        0.659                   0.202                  202

Table 12. Conductor table
7.4 Harmonic Analysis
Current and voltage waveforms, along with their harmonic spectrums, at the LT & HT sides of all the distribution transformers have been recorded during the Harmonic Analysis. Due to lack of space, only the results of the 100 kVA New Building transformer T4 are discussed.
a. 100 kVA New Building Transformer T4
The waveforms recorded at the LT/HT sides of distribution transformer T4 are:
i. Current waveform & its Spectrum at LT side of T4
The current waveform and its FFT recorded at the LT side are given in Figures 32 & 33 respectively. The waveform is highly distorted due to the presence of the 3rd, 5th, 7th, 9th, 11th, etc. harmonics.
Fig. 32. Current waveform at LT side of T4
Fig. 33. FFT of current waveform of Figure 32
ii. Current waveform & its Spectrum at HT side of T4
The current waveform and its FFT recorded at the HT side are given in Figures 34 & 35 respectively. This waveform is also distorted, but it is noticeable that the triplen harmonics (3rd, 9th, 15th, etc.) have been trapped by the delta winding of the distribution transformer.
Fig. 34. Current waveform at HT of T4
Fig. 35. FFT of current waveform of Figure 34
iii. Voltage waveform & its Spectrum at LT side of T4
The voltage waveform and its FFT at the LT side of distribution transformer T4 are given in Figures 36 & 37. This distortion in the voltage waveform is the result of the current waveform distortion: when the distorted current passes through the series impedance of the line, it causes considerable distortion in the voltage waveform. The LT voltage waveform spectrum also contains the triplen harmonics, which are a major cause of distortion in the voltage waveform.
Fig. 36. Voltage waveform at LT side of T4
Fig. 37. FFT of voltage waveform of Figure 36
iv. Voltage waveform & its Spectrum at HT side of T4
The voltage waveform and its FFT at the HT side of the distribution transformer are given in Figures 38 & 39. The magnitudes of the harmonic contents with reference to the fundamental frequency are very low, and that is why the waveform is almost sinusoidal.
Fig. 38. Voltage waveform at HT side of T4
Fig. 39. FFT of voltage waveform of Figure 38
b. Summary of overall distribution transformer THD's at LT/HT sides
It is clear from the single line diagram of Figure 31 that five transformers are general duty transformers and the remaining two are independent transformers. The general duty transformers carry PC load along with other load. Transformer T2 is running a motor load, so it has zero harmonic distortion at the LT side. Transformer T6 is also an independent transformer feeding a mobile tower with a non-linear load, but it will not be highlighted because the focus of this research work is mainly on PC's. The Total Harmonic Distortion (THD) of the voltage and current waveforms at the LT and HT sides of the distribution transformers is summarized in Table 13.

ID    % THDi LT    % THDi HT    % THDv LT    % THDv HT
T1    14.0         25.28        9.97         0.127
T2    0.0          0.0          0.0          0.139
T3    27.0         47.94        16.39        0.145
T4    28.0         49.41        17.26        0.149
T5    42.0         44.30        10.21        0.152
T6    30.0         27.61        34.03        0.153
T7    37.0         65.76        16.99        0.1554

Table 13. % THD at different Buses
Table 14 shows the percentage Individual Harmonic Distortion of the voltage waveform (% IHDv) at the LT side of those distribution transformers where its value is significant.

ID    3rd    5th    7th    9th    11th    13th    15th    17th    19th    23rd
T1    0.8    2.0    2.6    1.1    4.0     5.3     1.9     4.2     3.6     2.0
T3    2.4    4.0    4.6    3.5    6.0     7.7     6.1     5.9     5.2     3.1
T4    2.8    4.2    4.8    3.8    6.3     8.1     6.2     6.2     5.4     3.2
T5    1.5    2.4    2.9    2.1    3.7     4.8     3.6     3.8     3.4     2.1
T7    1.8    3.4    4.2    2.6    6.25    8.6     4.4     7.2     6.4     4.2

Table 14. % IHDv with ref. to fundamental at LT side of T/F's
IEEE Std. 519-1992 deals with the standards and limitations of harmonic levels in an electrical power system in the presence of non-linear loads. For voltage levels up to 69 kV, the distortion limit for % THDv is 5.0% and for % IHDv is 3.0%. Tables 13 and 14 give the % THD and % IHD at the LT/HT sides of the distribution transformers. It is clear from the tables that the voltage limits set by the IEEE are violated, which is a proof of poor power quality.
7.5 THD trend with variation in different types of loads
ETAP has the provision to vary the load according to requirements and then to record the harmonic distortion in the voltage and current waveforms.
a. Increasing the No. of PC's
As a case study, the computer load of transformer T4 (New Building, 100 kVA) has been varied from 1 PC up to the 50th PC, with the other linear load disconnected, to record the values of current and voltage waveform distortion. Table 15 shows that, with the other linear loads disconnected, the Total Harmonic Distortion in Current (THDi) at the LT side remains the same from 1 PC to the 50th PC.

No. of PC's    % THDi
1              159.0
5              159.0
10             159.0
15             159.0
20             159.0
30             159.0
40             159.0
50             159.0

Table 15. THD trend at LT of T4 by increasing the no. of PC's
According to IEEE Standards, Total Harmonic Distortion is defined as "The ratio of the root-mean-square of the harmonic content to the root-mean-square value of the fundamental quantity, expressed as a percentage of the fundamental". Mathematically,
THD = \sqrt{ \sum_{h=2}^{h_{max}} M_h^2 } / M_1   (23)

If 'n' is the number of PC's, then the distortion for 'n' PC's can be derived as:

THD = \sqrt{ \sum_{h=2}^{h_{max}} n^2 M_h^2 } / (n M_1)   (24)

Simplifying the above equation, the final result is given in Equ. (25):

THD = \sqrt{ \sum_{h=2}^{h_{max}} (M_h / M_1)^2 }   (25)
Equ. (25) proves that by increasing the number of PC's the Total Harmonic Distortion will remain the same. It is worth mentioning here that this relation holds when the same electronic load is increased while the other linear loads are disconnected. If different types of non-linear loads are increased, their THD trend may be increasing or decreasing, depending upon the magnitudes of the fundamental and the individual harmonic contents.
b. Increasing the Linear Load by fixing PC load
The THD trend can also be confirmed by fixing the number of PC's and varying the linear loads (i.e. resistive & inductive). For this purpose transformer T4 of 100 kVA has been selected. There are 25 PC's connected to this transformer, and the linear load comprises 20% resistive and 80% inductive load. Table 16 indicates that by increasing the linear load while keeping the PC load constant, the THD in current and voltage at the LT side of the transformer decreases accordingly.

Linear Load (kVA)    % THDi    % THDv
1                    95.0      18.08
5                    82.0      17.73
10                   69.0      17.31
15                   60.0      16.9
20                   53.0      16.52
25                   47.0      16.16
30                   42.0      15.82
35                   38.0      15.49
40                   35.0      15.18

Table 16. THD trend at LT of T4 by increasing Linear Load
The reason for this decreasing trend is given mathematically in Equ. (26), where 'n' represents the number of PC's, which is fixed in this case, 'Ic' is the current drawn by the PC's, 'IL' is the current due to the linear loads, and 'm' is the integer which shows the increase in linear load.
THD = \sqrt{ \sum_{h=2}^{h_{max}} n^2 I_{ch}^2 } / (n I_{c1} + m I_L)   (26)

Solving Equ. (26), the results are shown in Equ.'s (27) and (28) respectively, which verify that by increasing 'm' (the linear load) the THDi decreases, because of the inverse relation between the THD and the fundamental current.

THD = n \sqrt{ \sum_{h=2}^{h_{max}} I_{ch}^2 } / (n I_{c1} + m I_L)   (27)

THD = \sqrt{ \sum_{h=2}^{h_{max}} I_{ch}^2 } / ( I_{c1} + (m/n) I_L )   (28)
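Equ.'s (23)-(28) can be illustrated numerically. The MATLAB sketch below uses an arbitrary assumed harmonic spectrum to show that the THD is unchanged when the same non-linear load is multiplied by n (Equ. (25)) and decreases as the linear load m is added (Equ. (28)):

% Numeric illustration of Equ.'s (25) and (28) with an assumed spectrum
Ic1 = 1.0;                       % fundamental current of one PC (p.u.)
Ich = [0.9 0.85 0.7 0.45];       % harmonic currents of one PC (p.u., assumed)
IL  = 1.0;                       % fundamental current of one linear load unit
thd = @(fund, harm) sqrt(sum(harm.^2))/fund;
for n = [1 10 50]                % scaling identical PCs: THD constant, Equ. (25)
    fprintf('n = %2d PCs: THD = %.3f\n', n, thd(n*Ic1, n*Ich));
end
n = 25;
for m = [0 10 40]                % adding linear load: THD falls, Equ. (28)
    fprintf('m = %2d: THD = %.3f\n', m, thd(n*Ic1 + m*IL, n*Ich));
end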
Graphical representations of the decreasing trends of THDv and THDi are shown in Figures 40 and 41 respectively.
Fig. 40. Decreasing trend of THDv with increase in linear load
Fig. 41. Decreasing trend of THDi with increase in linear load
c. Mixing of another Non-Linear Load
For comprehensive modelling of the THDi at the LT side of the 100 kVA T4 transformer, another non-linear load has been taken from the ETAP harmonic library for further simulation. The manufacturer of this load is Toshiba and its model is PWM ASD. The spectrum for this non-linear load is given in Figure 42.
Fig. 42. Harmonic Spectrum for PWM ASD
This load is connected along with the PC and other linear load. The PC and linear load are kept the same while this PWM ASD load is varied; the results are given in Table 17:
PWM Load (KW)    % THDi
1                34.0
5                31.0
10               30.0
15               30.0
20               30.0
25               31.0
30               33.0
35               34.0
40               36.0

Table 17. THD at LT of T4 by increasing PWM
This trend can be represented graphically as shown in Figure 43.
Fig. 43. THDi trend during mixing of another non-linear load
7.6 Conclusion of harmonic impacts caused by PC's on distribution transformers
The simulation conducted in ETAP, based upon the experimental work injected into its library from the prototype developed at the RCET Lab, is really interesting and innovative. All stakeholders of the power industry, consumers and manufacturers can have in-advance knowledge, using the mathematical relations derived during this research, for predictive measures. Moreover, the scientific community will really benefit from the mathematical models developed by varying the nature of the connected load, keeping in view the on-ground reality.
8. References
Rana Abdul Jabbar, Muhammad Junaid, M. Ali Masood & Khalid Saeed Akhtar, (2009). Impacts of Harmonics caused by Personal Computers on Distribution Transformers, Proceedings of 3rd International Conference on Electrical Engineering (ICEE'09), IEEE, ISBN: 978-1-4244-4361-1, Paper ID PWR_024, 09-11 April, 2009, University of Engineering & Technology, Lahore, Pakistan.
R.A. Jabbar, Muhammad Akmal, Muhammad Junaid & M. Ali Masood, (2008). Operational and Economic Impacts of Distorted Current drawn by the Modern Induction Furnaces, Proceedings of Australasian Universities Power Engineering Conference (AUPEC'08), IEEE, Paper No. 266, ISBN: 978-0-7334-2715-2, 14-17 December, 2008, University of New South Wales, Sydney, Australia.
R.A. Jabbar, M. Aldabbagh, Azah Muhammad, R.H. Khawaja, M. Akmal & Rehan Arif, (2008). Impact of Compact Fluorescent Lamp on Power Quality, Proceedings of Australasian Universities Power Engineering Conference (AUPEC'08), IEEE, Paper No. 025, ISBN: 978-0-7334-2715-2, 14-17 December, 2008, University of New South Wales, Sydney, Australia.
R.A. Jabbar, Muhammad Akmal, M. Ali Masood, Muhammad Junaid & Fiaz Akram, (2008). Voltage Waveform Distortion Measurement Caused by Current drawn by Modern Induction Furnaces, Proceedings of 13th International Conference on Harmonics and Quality of Power (ICHQP 2008), IEEE, PES, pp. 1-7, ISBN: 978-1-4244-1771-1, DOI: 10.1109/ICHQP.2008.4668764, 2008-11-07, University of Wollongong, Australia.
R.A. Jabbar & M. Akmal, (2008). Mathematical Modelling of Current Harmonics Caused by Personal Computers, International Journal of Electrical Systems Science and Engineering (IJESSE), WASET, pp. 103-107, ISSN 1307-8917, Volume 1, Number 2, Winter, May 2008, Bangkok.
R.A. Jabbar, S.A. Qureshi & M. Akmal, (2007). Practical Analysis and Mathematical Modelling of Harmonic Distortions Caused by Electronic Loads, Proceedings of the 7th International Association of Science and Technology for Development (IASTED) Conference, pp. 145-150, ISBN: 978-0-88986-689-8, 29-31 August, 2007, Spain.
X9
Knowledge Management Mechanisms In Programmes
Mehdi Shami Zanjani and Mohamad Reza Mehregan
University of Tehran, Iran
1. Introduction
Since projects provide "more flexible and task specific allocation of resources", companies use projects as a primary way of doing work. As a result, several projects are now concurrently and sequentially being managed in what has been recognized as multi-project or project-based organizations (Landaeta, 2008). Traditionally, the vast majority of practical and theoretical developments on project management have been related to single projects managed in isolation (Evaristo and van Fenema, 1999). Over time, however, issues have arisen where multiple projects are undertaken within organizations, including lack of co-ordination and confusion over responsibility for managing multiple demands on staff (Lycett et al., 2004). There has been an increasing awareness of the requirement for a new perspective on the management of projects, distinct from that applied in a single project context (Payne et al., 1995). In this context, the foundations have been laid for a new discipline, commonly referred to as programme management. Programme management is defined as the integration and management of a group of related projects with the intent of achieving benefits that would not be realized if they were managed independently (Lycett et al., 2004). Some authors argue that programme management provides a means to bridge the gap between project delivery and organizational strategy (Lycett et al., 2004). While there has been an increasing recognition in the literature of the diversity of different types of programmes, little guidance is offered in terms of the necessary differences in management approaches for different programmes (Lycett et al., 2004), especially in the area of learning and knowledge management. Although knowledge management has been recognized as a critical success factor in programme management, very little research has been conducted to date (Owen, 2008). This chapter aims to examine the determinant role of programme dimensions on knowledge management mechanisms. The research proposes a new framework for classifying different KM mechanisms in programmes and makes propositions about how the size, geographical concentration and task nature of programmes affect the portfolio of mechanisms suitable for each programme.
Most prior studies tend to examine one dimension of knowledge management mechanisms – personalization versus codification. In this chapter, personalized versus codified, generalized versus specialized and IT-based versus non IT-based are highlighted as three distinct dimensions of KM mechanisms. The framework and its propositions are based on a literature review and analysis. Moreover, the results of the empirical case study of "Iran Tax Administration Reform & Automation" (TARA) are employed to evaluate the research propositions. The "State Tax Organization of Iran" is undertaking the TARA programme aimed at improving its effectiveness and efficiency. The primary focus of this programme is the design and development of an "Integrated Tax System", one of the most important national software systems in Iran, with the goal of developing and improving the existing tax administration and collections process, as well as the implementation of a fully integrated technology solution to manage taxpayer information and automate manual processes. The chapter gives valuable guidance to scholars and managers about the kinds of dimensions that should be considered in order to have successful knowledge management mechanisms in programmes, which adds originality to the chapter.
2. Programme Management
Nowadays, modern organizations are increasingly using project-based structures to become more flexible, adaptive and customer-oriented. Programmes and projects deliver benefits to organizations by enhancing current or developing new capabilities for the organization to use. A benefit is an outcome of actions and behaviors that provides utility to stakeholders. Benefits are gained by initiating projects and programmes that invest in the organization's future (PMI, 2006). Turner defines a project as an endeavor in which human, material and financial resources are organized in a novel way, to undertake a unique scope of work, of given specification, within constraints of cost and time, so as to achieve beneficial change defined by quantitative and qualitative objectives (Evaristo and van Fenema, 1999). Contrary to project management, which is a concept that is clearly understood by both academics and practitioners, programme management seems to be a term that has not reached this maturity yet (Vereecke et al., 2003). The ambiguity surrounding the nature and practice of programme management remains despite well over a decade of academic and practitioner interest (Pellegrinelli et al, 2007). Recent articles stress the difference between project and programme management, but show neither consensus nor precise definitions of programme management (Artto et al., 2009). In the literature, many definitions of programme management have been given, ranging from the management of a collection of projects to the management of change (Vereecke et al., 2003). Some authors associate programmes with large projects. They argue that any project lasting longer than 2 years should be called a programme. Other authors associate it with multi-project co-ordination or portfolio management, which is often related to resource management (Thiry, 2002). Programme management is now a widely used approach for bringing about planned change. The approach is used to implement strategy, to develop and maintain new
capabilities, to manage complex information systems implementations and many other business changes (Pellegrinelli et al, 2007). Today, the rationale of programme management lies in strategic management rather than the technical level; the focus is on the organization rather than the team, and instead of talking about deliverables, one talks about benefits. In addition, the programme environment is complex: there are multiple stakeholders with differing and often conflicting needs, emergent inputs are always affecting the process, and ambiguity is high (Thiry, 200 ). The object of programmes is the change of a permanent organization. With projects, the permanent organization is usually a given factor that dictates criteria and enablers for project success. Therefore, projects represent narrowly defined task entities or temporary organizations (Artto et al., 2009).
3. Knowledge Management
The concept that has profoundly affected the discipline of management in recent years is the idea of knowledge as the most critical ingredient in recipes for organizational success. Knowledge management might be a popular challenge to today's organizations, but successful firms and their managers have always realized its value. As Drucker (1995) rightfully predicts, knowledge has become the key economic resource and a dominant source of competitive advantage (Chong, 2006). Davenport and Prusak (1998) defined knowledge as "a fluid mix of framed experience, value, contextual information, and expert insight that provides a framework for evaluating and incorporating new experiences and information" (Ma et al., 2008). It originates and is applied in the minds of the knower. In organizations, it often becomes embedded not only in documents or repositories but also in organizational routines, processes, practices, and norms. Knowledge results from the interaction of someone's insights (past experience, intuition and attitude), information and imagination (generating ideas and visualizing futures). Knowledge, if properly utilized and leveraged, can drive organizations to become more innovative, competitive, and sustainable. Today, more and more companies are looking for ways to improve and increase their rate of knowledge creation and sharing. These days, organizational processes have become more complex and knowledge intensive, and therefore require more awareness and capability in the area of knowledge management. Knowledge management is the active management of creating, disseminating, and applying knowledge to strategic ends (Berdrow and Lane, 2003). The function of knowledge management is to allow an organization to leverage its information resources and knowledge assets by remembering and applying experience (Boh, 2007). In general, KM must be seen as a strategy to manage organizational knowledge assets to support management decision making, to enhance competitiveness, and to increase capacity for creativity and innovation (Nunes et al, 2006). APQC defines KM as an emerging set of strategies and approaches to create, safeguard, and use knowledge assets (including people and information), which allow knowledge to flow to the right people at the right time so they can apply these assets to create more value for the enterprise. Hence, knowledge, and consequently its management, is currently being touted as the basis of future economic competitiveness.
4. The Conceptual Framework
Organizing work by projects & programmes allows organizations to respond flexibly to changing organizational needs, but project-based environments face significant challenges in promoting organization-wide learning. While it is a misconception to think that there is no learning across projects since there are few commonalities across projects, the challenges in facilitating knowledge sharing across projects are well-recognized (Boh, 2007). The management, reuse and transfer of knowledge can improve project and programme management capabilities, resulting in continuous learning. In this chapter, three aspects of knowledge management are considered: knowledge management strategy, knowledge strategy and information technology strategy. Based on these aspects of knowledge management, the proposed framework highlights three dimensions of knowledge management mechanisms in programmes: personalized versus codified (knowledge management strategy), generalized versus specialized (knowledge strategy) and IT-based versus non IT-based (information technology strategy). The interaction between these dimensions results in a framework that generates eight classes of knowledge management mechanisms. Knowledge management mechanisms are defined as the formal and informal mechanisms for sharing, integrating, interpreting and applying know-what, know-how, and know-why embedded in individuals, groups and other sources of knowledge. The whole programme must share a common KM orientation. KM strategy describes the overall approach a programme intends to take in order to align its knowledge resources and capabilities to its business strategy, thus reducing the knowledge gap existing between what a programme must know to perform its strategy and what it does know. It is important to note that if an effective knowledge management strategy is not developed and managed by a programme, valuable intellectual capital can be lost, causing rework and loss of opportunities. Better identification, transfer and management of knowledge allows intellectual capital to be effectively retained within the programme, allowing it to be reused on other projects, reducing the time staff spend recreating what has already been learned (Owen, 2008). One typology of knowledge strategy has become the most supported and referenced. This typology recognizes two different knowledge management strategies for sharing tacit and explicit knowledge: codification and personalization (Venkitachalam and Gibbs, 2004). Codification strategy involves securing knowledge and then storing it in databases for others to access and reuse. The knowledge is independent of the person who initially created it (Smith, 2004). Codification can be a good mechanism to store large amounts of knowledge and to create an organizational memory for all employees (Boh, 2007). Codification strategy focuses on codifying knowledge using a "people-to-document" approach. On the other hand, personalization is a strategy to manage the knowledge that is produced by human interaction. This knowledge is difficult to codify and store because it is unable to replicate the human qualities used when resolving an issue (Smith, 2004). Personalization strategy focuses on dialogue between individuals, not knowledge objects in a database. It is a person-to-person approach where the shared knowledge flows not only face-to-face, but also through electronic communications.
Codification mechanisms typically do not provide a rich medium for communication. Personalization, on the other hand, provides a rich medium for communication, as it is concerned with the use of people as a mechanism for sharing knowledge (Boh, 2007). Programmes should not attempt to implement both strategies with equal emphasis. Rather, they should use one strategy primarily and use the second strategy to support the first. Some authors argue that one should start by identifying what kind of organization one has and what its information needs are, and then focus primarily on either a personalization or a codification strategy (Greiner et al., 2007). In a small programme, personalized mechanisms may serve the knowledge management needs of the programme adequately, as employees frequently meet each other in the hallways or at meetings. In a large programme, it is a challenge to find ways of making the connections between individuals who have the right knowledge to share with one another; the probability of serendipitous encounters drops drastically (Boh, 2007). Hence, the first research proposition is: "Codified mechanisms are more suitable for large programmes, while personalized mechanisms are more suitable for small programmes". A suitable knowledge strategy should answer the important question: "What knowledge is important to your programme?" While knowledge strategy deals with identifying important knowledge, knowledge management strategy deals with implementing knowledge initiatives to close the knowledge gap. With respect to knowledge strategy, two types of knowledge have been identified in the field of programmes: programme management knowledge and programme domain knowledge. Based on this aspect of knowledge, the proposed framework highlights the second dimension of knowledge management mechanisms in programmes: generalized versus specialized mechanisms. "Programme Management Knowledge" is the sum of knowledge within the profession of programme management, which includes proven traditional practices that are widely applied, as well as innovative practices and published and unpublished material. This type of knowledge, which can be labelled kernel knowledge (Leseure and Brookes, 2004), includes forms of knowledge that need to remain and be nurtured within a company in order to sustain high programme performance in the long term. Because kernel knowledge is what allows programme teams to repeatedly complete independent programmes in the long term, it matches the accounting definition of intangible assets. "Programme Domain Knowledge" is the knowledge about the programme domain (e.g., general business, industry, company, product and technical knowledge) of an application area in use during the project. This type of knowledge is called application-area-specific knowledge in the PMI standard (PMI, 2006). This knowledge is useful for one programme, but has a low probability of ever being used again. This form of knowledge, labelled ephemeral knowledge according to Leseure and Brookes (2004), is only active and useful during the lifetime of a programme. Ephemeral knowledge does not match the definition of intangible assets, as there is no evidence that it will be useful again in the future. If a programme provides a standardized and routine solution to its client, generalized mechanisms would leverage the ability to create and reuse programme management knowledge in order to sustain high programme performance in the long term.
On the other hand, programmes that tackle problems without clear solutions at the outset
benefit more from specialized mechanisms, which allow them to create or absorb programme domain knowledge in order to gain a better understanding of the problem and its potential solutions. A specialization strategy increases the probability of success of unique programmes by supplying critical domain knowledge to them. Hence, the second research proposition is: "Generalized mechanisms are more suitable for programmes conducting projects that are more standardized and routine in nature, while specialized mechanisms are more suitable for programmes conducting projects that are more unique in nature". Another key dimension in the proposed framework is information technology strategy. This dimension differentiates between IT-based and non IT-based mechanisms. The main IT-based mechanisms are decision support technologies, groupware and electronic knowledge bases; the main non IT-based mechanisms are spontaneous knowledge transfer initiatives, mentoring, teams and communities of practice. It is important to note that a firm must take a global and consistent view when managing its knowledge and selecting the KM tools to be implemented. The key to achieving harmony between KM and IT is to understand a very basic principle: there are things that computers and technology do well, and there are things that humans do well (Egbu and Botterill, 2002). Many of the failures of IT and KM are the result of repeated attempts to force one paradigm to operate within the realm of the other. Although a recent study from the American Productivity and Quality Center shows that organizations embarking on knowledge management efforts generally rely on setting up a suitable IT infrastructure to accomplish their goals (Mohamed et al., 2006), many investigators have insisted that knowledge management initiatives can be successful without using IT tools, and that IT should be adopted only when it is necessary (Egbu and Botterill, 2002). Dougherty (1999) argues that IT should be seen as a tool to assist the process of KM in organizations. Such a process relies more on the face-to-face interaction of people than on static reports and databases (Duffy, 2000). Others argue that IT is strategically essential for global reach when organizations are geographically distributed, because it is increasingly difficult for them to know where their best knowledge is and what they know (Magnier-Watanabe and Senoo, 2008). IT can assist teams, who in today's world may meet only occasionally or even never, to share experiences online in order to build and share knowledge, and more generally to work effectively together. If properly used, IT can accelerate knowledge management capabilities in both the time and space dimensions. Locality, timing and relevancy factors determine the expediency and the strength of IT's role in KM initiatives (Egbu and Botterill, 2002). It should be mentioned again that IT cannot be considered the magic bullet that makes a KM initiative a complete success; IT has to be part of a balanced and integrated set of components. Hence we propose that:
"IT-based mechanisms are more suitable for programmes that are geographically dispersed, while non IT-based mechanisms are more suitable for programmes that are geographically concentrated".
The interaction between the three dimensions of programme knowledge management mechanisms results in a framework that generates eight classes of KM strategies; Table 1 depicts the research propositions, which are based on types of programmes.

Programme Dimensions | Proposed KM Mechanisms
Large-Sized, Routine Task Nature, Geographically Dispersed | Codified, Generalized, IT-Based
Large-Sized, Routine Task Nature, Geographically Concentrated | Codified, Generalized, Non IT-Based
Large-Sized, Innovative Task Nature, Geographically Concentrated | Codified, Specialized, Non IT-Based
Large-Sized, Innovative Task Nature, Geographically Dispersed | Codified, Specialized, IT-Based
Small-Sized, Routine Task Nature, Geographically Dispersed | Personalized, Generalized, IT-Based
Small-Sized, Routine Task Nature, Geographically Concentrated | Personalized, Generalized, Non IT-Based
Small-Sized, Innovative Task Nature, Geographically Concentrated | Personalized, Specialized, Non IT-Based
Small-Sized, Innovative Task Nature, Geographically Dispersed | Personalized, Specialized, IT-Based
Table 1. Proposed KM mechanisms based on types of programmes

5. The Case Study Results
During the three years since the launch of the TARA programme, the PMO has remained relatively small, with a total of about 40 employees. All of these employees are collocated in the
three-floor building. Over 80 % of the employees hold at least a master's degree, and their average age is about 35 years.
KM Strategy / Knowledge Strategy / IT Strategy | Identified mechanisms
Codified / Generalized / IT-based | DSS of Contractor Selection; Database of Programme Management Articles; E-Books
Codified / Generalized / Non IT-based | Written work procedures; Programme Management Books
Codified / Specialized / IT-based | Programme Portal; Email; Document Management System
Codified / Specialized / Non IT-based | Programme Monthly Reports; Projects Status Reports
Personalized / Generalized / Non IT-based | Programme Management Seminar; PMO Committees; Coaching
Personalized / Specialized / IT-based | Net Meeting Software
Personalized / Specialized / Non IT-based | Coaching; Meetings with Advisors; PMO Weekly Meetings; Meetings with Contractors; PMO Units Meetings; Projects Coordination Meetings; Socialization of new employees; Experts consultancy
Table 2. TARA identified mechanisms in the KM mechanisms matrix

The TARA programme includes diverse projects which are unique and complex in nature; the "Integrated Tax System", "Risk-Based Audit Selection", "Data Center Implementation" and "Tax Evasion Study and Prevention Plan" projects are some examples. There is a serious lack of knowledge and experience regarding these projects in Iran. Some interviewees highlighted that collecting an acceptable level of knowledge for defining the scope and deliverables of the projects is one of the most important success factors of the programme. The programme nature is therefore characterized as very unstructured and non-routine. As shown in Table 2, the case study results highlighted that the key mechanisms used for knowledge management in the TARA programme are mostly personalized, non IT-based mechanisms predominantly oriented towards specialization. Given that TARA is a small,
concentrated and innovative programme, the findings support the research propositions. Many interviewees mentioned that they used oral communication to find the right individuals to approach for knowledge sharing. Many individuals in the programme depend on their personal network to find the answers to their questions, or to identify the right people to speak to. Senior staff who have been in the PMO for a long time and know things from long ago are key sources of referrals and knowledge in the TARA programme. Interviewees mentioned that it is not difficult to find specific information when one knows what to look for, but that they do not necessarily know what types of knowledge and information are available to them in the course of their work. Some employees highlighted that they often found out after the fact that their work would have been facilitated if they had approached so-and-so for help before they started. The programme does not use many collaboration technologies to enable individuals to share information and knowledge with others. The main mode of knowledge and information sharing in the PMO is through diverse meetings, such as projects integration meetings, meetings with advisors, PMO weekly meetings, meetings with contractors and PMO units meetings. The document management system is the most important codified, IT-based mechanism in the programme. This mechanism is a computer system used to track and store electronic documents and images of paper documents. This web-based electronic document management system provides a central repository to access, create, store, modify, review and approve documents in a controlled manner. The case study results also highlighted that the PMO has used more knowledge management mechanisms for gaining and sharing programme domain knowledge than for programme management knowledge. This means that the content of the projects is more challenging for the PMO staff than the context of managing them.
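As an aside, the controlled lifecycle that such a web-based document management system enforces can be sketched in a few lines. The classes and method names below are hypothetical and do not describe the PMO's actual system:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    name: str
    content: str
    status: str = "draft"              # draft -> in_review -> approved
    history: list = field(default_factory=list)

class Repository:
    """Central repository enforcing controlled state transitions."""
    def __init__(self):
        self._docs = {}

    def create(self, name: str, content: str) -> None:
        self._docs[name] = Document(name, content)

    def submit_for_review(self, name: str) -> None:
        doc = self._docs[name]
        assert doc.status == "draft"   # only drafts can go to review
        doc.status = "in_review"
        doc.history.append("submitted for review")

    def approve(self, name: str, reviewer: str) -> None:
        doc = self._docs[name]
        assert doc.status == "in_review"
        doc.status = "approved"
        doc.history.append(f"approved by {reviewer}")

# Hypothetical usage mirroring the review/approval cycle described above:
repo = Repository()
repo.create("scope-statement", "Deliverables of one of the programme's projects")
repo.submit_for_review("scope-statement")
repo.approve("scope-statement", "PMO lead")
```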
6. Discussion
The objective of the research has been to create a better understanding of knowledge management in programmes. The literature defines programme management as the integration and management of a group of related projects with the intent of achieving benefits that would not be realized if they were managed independently. The case study shows the usefulness of the framework in evaluating the use of knowledge management mechanisms, and in analyzing the fit of the mechanisms with programme dimensions. The case study results highlight that knowledge management does not necessarily mean having to codify all individual employees' knowledge. Instead, another key approach to retaining and sharing knowledge is to ensure that the knowledge is shared with and diffused amongst the other employees in the programme. Based on the findings, the authors do not agree with some arguments in the literature about the importance of information technology in leveraging knowledge management. As the research has shown, most of the mechanisms identified by participants are social in nature; this case study therefore confirms the view that knowledge management is a social rather than a technical process. One of the successful mechanisms used in this regard is consulting experts as knowledge providers. According to Boh (2007), this mechanism has several advantages. First,
the experts can provide customized advice for each project for which they are approached. Second, given the years of experience that these experts have accumulated, they have a wide network of contacts to draw upon. Hence, they can effectively broker knowledge linkages between problem owners and other consultants with potential solutions. Third, the experts themselves can benefit from accumulating experience in repeatedly searching for information from their contacts and archives, such that they build up an extensive mental model of who knows what, as well as a large set of archives developed from previous interactions with their own client-consultants. One of the findings does not support the proposition of Boh (2007), who proposed that codification knowledge sharing mechanisms are more suitable for organizations conducting tasks that are more standardized and routine in nature, while personalization mechanisms are more suitable for organizations encountering problems that are more unique in nature. It is important to note that the context of his work (project-based organizations) is different from this research's context (a programme environment). As a single-site case study investigating only one programme, the study does not permit extrapolation of the results to a larger population. A multiple case study approach can therefore be adopted as a suitable strategy for future research in this regard. It will also remain for future research to refine and expand the proposed framework. Since culture might play a role in programme management approach and style, comparison of the conclusions with observations of programmes in other countries is necessary to improve the external validity of the research.
7. Conclusion
As a result of the research effort, a conceptual framework of knowledge management mechanisms in programmes was established. The research shows that different types of programmes require different knowledge management mechanisms. This chapter distinguishes between codified versus personalized, generalized versus specialized and IT-based versus non IT-based as three distinct dimensions, whereas prior studies tend to examine only one dimension of knowledge management mechanisms: codification versus personalization. The framework proposes that codified mechanisms are more suitable for large programmes, while personalized mechanisms are more suitable for small programmes. The chapter proposes that generalized mechanisms are more suitable for programmes conducting projects that are more standardized and routine in nature, while specialized mechanisms are more suitable for programmes conducting projects that are more unique in nature. The framework also proposes that IT-based mechanisms are more suitable for geographically dispersed programmes, while non IT-based mechanisms are more suitable for geographically concentrated programmes. The chapter is among the first of its kind to examine whether there are suitable configurations of KM strategies for programmes with different dimensions. It provides valuable information which will hopefully help programmes accomplish knowledge management.
8. References
Artto, K., Martinsuo, M., Gemunden, H. G. and J. Murtoaro, 2009. Foundations of programme management: a bibliometric view. Int J Project Manage 27: 1–18.
Berdrow, I. and H. W. Lane, 2003. International joint ventures: creating value through successful knowledge management. Journal of World Business 38: 15–30.
Boh, W. F., 2007. Mechanisms for sharing knowledge in project-based organizations. Information and Organization 17(1): 27–58.
Chong, S., 2006. Implementation in Malaysian ICT companies. The Learning Organization 13(3): 230–256.
Duffy, J., 2000. Knowledge management: what every information professional should know. Information Management Journal 34(3): 10–16.
Egbu, C. and K. Botterill, 2000. Information technology for knowledge management: their usage and effectiveness. 7: 125–133.
Evaristo, R. and V. Fenema, 1999. A typology of project management: emergence and evolution of new forms. Int J Project Manage 17(5): 271–281.
Greiner, M. E., Bohmann, T. and H. Krcmar, 2007. A strategy for knowledge management. Journal of Knowledge Management 11(6): 3–15.
Landaeta, R., 2008. Evaluating benefits and challenges of knowledge transfer across projects. Engineering Management Journal 20(1): 29–38.
Leseure, M. J. and N. Brookes, 2004. Knowledge management benchmarks for project management. Journal of Knowledge Management 8(1): 103–116.
Lycett, M., Rassau, A. and J. Danson, 2004. Programme management: a critical review. Int J Project Manage 22: 289–299.
Ma, Z., Qi, L. and K. Wang, 2008. Knowledge sharing in Chinese construction project teams and its affecting factors. Chinese Management Studies 2(2): 97–108.
Magnier-Watanabe, R. and D. Senoo, 2008. Organizational characteristics as prescriptive factors of knowledge management initiatives. Journal of Knowledge Management 12(1): 21–36.
Mohamed, M., Stankosky, M. and A. Murray, 2006. Knowledge management and information technology: can they work in perfect harmony? Journal of Knowledge Management 10(3): 103–116.
Nunes, M., Annansingh, F., Eaglestone, B. and R. Wakefield, 2006. Knowledge management issues in knowledge-intensive SMEs. Journal of Documentation 62(1): 101–119.
Owen, J., 2008. Integrating knowledge management with programme management. In: Current Issues in Knowledge Management, edited by Murray E. Jennex. IGI Global, New York: 132–148.
Payne, J. H., 1995. Management of multiple simultaneous projects: a state-of-the-art review. Int J Project Manage 17(1): 55–59.
PMI (Project Management Institute), 2006. The Standard for Programme Management. Pennsylvania, USA.
Smith, A. D., 2004. Knowledge management strategies: a multi-case study. Journal of Knowledge Management 8(3): 6–16.
Thiry, M., 2002. Combining value and project management into an effective programme management model. Int J Project Manage 20: 221–227.
Thiry, M., 2004. "For DAD": a programme management life-cycle process. Int J Project Manage 22: 245–252.
Pellegrinelli, S., Partington, D., Hemingway, C., Mohdzain, Z. and M. Shah, 2007. The importance of context in programme management: an empirical review of programme practices. Int J Project Manage 25: 41–55.
Venkitachalam, K. and M. R. Gibbs, 2004. Knowledge strategy in organizations: refining the model of Hansen, Nohria and Tierney. The Journal of Strategic Information Systems.
Vereecke, A., Pandelaere, E., Deschoolmeester, D. and M. Stevens, 2003. A classification of development programmes and its consequences for programme management. International Journal of Operations & Production Management 23(10): 1279–1290.
10
Heavy metals and their impact on environment at the dump-field Ľubietová-Podlipa (Slovakia)
Peter Andráš1,2, Adam Lichý3, Ivan Križáni2 and Jana Rusková4
1 Department of environmental management, Matej Bel University, Tajovského 40, 974 01 Banská Bystrica, Slovakia
2 Geological Institute of Slovak Academy of Sciences, Ďumbierska 1, 974 01 Banská Bystrica, Slovakia
3 Envigeo, Kynceľová 2, 974 11 Banská Bystrica, Slovakia
4 Regional Institute of Public Health, Cesta k nemocnici 1, 974 01 Banská Bystrica, Slovakia
1. Introduction
The Ľubietová deposit has been exploited since the Bronze Age, and in the 16th and 17th centuries it was one of the most important and most extensively exploited Cu-mines of Europe. In the 18th century the Cu-ore was exported to more than 50 countries (Koděra et al., 1990). The Cu mineralisation with an Ag admixture is developed within a 4 – 5 km long and 1.5 km wide zone of N-S direction. There are three main ore-fields in the surroundings of Ľubietová: Podlipa, Svätodušná and Kolba with an admixture of Co/Ni mineralisation. The Cu content in the ore ranged from 4 – 10 % and the Ag content was about 70 g.t-1 (Koděra et al., 1990). About 25 thousand tons of Cu were extracted during the last five centuries. The main dump-field, Podlipa, covers an area of about 2 km2 and was exploited by 18 adits. The ore mineralisation is situated in the Ľubietová terrigene crystalline complex of Permian age, which consists of greywackes, arkose schists and conglomerates. The main tectonic structures are of NE-SW direction; the main ore veins strike approximately E-W and N-S. The ore veins are 30 – 40 m thick. Disseminated mineralisation was also described in the southern part of the ore-field. The mineralisation, probably volcano-sedimentary and genetically connected with the basic, intermediate and acid Permian volcanism, was mobilised by the Hrončok granite intrusion during the Alpine (Upper Cretaceous) orogeny (Ebner et al., 2004). The vein mineralisation is characterised by a rather simple paragenesis, represented by quartz, siderite (± calcite and ankerite), chalcopyrite, Ag-bearing tetrahedrite, arsenopyrite, pyrite, barite and rare galena. In the well-developed cementation zone the main Cu-minerals were cuprite and native copper. The deposit is also famous for the formation of a wide range of rare secondary minerals such as libethenite, langite, annabergite, aurichalcite, azurite, brochantite,
cyanotrichite, erythrite, evansite, euchroite, pharmacosiderite, hemimorphite, chrysocolla, cuprite, limonite, malachite, olivenite, tyrolite, pseudomalachite, native copper, etc. (Koděra et al., 1990). Although the intensive mining activities were stopped during the 19th century (the last owner of the mine near the Haliar locality, Ernest Schtróbl, finished the exploitation during World War I, in April 1915, because of the shortage of miners) and only a few geological survey activities with negligible effect have been carried out here since, the area remains substantially affected.
2. Experimental
Samples (of about 30 kg weight) of sediments from the dumps and of soil from 30 – 50 cm depth (the sampling step was 25 m2), as well as surface water (stream water, drainage water) and groundwater, were collected for the characterisation of the contamination of the landscape components. A reference site was selected for the comparison of the territories loaded by heavy metals with the non-contaminated natural environment (Figure 1). It is situated outside the geochemical anomalies of heavy metals and represents greywackes of Permian age, similar to the material at the dump-field. Samples of plant material were collected both from the reference area and from the contaminated dumps. The dump sediments are represented by two sets of samples: the first consists of 15 samples (HD-1 to HD-15) and the second of 15 samples (A-1 to A-15). Samples HD-10, HD-11 and A-12 represent the reference area. The sample set is completed by a sample of limonitised rock (A-17), which represents a mixture of three samples from localities A-2, A-3 and A-5. The water samples were collected during the dry seasons (February 25th 2007 and May 27th 2008) and the wet seasons (June 14th 2006 and March 31st 2008). To each sample of 1 000 ml volume, 10 ml of HCl was added. Vegetation forms small islands rooted in a few depressions which have enabled a limited soil-forming process. The plant species were selected so that identical plant species from the contaminated areas could be compared with plants from the reference sites. Samples of hardwood species (Betula pendula, Quercus petraea, Salix fragilis), coniferous species (Pinus sylvestris, Abies alba, Picea abies) and herbs (Juncus articulatus, Mentha longifolia) were studied. At every site, 10 individuals of each plant species were sampled to obtain an average sample. Five coniferous individuals of approximately the same age were sampled for branches (in the case of Picea abies also needles) from the fourth or fifth spike, with segment lengths of approximately 10 to 15 cm. In the case of Pinus sylvestris, two-year-old needles were analysed. Roots of the same length and with 2 – 3 cm diameter were obtained from the surface soil level. A similar sampling mode was used for the hardwood species: 3 – 4 year old branches were sampled from the lower limbs. The samples were dried at laboratory temperature and then homogenised. The clay mineral fractions from 8 samples (A-1c to A-11c and A-17c) of technogenous sediments were prepared according to the method described by Šucha et al. (1991). To remove carbonates, 100 cm3 of sodium acetate buffer is added to 10 g of sample pulverised to <0.16 mm grain size. After 2 days the reacted solution is separated from the solid phase, and the solid phase is dispersed by SOTR addition in an ultrasound
device for 2 – 3 minutes. The sample is heated three times to 90 °C with additions of 100 cm3 of SOTR, and the suspension is decanted.

Fig. 1. Localisation of the technogenous sediment, water and plant samples

Organic matter was removed by reaction with 10 cm3 of concentrated hydrogen peroxide and 100 cm3 of SOTR. The mixture was heated at 70 °C for 15 minutes. This procedure was repeated twice and the reacted solution was removed.
The free Fe and Mn oxides are removed by the addition of 90 cm3 of citrate solution and by heating to 75 – 85 °C. After 5 minutes, two additions of 2 g of sodium hydrosulphite are made and the solution is decanted. The rest of the sample is rinsed with distilled water. After this procedure, the true separation of the clay minerals (<2 µm fraction) is possible. The colloidal rest of the sample in 2 dm3 of distilled water is decanted into beakers after 41 hours and 8 minutes (the time is calculated according to Stokes' law for gravitational sedimentation) and a saturated NaCl solution is added; after this treatment the suspension flocculates. The solid rest is converted to the calcium form using 1 mol.dm-3 CaCl2 solution; this procedure standardises the exchangeable cations in the clay minerals. Dialysis is used for chloride removal, and the presence of chlorides is checked by the addition of AgNO3 solution. After the removal of the chlorides, the rest of the sample is dried at 30 °C. 3.5 cm3 of distilled water is added to 0.14 g of sample and the sample is dispersed using ultrasound. The suspension is applied with a syringe onto a glass slide and dried at laboratory temperature to obtain an oriented mount. The oriented mounts were saturated with ethylene glycol vapour in a desiccator for 8 hours on a ceramic support at 60 °C to optimise the conditions of the X-ray diffraction analysis. X-ray diffraction analyses of the clay minerals were carried out in the laboratories of the Geological Institute of the Slovak Academy of Sciences using a Philips PW 1710 diffractometer (analyst RNDr. Ľubica Puškelová). The free sorption capacity of the clay minerals was studied using heavy metal-bearing drainage water from the terrain depression beneath the dump of the Empfängnis adit. 50 cm3 of 5-times concentrated drainage water was added to 20 g of each clay sample (A-1c to A-7c and A-17c). Analyses were performed on 1 g of sample (A-1* to A-7* and A-17*) after 14 days of maceration in the drainage water. The rinse pH of the sediments was measured in a mixture of distilled water and unpulverised sample (Sobek et al., 1978). The pH of the sediments was also determined from a mixture of unpulverised sediment and 1M KCl according to Sobek et al. (1978). In both cases, 25 ml of distilled water or 1M KCl was added to 10 g of sediment sample, and after two hours of mixing in a laboratory mixer the pH and Eh were measured. The samples of technogenous sediments from the dumps and the soils were dried, and 0.25 g of sample was heated in HNO3-HClO4-HF to fuming and taken to dryness. The residue was dissolved in HCl. The solutions were analysed by ICP-MS in the ACME Analytical Laboratories (Vancouver, Canada). Plant samples were divided into roots, branches/stems, leaves/needles and flowers/fruits. 0.5 g of vegetation sample was analysed by ICP-MS with ultralow detection limits after split digestion in HNO3 and then in aqua regia. The contamination of live and dead parts was compared in several plants. The plants were analysed in the same laboratory as the sediments. The carbon content (total carbon – Ctot., organic carbon – Corg. and inorganic carbon – Cinorg.) was determined in the laboratories of the Geological Institute of the Slovak Academy of Sciences by IR spectroscopy using a Ströhlein C-MAT 5500 device (analyst Alžbeta Svitáčová). The water samples were analysed using AAS in the National Water Reference Laboratory for Slovakia at the Water Research Institute in Bratislava.
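As a cross-check of the quoted decantation time, Stokes' law can be applied. The sketch below assumes a particle density of 2650 kg.m-3, water at 20 °C (density 998 kg.m-3, dynamic viscosity 1.0e-3 Pa.s) and a withdrawal depth of about 0.53 m; these are our assumptions, not parameters stated in the procedure:

```python
# Stokes-law settling time for the <2 um clay fraction: under the assumed
# conditions, a 2 um sphere needs roughly 41 h to settle through ~0.53 m,
# which reproduces the "41 hours and 8 minutes" used in the procedure.

G = 9.81          # gravitational acceleration, m/s2
RHO_P = 2650.0    # assumed particle density, kg/m3
RHO_F = 998.0     # water density at 20 degC, kg/m3
MU = 1.0e-3       # dynamic viscosity of water at 20 degC, Pa.s

def settling_time_hours(diameter_m: float, depth_m: float) -> float:
    """Time for a sphere of a given diameter to settle through depth_m."""
    v = (RHO_P - RHO_F) * G * diameter_m**2 / (18.0 * MU)  # Stokes velocity, m/s
    return depth_m / v / 3600.0

print(f"{settling_time_hours(2e-6, 0.53):.1f} h")   # -> about 41 h
```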
The speciation of As was performed on the basis of the different reaction rates of As3+ and As5+ depending on pH (analyst Ing. Adriana Shearman, PhD).
The efficiency of the Fe0-barrier for the removal of heavy metals from the surface water was tested in the laboratories of the Faculty of Natural Sciences of the Comenius University in Bratislava (Mgr. Bronislava Lalínska). Microscopic analyses of the plant tissues were carried out in the laboratories of the Department of Wood Science of the Technical University Zvolen (analyst Ing. Miroslava Mamoňová, PhD).
3. Results

3.1 Heavy metal contamination of the technogenous sediments
The dump-field sediments are influenced by heavy metals from the hydrothermal Cu-mineralisation. The main contaminants Fe (up to 2.64 %), Cu (25 ppm to >10 %), Mn (34 – 1258 ppm), As (7 – 289 ppm), Pb (8 – 130 ppm), Co (5.1 – 96.3 ppm), Sb (7 – 62 ppm) and Ni (7.8 – 62.1 ppm) are accompanied also by U (up to 10 ppm) and Th (up to 35 ppm).

El.       A-1    A-1c   A-1*   A-2    A-2c   A-2*   A-3    A-3c   A-3*   A-4    A-4c   A-4*
Fe (%)    1.31   1.45   2.98   1.42   1.46   2.17   1.94   2.14   2.90   2.64   2.47   3.65
Cu (ppm)  2829   1693   2345   199    574    472    828    624    857    4471   3324   3112
Pb (ppm)  28     64     229    130    22     28     16     23     37     10     15     38
Zn (ppm)  14     18     95     21     36     62     20     25     47     23     16     27
Cd (ppm)  <0.1   <0.1   0.2    0.1    0.2    0.1    <0.1   <0.1   <0.1   0.3    <0.1   <0.1
Bi (ppm)  2.8    4.5    14.6   0.2    1.4    1.5    8.5    7.2    12.1   23.7   39.2   90.9
Co (ppm)  10.4   11.3   18.3   5.9    10.3   6.4    14.0   17.0   11.0   50.0   58.3   32.1
Ni (ppm)  36.8   36.0   71.8   9.8    12.2   17.0   32.1   28.3   30.4   55.0   42.4   64.4
As (ppm)  162    258    628    10     19     15     71     110    105    169    237    300
Sb (ppm)  62     60     153    7      9      13     22     24     28     59     79     130
Ag (ppm)  0.7    0.8    1.7    <0.1   0.1    0.2    0.4    0.6    0.9    1.4    2.1    4.1
Cr (ppm)  38     9      24     36     17     26     34     21     37     38     15     30
Sn (ppm)  10.9   11.1   29.4   3.5    2.7    4.4    9.8    7.2    9.5    17.3   12.8   22.7
U (ppm)   1.3    1.4    3.3    1.4    1.1    1.1    1.7    1.8    1.9    1.6    1.7    2.2
Th (ppm)  5.8    6.0    9.5    7.6    5.9    2.2    9.1    9.2    5.2    8.3    7.8    5.0
Table 1. ICP-MS analyses of technogenous sediments, clay fraction and clay fraction after 14 days of maceration in drainage water. Explanation to Tabs. 1 and 2: A-1 to A-12 technogenous sediments; A-1c to A-10c clay fraction; A-1* to A-10* clay fraction after 14 days of maceration in drainage water; A-17 hydrogoethite-rich rock
The heavy metal distribution in the technogenous sediments of the dump-field is variable (Tabs. 1 – 4). The distribution of the individual elements reflects the primary concentrations in separate parts of the dump-field as well as the elements' geochemical relations (Figs. 2 – 7), above all their migration ability. In general, three groups of heavy metals can be distinguished at the dump-field: group 1: Fe, Cu (Figs. 2 and 3), As, Sb, Sn, Co, Ni, Cr, Ag, V, U, Th (Figs. 4 and 5); group 2: Zn, Bi, Cd (Figure 6); group 3: Pb (Figure 7).
El.       A-5   A-5c  A-5*  A-6   A-6c  A-6*  A-7   A-7c  A-7*  A-8   A-8c  A-8*
Fe (%)    1.71  1.66  1.83  2.06  2.09  3.36  1.32  1.43  2.81  0.91  1.29  0.79
Cu (ppm)  3150  3001  2078  4797  2503  2918  756   855   2026  716   836   837
Pb (ppm)  17    15    22    16    25    72    17    20    74    7     6     4
Zn (ppm)  19    18    45    13    14    65    26    33    176   7     14    4
Cd (ppm)  <0.1  <0.1  <0.1  0.2   <0.1  0.3   <0.1  0.2   0.7   <0.1  <0.1  <0.1
Bi (ppm)  1.7   2.1   3.2   25.4  24.4  51.7  0.9   1.2   3.6   0.5   0.7   0.8
Co (ppm)  24.4  30.4  29.6  41.8  40.9  32.0  10.2  12.0  15.5  89.9  69.7  104
Ni (ppm)  34.0  34.1  55.4  51.6  45.1  61.7  10.4  10.1  26.0  58.0  66.5  62.5
As (ppm)  60    64    105   134   224   305   16    17    33    61    52    46
Sb (ppm)  17    16    30    50    56    92    12    7     17    18    20    19
Ag (ppm)  0.1   0.1   0.2   1.0   1.6   3.0   0.2   0.2   0.4   <0.1  <0.1  0.1
Cr (ppm)  30    10    22    31    11    23    28    11    35    23    21    7
Sn (ppm)  4.9   3.3   8.1   14.9  12.9  19.6  4.0   2.6   6.8   3.9   7.1   3.0
U (ppm)   1.0   1.2   1.4   1.4   1.6   2.2   1.1   1.1   2.3   2.6   2.5   2.1
Th (ppm)  5.9   5.8   4.0   6.9   6.1   4.1   4.8   5.3   11.8  6.8   5.7   6.7
Table 2. ICP-MS analyses of technogenous sediments, clay fraction and clay fraction after 14 days of maceration in drainage water
Fig. 2 and 3. Distribution of Fe and Cu at the dump-field Ľubietová (explanation to Figs. 2 – 7: the numeric indexes represent the concentration of heavy metals in % or ppm)
El.       A-9   A-9c  A-9*  A-10  A-10c  A-10*
Fe (%)    1.84  2.14  5.03  1.12  1.66   2.03
Cu (ppm)  5466  3112  1181  390   231    176
Pb (ppm)  18    23    41    54    60     74
Zn (ppm)  24    25    31    36    40     52
Cd (ppm)  0.1   0.1   0.2   0.3   0.2    0.4
Bi (ppm)  8.1   10.2  15.4  1.7   2.8    4.1
Co (ppm)  96.3  98.2  98.2  7.1   8.3    7.9
Ni (ppm)  51.9  48.6  62.7  7.8   6.3    9.7
As (ppm)  130   148   234   32    47     76
Sb (ppm)  28    37    48    18    28     47
Ag (ppm)  0.6   0.7   1.3   0.3   0.5    0.9
Cr (ppm)  11.8  8.7   9.9   12.5  7.5    9.6
Sn (ppm)  12.1  10.2  15.8  4.1   3.7    4.8
U (ppm)   2.1   2.3   2.7   1.1   1.2    1.3
Th (ppm)  8.8   8.1   7.2   6.4   5.4    4.7
Table 3. ICP-MS analyses of technogenous sediments, clay fraction and clay fraction after 14 days of maceration in drainage water
Fig. 4 and 5. Distribution of Th and As at the dump-field Ľubietová

3.1.1 Total Acidity Production (TAP) and Neutralisation Potential (NP)
Indirect indicators of oxidation are hydrogoethite and gypsum. The oxidation of sulphide minerals is also indicated by coatings of secondary oxides and Cu-carbonates. The fine-grained pelitic material is less oxidised. The direct indicator of the oxidation processes is the pH. To determine the Total Acidity Production (TAP) and the Neutralisation Potential (NP) it is necessary to know the Eh and pH values of the sediments (in distilled water and in 1M KCl lixivium) as well as the carbon and sulphur contents. If distilled water is used in the measurement of paste or rinse pH, its pH is typically around 5.3. Consequently, any pH
measurement less than 5.0 indicates that the sample contains net acidity. Carbonate minerals can create pH values of around 8 – 10, and thus values above 10 are usually alkaline. Paste pH values between 5 and 10 can be considered near neutral (Sobek et al., 1978).

El.       A-11    A-11c  A-11*  A-12  A-17    A-17c  A-17*
Fe (%)    2.37    3.02   4.44   1.38  1.72    1.50   13.10
Cu (ppm)  >10000  8756   7654   25    >10000  20360  23060
Pb (ppm)  14      24     33     16    8       49     60
Zn (ppm)  15      17     21     39    59      80     50
Cd (ppm)  0.1     0.1    0.2    0.2   0.2     0.2    0.2
Bi (ppm)  2.6     3.2    4.7    0.2   7.2     6.0    5.0
Co (ppm)  84.5    77.5   85.1   5.1   73.4    70.0   83.0
Ni (ppm)  62.1    59.8   64.1   8.5   51.7    43.0   58.0
As (ppm)  206     211    243    7     289     260    280
Sb (ppm)  36      38     40     10    43      40     34
Ag (ppm)  1258    1322   1452   559   1074    960    1010
Cr (ppm)  0.7     0.8    1.7    0.3   2.2     2.0    30.0
Sn (ppm)  <1      <0.1   <0.1   <1    <0.1    <0.1   <0.1
U (ppm)   0.3     0.4    0.6    0.1   0.8     1.0    1.0
Th (ppm)  29      25     31     31    15      16     15
Table 4. ICP-MS analyses of technogenous sediments, clay fraction and clay fraction after 14 days of maceration in drainage water
Fig. 6 and 7. Distribution of Cd and Pb at the dump-field Ľubietová

The pH values of the sediments measured at the dump-field in the distilled water lixivium range from 4.21 to 7.93 (Tab. 5). It is interesting that the lowest pH value was determined in the samples from the reference area. This is probably caused by the fact that, despite the absence of sulphides, there are also no carbonates: the carbon content is very low (Ctot. 0.40 %, Corg. 0.37 %,
and the concentration of inorganic carbon, Cinorg., is below the detection limit; Tab. 5). The highest carbon content is in sample A-7 (Ctot. 1.63 %; recalculated to CaCO3 even 12.71 %). The samples contain 0.01 – 0.42 % of total sulphur (Tab. 5). The highest content of total sulphur is in sample A-6 from the Empfängnis adit, where a prevalence of sulphidic sulphur (SS 0.27 %) over sulphate sulphur (SSO4 0.15 %) was recognised. In the majority of the samples from the Ľubietová dump-field the sulphidic sulphur dominates over the sulphate sulphur; the dump material thus still contains a great portion of unoxidised sulphidic minerals. According to Sobek et al. (1978), the TAP value can be calculated by multiplying the Stot. (or SS) content by the coefficient 31.25, which is derived from the neutralisation equation:
CaCO3 + H2SO4 = CaSO4 + H2O + CO2
This value is equal to the quantity of acid which could potentially be produced by the dump material. At the Ľubietová – Podlipa dump-field the TAP values range from 0.3125 to 13.125 (on average 3.7; Tab. 6). The highest TAP values occur at the dump of the Empfängnis adit and the lowest at the little dump of the Holdenbachen adit beneath the wooded slope of the Zelená Dolina Valley (Figure 1).
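The coefficient 31.25 can be verified from the stoichiometry of this equation; a worked check (using the molar masses M_S = 32.06 g/mol and M_CaCO3 = 100.09 g/mol, and noting that 1 wt. % S in 1 t of material corresponds to 10 kg of S) gives:

```latex
\mathrm{TAP}\;[\mathrm{kg\,CaCO_3\,t^{-1}}]
  = S_{\mathrm{tot}}\,[\%] \times 10 \times \frac{M_{\mathrm{CaCO_3}}}{M_{\mathrm{S}}}
  = S_{\mathrm{tot}} \times 10 \times \frac{100.09}{32.06}
  \approx 31.25 \times S_{\mathrm{tot}}
```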
Sample  pH(H2O)  Eh (mV)  pH(1M KCl)  Eh (mV)  Stot.(%)  SSO4(%)  Ss(%)  Ctot.(%)  Corg.(%)  Cinorg.(%)  CO2(%)  CaCO3(%)
A-1     5.14     77       4.61        109      0.25      0.10     0.15   0.74      0.20      0.54        1.97    4.48
A-2     5.89     34       5.40        63       0.02      0.01     0.01   0.86      0.38      0.48        1.75    3.99
A-3     4.87     94       4.21        131      0.10      0.03     0.07   0.62      0.34      0.28        1.02    2.32
A-4     5.46     59       5.33        66       0.33      0.13     0.01   0.34      0.26      0.08        0.29    0.66
A-5     5.77     42       5.37        64       0.05      0.01     0.05   0.78      0.35      0.43        1.57    3.57
A-6     5.17     74       5.06        83       0.42      0.15     0.27   0.40      0.27      0.13        0.47    1.08
A-7     7.93     -84      7.34        -58      0.03      0.02     0.01   1.63      0.10      1.53        5.61    12.71
A-8     5.42     36       5.22        42       0.01      0.01     0.01   0.45      0.13      0.32        1.17    2.66
A-9     5.03     83       5.01        85       0.03      0.03     0.01   0.40      0.37      tr.         tr.     tr.
A-10    5.25     71       5.14        78       0.04      0.02     0.02   0.48      0.46      tr.         tr.     tr.
A-11    6.11     22       5.95        30       0.11      0.04     0.07   4.31      4.18      0.13        0.47    1.08
A-12    4.21     133      3.47        173      0.02      0.01     0.02   4.05      4.03      tr.         tr.     tr.
Table 5. Characteristics of the samples of technogenous sediments from the dump-field
To define the risk of acidity production it is also necessary to know the neutralisation potential (NP), which defines the content of neutralising matter in the dump-field able to neutralise the acidity produced by the dump. The distribution of the NP values within the individual parts of the Podlipa dump-field shows substantial differences (from 0 to 127.1, on average 27.1; Tab. 6) and is more or less in negative correlation to the Total Acidity Production (TAP). For example, the TAP value of 13.125 (in sample A-6, where the highest Stot. of 0.42 % and SS of 0.27 % were described; Tab. 5) corresponds to an NP of 10.8, while the lowest TAP value of 0.3125 (sample A-8) corresponds to an NP of 26.6. A higher NP of 127.1 occurs only in sample A-7 (Tab. 6), where the highest Ctot. was determined (equal to 12.71 % CaCO3, i.e. 127.1 kg.t-1; Tab. 5).
The Net Neutralisation Potential (NNP = NP – TAP) is equal to the quantity of neutralisation matter (usually in kg CaCO3 per 1 ton of material) necessary for the neutralisation of the acidity produced by the dump-field material. The NNP values at the Podlipa dump-field are presented in Tab. 6. The results show that to neutralise the dump material it would be necessary to use a quantity of neutralisation reagent equal on average to 23.5 kg CaCO3 per 1 ton of dump matter. The risk of Acid Mine Drainage (AMD) water formation is best expressed by the NP:TAP ratio. If it is close to 1, AMD formation is highly feasible; when the ratio is equal to or greater than 3, the risk of AMD formation is negligible (Sobek et al., 1978).

Sample  TAP    NP     NNP    NP:TAP
A-1     7.81   44.8   37.0   5.7
A-2     0.62   39.9   39.3   63.8
A-3     3.12   23.2   20.1   7.4
A-4     10.31  6.6    -3.7   0.6
A-5     1.56   35.7   34.1   22.8
A-6     13.12  10.8   -2.3   0.8
A-7     0.93   127.1  126.2  135.6
A-8     0.31   26.6   26.3   85.1
A-9     0.93   0      -0.9   0.0
A-10    1.25   0      -1.3   0.0
A-11    3.43   10.8   7.4    3.1
A-12    0.62   0      -0.6   0.0
Mean    3.7    27.1   23.5   7.4
Table 6. Values of Total Acidity Production (TAP), Neutralisation Potential (NP) and Net Neutralisation Potential (NNP)

If we consider the average NP:TAP ratio at the Podlipa dump-field (7.4; Tab. 6), the risk of AMD formation appears to be excluded. However, such a high average NP:TAP ratio is caused by a single value, from the dump of the Empfängnis adit (NP:TAP = 135.6). If this one extreme value is excluded, the average NP:TAP ratio changes to 1.72, which corresponds to a low risk of AMD creation.
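The acid-base accounting used here reduces to a few formulas. The sketch below restates TAP = 31.25 x Stot and NNP = NP - TAP, together with the NP:TAP risk reading of Sobek et al. (1978); the intermediate label for ratios between 1 and 3 is our own interpolation, and the function name is illustrative:

```python
def acid_base_account(s_tot_percent: float, np_value: float) -> dict:
    """Return TAP, NNP and a qualitative AMD risk for one sample."""
    tap = 31.25 * s_tot_percent                      # potential acidity, kg CaCO3/t
    ratio = np_value / tap if tap else float("inf")
    if ratio >= 3:
        risk = "negligible"
    elif ratio >= 1:
        risk = "possible"                            # interpolated label, not from the source
    else:
        risk = "highly feasible"
    return {"TAP": tap, "NNP": np_value - tap, "NP:TAP": ratio, "risk": risk}

# Sample A-6 (Stot 0.42 %, NP 10.8) from Tables 5 and 6:
print(acid_base_account(0.42, 10.8))
# -> TAP ~13.1, NNP ~-2.3, NP:TAP ~0.8, risk "highly feasible"
```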
3.1.2 Heavy metal sorption on clay minerals and hydrogoethite
X-ray diffraction analysis proved that the most important potential natural sorbents in the studied area are the clay minerals and hydrogoethite, FeO(OH)·nH2O, which form during the weathering of the rock material. The research confirmed that the clay minerals are represented by illite, (K,H3O)(Al,Mg,Fe)2(Si,Al)4O10[(OH)2,(H2O)], and muscovite, KAl2(AlSi3O10)(F,OH)2, by kaolinite, Al2Si2O5(OH)4, and by a smectite and chlorite mixture. Illite and muscovite are dominant in all samples; the next most important mineral is smectite. The study of heavy metal sorption on the clay minerals and hydrogoethite from the technogenous dump sediments, and of the free sorption capacity of these natural sorbents in the individual samples, is a relatively complex problem, and the interpretation of the data is
very confusing. The interpretation is better reproducible if the concentrations of the individual elements in the technogenous sediment, in the clay mineral mixture and in the clay fraction after maceration in drainage water (Tab. 7) are presented as total values for each element (Tabs. 1 – 4). Such total data enable a better understanding of the studied processes and trends.

El.     Fe   Cu    Pb  Zn   Cd   Bi   Co  Ni  As  Sb   Ag   Cr  Sn   V
µg.l-1  486  9864  12  189  0.3  2.1  44  23  15  8.4  0.1  8   0.2  –
Table 7. ICP-MS analysis of the drainage water used for the 14 days maceration of the clay fraction
Fig. 8. Total content of Fe, As, Sb, Pb, Cu and Th in the technogenous sediments (A), in the clay fraction (Ac) and in the clay fraction after 14 days of maceration in heavy metal-bearing drainage water (A*).

Preferential sorption of Cr and Th on the surface of the clay minerals, in comparison with hydrogoethite, was described. Cu and Zn (± Fe, Cd, Co) are preferentially fixed on the hydrogoethite surface. The following elements show no clear trends of preferential sorption either on the clay minerals or on the hydrogoethite-rich rock: Sb, Sn, Pb, Ag, Ni, As and U (Tabs. 1 – 4).
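The reading of the sorption trends from the contents can be made explicit with a small helper. The classification rules below paraphrase the comparisons made in the text (A vs. Ac vs. A*); the thresholds and names are ours, and the helper is applied here to single samples rather than to the summed totals of Figs. 8 and 9:

```python
def sorption_trend(a: float, ac: float, a_star: float) -> str:
    """Classify an element from its sediment (a), clay (ac) and
    macerated-clay (a_star) concentrations."""
    enriched_in_clay = ac > a      # preferential fixation on clay minerals
    free_capacity = a_star > ac    # additional uptake during maceration
    if enriched_in_clay and free_capacity:
        return "sorbed on clay, with free sorption capacity"
    if not enriched_in_clay and a_star < ac:
        return "washed out during maceration"
    return "no single clear trend"

# Examples with sample A-1 from Table 1 (values in ppm):
print("As:", sorption_trend(162, 258, 628))    # sorbed, with free capacity
print("Cu:", sorption_trend(2829, 1693, 2345)) # no single clear trend for this
                                               # one sample; the chapter's Cu
                                               # conclusion uses summed totals
```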
The heavy metals Fe, As, Sb, Ag, Pb, Zn, Bi and U show not only good sorption efficiency on the clay minerals but also a free sorption capacity of the clay fraction. The opposite trend – a lower heavy metal content in the clay component than in the sediment, and washing out of the metals during maceration – was proved in the case of Th and Cu (Figure 8). Co shows a moderate increase of content in the clay minerals, but no free sorption capacity was proved. The behaviour of Cd, Ni, Co, V and Cr is very complex (Figure 9).
Fig. 9. Total content of Cd and Ni in the technogenous sediments (A), in the clay fraction (Ac) and in the clay fraction after 14 days of maceration in heavy metal-bearing drainage water (A*).

Cd, Ni and V are preferentially fixed in the sediments; lower Cd, Ni and V contents occur in the clay fraction, but the clay mineral mixture proved a good ability to fix these heavy metals on its surface (Figure 9). The probable reason for this behaviour is that the majority of the Cd, Ni and V is bound in the solid phase and forms soluble species only with difficulty, so that during the weathering process the autochthonous clay minerals are insufficiently saturated by these metals. The same trend was described for Cd, Ni, V and Co in the case of hydrogoethite. The Cr behaviour is very similar, with the difference that while the V concentrations in the macerated clay are higher than in the original sediment, the Cr concentrations are highest in the original sediment. The most complex relations were recognised in the case of Co. The highest Th contents were described in the sediments and in the soil. The Th contents in the clay minerals are lower than in the sediment, and during maceration Th is washed out from the clays. This trend is noticeable because U is generally considered to be more mobile than Th. The better mobility of U at the Ľubietová deposit is documented by the fact that the Th content in the soil is several times higher than the U content, while in the plants the U and Th contents are, as a consequence of the better U mobility, approximately identical; the Th/U ratio is about 1 : 1.

3.2 Plant contamination by heavy metals
The plants adapted to the specific conditions of the different zones of the studied area show different levels of contamination in the individual tissues (roots, twigs/stems, leaves/needles, flowers/fruits). This chapter also presents results of the study of plant tissue degradation under heavy metal contaminated conditions and compares them with those from the reference sites. Knowledge of the heavy metal content in plants is important from the point of view of food chain contamination. The intensity of plant contamination by heavy metals must be assessed separately for each plant species.
Table 8. Analyses of plant tissues from the dump-field Ľubietová – Podlipa (Fe, Cu, Zn, Pb, Ag, Cd, Ni, Co, As and Sb contents, in ppm, in the individual parts of Betula pendula (LB-7), Quercus petraea (LB-100), Pinus sylvestris (LB-103), Picea abies (LB-14), Abies alba (LB-8), Juncus articulatus (LB-1), Mentha longifolia (LB-112), Salix fragilis (LB-2) and Acetosella vulgaris (LB-13)). Explanations to Tabs. 8 and 9: a – roots; b – live branches and stems; b2 – dead branches; c – leaves and needles; d – flowers and fruits

The contents of the heavy metals in the plant tissues decrease in the following order: Fe, Zn, Pb and Cu. The additive concentrations of heavy metals in the individual types of plant tissues (roots, branches/stems and leaves/needles) were compared to obtain a more complex contamination model of the plant tissues. The comparison was performed with the following wood species and herbs: Betula pendula, Quercus petraea, Picea abies, Abies alba, Pinus sylvestris, Juncus articulatus and Mentha longifolia. The study showed (Figure 10) that the highest concentrations of Fe, Pb, As, Sb and Zn were determined in the root system. Cu
preferentially contaminates the leaves and needles. Fe probably enters the leaves and needles preferentially up to a certain concentration level; when this level is exceeded, Fe accumulates in the root system because it can no longer enter the leaves and needles. The contamination of the tissues of live and dead branches was compared in several plants from the dump-field (Betula pendula, Pinus sylvestris, Picea abies) and the reference site (Betula pendula, Quercus petraea) (Tabs. 8 and 9).
Sampl e
Betula pendula
LB-10
Quercus petraea
LB-9
Pinus sylvestris
LB-101
Picea abies
LB-102
Abies alba
LB-113
Juncus articulatus
LB-114
Mentha longifolia
LB-101
Part of the
Fe
a b b2 c a b b2 c a b c a b c d a b c a b c d a b c d
192.0 107.7 108.8 209.0 174.2 67.1 158.6 123.2 111.1 84.0 136.8 121.6 95.2 90.8 84.1 98.1 112.8 41.7 199.5 103.2 100.4 246.0 89.4 593.6 -
plant
Cu
Zn
Pb
Ag
Cd
Ni
Co
As
Sb
0.06 0.02 0.12 0.07 0.06 0.04 0.06 0.03 0.02 0.02 0.06 0.05 0.60 0.52 0.02 0.04 0.04 0.03 0.08 0.05 0.03 0.06 0.04 0.10 -
5.10 3.00 1.70 4.70 7.82 3.40 3.80 2.80 2.00 2.20 4.50 4.11 7.10 7.00 8.20 5.13 4.30 4.90 5.10 3.00 2.99 4.70 2.40 7.60 -
2.30 7.70 6.70 1.80 1.53 1.80 1.00 1.30 2.00 0.30 1.30 4.10 5.10 4.20 0.70 1.11 1.40 0.70 1.00 1.30 1.30 1.20 0.50 3.30 -
0.31 0.12 0.20 0.14 0.35 0.06 0.63 0.16 0.16 0.04 0.13 0.33 0.07 0.17 0.57 0.39 0.07 0.15 0.59 0.16 0.17 0.31 0.57 0.05 0.76 1.44
0.20 0.27 0.05 0.04 0.19 0.04 0.07 0.09 0.51 0.03 0.06 0.52 0.06 0.08 0.16 1.20 0.03 0.11 0.18 0.07 0.05 0.06 0.21 0.08 0.26 2.07
ppm 2.0 3.5 5.6 6.9 4.2 0.0 3.2 3.1 8.1 0.0 6.6 4.5 8.1 7.4 2.1 3.4 4.6 3.0 34.1 8.8 8.0 100.4 14.3 173.3 -
27.10 16.90 41.30 25.50 32.00 25.40 36.30 27.50 10.20 12.80 22.90 22.40 10.20 10.01 8.63 19.70 20.00 17.20 39.50 42.90 38.12 23.30 12.10 38.30 -
19.20 14.80 75.60 3.10 31.10 2.00 30.40 22.20 10.80 11.40 17.00 28.60 1.15 1.11 0.98 10.00 11.80 22.40 4.30 15.40 15.41 16.00 11.40 41.20 -
0.02 1.30 1.90 5.40 0.08 0.02 0.20 0.05 0.05 1.20 0.10 0.04 0.55 0.48 0.60 0.80 0.80 3.50 0.20 0.50 0.49 0.20 0.04 2.40 -
Table 9. Analyses of plant tissues from the reference site Ľubietová – Podlipa
3.2.1 Plant tissue defects
Besides higher concentrations of heavy metals, plants at the mine waste dumps suffer from a lack of organic nutrients and moisture. They mainly populate depressions or weathered parts of the dumps. Perennial plants prevail at the old mine waste dumps; annual and biennial plants are rare. Changes of pH and oxidation-reduction potential affect the release and precipitation of heavy metals, causing the transfer of the heavy metals into the bottom sediments, solution or soil/rock environment. Bowen (1979) states that Ag, As, Cd, Cu, Pb, Sb and Zn have the tendency to accumulate in the upper soil horizon due to vegetation recycling, atmospheric deposition and adsorption by organic mass.
Fig. 10. Additive heavy metal concentrations in individual types of plant tissues: a – roots, b – twigs/stems, c – leaves/needles
Fe, Co and Ni usually accumulate in higher concentrations in dislocated clay minerals and authigenic sesquioxides in the lower horizons of the soil profile, enriched in clay and oxyhydroxide components. The current year shoots in live branches of Betula pendula and Pinus sylvestris (Figure 11) from the dump-field are narrow (10/100 μm vs. 2000 μm wide year shoots from the reference area, Figure 12).
Fig. 11. Comparison of the current year shoots in the analysed samples of Pinus sylvestris from the dump-field (1 – living twig, 2 – dead twig) and from the reference site (3 – living twig, 4 – dead twig).
Fig. 12. Betula pendula: formation of extraordinarily tight current year shoots.
Fig. 13. Pinus sylvestris: exfoliation of summer tracheid cell-wall layers.

Anomalous cell-wall exfoliation (Figure 13) and coarsening, the occurrence of calluses, the zonal occurrence of thick-walled fibres (Figure 14), the formation of traumatic resin canals (Figure 15) and numerous hyphae in vessels (Figure 16), as well as the absence of cell-wall coarsening,
suggest a defense mechanism of plants exposed to stress factors at the dump-field, such as contamination by heavy metals, soil and moisture deficiency, and the downslope movement of incohesive material.
Fig. 14. Betula pendula: zonal occurrence of thick-walled fibres.
Fig. 15. Pinus sylvestris: formation of traumatic resin canals close to the calluses.
Fig. 16. Betula pendula: a – scalariform perforation in vessels, b – hyphae.
Fig. 17. Pinus sylvestris: absence of the cell-wall coarsening.

Plant contamination by Fe causes the atrophy of plant tops and root coarsening. It has been observed in Picea abies, where deformation of the tree tops and the formation of "stork nests" occurred. Plant contamination by Cu causes the formation of dead stains on the lower leaves of the stem, purple and violet stem colouring (Acetosella vulgaris), atrophy of the root system and leaf chlorosis with green veining. The loading of the flora by Zn causes the abundant occurrence of plants with leaf chlorosis with green veining (Picea abies, Betula pendula), dead stains on leaf-tips (Acetosella vulgaris) and a rudimentary root system (Picea abies). Plant loading by Ni and Co causes the formation of white stains; white stains were described in Salix fragilis. Plants take up Cd mostly through the roots
in the Cd2+ form by diffusion, due to metal chelation by organic acids secreted by the plant root system. The highest Cd contents are in the roots, lower contents in the leaves, and then in the stems and fruits; the lowest Cd contents are in the seeds. Higher Cd contents cause several diseases such as leaf chlorosis, root darkening and the formation of violet-brown stains on the leaves. Such diseases have been observed at the studied locality in Picea abies (needle chlorosis), Quercus petraea (root darkening) and Acetosella vulgaris (formation of violet-brown stains on leaves). Shedding of leaves and needles is also frequently present. A lack of Cd causes an increase of biomass formation. A high Cd content causes disproportional growth of leaves and roots, cell extension and stagnation of cell division. For instance, the needle length of Pinus sylvestris at the mine waste dumps is very small (2 cm).

3.3 Heavy metals in the water
The surface water in the creek draining the valley along the dump-field is gradually contaminated by heavy metals leached from the technogenous sediments of the mining dumps. The drainage water contains relatively high concentrations of Cu (up to 2060 µg.l-1), Fe (up to 584 µg.l-1), Zn (up to 35 µg.l-1) and sometimes also Co (up to 10 µg.l-1) and Pb (up to 5 µg.l-1). The highest As concentration is 6.11 µg.l-1.
Sample  pH   Eh (mV)  Mn   Zn   Cd     Co    Cu    Fe   Ni   Pb    Sb     As
V-1a    6.5  -6       <1   <10  0.04   1.1   2.2   26   4.1  4.2   0.74   <1.0
V-1b    7.5  -58      <1   <10  0.05   2.2   2.7   73   5.9  4.3   <1.00  <1.0
V-1c    6.5  -8       11   <10  <0.05  <1.0  5.1   94   1.2  <1.0  1.03   <1.0
V-2b    6.7  -14      <1   <10  0.13   <1.0  42.1  584  2.1  3.0   <1.00  1.69
V-2c    6.9  -21      <1   <10  0.09   <1.0  38.2  580  1.6  2.9   <1.00  1.54
V-3a    6.7  -12      <1   30   0.04   7.0   1810  86   3.2  2.2   1.12   <1.00
V-3b    6.1  14       <1   40   0.05   9.6   2060  101  4.9  2.8   1.88   3.41
V-3c    6.5  0        21   <10  <0.05  7.6   1980  45   8.5  2.8   2.35   1.14
V-4a    6.7  -14      <1   <10  0.06   3.1   22.2  263  2.1  4.2   1.72   <1.0
V-4b    6.2  14       <1   <10  0.06   8.1   1850  274  5.6  3.6   1.57   1.21
V-5a    6.2  -11      <1   <10  0.06   5.5   6.0   170  6.0  4.8   1.66   2.79
V-5b    6.3  -8       7    20   0.08   8.3   7.9   210  7.1  5.1   2.21   3.21
V-5d    6.2  -7       4    30   0.07   6.6   8.1   160  8.1  1.0   2.00   1.08
V-6a    7.6  -62      <1   30   0.07   1.9   30.4  270  4.3  3.2   2.00   6.02
V-6b    7.1  -62      <1   32   0.07   2.2   34.8  263  5.0  3.4   2.01   6.11
Table 10. Atomic absorption spectrometric analyses of surface water (concentrations in µg.l-1). Explanations: samples marked by index "a" – rainy period (June 14th 2006); index "b" – dry period (February 25th 2007); index "c" – rainy period (March 31st 2008); index "d" – dry period (May 27th 2008)
Fig. 18. Comparison of Fe, Cu, As and Cd contents during dry and rainy seasons.

Sample | pH   | Eh (mV) | Fe    | Ni   | Mn | Zn    | Cu   | Cd    | Pb   | Bi    | As   | Sb
G-1a   | 6.55 | -4      | 11    | 1.1  | <5 | <10   | 22.0 | <0.05 | <1.1 | 1.36  | <1   | <1.0
G-1b   | 6.63 | -10     | 17    | 1.2  | <5 | <10   | 1.3  | <0.05 | 1.9  | 1.55  | <1   | <1.0
G-2a   | 6.72 | -14     | 366   | 1.3  | 18 | <10   | 3.0  | <0.05 | 1.3  | <1.00 | <1   | <1.0
G-2b   | 6.23 | -16     | 210   | 1.5  | 8  | <10   | 2.2  | <0.05 | <1.0 | <1.00 | <1   | <1.0
G-3a   | 6.85 | -21     | 146   | <1.0 | 15 | 61.0  | 14.0 | 0.13  | 3.4  | <1.00 | 5    | 1.42
G-3b   | 6.55 | -23     | 120   | <1.0 | 17 | 350.0 | 5.9  | 0.1   | 3.3  | <1.00 | 1.52 | 1.21
G-4a   | 6.40 | +4      | 380   | 5.0  | 20 | <10   | 30.0 | 0.5   | <1.0 | <1.00 | 1.98 | <1.0
G-4b   | 6.48 | -2      | 2 260 | <1.0 | 55 | <10   | 181  | 82.0  | <1.0 | <1.00 | 2.52 | <1.0

Table 11. Atomic absorption spectrometric analyses of groundwater (samples G-1, G-2 and G-3) and mineral water (sample G-4); metal concentrations in µg.l-1. Explanations: a – sampled on March 31st 2008 during the rainy period; b – sampled on May 27th 2008 during the dry period.
The heavy metal content in the water is in most cases higher during the dry period than during the rainy period (Figure 18). The As content in both the surface (and drainage) water and the groundwater is not high (at most 6.11 µg.l-1). Speciation of the As proved only the presence of the less toxic As5+; the more toxic inorganic As3+ is not present.
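For illustration, the seasonal comparison behind Figure 18 can be reproduced from Table 10 with a few lines of code. The sketch below is not part of the original study: it uses only a subset of the Cu and Fe values and the period coding from the table caption (indices "a" and "c" are rainy-period samples, "b" and "d" are dry-period samples).

# Minimal sketch (assumed subset of Table 10) of the dry/rainy comparison.
RAINY, DRY = {"a", "c"}, {"b", "d"}

# (sample, Cu, Fe) in ug/l -- a few rows of Table 10 for illustration
table_10 = [
    ("V-1a", 2.2, 26), ("V-1b", 2.7, 73), ("V-1c", 5.1, 94),
    ("V-2b", 42.1, 584), ("V-2c", 38.2, 580),
    ("V-5a", 6.0, 170), ("V-5b", 7.9, 210), ("V-5d", 8.1, 160),
]

def mean_by_period(rows, metal_index, periods):
    # the last character of the sample name encodes the sampling period
    vals = [row[metal_index] for row in rows if row[0][-1] in periods]
    return sum(vals) / len(vals)

for metal, i in (("Cu", 1), ("Fe", 2)):
    print(metal,
          "rainy:", round(mean_by_period(table_10, i, RAINY), 1),
          "dry:", round(mean_by_period(table_10, i, DRY), 1))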
The presence of Acidithiobacteria or of sulphate-reducing bacteria was not proved. The acidity of both the surface water and the groundwater is close to neutral pH (6.4 – 7.6), so the formation of acid mine drainage water is not probable. The most contaminated is the mineral water from the Linhart spring (Figure 1): its total radioactivity is 6,498 Bq.l-1, and the Fe (380 µg.l-1), Cu (181 µg.l-1), Pb (1 µg.l-1) and Cd (82.0 µg.l-1) contents substantially exceed the limits of the Slovak decrees No. 296/2005 Coll. and No. 354/2006 Coll. Precipitation of cementation copper on iron sheets immersed in a sink, situated in a field depression beneath the dump of the Empfängnis adit (Figure 1), was described. The contents of selected heavy metals in the Cu-rich drainage water from the sink are documented in Table 7. Native copper of high fineness (up to 96.07 % Cu) precipitated on the surface of the iron sheets (Figure 19) after two months of maceration.
Fig. 19. Native copper (3) and the Fe-oxides and Cu-carbonates (2) precipitated on the iron surface (1) after two months of maceration.
The ability of the drainage water to precipitate cementation copper on an iron surface makes it possible to realise an Fe0-barrier for the elimination of heavy metals from the drainage water, and thus to contribute to the remediation of the mining country.
3.4 Laboratory testing of the Fe0-barrier
The Fe0-barrier was tested under laboratory conditions using Fe chips and granules (Aldrich) mixed with dolomite (to avoid colmatage) in the ratio 9 : 1. The water containing heavy metals percolated through the agents for 5 hours. The content of all studied metals decreased most intensively during the first two hours of the experiment (Figure 20), when the pH ranged from 6.3 to 8.11 (Figure 21). During the following hours only sorption of Cd and Zn was observed. The most effective was the As sorption (99.97 %); the Cu sorption (98.98 %) and Zn sorption (98.13 %) were also satisfactory. The effectivity of the Cd sorption
(99.64 %) is also acceptable with respect to the relatively low primary Cd content in the drainage water (Tables 7 and 10). The experiment proved that, using dolomite as a calcination agent, it is possible to remove from the water, together with Cu, As, Cd and Zn, also the Fe which is released into solution during cementation.
Fig. 20. Fe0-barrier testing: decreasing content of the heavy metals during the experiment.
Fig. 21. Fe0-barrier testing: pH changes during the experiment.
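The sorption percentages quoted above follow from a simple relation between inlet and outlet concentrations. A minimal sketch of this computation is given below; the inlet/outlet values are invented for illustration, since the chapter reports only the resulting percentages.

# Sketch of the removal-efficiency computation behind the percentages above:
# efficiency = (c0 - c_end) / c0 * 100, where c0 is the inlet concentration
# and c_end the concentration after the Fe0/dolomite barrier.

def removal_efficiency(c0_ug_l: float, c_end_ug_l: float) -> float:
    """Percentage of a metal removed by the barrier."""
    return (c0_ug_l - c_end_ug_l) / c0_ug_l * 100.0

# Hypothetical inlet/outlet pairs chosen only to illustrate the formula.
for metal, c0, c_end in [("As", 6.11, 0.002), ("Cu", 2060.0, 21.0)]:
    print(f"{metal}: {removal_efficiency(c0, c_end):.2f} % removed")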
4. Conclusion
The dump-field mining sediments are influenced by heavy metals from the hydrothermal Cu-mineralisation. The main contaminants, Cu (up to 20 360 ppm), Fe (up to 2.58 %), As (up to 457 ppm), Sb (up to 80 ppm) and Zn (up to 80 ppm), are accompanied by U (up to 10 ppm) and Th (up to 35 ppm). The natural sorbents present are predominantly the clay minerals (illite, muscovite, kaolinite, smectite) and hydrogoethite. The clay minerals are good sorbents of V, Cr, Ti, W, Zr, Nb, Ta and Th, and hydrogoethite of Cu, Zn, Mo, Mn, Mg (± Fe, Cd, Co, Ca). For Fe, As, Sb, Ag, Pb, Zn, Mn, Mo, Bi and U a free sorption capacity was also proved. The paste (rinse) pH of the sediments measured in distilled H2O is around 5.3, and only very few samples show acid values (< 5.0). Measuring the paste pH of the samples in a 1M KCl solution gives similar values, which means that only a few samples show a markedly acid reaction. The acidity production (AP) varies from 0.625 to 10.31 (on average 3.7) and the neutralisation potential (NP CaCO3) from 0.66 to 12.71 kg.t-1 (on average ca. 27.1 kg.t-1 CaCO3). The value of the net neutralisation potential (NNP = NP − AP) and the NP : AP ratio show that the potential for the formation of acid mine drainage water is very limited (NNP = 1.42; NP : AP = 1.72) and the environmental risk is negligible.
The surface water (and drainage water) as well as the groundwater are substantially contaminated, predominantly by Cu, Fe and As. Neither the As content nor its speciation poses an acute risk: the highest arsenic content is only 6.11 µg.l-1 and it is present only in the form of the moderately toxic inorganic As5+. The only risk is posed by the spring of the mineral water Linhart, because of its high radioactivity and high Fe, Cu, Cd and Pb contents. For this reason the spring was closed and is not used for drinking.
The concentrations of the heavy metals in plant tissues decrease in the order Fe, Zn, Pb and Cu. Comparison of the individual types of plant tissue shows that the highest concentrations of heavy metals are in the roots, then in the leaves and stems, and the lowest concentrations are in the flowers, seeds and fruits. The plant tissues from the dump-field are heavily damaged and the current-year shoots are extraordinarily tight. The results of the research document the plant defense reactions under the influence of stress factors at the dump sites (absence of soil and water, heavy metal contamination, mobility of the cohesionless slope material).
The ability of the drainage water to precipitate cementation copper on an iron surface makes it possible to realise an Fe0-barrier for the elimination of heavy metals (Cu, As, Zn, Cd) from the drainage water, and to contribute to the remediation of the mining country. The application of dolomite also enables removal of the Fe released into solution during the cementation process.
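The screening used above reduces to two elementary formulas, NNP = NP − AP and the NP : AP ratio. The following minimal sketch illustrates the computation; the input values are back-calculated from the NNP and NP : AP figures quoted above and are shown only as an illustration, not as measured data.

# Standard acid-base accounting screening (Sobek-type method).
def acid_base_accounting(np_kg_t: float, ap_kg_t: float):
    nnp = np_kg_t - ap_kg_t      # net neutralisation potential
    ratio = np_kg_t / ap_kg_t    # NP : AP ratio
    # screening rule: a ratio well above 1 means the formation of acid
    # mine drainage water is unlikely
    risk = "limited" if ratio > 1.0 else "possible"
    return round(nnp, 2), round(ratio, 2), risk

# AP ~ 1.97 and NP ~ 3.39 kg.t-1 reproduce the quoted NNP = 1.42
# and NP : AP = 1.72 (back-calculated, illustrative values)
print(acid_base_accounting(np_kg_t=3.39, ap_kg_t=1.97))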
5. Acknowledgements The authors wish to thank the APVV and VEGA Grant Agencies. This work was funded by the grants No. APVV-51-015605, APVV-VVCE-0033-07 and MŠ SR VEGA 1/0789/08.
11
Health Technology Management
Roberto Miniati1, Fabrizio Dori1 and Mario Fregonara Medici2
1 University of Florence (Department of Electronics and Telecommunications)
2 Hospital University AOU Careggi
Italy
1. Introduction
The term “health technology management” includes all those activities and actions necessary to carry out a safe, appropriate and economic use of technology in health organizations. The progression of technological innovation has changed many aspects of medical instrument management and the related processes needed to achieve the main aims of a health technology manager, including the acquisition, maintenance planning and replacement of medical devices in a specific environment where continuity of services, privacy, reliability and safety are indispensable. The most important changes concern the integration of different technologies into the same device, the increasing presence of software (stand-alone or not) and the subsequent rise in technological complexity, as well as new device problems such as usability, system alarms and software validation. For standard medical devices – typical electronic or mechanical equipment – official regulations and guidelines provide decision makers with valid and reliable support for correct management. For medical systems, the lack of norms represents a reduction of quality and safety in health care: testing and maintenance planning are carried out according to the particular needs of specific health contexts, without the common strategies or methods that are fundamental for benchmarking. Therefore, even if some general guidelines are being developed, medical software management still represents a critical point for health organizations. In addition to the standard operations related to the management of technology, innovative and appropriate approaches have to be developed in order to maintain high quality and safety levels for healthcare technology. Studies aiming to define the critical points in managing complex systems and medical software have been carried out. Technology management also has to consider the technology in terms of maintenance planning, technical assistance agreements, training courses, quality controls and replacement. All these aspects have to be provided for by specific analyses as an effective response to the needs of the clinical processes (failure and failure-mode analysis, Health Technology Assessment, usability).
2. Technology Management and Safety
2.1 Safety in Health Structures
General Aspects
The use of technology in medical practice implies the existence of a wide range of hazards, given by the critical nature of the processes in the clinical environment in which these devices are generally used, which significantly increases the level of risk. The hazards present in such activities are such that it is not possible to provide absolute protection that virtually nullifies the risk, because this would reduce the clinical efficacy, making the device inappropriate for the purpose for which it was designed. Therefore, in the face of this unacceptable reduction of functionality, a series of measures is adopted to ensure a degree of protection appropriate for the needs of both the patient and the operator, both clinical and in terms of safety. The management of a medical device must be guided by criteria that ensure its safe, appropriate and economical use. The main steps for achieving this goal include a well planned purchase, appropriate preventive and corrective maintenance of the equipment (especially performance testing and calibration), and evaluation. It is evident that this perspective involves not only the entire life of the device within the health structure, through monitoring of its working status, but also the acquisition phase, through evaluation of the clinical appropriateness. Indeed, it follows that it is not possible to determine the potentially harmful aspects of the use of a device simply by observing its intrinsic characteristics: it is necessary to evaluate the whole process in which the device is used. The risk analysis process is therefore a management tool through which it is feasible to define a set of critical parameters on the use of equipment, through the analysis of the processes and of the risks related to them.
Risk Analysis Methods
The prevention of accidents cannot focus only on users’ training and performance; it must also involve the entire design of the system. Reason (1990) distinguished active failures, which have immediate consequences, from latent errors, which remain ‘silent’ in the system until a certain triggering event causes them to manifest themselves, resulting in more or less serious consequences. The human operator is part of the incident, but the real cause (root cause) has to be found in bad managerial and organizational decisions. Up to now, most efforts to reduce errors have focused on identifying the active errors, the clerical errors made by medical and nursing staff. Recently it has come to light that errors of organizational origin, that is, latent errors, also play a determining role. The safety of the patient therefore comes from the ability to design and manage processes capable, on the one hand, of curbing the effects of the errors that occur (protection) and, on the other, of reducing the likelihood that such errors occur (prevention). Two types of analysis can be used:
• Reactive analysis;
• Proactive analysis.
The reactive analysis provides a post-incident study and is aimed at identifying the causes that allowed its occurrence. The most used approaches in hospitals are the following:
- Incident reporting:
Incident reporting is defined as a means of collecting records of events in a structured way, in order to provide a basis for analysis and for the preparation of correction and improvement strategies and actions to prevent future accidents (some examples are the ASRS ‘Aviation Safety Reporting System’ created by NASA, the AIMS ‘Australian Incident Monitoring System’ introduced in health care in 1996, and the NPSA ‘National Patient Safety Agency’ of the local health organization in the United Kingdom).
- Reviews:
This procedure concerns identifying errors in medicine through the analysis of clinical records, in order to estimate and compare the expected results. The method is usually composed of the following steps:
1. identification of the population and selection of the reference sample;
2. patient selection according to clinical and statistical criteria;
3. analysis of any report of an adverse event;
4. identification of possible adverse events with the aim of prevention.
- The Root Cause Analysis ‘RCA’:
In the RCA it is essential that the intervention is focused on the cause, not on the problem. The RCA focuses first on the system and processes, and then on personal performance. There are different techniques for conducting an RCA, such as the “herringbone diagram,” the “5 Whys” and the “process map.”
Proactive Analysis
The proactive analysis aims to identify and eliminate the criticalities of the system before the accident occurs, by analyzing the process activities in order to identify the critical points, with the aim of improving the safety of the whole system. Some of the most significant proactive analysis methods are the following:
- Quality Function Deployment ‘QFD’:
QFD has been derived from the latest evolutions in the approach to quality management, in which the focus has gradually moved from the finished product (quality by inspection), to the production process (quality by control), and finally to the design (quality by design). The concept of QFD can be formulated as follows: it is a technique that transforms the needs of the customers into quality characteristics, which are incorporated into the project and brought forward as a priority (deployment) into the process and then into the product, whose result depends on the network of these relations. This planning path can reduce the need for subsequent changes in the product and processes, reducing time and control costs. The different phases of QFD are:
• identification of the customers’/users’ needs;
• product or process evaluation through the analysis of attributes and characteristics;
• state-of-the-art comparison;
• design definition;
• identification of the relations between the customer needs and the technology and technical features of the design;
• benchmarking.
- Decision Tree:
This is a graphical tool used for the formulation and evaluation of alternative choices in the decision making process, in order to solve complex business problems. It focuses on the
starting condition (context, means, resources), the possible alternatives (the feasibility of each is studied in probabilistic terms) and the results to which these choices lead. Graphically, the decision tree consists of a diagram where the various alternative options are inserted in rectangular boxes, with all the possible results written beside them.
- Failure Mode and Effect Analysis:
The Failure Mode and Effect (and Criticality) Analysis ‘FME(C)A’ is a prevention technique which, along with functional analysis and problem solving, is usually recognized as one of the three basic techniques for systems improvement. It takes into consideration all the possible system inconveniences (errors, faults, defects) through brainstorming or other techniques. For each inconvenience a level of criticality is defined, which depends on three critical factors: the occurrence of the event, its severity and its detection, that is, the degree to which the inconvenience can be detected. These three factors are first measured separately and then combined into a single index (the risk priority number, typically computed as the product occurrence × severity × detection). By applying a criticality criterion it is possible to determine which events have priority for intervention, through preventive actions to minimize the occurrence of the event, protection protocols to reduce the severity, and control systems to increase the detection. This technique is codified in a specific European technical standard: EN 60812:2006.
- Fault Tree Analysis:
The Fault Tree Analysis ‘FTA’ is a deductive method of analysis: it starts from a ‘general’ analysis according to the failure type, and then proceeds to detect the faults of the specific components. This method is codified in the standard IEC 1025:1990.
Case Study: Analysis of Training and Technical Documentation in Technology Management in Health Care
The training of users is important for the safety aspects of technology management in healthcare: a more correct and safer use of medical devices is strictly correlated with the specific clinical activity and with the respect of procedures, legal standards and norms. The analysis of the technical documentation is of equally high importance: it is one of the most important tools for controlling and evaluating the quality of the documentation provided by the manufacturer, the correct use of the devices and whether training has been performed, with the purpose of maintaining performance and safety at the appropriate levels for the whole life cycle of the device.
Methods
The survey method was divided into two phases: firstly the training needs analysis, and secondly the development of a course modeled on the identified training needs. Each phase is divided into the following steps:
PHASE 1:
a) personnel interviews through the distribution of a check list;
b) data collection and organization in a predefined and common form;
c) analysis of the results;
PHASE 2:
a) design of a training course based on the critical points identified during the previous phase;
b) development of the course and related events;
c) investigation of the critical points and reports that emerged during the course;
d) analysis of the data collected in the preceding step and observation of the critical points that emerged.
The first point was to identify the basic elements to consider in order to establish objective criteria for the quality assessment of the technical documentation. As a general consideration, a user guide has to be clear and comprehensive, and easy and fast to consult when and where necessary. In support of these topics, the technical standards underline how the language should be clear, direct and unambiguous (standard UNI 10893, “Technical Product Documentation. Instructions for use – Articulation and display order of content”). Table 1 summarizes the aspects to be considered and possible technical suggestions for improvement.

Evaluation Parameter | Technical Suggestion
Age, availability and identification of the technical documentation. | The instructions for use must be produced in a durable form, with a clear identification of the editorial year, and placed in a location known and accessible to all, depending on the conditions of use.
Ease of reference and traceability of information when it is necessary. | A brief summary of the arguments and easy use of the index.
Identification of users and the manufacturer. Definition of the purpose of the manual, identification of the product and the original language. |
Graphic and editorial skills assessment. | Good and appropriate graphics, structure and format, in order to stimulate reading.
Good content organization by appropriate separation of topics and effectiveness of the terminology. | Appropriate design (outline of the arguments) and use of short, uniform and synthetic titles.
Level of readability and completeness of the text. | Use of short and concise sentences, correct punctuation, use of verbs in the indefinite and active form. Elementary syntax with little use of subordinate clauses.
Ease of comprehension of the instructions. | Adoption of the principles of communication “see – think – use”, with great use of illustrations accompanied by brief explanations of the actions to be taken.
Detection of the specific product and/or service. | Refer unambiguously to the specific product.

Table 1. Evaluation indicators and important aspects to consider for well designed technical documentation.
In addition, the following points can represent a reference scheme for the evaluation of the efficacy and comprehension of information:
• Ease of consultation and traceability of information when it is needed (short summary of the arguments in the analysis and use).
• Effective terminology and content organization;
• Readability and completeness of the texts (syntax appropriate to the specific reader).
• Ease of understanding.
Further, once these criteria are integrated with the criteria for the quality of technical documentation from the standards UNI 10893 and UNI 10653 (“Technical documentation. Quality of design and product”), the complete list of contents for the instruction manual can be defined as:
• Title page / general instructions / introduction;
• How to install and/or link with other equipment and/or with the plant;
• Technical data;
• Operational use / instructions for the operator;
• Preventive and corrective maintenance;
• User tasks;
• Cleaning and sanitation;
• Diagnostics (analysis of failures and alarms according to the model: Event – Cause – Remedy – Residual risk).
Finally, in addition to the above list, it would be necessary to add any specific information that the law requires for certain types of product. In order to carry out an investigation on the matters discussed, a check list was distributed to the personnel with the aim of exploring the most critical aspects regarding the use of medical equipment.
Results
The check lists were distributed to previously selected departments, where a designated contact person for each department suggested the equipment to be included in the survey. The criteria for equipment selection were based on the intrinsic criticality inside the clinical process, defining as critical the importance of the specific equipment in the clinical activities carried out in the specific department. In addition, the person designated to fill in the check list was also instructed to consider the experience of other users who habitually used the selected equipment. The data obtained belong to 47 devices in 5 departments, pertaining to the areas of laboratory medicine and general medicine. Firstly, the results show that the percentage of users who consider themselves to have a good knowledge of device functionality (answering “yes, good”) is less than 50% (Figure 1a). Figure 1b shows the results related to knowledge of the safety aspects of the device (alarms, operational risks, other risks): the response is positive for 40%, while the remaining 60% indicate a partial or low knowledge of the safety aspects, with a significant 30% of low-level answers (“just”, “very little”, “nothing”).
Fig. 1. a) Level of knowledge expressed by users on medical device functional aspects; b) level of knowledge expressed by users on medical device safety aspects.
Further, it is significant to analyze the percentage of equipment for which training was given: approximately 40% of the equipment tested is used without any course (see Figure 2).
Fig. 2. Medical devices for which users have received training.
By crossing this result with the trend of Figure 3b it is possible to see many situations in which the lack of a training course is related to a significant difficulty in use: most users’ votes are “5” and over. Figure 3a confirms and summarizes this trend by showing the percentage of equipment (with and without previous training) considered difficult (votes higher than 5).
Fig. 3. a) Complexity of use of medical devices according to the user interviews; b) complexity of use of medical devices used in the hospital without any previous training course.
Another important result regards the presence of the user guide. Figure 4a shows that most of the devices are accompanied by the manual (79%). Further, Figure 4b shows that devices still exist with the user guide in English (40%).
Fig. 4. a) Number of medical devices with an instruction guide; b) medical devices with the user guide in Italian.
Finally, the users’ perception of the quality of the manual is reported in Figure 5: it shows significant fractions equal to or even below “sufficiency” (vote < 6).
Fig. 5. Users’ opinion on user-guide quality: vote 1 bad … vote 10 good.
Discussion and Further Developments
The collected data have underlined the following topics:
1. Knowledge of the equipment functionality, especially of those aspects related to safety, is low.
2. Technical documentation and the other instruments identified by the legislature to improve this knowledge are not perceived by operators as fully adequate for their aim, because they are judged insufficient in quality or incomplete in contents.
3. The importance of training courses based on the critical aspects of equipment safety, especially concerning safe use, documentation and training.
Finally, further developments regard the distribution of the check list in more critical areas such as operating rooms or intensive care units, where the first data indicate a problematic situation, for example with the anaesthetic machines, which present good reviews on training but poor reviews concerning the technical documentation. This lack of knowledge of certain aspects is responsible for bottlenecks, repetitions and delays in clinical activities.
3. Health Technology Assessment
3.1 Hospital Based Health Technology Assessment
The term ‘HTA’, Health Technology Assessment, signifies the multi-disciplinary evaluation of the properties, effects and/or impacts of health care technology, with the aim of improving efficacy and effectiveness in health care services. An HTA process starts with the analysis of clinical needs and evaluates the patient outcomes of different technologies, using a multidisciplinary approach in order to consider, in addition to the clinical efficacy, the technical, usability and economic aspects related to the medical devices. When HTA is performed on a hospital scale, the process is defined as Hospital Based HTA (HB-HTA). It is substantially different from standard HTA processes, usually carried out on a regional or national scale, which consider both the diffusion and the innovation of all technologies, especially high technology, and are mainly oriented to clinical and economic aspects such as clinical efficacy and hospitalization times.
In hospitals, the HTA process principally involves medium and low technologies, and its main aim is often only to compete with other structures in technological innovation. This leads hospitals to undergo HTA processes to assess the effects of this gain in equipment complexity, rather than to plan technology replacement according to real needs and clinical improvements. Furthermore, the increasing level of management expenditure in spite of limited economic resources requires more attention to acquisition contracts, especially to technical aspects such as maintenance, safety controls and personnel training, in order to ensure an appropriate and safe use of the device. Further, the hospital scale strongly characterizes technology depending on its destination of use. The presence or absence of the patient, and the experience and preparation of the technical personnel, are important aspects to consider in HB-HTA processes, as is analysing the impact of inserting the device into specific healthcare processes. Finally, the increasing presence of medical software in hospitals represents a new challenge for HTA processes, especially where critical departments, such as Intensive Care, are strongly dependent on the medical software responsible for the central monitoring station.
Case Study: an Experience of HB-HTA at a Complex Hospital
According to the needs of different health structures, the use of a general HTA methodology is essential in order to define the most appropriate and objective Key Performance Indicators (KPI) for an appropriate evaluation of the safety, efficacy and costs of medical equipment (Cohen et al., 1995). The methodology has been tested at the University of Florence Hospital Careggi (AOU Careggi) and applied to the Clinical Engineering database, including the maintenance data of 15,000 medical devices. This study has developed a set of indicators with the aim of providing clinical, safety and economic information to be considered in technology assessment in hospitals. As shown in Figure 6, the result has been the development of a multidisciplinary HTA dashboard, “S.U.R.E.”, organized in four sections: Safety, Usage, Reliability and Economics.
Fig. 6. The multidisciplinary HTA dashboard SURE.
For the safety section in particular, the technical KPI were previously validated to ensure their usefulness in the dashboard. For instance, “Technology Level” and “Destination of Use” have been demonstrated to be strongly related to the safety and economic fields of SURE. Figure 7 shows a failure analysis using the Global Failure Rate (GFR) and the Age Failure Rate (AFR) applied to medical devices classified according to Technology Level: High-Tech, Medium-Tech, Low-Tech and Limited-Tech devices.
Fig. 7. Age Failure Rate and Global Failure Rate analyses applied to technology level.
Both analyses demonstrate that the number of failures is proportional to the technology level during the whole life of the medical device. The AFR analysis further shows that training aspects are fundamental in obtaining a low AFR: after two years, users are able to use the device correctly. Indeed, a good HTA process has to weigh heavily both the quality of the training courses and the full-risk maintenance, especially for high technology.
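As a minimal sketch, the two indicators can be computed from a device inventory as follows. The AFR follows the definition given later in the chapter (number of failures / number of devices / age of the devices); the GFR is assumed here to be the failures normalized by the number of devices, since the chapter does not spell out its formula. All records are invented for illustration.

from collections import defaultdict

# (technology level, device age in years, number of failures recorded)
devices = [
    ("High-Tech", 4, 9), ("High-Tech", 2, 6),
    ("Medium-Tech", 5, 5), ("Medium-Tech", 3, 2),
    ("Low-Tech", 6, 2), ("Limited-Tech", 7, 1),
]

stats = defaultdict(lambda: {"n": 0, "failures": 0, "age": 0})
for level, age, failures in devices:
    s = stats[level]
    s["n"] += 1
    s["failures"] += failures
    s["age"] += age

for level, s in stats.items():
    gfr = s["failures"] / s["n"]                        # failures per device
    mean_age = s["age"] / s["n"]
    afr = s["failures"] / s["n"] / mean_age             # per device per year
    print(f"{level}: GFR={gfr:.2f}, AFR={afr:.2f}")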
Fig. 8. Age Failure Rate and Global Failure Rate analyses applied to destination of use.
The analysis demonstrates that “Destination of Use” is strongly related to failures in hospitals and, consequently, to safety. Both analyses show that the number of failures is strongly related to the presence of patients (the protocol depends on the patient’s clinical situation), identifying as the most critical areas the Therapeutic (e.g. surgery rooms) and Clinical Support (e.g. in-patient rooms) ones. Further, the AFR analysis demonstrates that the different skills of the users characterize the “Destination of Use”, suggesting that different learning periods (the Laboratory is a very technical area, with well educated technicians working there) should be considered in a complete HTA process. Future development aims to create a simplified SURE in order to provide a continuously sustainable methodology and an easy tool for supporting decision makers in HB-HTA.
Case Study: Medical Software HTA
The recent European Directive 2007/47/EC added “clinical software” to the Medical Device (MD) classification. As a consequence, interest in medical software in health care increased. A critical point in the life cycle of clinical software is the definition of “performance”, which is not usually applied to healthcare (so it is not based on specific safety aspects) but focuses on equipment performance, usually connected to other needs (for instance quality in imaging, or electrical and mechanical safety rules). This study aims to explore and evaluate these aspects of the HTA process for clinical software through the application of an HTA methodology to a software failure analysis. The goal is to identify Key Performance Indicators (KPI) that could be taken both from traditional MD HTA and from software engineering. Four main factors were selected as key points of an HTA process by answering these questions: “Does this technology work?”, “Who can take advantage of it?”, “What are the costs?” and “What comes out of the comparison with the alternatives?” Indicators from a previous KPI set used for traditional HTA were evaluated according to the following criteria: “which ones are suitable for both HTAs” and, if applicable, “how can their meaning be adapted to software (what is their usability for SW?)”, “is the insertion of different indicators necessary (different standards or systems integration)?” or “can the priority scale be modified (maintenance, formation, quality documentation)?” Three groups of KPI (Key Performance Indicators) are provided for classifying software HTA indicators.
The first group includes indicators suitable for both SW and MD processes; the second includes parameters needing conceptual adaptation from software engineering to the medical application before being used in SW-HTA (e.g. usability is replaced by interface complexity); the last group comprises indicators not present in traditional MD HTA, see Table 2. The proposed methodology allowed indicators for medical software HTA to be evaluated, by taking into consideration the evaluation methods typical of medical device approaches and the measuring parameters related to the informatics and software fields.

First Group            | Second Group                     | Third Group
Appropriateness        | Usability / Interface Complexity | Privacy
Training Supported     | Safety                           | Interoperability (HL7/IHE, ...)
Documentation Provided | Personalization Ability          | Reliability
...                    | ...                              | Consistency

Table 2. Proposal of KPI for software HTA.
4. Health Technology Maintenance
A failure is an event responsible for any change in the performance of a device that makes it unusable or unsafe for the patient and the user in the use for which it was intended. According to current standards, maintenance is defined as “a combination of all the technical, administrative and management activities planned during the life cycle of an entity, to keep it in, or return it to, a state in which it can perform the required function.” As mentioned in the definition above, two aspects are fundamental for all maintenance activities: the specific actions to carry out, and the time necessary to correct the failure. Generally two main maintenance systems are carried out: corrective and preventive maintenance. A balanced compromise between the two is the aim of health technology managers.
4.1 Corrective Maintenance
Corrective maintenance is applied in case of a failure, with the aim of repairing the device. Evidence-based maintenance systems (Wang et al., 2006) depend on failure analysis in order to design efficient maintenance plans. Corrective actions and times of intervention are obtained by taking into consideration details concerning the failure and the device involved, such as the failure type and/or the technology complexity. Indeed, health technology managers have to optimize the integration of external assistance and the internal technical service, in order to guarantee an efficient and cost-effective maintenance system. For instance, the training of internal technicians has to respond to the real needs evaluated from a failure analysis and then be integrated with the external assistance provided by the suppliers. Effective maintenance must also consider rapidity in problem resolution, especially in healthcare, where the continuity of services is essential. In some clinical areas where technology is vital for the patient’s life, such as intensive care units or operating theatres, the continuity of the service coincides with the continuity of technological functionality. Here the priority is to minimize the downtime of the device and the
time of intervention, which is essential for planning the suppliers’ assistance services, the internal organization and the hospital procedures.
4.2 Preventive Maintenance
Preventive maintenance includes care or services performed by specific personnel with the aim of keeping the equipment working and/or extending its life, by providing for the systematic inspection, detection and correction of incipient failures either before they occur or before they develop into major defects. In particular, these services aim to improve equipment longevity, the detection of future needs for major service or replacement, the maintenance of proper performance, the prevention of downtime due to mal-performance or failure, and the reduction of repair costs. Two preventive maintenance systems are possible: first level and second level. First-level preventive maintenance is usually carried out by the users, in terms of visual controls and, in some cases, simple tests on the devices; these controls and/or tests have to be described in the user guide provided by the suppliers. Second-level preventive maintenance is performed by qualified personnel and consists of visual checks, safety tests, control of the real efficacy and performance of the first-level preventive maintenance, and the execution of the maintenance programs provided by the manufacturer. The most important variables for an effective preventive maintenance are the frequency and the specific tests to be carried out. According to international standards, the actions must be provided by the manufacturers, even if some specific actions may be suggested by health technology managers as a result of failure analysis. The failure rate trend (Figure 9a) is a good measure for planning the frequency of the interventions. The trend analysis is useful for two main reasons: firstly, it is possible to improve the maintenance system (changes in the frequency and type of interventions) according to the relation between the failure trend and the levels to be reached; secondly, the failure analysis allows control and adaptation of the maintenance system according to the lifecycle of the device. As shown in Figure 9a, the typical trend is composed of three main sections: a high failure rate in early age, due to defective parts and limited practice with the equipment; a constant trend, representing the contribution of “casual” failures that have to be minimized by proper preventive maintenance; and finally, a high failure rate due to the age of the device, caused by wear and by the difficulty of finding the proper components and suitable material for replacement. The preventive maintenance should therefore take these aspects into account and be adapted according to the lifecycle of the device. Figure 9b shows the situation at AOU Careggi. By analyzing the Age Failure Rate ‘AFR’ (calculated as N° of Failures / Number of Devices / Age of the devices) it is possible to place the actual “lifecycle moment” in the central section of the trend of Figure 9a. For future management, all the preventive measures described above should be reinforced in order to minimize the expected increase of the failure rate.
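As a purely illustrative complement to the bathtub curve of Figure 9a, the sketch below composes the three sections described above as the sum of a decreasing, a constant and an increasing Weibull hazard. All parameter values are invented for illustration and are not taken from the chapter.

# Illustrative model of the bathtub-shaped failure rate: an early-life
# (decreasing), a random (constant) and a wear-out (increasing) component.

def weibull_hazard(t: float, shape: float, scale: float) -> float:
    # h(t) = (k / lambda) * (t / lambda) ** (k - 1)
    return (shape / scale) * (t / scale) ** (shape - 1)

def bathtub_rate(t_years: float) -> float:
    early = weibull_hazard(t_years, shape=0.5, scale=1.0)     # infant failures
    random_part = 0.10                                        # casual failures
    wearout = weibull_hazard(t_years, shape=4.0, scale=12.0)  # ageing
    return early + random_part + wearout

for t in (0.25, 1, 3, 6, 9, 12):
    print(f"age {t:>5} y: failure rate ~ {bathtub_rate(t):.3f} per year")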
Fig. 9. Bathtub function / failure rate analysis comparison.
Case Study: Experience of Failure Analysis regarding Medical Devices in Hospitals
This study presents a failure analysis carried out at a complex university hospital using two main indicators: the FRC and the NTI. The Failure Rate Category (FRC) index takes into account the failure rate for a specific category of failure (“N° of Failures / N° of Devices / Failure Type”). Six categories of failure have been considered, classified by analyzing all the technician reports present in the database: “Software (FRC-SW)”, “Electronic/Electric (FRC-ELE)”, “Mechanic (FRC-MEC)”, “Accessories (FRC-ACC)”, “False Alarm (FRC-FA)” and “Unclassifiable (FRC-UN)”. The NTI indicator takes into consideration the “Number of Technical Interventions”, distinguishing whether the technicians come from the internal Clinical Engineering department, from an external private company, or from the supplier’s assistance. By analyzing Figure 10, it is possible to see a characterization between failure type and technology. FRC-ELE and FRC-SW are typical of High-Tech, while FRC-MEC is equally distributed over all classes. It is interesting to observe that Limited-Tech does not present any FRC-False Alarm. It is also important to note that FRC-UN mostly characterizes High- and Medium-Tech. The FRC analysis applied to the destination-of-use classification is reported in Figure 11. FRC-SW characterizes the Therapeutic destination of use. Activity Support does not have FRC-FA; further, FRC-FA is low for the Diagnostic and Lab areas. It is also interesting to note that FRC-ACC is low for the Lab.
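A minimal sketch of how the FRC index defined above can be computed from work-order records is shown below; the records and the device count are hypothetical, not hospital data.

from collections import Counter

N_DEVICES = 120  # assumed size of the device class under analysis

# failure category extracted from each technician report
reports = ["FRC-SW", "FRC-ELE", "FRC-ELE", "FRC-MEC", "FRC-FA",
           "FRC-SW", "FRC-ACC", "FRC-UN", "FRC-ELE", "FRC-FA"]

# FRC: failures of a given category normalized by the number of devices
frc = {cat: n / N_DEVICES for cat, n in Counter(reports).items()}
for cat, rate in sorted(frc.items(), key=lambda kv: -kv[1]):
    print(f"{cat}: {rate:.3f} failures per device")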
Fig. 10. FRC analysis applied to ‘technological complexity’ classification.
Fig. 11. FRC analysis applied to ‘destination of use’ classification.
The application of the NTI to the destination of use is reported in Figure 12. The Diagnostic and Lab areas are the exceptions to the general trend of NTI [destination of use], which normally has the highest values of internal NTI for all uses: both Diagnostic and Lab present higher values for external technical interventions than for internal ones, and both are characterized by High-Tech medical devices. Figure 13 shows the NTI analysis according to failure type and technician provenance (internal or external). It is interesting to observe that Unclassified Failures represent the only typology with more NTI-EXT than internal interventions (unclear reports for external interventions were common during the analysis). Finally, False Alarms result much higher in NTI-INT than in NTI-EXT, because the hospital protocols provide internal personnel as the first call of intervention.
Fig. 12. NTI analysis applied to ‘destination of use’ classification and to technician provenance.
The presence of FRC-FA for both Medium- and Low-Tech suggests that a request for user training should be considered in the acquisition phase also for lower-technology devices. Another essential aspect during the acquisition phase is asking the external companies’ technicians to leave a formal, pre-prepared report of their work after every maintenance intervention, in order to create much more efficient control. This would also help technology managers to better control important hospital areas such as Lab and Diagnostic, which present high NTI-EXT and a high concentration of High-Tech equipment.
Fig. 13. NTI analysis applied to technician provenance.
Case Study: Experience of Failure Analysis regarding Medical Software in Hospitals
515 software failure reports have been analyzed. They have been classified into categories according to the type and the cause of the failure. Further, all “medical software” has been classified according to its “Destination of Use” and according to the software installation: ‘Stand Alone (SA)’ or ‘Hospital Network (NET)’ software. Figure 14 shows the software present at the hospital analyzed.
Fig. 14. Percentage of the different software types present at the hospital.
The reports are divided into six areas of interest, according to the “type of failure”, each addressed by an acronym:
• Non-Software – NS;
• Non-Medical Software – NMS;
• Corrective Maintenance – MC;
• Preventive Maintenance – MP;
• False Alarm – FA;
• Wrong/Bad Use – BU.
Furthermore, in order to also analyze management aspects and the health process, all types of software have been classified according to their “Intended Use” in the hospital (‘Life Support’, ‘Diagnostic’, ‘Therapy’, ‘Laboratory’, ‘Support Activities’) and according to the type of installation (‘Stand Alone’ or ‘Hospital Network’). After the definition of these classes, the last step introduced the index “FR”, defined as the number of failures normalized to the total number of software items:

FR = N° of failures / N° of software     (1)
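A minimal sketch of equation (1), with invented counts broken down by installation type as in the study:

# FR index: failures normalized to the number of software installations.
failures = {"Stand Alone": 172, "Hospital Network": 343}   # assumed counts
installed = {"Stand Alone": 95, "Hospital Network": 410}   # assumed counts

for sw_type in failures:
    fr = failures[sw_type] / installed[sw_type]   # equation (1)
    print(f"{sw_type}: FR = {fr:.2f} failures per software")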
Further, evidence-based relationships exist between the different destinations of use and the software type, see Figure 15: the NET software is distributed in all hospital areas, whilst the SA software is present only in the Life Support and Therapeutic areas.
Fig. 15. a) SA-software hospital distribution; b) NET-software hospital distribution.
Figure 16 shows interesting correlations between software types and failure categories. For NET software, only one third of the total failures belong to software-specific categories such as ‘MC’, ‘MP’ and ‘NMS’, see Figure 16a; further, ‘False Alarm’ (33%) is the highest failure type for this kind of software. Figure 16b shows the situation regarding the Stand Alone software. Also
for this category, only 33% of the total failures are related to software faults, while the category ‘NS’ represents the highest share.
Fig. 16. Failure type analysis applied to software installation: a) Stand Alone; b) NET-connected.
Finally, considering the most critical area, ‘Life Support’, it is clear that all failure types are present here, and only 50% are related to medical software failures, see Figure 17.
Fig. 17. Failure type analysis applied to the ‘Life Support’ areas.
Analyzing the FA failures in more detail, it is interesting to note that the ‘Life Support’ area represents 25% of the total. While the presence of patients in the ‘Therapeutic’ area can justify the highest share (41%), and the high technology in imaging can help to explain the 34% in ‘Diagnostic’, the percentage in Life Support is very high and is the object of further analysis, see Figure 18.
This approach allows a quantitative evaluation of safety and reliability for medical software in hospitals. A multi-dimensional approach has been adopted, evaluating the software applications according to the different user skills and the needs of the clinical areas; it has shown that personnel training could prevent one third of the total software failures. The approach is crucial in terms of multi-dimensional analysis, because it supports the analysis and evaluation of the software by linking the process characteristics and clinical areas of the hospital with the different intrinsic technical characteristics of the software.
Fig. 18. Number of False Alarms according to destination of use in hospital.
5. Future Challenges
5.1 Medical Software Management
Two main groups of Medical Device Software (MDS) are present in healthcare: stand-alone (isolated) MDS, and MDS connected to the HIS (Hospital Information System) or to other MDS. The management of MDS is basically structured in three main areas: procedural, technical and legal issues. At the moment, none of these areas has well defined and reliable standards or methods to follow.
Procedural aspects:
Despite the presence of some international standards, there is a strong lack of detailed procedures, especially for medical devices, including basic procedures for risk assessment, performance control and test certification tasks. Procedures for recall in the case of adverse events are also not yet fully developed. For connected MDS especially, the management is more complicated: the connection to the HIS or to other MDS introduces new potential risks to consider, such as when and which update processes are necessary to maintain the proper functionality, i.e. how much it depends on the actual connection modes given by the current software versions. MDS management also involves organizational problems: it is essential to clearly define the roles for MDS management. For instance in hospitals, when MDS are inserted in the HIS, two different departments are interested in their management, Clinical Engineering and Information Technology. This redundancy is largely responsible for
ineffective services in terms of problem-solving and decision-making rapidity for the MDS and, subsequently, for the patients.
Technical aspects:
Further critical points introduced by the MDS regard the technical aspects concerning data transmission, its security and its integrity. Robustness, protection tools and a user-friendly interface are some of the most important technical aspects to consider during the evaluation of MDS. With regard to connected MDS, most difficulties arise from data compatibility with the HIS and among different MDS: the standardization of data transmission interfacing (DICOM, HL7) would allow different software from different companies to be assembled, with a subsequent simplification of the maintenance and acquisition processes. Finally, the inter-operability and integration of MDS in the HIS must be considered in order to facilitate use by medical personnel, therefore improving clinical efficacy and efficiency.
Legal aspects:
Finally, data protection, for both privacy and the safe storage of clinical information, is extremely important in health care. Standard protection measures such as user authentication modes, data security protections, back-up and data rescue-recovery systems are not always practical in healthcare where, for example, in case of emergency, these systems must not delay the clinical activities. The aim of the health technology manager is to guarantee fast access in specific cases (emergency and rescue) without reducing the security and data protection levels. The hybrid environment composed of IT systems and medical devices, into which connected MDS are inserted, makes the situation even more complicated. Table 3 summarizes the Medical Software issues and challenges cited above.

Management | Needs
Procedural | Test certification; performance evaluation; adverse event recall; role definition for MDS and HIS management; updating procedures
Technical  | Data safety and integrity; correct interfacing with other MDS or the HIS; IT system compatibility
Legal      | Privacy; data protection

Table 3. Main aspects to consider for software management in health care.
5.2 Electro Medical Systems Management
5.2.1 EMS (Electro Medical Systems)
The increasing use of computers, especially in diagnosis and in the monitoring of therapeutic treatment, has led to the synergic use of several medical devices, creating electro medical systems (EMS) with the aim of ensuring additional functionality in applications without decreasing the performance of the individual equipment or increasing risk.
Some examples of EMS are radio-diagnostic equipment, video endoscopes, ultrasonic equipment with personal computers, computerized tomography and magnetic resonance imaging. Safety aspects in EMS concern the characteristics of each individual piece of equipment in addition to the characteristics of the entire system. Further, an EMS can be spatially distributed in the clinical area, having parts both inside and outside the “patient area”, each of which needs a different safety protection:
• within the patient area, the required safety level is that of medical equipment;
• outside the patient area, the required safety level is that of non-medical equipment.
In addition to the typical protocols, a new vision is necessary for EMS life-cycle management, such as, for instance, the “EMS acceptance testing”. This should be performed only once the different parts of the system are connected according to the specifications and the documentation provided by the manufacturer or by the organization responsible for the system. It is important to identify the equipment classed as non-medical, for which the organization must verify the compatibility with the other system components (electrical or not). Finally, regarding the periodic electrical safety inspections, the concept of “separation” is a critical aspect: for instance, a high value of the leakage current (due to functional connections between different parts) should be kept under control by applying the safety standards concerning device separation.
5.2.2 PEMS (Programmable Electrical Medical Systems)
Often, EMS include a significant software component, used to program some parts in order to improve the flexibility of the system. The software turns out to be critical for the functionality and performance of the PEMS, as it is responsible for the proper functioning, stability and reliability of the system. Since it is not possible to apply to software the concept of “technical parameter measures” used for the proper functioning of the other parts of the equipment, it is necessary to move the risk analysis to the design phase of the PEMS, especially concerning the reliability of the software development. Indeed, in order to facilitate the design of a PEMS, its structural definition is fundamental, as it allows an easy identification and implementation of all the requirements concerning safety. The technical documentation of a PEMS describes this structure and thus represents the functional connections between each component and the whole PEMS. The structure description should provide:
• the scheme of the PEMS, identifying all the components, especially those implemented in each programmed part, including the software components;
• the functions of each programmable part and its components (including those relating to safety);
• the interfaces between the components of the software;
• the interfaces between the software systems.
Finally, the type of PEMS structure is important: an appropriate structure (following the functionality) allows the recognition of the functional connections, which leads to simple risk detection through an easy identification of the hazards related to both the connections and the software.
5.2.3 EMS Conformity Inspections
From the previous brief description of EMS it has been possible to extract some fundamental aspects underlining the importance of verifying conformity, for example, to the CE (European Community) marking or to the FDA (USA) requirements. In both cases it is essential to analyze the software component, evaluating its effective importance for the proper functioning of the system. All these aspects concerning particular attention to software have been included in the recent European Directive 2007/47/EC which, having included software in the medical device category, requires that the software itself must also conform to all the essential safety requirements necessary for medical devices to be put on the market.
Complexity of a system.
In general, EMS, aiming to increase functionality, present a high complexity that requires a careful assessment of the essential safety requirements. In this regard, the following list reports the general categories of hazards associated with a medical device/system, suggested by Appendix D of the standard EN ISO 14971:
• energy-related;
• biological;
• environmental;
• incorrect output of energy and substances;
• improper and incorrect use;
• unusable interface;
• functional failures.
A further aspect to consider is that technological progress tends to transfer the functionality of the device as much as possible to software, for two main reasons:
a. firstly, for economic reasons: the objective is the elimination of “hard elements” whose features can be reproduced by the software;
b. secondly, for the versatility of the system: by delegating the intelligence of the entire system to software, it becomes much simpler for the manufacturer to update and renew the entire system simply by adding functionality in the software.
Because of these improvements in innovation and efficacy, a correct software risk assessment, including the identification of risks and the verification of conformity to the essential safety requirements, is becoming ever more complicated. Generally, because of the difficulty of implementing a comprehensive and effective risk management, software is limited to a purely support/advisory function, where the final decision is usually left to the user, who has to choose among the parameters suggested by the software.
CE Marking and FDA Certification.
Regarding CE marking or FDA certification, another critical aspect is the different approach between the design of medical devices and that of software. A software programmer would be required to operate so that the risk management aspects can be controlled during the development phases, since it is difficult to act on safety after the final development. There is a lack of specific technical standards for PEMS: the EN 60601-1 on the safety and performance of medical devices, in addition to the EN ISO 14971 on the application of
Health Technology Management
209
risk management to medical devices, are not enough for the development of an appropriate procedure for the design and the safety use of PEMS. In order to overcome these limitations it is therefore necessary to integrate these standards with the requirements present in software standards such as the EN 62304 regarding the processes for the lifecycle of a software.
5 References Yadin David ; Thomas M. Judd (1995). Management and Assessment of Medical Technology In: The Biomedical Engineering Handbook, Joseph D. Bronzino, 2507-2516, CRC and IEEE PRESS, ISBN 0-8493-8346-3, United States of America. Cohen T, Bakuzonics C, Friedman SB, Roa RL. Benchmarking indicators for medical equipment repair and maintenance. Biomed Instrum Technol. 1995;29:308-321. B. Wang, E. Furst, T. Cohen, O.R. Keil, M. Ridgway, R. Stiefel, Medical Equipment Management Strategies, Biomed Instrum & Techn., May/June 2006, 40:233-237. EN 60601-1:2006-10. Medical electrical equipment - Part 1: General requirements for basic safety and essential performance. EN ISO 14971:2000-12; EN ISO 14971/A1:2003-03; EN ISO 14971/EC:2002-02. Medical devices - Application of risk management to medical devices. Council Directive 93/42/EEC of 14 June 1993 concerning medical devices. Directive 2007/47/EC of the European Parliament and of the Council of 5 September 2007. EN 62304:2006-07. Medical device software - Software life-cycle processes
210
Advanced Technologies
Differential Sandwich Theorems with Generalised Derivative Operator
211
12 X
Differential Sandwich Theorems with Generalised Derivative Operator Maslina Darus and Khalifa Al-Shaqsi
School of mathematical sciences, Faculty of science and technology, Universiti Kebangsaan Malaysia, Bangi 43600 Selangor D. Ehsan, Malaysia 1. Introduction Denote by the unit disk of the complex plane: = {z ∈ :| z|< 1} .
Let ( ) be the space of analytic function in . Let
n =
{f
∈ ( ) , f ( z) = z + an + 1 zn + 1 + an + 2 zn + 2 + ⋅ ⋅ ⋅} ,
for ( z ∈ ) with 1 = . For a ∈ and n ∈ we let [ a , n] = { f ∈ ( ) , f ( z) = a + an zn + an + 1 zn + 1 + ⋅ ⋅ ⋅} , ( z ∈ ). If f and g are analytic functions in , then we say that f is subordinate to g , written f g , if there is a function w analytic in ,with w(0) = 0 , |w( z)|< 1 for all z ∈ such that f ( z) = g( w( z )) for z ∈ . If g is univalent, then
f g if and only if f (0) = g(0) and
f ( ) = g( ) .
A function f analytic in , is said to be convex if it is univalent and f ( ) is convex. Let p , h ∈ ( ) and let ψ (r , s , t ; z) : 3 × → . If p and ψ ( p( z), zp '( z), z 2 p ''( z); z) are univalent and if p( z) satisfies the (second-order) differential superordination h( z) ψ ( p( z), zp '( z), z 2 p ''( z); z),
(z ∈ )
(1)
then p( z) is called a solution of the differential supordination (1) . (If f ( z) subordinate to
F( z) , the F( z) superordinate f ( z) ).
212
Advanced Technologies
An analytic function q is called a subordinant of the differential superodination, or more
simply a subordinant if q( z) p( z) for all p( z) satisfying (1). A univalent subordinant
q ( z) that satisfies q( z) q ( z) for all subordinants q( z) of (1) is said to be the best
subordinant . (Note that the best subordinant is unique up to a rotation of ). Recently Miller and Mocanu obtained conditions on h,q and Ψ for which the following implication holds: h( z) ψ ( p( z), zp '( z), z 2 p ''( z); z), ⇒ q( z) p( z)
(z ∈ ) .
We now state the following definition.
Definition 1. (Al-Shaqsi & Darus, 2008). Let function f in , then for j , λ ∈ 0 and β > 0 ,
we define the following differential operator ∞
Dλj , β f ( z) = z + ∑ [1 + β (n − 1)] j C (λ , n)an zn , ( z ∈ ) , n=2
n + λ − 1 where C (λ , n) = . λ Special cases of this operator includes the Ruscheweyh derivative operator Doλ , 1 ≡ Rλ
(Ruscheweyh,
1975), the Sălăgean derivative operator D0,j 1 ≡ S j (Sălăgean, 1983), the
generalized Sălăgean derivative operator (or Al-Oboudi drivetive operator) D0,j β ≡ Fβj (AlOboudi, 2004) and the generalized Ruscheweyh derivative operator (or Al-Shaqsi-Darus drivative operator) Doλ , β ≡ K βλ ( Darus & Al-Shaqsi, 2006).
For j , λ ∈ 0 and β > 0 , we obtain the following inclusion relations:
and
j j Dλj+1 , β f ( z ) = (1 − β )Dλ , β f ( z ) + β z( Dλ , β f ( z ))'
(2)
z(Dλj , β f ( z))' = (1 + λ )Dλj + 1, β f ( z) − λ Dλj , β f ( z) .
(3)
In order to prove the original results we shall need the following definition s , lemma and theorems.
Definition 2: (Miller & Mocanu, 2000, Definition 2.2b p.21 ). Denote by Q , the set of all
functions q that are analytic and injective on − E(q ) , where
{
}
E(q ) = ζ ∈ ∂ : lim q( z) = ∞ z →ζ
Differential Sandwich Theorems with Generalised Derivative Operator
213
and are such that q '(ζ ) ≠ 0 for ζ ∈ ∂ − E(q ) . Further let the subclass of Q for which q(0) = a be denoted by Q( a) and Q(1) = Q1 .
Theorem 1: (Miller & Mocanu, 2000, Theorem 3.4h p.132 ). Let q( z) be univalent in the unit disk and θ and φ be analytic in a domain containing w ∈ q( ) . Set
ψ ( z) = zq '( z)φ (q( z))
and
q( ) with φ ( w ) ≠ 0 when
h( z) = θ (q( z)) + ψ ( z) .
Suppose that 1.
ψ ( z) is starlike univalent in , and
2.
zh '( z) Re > 0, for z ∈ . ψ ( z)
If p is analytic with p(0) = q(0), p( ) ⊆ and
θ ( p( z)) + zp '( z)φ ( p( z)) θ (q( z)) + zq '( z)φ (q( z)) ,
(4)
then p( z) q( z) and q( z) is the best dominant. Definition 3:(Miller & Mocanu, 2000, Definition 2.3a p.27 ). Let Ω be a set in , q( z) ∈ Q and
n be a positive integer. The class of admissible functions Ψ n [Ω , q ] consists of those
functions ψ : 3 × → that satisfy the admissibility condition ψ (r , s , t ; z) ∉ Ω whenever r = q(ζ ), s = kζ q '(ζ ) , and Re
{ }
ζ q ''(ζ ) t + 1 ≥ k Re + 1 , ( z ∈ , ζ ∈ ∂ − E(q ), k ≥ n). s q '( ζ )
We write Ψ 1 [Ω , q ] as Ψ[Ω , q ] . Mz + a with M > 0 and |a|< M , then q( ) = M = {w :|w|< M} M + az q(0) = a , E(q ) = ∅ and q ∈ Q( a) . In this case, we set Ψ n [Ω , M , a] = Ψ n [Ω , q ] , and in the special
In particular when q( z) = M
case when the set Ω = M the class is simply denote by Ψ n [ M , a] .
Theorem 2:( Miller & Mocanu, 2000, Theorem 2.3b p.28 ). Let ψ ∈ Ψ n [Ω , q ] with q(0) = a . If
the analytic function p( z) = a + an zn + an + 1 zn + 1 + ... , satisfies
ψ ( p( z), zp '( z), z 2 p ''( z); z) ∈ Ω , then p( z) q( z) .
214
Advanced Technologies
Lemma 1: (Bulboacă, 2002). Let q( z) be convex univalent in the unit disk and ϑ and ϕ be analytic in a domain containing q( ) .
Suppose that 1. 2.
ϑ '(q( z)) Re > 0, for z ∈ and ϕ (q( z)) ψ ( z) = zq '( z)ϕ (q( z)) is starlike univalent in .
If p( z) ∈ [q(0),1] Q with p( ) ⊆ and ϑ ( p( z)) + zp '( z)ϕ ( p( z)) is univalent in and
ϑ (q( z)) + zq '( z)ϕ (q( z)) ϑ ( p( z)) + zp '( z)ϕ ( p( z)) ,
(5)
then q( z) p( z) and q( z) is the best subordinant. Definition 4:(Miller & Mocanu, 2003, Definition 3 p.817 ). Let Ω be a set in , q ∈ [ a , n]
with q '( z) ≠ 0 . The class of admissible functions Ψ 'n [Ω , q ] consists of those functions
ψ : 3 × → that r = q( z), s =
satisfy
the
admissibility
condition
ψ (r , s , t ; ζ ) ∈ Ω
whenever
zq '( z) , and m
Re
{ }
zq ''( z) t 1 + 1 ≥ Re + 1 , ( z ∈ , ζ ∈ ∂ ,1 ≤ n ≤ m). s m q '( z)
In particular, we write Ψ '1 [Ω , q ] as Ψ '[Ω , q ] . Theorem 3:( Miller & Mocanu, 2003, Theorem 1 p.818). Let ψ ∈ Ψ 'n [Ω , q ] with q(0) = a . If the p( z) ∈ Q( a) and ψ ( p( z), zp '( z), z 2 p ''( z); z) is univalent in , then
Ω ⊂ {ψ ( p( z), zp '( z), z 2 p ''( z); z) : z ∈ } .
implies p( z) q( z) .
2. Subordination Results Using Theorem 1, we first prove the following theorem. Theorem 4: Let j , λ ∈ 0 , β > 0, δ ,α ∈ and q( z) be convex univalent in with q(0) = 1 .
Further, assume that
zq ''( z) 2δ q( z) + + 1 > 0 Re q '( z ) α Let
(z ∈ ) .
(6)
Differential Sandwich Theorems with Generalised Derivative Operator
Ψ( j , λ , β , δ , α ; z ) =
215
j+1 Dj Dj f ( z) f ( z) δ [2 − β (2 + λ )] Dλ , β f ( z) + δβ (λ + 2)(λ + 1) λ j+ 2, β − δβ (λ + 1)2 λ j+ 1, β j Dλ , β f ( z) Dλ , β f ( z) Dλ , β f ( z) β 2
j+1 1 D f ( z) δ (1 − β )[1 − β (λ + 1)] + [α + δ (1 − )] λj , β . − β Dλ , β f ( z) β
(7)
If f ( z) ∈ satisfies Ψ( j , λ , β , δ ,α ; z) δ zq '( z) + (δ + α )(q( z ))2
(8)
Then Dλj+1 , β f ( z) Dλj , β f ( z)
q( z )
and q( z) is the best dominant. Proof. Define the function p by p( z) =
Dλj+1 , β f ( z)
(z ∈ ) .
Dλj , β f ( z)
(9)
Then the function p is analytic in and p(0) = 1 . Therfore, by making use of (2), (3) and
(9). we obtain
D D f ( z) f ( z) δ [2 − β (2 + λ )] Dλ , β f ( z) + δβ (λ + 2)(λ + 1) λ j+ 2, β − δβ (λ + 1)2 λ j+ 1, β j Dλ , β f ( z) Dλ , β f ( z) Dλ , β f ( z) β j+1
j
j
2
j+1 1 Dλ , β f ( z) δ (1 − β )[1 − β (λ + 1)] )] j − β Dλ , β f ( z) β 2 = δ zp '( z) + (δ + α )( p( z)) .
+ [α + δ (1 −
(10)
By using (10) in (8), we have
δ zp '( z) + (δ + α )( p( z))2 δ zq '( z) + (δ + α )(q( z))2 . By sitting θ ( w ) = δ w 2 and φ ( w ) = α , it can be easily observed that θ ( w ) and φ ( w ) are anlytic in − {0} and that φ ( w ) ≠ 0 . Hence the result now follows by an application of Theorem 1.
Corollary 1: Let q( z) = f ( z) ∈ then,
1 + Az ( −1 ≤ B < A ≤ 1) in Theorem 4. Further assuming that (6) holds. If 1 + Bz
216
Advanced Technologies
Ψ( j , λ , β , δ , α ; z )
2 Dλj+1 1 + Az 1 + Az , β f ( z) , + (δ + α ) ⇒ j Dλ , β f ( z) 1 + Bz (1 + Bz) 1 + Bz
δ ( A − B) z 2
1 + Az is the best dominant. 1 + Bz 1+ z Also, let q( z) = , then for f ( z) ∈ we have, 1−z
and
Ψ( j , λ , β , δ , α ; z ) and
2 Dλj+1 2δ z 1+ z 1+ z , β f ( z) + δ + α ⇒ ( ) j 2 Dλ , β f ( z) 1 − z (1 − z) 1−z
,
1+ z is the best dominant. 1−z µ
1+ z By taking q( z) = (0 < µ ≤ 1) , then for f ( z) ∈ we have, 1−z Ψ( j , λ , β , δ , α ; z )
2δµ z 1 + z (1 − z )2 1 − z
µ −1
1+ z + (δ + α ) 1−z
2µ
⇒
µ Dλj+1 1+ z , β f ( z) Dλj , β f ( z ) 1 − z
,
µ
1+ z and is the best dominant. 1−z Now, the following class of admissile functions is required in our following result. Defintion 5: Let Ω be a set in and q( z) ∈ Q1 . The class of admissible functions Φ n ,1 [Ω , q ] consists of those functions φ : 3 × → that satisfy the admissibility condition
φ (u , v , w ; z) ∉ Ω whenever u = q(ζ ),
v = q(ζ ) +
β kζ q '(ζ ) q(ζ )
q(ζ ) ≠ 0 ,
ζ q ''(ζ ) ( w − v )v u( β + 1) − v − + 1 , ( z ∈ , ζ ∈ ∂ − E(q ), k ≥ 1). Re ≥ k Re β2 β ( v − u) q '(ζ ) Theorem 5: Let φ ∈ Φ n ,1 [Ω , q ] . If f ∈ satisfies j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , j+1 , j+2 ; z :z ∈ ⊂ Ω φ j Dλ , β f ( z) Dλ , β f ( z) Dλ , β f ( z)
then
Dλj+1 , β f ( z) Dλj , β f ( z)
q( z ) .
(11)
Differential Sandwich Theorems with Generalised Derivative Operator
217
Proof. Define function p given by (9). Then by using (2), we get Dλj+2 , β f ( z) j+1 λ, β
D
f ( z)
= p( z) +
β zp '( z)
(12)
p( z)
Differenitating logarithmically (12), further computations show that
Dλj+3 , β f ( z) j+2 λ, β
D
f ( z)
= p( z) +
β zp '( z) p( z)
β ( p( z) + β ) +
2 zp '( z) β z 2 p ''( z) zp '( z) + − p( z) p( z ) p( z ) β zp '( z) p( z) + p( z)
(13)
Define the transformation from 3 to by
u = r, v = r + Let
βs r
, w=r+
βs r
s r
β ( r + β ) + +
2 βt s −
r
r βs r+ r
ψ (r , s , t ; z) = φ (u , v , w ; z)
(14)
(15)
s βt s − β ( r + β ) + r r r βs βs ,r + ; z . = φ r,r + + βs r r r+ r 2
Using (9), (12) and (13), from (15), it follows that j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , , ;z . Dj f ( z) Dj+1 f ( z) Dj+2 f ( z) λ, β λ, β λ, β
ψ ( p( z), zp '( z), z 2 p ''( z); z ) = φ
(16)
Hence (11) implies ψ ( p( z), zp '( z), z 2 p ''( z); z ) ∈ Ω . The proof is completed if it can be shown that the admissibility condition for φ ∈ Φ n ,1 [Ω , q ] is equivalent to the admissibility condition for ψ as given in Defintion 3. For this purpose note that s ( v − u) t = , = r r β and thus
( w − v )v − ( v − u)(u + β ) +
β2
( v − u )2
β
218
Advanced Technologies
t ( w − v )v u( β + 1) − v . +1= − s β ( v − u) β2 Hence ψ ∈ Ψ[Ω , q ] and by Theorem 2, p( z) q( z) or
Dλj+1 , β f ( z) Dλj , β f ( z)
q( z ) .
In the case Ω ≠ is a simply connected domain with Ω = h( ) for some conformal mapping h( z) of onto Ω , the class Φ n ,1 [ h( ), q ] is written as Φ n ,1 [ h , q ] . The following result is
immediate consequence of Theorem 5. Theorem 6: Let φ ∈ Φ n ,1 [ h , q ] with q(0) = 1 . If f ∈ satisfies
j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , , ; z h( z), Dj f ( z) Dj+1 f ( z) Dj+2 f ( z) λ, β λ, β λ, β
φ then
Dλj+1 , β f ( z) Dλj , β f ( z)
(17)
q( z ) .
Following similar arguments as in (Miller & Mocanu 2000, Theorem 2.3d, page 30), Theorem 6 can be extended to the following theorem where the behavior of q( z) on ∂ is not known. Theorem 7: Let h and q be univalent in with q(0) = 1 , and set q ρ ( z) = q( ρ z) and hρ ( z) = h( ρ z) . Let φ : 3 × → satisfy one of the following conditions: 1. 2.
φ ∈ Φ n ,1 [ h , q ρ ] for some ρ ∈ (0,1) , or there exists ρ 0 ∈ (0,1) such that φ ∈ Φ n ,1 [ hρ , q ρ ] for all ρ ∈ ( ρ 0 ,1) .
If f ∈ satisfies (17), then
Dλj+1 , β f ( z) Dλj , β f ( z)
q( z ) .
The next theorem yields the best dominant of the differential subordination (17). Theorem 8: Let h be univalent in , and φ : 3 × → . Suppose that the differential equation 2 zq '( z) β z 2 q ''( z) zq '( z) β ( q( z ) + β ) + − q( z ) q( z ) q( z) β zq '( z) β zq '( z) φ q( z), q( z) + , q( z ) + ; z = h( z) (18) + β zq '( z) q( z ) q( z ) q( z ) + q( z )
has a solution q( z) with q(0) = 1 and one of the following conditions is satisfied:
Differential Sandwich Theorems with Generalised Derivative Operator
219
1.
q ∈ Q1 and φ ∈ Φ n ,1 [ h , q ] ,
2.
q is univalent in and φ ∈ Φ n ,1 [ h , q ρ ] for some ρ ∈ (0,1) , or
3.
q is univalent in there exists ρ 0 ∈ (0,1) such that φ ∈ Φ n ,1 [ hρ , q ρ ] for all
ρ ∈ ( ρ0 ,1) . If f ∈ satisfies (17), then
Dλj+1 , β f ( z) Dλj , β f ( z)
q( z) , and q( z) is the best dominant.
Proof. Applying the same arguments as in (Miller & Mocanu 2000, Theorem 2.3e, page 31), we first note that q( z) is a dominant from Theorems 6 and 7. Since q( z) satisfies (18), it is also a solution of (17), and therefore q( z) will be dominated by all dominants. Hence q( z) is the best dominant. In the particular case q( z) = 1 + Mz , M > 0 , the class of admissible functions Φ n ,1 [Ω , q ] , is simply denoted by Φ n ,1 [Ω , M ] . Theorem 9: Let Ω be a set in , and φ : 3 × → satisfy the admissibility condition
φ 1 + Me iθ ,1 + Me iθ +
k β Me iθ , L; z ∉ Ω iθ 1 + Me
whenever z ∈ ,θ ∈ ,with β L (1 + Me iθ )( e − iθ + M ) + β kM 2 3 iθ 2 − iθ 2 2 Re 2 β (1 + Me ) ( e + M ) + ( β kM ) ( β − 1) ≥ k β M , iθ − + + 3 β kM (1 Me ) e − iθ + M for all real θ and k ≥ 1 . If f ( z) ∈ satisfies j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , , ; z ∈ Ω, Dj f ( z) Dj+1 f ( z ) Dj+2 f ( z ) λ, β λ, β λ, β
φ then
Dλj+1 , β f ( z) Dλj , β f ( z)
− 1 < M.
Proof. Let q( z) = 1 + Mz , M > 0 . A computation shows that the conditions on φ implies that it belongs to the class of admissible functions Φ n ,1 [Ω , M ] . The result follows immediately from Theorem 5.
In the special case Ω = q( ) = {w :| w − 1|< M} , the conclusion of Theorem 9 can be written as
220
Advanced Technologies
j+2 j+3 Dλj+1 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , β f ( z) − < ⇒ − 1 < M. , , ; 1 z M j Dj f ( z) Dj+1 f ( z) Dj+2 f ( z) Dλ , β f ( z) λ, β λ, β λ, β
φ
3. Superordination and Sandwich Results Now, by applying Lemma 1, we prove the following theorem. Theorem 10: Let q( z) be convex univalent in with q(0) = 1 . Assume that 2 δ( + α )q( z)q '( z) Re > 0. δ Let f ( z) ∈ ,
Dλj+1 , β f ( z) Dλj , β f ( z)
∈ [q(0),1] Q.
Ψ( j , λ , β , δ , α ; z )
Further, Let
(19)
given by (7) be
univalent in and (δ + α )(q( z))2 + δ zq '( z) Ψ( j , λ , β , δ ,α ; z) then q( z )
Dλj+1 , β f ( z) Dλj , β f ( z)
,
and q( z) the best subordinant. Proof. Theorem 10 follows by using the same technique to prove Theorem 4 and by an application of Lemma 1. By using Theorem 10, we have the following corollary. Dj+1 f ( z) 1 + Az Corollary 2: Let q( z) = ∈ [q(0),1] Q. Further, ( −1 ≤ B < A ≤ 1), f ( z) ∈ ,and λj , β Dλ , β f ( z) 1 + Bz assuming that (19) satisfies. If 2 j+1 1 + Az Dλ , β f ( z ) 1 + Az , + (δ + α ) j Ψ( j , λ , β , δ , α ; z ) ⇒ (1 + Bz) 1 + Bz Dλ , β f ( z ) 1 + Bz
δ ( A − B) z 2
and
1 + Az , is the best subordinant. 1 + Bz
Also, by let q( z) =
Dj+1 f ( z) 1+ z ∈ [q(0),1] Q . Furhter, assuming that (19) , f ( z) ∈ ,and λj , β Dλ , β f ( z) 1−z
satisfies. If 2 j+1 2δ z 1 + z Dλ , β f ( z) 1+ z + δ + α Ψ λ β δ α ⇒ ( ) ( , , , , ; ) j z (1 − z)2 1 − z Dλj , β f ( z) 1−z
,
Differential Sandwich Theorems with Generalised Derivative Operator
and
221
1+ z , is the best subordinant. 1−z
Finally, by taking
µ Dλj+1 1+ z , β f ( z) ∈ [q(0),1] Q . Furhter, q( z ) = (0 < µ ≤ 1), f ∈ ,and j D − 1 z λ , β f ( z)
assuming that (19) satisfies. If 2δµ z 1 + z (1 − z)2 1 − z
µ −1
1+ z + (δ + α ) 1−z
2µ
µ Dλj+1 1+ z , β f ( z) Ψ( j , λ , β , δ ,α ; z ) ⇒ j Dλ , β f ( z) 1−z
,
1+ z and 1−z
µ
is the best subordinant. Now we will give the dual result of Theorem 5 for differential superordination. Definition 5: Let Ω be a set in and q ∈ [q(0),1] with zq '( z) ≠ 0 . The class of admissible functions Φ 'n ,1 [Ω , q ] consists of those functions φ : 3 × → that satisfy the admissibility
condition
φ (u , v , w ; ζ ) ∈ Ω
Whenever u = q( z),
v = q( z ) +
β zq '( z) mq( z)
( q( z) ≠ 0, zq '( z) ≠ 0 ) ,
zq ''( z) ( w − v )v u( β + 1) − v 1 − + 1 , ( z ∈ , ζ ∈ ∂ , m ≥ 1). Re ≤ Re 2 β β ( v − u) m q '( z) Theorem 11: Let φ ∈ Φ 'n ,1 [Ω , q ] . If f ( z) ∈ ,
Dλj+1 , β f ( z) Dλj , β f ( z)
∈ Q1 and
j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , , ;z Dj f ( z) Dj+1 f ( z) Dj+2 f ( z) λ, β λ, β λ, β
φ is univalent in , then
j+3 Dj+1 f ( z) Dλj+2 , β f ( z ) Dλ , β f ( z ) Ω ⊂ φ λj , β , j+1 , j+2 ; z z ∈ D f ( z) D f ( z ) D f ( z ) λ, β λ, β λ , β
implies q( z)
Dλj+1 , β f ( z) Dλj , β f ( z)
,
.
Proof. Let p( z) be defined by (9) and ψ by (15). Since φ ∈ Φ 'n ,1 [Ω , q ] , (16) and (20) yield Ω ⊂ {ψ ( p( z), zp '( z), z 2 p ''( z); z) : z ∈ } .
(20)
222
Advanced Technologies
From (14), the admissibility condition for φ ∈ Φ 'n ,1 [Ω , q ] is univalent to the admissibility condition for ψ as given in Definition 4. Hence ψ ∈ Φ 'n [Ω , q ] , and by Theorem 3, q( z) p( z) or q( z)
Dλj+1 , β f ( z) Dλj , β f ( z)
.
If Ω ≠ is a simply connected domain, and Ω = h( ) for some conformal mapping h( z) of onto Ω ,the the class Φ 'n ,1 [ h( ), q ] is written as Φ 'n ,1 [ h , q ] . Proceeding similarly as in the
previous section, the following result is an immediate consequence of Theorem 11. Theorem 12: Let φ ∈ Φ 'n ,1 [ h , q ] . and q ∈ [q(0),1], h( z) be analytic in f ∈ ,
D
f ( z)
D
f ( z)
j+1 λ, β j λ, β
If
∈ Q1 and j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , , ;z Dj f ( z) Dj+1 f ( z) Dj+2 f ( z) λ, β λ, β λ, β
φ is univalent in , then
implies q( z)
Dλj+1 , β f ( z) Dλj , β f ( z)
j+3 Dj+1 f ( z) Dλj+2 , β f ( z ) Dλ , β f ( z ) , j+1 , j+2 ;z h( z) φ λj , β D f ( z) D f ( z) D f ( z) λ,β λ,β λ,β
,
(21)
.
Theorems 11 and 12 can only be used to obtain subordinants of differential superordinations of the form 20 or 21. The following theorem proves the existence of the best subordinant of 21 for an appropriate φ . Theorem 13: Let h( z) be analytic in and equation
φ : 3 × → . Suppose that the differential
2 zq '( z) β z 2 q ''( z) zq '( z) β ( q( z ) + β ) + − q( z ) q( z ) q( z) β zq '( z) β zq '( z) φ q( z), q( z) + , q( z ) + ; z = h( z) + β zq '( z) q( z ) q( z ) q( z ) + q( z )
has a solution q ∈ Q1 . If φ ∈ Φ 'n ,1 [ h , q ], f ( z) ∈ ,
Dλj+1 , β f ( z) Dλj , β f ( z)
∈ Q1 , and
j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , j+1 , j+2 ;z j D f ( z) D f ( z) D f ( z) λ, β λ, β λ, β
φ is univalent in , then
Differential Sandwich Theorems with Generalised Derivative Operator j+3 Dj+1 f ( z) Dλj+2 , β f ( z ) Dλ , β f ( z ) , j+1 , j+2 ;z h( z) φ λj , β D f ( z) D f ( z) D f ( z) λ,β λ,β λ,β
implies q( z)
Dλj+1 , β f ( z) Dλj , β f ( z)
223
,
and q( z) is the best subordinant.
Proof. The proof is similar to the proof of Theorem 8, and is therefore omitted. Combining Theorems 4 and 10, we obtain the following sandwich-type theorem. Theorem 14: Let q1 ( z) and q 2 ( z) be convex univalent in and satisfies (19) and (6), respectively. If f ( z) ∈ ,
Dλj+1 , β f ( z) Dλj , β f ( z)
∈ [q(0),1] Q and Ψ( j , λ , β , δ ,α ; z) given by (7) be
univalent in and δ zq '1 ( z) + (δ + α )(q1 ( z))2 Ψ( j , λ , β , δ ,α ; z) δ zq '2 ( z ) + (δ + α )(q 2 ( z ))2
then
q1 ( z )
Dλj+1 , β f ( z) Dλj , β f ( z)
q2 ( z) ,
and q1 ( z) and q 2 ( z) are respectively the best subordinant and the best dominant. For q1 ( z) =
1 + A1 z 1 + A2 z where ( −1 ≤ B2 ≤ B1 < A1 ≤ A2 ≤ 1) , we have the following , q2 ( z) = 1 + B1 z 1 + B2 z
corollary. Corollary 3: If f ∈ ,
Dλj+1 , β f ( z) Dλj , β f ( z)
∈ [q(0),1] Q and
Ψ 1 ( A1 , B1 , j , λ , β , δ ,α ; z) Ψ( j , λ , β , δ ,α ; z) Ψ 2 ( A2 , B2 , j , λ , β , δ ,α ; z) then j+1 1 + A1 z Dλ , β f ( z) 1 + A2 z j 1 + B1 z Dλ , β f ( z) 1 + B2 z
where Ψ 1 ( A1 , B1 , j , λ , β , δ ,α ; z ) = Ψ 2 ( A2 , B2 , j , λ , β , δ ,α ; z) =
2
δ ( A1 − B1 )z
1 + A1 z + (δ + α ) , 1 + B1 z
δ ( A2 − B2 )z
1 + A2 z + (δ + α ) . 1 + B2 z
(1 + B1 z)2
(1 + B2 z)2
2
1 + A1 z 1 + A2 z and respectively the best subordinant and the best dominant. 1 + B1 z 1 + B2 z Also, by combining Theorems 6 and 12, we state the following sandwich-type theorem. Theorem 15: Let h1 ( z) and q1 ( z) be analytic functions in , let h2 ( z) be an analytic univalent Hence
function f ∈ ,
in
D
f ( z)
D
f ( z)
j+1 λ, β j λ, β
,
q 2 ( z) ∈ Q1 with
∈ [q(0),1] Q1 and
q1 (0) = q2 (0) = 1 and
φ ∈ Φ n ,1 [ h2 , q2 ] Φ 'n ,1 [ h1 , q1 ] .
If
224
Advanced Technologies j+2 j+3 Dλj+1 , β f ( z ) Dλ , β f ( z ) Dλ , β f ( z ) , , ;z Dj f ( z) Dj+1 f ( z) Dj+2 f ( z) λ, β λ, β λ, β
φ is univalent in , then
j+3 Dj+1 f ( z) Dλj+2 , β f ( z ) Dλ , β f ( z ) , j+1 , j+2 ; z h2 ( z ) , h1 ( z) φ λj , β D f ( z) D f ( z) D f ( z) λ, β λ, β λ, β
implies q1 ( z)
Dλj+1 , β f ( z) Dλj , β f ( z)
q2 ( z) .
Remark 1 : By using the same techniques to prove the earlier results and by using the relation (3), the new resuts will be obtained.
Acknowledgment The work presented here was fully supported by eSciencefund: 04-01-02-SF0425, MOSTI, Malaysia.
7. Conclusion There are many other results can be obtained by using the operator studied earlier by the authors (Darus & Al-Shaqsi, 2006).
8. References Al-Shaqsi, K. & Darus, M. (2008), Differential subordination with generalized derivative operator. Arab J. Math. Math. Sci. to appear. Darus, M. & Al-Shaqsi, K. (2006), On harmonic univalent functions defined by generalized Ruscheweyh derivatives operator. Lobachevskii J. Math., Vol. 22, 19-26. Ruscheweyh, St. (1975), New criteria for univalent functions, Proc. Amer. Math. Soc., Vol. 49, 109-115. Sălăgean, G. Ş. (1983) Subclasses of univalent functions, Lecture Note in Math. SpringerVerlag, Vol. 1013, 362-372. Al-Oboudi, F. M. (2004). On univalent functions defined by a generalized Sălăgean operator, Int. J. Math. Math. Sci., Vol.27, 1429-1436. Miller, S. S. & Mocanu, P. T. (2000), Differential Subordinations: Theory and Applications, Marcel Dekker Inc., New York. Bulboacă, T. (2002), Classes of first order differential superordinations, Demonstratio Math. Vol. 35, No. 2, 287-292. Miller, S. S. & Mocanu, P. T. (2003), Subordinants of differential superordinations, Complex Variables Theory Appl., Vol. 48, No. 10, 815-826.
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
225
13 0 Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems∗ Khair Eddin Sabri, Ridha Khedri and Jason Jaskolka
Department of Computing and Software, McMaster University Canada
1. Introduction Information security is an important aspect that should be considered during system development. Analyzing the specification of a system enables detecting flaws at early stage of a system development. An agent knowledge of the exchanged information and its nature is essential for analyzing systems. An agent can enrich its knowledge by receiving information as messages and producing new information from the existing one. We classify an agent knowledge as explicit knowledge and procedural knowledge. The explicit knowledge of an agent is related to the information that it possesses. For example in the context of a hospital software, an agent explicit knowledge would contain information about patients, drugs, and diseases. In the context of a school, the explicit knowledge of an agent would contain information about students, courses, and instructors. In the area of cryptographic protocols, the information of an agent can be its own key, the cipher used for encryption and decryption, and the identity of other agents that are involved in the protocol. Agents communicate by sending messages which are pieces of information stored in their explicit knowledge. The information an agent receives from other agents becomes a part of its explicit knowledge. The procedural knowledge involves a set of mechanisms/operators that enables an agent to obtain new information from its explicit knowledge. For example, if the explicit knowledge of an agent contains an encrypted message as well as the key and the cipher used to decrypt the message, then by using the procedural knowledge, the concealed information can be obtained. The use of the procedural knowledge to analyze cryptographic protocols can be found in Sabri and Khedri (2006; 2007b). The explicit knowledge representation is needed to analyze security related policies in multiagent systems. We summarize below some uses of the explicit knowledge: 1. Agents communicate by exchanging messages, which are constructed from their explicit knowledge. Therefore, an agent explicit knowledge is necessary for modeling agents communications. 2. Explicit knowledge is required to specify agent internal actions such as verifying the existence of an information in the knowledge. The explicit knowledge representation becomes more useful in complex systems. For example, in the registration part of the Equicrypt protocol presented in Leduc and Germeau (2000), a third party can handle ∗ This
chapter presents a revised and enlarged version of the material presented in Sabri et al. (2008)
226
Advanced Technologies
simultaneously several registrations. Therefore, it should maintain an internal “table” with information on the users that have a registration in progress. 3. Some security properties are based on the explicit knowledge of agents. For example, a confidentiality security property would require that an agent should not know a specific kind of information existing in the explicit knowledge of another agent. 4. Even if the specification of a multi-agent system is proved to be secure by satisfying some security properties, it could contain flaws due to its incorrect implementation. To reduce the risk of incorrect implementation, one can derive the code automatically from the mathematical model of the system and prove that the derivation is correct. Having an explicit knowledge representation that allows specifying internal actions such as inserting and extracting information from the knowledge as well as verifying the existence of an information in the knowledge would be necessary for code generation. For an efficient analysis of security policies in a multi-agent system, an explicit knowledge representation would have the following characteristics as giving in Sabri and Khedri (2008): 1. Classifying information so that one can reason on the ability of an agent to obtain an information that has a specific classification (e.g., private) in another agent’s knowledge. 2. Relating information together such as relating patient to drugs so that one can reason on the ability of an agent to link pieces of information together. 3. Specifying internal actions such as inserting information into the knowledge and updating information. 4. Flexibility in specification by not having the same classification of information in all agents knowledge. 5. Specifying the explicit knowledge of systems with the same mathematical theory so that there is no need to introduce a new theory for a specific case. In the literature, we find that explicit knowledge specifications satisfy some of the characteristics but not all of them. In this chapter, we present a mathematical structure to represent the explicit knowledge of agents that satisfies all the characteristics above. Then, we show that the structure is an information algebra which is introduced in Kohlas and St¨ark (2007). In Section 2, we summarize information algebra. In Section 3, we present the mathematical structure to specify agent explicit knowledge. In Section 4, we give two applications of the uses of the proposed structure. In Section 5, we conclude.
2. Information Algebra In Kohlas and St¨ark (2007), the authors explore connections between different representations of information. They introduce a mathematical structure called information algebra. This mathematical structure involves a set of information Φ and a lattice D. They show that relational databases, modules, and constraint systems are information algebras. In the rest of this chapter, we denote elements of Φ by small letters of the Greek alphabet such as ϕ, ψ and χ. Each piece of information is associated with a frame (also called domain in Kohlas and St¨ark (2007)), and the lattice D is the set of all frames. Each frame x contains a unit element e x which represents the empty information. Information can be combined or restricted to a specific frame. Combining two pieces of information ϕ and ψ is represented by ϕψ. Information ϕ and ψ can be associated with different frames, and ϕψ is associated with a more precise frame than ϕ and ψ. Kohlas and St¨ark (2007) assume that the order of combining information does not matter
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
227
and, therefore, the combining operator is both commutative and associative. Restricting an information ϕ to a frame x is denoted by ϕ↓x which represents only the part of ϕ associated with x. In the following definition and beyond, let ( D, , ) be a lattice and x and y be elements of D called frames. Let be a binary relation between frames such that x y = y ↔ x y. Let Φ be a set of information and ϕ,ψ, χ be elements of Φ. We denote the frame of information ϕ ∈ Φ by d( ϕ) . Let e x be the empty information over the frame x ∈ D, the operation ↓ be a partial mapping Φ × D → Φ, and · be a binary operator on information. For simplicity, to denote ϕ · ψ, we write ϕψ. Definition 1 (Information Algebra as in Kohlas and St¨ark (2007)). An information algebra is a system (Φ, D ) that satisfies the following axioms: 1. ( ϕψ)χ = ϕ(ψχ)
6. ∀( x | x ∈ D : d(e x ) = x ) 7. x d( ϕ) → d( ϕ↓x ) = x
2. ϕψ = ψϕ 3. d( ϕψ) = d( ϕ) d(ψ)
8. x y d( ϕ) → ( ϕ↓y )↓x = ϕ↓x
4. x y → (ey )↓x = e x
9. d( ϕ) = x ∧ d(ψ) = y → ( ϕψ)↓x = ϕ(ψ↓x∧y )
10. x d( ϕ) → ϕϕ↓x = ϕ
5. d( ϕ) = x → ϕe x = ϕ
The first two axioms indicate that the set of pieces of information together with the combining operator form a semi-group. Axiom 3 states that the frame of two pieces of information combined is the join of their frames. Axioms (4-6) give properties of the empty information e x . Axioms (7-8) give the properties of focusing an information to a specific frame. Axioms (9-10) give properties that involve combining and focusing of information.
3. Specification of Agent Explicit Knowledge In Sabri et al. (2008), we develop a mathematical structure to specify an agent explicit knowledge and prove that it is an information algebra. The explicit knowledge of an agent is represented by two elements Φ and D. The set Φ consists of pieces of information (we use the words information and piece of information interchangeably) available to the considered agent. There is no restriction on the representation of these pieces of information. They can be represented as formulae as in artificial intelligence literature, functions, etc. In this chapter, we represent pieces of information as functions. While D is a lattice of frames such that each piece of information is associated with a frame. Definition 2 (Agent Information Frame). Let {Ai | i ∈ I } be a family of sets indexed by the set of indices I and P (Ai ) be the powerset of Ai . An information frame DI is defined as: DI ∏ P (Ai ) i∈ I
Which can be equivalently written as a set of functions as DI { f : I →
i∈ I
P (Ai ) | ∀(i | i ∈ I : f (i ) ∈ P (Ai ) )}
228
Advanced Technologies
Let J ⊆ I and I J ⊆ I × I such that I J = {( x, x ) | x ∈ J } (i.e., I J is the identity on J). Given the frame DI , we can define DJ as { g | ∃( f | f ∈ DI : g = I J ; f )} where ; denotes relational composition. We call an element ϕ of DJ an information and DJ the frame of ϕ and denote1 it by d( ϕ). We call “d“ the labelling operator. The information ϕ is a function which can be written as a set of 2-tuples (i, A) where i is an index and A is a set. Each frame D J contains a special element called the empty information e DJ and defined as {(i, ∅) | i ∈ J }. Whenever, it is clear from the context, we write eJ instead of e DJ . We denote the set of all frames DJ for J ⊆ I by D and the set of all pieces of information by Φ. D
{country,company} D{company} D{country}
D∅
Fig. 1. A lattice constructed from I = {company, country} As an example of our representation of Φ and D, suppose that an agent can handle only two kinds of information: company and country. In this case, the set of indices is I = {company, country} and the lattice D is constructed as in Figure 1. The lattice D consists of four frames: D∅ is a frame that might involve only the empty information e∅ (absence of information), D{company} is the frame of the pieces of information classified as company, D{country} is the frame of the pieces of information classified as country, and D{company, country} is the frame of composite information where part of it is classified as company and another part is classified as country. Our aim from this lattice representation is to represent frames of atomic information as in D{country} and D{company} and to represent frames of composite information as in D{country, company} . To illustrate our representation of information, let the set of information Φ contains two pieces of information ϕ and ψ such that ϕ = {(company, {AirFrance}), (country, {France})} and ψ = {(company, {AirCanada})}. The first information associates the company AirFrance with the country France while the second information contains the AirCanada information. Definition 3. An information ϕ is called atomic if ϕ = e∅ or d( ϕ) = D{ j} for j ∈ I. From the definition, we can see that ϕ is a composite information while ψ is an atomic information. The set of information Φ can be represented in a tabular format as shown in Table 1. A piece of information can be seen as a row in a table where the table header represents the indices of the frame of an information. An empty information can be perceived as a table with only a header and e∅ can be seen as an empty page that does not contain even the header. An atomic information can be seen as a cell of the table or as an empty page. The following axiom and proposition are taken from Sabri et al. (2008) and are needed for the subsequent proofs. 1 The notation d ( ϕ ) to denote the frame of ϕ comes from the usage of the term domain in Kohlas and St¨ark (2007) as a synonym for frame. We prefer to use the term frame to avoid any confusion with the domain of a relation.
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
229
Table 1. The set Φ in a tabular format ϕ ψ
Axioms 1.
company AirFrance AirCanada
1. ϕ ∈ DJ → d( ϕ) = DJ
country France 2. eJ {(i, ∅) | i ∈ J }
From the definition of DJ , it follows that ϕ ∈ DJ → ∀(i | i ∈ J : ϕ(i ) ∈ P (Ai ) ). Therefore, ϕ ∈ DJ can be written as a set of 2-tuples {(i, A) | i ∈ J ∧ A ⊆ Ai }. Proposition 1. For J, K ⊆ I and ϕ ∈ Φ, we have:
1. ϕ ∈ DK → I J ; ϕ = {(i, A) | i ∈ ( J ∩ K ) ∧ A ⊆ Ai }
2. I J ;IK = I J ∩K
3. ϕ ∈ DK → d(I J ; ϕ) = DJ ∩K
4. I J ∪K = I J ∪ IK Proof.
1. The proof invokes the definitions of relation composition, ϕ, and I J as well as the trading rule for ∃, set intersection axiom, and the distributivity of ∧ over ∃.
2. The proof invokes the definition of DK , definition of DJ ∩K , Proposition 1(2), and Axiom 1(1). 3. One uses the definition of I J , IK , and I J ∪K as well as applies set union axiom and range split axiom. The complete proof is given in Sabri and Khedri (2007a). We define a binary operator · to combine information (we write ϕψ to denote ϕ · ψ). We can use this operator to represent composite information made of pieces of information. Definition 4 (Combining Information). Let Φ be a set of information and ϕ, ψ be its elements. Let d( ϕ) = DJ and d(ψ) = DK . We define the binary operator · (however, we write ϕψ to denote ϕ · ψ) on information as: ϕψ {(i, A) | i ∈ J ∩ K ∧ A = ϕ(i ) ∪ ψ(i )} ∪ {(i, A) | i ∈ J − K ∧ A = ϕ(i )} ∪ {(i, A) | i = K − J ∧ A = ψ(i )} We also define two operators on frames as follows: Definition 5. Let D J and DK be frames and ϕ ∗ ψ = {(i, A) | i ∈ J ∩ K ∧ A = ϕ(i ) ∩ ψ(i )}, we define the operators and on frames as: 1. DJ DK {χ | ∃( ϕ, ψ | ϕ ∈ DJ ∧ ψ ∈ DK : χ = ϕψ )}
2. DJ DK {χ | ∃( ϕ, ψ | ϕ ∈ DJ ∧ ψ ∈ DK : χ = ϕ ∗ ψ )}
Proposition 2. DJ DK = DJ ∪K
Proof. The proof calls for the definitions of DJ , DK , and DJ ∪K as well as Definition 5(1), distributivity of ∧ over ∃, trading rule for ∃, nesting axiom, interchange of dummies, Definition 4, Proposition 1(1), renaming, and range split axiom. The detailed proof is given in Appendix A.
230
Advanced Technologies
Proposition 3. DJ DK = DJ ∩K Proof. The proof is similar to that of Proposition 2. We use the definitions of DJ , DK , and DJ ∩K and we apply Definition 5(2), distributivity of ∧ over ∃, trading rule for ∃, nesting axiom, interchange of dummies, Proposition 1(1), renaming, and range split axiom. The complete proof is given in Sabri and Khedri (2007a). Proposition 4. Let D J , DK , and DK be frames, we have 1. D J DK = DK D J
4. ( D J DK ) D L = D J ( DK D L )
2. D J DK = DK D J
5. D J ( D J D L ) = D J
3. ( D J DK ) D L = D J ( DK D L )
6. D J ( D J D L ) = D J
Proof. We use Proposition 2, Proposition 3 and the properties of ∩ and ∪. The complete proof is given in Sabri and Khedri (2007a). The following proposition is a consequence result of Proposition 4. Proposition 5 (Lattice of Frames). ({ D J } J ⊆ I , , ) form a lattice. For simplicity, we use D to denote the lattice ({ D J } J ⊆ I , , ). On the lattice D and for DJ and DK frames in D, it is known that the following are equivalent (Davey and Priestley, 2002, page 39): 1. DJ DK
2. DJ DK = DK
3. DJ DK = DJ
We define a partial order relation ≤ on information as ϕ ≤ ψ and we say that ψ is more informative than ϕ. Definition 6 (More Informative Relation). Let Φ be a set of information and ϕ, ψ be elements of Φ. Let D be a lattice and DJ and DK be elements of D. Let d( ϕ) = DJ and d(ψ) = DK . We define the binary relation ≤ on information as: ϕ ≤ ψ ↔ J ⊆ K ∧ ∀(i | i ∈ J : ϕ(i ) ⊆ ψ(i ) ) The relation ≤ indicates whether or not an information is a part of another one. We use it to verify the existence of an information in the knowledge of an agent. An information can be in the knowledge of an agent as a part of a composite information. The special element e∅ of D∅ is the least informative information i.e., ∀( ϕ | ϕ ∈ Φ : e∅ ≤ ϕ ). Proposition 6. The relation ≤ is a partial order. Proof. The proof is based on the property that ⊆ is a partial order. The proof is given in Sabri and Khedri (2007a).
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
231
We show in Sabri et al. (2008) that there is a relation between frames and their indices. Proposition 7. 1. ∀( J, K | J, K ⊆ I : J = K → DJ = DK ) 2. ∀( J, K | J, K ⊆ I : DJ = DK → J = K ) Proof.
3. ∀( J, K | J, K ⊆ I : DJ DK ↔ J ⊆ K )
1. The proof uses trading rule for ∀, Substitution axiom, and properties of propositional logic.
2. We prove by contrapositive. We assume that J = K and prove DJ = DK → false. The proof uses the definition of DJ and DK , definition of ”↔”, Weakening, Proposition 1(4), the distributivity of relational composition over ∪, Distributivity of ∧ over ∃, ∃-True body, and properties of propositional logic. 3. The proof uses Proposition 7(2), Proposition 2, Reflexivity of ↔, ∀-True body, and properties of set theory. The complete proof is given in Appendix A. We also define a binary operator to extract a part of an information that belongs to a specific frame as: Definition 7 (Marginalizing Information). Let DJ be a frame and ϕ be an information such that DJ ∈ D and ϕ ∈ Φ, we define a binary operator ↓ : Φ × D → Φ as ϕ
↓ DJ
I J ; ϕ.
The ↓ operator can be used to extract a specific kind of information. For example, let ϕ = ↓D
{(company, {AirFrance}), (country, {France})}, then ϕ {company} = {(company, {AirFrance)}. After defining information marginalizing, labelling and combination in our context, we prove in Sabri et al. (2008) that our structure is an information algebra by proving the following proposition. Proposition 8. For J, K ⊆ I, we have 1. 2. 3. 4. 5.
( ϕψ)χ = ϕ(ψχ) ϕψ = ψϕ d( ϕψ) = d( ϕ) d(ψ) d( ϕ) = DJ → ϕeJ = ϕ d(eJ ) = DJ
6. DJ DK → (eK ) Proof.
↓ DJ
= eJ
7. d( ϕ) = DJ ∧ d(ψ) = DK → ( ϕψ) 8. DJ d( ϕ) → d( ϕ
↓ DJ
↓ DJ
= ϕ(ψ
↓ DJ ∧ DK
)
) = DJ
9. DJ DK d( ϕ) → ( ϕ↓DK ) 10. DJ d( ϕ) → ϕϕ
↓ DJ
↓ DJ
=ϕ
↓ DJ
=ϕ
1. The proof calls for Definition 4, commutativity and associativity of ∪, and properties of set difference.
2. We use Definition 4 and commutativity of ∩ and ∪.
3. The proof essentially invokes Axiom 1(1), Propositions 2, the definition of DJ ∪K , Proposition 1(1), and Definition 4.
4. We basically use Definition 4, Axiom 1(2), idempotency of ∩, and empty range axiom.
232
Advanced Technologies
5. The proof essentially calls for Axiom 1(1, 2), the definition of DJ , and Proposition 1(1). 6. The proof invokes Definition 7, Axiom 1(2), Proposition 1(1), and Proposition 7(3). 7. The proof invokes Definition 7, Definition 4, Proposition 1(1), and properties of set difference, ∪ and ∩.
8. The proof invokes Definition 7, Proposition 1(3), Axiom 1(1), and Proposition 7(3). 9. We use Definition 7, Proposition 1(2), and Proposition 7(3).
10. The proof calls for Definition 7, Proposition 1(1), Definition 4, Axiom 1(1), Proposition 7(3), range split axiom, and properties of set difference, ∪, and ∩. The full detailed proof can be found in Sabri and Khedri (2007a). Proposition 9. The structure (Φ, D ) is an information algebra. Proof. (Φ, D ) satisfies the ten axioms of information algebra (see Definition 1) as shown in Proposition 8. As consequence results of proving that (Φ, D ) is an information algebra, the following properties hold and the proofs can be found in Kohlas and St¨ark (2007): Proposition 10. 1. d( ϕ) = DJ → ϕ
↓ DJ
=ϕ
4. d( ϕ) = DJ → ( ϕDK )
2. ϕϕ = ϕ
5. DJ d( ϕ) → ( ϕeK )
3. DJ DK → eJ eK = eK
↓ DJ
↓ DJ
=ϕ =ϕ
↓ DJ
6. eJ K = eJ eK
The empty information has some interesting properties as shown in the following proposition. Proposition 11. Let ϕ ∈ DJ and J ⊆ I, we have ↓D
1. ϕ↓D∅ = e∅
2. e∅ J = e∅
3. ϕe∅ = ϕ
4. e∅ = ∅
Proof. The proof uses Definition 7, Definition 4, Proposition 1(1), Axiom 1, and properties of set theory and propositional logic. The full proof is given in Sabri et al. (2009a). We also prove some properties related to the marginalzing operator. Proposition 12. Let ϕ ∈ DL and ψ ∈ DK , we have 1. ϕ
↓ DJ
· ϕ↓DK = ϕ
↓ DJ ∪K
2. ( ϕψ)↓DL = ϕ↓DL ψ↓DL
3. ϕ↓DK =ϕ
↓ DJ ∩K
4. ( ϕ↓DK )↓DL =ϕ↓DK∩L
Proof. The proof uses the definition of ·, the definition of ↓ , Proposition 1(1), and properties of set theory and propositional logic. The full proof is given in Appendix A. In addition to the information algebra operators, we define in Sabri et al. (2009b) an operator to remove a piece of information from another one.
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
233
Definition 8 (Removing Information). Let d( ϕ) = DJ and d(ψ) = DK . We define the binary operator ” − ” as: ϕ − ψ {(i, A) | i ∈ J ∩ K ∧ A = ϕ(i ) − ψ(i )} ∪ {(i, A) | i ∈ J − K ∧ A = ϕ(i )} Let ϕ = {(company, {AirFrance}), (country, {France})} and ψ = {(company, {AirFrance})}, then ϕ − ψ = {(company, {}), (country, {France})}. We also prove in Sabri et al. (2009b) the following proposition. Proposition 13. Let ϕ, ψ and χ be pieces of information such that d( ϕ) = DJ , d(ψ) = DK , and d(χ) = DL . Also, let eK be the empty information on DK 1. 2. 3. 4. Proof.
d( ϕ − ψ) = d( ϕ) ϕ − eK = ϕ eK − ϕ = eK ϕ ≤ (ψ − χ) → ϕ ≤ ψ
5. ϕ ≤ ψ → ϕ − ψ = ed( ϕ) 6. ( ϕψ − ψ)↓d( ϕ) ≤ ϕ 7. ϕ ≤ ψ → (χ − ϕ)ψ = χψ
1. The proof uses Axiom 1(1), Definition of DJ , Distributivity of ; over ∪, Proposition 1(1), Empty range axiom, Definition of ϕ − ψ, and properities of set theory and propositional logic.
2. The proof uses Definition of ϕ − eK , Definition of eK , Distributivity of ∧ over ∨, and properities of set theory. 3. The proof uses Definition of eK − ϕ, Definition of eK , Distributivity of ∧ over ∨, and properities of set theory. 4. The proof uses Definition of ≤, Range split for ∀, Empty range axiom, Definition of ϕ − χ, Distributivity axiom, Weakening, and properties of set theory and propositional logic. 5. The proof uses Definition of ϕ − ψ, Empty range, Definition of eJ , and properties of set theory and propositional logic. 6. The proof uses Definition of ϕψ − ψ, Definition of ↓ DJ , Distributivity of ; over ∪, Proposition 1(1), Definition of combining information, Definition of ≤, ∀-True body, and properties of set theory and propositional logic. 7. The proof uses Definition of ≤, The definition of combining information, Range split, Definition of set difference, Definition of χ − ϕ, and properties of set theory and propositional logic. Full proof is given in Sabri et al. (2009a). The proposition gives some properties of the remove operator such as Proposition 13(1) which indicates that removing pieces from an information does not change the frame of that information. Proposition 13(2, 3) states that removing an empty piece from an information does not affect that information, and a removing piece of information from the empty information does not change the empty information. Also, the proposition relates the more informative relation with the remove operator as shown in Proposition 13(4,5). Proposition 13(6,7) relates the remove operator with the combine operator. We note that agents might have different lattices of frames. The frame of an information at a sender’s knowledge might be assigned to a different frame after its transmission to a receiver. In Sabri et al. (2009b), we define a frame substitution function that substitutes a part of a frame of an information with another as:
234
Advanced Technologies ↓
Definition 9 (Frame Substitution). fs( ϕ, DJ , DK ) ϕ D(L− J ) · ( ϕ K are singleton subsets of the set of indices I and d( ϕ) = DL .
↓ DJ
[ DK /DJ ]) where the sets J and
We note that this function is defined using basic information algebra operators. As an example of the frame substitution function, let ϕ = {(country,{France}),(company,{AirFrance})}. Then, fs( ϕ, D{company} , D{airline} ) = {(country, {France}), (airline, {AirFrance})}. We prove a proposition related to set theory that we used in Sabri et al. (2009b) to prove properties related to frame substitution function. Proposition 14. For J = { j}, we have ¬( J ⊆ K ) → J ∩ K = ∅ Proof. The proof uses properties of set theory. The full proof is given in Sabri et al. (2009a). Proposition 15. Let J = { j} and K = {k} be singleton subsets of the set of indices I and d( ϕ) = DL , we have 1. DJ d( ϕ) ∨ ϕ = fs( ϕ, DJ , DK )
2. DK d( ϕ)) ∨ ϕ = fs(fs( ϕ, DJ , DK ), DK , DJ ) Proof.
1. The proof uses Proposition 7(3), Proposition 14, Definition of fs( ϕ, DJ , DK ), Proposition 10(1), Definition of ↓ DJ , Proposition 1(1), Proposition 11(4), Proposition 11(3), and properties of set theory and propositional logic.
2. To prove ϕ = fs(fs( ϕ, DJ , DK ), DK , DJ ), we have two cases: ¬( DJ d( ϕ)) and DJ d( ϕ). The proof of the first case uses Proposition 7(3), Proposition 14, and Proposition 15(1). The proof of the second case uses Definition of fs, Proposition 1(2, 9), Proposition 11(1), Proposition 11(3), Proposition 12(2), Proposition 12(4), Proposition 12(3), and Proposition 12(1). The full proof is given in Appendix A. As discussed in Sabri et al. (2008; 2009b), the knowledge of each agent is modeled as an information algebra N (Φ, D ). Based on the operators of information algebra, we introduce in Sabri et al. (2008; 2009b) several functions to specify operations on knowledge. • isInKnowledge(N , x, ϕ) ∃(ψ | ψ ∈ Φ : x ∈ D ∧ ϕ ≤ ψ ∧ x d(ψ) ). This function verifies the existence of an information in the knowledge N associated with the frame x and is more informative than ϕ. • extract(N , x, ϕ) {ψ↓x | x ∈ D ∧ ψ ∈ Φ ∧ ϕ ≤ ψ ∧ x d(ψ)}. This function extracts pieces of information from the knowledge N that contains ϕ and restricts them to the frame x. • insert(N , ϕ). This function inserts the information ϕ into Φ.
• update(N , ψ, ϕ) ({(χ − ψ) · ϕ | χ ∈ Φ ∧ ψ ≤ χ} ∪ {χ | χ ∈ Φ ∧ ¬(ψ ≤ χ)}, D ). This function update the knowledge N by replacing ψ with ϕ.
In the insert and update functions, there is always a condition that d( ϕ) ∈ D. We also define in Sabri et al. (2009b) the function choose(Φ) to select a piece of information randomly from Φ. If Φ is empty, it returns the empty information e∅ . In Sabri et al. (2009b), we prove the following proposition which helps in verifying policies. Proposition 16. Let ϕ and ψ be pieces of information and let N be a knowledge.
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
235
1. ϕ ≤ ψ ∧ ϕ ≤ χ → update(update(N , ϕ, ψ), ϕ, χ) = update(N , ϕ, ψ · χ) 2. isInKnowledge(N , d( ϕ), ϕ) ∨ update(N , ϕ, ψ) = N Proof.
1. The proof uses the definition of the function update, Distributivity axiom, Trading rule for ∃, Nesting axiom, Distributivity of ∧ over ∨, Proposition 13(7), Substitution axiom, and properties of set theory and propositional logic.
2. The proof uses Definition of isInKnowledge, De Morgan laws, Proposition 7(1), Definition of update, Empty range axiom, and properties of set theory and propositional logic. The full proof is given in Appendix A.
4. Application The proposed mathematical structure has several applications in the analysis of security properties. We summarize here its use in the analysis of cryptographic protocols and information flow. We implement a prototype tool in the functional programming language Haskell. This prototype tool is used to represent and manipulate explicit knowledge of agents. It allows initializing the lattice of frames D and the set of information Φ for each agent. It implements the functions presented earlier so that the user can insert, remove and update the knowledge of each agent. Also, it allows extracting information from the knowledge and verifying the existence of an information in the knowledge of an agent. 4.1 Cryptographic Protocols
In Sabri et al. (2008), we show the use of our representation of the explicit knowledge and its functions to specify protocols, specify properties, reduce the state space, and generate a specific type of attack with the aid of the developed prototype tool. • Specify protocol: the tool allows specifying the insertion of information and the update of the knowledge -- insertInformation is the implementation of the -- function insert presented in the previous section. -- insertInformation function inserts the key "hello" -- into the frame named "key" at the knowledge of agent "S". insertInformation "S" ([("key",["hello"])],["key"])
• Specify properties: the tool allows specifying several properties such as an intruder Z should not get a session key “hello“. -- isInKnowledge is the implementation of the -- function isInKnowledge presented in the previous section. -- isInKnowledge function checks if the intruder knowledge (Z) -- contains the key "hello" associated with the frame "key" isInKnowledge "Z" (["key"]) ([("key",["k"])],["key"])
• Reduce state space: the tool allows specifying intruder that send useful messages. For example, all the messages sent to the server should be encrypted with the server public key if the server should decrypt the message.
236
Advanced Technologies
-- extractInformation is the implementation of the -- function extract presented in the previous section. -- extractInformation extracts from the knowledge Z the public keys that -- are associated with the server. extractInformation "Z" (["publicK"]) ([("id",["Server"])],["id"])
• Generate attack: the tool allows mounting a specific kind of attack such as a reflection attack where the intruder Z sends messages back to the sender. -- extractInformation extract from the knowledge Z all the messages -- that are associated with the sender John Do. extractInformation "Z" (["message"]) ([("sender",["John Do"])],["sender"])
In Sabri et al. (2008), we summarize existing techniques used to specify agent knowledge in cryptographic protocols. For instance, Leduc and Germeau (2000) adopt LOTOS, F´abrega et al. (1999) propose strand space, Clarke et al. (2000) introduce Brutus, Paulson (1998) adopt an inductive approach, Ma and Cheng (2005) introduced a knowledge-based logical system, and finally Cervesato (2000) introduce MSR. We also compare them to our mathematical structure. We show that our explicit knowledge representation allows specifying knowledges similar to the existing techniques. However, our mathematical structure allows the specifier to define only a set of frames of the explicit knowledge which indicates the classification of information. There is no need to specify the relation between information as in the existing techniques such as relating public key with a private key. We use a compact number of operators to specify the agent explicit knowledge of any protocol. For example, the knowledge-based logical approach as found in Ma and Cheng (2005) uses about six functions to specify the registration phase of the SET protocol. Four functions are used to map an agent to its public encryption key, private encryption key, public signature key, and private signature keys. Also, a function is introduced to associate two agents with a shared key, and another function to verify if a message is a part of another one. In our structure, only the pre-defined operators within the framework are required to manipulate the information. There is no need to define new operators. Having a small number of operators would reduce the complexity of specifying cryptographic protocols and verifying them. Also, the proposed framework enables specifying the internal actions of agents. For example, we can specify the ability of the server to check the freshness of a message while this is not possible in Brutus as we find in Clarke et al. (2000). The inability of specifying the internal actions would affect the protocol analysis and implementation. 4.2 Information Flow Analysis
In Sabri et al. (2009b), we apply our explicit knowledge structure in developing a technique to verify information flow in agent-based systems. The technique is based on information algebra to represent agent knowledge, global calculus to represent the communication and an amended version of Hoare logic for verification. We use Hoare triple { P}S{ Q} to conduct verification where the precondition P represents a condition on the initial knowledge of agents, S represents the specification of the communication between agents, and the postcondition Q represents the negation of a confidentiality policy on the knowledge of agents. The
Algebraic Model for Agent Explicit Knowledge in Multi-agent Systems
237
precondition and the postcondition are expressed within the language of information algebra. To verify a policy, we first calculate the weakest precondition from S and Q and then prove or disprove that P → wp(S, Q). The inference rules are obtained by amending Hoare’s set of rules to make them appropriate to protocols specified using global calculus and information algebra. For more details, we refer the reader to Sabri et al. (2009b). A tool is used in Sabri et al. (2009b) together with the PVS theorem prover to verify policies. In Sabri et al. (2009b), we show that the use of information algebra to specify confidentiality policies allows specifying policies similar to that of Bell and LaPadula (1976) and Brewer and Nash (1989) models. Also, it allows analyzing composite information flow, which is not taken into consideration in the existing techniques such as Alghathbar et al. (2006); Focardi and Gorrieri (1997); Hristova et al. (2006); Varadharajan (1990). Analyzing composite information enables verifying the possibility of an agent to link pieces of information together and therefore, build up an important composite information.
5. Conclusion In this chapter, we present a structure to specify agent explicit knowledge based on information algebra. We define in the context of agent knowledge the combining, marginalizing, and labelling operators. Also, we define remove and frame substitution operator. These operators are all what is needed to express operations on agent explicit knowledge. We also define a set of frames to be associated with information. Then, we prove that our structure is an information algebra which links our work to a rich heritage of mathematical theories. Our mathematical structure is expressive as it allows combining information for different purposes regardless of their frames, extracting a part of information, or associating information with a frame. We give two applications of the proposed structure. First, we apply it to the specification and analysis of agent knowledge in cryptographic protocols. In the literature of cryptographic protocols, operators are usually defined on information that belongs to a specific type, while our structure enables a uniform and a general way to handle information. Also, defining a relation between frames and linking them to the operators applied on information is not addressed in the literature. Furthermore, different protocol-dependent structures should be defined to relate different kinds of information which are not needed in our representation. Second, we show its use in the analysis of information flow between agents in multi-agent systems. Our structure provides a comprehensive language to specify agents knowledge and confidentiality policies. For example, it allows specifying and reasoning on composite information flow. Also, it allows specifying policies similar those articulated within Bell-LaPadula and Chinese Wall models.
A. Detailed Proofs

A.1 Proposition 2

DJ DK = DJ∪K

Proof.

  DJ DK
=  ⟨ Definition 5(1) ⟩
  {χ | ∃(ϕ, ψ | ϕ ∈ DJ ∧ ψ ∈ DK : χ = ϕψ)}
=  ⟨ y ∈ {x | r} ↔ r[x := y] & Definition of DJ ⟩
  {χ | ∃(ϕ, ψ | ∃(f | f ∈ DI : ϕ = IJ ; f) ∧ ψ ∈ DK : χ = ϕψ)}
=  ⟨ y ∈ {x | r} ↔ r[x := y] & Definition of DK ⟩
  {χ | ∃(ϕ, ψ | ∃(f | f ∈ DI : ϕ = IJ ; f) ∧ ∃(g | g ∈ DI : ψ = IK ; g) : χ = ϕψ)}
=  ⟨ Distributivity of ∧ over ∃ ⟩
  {χ | ∃(ϕ, ψ | ∃(f | f ∈ DI : ϕ = IJ ; f ∧ ∃(g | g ∈ DI : ψ = IK ; g)) : χ = ϕψ)}
=  ⟨ Trading rule for ∃ ⟩
  {χ | ∃(ϕ, ψ | ∃(f | f ∈ DI ∧ ϕ = IJ ; f : ∃(g | g ∈ DI : ψ = IK ; g)) : χ = ϕψ)}
=  ⟨ Nesting axiom ⟩
  {χ | ∃(ϕ, ψ | ∃(f, g | f ∈ DI ∧ ϕ = IJ ; f ∧ g ∈ DI : ψ = IK ; g) : χ = ϕψ)}
=  ⟨ Trading rule for ∃ & Symmetry of ∧ ⟩
  {χ | ∃(ϕ, ψ | ∃(f, g | f ∈ DI ∧ g ∈ DI : ϕ = IJ ; f ∧ ψ = IK ; g) : χ = ϕψ)}
=  ⟨ Trading rule for ∃ ⟩
  {χ | ∃(ϕ, ψ |: ∃(f, g | f ∈ DI ∧ g ∈ DI : ϕ = IJ ; f ∧ ψ = IK ; g) ∧ χ = ϕψ)}
=  ⟨ Distributivity of ∧ over ∃ ⟩
  {χ | ∃(ϕ, ψ |: ∃(f, g | f ∈ DI ∧ g ∈ DI : ϕ = IJ ; f ∧ ψ = IK ; g ∧ χ = ϕψ))}
=  ⟨ Substitution axiom ⟩
  {χ | ∃(ϕ, ψ |: ∃(f, g | f ∈ DI ∧ g ∈ DI : ϕ = IJ ; f ∧ ψ = IK ; g ∧ χ = (IJ ; f)ψ))}
=  ⟨ Substitution axiom ⟩
  {χ | ∃(ϕ, ψ |: ∃(f, g | f ∈ DI ∧ g ∈ DI : ϕ = IJ ; f ∧ ψ = IK ; g ∧ χ = (IJ ; f)(IK ; g)))}
=  ⟨ Trading rule for ∃ & Symmetry of ∧ ⟩
  {χ | ∃(ϕ, ψ |: ∃(f, g | f ∈ DI ∧ g ∈ DI ∧ χ = (IJ ; f)(IK ; g) : ϕ = IJ ; f ∧ ψ = IK ; g))}
=  ⟨ Interchange of dummies ⟩
  {χ | ∃(f, g | f ∈ DI ∧ g ∈ DI ∧ χ = (IJ ; f)(IK ; g) : ∃(ϕ, ψ |: ϕ = IJ ; f ∧ ψ = IK ; g))}
=  ⟨ Ix and the functions f and g are always defined, and their composition is defined as well ⟩
  {χ | ∃(f, g | f ∈ DI ∧ g ∈ DI ∧ χ = (IJ ; f)(IK ; g) : true)}
=  ⟨ Trading rule for ∃ & Identity of ∧ ⟩
  {χ | ∃(f, g | f ∈ DI ∧ g ∈ DI : χ = (IJ ; f)(IK ; g))}
=  ⟨ Definition 4 and Proposition 1(1) ⟩
  {χ | ∃(f, g | f, g ∈ DI : χ = {(i, A) | i ∈ J∩K ∧ A = f(i) ∪ g(i)} ∪ {(i, A) | i ∈ J−K ∧ A = f(i)} ∪ {(i, A) | i ∈ K−J ∧ A = g(i)})}
=  ⟨ Let h(i) = f(i) ∪ g(i) if i ∈ J∩K, h(i) = f(i) if i ∈ J−K, h(i) = g(i) if i ∈ K−J, and h(i) = ∅ if i ∈ I−(J∪K) ⟩
  {χ | ∃(h | h ∈ DI : χ = {(i, A) | i ∈ J∩K ∧ A = h(i)} ∪ {(i, A) | i ∈ J−K ∧ A = h(i)} ∪ {(i, A) | i ∈ K−J ∧ A = h(i)})}
=  ⟨ Range split axiom ⟩
  {χ | ∃(h | h ∈ DI : χ = {(i, A) | i ∈ (J∩K) ∪ (J−K) ∪ (K−J) ∧ A = h(i)})}
=  ⟨ Set theory ⟩
  {χ | ∃(h | h ∈ DI : χ = {(i, A) | i ∈ J∪K ∧ A = h(i)})}
=  ⟨ Proposition 1(1) ⟩
  {χ | ∃(h | h ∈ DI : χ = IJ∪K ; h)}
=  ⟨ Definition of DJ∪K ⟩
  DJ∪K
A.2 Proposition 7

1. ∀(J, K | J, K ⊆ I : J = K → DJ = DK)
2. ∀(J, K | J, K ⊆ I : DJ = DK → J = K)
3. ∀(J, K | J, K ⊆ I : DJ ≤ DK ↔ J ⊆ K)

Proof.

1.
  ∀(J, K | J, K ⊆ I : J = K → DJ = DK)
↔  ⟨ Trading rule for ∀ ⟩
  ∀(J, K | J, K ⊆ I ∧ J = K : DJ = DK)
←  ⟨ Trading rule for ∀ & p ∧ q → p ⟩
  ∀(J, K | J, K ⊆ I : (J = K ∧ DJ = DK) ↔ J = K)
←  ⟨ Substitution axiom ⟩
  ∀(J, K | J, K ⊆ I : (J = K ∧ DK = DK) ↔ J = K)
←  ⟨ A = A ↔ true ⟩
  ∀(J, K | J, K ⊆ I : (J = K ∧ true) ↔ J = K)
←  ⟨ (p ∧ true) ↔ p ⟩
  ∀(J, K | J, K ⊆ I : J = K ↔ J = K)
←  ⟨ (p ↔ p) ↔ true ⟩
  ∀(J, K | J, K ⊆ I : true)
←  ⟨ ∀-True body ⟩
  true

2.
  ∀(J, K | J, K ⊆ I : DJ = DK → J = K)
↔  ⟨ Contrapositive ⟩
  ∀(J, K | J, K ⊆ I : J ≠ K → DJ ≠ DK)

To prove the proposition, we assume that J ≠ K and prove DJ = DK → false (which is equivalent to ¬(DJ = DK)).

  DJ = DK
↔  ⟨ y ∈ {x | r} ↔ r[x := y] & Definition of DJ and DK ⟩
  {g | ∃(f | f ∈ DI : g = IJ ; f)} = {g | ∃(f | f ∈ DI : g = IK ; f)}
↔  ⟨ {x | Q} = {x | R} ↔ ∀(x |: Q ↔ R) ⟩
  ∀(g |: ∃(f | f ∈ DI : g = IJ ; f) ↔ ∃(f | f ∈ DI : g = IK ; f))
→  ⟨ Definition of "↔" & Weakening ⟩
  ∀(g |: ∃(f | f ∈ DI : g = IJ ; f) → ∃(f | f ∈ DI : g = IK ; f))
→  ⟨ J ≠ K & Assume K = J ∪ {k} ⟩
  ∀(g |: ∃(f | f ∈ DI : g = IJ ; f) → ∃(f | f ∈ DI : g = IJ∪{k} ; f))
→  ⟨ Proposition 1(4) & Relational composition distributes over ∪ ⟩
  ∀(g |: ∃(f | f ∈ DI : g = IJ ; f) → ∃(f | f ∈ DI : g = IJ ; f ∪ I{k} ; f))
→  ⟨ J ⊆ dom(f) ⟩
  ∀(g |: ∃(f | f ∈ DI : |g| = |J|) → ∃(f | f ∈ DI : |g| = |J| + 1))
→  ⟨ Distributivity of ∧ over ∃ ⟩
  ∀(g |: ∃(f | f ∈ DI : true) ∧ |g| = |J| → ∃(f | f ∈ DI : true) ∧ |g| = |J| + 1)
→  ⟨ ∃-True body & Identity of ∧ ⟩
  ∀(g |: |g| = |J| → |g| = |J| + 1)
→  ⟨ Implication (i.e., (p → q) ↔ (¬p ∨ q)) ⟩
  ∀(g |: |g| ≠ |J| ∨ |g| = |J| + 1)
→  ⟨ Let g = eJ & |g| = |J| & false ∨ false → false ⟩
  false

3.
  ∀(J, K | J, K ⊆ I : DJ ≤ DK ↔ J ⊆ K)
←  ⟨ J ∪ K = K ↔ J ⊆ K ⟩
  ∀(J, K | J, K ⊆ I : DJ ≤ DK ↔ J ∪ K = K)
←  ⟨ Proposition 7(2) ⟩
  ∀(J, K | J, K ⊆ I : DJ ≤ DK ↔ DJ∪K = DK)
←  ⟨ Proposition 2 ⟩
  ∀(J, K | J, K ⊆ I : DJ ≤ DK ↔ DJ DK = DK)
←  ⟨ DJ ≤ DK ↔ DJ DK = DK ⟩
  ∀(J, K | J, K ⊆ I : DJ DK = DK ↔ DJ DK = DK)
←  ⟨ Reflexivity of ↔ ⟩
  ∀(J, K | J, K ⊆ I : true)
←  ⟨ ∀-True body ⟩
  true
A.3 Proposition 12

1. ϕ↓DJ · ϕ↓DK = ϕ↓DJ∪K
2. (ϕψ)↓DL = ϕ↓DL ψ↓DL
3. ϕ↓DK = ϕ↓DJ∩K
4. (ϕ↓DK)↓DL = ϕ↓DK∩L

Proof.

1.
  ϕ↓DJ · ϕ↓DK
=  ⟨ Definition of ↓DJ and ↓DK ⟩
  IJ ; ϕ · IK ; ϕ
=  ⟨ ϕ ∈ DL ⟩
  IJ ; {(i, A) | i ∈ L ∧ A = ϕ(i)} · IK ; {(i, A) | i ∈ L ∧ A = ϕ(i)}
=  ⟨ Proposition 1(1) ⟩
  {(i, A) | i ∈ L∩J ∧ A = ϕ(i)} · {(i, A) | i ∈ L∩K ∧ A = ϕ(i)}
=  ⟨ Definition of · ⟩
  {(i, A) | i ∈ (L∩J) ∩ (L∩K) ∧ A = ϕ(i) ∪ ϕ(i)} ∪ {(i, A) | i ∈ (L∩J) − (L∩K) ∧ A = ϕ(i)} ∪ {(i, A) | i ∈ (L∩K) − (L∩J) ∧ A = ϕ(i)}
=  ⟨ ∪ is idempotent ⟩
  {(i, A) | i ∈ (L∩J) ∩ (L∩K) ∧ A = ϕ(i)} ∪ {(i, A) | i ∈ (L∩J) − (L∩K) ∧ A = ϕ(i)} ∪ {(i, A) | i ∈ (L∩K) − (L∩J) ∧ A = ϕ(i)}
=  ⟨ Range split (i.e., {x | r} ∪ {x | p} = {x | r ∨ p}) ⟩
  {(i, A) | (i ∈ (L∩J) ∩ (L∩K) ∧ A = ϕ(i)) ∨ (i ∈ (L∩J) − (L∩K) ∧ A = ϕ(i)) ∨ (i ∈ (L∩K) − (L∩J) ∧ A = ϕ(i))}
=  ⟨ Distributivity of ∧ over ∨ ⟩
  {(i, A) | (i ∈ (L∩J) ∩ (L∩K) ∨ i ∈ (L∩J) − (L∩K) ∨ i ∈ (L∩K) − (L∩J)) ∧ A = ϕ(i)}
=  ⟨ Set union axiom (i.e., i ∈ A ∨ i ∈ B ↔ i ∈ A ∪ B) ⟩
  {(i, A) | i ∈ ((L∩J) ∩ (L∩K)) ∪ ((L∩J) − (L∩K)) ∪ ((L∩K) − (L∩J)) ∧ A = ϕ(i)}
=  ⟨ Set theory ⟩
  {(i, A) | i ∈ L ∩ (K∪J) ∧ A = ϕ(i)}
=  ⟨ Proposition 1(1) ⟩
  IJ∪K ; {(i, A) | i ∈ L ∧ A = ϕ(i)}
=  ⟨ ϕ ∈ DL ⟩
  IJ∪K ; ϕ
=  ⟨ Definition of ↓DJ∪K ⟩
  ϕ↓DJ∪K

2.
  (ϕψ)↓DL
=  ⟨ Definition of · ⟩
  ({(i, A) | i ∈ J∩K ∧ A = ϕ(i) ∪ ψ(i)} ∪ {(i, A) | i ∈ J−K ∧ A = ϕ(i)} ∪ {(i, A) | i ∈ K−J ∧ A = ψ(i)})↓DL
=  ⟨ Definition of ↓DL ⟩
  IL ; {(i, A) | i ∈ J∩K ∧ A = ϕ(i) ∪ ψ(i)} ∪ IL ; {(i, A) | i ∈ J−K ∧ A = ϕ(i)} ∪ IL ; {(i, A) | i ∈ K−J ∧ A = ψ(i)}
=  ⟨ Proposition 1(1) ⟩
  {(i, A) | i ∈ J∩K∩L ∧ A = ϕ(i) ∪ ψ(i)} ∪ {(i, A) | i ∈ (J−K)∩L ∧ A = ϕ(i)} ∪ {(i, A) | i ∈ (K−J)∩L ∧ A = ψ(i)}
=  ⟨ Set theory ⟩
  {(i, A) | i ∈ (J∩L) ∩ (K∩L) ∧ A = ϕ(i) ∪ ψ(i)} ∪ {(i, A) | i ∈ (J∩L) − (K∩L) ∧ A = ϕ(i)} ∪ {(i, A) | i ∈ (K∩L) − (J∩L) ∧ A = ψ(i)}
=  ⟨ Definition of · and ↓DL ⟩
  ϕ↓DL ψ↓DL

3.
  ϕ↓DK
=  ⟨ ϕ ∈ DJ ⟩
  {(i, A) | i ∈ J ∧ A = ϕ(i)}↓DK
=  ⟨ Definition of ↓DK ⟩
  IK ; {(i, A) | i ∈ J ∧ A = ϕ(i)}
=  ⟨ Proposition 1(1) ⟩
  {(i, A) | i ∈ J∩K ∧ A = ϕ(i)}
=  ⟨ Set theory ⟩
  {(i, A) | i ∈ J ∩ (J∩K) ∧ A = ϕ(i)}
=  ⟨ Definition of ↓DJ∩K ⟩
  ϕ↓DJ∩K

4.
  (ϕ↓DK)↓DL
=  ⟨ Definition of ↓DL ⟩
  IL ; (ϕ↓DK)
=  ⟨ Definition of ↓DK ⟩
  IL ; IK ; ϕ
=  ⟨ Proposition 1(2) ⟩
  IK∩L ; ϕ
=  ⟨ Definition of ↓DK∩L ⟩
  ϕ↓DK∩L
A.4 Proposition 15

1. DJ ≤ d(ϕ) ∨ ϕ = fs(ϕ, DJ, DK)
2. DK ≤ d(ϕ) ∨ ϕ = fs(fs(ϕ, DJ, DK), DK, DJ)

Proof.

1.
  DJ ≤ d(ϕ) ∨ ϕ = fs(ϕ, DJ, DK)
↔  ⟨ p ∨ q ↔ ¬p → q ⟩
  ¬(DJ ≤ d(ϕ)) → ϕ = fs(ϕ, DJ, DK)

Our proof strategy for p → q is to assume p and then prove q. We assume

  ¬(DJ ≤ d(ϕ))
↔  ⟨ d(ϕ) = DL ⟩
  ¬(DJ ≤ DL)
→  ⟨ Proposition 7(3) ⟩
  ¬(J ⊆ L)
→  ⟨ Proposition 14 ⟩
  J ∩ L = ∅

Then we prove fs(ϕ, DJ, DK) = ϕ:

  fs(ϕ, DJ, DK)
=  ⟨ Definition of fs(ϕ, DJ, DK) ⟩
  ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ])
=  ⟨ (J ∩ L = ∅) → L − J = L ⟩
  ϕ↓DL · (ϕ↓DJ [DK/DJ])
=  ⟨ Proposition 10(1) ⟩
  ϕ · (ϕ↓DJ [DK/DJ])
=  ⟨ ϕ ∈ DL ⟩
  ϕ · ({(i, A) | i ∈ L ∧ A = ϕ(i)}↓DJ [DK/DJ])
=  ⟨ Definition of ↓DJ ⟩
  ϕ · (IJ ; {(i, A) | i ∈ L ∧ A = ϕ(i)} [DK/DJ])
=  ⟨ Proposition 1(1) ⟩
  ϕ · ({(i, A) | i ∈ L∩J ∧ A = ϕ(i)} [DK/DJ])
=  ⟨ J ∩ L = ∅ ⟩
  ϕ · ({(i, A) | i ∈ ∅ ∧ A = ϕ(i)} [DK/DJ])
=  ⟨ i ∈ ∅ ↔ false ⟩
  ϕ · ({(i, A) | false ∧ A = ϕ(i)} [DK/DJ])
=  ⟨ p ∧ false ↔ false ⟩
  ϕ · ({(i, A) | false} [DK/DJ])
=  ⟨ Empty range ⟩
  ϕ · (∅[DK/DJ])
=  ⟨ Frame substitution of an empty set ⟩
  ϕ · ∅
=  ⟨ Proposition 11(4) ⟩
  ϕ · e∅
=  ⟨ Proposition 11(3) ⟩
  ϕ

2.
  DK ≤ d(ϕ) ∨ ϕ = fs(fs(ϕ, DJ, DK), DK, DJ)
↔  ⟨ p ∨ q ↔ ¬p → q ⟩
  ¬(DK ≤ d(ϕ)) → ϕ = fs(fs(ϕ, DJ, DK), DK, DJ)

Our proof strategy for p → q is to assume p and then prove q. We assume

  ¬(DK ≤ d(ϕ))
↔  ⟨ d(ϕ) = DL ⟩
  ¬(DK ≤ DL)
→  ⟨ Proposition 7(3) ⟩
  ¬(K ⊆ L)
→  ⟨ Proposition 14 ⟩
  K ∩ L = ∅

To prove ϕ = fs(fs(ϕ, DJ, DK), DK, DJ), we have two cases: (a) ¬(DJ ≤ d(ϕ)) and (b) DJ ≤ d(ϕ).

(a) Assume ¬(DJ ≤ d(ϕ)):

  fs(fs(ϕ, DJ, DK), DK, DJ)
=  ⟨ Proposition 15(1) & ¬(DJ ≤ d(ϕ)) ⟩
  fs(ϕ, DK, DJ)
=  ⟨ Proposition 15(1) & ¬(DK ≤ d(ϕ)) ⟩
  ϕ

(b) Assume DJ ≤ d(ϕ), so that J ⊆ L and J ∩ L = J:

  fs(fs(ϕ, DJ, DK), DK, DJ)
=  ⟨ Definition of fs ⟩
  fs(ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]), DK, DJ)
=  ⟨ Definition of fs ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓D(L−K) · ((ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DK [DJ/DK])
=  ⟨ (K ∩ L = ∅) → L − K = L ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · ((ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DK [DJ/DK])
=  ⟨ Proposition 12(2, 4) & d(ϕ↓DJ [DK/DJ]) = DK ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · ((ϕ↓D((L−J)∩K) · (ϕ↓DJ [DK/DJ])) [DJ/DK])
=  ⟨ K ∩ L = ∅ ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · ((ϕ↓D∅ · (ϕ↓DJ [DK/DJ])) [DJ/DK])
=  ⟨ Proposition 11(1) ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · ((e∅ · (ϕ↓DJ [DK/DJ])) [DJ/DK])
=  ⟨ Proposition 11(3) ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · ((ϕ↓DJ [DK/DJ]) [DJ/DK])
=  ⟨ Replacing J by K and then K by J is equivalent to replacing J by J ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · (ϕ↓DJ [DJ/DJ])
=  ⟨ Replacing J by J does not affect the information ⟩
  (ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ]))↓DL · ϕ↓DJ
=  ⟨ Proposition 12(2) ⟩
  (ϕ↓D(L−J))↓DL · (ϕ↓DJ [DK/DJ])↓DL · ϕ↓DJ
=  ⟨ Proposition 12(4) ⟩
  ϕ↓D((L−J)∩L) · (ϕ↓DJ [DK/DJ])↓DL · ϕ↓DJ
=  ⟨ (L − J) ∩ L = L − J ⟩
  ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ])↓DL · ϕ↓DJ
=  ⟨ Proposition 12(3) & d(ϕ↓DJ [DK/DJ]) = DK ⟩
  ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ])↓DL∩K · ϕ↓DJ
=  ⟨ K ∩ L = ∅ ⟩
  ϕ↓D(L−J) · (ϕ↓DJ [DK/DJ])↓D∅ · ϕ↓DJ
=  ⟨ Proposition 11(1) ⟩
  ϕ↓D(L−J) · e∅ · ϕ↓DJ
=  ⟨ Proposition 11(3) ⟩
  ϕ↓D(L−J) · ϕ↓DJ
=  ⟨ Proposition 12(1) ⟩
  ϕ↓D((L−J)∪J)
=  ⟨ DJ ≤ d(ϕ) → J ⊆ L & (L − J) ∪ J = L ⟩
  ϕ↓DL
=  ⟨ Proposition 10(1) ⟩
  ϕ
A.5 Proposition 16

1. ϕ ≤ ψ ∧ ϕ ≤ χ → update(update(N, ϕ, ψ), ϕ, χ) = update(N, ϕ, ψ · χ)
2. isInKnowledge(N, d(ϕ), ϕ) ∨ update(N, ϕ, ψ) = N

Proof.

1. Let Φ denote the knowledge set of N, let Ω denote the knowledge set of update(N, ϕ, ψ), and let Ψ denote the knowledge set of update(update(N, ϕ, ψ), ϕ, χ). We calculate:

  Ψ
=  ⟨ Definition of the function update ⟩
  {τ | ∃(χ1 | χ1 ∈ Ω : ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ)} ∪ {τ | ∃(χ1 | χ1 ∈ Ω : ¬(ϕ ≤ χ1) ∧ τ = χ1)}
=  ⟨ Definition of the function update ⟩
  {τ | ∃(χ1 | χ1 ∈ {τ1 | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ τ1 = (χ2 − ϕ) · ψ)} ∪ {τ1 | ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ τ1 = χ2)} : ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ)}
  ∪ {τ | ∃(χ1 | χ1 ∈ {τ1 | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ τ1 = (χ2 − ϕ) · ψ)} ∪ {τ1 | ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ τ1 = χ2)} : ¬(ϕ ≤ χ1) ∧ τ = χ1)}
=  ⟨ Set union axiom (i.e., i ∈ A ∨ i ∈ B ↔ i ∈ A ∪ B) ⟩
  {τ | ∃(χ1 | χ1 ∈ {τ1 | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ τ1 = (χ2 − ϕ) · ψ)} ∨ χ1 ∈ {τ1 | ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ τ1 = χ2)} : ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ)}
  ∪ {τ | ∃(χ1 | χ1 ∈ {τ1 | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ τ1 = (χ2 − ϕ) · ψ)} ∨ χ1 ∈ {τ1 | ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ τ1 = χ2)} : ¬(ϕ ≤ χ1) ∧ τ = χ1)}
=  ⟨ y ∈ {x | r} ↔ r[x := y] ⟩
  {τ | ∃(χ1 | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ χ1 = χ2) : ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ)}
  ∪ {τ | ∃(χ1 | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ χ1 = χ2) : ¬(ϕ ≤ χ1) ∧ τ = χ1)}
=  ⟨ Distributivity axiom ⟩
  {τ | ∃(χ1 | ∃(χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2)) : ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ)}
  ∪ {τ | ∃(χ1 | ∃(χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2)) : ¬(ϕ ≤ χ1) ∧ τ = χ1)}
=  ⟨ Trading rule for ∃ ⟩
  {τ | ∃(χ1 |: ∃(χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2)) ∧ ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ)}
  ∪ {τ | ∃(χ1 |: ∃(χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2)) ∧ ¬(ϕ ≤ χ1) ∧ τ = χ1)}
=  ⟨ Nesting axiom ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ((ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2)) ∧ (ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ))}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ((ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2)) ∧ (¬(ϕ ≤ χ1) ∧ τ = χ1))}
=  ⟨ Distributivity of ∧ over ∨ ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ ∧ ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ϕ ≤ χ1 ∧ τ = (χ1 − ϕ) · χ))}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = (χ2 − ϕ) · ψ ∧ ¬(ϕ ≤ χ1) ∧ τ = χ1) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ¬(ϕ ≤ χ1) ∧ τ = χ1))}
=  ⟨ Proposition 13(7) ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ ϕ ≤ χ1 ∧ τ = χ1 · χ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ϕ ≤ χ1 ∧ τ = χ1 · χ))}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ ¬(ϕ ≤ χ1) ∧ τ = χ1) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ¬(ϕ ≤ χ1) ∧ τ = χ1))}
=  ⟨ Substitution axiom ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ ϕ ≤ χ2 · ψ ∧ τ = χ2 · ψ · χ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ϕ ≤ χ2 ∧ τ = χ2 · χ))}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ ¬(ϕ ≤ χ2 · ψ) ∧ τ = χ2 · ψ) ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2))}
=  ⟨ Contradiction ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : (ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ ϕ ≤ χ2 · ψ ∧ τ = χ2 · ψ · χ) ∨ false)}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : false ∨ (¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2))}
=  ⟨ Zero for ∨ ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ ϕ ≤ χ2 · ψ ∧ τ = χ2 · ψ · χ)}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2)}
=  ⟨ ϕ ≤ χ2 → ϕ ≤ χ2 · ψ ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ τ = χ2 · ψ · χ)}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2)}
=  ⟨ Idempotency of ∧ ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ χ1 = χ2 · ψ ∧ τ = χ2 · ψ · χ)}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ χ1 = χ2 ∧ τ = χ2)}
=  ⟨ Trading rule for ∃ & Symmetry of ∧ ⟩
  {τ | ∃(χ1, χ2 | χ2 ∈ Φ ∧ ϕ ≤ χ2 ∧ τ = χ2 · ψ · χ : χ1 = χ2 · ψ)}
  ∪ {τ | ∃(χ1, χ2 | χ2 ∈ Φ ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2 : χ1 = χ2)}
=  ⟨ Nesting axiom ⟩
  {τ | ∃(χ2 | χ2 ∈ Φ ∧ ϕ ≤ χ2 ∧ τ = χ2 · ψ · χ : ∃(χ1 |: χ1 = χ2 · ψ))}
  ∪ {τ | ∃(χ2 | χ2 ∈ Φ ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2 : ∃(χ1 |: χ1 = χ2))}
=  ⟨ The information χ1 and the combination of information are always defined ⟩
  {τ | ∃(χ2 | χ2 ∈ Φ ∧ ϕ ≤ χ2 ∧ τ = χ2 · ψ · χ : true)}
  ∪ {τ | ∃(χ2 | χ2 ∈ Φ ∧ ¬(ϕ ≤ χ2) ∧ τ = χ2 : true)}
=  ⟨ Trading rule for ∃ ⟩
  {τ | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ τ = χ2 · ψ · χ)}
  ∪ {τ | ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ τ = χ2)}
=  ⟨ Proposition 13(7) ⟩
  {τ | ∃(χ2 | χ2 ∈ Φ : ϕ ≤ χ2 ∧ τ = (χ2 − ϕ) · ψ · χ)}
  ∪ {τ | ∃(χ2 | χ2 ∈ Φ : ¬(ϕ ≤ χ2) ∧ τ = χ2)}
=  ⟨ The definition of Ω ⟩
  Ω[ψ · χ/ψ],

which is the knowledge set of update(N, ϕ, ψ · χ).

2.
  isInKnowledge(N, d(ϕ), ϕ) ∨ update(N, ϕ, ψ) = N
↔  ⟨ p → q ↔ ¬p ∨ q ⟩
  ¬isInKnowledge(N, d(ϕ), ϕ) → update(N, ϕ, ψ) = N

First, we assume that

  ¬isInKnowledge(N, d(ϕ), ϕ)
↔  ⟨ Definition of isInKnowledge ⟩
  ¬∃(χ | χ ∈ Φ : d(ϕ) ∈ D ∧ ϕ ≤ χ ∧ d(ϕ) ≤ d(χ))
↔  ⟨ De Morgan ⟩
  ∀(χ | χ ∈ Φ : ¬(d(ϕ) ∈ D ∧ ϕ ≤ χ ∧ d(ϕ) ≤ d(χ)))
↔  ⟨ De Morgan ⟩
  ∀(χ | χ ∈ Φ : ¬(d(ϕ) ∈ D) ∨ ¬(ϕ ≤ χ) ∨ ¬(d(ϕ) ≤ d(χ)))

Based on this assumption, we prove that, for d(ϕ) = DJ and d(χ) = DK, we have ∃(χ | χ ∈ Φ : ϕ ≤ χ) → false:

  ∃(χ | χ ∈ Φ : ϕ ≤ χ)
→  ⟨ ϕ ≤ χ → J ⊆ K, from the definition of ≤ ⟩
  ∃(χ | χ ∈ Φ : ϕ ≤ χ ∧ J ⊆ K)
→  ⟨ Proposition 7(1) ⟩
  ∃(χ | χ ∈ Φ : ϕ ≤ χ ∧ DJ ≤ DK)
→  ⟨ d(ϕ) = DJ and d(χ) = DK ⟩
  ∃(χ | χ ∈ Φ : ϕ ≤ χ ∧ d(ϕ) ≤ d(χ))
→  ⟨ d(χ) ∈ D ∧ J ⊆ K → d(ϕ) ∈ D ⟩
  ∃(χ | χ ∈ Φ : ϕ ≤ χ ∧ d(ϕ) ≤ d(χ) ∧ d(ϕ) ∈ D)
→  ⟨ The assumption ⟩
  false

Therefore, we have ¬∃(χ | χ ∈ Φ : ϕ ≤ χ) ↔ ∀(χ | χ ∈ Φ : ¬(ϕ ≤ χ)) ↔ true. Then we prove that update(N, ϕ, ψ) = N:

  update(N, ϕ, ψ)
=  ⟨ Definition of update ⟩
  ({(χ − ϕ) · ψ | χ ∈ Φ ∧ ϕ ≤ χ} ∪ {χ | χ ∈ Φ ∧ ¬(ϕ ≤ χ)}, D)
=  ⟨ From the result above: ∀(χ | χ ∈ Φ : ¬(ϕ ≤ χ)) ⟩
  ({(χ − ϕ) · ψ | χ ∈ Φ ∧ false} ∪ {χ | χ ∈ Φ ∧ true}, D)
=  ⟨ Zero of ∧ and ∨ ⟩
  ({(χ − ϕ) · ψ | false} ∪ {χ | χ ∈ Φ}, D)
=  ⟨ Empty range ⟩
  (∅ ∪ {χ | χ ∈ Φ}, D)
=  ⟨ Identity of ∪ ⟩
  ({χ | χ ∈ Φ}, D)
=  ⟨ Definition of N ⟩
  N
B. References

Alghathbar, K., Farkas, C., and Wijesekera, D. (2006). Securing UML information flow using FlowUML. Journal of Research and Practice in Information Technology, 38(1):111–120.
Bell, D. and LaPadula, L. (1976). Secure computer system: Unified exposition and Multics interpretation. Technical Report ESD-TR-75-306, The MITRE Corporation.
Brewer, D. F. and Nash, M. J. (1989). The Chinese Wall security policy. In IEEE Symposium on Security and Privacy, pages 206–214.
Cervesato, I. (2000). Typed multiset rewriting specifications of security protocols. In Seda, A., editor, First Irish Conference on the Mathematical Foundations of Computer Science and Information Technology (MFCSIT'00), pages 1–43, Cork, Ireland. Elsevier ENTCS 40.
Clarke, E. M., Jha, S., and Marrero, W. (2000). Verifying security protocols with Brutus. ACM Transactions on Software Engineering and Methodology, 9(4):443–487.
Davey, B. and Priestley, H. (2002). Introduction to Lattices and Order. Cambridge University Press, second edition.
Fábrega, F. J. T., Herzog, J. C., and Guttman, J. D. (1999). Strand spaces: Proving security protocols correct. Journal of Computer Security, 7(2–3):191–230.
Focardi, R. and Gorrieri, R. (1997). The Compositional Security Checker: A tool for the verification of information flow security properties. IEEE Transactions on Software Engineering, 23(9):550–571.
Hristova, K., Rothamel, T., Liu, Y. A., and Stoller, S. D. (2006). Efficient type inference for secure information flow. In PLAS '06: Proceedings of the 2006 Workshop on Programming Languages and Analysis for Security, pages 85–94, New York, NY, USA. ACM.
Kohlas, J. and Stärk, R. F. (2007). Information algebras and consequence operators. Logica Universalis, 1(1):139–165.
Leduc, G. and Germeau, F. (2000). Verification of security protocols using LOTOS: method and application. Computer Communications, 23(12):1089–1103.
Ma, X.-Q. and Cheng, X.-C. (2005). Formal verification of the merchant registration phase of the SET protocol. International Journal of Automation and Computing, 2(2):155–162.
Paulson, L. C. (1998). The inductive approach to verifying cryptographic protocols. Journal of Computer Security, 6(1–2):85–128.
Sabri, K. E. and Khedri, R. (2006). A multi-view approach for the analysis of cryptographic protocols. In Workshop on Practice and Theory of IT Security (PTITS 2006), pages 21–27, Montreal, QC, Canada.
Sabri, K. E. and Khedri, R. (2007a). A mathematical framework to capture agent explicit knowledge in cryptographic protocols. Technical Report CAS-07-04-RK, Department of Computing and Software, Faculty of Engineering, McMaster University. http://www.cas.mcmaster.ca/cas/0template1.php?601 (accessed on May 20, 2009).
Sabri, K. E. and Khedri, R. (2007b). Multi-view framework for the analysis of cryptographic protocols. Technical Report CAS-07-06-RK, Department of Computing and Software, Faculty of Engineering, McMaster University. http://www.cas.mcmaster.ca/cas/0template1.php?601 (accessed on May 20, 2009).
Sabri, K. E. and Khedri, R. (2008). Agent explicit knowledge: Survey of the literature and elements of a suitable representation. In 2nd Workshop on Practice and Theory of IT Security (PTITS 2008), pages 4–9, Montreal, QC, Canada.
Sabri, K. E., Khedri, R., and Jaskolka, J. (2008). Specification of agent explicit knowledge in cryptographic protocols. In CESSE 2008: International Conference on Computer, Electrical, and Systems Science, and Engineering, volume 35, pages 447–454, Venice, Italy. World Academy of Science, Engineering and Technology.
Sabri, K. E., Khedri, R., and Jaskolka, J. (2009a). Automated verification of information flow in agent-based systems. Technical Report CAS-09-01-RK, Department of Computing and Software, Faculty of Engineering, McMaster University. http://www.cas.mcmaster.ca/cas/0template1.php?601 (accessed on May 20, 2009).
Sabri, K. E., Khedri, R., and Jaskolka, J. (2009b). Verification of information flow in agent-based systems. In E-Technologies: Innovation in an Open World, 4th International Conference, MCETECH 2009, volume 27 of Lecture Notes in Business Information Processing, pages 252–266, Ottawa, Canada. Springer Berlin Heidelberg.
Varadharajan, V. (1990). Petri net based modelling of information flow security requirements. In Computer Security Foundations Workshop III, pages 51–61.
14

Energy Field as a Novel Approach to Challenge Viruses

S. Amirhassan Monadjemi
Department of Computer Engineering, Faculty of Engineering, University of Isfahan, Isfahan, 81746, Iran

1. Introduction

The serious harmful effects of viruses and viral infections on human life have been one of the greatest challenges of the health industry in recent years. A virus mutation can create a worldwide epidemic when the virus evolves into a pathogen deadly for humans, such as a bird flu virus that may mutate to become as deadly and infectious as the viruses that killed millions during the three influenza pandemics of the 20th century. The influenza pandemic was so severe that it killed more people than the Great War, known today as World War I: somewhere between 20 and 40 million people. It has been cited as the most devastating epidemic in recorded world history. HIV/AIDS has become one of the great fears of our time, a disease so virulent and widespread that its victims are counted in the millions. It has killed at least 28 million people since 1982 (Broder et al., 1990; WHO, 2003). The number of people infected with HIV/AIDS now surpasses 50 million. In the 25 years since the first reported cases of HIV/AIDS in 1981, the disease has become a global pandemic. Unfortunately, the epidemic's history is a story of largely unfulfilled hopes for various treatments. The history of drug treatment regimens for HIV/AIDS is complex; it is complicated by problems with toxicity, compliance, side effects, and cost. It is therefore conceivable that AIDS is more than just another disease that nature has thrown our way, and that this calamitous virus, which could potentially wipe out the entire human species, may have slipped out of a man-made experiment. Was this again another program to keep us in a constant state of fear, or was it a new conspiracy to control the world population by introducing some secret viruses? It does not really matter which of the above is true, or if both are false. What matters is that viruses in general are a yearly problem planet-wide, and it would be great if we had a solution for this once and for all. In both detecting and fighting viral diseases, ordinary methods have come up against some basic and important difficulties. Vaccination is, in a sense, the introduction of the virus to the immune system before the occurrence of a real infection. It is very successful against some viruses (e.g. poliomyelitis), while totally ineffective against some others (e.g. HIV or hepatitis C). On the other hand, anti-virus drugs are mostly tools to control, not to cure, a viral disease. This is a good motivation to try alternative treatments. In this study, some key features of possible physically based alternative treatments for viral diseases
are presented. Electrification of body parts or fluids (especially blood) with micro electric signals of adjusted current or frequency is also studied. The main approach of this study is to find a suitable energy field, with appropriate parameters, that is able to kill or deactivate viruses. This would be a lengthy, multi-disciplinary research effort which needs the contribution of virology, physics, and signal processing experts. There is no doubt that bird flu actually exists, and there is no question that in the past the world has been badly hit by flu pandemics. The Spanish flu of 1918 was the result of such a bird flu. Initially it killed only birds, but it mutated into a virus that could be spread from human to human, and it ended up killing over 40 million people in just a few months. It was very quick, and its victims usually lasted a couple of days at most. This flu was not selective, in that it attacked all age groups, although mostly young males. There was no protection from it, and people became infected through person-to-person contact. Currently, there are three times as many people on the planet, and if a similar pandemic were to hit in the same way, we could expect a minimum of 150 million and up to as many as half a billion human deaths throughout the world. To top it off, modern anti-viral drugs and antibiotics have little effect on this modern bird flu strain. All attempts to find an effective vaccination for it have proven unsuccessful. A couple of years ago it was SARS, and a year before that it was the anthrax scare. SARS, a viral pneumonia-like illness, became the first pandemic of the twenty-first century, spreading quickly to more than 30 countries and killing at least 10 per cent of those who caught it. Many scientists believe that SARS is man-made and that the virus could only be produced in laboratory conditions; perhaps an accidental leak from a laboratory is to blame. Persian Gulf War Syndrome, dating from the end of the first Persian Gulf War in 1991, still affects hundreds of thousands of American veterans exposed to the "toxic soup" that cut some 30 years off many of their life-spans. Hundreds of thousands of people were sent into the Gulf War zone, and others who never entered the war zone also became sick from the vaccines alone. The question becomes: what is the common factor that connects all these persons, leading to this similar illness pattern? This common factor is not a mystery and is a well-known effect: the common mechanism is the loss of enzymes in the human body. It is estimated there are now more than 80,000 veterans continually surfacing with Gulf War Syndrome.
In this article, we first describe the problem at hand and the proposed energy field-based cures. The historical background of the alternative (physical) treatments is presented as well. Different energy fields and their possible effects on viruses are discussed next. The layout of a rational approach to building a multi-agent model of the body, the immune system, and infection concludes the article.
2. A Review of Alternative Cures

A quick, much abbreviated historical overview of the work of some notable pioneers is in order. Nikola Tesla, in the 1890s, noted curative effects on various conditions when using high-frequency electrical oscillation circuits. Georges Lakhovsky, during the early to middle part of the last century, produced various broad-band multiple wave oscillator circuits that, similarly to Tesla's circuits, produced broad-band (wide spectrum of frequencies) ultrasound in human tissue. Also in the first part of the last century, A. Abrams developed various electrical oscillation circuits that supplied electrodes connected to the human body with complex voltage oscillation patterns that produced broad-band ultrasound in human tissue. Despite claims of success of multiple wave oscillators in curing many microbial diseases and cancers, all alternative electro-medicine technologies were suppressed and outlawed by the FDA in the US. After the suppression of electro-medicine, one of the most notable persons in popularizing electro-medical devices was John Crane. He popularized the use of electrodes applying a voltage square wave to the human body. Crane's voltage square wave generator, when tuned to specific frequencies, was able to achieve many of the same curative results as the Rife frequency instrument.

2.1. Royal Rife Machines

During the 1920s and 30s, Royal Raymond Rife developed two new technologies: the Rife microscope and the Rife frequency instrument. He invented a new kind of optical microscope, which could be used to observe viruses in live cells and tissue culture. Rife was able to see viruses with visible light because he had found an optical assembly that overcame the diffraction phenomena which stop the best currently available optical microscopes from seeing anywhere near the virus level. Rife's second great accomplishment was to invent a variable-frequency flashing-light ultrasound source which could kill bacteria, protozoa, fungi, and viruses. While observing these various microbes with his microscope, Rife used his frequency instrument to produce specific frequencies of ultrasound which would destroy these micro-organisms. Rife found that every micro-organism he encountered had at least one frequency of ultrasound (i.e. mechanical shaking and resonance) that destroyed it very easily. By 1939 Rife had found the lethal ultrasound frequencies for the microbes associated with 52 major diseases (Morse, 1997; WHO, 2003). One of the main reasons each specific micro-organism type is apparently susceptible to destruction by a specific ultrasound frequency is the existence of periodically spaced, often closed on themselves, and elastically coupled protein clump structures within them. These structures play a vital part in the functioning and life cycle of the micro-organism, and if they are destroyed or significantly damaged, the micro-organism cannot survive or propagate itself. By the very nature of their construction, these periodic protein clump structures are very susceptible to destruction by specific ultrasound frequencies (i.e. mechanical shaking rates). These structures can support, and go into resonance with, specific
frequencies of mechanical vibration. Figure 1A shows the center of each protein molecule laid out in a linear fashion for ease of graphing some of the wave motions (resonant oscillation modes) that it can support (resonate with). Figures 1B, C and D illustrate some of the resonant oscillation modes. Figure 1B is the most stressful and potentially damaging oscillation mode, the reason being that all adjacent protein molecules are always moving in the opposite direction to each other, which puts maximum stress where these molecules are connected (bonded) together. These bonding regions are usually made up of mostly weak hydrogen bonds with occasional covalent chemical bonds. The bonding between clumps is weak, and if the oscillation amplitude builds high enough, the bonds will rip apart and the structure will be destroyed (Morse, 1997; WHO, 2003). There are various kinds of so-called Rife machines available. The varying success of these machines is based on Rife's first discovery that every micro-organism always had at least one frequency of mechanical vibration that destroyed it easily and quickly. The best type of "Rife machine" for anti-aging purposes is said to be a broad-band low-intensity ultrasound machine. This type of machine scans through a frequency range from zero to several tens of megahertz of mechanical vibration (ultrasound). With regular use, this type of machine goes after all microbes and almost all virus types to keep the viral and microbial load in the body at a minimum, so that a minimum cell death rate occurs and a minimum cell replacement rate is needed (Bureau International Programs, 2005; WHO, 2003).
Fig. 1. Resonant oscillation modes
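The coupled-chain picture behind Figure 1 can be illustrated with a generic physics toy model (not a protein-specific one): n identical clumps of mass m joined by springs of stiffness k, with fixed ends, have a discrete set of normal-mode frequencies, obtained below from the eigenvalues of the chain's tridiagonal stiffness matrix. The mass and stiffness numbers are placeholders chosen only so the output lands in the megahertz range discussed in the text.

import numpy as np

def chain_mode_frequencies(n, mass, stiffness):
    # Normal modes of n equal masses joined by identical springs, ends fixed:
    # solve (stiffness * K) v = mass * w^2 v for the tridiagonal pattern K.
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    omega = np.sqrt(np.linalg.eigvalsh(stiffness * K) / mass)   # rad/s
    return omega / (2.0 * np.pi)                                # Hz

# Placeholder numbers only (not measured protein values).
freqs_mhz = chain_mode_frequencies(n=10, mass=1e-21, stiffness=1e-6) / 1e6
print(np.round(freqs_mhz, 2))   # ten discrete mode frequencies, roughly 1.4 to 10 MHz here

The highest-frequency mode is the one in which adjacent masses move in opposite directions, which matches the "most stressful" mode of Figure 1B; the discreteness of the spectrum is what makes a frequency-scanning instrument conceptually plausible in this model.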
3. Effects of Micro Electric Current on HIV

Years later, in the fall of 1990, two medical researchers, Drs. W. Lyman and S. Kaali, working at the Albert Einstein College of Medicine in NYC, made an important discovery with potential for dealing with diseases including AIDS, cancer, Gulf War Syndrome, and so on. They found that they could inactivate the HIV virus by applying a low-voltage direct-current electrical potential, with an extremely small current flow, to AIDS-infected blood in a test tube. They discovered this in the lab by inserting two platinum electrodes into a glass tube filled with HIV-1 (type 1) infected blood. They applied a direct current to the electrodes and found that a current flow in the range of 50-100 microamperes (μA) produced the most effective results. Practically all of the HIV viral particles were adversely affected while normal blood cells remained unharmed. The viral particles were not directly destroyed by the electric current; rather, the outer protein coating of the virus was affected in such a
way as to prevent the virus from producing reverse transcriptase, a necessary enzyme needed by the virus to invade human cells. This is reminiscent of a well-proven cure for snakebite: application of an electric current that instantly neutralizes the venom's toxicity. There may also be several other diseases, as yet undiscovered or untested, whose viruses can be neutralized with this discovery, such as Epstein-Barr (chronic fatigue syndrome), hepatitis, lupus, cancer and many others (Wysock et al., 2001; Beck, 2001). This very simple blood-clearing treatment showed great promise as a method for immobilizing known strains of HIV still present in, and contaminating, some European and US blood bank reserve supplies. It was further suggested that infected human HIV carriers could be cured by removing their blood, treating it electrically, and returning it by methods similar to dialysis, or by surgically implanting electrode arrays with miniature batteries sewn inside blood vessels. Kaali then worked out a design for a small battery with two tiny electrodes that could be sewn directly into an artery in the arm or leg. By maintaining the current flow between the two electrodes within the 50-100 microampere range, the HIV particles would be gradually disabled within the bloodstream and the AIDS victim would gradually recover his health (Beck, 2001; Hamadani, 2005). Kaali outlined two methods for treating an AIDS patient with this new therapy: one involved removing a small amount of blood, electrifying it, and then returning it to the patient's body, in a process similar to dialysis; the other involved sewing a miniature electrifying power supply along with two tiny electrodes directly into the lumen of an artery (Hamadani, 2005).

3.1. Beck's Protocol

Dr Robert (Bob) Beck paid attention to the above-mentioned discovery at the Albert Einstein College of Medicine. Beck looked up the patent and decided to try to duplicate the therapy, but he wanted to do it non-invasively, that is, by applying the electric current from outside the body (Broder et al., 1990; Hamadani, 2005). Where W. Lyman and S. Kaali had used DC current to deactivate the AIDS virus, Beck found that he could get the same results using a 3.92 Hz square wave. He applied the electrodes to the skin directly over those arteries that were close enough to the skin surface; a sketch of the waveform follows Figure 2.
Fig. 2. Blood electrifying
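As a purely signal-level illustration (not a device design), the bi-phasic square wave described below in Section 3.1 can be synthesized as follows; the 3.92 Hz figure comes from the text, while the amplitude and sample rate are arbitrary placeholders.

import numpy as np

def biphasic_square_wave(freq_hz, duration_s, fs=1000.0, amplitude=1.0):
    # +A for the first half of each cycle, -A for the second half, so the
    # current direction reverses every half cycle (zero DC over whole cycles).
    t = np.arange(0.0, duration_s, 1.0 / fs)
    phase = (t * freq_hz) % 1.0          # position within the cycle, 0..1
    return t, np.where(phase < 0.5, amplitude, -amplitude)

t, v = biphasic_square_wave(freq_hz=3.92, duration_s=10.0)
print(f"mean over {t[-1]:.1f} s: {v.mean():+.3f}")  # near zero; exactly zero over whole cycles

The zero-mean property is the point of the bi-phasic shape: unlike a DC signal, it avoids net electrolysis at the electrodes, which is the design motivation described in the next paragraph.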
The 50-100 microampere current could be created within the artery by electromagnetic induction, allowing the entire therapy to be applied externally, without the need for implanting electrodes into the arteries. The device he put together to accomplish this is today called a blood electrifier. Beck designed a circuit that varied the voltage with an alternating current (AC) at a very low frequency, and thereby avoided the electrolysis problem. The waveform that Beck chose is not the typical sine wave seen in AC household outlets, but rather a bi-phasic square wave, meaning that the waveform voltage has a positive half and a negative half, allowing the current to reverse direction each half cycle (Broder et al., 1990; Hamadani, 2005). Bob Beck's protocol suggests apparatus and simple techniques which, it is claimed, have the potential to safely eliminate pathogens, bacteria, viruses, parasites, fungi and germs which devastate health and destroy the immune system. The protocol includes four synergistic and essential elements:
1. Building or acquiring a functioning battery-powered blood electro-purifier that attaches externally to the radial and ulnar artery pulse points on one wrist. Suggested use is for a minimum of four to twelve weeks, with daily electrification of two hours.
2. A very simple and inexpensive instrument for making a quantity of ionic silver colloids for pennies, to help the immune system.
3. A high-intensity magnetic pulser which destroys any residual germinating or incubating pathogens in the lymph and other organs and tissue, consequently preventing self re-infection.
4. An ozone generator, easily made with tropical fish store components, to charge drinking water with O₃. Ozone comfortably detoxifies, by oxidation, wastes which the body must eliminate to regain health.

3.2. Magnetic Fields

There are many pulsed magnetic field type devices, but we are only interested in those that produce very intense ringing magnetic fields: machines that have a wire coil of 8 inches internal diameter or larger, that produce a transient magnetic field of several tens of thousands of gauss, and whose magnetic field polarity changes many tens of thousands of times per second each time power is pulsed into the coil; in other words, an intense pulsed ringing magnetic field (Tomas et al., 2000).
This electric field can also interact with bacteria surfaces and denature delicate protein structures on them or reorganize their structure so
that these vital surface protein structures are non-operational and the bacteria cannot function normally and in some cases probably starve to death. Another discovery claimed for intense pulsed ringing magnetic fields is their ability to make certain types of cells convert into embryonic-looking and -acting cells. For example, it has been reported that fibroblast cells and certain epithelial precursor cell types could, with exposure to ringing magnetic fields of various field strengths and pulsing rates, be made to convert into embryonic-looking cells. Furthermore, in field trials on horses and humans, the effects of traumatic physical injuries where scar tissue had formed or was forming were apparently undone. Empirically, it looks as though scar tissue, which is formed and maintained mainly by fibroblast cells, was having the surface layer of fibroblast cells on the scar surface converted to embryonic-like cells that then in turn converted into the adjacent normal cell type the scar tissue is butted up against (Broder et al., 1990; Tomas et al., 2000). The other method for releasing telomerase is an electromagnetic one, namely exposing the body to specific frequencies of microwaves in the multi-gigahertz frequency range at low power levels for a brief time (a minute or less). Experiments indicating this method of producing telomerase were observed in a set of experiments designed to regenerate animal tissue, carried out in 1977.

3.3. The Magnetic Pulser as a Diagnostic Tool

The intense ringing magnetic field that the coil produces can induce voltages across, and currents through, electrically conductive media or material. The scar tissue resulting from traumatic physical injuries has relatively large concentrations of an electrically conductive protein filament material called collagen. These collagen filaments form an overlapping, intertwined mesh holding the scar tissue together. When this collagen-rich scar tissue is exposed to the pulsed ringing magnetic field of the coil, electric currents are induced in and throughout the scar tissue. Nerve sensor fibers in this region sense this induced current flow, and the person experiences a sharp stabbing sensation at the damaged site each time the coil rings. By slowly moving the coil over the entire body surface, most tissue-damaged regions or areas can be easily located and then appropriately treated. However, not all damaged sites can always be located this way, due to poor innervation in the damaged area or nerve damage associated with an injury. A good example of this is knee cartilage damage, where the patient often does not feel much from the coil but still gets very good treatment results (Tomas et al., 2000; Halford, 2006; Guo, 2005). Many phenomena occur when animal tissue is exposed to rapidly changing magnetic fields. Which phenomenon is most observed depends on the strength, rate of change, and duration of change of the magnetic field. For example, if a magnetic field changes by several thousand gauss in a microsecond, broad-band ultrasound and charge density waves can be expected to be generated in the tissue. Also, electrical eddy currents will be produced in the interstitial body fluids. As discussed below, these charge density waves and broad-band ultrasound can be expected to disrupt and destroy microbe functions.
The electrical eddy currents, when they enter the range of 100 microamperes per square centimeter to 200,000 microamperes per square centimeter, are said to begin to biologically deactivate all manner of viruses and microbes (Tomas et al., 2000; Halford, 2006; Guo, 2005).
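The quoted current-density range can at least be sanity-checked with elementary electromagnetics: for a spatially uniform field changing at rate dB/dt, Faraday's law gives an induced electric field E = (r/2) dB/dt at radius r, and J = σE in a conductor. The sketch below uses the "several thousand gauss in a microsecond" figure from the text together with assumed round numbers for tissue conductivity and geometry; neither value is taken from the cited sources.

# Order-of-magnitude check of the induced eddy-current density (assumed numbers).
dB_dt = 0.3 / 1e-6          # 3000 gauss = 0.3 T, changing over one microsecond -> T/s
radius = 0.02               # 2 cm from the symmetry axis (m), assumed geometry
sigma = 0.5                 # assumed bulk tissue conductivity (S/m)

E = 0.5 * radius * dB_dt    # induced electric field from Faraday's law (V/m)
J = sigma * E               # current density (A/m^2)

# 1 A/m^2 equals 100 uA/cm^2, so report in the units used in the text.
print(f"E = {E:.0f} V/m, J = {J * 100:.0f} uA/cm^2")

With these assumptions the estimate comes out at 150,000 µA/cm², inside the range quoted above; different radii or conductivities shift it proportionally.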
3.4. Ionic Silver/Colloidal Silver

Ionic silver is recognized as an accepted treatment for viral conditions; a new form of ionic silver may already be providing a remarkably effective treatment, not only for a bird flu pandemic that may or may not occur, but also for an enormous range of infectious diseases that are a very real part of our world today. There is a rapidly growing phenomenon in which ionic silver is emerging as the new antimicrobial wonder for dealing with viral as well as bacterial conditions (Wade, 1994; Rife-Tech, 2006). Ionic silver was actually a commonly used antimicrobial 100 years ago, before the advent of modern antibiotics, which only address bacteria and are becoming largely obsolete while posing risks related to resistant super-germs. It is increasingly being recognized for its broad-spectrum antimicrobial qualities and the fact that it presents virtually none of the side effects related to antibiotics. Ionic silver is also entirely non-toxic to the body. Research has shown that some resistant strains of disease cannot develop with ionic silver the way that they will with antibiotics. Some reports indicate that it even kills drug-resistant strains of germs. A hundred years ago, major pharmaceutical firms made ionic silver products for systemic human use in the form of what is loosely referred to as "colloidal" silver, a very crude and archaic substance that did the job of delivering silver ions decently for its time. In recent decades, colloidal silver has seen a resurgence in popularity, but primarily in the alternative medicine field. In this regard, there are two pivotal questions to be considered: (1) whether silver ions kill viral pathogens; and (2) the method of delivery for systemic human use. Even if silver ions are effective against viral pathogens, the delivery mechanism for use in the human body becomes the key issue. This need for a delivery mechanism that maximizes availability is all the more demanding when attempting delivery of ionic silver in the human body, due to the aggressive and fluctuating electrochemical environment the human organism presents (Rife-Tech, 2006; Houston Post, 1991; Wade, 2005).

3.5. Ozonized Water

Ozonized water is made from oxygen in ambient air. O₃, unlike other forms of oxygen, carries negative electrical charges that specifically counteract free-radical damage and recharge depleted cells. Ionic silver colloids also greatly assist this rejuvenation process by restoring free electrons. By drinking ozone-charged water, one can gain some of the benefits of ozone use, such as rapid, safe, totally natural cell oxidation free of radical damage. O₃ rapidly converts (oxidizes) toxins and wastes long present in the body cells to H₂O and CO₂, which flush out easily and rapidly without recourse to colonic, lymph, spleen, liver or kidney detoxing or any other treatment (Broder et al., 1990; Sato, 1989). Another potential benefit of ozone's reaction with water is the destruction of organic toxins. Ozone's reaction with organic molecules involves fairly specific types of reactions, and it does not remove all organic materials from the water passing through the contact chamber. However, many toxins have very specific structures, being toxic specifically because they fit exactly into or onto some important bio-molecule in a living organism, thereby interfering with its normal activity. Even a small chemical change will likely reduce the toxicity of even a very potent natural toxin (Broder et al., 1990; Sato, 1989).
4. New Evidences and Applied Methods

Sadighi-Bonabi and Ghadiri have studied the effects of an excimer UV laser on the hatching characteristics of artemia cysts. They observed that while the laser irradiation decreases the hatching rate of low-quality decapsulated cysts, it clearly increases the hatching rate and efficiency of capsulated cysts by a few percent. They have also reported the same effects on the hatching of chicken and quail eggs. It is suggested that this could be due to disinfective side effects of the laser radiation on the cyst surface. The laser used was an ArF ultraviolet excimer with a 193 nm wavelength (Sadighi-Bonabi & Ghadiri, 2008). Rajaei et al. have reported that electromagnetic field (EMF) exposure has some detrimental effects on the male reproductive system, decreasing the diameter of the reproductive ducts, the length of epithelial cells, and the weight of the testes. Their experiments included a few tens of BALB/c mice, which were exposed to a 50 Hz, 0.5 mT EMF for two months. The same effects can be expected in other mammals (Rajaei et al., 2008). Similarly, Sadeghzadeh et al. studied the effects of an electromagnetic field (EMF) on mouse heart tissue. After exposing mice to a 3 mT EMF for two months, four hours per day, microscopic observations showed inflammation and leucocyte infiltration in the cardiac tissue, which along with some other side effects would result in a decrease of cardiac function (Sadeghzadeh et al., 2008). Mousavi et al. investigated the effects of mobile-phone electromagnetic fields on the oxygen affinity and tertiary structure of human hemoglobin. Their results indicated that oxygen affinity decreases in the irradiated samples, proportionally with the EMF intensity, the buffer concentration, and the exposure period (Mousavi et al., 2008). Recently, several reports have been made on alternative electrical and electromagnetic-based care for diverse types of cancer, namely the bio-electric and bio-electromagnetic balance of cells (Alishahi, 2008) and bio-resonance therapy (Hijazi, 2009). Alishahi introduces a new theory of cell electric and electromagnetic balance (Alishahi, 2008) and claims that affected, under-threat cells first suffer from a drastic unbalancing of those two factors, which may result in an abnormal status (i.e. a cancerous cell). He further claims that, to diagnose and kill those cancerous cells, measuring and using the BEB and BEMB phenomena would be useful. Some practical treatments of cancers have been carried out in their research centre in Austria. Meanwhile, more studies are reported on bio-resonance therapy of cancer, as well as bio-resonance detoxification against infections (Hijazi, 2009). Biological warfare is the third side of the triangle of weapons of mass destruction (WMDs), where the first two sides are chemical and nuclear warfare. In contrast to the other two, biological warfare has thankfully never been practically employed on the battlefront or against civilians; however, it could sadly be a matter of when, rather than if, a wicked leader or a terrorist group uses it for real, with devastating, long-lasting effects. A biological weapon is any organic material used to sicken or kill living beings, in particular humans. Biological agents are typically live organisms and can be various viruses, fungi, rickettsiae, and so on. It is claimed that scientists in the former USSR had developed a rich bank of biological weapons, including the Ebola, Marburg, and Lassa fever viruses. It was also claimed that an accident in one of the biological warfare complexes in Novosibirsk in Siberia killed thousands of people in the mid-70s; those claims were strongly rejected by Soviet authorities. However, some researchers believe that the deadly Spanish flu virus was produced, or at least modified and deliberately mutated, in a hostile country's laboratory during the First World War; that virus is, in a sense, a cousin of today's bird flu and swine flu viruses (Aslaheh, 1981).
5. Computer Simulation: a Necessary Step

Today, researchers of HIV-1 are still unable to determine the exact biological mechanisms that cause AIDS. Various mechanisms have been hypothesized and their existence has been experimentally verified, but whether they are sufficient to account for the observed disease progression is still in question. To understand the phenomena better, HIV-1 researchers turn to the construction of scientific models to verify these hypotheses. One of the earlier approaches to HIV-1 modeling uses ordinary differential equation (ODE) models. For low levels of granularity, they can be inexpensive to construct and allow the prediction of macroscopic dynamics in the time dimension. However, to increase model granularity to cover the spatial and topological dimensions, which may contain crucial information about realistic disease progression, partial differential equations (PDEs) are usually required (Guo, 2005; Perrin, 2006).

5.1. The Multi-Agent Approach

Multi-agent simulation models (or simply MA models), as a newer approach, conveniently enable the modeling of different entity types through the specification of interaction rules between agents and their environment. With an explicitly constructed computational model, we can further quantitatively study many types of entities and interactions simultaneously, which would be too complex for simple rationalization. Modeling therefore has great value in assisting the verification of infection hypotheses. MA models treat cells and molecules as 'agents' and allow autonomous interactions in a virtual environment. Such a model explores the level of cell-to-cell and cell-to-molecule interactions, from which the macroscopic behaviors emerge. By doing so, we avoid directly making intuition-driven assumptions about macroscopic properties. Implementation of MA models can be based on CAFISS. In CAFISS, agents are implemented in a multi-threaded fashion; hence the sequence of interaction events is unpredictable. Such a design is intended to eliminate possible artifacts resulting from the implementation itself. We begin by specifying the agent interaction rules, some of which are specific to the virus progression hypotheses, while others are common knowledge specific to the immune system (Seidel & Winter, 1944; Bird, 1976). We first specify a null model as a common basis for modeling an adaptive immune response for all four HIV hypotheses.

5.2. A Null Infection Model

The MA model design methodology and the preliminary simulation results are based on a "null model plus hypothesis" framework of sufficiency verification. Figure 3 is a simplified schematic representation of the null model, which contains only the key elements of the agent interaction network. It can be seen that the null model simply captures common knowledge about the adaptive immune system: for example, TH cells sending activation signals, B cells producing antibodies, humoral elimination of HIV virions, and so on (Guo, 2005; Wade, 2005; Perrin, 2006).
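For contrast with the agent-based approach, the ODE models mentioned at the opening of this section can be made concrete. The sketch below integrates a standard three-compartment viral-dynamics system (target cells T, infected cells I, free virions V); it is a generic textbook model with placeholder parameters, not the model of the cited studies.

def viral_dynamics(days=200, dt=0.01):
    # Forward-Euler integration of a basic viral-dynamics ODE model:
    #   dT/dt = lam - d*T - beta*T*V   (target-cell supply, death, infection)
    #   dI/dt = beta*T*V - delta*I     (infected-cell gain and loss)
    #   dV/dt = p*I - c*V              (virion production and clearance)
    lam, d, beta, delta, p, c = 10.0, 0.01, 2e-5, 0.5, 100.0, 3.0  # placeholders
    T, I, V = 1000.0, 0.0, 1e-3   # uninfected state plus a tiny inoculum
    for _ in range(int(days / dt)):
        dT = lam - d * T - beta * T * V
        dI = beta * T * V - delta * I
        dV = p * I - c * V
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
    return T, I, V

print("state after 200 days (T, I, V):",
      tuple(round(x, 1) for x in viral_dynamics()))

Such a model captures only population averages over time; it has no notion of space, topology, or individual cell state, which is exactly the granularity gap the MA approach is meant to fill.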
Fig. 3. Schematic diagram of the null model

5.3. Rapid Viral Mutation

The immune cells are able to attack the virus only upon recognition. As HIV replication is error-prone during reverse transcription, which results in mutant strains, the immune system is put at a disadvantage, since it needs to detect each mutant strain before it is able to activate the specific antibodies. It is postulated that mutation reduces the chance of virus detection and hence allows HIV to persist (Seidel & Winter, 1944; Guo, 2005).
Fig. 4. Rapid viral mutation (Guo, 2005)

The mutation mechanism is added to the null model by altering the strain's shape from time to time, computationally implemented by toggling a series of binary bits (a sketch of this mechanism follows Table 1). As such, multiple strains of HIV can coexist in the environment.

5.4. Agents and Their Transactions

As a biological system, the human body is equipped with a very capable and robust guard: the immune system. For a given scenario of contending with a virus, Table 1 below shows a flow diagram of the possible transactions amongst the agents in a simulated infected body for one particular procedure. That procedure attempts to model the diminishing of B and T-helper cells and the concurrent increase of the viral load within an infected body; a sketch implementing these rules is given below. Many other events during the illness period can be modeled and analyzed too.
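A minimal sketch of the Table 1 transactions follows: virions send infection signals unconditionally, infected TH and B cells emit success signals and are then killed by the CTL, and a satisfied virion reproduces. It is single-threaded (unlike CAFISS), and the population sizes, probabilities, and random seed are placeholders.

import random

class Cell:
    # A TH or B cell: healthy -> infected on an infection signal,
    # dead on a CTL death signal (the rules of Table 1).
    def __init__(self, kind):
        self.kind, self.state = kind, 'healthy'
    def receive_infection(self):
        if self.state == 'healthy':
            self.state = 'infected'
            return True          # emits the "successfully infected" signal
        return False
    def receive_death(self):
        self.state = 'dead'      # the cell commits suicide

def step(cells, viral_load, rng):
    new_virions = 0
    for _ in range(viral_load):              # every virion signals, unconditionally
        target = rng.choice(cells)
        if target.receive_infection():
            new_virions += 1                  # a satisfied virion starts reproduction
    for cell in cells:                        # the CTL answers infected signals
        if cell.state == 'infected':
            cell.receive_death()
    return viral_load + new_virions

rng = random.Random(0)
cells = [Cell('TH') for _ in range(50)] + [Cell('B') for _ in range(50)]
viral_load = 5
for _ in range(10):                           # ten interaction rounds
    viral_load = step(cells, viral_load, rng)
healthy = sum(c.state == 'healthy' for c in cells)
print(f"healthy cells left: {healthy}/100, viral load: {viral_load}")

Even this crude loop reproduces the qualitative pattern the procedure targets: the healthy TH and B populations shrink while the viral load climbs.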
6. The Complex System Approach to the Problem

A system, generally, is an entity which performs some operation on its inputs and provides outputs; a system might have an internal state too. Although it started in electrical and control engineering, systems theory has spread widely into other studies and applications, from the human sciences to economics and biotechnology. A complex system is a system composed of interconnected parts that as a whole exhibit one or more properties not obvious from the properties of the individual parts. This definition matches the biological immune system well, which many believe is the most complicated part of our body after the nervous system. Although one can argue that humans have been studying complex systems for thousands of years, the modern scientific study of complex systems is relatively young when compared to areas of science such as physics and chemistry. The history of the scientific study of these systems follows several different strands. In mathematics, arguably the largest contribution to the study of complex systems was the discovery of chaos in deterministic systems, a feature of certain dynamical systems that is strongly related to nonlinearity (see the sketch below). The study of neural networks also appears to have strong links with the study of complex systems, as both refer to similar connectionist mathematical models (Flake, 1998; Deisboeck & Kresh, 2006). During an infection, the behavior of the immune system, its subsystems or agents, and the virus itself can be considered a good and clear example of a complex system. Therefore, the study of complex systems, and the modeling of the biological phenomenon as a complex system, could shed a great deal of light on our problem.
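To ground the reference to chaos in deterministic systems, the classic logistic map suffices: two trajectories starting a billionth apart diverge to completely different values within a few dozen iterations, even though the update rule is fully deterministic. This is a standard textbook example, not specific to the immune system.

def logistic_trajectory(x0, r=4.0, steps=30):
    # Iterate the deterministic logistic map x -> r*x*(1-x).
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)   # perturb the start by one part in a billion
for step in (0, 10, 20, 30):
    print(f"step {step:2d}: |difference| = {abs(a[step] - b[step]):.6f}")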
7. Conclusion

It should be clear that humankind is under serious threat from viruses, while, apart from vaccination, mainstream medical science has not so far been successful in tackling viral diseases. So why should we not think about some alternative therapies? Some older and newer research suggests that physical therapy and energy fields can be potent candidates. Since computer-based models that yield valid and reliable simulations cannot be ignored, the multi-agent model design methodology appears to be the most reasonable approach, and the preliminary simulation results based on a "null model plus hypothesis" framework can be justified. Such a methodology is shown to be based on the logic of proof by contradiction, directing us towards an accurate model of the real biological system. Complex system theory can also be advised as the best fitting framework to fully uncover the transactions and behaviors of the immune system and viruses.
• The HIV virus continuously sends infection signals to the other involved agents, without any condition.
• If a T-helper cell receives an infection signal, it will change its state from healthy to infected and send an output "successfully infected" signal.
• An HIV virus/agent is satisfied if it gets a successful infection signal, and starts reproduction.
• If the CTL receives any infected signal from TH or B cells, it will send a death signal to them.
• T-helper cells will commit suicide whenever they receive an input death signal.
• If a B cell receives an infection signal, it will change its state from healthy to infected and send an output "successfully infected" signal.
• B cells will commit suicide whenever they receive an input death signal.
Table 1. Procedure of the diminishing of B and TH cells and the increase of HIV virions in a multi-agent simulated system (Guo, 2005).
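A minimal event-driven rendering of the transactions in Table 1 could look as follows (Python; the class, signal names and population sizes are illustrative assumptions, not taken from CAFISS). Each rule of the table becomes one branch of the agents' signal handling, and each step of the loop delivers the queued signals, so the printout reproduces the qualitative trend the table describes: B and TH counts fall while the viral load rises.

```python
import random

class Cell:
    def __init__(self, kind):
        self.kind = kind          # "TH" or "B"
        self.state = "healthy"
        self.alive = True

    def receive(self, signal):
        if not self.alive:
            return None
        if signal == "infection" and self.state == "healthy":
            self.state = "infected"          # TH/B infection rules
            return "successfully_infected"
        if signal == "death":
            self.alive = False               # suicide on a death signal
        return None

th_cells = [Cell("TH") for _ in range(50)]
b_cells = [Cell("B") for _ in range(50)]
virions = 5

for step in range(30):
    new_virions = 0
    # HIV sends infection signals unconditionally; a successful infection
    # satisfies the virion, which starts reproduction.
    for cell in random.sample(th_cells + b_cells, min(virions, 100)):
        if cell.receive("infection") == "successfully_infected":
            new_virions += 2
    # The CTL sends a death signal to any cell reporting infection.
    for cell in th_cells + b_cells:
        if cell.alive and cell.state == "infected":
            cell.receive("death")
    virions += new_virions
    th_alive = sum(c.alive for c in th_cells)
    b_alive = sum(c.alive for c in b_cells)
    print(f"step {step}: TH={th_alive} B={b_alive} virions={virions}")
```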
8. References

Broder, S.; Mitsuya, H.; Yarchoan, R. & Pavlakis (1990). Antiretroviral therapy in AIDS, Annals of Internal Medicine, Vol. 113, No. 8, October 1990, 604-618.
Morse, S. (1997). The Public Health Threat of Emerging Viral Disease, Journal of Nutrition, Vol. 127, No. 5, 951-957.
Bureau of International Programs, USDS. (2005). Meeting the Challenge of Bird Flu, U.S. Department of State, Retrieved May 2009, URL: http://hdl.handle.net/1805/1125.
WHO. (2003). Severe acute respiratory syndrome: SARS, WHO, Retrieved May 2009, URL: /en.wikipedia.org/wiki/Severe_acute_respiratory_syndrome.
Tomas, R.; Vigerstad, T.; Meagher, J. & McMullin, C. (2000). Particulate Exposure During the Persian Gulf War, Office of the Special Assistant for Gulf War Illness, Falls Church, VA, URL: http://www.stormingmedia.us/34/3462/A346283.html.
Wysock, W.C.; Corum, J.F.; Hardesty, J.M. & Corum, K.L. (2001). Who Was the Real Dr. Nikola Tesla?, Proceedings of the Antenna Measurement Techniques Association Conference, October 2001, URL: http://www.scribd.com/doc/4601812.
Beck, R. (2001). A First Aid Kit of the Future: The Beck Protocol, SOTA Publishing, 2001, URL: www.realityzone.com/beckbook.html.
Seidel, R.E. & Winter, M.E. (1944). The New Microscopes, Journal of the Franklin Institute, Vol. 237, 1944.
Wade, G. (1994). Rife Research and Energy Medicine: A Physicist's View of Dr. Rife's Non-Drug and Non-Surgical Treatment and Cure of Medically Associated Disease, Health Freedom News, URL: http://www.rifeenergymedicine.com/physicistb.html.
Rife-Tech Company. (2006). The Magnetic Pulser: Semi-Technical Explanation of How It Works and What It Can Be Applied to for Experimental Purposes, Rife-Tech Company Website, URL: http://www.rifeenergymedicine.com/, Last retrieved July 2007.
Houston Post. (1991). Electric Current May Help to Fight AIDS, Houston Post, 20 March 1991, URL: http://www.educate-yourself.org/be/behoustonpost.shtml.
Bird, C. (1976). What Has Become of the Rife Microscope?, New Age Journal, Boston, March 1976, pp. 41-47.
Hamadani, S. (2005). Study Shows Silver Nano-particles Attach to HIV-1 Virus, PhysOrg Online Journal, October 2005, URL: www.physorg.com/news7264.html.
Sato, P.A.; Chin, J. & Mann, J.M. (1989). Review of AIDS and HIV Infection: Global Epidemiology and Statistics, AIDS Journal, Vol. 3, No. 1, 1989.
Halford, B. (2006). A Silver Bullet for Infections?, Chemical & Engineering News, Vol. 84, No. 16, pp. 35-36.
Guo, Z. (2005). Sufficiency Verification of HIV-1 Pathogenesis Based on Multi-Agent Simulation, Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, USA, ISBN: 1-59593-010-8.
Wade, G. (2005). Exciting Possibilities in Pulsed, Intense Magnetic Field Therapy, Health Freedom News, August/September 1998, URL: http://educate-yourself.org/gw/gwpulsedmagtherapyaug8.shtml.
Perrin, D. (2006). HIV Modeling: Parallel Implementation Strategies, Enformatika Transactions on Engineering and Technology, Vol. 16, Nov 2006, ISSN 1305-5313.
Sadighi-Bonabi, R. & Ghadiri, A. (2008). Effects of Low-Intensity He-Ne Laser Irradiation on Hatching Characteristics of Artemia urmiana Cysts, Proceedings of the EMFE'08 Conference, University of Tehran, Tehran, Iran, 2008.
Rajaei, F.; Farrokhi, M.; Ghasemi, N. & Jahani Hashemi, H. (2008). Effects of Extremely Low-Frequency Magnetic Field on Mouse Epididymis and Deferens Duct, Proceedings of the EMFE'08 Conference, University of Tehran, Tehran, Iran, 2008.
Sadeghzadeh, B.; Mohammadi, A.; Soleimani, J.; Habibi, M.; Hemmati, M. & Halabian, R. (2008). Investigation of Microscopic Changes of Heart Tissue in EMF-Exposed Mice, Proceedings of the EMFE'08 Conference, University of Tehran, Tehran, Iran, 2008.
Mousavy, J.; Kamarei, M.; Aliakbarian, H.; Riazi, G.; Sattarahmady, N.; Sharifizadeh, A. & Moosavi-Movahedi, A. (2008). The Effect of Electromagnetic Fields at Mobile Phone Frequencies on the Oxygen Affinity and Tertiary Structure of Human Hemoglobin, Proceedings of the EMFE'08 Conference, University of Tehran, Tehran, Iran, 2008.
Alishahi, A. (2008). Cells' Electro and Electromagnetic Balance to Tackle Cancer, Vadelayman Cancer Cure Centre, URL: http://www.vadelayman.com/, 2008.
Hijazi, M. (2009). Bioresonance Therapy, Retrieved from the Web 13-01-2009, URL: http://en.wikipedia.org/wiki/Bioresonance_therapy.
Aslaheh (1981). Biological Warfare, Aslaheh Military Encyclopedia, Vol. 1, No. 6, 1981, p. 287.
Flake, G.W. (1998). The Computational Beauty of Nature, MIT Press, Cambridge, Massachusetts, 1998.
Deisboeck, T.S. & Kresh, J.Y. (Eds.) (2006). Complex Systems Science in Biomedicine, Springer, 2006.
15
A Manipulator Control in an Environment with Unknown Static Obstacles

Pavel Lopatin and Artyom Yegorov
Siberian State Aerospace University named after Academician M.F. Reshetnev, Russia

1. Introduction

In contemporary society, robots and manipulators are used in many spheres of life. A robot should be as autonomous as possible and should operate effectively in a natural environment. At the beginning of the robotic era, robots operated in workspaces free of obstacles. Later, works dedicated to algorithms for controlling robots in the presence of obstacles began to appear. There are algorithms which guarantee finding a trajectory in the presence of known obstacles, if such a trajectory exists (Canny, 1988; Collins, 1975; Donald, 1985). Some authors use artificial potential methods (see, for example, (Choset et al., 2005)). In this method a robot is represented by a point, regions in the workspace that are to be avoided are modeled by repulsive potentials, and the region to which the robot is to move is modeled by an attractive potential. In a general situation there is no guarantee that a collision-free path will always be found, if one exists (Ahrikhencheikh & Sereig, 1994; Choset et al., 2005). There are various graph-searching methods (Choset et al., 2005; La Valle, 1999-2004) which find a trajectory avoiding obstacles (even an optimal one), if one exists. Such methods are easier to use when we have full information about free and forbidden points before the movement begins: a computer can then calculate a preliminary trajectory and the manipulator can execute it. But in the case of unknown obstacles, the manipulator has to investigate its environment and plan its trajectory alternately. The difficulty then arises that graph-searching algorithms require a breadth-first search to be carried out in a certain volume, otherwise reaching the target point qT is not guaranteed (Ilyin, 1995). During the breadth-first search the following situation often arises: suppose we have just finished considering the vertices adjacent to a vertex q and have to consider the vertices adjacent to a vertex q', where q and q' are not adjacent. To consider the vertices adjacent to q', the manipulator first has to move to q'. So we get the problem of moving the manipulator from q to q'. The necessity of searching for and executing paths between many different q and q' makes the total sum of the manipulator's movements very large (Ilyin, 1995). When we plan a trajectory in a known environment, the computer simply switches its "attention" from q to q', which are stored in the computer's memory.
The following representatives of the breadth-first approach can be outlined: the breadth-first search algorithm, the A* algorithm, best-first heuristic search, lazy PRM, and dynamic programming (La Valle, 1999-2004). The methods based on randomized potential fields, the Ariadne's Clew algorithm, and rapidly-exploring random trees (La Valle, 1999-2004) share the feature that new vertices are generated randomly, and therefore using these methods in an unknown environment leads to the same difficulties. The approaches based on cell decomposition, visibility (bitangent) graphs, and Voronoi diagrams (Choset et al., 2005; La Valle, 1999-2004) reduce to alternating graph building and path searching on the graph, and have the above-mentioned disadvantage connected with multiple mechanical movements. In the algorithm presented in this article, the vertices q and q' are always neighboring vertices, which reduces the number of movements. For a solution of our problem it would be possible to use the approach based on automatic theorem proving (Timofeev, 1978), but this approach requires considering a large number of variants and search directions, and therefore its application is not effective (Yefimov, 1988). It is also known that "depth-first" algorithms do not guarantee reaching the goal (Ilyin, 1995). There is a common difficulty for the methods of trajectory planning in the presence of known obstacles: it is very difficult to obtain full information about the manipulator's workspace in advance and to represent this information in a form suitable for trajectory planning. In our algorithm there is no need for the control system to have full information about the workspace in advance; the manipulator acquires the necessary information by itself, in limited quantities, and in terms of generalized coordinates, a form suitable for trajectory planning. Attempts to create algorithms for robot control in the presence of unknown obstacles have been made; most of them cover various two-dimensional cases (Lumelsky, 2006). In (Chen et al., 2008; Ghosh et al., 2008; Masehian & Amin-Nasari, 2008; Rawlinson & Jarvis, 2008), different approaches for robot control in a two-dimensional unknown environment are considered. In (Chen et al., 2008; Rawlinson & Jarvis, 2008) the approaches are based on Voronoi diagrams; in (Masehian & Amin-Nasari, 2008) a tabu search approach is presented. These approaches demand multiple robot movements, and in (Masehian & Amin-Nasari, 2008) obstacles must have polygonal form. An application of the methods proposed in (Chen et al., 2008; Ghosh et al., 2008; Masehian & Amin-Nasari, 2008; Rawlinson & Jarvis, 2008) to the control of an n-link manipulator in an unknown environment is not presented. In (Lumelsky, 2006) an algorithm for the control of manipulators in the presence of unknown obstacles in three-dimensional space is given. Though this algorithm guarantees reaching the target position, it has the limitation that the manipulator may not have more than three degrees of freedom. In (Amosov et al., 1975; Kasatkin, 1979; Kussul & Fomenko, 1975) an application of semantic nets to the problem of robot control in an unknown environment is described. The disadvantage of this approach is the preliminary teaching of the net that simulates a planning system; the absence of formal teaching algorithms makes it impossible to teach a complex net that should plan robot actions in an environment close to a natural one (Ilyin, 1995).
In (Yegenoglu et al., 1988) the n-dimensional case is considered. The algorithm is based on the solution of a system of nonlinear equations by Newton's method, and therefore it cannot guarantee reaching the target position. In (La Valle, 1999-2004), algorithms for moving a robot in the presence of uncertainty (including cases of unknown environment) are considered. These algorithms are based on sequential decision theory and in the general case do not guarantee reaching the goal; when they use graph search, the above-mentioned difficulty connected with multiple mechanical movements arises. In (Lopatin, 2001; Lopatin & Yegorov, 2007), algorithms for moving an n-link manipulator amidst arbitrary static unknown obstacles were presented. The algorithms guarantee reaching qT in a finite number of steps under the condition that qT is reachable. It was supposed that the manipulator's sensor system may supply information about free and forbidden points either from an r-neighborhood of the point of the manipulator's configuration space where the manipulator is currently situated (Lopatin, 2001), or from r-neighborhoods of a finite number of points of the manipulator's configuration space (Lopatin, 2006; Lopatin & Yegorov, 2007). In both cases it was supposed that the r-neighborhood has the form of a hyperball with a radius r > 0. In this work we consider a more general form of the r-neighborhood.
2. Task formulation and algorithm

2.1. Preliminary Information

We will consider manipulators which consist of n rigid bodies (called links) connected in series by either revolute or sliding joints (Shahinpoor, 1987). We must take into account that, because of the manipulator's constructive limitations, the resulting trajectory q(t) must satisfy the set of inequalities

a1 ≤ q(t) ≤ a2     (1)

for every time moment, where a1 is the vector of lower limits on the values of the generalized coordinates comprising q(t), and a2 is the vector of upper limits. The points satisfying the inequalities (1) comprise a hyperparallelepiped X in the generalized coordinate space. We will consider all points in the generalized coordinate space which do not satisfy the inequalities (1) as forbidden. We will have to move the manipulator from a start configuration q0 = (q10, q20, …, qn0) to a target configuration qT = (q1T, q2T, …, qnT). In our case the manipulator will have to move amidst unknown obstacles. If in a configuration q the manipulator has at least one common point with any obstacle, then the point q in the configuration space will be considered forbidden. If the manipulator in q has no common points with any obstacle, then q will be considered allowed. So, in our problem the manipulator is represented as a point which has to move within the hyperparallelepiped (1) from q0 to qT, and the trajectory of this point should not intersect any forbidden point.
2.2. Preliminary Considerations

Let us make the following assumptions:
1) The disposition, shapes and dimensions of the obstacles do not change during the whole period of the manipulator's movement, and their number does not increase.
2) It is known in advance that the target configuration is allowed and reachable (that is, we know that in the generalized coordinate space it is possible to find a line connecting q0 and qT which does not intersect any forbidden point).
3) The manipulator has a sensor system which may supply information about r-neighborhoods of points qi ∈ X, i = 0, 1, …, N, where N is a finite number defined by the sensor system's structure and the methods of its use. The r-neighborhood of a point qi is a set Y(qi) of points near qi (see Figure 1).

Fig. 1. An example of an r-neighborhood Y(qi).
The Y(qi) has the shape of a hypercube, with qi situated at its centre. The length of the hypercube's edge may be arbitrary and is defined by the sensor system's structure and the methods of its use. All sets Y(qi), i = 0, 1, … should have the same size, that is, the edge length of Y(qi) should equal the edge length of Y(qj) for every i and j. The sides of each Y(qi), i = 0, 1, … should be parallel to the corresponding planes of the basic coordinate system of the generalized coordinate space. The sets Y(qi), i = 0, 1, … should also satisfy the condition that it is possible to inscribe in Y(qi) a hyperball with centre qi and radius r > 0 (see Figure 2). The part of Y(qi) comprising the hyperball should be compact, that is, it should not have any holes: all points of the hyperball should belong to the inner part of Y(qi). The words "the sensor system supplies information about the r-neighborhood of a point qi" mean that the sensor system tells, for every point from Y(qi), whether this point is allowed or forbidden. The sensor system writes all forbidden points from Y(qi) into a set Q(qi), and all allowed points from Y(qi) into a set Z(qi). The sets Y(qi), Q(qi), Z(qi) may be represented by any suitable method, such as formulas, lists or tables; we suppose that such a method is available. We will not consider the sensor system's structure.
Fig. 2. It should be possible to inscribe in the set Y(qi) a hyperball with centre qi and radius r > 0.

An r-neighborhood of qi with the sets Z(qi) and Q(qi) may look as in Figure 3. Note that the sets Z(qi) and Q(qi) may not be continuous.
Fig. 3. An example of an r-neighborhood Y(qi) with the sets Q(qi) and Z(qi). The set Q(qi) is shown in a dark color; the points of Y(qi) which do not constitute Q(qi) belong to Z(qi).

The assumptions 1)-3) cover a wide range of manipulator applications.
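As a small illustration of assumptions 1)-3), the sketch below (Python; the obstacle predicate is a toy stand-in for a real sensor, and all names are illustrative) enumerates the grid points of a hypercubic Y(qi) and splits them into the forbidden set Q(qi) and the allowed set Z(qi).

```python
import itertools

def sense_neighborhood(q, r, step, is_forbidden):
    """Split the grid points of the hypercube Y(q) (half-edge r, grid step
    `step`) into the forbidden set Q(q) and the allowed set Z(q)."""
    m = int(r // step)
    ticks = [i * step for i in range(-m, m + 1)]
    Q, Z = set(), set()
    for offset in itertools.product(ticks, repeat=len(q)):
        point = tuple(qi + di for qi, di in zip(q, offset))
        (Q if is_forbidden(point) else Z).add(point)
    return Q, Z

# Toy 2-D sensor: points inside a disk around (2, 0) are forbidden.
forbidden = lambda p: (p[0] - 2.0) ** 2 + p[1] ** 2 < 1.0
Q, Z = sense_neighborhood((1.0, 0.0), r=1.0, step=0.5, is_forbidden=forbidden)
print(len(Q), "forbidden,", len(Z), "allowed")
```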
272
Advanced Technologies
2.3. Algorithm for Manipulator Control in the Unknown Environment

We will denote the points where the generation of a new trajectory occurs as qn, n = 0, 1, 2, …, and call them "trajectory changing points". Before the algorithm starts, n = 0 and qn = q0.

Algorithm

STEP 1. The manipulator is in a point qn, n = 0, 1, 2, …, and its sensor system supplies information about the r-neighborhood of qn and about the r-neighborhoods of points yj, j = 0, 1, …, Nn, where yj ∈ X for every j and Nn is a known finite number. The yj and Nn are, generally speaking, different for every n and are given to the sensor system at every n before it starts functioning. So the sensor system supplies information about Q(qn) and about the set

QSn = ⋃_{j=0}^{Nn} Q(yj).
After that the manipulator generates in the configuration space a preliminary trajectory L(qn, qT). The L(qn, qT) should satisfy the following conditions: I) connect qn and qT; II) no point of L(qn, qT) coincides with any point from the sets ⋃_{i=0}^{n} Q(qi) and ⋃_{i=0}^{n} QSi, in other words, the preliminary trajectory should not intersect any known forbidden point; III) satisfy the conditions (1). The manipulator starts to follow L(qn, qT). The algorithm goes to STEP 2.

STEP 2. While following L(qn, qT), two results may happen: a) the manipulator does not meet forbidden points unknown earlier and therefore reaches qT; upon reaching qT the Algorithm terminates its work; b) the manipulator comes to such a point (first performing the operation n = n+1; let us denote the point qn, n = 1, 2, …) that the next point of the preliminary trajectory is forbidden. The Algorithm goes to STEP 1.
End of Algorithm.

2.4. Theorem and Consequences

Theorem. If the manipulator moves according to the Algorithm, it will reach the target configuration in a finite number of steps.
Proof. Suppose that the manipulator, being in qn (n = 0, 1, …), generated a trajectory leading to qT and began to follow this trajectory. If the manipulator does not meet obstacles, it will reach the target configuration in a finite number of steps (because the length of the trajectory is finite). Therefore, endless wandering of the manipulator may be caused only by endless repetition of the situation described in item b) of STEP 2 of the Algorithm, and hence the endless generation of new trajectories may have only two causes: 1) the manipulator returns infinitely often to the same trajectory changing point; 2) the number of points where it is necessary to change the trajectory is infinite. Let us prove that all points where the manipulator changes its trajectory are different. Suppose that the manipulator changed its trajectory being in a point qs, and later changed its trajectory again, being in a point qp, that is, s < p. In qp the manipulator discovered that the next point of its trajectory is forbidden; this forbidden point was unknown
when this trajectory was generated. It means that the manipulator cannot come to a trajectory changing point qp which is equal to any other trajectory changing point, so all points where the manipulator changes its trajectory are different. Now let us show that the number of such points is finite. Suppose that it is infinite. All trajectory changing points must satisfy the inequalities (1), which means that the sequence of these points is bounded. According to the Bolzano-Weierstrass theorem it is possible to extract from this sequence a convergent subsequence qi, i = 1, 2, … By the Cauchy property of convergent sequences, for any ε > 0 it is possible to find a number s such that all points qi, i > s, lie in an ε-neighborhood of qs. Let us take ε < r; this leads to a contradiction, which proves that the number of trajectory changing points is finite.
yi, i = 0, 1, 2, …, Ns2 – arbitrary points from X, with Ns2 a finite number. After that, the operations n := n+1 and qn = q should be performed, and the manipulator may generate a new trajectory L(qn, qT) satisfying conditions I-III of STEP 1 of the Algorithm and begin to move along the new L(qn, qT) according to the Algorithm. Let us call such an action a "refusal of a preliminary trajectory": an action in which, although qT has not yet been reached, a new trajectory leading to qT is generated and the manipulator begins to follow it. It is possible to make only a finite number of refusals. Then the finite number of steps spent on refusals is added to the finite number of steps made according to the Algorithm, and in sum we get that qT will be reached in a finite number of steps.
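Stripped of the manipulator specifics, the Algorithm is the sense-plan-follow loop sketched below (Python). The grid world, the sensing radius and the breadth-first planner are toy stand-ins and all names are assumptions: `bfs_plan` plays the role of any SUBROUTINE of Section 3 that plans around the currently known forbidden points, and `move_to_target` corresponds to STEPS 1-2.

```python
from collections import deque

def bfs_plan(q0, qT, forbidden, lo=0, hi=9):
    """Toy SUBROUTINE: breadth-first search on a bounded 2-D grid that
    avoids all currently known forbidden points (planning in a known map)."""
    prev, frontier = {q0: None}, deque([q0])
    while frontier:
        q = frontier.popleft()
        if q == qT:
            path = []
            while q is not None:
                path.append(q)
                q = prev[q]
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nq = (q[0] + dx, q[1] + dy)
            if (lo <= nq[0] <= hi and lo <= nq[1] <= hi
                    and nq not in forbidden and nq not in prev):
                prev[nq] = q
                frontier.append(nq)
    raise RuntimeError("qT unreachable with current knowledge")

def move_to_target(q0, qT, sense, plan):
    q, known, replans = q0, set(), 0      # known = discovered forbidden points
    while True:
        known |= sense(q)                 # STEP 1: sense, then plan
        path = plan(q, qT, known)         # preliminary trajectory L(qn, qT)
        blocked = False
        for i, p in enumerate(path):      # STEP 2: follow the trajectory
            known |= sense(p)
            if i + 1 < len(path) and path[i + 1] in known:
                q, blocked = p, True      # new trajectory changing point qn
                replans += 1
                break
        if not blocked:
            return qT, replans            # case a): target reached

# Hidden wall at x = 5 with a gap at y = 7; the sensor reveals forbidden
# cells within a Chebyshev radius of 1 (a crude hypercubic Y(q)).
WALL = {(5, y) for y in range(10) if y != 7}
sense = lambda q: {p for p in WALL
                   if abs(p[0] - q[0]) <= 1 and abs(p[1] - q[1]) <= 1}
print(move_to_target((0, 0), (9, 0), sense, bfs_plan))
```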
3. Using the polynomial approximation algorithm as a subroutine in the exact algorithm

3.1 Reducing the Algorithm to a finite number of calls of a subroutine for trajectory planning in a known environment

Every time the manipulator generates a new trajectory according to STEP 1 of the Algorithm, two cases may happen: either the manipulator will not meet an obstacle, and therefore will reach qT in a finite number of steps (because the length of the trajectory is finite), or the manipulator will meet an unknown obstacle and will have to plan a new trajectory. Therefore, in the Algorithm the problem of manipulator control in the presence of unknown obstacles is reduced to the solution of a finite number of tasks of trajectory planning in the presence of known forbidden states. In other words, the Algorithm will make a finite number of calls of a subroutine which solves the problem stated in STEP 1. In the rest of the article we will call this subroutine the SUBROUTINE. We took the polynomial approximation algorithm as the algorithm for the SUBROUTINE.

3.2 Polynomial approximation algorithm

Denote the components of the vector-function q(t) as q1, q2, …, qn. Write down the restrictions using the new variables:

qj(0) = q0j, qj(1) = qTj, j = 1, 2, …, n,     (2)
qLj ≤ qj(t) ≤ qHj, j = 1, 2, …, n.     (3)
Specify the obstacles by hyperspheres with centers (qpm1, qpm2, …, qpmn) and radius r = Δq/2 · 1.1, where the qpmi are the components of the vectors pm, m = 1, 2, …, M, and Δq is the step of the configuration space discretization. The value of the radius r is chosen so that if two obstacles are located on neighboring nodes of the grid and their centers differ in only one component, the corresponding hyperspheres intersect; otherwise the hyperspheres do not intersect. Then the requirement that the trajectory avoids the obstacles can be written as follows:

∑_{i=1}^{n} (qi(t) − qpmi)² ≥ r²,  t ∈ [0; 1],  m = 1, 2, …, M.     (4)
The left part of this inequality is the squared distance between the trajectory point at moment t and the center of the m-th obstacle. We will search for the trajectory in the form of polynomials of some order s:

qj(t) = ∑_{i=0}^{s} cji t^i,  j = 1, …, n.     (5)
Here the cji are unknown coefficients. Substituting t = 0 and t = 1 in (5):

qj(0) = cj0 + cj1 · 0 + cj2 · 0² + … + cjs · 0^s = cj0,
qj(1) = cj0 + cj1 · 1 + cj2 · 1² + … + cjs · 1^s = cj0 + ∑_{i=1}^{s} cji,  j = 1, 2, …, n.

Taking into account the requirements (2):

cj0 = q0j,     (6)
∑_{i=1}^{s} cji = qTj − q0j,  j = 1, 2, …, n.     (7)
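In code, equations (5)-(7) say that once the free coefficients cj1, …, cj,s−1 are chosen, cj0 and cjs are forced by the boundary conditions. A sketch (Python; all names are illustrative):

```python
def make_trajectory(q0, qT, free_coeffs):
    """q_j(t) in polynomial form (5); c_j0 and c_js are fixed per joint by
    the endpoint conditions (6) and (7)."""
    def q(t):
        point = []
        for q0j, qTj, cj in zip(q0, qT, free_coeffs):
            s = len(cj) + 1                   # polynomial order
            cjs = qTj - q0j - sum(cj)         # eq. (7) solved for c_js
            value = q0j + sum(c * t ** i for i, c in enumerate(cj, 1))
            point.append(value + cjs * t ** s)
        return point
    return q

# Two joints, order s = 3 (two free coefficients per joint).
traj = make_trajectory(q0=[0.0, 1.0], qT=[1.0, 0.0],
                       free_coeffs=[[0.5, -0.2], [0.0, 0.3]])
print(traj(0.0), traj(1.0))                   # reproduces q0 and qT
```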
Divide the interval [0; 1] into K+1 pieces by K intermediate points t1, t2, …, tK. Then the requirements (3) and (4) become:

∑_{i=0}^{s} cji tk^i ≥ qLj,  j = 1, …, n,
∑_{i=0}^{s} cji tk^i ≤ qHj,  j = 1, …, n,
∑_{j=1}^{n} ( ∑_{i=0}^{s} cji tk^i − qpmj )² ≥ r²,  m = 1, 2, …, M,  k = 1, 2, …, K.     (8)
Thus, it is necessary to find coefficients cji (j = 1, 2, …, n, i = 1, 2, …, s) which satisfy the system of equations (6), (7) and inequalities (8). Obviously, the coefficients cj0 are found directly from the equations (6). Express the coefficients cjs from the equations (7):

cjs = qTj − q0j − ∑_{i=1}^{s−1} cji,  j = 1, …, n.
Substituting cj0 and cjs in (8):

q0j + ∑_{i=1}^{s−1} cji tk^i + ( qTj − q0j − ∑_{i=1}^{s−1} cji ) tk^s ≥ qLj,
q0j + ∑_{i=1}^{s−1} cji tk^i + ( qTj − q0j − ∑_{i=1}^{s−1} cji ) tk^s ≤ qHj,  j = 1, 2, …, n,

∑_{j=1}^{n} ( q0j + ∑_{i=1}^{s−1} cji tk^i + ( qTj − q0j − ∑_{i=1}^{s−1} cji ) tk^s − qpmj )² ≥ r²,  m = 1, 2, …, M,  k = 1, 2, …, K.     (9)

Thus, it is necessary to solve a system of (M + 2n)K nonlinear inequalities. This problem can be replaced by the optimization of a function F(C):

F(C) = { E(C), if E(C) < 0;  P(C), if E(C) = 0 },

where C is the vector of coefficients (c1,1, c1,2, …, c1,s−1, c2,1, c2,2, …, c2,s−1, …, cn,1, cn,2, …, cn,s−1), E(C) is the measure of restriction violation, and P(C) is the estimated probability that the trajectory does not intersect unknown obstacles. Let us define the function F(C). First, introduce the function giving the trajectory point for the vector of coefficients C at the moment t:

Lj(C, t) = q0j + ∑_{i=1}^{s−1} cji t^i + ( qTj − q0j − ∑_{i=1}^{s−1} cji ) t^s.
The following functions correspond to the inequalities of the system (8); they measure the violation of the upper and lower bounds and the intersection of the trajectory with the obstacles at the moment t:

EL(C, t) = ∑_{j=1}^{n} I( Lj(C, t) − qLj ),
EH(C, t) = ∑_{j=1}^{n} I( qHj − Lj(C, t) ),
EP(C, t) = ∑_{m=1}^{M} I( ∑_{j=1}^{n} ( Lj(C, t) − qpmj )² − r² ),

I(z) = { z, if z < 0;  0, if z ≥ 0 }.
Because of the operator I, the terms in the above functions are negative only if the corresponding restrictions are violated; the greater the violation, the larger the magnitude of the term. If a particular restriction is not violated, the corresponding term is zero. In any case, the values of these functions cannot be positive. Join all restrictions into a single function (taking into account all discrete moments except t = 0 and t = 1):
E(C) = ∑_{k=1}^{K} [ EL(C, tk) + EH(C, tk) + EP(C, tk) ].

Thus, E(C) takes negative values if at least one restriction is violated and becomes zero otherwise, that is, if the vector C satisfies all the inequalities of the system (9). The function P(C) was introduced to make possible the comparison of trajectories which do not violate the restrictions. Assume that p(d) = e^{−2d} is the probability that there is an unknown obstacle at some point, where d is the distance between this point and the nearest known obstacle. Then

P(C) = ∏_{m=1}^{M*} p( D(C, Om) ),

where {O1, O2, …, OM*} is the set of obstacles along which the trajectory lies, and

D(C, O) = min_k [ ∑_{j=1}^{n} ( Lj(C, tk) − Oj )² − r² ]

is the distance measure between the trajectory C and the obstacle O. A trajectory lies along an obstacle O if there exists a trajectory point L(tk) for which O is the nearest among all obstacles. The use of the function P(C) promotes trajectories which lie far from unknown obstacles. Since the function F(C) is multiextremal, a genetic algorithm is used to find the desired vector.

3.3 Optimization of the restriction function using a genetic algorithm

A genetic algorithm is based on a collective training process inside a population of individuals, each representing a point of the search space; in our case the search space is the space of vectors C. A vector is encoded in an individual as follows: each vector component is assigned the interval of values this component may take; the interval is divided into a number of discrete points, so each component value is associated with a point index, and the sequence of these indexes makes up the individual. The algorithm scheme is:
1. Generate an initial population of size N randomly.
2. Calculate the fitness values of the individuals.
3. Select individuals into the intermediate population.
4. With probability PC perform crossover of two individuals randomly chosen from the intermediate population and put the offspring into the new population; with probability 1 − PC perform reproduction, i.e., copy an individual randomly chosen from the intermediate population into the new population.
5. If the size of the new population is less than N, go to Step 4.
6. With the given mutation probability PM, invert each bit of each individual of the new population.
7. If the required number of generations has not been reached, go to Step 2.
The fitness function matches F(C). On the third step tournament selection is used: N tournaments are carried out among m randomly chosen individuals (m is called the tournament size), and in every tournament the best individual is chosen for the intermediate population. On the fourth step arithmetical crossover is used: the components of the offspring are calculated as the arithmetic mean of the corresponding components of the two parents.
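The violation measure codes directly from these definitions. The sketch below (Python; all names are illustrative) implements I(z) and sums the bound and obstacle terms into E(C) over a time grid, the part of F(C) that drives infeasible candidate vectors toward feasibility; the P(C) branch is omitted.

```python
def I(z):
    """Violation operator: z if z < 0, else 0."""
    return z if z < 0 else 0.0

def E(L, t_grid, q_low, q_high, obstacles, r):
    """Total restriction violation of trajectory L over the time grid:
    zero iff the joint bounds hold and every obstacle sphere is avoided."""
    total = 0.0
    for t in t_grid:
        point = L(t)
        for pj, lo, hi in zip(point, q_low, q_high):
            total += I(pj - lo) + I(hi - pj)        # E_L and E_H terms
        for centre in obstacles:
            d2 = sum((pj - cj) ** 2 for pj, cj in zip(point, centre))
            total += I(d2 - r ** 2)                 # E_P term
    return total

L = lambda t: [t, t]                                # toy straight-line path
t_grid = [k / 100 for k in range(1, 100)]           # interior moments only
print(E(L, t_grid, [0, 0], [1, 1], obstacles=[(0.5, 0.5)], r=0.1))
```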
3.4 Quantization of the path

After the polynomial coefficients specifying the route are found, it is necessary to obtain the sequence of discrete route points q0, q1, …, qT. The first point (q0) is given by the initial values; the rest can be found using the following algorithm:
1. t = 0; i = 0.
2. tH = 1.
3. t* = (t + tH)/2.
4. Find the point q* with coordinates (q*1, q*2, …, q*n) in whose neighborhood the trajectory lies at the moment t*: q*j − Δq/2 ≤ qj(t*) ≤ q*j + Δq/2, j = 1, …, n.
5. If q* equals qi, then t = t*, go to Step 3.
6. If q* is not a neighbor of qi, then tH = t*, go to Step 3.
7. If q* is not forbidden, then go to Step 9.
8. If qij − Δq ≤ qj(t*) ≤ qij + Δq for all j = 1, …, n, where the qij are the coordinates of qi, then t = t*, otherwise tH = t*; go to Step 3.
9. i = i + 1; qi = q*.
10. If qi = qT, then the algorithm is finished; otherwise go to Step 2.
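The bisection of Steps 2-8 can be sketched as follows (Python; grid cells are indexed by rounding, the forbidden-point handling of Steps 7-8 is omitted for brevity, and the names and the demo trajectory are illustrative, so this is a simplified sketch rather than the full procedure):

```python
def quantize(L, q0, qT, dq, eps=1e-12):
    """Turn a continuous trajectory L(t), t in [0, 1], into a chain of
    adjacent grid cells of size dq (bisection of Section 3.4)."""
    cell = lambda p: tuple(round(x / dq) for x in p)
    adjacent = lambda a, b: all(abs(u - v) <= 1 for u, v in zip(a, b))

    qi, t, path = cell(q0), 0.0, [cell(q0)]
    target = cell(qT)
    while qi != target:
        tH = 1.0
        while True:
            t_star = (t + tH) / 2.0                 # Step 3
            q_star = cell(L(t_star))                # Step 4
            if q_star == qi:
                t = t_star                          # Step 5: same cell
            elif not adjacent(q_star, qi):
                tH = t_star                         # Step 6: overshot
            else:
                break                               # adjacent cell found
            if tH - t < eps:
                raise RuntimeError("bisection stalled")
        qi, t = q_star, t_star
        path.append(qi)
    return path

L = lambda t: [2 * t, t * t]                        # toy 2-D trajectory
print(quantize(L, q0=[0, 0], qT=[2, 1], dq=0.25))
```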
4. Experimental results

Consider the following experimental setup (Figure 5). It is necessary to move a seven-link manipulator (Figure 4) from the start configuration q0 = (1.57; 1.57; 0; 4.71; 0; 4.71; 0) to the target configuration qT = (4.71; 1.57; 0; 0; 0; 0; 0). There are the following limitations on the generalized coordinates: 0 ≤ qi(t) ≤ 6.28, i = 1, 2, …, 7. The lengths of the links are 10. There are four cuboid obstacles in the working area; each obstacle is described by six values: the length, width, height, and the coordinates of the origin of the frame attached to the obstacle in the basic coordinate system (Table 1). The parameters of the algorithms are as follows:
1. Polynomial order s: 10.
2. Number of time pieces K: 100.
3. Polynomial coefficient precision: 10^-5.
4. Population size N: 20.
5. Number of generations: 20.
6. Probability of crossover PC: 0.5.
7. Probability of mutation PM: 0.1.
8. Tournament size m: 5.
The working time of the Algorithm for different values of number_of_discretes is given in Table 2. The Δq is calculated as the difference between the upper and lower bounds of q(t) (that is, 6.28) divided by number_of_discretes. The working time is the sum of three elements: the trajectory search time, the time to check whether the trajectory intersects with unknown obstacles, and the manipulator moving time (12° per second).
Fig. 4. The manipulator kinematic scheme
Fig. 5. Experimental set

№ | x   | y   | z   | length | width | height
1 | -30 | 2   | 12  | 80     | 1.6   | 2
2 | 10  | -20 | 0   | 34     | 14    | 20
3 | -44 | -20 | 0   | 34     | 14    | 40
4 | -40 | -40 | -10 | 200    | 200   | 10

Table 1. The characteristics of obstacles
Obstacles  | number_of_discretes | Working time, seconds
1, 2       | 40                  | 58
1, 2       | 60                  | 82
1, 2       | 120                 | 153
1, 2       | 240                 | 150
1, 2       | 360                 | 125
1, 2, 3    | 40                  | 83
1, 2, 3    | 60                  | 96
1, 2, 3    | 120                 | 154
1, 2, 3    | 240                 | 233
1, 2, 3    | 360                 | 251
1, 2, 3, 4 | 40                  | 233
1, 2, 3, 4 | 60                  | 340
1, 2, 3, 4 | 120                 | 761
1, 2, 3, 4 | 240                 | 1142
1, 2, 3, 4 | 360                 | 1169

Table 2. Experimental results
The tests were made on an AMD Athlon XP 1800+ processor (1533 MHz). During the experiments the value Nn in the Algorithm was equal to 0 for every n. The key steps of the manipulator trajectory for the last test case (with four obstacles) are shown in Figure 6. One may see that the software based on our Algorithm moved the 7-link manipulator to the target configuration in 58-1169 seconds. In the case of using breadth-first or A* algorithms (La Valle, 1999-2004) we have to discretize the configuration space, which gives m^n points of the configuration space, where m is the number of points for a certain level of discretization and n is the configuration space dimensionality; the working time of such algorithms may approach billions of seconds. It is possible to outline the following criteria which should be satisfied by an algorithm for the SUBROUTINE: a) it should be applicable to the n-dimensional case; b) it should guarantee finding a path in the presence of known forbidden states; c) on each new call of STEP 1, a minimum of work for finding a path in the presence of known forbidden states should be done. Using cell decomposition and bitangent graph algorithms (La Valle, 1999-2004) as algorithms for the SUBROUTINE of our Algorithm looks promising.
5. Conclusion

An algorithm for moving an n-link manipulator amidst arbitrary unknown static obstacles was presented, together with a theorem stating that if the manipulator moves according to the algorithm, it will reach a target configuration qT in a finite number of steps under the condition that qT is reachable, and a proof of the theorem. It was shown that the problem of manipulator control in an unknown environment may be reduced to the solution of a finite number of problems of manipulator control in a known environment. Consequences of the theorem which may facilitate reaching qT were given. We also presented the results of a computer simulation, based on our Algorithm, of a 7-link manipulator moving in an unknown environment, and compared the results with some other algorithms.
Fig. 6. The key steps of manipulator movement trajectory
6. References

C. Ahrikhencheikh, A. Seireg, Optimized-Motion Planning: Theory and Implementation, John Wiley & Sons, Inc, 1994.
N.M. Amosov, A.M. Kasatkina, L.M. Kasatkina, "Aktivnye semanticheskiye seti v robotakh s avtonomnym upravleniyem (Active semantic nets in robots with autonomous control)", Trudy IV Mezhdunarodnoy obyedinennoy konferencii po iskustvennomu intellectu, M.: AN SSSR, 1975, v. 9, pp. 11-20 (in Russian).
J. Canny, "The Complexity of Robot Motion Planning", The MIT Press, Cambridge, Massachusetts, 1988.
C. Chen, H.-X. Li, D. Dong, "Hybrid Control for Robot Navigation: A Hierarchical Q-Learning Algorithm", IEEE Robotics & Automation Magazine, Vol. 15, No. 2, June 2008, pp. 37-47.
H. Choset et al., "Principles of Robot Motion: Theory, Algorithms and Implementations", A Bradford Book, The MIT Press, 2005.
G.E. Collins, "Quantifier Elimination for Real Closed Fields by Cylindrical Algebraic Decomposition", Lecture Notes in Computer Science, Vol. 33, pp. 135-183, Springer-Verlag, New York, 1975.
B.R. Donald, "On Motion Planning with Six Degrees of Freedom: Solving the Intersection Problems in Configuration Space", Proceedings of the IEEE International Conference on Robotics and Automation, 1985.
S.K. Ghosh, J.W. Burdick, A. Bhattacharya, S. Sarkar, "Online Algorithms with Discrete Visibility: Exploring Unknown Polygonal Environments", IEEE Robotics & Automation Magazine, Vol. 15, No. 2, June 2008, pp. 67-76.
V.A. Ilyin, "Intellektualnye Roboty: Teoriya i Algoritmy (Intelligent Robots: Theory and Algorithms)", Krasnoyarsk, SAA Press, 1995 (in Russian).
A.M. Kasatkin, "O predstavlenii znaniy v sistemah iskustvennogo intellecta robotov (On knowledge representation in artificial intelligence systems of robots)", Kibernetika, 1979, No. 2, pp. 57-65 (in Russian).
E.M. Kussul, V.D. Fomenko, "Maket avtonomnogo transportnogo robota TAIR (A model of the autonomous transport robot TAIR)", Voprosy teorii avtomatov, robotov i CVM, Kiyev, 1975, pp. 60-68 (in Russian).
S.M. LaValle, "Planning Algorithms", 1999-2004. Available: http://msl.es.uiuc.edu/planning
P.K. Lopatin, "Algorithm of a manipulator movement amidst unknown obstacles", Proceedings of the 10th International Conference on Advanced Robotics (ICAR 2001), August 22-25, 2001, Hotel Mercure Buda, Budapest, Hungary, pp. 327-331.
P.K. Lopatin, "Algorithm 2 upravlenia dinamicheskimi sistemami v neizvestnoy staticheskoy srede (Algorithm 2 for Dynamic Systems' Control in an Unknown Static Environment)", Herald of the Siberian State Aerospace University named after Academician M.F. Reshetnev, ed. prof. G.P. Belyakov, SibSAU, No. 4(11), Krasnoyarsk, pp. 28-32, 2006 (in Russian).
P.K. Lopatin, A.S. Yegorov, "Using the Polynomial Approximation Algorithm in the Algorithm 2 for Manipulator's Control in an Unknown Environment", International Journal of Applied Mathematics and Computer Sciences, Vol. 4, No. 4, pp. 190-195. http://www.waset.org/ijamcs/v4/v4-4-32.pdf
V. Lumelsky, "Sensing, Intelligence, Motion: How Robots and Humans Move in an Unstructured World", John Wiley & Sons, 2006.
E. Masehian, M.R. Amin-Nasari, "Sensor-Based Robot Motion Planning: A Tabu Search Approach", IEEE Robotics & Automation Magazine, Vol. 15, No. 2, June 2008, pp. 48-57.
D. Rawlinson, R. Jarvis, "Ways to Tell Robots Where to Go: Directing Autonomous Robots Using Topological Instructions", IEEE Robotics & Automation Magazine, Vol. 15, No. 2, June 2008, pp. 27-36.
M. Shahinpoor, "A Robot Engineering Textbook", Harper & Row Publishers, New York, 1987.
A.V. Timofeev, "Roboty i Iskusstvennyi Intellekt (Robots and Artificial Intelligence)", M.: Nauka, 1978 (in Russian).
Ye.I. Yefimov, "Problema perebora v iskusstvennom intellekte (The search problem in artificial intelligence)", Izv. AN SSSR, Tehnicheskaya Kibernetika, 1988, No. 2, pp. 127-128 (in Russian).
F. Yegenoglu, A.M. Erkmen, H.E. Stephanou, "On-line Path Planning Under Uncertainty", Proc. 27th IEEE Conf. on Decision and Control, Austin, Tex., Dec. 7-9, 1988, Vol. 2, pp. 1075-1079, New York (N.Y.), 1988.
16
Realization of a New Code for Noise Suppression in Spectral Amplitude Coding OCDMA Networks

Hilal A. Fadhil, S. A. Aljunid and R. B. Ahmad
University Malaysia Perlis, School of Computer and Communication Engineering, Perlis, Malaysia

1. Introduction

The Code Division Multiple Access (CDMA) technique has been widely used in wireless communication networks. A CDMA communication system allows multiple users to access the network simultaneously using unique codes. Optical CDMA (OCDMA) has the advantage of using optical processing to perform certain network functions, like addressing and routing, without resorting to complicated multiplexers or demultiplexers, and its asynchronous data transmission can simplify network management and control. Therefore, OCDMA is an attractive candidate for LAN applications. Among all OCDMA techniques, spectral amplitude coding (SAC) can eliminate first-order multiple access interference (MAI) when balanced detection is used. In SAC-OCDMA the frequency spectrum is partitioned into bins that are present or absent according to the spectral code assigned to a particular user. Phase induced intensity noise (PIIN) is known to be the most important source of impairment in SAC-OCDMA systems, and codes with ideal cross-correlation have been studied for many years. In this chapter, a new code with the zero cross-correlation property, named the Random Diagonal (RD) code, is constructed and designed for SAC-OCDMA systems. The performance of an OCDMA system degrades as the number of simultaneous users increases, especially when the number of users is large; this is caused by the MAI effects arising from the incomplete orthogonality of the signature codes used. The great contribution of this code is the suppression of PIIN, and it is shown that the performance can be improved significantly when there is zero cross-correlation between the code sequences. The direct detection technique gives an advantage in improving the performance of the RD code. The study is carried out through theoretical calculation and simulation experiments. The simulation design covers various design parameters, namely distance, bit rate, input power and chip spacing, and the effect of these parameters on the system is elaborated through the bit error rate (BER) and PIIN. By comparing the theoretical results with simulation results obtained from the commercial optical system simulator OptSim, we show that utilizing the RD code considerably improves the system performance compared with the Hadamard, Modified Quadratic Congruence (MQC) and Modified Frequency Hopping (MFH) codes. It is shown that the
system using these new code matrices not only suppresses PIIN, but also allows a larger number of active users compared with other codes. Simulation results show that, using point-to-point transmission with three encoded channels, the RD code has better BER performance than the other codes. It is also seen that the RD code family effectively suppresses PIIN compared with the other SAC-OCDMA codes, even though its weight is far smaller. This chapter is organized as follows: in sections two and three we discuss how the code is developed theoretically, and also its properties; simulation results for different OCDMA codes are given in section four; and the conclusion is presented in section five.
2. RD Code Design for OCDMA Networks and its Performance

When some pulses originate from sources other than the desired user, multiple access interference (MAI) limits performance. The detected intensity is proportional to the number of pulses from interfering users (MAI) only in the mean; the intensity fluctuates severely around this mean due to phase induced intensity noise (PIIN), causing much greater signal impairment than first-order (in the mean) effects. PIIN is often neglected in theoretical analysis, but has been identified experimentally as the limiting noise source for several types of OCDMA, including SAC-OCDMA. For instance, while a first-order analysis of SAC systems using balanced detection shows that MAI is completely eliminated (Kavehrad & Zaccarin, 1995), in fact the presence of interferers leads to severe PIIN and eye closing. While the new code for SAC-OCDMA systems called the RD code suffers from less PIIN than other SAC-OCDMA codes (the zero cross-correlation property reduces the PIIN), it too is ultimately limited by PIIN. Phase induced intensity noise arises in every multiple access system where the detection of multiple optical pulses occurs. These pulses may have the same origin, i.e., be time-delayed versions of the same pulse, or may originate from different sources with the same or different center wavelengths. In general, the power of filtered PIIN depends on the number of interfering pulses and on the ratio of the optical signal coherence time to the integration time at the detector, and it can never be cancelled in practice. The great contribution of this code is the suppression of PIIN, and it has been shown that the performance can be improved significantly when there is zero cross-correlation between the code sequences. In order to effectively extract a signature sequence in the presence of other users' signature sequences, a set of signature sequences must satisfy the following two properties (Maric, 1993):
1. Maximal auto-correlation: each sequence can be easily distinguished from a shifted version of itself (except for the zero shift).
2. Minimal cross-correlation: each sequence can be easily distinguished from every other sequence in the set.
An RD code is a collection of binary sequences that meet good auto- and cross-correlation and maximum code weight properties, and it is constructed using simple matrix operations. For other codes this framework is defined by Salehi (Chung, Salehi, & Wei, 1989; Salehi, 1989) and others (Hui, 1985; Marie, Hahm, & Titlebaum, 1995; Marie, Moreno, & Corrada, 1996) as a means to obtain CDMA in optical networks. Let the {0,1}-valued sequence of the m-th user be denoted by Cm = {Cm(i)}, i = 1, …, N, m = 1, 2, …, K, where N and K are the code sequence length and the total number of subscribed users in the network, respectively; K must be less than or equal to the code family size. Two important conditions should be satisfied during the RD code design:
1. The number of ones, W, for the zero shift (s = 0), given by the discrete auto-correlation

A_Cm = ∑_{i=0}^{N} Cm(i) Cm(i),     (1)

should be maximized.
2. The number of coincidences for every shift s in the cross-correlation function of two sequences Cl, Cm,

A_ClCm(s) = ∑_{i=0}^{N} Cl(i) Cm(i+s),  −N+1 ≤ s ≤ N−1,     (2)

should be minimized.
The auto-correlation property is used to enable the receiver to obtain synchronization, that is, to find the beginning of a message and subsequently locate the codeword boundaries; it is also useful in boosting the signal-to-noise level. The cross-correlation property enables the receiver to estimate its message in the presence of multiple-access interference from other users. According to the above discussion, RD code principles can be characterized by (N, W, λc), where W is the code weight, i.e., the number of ones in the code sequence, and λc is the minimum value of the out-of-phase discrete cross-correlation (except the zero shift). So far, several types of incoherent OCDMA codes have been developed. The most popular are the prime sequence codes (Kwong & Yang, 1995), which are characterized by (P², P, P−1), the Modified Frequency Hopping (MFH) codes, and the Modified Quadratic Congruence (MQC) codes (X. Wei, Shalaby, & Ghafouri-Shiraz, 2001; Z. Wei & Ghafouri-Shiraz, 2002a). For the RD code there are two parameters that can be effectively used to evaluate the performance of the code family as a whole: the code family size and the MAI-limited bit error rate. Generally speaking, we want to develop a code sequence that has both a good cross-correlation property and a large code family size. An (N, W, λc) RD code is a family of (0,1) sequences of length N and weight W, where λc is the in-phase cross-correlation, which satisfies the following two properties:
1. Zero cross-correlation, which minimizes λc and reduces PIIN (phase induced intensity noise).
2. No cross-correlation in the data segment.
The design of this new code is performed by dividing the code sequence into two sub-matrices: a code sub-matrix (code level) and a data sub-matrix (data level). The advantage of dividing the RD codes into two parts is that it becomes easier to implement the hardware using direct detection rather than other detection techniques. Another major advantage of our new code over both MQC and MFH lies in the first property: elements in each sequence can be divided into groups, and each group contains only one "1". This property makes it much easier to realize address reconfiguration in a grating-based spectral amplitude-coding optical CDMA transmitter: we can use one group of gratings to reflect all the desired spectral components, and then use another group of gratings to compensate the delays of the desired components and again incorporate them into a temporal pulse.
Note that one disadvantage of this code is that the minimum code weight should be W ≥ 3, as shown in the RD code design below.
Step 1, data segment: let the elements in this group contain only one "1", to keep the cross-correlation at the data level zero (λc = 0). This property is represented by a (K × K) matrix, where K represents the number of users. These matrices have binary coefficients, and a basic zero-cross code (weight = 1) is defined as [Y1]; for example, for three users (K = 3) it can be expressed as
[Y1] =
| 0 0 1 |
| 0 1 0 |
| 1 0 0 |     (3)
where [Y1] is a (K × K) matrix. Notice that in the above expression the cross-correlation between any two rows is always zero.
Step 2, code segment: for W = 4 this matrix can be expressed as
[Y2] =
| 0 1 1 1 0 |
| 1 1 0 0 1 |
| 1 0 1 1 0 |     (4)
where [Y2] consists of two parts, a basic matrix part and a weight matrix part. The basic part [B] can be expressed as
[B] =
| 0 1 1 |
| 1 1 0 |
| 1 0 1 |     (5)

and the weight part, called the [M] matrix,

[M] =
| 1 0 |
| 0 1 |
| 1 0 |

is responsible for increasing the number of weights. Let i = (W − 3) and

Mi =
| 1 0 |
| 0 1 |
| 1 0 |

where i represents the number of Mi matrices in [M], given by
[M] = [M1 M2 M3 … Mi].     (6)
For example, if W = 5, then i = 2, so that [M] = [M1 M2]:

[M] =
| 1 0 1 0 |
| 0 1 0 1 |
| 1 0 1 0 |     (7)
Notice that to increase the number of users, and simultaneously the code word length, we can simply repeat the rows of both matrices [M] and [B]; for the K-th user the matrices [M] and [B] can be expressed as
[M](j) =
| 1  0  |
| 0  1  |
| 1  0  |
| :  :  |
| aj1 aj2 |

and [B](j) =
| 0  1  1  |
| 1  1  0  |
| 1  0  1  |
| 0  1  1  |
| :  :  :  |
| aj1 aj2 aj3 |     (8)
where j represents the value for the K-th user (j = 1, 2, …, K), and the value of each aj is either zero or one. The weight of the code part of both matrices [M] and [B] is equal to W − 1, so the total code combination is represented as a (K × N) matrix with K = 3 and N = 8, given by [Z1] = [Y1 | Y2]. The RD code sequences are listed in Table 3.1 for K = 3 and W = 4.
[Z1] =

Kth \ C | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1       | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0
2       | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1
3       | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0

Columns 1-3 form the data segment; columns 4-8 form the code segment (basic + weight sub-matrices).

Table 3.1. Example of an RD code sequence
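The construction above mechanizes directly. The following sketch (Python) reproduces Table 3.1 for K = 3, W = 4 and Table 3.2 for K = 6, W = 5, under the assumption, consistent with both tables, that the rows of [B] and [M] repeat cyclically down the user index.

```python
def rd_code(K, W):
    """Generate the K x N Random Diagonal code matrix, N = K + 2W - 3.
    Data segment: one '1' per user on the anti-diagonal of a K x K block.
    Code segment: a cyclic row of [B] followed by the weight block [M]."""
    assert W >= 3
    B = [[0, 1, 1], [1, 1, 0], [1, 0, 1]]   # basic sub-matrix rows
    M = [[1, 0], [0, 1]]                    # one Mi block per extra weight
    code = []
    for k in range(K):
        data = [0] * K
        data[K - 1 - k] = 1                 # anti-diagonal data chip
        row = data + B[k % 3] + M[k % 2] * (W - 3)
        code.append(row)
    return code

for row in rd_code(K=6, W=5):               # reproduces Table 3.2
    print(row)
```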
From the above basic matrix Z1, the number of users (K) and the code length (N) are given by the (K × N) matrix. Notice that the code weight of each row is equal to 4; the relation between N and K for this case (W = 4) can be expressed as
N = K + 5.     (9)

As a result, for W = 5, 6, and 7 the code word length N can be expressed as K+7, K+9 and K+11, respectively. The general equation relating the number of users K, the code length N and the code weight W is therefore

N = K + 2W − 3.     (10)
The total spectral width v of the RD code system is governed by the code length N. Assuming that the chips in the combined signal are placed either on top of each other (for overlapping chips) or next to each other (for non-overlapping chips), the relationship can be written as

v = F · N,     (11)
where F is the chip width. In the RD code sequences, only the chips at the data level are detected at the receiver section; the remaining chips are considered dummies. This property of the RD code sequence offers a simple and cost-effective implementation of the light source: we can use a single laser source to cover the chips at the data level, while the remaining chips are covered using a broadband LED, as shown in Figure 1.
Fig. 1. RD codes for OCDMA multiplexing in optical fiber channels

Figure 2 shows the spectra of the generated codes for all three transmitter channels. As we can see, the transmitted data of the RD code (data segment) is carried over 1549.6 nm, 1548.8 nm
and 1548 nm for channels 1, 2 and 3, respectively. The other wavelengths are considered idle (code segment) and will be cancelled at the receiver using a fiber Bragg grating (FBG) optical filter.
Fig. 2. The generated RD code waveforms for K = 3 with W = 4: a) channel 1, b) channel 2 and c) channel 3

The RD code can also be employed for a larger number of users, say K = 6 with W = 5: substituting these values of the code weight (W) and the number of users (K) into Eq. (10), we get an RD code with code length N = 13. The RD code sequence construction is shown in Table 3.2.
Kth \ C | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
1       | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1  | 0  | 1  | 0
2       | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0  | 1  | 0  | 1
3       | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1  | 0  | 1  | 0
4       | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0  | 1  | 0  | 1
5       | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1  | 0  | 1  | 0
6       | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0  | 1  | 0  | 1

Columns 1-6 form the data segment; columns 7-13 form the code segment.

Table 3.2. (6 × 13) RD code sequences, W = 5, N = 13, and K = 6
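The zero cross-correlation property at the data segment can be checked numerically from the rows of Table 3.2; here for the pair of users 1 and 4 (see property 2 in Section 2.1 below), in Python:

```python
# Rows of Table 3.2 for users 1 and 4; the first K = 6 chips are the data segment.
user1 = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0]
user4 = [0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1]
K = 6

data_xc = sum(a * b for a, b in zip(user1[:K], user4[:K]))
full_xc = sum(a * b for a, b in zip(user1, user4))
print("data-segment cross-correlation:", data_xc)   # 0: no MAI at data level
print("full-sequence cross-correlation:", full_xc)
```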
2.1 The Properties of RD Code

RD codes have the following properties:
1. The code is divided into two segments: a data segment (sub-matrix) and a code segment (sub-matrix).
2. There is no cross-correlation in the data segment, which minimizes λc and suppresses PIIN (phase induced intensity noise). Note that certain combinations of simultaneously transmitted codes always result in zero cross-correlation; for example, when user 1 and user 4 are transmitted together (Table 3.2).
3. The code segment can be replaced with any type of code; it is divided into two sub-matrices, a basic sub-matrix and a weight sub-matrix. Note that the minimum weight combination in the code segment is 2 (W = 2); to increase the weight of the RD code we simply increase the number of 1's in the weight sub-matrix, as illustrated in section 3.4.
4. The general relationship between the number of users (K) and the code length (N) is given by

N = K + 2W − 3.     (12)

The number of users K supported by the RD code is equivalent to n. For RD codes, W can be fixed at any number regardless of the number of users. By fixing W, the encoder/decoder design and the signal SNR are maintained and are not affected by the number of users; thus, the same quality of service can be provided for all users. These two features cannot be achieved by other existing codes.
5. More overlapping chips result in more crosstalk; the minimum weight achievable by the RD code is three (W = 3). Note that the RD code can be designed to support systems having W ≥ 3.
6. There is more flexibility in choosing the N and K parameters than in other codes such as the MFH and MDW codes.

2.2 Code Comparisons

Many codes have been proposed for OCDMA systems, such as optical orthogonal codes (OOC), modified frequency hopping (MFH) codes, and Hadamard codes, but the key point of the RD code is the code length N: the RD code offers better performance than the other codes
in terms of code length for the same number of users, K = 30, as shown in Table 3.5. A short code length limits the addressing flexibility of a code, while a long code length is considered a disadvantage in implementation, since either a very wide-bandwidth source or very narrow filter bandwidths are required. RD codes exist for practical code lengths that are neither too short nor too long. For example, if a chip width (filter bandwidth) of 0.6 nm is used, the OOC code requires a spectrum width of 218.4 nm and the MFH code requires 25.2 nm, whereas the RD code requires only 19.8 nm; the RD code thus needs a narrower bandwidth than the other codes. On the other hand, too short a code may not be desirable because it limits the flexibility of the code.

Code         No. of users K   Weight W   Code length N
OOC                30             4           364
Prime code         30            31           961
Hadamard           30            16            32
MFH                30             7            42
RD code            30             3            33

Table 3.5. Comparison between RD, MFH, OOC, Hadamard and Prime codes for the same number of users, K = 30.
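The spectrum-width figures quoted above follow directly from multiplying each code length N by the 0.6 nm chip width; a one-line Python check (illustrative only):

```python
# Required source spectrum width = code length N x chip width (0.6 nm here).
for name, N in [("OOC", 364), ("MFH", 42), ("RD", 33)]:
    print(f"{name}: {N * 0.6:.1f} nm")   # 218.4, 25.2 and 19.8 nm
```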
3. Detection Schemes of OCDMA
In OCDMA systems, the detection scheme affects the design of transmitters and receivers. In general, there are two basic detection techniques, namely coherent and incoherent (Prucnal, 2005). An OCDMA communication system can be all-optical or partly optical, and the information bits may be originally optical or electrical. The all-optical CDMA system is usually an incoherent system. A system whose signature codes consist of unipolar sequences is called an incoherent system, while a system that uses bipolar codewords is called a coherent system. Since coherent detection is phase sensitive, such techniques are of course more difficult to implement than incoherent ones. In this work, incoherent SAC-OCDMA detection schemes are considered. Based on the RD code construction, a new detection scheme, called the spectral direct detection scheme, is proposed.
3.1 Spectral direct detection technique
The setup of the proposed RD system using the direct detection technique (known as spectral direct detection) is shown in Figure 3. The figure illustrates the implementation of the RD code using the direct detection scheme, whereby only one pair of decoder and detector is required, as opposed to two pairs in the complementary subtraction technique. There is also no subtraction process involved. This is achievable for the simple reason that the information is assumed to be adequately recoverable from any of the chips that do not overlap with chips from other code sequences. Thus the decoder only needs to filter the clean chips at the data segment; these chips are directly detected by a photodiode, as in the normal intensity modulation/direct detection scheme. This technique successfully eliminates the MAI because only the wanted signal's spectral chips at the data segment are filtered in the optical domain. This is made possible because the code properties guarantee one clean
signal chip for each channel at the RD data segment. Subsequently, the phase-induced intensity noise (PIIN) is suppressed at the receiver, and the system performance is thus improved. Codes that possess non-overlapping spectra, such as the RD code, can generally be supported by this detection scheme. It is also important to note that the whole code spectrum still needs to be transmitted to maintain the addressing signature; this distinguishes the technique from wavelength division multiplexing (WDM) technologies.
Fig. 3. Implementation of RD code using the spectral direct detection technique
3.2 RD code detection scheme compared with other SAC-OCDMA codes
In recent years, several codes have been proposed for incoherent OCDMA networks, for example the Modified Frequency Hopping (MFH), Modified Quadratic Congruence (MQC), and Enhanced Double Weight (EDW) codes (Hasoon, Aljunid, Anuar, Mohammad, & Shaari, 2007; X. Wei et al., 2001; Z. Wei & Ghafouri-Shiraz, 2002a). All these codes are based on SAC employing complementary detection techniques (known as complementary subtraction). The transceiver design for the RD code is based on the direct detection technique. Figure 4 shows the setup of the proof-of-principle simulation for the proposed scheme; a simple schematic block diagram consisting of four users is illustrated. A fibre Bragg grating (FBG) spectral phase decoder is used to decode the code at the data level. The decoded signal is then detected by a photo-detector (PD), followed by a low-pass filter (LPF) and an error detector, respectively.
Fig. 4. Simulation setup of the proposed encoding/decoding scheme: a) transmitter, b) receiver employing the direct detection scheme (RD code), c) other SAC codes employing the complementary detection technique
Figure 4c shows other SAC-OCDMA codes employing the complementary detection technique, such as the MFH, MQC and DW codes. As shown in Figure 4c, two sets of decoders are used, a decoder and a complementary decoder, followed by two sets of detectors, correspondingly. The purpose of this setup is to cancel the MAI between the wanted and the unwanted codes. The wanted codes are filtered by the corresponding encoder, while the unwanted codes are cancelled by the complementary decoder. The subtraction (the complementary subtraction) is performed after the detectors. In this scheme, it is important to ensure that the total optical power received by both photodiodes is balanced, so that the subtraction of the unwanted overlapping channels can be done completely. This is not easily achieved, considering the imperfect and non-uniform responses of the components (such as the losses of the Bragg gratings and couplers), particularly under varying environmental conditions. In the spectral direct detection technique, shown in Figure 4b, only a single decoder and a single detector are required, and no subtraction is needed for the detection. This is achievable for the simple reason that the information is adequately recovered from any of the chips that do not overlap with chips from other code sequences. Thus, the decoder only needs to filter the clean chips, and these are easily detected by the photodiode, as in the normal intensity modulation/direct detection scheme.
3.3 Mathematical analysis of RD code using direct detection technique
For the proposed system analysis, the incoherent intensity noise (σ_I), as well as the shot noise (σ_sh) and thermal noise (σ_T) in the photodiode, are considered. The detection scheme for the proposed system is based on direct detection using an optical filter followed by a photodetector. The Gaussian approximation is used for the calculation of the BER. The SNR is calculated at the receiver side; for each user there is only one photodiode, and the current flowing through it is denoted by I. Let C_K(i) denote the ith element of the Kth RD code sequence. According to the properties of the RD code, which is constructed from a code segment and a data segment, the cross-correlation can be expressed as:

\sum_{i=1}^{N} C_K(i)\,C_l(i) =
\begin{cases}
W, & \text{for } K = l \\
1, & \text{for } K \neq l \text{ and } l = K \pm 1
\end{cases}
\qquad (13)
But for the RD code, λc = 0 at the data segment on the receiver side; thus the properties of the RD code there are expressed as

\sum_{i=1}^{N} C_K(i)\,C_l(i) =
\begin{cases}
1, & \text{for } K = l \\
0, & \text{for } K \neq l
\end{cases}
\qquad (14)
When a broadband pulse is input into the group of FBGs, the incoherent light fields are mixed and incident upon a photodetector; the phase noise of the fields causes an intensity
noise term in the photodetector output. The coherence time of a thermal source (τc) is given by [8]
\tau_c = \frac{\int_0^{\infty} G^2(v)\,dv}{\left[\int_0^{\infty} G(v)\,dv\right]^2}
\qquad (15)
where G(v) is the single-sideband power spectral density (PSD) of the source. The Q-factor provides a qualitative description of optical receiver performance, which depends on the signal-to-noise ratio (SNR); the Q-factor suggests the minimum SNR required to obtain a specific BER for a given signal. The SNR of an electrical signal is defined as the ratio of the average signal power to the noise power, SNR = I²/σ², where σ² is the variance of the noise sources (note: the effects of the receiver's dark current and amplification noise are neglected in the analysis of the proposed system), given by σ² = σ²_sh + σ²_I + σ²_T, which can also be written as:

\sigma^2 = 2eIB + I^2 B \tau_c + \frac{4 K_B T_n B}{R_L}
\qquad (16)
where
e = the electron's charge,
I = the average photocurrent,
I² = the power spectral density of I,
B = the noise-equivalent electrical bandwidth of the receiver,
K_B = Boltzmann's constant,
T_n = the absolute receiver noise temperature, and
R_L = the receiver load resistor.
In Eq. (16), the first term results from the shot noise, the second term denotes the effect of phase-induced intensity noise (PIIN) [9,8], and the third term represents the effect of thermal noise. The total effect of PIIN and shot noise obeys a negative binomial distribution (Zheng & Mouftah, 2004). To analyze the system with transmitter and receiver, we used the same assumptions as in (X. Wei et al., 2001; Z. Wei & Ghafouri-Shiraz, 2002a, 2002b); they are important for mathematical simplicity, and without them it is difficult to analyze the system. We assume the following:
1- Each light source is ideally unpolarized and its spectrum is flat over the bandwidth [v_o − Δv/2, v_o + Δv/2], where v_o is the central optical frequency and Δv is the optical source bandwidth in hertz.
2- Each power spectral component has identical spectral width.
3- Each user has equal power at the receiver.
4- Each bit stream from each user is synchronized.
Based on the above assumptions, we can easily analyze the system performance using the Gaussian approximation. The power spectral density of the received optical signals can be written as (X. Wei et al., 2001):
r(v) = \frac{P_{sr}}{\Delta v} \sum_{k=1}^{K} d_k \sum_{i=1}^{N} c_k(i)
\left\{ u\!\left[v - v_o - \frac{\Delta v}{2N}(-N + 2i - 2)\right]
- u\!\left[v - v_o - \frac{\Delta v}{2N}(-N + 2i)\right] \right\}
\qquad (17)

where P_sr is the effective power of a broadband source at the receiver, d_k is the data bit of the kth user that is "1" or "0", and u(v) is the unit step function expressed as:

u(v) =
\begin{cases}
1, & v \ge 0 \\
0, & v < 0
\end{cases}
\qquad (18)
From Eq. (17), the power spectral density at the photodetector of the lth receiver during one bit period can be written as:

G(v) = \frac{P_{sr}}{\Delta v} \sum_{k=1}^{K} d_k \sum_{i=1}^{N} C_k(i)\,C_l(i)
\left\{ u\!\left[v - v_0 - \frac{\Delta v}{2N}(-N + 2i - 2)\right]
- u\!\left[v - v_0 - \frac{\Delta v}{2N}(-N + 2i)\right] \right\}
\qquad (19)
Eq. (19) can be simplified further as follows:

G(v) = \frac{P_{sr}}{\Delta v} \sum_{k=1}^{K} d_k \sum_{i=1}^{N} C_k(i)\,C_l(i)\; u\!\left(\frac{\Delta v}{N}\right)
\qquad (20)

In Eq. (20), d_k is the data bit of the kth user, which carries the value of either "1" or "0".
Consequently, the photocurrent I can be expressed as:

I = \Re \int_{0}^{\infty} G(v)\,dv
\qquad (21)

where \Re is the responsivity of the photodetectors, given by \Re = \eta e / (h v_c) [10]. Here, η is the quantum efficiency, e is the electron's charge, h is Planck's constant, and v_c is the central frequency of the original broadband optical pulse. Thus

I = \Re \int_{0}^{\infty} G(v)\,dv
= \Re \int_{0}^{\infty} \frac{P_{sr}}{\Delta v} \sum_{k=1}^{K} d_k \sum_{i=1}^{N} C_k(i)\,C_l(i)\; u\!\left(\frac{\Delta v}{N}\right) dv
\qquad (22)

By using the properties of the RD code, Eq. (22) becomes

I = \frac{\Re\, W P_{sr}}{N}
\qquad (23)
The power spectral density for I² can be expressed as

I^2 = \Re^2 \int_{0}^{\infty} G^2(v)\,dv
= \frac{\Re^2 P_{sr}^2}{N\,\Delta v} \sum_{i=1}^{N} C_l(i)
\left[\sum_{k=1}^{K} d_k C_k(i)\right] \left[\sum_{m=1}^{K} d_m C_m(i)\right]
\qquad (24)

When all the users are transmitting bit "1", using the average value \sum_{k=1}^{K} C_k(i) = KW/N and the properties of the RD code, we obtain the power spectral density for I² as in Eq. (25):

I^2 = \frac{\Re^2 P_{sr}^2\, K W (K + 1 - W)}{2\,\Delta v\, N^2}
\qquad (25)
Noting that the probability of sending bit "1" at any time for each user is 1/2, from Eq. (23) and Eq. (25) we get the noise power ⟨σ²⟩ as

\sigma^2 = 2eB\,\frac{\Re\, W P_{sr}}{N}
+ \frac{B\,\Re^2 P_{sr}^2\, K W (K + 1 - W)}{2\,\Delta v\, N^2}
+ \frac{4 K_B T_n B}{R_L}
\qquad (26)
Finally, from Eq. (23) and Eq. (26), we can calculate the average SNR as in Eq. (27):

\mathrm{SNR} = \frac{\left(\Re\, P_{sr} W / N\right)^2}
{\dfrac{2 e B\, \Re\, W P_{sr}}{N}
+ \dfrac{B\, \Re^2 P_{sr}^2\, W K (K + 1 - W)}{2 N^2 \Delta v}
+ \dfrac{4 K_B T_n B}{R_L}}
\qquad (27)

where P_sr is the effective power of a broadband source at the receiver. The bit-error rate (BER), or probability of error P_e, is estimated using the Gaussian approximation (Goodman, 2005) as

P_e = \frac{1}{2}\, \mathrm{erfc}\!\left(\sqrt{\frac{\mathrm{SNR}}{8}}\right)
\qquad (28)
In the numerical analysis, Psr = −10 dBm is the optical received power, B = 311 MHz is the receiver's noise-equivalent electrical bandwidth, η = 0.6 is the photodetector quantum efficiency, Δv = 3.75 THz is the linewidth of the broadband source, λ = 1550 nm is the operating wavelength, Tn = 300 K is the receiver noise temperature, RL = 1030 Ω is the receiver load resistance, and W, N, and K are the code weight, code length, and total number of active users, respectively, these being the parameters of the RD code itself.
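As a numerical illustration, the minimal Python sketch below evaluates Eqs. (27)-(28) with the parameter values quoted above. Note that the equation forms follow the reconstruction given in this section, and the operating point K = 30 is an arbitrary choice for the example, so the printed value is indicative only.

```python
import math

def rd_ber(K, W=3, Psr_dBm=-10.0, B=311e6, eta=0.6,
           dv=3.75e12, lam=1550e-9, Tn=300.0, RL=1030.0):
    """BER of the RD code under shot, PIIN and thermal noise, Eqs. (27)-(28)."""
    e, h, c, kB = 1.602e-19, 6.626e-34, 3.0e8, 1.38e-23
    N = K + 2 * W - 3                        # code length, Eq. (12)
    Psr = 1e-3 * 10 ** (Psr_dBm / 10.0)      # received power, dBm -> W
    R = eta * e / (h * (c / lam))            # responsivity = eta*e/(h*vc)
    I = R * W * Psr / N                      # photocurrent, Eq. (23)
    shot = 2 * e * B * I                     # shot-noise term
    piin = B * R**2 * Psr**2 * K * W * (K + 1 - W) / (2 * dv * N**2)
    thermal = 4 * kB * Tn * B / RL           # thermal-noise term
    snr = I**2 / (shot + piin + thermal)     # Eq. (27)
    return 0.5 * math.erfc(math.sqrt(snr / 8.0))  # Eq. (28)

print(rd_ber(K=30))  # BER at an example operating point of 30 active users
```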
4. Simulation Results
Using Eq. (28), the BER of the RD code is compared mathematically with other codes that use similar techniques. Figure 5 shows the relation between the number of users and the BER for the RD, MFH and Hadamard codes, for different values of K (number of active users). The performance of the RD code is better than that of the others even though its weight, 7 in this case, is far lower. The maximum acceptable BER of 10^-9 is achieved by the RD code with 125 active users, compared with 85 for the MFH code. This is good considering the small weight used, and is evident from the fact that the RD code has zero cross-correlation, while the Hadamard code has a cross-correlation that increases with the number of users. A few code-specific parameters were chosen based on the published results for these practical codes (Aljunid, Ismail, & Ramil, 2004; X. Wei et al., 2001). The calculated BER for RD was obtained for W = 7, while those for the MFH, MQC and Hadamard codes were for W = 14, W = 8 and W = 128, respectively.
Fig. 5. BER versus number of users for different SAC-OCDMA codes when Psr = −10 dBm.
In OCDMA systems, phase-induced intensity noise (PIIN) is related to multiple access interference (MAI) due to the overlapping of spectra from different users. Here, the relation between PIIN and received power is analyzed. It was shown previously that MAI can be almost entirely removed thanks to the zero cross-correlation at the data segment, so the effect of MAI is not elaborated further. Figure 6 shows the relation between PIIN and the received power (Psr). The values of B, K and Δv are fixed (i.e., B = 311 MHz, K = 14 and Δv = 3.75 THz) while Psr is varied from −30 dBm to 30 dBm. When the received power increases, the PIIN for the
MFH, MQC, Hadamard and RD codes increases linearly. The PIIN of the RD code family is lower than that of the MFH and MQC codes. As shown in Figure 6, the PIIN can be effectively suppressed by using the RD code family. This is because of the superior properties of the RD code family, such as zero cross-correlation at the data level, whereas the Hadamard code has a cross-correlation that increases with the number of users. The performance of the RD code is better than the others even though its weight, 3 in this case, is far lower.
Fig. 6. PIIN versus Psr for different OCDMA codes for the same number of active users (K = 14); code weights RD W = 3, MFH W = 5, MQC W = 6, and Hadamard W = 8.
Figure 7 shows the bit-error-rate variation with the effective power Psr for K = 80 simultaneous users, with W = 7 and W = 14 for the RD and MFH codes, respectively. The solid lines represent the BERs taking into account the effects of intensity noise, shot noise and thermal noise; the dashed lines indicate the BER performance when only intensity and shot noise are considered. It is shown that when Psr is low, the effect of intensity noise becomes minimal, and hence thermal noise becomes the main factor limiting the system performance. It is also shown that thermal noise has a much stronger effect than intensity noise at the same Psr.
Fig. 7. BER performance versus effective power (Psr) for the RD and MFH codes when the number of simultaneous users is 80.
In Table 1, the important properties of the RD, MFH and Hadamard codes are listed. The table shows that RD codes exist for any natural number n, while Hadamard codes exist only for matrix orders m, where m must be at least two; MFH codes, on the other hand, exist only for prime numbers q. The number of users supported by the RD code is equal to n, whereas for the Hadamard and MFH codes the number of supported users depends on m and q, respectively, which in turn alters the value of the weight W. This affects both the design of the encoder-decoder and the SNR of the existing codes in use. In contrast, for RD codes n can be fixed at any even or odd number regardless of the number of users. From this table, RD codes have zero cross-correlation, while the Hadamard code's cross-correlation increases as the number of users increases. For MFH codes, although the cross-correlation is fixed at 1, the BER is higher than for the RD code; MFH needs a higher q (and hence W) to increase the SNR, as shown in Figure 5.
Code        Existence   Length N      Weight W    Size K      Cross-correlation
RD          any n       K + 2W - 3    any W       K = n       0
MFH         prime q     q^2 + q       q + 1       q^2         1
Hadamard    m >= 2      2^m           2^(m-1)     2^m - 1     2^(m-2)
Table 1. Comparison between RD, MFH and Hadamard codes
Figure 8 shows that the BER increases as the fiber length increases for the different techniques. The tests were carried out at rates of 2.5 and 5 Gb/s over a 30-km distance with ITU-T G.652 standard single-mode optical fiber (SMF). The attenuation (0.25 dB/km), dispersion (18 ps/nm·km) and nonlinear effects were all activated and specified according
to typical industry values to simulate the real environment as closely as possible. The number of active users is four at the 2.5 Gbps and 5 Gbps bit rates. The effect of varying the fiber length is related to the level of the received power: a longer fiber has a higher insertion loss and thus a smaller output power. Conversely, when the fiber length decreases, the data rate can be increased for a similar degradation of the signal form. Thus, in order to design and optimize the link parameters, the maximum fiber length should be kept as short as possible, to obtain a high data rate and achieve the desired system performance; otherwise, to reduce the MAI limitations, the data rate must be decreased in an OCDMA analysis. For a given code length N, the chip rate DC can be expressed as DC = D·N (a small worked example follows this discussion). Hence, as D increases (D > 2.5 Gbps) the chip rate increases and the chip duration decreases; as a result, the signal becomes more susceptible to dispersion. Accordingly, the optical power contained in an emitted chip "1" suffers optical signal broadening, and this signal is then received over several chips. In communication systems, intersymbol interference (ISI) is a distortion of the signal that causes previously transmitted data to affect the received data; the ISI effect results in more chips containing non-zero optical power than expected. As for conventional decisions, we selected the decision threshold level S = W·Pcen, where Pcen is the optical power level corresponding to the chip center. Thus, a transmitted "1" is always well detected; the only error that can occur in this situation is when the data sent is "0", as in the ideal case. In terms of fiber length, the dispersion effect increases as the fiber length increases. However, for this particular chip duration, dispersion has no impact on the BER for optical fibers shorter than 20 km; on the other hand, when the fiber length is greater than 20 km, the system performance deteriorates. In this particular system, the direct technique can support a higher number of users than the conventional technique because the number of filters at the receiver is reduced, giving a smaller power loss. Note that the very low BER values are just a measure of the quality of the received signals as calculated by the simulator; practically speaking, they may not be very meaningful.
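A small worked example of the chip-rate relation above (a sketch only; the code length N = 7 is hypothetical, obtained from Eq. (12) for four users with W = 3):

```python
# Chip rate D_C = D * N, and the corresponding chip duration.
N = 7                     # hypothetical RD code length (K=4, W=3 in Eq. (12))
for D in (2.5e9, 5e9):    # bit rates used in the tests (bit/s)
    Dc = D * N
    print(f"D = {D/1e9:.1f} Gb/s -> D_C = {Dc/1e9:.1f} Gchip/s, "
          f"chip duration = {1e12/Dc:.0f} ps")
```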
Fig. 8. Variation of BER as a function of fiber length using direct (RD code) and complementary techniques (MFH code) at different transmission rates.
The eye pattern diagrams for the RD code (employing the spectral direct detection scheme) and the MFH code (employing the complementary detection scheme) are shown in Figure 9. The eye diagrams clearly show that the RD code system gives better performance, having a larger eye opening. The figure also shows the corresponding simulated BER for the RD and MFH code systems. The vertical distance between the top of the eye opening and the maximum signal level gives the degree of distortion: the more the eye closes, the more difficult it is to distinguish between 1s and 0s in the signal. The height of the eye opening at the specified sampling time shows the noise margin, or immunity to noise (Keiser, 2003).
Fig. 9. Eye diagram of (a) one of the RD channels (W = 3) and (b) one of the MFH channels (W = 8), at 10 Gbit/s.
5. Conclusion
In this chapter, the development, characteristics and performance analysis of a new optical code structure for spectral-amplitude-coding OCDMA systems have been presented. The RD code provides simple matrix constructions compared with other SAC-OCDMA codes such as the Hadamard, MQC and MFH codes. This code possesses numerous advantages, including efficient and easy code construction, a simple encoder/decoder design, existence for every natural number n, and zero cross-correlation at the data level (λc = 0). The properties of this code have been described and discussed with the related equations. The advantages of the RD code can be summarized as follows: (1) shorter code length; (2) no cross-correlation at the data level; (3) the data level can be replaced with any type of code; (4) more overlapping chips result in more crosstalk; and (5) flexibility in choosing the N and K parameters over other codes such as the MFH code. The RD code can be used to effectively improve the overall performance of spectral amplitude coding in next-generation OCDMA technology. The cross-correlation functions of the signature codes or sequences have very much to do with the detection construction of all future OCDMA applications; the reason is obvious, since traditional OCDMA signal detection becomes very difficult if the cross-correlation
properties are not under control. In this chapter, a new detection technique known as spectral direct detection has been proposed for SAC-OCDMA systems, and its performance was evaluated based on the RD code. The simulation setup and mathematical equations were derived in terms of the variance of the noise sources (shot noise, PIIN and thermal noise) and the BER. This is achieved by virtue of the elimination of MAI and PIIN through selecting only the data segment from the SAC signal of the intended code sequence. The overall system cost and complexity can be reduced because of the smaller number of filters used in the detection process.
6. References
Aljunid, S. A., Ismail, M., & Ramil, A. R. (2004). A new family of optical code sequences for spectral-amplitude-coding optical CDMA systems. IEEE Photonics Technology Letters, 16, 2383-2385.
Chung, H., Salehi, J., & Wei, V. K. (1989). Optical orthogonal codes: design, analysis, and applications. IEEE Transactions on Information Theory, 35(3), 595-605.
Goodman, J. W. (2005). Statistical Optics. New York: Wiley.
Hasoon, F. N., Aljunid, S. A., Anuar, M. S., Mohammad, K. A., & Shaari, S. (2007). Enhanced double weight code implementation in multi-rate transmission. IJCSNS International Journal of Computer Science and Network Security, 7(12).
Hui, J. Y. (1985). Pattern code modulation and optical decoding: a novel code division multiplexing technique for multifiber networks. IEEE J. Select. Areas Commun., 3, 916-927.
Kavehrad, M., & Zaccarin, D. (1995). Optical code-division-multiplexed systems based on spectral encoding of noncoherent sources. Journal of Lightwave Technology, 13, 534-545.
Keiser, G. (2003). Optical Communication Essentials. USA: McGraw-Hill Networking.
Kwong, C. W., & Yang, G. C. (1995). Construction of 2^n prime-sequence codes for optical code division multiple access. IEE Proceedings Communications, 142, 141-150.
Maric, S. V. (1993). New family of algebraically designed optical orthogonal codes for use in CDMA fiber-optic networks. IEEE Electron. Lett., 29, 538-539.
Maric, S. V., Hahm, M. D., & Titlebaum, E. D. (1995). Construction and performance analysis of a new family of optical orthogonal codes for OCDMA fiber-optic networks. IEEE Trans. Commun., 43, 485-489.
Maric, S. V., Moreno, O., & Corrada, C. (1996). Multimedia transmission in fiber-optic LANs using optical CDMA. J. Lightwave Technol., 14, 1-5.
Prucnal, P. R. (2005). Optical Code Division Multiple Access: Fundamentals and Applications. Taylor & Francis.
Salehi, J. A. (1989). Code division multiple-access techniques in optical fiber networks - Part I: Fundamental principles. IEEE Trans. Commun., 37, 824-833.
Wei, X., Shalaby, H. M. H., & Ghafouri-Shiraz, H. (2001). Modified quadratic congruence codes for fiber Bragg-grating-based spectral-amplitude-coding optical CDMA systems. J. Lightwave Technol., 19, 1274-1281.
Wei, Z., & Ghafouri-Shiraz, H. (2002a). Codes for spectral-amplitude-coding optical CDMA systems. J. Lightwave Technol., 20, 1284-1291.
Wei, Z., & Ghafouri-Shiraz, H. (2002b). Unipolar codes with ideal in-phase cross-correlation for spectral-amplitude-coding optical CDMA systems. IEEE Trans. Commun., 50, 1209-1212.
Zheng, J., & Mouftah, H. T. (2004). Optical WDM Networks: Concepts and Design Principles. Hoboken, NJ: Wiley.
17
Machine Vision System for Automatic Weeding Strategy in Oil Palm Plantation using Image Filtering Technique
Kamarul Hawari Ghazali¹, Mohd. Marzuki Mustafa² and Aini Hussain²
¹Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang, Kuantan, Malaysia
²Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Malaysia
1. Introduction
Weeding strategy in oil palm plantations plays a significant role in ensuring greater production yields (Azman et al., 2004). Most plantation companies are very concerned with weeding practice, to guarantee that their crop produces higher yields as well as to maintain oil quality. The current weeding practice in oil palm plantations relies on labourers who carry backpacks of herbicide and manually spray the weeds in the field. This manual practice is known to be very inefficient and dangerous to human beings. At present, herbicides are applied uniformly across the field, even though researchers have shown that the spatial distribution of weeds is non-uniform. If there were means to detect and identify the non-uniformity of the weed spatial distribution, it would be possible to reduce herbicide quantities by applying them only where weeds are located (Lindquist et al., 1998; Manh et al., 2001). Consequently, an intelligent system for an automated weeding strategy is greatly needed to replace the manual spraying system, one that protects the environment and at the same time produces better and greater yields. Appropriate spraying technology and decision support systems for the precision application of herbicides are available, and potential herbicide savings of 30% to 75% have been demonstrated (Heisel et al., 1999). The proposed method of automatic weeding strategy, using a machine vision system, is to detect the existence of weed as well as to distinguish its type. The core component of the machine vision system is an image processing technique that can detect and discriminate between the types of weed; two types are distinguished, namely narrow and broad. Machine vision methods are based on digital images, within which geometrical features, spectral reflectance or absorbance patterns are utilized to discriminate between narrow and broad weed. Several methods have been used by researchers in developing machine vision systems for herbicide sprayers. A method that utilises shape features to discriminate between corn and weeds was proposed by Meyer (1994). Other studies classified the scene by means of colour information (Cho et al., 1998). In (Victor et al., 2005), a statistical
approach was used to analyze the weed presence in cotton fields; it was reported that the statistical approach gave very weak detection, with a 15% false detection rate. In this work, we report the image processing techniques that have been implemented, focusing on the filtering technique as well as feature vector extraction and selection for the weed images. Filtering techniques are closely related to edge detection. Selective edge detection and suppression of noise have usually been achieved by varying the filter size: small filter sizes preserve high-resolution details but consequently admit inhibitive amounts of noise, while larger sizes effectively suppress noise but only preserve low-resolution details (Basu, 2002). A multi-scale filter function was proposed as a solution for effective noise reduction; it involves the combination of edge maps generated at different filter sizes, and can be used to obtain clear edge detection. Other research by Canny (1986), Deriche (1987), Shen (1996) and Pantelic (2007) found that the simplest way to detect object edges in an image is by using low and high pass filters. As mentioned earlier, the weeding strategy is an important issue in the palm plantation industry, to ensure that palm oil production meets quality control standards. In this work, we focus on the weed types commonly found in oil palm plantations, classified and identified as narrow and broad weed. Figure 1 shows the narrow and broad weed types in different image conditions. These images are processed using filtering techniques, and their features are extracted using the continuity measure feature extraction method.
Fig. 1. Images of narrow and broad weed to be classified
2. Methodology
The advancement of digital image processing provides an easy way to process weed images. There are many image processing techniques that can be used for the detection and classification of weed; one of the most common is filtering (Graniatto, 2002). Ordinary filters such as the low pass, high pass, Gaussian, Sobel and
Prewitt filters are used to enhance the raw image as well as to remove unwanted signals from the original image (Gonzales, 1992). These existing filters were designed for general purposes and should therefore be modified to suit the targeted application. In this paper, low pass and high pass filters are used to analyze the weed images, and we propose a new feature extraction technique to produce a set of feature vectors representing narrow and broad weed for classification. Figure 2 shows a block diagram of the overall methodology of the proposed image processing technique.
Fig. 2. Methodology of image processing using the filter technique
As the first step, the weed image is captured using a digital camera in RGB format at a resolution of 240×320. The captured (offline) image is then converted to a grayscale image to reduce the image array size; in the image processing part, it is easier to process the pixels of a two-dimensional gray image than a three-dimensional RGB array. Additionally, the filtering technique involves the convolution method, which is very fast over two-dimensional arrays, and it produces an output that can be used by the feature extraction method. The filtering operation does not change the size of the image data (240×320); it only removes the unwanted signal designated by the filter function (low pass, high pass, etc.). Such a large volume of data is difficult to analyze and process for the purpose of representing the target object, so it is important to reduce the size of the filter output data by applying a feature extraction technique. The continuity measure (CM) feature extraction technique is proposed to extract features from, and reduce the size of, the filter output pixel values. The final stage in the image processing method is to classify the weed according to its type: narrow or broad. The algorithms developed in the image processing method can be considered the brain of the machine vision system; they detect, discriminate and classify the target object so the system can be deployed in a real application. The methodology above describes the development of the software engine. For implementation in a real application, the software engine needs to interface with a mechanical structure that responds to the detection signal, and an electronic interface circuit is used to ensure that the data can be transferred efficiently. A prototype of the real-time sprayer structure for the weeding
strategy is shown in Figure 3. The mechanical structure is equipped with a webcam, two tanks carrying different types of herbicide, an agricultural liquid sprayer, and an electronic board for interfacing with the software engine.
Fig. 3. The sprayer structure of the automated weeding strategy
3. Filtering Technique The basic concept of the filter technique is to remove noise as well as to detect the edges of an object. Smooth or sharp transitions in an image contribute significantly to the low and high frequency content in the image. A two dimensional ideal low pass filter (also known as smoothing spatial filter) is shown below;
H(u, v) =
\begin{cases}
1, & \text{if } D(u, v) \le D_0 \\
0, & \text{if } D(u, v) > D_0
\end{cases}
\qquad (1)

where D_0 is a specified nonnegative quantity and D(u, v) is the distance from the point (u, v) to the origin of the frequency plane. A two-dimensional ideal high pass filter is one whose transfer function satisfies the relation
H(u, v) =
\begin{cases}
0, & \text{if } D(u, v) \le D_0 \\
1, & \text{if } D(u, v) > D_0
\end{cases}
\qquad (2)
where D_0 is the cutoff distance measured from the origin of the frequency plane and D(u, v) is the distance from the point (u, v) to the origin of the frequency plane. The above low and high pass filters have been modified so that they can be applied to weed images in order to detect the weed type; they have been defined with different scaling sizes of the filter function, as described below. Five different scales of the low
and high pass filters (shown in Figure 4) have been tested on the weed images to find the best scaling factor for edge detection. Different scale sizes were used to find the edge detection that contributes the highest weed-classification efficiency.
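To make the construction concrete, here is a minimal Python/NumPy sketch of the ideal filters of Eqs. (1)-(2) applied in the frequency domain. It is an illustration under the definitions above; the cutoff D0 = 30 is an arbitrary choice standing in for the scaling factors discussed in the text.

```python
import numpy as np

def ideal_filter(shape, D0, lowpass=True):
    """Build H(u, v) per Eqs. (1)-(2): pass radii within D0 (or outside)."""
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)  # distance from origin
    H = (D <= D0).astype(float)
    return H if lowpass else 1.0 - H

def filter_image(img, H):
    """Apply a transfer function H to a grayscale image via the 2-D FFT."""
    F = np.fft.fftshift(np.fft.fft2(img))
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

# Example: high-pass filtering a 240x320 grayscale frame to emphasise edges.
img = np.random.rand(240, 320)          # placeholder for a weed image
edges = filter_image(img, ideal_filter(img.shape, D0=30, lowpass=False))
```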
Fig. 4. Five scales of the low and high pass filter functions
Implementing these filters on an image produces an output in integer format, as shown in the simple example in Figure 5. A 2×3 image matrix was created to contain a horizontal straight-line edge. A horizontal filter was applied and produced an integer output matrix of the same size as the input image. The output values show that the horizontal edge has been detected, with integer values −9 and 10. This is also demonstrated by the image sample in Figure 6, in which a straight-line edge has been detected. In this study, horizontal and vertical edges
are very important features, as the shapes of narrow and broad weed take these forms. Narrow weeds are expected to have more vertical and horizontal edges than broad-weed images. This unique feature can be used to distinguish between narrow and broad types of weed (a small convolution sketch is given below).
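The horizontal-edge behaviour described around Figure 5 can be reproduced with a spatial convolution; the kernel below is an assumed horizontal difference filter chosen for illustration, since the chapter does not list the exact filter coefficients.

```python
import numpy as np
from scipy.signal import convolve2d

kernel = np.array([[-1, -1, -1],
                   [ 1,  1,  1]])   # responds strongly to horizontal edges

img = np.array([[3, 3, 3],          # small matrix containing a horizontal
                [0, 0, 0]])         # straight-line edge, as in Figure 5

print(convolve2d(img, kernel, mode='same'))  # large magnitudes mark the edge
```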
Fig. 5. Filter output of a simple image matrix
Fig. 6. Filter output of a straight-line image
Another step was taken to process the output of the filter in order to reduce the size of the output matrix. As discussed above, both weed types have unique features that can be used to discriminate between them. These features can be extracted by applying the continuity measure (CM) algorithm, which was developed based on the filter output shown in Figure 7. We found that the filter was able to detect straight lines representing the edges of narrow weed inside the image frame. The continuity measure technique is illustrated in Figure 8: neighbourhood pixel values are measured by checking their continuity over 3, 5, 7 or 10 pixels at different angles. The CM feature extraction method can be described as follows (a minimal sketch follows the steps below):
 Measure the continuity of pixel values using a continuity of 3, 5, 7 or 10.
 If there is no continuity, the pixel value is set to zero.
 For example, if continuity = 3 with angle 0°, the following steps are taken:
 If Xn & Xn+1 & Xn+2 = 1, the pixel values remain 1.
 If Xn & Xn+1 & Xn+2 ≠ 1, all the pixel values are set to 0.
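A minimal NumPy sketch of these steps for continuity = 3 along the 0° (row) direction is shown below; other continuity lengths and angles follow the same pattern. This is an illustrative reading of the steps above, not the authors' exact implementation.

```python
import numpy as np

def continuity_measure(edges, length=3):
    """Keep only pixels that belong to a run of `length` ones along a row."""
    out = np.zeros_like(edges)
    rows, cols = edges.shape
    for r in range(rows):
        for c in range(cols - length + 1):
            if edges[r, c:c + length].all():   # X_n & X_{n+1} & X_{n+2} = 1
                out[r, c:c + length] = 1       # continuity found: keep pixels
    return out

edges = np.array([[1, 1, 1, 0, 1],
                  [0, 1, 0, 1, 1]])
print(continuity_measure(edges))
# Only the run of three ones in the first row survives; the rest become 0.
```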
Fig. 7. Filter output of narrow images at different scales
Fig. 8. Continuity measurement technique of the output filter
The feature vector is expected to differ significantly between narrow and broad weed, based on their respective filter outputs; the feature vector obtained from the CM technique has a different value for narrow and for broad weed. Figure 9 shows a plot of the narrow and broad feature vectors for two different filter types: the values of the horizontal (low pass) filter are plotted against the values of the vertical (high pass) filter. The narrow and broad feature vectors fall into two distinct groups, making it easy to discriminate between narrow and broad weed. A linear classification tool, y = mx + c, was used to determine the best threshold equation for classification (see the sketch below).
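The resulting decision rule is simply a line in the two-dimensional feature space; a minimal sketch follows, where the slope m and intercept c are placeholders to be fitted to the Figure 9 distribution rather than the chapter's actual values.

```python
def classify_weed(h_feature, v_feature, m=1.0, c=0.0):
    """Label a (horizontal, vertical) filter-feature pair via y = mx + c."""
    return "narrow" if v_feature > m * h_feature + c else "broad"

print(classify_weed(0.2, 0.8))   # example feature vector above the line
```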
Fig. 9. Feature vector of narrow and broad class
4. Result and Discussion
The filtering technique, together with the continuity measure feature extraction, was applied to the narrow and broad weed images. More than 500 images were tested to verify the classification performance. Figures 10 and 11 show the original images and the outputs of the low and high pass filters with the functions described in Eqs. (1) and (2). From the figures, it is clearly seen that the low and high pass filters have detected the edges of the objects in the image. The object edges, in black and white pixel values, were analyzed using the CM technique to extract the feature vectors.
Fig. 10. (a) Narrow image, (b) Output of vertical filter (c) Output of horizontal filter.
Fig. 11. (a) Weed image, (b) Output of vertical filter (c) Output of horizontal filter.
The CM technique can be applied in different ways, either by neighbourhood or by degree of continuity, and a fusion of neighbourhood and degree has been used to represent the features of the weed images; for example, continuity 3 with 45°, or continuity 4 with 90°. All fusion combinations were tested, and we found the best parameters to be 45° with continuity 5. The feature vectors of narrow and broad weed were distributed separately, as shown in Figure 12; therefore, a linear classifier can easily be obtained from the basic equation y = mx + c. The overall classification performance for broad and narrow weed is depicted in Table 1. The CM technique with an angle of 45° and scale 4 obtained the best result, with a correct classification rate of 98% for narrow and broad weed, respectively. A slight drop in performance was noted when the scale was set to 1, 2, 3 or 5 while maintaining the angle at 45°. We found that the CM technique with angle 45° and scale 4 is the most suitable, since the best classification result is achieved with these values.
5. Conclusion
In image processing analysis, filtering is usually used as an enhancement method to remove unwanted signals; however, the function of a filter can be extended to the edge detection of objects in the image frame. In this research, filters were used to analyze weed images for classification purposes. The filter scale was selected from 1 to 5 in order to find the best edge detection of the weed shape. At the feature extraction stage, a measurement of neighbourhood continuity was used to analyze the filter output. This feature extraction technique, namely the continuity measure, managed to reduce the size of the filter output, and the classification equation was successfully obtained using a linear equation. The filter technique associated with the continuity measurement was tested on a total of 500 samples of weed images to classify and distinguish their respective types. The narrow and broad weed were successfully classified, with a correct classification rate of 98% for both weed types. Finally, the image processing technique obtained from the filtering analysis, the continuity measure, and the linear equation was used to interface with the hardware system. Further work is ongoing to integrate the software and hardware systems in the machine vision technology and to test it in a real field.
Fig. 12. Distribution of feature vector with CM 45°, continuity 5 and filter scaling (a) 1, (b) 2, (c) 3 and (d) 4
Table 1. Classification rate of CM 45° at different scale and continuity
6. References
Azman, I.; Mohd, A.S. & Mohd, N. (2004). The production cost of oil palm fresh fruit bunches: the case of independent smallholders in Johor. Oil Palm Economic Journal, 3(1), pp: 1022-1030
Lindquist, J.L.; Dieleman, A.J.; Mortensen, D.A.; Johnson, G.A. & Wyse-Pester, D.Y. (1998). Economic importance of managing spatially heterogeneous weed populations. Weed Technology, 12, pp: 7-13
Manh, A.G.; Rabatel, G.; Assemat, L. & Aldon, M.J. (2001). Weed leaf image segmentation by deformable templates. Automation and Emerging Technologies, Silsoe Research Institute, pp: 139-146
Heisel, T.; Christensen, S. & Walter, A.M. (1999). Whole-field experiments with site-specific weed management. In Proceedings of the Second European Conference on Precision Agriculture, 2, pp: 759-768
Meyer, G.E.; Bargen, K.V.; Woebbecke, D.M. & Mortensen, D.A. (1994). Shape features for identifying young weed using image analysis. American Society of Agricultural Engineers, 94: 3019, 1994
Cho, S.I.; Lee, D.S. & Jeong, J.Y. (1998). Weed-plant discrimination by machine vision and artificial neural network. Biosystems Engineering, 83(3), pp: 275-280
Victor, A.; Leonid, R.; Amots, H. & Leonid, Y. (2005). Weed detection in multi-spectral images of cotton fields. Computers and Electronics in Agriculture, 47(3), pp: 243-260
Basu, M. (2002). Gaussian-based edge-detection methods - a survey. IEEE Trans. Syst. Man Cybern. C Appl. Rev., 32, pp: 252-260
Graniatto, P.M.; Navone, H.D.; Verdes, P.F. & Ceccatto, H.A. (2002). Weed seeds identification by machine vision. Computers and Electronics in Agriculture, 33, pp: 91-103
Gonzales, R.C. & Woods, R.E. (1992). Digital Image Processing. Addison-Wesley Publishing Company, New York.
Canny, J.F. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 8(6), pp: 679-698
Deriche, R. (1987). Using Canny's criteria to derive a recursively implemented optimal edge detector. The International Journal of Computer Vision, 1(2), pp: 167-187
Shen, J. (1996). On multi-edge detection. CVGIP: Graphical Models and Image Processing, 58(2), pp: 101-114
Pantelic, R.S.; Ericksson, G.; Hamilton, N. & Hankamer, B. (2007). Bilateral edge filter: photometrically weighted, discontinuity based edge detection. Journal of Structural Biology, 160, pp: 93-102
18
Technologies to Support Effective Learning and Teaching in the 21st Century
Susan Silverstone, Jack Phadungtin and Julia Buchanan
National University, USA
Introduction
The challenge for education in the 21st century is to discover and develop tools that add value to both teaching and learning. The evolving landscape of higher education is significantly affected by new technologies and changes in student demographics, and to help learners meet these rising expectations, institutions are competing to provide support that fits the needs of these diverse groups. Today's learner demographics are fragmented by age group, location, area of study and learning preference, and these groups demand customized teaching modalities to optimize educational effectiveness. Education today is no longer restricted to the traditional classroom lecture. Learning preferences differ among adults due to a number of factors, including comfort level with new technologies, attention span, and the ability to multitask. Because of emerging technologies, accessible knowledge is growing exponentially. Younger generations, native to digital media, accept new learning methods with little difficulty, since they are introduced to sophisticated technology early, in elementary education; tomorrow's students will be required to adapt to new learning styles and technologies at an even faster and earlier rate. This chapter discusses how institutions of higher education must be responsive to the needs of both today's and tomorrow's learners. The first part of the chapter outlines how the various teaching modalities may be suitable for different generations of learners. Both current and future technologies related to learning are discussed and contrasted, commonalities of learner preferences and technological literacy are summarized, and relationships are established between learner generations and preferred technologies. The second part of the chapter illustrates how each technology may benefit student learning within the major disciplines. The quality of education can be measured against expected learning outcomes; to promote student success, universities and colleges must optimize their resources to prepare graduates for the expectations of their chosen disciplines. Though technology has helped improve learning across disciplines, limitations inevitably exist: although online education has facilitated learning in some disciplines, it could be a deterrent in others. The authors discuss specific examples where
technologies help advance the learning process. Depending upon the discipline involved, resources play significant roles for both the learner and the institution.
Innovations in teaching
Dede (1998), in Six Challenges for Educational Technology, makes the following statements: "Many exciting applications of information technology in schools validate that new technology-based models of teaching and learning have the power to dramatically improve educational outcomes. As a result, many people are asking how to scale-up the scattered, successful 'islands of innovation' instructional technology has empowered into universal improvements in schooling enabled by major shifts in standard educational practices" (p. 1). He continues, "without undercutting their power, change strategies effective when pioneered by leaders in educational innovation must be modified to be implemented by typical educators. Technology-based innovations offer special challenges and opportunities in this 'scaling-up' process. I believe that systemic reform is not possible without utilizing the full power of high performance computing and communications to enhance the reshaping of education design in schools. Yet the cost of technology, its rapid evolution, and the special knowledge and skills required of its users pose substantial barriers to effective utilization" (p. 1). According to Schank and Jones (1991), "substantial research documents that helping students make sense out of something they have assimilated, but do not yet understand is crucial for inducing learning that is retained and generalized" (as cited in Dede, 1998, p. 3). Edelson, Pea and Gomez (1996) state, "Reflective discussion of shared experiences from multiple perspectives is essential in learners' converting information into knowledge, as well as in students mastering the collaborative creation of meaning and purpose" (Edelson et al., 1996). "Some of these interpretative and expressive activities are enhanced by educational devices, but many are best conducted via face-to-face interaction, without the intervening filter and mask of computer-mediated communication" (Brown and Campione, 1994). Dede (1998) poses the following question: "How can many educators disinterested or phobic about computers and communications be induced to adopt new technology-based models of teaching and learning?" (p. 6). "Thus far, most educators who use technology to implement the alternative types of pedagogy and curriculum are 'pioneers': people who see continuous change and growth as an integral part of their profession and who are willing to swim against the tide of conventional operating procedures, often at considerable personal cost. However, to achieve large-scale shifts in standard educational practices, many more educators must alter their pedagogical approaches; and schools' management, institutional structure, and relationship to the community must change in fundamental ways. This requires that 'settlers' (people who appreciate stability and do not want heroic efforts to become an everyday requirement) must be convinced to make the leap to a different mode of professional activity, with the understanding that, once they have mastered these new approaches, their daily work will be sustainable without extraordinary exertion. How can a critical mass of educators in a district be induced simultaneously to make such a shift?" (p. 6). Dede (1998) also stated that "research documents that new, technology-based pedagogical strategies result in at least four kinds of improvements in educational outcomes. Some of these gains are easy to communicate to the community; others are difficult, but together
they constitute a body of evidence that can convince most people. These four types of improvements are listed below, in sequence from the most readily documented to the hardest to demonstrate:
o Increased learner motivation
o Advanced topics mastered
o Students acting as experts do: developing in learners the ability to use problem-solving processes similar to those of experts is challenging, but provides powerful evidence that students are gaining the skills they will need to succeed in the 21st century" (p. 8).
In order to identify the generations, discuss their lifestyles with respect to communication and technology, and explore the application to teaching methods, the United States, China, Japan and the United Kingdom will be reviewed, with analysis for application to online learning and teaching.
Information on generations
From the preliminary research it is apparent that most first-world countries may be segregated into four cohorts: groups of individuals who have common characteristics, such as age, experience, location or generation. The most typical type of cohort in developmental psychology is the age or birth cohort, which may include a span of years such as the "Baby Boomer" or "Generation X" generations. These age or birth cohorts are likely to share common cultural, historical and social influences (Robinson, 2006). In this paper we will refer to the generations as cohorts and identify them as follows:
o Silent Generation/Veterans
o Baby Boomers
o Generation X
o Generation Y/Millennials
Fig. 1. Population by Generation - Source: 2005 US Census
The Silent Generation (Smith, 2009) is the generation of people born in the United States between roughly 1923 and the early 1940s. Members of this generation experienced vast cultural shifts in the United States, and many of them struggled with conflicted morals, ideas, and desires. Some members of the Silent Generation claim that they are one of the least understood generations from the 20th century, perhaps because of the relatively small
size of this generation. According to Smith (2009), "describing this generation as the 'Silent Generation' is a bit of a misnomer. In fact, many revolutionary leaders in the civil rights movement came from the Silent Generation, along with a wide assortment of artists and writers who fundamentally changed the arts in America. This generation is comparatively small when compared to the surrounding generations because people had fewer children in the 1920s and 1930s, in response to financial and global insecurity. As a result, members of the Silent Generation were uniquely poised to take advantage of economic opportunities, thanks to the reduced competition. Many of them went on to harness the scientific and technological advances of the Second World War, developing innovative inventions, which laid the groundwork for even more technological progress in the late 20th century. However, the term 'Silent Generation' is not wholly inappropriate. While some members of the Silent Generation did become outspoken activists, many were also quiet, hardworking people who focused on getting things done and advancing their careers, even as they struggled with what to do with their lives. Members of the Silent Generation were encouraged to conform to social norms, and many did so, but as later became evident this generation seethed on the inside as people coped with the growing civil rights movement, the women's liberation movement, and the explosion of the Baby Boomers. Internal conflict plagued many individuals in the Silent Generation."
Baby Boomers, born in the years from 1946 to 1964, have a unique set of beliefs and characteristics, vastly different from previous generations. According to Primo and Wedeven (1998), this encompasses all aspects of life, affecting their beliefs about self, career, home, and leisure (Strategic Solutions, 1998) (http://www.thestrategicedge.com/). Baby Boomers are more optimistic economically, largely because they did not experience the Great Depression. They are better educated; men continued in college to avoid the Vietnam War, and more women, seeking equality, sought a college education. Women of this generation worked outside the home in greater numbers, even while raising young children. Baby Boomers, especially the latter group, are more comfortable with technology, having grown up in the age of computers. They are an individualistic generation, with a focus on self and a tendency to reject authority. Hectic lifestyles are common for Baby Boomers, with their leisure time infringed upon by the various demands of life. As the Baby Boomers enter mid-life, they face normal emotional and physical transitions, yet some unusual financial and employment concerns. As Boomers age, they will face some of the standard chronic health problems and face their own mortality. Many Baby Boomers are facing layoffs as corporations downsize, rather than entering a period of economic certainty, job security, and one's peak earning years. Their financial status is even more uncertain because Baby Boomers are a spending - not a saving - generation, who liberally use credit to finance purchases. This collectively tenuous financial status is compounded by the fact that there is greater uncertainty about the availability and reliability of Social Security; with many subsidizing their own aging parents, this uncertainty is compounded even more for this generation. The highly educated Baby Boomer should support the continued growth of bookstores and high technology products, stores, and services.
o People are typically in their peak reading years in this age period anyway. This generation, especially those in the later years of the baby boom, is comfortable with computers. Baby Boomers will buy computers and hi-tech products for themselves, their children and grandchildren. High technology can be used to
sell other products and services, which will then be perceived as more fun and entertaining.
Generation X
Generation X is a term used to describe the generation born from the 1960s and 1970s to 1982, in many countries around the world. The term has become used in demography, the social sciences, and marketing, though it is most often used in popular culture. (http://www.nasrecruitment.com/TalentTips/NASinsights/GettingtoKnowGenerationX.pdf) In the United States, Generation X, or Gen X for short, was originally referred to as the "baby bust" generation because of the small number of births following the baby boom. In the United Kingdom (UK) the term was first used in a 1964 study of British youth by Jane Deverson and Charles Hamblett in their book Generation X. The term first entered popular culture in the late 1970s through the United Kingdom punk rock band Generation X, led by Billy Idol. It was later expanded on by the Canadian novelist Coupland (1991), who describes the angst of those born between roughly 1960 and 1965, who felt no connection to the cultural icons of the Baby Boom generation. According to the Encyclopedia of the Social Sciences, in continental Europe the generation is often known as Generation E, or simply as the Nineties Generation, along the lines of such other European generation names as "Generation of 1968" and "Generation of 1914." In France, the term Génération Bof is in use, "bof" being a French word for "whatever," considered by some French people to be the defining Generation-X saying. In Iran, they are called the Burnt Generation. In some Latin American countries the name "Crisis Generation" is sometimes used, due to the recurring financial crises in the region during those years. In the Communist bloc, these Generation-Xers are often known to show a deeper dislike of the Communist system than their parents, because they grew up in an era of political and economic stagnation and were among the first to embrace the ideals of Glasnost and Perestroika, which is why they tend to be called the Glasnost-Perestroika Generation. In Russia (the former USSR), in particular, they were often called "a generation of stokers and watchmen," referring to their tendency to take non-challenging jobs that left them with plenty of free time, similar to Coupland's Xers. In Finland, the X-sukupolvi is sometimes derogatorily called pullamössösukupolvi (bun mash generation) by the older Baby Boomers, who say "those whiners have never experienced any difficulties in their lives" (the recession of the early 1990s hit the Xers hardest; it hit just when they were about to join the work force), while the Xers call the Boomers kolesterolisukupolvi (cholesterol generation) due to their often unhealthy dietary habits. Japan has a generation with characteristics similar to those of Generation X, shin jin rui. Other common international influences defining Generation X across the world include: increasingly flexible and varied gender roles for women, contrasted with even more rigid gender roles for men; the unprecedented socio-economic impact of an ever-increasing number of women entering the non-agrarian economic workforce; and the sweeping cultural-religious impact of the Iranian revolution of 1979, towards the end of the 1970s. Generation X in the United States was generally marked early on by its lack of optimism for the future: nihilism, cynicism, skepticism, political apathy, alienation and distrust in traditional values and institutions. For some of this generation, Generation X thinking has
significant overtones of cynicism toward things held dear by the previous generations, mainly the Baby Boomers. Some in Generation X are said to be very "consumer" driven and media savvy. Generation X is volatile: many found themselves overeducated and underemployed, leaving a deep sense of insecurity in Generation Xers, whose usual attitude to work is "take the money and run." Generation X no longer takes any employment for granted, as its Baby Boomer counterparts did, nor does it consider unemployment a stigmatizing catastrophe.
Generation Y
This generation makes up over 70 million people in the U.S. Born between 1977 and 1994, they make up over 20% of today's population. The largest generation since the Baby Boomers, the Millennials are defined by their numbers, and they have a huge social and economic impact (http://www.nasrecruitment.com/talenttips/NASinsights/GenerationY.pdf). There are three major characteristics of the Millennial group: 1) they are racially and ethnically diverse; 2) they are extremely independent because of divorce, day care, single parents, latchkey parenting, and the technological revolution they are growing up alongside; and 3) they feel empowered; thanks to overindulgent parents, they have a sense of security and are optimistic about the future. Family cohesion is alive and well in the 21st century. Generation Y is being raised in the age of the "active parent." Defined by the prevailing views of child psychology and the parental education available, this is the decade of the child (EmploymentReview.com).
Fig. 2. Majority of Children Eat Dinner with Parents Daily – Source: The Millennials: Americans Born 1977 to 1994; US Census 2000
According to Giordani (2005), "Fathers have entered the child rearing equation and companies realize that time away from the job to spend with the family is very important. Unlike Generation X that came before them, these children are not left to make key decisions on their own; the parents of Generation Y are very hands-on. Parents are involved in the daily lives and decisions of Gen Y. Their parents helped them plan their achievements, took
part in their activities, and showed strong beliefs in their child's worth. The secure feeling attained by strong parental involvement makes the members of the Y Generation believe they can accomplish most anything, and if they don't, they can always go back home and get help and support" (Giordani, 2005). "From a young age, Generation Y is told, through both the media and home, that they can have it all. This generation has a strong sense of entitlement. Striving for a quality of life only known by the rich and famous, wanting the best and thinking they deserve it, makes Generation Y driven and ambitious, with high expectations" (Giordani, 2005).
Fig. 3. Women are Better Educated than Men – Source: The Millennials: Americans Born 1977 to 1994; US Census 2002
According to the Bureau of the Census, "The Millennials are one of the most educated generations yet, and they love to learn. Going to college is no longer reserved for the elite; it is the norm. Today, 64% of women and 60% of men go to college after graduating high school, and 85% attend full-time. There are also many choices in higher education today because of the commonality of attending college: there are many alternatives beyond public and private schools, from on-line learning to the traditional classroom. Most parents want their children to pursue higher education: 58% want their children to graduate from college and 28% want them to obtain an advanced degree. Only 14% of parents do not want their children to receive a college education" (Bureau of Census, 2000). More affluent families have more children attending college; the majority of families with children ages 18 to 24 and incomes of $50,000 or more have at least one child in college. Growing up in the age of technology has put a computer in the hands of almost every child. They have an understanding and knowledge of technology and keep up quite well with its advances. Three out of four teenagers are on-line, and 93% of those ages 15-17 are computer users. The majority of time spent on the Internet is for entertainment purposes. Emailing, instant messaging and gaming are done by the majority of children eight and older who are on-line (National Center for Health Statistics, 2000).
Fig. 4. Most Children are On-Line – Source: The Millennials: Americans Born 1977 to 1994; National Center for Health Statistics: Computer and Internet Use by Children and Adolescents in 2001
According to NAS, "Unlike past generations, the technological advances in the past decade have put a multitude of choices at the fingertips of Generation Y. The wealth of information available in seconds from the Internet, hundreds of television stations to choose from and a different shopping center every ten miles have given Gen Y members the notion that if they do not get what they want from one source, they can immediately go to another. This relates to employment because Generation Y will question workplace regulations, such as uniform requirements and schedules, and know that there are other options out there if they are not satisfied with the answers" (Allen, 2005). Similar research findings in the UK confirm these characteristics for all four generational cohorts.
Euromonitor International (2008) has determined that China has undergone tumultuous changes in the last few decades, many of which have come together to completely transform China's age demographics and family composition. Increasing life expectancies and much-improved health conditions have led to rapidly increasing numbers of older people in China, who numbered a staggering 170 million in 2007 and are projected to reach over 230 million by 2015. The traditional Chinese family is largely becoming a relic of the past. The habit of family caring for elders is cultural and will not disappear overnight; rather, it will take the form of a continued propensity to save, and an increased demand for private healthcare, homecare and nursing. In December 1977, one year after the end of the Cultural Revolution, 5.7 million Chinese participated in national university entrance exams. Over the three decades since, millions of students have graduated from colleges and universities throughout China and become part of a highly skilled workforce. According to official Ministry of Education figures, the country's higher educational institutions have enrolled around 54 million students out of the 128 million national college entrance examinees since 1977. Education spending is growing at levels never before seen in China: over RMB550 billion made its way out of government coffers and into academic institutions in 2007 alone. The introduction by the government of a compulsory nine-year education program funded by the state is part of a package of reform and higher standards for all Chinese educational institutions, from pre-primary through to higher education. It is at the level of higher education where the government's
own ambition is met by an equally ambitious Chinese consumer. Spending on higher education has seen huge growth over the past decade, with the numbers in higher education boasting 30% growth every year over the past five years. These students are also increasingly looking towards doctoral and master's courses, and not just undergraduate studies, to differentiate themselves from the ever fiercer domestic competition. The impact of increasing numbers of educated Chinese is immense, as illiteracy has been largely wiped out and skilled workers are available to fill the roles required of a rapidly developing economy. China is looking to develop its industries from those based in manufacturing to also include more high-tech and 'idea' industries such as information technology, automotive, biotechnology and robotics. Research and development in China will continue to increase into the future as it builds on its massive 80% growth in the number of researchers since 1995. China hopes to vastly improve the quality of its workforce in order to compete globally in more than just the manufacturing industries which have been integral to its growth thus far. The growth rate of skilled labor through these changes in higher education will, over the forecast period, have implications for global trade, both directly in ideas as well as in idea-driven products, which China sees as key to continued prosperity. There are, though, concerns amongst the huge influx of recent graduates about the increasing problem of finding employment in the ever more saturated job markets of China's first-tier cities. The number of job hunters from this group entering the market in 2009 is expected to exceed six million nationwide, an increase of 7% from 2008, according to official figures. Unfortunately, the number of skilled workers has risen faster than the economy has grown, but the continued and rapid development of China's interior should slowly foster job creation to match required levels and allay fears of widespread graduate unemployment. With education becoming central to so many young people's lives in China, there is some re-evaluation, especially amongst women, of how to plan their lives. Marriage and partners, children and a settled house and home are all pushed further back in a person's life plan in contemporary China. This is having an influence on consumption patterns. Increasingly, the consumer habits of the young are being taken up by an older demographic extending into the 30s. Consumption of foodservice, expenditure on alcohol and shopping, and a generally more self-centered lifestyle are becoming the norm for a larger segment of the population. This is not to suggest that Chinese consumers are living completely for themselves: consumption of green and environmental products is on the rise amongst educated people. Global issues such as sustainability and global warming are very much factors in the spending choices of Chinese consumers, due in no small part to the increasing levels of education and information consumption in China today (National Statistics, Euromonitor International, 2008). Growth rates of around 20% from 2007 until 2015 in this age segment look to partly reverse the fall in numbers of around 30% in the years 1995-2007. In absolute numbers this means a rise to 190 million by 2010 and then 210 million by 2015. What is interesting is the split between males and females in this age group over the forecast period.
Figures of around half a million more females than males in 2005 begin to even out by 2010, and then slide heavily to the side of the males, with 5 million more men than women in this age group forecast by 2015. This trend is important because most Chinese in their 20s look to be settling down with a partner. A proportion of this age group will, of course, be students in
higher education and so will follow the trends described in the section above. By the same token, many will be former students with an income from their first job burning a hole in their pocket. This group will prove very exciting in terms of consumption potential, as it is part of the second highest income group, with its 25-29-year-old age range boasting a gross annual income of US$2,050.
People in Their 30s (Source: National statistics, Euromonitor International)
In stark contrast to the 20s demographic, China has seen the number of people in their 30s grow by around 30% over the period 1995-2007. However, they will face a decline of over 20% between 2007 and 2015. In absolute numbers this decline will bring the total 30s demographic down from 238 million in 2007 to 216 million by 2010. By 2015 the total number of people in their 30s will have dropped to 184 million. With the average income of 30-35 year olds being the highest in China at US$2,105, it is this group that has the means to really drive Chinese consumption over the coming decade. This demographic traditionally has a broad choice of consumer spending options, from entertainment and leisure to household and financial goods and services. People in their 30s cannot be pigeon-holed as easily as other demographics: marriage, children and income are variables that have major implications for spending patterns, and they affect this age group more than any other (Euromonitor International, 2008). The middle-aged demographic will see growth of 25% between 2007 and 2015, pushing the total number of middle-aged Chinese up to a staggering 550 million. This will be a full 40% of the total population in 2015, up 10% from 2005. Growth in this demographic has been consistent since 1995. Approximately half of those earning over US$40,000, and nearly 60% of earners in the highest bracket (over US$100,000), are in their 40s. A taste for high-end items, and the resources to spend on themselves and their immediate and extended families – which may include grandchildren by this age – characterize this segment (Euromonitor International, 2008). The average age of retirement in China is over a decade earlier than in most European countries. As a result, pensions are drawn by middle-aged adults as well as the older demographic discussed here. The 'grey consumer' is proving to be an integral part of the Chinese economy and a key source of consumer spending. The spending patterns of this demographic are likely to change dramatically as the sheer numbers of older Chinese mount every year of the forecast period. Pensioners in 2008 enjoy rather substantial incomes from the state, but as the strain of mounting pension payments and healthcare costs takes its toll, Beijing will have to reduce state pensions to a fraction of what it provides today. Based on the statistics, it is forecast that there will be a 24% increase in the number of old-aged persons, bringing the total up from 118 million in 2007 to 146 million in 2015.
Impact
The waning spending power of this age group over the forecast period makes it unwise to extrapolate long-term impacts for this particular demographic from short-term trends. Although the state should be able to provide for its retired population in the short term, it is increasingly obvious that the system will collapse unless major cutbacks and policy changes are made soon. The spiraling medical and health costs alone will be unmanageable towards the end of the forecast period.
The number of young adults in Japan declined over the review period. This is attributed to
the falling birth rate in the country. Young adults aged under 17 years have less spending power and normally still live with their parents. At this age, however, consumers are becoming more independent, and thus the tastes and preferences of this consumer group have a significant influence on family spending. Clothing and consumer electronics are very important to this age group, and parents are not hesitant to purchase high-priced products for their children. Consumer electronics companies, especially those introducing portable media players such as MP3 players, iPods, or other multimedia players, are also trying to lure this consumer group. Young people tend to leave the family home around the age of 18 years and are in full-time employment, especially by the second half of their 20s (Euromonitor International, 2008). The number of middle-aged adults rose to almost 53.8 million in 2005. This age group is considered by trade sources to have more savings than the younger generations, and as they grow older they will not be averse to spending on themselves. They also constitute niche markets for a variety of goods and services. The number of middle-aged adults in Japan is expected to fluctuate over the forecast period, accounting for 41% of the population in 2015. Females have been predominant in this age group since 2005 and will continue to exceed males in number through to 2015. The number of baby boomers showed a decline in 2005, but it is expected to increase by 2015, reaching around 35.2 million, or 27% of the entire population of the country. Baby Boomers prefer staying home to cook, read, and watch television, rather than going out and spending money. They tend to be loyal to the products they like, and mostly they try to be practical. Pensioners (aged 60+) possess almost half of the country's ¥11.3 trillion in savings (Euromonitor International, 2008). Projections have them spending nearly ¥95 billion a year on new products, home care and home renovation. By 2015, there will be 38.3 million pensioners in Japan, representing 30% of the population, with female pensioners outnumbering males. It is expected that sales to pensioners will increase, resulting in companies diversifying their products to accommodate the needs of this sector.
One interesting way to distinguish the four generations in 20th-century American culture is by their attitude toward and use of technology (Clift, 2009; http://www.helium.com/items/368338-the-names-and-characteristics-of-the-four-generations):
o The Mature Generation are afraid of and skeptical towards technology. Perhaps they are threatened by computers, because they provide shortcuts to so much of the honest work that their generation toiled over for so many years.
o The Baby Boomers are faced with the dilemma of adapting to new technology out of necessity, while at the same time being old dogs learning new tricks.
o Generation X were the early adopters of most new technologies. I believe that they were falsely labeled a slacker generation because the way they learned, worked, and played differed so much from previous generations, due to their use of technology.
o Generation Y, the Net Gen, or the Millennials, are a unique group of American youth, as they have had access to computers and even the internet for most or all of their lives. This changes the way they learn and communicate, and is a huge departure from the traditional print medium, with its linear, singular way of doing things.
Millennials multi-task; they think in terms of multimedia as opposed to simply text, and they exchange and process information extremely fast. Conversely, though, Millennials take these luxuries for granted.
They have lost something of the art of face-to-face conversation, and Generation Y seems to favor quantity over quality in terms of communication. Jones (2009) states, "Contrary to the image of Generation Y as the 'Net Generation,' internet users in their twenties do not dominate every aspect of online life. Generation X is the most likely group to bank, shop and look for health information online. Boomers are just as likely as Generation Y to make travel reservations online. And even Silent Generation internet users are competitive when it comes to email" (p. 1).
Internet use and email
Jones (2009) continues, "The web continues to be populated largely by younger generations, as more than half of the adult internet population is between 18 and 44 years old. But larger percentages of older generations are online now than in the past, and they are doing more activities online, according to the Pew Research Center's Internet & American Life Project surveys taken from 2006-2008."
Fig. 5. Generations Explained – Source: Pew Internet & American Life Project 2008 Survey, N=2,253
Fig. 6. Makeup of Adult Internet Population by Generation (excluding teens) – Source: Pew Internet & American Life Project 2006-2008
Adult Internet users by generation
According to Jones (2009) and the Pew Research Center, the biggest increase in internet use since 2005 can be seen in the 70-75 year-old age group: while just over one-fourth (26%) of 70-75 year olds were online in 2005, 45% of that age group is currently online. Much as we watch demographic and age groups move up in "degrees of access" on our "thermometers," we can probably expect to see these bars become more level as time goes on. For now, though, young people dominate the online population. Instant messaging, social networking, and "blogging" have gained ground as communications tools, but email remains the most popular online activity, particularly among older internet users. Fully 74% of internet users age 64 and older send and receive email, making email the most popular online activity for this age group. At the same time, email has lost some ground among teens; whereas 89% of teens said they used email in 2004, just 73% currently say they do. Teens and Generation Y (internet users age 18-32) are the most likely groups to use the internet for entertainment and for communicating with friends and family. These younger generations are significantly more likely than their older counterparts to seek entertainment through online videos, online games and virtual worlds, and they are also more likely to download music to listen to later. Internet users ages 12-32 are more likely than older users to read other people's blogs and to write their own; they are also considerably more likely than older generations to use social networking sites and to create profiles on those sites. Younger internet users often use personal blogs to update friends on their lives, and they use social networking sites to keep track of and communicate with friends. Teen and Generation Y users are also significantly more likely than older generations to send instant messages to friends.
Fig. 7. Percentage of Americans online by age – Source: Pew Internet & American Life Project 2008 Survey, N=2,253
By a large margin, teen internet users' favorite online activity is game playing; 78% of 12-17 year-old internet users play games online, compared with 73% of online teens who email, the second most popular activity for this age group. Online teens are also significantly more likely to play games than any other generation, including Generation Y, only half (50%) of whom play online games.
Fig. 8. Internet Users in the World by Geographic Regions – Source: Internet World Stats, 2009
Fig. 9. Personal and Lifestyle Characteristics by Generation – Source: Hammill, G. (2005), Mixing and Managing Four Generations of Employees, FDU Magazine
The characteristics listed in the table above are but a very few of those that have been studied and reported by various authors. Not every person in a generation will share all of the various characteristics shown in this or the next table with others in the same generation; however, these examples are indicative of general patterns. An example, based on these traits, would be to think about how words are received differently. When a Boomer says to another Boomer, "We need to get the report done," it is generally interpreted by the Boomer as an order; it must be done, and done now. However,
when a Boomer says to an Xer, "This needs to be done," the Xer hears an observation, not a command, and may or may not do it immediately (http://www.fdu.edu/newspubs/magazine/05ws/generations.htm).
Learning Online
Online education has both grown and changed significantly over the past decade. The term "distance education" dates back to the late 1800s, when it was first used in a school catalog for the University of Wisconsin-Madison (http://www.ecollegefinder.org/blog/post/The-History-of-OnlineEducation-How-it-All-Began.aspx). Nowadays, this concept of "distance education" is associated with online education at your fingertips anytime, anywhere. Since its inception, the online education industry has grown in popularity, altered the definition of 'classroom' and given brick-and-mortar educational institutions tough competition. It is no longer uncommon to know friends and family members who have earned their degrees, particularly advanced-level degrees, from an online education portal. Though now fully functional and aesthetically pleasing to students, thriving online schools have come a long way. In the 1960s some schools, including Stanford, began implementing the earliest versions of "online education," which enabled students to reach instructors via online notes and teachers to monitor their students' progress via data. In the 1970s and 1980s, computers began to appear in classrooms as early as kindergarten. Lotus Notes version 1.0 was released in 1989, paving the way for the Internet to transform from "geek gadget" to a necessity. During the mid 1990s, Internet companies numbered in the thousands, giving rise to the "dot-com" boom. Later that decade, schools began to explore internet and computer capabilities beyond creating slideshows, moving into very basic, text-based online courses. In 1996, founders Glenn Jones and Bernard Luskin created the first accredited web-based university, Jones International University. What once began as text-based courses with a smattering of pictures has been transformed into courses with streaming media, web access and the ability to work anytime, from anywhere. Online education has literally opened doors for many who thought they could not further their education. Whether you have two jobs, are a busy caretaker, or simply cannot afford to commute to a campus program, online education makes pursuing your dreams achievable, often at a lower price, too.
Discipline and Technology
The landscape of education has been constantly changing, shaped by the multiple technologies that have been designed and utilized to enhance teaching and learning. During the past decade, distance learning has proliferated significantly. Universities and colleges, whether traditional or for-profit institutions, have added distance learning programs or attempted to find the best possible ways to deliver their education to learners and satisfy learner preferences. Growth of online education is prominent in the United States, particularly in the higher education sector. According to EduVenture, an independent research organization with a focus on online education, in 2008 approximately 90 percent of students enrolled in college or university took at least one online class. This accounts for over 16 million students. Such a
phenomenon can be interpreted to mean that online learning is utilized in most disciplines. However, the level of popularity and acceptance of online education differs from one discipline to another. Factors contributing to the disparity may include the required learning environment, validation of faculty and student online authenticity, accessibility of information, and modality of teaching. Among major disciplines, business, liberal arts and the health professions rank the highest in their online penetration rates. The Sloan report, Staying the Course: Online Education in the United States (2008), shows trends in online education in various disciplines. Another study conducted by EduVenture during the same period provides consistent results.
Fig. 10. Online Penetration Rate by Academic Discipline – Source: The Sloan Report, Staying the Course: Online Education in the United States, 2008
The level of popularity of online education in the various disciplines depends on various factors, including the availability of online classes, learning preferences and pedagogical reasons. Today's learning management systems provide evidence of course behaviors: a fully developed system can track student activities and engagement in an online class. The richness of the data is helpful for universities and colleges to correlate student success with class components so that they can design classes suitable for each discipline. In addition to the demand for technologies suitable for learners of various generations, different disciplines require unique sets of technology-assisted learning. The second part of this chapter discusses (1) how online education can enable student learning in different disciplines and (2) how online graduates are accepted in their professions.
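To make the preceding point about learning-management-system data concrete, the short sketch below correlates one engagement metric with final grades. It is only an illustration: the record layout, the numbers, and the choice of logins as the engagement proxy are assumptions, not data from any study cited in this chapter.

# Hypothetical sketch: correlate LMS engagement with course outcomes.
# All records and field choices are illustrative assumptions.
import statistics

# (logins per week, discussion posts, final grade 0-100) per student
records = [
    (12, 9, 91), (3, 1, 62), (8, 5, 78),
    (15, 11, 95), (5, 2, 70), (10, 7, 84),
]
logins = [r[0] for r in records]
grades = [r[2] for r in records]

def pearson(x, y):
    # Pearson correlation coefficient between two equal-length samples
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(f"login/grade correlation: {pearson(logins, grades):.2f}")

A strong positive value here is the kind of evidence an institution could use when designing classes for a discipline, though correlation alone does not establish which class components cause success.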
Student Learning
Early studies suggested that there are no significant differences between student learning in online and traditional classes (Fox, 1998; McKissack, 1997; Sonner, 1999; Waschull, 2001). Learning results measured included test performance (Waschull, 2001) and student grades (Beare, 1989), and this finding is supported by recent literature. More recent studies, however, suggest that
student success in online classes depends on organizational, technological, pedagogical, institutional and student factors, including learning style preferences (Schrum and Hong, 2002). Finnegan, Morris and Lee's research (2008) suggested that a "one size fits all" approach in online education may be counterproductive to student success. Measures of student success are commonly established according to expected learning outcomes. Either on their own initiative or because of accreditation requirements, higher education institutions craft their student learning outcomes to assess their educational effectiveness. It is not uncommon for accreditation agencies to require institutions to measure outcomes of student learning through common guidelines, which may include how technology is utilized in the student learning process. The new generation of research suggests that the key factors contributing to student learning are technologies, student interaction and perceived learning.
Technology
Technology plays a crucial role in online learning and teaching. Compatibility between learning style preferences and modes of instructional delivery may enhance student learning, and available educational technologies can be used to modify the delivery of instruction accordingly. E-learning technologies inundate higher education with options such as e-books, online simulation, text messaging, video conferencing, wikis, podcasts, blogs, Second Life and social networking. Pedagogical literature suggests that levels of information retrieval in an e-learning environment influence student learning (Fernandez-Luna, Huete, MacFarlane and Efthimiadis, 2009). Several studies suggest that some media are more effective than others or that they may enhance some learning activities. For example, advancements in voice over Internet Protocol (VOIP) allow synchronous communication (i.e., visual, verbal, and text) among a group of students and the instructor; this technology enables instant two-way communication. Studies have also shown that drawbacks of technology may include inappropriate use of media: reading online content generally takes longer than reading a paper copy of the same content. While educational technologies advance rapidly, new models of online teaching and learning are catching up with the technology at a much slower pace. Although various learning technologies have helped improve learning, there is no single universal technology applicable across all disciplines. Different disciplines require different pedagogical logic and thus possibly different technologies to support student learning. While experimental experience is critical to student learning in such fields of study as medicine and the sciences, it may be secondary in other disciplines. For instance, students in medical school need to learn surgical procedures through several hands-on practice sessions before they can display the expected learning outcomes required by their discipline, whereas students who study history may benefit greatly from online education, where they can read relevant materials and interact with their peers and instructor virtually. Literature suggests that the learning of business students in an online environment may exceed that of the traditional classroom in some areas. Even within the same field, different professions require different levels of skills and knowledge, as each role calls for a unique set of responsibilities. These differences translate to specific technology-supported learning. While a pilot and a control tower staff member must both understand how to land an aircraft, "proof" of the pilot's learning can typically be several
hours of simulations and real flight time, while that of the control tower staff can be a few hours of simulated landings. The simulation environment is equipped with similar technology for both the pilot and the control tower staff. Though both share the same knowledge of how to land an aircraft, the pilot and the control tower staff are required to perform different job functions, and thus their learning and skills are measured differently.
Interaction
A typical online class contains three components: (1) content, (2) interaction and (3) assessment. Each of these components may be adjusted to fit a specific discipline. The majority of online learning courses are structured around content management systems that employ text as the main medium for all communication. While the discipline dictates its content, the level of student interaction must be designed to optimize student learning. The core of online classes is interaction, not content (Simmons, Jones, and Silver, 2004). Instructional design must consider interaction among students, and that between instructor and students. Reflective learning is common in most business curricula. In general, business curricula require students to internally examine and explore an issue of concern triggered by experience. A study conducted at the University of Wisconsin-Whitewater (2004) found a significantly higher percentage of reflective learning in their online MBA students. The same study showed that the online experience can surpass traditional learning and teaching. What the research showed is that students in an online business program want and seek a deeper learning experience. Even within the discipline of business, the level of interaction required for student learning differs among types of courses or programs. Literature suggests that information is critical in the health professions. In this discipline, student-teacher relationships in distance education are pivotal to student learning (Atack, 2003; Roberts, 1998). A major challenge in adopting the online learning mode is to improve human interaction in order to provide a facilitative environment for establishing peer support, developing communication for idea exchange, and guiding socialization (Sit et al., 2005). Thurmond (2003) concluded that interaction is a core element of an effective online environment and that faculty should be knowledgeable in interaction strategies to foster settings conducive to learning. Effective two-way communication is paramount to ease the fears, frustration, and anger that may be demonstrated throughout online learning (Kozlowski, 2004). Frith and Kee (2003) studied online communication methods and concluded that the kind of communication affected course satisfaction. In addition to enhancing effective communication, issues related to privacy and intellectual property are critical in an online class. Assessments of student learning in different disciplines vary: one discipline may require constant feedback or formative assessment more frequently than another. Online classes may limit opportunities for close or in-person supervision in the assessment process, and such limitations may undermine the effectiveness of online learning.
Perceived Learning
In distance education, the roles of both the faculty and students change and diversify. Faculty assume the role of facilitator and must develop the goals of learning, conditions of
learning, and methods of instruction (Atack, 2003; Billings et al., 2001; Huckstadt & Hayes, 2005; Seiler & Billings, 2004). A study by Arbaugh and Rau (2007) suggested that participant interaction was significantly associated with perceived learning. The study ranks perceived learning among business subjects as follows: (1) project management, (2) human resource management, (3) strategic management, (4) international business, (5) general management, (6) literature in business and (7) finance. Another finding from this study is that discipline and course structure characteristics were primarily associated with student satisfaction. The nursing literature on distance education and professional socialization reveals positive outcomes in perceived learning. Cragg (1991) reported that, at the completion of a nursing distance education course, students showed well-developed outcomes in the area of nursing issues and attitudinal changes reflecting a more professional orientation.
Acceptance of Online Graduates
One common objective of students pursuing an educational degree is to advance in their profession. Once the course work ends, acceptance of the credential earned from an online rather than a traditional program may pose a problem, despite the identical diploma given to graduates by an institution that offers both online and traditional programs. In addition to the reputation of a university for academic rigor being associated with acceptability, literature suggests that the face-to-face contact of the traditional classroom is perceived to offer something more. Many believe that instruction and mentoring are more effective in a traditional classroom setup and that they are essential to what many would consider a "quality" education. Literature has also suggested that online programs, even those offered by reputable institutions with high academic standards, may always be regarded as "missing" key elements. The perception of academic honesty, social presence, and the validity of degrees earned from an online program are of great interest in academia. Newer technologies or pedagogical models are needed to help address such concerns. During the past few years, the level of acceptance of online graduates in the job market has greatly improved, and the trend will continue. Disparities remain in some areas: organizations or recruiters may be skeptical of the quality of an online degree in some disciplines. While academia has become more embracing of online graduates, it will take a longer time before a law firm accepts graduates from a fully online law program. Despite its popularity and contribution to student learning, research on online education is limited. More research is needed in the area of distance learning and the acceptability of online graduates by the public, most importantly by potential employers. The evolution of online learning will continue to change the higher education landscape. With proper adjustment, adoption of online learning will extend across all disciplines, and the expansion will be international. Acceptance of online degrees will rise to new heights. Institutions offering online programs must keep improving their e-learning products and concepts, as these are wide-ranging and varied. Students will seek premier support, convenience, and the most cutting-edge learning content available from the program of their choice.
Conclusion
Market trends have demonstrated that there has been, and will continue to be, significant growth in online education. It has become the preferred modality of a large proportion of adult learners, and the Internet is the "Wi-Fi" generation's medium of choice. In order for higher education to remain competitive in the global marketplace, more effort must be made in the research and development of online classes. We must also encourage educators to be open to this new format and to apply and introduce new delivery modes in order to enhance students' learning capabilities. Old, established universities are now introducing online programs, and those already offering this format must continue to improve their e-learning products and concepts. With constant adjustments for improvement, the adoption of online learning can traverse all disciplines and all nations, and become truly international.
References
Allen, R. (2005). Managers Must Set Example for Gen Y Kidployees; Employee Recruitment and Molding. [Online] Available: http://www.nasrecruitment.com/talenttips/NASinsights/GenerationY.pdf
Arbaugh, J. B. & Rau, B. L. (2007). A Study of Disciplinary, Structural, and Behavioral Effects on Course Outcomes in Online MBA Courses. Decision Sciences Journal of Innovative Education, 5(1), 65.
Atack, L. (2003). Becoming a web-based learner: Registered nurses' experience. Journal of Advanced Nursing, 44, 289-297.
Baby Boomers Grow Up (1996). [Online] Available: www.thestrategicedge.com/Articles/babyboom.html
Beare, P. L. (1989). The comparative effectiveness of videotape, audiotape, and telecture in delivering continuing teacher education. The American Journal of Distance Education, 3(2), 57-66.
Billings, D., Connors, H., & Skiba, D. (2001, March). Benchmarking Best Practices in Web-Based Nursing Courses. Advances in Nursing Science, 23(3), 41.
Brown, R. L. & Campione, J. C. (1994). Guided discovery in a community of learners. In K. McGilly (Ed.), Classroom lessons: Integrating cognitive theory and classroom practice (pp. 229-270). Cambridge, MA: MIT Press.
Clift, C. (2009). The names and characteristics of the four generations. [Online] Available: http://www.helium.com/items/368338-the-names-and-characteristics-of-the-four-generations
Consumer Lifestyles in China (2008). National statistics, Euromonitor International.
Coupland, D. (1991). Generation X: Tales for an Accelerated Culture.
Cragg, C. E. (1991). Professional resocialization of post-RN baccalaureate students by distance education. Journal of Nursing Education, 30, 256-260.
Dede, C. (1998). Six Challenges for Educational Technology. [Online] Available: http://www.virtual.gmu.edu/pdf/ASCD.pdf
Edelson, D. C., Pea, R. D., & Gomez, L. M. (1996). Constructivism in the collaboratory. In B. Wilson (Ed.), Constructivist learning environments: Case studies in instructional design. Englewood Cliffs, NJ: Educational Technology Publications.
Employment review. [Online] Available: employmentreview.com
Euromonitor International (2008, November). Consumer Lifestyles. People in their 20s. http://www.euromonitor.com/default.aspx
Euromonitor International (2008, October). Consumer Lifestyles. US, China and Japan. http://www.euromonitor.com/default.aspx
Fernández-Luna, J. M., Huete, J. F., MacFarlane, A., & Efthimiadis, E. N. (2009). Teaching and learning in information retrieval. Information Retrieval, 12(2), 201-226.
Finnegan, C., Morris, L. V., & Lee, K. (2008). Differences by Course Discipline on Student Behavior, Persistence, and Achievement in Online Courses of Undergraduate General Education. Journal of College Student Retention, 10(1), 39-54.
Fox, J. (1998). Distance education: Is it good enough? The University Concourse, 3(4), 3-5.
Frith, K. H. & Kee, C. C. (2003). The effect of communication on nursing student outcomes in a web-based course. Journal of Nursing Education, 42(8), 350-8.
Generation X (2008). International Encyclopedia of the Social Sciences (2nd ed., Vol. 3). New York: Macmillan. http://www.gale.cengage.com/iess/content.htm
Generation Y: The Millennials. [Online] Available: www.nasrecruitment.com/talenttips/NASinsights/GenerationY.pdf
Getting To Know Gen X. [Online] Available: www.nasrecruitment.com/TalentTips/NASinsights/GettingtoKnowGenerationX.pdf
Giordani, P. (2005). Y Recruiting. [Online] Available: http://www.nasrecruitment.com/talenttips/NASinsights/GenerationY.pdf
Hamblett, C. & Deverson, J. (1964). Generation X: today's generation talking about itself. London: Tandem.
Welcome to GenXPedia. [Online] Available: www.genxpedia.com
Huckstadt, A., & Hayes, K. (2005, March). Evaluation of interactive online courses for advanced practice nurses. Journal of the American Academy of Nurse Practitioners, 17(3), 85-89.
Jones, S. & Fox, S. (2009, January 28). Pew Internet & American Life Project.
Kozlowski, D. (2004). Factors for consideration in the development and implementation of an online RN-BSN course: Faculty and student perceptions. Computers, Informatics, Nursing, 22, 34-43.
McKissack, C. E. (1997). A comparative study of grade point average (GPA) between the students in the traditional classroom setting and the distance learning classroom setting in selected colleges and universities. Dissertation Abstracts International, 58(8), 3039A. (UMI No. ABA98-06343)
NAS Recruitment Communications (2008). An agency of the McCann Worldgroup. [Online] Available: http://www.nasrecruitment.com/TalentTips/NASinsights/GettingtoKnowGenerationX.pdf; http://www.nasrecruitment.com/talenttips/NASinsights/GenerationY.pdf
People in Their 20s (2008). National statistics, Euromonitor International.
People in Their 30s (2008). National statistics, Euromonitor International.
Primo, J. E. & Wedeven, J. A. (1996). Baby Boomers Grow Up. The Strategic Solution, Newsletter of the Strategic Edge. http://www.thestrategicedge.com/Articles/babyboom.htm
Primo, J. E. & Wedeven, J. A. (1998). The Strategic Edge Inc. http://www.thestrategicedge.com/
Roberts, K. K. (1998). A naturalistic study of students' experiences in a computer-based nursing course (Doctoral dissertation, University of Kansas, 1998). Dissertation Abstracts International, 59(12), 4352A. (UMI No. 9914105)
Robinson, K. (2006). Cohort. In N. J. Salkind (Ed.), Encyclopedia of Human Development, 1 (p. 286). Thousand Oaks, CA: Sage Reference. Retrieved February 10, 2009, from Gale Virtual Library.
Schank, R. C. & Jona, M. Y. (1991). Empowering the student: New perspectives on the design of teaching systems. The Journal of Learning Sciences, 1, 7-35.
Schrum, L. & Hong, S. (2002). Dimensions and strategies for online success: Voices from experienced educators. Journal of Asynchronous Learning Networks, 6(1), 57-67.
Seiler, K., & Billings, D. M. (2004). Student experiences in web-based nursing courses: Benchmarking best practices. International Journal of Nursing Education Scholarship, 1(1), Article 20.
Simmons, S., Jones Jr., W., & Silver, S. (2004, September). Making the Transition from Face-to-Face to Cyberspace. TechTrends: Linking Research & Practice to Improve Learning, 48(5), 50-85.
Sit, J. W., Chung, J. W., Chow, M. C., & Wong, T. K. (2005). Experiences of online learning: Students' perspective. Nurse Education Today, 25, 140-147.
Smith, S. E. (2009). Silent generation. [Online] Available: http://www.wisegeek.com/what-is-the-silent-generation.htm
Sonner, B. (1999). Success in the capstone business course: Assessing the effectiveness of distance learning. Journal of Education for Business, 74, 243-248.
Strategic solutions (1998). Newsletter of the Strategic Edge.
Study: Web-Based Instruction Can Surpass F2F Courses in Critical Thinking. (2004, September). Online Classroom. Retrieved May 20, 2009, from Academic Search Premier database.
The Millennials: Americans Born 1977 to 1994; Bureau of Census: A Child's Day, 2000.
The Millennials: Americans Born 1977 to 1994; National Center for Health Statistics.
Thurmond, V. A. (2003). Defining interaction and strategies to enhance interactions in web-based courses. Nurse Educator, 28, 237-241.
Waschull, S. B. (2001). The online delivery of psychology courses: Attrition, performance, and evaluation. Teaching of Psychology, 28, 143-146.
19
Multiphase Spray Cooling Technology in Industry
Roy J. Issa
West Texas A&M University, USA
1. Introduction
Cooling by air-assisted water sprays has received much attention in the last decade because of the benefits it has shown over conventional cooling methods such as forced air jets or single-fluid nozzles. Air-assisted water sprays are currently being used in many industrial applications. Some of these applications require rapid cooling from high temperatures, such as the cooling of medium-thick plates and thin strips in the hot rolling steel mill, glass tempering in the auto industry, and electronic chip cooling in the computer manufacturing industry. Sprays dispersing fine droplet sizes, referred to as air-mist sprays, have been proposed for use in heat exchanger devices, where the performance of heat exchangers can be tremendously improved by the injection of a small amount of mist into the forced air flow. The application of air-mist sprays has also found its way into the food processing industry in the cooling of vegetable produce on grocery shelves and the chilling of beef and lamb carcasses at meat packing facilities. The purpose of this chapter is to present some of the basic principles for understanding multiphase sprays and to demonstrate their use. Since many aspects of multiphase sprays can be discussed, this chapter focuses mainly on the flow dynamics and the heat transfer associated with multiphase cooling. Several industrial applications are presented, including: quenching of upward- and downward-facing heated plates by air-assisted water sprays, the use of air-mist sprays in heat exchangers, and chilling of vegetable produce and processed meat by air-mist sprays. The discussion will provide insight into the optimal flow conditions for the best heat transfer effectiveness and the best spray coverage. Several factors that influence the spray cooling effectiveness and droplet impaction will be introduced, including the spray droplet size, water flow rate, air-to-liquid loading, and nozzle-to-surface distance.
2. Flow Dynamics in Multiphase Sprays
Air-assisted sprays consist of a mixture of air and liquid in which an air stream of high velocity accelerates the liquid droplets. The nozzles contain two flow chambers: a pressurized liquid chamber and a pressurized air chamber (Figure 1). The discharging liquid and air streams collide towards the nozzle center: the pressurized air surrounds and impinges on the liquid flowing from the nozzle orifice and atomizes the liquid film. There are several factors that
affect droplet size, such as the spray pressure and air-to-liquid loading. Smaller droplet sizes in the spray can be generated either by increasing the air pressure while decreasing the liquid pressure, or by increasing the airflow rate while decreasing the liquid flow rate. An increase in the pressure difference between the liquid and air results in an increase in the relative velocity between them. This leads to an increase in the shear force acting on the liquid, which produces finer droplets. A similar effect occurs by increasing the air-to-liquid loading. Figure 2 shows the air stream-lines and droplet flow for a typical spray application. Circulation regions are shown to develop downstream of the flow towards the target surface and away from the stagnation point. Depending on the flow operating conditions, air-assisted spray atomizers are able to generate a spectrum of droplet sizes that can range from a few microns (referred to as fine mist) to several millimeters in diameter. The nozzle system consists of an air-actuated nozzle body assembly and a spray setup fixture consisting of an air cap and a fluid cap. Simply by changing the air and fluid caps, sprays with different drop size spectra can be generated. As the droplets exit the nozzle, their trajectories are influenced by several factors such as: droplet size, droplet impinging velocity, ambient temperature, relative humidity, nozzle-to-surface distance, and wind speed and direction (if applicable). The spray droplet size has a strong influence on the spray heat transfer (Issa & Yao, 2004, 2005). The smaller the droplet size, the easier it is for the droplet to evaporate at the surface, leading to higher heat transfer effectiveness. However, smaller droplets may never reach the target surface but will instead cool the thermal boundary layer near the surface by increasing the near-surface local evaporation. On the other hand, large droplets can have a detrimental effect on the surface cooling because of the flooding that may occur at the surface. Droplets with higher impinging velocities tend to spread more at the surface (Chandra & Avedisian, 1991), and the more the droplets spread at the surface, the higher the heat transfer effectiveness. In certain applications where the surface is heated to temperatures beyond the Leidenfrost point, a vapor layer will develop quickly at the surface. Droplets with low impinging velocities may not be able to penetrate through this film layer; for higher impinging velocities, however, droplets can penetrate through the film layer and more surface contact can be established. An increase in the nozzle-to-surface distance will lead to a decrease in the droplet impinging velocity (and hence in droplet momentum) at the surface, because the drag force acts on the droplet for a longer duration. With less momentum, the droplets' surface impaction efficiency will decrease. Surface and ambient temperatures and relative humidity also have an effect on droplet impaction (Issa, 2008b). The evaporation rate of the droplets increases with the increase in the surface or ambient temperature. This results in smaller droplets developing while the droplets are airborne, which leads to more droplets drifting away and reduces the impaction efficiency. A similar effect takes place when low relative humidity levels are present.
Finally, wind speed and direction (as in the case of agricultural sprays) have a strong influence on the spray drift (Lawson & Uk, 1979). Increasing the droplet size or the injection velocity reduces spray drift. During the last decade, electrostatic charging of the droplets (Zheng et al., 2002) has also been introduced as a means for reducing the spray drift and enhancing the penetration and deposition of the droplets onto the target surface (canopy plants). Spray drift is reduced because the droplet trajectories are then influenced by the electrostatic forces as the droplets try to follow the electric field lines.
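The influence of droplet size and nozzle-to-surface distance on the arriving momentum can be illustrated with a back-of-the-envelope model. The sketch below assumes a droplet flying through still air (the entraining air jet is ignored, which is part of why real fine droplets drift with the air stream rather than simply stopping) with a constant sphere drag coefficient; under quadratic drag the droplet velocity then decays exponentially with distance. The numbers are illustrative assumptions, not measurements from this chapter.

import math

RHO_W = 1000.0  # water droplet density (kg/m^3)
RHO_A = 1.2     # air density (kg/m^3)
CD = 0.44       # assumed constant sphere drag coefficient

def arrival_velocity(d, v0, distance):
    # For m*v*dv/dx = -0.5*rho_a*Cd*A*v^2 the solution is v = v0*exp(-x/L),
    # with decay length L = (4/3)*(rho_w/rho_a)*(d/Cd).
    L = (4.0 / 3.0) * (RHO_W / RHO_A) * d / CD
    return v0 * math.exp(-distance / L)

# 65 m/s injection over a 0.5 m nozzle-to-surface distance
for d_um in (20, 100, 500):
    v = arrival_velocity(d_um * 1e-6, v0=65.0, distance=0.5)
    print(f"d = {d_um:3d} um -> arrival velocity ~ {v:7.3f} m/s")

The 500-micron droplet arrives with most of its injection velocity, while the 20-micron droplet has essentially none left, consistent with the observation above that fine droplets lose momentum to drag and tend to drift rather than impact.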
Fig. 1. Air-mist spray nozzle
Fig. 2. Typical simulation for spray flow over a target surface
In order to maximize the spray cooling effectiveness over the target surface, it is important to reduce the amount of drift by maximizing the droplets' surface impaction. The spray impaction efficiency, η, is defined as the ratio of the actual droplet mass flow flux deposited onto the target surface, G, to the maximum droplet mass flow flux leaving the nozzle, Gmax:

η = G / Gmax    (1)

Sprays with large droplets have high impaction efficiency, unlike sprays with small droplets, which have difficulty staying the course of their initial trajectory and simply drift along the air stream as they approach the target surface. The orientation of the surface with respect to the spray also affects the spray impaction efficiency: for example, the impaction efficiency on a downward-facing surface is expected to be much lower than that on an upward-facing surface due to gravity pulling the droplets downwards (refer to Figure 3).
Fig. 3. Droplet impaction efficiency versus droplet diameter (10-1000 µm) for the top and bottom strip surfaces at air-to-water load ratios of 1 and 10. Conditions: nozzle-to-strip distance = 0.5 m, Tair = TH2O = 27 °C, Tplate = 850 °C, water flow rate = 5 GPM, injection velocity = 6.5 and 65 m/s, strip speed = 15 m/s
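As a worked example of Eq. (1), the sketch below computes the impaction efficiency from a deposited flux and a nozzle flux. The flux values are assumed purely for illustration, with the bottom surface given a lower deposited flux in line with the gravity effect noted above.

def impaction_efficiency(g_deposited, g_max):
    # eta = G / G_max, Eq. (1), expressed as a percentage
    return 100.0 * g_deposited / g_max

g_max = 4.0      # droplet mass flux leaving the nozzle (kg/s.m2), assumed
g_top = 3.2      # flux deposited on an upward-facing surface, assumed
g_bottom = 1.1   # flux deposited on a downward-facing surface, assumed

print(f"top surface:    eta = {impaction_efficiency(g_top, g_max):.0f}%")
print(f"bottom surface: eta = {impaction_efficiency(g_bottom, g_max):.0f}%")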
When spray droplets make it to the surface, there are three possible ways they can interact with the surface: stick, rebound or breakup at the surface. A droplet will stick to the surface when it approaches the surface with low incoming momentum (i.e., low incoming velocity or fine droplet size). Upon impaction, the droplet will adhere to the surface in a nearly spherical form. With an increase in the droplet incoming momentum, a droplet will rebound at the surface. During impaction, the droplet will spread radially in the form of a flattened disk. After the droplet reaches maximum spread, it will begin to recoil backwards towards its center as it leaves the surface due to the surface tension effect. The droplet spread at the surface is function of the droplet deformation on impaction, which is a process of energy transformation between kinetic and surface energies. The understanding of the droplet bouncing behavior at surfaces near room temperature (Scheller & Bousfield, 1995; Mundo et al., 1997) and at metallic surfaces heated above the Leidenfrost point (Wachters & Westerling 1966; Karl et al., 1996) is well established for sprays dispersing water droplets. For metallic surfaces heated to temperatures above the Leidenfrost point, the droplet coefficient of restitution (ratio of outgoing to incoming droplet velocity) at the surface is shown to be dependent on the impinging droplet Weber number (ratio of droplet inertial force to surface tension force). As the droplet incoming momentum increases beyond a critical value, the droplet will disintegrate during impaction at the surface. It has been found that the number of produced satellite droplets will increase with the increase in the incoming droplet momentum (Issa & Yao, 2004). Figure 4 shows the interaction of a droplet with a heated surface. After impacting the surface, the droplet changes its speed and trajectory. This can be quantitatively measured by the normal and tangential coefficient of restitution. Data gathered from several sources for water droplet impactions at atmospheric conditions and on surfaces heated to temperatures above the Leidenfrost point (Hatta et al., 1997; Karl et al., 1996; Wachters & Westerling, 1966; and Naber & Farell, 1993) show the relationship between the droplet normal coefficient of restitution, en, and the normal impinging droplet Weber number, Wen, (Figure 5) to be as follows: e n =1-0.1630We 0.3913 n
where

Wen = ρd vi,n² d / σd  (3)
ρd is the droplet density, vi,n is the droplet impinging normal velocity at the surface, d is the droplet diameter and σd is the droplet surface tension. Experiments performed by Karl et al. (1996) on surfaces heated above the Leidenfrost temperature show the loss in the droplet tangential momentum at the wall to be about 5%. Recent models for fuel spray-wall impingement in diesel engines have assumed a fixed value of 0.71 for the tangential coefficient of restitution (Lee & Ryou, 2000).
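As a worked illustration of Eqs. (2) and (3), the sketch below combines the best-fit normal restitution correlation with the roughly 5% tangential momentum loss reported by Karl et al. (1996) to estimate a droplet's post-impact velocity. The helper name and the room-temperature water properties are assumptions for the illustration, not part of the cited studies.

```python
# Minimal sketch: post-impact droplet velocity on a surface above the
# Leidenfrost point, using en = 1 - 0.1630*Wen**0.3913 (Eqs. 2-3) and an
# assumed ~5% tangential momentum loss (Karl et al., 1996).

def rebound_velocity(v_normal, v_tangential, d, rho_d=998.0, sigma_d=0.0728):
    """Return (outgoing normal, outgoing tangential) velocity in m/s.

    v_normal, v_tangential : impinging velocity components (m/s)
    d                      : droplet diameter (m)
    rho_d, sigma_d         : water density (kg/m^3) and surface tension (N/m)
    """
    we_n = rho_d * v_normal**2 * d / sigma_d      # normal Weber number, Eq. (3)
    e_n = max(0.0, 1.0 - 0.1630 * we_n**0.3913)   # Eq. (2); clamped at zero
    return e_n * v_normal, 0.95 * v_tangential    # ~5% tangential loss

# Example: a 100 micron droplet hitting at 5 m/s normal, 2 m/s tangential
print(rebound_velocity(5.0, 2.0, 100e-6))
```

For the 100 µm droplet in the example, Wen is about 34 and the droplet leaves the wall with roughly a third of its incoming normal velocity, consistent with the range of the data in Figure 5.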
Fig. 4. Droplet bounce at the heated wall
[Figure: droplet normal coefficient of restitution, en, versus impinging normal Weber number, Wen (0–80), for water droplets; data of Hatta et al. (d = 330–480 µm, Tw = 500 °C), Karl et al. (d = 90 µm, above the Leidenfrost temperature), Wachters & Westerling (d = 1.7 mm, Tw ≈ 400 °C) and Naber & Farrell (d = 310 µm, Tw = 400 °C), with the best fit of all data, en = 1 − 0.1630 Wen^0.3913.]
Fig. 5. Droplet normal coefficient of restitution at steel plates heated above the Leidenfrost temperature
3. Heat Transfer in Multiphase Sprays

Air-assisted water sprays are used in the cooling of high or low temperature surfaces in many industrial applications. Applications associated with rapid cooling from high temperatures include thin strip casting, glass tempering and electronic chip cooling, while low temperature cooling applications include beef or lamb carcass chilling and the chilling of food and vegetable produce.

3.1 Quenching of metallic surfaces heated above the saturation temperature

For metallic surfaces heated to temperatures above the droplet saturation temperature, there are three modes of heat transfer associated with the multiphase spray cooling process (Figure 6): a) conduction and convection associated with the droplet contact with the heated surface, b) convection associated with the bulk air flow and the droplet cooling of the thermal boundary layer, and c) surface radiation. For sufficiently high incoming droplet momentum, better droplet-to-surface contact, and therefore better surface wetting, can be established if surface flooding and droplet-to-droplet interactions are minimized; those two effects are detrimental to cooling. When surfaces are heated to temperatures near the critical heat flux (nucleate boiling), spray heat transfer is referred to as heat transfer by wet contact, because the droplets are in continuous or semi-continuous contact with the heated surface. In this case, the surface heat transfer is at its maximum. When surfaces are heated to the Leidenfrost temperature, spray heat transfer is referred to as heat transfer by non-wet contact. This corresponds to the case where, after a short period of droplet contact with the surface, a vapor film layer is quickly generated between the droplets and the surface, preventing further direct contact. The heat transfer in this case is at its minimum. In this latter cooling regime, the incoming droplet momentum has a significant influence on the cooling efficiency: for sufficiently high incoming momentum, the droplets can penetrate through the film layer, and more surface contact can be established, leading to higher spray cooling effectiveness. A comparison between the two boiling regimes (wet and non-wet cooling) is shown in
Figure 7. Sprays can be characterized as dilute sprays, intermediate dense sprays, or dense sprays; the deciding parameter is the flow flux (the water mass flow rate per unit surface area). Spray flow fluxes less than 2 kg/s.m² are considered dilute sprays (Deb & Yao, 1989), while flow fluxes slightly above 2 kg/s.m² are associated with intermediate dense sprays. For both boiling regimes, the higher the impinging droplet Weber number, the stronger the droplet contact heat transfer; this droplet Weber number should not be confused with the spray Weber number, which is defined differently (see Eq. (5)). At a certain critical droplet Weber number (Figure 7), the droplet will disintegrate during impaction at the surface. For dilute sprays, the droplet contact heat transfer is expected to increase linearly with the impinging droplet Weber number because droplet interaction is minimal. As the droplet size increases, the spray transitions from dilute to intermediate dense to dense spray. As a result, surface flooding and droplet-to-droplet interactions also increase, leading to saturation in the droplet contact heat transfer (refer to the actual curves in Figure 7).
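A minimal sketch of this classification, assuming the ~2 kg/s.m² dilute-spray threshold of Deb & Yao (1989); the upper bound separating intermediate dense from dense sprays is an illustrative assumption, since the text gives no exact figure.

```python
# Minimal sketch: classify a spray by its water mass flow flux. The 2 kg/s.m2
# dilute threshold follows Deb & Yao (1989); the 10 kg/s.m2 dense-spray
# boundary is an assumed illustrative value.

def classify_spray(water_flow_rate, area):
    """water_flow_rate in kg/s, wetted target area in m2."""
    flux = water_flow_rate / area        # flow flux G, kg/s.m2
    if flux < 2.0:
        return flux, "dilute"
    elif flux < 10.0:                    # assumed upper bound
        return flux, "intermediate dense"
    return flux, "dense"

print(classify_spray(water_flow_rate=1e-3, area=0.01))   # -> (0.1, 'dilute')
```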
[Figure: schematic of droplet contact heat transfer versus impinging droplet Weber number, contrasting ideal and actual curves in the wet and non-wet cooling regimes. In the dilute-spray range, cooling increases with droplet spread and with repeated impacts as the droplet number grows, while low droplet momentum gives less droplet spread and minimal cooling efficiency. Beyond the critical Weber number, Wec, droplet breakup starts. In the intermediate dense and dense spray ranges, increased droplet interaction leaves insufficient time for the wall temperature to "bloom back", and the contact heat transfer saturates.]
Fig. 6. Spray heat transfer modes for surfaces heated above saturation temperature
Fig. 7. Droplet contact heat transfer

For high temperature surfaces, the maximum release of heat from the spray consists of: a) the pre-boiling cooling potential of the liquid droplets, b) the release of heat when the liquid droplets completely evaporate at the surface, and c) the superheating of the vapor to the surface temperature. The spray heat transfer effectiveness, ε, is expressed as:

ε = q″ / [G (hfg + cp,l (Tsat − Tl) + cp,v (Ts − Tsat))]  (4)
where q″ is the droplet heat flux, G is the water mass flux (defined as the water mass flow rate per unit surface area), hfg is the enthalpy of vaporization, cp,l is the liquid specific heat, cp,v is the vapor specific heat, Tsat is the liquid saturation temperature, Tl is the liquid temperature, and Ts is the target surface temperature. Researchers have experimented with two types of nozzles: single-fluid nozzles that disperse water alone, and multiphase-fluid nozzles that disperse water and air. Auman et al. (1967), Fujimoto et al. (1997), and Ciofalo et al. (1999) have conducted experiments using nozzles that disperse water droplets alone, while Ohkubo and Nishio (1992), Toda (1972), and Puschmann and Specht (2004) have used nozzles dispersing water droplets with air. Sozbir
and Yao (2004) have conducted experiments using both water alone and water with air. All of these experiments were conducted on plates heated to temperatures above the Leidenfrost point. Experimental data gathered from the above sources are compiled and presented in Figure 8, which shows the spray heat transfer effectiveness as a function of the spray Weber number. The results reveal multiphase sprays to be more efficient than single-phase sprays, because air injected with the water increases the droplet momentum and enhances the impaction and heat transfer. The spray Weber number in Figure 8, Wes, is defined as follows:

Wes = G² d / (ρd σd)  (5)
[Figure: spray heat transfer effectiveness, ε, versus spray Weber number, Wes (10⁻⁸ to 10⁰, film boiling region), for air/water nozzles (Ohkubo & Nishio; Toda; Puschmann & Specht; Sozbir & Yao; Olden et al.) and water-only nozzles (Sozbir & Yao; Ciofalo et al.; Fujimoto et al.; Auman et al., fan nozzle). Best-line fits: single-fluid sprays, ε = 0.0035 Wes^−0.16 + 1×10⁻⁶ Wes^0.6; two-fluid sprays, ε = 0.0089 Wes^−0.1798 + 1×10⁻⁶ Wes^0.6, with extrapolation beyond the data range.]
Fig. 8. Spray heat transfer effectiveness for single-fluid and two-fluid nozzles

During impaction, the droplet mass can be recalculated based on the droplet contact heat transfer empirical correlation (Eq. 4), and the excess mass, that is, the difference between the incoming droplet mass and the recalculated mass (after the droplet makes contact), is released as saturated vapor near the surface. The released vapor mass has the same momentum and energy as the liquid phase from which it was created. The released vapor mass, mv, can be calculated from the droplet enthalpy change before and after impaction as shown in the following equation (Issa, 2003):

mv = [(ε − 1) md cp,l (Tsat − Td) + ε md hfg + ε md cp,v (Ts − Tsat)] / [cp,v (Ts − Tsat) + hfg]  (6)

where md is the droplet mass before impaction and Td is the droplet temperature before impaction.
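The following sketch chains the two-fluid best-fit of Fig. 8 into Eq. (6). The water property values (density, surface tension, specific heats, and latent heat at atmospheric pressure) are assumed, and the function names are hypothetical.

```python
# Minimal sketch: spray heat transfer effectiveness from the two-fluid best
# fit of Fig. 8, then the vapor mass released on impact, Eq. (6).

def spray_effectiveness(G, d, rho_d=998.0, sigma_d=0.0728):
    we_s = G**2 * d / (rho_d * sigma_d)         # spray Weber number, Eq. (5)
    return 0.0089 * we_s**-0.1798 + 1e-6 * we_s**0.6

def released_vapor_mass(eps, m_d, T_d, T_s, T_sat=100.0,
                        cp_l=4186.0, cp_v=2010.0, h_fg=2.257e6):
    """Vapor mass (kg) released by one droplet of mass m_d, Eq. (6)."""
    num = ((eps - 1.0) * m_d * cp_l * (T_sat - T_d)
           + eps * m_d * h_fg
           + eps * m_d * cp_v * (T_s - T_sat))
    return num / (cp_v * (T_s - T_sat) + h_fg)

eps = spray_effectiveness(G=2.5, d=20e-6)       # flux of 2.5 kg/s.m2, 20 um
m_d = 998.0 * 3.1416 / 6 * (20e-6)**3           # spherical droplet mass, kg
print(eps, released_vapor_mass(eps, m_d, T_d=27.0, T_s=525.0))
```

For these illustrative conditions the effectiveness is about 0.1, in the range of the two-fluid data of Figure 8, and a fraction of a percent of the droplet mass is released as vapor at the wall.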
3.2 Chilling of non-metallic surfaces heated to near room temperature

There are three modes of heat transfer associated with the air-mist chilling of surfaces heated to temperatures slightly above room temperature: a) convection heat transfer associated with the bulk air flow, b) evaporation heat transfer of the droplets while airborne and at the surface, and c) sensible heat transfer associated with the droplet contact with the surface. The total heat transfer rate, qtotal, can be expressed as:
qtotal = ha A (Ts − T∞) + ṁe hfg + ṁw cp,l (Ts − T∞)  (7)
where ha is the bulk air heat transfer coefficient, A is the chilled surface area, T∞ is the air temperature, ṁe is the droplet evaporation mass flow rate, and ṁw is the mass flow rate of the impinging droplets. In this low surface temperature cooling regime, the spray heat transfer enhancement factor, ξ, can be defined as the ratio of the heat transfer of the two-phase flow (i.e., air and liquid water), qtotal, to the heat transfer of the single-phase flow (i.e., air alone), qa:
ξ = qtotal / qa  (8)
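A minimal sketch of Eqs. (7) and (8); all flow rates and coefficients in the example are illustrative assumptions, not measured values from the studies cited.

```python
# Minimal sketch of the air-mist chilling heat balance, Eqs. (7)-(8).

def q_total(h_a, A, T_s, T_inf, m_evap, m_imp,
            h_fg=2.45e6, cp_l=4186.0):
    """Total heat transfer rate (W), Eq. (7)."""
    return (h_a * A * (T_s - T_inf)          # bulk air convection
            + m_evap * h_fg                  # droplet evaporation
            + m_imp * cp_l * (T_s - T_inf))  # sensible heat of impinging drops

h_a, A, T_s, T_inf = 25.0, 0.5, 40.0, 1.0    # W/m2.K, m2, deg C (assumed)
q_air = h_a * A * (T_s - T_inf)              # single-phase (air only) baseline
q_mist = q_total(h_a, A, T_s, T_inf, m_evap=1e-4, m_imp=5e-4)
print("enhancement factor xi =", q_mist / q_air)   # Eq. (8)
```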
For low temperature applications, in order to reduce the amount of water lost to dehydration by the product during cooling (as in the case of processed meat and vegetable chilling), it is important to maximize droplet impaction so as to create a thin water film layer on the target surface. The water film allows some of the water to penetrate through the surface pores, minimizing the amount of water lost to dehydration. Optimizing both the droplet surface wetting and the heat transfer is essential in these applications, and there is an optimal droplet size that is best suited to achieve both maximum heat transfer and surface wetting capability.
4. Multiphase Sprays in Industrial Applications

4.1 Spray quenching of thin metallic strips

In thin strip casting, glass tempering, and electronic chip cooling, air-assisted water spray cooling (Figure 9) promises to be the most efficient cooling method due to the increase in the contact area between the liquid droplets and the hot surface. Cooling by air-assisted water sprays has several advantages. It provides uniformity in cooling, which leads to improvements in the material properties (glass and steel strips) and in the flatness control of the finished product (steel strips). It is also cost effective because it optimizes water consumption and reduces the expenses associated with water recycling and filtration. Studies have recently been conducted on upward-facing (Issa & Yao, 2004, 2005) (Figure 10) and downward-facing surfaces (Issa, 2007) (Figures 11 and 12) to model the spray transportation process, droplet impaction, and heat transfer phenomena. Parametric studies were conducted to investigate the effect of the droplet size, air-to-liquid loading, nozzle-to-surface distance and flow operating conditions on droplet impaction and heat transfer enhancement. Gravity is shown to have a strong effect on the spray impaction. As the air flow rate increases while the water flow rate is kept the same, the spray impaction efficiencies on the top and bottom surfaces become almost identical (Figure 3). Spray impaction improves as the air loading increases, due to the increase in the droplet momentum. Spray impaction is also strongly dependent on the droplet size, and increases as the droplet size increases: large droplets attain high terminal velocities and make it to the surface, while small droplets may completely evaporate before reaching the target surface. In the cooling
of a downward-facing surface, the droplet net incoming momentum and the gravitational force act in opposite directions. Therefore, it is possible that for a certain spray droplet size the gravitational force may overcome the net flow momentum and cause the droplets to fall backward before hitting the surface. On the heat transfer side, the smaller the droplet size, the better the spray heat transfer effectiveness. However, for a downward-facing surface there is an optimal droplet size best suited for optimizing both the spray impaction and the heat transfer; its selection depends on the flow operating conditions and the nozzle-to-surface distance.
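To make the terminal-velocity argument concrete, the sketch below evaluates the Stokes-law settling velocity for a few droplet sizes. This drag law is an assumption of the illustration (valid only at small droplet Reynolds numbers, roughly d below ~80 µm in still air) and is not the full transport model used in the cited simulations.

```python
# Minimal sketch (assumed physics, not the authors' model): Stokes-law
# terminal velocity of a water droplet in air, illustrating why larger
# droplets "make it" to the surface while fine droplets drift.

def terminal_velocity(d, rho_d=998.0, rho_air=1.2, mu_air=1.8e-5, g=9.81):
    """Stokes terminal settling velocity (m/s) for droplet diameter d (m)."""
    return (rho_d - rho_air) * g * d**2 / (18.0 * mu_air)

for d in (10e-6, 20e-6, 50e-6, 80e-6):
    print(f"d = {d*1e6:4.0f} um -> v_t ~ {terminal_velocity(d):.4f} m/s")
```

A 10 µm droplet settles at only a few millimeters per second and so follows the air stream, while droplets much larger than this range (where a full drag-law correction would be needed) carry enough momentum to reach the surface.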
Fig. 9. Water-air spraying system (air-water slit nozzle with separate water and air feeds)
Fig. 10. Spray simulation for 20 μm droplets (spraying from top)
Fig. 11. Spray simulation for 100 μm droplets (spraying from below)
Fig. 12. Spray simulation for 1000 μm droplets (spraying from below)

In air-assisted water sprays, a wide range of droplet sizes can be generated during atomization, depending on the water and air flow conditions. Typical results are shown in Figure 13 (Issa & Yao, 2005). In this case a full conical spray is injected from a distance of 40 mm above a stainless steel plate heated to 525 °C. The air and water mass flow rates are 2×10⁻³ and 10⁻⁴ kg/s (20:1 loading ratio), respectively. The nozzle spray angle is 13°, and the air velocity is 35 m/s. Under these conditions, the two-phase fluid nozzle disperses a spectrum of droplet diameters ranging from 9 to 63 μm, with an average diameter of 19.2 μm by volume. A comparison between the modeled spray heat transfer and experimental data for the air-water quenching of the heated plate is shown in Figure 14 (Issa & Yao, 2005). Using a multi-size spectrum for the droplet distribution, droplets contact the wall at random locations, with the largest hitting in the vicinity of the jet impingement point and the smaller droplets hitting further away, resulting in uniform cooling away from the plate center.
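The droplet spectrum in Fig. 13 is characterized by a Rosin-Rammler distribution (mean diameter 40 µm, spread parameter n = 2.3). The short sketch below evaluates its cumulative volume fraction, assuming the standard Rosin-Rammler form; the function name is hypothetical.

```python
# Minimal sketch: Rosin-Rammler droplet size distribution for the atomized
# spray spectrum of Fig. 13 (mean diameter 40 um, spread parameter n = 2.3).
# The CDF gives the cumulative volume fraction in droplets smaller than d.
import math

def rosin_rammler_cdf(d, d_mean=40e-6, n=2.3):
    return 1.0 - math.exp(-((d / d_mean) ** n))

# Volume fraction carried by droplets in the dispersed 9-63 um range:
frac = rosin_rammler_cdf(63e-6) - rosin_rammler_cdf(9e-6)
print(f"{frac:.2%} of the spray volume lies in 9-63 um droplets")
```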
[Figure 13 data: histogram of droplet count versus droplet diameter (9–63 µm); average d (by volume) = 19.2 µm, average d (Rosin-Rammler) = 40 µm, spread parameter n = 2.3, spray angle = 13°, air pressure = 14 psig, water pressure = 10 psig; based on the spray experiment by Sozbir & Yao. Figure 14 data: total heat transfer coefficient (W/m².K) versus distance from the plate center (0–0.05 m), comparing model simulations of the air-only and total heat transfer with Chang & Yao data; computation conditions: full cone spray, average d = 19 µm (by volume), Va = 35 m/s, Tliq = 27 °C, Tw = 525 °C, θs = 13°, air mass flow rate = 0.002 kg/s, water mass flow rate = 10⁻⁴ kg/s, G at the plate center = 2.5 kg/m².s.]
Fig. 13. Spectrum of water droplet distribution count
Fig. 14. Simulation versus experimental data for the air-water quenching of a steel plate (top surface)

4.2 Spray cooling in heat exchangers

In the external cooling of heat exchangers, air has traditionally been forced over the exterior surface of the heat exchanger. However, the use of air alone limits the overall efficiency of the system, so further enhancement of the external cooling of heat exchangers has been of critical concern. Experiments have been conducted on the external cooling of heated cylindrical surfaces where mist is injected with a forced air flow (Issa, 2008-a). The results show that with the introduction of mist into the air, the overall heat transfer effectiveness can increase by up to 700%. The use of air-mist can considerably reduce the consumption of forced air (and therefore the energy consumption) traditionally required to cool the exterior surface of the tubes in a shell-and-tube heat exchanger. Experimental and numerical simulation studies were recently conducted to investigate the effect of the spray operating conditions on the heat transfer enhancement in the cooling of cylindrical surfaces heated to temperatures in the nucleate boiling region. Test measurements show the dependency of the air-water spray heat transfer and droplet dynamics on factors such as the spray droplet size, liquid-to-air loading and water flow rate. Figure 15 shows the overall experimental setup, and Figure 16 shows a close-up view of the locations of the drilled holes along the cylinder wall. The air-mist nozzle in this setup provided a spectrum of droplets ranging from 5 to 100 microns. The size of the water droplets produced is controlled by the nozzle operating flow conditions and the liquid-to-air loading ratio: smaller droplets are generated by increasing the air pressure while decreasing the liquid pressure, and vice versa.
Fig. 15. System setup for spray cooling of a heated cylinder
Fig. 16. Arrangement of the drilled holes (#1–#8) on a steel cylinder
The local air-water heat transfer coefficient was calculated at the eight angular positions on the cylinder surface, and the results are shown in Figure 17 (Issa, 2008-a) for test cases in which both air and water are dispersed. The results show the heat transfer coefficient to be highest at the stagnation point, gradually decreasing as the hydrodynamic boundary layer develops over the cylinder surface. As the water flow rate increases, the spray becomes denser and water flooding near the stagnation point is seen to increase. Figure 18 shows the spray average heat transfer coefficient as a function of the water mass flow rate. The sharp increase in the heat transfer coefficient at high liquid loadings is due to the ballistic impaction of the spray, which has a favorable effect on the heat transfer enhancement. The results show that for dilute sprays (water flow flux around 2 kg/s.m² or less), larger droplets result in better cooling, whereas for dense sprays, smaller droplets result in better cooling. This is because in dense sprays there is more interaction between droplets, and large droplets lead to an increase in surface flooding, which is detrimental to the heat transfer. In dilute sprays (which have lower spray momentum than dense sprays), large droplets increase the incoming droplet momentum, causing the droplets to spread more at the surface during impaction and therefore enhancing the heat transfer. Experimental tests show that as the air pressure increases, the spray becomes ballistic and causes a tremendous increase in the satellite droplets generated during impaction, which enhances the heat transfer.
[Test conditions for Figs. 17 and 18: air pressures of 5, 10 and 30 psi with water flow rates of 0.5–2.0 gph, giving droplet sizes of 42–85 microns; Tsurface = 123–130 °C, Tair = 22.1–25.7 °C, TH2O = 22.9–24.7 °C.]
Fig. 17. Local air-water spray heat transfer coefficient, hmix (W/m².K), versus angular position on the test cylinder
Fig. 18. Average spray heat transfer coefficient over the cylinder surface, hs,avg (W/m².K), versus water mass flow rate
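As a minimal sketch of how the surface-average coefficient of Fig. 18 relates to the local values of Fig. 17: with the eight holes equally spaced in angle, the average is simply the arithmetic mean. The local values below are hypothetical, chosen only to mimic the stagnation-point peak and boundary-layer decay described above.

```python
# Minimal sketch (illustrative values, not the measured data): averaging the
# local spray heat transfer coefficient at the eight angular hole positions
# into the surface-average coefficient hs_avg reported in Fig. 18.

# Hypothetical local coefficients (W/m2.K) at theta = 0, 45, ..., 315 deg:
h_local = [1200.0, 950.0, 620.0, 380.0, 300.0, 380.0, 620.0, 950.0]

# Equal angular spacing, so the surface average is the arithmetic mean:
hs_avg = sum(h_local) / len(h_local)
print(f"hs_avg ~ {hs_avg:.0f} W/m2.K")
```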
4.3 Spray chilling of food and meat products

In the research on beef carcass chilling (Figure 19-a), a variety of chilling techniques have been adopted over the last decade, ranging from conventional air chilling systems (Mallikarjunan & Mittal, 1994) to air-assisted chilling systems with a variety of intermittent cooling schemes (Strydom & Buys, 1995). In these studies, the amount of water used to cool the beef carcass ranged from 3.5 to 7 gallons per carcass, with spray cooling periods from 10 to 17 hours. Most commercial spray chilling systems use nozzles producing large droplets, such as full jet nozzles (1140–4300 μm average droplet size) and hollow cone nozzles (360–3400 μm average droplet size). Computer modeling and experimental studies have recently been conducted on the spray chilling of food products (Figure 19-b) and beef test specimens (Figure 20) to investigate the effect of using a two-phase flow (air and water) with a fine spray droplet size in the chilling process (Issa, 2008-b). These studies show promising results, with substantial improvements in heat transfer enhancement. Recent experimental tests investigated the effect of the spray droplet size and water flow rate on the spray heat transfer. The test data show that the application of excess water is detrimental to the cooling effectiveness on a beef surface, and that maximum heat transfer occurs when the spray median droplet size is less than 40 μm (Figure 21). Surface flooding increases with droplet size, and when a large amount of water is used, the cooling effectiveness of the multiphase spray reduces to that of a forced air jet. For the same air and water mass flow rates, sprays with larger droplets have a lower droplet number density but higher droplet momentum; more droplets impact the target surface than when finer droplets are dispersed, and the number of drifting droplets decreases sharply. As the air-to-liquid loading increases (for the same water flow rate and the same droplet size), the impaction efficiency drastically increases due to the increase in the droplet impinging velocity and momentum. The rate of evaporation is governed by the gradient of the vapor concentration between the droplet and the bulk air: when the ambient air is saturated (i.e., the relative humidity is 1), the impaction efficiency slightly increases because the droplet evaporation rate is lowered. In general, as the nozzle-to-surface distance increases, droplet evaporation increases and larger droplets are needed to achieve higher impaction efficiency. One of the challenges in air-mist chilling is to understand the effect of the droplet size on the heat transfer enhancement. Since in air-mist chilling the cooling medium is a two-phase flow, the first question to be addressed is what droplet size is required for the best heat transfer enhancement. Also, in order to reduce the amount of water lost to dehydration by the product during cooling (as in the case of beef carcass or vegetable cooling), it is important to maximize droplet impaction to create a thin water film on the target surface. The second question to be addressed is what droplet size is best suited to achieve both maximum heat transfer and surface wetting capability. Recent studies show that the optimal droplet size required for maximum impaction differs from that required for maximum heat transfer enhancement (Figure 22).
To optimize both the heat transfer and the surface wetting criteria, a multi-objective optimization methodology can be applied, in which the net optimal droplet size lies between the median droplet size required for maximum heat transfer and that required for maximum surface impaction.
[Figure: schematics of the air-mist chilling arrangements, showing the air and water flow lines, flow meters, pressure gauges/regulators, pump, chiller, ice-water tank, cooling header, air-mist nozzles, beef specimen or food products, and the thermocouple/computer data acquisition system.]
Fig. 19. Air-mist chilling: (a) beef carcass, (b) food products
Fig. 20. Experimental setup for chilling of beef test specimen
[Figure 21 data: air-mist heat transfer coefficient, hs (W/m².°C), versus water flow rate, Qw (0–80 mL/min), for multiphase cooling test data (Pair = 7–10 psi, PH2O = 4–10 psi, d = 40–64 µm) compared with forced air convection (Pair = 10 psi) and stagnant air natural convection; best fit of the test data: hs = 0.0122Qw² − 1.59Qw + 61.7, with extrapolation beyond the measured range; Tsurf = 42.5 °C. Figure 22 data: heat transfer enhancement factor and droplet impaction efficiency, η (%), versus droplet diameter, d (0–120 µm); conditions: nozzle spray angle = 13°, target side length L = 0.5 m, nozzle height X = 0.3 m (X/L = 0.6), water/air loading = 4.4, water flow rate = 0.076 gph, air flow rate = 7 lpm, Vair = 13.75 m/s, Tair = TH2O = 1 °C.]
Fig. 21. Air-mist spray heat transfer coefficient versus water flow rate for the chilling of a beef test specimen
Fig. 22. Heat transfer enhancement factor and impaction efficiency versus droplet diameter

High water mass flow flux has a detrimental effect on the surface cooling because of the flooding that can occur on the surface. Furthermore, if the spray droplets are too large, there is a risk of water running quickly down the chilled surface. In the air-mist chilling of beef carcasses, water runoff on the surface has a detrimental effect on the quality of the processed beef because of the bleached streaks it can produce on the surface. Therefore, predicting the amount of water flooding on the surface is critical to the cooling operation. In the air-mist chilling of beef carcasses, the water sprays are turned on intermittently. As time elapses while the spraying system is on, more water droplets accumulate on the surface; on a vertical surface (as in the case of a beef carcass), gravity pulls the water film downward. When the water film reaches a critical thickness, gravity overcomes the effect of surface tension and water starts running down the surface. How fast the water film runs down the surface depends on the size of the droplets used in the spray: the droplet size determines the time it takes the water film to reach its maximum speed during runoff. This information provides a clue to what the intermittent spraying time and the frequency of spraying should be in the chilling operation. The time required to reach a steady dripping velocity decreases considerably, from about 30 s for 20 μm droplets to 7 s for 100 μm droplets. In order to avoid excessive water overflow, the spraying time should be substantially reduced when large droplets are sprayed.
5. Conclusion

Experimental studies and numerical simulations have been conducted on the use of multiphase sprays in cooling applications such as metal plate quenching, external cooling of heated tubes in heat exchangers, and chilling of food and meat products. The studies reveal that the impaction efficiency and heat transfer effectiveness of multiphase sprays depend strongly on the spray droplet size, air-to-liquid loading, and nozzle distance from the target surface. The impaction efficiency increases as the droplet incoming momentum increases, for example by dispersing larger droplets. In applications where spraying is done in a direction opposite to gravity, the gravitational force may cause larger droplets to fall backward before hitting the surface if the spray momentum is not sufficiently strong. On the heat transfer side, the smaller the droplet size, the better the spray heat transfer effectiveness. In all cooling applications, therefore, there is an optimal droplet size best suited for optimizing both the spray impaction and the heat transfer effectiveness.
6. References

Auman, P.M.; Griffiths, D.K. & Hill, D.R. (1967). Hot Strip Mill Runout Table Temperature Control, Iron and Steel Engineer, Vol. 44, pp. 174-179, ISSN 0021-1559
Chandra, S. & Avedisian, C.T. (1991). On the Collision of a Droplet with a Solid Surface, Proceedings of the Royal Society of London A, Vol. 432, No. 1884, pp. 13-41, ISSN 0962-8444
Ciofalo, M.; Piazza, I.D. & Brucato, V. (1999). Investigation of the Cooling of Hot Walls by Liquid Water Sprays, International Journal of Heat and Mass Transfer, Vol. 42, No. 7, pp. 1157-1175, ISSN 0017-9310
Deb, S. & Yao, S.C. (1989). Analysis on Film Boiling Heat Transfer of Impacting Sprays, International Journal of Heat and Mass Transfer, Vol. 32, No. 11, pp. 2099-2112, ISSN 0017-9310
Fujimoto, H.; Hatta, N.; Asakawa, H. & Hashimoto, T. (1997). Predictable Modelling of Heat Transfer Coefficient between Spraying Water and a Hot Surface above the Leidenfrost Temperature, ISIJ International, Vol. 37, No. 5, pp. 492-497, ISSN 0915-1559
Hatta, N.; Fujimoto, H.; Kinoshita, K. & Takuda, H. (1997). Experimental Study of Deformation Mechanism of a Water Droplet Impinging on Hot Metallic Surfaces above the Leidenfrost Temperature, Journal of Fluids Engineering, Vol. 119, No. 3, pp. 692-699, ISSN 0098-2202
Issa, R. (2003). Numerical Modeling of the Dynamics and Heat Transfer of Impacting Sprays for a Wide Range of Pressures, Ph.D. Dissertation, University of Pittsburgh, Pittsburgh, PA
Issa, R. & Yao, S.C. (2004). Modeling of the Mist Dynamics and Heat Transfer at Various Ambient Pressures, Proceedings of the 2004 ASME Heat Transfer/Fluids Engineering Summer Conference, ISBN 0791846903, July 11-15, 2004, Charlotte, NC
Issa, R. & Yao, S.C. (2005). A Numerical Model for Spray-Wall Impactions and Heat Transfer at Atmospheric Conditions, Journal of Thermophysics and Heat Transfer, Vol. 19, No. 4, pp. 441-447, ISSN 0887-8722
Issa, R. (2007). Optimal Spray Characteristics in the Air-Assisted Water Spray Cooling of a Downward-Facing Heated Surface, 24th ASM International Heat Treating Society Conference, ISBN 9781604239300, Sept. 17-19, 2007, Detroit, MI
Issa, R. (2008-a). Simulation and Experimental Investigation into the Cooling of a Heated Cylinder using an Air-Water Spray, Proceedings of AIAA 40th Thermophysics Conference, ISBN 9781605603711, June 23-26, 2008, Seattle, WA
Issa, R. (2008-b). Numerical Investigation of the Chilling of Food Products by Air-Mist Spray, International Journal of Fluid and Thermal Engineering, Vol. 1, No. 3, pp. 130-139, ISSN 2070-3759
Karl, A.; Rieber, M.; Schelkle, M.; Anders, K. & Frohn, A. (1996). Comparison of New Numerical Results for Droplet Wall Interactions with Experimental Results, Proceedings of the 1996 ASME Fluids Engineering Division Conference, ISBN 0791817911, July 7-11, 1996, San Diego, CA
Lawson, T.J. & Uk, S. (1979). The Influence of Wind Turbulence, Crop Characteristics and Flying Height on the Dispersal of Aerial Sprays, Atmospheric Environment, Vol. 13, No. 5, pp. 711-715, ISSN 1352-2310
Lee, S.H. & Ryou, H.S. (2000). Development of a New Spray/Wall Interaction Model, International Journal of Multiphase Flow, Vol. 26, No. 7, pp. 1209-1234, ISSN 0301-9322
Mallikarjunan, P. & Mittal, G.S. (1994). Heat and Mass Transfer during Beef Carcass Chilling - Modelling and Simulation, Journal of Food Engineering, Vol. 23, No. 3, pp. 277-292, ISSN 0260-8774
Mundo, C.; Tropea, C. & Sommerfeld, M. (1997). Numerical and Experimental Investigation of Spray Characteristics in the Vicinity of a Rigid Wall, Experimental Thermal and Fluid Science, Vol. 15, No. 3, pp. 228-237, ISSN 0894-1777
Naber, J.D. & Farrell, P.V. (1993). Hydrodynamics of Droplet Impingement on a Heated Surface, SAE Journal, SAE Publication No. 930919, March 1993
Ohkubo, H. & Nishio, S. (1992). Study on Transient Characteristics of Mist-Cooling Heat Transfer from a Horizontal Upward-Facing Surface, Heat Transfer - Japanese Research, Vol. 21, No. 6, pp. 543-555, ISSN 0096-0802
Puschmann, F. & Specht, E. (2004). Transient Measurement of Heat Transfer in Metal Quenching with Atomized Sprays, Experimental Thermal and Fluid Science, Vol. 28, No. 6, pp. 607-615, ISSN 0894-1777
Scheller, B.L. & Bousfield, D.W. (1995). Newtonian Drop Impact with a Solid Surface, AIChE Journal, Vol. 41, No. 6, pp. 1357-1367, ISSN 0001-1541
Sozbir, N. & Yao, S.C. (2004). Experimental Investigation of Water Mist Cooling for Glass Tempering, Atomization and Sprays, Vol. 14, No. 3, pp. 191-210, ISSN 1045-5110
Strydom, P.E. & Buys, E.M. (1995). The Effects of Spray-Chilling on Carcass Mass Loss and Surface Associated Bacteriology, Meat Science, Vol. 39, No. 2, pp. 265-276, ISSN 0309-1740
Toda, S. (1972). A Study of Mist Cooling (1st Report: Investigation of Mist Cooling), Transactions of JSME, Vol. 38, pp. 581-588, ISSN 0387-5016
Wachters, L.H.J. & Westerling, N.A.J. (1966). The Heat Transfer from a Hot Wall to Impinging Water Drops in the Spheroidal State, Chemical Engineering Science, Vol. 21, No. 11, pp. 1047-1056, ISSN 0009-2509
Zheng, J.; Zhou, H. & Xu, Y. (2002). Advances in Pesticide Electrostatic Spraying Research in China, 2002 ASAE Annual International Meeting/CIGR XVth World Congress, July 28-31, 2002, Chicago, IL
20

Web Technologies for Language Learning: Enhancing the Course Management System

Afendi Hamat and Mohamed Amin Embi
Universiti Kebangsaan Malaysia, Malaysia
1. Introduction

Technology and language education are not newly found partners, as evidenced by the proliferation of language labs in the 70s and 80s of the previous century. The rise of the Web has brought new and 'exciting' technologies for use by the language teaching community, among them the platforms known as course management systems. Course Management Systems (CMSs) are systems that provide facilities for teachers and students to engage in teaching and learning activities online by helping to manage various functions like course content preparation and delivery, communication, assessment, administrative functions and collaboration (Ellis, 2001; Nichani, 2001). Other terms have also been used to describe CMSs: online learning environment, virtual learning environment and course-in-a-box (Collis & De Boer, 2004). The rapid adoption of CMSs by institutions around the world is truly dramatic. Warger (2003) reported that these systems have become essential to the drive by institutions of higher learning (IHLs) to implement instructional technology. The available literature on the use of CMSs for language learning and instruction is largely promotional in nature, such as Brandl (2005), Robb (2004), Siekmann (1998) and Godwin-Jones (2003). Research and reports that deal with CMSs in language learning environments include Masuyama and Shea (2003), Masuyama (2005), Zhang and Mu (2003) and Da (2003). The paucity of research related to the use of CMSs for language learning could very well lie in their so-called strength, a "one-size-fits-all" philosophy (Kuriloff, 2001) that casts the learning of all subjects in the same manner. Language learning is vastly different from the learning of other subjects (Moffett & Wagner, 1983), yet CMSs are designed with uniformity of tools and features. This has given rise to calls for CMSs to provide more flexibility to better allow language teaching and learning to take place. Kuriloff (2001) argues that CMSs "cater to the lowest denominator" as they treat the teaching of all disciplines in the same manner. Sanchez-Villalon and Ortega (2004) and Kuriloff (2001) describe the lack of functionalities for writing and composition in the current crop of CMSs. Corda and Jager (2004) note that CMSs currently offer more assessment features than the language practice features commonly associated with language learning. The study presented in this chapter is part of a broader study to develop a design framework for a course management system (CMS) oriented towards language instruction in
Malaysian institutions of higher learning. This study specifically aims to identify, in a systematic manner, the technologies used for the purpose of language learning and teaching. It contributes to the larger study by providing a focus on the kinds of technologies that need to be integrated within the course management system. This chapter is divided into the following sections: data selection and methodology, discussion and, lastly, conclusion.
2. Data Selection and Methodology

The initial problem faced by this study, after choosing the appropriate methodology, was data selection. If the subject (language learning and web technologies) were taken as the sole criterion for data selection, the data would be large and unmanageable. The decision was made to use the Thomson Social Science Citation Index as the basis for data selection, with the following justifications:
i. The Social Science Citation Index is the established and recognized index for leading journals and publications within the larger domain of the social sciences. It provides a solid basis for the initial selection of data sources.
ii. Although there is only one relevant journal (Language Learning and Technology), this number belies the true amount of data, as the journal itself specializes in the field of technology-assisted language learning. All articles within the journal have a high potential of being used as data for the research question.
iii. In a qualitative study, the amount of data is not as important as the quality of data, especially where the criteria for selection are adequately justified.
The only journal within the index that deals with the subject is Language Learning and Technology. There are other journals outside the index that deal with the subject, such as System, CALL-EJ, The Reading Matrix and CALL. However, there is no common justification for including them in the list of sources; an attempt to include one or more of the others would result in arbitrary criteria that would be neither justifiable nor defensible for the purpose of the research. The final list is made up of 40 articles from the journal Language Learning and Technology. Once the basis for selection had been established, the next step was to refine the selection based on the question to be investigated. As the aim is to identify the web technologies used for web language learning and teaching, the first step is to determine whether the article examined is about web technologies or traditional computer-assisted language learning. This is a straightforward process, with one exception: if an article that discusses traditional CALL applications suggests that the technology is portable to the web or internet, then the article is included in the data. The next step is to identify the web technologies or applications used within the selected articles. To assist in this process, a matrix display is used to organize the data; the display is then used to support the discussion of the findings. Any qualitative research (or any other type of research) must inevitably deal with the question of validity and reliability. The more appropriate term here is trustworthiness, as defined by Lincoln and Guba (1985), since this research is qualitative in nature. In order to ensure trustworthiness, a panel of experts reviewed the analysis and acted as inter-raters. The input from the expert panel was used to improve the analysis, although there were no major or critical changes to the original analysis.
The following categories of technologies emerge from the data available: synchronous and asynchronous communications, production technologies, web resources and language testing.
3. Discussion

3.1 Synchronous and Asynchronous Communication
Synchronous communications are communications with no, or insignificant, delay between the initiation of the communication and the response; synchronous communication technologies allow for almost simultaneous, real-time communication between users. The most common form of synchronous communication is text-based chat. Fourteen articles within the data specifically mention the use of text chat, some in combination with audio chat. Data 001 (Basharina, 2007), for example, mentions the user preference for chat as opposed to the slower message boards: "This tension reveals students' desire to approximate delayed bulletin board interaction to the immediate response (Thorne, 2003). Based on this, several students from all three cultures expressed their preference for chat over the asynchronous bulletin board interaction." (p. 94) Other forms of synchronous communication on the web include audio/video chat and desktop conferencing. Eight of the data mention the use of voice or video-based chat facilities. Data 011 (Strambi & Bouvet, 2003) mentions the use of voice chat to help prepare students for oral tests (p. 94). Data 024 (Payne & Ross, 2005) describes an experiment using voice chat and concludes that chat may provide a unique form of support to certain types of learners in L2 oral proficiency (p. 50). Asynchronous communication is communication in which there is a perceptible and expected delay between messages. On the web, the most common forms of asynchronous communication are emails and web forums (sometimes also called threaded discussion forums or bulletin boards). Emails are popular due to their accessibility and ease of use (Heisler & Crabill, 2006), while web forums are perhaps among the most popular web technologies for communication (Lally, 1995). Eleven of the forty articles that make up the data mention the use of various forms of email, either on its own or in combination with other applications. The articles deal more with aspects of language use within emails than with email as a technology. For example, Data 032 (Chen, 2006) highlights the issue of language pragmatics when communicating using emails, arguing that since emails lack paralinguistic cues, communication can be challenging, especially for unequal-status communication. Ten of the data mention the use of another popular asynchronous communication tool: the web forum. Data 013 (Weasenforth, Biesenbach-Lucas & Meloni, 2002) argues that the use of threaded discussion forums opens up new learning possibilities that may not be available in a face-to-face environment; however, it also cautions that such use must be well integrated into the learning process to achieve any benefits for learners. In a study on the use of discussion boards in teacher training, Data 015 (Arnold & Ducate, 2006) presents a view, based on previous literature, that the lack of social context cues might hinder communication, together with a counter-argument that this deficiency often leads to more equal participation when compared to a normal classroom. It must be noted, however, that any text-based online communication will suffer from the same lack of social
or paralinguistic cues. Furthermore, the use of emoticons, while far from a perfect representation of human emotions, does help to provide visual clues for better communication in an online, text-based environment such as a web board or a chatroom. The majority of the data collected deal with the subject of communication technologies in language education, and these technologies are available in virtually all course management systems. The question facing a designer is how to design these applications so that there is effective communication within the course management system, with the particular view of enhancing language learning. Afendi and Mohamed Amin (2005) propose a set of guidelines based on Communicative Language Teaching (CLT). They name four aspects of communication that need to be addressed by a course management system: integrated communication design, conversational design, social communication design and multimedia communication design. Table 1 summarizes their ideas:
Integrated Communication Design. CLT basis: communication as both the goal and the process; contextualization of communication. Tools/features design: distribution of communicative functions into other parts of the system (e.g., forum functions within online notes and within language practice exercises).

Conversational Design. CLT basis: a dialogic view of learning and communication; inhibition might discourage language learners from communicating freely. Tools/features design: a multi-directional and private facility to 'converse' with the teacher or fellow students, which could also be made public if agreeable to all parties.

Social Communication Design. CLT basis: the social aspects of human communication; communicative competence covers the ability to use language in socially appropriate contexts. Tools/features design: virtual cafes, i.e., virtual spaces for socializing, controlled and moderated by students, including content creation tools, polling, publishing and chat facilities.

Multimedia Communication Design. CLT basis: human communication is conveyed via a variety of media; communication skills are not limited to oral proficiency. Tools/features design: communication tools in a CMS should include text, audio and visual capabilities while maintaining the organization and permanence normally available in text-only communication.

Table 1. Design considerations of communicative tools within a CMS based on CLT (Afendi & Mohamed Amin, 2005)
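A minimal sketch (all class and field names hypothetical) of the 'integrated communication design' row of Table 1: rather than confining discussion to a central forum, every content object carries its own thread.

```python
# Minimal sketch: communicative functions distributed into content objects,
# so online notes or a practice exercise each carry their own discussion.

from dataclasses import dataclass, field

@dataclass
class Post:
    author: str
    text: str

@dataclass
class ContentObject:
    title: str
    kind: str                       # e.g. "notes", "exercise"
    thread: list = field(default_factory=list)

    def discuss(self, author, text):
        self.thread.append(Post(author, text))

notes = ContentObject("Unit 3: Requests", "notes")
notes.discuss("student_a", "Is 'could you' always more polite than 'can you'?")
print(len(notes.thread), "post(s) attached to", notes.title)
```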
A course management system designed for language learning should not merely have communication technologies; it should facilitate communication through these technologies. Computers in language learning have traditionally been used in drill-and-practice activities, beginning with the heyday of the Audio Lingual Method (ALM) and behaviorism (Ahmad et al., 1985). At the time, personal computing technology was still in its infancy: computers and software were limited in their abilities, be it processing power, storage or communications. Much has changed since the early days of personal computing. According to Warschauer (2004), computer-assisted language learning is moving towards what he terms 'integrative CALL', whose defining features are multimedia technology and the Internet, used in combination. A course management system designed with such communication facilities, in the integrative manner suggested by this study, would be a solid platform for Warschauer's 'integrative CALL'.
3.2 Production Technologies
Production applications or technologies refer to technologies that allow users to utilize and practice the two productive language skills: speaking and writing. Speaking has been covered by the earlier discussion on communications; therefore, this section focuses on the other productive skill, writing. The data mention the use of blogs. Since they were first introduced in 1998, blogs have been gaining popularity on the web, and their usefulness in helping to develop writing skills has been noted by several researchers, such as Campbell (2003) and Richardson (2004). In general, the use of technology for L2 writing has been shown to be beneficial (Hertel, 2003). There are a few ways in which blog technology could be used within a course management system for language learning. The first is the most common way, where the blog functions as a publicly accessible personal journal. The second is a step up: integrating the blog technology into the CMS. This is not the same as just having the blog 'parked' within a CMS. For example, if an instructor wants his students to read an article, then discuss it and later write an entry in their blogs about their experiences, the process should be handled by the CMS seamlessly. A learner should be able to see the article, the discussion and his own blog in a seamless manner instead of having to navigate from one section to another to carry out the assigned tasks. More importantly, an instructor should be able to see the task as one integrated whole instead of separate pieces scattered throughout the system; this simplifies the management and evaluation of the assignment. Blogs could also be designed for collaborative writing that involves peer review and editing: learners could contribute revisions and comments, and help improve each other's writing in general. A blog system within a CMS should have more purpose than being a simple space for personal expression. As it is set within an academic setting, a mechanism should exist for teacher-student interaction within the blog itself, allowing the teacher to comment on the blog for corrective, evaluative or general feedback. Most blogs have a section for comments; however, this section is visible to all visitors. The academic interaction mechanism should be visible to the owner of the blog and the teacher only. There is currently no publicly available research or literature on something of this nature. Kruper (2003), for example, commented that the CMS vendor Blackboard does not seem to be interested in student publishing. The Edutools CMS comparison tool
does not even have blogs as one of the features to be compared (Edutools, 2008 - www.edutools.info). The recognition given by this study to production technologies as one of the key areas of emphasis in the application of technology to online language learning is important, because production technologies such as the authoring of web pages and blogs represent a new kind of literacy rather than merely an extension of current CMC tools (Warschauer, 2004). A thoughtful approach must be taken when integrating technologies that assist with productive language skills. Any technology common to today's Web, such as blogs, should be integrated into a CMS for language learning with a view to helping improve students' skills rather than just for the sake of having the latest 'in-thing'. Productive skills are an established and important half of language proficiency, and the focus should be on giving more opportunities for the development of these skills within a course management system designed for language instruction. A sketch of the private-feedback mechanism described above is given below.
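A minimal sketch, with hypothetical names, of such a mechanism: each blog comment carries a visibility flag, and the feedback channel is filtered so that only the blog owner and the teacher see private comments.

```python
# Minimal sketch (hypothetical design, not an existing CMS API): blog
# comments with a teacher-student feedback channel hidden from other visitors.

from dataclasses import dataclass

@dataclass
class Comment:
    author: str
    text: str
    private_feedback: bool = False   # True = visible to owner and teacher only

def visible_comments(comments, viewer, owner, teacher):
    return [c for c in comments
            if not c.private_feedback or viewer in (owner, teacher)]

comments = [
    Comment("peer1", "Nice entry!"),
    Comment("teacher", "Watch your past-tense forms here.", private_feedback=True),
]
# A peer sees only the public comment; the owner or teacher sees both:
print([c.text for c in visible_comments(comments, "peer2", "student_a", "teacher")])
```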
3.3 Web Resources
There are two factors that encourage the use of web resources in a CMS designed for language learning. First, the web offers a multitude of authentic language materials in various forms. Second, a CMS is a web-based system and should therefore be naturally capable of making use of web resources, either via embedding or linking. This section looks at the possible web resources to use and discusses some problems and limitations. There are two broad classes of web resources available for inclusion or adaptation by a CMS: 'open' and 'proprietary' web resources. Open web resources are freely available for use and, because of that, are quite an attractive option; however, their quality might not be up to expectations and the range of resources might not meet specific needs. Proprietary web resources, on the other hand, are available only for a fee. These may include online libraries and materials published by publishing houses. Their quality may be higher, or they may provide better materials for specific needs (for example, English lessons for business); however, as mentioned earlier, they are not free. Data 014 (Horst et al., 2005) mentions the use of user-contributed resources in the form of word banks and a collaboratively populated vocabulary database. This should be considered a subset of the open resources category, but its potential should not be overlooked, given the recent popularity and success of social networking and user-contributed sites on the Web such as Facebook and YouTube. Tim O'Reilly (2005) frames this development as 'Web 2.0', which could be described (among the numerous definitions available, see Hoegg et al., 2006; and Hinchcliffe, 2006) as empowering the users to contribute and publish. The methods for integrating these resources should also be given some consideration. While it is normal to use hyperlinks to link to other resources on the web, this might not be the best option and could be considered slightly outmoded for use within a course management system. Some resources offer APIs (application programming interfaces) that allow for seamless integration of the resources or services within the applications that invoke the APIs; one of the most well known examples is the Google Search API, which allows web searches to be carried out against Google's indexes from any website. The adaptation or integration of web resources for language learning would naturally focus on those that are important for language learning, such as online dictionaries,
concordances and practice materials. However, the question of whether or not to integrate with external, third-party resources involves factors such as administrative and financial policies, which are beyond the scope of this study. Figure 1 shows an overview of the design features for integrating web resources into a CMS designed for language learning; a sketch of the API-based approach follows the figure.
[Figure: web resources for language learning are divided into proprietary resources and open resources. User-contributed resources form a subset of the open resources and include word banks and glossaries, a links database, and shareable learning notes with full multimedia capabilities, i.e., text, audio, video, mind maps, etc.]
Fig. 1. Design features for web resources
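A minimal sketch of the API-style integration discussed above, wrapping a third-party dictionary service behind a uniform CMS interface; the endpoint URL and response format are placeholders, not a real service.

```python
# Minimal sketch: an adapter for an external dictionary API, so lookups can
# be embedded in reading activities rather than hyperlinked out of the CMS.
# The base URL below is a placeholder for whatever service is licensed.

import urllib.request, urllib.parse, json

class DictionaryResource:
    def __init__(self, base_url="https://api.example-dictionary.org/define"):
        self.base_url = base_url

    def define(self, word):
        url = self.base_url + "?" + urllib.parse.urlencode({"q": word})
        with urllib.request.urlopen(url) as resp:   # JSON response assumed
            return json.load(resp)

# Usage inside a reading activity (sketch):
# definitions = DictionaryResource().define("concordance")
```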
3.4 Web-based Testing
Testing is integral to language instruction and is used not only for evaluation but also for the practice of certain items, especially grammar. Most course management systems include facilities to conduct tests and quizzes in various formats, such as multiple choice, essays/short answers and fill-in-the-blanks. For example, a cursory comparison using Edutools of three popular CMSs (ANGEL 7.3, Blackboard and Desire2Learn) shows a reasonably well-developed set of assessment tools. The reason for the maturity of CMS design in this area is that testing and evaluation features are commonly in demand across disciplines. The practice aspects of online language testing, including diagnostic and self-assessment, should be given extra consideration as they are quite important for language learning (Fulcher, 2000). A test or quiz in any format should include the ability to provide sufficient
and helpful feedback if the instructor thinks it is needed. The facilities for adding quizzes and tests should also be integrated throughout the whole system; for example, a listening comprehension activity may require an audio file plus comprehension questions. This point needs to be made because most CMSs cater to testing only for evaluation and therefore isolate it into a distinct section specifically for designing and taking tests. Figure 2 illustrates the idea of integration between assessment tools and the activities within a CMS geared for language learning, in comparison to the 'traditional' CMS design:
[Figure: in the traditional CMS design, the assessment and testing tools feed a distinct assessment and testing section accessed by the learners. In a CMS designed for language learning, the assessment and testing tools serve both the assessment and testing section and the language activities (reading, writing, listening and speaking), with diagnostics and remedial exercises feeding back to the learners.]
Fig. 2. Comparison of testing/assessment designs for language-oriented CMS and traditional CMS
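A minimal sketch (hypothetical structures, not an existing CMS API) of the integrated design: a listening activity couples its audio file with its questions and per-item feedback, instead of isolating the quiz in a separate testing section.

```python
# Minimal sketch: assessment embedded inside a language activity, pairing an
# audio file with comprehension questions and per-item feedback.

from dataclasses import dataclass, field

@dataclass
class Question:
    prompt: str
    answer: str
    feedback: str                    # shown when the response is wrong

@dataclass
class ListeningActivity:
    audio_file: str
    questions: list = field(default_factory=list)

    def grade(self, responses):
        results = []
        for q, r in zip(self.questions, responses):
            ok = r.strip().lower() == q.answer.lower()
            results.append((ok, "" if ok else q.feedback))
        return results

task = ListeningActivity("unit1_dialogue.mp3",
                         [Question("Where does the speaker work?", "a library",
                                   "Listen again for the place name.")])
print(task.grade(["A LIBRARY"]))
```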
The applications of testing technology suggested here expand the traditional role of this technology within a course management system. This is in line with the argument made by Corda and Jager (2004) that CMSs offer more features for assessment than the tools needed for language practice, which would be more useful for language learning.
4. Conclusion

Thirty-two of the forty articles that form the data concern the use of communication technologies for the purpose of online language learning. Based on the data available, it is clear that a CMS designed for the purpose of language learning should also be focused on enabling and facilitating communication. Afendi and Mohamed Amin (2005) identify four design considerations: integrated communication design, conversational design, social communication design and multimedia communication design. The aim of these four design considerations is to enable and facilitate communication processes within a course management system. The next category of technology discussed is production technologies, which include blogs and web page publishing. They allow students to use their production skills, such as writing and speaking, within an online environment. Speaking is closely associated with communication technologies like chat and teleconferencing, although by definition it is a production skill. A CMS oriented for language learning should therefore integrate technologies that enable students to make use of their production skills. Technologies for web-based testing are also covered by the data; although only one article discusses online language testing, it gives a well-rounded discussion of the topic. A course management system cannot hope to integrate every piece of technology available; however, since testing is an integral part of language learning, it is a necessity within a CMS geared for language learning. The last category of applications, web resources, is not simple to integrate into the design of a CMS, as it involves external resources and different decision-making processes. A course management system is already embedded in the web, so the inclusion of databases or lists of hyperlinks to available resources should not be a problem. Integration with, and access to, specialized third-party resources, however, is a decision that requires input from policy makers because it involves financial and administrative decisions. Nevertheless, a CMS should be designed to allow easy access to available resources, especially those related to language learning such as online dictionaries.
5. References
Afendi, H. & Mohamed Amin, E. 2005. Design Considerations for CMC Tools within a Course Management System Based on Communicative Language Teaching. CALL-EJ Online 7(1), June 2005. (Online) http://www.tell.is.ritsumei.ac.jp/callejonline/journal/7-1/Hamat-Embi.html (20 September 2008)
Ahmad, K., Corbett, G., Rogers, M. & Sussex, R. 1985. Computers, language learning and language teaching. Cambridge: Cambridge University Press.
Arnold, N. & Ducate, L. 2006. Future Foreign Language Teachers' Social and Cognitive Collaboration in an Online Environment. Language Learning & Technology 10(1):42-66.
Basharina, O.K. 2007. An Activity Theory Perspective on Student-reported Contradictions in International Telecollaboration. Language Learning & Technology 11(2):82-103.
Brandl, K. 2005. Are You Ready to "Moodle"? Language Learning & Technology 9(2):16-23.
Campbell, A.P. 2003. Weblogs for Use with ESL Classes. The Internet TESL Journal IX(2). (Online) http://iteslj.org/Techniques/Campbell-Weblogs.html (15 December 2008)
Chen, C-F.E. 2006. The Development of E-Mail Literacy: From Writing to Peers to Writing to Authority Figures. Language Learning & Technology 10(2):35-55.
Collis, B. & De Boer, W. 2004. Teachers as learners: Embedded tools for implementing a CMS. Tech Trends 48(6):7-12.
Corda, A. & Jager, S. 2004. ELLIPS: Providing web based language learning for Higher Education in the Netherlands. ReCALL 16(1):225-236.
Da, J. 2003. The Use of Online Courseware in Foreign Language Instruction and Its Implication for Classroom Pedagogy. Mid-South Instructional Technology Conference: Teaching, Learning, & Technology: The Challenge Continues. (Online) http://www.mtsu.edu/~itconf/proceed03/146.html (30 January 2006)
Ellis, R. K. 2001. LCMS roundup. Learning Circuits. (Online) http://www.learningcircuits.org/2001/aug2001/ttools.html (20 September 2008)
Fulcher, G. 2000. Computers in Language Testing. In Brett, P. & Motteram, G. (Eds), A Special Interest in Computers, pp. 93-107. Manchester: IATEFL Publications.
Godwin-Jones, R. 2003. Emerging Technologies. Blogs and Wikis: Environments for Online Collaboration. Language Learning & Technology 7(2):12-16.
Heisler, J.M. & Crabill, S.L. 2006. Who are "stinkybug" and "Packerfan4"? Email Pseudonyms and Participants' Perceptions of Demography, Productivity, and Personality. Journal of Computer-Mediated Communication 12(1):114-135.
Hertel, J. 2003. Using an e-mail exchange to promote cultural learning. Foreign Language Annals 36(3):386-396.
Hinchcliffe, D. 2006. The State of Web 2.0. Web Services Journal. (Online) http://web2.socialcomputingmagazine.com/the_state_of_web_20.htm (17 August 2008)
Hoegg, R., Martignoni, R., Meckel, M. & Stanoevska-Slabeva, K. 2006. Overview of business models for Web 2.0 communities. In Dresden, S. (ed.) Proceedings of GeNeMe 2006, pp. 23-37.
Horst, M., Cobb, T. & Nicolae, I. 2005. Expanding Academic Vocabulary with an Interactive On-Line Database. Language Learning & Technology 9(2):90-110.
Kruper, J. 2003. Blogs as Course Management Systems: Is their biggest advantage also their Achilles' heel? (Online) http://homepage.mac.com/john_kruper/iblog/B905739295/C1776019690/E1401376105/ (15 August 2008)
Kuriloff, P.C. 2001. One Size Will Not Fit All. The Technology Source, July/August 2001. (Online) http://ts.mivu.org/default.asp?show=article&id=899 (13 March 2005)
Lally, L. 1995. Exploring the Adoption of Bulletin Board Services. The Information Society 11(2):145-155.
Lincoln, Y. S. & Guba, E. G. 1985. Naturalistic inquiry. Beverly Hills, CA: SAGE Publications.
Masuyama, K. & Shea, A. 2003. Successful Integration of Technology? A Case Study of a First Year Japanese Language Course. In P. Kommers & G. Richards (Eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2003, pp. 1336-1339. Chesapeake, VA: AACE.
Masuyama, K. 2005. Essence of Success in the Development and Implementation of Web-enhanced Foreign Language Courses. In P. Kommers & G. Richards (Eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2005, pp. 3252-3259. Chesapeake, VA: AACE.
Moffett, J. & Wagner, B.J. 1983. Student-Centered Language Arts and Reading, K-13. 3rd ed. Boston, MA: Houghton-Mifflin.
Nichani, M. 2001. LCMS = LMS + CMS (RLOs). Elearningpost. (Online) http://www.elearningpost.com/features/archives/001022.asp (15 March 2005)
O'Reilly, T. 2005. What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. (Online) http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html (17 August 2008)
Payne, J.S. & Ross, B.M. 2005. Synchronous CMC, Working Memory, and L2 Oral Proficiency Development. Language Learning & Technology 9(3):35-54.
Richardson, W. 2004. Blogging and RSS - The "What's it?" and "How to" of powerful new tools for web educators. Multimedia & Internet@Schools 11(1):10-13. (Online) http://www.infotoday.com/MMSchools/jan04/richardson.shtml (24 January 2006)
Robb, T. 2004. Moodle: A Virtual Learning Environment for the Rest of Us. TESL-EJ 8(2). (Online) http://writing.berkeley.edu/TESL-EJ/ej30/m2.html (19 January 2006)
Sanchez-Villalon, P.P. & Ortega, M. 2004. Writing on the Web: a Web Appliance in a Ubiquitous e-Learning Environment. In M. Singhal (Ed.), The Reading Matrix: Proceedings of the First International Online Conference on Second and Foreign Language Teaching and Research, September 25-26. The Reading Matrix Inc., USA.
Siekmann, S. 1998. To Integrate Your Language Web Tools - Call WebCT. Paper presented at Natural Language Processing and Industrial Application - Special Accent on Language Learning. Moncton, New Brunswick, Canada, August 18-21, 1998.
Strambi, A. & Bouvet, E. 2003. Flexibility and Interaction at a Distance: A Mixed-Mode Environment for Language Learning. Language Learning & Technology 7(3):81-102.
Warger, T. 2003. Calling all course management systems. University Business 6(7):64-65.
Warschauer, M. 2004. Technological Change and the Future of CALL. In Fotos, S. (Ed.), New Perspectives on CALL for Second Language Classrooms, pp. 15-27. Mahwah, NJ: Lawrence Erlbaum Associates.
Weasenforth, D., Biesenbach-Lucas, S. & Meloni, C. 2002. Realizing Constructivist Objectives Through Collaborative Technologies: Threaded Discussions. Language Learning & Technology 6(3):58-86.
Zhang, D. & Mu, A. 2003. Use of Online Chat in a WebCT-Enhanced Elementary Chinese Language Class. In G. Richards (Ed.), Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education 2003, pp. 1265-1271. Chesapeake, VA: AACE.
21
New Advances in Membrane Technology
Maryam Takht Ravanchi1 and Ali Kargari2
1National Petrochemical Company, Research and Technology Co.
2Department of Chemical Engineering, Amirkabir University of Technology
Iran
1. Introduction
Research into the use of membrane-based processes in industry is an active area of modern membrane science and technology. The main purpose of this chapter is to describe and discuss a few selected, well-recognized and promising applications of membranes in industry, with a focus on energy conversion, environmental protection and process intensification.
2. Membranes for Energy Conversion
The energy strategy is changing all over the world, for several reasons: fossil fuels will become scarce in less than 50 years; more than 64% of the current petroleum reserve is located in the Middle East, while less than 14% is available in Europe, the USA and the former USSR region together; and energy independence is a security issue. As a consequence, different low-emission renewable energy technologies are being implemented, favoring the use of bio-fuels and hydrogen to power our future. At the same time, the modernization of conventional power plants and refineries is being stimulated to reduce their emission of CO2 in a transition period when petroleum and coal are still the predominant fuel sources. In all these new technologies and transition steps, membranes have a huge opportunity to become a key player.

2.1 Fuel Cells
Fuel cells are the main zero-emission energy converters fed with hydrogen or renewable fuels like methanol and ethanol to power vehicles and portable devices or to supply electricity to buildings. Various proton-conducting polymer electrolyte materials have been investigated for high-temperature operation. Two categories of membranes can be proposed, depending on whether or not water is required for proton conduction. Polymer electrolytes involving water molecules in the proton mobility mechanism (e.g., perfluorosulfonic membranes) need humidification to maintain suitable conductivity characteristics. The amount of humidification may vary depending on the operating temperature and membrane properties; it influences the size and complexity of the device. Some other electrolytes do not necessarily involve water molecules in the mechanism of
proton conduction; these systems do not strictly need humidification. Yet, there are some drawbacks related to the short-term stability of such systems: phosphoric acid leakage from the membrane during operation, poor extension of the three-phase reaction zone inside the electrodes due to the absence of a proper ionomer, and reduced conductivity levels for inorganic proton conductors (Tchicaya-Bouckary et al., 2002). These problems have reduced the prospects of using water-free protonic electrolytes in low-temperature fuel cells.
Alternatively, composite perfluorosulfonic membranes containing different types of inorganic fillers, such as hygroscopic oxides, surface-modified oxides, zeolites and inorganic proton conductors, have shown increased conductivity with respect to the bare perfluorosulfonic membranes at high temperature, and fuel cell operation up to about 150 °C has been demonstrated. Such an effect is mainly due to the water retention capability of the filler (Kreuer, 2001). Fuel cell operation at elevated temperatures can limit the effects of electrode poisoning by adsorbed CO molecules, increase both methanol oxidation and oxygen reduction kinetics, and simplify water and thermal management. High-temperature operation can also reduce the complexity of the reforming reactor employed; the temperature range 130 to 150 °C is ideal for application of these systems in electric vehicles and for distributed power generation (Jung et al., 2003).
The presence of hygroscopic inorganic oxides inside the composite membrane, besides extending the operation of perfluorosulfonic membranes (e.g., Nafion) into the high-temperature range, reduces cross-over effects by increasing the "tortuosity factor" of the permeation path. Such effects are particularly serious at high temperature in fuel cell systems. Presently, these membranes appear to operate better at high pressure, since this allows one to maintain a suitable content of liquid water inside the assembly or to facilitate water condensation in the pores. In fuel cell devices, cathode operation at high pressure reduces system efficiency because of power consumption by the air compressor, whereas the power loss for the liquid pump at the anode is less remarkable. Although significant progress has been achieved in the last few years on the development of composite membrane-based systems, the high-pressure requirement is actually the main feature limiting large-scale application of such composite electrolytes at temperatures above 100 °C (Arico et al., 2003).
Two strategies have been pursued in the study of perfluorosulfonic composite membranes. A series of composite membranes based on recast Nafion ionomer containing different inorganic nanoparticle fillers (SiO2, phosphotungstic acid-impregnated SiO2, ZrO2, Al2O3), mainly varying in their acid-base characteristics, has been prepared and investigated in fuel cell devices. In this series of membranes water was used as the solvent. In another series of membranes, only one inorganic filler (TiO2) was selected; this was tailored in terms of morphology and surface chemistry and was used for the preparation of composite membranes cast from Nafion ionomer in the presence of dimethyl sulfoxide (DMSO).

2.2 Hydrogen Separation
Hydrogen is the lightest element in the periodic table and is primarily used as a chemical building block in a large number of chemical processes.
Currently, about 96% of hydrogen is produced from fossil fuels: close to 48% from natural gas (methane), about 30% from petroleum feedstock (oil) and about 18% from coal. Only about 4% of hydrogen is produced by electrolysis, although this is almost certain to increase in the future. The major use of
hydrogen, close to 50% of that produced, is in ammonia synthesis, followed by refinery use and methanol synthesis. Only a very small fraction is used as a fuel, although this will undoubtedly increase in the near future as we enter the era of the hydrogen economy (Gryaznov, 2000).
Membranes for hydrogen separation are available for different temperature ranges. Two classes of inorganic membranes for hydrogen separation are treated here: palladium membranes (300-450 °C) and mixed proton- and electron-conducting materials (above 600 °C). For temperatures up to 550 °C, molecular sieve membranes based on silica or zeolite are the state of the art (Duke et al., 2006). For temperatures higher than 250 °C polymer membranes cannot compete, but in the low-temperature range they have some advantages, being easy to produce and to manufacture in modules on a large scale. A potential application in this temperature range is the recovery of hydrogen from fuel gas and platform off-gas. Glassy polymers with high temperature stability, like some polyimides, are suitable membranes for preferential hydrogen transport (Shishatskiy, 2006).
Steam reforming of natural gas is by far the most common process used for the production of hydrogen. In steam reforming, methane is mixed with steam and the catalytic reaction is carried out at high pressure (e.g., 30-40 bar) and high temperature (700-900 °C). Because the reaction is controlled by thermodynamic equilibrium, shift reactors (high and low temperature) are used to increase the overall hydrogen yield, followed by a preferential oxidation reactor (PreOx) and a hydrogen separator. Unfortunately, the by-product of the reaction is the greenhouse gas CO2, which cannot and should not be exhausted to the atmosphere and needs to be treated. A membrane reactor can increase the overall conversion of a thermodynamic equilibrium-controlled reaction by continuously removing one or more of the reaction products during the reaction. It is therefore especially suited to the steam reforming reaction (Klette & Bredesen, 2005).
Metallic membranes, particularly Pd and Pd-alloy membranes supported on porous metal, are well suited for the steam-reforming application. Both the metallic membrane and the porous support are chemically, mechanically and thermally stable at high temperatures and pressures. Composite membranes supported on porous metal have the advantage over stand-alone thin films that a very thin membrane layer can be made, since the porous support provides the mechanical strength required for high-pressure applications. The persisting perception that palladium is too expensive to be economically feasible for large-scale applications is misleading. Since the thickness of a composite Pd or Pd-alloy membrane can be reduced to 5 μm or less, the quantity of palladium used is so small that its cost becomes an insignificant fraction of the total membrane assembly cost. However, the cost of the support may become significant and may play a considerably more important role in making composite Pd and Pd-alloy membranes economically viable for large-scale industrial applications. On the other hand, the cost of the support may fall well below the current price when large quantities are purchased, thereby making the process more competitive with conventional steam reforming processes.
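The equilibrium-shift argument for membrane reactors can be made concrete with a small calculation. The sketch below (Python) estimates the equilibrium CH4 conversion of steam reforming when a fraction of the product hydrogen is withdrawn through a membrane; the equilibrium constant, pressure and steam-to-carbon ratio are assumed illustrative values, not data from this chapter:

    # Minimal sketch: equilibrium conversion of steam reforming,
    # CH4 + H2O <-> CO + 3 H2, when a fraction f of the product H2 is
    # removed through a membrane. Kp = 40 bar^2 at 20 bar with a 3:1
    # steam-to-carbon feed are assumed illustrative values.

    def conversion(P_bar, Kp, f, steam_to_carbon=3.0):
        """Solve for the equilibrium CH4 conversion x by bisection."""
        def residual(x):
            h2 = 3.0 * x * (1.0 - f)              # H2 left in the reacting gas
            n = 1.0 + steam_to_carbon - x + h2     # total moles of gas
            q = (x * h2**3) / ((1.0 - x) * (steam_to_carbon - x))
            return q * (P_bar / n) ** 2 - Kp       # mole-fraction equilibrium
        lo, hi = 1e-9, 1.0 - 1e-9                  # residual < 0 at lo, > 0 at hi
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if residual(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    for f in (0.0, 0.5, 0.9):                      # fraction of H2 withdrawn
        x = conversion(P_bar=20.0, Kp=40.0, f=f)
        print(f"H2 removal {f:.0%}: CH4 conversion = {x:.2f}")

Under these assumptions the conversion climbs from roughly 0.6 with no hydrogen removal to nearly complete conversion when 90% of the hydrogen is withdrawn, which is the effect the membrane reactor exploits.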
In addition, to obtain the maximum economic and operational benefits, composite Pd and Pd-alloy membrane reactors should be considered.
The known hydrogen-permeable dense ceramic materials are oxides that are mixed proton-electron conductors. Proton transport at high temperatures is fast, but thermodynamics
speaks against a high concentration of protons in the materials at high temperatures. Combinations of both high protonic and electronic conductivity appear to be remarkably rare. The materials' chemical and thermal stability, mechanical strength and ceramic engineering are all important for making dense, thin membranes on porous supporting substrates - all needed to arrive at long-lived, high-performance membranes for hydrogen separation at high temperatures.

2.3 CO2 Capture and Power Generation
Approximately one third of the main anthropogenic sources of CO2 originate from power generation. The growing acceptance that this emission causes an increase in the global temperature, with enormous potential consequences, has led to efforts to develop technology for CO2 mitigation. This may be achieved by following several strategies simultaneously:
1. improving energy efficiency;
2. changing to low-carbon fuels, or to CO2-neutral or non-emitting power generation and chemical production routes;
3. developing CO2 capture and storage technology.
Membranes may play an important role in the different technologies for CO2 mitigation listed above (Powell & Qiao, 2006). Traditional CO2 separation is accomplished by physical and chemical absorption, adsorption, cryogenic distillation and membrane technologies. Approaches to material development for membranes with preferential CO2 transport include functionalized polymers and polymer composites containing polar ether oxygens (Lin & Freeman, 2005) and/or amine groups. One approach under investigation by different groups is the use of polymers with ethylene oxide segments; recently, a highly branched, cross-linked poly(ethylene oxide) was reported with particularly high selectivity (up to 30) for CO2/H2.
Commercially available membranes for CO2 separation are polymeric and are operated close to ambient temperature. Selectivity for CO2/CH4, CO2/N2 and CO2/H2 is typically well below 50 and permeance fairly low, which is a major drawback for this type of membrane. The development of hybrid membranes (polymers containing inorganic materials) may bring improvements. However, owing to their better thermal stability, and typically better selectivity and permeance, inorganic membranes will open new possibilities for integration in power cycles that could result in higher efficiency and a reduction of the total cost of CO2 capture. It has been shown that inorganic membranes can be integrated in different fossil fuel-based power cycles with CO2 capture. Studies comparing power cycle efficiencies show that a penalty of only 5-8% for CO2 capture is possible with membrane integration. Some state-of-the-art membranes already demonstrate sufficient flux and selectivity to give cost-effective CO2 capture solutions. The properties of current membranes appear, however, critically dependent on stringent control of all stages of advanced production processes. Consequently, fabrication is difficult, and significant work is necessary to realize economically viable large-scale production of membrane modules. Several companies are currently involved in such efforts and in validating performance under real operating conditions (Lin et al., 2006).
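How far a given selectivity can take a single membrane stage can be estimated from the standard perfect-mixing stage balance for a binary gas. The sketch below is a rough, hedged illustration under assumed values (a flue-gas-like CO2/N2 feed, selectivity 50 and a pressure ratio of 10); none of these numbers come from this chapter:

    # Minimal sketch: permeate purity of one membrane stage in the
    # perfect-mixing limit for a binary feed. Feed composition,
    # selectivity and pressure ratio are assumed illustrative values.

    def permeate_fraction(x, alpha, phi):
        """x: feed mole fraction of the fast gas; alpha: ideal selectivity;
        phi: feed-to-permeate pressure ratio. Returns permeate fraction y."""
        def F(y):
            # Flux ratio balance: y/(1-y) = alpha*(x - y/phi)/((1-x) - (1-y)/phi)
            return y * ((1 - x) - (1 - y) / phi) - alpha * (1 - y) * (x - y / phi)
        lo, hi = 0.0, min(1.0, phi * x) - 1e-12   # root is bracketed here
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if F(mid) < 0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    y = permeate_fraction(x=0.13, alpha=50.0, phi=10.0)
    print(f"permeate CO2 fraction: {y:.2f}")   # ~0.77 for these assumptions

A 13% CO2 feed is enriched to only about 77% CO2 in one pass under these assumptions, which illustrates why modest selectivities force multi-stage designs or integration into the power cycle itself.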
2.4 Power Generation by Pressure Retarded Osmosis
The shrinking reserves of fossil fuels and the increasing energy demand due to the development of third-world countries are only some of the reasons for the urgent need to search for alternative emission-free energy sources. Solar and wind power are already well established and part of our day-to-day life. The ocean as an energy source has not yet been developed to a commercial level, but it represents a renewable energy source with high potential. Ocean energy sources were assessed during the energy crises after 1973 with regard to their energy density and potential power; the sources considered were ocean waves, ocean currents, thermal gradients, tides and salinity gradients. Salinity gradient power systems include reversed electrodialysis and osmotic power, the latter using pressure retarded osmosis (PRO). A comparison of these two processes has been published recently (Post et al., 2007).
Osmotic power using pressure retarded osmosis is one of the most promising renewable ocean energy sources. It represents a huge potential and can make a significant contribution not only to satisfying the global energy demand but also to reducing the environmental impact of power production. It produces no CO2 or other emissions that may interfere with the global climate, and it is a predictable energy form compared to solar or wind power. Scientists have known of this energy source for more than 30 years but, due to the lack of effective membranes, the key part of an osmotic power plant, little effort had been made to establish this type of energy. Statkraft, the leading power utility and electricity provider in Norway, started its research on osmotic power in 1997 together with the Norwegian research institute SINTEF. In order to establish this form of green energy, the membrane, the heart of the process, needs further improvement. The break-even value for the membrane performance is 5 W/m², so there is still a need to improve the design of the membrane and the process of industrializing the technology (Jones & Rowley, 2003).
The performance of cellulose acetate (CA) membranes and thin film composite (TFC) membranes has been tested, reaching 3.7 W/m² for the best ones, against the target of 5 W/m² for commercialization. The achievements of the research to date show that the realization of PRO is getting closer. Although there is still need for further improvement of membrane performance, no obstacles have been identified that should prevent PRO from becoming a large-scale power technology within a few years. From the results shown earlier it can be concluded that asymmetric CA membranes have been developed close to the maximum performance of this polymer. The target of 5 W/m² can probably not be reached with CA membranes, but the transfer of this technology to hollow fibers may still bring interesting improvements. The TFC membranes made by interfacial polymerization have the potential to reach the designated performance of 5 W/m².
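The 5 W/m² target can be translated into a membrane requirement with the ideal PRO relation W = A·ΔP·(Δπ - ΔP), which is maximal at ΔP = Δπ/2, giving Wmax = A·Δπ²/4. A minimal sketch, assuming a seawater/fresh-water osmotic pressure difference of about 26 bar (the 5 W/m² target is from the text; the rest are assumptions):

    # Minimal sketch: water permeability needed to hit the PRO
    # break-even power density. dpi is an assumed illustrative value.

    dpi = 26e5                    # osmotic pressure difference, Pa (~26 bar, assumed)
    target = 5.0                  # break-even power density, W/m2 (from the text)

    A_required = 4.0 * target / dpi ** 2           # m3/(m2 s Pa), from Wmax = A*dpi^2/4
    A_lmh_bar = A_required * 1000 * 3600 * 1e5     # convert to L/(m2 h bar)
    print(f"required water permeability: {A_lmh_bar:.2f} L/(m2 h bar)")  # ~1.1

A permeability of roughly 1 L/(m² h bar) at full salinity and pressure is demanding but within reach of interfacially polymerized TFC membranes, which is consistent with the optimism expressed above.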
3. Membranes for Environmental Protection
Water supply and environmental protection are two important challenges of the current century. Owing to the growth of the world population and the limited sources of fresh water, more efficient processes for water treatment and reuse are needed. In addition, the daily discharge of large amounts of wastewater, domestic and industrial effluents, and other gaseous and liquid pollutants to the environment has made the earth less and less safe for living.
Expensive, energy- and time-consuming processes for water and wastewater conditioning, such as multistage flash evaporation and biological activated sludge processes, are now going out of date, and new, more effective processes such as reverse osmosis (RO), ultrafiltration (UF) and membrane bioreactors (MBR) are replacing them. In this section, the application of the new membrane processes to wastewater treatment and environmental protection is discussed (Judd & Jefferson, 2003).

3.1 Wastewater Treatment
Almost every manufacturing industry (automobiles, food, steel, textiles, animal handling and processing, etc.) and service establishment (hotels, transportation, etc.) generates large quantities of wastewater daily. Industry accounts for about a quarter of all water consumption, and there is hardly any industry that does not use large volumes of water. The need for stringent pollution control (and legislation) provides tremendous opportunities for membrane technology in all aspects of pollution control, from end-of-pipe treatment to prevention and reduction of wastes. There are two approaches to wastewater treatment, depending on whether (1) the permeate is to be reused, e.g., alkaline/acid cleaning baths, electrocoat paint, water, or (2) the permeate is to be disposed of and the objective is to reduce the volume of solids, e.g., machining operations, food wastes, metal plating. However, the physicochemical properties of wastes vary widely, even within the same industry and sometimes within the same plant at different times of the year. Wastewater treatment therefore requires more extensive testing than most industrial membrane applications, to account for possible feed-stream variations, pretreatment options, cleaning problems, and issues related to recycling or disposal of permeate and retentate (Cheryan, 1998).

3.1.1 Solid Membranes
Solid membranes play an important role in wastewater treatment. As the oldest kind of industrial membranes, they have been employed in many industrial applications.

3.1.1.1 Oily Wastewater
Oily wastes are generated in a wide variety of industries such as metalworking, vegetable and food processing, transportation, textiles, laundries and chemicals. They fall into three broad categories: free oil, unstable oil/water emulsions, and highly stable oil/water emulsions. Free oil can be readily removed by mechanical separation devices that use gravity as the driving force. Unstable oil/water emulsions can be mechanically or chemically broken and then gravitationally separated. However, stable emulsions, particularly water-soluble oily wastes, require more sophisticated treatment to meet today's effluent standards. Using UF to recover the oil component and allow safe discharge of the water makes good economic sense, and this application covers a wide volume range. In large, automated machining operations such as automobile plants, steel rolling mills and wire mills, a central ultrafiltration system may process up to 100,000 gal/day of waste emulsion. These are relatively sophisticated plants that operate continuously using several ultrafiltration feed-and-bleed stages in series. At the other end of the scale are very small systems dedicated to single machines, which process only a few gallons of emulsion per hour. The principal
economic driver for users of small systems is the avoided cost of waste hauling; for larger systems the value of the recovered oil and associated chemicals can be important. In both cases, tubular or capillary hollow fiber modules are generally used because of the high fouling potential and very variable composition of emulsified oils. A flow diagram of an ultrafiltration system used to treat large volumes of machine oil emulsion is shown in Figure 1. The dilute, used emulsion is filtered to remove metal cuttings and is then circulated through a feed-and-bleed ultrafiltration system, producing a concentrated emulsion for reuse and a dilute filtrate that can be discharged or reused (Baker, 2004).
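The economics of the feed-and-bleed arrangement follow directly from a steady-state mass balance around the stage. A minimal sketch, assuming complete oil rejection and illustrative oil concentrations (only the 100,000 gal/day feed rate comes from the text):

    # Minimal sketch: steady-state balance around a feed-and-bleed UF
    # stage with (assumed) complete oil rejection. The oil fractions
    # are assumptions for illustration.

    F  = 100_000      # feed emulsion, gal/day (from the text)
    cF = 0.02         # oil fraction in the used emulsion (assumed)
    cB = 0.40         # oil fraction in the recovered concentrate (assumed)

    B = F * cF / cB   # concentrate (bleed) flow, from the oil balance F*cF = B*cB
    P = F - B         # permeate flow, from the overall balance
    print(f"concentrate: {B:,.0f} gal/day, permeate: {P:,.0f} gal/day")
    # -> 95% of the stream leaves as dischargeable water; the oil is 20x concentrated

Under these assumptions, 95% of the waste volume leaves as water-quality permeate, which is why avoided hauling cost alone can justify even small dedicated units.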
Fig. 1. Flow diagram of a UF unit used to concentrate a dilute oil emulsion (Baker, 2004).

3.1.1.2 BOD and COD Reduction
The presence of organic materials in wastewaters can cause many problems. The usual indices for the amount of organics in water are the chemical oxygen demand (COD) and the biological oxygen demand (BOD). Membrane processes have received considerable attention for the separation and concentration of inorganics and organics from various wastewaters. Membrane processes have been combined with, or substituted for, traditional advanced treatment technologies such as biological treatment, adsorption, stripping, oxidation and incineration. Membrane processes can simultaneously concentrate and purify wastewater containing both inorganics and organics, and produce a 20- to 50-fold decrease in the waste volume that must be treated by other processes. The development of low-pressure processes has made RO an attractive alternative for the treatment of aqueous wastes, since these offer high fluxes and solute separations and can operate over wide temperature and pH ranges. RO processes for wastewater treatment have been applied in the chemical, textile, petrochemical, electrochemical, pulp and paper, and
food industries, as well as for the treatment of municipal wastewater (Madaeni & Mansourpanah, 2003). The choice between UF and RO membranes for COD and BOD reduction depends on the size of the organic molecules. When both high and low molecular weight organics are present in the wastewater, it is customary to use UF as a pretreatment for the RO process. This combination removes more than 90%, and in some cases more than 99%, of the TOC from the wastewater. The shortcoming of this combination is the need for high-pressure operation (20-80 bar) of the RO system, but it has the benefit of high rejections even for concentrated wastes. Recently, nanofiltration (NF) has been applied instead of the above combination. This system does not require the high-pressure operation of RO systems, usually working at pressures below 20 bar, but its rejection is nearly half that of an RO system, so it is suitable for lightly contaminated wastewaters (COD < 400 ppm) (Viero et al., 2002).

3.1.1.3 Heavy Metal Ion Removal
Heavy metals constitute a major portion of the contaminants in chemical effluents that cannot be degraded or destroyed. Heavy metals are dangerous because of bioaccumulation: in water bodies they accumulate in sediment and organisms, from where they may transfer into the food chain. Due to their hazard to human health, lead, arsenic, cadmium and mercury are targeted by international environmental legislation (Landaburu-Aguirre et al., 2006). Although solvent extraction is frequently employed for the removal of selected species from aqueous solutions, this technology requires the maintenance of a large inventory of an organic solvent that is often flammable, toxic or otherwise hazardous. Other traditional methods for the elimination, concentration and/or recovery of heavy metals are precipitation, ion exchange (IX), electrodeposition, crystallization and evaporation. In the majority of cases, the ultimate objective of the process is not the recovery of the metal but rather its elimination. However, recovery of heavy metals allows their later reuse and thus provides further economic and environmental benefits by reducing disposal costs and raw material requirements. Membrane processes provide a viable alternative for heavy metal recovery, as they can achieve high permeate fluxes and high rejection coefficients with low energy costs and under mild conditions. In addition, with membrane technology separation can be carried out continuously, and membrane processes can easily be combined with other separation processes (hybrid processing). Nearly all kinds of membranes (UF, NF, RO and IX) have been employed for heavy metal ion removal from wastewaters, but RO and NF are more common than the others (Abu Qdaisa & Moussab, 2004).

3.1.2 Liquid Membranes
For more than 30 years, liquid membranes have been a focus of research. Since diffusivities in liquids are higher than in solids by several orders of magnitude, enhanced permeabilities can be expected for liquid membranes in comparison with solid ones (Krull et al., 2008). Recently, liquid pertraction, or liquid membranes, has emerged as a new and promising separation method. Due to its advantages over solid membranes and liquid-liquid extraction, liquid pertraction has attracted the attention of many scientists and engineers.
Liquid pertraction exploits a very simple idea: two homogeneous, completely miscible liquids, which may be referred to as the donor solution and the acceptor solution, are spatially separated by a third liquid, immiscible and practically insoluble in the former two liquids - the membrane phase. Due to the favorable thermodynamic conditions created at the interface between the donor solution and the organic membrane, some components are extracted from the donor solution and transported into the membrane liquid. Simultaneously, at the second interface, conditions are created which favor the reverse transfer, i.e., the extraction of the above-mentioned components from the membrane liquid and their accumulation in the acceptor solution.
Liquid membranes (LMs) have two considerable advantages over solid membranes. As is known, molecular diffusion in liquids (except super-viscous ones) is faster than in solids by several orders of magnitude. Furthermore, in some pertraction methods the molecular diffusion in the liquid membrane is replaced by eddy diffusion, which intensifies the transfer process. Hence, it may be stated that solid membranes, even those of submicron thickness, cannot compete with liquid membranes with respect to transfer intensity. As a general rule, polymer membranes are also less selective than liquid ones.
Wastewater treatment using LMs is a new direction in membrane technology. All three kinds of liquid membranes (emulsion (ELM), bulk (BLM) and supported (SLM) liquid membranes) have their specific applications in wastewater treatment. Among the various activities in this field, the most important ones are:
- removal of phenol from concentrated phenolic wastewater (up to 50,000 ppm) by ELM (Kargari et al., 2002; Kargari et al., 2003d; Kargari et al., 2005b);
- removal of iodine from highly concentrated aqueous media (up to 2,000 ppm) by BLM (Nabieyan et al., 2007; Kaghazchi et al., 2009);
- removal of cadmium ion from aqueous media (up to 100 ppm) by SLM (Nabavinia et al., 2009);
- removal of chromium ion from aqueous solution by BLM (Rezaei et al., 2004);
- selective removal of gold ion from aqueous media (up to 110 ppm) by ELM (Kargari et al., 2003a-c; Kargari et al., 2004a-e; Kargari et al., 2005a; Kargari et al., 2006a-c; Mohammadi et al., 2008);
- removal of bio-organic materials from aqueous media by ELM (Kaghazchi et al., 2006).

3.2 Nuclear Waste Treatment
The nuclear industry generates a broad spectrum of low and intermediate level liquid radioactive wastes (LRWs). These liquid wastes may be produced continuously or in batches and may vary considerably in volume, radioactivity and chemical composition. A wide range of treatment methods have been used throughout the industry to treat these wastes, and they have tended to be the same conventional processes found in industrial and municipal water treatment: chemical treatment, adsorption, filtration, ion exchange and evaporation. These are limited either by their inability to remove all contaminants or, in the case of evaporation, by the high operating costs involved and the large quantities of secondary solid waste produced, which means that satisfactory processing of LRWs is difficult to achieve. Furthermore, the treated liquid effluent is not pure enough for environmental discharge or recycling. During the past 5-10 years, membrane technology has therefore been gradually introduced into nuclear power plants for the treatment of low-level radioactive waste.
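Two figures of merit recur in this field and are quantified in the next paragraph: the volume reduction factor (VRF) and the decontamination factor (DF). A minimal sketch of the definitions, with assumed feed values for illustration:

    # Minimal sketch of the two figures of merit for liquid radioactive
    # waste treatment (typical values are quoted in the next paragraph).
    # The feed volume and activities are assumptions for illustration.

    V_feed, A_feed = 100.0, 3.0e5    # m3 of waste and Bq/L activity (assumed)
    V_conc         = 10.0            # m3 of concentrate (VRF = 10, as in the text)
    A_effluent     = 3.3e4           # Bq/L in the treated effluent (assumed)

    VRF = V_feed / V_conc            # factor by which the waste volume shrinks
    DF  = A_feed / A_effluent        # feed activity / effluent activity
    print(f"VRF = {VRF:.0f}, DF = {DF:.1f}")   # -> VRF = 10, DF = 9.1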
The application of membrane methods to liquid radioactive waste treatment requires solving many problems connected with the proper selection of the membranes, membrane modules and other equipment according to local conditions: the chemical and radiochemical composition of the effluents treated, their activity and their total salinity. Membrane processes enable radioactive impurities to be separated from a waste stream by the selective passage of certain components of the stream through a membrane. These processes include reverse osmosis (RO), ultrafiltration (UF) and microfiltration (MF), depending on the pore size of the membrane. Membrane processes have already been applied to radioactive laundry wastes in nuclear power plants, to mixed laboratory wastes, and to the clean-up of boric acid solutions for recycling. There are many installations based on membrane technology working successfully in the nuclear industry. The application of membrane processes is limited to low and probably medium level liquid wastes, where the radioactivity concentration is limited to 37-3.7×10⁶ Bq/L. The volume of waste is normally reduced by a factor of 10, and a decontamination factor of 8-10 is achieved in this process (Pabby et al., 2009).
Various membrane-based methods have been developed for this purpose. The most important ones are reverse osmosis, nanofiltration, ultrafiltration, precipitation ultrafiltration, complexation ultrafiltration, microfiltration, osmotic concentrators, electrodialysis, electrodeionization, diffusion dialysis and Donnan dialysis, and liquid membranes. Only a few of these methods have been commercialized until now. Table 1 shows some industrial uses of membrane technology for nuclear liquid waste treatment.
Recently, a new process named membrane distillation (MD) has been introduced for this purpose. Membrane distillation is a separation method that employs a porous lyophobic membrane, non-wettable by the liquid. Because of the lyophobicity of the polymer, only vapor is transported through the membrane pores. Condensation takes place on the other side of the membrane in an air gap, a cooling liquid or an inert carrier gas. Usually MD is employed to treat aqueous solutions, so hydrophobic membranes manufactured from polymers such as polypropylene (PP), polytetrafluoroethylene (PTFE) or poly(vinylidene fluoride) (PVDF) are used in the process. The driving force in the MD process is the gradient of the partial pressures of the components of the solution in the gaseous phase.
Despite some technical and process limitations, membrane techniques are very useful methods for the treatment of different types of effluents. Removal of tritium from nuclear waste (liquid and gaseous effluents), isotope separation, and the separation of gaseous radioactive wastes and noble gases are the latest applications of membrane technology in the radioactive materials processing industries (Zakrzewska-Trznadel et al., 2001).

3.3 Air Pollution
Air pollution is most often caused by emissions from industry, power plants, car transport, and agricultural and municipal waste. Pollution is exceptionally hazardous when it involves the emission of so-called acid gases (SO2, NOx) and volatile organic compounds, mainly halogen-derived hydrocarbons and aromatic compounds, which destroy the ozone layer and contribute to the greenhouse effect. Different methods are used to eliminate these substances, and the particular techniques have been classified according to the optimum concentration range in which they work.
Appropriate combinations of these processes (hybrid processes) can be advantageous from the economic and technical viewpoints. The removal of volatile organic compounds can be carried out
with recovery of the solvent or without it, although from the environmental and economic viewpoints the second solution is favored (Bodzek, 2000).

Membrane process | Facility | Wastes processed
Reverse osmosis | AECL Chalk River (Canada) | Reactor coolant clean-up with boric acid recovery
RO with conventional pretreatment | Nine Mile Point NPP (USA); Pilgrim NPP (USA); Wolf Creek NPP (USA) | BWR floor drains and various other wastes; PWR floor drains, reactor outage waste, spent resin sluice water, and other waste; floor drains, resin sluice water, boron recycle water
RO with ultrafiltration pretreatment | Comanche Peak NPP (USA); Dresden NPP (USA); Bruce NPP (Canada) | Inventory of TRU (trans-uranium) contaminated batch of liquid waste; aqueous wastes from steam generator chemical cleaning
RO with microfiltration pretreatment | AECL Chalk River (Canada) | Nuclear research wastes
Microfiltration | Diablo Canyon NPP (USA); River Bend NPP (USA); Salem NPP (USA) | Spent media transfer liquid; BWR floor drains; PWR floor drains, equipment drains, and other various sources
Ultrafiltration | Seabrook NPP (USA); Callaway NPP (USA); Mound Laboratory (USA); Sellafield Nuclear Center (UK); projected facility for treatment of laundry (detergent) wastes, AECL Chalk River (Canada); Rocky Flats (USA) | PWR floor drains and spent resin tank drain-down; floor drains, equipment drains, reactor coolant; wastes from fuel reprocessing activities; alpha-containing tail wastes; laundry (detergent) wastes; contaminated ground water
Table 1. Examples of industrial use of membrane technology for nuclear waste processing (Pabby, 2008).

Another problem connected with pollution of the atmosphere is the generation of vast volumes of gases which contribute to the greenhouse effect - carbon dioxide from the burning of carbon-based fuels, and the simultaneous emission of methane and carbon dioxide from solid waste dumps. In the latter case it is beneficial to recover the methane, since it is a valuable source of energy and has a higher global greenhouse factor than carbon dioxide (Table 2).
Compound | Comparative index inducing greenhouse effect | Concentration*
CO2 | 1 | 56
CH4 | 32 | 25
Halogen-derived hydrocarbons | ca. 15,000 | 8
O3 | 2,000 | 9
NO | 150 | 2
* Relative concentration, without taking nitrogen and oxygen into consideration.
Table 2. Influence of particular gases on the greenhouse effect (Bodzek, 2000).

The concept of gas separation through membranes is based on the mechanism of dissolution and diffusion. Compared to liquids, gases have a low affinity for polymers, and therefore their solubility in such materials is also low (usually <0.2%). The solubility of a given gas in a polymer increases with its affinity for the polymer; for example, the solubility of carbon dioxide is higher in hydrophilic polymers than in hydrophobic ones. The separation of gases and vapors is applied practically in industry for splitting the following systems (Li et al., 2008):
- CO2/CH4 - biogas, natural gas;
- H2 or He from other gases;
- H2S/CH4 - natural gas;
- O2/N2 - oxygen enrichment of air and vice versa;
- H2O - drying of gases;
- SO2 - desulphurization of gases;
- vapors of organic compounds - removal from air and from industrial waste streams.

3.3.1 Removal of Volatile Organic Compounds from Air
Industrial processes in which volatile organic solvents are used generate waste gas streams polluted by the vapors of these compounds. These vapors are not only hazardous for the environment but also have economic value connected with the recovery of chemical substances and energy. Selective membrane absorption used for the removal of volatile organic compounds combines the advantages of absorption and membrane gas separation. The properties of such systems are as follows:
- compact capillary membranes are characterized by a short diffusion path;
- the surface of phase separation is fixed by the membrane;
- the recovery of volatile organic compounds takes place even at low concentrations;
- energy consumption is low;
- the technique is not destructive;
- the process system is flexible.
A spin-off of the gasoline vapor recovery activities at gasoline tank farms is the development of a system to reduce the emissions generated by the operation of petrol stations. During car refueling, the connection between the dispenser nozzle and the petroleum tank filler pipe is the only area open to the atmosphere. To reduce emissions during refueling, vacuum-assisted vapor return systems have been introduced in many countries. An investigation by TÜV Rheinland has shown that the efficiency of catching emissions by means of a 1:1 vapor return ratio is limited to an average of approximately 75%. The difference between a minimum value of 50% and a maximum value of 90% in vapor return
is caused by differences in the construction of car filling pipes. In order to enhance the vapor return rate, a surplus of air/vapor volume has to be returned. Tests have shown that increasing the air-to-liquid ratio to 1.5:1 improves the efficiency to between 95 and 99%, depending on the type of car. The enhancement of the vapor return rate is only acceptable if no additional emissions are generated. A membrane-based vapor separation system treating the breather pipe vent gases of the storage tanks enables emission reduction during car refueling without creating any additional emissions. The essential requirements are a leakage-proof installation of tanks, pipes and dispensers, together with the installation of over/under-pressure safety valves at the breather pipes and check valves at the filling and vapor-balancing couplings of the storage tanks.
Because of the surplus of returned vapor volume, a pressure build-up occurs in the storage tanks. At a given set point of a pressure gauge, which measures the differential pressure between the tank pressure and atmospheric pressure, the vacuum pump of a membrane separation system is activated. This system is installed parallel to the vent stack of the storage tanks. A pneumatic valve in the retentate line of the membrane module is opened by the applied vacuum. The overpressure of the storage tanks causes a volume flow, which is released by passing through the membrane stack. The gasoline vapors are separated from the off-gas, and clean air is released to the atmosphere. After the lower set point of the pressure gauge is reached, the system is deactivated. Besides the advantage of emission reduction, the wet stock losses of gasoline storage can be reduced, because diffusive emissions are avoided and most of the generated gasoline vapor is returned to the storage tank. Because of its simplicity and nearly maintenance-free operation, the system is particularly suitable for petrol station applications and for product delivering-receiving stations at petrochemical plants (Takht Ravanchi et al., 2009a).

3.3.2 Removal of SO2 from Exhaust Gases
Various absorption processes are used for the removal of SO2 from exhaust gases. One of the most commonly used is the so-called "double alkaline process", which can also be applied to membrane absorbers. Pilot tests of an installation with an output of 100 m³/h were carried out by TNO (Holland) using gas from a biogas combustion installation, which contained SO2. The sulfur dioxide recovered by membrane absorption in the form of sodium sulfite may be reused in the production process. Over 6 months of testing, more than 95% of the SO2 was recovered at an output of 120 m³/h. No disturbances of operation caused by fluctuations in the gas flow or the SO2 content were observed. Fouling of the membrane was also not observed, since in this gas separation process only diffusive transport of the substances takes place, and the convective transport which is principally responsible for fouling does not occur (Bodzek, 2000).

3.3.3 Removal of CO2 from Exhaust Gases
Conventional techniques for the removal of carbon dioxide from gas streams are based principally on chemical and/or physical absorption: the CO2 is absorbed by solvents at low temperature and/or under elevated pressure. The following solvents are the most typical: monoethanolamine (MEA), diethanolamine (DEA), triethanolamine (TEA), potassium carbonate (Benfield), methyldiethanolamine (MDEA), methanol (Rectisol), N-methyl-2-
pyrrolidone (Purisol), polyethylene glycol (Selexol), propylene carbonate (Fluor solvent) and the sulfolane/diisopropanolamine/water system (Sulfinol). Chemical solvents are used when the concentration of carbon dioxide in the gas is low and the carbon dioxide product should have high purity. With a high concentration of carbon dioxide in the inlet gas and smaller requirements on its purity in the product, physical solvents are favored. Due to the low partial pressure of carbon dioxide in exhaust gases, in practice an absorption/desorption process with monoethanolamine is applied: absorption is carried out at a temperature only slightly above ambient, and desorption at a temperature of about 110 °C.
The required membrane surface in the absorption stage can be calculated from the respective mass-transfer equations. Assuming that the mass transport is limited solely by diffusion in the gas filling the membrane pores, the mass transfer coefficient can be taken as 0.02 m/s for typical capillary membranes (1 mm external diameter). For an exhaust gas flow of 600 m³/s and carbon dioxide recovery at the level of 70%, the required membrane surface is 35,500 m². Preliminary economic analyses indicate that the cost of the absorber alone could be reduced by 30% and the total cost of the absorption/desorption installation by 10%.
Recently, supported liquid membranes have been considered an effective means for the removal of acidic components such as CO2 and H2S from natural gas and other process gases. In this process a thin layer of the absorption solvent is held in the pores of the membrane by capillary force. This very thin layer (less than approximately 150 μm) has negligible mass transfer resistance, and with the high contact surface area between the phases provided by the membrane, mass transfer is very fast. Heydari Gorji et al. used a supported liquid membrane containing an amine solution as the carrier for gas sweetening. The experimental results for CO2/H2 separation showed that a separation factor of 1350 and a permeability of 95×10⁻¹¹ mol·cm/(cm²·s·kPa) are attainable at ambient temperature and a trans-membrane pressure of less than 2 barg (Heydari Gorji et al., 2009a,b).

3.3.4 Purification of Air from Cigarette Smoke in Closed Rooms
Cigarette smoke contains a few hundred chemical compounds, in both the solid and gas phases. The existing systems of mechanical ventilation in houses and offices do not guarantee good air quality with many heavy smokers around. The quality of indoor air could be improved by a portable air purification facility which forces air circulation through a system of filters. Classical air purification facilities make use of electrostatic or cloth filters and sometimes of charcoal filters. However, most filters are not effective against most of the gaseous substances generated during cigarette smoking. The problem can be solved by membrane filters. Experimental results showed that even when tap water, which has a very low capacity for absorbing hydrocarbons, was used as the absorbent, the efficiency was very good; compounds soluble in water were removed with high effectiveness. The tests have proven that the tested membrane air purification facilities are useful filtration systems, characterized by highly effective removal of water-soluble compounds and small pressure drops.
The facilities are compact thanks to the use of capillary modules. Table 3 shows the effectiveness of the membrane facility for the purification of air containing cigarette smoke (Bodzek, 2000).
Component | Removal effectiveness [%]
Acetone | 96.6
Styrene | 15
Formaldehyde | 98.4
Nicotine | 99.1
Ammonia | 95
Aroma | 49-54
Table 3. Effectiveness of a membrane facility for purification of air containing cigarette smoke (Bodzek, 2000).
4. Membranes for Process Intensification
4.1 Nano-composite Gas Separation Membranes
Membrane gas separations are attractive because of their simplicity and low energy costs, but they are often limited by insufficient gas flux. This problem is especially challenging because the permeability of a material is frequently inversely related to its selectivity. Recently, polymer-inorganic nanocomposite materials have been developed to improve the physical properties of polymer membranes. A polymer-inorganic nanocomposite membrane consists of two matrices, i.e., the polymer and the inorganic material, with the inorganic phase dispersed at the nanoscale in the polymer phase. Due to the special structural characteristics of polymer-inorganic nanocomposites, the gas separation properties of the pure polymers are improved.
Kong et al. used polyimide (PI)/TiO2 nanocomposite membranes for gas separation. The permeation properties of these membranes are given in Table 4. As can be seen from these results, low TiO2 contents did not greatly enhance the permeation properties of the composite membranes. When the TiO2 content was above 20 wt%, the permeability of the composite membranes was remarkably enhanced, while the selectivity was still kept at a high level. This might be caused by a specific interaction between the gases and the TiO2 component in the PI/TiO2 composite membranes. At a TiO2 content of 25 wt% the results were particularly interesting, because both the permeability and the selectivity of the PI membrane were enhanced at the same time (Kong et al., 2002).
Zhang et al. used nano-sized nickel-filled carbon membranes to examine gas separation properties. Nickel, a very commonly used hydrogenation catalyst, was chosen because it can selectively chemisorb hydrogen, which would change the hydrogen permeation properties of the resulting nickel-filled carbon membranes. The permeation properties of single gases through the Ni-filled carbon membranes are shown in Table 5. It can be seen that the nickel content had a strong influence on the gas permeation properties of the corresponding membranes (Zhang et al., 2006).

TiO2 content (wt%) | PH2 | PO2 | PN2 | PCH4 | αH2/N2 | αH2/CH4 | αO2/N2
0 | 3.809 | 0.166 | 0.023 | 0.018 | 166.9 | 214.0 | 9.3
5 | 3.773 | 0.155 | 0.033 | 0.018 | 115.0 | 222.0 | 4.7
15 | 5.523 | 0.273 | 0.053 | 0.039 | 104.6 | 142.2 | 5.2
20 | 6.686 | 0.290 | 0.037 | 0.041 | 180.7 | 163.4 | 7.8
25 | 14.143 | 0.718 | 0.075 | 0.099 | 187.5 | 143.2 | 9.5
Table 4. Permeability (barrer) and selectivity of PI/TiO2 composite membranes with different TiO2 contents (Kong et al., 2002).
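The ideal selectivities in Tables 4 and 5 are simply ratios of the single-gas permeabilities (or permeances), α(i/j) = Pi/Pj. As a quick check, the sketch below recomputes three of them from the 25 wt% TiO2 row of Table 4; the small differences against the tabulated values are rounding:

    # Minimal sketch: ideal selectivity as a permeability ratio,
    # alpha(i/j) = P_i / P_j, using the 25 wt% TiO2 permeabilities
    # (barrer) from Table 4.

    P = {"H2": 14.143, "O2": 0.718, "N2": 0.075, "CH4": 0.099}

    for i, j in (("H2", "N2"), ("H2", "CH4"), ("O2", "N2")):
        print(f"alpha({i}/{j}) = {P[i] / P[j]:.1f}")
    # -> 188.6, 142.9, 9.6, close to the tabulated 187.5, 143.2, 9.5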
Ni content (wt%) | PH2 | PCO2 | PO2 | PN2 | αH2/N2 | αCO2/N2 | αO2/N2 | αCO2/H2
1 | 5.6 | 21 | 3.6 | 0.3 | 19 | 70 | 12 | 3.7
3 | 1.8 | 31 | 6.1 | 0.6 | 3.0 | 52 | 10 | 17
5 | 0.8 | 30 | 8.5 | 6.8 | 0.1 | 4.4 | 1.3 | 38
7.5 | 3.0 | 29 | 10.2 | 7.2 | 0.4 | 4.0 | 1.4 | 9.7
10 | 10 | 22 | 4.5 | 4.3 | 2.3 | 5.1 | 1.0 | 2.2
Table 5. Single gas permeance (10⁻¹⁰ mol m⁻² s⁻¹ Pa⁻¹) and ideal selectivity of Ni-filled carbon membranes with different amounts of nano-sized nickel (Zhang et al., 2006).

4.2 Olefin-Paraffin Separation via Facilitated Transport Membranes
Olefins such as ethylene and propylene are important feedstocks in the petrochemical industry. They are used for the production of polypropylene, acrylonitrile, propylene oxide, oxo alcohols, cumene, acrylic acid and isopropyl alcohol. As an olefin and its corresponding paraffin are produced simultaneously in petrochemical complexes, an important step is their separation (Takht Ravanchi et al., 2008a-d). The traditional and most conventional separation method is cryogenic distillation. Because an olefin and its corresponding paraffin have close physical properties and the relative volatility of the separation is near unity (α ≈ 1.2), distillation is difficult: a very tall column containing around 200 trays, operated at high pressure (20 bar) and low temperature (-30 °C), is required. Because of the high fixed and operating costs of such a column, proposing an alternative process that can operate at moderate conditions has been the aim of researchers in recent years.
Membrane processes have many advantages, such as simplicity, energy saving, easy operation, environmental friendliness, low maintenance cost and modular configuration, so in recent years researchers have investigated their application to olefin-paraffin separation. At first, polymeric membranes were used. Burns & Koros (2003) reported the separation factors obtained for the separation of a 50:50 (vol.%) propylene-propane mixture using polymeric membranes; the highest separation factor obtained was 21 (with Pyralin 2566 as the membrane (Krol et al., 2001)), which is too low for industrial applications.
Facilitated transport (FT) has helped impressively in the use of membranes for olefin-paraffin separation. Park et al. (2001) and Kim et al. (2003) reported the application of solid polymer electrolyte (SPE) membranes, one type of FT membrane, to propylene-propane separation. In comparison with polymeric membranes, higher separation factors were obtained (170 in a PAAm/AgBF4 membrane containing 67 mol% Ag+ (Park et al., 2001)). In SPE membranes, the monomer and the carrier ion (Ag+) are incorporated in the membrane during membrane preparation. In immobilized liquid membranes (ILMs), another type of FT membrane, the porous support is immersed in the carrier solution and the carrier ions are held inside the pores by capillary forces. Immobilized liquid membranes had not been studied for propylene-propane separation before. As their preparation is somewhat easier than that of SPE membranes, their application to propylene-propane separation seems more feasible.
Propylene-propane separation was studied in a membrane separation setup containing a flat sheet membrane module; a detailed description of the setup can be found in Takht Ravanchi et al., 2009b. Industrial grade propylene (99.74 mol%) and industrial grade propane (99.79 mol%) were used as feed gases. A PVDF flat sheet membrane was used as the
support of the ILM and AgNO3 solution was used as the carrier. Separation experiments were conducted at four trans-membrane pressures, four carrier concentrations and with three feed mixtures (Figure 2). As can be seen, increasing the trans-membrane pressure and the carrier concentration both favor propylene separation. The results confirm that the present ILM system performs better than the SPE membranes used for propylene-propane separation. For industrial application of this system, pilot experiments must be conducted to determine the optimum operating conditions (Takht Ravanchi et al., 2009c-f).

[Figure 2, panels (a)-(c): separation factor versus trans-membrane pressure (40-120 kPa) at carrier concentrations of 0, 5, 10 and 20 wt.%]
Fig. 2. Separation performance of the PVDF-AgNO3 ILM system for propylene-propane separation; (a) 30:70 (vol.%) propylene-propane mixture, (b) 50:50 (vol.%) propylene-propane mixture, (c) 70:30 (vol.%) propylene-propane mixture (Takht Ravanchi et al., 2008b)
4.3 Separation of Light Hydrocarbons

Pervaporation offers the possibility of separating solutions, mixtures of components with close boiling points, or azeotropes that are difficult to separate by distillation or other means. The first systematic work on pervaporation was done at American Oil in the 1950s. The process was not commercialized at that time and remained a mild academic curiosity until 1982, when GFT (Gesellschaft für Trenntechnik GmbH, Germany) installed the first commercial pervaporation plant. That plant separated water from concentrated alcohol solutions; GFT has since installed more than 50 such plants. In these plants, composite membranes based on polyvinyl alcohol are used; they are far more permeable to water than to alcohol. A flow scheme of a GFT plant combining distillation and pervaporation to produce dry alcohol is shown in Figure 3. The ethanol feed to the membrane generally contains ~10% water. The pervaporation process removes the water as the permeate, producing pure ethanol with less than 1% water and avoiding all the problems of azeotropic distillation.
Fig. 3. Flow scheme of a GFT plant for ethanol recovery (Baker, 2000)

Spurred on by this success, a great deal of effort is being made to apply pervaporation to other difficult separations. Another commercial pervaporation application is the separation of dissolved VOCs (volatile organic compounds) from water, developed by Membrane Technology and Research, Inc. Relatively hydrophobic composite membranes, such as silicone rubber coated on a microporous polyimide support membrane, are used. Extremely high separation factors can be obtained for the more hydrophobic VOCs such as toluene, benzene, chlorinated solvents, esters and ethers. Other commercial pervaporation processes involve the separation of organics and water. This separation is relatively easy, because organic solvents and water have very different polarities and exhibit distinct membrane permeation properties. The first pilot-plant result for an organic-organic application, the separation of methanol from methyl t-butyl ether/isobutene mixtures, was reported by Separex in 1988. This is a particularly favorable application, and available cellulose acetate membranes achieve a good separation. More recently, Exxon started a pervaporation pilot plant for the separation of aromatic/aliphatic mixtures, using polyimide/polyurethane block copolymer membranes. This separation is one of the major separation problems in refineries (Baker, 2000).
4.4 Solvent Dewaxing

A promising new application of reverse osmosis in the chemical industry is the separation of organic/organic mixtures. These separations are difficult because of the high osmotic pressures that must be overcome, and because they require membranes that are sufficiently solvent-resistant to be mechanically stable, yet sufficiently permeable for good fluxes to be obtained. One application that has already reached the commercial stage is the separation of small solvent molecules from larger hydrocarbons in mixtures resulting from the extraction of vacuum residual oil in refineries. Figure 4a shows a simplified flow diagram of a refinery lube oil separation process; these operations are very large. In a typical 100,000 barrel/day refinery, about 15,000 barrel/day of the oil entering the refinery remains as residual oil. A large fraction of this oil is sent to the lube oil plant, where the heavy oil is mixed with 3 to 10 volumes of a solvent such as methyl ethyl ketone and toluene. On cooling the mixture, the heavy wax components precipitate out and are removed by a drum filter. The light solvent is then stripped from the lube oil by vacuum distillation and recycled through the process. The vacuum distillation step is very energy intensive because of the high solvent-to-oil ratios employed. A reverse osmosis process developed by Mobil in 1998 for this separation is illustrated in Figure 4b. Polyimide membranes formed into spiral-wound modules are used to separate up to 80% of the solvent from the dewaxed oil. The membranes have a flux of 10-20 gal/ft2·day at a pressure of 450-650 psi. The solvent filtrate bypasses the distillation step and is recycled directly to the incoming oil feed. The net result is a significant reduction in the refrigeration load required to cool the oil and in the size and energy consumption of the solvent recovery vacuum distillation section (Baker, 2001). Mobil is now licensing this technology to other refineries. Development of similar applications in other operations is likely. Initially, applications will probably involve relatively easy separations, such as the separation of methyl ethyl ketone/toluene from lube oil described above or soybean oil from hexane in food oil production. In the long term, however, the technology may become sufficiently advanced to be used in more important refining operations, such as the fractionation of linear from branched paraffins, or the separation of benzene and other aromatics from paraffins and olefins in the gasoline pool.
Fig. 4. Simplified flow schemes of (a) a conventional and (b) Mobil Oil's membrane solvent dewaxing processes (Baker, 2001)

4.5 Membrane Aromatic Recovery System

Phenolic compounds are used in phenolic resins, polycarbonates, biocides and agrochemicals. Aromatic amines are used in a wide range of consumer products, including polyurethane foam, dyes, rubber chemicals and pharmaceuticals. The factories that
manufacture and/or use these types of chemicals often create aqueous waste streams containing significant (0.1-10 wt%) amounts of aromatic amines or phenolic compounds. Phenol is an aromatic acid with a solubility of 8 wt.% in water at 25°C. This compound is highly toxic and one of the EPA's priority pollutants. Two of the main commercial applications for phenol are the production of bisphenol A and of phenol-formaldehyde resins; phenol and formaldehyde are the main reagents in the phenol-formaldehyde resin production process. Since phenol is highly toxic and inhibitory to biological treatment at high concentrations (>200 mg/l), the recovery of phenol from industrial wastewater streams has generated significant interest. Methods for the recovery of phenol include solvent extraction, activated carbon and polymer adsorption, and membrane processes. Membrane technologies have attracted attention for the removal of low-volatility organics from wastewaters. Porous membranes have been used for membrane solvent extraction for the recovery of organics from aqueous solutions. However, porous membranes have a major shortcoming due to their instability, i.e. breakthrough of the immobilized phase in the pores can occur unless a high breakthrough pressure through the membrane is maintained. Nonporous membranes have been proposed for carrying out extraction. Compared to porous membranes, the breakthrough pressure is much higher through nonporous membranes; however, this comes at the expense of a lower mass transfer rate in the membrane extraction. The Membrane Aromatic Recovery System (MARS) is a relatively new process for the recovery of aromatic acids and bases. In the MARS process, aromatics are selectively removed from a wastewater stream into a stripping solution via a tubular silicone rubber membrane with a wall thickness of 500 μm. For aromatic bases (e.g. aniline) the stripping solution is maintained at an acidic pH using HCl, and for aromatic acids (e.g. phenol) it is maintained at a basic pH using NaOH (Ferreira et al., 2005). The mass transfer rate of water through the membrane is negligible due to the hydrophobicity of the tubular silicone rubber membrane combined with its relatively large thickness. Ion transport is also negligible; hence the ionic form of the aromatic, formed in the stripping solution, cannot pass back across the membrane into the wastewater solution. This not only keeps the aromatic in the stripping solution but also maintains the driving force across the membrane. MARS technology has been successfully applied for the recovery of phenol and aniline at lab and pilot plant scale. It has also been applied at full plant scale for the recovery of p-cresol since December 2002 at a Degussa plant in Knottingley, UK (Daisley et al., 2006).

4.6 Membrane Bio-Reactor

Activated sludge processes (ASPs) have been widely used for biological wastewater and sewage treatment. However, since solid-liquid separation of activated sludge by gravitational settling is difficult, the biomass concentration that can be maintained is limited to approximately 5000 mg/L, so the bio-reactor volume becomes large. On account of this difficulty of solid-liquid separation in biological wastewater treatment, the MBR has been one of the most prevalent solutions since the late 1960s. Membrane bio-reactor technology combines the biological degradation process of activated sludge with direct solid-liquid separation by membrane filtration.
By using micro- or ultrafiltration membrane technology (with pore sizes ranging from 0.05 to 0.4 μm), MBR systems allow the complete physical retention of bacterial flocs and virtually all suspended solids within the bioreactor. As a result, the MBR has many advantages over conventional
wastewater treatment processes. These include a small footprint and reactor requirements, high effluent quality, good disinfection and odor control capability, higher volumetric loading and less sludge production (Wisniewski, 2007). The MBR process has therefore become an attractive option for the treatment and reuse of industrial, domestic and municipal wastewaters, as evidenced by its constantly rising numbers and capacity. There are more than 2200 MBR installations in operation or under construction worldwide. In North America, 258 full-scale MBR plants have been constructed, of which 39 are for industrial wastewater treatment and 7 for industrial chemical wastewater treatment. The current MBR market has been estimated at around US$216 million and is expected to rise to US$363 million by 2010 (Atkinson, 2006). Wastewater in the petrochemical industry is currently treated by the activated sludge process with pretreatment for oil/water separation. Tightening effluent regulations and the increasing need for reuse of treated water have generated interest in treating petrochemical wastewater with the advanced MBR process. Tam et al. used membrane bio-reactor/reverse osmosis (MBR/RO) and microfiltration/reverse osmosis (MF/RO) systems for the reclamation and reuse of wastewater. A schematic of these two systems is given in Figure 5. As can be seen, the MBR/RO operation needs no "primary treatment" or "secondary treatment", which is an advantage of the MBR/RO system. The performance of the two systems is summarized in Table 6 (Tam et al., 2007).
                  MBR/RO system                            MF/RO system
Parameter         Feed       MBR effluent  RO permeate    Feed       MF effluent  RO permeate
BOD5 (mg/L)       198        <2            <2             3          <2           <2
CODcr (mg/L)      391        17.5          <2             23         17.9         <2
SS (mg/L)         201        <2            <2             2          <2           <2
E. coli (cell/L)  4.1*10^7   3.4 (44.3%)   ND             2.8*10^7   2 (19.7%)    ND
TKN (mg/L)        43.0       1.6           0.1            3.1        1.5          0.4

E. coli: Escherichia coli; ND: Not Detected; SS: Suspended Solids; TKN: Total Kjeldahl Nitrogen (the sum of organic N, NH3 and NH4+)

Table 6. Performance of MBR/RO and MF/RO systems (Tam et al., 2007)
5. Conclusion

The growing body of research on membrane technology indicates that membranes will become a genuine alternative for industrial processes. However, much research and development effort is still needed to commercialize membranes in the international market. Increased worldwide competitiveness in production has forced industry to improve current process designs, and the increased importance of the natural environment has compelled industry to develop new ones. Consequently, the development of new process designs using alternative technologies is of growing importance to industry. Continuous research on membrane properties and on the fundamental aspects of transport phenomena in the various membrane operations is important for the future of
membrane science and technology. There is a need for both basic and applied research to develop new membranes with improved properties and new membrane processes.
Fig. 5. An illustration of MBR/RO and MF/RO systems (Tam et al., 2007)
6. References

Abu Qdaisa, H. & Moussab, H. (2004). Removal of heavy metals from wastewater by membrane processes: a comparative study, Desalination, Vol. 164, pp. 105-110.
Aricò, A.S.; Baglio, V.; Di Blasi, A.; Cretì, P.; Antonucci, P.L. & Antonucci, V. (2003). Influence of the acid-base characteristics of inorganic fillers on the high temperature performance of composite membranes in direct methanol fuel cells. Solid State Ionics, Vol. 161, pp. 251-265.
Atkinson, S. (2006). Research studies predict strong growth for MBR markets, Membr. Technol., Vol. 2006, pp. 8-10.
Baker, R. W. (2000). Membrane Separation, Membrane Technology & Research Inc. (MTR).
Baker, R. (2001). Membrane Technology in the Chemical Industry: Future Directions, In: Membrane Technology in the Chemical Industry, Nunes, S. P. & Peinemann, K.-V. (Eds.), John Wiley.
Baker, R. W. (2004). Membrane technology and applications, John Wiley & Sons Ltd.
Bodzek, M. (2000). Membrane Techniques in Air Cleaning: Reviews, Polish J. Environ. Studies, Vol. 9, No. 1, pp. 1-12.
Burns, R. L. & Koros, W. J. (2003). Defining the challenges for C3H6/C3H8 separation using polymeric membranes. J. Membr. Sci., Vol. 211, pp. 299-309.
Cheryan, M. (1998). Ultrafiltration and Microfiltration Handbook, Technomic Publishing Co.
Daisley, G. R.; Dastgir, M. G.; Ferreira, F. C.; Peeva, L. G. & Livingston, A. G. (2006). Application of thin film composite membranes to the membrane aromatic recovery system, J. Membr. Sci., Vol. 268, pp. 20-36.
Duke, M. C.; da Costa, J. C. D.; Do, D. D.; Gray, P. G. & Lu, G. Q. (2006). Hydrothermally Robust Molecular Sieve Silica for Wet Gas Separation, Advanced Functional Materials, Vol. 16, pp. 1215-1220.
Ferreira, F. C.; Peeva, L.; Boam, A.; Zhang, Sh. & Livingston, A. (2005). Pilot scale application of the Membrane Aromatic Recovery System (MARS) for recovery of phenol from resin production condensates, J. Membr. Sci., Vol. 257, pp. 120-133.
Gryaznov, V. (2000). Metal containing membranes for the production of ultrapure hydrogen and the recovery of hydrogen isotopes. Separation and Purification Methods, Vol. 29, pp. 171-187.
Heydari Gorji, A.; Kaghazchi, T. & Kargari, A. (2009a). Selective Removal of Carbon Dioxide from Wet CO2/H2 Mixtures via Facilitated Transport Membranes Containing Amine Blends as Carriers, Chem. Eng. Technol., Vol. 32, No. 1, pp. 120-128.
Heydari Gorji, A.; Kaghazchi, T. & Kargari, A. (2009b). Analytical solution of competitive facilitated transport of acid gases through liquid membranes, Desalination, Vol. 235, pp. 245-263.
Jones, A. T. & Rowley, W. (2003). Global perspective: Economic forecast for renewable ocean energy technology. Marine Technology Society Journal, Vol. 36, pp. 85-90.
Judd, S. & Jefferson, B. (2003). Membranes for Industrial Wastewater Recovery and Re-use, Elsevier Science Ltd.
Jung, D. H.; Cho, S.Y.; Peck, D.H.; Shin, D.R. & Kim, J.S. (2003). Preparation and performance of a Nafion/montmorillonite nanocomposite membrane for direct methanol fuel cell. Journal of Power Sources, Vol. 118, pp. 205-211.
Kaghazchi, T.; Kargari, A.; Yegani, R. & Zare, A. (2006). Emulsion liquid membrane pertraction of L-lysine from dilute aqueous solutions by D2EHPA mobile carrier, Desalination, Vol. 190, pp. 161-171.
Kaghazchi, T.; Takht Ravanchi, M.; Kargari, A. & Heydari Gorji, A. (2009). Application of Liquid Membrane in Separation Processes, J. Sep. Sci. Eng., in press.
Kargari, A.; Kaghazchi, T.; Mohagheghi, E. & Mirzaei, P. (2002). Application of Emulsion Liquid Membrane for treatment of phenolic wastewaters, Proceedings of 7th Iranian Congress of Chemical Engineering, pp. 310-316, Tehran University, October 2002, Iran.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2003a). Application of Emulsion Liquid Membrane in the Extraction of Valuable Metals from Aqueous Solutions, 4th European Congress of Chemical Engineering, Granada, September 2003, Spain.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2003b). Role of Emulsifier in the Extraction of Gold (III) Ions from Aqueous Solutions Using Emulsion Liquid Membrane Technique, Permea2003 Conference, Tatranske Matliare, September 2003, Slovakia.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2003c). Extraction of gold (III) ions from aqueous solutions using surfactant liquid membrane, 8th Iranian National Chemical Engineering Conference, Mashhad, October 2003, Iran.
Kargari, A.; Kaghazchi, T.; Kamrani, G. & Forouhar, T. (2003d). Recovery of Phenol from High Concentration Phenolic Wastewater by Emulsion Liquid Membrane Technology, 8th Iranian National Chemical Engineering Conference, Mashhad, October 2003, Iran.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2004a). Role of Emulsifier in the Extraction of Gold (III) Ions from Aqueous Solutions Using Emulsion Liquid Membrane Technique, Desalination, Vol. 162, pp. 237-247.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2004b). Mass transfer investigation of liquid membrane transport of gold (III) by methyl iso-butyl ketone mobile carrier, J. Chem. Eng. & Tech., Vol. 27, pp. 1014-1018.
Kargari, A.; Kaghazchi, T.; Sohrabi, M. & Soleimani, M. (2004c). Batch Extraction of Gold (III) Ions from Aqueous Solutions Using Emulsion Liquid Membrane via Facilitated Carrier Transport, J. Membr. Sci., Vol. 233, pp. 1-10.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2004d). Mass transfer investigation of liquid membrane transport of gold (III) by methyl iso-butyl ketone mobile carrier, Chisa Conference, Praha, August 2004, Czech Republic.
Kargari, A.; Kaghazchi, T.; Sohrabi, M. & Soleimani, M. (2004e). Emulsion liquid membrane pertraction of gold (III) ion from aqueous solutions, 9th Iranian Chemical Engineering Congress, Iran University of Science and Technology, November 2004.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2005a). Extraction of gold (III) ions from aqueous solutions using emulsion liquid membrane technique, International Solvent Extraction Conference (ISEC 2005), September 2005, China.
Kargari, A.; Kaghazchi, T.; Kamrani, G. & Forouhar, T. (2005b). Pertraction of phenol from aqueous wastes using emulsion liquid membrane system, FILTECH Conference, Wiesbaden, October 2005, Germany.
Kargari, A.; Kaghazchi, T.; Sohrabi, M. & Soleimani, M. (2006a). Application of Experimental Design to Emulsion Liquid Membrane Pertraction of Gold (III) Ions from Aqueous Solutions, Iranian Journal of Chemical Engineering, Vol. 3, No. 1, pp. 76-90.
Kargari, A.; Kaghazchi, T.; Mardangahi, B. & Soleimani, M. (2006b). Experimental and modeling of selective separation of gold (III) ions from aqueous solutions by emulsion liquid membrane system, J. Membr. Sci., Vol. 279, pp. 389-393.
Kargari, A.; Kaghazchi, T. & Soleimani, M. (2006c). Mathematical modeling of emulsion liquid membrane pertraction of gold (III) from aqueous solutions, J. Membr. Sci., Vol. 279, pp. 380-388.
Kim, J. H.; Min, B. R.; Won, J.; Joo, S. H.; Kim, H. S. & Kang, Y. S. (2003). Role of polymer matrix in polymer-silver complexes for structure, interaction, and facilitated olefin transport, Macromolecules, Vol. 36, pp. 6183-6188.
Klette, H. & Bredesen, R. (2005). Sputtering of very thin palladium-alloy hydrogen separation membranes. Membrane Technology, Vol. 5, pp. 7-9.
Kong, Y.; Du, H.; Yang, J.; Shi, D.; Wang, Y.; Zhang, Y. & Xin, W. (2002). Study on polyimide/TiO2 nanocomposite membranes for gas separation, Desalination, Vol. 146, pp. 49-55.
Kreuer, K.D. (2001). On the development of proton conducting polymer membranes for hydrogen and methanol fuel cells. J. Membr. Sci., Vol. 185, pp. 29-39.
Krol, J. J.; Boerrigter, M. & Koops, G. H. (2001). Polymeric hollow fiber gas separation membranes: preparation and the suppression of plasticization in propane-propylene environments, J. Membr. Sci., Vol. 184, pp. 275-286.
Krull, F.F.; Fritzmann, C. & Melin, T. (2008). Liquid membranes for gas/vapor separations, J. Membr. Sci., Vol. 325, pp. 509-519.
Landaburu-Aguirre, J.; García, V.; Pongrácz, E. & Keiski, R. (2006). Applicability of membrane technologies for the removal of heavy metals, Desalination, Vol. 200, pp. 272-273.
Li, N.N.; Fane, A. G.; Ho, W.S.W. & Matsuura, T. (2008). Advanced membrane technology and applications, John Wiley & Sons, Inc.
Lin, H. & Freeman, B. D. (2005). Materials selection guidelines for membranes that remove CO2 from gas mixtures, Journal of Molecular Structure, Vol. 739, pp. 57-74.
Lin, H.; Wagner, E. V.; Freeman, B. D.; Toy, L.G. & Gupta, R. P. (2006). Plasticization-Enhanced Hydrogen Purification Using Polymeric Membranes, Science, Vol. 311, pp. 639-642.
Madaeni, S.S. & Mansourpanah, Y. (2003). COD Removal from Concentrated Wastewater Using Membranes. Filtration and Separation, Vol. 40, pp. 40-46.
Mohammadi, S.; Kaghazchi, T. & Kargari, A. (2008). A model for metal ion pertraction through supported liquid membrane, Desalination, Vol. 219, pp. 324-334.
Nabavinia, M.; Kaghazchi, T.; Kargari, A. & Soleimani, M. (2009). Extraction of Cd2+ from aqueous solutions by supported liquid membrane containing D2EHPA-M2EHPA carrier, Desalination, under review.
Nabieyan, B.; Kaghazchi, T.; Kargari, A.; Mahmoudian, A. & Soleimani, M. (2007). Bench-scale simultaneous extraction and stripping of iodine using bulk liquid membrane system, Desalination, Vol. 214, pp. 167-176.
Pabby, A. K. (2008). Membrane techniques for treatment in nuclear waste processing: global experience, Membrane Technology, Vol. 2008, pp. 9-13.
Pabby, A. K.; Rizvi, S.S.H. & Sastre, A.M. (2009). Membrane Separations: Chemical, Pharmaceutical, Food, and Biotechnological Applications, CRC Press, Taylor & Francis Group.
Park, Y. S.; Won, J. & Kang, Y. S. (2001). Facilitated transport of olefin through solid PAAm and PAAm-graft composite membranes with silver ions. J. Membr. Sci., Vol. 183, pp. 163-170.
Post, J.W.; Veerman, J.; Hamelers, H.V.M.; Euverink, G.; Metz, S.J.; Nymeijer, K. & Buisman, C. (2007). Salinity gradient power: Evaluation of pressure retarded osmosis and reverse electrodialysis. J. Membr. Sci., Vol. 288, pp. 218-230.
Powell, C.E. & Qiao, G.G. (2006). Polymeric CO2/N2 gas separation membranes for the capture of carbon dioxide from power plant flue gases. J. Membr. Sci., Vol. 279, pp. 1-49.
Rezaei, M.; Mehrabani, A.; Kaghazchi, T. & Kargari, A. (2004). Extraction of chromium ion from industrial wastewaters using bulk liquid membrane, 9th Iranian Chemical Engineering Congress, Iran University of Science and Technology, November 2004.
Shishatskiy, S. (2006). Polyimide Asymmetric Membranes for Hydrogen Separation: Influence of Formation Conditions on Gas Transport Properties, Advanced Engineering Materials, Vol. 8, pp. 390-397.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2008a). Separation of a Propylene-Propane Mixture by a Facilitated Transport Membrane, 5th International Chemical Engineering Congress (IChEC 2008), January 2008, Kish Island, Iran.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2008b). Immobilized Liquid Membrane for Propylene-Propane Separation, Proceedings of World Academy of Science, Engineering and Technology, pp. 696-698, ISSN 1307-6884, July 2008, Paris, France.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2008c). A New Approach in Separation of Olefin-Paraffin Gas Mixtures by a Membrane System, Amirkabir Journal of Science and Research, Vol. 19, pp. 47-54.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2008d). Application of facilitated transport membrane systems for the separation of hydrocarbon mixtures, 18th International Congress of Chemical and Process Engineering, Praha, August 2008, Czech Republic.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2009a). Application of Membrane Separation Processes in Petrochemical Industry: A Review, Desalination, Vol. 235, pp. 199-244.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2009b). Separation of Propylene-Propane Mixture Using Immobilized Liquid Membrane via Facilitated Transport Mechanism, Sep. Sci. Tech., Vol. 44, pp. 1198-1217.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2009c). Supported Liquid Membrane Separation of Propylene-Propane Mixtures Using a Metal Ion Carrier, Desalination, in press.
Takht Ravanchi, M.; Kaghazchi, T.; Kargari, A. & Soleimani, M. (2009d). A Novel Separation Process for Olefin Gas Purification: Effect of Operating Parameters on Separation Performance and Process Optimization, Journal of Taiwan Institute of Chemical Engineers, in press. doi:10.1016/j.jtice.2009.02.007
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2009e). Facilitated Transport Separation of Propylene-Propane: Experimental and Modeling Study, J. Chem. Eng. Proc., under review.
Takht Ravanchi, M.; Kaghazchi, T. & Kargari, A. (2009f). The Application of Facilitated Transport Mechanism for Hydrocarbon Separation, Iranian Journal of Chemical Engineering, under review.
Tam, L.S.; Tang, T.W.; Lau, G.N.; Sharma, K.R. & Chen, G.H. (2007). A pilot study for wastewater reclamation and reuse with MBR/RO and MF/RO systems, Desalination, Vol. 202, pp. 106-113.
Tchicaya-Bouckary, L.; Jones, D.J. & Rozière, J. (2002). Hybrid polyaryletherketone membranes for fuel cells applications. Fuel Cells, Vol. 1, pp. 40-45.
Viero, A.F.; Mazzarollo, A.C.R.; Wada, K. & Tessaro, I.C. (2002). Removal of hardness and COD from retanning treated effluent by membrane process. Desalination, Vol. 149, pp. 145-149.
Wisniewski, Ch. (2007). Membrane bioreactor for water reuse, Desalination, Vol. 203, pp. 15-19.
Zakrzewska-Trznadel, G.; Harasimowicz, M. & Chmielewski, A.G. (2001). Membrane processes in nuclear technology: application for liquid radioactive waste treatment, Sep. Pur. Tech., Vol. 22-23, pp. 617-625.
Zhang, L.; Chen, X.; Zeng, Ch. & Xu, N. (2006). Preparation and gas separation of nano-sized nickel particle-filled carbon membranes, J. Membr. Sci., Vol. 281, pp. 429-434.
22
Estimating Development Time and Effort of Software Projects by using a Neuro_Fuzzy Approach

Venus Marza1 and Mohammad Teshnehlab2
1Islamic Azad University, South Tehran Branch, Faculty of Technical and Engineering; 2Department of Electrical and Computer Engineering, Khaje Nasir Toosi University, Tehran, Iran

1. Introduction

As software becomes more complex and its scope dramatically increases, the importance of research on methods for estimating software development effort has grown continually. Estimating the amount of effort required to develop a software system is an important project management concern, because such estimates are the basis for budgeting and project planning, which are critical for the software industry; accurate software estimation is therefore critical for project success. Many models have been proposed for software effort estimation: algorithmic models such as COCOMO, SLIM, multiple regression and statistical models, and non-algorithmic models such as neural network (NN) models, fuzzy logic models, case-based reasoning (CBR) and regression trees. Here we aim to improve estimation accuracy by integrating the advantages of algorithmic and non-algorithmic models. Recent research has tended to focus on the use of function points (FPs) in estimating software development effort, but a precise estimation should not only consider the FPs, which represent the size of the software, but should also include the various elements of the development environment that affect effort estimation. Consequently, for software development effort estimation by the neuro-fuzzy approach, we use all the factors significant to software effort, so the final results are accurate and reliable when applied to a real dataset from a software project. The empirical validation uses the International Software Benchmarking Standards Group (ISBSG) Data Repository Version 10 to demonstrate the improvement of results. This dataset contains information on 4106 projects, of which two thirds were developed between the years 2000 and 2007. The evaluation criteria were based mainly upon MMRE (Mean Magnitude of Relative Error), MMER and PRED(20). The results show slightly better predictive accuracy than fuzzy logic models, neural network models, multiple regression models and statistical models. This chapter is organized as follows: in section 1, we briefly review fuzzy logic models and neural network models in the software estimation domain.
396
Advanced Technologies
Section 2 begins with preparing the dataset, followed by a description of our proposed model. The experimental results are examined in detail in Section 3, and finally Section 4 offers conclusions and recommendations for future research in this area.
1. Survey of fuzzy logic and neural network models

1.1 Fuzzy Logic Model

Since its foundation by Zadeh in 1965, fuzzy logic has been the subject of important investigations [Idri & Abran, 2001]. Fuzzy logic enhances the user's ability to interpret the model, allowing the user to view, evaluate, criticize and possibly adapt the model. Prediction can be explained through a series of rules [Gray & MacDonell, 1997],[Saliu et al., 2004]. After analyzing the fuzzy logic model, experts can check the model to avoid the adverse effects of unusual data, thereby increasing its robustness. Additionally, fuzzy logic models can be easily understood in comparison to regression models and neural networks, making them an effective communication tool for management [MacDonell et al., 1999],[Gray & MacDonell, 1999]. In comparison to fuzzy logic, case-based reasoning is similarly easy to interpret, but it requires a high volume of data [Su et al., 2007]. The purpose of this section is not to discuss fuzzy logic in depth, but rather to present those parts of the subject that are necessary for understanding this chapter and for comparison with the neuro-fuzzy model. Fuzzy logic offers a particularly convenient way to generate a mapping between input and output spaces thanks to the natural expression of fuzzy rules. The number of fuzzy rules for six input variables with three membership functions each is 3^6 = 729. As a result, writing these rules is an arduous task, so based on the statistical model we use two input variables, as demonstrated later. Implementing a fuzzy system requires that the different categories of the different inputs be represented by fuzzy sets, which in turn are represented by membership functions. A natural membership function type that readily comes to mind is the triangular membership function [Moataz et al., 2005]. A triangular MF is a three-parameter function MF(a, b, c) defined by minimum (a), modal (b) and maximum (c) values, where a ≤ b ≤ c:

MF(x) = 0 if x ≤ a
MF(x) = (x − a)/(b − a) if a ≤ x ≤ b, so that MF(b) = 1
MF(x) = (c − x)/(c − b) if b ≤ x ≤ c
MF(x) = 0 if x ≥ c

Based on the correlation (r) of the variables, fuzzy rules can be formulated. Correlation, the degree to which two sets of data are related, varies from -1.0 to 1.0. The correlation coefficient for the input variables is calculated from the equation below [Humphrey, 2002]:
r = [n Σ(X_i·Y_i) − (ΣX_i)(ΣY_i)] / sqrt([n ΣX_i^2 − (ΣX_i)^2]·[n ΣY_i^2 − (ΣY_i)^2])    (1)
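As an illustration, here is a short Python sketch of the two building blocks just described, the triangular membership function and the correlation coefficient of Eq. (1). The function names are ours; the chapter's own implementation used MATLAB's fuzzy tooling.

```python
import math

def triangular_mf(x, a, b, c):
    """Triangular membership function MF(a, b, c), assuming a < b < c; MF(b) = 1."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

def pearson_r(xs, ys):
    """Correlation coefficient of Eq. (1)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx, syy = sum(x * x for x in xs), sum(y * y for y in ys)
    return ((n * sxy - sx * sy)
            / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2)))
```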
An acceptable correlation should have an absolute value higher than 0.5. The fuzzy inference process uses the Mamdani approach for evaluating each variable's complexity degree once linguistic terms, fuzzy sets and fuzzy rules are defined. Specifically, we apply the minimum method to evaluate the 'and' operation, and consequently we obtain one number that represents the antecedent result for that rule. The antecedent result, as a single number, creates the consequence using the minimum implication method. Overall, each rule is applied in the implication process and produces one result. Aggregation using the maximum method then combines all consequences from all the rules and produces one fuzzy set as the output. Finally, the output fuzzy set is defuzzified to a crisp single number using the centroid calculation method [Xia et al., 2007]. This two-input-one-output fuzzy logic system for effort is depicted in Figure 1. The results of this model are shown in Table 7 and Table 9.
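A compact numeric sketch of this Mamdani pipeline (min for 'and', min implication, max aggregation, centroid defuzzification) is given below. The universes, sets and rules are hypothetical placeholders, not the chapter's actual effort model:

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular MF, elementwise; assumes a < b < c."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

out_u = np.linspace(0.0, 1.0, 201)               # discretized output universe
out_sets = {"low":  tri(out_u, -0.5, 0.0, 0.5),  # hypothetical output fuzzy sets
            "med":  tri(out_u,  0.0, 0.5, 1.0),
            "high": tri(out_u,  0.5, 1.0, 1.5)}

def mamdani_infer(x1, x2, rules):
    agg = np.zeros_like(out_u)
    for mf1, mf2, cons in rules:
        w = min(tri(x1, *mf1), tri(x2, *mf2))                 # 'and' -> minimum
        agg = np.maximum(agg, np.minimum(w, out_sets[cons]))  # min implication, max aggregation
    return float((agg * out_u).sum() / agg.sum())             # centroid defuzzification

# Three hypothetical rules pairing like-named input and output sets:
rules = [((-0.5, 0.0, 0.5), (-0.5, 0.0, 0.5), "low"),
         (( 0.0, 0.5, 1.0), ( 0.0, 0.5, 1.0), "med"),
         (( 0.5, 1.0, 1.5), ( 0.5, 1.0, 1.5), "high")]
print(mamdani_infer(0.7, 0.8, rules))   # crisp output in [0, 1]
```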
Fig. 1. The Fuzzy Logic System for Effort Estimation

1.2 Neural Network Model

Artificial neural networks are used in estimation due to their ability to learn from previous data. In addition, they can generalize from the training data set, enabling them to produce acceptable results for previously unseen data [Su et al., 2007]. Artificial neural networks can model complex non-linear relationships and approximate any measurable function, so they are very useful in problems where there is a complex relationship between inputs and outputs [Aggarwal et al., 2005],[Huang et al., 2007]. When looking at a neural network, it immediately comes to mind that activation functions look like fuzzy membership functions [Jantzen, 1998]. Our neural network model uses an RBF network, which is easier to train than an MLP network. The RBF network is structured similarly to the MLP in that it is a multilayer, feedforward network. However, unlike the MLP, the hidden units in the RBF are different from
the units in the input and output layers. Specifically, they contain the RBF, a statistical transformation based on a Gaussian distribution from which the neural network's name is derived [Heiat, 2002]. Since the scales of our variables differ significantly, we first normalized the data and then randomly divided them into two categories: 75% of the projects are used for training and 25% for testing. The trajectory of the training phase is depicted in Figure 2. In particular, we used the Generalized Regression Neural Network model in MATLAB 7.6; an RBF network was created and the data set was applied to it. The results are shown in Tables 7-9.
Fig. 2. Progress of Training Phase
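For reference, a GRNN reduces to Gaussian-kernel-weighted averaging of the training targets (the Nadaraya-Watson form). The sketch below, with made-up data and an assumed spread value, mirrors the normalize-then-75/25-split procedure; it is our simplification, not the MATLAB model used in the chapter:

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, spread=0.1):
    """GRNN prediction: each training point acts as an RBF unit; the output
    is the Gaussian-kernel-weighted mean of the training targets."""
    preds = []
    for x in X_query:
        d2 = np.sum((X_train - x) ** 2, axis=1)     # squared distances to all units
        w = np.exp(-d2 / (2.0 * spread ** 2))       # Gaussian kernel weights
        preds.append(np.dot(w, y_train) / (np.sum(w) + 1e-12))
    return np.array(preds)

# Usage sketch: synthetic normalized data, 75/25 train/test split as in the text.
rng = np.random.default_rng(1)
X = rng.random((100, 2)); y = X[:, 0] + 0.5 * X[:, 1]
split = int(0.75 * len(X))
y_hat = grnn_predict(X[:split], y[:split], X[split:], spread=0.1)
```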
2. Proposed Model

2.1 Choosing a Neuro-Fuzzy Model for estimation

Comparing artificial neural networks (ANN) and fuzzy inference systems (FIS), we find that neural networks have difficulty incorporating prior rule knowledge, learn from scratch, have complicated learning algorithms, are black-box structures and make knowledge extraction difficult, while fuzzy inference systems can incorporate a prior rule base, are interpretable through if-then rules, and have a simple interpretation and implementation, but they cannot learn, and linguistic knowledge must be available in advance. Therefore, it seems natural to consider building an integrated system combining the concepts of FIS and ANN modeling. A common way to integrate them is to represent them in a special architecture. Different integrated neuro-fuzzy models implement Mamdani or Takagi-Sugeno fuzzy inference systems; some of them are FALCON, ANFIS, NEFCON, NEFCLASS, NEFPROX, FUN, SONFIN, EFuNN, dmEFuNN and many others [Abraham, 2005]. Due to the unavailability of source codes, we are unable to provide a comparison with all the models. In general, a Takagi-Sugeno fuzzy system has a lower Root Mean Square Error (RMSE)
than a Mamdani-type fuzzy system, but Mamdani fuzzy systems are much faster than Takagi-Sugeno types. Since our purpose is accuracy, we did not consider Mamdani-type fuzzy systems such as FALCON, NEFCON, NEFCLASS and EFuNN. Since no formal neural network learning technique is used in FUN, which randomly changes the parameters of membership functions and connections within the network structure, we do not consider it a neuro-fuzzy system. For the other models, a comparative performance of some neuro-fuzzy systems for predicting the Mackey-Glass chaotic time series [Mackey & Glass, 1977] is presented in Table 1.

System    Epochs   Test RMSE
ANFIS     75       0.0017
NEFPROX   216      0.0332
EFuNN     1        0.0140
dmEFuNN   1        0.0042
SONFIN    1        0.0180

Table 1. Performance of neuro-fuzzy systems
As shown in Table 1, ANFIS has the lowest RMSE, compared to NEFPROX (highest RMSE), SONFIN and dmEFuNN, which use the Takagi-Sugeno fuzzy system. We therefore use ANFIS as the neuro-fuzzy model for predicting the effort of software projects.

2.2 Preparing Dataset

In this study we used the latest publication of the ISBSG (International Software Benchmarking Standards Group) data repository, Release 10, which contains information on 4106 projects, two thirds of which were developed between the years 2000 and 2007. One hundred and seven metrics were recorded for each project, including data quality rating, project size, work effort, project elapsed time, development type, development techniques, language type, development platform, methodology and max team size. The ISBSG data repository includes an important metric, the Data Quality Rating, which indicates the reliability of the reported data. We excluded 141 projects with quality rating D, which had little credibility. Project size is recorded in function points, and homogeneity of standardized methodologies is essential for measuring functional size. Among the different function point counting approaches, NESMA is considered to produce results equivalent to IFPUG [NESMA 1996], and most projects used these approaches for counting function points. To give more reliable results, projects with other counting approaches were excluded from the analysis. Some projects also contained erroneous information; for example, an Average Team Size of 0.5 or 0.95, or a Development Platform recorded as 'HH', which is not acceptable. Finally, after cleaning the data, 3322 projects remained for effort prediction.

2.3 Suggested model

Our study is based on statistical regression analysis, which is the most widely used approach for the estimation of software development effort. We briefly introduce the variables in the data repository which will be used as predicators for the regression analysis [Zhizhong et al.a, 2007]:
1. Functional Size: the size of the project, measured in function points.
2. Average Team Size: the average number of people that worked on the project through the entire development process.
3. Language Type: the language type used for the project, such as 2GL, 3GL, 4GL and ApG. 2GLs (second-generation languages) are machine-dependent assembly languages; 3GLs are high-level programming languages like FORTRAN and C; 4GLs like SQL are more advanced than traditional high-level programming languages; and an ApG (Application Generator) is a program that allows programmers to build an application without writing extensive code.
4. Development Type: describes whether the software development was a new development, an enhancement or a re-development.
5. Development Platform: defines the primary development platform. Each project was developed for one of the platforms: midrange, mainframe, multi-platform, or personal computer.
6. Development Techniques: specific techniques used during software development (e.g. Waterfall, Prototyping, Data Modeling, RAD). A large number of projects make use of various combined techniques.
7. Case Tool Used: indicates whether the project used any CASE (Computer-Aided Software Engineering) tool.
8. How Methodology Acquired: describes whether the development methodology was traditional, purchased, developed in-house, or a combination of purchased and developed.
It is important to point out that [Zhizhong et al b., 2007]:
∗ We did not take into account the factor primary programming language, since each particular programming language (Java, C, etc.) belongs to one of the generation languages (2GL, 3GL, etc.).
∗ It is conceivable that senior software developers are more proficient and productive than junior developers. The ISBSG data repository does not report this and assumes the developers are all well-qualified practitioners.
∗ When considering the factor Development Techniques, there exist over 30 different techniques in the data repository, and 766 projects even used various combinations of these techniques. Our study considered the ten key development techniques (Waterfall, Prototyping, Data Modeling, Process Modeling, JAD or Joint Application Development, Regression Testing, OO or Object Oriented Analysis & Design, Business Area Modeling, Event Modeling, RAD or Rapid Application Development) and separated each of them into a single binary variable indicating whether the technique was used (1) or not (0); other combinations were labeled 'Other' as the development technique factor.
∗ The variables Effort, Size and Average Team Size are measured on ratio scales while all others are measured on nominal scales.
By fitting a model with Effort as the dependent variable and all the other variables as predicators, we reduced the number of inputs for prediction, because for ANFIS with the Genfis1 implementation it is impossible to write all the rules, and the complexity of the model would be
increased. So regression analysis helps us to use the variables effectively. Table 2 gives the summary of the variables used for the regression analysis.

Variable               Scale     Description
Effort                 Ratio     Summary Work Effort
Size                   Ratio     Functional Size
Average Team Size      Ratio     Average Team Size
Language Type          Nominal   Language Type
Development Type       Nominal   Development Type
Development Platform   Nominal   Development Platform
CASE Tool              Nominal   CASE Tools Used
Methodology            Nominal   How Methodology Acquired
Waterfall              Nominal   1=Waterfall, 0=Not
Data                   Nominal   1=Data Modelling, 0=Not
Process                Nominal   1=Process Modelling, 0=Not
JAD                    Nominal   1=JAD, 0=Not
Regression             Nominal   1=Regression Testing, 0=Not
Prototyping            Nominal   1=Prototyping, 0=Not
Business               Nominal   1=Business Area Modelling, 0=Not
RAD                    Nominal   1=RAD, 0=Not
O.O                    Nominal   1=Object Oriented Analysis, 0=Not
Event                  Nominal   1=Event Modelling, 0=Not
Other                  Nominal   1=Uncommon Development Techniques, 0=Not
Missing                Nominal   1=Missing, 0=Not

Table 2. Summary of the variables for Regression
The variable Missing was added as an indicator variable; it indicates whether the use of development techniques was recorded for a particular project (1=recorded, 0=missing). The first step is automatic model selection based on Akaike's information criterion (AIC). AIC is a measure of the goodness of fit of an estimated statistical model. Given the assumption of normally-distributed model errors, AIC is given as [Venables & Ripley, 2002]:

AIC = n log(RSS/n) + 2p    (2)

Here n is the number of observations, RSS is the Residual Sum of Squares, and p is the number of parameters to be estimated. AIC penalizes the number of estimated parameters, because increasing the number of parameters improves the goodness of fit (smaller RSS); the preferred model is therefore the one with the lowest AIC value. Based on this criterion, the preferred model with the lowest AIC value is introduced in Table 3.
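A one-line implementation makes the trade-off concrete; the RSS and n values in the usage lines are illustrative only, not taken from the chapter's data:

```python
import math

def aic(rss, n, p):
    """AIC of Eq. (2), assuming normally distributed model errors."""
    return n * math.log(rss / n) + 2 * p

# Illustrative numbers only: the extra parameters of the first fit must
# buy a large enough drop in RSS to earn the lower (better) AIC.
print(aic(rss=280.0, n=589, p=13) < aic(rss=300.0, n=589, p=10))
```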
It is important to point out here that since the original data of Effort versus Average Team Size, and of Effort versus Size, are extremely skewed, we take the natural log transformation (with base e) to make the data look normally distributed. Scatter plots between each pair of variables demonstrate that the relationship between them is then close to linear; accordingly, we can apply a linear model to investigate them.

Regression Term        Df   Sum of Squares   AIC (if variable excluded)
Log(Size)              1    140.6            -161.7
Log(Team Size)         1    134.4            -170.4
Language Type          3    22.2             -357.5
Development Type       2    14.2             -371.1
Development Platform   3    13.8             -373.8
Other                  1    1.9              -393.7
RAD                    1    1.2              -395.1

Table 3. Regression Results Based on AIC (the lowest value of AIC is -395.1)
As regression based on AIC tends to overestimate the number of parameters when the sample size is large [Venables & Ripley, 2002], relying fully on the results produced by AIC is not suitable. AIC should therefore be combined with other statistical criteria such as ANOVA (ANalysis Of VAriance); here we used the ANOVA approach (based on Type I Sums of Squares) to test the significance of the variables. The variables are added into the model in order: according to Table 3, the exclusion of the variable Size results in the greatest increase of the AIC value, so the project size factor is the most significant for development effort; likewise, average team size is the second most important factor, and so on. Based on Table 3, we add the variable Size to the regression model first, then Average Team Size, Language Type and so forth; each time the regression was performed, the most insignificant variable was removed and the model was refitted with the remaining variables. Continuing this process yields the model with the final set of significant terms, represented in Table 4; significance is based on a p-value < 0.05.

Regression Term          Df    Sum of Sq   F-Value   P-Value
Log(Size)                1     497.8       1026.2    <10^-15
Log(Average Team Size)   1     173.7       358.1     <10^-15
Language Type            3     35.9        24.7      4.8*10^-15
Development Platform     3     16.3        11.2      3.8*10^-7
Development Type         2     13.5        13.9      1.3*10^-6
RAD                      1     2.7         5.5       0.019
Other                    1     3.9         8.1       4.6*10^-3
Residuals                573   277.9

Table 4. ANOVA based on Type I Sums of Squares (the significance level is based on p-value < 0.05)
By comparing Table 3 and Table 4, we can see that the two methods produced similar significant factors for development effort, although the model based on the AIC statistic overestimated two additional variables (OO and Missing) as significant. Considering that
AIC tends to overestimate the number of parameters when the sample size is large, we accept the second model as the most appropriate for our study. A summary of the regression results is shown in Table 5.

Regression Term          Coefficient   Standard Error   P-Value
Intercept                4.24          0.30             <10^-15
Log(Size)                0.56          0.03             <10^-15
Log(Average Team Size)   0.68          0.04             <10^-15
3GL Language             -0.40         0.27             0.136
4GL Language             -0.85         0.27             0.002
ApG Language             -0.71         0.29             0.014
Midrange Platform        -0.12         0.08             0.116
Multi-Platform           -0.15         0.17             0.379
PC Platform              -0.46         0.08             3.3*10^-8
New Development Type     0.29          0.07             1.6*10^-5
Re-development Type      0.56          0.15             2.4*10^-4
RAD                      -0.23         0.11             0.027
Other                    -0.27         0.09             0.005

Table 5. Summary of the Regression results
It’s important to point that the default Language Type is 2GL, the default Development Platform is Mainframe, and the default Development Type is Enhancement. According to Table 5, the model is fitted as (the variable ‘Other’ is not useful and not included): log(Effort ) = 4.24 + 0.56 * log(Size ) + 0.68 * log(TeamSize) + α i ϕ (Languagei ) + β j ϕ ( Platform j ) + γ k ϕ (DevTypek )
(3)
−0.23ϕ ( RAD)
i=1, 2, 3, 4; j=1, 2, 3, 4; k=1, 2, 3 Here the function Ф is the indicator function with binary values of 1 or 0. A value of 1 means the relevant development technique in the parentheses is used, otherwise the value is 0. So the default techniques used are: 2GL for language type (α1=0), Mainframe for development platform (β1=0), and Enhancement for development type (γ1=0). The coefficients αi, βj, and γk can be obtained from Table5. By using the obtained coefficient, we assign a value to each variable in our database and these values are corresponding to these coefficients which are shown in Table5. Our purpose was to apply ANFIS to prepared ISBSG database. Before using ANFIS, we need to have an initial FIS (Fuzzy Inference System) that determines the number of rules and initial parameters, etc. This can be done in three different ways: by using five of the GUI tools, by using Genfis1 that generates grid partition of the input space, and by using Genfis2 that employs subtractive clustering. In other words, if we have a human expert, we can use GUI tools to convert human expertise into rough correctly fuzzy rules, which are then finetuned by ANFIS. If we don’t have human experts, then we have to use some heuristics embedded in Genfis1 or Genfis2 to find the initial FIS and then go through the same ANFIS
Our purpose was to apply ANFIS to the prepared ISBSG database. Before using ANFIS, we need an initial FIS (Fuzzy Inference System) that determines the number of rules, the initial parameters, etc. This can be done in three different ways: by using five of the GUI tools, by using Genfis1, which generates a grid partition of the input space, or by using Genfis2, which employs subtractive clustering. In other words, if we have a human expert, we can use the GUI tools to convert human expertise into roughly correct fuzzy rules, which are then fine-tuned by ANFIS. If we do not have human experts, then we have to use the heuristics embedded in Genfis1 or Genfis2 to find the initial FIS and then go through the same ANFIS tuning stage. The question is which of Genfis1 or Genfis2 should be used to generate the FIS matrix for ANFIS; the answer is: when you have fewer than six inputs and a large training data set, use Genfis1, and otherwise use Genfis2. Genfis1 uses grid partitioning and generates rules by enumerating all possible combinations of the membership functions of all inputs; this leads to an exponential explosion even when the number of inputs is moderately large. For instance, for a fuzzy inference system with 10 inputs, each with two membership functions, grid partitioning leads to 2^10 = 1024 rules, which is prohibitively large for any practical learning method. The "curse of dimensionality" refers to this situation, where the number of fuzzy rules increases exponentially with the number of input variables when grid partitioning is used. Genfis1 and Genfis2 differ in two aspects. First, Genfis1 produces a grid partitioning of the input space and is thus more likely to suffer from the "curse of dimensionality" described above, while Genfis2 uses subtractive clustering (SUBCLUST) to produce a scatter partition. Secondly, Genfis1 produces a fuzzy inference system where each rule has zero coefficients in its output equation, while Genfis2 applies the backslash ("\") command in MATLAB to identify the coefficients. Therefore, the fuzzy inference system generated by Genfis1 always needs subsequent optimization by the ANFIS command, while the one generated by Genfis2 can sometimes already have a good input-output mapping precision. In any case, since we have six inputs, Genfis2 followed by ANFIS is used for our implementation. We also divided our inputs into two categories and then used Genfis1 for implementation, because we want to compare our results with the fuzzy model, which is impossible to implement with six inputs because of its exponential number of rules. Another way of preparing a FIS for ANFIS is Genfis3; its difference from Genfis2 is that Genfis3 uses fuzzy c-means clustering to cluster the input data, and since the results are almost the same, we have arbitrarily used Genfis2. For the implementation with two inputs, as noted, we divide our six inputs into two categories:

1. Inputs with a ratio scale, such as Log(Size) and Log(Average Team Size), combined as:
Input1 = 4.24 + 0.56 log(Size) + 0.68 log(Average Team Size)
2. Inputs with a nominal scale, such as Language Type, Development Platform, Development Type and RAD, combined as:
Input2 = α_i φ(Language_i) + β_j φ(Platform_j) + γ_k φ(DevType_k) − 0.23 φ(RAD)

The ANFIS structures with six and two inputs are shown in Figure 3 and Figure 4, respectively.
Fig. 3. ANFIS Structure for Six Inputs
Fig. 4. Two-input Type III ANFIS with 9 Rules
These inputs and structures are for estimating the effort of software projects; for the elapsed time of a software project, studies show that the two inputs log(Effort) and log(Average Team Size) are sufficient. Using Genfis1, the fuzzy subspaces of the ANFIS structure are as shown in Figure 5.
Fig. 5. Corresponding fuzzy subspaces with two inputs for time estimation

ANFIS uses a hybrid learning algorithm to identify the parameters of Sugeno-type fuzzy inference systems. It applies a combination of the least-squares method and the back-propagation gradient descent method to train the FIS membership function parameters to emulate a given training data set. More specifically, in the forward pass of the hybrid learning algorithm, functional signals go forward up to layer 4 and the consequent parameters are identified by the least-squares estimate. In the backward pass, the error rates propagate backward and the premise parameters are updated by gradient descent. The hybrid learning rule speeds up the learning process and has a lower error than the gradient descent method alone. Table 6 summarizes the activities in each pass.

                        Forward Pass             Backward Pass
Premise parameters      Fixed                    Gradient descent
Consequent parameters   Least-squares estimate   Fixed
Signals                 Node outputs             Error rates

Table 6. Two passes in the hybrid learning procedure for ANFIS
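A conceptual sketch of the forward pass for a first-order Sugeno model follows: with Gaussian premise parameters held fixed, the linear consequent parameters are obtained in one least-squares solve. This is our simplification (the backward gradient step on the premise parameters is omitted), not the MATLAB ANFIS code itself:

```python
import numpy as np

def gauss(x, c, s):
    """Gaussian membership value."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

def forward_pass_lse(X, y, centers, sigmas):
    """Forward pass of the hybrid rule: premise (MF) parameters fixed,
    linear consequent parameters identified by least squares."""
    n, d = X.shape
    R = len(centers)
    w = np.ones((n, R))
    for r in range(R):
        for j in range(d):                                   # rule firing strength =
            w[:, r] *= gauss(X[:, j], centers[r][j], sigmas[r][j])  # product of MF values
    wn = w / w.sum(axis=1, keepdims=True)                    # layer 3: normalized strengths
    Xb = np.hstack([X, np.ones((n, 1))])                     # [x1..xd, 1] for linear consequents
    Phi = np.hstack([wn[:, [r]] * Xb for r in range(R)])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)          # consequent parameters
    return theta, Phi @ theta                                # parameters and fitted outputs

# Usage sketch with hypothetical data and two rules; in the backward pass
# (not shown) the premise centers/sigmas would be updated by gradient descent.
rng = np.random.default_rng(0)
X = rng.random((50, 2)); y = 2.0 * X[:, 0] + X[:, 1]
centers = [np.array([0.2, 0.2]), np.array([0.8, 0.8])]
sigmas = [np.array([0.3, 0.3]), np.array([0.3, 0.3])]
theta, y_hat = forward_pass_lse(X, y, centers, sigmas)
```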
Figure 6 shows the membership functions of the FIS for time estimation, and Figure 7 shows an output of ANFIS for time estimation.
Fig. 6. Membership functions of the FIS for Time Estimation after ANFIS training
Fig. 7. ANFIS output for Time Estimation
3. Experimental Results

3.1 Evaluation Criteria

We employ the following criteria to assess and compare the performance of effort estimation models. A common criterion for the evaluation of effort estimation models is the relative error (RE) or the magnitude of relative error (MRE), which is defined as [Huang et al., 2007]:
RE_i = (Actual Effort_i − Predicted Effort_i) / Actual Effort_i    (4)

MRE_i = |Actual Effort_i − Predicted Effort_i| / Actual Effort_i    (5)
The RE and MRE values are calculated for each project i whose effort is predicted. For N multiple projects, we can also use the mean magnitude of relative error (MMRE) [Huang et al., 2007]:
MMRE = (1/N) Σ_{i=1}^{N} MRE_i = (1/N) Σ_{i=1}^{N} |Actual Effort_i − Predicted Effort_i| / Actual Effort_i    (6)
Intuitively, MER seems preferable to MRE, since MER measures the error relative to the estimate; we use it here. The MER is defined as follows [Lopez-Martin et al., 2008]:
MER_i = |Actual Effort_i − Predicted Effort_i| / Predicted Effort_i    (7)
The MER value is calculated for each observation i whose effort is predicted. The aggregation of MER over multiple observations (N) can be achieved through the mean MER (MMER) as follows [Lopez-Martin et al., 2008]:
MMER = (1/N) Σ_{i=1}^{N} MER_i    (8)
Another criterion that is commonly used is the prediction at level p:
Pred(p) = k / N    (9)
where k is the number of projects whose MRE is less than or equal to p; here we used Pred(20). In general, the accuracy of an estimation technique is proportional to Pred(p) and inversely proportional to MMRE and MMER. We use all of these criteria for the evaluation of the estimation techniques. A further criterion is the coefficient of determination (R2), which is used to assess the quality of the estimation models. The coefficient R2 is calculated by [Gu et al., 2006]:
409 − 2
n
R2 =
∑(y − y ) i =1 n
^
( 10 )
_
∑ ( y − y )2 i =1
Here, \bar{y} expresses the mean value of the random variables. Obviously, the coefficient R2 describes the percentage of variability explained by the model; its value lies between 0 and 1, and an R2 close to 1 indicates that the model can explain the variability in the response to the predictive variable, i.e. there is a strong relationship between the independent and dependent variables.
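For illustration, the evaluation criteria of Eqs. (4)-(10) can be computed with the short Python sketch below; the helper name and dictionary layout are our own, not part of the chapter's MATLAB tooling.

```python
import numpy as np

def evaluation_criteria(actual, predicted, p=0.20):
    """MMRE, MMER, Pred(p) and R2 as defined in Eqs. (4)-(10)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mre = np.abs(actual - predicted) / actual        # Eq. (5)
    mer = np.abs(actual - predicted) / predicted     # Eq. (7)
    pred_p = np.mean(mre <= p)                       # Eq. (9): k / N
    r2 = (np.sum((predicted - actual.mean()) ** 2)   # Eq. (10)
          / np.sum((actual - actual.mean()) ** 2))
    return {"MMRE": mre.mean(),                      # Eq. (6)
            "MMER": mer.mean(),                      # Eq. (8)
            "Pred(%d)" % round(p * 100): pred_p,
            "R2": r2}
```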
3.2 Implementation Results
A software tool (MATLAB 7.6) was used to simulate the fuzzy logic system, the neural network model and the neuro-fuzzy model. Three categories of results are given below.
First category: effort estimation with the two inputs discussed above. The results are gathered in Table 7, which shows that the neuro-fuzzy model estimates 96% of the data with less than 20% error. As shown in Figure 8, just four data points had more than 25% error and most had less than 7% error. Since we have only two inputs, we implement ANFIS with Genfis1.

Estimation Models            Pred(20)   Average Error   MMRE   MMER
Neuro-Fuzzy Model            0.96       0.40            0.05   0.05
Fuzzy Logic Model            0.89       0.91            0.12   0.13
Neural Network Model         0.88       0.39            0.04   0.07
Multiple Regression Model    0.78       0.90            0.12   0.14

Table 7. Implementation Results for Effort Estimation with two inputs

Fig. 8. MMRE Results for ANFIS with two inputs for Effort Estimation
Second category: owing to the larger number of inputs, we implement ANFIS with Genfis2 here. As mentioned before, the fuzzy model is impossible to implement in this category due to the large number of inputs, so that row is empty. Here the neuro-fuzzy model again gave the best results; they are shown in Table 8 and demonstrated in Figure 9.
Estimation Models            Pred(20)   Average Error   MMRE   MMER
Neuro-Fuzzy Model            0.95       0.38            0.05   0.05
Fuzzy Model                  impossible to implement due to the very large rule set
Neural Network Model         0.89       0.73            0.11   0.11
Multiple Regression Model    0.95       0.50            0.07   0.07
Statistical Model            0.94       0.51            0.08   0.07

Table 8. Implementation Results for Effort Estimation with six inputs

Fig. 9. MMRE Results for ANFIS with six inputs for Effort Estimation
As shown in Figure 9, most of the estimations had less than 5% error, which emphasizes that the performance of this model is better than the others.
Third category: time estimation with two inputs, log(Effort) and log(Average Team Size). The obtained results are organized in Table 9.
Estimation Models            Pred(20)   Average Error   MMRE     MMER
Neuro-Fuzzy Model            0.5103     0.4161          0.2594   0.2456
Fuzzy Logic Model            0.3913     0.5561          0.3435   0.3291
Neural Network Model         0.4119     0.5295          0.3266   0.3032
Multiple Regression Model    0.5149     0.4225          38.02    0.2640

Table 9. Implementation Results for Time Estimation
Figure 10 demonstrates that most of the results had less than 3% error, which indicates that this model is very accurate for prediction.
Fig. 10. MMRE Results of ANFIS for Time Estimation

The value of the coefficient of determination (R2) for ANFIS is equal to 0.9828, which indicates that more than 98% of the variance in the dependent variable can be explained by this model; the model is therefore reliable. The comparison plots of these models for time and effort estimation are shown in Figures 11 and 12 respectively.
Fig. 11. Comparison plot for Time Estimation
Fig. 12. Comparison plot for Effort Estimation
4. Conclusions and future works

As software development has become an essential investment for many organizations, software estimation is gaining ever-increasing importance in effective software project management, quality management, planning, and budgeting. The primary purpose of this study was to propose a precise method of estimation that takes account of, and places emphasis on, the various software development elements. We compared this neuro-fuzzy based software development estimation model with four other models: neural network models, fuzzy logic models, multiple regression models, and statistical models. The main benefit of this model is its good interpretability through the fuzzy rules. Another great advantage of this research is that it puts together expert knowledge (fuzzy rules), project data and the traditional algorithmic model into one general framework that may have a wide range of applicability in software effort and time estimation. Recent research has tended to focus on the use of function points (FPs) in estimating software development effort, and FPA (Function Point Analysis) assumes that the FP is the only factor which influences software development effort. A precise estimation, however, should not only consider the FPs, which represent the size of the software, but should also include various elements of the development environment. The factors significant to software development effort are project size, average number of developers that worked on the development, type of development, development language,
development platform, and the use of rapid application development, although FP as a software size metric remains an important topic in the software prediction domain. As a result of the comparison, the effort and time estimation model based on neuro-fuzzy techniques showed superior predictability over the other models mentioned in this study. This study worked on the latest release of the ISBSG data repository, a very large database recording 4106 software projects developed worldwide. For the comparison of software development techniques we used three evaluation criteria: MMRE (Mean Magnitude of Relative Error), MMER and Pred(20). The proposed model has a 98% coefficient of determination (R2), which underlines the strong performance of our proposed approach. Some limitations in this domain are the following. Estimation of time and effort in the earlier phases of software development is very difficult, and it depends on lower-level estimates such as size estimation, which is done using External Inputs (EI), External Outputs (EO), External Queries (EQ), Internal Logical Files (ILF), and External Interface Files (EIF). Many existing research papers have proposed various effort estimation techniques, and there is still no agreement on which technique is best across different cases. Also, our model has no dynamic learning algorithm to adapt itself to new situations and update the database at each estimation. By adding process maturity as an input factor in effort estimation models, the accuracy of the estimation models can be improved. These limitations give us motivation to continue this research in future work.
5. References

A. Abraham, (2005), "Adaptation of Fuzzy Inference System Using Neural Learning", Springer-Verlag, Berlin Heidelberg.
K. K. Aggarwal, Y. Singh, P. Chandra, M. Puri, (2005), "Sensitivity Analysis of Fuzzy and Neural Network Models", ACM SIGSOFT Software Engineering Notes, Vol. 30, Issue 4, pp. 1-4.
A. R. Gray, S. G. MacDonell, (1999), "Fuzzy Logic for Software Metric Models throughout the Development Life-Cycle", Proceedings of the 18th International Conference of the North American Fuzzy Information Processing Society (NAFIPS), New York, USA, pp. 258-262.
A. R. Gray, S. G. MacDonell, (1997), "Applications of Fuzzy Logic to Software Metric Models for Development Effort Estimation", Fuzzy Information Processing Society 1997 NAFIPS'97, Annual Meeting of the North American Fuzzy Information Processing Society, pp. 394-399.
X. Gu, G. Song, L. Xiao, (2006), "Design of a Fuzzy Decision-making Model and Its Application to Software Functional Size Measurement", IEEE International Conference on Computational Intelligence for Modelling Control and Automation, and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06).
A. Heiat, (2002), "Comparison of artificial neural network and regression models for estimating software development effort", Information and Software Technology, pp. 911-922.
X. Huang, D. Ho, J. Ren, L. F. Capretz, (2007), "Improving the COCOMO model using a neuro-fuzzy approach", Applied Soft Computing, Vol. 7, Issue 1, pp. 29-40.
W. S. Humphrey, (2002), "A Discipline for Software Engineering", Addison Wesley.
A. Idri, A. Abran, (2001), "A Fuzzy Logic Based Set of Measures for Software Project Similarity: Validation and Possible Improvements", Proceedings of the Seventh International Software Metrics Symposium (METRICS '01), pp. 85-96.
J. Jantzen, (1998), "Neuro-fuzzy modeling", Report no 98-H-874.
C. Lopez-Martin, C. Yanez-Marquez, A. Gutierrez-Tornes, (2008), "Predictive accuracy comparison of fuzzy models for software development effort of small programs", The Journal of Systems and Software, Vol. 81, Issue 6, pp. 949-960.
S. G. MacDonell, A. R. Gray, J. M. Calvert, (1999), "FULSOME: Fuzzy Logic for Software Metric Practitioners and Researchers", Proceedings of the 6th International Conference on Neural Information Processing ICONIP'99, ANZIIS'99, ANNES'99 and ACNN'99, Perth, Western Australia, IEEE Computer Society Press, pp. 308-313.
M. C. Mackey, L. Glass, (1977), "Oscillation and Chaos in Physiological Control Systems", Science, Vol. 197, pp. 287-289.
A. A. Moataz, O. S. Moshood, A. Jarallah, (2005), "Adaptive fuzzy-logic-based framework for software development effort prediction", Information and Software Technology, Vol. 47, Issue 1, pp. 31-48.
NESMA, (1996), NESMA FPA Counting Practices Manual 2.0, NESMA Association.
M. O. Saliu, M. Ahmed, J. AlGhamdi, (2004), "Towards adaptive soft computing based software effort prediction", Fuzzy Information Processing NAFIPS '04, IEEE Annual Meeting of the North American Fuzzy Information Processing Society, Vol. 1, pp. 16-21.
M. T. Su, T. C. Ling, K. K. Phang, C. S. Liew, P. Y. Man, (2007), "Enhanced Software Development Effort and Cost Estimation Using Fuzzy Logic Model", Malaysian Journal of Computer Science, Vol. 20, No. 2, pp. 199-207.
W. N. Venables, B. D. Ripley, (2002), "Modern Applied Statistics with S", New York: Springer.
W. Xia, L. F. Capretz, D. Ho, F. Ahmed, (2007), "A new calibration for Function Point complexity weights", Information and Software Technology, Vol. 50, Issues 7-8, pp. 670-683.
Z. Jiang, C. Comstock, (2007), "The Factors Significant to Software Development Productivity", Proceedings of World Academy of Science, Engineering and Technology, Vol. 21, ISSN 1307-6884.
Z. Jiang, P. Naudé, (2007), "An Examination of the Factors Influencing Software Development Effort", International Journal of Computer, Information, and Systems Science, and Engineering, Vol. 1, No. 3, pp. 182-191.
23
Posture and Gesture Recognition for Human-Computer Interaction

Mahmoud Elmezain, Ayoub Al-Hamadi, Omer Rashid and Bernd Michaelis
Otto-von-Guericke-University Magdeburg, Institute for Electronics, Signal Processing and Communications (IESK), Germany

1. Introduction
While automatic hand posture and gesture recognition technologies have been successfully applied to real-world applications, several problems still need to be solved before Human-Computer Interaction (HCI) can be applied more widely. One such problem, which arises in real-time hand gesture recognition, is to extract (spot) meaningful gestures from a continuous sequence of hand motions. Another problem is caused by the fact that the same gesture varies in shape, trajectory and duration, even for the same person. A gesture is a spatio-temporal pattern which may be static, dynamic or both; static morphs of the hands are called postures and hand movements are called gestures. The goal of gesture interpretation is to advance human-computer communication and to bring the performance of HCI close to human-human interaction. Sign language recognition is an application area for HCI, enabling communication with computers and the detection of sign language symbols. Sign language is categorized into three main groups, namely finger spelling, word-level signs and non-manual features (Bowden et al., 2003). Finger spelling is used to convey words letter by letter; the major communication is done through the word-level sign vocabulary; and non-manual features include facial expressions, mouth and body position. The techniques for posture recognition with sign languages are reviewed for finger spelling to understand the research issues. The motivation behind this review is to develop a recognition system which works more robustly with high recognition rates. Practically, hand segmentation and the computation of good features are important for recognition. In the recognition of sign languages, different models are used to classify the alphabets and numbers. For example, in (Hussain, 1999), an Adaptive Neuro-Fuzzy Inference System (ANFIS) model is used for the recognition of Arabic Sign Language; in this technique, colored gloves are used to avoid the segmentation problem, which helps the system obtain good features. Handouyahia et al. (Handouyahia et al., 1999) present a recognition system for the International Sign Language (ISL). They use a Neural Network (NN) to train the alphabets; an NN is used for recognition because it can easily learn and train from the features computed for the sign languages. Another approach
includes the Elliptic Fourier Descriptor (EFD) used by Malassiotis and Strintzis (Malassiotis & Strintzis, 2008) for 3D hand posture recognition. In their system, they use orientation and silhouettes of the hand to recognize 3D hand postures. Similarly, Licsar and Sziranyi (Licsar & Sziranyi, 2002) use Fourier coefficients to represent the hand shape, which enables them to analyze hand gestures for recognition. Freeman and Roth (Freeman & Roth, 1994) use orientation histograms for the classification of gesture symbols, but huge training data is required to solve the orientation problem and to avoid misclassification between symbols. In the last decade, several methods with potential applications (Deyou, 2006; Elmezain et al., 2008a; Kim et al., 2007; Mitra & Acharya, 2007; Yang et al., 2007) in advanced hand gesture interfaces for HCI have been suggested, but these differ from one another in their models. Some of these models are Neural Networks (Deyou, 2006), Hidden Markov Models (HMMs) (Elmezain et al., 2008a; Elmezain et al., 2008b) and Dynamic Time Warping (DTW) (Takahashi et al., 1992). In 1999, Lee et al. (Lee & Kim, 1999) proposed an ergodic model based on an adaptive threshold to spot the start and end points of input patterns, and also to classify meaningful gestures by combining all states from all trained gesture models using HMMs. Kang et al. (Kang et al., 2004) developed a method to spot and recognize meaningful movements while concurrently separating unintentional movements from a given image sequence. Alon et al. (Alon et al., 2005) proposed a new gesture spotting and recognition algorithm using a pruning method that allows the system to evaluate a relatively small number of hypotheses compared to Continuous Dynamic Programming (CDP). Yang et al. (Yang et al., 2007) presented a method for the recognition of whole-body key gestures in Human-Robot Interaction (HRI) using HMMs and a garbage model for non-gesture patterns. Mostly, previous approaches use the backward spotting technique, which first detects the end point of a gesture by comparing the probability of the maximal gesture model and the non-gesture model; they then track back to discover the start point of the gesture through its optimal path, and the segmented gesture is sent to the recognizer. There is therefore an inevitable time delay between meaningful gesture spotting and recognition, and this delay is unsuitable for on-line applications. Above all, few researchers have addressed the problems of non-sign patterns (which include out-of-vocabulary signs, epentheses, and other movements that do not correspond to signs) for sign language spotting, because it is difficult to model non-sign patterns (Lee & Kim, 1999). The main contribution of this chapter is to explore two parts; the first part is related to hand posture and the second part deals with hand gesture spotting. In posture recognition, an approach is proposed for the recognition of ASL alphabets and numbers which is able to deal with a large number of hand shapes against complex backgrounds and lighting conditions. This approach is based on Hu moments, whose features are invariant to translation, rotation and scaling; geometric features are also incorporated. These feature vectors are used to train a Support Vector Machine (SVM), and a recognition process identifies the hand posture from the SVM of segmented hands.
For hand gesture spotting, a robust technique is proposed that executes gesture spotting and recognition simultaneously. The technique recognizes isolated and meaningful hand gestures in stereo color image sequences using HMMs. In addition, color and a 3D depth map are used to detect the hands, and the hand trajectory is obtained in a further step using a robust stereo tracking algorithm to generate 3D dynamic features. This part covers the procedures to design a sophisticated method for
a non-gesture model, which provides a confidence limit for the likelihood calculated by the other gesture models. Furthermore, the confidence measures are used as an adaptive threshold for selecting the proper gesture model or for spotting meaningful gestures. The proposed techniques can automatically recognize postures as well as isolated and meaningful hand gestures with superior performance and low computational complexity when applied to several video samples containing confusing situations such as partial occlusion and overlapping. The rest of the chapter is organized as follows. We formulate Hidden Markov Models in Section 2 and Support Vector Machines in Section 3. Section 4 discusses the posture and gesture approach in three subsections. Experimental results are given in Section 5, where we test image and video sequences for hand posture and gesture spotting respectively. Finally, Section 6 ends with a summary and conclusion.
2. Hidden Markov Models

A Markov model is a mathematical model of a stochastic process, which generates random sequences of outcomes according to certain probabilities (Elmezain et al., 2007; Rabiner, 1989). Here, the stochastic process is a sequence of feature extraction codewords, the outcome being the classification of the hand gesture path. In compact form, a discrete HMM can be written as λ = (A, B, Π) and is described as follows:
- The set of states S = {s1, s2, ..., sN}, where N represents the number of states.
- The set of observation symbols V = {v1, v2, ..., vM}, where M is the number of distinct symbols observable in each state.
- An initial probability for each state, Πi, i = 1, 2, ..., N, such that:

\Pi_i = P(q_1 = s_i), \quad 1 \le i \le N, \qquad \sum_{i=1}^{N} \Pi_i = 1   (1)

- An N-by-N transition matrix A = {aij}, where aij is the probability of taking a transition from state i to state j at moment t:

a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i), \quad 1 \le i, j \le N, \quad a_{ij} \ge 0, \quad \sum_{j=1}^{N} a_{ij} = 1   (2)

- An N-by-M observation symbol matrix B = {bim}, where bim gives the probability of emitting symbol vm in state i:

b_{im} = P(v_m \mid s_i), \quad 1 \le i \le N, \quad 1 \le m \le M, \quad b_{im} \ge 0, \quad \sum_{m=1}^{M} b_{im} = 1   (3)
The set of possible emissions (observations) is O = {o1, o2, ..., oT}, where T is the length of the gesture path. The statistical strategy based on HMMs has many advantages, among them a rich mathematical framework, powerful learning and decoding methods, good sequence-handling capabilities, and a flexible topology for statistical phonology and syntax. The disadvantages lie in the poor discrimination between models and in the unrealistic
assumptions that must be made to construct the HMM theory, namely the independence of successive feature frames (input vectors) and the first-order Markov process (Goronzy, 2002). The algorithms developed in the statistical framework to use HMMs are rich and powerful, which explains why hidden Markov models are today the most widely used technique in practice to implement gesture recognition and understanding systems. The main problems that can be solved with HMMs are:
1. Given the observation sequence O = (o1, o2, ..., oT) and a model λ = (A, B, Π), how do we efficiently compute P(O|λ), the probability of the observation sequence given the model? This is the "evaluation problem"; the forward and backward procedure provides a solution (see the sketch below).
2. Given the observation sequence O = (o1, o2, ..., oT) and the model λ, how do we choose a corresponding state sequence S = (s1, s2, ..., sT) that is optimal in some sense (i.e. best explains the observations)? The Viterbi algorithm provides a solution to find the optimal path.
3. How do we adjust the model parameters λ = (A, B, Π) to maximize P(O|λ)? This is by far the most difficult problem of HMMs. We choose λ = (A, B, Π) in such a way that its likelihood P(O|λ) is locally maximized, using an iterative procedure like the Baum-Welch method (Rabiner, 1989).
HMMs also have three topologies. The first is the Ergodic model (fully connected model), in which every state of the model can be reached in a finite number of steps from every other state (Figure 1(a)). Other types of HMMs have been found to account for observed properties of the signal being modelled better than the standard Ergodic model. One such model, shown in Figure 8, is called Left-Right Banded (LRB) because the underlying state sequence has the property that, as time increases, the state index increases or stays the same (i.e. no transitions are allowed to states whose indices are lower than the current state); each state in the LRB model can go back to itself or to the next state only. The last topology is called Left-Right (or the Bakis model), in which each state can go back to itself or to the following states. It should be clear that imposing the constraints of the LRB and Bakis models has essentially no effect on the re-estimation procedure, because any HMM parameter set to zero initially will remain at zero throughout the re-estimation procedure.
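As a concrete illustration of the evaluation problem (problem 1 above), the following short Python sketch computes P(O|λ) with the forward procedure. It is our own illustration with assumed array conventions: A is N-by-N, B is N-by-M, pi has length N, and obs is a sequence of symbol indices.

```python
import numpy as np

def forward_probability(A, B, pi, obs):
    """Evaluation problem: P(O | lambda) by the forward procedure."""
    alpha = pi * B[:, obs[0]]          # initialisation with the first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # induction over the remaining symbols
    return alpha.sum()                 # termination
```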
Fig. 1. (a) Ergodic topology with 4 states. (b) Simplified Ergodic with fewer transitions
3. Support Vector Machines

The Support Vector Machine is a supervised learner for the optimal modelling of data (Lin & Weng, 2004). It learns the decision function and separates the data classes with maximum width. Basically, SVM works on two classes, i.e. binary classification, and is also extendable to multiclass problems. In the literature, there are two types of this extension. The all-together approach deals with a single optimization problem; it lacks scalability and also faces optimization complexity. The second approach works in a binary fashion with multiple hyper-planes combined into a single classifier; there are two alternatives for this combination, the first based on one-against-all, whereas the other works one-against-one. Binary classification with SVM rests on the following principle:

h(x) = \operatorname{sgn}(f(x))   (4)

The SVM's linearly learned decision function f(x) is described as:

f(x) = \langle \omega \cdot x \rangle + b   (5)
where ω is a weight vector, b is the threshold and x is the input sample. The SVM learner defines hyper-planes for the data, and the maximum margin is found between these hyper-planes; because of this maximum separation, SVM is also considered a margin classifier. The margin of the hyper-plane is the minimum distance between the hyper-plane and the support vectors, and this margin is maximised. It can be formulated as:

\rho = \frac{2}{\| \omega \|}   (6)

where ρ is the margin of the hyper-plane. Maximisation of the margin of the hyper-plane is depicted in Figure 2.
Fig. 2. Margin of the hyper-plane
SVM maps the input data into a high-dimensional domain where it is most likely to be linearly separable, as shown in Figure 3. This mapping does not affect the training time because of the implicit dot product and the kernel trick (Cristianini & Taylor, 2001; Suykens et al., 2005). This is also a reason why SVM is a well-suited classifier when features are large in number, since it is robust against the curse of dimensionality. The kernel function is the computation of the inner product \langle \phi(x) \cdot \phi(z) \rangle directly from the input. One of the characteristics of using the kernel is that there is no need to explicitly represent the mapped feature space. The kernel function is mathematically described as follows:

K(x, z) = \langle \phi(x) \cdot \phi(z) \rangle   (7)

The following are some of the kernel functions commonly used to convert the input features into the new feature space:

Linear kernel:        K(x, z) = \langle x \cdot z \rangle   (8)
RBF Gaussian kernel:  K(x, z) = \exp\!\left( - \| x - z \|^2 / 2\sigma^2 \right)   (9)
Polynomial kernel:    K(x, z) = \left( \langle x \cdot z \rangle + c \right)^d   (10)
Sigmoid kernel:       K(x, z) = \tanh\!\left( \kappa \langle x \cdot z \rangle + c \right)   (11)

where κ is a scaling factor and c is a shifting factor that controls the mapping. As discussed above, SVM outputs only the class labels for the input sample, not the probability information for the classes. Lin et al. describe a method to compute the class probabilities using SVM. Chang et al. developed the library "LIBSVM", which provides tools for the SVM functionalities including class probability estimation (Chang & Lin, 2001).
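For illustration, the kernels of Eqs. (8)-(11) can be written directly in Python as below. This is a sketch; the parameter names sigma, c, d and kappa follow the reconstructed formulas above, not any particular library.

```python
import numpy as np

def linear(x, z):                      # Eq. (8)
    return x @ z

def rbf(x, z, sigma=1.0):              # Eq. (9)
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def polynomial(x, z, c=1.0, d=3):      # Eq. (10)
    return (x @ z + c) ** d

def sigmoid(x, z, kappa=1.0, c=0.0):   # Eq. (11)
    return np.tanh(kappa * (x @ z) + c)
```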
Fig. 3. Mapping from input data to a richer feature space through a kernel function

SVM has been studied extensively and is used in a large problem domain, including novelty detection and regression optimization along with learning and classification. It has a basic
architecture which can be modified depending upon the problem domain using the margin, kernel type and duality characteristics. SVM avoids several problems that other learners face, such as the problem of local minima. It not only distinguishes between classes but also learns to separate them optimally. On the other hand, the performance of SVM declines with non-scaled data, and the multi-class solution is still a work in progress (Burges, 1998).
4. Posture and Gesture Approach

In this chapter, an approach is developed for the recognition of hand postures and gestures. Besides, improvements are made to the existing gesture recognition system provided by the IESK (Magdeburg University, Germany), whose purpose is to recognize the alphabet characters (A-Z) and numbers (0-9). The proposed approach is based on the analysis of stereo color image sequences with the support of 3D depth information. A Gaussian distribution detects the skin pixels in the image sequences, and the depth information is used to help the Gaussian distribution build the region of interest and overcome the difficulties of overlapping regions. A framework is established to extract the posture and gesture features by the combination of various image processing techniques (Figure 4).
Fig. 4. Simplified structure showing the main modules for the posture and gesture approach The computed statistical and geometrical features for the hand movements are invariant to scale, rotation and translation. These features are used for the classification of posture symbols. The classification step is divided into two steps. The first step develops the classes for some set of alphabets for hand posture. In particular, the curvature analysis determines the peaks of the hand (i.e. fingertips) which helps in the reduction of computation and to avoid the classes that are not mandatory to test for that specific posture symbol. The misclassification is also reduced due to this grouping which helps in the recognition of correct symbol. Furthermore, SVM is applied on the respective set of classes to train and test the symbols. In the second step, the hand trajectory will take place in further step using
Mean-shift algorithm and Kalman filter (Comaniciu et al., 2003) to generate 3D dynamic features for hand gesture. Furthermore, k-means clustering algorithm (Ding & He, 2004) is employed for the HMMs codewords. To spot meaningful gestures (i.e. Arabic numbers from 0 to 9) accurately, a non-gesture model is proposed, which provides a confidence limit for the calculated likelihood by other gesture models. The confidence measures are used as an adaptive threshold for spotting meaningful gestures. 4.1 Depth Map Image acquisition step contains 2D image sequences and depth image sequences. For the skin segmentation of hands and face in stereo color image sequences an algorithm is used, which calculates the depth value in addition to skin color information The depth information can be gathered by passive stereo measuring based on cross correlation and the known calibration data of the cameras. Several clusters are composed of the resulting 3D-points. The clustering algorithm can be considered as kind of region growing in 3D which used two criteria; skin color and Euclidean distance (Scott, 19992; Niese et all., 2007). Furthermore, this method is more robust to the disadvantageous lighting and partial occlusion, which occur in real time environment (for instance, in case of gesture recognition). The classification of the skin pixels is improved from Figure 5 by exploiting the depth information which contains the depth value associated with 2D image pixel. In the proposed approach, the depth image is used to select the region of interest in the image and it lies in the range from minimum depth 30cm to maximum depth 200cm. However, the depth range is adaptive and can be changed. From the depth information, not only the search of object of interest is narrowed down but also the processing speed is increased. The region of interest helps to remove the computed skin pixels other than this region. Figure 6 (a)&(b) shows the normalized 2D and 3D depth image ranges up to 10m. The normalization depth images are presented for visualization in the range from 0 to 255. Figure 6 (c)&(d) shows the normalized 2D and 3D depth range of interest (i.e. range from 30cm to 200cm). It should be noted that the region of interest should include the hands and face. The improved results by using the depth information are shown in Figure 5(c).
(a) (b) (c) Fig. 5. (a) Original 2D image (b) Skin pixel detection without using depth map (c) Yellow color shows the detection of skin pixels in the image after applying depth information By the given 3D depth map from camera set-up system, the overlapping problem between hands and face is solved since the hand regions are closer to the camera rather than the face region (Figure 12 & 13).
Fig. 6. (a)&(b) shows the normalized 2D and 3D depth image respectively; (c)&(d) shows the normalized 2D and 3D depth image for the region of interest (30cm to 200cm). F refers to the face, HL and HR represent the left and right hand respectively

4.2 Feature Extraction
There is no doubt that selecting good features to recognize the hand posture and gesture path plays a significant role in system performance. We therefore describe the posture and gesture features in some detail below.

4.2.1 Posture Features
In the proposed approach, statistical and geometrical features are computed for the hand postures. These are described below.

Statistical Feature Vectors
Hu moments (Hu, 1962) are used as statistical feature vectors and are derived from basic moments. More specifically, moments are used to describe the properties of an object's shape statistically. In image analysis, moments are computed over a binarized or grey-level image viewed as a 2D density distribution function. In this manner, an image segment is categorized with the help of moments. The properties extracted from the moments are area, mean, variance, covariance and skewness.

Central Moments
If f(x, y) is a digital image of dimension M-by-N, the central moment of order (p+q) is defined as:
\mu_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - \bar{x})^p \, (y - \bar{y})^q \, f(x, y), \qquad \bar{x} = \frac{m_{10}}{m_{00}}, \quad \bar{y} = \frac{m_{01}}{m_{00}}   (12)
where m00 gives the area of the object, m10 and m01 are used to locate the center of gravity of the object, and (x̄, ȳ) gives the coordinates of the center of gravity (i.e. the centroid). It can be seen from the above equation that central moments are translation invariant.

Normalized Central Moments
The normalized central moments are defined as:

\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}, \qquad \gamma = \frac{p+q}{2} + 1, \quad p + q \in \{2, 3, \ldots\}   (13)
By normalizing the central moments, the moments become scale invariant. The normalization is different for different order moments.

Hu Moments
Hu (Hu, 1962) derived a set of seven moments which are translation, orientation and scale invariant. The equations are computed from the second and third order moments. The Hu invariants were extended by Maitra (Maitra, 1979) to be invariant under image contrast; later, Flusser and Suk (Flusser & Suk, 1993) derived moment invariants that are invariant under general affine transformation. The equations of the Hu moments are defined as:

\phi_1 = \eta_{20} + \eta_{02}   (14)
\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2   (15)
\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2   (16)
\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2   (17)
\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]   (18)
\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})   (19)
\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]   (20)

The seven Hu moments are derived from the second and third order moments; the zero and first order moments are not used in this process. The first six Hu moments are invariant to reflection (Davis & Bradski, 1999), while the seventh moment changes sign. The statistical feature vector thus contains the following set:

F_{stat} = \{\phi_1, \phi_2, \phi_3, \phi_4, \phi_5, \phi_6, \phi_7\}   (21)

where φ1 is the first Hu moment, and similarly for the other features in this set.
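A direct Python translation of Eqs. (12)-(20) for a binary hand mask could look like the sketch below; it is our own illustrative code, not the chapter's implementation.

```python
import numpy as np

def hu_moments(mask):
    """Seven Hu moments of a binary image, following Eqs. (12)-(20)."""
    ys, xs = np.nonzero(mask)                 # foreground pixel coordinates
    m00 = float(len(xs))                      # area of the object
    xb, yb = xs.mean(), ys.mean()             # centroid, Eq. (12)
    def eta(p, q):                            # normalized central moment, Eq. (13)
        mu = np.sum((xs - xb) ** p * (ys - yb) ** q)
        return mu / m00 ** ((p + q) / 2.0 + 1.0)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi1 = n20 + n02                                               # Eq. (14)
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2                         # Eq. (15)
    phi3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2             # Eq. (16)
    phi4 = (n30 + n12) ** 2 + (n21 + n03) ** 2                     # Eq. (17)
    phi5 = ((n30 - 3 * n12) * (n30 + n12)                          # Eq. (18)
            * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
            + (3 * n21 - n03) * (n21 + n03)
            * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    phi6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)    # Eq. (19)
            + 4 * n11 * (n30 + n12) * (n21 + n03))
    phi7 = ((3 * n21 - n03) * (n30 + n12)                          # Eq. (20)
            * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
            - (n30 - 3 * n12) * (n21 + n03)
            * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return np.array([phi1, phi2, phi3, phi4, phi5, phi6, phi7])
```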
Geometrical Feature Vectors
The geometrical feature set contains two features: circularity and rectangularity. These features are computed to compare the hand shape with the standard shapes, circle and rectangle.
This feature set varies from letter to letter and is useful for recognizing the alphabets and numbers. The geometrical feature set is:

F_{geo} = \{Circularity, Rectangularity\}   (22)

Circularity: Circularity measures how close the object's shape is to a circle. In the ideal case, a circle gives a circularity of one; the range of circularity varies from 1 to infinity. Circularity is defined as:

Circularity = \frac{Perimeter^2}{4\pi \times Area}   (23)

where Perimeter is the length of the hand contour and Area is the total number of hand pixels.

Rectangularity: Rectangularity measures how close the object's shape is to a rectangle. The orientation of the object is calculated by computing the angle of all contour points using central moments; the length l and width w are calculated from the difference of the largest and smallest extents under this rotation. In the ideal case, the rectangularity is 1 for a rectangle; it varies from 0.5 to infinity and is calculated as:

Rectangularity = \frac{Area}{l \times w}   (24)

where Area is the total number of hand pixels, l is the length and w is the width. The statistical and geometrical feature vectors are combined to form the total feature set, denoted as:

F_{total} = F_{stat} \cup F_{geo}   (25)

F_{total} = \{\phi_1, \phi_2, \phi_3, \phi_4, \phi_5, \phi_6, \phi_7, Circularity, Rectangularity\}   (26)

F_{total} contains all the features used for hand posture recognition.
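A minimal sketch of the two geometrical features of Eqs. (23)-(24), assuming the perimeter, area and bounding-box extents have already been measured from the hand contour:

```python
import math

def circularity(perimeter, area):
    # Eq. (23): equals 1 for an ideal circle, grows with shape irregularity
    return perimeter ** 2 / (4.0 * math.pi * area)

def rectangularity(area, length, width):
    # Eq. (24): equals 1 for an ideal rectangle
    return area / (length * width)
```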
Curvature Feature
An important feature for the recognition of alphabets is the curvature feature, which tells us about the peaks (i.e. fingertips) of the hand. Therefore, before classifying the alphabets by SVM, four groups are made according to the number of fingertips detected in the hand. For ASL numbers, we classify them with a single classifier.

Normalization: Normalization is applied to keep the features within a particular range. The geometrical features can range up to infinity and differ strongly from each other, which creates a scalability problem. In order to keep them in the same range and to combine them with the statistical feature vector, normalization is carried out as follows:

Circ_{norm} = \frac{Circ - Circ_{min}}{Circ_{max} - Circ_{min}}, \qquad Rect_{norm} = \frac{Rect - Rect_{min}}{Rect_{max} - Rect_{min}}   (27)

where Circ_min and Circ_max are the minimum and maximum circularity of the hand over all classes of feature vectors; the notation is the same for rectangularity. The Hu moments are normalized by the following equation:

\hat{\phi}_i = \frac{\phi_i - \phi_{min}}{\phi_{max} - \phi_{min}}   (28)

where φi is the i-th Hu moment feature, and φmin and φmax are the minimum and maximum values over the set of all classes respectively.
4.2.2 Gesture Features
There are three basic features: location, orientation and velocity, which we combine into the main feature vector. A gesture path is a spatio-temporal pattern consisting of the hand centroid points (x_hand, y_hand), whose Cartesian coordinates can be extracted directly from the gesture frames. We consider two types of location features. The first location feature, Lc, measures the distance from the centroid to a point of the hand gesture path, because different location features are generated for the same gesture depending on the starting point (Eq. 29). The second location feature, Lsc, is computed from the start point to the current point of the hand gesture path (Eq. 31):
L_{c_t} = \sqrt{(x_t - C_x)^2 + (y_t - C_y)^2}, \qquad t = 1, 2, \ldots, T-1   (29)

(C_x, C_y) = \left( \frac{1}{T} \sum_{t=1}^{T} x_t, \; \frac{1}{T} \sum_{t=1}^{T} y_t \right)   (30)

L_{sc_t} = \sqrt{(x_t - x_1)^2 + (y_t - y_1)^2}   (31)
where T represents the length of the hand gesture path and (Cx, Cy) refers to the center of gravity. To allow a real-time implementation, the center of gravity is recomputed after each image frame. The second basic feature is the orientation, which gives the direction of the hand as it traverses space during the gesture. As described above, the orientation feature is based on the calculation of the hand displacement vector at every point and is represented by the orientation with respect to the center of gravity (θ1t), the orientation between two consecutive points (θ2t) and the orientation between the start point and the current point of the gesture path (θ3t):

\theta_{1t} = \arctan\frac{y_{t+1} - C_y}{x_{t+1} - C_x}, \qquad \theta_{2t} = \arctan\frac{y_{t+1} - y_t}{x_{t+1} - x_t}, \qquad \theta_{3t} = \arctan\frac{y_{t+1} - y_1}{x_{t+1} - x_1}   (32)
The third basic feature is the velocity, which plays an important role during the gesture recognition phase, particularly in some critical situations. The velocity V is based on the fact that each gesture is made at a different speed, and the velocity of the hand decreases at the corner points of a gesture path. The velocity is calculated as the Euclidean distance between
the two successive points divided by the time, measured in number of video frames:

V_t = \sqrt{(x_{t+1} - x_t)^2 + (y_{t+1} - y_t)^2}   (33)
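Putting Eqs. (29)-(33) together, the per-frame feature vector can be sketched in Python as follows. This is an illustration under our own naming; `path` is assumed to be the array of hand centroids tracked so far (at least two points).

```python
import numpy as np

def dynamic_features(path):
    """Feature vector (Lc, Lsc, theta1, theta2, theta3, V) for the latest frame."""
    path = np.asarray(path, dtype=float)
    cx, cy = path.mean(axis=0)                    # centre of gravity, Eq. (30)
    (x1, y1), (xt, yt), (xn, yn) = path[0], path[-2], path[-1]
    Lc  = np.hypot(xt - cx, yt - cy)              # Eq. (29)
    Lsc = np.hypot(xt - x1, yt - y1)              # Eq. (31)
    th1 = np.arctan2(yn - cy, xn - cx)            # Eq. (32), w.r.t. centroid
    th2 = np.arctan2(yn - yt, xn - xt)            # Eq. (32), consecutive points
    th3 = np.arctan2(yn - y1, xn - x1)            # Eq. (32), w.r.t. start point
    V   = np.hypot(xn - xt, yn - yt)              # Eq. (33)
    return np.array([Lc, Lsc, th1, th2, th3, V])
```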
Each frame contains a set of feature vectors at time t (Lct, Lsct, θ1t, θ2t, θ3t, Vt), where the dimension of the space is proportional to the size of the feature vector. In this manner, a gesture is represented as an ordered sequence of feature vectors, which are projected and clustered in the feature space to obtain the discrete codewords used as input to the HMMs. This is done with the k-means clustering algorithm (Ding & He, 2004; Kanungo et al., 2002), which classifies the gesture pattern into K clusters in the feature space based on the minimum distance between the center of each cluster and the feature point. We divide the set of feature vectors into a set of clusters; this allows us to model each straight segment of the hand trajectory in the feature space by one cluster. The calculated cluster index is used as the input (i.e. the observation symbol) to the HMMs. Since we usually do not know the best number of clusters for a data set, we considered K = 28, 29, ..., 37, based on the number of segmented parts in all numbers (0-9), where each straight-line segment is classified into a single cluster. Suppose we have n samples of training feature vectors x1, x2, ..., xn, all from the same class, and we know that they fall into K compact clusters, K < n. Let mi be the mean of the vectors in cluster i. If the clusters are well separated, a minimum-distance classifier can separate them; that is, we can say that x is in cluster i if ||x - mi|| is the minimum over all K distances. The procedure for finding the k-means is the following (a minimal sketch follows below):
- Randomly build an initial vector quantization codebook for the means m1, m2, ..., mK
- Until there are no changes in any mean:
  - Use the estimated means to classify each training vector into one of the clusters mi, for i = 1 to K
  - Replace each mi with the mean of all the training vectors assigned to cluster i
A general observation is that different gestures have different trajectories in the cluster space, while the same gesture shows very similar trajectories.
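The codeword extraction just described can be sketched as follows; this is illustrative Python, with K = 33 as found optimal in Section 5, and is not the chapter's actual implementation.

```python
import numpy as np

def kmeans_codebook(vectors, K=33, iters=100, seed=0):
    """k-means codebook; the returned labels serve as HMM observation symbols."""
    rng = np.random.default_rng(seed)
    means = vectors[rng.choice(len(vectors), K, replace=False)]
    for _ in range(iters):
        # assign each feature vector to its nearest mean
        labels = np.argmin(((vectors[:, None] - means[None]) ** 2).sum(-1), axis=1)
        # recompute each mean from its assigned vectors (keep empty clusters)
        new = np.array([vectors[labels == k].mean(0) if np.any(labels == k)
                        else means[k] for k in range(K)])
        if np.allclose(new, means):
            break
        means = new
    return means, labels
```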
4.3 Classification
4.3.1 Hand Posture via SVM
In the classification, a symbol is assigned to one of the predefined classes using the fusion of statistical and geometrical feature vectors. A set of thirteen ASL alphabets (A, B, C, D, H, I, L, P, Q, U, V, W and Y) and seven ASL numbers (0-6) are recognized using SVM, as shown in Figure 7(a) and Figure 7(b) respectively. The classification phase contains two parts: curvature is analyzed in the first part for the ASL alphabets, whereas the SVM classifier is used in the second part for both ASL alphabets and numbers. The reason for not putting the number signs together with the alphabets is that some numbers are very similar to alphabets and hard to distinguish; for example, 'D' and '1' are the same up to a small change of the thumb. Therefore, unlike the alphabets, the ASL numbers are not categorized into groups and classification is carried out for a single group; the first (curvature) part is skipped for ASL numbers and only the SVM classifier part is used.
Fig. 7. (a)&(b) Set of ASL alphabets and numbers, where rectangles show the posture signs used

Curvature Analysis
In the classification phase, we use the number of detected fingertips to create the groups for the ASL alphabets; these groups are shown in Table 1. The analysis is done to reduce the number of signs in each group and to avoid misclassifications. In the second part, SVM classifies the posture signs based on the detected fingertips.

Group Nr.   Fingers   Posture Symbols
1           0         A, B
2           1         A, B, D, H, I, U
3           2         C, L, P, Q, V, Y
4           3         W

Table 1. The number of detected fingertips in posture alphabets

4.3.2 Hand Gesture via HMMs
To spot meaningful gestures, we construct the gesture spotting network shown in Figure 8. The gesture spotting network can easily be expanded to new vocabularies by adding a new meaningful-gesture HMM and then rebuilding the non-gesture model. Briefly, we describe how to model gesture patterns discriminately and how to model non-gesture patterns effectively. Each reference pattern for the Arabic numbers (0-9) is modeled by an LRB model with the number of states varying from 3 to 5 according to its complexity, since an excessive number of states can cause over-fitting when the number of training samples is insufficient compared to the number of model parameters. It is not easy to obtain a set of non-gesture patterns, because there is an infinite variety of meaningless motion. Therefore, all patterns other than the reference patterns are modeled by a single HMM called the non-gesture model (garbage model) (Lee & Kim, 1999; Yang et al., 2007; Elmezain et al., 2009). The non-gesture model is constructed by collecting the states of all gesture models in the system as follows:
1. Duplicate all states of all gesture models, each with its output observation probabilities; then re-estimate those probabilities with a Gaussian distribution smoothing filter so that the states can represent any pattern.
2. Keep the self-transition probabilities as in the gesture models.
3. Assign all outgoing transitions equally as:

\bar{a}_{ij} = \frac{1 - a_{ii}}{N - 1}, \qquad \forall j, \; i \ne j   (34)
where āij represents the transition probabilities of the non-gesture model from state i to state j, aij the transition probabilities of the gesture models from state i to state j, and N the number of states in all gesture models.
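As a small illustration of Eq. (34) (under our reconstruction of that equation), the pooled transition matrix of the non-gesture model can be built as follows; the function name and the list-of-matrices input are our own assumptions.

```python
import numpy as np

def garbage_transitions(gesture_As):
    """Pool the states of all gesture models into one transition matrix:
    self-transitions are kept, the remaining probability mass is spread
    equally over all outgoing transitions (Eq. 34)."""
    self_loops = np.concatenate([np.diag(A) for A in gesture_As])
    N = len(self_loops)
    A_bar = np.tile(((1.0 - self_loops) / (N - 1))[:, None], (1, N))
    np.fill_diagonal(A_bar, self_loops)   # each row still sums to 1
    return A_bar
```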
Fig. 8. Gesture spotting network which contains ten gesture models and one non-gesture model with two null states (Start: ST; End: ET) The non-gesture model (Figure 1(b) & Figure 8) is a weak model for all trained gesture models and represents every possible pattern where its likelihood is smaller than the dedicated model for a given gesture because of the reduced forward transition probabilities. Also, the likelihood of the non-gesture model provides a confidence limit for the calculated likelihood by other gesture models. Thereby, we can use confidence measures as an
430
Advanced Technologies
adaptive threshold for selecting the proper gesture model or gesture spotting. The number of states for non-gesture model increases as the number of gesture model increases. Moreover, there are many states in the non-gesture model with similar probability distribution, which in turn lead to a waste time and space. To alleviate this problem, a relative entropy (Cover & thomas, 1991) is used. The relative entropy is a measure of the distance between two probability distributions. Consider two random probability distributions P =(p1, p2, ..., pM)T and Q =(q1, q2, ..., qM)T, the symmetric relative entropy ������� is defined as: ������� �
�
D_s(P, Q) = \frac{1}{2} \sum_{m=1}^{M} \left( p_m \log \frac{p_m}{q_m} + q_m \log \frac{q_m}{p_m} \right)   (35)

The proposed state reduction is based on Eq. 35 and works as follows:
1. Calculate the symmetric relative entropy between each pair of probability distributions p^(l) and q^(n) of states l and n, respectively:

D_s\!\left(p^{(l)}, q^{(n)}\right) = \frac{1}{2} \sum_{m=1}^{M} \left( p_m^{(l)} \log \frac{p_m^{(l)}}{q_m^{(n)}} + q_m^{(n)} \log \frac{q_m^{(n)}}{p_m^{(l)}} \right)   (36)

2. Determine the state pair (l, n) with the minimum symmetric relative entropy.
3. Recalculate the output probability distribution by merging these two states over the M discrete observation symbols:

\bar{p}_m = \frac{p_m^{(l)} + q_m^{(n)}}{2}, \qquad m = 1, \ldots, M   (37)

4. If the number of states is still greater than a threshold value, go to step 1; otherwise, re-estimate the output probability distributions with a Gaussian distribution smoothing filter so that the states can represent any pattern.
The proposed gesture spotting system contains two main modules: a segmentation module and a recognition module. In the gesture segmentation module, we use a sliding window which calculates the observation probability of all gesture models and the non-gesture model for the segmented parts. The start (end) point of a gesture is spotted by the competitive differential observation probability between the maximal gesture model (λg) and the non-gesture model (Figure 9). The maximal gesture model is the one whose observation probability P(O|λg) is the largest among all ten gesture models. The gesture starts around the time this value changes from negative to positive (Eq. 38; O can possibly be gesture g), and ends around the time it changes from positive to negative (Eq. 39; O cannot be a gesture):

P(O \mid \lambda_g) - P(O \mid \lambda_{non\text{-}gesture}) > 0   (38)

P(O \mid \lambda_g) - P(O \mid \lambda_{non\text{-}gesture}) < 0   (39)
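The state-reduction loop of Eqs. (35)-(37) above can be sketched as follows; this is illustrative Python, where B holds one output distribution per pooled state, and the distributions are assumed strictly positive (e.g. after smoothing).

```python
import numpy as np

def symmetric_relative_entropy(p, q):
    # Eqs. (35)/(36); assumes strictly positive distributions
    return 0.5 * np.sum(p * np.log(p / q) + q * np.log(q / p))

def reduce_states(B, max_states):
    """Greedily merge the closest pair of output distributions (Eqs. 36-37)."""
    B = [row for row in B]
    while len(B) > max_states:
        pairs = [(symmetric_relative_entropy(B[l], B[n]), l, n)
                 for l in range(len(B)) for n in range(l + 1, len(B))]
        _, l, n = min(pairs)                       # closest state pair
        merged = 0.5 * (B[l] + B[n])               # Eq. (37)
        B = [row for k, row in enumerate(B) if k not in (l, n)] + [merged]
    return np.array(B)
```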
Fig. 9. Simplified structure showing the main module for hand gesture spotting via HMMs

After the start point has been spotted in the continuous image sequence, the gesture recognition module is activated; it performs the recognition task for the segmented part accumulatively until it receives the gesture end signal. At this point, the type of the observed gesture is decided by the Viterbi algorithm frame by frame. The following steps show how the Viterbi algorithm works on a gesture model λg:
1. Initialization:

\delta_1(j) = \Pi_j \, b_j(o_1), \qquad 1 \le j \le N   (40)

2. Recursion (accumulative observation probability computation):

\delta_t(j) = \max_i \left[ \delta_{t-1}(i) \, a_{ij} \right] b_j(o_t), \qquad 2 \le t \le T, \; 1 \le j \le N   (41)

3. Termination:

P^*(O \mid \lambda_g) = \max_j \, \delta_T(j)   (42)

where aij is the transition probability from state i to state j, bj(ot) refers to the probability of emitting ot at time t in state j, and δt(j) is the maximum likelihood value in state j at time t.
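A compact Python sketch of Eqs. (40)-(42), using the same assumed array conventions as the forward procedure in Section 2:

```python
import numpy as np

def viterbi_likelihood(A, B, pi, obs):
    """Maximum-path likelihood P*(O | lambda) per Eqs. (40)-(42)."""
    delta = pi * B[:, obs[0]]                              # Eq. (40)
    for o in obs[1:]:
        delta = (delta[:, None] * A).max(axis=0) * B[:, o] # Eq. (41)
    return delta.max()                                     # Eq. (42)
```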
5. Experiments Discussion
A method for the detection and segmentation of the hands in stereo color images with complex backgrounds is used, in which hand segmentation and tracking are performed using the 3D depth map, color information, a Gaussian Mixture Model (GMM) (Elmezain et al., 2008b; Ming-Hsuan & Narendra, 1999; Phung et al., 2002) and the Mean-shift algorithm in conjunction with a Kalman filter (Comaniciu et al., 2003). Firstly, segmentation of skin-colored regions becomes robust if only the chrominance is used in the analysis. Therefore, the YCbCr color space is used in our approach, where the Y channel represents brightness and the (Cb, Cr) channels refer to
chrominance. We ignore the Y channel to reduce the effect of brightness variation and use only the chrominance channels, which fully represent the color information. A large database of skin and non-skin pixels is used to train the Gaussian model: 18972 skin pixels from persons of 36 different ethnic backgrounds and 88320 non-skin pixels from 84 different images. The GMM technique begins with the modeling of skin using the skin database, where a variant of the k-means clustering algorithm performs the model training to determine the initial configuration of the GMM parameters. Additionally, blob analysis is used to derive the hand boundary area, bounding box and hand centroid point (Figures 12 & 13). Secondly, after localization of the hand target in the segmentation step, we compute its color histogram with an Epanechnikov kernel (Comaniciu et al., 2003); this kernel assigns smaller weights to pixels further from the center, which increases the robustness of the density estimation. To find the best match of the hand target in the subsequent frames, the Bhattacharyya coefficient (Khalid et al., 2006) is used to measure the similarity between the hand target and the candidate. We take into consideration the mean depth value computed from the previous frame for the hand region to resolve overlapping between hands and face. The mean-shift procedure is defined recursively and performs the optimization to compute the mean-shift vector. After each mean-shift optimization, which gives the measured location of the hand target, the uncertainty of the estimate can also be computed, followed by the Kalman iteration, which provides the predicted position of the hand target. Thereby, the hand gesture path is obtained by taking the correspondences of the detected hand between successive image frames (Figure 12). The input images were captured by a Bumblebee stereo camera system with 6 mm focal length at 15 FPS and 240x320 pixel image resolution, with Matlab and C++ implementations. Our experiments cover an isolated gesture recognition test and a meaningful gesture spotting test.

5.1 Experimental results
5.1.1 Hand Posture
For training, a database was built containing 3000 samples of posture symbols taken from eight persons over a set of thirteen ASL alphabets and seven numbers. Classification results are based on 2000 test samples from five persons, and the test data is entirely different from the training data. The computed feature set is invariant to translation, orientation and scaling; the posture signs are therefore tested for these properties, which is an important contribution of this work. The experimental results show the probability of posture classification for each class in the group, obtained for the test data by the analysis of confusion matrices. The calculated results include the test posture samples (i.e. alphabets and numbers) with rotation, scaling and under occlusion. The diagonal elements in the confusion matrices represent the percentage probability of each class in the group; misclassifications between the different classes are shown by the non-diagonal elements. The feature vector set for posture recognition contains the statistical and geometrical feature vectors, so the confusion matrix computed from these features gives an inside view of how similar the different posture symbols are to each other. The confusion matrices and classification probabilities of the groups for the ASL alphabets are described here:
Group 1 (No Fingertip Detected): Table 2 shows the confusion matrix of the ASL alphabets 'A' and 'B'. It is to be noted that there is no misclassification between these two classes, which shows that these posture symbols are very different from each other.

Symbol   A       B
A        100.0   0.0
B        0.0     100.0

Table 2. Confusion Matrix for no fingertip detection. The alphabets in this group are completely different from one another

Group 2 (One Fingertip Detected): Table 3 shows the confusion matrix of the classes with one fingertip detected. The misclassifications show the tendency of a posture symbol towards its nearby posture class. The posture symbols are tested at different orientations and with back-and-forth movements. It can be seen that alphabet 'A' gives the least misclassification with the other posture symbols, because 'A' is different from the other postures in this group; 'H'/'U' has the maximum misclassification with the other posture alphabets. It is observed that the misclassification of 'H'/'U' with 'B' occurs during the back-and-forth movement. In general, there are very few misclassifications between these posture signs, thanks to features that are translation, rotation and scale invariant.

Symbol   A      B      D      I      H/U
A        99.8   0.0    0.0    0.0    0.2
B        0.0    98.18  1.0    0.0    0.82
D        0.0    0.0    98.67  1.33   0.0
I        0.58   0.0    0.8    98.62  0.0
H/U      0.0    3.08   0.0    0.24   96.68

Table 3. Confusion Matrix of the alphabets for one detected fingertip
Group 3 (Two Fingertips Detected): Table 4 shows the confusion matrix of the classes with two fingertips detected. The posture symbols in this group are tested for scaling and rotations. The results show that the highest misclassification exists between 'P' and 'Q', because these two signs are not very different in shape and geometry. Besides, the statistical features in this group are not very different from each other; a strong correlation therefore exists between the symbols in this group, which leads to misclassifications between them.

Symbol   C      L      P      Q      V      Y
C        98.65  0.25   0.0    0.75   0.0    0.35
L        0.38   98.5   0.0    0.76   0.0    0.36
P        0.0    0.0    98.74  1.26   0.0    0.0
Q        0.0    0.0    3.78   96.22  0.0    0.0
V        0.20   0.0    0.0    0.0    99.35  0.45
Y        0.0    0.0    0.0    0.0    0.7    99.3

Table 4. Confusion Matrix for the signs having two fingertips detected
Group 4 (Three Fingertips Detected): The posture symbol 'W' is the only one in the category of three detected fingertips; it therefore always results in the classification of alphabet 'W'.

ASL Numbers: Table 5 shows the confusion matrix of the classes for the ASL numbers, which are tested for scaling and rotations. The results show the least misclassification for number '0' against the other classes, because its geometrical features are entirely different from those of the other classes. The highest misclassification exists between numbers '4' and '5', as there is a lot of similarity between these signs (i.e. the thumb in '5' is open). Other misclassifications exist between the numbers '3' and '6'.

Numbers   0      1      2      3      4      5      6
0         99.8   0.2    0.0    0.0    0.0    0.0    0.0
1         0.3    99.44  0.26   0.0    0.0    0.0    0.0
2         0.0    0.0    98.34  0.4    0.0    0.0    1.26
3         0.0    0.0    0.42   98.2   0.86   0.0    0.52
4         0.0    0.0    0.0    0.2    98.24  1.56   0.0
5         0.0    0.0    0.0    0.0    2.4    97.6   0.0
6         0.0    0.0    0.8    0.6    0.0    0.0    98.6

Table 5. Confusion Matrix of ASL numbers. The maximum and the minimum classification percentages are for the numbers '0' and '5'
Posture and Gesture Recognition for Human-Computer Interaction
435
Fig. 10. (a) The graph shows the feature set of the posture signs ‘A’, ‘B’ and ‘D’ (b) Test Sequence “ABD-Sequence” for the postures signs ‘A’, ‘B’ and ‘D’ with different rotations and scaling are presented. Yellow dotted circles show rotation where as back and scaling movements are shown by red dotted circles
436
Advanced Technologies
1.2 1
Probability
0.8 0.6 0.4 0.2 0 0
50 A
100 B
D
150 L
V
200
W
Frames Y
250 I
U
300 P
Q
350 C
400
H
Fig. 11. Classification probability of the test sequence. Blue curve shows the highest probability in the initial frames which classifies ‘A’, classification for ‘B’ sign is shown by the brown curve and ‘D’ is shown in the last frames by the light green curve. 5.1.2 Hand Gesture In our experimental results, each isolated gesture number from 0 to 9 was based on 60 video sequences, which 42 video samples for training by Baum-Welch algorithm and 18 video samples for testing (Totally, our database contains 420 video samples for training and 180 video sample for testing). The gesture recognition module match the tested gesture against database of reference gestures, to classify which class it belongs to.
(a)
(b)
Fig. 12. (a) & (b) Isolated gesture ‘3’ with high three priorities where the probability of nongesture model before and after state reduction is the same
Posture and Gesture Recognition for Human-Computer Interaction
437
The higher priority was computed by Viterbi algorithm to recognize the numbers in realtime frame by frame over LRB topology with different number of states ranging from 3 to 5 based on its complexity. We evaluate the gesture recognition according to different clusters number from 28 to 37, based on the numbers of segmented parts in all numbers (0-9) where each straight-line segment is classified into a single cluster. Therefore, Our experiments showed that the optimal number of clusters is equal to 33 where the higher recognition is achieved. In Figure 12(a)&(b) Isolated gesture ‘3’ with high three priorities, where the probability of non-gesture before and after state reduction is the same (the no. of states of non-gesture model before reduction is 40 and after reduction is 28). Additionally, our database also contains 280 video samples for continuous hand motion. Each video sample either contains one or more than meaningful gestures. We measured the gesture spotting accuracy according to different window size from 1 to 8 (Figure 13(a)). We noted that, the gesture spotting accuracy is improved initially as the sliding window size increase, but degrades as sliding window size increase further. Therefore, the optimal size of sliding window is 5 empirically. Also, result of one meaningful gesture spotting ‘6’ is shown in Figure 13(a) where the start point detection at frame 15 and end point at frame 50.
(a)
(b)
Fig. 13. (a) One meaningful gesture spotting ‘6’ with spotting accuracy for different sliding window size (1-8). (b) Gesture spotting ‘78’ where the mean-shift iteration is 1.52 per frame Figure 13 (b) shows the results of continuous gesture path that contains within itself two meanningful gestures ‘7’ and ‘8’. In addition, the mean-shift iteration of continuous gesture path ’78’ is 1.25 per frame, which in turn would be suitable for real-time implementation. In automatic gesture spotting task, there are three types of errors, namely, insertion, substitution and deletion. The insertion error occurs when the spotter detects a nonexistent gesture. A substitution error occurs when the meaningful gesture is classified falsely. The
438
Advanced Technologies
deletion error occurs when the spotter fails to detect a meaningful gesture. Here, we note that some insertion errors cause the substitution errors or deletion errors where the insertion errors affect on the the gesture spotting ratio directly. The reliability of automatic gesture spotting approach is computed by Eq. 44 and achieved 94.35% (Table 6). ܴ݈ܾ݈݁݅ܽ݅݅ ݕݐൌ
് ݏ݁ݎݑݐݏ݁݃ ݀݁ݖ݅݊݃ܿ݁ݎ ݕ݈ݐܿ݁ݎݎܿ ݂ ൈ ͳͲͲΨ ് ݏ݁ݎݑݐݏ݁݃ ݐݏ݁ݐ ݂ ് ݏݎݎݎ݁ ݊݅ݐܽݎ݁ݏ݊݅ ݂
(44)
Gesture Train Spotting meaningful gestures results path Data Test Insert Delete Substitute Correct Rel. (%) Zero 42 28 1 1 1 26 89.66 One 42 28 0 1 1 26 92.86 two 42 28 0 0 1 27 96.43 Three 42 28 0 0 0 28 100.00 Four 42 28 0 0 1 27 96.43 Five 42 28 0 1 1 26 92.86 Six 42 28 1 1 1 26 89.66 Seven 42 28 0 0 0 28 100.00 Eight 42 28 1 0 2 26 89.66 Nine 42 28 0 1 0 27 96.43 Total 420 280 3 5 8 267 94.35 Table 6. Result of spotting meaningful hand gestures for numbers from 0 to 9 using Hidden Markov Models
6. Summary and Conclusion This chapter is sectioned into two parts; the first part is related to hand posture and the second part deals with hand gesture spotting. In the hand posture, the database contains 3000 samples for training the posture signs and 2000 samples for testing. The recognition process identifies the hand shape using SVM classifier on the manipulated features of segmented hands. The results for the hand posture recognition for thirteen ASL alphabets is 98.65% and for seven ASL numbers, the recognition rate is 98.60%. For the hand gesture, an automatic hand gesture spotting approach for Arabic numbers from 0 to 9 in stereo color image sequences using HMMs is proposed. The gesture spotting network finds the start and end points of meaningful gestures that is embedded in the input stream by the difference observation probability value of maximal gesture models and non-gesture model. On the other side, it performs the hand gesture spotting and recognition tasks simultaneously where it is suitable for real-time applications and solves the issues of time delay between the segmentation and the recognition tasks. The database for hand gesture contains 60 video sequences for each isolated gesture number (42 video sequences for training and 18 video sequences for testing) and 280 video sequences for continuous gestures. The results show that; the proposed approach can successfully recognize isolated gestures and spotting meaningful gestures that are embedded in the input video stream with 94.35% reliability. In short, the proposed approach can automatically recognize posture, isolated and meaningful hand gestures with superior performance and low computational complexity when applied on several video samples containing confusing situations such as partial occlusion.
Posture and Gesture Recognition for Human-Computer Interaction
439
7. Acknowledgments This work was supported by Transregional Collaborative Research Centre SFB/TRR 62 "Companion-Technology for Cognitive Technical Systems" funded by the German Research Foundation (DFG).
8. References Alon, J.; Athitsos, V. & Scharoff, S. (2005). Accurate and Efficient Gesture Spotting via Pruning and Subgesture Reasoning. In Lecture Notes in Computer Sciences, Springer Berlin / Heidelberg, Vol. 3766, pp. 189-198, ISBN 978-3-540-29620-1 Bowden, R.; Zisserman, A.; Kadir, T; & Brady, M. (2003). Vision Based Interpretation of Natural Sign Languages. In International Conference on Computer Vision Systems. Burges, C. (1998). A Tutorial on Support Vector Machines for Pattern Recognition. In Proceeding of Data Mining and Knowledge Discovery, Vol. 2, No. 2, pp. 121-167 Chang, C.C. & Lin, C.J. (2001). LIBSVM: a library for support vector machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm Cristianini, N. & Taylor, J. (2001). An Introduction to Support Vector Machines and other kernel based learning methods, Cambridge University Press, ISBN-10 : 0521780195 Comaniciu, D.; Ramesh, S. & Meer, P. (2003). Kernel-Based Object Tracking. In IEEE Transaction on PAM I, Vol. 25, No. 5, pp. 564-577, ISSN 0162-8828 Cover, T.M. & Thomas, J.A. (1991). Entropy, Relative Entropy and Mutual Information. In Elements of Information Theory, pp. 12-49 Davis, J. & Bradski, G. (1999). Real-time Motion Template Gradients using Intel CVLib. In Proceeding of IEEE ICCV Workshop on Framerate Vision, pp. 1-20 Deyou, X. (2006). A Network Approach for Hand Gesture Recognition in Virtual Reality Driving Training System of SPG. In ICPR, pp. 519-522 Ding, C. & He, X. (2004). K-means Clustering via Principal Component Analysis. In International Conference on Machine Learning (ICML), pp. 225-232 Elmezain, M.; Al-Hamadi, A.; Krell, G.; El-Etriby, S. & Michaelis, B. (2007). Gesture Recognition for Alphabets from Hand Motion Trajectory Using Hidden Markov Models. In IEEE of ISSPIT, pp. 1192-1197 Elmezain, M.; Al-Hamadi, A. & Michaelis, B. (2008a). Real-Time Capable System for Hand Gesture Recognition Using Hidden Markov Models in Stereo Color Image Sequences. In Journal of WSCG'08, Vol. 16, No. 1, pp. 65-72, ISSN 1213-6972 Elmezain, M.; Al-Hamadi, A.; Appenrodt, J. & Michaelis, B. (2008b). A Hidden Markov odelBased Continuous Gesture Recognition System for Hand Motion Trajectory. In International Conference on Pattern Recognition (ICPR), pp. 519-522 Elmezain, M.; Al-Hamadi, A. & Michaelis, B. (2009). A Novel System for Automatic Hand Gesture Spotting and Recognition in Stereo Color Image Sequences. In Journal of WSCG'09, Vol. 17, No. 1, pp. 89-96, ISSN 1213-6972 Flusser, J. & Suk, T. (1993). Pattern Recognition by Affine Moment Invariants. In Journal of Pattern Recognition, Vol. 26, No. 1, pp. 167-174 Freeman, W. & Roth, M. (1994). Orientation histograms for hand gesture recognition. In International Workshop on Automatic Face and Gesture Recognition, pp. 296-301
440
Advanced Technologies
Goronzy, S. (2002). Robust Adaptation to Non-Native Accents in Automatic Speech Recognition. In Lecture Notes in Computer Sciences, Springer, ISBN-13: 978-540003250 Handouyahia, M.; Ziou, D. & Wang, S. (1999). Sign Language Recognition Using MomentBased Size Functions. In International Conference of Vision Interface, pp. 210-216 Hu, M. (1962). Visual Pattern Recognition by Moment Invariants. In IRE Transaction on Information Theory, Vol. 8, No. 2, pp. 179-187, ISSN 0096-1000 Hussain, M. (1999). Automatic Recognition of Sign Language Gestures. Master Thesis, Jordan University of Science and Technology Kang, H. ; Lee, C. & Jung, K. (2004). Recognition-based Gesture Spotting in Video Games. In Journal of Pattern Recognition Letters, Vol. 25, No. 15, pp. 1701-1714, ISSN 0167-8655 Kanungo, T. ; Mount, D. M. ; Netanyahu, N. ; Piatko, C. ; Silverman, R. & Wu, A.Y. (2002). An Efficient k-means Clustering Algorithm: Analysis and Implementation. In IEEE Transaction on PAMI, Vol. 24, No. 3, pp. 881-892, ISSN 0162-8828 Khalid, S. ; Ilyas, U. ; Sarfaraz, S. & Ajaz, A. (2006). Bhattacharyya Coefficient in Correlation of Gary-Scale Objects. In Journal Multimedia, Vol. 1, No. 1, pp. 56-61, ISSN 1796-2048 Kim, D.; Song, J. & Kim, D. (2007). Simultaneous Gesture Segmentation and Recognition Based on Forward Spotting Accumlative HMMs. In Journal of the Pattern Recognition Society, Vol. 40, No. 11, pp. 3012-3026, ISSN 0031-3203 Lee, H. & Kim, J. (1999). An HMM-Based Threshold Model Approach for Gesture Recognition. IEEE Transaction on PAMI, Vol. 21, No. 10, pp. 961-973, ISSN 0162-8828 Licsar, A. & Sziranyi, T. (2002). Supervised Training Based Hand Gesture Recognition System. In International Conference on Pattern Recognition, pp. 999-1002 Lin, C.J. & Weng, R. (2004). Simple Probabilistic Predictions for Support Vector Regression. In Technical report, Department of Computer Science, National Taiwan University Maitra, S. (1979). Moment Invariants. In Proceeding of the IEEE, Vol. 67, pp. 697-699 Malassiotis, S. & Strintzis, M. (2008). Real-time Hand Posture Recognition using Range Data. In Image and Vision Computing, Vol. 26, No. 7 pp. 1027-1037, ISSN 0262-8856 Mitra, S. & Acharya, T. (2007). Gesture Recognition: A Survey. In IEEE Transaction on Systems, MAN, and Cybernetics, Vol. 37, No. 3, pp. 311-324, ISSN 1094-6977 Ming-Hsuan, Y. & Narendra, A. (1999). Gaussian Mixture Modeling of Human Skin Color and Its Applications in Image and Video Databases. In SPIE/EI&T Storage and Retrieval for Image and Video Databases, pp. 458-466 Niese, R.; Al-Hamadi, A. & Michaelis, B. (2007). A Novel Method for 3D Face Detection and Normalization. In Journal Multimedia, Vol. 2, No. 5, pp. 1-12, ISSN 1796-2048 Phung, S.L.; Bouzerdoum, A. & Chai, D. (2002). A Novel Skin Color Model in YCbCr Color Space and its Application to Human Face Detection. In IEEE International Conference on Image Processing (ICIP), pp. 289-292, ISSN 1522-4880 Rabiner, L. R. (1989). A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. In Proc. of the IEEE, Vol. 77, No. 2, pp. 257-286, ISSN 0018-9219 Scott, D., W. (1992). Multivariate Density Estimation. In Wiley, 1992. Suykens, J. ; Gestel, T.; Brabenter, J.; Moor, B. & Vandewalle, J. (2005). Least Squares Support Vector Machines, World Scientific, ISBN 9812381511. Takahashi, K.; Sexi, S. & Oka, R. (1992). Spotting Recognition of Human Gestures From Motion Images. In Technical Report IE92-134, pp. 9-16 Yang, H.; Park, A. & Lee, S. (2007). 
Spotting and Recognition for Human-Robot Interaction. In IEEE Transaction on Robotics, Vol. 23, No. 2, pp. 256-270, ISSN 1552-3098
Optimal Economic Stabilization Policy under Uncertainty
441
24 X
Optimal Economic Stabilization Policy under Uncertainty André A. Keller
Université de Haute Alsace France 1. Introduction A macroeconomic model can be analyzed in an economic regulation framework, by using stochastic optimal control techniques [Holbrook, 1972; Chow, 1974; Turnovsky, 1974; Pitchford & Turnovsky, 1977; Hall & Henry, 1988]. This regulator concept is more suitable when uncertainty is involved [Leland, 1974; Bertsekas, 1987]. A macroeconomic model generally consists in difference or differential equations which variables are of three main types: (a) endogenous variables that describe the state of the economy, (b) control variables that are the instruments of economic policy to guide the trajectory towards an equilibrium target, and (c) exogenous variables that describe an uncontrollable environment. Given the sequence of exogenous variables over time, the dynamic optimal stabilization problem consists in nding a sequence of controls, so as to minimize some quadratic objective function [Turnovsky, 1974; Rao, 1987]. The optimal control is one of the possible controllers for a dynamic system, having a linear quadratic regulator and using the Pontryagin’s principle or the dynamic programming method [Preston, 1974; Kamien & Schwartz, 1991; Sørensen & Whitta-Jacobsen, 2005]. A flexible multiplier-accelerator model leads to a linear feedback rule for optimal government expenditures. The resulting linear first order differential equation with time varying coefficients can be integrated in the infinite horizon. It consists in a proportional policy, an exponentially declining weighted integral policy plus other terms depending on the initial conditions [Turnovsky, 1974]. The introduction of stochastic parameters and additional random disturbance leads to the same kind of feedbacks rules [Turnovsky, 1974]. Stochastic disturbances may affect the coefcients (multiplicative disturbances) or the equations (additive residual disturbances), provided that the disturbances are not too great [Poole, 1957; Brainard, 1967; Aström, 1970; Chow, 1972; Turnovsky, 1973, 1974, 1977; Bertsekas, 1987]. Nevertheless, this approach encounters difculties when uncertainties are very high or when the probability calculus is of no help with very imprecise data. The fuzzy logic contributes to a pragmatic solution of such a problem since it operates on fuzzy numbers. In a fuzzy logic, the logical variables take continue values between 0 (false) and 1 (true), while the classical Boolean logic operates on discrete values of either 0 or 1. Fuzzy sets are a natural extension of crisp sets [Klir & Yuan, 1995]. The most common shape of their membership functions is triangular or trapezoidal. A fuzzy controller acts as an articial decision maker that operates in a closed-loop system
442
Advanced Technologies
in real time [Passino & Yurkovich, 1998]. This contribution is concerned with optimal stabilization policies by using dynamic stochastic systems. To regulate the economy under uncertainty, the assistance of classic stochastic controllers [Aström , 1970; Sage & White, 1977, Kendrick, 2002] and fuzzy controllers [Lee, 1990; Kosko, 1992; Chung & Oh, 1993; Ying, 2000] are considered. The computations are carried out using the packages Mathematica 7.0.1, FuzzyLogic 2 [Kitamoto et al., 1992; Stachowicz & Beall, 2003; Wolfram, 2003], Matlab R2008a & Simulink 7, & Control Systems, & Fuzzy Logic 2 [Lutovac et al., 2001; The MathWorks, 2008]. In this chapter, we shall examine three main points about stabilization problems with macroeconomic models: (a) the stabilization of dynamical systems in a stochastic environment, (b) the PID control of dynamical macroeconomic models with application to the linear multiplier-accelerator Phillips’ model and to the nonlinear Goodwin’s model, (c) the fuzzy control of these two dynamical basic models.
2. Stabilization of dynamical systems under stochastic shocks 2.1 Optimal stabilization of stochastic systems 2.1.1 Standard stabilization problem The optimal stabilization problem with deterministic coefcients is presented rst. This initial form, which does not t to the application of the control theory, is transformed to a more convenient form. In the control form of the system, the constraints and the objective functions are rewritten. Following Turnovsky, let a system be described by the following matrix equation
Yt A1Yt 1 A 2 Yt 2 ... A m Yt m B 0 U t B1U t 1 ... B n U t n .
(1)
q1 target variables in instantaneous and delayed vectors Y and q2 policy instruments in instantaneous and delayed vectors U . The maximum delays are m and n for Y and U respectively. The squared q1 q1 matrices A are associated to the targets, and the q1 q2 matrices B are associated to the instruments. All elements of
The system (1) consists in
these matrices are subject to stochastic shocks. Suppose that the objective of the policy maker is to stabilize the system close to the long-run equilibrium, a quadratic objective function will be
Y Y M Y Y U t 1
where
M is
t
t
t 1
t
U N U t U ,
a strictly positive denite costs matrix associated to the targets and
positive denite matrix associated to the instruments. According to (1), the two sets
U
of long-run objectives are required to satisfy
Y
N
a
and
Optimal Economic Stabilization Policy under Uncertainty
443
m n I A j Y Bi U. j 1 i 0
Letting the deviations be
Yt Y y t
and
Ut U ut , the optimal problem is
min u y t My t u t Nu t t 1 t 1
s.t.
(2)
(2)
y t A 1 y t 1 A 2 y t 2 ... A m y t m B 0 u t B 1u t 1 ... B n u t n .
2.1.2 State-space form of the system The constraint (2) is transformed into an equivalent first order system [Preston & Pagan, 1982]
xt A xt 1 B v t ,
xt (y t , y t 1 , y t 2 ,..., y t m 1 , ut , ut 1 ,..., ut n 1 ) is the g 1 state vector with g mq1 nq2 . The control vector is v t ut . The block matrix A and the vector B are
where
defined by
Any stabilization of a linear system requires that the system be dynamically controllable over some time period [Turnovsky, 1977]. The condition for the full controllability of the system states that it is possible to move the system from any state to any other. Theorem 2.1.2 (Dynamic controllability condition). A necessary and sufficient condition for a system to be dynamically controllable over some time period dynamic controllability condition
rank B | AB | ... | A
g 1
B g.
T g is
given by the
444
Advanced Technologies
Proof. In [Turnovsky, 1977], pp. 333-334.
The objective function (3) may be also written as
t 1
t 1
xt' M*xt vt' Nvt , M/m y ’s and u ’s before t 1 . Letting M * the block diagonal matrix M is defined by
where
includes past
and
N/n, N
The stabilization problem, (2) is transformed to the control form min v xt' M*xt v t' Nv t t 1 t 1 s.t. xt Axt 1 Bv t .
Since the matrices
M*
and
N are strictly positive, the optimal policy exits and is unique.
2.1.3 Backward recursive resolution method Let a formal stabilization problem be expressed with a discrete-time deterministic system T
min x y t' My t xt' Nxt , M, N 0 t 1
(3)
s.t. y t Ay t 1 Bxt . n state vector y and the m control vector x are deviations from long-run desired values, the positive semi-denite matrices M nn
In the quadratic cost function of the problem, the
and
N mm are costs with having values away from the desired objectives. The constraint of
Optimal Economic Stabilization Policy under Uncertainty
445
the problem is a rst order dynamic system 1with matrices of coefficients
A nn
and
B nm .
The objective of the policy maker is to stabilize the system close to its long-run equilibrium. To nd a sequence of control variables such that the state variables y t can move from any
y 0 to any other state y T , the dynamically controllable condition is given by a rank of a concatenate matrix equal to n initial
rank B | AB | ... | A n 1B n. The solution is a linear feedback control given by
xt R t y t1 ,
where we have
R t N B ' St B
1
B ' St A
S t 1 M R NR t A BR t S t A BR t '
' t
ST M The optimal policy is then determined according a backward recursive procedure from terminal step T to the initial conditions, such as ST M ,
step T :
R T N B ' ST B
1
B ' ST A .
step T -1 : ST 1 M R T' NRT A BRT ST A BR T , '
R T 1 N B ' ST 1B
1
B ' ST 1A
… S1 M R '2 NR 2 A BR 2 S 2 A BR 2 , '
step 1 :
R 1 N B ' S 1B
1
B ' S1 A
1 Any higher order system has an equivalent augmented first-order system, as shown in 2.1.2 . Let a second-order system be the matrix equation
y Ay t
1
t 1
Ay 2
t2
B x Bx . 0
t
1
t 1
Then, we have the augmented first-order system
z
t
y y x
t
t 1
t
A I 0
1
A
2
0 0
y y 0 0 x
B
1
t 1
t2
t 1
B 0 I
0
v . t
446
Advanced Technologies
A
'
S 0 M R 1N R 1
s te p 0 :
R
0
N
B ' S 0B
BR1
B 1
'S0A
'
S1
A
BR1
,
2.1.4 The stochastic control problem Uncorrelated multiplicative and additive shocks: The dynamic system is now subject to stochastic disturbances with random coefficients and random additive terms to each equation. The two sets of random deviation variables are supposed to be uncorrelated 2. The problem (3) is transformed to the stochastic formulation ( also [Turnovsky, 1977]).
min x E yt My t xt Nxt s.t. y t A t y t 1 B t xt t , M, N 0,
A n n
The constant matrices
and
B n m
are the deterministic part of the coefcients. The
nn
random components of the coefcients are represented by the matrices Moreover, we have the stochastic assumptions : the elements
ijt , ijt
and and
mm .
it
are
identically and independently distributed (i.i.d.) over time with zero mean and nite variances and covariances. The elements of matrices
Φt
and
Ψ t are uncorrelated with
given by where
Φt are correlated with those of Ψ t , the t . The solution is a linear feedback control
xt R y t 1 ,
3
The deviations xt , y t are about some desired and constant objectives X , Y such that x X X and y Y Y . 3 A scalar system is studied by Turnovsky [Turnovsky,1977]. The optimization problem is given by 2
*
*
t
*
*
t
t
t
2
2
t
t
min E my nx
,
m, n 0
y a
s.t.
t
t
y
t 1
b
t
x
t
ε , t
where t t are i.i.d. with zero mean, variances , and correlation coefficient . 2
2
The optimal policy is xt r yt 1 , where r a b s s / ( n b s s ) 2
and where
s
2
is the positive solution of the quadratic equation
1 a
2
b ab s n 1 a m b s mn 0.
2
2
2
2
2
2
2
2
2
A necessary and sufficient condition to have a unique positive solution is (with 0 ) 2
1 a 2
2
ab
2
b 2
2
,
Optimal Economic Stabilization Policy under Uncertainty
R N B ' SB E Ψ ' SΨ and
S
447
1
B ' SA E ' S ,
is a positive semi-definite solution to the matrix equation
S M R NR A BR S A BR E R S R . Correlated multiplicative and additive shocks: The assumption of non correlation in the original levels equation, will necessarily imply correlations in the deviations equation. Let the initial system be dened in levels by the rst order stochastic equation
Yt A t Yt 1 B t Xt εt , and the stationary equation
Y* AY* BX* . By subtracting these two matrix equations and letting we have
y t Yt Y*
and
xt Xt X* ,
y t A t y t 1 B t xt ε t' ,
where the additive composite disturbance ε ' denotes a correlation between the stochastic component of the coefcients and the additive disturbance. The solution to the stabilization problem takes a similar expression as in the uncorrelated case. We have the solution
xt Ry t 1 p, where
R N B ' SB E ' S
B ' SA E ' SΦ , p N B ' SB E ' S B ' k E ' Sε , 1
1
and
S
is positive semi-definite solution to the matrix equation
' ' S M R ' NR A BR S A BR E Φ R S Φ ΨR ,
and
k
is solution to the matrix equation ' ' k A BR k E R Sε .
where the variabilities and vary inversely. Moreover, the stabilization requirement is satisfied for any a , b ( b 0 ) and any k such that 1 a bk 1 . 2
2
448
Advanced Technologies
The optimal policy then consists of a feedback component p. The system will oscillate about the desired targets.
R
together to a xed component
2.2 Stabilization of empirical stochastic systems 2.2.1 Basic stochastic multiplier-accelerator model Structural model: The discrete time model consists in two equations, one is the nal form of output equation issued from a multiplier-accelerator model with additive disturbances, the other is a stabilization rule [Howrey, 1967; Turnovsky, 1977]
Yt bYt 1 cYt 2 Gt t , Gt g1 Yt 1 g 2 Yt 2 B , where
Y denotes the total output, G
the stabilization oriented government expenditures,
B a time independent term to characterize a full-employment policy [Howrey, 1967] and random disturbances (serially independent with zero mean, constant variance) from decisions only. The policy parameters are
g1 , g 2 and Y
is a long run equilibrium level4.
Time path of output: Combining the two equations, we obtain a second order linear stochastic difference equation (SDE)
Yt (b g1 )Yt 1 (c g 2 )Yt 2 B t , where
B is a residual expression. Provided the system is stable5, the solution is given by
t 1 B r1 j 1 r2j 1 t t Yt C1r1 C2 r2 t j , t 1, 2,... r1 r2 1 b g1 c g 2 j 0
where C1 , C2 are arbitrary constants given the initial conditions and r1 , r2 the roots of the
characteristic equation: r1 , r2 b b 2 4c / 2 . The time path of output is the sum of three terms, expressing a particular solution, a transient response and a random response respectively. 4
The stabilization rule may be considered of the proportional-derivative type [Turnovsky,
1977] rewriting Gt as Gt ( g1 g 2 ) Yt 1 Y
g Y 2
t 1
Yt 2 .
A necessary and sufficient condition of a linear system is that the characteristic roots lie within the unit circle in the complex plane. In this case, the autoregressive coefficients will satisfy the set of inequalities 5
1 b c g
1
g 2 0, 1 b c g1 g 2 0, 1 c g 2 0
The region to the right of the parabola in Figure 1 corresponds to values of coefficients b and c which yield complex characteristic roots.
Optimal Economic Stabilization Policy under Uncertainty
449
2.2.2 Stabilization of the model Iso-variance and iso-frequencies loci: Let the problem be simplified to [Howrey, 1967]
Yt bYt 1 cYt 2 At t .
(4)
Figure 1 shows the iso-variance and the iso-frequencies contours together with the stochastic response to changes in the parameters b and c . Attempts to stabilize the system may increase its variance ratio
y2 / 2 . As coefcient b, c
being held constant, the peak
is shifted to a higher frequency.
Fig. 1. Iso-variance (a) and iso-frequencies (b) contours Asymptotic variance of output: Provided the stability conditions are satised (the characteristic roots lie within the unit circle in the complex plane), the transient component will tend to zero. The system will uctuate about the stationary equilibrium rather than converge to it. The asymptotic variance of output is
asy y2
1 c g2
1 c g 2 1 c g 2 b g1 2
2
2 .
Speed of convergence: The transfer function (TF) of the autoregressive process (4) is given by
T ( ) 1 be i ce i 2
1
.
We then have the asymptotic spectrum
| T ( ) |2 1 b 2 c 2 2b (1 c) cos 2c cos 2 . 1
450
Advanced Technologies
The time-dependent spectra are defined by t 1
| T ( , t ) | 2
j 0
In this application, the
r1 j 1 r2j 1 ij e . r1 r2
parameters take the values
b 1.1, c .5, 2 1
as in
[Howrey, 1967]. Figure 2 shows how rapid is the convergence of the first ten log-spectra to the asymptotic log- spectrum. [Nerlove et al., 1979].
Fig. 2. Convergence to the asymptotic log- spectrum Optimal policy: Policies which minimize the asymptotic variance are such
g1* b
and
* 2
g c . Then we have Yt Y t The output will then uctuate about
Y
and
y2 2 .
with variance
2 .
3. PID control of dynamical macroeconomic models Stabilization problem are considered with time-continuous multiplier-accelerator models: the linear Phillips uctuation model and the nonlinear Goodwin’s growth model 6. 6
The use of closed-loop theory in economics is due to Tustin [Tustin, 1953].
Optimal Economic Stabilization Policy under Uncertainty
451
3.1 The linear Phillips’ model 3.1.1. Structural form of the Phillips’ model The Phillips’model [Phillips, 1954; Allen, 1955; Phillips, 1957; Turnovsky, 1974; Gandolfo, 1980; Shone, 2002] is described by the continuous-time system
Z (t ) C (t ) I (t ) G (t ), C (t ) cY (t ) u (t ), I I (t ) vY ,
(5) (5) (6) (6) (7) (7)
Y (t ) Z (t ) , I and Y denote the first derivatives w.r.t. time of
Y
where variables I (t ) and
(8) (8)
the continuous-time respectively. All yearly variables are continuous twice-
Y (t )
differentiable functions of time and all measured in deviations from the initial equilibrium value. The aggregate demand
Z
income
Y without
G
u (t )
is then dened by the step function
u (t ) 1 for t 1 . The coefcient c
investment
in equation (5). Consumption
delay and is disturbed by a spontaneous change
equation (6). The variable and
C,
consists in consumption
autonomous expenditures of government
C
I
and
depends on
u at time t 0 in u (t ) 0 , for t 0
is the marginal propensity to consume. Equation
(7) is the linear accelerator of investment, where investment is related to the variation in demand. The coefcient
v
is the acceleration coefcient and
denotes the speed of
response of investment to changes in production, the time constant of the acceleration lag being
1
years. Equation (8) describes a continuous gradual production adjustment to
demand. The rate of change of production Y at any time is proportional to the difference between demand and production at that time. The coefcient is the speed of response of production to changes in demand. Simple exponential time lags are then used in this model 7. 3.1.2. Block-diagram of the Phillips’ model The block-diagram of the whole input-output system (without PID tuning) is shown in Figure 3 with simulation results. Figure 4. shows the block-diagram of the linear multiplieraccelerator subsystem. The multiplier-accelerator subsystem shows two distinct feedbacks : the multiplier and the accelerator feedbacks.
7
The differential form of the delay is the production lag / D where the operator
D
is
the differentiation w.r.t. time. The distribution form is
Y (t )
0
Given the weighting function
w( ) Z (t ) d ,
w(t ) e t , the response function is F (t ) 1 e t
the path of Y following a unit step-change in Z .
for
452
Advanced Technologies
Fig. 3. Block-diagram of the system and simulation results
Fig. 4. Block diagram of the linear multiplier-accelerator subsystem 3.1.3. System analysis of the Phillips’ model The Laplace transform of X (t ) is defined by
X ( s ) L X (t ) e st X (t ) dt. 0
Omitting the disturbance
u (t ) , the model (5-8) is transformed to
Z ( s ) C ( s ) I ( s ) G ( s ), C ( s) cY ( s ), sITF I ( s) ( sof ) the system The is vsY ( s ), sY ( s ) Y ( s ) Z ( s ).
(9) (10) (11) (12)
Optimal Economic Stabilization Policy under Uncertainty
453
Y ( s) (s ) 2 . G ( s ) s (1 c) v s (1 c) 3 Taking a unit investment time-lag with 1 together with 4, c 4 H (s)
we have
H ( s ) 20
s .2 j .
v
3 , 5
s 1 . 5s 2 s 5 2
The constant of the TF is then 4, the zero is at conjugates
and
s 1
and poles are at the complex
The TF of system is also represented by
H ( j ) .
The Bode
magnitude and phase, expressed in decibels ( 20 log10 ), are plotted with a log-frequency axis. The Bode diagram in Figure 5 shows a low frequency asymptote, a resonant peak and a decreasing high frequency asymptote. The cross-over frequency is 4 (rad/sec). To know how much a frequency will be phase-shifted, the phase (in degrees) is plotted with a logfrequency axis. The phase cross over is near 1 (rad/sec). When varies, the TF of the system is represented in Figure 5 by the Nyquist diagram on the complex plane.
Fig. 5. Bode diagram and Nyquist diagram of the transfer function 3.1.4 PID control of the Phillips’ model The block-diagram of the closed-loop system with PID tuning is shown in Figure 6. The PID controller in Figure 7 invokes three coefcients. The proportional gain
K pe t
determines
the reaction to the current error. The integral gain t
Ki e d 0
bases the reaction on sum of past errors. The derivative Gain K d e determines the reaction to the rate of change of error. The PID controller is a weighted sum of the three actions. A
454
larger
Advanced Technologies
Kp
will induce a faster response and the process will oscillate and be unstable for an
excessive gain. A larger
K i eliminates
steady states errors. A larger
Kd
decreases
8A
PID controller is also described by the following TF overshoot [Braae & Rutherford, 1978] in the continuous s-domain [Cominos & Nunro, 2002]
H C (s) K p
Ki sK d . s
The block-diagram of the PID controller is shown in Figure 7.
Fig. 6. Block diagram of the closed-looped system
Fig. 7. Block diagram of the PID controller
8 The Ziegler-Nichols method is a formal PID tuning method: the I and D gains are first set to zero. The P gain is then increased until to a critical gain K c at which the output of the loop starts to oscillate. Let denote by Tc the oscillation period, the gains are set to .5 K c
for a P control, to .45 K c 1.2 K p / Tc for a PI control, to .6 K c 2 K p / Tc K p Tc / 8 for a PID control.
Optimal Economic Stabilization Policy under Uncertainty
455
3.2 The nonlinear Goodwin’s model 3.2.1. Structural form of the Goodwin’s model The extended model of Goodwin [Goodwin, 1951; Allen, 1955; Gabisch & Lorenz, 1989] is a multiplier-accelerator with a nonlinear accelerator. The system is described by the continuous-time system
Z (t ) C (t ) I (t ), C (t ) cY (t ) u (t ), I I (t ) B (t ) ,
(13)
B (t ) vY ,
(16)
Y Y (t ) Z (t ) .
(17)
(14) (15)
The aggregate demand
Z in
equation (13) is the sum of consumption
C
and total
investment I 9. The consumption function in equation (14) is not lagged on income Y . The investment (expenditures and deliveries) is determined in two stages: at the rst stage, investment I in equation (15) depends on the amount of the investment decision B with
an exponential lag; at the second stage the decision to invest B in equation (16) depends non linearly by Φ on the rate of change of the production Y . Equation (17) describes a
continuous gradual production adjustment to demand. The rate of change of supply Y is proportional to the difference between demand and production at that time (with speed of response ). The nonlinear accelerator is dened by
LM (Y ) M vY 1 , Le M where M is the scrapping rate of capital equipment and goods trades. It is also subject to the restrictions
B0
if
Y 0, B L
as
L
the net capacity of the capital-
Y , B M
as
Y .
The graph of this function is shown in Figure 8.
9 The autonomous constant component is ignored since level.
Y
is measured from a stationary
456
Advanced Technologies
Fig. 8. Nonlinear accelerator in the Goodwin’s model 3.2.2. Block-diagrams of the Goodwin’s model The block-diagrams of the nonlinear multiplier-accelerator are described in Figure 9.
Fig. 9. Block-diagram of the nonlinear accelerator 3.2.3 Dynamics of the Goodwin’s model The simulation results show strong and regular oscillations in Figure 10. The Figure 11 shows how a sinusoidal input is transformed by the nonlinearities. The amplitude is strongly amplied, and the phase is shifted.
Optimal Economic Stabilization Policy under Uncertainty
457
Fig. 10. Simulation on the nonlinear accelerator
Fig. 11. Simulation of a sinusoidal input 3.2.4 PID control of the Goodwin’s model Figure 12 shows the block-diagram of the closed-loop system. It consists of a PID controller and of the subsystem of Figure 9. The simulation results which have the objective to maintain the system at a desired level equal to 2.5. This objective is reached with oscillations within a time-period of three years. Thereafter, the system is completely stabilized.
458
Advanced Technologies
Fig. 12. Block-diagram and simulation results of the PID controlled Goodwin’s model
4. Fuzzy control of dynamic macroeconomic models 4.1 Elementary fuzzy modeling 4.1.1 Fuzzy logic controller A fuzzy logic controller (FLC) acts as an articial decision maker that operates in a closedloop system in real time [Passino & Yurkovitch, 1998]. Figure 13 shows a simple control problem, keeping a desired value of a single variable. There are two conditions: the error and the derivative of the error. This controller has four components: (a) a fuzzication interface to convert crisp input data into fuzzy values, (b) a static set of "If-Then" control rules which represents the quantication of the expert’s linguistic evaluation of how to achieve a good control, (c) a dynamic inference mechanism to evaluate which control rules are relevant, and (d) the defuzzication interface that converts the fuzzy conclusions into crisp inputs of the process 10. These are the actions taken by the FLC. The process consists of three main stages: at the input stage 1 the inputs are mapped to appropriate functions, at the processing stage 2 appropriate rules are used and the results are combined, and at the output stage 3 the combined results are converted to a crisp value input for the process.
10 The commonly used centroid method will take the center of mass. It favors the rule with the output of greatest area. The height method takes the value of the biggest contributor.
Optimal Economic Stabilization Policy under Uncertainty
459
Fig. 13. Design of a fuzzy controller 4.1.2 Fuzzyfication and fuzzy rules Simple control example: Let us consider a simple control example of TISO (Two Inputs Single Output) Mamdani fuzzy controller. The fuzzy controller uses identical input fuzzy sets, namely "Negative", "Zero" and "Positive" MFs. The system output is supposed to follow
x(t ) 4 e t /5 4 cos t 3 6 sin t , e(t ) r (t ) x(t ), where r (t ) is the de(t ) input, supposed to be constant (a set point) 11. Then we have e x . dt as in Figure 14. The error is dened by
reference
Fig. 14. System output, fuzzy rules and phase trajectory
11
Scaling factors may be used to modify easily the universe of discourse of inputs. We then
have the scaled inputs K e e(t ) and K r e .
460
Advanced Technologies
Fuzzification: Membership functions. A membership function (MF) assigns to each element x of the universe of discourse
: X 0,1 .
X
, a grade of membership
The triangular MF of Figure 15 is defined by
a b c . A fuzzy A x, A x | x X .
where
set
A
( x)
such that
x a c x , , 0 , b a c b
x max min
is then defined as a set of ordered pairs
According to the Zadeh operators, we have
A B m in ( A ), ( B ) , A B m ax ( A ), ( B ) , and ( A ) 1 ( A ). The overlapping MFs of the two inputs error and change-in-error and the MF of the output control-action show the most common triangular form in Figure 15. The linguistic label of these MFs are Negative", "Zero" and "Positive" over the range inputs and over the range
100,100
for the two
1,1 for the output.
Fig. 15. Membership functions of the two inputs and one output Fuzzy rules: Fuzzy rules are coming from expert knowledge and consist in "If-Then" statements. An antecedent block is placed between "If" and "Then" and a consequent block is following "Then"
12.
Let the continuous differentiable variables
e(t )
and
e t
denote the
error and the derivative of error in the simple stabilization problem of Figure 13. The conditional recommendations are of the type See [Braee & Rutherford, 1978] for fuzzy relations in a FLC and their influences to select more appropriate operations.
12
Optimal Economic Stabilization Policy under Uncertainty
461
If e , e is A B Then v is C ,
where A B ( x , y ) min A ( x ), B ( y ), x a , a , y b , b .
These FAM (Fuzzy Associative Memory)-rules 13 are those of the Figure 16. These nine rules will cover all the possible situation. According to rule (PL,NL;ZE), the system output is below the set point (positive error) and is increasing at this point. The controller output should then be unchanged. On the contrary, according to rule (NL,NL;NL), the system output is above the set point (negative error) and is increasing at this point. The controller output should then decrease the overshoot. The commonly linguistic states of the TISO model are denoted by the simple linguistic set A={NL,ZE;PL}. The binary input-output FAM-rules are then triples such as (NL,NL;NL): "If" input e is Negative Large and e is Negative Large "Then" control action v is Negative Large. The antecedent (input) fuzzy sets are implicitly combined with conjunction "And".
Fig. 16. Fuzzy rule base 1: NL-Negative Large, ZE-Zero error, PL-Positive Large 4.1.3 Fuzzy inference and control action Fuzzy inference: In Figure 17, the system combines logically input crisp values with minimum, since the conjunction "And" is used. Figure 18 produces the output set, combining all the rules of the simple control example, given crisp input values of the pair
e, e .
Fig. 17. FAM influence procedure with crisp input measurement 13 Choosing an appropriate dimension of the rule is discussed by [Chopra et al., 2005]. Rules bases of dimension 9 (for 3MFs), 25 (5MFs), 49 (7 MFs), 81 (9 MFs) and 121 (11 MFs) are compared.
462
Advanced Technologies
Fig. 18. Output fuzzy set from crisp input measurements Defuzzyfication: The fuzzy output for all rules are aggregated to a fuzzy set as in Figure 18. Several methods can be used to convert the output fuzzy set into a crisp value for the control-action variable v. The centroid method (or center of gravity (COG) method) is the center of mass of the area under the graph of the MF of the output set in Figure 18. The COG corresponds to the expected value
vc In this example,
vc .124
v (v)dv . (v)dv
for the pair of crisp inputs
e, e 55, 20 .
4.2 Fuzzy control of the Phillips’ model The closed-loop block-diagram of the Phillips’model is represented in Figure 19 with simulation results. It consists of the FLC block and of the TF of the model. The properties of the FLC controller have been described in Figure 13 (design of the controller), Figure 15 (membership functions), Figure 16 (fuzzy rule base) and Figure 18 (output fuzzy set). Figure 20 shows the efciency of such a stabilization policy. The range of the uctuations has been notably reduced with a fuzzy control. Up to six years, the initial range
3,3 .
12,12 goes to
Optimal Economic Stabilization Policy under Uncertainty
463
Fig. 19. Block diagram of the Phillips model with Fuzzy Control
Fig. 20. Fuzzy stabilization of the Phillips’ model 4.3 Fuzzy control of the Goodwin’s model Figure 21 shows the block-diagram of the controlled system. It consists of a fuzzy controller and of the subsystem of the Goodwin’s model. The FLC controller is unchanged. The simulation results show an efcient and fast stabilization. The system is stable within ve time-periods, and then uctuates in an explosive way but restricted to an extremely close range.
464
Advanced Technologies
Fig. 21. Block-diagram and simulation results of the fuzzy controlled Goodwin’s model
5. Conclusion Compared to a PID control, the simulation results of a linear and nonlinear multiplieraccelerator model show a more efcient stabilization of the economy within an acceptable time-period of few years in a fuzzy environment. Do the economic policies have the ability to stabilize the economy ? Sørensen and Whitta-Jacobsen [Sørensen & Whitta-Jacobsen, 2005] identify three major limits: the credibility of the policy authorities’ commitments by rational private agents, the imperfect information about the state of the economy, and the time lags occurring in the decision making process. The effects of these limits are studied using an aggregate supply–aggregate demand (AS-AD) model and a Taylor’s rule.
6. Appendix A: Analytical regulation of the Phillips’ model 6.1 Unregulated model dynamics The unregulated model (with G 0 and ordinary differential equation (ODE) in
Y
u 1)
is governed by a linear second order
, deduced from the system (5-8). We have
Y (1 c) v Y (1 c)Y (t ) ,
Optimal Economic Stabilization Policy under Uncertainty
When
t 0
465
Y (0) 0, Y (0) . Taking the following c 3 / 4, v 3 / 5, 4 ( T 1 3 months) and 1
with the initial conditions
values for the parameters:
(time constant of the lag 1 year), the ODE is
5Y 2Y 5Y t 20, t 0, with initial conditions
Y (0) 0, Y (0) 4. The solution of the unregulated model is
2 6 2 6 Y (t ) 4 2et /5 2 cos t 6 sin t , t 0, 5 5 or
Y (t ) 4 6.32et /5 cos .98 t .68 , t 0.
The phase diagram in Figure A.1 shows an unstable equilibrium for which stabilization policies are justified.
Fig. A.1. Phase diagram of the Phillips’ model 6.2 Stabilization policies The stabilization of the model proposed by [Phillips, 1954] consists in three additive policies: the proportional P-stabilization policy, the proportional + integral PI-stabilization policy, the proportional + integral + derivative PID-stabilization policy. Modications are introduced by adding terms to the consumption equation (6).
466
Advanced Technologies
P-stabilization policy: For a P-stabilization, the consumption equation is
C (t ) c.Y (t ) u (t )
K p denotes the proportional correction factor and the speed of response of policy
where
demand to changes in potential policy demand retain
K pY (t ),
D
2
In the numerical applications, we will (a correction lag with time constant of 6 months). The dynamic equation of the
model is a linear third order ODE in
14.
Y
Y (3) (1 c) v Y
(1 c)( ) K p v Y 1 c K p Y (t ) u (t ).
Taking
c 3 / 4, v 3 / 5, 4, 1, 2, K p 2, u 1,
the ODE is
5Y (3) 8Y 81Y 90Y (t ) 40, t 0, with the
t 0)
initial conditions
Y (0) 0, Y (0) 4, Y(0) 5.6 .
The solution (for
is
Y (t ) .44 .03 e 1.15t 1.1 e .23t sin 3.96 t .44 . The graph of the P-controlled is plotted in Figure A.2(b). The system is stable according to the Routh-Hurwitz stability conditions15. Moreover, the stability conditions for
K p .25
14 15
and
Kp
are
K p .35 .
The time constant of the correction lag is 1 years. Let be the polynomial equation with real coefficients
a0 n a1 n 1 ... an 1 an 0, a0 0
The Routh-Hurwitz theorem states that necessary and sufficient conditions to have negative real part are given by the conditions that all the leading principal minors of a matrix must be positive. In this case, the 3 3 matrix is a a 0 a a 0 0 a a We have all the positive leading principal minors: 1, 7.9 and 142.5 . 1
3
0
2
1
3
1
2
3
Optimal Economic Stabilization Policy under Uncertainty
467
PI-stabilization policy: For a PI-stabilization policy, the consumption equation is
C (t ) c.Y (t ) u (t ) where
Ki
D
K Y (t ) K Y (t )dt , p
i
denotes the integral correction factor. The dynamic equation of the model is a
linear fourth order ODE in
Y is deduced
Y (4) (1 c) v Y (3)
(1 c) K p v Y
(1 c) K p K i Y K i Y (t ) 0.
Taking
c 3 / 4, v 3 / 5, 4, 1, 2, K p K i 2, u 1,
the ODE is
5 Y (4) 8 Y (3) 81Y 170 Y 80 Y (t ) 0, t 0, with the
initial conditions
solution (for
t 0)
Y 0 0, Y 0 4, Y 0 5.6, Y (3) (0) 96
. The
is
Y (t ) .07e1.43t .13e.69t 1.08e.26t sin 4.03 t .19 . The graph of the PI-controlled
Y (t )
is plotted in Figure A.2(c). The system is unstable since
the Routh-Hurwitz stability conditions are not all satisfied conditions on
Ki
are
16.
Given
K i 0,.8987 .
K p 2 , the stability
PID-stabilization policy: For a PID-stabilization policy, the consumption equation is
C (t ) c.Y (t ) u (t ) where
Kd
D
K Y (t ) K Y (t )dt K DY (t ) , p
i
d
denotes the derivative correction factor. The dynamic equation of the model is a
linear fourth order ODE in
Y
Y (4) (1 c) K d v Y (3)
(1 c K d v) (1 c K p ) Y (1 c K p ) Ki Y K i Y (t ) 0.
16
The leading principal minors are: 1 1, 2 8., 3 274.7, 4 5050.8 .
468
Advanced Technologies
c 3 / 4, v 3 / 5, 4, 1, 2, K p K i 2, K d .55, u 1 , fourth order ODE n Y is
Taking
the
Y (4) 6 Y (3) 20.6 Y 34 Y 16 Y (t ) 0, t 0, with the
initial conditions
solution (for
t 0 ) is
Y 0 0, Y 0 4, Y 0 12, Y (3) (0) 2.4 .
The
Y (t ) .07e 2.16t .12 e .74t 1.40 e 1.55t cos 2.76 t 1.54 . The graph of the PID-controlled
Y (t )
is plotted in Figure A.2(d). The system is stable since
the Routh-Hurwitz stability conditions are all satisfied 17. Given conditions on
Kd
are
K d 3.92
and
K d .07 .
K p K i 2 , the stability
The curve Figure A.2(a) without
stabilization policy shows the response of the activity Y to the unit initial decrease of demand. The acceleration coefcient ( v .8 ) generates explosive uctuations 18 .The proportional tuning corrects the level of production but not the oscillations. The oscillations grow worse by the integral tuning. The combined PI-stabilization19 renders the system unstable. The additional derivative stabilization is then introduced and the combined PIDpolicy stabilizes the system.
Fig. A.2. Stabilization policies over a 3-6 years period: (a) no stabilization policy, (b) Pstabilization policy, (c) PI-stabilization policy, (d) PID-stabilization policy The leading principal minors are : 1 1, 2 89.6, 3 3046.4, 4 39526.4 Damped oscillations are obtained when the acceleration coefficient lies in the closed interval 0, .5 .
17 18
19
The integral correction is rarely used alone.
Optimal Economic Stabilization Policy under Uncertainty
469
7. References Allen, R.G.D. (1955). The engineer’s approach to economic models, Economica, Vol. 22, No. 86, 158–168 Aström, K. J. (1970). Stochastic Control Theory, Dover Publications Inc., Mineola, N.Y. Bertsekas, D.P. (1987). Dynamic Programming: Deterministic and Stochastic Models, Prentice Hall Inc., Englewood Cliffs, N.J. Braae, M. & Rutherford, D.-A. (1978). Fuzzy relations in a control setting, Kybernetes, Vol. 7, 185–188 Brainard, W.C. (1967). Uncertainty and the effectiveness of policy, The American Economic Review Proceedings, Vol. 57, 411–425 Chopra, S.; Mitra, R. & Kumar, V. (2005). Fuzzy controller: choosing an appropriate and smallest rule set, International Journal of Computational Cognition, Vol. 3, No. 4, 73–79 Chow, G.C. (1972). How much could be gained by optimal stochastic control policies, Annals of Economic and Social Measurement, Vol. 1, 391–406 Chow, G.C. (1973). Problems of economic policy from the viewpoint of optimal control, The American Economic Review, Vol. 63, No. 5, 825–837 Chung, B.-M. & Oh, J.-H. (1993). Control of dynamic systems using fuzzy learning algorithm, Fuzzy Sets and Systems, Vol. 59, 1-14 Cominos, P. & Nunro, N. (2002). PID Controllers: recent tuning methods and design to specication, IEEE (The Institute of Electrical and Electronics Engineers), Proceedings in Control Theory, Vol. 149, No. 1, 46–53 Gabisch, G. & Lorenz, H.W. (1989). Business Cycle Theory: A Survey of Methods and Concepts, Springer-Verlag, New York Gandolfo, G. (1980). Economic Dynamics: Methods and Models, North- Holland, Amsterdam Goodwin, R.M. (1951). The nonlinear accelerator and the persistence of business cycles, Econometrica, Vol. 19, No. 1, 1–17 Hall, S.G. & Henry, S.G.B. (1988). Macroeconomic Modelling, North- Holland, Amsterdam Holbrook, R. (1972). Optimal economic policy and the problem of instrument instability, The American Economic Review, Vol. 62, No. 1, 57–65 Howrey, E.P. (1967). Stabilization policy in linear stochastic models, Review of Economics and Statistics, Vol. 49, 404-411 Kamien, M.I. & Schwartz, N.L. (1991). Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management, 2nd edition, North-Holland, New York Kendrick, D.A. (2002). Stochastic Control for Economic Models, 2nd edition, version 2.00, The University of Texas, Austin, Texas Kitamoto, T.; Saeki, M. & Ando, K. (1992). CAD Package for Control System on Mathematica, IEEE, (The Institute of Electrical and Electronics Engineers), 448–451 Klir, G.J. & Yuan, B. (1995). Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall Inc., Upper Saddle River, N.J. Kosko, B. (1992). Neural Networks and Fuzzy Systems: a Dynamical Systems Approach to Machine Intelligence, Prentice Hall Inc., Englewood Cliffs, N.J. Lee, C.C. (1990). Fuzzy logic in control systems: fuzzy logic controller-Part I,IEEE Transactions on Systems, Man, and Cybernetics, Vol. 20, No. 2, 404–418 Leland, H.E. (1974). Optimal growth in a stochastic environment, Review of Economic Studies, Vol. 125, 75–86
470
Advanced Technologies
Lutovac, M.D.; Tošić, D.V. & Evans, B.L. (2001). Filter Design for Signal Using MATLAB and MAT HEMATICA, Prentice Hall, Upper Saddle River, N.J. Nerlove, M.; Grether, D.M. & Carvalho, J.L. (1979). Analysis of Economic Time Series: a Synthesis. Academic Press, New York, San Francisco, London Passino, K.M. & Yurkovich, S. (1998). Fuzzy Control, Addison- Wesley, Menlo Park, Cal. Phillips, A.W. (1954). Stabilisation policy in a closed economy, Economic Journal, Vol. 64, No. 254, 290–323 Phillips, A.W. (1957). Stabilisation policy and the time-forms of lagged responses, Economic Journal, Vol. 67, No. 266, 265–277 Pitchford, J.D. & Turnovsky, S.J. (1977). Applications of Control Theory to Economic Analysis, North-Holland, Amsterdam Poole, W. (1957). Optimal choice of monetary policy instruments in a simple stochastic macro-model, Quarterly Journal of Economics, Vol. 84, 197–216 Preston, A.J. (1974). A dynamic generalization of Tinbergen’s theory of policy, Review of Economic Studies, Vol. 125, 65–74 Preston, A.J. & Pagan, A.R. (1982). The Theory of Economic Policy: Statics and Dynamics. Cambridge University Press, Cambridge Rao, M.J. (1987). Filtering and Control of Macroeconomic Systems, North-Holland, Amsterdam Sage, A.S. & White, C.C., III. (1977). Optimum System Control, 2nd edition, Prentice Hall Inc., Englewood Cliffs, N.J. Shone, R. (2002). Economic Dynamics: Phase Diagrams and their Economic Application, 2nd edition, Cambridge University Press, Cambridge Sørensen, P.B. & Whitta-Jacobsen, H J. (2005). Introducing Advanced Macroeconomics : Growth & Business Cycles, The McGrawHill Co., London Stachowicz, M.S. & Beall, L. (2003). MATHEMATICA Fuzzy Logic, Wolfram Research. http://www.wolfram.com The MathWorks Inc. (2008). User’s Guide: "MATLAB 7", "Simulink 7", "Simulink Control Design", "Control System Design", "Fuzzy Logic Toolbox 2" Turnovsky, S.J. (1973). Optimal stabilization policies for deterministic and stochastic linear economic systems, Review of Economic Studies, 40, 79-96l Turnovsky, S.J. (1974). The stability properties of optimal economic policies, The American Economic Review, Vol. 64, No. 1, 136–148 Turnovsky, S.J. (1977). Macroeconomic Analysis and Stabilization Policy, Cambridge University Press, Cambridge Tustin, A. (1953). The Mechanism of Economic Systems, William Heinemann Ltd, London. Wolfram, S. (2003). The MATHEMATICA Book, 5th edition, Wolfram Media Inc., Champaign Ill., http://www.wolfram.com Ying, H. (2000). Fuzzy Control and Modeling: Analytical Foundations and Applications, IEEE (The Institute of Electrical and Electronics Engineers) Press, New York
New Model and Applications of Cellular Neural Networks in Image Processing
471
25 X
New Model and Applications of Cellular Neural Networks in Image Processing Radu Matei
Technical University “Gh.Asachi” of Iasi Romania 1. Introduction The cellular neural network (CNN) in its standard form as defined by Chua and Yang has become a paradigm of analog, high-speed computation, with various applications in parallel signal processing, especially in image processing tasks. Within two decades this domain has extensively developed both theoretically and from the applications point of view. Although the mainstream research basically followed the standard CNN model, many other CNN models and architectures were proposed, envisaging new and more complex classes of applications. We propose here an alternative CNN model, inspired from Hopfield-type recurrent networks. This new model has the same general architecture as the standard model, while the cell structure, preserving the basic elements, is modified from a dynamical point of view. The cell non-linearity, having a major role in CNN dynamics, no longer applies only to the current state variable as it does in the standard model, but to the whole information fed to the current cell, thereby leading to a different dynamic behavior. The new cell structure and state equations are presented and the dynamics and equilibrium points are analyzed. The advantages of this model, which we call “CNN with overall non-linearity”, are a simpler design of the templates, and the fact that all voltages in the cells are limited to a fixed dynamic range regardless of template parameters; thus the boundedness problem regards the current range instead of voltage, which makes it less critical from an implementation point of view. The image processing capabilities of the proposed CNN model will be approached. Some new image processing tasks are proposed. We first analyze the behavior of the proposed CNN in processing binary images, then using socalled local rules we design the templates for several specific tasks, which are: convex corner detection, erasing thin lines, detection of closed curves, finding all paths between two selected points through a labyrinth, rotation detector, marked object reconstruction and finding the intersection points of one-pixel thin lines from two superposed binary images. Many other processing tasks both on binary or grayscale images are possible. Further research may focus on extending the range of applications of the proposed model. Next some applications of CNNs in image linear filtering will be approached. In this range of applications, the cells of the CNN operate in the linear domain, and the system
stability must be ensured. The design of CNN linear filters with high selectivity may lead in general to large-size templates. Since local connectivity is an essential restriction in CNNs, the decomposition of a large-size template into small-size templates is often necessary. Fast linear filtering using parallel computing systems like CNNs may be useful in early-vision applications and pre-processing tasks. We will present some original and efficient design methods for several types of two-dimensional filters: circular, maximally-flat, elliptically-shaped, orientation-selective filters etc. We also propose a class of 2D linear filters which are designed starting from a desired shape in the frequency plane, described in polar coordinates. Using this method, some interesting filters can be designed, such as concave-shaped filters presenting narrow lobes in the frequency plane. A particular class are the zero-phase 2D filters, which are extensively used in image processing due to the absence of phase distortions. For all the proposed filters, the design starts from an imposed 1D prototype and uses rational approximations and various frequency transformations to obtain the 2D frequency response. The resulting filters are efficient, of low complexity and at the same time of relatively high selectivity. The design methods are entirely analytical, but they could be combined with numerical optimization techniques. The filters can be implemented either digitally or on analog processing systems like CNNs. Further research also envisages an efficient and reliable implementation of this class of filters.
2. Cellular Neural Networks with Overall Nonlinearity

2.1 Equations and Circuit Model

The state equation of the current cell i for a standard CNN has the form (Chua & Yang, 1988):

$$\frac{dx_i}{dt} = -x_i + \sum_{j \in N_i} A_j f(x_j) + \sum_{k \in N_i} B_k u_k + I_i$$   (1)

and the output is given by the piece-wise linear function:

$$y_i = f(x_i) = 0.5\left(|x_i + 1| - |x_i - 1|\right)$$   (2)

Here a new version of CNN is proposed (Matei, 2001), with a modified structure as compared to the standard one. The difference lies in the form of the state equation:

$$\frac{dx_i}{dt} = -x_i + f\!\left(\sum_{j \in N_i} A_j x_j + \sum_{k \in N_i} B_k u_k + I_i\right)$$   (3)
For simplicity we used a single index for the cells, as in the case of 1D CNNs. Unlike the standard CNN, in this case the nonlinear function applies to the entire information gathered from the neighboring cells, i.e. state variables (x_j), inputs (u_k) and the bias term (I_i). For this reason, we have called this a CNN with “overall non-linearity”. This CNN version was inspired by the well-known dynamical model of a neuron, which is the elementary unit of so-called recurrent neural networks or Hopfield networks. Within this framework, the standard CNN can be regarded as a particular Hopfield network with local connectivity. The neuron's output is related to the current state (“activation potential”) through a nonlinear activation function with various shapes (logistic, piece-wise linear, etc.). There are two basic dynamical neuron models, related by a linear and generally invertible transformation (Haykin, 1994). The proposed CNN corresponds to the second recurrent model. Therefore, the proposed CNN model and the Chua–Yang model are equivalent, being described by similar equations if a linear transformation is applied. Let us denote by z_i the argument of the piece-wise linear function f in (3):

$$z_i = \sum_{j \in N_i} A_j x_j + \sum_{k \in N_i} B_k u_k + I_i = \sum_{j \in N_i} A_j x_j + k_i$$   (4)
where the inputs and bias are constant and were put together into the term k_i. Multiplying both members of (3) by A_j, summing over i ∈ N_i (N_i denotes the current cell neighborhood) and shifting again the indices i and j, we finally obtain (Matei, 2001):

$$\frac{dz_i}{dt} = -z_i + \sum_{j \in N_i} A_j f(z_j) + \sum_{k \in N_i} B_k u_k + I_i$$   (5)

having a form similar to (1), but in the new variable z_i. Equation (4) defines the linear
transformation which relates the two CNN models.
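Before moving to the circuit-level description, a brief numerical illustration may help. The sketch below is our own (not part of the original text): it integrates the matrix form of (3) with a forward Euler scheme, using scipy for the 2D template sums; the templates, bias, step size and random test input are assumptions chosen only for demonstration. Because each Euler step is a convex combination of the previous state and the bounded value of f, a state started inside [−1, 1] never leaves it.

```python
# Sketch: forward-Euler simulation of the CNN with overall nonlinearity, eq. (3).
# All template/parameter values below are illustrative assumptions.
import numpy as np
from scipy.signal import convolve2d

def pwl(z):
    # piece-wise linear function f(z) = 0.5*(|z + 1| - |z - 1|), eq. (2)
    return 0.5 * (np.abs(z + 1.0) - np.abs(z - 1.0))

def run_modified_cnn(A, B, I, U, X0, dt=0.05, steps=2000):
    # integrate dx/dt = -x + f(A*x + B*u + I); for symmetric templates the
    # correlation sums of (3) coincide with 2D convolutions
    X = X0.astype(float).copy()
    for _ in range(steps):
        Z = (convolve2d(X, A, mode="same", boundary="symm")
             + convolve2d(U, B, mode="same", boundary="symm") + I)
        X += dt * (-X + pwl(Z))   # f wraps the whole drive, not just the state
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    U = rng.integers(0, 2, size=(32, 32)) * 2.0 - 1.0   # random bipolar image
    A = np.array([[0.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0]])                     # assumed templates
    B = np.ones((3, 3)) / 9.0
    X = run_modified_cnn(A, B, I=0.0, U=U, X0=np.zeros_like(U))
    print("final state range:", X.min(), X.max())       # stays within [-1, 1]
```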
2.2 System and Circuit Description

The modified CNN will be analytically described from a dynamical point of view, as well as at circuit level. Figure 1 shows a block diagram representation of the dynamic behavior of the modified CNN, which can be compared to the standard CNN block diagram (Matei, 2001). In this block diagram all the variables are vectors or matrices. The correlation sums in the state equation can be written as convolutions. The variable z at the input of the nonlinear block is:

$$z = A \otimes x + B \otimes u + I$$   (6)

and the matrix state equation of the modified CNN will be:

$$\frac{dx}{dt} = -x + f(A \otimes x + B \otimes u + I)$$   (7)
Fig. 1. (a) Block diagram of proposed CNN model; (b) Circuit structure of a cell

Unlike the standard CNN case (where the nonlinear block is placed in the feedback loop), in the modified CNN it operates in the forward path. Here we must emphasize an important feature of the proposed CNN: the output is identical to the state variable, y(t) = x(t), throughout the operating domain – a significant advantage in CNN implementation. In Figure 1(b) we present the cell circuit of the modified CNN, which preserves the same elements as the standard CNN, but has a different structure. The major difference between the two structures lies in the location of the nonlinear element. In the standard CNN case, the nonlinear function applies to the state variable x_i, which is a voltage, and gives the output voltage y_i = f(x_i). In the modified CNN equation (3), the non-linearity applies to the amount denoted z_i, summing up the entire information from the states and the inputs of the neighboring cells. The variable z_i is a current (given by several linear controlled current sources), and f(z_i) is a current as well. For the piecewise-linear current-controlled current source we can write I_yx = f(z_i). The resistor denoted R_y has the single role of summing the currents given by the linear sources, while in the standard CNN cell it converts voltage into current. Unlike the Chua–Yang model, in this case the nonlinear current source feeds the current directly into the capacitor C and the resistor R_x. On the cell circuit, considering normalized values for C and R_x, we can also derive the state equation (3).

2.3 Dynamics and Equilibrium Points

Using the Lyapunov method and the linear mapping (4) it can be proven that the modified CNN also has the property of convergence to a constant steady state following any transient. The stability properties are also established in Hopfield network theory (Haykin, 1994).
For standard CNNs, the state equation (1) is linear as long as all the state variables are less than one in absolute value, x_i ∈ [−1, 1]. This domain defines the linear region of the state space. For the modified CNN, the linear region of the state space in the variable x_i depends on the feedback state information, the input information and the bias term, as (3) shows. In the linear region of the state space, as long as no cell has reached saturation, equations (1) and (3) have a similar form, since in this domain we have f(x) = x, which implies that the linear dynamics of the standard CNN and the modified CNN are identical. The difference lies in the way in which the two types of CNNs enter the nonlinear region and reach equilibrium. In the following paragraph we proceed to an analysis of the dynamic behavior and a study of the equilibrium points of the modified CNN. We will use the so-called driving-point plot, which gives the state derivative dx_i/dt vs. the state variable x_i (Chua & Yang, 1988). We will emphasize the dynamic routes and the possible equilibrium points the system can reach. Using equation (5), which corresponds to a standard CNN in the state variable z_i, we can first represent the driving-point plot (dz_i/dt, z_i) in the new variable z_i, as in Figure 2(a).
The drawing of this plot is shown in detail in (Matei, 2001). In order to derive the driving-point plot in the original variable x_i, we use the linear mapping (4) in the form:

$$z_i(t) = p\,x_i(t) + g_i(t)$$   (8)

$$g_i(t) = \sum_{j \in N_i,\, j \neq i} A_j x_j + \sum_{k \in N_i} B_k u_k + I_i = \sum_{j \in N_i,\, j \neq i} A_j x_j + k_i$$   (9)
In (8), p = A_0 is the central element of the feedback template. Supposing for the moment a fixed, constant value for g_i(t), denoted g_i, we obtain for p ≠ 0:

$$x_i(t) = \frac{1}{p}\left(z_i(t) - g_i\right)$$   (10)

$$\frac{dz_i}{dt} = p\,\frac{dx_i}{dt}$$   (11)
The case p = 0 should be analyzed separately. Through a simple graphical construction shown in Figure 2(a) we can draw the corresponding driving-point plot in the original variable x_i. This can be done using two auxiliary plots: (a) the plot (x_i, z_i), given by (10), represented by a straight line of slope 1/p and offset −g_i/p; (b) the plot (dz_i/dt, dx_i/dt), given by (11), consisting of a straight line of slope p crossing the origin.
The plots depicted in Figure 2(b) were drawn for the particular values p = 2, g_i = 0.5. The segment A₁B₁ with slope (p − 1), given by:

$$\frac{dz_i}{dt} = (p - 1)\,z_i + g_i$$   (12)

has an identical form in x_i:

$$\frac{dx_i}{dt} = (p - 1)\,x_i + g_i$$   (13)

The points A₁, B₁, given by the coordinates A₁(−1, −(p−1) + g_i), B₁(1, (p−1) + g_i), become:
$$A\left(\frac{-1 - g_i}{p},\ \frac{-(p-1) + g_i}{p}\right), \qquad B\left(\frac{1 - g_i}{p},\ \frac{(p-1) + g_i}{p}\right)$$   (14)
The lines A₁D₁ and B₁C₁, described by the equations:

$$\frac{dz_i}{dt} = -z_i + p + g_i, \qquad \frac{dz_i}{dt} = -z_i - p + g_i$$   (15)

become AD and BC respectively, given by:

$$\frac{dx_i}{dt} = -x_i + 1, \qquad \frac{dx_i}{dt} = -x_i - 1$$   (16)
Fig. 2. (a) Graphical construction for determining the driving-point plot; (b) Driving-point plot, dynamic routes and equilibrium points for different values of g_i

In the diagram (dx_i/dt, x_i) the points C and D result from the intersection of the straight lines BC and AD with the x_i axis, imposing the condition dx_i/dt = 0. We finally obtain x_i(C) = −1 and x_i(D) = 1. Points C(−1, 0) and D(1, 0) will therefore be fixed equilibrium points of the modified CNN, independent of the values of p and g_i. For different values of the amount g_i, the diagram (dz_i/dt, z_i) shifts up or down. Correspondingly, the (dx_i/dt, x_i) plot slides along the two parallel dashed lines crossing the two fixed points C and D. In Figure 2(b) one typical driving-point plot is shown with a continuous line and the dynamic routes are indicated by arrows. The number of equilibrium points and their nature for different values of p and g_i are analyzed in (Matei, 2001).

2.4 Dynamic Range of Modified CNNs

For standard CNNs the following result has been proven in (Chua & Yang, 1988):
Theorem: All states x_i in a CNN are bounded for t ≥ 0, and the bound v_max is given by:

$$v_{\max} = 1 + |I| + \max_i\left(\sum_i |A_i| + \sum_i |B_i|\right)$$   (17)
if we consider the normalized resistance value R_x = 1. The possibility of estimating the maximum dynamic range is very important from the implementation point of view, because it allows one to optimally design the circuit parameters in order to keep the voltage values within reasonable bounds. In (Rodriguez-Vasquez et al., 1993) a new CNN model, called the "full-range" model, was introduced, adding a nonlinear term to the state equation which confines the state value to the range [−1, 1]. This simplifies the design, allowing for reduced area and power consumption in VLSI implementation. Our approach gives in fact another solution to this problem, forcing the state and output values to be equal. Therefore, the proposed CNN with "overall non-linearity" is in fact a "full-range" model as well. In the modified CNN, this problem is solved by the form of the state equation itself. Indeed, at any equilibrium point, the current cell state variable is x_i = f(z_i), and consequently its variation is bounded to the range [−1, 1]. Moreover, the output variable is identical to the state variable at any time: y_i(t) = x_i(t).
Referring again to the circuit structure, the argument z_i(t) of the nonlinear function f(z_i) given by (4) is physically a current. However, since equation (5) is similar to the state equation of a standard CNN cell, but written in the variable z_i, equation (17) may be formally applied to evaluate a maximum bound I_max for the cell currents. Therefore, in the case of the proposed CNN with overall nonlinearity, the boundedness problem regards the current range instead of the voltage range, which makes it less critical from an implementation point of view. All voltage variations in the modified CNN are limited to a fixed range corresponding to the extreme pixel values, regardless of template parameters.
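As a small numerical aside (our own, with assumed template values), the bound (17) is immediate to evaluate for a space-invariant template pair; in the modified CNN the same number bounds the cell current rather than a voltage.

```python
# Sketch: evaluating the Chua-Yang bound of eq. (17) with R_x = 1.
import numpy as np

def dynamic_range_bound(A, B, I):
    # v_max = 1 + |I| + sum|A| + sum|B| (space-invariant template case)
    return 1.0 + abs(I) + np.abs(A).sum() + np.abs(B).sum()

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 0.0]])        # assumed feedback template
B = np.zeros((3, 3)); B[1, 1] = 4.0    # assumed control template
print(dynamic_range_bound(A, B, I=-1.0))   # 1 + 1 + 6 + 4 = 12
```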
3. Applications of CNNs with Overall Nonlinearity

In this section we approach the image processing capabilities of the proposed CNN model. Some processing tasks will be proposed on binary (black and white) images. We first analyze the behavior of the proposed CNN model in processing binary images, then we design the templates for a few specific tasks, namely: convex corner detection, erasing thin lines, detection of closed curves, finding all paths between two selected points through a labyrinth, rotation detection, marked object reconstruction and finding the intersection points of one-pixel thin lines from two superposed binary images. Many other processing tasks on binary or grayscale images are possible. A vast reference for different classes of standard CNN tasks is the CNN Software Library, a template catalog for a large range of analog processing operations on binary and grayscale images. These also include more complex analog algorithms which consist of sequences of elementary tasks fulfilled under digital control. We use here the convention that the pixel value −1 corresponds to white, and the pixel value +1 corresponds to black. Generally, one binary image is loaded into the CNN as initial state,
while another binary image may be applied to the array input. In this range of applications the system must behave as a bipolar CNN. A standard CNN is said to be bipolar if u, y ∈ B^{M×N}, x(0) ∈ B^{M×N}, where B = {−1, 1} and M, N are the number of rows and columns of the CNN array (Hänggi & Moschytz, 2000). Based on the information contained in the two images, the CNN must accomplish a given task upon the state image, according to a set of local rules for which the feedback and control templates are designed. If a given binary image is loaded as initial state, this implies that all cells are initially saturated. Depending on the chosen template parameters, the initial state of a given cell may be stable or unstable. If the state is stable, the corresponding pixel will remain unchanged (black or white); if on the contrary the state is unstable, the pixel may switch to the opposite state, following a transient evolution. Taking into account that we want to obtain a binary output image in response to the processed image, we impose that the cells which are initially unstable switch into the opposite state; for a generic cell C(i, j) this requires ẋ_ij(0) ≠ 0. These cells will switch into the opposite saturated state, which is supposed to be stable according to the local rules. The switching of a cell between two saturated states implies a dynamic evolution through the middle linear region. As stated from the beginning in (3), the dynamics of the current cell is described by:
$$\frac{dx_i(t)}{dt} = -x_i(t) + f\!\left(z_i(t)\right)$$   (18)
Suppose that a given pixel is initially black, i.e. x_i(0) = 1. If it is to remain black, this implies ẋ_i(0) = 0, and from (18) we must have f(z_i(0)) = 1, therefore z_i(0) ≥ 1. If the black pixel must switch to white, we impose z_i(0) ≤ −1, so f(z_i(0)) = −1; the initial state derivative is:

$$\dot{x}_i(0) = -x_i(0) - 1 = -2 < 0$$   (19)

The cell state value decreases exponentially from 1 to −1:

$$x_i(\infty) = -1, \qquad \dot{x}_i(\infty) = -x_i(\infty) - 1 = 0$$   (20)
therefore the cell finally reaches a stable equilibrium point. In the following paragraphs we present several tasks on binary images in order to show the image processing capabilities of the proposed CNN model with overall nonlinearity. For each of these tasks, the templates are designed and simulation results are provided.

3.1 Convex Corner Detection

Although convex corner detection on binary images is implemented on standard CNNs as well, we show here the template design method for the proposed model. We assume a feedback template A of the general symmetric form:

$$A = \begin{bmatrix} s & s & s \\ s & p & s \\ s & s & s \end{bmatrix}$$   (21)
The first step is to find the local rules according to the desired result. We have to take into account all the possible combinations of black and white pixels within a 3×3 frame. Although the total number of combinations is quite large, due to template A symmetry the number of distinct combinations is largely reduced. All inner black pixels must turn white, while any edge pixel must remain black if it locates a convex corner, or turn white otherwise. In Figure 3(a), the combinations (a)–(d) represent convex corners, while cases (e)–(h) do not. Assuming a negative value for s, we obtain a system of inequalities which reduces to the set:

$$p - 2s + I \ge 1, \qquad p + I \ge 1, \qquad p + 8s + I \le -1$$   (22)
Finding a solution to the above system is a linear programming problem which can be solved graphically or numerically. We finally reach the set of parameters: p = 7.5; s = −1.5; I = −6. A simulation of this task is shown in Figure 3(b), (c). The result of convex corner detection, obtained from the test image (b) loaded as initial state, is shown in (c). We remark that all convex corners are correctly detected, while the concave corners do not appear in the output image. A thin line emerging from a convex corner is also detected.
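The task can be reproduced numerically with the same Euler integration sketched in Section 2; the run below is our own illustration, with an assumed solid test square, the template (21) and the parameter set derived above.

```python
# Sketch: convex corner detection with the template of eq. (21).
import numpy as np
from scipy.signal import convolve2d

def pwl(z):
    return 0.5 * (np.abs(z + 1.0) - np.abs(z - 1.0))

p, s, I = 7.5, -1.5, -6.0
A = np.array([[s, s, s],
              [s, p, s],
              [s, s, s]])                          # template (21)

X = -np.ones((24, 24)); X[6:18, 6:18] = 1.0        # black square on white
for _ in range(400):                               # Euler integration of (3)
    Z = convolve2d(X, A, mode="same", boundary="fill", fillvalue=-1.0) + I
    X += 0.05 * (-X + pwl(Z))
print(int((X > 0).sum()), "black pixels remain")   # ideally only the 4 corners
```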
Fig. 3. (a) Possible pixel combinations in convex corner detection; (b) Image loaded as initial state; (c) Output image containing detected corners

3.2 Erasing Thin Lines

We intend to design a template which erases every thin line (one-pixel thick) from a binary image, leaving all other objects unchanged. For example, the test image in Figure 4(a) is loaded into the CNN as initial condition. We look for the feedback template A in the form (21), this time assuming s > 0; writing the local rules, we reach the following system:

$$p - 4s + I \le -1, \qquad p - 2s + I \ge 1, \qquad p + I \ge 1$$   (23)
and we choose a convenient set of parameter values: p = 8; s = 2; I = −2. The CNN input is not used. The final result of the processing is shown in Figure 4(c). We notice that the two thin curves on the left, as well as the spiral and the lines emerging from the black square, have been erased. The lines perpendicular to the square edges may leave a single residual pixel. In Figure 4(b) a snapshot of an intermediate state is shown, in which the progressive
deletion of thin lines is noticeable. The compact objects present in the image remain unchanged.

3.3 Detection of Closed Curves

In some applications it might be useful to discern between closed and open curves. These curves may in turn result from a previous processing, such as edge detection. We design a CNN task which detects the one-pixel thin closed curves and deletes the open curves from a binary image. The absence of at least one pixel from a closed curve turns it into an open curve. Both the input and the initial state are used. The same binary image P is loaded as initial state and applied to the input: U(t) = P; X(0) = P. The templates are chosen as:

$$A = \begin{bmatrix} s & s & s \\ s & p & s \\ s & s & s \end{bmatrix}; \qquad B = \begin{bmatrix} r & r & r \\ r & p & r \\ r & r & r \end{bmatrix}; \qquad r = 0.5s$$   (24)
Applying the corresponding local rules, the set of inequalities is derived:

$$2p + 2s + I \ge 1, \qquad 2p + 3s + I \le -1, \qquad 2p + 2.5s + I \le -1$$   (25)
A convenient choice of parameters is: p = 9; s = −6; I = −4.5. In this task we observed the 8-pixel connectivity, i.e. the current black pixel is considered connected to any other black pixel lying in its 3×3 neighborhood. A simulation is shown in Figure 5. The test image (a) contains six thin curves, which feature all types of possible edges and corners. There is also a compact concave object in the image. Image (b) shows the CNN state at an intermediate moment, and the final image (c) contains just the closed curves and the compact object, which was preserved. The process of erasing open curves relies on a dynamic propagation starting from the free ends. The black pixels at the ends turn white according to the imposed local rules, until the entire curve is erased.
Fig. 4. (a) Binary image; (b) Transient state; (c) Steady-state output image
Fig. 5. Closed curve detection: (a) binary image; (b) transient state; (c) output image

3.4 Finding All Paths Between Two Selected Points Through a Labyrinth

A processing task related to the previous one is presented below. We consider a labyrinth made up of white one-pixel thin curves against a black background. In this case we suppose valid the convention that one pixel is possibly connected only to four neighbors (up, down, left, right). The target of the proposed task is to find all the existing paths between two given points situated in the labyrinth, which are marked by white pixels in the input image. A point is represented by one pixel, either on the input or the state image. The binary image P representing the labyrinth is loaded into the CNN as initial state (X(0) = P). The input image may be zero everywhere (average grey level) except for two
white pixels (of value −1) which mark the points selected in the labyrinth. The dynamic process which finds the routes between the chosen points is similar to the one used in application 3.3. The template parameters are designed such that all labyrinth routes with unmarked ends are gradually erased, starting from the end pixels, which turn from white to black (cell outputs switch from −1 to +1). The templates are searched in the form:

$$A = \begin{bmatrix} r & s & r \\ s & p & s \\ r & s & r \end{bmatrix}; \qquad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & 0 \end{bmatrix}$$   (26)
and the following system results:

$$-p + 4r + I \le -1, \quad -p + 2s + 4r + I \ge 1, \quad -p + 2s + 2r + I \ge 1, \quad -p + 2s + 4r + I - a \le -1$$   (27)
with convenient parameter values: p = 12; s = 4; r = 0.5; a = 8; I = 8. In Figure 6 a simulation example is shown. The input image (b) contains two white pixels on a uniform gray background (zero pixel value). The two pixels locate on the labyrinth image (a) (loaded as initial state) the points between which the routes are selected. At the end of the dynamic propagation shown in (c), the output image (d) results. More generally, routes can be found among several points in the labyrinth, or a single point is chosen and we find all the closed routes containing that point. The classical problem of finding the shortest path through a labyrinth is more complex, and it can be solved through a
CNN algorithm using several nonlinear templates, which are run sequentially under digital control.
Fig. 6. (a) Labyrinth image as initial state; (b) input image; (c) transient state; (d) output image containing detected routes between selected points

3.5 Erasing Objects with Tilted Edges (Rotation Detector)

We will design a CNN task which detects in a binary image the compact convex or concave objects having only horizontal and vertical edges, and removes all tilted objects or objects having at least one tilted edge. In particular, the CNN can detect the rotation of such objects. This is based on the discrete representation on an array of a straight line, in particular the edge of a compact object, with a certain slope. Due to the limited spatial resolution of the CNN array, any tilted edge will be mapped as a stair-like edge. An edge with the minimum angle will map into a staircase with only two stairs, while an edge with a slope of π/4 is
mapped into a staircase with a maximum number of one-pixel stairs. The binary image P is loaded into the CNN as initial state (X(0) = P) and applied at the CNN input as well, U(t) = P. We look for the templates in the general forms:

$$A = \begin{bmatrix} r & s & r \\ s & p & s \\ r & s & r \end{bmatrix}; \qquad B = \begin{bmatrix} 0.5r & 0.5s & 0.5r \\ 0.5s & p & 0.5s \\ 0.5r & 0.5s & 0.5r \end{bmatrix}$$   (28)
and finally the following convenient values are found: p = 5; s = 5; r = 0.8; I = 11.2.
Fig. 7. (a) Binary image; (b) Transient state; (c) Steady-state output image
A simulation is presented in Figure 7. Image (a) contains several compact objects: a rectangle and a square placed horizontally, and their rotated versions, etc. The objects with at least one tilted edge are gradually eroded through a dynamic propagation, as the transient snapshot (b) shows; the final image (c) preserves only objects with horizontal and vertical edges. Therefore, this task can be used to detect the rotation of simple objects (rectangles etc.).

3.6 Marked Object Reconstruction

A binary image P containing compact shapes or curves is applied to the CNN input. From this image we intend to select certain objects which we want to preserve, and delete the others. The selection of objects is done by marking them with a black pixel in the image loaded as initial state. Therefore at the input we have U = P and the initial state X(0)
consists of black marking pixels against a white background. These can be placed anywhere within the area of the selected object (in the interior or on the edge). The marked objects are reconstructed in the state image through a gradual process. Starting from the marking pixel, the white pixels within the marked object turn black. This propagation process stops on the edge of the object in the input image. The unmarked objects will not appear in the final image. We assume the feedback template of the form (21) and the control template zero, except for the central element, equal to a. With s > 0, p > 0, we finally obtain the system:

$$p - 8s + a + I \ge 1, \qquad -p - 8s + a + I \le -1, \qquad -p - 6s + a + I \ge 1$$   (29)
A convenient set of parameters results as: p = 3; s = 1.5; a = 12; I = 1.5. An example of selecting marked objects is shown in Figure 8. Image (a) contains six compact objects and curves. Using the marking points from (b), the objects from (d) were selected, while using the points in (e), we selected the objects as in image (f). Image (c) is an intermediate snapshot showing the reconstruction of the selected objects around the marking points.
Fig. 8. (a) Test image; (b) initial state (marking points); (c) transient state; (d) output image
3.7 Finding the Intersection Points of One-Pixel Thin Lines from Two Binary Images

We consider two binary images I₁ and I₂ containing one-pixel thin lines or curves, among
other compact objects. Superposing the two images, the objects will have overlapping areas. We propose to design the feedback and control templates such that the CNN detects only the intersection points (marked by single black pixels) of the thin lines present in the two images. All other overlapping areas will be ignored and will not appear in the output image. This task may be regarded as a logic selective AND operation, since it only applies to some objects of particular shape from the images to be processed. One of the images is loaded as initial state while the other is applied to the input: X(0) = I₁; U = I₂. Interchanging the two images (X(0) = I₂; U = I₁) yields the same result. The feedback and control templates are identical, A = B, of the form (21). For this task we get the system:

$$2p - 8s + I \ge 1, \qquad 2p + 2s + I \le -1, \qquad -14s + I \le -1$$   (30)
and a set of suitable parameter values is: p = 3; s = −0.5; I = −8.5. The described task is shown in Figure 9. The binary images (a) and (b) contain thin lines and various other objects. The first image is loaded as initial state, the second is applied to the input. The output result is image (c), which contains only the intersection points of thin lines.
Fig. 9. (a) Image I₁ as initial state; (b) Input image I₂; (c) Resulting output image
4. Applications of CNNs in Image Linear Filtering
Although cellular neural networks are basically designed for nonlinear operation, an important class of applications of CNNs in image processing is also linear filtering (Crounse & Chua, 1995). Generally, two-dimensional spatial filters are most commonly designed in the frequency domain, by imposing a set of specifications, or starting from various 1D prototype filters with given characteristics and applying frequency transformations which finally lead to the desired 2D filter. For each specific method the filter stability must also be ensured. We will approach here only the IIR spatial filters. Generally, the existing design methods of 2D IIR filters rely to a large extent on 1D analog filter prototypes, using spectral transformations from the s-plane to the z-plane via bilinear or Euler transformations, followed by z to (z₁, z₂) transformations. The most investigated were circular, elliptically-shaped, fan filters etc.
When CNNs are used as stable linear filters, the image I to be filtered is usually applied at the input (U=I) and the initial state is usually zero (X=0). The CNN cells must not reach
saturation during operation, which implies the restriction for the cell state: |x_ij(t)| ≤ 1, i, j = 1…N.
The linear CNN filter is described by the spatial transfer function (Crounse & Chua, 1995):

$$H(\omega_1, \omega_2) = B(\omega_1, \omega_2)\,/\,A(\omega_1, \omega_2)$$   (31)
where A(ω₁, ω₂), B(ω₁, ω₂) are the 2D Discrete-Space Fourier Transforms of templates A and B. An essential constraint for CNNs is local connectivity, specific to these parallel systems. Design specifications may lead to high-order 2D filters, requiring large-size templates that cannot be directly implemented. The templates currently implemented are no larger than 3×3 or 5×5. Larger templates can only be implemented by decomposing them into a set of elementary templates. The most convenient are the separable templates, which can be written as a convolution of small-size templates. As we will show, the templates of the 2D filters resulting from 1D prototypes can always be written as convolution products of small-size templates, therefore the filtering can be realized in several steps. The filtering function will be a product of elementary rational functions. For a 1D recursive CNN filter of order N, its transfer function has the following expression, where equal degrees are taken for numerator and denominator:

$$H(\omega) = \sum_{n=0}^{N} b_n \cos^n\omega \Big/ \sum_{m=0}^{N} a_m \cos^m\omega = B(\omega)\,/\,A(\omega)$$   (32)
The numerator and denominator of (32) can be factorized into first- and second-order polynomials in cos ω. For instance, the numerator can be decomposed as follows:

$$B(\omega) = k \prod_{i=1}^{n} (\cos\omega + b_i) \prod_{j=1}^{m} \left(\cos^2\omega + b_{1j}\cos\omega + b_{2j}\right)$$   (33)
with n + 2m = N (filter order). Correspondingly, the template B (1×N) can be decomposed into 1×3 or 1×5 templates. A similar expression is valid for the denominator A(ω). Coupling the factors of A(ω) and B(ω) conveniently, the filter transfer function (32) can always be written as a product of elementary functions of order 1 or 2, realized with pairs of 1×3 or 1×5 templates:

$$H(\omega) = H_1(\omega) \cdot H_2(\omega) \cdots H_k(\omega)$$   (34)
Here we will discuss zero-phase IIR CNN filters, i.e. filters with real-valued transfer functions, which correspond to symmetric control and feedback templates. In IIR spatial filtering with a 2D filter H(ω₁, ω₂), the final state can be written:

$$X(\omega_1,\omega_2) = U(\omega_1,\omega_2) \cdot H(\omega_1,\omega_2) = U(\omega_1,\omega_2) \cdot H_1(\omega_1,\omega_2) \cdot H_2(\omega_1,\omega_2) \cdots H_n(\omega_1,\omega_2)$$   (35)
The image X₁(ω₁, ω₂) = U(ω₁, ω₂)·H₁(ω₁, ω₂) resulting from the first filtering step is re-applied to the CNN input, giving the second output image X₂(ω₁, ω₂) = X₁(ω₁, ω₂)·H₂(ω₁, ω₂), and so on, until the whole filtering is achieved. The frequency response H(ω₁, ω₂) can be written:

$$H(\omega_1,\omega_2) = \prod_k H_k(\omega_1,\omega_2) = \prod_k \frac{B_k(\omega_1,\omega_2)}{A_k(\omega_1,\omega_2)}$$   (36)
A A1 A 2 Am
(37)
4.1 Design Methods for Circularly-Symmetric Filters In this section we propose an efficient design technique for 2D circularly-symmetric filters, based on 1D prototype filters. Circular filters are currently used in many image processing applications. Given a 1D filter function Hp () , the corresponding 2D filter H( 1 , 2 )
results through the frequency transformation 12 22 : H(1 , 2 ) Hp
12 22
(38)
A currently-used approximation of the function cos 12 22 is given by: cos 12 22 C( 1 , 2 ) 0.5 0.5(cos 1 cos 2 ) 0.5 cos 1 cos 2
(39)
and corresponds to the 3 3 template: 0.125 0.25 0.125 C 0.25 0.5 0.25 0.125 0.25 0.125
(40)
Let us consider a 1D filter HP () described by the symmetric templates B P , A P of radius R:
BP [ b 2
b1
b0
b1
b 2 ]
A P [ a 2
a1
a0
a1
a 2 ]
(41)
Applying DSFT and using trigonometric identities we finally get the transfer function as a ratio of polynomials in cos . Its denominator has the factorized form (with n 2m N , the filter order):
New Model and Applications of Cellular Neural Networks in Image Processing R
A p ( ) c0
ck (cos )k k
k 1
n
m
(cos ai )
i 1
(cos a 2
1j
487
cos a 2 j )
(42)
j 1
Substituting cos by C( 1 , 2 ) given by (39) in the prototype function A P () , we obtain:
A(1 , 2 ) A p
12 22 c0
R
c
k
C k (1 , 2 )
(43)
k 1
Therefore, the design of the 2D circular filter consists in substituting cos with C( 1 , 2 ) in each factor of H P () ; A(1 , 2 ) will have the form, corresponding to (42): n
m
i 1
j1
A(1 , 2 ) k C(1 , 2 ) ai C 2 (1 , 2 ) a1 j C(1 , 2 ) a 2 j
(44)
Since the 1D prototype filter function can be factorized, the 2D circularly-symmetric filter is separable, a major advantage in implementation. The large-size template A corresponding to A(1 , 2 ) results directly decomposed into elementary ( 3 3 or 5 5 ) templates: A k (C 1 ... C i ...C n ) (D1 ... D j ...D m )
(45)
If we use the 3 3 template C given in (40) to approximate the 2D cosine function, each template C i from (45) is obtained from C by adding a i to the central element. Each 5 5 template D j results as D j C C b1j C1 b2 j C 0 , where C 0 is a 5 5 zero template, with central element one; C 1 is a 5 5 template obtained by bordering B with zeros. 4.2 Maximally-Flat Zero-Phase Filters We propose a method of designing 2D spatial filters of different types starting from approximations commonly used in temporal filters. However, we will design spatial filters which will preserve only the magnitude characteristics of their temporal prototypes, while the designed filters will be zero-phase; their frequency response will be real-valued. The design of the desired 2D filters will be achieved in several steps, detailed as follows. The starting point is a discrete prototype filter of a given type and desired specifications, with a transfer function H(z) . Using this discrete filter we first derive an 1D spatial filter.
This 1D filter will inherit only the magnitude characteristics of its temporal counterpart, while its phase will be zero throughout the frequency domain, namely the range [ , ] . We consider the magnitude H() of H(z) H(e j ) . In order to find a zero-phase 1D filter transfer function which approximates H() , we can make the change of frequency variable: arccos x x cos
(46)
488
Advanced Technologies
Substituting in H( ) by arccos x , we first get a real function G(x) in variable x; next we find a convenient rational approximation for G(x) . One of the most efficient (best tradeoff between accuracy and approximation order) is the Chebyshev-Padé approximation. The main drawback of this method is that the coefficients of the rational approximation can be only determined numerically using for instance a symbolic computation software. After finding the approximation of G(x) as a ratio of polynomials in variable x, we return to the original variable by substituting back x cos , and finally we get the rational approximation of H() in the variable cos on the range [ , ] , similar to (32): N
H()
b n cos n
n 1
N
a
m
cosm B() A( )
(47)
m 1
From the numerator B() and denominator A() we directly identify the corresponding control and feedback templates, preferably of equal size. The polynomials B() and A() can be factorized according to relation (33). n
m
i 1
j1
A() k (cos a i ) (cos 2 a 1 j cos a 2 j )
(48)
with n 2m N (filter order). A similar expression is valid for B() . In the following sections we will use this technique to design several types of 2D filters: circularly-symmetric filters, elliptically-shaped filters, directional filters etc. Let us consider as prototype a type 2 Chebyshev low-pass digital filter with the following specifications: order N 6 , stop-band ripple R 40 dB , stop-band edge frequency S 0.3 (where generally S [0.0, 1.0] , with 1.0 corresponding to half the sample rate). Following the procedure described above, we finally get an approximation of the form (47). The numerator and denominator coefficients are given by the vectors: [b n ] [0.0176 0.0068 -0.0164 0.0038] [a m ] [-0.4240 1.2552 -1.1918 0.3724]
(49)
and the factorization of the numerator and denominator of the 1D prototype function is : B() 0.0176 (cos 1.261) (cos-0.2957) (cos-0.5789)
(50)
A() -0.424 (cos-1.4035) (cos 2 -1.5568 cos 0.6257)
(51)
To obtain a circularly-symmetric filter from the 1D factorized prototype function, we simply replace in (50) and (51) cos with the circular cosine function (39). For instance, the feedback template A results as the discrete convolution: A k A 11 A 12 ... A 1n A 21 A 22 ... A 2m
(52)
New Model and Applications of Cellular Neural Networks in Image Processing
489
where A 1i (i 1...n) are 3 3 templates and A 2 j ( j 1...m) are 5 5 templates, given by: A 1i C ai A 01
(53)
A 2 j C C a 1j C 0 a 2 j A 02
(54)
where A 01 is a 3 3 zero template and A 02 is a 5 5 zero template with central element one; C 0 is a 5 5 template obtained by bordering C(3 3) with zeros. The above expressions correspond to the factors in (44). In Figure 10 the frequency response and the contour plot are shown for two 2D circularly-symmetric maximally-flat filters. The first one has a narrower characteristic, corresponding to S 0.3 , while the second has a larger cutoff frequency, being derived from a Chebyshev prototype with N 6 , R 40dB , S 0.7 .
(a) (b) (c) (d) Fig. 10. (a), (c) maximally-flat circular filter frequency responses; (b), (d) contour plots 4.3 Elliptically-Shaped Oriented Filters Next we study a class of 2D low-pass filters having an elliptically-shaped horizontal section. These filters will be specified by imposing the values of the semi-axes of the ellipse, and the orientation is given by the angle of the large axis with respect to 2 axis in the frequency
plane. Starting from the frequency response for a 1D zero-phase filter given by (47), we can derive a 2D elliptically-shaped filter using the frequency transformation E (1 , 2 ) : 2 cos 2 sin 2 cos 2 2 sin E (1 , 2 ) 12 2 2 2 2 F F2 E E 1 1 12 sin(2 ) 2 2 a 12 b 22 c 12 E F
(55)
An elliptically-shaped filter results from a circular one through the linear transformation: ' 1 E 0 cos sin 1 0 F sin cos ' 2 2
(56)
490
Advanced Technologies
where E and F are the stretching coefficients ( E F ), (1 , 2 ) are the current coordinates and ('1 , '2 ) are the former coordinates. Thus, the unit circle is stretched along the axes 1 and 2 with factors E and F, then counter-clockwise rotated with an angle , becoming an oriented ellipse, which is the shape of the proposed filter in horizontal section. Consequently, given a 1D prototype filter of the general form (47), we can obtain a 2D elliptically-shaped filter, specified by the parameters E, F and which give the shape and orientation, by using the frequency transformation: 2 E (1 , 2 ) a 12 b 22 c 12
(57)
where the coefficients a, b and c are identified from relation (55). Replacing the real frequency variables 1 and 2 by the complex variables s1 j1 and s 2 j2 , the function E (1 , 2 ) can be written in the 2D Laplace domain as: E (s 1 ,s 2 ) (a s12 b s 22 c s 1s 2 )
(58)
The next step is to find the discrete approximation E (z 1 ,z 2 ) for E (s1 ,s 2 ) in (58), using the forward or backward Euler approximations, or the bilinear transform. We will use here the forward Euler method, therefore along the two axes of the plane we have: s1 z 1 1 , s 2 z 2 1 . The discrete approximations for s12 , s 22 and s1s 2 will result as (Matei, 2004): s12 z 1 z 11 2
(59)
s 22 z 2 z 21 2 1 1
1 2
1 1 2
1 1 2
2s1s 2 z 1 z z 2 z 2 z z z z
Replacing the above operators in (58), E (s1 ,s 2 ) takes the form: E (z 1 ,z 2 ) (a c 2) (z11 z 1 ) (b c 2) (z 21 z 2 ) (c 2) (z 1z 21 z 11z 2 ) (2a 2b c) (60)
After re-arranging terms and identifying coefficients of the 2D Z transform, we obtain the matrix operator E : ac 2 c 2 0 E b c 2 (2a 2b c) b c 2 ac 2 0 c 2
(61)
The matrix E depends on the orientation angle and the stretching coefficients E and F. We found the frequency transformation E : 2 , 2 E (z1 ,z 2 ) in the matrix form:
New Model and Applications of Cellular Neural Networks in Image Processing
2 z11
1 z 1 E z 21
1 z 2
T
491
(62)
As a numerical example, taking the parameter values φ = π/3, E = 2.4, F = 0.6, we get:

$$\mathbf{E}_\varphi = \begin{bmatrix} 0 & -3.2544 & 1.1277 \\ -1.9523 & 8.1581 & -1.9523 \\ 1.1277 & -3.2544 & 0 \end{bmatrix}$$   (63)
Let us consider the simple 1D zero-phase prototype filter:

$$H_P(\omega) = \frac{b_0 + b_1\,\omega^2 + b_2\,\omega^4}{1 + a_1\,\omega^2 + a_2\,\omega^4}$$   (64)
Applying the frequency transformation (62), we obtain the 2D elliptically-shaped filter:

$$H_E(z_1,z_2) = \frac{b_0 + b_1\,\omega_E^2(z_1,z_2) + b_2\,\omega_E^4(z_1,z_2)}{1 + a_1\,\omega_E^2(z_1,z_2) + a_2\,\omega_E^4(z_1,z_2)} = \frac{B_E(z_1,z_2)}{A_E(z_1,z_2)}$$   (65)
Since ω_E²(z₁, z₂) corresponds to the matrix E_φ in (61), we determine the filter templates as:

$$B_E = b_0\,\mathbf{E}_0 + b_1\,\mathbf{E}_1 + b_2\,\mathbf{E}_\varphi * \mathbf{E}_\varphi, \qquad A_E = \mathbf{E}_0 + a_1\,\mathbf{E}_1 + a_2\,\mathbf{E}_\varphi * \mathbf{E}_\varphi$$   (66)
By * we denoted matrix convolution. E₀ is a 5×5 zero matrix with the central element of value 1. The 5×5 matrix E₁ is obtained by bordering the 3×3 matrix E_φ with zeros, in order to be summed with E₀ and E_φ*E_φ. In general, for a zero-phase prototype H_P(ω) of order 2N, following the method presented above we obtain a 2D filter transfer function H_E(z₁, z₂) whose numerator and denominator correspond to templates B_E and A_E of size (2N+1)×(2N+1). For a design example, we take as 1D prototype a type-2 Chebyshev digital filter with N = 4, R_s = 40 dB and passband-edge frequency ω_p = 0.5. Its transfer function in z is:

$$H_p(z) = \frac{0.012277\,z^2 + 0.012525\,z + 0.012277}{z^2 - 1.850147\,z + 0.862316}$$   (67)
The real-valued transfer function approximating the magnitude of the filter H_p(z) is:

$$\left|H_p(e^{j\omega})\right| \cong H_{a1}(\omega) = \frac{0.940306 - 0.57565\,\omega^2 + 0.0947\,\omega^4}{1 - 2.067753\,\omega^2 + 4.663147\,\omega^4}$$   (68)
Figure 11 shows two elliptically-shaped filters based on (68), having a flat characteristic in the pass-band and a very low stop-band ripple. An advantage of this design method is
that the filter characteristics in the vertical and horizontal plane are specified independently.
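The operator (61) is convenient to generate programmatically. The sketch below (ours) builds E_φ from the angle and the stretching factors and reproduces the numerical example (63) to about four decimals.

```python
# Sketch: elliptical frequency-transformation operator, eqs. (55), (61), (63).
import numpy as np

def E_operator(phi, E, F):
    a = np.cos(phi)**2 / E**2 + np.sin(phi)**2 / F**2   # eq. (55)
    b = np.sin(phi)**2 / E**2 + np.cos(phi)**2 / F**2
    c = np.sin(2 * phi) * (1 / F**2 - 1 / E**2)
    return np.array([[0.0,        -(a + c/2),    c/2       ],
                     [-(b + c/2),  2*a + 2*b + c, -(b + c/2)],
                     [c/2,        -(a + c/2),    0.0       ]])

print(np.round(E_operator(np.pi / 3, 2.4, 0.6), 4))
# entries agree with eq. (63) to about 1e-4
```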
Fig. 11. Frequency response and contour plots of two elliptically-shaped filters: (a), (b) φ = π/3, E = 2.8, F = 1; (c), (d) φ = π/3, E = 2.4, F = 0.6

4.4 Directional Filters

There are several classes of 2D filters with orientation-selective frequency response, which find applications in many image processing tasks, such as edge detection, motion analysis, texture analysis etc. For instance, steerable filters are synthesized as a linear combination of a set of basis filters (Freeman & Adelson, 1991). Another important class are the Gabor filters, efficiently implemented also on CNNs (Shi, 1998), with useful applications in image processing and computer vision. A particular class of filters, which can also be implemented on CNNs, are directional (orientation-selective) filters, which select narrow domains along specified directions in the frequency plane. Several design examples of 2D recursive directional filters with an arbitrary orientation were given in (Matei, 2004). They can be used in selecting lines with a given orientation in the input image. The spatial orientation is specified by an angle φ with respect to the ω₁-axis and is defined by
the 1D-to-2D spatial frequency transformation ω → ω₁cos φ + ω₂sin φ. Considering a 1D zero-phase prototype filter H(ω) and using this substitution, we obtain the transfer function of the oriented filter H_φ(ω₁, ω₂):

$$H_\varphi(\omega_1,\omega_2) = H(\omega_1\cos\varphi + \omega_2\sin\varphi)$$   (69)
The 2D oriented filter H_φ(ω₁, ω₂) has the magnitude along the line ω₁cos φ + ω₂sin φ = 0 identical with the prototype H(ω), and is constant along the line ω₁sin φ − ω₂cos φ = 0 (longitudinal axis). The design of an oriented filter based on a 1D prototype relies on finding a realization of the 2D oriented cosine function, which depends on the orientation angle φ:

$$C_\varphi(\omega_1,\omega_2) = \cos(\omega_1\cos\varphi + \omega_2\sin\varphi)$$   (70)
Next we will determine a convenient 1D-to-2D complex transformation which allows an oriented 2D filter to be obtained from a 1D prototype filter. We will treat here the zero-phase filters, which are often desirable in image processing tasks. An accurate rational approximation (Chebyshev–Padé) of the cosine for ω ∈ [−π√2, π√2] is:
$$\cos\omega \cong \frac{1 - 0.4477542\,\omega^2 + 0.0182484\,\omega^4}{1 + 0.0416942\,\omega^2 + 0.0024164\,\omega^4}$$   (71)
The range of approximation [−π√2, π√2] was required due to the fact that in the frequency plane (ω₁, ω₂) the largest value of the frequency for the prototype filter is π√2. Next we have to find a discrete approximation for the expression (70). Denoting f_φ(ω₁, ω₂) = (ω₁cos φ + ω₂sin φ)², we obtain using (70) and (71):

$$C_\varphi(\omega_1,\omega_2) = \frac{1 - 0.447754\, f_\varphi(\omega_1,\omega_2) + 0.018248\, f_\varphi^2(\omega_1,\omega_2)}{1 + 0.041694\, f_\varphi(\omega_1,\omega_2) + 0.002416\, f_\varphi^2(\omega_1,\omega_2)}$$   (72)
f_φ(ω₁, ω₂) can be written in the 2D Laplace domain, in the complex variables s₁ = jω₁, s₂ = jω₂, as:

$$f_\varphi(s_1,s_2) = -(s_1\cos\varphi + s_2\sin\varphi)^2 = -\left(s_1^2\cos^2\varphi + s_2^2\sin^2\varphi + 2 s_1 s_2 \sin\varphi\cos\varphi\right)$$   (73)
Using the forward Euler method as in Section 4.3, we obtain the discrete approximations (59) for s₁², s₂² and s₁s₂; substituting these expressions into (73), f_φ finally becomes:

$$f_\varphi(z_1,z_2) = -\left[\cos^2\varphi\,(z_1 + z_1^{-1} - 2) + \sin^2\varphi\,(z_2 + z_2^{-1} - 2) + \sin\varphi\cos\varphi\,(z_1 + z_1^{-1} + z_2 + z_2^{-1} - 2 - z_1 z_2^{-1} - z_1^{-1} z_2)\right]$$   (74)
corresponding to an "orientation matrix" operator O_φ, obtained after re-arranging terms and identifying the coefficients of the 2D Z transform:

$$\mathbf{O}_\varphi = \begin{bmatrix} 0 & -(\cos^2\varphi + \tfrac{1}{2}\sin 2\varphi) & \tfrac{1}{2}\sin 2\varphi \\ -(\sin^2\varphi + \tfrac{1}{2}\sin 2\varphi) & 2 + \sin 2\varphi & -(\sin^2\varphi + \tfrac{1}{2}\sin 2\varphi) \\ \tfrac{1}{2}\sin 2\varphi & -(\cos^2\varphi + \tfrac{1}{2}\sin 2\varphi) & 0 \end{bmatrix}$$   (75)
The matrix O_φ depends only on the orientation angle φ. The 2D rational function C_φ(ω₁, ω₂), derived from relations (72) and (74), can be regarded itself as a recursive filter:

$$C_\varphi(\omega_1,\omega_2) = B_\varphi(\omega_1,\omega_2)\,/\,A_\varphi(\omega_1,\omega_2)$$   (76)
and can be realized using the pair of templates:

$$B_\varphi = \mathbf{O}_0 - 0.447754\,\mathbf{O}' + 0.018248\,\mathbf{O}_\varphi * \mathbf{O}_\varphi$$   (77)

$$A_\varphi = \mathbf{O}_0 + 0.041694\,\mathbf{O}' + 0.002416\,\mathbf{O}_\varphi * \mathbf{O}_\varphi$$   (78)
also depending only on the orientation angle φ. By "*" we denoted the operation of matrix convolution; O₀ is a 5×5 zero matrix with the central element of value 1; in (77) and (78) the 5×5 matrix O′ is derived by bordering the 3×3 matrix O_φ with zeros, in order to be summed with O₀ and O_φ*O_φ. For a design example we use the 1D prototype filter from Section 4.2, where A(ω) and B(ω) in the transfer function are factorized as in (50) and (51); we simply have to
substitute cos ω by the oriented cosine function (76). Corresponding to A(ω) factorized as in (51), the resulting function A(ω₁, ω₂) is realized by a template A, decomposed into elementary templates as in (52), where A_{1i} and A_{2j} are given by:

$$A_{1i} = B_\varphi + a_i\,A_\varphi, \qquad A_{2j} = B_\varphi * B_\varphi + a_{1j}\,A_\varphi * B_\varphi + a_{2j}\,A_\varphi * A_\varphi$$   (79)
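The construction (75)–(78) can be scripted directly; in the sketch below (ours), np.pad realizes the zero-bordering of O_φ and a 2D convolution stands in for the matrix convolution denoted by "*".

```python
# Sketch: orientation operator O_phi and the template pair (77)-(78).
import numpy as np
from scipy.signal import convolve2d

def O_operator(phi):
    s2, c2, sc = np.sin(phi)**2, np.cos(phi)**2, 0.5 * np.sin(2 * phi)
    return np.array([[0.0,        -(c2 + sc),  sc        ],
                     [-(s2 + sc),  2 + 2 * sc, -(s2 + sc)],
                     [sc,         -(c2 + sc),  0.0       ]])

def oriented_cosine_templates(phi):
    O = O_operator(phi)
    O0 = np.zeros((5, 5)); O0[2, 2] = 1.0        # unit 5x5 template
    Op = np.pad(O, 1)                             # O bordered with zeros
    OO = convolve2d(O, O)                         # 5x5, realizes f_phi^2
    B = O0 - 0.447754 * Op + 0.018248 * OO        # eq. (77)
    A = O0 + 0.041694 * Op + 0.002416 * OO        # eq. (78)
    return B, A

B, A = oriented_cosine_templates(np.pi / 6)
print(B.shape, A.shape)   # (5, 5) (5, 5)
```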
Using this method, some design examples are presented as follows. 1) With a zero-phase elliptic filter (Figure 12(a)), we design a directional filter oriented at an angle φ = π/6; its frequency response and contour plot are shown in Figure 12(b) and (c). 2) A simple and efficient low-pass zero-phase filter prototype has the frequency response:

$$H_p(\omega) = \frac{1}{p + 1 - p\cos\omega}$$   (80)
where the value of p sets the filter selectivity. If cos ω in (80) is substituted by C_φ(ω₁, ω₂) given by (76), for p = 25, φ = 0.22 a very selective directional filter results (Figure 12(d), (e)). Another two very selective directional filters are shown in Figure 13; these are in fact elliptically-shaped filters with a large ratio E/F, designed as shown in Section 4.3; for instance, the first filter, given in Figure 13(a), (b), has the parameters E = 3, F = 0.1 and φ = 0.22. Directional filters may be used in detecting straight lines with a given orientation in an image (Matei, 2004). In order to demonstrate the directional selectivity of the proposed filters, we present an example of detecting lines with a given inclination in an image, by means of filtering with a 2D IIR oriented low-pass filter. The spectrum of a straight line is oriented in the plane (ω₁, ω₂) at an angle of π/2 with respect to the line direction. The binary image in Figure 15(a) contains straight lines with different orientations and some ellipses, and is filtered with an oriented elliptically-shaped filter with φ = π/6, designed using the method presented in Section 4.3. Depending on the filter selectivity, in the filtered image in Figure 15(b) only the lines which have their spectrum oriented more or less along the filter characteristic remain practically unchanged, while all the other lines appear more or less blurred, due to low-pass filtering. For a good orientation resolution, to separate lines with very close orientation
Fig. 12. (a) Elliptic 1D LP prototype filter; (b) 2D directional filter response (φ = π/6); (c) contour plot; (d) 2D directional frequency response for φ = 0.22; (e) contour plot
Fig. 13. Directional elliptically-shaped filters: (a), (b) p = 5, E = 3, F = 0.1, φ = 0.22; (c), (d) p = 5, E = 7, F = 0.5, φ = 0.22
Fig. 14. (a) Test image; (b), (c), (d) directionally filtered images with parameters: (b) φ = 0.8, p = 30, E = 7, F = 0.5; (c) φ = π/7, p = 70, E = 7, F = 0.3; (d) φ = 0.2, p = 50, E = 7, F = 0.3
Fig. 15. (a) Binary test image; (b) Directionally filtered image; (c) Texture image; (d) Directionally filtered image with φ = π/12, p = 70, E = 7, F = 0.3
angles, the filter selectivity has to be properly chosen. Another example is shown in Figure 15(c), (d). The texture-type grayscale image (c), representing a portion of the upper face of a straw chair, is directionally filtered with the given parameter values, and the result is image (d), which suggests a possible use in texture segmentation.

4.5 Filters Designed in Polar Coordinates

In this section we study a family of 2D zero-phase filters which can be designed starting from a specified shape in polar coordinates in the frequency plane. The contour plots of their frequency response can be defined as closed curves, described in terms of a variable radius which is a periodic function of the current angle formed with one of the axes. This periodic radius can be expressed using a rational approximation. Then, using some desired 1D prototype filters with factorized transfer functions, spatial filters can be obtained by a 1D-to-2D frequency transformation. Their frequency response is symmetric about the origin and has at the same time an angular periodicity. If we consider any contour resulting from the intersection of the frequency response with a horizontal plane, the contour has to be a closed curve which can be described in polar coordinates by ρ = ρ(φ),
where φ is the angle formed by the radius OP with the ω₁-axis, as shown in Figure 16(a) for a four-lobe filter. Therefore ρ(φ) is a periodic function of the angle φ in the range [0, 2π]. The proposed design method is based on zero-phase filter prototypes whose transfer function is real-valued and can be expressed as a ratio of polynomials in even powers of the frequency ω. In general this filter will be described by:

$$H_p(\omega) = \sum_{j=0}^{M} b_j\,\omega^{2j} \Big/ \sum_{k=0}^{N} a_k\,\omega^{2k}$$   (81)
where M ≤ N and N is the filter order. This function may be obtained through a rational approximation of the magnitude of a usual prototype (Chebyshev, elliptic etc.). The proposed design method for this class of 2D filters is based on a frequency transformation of the form:

$$F:\ \omega^2 \to F(z_1,z_2)$$   (82)
Through this transformation we will be able to obtain low-pass type filters, symmetric about the origin, as in fact are most 2D filters currently used in image processing. The frequency transformation (82) is a mapping from the real frequency axis to the complex plane (z₁, z₂). However, it will be defined initially through an intermediate frequency mapping of the form:

$$F_1:\ \omega^2 \to F_1(\omega_1,\omega_2) = (\omega_1^2 + \omega_2^2)\,/\,\rho(\omega_1,\omega_2)$$   (83)

Here the function ρ(ω₁, ω₂) plays the role of a radial compressing function and is initially determined in the angular variable as ρ(φ). In the frequency plane (ω₁, ω₂) we have:
$$\cos\varphi = \omega_1 \Big/ \sqrt{\omega_1^2 + \omega_2^2}$$   (84)
If the radial function ρ(φ) can be expressed in the variable cos φ, using (84) we obtain by substitution the function ρ(ω₁, ω₂). Generally the function ρ(φ) will have to be determined as a polynomial or a ratio of polynomials in the variable cos φ. For instance, the four-lobe filter whose contour plot is shown in Figure 16(a) corresponds to a function:

$$\rho(\varphi) = a + b\cos 4\varphi = a + b - 8b\cos^2\varphi + 8b\cos^4\varphi$$   (85)
which is plotted in Figure 16(b) in the range [0, 2π]. As a first 1D prototype we consider a type-2 Chebyshev digital filter with the parameters: order N = 4, stopband attenuation R_S = 40 dB and passband-edge frequency ω_p = 0.7, where 1.0 is half the sampling frequency. The transfer function in z of this filter is:

$$H_p(z) = \frac{0.012277\,z^2 + 0.012525\,z + 0.012277}{z^2 - 1.850147\,z + 0.862316}$$   (86)
The magnitude of its transfer function in the range [−π, π] is shown in Figure 16(c). Using a Chebyshev–Padé approximation, we determine the real-valued zero-phase frequency response which approximates quite accurately the magnitude of the original filter function:

$$\left|H_p(e^{j\omega})\right| \cong H_{a1}(\omega) = \frac{0.940306 - 0.57565\,\omega^2 + 0.0947\,\omega^4}{1 - 2.067753\,\omega^2 + 4.663147\,\omega^4}$$   (87)
Let us now consider the radial function of the form:

$$H_r(\varphi) = \frac{1}{p + 1 - p\,\tilde{B}(\varphi)}$$   (88)
where B̃(φ) is a periodic function; let B̃(φ) = cos(4φ). We will use this function to design a 2D filter with four narrow lobes in the frequency plane (ω₁, ω₂). Using trigonometric identities, it is straightforward to obtain:

$$H_r(\varphi) = \frac{1}{1 + 8p\cos^2\varphi - 8p\cos^4\varphi}$$   (89)
plotted for φ ∈ [−π, π] in Figure 16(d). This is a periodic function with period π/2 and has the shape of a "comb" filter. In order to control the shape of this function, we introduce a parameter k, such that the radial function will be ρ(φ) = k·H_r(φ). Using (83), we obtain:

$$\omega^2 \to F(\omega_1,\omega_2) = \frac{\omega_1^4 + (2 + 8p)\,\omega_1^2\omega_2^2 + \omega_2^4}{k\,(\omega_1^2 + \omega_2^2)}$$   (90)
Since ω₁² = −s₁² and ω₂² = −s₂², we derive the function F(s₁, s₂) in the plane (s₁, s₂) as:

$$F(s_1,s_2) = -\,\frac{s_1^4 + (2 + 8p)\,s_1^2 s_2^2 + s_2^4}{k\,(s_1^2 + s_2^2)}$$   (91)
Finally we determine a transfer function H(z₁, z₂) of the 2D filter in the complex plane (z₁, z₂). This can be achieved if we find a discrete counterpart of the function F(ω₁, ω₂). A possible method is to express the function F in the complex plane (s₁, s₂) and then to find the appropriate mapping to the plane (z₁, z₂). This can be achieved either using the forward or backward Euler approximations, or otherwise the bilinear transform, which gives better accuracy.
Fig. 16. (a) Contour plot of a four-lobe filter; (b) Variation of the periodic function ρ(φ); (c) Maximally-flat low-pass prototype; (d) Very selective radial function

The bilinear transform is a first-order approximation of the natural logarithm function, which is an exact mapping of the z-plane to the s-plane. For our purposes the sample interval takes the value T = 1, so the bilinear transform for the two variables s₁ and s₂ in the complex plane (s₁, s₂) has the form:

$$s_1 = \frac{2(z_1 - 1)}{z_1 + 1}, \qquad s_2 = \frac{2(z_2 - 1)}{z_2 + 1}$$   (92)
Substituting s₁, s₂ in (91), we find the frequency transformation in z₁, z₂, in matrix form:

$$\omega^2 \to F(z_1,z_2) = \frac{B(z_1,z_2)}{A(z_1,z_2)} = \frac{\mathbf{Z}_1 \cdot \mathbf{B} \cdot \mathbf{Z}_2^T}{\mathbf{Z}_1 \cdot \mathbf{A} \cdot \mathbf{Z}_2^T}$$   (93)
where Z₁ and Z₂ are the vectors:

$$\mathbf{Z}_1 = \begin{bmatrix} z_1^{-2} & z_1^{-1} & 1 & z_1 & z_1^2 \end{bmatrix}, \qquad \mathbf{Z}_2 = \begin{bmatrix} z_2^{-2} & z_2^{-1} & 1 & z_2 & z_2^2 \end{bmatrix}$$   (94)
where "·" denotes the matrix/vector product. The expression of the frequency transformation (93) can be considered in itself a 2D recursive filter. The templates B, A giving the coefficients of B(z₁, z₂), A(z₁, z₂) are matrices of size 5×5:
$$\mathbf{B} = \begin{bmatrix} 1+2p & 0 & 2-4p & 0 & 1+2p \\ 0 & -8 & 0 & -8 & 0 \\ 2-4p & 0 & 20+8p & 0 & 2-4p \\ 0 & -8 & 0 & -8 & 0 \\ 1+2p & 0 & 2-4p & 0 & 1+2p \end{bmatrix}; \qquad \mathbf{A} = -\frac{k}{8}\begin{bmatrix} 1 & 2 & 2 & 2 & 1 \\ 2 & 0 & -4 & 0 & 2 \\ 2 & -4 & -12 & -4 & 2 \\ 2 & 0 & -4 & 0 & 2 \\ 1 & 2 & 2 & 2 & 1 \end{bmatrix}$$   (95)
We next use the maximally-flat filter prototype (87). We only have to make the substitution (93) in H_{a1}(ω) given in (87), and we obtain the desired 2D transfer function:

$$H(z_1,z_2) = \frac{b_2 B^2(z_1,z_2) + b_1 A(z_1,z_2)B(z_1,z_2) + b_0 A^2(z_1,z_2)}{B^2(z_1,z_2) + a_1 A(z_1,z_2)B(z_1,z_2) + a_0 A^2(z_1,z_2)} = \frac{B_f(z_1,z_2)}{A_f(z_1,z_2)}$$   (96)
where b 2 0.9403 , b1 0.5756 , b0 0.0947 , a 1 2.0677 , a0 4.6631 are the coefficients of the prototype (87). In our case H(z1 ,z 2 ) results of order 8. For a chosen prototype of higher order, we get a similar rational function in powers of A(z1 ,z 2 ) and B(z 1 ,z 2 ) . Since the 2D transfer function (96) can be also described in terms of templates Bf and A f corresponding to B f (z 1 ,z 2 ) , A f (z1 ,z 2 ) we have equivalently:
$$B_f = b_2\,A \ast A + b_1\,A \ast B + b_0\,B \ast B, \qquad A_f = A \ast A + a_1\,A \ast B + a_0\,B \ast B \qquad (97)$$

where $\ast$ denotes two-dimensional convolution. For our filter, $B_f$ and $A_f$ are of size $9 \times 9$.
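To make the template arithmetic of (95)–(97) concrete, here is a minimal numerical sketch. It assumes the template matrices, coefficient signs and pairing exactly as reconstructed above (the variable names and the spot-check frequencies are mine, not from a published implementation); it builds $B_f$ and $A_f$ by two-dimensional convolution and evaluates the resulting response at one point on an axis and one on the diagonal, to exhibit the four-lobe selectivity.

```python
import numpy as np
from scipy.signal import convolve2d

p, k = 50.0, 0.8  # parameter values used in the example of Figure 17

# 5x5 templates of the frequency transformation, as reconstructed in (95)
B = np.array([[1 - 2*p, 0,  2 + 4*p,  0,  1 - 2*p],
              [0,      -8,  0,       -8,  0      ],
              [2 + 4*p, 0,  20 - 8*p, 0,  2 + 4*p],
              [0,      -8,  0,       -8,  0      ],
              [1 - 2*p, 0,  2 + 4*p,  0,  1 - 2*p]], float)
A = -(k / 8) * np.array([[1,  2,   2,  2, 1],
                         [2,  0,  -4,  0, 2],
                         [2, -4, -12, -4, 2],
                         [2,  0,  -4,  0, 2],
                         [1,  2,   2,  2, 1]], float)

# prototype coefficients of (87)
b2, b1, b0 = 0.9403, -0.5756, 0.0947
a1, a0 = -2.0677, 4.6631

# 9x9 filter templates, eq. (97): full 2D convolutions of the 5x5 templates
AA, AB, BB = convolve2d(A, A), convolve2d(A, B), convolve2d(B, B)
Bf = b2 * AA + b1 * AB + b0 * BB
Af = AA + a1 * AB + a0 * BB

def evaluate(T, w1, w2):
    """Evaluate Z1 * T * Z2^T at (w1, w2), with Z as in (94) and z = exp(jw)."""
    n = T.shape[0] // 2
    z1 = np.exp(1j * w1 * np.arange(-n, n + 1))
    z2 = np.exp(1j * w2 * np.arange(-n, n + 1))
    return z1 @ T @ z2

H = lambda w1, w2: abs(evaluate(Bf, w1, w2) / evaluate(Af, w1, w2))
print(H(0.3, 0.0))   # on an axis: close to the passband gain (about 1)
print(H(0.3, 0.3))   # on the diagonal: strongly attenuated
```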
Fig. 17. (a), (b) Frequency response and contour plot for the 4-lobe filter; (c) Test image; (d) Filtered image

The designed filter has the frequency response and contour plot shown in Figure 17(a), (b). We remark that the filter is very selective simultaneously along both axes of the plane $(\omega_1, \omega_2)$. We now present an example of image filtering using a selective four-lobe filter. This type of filter can be used for detecting lines with a given inclination in an image, like the directional filters discussed in section 4.4. The spectrum of a straight line is oriented in the plane $(\omega_1, \omega_2)$ at an angle of $\pi/2$ with respect to the line direction. The test image in Figure 17(c) contains straight lines with different orientations and lengths, and also a few curves. We consider the filter shown in Figure 17(a), with parameter values $p = 50$ and $k = 0.8$.
Depending on the filter selectivity, only the lines with spectra oriented more or less along the filter pass-bands will be preserved in the filtered image. We notice that in the output image shown in Figure 17(d), the lines roughly oriented horizontally and vertically are preserved, while the others are filtered out or appear very blurred. The intersections of the detected lines appear as darker pixels; this also allows these intersections to be detected if, after filtering, a proper threshold is applied.
5. Conclusion

An alternative CNN model was proposed, in which the state equations are modified such that the piece-wise linear function operates on the entire information fed to the current cell, including input, feedback and offset terms. The circuit structure, dynamic behavior and equilibrium points were analyzed. Through a linear transformation, the proposed model is shown to be equivalent to the standard CNN model. We aimed at making a connection between these two models and comparing them from the recurrent neural network point of view. Certain advantages regarding implementation result from the limited voltage range. Several processing tasks on binary images were proposed, templates were designed and simulation results on test images were provided. The template design is relatively easy and can be optimized in order to obtain high robustness to parameter variations. A deeper insight into the proposed CNN dynamic behavior may lead to further image processing capabilities. As regards the applications of CNNs in linear filtering, some original and efficient design methods were approached, for several types of two-dimensional filters: circular and elliptically-shaped filters, maximally-flat filters, orientation-selective filters etc. The design starts from an imposed 1D prototype filter and uses rational approximations and various frequency transformations to obtain the 2D frequency response. These analytical methods could be combined with numerical optimization techniques in order to realize an efficient design. Even if, theoretically, all the presented spatial linear filters can be implemented in analog processing systems like CNNs, we have to take into account the fact that these analog filters may be very sensitive to template element variations due to the technological dispersion which is inherently present in any VLSI process. A systematic theoretical study of the effect of these dispersions on filter characteristics and stability can be found in (David, 2004). It has been found, both analytically and through simulations, that the sensitivity of the frequency response to process variations increases with filter selectivity. Since we have also designed some very selective filters, a sensitivity analysis has to be carried out in each case, to find out whether a given filter can be reliably implemented. The filter robustness and stability requirements may imply strong constraints on the realization of filters with high selectivity. Further research also envisages an efficient and reliable implementation of this class of filters.
6. Acknowledgment This work was supported by the National University Research Council under Grant PN2 – ID_310 “Algorithms and Parallel Architectures for Signal Acquisition, Compression and Processing”.
7. References

Chua, L.O. & Yang, L. (1988). Cellular Neural Networks: Theory, IEEE Transactions on Circuits and Systems, Vol. 35, Oct. 1988
Crounse, K.R. & Chua, L.O. (1995). Methods for Image Processing and Pattern Formation in Cellular Neural Networks – A Tutorial, IEEE Transactions on Circuits and Systems, CAS-42 (10)
David, E.; Ansorge, M.; Goras, L. & Grigoras, V. (2004). On the Sensitivity of CNN Linear Spatial Filters: Non-homogeneous Template Variations, Proceedings of the 8th IEEE International Workshop on Cellular Neural Networks and their Applications, CNNA'2004, pp. 40-45, Budapest, Hungary
Dudgeon, D.E. & Mersereau, R.M. (1984). Multidimensional Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1984
Freeman, W.T. & Adelson, E.H. (1991). The Design and Use of Steerable Filters, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 13, No. 9, Sept. 1991
Hänggi, M. & Moschytz, G.S. (2000). Cellular Neural Networks: Analysis, Design and Optimization, Kluwer Academic Publishers, Boston, 2000
Haykin, S. (1994). Neural Networks: A Comprehensive Foundation, Macmillan, 1994
Matei, R. (2001). Cellular Neural Networks with Modified Non-Linearity, Proceedings of the International Symposium on Signal Processing and Information Technology, ISSPIT'2001, pp. 296-300, Cairo, Egypt
Matei, R. (2003). Oriented Low-Pass CNN Filters Based on Gaussian 1-D Prototype, Proceedings of the 16th European Conference on Circuit Theory and Design, ECCTD'2003, Vol. II, pp. 185-188, Krakow, Poland
Matei, R. (2004). Design Method for Orientation-Selective CNN Filters, Proceedings of the International Symposium on Circuits and Systems, ISCAS'2004, Vol. 3, pp. 105-108, ISBN 0-7803-8251-X, Vancouver, Canada, May 23-26, 2004
Matei, R. (2004). On the Design of a Class of Selective CNN Linear Filters, Proceedings of the 8th IEEE International Workshop on Cellular Neural Networks and their Applications, CNNA'2004, pp. 153-158, Budapest, Hungary
Matei, R. (2006). Design of a Class of Maximally-Flat Spatial Filters, Proceedings of the IEEE International Symposium on Circuits and Systems, ISCAS'2006, pp. 2165-2168, ISBN 978-0-7803-9389-9, May 21-24, 2006, Kos Island, Greece
Rodriguez-Vasquez, A. et al. (1993). Current Mode Techniques for the Implementation of Continuous- and Discrete-Time Cellular Neural Networks, IEEE Transactions on Circuits and Systems-II, Vol. CAS-40, pp. 132-146, March 1993
Shi, B.E. (1998). Gabor Type Filtering in Space and Time with Cellular Neural Networks, IEEE Transactions on Circuits and Systems-I, Vol. 45, No. 2, pp. 121-132, Feb. 1998
*** CNN Software Library (CSL) (Templates and Algorithms), version 3.1, Analogical and Neural Computing Laboratory, Computer and Automation Research Institute
26

Concentration of Heterogeneous Road Traffic

Thamizh Arasan Venkatachalam and Dhivya Gnanavelu
Department of Civil Engineering, Indian Institute of Technology Madras, India
1. Introduction

Traffic flow theories seek to describe in a precise mathematical way the interactions between vehicles, their operators and the infrastructure. As such, these theories are an indispensable construct for all models and tools that are being used in the design and operation of roads. The scientific study of traffic flow had its beginnings in the 1930s with the study of models relating volume and speed (Greenshields, 1935), the application of probability theory to the description of road traffic (Adams, 1936) and the investigation of the performance of traffic at intersections (Greenshields et al., 1947). After World War II, with the tremendous increase in the use of automobiles and the expansion of the highway system, there was also a surge in the study of traffic characteristics and the development of traffic flow theories. The field of traffic flow theory and transportation, while better understood and more easily characterised through advanced computation technology, is just as important today as it was in the early days. The fundamentals of traffic flow and their characteristics form the foundation for all the theories, techniques and procedures that are applied in the design, operation and development of road transportation systems. Traffic flow is characterised by the movement of individual drivers and vehicles between two points and the interactions they make with one another. Unfortunately, studying traffic flow characteristics is difficult because driver behavior cannot be predicted with one-hundred percent certainty. Fortunately, however, drivers tend to behave within a reasonably consistent range and, thus, traffic streams tend to have some reasonable consistency and can be roughly represented mathematically. The fundamental characteristics of traffic flow are flow, speed and concentration. These characteristics can be observed and studied at the microscopic and macroscopic levels. Table 1 provides a framework for distinguishing these characteristics.

Traffic Characteristic | Microscopic       | Macroscopic
Flow                   | Time Headways     | Flow Rates
Speed                  | Individual Speeds | Average Speeds
Concentration          | Distance Headways | Density

Table 1. Framework for fundamental characteristics of traffic flow
This chapter is concerned with the macroscopic traffic flow characteristics and the associated analysis.
2. Traffic Flow Characteristics

2.1 Flow/Volume

Flow, the macroscopic traffic flow characteristic, is quantified directly through point measurements and, by definition, requires measurement over time. Thus, flow ($q$), also termed volume, is defined as the number of vehicles passing a point on a highway during a stated period of time, which is given by

$$q = \frac{N}{T} \qquad (1)$$

where $N$ = number of vehicles passing a point on the roadway during $T$; $T$ = total observation period. Flow rates are usually expressed in terms of vehicles per hour, although the actual measurement interval can be much less.

2.2 Speed

Measurement of speed requires observation over both time and space. A distinction can be made between different ways of calculating the average speed of a set of vehicles. The average traffic stream speed can be computed in two different ways: as a time-mean speed or as a space-mean speed. The difference in speed computations is attributed to the fact that the space-mean speed reflects the average speed over a spatial section of roadway, while the time-mean speed reflects the average speed of the traffic stream passing a specific stationary point over a specified period of time.

2.2.1 Space Mean Speed

The average speed of a traffic stream computed as the length of a roadway segment ($L$) divided by the total time required to travel the segment is the space mean speed. To calculate space mean speed, the speeds of individual vehicles are first converted to individual travel time rates; then, an average travel time rate is calculated. Finally, using the value of the average travel time rate, an average speed is calculated, which is termed the space mean speed. It is given by
$$u_s = \frac{L}{\dfrac{1}{N}\sum_{i=1}^{N} t_i}$$

where $t_i$ is the time for vehicle $i$ to cross the distance $L$, which is given by $t_i = L/u_i$. Hence,

$$u_s = \frac{L}{\dfrac{1}{N}\sum_{i=1}^{N} \dfrac{L}{u_i}} = \frac{L}{\dfrac{L}{N}\sum_{i=1}^{N} \dfrac{1}{u_i}}$$

$$u_s = \frac{1}{\dfrac{1}{N}\sum_{i=1}^{N} \dfrac{1}{u_i}} \qquad (2)$$
where $u_i$ = speed of an individual vehicle $i$; $N$ = the number of vehicles passing a point during the selected time period.

2.2.2 Time Mean Speed

The average speed obtained by taking the arithmetic mean of the speed observations is termed the time-mean speed. Since individual speeds are recorded for vehicles passing a particular point over a selected time period, the time mean speed can be expressed as

$$u_t = \frac{1}{N}\sum_{i=1}^{N} u_i \qquad (3)$$
where $u_i$ = speed of an individual vehicle $i$; $N$ = the number of vehicles passing a point during the selected time period. All authors agree that, for computations involving mean speeds to be theoretically correct, it is necessary to ensure that one has measured space mean speed rather than time mean speed, since time-mean speeds do not provide reasonable travel time estimates unless the speed at the point sampled is representative of the speed at all other points along a roadway segment. Under congested conditions, it is important to distinguish between these two mean speeds. For freely flowing traffic, however, there will not be any significant difference between the two; when there is great variability of speeds, there will be a considerable difference.

2.3 Density

Traffic density, the macroscopic characteristic of traffic concentration, is defined as the number of vehicles occupying a unit length of roadway at any instant of time and is given by

$$k = \frac{N}{L} \qquad (4)$$
where $L$ = length of the roadway; $N$ = the number of vehicles present over $L$ at an instant of time.

2.4 Time-Space Diagram

Traffic engineers represent the location of a specific vehicle at a certain time with a time-space diagram. Figure 1 shows the trajectories of a set of five vehicles (numbered from 1 to 5), through time, as they move in one direction on a road. Certain characteristics, such as headway, speed, flow, density, etc., can be depicted using the diagram (Figure 1). In the figure, the slope of the time-space plot for each vehicle, $dx/dt$, equals the speed $u$. The time difference between pairs of consecutive vehicles along a horizontal line is the headway
506
Advanced Technologies
between those vehicles. For example, $h_{t23}$ is the time headway between the 2nd and 3rd vehicles in the traffic stream ($t_3$ minus $t_2$). The difference in position between consecutive vehicles along a vertical line is the spacing (space headway) between vehicles. For example, $h_{s23}$ is the spacing between the 2nd and 3rd vehicles in the traffic stream ($x_2$ minus $x_3$). The number of vehicles that an observer would be able to count at a point A on the road over a period of observation $T$, namely the flow, is equal to the number of vehicle trajectories the horizontal line AA intersects during the time period. The number of time-space plots intersected by the vertical line BB corresponds to the number of vehicles occupying the section of roadway at that instant of time, which is nothing but the density ($k$).
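The quantities read off the time-space diagram are easy to compute numerically. The sketch below, with invented passage times and spot speeds for five vehicles, derives time headways, flow per equation (1), the time-mean speed of equation (3) and the space-mean speed of equation (2).

```python
# Invented observations at point A: passage time (s) and spot speed (km/h)
times  = [3.0, 7.5, 10.2, 15.8, 21.1]
speeds = [62.0, 48.0, 70.0, 55.0, 66.0]
N = len(speeds)

headways = [t2 - t1 for t1, t2 in zip(times, times[1:])]  # e.g. h_t23 = t3 - t2
q = N / (times[-1] - times[0]) * 3600                     # flow, eq. (1), veh/h

u_t = sum(speeds) / N                       # time-mean speed, eq. (3)
u_s = N / sum(1.0 / u for u in speeds)      # space-mean speed, eq. (2)

print(headways)                             # [4.5, 2.7, 5.6, 5.3]
print(round(q), round(u_t, 2), round(u_s, 2))
```

The harmonic-mean form of equation (2) makes it apparent that $u_s \le u_t$ always holds, with equality only when all vehicles travel at the same speed.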
Fig. 1. Time-space diagram showing trajectories of vehicles over time and space

2.5 Fundamental Relations of Traffic Flow Characteristics

The relationship between the basic variables of traffic flow, namely speed ($u_s$), flow ($q$) and density ($k$), is called the fundamental relation of traffic flow characteristics and, mathematically, it is expressed as
$$q = k\,u_s \qquad (5)$$
The relationships between the three basic characteristics, taken two at a time, are illustrated in Figures 2 through 4. It may be noted that the nature of the relationships depicted in the figures is based on the assumption that speed and density are linearly related (Greenshields model).
Fig. 2. Flow-Density Curve

Fig. 3. Speed-Density Curve

Fig. 4. Speed-Flow Curve
2.5.1 Flow-Density Relation

The flow and density vary with time and location. The graphical representation of the relation between density and the corresponding flow on a given stretch of road is referred to as the fundamental diagram of traffic flow (Figure 2). Some characteristics of an ideal flow-density relationship are listed below:
1. When density is zero, flow will also be zero, since there are no vehicles on the road.
2. When the number of vehicles gradually increases, the density as well as the flow increases.
3. Increases in density beyond the point of maximum flow ($q_{max}$) result in a reduction of flow.
4. When more and more vehicles are added, a saturation level is reached where vehicles cannot move. This is referred to as the jam density or the maximum density ($k_{jam}$). At jam density, flow will be zero because the vehicles are not moving.
5. The relationship is normally represented by a parabolic curve.

2.5.2 Speed-Density Relation

The simplest assumption is that the variation of speed with density is linear (Figure 3). Corresponding to zero density, vehicles will be flowing at their desired speed, or free-flow speed ($u_f$). When the density is the jam density, the speed of the vehicles becomes zero. It is also possible to have non-linear relationships between speed and density, as shown by the dotted lines.
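Under the linear Greenshields assumption these curves have closed forms: $u(k) = u_f(1 - k/k_{jam})$ and $q(k) = k\,u(k)$, so the flow parabola peaks at $k_o = k_{jam}/2$ with $q_{max} = u_f\,k_{jam}/4$ and $u_o = u_f/2$. A small sketch, with $u_f$ and $k_{jam}$ chosen arbitrarily for illustration:

```python
u_f, k_jam = 80.0, 120.0      # free-flow speed (km/h) and jam density (veh/km)

def u(k):                      # Greenshields linear speed-density relation
    return u_f * (1.0 - k / k_jam)

def q(k):                      # fundamental relation, eq. (5): q = k * u(k)
    return k * u(k)

k_o = k_jam / 2                # density at capacity
print(u(0.0), u(k_jam))        # 80.0 at zero density, 0.0 at jam density
print(q(k_o))                  # q_max = u_f * k_jam / 4 = 2400.0 veh/h
```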
508
Advanced Technologies
2.5.3 Speed-Flow Relation

The relationship between speed and flow (Figure 4) can be postulated as follows. The flow is zero either because there are no vehicles or because there are so many vehicles that they cannot move. At maximum flow, the speed will be between zero and the free-flow speed. So far, the fundamentals of traffic flow and the necessity of studying them in detail have been briefly explained, along with the characteristics of traffic flow, such as flow, density and speed, and the relationships between them. Generally, motorists perceive a lowering of the quality of service when the traffic concentration on the road increases. In other words, for a given roadway, the quality of flow changes with the traffic concentration on the road. Thus, the measure 'concentration' provides a clear indication of both the level of service being provided to the users and the productive level of facility use. Hence, there is a need for in-depth understanding of traffic flow characteristics with specific reference to concentration. Accordingly, the following sections are focused on the in-depth study of traffic concentration.
3. Concentration

Concentration is a traffic measure which explains the extent of usage of road space by vehicles. It is a broader term encompassing both density and occupancy. The first is a measure of concentration over space; the second measures concentration over time of the same vehicle stream. Traffic density, as mentioned earlier, is the number of vehicles occupying a unit length of roadway at any instant of time. Hence, traffic density can be measured only along a stretch of roadway. The length and width of roadway usually considered for the measurement of density are 1 km and one traffic lane, respectively. The existing techniques include photographic and input-output counts. In the photographic technique, the number of vehicles occupying a section of roadway is counted using a photograph taken aerially over the road section. In the input-output counts technique, the numbers of vehicles crossing two reference points chosen on a section of roadway are counted. The number of vehicles occupying the section of roadway between these two reference points is obtained by finding the difference between the numbers of vehicles counted at the two reference points at any instant of time. Due to the difficulty of field measurement, density needs to be calculated from speed and flow. Density can be calculated from field-measured values of traffic volume and speed as

$$k = \frac{q}{u_s} \qquad (6)$$
where $k$ = density in vehicles per lane per km; $q$ = flow rate in vehicles per hour; $u_s$ = space mean speed in km per hour. Density, expressed as the number of vehicles per unit length of roadway, is valid only under highly homogeneous traffic conditions, wherein the differences in individual vehicle speeds and vehicle dimensions are negligible. In practice, however, even under homogeneous traffic conditions, there are significant differences in the said two characteristics (speed and dimension) of vehicles. The measure density, hence, becomes inapplicable for conditions
Concentration of Heterogeneous Road Traffic
509
with variations in the speed and dimensions of vehicles in the traffic stream. Hence, there is a need for the development of an appropriate alternative measure to represent traffic concentration, with potential for application to traffic conditions where there is significant difference in the speed and dimensions of vehicles. Realizing this need, several research attempts have been made on the subject in the past. First, an attempt was made to compute concentration based on the number and speeds of vehicles measured at a point (Greenshields, 1960), and this computed factor, named 'occupancy', was used as a surrogate for density. Then, as a refinement of the concept, occupancy was defined as a non-dimensional variable which can be measured directly as the proportion of time during which a single point on a roadway is covered by all vehicles (Athol, 1965). It is calculated as

$$\rho = \frac{\sum_i t_i}{T} \qquad (7)$$

where $t_i$ = time during which a single point on a roadway is covered by vehicle $i$; $T$ = the total observation period. As can be seen from equation (7), occupancy is a non-dimensional variable. Since occupancy is a function of the speed and length of a vehicle, it can account for the effect of varying vehicle lengths and speeds. Hence, occupancy can be used as a surrogate of density for measuring the concentration of road traffic where there is significant difference in the speed and dimensions of vehicles. Most traffic control systems that use a macroscopic density characteristic as the control variable can use the measured occupancy directly, without converting it to density (May, 1990). Hence, occupancy can be considered a logical substitute for density. It has been used as an indicator for establishing level-of-service criteria for freeways, by developing a relationship between speed, flow and occupancy (Lin et al., 1996). Researchers have also derived relationships between density and occupancy for homogeneous traffic conditions. Initially, a linear relationship was postulated, assuming that all the substreams had constant vehicle length and constant speed (Athol, 1965), which is expressed as

$$\rho = (l + d) \times k \qquad (8)$$
where $k$ = density of the traffic stream; $d$ = length of the detection zone; $l$ = length of the vehicle. The difficulty in using equation (8) to estimate density is that the equation is valid only under ideal traffic conditions, wherein vehicles are of the same length and maintain the same speed.
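Equation (8) can be cross-checked against the definition (7) for this ideal case: a vehicle of length $l$ covers a detection zone of length $d$ for $(l + d)/u$ seconds, so summing these times over all vehicles seen in an hour gives the same number as $(l + d) \times k$. A minimal sketch with invented values:

```python
l, d = 4.0, 2.0          # vehicle length and detection-zone length (m)
u = 20.0                 # common speed of all vehicles (m/s)
q = 1800.0               # flow (veh/h)
T = 3600.0               # observation period (s)

t_i = (l + d) / u        # time one vehicle covers the zone (s)
rho_direct = (q * T / 3600.0) * t_i / T    # occupancy by definition (7)

k = q / (u * 3.6)        # density (veh/km); u * 3.6 is the speed in km/h
rho_athol = (l + d) * (k / 1000.0)         # eq. (8), with k per metre
print(rho_direct, rho_athol)               # both 0.15, i.e. 15 %
```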
4. Measurement of Occupancy

Based on the definition of occupancy (equation (7)), it can be inferred that vehicles are to be detected while passing a point or line across the roadway to get the required data for
510
Advanced Technologies
estimating occupancy. In practice, however, one is no longer dealing strictly with a point measurement, but with measurement along a short section of road using detectors. In other words, occupancy, based on practical considerations, is defined as the percentage of time the detection zone is occupied by vehicles (May, 1990) and is given by

$$\rho = \frac{\sum_i (t_i)_O}{T} \qquad (9)$$

where $(t_i)_O$ = time during which the detection zone on a roadway is covered by vehicle $i$ (the subscript O stands for occupancy); $T$ = the total observation period. Since the time during which a detection zone is covered by vehicles is automatically recorded by detectors, occupancy can easily be measured in the field using detectors. Occupancy measured using detectors depends on the length of the detection zone: each detector type has a differing zone of influence (detector length), and the length of the zone of influence is effectively added to the vehicle length to calculate occupancy. Hence, the measured occupancy may be different for different detection zones (detectors), even for the same roadway and traffic conditions, depending on the size and nature of the detectors. This implies that it is necessary to consider the length of the detection zone (on which the presence of a vehicle is detected) in the formulation as well, in order to standardize the measurement of occupancy.
5. Occupancy of Heterogeneous Traffic

The aforementioned concept of occupancy is specific to lane-based traffic flow. The road traffic in countries like India is highly heterogeneous, comprising vehicles of wide-ranging static and dynamic characteristics, as shown in Figure 5. The different types of vehicles present in the traffic on major urban roads in India can be broadly grouped into eight categories as follows: 1. motorized two-wheelers, which include motor cycles, scooters and mopeds; 2. motorized three-wheelers, which include auto-rickshaws (three-wheeled motorized transit vehicles to carry passengers) and tempos (three-wheeled motorized vehicles to carry small quantities of goods); 3. cars, including jeeps and small vans; 4. light commercial vehicles, comprising large passenger vans and small four-wheeled goods vehicles; 5. buses; 6. trucks; 7. bicycles; and 8. tricycles, which include cycle-rickshaws (three-wheeled pedal-type transit vehicles to carry a maximum of two passengers) and three-wheeled pedal-type vehicles to carry small quantities of goods over short distances. By virtue of their wide-ranging characteristics, the vehicles do not follow traffic lanes and occupy any lateral position over the width of the roadway, depending on the availability of road space at a given instant of time. Hence, it is nearly impossible to impose lane discipline under such conditions. To analyze such characteristics of heterogeneous traffic, it becomes necessary to consider the whole width of the road as a single unit. Hence, when the occupancy concept is applied to heterogeneous traffic, it is necessary to consider the area (length and width) of the detection zone and the area of vehicles as the bases.
Fig. 5. The heterogeneous traffic on an Indian road
6. Area-Occupancy

Considering all the said issues related to occupancy, a modified concept of occupancy, named 'area-occupancy', appropriate for heterogeneous traffic conditions, is proposed here. It considers the horizontal projected area of the vehicle, without any restriction on the length of the detection zone or the width of the road (treating the whole width of the road as a single unit, without consideration of traffic lanes), as the basis for measurement (Arasan & Dhivya, 2008). Accordingly, considering a stretch of road, area-occupancy is expressed as the proportion of time the set of observed vehicles occupy the detection zone on the chosen stretch of roadway. Thus, area-occupancy can be expressed mathematically as follows:

$$\rho_A = \frac{\sum_i a_i\,(t_i)_{AO}}{A\,T} \qquad (10)$$

where $(t_i)_{AO}$ = time during which the detection zone is occupied by vehicle $i$, in s (the subscript AO stands for area-occupancy); $a_i$ = area of vehicle $i$ falling on the detection zone, in m²; $A$ = area of the whole road stretch (detection zone), in m²; $T$ = total observation period, in s.
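Written out, equation (10) is a single weighted sum. The sketch below evaluates it for three detected vehicles on a 3 m × 7.5 m detection zone; the vehicle dimensions and occupancy times are invented, and each area is taken as the part of the vehicle's projection that can lie on the zone at one time (a point elaborated in section 6.2).

```python
d, W = 3.0, 7.5              # detection-zone length and road width (m)
A = d * W                    # area of the detection zone (m^2)
T = 60.0                     # observation period (s)

# (a_i in m^2, (t_i)_AO in s): area on the zone and occupancy time per vehicle
vehicles = [(3.0 * 1.6, 0.9),   # car, 4 m long: only a 3 m x 1.6 m patch fits
            (3.0 * 2.5, 1.8),   # bus, 10 m long: likewise 3 m x 2.5 m
            (1.8 * 0.6, 0.6)]   # two-wheeler, 1.8 m: full projection fits

rho_A = sum(a_i * t_ao for a_i, t_ao in vehicles) / (A * T)
print(f"area-occupancy = {100 * rho_A:.2f} %")   # ~1.37 %
```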
6.1 Advantages of the Concept of Area-Occupancy

The area-occupancy is not affected by the length of the detection zone, since it considers the length of the detection zone in its formulation. Also, the effect of the heterogeneity and lane-less nature of the traffic is incorporated in the concept of area-occupancy by considering the area (length and width) of the vehicle in its formulation. Hence, the concept of area-occupancy is valid for measuring accurately the extent of usage of road space by vehicles. Thus, area-occupancy, rather than occupancy, can be used as an indicator of road traffic concentration at any flow level, because of its ability to accurately replicate the extent of usage of the road. It may be noted that the area-occupancy concept can be applied to any traffic condition, from highly homogeneous to highly heterogeneous, and to any length of detection zone.

6.2 Measurement of Area-Occupancy

In the case of measurement of area-occupancy using vehicle-detection loops, according to the definition of the time $(t_i)_{AO}$ in equation (10), it is the time interval from the instant the front of a vehicle enters the detection zone to the instant the rear of the vehicle leaves the detection zone. As the area-occupancy concept is applicable for any length of detection zone, two cases are possible in the measurement of area-occupancy. One is the measurement of area-occupancy on a detection zone whose length is less than the vehicle length (Figure 6(a)), and the other is the measurement on a detection zone of length more than the vehicle length (Figure 6(b)). The time $(t_i)_{AO}$ of a vehicle can be split into three different time components, namely $t_1$, $t_2$ and $t_3$, where $t_1$ is the time taken for the projected area of the vehicle to fully cover the detection zone (when the length of the detection zone is less than the vehicle length) or to fully fall on the detection zone (when the length of the detection zone is more than the vehicle length), $t_2$ is the time interval, after $t_1$, during which the horizontal projected area of the vehicle falling on the detection zone remains unchanged, and $t_3$ is the time taken by the vehicle, after $t_2$, to clear the detection zone. Thus, the distance traveled by a vehicle during the time $t_1 + t_2 + t_3$ is $(l + d)$, where $l$ is the length of the vehicle and $d$ is the length of the detection zone.
6(a) – Case 1: Length of detection zone less than the length of the vehicle; 6(b) – Case 2: Length of detection zone more than the length of the vehicle

Fig. 6. Principle involved in the measurement of area-occupancy
There are some issues involved in the measurement of the time component $(t_i)_{AO}$ of area-occupancy, and these are explained using the trajectories of vehicles depicted in Figures 7(a) and 7(b). It can be seen that, during the time $t_2$, the whole length of the detection zone will be occupied by the vehicle in the case where the length of the detection zone ($d$) is less than the vehicle length ($l$) (B in Figure 7(a)), or the whole length of the vehicle will be occupying the detection zone where the length of the detection zone is more than the vehicle length (B in Figure 7(b)). Hence, during the time $t_2$, the area of the detection zone occupied by the vehicle, at any instant of time, will be constant and equal to $a_i$ = length of the detection zone multiplied by the width of the vehicle (when $d < l$) or length of the vehicle multiplied by the width of the vehicle (when $d > l$). On the other hand, during the times $t_1$ and $t_3$, the detection zone is occupied by the vehicle only partially. Hence, in the time durations $t_1$ and $t_3$, the area of the detection zone occupied by the vehicle varies progressively and is always less than $a_i$. Due to this, during the times $t_1$ and $t_3$, the component $a_i (t_i)_{AO}$ in the formulation of area-occupancy (with $t_i$ taken as $t_1 + t_2 + t_3$) will overestimate the occupancy of the vehicle. That is, the occupancy will be estimated as (A+B+C+D+E) while the actual occupancy value is only (A+B+C). Assuming the speed maintained by a vehicle on the detection zone to be constant, the contribution of the vehicle to occupancy during the time interval $t_1$ (A in Figures 7(a) and 7(b)) will be equal to the occupancy contribution during the time interval $t_3$ (C in Figures 7(a) and 7(b)). Hence, the problem of overestimation of the occupancy value can be solved if the time interval $(t_i)_{AO}$ is taken as either $(t_1 + t_2)$ or $(t_2 + t_3)$, since (A+B+C) is equal to either B+(A+D) or B+(C+E). Also, the area $a_i$ in equation (10) is equal to the length of the detection zone multiplied by the width of the vehicle (when $d < l$) or equal to the horizontal projected area of the vehicle (when $d > l$).
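For constant speed over the zone, the preceding paragraph collapses to a simple rule: the area entering (10) is $\min(l, d) \times w$ and the corrected time interval $(t_1 + t_2)$ equals $\max(l, d)/u$, because the vehicle travels $l + d$ during $t_1 + t_2 + t_3$ and $\min(l, d)$ during $t_1$ alone. A small helper capturing both cases of Figure 7 (the function name and numbers are mine):

```python
def contribution(l, w, d, u):
    """Per-vehicle term a_i * (t_i)_AO of eq. (10), assuming constant speed.

    l, w: vehicle length and width (m); d: zone length (m); u: speed (m/s).
    """
    a_i = min(l, d) * w       # area occupied during t2, identical in both cases
    t_ao = max(l, d) / u      # corrected time t1 + t2 (equivalently t2 + t3)
    return a_i * t_ao

# a 4 m car at 15 m/s over a 2 m zone (d < l) and over a 6 m zone (d > l)
print(contribution(4.0, 1.6, 2.0, 15.0))   # 3.2 m^2 * (4/15) s
print(contribution(4.0, 1.6, 6.0, 15.0))   # 6.4 m^2 * (6/15) s
```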
7(a) – Case 1: Length of detection zone less than the length of the vehicle; 7(b) – Case 2: Length of detection zone more than the length of the vehicle
Fig. 7. Principle of measurement of time component of area-occupancy
The applicability of the area-occupancy concept to real-world conditions can be verified by estimating the area-occupancy value for a wide range of roadway and traffic conditions and checking the logical validity of the results. For the estimation of area-occupancy, it is necessary to measure accurately the characteristics of heterogeneous traffic flow and the other relevant factors of vehicular movement over a stretch of roadway. Measurement of these complex characteristics in the field is difficult and time consuming. Also, it may be difficult to carry out such experiments in the field covering a wide range of roadway and traffic conditions. Hence, it is necessary to model road-traffic flow for an in-depth understanding of the related aspects. The study of these complex characteristics, which may not be sufficiently simplified by analytical solution, can be done using an appropriate modeling technique such as computer simulation. Simulation has already proven to be a popular traffic-flow modeling tool for applications related to road-traffic-flow studies. Hence, the simulation technique has been used here to validate the concept of area-occupancy.
7. Simulation Model

Simulation models may be classified as static or dynamic, deterministic or stochastic, and discrete or continuous. A simulation model which does not require any random values as input is generally called deterministic, whereas a stochastic simulation model has one or more random variables as inputs. Random inputs lead to random outputs, and these can only be considered as estimates of the true characteristics of the system being modeled. Discrete and continuous models are defined in an analogous manner. The choice of whether to use a discrete or continuous simulation model is a function of the characteristics of the system and the objectives of the study (Banks et al., 2004). For this study, a dynamic stochastic discrete-event simulation is adopted, in which the aspects of interest are analysed numerically with the aid of a computer program. As this study pertains to the heterogeneous traffic conditions prevailing in India, the available traffic-simulation program packages such as CORSIM, MITSIM and VISSIM cannot be directly used to study the characteristics of heterogeneous traffic flow, as these are based on homogeneous traffic-flow conditions. Also, the research attempts made earlier (Katti & Raghavachari, 1986; Khan & Maini, 2000; Kumar & Rao, 1996; Marwah & Singh, 2000; Ramanayya, 1988) to simulate heterogeneous traffic flow on Indian roads were limited in scope, as they were location and traffic-condition specific. Moreover, these studies did not truly represent the absence of lane and queue discipline in heterogeneous traffic. Hence, an appropriate traffic simulation model, named HETEROSIM, has been developed at the Indian Institute of Technology Madras, India (Arasan & Koshy, 2005) to replicate heterogeneous traffic flow conditions accurately. The modelling framework is explained briefly here to provide the background for the study. For the purpose of simulation, the entire road space is considered as a single unit and the vehicles are represented as rectangular blocks on the road space, the length and breadth of the blocks representing, respectively, the overall length and the overall breadth of the vehicles (Figure 8). The entire road space is considered to be a surface made of small imaginary squares (cells of convenient size – 100 mm square in this case), thus transforming the entire space into a matrix. The vehicles occupy a specified number of cells whose coordinates are defined beforehand. The front left corner of the rectangular block is taken as the reference point, and the position of vehicles on the road space is identified
based on the coordinates of the reference point with respect to an origin chosen at a convenient location on the space. This technique facilitates identification of the type and location of vehicles on the road stretch at any instant of time during the simulation process.
Fig. 8. Reference axes for representing vehicle positions on the roadway

The simulation model uses the interval scanning technique with a fixed increment of time. For the purpose of simulation, the length of the road stretch as well as the road width can be varied as per user specification. Due to the possible unsteady flow condition at the start of the simulation stretch, a 200 m long road stretch from the start of the simulation stretch is used as a warm-up zone. Similarly, to avoid possible unsteady flow at the end of the simulation stretch due to the free exit of vehicles, a 200 m long road stretch at the end of the simulation stretch is treated as a tail-end zone. Thus, the data on the simulated traffic flow characteristics are collected covering the middle portion between the warm-up and tail-end zones. The said details are shown in Figure 9.
Entry Section – 200 m Warm-Up Zone – 1000 m Study Stretch – 200 m Tail End Zone – Exit Section

Fig. 9. Road stretch considered for simulation of traffic flow

Also, to eliminate the initial transient nature of traffic flow, the simulation clock was set to start only after the first 50 vehicles reached the exit end of the road stretch. The model measures the speed maintained by each vehicle when it traverses a given reference length of roadway. The output also includes the number of vehicles of each category generated, the values of all the associated headways generated, the number of vehicles present over a given length of road (concentration), the number of overtaking maneuvers made by each vehicle, and the speed profiles of vehicles. The logic formulated for the model also permits the admission of vehicles in parallel across the road width, since it is common for smaller vehicles such as motorised two-wheelers to move in parallel in the traffic stream without lane discipline. The
model was implemented in the C++ programming language with a modular software design. The flow diagram illustrating the basic logical aspects of the program is shown as Figure 10.
Fig. 10. Major logical steps involved in the simulation model

The model is also capable of displaying an animation of the simulated traffic movements through mid-block sections. The animation module of the simulation model displays the model's operational behavior graphically during the simulation runs. A snapshot of the animation of heterogeneous traffic flow, obtained using the animation module of HETEROSIM, is shown in Figure 11. The model has been applied to a wide range of traffic conditions (free flow to congested flow) in an earlier study (Arasan & Koshy, 2005) and has been found to replicate field-observed traffic flow to a satisfactory extent.
Fig. 11. Snapshot of the animation of simulated heterogeneous traffic flow
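HETEROSIM itself is not publicly distributed, so the following is only an illustrative sketch, with invented names, of the bookkeeping described above: the road is a matrix of 100 mm cells, a vehicle is a rectangular block of cells, and the front-left corner serves as the reference point for placement queries.

```python
import numpy as np

CELL = 0.1   # cell size (m): the 100 mm squares of the model description

class RoadGrid:
    def __init__(self, length_m, width_m):
        self.cells = np.zeros((round(width_m / CELL), round(length_m / CELL)), int)

    def try_place(self, veh_id, x_m, y_m, l_m, w_m):
        """Occupy a vehicle's rectangle; (x_m, y_m) = front-left reference point.

        The block extends l_m backwards from x_m and w_m across from y_m."""
        r, c = round(y_m / CELL), round(x_m / CELL)
        rows, cols = round(w_m / CELL), round(l_m / CELL)
        if c - cols < 0:
            return False                        # rear end would leave the grid
        patch = self.cells[r:r + rows, c - cols:c]
        if patch.shape != (rows, cols) or patch.any():
            return False                        # off the road or overlapping
        patch[:] = veh_id
        return True

road = RoadGrid(length_m=50.0, width_m=7.5)
print(road.try_place(1, 10.0, 1.0, 4.0, 1.6))   # True: car placed
print(road.try_place(2, 10.0, 1.5, 1.8, 0.6))   # False: overlaps the car
print(road.try_place(2, 10.0, 3.0, 1.8, 0.6))   # True: shifts laterally, fits
```

The last call illustrates the lane-less behaviour noted above: a two-wheeler is admitted alongside the car, at the same longitudinal position, simply by occupying a free lateral strip of the road width.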
8. Validation of the Area-Occupancy Concept

The concept of area-occupancy can be said to be applicable to any traffic stream, under both heterogeneous and homogeneous traffic conditions. To check the validity of the concept of area-occupancy, as a first step, the occupancy and area-occupancy of a homogeneous traffic stream are related independently to the density of the stream under homogeneous (cars-only) traffic conditions for different lengths of detection zone. For this purpose, in respect of occupancy, the relationship between density and occupancy developed by Athol (1965), given in equation (8), is used. In respect of area-occupancy, a relationship between density and area-occupancy under homogeneous traffic conditions is newly developed, the details of which are given in the following subsection.

8.1 Relationship between Density and Area-Occupancy

The relationship between density and area-occupancy is derived as follows. During the time interval $(t_i)_{AO}$, let the distance traveled by a vehicle with speed $u_i$ be $L$. Then, substituting $(t_i)_{AO} = L/u_i$ in equation (10), we get

$$\rho_A = \frac{\sum_i a_i \times (L/u_i)}{A \times T} \qquad (11)$$
Under highly homogeneous traffic conditions, $a_i$ can be considered to be the same, say $a$, for all vehicles and hence equation (11) can be written as

$$\rho_A = \frac{a}{A} \times \frac{L}{T} \times \sum_i \frac{1}{u_i} \qquad (12)$$

Multiplying and dividing the right-hand side of equation (12) by the total number of vehicles, denoted by $N$, the expression becomes

$$\rho_A = \frac{a}{A} \times L \times \frac{N}{T} \times \frac{1}{N} \sum_i \frac{1}{u_i} \qquad (13)$$

The space mean speed of a traffic stream, as per equation (2), is

$$u_s = \frac{1}{\dfrac{1}{N}\sum_i \dfrac{1}{u_i}} \qquad (14)$$

Also, the flow of a traffic stream, as per equation (1), is

$$q = \frac{N}{T} \qquad (15)$$

Then, using equations (14) and (15), equation (13) can be written as

$$\rho_A = \frac{a}{A} \times L \times \frac{q}{u_s} \qquad (16)$$

As per equation (6), $q/u_s$ is equal to $k$, and hence the relationship between density and area-occupancy, under homogeneous traffic conditions, from equation (16), can be obtained as

$$\rho_A = \frac{a \times L}{A} \times k \qquad (17)$$

When the length of the detection zone ($d$) is less than the vehicle length ($l$), the distance travelled by a vehicle in time $(t_i)_{AO}$ is $l$ and the value of $a$ is $d$ multiplied by $w$; hence the relationship between area-occupancy and density, from equation (17), in this case, can be given as

$$\rho_A = \frac{d \times w}{d \times W} \times l \times k$$

That is,

$$\rho_A = \frac{a_v}{W} \times k \qquad (18)$$

where $w$ = width of the vehicle; $W$ = width of the detection zone; $a_v$ = area of the vehicle. When the length of the detection zone is more than the vehicle length ($d > l$), the distance travelled by a vehicle in time $(t_i)_{AO}$ is $d$ and the value of $a$ is $l$ multiplied by $w$; hence the relationship between area-occupancy and density, from equation (17), in this case, can be given as

$$\rho_A = \frac{l \times w}{d \times W} \times d \times k$$

That is,

$$\rho_A = \frac{a_v}{W} \times k \qquad (19)$$
Thus, it is clear from equations (18) and (19) that the value of area-occupancy, for given roadway and traffic conditions, is unaffected by a change in the length of the detection zone. This mathematical result can also be cross-checked by performing traffic simulation experiments, which are explained in the following section.

8.2 Simulation Experiment

Since the scope of the experiment is to prove a fundamental relationship, uniform traffic flow on a single traffic lane was considered. Accordingly, the HETEROSIM model was used for simulating cars-only traffic (100% passenger cars of assumed length 4 m and width 1.6 m) on a 3.5 m wide road space – a single traffic lane (with no passing). The traffic flow was simulated for one hour (T = 1 h) over a stretch of one km. The simulation runs were made with three random number seeds, and the averages of the three values were taken as the final model output. The simulation was also run at various volume levels for various lengths (1 m, 2 m, 3 m and 4 m) of the detection zone. Using the features of the simulation model, the times $(t_i)_O$ and $(t_i)_{AO}$ were recorded for each of the simulated vehicles (assuming the speed maintained by the vehicles on the detection zone to be constant), and the values are given in columns (3) and (4) of Table 2. It may be noted that the distance traveled by the vehicle during time $(t_i)_{AO}$ is $l$, which is constant for all vehicles under homogeneous traffic conditions; hence, the time $(t_i)_{AO}$ remains unchanged for the different lengths of detection zone for a given traffic volume. The occupancy and area-occupancy were then calculated using equations (9) and (10), respectively, and the values are shown in columns (5) and (6) of Table 2. Density was then estimated using equation (8) as well as equation (18). The density values thus calculated are given in columns (7) and (8) of Table 2. It can be seen that the density values estimated from the relationship with area-occupancy (equation (18)) are the same as those estimated from the relationship with occupancy (equation (8)). This implies that area-occupancy can be a substitute for occupancy.
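Before looking at the simulated values, note that the first row of Table 2 below can be anticipated analytically from equations (6), (8) and (18); the sketch below uses the experiment's own parameters (4 m × 1.6 m cars, a 3.5 m lane, a 1 m detection zone).

```python
q, u_s = 494.0, 73.68       # simulated flow (veh/h) and space-mean speed (km/h)
l, w = 4.0, 1.6             # car length and width (m)
W, d = 3.5, 1.0             # lane width and detection-zone length (m)

k = q / u_s                                  # density, eq. (6): ~6.7 veh/km
rho = (l + d) * k / 1000.0                   # occupancy, eq. (8)
rho_A = (l * w / W) * k / 1000.0             # area-occupancy, eq. (18)

print(f"k = {k:.1f} veh/km")                 # 6.7, rounded to 7 in Table 2
print(f"occupancy = {100 * rho:.2f} %")      # 3.35, against 3.39 simulated
print(f"area-occupancy = {100 * rho_A:.2f} %")   # 1.23, against 1.24 simulated
```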
Flow (veh/h) | Mean Speed (km/h) | (t_i)_O (s) | (t_i)_AO (s) | Occupancy^a (%) | Area-Occupancy^x (%) | Density (veh/km), k = ρ/(l+d) | Density (veh/km), k = ρ_A/(C×l)
(1) | (2) | (3) | (4) | (5) | (6) | (7) | (8)

Length of detection zone: 1 m
494  | 73.68 | 122.16  | 97.72  | 3.39  | 1.24 | 7  | 7
989  | 72.59 | 247.96  | 198.37 | 6.89  | 2.52 | 14 | 14
1487 | 71.46 | 377.84  | 302.27 | 10.50 | 3.84 | 21 | 21
1988 | 69.99 | 514.33  | 411.47 | 14.29 | 5.22 | 29 | 29
2495 | 67.19 | 669.74  | 535.79 | 18.60 | 6.80 | 37 | 37
2930 | 58.03 | 930.59  | 744.47 | 25.85 | 9.45 | 52 | 52

Length of detection zone: 2 m
494  | 73.68 | 146.59  | 97.72  | 4.07  | 1.24 | 7  | 7
989  | 72.59 | 297.56  | 198.37 | 8.27  | 2.52 | 14 | 14
1487 | 71.46 | 453.41  | 302.27 | 12.59 | 3.84 | 21 | 21
1988 | 69.99 | 617.20  | 411.47 | 17.14 | 5.22 | 29 | 29
2495 | 67.19 | 803.68  | 535.79 | 22.32 | 6.80 | 37 | 37
2930 | 58.03 | 1116.71 | 744.47 | 31.02 | 9.45 | 52 | 52

Length of detection zone: 3 m
494  | 73.68 | 171.02  | 97.72  | 4.75  | 1.24 | 7  | 7
989  | 72.59 | 347.15  | 198.37 | 9.64  | 2.52 | 14 | 14
1487 | 71.46 | 528.98  | 302.27 | 14.69 | 3.84 | 21 | 21
1988 | 69.99 | 720.07  | 411.47 | 20.00 | 5.22 | 29 | 29
2495 | 67.19 | 937.63  | 535.79 | 26.05 | 6.80 | 37 | 37
2930 | 58.03 | 1302.83 | 744.47 | 36.19 | 9.45 | 52 | 52

Length of detection zone: 4 m
494  | 73.68 | 195.45  | 97.72  | 5.43  | 1.24 | 7  | 7
989  | 72.59 | 396.74  | 198.37 | 11.02 | 2.52 | 14 | 14
1487 | 71.46 | 604.54  | 302.27 | 16.79 | 3.84 | 21 | 21
1988 | 69.99 | 822.93  | 411.47 | 22.86 | 5.22 | 29 | 29
2495 | 67.19 | 1071.58 | 535.79 | 29.77 | 6.80 | 37 | 37
2930 | 58.03 | 1488.94 | 744.47 | 41.36 | 9.45 | 52 | 52

^a Obtained by multiplying the occupancy value obtained using equation (9) by 100. ^x Obtained by multiplying the area-occupancy value obtained using equation (10) by 100.

Table 2. Estimated values of occupancy, area-occupancy and density of homogeneous traffic

It can also be seen that the occupancy values (column (5) in Table 2) estimated for the detection zone lengths of 1, 2, 3 and 4 m are significantly different from one another for a given traffic volume level. Thus, it is clear that even a small change (1 m) in detection-zone
length results in a considerable change in the value of occupancy, corroborating the fact that occupancy is specific to the length of the detection zone. On the other hand, the area-occupancy (column (6) in Table 2) remains unchanged for the different lengths of detection zone for a given volume of traffic. This implies that the concept of area-occupancy is valid for any length of detection zone. Hence, area-occupancy can be applied to measure accurately the extent of usage of road space by vehicles, without any restriction on the length of the detection zone. Also, to depict the validity of area-occupancy as a replacement for density (concentration), the results of the simulation experiment (Table 2) were used to make plots relating (i) area-occupancy with speed and flow and (ii) density with speed and flow, as shown in Figure 12. It can be seen that area-occupancy and density exhibit similar trends of relationships with speed and flow.
(a) Speed vs. area-occupancy: y = −0.2339x² + 0.6785x + 72.77, R² = 0.9938; (b) Flow vs. area-occupancy: y = −19.208x² + 509.08x − 142.84, R² = 0.998; (c) Speed vs. density: y = −0.0078x² + 0.1241x + 72.77, R² = 0.9938; (d) Flow vs. density: y = −0.6423x² + 93.089x − 142.84, R² = 0.998

Fig. 12. Relationship between traffic flow characteristics
9. Area-Occupancy of Heterogeneous Traffic

The concept of area-occupancy can be applied, as indicated earlier, to heterogeneous traffic conditions as well, and relationships can be developed between flow, area-occupancy and traffic stream speed. The relationship is derived as follows. The formula for the area-occupancy of heterogeneous traffic, from equation (10), can be written as

$$\rho_A = \frac{\sum_{j=1}^{n} a_j \sum_i (t_{ji})_{AO}}{A\,T} \qquad (20)$$
where $j$ = vehicle category; $i$ = subject vehicle within category $j$; $a_j$ = horizontal projected area of a vehicle of category $j$ falling on the detection zone; $T$ = observation period; $A$ = area of the detection zone; $n$ = total number of vehicle categories in the heterogeneous traffic. Therefore, for each substream $j$,

$$(\rho_A)_j = \frac{a_j \sum_i (t_i)_{AO}}{T\,A} \qquad (21)$$

As the area of vehicle category $j$ is constant for all the vehicles within that category, equation (21) can be written as

$$(\rho_A)_j = \frac{a_j}{A} \sum_i \frac{(t_i)_{AO}}{T} \qquad (22)$$
During the time interval $(t_i)_{AO}$, let the distance traveled by a vehicle of category $j$ with speed $u_i$ be $L$. Then, substituting $(t_i)_{AO} = L/u_i$ in equation (22), and multiplying and dividing the right-hand side of the equation by the total number of vehicles of category $j$, $N_j$, the expression becomes

$$(\rho_A)_j = \frac{a_j}{A} \times \frac{L}{T} \times \frac{N_j}{N_j} \times \sum_i \frac{1}{u_i}$$

That is,

$$(\rho_A)_j = \frac{a_j}{A} \times L \times \frac{N_j}{T} \times \frac{1}{N_j} \sum_i \frac{1}{u_i} \qquad (23)$$
The space mean speed of substream $j$, $u_j$, can be expressed as

$$u_j = \frac{1}{\dfrac{1}{N_j}\sum_i \dfrac{1}{u_i}} \qquad (24)$$

Also, the flow of substream $j$, $q_j$, can be written as

$$q_j = \frac{N_j}{T} \qquad (25)$$
Hence, using equations (24) and (25), equation (23) can be written as,
$$(\rho_A)_j = \frac{a_j}{A} \times L \times \frac{q_j}{u_j} \qquad (26)$$

The distance traveled by a vehicle ($L$) is $l$ when $d < l$, and $d$ when $d > l$; the value of $a_j$ is $d \times w$ when $d < l$ and $l \times w$ when $d > l$. Equation (26) can therefore be simplified, for both cases, following the same steps used to derive equations (18) and (19), as

$$(\rho_A)_j = \frac{a_{vj}}{W} \times \frac{q_j}{u_j} \qquad (27)$$

where $W$ = width of the detection zone and $a_{vj}$ = area of a vehicle of category $j$. Then the area-occupancy of the whole traffic stream is

$$\rho_A = \sum_j (\rho_A)_j \qquad (28)$$
Hence, it is clear from equation (27) that, by knowing the flow and speed of the different categories of vehicles in a heterogeneous traffic stream, the area-occupancy of the different substreams can be calculated. Also, the area-occupancy of the whole traffic stream can be calculated using the relationship given in equation (28). Under heterogeneous traffic conditions, there may be several substreams, each with a different vehicle category, in the whole traffic stream. Hence, to determine the area-occupancy of the whole traffic stream, and subsequently to formulate a relationship between area-occupancy, flow and speed for the traffic stream as a whole, it is necessary to express the area-occupancy value of each substream in terms of the area-occupancy of an equivalent stream consisting of a standard vehicle. Based on equation (27), this condition can be written mathematically as

$$\frac{a_{sv} \times (q_{sv})_j}{W \times (u_{sv})_j} = \frac{a_{vj} \times q_j}{W \times u_j} \qquad (29)$$

where $(q_{sv})_j$ = flow of standard vehicles equivalent to substream $j$, expressed in vehicles/h; $(u_{sv})_j$ = space mean speed of the standard vehicles with flow $(q_{sv})_j$; $a_{sv}$ = area of the standard vehicle. Equation (29) can be simplified as

$$\frac{(q_{sv})_j}{(u_{sv})_j} = \frac{a_{vj}}{a_{sv}} \times \frac{q_j}{u_j} \qquad (30)$$
Multiplying and dividing the right-hand side of equation (27) by $a_{sv}$, we get

$$(\rho_A)_j = \frac{a_{sv}}{W} \times \frac{a_{vj}}{a_{sv}} \times \frac{q_j}{u_j} \qquad (31)$$

From equation (30), equation (31) can be written as

$$(\rho_A)_j = \frac{a_{sv}}{W} \times \frac{(q_{sv})_j}{(u_{sv})_j} \qquad (32)$$

From equation (27), the area-occupancy of the traffic stream considered in terms of standard vehicles with flow $q_{sv}$ can be written as

$$\rho_A = \frac{a_{sv}}{W} \times \frac{q_{sv}}{u_{sv}} \qquad (33)$$

Also,

$$q_{sv} = \sum_j (q_{sv})_j \qquad (34)$$
where $\rho_A$ = area-occupancy of the whole traffic stream considered in terms of standard vehicles; $q_{sv}$ = flow of the whole traffic stream considered in terms of standard vehicles; $u_{sv}$ = space mean speed of the whole stream considered in terms of standard vehicles. From equation (28), equation (33) can be written as

$$\rho_A = \sum_j (\rho_A)_j = \frac{a_{sv}}{W} \times \frac{q_{sv}}{u_{sv}} \qquad (35)$$

From equation (32), equation (35) can be written as

$$\frac{a_{sv}}{W} \times \frac{q_{sv}}{u_{sv}} = \frac{a_{sv}}{W} \sum_j \frac{(q_{sv})_j}{(u_{sv})_j}$$

That is,

$$\frac{q_{sv}}{u_{sv}} = \sum_j \frac{(q_{sv})_j}{(u_{sv})_j} \qquad (36)$$

From equation (34), the stream speed of traffic considered in terms of standard vehicles can, using equation (36), be written as

$$u_{sv} = \frac{\sum_j (q_{sv})_j}{\sum_j \dfrac{(q_{sv})_j}{(u_{sv})_j}} \qquad (37)$$
If heterogeneous traffic is converted into equivalent standard vehicles based on the concept of area-occupancy (by satisfying the condition given in equation (29)), then, equation (33) is the fundamental equation of heterogeneous traffic flow in which flow is expressed in terms of standard vehicles.
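A compact numerical reading of equations (29)–(37): equation (30) fixes only the ratio $(q_{sv})_j/(u_{sv})_j$ for each substream, so one consistent choice (mine, for illustration) is to keep the substream speed, $(u_{sv})_j = u_j$, and scale the flow by the area ratio. The substream figures below are invented, with the car (4 m × 1.6 m) taken as the standard vehicle.

```python
# substreams: (projected vehicle area a_vj in m^2, q_j in veh/h, u_j in km/h)
substreams = [(4.0 * 1.6, 600.0, 45.0),    # cars
              (10.0 * 2.5, 80.0, 38.0),    # buses
              (1.8 * 0.6, 1500.0, 40.0)]   # motorised two-wheelers

W = 7.5                     # road width (m)
a_sv = 4.0 * 1.6            # area of the standard vehicle

# eq. (30) with (u_sv)_j = u_j  =>  (q_sv)_j = (a_vj / a_sv) * q_j
q_sv_j = [(a_vj / a_sv) * q_j for a_vj, q_j, _ in substreams]
u_sv_j = [u_j for _, _, u_j in substreams]

q_sv = sum(q_sv_j)                                        # eq. (34)
u_sv = q_sv / sum(q / u for q, u in zip(q_sv_j, u_sv_j))  # eq. (37)
rho_A = (a_sv / W) * (q_sv / u_sv) / 1000.0               # eq. (33), k per km
print(f"q_sv = {q_sv:.0f} std veh/h, u_sv = {u_sv:.1f} km/h")
print(f"area-occupancy = {100 * rho_A:.2f} %")
```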
10. Application through Simulation

As the concept of area-occupancy takes into account variations in traffic composition, it remains unaffected by changes in traffic composition. To validate this statement, heterogeneous traffic with three different compositions, as shown in Figure 13, was considered.
Bicycle 2%
Bus 14%
M.T.W 17% M.Th.W 5%
M.Th.W 4%
Truck 29%
Car 18%
LCV 11% (a) Composition of Traffic I
Bicycle 2%
Car 21%
Bus 16%
LCV 14% (b) Composition of Traffic II
M.Th.W 2% Car 17%
M.T.W Bicycle 2% 12%
Truck 25%
Bus 21%
LCV 11% (c) Composition of Traffic III
Truck 35%
LCV- Light Commercial Vehicle, M.Th.W – Motorised Three-Wheelers, M.T.W - Motorised Two-Wheelers
Fig. 13. Composition of the three different heterogeneous traffic streams
The traffic flow was simulated on a six-lane divided road, with a 10.5 m wide main carriageway and 1.5 m of paved shoulder, at various volume levels, for each of the three cases. The traffic flow was simulated on a one km long road stretch for one hour; three simulation runs with different random number seeds were made, and the mean values of the three runs were taken as the final values of the relevant parameters obtained through the simulation process. Using the features of the simulation model, the time $(t_i)_{AO}$ was recorded for each of the simulated vehicles, considering a detection zone of length 3 m. The area-occupancy was estimated using equation (20) (it can also be estimated using equations (27) and (28)). In this case, the area of the detection zone is equal to 3 × 12 = 36 m². The values of area-occupancy of traffic for the three different traffic compositions were plotted, on a single set of axes, against the V/C ratio, as depicted in Figure 14. It can be seen that the values of area-occupancy of the three heterogeneous traffic streams are nearly the same for any given flow (V/C) level, and it can be concluded that area-occupancy can be used as a measure of traffic concentration for any traffic and roadway condition.
Fig. 14. Relationship between area-occupancy and V/C ratio

10.1 Relationship between Traffic-Flow Characteristics

To reinforce the fact that the concept of area-occupancy can be applied to study the various characteristics of heterogeneous traffic flow, the area-occupancy of heterogeneous traffic was estimated by simulating one-way flow of heterogeneous traffic on a 7.5 m wide road (equivalent to one half of a four-lane divided road) at various volume levels, with the
representative traffic composition prevailing on urban roads in India (depicted in Figure 15). The traffic flow was simulated on a one km long road stretch for one hour; three simulation runs with different random number seeds were made, and the mean values of the three runs were taken as the final values of the relevant parameters obtained through the simulation process. Using the features of the simulation model, the time $(t_i)_{AO}$ was recorded for each of the simulated vehicles, considering a detection zone of length 3 m.
Cars 20%, Buses 3%, L.C.V. 3%, M.Th.W. 13%, M.T.W. 57%, Bicycles 4%
LCV – Light Commercial Vehicle; M.Th.W – Motorised Three-Wheeler; M.T.W – Motorised Two-Wheeler
Fig. 15. Representative composition of the traffic prevailing on urban roads in India
The area-occupancy was estimated using equation (20). In this case, the area of the detection zone is 3 × 7.5 = 22.5 m². The average stream speeds and exit volumes of the heterogeneous traffic for the various volume levels were then obtained from the simulation output, and plots relating area-occupancy, speed and flow were made as shown in Figure 16. It may be noted that speed decreases with increasing area-occupancy (Figure 16(a)) and traffic flow increases with increasing area-occupancy (Figure 16(b)); these logical trends indicate the appropriateness of the area-occupancy concept for heterogeneous traffic. Hence, it is inferred that the concept of area-occupancy is valid and can be applied to measure traffic concentration accurately.
[Figure 16: (a) stream speed (km/h) against area-occupancy (%), fitted curve y = 0.2571x² − 7.0128x + 65.129, R² = 0.9944; (b) flow (veh/h) against area-occupancy (%), fitted curve y = −28.081x² + 686.89x + 600.91, R² = 0.9749; area-occupancy ranges from 0 to 16% in both panels.]
(a) Speed vs area-occupancy (b) Flow vs area-occupancy
Fig. 16. Relationship between area-occupancy, flow and speed
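For readers who wish to reuse the fitted relations of Figure 16, the fragment below simply evaluates the two regression polynomials at a series of area-occupancy values. The coefficients are those reported in the figure; the evaluation range (0–16%) is an assumption read off the plotted axes.

#include <stdio.h>

/* Fitted curves from Figure 16 (x = area-occupancy in percent). */
static double stream_speed(double x)  /* km/h */
{
    return 0.2571 * x * x - 7.0128 * x + 65.129;   /* R^2 = 0.9944 */
}

static double flow(double x)          /* veh/h */
{
    return -28.081 * x * x + 686.89 * x + 600.91;  /* R^2 = 0.9749 */
}

int main(void)
{
    for (double x = 2.0; x <= 16.0; x += 2.0)      /* range assumed from plot */
        printf("AO = %5.1f %%  speed = %6.2f km/h  flow = %7.1f veh/h\n",
               x, stream_speed(x), flow(x));
    return 0;
}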
11. Summary and Conclusion

Concentration is a traffic measure which explains the extent of usage of road space by vehicles. It is a broader term encompassing both density and occupancy: the first is a measure of concentration over space; the second measures concentration over time of the same vehicle stream. The review of the literature on traffic flow characteristics indicates that traffic density, expressed as the number of vehicles per unit length of roadway, cannot be appropriate for accurate measurement of traffic concentration, due to the variation in the dimensions and speeds of vehicles, even on a given traffic lane. Hence, there is a need to develop an appropriate alternative measure of traffic concentration, with potential for application to traffic conditions where there are significant differences in the speeds and dimensions of vehicles.

Occupancy, defined as the proportion of time during which the detection zone on a highway is covered by vehicles, removes the deficiencies of the density concept to a significant extent. Occupancy thus takes into account both the traffic composition (variation in vehicle dimensions) and speed variations simultaneously, and gives a more reliable indication of the extent to which the road is being used by vehicles. Hence, occupancy is more meaningful than density. Since occupancy depends on the size of the detection zone, however, the measured occupancy may differ for different detection-zone lengths, even for the same roadway and traffic conditions. Also, the concept of occupancy cannot be directly applied under highly heterogeneous traffic conditions such as those prevailing on Indian roads, as the traffic has no lane discipline (vehicles occupy any lateral position on the road depending on the availability of road space at a given instant), and it is therefore necessary to treat the whole width of the road as a single unit when analysing the traffic flow.

Hence, a new measure named 'area-occupancy' is proposed, which takes as its basis the horizontal projected area of the vehicle, without any restriction on the length of the detection zone or the width of the road (treating the whole width of the road as a single unit, without consideration of traffic lanes). On a stretch of road, area-occupancy is expressed as the proportion of the area of the detection zone covered by all the vehicles traversing the zone during the observation time. Area-occupancy is not affected by the length of the detection zone, since it incorporates that length in its formulation. The effects of heterogeneity and the laneless nature of the traffic are likewise incorporated, by considering the horizontal projected area (length and width) of each vehicle. Hence, the concept of area-occupancy is valid for measuring, accurately, the extent of usage of road space by vehicles. It may be noted that the area-occupancy concept can be applied to any traffic condition, from highly homogeneous to highly heterogeneous, and to any length of detection zone.

To check the validity of the concept, the occupancy and area-occupancy of a homogeneous traffic stream were estimated for different lengths of detection zone through simulation experiments, and the values were related to the density of the stream. It was found from the results of the simulation experiment that area-occupancy can be a substitute for occupancy.
Also, the estimated area-occupancy was found to remain unchanged with respect to changes in the length of the detection zone. Thus, it has been shown that area-occupancy, rather than occupancy, can be used as an indicator of road traffic concentration at any flow level, because of its ability to accurately replicate the extent of usage of the road space. Also, to depict the validity of area-occupancy as a replacement for density
(concentration), plots relating (i) area-occupancy with speed and flow and (ii) density with speed and flow were made, and it was found that area-occupancy and density exhibit similar trends in their relationships with speed and flow. It was also found through simulation experiments that area-occupancy remains stable with respect to changes in traffic composition under heterogeneous traffic conditions. Finally, relationships between flow, speed and area-occupancy of heterogeneous traffic, for the most common roadway and traffic conditions prevailing in India, have been developed and found to be logical. This further reinforces the fact that area-occupancy is an appropriate measure of traffic concentration for any roadway and traffic conditions.
27
Implementation of Fault Tolerance Techniques for Grid Systems

Meenakshi B. Bheevgade and Rajendra M. Patrikar
Visvesvaraya National Institute of Technology (Deemed University), Nagpur, Maharashtra State, India

1. Introduction

In the modern era of super-computing, the grid of computing nodes has emerged as a representative means of connecting distributed computers and resources scattered all over the world for the purposes of computing and distributed storage. Most organizations use the latest technologies to form a grid. Complex scientific problems in science and engineering run for a long time, so it becomes important to make them resistant to failures in the underlying hardware and infrastructure. Grid computing systems are used for the execution of applications that need considerable time, and a parallel computation cannot complete if any node failures are encountered. Therefore, fault tolerance has become a necessity. Fault-tolerant techniques usually trade efficiency against the reliability of the nodes in order to complete the computation even in the presence of failures; the goal is usually to preserve efficiency, in the hope that failures will be rare. Although the computational resources available in a grid have increased, its dynamic behavior makes the environment unpredictable and prone to failure.

A major hurdle parallel applications face today is the appropriate handling of failures that occur in the grid environment. Most application developers are unaware of the different types of failures that may occur there, and understanding and handling failures imposes an undue burden on developers already burdened with the development of their complex distributed applications. Thus, automatic handling of faults by grid middleware is necessary. This strategy makes the functionality transparent to application users and enables different data-intensive grid applications to become fault-tolerant without each having to pay a separate cost.

In this chapter, we discuss various strategies for fault tolerance. The watchdog timer, a popular method in embedded systems, has been used to bring fault tolerance into cluster and grid environments. Implementation details of a watchdog-timer-like strategy in a cluster environment are discussed in detail. This method requires modification of the application program. The methodology was further developed so that the program state is collected at regular intervals, allowing the program to be restarted at the appropriate point. The technique allows the execution of long-running parallel applications even in the presence of node failure. The watchdog timer method is used in our work to
take care of hardware as well as software faults. The implementation of this method helped us to detect the failure of a node. In embedded systems, the timer resets the system if a fault occurs. However, for parallel high-performance applications this strategy may prove costly and may not be feasible every time. It is desirable that the application running on the faulty node, which is usually a part of the program spawned by the main routine, should continue to run on a healthy node after failure detection. The strategy is that the application is resumed on the newly added node from the last saved data. In this method, the parallel application needs to be modified to collect the state of the intermediate steps. This strategy could be incorporated in the application itself; however, this may increase the complexity of the application and may slow it down. The benefit is that long-running applications need not start again from the beginning, which is usually very costly. This task is done by the master node, and we assume that this node has very high reliability. (In a grid there is no domain controller; all nodes are treated as being at the same hierarchical level. The node from which we execute and distribute the tasks is referred to as the master node for simplicity.)

Since the cluster and grid layers differ for middleware implementation, different sets of applications were developed. The collection of the state (i.e. data) of a parallel application at the middleware of the cluster and grid layers, using these different sets of applications, is referred to as the 'Sanchita' fault tolerant technique (Sanchita means 'collection' in Sanskrit; it is also an abbreviation of 'Search All Nodes for Collection of Hidden data (Process State) by Implementing The Algorithm'). The technique takes care of different types of failures so that the computation task can be completed and fault tolerance achieved in cluster and grid environments. These applications were integrated to make the mechanism transparent to the user.
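The chapter does not reproduce the checkpointing code itself, so the following is only a schematic sketch of the idea just described: the application periodically writes its intermediate state to disk so that, after a failure, it can resume from the last saved values. The state layout, file name and checkpoint interval are all hypothetical.

#include <stdio.h>

/* Hypothetical intermediate state of a long-running parallel task. */
struct task_state {
    long iteration;       /* last completed iteration      */
    double partial_sum;   /* intermediate numerical result */
};

/* Write the state to reliable storage (local disk here; the text also
 * copies it to the master node). Returns 0 on success. */
int save_state(const char *path, const struct task_state *s)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t ok = fwrite(s, sizeof *s, 1, f);
    fclose(f);
    return ok == 1 ? 0 : -1;
}

/* Reload the last saved state when the task restarts on a healthy node. */
int load_state(const char *path, struct task_state *s)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t ok = fread(s, sizeof *s, 1, f);
    fclose(f);
    return ok == 1 ? 0 : -1;
}

int main(void)
{
    struct task_state s = {0, 0.0};
    load_state("node3_state.dat", &s);       /* resume if a checkpoint exists */
    for (; s.iteration < 1000000; s.iteration++) {
        s.partial_sum += 1.0 / (s.iteration + 1.0);
        if (s.iteration % 100000 == 0)       /* checkpoint at regular intervals */
            save_state("node3_state.dat", &s);
    }
    printf("result = %f\n", s.partial_sum);
    return 0;
}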
2. Review

A great deal of work has been done on fault-tolerance mechanisms in distributed parallel systems. The focus of that work is on the provision of a single failure-recovery mechanism: different types of failures are detected (failure detection) and failure-recovery mechanisms are then applied. For parallel programs, there are vendor implementations of checkpoint/restart for MPI applications running on some commercial parallel computers. A checkpoint/restart implementation for MPI at NCCU Taiwan uses a combination of coordinated and uncoordinated strategies for checkpointing MPI applications. It is built on top of the NCCU MPI implementation and uses Libtckpt as the backend checkpointer. A local daemon process coordinates checkpointing of the processes running on the same node, while processes on different nodes are checkpointed in an uncoordinated manner using message logging (Sankaran et al., 2005). A limitation of the existing systems for checkpointing MPI applications on commodity clusters is that they use MPI libraries which primarily serve as research platforms in most cases. Secondly, the checkpoint/restart system is tightly coupled to a specific single-process checkpointer; since single-process checkpointers usually support a limited number of platforms, this limits the range of systems on which the technique can be applied. A transparent checkpoint-restart mechanism for commodity operating systems has been evaluated that checkpoints and restarts multiple processes in a consistent manner. This system combines a kernel-level checkpoint mechanism with a hybrid user-level and kernel-level restart mechanism to leverage existing operating system interfaces and functionality as much as possible for transparent checkpoint-restart (Laadan & Nieh, 2007). The fault-tolerance problem in terms of
resource failure was addressed in (Nazir & Khan, 2006). The authors devised a strategy for fault-tolerant job scheduling in a computational grid. This strategy maintains a history of fault occurrences of resources in the Grid Information Service (GIS). Whenever a resource broker has a job to schedule, it uses the Resource Fault Occurrence History (RFOH) information from the GIS and, depending on this information, applies different intensities of checkpointing and replication when scheduling the job on resources that have different tendencies towards faults. The fault-tolerance function is to preserve the delivery of expected services despite the presence of fault-caused errors within the system itself, as described by Avizienis (Avizienis, 1985): errors are detected and corrected, and permanent faults are located and removed while the system continues to deliver acceptable service. Execution of SPMD applications in a fault-tolerant manner was achieved using checkpointing or replication by Weissman (Weissman, 1999). For the purposes of a direct quantitative comparison, a simple checkpoint model was assumed in which each SPMD task saves its portion of the data domain on disk at a set of pre-determined iterations. Abawajy et al. (Abawajy, 2004) define a resource as any capability that must be scheduled, assigned, or controlled by the underlying implementation to assure non-conflicting usage by processes. Scheduling policies for grid systems can be classified into space-sharing (Thanalapati & Dandamudi, 2001) and time-sharing; it is also possible to combine these two types of policies into a hybrid policy to design an on-line scheduling policy. Nguyen-Tuong designed a framework (Nguyen-Tuong, 2000) which enables the easy integration of fault-tolerance techniques into object-based grid applications. Using programming tools augmented with fault-tolerance capabilities, they showed how applications can be written to tolerate crash failures. A fault-tolerance service was designed to be incorporated, in a modular fashion, into distributed computing systems, tools, or applications. This service uses well-known techniques based on unreliable fault detectors to detect and report component failure, while allowing the user to trade off timeliness of reporting against false-positive rates (Stelling et al., 1999). The authors in (Liang et al., 2003) described an approach from the grid user's viewpoint and considered the nature of grid faults across the board, based on a thread-state capturing mechanism, an exception-handling method, and mobile agent technology. Globus has become the de facto standard for grid computing; the Globus Toolkit consists of a set of tools and libraries to support grid applications. Fault-tolerance approaches in grid systems are commonly achieved with checkpoint-recovery and job replication, as described in (Weissman & Womack, 1996), (Abawajy, 2004) and (Townend, 2004), which create replicas of running jobs in the hope that at least one of them succeeds in completing the job. The authors in (Weissman & Womack, 1996) introduced a scheduling technique for a distributed system that suffers from increased job delays due to an insufficient number of remote sites to run the replicas. Fault tolerance in a grid environment was achieved by scheduling jobs in spite of insufficient replicas (Abawajy, 2004); this approach requires at least one site to volunteer to run the replica before execution can start.
Townend submitted job replicas to different sites that return a checksum of the result (Townend, 2004); the checksums received from the various sites are then compared to check whether the majority of results are the same, in order to avoid a result from a malicious resource. This delays the retrieval of the result until a majority has been reached; therefore, job delay may increase not only from failures but also from the verification overhead. Wrzesinska proposed a solution that avoids unneeded replication and restarting of jobs
by maintaining a global result table and allowing orphaned jobs to report to their grandparent in case their parent dies (Wrzesinska et al.). However, their approach is strictly for divide-and-conquer applications and cannot be extended to environments where the sub-processes require communication. 'Grid Workflow' leaves recovery decisions to the submitter of the job via user-defined exception handling (Hwang & Kesselman, 2003). Grid Workflow employs various task-level error-handling mechanisms, such as retrying on the same site, running from the last checkpoint, and replicating to other sites, as well as masking workflow-level failure. Nonetheless, most of the task-level fault-tolerant techniques mentioned by Abawajy, Dandamudi, Weissman and Womack (Abawajy & Dandamudi, 2003; Weissman & Womack, 1996; Abawajy, 2004) attempt to restart the job on alternative resources in the grid in the event of a host crash. Hence, there is a need to complement these approaches by improving failure handling at the site level, especially in a cluster-computing environment.

LinuxHA is a tool for building high-availability Linux clusters using data replication as the primary technology. However, LinuxHA only provides a heartbeat and failover mechanism for a flat-structure cluster, which does not easily support the Beowulf architecture commonly used by most job sites. OSCAR is a software stack for deploying and managing Beowulf clusters (Mugler et al., 2003; Naughton et al.). This toolkit includes a GUI that simplifies cluster installation and management. Unfortunately, a detrimental factor of the Beowulf architecture is the single point of failure (SPoF): a cluster can go down completely with the failure of the single head node. Hence, there is a need to improve the high-availability (HA) aspect of the cluster design. The recently released HA-OSCAR software stack is an effort that makes inroads here. HA-OSCAR deals with availability and fault issues at the master node with a multi-head failover architecture and service-level fault-tolerance mechanisms. PBS (Bayucan et al., 1999) and Condor (Tannenbaum et al., 2002) are resource management software widely used in the cluster community. While an HA solution (CCM; LinuxHA) for the Condor job manager exists, there is a dearth of such solutions for the PBS job manager. The failure of the Condor Central Manager (CM) leads to an inability to match new jobs and to respond to queries regarding job status and usage statistics. Condor attempts to eliminate this single point of failure (i.e. the Condor CM) by having multiple CMs and a high-availability daemon (HAD) which monitors them and ensures one of them is active at all times. Similarly, HA-OSCAR's self-healing core monitors the PBS server among other critical grid services (e.g. xinetd, gatekeeper, etc.) to guarantee high availability in the event of any failure. Evaluation of fault-tolerance designs (Object Management Group, 2001; Friedman & Hadad) has been performed by means of simulation, experiments, or a combination of these techniques. The reliability prediction of the system is compared to that of the system without fault tolerance. Physical parameters and the quality of fault detection and recovery algorithms (Foster & Iamnitchi, 2000; Foster et al., 1999; Rao et al., 1999; Felber & Narasimhan, 2004) are used as parameters in generating reliability predictions.
When degradation of performance takes place during recovery, reliability predictions need to be generated for various levels of performance. A different evaluation is needed when the reliability requirement includes a specification of a minimum number of faults that are to be tolerated, regardless of where in the system they occur. The fault-tolerance techniques described in (Foster & Iamnitchi, 2000; Foster et al., 1999; Rao et al., 1999) often result in lower performance because of the overheads required for the reallocation of a job that was subjected to graceful degradation. A gracefully degradable system is one in which the user
does not see errors except, perhaps, as a reduced level of system functionality. Current practice in building reliable systems is not sufficient to efficiently build graceful degradation into any system. In a system with an automatic reconfiguration mechanism, graceful degradation becomes fairly easy to accomplish: after each error is detected, a new system reconfiguration is performed to obtain maximal functionality from the remaining system resources, resulting in a system that still functions, albeit with lower overall utility.

A watchdog timer is a computer hardware timing device that triggers a system reset if the main program does not respond due to the occurrence of some fault. The intention is to bring the system back from the hung state into normal operation. The most common use of the watchdog timer is in embedded systems, where the specialized timer is often a built-in unit of a microcontroller. A watchdog timer may be more complex, attempting to save debug information onto a persistent medium, i.e. information useful for debugging the problem that caused the fault. The watchdog timer has been utilized as a fault-tolerant technique in many serial applications. The usual implementation of this technique requires a hardware timer/counter that interrupts the CPU for corrective actions. Thus, although fault-tolerant clusters have been researched for some time now, implementation of a fault-tolerance architecture remains a challenge.

There are various types of failures which may occur in a cluster. Predicting the failure mechanism is a very difficult task, and strategies based on a particular failure mode may not help (Rao et al., 1999; Townend & Xu, 2003; Felber & Narasimhan, 2004). Typically, fault detection is done by some sort of query-and-response method (Townend & Xu, 2003). When a node is assumed to be faulty because it is not responding, software testing is done to re-check whether the node responds. If the fault is transient, rechecks are performed; if repeated rechecks show failure of the system, the system is rebooted. If the fault persists after rebooting, the node is deemed faulty and has to be removed from the list of available nodes. For this purpose, two algorithms, a fault-detection algorithm and a recovery algorithm, have been studied and partially implemented in our work. In the next section we define important terms which are used in the implementation of all the fault-tolerance mechanisms.

2.1 Process State, RPC and Compute Bound

Process state is the state field in the process descriptor. A process, also referred to as a task, is an instance of a program in execution. A runnable process is either currently running or in a ready state waiting to run. This is the only possible state for a process executing in the portion of system memory in which user processes run; it can also apply to a process in kernel space (i.e., that portion of memory in which the kernel executes and provides its services) that is actively running. A signal is a very short message that can be sent to a process, or to a group of processes, and that contains only a number identifying the signal. It permits interaction between user-mode processes and allows the kernel to notify processes of system events. A signal can make a process aware that a specific event has occurred, and it can force a process to execute a signal-handler function included in its code.
Processes in user mode are forbidden to access those portions of memory that are allocated to the kernel or to other programs. RPC (Remote Procedure Call): a system call is a request, made via a software interrupt by an active process, for a service performed by the kernel. The wait system call tells the operating system to suspend execution of the calling process until another process has completed. An interrupt is a signal to the kernel that an event has occurred; it thus
results in changes in the sequence of instructions executed by the CPU. A software interrupt, also referred to as an exception, is an interrupt that originates in software, usually from a program in user mode. When process execution has been stopped, the process is not running, nor is it eligible to run. This state occurs if the process receives the SIGSTOP, SIGTSTP, SIGTTIN or SIGTTOU signal (whose default action is to stop the process to which it is sent) or if it receives any signal while it is being debugged.

Compute bound: computers that predominantly use peripherals are characterized as I/O (input/output) bound. That a computer is frequently CPU bound implies that upgrading the CPU or optimizing code will improve overall computer performance. CPU bound (or compute bound) describes the situation in which the time for a computer to complete a task is determined principally by the speed of the central processor and main memory: processor utilization is high, perhaps at 100% usage for many seconds or minutes, and there are few interrupts, such as those generated by peripherals. With the advent of multiple buses, parallel processing, multiprogramming, preemptive scheduling, advanced graphics cards, advanced sound cards and, generally, more decentralized loads, it became less likely that one particular component could be identified as always being the bottleneck; it is likely that a computer's bottleneck shifts rapidly between components. Furthermore, in modern computers it is possible to have 100% CPU utilization with minimal impact on other components. Finally, the tasks required of modern computers often emphasize quite different components, so that resolving the bottleneck for one task may not affect the performance of another. For these reasons, upgrading a CPU does not always have a dramatic effect; the concept of being CPU bound is now one of many factors considered in modern computer performance. We have developed code that tells us whether a failure has been encountered. This is done by transferring data from all the nodes to the master node (the master node being the node which spawns the jobs to the other nodes) in the grid environment.
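To make these terms concrete, the short program below exercises the calls discussed in this subsection: a parent stops and resumes a compute-bound child with SIGSTOP/SIGCONT and observes the state changes through waitpid(). It illustrates the concepts only and is not part of the middleware described in this chapter.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {                      /* child: a compute-bound loop */
        volatile double x = 0.0;
        for (long i = 0; i < 400000000L; i++)
            x += 1.0 / (i + 1.0);
        _exit(0);
    }

    int status;
    sleep(1);
    kill(pid, SIGSTOP);                  /* process execution is stopped */
    waitpid(pid, &status, WUNTRACED);    /* report the stopped state     */
    if (WIFSTOPPED(status))
        printf("child %d stopped by signal %d\n", (int)pid, WSTOPSIG(status));

    kill(pid, SIGCONT);                  /* make it runnable again */
    waitpid(pid, &status, 0);            /* suspend until the child completes */
    if (WIFEXITED(status))
        printf("child %d exited normally\n", (int)pid);
    return 0;
}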
3. Implementation

Today many idle machines are used to form clusters so that parallel applications can be executed on them. We have used the LAM/MPI software to form a cluster of workstations and PCs; LAM/MPI is open software and is mostly used for research work. Another popular standard for inter-cluster communication is the Globus Toolkit, which is used to form a grid; this tool has also been used mainly for research work. Many commercial tools are available to form a grid, but since they are intended for commercial purposes rather than research, we have not used them. In our work, we built two different clusters; later, the two clusters were connected to form a grid (cluster-to-cluster communication), which handles run-time communication between parallel programs running on the different clusters. In this work, we first worked on a single cluster to find the failures encountered while executing a parallel application. (The master nodes of the two clusters were connected to form the grid.) The focus of the work was to design a method, in the cluster and grid environment, that helps to detect any failures occurring during the computation of a parallel application. In addition, a method is implemented which tolerates the failures encountered and completes the parallel computation. The system is divided into two layers, referred to as:
1) the grid layer (i.e. the upper layer), and
2) the cluster layer (i.e. the lower layer).
The two layers are shown in Figures 1a and 1b.
Fig. 1a Cluster Layer
Fig. 1b Grid Layer

The grid layer of the system connects individual computers and clusters to form a grid, as shown in Figure 1b. The cluster layer connects the nodes in one group so that they can offer their resources to the grid. These two layers are completely different, but they depend on each other when a parallel application is executed in the grid environment. For these reasons, different problems arise and different software solutions were used in the two cases. In the grid environment, a Monte Carlo parallel application was executed from a specific node, say n; for convenience, we refer to that node as the master node, although there is no specific master/client concept in a grid environment. From the user's point of view, both layers appear as one system. The cluster-to-cluster communication approach allows programmers to collect the results of a parallel program stored on one node onto another node.

In the cluster layer, we considered a set of nodes under the control of one administrator. The nodes need to be detached from the grid during the routine work of the user (the owner of the node, since we were utilizing the idle time of the computer) and re-connected when idle. A node cannot be part of the grid while its user is working with it; the node may be rebooted incorrectly, and the tasks it was processing may be killed. If only part of the node¹ is used while the user is working, the rest may still be connected to the cluster to continue the computation.

¹ In a multi-processor node, the user may use one processor while the remaining processors are allocated to the cluster for the computation of the parallel application. Of the remaining nodes available in a cluster, only a few may be part of the cluster and grid while the owners of the other nodes use them for their personal work.
In the grid layer, as mentioned above, we connect two different clusters and a few individual compute nodes, with the various security policies (authorization and authentication to access the nodes) required in our project; we use the Globus Toolkit for this purpose. This toolkit provides basic mechanisms such as communication, authentication, network information, data access, resource location, and resource scheduling. As the size of high-performance clusters multiplies, the probability of system failure also grows, thereby limiting further scalability. A system fault can be caused by internal or external factors. Internal factors include specification and design errors, manufacturing defects, component defects, and component wear-out; external factors include radiation, electromagnetic interference, operator error, natural disasters, and so on. A failure may also be due to hardware faults or software faults. As a result, it is almost certain that applications which run for a long time will face problems from either type of fault. Regardless of how well a system is designed, or how reliable its components are, a system cannot be completely free from failures; however, it is possible to manage failures so as to minimize their impact on the system.

We implemented the fault-tolerance technique (which we call the watchdog timer algorithm technique) for a cluster by writing routines on the master (server) node. The method implemented in our work includes re-checks, in the initial allocation phase, to take care of transient faults. The cluster environment is booted using only the available nodes. If a failure is detected at the beginning, while allocating the task, the task is reallocated. After allocation, the state of the entire parallel application is checked at specific time intervals, as is done by a hardware watchdog timer. In this phase, the master node monitors the output data of the program on each client node at specific time intervals. At the same time, the data is stored on reliable storage on the local disk as well as on the disk of the master node; this is done for all the nodes in the list of available nodes. If any node does not report or respond when the master node sends a message to all nodes in the cluster environment, that particular node is checked x times, and the master node reallocates the task if a failure is detected. This is done, again, to avoid transient faults; after a certain interval of time, the task is allocated once more if the application has terminated. The check is repeated x times for any one of the reasons tabulated in Table 1.
1. If at the first call the client's CPU utilization is 100%, it will be unable to respond about its existence. In the second call, the client may respond and update the status.
2. If a link failure occurs due to network failure or some other reason, the master node may try to reboot the client remotely; in the next call the client may then respond and update the status.
3. If it is not possible to reboot the client remotely, it has to be rebooted at the site; if the rebooting is done within the stipulated time, it will respond in the next call.
4. Lastly, if none of the above options is feasible and there is no response from the client in the stipulated period, the master node will drop the current node from the list, add a new node, and resume the job from the earlier checkpointed state on this new node.
Table 1. Reasons for a non-responding node
After the allocation phase starts, the client nodes are checked at regular intervals. If the allocation of a job fails, or the computation of the application fails due to network failure, node failure, or some hardware component failure, then a flag is set to 1, indicating that the node is not usable for the computation. Lastly, if, within the stipulated time period, none of the above options is feasible and there is no response from the client node, the master node drops the current node from the list, adds another node, and resumes the job from the last best checkpointed state on another available or added node. It is possible to calculate a checksum that indicates correct behavior of a computation node; the watchdog timer algorithm periodically tests the computation against that checksum. The master node collects the data from which it determines the 'state' of each client node when queried. The algorithm, given below in Figure 2, is in the form of pseudo code.
The watchdog timer algorithm described below is executed after LAM booting is done. Before LAM booting, a small procedure is run to create the list of participating hosts.
for (i = 1 to p) {
    do while (!execution_complete) {
        monitor the application at regular intervals of time, say t seconds;
        collect the data into the client and send it to the master when queried;
        if (client !responding) {
            check for x times;           // reasons as given in Table 1
            if (success)
                continue;
            else {
                drop the node;
                add a new node;
                resume application on another available node, say m, in the queue;
            }
        }
    }
}
Fig. 2. Watchdog Timer Fault Tolerant Technique

The actual implementation of the watchdog timer technique in our project work helped to detect faults. The application was resumed on an available node from the previously saved state data, which allowed the computation to recover with the help of the watchdog timer algorithm. Thus, using the fault-detection algorithm, the recovery algorithm, and watchdog theory, we were able to develop the new algorithm. This method helped us to improve the reliability of the application, although performance degraded slightly because of the computation overheads. Another drawback is that, to store the data at specified intervals of time, the parallel application(s) had to be modified.
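A compact C rendering of the pseudo code in Figure 2 is given below, as a sketch only: node_responding() stands for the aliveness query actually used, and collect_state(), drop_node(), add_node() and resume_on() are hypothetical stubs for the middleware actions named in the algorithm.

#include <stdio.h>
#include <stdbool.h>
#include <unistd.h>

#define MAX_RETRIES 3   /* "check for x times", see Table 1 */
#define PERIOD_SEC  5   /* monitoring interval t            */

/* Hypothetical helpers standing in for the real middleware calls. */
bool execution_complete(void)   { static int n; return ++n > 10; }
bool node_responding(int node)  { (void)node; return true; }
void collect_state(int node)    { printf("state collected from node %d\n", node); }
void drop_node(int node)        { printf("node %d dropped\n", node); }
int  add_node(void)             { puts("spare node added"); return 99; }
void resume_on(int node)        { printf("job resumed on node %d\n", node); }

void watchdog(int *nodes, int p)
{
    while (!execution_complete()) {
        sleep(PERIOD_SEC);                  /* monitor at regular intervals */
        for (int i = 0; i < p; i++) {
            int tries = 0;
            while (!node_responding(nodes[i]) && ++tries < MAX_RETRIES)
                sleep(1);                   /* re-check: fault may be transient */
            if (tries < MAX_RETRIES) {
                collect_state(nodes[i]);    /* checkpoint data to master node */
            } else {
                drop_node(nodes[i]);        /* permanent fault assumed */
                nodes[i] = add_node();
                resume_on(nodes[i]);        /* restart from last saved state */
            }
        }
    }
}

int main(void)
{
    int nodes[] = {1, 2, 3};
    watchdog(nodes, 3);
    return 0;
}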
To overcome this serious issue of fault tolerance, another method, the Sanchita fault tolerant technique, was implemented, which collects the data at specified intervals of time. The data is collected when the process is compute bound rather than I/O bound. This data collection allows the interrupted computation to be restarted on another available node, and it is a very effective approach for dealing with specific faults (such as link failure and node failure). The state of the entire parallel application is saved into a file and later transferred to the master node (the node on which the application is launched and from which the tasks are distributed). In this work, the Linux tool strace was used, and a small program was written to obtain the parent PID and child PID of the executing program. The tool helped to collect information about system calls and to store the signals received by the process; it was used to store the relevant data of the parallel application at specified intervals of time, and it runs until the computation of the parallel program is complete. The relevant data is stored in a file on the local hard disk. The master node queries (sends a message to) the client nodes to check for aliveness; a client node responds to the query and sends the stored file back to the master node. When a fault is encountered, the application is restarted from the most recent saved point.

To begin applying the fault-tolerant techniques, the experiment was first done on one cluster only; later, the work was carried out to find the problems encountered when the parallel application is executed in the grid environment. In this technique, the master node checks the aliveness of each node of the cluster at regular intervals of time using a program. If a client node receives the query from the master node, it replies to the master node about its existence and, at the same time, updates the master node with the status of the running process. If the client node does not respond, then before adding another node and resuming the process on it, the master node checks x times and waits approximately x+1 seconds for a response. A number of situations can arise in each call, as explained in Table 1. The Sanchita fault tolerant technique was implemented and tested to store the last saved state of a standard Monte Carlo application. The process state is resumed on the same node if soft errors are encountered; if hardware errors are encountered, such as a communication failure or a physical problem with the node, the process is resumed on another available node. The Sanchita technique applied at the middleware layer is given in the form of pseudo code in Figure 3; each statement in the pseudo code corresponds to one or more programs (or sub-routines).
Algorithm (in the form of pseudo code):
while (execution in progress) do
begin
    for (each node in cluster) do
    begin
        ping node for aliveness;
        if (alive) then
        begin
            collect checkpoint state from node;
            save in master node file;
        end;
        else
        begin
            retry check for aliveness for k times;
            if (success) then
            begin
                collect checkpoint state from node;
                save in master node file;
            end;
            else
            begin
                drop node;
                transfer computation to another node;
            end;
        end;
    end;
end;
Fig. 3. Algorithm used in the Sanchita Fault Tolerant technique
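The state capture in the Sanchita technique relies on the standard Linux strace tool. A minimal sketch of how a monitoring process might attach strace to a running job and log its system calls and signals into a time-stamped file is shown below; the node number, the file-name pattern and the helper names are assumptions for illustration.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

/* Attach strace to a running process and log its system calls and
 * signals into a time-stamped file on the local disk. */
int trace_process(pid_t job_pid, int node_no)
{
    char fname[64], pidstr[16];
    time_t now = time(NULL);
    struct tm *t = localtime(&now);

    /* File-name convention from the chapter: node number, date, time. */
    snprintf(fname, sizeof fname, "node%d_%04d%02d%02d_%02d%02d%02d.trc",
             node_no, t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
             t->tm_hour, t->tm_min, t->tm_sec);
    snprintf(pidstr, sizeof pidstr, "%d", (int)job_pid);

    pid_t tracer = fork();
    if (tracer == 0) {
        /* -f: follow children; -o: output file; -p: attach to PID */
        execlp("strace", "strace", "-f", "-o", fname, "-p", pidstr, (char *)NULL);
        _exit(127);                     /* exec failed */
    }
    return tracer > 0 ? 0 : -1;
}

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <pid>\n", argv[0]); return 1; }
    if (trace_process((pid_t)atoi(argv[1]), 3) == 0)
        puts("strace attached; state is being logged");
    wait(NULL);                         /* runs until the traced job ends */
    return 0;
}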
4. Results

The watchdog timer algorithm was implemented on a Linux-based SMP cluster. All nodes had the LAM/MPI software installed, and the Monte Carlo application was tested in parallel mode. The time required for execution with and without the watchdog timer algorithm is shown in Figure 4. The columns plotted in the graph show the following: column 1 shows the case in which the watchdog timer algorithm (WDTA) was not applied and no faults were injected; column 2 shows the data when the WDTA was applied and no fault was injected (this is to check that the WDTA code worked properly); and column 3 shows the case in which the WDTA was applied and one failure was injected. The files are saved at intervals of t seconds on the individual nodes. The filename consists of the node number followed by the date and the time; this time stamp helps to recognize where the data was collected from. The files are collected on the master node when it queries the clients. If any failure is encountered, the node is checked, say, x times about its existence; if after checking the client node still does not respond, the node is treated as a failed node and replaced by another node. Sometimes, to improve the performance of the grid, we added a few nodes to the cluster environment; these nodes were added only to check whether the performance improved in the grid
environment. The performance improved considerably. The programs to add and drop nodes were written using a scripting language and C/C++; they also helped to improve the performance of the computation, and the code was able to restart the application on another node. The work was then extended to the grid environment. We also took care of the reliability of the grid environment, calculating its reliability in order to decide how many nodes to keep as spares. The work made it possible to resume the application from the last stored data, based on the notions of compute-bound execution and process state described earlier. Thus, in the grid environment as well, the status of the parallel application computed on the master node and on the client nodes (which are denoted by ranks from 0 to n, where n is the number of nodes) was collected. The same naming methodology is used here: the filename consists of the node number followed by the date and the time, and this date and time stamp helps to recognize where the data was collected from. The work was carried out to detect any failed node; when one was encountered, it was possible to resume the job on a new node and drop the failed one.

The second method, the Sanchita fault tolerance technique (SFTT), was then implemented at the middleware level of the cluster and grid environment. When the application starts execution, the middleware-level code takes care of the failures encountered. Using the Sanchita technique, the status of the computation was collected; a few lines of the collected trace are shown in Figure 5. Figure 6 shows the graph plotted for the different combinations given in Table 2.
When SFTT code applied    Fault encountered    Node added
No                        No                   Yes
No                        No                   No
Yes                       No                   Yes
Yes                       No                   No
Yes                       Yes                  Yes
Table 2. Data for SFTT

Fig. 4. Before and after applying the WDTA code
read(3, 0xbffacf80, 80) = ? ERESTARTSYS (To be restarted)
--- SIGINT (Interrupt) @ 0 (0) ---
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7fe1708) = 5304
sigreturn() = ? (mask now [USR2])
read(3, 0xbffacf80, 80) = ? ERESTARTSYS (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
read(3, "\0\0\0\0\0\0\0\0\20\0\0\0\6\244\4\10\1\0\0\0\360\337\354"..., 80) = 8
readv(3, [{"n\353\377\277\0\0\0\0\0\0\0\0n\353\377\277\1\0\0\0\20\0"..., 64}, {"\0\3\0\17\0\0\24\223\0\0\0\0\0\0\0\0", 16}], 2) = 80
rt_sigprocmask(SIG_UNBLOCK, [USR2], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [USR2], NULL, 8) = 0
… {sa_family=AF_INET, sin_port=htons(32769), sin_addr=inet_addr("192.168.0.202")}, [16]) = 95
sendto(6, "\0\2\310k\0\0\0\0", 8, …
recvfrom(6, "\0\0\0\1\0\2\310A\0\0\0\0@\0\0\7\0\0\0\0\0\0\0#\0\0\1\0"..., 8252, … sin_addr=inet_addr("192.168.0.203")}, [16]) = 95
sendto(6, "\0\2\310l\0\0\0\0", 8, 0, {sa_family=AF_INET, sin_port=htons(32769), sin_addr=inet_addr("192.168.0.203")}, 16) = 8
write(14, " 100 200 2470860"..., 35) = 35
select(13, [3 5 6 7 10 11], NULL, [3 5 6 7 10 11], NULL) = 1 (in [6])
recvfrom(6, …
Fig. 5. Sample file tracing the parallel application running on the grid system (only a few lines of the file are shown)
Fig. 6. The data collected before and after implementing the Sanchita Fault Tolerant Technique in the cluster and grid environment
In Figure 6, columns 1 and 2 show the data when SFTT was not applied and no failures were injected, with and without an added node; the time required after adding one node is less than that of column 1. The third and fourth columns show the case when the Sanchita Fault Tolerant Technique (SFTT) was applied with no failure injected but with a node added; these indicate that the SFTT code checks for every possibility of failure in order to complete the task. Column 5 shows the case when SFTT was applied, one node failure was encountered during the computation, and another available node was added. From the last two columns, it can be concluded that slightly more time is required to complete the computation.
5. Conclusion

Fault-tolerant techniques are required in today's cluster and grid environments, since large applications are now run for various purposes and the failure of a single node can delay execution, and thereby development, by days. In this chapter, two such techniques were described. These techniques were implemented in the application and in the middleware layer of the cluster as well as the grid environment. Both techniques, i.e. the watchdog timer and Sanchita, were helpful in taking care of failures. However, in the case of the watchdog timer algorithm, the parallel application requires modification to obtain the status of the intermediate steps of the computation at specific intervals of time, which is a burden on the application developer. To overcome this, the Sanchita FT technique was implemented. In this technique, the states of the running jobs are collected at the central (master) node and intermediate results are stored. An incomplete task (computation) due to node failure is resumed on another node using the latest computational data stored in a file. The application will thus take longer, but execution will be completed by resuming the task, using the latest stored data, on another node. The system will therefore degrade gracefully under failure, while overall reliability improves, which is what is needed for the execution of many complex programs.
6. Acknowledgement

I would like to thank Professor Dr. C. S. Moghe for his kind support and suggestions during this work.
7. References

Laadan, O. & Nieh, J. (2007). Transparent Checkpoint-Restart of Multiple Processes on Commodity Operating Systems, Proceedings of the 2007 USENIX Annual Technical Conference.
Nazir, B. & Khan, T. (2006). Fault Tolerant Job Scheduling in Computational Grid, IEEE, pp. 708-713.
Avizienis, A. (1985). The N-version Approach to Fault-Tolerant Software. IEEE Transactions on Software Engineering, Vol. 11.
Abawajy, J. H. (2004). Fault-Tolerant Scheduling Policy for Grid Computing Systems, IEEE.
Abawajy, J. H. & Dandamudi, S. P. (2003). Parallel Job Scheduling on Multi-Cluster Computing Systems, Proceedings of the IEEE International Conference on Cluster Computing (Cluster 2003), Hong Kong, China, December 1-4, 2003.
Abawajy, J. H. (2004). Fault-Tolerant Scheduling Policy for Grid Computing Systems, 18th International Parallel and Distributed Processing Symposium, Santa Fe, New Mexico, April 26, 2004.
Thanalapati, T. & Dandamudi, S. (2001). An Efficient Adaptive Scheduling Scheme for Distributed Memory Multicomputers. IEEE Transactions on Parallel and Distributed Systems, Vol. 12, No. 7, pp. 758-768, July 2001.
Nguyen-Tuong, A. (2000). Integrating Fault-Tolerance Techniques in Grid Applications, Report.
Stelling, P. et al. (1999). A Fault Detection Service for Wide Area Distributed Computations.
Liang, J. et al. (2003). A Fault Tolerance Mechanism in Grid, IEEE, pp. 457-461.
Weissman, J. B. & Womack, D. (1996). Fault Tolerant Scheduling in Distributed Networks, Technical Report CS-96-10, Department of Computer Science, University of Virginia, September 25, 1996.
Townend, P. & Xu, J. (2003). Fault Tolerance within a Grid Environment, Proceedings of AHM2003, http://www.nesc.ac.uk/events/ahm2003/AHMCD/pdf/063.pdf, p. 272.
Hwang, S. & Kesselman, C. (2003). Grid Workflow: A Flexible Failure Handling Framework for the Grid, Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing, June 22-24, 2003, pp. 126-137.
Mugler, J. et al. (2003). OSCAR Clusters, Proceedings of the Ottawa Linux Symposium (OLS'03), Ottawa, Canada, July 23-26, 2003.
Naughton, T. et al. The OSCAR Toolkit.
Tannenbaum, T.; Wright, D.; Miller, K. & Livny, M. (2002). Condor - A Distributed Job Scheduler, In: Beowulf Cluster Computing with Linux, The MIT Press. ISBN: 0-262-69274-0.
Bayucan, A. et al. (1999). Portable Batch System External Reference Specification, MRJ Technology Solutions, May 1999.
Foster, I. et al. (2001). The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of Supercomputer Applications, Vol. 15, No. 3.
Foster, I. & Iamnitchi, A. (2000). A Problem-Specific Fault-Tolerance Mechanism for Asynchronous, Distributed Systems, IEEE, pp. 4-13.
Foster, I. et al. (1999). A Fault Detection Service for Wide Area Distributed Computations. Cluster Computing, Vol. 2, No. 2, pp. 117-128.
Rao, S.; Alvisi, L. & Vin, H. M. (1999). Egida: An Extensible Toolkit for Low-Overhead Fault-Tolerance, Fault-Tolerant Computing: Digest of Papers, Twenty-Ninth Annual International Symposium, pp. 45-55.
Ganssle, J. G. (2004). Great Watchdogs, V-1.2, The Ganssle Group, updated January 2004.
Townend, P. & Xu, J. (2003). Replication-Based Fault-Tolerance in a Grid Environment.
Felber, P. & Narasimhan, P. (2004). Experiences, Strategies, and Challenges in Building Fault-Tolerant CORBA Systems. IEEE Transactions on Computers, Vol. 53, No. 5, May 2004.
Sankaran, S. et al. (2005). Parallel Checkpoint/Restart for MPI Applications. International Journal of High Performance Computing Applications, Vol. 19, No. 4, pp. 479-493.
CCM: Adding High Availability to the Condor Central Manager, http://dsl.cs.rechnion.ac.il/projects/goxal/project_pages/ha/ha.html.
LinuxHA Clustering Project, http://www.linuxha.net/index.pl.
28
A Case Study of Modelling Concave Globoidal Cam

Nguyen Van Tuong and Premysl Pokorny
Technical University of Liberec, Czech Republic
1. Introduction

Globoidal cam mechanisms are widely used in industry. Compared to other cam-follower systems, globoidal cam-follower mechanisms have many advantages, such as a compact structure, high loading capacity, low noise, low vibration, and high reliability. They are widely used in machine tools, automatic assembly lines, paper-processing machines, packing machines, and many other automated manufacturing devices.

In terms of shape, the globoidal cam is one of the most complicated cams. The most important task when modelling globoidal cams is to represent their working surfaces, i.e. the surfaces that contact the roller surfaces. These surfaces are very complex and very difficult to create accurately. To date, a number of works dealing with ways to describe these surfaces accurately have been published. Some researchers derived mathematical expressions for the surface geometry of the globoidal cam with cylindrical, hyperboloid, or spherical rollers, based on coordinate transformation, differential geometry, and the theory of conjugate surfaces (Yan & Chen, 1994 & 1995; Yan-ming, 2000; Cheng, 2002; Lee & Lee, 2001 & 2007; Chen & Hong, 2008). They developed their own programs, written in Visual Basic, C, or C++, to assist their studies. Some programs can draw and display a draft of the cam contour, or create the solid model; some can create the data of the cam profile curve, which then becomes the input for CAD/CAM software such as Unigraphics to build the CAD model (Chen & Hong, 2008). Other researchers also described the cam surface mathematically, but used the computer to develop a package, combining AutoCAD R14, 3D Studio Max, and VBA, to generate the surfaces of the roller gear cam (En-hui et al., 2001). In addition, some researchers studied the creation of globoidal cams with conical rollers from a machining point of view (Tsay & Lin, 2006). They represented the surface geometry of the cam as the swept surfaces of the tool paths. In their study, the expressions of a surface normal, a ruled surface and its offset, and meshing vectors and meshing angles were introduced to define the accurate swept surfaces generated by the roller follower, and a program in C++ was developed to generate the surface coordinates of the cam. In general, the works mentioned above have used mathematical expressions for the globoidal cam surfaces, together with various cam laws, as the input data to generate the cam surfaces.
From a machining point of view, the globoidal cam surfaces can be determined from the corresponding angular displacements of both the cam and the driven member (Koloc & Vaclavik, 1993). These angular displacements can be extracted from the NC program generated by special software specialized for cam mechanisms; they can also be obtained from the follower displacement equations. In this study, a concave globoidal cam with a swinging roller follower is modelled from the angular input and output displacements. The objective of this chapter is to introduce some effective methods for modelling concave globoidal cams. These methods can be implemented using commercial CAD/CAM software; in this chapter, Pro/ENGINEER® Wildfire 2.0 is used to create the cams for an illustrative example that we have recently carried out for industry in the Czech Republic. In this case study, the input data for modelling are the angular input and output displacements of the cam and the follower. Furthermore, besides the modelling methods, some important techniques that help designers find the most accurate model are also presented. In this chapter, the terms "globoidal cam surface(s)" and "cam surface(s)" refer to the working surface(s) of the globoidal cam.

The outline of the chapter is as follows. Section 2 presents the theoretical background of the concave globoidal cam. In Section 3 we describe in detail some modelling methods that can be used to create the globoidal cam surfaces. An application example is presented in Section 4. Finally, concluding remarks are given in Section 5.
2. Theoretical background of concave globoidal cam
There are two types of globoidal cams. The first type has a groove on its surface, and the roller follower oscillates when the cam rotates. This type has two subtypes, convex and concave (Figure 1). These cams are used for small angles of follower oscillation. The second type has one or more ribs on a globoidal body. This type is also called a roller gear cam or Ferguson drive (Rothbart, 2004). The two surfaces of the rib always contact the rollers (cylindrical, spherical or conical) of the follower. This follower may oscillate about its axis or have an intermittent rotation. Figure 2 shows two subtypes of the concave globoidal cam: the concave globoidal cam with an oscillating follower and the concave globoidal cam with an indexing turret follower. The rib of these cams looks like a thread or a blade, so they are sometimes called thread-type or blade-type globoidal cams. In this study, we deal with the single thread-type globoidal cam. Figure 3 illustrates the geometrical relationships of a concave globoidal cam with an oscillating follower. According to the structure of the globoidal cam, as can be seen from Figures 2 and 3, the symmetry plane of the follower contains the cam axis, and the axes of the rollers intersect at a point that lies on the follower axis. In Figure 3, the development plane is the plane that is normal to the axis of the roller and located anywhere along the length of the roller. The intersection point between the development plane and the axis of the roller is the pitch point (P). The datum plane is the plane normal to the cam axis that contains the follower axis. The angular displacement of the roller is measured from this plane. The following are some parameters related to a globoidal cam-follower system (Koloc & Vaclavik, 1993; Reeve, 1995).
Fig. 1. Globoidal cam - groove type, oscillating follower: (a) convex cam; (b) concave cam
Fig. 2. Globoidal cam - thread type: (a) oscillating follower; (b) indexing turret follower
θ – angular input displacement (the rotation angle of the cam).
φ – angular output displacement from the datum plane (the rotation angle of the follower). φ is related to θ and can be expressed by the function φ = f(θ) (En-hui et al., 2001).
φ0 – angle from the datum plane to the start of follower motion, measured in the direction of motion. If the start point is encountered after the datum plane, then φ0 is positive.
β1 – angle between the axis of the upper roller and the datum plane. At the beginning, when the upper roller is at the starting point, β1 = φ0.
β2 – angle between the axis of the lower roller and the datum plane.
t – distance from the axis of the follower to the end of the roller, measured along the roller axis.
e – clearance between the end of the roller and the cam body.
F – distance from the axis of the follower to the pitch point.
C – distance between the cam axis and the turret axis.
R – perpendicular distance from the cam axis to the pitch point, expressed as
R = C – F·cos(β1)   (1)

h – distance from the pitch point to the datum plane. It is the height of the point P, given by

h = F·sin(β1)   (2)
Obviously, the coordinates of the pitch points on the rollers can be calculated if the angular input and output displacements are known. From these coordinates and some other information, the pitch surfaces of the cam can be modelled.
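As a side note, equations (1) and (2) are straightforward to evaluate numerically. The following minimal sketch (Python; the function name is ours, and the dimensions are taken from the example in Section 4, where C = 107.8 mm and F = 59.7 mm) returns the cylindrical coordinates of a pitch point for a given roller-axis angle β:

    import math

    def pitch_point(C, F, beta_deg):
        """Cylindrical coordinates (R, h) of the pitch point P for a
        roller-axis angle beta measured from the datum plane."""
        beta = math.radians(beta_deg)
        R = C - F * math.cos(beta)   # eq. (1): radius from the cam axis
        h = F * math.sin(beta)       # eq. (2): height above the datum plane
        return R, h

    # at the start of motion, beta1 equals phi0 = 7.49 degrees (Section 4)
    print(pitch_point(C=107.8, F=59.7, beta_deg=7.49))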
Fig. 3. Globoidal cam - oscillating follower arrangement
3. Modelling methods
Several methods can be used to model concave globoidal cams. The most important task in the modelling procedure is to create the working surfaces of the cam. Once these surfaces are created, the other surfaces of the cam can easily be formed later. Here, we introduce two groups of methods that can be used to create the cam surfaces, namely pitch surface-based methods and the standard cutter-based method.
3.1 Pitch surface-based methods
In a globoidal cam-follower system, when the follower rotates, the locus of the roller axis generates a ruled surface (pitch curved surface) in space (Tsay & Lin, 2006). The two axes of the two rollers in this case study generate two pitch curved surfaces. The working surfaces of the cam can be obtained from the pitch surfaces by offsetting them by a distance equal to the radius of the roller (a short vector sketch of this offset step is given below). There are several methods to get the pitch surface; the following are three of them.
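In vector form, the offset step amounts to moving each pitch-surface point along the surface normal. A minimal sketch (our notation, not from the original work; the sign of the unit normal must be chosen so that it points from the pitch surface towards the cam material):

    import numpy as np

    def working_surface_point(p, n, roller_radius):
        """Offset a pitch-surface point p along the unit surface normal n
        by the roller radius to obtain a working-surface point."""
        n = np.asarray(n, dtype=float)
        n = n / np.linalg.norm(n)               # normalize the normal vector
        return np.asarray(p, dtype=float) + roller_radius * n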
3.1.1 Graphs-based method (method 1)
Sweep a straight line with two constraints (Figure 4): (i) the angle between this line and the datum plane varies as the cam rotates, and its value is βij; (ii) the coordinates of the pitch point P on this line, in the cylindrical coordinate system, corresponding to the input angular displacement θj, satisfy the formulas below.
hij = F·sin(βij)   (3)

Rij = C – F·cos(βij)   (4)
where i = 1, 2, corresponding to the upper and lower pitch surfaces, and j = 1, 2, …, n, corresponding to the angular output displacements. The relationships between the pairs β and θ, h and θ, and R and θ can be expressed in graphs. These graphs are useful for modelling the cam in commercial CAD/CAM software.
Fig. 4. Principle of the graphs-based method
3.1.2 Open section-based method (method 2)
This method is similar to the previous method, but here the two pitch surfaces are created at the same time. These surfaces are made by sweeping an "open section" which consists of three straight lines: two of them are collinear with the axes of the two rollers, and the last one connects them together.
3.1.3 Curves-based method (method 3)
Sweep a straight line that is collinear with the roller axis. The two end points of this line must lie on two curves (Figure 6). One of these curves is a circle in the datum plane; this circle goes through the intersection point of the roller axes, its centre lies on the cam axis, and it is also called the origin trajectory. The other curve is a three-dimensional (3D) curve: the locus of a point located on the roller axis (it can be the pitch point) as the follower rotates. The coordinates of that point satisfy formulas (3) and (4) above.
Fig. 5. "Open section" method
Fig. 6. Principle of the curves-based method
3.2 Standard cutter-based method (method 4)
An end mill cutter can generate the surfaces of a globoidal cam. If the diameters of the cutter and the roller are equal, the motion of the cutter in the machining process will be similar to that of the roller; of course, the cutter must also rotate about its own axis (the roller axis). The swept surface of the tool path can then represent the working surface of the cam. The following is one way to get the cam surface: cut a blank by sweeping a rectangular section (Figure 7) to form the cam surfaces, subject to the following constraints: (1) The width of the section is equal to the diameter of the roller, and the length of the section satisfies the following formula:

L = t + e   (5)
where L is the length of the rectangular section, and t and e are the two geometrical parameters of the cam mentioned in Section 2.
(2) Two points on the section, namely the intersection points between the symmetry axis of the section and its edges, must follow two 3D curves. These curves are the loci of two points on the roller axis as the follower rotates; one of them is the origin trajectory. The curves used in the curves-based method can be applied here. (3) The section plane always contains the cam axis.
Fig. 7. Standard cutter-based method
The globoidal cam can be modelled using CAD/CAM software such as CATIA, Unigraphics NX or Pro/Engineer. The four methods above can be implemented in Pro/Engineer. In theory, geometric errors may exist on every model. These errors may be so large that the model cannot be accepted. Hence, after modelling, the models must be checked to find the best one among them. In this study, Pro/Engineer Wildfire 2.0 is used to create the globoidal cam. The implementation of these methods is presented in the form of an illustrated example in the next section. In addition, some other important tasks to check the models are included as well.
4. Application example
4.1 Input data and calculations
Given is a concave globoidal cam with an oscillating follower that has two cylindrical rollers. The angle between the two axes of the rollers is 60°. The increment of the input angle of the cam is 0.2°, starting from 0° and ending at 360°. The angular input and output displacements are given in a table consisting of 1800 pairs of corresponding angles; some of them are presented in Table 1 in the appendix. For easier observation, the relationship between the angular input and output displacements is shown in Figure 8. The following are some other parameters of the system, shown in Figure 2: d = 25.5 mm, l = 16 mm, C = 107.8 mm, t = 58.7 mm, φ0 = 7.49°, e = 2.3 mm. Some calculations must be done before making the models, as follows: (1) Calculating the angular outputs, including φ0. (2) Calculating the angles β1j and β2j.
β1j = φj + φ0   (6)

β2j = β1j – 60°   (7)
(3) Calculating the coordinates of the two pitch points for each pitch surface. The pitch points are located at the distance F = 59.7 mm on the roller axes. All the calculations are done in Microsoft Excel 2003.
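These spreadsheet calculations amount to a short pipeline: for each tabulated pair (θj, φj), compute β1j and β2j by equations (6) and (7), then the pitch-point coordinates by equations (1) and (2). A sketch of the same computation in Python (the data pairs below are taken from Table 1; everything else mirrors the equations):

    import math

    PHI0 = 7.49          # angle to start of follower motion, degrees
    F, C = 59.7, 107.8   # distances in mm

    def roller_axis_angles(phi_j):
        """Equations (6) and (7): angles of the upper and lower roller axes."""
        beta1 = phi_j + PHI0
        beta2 = beta1 - 60.0
        return beta1, beta2

    def pitch_coords(beta_deg):
        """Equations (1) and (2): cylindrical coordinates (R, h) in mm."""
        b = math.radians(beta_deg)
        return C - F * math.cos(b), F * math.sin(b)

    # two of the 1800 (input, output) pairs from Table 1
    for theta_j, phi_j in [(105.20, 0.00000680), (180.00, 45.0)]:
        b1, b2 = roller_axis_angles(phi_j)
        print(theta_j, pitch_coords(b1), pitch_coords(b2))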
Fig. 8. Angular input/output displacements
4.2 Modelling procedures and results
The following are the main steps to create the globoidal cam using the four methods described above in Pro/Engineer Wildfire 2.0. The accuracy of the system is set to 0.0005 to obtain highly accurate models.
The graphs-based method
(1) Create a revolution surface of the globoidal body of the cam (Figure 9a).
(2) Create three graphs: for the angle between the sweeping line and the datum plane (Figure 9b), for the height of the pitch point on the sweeping line, and for the distance (radius) from the pitch point to the cam axis. These graphs show the dependence of the three above parameters on the angular input displacement (rotation angle) of the cam.
(3) Create the upper pitch surface using the Variable Section Sweep command (Figure 9c). The origin trajectory in this case is the circle formed by the intersection of the datum plane and the cam body. The constraints for this command are relations of the form

sd# = evalgraph("graph_name", trajpar*360)   (8)
where sd# is the dimension which will vary; it can be R, h or β (Figure 3). graph_name is the name of the graph created in step 2, corresponding to the dimension sd#. Refer to (Tuong & Pokorny, 2008b) for more useful information about these graphs. Example relations are shown after the step list below.
(4) Repeat steps 2 and 3 for the lower pitch surface (Figure 9c), using three other graphs.
(5) Offset the two pitch surfaces to get the working surfaces of the cam (Figure 9d).
(6) Merge all the surfaces into one and convert the merged surface to a solid (Figure 9e).
(7) Perform some extra cuts to get the desired cam, model 1 (Figure 9f).
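As promised above, the three relations for step (3) might look as follows, written in the relation syntax of formula (8). The dimension symbols sd4-sd6 and the graph names are placeholders that depend on the actual sketch and graph features of the model, not values from the original study:

    /* angle between the sweeping line and the datum plane */
    sd4 = evalgraph("beta1_graph", trajpar*360)
    /* height h of the pitch point P above the datum plane */
    sd5 = evalgraph("h1_graph", trajpar*360)
    /* radius R from the cam axis to the pitch point */
    sd6 = evalgraph("r1_graph", trajpar*360)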
Fig. 9. Main steps of the graphs-based method
The open section-based method
(1) Create a revolution surface and three graphs, as in the first two steps of the first method.
(2) Create the two pitch surfaces at the same time using the Variable Section Sweep command (Figure 10a), with the same constraints as in step 3 of the first method.
(3) Perform steps 5, 6 and 7 as in the first method to get the solid model (Figure 10b) and the desired cam, model 2 (Figure 10c).
Fig. 10. Some main steps of the open section-based method
The curves-based method
(1) Create a revolution surface of the globoidal body of the cam (Figure 11a).
(2) Create the origin trajectory and two 3D curves (Figure 11b).
(3) Create the two pitch surfaces using the Variable Section Sweep command (Figure 11c).
(4) Perform steps 5, 6 and 7 as in the first method to get the offset surfaces (Figure 11d), the solid model (Figure 11e) and the desired cam, model 3 (Figure 11f).
Fig. 11. Main steps of the curves-based method
The standard cutter-based method
(1) Create a revolution blank of the cam body (Figure 12a).
(2) Create the origin trajectory and two 3D curves (Figure 12b).
(3) Perform two cuts with a rectangular section using the Variable Section Sweep command (Figure 12c).
(4) Perform some extra cuts to get the desired cam, model 4 (Figure 12d).
Fig. 12. Main steps of the standard cutter-based method
Among the four methods, the first two need more steps than the others, but their modelling time is much shorter than that of the last two. This is because, when creating the pitch surfaces in the methods which need 3D curves, the system has to interpolate the swept surface through 1800 physical points on each curve. To show the cam clearly, the pitch surfaces of the four final models in Figures 9 to 12 are hidden. In general, these models look similar. In order to choose the best one among them,
the interference between the cam and its rollers must be checked. If there is no interference and the clearance between the components is small enough, the result is acceptable.
4.3 Making animation and checking interference
In order to make an animation and verify the interference between the components of the globoidal cam mechanism, an assembly of the cam and the follower is made first. After that, the Mechanism Design module is used to define the geometrical relationships of the system, make it move and analyse its motion. Finally, a kinematic analysis is used to obtain information on interference between components. The procedure to make the animation and check interference is as follows:
(a) Creating the assembly model.
(b) Modifying the joint axis settings.
(c) Creating a slot-follower.
(d) Checking the assembly model.
(e) Creating a servo motor.
(f) Creating and running analyses.
(g) Viewing results and taking measurements.
When the globoidal cam rotates, the follower will stay or rotate depending on the location of the rollers on the cam surfaces. The follower will not move while the rollers contact the cam surfaces in the dwell periods. To get a motion for the follower, a point on one roller axis has to trace along a 3D curve on the pitch surface; the pitch point can be used for this purpose (point PNT0 in Figure 13). This 3D curve is available on model 3 and model 4, and it can be drawn on model 1 and model 2 for the purpose of checking interference. The servo motor is always applied on the cam axis so that the cam can rotate about its axis. A servo motor can also be applied to the follower; in this case, the slot-follower connection is omitted. Figure 13 shows the cam-follower system in the Mechanism application with the slot-follower connection.
Fig. 13. Cam-follower system in the Mechanism application.
During the animation, the motions of the cam and the follower are simulated, and if there is any interference between the two components, Pro/Engineer highlights the interfering positions. Figure 14 presents the result of the animation of model 1. In the assemblies of model 1 and model 3 with their followers, there is no interference between the cam and its rollers when the cam rotates one revolution, while interferences occur in the
assemblies of model 2 and model 4. There are 10 positions of interference in the assembly of the former, while there are many positions of interference for the latter. These interferences appear on both sides of the rib in the rise and return periods, where the curvature of the working surfaces changes strongly, and can be seen in the graphic window (Figure 15). In total, 1800 positions are checked for a full revolution of the cam. The angle between two positions (called frames in the Mechanism Design module) is 0.2°, equal to the increment of the input angle of the cam. Comparing model 2 and model 4, the latter has larger interference volumes. The result is that model 1 and model 3 can be accepted. Although no interference occurs between the cam surfaces of the acceptable models and their rollers, there may still be clearances between them. Thus, these clearances must be checked to ensure that they are small enough; if they are large, the errors of the output angular displacements will be large and the model may have to be eliminated. These clearances can be measured manually in assembly standard mode or in mechanism mode. The measurements show that in one revolution of the cam the biggest clearances occur in the rise and return periods, and they are less than 0.2 micrometre for both models (Tuong & Pokorny, 2008a). Obviously, these gaps also cause errors in the output angular displacements, but these errors are very small and can be accepted. In this study, the four models above were created on a notebook with a 1.86 GHz Core 2 Duo processor and 2 GB of RAM. It took only a few minutes to create the pitch surface, the most time-consuming task, for the model of the first method (graphs-based method). Meanwhile, it took much longer with the methods that use 3D curves (methods 3 and 4). A similar problem occurred when running the analyses for checking interference. Therefore, the first method is the best choice among the modelling methods.
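The order of magnitude of the resulting angular error can be estimated with a small-angle argument: if a clearance c is taken up along the arc at the pitch radius F, the follower can deviate by roughly Δφ ≈ c/F radians. A quick check with the measured values (this bound is our estimate, not a figure from the original measurement report):

    import math

    F = 59.7       # pitch radius, mm
    c = 0.0002     # largest measured clearance, mm (0.2 micrometre)

    dphi = math.degrees(c / F)          # small-angle approximation: arc = radius * angle
    print(f"output angle error ~ {dphi:.6f} degrees")   # about 0.0002 degrees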
Fig. 14. Animation playback of model 1.
Fig. 15. Interferences (in red) between components (model 4).
In our research, besides the example above, another example of modelling a concave globoidal cam with an indexing turret follower has also been carried out (Tuong & Pokorny, 2009). In that case study, the pitch surface-based modelling method was successfully used to create the cam. Some of the techniques introduced above can be used in that case. However, the modelling procedure is much more complicated than for the concave globoidal cam with a swinging roller follower. Figure 16 shows the animation playback of the concave globoidal cam with an indexing turret follower.
Fig. 16. Animation playback of the concave globoidal cam with indexing turret follower.
5. Conclusion
In this study, four modelling methods were developed to create the concave globoidal cam. These methods were implemented in Pro/Engineer® Wildfire 2.0 using the same input data for a concave globoidal cam with an oscillating follower. After verifying the four models corresponding to the four methods, it can be said that the model created with the graphs-based method is the best one, because it meets the required accuracy and the modelling time is short. This method is easy to conduct in Pro/Engineer, which is one of the
most powerful CAD/CAM packages in industry. This also gives the method great potential for industrial application. The case study presented here is a concrete example, but the modelling methods and techniques can be applied to other spatial cams with cylindrical rollers when the angular input/output displacements are known. The results of this study are very useful for the modelling and manufacturing of globoidal cams.
6. Acknowledgement
The work upon which this paper is based is part of the MSM-4674788501 research program, a grant from the Technical University of Liberec, Czech Republic.

Appendix

Input     Output         Input     Output
0.00      0.00000000     179.60    44.99616011
0.20      0.00000000     179.80    44.99904001
0.40      0.00000000     180.00    45.00000000
0.60      0.00000000     180.20    44.99904001
0.80      0.00000000     180.40    44.99616011
1.00      0.00000000     …         …
…         …              213.00    23.59739548
104.60    0.00000000     213.20    23.39136475
104.80    0.00000000     213.40    23.18530279
105.00    0.00000000     213.60    22.97922791
105.20    0.00000680     213.80    22.77315844
105.40    0.00005418     214.00    22.56711268
105.60    0.00018212     214.20    22.36110896
105.80    0.00042995     214.40    22.15516558
106.00    0.00083638     214.60    21.94930085
106.20    0.00143945     …         …
106.40    0.00227659     253.60    0.00227659
106.60    0.00338459     253.80    0.00143945
106.80    0.00479962     254.00    0.00083638
…         …              254.20    0.00042995
146.40    22.97922791    254.40    0.00018212
146.60    23.18530279    254.60    0.00005418
146.80    23.39136475    254.80    0.00000680
147.00    23.59739548    255.00    0.00000000
147.20    23.80337666    …         …
147.40    24.00928999    359.00    0.00000000
147.60    24.21511717    359.20    0.00000000
147.80    24.42083990    359.40    0.00000000
148.00    24.62643991    359.60    0.00000000
148.20    24.83189893    359.80    0.00000000
…         …              360.00    0.00000000
Table 1. Example of some selected angular input/output displacements, unit: degree
7. References
Chen, S. L. & Hong, S. F. (2008). Surface generation and fabrication of roller gear cam with spherical roller. Journal of Advanced Mechanical Design, Systems, and Manufacturing, Vol. 2, No. 3, pp. 290-302, ISSN 1881-3054
Cheng, H. Y. (2002). Optimum tolerances synthesis for globoidal cam mechanism. JSME International Journal, Series C, Vol. 45, No. 2, pp. 519-526, ISSN 1344-7653
En-hui, Y.; Chun, Z. & Ji-xian, D. (2001). Creating surface of globoidal indexing cam profile on computer. Journal of Northwest University of Light Industry, Vol. 19, No. 1, pp. 41-43, ISSN 1000-5811
Koloc, Z. & Vaclavik, M. (1993). Cam Mechanisms, Elsevier / Publishers of Technical Literature, Prague, ISBN 0-444-98664-2
Lee, R. S. & Lee, J. N. (2001). A new tool-path generation method using a cylindrical end mill for 5-axis machining of a spatial cam with a conical meshing element. The International Journal of Advanced Manufacturing Technology, Vol. 18, No. 9, pp. 615-623, ISSN 1433-3015
Lee, R. S. & Lee, J. N. (2007). Interference-free toolpath generation using enveloping element for five-axis machining of spatial cam. Journal of Materials Processing Technology, Vol. 187-188, pp. 10-13, ISSN 0924-0136
Reeve, J. (1995). Cams for Industry, Mechanical Engineering Publications, ISBN 0-85298-960-1, London
Rothbart, H. A. (Ed.) (2004). Cam Design Handbook, McGraw-Hill, ISBN 0-07-143328-7
Tsay, D. M. & Lin, S. Y. (2006). Generation of globoidal cam surfaces with conical roller-followers. Proceedings of the ASME 2006 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, DETC2006-99683, Philadelphia, USA, September 2006, ASME, New York
Tuong, N. V. & Pokorny, P. (2008a). Modelling concave globoidal cam with swinging roller follower: a case study. Proceedings of World Academy of Science, Engineering and Technology, Vol. 32, pp. 180-186, ISSN 2070-3740, Singapore
Tuong, N. V. & Pokorny, P. (2008b). CAD techniques on modelling of globoidal cam. Proceedings of the 2nd International Conference Manufacturing Systems - Today and Tomorrow, ISBN 978-80-7372-416-0, TUL, Liberec, Czech Republic
Tuong, N. V. & Pokorny, P. (2009). Modelling concave globoidal cam with indexing turret follower: a case study. International Journal of Computer Integrated Manufacturing, ISSN 0951-192X (accepted)
Yan, H. S. & Chen, H. H. (1994). Geometry design and machining of roller gear cams with cylindrical rollers. Mechanism and Machine Theory, Vol. 29, No. 6, pp. 803-812, ISSN 0094-114X
Yan, H. S. & Chen, H. H. (1995). Geometry design of roller gear cams with hyperboloid rollers. Mathematical and Computer Modelling, Vol. 22, No. 8, pp. 107-117, ISSN 0895-7177
Yan-ming, F. U. (2000). Analysis and design of the globoidal indexing cam mechanism. Journal of Shanghai University (English Edition), Vol. 4, No. 1, pp. 54-59, ISSN 1007-6147
29
Enhancing Productivity through Integrated Intelligent Methodology in Automated Production Environment
Algebra Veronica Vargas A., Sarfraz Ul Haque Minhas, Yuliya Lebedynska and Ulrich Berger
Brandenburg University of Technology Cottbus, Germany
1. Introduction
The highly competitive globalized market of today has compelled manufacturers not only to enhance the quality of their products but also to reduce production costs. The reduction of production costs (minimization of the use of resources), together with the best technological exploitation, has become their biggest challenge in enhancing productivity. This has given rise to a new setup wherein increased customization, product proliferation, heterogeneous markets, shorter product life cycles and development times, responsiveness, and other factors are increasingly taking centre stage (Bollinger, 1998). Until now, the classical meaning of productivity in the production environment has been the minimization or efficient utilization of resources, but global competition has drastically changed the meaning of this term. Manufacturers are now striving hard to focus their attention on the best technological exploitation of resources to achieve effectiveness. Effectiveness can be defined as the way in which the industry meets the dynamic needs and expectations of customers. This exploitation leads to systematic planning, quick and robust decision making, precise process modelling and, above all, intelligent process monitoring and production control through precise and real-time information exchange among the different agents of the production system. However, resources and technologies cannot be fully exploited due to the complexity of the production system and its diversified and highly dynamic processes. This in turn also limits the systematic collection and characterization of knowledge, and its effective and efficient exploitation for precise system and process adaptation as well as for robust decision making. Productivity now depends on the value and range of the products and services as well as the efficiency with which they are produced and delivered. This is the main aspect of the holistic concept for enhancing productivity. The globalization of the manufacturing industry and the intense competition to lead or survive in the market require a much broader conception of productivity enhancement methodologies, employed separately or together, internally and externally to the industry. A comprehensive picture of these productivity enhancement methodologies in the different
sub-domains of the whole production system is illustrated in detail in the form of an integrated intelligent methodology. This chapter provides an integrated intelligent methodology for developing a mechanism among different agents of the production system to enhance productivity indirectly. In order to practically demonstrate the mechanism for enhancing productivity, two different application examples from the complete production domain are selected. This domain covers the manufacturing and assembly processes. In the manufacturing domain, the influence of better structuring of knowledge and effective knowledge exchange on the optimization of a process chain to produce compressor parts is practically investigated. In the joining area, widespread and systematic knowledge sharing among different agents of the production plant is formulated and implemented to influence productivity through a concept of intelligent production control for joining car body-in-white parts. The effective knowledge sharing with the process model, the automatic updating of the influential process parameters for enhancing precision in the process model, and the achievement of better knowledge guidance for the assembly floor staff are described with the application example of adhesive bonding of car body parts. The investigations and results from these two distinct examples are merged to generate the integrated methodology, based on an intelligent approach, for productivity enhancement. For structuring, characterizing and sharing the relevant engineering data generated during manufacturing and assembly processes, a technology data catalogue (TDC) is developed. It is used for updating or influencing the governing parameters for effective control in the production system. In the manufacturing domain, an innovative methodology for process chain optimization is described. In the assembly domain, the role of intelligence through knowledge characterization, precise knowledge acquisition for intelligently updating the process models, and its use in automated production setup configuration, on-process parameter adaptation and real-time corrections for quality assurance is comprehensively described.
2. Situational Analysis and Problem Identification
Currently, diversified productivity enhancement methodologies are being practised in industry. In this regard, two case studies have been carried out to investigate the productivity improvement potential in the production domain; this domain covers the manufacturing of aero engine components and automotive assembly. The actual state of aero engines is the result of considerable progress in materials, manufacturing and surface technologies, supporting and completing the improvements achieved in design, aerodynamics and thermodynamics (Steffens and Wilhelm, 2008). One example is the introduction of titanium alloys in the early 1960s, which enabled the design of large fan blades and of fast-rotating high pressure compressor rotors. This means that maturity in design has reached a point where manufacturers are focussing more on productivity in new engine design and manufacturing. This includes the minimization of costs by several means, the most significant of which are reducing the number of components, exploiting new materials in the design, and fast manufacturing. It is noteworthy that high speed milling is the most effective way to produce optimum surface finish and geometric accuracy at high metal removal rates. Highly complex components, e.g. compressor bladed disks (blisks), are milled on 5- or 6-axis machines. Productivity in their manufacturing is achieved by the effective
interaction of an advanced CNC control, optimum tools and clamping devices, and an effective coolant-lubricant system. An optimized milling strategy is needed to minimize machining times, optimize aerodynamically defined surfaces and reduce machining costs (Steffens and Wilhelm, 2008). Blisks can be regarded as combinations of aerodynamically defined surfaces. According to Bußmann, Kraus and Bayer (2005), blisks (bladed integrated disks) or IBRs (integrally bladed rotors) are among the most innovative and challenging components in modern gas turbine engines. Among the advantages of blisks are a weight reduction of 20% and improved efficiency compared with the conventional blade-assembled disk. The weight saving is a result of lower rim loads, the elimination of blade roots and disk lugs, and the compression of more performance into the same design and weight envelope. The weight reduction and the consequent material economization are a step towards productivity enhancement. The productivity issue has become very critical in blisk manufacturing due to the significant ramp-up of their forecasted demand within the next twelve years in Europe as well as in the USA, for civil and military purposes (see Figure 1). The increasing blisk market is pushing the continuous improvement of manufacturing processes, the development of new manufacturing strategies and, furthermore, the development of process chains (Bußmann & Bayer, 2008).
Fig. 1. Blisk market forecast
The process chain development of a complex geometry is defined by its design features. The engineering activities in design are product design, engineering analysis and design documentation, while the engineering activities in manufacturing are process planning, NC part programming, and several other activities. For many years, CAD/CAM has been used as a productive tool for the engineering activities in design and manufacturing. In this context, CAD/CAM is meant not only to automate certain activities of design and manufacturing, but also to automate the transition from design to manufacturing (Nasr & Kamrani, 2007). As part of the process plan, the NC part program is generated automatically by CAD/CAM; the automatic program generation takes place only after the manual introduction of a great number of parameters, such as the tool, the type of machining, the machining strategies, the process parameters, etc. Most process parameters, such as feeds and speeds, are initially defined based on information generated by the tool supplier and on process constraints, such as the maximum spindle speed provided by the machine supplier. To achieve a better machining program, these parameters are often modified directly on the machine at the workshop. The changes are based on the experience of the machine tool operator, who over the years has become an
expert on the machine tool and the machining process. Although valuable information about the final machining process is stored on the machine, in the end this information is not retrieved in an intelligent way to be used in future machining operations. The biggest disadvantage appears when the operator leaves the company: valuable implicit knowledge is gone, at a great loss to the company. Furthermore, in order to create more accurate NC programs and avoid changes at the workshop as far as possible, the CAM operator should have on-line knowledge about the manufactured products of the company, and well-documented information on successfully machined parts (López de Lacalle Marcaide et al., 2004). According to Sharif Ullah and Harib (2006), manufacturing involves knowledge-intensive activities, where the knowledge is often extracted from experimental observations. Like the manufacturing of aero engine components, the automotive assembly system requires data and knowledge for its precise control and on-process quality assurance to further enhance productivity. Four main factors are considered in productivity enhancement: time, cost, quality and flexibility in production systems and in their infrastructure; the most significant enabler of these is production monitoring and control. In the past the main emphasis was placed on material and labour productivity, which then shifted towards resource efficiency through many planning strategies, such as material resource planning, alongside labour productivity enhancement methodologies. In the assembly domain, productivity has so far been improved by many distinct means, particularly virtual assembly techniques that enable fast ramp-up and fast programming of the handling and processing equipment. This has given automotive manufacturers, in particular, a competitive edge in checking the assembly process virtually prior to the actual assembly process. This is one of the areas where the automotive industry is investing a lot of money to enhance productivity in terms of production ramp-up time, production costs and flexibility in the production setups. This in turn has also expanded the role of the digital factory in automotive assembly, where the processing, testing and fabrication of assembly lines, as well as their visualization, are carried out using interactive digital factory tools (Schenk et al., 2005). In this context, virtual reality and assembly simulation are combined for better production planning and worker qualification to enhance productivity in the industry. In normal production, especially in assembly, unexpected errors which are unforeseen by human experts can be identified prior to the start of operation on the line. Because many working parameters, such as dimensional variations of products, fixtures, sensor detection capabilities and robot repeatability, are closely coupled with the 3D environment, it is difficult to anticipate all of the error conditions with their likelihood of occurrence and 3D state in the workspace (Baydar and Saitou, 2001). Through virtual assembly, these errors and deviations can be foreseen and eliminated well before the actual assembly or joining takes place. This in fact provides the manufacturer with a large potential for enhancing productivity.
Another technique employed in the automotive industry is automatic identification of parts using RFID, which fits well even in fast-paced, complex automotive assembly environments (Gary and Warren, 2008), compared to the conventional low-cost bar code techniques. The bar code has contributed significantly to productivity for quite a long time, but the increasing number of product variants and the relevant processing equipment, together with advances in database technology, have improved the amount, quality, and timeliness of data in the
industry (Schuster et al., 2004) in logistics, supply chain management, quality assurance, etc. The innovative technologies of today, such as Auto-ID and the Electronic Product Code (EPC) with clusters of interactive sensor networks, have created larger data streams of greater complexity. It is estimated that the amount of data generated each year is increasing by as much as 40% to 60% for many organizations (Park, 2004). The above two examples from the assembly domain show that the manufacturing industry, especially the automotive sector, is striving hard for the best technological exploitation and resource utilization for enhancing productivity. These are a few of the more recent technologies that have been employed by today's automotive manufacturers to improve productivity in fragmented form. State-of-the-art techniques are being used by almost all automotive manufacturers for productivity enhancement, but competition keeps increasing and manufacturers in the automotive industry are striving to be highly responsive to differentiated customer demands. Being highly responsive means that they must also be highly flexible as well as internally productive, so as to meet individualized demands in an economical way. The area which is not fully explored, and where there is still significant untapped potential for productivity enhancement, is the intelligent control of production setups. Intelligent control encompasses the intelligent selection of resources and parameter settings, as well as the exchange of real-time data between the different actuator-sensor units and the handling systems that play an active role in automotive assembly operations. This enables effective real-time control. To ensure real-time control in an effective way, a smooth flow of comprehensive and precise information throughout the entire process chain is the main key. It offers a solid base for concrete real-time decision making as well as online optimization of resources and processes. Up-to-date information enables analysis of the current status of the production system at any given time. This includes the resources in operation, the resources in idle status, the material flow and quality assurance for correct planning and control.
3. Methodology
The proposed integrated intelligent methodology (see Figure 2) aims to optimize the process chain design in the manufacturing domain and the assembly process control model in the assembly domain through a common knowledge base, the technology data catalogue. Further, the methodology foresees the updating of the process model with new process chain parameters and the selection of the machine setup, tools, fixtures and other resources. The intelligent methodology is devised on the following characteristics:
• A highly knowledge-driven platform, the technology data catalogue, containing systematic production parameters and functional correlations and coherencies
• Intelligent modelling methods dealing with multi-variant parameter correlations and production control parameters, considering their interdependencies
• A highly flexible and applicable control methodology taking in all technical specifications and functionalities, capable of fast and smooth ramp-up of new technologies and processes, and ensuring direct and effective on-process quality assurance
Fig. 2. Schematics of the integrated intelligent methodology
This integrated methodology was practically implemented and demonstrated in two European research projects, EU FP6 HYMOULD (http://www.hymould.eu/) and EU FP6 FUTURA (http://www.futura-ip.eu/). The process chain developer is elaborated further in case study 1, while the functionalities of the setup configurator and the assembly process modeller are discussed in case study 2 in the following sections.
3.1 Main component: Technology Data Catalogue
Intensive, precise and effective information exchange plays a vital role in the successful running of the processes and activities in the production system. This information can be exchanged formally, within the boundaries of defined mechanisms such as structured methods and formal processes, or informally; and both horizontally, e.g. cross-functionally, and vertically within the organization (Perks, 2000; Van der Bij et al., 2003; Calabrese, 1999). Management can encourage knowledge sharing by implementing formal procedures for guiding information flows, and there are mechanisms which can originate such a process (Berends et al., 2006). Nevertheless, sharing knowledge among the members of a big organization may be a complex activity; and as long as knowledge is not shared, it cannot be exploited by the organization (Choo, 1996).
To make this complex activity simpler and enable better exploitation of the knowledge for precise process planning, optimal parameter settings, automatic control program selection and intelligent selection of resources, the design and development of a technology data catalogue (TDC) is proposed as the main component of the integrated methodology. The main aims of the TDC are collecting, retrieving, structuring, processing and sharing relevant engineering data/information in an intelligent way. The TDC will also provide structured information about the "best-practice" settings of the manufacturing system. Data sources have to be defined and a suitable knowledge representation structure has to be created in order to store the relevant implicit and explicit knowledge generated at the different levels of companies (Minhas et al., 2008). The TDC constitutes the following:
1) A database with shared terms (concepts, instances, attributes, etc.). The relationships between terms (parameters) will be assured through the axiomatic design theory.
2) Translators, which will match the conceptual terms coming from different sources, ensuring the coherence of the exchange of data between the systems and the TDC. In the literature, translators have already been proposed and successfully tested, e.g. in Goossenaerts and Pelletier (2001).
3) Filters, which will enable sorting out the information that best matches the requested technology (Lepratti, 2005; Basson et al., 2004).
4) A characterizer (Berger et al., 2008): the required information is selected from different databases and sources, and characterized based on its maturity. Highly mature knowledge is stored in the TDC. The characterization based on maturity considers the standardization, number of synonyms, visualization, source, etc. (see Figure 3) of the terms and topics.
Fig. 3. Characterization criteria and maturity degree scale
For instance, if a technical term is considered standard, it gets 3 points on the maturity degree scale. If some synonyms for this standardized technical term are available, the information quality is higher. If the information originated at a recent date, it gets more points, e.g. 4 points for 2006-2007, 3 points for 2003-2005, 2 points for 2000-2002, and 1 point for 1999 and earlier. Sources can come from theory and practice; if theory is proved through practice, the concept meets the criteria to get 4 points. If the knowledge carries visualization and a body of publications, it will get more points compared to knowledge that originated merely from theory. (A small scoring sketch is given at the end of this section.)
3.2 Other Components
The process chain developer is based on the axiomatic design methodology. Axiomatic design presents many advantages compared with other methodologies such as TRIZ and the Taguchi method (Hu et al., 2000). It provides good structuring and quantitative modelling for coupled, uncoupled and decoupled designs. Moreover, through axiomatic design the process chain can be modelled mathematically, which assures precision, and the iterative trial-and-error process can be minimized, which saves a significant amount of time, thus yielding higher productivity. The process chain developer communicates with the technology data catalogue through the knowledge characterizer component to obtain parameters for the machine setup, such as tools, fixtures, coolants and lubricants. Furthermore, the relevant process parameters exchanged between the process chain developer and the knowledge characterizer are the feed rates and cutting speeds for specific features and milling operations; these process parameters are extracted from successful past cases or projects and stored in the technology data catalogue after being classified by the knowledge characterizer. Precise information exchange is enabled and, consequently, the time for process chain development and the setup time are minimized. The main role of the assembly process modeller is to extract the assembly process parameters and their values from the knowledge characterizer and then generate assembly process control programs. This strategy is adopted to ensure high precision in the control programs, which in turn eliminates the risk of reworking or of using wrong programs, with the consequent malfunctioning of the assembly devices as well as low product quality. The same concept can be mapped to the manufacturing cell, i.e. the machining cell, where machine programs will be generated precisely through automatic guidance from the knowledge characterizer. As a result, the time for programming and updating will be reduced to a significant degree, and likewise the implied costs. Precise programs will also influence the process quality outcomes to a great extent, and part/product rejection rates will eventually be lowered. The assembly process modeller will provide the input to the setup configurator, and the control program will be loaded directly to the assembly cell. The setup configurator, after analysing the customer demands, configures the assembly cell and sets the parameters corresponding to the assembly tasks and their control functions performed in the cell. This enables the configuration/reconfiguration of the cell by software means, which enables flexibility. Easy configuration of the cell will enhance productivity through setup time reduction when assembling new components/parts: a step towards fast ramp-up.
The most salient feature of this integrated methodology is that it simultaneously addresses time, cost and quality, together with a flexible way of adapting production setups and utilizing resources.
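As promised in Section 3.1, the maturity characterization can be illustrated with a small scoring sketch. The record structure and the exact weights below are our reading of Figure 3 and the accompanying text, not a published specification:

    from dataclasses import dataclass, field

    @dataclass
    class TdcEntry:
        """One technology data catalogue record (illustrative structure)."""
        term: str
        is_standard: bool = False
        synonyms: list = field(default_factory=list)
        year: int = 0
        proved_in_practice: bool = False
        has_visualization: bool = False

    def maturity_score(entry: TdcEntry) -> int:
        score = 0
        if entry.is_standard:
            score += 3                 # standardized technical term
        if entry.synonyms:
            score += 1                 # synonyms raise information quality
        if entry.year >= 2006:         # recency scale as described in the text
            score += 4
        elif entry.year >= 2003:
            score += 3
        elif entry.year >= 2000:
            score += 2
        elif entry.year > 0:
            score += 1
        if entry.proved_in_practice:
            score += 4                 # theory confirmed through practice
        if entry.has_visualization:
            score += 1                 # visualized knowledge scores higher
        return score

Only entries whose score exceeds some threshold would be stored in the TDC as highly mature knowledge; the threshold itself is not specified in the text.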
4. Case Study 1: Enhancing Productivity in the Manufacturing Domain
The state of the art for productivity enhancement in the manufacturing industry is partly described in Section 2. It was deduced that there is still a need for new manufacturing strategies as well as for optimal process chain development. A formal methodology is needed for extracting and then exploiting the useful knowledge from past cases or projects. The case study described here addresses these issues by taking the example of the manufacturing process of bladed integrated disks (blisks). This case study presents a process chain developer able to design an optimized process chain for the production of blisks, which will contribute to the enhancement of productivity in the manufacturing domain. The process chain developer is based on the axiomatic design methodology. While the axiomatic design methodology provides structure to the process chain, the technology data catalogue, the main component of the previously described integrated methodology, provides all the data needed for the integrated process chain (see Figure 4). The input data for the process chain developer come from the customer needs, i.e. the requested part or design, generally generated as 3D geometry in CAD, and its respective material. The output is the optimized process chain with all process parameters for the manufacturing cell.
Fig. 4. Interaction between the process chain developer and the TDC
4.1 Production of Blisk Airfoils
Bußmann et al. (2005) affirmed that the optimum blisk manufacturing process, from the technical and commercial points of view, depends on material, geometric and aerodynamic parameters; furthermore, they proposed a toolbox approach that may provide the optimum technology, or combination of technologies, to satisfy current and anticipated requirements. Since the disk body involves conventional cutting, surface compacting and finishing, processes for which a sufficient amount of experience has been gained through the production of the conventional blade-assembled disk, the toolbox approach is applicable specifically to airfoiling. To produce blisk airfoils, three manufacturing processes are commonly used, depending on the airfoil size and the resistance of the material to be machined (Bußmann et al., 2005):
• Milling the entire airfoil from the solid; the gas duct area between the airfoils is also milled. This process is applicable for medium-diameter blisks with medium blade counts, and for titanium blisks in the low pressure compressor (LPC) and intermediate pressure compressor (IPC) sections.
• Joining blade and disk together by linear friction welding (LFW) or inductive high frequency pressure welding (IHFP) and subsequently using adaptive milling to remove the expelled material; the gas duct area between the airfoils is also milled. This process is applicable for large-diameter blisks, hollow-blade blisks and blisks with large chamber volumes, where the process is suitable for saving raw material costs, and for blisks with few blades, primarily titanium blisks in the low pressure compressor (LPC) section.
• Removing material through electrochemical machining (ECM) or precise electrochemical machining (PECM). This process is applicable for medium- to small-diameter blisks with high numbers of blades, and for the hotter sections of the high pressure compressor (HPC), made of nickel alloys and nickel powder metallurgical or sintered materials.
4.2 Axiomatic System Design Framework
Axiomatic design is a methodology created by N. P. Suh (Suh, 1990) that endows designers with a scientific basis for the design of engineering systems. Additionally, axiomatic design enhances creativity, minimizes the iterative trial-and-error process, expresses the process design mathematically and, moreover, determines the best design. Suh defined design as an activity that "involves interplay between what we want to achieve and how we choose to satisfy the need (the what)", and four domains that delineate four different design activities (Suh, 2001): the customer domain, the functional domain, the physical domain, and the process domain (see Figure 5). The customer domain is characterized by the attributes or needs that the customer seeks in a product, process or system; in the functional domain the needs are defined in terms of functional requirements (FRs) and constraints (Cs); in the physical domain the design parameters (DPs) that satisfy the specified FRs are described; finally, in the process domain, manufacturing process variables (PVs) are characterized and a process based on the PVs that can produce the DPs is developed. Constraints (Cs) provide the limits on the acceptable design. The difference between Cs and FRs is that Cs do not have to be independent, as the FRs do.
Fig. 5. Axiomatic design domains
The axiomatic design starts with the identification and definition of the customer attributes or needs, and their translation into functional requirements; this involves a mapping process from the customer domain to the functional domain. Then a mapping process between the functional domain and the physical domain follows to satisfy the customer needs; this process is also called the zigzagging method. This method allows hierarchies of FRs, DPs and PVs to be created in each domain. During the mapping process, many possible DPs can be found; the key DPs are selected for the design according to two design axioms. The mapping process can be expressed mathematically in terms of vectors; that is, a vector of FRs can be related to a vector of DPs according to the following equation:
{FR} = [A] {DP}   (1)
where [A] is the design matrix that indicates the relation between the DPs and the FRs. The elements of the matrix are represented by "0" if there is no effect and by "X" if there is an effect, and are later substituted by other values. Moreover, when all Aij are equal to zero except those where i = j, the design matrix is diagonal and the design is called an uncoupled design, where each of the FRs can be satisfied independently by means of one DP. When the upper triangular elements are equal to zero, the design matrix is lower triangular and the design is called a decoupled design, where the independence of the FRs can be assured only if the DPs are determined in the right sequence. In any other case, the design matrix is a full matrix and the design is called coupled, which is the most undesired design.

FR1     X 0 0   DP1
FR2  =  0 X 0   DP2      (2)
FR3     0 0 X   DP3

FR1     X 0 0   DP1
FR2  =  X X 0   DP2      (3)
FR3     X X X   DP3
FR1     X 0 X   DP1
FR2  =  X X 0   DP2      (4)
FR3     X X X   DP3
(2) Diagonal matrix/uncoupled design, (3) triangular matrix/decoupled design, and (4) full matrix/coupled design.
4.3 Process Chain Developer
Initially, the process chain developer must identify the customer needs or attributes (CAs) and then translate them into functional requirements, which must be fulfilled by design parameters. As described earlier, blisks are not completely accepted by the customers because their manufacturing costs are still higher than those of the blade-disk joints (Steffens and Wilhelm, 2008). Thus, the main customer attribute at this point is defined as the minimization of blisk costs. According to this CA, the first-level functional requirement (FR) and the respective design parameter (DP) are decomposed as follows:
FR1 Reduce blisk manufacturing costs
DP1 Manufacturing process within target costs
Further, the blisk manufacturing costs are split into three main categories (Bußmann et al., 2005): the material costs, the airfoiling process costs, and other manufacturing and quality assurance costs. Thus, the next decomposition is as follows:
FR11 Minimize quality assurance costs
DP11 Steady process to target design specifications
FR12 Minimize airfoiling process costs
DP12 Airfoiling processes optimization
FR13 Minimize material costs
DP13 Optimum material utilization
The design equation representing the interaction between the FRs and DPs is as follows:

FR11     X 0 0   DP11
FR12  =  X X 0   DP12      (5)
FR13     X X X   DP13
where [A] is a triangular matrix, thus a decoupled design. For the further decomposition of the functional requirements and design parameters, the material and process parameters which may influence the cost and delivery time of blisks are analysed. These parameters are summarized in Table 1 (Esslinger and Helm, 2003). The material costs are directly correlated to the blisk costs, as indicated in Table 1; although they are external costs that can be minimized only by the material supplier, they are partially considered in the development of the process chain, since a better utilization of resources can bring some reduction of costs. The other material parameter, the material data quality, is ensured by the knowledge characterizer of the technology data catalogue.
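Each design matrix in this section is classified in the way defined in Section 4.2, and that check can be mechanized. A minimal sketch (an illustrative helper of our own, not part of the process chain developer), with the matrix given as nested lists of 0/1 flags:

    def classify_design_matrix(A):
        """Classify a design matrix as uncoupled, decoupled or coupled."""
        n = len(A)
        off_diag_zero = all(A[i][j] == 0
                            for i in range(n) for j in range(n) if i != j)
        upper_zero = all(A[i][j] == 0
                         for i in range(n) for j in range(i + 1, n))
        if off_diag_zero:
            return "uncoupled"       # diagonal matrix, cf. eq. (2)
        if upper_zero:
            return "decoupled"       # lower triangular matrix, cf. eq. (3)
        return "coupled"             # full matrix, cf. eq. (4)

    print(classify_design_matrix([[1, 0, 0], [1, 1, 0], [1, 1, 1]]))  # decoupled, as in eq. (5)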
Parameter                                 Cost and time of delivery
Material parameters
  Material costs                          Relevant
  Material data quality                   Relevant
Process parameters
  Process stability                       Relevant
  Number of steps and their duration      Relevant
  Availability of process simulation      Relevant

Table 1. Correlation between material and process parameters and customers' demands

Concerning the process parameters, the process stability is correlated in general to the cost and delivery time of blisks and, in particular, to the quality assurance costs, since a mature process results in fewer quality discrepancies. The last process parameter, the availability of process simulation, is guaranteed by the use of CAD/CAM tools, which is considered a constraint in this process chain development. The decomposition of FR11/DP11 (minimize quality assurance costs / steady process to target design specifications) is defined as follows:

FR111 Minimize process deviations
DP111 No process adjustments
FR112 Deliver product on time
DP112 Throughput time
FR113 Meet design specifications
DP113 Target surface roughness

|FR111|   |X 0 0| |DP111|
|FR112| = |X X 0| |DP112|        (6)
|FR113|   |X 0 X| |DP113|
where [A] is a triangular matrix, thus a decoupled design. As illustrated in Table 1, the number of steps and their duration are correlated to the blisk costs; an optimized manufacturing process must therefore be designed. Thus the decomposition of FR12/DP12 (minimize airfoiling process costs / airfoiling processes optimization) is as follows:

FR121 Optimize milling process
DP121 Optimized process chain design
FR122 Optimize joining process
DP122 Optimized joining approach
FR123 Optimize ECM/PECM process
DP123 Optimized ECM/PECM approach

|FR121|   |X 0 0| |DP121|
|FR122| = |0 X 0| |DP122|        (7)
|FR123|   |0 0 X| |DP123|
where [A] is a diagonal matrix, thus an uncoupled design. As pointed out in the previous section, there are three main airfoiling processes for blisk manufacturing: milling, joining, and electrochemical machining (ECM) / precise electrochemical machining (PECM); this case study focuses on the design of an integrated process chain for milling. Because the milling process gives better results for medium-size blisks of titanium alloys (Bußmann et al., 2005), the first constraint is defined as follows:

C1: medium-size blisk made of titanium alloys

Before the milling process can be carried out, a design in CAD is required. Therefore, a second constraint is also defined:

C2: 3D-CAD geometry

The further decomposition of FR13/DP13 (minimize material costs / optimum material utilization) is as follows:

FR131 Increase material data quality
DP131 Precise material data from the knowledge characterizer
FR132 Reduce wasted material during machining
DP132 Minimum number of damaged workpieces/prototypes

|FR131|   |X 0| |DP131|
|FR132| = |X X| |DP132|        (8)
where [A] is a triangular matrix, thus a decoupled design. Here, the characteristics of titanium alloys that influence the machinability of the alloy (Janssen, 2003) are taken into consideration for the process chain design; e.g. the low heat conductivity causes thermal load on the tool cutting edge, while the chemical affinity at high temperatures produces welding between the chip and the tool. These material characteristics are relevant for increasing the quality of the material data, which will be stored in the technology data catalogue (TDC) after its categorization by the knowledge characterizer. The intelligent gathering of precise material data (DP131) facilitates the design of the process chain and saves setup time. A milling approach that integrates strategy, tools and machines makes possible a production time reduction of about 50% (Bußmann et al., 2005); thus FR121/DP121 (optimize milling process / optimized process chain design) is decomposed as follows:

FR1211 Define the milling strategy
DP1211 Feature-based design
FR1212 Determine machine and cutting tool
DP1212 Machine and cutting tool selection from the TDC
FR1213 Generate process parameters
DP1213 Feeds and speeds selection from the TDC

|FR1211|   |X 0 0| |DP1211|
|FR1212| = |X X 0| |DP1212|        (9)
|FR1213|   |X X X| |DP1213|
where [A] is a triangular matrix, thus a decoupled design. One of the advantages of feature-based design is that the features can be associated with machining processes, which are further related to process resources (machines, tools, fixtures and auxiliary materials), process kinematics (tool access direction), process constraints (interference and spindle power), process parameters (feeds and speeds) and other information, such as time and costs. This enables the creation of what in the literature is called a feature-based manufacturing knowledge repository (Nasr and Kamrani, 2007), which in this chapter was defined as the technology data catalogue and further extended with a knowledge characterizer to assure the precision of the data. The optimized process chain is finally developed taking all relevant process parameters: feed rate and cutting speed for the specific feature and milling operation; these process parameters, stored from successfully used past cases or projects, are retrieved from the technology data catalogue (TDC) through the knowledge characterizer, as sketched below. The structuring of knowledge and the precise knowledge exchange enable effective optimization of a process chain to produce bladed integrated disks (blisks). Developing the optimal chain with high precision eliminates redesigning and trial and error in the process chain, which eventually minimizes time and cost. Consequently, a more productive process chain development is achieved.
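As an illustration of this retrieval step, the following minimal sketch (ours, not the authors') models the TDC as a lookup table of validated past cases; the feature names, material and parameter values are all hypothetical.

```python
# Hypothetical technology data catalogue (TDC): validated past machining
# cases indexed by (feature, material); every entry below is illustrative.
TDC = {
    ("blisk_airfoil", "Ti-6Al-4V"): {
        "machine": "5-axis milling centre",
        "tool": "8 mm solid carbide ball-end mill",
        "cutting_speed_m_min": 50.0,
        "feed_mm_per_tooth": 0.08,
    },
}

def retrieve_parameters(feature, material):
    """Mimics the knowledge characterizer: returns the stored process
    parameters for a feature, or None when no mature case exists."""
    return TDC.get((feature, material))

params = retrieve_parameters("blisk_airfoil", "Ti-6Al-4V")
print(params if params else "no mature case - manual process design needed")
```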
5. Case Study 2: Enhancing Productivity in the Assembly Domain

The state-of-the-art strategies for making assembly systems more productive were discussed in section 2 in fragmented form. It was concluded that high mass customization has introduced complexities in terms of the efficient and intelligent utilization of resources, the precise modelling of assembly processes, and reliable and effective quality control. The case study described here concerns the assembly process of an automotive body in white. In this case, adhesive bonding of car body parts, an innovative joining process, is taken as an example. Adhesive bonding has a strong potential in car body assembly and has made the joining of different multifunctional materials possible. The objective of this case study is to investigate the possibility of increasing productivity in the assembly process with the following targets:
• Efficient resource utilization with easy and fast ramp-up of joining parts in the flexible cell
• Precise modelling of the assembly process and automatic updating with precise knowledge (experienced knowledge from the used cases)
• Intelligent selection and parameter setting of assembly setups, and use of the same setup for multipurpose applications (setup cost and time reduction)

5.1 Adhesive Bonding Process

Taking adhesive bonding as an example, for joining tasks the process control sequence and the relevant program are generated by the process modeller after extracting accurate process parameters for the joining process, e.g. adhesive dispensing rate, robot application speed, etc.
Fig. 6. Computation of adhesive positioning with hybrid automata

Figure 6 shows a part of a block program that is modelled using the concept of hybrid automata (Henzinger, 1996; Branicky et al., 1998). Intelligence can play an active role in setting up the guard conditions in the modelled program. As an example, the information about the guard conditions corresponding to the actual process conditions can be extracted from the knowledge characterizer when switching from discrete steps to continuous states, e.g. for position calculations that feed the monitoring data coming from the sensors into the process model to assess the actual quality of the adhesive bead, i.e. its form and position. The knowledge characterizer provides the necessary process parameters, i.e. dispensing pressure, temperature, robot speed, nozzle valve actuation frequency and its operating timings. This procedure of modelling through hybrid automata helps to eliminate risks and ensures precise process control that can be used in real-time situations at the assembly cell level, thereby enhancing process reliability and productivity. The process flow diagram (control program) of the adhesive bonding station is shown in figure 7. The process flow contains many feedback loops and computations after extracting monitoring data from the sensors. These feedback loops are activated using parameters from the knowledge characterizer and the conditions set by the TDC. A significant feature of the process modeller is that it automatically updates the process model through the real-time exchange of data with the knowledge characterizer. This saves setup time when there is a significant change of variants to be assembled in the cell, enabling fast automatic adaptation of the control programs. It makes the process modelling and programming activity in assembly more productive in terms of time and cost. Moreover, precise process modelling through extensive knowledge exchange with the knowledge characterizer helps to achieve higher quality, thereby making this activity more productive in terms of process reliability.
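To make the guard-condition idea concrete, the following minimal sketch (ours, not the authors' block program) models one step of a hybrid automaton for adhesive dispensing: a discrete guard switches from approach to dispensing, a continuous update computes the bead width, and a second guard raises a fault when the bead leaves tolerance. All states, thresholds and parameter values are assumptions.

```python
def dispensing_step(state, t, robot_speed, dispense_rate, bead_width):
    if state == "APPROACH":
        if t >= 1.0:                     # guard: nozzle at start position
            return "DISPENSE", bead_width
        return "APPROACH", bead_width
    if state == "DISPENSE":
        bead_width = dispense_rate / robot_speed   # continuous dynamics
        if not 2.0 <= bead_width <= 4.0:           # guard: tolerance (mm)
            return "FAULT", bead_width
        return "DISPENSE", bead_width
    return state, bead_width

state, width, t = "APPROACH", 0.0, 0.0
while t < 3.0 and state != "FAULT":
    state, width = dispensing_step(state, t, robot_speed=120.0,
                                   dispense_rate=350.0, bead_width=width)
    t += 0.1
print(state, round(width, 2))            # DISPENSE 2.92
```

In a real cell the guard thresholds would be retrieved from the knowledge characterizer instead of being hard-coded, as the text describes.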
Fig. 7. Process flow diagram (part) for robot guided adhesive bonding application

Flexibility in terms of resource selection and production setup configuration is one of the most influential factors in enhancing productivity: the more flexible the system, the more productive it is, provided the system is subjected to mass customization. For simplicity, this case study is carried out using the example of multisensory monitoring of the adhesive bonding process to demonstrate effective on-process quality assurance, which enables enhanced process quality and resulting product quality.

5.2 Multipurpose Multisensory Setup

The investigation of fast adaptation of production setups is made using the case of a multisensory setup that can be adapted for different assembly processes in the assembly cell (see Figure 8). The sensors relevant to the joining process are selected in the network by the setup configurator, and the controller manages the monitoring data exchange with the main controller for real-time process control. The selection of sensors can be made using the following methodologies:
1. Cost functions
2. Axiomatic design approach (Houshmand M. & Jamshidnezhad B., 2002; Igata, 1996)
3. Algorithms known from cognitive mapping theory (Zhang et al., 1992)
Fig. 8. Schematics of multi-purpose multisensory network

The cost function based evaluation methodology is simpler than the other two methodologies, and it has widespread use since it can be employed not only for the static activation of sensors but also for their dynamic activation in the network. The selection of sensors corresponding to process parameters can be made using the following algebraic equations. From figure 8, if there are n sensors in the network, each with its set of characteristics, then: (10)
where S1, S2, ..., Sn are the sensors in the network with the characteristics E1, E2, ..., En respectively, and the calculation of the cost function can be made in the following way:
(11)
where W1, W2, ..., Wn are the evaluated weights of S1, S2, ..., Sn respectively, based on the weights we1, we2, ..., wen of their characteristics. Finally, the sensors are selected, after their weights have been evaluated according to their suitability for measuring the process parameters, by the following equations: (12)

where M(P1), M(P2), ..., M(Pn) give the sets of selected sensors suitable for measuring the relevant parameters. The sensors which are not suitable are given zero weight and, as a result, are automatically eliminated from the equations. This methodology works well when mature knowledge about the sensors and their characteristics is available in the TDC. The ramp-up of newly developed sensors, or of sensors based on new technology, requires an update of the TDC for reliable selection and parameter setting.
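Since the bodies of equations (10)-(12) did not survive in the source, the following sketch (ours) encodes one plausible reading of the methodology: each sensor's weight for a parameter is derived from weighted characteristic scores, and sensors with zero weight drop out of M(P). All sensors, characteristics and weights below are hypothetical.

```python
# Hypothetical sensor network: characteristic scores Ei for each sensor Si
sensors = {
    "S1_camera":     {"form": 0.9, "position": 0.8, "temperature": 0.0},
    "S2_pyrometer":  {"form": 0.0, "position": 0.0, "temperature": 0.9},
    "S3_laser_scan": {"form": 0.7, "position": 0.9, "temperature": 0.0},
}
char_weights = {"form": 0.5, "position": 0.3, "temperature": 0.2}  # we

def select_sensors(parameter):
    """M(P): keep only sensors with nonzero weight for parameter P,
    ranked by their evaluated weight Wi."""
    ranked = [(name, char_weights[parameter] * chars[parameter])
              for name, chars in sensors.items() if chars[parameter] > 0]
    return sorted(ranked, key=lambda item: -item[1])

print(select_sensors("position"))   # zero-weight sensors are eliminated
```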
6. Discussion and conclusions

In this chapter, an integrated intelligent methodology for productivity enhancement has been presented. The methodology was discussed using two case studies in the production system. The first elaborated the innovative process chain optimization of blisks through the axiomatic design approach and the intelligent selection of process parameters from the TDC through the knowledge characterizer; the second discussed the parameter setting and adaptation of the assembly process control models for car body in white parts and, finally, the configuration/reconfiguration of the adhesive bonding assembly. Moreover, this integrated methodology ensures effective knowledge sharing between the process model and the knowledge characterizer, automatic updating of the influential process parameters to enhance the precision of the process model developed through hybrid automata, and the achievement of better process quality and resulting product quality. The salient advantage of this integrated methodology is that it addresses all the influential factors of productivity simultaneously. It is noteworthy that this methodology revolves around the technology data catalogue as the knowledge base for optimization and
adaptation purposes, and this is possible only if the information from the used cases has a high degree of maturity. Furthermore, if the knowledge is not mature or used cases are not available, then the technology data catalogue and the knowledge characterizer cannot be effective and reliable for precise optimization and adaptation. The authors have noted these limitations and, as a next step, this integrated methodology will be extended using concepts and algorithms known from self-learning theory.
7. References

Basson, A.H., Bonnema, G.M., and Liu, Y. (2004). A flexible Electro-Mechanical Design Information System. In Imre Horváth and Paul Xirouchakis (ed.), Tools and Methods of Competitive Engineering, Vol. 2, 879-889. Millpress, Rotterdam, Netherlands.
Baydar, C. and Saitou, K. (2001). Automated generation of error recovery logic in assembly systems using genetic programming. Journal of Manufacturing Systems, vol. 20(1), pp. 55-68.
Berends, H., Van der Bij, H., Debackere, K., and Weggeman, M. (2006). Knowledge sharing mechanisms in industrial research. R&D Management, 36(1), pp. 85-95.
Berger, U., Lebedynska, Y., Minhas, S.U.H. (2008). Incorporating intelligence and development of knowledge acquisition system in an automated manufacturing environment. International Journal of Systems Application, Engineering & Development, Issue 2, Vol. 2, ISSN: 2074-1308.
Bollinger, J.E. (1998). Visionary Manufacturing Challenges for 2020, (Ed.) Committee on Visionary Manufacturing Challenges, National Research Council, National Academy Press, Washington, DC.
Branicky, M.S., Borkar, V.S., and Mitter, S.K. (1998). A unified framework for hybrid control: model and optimal control theory. IEEE Transactions on Automatic Control, Vol. 43, No. 1, pp. 31-45.
Bußmann, M., Kraus, J., and Bayer, E. (2005). An Integrated Cost-Effective Approach to Blisk Manufacturing. http://www.mtu.de/en/technologies/engineering_news/an_integrated_approach.pdf
Bußmann, M., and Bayer, E. (2008). Market-oriented blisk manufacturing. A challenge for production engineering. http://www.mtu.de/de/technologies/engineering_news/bayer_bliskfertigung.pdf, 2008.
Calabrese, G. (1999). Managing information in product development. Logistics Information Management, Vol. 12, No. 6, pp. 439-450.
Choo, C.W. (1996). The knowing organization: How organizations use information to construct meaning, create knowledge and make decisions. International Journal of Information Management, Vol. 16, No. 5, pp. 329-340.
Esslinger, J., Helm, D. (2003). Titanium in Aero-Engines. In G. Lütjering (ed.) Proceedings of the 10th World Conference on Titanium, Vol. 5, Wiley-VCH, Weinheim, Germany.
Gary, M.G., Warren, H.H. (2008). RFID in mixed-model assembly operations: Process and quality cost savings. IIE Transactions, vol. 40, Issue 11, pp. 1083-1096.
Goossenaerts, J.B.M. and Pelletier, C. (2001). Enterprise Ontologies and Knowledge Management. In: K.-D. Thoben, F. Weber and K.S. Pawar (ed.) Proceedings of the 7th International Conference on Concurrent Enterprising: Engineering the Knowledge Economy through Co-operation, Bremen, Germany, June 2001, pp. 281-289.
Henzinger, T.A. (1996). The theory of hybrid automata. In Proceedings of the 11th Annual Symposium on Logic in Computer Science, IEEE CS Press, pp. 278-292.
Houshmand, M., Jamshidnezhad, B. (2002). Redesigning of an automotive body assembly line through an axiomatic design approach. Proceedings of the Manufacturing Complexity Network Conference, Cambridge, UK.
Hu, M., Yang, K., Taguchi, S. (2000). Enhancing Robust Design with Aid of TRIZ and Axiomatic Design. http://www.triz-journal.com/archives/2000/10/e/2000-10e.pdf, 2000.
Igata, H. (1996). Application of Axiomatic Design to Rapid-Prototyping Support for Real-Time Control Software. M.Sc. Thesis, Massachusetts Institute of Technology, USA.
Janssen, R. (2003). Bohren und Zirkularfräsen von Schichtverbunden aus Aluminium, CFK und Titanlegierungen. Forschungsberichte aus der Stiftung Institut für Werkstofftechnik Bremen, Band 20, PhD dissertation. Shaker Verlag, Aachen, Germany.
Kendal, S.; Creen, M. (2007). An Introduction to Knowledge Engineering. Springer, USA.
López de Lacalle Marcaide, L.N., Sánchez Galíndez, J.A., Lamikiz Menchaca, A. (2004). Mecanizado de Alto Rendimiento, Procesos de Arranque. Izzaro group, Spain.
Lepratti, R. (2005). Ein Beitrag zur fortschrittlichen Mensch-Maschine-Interaktion auf Basis ontologischer Filterung. Logos Verlag, Berlin.
Minhas, S.-ul-H., Kretzschmann, R., Vargas, V., Berger, U. (2008). An Approach for the Real Time Intelligent Production Control to Fulfil Mass Customization Requirements. Mass Customization Services, Edwards, K., Blecker, Th., Salvador, F., Hvam, L., Fridierich, G. (Eds.), ISBN: 978-87-908555-12-3.
Nasr, E.A., Kamrani, A.K. (2007). Computer-Based Design and Manufacturing, An Information-based Approach. Springer, USA.
Park, A. (2004). Can EMC Find Growth Beyond Hardware? BusinessWeek.
Perks, H. (2000). Marketing Information Exchange Mechanisms in Collaborative New Product Development, the Influence of Resource Balance and Competitiveness. Industrial Marketing Management, vol. 29, pp. 179-189.
Schenk, M., Straßburger, S., Kissner, H. (2005). Combining virtual reality and assembly simulation for production planning and worker qualification. International Conference on Changeable, Agile, Reconfigurable and Virtual Production CARV 05.
Schuster, E.W., Tom, S., Pinaki, K., David, L.B. and Stuart, J.A. (2004). The Next Frontier: How Auto-ID Could Improve ERP Data Quality. Cutter IT Journal, vol. 17(9).
Sharif Ullah, A.M.M. and Harib, K.H. (2006). Zadehian Paradigms for Knowledge Extraction in Intelligent Manufacturing. Manufacturing the Future. Concepts, Technologies & Visions. Kordic, Lazinica, Merdan (eds.), Pro Literatur Verlag.
Steffens, K., and Wilhelm, H. (2008). New Engine Generation: Materials, Surface Technology, Manufacturing Processes. What comes after 2000? http://www.mtu.de/en/technologies/engineering_news/11773.pdf, 2008.
Suh, N.P. (1990). The Principles of Design. Oxford University Press, USA.
Suh, N.P. (2001). Axiomatic Design: Advances and Applications. Oxford University Press, USA.
Uschold, M., and Gruninger, M. (1996). Ontologies: Principles, Methods and Applications. Knowledge Engineering Review, 11(2).
Van der Bij, H.; Song, M.X., and Weggeman, M. (2003). An Empirical Investigation into the Antecedents of Knowledge Dissemination at the Strategic Business Unit Level. Journal of Product Innovation Management, vol. 20, pp. 163-179.
Zhang, W.R.; Chen, S.S.; King, R.S. (1992). A cognitive-map-based approach to the coordination of distributed cooperative agents. IEEE Trans. Syst., Man, Cybern., vol. 22(1), pp. 103-114.
30

Graph-based exploratory analysis of biological interaction networks

Maurizio Adriano Strangio
Department of Mathematics, University of Rome "Roma Tre", Italy

1. Introduction

The interactions that occur among different cells and within them are essential for maintaining the fundamental life processes of biological systems. These interactions are generally described as networks interconnecting an enormous number of nodes and are characterised by extremely complex topologies. Among such networks, the most studied are typically classified as metabolic pathways (the flow of material and energy in the organism), gene regulatory cascades (when and which genes are expressed) and protein signalling networks (e.g. triggering an immune response to a viral attack). The Pathguide resource list offers detailed information on several biological pathway and network databases accessible through the web (http://www.pathguide.org, 2009 - Bader et al., 2006). Mathematical objects such as graphs, consisting of nodes and links joining the nodes (arcs and edges), offer a convenient and effective level of abstraction for representing the aforementioned networks; the resulting graphs are generally referred to as biological interaction networks. Biologists can benefit from graph-based formalisms since they are able to visualize the overall topology of the network (or certain parts of it), modify nodes and links, and annotate them with additional information. Furthermore, many biological questions can be conveniently formulated in terms of search or optimization problems on graphs. For example, it is currently believed that the topological structure of biological networks (e.g. cell signalling pathways) reveals their robustness characteristics and evolutionary history. The main lines of thought in this field describe biological interaction networks by employing formalisms based on either undirected or directed graphs; we describe them in section 2. The complexity of biological networks is not only driven by the enormous number of variables and the relationships among them; additional sources of uncertainty are due to the difficulty of observing cellular dynamics in 3D-space. Thus, developing quantitative mathematical models of cellular dynamics in large networks may be computationally infeasible. However, structural analysis and hypothesis formulation (and validation) may reveal valuable information on the underlying biological functions at a system level (e.g. which molecules and information processing pathways are involved in the accomplishment of a biological function). For this reason, it is essential for biologists working on such networks under this perspective to have access to open and extensible computing platforms
with advanced functionalities and efficient graph rendering routines. To this end, we summarise (section 4) the features of an exemplary subset of open source software packages typically used for the exploratory analysis of biological interaction networks of variable type and size. We conclude the chapter by observing that currently there is a lack of quantitative and comprehensive frameworks for linking network topology to biological function.
2. Graph-based formalisms for representing biological interaction networks

Graphs are widely used as a mathematical formalism for representing networks. In this section we review directed and undirected graph-based paradigms used to represent and study biological networks. Directed graphs can well describe causal relationships among entities, whereas undirected graphs are better suited for expressing soft constraints between nodes.

2.1 Undirected Graphs

A biological interaction network may be represented by an undirected graph, a mathematical structure G(V,E) composed of a set of vertexes V and edges E (E ⊆ [V]²); we denote the resulting graph as a biological structure graph (BSG). The vertex set consists of all the species that occur in the network; an edge connects two species that are biologically related. Interesting matters to investigate about BSGs are, for example, determining whether they share the same topological properties as other types of networks (e.g. social or communication networks), their robustness (e.g. against viral attacks or mutations) and whether they are self-organizing. To this end, qualitative approaches based on statistical characteristics of network structure have been widely applied for studying the topology of large-scale networks (Albert, R. & Barabasi, A.-L., 2002; Watts, D.J. & Strogatz, S.H., 1998; see Jeong et al., 2001 for a case study of protein-protein interaction networks). These methods try to exploit massive datasets in order to discover patterns that would otherwise be undetectable at smaller scales. Among the most frequently used and perhaps most significant topological statistics are the average path length l, defined as the average number of links along the shortest paths over all possible pairs of network nodes, and the average clustering coefficient C = (1/n) Σi Ci, where the clustering coefficient Ci of node i provides a quantitative measure of how close to
a clique is the subgraph obtained by considering the vertex i and all its neighbors. In random networks (Erdos, P. & Renyi, A., 1960), an edge connecting two arbitrary nodes is added to the graph as the result of a random experiment (the probability of accepting the edge is equal to a fixed p ∈ [0,1]). In these networks the statistics l and C generally assume small values (Albert, R. & Barabasi, A.-L., 2002). A small l implies that an arbitrary node can reach any other node in a few steps; it is thus a measure of the efficiency of information flow (or processing) across the network (e.g. it is useful for studying mass transfer in a metabolic network). On the other hand, with a small C it is likely that the network does not contain clusters and groups.
Recent studies have revealed that many complex systems present small-world and scale-free properties (Albert, R. & Barabasi, A.-L., 2002). A small-world network has a significantly larger C, with respect to a random graph constructed with the same vertex set, and a relatively small l (Watts, D.J. & Strogatz, S.H., 1998). Scale-free networks, on the other hand, are characterised by a degree distribution following the Pareto or power-law distribution p(k) ∼ k^(-γ), meaning that the fraction p(k) of nodes in the network having k links to other nodes is proportional to k^(-γ), where γ is a constant typically in the range ]2,3[ (Albert, R. & Barabasi, A.-L., 2002). It has been conjectured that a self-organizing mechanism based on the preferential attachment rule can explain the dynamics of scale-free networks. Preferential attachment, which naturally leads to the power-law degree distribution, implies that when a new node is added to the network it is attracted by the vertexes with the highest degree. This is in contrast to random graphs, where the addition of nodes follows a pure probabilistic process, meaning that nodes are equally important in a statistical sense (no node is privileged). Research has shown that the functions of a complex system may be affected to a large extent by its network topology (Albert, R. & Barabasi, A.-L., 2002). For example, the greater tendency for clustering in metabolic networks appears related to the organization of functional modules in cells, which contribute to the behaviour and survival of organisms. In a biological network this behaviour is plausible (and desirable) since the network converges toward a structure that favours the efficiency (low-energy reactions) of its functions (many real-world networks ultimately exhibit this structure). Some researchers explain this behaviour as evidence of evolutionary history; i.e., nodes with high degrees have resisted selective pressure since they are important for the survival of the organism (acting as hubs, they are essential for the correct functioning of the network). It is this structure that makes such networks tolerant to random failures and errors (e.g. missense point mutations) but more vulnerable to targeted attacks (e.g. removal of the nodes with highest degree) (Holme et al., 2002; Bollobas & Riordan, 2003). However, while this a posteriori analysis seems consistent with observations of real biological networks, there are still many open questions regarding the way such networks are actually formed.
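These statistics are straightforward to compute with standard graph libraries. The following minimal sketch (ours; the chapter does not prescribe any library) uses networkx to compare l and C for a random graph and a preferential-attachment (scale-free) graph of the same size; the sizes and parameters are illustrative.

```python
import networkx as nx

# Both graphs are (almost surely) connected, which
# average_shortest_path_length requires.
random_g = nx.erdos_renyi_graph(n=1000, p=0.01, seed=1)
scale_free_g = nx.barabasi_albert_graph(n=1000, m=5, seed=1)

for name, g in (("random", random_g), ("scale-free", scale_free_g)):
    l = nx.average_shortest_path_length(g)
    c = nx.average_clustering(g)
    print(f"{name}: l = {l:.2f}, C = {c:.3f}")
```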
2.2 Directed Graphs

When reactions are also represented as vertexes Vr, distinct from the set of vertexes Vs denoting species, we obtain a particular directed graph known as a bipartite directed graph G(Vs, Vr, E), where Vs ∩ Vr = ∅ and E contains the arcs connecting the species occurring within a specific reaction. Following the terminology introduced by other authors (BPGs, Ruths et al., 2006) we refer to such graphs as biological pathway graphs (the nomenclatures BSG and BPG used in this article simply denote that the network is represented respectively by an undirected or a directed graph; the BSG formalism is often used to represent protein-protein and protein-DNA interaction networks, whilst the BPG is employed to study metabolic networks). BPGs based on bipartite graphs are characterised by the following properties (which immediately follow from the definition of bipartite graph):
(a) the sets Vs, Vr form a partition of the vertex set V. Therefore a node in a BPG is either a molecule or a reaction but not both;
(b) the edge set E ⊆ Vs x Vr ∪ Vr x Vs (the symbol x denotes the cartesian product between sets; a directed edge connecting the generic nodes u and v is denoted by uv and, likewise, the directed edge connecting v to u is denoted by vu). This condition implies that Vs and Vr are independent sets. As a result, molecules point exclusively (are the input) to reactions, while the latter are linked only to molecules;
(c) the property ∀r ∈ Vr, ∃u,v ∈ Vs s.t. ur, rv ∈ E implies that any reaction requires at least one molecule (substrate) in input and produces at least one molecule (product) in output.
The BPG directed graph-based formalism is appealing since:
(a) both molecules and reactions are viewed as vertexes (which typically carry the information content of the graph, while edges carry the information concerning its topological structure), thus making certain types of analysis simpler (e.g. when taking into account the enhancing or inhibiting character of reactions);
(b) the directionality of the reactions and the information flows in the network are made explicit;
(c) there is (at least in principle) a greater compatibility with richer (quantitative) models that are able to take into account system dynamics (e.g. Petri nets).
The analysis of a BPG is essentially performed by working on the corresponding adjacency matrix A of size (|Vr|+|Vs|)² (the notation |S| represents the number of elements in the set S, i.e. the cardinality of S); element A(i,j) is set to 1 if there exists an arc joining vertex i with vertex j (it is straightforward to verify that the |Vs|² elements of the upper-left sub-matrix and the |Vr|² elements of the lower-right sub-matrix are equal to zero). An alternative representation is given by the adjacency list L; in this case the list of nodes (either molecules or reactions) for which there exists an arc originating from the i-th node is stored in the corresponding position of the array L(i). Graph analysis algorithms are often more efficient when implemented using adjacency lists. Of notable interest are the sets γ(1)(s)={e ∈ E | e = sr, r ∈ Vr} and γ(-1)(s)={e ∈ E | e = rs, r ∈ Vr}, which contain the arcs respectively departing from and entering into an arbitrary species node s, and the sets µ(1)(s)={r ∈ Vr | sr ∈ E} and µ(-1)(s)={r ∈ Vr | rs ∈ E}, which contain respectively the child and parent vertexes (reactions) of an arbitrary species node s. Analogous definitions can be set out for reaction nodes (i.e. for the sets γ(1)(r), γ(-1)(r), µ(1)(r), µ(-1)(r)). Also, for v ∈ Vr ∪ Vs, d+(v)=|γ(1)(v)|=|µ(1)(v)| and d-(v)=|γ(-1)(v)|=|µ(-1)(v)| are respectively the out-degree and in-degree of v; from property (c) we have ∀r ∈ Vr, d+(r)d-(r) > 0. The in- and out-degrees of a molecule node are extremely informative, since one can identify "source" molecules (with in-degree equal to 0, that act only as substrates) and "sink" molecules (with out-degree equal to 0, that act only as products). Nodes with large in/out-degrees (also called "hub" nodes) are typical of many reaction networks that exhibit the scale-free topology and small-world structure discussed above. It is interesting to explore a BPG by searching for the induced subgraph originating from a particular vertex set.
The basic use of this algorithm is for identifying the reactions and substrates involved in the transformations between any two molecules (or sets of molecules). In particular, one can compute the induced subgraph originating from a certain vertex v ∈ Vs with constraints specified in order to exclude a set of species nodes Xs ⊆ Vs and/or reaction nodes Xr ⊆ Vr
from the search (i.e. vertexes that should not be visited by the algorithm). This can be useful for determining how inhibitors affect the overall functionality of the network. Path enumeration algorithms identify the paths originating in a vertex v ∈ Vs and terminating in t ∈ Vs. Again, the algorithm may be restricted to visiting (or excluding) certain nodes, as above. To detect feedback loops incident on v it is sufficient to set v = t; such loops are worthy of interest from a biological point of view. An interesting line of research was recently opened by (Ruths et al., 2006), who formulated two biologically significant problems on BPGs: the constrained downstream problem (CDP) and the minimum knockout problem (MKP). The CDP essentially searches a BPG for the set of reactions that leads from one set of species to another, with the constraint that the search algorithm must include and/or exclude a given set of reactions. An application of a CDP instance to a BPG would therefore be useful for designing drugs that modify or inhibit certain biological functions while preserving others. From a topological point of view this amounts to identifying the molecules, or sets of molecules, that have to be targeted to inhibit the function of a sub-network while preserving connectivity (and thus information flow) to a different sub-network. The MKP, on the other hand, seeks a minimum-size set of molecules whose removal (knocking out) from the BPG disconnects given sets of source and target species, thus making the production of certain target molecules from the source set impossible. An instance of the MKP may help solve the biological problem of identifying the optimal and minimal sets of molecules that have to be targeted to block a network function. For example, in patients affected by cancer this could allow the development of therapeutics that efficiently destroy cancer cells while preserving normal cells. The above algorithms, based on a depth-first search strategy, usually require running times of O(|Vs \ Xs| + |Vr \ Xr|). A minimal sketch of such a constrained search is given below.
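The following sketch (ours, not from Ruths et al., 2006) implements the basic exclusion-constrained depth-first traversal on a BPG stored as an adjacency-list dictionary; the example network and node names are hypothetical.

```python
def downstream(graph, start, excluded=frozenset()):
    """Depth-first visit of all nodes reachable from start, skipping
    the excluded species (Xs) and reaction (Xr) nodes."""
    visited, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in visited or node in excluded:
            continue
        visited.add(node)
        stack.extend(graph.get(node, []))
    return visited

# Hypothetical BPG as an adjacency list: species s*, reactions r*
bpg = {"s1": ["r1"], "r1": ["s2"], "s2": ["r2"], "r2": ["s3"], "s3": []}
print(downstream(bpg, "s1", excluded={"r2"}))   # {'s1', 'r1', 's2'}
```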
2.3 Relationship between directed and undirected graph representations

The transformation of the BPG G(Vs, Vr, E) into an undirected bipartite graph Gu(Vs, Vr, Eu) can be simply obtained by duplicating all the arcs in E with their reverses, i.e. Eu = E ∪ {vu | uv ∈ E}. It is straightforward to verify that the (symmetric) adjacency matrix B of the associated undirected graph Gu may be calculated by the matrix sum B = A + A' (or A' + A; A' denotes the transpose of A), where A is the adjacency matrix of the directed graph G.

Fig. 1. (a) An example BPG, (b) the undirected BPG derived from (a) and (c) the BSG corresponding to (b).

As an example, following this procedure the BPG depicted in Figure 1-(a) is transformed into the undirected graph shown in Figure 1-(b), which is also bipartite. From a biological
point of view, one must be aware that the undirected graph contains additional information, since molecules that were reactants are now also products and vice versa; such newly introduced reactions are not necessarily biologically significant. A further transformation step is illustrated in Figure 1-(c), where the undirected BPG is transformed into a BSG by cutting the reaction nodes (all the information contained in the arcs entering and leaving a reaction is retained).

2.4 Petri Nets

In computer science, a Petri net (PN) is a graphical and mathematical modelling tool for the analysis and simulation of concurrent and/or stochastic systems (Murata, 1989). A system is represented as a bipartite directed graph with a set of "place" nodes that contain resources (conditions) and a set of "event" nodes which generate and/or consume resources (transitions). A PN models both the topology of the network and the dynamical behaviour of the system. For this reason PNs have been employed to simulate biological networks (Reddy et al., 1993; Tsavachidou & Liebman, 2002; Nagasaki et al., 2005; Chaouiya, 2007). In translating a biological network into a PN, each molecule and each interaction is described as a node. Each molecule is initialized with some number of "tokens", which are then iteratively reallocated by the interactions to which they are connected.
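To make the token-reallocation idea concrete, the following minimal sketch (ours, not from the cited works) plays the "token game" for a single hypothetical reaction A + B → C; the places, initial marking and stoichiometry are illustrative.

```python
# Places with an initial marking and one transition for A + B -> C.
places = {"A": 3, "B": 2, "C": 0}
inputs, outputs = {"A": 1, "B": 1}, {"C": 1}

def enabled():
    """The transition may fire only while every input place has tokens."""
    return all(places[p] >= n for p, n in inputs.items())

while enabled():                    # fire the transition until blocked
    for p, n in inputs.items():
        places[p] -= n
    for p, n in outputs.items():
        places[p] += n

print(places)                       # {'A': 1, 'B': 0, 'C': 2}
```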
3. Enhanced formalisms for representing biological networks

In this section we briefly describe graphical formalisms that are commonly used to represent complex biological systems. Although such formalisms are not always strictly graph-based (i.e. directed or undirected graphs), they are worthy of mention since they afford visual modelling approaches that humans can employ for the description, analysis and simulation of complex biological systems by computer programs. In general, with respect to mathematical graphs, these notations encompass enriched and highly expressive formalisms that can describe many types of nodes and relationships. These languages provide a consistent and complete formalism for the specification of biological models and thus facilitate model sharing across different software environments (models should be platform-independent). It must be emphasized that systems biology currently still lacks a standard graphical notation.

3.1 SBML

The Systems Biology Markup Language (SBML) is an open XML-based schema for representing quantitative biological models such as regulatory and metabolic networks (Hucka et al., 2003; Finney & Hucka, 2003). SBML specifies a formalism for representing biological entities by means of compartments and reacting species, as well as their dynamic behaviour, using reactions, events and arbitrary mathematical rules (Hucka et al., 2008). The SBML file format and underlying network structure are widely accepted and supported by a large number of tools (http://sbml.org, 2009).
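As a brief illustration of programmatic access to SBML, the following sketch (ours) reads a model with the libSBML Python bindings; it assumes python-libsbml is installed, and "model.xml" is a placeholder file name.

```python
import libsbml  # python-libsbml bindings, assumed installed

doc = libsbml.readSBML("model.xml")   # "model.xml" is a placeholder
if doc.getNumErrors() > 0:
    doc.printErrors()
else:
    model = doc.getModel()
    print("species:  ", model.getNumSpecies())
    print("reactions:", model.getNumReactions())
```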
3.2 BioPAX

The Biological Pathway Exchange Format (BioPAX) is a collaborative effort to create a data exchange format for biological pathway data (http://www.biopax.org, 2009). The BioPAX specification provides an abstract object model, based on the web ontology language (OWL), for the representation of biological pathway concepts. Mapping the concepts defined in any custom data model to the semantically corresponding items in the BioPAX model provides a unified formalism for the representation of biological interaction networks.

3.3 PSI-MI

The Proteomics Standards Initiative (PSI-MI) provides an XML-based formalism for the representation of protein-protein interaction data (http://psidev.sourceforge.net/mi/xml/doc/user/, 2009). Future developments are aimed at expanding the language to include other types of molecules (e.g. DNA, RNA, etc.).

3.4 SBGN

Recently, the level 1 specification of the Systems Biology Graphical Notation (SBGN) was released (http://www.sbgn.org, 2009) in an effort to provide a standardised graphical notation for describing biological systems in a format comprehensible to humans. In this respect, it may be considered a natural complement to SBML and to the aforementioned formalisms.
4. Graph-based Analysis Tools

It is essential for biologists working with biological networks to have access to open and extensible computing platforms with advanced functionalities and efficient graph drawing routines. For this purpose there are currently many commercial and open source tools available; a short selection is provided in Table 1, which highlights some essential features.

Cytoscape (Shannon et al., 2003) - Graph Analysis: YES; Graph Editing: YES; Graph Visualization: YES; Integration: Web Services; Supported Formats: SBML, PSI-MI, BioPAX; License: Open source
PIANA (BIANA) (Aragones et al., 2006) - Graph Analysis: NO; Graph Editing: YES; Graph Visualization: YES; Integration: Cytoscape plug-in; Supported Formats: PSI-MI, BioPAX; License: Open source
ONDEX (Kohler et al., 2006) - Graph Analysis: YES; Graph Editing: YES; Graph Visualization: YES; Integration: YES; Supported Formats: SBML, PSI-MI, BioPAX; License: Open source
VisANT (Hu et al., 2006) - Graph Analysis: YES; Graph Editing: YES; Graph Visualization: YES; Integration: YES; License: Open source
GraphWeb (Reimand et al., 2008) - Graph Analysis: YES; Graph Editing: YES; Graph Visualization: YES; Integration: YES; License: Public web service
PathSys (BiologicalNetworks) (Baitaluk et al., 2006) - Graph Analysis: YES; Graph Editing: YES; Graph Visualization: YES; Integration: Cytoscape plug-in; Supported Formats: BioPAX; License: Open source

Table 1. Software tools for graph-based analysis of biological networks
592
Advanced Technologies
The "graph analysis" field indicates whether the software tool can perform structural analysis by making use of graph-theoretic algorithms (average path length, clustering, diameter, etc.). The "graph editing" field shows that all the tools have graph editing functionalities, i.e. the capability of adding and modifying nodes or edges (although not all packages have graphical editors). All programs implement advanced graph rendering routines that are able to display large networks ("graph visualization"). The "supported formats" field enumerates the supported notations among those described previously. Finally, the "license" field indicates the conditions of use of the software (some licenses are free only for academic use).
5. Conclusions

With the recent advancements of experimental techniques in molecular biology (e.g. microarrays), vast amounts of data are being gathered at an impressive rate and made available in numerous on-line databases. The systematic analysis of this data is currently the objective of many determined research efforts aiming to observe and understand the biological principles that govern the interactions between a variety of molecules at the system level. Such interactions are viewed as networks that often use graph-based notations, since the latter allow explorative, statistical and functional analysis (e.g. local and global characteristics, growth dynamics). In addition, graph-based approaches may provide the means for identifying the functional units (constituting a large network) responsible for the basic life processes in an organism at the molecular level. Such analysis may also explain the (multifactorial) mechanisms by which molecular interaction networks (as a whole or in certain subsystems) appear disrupted in many forms of disease. Although cluster analysis may be able to identify the functional components of a system, it is also necessary to study the interactions among subsystems and with the environment. This also applies when evaluating network robustness with probabilistic formalisms, such as Markov-chain models or stochastic PNs, which capture the time-dependent probability of error propagation within a system but are unable to explain the course of events caused by interactions with the environment (e.g. transitional events) or the relationship between faults and the functional model (a fault probability equal to zero does not guarantee correct functioning of the system). Applying highly expressive formalisms (e.g. Petri nets) presents at least two problems: a) exploratory techniques can be overly complex (e.g. NP-hard algorithms, state-space explosion) and b) models are difficult to use and understand, since they require skilled users to design and maintain them. These are general issues that apply to an entire class of graph-based formalisms used to model large-scale networks. As a result, no single technique appears suitable for representing an entire system from both the structural and functional points of view. One way to approach these problems is by using multiple formalisms, i.e. applying different modelling techniques to different parts of the system in order to exploit the strengths of one notation where others are less suitable. However, we stress that the major concerns in this area are due to the absence of a widely accepted standard notation for the representation of biological interaction networks and to the lack of quantitative models that are able to explain both their structural features and their system dynamics.
6. References

Albert, R. & Barabasi, A.-L. (2002) "Statistical mechanics of complex networks" Reviews of Modern Physics 74:47-97
Aragones, R.; Jaeggi, D.; Oliva, B. (2006) "PIANA: Protein Interactions and Network Analysis" Bioinformatics 22(8):1015-7, http://sbi.imim.es/piana
Bader, G.D.; Cary, M.P.; Sander, C. (2006) "Pathguide: a pathway resource list", Nucleic Acids Res. 34(Database issue):D504-6
Baitaluk, M.; Qian, X.; Godbole, S.; Raval, A.; Ray, A.; Gupta, A. (2006) "PathSys: integrating molecular interaction graphs for systems biology" BMC Bioinformatics, 7:55, http://biologicalnetworks.net/
Bollobas, B. & Riordan, O. (2003) "Robustness and Vulnerability of Scale-Free Random Graphs" Internet Math. Vol. 1, No. 1, pp. 1-35
Chaouiya, C. (2007) "Petri net modelling of biological networks" Briefings in Bioinformatics 8(4):210-9
Erdos, P. & Renyi, A. (1960) "On the evolution of random graphs" Publ. Math. Inst. Hungar. Acad. Sci. 5:17-61
Finney, A. & Hucka, M. (2003) "Systems biology markup language: Level 2 and beyond." Biochem Soc. Trans. 31:1472-1473
Holme, P.; Kim, B.J.; Yoon, C.N.; Han, S.K. (2002) "Attack vulnerability of complex networks" Physical Review E 65:1-14
Hu, Z.; Mellor, J.; Snitkin, S.E.; DeLisi, C. (2008) "VisANT: an integrative framework for networks in system biology", Briefings in Bioinformatics, 9:317-325
Hucka, M.; Finney, A.; Sauro, H.M.; Bolouri, H.; Doyle, J.C.; Kitano, H.; Arkin, A.P.; Bornstein, B.J.; Bray, D.; Cornish-Bowden, A.; Cuellar, A.A.; Dronov, S.; Gilles, E.D.; Ginkel, M.; Gor, V.; Goryanin, I.I.; Hedley, W.J.; Hodgman, T.C.; Hofmeyr, J.H.; Hunter, P.J.; Juty, N.S.; Kasberger, J.L.; Kremling, A.; Kummer, U.; Le Novere, N.; Loew, L.M.; Lucio, D.; Mendes, P.; Minch, E.; Mjolsness, E.D.; Nakayama, Y.; Nelson, M.R.; Nielsen, P.F.; Sakurada, T.; Schaff, J.C.; Shapiro, B.E.; Shimizu, T.S.; Spence, H.D.; Stelling, J.; Takahashi, K.; Tomita, M.; Wagner, J.; Wang, J. (2003) "The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models." Bioinformatics 19:524-531
Hucka, M.; Hoops, S.; Keating, S.M.; Le Novère, N.; Sahle, S.; Wilkinson, D. (2008) "Systems Biology Markup Language (SBML) Level 2: Structures and Facilities for Model Definitions" Nature Proceedings (doi:10.1038/npre.2008.2715.1)
Jeong, H.; Mason, S.P.; Barabasi, A.L.; Oltvai, Z.N. (2001) "Lethality and centrality in protein networks" Nature 411:41-42
Kohler, J.; Baumbach, J.; Taubert, J.; Specht, M.; Skusa, A.; Ruegg, A.; Rawlings, C.; Verrier, P.; Philippi, S. (2006) "Graph-based analysis and visualization of experimental results with ONDEX" Bioinformatics 22(11):1383-1390, http://ondex.sourceforge.net
Murata, T. (1989) "Petri Nets: Properties, Analysis and Applications" Proceedings of the IEEE Vol. 77, No. 4, pp. 541-580
Nagasaki, M.; Doi, A.; Matsuno, H. & Miyano, S. (2005) "Petri Net Based Description and Modeling of Biological Pathways" Algebraic Biology, Universal Academy Press, Inc., pp. 19-31
Reimand, J.; Tooming, L.; Peterson, H.; Adler, P.; Vilo, J. (2008) "GraphWeb: mining heterogeneous biological networks for gene modules with functional significance" Nucl. Acids Res., 36:W452-W459, http://biit.cs.ut.ee/graphweb/
Reddy, V.N.; Mavrovouniotis, M.L.; Liebman, M.N. (1993) "Petri net representations in metabolic pathways", Proceedings of the First ISMB, pp. 328-336
Ruths, D.A.; Nakhleh, L.; Iyengar, M.S.; Reddy, S.A.G.; Ram, P.T. (2006) "Hypothesis Generation in Signalling Networks" J. Comput. Biol. 13(9):1546-1557
Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. (2003) "Cytoscape: a software environment for integrated models of biomolecular interaction networks" Genome Res, 13(11):2498-2504, http://www.cytoscape.org
Tsavachidou, D.; Liebman, M.N. (2002) "Modeling and simulation of pathways in menopause" Journal of the American Medical Informatics Association 9(5):461-71
Watts, D.J. & Strogatz, S.H. (1998) "Collective dynamics of small-world networks" Nature 393:440-442
31

Intruder Alarm Systems - The Road Ahead

Rui Manuel Antunes and Frederico Lapa Grilo
Escola Superior de Tecnologia de Setúbal (Setúbal Polytechnic Institute), Portugal

1. Introduction
This chapter presents today's state-of-the-art intruder alarm systems and detectors, giving special focus to the several technologies applied, including the wireless transmission and reception of alarm messages and commands through GSM/GPRS and TCP/IP, and the design and development of web-based intruder alarm monitoring and control hardware and software. Useful project techniques are also described in detail, concerning the installation of intruder alarm systems for homeowners and for commercial/industrial use. New developments in distributed web-based intruder alarm systems, which can include not only traditional signalling functions but also new "smart" decision functions, are exciting challenges today for intruder alarm designers. Web-based intruder alarm systems may include the use of distributed nets ("grids"), giving each node the ability to dynamically configure its functions while fully respecting the security scope requirements. With this approach, distributed network intelligence will allow an intruder alarm system to react to multi-signalization intrusion situations in a much more efficient way, being also able to distinguish more accurately real security violations from inadvertent operations. The development of web-based intruder alarm systems, building on the evolution of new APIs and on new languages based on the web 2.0 philosophy, will lead to a new level of intruder alarm monitoring and control, with the integration of several new features such as, for example, signal pre-processing and geo-localization.
2. Today's Intruder Alarm Systems

2.1 Intruder Alarms and Detectors

Security alarm systems for intrusion prevention are massively present in our own homes and companies. A basic intruder alarm system consists of a control panel with rechargeable battery power backup and internal or external keypads, several interior and perimeter intrusion detectors, and at least one external sounder. An intruder alarm system can be classified as a hardwired, wireless, or hybrid system. Wireless systems are used when there is no pre-wiring, operating at 433/868 MHz. The most secure intruder alarm systems are hardwired systems, because wireless systems use additional battery power and, even with anti-jamming protection, can still be affected
by radio-frequency interference. Hybrid alarm systems allow the simultaneous installation of wireless and hardwired detectors, being the most versatile.
Fig. 1. Typical hybrid intruder alarm system

The STROBE signal is the output from the alarm control panel that enables the flash lamps of the external sounder. The BELL signal is the output from the alarm control panel used for triggering the alarm sounder devices. Both signals trigger from 12 V to 0 V. VBAT (+12V) and GND are the power source terminals for the alarm control panel and the external sounder. Finally, Tamper Return (TR) is the external sounder output signal used to indicate an opened sounder cover tamper. In hardwired intruder alarm systems each protected zone is a closed-loop circuit where a fixed current flows. The first protected zones are usually door entry zones, for manual or remote entry and for exit pre-time arming and disarming. Several commercial intruder alarm configurations are available: NO, NC, EOLR, 2-EOLR and 3-EOLR (normally open, normally closed, and End-of-line resistor loop circuits). The most common intrusion detectors, which can usually be classified as perimeter or interior detector devices, are: PIR (passive infrared) motion detectors, PET (pet-immune) motion detectors, dual technology motion detectors, acoustic glass break detectors, vibration/shock detectors, magnetic contact detectors and others (such as IR beam barriers or ultrasound sensors). A PET-immune detector is made of two PIR motion detectors, specially developed to detect simultaneously a high and a low layer pre-assigned beam. As pets under a certain weight are not large enough to hit both beams, the PET motion detector does not detect them. Outdoor PIR and dual technology motion detectors, commonly named WatchOUT detectors, are also being used as an important additional perimeter motion detector. Pet-immune 360º ceiling PIRs and dual technology motion detectors are also present in industrial facilities. Dual technology motion detectors can have Anti-Cloak Technology (ACT) to thwart camouflaged burglars. A moving cloaked intruder will emit a weak IR signal that has a characteristic shape. This technology applies recognition algorithms that disregard the
signal strength and focus on its pattern. Once it verifies that the pattern matches the signal of a burglar, ACT immediately switches to the microwave triggering mode. A dual technology motion detector can also be anti-masking, using an active infrared channel located behind the lens to protect against spraying and covering burglar techniques. Such a detector can also provide anti-fluorescent signal processing to avoid false alarms from fluorescent light flashes. Quad-element PIR technology can improve intruder catch performance and reduce false alarms, because two PIR channels/twin elements with separate Fresnel lenses are used, which can distinguish more effectively between humans and other infrared sources. The use of Green Line technology is also increasing, whereby a dual technology detector disables its microwave sensor when a building or zone is occupied; there is then no need to constantly send high-frequency microwave signals into areas occupied by humans, which also reduces power consumption. Magnetic contact detectors do not need to be powered at all: the presence of the magnetic field is enough to trigger the embedded relay of the detector it is attached to, so only a two-wire cable is needed for installation. Acoustic glass break detectors add important perimeter protection by detecting potential burglars while they are still outside. The main problem with acoustic devices is that an intruder may still manage to force a window open without needing to break the glass, so this kind of perimeter acoustic detector should be installed along with other interior intrusion detection devices. Vibration/shock sensors are a good alternative to acoustic glass break detectors, sensing the vibration on a window even before it breaks and thus providing reliable perimeter protection.

2.2 Project Techniques

In a hardwired alarm system, End-of-line resistors (EOL) can supervise the wires, preventing wire tampering. When a burglar shorts the loop wires together, the End-of-line resistor is bypassed, letting the control panel know that something is going on. The EOL resistor value is usually specified by the alarm system manufacturer, typically 5.6 kΩ. The right placement of an EOL resistor is always at the last detector on the loop circuit. EOL resistors should never be mounted at the control panel, because the loop would then act as if it had no EOL resistor at all. A Double-End-of-line resistor (2-EOLR) configuration allows distinguishing an opened cover tamper from the detector motion trigger condition, and a Triple-End-of-line resistor (3-EOLR) configuration may additionally signal an anti-masking detection or a fault.
Fig. 2. EOLR circuit configuration
598
Advanced Technologies
In a 2-EORL circuit configuration a (S3) tamper output normally closed detector terminal must be serial wired with the normally closed detector terminal, as shown in figure 3:
Fig. 3. 2-EOLR circuit configuration For a 3-EORL circuit configuration the Anti-masking/fault terminal must be also serial wired with the normally closed detector and tamper output terminals , as shown below:
Fig. 4. 3-EOLR circuit configuration Most recent intruder alarm systems allow now programmed supervision for both single, double, or triple End-of-line-resistor loop circuit configurations. Remote monitoring/control can be made using the telephone line or with a GSM/GPRS/IP module. Both allow a connection with any person, or to a dedicated monitoring station. GSM/GPRS modules have entry channels usually enabled to the negative transition (from 12V to 0V), thus being able to be directly connected to the STROBE and BELL control panel signals. These modules can be programmed from a mobile phone or from an USB port, predefining the destination phone numbers with SMS/MMS or simple alarm voice messages. Existing control panel telephone lines can also be directly wired to a GSM/GPRS/IP module communicator, and be used to send intruder alarm messages. To adapt an older alarm system to send GSM/GPRS or IP alerts, the BELL output may be wired to an input terminal of a GSM/GPRS/IP module. This simple arrangement will avoid the need for extra month fees in contracts with security companies. GSM/GPRS/IP modules can also be used to arm, disarm and even to verify all detectors, or a ready-to-arm condition.
Intruder Alarm Systems - The Road Ahead
599
Also, a bidirectional module can be used when an intruder alarm control panel has already built-in an arming/disarming loop (programmed as maintained key switch arm). Next table shows some important rules and advices, concerning the project of intruder alarm systems for homeowners (Antunes, 2007a): PIR/PET motion detectors: 1- Before mounting a detector, always read carefully the manufacturer installation manuals. 2- When installing a detector, care should be taken to check if manufacturer infrared and microwave radiation diagrams fills the area to protect. 3- The detector must be fixed approximately 2 meters from the floor (depends also on the module), and should always be identified with a stick number. 4- The detector must be pointed to the opposite direction of light entries and windows, and never directly to sun light. 5- A detector must preferentially point to frequent passage areas, such as door entries and corridors. 6- Avoid mounting the detector in areas near air conditioning, heaters, fans and ovens, where it could detect air flows. 7- Avoid installing a detector near areas where high humidity or vapor exists, that could easily cause condensation. 8- Avoid screens and moving curtains that may divide any infrared protected zone. 9- A 90º detector must be preferentially mounted to a corner, while a 360º detector should be mounted on the ceiling. 10- Screw the detector only to fix parts, like consistent walls. 11- Try to avoid using wall swivels, that could rotate, modifying the position of the detector. 12- After the installation check if all motion detectors are installed by performing always in the end a “walk test”. Dual Technology motion detectors: 1- Dual technology detectors should not be used in wireless alarm systems, because the power consumption will be much higher than using wireless PIR detectors. 2- Not recommended for pacemaker owners. Acoustic glass break detectors: 1- Should be avoided installing acoustic glass break detectors in kitchens, or near audio speakers. 2- The detector's microphone should be pointed towards the direction of all windows, never more than 9 meters away (straight line). 3- A perimeter acoustic sensor should be installed also with another interior intrusion detector. Magnetic contact detectors: 1- Magnetic contact detectors should be hidden as much as possible, and using embedded magnetic switches. 2- The magnet must be attached always to the moving part of a door or window. 3- The distance between the magnet and the detector must be within the manufacturer limits (usually less than 10mm).
600
Advanced Technologies
4- In magnetic contact detectors only two-conductor alarm cables should be used. 5- Where possible, mount the body of a magnetic detector close to the top of the nonmoving frame of a door or window. 6- Always align correctly the magnet with the existing marc on the detector. 7- Magnetic contact detectors must not be fixed near metallic, magnetic structures, or high voltage cables, and nor near the floor. Vibration/Chock detectors: 1- Should be used models that bring also magnetic contacts, for extra double protection on opening doors and windows. Wireless detectors: 1- Do not install wireless magnetic detectors near electrical engines and other electronic equipment that could generate radiofrequency interferences, like automatic curtains or garage doors. For the intruder alarm control panel: 1- First it is necessary to study in detail the location of the areas to protect, and to choose the right detectors and the alarm system to be installed. 2- The chosen intruder alarm control panel has to achieve the best cost/liability/time relation, considering also the esthetic outcome of the hardwired, wireless or hybrid solution. 3- The right protected zones must guarantee a minimal number of loops and the total number of sensors needed, assuming a PIR/PET/Dual Technology interior motion detector solution per protected zone, along with the use of additional perimeter detectors for the outside, and for windows and doors. 4- Install the alarm control panel in hidden places (inside a closet for example), to avoid and prevent burglar vandal acts. 5- If only one keypad is used, it should be placed near the frequently used door. 6- Only keypads should be visible, preferentially in frequent passage areas. 7- An additional Panic distress code should always be programmed into the intruder alarm control panel, for letting know (silently) that someone might be in a dangerous situation. Table 1. Basic intruder alarm project rules A perimeter arming mode feature will allow for a homeowner to freely circulate at night, inside his home, yet still being the intruder alarm system ready and armed, and continuously monitoring all house perimeter sensors (such as doors and windows detectors), only disregarding the interior motion detectors. A remote control panel arming/disarming operation can be made in many intruder alarm models, using a single specific current loop (or zone) previously programmed for that purpose. This will allow to remotely arm and disarm an intruder alarm system with a PDA or a mobile phone. Automatic arming/disarming uses the alarm control panel internal real-time-clock (and calendar). This programmable mode could be very useful for protecting places where predefined working timetables exist (at stores, or at industry for example). Door chime mode (Antunes, 2007b) is also often used in stores, allowing a quick audible sound, for example, each time a client enters inside a store.
Intruder Alarm Systems - The Road Ahead
601
3. Web-Based Intruder Alarm Systems 3.1 Monitoring and Control IP modules allow the communication between an intruder alarm system and a remote monitoring station. Some connect directly to the existing control panel main telephone lines, transmitting in ContactID or SIA protocols to an IP receiver station. A very low cost IP module can be made by using an embedded standard micro web-server, with built-in IP connectivity, that is able to send and receive digital commands via TCP/IP (Adão et al., 2008a). This could be a nice low cost solution for the existing alarm and notification home automation models, and without the need to have a main server PC running, or special software (like other expensive available solutions). Although care should be taken to assure a full SSL (Secure Sockets Layer) communication.
Fig. 5. Alarm circuit for low cost web-control and monitoring The BELL and STROBE outputs may usually need an external pull-up resistor to work, depending on the control panel and the sounder model. Zones may also be programmed to be used as key switch arming/disarming. This will allow to remotely arm or disarm an intruder alarm control panel. Note that a key switch webcontrol module needs to be connected to the additional loop zone (n+1) that is programmed as maintained key switch arm, as shown below:
Fig. 6. Dedicated EOLR n+1th maintained key switch arm zone, with the web-control module
602
Advanced Technologies
So, in the restored state the alarm control panel is usually disarmed. The switching of this n+1th zone to the alarm state may turn a control panel to arm. Nevertheless this operation has to be confirmed, because IP communications sometimes might go down. That is why an available output line (PGM1) may also be programmed, when possible, to be able to confirm an armed condition. A second control panel output line (PGM2), usually provided, can be very useful to monitor a ready-to-arm status. This could provide, for example, to remotely check a detector malfunction or a false trigger condition, thus preventing a remote arming command. A low cost micro web-server (Adão et al., 2008b) usually allows interaction with .CGI web pages, making possible the user to call pre-programmed functions within the web-server from an internet browser, for reading inputs and set or reset output values. A CGI html and JavaScript versatile web page can be easily build, even by an homeowner, to upgrade an intruder alarm system with a web page interface, that is directly downloaded into a non-volatile chip, inside the micro web-server. 3.2 New Developments for Distributed Web-Based Intruder Alarm Systems New intruder alarm systems based on web monitoring and control are now starting to grow into complete distributed nets, not only with traditional signaling functions but also with new "smart" decision functions. Midllware computational systems rely, with more than a decade, on a grid computing scalable distributed nets approach (Grilo & Figueiredo, 2008), with dynamic capacity, and can be a starting reference on the development of new distributed web-based intruder alarm systems. Web based intruder alarm systems can exist in distributed nodes ("Grid topology", as shown in next figure), having each node the ability to dynamically change its functions:
Fig. 7. Intruder alarm systems with grid computing From this approach the "intelligence" capability of all the distributed net will be applied to react in a much more efficient way to the different alarm situations that can occur, being the
Intruder Alarm Systems - The Road Ahead
603
alarm systems able to distinguish more accurately real security violations from, for example, unadvertedly operations, increasing security levels and also the immunity to technical failures. The development of web-based systems from the evolution of the new APIs, and from the new languages based on the web 2.0 generation are also bringing improvements for intruder alarm systems. 3.2.1 Integration Using the IP protocol for the net support and Html as the language for the web-servers, it is created the possibility to support several hardware technologies on the same physical net implementation, and also, no less important, created the capability to use and connect to the network systems from different suppliers and technologies, which will give an huge flexibility, not only with the net topology but also at the built grids (individual alarm systems), improving the adaptation capabilities. 3.2.2 Systems Redundancy Being necessary to certain applications and environments to have a redundancy systems strategy, the topology of the distributed network ("grid") can implement this functionality with efficiency, because with all intruder alarm systems connected we can simply choose those for the same security and surveillance area. The systems given for the same area can have then a privileged communication. We might considerer use redundancy in different ways, as for example: • One of the systems is active for security violations and the other is active, in backup function, in case of failure of the first, turning all system more robust. • The two systems are active and ready. In case of a security violation detected, a mutual approach giving the response of each system is made, as shown below:
Fig. 8. Redundancy example Theses capabilities could be programmed in a static, or if necessary, in a more dynamic way, reacting to pre-defined situations comprising all security strategy.
604
Advanced Technologies
3.2.3 Connectivity Redundancy IP protocol can be supported by different hardware technologies. These add an important advantage because it will allow several levels of redundancy between the intruder alarm systems that made the "grid". In the lowest redundancy level each system can have two different types of connections. So, there are always two different ways for the system to communicate through the net. Redundancy gives in practice, to the system, two valuable functions: 1. In case of a connection failure due to a technical problem, the system stays connected to the net through another connection(s) according with the redundancy level used, and then it can monitor the failure; 2. In case of a failure due to a security violation produced on purpose, the system stays connected to the net as explained already, but it loses its security capabilities, monitoring the security violation. The first example (in next figure), shows the intruder alarm systems connected through the network. Each system has a first level of redundancy, with two different available connections: 1. A connection by a dedicated cable structure; 2. A wireless GPRS connection.
Fig. 9. Intruder alarm systems connected through the network (first configuration)
Intruder Alarm Systems - The Road Ahead
605
In this case, the alarm systems have fixed positions, and so the second wireless connection do not serve for mobility but rather for the ability that the system could maintain all its functions working, even when isolated by a security violation. In the second example (figure 10) is also implemented a first redundancy level, with two connections to the net available: 1. A connection through ADSL supported on a conventional phone line cable, and without using a dedicated infrastructure; 2. Other connection (wireless) supported by GPRS.
Fig. 10. Intruder alarm systems connected through the network (second configuration) For this second configuration, using non dedicated structures, the implementation costs can be reduced, giving also more flexibility to the net topology: • In the number of network nodes; • Also concerning systems localization. because the two used structures (ADSL with phone cable, and GPRS) are usually available at the urban (and other) environments.
606
Advanced Technologies
3.2.4 Distributed Control Intruder alarm distributed decentralized networks are indeed a new paradigm for the supervision and control of the security systems. Each system (working as an individual node) has its own control, but also has the ability to monitor the state of the other alarm systems on the complete network. This way one can implement a global security alarm strategy, allowing the network to detect a certain failure, either by a technical failure or a security violation of one of the individual intruder alarm systems. On the other hand, it will be also possible, if necessary, to develop an high level network control, which can be done by any of the intruder alarm systems (node) of the network. All these capabilities make alarm systems more robust, not only in terms of security but also in terms of supervision flexibility. Security definitions can be adapted to each intruder alarm control system, for an individual security policy. These definitions can also be dynamically changed, following programming, or in reaction to possible contingence scenarios. Being the network supported by a structure with Internet access, all monitoring and control, and programmer settings are reachable form any part of the globe (by using several internet access equipments, with different access privileges). 3.2.5 Web 2.0 Integration Intruder alarm network systems work on a web-server basis, and new developments can now be achieved using web 2.0 integrated technologies. An interesting intruder alarm systems application can de made by adopting integrated geolocalization (GPS). Figure 11 shows a geo-localization scenario. It is presented a possible application using a web 2.0 with Google Maps API., allowing the use of internet support maps:
Fig. 11. A geo-localization scenario with two different security levels
Intruder Alarm Systems - The Road Ahead
607
Fig. 12. Geo-localization scenario after a change of security levels Two distinct security areas are defined (internal and external rectangle). Each intruder alarm system is one of the vertices (red marks) of the correspondent rectangle which is associated to one of the security areas. Each type an intruder violation occurs, a special contingency state is automatically implemented, whereas specific actions are taken, with security levels dynamically changed for each protected area, as shown in figure 12. A developed html code with JavaScript is presented in figure 13. In this sample code a security strategy was simplified for three different possible scenarios, in which, depending on the origin of the security violation, are signalized the edges of the two security areas, from a green level to the red level (the most severe), passing through a yellow level condition. Without any intruder violation occurrence, the interior zone is at green alert and the exterior zone is at the yellow level (as shown in figure 11). If a violation is detected by any intruder alarm system located at the exterior area, the interior area will change to a yellow condition, and the exterior area to the red condition, as shown in figure 12. In case of an intruder violation detected in the interior security area, both areas will immediately change to the red level condition.
608
Fig. 13. Sample code for web 2.0
Advanced Technologies
Intruder Alarm Systems - The Road Ahead
609
4. Conclusion Intruder Security is a billion dollar Industry, and a growing world wide market. The mobile and IP communications have developed and completely changed the way we now use, monitor and control intruder security systems. The development of web-based intruder alarm systems from the evolution of the new APIs, and from the new languages based on the web 2.0 philosophy are now giving better ways and tools to design intruder alarm systems and to increase security - the road ahead to follow.
5. References Antunes, R. (2007). Intruder Alarm Systems: The State of the Art, Proceedings of the 2nd International Conference on Electrical Engineering (CEE'07), pp. 252-261, ISBN:978972-99064-4-2, Portugal, November 2007, Coimbra Antunes, R. (2007). Manual do Curso de Formação de Projecto e Instalação de Sistemas de Alarme, ESTSetúbal Adão, H.; Antunes, R.; Grilo, F. (2008). Web-Based Control & Notification for Home Automation Alarm Systems, Proceedings of the XXVII World Academy of Science, Engineering and Technology Conference (ICAR’08), International Conference on Automation and Robotics, pp. 152-156, ISSN:1307-6884, Egypt, February 2008, WASET, Cairo Adão, H.; Antunes, R.; Grilo, F. (2008). Web-Based Control & Notification for Home Automation Alarm Systems, International Journal of Electronics, Circuits and Systems, Vol.2, No.1, 2008, pp. 20-24, ISSN:2070-3988 Grilo, F.; Figueiredo, J. (2008). An Industrial Vision System for Quality Control Based on a Distributed Strategy, Proceedings of the 8th Portuguese Conference on Automatic Control CONTROLO2008, pp. 426-431, ISBN: 978-972-669-877-7, Portugal, July 2008, Vila Real Google Maps API Developer's Guide in http://code.google.com/intl/pt-PT/apis/maps/documentation/index.html Google Maps API Reference in http://code.google.com/intl/pt-PT/apis/maps/documentation/reference.html
610
Advanced Technologies
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
611
32 X Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach Wudhichai Assawinchaichote1, Sing Kiong Nguang2 and Non-members 1Department
of Electronic and Telecommunication Engineering King Mongkut’s University of Technology Thonburi Thailand 2Department of Electrical and Electronic Engineering The University of Auckland, New Zealand
1. Introduction Markovian jump systems consists of two components; i.e., the state (differential equation) and the mode (Markov process). The Markovian jump system changes abruptly from one mode to another mode caused by some phenomenon such as environmental disturbances, changing subsystem interconnections and fast variations in the operating point of the system plant, etc. The switching between modes is governed by a Markov process with the discrete and finite state space. Over the past few decades, the Markovian jump systems have been extensively studied by many researchers; see (Kushner, 1967; Dynkin, 1965; Wonham, 1968; Feng et al., 1992; Souza & Fragoso, 1993; Boukas & Liu, 2001; Boukas & Yang, 1999; Rami & Ghaoui, 1995; Shi & Boukas, 1997; Benjelloun et al., 1997; Boukas et al., 2001; Dragan et al., 1999). This is due to the fact that jumping systems have been a subject of the great practical importance. For the past three decades, singularly perturbed systems have been intensively studied by many researchers; see (Dragan et al., 1999; Pan & Basar, 1993; Pan & Basar, 1994; Fridman, 2001; Shi & Dragan, 1999; Kokotovic et al., 1986). Multiple time-scale dynamic systems also known as singularly perturbed systems normally occur due to the presence of small “parasitic” parameters, typically small time constants, masses, etc. In state space, such systems are commonly modelled using the mathematical framework of singular perturbations, with a small parameter, say ε, determining the degree of separation between the “slow” and “fast” modes of the system. However, it is necessary to note that it is possible to solve the singularly perturbed systems without separating between slow and fast mode subsystems. But the requirement is that the “parasitic” parameters must be large enough. In the case of having very small “parasitic” parameters which normally occur in the description of various physical phenomena, a popular approach adopted to handle these systems is based on the so-called reduction technique. According to this technique the fast variables are replaced by their steady states obtained
612
Advanced Technologies
with “frozen” slow variables and controls, and the slow dynamics is approximated by the corresponding reduced order system. This time-scale is asymptotic, that is, exact in the limit, as the ratio of the speeds of the slow versus the fast dynamics tends to zero. In the last few years, the research on singularly perturbed systems in the H∞ sense has been highly recognized in control area due to the great practical importance. H∞ optimal control of singularly perturbed linear systems under either perfect state measurements or imperfect state measurements has been investigated via differential game theoretic approach. Although many researchers have studied the H∞ control design of linear singularly perturbed systems for many years, the H∞ control design of nonlinear singularly perturbed systems remains as an open research area. This is due to, in general, nonlinear singularly perturbed systems can not be decomposed into slow and fast subsystems. Recently, a great amount of effort has been made on the design of fuzzy H∞ for a class of nonlinear systems which can be represented by a Takagi-Sugeno (TS) fuzzy model; see (Nguang & Shi, 2001; Han & Feng, 1998; Chen et al., 2000; Tanaka et al., 1996). Recent studies (Nguang & Shi, 2001; Han & Feng, 1998; Chen et al., 2000; Tanaka et al., 1996; Wang et al., 1996) show that a fuzzy model can be used to approximate global behaviors of a highly complex nonlinear system. In this fuzzy model, local dynamics in different state space regions are represented by local linear systems. The overall model of the system is obtained by “blending” of these linear models through nonlinear fuzzy membership functions. Unlike conventional modelling which uses a single model to describe the global behavior of a system, fuzzy modelling is essentially a multi-model approach in which simple submodels (linear models) are combined to describe the global behavior of the system. Employing the existing fuzzy results (Nguang & Shi, 2001; Han & Feng, 1998; Chen et al., 2000; Tanaka et al., 1996; Wang et al., 1996) on the singularly perturbed system, one ends up with a family of ill-conditioned linear matrix inequalities resulting from the interaction of slow and fast dynamic modes. In general, ill-conditioned linear matrix inequalities are very difficult to solve. What we intend to do in this chapter is to design a robust H∞ fuzzy controller for a class of uncertain nonlinear two time-scale dynamic systems with Markovian jumps. First, we approximate this class of uncertain nonlinear two time-scale dynamic systems with Markovian jumps by a Takagi-Sugeno fuzzy model with Markovian jumps. Then based on an LMI approach, we develop a technique for designing a robust H∞ fuzzy controller such that the L2-gain of the mapping from the exogenous input noise to the regulated output is less than a prescribed value. To alleviate the ill-conditioned linear matrix inequalities resulting from the interaction of slow and fast dynamic modes, these ill-conditioned LMIs are decomposed into ε-independent LMIs and ε-dependent LMIs. The ε-independent LMIs are not ill-conditioned and the ε-dependent LMIs tend to zero when ε approaches to zero. If ε is sufficiently small, the original ill-conditioned LMIs are solvable if and only if the εindependent LMIs are solvable. The proposed approach does not involve the separation of states into slow and fast ones, and it can be applied not only to standard, but also to nonstandard singularly perturbed systems. This chapter is organized as follows. 
In Section 2, system descriptions and definition are presented. In Section 3, based on an LMI approach, we develop a technique for designing a robust H∞ fuzzy controller such that the L2-gain of the mapping from the exogenous input noise to the regulated output is less than a prescribed value for the system described in
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
613
Section 2. The validity of this approach is demonstrated by an example from a literature in Section 4. Finally, conclusions are given in Section 5.
2. System Descriptions and Definitions The class of uncertain nonlinear two time-scale dynamic systems with Markovian jumps under consideration is described by the following TS fuzzy model with Markovian jumps:
Eε x (t )=∑ir=1 µ υ t Ai η t + ∆Ai η t x t i + B η t + ∆B η t ω t +
1i
1i
B2i η t + ∆B2i η t u t , x(0)=0,
z (t )=∑ir=1 µ υ t C1i η t + ∆C1i η t x t i + D12 η t + ∆D12 η t u t i i
(1)
I 0 where Eε = , ε > 0 is the singular perturbation parameter, υ (t) = [υ 1 (t) υϑ (t)] is the 0 εI premise variable that may depend on states in many cases, µ i (υ (t))denote the normalized time varying fuzzy weighting functions for each rule, ϑ is the number of fuzzy sets, x(t) ∈ ℜ n is the state vector u(t) ∈ ℜ m is the input, w(t) ∈ ℜ p is the disturbance which
belongs to L2[0,∞), z(t) ∈ ℜ s is the controlled output, the matrix functions A i (η(t)), B 1 i (η(t)),
B 2 i (η(t)), C 1 i (η(t)), D 12 i (η(t)), ∆A i (η(t)), ∆B 1 i (η(t)), ∆B 2 i (η(t)), ∆C 1 i (η(t)), and ∆D 12 i (η(t)),
are of appropriate dimensions.
{(η(t))} is a continuous-time discrete-state Markov process
taking values in a finite set S = {1,2,···, s} with transition probability matrix P r ∆ {Pιk (t)} given by
(
Pιk (t) = P r η(t + ∆) = k η(t) = ι
)
λ ∆ + O(∆) if ι ≠ k = ιk 1 + λιι ∆ + O(∆) if ι = k where Δ > 0, and lim∆ →0
O( ∆ ) ∆
= 0.
(2)
Here λιk ≥ 0 is the transition rate from mode ι (system
operating mode) to mode k (ι ≠ k), and
λιι = −
s
∑ λιk⋅
(3)
k= 1,k≠ι
For the convenience of notations, we let
µ i ∆ µ i (υ (t)),η = η(t), and any matrix
M (µ ,ι ) ∆ M (µ ,η = ι ). The matrix functions ∆A i (η), ∆B 1 i (η), ∆B 2 i (η), ∆C 1 i (η), and ∆D 12 i (η),
614
Advanced Technologies
re-present the time-varying uncertainties in the system and satisfy the following assumption. Assumption 1: ∆A i (η) = F(x(t),η,t)H 1 i (η),
∆B 1 i (η) = F(x(t),η,t)H 2 i (η),
∆B 2 i (η) = F(x(t),η,t)H 3 i (η),
∆C 1 i (η) = F(x(t),η,t)H 4 i (η),
and ∆D 12 i (η) = F(x(t),η,t)H 5 i (η), where H j i (η), j = 1,2,,5 are known matrices which characterize the structure of the uncertainties. Furthermore, there exists a positive function ρ(η) such that the following inequality holds:
F(x(t),η,t) ≤ ρ (η)⋅
(4)
We recall the following definition. Definition 1: Suppose γ is a given positive number. A system of the form (1) is said to have the L2-gain less than or equal to γ if
{
}
Ε ∫ z T (t )z (t ) − γ 2 wT (t )w(t ) dt ≤ 0,x(0 ) = 0 0 Tf
(5)
where Ε[] ⋅ stands for the mathematical expectation, for all Tf and all w(t) ∈ L2 [0, Tf]. Note that for the symmetric block matrices, we use (∗) as an ellipsis for terms that are induced by symmetry.
3. Robust H∞ Fuzzy Control Design This section provides the LMI-based solutions to the problem of designing a robust H∞ fuzzy controller that guarantees the L2-gain of the mapping from the exogenous input noise to the regulated output to be less than some prescribed value. First, we consider the following H∞ fuzzy state feedback which is inferred as the weighted average of the local models of the form: r
u(t ) = ∑ µ j K j (ι )x (t ) ⋅ j=1
Then, we describe the problem under our study as follows.
(6)
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
615
Problem Formulation: Given a prescribed H∞ performance γ > 0, design a robust H∞ fuzzy state-feedback controller of the form (6) such that the inequality (5) holds. Before presenting our first main result, the following lemma is needed. Lemma 1: Consider the system (1). Given a prescribed H∞ performance γ > 0, for ι = 1, 2,···, s,
if there exist matrices Pε (ι ) = PεT (ι ), positive constants δ(ι) and matrices Yj(ι), j = 1, 2, · · · , r such that the following ε-dependent linear matrix inequalities hold: Pε (ι ) > 0 ψ ii (ι ,ε ) < 0, i = 1,2,,r ψ ij (ι,ε ) + ψ ji (ι ,ε ) < 0, i < j ≤ r
where
(7) (8) (9)
(*)T (*)T (*)T Φ ij (ι,ε ) R(ι )B T (ι ) −γR(ι ) (*)T (*)T 1i Ψij (ι,ε ) = T ϕ ij (ι,ε ) 0 −γR(ι ) (*) T 0 0 −P(ι ,ε ) Z (ι ,ε )
(10)
Φ ij (ι , ε ) = Ai (ι )Eε−1 Pε (ι ) + Eε−1 Pε (ι )AiT (ι ) + B2i (ι )Y j (ι ) + Y jT (ι )B2Ti (ι ) + λιι Eε−1 Pε (ι ), ~ ~ ϕ ij (ι , ε ) = C1i (ι )Eε−1 Pε (ι ) + D12i (ι )Y j (ι ), R (ι ) = diag {δ (ι )I , I , δ (ι )I , I }, Z(ι , ε ) = λι1 Eε−1 Pε (ι ) λι (ι −1) Eε−1 Pε (ι ) λι (ι +1) Eε−1 Pε (ι ) λιs Eε−1 Pε (ι ) , P(ι , ε ) = diag Eε−1 Pε (1),, Eε−1 Pε (ι − 1), Eε−1 Pε (ι + 1),, Eε−1 Pε (s ), ,
(
with
{
)
}
[
]
~ B1i (ι ) = I I I B1i (ι ) ~ C1i (ι ) = γρ (ι )H1Ti (ι ) 2ℵ(ι )ρ (ι )H 4Ti (ι ) 0 ~ 2ℵ(ι )ρ (ι )H 5Ti (ι ) γρ (ι )H 3Ti (ι ) D12 i (ι ) = 0
[ [
[
r r ℵ(ι ) = I + ρ 2 (ι )∑∑ H 2Ti (ι )H 2 j (ι ) i =1 j =1
]
]
2ℵ(ι )C1Ti (ι )
(11) (12) (13) (14) (15)
T
]
2ℵ(ι )D12T i (ι )
T
1
2
then the inequality (5) holds. Furthermore, a suitable choice of the fuzzy controller is
u(t) =
r
∑ µ j Kε j= 1
j
(ι)x(t)
(16)
616
Advanced Technologies
where
−1
Kε j (ι ) = Y j (ι )(Pε (ι )) Eε ⋅
(17)
Proof: The closed-loop state space form of the fuzzy system model (1) with the controller (16) is given by r
([
r
]
Eε x (t ) = ∑∑ µi µ j Ai (ι ) + B2 i (ι )Kεj (ι ) x(t ) i =1 j =1
[ ] + [B (ι ) + ∆B (ι )]w(t )),x(0) = 0,
(18)
+ ∆Ai (ι ) + ∆B2 i (ι )Kεj (ι ) x(t ) 1i
1i
or in a more compact form
([
)
]
r r ~ ~ (t ) , x(0 ) = 0 Eε x (t ) = ∑i =1 ∑ j =1 µ i µ j Ai (ι ) + B2i (ι )K εj (ι ) x(t ) + B1i (ι )R(ι )w
where
[
~ B1i (ι ) = I
I
I
]
B1i (ι )
(19)
(20)
F ( x(t ),ι ,t )H 1i (ι )x(t ) ~ (t ) = R -1 (ι ) F ( x(t ),ι ,t )H 2i (ι )w(t ) w F ( x(t ),ι ,t )H 3i (ι )K εj (ι )x(t ) w(t )
(21)
Consider a Lyapunov functional candidate as follows:
V (x(t),ι ,ε )= γx T (t)Qε (ι )x(t), ∀ι ∈ S
(22)
Note that Qε(ι) is constant for eachι. For this choice, we have V(0, ι0, ε) = 0 and V(x(t),ι,ε)→∞ only when ||x(t)|| → ∞. ˜ of the joint process {(x(t), ı, ε), t ≥ 0}, Now let consider the weak infinitesimal operator ∆ which is the stochastic analog of the deterministic derivative. {(x(t), ı, ε), t ≥ 0} is a Markov process with infinitesimal operator given by (Souza & Fragoso, 1993), s ~ ∆V ( x(t ),ι , ε ) = γ x T (t )Qε (ι )x(t ) + γ xT (t )Qε (ι ) x (t ) + γ xT (t )∑ λιk Qε (k )x(t ) k =1
r
(
r
[(
)]
= ∑∑ µi µ j γ xT (t )Qε (ι ) Ai (ι ) + B2 i (ι )Kεj (ι ) x(t ) i =1 j =1
[
]
+ γ xT (t ) Ai (ι ) + B2 i (ι )Kεj (ι ) Qε (ι )x(t ) ~ ~ (t ) + γ xT (t )Qε (ι )B1i (ι )R (ι )w ~T ~T (t )R (ι )B + γw 1i (ι )Qε (ι )x (t ) T
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach s
k= 1
617
∑ λιkQε (k)x(t) ⋅
+γ x T (t)
Adding and subtracting −ℵ2 (ι)z T (t)z(t) + γ 2 w˜ (t)] to and from (23), we get
(23)
∑i= 1 ∑ j= 1 ∑m= 1 ∑n= 1 µ i µ j µ m µ n [w˜ T (t)R(ι) r
r
r
r
[
r r r r ~ ~T (t )R (ι )w ~ (t )] ∆V ( x(t ),ι , ε ) = −ℵ2 (ι )z T (t )z (t ) + γ 2 ∑∑∑∑ µi µ j µ m µ n w i =1 j =1 m =1 n =1
r r r r x(t ) + ℵ2 (ι )z T (t )z (t ) + γ 2 ∑∑∑∑ µi µ j µ m µ n ~ × i =1 j =1 m =1 n =1 w(t ) T Ai (ι ) + B2 (ι )Kεj (ι ) Qε (ι ) i T ( ) * s x(t )⋅ (24 ) ( ) ( ) ( ) ( ) ( ) Q A B K Q k ι ι ι ι λ + + ~ (t ) ∑k =1 ιk ε 2i εj i w ε ~T R (ι )B1i (ι )Qε (ι ) − γR (ι ) T
[
]
[
]
Now let us consider the following terms: r
r
r
r
∑ µ i µ j µ m µ n [w˜ T (t)R(ι)w˜ (t)]
γ 2∑ ∑ ∑
i= 1 j= 1m= 1 n= 1
≤
ρ 2 (ι )γ 2 r r r ∑∑ ∑ δ (ι )
r
∑ µ i µ j µ m µ n x T (t)×
i= 1 j= 1m= 1 n= 1
{
H T1 i
(ι)H 1 m (ι) + KεTj (ι)H T3 i (ι)H 3 m (ι)Kεn (ι)}x(t) +ℵ2 (ι)γ 2 w T (t)w(t)
and r
r
r
{[
r
(25)
]
ℵ2 (ι )z T (t )z (t ) ≤ 2ℵ2 (ι )∑∑∑∑ µi µ j µ m µ n xT (t ) C1i (ι ) + D12 i (ι )Kεj (ι ) × i =1 j =1 m =1 n =1
T
T 2 1 (ι ) + D12 (ι )K εn (ι )] + ρ (ι )[H 4 (ι ) + H 5 (ι )K εj (ι )] × 4 (ι ) + H 5 (ι )K εn (ι )]}x (t )
[C [H where ℵ(ι ) = 1 + ρ 2 (ι )
r
∑i= 1 ∑
m
m
m
i
m
1
T 2 H (ι)H 2 j (ι) ⋅ Hence, j= 1 2 i
r
i
(26)
618
Advanced Technologies r
r
r
r
∑ µ i µ j µ m µ n [w˜ T (t)R(ι)w˜ (t)]+ℵ2 (ι)z T (t)z(t)
γ 2∑ ∑ ∑
i= 1 j= 1m= 1 n= 1
r
≤
r
r
r
∑ ∑ ∑ ∑ µ i µ j µ m µ n x T (t)[C˜ 1 (ι) + D˜ 12 (ι)Kεj (ι)] R -1 (ι)× T
i
i= 1 j= 1m= 1 n= 1
[C˜ 1
m
i
(ι) + D˜ 12 m (ι)Kεn (ι)]x(t) +ℵ2 (ι)γ 2 w T (t)w(t)
(27)
where
[ (ι) = [0
C˜ 1 i (ι ) = γρ (ι)H T1 (ι) D˜ 12 i
i
i
T 2ℵ(ι)D 12
2ℵ(ι )ρ (ι )H T5 (ι ) γρ (ι)H T3 (ι) i
i
] (ι)]
2ℵ(ι)C 1T (ι)
2ℵ(ι)ρ (ι)H T4 (ι) 0
i
T
T
i
Substituting (27) into (24), we have
∆˜ V (x(t),ι ,ε )≤ −ℵ2 (ι )z T (t)z(t) + γ 2ℵ2 (ι )w T (t)w(t) +γ 2
r
r
r
∑∑ ∑
x(t) x(t)T ⋅ Φ ijmn (ι,ε ) w˜ (t) w˜ (t)
r
∑ µ iµ j µ m µ n ×
i= 1 j= 1m= 1 n= 1
where
T A i (ι) + B 2 i (ι )Kεj (ι ) Qε (ι ) T +Q (ι) A (ι ) + B (ι)K (ι ) + 1 C˜ (ι ) + D˜ (ι )K (ι) × (*)T ε i 2i εj 12 i εj γ 1i Φ ijmn (ι ,ε ) = s R -1 (ι ) C˜ 1 (ι ) + D˜ 12 (ι )Kεn (ι ) + λιk Qε (k) m m k= 1 ˜ T (ι )Q (ι ) R ι − γ R ι B ( ) ( ) ε 1
(
)
[
] [
[
]
]∑
i
Using the fact r
r
r
r
r
r
∑ ∑ ∑ ∑ µ i µ j µ m µ n M Tij (ι)N mn (ι) ≤ 12 ∑ ∑ µ i µ j [M Tij (ι)M ij (ι) + N ij (ι)N Tij (ι)], i= 1 j= 1m= 1 n= 1
we can rewrite (28) as follows:
i= 1 j= 1
(28)
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
619
~ ∆V ( x(t ),ι , ε ) ≤ −ℵ2 (ι )z T (t )z (t ) + γ 2ℵ2 (ι )wT (t )w(t ) r r x(t ) x(t ) + γ 2 ∑∑ µ i µ j ~ Φ ij (ι , ε ) ~ ( ) w t i =1 j =1 w(t ) 2 2 2 T T = −ℵ (ι )z (t )z (t ) + γ ℵ (ι )w (t )w(t ) T
r x(t ) x(t ) + γ ∑ µ i2 ~ Φ ii (ι , ε ) ~ ( ) w t i =1 w(t ) T
r r x(t ) x(t ) + γ ∑∑ µ i µ j ~ (Φ ij (ι , ε ) + Φ ij (ι , ε )) ~ i =1 i < j w(t ) w(t ) T
(29)
where
T A i (ι ) + B 2 i (ι )Kεj (ι) Qε (ι ) +Q (ι ) A (ι) + B (ι )K (ι ) + 1 C˜ (ι ) + D˜ (ι )K (ι ) T × (*)T ε i 2i εj 12 i εj (30) γ 1i Φ ij (ι,ε ) = s R -1 (ι) C˜ 1 (ι ) + D˜ 12 (ι)Kεn (ι ) + λ Q k ( ) m m k= 1 ιk ε T R(ι )B˜ 1 (ι )Qε (ι ) −γR(ι )
(
)
(
) [
[
]
]∑
i
Pre and post multiplying (30) by
P (ι ) 0 Ξ(ι ) = ε , I 0 with Pε (ι ) = Qε−1(ι ) we obtain T T T Pε (ι )A i (ι ) + Y j (ι )B 2 i (ι ) + A i (ι)Pε (ι) + B 2 i (ι )Y j (ι ) 1 T ˜ ˜ + C 1 i (ι )Pε (ι ) + D 12 i (ι )Y j (ι ) × γ (*)T Ξ(ι )Φ ij (ι,ε )Ξ(ι ) = R -1 (ι) C˜ (ι )P (ι ) + D˜ 12 m (ι)Y j (ι) ε 1m s −1 + k= 1 λιk Pε (ι)Pε (k)Pε (ι ) T ˜ −γR(ι ) R(ι )B 1 (ι )
[
]
[
]
(31)
∑
i
Note that (31) is the Schur complement of Ψij(ι, ε) defined in (10). Using (8)-(9), we learn that
620
Advanced Technologies
Φ ii (ι ,ε ) < 0, i = 1,2, ,r
(32)
Φ (ι,ε ) + Φ ji (ι ,ε ) < 0, i < j ≤ r ⋅ ij
(33)
Following from (29), (32) and (33) , we know that ∆˜ V (x(t),ι ,ε )< −ℵ2 (ι )z T (t)z(t) + γ 2ℵ2 (ι )w T (t)w(t)⋅
Applying the operator E
Tf
∫0
E
Tf
∫0
(⋅)dt
(34)
on both sides of (34), we obtain
∆˜ V (x(t),ι ,ε )dt < E
∫ 0 (−ℵ2 (ι)z T (t)z(t) + γ 2ℵ2 (ι)w T (t)w(t))dt ⋅ Tf
(35)
From the Dynkin’s formula [2], it follows that E
Tf
∫0
[ ( ( ) ( ) )] [
∆˜ V (x(t),ι ,ε )dt = E V x T f ,ι T f ,ε − E V (x(0),ι (0) ,ε ) ⋅
]
(36)
Substitute (36) into (35) yields 0 < E
∫ 0 (−ℵ2 (ι)z T (t)z(t) + γ 2ℵ2 (ι)w T (t)w(t))dt Tf
[ ( ( ) ( ) )] [
]
−E V x T f ,ι T f ,ε + E V (x(0),ι(0) ,ε ) ⋅ Using (34) and the fact that we have E
∫ 0 {z T (t)z(t)− γ 2 w T (t)w(t)}dt < 0 ⋅ Tf
Hence, the inequality (5) holds. This completes the proof of Lemma 1.
(37)
Remark 1: The linear matrix inequalities given in Lemma 1 becomes ill-conditioned when ε is sufficiently small, which is always the case for the uncertain nonlinear two time-scale dynamic systems. In general, these ill-conditioned linear matrix inequalities are very difficult to solve. Thus, to alleviate these ill-conditioned linear matrix inequalities, we have the following theorem which does not depend on ε. Now we are in the position to present our first result. Theorem 1: Consider the system (1). Given a prescribed H∞ performance γ > 0, for ι = 1, 2, ··· , s, if there exist matrices P(ι), positive constants δ(ι) and matrices Yj(ι), j = 1, 2, ··· , r such that the following ε-independent linear matrix inequalities hold:
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
621
EP(ι ) + P(ι)D > 0
(38)
Ψii (ι ) < 0, i = 1,2,,r
(39)
Ψ (ι) + Ψ ji (ι ) < 0, i < j ≤ r ij
(40)
I 0 0 0 where EP(ι ) = P T (ι )E , P(ι )D = DP T (ι ), E = , D = , 0 0 0 I (*)T (*)T (*)T Φ ij (ι ) R(ι)B T (ι) −γR(ι ) (*)T (*)T 1i Ψij (ι) = T ϕ ij (ι ) 0 −γR(ι) (*) T 0 0 −P(ι) Z (ι )
(41)
~ Φ ij (ι ) = Ai (ι )P(ι ) + P T (ι )AiT (ι ) + B2i (ι )Y j (ι ) + Y jT (ι )B2Ti (ι ) + λιι P (ι ), ~ ~ ϕ ij (ι ) = C1i (ι )P(ι ) + D12i (ι )Y j (ι ),
(42) (43)
R (ι ) = diag {δ (ι )I , I , δ (ι )I , I }, ~ ~ ~ ~ Z(ι ) = λι1 P (ι ) λι (ι −1) P (ι ) λι (ι +1) P (ι ) λι1 P (ι ) , ~ ~ ~ ~ P(ι ) = diag P (1),, P (ι − 1), P (ι + 1),, P (s ) , ~ P(ι ) + P T (ι ) P (ι ) = , 2
(
{
(44)
)
}
(45) (46) (47)
with
[
B˜ 1 i (ι ) = I
[ (ι) = [0
I
]
B 1 i (ι )
I
C˜ 1 i (ι ) = γρ (ι)H T1 (ι) D˜ 12 i
i
2ℵ(ι)ρ (ι)H T4 (ι) 0 i
2ℵ(ι )ρ (ι )H T5 (ι ) γρ (ι)H T3 (ι) i
i
r r T ℵ(ι ) = I + ρ 2 (ι ) H 2 i (ι)H 2 j (ι) i= 1 j= 1
∑∑
] (ι)]
2ℵ(ι)C T1 (ι) i
2ℵ(ι)D T12
T
T
i
1 2
then there exists a sufficiently small εˆ > 0 such that the inequality (5) holds for ε ∈ (0,εˆ] . Furthermore, a suitable choice of the fuzzy controller is r
u(t ) = ∑ µ j K j (ι )x (t ) ⋅ where
j=1
(48)
622
Advanced Technologies −1
K j (ι ) = Y j (ι )(P(ι )) ⋅
(49)
Proof: Suppose there exists a matrix P(ι) such that the inequality (38) holds, then P(ι) is of the following form:
P (ι) 0 1 P(ι ) = T P2 (ι ) P3 (ι )
(50)
with P1 (ι ) = P1T (ι ) > 0 and P3 (ι ) = P3T (ι ) > 0 . Let
(
)
Pε (ι ) = Eε P(ι ) + εP˜ (ι)
(51)
0 P2 (ι ) P˜ (ι ) = ⋅ 0 0
(52)
with
Substituting (50) and (52) into (51), we have P (ι ) εP (ι ) 1 2 ⋅ Pε (ι ) = T ε P ι ε P ( ) 2 3 (ι )
(53)
Clearly, Pε (ι ) = PεT (ι ) , and there exists a sufficiently small εˆ such that for ε ∈ (0,εˆ],Pε (ι ) > 0. Using the matrix inversion lemma, we learn that
[
]
Pε−1 (ι) = P −1 (ι ) + εM ε (ι ) Eε−1
(
(54)
)
−1 where M ε (ι ) = −P −1 (ι)P˜ (ι) I + εP −1 (ι )P˜ (ι ) P −1 (ι ) . Substituting (51) and (54) into (10), we
obtain
Ψij (ι) + ψ ij (ι )
(55)
where the ε-independent linear matrix Ψij(ι) is defined in (41) and the ε-dependent linear matrix is
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
~ A (ι )P (ι ) + P~ T (ι )AiT (ι ) i B (ι )Y (ι ) Y T (ι )B T (ι ) + εj εj 2i + 2i ~ Ψij (ι ) = ε + λιι P (ι ) 0 ~ T ~ ~ C1i (ι )P (ι ) + D12i (ι )Yε j (ι ) ZTε (ι )
(
with
Zε (ι ) =
)
( λ Pˆ (ι ) λ
ι (ι −1)
ι1
Pˆ (ι )
T (*) (*)T (*)T − Pε (ι )
(*)T (*)T 0
(*)
0
0
0
0
T
~
623
)
λι (ι +1) Pˆ (ι ) λι1 P (ι ) ,
(56)
ˆ Pε (ι ) = diagP (1), ,
P˜ (ι ) + P˜ T (ι) ˆ ˆ ˆ ˆ and Yε j (ι ) = K j (ι )M ε−1 (ι ). Note that the εP (ι − 1), P (ι + 1), , P (s) , P (ι ) = 2
dependent linear matrix tends to zero when ε approaches zero. Employing (38)-(40) and knowing the fact that for any given negative definite matrix W, there exists an ε > 0 such that W + εI < 0, one can show that there exists a sufficiently small εˆ > 0 such that for ε ∈ (0,εˆ], (8) and (9) hold. Since (7)-(9) hold, using Lemma 1, the inequality (5) holds for ε ∈ (0,εˆ] .
4. Illustrative Example Consider a modified series dc motor model based on (Mehta & Chiasson, 1998) as shown in Figure 1 which is governed by the following difference equations:
J L
dω˜ (t ) dt di˜ (t ) dt
= K m L f i˜ 2 (t)− (D + ∆D)ω˜ (t) = −Ri˜(t)− K m L f i˜(t)ω˜ (t) + V˜ (t)
(57)
where ω˜ (t) = ω (t)− ω ref (t) is the deviation of the actual angular velocity from the desired angular velocity, i˜(t) = i(t)− i ref (t) is the deviation of the actual current from the desired current, V˜ (t) = V (t)− V ref (t) is the deviation of the actual input voltage from the desired input voltage, J is the moment of inertia, Km is the torque/back emf constant, D is the viscous friction coefficient, and R a , R f , L a and L f are the armature resistance, the field
winding resistance, the armature inductance and the field winding inductance, respectively, with R ∆ R f + R a and L ∆ L f + L a . Note that in a typical series-connected dc motor, the
condition L f >> L a holds. When one obtains a series connected dc motor, i(t) = i a (t) = i f (t) we have L f >> L a . Now let us assume that |ΔJ| ≤ 0.1J.
624
Advanced Technologies
Fig. 1. A modified series dc motor equivalent circuit. Giving x 1 (t) = ω˜ (t) , x 2 (t) = i˜(t) and u(t) = V˜ (t), (57) becomes
x1 (t ) − J +D∆J ε x (t ) = 2 − K m L f x2 (t )
x2 (t ) x1 (t ) 0 + u (t ) − R x2 (t ) 1
KmL f J + ∆J
(58)
where ε = L represents a small parasitic parameter. Assume that, the system is aggregated into 3 modes as shown in Table 1:
Table 1. System Terminology. The transition probability matrix that relates the three operation modes is given as follows:
0.67 0.17 0.16 Pιk = 0.30 0.47 0.23 ⋅ 0.26 0.10 0.64 The parameters for the system are given as R = 10 Ω, Lf = 0.005 H, D = 0.05 N·m/rad/s and Km = 1 N·m/A. Substituting the parameters into (58), we get
x2 (t ) x1 (t ) 0 0 0 + w(t ) + u (t ) − 10 x2 (t ) 0.1 0 1 0.005 x (t ) ∆J (ι ) x 2 (t ) 1 0 x2 (t )
x1 (t ) − 0J.(05ι ) ε x (t ) = − 0.005 x (t ) 2 2
0.005 J (ι )
− 0.05 + ∆J (ι ) 0 1 0 x1 (t ) 0 z (t ) = + u (t ) 0 1 x2 (t ) 1
(59)
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
[
[
]
where x(t) = x T1 (t) x T2 (t)
T
625
]
T
is the state variables, w(t) = w T1 (t) w T2 (t) is the disturbance
input, u(t) is the controlled input and z(t) is the controlled output. The control objective is to control the state variable x2(t) for the range x2(t) ∈ [N1 N2]. For the sake of simplicity, we will use as few rules as possible. Note that Figure 2 shows the plot of the membership function represented by
M 1 (x 2 (t))=
−x 2 (t) + N 2 N2 − N1
and M 2 (x 2 (t))=
x 2 (t) + N 2 N2 − N1
Knowing that x2(t) ∈ [N1 N2], the nonlinear system
Fig. 2. Membership functions for the two fuzzy set. (59) can be approximated by the following TS fuzzy model r
Eε x (t ) = ∑ µ [Ai (ι ) + ∆Ai (ι )]x(t ) + B ω (t ) + B2i u (t ), x(0) = 0, i 1i i =1 r
[
]
z (t ) = ∑ µ C1i x(t ) + D12i u (t ) , i i =1 where μi is the normalized time-varying fuzzy weighting functions for each rule, i = 1, 2, x(t)
1 0 x1 (t ) x (t ) , Eε = 0 ε ,∆A1 (ι ) = F ( x(t ),ι , t )H11 (ι ),∆A2 (ι ) = F ( x(t ),ι , t )H12 (ι ), 2
=
−100 A 1 (1) = −0.005N 1 −10 A 1 (2) = −0.005N 1
−100 10N 1 10N 2 , A 2 (1) = , −10 −0.005N 2 −10 −10 N1 N2 , A 2 (2) = , −10 −0.005N 2 −10
−1 −1 0.1N 1 0.1N 2 A 1 (3) = , A 2 (3) = , −0.005N −10 −0.005N −10 1 2 1 0 0 0 0 B 1 1 (ι ) = B 1 2 (ι ) = , ,B 2 1 (ι ) = B 2 2 (ι ) = ,C 1 1 (ι ) = C 1 2 (ι ) = 1 0.1 0 0 1 0 and D 12 1 (ι ) = D 12 2 (ι ) = , 1
626
Advanced Technologies
with ||F(x(t),ι,t)|| ≤ 1. Then we have
− 0.05 H11 (ι ) = J (ι ) 0
− 0.05 N1 and H 12 (ι ) = J (ι ) 0 0
0.05 J (ι )
N2 ⋅ 0
0.05 J (ι )
In this simulation, we select N1 = −3 and N2 = 3. Using the LMI optimization algorithm and Theorem1 with ε = 0.005, γ = 1 and δ(1) = δ(2) = δ(3) = 1, we obtain the results given in Figure 3, Figure 4 and Figure 5 Remark 2: Employing results given in (Nguang & Shi, 2001; Han & Feng, 1998; Chen et al., 2000; Tanaka et al., 1996; Wang et al., 1996) and Matlab LMI solver [28], it is easy to realize that when ε < 0.005 for the state-feedback control design, LMIs become ill-conditioned and Matlab LMI solver yields an error message, “Rank Deficient”. However, the state-feedback fuzzy controller proposed in this paper guarantee that the inequality (5) holds for the system (59). Figure 3 shows the result of the changing between modes during the simulation with the initial mode at mode 1 and ε = 0.005. The disturbance input signal, w(t), which was used during simulation is given in Figure 4. The ratio of the regulated output energy to the disturbance input noise energy obtained by using the H∞ fuzzy controller is depicted in Figure 5. The ratio of the regulated output energy to the disturbance input noise energy tends to a constant value which is about 0.0094. So γ = 0.0094 = 0.0970 which is less than the pres-cribed value 1. Finally, Table 2 shows the performance index, γ, for different values of ε.
Fig. 3. The result of the changing between modes during the simulation with the initial mode at mode
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
Fig. 4. The disturbance input, w(t).
Fig. 5. The ratio of the regulated output energy to the disturbance noise energy,
∫ T f z T (t )z(t )dt T0 ∫ f w T (t )w(t )dt with ε = 0.005. 0
627
628
Advanced Technologies
Table 2. The performance index γ for different values of ε.
5. Conclusion This chapter has investigated the problem of designing a robust H∞ controller for a class of uncertain nonlinear two time-scale dynamic systems with Markovian jumps that guarantees the L2- gain from an exogenous input to a regulated output to be less or equal to a prescribed value. First, we approximate this class of uncertain nonlinear two time-scale dynamic systems with Markovian jumps by a class of uncertain Takagi-Sugeno fuzzy models with Markovian jumps. Then, based on an LMI approach, sufficient conditions for the uncertain nonlinear two time-scale dynamic systems with Markovian jumps to have an H∞ performance are derived. The proposed approach does not involve the separation of states into slow and fast ones and it can be applied not only to standard, but also to nonstandard uncertain nonlinear two time-scale dynamic systems. An illustrative example is used to illustrate the effectiveness of the proposed design techniques.
6. References Kushner, H. J. (1967). Stochastic Stability and Control, Academic Press, New York. Dynkin, E. B. (1965). Markov Process, Springer-Verlag, Berlin. Wonham, W. M. (1968). On a matrix Riccati equation of stochastic control, SIAM J. Contr., vol. 6, page (681–697). Feng, X.; Loparo, K. A.; Ji, Y. & Chizeck, H. J.( 1992). Stochastic stability properties of jump linear system, IEEE Tran. Automat. Contr., vol. 37, page (38–53). Souza, C. E. & Fragoso, M. D. ( 1993). H∞ control for linear systems with Markovian jumping parameters, Control-Theory and Advanced Tech., vol. 9, page (457–466). Boukas, E. K. & Liu, Z. K. (2001). Suboptimal design of regulators for jump linear system with timemultiplied quadratic cost, IEEE Tran. Automat. Contr., vol. 46, page (944–949). Boukas, E. K. & Yang, H. (1999). Exponential stabilizability of stochastic systems with Markovian jump parameters, Automatica, vol. 35, page (1437–1441). Rami, M. A. & Ghaoui, L. Ei (1995). H∞ statefeedback control of jump linear systems, Proc.Conf. Decision and Contr., page (951–952), Shi, P. & Boukas, E. K. (1997). H∞ control for Markovian jumping linear system with parametric uncertainty, J. of Opt. Theory and App., vol. 95, page numbers (75–99),.
Robust H∞ Fuzzy Control Design for Nonlinear Two-Time Scale System with Markovian Jumps based on LMI Approach
629
Benjelloun, K.; Boukas, E. K. & Costa, O. L. V. (1997). H∞ control for linear time delay with Markovian jumping parameters, J. of Opt. Theory and App., vol. 105, page (73–95). Boukas, E. K.; Liu, Z. K. & Liu, G. X. (2001). Delay dependent robust stability and H∞ control of jump linear systems with time-delay,” Int. J. of Contr.. Dragan, V.; Shi, P. & Boukas, E. K. (1999). Control of singularly perturbed system with Markovian jump parameters: An H∞ approach, Automatica, vol. 35, page (1369–1378),. Pan, Z. & Basar, T. (1993). H∞–optimal control for singularly perturbed systems Part I: Perfect state measurements, Automatica, vol. 29, page (401–423). Pan, Z. & Basar, T. (1994). H∞–optimal control for singularly perturbed systems Part II: Imperfect state measurements, IEEE Trans. Automat. Contr., vol. 39, page (280–299). Fridman, E. ( 2001). State-feedback H∞ control of nonlinear singularly perturbed systems, Int. J. Robust Nonlinear Contr., vol. 6, page (25–45). Shi, P. & Dragan, V. (1999). Asymptotic H∞ control of singularly perturbed system with parametric uncertainties, IEEE Trans. Automat. Contr., vol. 44, page (1738– 1742). Kokotovic, P. V.; Khalil, H. K. & O’Reilly, J. (1986). Singular Perturbation Methods in Control: Analysis and Design, Academic Press, London. Nguang, S. K. & Shi, P. (2001). H∞ fuzzy output feedback control design for nonlinear systems: An LMI approach, Proc. IEEE Conf. Decision and Contr., page s (4352–4357). Han, Z. X. & Feng, G. (1998). State-feedback H∞ controller design of fuzzy dynamic system using LMI techniques, Fuzzy-IEEE’98, page (538–544). Chen, B. S.; Tseng, C. S. & He, Y. Y. (2000). Mixed H� /H∞ fuzzy output feedback control design for nonlinear dynamic systems: An LMI approach, IEEE Trans. Fuzzy Syst., vol. 8, page (249–265). Ikeda, K. T. & Wang, H. O. (1996). Robust stabilization of a class of uncertain nonlinear systems via fuzzy control: Quadratic stability, H∞ control theory, and linear matrix inequality, IEEE Trans. Fuzzy. Syst., vol. 4, page (1–13). Wang, H.; Tanaka, O. K. & Griffin, M. F. (1996). An approach to fuzzy control of nonlinear systems: Stability and design issues, IEEE Trans. Fuzzy Syst., vol. 4, no. 1, page (14–23). Farias, D. P.; Geromel, J. C.; Val, J. B. R. & Costa, O. L. V. (2000). Output feedback control of Markov jump linear systems in continuous-time, IEEE Trans. Automat. Contr., vol. 45, page (944–949). Nguang, S. K. & Shi, P. (2003). H∞ fuzzy output feedback control design for nonlinear systems: An LMI approach , IEEE Trans. Fuzzy Syst., vol. 11, page (331– 340). Nguang, S. K. (1996). Robust nonlinear H∞ output feedback control, IEEE Trans Automat. Contr.,vol. 41, page (1003–1008). Assawinchaichote, W. & Nguang, S. K. (2002). Fuzzy control design for singularly perturbed nonlinear systems: An LMI approach, ICAIET, pp. 146–151, Kota Kinabalu, Malaysia.
630
Advanced Technologies
Assawinchaichote, W. & Nguang, S. K. (2002). Fuzzy observer-based controller design for singularly perturbed nonlinear systems: An LMI approach, Proc. IEEE Conf. Decision and Contr., pp. 2165–2170, Las Vegas. Gahinet, P.; Nemirovski, A.; Laub, A.J. & Chilali, M. (1995). LMI Control Toolbox – For Use with MATLAB, The MathWorks, Inc., MA. Boyd, S.; Ghaoui, L. El; Feron, E.& Balakrishnan, V. ( 1994). Linear Matrix Inequalities in Systems and Control Theory, vol. 15, SIAM, Philadelphia. Mehta, S. & J. Chiasson, (1998). Nonlinear control of a series dc motor: Theory and experiment, IEEE Trans. Ind. Electron., vol. 45, page (134–141).
Data Compression and De-compression by Causal Filters with Variable Finite Memory
631
33 5 Data Compression and De-compression by Causal Filters with Variable Finite Memory A. Torokhti and S. Miklavcic University of South Australia Australia
1. Introduction A study of data compression methods is motivated by the necessity to reduce expenditures incurred with the transmission, processing and storage of large data arrays (Scharf., 1991; Hua et al., 2001). Such methods have also been applied successfully to the solution of problems related, e.g., to clustering (Fukunaga, 1990), feature selection (Jolliffe, 1986; Gimeno, 1987), forecasting (Kim et al., 2005; Common and Golub, 1990) and estimating the medium from transmission data (Kraut et al., 2004). Data compression techniques are often performed on the basis of the Karhunen-Lo`eve transform (KLT)1 , which is closely related to the Principal Component Analysis (PCA). A basic theory for the KLT-PCA can be found, for example, in (Scharf., 1991; Jolliffe, 1986; Torokhti and Howlett, 2007). In short, the KLT-PCA produces a linear operator of a given rank that minimizes an associated error over all linear operators of the same rank. In a so-called standard KLT-PCA (e.g., presented in (Jolliffe, 1986)), an observable signal and a reference signals are the same. In other words, the standard KLT-PCA provides data compression only and no noise filtering. Scharf (Scharf., 1991; Scharf, 1991) presented an extension of the PCA–KLT2 for the case when an observable signal y and a reference signal x are different and no explicit analytical representation of y in terms of x is known. In particular, y can be a noisy version of x. The method (Scharf., 1991; Scharf, 1991) assumes that covariance matrix Eyy , formed from y, is nonsingular. Yamashita and Ogawa (Yamashita and Ogawa, 1996) proposed and justified a version of the PCA–KLT for the case where the covariance matrix Eyy may be singular and y = x + w with w an additive noise. Hua and Liu (Hua and Liu, 1998) considered an extension of the PCA– KLT to the case when y and x are different as in (Scharf., 1991; Scharf, 1991), to guarantee its existence when Eyy is singular. Torokhti and Friedland (Torokhti and Friedland, 2009) studied a weighted version of the the PCA–KLT. Torokhti and Howlett (Torokhti and Howlett, 2006; 2009) extended the PCA–KLT to the case of optimal non-linear data compression. Torokhti and Manton (Torokhti and Manton, 2009) further advanced results in (Torokhti and Friedland, 2009; Torokhti and Howlett, 2006; 2009) to the so-called generic weighted filtering of stochastic signals. Advanced computational aspects of the PCA–KLT were provided, in particular, by Hua, Nikpour and Stoica (Hua et al., 2001), Hua and Nikpour (Hua and Nikpour, 1999), Stoica and Viberg (Stoica and Viberg, 1996), and Zhang and Golub (Zhang and Golub, 1 The
KLT is also known as Hotelling Transform and Eigenvector Transform. of references related to the PCA–KLT is very long. For example, a Google search for ‘KarhunenLo`eve transform and Principal Component Analysis’ gives 9230 items. Here, we mention only the most relevant references to the problem under consideration. 2 List
632
Advanced Technologies
2001). Other relevant references can be found, e.g. in the bibliographies of the books by Scharf (Scharf., 1991), and Torokhti and Howlett (Torokhti and Howlett, 2007). While the topics of data compression have been intensively studied (in particular, in the references mentioned above), a number of related fundamental questions remain open. One of them concerns real-time data processing. In this paper, the real-time aspect of the data compression problem is the dominant motivation for considering specific restrictions associated with causality and memory. Similar observations, but not in context of data compression, motivated studies of Winer-like filtering in works by Fomin and Ruzhansky (Fomin and Ruzhansky, 2000), Torokhti and Howlett (Torokhti and Howlett, 2006), and Howlett, Torokhti and Pearce (Howlett et al., 2007). We note that conditions of causality and memory make the problem very specific and difficult. To the best of our knowledge, such a problem is considered here for the first time. In Section 3.2, we provide a new approach to the problem solution and give an analysis of the associated error. In more detail, motivations to consider the problem are as follows. F IRST MOTIVATION : CAUSALITY AND MEMORY. Data compression techniques mainly consist of two operations, data compression itself and a subsequent data de-compression (or reconstruction). In real time, the compressor and de-compressor are causal and may be performed with a memory. A causality constraint follows from the observation that in practice, the present value of the output of a filter3 is not affected by future values of the input (De Santis, 1976). To determine the output signal at time tk with qk = 1, . . . , k, to be defined below, the causal filter should “remember” the input signal up to time tk , i.e. at times tk , tk−1 , . . . , t1 . A memory constraint is motivated as follows. The output of the compressor and/or decompressor at time tk with k = 0, 1, . . . , m, may only be determined from a ‘fragment’ of the input defined at times tk , tk−1 , . . . , tk−(qk −1) with qk = 1, . . . , k. In other words, compressor and de-compressor should ‘remember’ that fragment of the input. The ‘length’ of the fragment for a given k, i.e. the number qk , is called a local memory. The local memory could be different for different k, therefore, we also say that qk is a local variable memory. A collection of the local memories, {q1 , . . . , qm }, is called the variable finite memory or simply variable memory. A formalization of these concepts is given in Section 3.1 below. Matrices that form filter models with a variable memory possess special structure. Some related examples are given in Section 3.1. Thus, our first motivation to consider the problem in the form presented in Section 2.3 below comes from the observation that the compressor and de-compressor used in real time should be causal with variable finite memory. S ECOND MOTIVATION : SPECIFIC FORMULATION OF THE PROBLEM. In reality, the compression and de-compression are separated in time. Therefore, it is natural to pose optimization problems for them separately, one specific problem for each operation, compression and decompression. Associated optimization criteria could be formulated in many ways. Some of them are discussed in Section 6, and we show that those criteria lead to significant difficulties. To avoid the difficulties considered in Section 6, a new approach to the solution of the data compression problem is presented here. 
The approach is based on a specific formulation of two related problems given in Section 2 below. Solutions of those problems represent an associated optimal compressor and optimal de-compressor, respectively. It is shown in Section 3.1 that the optimal compressor and de-compressor satisfying conditions of causality and variable finite memory must have special forms. This implies that signals processed by these 3 Below,
when a context is clear, we use the term ‘filter’ for both compressor and de-compressor.
Data Compression and De-compression by Causal Filters with Variable Finite Memory
633
operators should be presented in special forms as well. In Sections 3.1 and 3.2 this issue is discussed in detail. Next, traditionally, the data compression problem is studied in terms of linear operators, mainly due to a simplicity of their implementation. See, for example, (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)–(Torokhti and Friedland, 2009) and references herein. Here, we extend the approaches of linear data compression proposed in (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)–(Torokhti and Friedland, 2009). A case of non-linear compression and de-compression with causality and memory is more complicated, and it can be studied on the basis of results obtained below combined, e.g., with the approaches to optimal non-linear filtering presented in (Torokhti and Howlett, 2007; 2006; 2009; Torokhti and Manton, 2009; Howlett et al., 2007).
2. Basic idea and statement of the problem 2.1 Informal statement of problem
In an informal way, the data compression problem we consider can be expressed as follows. Let (Ω, Σ, µ) be a probability space, where Ω = {ω } is the set of outcomes, Σ a σ–field of measurable subsets in Ω and µ : Σ → [0, 1] an associated probability measure on Σ with µ(Ω) = 1. Let y ∈ L2 (Ω, R n ) be observable data and x ∈ L2 (Ω, R m ) be a reference signal that is to be estimated from y in such a way that (i) first, data y should be compressed to a shorter vector z ∈ L2 (Ω, Rr )4 with r < min{m, n},
(ii) then z should be decompressed (reconstructed) to a signal x˜ ∈ L2 (Ω, R m ) so that x˜ is ‘close’ to x in some appropriate sense, and
(iii) both operations, compression and de-compression, should be causal and should have variable memory. The problem is to determine a compressor and de-compressor so that associated errors are minimal. A rigorous statement of the problem is given in Section 2.3 below. In an intuitive way, y can be regarded as a noise-corrupted version of x. For example, y can be interpreted as y = x + n where n is white noise. Thus, the above two operations, (i) and (ii), perform the noise filtering as well. We do not restrict ourselves to this simplest version of y and assume that the dependence of y on x and n is arbitrary. 2.2 Basic idea
Let B : L2 (Ω, R n ) → L2 (Ω, Rr ) signify compression so that z = B(y) and let A : L2 (Ω, Rr ) → L2 (Ω, R m ) designate data de-compression, i.e., x˜ = A(z). We suppose that B and A are linear operators defined by the relationships
[B(y)](ω ) = B[y(ω )] and [A(z)](ω ) = A[z(ω )] n ×r
r ×m
(1)
and A ∈ R . In the remainder of this paper we shall use the same symbol where B ∈ R to represent both the linear operator acting on a random vector and its associated matrix. We assume that information about vector z in the form of associated covariance matrices can be obtained, in particular, from the known solution (Torokhti and Howlett, 2007) of the data compression problem with no constraints associated with causality and memory. This allows us to state the problem, subject to conditions of causality and memory, in the form of two separate problems (2) and (3) formulated below. This is the basic idea of our approach. 4 Components
of z are often called principal components (Jolliffe, 1986).
634
Advanced Technologies
2.3 The problem formalization
S PECIAL NOTATION. We need to use some special notation to be defined in Section 3.1 below. The notation follows from six specific Definitions 1–6 of causal compressors and decompressors with different types of memories given in Section 3.1. To shorten the way forward for the problem formalization, we describe briefly here that notation and refer the reader to Section 3.1 for more detail. We write MC (r, n, ηB ) for the set of causal r × n compressors B with so-called complete variable finite memory ηB , and MT (r, n, ηB ) for the set of causal r × n compressors B with so-called truncated variable finite memory ηB . The definitions MC (m, r, η A ) and MT (m, r, η A ) for sets of de-compressors A are similar. Related Definitions 1–6 are in Section 3.1. F ORMULATION OF THE PROBLEM . Now, we are in a position to formulate the problem rigorously. We define the norm to be x2Ω = norm of x(ω ). Consider
Let B0 be such that J1 ( B0 ) = min J1 ( B) B
We write z0 = B0 (y). Next, let and let
A0
be such that J2 ( A0 ) = min J2 ( A) A
Ω
x(ω )22 dµ(ω ) where x(ω )2 is the Euclidean
J1 ( B) = z − B(y)2Ω . subject to B ∈ MC (r, n, ηB ) or B ∈ MT (r, n, ηB ).
(2)
J2 ( A) = x − A(z0 )2Ω subject to A ∈ M(m, r, η A ) or A ∈ MT (m, r, η A ).
(3)
We denote x0 = A0 (z0 ). The problem considered in this paper is to find operators B0 and A0 that satisfy minimization criteria (2) and (3), respectively. Operator B0 provides a compression of the signal y to the form z0 . Further, A0 reconstructs then compressed signal z0 to the form x0 so that x0 is an optimal representation of x in the sense of the constrained minimization problem (3). 2.4 Differences of our statement of the problem
The major differences between our statement of the problem (2)–(3) and the known statements considered, for example, in (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)–(Torokhti and Friedland, 2009) are as follows. Firstly, de-compressor A and compressor B should be causal with variable finite memory. Secondly, we represent the data compression problem in the form of a concatenation of two new separate problems (2) and (3). Some related arguments for considering the real-time data compression problem in the form (2)–(3) are presented in Section 6.
3. Main results 3.1 Formalization of the concept of variable memory
Let τ1 < τ2 < · · · < τn be time instants and α, β, ϑ : R → L2 (Ω, R ) be continuous functions. Suppose αk = α(τk ), βk = β(τk ) and ϑk = ϑ (τk ) are real-valued, random variables having finite second moments. We write x = [ α1 , α2 , . . . , α m ] T
y = [ β1 , β2 , . . . , β n ] T
and
z = [ϑ1 , . . . , ϑr ] T .
Data Compression and De-compression by Causal Filters with Variable Finite Memory
635
Let z˜ be a compressed form of data y defined by z˜ = B(y) with z˜ = [ϑ˜1 , . . . , ϑ˜r ] T , and x˜ be a de-compression of z˜ defined by x˜ = A(z˜ ) with x˜ = [α˜1 , . . . , α˜m ] T . In many applications5 , to obtain ϑ˜k for k = 1, . . . , r, it is necessary for the compressor B to use only a limited number of input components, ηBk = 1, . . . , r. A number of such input components ηBk is here called a kth local memory for B. To define a notation of memory for the compressor B, we use parameters p and g which are positive integers such that 1≤p≤n
and
n − r + 2 ≤ g ≤ n.
Definition 1 The vector ηB = [ηB1 , . . . , ηBr ] is called a variable memory of the compressor B. In particular, ηB is called a complete variable memory if ηB1 = g and ηBk = n when k = n − g + 1, . . . , n. Vector ηB is called a truncated variable memory of B if, for r ≤ p ≤ n, ηB1 = p − r + 1 and ηBr = p. Here, p relates to the last possible nonzero entry in the bottom row of B and g relates to the last possible nonzero entry in the first row. The notation η A = [η A1 , . . . , η Am ] has a similar meaning for the de-compressor A. Here, η A j is the jth local memory of A. In other words, η A j is the number of input components used by the de-compressor A to obtain the estimate α˜j with j = 1, . . . , m. The parameters q and s which are positive integers such that 1≤q≤r
and
2≤s≤m
are used below to define two types of memory for A. Definition 2 Vector η A is called a complete variable memory of the de-compressor A if η A1 = q and η A j = r when j = s + r − 1, . . . , m. Vector η A is called a truncated variable memory of A if η A j = 0 for j = 1, . . . , s − 1, η As = s and η A j = r when j = s + r − 1, . . . , m. Here, q relates to the first possible nonzero entry in the last column of A and s relates to the first possible nonzero entry in the first column. The memory constraints described above imply that certain elements of the matrices B = m,r {bij }r,n i,j=1 and A = { aij }i,j=1 must be set equal to zero. In this regard, for matrix B with r ≤ p ≤ n, we require that
if
j = p − r + i + 1, . . . , n,
for
bi,j = 0 p = r, . . . , n − 1, i = 1, . . . , r
and
(4) p = n, i = 1, . . . , r − 1,
(5)
and, for 1 ≤ p ≤ r − 1, it is required that
if
i = 1, . . . , r − p, j = 1, . . . , n,
bi,j = 0 i = r − p + 1, . . . , r, and if j = i − r + p + 1, . . . , n.
(6) (7)
See Examples 1 and 2 below. 5 Examples include computer medical diagnostics (Gimeno, 1987) and problems of bio-informatics (Kim et al., 2005).
636
Advanced Technologies
For matrix A with r ≤ p ≤ n, we require if
j = q + i, . . . , r
for
and, for 2 ≤ s ≤ m, it is required that if
j = s + i, . . . , r
for
ai,j = 0
(8)
q = 1, . . . , r − 1
and
i = 1, . . . , r − q,
ai,j = 0
(9)
(10)
s = 1, . . . , m
and
See Examples 3 and 4 below. The above conditions imply the following definitions.
i = 1, . . . , s + r − 1,
(11)
Definition 3 A matrix B satisfying the constraint (4)–(5) is said to be a causal operator with the truncated variable memory ηB = [ p − r + 1, . . . , p]. The set of such matrices is denoted by MT (r, n, ηB ). Example 1 Let n = 8, r = 3 and p = 7. B is of the form • B= • •
If the symbol • denotes an entry that may be non-zero, then
b1,p−r+1 • • • • 0 0↓ 0 • • • • • 0 0 • • • • r• • 0 is a causal operator with the truncated variable memory ηB = [5, 6, 7].
p
Definition 4 A matrix B satisfying the constraint (6)–(7) is said to be a causal operator with the complete variable memory ηB = [ g, g + 1, . . . , n]. Here, ηB = n when k = n − g + 1, . . . , n. The set of k such matrices is denoted by MC (r, n, ηB ). Example 2 Let n = 6, r = 4 and g = 4. Then B with g • • • • 0 0 • • • • • 0 B= • • • • • • n−g+1 • • • • • •
is the causal operator with the complete variable memory ηB = [4, 5, 6, 6]. Definition 5 A matrix A satisfying the constraint (8)–(9) is said to be a causal operator with the
complete variable memory η A = [r − q + 1, . . . , r ]. Here, η A = r when j = q, . . . , m. The set of such j
matrices is denoted by MC (m, r, η A ).
Example 3 Let m = 5, r = 4 and q = 3. Then A is of the form
• • A= • • •
• • • • •
a1,r−q+1 ↓
0 • • • •
0 0 • • •
← aq,r
and is a causal operator with the complete variable memory η A = [2, 3, 4, 4, 4].
Data Compression and De-compression by Causal Filters with Variable Finite Memory
637
Definition 6 A matrix A satisfying the constraint (10)–(11) is said to be a causal operator with the
truncated variable memory η A = [0, . . . , 0, 1, . . . , r ]. Here, η A = 0 when j = 1, . . . , s − 1, and η A = r j
when j = s + r − 1, . . . , m. The set of such matrices is denoted by MT (m, r, η A ). Example 4 Let m = 6, r = 4 and s = 3. Then matrix 0 0 0 0 0 0 • 0 0 A= • • 0 • • • • • •
j
A of the form 0 0 0 0 0 •
is a causal operator with the truncated variable memory η A = [0, 0, 1, 2, 3, 4]. 3.2 Solution of problems (2) and (3)
To proceed any further we shall require some more notation. Let
αi , β j =
Ω
αi (ω ) β j (ω )dµ(ω ) < ∞, y 1 = [ β 1 , . . . , β g −1 ] T ,
z 1 = [ ϑ 1 , . . . , ϑ g −1 ] T
m,n m×n Exy = {αi , β j }i,j , =1 ∈ R
(12)
y2 = [ β g , . . . , β n ] T ,
(13)
and
z2 = [ ϑ g , . . . , ϑ n ] T .
(14)
The pseudo-inverse matrix (Golub and Van Loan, 1996) for any matrix M is denoted by M† . The symbol O designates the zero matrix. Definition 7 (Torokhti and Howlett, 2007) Two random vectors u and w are said to be mutually orthogonal if Euw = O. Lemma 1 (Torokhti and Howlett, 2007) w1 = y1 where
If we define and
w2 = y2 − Py y1
Py = Ey1 y2 Ey†1 y1 + Dy ( I − Ey1 y1 Ey†1 y1 )
(15)
with Dy an arbitrary matrix, then w1 and w2 are mutually orthogonal random vectors. 3.2.1 Solution of problem (2). The case of complete variable memory
Let us first consider problem (2) when B has the complete variable memory ηB = [ g, g + 1, . . . , n] (see Definition 4). Let us partition B in four matrices K B , L B , SB1 and SB2 so that KB LB , B = SB1 SB2
638
Advanced Technologies
where K B = {k ij } ∈ R nb ×( g−1) is a rectangular matrix with nb = n − g + 1, L B = {ij } ∈ R nb ×nb
is a lower triangular matrix, and
(1)
SB1 = {sij } ∈ R (r−nb )×( g−1)
and
(2)
SB2 = {skl } ∈ R (r−nb )×nb
are rectangular matrices.
We have B(y)
= =
K B ( y1 ) + L B ( y2 ) = SB1 (y1 ) + SB2 (y2 ) K B (w1 ) + L B (w2 + Py (w1 )) TB (w1 ) + L B (w2 ) , = SB1 (w1 ) + SB2 (w2 + Py (w1 )) SB (w1 ) + SB2 (w2 )
KB SB1
LB SB2
y1 y2
(16)
where TB = K B + L B Py
SB = SB1 + SB2 Py .
and
(17)
Then J1 ( B)
= = =
z − B(y)2Ω z1 TB z2 − S B
LB SB2
w1 w2
J (1) ( TB , L B ) + J (2) (SB , SB2 ),
2
Ω
(18)
where J (1) ( T, L) = z1 − [ TB (w1 ) + L B (w2 )]2Ω
and
J (2) (SB , SB2 ) = z2 − [SB (w1 ) + SB2 (w2 )]2Ω .
By analogy with Lemma 37 in (Torokhti and Howlett, 2007), min
B∈M(r,n,ηB )
J1 ( B) = min J (1) ( TB , L B ) + min J (2) (SB , SB2 ). TB ,L B
SB ,SB2
(19)
Therefore, problem (2) is reduced to finding matrices TB0 , L0B , S0B and S0B2 such that J (1) ( TB0 , L0B ) = min J (1) ( TB , L B )
(20)
J (2) (S0B , S0B2 ) = min J (2) (SB , SB2 ).
(21)
TB ,L B
and
SB ,SB2
Taking into account the orthogonality of vectors w1 and w2 , and working in analogy with the argument on pp. 348–352 in (Torokhti and Howlett, 2007), it follows that matrices S0B and S0B2 are given by † † + HB ( I − Ew1 w1 Ew ) (22) S0B = Ez2 w1 Ew 1 w1 1 w1 and
† † + HB2 ( I − Ew2 w2 Ew ), S0B2 = Ez2 w2 Ew 2 w2 2 w2
where HB and HB2 are arbitrary matrices. See Section 7 for more details. Next, to find TB0 and L0B we use the following notation.
(23)
Data Compression and De-compression by Causal Filters with Variable Finite Memory
639
For r = 1, 2, . . . , , let ρ be the rank of the matrix Ew2 w2 ∈ R n2 ×n2 with nb = n − g + 1, and let Ew2 w2 1/2 = Qw,ρ Rw,ρ be the QR-decomposition for Ew2 w2 1/2 where Qw,ρ ∈ R n2 ×ρ and Qw,ρ T Qw,ρ = I and Rw,ρ ∈ R ρ×n2 is upper trapezoidal with rank ρ. We write Gw,ρ = Rw,ρ T and use the notation Gw,ρ = [ g1 , . . . , gρ ] ∈ R n2 ×ρ where g j ∈ R n2 denotes the j-th column of Gw,ρ . We also write Gw,s = [ g1 , . . . , gs ] ∈ R n2 ×s
(24)
for s ≤ ρ to denote the matrix consisting of the first s columns of Gw,ρ . For the sake of simplicity, let us set Gs := Gw,s . (25) Next,
e1 T = [1, 0, 0, 0, . . .],
e2 T = [0, 1, 0, 0, . . .],
e3 T = [0, 0, 1, 0, . . .],
etc.
denote the unit row vectors whatever the dimension of the space. Finally, any square matrix M can be written as M = M∆ + M∇ where M∆ is lower triangular and M∇ is strictly upper triangular. We write · F for the Frobenius norm. Lemma 2 Let matrices K, N and C be such that K = N + C and nij cij = 0 for all i, j, where nij and cij are entries of N and C, respectively. Then
K 2F = N 2F + C 2F . Proof.
The proof is obvious.
Theorem 1 Let B have the complete variable memory ηB = [ g, g + 1, . . . , n]. Then the solution to problem (2) is provided by the matrix B0 , which has the form 0 K B L0B , B0 = S0B1 S0B2 where the blocks K0B ∈ R nb ×( g−1) , S0B1 ∈ R (r−nb )×( g−1) and S0B2 ∈ R (r−nb )×nb are rectangular, and the block L0B ∈ R nb ×nb is lower triangular. These blocks are given as follows. The block K0B is given by K0B = TB0 − L0B Py with
where NB1 rows
(26)
† † TB0 = Ez1 w1 Ew + NB1 ( I − Ew1 w1 Ew ) (27) 1 w1 1 w1 0 λ1 .. 0 is an arbitrary matrix. The block L B = . , for each s = 1, 2, . . . , n2 , is defined by its λ0nb
λ0s = es T Ez1 w2 Ew2 w2 † Gs Gs † + f s T ( I − Gs Gs † )
(28)
640
Advanced Technologies
with f s T ∈ R1×n2 arbitrary. The blocks S0B1 and S0B2 are given by S0B1 = S0B − S0B2 Py
(29)
and (23), respectively. In (29), S0B is presented by (22). The error associated with the matrix B0 is given by
z − B0 y2Ω =
n2
ρ
∑ ∑
s =1 j = s +1
+
2
2
2
∑ Ez z 1/2 2F − ∑ ∑ Ez w Ew w †1/2 2F .
j =1
Proof.
|es T Ez1 w2 Ew2 w2 † g j |2 j j
i =1 j =1
i
i
j
(30)
j
Since Ew2 w2 = Gw,ρ Gw,ρ T , we have J (1) ( T, L) = z1 − [ T (w1 ) + L(w2 )]2Ω = tr Ez1 z1 − Ez1 w1 T T − Ez1 w2 L T − TEw1 z1
=
=
+ TEw1 w1 T T + TEw1 w2 L T − LEw2 z1 + LEw2 w1 T T + LEw2 w2 L T
tr Ez1 z1 − Ez1 w1 T T − Ez1 w2 L T − TEw1 z1 + TEw1 w1 T T − LEw2 z1 + LEw2 w2 L T tr ( T − Ez1 w1 Ew1 w1 † ) Ew1 w1 ( T T − Ew1 w1 † Ew1 z1 )
† + ( L − Ez1 w2 Ew2 w2 † ) Gρ Gρ T ( L T − Ew E ) 2 w2 w2 z 1
+ Ez1 z1 − Ez1 w1 Ew1 w1 † Ew1 z1 − Ez1 w2 Ew2 w2 † Ew2 z1
=
( T − Ez1 w1 Ew1 w1 † ) Ew1 w1 1/2 F
2
2
+ ( L − Ez1 w2 Ew2 w2 † ) Gρ F + Ez1 z1 1/2 F 2
2
− Ez1 w1 Ew1 w1 †1/2 F − Ez1 w2 Ew2 w2 †1/2 F .
2
(31)
Data Compression and De-compression by Causal Filters with Variable Finite Memory
641
On the basis of Lemma 2 and the fact that the matrix LGρ is lower triangular we note that
( L − Ez1 w2 Ew2 w2 † ) Gρ F
2
=
( LGρ − Ez1 w2 Ew2 w2 † Gρ )∆ − ( Ez1 w2 Ew2 w2 † Gρ )∇ F
=
∑ ∑ |(λs gj − es T Ez w
s
ρ
1
s =1 j =1
2
+
Ew2 w2 † g j )|2 ρ
nb
∑ ∑
s =1 j = s +1 ρ
∑ λs Gs − es T Ez w
=
s =1
1
2
Ew2 w2 † Gs 2
+
2
ρ
2
nb
∑ ∑
|es T Ez1 w2 Ew2 w2 † g j |2
s =1 j = s +1
|es T Ez1 w2 Ew2 w2 † g j |2 .
(32)
The first sum in (32) calculates the contribution from all elements with j ≤ s that are on or below the leading diagonal of the matrix ( L − Ez1 w2 Ew2 w2 † ) Gρ and the second sum calculates the contribution from all elements with j > s that are strictly above the leading diagonal. To minimize the overall expression (32) it would be sufficient to set the terms in the first sum to zero. Thus we wish to solve the matrix equation † λs Gs − es T Ez1 w2 Ew Gs = 0. 2 w2
This equation is a system of (2nb − ρ + 1)ρ2 /2 equations in (nb + 1)nb /2 unknowns. Hence, there is always at least one solution. Indeed, by applying similar arguments to those used in proving Lemma 26 of (Torokhti and Howlett, 2007), it can be seen that the general solution is given by (28). Next, it follows from (31) that the minimum of
( TB − Ez1 w1 Ew1 w1 † ) Ew1 w1 1/2 F is attained if
2
( TB − Ez1 w1 Ew1 w1 † ) Ew1 w1 1/2 = O.
(33)
TB Ew1 w1 − Ez1 w1 = O.
(34)
By Lemma 26 of (Torokhti and Howlett, 2007), this equation is equivalent to the equation
See Section 7 for more details. The general solution (Ben-Israel and Greville, 1974) to (34) is given by (43). Then (26) follows from (17). Equality (29) also follows from (17). To obtain the error representation (30), we write
z − B0 y2Ω
=
J (1) ( TB0 , L0B ) + J (2) (S0B , S0B2 )
=
z1 − [ TB0 (w1 ) + L0B (w2 )]2Ω + z2 − [S0B (w1 ) + S0B2 (w2 )]2Ω .
642
Advanced Technologies
Here, J (1) ( TB0 , L0B ) =
ρ
n2
∑ ∑
s =1 j = s +1
|es T Ez1 w2 Ew2 w2 † g j |2 + Ez1 z1 1/2 2F −
2
∑ Ez w Ew w †1/2 2F 1
j =1
j
j
j
which follows from (31) by substituting (43) and (28). Similarly, the term J (2) (S0B , S0B2 ) is represented by (see Corollary 20 (p. 351) in (Torokhti and Howlett, 2007))
2F − J (2) (S0B , S0B2 ) = Ez†1/2 2 z2
2
∑ Ez w Ew w †1/2 2F . 2
j =1
j
j
j
Hence, (30) follows.
3.2.2 Solution of problem (3). The case of complete variable memory
Let us now consider problem (3) when the de-compressor A has the complete variable memory η A = [r − q + 1, . . . , r ] (see Definition 5). In analogy with our partitioning of matrix B, we partition matrix A in four matrices K A , L A , S A1 and S A2 so that LA KA , A= S A1 S A2 where
K A = {k ij } ∈ R q×(r−q) is a rectangular matrix, L A = {ij } ∈ R q×q
S A1 =
(1) {sij }
∈R
is a lower triangular matrix, and
(m−q)×(r −q)
and
(2)
S A2 = {skl } ∈ R (m−q)×q
Let us partition z0 so that z0 =
z01 z02
are rectangular matrices.
with z01 ∈ L2 (Ω, Rr−q ) and z02 ∈ L2 (Ω, R q ). We also write x1 = [ α1 . . . , αr − q ] T
and
x 2 = [ α r − q +1 , . . . , α m ] T ,
and denote by v1 ∈ L2 (Ω, Rr−q ) and v2 ∈ L2 (Ω, R q ), orthogonal vectors according to Lemma 1 as v1 = z01 and v2 = z02 − Pz z01 ,
where Pz = Ez1 z2 Ez†1 z1 + Dz ( I − Ez1 z1 Ez†1 z1 ) with Dz an arbitrary matrix. By analogy with (24)–(25), we write Gv,s = [ g1 , . . . , gs ] ∈ R q×s
where Gv,s is constructed from a QR-decomposition of Ev2 v2 1/2 , in a manner similar to the construction of matrix Gw,s . Furthermore, we shall define Gs := Gv,s .
Data Compression and De-compression by Causal Filters with Variable Finite Memory
643
Theorem 2 Let A have the complete variable memory η A = [r − q + 1, . . . , r ]. Then the solution to problem (3) is provided by the matrix A0 , which has the form 0 L0A KA 0 , A = S0A1 S0A2 where the blocks K0A ∈ R q×(r−q) , S0A1 ∈ R (m−q)×(r−q) and S0A2 ∈ R (m−q)×q are rectangular, and the block L0A ∈ R q×q is lower triangular. These blocks are given as follows. The block K0A is given by 0 K0A = TA − L0A P
with
where NA1
(35)
0 TA = Ex1 v1 Ev†1 v1 + NA1 ( I − Ev1 v1 Ev†1 v1 ) (36) 0 λ1 . is an arbitrary matrix. The block L0A = .. ,for each s = 1, 2, . . . , q, is defined by its
λ0q
rows T
with f s ∈
R 1× q
λ0s = es T Ex1 v2 Ev2 v2 † Gs Gs † + f s T ( I − Gs Gs † )
arbitrary. The blocks
S0A1 = S0A − S0A2 P where
S0A1
and
S0A2
are given by
S0A2 = Ex2 v2 Ev†2 v2 + H A2 ( I − Ev2 v2 Ev†2 v2 ),
and
(37)
S0A = Ex2 v1 Ev†1 v1 + H A ( I − Ev1 v1 Ev†1 v1 )
(38) (39)
and H A2 and H A are arbitrary matrices. The error associated with the de-compressor A0 is given by
x − A0 z0 2Ω =
q
ρ
∑ ∑
s =1 j = s +1
+
2
2
2
∑ Ex x 1/2 2F − ∑ ∑ Ex v Ev v †1/2 2F .
j =1
Proof.
|es T Ex1 v2 Ev2 v2 † g j |2
j j
i =1 j =1
i i
j j
The proof follows in analogy with the proof of Theorem 1.
3.2.3 Solution of problem (2): the case of truncated variable memory
(40)
Let us now consider a solution to the problem (2) when the compressor B has the truncated variable memory ηB = [ p − r + 1, . . . , p] T (see Definition 3). To this end, let us partition B in three blocks K B , L B and ZB so that B = [ K B L B ZB ], where K B = {k ij } ∈ Rr×( p−r) is a rectangular matrix, L B = {ij } ∈ Rr×r
ZB = {zij } ∈ R
is a lower triangular matrix, and
r ×(n− p)
is the zero rectangular matrix.
644
Advanced Technologies
Let us write y1 = [ β1 , . . . , β p −r ] T ,
y 2 = [ β p −r +1 , . . . , β p ] T
and
y 3 = [ β p +1 , . . . , β n ] T .
Therefore B(y) = [K B L B
y1 ZB ] y2 = K B (w1 ) + L B (w2 + Py (w1 )) = TB (w1 ) + L B (w2 ), y3
where TB is given by (17). Then
J1 ( B)
z − B(y)2Ω
=
z − [ TB (w1 ) + L B (w2 )]2Ω
=
J (1) ( TB , L B ).
=
(41)
Similarly to (24)–(25), we write Gw,s = [ g1 , . . . , gs ] ∈ Rr×s where Gw,s is constructed from a QR-decomposition of Ew2 w2 1/2 , in a manner similar to the construction of matrix Gw,s . Furthermore, we shall define Gs := Gw,s . A comparison of (41) with (18) and Theorem 1 shows that the solution to the problem under consideration follows from Theorem 1 as its particular case as follows. Corollary 1 Let B ∈ MT (r, n, ηB ), i.e. the compressor B is causal and has the truncated variable memory ηB = [ p − r + 1, . . . , p] T . Then the solution to problem (2) is provided by the matrix B0 , which has the form B0 = K0B L0B ZB . Here, the block K0B ∈ Rr×( p−r) is given by
K0B = TB0 − L0B Py
(42)
with Py determined by (15) and † † TB0 = Ez1 w1 Ew + NB ( I − Ew1 w1 Ew ) 1 w1 1 w1
(43)
λ01 where NB is an arbitrary matrix. The lower triangular block L0B = ... ∈ Rr×r , for each s = λ0r 1, 2, . . . , r, is defined by its rows λ0s = es T Ezw2 Ew2 w2 † Gs Gs † + f s T ( I − Gs Gs † ), where Gs determined by (24)–(25) with Ew2 w2 ∈ Rr×r , and f s T ∈ R1×r arbitrary.
(44)
Data Compression and De-compression by Causal Filters with Variable Finite Memory
645
Let ρ be the rank of the matrix Ew2 w2 ∈ Rr×r . The error associated with the compressor B0 is given by J (1) ( TB0 , L0B ) =
ρ
r
∑ ∑
s =1 j = s +1
|es T Ezw2 Ew2 w2 † g j |2 + Ezz 1/2 2F −
Proof.
2
∑ Ezw Ew w †1/2 2F . j
j =1
j
j
(42)–(44) follow in analogy with (26)–(28), and (45) is similar to (30).
(45)
3.2.4 Solution of problem (3): the case of truncated variable memory
A solution to the problem (3) when the de-compressor A has the truncated variable memory η A = [0, . . . , 0, 1, . . . , r ] T (see Definition 6) is obtained in a similar manner to the solution of the problem (3) given in Section 3.2.2 above. We partition matrix A and vector z0 in three blocks Z A , L A and K A , and three sub-vectors z01 , z01 and z01 , respectively, so that 0 z1 ZA A = L A and z0 = z02 , KA z03
where Z A ∈ R (s−1)×r is the zero matrix, L A ∈ Rr×r is a lower triangular matrix, K A ∈ R (m−r)×r is a rectangular matrix, and z01 ∈ L2 (Ω, R s−1 ), z02 ∈ L2 (Ω, Rr−s+1 ) and z03 ∈ L2 (Ω, R m−r ). We also write x 1 = [ α 1 . . . , α s −1 ] T ,
x 2 = [ α s , . . . , α s +r −1 ] T
and
x3 = [ α s +r , . . . , α m ] T .
Therefore, J2 ( A)
2 ZA x − L A z0 KA Ω 2 x1 x2 − L A ( z0 ) 2 x3 − K ( z0 ) A 3 Ω
=
=
(1)
(1)
x1 2Ω + J2 ( L A ) + J2 (K A ),
=
(46)
where (1)
J2 ( L A ) = x2 − L A (z02 )2Ω
and
(1)
J2 (K A ) = x3 − K A (z03 )2Ω .
(47)
By analogy with (24)–(25), we write Gs := Gz0 ,s with 2
Gz0 ,s = [ g1 , . . . , gs ] ∈ Rrs ×s , 2
where rs = r − s + 1 and Gz0 ,s is constructed from a QR-decomposition of Ez0 z0 1/2 , in a manner 2 2 2 similar to the construction of matrix Gw,s . The solution of the problem under consideration is as follows.
646
Advanced Technologies
Corollary 2 Let a de-compressor A have the truncated variable memory η A = [0, . . . , 0, 1, . . . , r ] T . Then the solution to problem (3) is provided by the matrix A0 which has the form ZA 0 0 A = LA . K0A
λ01 Here, Z A ∈ R (s−1)×r is the zero matrix and L0A = ... ∈ Rr×r is a lower triangular matrix, for λ0r each s = 1, 2, . . . , r, defined by its rows λ0s = es T Ex2 z0 Ez0 z0 † Gs Gs † + f s T ( I − Gs Gs † ), 2
(48)
2 2
with f s T ∈ R1×r arbitrary. The block K0A ∈ R (m−r)×r is given by K0A = Ex3 z0 Ez†0 z0 + H A ( I − Ez0 z0 Ez†0 z0 ), 3
3 3
3 3
(49)
3 3
where H A is an arbitrary matrix. Let ρ be the rank of the matrix Ez0 z0 ∈ Rrs ×rs where rs = r − s + 1. The error associated with the 2 2
de-compressor A0 is given by
x − A0 (z0 )2Ω = x1 2Ω + Proof.
ρ
rs
∑ ∑
s =1 j = s +1
|es T Ex2 z0 Ez0 z0 † g j |2 + Ex†1/2 2F − Ex3 z0 Ez0 z0 †1/2 2F . (50) 3 x3 2
2 2
3
3 3
By analogy with (31)) and (32), we have 2
2
(1)
J2 ( L A ) = ( L A − Ex2 z0 Ez0 z0 † ) Gw,ρ F + Ex2 x2 1/2 F − Ex2 z0 Ez0 z0 †1/2 F 2
2 2
2
2
2 2
and 2
( L A − Ex2 z0 Ez0 z0 † ) Gw,ρ F = 2
2 2
ρ
∑ λs Gs − es T Ex z
0 2 2
s =1
Ez0 z0 † Gs 2
2
2 2
+
ρ
rs
∑ ∑
s =1 j = s +1
respectively. Then (48) follows similarly to (28).
|es T Ex2 z0 Ez0 z0 † g j |2 , 2
2 2
(51)
(1)
The equality (49) is determined from minimizing J2 (K A ) in (47) as it has been done in (22)– (23) and on pp. 348–352 in (Torokhti and Howlett, 2007). To derive the error representation (50), we write J2 ( A0 )
(2)
(1)
=
x − A0 (z0 )2Ω = J2 ( L0A ) + J2 (K0A )
=
x1 2Ω +
Thus, (48)–(50) are true.
ρ
rs
∑ ∑
s =1 j = s +1
|es T Ex2 z0 Ez0 z0 † g j |2 + Ex†1/2 2F − Ex3 z0 Ez0 z0 †1/2 2F . 3 x3 2
2 2
3
3 3
Data Compression and De-compression by Causal Filters with Variable Finite Memory
647
3.3 Device for data compression and de-compression
The proposed filter F0 for data compression, filtering and de-compression consists of two devices, the compressor B0 and the de-compressor A0 , so that F0 = A0 B0 . The device B0 compresses observed data y ∈ L2 (Ω, R n ) to a ‘shorter’ vector z0 ∈ L2 (Ω, Rr ) with r < min{m, n} where m is a dimension of the reference signal x ∈ L2 (Ω, R m ). The de-compressor A0 restores z0 to a signal x˜ ∈ L2 (Ω, R m ) so that this procedure minimizes the cost functional J2 ( A) given by (3). The compression ratio associated with such devices is given by c=
r . min{m, n}
(52)
4. Simulations The following simulations and numerical results illustrate the performance of the proposed approach. Our filter F0 = A0 B0 has been applied to compression, filtering and subsequent restoration of the reference signals given by the matrix X ∈ R256×256 . The matrix X represents the data obtained from an aerial digital photograph of a plant6 presented in Fig. 1. We divide X into 128 sub-matrices Xij ∈ R m×q with i = 1, . . . , 16, j = 1, . . . , 8, m = 16 and q = 32 so that X = { Xij }. By assumption, the sub-matrix Xij is interpreted as q realizations of a random vector x ∈ L2 (Ω, R m ) with each column representing a realization. For each i = 1, . . . , 16 and j = 1, . . . , 8, observed data Yij were modelled from Xij in the form Yij = Xij • rand(16, 32)(ij) . Here, • means the Hadamard product and rand(16, 32)(ij) is a 16 × 32 matrix whose elements are uniformly distributed in the interval (0, 1). The proposed filter F0 has been applied to each pair { Xij , Yij }. In these simulations, we are mainly concerned with the implementation of conditions of causality and variable finite memory in the filter F0 and their influence on the filter performance. To this end, we considered compressors B0 and de-compressors A0 with different types of memory studied above. First, each pair { Xij , Yij } has been processed by compressors and de-compressors with the complete variable memory. In this regard, see Definitions 4 and 5. We denote BC0 = B0 and A0C = A0 for such a compressor and de-compressor determined by Theorems 1 and 2, respectively, so that BC0 ∈ MT (r, n, ηB ) and A0C ∈ MC (m, r, η A ) 12 + k − 1, if k = 1, . . . , 4, 16 , and where n = m = 16, r = 8, ηB = {ηB k }k=1 with ηB k = 16, if k = 5, . . . , 16 6 + j − 1, if j = 1, 2, . In this case, the optimal filter F0 is η A = {η A j }16 j=1 with η A j = 8, if k = 3, . . . , 16 denoted by FC0 so that FC0 = A0C BC0 . We write JC0 = max Xij − FC0 Yij 2 ij
for a maximal error associated with the filter FC0 over all i = 1, . . . , 16 and j = 1, . . . , 8. 6 The
database is available in http://sipi.usc.edu/services/database/Database.html.
648
Advanced Technologies
Second, each pair { Xij , Yij } has been processed by compressors and de-compressors with the truncated variable memory defined by Definitions 3 and 6, respectively. Such a compressor and de-compressor are denoted by BT0 = B0 and A0T = A0 with B0 and A0 determined by Corollaries 1 and 2. Here, BT0 ∈ MT (r, n, ηB )
and
A0T ∈ MT (m, r, η A )
16 where n = m = 16, r = 8, ηB = {ηB k }16 k =1 with ηB k = 9 + k − 1 for k = 1, . . . , 8, and η A = { η A j } j=1 j, if j = 1, . . . , 7, . The filter F0 composed from BT0 and A0T is denoted by FC0 with η A j = 8, if k = 8, . . . , 16 so that FC0 = A0T BT0 . The maximal error associated with the filter FT0 over all i = 1, . . . , 16 and j = 1, . . . , 8 is JT0 = max Xij − FT0 Yij 2 . ij
The compression ratio for both cases above was c = 1/2. We assumed that covariance matrices associated with a compressed signal and used to compute BC0 , A0C , BT0 and A0T could be determined from the Generic Karhunen-Lo`eve Transform (GKLT) (Torokhti and Howlett, 2007) applied to each pair { Xij , Yij }. We remind that the GKLT is not restricted by the conditions of causality and memory. To the best of our knowledge, a method for the data compression subject to the conditions of causality and memory is presented here for the first time. Therefore, we could not compare our results with similar methods. However, to have an idea of some sort of comparison, we also computed the maximal error associated with the GKLT FGKLT over all i = 1, . . . , 16 and j = 1, . . . , 8 as JGKLT = max Xij − FGKLT Yij 2 . ij
The results of simulations are presented in Table 1 and Figures 1 (a) - (d). Table 1. JC0 3.3123e + 005
JT0 4.3649e + 005
JGKLT 3.0001e + 005
As it has been shown above, the conditions of causality and memory imply the specific structure of the filter F0 = A0 B0 such that matrices A0 and B0 must have special zero entries. The more zeros A0 and B0 have the worse an associated error is. Matrices BC0 and A0C have more non-zero entries than BT0 and A0T , therefore, JC0 is lesser than JT0 . The error JGKLT associated with the GKLT is naturally lesser than JC0 and JT0 because the conditions of causality and memory are not imposed on FGKLT and, therefore, it is not required that FGKLT must have specific zero entries.
5. Conclusion The new results obtained in the paper are summarized as follows. We have presented a new approach to the data processing consisting from compression, decompression and filtering of observed stochastic data subject to the conditions of causality and variable memory. The approach is based on the assumption that certain covariance matrices formed from observed data, reference signal and compressed signal are known or can be estimated. This allowed us to consider two separate problem related to compression and
Data Compression and De-compression by Causal Filters with Variable Finite Memory
649
(a) Given reference signals.
(b) Observed data.
(c) Estimates of the reference signals by the filter FC0 with the complete variable memory.
(d) Estimates of the reference signals by the filter FT0 with the truncated variable memory.
Fig. 1. Illustration of simulation results. de-compression subject to those constraints. It has been shown that the structure of the filter should have a special form due to constrains of causality and variable memory. The latter has implied the new method for the filter determination presented in Sections 3.2.1–3.2.4. The analysis of the associated errors has also been provided.
650
Advanced Technologies
6. Appendix 1: Difficulties associated with some possible formalizations of the real-time data compression problem In addition to the motivations presented in Section 1, we provide here some further, related arguments for considering the real-time data compression problem in the form (2)–(3). Let us consider two seemingly natural ways to state the real-time data compression problem and show that they lead to significant difficulties. This observation, in particular, has motivated us to state the problem as it has been done in Section 2.3 above. The first way is as follows. Let us denote by J ( A, B), the norm of the difference between the reference signal x and its estimate x˜ = ( A ◦ B)(y), constructed by de-compressor A : L2 (Ω, Rr ) → L2 (Ω, R m ) and compressor B : L2 (Ω, R n ) → L2 (Ω, Rr ) from the observed signal y: J ( A, B) = x − ( A ◦ B)(y)2Ω .
(53)
The problem is to find B0 : L2 (Ω, R n ) → L2 (Ω, Rr ) and A0 : L2 (Ω, Rr ) → L2 (Ω, R m ) such that J ( A0 , B0 ) = min J ( A, B) A,B
(54)
subject to conditions of causality and variable finite memory for A and B. The problem (54), with no constraints of causality and variable finite memory, has been considered, in particular, in (Hua and Nikpour, 1999). A second way to formulate the problem is as follows. Let F : L2 (Ω, R n ) → L2 (Ω, R m ) be a linear operator defined by [F (y)](ω ) = F [y(ω )] (55)
where F ∈ R n×m . Let rank F = r and J ( F ) = x − F (y)2Ω . Find F 0 : L2 (Ω, R n ) → L2 (Ω, R m ) such that J ( F0 ) = min J ( F )
(56)
rank F ≤ min{m, n}
(57)
F
subject to
and conditions of causality and variable finite memory for F. The constraint (57) implies that F can be written as a product of two matrices, A and B, representing a de-compressor and compressor, respectively. If the conditions of causality and variable finite memory for F are not imposed then the solution of problem (56)–(57) is known and provided, for example, in (Torokhti and Howlett, 2007), Section 7.4. If there are no constraints associated with causality and variable finite memory, then solutions of problems (54) and (56)-(57) are based on the assumption that certain covariance matrices are known. In this regard, see, for example, (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)– (Torokhti and Friedland, 2009). In particular, the solution of problem (54), with no constraints of causality and variable finite memory, provided in (Hua and Nikpour, 1999) follows from an iterative scheme that requires knowledge of two covariance matrices at each step of the scheme. Thus, if the method proposed in (Hua and Nikpour, 1999) requires p iterative steps, then it requires a knowledge of 2p covariance matrices.
Data Compression and De-compression by Causal Filters with Variable Finite Memory
651
In the case of no constraints of causality and variable finite memory, the known solution of problem (56)-(57) requires knowledge of two covariance matrices only (see, e.g., (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)–(Torokhti and Friedland, 2009)). A special difficulty with solving problem (54) is that it involves two unknowns, A and B, but only one functional to minimize. An iterative approach to its approximate solution based on the method presented in (Hua and Nikpour, 1999) requires knowledge of a number of covariance matrices. Another difficulty is that A and B must keep their special form associated with causality and variable finite memory. Related definitions are given in Section 3.1. Unlike (54), problem (56)–(57) has only one unknown. Nevertheless, the main difficulty associated with problem (56)-(57) is similar: an implementation of the conditions of causality and variable finite memory into a structure of F implies a representation of F as a product of two factors, each with a specific shape related to causality and memory, respectively. An approach to its exact solution based on the methodology presented in (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)–(Torokhti and Friedland, 2009) would require knowledge of two covariance matrices only, but implies constraints related to a special shape of each factor in the decomposition of F into a product of two factors. Therefore, as with problem (54), the difficulty again relates to dealing with two unknowns with only one functional to minimize. To avoid the difficulties discussed above, we proposed the approach to the solution of the real-time data compression problem presented in Sections 2 and 3. We note that the problem (2)–(3) requires knowledge of covariance matrices used in Theorems 1 and 2, and Corollaries 1 and 2, i.e., such an assumption is similar to the assumptions used in (Scharf., 1991)–(Jolliffe, 1986) and (Scharf, 1991)–(Torokhti and Friedland, 2009) for the solution of problems (54) and (56)-(57) with no constraints associated with causality and variable finite memory.
7. Appendix 2: Some derivations used in Section 3.2 7.1 Proof of relationships (22) and (23)
Let us show that (22) and (23) are true. For simplicity, we denote z : = z2 ,
S2 := SB2
and
J (S1 , S2 ) := J (2) (SB , SB2 ).
∑ Ezw (Ew1/2w )† 2F
and
J1 =
S1 : = S B ,
(58)
Let 1/2 2 F − J0 = Ezz
2
k
k =1
Lemma 3 Let J (S1 , S2 ) = z −
k
k
2
∑ Sk Ew1/2w k
k =1
k
1/2 † 2 − Ezwk ( Ew ) F . k wk
2
J (S1 , S2 ) is achieved when ∑ Sk (wk )2Ω . Then min S ,S 1
k =1
2
† † + Hk ( I − Ewk wk Ew ) Sk = Ezwk Ew k wk k wk
for k = 1, 2,
(59)
where Hk is an arbitrary matrix. Proof.
First, let us prove that J (S1 , S2 ) = J0 + J1 .
(60)
We note that M2F = tr[ MM T ], and write J0 = tr[ Ezz ] −
2
∑ tr[Ezw Ew† w Ew z ]
k =1
k
k
k
k
(61)
652
Advanced Technologies
and J1
2
∑ tr[(Sk − Ezw Ew† w )Ew w (SkT − Ew† w Ew z )]
=
k
k =1
k
k
k
k
k
k
k
2
∑ tr[(Sk Ew w SkT − Sk Ew z − Ezw SkT + Ezw Ew† w Ew z )].
=
k
k =1
k
k
k
k
k
k
k
(62)
1/2 † † E1/2 = ( Ew ) and orthogonality of vectors w1 The latter is due to the relationship Ew k wk k wk wk wk and w2 . Next, since for any random vector z, z2Ω = tr[ Ezz ], we have
J ( S1 , S2 ) = z −
2
2
k =1
k =1
∑ Sk (wk )2Ω = tr[Ezz − ∑ (Ezw Sk T − Sk Ew z + Sk Ew w k Sk T )]. k
k
k
1
(63)
Then (60) follows from (61)–(63). In (60), the only term depending on S1 and S2 , is J1 . Therefore, min J (S1 , S2 ) is achieved when S1 ,S2
1/2 Sk Ew k wk
−
1/2 † Ezwk ( Ew ) k wk
= O for k = 1, 2.
(64)
By (Torokhti and Howlett, 2007) (see p. 352), matrix equation (64) has its general solutions in the form (59). As a result, relationships (22) and (23) follow from (58) and (59). 7.2 Proof of equivalency of equations (33) and (34)
In Lemma 4 below, TB is any matrix of the size that makes sense of the difference TB − Ez1 w1 Ew1 w1 † . In particular, TB can be the matrix defined by (17). Lemma 4 The equations and are equivalent. Proof.
( TB − Ez1 w1 Ew1 w1 † ) Ew1 w1 1/2 = O.
(65)
TB Ew1 w1 − Ez1 w1 = O.
(66)
Let us suppose that (65) is true. Multiplying on the right by Ew1 w1 1/2 gives † TB Ew1 w1 − Ez1 w1 Ew E = O. 1 w1 w1 w1
† E = Ez1 w1 (see Lemma 24 on Then TB Ew1 w1 − Ez1 w1 = O follows on the basis of Ez1 w1 Ew 1 w1 w1 w1 p. 168 in (Torokhti and Howlett, 2007)). † † On the other hand, if TB Ew1 w1 − Ez1 w1 Ew E = O then multiplying on the right by Ew 1 w1 w1 w1 1 w1 gives † † TB Ew1 w1 Ew − Ez1 w1 Ew = O. 1 w1 1 w1
(67)
1/2 † E1/2† = Ew1 w1 Ew (see p. 170 in (Torokhti and Howlett, 2007)). Hence, equation Here, Ew 1 w1 w1 w1 1 w1 (67) can be rewritten as 1/2 † TB Ew E1/2† − Ez1 w1 Ew = O. 1 w1 w1 w1 1 w1 1/2 gives the required result (66). Multiplying on the right by Ew 1 w1
In other words, (33) and (34) are equivalent.
Data Compression and De-compression by Causal Filters with Variable Finite Memory
653
8. References L. L. Scharf, Statistical Signal Processing, Reading, MA: Addison-Wesley, 1991. Y. Hua, M. Nikpour and P. Stoica, Optimal reduced rank estimation and filtering, IEEE Trans. Signal Processing, Vol. 49, No. 3, pp. 457-469, 2001. K. Fukunaga, Introduction to Statistical Pattern Recognition, Boston: Academic Press, 1990. I.T. Jolliffe, Principal Component Analysis, Springer Verlag, New York, 1986. V. Gimeno, Obtaining the EEG envelope in real time: a practical method based on homomorphic filtering, Neuropsychobiology, Vol. 18, No. 2, pp. 110–112, 1987. H. Kim, G.H. Golub, H. Park, Missing value estimation for DNA microarray gene expression data: local least squares imputation, Bioinformatics, Vol. 21, No. 2, pp. 187-198 2005. P. Common, G.H. Golub, Tracking a few extreme singular values and vectors in signal processing, Proc. IEEE, Vol. 78, No. 8, pp.1327-1343, 1990. S. Kraut, Member, R. H. Anderson, and J. L. Krolik, A Generalized KarhunenLoeve Basis for Efficient Estimation of Tropospheric Refractivity Using Radar Clutter, IEEE Trans. Signal Processing, Vol. 52, No. 1, pp. 48 -60, 2004. A. Torokhti and P. Howlett, Computational Methods for Modelling of Nonlinear Systems, Elsevier, 2007. L. L. Scharf, The SVD and reduced rank signal processing, Signal Processing, Vol. 25, pp. 113 133, 1991. Y. Yamashita and H. Ogawa, Relative Karhunen-Lo´eve transform, IEEE Trans. on Signal Processing, Vol. 44, No. 2, pp. 371-378, 1996. Y. Hua and W. Q. Liu, Generalized Karhunen-Lo`eve transform, IEEE Signal Processing Letters, Vol. 5, No. 6, pp. 141-143, 1998. A. Torokhti and S. Friedland, Towards theory of generic Principal Component Analysis, J. Multivariate Analysis, Vol. 100, No. 4, pp. 661-669, 2009. A. Torokhti and P. Howlett, Optimal Transform Formed by a Combination of Nonlinear Operators: The Case of Data Dimensionality Reduction, IEEE Trans. on Signal Processing, Vol. 54, No. 4, 2006. A. Torokhti and P. Howlett, Filtering and Compression for Infinite Sets of Stochastic Signals, Signal Processing, Vol. 89, Issue 3, pp. 291-304, 2009. A. Torokhti and J. Manton, Generic Weighted Filtering of Stochastic Signals, IEEE Trans. Signal Processing (accepted 23 January 2009). Y. Hua and M. Nikpour, Computing the reduced rank Wiener filter by IQMD, IEEE Signal Processing Letters, Vol. 6, No. 9, pp. 240-242, 1999. P. Stoica and M. Viberg, Maximum likelihood parameter and rank estimation in reduced-rank multivariate linear regressions, IEEE Trans. Signal Processing, Vol. 44, No. 12, pp. 30693078, 1996. T. Zhang, G. Golub, Rank-One Approximation to High Order Tensors, SIAM J. Matrix Anal. Appl., Vol. 23, Issue 2, pp. 534550, 2001. V.N. Fomin and M.V. Ruzhansky, Abstract optimal linear filtering, SIAM J. Control and Optimization, 2 Vol. 38, pp. 1334 - 1352, 2000. A. Torokhti and P. Howlett, Best approximation of identity mapping: the case of variable memory, J. Approx. Theory, Vol. 143, No. 1, pp. 111-123, 2006. P.G. Howlett, A. Torokhti, and C.E.M. Pearce, Optimal multilinear estimation of a random vector under constraints of causality and limited memory, Computational Statistics & Data Analysis, Vol. 52, Issue 2, pp. 869-878, 2007.
654
Advanced Technologies
R. M. De Santis, Causality Theory in Systems Analysis, Proc. of IEEE, Vol. 64, Issue 1, pp. 36–44, 1976. G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD: Johns Hopkins University Press, 1996. A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications, John Wiley & Sons, New York, 1974.
Feedback Control of Marangoni Convection with Magnetic Field
655
34 X
Feedback Control of Marangoni Convection with Magnetic Field Norihan Md. Arifin, Haliza Rosali and Norfifah Bachok
Department of Mathematics and Institute For Mathematical Research Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia 1. Introduction Convection in a plane horizontal fluid layer heated from below, initially at rest and subject to an adverse temperature gradient, may be produced either by buoyancy forces or surface tension forces. These convective instability problems are known as the Rayleigh-Benard convection and Marangoni convection, respectively. The determination of the criterion for the onset of convection and the mechanism to control has been a subject of interest because of its applications in the heat and momentum transfer research. Rayleigh (1916) was the first to solve the problem of the onset of thermal convection in a horizontal layer of fluid heated from below. His linear analysis showed that Benard convection occurs when the Rayleigh number exceeds a critical value. Theoretical analysis of Marangoni convection was started with the linear analysis by Pearson (1958) who assumed an infinite fluid layer, a nondeformable case and zero gravity in the case of no-slip boundary conditions at the bottom. He showed that thermocapillary forces can cause convection when the Marangoni number exceeds a critical value in the absence of buoyancy forces. The determination of the criterion for the onset of convection and the mechanism to control convective flow patterns is important in both technology and fundamental Science. The problem of suppressing cellular convection in the Marangoni convection problem has attracted some interest in the literature. The effects of a body force due to an externallyimposed magnetic field on the onset of convection has been studied theoretically and numerically. The effect of magnetic field on the onset of steady buoyancy-driven convection was treated by Chandrasekhar (1961) who showed that the effect of magnetic field is to increase the critical value of Rayleigh number and hence to have a stabilising effect on the layer. The effect of a magnetic field on the onset of steady buoyancy and thermocapillarydriven (Benard-Marangoni) convection in a fluid layer with a nondeformable free surface was first analyzed by Nield (1966). He found that the critical Marangoni number monotonically increased as the strength of vertical magnetic field increased. This indicates that Lorentz force suppressed Marangoni convection. Later, the effect of a magnetic field on the onset of steady Marangoni convection in a horizontal layer of fluid has been discuss in a series of paper by Wilson (1993, 1994). The influence of a uniform vertical magnetic field on
656
Advanced Technologies
the onset of oscillatory Marangoni convection was treated by Hashim & Wilson (1999) and Hashim & Arifin (2003). The present work attempts to delay the onset of convection by applying the control. The objective of the control is to delay the onset of convection while maintaining a state of no motion in the fluid layer. Tang and Bau (1993,1994) and Howle (1997) have shown that the critical Rayleigh number for the onset of Rayleigh-Bénard convection can be delayed. Or et al. (1999) studied theoretically the use of control strategies to stabilize long wavelength instabilities in the Marangoni-Bénard convection. Bau (1999) has shown independently how such a feedback control can delay the onset of Marangoni-Bénard convection on a linear basis with no-slip boundary conditions at the bottom. Recently, Arifin et. al. (2007) have shown that a control also can delay the onset of Marangoni-Bénard convection with free-slip boundary conditions at the bottom. Therefore, in this paper, we use a linear controller to delay the onset of Marangoni convection in a fluid layer with magnetic field. The linear stability theory is applied and the resulting eigenvalue problem is solved numerically. The combined effect of the magnetic field and the feedback control on the onset of steady Marangoni convection are studied.
2. Problem Formulation Consider a horizontal fluid layer of depth d with a free upper surface heated from below and subject to a uniform vertical temperature gradient. The fluid layer is bounded below by a horizontal solid boundary at a constant temperature T1 and above by a free surface at constant temperature T2 which is in contact with a passive gas at constant pressure P0 and constant temperature T (see Figure 1) Free surface
z d
Fluid layer
z d ( x, y , t )
y
x
Rigid boundary Fig. 1. Problem set up We use Cartesian coordinates with two horizontal x- and y- axis located at the lower solid boundary and a positive z- axis is directed towards the free surface. The surface tension, τ is assumed to be a linear function of the temperature
τ τ 0 γ T T0 , where
τ0
is the value of
τ
at temperature T0 and the constant
fluids. The density of the fluid is given by
(1)
γ
is positive for most
Feedback Control of Marangoni Convection with Magnetic Field
657
0 [1 (T T0 )],
(2)
where is the positive coefficient of the thermal liquid expansion and 0 is the value of
at the reference temperature T0 . Subject to the Boussinesq approximation, the governing
equations for an incompressible, electrically conducting fluid in the presence of a magnetic field are expressed as follows: Continuity equation: Momentum equation:
U 0,
(3)
1 2 H H, U U Π v U + 4 t
(4)
2 U T T t
(5)
Energy equation :
Magnetic field equations:
H =0
U. H = t
where
U, T , H, , v , and
(6)
H. U + 2 H
(7)
denote the velocity , temperature, magnetic field,
pressure, density, kinematic viscosity , thermal diffusivity and electrical resistivity, respectively. Π=p | H |2 / 8 is the magnetic pressure where p is the fluid pressure and
is the magnetic permeability. When motion occurs, the upper free surface of the layer
will be deformable with its position at z d f x, y,t . At the free surface, we have the usual kinematic condition together with the conditions of continuity for the normal and tangential stresses. The temperature obeys the Newton's law of cooling,
k T / n h T T , where k and h are the thermal conductivity of the fluid and the heat
transfer coefficient between the free surface and the air, respectively, and n is the outward unit normal to the free surface. The boundary conditions at the bottom wall, z = 0, are noslip and conducting to the temperature perturbations. To simplify the analysis, it is convenient to write the governing equations and the boundary conditions in a dimensionless form. In the dimensionless formulation, scales for length, velocity, time and temperature gradient are taken to be d , / d ,d / and T respectively. 2
Furthermore, six dimensionless groups appearing in the problem are the Marangoni number M Td / 0 v , the Biot number, Bi hd / k , the Bond number,
Bo 0 gd 2 / τ 0 , the Prandtl number, Pr / , the Crispation number, Cr 0 / 0 d and the internal heating, Q qd 2 / 2 T .
658
Advanced Technologies
Our control strategy basically applies a principle similar to that used by Bau (1999), which is as follows: Assumed that the sensors and actuators are continuously distributed and that each sensor directs an actuator installed directly beneath it at the same {x,y} location. The sensor detects the deviation of the free surface temperature from its conductive value. The actuator modifies the heated surface temperature according to the following rule Bau (1999) :
T( x, y, 0,t )
1 Bi 1 K T x, y,1,t Bi Bi
(8)
where K is the scalar controller gain. Equation (8) can be rewritten more conveniently as
T ( x, y,0, t ) K T ( x, y,1, t )
(9)
where T is the deviation of the fluid's temperature from its conductive value. The control strategy in equation (9), in which K is a scalar will be used to demonstrate that our system can be controlled.
3. Linearized Problem We study the linear stability of the basic state by seeking perturbed solutions for any quantity x, y,z,t in terms of normal modes in the form
( x, y,z,t ) 0 ( x, y,z ) ( z )exp[ i( x x y y ) st ],
(10)
where 0 is the value of in the basic state, a ( x2 y2 )1 / 2 is the total horizontal wave number of the disturbance and s is a complex growth rate with a real part representing the growth rate of the instability and the imaginary part representing its frequency. At marginal stability, the growth rate s of perturbation is zero and the real part of s, ( s ) 0 represents unstable modes while ( s ) 0 represents stable modes. Substituting equation (10) into equations (3) - (7) and neglecting terms of the second and higher orders in the perturbations we obtain the corresponding linearized equations involving only the z-dependent parts of the perturbations to the temperature and the z-components of the velocity denoted by T and w respectively,
( D 2 a 2 )2 H 2 D 2 w 0
D
subject to
2
PC 1 r D 3a H
D
2
a 2 T w 0,
(12)
sf w (1) 0, 2
2
(11)
2
Dw 1 a a 2
2
Bo f 0,
a 2 w(1) a 2 M T (1) 1 Q f 0, hz (1) 0 ,
(13) (14) (15) (16)
Feedback Control of Marangoni Convection with Magnetic Field
659
DT 1 Bi T 1 1 Q f 0,
and
(17)
w(0) 0, Dw(0) 0,
(18)
hz (0) 0 ,
(20)
(19)
T (0) KT (1) 0.
(21)
on the lower rigid boundary z = 0. The operator D = d/dz denotes the differentiation with respect to the vertical coordinate z. The variables w, T and f denote respectively the vertical variation of the z-velocity, temperature and the magnitude of the free surface deflection of the linear perturbation to the basic state with total wave number a in the horizontal x-y plane and complex growth rates.
4. Results and disussion The effect of feedback control on the onset of Marangoni convection in a fluid layer with a magnetic field in the case of a deformable free surface (Cr 0) is investigated numerically.
The marginal stability curves in the ( a , M ) plane are obtained numerically where M is a function of the parameters a , Bi , Bo , Cr and Q. For a given set of parameters, the critical Marangoni number for the onset of convection defined as the minimum of the global minima of marginal curve. We denote this critical value by M c and the corresponding critical wave number, ac . The problem has been solved to obtain a detail description of the marginal stability curves for the onset of Marangoni convection when the free surface is perfectly insulated ( Bi 0 ).
Figure 2 shows the numerically calculated Marangoni number, M as a function of the wavenumber, a for different values of K in the case Cr 0 . From Figure 4 it is seen that the critical Marangoni number increase with an increase of K. Thus, the magnetic always has a stabilizing effect on the flow. In the absence of controller gain, K = 0 and magnetic field, Q = 0, the present calculation reproduce closely the stability curve obtained by Pearson (1958). The present calculation are also reproduced the stability curve obtained by Wilson for K =0 and Q =100. It can been seen that the feedback control and magnetic field suppresses Marangoni convection. The critical Marangoni number, M c increases monotonically as the controller gain, K increases. In the case of non-deformable free surface Cr 0 , the controller can suppress the modes and maintain a no-motion state, but this situation is significantly different if the free surface is deformable, Cr
0.
When Cr becomes large the long-wavelength instability sets in as a primary one and the critical Marangoni numbers are at a = 0. Figure 3 shows the critical Marangoni number at the onset of convection as a function of the wave number, a, for a range of values of the
660
Advanced Technologies
controller gains, K when in the case of Cr 0.001 and Bo 0.1 . At a = 0, the critical Marangoni number is zero and in this case, conductive state does not exist. Figure 3 shows that the controller is not effective at the wave number a = 0. Figure 4 shows the critical Marangoni number at the onset of convection as a function of the wave number, a for a range of values of the controller gains, K in the case Cr 0.0001 and Bo 0.1 . In this case,
the global minimum occurs at a 0 and as the controller gain K increases, the curve shifts upwards and most importantly, the controller increases the magnitude of the global minimum, thus it has a stabilizing effect.
Fig. 2. Numerically-calculated marginal stability curves for K = 0 and Q = 0 (solid line) and for various values of K (dashed line) in the case Q =100 and Cr 0 .
Fig. 3. Numerically calculated marginal stability curves for K = 0 and Q = 0 (solid line) and for various values of K (dashed lines) in the case Q = 100, Cr = 0.001 and Bo = 0.1.
Fig. 4. Numerically calculated marginal stability curves for K = 0 and Q = 0 (solid line) and for various values of K (dashed lines) in the case Q = 100, Cr = 0.0001 and Bo = 0.1.
5. Conclusion
The effect of feedback control on the onset of steady Marangoni convection in a fluid layer with a magnetic field has been studied. We have shown that the feedback control and the magnetic field suppress Marangoni convection. We have also shown numerically that the effect of the controller gain and the magnetic field is always to stabilize the layer in the case of a non-deforming surface. In the case of a deforming surface, however, the effectiveness of the controller gain depends on the parameters Cr and Bo.
6. References
Arifin, N. M., Nazar, R. & Senu, N. (2007). Feedback control of the Marangoni-Bénard instability in a fluid layer with free-slip bottom. J. Phys. Soc. Japan, Vol. 76, No. 1, 014401, 1-4.
Bau, H. H. (1999). Control of Marangoni-Bénard convection. Int. J. Heat Mass Transfer, Vol. 42, 1327-1341.
Chandrasekhar, S. (1961). Hydrodynamic and Hydromagnetic Stability. Oxford University Press, Oxford.
Hashim, I. & Arifin, N. M. (2003). Oscillatory Marangoni convection in a conducting fluid layer with a deformable free surface in the presence of a vertical magnetic field. Acta Mech., Vol. 164, 199-215.
Hashim, I. & Wilson, S. K. (1999). The effect of a uniform vertical magnetic field on the onset of oscillatory Marangoni convection in a horizontal layer of conducting fluid. Acta Mech., Vol. 132, 129-146.
Howle, L. E. (1997). Linear stability analysis of controlled Rayleigh-Bénard convection using shadowgraphic measurement. Physics of Fluids, Vol. 9, No. 11, 3111-3113.
Nield, D. A. (1966). Surface tension and buoyancy effects in cellular convection of an electrically conducting liquid in a magnetic field. Z. Angew. Math. Phys., Vol. 17, 131-139.
Or, A. C., Kelly, R. E., Cortelezzi, L. & Speyer, J. L. (1999). Control of long-wavelength Bénard-Marangoni convection. J. Fluid Mech., Vol. 387, 321-341.
Pearson, J. R. A. (1958). On convection cells induced by surface tension. J. Fluid Mech., Vol. 4, 489-500.
Rayleigh, Lord (1916). On convection currents in a horizontal layer of fluid, when the higher temperature is on the under side. Phil. Mag., Vol. 32, 529-546.
Tang, J. & Bau, H. H. (1993). Stabilization of the no-motion state in Rayleigh-Bénard convection through the use of feedback control. Physical Review Letters, Vol. 70, 1795-1798.
Tang, J. & Bau, H. H. (1994). Stabilization of the no-motion state in the Rayleigh-Bénard problem. Proceedings of the Royal Society A, Vol. 447, 587-607.
Wilson, S. K. (1993). The effect of a uniform magnetic field on the onset of steady Bénard-Marangoni convection in a layer of conducting fluid. J. Engng Math., Vol. 27, 161-188.
Wilson, S. K. (1993). The effect of a uniform magnetic field on the onset of Marangoni convection in a layer of conducting fluid. Q. Jl Mech. Appl. Math., Vol. 46, 211-248.
Wilson, S. K. (1994). The effect of a uniform magnetic field on the onset of steady Marangoni convection in a layer of conducting fluid with a prescribed heat flux at its lower boundary. Phys. Fluids, Vol. 6, 3591-3600.
35
BLA (Bipolar Laddering) applied to YouTube. Performing postmodern psychology paradigms in User Experience field
Marc Pifarré, Xavier Sorribas and Eva Villegas
Enginyeria Arquitectura La Salle, Universitat Ramón Llull, Spain
1. Introduction
The little details of products and services are often the most influential aspects in achieving a pleasant user experience. But how can we obtain those relevant little details? Normally, the topics of the user's response are predefined by the team who perform the test; users then only answer what a facilitator asks through the entrusted tasks. Once the test is done, the data is treated in an empirical way. But this conception of the test entails significant blind spots, since this model offers a low margin of spontaneous information generated by the users; of course, literal comments are noted, but subjective information is only considered as a support for empirical results. Classical usability testing methods are inspired by experimental psychology and based on the hypothetic-deductive paradigm, but another paradigm, coming from postmodern psychology, is also applicable in the user experience field: the Socratic paradigm. The Socratic paradigm departs from a non-objective perspective; it is nowadays often used in some post-modern psychology schools, which apply Socratic techniques for psychological exploration and treatment. This model is characterized by obtaining the information from the users themselves and not only from the observation of their behavior. The idea of shifting from the hypothetic-deductive model to the Socratic one is inspired by the change of paradigm made by constructivism and other post-modern psychotherapy schools in the clinical psychology field. Constructivism defends a subjective and Socratic treatment of the individual; the classic model, on the other hand, is based on the hypothetic-deductive paradigm, and so aims to obtain objective results using an empirical model.
2. Change of paradigm in clinical psychology
The key to the success of Socratic techniques in clinical psychology is mainly the will to adapt the therapy to the user. This vision is the opposite of the classical psychotherapy conception, which departs from the premise that the success of the treatment depends on the patient's adaptation to the therapy (the better users adapt, the better the results).
In order to adapt the therapy to the user, the constructivist model does not work with previously stipulated programs or specific guides regarding pathologies. This school applies techniques based on a maieutic conception rather than a hypothetic-deductive exercise. What characterizes the Socratic model in psychotherapy is that solutions tend to come from the client (user), not from the psychologist; the psychologist is an expert who helps the subject (user) to define problems properly and find the best solution. The information with which the psychologist works to achieve a psychological change is completely generated by the user; this way, the psychologist can be sure the information is always significant. The change to a constructivist paradigm in therapy means a change in the conception of the patient's recovery process. It is not the psychologist changing the patient, but the patient changing himself with the help of the psychologist.
Fig. 1. Socratic psychotherapy means a change in the relationship between psychologist and patient.
To get this effect, the psychologist does not predetermine or use closed systems, nor does he resolve problems or even give advice. In order to adapt the therapy to the client entirely, Socratic applications depart from a blank page; solutions are then built from a subjective treatment of the problem. By contrast, hypothetic-deductive applications use standards or rigid-content techniques, so the information worked with in the therapy will hardly adapt to the patient in all dimensions.
3. Adaptation of the Socratic paradigm to the User Experience field
Post-modern clinical psychology schools use Socratic techniques to explore the way in which the client interacts within their significant relationships, since the type of communication and behavior established with significant people is what determines the mental disorder. The paradigm shift performed in the UX field consists in focusing the study on the relationship between the person and the product, as the way the user interacts with a product or service is what gives the real value of that product or service. Techniques normally performed in usability and user experience studies are based on the hypothetic-deductive method. According to that model, we are able to obtain information from the user's interaction with the product and treat the resulting data in an empirical way. But
this paradigm can also present disadvantages, because we are only obtaining data about items previously defined by the person or team who performs the test. The way to use Socratic techniques in User Experience is to start from the minimum possible information. The test carried out must be as similar as possible to a bare page. Even though it may be a difficult and unorthodox task, the interviewer must consider the user as a real expert from whom they are getting advice. This model defends a participative product and test design, where all the elements presented in the interview are obtained from the users. Under no circumstances will interviewers suggest any element to the users; they can only work with the information generated by the users themselves, otherwise the subjective reliability will be negatively affected.
Fig. 2. The main key in a Socratic method is to establish an expert-to-expert relationship between user and interviewer.
It is certainly difficult to find more precise and complete observation tools than the users themselves. From their criteria it is possible to discover aspects of the product that would have taken months of inductive observation to detect as significant elements.
4. BLA (Bipolar Laddering): A Socratic Technique to Gather User Experience Information
Starting from a constructivist paradigm, the Bipolar Laddering (BLA) method is defined as a psychological exploration technique which points out the key factors of the user experience with a concrete product or service. This system allows knowing which concrete characteristics of the product cause the users' frustration, confidence or gratitude (among many others). The BLA method works on positive and negative poles to define the strengths and weaknesses of the product. Once an element is obtained, the laddering technique is applied to define the relevant details of the user experience. The object of a laddering interview is to uncover how product attributes, usage consequences, and personal values are linked in a person's mind. The characteristics obtained through the laddering application define the specific factors that make an element be considered a strength or a weakness. Once the element has been defined, the interviewer asks the user for a solution to the problem in the case of negative elements, or for an improvement in the case of positive elements.
4.1 BLA Performing
BLA performing consists of three steps:
1. Elicitation of the elements: The test starts with a blank template for the positive elements (strengths) and another, exactly the same, for the negative elements (weaknesses). The interviewer asks users to mention the aspects of the product they like best or that help them in their goals or usual tasks. The elements mentioned need to be summarized in one word or a short sentence.
2. Marking of elements: Once the list of positive and negative elements is done, the interviewer asks the user to score each one from 0 (lowest possible level of satisfaction) to 10 (maximum level of satisfaction).
3. Elements definition: Once the elements have been assessed, the qualitative phase starts. The interviewer reads out the elements of both lists to the user and applies the laddering interviewing technique, asking for a justification of each one of the elements (Why is it a positive element? Why this mark?). The answer must be a specific explanation of the concrete characteristics that make the mentioned element a strength or weakness of the product.
4.2 Results
There are two categories of resulting elements. Elements regarding poles:
1. Positive elements are those which the user perceives as strong points of the product: those which help them to work better or which make them feel good. These elements can be functional, esthetic or of any other type.
2. Negative elements are those which the user perceives as weak points of the product: those that hinder or slow down their actions, or simply are not pleasant at any level.
Both positive and negative elements are also classified as common and particular elements:
1. Common elements are strengths or weaknesses cited by more than one user. Common elements are established by comparing the definitions of each element. The data treatment must be qualitative to be reliable, so an overlap in the name of the element is not enough to establish a common element, owing to the difference in meaning that users can give to a similar term.
2. Particular elements are strengths or weaknesses that have been cited by only one of the users.
5. Case Study
As an example of a BLA application, a case study of the YouTube website is presented. The study was held in the summer of 2007 at La Salle, Ramon Llull University, with a sample of twelve people aged 18 to 35; all members of the sample were YouTube users.
5.1 Application
The test was done in the user experience laboratory (UserLab) of La Salle, Ramon Llull University. The resources for the test were an interviewer and one computer with an internet connection (Internet Explorer 7 and Firefox 2.0 were available for the user to navigate). The interviewer asked the user to navigate freely through the YouTube webpage and to mention which elements of the website were positive and which were negative, based on their experience. No specific time limit was given to users, although the average time to complete the test was one hour and a half.
5.2 Results
The results obtained using BLA are, as stated before, separated by the users into positive and negative elements and then revised to consider them common or particular elements. Sixteen positive common elements, twenty-three positive particular elements, seven negative common elements and eighteen negative particular elements were obtained. These results are presented in tables for quick reference. Common elements are numbered followed by a capital C; particular elements are numbered followed by a capital P.
1) Common elements
The two common-element tables are presented with further explanation of the information they give.
Table 1. Positive common elements
Aside from the average scores, which will be explained later, a mention index is obtained. This indicator shows the percentage of users that have cited an element. A high mention index means that the element is widely perceived by the users. In this case study, positive common element five has a mention index of 83%, which allows us to acknowledge that related videos is an element that users really take into account.
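As an illustration, both indicators can be computed directly from the raw interview records. The data structure below is hypothetical (it is not part of the original study), but the sample scores are those reported for negative common element 5C in Example 6 later in this chapter: an average of 2.25 and a mention index of 33% of twelve users.

from collections import defaultdict

def summarize_elements(records, n_users):
    # records: (element_id, user_id, score) triples gathered in the test.
    scores = defaultdict(list)
    for element, user, score in records:
        scores[element].append(score)
    summary = {}
    for element, s in scores.items():
        summary[element] = {
            'average_score': sum(s) / len(s),
            'mention_index': 100.0 * len(s) / n_users,  # % of users citing it
        }
    return summary

records = [('5C', 4, 4), ('5C', 5, 4), ('5C', 7, 0), ('5C', 9, 1)]
print(summarize_elements(records, n_users=12))
# {'5C': {'average_score': 2.25, 'mention_index': 33.3...}}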
Table 2. Positive common elements: individual user scores
This table shows how each common element is scored by the users who have mentioned it. These particular scores are also useful once merged with the definition of the element; as a result, a better understanding of how users perceive each element is obtained. The same type of results is also obtained for the negative elements.
Table 3. Negative common elements
As a result of the test, seven common negative elements were generated. The majority of the mention index results are found in a bracket between 33% and 50%, these being elements that are notably perceived by the users. As a relevant element we can see 1C, where 33% of the users mentioned the bad quality of the video on YouTube; the scores given to that element are also extremely low. When an element with a high mention index comes along with a low average score (such as 1C), it tends to indicate a relevant weak point of the product being tested.
Table 4. Negative common elements
2) Particular elements
A total of 41 particular elements was generated; these were split into 23 positive elements and 18 negative elements. Positive elements have a high average of 7.9, with 5 perfect scores found. This represents a high positive perception of the elements mentioned by the users. Negative elements have an average of 2.9, with two scores of 0. Seven elements were middle-scored, with assessments between 4 and 5.
Table 5. Positive particular elements
Each element score is expanded with the definition given by the user. This allows a precise understanding of each strength or weakness detected by the users.
5.3 Data Analysis
When reviewing elements, special emphasis must be placed on the scores, as they give a very good indication of how users perceive each element. When the definition attached to the score is analyzed, a precise comprehension of the element is achieved, making it possible to understand why the user has cited it.
1. Top and bottom scores
On positive elements, high scores tend to show key elements of why the user perceives the product as a pleasant experience. Those elements shouldn't be modified in a redesign process.
Example 1: Element 17 Particular Positive
Element: 17P | Description: With the video on pause it keeps buffering | User: 9 | Score: 10
Justification: Very useful, especially for slow connections; you let the video download and then you watch it.
This element was given a perfect score by User 9. Analyzing the qualitative record shows that having a slow internet connection is a determining factor behind the assessment of this element. On negative elements, very low scores denote elements that create high frustration for the user and really affect the experience they have with the product. Those elements should be taken into account in a redesign process.
Example 2: Element 11 Particular Negative
Element: 11P | Description: It doesn't allow me to view more than one video on the same page | User: 8 | Score: 1
Justification: You can't view the video and search on the same page; you have to open another page, and that's slow and tedious. Solution: Allow me to search while the video window keeps playing. The search could be on the side.
In this example we obtain a precise definition of the user's problem with the product, acknowledging what it entails for him. Asking the user for the solution of the problem allows a better understanding of the problem itself.
2. Middle scores
On the positive pole, middle scores tend to represent elements that the user appreciates and is pleased about but that do not impress him; this type of element tends to be taken for granted by the user.
Example 3: Element 10 Particular Positive
Element: 10P | Description: You can personalize your space | User: 6 | Score: 7
Justification: It allows you to get warnings when someone comments on a video you have uploaded. It helps to build a community. You can tag videos as favorites, and you and other people can see them in your space.
On negative elements, middle scores are present on elements that annoy the user and create an unpleasant experience without affecting the will to use the product.
Example 4: Element 18 Particular Negative
Element: 18P | Description: No censorship of violent videos | User: 12 | Score: 4
Justification: The fact that they are accessible to everybody can lead to people imitating what they see on the videos. Solution: Those videos should be erased permanently; it would make YouTube less democratic, but there's a minimum control necessary.
3. Paradoxical scores
Occasionally some users define a negative element with a high score or a positive element with a low score; in those cases users must corroborate it to ensure the reliability of the element assessment. These elements are defined as paradoxical scores and tend to be very interesting elements to examine. On positive elements, low scores tend to apply to elements that are perceived as bonus features that work badly. Users perceive them as a good thing in concept, but their final experience is not pleasant. The fact that they perceive it as a bonus feature is what explains why they don't cite it as a negative element.
Example 5: Element 12 Particular Positive
Element: 12P | Description: Bubble interface | User: 7 | Score: 3
Justification: It allows me to visualize the path that I follow on YouTube. It allows locating other videos which are in theory related, but they are not well related. Improvement suggested: You don't really understand how it works. It doesn't give any explanation. When you exit it doesn't save the relations. It's unintuitive... actually it is a piece of shit.
Fig. 3. YouTube bubble interface (this element has since been removed from the YouTube website)
4. Heterogeneous marks
In some common elements there is a high difference between the scores of the same element; this phenomenon is called heterogeneous marks. It tends to happen for two reasons. One is that users consider different specific characteristics of the same element, so they are really assessing different things. Checking the qualitative registers, it is possible to find out exactly what each user is assessing. The other reason for heterogeneous marks is a different level of affectation by this particular element: the same element can affect users in very different ways. To detect this kind of phenomenon in the results analysis, it is necessary to separate the user narrations for each element.
Example 6: Element 5 Common Negative
Element: 5C | Description: It doesn't allow downloading videos | Score: 2.25
User 4, Score: 4. Justification: It doesn't allow me to download videos and I have to use other programs. Solution: Give me the option to download it, and when I upload a video let me decide whether other users can download it or not.
User 5, Score: 4. Justification: The video format is particular to YouTube; you need an external tool to download the video. Solution: Just give me the option to download the video.
User 7, Score: 0. Justification: I want to have the videos. I want to be able to archive them, store them and be able to see the video in any situation. Solution: Allow me to download videos.
User 9, Score: 1. Justification: You can never store a video; you always have to connect to the internet. Solution: A download button next to the video, or when you have finished the video it gives the option to download it next to the replay options.
Chart 1. Individual user scores for common negative element 5
We encounter four users citing the same element, but when we examine the scores we find a significant difference between them. In this example there are two scores that are quite neutral on the subject and two that are extremely low. An analysis reveals that the users giving the bottom scores perceive downloading videos as a major necessity in the way they expect to use the product; as the narrations show, users 7 and 9 do not know about programs for downloading videos from YouTube. On the other hand, users 4 and 5 have solved the problem through external sources and do not rely on the YouTube website to solve it. It is also very interesting that the given solutions tend to be alike but have different particularities in the way they solve the problem. This information allows redesigning the product taking into account the real users' needs and desires.
6. Conclusions
Applying a Socratic method for obtaining information allows the discovery of subtle information about the product that is very difficult to achieve through classical methods. This methodological proposal not only argues for participative product design but also promotes the user's participation in the design of the test itself. From the basic results diagram, with its strong and weak points, we can establish a sophisticated system to define the significant points of the user experience regarding any kind of service or product. To get this type of results, a change of paradigm in the classical user experience field is being applied.
Fig. 4. Classical usability model. We change from observing the interaction between the user and the product to making the user generate information about the product.
Fig. 5. Socratic interviewing model
Contrary to the methodology used in classical usability, the user and the facilitator work together defining relevant factors of the product, co-creating the content of the test from a blank template. The participants have an expert role as users of the product, while the interviewer has an expert role as a test performer. The resulting data allows discovering a large amount of information about the relevant characteristics of the product and how and why they affect users. With the justification of each element and the score given to it, we can form a quite accurate idea of how the element is experienced by the user. Using the BLA method we try to see the product through the user's filter: through his perception, emotions, reasoning and value system, getting a close approach to the user experience. The aim of this method is not to observe the user through the product but the product through the user.
7. Acknowledgment
Special thanks to Oscar Tomico, Nuria Torras, and the UserLab team, who made this study possible. We would also like to thank the attendees of the UEB workshop for their help and support.
36
Improving the efficiency of Runge-Kutta reintegration by means of the RKGL algorithm
Justin S. C. Prentice
Department of Applied Mathematics, University of Johannesburg, Johannesburg, Republic of South Africa
1. Introduction
Initial-value problems (IVPs) of the form

dy/dx = f(x, y),   y(x₀) = y₀,   (1)
often arise in the simulation of physical systems, particularly if the system is time-dependent. The numerical solution of these problems is often achieved using an explicit Runge-Kutta (RK) method (Shampine, 1994; Butcher, 2000; Hairer et al., 2000), which typically involves a large number of evaluations of the function f(x,y). This is the most significant factor determining the efficiency, or lack thereof, of the RK method. If it is necessary to control the global error in the numerical solution, it may be necessary to use a reintegration algorithm, which, as will be shown, requires at least three applications of RK to the problem. Thus, it is easy to see that a reintegration process could be inefficient if a large number of evaluations of f(x,y) is required. We note here that the more accurate an RK method is, the greater the number of evaluations of f(x,y) required. Demanding a very strict global tolerance via reintegration could be particularly expensive in terms of computational effort. For this reason, local error control (to be discussed later) is preferred over global error control and, consequently, very little work has been done regarding reintegration. Indeed, Shampine has recently commented that such schemes "…are generally thought to be too expensive…for production codes" (Shampine, 2005). To the best of our knowledge, no attempt has been made to improve the efficiency of RK methods with regard to reintegration, which is the topic of this paper. We have developed a numerical method for nonstiff IVPs, designated RKGL, which is a modification of a standard explicit RK method. The RKGL method has been designed to reduce the number of function evaluations, relative to the underlying RK method. The method also has an enhanced global order, relative to the RK method, which is a very powerful mechanism for improving efficiency in the context of reintegration. In this article we will show (a) how RKGL can be used to enhance the performance of the reintegration algorithm, in comparison with RK methods, and (b) how RKGL can achieve better accuracy than RK for equal computational effort. Additionally, we will introduce a
reintegration algorithm that we believe is original in nature. We must stress, however, that our emphasis is on the relative efficiencies of the two methods, and the reintegration algorithm used here is intended primarily as a vehicle to facilitate such efficiency analysis.
2. Scope and Structure
The objective of this article is to demonstrate the improvement in efficiency and accuracy of the RKGL algorithm relative to the underlying RK method, and this will be achieved via theoretical arguments and numerical examples. However, it is necessary that we also describe the RKGL algorithm in some detail, and indicate a few important properties of the algorithm. We will also describe the general idea of reintegration as a mechanism for global error control. Some mathematical detail is inevitable, but in this regard we will be as economical as possible. These discussions will be presented entirely in the next section, with the sections thereafter devoted to efficiency analysis and numerical work.
3. Relevant Concepts
In this section, we describe concepts relevant to the current paper, and introduce appropriate terminology and notation.
3.1 Runge-Kutta Methods
To solve an IVP of the form in (1) on an interval [a,b], using an explicit RK method, requires that a set of discrete nodes {x_i ; i = 0, 1, …, N}, with x₀ = a and x_N = b, be defined on [a,b]. The numerical solution y_{i+1} at the node x_{i+1} is obtained via the explicit RK method by

y_{i+1} = y_i + h_i Σ_{j=1}^{s} β_j k_j ≡ y_i + h_i F(x_i, y_i),   (2)

where

k_1 = f(x_i, y_i)
k_2 = f(x_i + γ_2 h_i, y_i + h_i α_{2,1} k_1)
⋮
k_s = f(x_i + γ_s h_i, y_i + h_i [α_{s,1} k_1 + α_{s,2} k_2 + … + α_{s,s−1} k_{s−1}]).   (3)
In these equations, the function F(x,y) has been implicitly defined, and the various coefficients , , are specific to the particular RK method being used. The parameter hi is the spacing between xi and xi+1, and is known as the stepsize. We label the stepsize with a subscript since, in general, it need not have uniform magnitude. The parameter s indicates the number of stages of the method; each stage requires an evaluation of f(x,y). An RK method is known as a one-step method since the solution yi+1 is computed using information at the previous node only. We note here that the RK method is termed explicit since yi+1 does
not appear on the right-hand side of (2); if it does, the method is termed implicit. In the remainder of this article, the abbreviation RK indicates an explicit method. RK methods are known to be consistent, convergent and stable with respect to roundoff error (zero-stable). Moreover, an RK method of order r, denoted RKr, has global error
Δ_i = |y_i − y(x_i)| = O(h^r),   (4)
where y(x_i) is the exact value at x_i, and h is the average stepsize on [a,b]. It is always true that s ≥ r, so that greater accuracy in an RK method implies greater computational effort. There does exist a class of RK methods, known as embedded methods, that offer greater efficiency if a lower-order and a higher-order method need to be used in tandem (Butcher, 2003). Such scenarios arise typically in error control algorithms, as will be described later. From (2) we see that an RK method requires a linear combination of its stages; an embedded method has the property that two different linear combinations of the same stages yield two methods of different order. We usually speak of an RK(r,q) pair, which is an embedded method containing both RKr and RKq.
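A minimal sketch of one step of a generic explicit RK method, written directly from (2) and (3), follows; the classical four-stage RK4 tableau is supplied only as one familiar example of the coefficients α, β, γ (it is not one of the specific methods used later in this chapter).

import numpy as np

def rk_step(f, x, y, h, alpha, beta, gamma):
    # One explicit RK step, eqs. (2)-(3): alpha is the strictly
    # lower-triangular stage matrix, beta the weights, gamma the abscissae.
    s = len(beta)
    k = np.zeros(s)
    for j in range(s):
        k[j] = f(x + gamma[j] * h, y + h * np.dot(alpha[j, :j], k[:j]))
    return y + h * np.dot(beta, k)

# Classical RK4 coefficients (s = 4, r = 4):
alpha = np.array([[0.0, 0.0, 0.0, 0.0],
                  [0.5, 0.0, 0.0, 0.0],
                  [0.0, 0.5, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
beta = np.array([1.0, 2.0, 2.0, 1.0]) / 6.0
gamma = np.array([0.0, 0.5, 0.5, 1.0])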
3.2 Gauss-Legendre Quadrature
Gauss-Legendre (GL) quadrature is an algorithm for numerically evaluating the integral of a continuous function (Burden & Faires, 2001). Indeed, we have

∫_u^v f(x, y(x)) dx ≈ h Σ_{j=1}^{m} C_j f(x_j, y(x_j))   (5)
for m-point GL quadrature (denoted GLm). Here, the x_j are m nodes on [u,v] (given by the roots of the mth-degree Legendre polynomial on [−1,1], translated to [u,v]); h is the average separation of these nodes; and the C_j are appropriate weights. GL quadrature is open quadrature in the sense that the quadrature nodes are interior to the interval of integration [u,v]. In (5) we write y(x_j), although in the context of the RKGL algorithm we actually use y_j. The error associated with GLm quadrature is O(h^{2m+1}).
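A short sketch of (5) using library Gauss-Legendre nodes and weights; the standard weights on [−1,1], rescaled by (v − u)/2, play the role of the factor written as hC_j in (5).

import numpy as np

def gl_quadrature(g, u, v, m):
    # m-point Gauss-Legendre approximation of the integral of g over [u,v].
    t, w = np.polynomial.legendre.leggauss(m)  # nodes/weights on [-1,1]
    x = 0.5 * (v - u) * t + 0.5 * (v + u)      # translate nodes to [u,v]
    return 0.5 * (v - u) * np.dot(w, g(x))

# GLm is exact for polynomials of degree 2m-1; m = 3 integrates x^5 exactly:
print(gl_quadrature(lambda x: x**5, 0.0, 1.0, 3))  # 1/6 to machine precision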
3.3 The RKrGLm Algorithm
The RKrGLm algorithm (Prentice, 2008; Prentice, 2009) is best described with reference to Figure 1. The interval of integration [a,b] is subdivided into N′ subintervals, denoted H_j, where j = 1, 2, …, N′. On each subinterval we define m nodes suitable for GLm quadrature. The numerical solution at these nodes (indicated RK in the figure) is obtained using RK; the solution at the endpoint of each subinterval H_j is determined using GLm. Defining p ≡ m + 1, we have

y_{i+1} = y_i + h_i F(x_i, y_i),   (6)

where i = (j−1)p, (j−1)p + 1, …, (j−1)p + m − 1 at the RK nodes, and
y_{jp} = y_{(j−1)p} + h Σ_{i=(j−1)p+1}^{(j−1)p+m} C_i f(x_i, y_i)   (7)
at the GL nodes.
Fig. 1. Schematic depiction of the RKrGLm algorithm.
We have shown that RKrGLm is consistent, convergent and zero-stable. It is clear that RKrGLm does not require any evaluations of f(x,y) at each GL node {x_p, x_{2p}, etc.}, which is a clear reduction in computational effort. Furthermore, the global error in RKrGLm has the form
Δ_i = A_i h^{r+1} + B_i h^{2m} = O(h^{min{r+1, 2m}}),   (8)
so that if we choose r and m such that 2m ≥ r + 1, the global error is of order h^{r+1}, which is one order better than the underlying RKr method. In Figure 2 we show, for example, the global error for RK5 and RK5GL3 applied to a test problem (which will be described later). The growth of the RK error is evident, whereas the RKGL error is quenched (at each GL node; see Figure 1), thus slowing its rate of growth. The technicalities of this error quenching need not concern us here; it is due to the factor h in (7). Rather, this example shows how RKGL improves the global accuracy of its underlying RK method. Furthermore, since there is no need to evaluate f(x,y) at each GL node, the RK5GL3 calculation in this example required less computational effort than RK5. Indeed, the relative computational effort for this example is 3/4, a saving of some 25%, and, furthermore, the maximum global error is about five times smaller.
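The following sketch assembles one RKrGLm subinterval from (6) and (7). Here rk_step_fn is assumed to be any explicit RK stepper with signature (f, x, y, h) -> y_next, such as the rk_step of section 3.1 with its tableau arguments bound (e.g. via functools.partial); it is an illustration of the structure, not the authors' code.

import numpy as np

def rkgl_subinterval(f, x_lo, x_hi, y_lo, m, rk_step_fn):
    # Interior GL nodes of H_j = [x_lo, x_hi].
    t, w = np.polynomial.legendre.leggauss(m)
    x = 0.5 * (x_hi - x_lo) * t + 0.5 * (x_hi + x_lo)
    # RK sweep through the interior nodes, eq. (6).
    x_cur, y_cur = x_lo, y_lo
    y_at_nodes = []
    for xi in x:
        y_cur = rk_step_fn(f, x_cur, y_cur, xi - x_cur)
        x_cur = xi
        y_at_nodes.append(y_cur)
    # GL update for the endpoint, eq. (7): no f-evaluation at x_hi needed.
    fvals = np.array([f(xi, yi) for xi, yi in zip(x, y_at_nodes)])
    return y_lo + 0.5 * (x_hi - x_lo) * np.dot(w, fvals)

In an actual implementation, the f(x_i, y_i) values at all but the last interior node coincide with the first stage of the following RK step, so only one extra evaluation (at the mth node) is incurred; this is consistent with the ms + 1 count used in section 4 below.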
3.4 Local Error Control (LEC)
We must discuss the concept of local error control (Hairer et al., 2000), since it plays a role in the reintegration process. For an RKr method, we define the local error at x_{i+1} by

ε_{i+1} = [y(x_i) + h_i F(x_i, y(x_i))] − y(x_{i+1}) = O(h_i^{r+1}).   (9)
Note that the local error is one order higher than the global error. Note also that the exact value y(xi) is used in the RK method in the square brackets. As such, the local error is the RK
error made on the subinterval [x_i, x_{i+1}], assuming that the exact solution is known at x_i.

Fig. 2. Global errors for RK5 and RK5GL3. The greater accuracy of RK5GL3 is clear.

The local error is controlled by adjusting the stepsize h_i. If we assume

ε_{i+1} = L_{i+1} h_i^{r+1},   (10)
where L_{i+1} is the local error coefficient, then control of the local error clearly requires a good estimate of L_{i+1}. We will consider a type of error estimation that requires the use of two RK methods of differing order, such as an RK(r,q) pair. This will be described in section 3.5.2.
3.5 Reintegration
Reintegration relies on the fact that as the stepsize tends to zero, so the global error also tends to zero. A numerical method that exhibits this property is said to be convergent. Since RK and RKGL are known to be convergent, they are suitable for the application of a reintegration algorithm. A reintegration algorithm typically consists of three phases:
1. Determining a node distribution in a systematic way.
2. Error estimation at these nodes.
3. Reintegration using a refined node distribution.
Phase 1 involves LEC and will be discussed at the end of this section. For now, we assume that a numerical solution has been obtained using some uniform stepsize h, which we term the initial stepsize. Assume also that a good estimate of the global error at each node has been made. If a tolerance δ is imposed, then a new stepsize h* is determined from
h* = 0.9 (δ / max_i{G_i})^{1/r},   (11)
where Gi is the constant of proportionality in (4), the so-called global error coefficient (this coefficient is dependent on x, but not on h). The factor 0.9 is a ‘safety factor’ to cater for the
possibility of underestimating the magnitude of G_i. The RK method is then reapplied using this new stepsize. Clearly, the need to reapply the method is the source of the computational effort that we hope to reduce using RKGL. However, the control mechanism described here seeks to control the absolute error, whereas it is the relative error

|y_i − y(x_i)| / |y_i|   (12)

that we should rather attempt to control, since finite-precision computing devices distinguish between numerical values in a relative sense only. Hence, we have

|y_i − y(x_i)| / |y_i| = (G_i / |y_i|) h^r ≤ max_i{G_i / |y_i|} h^r.   (13)

Now, if we impose a tolerance δ on the relative error, we have the condition

|y_i − y(x_i)| / |y_i| ≤ δ  ⇒  |y_i − y(x_i)| ≤ δ |y_i|,   (14)
which becomes problematic when y_i is very close to zero (because the stepsize that would then be required would be intolerably small). To counter this, we replace the term on the right in (14) by max{δ_A, δ_R |y_i|}, where δ_A is a so-called absolute tolerance and δ_R is a relative tolerance, and these tolerances are not necessarily the same. Hence, the tolerance is δ_A when δ_R |y_i| < δ_A, and δ_R |y_i| otherwise. The expression for h* now becomes

h* = 0.9 (δ_R / G^M)^{1/r},  where  G^M = max_i { G_i / max{δ_A/δ_R, |y_i|} }.   (15)
This ensures that G_i is never divided by a number smaller than δ_A/δ_R and that, under this condition, the denominator is maximized, which, of course, yields the smallest upper bound for h*. We often use a uniform tolerance δ, as in δ = δ_A = δ_R.
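A direct transcription of (15), assuming the error coefficients G_i have already been estimated at the nodes (for instance, as in (16) below); this is a sketch rather than production code.

import numpy as np

def refined_stepsize(G, y, delta_A, delta_R, r):
    # Eq. (15): scale each G_i by max(delta_A/delta_R, |y_i|), take the
    # worst case over the nodes, and apply the 0.9 safety factor.
    GM = np.max(np.asarray(G) / np.maximum(delta_A / delta_R, np.abs(y)))
    return 0.9 * (delta_R / GM) ** (1.0 / r)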
3.5.1 Global Error Estimation
It is clear that the estimate of the global error (Phase 2) is a very important part of the reintegration algorithm. The ability to make a reliable estimate of this quantity is crucial to the effectiveness of the algorithm. The favoured approach is to use a method of higher order to obtain a solution that is then held to be more accurate than that obtained with the lower-order method. If y_i^(r) and y_i^(q) denote lower-order (RKr) and higher-order (RKq) solutions, respectively, at x_i, then we have

y_i^(r) − y_i^(q) = [y_i^(r) − y(x_i)] − [y_i^(q) − y(x_i)] = G_i^(r) h^r − G_i^(q) h^q ≈ G_i^(r) h^r.   (16)
Here, y(x_i) is the exact solution, G_i^(r) and G_i^(q) indicate global error coefficients for the lower- and higher-order methods, respectively, and q is the order of the higher-order method (obviously, q > r). The approximation in (16) holds if h is sufficiently small. In other words, if methods of differing order are used to obtain numerical solutions at each node on the interval of integration, then (16) can be used to estimate the global error coefficient in the lower-order solution, which is then used in (11) or (15) to determine h*. This method of reintegration, using two methods of differing order, will form the basis of the reintegration algorithm, designated LHR, that we will use in this paper. The LHR algorithm is based on the reintegration process described in section 3.5, and from this point onwards we will refer generically to that reintegration process as LHR, culminating in a complete description of LHR in section 5.3. Until now, we have assumed that the initial stepsize h is known, and we have described the reintegration algorithm accordingly. But how does one choose a suitable initial stepsize h? If h is too large, the asymptotic expressions for the error, as in (4), become unreliable, since higher-order terms in the error expansion make significant contributions; if h is too small, we might use many more nodes than is necessary, which is inefficient; and if h is extremely small, problems in the RK method itself, due to roundoff, will persist. A possible solution to this problem involves the use of LEC before estimating the global error. This approach allows an appropriate stepsize to be determined systematically. Before discussing this, we must describe error estimation in the local error control procedure.
3.5.2 Local Error Estimation via Local Extrapolation
Local error estimation can be achieved by using a higher-order method, say RKq, in a similar manner to that described previously. Indeed, we have

h_i* = 0.9 ( max{δ_A, δ_R |y_{i+1}|} / L_{i+1} )^{1/(r+1)},   (17)

into which relative LEC has been incorporated, and where L_{i+1} is estimated from

y_{i+1}^(r) − y_{i+1}^(q) = L_{i+1}^(r) h_i^{r+1} − L_{i+1}^(q) h_i^{q+1} ≈ L_{i+1}^(r) h_i^{r+1}.   (18)
We have retained the subscript on the stepsize to emphasize that, due to the nature of this type of LEC, the stepsize can vary from subinterval to subinterval – it does not need to be constant, as we have assumed previously. We point out that if the local error estimate is less
than or equal to the tolerance, then no stepsize adjustment is necessary, and we proceed to the next subinterval. Nevertheless, it is certainly possible to determine a new stepsize even if the error estimate satisfies the tolerance, and a new solution using this stepsize can be determined. We refer to this procedure as forced LEC (FLEC). In this case, the resultant stepsizes on [a,b] are all consistent with the desired tolerance; none of them is any smaller than necessary. A particularly important feature of this algorithm concerns the propagation of the higher-order solution. Since the algorithm relies on the exact solution being known at x_i, as in (9), we must always use the more accurate RKq solution at x_i as input for both the RKr and RKq methods. Hence, the RKr and RKq solutions at x_{i+1} are both generated using y_i^(q). This feature is known as local extrapolation (LeVeque, 2007).
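A sketch of one locally extrapolated, error-controlled step, following (17) and (18); rk_lo and rk_hi are assumed steppers of order r and q > r with signature (f, x, y, h) -> y_next, and the forced flag switches on the FLEC behaviour just described.

def lec_step(f, x, y, h, rk_lo, rk_hi, r, delta_A, delta_R, forced=False):
    # Local extrapolation: both methods take the higher-order value y as input,
    # and the higher-order result is the one propagated.
    while True:
        y_lo = rk_lo(f, x, y, h)
        y_hi = rk_hi(f, x, y, h)
        err = abs(y_lo - y_hi)                      # ~ L_{i+1} h^{r+1}, eq. (18)
        tol = max(delta_A, delta_R * abs(y_hi))
        L = max(err / h ** (r + 1), 1e-16)          # guard against err = 0
        h_new = 0.9 * (tol / L) ** (1.0 / (r + 1))  # eq. (17)
        if err <= tol and not forced:
            return x + h, y_hi, h_new               # accept; suggest next stepsize
        if err <= tol and forced:
            # FLEC: recompute once with the tolerance-consistent stepsize.
            return x + h_new, rk_hi(f, x, y, h_new), h_new
        h = h_new                                   # reject and retry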
3.5.3 Starting stepsize
To implement either the LEC or FLEC algorithm, it is necessary to estimate a starting stepsize h₀. A practical way to do this is to assume L₁ = 1, so that

h₀ = δ_loc^{1/(r+1)}   (19)
if δ_loc is the desired accuracy in the local error.
3.5.4 Initial Stepsize Estimation
The idea here is to use FLEC with a moderate tolerance to obtain a node distribution, and then to compute an average stepsize that can be used in Phase 2 of the reintegration algorithm. This is the so-called initial stepsize referred to previously, not to be confused with the starting stepsize described in the previous section. We propose that, if ultimately a global tolerance of δ is required, then a local tolerance of δ_loc = √δ should be used (we are assuming, of course, that δ < 1). Assume that this results in a non-uniform node distribution {x₀, x₁, x₂, …, x_N}, which has an average stepsize h. We now assume that the solution has a maximum global error of Nδ_loc, so that an error coefficient may be determined from G = Nδ_loc / h^r. This allows a new stepsize to be determined, as in (15), where we treat δ_loc as a global tolerance, and this new stepsize is then used as the initial stepsize.
In this section, we have described the principles behind typical reintegration, with reference to the RK method. In the next section, we study the efficiency of reintegration, including RKGL-based reintegration.

4. Efficiency Analysis
Here, we will present a theoretical analysis of the relative efficiencies of the LHR reintegration method discussed previously, for both RK and RKGL. We will attempt to count both the number of evaluations of f(x,y) and the number of arithmetical operations involved in each method. In the next section, we will present numerical work demonstrating this efficiency analysis. To facilitate our efficiency analysis, we define various symbols in Table 1.
Symbol | Meaning
D | Length of interval of integration [a,b]
ΔRK | Global error using RKr
ΔRKGL | Global error using RKrGLm
N1 | Number of nodes on [a,b] for RKrGLm, excluding first node
N2 | Number of nodes on [a,b] for RKr, excluding first node
F1 | Number of evaluations of f(x,y), per node, using RKrGLm
F2 | Number of evaluations of f(x,y), per node, using RKr
Af | Number of arithmetic operations in f(x,y)
A1 | Number of arithmetic operations, per node, using RKrGLm
A2 | Number of arithmetic operations, per node, using RKr
Φ1 | N1F1 = total number of function evaluations on [a,b], using RKrGLm
Φ2 | N2F2 = total number of function evaluations on [a,b], using RKr
Ψ1 | N1A1 = total number of arithmetic operations on [a,b], using RKrGLm
Ψ2 | N2A2 = total number of arithmetic operations on [a,b], using RKr
Table 1. Definition of miscellaneous symbols to be used in the efficiency analysis.
Consider the first subinterval H_1 for RKrGLm (see Figure 1). There are m + 1 nodes on this interval at which numerical solutions must be determined (it is not necessary to determine y₀, since this is the given initial value). RKr, with s stages, is applied to find solutions at m of these nodes, and the GLm algorithm requires the evaluation of f(x,y) at the mth node itself. This means that ms + 1 function evaluations are required by RKrGLm on H_1. This holds for all subsequent subintervals. Since there are m + 1 nodes on each subinterval (we exclude the first node, since we regard it as being part of the previous subinterval), we have
F₁ = (ms + 1)/(m + 1),   F₂ = s,   (20)
where the expression for F₂ is due to the fact that RKr is an s-stage method. Note that F₁ < F₂ for s > 1, and that F₁ = F₂ only when s = 1. Referring to the RKr method in (2) and (3), we see that multiplication of f(x,y) by h is performed s − 1 times; in the argument of f(x,y), multiplication by h occurs s − 1 times, and addition of γh to x occurs s − 1 times; in the y-component of f(x,y) there are s(s − 1)/2 multiplications and an equal number of additions; and finally, in (2), there are s multiplications and s additions. This all gives

A₂ = s² + 4s − 2 + sA_f.   (21)
For RKrGLm on each subinterval H_i, we use RKr m times and then compute the solution at the endpoint (see (5)) by means of one evaluation of f(x,y), m multiplications (by the weights), m − 1 additions, one multiplication by h and one more addition. Hence,

A₁ = [m(s² + 4s − 2 + sA_f) + 2m + 1 + A_f] / (m + 1) = [mA₂ + 2m + 1 + A_f] / (m + 1).   (22)
Note that

F₁ = Ω₁A₁,   F₂ = Ω₂A₂,   (23)

where

Ω₁ = Ω₁(m, s, A_f) = (ms + 1)/(mA₂ + 2m + 1 + A_f),   Ω₂ = Ω₂(s, A_f) = s/(s² + 4s − 2 + sA_f)   (24)
are method- and problem-dependent proportionality constants. Since we are interested in comparing the efficiencies of the two methods, we must consider the ratios of their respective arithmetical operations, function evaluations and global errors. Hence, we define the quantities
R_F = Φ₁/Φ₂,   R_A = Ψ₁/Ψ₂,   R_Δ = Δ_RKGL/Δ_RK.   (25)
The first of these is the ratio of the total number of function evaluations, and the second is the ratio of the total number of arithmetical operations, over the whole interval [a,b]. As such, these ratios measure the relative efficiency of the two methods. The third ratio is that of the maximum global errors for the two methods. For R_Δ we have

R_Δ = (G₁ h₁^{r+1})/(G₂ h₂^r) = (G₁/G₂) θ^r h₁,   (26)

where

θ ≡ h₁/h₂ = N₂/N₁,   h₁ = D/N₁,   h₂ = D/N₂.   (27)
We see that R_Δ must tend to zero as h₁ tends to zero; this simply means that as Δ_RKGL is made smaller (by reducing h₁), so RKGL inevitably becomes more accurate than RK. In (26), the coefficients G₁ and G₂ can be absolute error coefficients, as in (11), or relative error coefficients, such as G^M in (15). To make a sensible comparison of the two methods, we require that the global error (absolute or relative) of each satisfy some user-defined tolerance δ, and then compute the ratio R_F. If both methods satisfy the tolerance, we have Δ_RKGL = Δ_RK = δ, so that R_Δ = 1. Hence,

R_Δ = 1  ⇒  (G₁ h₁ / G₂)^{1/r} = h₂/h₁ = N₁/N₂.   (28)
Using Φ₁ = N₁F₁, Φ₂ = N₂F₂ and (27), we find

R_F = (N₁F₁)/(N₂F₂) = (F₁/F₂)(G₁h₁/G₂)^{1/r} = (F₁/F₂)(G₁/G₂)^{1/r}(Δ_RKGL/G₁)^{1/(r²+r)},   (29)

where we have used

Δ_RKGL = G₁h₁^{r+1}  ⇒  h₁^{1/r} = (Δ_RKGL/G₁)^{1/(r²+r)}.   (30)
So we have

R_F ∝ δ^{1/(r²+r)}.   (31)

This is an important result. It shows that as we seek stricter tolerances, so R_F becomes smaller, i.e. RKrGLm becomes ever more efficient than RKr. For example, if r = 4 and δ = 10⁻⁵, then R_F ≈ 0.56, and if δ = 10⁻¹⁰, then R_F ≈ 0.32. We are now in a position to study the efficiency of the LHR method. Such efficiency will be studied in terms of arithmetical operations; by virtue of the linear relation (23), our analysis holds for function evaluations as well.
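The scaling in (31) is easy to tabulate; note that the constant of proportionality, which involves F₁/F₂ and the error coefficients appearing in (29), is omitted here.

for r in (2, 3, 4):
    for delta in (1e-2, 1e-5, 1e-10):
        # delta ** (1 / (r^2 + r)) is the tolerance-dependent factor in (31).
        print(r, delta, round(delta ** (1.0 / (r * r + r)), 3))
# For r = 4 this gives ~0.56 at delta = 1e-5 and ~0.32 at delta = 1e-10,
# the values quoted above.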
4.1 Efficiency of the LHR Algorithm
We have described the essentials of the LHR algorithm with regard to RK; the implementation using RKGL is similar. We use two methods, RKrGLm and RKqGLm, where the latter has higher order (i.e. r < q, as before). Note that both methods use GLm quadrature, meaning that they have a common node distribution. Hence, global error coefficients can be determined at each node, and a new stepsize satisfying a tolerance δ can be found. Assume that an initial average stepsize h₁ is used to find numerical solutions using RKrGLm and RKqGLm (i.e. we are considering Phase 2 of LHR). We refer to an average stepsize since the RKGL nodes are not uniformly spaced. Hence, we have

Ψ₁^r = N₁A₁^r,   Ψ₁^q = N₁A₁^q,   (32)
where the superscripts indicate RKrGLm or RKqGLm. A new stepsize h₁* gives a new node distribution of N₁* nodes, where

N₁* = D/h₁*,   (33)

so that

Ψ₁^{r*} = N₁* A₁^r.   (34)
This is the number of arithmetic operations required by the reintegration phase of the LHR algorithm (Phase 3). The total number of arithmetic operations, for Phases 2 and 3, in the RKGL implementation of LHR is then
Ψ₁ = Ψ₁^r + Ψ₁^q + Ψ₁*.   (35)
Analogous quantities may be derived for the RKr and RKq implementation, using the same initial stepsize h₁ (for RK this stepsize is uniform), giving

Ψ₂ = Ψ₂^r + Ψ₂^q + Ψ₂*.   (36)
It must be noted that the new stepsize h₂* is not necessarily equal to h₁*. Now consider

Ψ₁^r/Ψ₂^r = A₁^r/A₂^r,   Ψ₁^q/Ψ₂^q = A₁^q/A₂^q,   Ψ₁*/Ψ₂* = (N₁* A₁^r)/(N₂* A₂^r).   (37)
The first two of these ratios give the relative efficiencies (in terms of arithmetical operations) of RKrGLm and RKr, and of RKqGLm and RKq, as regards the error estimation phase of LHR (Phase 2). The third ratio gives the relative efficiency of RKrGLm and RKr as regards Phase 3 of LHR. Since RKGL requires fewer arithmetical operations per node than RK, the first two ratios are certainly never greater than one. The factor N₁*/N₂* in the third ratio leads, as in (29)−(31), to

Ψ₁*/Ψ₂* ∝ δ^{1/(r²+r)}.   (38)
All of this shows that the RKGL implementation of LHR must become more efficient than the RK implementation as stricter tolerances are imposed. Additionally, it is possible to include a quality control mechanism in LHR. In such a mechanism, we use RKqGLm or RKq to obtain a numerical solution using the stepsize h₁* or h₂*. This solution can then be used to check the error in the solution obtained with RKrGLm or RKr. If this quality control reveals that the imposed tolerance has not been satisfied, then a new stepsize can be determined and further reintegration can be done. Quality control necessarily requires N₁*A₁^q or N₂*A₂^q additional arithmetical operations. Additional reintegration, if required, may be regarded as a fourth phase of LHR, and will require more than N₁*A₁^r + N₁*A₁^q or N₂*A₂^r + N₂*A₂^q additional arithmetical operations.
4.2 Initial Stepsize
To implement FLEC using RKr and RKq requires A₂^r + 2A₂^q arithmetical operations per node. Using RKrGLm and RKqGLm requires A₁^r + 2A₁^q operations per node. Hence, the RKGL implementation is more efficient than the RK implementation. Of course, for the sake of comparison, we have assumed, as previously, that the average stepsize for the two methods is the same. This is not unreasonable, since both methods (RKr and RKrGLm) have the same order (r + 1) in their local errors.
A note with regard to RKGL local error control: consider the subinterval H₁. We obtain solutions at the nodes using RKrGLm and RKqGLm, with y₀ as input. We estimate the error at each node; effectively, this is an estimate of the global error on H₁. We assume that these errors are proportional to h₁^{r+1}, where h₁ is the average stepsize on H₁, and then determine a new stepsize h₁*. The length of the new subinterval is (m + 1)h₁*. RKqGLm is then used to find solutions at the nodes on this new subinterval, and the solution so obtained at x_p is used as input for RKrGLm and RKqGLm on the next subinterval H₂. In a sense, then, we control global error per subinterval, with local extrapolation at the endpoint of each subinterval, exploiting the fact that the global order of RKrGLm is the same as the local order of RKr. It is clear from the foregoing analysis that the three-phase reintegration algorithm will most likely be more efficient when implemented using RKrGLm than when using RKr. This is due partly to the design of RKrGLm, through which fewer arithmetical operations per node are required, and partly to the higher global order of RKrGLm. In the next section we will demonstrate this superior efficiency.
4.3 Accuracy for Equal Effort
It is instructive to consider the accuracy of the two methods, assuming equal computational effort. In such a case we have R_A = 1, and so

R_A = 1  ⇒  N₁A₁ = N₂A₂  ⇒  h₁ = (A₁/A₂) h₂.   (39)
Using Δ_RKGL = G₁h₁^{r+1} and Δ_RK = G₂h₂^r gives

Δ_RKGL = (A₁/A₂)^{r+1} (G₁ / G₂^{1+1/r}) Δ_RK^{1+1/r},   (40)

which shows that the RKGL error is improved, relative to the RK error, by a factor of Δ_RK^{1/r}.
5. Numerical Examples
We will use the equations

dy/dx = (y/4)(1 − y/20) on [0, 20], with y(0) = 1,
dy/dx = y on [0, 10], with y(0) = 1,   (41)

to demonstrate the theoretical results obtained in the previous section. These have the solutions

y(x) = 20/(1 + 19e^{−x/4})  and  y(x) = eˣ,   (42)
respectively. The first of these (which we call P1) is one of the test equations used by Hull et al. (1972), and is also the equation that we used to generate the plots in Figure 2. The second (P2) is the Dahlquist equation (with λ = 1). We will use three sets of values for r and q: (r,q) = (2,3), (3,4) and (4,5). In other words, we have the pairs of methods RK2 and RK3, RK3 and RK4, and RK4 and RK5. In each case, the higher-order method is used for error estimation. Moreover, the methods RK2, RK3 and RK4 are independent, not embedded, so that RK2 and RK3 constitute a tandem pair, as do RK3 and RK4 (Butcher, 2003). The pair RK4 and RK5 is an embedded pair, due to Fehlberg (Kincaid & Cheney, 2002), and so, to avoid notational confusion, we will indicate this pair by RKF4 and RKF5. Note that RK4 and RKF4 are not the same. The corresponding RKGL pairs are RK2GL2 and RK3GL2, RK3GL3 and RK4GL3, and RKF4GL3 and RKF5GL3. Note that the choice of m in each case is the smallest m such that r + 1 ≤ 2m, subject to the condition that m must be the same for both methods within any given pair. To begin with, we determine the quantities F1, F2, A1, A2, Ω1 and Ω2 for these methods, using Af = 4 (see the RHS of P1 in (41)). These are all shown in Tables 2 and 3. In these tables, s denotes the number of stages in the RK method.

s | F1 (m = 2) | A1 (m = 2) | F1 (m = 3) | A1 (m = 3) | F2 | A2
2 | 1.67 | 15.00 | 1.75 | 16.25 | 2 | 18
3 | 2.33 | 23.67 | 2.50 | 26.00 | 3 | 31
4 | 3.00 | 33.67 | 3.25 | 37.25 | 4 | 46
5 | 3.67 | 45.00 | 4.00 | 50.00 | 5 | 63
6 | 4.33 | 57.67 | 4.75 | 64.25 | 6 | 82
Table 2. Number of function evaluations and arithmetical operations per node, with Af = 4.

s | Ω1 (m = 2) | Ω1 (m = 3) | Ω2
2 | 0.1111 | 0.1077 | 0.1111
3 | 0.0986 | 0.0962 | 0.0968
4 | 0.0891 | 0.0872 | 0.0870
5 | 0.0815 | 0.0800 | 0.0794
6 | 0.0751 | 0.0739 | 0.0732
Table 3. Proportionality constants Ω1 and Ω2, with Af = 4.
We see in Table 3 that the constants of proportionality between the number of function evaluations and arithmetical operations are essentially the same for RK and RKGL, for each of the values of s considered. In Table 4 we show the ratio of the number of function evaluations and arithmetical operations per node for RK and RKGL, again with Af = 4.
s    F1/F2 (m=2)   A1/A2 (m=2)   F1/F2 (m=3)   A1/A2 (m=3)
2    0.833         0.833         0.875         0.903
3    0.779         0.763         0.833         0.839
4    0.750         0.732         0.813         0.810
5    0.733         0.714         0.800         0.794
6    0.722         0.703         0.792         0.784
Table 4. Ratio of number of function evaluations and arithmetical operations per node for RK and RKGL, with Af = 4.
From Table 4 we see that, as far as function evaluations and arithmetical operations per node are concerned, RKGL is always more efficient than RK. The best case here is s = 6 with m = 2, for which the ratio of arithmetical operations per node is about 70%, and that of function evaluations per node is about 72%. For Phases 1 and 2 of the reintegration algorithm, where the number of nodes used by RK and RKGL is essentially the same, the ratios in Table 4 are indicative of the relative effort of the two methods. Clearly, the gain in using RKGL is in the vicinity of 10% to 30%. It could be argued that this is not actually all that impressive – indeed, as we will see in the next section, the most significant gain is to be had in Phase 3 of the reintegration algorithm, where the higher order of RKGL is exploited.

In problem P2 we have Af = 0 and, for completeness' sake, the appropriate parameters are shown in Tables 5−7.

s    F1 (m=2)   A1 (m=2)   F1 (m=3)   A1 (m=3)   F2   A2
2    1.67       8.33       1.75       9.25       2    10
3    2.33       14.33      2.50       16.00      3    19
4    3.00       21.67      3.25       24.25      4    30
5    3.67       30.33      4.00       34.00      5    43
6    4.33       40.33      4.75       45.25      6    58
Table 5. Number of function evaluations and arithmetical operations per node, with Af = 0.

s    Ω1 (m=2)   Ω1 (m=3)   Ω2
2    0.2000     0.1892     0.2000
3    0.1628     0.1563     0.1579
4    0.1385     0.1340     0.1333
5    0.1209     0.1176     0.1163
6    0.1074     0.1050     0.1034
Table 6. Proportionality constants Ω1 and Ω2, with Af = 0.
s    F1/F2 (m=2)   A1/A2 (m=2)   F1/F2 (m=3)   A1/A2 (m=3)
2    0.833         0.833         0.875         0.925
3    0.779         0.754         0.833         0.842
4    0.750         0.722         0.813         0.808
5    0.733         0.705         0.800         0.791
6    0.722         0.695         0.792         0.780
Table 7. Ratio of number of function evaluations and arithmetical operations per node for RK and RKGL, with Af = 0.
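The entries of Tables 2−7 can be regenerated from simple per-node cost formulas. The closed forms used in the sketch below are inferred by fitting the tabulated values (the underlying operation counts are derived earlier in the chapter), so they should be read as a consistency check rather than as the original definitions; the function names rk_costs and rkgl_costs are ours.

# Per-node cost counts. The closed forms below reproduce Tables 2 and 5
# exactly, but they are inferred from the tabulated values rather than
# quoted from the chapter's derivation.
def rk_costs(s, Af):
    """RKr: function evaluations (F2) and arithmetical operations (A2) per node."""
    F2 = s
    A2 = s * s + (Af + 4) * s - 2
    return F2, A2

def rkgl_costs(s, m, Af):
    """RKrGLm: per-node averages over a subinterval of m RK nodes plus one GL node."""
    _, A2 = rk_costs(s, Af)
    F1 = (m * s + 1) / (m + 1)
    A1 = (m * A2 + Af + 2 * m + 1) / (m + 1)
    return F1, A1

# Reproduce the m = 2 columns of Table 4 (Af = 4):
for s in range(2, 7):
    F2, A2 = rk_costs(s, 4)
    F1, A1 = rkgl_costs(s, 2, 4)
    print(f"s = {s}:  F1/F2 = {F1 / F2:.3f},  A1/A2 = {A1 / A2:.3f}")

Running this yields the m = 2 columns of Table 4, from 0.833 and 0.833 at s = 2 down to 0.722 and 0.703 at s = 6.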
5.1. Efficiency Curves

In Figures 3 and 4 we show RA as a function of the imposed tolerance δ, for each of the test problems.
[Figure: RA plotted against tolerance δ over the range 10^-14 to 10^-2, for r = 2, 3, 4, test problem P1.]
Fig. 3. RA as a function of tolerance δ, for the indicated values of r, for test problem P1.
[Figure: RA plotted against tolerance δ over the range 10^-14 to 10^-2, for r = 2, 3, 4, test problem P2.]
Fig. 4. RA as a function of tolerance δ, for the indicated values of r, for test problem P2.

In each of these figures, two features are obvious: RA decreases as δ decreases, and, for any δ, RA increases with increasing r. Both of these features are easily understood in terms of (38). Note that r² + r equals 6, 12 and 20 for the values of r considered here. We see that RA is less
than one for all values of r and δ for P2, but for P1 the RKF4GL3 method is somewhat less efficient than RKF4 at large tolerances. This is because, in this case, the global error coefficient for RKF4GL3 is twice as large as that for RKF4; we see in (29) that the ratio of global error coefficients does play a role in determining RA. For the smallest tolerance considered here, RA is approximately 0.3 or less for all r in both test problems, and for r = 2 we have RA = 0.0052 (P1) and RA = 0.0013 (P2). These results indicate just how much more efficient RKGL can be, particularly when very strict tolerances are imposed on the problem.

5.2. Equal Effort

From (40) we have
$$\ln \Delta_{RKGL} = \left(1 + \frac{1}{r}\right)\ln \Delta_{RK} + \ln\!\left[\left(\frac{A_1}{A_2}\right)^{r+1} \frac{G_1}{G_2^{\,1+1/r}}\right] \qquad (43)$$
so that a plot of $\ln \Delta_{RKGL}$ against $\ln \Delta_{RK}$ will yield a straight line with slope $1 + 1/r$. Recall that this holds for the case of equal effort (RA = 1). In Table 8 we show numerical results for the test problems.

P1:
r    1 + 1/r (theory)   1 + 1/r (actual)
2    1.50               1.5002
3    1.33               1.3331
4    1.25               1.2536

P2:
r    1 + 1/r (theory)   1 + 1/r (actual)
2    1.50               1.4910
3    1.33               1.3303
4    1.25               1.2439
Table 8. Slopes of equation (43) for the test problems.
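The 'actual' slopes in Table 8 can be obtained by an ordinary least-squares fit of $\ln \Delta_{RKGL}$ against $\ln \Delta_{RK}$. A minimal sketch of such a fit, with hypothetical placeholder errors standing in for values measured at equal computational effort:

import numpy as np

# Least-squares slope of ln(Delta_RKGL) against ln(Delta_RK), cf. (43).
# The arrays hold hypothetical placeholder errors for the case r = 2,
# standing in for values measured at equal effort (R_A = 1).
delta_rk   = np.array([1e-4, 1e-6, 1e-8, 1e-10])
delta_rkgl = np.array([3e-6, 3e-9, 3e-12, 3e-15])

slope, intercept = np.polyfit(np.log(delta_rk), np.log(delta_rkgl), 1)
print(f"fitted slope = {slope:.4f}")   # theory: 1 + 1/r = 1.5 for r = 2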
There is good agreement between theory and experiment. As an example of the improved accuracy of RKGL, we found that for problem P1 with r = 3, $\Delta_{RK} \approx 6 \times 10^{-10}$ and $\Delta_{RKGL} \approx 2 \times 10^{-12}$.

5.3. The LHR Algorithm

Lastly, we apply the complete LHR algorithm to the test problems. We impose tolerances of δ = 10^-6 and δ = 10^-12 on the relative global error. The four phases of LHR, as described previously, comprise the following (a schematic sketch follows the list):
1. Phase 1 – application of FLEC with a moderate tolerance δ_loc = √δ on the relative local error.
2. Phase 2 – determining relative global error coefficients using a uniform node distribution with average stepsize determined from Phase 1.
3. Phase 3 – reintegration with a new stepsize determined from Phase 2, and quality control.
4. Phase 4 – further reintegration, if necessary, also with quality control.
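Before turning to the results, a structural sketch of the reintegration idea behind Phases 2 and 3 may be helpful. The sketch below is a simplified stand-in, not the chapter's implementation: it uses classical RK4 on P1 rather than the RKrGLm steppers, FLEC is not reproduced, and for transparency the relative global error is measured against the known exact solution, where LHR would instead estimate it with the higher-order companion method.

import numpy as np

def f(x, y):                      # P1: dy/dx = (y/4)(1 - y/20)
    return 0.25 * y * (1.0 - y / 20.0)

def exact(x):                     # exact solution of P1 with y(0) = 1
    return 20.0 / (1.0 + 19.0 * np.exp(-x / 4.0))

def rk4(a, b, y0, n):
    """Classical RK4 on [a, b] with n uniform steps."""
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    y = np.empty(n + 1)
    y[0] = y0
    for i in range(n):
        k1 = f(x[i], y[i])
        k2 = f(x[i] + h / 2, y[i] + h * k1 / 2)
        k3 = f(x[i] + h / 2, y[i] + h * k2 / 2)
        k4 = f(x[i] + h, y[i] + h * k3)
        y[i + 1] = y[i] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x, y

delta, r = 1e-10, 4               # tolerance; RK4 has global order r = 4

# "Phase 2": a probe run with a moderate stepsize estimates G in
# Delta = G * h**r, where Delta is the maximum relative global error.
n = 100
x, y = rk4(0.0, 20.0, 1.0, n)
Delta = np.max(np.abs((y - exact(x)) / exact(x)))
G = Delta / (20.0 / n) ** r

# "Phase 3": reintegrate with the stepsize implied by delta = G * h**r.
h_new = (delta / G) ** (1.0 / r)
n_new = int(np.ceil(20.0 / h_new))
x, y = rk4(0.0, 20.0, 1.0, n_new)
print("achieved:", np.max(np.abs((y - exact(x)) / exact(x))), "target:", delta)

If the achieved error overshoots the target because G was underestimated, a further correction with the updated coefficient (this is Phase 4) would be applied in the same way.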
The results are summarised in Tables 9−12, where we show the number of arithmetical operations for each phase, and the ratio RA. Of course, the number of function evaluations is easily obtained from these data by means of the proportionality constants in Tables 3 and 6.

P1, δ = 10^-6:
           r = 2              r = 3             r = 4
           RK       RKGL      RK      RKGL      RK      RKGL
Phase 1    1520     748       984     804       656     1028
Phase 2    4067     1160      1232    1012      492     1028
Phase 3    105154   16356     11550   5819      1886    1799
Phase 4    0        19952     0       8349      2296    2056
Total      110741   38216     13766   15984     5330    5911
RA         0.345 (0.156)      1.161 (0.504)     1.109 (0.895)
Table 9. Arithmetical operations for LHR applied to P1 with δ = 10^-6.
P1, δ = 10^-12:
           r = 2                 r = 3              r = 4
           RK         RKGL       RK        RKGL     RK       RKGL
Phase 1    15440      2992       4920      2412     2296     2056
Phase 2    131418     6844       10549     3542     2296     1799
Phase 3    107845227  1938476    1246014   240603   70520    35466
Total      107992085  1948312    1261483   246557   75112    39321
RA         0.018 (0.018)         0.195 (0.193)      0.523 (0.503)
Table 10. Arithmetical operations for LHR applied to P1 with δ = 10^-12.
P2, δ = 10^-6:
           r = 2             r = 3             r = 4
           RK       RKGL     RK      RKGL      RK      RKGL
Phase 1    2784     2553     2054    3612      1044    2172
Phase 2    12818    6460     3822    6279      928     2172
Phase 3    410959   43316    38514   17388     5394    4525
Phase 4    0        0        0       0         6554    0
Total      426561   52329    44390   27279     13920   8869
RA         0.123 (0.105)     0.615 (0.451)     0.637 (0.839)
Table 11. Arithmetical operations for LHR applied to P2 with δ = 10^-6.
P2, δ = 10^-12:
           r = 2                 r = 3              r = 4
           RK         RKGL       RK        RKGL     RK       RKGL
Phase 1    29232      11322      12403     11610    5104     6154
Phase 2    435841     46784      41503     26565    6612     7240
Phase 3    415844688  4466920    4050340   576863   209148   80545
Total      416309761  4525026    4104246   615038   220864   93939
RA         0.011 (0.011)         0.150 (0.142)      0.425 (0.385)
Table 12. Arithmetical operations for LHR applied to P2 with δ = 10^-12.

In these tables, the ratio RA is the ratio of the total number of arithmetical operations (the 'Total' row) for each value of r. The ratio RA in parentheses is that for Phase 3 only, except for r = 4 in Table 9, where it has been computed for Phase 4. Phase 4 was required in Table 9 (for the RKGL methods, and for RK with r = 4) and in Table 11 (for RKF4), and in those cases the ratio RA includes the contribution from Phase 4. The arithmetical operations count for Phase 3 includes quality control, i.e. the contribution due to the higher-order method. All values of RA for Phase 3 only are less than one, and in only two cases is the overall value of RA greater than one. In both of these cases, seen in Table 9, this is due to the need for Phase 4 reintegration. Generally, the RKGL implementation of LHR is more efficient than the RK implementation. This is particularly evident when r and δ are small, as in Tables 10 and 12. The preliminary phases (1 and 2) of LHR contribute most significantly to the overall computational effort when δ is relatively large, but when δ is small the computational load is due almost entirely to Phase 3 (and Phase 4, if necessary).

We have confirmed that all applications of LHR described in the above tables yielded numerical solutions satisfying the imposed tolerances; in this sense, LHR has proved to be 100% effective. These results are shown in Table 13. In Figure 5 we show relative global error profiles for both the RK and RKGL implementations of LHR, for problem P1. For the sake of clarity, only the solutions obtained by RK3GL3 and RKF4GL3 are shown in the right panel of Figure 5. The other error profiles are similar. Note that the 'zigzag' character evident in Figure 2 is also present in these solutions. Due to space restrictions, we have not shown similar plots for problem P2.
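As a cross-check on how the ratios RA in Tables 9−12 are formed, they can be recomputed directly from the phase counts; for instance, from the r = 2 columns of Table 9:

# Recompute R_A from the Table 9 phase counts (P1, delta = 1e-6, r = 2).
rk   = {"Phase 1": 1520, "Phase 2": 4067, "Phase 3": 105154, "Phase 4": 0}
rkgl = {"Phase 1": 748,  "Phase 2": 1160, "Phase 3": 16356,  "Phase 4": 19952}

overall = sum(rkgl.values()) / sum(rk.values())
phase3  = rkgl["Phase 3"] / rk["Phase 3"]
print(f"overall R_A = {overall:.3f}, Phase 3 only = {phase3:.3f}")
# -> overall R_A = 0.345, Phase 3 only = 0.156, matching Table 9.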
[Table 13 (individual entries not reliably recoverable from the source): its columns are r = 2, 3, 4, each with RK and RKGL, and its rows are Δmax for P1 and P2 at δ = 10^-6 and δ = 10^-12. All Δmax values are of order 10^-7 for δ = 10^-6 and of order 10^-13 for δ = 10^-12, so every entry satisfies its imposed tolerance.]
Table 13. Maximum relative global error Δmax using LHR, for the indicated tolerances δ. The symbol ♦ indicates that Phase 4 of LHR was required.
[Figure: relative global error profiles against x ∈ [0, 20] for problem P1, with curves for r = 2, 3, 4; upper panels for δ = 10^-6, lower panels for δ = 10^-12.]
Fig. 5. (Left) Relative global errors for tolerances δ = 10^-6 (upper) and δ = 10^-12 (lower), for the RK version of LHR, applied to problem P1. (Right) Relative global errors for tolerances δ = 10^-6 (upper) and δ = 10^-12 (lower), for the RKGL version of LHR, applied to problem P1. Upper plot: RK3GL3, lower plot: RKF4GL3.
6. Conclusion and Scope for Further Work

We have studied the RKGL algorithm for the numerical solution of IVPs, with regard to improving the efficiency of global error control via reintegration. Our theoretical investigations have shown that the RKrGLm method requires fewer arithmetical operations and function evaluations per node than its underlying RKr method. Moreover, since the RKrGLm method is of one order higher than RKr, we have shown that the relative efficiency of the two methods is proportional to the (r² + r)-th root of the user-defined tolerance imposed on the problem. Hence, as this tolerance becomes stricter, the RKrGLm method becomes more efficient. This effect is more pronounced for small r, which is of great benefit, since lower-order methods are usually computationally expensive. In a similar vein, we have shown that when computational effort is equal, the accuracy of the RKrGLm method can be expected to be better than that of RKr. Various numerical experiments have demonstrated these properties; in particular, the LHR algorithm for controlling relative global error has been shown to be generally more efficient when implemented using RKrGLm than when using RKr.

6.1. Further Work

There are a number of issues that need to be considered with regard to the research presented in this paper:
• The need for Phase 4 suggests that the global error coefficient has not been correctly estimated in Phase 3. Of course, Phase 4 represents a correction phase and so is important, but it is also computationally expensive. The only reason the RKGL version was less efficient than the RK version in Table 9 was the need for Phase 4; a better estimate of the global error coefficient in Phase 3 could have prevented this.

• The use of δ_loc = √δ in Phase 1 is not necessarily optimal. A case in point is LHR applied to P1 with r = 2 and δ = 10^-6: with δ_loc = 10^-3, Phase 4 was required, whereas with δ_loc = 10^-4, Phase 4 was not required and the total arithmetical operations count was only 21957.

• The effect of the length of the interval of integration should be investigated. A larger interval will require more nodes, and it is clear from Tables 9−12 that when the number of nodes used in Phase 3 is much larger than that used in Phases 1 and 2, the efficiency will be due essentially to Phase 3 only (and Phase 4, if needed). For r = 4, the improvement in relative efficiency of RKGL might be more significant over larger intervals of integration.
7. References

Burden, R.L. & Faires, J.D. (2001). Numerical Analysis (7th ed.), Brooks/Cole, ISBN: 0-534-38216-9, Pacific Grove.
Butcher, J.C. (2000). Numerical methods for ordinary differential equations in the 20th century. Journal of Computational and Applied Mathematics, 125, (April 2000) 1−29, ISSN: 0377-0427.
Butcher, J.C. (2003). Numerical Methods for Ordinary Differential Equations, Wiley, ISBN: 0-471-96758-0, Chippenham.
Hairer, E.; Norsett, S.P. & Wanner, G. (2000). Solving Ordinary Differential Equations I: Nonstiff Problems, Springer-Verlag, ISBN: 3-540-56670-8, Berlin.
Hull, T.E.; Enright, W.H.; Fellen, B.M. & Sedgwick, A.E. (1972). Comparing numerical methods for ordinary differential equations. SIAM Journal on Numerical Analysis, 9, 4, (December 1972) 603−637, ISSN: 0036-1429.
Kincaid, D. & Cheney, W. (2002). Numerical Analysis: Mathematics of Scientific Computing, Brooks/Cole, ISBN: 0-534-38905-8, Pacific Grove.
LeVeque, R.J. (2007). Finite Difference Methods for Ordinary and Partial Differential Equations, SIAM, ISBN: 978-0-898716-29-0, Philadelphia.
Prentice, J.S.C. (2008). The RKGL method for the numerical solution of initial-value problems. Journal of Computational and Applied Mathematics, 213, (April 2008) 477−487, ISSN: 0377-0427.
Prentice, J.S.C. (2009). General error propagation in the RKrGLm method. Journal of Computational and Applied Mathematics, 228, (June 2009) 344−354, ISSN: 0377-0427.
Shampine, L. (1994). Numerical Solution of Ordinary Differential Equations, CRC Press, ISBN: 978-0-412051-51-7, Boca Raton.
Shampine, L. (2005). Error estimation and control for ODEs. Journal of Scientific Computing, 25, 1, (October 2005) 3−16, ISSN: 0885-7474.