Stochastic Methods and Their Applications to Communications Stochastic Differential Equations Approach
Serguei Primak University of Western Ontario, Canada
Valeri Kontorovich Cinvestav-IPN, Mexico
Vladimir Lyandres Ben-Gurion University of the Negev, Israel
Copyright © 2004
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England Telephone (+44) 1243 779777
Email (for orders and customer service enquiries):
[email protected] Visit our Home Page on www.wileyeurope.com or www.wiley.com All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to
[email protected], or faxed to (+44) 1243 770620. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Other Wiley Editorial Offices John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809 John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1 Wiley also publishes its books in a variety of electronic formats. Some of the content that appears in print may not be available in electronic books.
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0-470-84741-7
Typeset in 10/12pt Times by Thomson Press (India) Limited, New Delhi
Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire
This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.
To our loved ones
Contents

1. Introduction  1
   1.1 Preface  1
   1.2 Digital Communication Systems  3

2. Random Variables and Their Description  7
   2.1 Random Variables and Their Description  7
      2.1.1 Definitions and Method of Description  7
         2.1.1.1 Classification  7
         2.1.1.2 Cumulative Distribution Function  8
         2.1.1.3 Probability Density Function  9
         2.1.1.4 The Characteristic Function and the Log-Characteristic Function  10
         2.1.1.5 Statistical Averages  11
         2.1.1.6 Moments  12
         2.1.1.7 Central Moments  12
         2.1.1.8 Other Quantities  13
         2.1.1.9 Moment and Cumulant Generating Functions  14
         2.1.1.10 Cumulants  15
   2.2 Orthogonal Expansions of Probability Densities: Edgeworth and Laguerre Series  16
      2.2.1 The Edgeworth Series  17
      2.2.2 The Laguerre Series  20
      2.2.3 Gram–Charlier Series  22
   2.3 Transformation of Random Variables  23
      2.3.1 Transformation of a Given PDF into an Arbitrary PDF  25
      2.3.2 PDF of a Harmonic Signal with Random Phase  25
   2.4 Random Vectors and Their Description  26
      2.4.1 CDF, PDF and the Characteristic Function  26
      2.4.2 Conditional PDF  28
      2.4.3 Numerical Characteristics of a Random Vector  30
   2.5 Gaussian Random Vectors  32
   2.6 Transformation of Random Vectors  35
      2.6.1 PDF of a Sum, Difference, Product and Ratio of Two Random Variables  37
      2.6.2 Probability Density of the Magnitude and the Phase of a Complex Random Vector with Jointly Gaussian Components  39
         2.6.2.1 Zero Mean Uncorrelated Gaussian Components of Equal Variance  41
         2.6.2.2 Case of Uncorrelated Components with Equal Variances and Non-Zero Mean  41
      2.6.3 PDF of the Maximum (Minimum) of Two Random Variables  42
      2.6.4 PDF of the Maximum (Minimum) of n Independent Random Variables  44
   2.7 Additional Properties of Cumulants  44
      2.7.1 Moment and Cumulant Brackets  46
      2.7.2 Properties of Cumulant Brackets  48
      2.7.3 More on the Statistical Meaning of Cumulants  49
   2.8 Cumulant Equations  49
      2.8.1 Non-Linear Transformation of a Random Variable: Cumulant Method  52
   Appendix: Cumulant Brackets and Their Calculations  54

3. Random Processes  59
   3.1 General Remarks  59
   3.2 Probability Density Function (PDF)  60
   3.3 The Characteristic Functions and Cumulative Distribution Function  63
   3.4 Moment Functions and Correlation Functions  64
   3.5 Stationary and Non-Stationary Processes  70
   3.6 Covariance Functions and Their Properties  71
   3.7 Correlation Coefficient  74
   3.8 Cumulant Functions  77
   3.9 Ergodicity  77
   3.10 Power Spectral Density (PSD)  80
   3.11 Mutual PSD  82
      3.11.1 PSD of a Sum of Two Stationary and Stationary Related Random Processes  83
      3.11.2 PSD of a Product of Two Stationary Uncorrelated Processes  84
   3.12 Covariance Function of a Periodic Random Process  85
      3.12.1 Harmonic Signal with a Constant Magnitude  85
      3.12.2 A Mixture of Harmonic Signals  86
      3.12.3 Harmonic Signal with Random Magnitude and Phase  87
   3.13 Frequently Used Covariance Functions  88
   3.14 Normal (Gaussian) Random Processes  88
   3.15 White Gaussian Noise (WGN)  95

4. Advanced Topics in Random Processes  99
   4.1 Continuity, Differentiability and Integrability of a Random Process  99
      4.1.1 Convergence and Continuity  99
      4.1.2 Differentiability  100
      4.1.3 Integrability  102
   4.2 Elements of System Theory  103
      4.2.1 General Remarks  103
      4.2.2 Continuous SISO Systems  105
      4.2.3 Discrete Linear Systems  107
      4.2.4 MIMO Systems  109
      4.2.5 Description of Non-Linear Systems  110
   4.3 Zero Memory Non-Linear Transformation of Random Processes  112
      4.3.1 Transformation of Moments and Cumulants  112
         4.3.1.1 Direct Method  115
         4.3.1.2 The Rice Method  116
      4.3.2 Cumulant Method  117
   4.4 Cumulant Analysis of Non-Linear Transformation of Random Processes  118
      4.4.1 Cumulants of the Marginal PDF  118
      4.4.2 Cumulant Method of Analysis of Non-Gaussian Random Processes  119
   4.5 Linear Transformation of Random Processes  121
      4.5.1 General Expression for Moment and Cumulant Functions at the Output of a Linear System  121
         4.5.1.1 Transformation of Moment and Cumulant Functions  122
         4.5.1.2 Linear Time-Invariant System Driven by a Stationary Process  125
      4.5.2 Analysis of Linear MIMO Systems  131
      4.5.3 Cumulant Method of Analysis of Linear Transformations  132
      4.5.4 Normalization of the Output Process by a Linear System  137
   4.6 Outages of Random Processes  140
      4.6.1 General Considerations  140
      4.6.2 Average Level Crossing Rate and the Average Duration of the Upward Excursions  141
      4.6.3 Level Crossing Rate of a Gaussian Random Process  145
      4.6.4 Level Crossing Rate of the Nakagami Process  149
      4.6.5 Concluding Remarks  152
   4.7 Narrow Band Random Processes  152
      4.7.1 Definition of the Envelope and Phase of Narrow Band Processes  154
      4.7.2 The Envelope and the Phase Characteristics  156
         4.7.2.1 Blanc-Lapierre Transformation  156
         4.7.2.2 Kluyver Equation  160
         4.7.2.3 Relations Between Moments of pAn(an) and pI(I)  161
         4.7.2.4 The Gram–Charlier Series for pR(x) and pI(I)  163
      4.7.3 Gaussian Narrow Band Process  166
         4.7.3.1 First Order Statistics  166
         4.7.3.2 Correlation Function of the In-phase and Quadrature Components  168
         4.7.3.3 Second Order Statistics of the Envelope  169
         4.7.3.4 Level Crossing Rate  172
      4.7.4 Examples of Non-Gaussian Narrow Band Random Processes  173
         4.7.4.1 K Distribution  173
         4.7.4.2 Gamma Distribution  175
         4.7.4.3 Log-Normal Distribution  175
         4.7.4.4 A Narrow Band Process with Nakagami Distributed Envelope  177
   4.8 Spherically Invariant Processes  181
      4.8.1 Definitions  181
      4.8.2 Properties  182
         4.8.2.1 Joint PDF of a SIRV  182
         4.8.2.2 Narrow Band SIRVs  183
      4.8.3 Examples  184

5. Markov Processes and Their Description  189
   5.1 Definitions  189
      5.1.1 Markov Chains  190
      5.1.2 Markov Sequences  203
      5.1.3 A Discrete Markov Process  207
      5.1.4 Continuous Markov Processes  212
      5.1.5 Differential Form of the Kolmogorov–Chapman Equation  214
   5.2 Some Important Markov Random Processes  217
      5.2.1 One-Dimensional Random Walk  217
         5.2.1.1 Unrestricted Random Walk  219
      5.2.2 Markov Processes with Jumps  221
         5.2.2.1 The Poisson Process  221
         5.2.2.2 A Birth Process  223
         5.2.2.3 A Death Process  224
         5.2.2.4 A Death and Birth Process  224
   5.3 The Fokker–Planck Equation  227
      5.3.1 Preliminary Remarks  227
      5.3.2 Derivation of the Fokker–Planck Equation  227
      5.3.3 Boundary Conditions  231
      5.3.4 Discrete Model of a Continuous Homogeneous Markov Process  234
      5.3.5 On the Forward and Backward Kolmogorov Equations  235
      5.3.6 Methods of Solution of the Fokker–Planck Equation  236
         5.3.6.1 Method of Separation of Variables  236
         5.3.6.2 The Laplace Transform Method  243
         5.3.6.3 Transformation to the Schrödinger Equations  244
   5.4 Stochastic Differential Equations  245
      5.4.1 Stochastic Integrals  246
   5.5 Temporal Symmetry of the Diffusion Markov Process  257
   5.6 High Order Spectra of Markov Diffusion Processes  258
   5.7 Vector Markov Processes  263
      5.7.1 Definitions  263
         5.7.1.1 A Gaussian Process with a Rational Spectrum  270
   5.8 On Properties of Correlation Functions of One-Dimensional Markov Processes  271

6. Markov Processes with Random Structures  275
   6.1 Introduction  275
   6.2 Markov Processes with Random Structure and Their Statistical Description  279
      6.2.1 Processes with Random Structure and Their Classification  279
      6.2.2 Statistical Description of Markov Processes with Random Structure  280
      6.2.3 Generalized Fokker–Planck Equation for Random Processes with Random Structure and Distributed Transitions  281
      6.2.4 Moment and Cumulant Equations of a Markov Process with Random Structure  288
   6.3 Approximate Solution of the Generalized Fokker–Planck Equations  295
      6.3.1 Gram–Charlier Series Expansion  296
         6.3.1.1 Eigenfunction Expansion  296
         6.3.1.2 Small Intensity Approximation  297
         6.3.1.3 Form of the Solution for Large Intensity  302
      6.3.2 Solution by the Perturbation Method for the Case of Low Intensities of Switching  304
         6.3.2.1 General Small Parameter Expansion of Eigenvalues and Eigenfunctions  304
         6.3.2.2 Perturbation of λ0(x)  305
      6.3.3 High Intensity Solution  310
         6.3.3.1 Zero Average Current Condition  310
         6.3.3.2 Asymptotic Solution P1(x)  311
         6.3.3.3 Case of a Finite Intensity ν  314
   6.4 Concluding Remarks  317

7. Synthesis of Stochastic Differential Equations  321
   7.1 Introduction  321
   7.2 Modeling of a Scalar Random Process Using a First Order SDE  322
      7.2.1 General Synthesis Procedure for the First Order SDE  322
      7.2.2 Synthesis of an SDE with PDF Defined on a Part of the Real Axis  326
      7.2.3 Synthesis of λ Processes  329
      7.2.4 Non-Diffusion Markov Models of Non-Gaussian Exponentially Correlated Processes  334
         7.2.4.1 Exponentially Correlated Markov Chain – DAR(1) and Its Continuous Equivalent  335
         7.2.4.2 A Mixed Process with Exponential Correlation  341
   7.3 Modeling of a One-Dimensional Random Process on the Basis of a Vector SDE  347
      7.3.1 Preliminary Comments  347
      7.3.2 Synthesis Procedure of a (λ, ω) Process  347
      7.3.3 Synthesis of a Narrow Band Process Using a Second Order SDE  351
         7.3.3.1 Synthesis of a Narrow Band Random Process Using a Duffing Type SDE  352
         7.3.3.2 An SDE of the Van Der Pol Type  356
   7.4 Synthesis of a One-Dimensional Process with a Gaussian Marginal PDF and Non-Exponential Correlation  361
   7.5 Synthesis of Compound Processes  364
      7.5.1 Compound λ Process  365
      7.5.2 Synthesis of a Compound Process with a Symmetrical PDF  367
   7.6 Synthesis of Impulse Processes  369
      7.6.1 Constant Magnitude Excitation  370
      7.6.2 Exponentially Distributed Excitation  371
   7.7 Synthesis of an SDE with Random Structure  371

8. Applications  377
   8.1 Continuous Communication Channels  377
      8.1.1 A Mathematical Model of a Mobile Satellite Communication Channel  377
      8.1.2 Modeling of a Single-Path Propagation  380
         8.1.2.1 A Process with a Given PDF of the Envelope and Given Correlation Interval  380
         8.1.2.2 A Process with a Given Spectrum and Sub-Rayleigh PDF  383
   8.2 An Error Flow Simulator for Digital Communication Channels  388
      8.2.1 Error Flow in Digital Communication Systems  389
      8.2.2 A Model of Error Flow in a Digital Channel with Fading  389
      8.2.3 SDE Model of a Buoyant Antenna–Satellite Link  391
         8.2.3.1 Physical Model  391
         8.2.3.2 Phenomenological Model  392
         8.2.3.3 Numerical Simulation  395
   8.3 A Simulator of Radar Sea Clutter with a Non-Rayleigh Envelope  397
      8.3.1 Modeling and Simulation of the K-Distributed Clutter  397
      8.3.2 Modeling and Simulation of the Weibull Clutter  404
   8.4 Markov Chain Models in Communications  408
      8.4.1 Two-State Markov Chain – Gilbert Model  408
      8.4.2 Wang–Moayeri Model  409
      8.4.3 Independence of the Channel State Model on the Actual Fading Distribution  418
      8.4.4 A Rayleigh Channel with Diversity  418
      8.4.5 Fading Channel Models  419
      8.4.6 Higher Order Models  421
   8.5 Markov Chain for Different Conditions of the Channel  422

Index  433
As an extra resource we have set up a companion website for our book containing supplementary material devoted to the numerical simulation of stochastic differential equations and description, modeling and simulation of impulse random processes. Additional reference information is also available on the website. Please go to the following URL and have a look: ftp://ftp.wiley.co.uk/pub/books/primak/
1 Introduction

1.1 PREFACE
A statistical approach to most problems of information transmission has become dominant over the past three decades. This is easily explained once one takes into account that practically any real signal, propagation medium, interference, and even information itself, has an intrinsically random nature.¹ This is why D. Middleton, one of the founders of modern communication theory, coined the term "statistical theory of communications" [2]. The recent spectacular achievements in information technology rest mainly on progress in three fundamental areas: communication and information theory, signal processing, and computer and related technologies. Together these allow the transmission of information at rates close to the limit set by the Shannon theorem [2]. In principle, the limit on the rate of information transmission is set by the noise and interference inherently present in all communication systems. An accurate description of such impairments is therefore essential for the proper design and noise immunity of communication systems. The choice of a relevant interference model is a crucial step in the design of communication systems and an important part of their testing and performance evaluation. The requirements such models must satisfy are often conflicting and hard to formulate in a way convenient for optimization. On the one hand, a model must accurately reflect the main features of the interference under investigation. On the other hand, it must be simple enough to be applicable to the massive numerical simulation required to test modern communication system designs.
Historically, two simple processes, the so-called White Gaussian Noise (WGN) and the Poisson Point Process (PPP), have been widely used to obtain first rough estimates of system performance: the WGN model is a good approximation of additive wide band noise caused by a variety of natural phenomena, while the PPP is a good model for events in a discrete communication channel. Unfortunately, in the majority of realistic situations these basic models of noise and errors are not adequate. In realistic communication channels, non-stationary and non-Gaussian interference and noise are often present. In addition, these processes are often band-limited and thus exhibit significant time correlation, which cannot be represented by WGN or a PPP. The following phenomena are examples where the simplest models fail to provide an accurate description:
¹ Having said that, we would like to acknowledge an exponentially growing body of literature which describes the same phenomena using a rather different approach, based on the chaotic description of signals [1].
Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres © 2004 John Wiley & Sons, Ltd ISBN: 0-470-84741-7
- fading in communication channels;
- interchannel/interuser interference;
- impulsive noise, including man-made noise.

Of course, this list can be greatly expanded [3]. Describing such phenomena through joint probability densities of order higher than two is difficult, since it requires a large amount of a priori information on the properties of the noise and interference; such information is usually unavailable or difficult to obtain [4]. Furthermore, specification of higher order joint probability densities is usually not productive, since it results in complicated expressions which cannot be used effectively at the current level of computer simulation. At the same time, substituting non-Gaussian models with equivalent Gaussian ones may result in significant and unpredictable errors. All of this, coupled with the increased complexity of systems and reduced design time, calls for simple and adequate models of the blocks, stages, complete systems and networks of communication systems. Such variety requires an approach flexible enough to cover the majority of these possibilities, and this is a major requirement for models of communication systems. This book describes techniques which, in the opinion of the authors, are well suited to the effective modeling of non-Gaussian phenomena in communication systems and related disciplines. It is based on the Markov approach to modeling of random processes encountered in applications. In particular, non-Gaussian processes are modeled as solutions of an appropriate Stochastic Differential Equation (SDE), i.e. a differential equation with a random excitation. This approach, phenomenological in general, is built on the idea of the system state, suggested by van Trees [5]. The essence of the method is to describe the process under investigation as the solution of an SDE synthesized from a priori information such as the marginal distribution and the correlation function.
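As a minimal illustration of this idea (our own sketch, not taken from the book), consider the Ornstein–Uhlenbeck equation dx = -αx dt + sqrt(2ασ²) dW, the simplest SDE of this kind: driven by white Gaussian noise, its stationary solution is a Gaussian process with marginal variance σ² and exponential correlation exp(-α|τ|). An Euler–Maruyama simulation, with all parameter values chosen arbitrarily for the example:

```python
import numpy as np

def simulate_ou(alpha=1.0, sigma=1.0, dt=1e-3, n_steps=200_000, seed=0):
    """Euler-Maruyama integration of the Ornstein-Uhlenbeck SDE
        dx = -alpha * x dt + sqrt(2 * alpha * sigma**2) dW,
    whose stationary solution is Gaussian with variance sigma**2 and
    correlation function sigma**2 * exp(-alpha * |tau|)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = 0.0
    # The WGN excitation enters through independent Wiener increments
    # dW ~ N(0, dt) at each step.
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps - 1)
    b = np.sqrt(2.0 * alpha * sigma**2)
    for k in range(1, n_steps):
        x[k] = x[k - 1] - alpha * x[k - 1] * dt + b * dw[k - 1]
    return x

x = simulate_ou()
# Discard the initial transient before estimating stationary statistics.
xs = x[20_000:]
var_hat = float(np.var(xs))
```

Changing the drift and diffusion coefficients changes the marginal PDF and the correlation structure of the solution; choosing them so as to reproduce prescribed statistics is exactly the synthesis problem treated in Chapter 7.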
Such synthesis is an attempt to uncover the hidden dynamics of the interference formation. Despite the fact that this description is very approximate,² the SDE approach to modeling of random processes has significant advantages:

- universality: a single structure allows the modeling of a great variety of different processes by simple variation of the system parameters and excitation;
- effectiveness: a single structure allows the modeling and numerical simulation of a spectrum of characteristics of the random process, such as the marginal probability density, the correlation function, etc.;
- the suggested models are well suited to computer simulation.

Unfortunately, the SDE approach is still not widely known as a tool for modeling communication channels and related problems, despite its effective application in chemistry, physics and other areas [3]. The first work in this area appeared more than thirty years ago at the Bonch-Bruevich Institute of Telecommunications (St Petersburg, Russia). The first attempt to summarize the results on communication channel modeling based on the SDE approach was a book [6], published in Russian and almost unavailable to international readers. Since then a number of new results have been obtained by various researchers, including the authors, and a number of old approaches have been reconsidered. The majority of these results are scattered over journals and conference proceedings, partially in Russian, and are not always readily available. All of this
² "Pure" continuous time Markov processes cannot be physically implemented.
gave an impetus for a new book which offers a summary of the SDE approach to the modeling and evaluation of communication and related systems, and provides a solid background in the applied theory of random processes and systems with random excitation. Alongside classical topics, the book covers the cumulant method of Malakhov [7], often used for the analysis of non-Gaussian processes, systems with random structure, and some aspects of the numerical simulation of SDEs. The authors believe that this composition of the book will be helpful both for graduate students specializing in the analysis of communication systems and for researchers and practising engineers. Some chapters and sections can be omitted on a first reading; some of the topics formulate problems which have not yet been solved.³ The book is organized as follows. Chapters 2–4 present an introduction to the theory of random processes and form the basis of a first term course for graduate students at the Department of Electrical and Computer Engineering, The University of Western Ontario (UWO). Chapter 5 deals with the theory of Markov processes, stochastic differential equations and the Fokker–Planck equation. Synthesis of models in the form of SDEs is described in Chapter 7, while Chapter 8 provides examples of the applications of SDEs and other Markov models to practical communications problems. Four appendices detailing the numerical simulation of SDEs, impulse random processes, tables of distributions and orthogonal polynomials are published on the web. Some of the material presented in this book was originally developed at the State University of Telecommunications (St Petersburg, Russia) and is, to a great degree, augmented by the results of research conducted by three groups at Ben-Gurion University of the Negev (Beer-Sheva, Israel), the University of Western Ontario (London, Canada) and Cinvestav-IPN (Mexico City, Mexico). A large number of people have provided motivation, support and advice.
The authors would like to thank Doctors R. Gut, A. Berdnikov, A. Brusentsov, M. Dotsenko, G. Kotlyar, J. LoVetri, J. Roy and D. Makrakis, who contributed greatly to the development of the techniques described in this book. We also would like to recognize the contribution of our graduate students, in particular Mr Mark Shahaf, Mrs My Pham, Mr Chris Snow, Mr Jeff Weaver and Ms Vanja Subotic who have provided contributions through a number of joint publications and valuable comments on the course notes which form the first three chapters of this book. This research has been supported in part by CRC Canada, NSERC Canada, the Ministry of Education of Israel, and the Ministry of Education of Mexico. The authors are also thankful to their families who were very considerate and patient. Without their love and support this book would have never been completed.
1.2
DIGITAL COMMUNICATION SYSTEMS
The following block diagram is used in this book as a generic model of a communication system.4 A detailed description of the blocks can be found in many textbooks, for example [8,9]. The main focus of this book is modeling of the processes in a communication channel. Referring to Fig. 1.1, there are two ways in which the communication channel can be defined: the channel between points A and A′ is known as a continuous (analog) channel; the

3 At least, the authors are not aware of such solutions.
4 This also includes electronic systems such as radar; the issue of immunity of electronic equipment to interference can also be treated as a communication problem.
Figure 1.1 Generic communication system.
part of the system between points B and B′ is called a discrete (digital) channel. In general, such a separation is very subjective, especially in the light of recent advances in modulation and coding techniques which take the properties of the propagation channel into account already at the modulation stage. However, the separation is convenient, especially for pedagogical purposes and for computer simulation of the separate blocks of a communication system [9,10].

In addition to the propagation medium, the continuous communication channel includes all linear blocks (such as filters and amplifiers) which cannot be identified with the propagation medium. Such an assignment is somewhat arbitrary, since by varying the parameters of the modulation, the number of antennas and their directivity, it is possible to vary the properties of the channel.5 A theoretical description of such a channel can be achieved by modeling the channel as a linear time and space varying filter [10–12]. This allows use of the concept of system functions, which is well suited for both narrow band and wide band channels. Recent advances in communication technology have sparked greater interest in wide band channels [9,10,12], since they significantly increase the capacity of communication systems.

In continuous channels one finds a great variety of noise and interference, which limit the information transmission rate subject to certain quality requirements (such as the probability of error). Roughly speaking, interference and noise can be classified as additive (thermal noise, antenna noise, man-made and natural impulsive noise, interchannel interference, etc.) and multiplicative (fading, intermodulation, etc.). Additive noise (interference) n(t) results in the received signal r(t) being an algebraic sum of the information signal s(t) and the noise n(t):

$$r(t) = s(t) + n(t) \qquad (1.1)$$

while multiplicative noise (such as fast fading and shadowing) changes the energy of the information signal. In the simplest case of frequency non-selective fading, the multiplicative

5 Sometimes modems are also included in the continuous channel. In this case the channel becomes non-linear, depending on the modulation–demodulation technique.
noise is described simply as

$$r(t) = \xi(t)\, s(t) \qquad (1.2)$$

where ξ(t) is the fading, s(t) is the transmitted signal and r(t) is the received signal. Multiplicative noise significantly transforms the spectrum of the received signal and thus severely impairs communication systems. The realistic situation is often much more complicated, since different frequencies are transmitted with different gains [12]; thus one has to distinguish between so-called flat (non-selective) and frequency selective fading. "Flatness" of the fading is closely related to the product of the delay spread of the signal and its bandwidth [13,14]. Large-scale fading, or shadowing, is mainly defined by the absorption of radiowaves, governed by changes in the temperature, composition and other factors of the propagation medium. Such changes are very slow compared to the period of the carrier frequency, and the signal can be assumed constant over many cycles of the carrier. However, these changes are significant if a relatively long communication session is considered. In contrast, fast fading is explained by the changing phase conditions on the propagation path, with the corresponding destruction or reconstruction of coherency between possible multiple propagation paths. As a result, the effective magnitude and phase of the signal vary significantly, which is particularly important in cellular, personal and satellite communications [8,12]. Furthermore, these variations are comparable in time scale to the duration of a bit, which explains the term "fast fading". A combination of fast and slow fading results in non-stationary channel conditions. Nevertheless, if the performance of a communication system is considered at the level of a bit, a code block or sometimes a packet or frame, it can be assumed that the fading is at least locally stationary.
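The additive and multiplicative models of eqs. (1.1) and (1.2) are straightforward to simulate. The sketch below is illustrative only (the function and variable names are our own, not from the book); it applies a flat Rayleigh fading envelope, a common model for ξ(t), to a BPSK symbol stream:

```python
import math, random

def rayleigh_fading(n, seed=1):
    """Generate n samples of a flat Rayleigh fading envelope xi(t):
    the magnitude of a complex Gaussian with unit average power."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0, math.sqrt(0.5)),
                       rng.gauss(0, math.sqrt(0.5))) for _ in range(n)]

def received(s, xi):
    """r(t) = xi(t) * s(t), the multiplicative channel of eq. (1.2)."""
    return [a * b for a, b in zip(xi, s)]

s = [1.0 if bit else -1.0 for bit in (1, 0, 1, 1, 0)]  # BPSK symbols
xi = rayleigh_fading(len(s))
r = received(s, xi)
# a real, non-negative envelope scales the magnitude but keeps the sign
print(all((ri > 0) == (si > 0) for ri, si in zip(r, s)))
```

Frequency selective fading would instead require a tapped-delay-line (filter) model, as discussed above.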
However, if the long term performance of the system is the concern, for example when evaluating the performance of communication protocols by means of numerical simulation, the non-stationary nature must be taken into account.

A digital communication channel (section B–B′ in Fig. 1.1) is characterized by a discrete set of transmitted and received symbols, and can usually be described by the distribution of errors, i.e. of incorrectly received symbols. A great number of statistics are useful in such a description, for example the average probability of error, the average number of errors in a block of a given length, etc. A continuous channel can be statistically mapped into a discrete channel if the modulation and its characteristics are known. If simple modulation techniques are used, the performance of the modulation technique can be expressed analytically and the corresponding statistics of the discrete channel can be derived; in this case, modeling of the continuous channel is unnecessary. However, modern communication algorithms often involve modulation and coding techniques whose performance cannot be described analytically. In this case, it is very important to be able to reproduce the proper statistics of the underlying continuous channel. This book discusses techniques for modeling both continuous and discrete channels.
REFERENCES

1. E. Costamagna, L. Favalli, and P. Gamba, Multipath Channel Modeling with Chaotic Attractors, Proceedings of the IEEE, vol. 90, no. 5, May 2002, pp. 842–859.
2. D. Middleton, An Introduction to Statistical Communication Theory, New York: McGraw-Hill, 1960.
3. C.W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, Berlin: Springer, 1994, 442 pp.
4. D. Middleton, Non-Gaussian Noise Models in Signal Processing for Telecommunications: New Methods and Results for Class A and Class B Noise Models, IEEE Trans. Information Theory, vol. 45, no. 4, May 1999, pp. 1129–1149.
5. H. van Trees, Detection, Estimation, and Modulation Theory, New York: Wiley, 1968–1971.
6. D. Klovsky, V. Kontorovich, and S. Shirokov, Models of Continuous Communications Channels Based on Stochastic Differential Equations, Moscow: Radio i sviaz, 1984 (in Russian).
7. A.N. Malakhov, Kumuliantnyi Analiz Sluchainykh Negaussovykh Protsessov i Ikh Preobrazovanii, Moscow: Sov. Radio, 1978 (in Russian).
8. M. Jeruchim, Simulation of Communication Systems: Modeling, Methodology, and Techniques, New York: Kluwer Academic/Plenum Publishers, 2000.
9. J. Proakis, Digital Communications, Boston: McGraw-Hill, 2000.
10. S. Benedetto and E. Biglieri, Principles of Digital Transmission: with Wireless Applications, New York: Kluwer Academic/Plenum Press, 1999.
11. R. Steele and L. Hanzo, Mobile Radio Communications: Second and Third Generation Cellular and WATM Systems, New York: Wiley, 1999.
12. M. Simon and S. Alouini, Digital Communication Over Fading Channels: A Unified Approach to Performance Analysis, New York: Wiley, 2000.
13. T. Rappaport, Wireless Communications: Principles and Practice, Upper Saddle River, NJ: Prentice Hall PTR, 2001.
14. P. Bello, Characterization of Randomly Time-Variant Linear Channels, IEEE Trans. Communications, vol. 11, no. 4, December 1963, pp. 360–393.
2
Random Variables and Their Description

This chapter briefly summarizes definitions and important results related to the description of random variables and random vectors. More detailed discussions can be found in [1,2]. A number of useful examples, important for applications, are also considered in this chapter.
2.1
RANDOM VARIABLES AND THEIR DESCRIPTION
2.1.1
Definitions and Method of Description

2.1.1.1
Classification
A Random Variable (RV) ξ1 can be considered as an outcome of an experiment, physical or imaginary, such that this quantity has a different value from one run of the experiment to another, even if the experiment is repeated under the same conditions. The difference in the outcomes may come either as a result of unaccounted conditions (variables) or, as in quantum theory, from the internal properties of the system under consideration. In order to describe a random variable one needs to specify the range of possible values which this variable can attain. In addition, some numerical measure of the probability of these outcomes must be assigned.

Based on the type of values the random variable under consideration can assume, it is possible to distinguish three classes of random variable: (a) a discrete random variable; (b) a continuous random variable; and (c) a mixed random variable. For a discrete random variable there is a finite, or infinite but countable, number of values the random variable can attain; the possible values can be enumerated, i.e. a countable set {x_n}, n = 0, 1, ..., covers all the possibilities. A continuous random variable assumes values from a single interval or from multiple non-intersecting intervals [a_i, b_i], a_i < b_i < a_{i+1} < ..., i = 1, 2, ..., on the real axis; either end of an interval can be at infinity. Finally, a mixed random variable may assume values from both discrete and continuous sets.

There are a number of characteristics which allow a complete description of a random variable ξ. In particular, the Cumulative Distribution Function (CDF) P_ξ(x), the Probability

1 In the following, Greek letters are used to denote random variables. The Latin letters a, b, c are reserved for constants and x, y, z for variables.
Density Function (PDF) p_ξ(x) and the characteristic function Θ_ξ(ju) are the most important characteristics of random variables, allowing their complete description.

2.1.1.2
Cumulative Distribution Function
The cumulative distribution function P_ξ(x) is defined as the probability of the event that the random variable ξ does not exceed a certain threshold x, i.e.

$$P_\xi(x) = \mathrm{Prob}\{\xi < x\} \qquad (2.1)$$
Any CDF P_ξ(x) is a non-negative, non-decreasing function, continuous on the left, which satisfies the following boundary conditions:

$$P_\xi(-\infty) = 0, \qquad P_\xi(\infty) = 1 \qquad (2.2)$$
The converse is also valid: every function which satisfies the above listed conditions is the CDF of some random variable [1]. If the CDF P_ξ(x) is known, then the probability that the random variable falls inside a finite interval [a, b) or a union of non-overlapping intervals [a_i, b_i) is given by

$$\mathrm{Prob}\{a \le \xi < b\} = P_\xi(b) - P_\xi(a), \qquad \mathrm{Prob}\left\{\xi \in \cup_i [a_i, b_i)\right\} = \sum_i \left[P_\xi(b_i) - P_\xi(a_i)\right] \qquad (2.3)$$
The advantage of the CDF method is the fact that discrete, continuous and mixed variables are described using the same technique. However, such a description is an integral representation and thus is not easy to interpret. For a complete description of a discrete random variable ξ which assumes a set of values x_k, k = 0, 1, ..., one needs to know the distribution of probabilities P_k, i.e.

$$P_k = \mathrm{Prob}\{\xi = x_k\} \qquad (2.4)$$

It is clear that

$$P_k \ge 0, \qquad \sum_k P_k = 1 \qquad (2.5)$$
Example: binomial distribution. This distribution arises when a given experiment with two possible outcomes, "SUCCESS" and "FAILURE", is repeated N times in a row. The probability of a "SUCCESS" outcome is 0 ≤ p ≤ 1, and the probability of a "FAILURE" is q = 1 − p. The probability P_N(k) of k successes in a set of N trials is then given by2

$$P_N(k) = C_N^k\, p^k q^{N-k} = \frac{N!}{k!\,(N-k)!}\, p^k q^{N-k}, \qquad k = 0, 1, \ldots, N \qquad (2.6)$$

2 Here the following notation is used for the number of combinations of k elements from a set of N elements: C_N^k = N!/[k!(N−k)!] [5].
It can be seen that the probability P_N(k) is the coefficient of x^k in the expansion of (q + px)^N into a power series with respect to x; this explains the term "binomial distribution". Situations described by the binomial law are often encountered in communication theory and many other applications. Examples include the number of errors ("failures") in a group of N bits, the number of corrupted packets in a frame, the number of rejected calls during a certain period of time, etc.

Example: the Poisson random variable. In the case of the Poisson random variable, the random variable assumes integer values with probabilities defined by

$$P_k = \mathrm{Prob}\{\xi = k\} = \frac{\lambda^k}{k!} \exp(-\lambda) \qquad (2.7)$$
This distribution can be obtained as a limiting case of the binomial distribution (2.6) when the number of experiments approaches infinity while the product λ = Np remains constant [1,3]. This distribution also plays an important role in communication theory, reliability theory and networking.
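The limiting relation between the binomial law (2.6) and the Poisson law (2.7) is easy to verify numerically. The following sketch (illustrative; the names are our own) compares the two PMFs as N grows while λ = Np is held fixed:

```python
from math import comb, exp, factorial

def binom_pmf(k, N, p):
    """P_N(k) = C(N,k) p^k q^(N-k), eq. (2.6)."""
    return comb(N, k) * p ** k * (1 - p) ** (N - k)

def poisson_pmf(k, lam):
    """P_k = lam^k exp(-lam) / k!, eq. (2.7)."""
    return lam ** k * exp(-lam) / factorial(k)

lam = 2.0
for N in (10, 100, 1000):
    p = lam / N                       # keep lambda = N*p fixed
    err = max(abs(binom_pmf(k, N, p) - poisson_pmf(k, lam)) for k in range(10))
    print(N, round(err, 5))           # the discrepancy shrinks as N grows
```

The printed discrepancy decreases roughly as 1/N, in line with the limiting argument above.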
2.1.1.3
Probability Density Function
A continuous random variable can be described by the probability density function p_ξ(x), defined such that

$$p_\xi(x) = \frac{d}{dx} P_\xi(x), \qquad P_\xi(x) = \int_{-\infty}^{x} p_\xi(s)\, ds \qquad (2.8)$$
Any PDF satisfies the following conditions, which follow from the definition (2.8):

1. it is a non-negative function:
$$p_\xi(x) \ge 0 \qquad (2.9)$$

2. it is normalized to one:
$$\int_{-\infty}^{\infty} p_\xi(x)\, dx = 1 \qquad (2.10)$$

3. the probability Prob{x_1 ≤ ξ < x_2} is given by
$$\mathrm{Prob}\{x_1 \le \xi < x_2\} = \int_{x_1}^{x_2} p_\xi(x)\, dx = P_\xi(x_2) - P_\xi(x_1) \qquad (2.11)$$
Formally, the PDF of a discrete random variable can be defined as a sum of delta functions weighted by the probability of each discrete event, i.e.

$$p_\xi(x) = \sum_k P_k\, \delta(x - x_k) \qquad (2.12)$$
A great variety of probability densities is used in applications; discussion of particular distributions is postponed until later. However, it is important to mention the so-called Gaussian (normal) distribution, with the PDF

$$p(x) = N(x; m, D) = \frac{1}{\sqrt{2\pi D}} \exp\left[ -\frac{(x-m)^2}{2D} \right] \qquad (2.13)$$

and the so-called uniform distribution, with the PDF

$$p(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases} \qquad (2.14)$$

A much wider class of PDFs belongs to the so-called Pearson family [4], which is described through a differential equation for the PDF:

$$\frac{d}{dx} \ln p(x) = \frac{\dfrac{d}{dx}\, p(x)}{p(x)} = \frac{x - a}{b_0 + b_1 x + b_2 x^2} \qquad (2.15)$$
It follows from eqs. (2.13) and (2.14) that the CDFs of the Gaussian and uniform distributions are given by

$$P_N(x) = \int_{-\infty}^{x} N(x; m, D)\, dx = \Phi\left( \frac{x-m}{\sqrt{D}} \right) \qquad (2.16)$$

$$P_u(x) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x \ge b \end{cases} \qquad (2.17)$$

respectively. Here

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left( -\frac{t^2}{2} \right) dt = \frac{1}{2}\left[ 1 + \mathrm{erf}\left( \frac{x}{\sqrt{2}} \right) \right] \qquad (2.18)$$

is the probability integral, which in turn can be expressed in terms of the error function [5].
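Since erf is available in most standard math libraries, eq. (2.18) gives a convenient way to compute the Gaussian CDF. A quick sketch (illustrative names) checks Φ(x) = ½[1 + erf(x/√2)] against direct numerical integration of the standard normal PDF:

```python
import math

def phi(x):
    """Probability integral via the error function, eq. (2.18)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_numeric(x, lo=-10.0, n=20000):
    """Trapezoidal integration of the standard normal PDF from -inf (~lo) to x."""
    h = (x - lo) / n
    f = lambda t: math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
    s = 0.5 * (f(lo) + f(x)) + sum(f(lo + i * h) for i in range(1, n))
    return s * h

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, round(phi(x), 6), round(phi_numeric(x), 6))
```

The two columns agree to the displayed precision, confirming the identity.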
2.1.1.4
The Characteristic Function and the Log-Characteristic Function
Instead of considering the CDF P_ξ(x) or the PDF p_ξ(x), one can consider an equivalent description by means of the Characteristic Function (CF). The characteristic function Θ_ξ(ju) is defined as the Fourier transform of the corresponding PDF3

$$\Theta_\xi(ju) = \int_{-\infty}^{\infty} p_\xi(x)\, e^{jux}\, dx, \qquad p_\xi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Theta_\xi(ju)\, e^{-jux}\, du \qquad (2.19)$$

Thus, one can restore the PDF p_ξ(x) from the corresponding characteristic function Θ_ξ(ju).

3 Note the plus sign in the direct transform and the minus sign in the inverse transform.
There are a number of properties of the CF which follow from the definition (2.19). In particular,

$$|\Theta_\xi(ju)| \le \int_{-\infty}^{\infty} |p_\xi(x)|\, |e^{jux}|\, dx = \int_{-\infty}^{\infty} p_\xi(x)\, dx = \Theta_\xi(0) = 1 \qquad (2.20)$$

and

$$\Theta_\xi(-ju) = \int_{-\infty}^{\infty} p_\xi(x)\, e^{-jux}\, dx = \Theta_\xi^{*}(ju) \qquad (2.21)$$
Additional properties of the characteristic function follow from the properties of the Fourier transform pair and can be found in [1,6]. Tables of the Fourier transforms of PDFs are also available [7]. In addition to the characteristic function, one can define the so-called log-characteristic function (or cumulant generating function) as

$$\Psi_\xi(ju) = \ln \Theta_\xi(ju) \qquad (2.22)$$
This function is later used to define the cumulants of a random variable.

Example: Gaussian distribution. The characteristic function of a Gaussian distribution can be obtained by using a standard integral (3.323, page 333 in [8])

$$\int_{-\infty}^{\infty} \exp(-p^2 x^2 - qx)\, dx = \frac{\sqrt{\pi}}{p} \exp\left( \frac{q^2}{4p^2} \right) \qquad (2.23)$$

with q = −ju and p² = 1/(2D). This results in

$$\Theta_N(ju) = \exp\left( jum - \frac{Du^2}{2} \right) \qquad (2.24)$$
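The closed form (2.24) can be checked by evaluating ⟨exp(juξ)⟩ as a numerical integral. The sketch below is illustrative (names and the choice m = 1, D = 4 are our own):

```python
import cmath, math

def cf_gaussian(u, m, D):
    """Theta_N(ju) = exp(jum - D u^2 / 2), eq. (2.24)."""
    return cmath.exp(1j * u * m - D * u * u / 2.0)

def cf_numeric(u, m, D, half_width=10.0, n=4000):
    """<exp(ju xi)> by trapezoidal integration of the Gaussian PDF."""
    lo, hi = m - half_width * math.sqrt(D), m + half_width * math.sqrt(D)
    h = (hi - lo) / n
    pdf = lambda x: math.exp(-(x - m) ** 2 / (2 * D)) / math.sqrt(2 * math.pi * D)
    g = lambda x: pdf(x) * cmath.exp(1j * u * x)
    s = 0.5 * (g(lo) + g(hi)) + sum(g(lo + i * h) for i in range(1, n))
    return s * h

for u in (0.0, 0.5, 1.0):
    print(u, abs(cf_gaussian(u, 1.0, 4.0) - cf_numeric(u, 1.0, 4.0)))
```

The printed differences are at the level of the integration error, i.e. negligibly small.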
The Log-Characteristic Function (LCF) Ψ_N(ju) of the Gaussian distribution is then

$$\Psi_N(ju) = jum - \frac{Du^2}{2} \qquad (2.25)$$

2.1.1.5
Statistical Averages
The CDF, PDF and CF each provide a complete description of any random variable. However, in many practical problems it is possible to limit consideration to a less complete but simpler description of random variables. This can be achieved, for example, through the use of statistical averages. Let η = g(ξ) be a deterministic function of a random variable ξ described by the PDF p_ξ(x). The ensemble average of the function g(ξ) is defined as

$$E_\xi\{g(\xi)\} = \langle g(\xi) \rangle_\xi = \int_{-\infty}^{\infty} g(x)\, p_\xi(x)\, dx \qquad (2.26)$$
where the subscript ξ indicates over which variable the averaging is performed. In what follows, the subscript is dropped when it does not create ambiguity. By specifying a particular form of the function g(ξ) it is possible to obtain various numerical characteristics of the random variable. Because averaging is a linear operation, it is possible to interchange the order of averaging and summation, i.e.

$$E\left\{ \sum_{k=1}^{K} c_k\, g_k(\xi) \right\} = \sum_{k=1}^{K} c_k\, E\{g_k(\xi)\} \qquad (2.27)$$

2.1.1.6
Moments
A moment m_n of order n of a random variable ξ is obtained if the function g(ξ) in eq. (2.26) is chosen as g(ξ) = ξ^n, thus giving

$$m_n = \int_{-\infty}^{\infty} x^n\, p_\xi(x)\, dx \qquad (2.28)$$

The first order moment4 m_1 = m is called the average (mean) value of the random variable ξ. It follows from the definition that the dimension of the average coincides with that of the random variable; the average of a deterministic variable coincides with the value of the variable itself; and the average of a random variable whose PDF is a symmetric function around x = a is equal to m = a.
2.1.1.7
Central Moments
A central moment μ_n can be obtained from eq. (2.26) by setting g(ξ) = (ξ − m_1)^n, i.e.

$$\mu_n = \langle (\xi - m_1)^n \rangle = \int_{-\infty}^{\infty} (x - m_1)^n\, p_\xi(x)\, dx \qquad (2.29)$$

The first central moment is always zero: μ_1 = 0. The second central moment μ_2 = D = σ² is called the variance, and represents the degree of variation of the random variable around its mean. The quantity σ is known as the standard deviation. The following properties of the variance can easily be obtained from the definition (2.29):4

- the variance D has the dimension of the square of the random variable ξ;
- D ≥ 0, and D = 0 if and only if ξ is deterministic;

4 In the following, the subindex indicating the random variable is dropped if it does not cause confusion.
- the variance D_η of η = cξ is equal to c²D_ξ, where c is a deterministic constant;
- the variance of η = ξ + c is equal to D_η = D_ξ;
- the following inequality (the Tschebychev inequality) is valid [3]:

$$\mathrm{Prob}\{|\xi - m_1| \ge \varepsilon\} \le \frac{D}{\varepsilon^2} \qquad (2.30)$$
Here ε is an arbitrary positive number. The last property states that large deviations from the average value are very rare.

It is possible to obtain relations between the moments and the central moments of the same random variable using the binomial expansion. Indeed,

$$\mu_n = \int_{-\infty}^{\infty} (x - m)^n\, p(x)\, dx = \int_{-\infty}^{\infty} \sum_{k=0}^{n} C_n^k (-1)^k x^{n-k} m^k\, p(x)\, dx = \sum_{k=0}^{n} C_n^k (-1)^k m_{n-k}\, m^k \qquad (2.31)$$

Conversely,

$$m_n = \sum_{k=0}^{n} C_n^k\, \mu_{n-k}\, m^k \qquad (2.32)$$
It is interesting to mention that a moment (central moment) of order n depends on all the central moments (moments) of lower or equal order. In particular,

$$m_2 = D + m_1^2 \qquad (2.33)$$
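Relation (2.33) is easily confirmed on sample data; a minimal sketch (illustrative names):

```python
def moment(xs, n):
    """Sample moment m_n = <x^n>."""
    return sum(x ** n for x in xs) / len(xs)

def central_moment(xs, n):
    """Sample central moment mu_n = <(x - m1)^n>."""
    m1 = moment(xs, 1)
    return sum((x - m1) ** n for x in xs) / len(xs)

xs = [1.0, 2.0, 2.0, 3.0, 7.0]
m1, m2 = moment(xs, 1), moment(xs, 2)
D = central_moment(xs, 2)
print(abs(m2 - (D + m1 ** 2)) < 1e-9)   # eq. (2.33) → True
```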
Example: Gaussian PDF. In the case of the Gaussian distribution, the mean value and the variance are

$$m_1 = \int_{-\infty}^{\infty} x\, \frac{1}{\sqrt{2\pi D}} \exp\left[ -\frac{(x-m)^2}{2D} \right] dx = m \qquad (2.34)$$

$$\mu_2 = \int_{-\infty}^{\infty} (x-m)^2\, \frac{1}{\sqrt{2\pi D}} \exp\left[ -\frac{(x-m)^2}{2D} \right] dx = D = \sigma^2 \qquad (2.35)$$

respectively. Thus, the parameters m and D of the Gaussian distribution are indeed the mean and the variance of the distribution.
2.1.1.8
Other Quantities
There are a great number of other numerical characteristics which are useful in describing a random variable. A few of them are listed below; much more detailed discussions and applications can be found in [1,2].
If the PDF p(x) has a (local) maximum at x = x_m, the value x_m is called a mode of the distribution. A distribution with a single mode is called unimodal, while a distribution with more than one mode is called multimodal. The median x_{1/2} of the distribution is a value such that

$$P(x_{1/2}) = \int_{-\infty}^{x_{1/2}} p(x)\, dx = \int_{x_{1/2}}^{\infty} p(x)\, dx = \frac{1}{2} \qquad (2.36)$$

This can be generalized to define (n − 1) values {x_k}, k = 0, ..., n − 2, called the quantiles, which divide the real axis into n equally probable regions:

$$\int_{-\infty}^{x_0} p(x)\, dx = \int_{x_0}^{x_1} p(x)\, dx = \cdots = \int_{x_{n-2}}^{\infty} p(x)\, dx = \frac{1}{n} \qquad (2.37)$$
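For a distribution with a known CDF, the median and quantiles of eqs. (2.36) and (2.37) are obtained by inverting P(x). A sketch for the exponential distribution P(x) = 1 − exp(−λx), whose inverse CDF is available in closed form (names are illustrative):

```python
import math

lam = 2.0                                        # rate of the exponential distribution
cdf = lambda x: 1.0 - math.exp(-lam * x)
quantile = lambda q: -math.log(1.0 - q) / lam    # inverse CDF

median = quantile(0.5)                           # eq. (2.36): P(x_1/2) = 1/2
print(abs(median - math.log(2.0) / lam) < 1e-12)

# eq. (2.37): quartiles (n = 4) split the axis into 4 equally probable regions
quartiles = [quantile(k / 4) for k in (1, 2, 3)]
print(all(abs(cdf(x) - k / 4) < 1e-12 for k, x in zip((1, 2, 3), quartiles)))
```

The well known result that the exponential median equals ln 2/λ drops out directly.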
Absolute moments can be defined by setting g(ξ) = |ξ|^n or g(ξ) = |ξ − m|^n in eq. (2.26). In addition, the requirement that the power n be an integer can be dropped, to obtain moments of fractional order. Finally, the entropy (differential entropy) of the distribution can be defined as

$$H(\xi) = -\sum_{k=1}^{K} p_k \log p_k, \qquad H(\xi) = -\int_{-\infty}^{\infty} p_\xi(x) \log p_\xi(x)\, dx \qquad (2.38)$$
The entropy is an important concept which is discussed in detail in textbooks on information theory. It also has significant application as a measure of difference between two PDFs.
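As a check of the continuous definition in eq. (2.38), the differential entropy of a Gaussian PDF equals ½ ln(2πeD) nats, a standard result which direct numerical integration reproduces (sketch with illustrative names):

```python
import math

def gaussian_entropy_numeric(m, D, half_width=12.0, n=20000):
    """H = -integral p(x) ln p(x) dx for N(x; m, D), by the trapezoidal rule."""
    lo, hi = m - half_width * math.sqrt(D), m + half_width * math.sqrt(D)
    h = (hi - lo) / n
    p = lambda x: math.exp(-(x - m) ** 2 / (2 * D)) / math.sqrt(2 * math.pi * D)
    g = lambda x: -p(x) * math.log(p(x))
    s = 0.5 * (g(lo) + g(hi)) + sum(g(lo + i * h) for i in range(1, n))
    return s * h

D = 4.0
closed_form = 0.5 * math.log(2 * math.pi * math.e * D)
print(abs(gaussian_entropy_numeric(1.0, D) - closed_form) < 1e-6)   # → True
```

Note that the entropy depends on D but not on the mean m, as shifting a PDF does not change eq. (2.38).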
2.1.1.9
Moment and Cumulant Generating Functions
The characteristic function Θ_ξ(ju) was formally defined in Section 2.1.1.4 as the Fourier transform of the corresponding PDF p_ξ(x). It can also be defined in the framework of the statistical averaging of eq. (2.26). Indeed, let g(ξ) be an exponential function with a parameter s: g(ξ) = exp(sξ). In this case the average value M_ξ(s) of g(ξ) is defined as

$$M_\xi(s) = \langle \exp(s\xi) \rangle = \int_{-\infty}^{\infty} \exp(sx)\, p_\xi(x)\, dx \qquad (2.39)$$

and is called the moment generating function. Expanding the exponential term exp(sx) into a power series, one obtains the relation between the moment generating function and the moments of the distribution of the random variable ξ:

$$M_\xi(s) = \left\langle \sum_{k=0}^{\infty} \frac{\xi^k}{k!} s^k \right\rangle = \sum_{k=0}^{\infty} \frac{m_k}{k!} s^k \qquad (2.40)$$
if all moments exist and are finite. In turn, the coefficients of the Taylor expansion (2.40), i.e. the moments m_k, can be found as

$$m_k = \left. \frac{d^k}{ds^k} M_\xi(s) \right|_{s=0} \qquad (2.41)$$

The transform variable s can be complex, s = σ + ju. In the particular case s = ju, the moment generating function coincides with the characteristic function, Θ_ξ(ju) = M_ξ(s)|_{s=ju}, and the expansion (2.40) can be rewritten as

$$\Theta_\xi(ju) = \left\langle \sum_{k=0}^{\infty} \frac{\xi^k}{k!} (ju)^k \right\rangle = \sum_{k=0}^{\infty} \frac{m_k}{k!}\, j^k u^k \qquad (2.42)$$

with the moments defined according to

$$m_k = (-j)^k \left. \frac{d^k}{du^k}\, \Theta_\xi(ju) \right|_{u=0} \qquad (2.43)$$
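Equation (2.41) can be illustrated with the exponential distribution p(x) = λ exp(−λx), x ≥ 0, whose MGF is M(s) = λ/(λ − s) and whose moments are m_k = k!/λ^k; these are standard facts, not taken from the text. The sketch below (illustrative names) approximates the derivatives of M(s) at s = 0 by central finite differences:

```python
import math

lam = 3.0
M = lambda s: lam / (lam - s)        # MGF of the exponential distribution

def mk_numeric(k, h=1e-3):
    """k-th derivative of M at s = 0 by a central finite difference, cf. eq. (2.41)."""
    return sum((-1) ** i * math.comb(k, i) * M((k / 2 - i) * h)
               for i in range(k + 1)) / h ** k

for k in (1, 2, 3):
    exact = math.factorial(k) / lam ** k     # m_k = k!/lam^k
    print(k, round(mk_numeric(k), 6), round(exact, 6))
```

The numerically differentiated values reproduce the closed-form moments to the step-size accuracy.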
Thus, under certain conditions, the characteristic function Θ_ξ(ju) can be restored from the moments. It is interesting to investigate what additional conditions must be imposed on the moments such that the restored characteristic function uniquely defines the distribution; this problem is known as the moments problem. It is possible to show that if all moments m_k are finite and the series (2.42) converges absolutely for some u > 0, then the series (2.42) defines a unique distribution [2]. It should be noted that this is not true for an arbitrary distribution: for example, the log-normal distribution, considered below, is not uniquely defined by its moments [2]. Usually, non-uniqueness arises when the moments m_k increase rapidly with the index k, thus preventing absolute convergence of the series (2.42). Such problems are often found for distributions with heavy tails; in many cases the higher order moments do not even exist [2]. A slight modification allows the characteristic function to be expressed in terms of central moments, by rewriting the series (2.42) as

$$\Theta_\xi(ju) = e^{jum_1} \left[ 1 + \sum_{k=1}^{\infty} \frac{\mu_k}{k!} (ju)^k \right] \qquad (2.44)$$

Here we use the property that the moments of the random variable η = ξ − m_1 are the central moments of the random variable ξ, and the property that the characteristic function of the PDF p_ξ(x + m_1) is Θ_ξ(ju) exp(−jum_1).
2.1.1.10
Cumulants
Instead of considering the Taylor expansion of the characteristic function, one can construct a similar expansion of the log-characteristic function Ψ_ξ(ju), i.e. it is possible to formally write

$$\Psi_\xi(ju) = \sum_{k=1}^{\infty} \frac{\kappa_k}{k!}\, j^k u^k \qquad (2.45)$$
Coefficients of this expansion are called the cumulants of the distribution p_ξ(x) of the random variable ξ, with κ_k being the cumulant of k-th order. Since both the coefficients m_k and κ_k describe the same function, there is a close relation between the moments and the cumulants. Indeed, provided that both expansions (2.42) and (2.45) are possible, one can write

$$\Theta_\xi(ju) = \sum_{k=0}^{\infty} \frac{m_k}{k!} (ju)^k = \exp\left[ \sum_{r=1}^{\infty} \frac{\kappa_r}{r!} (ju)^r \right] = \prod_{r=1}^{\infty} \exp\left[ \frac{\kappa_r}{r!} (ju)^r \right] = \prod_{r=1}^{\infty} \sum_{l=0}^{\infty} \frac{1}{l!} \left[ \frac{\kappa_r (ju)^r}{r!} \right]^l \qquad (2.46)$$

Collecting terms with the same power of u on both sides of this expansion, one obtains the expression for the moment of order r in terms of the cumulants of order up to r:

$$m_r = r! \sum \frac{\kappa_1^{p_1} \kappa_2^{p_2} \cdots \kappa_m^{p_m}}{p_1!\, p_2! \cdots p_m!\, (1!)^{p_1} (2!)^{p_2} \cdots (m!)^{p_m}} \qquad (2.47)$$

where the summation is taken over all non-negative values of the indices p_i such that

$$1 p_1 + 2 p_2 + \cdots + m p_m = r \qquad (2.48)$$

In a similar way, the expression for the cumulant κ_r can be written in terms of the moments of order up to r [2]:

$$\kappa_r = r! \sum (-1)^{\rho - 1} (\rho - 1)!\, \frac{m_1^{p_1} m_2^{p_2} \cdots m_m^{p_m}}{p_1!\, p_2! \cdots p_m!\, (1!)^{p_1} (2!)^{p_2} \cdots (m!)^{p_m}} \qquad (2.49)$$

The summation is extended over all indices p_i ≥ 0 such that

$$1 p_1 + 2 p_2 + \cdots + m p_m = r, \qquad \rho = p_1 + p_2 + \cdots + p_m \qquad (2.50)$$

There are a number of tables which contain explicit expressions for the cumulants and moments up to order 12 [2], while [9] provides a convenient means of deriving relations between the cumulants and the moments.
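For small orders the partition sum (2.47) reduces to familiar identities: m_1 = κ_1, m_2 = κ_2 + κ_1², m_3 = κ_3 + 3κ_1κ_2 + κ_1³. The sketch below (illustrative; not from the book's tables) uses an equivalent standard recursion, m_r = Σ_{k=1}^{r} C(r−1, k−1) κ_k m_{r−k}, and checks it on a Poisson distribution, all of whose cumulants equal λ:

```python
from math import comb

def moments_from_cumulants(kappa):
    """Raw moments m_1..m_n from cumulants k_1..k_n via the standard recursion
    m_r = sum_{k=1}^{r} C(r-1, k-1) kappa_k m_{r-k}, equivalent to eq. (2.47)."""
    n = len(kappa)
    m = [1.0]                                   # m_0 = 1
    for r in range(1, n + 1):
        m.append(sum(comb(r - 1, k - 1) * kappa[k - 1] * m[r - k]
                     for k in range(1, r + 1)))
    return m[1:]

# Poisson(lam): all cumulants equal lam; for lam = 1 the raw moments are
# the Bell numbers 1, 2, 5, 15, ...
print(moments_from_cumulants([1.0, 1.0, 1.0, 1.0]))   # → [1.0, 2.0, 5.0, 15.0]
```

As a second sanity check, a zero-mean unit-variance Gaussian (κ_2 = 1, all other cumulants zero) yields m_2 = 1, m_3 = 0, m_4 = 3.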
2.2
ORTHOGONAL EXPANSIONS OF PROBABILITY DENSITIES: EDGEWORTH AND LAGUERRE SERIES
In many practical cases one has to deal with a probability density p(x) which looks similar to the Gaussian one defined by eq. (2.13). Two characteristic features of such distributions can be summarized as follows:

1. unimodality, i.e. the PDF has a single maximum; and
2. tails extending to infinity on both sides of the maximum, which decay rapidly as the magnitude of the argument approaches infinity.
In this case it is often possible to approximate such PDFs using series of Hermite or Laguerre polynomials.

2.2.1
The Edgeworth Series
In this case the PDF p(x) under consideration is approximated by the following series:

$$p(x) = p_0(x) \sum_{n=0}^{\infty} \frac{b_n}{n!\, \sigma^n}\, H_n\!\left( \frac{x-m}{\sigma} \right) \qquad (2.51)$$

Here

$$p_0(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(x-m)^2}{2\sigma^2} \right] \qquad (2.52)$$

is the Gaussian PDF with mean value m and variance σ², and H_n(z) stands for the Hermite polynomials [5]

$$H_n(z) = (-1)^n \exp\left( \frac{z^2}{2} \right) \frac{d^n}{dz^n} \exp\left( -\frac{z^2}{2} \right), \qquad n = 0, 1, \ldots \qquad (2.53)$$

Since the Hermite polynomials are orthogonal with the weight p_0(z) (taken with m = 0, σ = 1), i.e.

$$\int_{-\infty}^{\infty} H_n(z)\, H_m(z)\, p_0(z)\, dz = n!\, \delta_{mn} = \begin{cases} n!, & m = n \\ 0, & m \ne n \end{cases} \qquad (2.54)$$
the coefficients b_n of the expansion (2.51), called quasi-moments, can be calculated as

$$b_n = \sigma^n \int_{-\infty}^{\infty} p(x)\, H_n\!\left( \frac{x-m}{\sigma} \right) dx = \sigma^n \left\langle H_n\!\left( \frac{\xi - m}{\sigma} \right) \right\rangle \qquad (2.55)$$

This expansion is based on a well known theorem of functional analysis [10], stating that if p(x) is an arbitrary function such that

$$\int_{-\infty}^{\infty} |p(x)|^2\, dx < \infty \qquad (2.56)$$

then

$$\lim_{N \to \infty} \int_{-\infty}^{\infty} \left| p(x) - p_0(x) \sum_{n=0}^{N} \frac{b_n}{n!\, \sigma^n}\, H_n\!\left( \frac{x-m}{\sigma} \right) \right|^2 dx = 0 \qquad (2.57)$$

In practice the function p(x) is known only to a certain degree of accuracy. Thus, the sum (2.51) can be truncated after a finite number, N, of terms. The number N depends on the
choice of m and σ². In most cases the best choice of m and σ² for a given N is to make them coincide with the first two cumulants derived directly from p(x). Then it is easy to show that

$$b_0 = 1, \qquad b_1 = b_2 = 0 \qquad (2.58)$$

Indeed, using eq. (2.53) we find that

$$H_0(z) = 1, \quad H_1(z) = z, \quad H_2(z) = z^2 - 1, \quad H_3(z) = z^3 - 3z, \quad H_4(z) = z^4 - 6z^2 + 3 \qquad (2.59)$$

Thus, according to eq. (2.55),

$$b_0 = \langle H_0 \rangle = \langle 1 \rangle = 1$$
$$b_1 = \sigma \left\langle H_1\!\left( \frac{\xi - m}{\sigma} \right) \right\rangle = \sigma \left\langle \frac{\xi - m}{\sigma} \right\rangle = 0$$
$$b_2 = \sigma^2 \left\langle H_2\!\left( \frac{\xi - m}{\sigma} \right) \right\rangle = \sigma^2 \left\langle \frac{(\xi - m)^2}{\sigma^2} - 1 \right\rangle = \sigma^2 \left( \frac{\sigma^2}{\sigma^2} - 1 \right) = 0 \qquad (2.60)$$
The Edgeworth series is obtained from eq. (2.51) by taking a finite number N of terms and choosing m and σ to coincide with those obtained from p(x), i.e.

$$p(x) \approx p_0(x) \left[ 1 + \sum_{n=3}^{N} \frac{b_n}{n!\, \sigma^n}\, H_n\!\left( \frac{x-m}{\sigma} \right) \right] \qquad (2.61)$$

The first term in the expansion (2.61) corresponds to a Gaussian distribution. Thus, for a Gaussian distribution all quasi-moments b_n, n ≥ 1, are equal to zero. The coefficients b_3 and b_4 in the series (2.61) describe the departure of p(x) from a Gaussian PDF, and are known as the skewness and the kurtosis [2]:

$$\gamma_1 = \frac{b_3}{\sigma^3} = \frac{\mu_3}{\sigma^3} = \frac{\kappa_3}{\kappa_2^{3/2}}, \qquad \gamma_2 = \frac{b_4}{\sigma^4} = \frac{\mu_4}{\sigma^4} - 3 = \frac{\kappa_4}{\kappa_2^2} \qquad (2.62)$$

Here κ_2, κ_3, κ_4 are the cumulants of the distribution p(x) as defined by eq. (2.49), and μ_3, μ_4 are the central moments of third and fourth order. The skewness coefficient describes the deviation of the PDF p(x) from a symmetric shape; for any symmetric distribution this coefficient is equal to zero. Fig. 2.1 shows several cases of such PDFs. PDF (a) has a shallow tail on the right of the mean value, so in the average of (x − m)³ the positive deviations overpower the negative ones, giving μ_3 > 0 and γ_1 > 0. The case γ_1 < 0 corresponds to a situation where the left tail of the distribution is heavier than the right one.
Figure 2.1 Different shapes of PDF similar to a Gaussian PDF: (a) γ_1 > 0, (b) γ_1 < 0, (c) γ_2 > 0 and (d) γ_2 < 0. The dashed line corresponds to a Gaussian distribution.
The kurtosis γ_2 describes how flat the PDF is around its peak value. For a Gaussian PDF this coefficient is equal to zero. Positive values of γ_2 indicate that the PDF curve has a narrower (sharper) top than a Gaussian, while γ_2 < 0 results in a flatter top. Figures 2.1(c) and 2.1(d) show the curves for γ_2 = 0 (Gaussian), γ_2 < 0 and γ_2 > 0. It is common practice to approximate distributions with a small deviation from Gaussian using only γ_1 and γ_2. In this case

$$p(x) \approx p_0(x) \left[ 1 + \frac{\gamma_1}{3!}\, H_3\!\left( \frac{x-m}{\sigma} \right) + \frac{\gamma_2}{4!}\, H_4\!\left( \frac{x-m}{\sigma} \right) \right] \qquad (2.63)$$

To obtain such an approximation one needs to estimate the first four moments (or cumulants) of the random process ξ(t). It is important to mention that for the approximation (2.63) the non-negativity property of the PDF is often violated: it can be shown that the approximating curve assumes negative values for large values of the argument |x| [9]. This is a consequence of the fact that eq. (2.63) is only an approximation and not an exact PDF. The characteristic function for the PDF given by the series (2.51) is equal to

$$\Theta_1(u) = \exp\left( jmu - \frac{(\sigma u)^2}{2} \right) \sum_{n=0}^{\infty} \frac{b_n}{n!} (ju)^n \qquad (2.64)$$

Expanding the exponent into a Taylor series and comparing with the moment series (2.42) of the PDF p(x), one can conclude that the moments are linearly expressed in terms of the quasi-moments (and vice versa). This is the reason why the term
"quasi-moment" is used. Quasi-moments can be used to describe a random process in exactly the same way as regular moments, correlation functions and cumulants. It is important to mention that, despite the fact that eq. (2.63) provides an approximation which takes negative values for large |x|, the importance of the quasi-moment technique lies in its ability to provide an easy algorithm for obtaining a similar approximation at the output of non-linear and linear systems.
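Two properties of the truncated series (2.63) are easy to verify numerically: it integrates to one (the H_3 and H_4 corrections are orthogonal to p_0) and it reproduces the prescribed third central moment μ_3 = γ_1σ³. A sketch (all names and parameter values illustrative):

```python
import math

def edgeworth(x, m, sigma, g1, g2):
    """Two-term Edgeworth approximation, eq. (2.63), with H3, H4 from eq. (2.59)."""
    z = (x - m) / sigma
    p0 = math.exp(-z * z / 2) / (sigma * math.sqrt(2 * math.pi))
    h3, h4 = z ** 3 - 3 * z, z ** 4 - 6 * z ** 2 + 3
    return p0 * (1 + g1 / 6 * h3 + g2 / 24 * h4)

def integral(f, m, sigma, half_width=12.0, n=40000):
    """Trapezoidal integration over m +/- half_width * sigma."""
    lo, hi = m - half_width * sigma, m + half_width * sigma
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return s * h

m, sigma, g1, g2 = 1.0, 2.0, 0.5, 0.3
p = lambda x: edgeworth(x, m, sigma, g1, g2)
area = integral(p, m, sigma)
mu3 = integral(lambda x: (x - m) ** 3 * p(x), m, sigma)
# unit area, and mu3 = g1 * sigma^3 as designed
print(round(area, 6), round(mu3 / sigma ** 3, 6))   # → 1.0 0.5
```

The same orthogonality argument shows that the approximation also preserves the mean m and variance σ².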
2.2.2
The Laguerre Series
If the PDF p(x) has zero values for x < 0, then the Edgeworth series converges slowly to p(x). In this case a more convenient approximation, called the Laguerre series, can be used:

$$p(x) = \sum_{n=0}^{\infty} C_n\, x^{\alpha}\exp(-x)\, L_n^{(\alpha)}(x) \qquad(2.65)$$

Here $L_n^{(\alpha)}(z)$ are the generalized Laguerre polynomials [5]

$$L_n^{(\alpha)}(z) = \exp(z)\,\frac{z^{-\alpha}}{n!}\,\frac{d^n}{dz^n}\left[\exp(-z)\,z^{n+\alpha}\right], \qquad \alpha > -1 \qquad(2.66)$$
The first four of these polynomials are

$$L_0^{(\alpha)}(z) = 1$$
$$L_1^{(\alpha)}(z) = (\alpha + 1) - z$$
$$2!\,L_2^{(\alpha)}(z) = (\alpha+1)(\alpha+2) - 2z(\alpha+2) + z^2$$
$$3!\,L_3^{(\alpha)}(z) = (\alpha+1)(\alpha+2)(\alpha+3) - 3z(\alpha+2)(\alpha+3) + 3z^2(\alpha+3) - z^3 \qquad(2.67)$$

The Laguerre polynomials are orthogonal on the interval [0, ∞) with the weight p₀(z) = z^α exp(−z), i.e.

$$\int_0^{\infty} z^{\alpha}\exp(-z)\,L_n^{(\alpha)}(z)\,L_m^{(\alpha)}(z)\,dz = \frac{\Gamma(n+\alpha+1)}{n!}\,\delta_{mn} \qquad(2.68)$$

Here Γ(z) is the Gamma function [5]. Taking the orthogonality (2.68) into account, the coefficients of the expansion (2.65) can be found as

$$C_n = \frac{n!}{\Gamma(n+\alpha+1)}\int_0^{\infty} L_n^{(\alpha)}(x)\,p(x)\,dx \qquad(2.69)$$

If, instead of the original process ξ(t), one considers a normalized process η = ξ/β (with the coefficient β yet to be defined), then its PDF p_η(y) can be expressed in terms of the PDF p_ξ(x) as

$$p_\eta(y) = \beta\,p_\xi(\beta y) \qquad(2.70)$$
ORTHOGONAL EXPANSIONS OF PROBABILITY DENSITIES
In a similar manner to eq. (2.65) one can write an expansion of the new PDF in terms of the Laguerre polynomials

$$p_\eta(y) = \sum_{n=0}^{\infty} b_n\, y^{\alpha}\exp(-y)\, L_n^{(\alpha)}(y) \qquad(2.71)$$

Here, the coefficients of the expansion can be calculated as

$$b_n = \frac{n!}{\Gamma(n+\alpha+1)}\int_0^{\infty} L_n^{(\alpha)}(y)\,p_\eta(y)\,dy \qquad(2.72)$$
After substituting eq. (2.67) into eq. (2.72) and taking into account the normalization condition for the PDF and the definition of moments, one obtains

$$b_1 = \frac{1}{\Gamma(\alpha+2)}\left[(\alpha+1) - \frac{m}{\beta}\right], \qquad b_2 = \frac{1}{\Gamma(\alpha+3)}\left[(\alpha+1)(\alpha+2) - 2(\alpha+2)\frac{m}{\beta} + \frac{m_2}{\beta^2}\right] \qquad(2.73)$$

Since at this point α and β are arbitrary constants, it is possible to choose them in such a manner that both b₁ and b₂ vanish (i.e. the expansion will have fewer terms): b₁ = b₂ = 0. For this purpose the coefficients α and β have to be chosen as

$$\alpha = \frac{m^2}{\sigma^2} - 1 = \frac{m^2}{m_2 - m^2} - 1, \qquad \beta = \frac{\sigma^2}{m} = \frac{m_2 - m^2}{m} \qquad(2.74)$$

With this choice of α and β the first four coefficients in eq. (2.71) become

$$b_0 = \frac{1}{\Gamma(\alpha+1)}, \qquad b_1 = b_2 = 0, \qquad b_3 = \frac{1}{\Gamma(\alpha+4)}\left[(\alpha+1)(\alpha+2)(\alpha+3) - \frac{m_3}{\beta^3}\right] \qquad(2.75)$$

Higher order coefficients have more complicated expressions. The Laguerre series is often used when the first term provides a good approximation, i.e. when the function p(x) is close to

$$p_0(x) = \frac{1}{\beta\,\Gamma(\alpha+1)}\left(\frac{x}{\beta}\right)^{\alpha}\exp\left(-\frac{x}{\beta}\right) \qquad(2.76)$$

where α and β are defined through the mean and the variance of p(x) as in eq. (2.74). Thus, the first term in the Laguerre series is the Gamma distribution. Similar orthogonal expansions can be obtained for other weighting functions as long as they generate a complete system of basis functions. Not all probability densities have this property. A detailed discussion of this subject can be found in [2].
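The choice (2.74) can be checked directly: picking α and β from a given mean and variance makes the leading Gamma term (2.76) match both moments exactly. A minimal sketch (Python; names and parameter values are illustrative):

```python
import math

def gamma_first_term(m, var):
    """Parameters of the leading (Gamma) term (2.76), chosen per eq. (2.74)
    so that the expansion coefficients b1 and b2 vanish."""
    beta = var / m            # beta = sigma^2 / m
    alpha = m * m / var - 1   # alpha = m^2 / sigma^2 - 1
    return alpha, beta

def gamma_pdf(x, alpha, beta):
    """Gamma density (2.76)."""
    return (x / beta) ** alpha * math.exp(-x / beta) / (beta * math.gamma(alpha + 1))

alpha, beta = gamma_first_term(m=2.0, var=0.5)
# Mean and variance of the Gamma term reproduce the target moments:
mean = (alpha + 1) * beta
var = (alpha + 1) * beta ** 2
print(alpha, beta, mean, var)  # 7.0 0.25 2.0 0.5
```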
2.2.3
Gram–Charlier Series
There is another important method of representation of a PDF p_ξ(x) of some random variable ξ through a given PDF p_ζ(x) of some standardized random variable ζ. Without loss of generality one can assume that both random variables have zero mean and unit variance. This can always be achieved by considering the normalized and centred random variable

$$\xi_0 = \frac{\xi - m_1}{\sigma} \qquad(2.77)$$

The PDFs of ξ and ξ₀ are related by

$$p_\xi(x) = \frac{1}{\sigma}\,p_{\xi_0}\!\left(\frac{x - m_1}{\sigma}\right) \qquad(2.78)$$

Thus, if the series for p_{ξ₀} is known, then the series for p_ξ can be easily obtained by a change of the variable. In this section the following approximation is considered

$$p_\xi(x) = \sum_{k=0}^{\infty}\alpha_k\,\frac{d^k}{dx^k}p_\zeta(x) \qquad(2.79)$$

Such an expansion is especially convenient when the derivatives of the PDF p_ζ(x) can be expressed in the following form

$$\frac{d^k}{dx^k}p_\zeta(x) = Q_k(x)\,p_\zeta(x) \qquad(2.80)$$
where Q_k(x) are well studied functions, for example polynomials⁵ of order k. Taking the Fourier transform of both sides, one obtains the series relating the characteristic functions Θ_ξ(ju) and Θ_ζ(ju):

$$\Theta_\xi(ju) = \left[1 + \sum_{k=1}^{\infty}(-1)^k\alpha_k(ju)^k\right]\Theta_\zeta(ju) \qquad(2.81)$$

Using the representation of the characteristic functions through the moments of the corresponding distributions, one obtains the following relation between the unknown coefficients α_k of the expansion (2.79) and the moments of both distributions:

$$\sum_{n=0}^{\infty}\frac{(ju)^n}{n!}m_n^{\xi} = 1 - \frac{u^2}{2} + \sum_{n=3}^{\infty}\frac{(ju)^n}{n!}m_n^{\xi} = \left[1 + \sum_{k=1}^{\infty}(-1)^k\alpha_k(ju)^k\right]\left[1 - \frac{u^2}{2} + \sum_{l=3}^{\infty}\frac{(ju)^l}{l!}m_l^{\zeta}\right] \qquad(2.82)$$

⁵The existence of such densities is a consequence of the Rodrigues formula [5].
Finally, equating coefficients of equal powers of u, it is possible to obtain the following expressions for the coefficients α_k:

$$\alpha_1 = \alpha_2 = 0$$
$$\alpha_3 = \frac{1}{3!}\left(m_3^{\xi} - m_3^{\zeta}\right)$$
$$\alpha_4 = \frac{1}{4!}\left(m_4^{\xi} - m_4^{\zeta}\right)$$
$$\alpha_5 = \frac{1}{5!}\left[m_5^{\xi} - m_5^{\zeta} - 10\left(m_3^{\xi} - m_3^{\zeta}\right)\right]$$
$$\alpha_6 = \frac{1}{6!}\left[m_6^{\xi} - m_6^{\zeta} - 15\left(m_4^{\xi} - m_4^{\zeta}\right) - 20\,m_3^{\zeta}\left(m_3^{\xi} - m_3^{\zeta}\right)\right] \qquad(2.83)$$

It can be seen that the Gram–Charlier series coincides with the Edgeworth series if

$$p_\zeta(x) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)$$

Some properties of classical orthogonal polynomials are listed in Appendix B on the web; a more detailed discussion can be found in [5,11].
2.3
TRANSFORMATION OF RANDOM VARIABLES
Let a random variable ξ(t) be described by the PDF p_ξ(x), which is assumed to be known. Applying a non-linear memoryless transformation η = g(ξ) to this random variable, one obtains a new random variable η, whose PDF p_η(y) and moments m_n are to be found. Moments and central moments can easily be found using the definition of the statistical average

$$m_n = E\{\eta^n\} = E\{g^n(\xi)\} = \int_{-\infty}^{\infty} g^n(x)\,p_\xi(x)\,dx \qquad(2.84)$$

$$\mu_n = E\{(\eta - m_1)^n\} = E\{(g(\xi) - m_1)^n\} = \int_{-\infty}^{\infty}\left[g(x) - m_1\right]^n p_\xi(x)\,dx \qquad(2.85)$$

Thus, the calculation of moments of the random variable η can be converted to the calculation of the corresponding averages based on the known PDF p_ξ(x). No direct calculation of the PDF p_η(y) is required. The next step is to calculate the PDF p_η(y) itself. First, it is assumed that y = g(x) is a monotonic function, and thus y = g(x) is a one-to-one mapping. Furthermore, it should be noted that since η = g(ξ) is a deterministic transformation, if a value of the random variable ξ falls inside a small interval [x, x + dx], then the random variable η falls inside the interval [y, y + dy], where the boundaries of the interval are defined by the non-linear transformation and its derivative:

$$y = g(x), \qquad dy = \frac{d}{dx}g(x)\,dx \qquad(2.86)$$
24
RANDOM VARIABLES AND THEIR DESCRIPTION
This means that the probabilities of these two events are equal, and thus

$$p_\eta(y)\,dy = p_\xi(x)\,dx = p_\xi[h(y)]\left|\frac{d}{dy}h(y)\right|dy \qquad(2.87)$$

Here x = h(y) is the function inverse to y = g(x). For a monotonic y = g(x) the inverse function is unique, i.e. it has only one branch. If the function g(x) is increasing, then a positive increment of the argument dx > 0 results in a positive increment of the function dy > 0. However, if y = g(x) is a decreasing function, then the increment of the function is negative: dy < 0. The area under the curve between y and y + dy, reflecting the probability, is a positive quantity; thus the absolute value must be used in eq. (2.87).
Figure 2.2 Transformation of a random variable: (a) one-to-one transformation; (b) two-to-one transformation.
If the inverse function has more than one branch the situation is more involved. In this case the interval [y, y + dy] is an image of as many non-intersecting intervals as there are branches of the inverse function. For example, if there are two branches x₁ = h₁(y) and x₂ = h₂(y), as in Fig. 2.2(b), then η ∈ [y, y + dy] is satisfied when ξ ∈ [x₁, x₁ + dx₁] or ξ ∈ [x₂, x₂ + dx₂]. The equality of probabilities thus becomes

$$p_\eta(y)|dy| = p_\xi(x_1)|dx_1| + p_\xi(x_2)|dx_2| = p_\xi(h_1(y))\left|\frac{d}{dy}h_1(y)\right|dy + p_\xi(h_2(y))\left|\frac{d}{dy}h_2(y)\right|dy \qquad(2.88)$$

The generalization to the case of M branches x = h_m(y), m = 1, …, M, of the inverse function is straightforward

$$p_\eta(y) = \sum_{m=1}^{M} p_\xi(h_m(y))\left|\frac{d}{dy}h_m(y)\right| \qquad(2.89)$$

It is also worth mentioning that while the probability density of the transformed random variable p_η(y) can be uniquely defined if the transformation and the input PDF are known, it is not
possible to uniquely restore the input distribution by observing the output random variable if the transformation is not one-to-one. The following examples demonstrate applications of the rule of transformation to a number of problems often encountered in applications.

2.3.1
Transformation of a Given PDF into an Arbitrary PDF
Let ξ be a continuous random variable, uniformly distributed on an interval [a, b]. It is often desired to find a transformation y = g(x) which transforms ξ into another random variable η whose PDF p_η(y) is given. In this case, according to eq. (2.89),

$$p_\eta(y)\,dy = \frac{dx}{b - a} \qquad(2.90)$$

or

$$x = c + (b - a)\int_{-\infty}^{y} p_\eta(y)\,dy = c + (b - a)P_\eta(y) \qquad(2.91)$$

where c is a constant of integration. Since P_η(−∞) = 0 and P_η(∞) = 1, this constant must be chosen as c = a. In this case a ≤ x ≤ b and, consequently,

$$P_\eta(y) = \frac{x - a}{b - a} \qquad \text{and} \qquad g(x) = P_\eta^{-1}\left(\frac{x - a}{b - a}\right) \qquad(2.92)$$

Here P_η^{−1} is the function inverse to the CDF of the random variable η. On the other hand, if one chooses

$$y = g(x) = a + (b - a)P_\xi(x) \qquad(2.93)$$

then eq. (2.89) results in the transformation of an arbitrary distribution p_ξ(x) into a uniform distribution. Thus, in principle, any PDF can be transformed into another PDF. This fact is often used in numerical simulation.
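Eq. (2.92) is the basis of inverse-CDF (inverse transform) sampling used in numerical simulation. A hedged sketch (Python; function names are illustrative), with an exponential target whose inverse CDF is known in closed form:

```python
import math
import random

def sample_via_inverse_cdf(inv_cdf, a=0.0, b=1.0, rng=random):
    """Eq. (2.92): push a uniform variate on [a, b] through the inverse CDF."""
    x = rng.uniform(a, b)
    return inv_cdf((x - a) / (b - a))

# Target: exponential distribution with unit mean, P(y) = 1 - exp(-y),
# so the inverse CDF is P^{-1}(u) = -ln(1 - u).
inv_exp = lambda u: -math.log(1.0 - u)

# Deterministic check on a uniform grid of 'probability' values:
n = 100000
quantiles = [inv_exp((k + 0.5) / n) for k in range(n)]
print(round(sum(quantiles) / n, 2))  # ~1.0 (the mean of a unit exponential)
```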
PDF of a Harmonic Signal with Random Phase
The next important example is an ensemble of harmonic functions with constant amplitude A₀, angular frequency ω₀ and random phase φ

$$s(t;\varphi) = g(\varphi) = A_0\sin(\omega_0 t + \varphi) \qquad(2.94)$$

Assuming that the PDF p_φ(φ) is known, the distribution p_s(x) of the signal s(t) can be found for any given fixed moment of time t. However, the calculation is complicated by the fact that the inverse transformation φ = h(x) has infinitely many branches

$$\varphi_n = (-1)^n\arcsin\frac{x}{A_0} + \pi n - \omega_0 t, \qquad \left|\frac{d\varphi_n}{dx}\right| = \frac{1}{\sqrt{A_0^2 - x^2}} \qquad(2.95)$$

and thus the expression for the PDF p_s(x) is given by

$$p_s(x) = \begin{cases}\dfrac{1}{\sqrt{A_0^2 - x^2}}\displaystyle\sum_{n=-\infty}^{\infty} p_\varphi(\varphi_n), & |x| \le A_0 \\[2mm] 0, & |x| > A_0\end{cases} \qquad(2.96)$$
since the resulting signal is confined to the interval [−A₀, A₀]. In many communication applications it is reasonable to assume that the phase φ is uniformly distributed in the interval [−π, π]:

$$p_\varphi(\varphi) = \frac{1}{2\pi}, \qquad |\varphi| \le \pi \qquad(2.97)$$
In this case the inverse transformation is limited to only two branches, each having the distribution (2.97). Thus, the distribution p_s(x) of the signal s(t) is

$$p_s(x) = \frac{1}{\pi\sqrt{A_0^2 - x^2}}, \qquad |x| \le A_0 \qquad(2.98)$$
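The arcsine density (2.98) can be confirmed empirically by sweeping the phase over a dense uniform grid; a small sketch (Python, assuming A₀ = 1 and absorbing ω₀t into the phase; not part of the original text):

```python
import math

def ps(x, A0=1.0):
    """PDF (2.98) of s = A0*sin(w0*t + phi) with phi uniform on [-pi, pi]."""
    return 1.0 / (math.pi * math.sqrt(A0 * A0 - x * x))

# Fraction of a dense uniform phase grid for which s falls in [0, 0.1],
# compared against the integral of (2.98) over that interval.
n = 1_000_000
hits = sum(1 for k in range(n)
           if 0.0 <= math.sin(-math.pi + 2 * math.pi * (k + 0.5) / n) < 0.1)
empirical = hits / n
exact = math.asin(0.1) / math.pi  # integral of ps over [0, 0.1]
print(round(empirical, 4), round(exact, 4))  # 0.0319 0.0319
```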
2.4
RANDOM VECTORS AND THEIR DESCRIPTION
2.4.1
CDF, PDF and the Characteristic Function
A random vector ξ = [ξ₁, ξ₂, …, ξₙ]ᵀ is a set of random variables ξᵢ considered jointly. The number of components n is called the dimension of the random vector. Each component can be continuous, discrete or mixed. In order to describe a random vector one has to describe the range of values each component assumes and the probabilities of those values. However, it is not sufficient to provide separate characteristics for each component: all components should be described jointly. Similarly to the case of a scalar random variable considered above, it is possible to generalize the notions of the cumulative distribution function, probability density function, characteristic function and log-characteristic function to the vector case. The cumulative distribution function Pₙ(x) and the joint PDF, suitable for the description of any type of random vector, are defined as

$$P_n(x) = P_n(x_1, x_2, \dots, x_n) = \operatorname{Prob}\{\xi_1 < x_1,\ \xi_2 < x_2,\ \dots,\ \xi_n < x_n\} \qquad(2.99)$$

$$p_n(x) = p_n(x_1, x_2, \dots, x_n) = \frac{\partial^n P_n(x_1, x_2, \dots, x_n)}{\partial x_1\,\partial x_2\cdots\partial x_n} \qquad(2.100)$$

The CDF defined by eq. (2.99) possesses the following properties:

1. The CDF Pₙ(x) is equal to zero if at least one of its arguments is negative infinity, and it approaches one if all the arguments are equal to positive infinity:

$$P_n(x_1, \dots, -\infty, \dots, x_n) = 0 \qquad(2.101)$$
$$P_n(\infty, \dots, \infty, \dots, \infty) = 1 \qquad(2.102)$$
2. The CDF is a non-decreasing function of each of its arguments.

3. If the k-th component of the random vector ξ is excluded from consideration, the CDF P_{n−1} of the remaining vector can be obtained as follows:

$$\xi_{n-1} = [\xi_1, \dots, \xi_{k-1}, \xi_{k+1}, \dots, \xi_n]^T, \qquad P_n(x_1, \dots, \infty, \dots, x_n) = P_{n-1}(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) \qquad(2.103)$$

4. The CDF can be expressed as an n-fold repeated integral of the corresponding PDF

$$P_n(x) = \int_{-\infty}^{x_1}\cdots\int_{-\infty}^{x_n} p_n(x)\,dx_1\,dx_2\cdots dx_n \qquad(2.104)$$
The PDF pₙ(x) of a random vector ξ must satisfy the following conditions, which follow from the properties of the CDF and the relation (2.100):

1. The PDF is a non-negative function

$$p_n(x) \ge 0 \qquad(2.105)$$

2. The PDF is normalized to unity, i.e.

$$\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} p_n(x)\,dx_1\,dx_2\cdots dx_n = 1 \qquad(2.106)$$

3. The PDF is a symmetric function with respect to (a simultaneous permutation of) its arguments.

4. If integrated over one of the arguments over the real axis, the PDF of the random vector of dimension n produces the PDF of a random vector of dimension (n − 1), obtained by excluding the component over which the integration has been performed:

$$p_{n-1}(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) = \int_{-\infty}^{\infty} p_n(x)\,dx_k \qquad(2.107)$$
If ζ = ξ + jη is a vector of complex random variables, this vector can be described as a vector with 2n real components, allowing complex vectors to be handled within the framework developed above. The characteristic function Θₙ(ju) of a random vector can be defined as the n-dimensional Fourier transform of the corresponding PDF pₙ(x):

$$\Theta_n(ju) = \Theta_n(ju_1, ju_2, \dots, ju_n) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} p_n(x_1, \dots, x_n)\exp[j(u_1x_1 + \dots + u_nx_n)]\,dx_1\cdots dx_n \qquad(2.108)$$

The characteristic function is a symmetric, continuous function of its arguments, with |Θₙ(ju)| ≤ Θₙ(j0) = 1. Furthermore,

$$\Theta_m(ju_1, ju_2, \dots, ju_m) = \Theta_n(ju_1, ju_2, \dots, ju_m, 0, \dots, 0), \qquad m < n \qquad(2.109)$$
The last equation reflects the fact that the characteristic function of a sub-vector of size m can be obtained from the characteristic function of the full vector of size n > m by setting to zero the arguments at the coordinates not included in the sub-vector. If the components {ξ₁, ξ₂, …, ξₙ} are independent, then

$$p_n(x) = \prod_{k=1}^{n} p_{\xi_k}(x_k) \qquad \text{and} \qquad \Theta_n(ju) = \prod_{k=1}^{n}\Theta_k(ju_k) \qquad(2.110)$$

The converse statement is also true: if the characteristic function is expressed as a product of characteristic functions of the components, then the components of the random vector are independent.
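Relation (2.110) is easy to verify for discrete components, where the joint PMF of independent variables is the outer product of the marginals; a minimal sketch (Python; the distributions are illustrative, not from the original text):

```python
import cmath

# Two independent discrete random variables given as value -> probability tables.
px = {0: 0.25, 1: 0.75}
py = {-1: 0.5, 2: 0.5}

# Joint PMF of independent components is the product of the marginals (cf. 2.110).
joint = {(x, y): px[x] * py[y] for x in px for y in py}

def cf(pmf, u):
    """Characteristic function of a scalar discrete distribution."""
    return sum(p * cmath.exp(1j * u * x) for x, p in pmf.items())

def joint_cf(pmf2, u1, u2):
    """Two-dimensional characteristic function, as in eq. (2.108)."""
    return sum(p * cmath.exp(1j * (u1 * x + u2 * y)) for (x, y), p in pmf2.items())

u1, u2 = 0.7, -1.3
lhs = joint_cf(joint, u1, u2)
rhs = cf(px, u1) * cf(py, u2)
print(abs(lhs - rhs) < 1e-12)  # True
```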
2.4.2
Conditional PDF
Joint consideration of two or more random variables allows the introduction of a new concept not considered in the case of a scalar random variable: the conditional PDF (CDF or CF). Let ξ be a random variable (or vector) which is observed simultaneously with a certain condition B. Then the probability that ξ does not exceed x, given that the condition B is satisfied, is called the conditional CDF P_{ξ|B}(x|B) of the random variable ξ given the condition B:

$$P_{\xi|B}(x\mid B) = \operatorname{Prob}\{\xi < x \mid B\} = \frac{\operatorname{Prob}\{\xi < x,\ B\}}{\operatorname{Prob}(B)} \qquad(2.111)$$

The conditional PDF p_{ξ|B}(x|B) is then defined as the derivative of the conditional CDF with respect to x, and its Fourier transform represents the conditional characteristic function Θ_{ξ|B}(ju|B):

$$p_{\xi|B}(x\mid B) = \frac{\partial}{\partial x}P_{\xi|B}(x\mid B), \qquad \Theta_{\xi|B}(ju\mid B) = \int_{-\infty}^{\infty} p_{\xi|B}(x\mid B)\exp(jux)\,dx \qquad(2.112)$$

The conditional CDF, PDF and CF satisfy all the properties obeyed by the regular unconditional CDF, PDF and CF, as can be shown from the definitions (2.111) and (2.112). Upon generalization to the case of random vectors the definition (2.112) becomes

$$p(x_1, \dots, x_k \mid x_{k+1}, \dots, x_n) = \frac{p(x_1, \dots, x_k, x_{k+1}, \dots, x_n)}{p(x_{k+1}, \dots, x_n)} \qquad(2.113)$$

or

$$p(x_1, \dots, x_k, x_{k+1}, \dots, x_n) = p(x_1, \dots, x_k \mid x_{k+1}, \dots, x_n)\,p(x_{k+1}, \dots, x_n) \qquad(2.114)$$

known as the formula of multiplication of probabilities. Applying it in a chain manner, one obtains

$$p(x_1, \dots, x_n) = p(x_1\mid x_2, \dots, x_n)\,p(x_2, \dots, x_n) = p(x_1\mid x_2, \dots, x_n)\,p(x_2\mid x_3, \dots, x_n)\,p(x_3, \dots, x_n) = \dots = p(x_1\mid x_2, \dots, x_n)\,p(x_2\mid x_3, \dots, x_n)\cdots p(x_{n-1}\mid x_n)\,p(x_n) \qquad(2.115)$$
Two rules can be provided in order to eliminate variables from the expression for a conditional PDF.

1. To eliminate elements on the "left of the bar", i.e. to eliminate unconditional variables, one just has to integrate the conditional density over these variables:

$$\int_{-\infty}^{\infty} p(x_1, \dots, x_{j-1}, x_j, x_{j+1}, \dots, x_k \mid x_{k+1}, \dots, x_n)\,dx_j = p(x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_k \mid x_{k+1}, \dots, x_n) \qquad(2.116)$$

2. To eliminate a variable on the "right of the bar", i.e. to eliminate condition variables, one has to multiply the conditional density by the conditional PDF of the condition to be eliminated, conditioned on all remaining conditions, and integrate over the variable to be eliminated:

$$\int_{-\infty}^{\infty} p(x_1, \dots, x_k \mid x_{k+1}, \dots, x_{j-1}, x_j, x_{j+1}, \dots, x_n)\,p(x_j \mid x_{k+1}, \dots, x_{j-1}, x_{j+1}, \dots, x_n)\,dx_j = p(x_1, \dots, x_k \mid x_{k+1}, \dots, x_{j-1}, x_{j+1}, \dots, x_n) \qquad(2.117)$$

In particular, the following formula (playing a prominent role in the theory of Markov processes) can be obtained:

$$p(x_1\mid x_3) = \int_{-\infty}^{\infty} p(x_1\mid x_2, x_3)\,p(x_2\mid x_3)\,dx_2 \qquad(2.118)$$
All the considered definitions and rules remain valid for the case of a discrete random variable, with the integrals replaced by sums. Random variables ξ₁, ξ₂, …, ξₙ are called mutually independent if the events {ξ₁ < x₁}, {ξ₂ < x₂}, …, {ξₙ < xₙ} are independent for any values of xᵢ, i = 1, …, n. In this case their joint PDF (CDF, CF) is a product of the PDFs (CDFs, CFs) of the individual components, i.e.

$$p_\xi(x_1, \dots, x_n) = \prod_{i=1}^{n} p_{\xi_i}(x_i) \qquad(2.119)$$

and

$$\Theta_\xi(ju_1, \dots, ju_n) = \prod_{i=1}^{n}\Theta_{\xi_i}(ju_i) \qquad(2.120)$$

respectively. It is also interesting to note that it is possible to find a random vector whose components are pair-wise independent, i.e. p_{ξᵢξⱼ}(xᵢ, xⱼ) = p_{ξᵢ}(xᵢ)p_{ξⱼ}(xⱼ), but which, considered jointly, are not independent: p_ξ(x₁, x₂, …, xₙ) ≠ p_{ξ₁}(x₁)p_{ξ₂}(x₂)…p_{ξₙ}(xₙ). Complex random variables ζ₁ = ξ₁ + jη₁, …, ζₙ = ξₙ + jηₙ are mutually independent if

$$p(\xi_1, \dots, \xi_n;\ \eta_1, \dots, \eta_n) = \prod_{i=1}^{n} p(\xi_i, \eta_i) \qquad(2.121)$$

Note that independence of the real and imaginary parts of each variable is not required.
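Formula (2.118) can be verified numerically on a randomly generated discrete joint distribution; a sketch (Python, with a small three-valued grid chosen purely for illustration):

```python
import random

random.seed(1)
# A random joint PMF p(x1, x2, x3) on a small grid.
vals = [0, 1, 2]
w = {(a, b, c): random.random() for a in vals for b in vals for c in vals}
total = sum(w.values())
p = {k: v / total for k, v in w.items()}

def marg(keep):
    """Marginal PMF over the listed coordinate indices (cf. eq. 2.107)."""
    out = {}
    for k, v in p.items():
        kk = tuple(k[i] for i in keep)
        out[kk] = out.get(kk, 0.0) + v
    return out

p13, p23, p3 = marg([0, 2]), marg([1, 2]), marg([2])

# Eq. (2.118): p(x1|x3) = sum_x2 p(x1|x2,x3) p(x2|x3)
x1, x3 = 1, 2
lhs = p13[(x1, x3)] / p3[(x3,)]
rhs = sum((p[(x1, b, x3)] / p23[(b, x3)]) * (p23[(b, x3)] / p3[(x3,)]) for b in vals)
print(abs(lhs - rhs) < 1e-12)  # True
```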
2.4.3
Numerical Characteristics of a Random Vector
The numerical characteristics considered in Sections 2.1.1.5–2.1.1.8 can be extended to the case of random vectors. In this section it is assumed that a random vector ξ = [ξ₁, ξ₂, …, ξₙ]ᵀ is described by the joint PDF pₙ(x). In this case the n-dimensional moment m_{ν₁,ν₂,…,νₙ} of order ν = ν₁ + ν₂ + … + νₙ is defined as

$$m_{\nu_1,\nu_2,\dots,\nu_n} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} x_1^{\nu_1}x_2^{\nu_2}\cdots x_n^{\nu_n}\,p_n(x)\,dx_1\,dx_2\cdots dx_n \qquad(2.122)$$

Similarly, the central moments can be defined as

$$\mu_{\nu_1,\nu_2,\dots,\nu_n} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} (x_1 - m_{1\,1})^{\nu_1}(x_2 - m_{1\,2})^{\nu_2}\cdots(x_n - m_{1\,n})^{\nu_n}\,p_n(x)\,dx_1\,dx_2\cdots dx_n \qquad(2.123)$$

where m_{1 k} = m_{0,0,…,1,…,0} = E{ξ_k} is the average of the k-th component ξ_k of the random vector ξ. Moments can be obtained as coefficients of the power series expansion of the characteristic function, since

$$\Theta_n(ju_1, ju_2, \dots, ju_n) = E\left\{\exp\left(j\sum_{k=1}^{n}u_k\xi_k\right)\right\} = \sum_{\nu_1,\dots,\nu_n=0}^{\infty} \frac{m_{\nu_1,\nu_2,\dots,\nu_n}}{\nu_1!\,\nu_2!\cdots\nu_n!}(ju_1)^{\nu_1}(ju_2)^{\nu_2}\cdots(ju_n)^{\nu_n} \qquad(2.124)$$

where, of course,

$$m_{\nu_1,\nu_2,\dots,\nu_n} = (-j)^{\nu_1+\nu_2+\dots+\nu_n}\left.\frac{\partial^{\nu_1+\nu_2+\dots+\nu_n}}{\partial u_1^{\nu_1}\,\partial u_2^{\nu_2}\cdots\partial u_n^{\nu_n}}\,\Theta_n(u_1, u_2, \dots, u_n)\right|_{u_1=u_2=\dots=u_n=0} \qquad(2.125)$$

In a similar manner, n-dimensional cumulants can be defined as coefficients of the power series of the log-characteristic function:

$$\kappa_{\nu_1,\nu_2,\dots,\nu_n} = (-j)^{\nu_1+\nu_2+\dots+\nu_n}\left.\frac{\partial^{\nu_1+\nu_2+\dots+\nu_n}}{\partial u_1^{\nu_1}\,\partial u_2^{\nu_2}\cdots\partial u_n^{\nu_n}}\,\ln\Theta_n(u_1, u_2, \dots, u_n)\right|_{u_1=u_2=\dots=u_n=0} \qquad(2.126)$$

A relation between moments and cumulants of a random vector can be established in a way similar to the relations obtained for moments and cumulants of a scalar random variable. Finally, conditional densities can be used to define conditional moments and cumulants.
As in the case of a scalar random variable, lower order moments and cumulants play a prominent role in the description of a random vector. In particular, the averages of the components

$$m_{1\,k} = E\{\xi_k\} = m_{0,\dots,1,\dots,0}, \qquad \nu_k = 1,\ \nu_l = 0,\ k \ne l \qquad(2.127)$$

and their variances σ_k² and second moments m_{2 k}

$$\sigma_k^2 = D_k = E\{(\xi_k - m_{1\,k})^2\} = \mu_{0,\dots,2,\dots,0}, \qquad m_{2\,k} = E\{\xi_k^2\} = m_{0,\dots,2,\dots,0} \qquad(2.128)$$

describe the main features of the individual components ξ_k. Statistical dependence between any two components is described by mixed moments, especially by the correlation R_{kl} and the covariance C_{kl}, defined as the non-central and central mixed moments of the second order

$$m_{1,1}^{k,l} = R_{kl} = E\{\xi_k\xi_l\}, \qquad \mu_{1,1}^{k,l} = C_{kl} = E\{(\xi_k - m_{1\,k})(\xi_l - m_{1\,l})\} \qquad(2.129)$$

with all sub-indices in the definitions (2.122) and (2.123) equal to zero except the k-th and l-th ones. There is a simple relation between the correlation and the covariance (resulting from the relation between the central moments and moments)

$$R_{kl} = C_{kl} + m_{1\,k}\,m_{1\,l} \qquad(2.130)$$

Since the expectation of a non-negative random variable is non-negative, it is possible to write, for any real λ,

$$E\{[\lambda(\xi_k - m_{1\,k}) + (\xi_l - m_{1\,l})]^2\} = \lambda^2\sigma_k^2 + 2\lambda C_{kl} + \sigma_l^2 \ge 0 \qquad(2.131)$$

This can be satisfied for an arbitrary λ if and only if the discriminant of the quadratic form in eq. (2.131) is non-positive, i.e.

$$C_{kl}^2 - \sigma_k^2\sigma_l^2 \le 0 \qquad(2.132)$$

This allows one to introduce the so-called correlation coefficient r between two components of the random vector

$$r = \frac{C_{kl}}{\sigma_k\sigma_l}, \qquad |r| \le 1 \qquad(2.133)$$

Two random variables are called uncorrelated if r = 0. This does not imply independence of these variables. It follows from eq. (2.131) that the maximum magnitude of the correlation coefficient is achieved when the two random variables are linearly dependent, i.e. ξ_k = aξ_l + b.
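A standard illustration of the remark above: a symmetric variable and its square are uncorrelated (r = 0) yet completely dependent. A minimal sketch (Python; the three-point distribution is an illustrative choice):

```python
# A symmetric three-point variable xi and eta = xi^2 are uncorrelated but dependent.
pts = {(-1, 1): 0.25, (0, 0): 0.5, (1, 1): 0.25}   # (xi, eta) -> probability

E = lambda f: sum(p * f(x, y) for (x, y), p in pts.items())
m1, m2 = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: (x - m1) * (y - m2))
print(cov)  # 0.0 -> r = 0, yet eta is a deterministic function of xi
```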
The notion of correlation can be extended to the case of complex random variables ζ₁ and ζ₂ as

$$C_{\zeta_1\zeta_2} = E\{(\zeta_1 - m_{\zeta_1})(\zeta_2 - m_{\zeta_2})^*\} \qquad(2.134)$$

where the asterisk indicates complex conjugation.
2.5
GAUSSIAN RANDOM VECTORS
Since Gaussian random variables play a prominent role in statistical signal processing and communication theory, their most important properties are summarized in this section. A continuous random variable ξ is called a Gaussian random variable if its PDF is given by

$$p_\xi(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(x-m)^2}{2\sigma^2}\right] \qquad(2.135)$$

The corresponding characteristic function is then

$$\Theta(ju) = \exp\left(jum - \frac{\sigma^2u^2}{2}\right) \qquad(2.136)$$
It is possible to show by direct integration that for a Gaussian random variable ξ and an arbitrary deterministic function y = g(x) [9,12]

$$\frac{\partial^n}{\partial D^n}E\{g(\xi)\} = \frac{1}{2^n}E\left\{\frac{d^{2n}}{d\xi^{2n}}g(\xi)\right\} \qquad(2.137)$$

$$E\{\xi g(\xi)\} = m\,E\{g(\xi)\} + D\,E\left\{\frac{d}{d\xi}g(\xi)\right\} \qquad(2.138)$$

Here D = σ². Setting g(x) = xⁿ in eq. (2.137) allows easy computation of the moments of a Gaussian random variable. Two random variables ξ₁ and ξ₂ are called jointly Gaussian if their joint PDF has the following form

$$p_{2N}(x_1,x_2) = \frac{1}{2\pi\sqrt{D_1D_2-\mu_{11}^2}}\exp\left[-\frac{D_2(x_1-m_1)^2 - 2\mu_{11}(x_1-m_1)(x_2-m_2) + D_1(x_2-m_2)^2}{2(D_1D_2-\mu_{11}^2)}\right]$$
$$= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\exp\left[-\frac{\sigma_2^2(x_1-m_1)^2 - 2r\sigma_1\sigma_2(x_1-m_1)(x_2-m_2) + \sigma_1^2(x_2-m_2)^2}{2\sigma_1^2\sigma_2^2(1-r^2)}\right] \qquad(2.139)$$
The corresponding characteristic function is then

$$\Theta_{2N}(ju_1,ju_2) = \exp\left[j(m_1u_1 + m_2u_2) - \frac{1}{2}\left(D_1u_1^2 + 2\mu_{11}u_1u_2 + D_2u_2^2\right)\right]$$
$$= \exp\left[j(m_1u_1 + m_2u_2) - \frac{1}{2}\left(\sigma_1^2u_1^2 + 2r\sigma_1\sigma_2u_1u_2 + \sigma_2^2u_2^2\right)\right] \qquad(2.140)$$

This distribution has five parameters: the averages of each component m₁, m₂, the variances of each component σ₁², σ₂², and the correlation coefficient r between the components:

$$m_1 = E\{\xi_1\}, \quad m_2 = E\{\xi_2\}, \quad \sigma_1^2 = E\{(\xi_1-m_1)^2\}, \quad \sigma_2^2 = E\{(\xi_2-m_2)^2\} \qquad(2.141)$$

$$\mu_{11} = E\{(\xi_1-m_1)(\xi_2-m_2)\}, \qquad r = \frac{\mu_{11}}{\sigma_1\sigma_2} \qquad(2.142)$$
ð2:143Þ
Further properties of mutually Gaussian random variables are illustrated by examples of two Gaussian random variables. More detailed discussion can be found in [1,2,13]. 1. If 1 and 2 are uncorrelated, i.e. r ¼ 0, then according to eqs. (2.139) and (2.140) p2 N ðx1 ; x2 Þ ¼ p1 ðx1 Þp2 ðx2 Þ 2 N ð j u1 ; j u2 Þ ¼ 1 ð j u1 Þ2 ð j u2 Þ
ð2:144Þ
Thus, two uncorrelated Gaussian random variables are also independent. 2. Two correlated jointly Gaussian random variables can be transformed into a pair of independent zero mean Gaussian random variables by a linear transformation 1 ¼ ð1 m1 Þ cos þ ð2 m2 Þ sin 2 ¼ ð1 m1 Þ sin þ ð2 m2 Þ cos
ð2:145Þ
Here, the angle can be defined from the condition Ef1 ; 2 g ¼ 0 to be tan 2 ¼
2 r1 2 21 22
¼ =4
if
1 6¼ 2
if
1 ¼ 2
ð2:146Þ
3. Any linear transformation of jointly Gaussian random variables results in jointly Gaussian random variables. 4. If two random variables are jointly Gaussian, each variable by itself is also Gaussian. The inverse statement is not necessarily valid.
5. If two random variables ξ₁ and ξ₂ are jointly Gaussian, then both conditional densities p_{ξ₁|ξ₂}(x₁|x₂) and p_{ξ₂|ξ₁}(x₂|x₁) are Gaussian. For example

$$p_{\xi_2|\xi_1}(x_2\mid x_1) = \frac{p_{\xi_1,\xi_2}(x_1,x_2)}{p_{\xi_1}(x_1)} = \frac{1}{\sqrt{2\pi\sigma_2^2(1-r^2)}}\exp\left\{-\frac{\left[x_2 - m_2 - r\dfrac{\sigma_2}{\sigma_1}(x_1-m_1)\right]^2}{2\sigma_2^2(1-r^2)}\right\} \qquad(2.147)$$

It can be seen by inspection of eq. (2.147) that this PDF is a Gaussian PDF with the mean

$$m = m_2 + r\frac{\sigma_2}{\sigma_1}(x_1 - m_1) \qquad(2.148)$$

and the variance

$$D = \sigma_2^2(1 - r^2) \qquad(2.149)$$

Interestingly, the mean of the conditional PDF depends on the condition, while the variance does not: it depends only on the correlation coefficient.

6. The joint PDF (2.139) can be expanded into the Miller series [13]

$$p_2(x_1,x_2) = \frac{1}{2\pi\sigma_1\sigma_2}\exp\left[-\frac{(x_1-m_1)^2}{2\sigma_1^2} - \frac{(x_2-m_2)^2}{2\sigma_2^2}\right]\sum_{n=0}^{\infty}H_n\left(\frac{x_1-m_1}{\sigma_1}\right)H_n\left(\frac{x_2-m_2}{\sigma_2}\right)\frac{r^n}{n!} \qquad(2.150)$$

where H_n(x) are Hermite polynomials [5]. The advantage of this expansion is the fact that the variables x₁, x₂ and r are separated in each term.

Most of the considered properties can be generalized to the case of a jointly Gaussian vector of dimension n. If m_k and σ_k² = D_k stand for the mean value and the variance of the k-th component, and

$$R_{kl} = E\{(\xi_k - m_k)(\xi_l - m_l)\} = r_{kl}\sqrt{D_kD_l} = r_{kl}\,\sigma_k\sigma_l, \qquad R_{kk} = D_k, \quad R_{kl} = R_{lk} \qquad(2.151)$$

then the joint PDF and the corresponding characteristic function are given by

$$p_N(x) = \frac{1}{\sqrt{(2\pi)^n|R|}}\exp\left[-\frac{(x-m)^T R^{-1}(x-m)}{2}\right] \qquad(2.152)$$

$$\Theta_N(ju) = \exp\left(jm^Tu - \frac{u^TRu}{2}\right) \qquad(2.153)$$

Here
$$m = \begin{bmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{bmatrix}, \qquad R = \begin{bmatrix} R_{11} & R_{12} & \dots & R_{1n} \\ R_{21} & R_{22} & \dots & R_{2n} \\ \dots & \dots & \dots & \dots \\ R_{n1} & R_{n2} & \dots & R_{nn} \end{bmatrix} = \operatorname{diag}\{\sigma_i\}\begin{bmatrix} r_{11} & r_{12} & \dots & r_{1n} \\ r_{21} & r_{22} & \dots & r_{2n} \\ \dots & \dots & \dots & \dots \\ r_{n1} & r_{n2} & \dots & r_{nn} \end{bmatrix}\operatorname{diag}\{\sigma_i\} \qquad(2.154)$$
are the vector of averages and the covariance matrix, respectively. The vectors x and u are column vectors of dimension n, R⁻¹ represents the inverse matrix and T denotes transposition. Since r_{kl} = r_{lk}, the covariance matrix R is a symmetric matrix and contains n(n + 1)/2 independent elements. Altogether, n(n + 1)/2 + n parameters are needed to uniquely define a Gaussian distribution. The conditional probability density can easily be obtained as

$$p(x_1, x_2, \dots, x_k \mid x_{k+1}, \dots, x_n) = p(X_1\mid X_2) = \frac{1}{\sqrt{(2\pi)^k|R_{X_1|X_2}|}}\exp\left[-\frac{(x - m_{X_1|X_2})^T R_{X_1|X_2}^{-1}(x - m_{X_1|X_2})}{2}\right] \qquad(2.155)$$

where

$$R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix}, \qquad R_{X_1|X_2} = R_{11} - R_{12}R_{22}^{-1}R_{21}, \qquad m_{X_1|X_2} = m_1 + R_{12}R_{22}^{-1}(X_2 - m_2) \qquad(2.156)$$

and the mean vector and the covariance matrix are split according to the conditional and non-conditional variables.
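In the scalar-block case, eq. (2.156) reduces to the conditional mean (2.148) and variance (2.149) of the bivariate example; a hedged numerical sketch (Python; the parameter values are arbitrary, chosen for illustration):

```python
# Scalar-block case of eq. (2.156): conditioning one Gaussian component on the
# other reproduces the conditional mean (2.148) and variance (2.149).
m1, m2 = 1.0, -2.0
s1, s2, r = 2.0, 0.5, 0.6

R11, R22, R12 = s1 * s1, s2 * s2, r * s1 * s2   # covariance blocks
x2 = 0.3                                        # observed value of the condition

m_cond = m1 + R12 / R22 * (x2 - m2)             # m_{X1|X2} from (2.156)
D_cond = R11 - R12 / R22 * R12                  # R11 - R12 R22^{-1} R21

print(m_cond, D_cond)
# Same numbers from (2.148)-(2.149), with the roles of the variables swapped:
print(m1 + r * (s1 / s2) * (x2 - m2), s1 * s1 * (1 - r * r))
```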
2.6
TRANSFORMATION OF RANDOM VECTORS
In this section the transformation of random vectors is treated. To simplify the derivations, the case of a vector with two components is considered first and the results are then generalized to the case of an arbitrary number of components. Consider two random variables ξ₁ and ξ₂, described by the joint PDF p_ξ(x₁, x₂), which are transformed by two non-linear functions

$$\eta_1 = g_1(\xi_1, \xi_2), \qquad \eta_2 = g_2(\xi_1, \xi_2) \qquad(2.157)$$

or, in vector form, η = g(ξ), where η = [η₁, η₂]ᵀ and g = [g₁, g₂]ᵀ. It is required to find the joint PDF p_η(y₁, y₂) of the new vector η. It is easier to approach this problem by finding the joint CDF P_η(y₁, y₂) first. Let S be the area of the plane x₁x₂ such that for any point (x₁, x₂) ∈ S the following conditions are satisfied

$$g_1(x_1, x_2) < y_1 \quad \text{and} \quad g_2(x_1, x_2) < y_2 \qquad(2.158)$$

The events {η₁ < y₁, η₂ < y₂} and {(ξ₁, ξ₂) ∈ S} are then equivalent, since one implies the other. Thus, their probabilities are equal as well, i.e.

$$P_\eta(y_1, y_2) = \iint_S p_\xi(x_1, x_2)\,dx_1\,dx_2 \qquad(2.159)$$
In general, the area S can be a union of disjoint sets, and the calculation of the integral may require integration over several disjoint sets. If dS_y represents the area in the x₁x₂ plane such that

$$y_1 \le g_1(x_1, x_2) < y_1 + dy_1 \quad \text{and} \quad y_2 \le g_2(x_1, x_2) < y_2 + dy_2 \qquad(2.160)$$

then

$$P_\eta(y_1 + dy_1,\ y_2 + dy_2) - P_\eta(y_1, y_2) = p_\eta(y_1, y_2)\,dy_1\,dy_2 = \iint_{dS_y} p_\xi(x_1, x_2)\,dx_1\,dx_2 \qquad(2.161)$$

Let the inverse vector function x = h(y) have k branches x = h⁽ⁱ⁾(y), i = 1, …, k, and let x₁x₂ and y₁y₂ be two sets of Cartesian coordinates. The area of the x₁x₂ plane corresponding to a small rectangle dS_y = dy₁dy₂ is then given by

$$dS_x = \sum_{i=1}^{k}|J_i|\,dS_y = \sum_{i=1}^{k}\left|\det\begin{bmatrix}\dfrac{\partial h_1^{(i)}(y_1,y_2)}{\partial y_1} & \dfrac{\partial h_1^{(i)}(y_1,y_2)}{\partial y_2} \\[2mm] \dfrac{\partial h_2^{(i)}(y_1,y_2)}{\partial y_1} & \dfrac{\partial h_2^{(i)}(y_1,y_2)}{\partial y_2}\end{bmatrix}\right|dS_y \qquad(2.162)$$

where J_i = D(x₁, x₂)/D(y₁, y₂) is the Jacobian of the i-th branch of the inverse function x = h(y). Thus, it follows from eq. (2.161) that the transformation of the PDF is

$$p_\eta(y_1, y_2) = \sum_{i=1}^{k} p_\xi\!\left(h_1^{(i)}(y_1,y_2),\,h_2^{(i)}(y_1,y_2)\right)\left|\det\begin{bmatrix}\dfrac{\partial h_1^{(i)}}{\partial y_1} & \dfrac{\partial h_1^{(i)}}{\partial y_2} \\[2mm] \dfrac{\partial h_2^{(i)}}{\partial y_1} & \dfrac{\partial h_2^{(i)}}{\partial y_2}\end{bmatrix}\right| \qquad(2.163)$$

In general, if a random vector ξ = [ξ₁, ξ₂, …, ξₙ]ᵀ of length n is transformed into another vector η = [η₁, η₂, …, ηₙ]ᵀ by a non-linear transformation y = g(x) having k branches of the inverse transformation x = h⁽ⁱ⁾(y), then the probability density function of the transformed random vector η is given by

$$p_\eta(y) = \sum_{i=1}^{k} p_\xi\!\left(h^{(i)}(y)\right)\left|\frac{D(x^{(i)})}{D(y)}\right| \qquad(2.164)$$

where

$$\frac{D(x^{(i)})}{D(y)} = \det\begin{bmatrix}\dfrac{\partial h_1^{(i)}(y)}{\partial y_1} & \dots & \dfrac{\partial h_1^{(i)}(y)}{\partial y_n} \\ \dots & \dots & \dots \\ \dfrac{\partial h_n^{(i)}(y)}{\partial y_1} & \dots & \dfrac{\partial h_n^{(i)}(y)}{\partial y_n}\end{bmatrix} \qquad(2.165)$$

is the Jacobian of the i-th branch of the inverse transformation x = h⁽ⁱ⁾(y).
As in the case of a scalar random variable, it is possible to calculate moments of the transformed random vector without directly calculating the PDF p_η(y). Indeed, the definition of moments of an arbitrary order implies that

$$m^{\eta}_{\nu_1,\nu_2,\dots,\nu_n} = E\{\eta_1^{\nu_1}\eta_2^{\nu_2}\cdots\eta_n^{\nu_n}\} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} g_1^{\nu_1}(x)\,g_2^{\nu_2}(x)\cdots g_n^{\nu_n}(x)\,p_\xi(x)\,dx \qquad(2.166)$$

If the transformation maps a vector of dimension n into a vector of dimension m < n, the vector η can be augmented by n − m dummy variables, often by a sub-vector [ξ_{l₁}, …, ξ_{l_{n−m}}] of the vector ξ, to allow use of eq. (2.164). The resulting PDF can then be integrated over the dummy variables to obtain the PDF of the transformed vector. Evaluation of moments and cumulants of non-linear transformations using moment and cumulant brackets is considered in Section 2.8. The application of the technique described in this section is illustrated by a number of examples, which have their own importance in applications.
2.6.1
PDF of a Sum, Difference, Product and Ratio of Two Random Variables
Let the joint PDF p_ξ(x₁, x₂) of two random variables ξ₁ and ξ₂ be known. The PDF of the sum (difference) of these two variables, η = ξ₁ ± ξ₂, is to be determined. Introducing a dummy variable η₂ = ξ₂, one can formulate this problem in the standard form. The inverse transformation is then

$$\xi_1 = \eta \mp \eta_2, \quad \xi_2 = \eta_2, \qquad J = \det\begin{bmatrix}1 & \mp 1 \\ 0 & 1\end{bmatrix} = 1 \qquad(2.167)$$

Thus, according to eq. (2.164), the joint PDF p_{ηη₂}(y, y₂) is equal to

$$p_{\eta\eta_2}(y, y_2) = p_\xi(y \mp y_2,\ y_2) \qquad(2.168)$$

Integration over the dummy variable η₂ results in the PDF of the sum (difference) of two random variables

$$p_\eta(y) = \int_{-\infty}^{\infty} p_\xi(y \mp y_2,\ y_2)\,dy_2, \qquad \eta = \xi_1 \pm \xi_2 \qquad(2.169)$$

Equation (2.169) can be further simplified if ξ₁ and ξ₂ are independent. In this case p_ξ(x₁, x₂) = p_{ξ₁}(x₁)p_{ξ₂}(x₂) and

$$p_\eta(y) = \int_{-\infty}^{\infty} p_{\xi_1}(y \mp y_2)\,p_{\xi_2}(y_2)\,dy_2 \qquad(2.170)$$
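The convolution (2.170) can be evaluated numerically; for two independent uniform variables on [0, 1) it yields the triangular density on [0, 2]. A sketch (Python, with a simple Riemann-sum quadrature; the step size is an illustrative choice):

```python
# Numerical convolution (2.170): the PDF of the sum of two independent
# uniform [0, 1) variables is the triangular density on [0, 2].
step = 0.001
grid = [k * step for k in range(1000)]          # support of each uniform PDF

def p_uniform(x):
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def p_sum(y):
    """Riemann-sum approximation of the convolution integral (2.170)."""
    return step * sum(p_uniform(y - t) * p_uniform(t) for t in grid)

print(round(p_sum(0.5), 2), round(p_sum(1.0), 2), round(p_sum(1.5), 2))
# 0.5 1.0 0.5 -- the triangular shape
```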
In general, if there are n independent random variables ξ₁, …, ξₙ, the characteristic function of the weighted sum

$$\eta = a_1\xi_1 + a_2\xi_2 + \dots + a_n\xi_n = \sum_{k=1}^{n}a_k\xi_k \qquad(2.171)$$
of these variables can be easily found to be

$$
\Theta_\eta(ju) = \langle \exp(ju\,\eta)\rangle = \left\langle \exp\!\left(ju \sum_{k=1}^{n} a_k \xi_k\right)\right\rangle
= \prod_{k=1}^{n} \langle \exp(ju\,a_k\xi_k)\rangle = \prod_{k=1}^{n} \Theta_k(ju\,a_k)
\qquad (2.172)
$$

Similarly, the expression for the log-characteristic function becomes

$$
\Psi_\eta(ju) = \ln \Theta_\eta(ju) = \sum_{k=1}^{n} \ln \Theta_k(ju\,a_k) = \sum_{k=1}^{n} \Psi_k(ju\,a_k)
\qquad (2.173)
$$
The resulting distribution is called the composition of the distributions of each variable. It can be seen from eq. (2.173) that

$$
\Psi_\eta(ju) = \sum_{l=1}^{\infty}\frac{\kappa_{l,\eta}}{l!}(ju)^l
= \sum_{k=1}^{n}\sum_{l=1}^{\infty}\frac{\kappa_{l,k}}{l!}(ju\,a_k)^l
= \sum_{l=1}^{\infty}\frac{1}{l!}\left(\sum_{k=1}^{n} a_k^l\,\kappa_{l,k}\right)(ju)^l,
\qquad \kappa_{l,\eta} = \sum_{k=1}^{n} a_k^l\,\kappa_{l,k}
\qquad (2.174)
$$

Thus, the cumulant of an arbitrary order of the sum can be expressed as a weighted sum of the cumulants of the same order. In particular, eq. (2.174), written for the average $m_\eta$ and the variance $D_\eta = \sigma_\eta^2$, produces the following rule

$$
m_\eta = \sum_{k=1}^{n} a_k m_k, \qquad D_\eta = \sum_{k=1}^{n} a_k^2 D_k
\qquad (2.175)
$$
If all the variables have the same distribution and all the weights are equal, $a_i = a$, then eq. (2.172) reduces to

$$
\Theta_\eta(ju) = \Theta^n(ju\,a), \qquad
\Psi_\eta(ju) = n\,\Psi(ju\,a), \qquad
\kappa_{k,\eta} = n\,a^k\,\kappa_k
\qquad (2.176)
$$
Since for an arbitrary Gaussian random variable all the cumulants of order higher than two are equal to zero, eq. (2.174) shows that all the cumulants of order higher than two of a sum of Gaussian random variables are also equal to zero, i.e. the resulting random variable is also a Gaussian random variable. Finally, eq. (2.174) allows investigation of the limiting distribution of the average of identically distributed random variables

$$
\eta = \frac{\xi_1 + \xi_2 + \cdots + \xi_N}{N}
\qquad (2.177)
$$
Setting $n = N$ and $a = 1/N$ in eq. (2.176), one obtains

$$
\kappa_{k,\eta} = \frac{N\,\kappa_k}{N^k} = \frac{\kappa_k}{N^{k-1}}
\qquad (2.178)
$$

Thus, it can be seen that the transformation (2.177) preserves the mean value, since $\kappa_{1,\eta} = \kappa_1$. However, all other cumulants are reduced by a factor of $N^{k-1}$. For sufficiently large $N$,
all cumulants of order higher than two can be neglected compared to the first two cumulants, i.e.

$$
\Psi_\eta(ju) \approx ju\,m_1 - \frac{(\sigma/\sqrt{N})^2 u^2}{2} + O(N^{-2})
\qquad (2.179)
$$

Thus, the distribution of the average of a large number of identically distributed random variables differs little from a Gaussian distribution, with the mean value equal to the mean value of each component and the variance reduced by a factor of $N$.

If the random variables $\eta$ and $\zeta$ are the product and the ratio of two random variables $\xi_1$ and $\xi_2$ whose joint distribution is known, a similar approach allows one to obtain the PDF of the result. Indeed, introducing a dummy variable, one obtains for the product

$$
\eta = \xi_1\xi_2, \quad \eta_2 = \xi_2: \qquad
\xi_1 = \frac{\eta}{\eta_2}, \quad \xi_2 = \eta_2, \qquad
J = \frac{D(x_1, x_2)}{D(y_1, y_2)} = \begin{vmatrix} 1/y_2 & -y_1/y_2^2 \\ 0 & 1 \end{vmatrix} = \frac{1}{y_2}
$$

and for the ratio

$$
\zeta = \frac{\xi_1}{\xi_2}, \quad \zeta_2 = \xi_2: \qquad
\xi_1 = \zeta\,\zeta_2, \quad \xi_2 = \zeta_2, \qquad
J = \frac{D(x_1, x_2)}{D(y_1, y_2)} = \begin{vmatrix} y_2 & y_1 \\ 0 & 1 \end{vmatrix} = y_2
$$

so that

$$
p_\eta(y) = \int_{-\infty}^{\infty} p_\xi\!\left(\frac{y}{y_2},\, y_2\right)\frac{d y_2}{|y_2|}
\qquad (2.180)
$$

$$
p_\zeta(y) = \int_{-\infty}^{\infty} p_\xi(y\,y_2,\, y_2)\,|y_2|\,d y_2
\qquad (2.181)
$$
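The ratio formula (2.181) can be checked numerically for independent components. The sketch below is our own illustration, not from the text: the ratio of two independent zero-mean unit-variance Gaussians must follow the standard Cauchy density $1/[\pi(1+y^2)]$.

```python
import numpy as np

# Numerical check of eq. (2.181) for independent components: the ratio of two
# independent N(0,1) variables has PDF  p(y) = ∫ p1(y*t) p2(t) |t| dt,
# which should reproduce the standard Cauchy density 1/(pi*(1+y^2)).
gauss = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

y2 = np.linspace(-10, 10, 4001)
dy2 = y2[1] - y2[0]

def ratio_pdf(y):
    return np.sum(gauss(y * y2) * gauss(y2) * np.abs(y2)) * dy2

ys = np.linspace(-3, 3, 13)
cauchy = 1 / (np.pi * (1 + ys**2))
err = max(abs(ratio_pdf(y) - c) for y, c in zip(ys, cauchy))
print(err)   # small numerical integration error
```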
Equations (2.180) and (2.181) are simplified if $\xi_1$ and $\xi_2$ are independent. In this case

$$
p_\eta(y) = \int_{-\infty}^{\infty} p_{\xi_2}(y_2)\, p_{\xi_1}\!\left(\frac{y}{y_2}\right)\frac{d y_2}{|y_2|}, \qquad
p_\zeta(y) = \int_{-\infty}^{\infty} p_{\xi_1}(y\,y_2)\, p_{\xi_2}(y_2)\,|y_2|\,d y_2
\qquad (2.182)
$$

2.6.2 Probability Density of the Magnitude and the Phase of a Complex Random Vector with Jointly Gaussian Components
Let $\xi = \xi_R + j\,\xi_I$ be a complex random variable with jointly Gaussian real and imaginary parts. This distribution is described by five parameters: the mean values of the components, $m_R$ and $m_I$, their variances $\sigma_R^2$ and $\sigma_I^2$, and the correlation coefficient $r$. The magnitude $A$ and the phase $\phi$ of $\xi$ can be obtained using the following non-linear transformation

$$
A = \sqrt{\xi_R^2 + \xi_I^2}, \qquad \phi = \operatorname{atan}\frac{\xi_I}{\xi_R}, \qquad A \ge 0, \quad |\phi| \le \pi
\qquad (2.183)
$$

Such a problem arises in many cases when a modulated harmonic signal with random phase is studied. The first step is to transform $\xi_R$ and $\xi_I$ into new variables by a linear transformation (2.145)

$$
\xi_1 = \xi_R\cos\theta + \xi_I\sin\theta, \qquad
\xi_2 = -\xi_R\sin\theta + \xi_I\cos\theta
\qquad (2.184)
$$
Here, the angle $\theta$ is defined as

$$
\tan 2\theta = \frac{2\,r\,\sigma_R\sigma_I}{\sigma_R^2 - \sigma_I^2} \quad \text{if } \sigma_R \neq \sigma_I; \qquad
\theta = \frac{\pi}{4} \quad \text{if } \sigma_R = \sigma_I
\qquad (2.185)
$$

In this case the Gaussian components $\xi_1$ and $\xi_2$ are independent, with the joint distribution

$$
p_{\xi_1,\xi_2}(y_1, y_2) = \frac{1}{2\pi\sqrt{D_1 D_2}}\exp\!\left[-\frac{(y_1 - m_{11})^2}{2 D_1} - \frac{(y_2 - m_{12})^2}{2 D_2}\right]
\qquad (2.186)
$$

where the new means and variances are defined as

$$
m_{11} = m_R\cos\theta + m_I\sin\theta, \qquad
m_{12} = m_I\cos\theta - m_R\sin\theta
\qquad (2.187)
$$

$$
D_1 = D_R\cos^2\theta + D_I\sin^2\theta + r\sqrt{D_R D_I}\,\sin 2\theta, \qquad
D_2 = D_I\cos^2\theta + D_R\sin^2\theta - r\sqrt{D_R D_I}\,\sin 2\theta
\qquad (2.188)
$$
The next step is to introduce polar coordinates

$$
A = \sqrt{\xi_1^2 + \xi_2^2} = \sqrt{\xi_R^2 + \xi_I^2}, \qquad
\varphi = \phi - \theta = \operatorname{atan}\frac{\xi_2}{\xi_1}
\qquad (2.189)
$$

with the inverse transformation given by

$$
\xi_1 = A\cos\varphi, \quad \xi_2 = A\sin\varphi, \qquad
J = \begin{vmatrix} \cos\varphi & -A\sin\varphi \\ \sin\varphi & A\cos\varphi \end{vmatrix} = A
\qquad (2.190)
$$

Using eq. (2.164) together with eqs. (2.186), (2.189) and (2.190), one readily obtains the joint PDF $p_{A,\varphi}(A, \varphi)$ of the magnitude and the phase

$$
p_{A,\varphi}(A, \varphi) = \frac{A}{2\pi\sqrt{D_1 D_2}}\exp\!\left[-\frac{(A\cos\varphi - m_{11})^2}{2 D_1} - \frac{(A\sin\varphi - m_{12})^2}{2 D_2}\right]
\qquad (2.191)
$$
The PDF $p_A(A)$ of the magnitude and the PDF $p_\varphi(\varphi)$ of the phase can be obtained by integration of the joint density $p_{A,\varphi}(A, \varphi)$ over the phase $\varphi$ or the magnitude $A$, respectively. Depending on the relations between the parameters of the distribution $p_{A,\varphi}(A, \varphi)$, a number of different distributions can arise from eq. (2.191). A detailed investigation of the particular cases can be found in [12]. Here, only two particularly important cases are considered.
2.6.2.1 Zero Mean Uncorrelated Gaussian Components of Equal Variance

In this case $m_R = m_I = 0$, $r = 0$ and $D_R = D_I = D = \sigma^2$. Using eqs. (2.184) and (2.185) one obtains $m_{11} = m_{12} = 0$, $D_1 = D_2 = D$, $\theta = \pi/4$. This reduces eq. (2.191) to

$$
p_{A,\varphi}(A, \varphi) = \frac{A}{2\pi\sigma^2}\exp\!\left(-\frac{A^2}{2\sigma^2}\right)
= \frac{A}{\sigma^2}\exp\!\left(-\frac{A^2}{2\sigma^2}\right)\cdot\frac{1}{2\pi}
\qquad (2.192)
$$
Integration over the phase produces the Rayleigh distribution of the magnitude, and integration over the magnitude the uniform distribution of the phase

$$
p_A(A) = \frac{A}{\sigma^2}\exp\!\left(-\frac{A^2}{2\sigma^2}\right), \quad A \ge 0; \qquad
p_\varphi(\varphi) = \frac{1}{2\pi}, \quad |\varphi| \le \pi
\qquad (2.193)
$$

In addition, it can be seen from eqs. (2.192) and (2.193) that $p_{A,\varphi}(A, \varphi) = p_A(A)\,p_\varphi(\varphi)$, and thus the phase and the magnitude are independent.
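A quick Monte Carlo experiment confirms eq. (2.193). The sketch below is our own illustration (parameter values chosen arbitrarily): the magnitude of a zero-mean complex Gaussian with equal-variance, uncorrelated components follows the Rayleigh distribution, whose first two moments are known in closed form.

```python
import numpy as np

# Monte Carlo sanity check of eq. (2.193): zero-mean, equal-variance,
# uncorrelated Gaussian components give a Rayleigh-distributed magnitude
# and a uniformly distributed phase.
rng = np.random.default_rng(0)
sigma = 2.0
n = 200_000

xr = rng.normal(0.0, sigma, n)   # real part
xi = rng.normal(0.0, sigma, n)   # imaginary part
A = np.hypot(xr, xi)             # magnitude
phi = np.arctan2(xi, xr)         # phase, uniform on (-pi, pi]

# Rayleigh moments: E[A] = sigma*sqrt(pi/2), E[A^2] = 2*sigma^2
print(A.mean(), sigma * np.sqrt(np.pi / 2))
print((A**2).mean(), 2 * sigma**2)
print(phi.mean())                # ≈ 0 for a uniform phase
```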
2.6.2.2 Case of Uncorrelated Components with Equal Variances and Non-Zero Mean

In this case $D_1 = D_2 = D = \sigma^2$, $\theta = \pi/4$, $m = \sqrt{m_R^2 + m_I^2} = \sqrt{m_{11}^2 + m_{12}^2} > 0$, and the joint PDF (2.191) becomes

$$
p_{A,\varphi}(A, \varphi)
= \frac{A}{2\pi\sigma^2}\exp\!\left[-\frac{A^2 + m^2}{2\sigma^2} + \frac{A\,m}{\sigma^2}\left(\frac{m_{11}}{m}\cos\varphi + \frac{m_{12}}{m}\sin\varphi\right)\right]
= \frac{A}{2\pi\sigma^2}\exp\!\left[-\frac{A^2 + m^2}{2\sigma^2} + \frac{A\,m}{\sigma^2}\cos(\varphi - \varphi_0)\right]
\qquad (2.194)
$$

Here

$$
\tan\varphi_0 = \frac{m_{12}}{m_{11}}
\qquad (2.195)
$$
Further integration over the phase variable produces the Rice distribution for the magnitude

$$
p_A(A) = \int_{-\pi}^{\pi} p_{A,\varphi}(A, \varphi)\,d\varphi
= \frac{A}{2\pi\sigma^2}\exp\!\left(-\frac{A^2 + m^2}{2\sigma^2}\right)\int_{-\pi}^{\pi}\exp\!\left[\frac{A\,m}{\sigma^2}\cos(\varphi - \varphi_0)\right]d\varphi
= \frac{A}{\sigma^2}\exp\!\left(-\frac{A^2 + m^2}{2\sigma^2}\right) I_0\!\left(\frac{A\,m}{\sigma^2}\right)
\qquad (2.196)
$$
Similarly, integration of $p_{A,\varphi}(A, \varphi)$ over $A$ produces a PDF of the phase of the following form

$$
p_\varphi(\varphi) = \frac{1}{2\pi}\exp\!\left(-\frac{m^2}{2\sigma^2}\right)
+ \frac{m\cos(\varphi - \varphi_0)}{2\sqrt{2\pi}\,\sigma}\exp\!\left[-\frac{m^2\sin^2(\varphi - \varphi_0)}{2\sigma^2}\right]
\left[1 + \operatorname{erf}\!\left(\frac{m\cos(\varphi - \varphi_0)}{\sqrt{2}\,\sigma}\right)\right]
\qquad (2.197)
$$

Here $\operatorname{erf}(x)$ is the error function [5]:

$$
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp(-t^2)\,d t
\qquad (2.198)
$$
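The phase integration in eq. (2.196) can be verified numerically. The sketch below is our own check (parameters chosen arbitrarily): it integrates the joint density (2.194) over the phase on a fine grid and compares with the Rice density, using NumPy's built-in modified Bessel function $I_0$.

```python
import numpy as np

# Numerical check of eq. (2.196): integrating the joint PDF (2.194) over the
# phase must give the Rice density (A/sigma^2) exp(-(A^2+m^2)/(2 sigma^2)) I0(A m/sigma^2).
sigma, m, phi0 = 1.0, 2.0, 0.3

phi = np.linspace(-np.pi, np.pi, 20001)
dphi = phi[1] - phi[0]

def joint_pdf(A, p):
    return A / (2 * np.pi * sigma**2) * np.exp(-(A**2 + m**2) / (2 * sigma**2)
                                               + A * m * np.cos(p - phi0) / sigma**2)

def phase_integral(A):
    # periodic trapezoid rule (drop the duplicated endpoint at +pi)
    return np.sum(joint_pdf(A, phi)[:-1]) * dphi

def rice_pdf(A):
    return A / sigma**2 * np.exp(-(A**2 + m**2) / (2 * sigma**2)) * np.i0(A * m / sigma**2)

for A in (0.5, 1.0, 2.0, 3.5):
    print(A, phase_integral(A), rice_pdf(A))   # the two columns agree
```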
It can be seen that if $m = 0$, eq. (2.196) produces the Rayleigh distribution, since $I_0(0) = 1$, and eq. (2.197) describes the uniform distribution of the phase. Characteristically, the case of a large $m/\sigma$ produces approximately Gaussian distributions for both the magnitude and the phase, as can be observed by inspection of Fig. 2.3.

Figure 2.3 Dependence of the magnitude and the phase distribution on the $m/\sigma$ ratio.

2.6.3 PDF of the Maximum (Minimum) of Two Random Variables
Once again, let $p_{\xi_1,\xi_2}(x_1, x_2)$ be the joint PDF of two random variables $\xi_1$ and $\xi_2$. In some applications, such as combining techniques, it is required to find the distribution of the following random variables

$$
\eta = \max\{\xi_1, \xi_2\}, \qquad \zeta = \min\{\xi_1, \xi_2\}
\qquad (2.199)
$$
In this case, the method of non-linear transformation suggested above does not produce the desired result easily. However, this problem can be solved by finding the CDF of the result first. Indeed, based on the definition of the CDF, one can write

$$
P_\eta(y) = \operatorname{Prob}\{\eta = \max\{\xi_1, \xi_2\} < y\} = \operatorname{Prob}\{\xi_1 < y,\ \xi_2 < y\}
= \int_{-\infty}^{y}\int_{-\infty}^{y} p_{\xi_1,\xi_2}(x_1, x_2)\,d x_1\,d x_2
\qquad (2.200)
$$

Taking the derivative with respect to $y$ of both sides of eq. (2.200), the following expression for the PDF of $\eta$ can be obtained

$$
p_\eta(y) = \frac{d}{d y} P_\eta(y) = \frac{d}{d y}\int_{-\infty}^{y}\int_{-\infty}^{y} p_{\xi_1,\xi_2}(x_1, x_2)\,d x_1\,d x_2
\qquad (2.201)
$$

If $\xi_1$ and $\xi_2$ are independent, this expression can be further simplified to produce

$$
p_\eta(y) = p_{\xi_1}(y)\,P_{\xi_2}(y) + p_{\xi_2}(y)\,P_{\xi_1}(y)
\qquad (2.202)
$$
where $P_{\xi_i}(y)$ is the CDF of $\xi_i$, $i = 1, 2$. In the same manner one can obtain the PDF of the random variable $\zeta = \min\{\xi_1, \xi_2\}$. In this case

$$
P_\zeta(y) = \operatorname{Prob}\{\zeta = \min\{\xi_1, \xi_2\} < y\}
= \operatorname{Prob}\{\xi_1 < y,\ \xi_1 \le \xi_2\} + \operatorname{Prob}\{\xi_2 < y,\ \xi_2 \le \xi_1\}
= \int_{-\infty}^{y} d x_1\int_{x_1}^{\infty} p_{\xi_1,\xi_2}(x_1, x_2)\,d x_2
+ \int_{-\infty}^{y} d x_2\int_{x_2}^{\infty} p_{\xi_1,\xi_2}(x_1, x_2)\,d x_1
\qquad (2.203)
$$

Taking the derivative of the CDF (2.203), the desired PDF is obtained in the following form

$$
p_\zeta(y) = \frac{d}{d y}\left[\int_{-\infty}^{y} d x_1\int_{x_1}^{\infty} p_{\xi_1,\xi_2}(x_1, x_2)\,d x_2
+ \int_{-\infty}^{y} d x_2\int_{x_2}^{\infty} p_{\xi_1,\xi_2}(x_1, x_2)\,d x_1\right]
\qquad (2.204)
$$

Finally, for independent $\xi_1$ and $\xi_2$, eq. (2.204) reduces to

$$
p_\zeta(y) = p_{\xi_1}(y)\left[1 - P_{\xi_2}(y)\right] + \left[1 - P_{\xi_1}(y)\right] p_{\xi_2}(y)
\qquad (2.205)
$$
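Equations (2.202) and (2.205) are easy to verify empirically. The sketch below is our own illustration: it uses two independent unit-rate exponential variables (a choice of ours, not from the text) and compares the closed-form max/min densities with histogram estimates.

```python
import numpy as np

# Empirical check of eqs. (2.202) and (2.205): densities of the max and min
# of two independent unit-rate exponential random variables.
rng = np.random.default_rng(1)
n = 500_000
x1 = rng.exponential(1.0, n)
x2 = rng.exponential(1.0, n)

y = 1.0
p = lambda t: np.exp(-t)           # exponential PDF
P = lambda t: 1.0 - np.exp(-t)     # exponential CDF

p_max = p(y) * P(y) + p(y) * P(y)              # eq. (2.202)
p_min = p(y) * (1 - P(y)) + (1 - P(y)) * p(y)  # eq. (2.205)

h = 0.05   # half-width of a bin around y for an empirical density estimate
emp_max = np.mean(np.abs(np.maximum(x1, x2) - y) < h) / (2 * h)
emp_min = np.mean(np.abs(np.minimum(x1, x2) - y) < h) / (2 * h)
print(p_max, emp_max)    # ≈ 0.465
print(p_min, emp_min)    # ≈ 0.271
```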
Example: selection combining. Two antennas, separated by a relatively large distance, are used to receive a signal from the same source. In each antenna the signal-to-noise ratio (SNR) $\gamma_i$ is described by the so-called Gamma distribution

$$
p_i(\gamma_i) = \frac{m_i^{m_i}\,\gamma_i^{m_i - 1}}{\bar{\gamma}_i^{m_i}\,\Gamma(m_i)}\exp\!\left(-\frac{m_i\,\gamma_i}{\bar{\gamma}_i}\right), \qquad i = 1, 2
\qquad (2.206)
$$

where $m_i$ and $\bar{\gamma}_i$ are the parameters of the distribution (the severity of fading and the average SNR, respectively). The antenna with the highest SNR is chosen to connect to the receiver. It is possible to assume that the levels of the signals in the two antennas are independent, thus
eq. (2.202) can be used to obtain the distribution of the SNR at the input of the receiver. It produces

$$
p(\gamma) = \frac{m_1^{m_1}\,\gamma^{m_1 - 1}}{\bar{\gamma}_1^{m_1}\,\Gamma(m_1)}\exp\!\left(-\frac{m_1\gamma}{\bar{\gamma}_1}\right) P\!\left(m_2, \frac{m_2\gamma}{\bar{\gamma}_2}\right)
+ \frac{m_2^{m_2}\,\gamma^{m_2 - 1}}{\bar{\gamma}_2^{m_2}\,\Gamma(m_2)}\exp\!\left(-\frac{m_2\gamma}{\bar{\gamma}_2}\right) P\!\left(m_1, \frac{m_1\gamma}{\bar{\gamma}_1}\right)
\qquad (2.207)
$$

where

$$
P(m, x) = \frac{1}{\Gamma(m)}\int_0^x t^{m-1}\exp(-t)\,d t
\qquad (2.208)
$$

is one of the forms of the incomplete Gamma function [5]. For the case of independent Rayleigh fading ($m_1 = m_2 = 1$) with equal average SNR $\bar{\gamma}$ in both antennas, eq. (2.207) can be simplified to produce

$$
p(\gamma) = \frac{2}{\bar{\gamma}}\exp\!\left(-\frac{\gamma}{\bar{\gamma}}\right)\left[1 - \exp\!\left(-\frac{\gamma}{\bar{\gamma}}\right)\right]
\qquad (2.209)
$$
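The selection-combining result (2.209) can be checked by simulation. The sketch below is our own illustration (parameter values chosen arbitrarily): for Rayleigh fading the per-branch SNR is exponential, and the output SNR of two-branch selection combining has the density (2.209) and mean $1.5\,\bar\gamma$.

```python
import numpy as np

# Monte Carlo illustration of eq. (2.209): selection combining of two
# branches with independent Rayleigh fading (m=1) and equal mean SNR g0.
# The instantaneous SNR per branch is then exponential with mean g0.
rng = np.random.default_rng(2)
g0 = 5.0
n = 400_000

g1 = rng.exponential(g0, n)
g2 = rng.exponential(g0, n)
g_sc = np.maximum(g1, g2)            # selection combining picks the best branch

# Mean output SNR of two-branch SC over Rayleigh fading: g0*(1 + 1/2)
print(g_sc.mean(), 1.5 * g0)

# Check the density (2.209) at one point against a histogram estimate
y, h = 5.0, 0.1
pdf_theory = (2 / g0) * np.exp(-y / g0) * (1 - np.exp(-y / g0))
pdf_emp = np.mean(np.abs(g_sc - y) < h) / (2 * h)
print(pdf_theory, pdf_emp)
```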
2.6.4 PDF of the Maximum (Minimum) of n Independent Random Variables

The equations obtained in the previous section can be generalized to the case of $n$ independent random variables $\xi_1, \xi_2, \ldots, \xi_n$, each described by its PDF $p_{\xi_1}(x_1), \ldots, p_{\xi_n}(x_n)$. New variables are defined as

$$
\eta = \max\{\xi_1, \xi_2, \ldots, \xi_n\}, \qquad
\zeta = \min\{\xi_1, \xi_2, \ldots, \xi_n\}
\qquad (2.210)
$$

Following the same steps as in Section 2.6.3, one obtains

$$
p_\eta(y) = \frac{d}{d y} P_\eta(y) = \frac{d}{d y}\prod_{k=1}^{n} P_{\xi_k}(y)
= \prod_{k=1}^{n} P_{\xi_k}(y)\,\sum_{k=1}^{n}\frac{p_{\xi_k}(y)}{P_{\xi_k}(y)}
\qquad (2.211)
$$

Similarly, the PDF $p_\zeta(y)$ can be expressed as

$$
p_\zeta(y) = \frac{d}{d y} P_\zeta(y) = -\frac{d}{d y}\prod_{k=1}^{n}\left[1 - P_{\xi_k}(y)\right]
= \prod_{k=1}^{n}\left[1 - P_{\xi_k}(y)\right]\sum_{k=1}^{n}\frac{p_{\xi_k}(y)}{1 - P_{\xi_k}(y)}
\qquad (2.212)
$$

2.7 ADDITIONAL PROPERTIES OF CUMULANTS
It was shown earlier (see eq. (2.42)) that the characteristic function $\Theta(ju)$ of a process with finite moments $m_k$ can be expressed in terms of an infinite series in $ju$

$$
\Theta(ju) = 1 + \sum_{k=1}^{\infty}\frac{m_k}{k!}(ju)^k
\qquad (2.213)
$$
Here the moments $m_k$ are defined by eq. (2.28). Instead of the characteristic function, one can consider the log-characteristic function $\ln\Theta(ju)$ and its expansion in the Taylor series

$$
\ln\Theta(ju) = \sum_{k=1}^{\infty}\frac{\kappa_k}{k!}(ju)^k
\qquad (2.214)
$$

The coefficients $\kappa_k$ of this expansion are called the cumulants. Equations (2.213) and (2.214) allow relations between moments and cumulants to be obtained. Furthermore, it is possible to relate central moments $\mu_k$ and cumulants by noting that

$$
\Theta(ju) = e^{ju\,m_1}\left[1 + \sum_{k=1}^{\infty}\frac{\mu_k}{k!}(ju)^k\right]
\qquad (2.215)
$$
Moments and cumulants of random vectors can be introduced in a similar manner by considering the corresponding characteristic functions

$$
\Theta_n(\mathbf{u}) = \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\cdots\sum_{n_N=0}^{\infty}
\frac{m_{n_1,n_2,\ldots,n_N}}{n_1!\,n_2!\cdots n_N!}(ju_1)^{n_1}(ju_2)^{n_2}\cdots(ju_N)^{n_N}
\qquad (2.216)
$$

and

$$
\ln\Theta_n(\mathbf{u}) = \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\cdots\sum_{n_N=0}^{\infty}
\frac{\kappa_{n_1,n_2,\ldots,n_N}}{n_1!\,n_2!\cdots n_N!}(ju_1)^{n_1}(ju_2)^{n_2}\cdots(ju_N)^{n_N}
\qquad (2.217)
$$
Using the expansion of $\ln(1+z)$ into a series of powers of $z$, it is possible to find relations between cumulants of any order and moments, or central moments, of any order. It is worth noting that the elements of the vector $\mathbf{n}$ can repeat themselves. It is reasonable to ask why such a variety of descriptions is needed. One possible interpretation is the following: if a Gaussian random process is analyzed, then all the descriptions are equally convenient, since the first two moments (or cumulants) completely describe a Gaussian random variable; however, the description of a non-Gaussian random variable can often be accomplished more conveniently by using the cumulant description. The advantages of the cumulant description become obvious when orthogonal expansions of unknown densities, such as the Edgeworth and Gram–Charlier series, are used. On the other hand, it can be shown that, in contrast to moments, the relative value of the cumulants, $\kappa_n/\kappa_2^{n/2}$, decreases with the order, thus providing more reasonable approximations for an unknown density. Yet another important property of the cumulants is the fact that they are better suited to describing statistical dependence between elements of random vectors. This statement can be illustrated by considering the following example. Let $\boldsymbol\xi = [\xi_1, \xi_2]^T$ be a vector with two components, described by the joint probability density $p_\xi(x, y)$, such that $m_{1x} = m_{1y} = 0$. Three possible relations between the components are considered here: (a) the components $\xi_1$ and $\xi_2$ are completely independent; (b) the components $\xi_1$ and $\xi_2$ are linearly dependent, i.e. $\xi_2 = a\,\xi_1 + b$; (c) the components $\xi_1$ and $\xi_2$ are related through a non-linear transformation
$\xi_2 = a\,\xi_1^2$. It is also assumed that $\xi_1(t)$ is a symmetric random process, thus implying that all its odd moments are equal to zero

$$
m_{2n+1} = \langle \xi_1^{2n+1}\rangle = 0
\qquad (2.218)
$$
In the case (a) of independent random variables, the characteristic function of the vector is the product of the characteristic functions of each component, and thus

$$
\ln\Theta_2(ju_1, ju_2) = \ln\Theta_{1}(ju_1) + \ln\Theta_{2}(ju_2)
= \sum_{n=1}^{\infty}\frac{\kappa_n^{(1)}}{n!}(ju_1)^n + \sum_{n=1}^{\infty}\frac{\kappa_n^{(2)}}{n!}(ju_2)^n
\qquad (2.219)
$$

Equation (2.219) indicates that all the cross cumulants are equal to zero, since there are no terms containing $(ju_1)^{n_1}(ju_2)^{n_2}$ with both $n_1$ and $n_2$ greater than zero. In the case (b) the first mixed cumulant $\kappa_{11}$ attains its maximum possible value, since

$$
\kappa_{11} = m_{11} = \langle \xi_1\xi_2\rangle = a\,\sigma_1^2
\qquad (2.220)
$$

thus indicating the maximum possible correlation. However, the same cumulant is equal to zero in the case (c):

$$
\kappa_{11} = \langle \xi_1\cdot a\,\xi_1^2\rangle = a\,\langle \xi_1^3\rangle = 0
\qquad (2.221)
$$

i.e. the first mixed cumulant does not completely describe the dependence between two variables. Indeed, such dependence can be traced through the higher order cumulants, for example $\kappa_{21}$:

$$
\kappa_{21} = \langle \xi_1^2\cdot a\,\xi_1^2\rangle = a\,\langle \xi_1^4\rangle > 0
\qquad (2.222)
$$
The three cases above emphasize the following meaning of cumulants: the first mixed cumulant (or correlation function) measures the degree of linear dependence between two variables; higher order cumulants are needed to describe non-linear dependence between two random variables.⁶
2.7.1 Moment and Cumulant Brackets

So far the angle brackets have been used as an alternative notation for the averaging operator over all random variables under consideration, i.e.

$$
\langle g(x)\rangle = E\{g(\xi)\} = \int_{-\infty}^{\infty} g(x)\,p_\xi(x)\,d x, \qquad
\langle g(x, y)\rangle = E\{g(\xi, \eta)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\,p_{\xi\eta}(x, y)\,d x\,d y, \quad \text{etc.}
\qquad (2.223)
$$
⁶ See [14,18] for more detailed discussions.
The following property of the moment brackets can be obtained from eq. (2.223)

$$
\left\langle \sum_{n=1}^{N}\alpha_n f_n(x)\right\rangle = \sum_{n=1}^{N}\alpha_n\,\langle f_n(x)\rangle
\qquad (2.224)
$$

where $\alpha_n$ are deterministic constants. Up to now such notation has been used mainly to shorten equations. However, it can be extended to provide a convenient tool for the manipulation of cumulant and moment equations, considered below. For this purpose the following convention is used for mixed cumulants [9]

$$
\kappa_{n_1,n_2,\ldots,n_m}^{\xi_1,\xi_2,\ldots,\xi_m}
= \langle \underbrace{\xi_1, \ldots, \xi_1}_{n_1},\ \underbrace{\xi_2, \ldots, \xi_2}_{n_2},\ \ldots,\ \underbrace{\xi_m, \ldots, \xi_m}_{n_m}\rangle
= \langle \xi_1^{[n_1]}, \xi_2^{[n_2]}, \ldots, \xi_m^{[n_m]}\rangle
\qquad (2.225)
$$

Here, the random variable $\xi_1$ appears inside the brackets $n_1$ times, $\xi_2$ appears $n_2$ times, etc. In what follows, brackets which contain at least one comma are called cumulant brackets; otherwise the term moment brackets is used. For example

$$
\kappa_2 = \sigma_{\xi_1}^2 = \langle \xi_1, \xi_1\rangle; \qquad
\kappa_{1,2}^{\xi_1,\xi_2} = \langle \xi_1, \xi_2, \xi_2\rangle; \quad \text{etc.}
\qquad (2.226)
$$
The importance of this notation arises from the fact that it makes it possible to formalize the relationship between the moment and cumulant brackets in an easy way. In addition, it will be shown later in Section 2.8 that it is relatively easy to obtain systems of equations for cumulant functions of random processes, based on the dynamic equations describing the system under consideration. Moreover, the notation for random variables $\xi_1$, $\xi_2$, etc. and their transformations $f(\xi_1)$, $g(\xi_2)$, etc. can simply be substituted by their ordinal number equivalents, as shown below. For example, the following relations between cumulant and moment brackets can easily be formalized through the use of Stratonovich symmetrization brackets [9]

$$
\kappa_{1,1}^{\xi_1,\xi_2} = \langle 1, 2\rangle = \langle 1\,2\rangle - \langle 1\rangle\langle 2\rangle = m_{1,1} - m_1^{(1)}\,m_1^{(2)}
$$

$$
\kappa_{1,1,1}^{\xi_1,\xi_2,\xi_3} = \langle 1, 2, 3\rangle = \langle 1\,2\,3\rangle - 3\{\langle 1\rangle\langle 2\,3\rangle\}_s + 2\langle 1\rangle\langle 2\rangle\langle 3\rangle
= m_{1,1,1} - m_1^{(1)} m_{1,1}^{\xi_2,\xi_3} - m_1^{(2)} m_{1,1}^{\xi_1,\xi_3} - m_1^{(3)} m_{1,1}^{\xi_1,\xi_2} + 2\,m_1^{(1)} m_1^{(2)} m_1^{(3)}
\qquad (2.227)
$$

Symmetrization brackets, together with an integer in front of the brackets, represent the sum of all possible permutations of the arguments inside the brackets. For example

$$
3\{\langle 1\rangle\langle 2, 3\rangle\}_s = \langle 1\rangle\langle 2, 3\rangle + \langle 2\rangle\langle 1, 3\rangle + \langle 3\rangle\langle 1, 2\rangle
$$

$$
3\{\langle 1\,2\rangle\langle 3\,4\rangle\}_s = \langle 1\,2\rangle\langle 3\,4\rangle + \langle 1\,3\rangle\langle 2\,4\rangle + \langle 1\,4\rangle\langle 2\,3\rangle
\qquad (2.228)
$$

In a similar manner, moment brackets can be expressed in terms of the corresponding cumulants. For example

$$
m_N = \kappa_N + N\,\kappa_1\kappa_{N-1} + \cdots + \frac{N!}{2!\,(N-2)!}\left(\kappa_2\kappa_{N-2} + \kappa_1^2\kappa_{N-2}\right)
+ \frac{N!}{3!\,(N-3)!}\left(\kappa_3\kappa_{N-3} + 3\,\kappa_1\kappa_2\kappa_{N-3} + \kappa_1^3\kappa_{N-3}\right) + \cdots + \kappa_1^N
\qquad (2.229)
$$
If one sets $\kappa_1 = 0$ in eq. (2.229), the relation between central moments and cumulants can be derived. Similar relations can be obtained for the mixed cumulants of two variables. For example, one can show that

$$
\begin{aligned}
m_{11} &= \kappa_{11} + \kappa_{10}\kappa_{01}\\
m_{21} &= \kappa_{21} + \kappa_{20}\kappa_{01} + 2\,\kappa_{11}\kappa_{10} + \kappa_{10}^2\kappa_{01}\\
m_{31} &= \kappa_{31} + \kappa_{30}\kappa_{01} + 3\,\kappa_{21}\kappa_{10} + 3\,\kappa_{20}\kappa_{11} + 3\,\kappa_{20}\kappa_{10}\kappa_{01} + 3\,\kappa_{11}\kappa_{10}^2 + \kappa_{10}^3\kappa_{01}\\
m_{22} &= \kappa_{22} + \kappa_{20}\kappa_{02} + 2\,\kappa_{11}^2 + 2\,\kappa_{12}\kappa_{10} + 2\,\kappa_{21}\kappa_{01}
+ 4\,\kappa_{10}\kappa_{11}\kappa_{01} + \kappa_{10}^2\kappa_{02} + \kappa_{20}\kappa_{01}^2 + \kappa_{10}^2\kappa_{01}^2
\end{aligned}
\qquad (2.230)
$$
Equations (2.227), (2.229) and (2.230) are useful in the sense that they allow the transformation of a problem of analysis of the moments of a non-linear memoryless transformation of random variables into the problem of evaluating its cumulants. The latter is preferable since the cumulant series converges faster. It is important to note that the cumulant brackets, in contrast to moment brackets, do not represent averaging of the quantity inside the brackets. Indeed, it can be seen that cumulants are obtained by a non-linear transformation of the moments, and thus they cannot be expressed as a linear combination of averaged quantities. Since in any cumulant brackets there are at least two arguments separated by a comma, there is no confusion between the two types. Both types of brackets are, of course, closely related, since it is possible to express cumulants through moments and vice versa. It can also be shown that cumulant brackets are linear functions of their arguments, i.e. of the functions separated by two sequential commas.

2.7.2 Properties of Cumulant Brackets
In order to be useful, some formal rules for the manipulation of cumulant brackets should be established. As before, Greek letters are used as notation for random variables, while Latin letters are used for deterministic variables and constants. The following six properties can be easily obtained from the definition of cumulants.

1. $\langle \alpha, \beta, \ldots, \omega\rangle$ is a symmetric function of its arguments;
2. $\langle a\,\alpha, b\,\beta, \ldots, g\,\omega\rangle = a\,b\cdots g\,\langle \alpha, \beta, \ldots, \omega\rangle$;
3. $\langle \alpha, \beta, \ldots, \gamma_1 + \gamma_2, \ldots, \omega\rangle = \langle \alpha, \beta, \ldots, \gamma_1, \ldots, \omega\rangle + \langle \alpha, \beta, \ldots, \gamma_2, \ldots, \omega\rangle$;
4. $\langle \alpha, \beta, \ldots, \gamma, \ldots, \omega\rangle = 0$ if $\gamma$ is independent of $\{\alpha, \beta, \ldots\}$;
5. $\langle \alpha, \beta, \ldots, a, \ldots, \omega\rangle = 0$;
6. $\langle \alpha + a, \beta + b, \ldots, \gamma + c, \ldots, \omega + g\rangle = \langle \alpha, \beta, \ldots, \gamma, \ldots, \omega\rangle \qquad (2.231)$

The first three properties of cumulant brackets coincide with those of moment brackets; however, the last three properties are unique to cumulant brackets. The fourth property shows that if two groups of random processes are independent, all their mutual cumulants are equal to zero (this is not true for the corresponding moments). This is yet another indication that information about statistical dependence is encoded more evidently in cumulants than in moments. The sixth property indicates that the cumulants are invariant with respect to any deterministic shift of the random variables.
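Properties 2, 4 and 6 are easy to spot-check numerically on the simplest cumulant bracket, the covariance $\langle\alpha,\beta\rangle$. The sketch below is our own illustration using sample estimates.

```python
import numpy as np

# Spot-check of properties 2, 4 and 6 of (2.231), using the sample estimate
# of the second-order cumulant bracket <a, b> = E[ab] - E[a]E[b].
rng = np.random.default_rng(4)
x = rng.normal(1.0, 2.0, 300_000)
y = rng.normal(-1.0, 0.5, 300_000)   # generated independently of x

cum2 = lambda a, b: np.mean(a * b) - np.mean(a) * np.mean(b)

print(cum2(3 * x, 5 * x), 15 * cum2(x, x))   # property 2: scales as a*b
print(cum2(x, y))                            # property 4: ≈ 0 (independence)
print(cum2(x + 7, x - 4), cum2(x, x))        # property 6: shift invariance
```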
2.7.3 More on the Statistical Meaning of Cumulants
It has been shown earlier (see eqs. (2.220)–(2.222) and the related discussion) that the first mixed cumulant $\kappa_{1,1}^{\xi_1,\xi_2}$ is a measure of linear correlation between two random processes $\xi_1$ and $\xi_2$. Such dependence is also often called first order dependence [14–18]. The next step is to consider the cumulant $\kappa_{1,1,1}^{\xi_1,\xi_2,\xi_3} = \langle \xi_1, \xi_2, \xi_3\rangle$. This cumulant can be expressed through the moment of the same order and mixed cumulants and moments of lower order as

$$
\langle \xi_1, \xi_2, \xi_3\rangle = \langle \xi_1\xi_2\xi_3\rangle
- m_1^{(1)}\langle \xi_2, \xi_3\rangle - m_1^{(2)}\langle \xi_1, \xi_3\rangle - m_1^{(3)}\langle \xi_1, \xi_2\rangle
- m_1^{(1)} m_1^{(2)} m_1^{(3)}
\qquad (2.232)
$$
Two interesting particular cases of eq. (2.232) can be considered. First, let all the processes be pairwise uncorrelated, i.e. $\langle \xi_1, \xi_2\rangle = \langle \xi_1, \xi_3\rangle = \langle \xi_2, \xi_3\rangle = 0$, so that there is no linear dependence between these elements. If, in addition, the processes are independent, then $\kappa_{1,1,1}^{\xi_1,\xi_2,\xi_3} = 0$. Thus, the cumulant $\kappa_{1,1,1}^{\xi_1,\xi_2,\xi_3}$ is not equal to zero if there is dependence between the components of a more complicated nature than simple linear dependence. This (non-linear) dependence is isolated, according to eq. (2.232), by subtracting the linear dependence and the deterministic average components. Such dependence is present or absent regardless of whether linear dependence is present. Thus, in some sense, cumulants are independent coordinates which describe a set of dependent random variables. In this sense, a set of cumulants is preferable to an equivalent set of moments. In general, the so-called cumulant coefficients

$$
\gamma_{p_1,p_2,\ldots,p_N}^{\xi_1,\xi_2,\ldots,\xi_N}
= \frac{\kappa_{p_1,p_2,\ldots,p_N}^{\xi_1,\xi_2,\ldots,\xi_N}}{\sigma_1^{p_1}\sigma_2^{p_2}\cdots\sigma_N^{p_N}}
\qquad (2.233)
$$

of order $p = p_1 + p_2 + \cdots + p_N$ describe a non-linear statistical dependence of order $(N-1)$ between a set of $N$ variables. It is interesting that there is only a limited freedom in choosing the cumulants of a random variable or vector. Indeed, it is shown in [9] that the cumulant coefficients $\gamma_n$ of a random variable,

$$
\gamma_n = \frac{\kappa_n}{\kappa_2^{n/2}} = \frac{\kappa_n}{\sigma^n}
\qquad (2.234)
$$
must satisfy certain (non-linear) inequalities. For example, the skewness $\gamma_3$ and the kurtosis $\gamma_4$ must satisfy the condition [9]

$$
\gamma_4 - \gamma_3^2 + 2 \ge 0
\qquad (2.235)
$$

While there is no restriction placed on $\gamma_3$, i.e. $\gamma_3 \in (-\infty, \infty)$, $\gamma_4$ must exceed $-2$. Restrictions on higher order cumulants are still an area of active research.
2.8 CUMULANT EQUATIONS
It was shown earlier that the characteristic function $\Theta(ju)$ can be defined if an infinite set of cumulants $\kappa_k$, $k = 1, 2, \ldots$, is given. In other words, the characteristic function, and thus the PDF, is a function of the infinite cumulant vector $\boldsymbol\kappa = \{\kappa_1, \kappa_2, \ldots\}$:

$$
p(x) = p(x; \kappa_1, \kappa_2, \ldots) = p(x; \boldsymbol\kappa)
\qquad (2.236)
$$
It was also mentioned that the cumulants can be chosen from a certain region of values, and this choice can be made independently for each cumulant as long as they satisfy certain inequalities [9]. This fact allows differentiation of the PDF $p(x; \kappa_1, \kappa_2, \ldots)$ with respect to the cumulants, i.e. expressions of the type $\frac{\partial}{\partial\kappa_s} p(x; \boldsymbol\kappa)$ are justified, since this only requires an infinitesimally small variation of the cumulant $\kappa_s$ around its nominal value. Taking the derivative of both sides of the representation of the PDF through the characteristic function with respect to $\kappa_s$, one obtains

$$
\frac{\partial}{\partial \kappa_s} p(x; \boldsymbol\kappa)
= \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-jux}\,\frac{\partial}{\partial \kappa_s}\Theta(ju; \boldsymbol\kappa)\,d u
\qquad (2.237)
$$

Furthermore, using representation (2.19) it is possible to calculate the partial derivative inside the integral to be

$$
\frac{\partial}{\partial \kappa_s}\Theta(ju; \boldsymbol\kappa) = \frac{(ju)^s}{s!}\,\Theta(ju; \boldsymbol\kappa)
\qquad (2.238)
$$

and thus eq. (2.237) becomes

$$
\frac{\partial}{\partial \kappa_s} p(x; \boldsymbol\kappa)
= \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{(jv)^s}{s!}\,e^{-jvx}\,\Theta(jv; \boldsymbol\kappa)\,d v
\qquad (2.239)
$$

On the other hand, taking derivatives of both sides of eq. (2.19) $s$ times with respect to $x$, one obtains

$$
\frac{\partial^s}{\partial x^s} p(x; \boldsymbol\kappa)
= \frac{1}{2\pi}\int_{-\infty}^{\infty}(-jv)^s\,e^{-jvx}\,\Theta(jv; \boldsymbol\kappa)\,d v
\qquad (2.240)
$$

Comparing this with expression (2.239), one obtains a partial differential equation of the form

$$
\frac{\partial}{\partial \kappa_s} p(x; \boldsymbol\kappa) = \frac{(-1)^s}{s!}\,\frac{\partial^s}{\partial x^s} p(x; \boldsymbol\kappa)
\qquad (2.241)
$$
This relation between the derivatives of the PDF with respect to the argument and with respect to the parameters is very useful in the derivation of cumulant equations [9]. The next step is to derive a set of equations which describe the evolution of averages of a deterministic function $f(x)$. In this case the average, defined as

$$
\langle f(x)\rangle = \int_{-\infty}^{\infty} f(x)\,p(x; \boldsymbol\kappa)\,d x
\qquad (2.242)
$$

can also be considered as a function of the cumulants of the underlying PDF

$$
\langle f(x)\rangle \equiv \langle f(x)\rangle(\boldsymbol\kappa)
\qquad (2.243)
$$
Taking the derivative of both sides of eq. (2.242) with respect to $\kappa_s$, it is easy to obtain

$$
\frac{\partial}{\partial \kappa_s}\langle f(x)\rangle
= \int_{-\infty}^{\infty} f(x)\,\frac{\partial}{\partial \kappa_s} p(x; \boldsymbol\kappa)\,d x
= \frac{(-1)^s}{s!}\int_{-\infty}^{\infty} f(x)\,\frac{\partial^s}{\partial x^s} p(x; \boldsymbol\kappa)\,d x
\qquad (2.244)
$$

Here the relationship (2.241) has been used to substitute the derivative with respect to the cumulants by the derivative with respect to the argument $x$. Further simplification of eq. (2.244) can be achieved by integrating the right-hand side by parts $s$ times and taking into account that for most distributions

$$
\lim_{x\to\pm\infty} p(x; \boldsymbol\kappa) = 0, \qquad
\lim_{x\to\pm\infty}\frac{\partial^s}{\partial x^s} p(x; \boldsymbol\kappa) = 0
\qquad (2.245)
$$

As a result, a simple relation is obtained between the derivative of the average with respect to the $s$-th cumulant and the average of the $s$-th derivative of the function $f(x)$

$$
\frac{\partial}{\partial \kappa_s}\langle f(x)\rangle
= \frac{1}{s!}\int_{-\infty}^{\infty} p(x; \boldsymbol\kappa)\,\frac{d^s}{d x^s} f(x)\,d x
= \frac{1}{s!}\left\langle \frac{d^s}{d x^s} f(x)\right\rangle
\qquad (2.246)
$$

Taking the derivative of both sides of eq. (2.246) with respect to $\kappa_s$ once more, one obtains

$$
\frac{\partial^2}{\partial \kappa_s^2}\langle f(x)\rangle
= \frac{\partial}{\partial \kappa_s}\left[\frac{1}{s!}\left\langle \frac{d^s}{d x^s} f(x)\right\rangle\right]
= \frac{1}{(s!)^2}\left\langle \frac{d^{2s}}{d x^{2s}} f(x)\right\rangle
\qquad (2.247)
$$

Repeating this procedure $n$ times, an expression for the $n$-th derivative can be obtained,

$$
\frac{\partial^n}{\partial \kappa_s^n}\langle f(x)\rangle = \frac{1}{(s!)^n}\left\langle \frac{d^{ns}}{d x^{ns}} f(x)\right\rangle
\qquad (2.248)
$$

By the same token, mixed partial derivatives of the average of a function can be obtained [9] to be

$$
\frac{\partial^{(n+m)}}{\partial \kappa_k^n\,\partial \kappa_p^m}\langle f(x)\rangle
= \frac{1}{(k!)^n (p!)^m}\left\langle \frac{d^{(nk+mp)}}{d x^{nk+mp}} f(x)\right\rangle
\qquad (2.249)
$$

Equations (2.249) are called the cumulant equations [9]. Their usefulness becomes obvious from the examples below.

Example: relation between the fourth moment $m_4$ and cumulants. It follows from the definition of averages that $m_4 = \langle x^4\rangle$. Setting $f(x) = x^4$ in eq. (2.249), one can easily obtain

$$
\frac{\partial^2}{\partial \kappa_1\,\partial \kappa_3}\langle x^4\rangle
= \frac{1}{(1!)(3!)}\left\langle \frac{d^4}{d x^4} x^4\right\rangle = \frac{4!}{3!} = 4
$$

$$
\frac{1}{2!}\,\frac{\partial^3}{\partial \kappa_1^2\,\partial \kappa_2}\langle x^4\rangle
= \frac{1}{2!}\cdot\frac{1}{(1!)^2(2!)}\left\langle \frac{d^4}{d x^4} x^4\right\rangle = \frac{4!}{2!\,2!} = 6
$$

$$
\frac{1}{4!}\,\frac{\partial^4}{\partial \kappa_1^4}\langle x^4\rangle
= \frac{1}{4!}\cdot\frac{1}{(1!)^4}\left\langle \frac{d^4}{d x^4} x^4\right\rangle = \frac{4!}{4!} = 1
\qquad (2.250)
$$
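The coefficients 4, 6 and 1 computed in (2.250) appear in the expansion $m_4 = \kappa_4 + 4\kappa_1\kappa_3 + 3\kappa_2^2 + 6\kappa_1^2\kappa_2 + \kappa_1^4$. The sketch below is our own check of that expansion, using a Gamma variable whose cumulants and moments are both known in closed form.

```python
from math import factorial

# Check of the expansion m4 = k4 + 4 k1 k3 + 3 k2^2 + 6 k1^2 k2 + k1^4
# using a Gamma(shape, scale) variable with cumulants kappa_n = shape*(n-1)!*scale^n.
shape, scale = 2.5, 1.3
kappa = {n: shape * factorial(n - 1) * scale**n for n in (1, 2, 3, 4)}

m4_from_cumulants = (kappa[4] + 4 * kappa[1] * kappa[3] + 3 * kappa[2]**2
                     + 6 * kappa[1]**2 * kappa[2] + kappa[1]**4)

# Direct fourth moment of a Gamma variable:
# E[x^4] = scale^4 * shape*(shape+1)*(shape+2)*(shape+3)
m4_direct = scale**4 * shape * (shape + 1) * (shape + 2) * (shape + 3)
print(m4_from_cumulants, m4_direct)   # the two values coincide
```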
These quantities can easily be identified with the coefficients of the expansion of the fourth moment in terms of corresponding cumulants. In a similar way, expansion of an arbitrary moment can be obtained. Of course, the applications of eq. (2.249) are much wider than just calculation of moments [9].
2.8.1 Non-linear Transformation of a Random Variable: Cumulant Method

Equations (2.249) are particularly important when it is desired to obtain the relationship between the moments and cumulants of a transformed random variable $\eta = f(\xi)$, given that the cumulants of the input process are known. In order to obtain this dependence, eq. (2.249) must first be modified. For example, if the task is to find the variance $\sigma_\eta^2$ of the output process,

$$
\sigma_\eta^2 = \langle \eta, \eta\rangle = \langle f^2(\xi)\rangle - \langle f(\xi)\rangle^2
\qquad (2.251)
$$

it can be seen that this equation already contains an expression for the average $\langle f(\xi)\rangle$ of the output variable, which itself may depend on all the cumulants of $\xi$. In general, a cumulant of arbitrary order of the output can be written as the average of some function $\Phi(\xi; \langle f(\xi)\rangle)$ which also depends on all the cumulants. In order to take this fact into account, the expression for the $n$-th cumulant can be written as

$$
\langle \Phi(\xi; \langle f(\xi)\rangle)\rangle = \int_{-\infty}^{\infty} \Phi(x; \langle f(x)\rangle)\,p(x; \boldsymbol\kappa)\,d x
\qquad (2.252)
$$
Taking the derivative with respect to $\kappa_s$, one obtains an equation similar to eq. (2.249)

$$
\frac{\partial}{\partial \kappa_s}\langle \Phi(\xi; \langle f(\xi)\rangle)\rangle
= \int_{-\infty}^{\infty}\frac{\partial}{\partial \kappa_s}\Phi(x; \langle f(x)\rangle)\,p(x; \boldsymbol\kappa)\,d x
+ \int_{-\infty}^{\infty}\Phi(x; \langle f(x)\rangle)\,\frac{\partial}{\partial \kappa_s} p(x; \boldsymbol\kappa)\,d x
\qquad (2.253)
$$

Taking advantage of the fact that $\Phi(x; \langle f(x)\rangle)$ depends on the cumulants only through $\langle f(x)\rangle$, and using the relationship (2.241), the equation for the derivative of the output cumulant can be written as

$$
\frac{\partial}{\partial \kappa_s}\langle \Phi(\xi; \langle f(\xi)\rangle)\rangle
= \left\langle \frac{\partial}{\partial\langle f\rangle}\Phi(\xi; \langle f(\xi)\rangle)\right\rangle\frac{\partial}{\partial \kappa_s}\langle f(\xi)\rangle
+ \frac{1}{s!}\left\langle \frac{d^s}{d \xi^s}\Phi(\xi; \langle f(\xi)\rangle)\right\rangle
\qquad (2.254)
$$

In turn, this can be further advanced by using eq. (2.248) to obtain

$$
\frac{\partial}{\partial \kappa_s}\langle \Phi(\xi; \langle f(\xi)\rangle)\rangle
= \frac{1}{s!}\left\langle \frac{d^s}{d \xi^s}\left[\Phi(\xi; \langle f(\xi)\rangle)
+ \left\langle \frac{\partial \Phi}{\partial\langle f\rangle}\right\rangle f(\xi)\right]\right\rangle
\qquad (2.255)
$$
The latter equation is a generalization of eq. (2.246) and can be used to find the cumulants $\kappa_{n,\eta}$ of the output process in terms of the cumulants $\kappa_s$ of the input process. Using the relation between the cumulants and the moments, it is possible to obtain an expression for the differentials of the cumulants as

$$
d\kappa_1 = d m_1, \qquad
d\kappa_2 = d m_2 - 2\,m_1\,d m_1, \qquad \ldots
\qquad (2.256)
$$

This, coupled with eq. (2.246), produces a set of differential equations for the cumulants of the output process

$$
\frac{\partial \kappa_{1,\eta}}{\partial \kappa_s} = \frac{1}{s!}\left\langle \frac{d^s}{d \xi^s} f(\xi)\right\rangle, \qquad
\frac{\partial \kappa_{2,\eta}}{\partial \kappa_s} = \frac{1}{s!}\left\langle \frac{d^s}{d \xi^s}\left[f^2(\xi) - 2\langle f(\xi)\rangle f(\xi)\right]\right\rangle, \qquad \ldots
\qquad (2.257)
$$
1 ds ¼ @ s s! d s D s E d 2 ½ 2hi s d @ 2 ¼ s! @ s @ 1
ð2:258Þ
...: Some important relationships, often used in the following text, can be obtained using eqs. (2.258) and the rules of the simplification of cumulant brackets, described in the appendix. In particular, dependence between lower order cumulants is provided by
@ D @ d m1 1 d2 ¼ 2 ; ¼ ; @ m1 @ 2 d 2 d D2
2 @ D d d d d ; þ ; 2 þ ¼ d d d d @ D
@ m1 ¼ @ m1
d ; d
ð2:259Þ ð2:260Þ
Finally, the derived equations can be further simplified if the transformation of a Gaussian random variable is considered. In this case the input variable is completely described by its first two cumulants, $m_1 = \kappa_1$ and $D = \kappa_2$. Thus, the output cumulants and any average $\langle \varphi(\xi)\rangle$ depend only on these two quantities. As a result, eqs. (2.259) and (2.260) can be rewritten as

$$
\frac{\partial^n}{\partial m_1^n}\langle \varphi(\xi)\rangle = \langle \varphi^{(n)}(\xi)\rangle, \qquad
\frac{\partial^n}{\partial D^n}\langle \varphi(\xi)\rangle = \frac{1}{2^n}\langle \varphi^{(2n)}(\xi)\rangle
\qquad (2.261)
$$

If $D = 0$ then $\xi = m_1$, and the following initial condition can be obtained for eq. (2.261)

$$
\left.\frac{\partial^n}{\partial D^n}\langle \varphi(\xi)\rangle\right|_{D=0} = \frac{1}{2^n}\,\varphi^{(2n)}(m_1)
\qquad (2.262)
$$

Finally, using the moments $m_{n,\eta} = \langle f^n(\xi)\rangle$, one obtains

$$
\frac{\partial^s m_{n,\eta}}{\partial D^s} = \frac{1}{2^s}\left\langle \frac{d^{2s}}{d \xi^{2s}} f^n(\xi)\right\rangle, \qquad
\left.\frac{\partial^s m_{n,\eta}}{\partial D^s}\right|_{D=0} = \frac{1}{2^s}\,\frac{d^{2s}}{d m_1^{2s}} f^n(m_1)
\qquad (2.263)
$$

Solving the differential equations (2.263) with the initial conditions (2.262), it is possible to obtain the expression for the moments of the transformed variable in terms of the mean and variance of the input Gaussian process. It can be seen that such moments are functions of $D$ with coefficients depending only on the average $m_1$. Thus, an alternative way of calculating the output moments can be suggested. In this case the moments $m_{n,\eta}$ can be represented as a series in powers of $D$ with undetermined coefficients

$$
m_{n,\eta} = A_0 + A_1 D + A_2 D^2 + \cdots + A_s D^s + \cdots
\qquad (2.264)
$$

Taking derivatives of eq. (2.264), one obtains

$$
\left.\frac{\partial^n m_{n,\eta}}{\partial D^n}\right|_{D=0} = n!\,A_n
\qquad (2.265)
$$

Comparing eqs. (2.263) and (2.265), the following expansion is obtained

$$
m_{n,\eta} = \sum_{s=0}^{\infty}\frac{1}{s!}\,\frac{d^{2s}}{d m_1^{2s}}\left[f^n(m_1)\right]\left(\frac{D}{2}\right)^{s}
\qquad (2.266)
$$

The latter equation can be considered as an alternative tool for the analysis of non-linear transformations of Gaussian (and only Gaussian) random variables.
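Equation (2.266) is easy to exercise numerically. The sketch below is our own illustration with $f(x) = e^x$ and $n = 1$: every derivative of $e^{m_1}$ is $e^{m_1}$, so the series collapses to $e^{m_1 + D/2}$, the known mean of a lognormal variable, which a Monte Carlo run confirms.

```python
import numpy as np
from math import factorial

# Numerical illustration of eq. (2.266) for f(x) = exp(x), n = 1:
# the series gives exp(m1)*sum((D/2)^s/s!) = exp(m1 + D/2).
m1, D = 0.4, 0.9

series = sum(np.exp(m1) * (D / 2)**s / factorial(s) for s in range(30))
exact = np.exp(m1 + D / 2)

# Monte Carlo cross-check with a Gaussian input
rng = np.random.default_rng(3)
xi = rng.normal(m1, np.sqrt(D), 400_000)
print(series, exact, np.exp(xi).mean())
```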
APPENDIX: CUMULANT BRACKETS AND THEIR CALCULATIONS [9]

A.1 Evaluation of Averages

A.1.1 Averages of Delta Functions and Their Derivatives

$$
\langle \delta(\xi - x)\rangle = \int_{-\infty}^{\infty} p_\xi(s)\,\delta(s - x)\,d s = p_\xi(x)
\qquad (A.1)
$$

$$
\langle \delta(\xi - x)\,\delta(\eta - y)\rangle
= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p_{\xi,\eta}(s_1, s_2)\,\delta(s_1 - x)\,\delta(s_2 - y)\,d s_1\,d s_2
= p_{\xi,\eta}(x, y)
\qquad (A.2)
$$

$$
\left\langle \frac{d^s}{d \xi^s}\delta(\xi - x)\right\rangle
= \int_{-\infty}^{\infty}\frac{d^s}{d y^s}\delta(y - x)\,p_\xi(y)\,d y
= (-1)^s\,\frac{d^s}{d x^s} p_\xi(x)
\qquad (A.3)
$$
If $\xi$ and $\eta$ are jointly Gaussian and both have zero mean, $\langle \xi\rangle = \langle \eta\rangle = 0$, then the following equations are valid

$$
\left\langle \frac{d^{2s+1}}{d \xi^{2s+1}}\delta(\xi)\right\rangle = 0
\qquad (A.4)
$$

$$
\left\langle \frac{d^{2s}}{d \xi^{2s}}\delta(\xi)\right\rangle
= \frac{(-1)^s\,(2s-1)!!}{D^s\sqrt{2\pi D}}
= 2^s\,\frac{d^s}{d D^s}\frac{1}{\sqrt{2\pi D}}, \qquad
\langle \delta(\xi)\,\delta(\eta)\rangle = \frac{1}{2\pi\sqrt{D_\xi D_\eta - B^2}}
\qquad (A.5)
$$

A.1.2 Averages of Arbitrary Functions
For an arbitrary function f ðÞ of one variable and Fð; Þ of two variables the following equalities are valid hf ðÞi ¼ h f ðhiÞ þ
1 X ðx hiÞs
s!
s¼2 s
f ðsÞ ðhiÞi ¼ f ðhiÞ þ
3
1 X s s¼2 5
4
s!
f ðsÞ ðhiÞ
1 df 1d f 1d f 1d f
2 þ
3 þ ð 4 þ 3 22 Þ þ ð 5 þ 10 2 3 Þ þ f ðhiÞ þ 2! d 2 3! d 3 4! d 4 5! d 5
ðA:6Þ
In particular,

p_\xi(x) = \sum_{m=0}^{\infty} \frac{(-1)^m \mu_m}{m!}\,\delta^{(m)}(x - m_1)    (A.7)
\langle F(\xi, \eta) \rangle = F(\langle\xi\rangle, \langle\eta\rangle) + \frac{1}{2!}\left[ \frac{\partial^2 F}{\partial\xi^2}\,\kappa_{20} + 2\,\frac{\partial^2 F}{\partial\xi\,\partial\eta}\,\kappa_{11} + \frac{\partial^2 F}{\partial\eta^2}\,\kappa_{02} \right]

+ \frac{1}{3!}\left[ \frac{\partial^3 F}{\partial\xi^3}\,\kappa_{30} + 3\,\frac{\partial^3 F}{\partial\xi^2\,\partial\eta}\,\kappa_{21} + 3\,\frac{\partial^3 F}{\partial\xi\,\partial\eta^2}\,\kappa_{12} + \frac{\partial^3 F}{\partial\eta^3}\,\kappa_{03} \right]

+ \frac{1}{4!}\Big[ \frac{\partial^4 F}{\partial\xi^4}\,(\kappa_{40} + 3\kappa_{20}^2) + 4\,\frac{\partial^4 F}{\partial\xi^3\,\partial\eta}\,(\kappa_{31} + 3\kappa_{11}\kappa_{20}) + 6\,\frac{\partial^4 F}{\partial\xi^2\,\partial\eta^2}\,(\kappa_{22} + 2\kappa_{11}^2 + \kappa_{20}\kappa_{02})

+ 4\,\frac{\partial^4 F}{\partial\xi\,\partial\eta^3}\,(\kappa_{13} + 3\kappa_{11}\kappa_{02}) + \frac{\partial^4 F}{\partial\eta^4}\,(\kappa_{04} + 3\kappa_{02}^2) \Big] + \cdots    (A.8)

In the above equations, the derivatives are calculated at \xi = \langle\xi\rangle and \eta = \langle\eta\rangle. Both equalities are obtained by averaging the corresponding Taylor series.
A.2 Rules of Manipulation with Cumulant Brackets

A.2.1 Arbitrary Distribution at the Same Moment in Time

\langle \xi, \xi^2 \rangle = \kappa_3 + 2\kappa_1\kappa_2    (A.9)

\langle \xi, \xi, \xi^2 \rangle = \kappa_4 + 2\kappa_1\kappa_3 + 2\kappa_2^2    (A.10)

\langle \xi, \xi, \xi, \xi^2 \rangle = \kappa_5 + 2\kappa_1\kappa_4 + 6\kappa_2\kappa_3    (A.11)

\langle \xi, \xi, \xi, \xi, \xi^2 \rangle = \kappa_6 + 2\kappa_1\kappa_5 + 8\kappa_2\kappa_4 + 6\kappa_3^2    (A.12)

\langle [\xi]^s, \xi^2 \rangle = \kappa_{s+2} + \sum_{\nu=0}^{s} C_s^\nu\,\kappa_{\nu+1}\kappa_{s+1-\nu}    (A.13)
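Identity (A.9) can be verified directly for a concrete distribution; the left-hand side is the joint cumulant \langle\xi,\xi^2\rangle = \langle\xi^3\rangle - \langle\xi\rangle\langle\xi^2\rangle. The exponential distribution below is an illustrative choice.

```python
# Check of (A.9): <xi, xi^2> = kappa3 + 2*kappa1*kappa2.  The left-hand side is
# the joint cumulant E[xi^3] - E[xi]E[xi^2].  For an Exponential(1) variable the
# raw moments are E[xi^k] = k! (illustrative choice).
M1, M2, M3 = 1, 2, 6
k1 = M1
k2 = M2 - M1 ** 2                      # = 1
k3 = M3 - 3 * M1 * M2 + 2 * M1 ** 3    # = 2
lhs = M3 - M1 * M2                     # <xi, xi^2>
rhs = k3 + 2 * k1 * k2
```

Both sides evaluate to 4, as the bracket rule predicts.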
\langle \xi^2, \eta \rangle = \langle \xi, \xi, \eta \rangle + 2\langle\xi\rangle\langle \xi, \eta \rangle = \kappa_{21} + 2\kappa_{10}\kappa_{11}    (A.14)

\langle \xi^2, \eta, \eta \rangle = \langle \xi, \xi, \eta, \eta \rangle + 2\langle\xi\rangle\langle \xi, \eta, \eta \rangle + 2\langle \xi, \eta \rangle^2 = \kappa_{22} + 2\kappa_{10}\kappa_{12} + 2\kappa_{11}^2    (A.15)

\langle \xi^2, \eta, \eta, \eta \rangle = \langle \xi, \xi, \eta, \eta, \eta \rangle + 2\langle\xi\rangle\langle \xi, \eta, \eta, \eta \rangle + 6\langle \xi, \eta \rangle\langle \xi, \eta, \eta \rangle = \kappa_{23} + 2\kappa_{10}\kappa_{13} + 6\kappa_{11}\kappa_{12}    (A.16)

\langle \xi^2, [\eta]^s \rangle = \langle \xi, \xi, [\eta]^s \rangle + \sum_{\nu=0}^{s} C_s^\nu \langle \xi, [\eta]^\nu \rangle\langle \xi, [\eta]^{s-\nu} \rangle = \kappa_{2,s} + \sum_{\nu=0}^{s} C_s^\nu\,\kappa_{1,\nu}\kappa_{1,s-\nu}    (A.17)
\langle \xi, \xi^3 \rangle = \kappa_4 + 3\kappa_1\kappa_3 + 3\kappa_2^2 + 3\kappa_1^2\kappa_2    (A.18)

\langle \xi, \xi, \xi^3 \rangle = \kappa_5 + 3\kappa_1\kappa_4 + 9\kappa_2\kappa_3 + 3\kappa_1^2\kappa_3 + 6\kappa_1\kappa_2^2    (A.19)

\langle \xi, \xi, \xi, \xi^3 \rangle = \kappa_6 + 3\kappa_1\kappa_5 + 12\kappa_2\kappa_4 + 3\kappa_1^2\kappa_4 + 9\kappa_3^2 + 18\kappa_1\kappa_2\kappa_3 + 6\kappa_2^3    (A.20)

\langle \xi^3, \eta \rangle = \kappa_{3,1} + 3\kappa_{1,0}\kappa_{2,1} + 3\kappa_{2,0}\kappa_{1,1} + 3\kappa_{1,0}^2\kappa_{1,1}    (A.21)

\langle \xi^3, \eta, \eta \rangle = \kappa_{3,2} + 3\kappa_{1,0}\kappa_{2,2} + 3\kappa_{2,0}\kappa_{1,2} + 6\kappa_{1,1}\kappa_{2,1} + 3\kappa_{1,0}^2\kappa_{1,2} + 6\kappa_{1,0}\kappa_{1,1}^2    (A.22)

\langle \eta, [\xi]^k \rangle = \kappa_{1,k} + \sum_{s=1}^{k} C_k^s\,\kappa_s\kappa_{k-s}    (A.23)
\langle \xi, \eta f(\varphi) \rangle = \sum_{s=1}^{\infty} \frac{1}{s!}\,\frac{d^s f(\varphi)}{d\varphi^s} \left[ \kappa_{1,1,s} + \sum_{l=0}^{s} C_s^l\,\kappa_{1,0,l}\kappa_{0,1,s-l} \right]    (A.24)

\langle \xi\varphi, \eta f(\zeta) \rangle = \sum_{n=0}^{\infty} \frac{1}{n!}\,\frac{d^n f(\zeta)}{d\zeta^n} \left[ \kappa_{1,1,1,n} + \sum_{k=0}^{n} C_n^k \left\{ 3\,\kappa_{1,0,0,k}\kappa_{0,1,1,n-k} \right\} + \sum_{k=0}^{n}\sum_{l=0}^{n-k} C_n^k C_{n-k}^l\,\kappa_{1,0,0,k}\kappa_{0,1,0,l}\kappa_{0,0,1,n-k-l} \right]    (A.25)
If \kappa_1 = \langle\xi\rangle = 0, the following series can be obtained

\langle \eta, f(\xi), g(\xi) \rangle = \sum_{n=0}^{\infty} \frac{1}{n!} \left\{ \sum_{k=0}^{n} C_n^k \left\langle \frac{d^k}{d\xi^k} f(\xi), \frac{d^{n-k}}{d\xi^{n-k}} g(\xi) \right\rangle + \sum_{k=1}^{n-1} C_n^k \left\langle \frac{d^k}{d\xi^k} f(\xi) \right\rangle \left\langle \frac{d^{n-k}}{d\xi^{n-k}} g(\xi) \right\rangle \right\} \kappa_{1,n}    (A.26)
\langle \xi, \eta, f(\xi, \eta) \rangle = \sum_{\substack{k=0,\,l=0 \\ k+l>0}}^{\infty} \frac{1}{k!\,l!}\,\frac{\partial^{k+l} f(\xi, \eta)}{\partial\xi^k\,\partial\eta^l} \left[ \kappa_{1,1,k,l} + \sum_{\substack{i=0,\,j=0 \\ i+j \neq 0,\,k+l}}^{k,\,l} C_k^i C_l^j\,\kappa_{1,i,j}\kappa_{1,k-i,l-j} \right]    (A.27)

Here the following shortened notations are used:

\kappa_n = \kappa_n^{\xi}, \quad \kappa_{n,m} = \kappa_{n,m}^{\xi,\eta}, \quad \kappa_{m,n,k} = \kappa_{m,n,k}^{\xi,\eta,\varphi}, \quad \kappa_{n,m,k,l} = \kappa_{n,m,k,l}^{\xi,\eta,\varphi,\zeta}
A.2.2 Two Stationary Related Processes

The following notations are used: \xi = \xi(t), \xi_\tau = \xi(t+\tau), \eta = \eta(t), \eta_\tau = \eta(t+\tau).
\langle \xi_\tau\, \xi F(\xi) \rangle = \langle \xi F(\xi) \rangle\langle\xi\rangle + \left\langle F(\xi) + \xi\,\frac{dF(\xi)}{d\xi} \right\rangle \langle \xi, \xi_\tau \rangle + \frac{1}{2}\left\langle 2\,\frac{dF(\xi)}{d\xi} + \xi\,\frac{d^2F(\xi)}{d\xi^2} \right\rangle \langle \xi, \xi, \xi_\tau \rangle + \frac{1}{6}\left\langle 3\,\frac{d^2F(\xi)}{d\xi^2} + \xi\,\frac{d^3F(\xi)}{d\xi^3} \right\rangle \langle \xi, \xi, \xi, \xi_\tau \rangle + \cdots    (A.28)

\langle \eta_\tau\, \xi F(\xi) \rangle = \langle \xi F(\xi) \rangle\langle\eta\rangle + \left\langle F(\xi) + \xi\,\frac{dF(\xi)}{d\xi} \right\rangle \langle \xi, \eta_\tau \rangle + \frac{1}{2}\left\langle 2\,\frac{dF(\xi)}{d\xi} + \xi\,\frac{d^2F(\xi)}{d\xi^2} \right\rangle \langle \xi, \xi, \eta_\tau \rangle + \frac{1}{6}\left\langle 3\,\frac{d^2F(\xi)}{d\xi^2} + \xi\,\frac{d^3F(\xi)}{d\xi^3} \right\rangle \langle \xi, \xi, \xi, \eta_\tau \rangle + \cdots    (A.29)

\langle \xi_\tau\, \xi^2 F(\xi) \rangle = \langle \xi^2 F(\xi) \rangle\langle\xi\rangle + \left\langle 2\xi F(\xi) + \xi^2\,\frac{dF(\xi)}{d\xi} \right\rangle \langle \xi, \xi_\tau \rangle + \frac{1}{2}\left\langle 2F(\xi) + 4\xi\,\frac{dF(\xi)}{d\xi} + \xi^2\,\frac{d^2F(\xi)}{d\xi^2} \right\rangle \langle \xi, \xi, \xi_\tau \rangle + \frac{1}{6}\left\langle 6\,\frac{dF(\xi)}{d\xi} + 6\xi\,\frac{d^2F(\xi)}{d\xi^2} + \xi^2\,\frac{d^3F(\xi)}{d\xi^3} \right\rangle \langle \xi, \xi, \xi, \xi_\tau \rangle + \cdots    (A.30)

\langle \xi_\tau, \xi F(\xi) \rangle = \langle \xi, F(\xi) \rangle\langle\xi\rangle + \left[ \langle F(\xi) \rangle + \left\langle \xi, \frac{dF(\xi)}{d\xi} \right\rangle \right] \langle \xi, \xi_\tau \rangle + \frac{1}{2}\left[ 2\left\langle \frac{dF(\xi)}{d\xi} \right\rangle + \left\langle \xi, \frac{d^2F(\xi)}{d\xi^2} \right\rangle \right] \langle \xi, \xi, \xi_\tau \rangle + \frac{1}{6}\left[ 3\left\langle \frac{d^2F(\xi)}{d\xi^2} \right\rangle + \left\langle \xi, \frac{d^3F(\xi)}{d\xi^3} \right\rangle \right] \langle \xi, \xi, \xi, \xi_\tau \rangle + \cdots    (A.31)
REFERENCES

1. W. Feller, An Introduction to Probability Theory and Its Applications, 3rd Edition, New York: Wiley, 1968.
2. M. Kendall and A. Stuart, The Advanced Theory of Statistics, 4th Edition, New York: Macmillan, 1977–1983.
3. A. Papoulis, Probability, Random Variables, and Stochastic Processes, Boston: McGraw-Hill, 2001.
4. J. Ord, Families of Frequency Distributions, London: Griffin, 1972.
5. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, New York: Dover, 1972.
6. Iu. Linnik, Decomposition of Probability Distributions, New York: Dover, 1965.
7. F. Oberhettinger, Tables of Fourier Transforms and Fourier Transforms of Distributions, New York: Springer-Verlag, 1990.
8. I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series, and Products, New York: Academic Press, 1980.
9. A.N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations, Moscow: Sovetskoe Radio, 1978 (in Russian).
10. W. Rudin, Functional Analysis, New York: McGraw-Hill, 1991.
11. G. Szegö, Orthogonal Polynomials, Providence: American Mathematical Society, 1975.
12. D. Middleton, Topics in Communication Theory, New York: McGraw-Hill, 1965.
13. K. Miller, Multidimensional Gaussian Distributions, New York: Wiley, 1964.
14. O.V. Sarmanov, Maximum Correlation Coefficient (Symmetrical Case), DAN, Vol. 120, 1958, pp. 715–718.
15. C. Nikias and A. Petropulu, Higher-Order Spectra Analysis, New Jersey: Prentice Hall, 1993.
16. D.L. Wallace, Asymptotic Approximations to Distributions, Ann. Math. Stat., Vol. 29, 1958, pp. 635–654.
17. H. Bateman and A. Erdelyi, Higher Transcendental Functions, New York: McGraw-Hill, 1953–1987.
18. O.V. Sarmanov, Maximum Correlation Coefficient (Non-Symmetrical Case), DAN, Vol. 121, 1958, pp. 52–55.
3 Random Processes

3.1 GENERAL REMARKS
A random process \xi(u) describes a physical quantity which depends on a parameter u. A specific trace obtained as the result of a single observation (u_min < u < u_max) is called a realization of this random process, a trajectory or a sample path; in the following, these terms are used interchangeably. In communication theory, the parameter u usually represents time t or spatial coordinates x, y, z, or both, depending on the application. For now, the discussion is concentrated on the case of temporal random processes, i.e. u ≡ t. As an example of a random process, one can consider the variation of voltage (current) across electric circuit elements in the presence of noise. In this case, the notation \xi(t) is used to emphasize that the random process is a function of time. For every fixed moment of time t = t_i, the value of the random process \xi_i = \xi(t_i) is a random variable, since this value may differ from experiment to experiment, even if the experiments are conducted under the same conditions. The random value \xi_i is called a sample of the random process \xi(t) taken at the moment of time t = t_i.

Depending on the shape of the trajectories and the method of description, random processes can be divided into three classes: impulse processes, fluctuation processes and processes of a special form. An impulse process can be visualized as a sequence of single pulses, in general of different shapes, which follow each other in time with random intervals between consecutive pulses. Usually, impulse processes are described by piece-wise continuous functions of time. Man-made noise, so-called Electromagnetic Interference (EMI), and a number of natural phenomena, such as lightning, can be described as impulse random processes. A fluctuation type of random process is usually the result of a combination of a large number of contributing sources, often impulses, which overlap in time. The trajectory of such a process resembles a continuous function of time. Thermal and cosmic noise are prominent representatives of fluctuations.
The special type of random process usually involves a combination of impulse or fluctuation random processes mixed with a deterministic signal, or a signal of a known deterministic shape with random parameters. As an example, one can consider a signal of the following form

\xi(t) = A\cos(\omega t + \varphi) + n(t)    (3.1)

Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres © 2004 John Wiley & Sons, Ltd. ISBN: 0-470-84741-7
Here, n(t) can be a fluctuation or an impulse process, and A, \omega, \varphi can be deterministic or random variables. The next few sections are dedicated to general methods of description of random processes.
3.2 PROBABILITY DENSITY FUNCTION (PDF)
Let one imagine that there is a large number N of completely identical systems (Fig. 3.1), constituting an ensemble of systems. All the systems work under identical conditions. The output of each system is a sample path of the random process \xi(t) under consideration. If identical registering devices are attached to the systems, one can record each system state at any moment of time t. For a fixed moment of time t = t_1, the readings x_1(t_1), x_2(t_1), \dots, x_N(t_1) from different systems differ from each other, since the systems are assumed to be described by a random process. In general, the values x_i(t_1) cover all possible regions of variation of \xi(t) if N is sufficiently large. Let n_1 be the number of samples whose values belong to a small interval [x, x + \Delta x]. If N is large enough, the relative number of observations inside this interval is proportional to \Delta x and depends on t_1 as a parameter. In other words,

\lim_{N\to\infty} \frac{n_1}{N} = p_1(x; t_1)\,\Delta x    (3.2)
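The limit (3.2) suggests a direct way of estimating a marginal PDF from an ensemble of readings. The sketch below simulates the ensemble with standard Gaussian samples; N, the interval [x, x + \Delta x] and the seed are illustrative choices, not from the text.

```python
import math, random

# Monte Carlo version of the imaginary ensemble experiment behind eq. (3.2):
# each "system" contributes one reading at the fixed moment t = t1.
random.seed(1)
N = 200_000
readings = [random.gauss(0.0, 1.0) for _ in range(N)]   # ensemble at t = t1

x, dx = 0.5, 0.1                     # small interval [x, x + dx]
n1 = sum(x <= s < x + dx for s in readings)
p_est = n1 / (N * dx)                # n1/N ~ p1(x; t1) * dx
p_true = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
```

For these values the histogram estimate lands within a few percent of the true Gaussian density at x.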
The function p_1(x; t_1) is called the marginal probability density of the process \xi(t) considered at the moment of time t = t_1. The marginal probability density is a very important, but not a sufficient, description of a random process. It completely describes the statistical properties only at one fixed moment
Figure 3.1 Illustration of the imaginary experiment with a large number of identical sources.
of time. However, it does not describe at all how values of the process \xi(t) are related at two different time instants t = t_1 and t = t_2. It is possible to say that the marginal PDF gives a local view of the random process while ignoring its properties on a whole (even small) time interval. A more complete description can be achieved if a second order (two time instant) PDF p_2(x_1, x_2; t_1, t_2) is considered. It describes relations between any two values of the random process \xi(t) considered at any two separate moments of time t_1 and t_2. Construction of the second order PDF is similar to that of the marginal PDF. One can observe the state of each of the N devices considered above at two separate moments of time t_1 and t_2 (see Fig. 3.1) and let the 2N random variables

x_1(t_1), x_2(t_1), \dots, x_N(t_1), \quad x_1(t_2), x_2(t_2), \dots, x_N(t_2)    (3.3)

be the result of these observations. The next step is to count the observations which fall into a small interval [x_1, x_1 + \Delta x_1] at the moment of time t = t_1 and into a small interval [x_2, x_2 + \Delta x_2] at the moment of time t_2. Once again it is assumed that, if N is large enough, the relative number n_{12} of such events is proportional to the length of each interval and depends on both t_1 and t_2 as parameters. Mathematically speaking,

\lim_{N\to\infty} \frac{n_{12}}{N} = p_2(x_1, x_2; t_1, t_2)\,\Delta x_1\,\Delta x_2    (3.4)
The function p_2(x_1, x_2; t_1, t_2) is called the joint two-dimensional PDF of the process \xi(t). In general, the joint second order PDF p_2(x_1, x_2; t_1, t_2) also does not provide a complete description of the random process \xi(t): it describes only the relation between two time instants of the process. A more detailed description can be obtained by considering a series of probability density functions of an arbitrary order M, p_M(x_1, x_2, \dots, x_M; t_1, t_2, \dots, t_M), constructed as follows: for M fixed moments of time t_1, t_2, \dots, t_M, let x_i(t_j) be a sample of the random process \xi(t) at the output of the i-th system at the time instant t = t_j. For small intervals [x_i, x_i + \Delta x_i], let n be the number of outcomes such that at each t_i the observed quantity fits in the interval [x_i, x_i + \Delta x_i] for every i = 1, \dots, M. In this case

\lim_{N\to\infty} \frac{n}{N} = p_M(x_1, x_2, \dots, x_M; t_1, t_2, \dots, t_M)\,\Delta x_1\,\Delta x_2 \cdots \Delta x_M    (3.5)

A PDF of order M allows conclusions to be drawn about the properties of a random process \xi(t) at M separate moments of time. A higher order PDF provides additional information about the process with respect to the lower order PDFs. An M-dimensional PDF must satisfy the following conditions.

1. Positivity:

p_M(x_1, x_2, \dots, x_M; t_1, t_2, \dots, t_M) \ge 0 \quad \text{for any } x_i \text{ and } t_j    (3.6)

This follows from the definition (3.5), since n \ge 0.
2. Normalization:

\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} p_M(x_1, x_2, \dots, x_M; t_1, t_2, \dots, t_M)\,dx_1\,dx_2 \cdots dx_M = 1    (3.7)
3. Symmetry: p_M does not change under any permutation of its arguments x_1, x_2, \dots, x_M.

4. Matching condition: for any m < M

p_m(x_1, \dots, x_m; t_1, \dots, t_m) = \int \cdots \int_{R^{M-m}} p_M(x_1, \dots, x_M; t_1, \dots, t_M)\,dx_{m+1} \cdots dx_M    (3.8)

This equation shows that if the M-dimensional PDF is known, a PDF of any smaller order m < M can be found by integrating the former over the "additional" variables, as in eq. (3.8). Since t is continuous, there is no finite order M which allows for a complete description of the process (except in a few particular cases, such as Gaussian or Markov processes). However, it can be shown that an infinite but countable sequence of PDFs describes a random process completely [1,2]. If the value of the process under consideration is known and fixed at a moment of time t = t_2, the conditional PDF p_1(x_1, t_1 | x_2, t_2) can be defined as follows

p_1(x_1, t_1 | x_2, t_2) = \frac{p_2(x_1, x_2; t_1, t_2)}{p_1(x_2, t_2)}    (3.9)
Here, of course,

p_1(x_2, t_2) = \int_{-\infty}^{\infty} p_2(x_1, x_2; t_1, t_2)\,dx_1    (3.10)
The probability density p_1(x_1, t_1 | x_2, t_2) is known as the conditional PDF for a given x_2. It can be shown that p_1(x_1, t_1 | x_2, t_2) is a proper PDF. Indeed, since p_2 \ge 0 and p_1 \ge 0, their ratio is also a non-negative function, thus satisfying the positivity condition. The normalization condition is also satisfied, since

\int_{-\infty}^{\infty} p_1(x_1, t_1 | x_2, t_2)\,dx_1 = \frac{1}{p_1(x_2, t_2)} \int_{-\infty}^{\infty} p_2(x_1, x_2; t_1, t_2)\,dx_1 = \frac{p_1(x_2, t_2)}{p_1(x_2, t_2)} = 1    (3.11)
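The normalization (3.11) can be confirmed numerically for a concrete second order PDF. The sketch below uses a standard bivariate Gaussian with an assumed correlation coefficient; the conditioning value and the integration grid are illustrative.

```python
import math

# Numerical check of (3.11) for a standard bivariate Gaussian second order PDF;
# the correlation rho and the conditioning value x2 are illustrative choices.
rho, x2 = 0.6, 0.8

def p2(x1, y2):
    # zero-mean, unit-variance bivariate Gaussian density with correlation rho
    q = (x1 * x1 - 2 * rho * x1 * y2 + y2 * y2) / (1 - rho * rho)
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(1 - rho * rho))

def p1(x):
    # standard Gaussian marginal
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# conditional PDF (3.9) integrated over x1 on a fine grid over [-8, 8]
dx = 0.001
total = sum(p2(-8 + i * dx, x2) / p1(x2) * dx for i in range(16001))
```

The grid sum reproduces unit area to within the discretization error, as eq. (3.11) requires.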
It is important to realize that the conditional PDF contains more (or at least as much) information about the process ðtÞ at the time instant t ¼ t1 than the marginal PDF p1 ðx1 ; t1 Þ.
Indeed, the value of the marginal PDF can be restored from the conditional PDF as follows. If t_2 \to \infty, then the knowledge of x_2 = \xi(t_2) does not produce any new information about \xi(t_1), since the samples become independent, i.e.

\lim_{t_2 \to \infty} p_1(x_1, t_1 | x_2, t_2) = p_1(x_1, t_1)    (3.12)
However, it is not possible to restore the conditional PDF from the marginal one. The amount of additional information carried by the conditional PDF is quite different in different cases. If \xi(t_1) and \xi(t_2) are independent samples of a random process \xi(t), then¹

p_2(x_1, x_2; t_1, t_2) = p_1(x_1, t_1)\,p_1(x_2, t_2)    (3.13)

The conditional PDF is then

p_1(x_1, t_1 | x_2, t_2) = \frac{p_2(x_1, x_2; t_1, t_2)}{p_1(x_2, t_2)} = p_1(x_1, t_1)    (3.14)
thus stating that the knowledge of x_2 does not contribute any additional knowledge about the value of the process at the previous time. In the other extreme case, two samples \xi(t_1) and \xi(t_2) can be related by a deterministic functional dependence, i.e. \xi(t_1) = g[\xi(t_2)]. In this case

p_2(x_1, x_2; t_1, t_2) = \delta(x_1 - g(x_2))\,p_1(x_2, t_2)    (3.15)

and the conditional PDF is reduced to a \delta function. Most realistic situations fall in between these two extremes. Eqs. (3.9)–(3.15) can be generalized to the case of several variables.
3.3 THE CHARACTERISTIC FUNCTIONS AND CUMULATIVE DISTRIBUTION FUNCTION
Instead of probability densities of an arbitrary order, one can consider the corresponding characteristic functions of the same order. The characteristic function \Theta_n(u_1, u_2, \dots, u_n; t_1, t_2, \dots, t_n) of order n is defined as the n-dimensional Fourier transform of the PDF of order n, i.e.

\Theta_n(u_1, \dots, u_n; t_1, \dots, t_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} p_n(x_1, \dots, x_n; t_1, \dots, t_n)\,\exp[\,j(u_1 x_1 + u_2 x_2 + \cdots + u_n x_n)\,]\,dx_1\,dx_2 \cdots dx_n    (3.16)
¹ p_1(x_1, t_1) and p_1(x_2, t_2) are different functions in general. However, since they represent the same process, the notation is kept the same.
It follows from eq. (3.16) that the characteristic function is the average value of a complex exponent

\Theta_n(u_1, \dots, u_n; t_1, \dots, t_n) = \langle \exp[\,j(u_1 \xi_1 + u_2 \xi_2 + \cdots + u_n \xi_n)\,] \rangle    (3.17)

The corner brackets here are used as a shortcut for averaging over the ensemble. It follows from eqs. (3.16) and (3.17) that

\Theta_n(0, 0, \dots, 0; t_1, \dots, t_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} p_n(x_1, \dots, x_n; t_1, \dots, t_n)\,\exp[j0]\,dx_1 \cdots dx_n = 1    (3.18)

and

\Theta_n(u_1, \dots, u_n; t_1, \dots, t_n) = \Theta_{n+m}(u_1, \dots, u_n, 0, \dots, 0; t_1, \dots, t_{n+m})    (3.19)
Cumulative distribution functions (CDF) P_n(x_1, \dots, x_n; t_1, \dots, t_n) of any order can be defined through

\frac{\partial^n}{\partial x_1\,\partial x_2 \cdots \partial x_n} P_n(x_1, \dots, x_n; t_1, \dots, t_n) = p_n(x_1, \dots, x_n; t_1, \dots, t_n)    (3.20)

thus leading to

P_n(x_1, \dots, x_n; t_1, \dots, t_n) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} p_n(x_1', \dots, x_n'; t_1, \dots, t_n)\,dx_1' \cdots dx_n'    (3.21)

Since p_n(x_1, \dots, x_n; t_1, \dots, t_n) \ge 0, the CDF P_n(x_1, \dots, x_n; t_1, \dots, t_n) is a non-decreasing function of all its arguments, which satisfies the following normalization condition

P_n(\infty, \infty, \dots, \infty; t_1, \dots, t_n) = 1    (3.22)
Considering that there is a one-to-one correspondence between p_n, P_n and \Theta_n, any of these functions provides the same description of the random process.
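Relation (3.17) also suggests a practical way of estimating a characteristic function as an ensemble average of the complex exponent. The sketch below compares such an estimate with the known closed form \Theta_1(u) = \exp(jum - u^2\sigma^2/2) of a Gaussian variable; all numerical values are illustrative.

```python
import cmath, random

# Estimate Theta1(u) = <exp(j*u*xi)> by ensemble averaging, eq. (3.17), and
# compare with the known Gaussian closed form; parameters are illustrative.
random.seed(7)
m, sigma, u = 0.5, 1.2, 0.9
samples = [random.gauss(m, sigma) for _ in range(200_000)]

cf_est = sum(cmath.exp(1j * u * x) for x in samples) / len(samples)
cf_true = cmath.exp(1j * u * m - (u * sigma) ** 2 / 2)
```

With 2·10^5 samples the Monte Carlo average agrees with the closed form to roughly the 1/\sqrt{N} level.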
3.4 MOMENT FUNCTIONS AND CORRELATION FUNCTIONS
As was pointed out in Section 3.2, a complete description of a random process is accomplished by a series of PDFs (CDFs, characteristic functions) of increasing order. However, in practice, it is important to find a simpler (though in most cases incomplete) description of random processes for the following reasons:
1. In many cases, one needs to consider a transformation of a random process by a linear or a non-linear system with memory. Even if a complete description of the input of such a system is achieved (i.e. the PDF of an arbitrary order is known), it is impossible to provide a general method of deriving the PDFs at the output of the system under consideration. The problem can be solved partially by calculating a sufficient number of moments (or cumulants) of the output process instead.

2. Often, a process cannot be described statistically based on some physical (theoretical) model, due to its complexity or the lack of a precise description. In this case, some statistical properties of the process can be obtained from experimental measurements. For example, low order moments of the process can be measured in many cases; however, a complete description in terms of a PDF of an arbitrary order is impossible. It is usually possible to determine the marginal PDF and the correlation function, but not the higher order PDFs.

3. Some processes are defined by a small number of parameters. For example, knowledge of the mean value and the covariance function completely describes a Gaussian process. Another interesting example is the so-called "Spherically Invariant Processes," which are also defined by their marginal PDF and the covariance function. In addition, a widely used parametric technique allows the fitting of a model distribution based on a few lower order moments: the Pearson class of PDFs is widely used in applications and requires only knowledge of the first four moments [3].

4. Answers to many practical questions can be obtained by considering certain particular characteristics of random processes (such as the mean, variance, correlation, etc.). For example, analysis of the properties of the output of a receiver with an amplitude detector, driven by a linearly amplitude modulated signal A(1 + m\cos\Omega t)\sin(\omega t + \varphi), usually ignores the statistical properties of the random phase \varphi.
In this case, only the statistical properties of A have an impact on the output [4].

5. It is often desired to obtain rough estimates of the output process, rather than its complete and accurate description.

Moment functions M_{i_1}(t), M_{i_1 i_2}(t_1, t_2), \dots of a random process \xi(t) are defined as follows. Let I = [i_1, i_2, \dots, i_n] be a set of non-negative integers. Then, for different moments of time t_1, t_2, \dots, t_n, one can define an n-dimensional moment function of order \nu = \sum_{k=1}^{n} i_k as

M_{i_1 i_2 \dots i_n}(t_1, t_2, \dots, t_n) = \langle \xi^{i_1}(t_1)\,\xi^{i_2}(t_2) \cdots \xi^{i_n}(t_n) \rangle = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}\, p_n(x_1, \dots, x_n; t_1, \dots, t_n)\,dx_1 \cdots dx_n    (3.23)
For example, the one-dimensional moment of order i_1 is nothing but the statistical average of the i_1-th power of the random process \xi(t) itself

M_{i_1}(t) = \langle \xi^{i_1}(t) \rangle = \int_{-\infty}^{\infty} x^{i_1}\,p_1(x; t)\,dx    (3.24)
while the two-dimensional moment M_{i_1 i_2}(t_1, t_2) of order i_1 + i_2 is defined as

M_{i_1 i_2}(t_1, t_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_1^{i_1} x_2^{i_2}\,p_2(x_1, x_2; t_1, t_2)\,dx_1\,dx_2    (3.25)

and so on. It is important to note that in order to consider moment functions of order n, one needs to know a PDF of order n. Instead of the moment functions M_{i_1 \dots i_n}(t_1, \dots, t_n) defined by eq. (3.23), one can consider so-called central moment functions of the same order and dimension, which are defined as follows

\mu_{i_1, i_2, \dots, i_n}(t_1, t_2, \dots, t_n) = \langle [\xi(t_1) - M_1(t_1)]^{i_1} \cdots [\xi(t_n) - M_1(t_n)]^{i_n} \rangle = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} (x_1 - M_1(t_1))^{i_1} (x_2 - M_1(t_2))^{i_2} \cdots (x_n - M_1(t_n))^{i_n}\, p_n(x_1, \dots, x_n; t_1, \dots, t_n)\,dx_1 \cdots dx_n    (3.26)
There is an obvious relation between the moments and the central moments of the same random process. Indeed, expanding the product (x_1 - M_1(t_1))^{i_1}(x_2 - M_1(t_2))^{i_2} \cdots (x_n - M_1(t_n))^{i_n} into a power series, one obtains

(x_1 - M_1(t_1))^{i_1}(x_2 - M_1(t_2))^{i_2} \cdots (x_n - M_1(t_n))^{i_n} = \sum_{k_1} \sum_{k_2} \cdots \sum_{k_n} C_{k_1 k_2 \dots k_n}\, x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}    (3.27)

and, thus,

\mu_{i_1, i_2, \dots, i_n}(t_1, \dots, t_n) = \sum_{k_1} \cdots \sum_{k_n} C_{k_1 \dots k_n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_1^{k_1} \cdots x_n^{k_n}\, p_n(x_1, \dots, x_n; t_1, \dots, t_n)\,dx_1 \cdots dx_n = \sum_{k_1} \cdots \sum_{k_n} C_{k_1 \dots k_n}\, M_{k_1, \dots, k_n}(t_1, \dots, t_n)    (3.28)
Here, C_{k_1 k_2 \dots k_n} = C_{k_1 k_2 \dots k_n}(t_1, t_2, \dots, t_n) is a function of time only, and it can be obtained by collecting terms with the same powers of the x_i. For example, the central moment of order k, considered at a single moment of time t, is related to the moments of order up to k as

\mu_k(t) = \int_{-\infty}^{\infty} (x - m_1(t))^k\,p_1(x; t)\,dx = \sum_{i=0}^{k} \frac{k!}{i!\,(k-i)!}\,(-1)^i\,m_1^i(t)\,M_{k-i}(t)    (3.29)
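Relation (3.29) is easy to exercise numerically. The sketch below converts the raw moments of an Exponential(1) variable (an illustrative choice, with M_k = k!) into central moments; note that the i = 0 term of the binomial sum contributes M_k itself.

```python
import math

def central_from_raw(raw, k):
    # eq. (3.29): mu_k = sum_{i=0}^{k} C(k, i) (-M1)^i M_{k-i},  with M0 = 1
    m1 = raw[1]
    return sum(math.comb(k, i) * (-m1) ** i * raw[k - i] for i in range(k + 1))

raw = [1, 1, 2, 6, 24]   # raw moments M0..M4 of Exponential(1): M_k = k!
```

For this distribution the known central moments are \mu_2 = 1, \mu_3 = 2 and \mu_4 = 9, which the conversion reproduces exactly.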
Similarly,

\mu_{1,1}(t_1, t_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [x_1 - m_1(t_1)][x_2 - m_1(t_2)]\,p_2(x_1, x_2; t_1, t_2)\,dx_1\,dx_2 = M_{1,1}(t_1, t_2) - m_1(t_1)\,m_1(t_2)    (3.30)
and so on. It is interesting to note that in order to find an expression for the central moment of order \nu, moments of order up to \nu (but not higher) must be considered. The reverse statement is also easy to verify. This can be accomplished by considering the following identity

x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} = ([x_1 - m_1(t_1)] + m_1(t_1))^{i_1}\,([x_2 - m_1(t_2)] + m_1(t_2))^{i_2} \cdots ([x_n - m_1(t_n)] + m_1(t_n))^{i_n}    (3.31)
Definition (3.23) can be extended to produce generalized averages. In particular, instead of powers of x_i in eq. (3.23), one can use a general function g(x_1, x_2, \dots, x_n) of n variables:

M_g(t_1, t_2, \dots, t_n) = \langle g(\xi_1, \xi_2, \dots, \xi_n) \rangle = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, x_2, \dots, x_n)\,p_n(x_1, x_2, \dots, x_n; t_1, t_2, \dots, t_n)\,dx_1 \cdots dx_n    (3.32)
Obviously, this definition includes the definitions of moments, central moments and fractional moments (obtained when the indices are not necessarily integer). It also covers the characteristic function, absolute and factorial moments, and many other quantities used in statistics and signal processing [4,5]. The moment functions can be obtained by differentiating the corresponding characteristic function, rather than by integration of the PDF. Indeed, using the Taylor series of the exponent, one obtains

\Theta_1(u; t) = \int_{-\infty}^{\infty} p_1(x; t)\,\exp[jux]\,dx = \int_{-\infty}^{\infty} p_1(x; t) \left( \sum_{n=0}^{\infty} \frac{(jux)^n}{n!} \right) dx = \sum_{n=0}^{\infty} \frac{(ju)^n}{n!} \int_{-\infty}^{\infty} x^n\,p_1(x; t)\,dx = \sum_{n=0}^{\infty} \frac{j^n M_n(t)}{n!}\,u^n    (3.33)

Since eq. (3.33) is the Taylor expansion of \Theta_1(u; t), its coefficients can be obtained through the values of its derivatives at u = 0 as

M_n(t) = (-j)^n \left. \frac{d^n}{du^n}\,\Theta_1(u; t) \right|_{u=0}    (3.34)
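Eq. (3.34) can be checked by numerically differentiating a known characteristic function at u = 0. The sketch below uses the Exponential(1) characteristic function 1/(1 - ju) (an illustrative choice, with moments M_1 = 1 and M_2 = 2) and central finite differences.

```python
# Moments from derivatives of the characteristic function, eq. (3.34), using
# central finite differences; Theta(u) = 1/(1 - j*u) is the Exponential(1) CF.
def cf(u):
    return 1 / (1 - 1j * u)

h = 1e-4
# M_n = (-j)^n d^n Theta / du^n at u = 0
M1 = ((-1j) * (cf(h) - cf(-h)) / (2 * h)).real
M2 = ((-1j) ** 2 * (cf(h) - 2 * cf(0) + cf(-h)) / h ** 2).real
```

The finite-difference approximation reproduces both moments with an O(h^2) error.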
It is possible to show that similar expressions are valid for k-dimensional moment functions [5]. It is also possible to select a narrow group of moments which provide significant information about the random process under consideration, the so-called covariance functions. Cumulant
functions K_1(t_1), K_2(t_1, t_2), K_3(t_1, t_2, t_3), \dots are defined through the expansion of \ln[\Theta_n(u_1, u_2, \dots, u_n; t_1, \dots, t_n)] into a MacLaurin series. All of them are symmetric functions of their arguments. In the one-dimensional case, cumulant functions are known as cumulants, or semi-invariants [5]. It is also possible to express the covariance functions through the related moment functions of similar order. In order to show this, one can use the following MacLaurin series of \ln(1+z) [6]

\ln(1+z) = z - \frac{1}{2}z^2 + \frac{1}{3}z^3 - \frac{1}{4}z^4 + \cdots + \frac{(-1)^{n-1}}{n}z^n + \cdots    (3.35)
Setting

z = \Theta_1(u) - 1    (3.36)

and using the expansion of \Theta_1(u) in terms of the moment functions M_\nu(t), one obtains

\ln \Theta_1(u) = (\Theta_1 - 1) - \frac{1}{2}(\Theta_1 - 1)^2 + \cdots = \sum_{\nu=1}^{\infty} \frac{M_\nu}{\nu!}(ju)^\nu - \frac{1}{2}\left[ \sum_{\nu=1}^{\infty} \frac{M_\nu}{\nu!}(ju)^\nu \right]^2 + \frac{1}{3}\left[ \sum_{\nu=1}^{\infty} \frac{M_\nu}{\nu!}(ju)^\nu \right]^3 - \cdots    (3.37)
since

\Theta_1(0) = \int_{-\infty}^{\infty} p_1(x)\,\exp[ju \cdot 0]\,dx = \int_{-\infty}^{\infty} p_1(x)\,dx = 1    (3.38)
The right-hand side of expression (3.37) is a polynomial in (ju). Rearranging terms in eq. (3.37), this equality can be rewritten as

\ln \Theta_1(u) = \sum_{\nu=1}^{\infty} \frac{\kappa_\nu}{\nu!}(ju)^\nu    (3.39)

or

\Theta_1(u) = \exp\left[ \sum_{\nu=1}^{\infty} \frac{\kappa_\nu}{\nu!}(ju)^\nu \right]    (3.40)
Expansion (3.39) defines the cumulants \kappa_\nu of order \nu, or semi-invariants of order \nu. It can be seen from eq. (3.37) that \kappa_\nu is a polynomial in M_1, M_2, \dots, M_\nu and vice versa. Expressions relating cumulant and moment functions can be obtained by equating terms of the same order in (ju) in eqs. (3.39) and (3.37). As a result

\kappa_1 = M_1

\kappa_2 = M_2 - M_1^2 = \mu_2 = \sigma^2

\kappa_3 = M_3 - 3M_1 M_2 + 2M_1^3 = \mu_3

\kappa_4 = M_4 - 3M_2^2 - 4M_1 M_3 + 12M_1^2 M_2 - 6M_1^4 = \mu_4 - 3\mu_2^2    (3.41)

and so on. Expressions for higher order cumulants can be found in [5,7].
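The relations (3.41) can be verified against a distribution whose moments are known. The sketch below uses a Gaussian variable with an assumed mean and unit variance, for which \kappa_3 and \kappa_4 must vanish.

```python
# Check of (3.41) for a Gaussian variable with mean m and variance s2
# (illustrative values); kappa3 and kappa4 vanish for any Gaussian.
m, s2 = 0.5, 1.0
M1 = m
M2 = m ** 2 + s2
M3 = m ** 3 + 3 * m * s2
M4 = m ** 4 + 6 * m ** 2 * s2 + 3 * s2 ** 2

k1 = M1
k2 = M2 - M1 ** 2
k3 = M3 - 3 * M1 * M2 + 2 * M1 ** 3
k4 = M4 - 3 * M2 ** 2 - 4 * M1 * M3 + 12 * M1 ** 2 * M2 - 6 * M1 ** 4
```

The computation confirms \kappa_1 = m, \kappa_2 = \sigma^2 and \kappa_3 = \kappa_4 = 0, independent of the shift m.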
It is important to mention that the first cumulant coincides with the average of the random process:

\kappa_1(t) = M_1(t) = \langle \xi(t) \rangle    (3.42)

while the second cumulant is its variance

\kappa_2(t) = M_2(t) - M_1^2(t) = \langle [\xi(t) - M_1(t)]^2 \rangle = \sigma^2(t)    (3.43)
Higher order cumulants are also very important in describing the shape of the marginal PDF p_1(x). Let us just mention that the normalized third and fourth cumulants are known as the normalized skewness and the excess [7,8]

\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}}, \qquad \gamma_2 = \frac{\kappa_4}{\kappa_2^2}    (3.44)
It is important to notice that the higher order cumulants do not coincide with the corresponding central moments; the difference starts with \kappa_4, as can be seen from eq. (3.41). Higher order covariance (cumulant) functions K_1(t_1), K_2(t_1, t_2), K_3(t_1, t_2, t_3), \dots are defined by expanding \ln \Theta_n(u_1, \dots, u_n) into a MacLaurin series. The first three covariance functions are listed below:

K_1(t) = \kappa_1 = M_1(t) = \langle \xi(t) \rangle

K_2(t_1, t_2) = \langle [\xi(t_1) - M_1(t_1)][\xi(t_2) - M_1(t_2)] \rangle = M_2(t_1, t_2) - M_1(t_1)M_1(t_2) = \mu_{1,1}(t_1, t_2)    (3.45)

K_3(t_1, t_2, t_3) = M_3(t_1, t_2, t_3) - M_1(t_1)M_2(t_2, t_3) - M_1(t_2)M_2(t_1, t_3) - M_1(t_3)M_2(t_1, t_2) + 2M_1(t_1)M_1(t_2)M_1(t_3) = \mu_{1,1,1}(t_1, t_2, t_3)

It is easy to see that for t_1 = t_2 = t_3, eq. (3.45) reduces to eq. (3.41). Knowledge of the covariance functions allows one to restore the characteristic functions and thus the PDFs. This means that an adequate description of a random process can be achieved by a series of PDFs, a series of characteristic functions, moments or cumulants, as indicated in Fig. 3.2. It is important to note that the decision of which characteristic of a random process to use depends heavily on the type of random process. For a Gaussian process, which is
Figure 3.2
Relations between PDF, CF, moments and cumulants.
completely described by its mean and covariance function, all descriptions can be applied with the same complexity. In future, the first two cumulant functions M1 ðtÞ and K2 ðt1 ; t2 Þ will play a central role in the analysis of random processes and their transformation by linear and non-linear systems. A branch of the theory of random processes based on M1 ðtÞ and K2 ðt1 ; t2 Þ is called ‘‘the correlation theory’’ of random processes. Detailed investigation of cumulants can be found in [8]. Some discussions, closely following the approach taken in [8] can also be found in Section 3.8.
3.5 STATIONARY AND NON-STATIONARY PROCESSES
One of the important classes of random processes is the class of so-called stationary random processes. A random process \xi(t) is called stationary in the strict sense if the PDF p_n(x_1, x_2, \dots, x_n; t_1, \dots, t_n) of any arbitrary order n does not change under a translation of the time instants t_1, t_2, \dots, t_n along the time axis. In other words, for any t_0 and n

p_n(x_1, \dots, x_n; t_1, \dots, t_n) = p_n(x_1, \dots, x_n; t_1 - t_0, t_2 - t_0, \dots, t_n - t_0)    (3.46)

This means that a stationary process is homogeneous in time. Of course, eq. (3.46) implies that the same property holds for the characteristic functions, moments and cumulants. Stationary processes are similar to steady-state processes in deterministic systems. All other processes are called non-stationary. As an example of a non-stationary process, one can consider the switching load of a power system (with a random number of users) before it reaches a steady state. It can be seen from eq. (3.46) that the marginal PDF p_1(x; t) of a stationary process does not depend on time at all, i.e.

p_1(x; t) = p_1(x; t - t) = p_1(x; 0) = p_1(x)    (3.47)

At the same time, the PDF of the second order depends only on the time difference \tau = t_2 - t_1 between the two instants of time

p_2(x_1, x_2; t_1, t_2) = p_2(x_1, x_2; 0, t_2 - t_1) = p_2(x_1, x_2; \tau)    (3.48)
In general, the PDF of order n depends only on the (n-1) time intervals \tau_k = t_{k+1} - t_k between t_1, t_2, \dots, t_n. Since the marginal PDF of a stationary process does not depend on time, neither do the characteristic function, moments and cumulants of the first order. In particular, a change of the time scale also does not change these quantities. In a sense, a description using only the marginal PDF is similar to a description of a harmonic signal s(t) = A\cos(\omega t + \varphi) using only its magnitude A: it provides knowledge about the range of the process (from -A to A) but does not tell how fast the process changes.
Since the probability density of the second order p_2(x_1, x_2; t_1, t_2) depends only on the time difference \tau = t_2 - t_1 (the time lag), the covariance² function C(t_1, t_2) = K_2(t_1, t_2) also depends only on the time lag

C(\tau) = \langle [\xi(t) - m][\xi(t+\tau) - m] \rangle = \langle \xi(t)\,\xi(t+\tau) \rangle - m^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x_1 - m)(x_2 - m)\,p_2(x_1, x_2; \tau)\,dx_1\,dx_2    (3.49)

The variance of the random process \sigma^2 then coincides with the value of the covariance function at zero lag:

\sigma^2 = C(0) = \int_{-\infty}^{\infty} (x - m)^2\,p_1(x)\,dx    (3.50)
In many practical cases, one is limited to considering probability densities of order two. Moreover, only the mean value and the variance of the process are often sufficient to obtain an answer or a rough estimate. Due to this fact, a more relaxed definition of stationarity is usually used. A random process \xi(t) is called a wide sense stationary random process if its mean does not depend on time (m = M_1(t) = const.) and its covariance function C_\xi(t_1, t_2) depends only on the time lag \tau = t_2 - t_1, i.e. C_\xi(\tau) = C_\xi(t_2 - t_1). A strict sense stationary process is also stationary in the wide sense; the reverse is not always true. It is important to mention that there is an important class of random processes for which these two definitions are equivalent: any wide sense stationary Gaussian process is a strict sense stationary process as well. This follows from the fact that knowledge of the mean and covariance function completely describes any Gaussian process [1,4]. Since Gaussian processes are often encountered in applications, it is important to learn how to calculate or measure the mean value and the covariance function; on the other hand, no further information is required to describe such a process. This fact emphasizes the importance of the correlation theory for applications.
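A wide sense stationary process can also be examined empirically by time-averaging a single long realization. The sketch below uses a first order autoregression x_{k+1} = a\,x_k + w_k driven by white Gaussian noise, a standard wide-sense stationary model (not from the text), whose covariance C(\tau) = a^{|\tau|}/(1 - a^2) is known; the parameters and seed are illustrative.

```python
import random

# Time-average estimation of the covariance function (3.52) for the wide sense
# stationary AR(1) model x[k+1] = a*x[k] + w[k], w[k] ~ N(0, 1).
random.seed(3)
a, n = 0.7, 200_000
x, path = 0.0, []
for _ in range(n):
    x = a * x + random.gauss(0.0, 1.0)
    path.append(x)

mean = sum(path) / n

def C(tau):
    # sample covariance at lag tau
    return sum((path[t] - mean) * (path[t + tau] - mean)
               for t in range(n - tau)) / (n - tau)

var_theory = 1 / (1 - a * a)   # C(0); theory gives C(tau) = var_theory * a**tau
```

The time-average estimates of C(0) and C(1) match the theoretical values within the statistical fluctuation of a single realization.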
3.6 COVARIANCE FUNCTIONS AND THEIR PROPERTIES
A covariance function between samples of the same process at two moments of time is called the autocovariance function. The general definition of the autocovariance is given by eq. (3.45)

C_\xi(t_1, t_2) = \langle [\xi(t_1) - m_1(t_1)][\xi(t_2) - m_1(t_2)] \rangle    (3.51)

which is reduced to

C_\xi(\tau) = \langle \xi(t)\,\xi(t+\tau) \rangle - m^2    (3.52)

in the case of a stationary process \xi(t).
² Here and in the following, these special notations are used: $C_\xi(t_1, t_2) = \langle [\xi(t_1) - m_1(t_1)][\xi(t_2) - m_1(t_2)] \rangle$ for the covariance function, and $R_\xi(t_1, t_2) = C_\xi(t_1, t_2) + m_1(t_1)m_1(t_2) = \langle \xi(t_1)\xi(t_2) \rangle$ for the correlation function. The subindex indicating the random process will often be omitted if this does not lead to confusion.
RANDOM PROCESSES
The definition can be extended to the covariance between two or more random processes. For simplicity, the case of two stationary random processes $\xi(t)$ and $\eta(t)$ with averages $m_\xi$ and $m_\eta$ respectively is considered. It is possible to define the mutual (joint) covariance functions between these two processes as

$$C_{\xi\eta}(t_1, t_2) \triangleq \langle [\xi(t_1) - m_\xi][\eta(t_2) - m_\eta] \rangle, \qquad C_{\eta\xi}(t_1, t_2) \triangleq \langle [\eta(t_1) - m_\eta][\xi(t_2) - m_\xi] \rangle \qquad (3.53)$$

If both $C_{\xi\eta}$ and $C_{\eta\xi}$ depend only on the time lag $\tau = t_2 - t_1$, i.e.

$$C_{\xi\eta}(t_1, t_2) = C_{\xi\eta}(\tau), \qquad C_{\eta\xi}(t_1, t_2) = C_{\eta\xi}(\tau) \qquad (3.54)$$

then $\xi(t)$ and $\eta(t)$ are called stationary related processes. For stationary related processes, one obtains

$$C_{\xi\eta}(\tau) = \langle [\xi(t) - m_\xi][\eta(t + \tau) - m_\eta] \rangle = \langle [\xi(t - \tau) - m_\xi][\eta(t) - m_\eta] \rangle = C_{\eta\xi}(-\tau) \qquad (3.55)$$

Equations (3.52)–(3.53) can be generalized to the case of complex random processes as follows:

$$C_{\xi\eta}(\tau) = \langle [\xi(t) - m_\xi][\eta(t + \tau) - m_\eta]^* \rangle \qquad (3.56)$$

$$C_{\eta\xi}(\tau) = \langle [\eta(t) - m_\eta][\xi(t + \tau) - m_\xi]^* \rangle \qquad (3.57)$$

where $^*$ stands for complex conjugation. The importance of the mutual covariance function in describing the relation between two random processes can be explained by considering two extreme cases of dependence: (a) $\xi(t)$ and $\eta(t)$ are independent, and (b) $\xi(t)$ and $\eta(t)$ are linearly related, i.e. $\eta(t) = \alpha\xi(t) + \beta$, $\alpha > 0$. In case (a), the joint probability density is just the product of the marginal PDFs:

$$p_2(x_1, x_2) = p_\xi(x_1)\, p_\eta(x_2) \qquad (3.58)$$

where $p_\xi(x_1)$ is the PDF of $\xi(t)$ and $p_\eta(x_2)$ is the PDF of $\eta(t + \tau)$. In this case,

$$C_{\xi\eta} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} (x_1 - m_\xi)(x_2 - m_\eta)\, p_\xi(x_1)\, p_\eta(x_2)\, dx_1\, dx_2 = 0 \qquad (3.59)$$

Thus, the mutual covariance function between two independent processes is equal to zero (i.e. there is no correlation between $\xi(t)$ and $\eta(t)$). On the contrary, under the linear dependence of case (b), one can write

$$C_{\xi\eta}(t_1, t_2) = \langle [\xi(t_1) - m_\xi][\eta(t_2) - m_\eta] \rangle = \alpha\, C_\xi(\tau) \qquad (3.60)$$
since

$$\langle \eta(t) \rangle = \langle \alpha\xi(t) + \beta \rangle = \alpha\langle \xi(t) \rangle + \beta = \alpha m_\xi + \beta \qquad (3.61)$$

Setting $t_2 = t_1$ ($\tau = 0$), one obtains

$$C_{\xi\eta}(t_1, t_1) = C_{\xi\eta}(0) = \alpha\sigma_\xi^2 \qquad (3.62)$$

It will be shown later that eq. (3.62) is the maximum correlation which can be achieved. In other words, the covariance function shows how much knowledge about the process $\eta(t)$ can be obtained by observing the process $\xi(t)$. The following properties of autocovariance functions will be used often.

1. The covariance function $C(\tau)$ of a stationary process is an even function, i.e.

$$C(\tau) = C(-\tau) \qquad (3.63)$$

Indeed, since the process is stationary, one can shift the time origin by $\tau$ to obtain

$$C(\tau) = \langle \xi(t)\xi(t + \tau) \rangle - m^2 = \langle \xi(t - \tau)\xi(t) \rangle - m^2 = C(-\tau) \qquad (3.64)$$

2. The absolute value of the covariance function does not exceed its value at $\tau = 0$, i.e. the variance of the process:

$$|C(\tau)| \le C(0) = \sigma^2 \qquad (3.65)$$

Indeed, the average of a non-negative function is a non-negative quantity, i.e.

$$E\{[\xi(t) - m] \mp [\xi(t + \tau) - m]\}^2 \ge 0 \qquad (3.66)$$

The last inequality implies that

$$\langle [\xi(t) - m]^2 \mp 2[\xi(t) - m][\xi(t + \tau) - m] + [\xi(t + \tau) - m]^2 \rangle = 2[\sigma^2 \mp C(\tau)] \ge 0 \qquad (3.67)$$

and thus

$$|C(\tau)| \le \sigma^2 \qquad (3.68)$$

For many practical random processes, the following condition is also satisfied:

$$\lim_{\tau \to \infty} C(\tau) = 0 \qquad (3.69)$$
From a physical point of view, this result can be explained by the fact that most physically realizable systems have an impulse response which decays to zero (stable systems). As a consequence, the effect of the current value of the system state on the future is limited to the interval where the impulse response is significant, and it disappears as the time lag approaches infinity. To summarize, the covariance function $C(\tau)$ of a stationary process is an even function with a maximum at $\tau = 0$; this maximum is equal to the variance $\sigma^2$, and the covariance function vanishes as $\tau \to \infty$. Some commonly used covariance functions are listed in Table 3.1. However, not every function which satisfies the conditions above can be a proper covariance function: an autocovariance function must satisfy one additional condition.

3. The covariance function is a positive definite function, i.e. for any coefficients $a_i$, $a_j$ and any $N$ moments of time

$$\sum_{i,j=1}^{N} a_i a_j\, C(t_i, t_j) \ge 0 \qquad (3.70)$$

This is an obvious consequence of the fact that the average of a non-negative random variable is a non-negative quantity:

$$\left\langle \left| \sum_{i=1}^{N} a_i [\xi(t_i) - m] \right|^2 \right\rangle = \sum_{i,j=1}^{N} a_i a_j \langle [\xi(t_i) - m][\xi(t_j) - m] \rangle = \sum_{i,j=1}^{N} a_i a_j\, C(t_i, t_j) \ge 0 \qquad (3.71)$$

An equivalent form of this definition is the fact that the cosine transform of the covariance function is a non-negative function:

$$\int_0^{\infty} C(\tau)\cos\omega\tau\, d\tau \ge 0 \qquad (3.72)$$
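Positive definiteness of a candidate covariance function can be probed numerically: the matrix $C_{ij} = C(t_i - t_j)$ sampled on any time grid must have no negative eigenvalues. The sketch below (illustrative, not from the book) checks this for the exponential covariance of Table 3.1 and for a rectangular pulse, which is even and vanishes at infinity yet is not a valid covariance function — its cosine transform (a sinc) takes negative values, violating eq. (3.72):

```python
import numpy as np

# Illustrative sketch: testing the positive definiteness condition (3.70)
# by examining the eigenvalues of the sampled covariance matrix.
alpha = 1.0
t = np.linspace(0.0, 5.0, 40)
lags = np.abs(t[:, None] - t[None, :])      # |t_i - t_j| on the grid

# exp(-alpha*|tau|) is a genuine covariance: all eigenvalues >= 0
eig_exp = np.linalg.eigvalsh(np.exp(-alpha*lags)).min()

# rectangular "covariance" (1 for |tau| < 1, else 0): even, decaying,
# but NOT positive definite -> negative eigenvalues appear
eig_rect = np.linalg.eigvalsh(np.where(lags < 1.0, 1.0, 0.0)).min()
```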
It is worth mentioning that mutual covariance functions do not, in general, satisfy the conditions above. Some of the properties of the autocovariance can be generalized to the case of non-stationary processes. In this case, eqs. (3.63) and (3.65) become

$$C(t_1, t_2) = C(t_2, t_1), \qquad |C(t_1, t_2)| \le \sigma(t_1)\sigma(t_2) \qquad (3.73)$$

3.7

CORRELATION COEFFICIENT
Equations (3.53) and (3.62) show that the covariance between two functions describes not only the degree of dependence between two random processes, but also reflects their variances. Indeed, if one of the functions $\xi(t)$ or $\eta(t)$ deviates only a little from its mean, the absolute value of the covariance function will be small, no matter how closely related $\xi(t)$
Table 3.1  Common covariance functions and their Fourier transforms [20]

For each entry the normalized covariance function $R(\tau) = C(\tau)/\sigma^2$, the normalized PSD $s(\omega) = \int_{-\infty}^{\infty} R(\tau)e^{-j\omega\tau}\,d\tau$, the description (form-filter) and, where it takes a simple form, the correlation interval $\tau_{corr} = \int_0^\infty |R(\tau)|\,d\tau$ are listed:

- White noise: $R(\tau) = \frac{N_0}{2}\delta(\tau)$; $s(\omega) = \frac{N_0}{2}$; $\tau_{corr} = 0$
- First-order low-pass filter: $R(\tau) = e^{-\alpha|\tau|}$; $s(\omega) = \dfrac{2\alpha}{\alpha^2 + \omega^2}$; $\tau_{corr} = 1/\alpha$
- Second-order low-pass filter: $R(\tau) = (1 + \alpha|\tau|)\,e^{-\alpha|\tau|}$; $s(\omega) = \dfrac{4\alpha^3}{(\alpha^2 + \omega^2)^2}$; $\tau_{corr} = 2/\alpha$
- Gaussian filter: $R(\tau) = e^{-\alpha^2\tau^2}$; $s(\omega) = \sqrt{\dfrac{\pi}{\alpha^2}}\exp\left[-\dfrac{\omega^2}{4\alpha^2}\right]$; $\tau_{corr} = \dfrac{\sqrt{\pi}}{2\alpha}$
- Resonant contour: $R(\tau) = e^{-\alpha|\tau|}\cos\omega_0\tau$; $s(\omega) = \dfrac{\alpha}{\alpha^2 + (\omega - \omega_0)^2} + \dfrac{\alpha}{\alpha^2 + (\omega + \omega_0)^2}$
- Resonant contour: $R(\tau) = e^{-\alpha|\tau|}\left[\cos\omega_0\tau + \dfrac{\alpha}{\omega_0}\sin\omega_0|\tau|\right]$; $s(\omega) = \dfrac{4\alpha(\omega_0^2 + \alpha^2)}{(\omega^2 - \omega_0^2 - \alpha^2)^2 + 4\alpha^2\omega^2}$
- Ideal low-pass filter: $R(\tau) = \dfrac{\sin\Delta\omega\tau}{\Delta\omega\tau}$; $s(\omega) = \dfrac{\pi}{\Delta\omega}$ if $|\omega| \le \Delta\omega$, $0$ if $|\omega| > \Delta\omega$
- Jakes' fading: $R(\tau) = J_0(2\pi f_D|\tau|)$; $s(\omega) = \dfrac{2}{\sqrt{(2\pi f_D)^2 - \omega^2}}$ for $|\omega| < 2\pi f_D$
- Sinusoid with a random phase: $R(\tau) = \cos\omega_0\tau$; $s(\omega) = \pi[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]$; $\tau_{corr} = \infty$
- Ideal bandpass filter: $R(\tau) = \dfrac{\sin(\Delta\omega\tau/2)}{\Delta\omega\tau/2}\cos\omega_0\tau$; $s(\omega) = \dfrac{\pi}{\Delta\omega}$ if $\bigl||\omega| - \omega_0\bigr| \le \dfrac{\Delta\omega}{2}$, $0$ otherwise
- Gaussian bandpass filter: $R(\tau) = e^{-\alpha^2\tau^2}\cos\omega_0\tau$; $s(\omega) = \dfrac{\sqrt{\pi}}{2\alpha}\left[e^{-(\omega - \omega_0)^2/4\alpha^2} + e^{-(\omega + \omega_0)^2/4\alpha^2}\right]$
and $\eta(t)$ are.³ For a quantitative description of linear⁴ correlation, it is reasonable to introduce a normalized function, called the correlation coefficient:

$$R_\xi(\tau) = \frac{C_\xi(\tau)}{\sigma^2}, \qquad R_{\xi\eta}(t_1, t_2) = \frac{C_{\xi\eta}(t_1, t_2)}{\sigma_\xi(t_1)\sigma_\eta(t_2)} \qquad (3.74)$$

The functions $R_{\xi\eta}$ and $R_\xi$ are called the mutual correlation coefficient and the autocorrelation coefficient respectively. The alternative notation $\rho(\tau)$ for the correlation coefficient will also be used here. It follows from eq. (3.62) that if two random functions are related by a deterministic linear function, then the coefficient of mutual correlation between them at any instant of time is equal to 1; if these functions are independent, the correlation coefficient is zero. Thus, the correlation coefficient describes a linear (but not an arbitrary [10,11]) dependence between two processes. Two stationary processes $\xi(t)$ and $\eta(t)$ with the correlation coefficient between $\xi(t)$ and $\eta(t + \tau)$ equal to zero for all values of $\tau > 0$ (i.e. $R_{\xi\eta}(\tau) = 0$) are called uncorrelated. It was shown above that any two independent processes are also uncorrelated. The reverse statement is not generally true: $R(\tau) = 0$ does not provide enough grounds to state that the higher order moments (non-linear covariance functions) $\langle \xi^{i_1}(t_1)\eta^{i_2}(t_2) \rangle$ all equal zero in the case of non-Gaussian processes. One of the most important examples showing that uncorrelatedness does not imply independence can be constructed as follows. Let the joint PDF of the second order be given by

$$p_2(x_1, x_2) = \frac{p_s\!\left(\sqrt{x_1^2 + x_2^2}\right)}{2\pi\sqrt{x_1^2 + x_2^2}} \qquad (3.75)$$

where $p_s(x)$ is the PDF of an arbitrary positive random variable. It can be seen that, due to symmetry, the cross-covariance is equal to zero, i.e. $\xi(t)$ and $\eta(t)$ are uncorrelated. However, $p_2(x_1, x_2)$ can be represented as a product $p_1(x_1)p_1(x_2)$ only in the case where $p_s(s)$ is a Rayleigh PDF. It follows from eqs. (3.63)–(3.72) that the autocorrelation coefficient has the following properties:

$$R(\tau) = R(-\tau); \quad |R(\tau)| \le R(0) = 1; \quad \lim_{\tau\to\infty} R(\tau) = 0; \quad \int_0^\infty R(\tau)\cos\omega\tau\, d\tau \ge 0 \qquad (3.76)$$

In the following, the notion of the correlation interval $\tau_{corr}$ will often be used, for both of the practically important kinds of correlation coefficient: an autocorrelation coefficient with a monotonic decay, $R(\tau) = \rho(\tau)$, and a rapidly oscillating decaying correlation coefficient, for example of the form $R(\tau) = \rho(\tau)\cos\omega_0\tau$. In both cases, the correlation interval $\tau_{corr}$ is defined as

$$\tau_{corr} = \frac{1}{2}\int_{-\infty}^{\infty} |\rho(\tau)|\, d\tau = \int_0^\infty |\rho(\tau)|\, d\tau \qquad (3.77)$$

³ If one sets $\eta(t) = \varepsilon\xi(t)$, then $C_{\xi\eta}(\tau) = \varepsilon C_\xi(\tau)$: an arbitrarily small $\varepsilon$ makes $C_{\xi\eta}(\tau)$ arbitrarily small.
⁴ For non-linear correlation see [10,11].
Geometrically, the correlation interval $\tau_{corr}$ is equal to the base of a rectangle of height 1 having the same area as that under the curve $|\rho(\tau)|$ (see Table 3.1). The quantity $\tau_{corr}$ gives a rough estimate of the maximum time lag between two samples of the random process that still exhibit significant correlation. It is also important to mention that the definition (3.77) is not unique; alternative definitions can be introduced, for example based on the mean-square spread [12], or at the 3 dB level.
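The defining integral (3.77) is easy to evaluate numerically. The sketch below (illustrative; $\alpha$ and $\omega_0$ are arbitrary parameters, not from the book) computes $\tau_{corr}$ for the monotonic exponential coefficient, where the closed form is $1/\alpha$, and for the oscillating coefficient $e^{-\alpha\tau}\cos\omega_0\tau$, whose interval is smaller because the absolute value averages the fast oscillation:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative sketch: correlation interval tau_corr = int_0^inf |rho(tau)| dtau
alpha, w0 = 2.0, 20.0

# monotonic decay: closed form is 1/alpha
tau_corr_exp, _ = quad(lambda t: np.exp(-alpha*t), 0, np.inf)

# oscillating decay: |cos| averages to 2/pi, so roughly (2/pi)/alpha
tau_corr_osc, _ = quad(lambda t: abs(np.exp(-alpha*t)*np.cos(w0*t)),
                       0, 10.0, limit=300)   # tail beyond t=10 is negligible
```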
3.8

CUMULANT FUNCTIONS

Cumulant functions $\kappa_{x,\ldots,z}(t_1^{\nu_1}, \ldots, t_N^{\nu_N})$ of a random process, or of a group of random processes, can be introduced using the definition of cumulants of a random variable, or of a group of random variables, by sampling the random processes under consideration at the moments of time $t_1, \ldots, t_N$. First, the definitions are given for the case of a single random process $\xi(t)$ and are then extended to the case of a group of processes. The cumulant and moment brackets technique is applied below to the case of random processes; as a result, the cumulants become functions of the moments of time $t_1, \ldots, t_N$. Let $\xi(t)$ be a scalar random process considered at $N$ moments of time $t_1, \ldots, t_N$. Then $\{\xi(t_1), \xi(t_2), \ldots, \xi(t_N)\}$ is a random vector which, in turn, can be described using the joint PDF, the characteristic function, joint moments, or joint cumulants, as discussed in Chapter 2. For example, in the case of the marginal PDF $p_\xi(x; t_1)$, which depends on the time $t_1$, the characteristic function $\Theta(u; t_1)$ also depends on the time instant $t_1$, and as a result so do the cumulants $\kappa_\nu$:

$$\Theta(u; t_1) = \langle e^{ju\xi(t_1)} \rangle = \exp\left[\sum_{\nu=1}^{\infty} \frac{\kappa_\nu(t_1)}{\nu!}(ju)^\nu\right] \qquad (3.78)$$

In a similar way, higher order cumulants may depend on more than one moment of time. The following notation is adopted here. Let a random process $\xi(t)$ be sampled at $N$ moments of time $t = t_i$, $1 \le i \le N$; then $\kappa_{\nu_1, \nu_2, \ldots, \nu_N}(t_1, t_2, \ldots, t_N)$ is the cumulant of order $\sum_i \nu_i$ which, in turn, is the coefficient of the expansion of the log-characteristic function $\ln\Theta_N(ju_1, ju_2, \ldots, ju_N)$ into a power series:

$$\ln\Theta_N(ju_1, ju_2, \ldots, ju_N) = \sum_{i_1=0}^{\infty}\sum_{i_2=0}^{\infty}\cdots\sum_{i_N=0}^{\infty} (j)^{i_1 + i_2 + \cdots + i_N}\, \frac{\kappa_{i_1, i_2, \ldots, i_N}(t_1, t_2, \ldots, t_N)}{i_1!\, i_2!\cdots i_N!}\, u_1^{i_1} u_2^{i_2}\cdots u_N^{i_N} \qquad (3.79)$$
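The first few single-time cumulants are directly estimable from samples: $\kappa_1$ is the mean, $\kappa_2$ the variance, and $\kappa_3$ the third central moment. The sketch below (illustrative; the exponential distribution and rate are arbitrary choices) checks these against the known cumulants $\kappa_\nu = (\nu-1)!/\lambda^\nu$ of an exponential variable, and confirms that for a Gaussian variable the third cumulant vanishes, a fact used in Section 3.14:

```python
import numpy as np

# Illustrative sketch: sample estimates of the first cumulants kappa_1..kappa_3.
rng = np.random.default_rng(1)
lam = 2.0
x = rng.exponential(1.0/lam, 200_000)   # kappa_nu = (nu-1)!/lam**nu

k1 = x.mean()                           # kappa_1 = 1/lam   = 0.50
k2 = x.var()                            # kappa_2 = 1/lam^2 = 0.25
k3 = np.mean((x - k1)**3)               # kappa_3 = 2/lam^3 = 0.25

g = rng.normal(0.0, 1.0, 200_000)
k3_gauss = np.mean((g - g.mean())**3)   # ~0: Gaussian cumulants above 2 vanish
```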
3.9
ERGODICITY
Until now, the description of a random process has been achieved through the corresponding statistical averages over an ensemble of realizations, i.e. over the responses of a large number of identical systems. In many cases, for some strict-sense stationary processes, it is possible to obtain the same characteristics by averaging a single realization over time. This possibility is justified by the fact that a stationary process is homogeneous in time. Thus, a single
realization which is long enough contains all the information about the process. A process which can be described on the basis of a single realization is called an ergodic process. All ergodic processes are obviously stationary, but the reverse statement is not true. Leaving detailed mathematical considerations outside the scope of this book, it can be noted that a sufficient and necessary condition for a stationary process to be ergodic is that its covariance function vanishes as $\tau \to \infty$, i.e.

$$\lim_{\tau\to\infty} C(\tau) = 0 \qquad (3.80)$$

A few particular results can be derived from the ergodicity of a random process $\xi(t)$. First of all, let $Z(t)$ be a random function of an ergodic process $\xi(t)$; it is assumed that $Z(t)$ is also stationary and ergodic. Then, with probability 1,

$$\langle Z(t) \rangle = \lim_{T\to\infty} \frac{1}{T}\int_0^T Z(t)\, dt \qquad (3.81)$$

In order to prove this formula, one can consider a random variable $\bar{Z}_T$, defined as the average of $Z(t)$ over the time interval $[0, T]$:⁵

$$\bar{Z}_T = \frac{1}{T}\int_0^T Z(t)\, dt \qquad (3.82)$$

Taking the statistical average of both sides and changing the order of integration and statistical averaging, it can be seen that the average of $\bar{Z}_T$ coincides with that of the original process:

$$\langle \bar{Z}_T \rangle = \frac{1}{T}\int_0^T \langle Z(t) \rangle\, dt = \frac{\langle Z(t) \rangle}{T}\int_0^T dt = \langle Z(t) \rangle \qquad (3.83)$$

Here the fact that $\langle Z(t) \rangle$ does not depend on time for a stationary process has been used. The next step is to show that the variance $\sigma^2(T)$ of $\bar{Z}_T$ approaches zero as $T$ approaches infinity. This can be done by subtracting $\langle \bar{Z}_T \rangle$ from both sides of eq. (3.82) and squaring the difference:

$$(\bar{Z}_T - \langle \bar{Z}_T \rangle)^2 = \left[\frac{1}{T}\int_0^T \{Z(t) - \langle Z(t) \rangle\}\, dt\right]^2 = \frac{1}{T^2}\int_0^T\!\!\int_0^T \{Z(t_1) - \langle Z(t) \rangle\}\{Z(t_2) - \langle Z(t) \rangle\}\, dt_1\, dt_2 \qquad (3.84)$$

Taking the statistical average of both sides of eq. (3.84), one obtains

$$\sigma^2(T) = \langle (\bar{Z}_T - \langle \bar{Z}_T \rangle)^2 \rangle = \frac{1}{T^2}\int_0^T\!\!\int_0^T C_Z(t_2 - t_1)\, dt_1\, dt_2 \qquad (3.85)$$

⁵ In the following, an overbar over a function indicates averaging over time, i.e. $\overline{x(t)} = \frac{1}{T}\int_0^T x(t)\, dt$.
Here,

$$C_Z(t_2 - t_1) = \sigma_Z^2\, R_Z(t_2 - t_1) = E\bigl(\{Z(t_1) - \langle Z(t) \rangle\}\{Z(t_2) - \langle Z(t) \rangle\}\bigr) \qquad (3.86)$$

is the covariance function of the process $Z(t)$ and $R_Z(t_2 - t_1)$ is its correlation coefficient. By assumption, $Z(t)$ is a stationary process, and thus both of these functions depend only on $t_2 - t_1$. Changing variables to

$$\tau = t_2 - t_1, \quad t' = \frac{t_1 + t_2}{2}; \qquad t_2 = t' + \frac{\tau}{2}, \quad t_1 = t' - \frac{\tau}{2} \qquad (3.87)$$

and taking into account that $C_Z(\tau)$ is an even function, $C_Z(\tau) = C_Z(-\tau)$, one obtains

$$\sigma^2(T) = \frac{2}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) C_Z(\tau)\, d\tau = \frac{2\sigma_Z^2}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) R_Z(\tau)\, d\tau \qquad (3.88)$$

The last equation shows that in order to calculate $\sigma^2(T)$ one needs to know the correlation coefficient of $Z(t)$. In two particular cases (small and large $T$), a good approximation can be obtained. Let $\tau_{corr}$ be the correlation interval of the process $Z(t)$. Then, for $T \ll \tau_{corr}$, one can assume that $R_Z \approx 1$, and thus

$$\sigma^2(T) \approx \sigma_Z^2 \qquad (3.89)$$

If $T \gg \tau_{corr}$, then eq. (3.88) can be simplified to produce the following estimate:

$$\sigma^2(T) \approx \frac{2\sigma_Z^2}{T}\int_0^\infty |R_Z(\tau)|\, d\tau = \frac{2\sigma_Z^2\, \tau_{corr}}{T} \qquad (3.90)$$

Thus, if a stationary process $Z(t)$ has a finite variance and correlation interval and, in addition, its covariance function approaches zero, $\lim_{\tau\to\infty} C_Z(\tau) = 0$, then $\sigma^2(T)$ vanishes. This means that $\bar{Z}_T$ approaches a deterministic quantity which coincides with the average value of the process $\langle Z(t) \rangle$, according to eq. (3.83):

$$\langle Z(t) \rangle = \lim_{T\to\infty} \frac{1}{T}\int_0^T Z(t)\, dt \qquad (3.91)$$

At the same time, one can estimate how fast expression (3.91) converges. Indeed, it follows from eq. (3.90) that

$$\left| \frac{1}{T}\int_0^T Z(t)\, dt - \langle Z(t) \rangle \right| \sim \sigma_Z \left(\frac{2\tau_{corr}}{T}\right)^{1/2} \qquad (3.92)$$

This means that averaging over time converges at least as fast as the arithmetic mean

$$\bar{Z}_T = \frac{1}{N}\sum_{i=1}^{N} Z(i\,\Delta t) \qquad (3.93)$$
of identically distributed and independent random variables $Z(i\,\Delta t)$ if $N = T/(2\tau_{corr})$. Thus, to simplify the evaluation of the average $\langle Z(t) \rangle$, it is reasonable to use the sum (3.93) instead of the integral (3.82), provided the discretization step $\Delta t$ is greater than $2\tau_{corr}$. In a very similar way, one can obtain different moments by averaging an ergodic process over time.
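The convergence rate (3.92) can be observed numerically. In the sketch below (illustrative; the AR(1) model, its parameters and the discrete analogue of $\tau_{corr}$ are assumptions, not from the book), the time average of a single long realization converges to the ensemble mean, with an error on the scale $\sigma_Z\sqrt{2\tau_{corr}/T}$:

```python
import numpy as np

# Illustrative sketch: ergodic time averaging of a single realization.
# Z is an AR(1) sequence (step dt = 1) with mean m and correlation
# coefficient R(k) = rho^|k|.
rng = np.random.default_rng(2)
rho, m, T = 0.95, 3.0, 200_000
w = rng.normal(0.0, 1.0, T)
z = np.empty(T)
z[0] = m
for n in range(T - 1):
    z[n+1] = m + rho*(z[n] - m) + w[n+1]

time_avg = z.mean()                        # eq. (3.91), one realization
sigma_z = np.sqrt(1.0/(1.0 - rho**2))      # stationary std of the AR(1) process
tau_corr = rho/(1.0 - rho)                 # discrete analogue of the interval
bound = sigma_z*np.sqrt(2.0*tau_corr/T)    # error scale from eq. (3.92)
```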
3.10

POWER SPECTRAL DENSITY (PSD)

The power spectral density (PSD) of a random process $\xi(t)$ is defined as the Fourier transform of its covariance function $C(\tau)$, i.e.

$$S(\omega) \triangleq \int_{-\infty}^{\infty} C(\tau)\exp[-j\omega\tau]\, d\tau \qquad (3.94)$$

The inverse Fourier transform restores the covariance function from its PSD:

$$C(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\exp[j\omega\tau]\, d\omega \qquad (3.95)$$

Equations (3.94)–(3.95) are known as the Wiener–Khinchin theorem. Setting $\tau = 0$ in eq. (3.95), one obtains that the variance of the process is equal to the area under the $S(\omega)$ curve (up to the factor $1/2\pi$):

$$\sigma^2 = C(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\exp[j\omega\cdot 0]\, d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\, d\omega \qquad (3.96)$$

Since $C(\tau)$ is a symmetric positive definite function, its Fourier transform $S(\omega)$ is always a non-negative (and thus real) symmetric function. If one considers $\xi(t)$ to be a random voltage (current), then $\sigma^2$ can be considered as the power dissipated in a resistor of $R = 1\,\Omega$. A differential power $\frac{1}{2\pi}S(\omega)\, d\omega$ can be attributed to the spectral components located between $\omega$ and $\omega + d\omega$. This is why $S(\omega)$ is called the power spectral density or the energy spectrum. The covariance function $C(\tau)$ and the PSD $S(\omega)$ of a stationary process $\xi(t)$ "inherit" all the properties of a Fourier pair. In particular, if $S(\omega)$ is "wider" than some test function $S_0(\omega)$, then $C(\tau)$ must be "narrower" than the corresponding covariance function $C_0(\tau)$, and vice versa. This is dictated by the so-called uncertainty principle of the Fourier transform [13]. It also imposes conditions relating the smoothness of the covariance function to the rate of roll-off of the PSD [14]. The effective spectral width $\Delta\omega_{eff}$ of the PSD can be defined according to the following equation:

$$\Delta\omega_{eff} = \frac{1}{S_0}\int_0^\infty S(\omega)\, d\omega \qquad (3.97)$$

where $S_0$ is the value of the PSD at a selected characteristic frequency $\omega_0$. It is common to choose $\omega_0$ such that $S(\omega)$ achieves its maximum $S_{max} = S(\omega_0)$ at $\omega_0$. In this case,

$$\tau_{corr}\,\Delta\omega_{eff} \approx \text{const.} \qquad (3.98)$$
Alternative definitions are also possible [12]; for example, the spectral width $\Delta\omega_{0.5}$ can be defined at the level $0.5\,S_{max}$. Of course, $\Delta\omega_{eff} > \Delta\omega_{0.5}$. Sometimes, instead of the PSD defined by eq. (3.94), the normalized PSD $s(\omega)$ is considered:

$$s(\omega) = \frac{S(\omega)}{\sigma^2} = \frac{1}{\sigma^2}\int_{-\infty}^{\infty} C(\tau)\exp[-j\omega\tau]\, d\tau \qquad (3.99)$$

It follows from eq. (3.99) that

$$s(\omega) = \int_{-\infty}^{\infty} R(\tau)\exp[-j\omega\tau]\, d\tau; \qquad R(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} s(\omega)\exp[j\omega\tau]\, d\omega \qquad (3.100)$$

Using the fact that the covariance function and the correlation coefficient are even functions, one obtains the equivalent form of the Wiener–Khinchin theorem:

$$S(\omega) = \int_{-\infty}^{\infty} C(\tau)\exp[-j\omega\tau]\, d\tau = \int_{-\infty}^{0} C(\tau)\exp[-j\omega\tau]\, d\tau + \int_{0}^{\infty} C(\tau)\exp[-j\omega\tau]\, d\tau = 2\int_0^\infty C(\tau)\cos\omega\tau\, d\tau \qquad (3.101)$$

and, conversely,

$$C(\tau) = \frac{1}{\pi}\int_0^\infty S(\omega)\cos\omega\tau\, d\omega \qquad (3.102)$$

It is important to note that the PSD $S(\omega)$ in eqs. (3.94), (3.95), (3.101) and (3.102) is defined for both positive and negative frequencies, with $S(-\omega) = S(\omega)$. Instead of this two-sided "mathematical" spectrum, it is possible to consider a one-sided "physical" spectrum $S^+(\omega)$, defined as

$$S^+(\omega) = \begin{cases} 0 & \omega < 0 \\ S(\omega) + S(-\omega) = 2S(\omega) & \omega \ge 0 \end{cases} \qquad (3.103)$$

The Wiener–Khinchin theorem thus becomes

$$S^+(\omega) = 4\int_0^\infty C(\tau)\cos\omega\tau\, d\tau; \qquad C(\tau) = \frac{1}{2\pi}\int_0^\infty S^+(\omega)\cos\omega\tau\, d\omega \qquad (3.104)$$

Equations (3.104) are usually more practical in calculations. In contrast to the spectral analysis of deterministic signals, the PSD of a random process $\xi(t)$ does not provide enough information to restore a specific time realization of this random signal, since it contains no phase information about the spectral components. It is possible to find a number of realizations with the same magnitude spectrum but different phase spectra, and thus having the same covariance function.
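The one-sided pair (3.104) can be verified numerically for a concrete covariance. The sketch below (illustrative; $\alpha$ and $\sigma^2$ are arbitrary) uses the exponential covariance from Table 3.1, for which the forward transform has the closed form $S^+(\omega) = 4\sigma^2\alpha/(\alpha^2 + \omega^2)$, and checks that inverting $S^+$ at $\tau = 0$ returns the variance:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative sketch: the one-sided Wiener-Khinchin pair (3.104) for
# C(tau) = sigma2 * exp(-alpha*|tau|).
alpha, sigma2 = 1.5, 2.0
C = lambda tau: sigma2*np.exp(-alpha*abs(tau))

def S_plus(w):
    # S+(w) = 4 * int_0^inf C(tau) cos(w tau) d tau
    val, _ = quad(lambda t: C(t)*np.cos(w*t), 0, np.inf, limit=400)
    return 4.0*val

analytic = 4.0*sigma2*alpha/(alpha**2 + 3.0**2)   # closed form at w = 3

# inverse at tau = 0: (1/2pi) * int_0^inf S+(w) dw must recover sigma2
var, _ = quad(lambda om: S_plus(om)/(2.0*np.pi), 0, 200.0, limit=200)
```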
While the definition (3.94)–(3.95) may seem artificial, there are practical considerations justifying it. Indeed, a spectral function $F(\omega)$ can be defined for a particular realization of a random process $\xi(t)$, observed over a significant interval of time $T$:

$$F(\omega) = \int_0^T \xi(t)\exp[-j\omega t]\, dt \qquad (3.105)$$

If $F^*(\omega)$ is the complex conjugate of $F(\omega)$, then

$$|F(\omega)|^2 = F(\omega)F^*(\omega) = \int_0^T\!\!\int_0^T \xi(t)\xi(t')\exp[-j\omega(t - t')]\, dt\, dt' \qquad (3.106)$$

Taking the statistical average of both sides and keeping in mind that $\xi(t)$ is a stationary process, it can be seen that

$$\langle |F(\omega)|^2 \rangle = \int_0^T\!\!\int_0^T \langle \xi(t)\xi(t') \rangle \exp[-j\omega(t - t')]\, dt\, dt' = \int_0^T\!\!\int_0^T C(t - t')\exp[-j\omega(t - t')]\, dt\, dt' \qquad (3.107)$$

After the substitution $\tau = t - t'$ and evaluation of the integral over $t'$, eq. (3.107) can be reduced to

$$\langle |F(\omega)|^2 \rangle = \int_0^T dt' \int_{-T}^{T} C(\tau)\exp[-j\omega\tau]\, d\tau = T\int_{-T}^{T} C(\tau)\exp[-j\omega\tau]\, d\tau \qquad (3.108)$$

Finally, dividing both sides by $T$, taking the limit for $T \to \infty$ and recalling eq. (3.94), one finally arrives at the conclusion that

$$S(\omega) = \int_{-\infty}^{\infty} C(\tau)\exp[-j\omega\tau]\, d\tau = \lim_{T\to\infty} \frac{\langle |F(\omega)|^2 \rangle}{T} \qquad (3.109)$$

This formula can be considered an equivalent definition of the PSD, and it can be used to determine the PSD of a random sequence of pulses.
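Equation (3.109) is the basis of the averaged-periodogram estimator of the PSD. The sketch below (illustrative; the discrete AR(1) process and all parameters are assumptions, not from the book) averages $|F(\omega)|^2/T$ over an ensemble of realizations and compares with the known spectrum $S(\omega) = 1/|1 - \rho e^{-j\omega}|^2$ of the process:

```python
import numpy as np

# Illustrative sketch: S(w) = lim <|F(w)|^2>/T via the averaged periodogram
# of a discrete AR(1) process driven by unit-variance white noise (dt = 1).
rng = np.random.default_rng(3)
rho, T, n_real = 0.8, 1024, 500
w_noise = rng.normal(0.0, 1.0, (n_real, T))
x = np.zeros((n_real, T))
for n in range(1, T):
    x[:, n] = rho*x[:, n-1] + w_noise[:, n]

F = np.fft.rfft(x, axis=1)                  # finite-time spectrum, eq. (3.105)
S_est = np.mean(np.abs(F)**2, axis=0)/T     # <|F(w)|^2>/T
w = 2.0*np.pi*np.arange(S_est.size)/T
S_theory = 1.0/np.abs(1.0 - rho*np.exp(-1j*w))**2
rel_err = np.mean(np.abs(S_est[1:]/S_theory[1:] - 1.0))
```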
3.11
MUTUAL PSD
Let $\xi(t)$ and $\eta(t)$ be two stationary processes, with mutual covariance functions $C_{\xi\eta}(\tau)$ and $C_{\eta\xi}(\tau)$ as defined by eq. (3.53). Similar to the definition of the PSD, the mutual PSD can be defined as the Fourier transform of the mutual covariance function:

$$S_{\xi\eta}(\omega) = \int_{-\infty}^{\infty} C_{\xi\eta}(\tau)\exp[-j\omega\tau]\, d\tau; \qquad S_{\eta\xi}(\omega) = \int_{-\infty}^{\infty} C_{\eta\xi}(\tau)\exp[-j\omega\tau]\, d\tau \qquad (3.110)$$
Conversely, the mutual covariances can be expressed in terms of the mutual PSDs as

$$C_{\xi\eta}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{\xi\eta}(\omega)\exp[j\omega\tau]\, d\omega; \qquad C_{\eta\xi}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{\eta\xi}(\omega)\exp[j\omega\tau]\, d\omega \qquad (3.111)$$

Since the mutual covariance function is not necessarily even, the corresponding PSD is not necessarily real. However, if both $\xi(t)$ and $\eta(t)$ are real, so are $C_{\xi\eta}(\tau)$ and $C_{\eta\xi}(\tau)$. This implies that

$$S_{\xi\eta}(-\omega) = S^*_{\xi\eta}(\omega); \qquad S_{\eta\xi}(-\omega) = S^*_{\eta\xi}(\omega) \qquad (3.112)$$

The fact that $C_{\xi\eta}(\tau) = C_{\eta\xi}(-\tau)$ also results in

$$S_{\xi\eta}(\omega) = S_{\eta\xi}(-\omega) \qquad (3.113)$$

The following two examples play an important role in analysis, theory and applications: (1) the sum of two stationary related processes, and (2) the product of two independent processes.
3.11.1
PSD of a Sum of Two Stationary and Stationary Related Random Processes
In many practical cases, it is important to obtain an expression for the PSD $S_\zeta(\omega)$ of a sum

$$\zeta(t) = \xi(t) + \eta(t) \qquad (3.114)$$

of two stationary and stationary related random processes $\xi(t)$ and $\eta(t)$. It is assumed that the auto- and mutual covariance functions $C_\xi(\tau)$, $C_\eta(\tau)$, $C_{\xi\eta}(\tau)$ and $C_{\eta\xi}(\tau)$ of these processes are known. The covariance function of $\zeta(t)$ is then

$$C_\zeta(\tau) = \langle [\xi(t) + \eta(t)][\xi(t+\tau) + \eta(t+\tau)] \rangle = \langle \xi(t)\xi(t+\tau) \rangle + \langle \xi(t)\eta(t+\tau) \rangle + \langle \eta(t)\xi(t+\tau) \rangle + \langle \eta(t)\eta(t+\tau) \rangle = C_\xi(\tau) + C_{\xi\eta}(\tau) + C_{\eta\xi}(\tau) + C_\eta(\tau) \qquad (3.115)$$

which immediately implies that

$$S_\zeta(\omega) = S_\xi(\omega) + S_{\xi\eta}(\omega) + S_{\eta\xi}(\omega) + S_\eta(\omega) \qquad (3.116)$$

Here, $S_{\xi\eta}(\omega)$ and $S_{\eta\xi}(\omega)$ form a mutual PSD, defined as in eq. (3.110). In the case of real random functions $\xi(t)$ and $\eta(t)$, eq. (3.116) can be rewritten with the use of eq. (3.113) to produce

$$S_\zeta(\omega) = S_\xi(\omega) + S_\eta(\omega) + [S_{\xi\eta}(\omega) + S^*_{\xi\eta}(\omega)] = S_\xi(\omega) + S_\eta(\omega) + 2\,\mathrm{Re}\,S_{\xi\eta}(\omega) \qquad (3.117)$$

Further simplification can be achieved in the case of uncorrelated (but not necessarily independent) processes $\xi(t)$ and $\eta(t)$. In this case, $C_{\xi\eta}(\tau) = C_{\eta\xi}(\tau) \equiv 0$ and $S_{\xi\eta}(\omega) = S_{\eta\xi}(\omega) = 0$, thus implying that the PSD of the sum is just the sum of the PSDs of the components. It is interesting to note that even if $\xi(t)$ and $\eta(t)$ are non-stationary in the wide sense, their sum can be a stationary process. For example, if $A_c(t)$ and $A_s(t)$ are independent, stationary, zero-mean processes with identical covariance functions, then neither of the processes $\xi(t) = A_c(t)\cos\omega t$, $\eta(t) = A_s(t)\sin\omega t$ is stationary; however, their sum is stationary in the wide sense.
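The quadrature-sum remark above can be illustrated numerically. In the sketch below (illustrative; $A_c$, $A_s$ are taken as i.i.d. white Gaussian sequences, an assumption for simplicity), the variance of $\xi(t) = A_c(t)\cos\omega t$ alone oscillates with $t$, while the variance of the sum $\xi + \eta$ is constant because $\cos^2\omega t + \sin^2\omega t = 1$:

```python
import numpy as np

# Illustrative sketch: xi = Ac*cos(wt) is non-stationary, but xi + eta with
# eta = As*sin(wt) (Ac, As independent, zero-mean, identically distributed)
# is wide-sense stationary.
rng = np.random.default_rng(4)
w, n_real, n_t = 1.3, 20_000, 64
t = np.arange(n_t)
Ac = rng.normal(0.0, 1.0, (n_real, n_t))
As = rng.normal(0.0, 1.0, (n_real, n_t))
xi  = Ac*np.cos(w*t)
eta = As*np.sin(w*t)

var_xi  = (xi**2).mean(axis=0)           # ~cos^2(wt): depends on t
var_sum = ((xi + eta)**2).mean(axis=0)   # ~cos^2 + sin^2 = 1: constant in t
```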
3.11.2
PSD of a Product of Two Stationary Uncorrelated Processes
Let a random process $\zeta(t, \tau_0)$ be the product of two stationary, independent, zero-mean⁶ processes $\xi(t)$ and $\eta(t + \tau_0)$:

$$\zeta(t, \tau_0) = \xi(t)\,\eta(t + \tau_0) \qquad (3.118)$$

where $\tau_0$ is a fixed time lag. The covariance function of this process is found to be

$$C_\zeta(t, \tau_0; \tau) = \langle \xi(t)\eta(t+\tau_0)\xi(t+\tau)\eta(t+\tau+\tau_0) \rangle = \langle \xi(t)\xi(t+\tau) \rangle\langle \eta(t+\tau_0)\eta(t+\tau+\tau_0) \rangle = C_\xi(\tau)\,C_\eta(\tau) = C_\zeta(\tau) \qquad (3.119)$$

The corresponding PSD is then

$$S_\zeta(\omega) = \int_{-\infty}^{\infty} C_\xi(\tau)\,C_\eta(\tau)\exp[-j\omega\tau]\, d\tau \qquad (3.120)$$

Using the expression of $C_\eta(\tau)$ in terms of the corresponding PSD,

$$C_\eta(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_\eta(\omega')\exp[j\omega'\tau]\, d\omega' \qquad (3.121)$$

one can see that

$$S_\zeta(\omega) = \int_{-\infty}^{\infty} C_\xi(\tau)\,\frac{1}{2\pi}\int_{-\infty}^{\infty} S_\eta(\omega')\exp[j\omega'\tau]\, d\omega'\,\exp[-j\omega\tau]\, d\tau = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_\eta(\omega')\, d\omega'\int_{-\infty}^{\infty} C_\xi(\tau)\exp[-j(\omega - \omega')\tau]\, d\tau = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_\eta(\omega')\,S_\xi(\omega - \omega')\, d\omega' \qquad (3.122)$$

Thus, the PSD of the product of two stationary independent processes is equal to the convolution of their PSDs (up to the factor $1/2\pi$).

⁶ In this case the correlation function coincides with the covariance function.
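Equation (3.122) can be checked against a case where everything is available in closed form. For two independent processes with exponential covariances $e^{-a|\tau|}$ and $e^{-b|\tau|}$, the product process has covariance $e^{-(a+b)|\tau|}$ and hence PSD $2(a+b)/[(a+b)^2 + \omega^2]$; this must equal the convolution of the two Lorentzian spectra. A numeric sketch (illustrative; $a$, $b$, $\omega$ are arbitrary values):

```python
import numpy as np
from scipy.integrate import quad

# Illustrative sketch: PSD of a product of independent processes as the
# convolution (3.122) of the individual PSDs.
a, b, w = 1.0, 2.5, 3.0
S_xi  = lambda om: 2*a/(a**2 + om**2)          # FT of exp(-a|tau|)
S_eta = lambda om: 2*b/(b**2 + om**2)          # FT of exp(-b|tau|)

conv, _ = quad(lambda om: S_xi(om)*S_eta(w - om)/(2*np.pi),
               -np.inf, np.inf, limit=400)
direct = 2*(a + b)/((a + b)**2 + w**2)         # FT of exp(-(a+b)|tau|)
```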
3.12
COVARIANCE FUNCTION OF A PERIODIC RANDOM PROCESS
In this section, a few examples of covariance functions of periodic signals are considered. Such signals play an important role in a wide range of applications, such as radar detection, communication systems with amplitude modulation, etc. The common feature of such processes is that their covariance function does not vanish at infinity, a property which can be exploited in the detection of signals embedded in additive noise.
3.12.1
Harmonic Signal with a Constant Magnitude
As a first example, the covariance function of a random harmonic signal

$$s(t) = A_0\cos(\omega_0 t + \varphi) \qquad (3.123)$$

is considered. The magnitude $A_0$ and the angular frequency $\omega_0$ are assumed to be fixed deterministic quantities, while the phase $\varphi$ is random and uniformly distributed on the interval $[-\pi, \pi]$. In this case,

$$p_\varphi(\varphi) = \begin{cases} \dfrac{1}{2\pi} & -\pi \le \varphi \le \pi \\ 0 & \text{otherwise} \end{cases} \qquad (3.124)$$

Since the statistical average of this signal is equal to zero,

$$m_s = \langle s(t) \rangle = \int_{-\pi}^{\pi} s(t)\,p_\varphi(\varphi)\, d\varphi = \frac{A_0}{2\pi}\int_{-\pi}^{\pi}\cos(\omega_0 t + \varphi)\, d\varphi = 0 \qquad (3.125)$$

its covariance function is just

$$C_s(t, \tau) = \langle s(t)s(t+\tau) \rangle = A_0^2\langle \cos(\omega_0 t + \varphi)\cos(\omega_0(t+\tau) + \varphi) \rangle = \frac{A_0^2}{2}[\langle \cos\omega_0\tau \rangle + \langle \cos(2\omega_0 t + \omega_0\tau + 2\varphi) \rangle] = \frac{A_0^2}{2}\cos\omega_0\tau + \frac{A_0^2}{4\pi}\int_{-\pi}^{\pi}\cos(2\omega_0 t + \omega_0\tau + 2\varphi)\, d\varphi = \frac{A_0^2}{2}\cos\omega_0\tau = C_s(\tau) \qquad (3.126)$$

As can be seen, the covariance function is a periodic function with the same period as the signal itself. However, it does not approach zero as $\tau \to \infty$, and thus $s(t)$ is not an ergodic process.⁷ This property can be used to identify a weak but long signal $s(t)$ embedded in a strong coloured noise whose covariance function approaches zero for relatively large time lags. Indeed, let the signal $s(t)$ be mixed with a stationary noise $n(t)$:

$$\zeta(t) = s(t) + n(t) \qquad (3.127)$$

⁷ It is not possible to extract the statistics of the phase from a single realization of the random process.
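The averaging over the random phase in eq. (3.126) is easy to reproduce by Monte Carlo. The sketch below (illustrative; $A_0$, $\omega_0$ and the sampling times are arbitrary choices) confirms that $\langle s(t)s(t+\tau) \rangle$ equals $(A_0^2/2)\cos\omega_0\tau$ regardless of the absolute time $t$:

```python
import numpy as np

# Illustrative sketch: covariance of a sinusoid with a uniformly
# distributed random phase, eq. (3.126).
rng = np.random.default_rng(5)
A0, w0, n = 2.0, 5.0, 500_000
phi = rng.uniform(-np.pi, np.pi, n)

def C_s(t, tau):
    """Monte Carlo estimate of <s(t) s(t+tau)> over the phase ensemble."""
    return np.mean(A0*np.cos(w0*t + phi)*A0*np.cos(w0*(t + tau) + phi))

tau = 0.7
theory = 0.5*A0**2*np.cos(w0*tau)
c1, c2 = C_s(0.0, tau), C_s(11.3, tau)    # same value at different t
```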
If the signal and the noise are independent (which is often the case), then the covariance function of the sum is just the sum of the covariance functions of the components, according to eq. (3.115):

$$C_\zeta(\tau) = C_s(\tau) + C_n(\tau) \qquad (3.128)$$

If $C_n(\tau)$ approaches zero as $\tau \to \infty$, then for large values of $\tau$ eq. (3.128) becomes

$$C_\zeta(\tau) \approx C_s(\tau) = \frac{A_0^2}{2}\cos\omega_0\tau \qquad (3.129)$$

Thus, it is often possible to obtain information about the presence or absence of an information signal by analysing the covariance function of the mixture. If, for fairly large $\tau$, the measured covariance function exhibits periodic properties, then the mixture $\zeta(t)$ contains the information signal; otherwise, only noise is present. The PSD of such a signal is, according to eq. (3.94),

$$S(\omega) = \int_{-\infty}^{\infty} \frac{A_0^2}{2}\cos\omega_0\tau\,\exp[-j\omega\tau]\, d\tau = \frac{A_0^2}{4}\int_{-\infty}^{\infty}\bigl(\exp[j\omega_0\tau] + \exp[-j\omega_0\tau]\bigr)\exp[-j\omega\tau]\, d\tau = \frac{\pi A_0^2}{2}[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)] \qquad (3.130)$$

since

$$\int_{-\infty}^{\infty}\exp[-j\omega\tau]\, d\tau = 2\pi\delta(\omega) \qquad (3.131)$$

Equivalently, the one-sided density is then

$$S^+(\omega) = 2S(\omega) = \pi A_0^2\,\delta(\omega - \omega_0) = \frac{A_0^2}{2}\,\delta(f - f_0) \qquad (3.132)$$
This result is quite logical, since eq. (3.132) reflects the fact that all the energy is concentrated at the single frequency $f = f_0$. The magnitude of this spectral line is $A_0^2/2$, i.e. the square of the RMS value of the signal.

3.12.2

A Mixture of Harmonic Signals
A random signal $s(t)$ which is a mixture of harmonic waveforms of different frequencies,

$$s(t) = \sum_{k=1}^{n} A_k\sin(\omega_k t + \varphi_k) \qquad (3.133)$$

is often used as a model of multitone signalling systems, such as OFDM systems, systems with frequency diversity [15], etc. Here $A_k$ and $\omega_k$ are deterministic quantities, and only the phases $\varphi_k$ are allowed to fluctuate. As a further limitation, it is assumed that $\varphi_k$ and $\varphi_l$ are independent and uniformly distributed over the interval $[-\pi, \pi]$. Repeating the previous calculations and taking into account that, due to the assumption on the phases,

$$p(\varphi_1, \varphi_2, \ldots, \varphi_n) = p(\varphi_1)p(\varphi_2)\cdots p(\varphi_n) = \left(\frac{1}{2\pi}\right)^n \qquad (3.134)$$

one obtains the expressions for the covariance function and the one-sided PSD of such a process:

$$C_s(\tau) = \int_{-\pi}^{\pi}\cdots\int_{-\pi}^{\pi} s(t)s(t+\tau)\,p(\varphi_1, \ldots, \varphi_n)\, d\varphi_1\cdots d\varphi_n = \frac{1}{2}\sum_{k=1}^{n} A_k^2\cos\omega_k\tau \qquad (3.135)$$

and

$$S^+(\omega) = 2\pi\sum_{k=1}^{n}\frac{A_k^2}{2}\,\delta(\omega - \omega_k) \qquad (3.136)$$

respectively. The last two expressions can easily be obtained from the result for a single harmonic signal by noting that all the components are independent.
3.12.3
Harmonic Signal with Random Magnitude and Phase
The next task is to find the covariance function of a harmonic signal with a random magnitude:

$$s(t) = A_0\,\zeta(t)\cos(\omega_0 t + \varphi) \qquad (3.137)$$

Here $\zeta(t)$ is a stationary random process with the covariance function $C_\zeta(\tau)$, and $\varphi$ is an independent random variable uniformly distributed over $[-\pi, \pi]$. Such a process corresponds to a carrier modulated by a random information signal, as in ASK or AM [16]. In this case,

$$C_s(\tau) = \langle s(t)s(t+\tau) \rangle = \langle A_0\zeta(t)\cos(\omega_0 t + \varphi)\,A_0\zeta(t+\tau)\cos(\omega_0 t + \omega_0\tau + \varphi) \rangle = A_0^2\,\langle \zeta(t)\zeta(t+\tau) \rangle_\zeta\,\langle \cos(\omega_0 t + \varphi)\cos(\omega_0 t + \omega_0\tau + \varphi) \rangle_\varphi = \frac{A_0^2}{2}\,C_\zeta(\tau)\cos\omega_0\tau \qquad (3.138)$$

Here a subscript on the angle brackets indicates the variable over which the statistical averaging is performed. Thus, the covariance function of a modulated signal carries information about both the covariance function of the modulating signal and the carrier frequency. Covariance functions of various information signals are considered in [16]. It is important to mention that if the initial phase in eq. (3.137) is deterministic, then the signal $s(t)$ is a periodically non-stationary process, often called a cyclostationary process. Such processes are treated in detail in [17].
3.13
FREQUENTLY USED COVARIANCE FUNCTIONS
It is possible to list a number of covariance functions and PSDs that are often used in applications. Many of them can be obtained by considering the filtering of White Gaussian Noise (WGN)⁸ with a spectral density $N_0/2$. If the transfer function of the filter is $H(j\omega)$, then the PSD of the output signal is

$$S(\omega) = \frac{N_0}{2}|H(j\omega)|^2 \qquad (3.139)$$

and thus

$$C(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\exp[j\omega\tau]\, d\omega \qquad (3.140)$$

Thus, for any particular system the expression for the covariance function is unique. However, in engineering calculations a set of typical correlation functions and PSDs is used to simplify calculations. Some of these functions are listed in Table 3.1. All of the listed examples can be obtained by filtering WGN; the corresponding shaping filters (form-filters) are also shown. In addition, Table 3.1 lists the effective bandwidth $\Delta\omega_{eff}$ as defined by eq. (3.97), the ratio $\Delta\omega_{eff}/\Delta\omega_{0.5}$, and the second derivative of the covariance function $C''(\tau)|_{\tau=0}$, calculated at zero lag and playing an important role in level crossing problems (see Chapter 4).
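The form-filter idea translates directly into simulation: to synthesize a process with a prescribed covariance, pass white Gaussian noise through the corresponding shaping filter. The sketch below (illustrative; the discrete first-order recursion is the standard sampled analogue of the first-order low-pass form-filter, and all parameter values are assumptions) produces a process whose normalized covariance follows $e^{-\alpha|\tau|}$ from Table 3.1:

```python
import numpy as np
from scipy.signal import lfilter

# Illustrative sketch: shaping white Gaussian noise with a discrete
# first-order low-pass filter H(z) = 1/(1 - rho z^-1), rho = exp(-alpha*dt),
# which yields the sampled exponential covariance C(k)/C(0) = rho^|k|.
rng = np.random.default_rng(6)
alpha, dt, N = 2.0, 0.01, 2_000_000
rho = np.exp(-alpha*dt)
w = rng.normal(0.0, 1.0, N)
x = lfilter([1.0], [1.0, -rho], w)

x = x - x.mean()
var = np.mean(x**2)                     # theory: 1/(1 - rho^2)
lag = int(0.5/dt)                       # tau = 0.5
r_hat = np.mean(x[:-lag]*x[lag:])/var   # should be ~exp(-alpha*0.5) ~ 0.368
r_theory = rho**lag
```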
3.14
NORMAL (GAUSSIAN) RANDOM PROCESSES
A random process ðtÞ is called a normal (Gaussian) random process if, for an arbitrary n moments of time t1 ; t2 ; . . . ; tn , the n-dimensional joint PDF of ¼ ðt Þ, ¼ 1; 2; . . . ; n is a normal (Gaussian) PDF 1 1 T 1 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p pn ðx1 ; x2 ; . . . ; xn Þ ¼ exp ðx mÞ C ðx mÞ 2 ð2Þn jCj
ð3:141Þ
Here m ¼ ½m1 ; m2 ; . . . ; mn T ;
m ¼ hðt Þi
ð3:142Þ
is the average value of the process at time instant t ¼ t ,
2 ¼ hððt Þ m Þ2 i
8
See section 3.15 for more details.
ð3:143Þ
89
NORMAL (GAUSSIAN) RANDOM PROCESSES
is the variance of the process ðtÞ at the same moment of time. Finally, C is the covariance matrix defined as 2
C11
6C 6 21 C¼6 4 ... Cn1
C12
. . . C1n
C22 ...
... ...
...
. . . Cnn
... ...
3 7 7 7; 5
C ¼ C ðt ; t Þ ¼ hððt Þ m Þððt Þ m Þi ð3:144Þ
and jCj stands for the determinant of the matrix C. The characteristic function, which corresponds to the PDF (eq. 3.141) is given by the following expression [19] "
n 1 X n ðu1 ; u2 ; . . . ; un ; t1 ; t2 ; . . . ; tn Þ ¼ exp j m u C ðt ; t Þu u 2 ; ¼ 1 ¼1 n X
# ð3:145Þ
It can be seen from eqs. (3.141) and (3.145) that both the PDF and the characteristic function are completely defined by the values of the average at any time instant and the covariance function C(t_1, t_2) between any two instants of time. Thus, if for a physical reason it is known that the process is Gaussian, then in order to have its complete description one needs only to measure its mean and covariance function. Thus the correlation theory provides a complete description of such processes. For a Gaussian process, all cumulants of order higher than two are equal to zero, as can be seen from the following expansion:

\[
\ln \Theta_n = j\sum_{\nu=1}^{n} m_\nu u_\nu - \frac{1}{2}\sum_{\nu,\mu=1}^{n} C(t_\nu,t_\mu)\,u_\nu u_\mu + \cdots + 0\cdot u_k\cdots u_m \qquad (3.146)
\]

Thus, two Gaussian processes can differ only in their mean values and the shape of the covariance function (or PSD). If the values of a Gaussian random process ξ(t) are uncorrelated at the time instants t_1, t_2, …, t_n, then

\[
p_n(x_1,\ldots,x_n; t_1,\ldots,t_n) = \prod_{\nu=1}^{n} p_1(x_\nu; t_\nu) = \frac{1}{(2\pi)^{n/2}\,\sigma_1\sigma_2\cdots\sigma_n}\, \exp\left[-\frac{1}{2}\sum_{\nu=1}^{n}\frac{(x_\nu-m_\nu)^2}{\sigma_\nu^2}\right] \qquad (3.147)
\]

since all R_{ij} = 0 for i ≠ j. This means that if two Gaussian processes are uncorrelated they are also independent.
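The multivariate density (3.141) is easy to check numerically. The sketch below (assuming NumPy and SciPy are available; the mean vector and covariance matrix are arbitrary example values) implements eq. (3.141) directly and compares it with a library implementation:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gaussian_pdf(x, m, C):
    """n-dimensional normal PDF of eq. (3.141)."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(m, dtype=float)
    d = x - m
    n = m.size
    quad = d @ np.linalg.solve(C, d)               # (x - m)^T C^{-1} (x - m)
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(C))
    return float(np.exp(-0.5 * quad) / norm)

m = np.array([1.0, -0.5])
C = np.array([[2.0, 0.6],
              [0.6, 1.0]])
x = np.array([0.3, 0.1])
print(gaussian_pdf(x, m, C))
print(multivariate_normal(mean=m, cov=C).pdf(x))   # same value
```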
Setting n = 1 and n = 2 in eqs. (3.141) and (3.145), one obtains

\[
p_1(x_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left[-\frac{(x_1-m_1)^2}{2\sigma_1^2}\right], \qquad
\Theta_1(u_1) = \exp\left[\,jm_1u_1 - \frac{1}{2}\sigma_1^2u_1^2\right] \qquad (3.148)
\]

\[
p_2(x_1,x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-R^2}}\, \exp\left\{-\frac{1}{2(1-R^2)}\left[\frac{(x_1-m_1)^2}{\sigma_1^2} - 2R\,\frac{(x_1-m_1)(x_2-m_2)}{\sigma_1\sigma_2} + \frac{(x_2-m_2)^2}{\sigma_2^2}\right]\right\}
\]
\[
\Theta_2(u_1,u_2) = \exp\left\{\,j(m_1u_1+m_2u_2) - \frac{1}{2}\left[\sigma_1^2u_1^2 + 2R\,\sigma_1\sigma_2\,u_1u_2 + \sigma_2^2u_2^2\right]\right\} \qquad (3.149)
\]

where

\[
R = R_{12}(t_1,t_2) = \frac{C(t_1,t_2)}{\sigma_1\sigma_2} = \frac{C(t_2,t_1)}{\sigma_1\sigma_2} = R_{21}(t_1,t_2) \qquad (3.150)
\]
is the correlation coefficient. Equations (3.145)–(3.148) allow an important conclusion to be reached. By definition, the characteristic function of an ensemble of n random variables ξ_1, ξ_2, …, ξ_n is

\[
\Theta_n(u_1,\ldots,u_n; t_1,\ldots,t_n) = \langle \exp(\,ju_1\xi_1 + ju_2\xi_2 + \cdots + ju_n\xi_n) \rangle \qquad (3.151)
\]

Thus, for the characteristic function of the sum

\[ \eta = \xi_1 + \xi_2 + \cdots + \xi_n \qquad (3.152) \]

one can write

\[
\Theta_\eta(u; t_1,\ldots,t_n) = \langle \exp(\,ju\eta) \rangle = \langle \exp[\,ju(\xi_1+\xi_2+\cdots+\xi_n)] \rangle = \Theta_n(u,u,\ldots,u; t_1,\ldots,t_n) \qquad (3.153)
\]

If all the random variables ξ_i are Gaussian random variables described by the characteristic function eq. (3.145), then

\[
\Theta_\eta(u; t_1,\ldots,t_n) = \exp\left[\,jMu - \frac{1}{2}\Sigma^2 u^2\right] \qquad (3.154)
\]

where

\[
M = \sum_{\nu=1}^{n} m_\nu, \qquad \Sigma^2 = \sum_{\nu,\mu=1}^{n} C(t_\nu,t_\mu) \qquad (3.155)
\]
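As a quick numerical illustration of eqs. (3.154)–(3.155), one can sum correlated Gaussian variables and compare the sample mean and variance of the sum with M = Σm_ν and Σ² = Σ C(t_ν, t_μ). A minimal sketch, assuming NumPy and arbitrary example values for the mean vector and covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.array([[1.0, 0.3, 0.1],
              [0.3, 2.0, 0.5],
              [0.1, 0.5, 1.5]])          # covariance matrix of (xi_1, xi_2, xi_3)
m = np.array([0.5, -1.0, 2.0])

xi = rng.multivariate_normal(m, C, size=200_000)
eta = xi.sum(axis=1)                     # eta = xi_1 + xi_2 + xi_3

print(eta.mean())                        # ~ m.sum() = 1.5     (eq. 3.155, M)
print(eta.var())                         # ~ C.sum() = 6.3     (eq. 3.155, Sigma^2)
```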
This equation has the same form as eq. (3.148), but now M and Σ² depend on n instants of time. Since eq. (3.148) describes a Gaussian random variable, the sum η is also Gaussian. In other words, a sum of a finite number of Gaussian random processes is a Gaussian random process. This statement can be generalized to the case of an arbitrary linear transformation, including differentiation, integration and filtering.

The next important property of a Gaussian process is the fact that wide sense stationarity implies strict sense stationarity. It can be easily seen that the conditional PDF of a Gaussian random process is also Gaussian and is given by

\[
p_2(x_2 \mid x_1; \tau) = \frac{p_2(x_2, x_1; \tau)}{p(x_1)} = \frac{1}{\sigma_2\sqrt{2\pi[1-R^2(\tau)]}}\, \exp\left\{-\frac{1}{2[1-R^2(\tau)]}\left[\frac{x_2-m_2}{\sigma_2} - R(\tau)\,\frac{x_1-m_1}{\sigma_1}\right]^2\right\} \qquad (3.156)
\]

If the process under consideration is a wide sense stationary process, then the following conditions are also satisfied:

\[
R(t_\nu, t_\mu) = R(t_\nu - t_\mu), \qquad m_\nu = m = \mathrm{const}, \qquad \nu,\mu = 1,\ldots,n \qquad (3.157)
\]
That is, the average of the process is a constant and the covariance function depends only on the time lag between two time instants. In this case the PDF (3.141) and the characteristic function (3.145) also depend only on the differences t_ν − t_μ, meaning that the process is also strict sense stationary. Thus, for a Gaussian random process, these two types of stationarity are equivalent. For a Gaussian stationary process, eqs. (3.141) and (3.145) can be written as

\[
p_n(x_1, x_2, \ldots, x_n) = \frac{1}{\sqrt{(2\pi\sigma^2)^n D}}\, \exp\left[-\frac{1}{2\sigma^2 D}\sum_{\nu,\mu=1}^{n} D_{\nu\mu}\,(x_\nu-m)(x_\mu-m)\right] \qquad (3.158)
\]

\[
\Theta_n(u_1, u_2, \ldots, u_n) = \exp\left[\,jm\sum_{\nu=1}^{n} u_\nu - \frac{\sigma^2}{2}\sum_{\nu,\mu=1}^{n} R(\tau_{\nu\mu})\,u_\nu u_\mu\right] \qquad (3.159)
\]

where D is the determinant of the normalized correlation matrix, D_{νμ} is the cofactor of its element R(τ_{νμ}),

\[ \sigma^2 = \langle (\xi(t) - m)^2 \rangle \qquad (3.160) \]

is the variance of the random process and

\[ \tau_{\nu\mu} = t_\nu - t_\mu \qquad (3.161) \]
Explicit expressions for the first and second order PDF and characteristic functions of a Gaussian stationary process are easily obtained from eqs. (3.148)–(3.149):

\[
p_1(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-m)^2}{2\sigma^2}\right], \qquad
\Theta_1(u) = \exp\left[\,jmu - \frac{1}{2}\sigma^2u^2\right] \qquad (3.162)
\]

\[
p_2(x_1,x_2) = \frac{1}{2\pi\sigma^2\sqrt{1-R^2(\tau)}}\, \exp\left[-\frac{(x_1-m)^2 - 2R(\tau)(x_1-m)(x_2-m) + (x_2-m)^2}{2\sigma^2(1-R^2(\tau))}\right]
\]
\[
\Theta_2(u_1,u_2) = \exp\left\{\,jm(u_1+u_2) - \frac{\sigma^2}{2}\left[u_1^2 + 2R(\tau)u_1u_2 + u_2^2\right]\right\} \qquad (3.163)
\]
If the mean value of the process is zero, m = 0, then eq. (3.162) is simplified even further:

\[ \Theta_1(u) = \exp\left[-\frac{1}{2}\sigma^2u^2\right] \qquad (3.164) \]

Using eq. (3.34) one can obtain expressions for all the moments of the marginal PDF:

\[
\Theta_1(u) = \exp\left[-\frac{1}{2}\sigma^2u^2\right] = \sum_{n=0}^{\infty}\frac{1}{n!}\left(-\frac{\sigma^2u^2}{2}\right)^n = \sum_{n=0}^{\infty}\frac{j^{2n}u^{2n}}{(2n)!}\,\frac{(2n)!\,\sigma^{2n}}{n!\,2^n} \qquad (3.165)
\]

and thus

\[
M_n = \begin{cases} 1\cdot3\cdot5\cdots(n-1)\,\sigma^n & \text{for even } n \\ 0 & \text{for odd } n \end{cases} \qquad (3.166)
\]
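Eq. (3.166) is easy to verify numerically by comparing the double-factorial formula with sample moments of zero-mean Gaussian data (a sketch assuming NumPy; σ = 1.5 is an arbitrary example value):

```python
import numpy as np

def gauss_moment(n, sigma):
    """M_n = 1*3*5*...*(n-1) * sigma**n for even n, 0 for odd n (eq. 3.166)."""
    if n % 2:
        return 0.0
    df = 1
    for k in range(1, n, 2):            # 1 * 3 * ... * (n - 1)
        df *= k
    return df * sigma ** n

sigma = 1.5
x = np.random.default_rng(1).normal(0.0, sigma, 1_000_000)
for n in (2, 3, 4, 6):
    print(n, gauss_moment(n, sigma), np.mean(x ** n))
```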
For a stationary Gaussian process with zero mean m = 0, all moments of an odd order are equal to zero. All moments of an even order can be expressed in terms of the covariance function⁹ C(τ) using the expression for its probability density (3.141) or its characteristic function (3.145):

\[
m_{2n-1}(t_1,\ldots,t_{2n-1}) = 0
\]
\[
m_{2n}(t_1,\ldots,t_{2n}) = \sum_{\text{all pairs } i\neq j}\left[\prod \langle \xi(t_i)\xi(t_j)\rangle\right] = \sum_{\text{all pairs } i\neq j,\ k\neq l,\ p\neq q} \langle \xi(t_i)\xi(t_j)\rangle\,\langle \xi(t_k)\xi(t_l)\rangle \cdots \langle \xi(t_p)\xi(t_q)\rangle \qquad (3.167)
\]

⁹ This also follows from the fact that the covariance function completely defines the PDF of any order of a zero mean Gaussian process, i.e. the result must not come as a "surprise".
In particular,

\[
m_{1111}(t_1,t_2,t_3,t_4) = \langle \xi(t_1)\xi(t_2)\xi(t_3)\xi(t_4)\rangle = C(t_2-t_1)C(t_4-t_3) + C(t_3-t_1)C(t_4-t_2) + C(t_4-t_1)C(t_3-t_2) \qquad (3.168)
\]
\[
m_{22}(t_1,t_2) = m_{1111}(t_1,t_1,t_2,t_2) = \sigma^4 + 2C^2(t_2-t_1)
\]

The second order PDF (3.163) of a zero mean Gaussian process (m = 0) can be represented as a series

\[
p_2(x_1,x_2) = \frac{1}{\sigma^2}\sum_{n=0}^{\infty}\frac{R^n(\tau)}{n!}\,\Phi^{(n+1)}\!\left(\frac{x_1}{\sigma}\right)\Phi^{(n+1)}\!\left(\frac{x_2}{\sigma}\right) \qquad (3.169)
\]
Here, Φ(z) is the probability integral, closely related to the error function [6]:

\[
\Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z}\exp\left[-\frac{x^2}{2}\right]dx = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right], \qquad \Phi^{(n)}(z) = \frac{d^n}{dz^n}\Phi(z) \qquad (3.170)
\]

Derivatives of the probability integral can also be expressed in terms of Hermite polynomials as

\[
\Phi^{(n)}(z) = \frac{(-1)^{n-1}}{\sqrt{2^{n-1}}\,\sqrt{2\pi}}\,\exp\left[-\frac{z^2}{2}\right] H_{n-1}\!\left(\frac{z}{\sqrt{2}}\right) \qquad (3.171)
\]
Expansion (3.169) is often used when considering the non-linear transformation of a Gaussian stationary process and other calculations involving Gaussian and Rayleigh processes. For example, one can obtain expressions for the joint moments of a Gaussian process:

\[
M_{ij}(\tau) = \langle \xi^i(t)\,\xi^j(t+\tau)\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1^i x_2^j\, p_2(x_1,x_2)\,dx_1\,dx_2 = \sigma^{i+j}\sum_{k=0}^{\infty} N_{i,k}N_{j,k}\,\frac{R^k(\tau)}{k!} \qquad (3.172)
\]

where

\[
N_{i,k} = \int_{-\infty}^{\infty} x^i\,\Phi^{(k+1)}(x)\,dx = \frac{(-1)^k\,2^{(i-k)/2}}{\sqrt{\pi}}\int_{-\infty}^{\infty} x^i H_k(x)\,\exp[-x^2]\,dx \qquad (3.173)
\]
A set of coefficients N_{i,k} forms a matrix, shown in Table 3.2. It is possible to point out rules for calculating N_{i,k}:

1. All elements above the main diagonal (i < k) are equal to 0.

2. On and below the main diagonal (i ≥ k), the non-zero entries are those whose indices have the same parity (i.e. i and k are either both even or both odd). All other entries are equal to zero.
Table 3.2  The coefficients N_{i,k} (magnitudes shown; the sign is given by rule 4)

  i \ k |   0    1    2    3    4    5    6
  ------+----------------------------------
    0   |   1    0    0    0    0    0    0
    1   |   0    1    0    0    0    0    0
    2   |   1    0    2    0    0    0    0
    3   |   0    3    0    6    0    0    0
    4   |   3    0   12    0   24    0    0
    5   |   0   15    0   60    0  120    0
    6   |  15    0   90    0  360    0  720
3. The magnitude of an element N_{i,k} (i ≥ k) can be constructed as follows: the first k multipliers are i(i−1)⋯(i−k+1); the next group of multipliers are all odd numbers from (i−k+1) down to 1, excluding (i−k+1) itself. For example, for i = 6, k = 2:

\[ |N_{6,2}| = (6\cdot5)\cdot(3\cdot1) = 90 \qquad (3.174) \]

4. The sign of the element N_{i,k} depends on the parity of k: it is negative for odd k and positive for even k.

Using these rules, it is easy to obtain

\[
m_{24}(\tau) = \sigma^6\left[3 + 12R^2(\tau)\right], \qquad m_{44}(\tau) = \sigma^8\left[9 + 72R^2(\tau) + 24R^4(\tau)\right] \qquad (3.175)
\]
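The construction rules above are easy to encode. The pure-Python sketch below rebuilds the entries of Table 3.2 and the R² coefficient of m₂₄ in eq. (3.175):

```python
def N(i, k):
    """Coefficient N_{i,k} built from rules 1-4 of the text."""
    if k > i or (i - k) % 2:              # rules 1 and 2: zero entries
        return 0
    val = 1
    for m in range(i, i - k, -1):         # first k multipliers: i(i-1)...(i-k+1)
        val *= m
    for m in range(i - k - 1, 0, -2):     # odd numbers below (i-k+1), down to 1
        val *= m
    return -val if k % 2 else val         # rule 4: sign from parity of k

print(N(6, 2))                            # 90, as in eq. (3.174)
print(N(2, 2) * N(4, 2) // 2)             # 12, the R^2 coefficient in m_24
```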
It is possible to show that the three-dimensional PDF can be represented by a series similar to eq. (3.169). Such a representation allows one to calculate three-dimensional moments.

It should be stressed again that any linear transformation of a Gaussian process produces a Gaussian process. Thus, if a Gaussian process is applied at the input of a linear filter, a Gaussian process appears at the output as well. In other words, a Gaussian process is stable with respect to linear transformations. In addition, it is interesting that the PDF of a weighted sum of similarly distributed and weakly correlated random variables approaches the Gaussian PDF, thus emphasizing the importance of the Gaussian distribution.

In the case of a non-linear transformation, the Gaussian property is lost. If a Gaussian process ξ(t) is transformed according to a non-linear rule η(t) = f(t, ξ), then η(t) is not a Gaussian process. In particular, a product of two Gaussian processes is non-Gaussian. However, a non-Gaussian process with correlation interval τ_corr, acting at the input of a narrow band system with a time constant τ_c > τ_corr and impulse response h(t), produces an output η(t) which approaches a Gaussian process. As the time constant τ_c increases, the output process becomes closer to a Gaussian process. This is due to the fact that the output can be considered as a large sum of almost independent samples of the input process:

\[
\eta(t) = \int_{-\infty}^{\infty} h(\tau - t)\,\xi(\tau)\,d\tau \approx \Delta\sum_{k=-N}^{N} h(k\Delta - t)\,\xi(k\Delta) \qquad (3.176)
\]
If τ_c ≫ τ_corr, then one can choose Δ such that τ_corr < Δ < τ_c, so that the samples ξ(kΔ) are almost independent while h(kΔ − t) changes little from one term to the next, i.e. the weights are approximately equal. The greater τ_corr, the more uniformity in the weights one can observe, and a larger number N of terms is needed to approximate the integral in eq. (3.176).
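This Gaussianization by narrowband filtering is easy to observe numerically: filtering strongly non-Gaussian (e.g. exponentially distributed) samples with a long moving-average filter drives the excess kurtosis of the output toward zero (a sketch assuming NumPy; the filter length and input distribution are arbitrary example choices):

```python
import numpy as np

rng = np.random.default_rng(3)
xi = rng.exponential(1.0, 500_000) - 1.0     # zero-mean, strongly non-Gaussian input
h = np.ones(200) / 200                        # long "narrowband" impulse response
eta = np.convolve(xi, h, mode="valid")        # filter output, as in eq. (3.176)

def excess_kurtosis(x):
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

print(excess_kurtosis(xi))                    # ~6 for the exponential input
print(excess_kurtosis(eta))                   # close to 0: output is nearly Gaussian
```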
3.15 WHITE GAUSSIAN NOISE (WGN)

One of the most important processes used in applications is a stationary Gaussian process n(t) whose covariance function is a delta function scaled by a positive constant N₀/2:

\[ C(\tau) = \frac{N_0}{2}\,\delta(\tau) \qquad (3.177) \]

Since the delta function is equal to zero everywhere except τ = 0, the process defined by the covariance function (3.177) is uncorrelated at any two separate instants of time. Such a process is called absolutely random. The expression for the PSD of this process is easily obtained from the sifting property of the delta function:

\[
S(\omega) = \int_{-\infty}^{\infty} C(\tau)\,\exp[-j\omega\tau]\,d\tau = \frac{N_0}{2} = \mathrm{const}, \qquad S^{+}(\omega) = 2\,\frac{N_0}{2} = N_0 \qquad (3.178)
\]

Thus, an absolutely random process has a spectral density which is constant at all frequencies. Such a process is also known as a "white" process, by analogy with white light, which is composed of all possible chromatic lines with approximately equal power (see Fig. 3.3). For white Gaussian noise, eq. (3.96) produces a singular result:

\[
\sigma^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\,d\omega = \infty \qquad (3.179)
\]

In other words, the variance of a white noise is equal to infinity, σ² = ∞. This can be explained by the fact that white noise is a mathematical abstraction rather than a real physical process. In real life, the PSD usually decays with frequency, thus providing a finite correlation interval, τ_corr ≠ 0, and a limited variance. However, white noise is a useful abstraction, which can be used in cases where the correlation interval τ_corr of the process is

Figure 3.3  Covariance function and PSD of a white process.
much smaller than a characteristic time constant of the system under consideration. Equivalently, one can use white noise if the bandwidth of the system under consideration is much smaller than the bandwidth of the noise.

A simple rule of thumb can be given for the substitution of a real process by white noise n(t) in the analysis of systems under random excitation. Let a system with a characteristic time constant τ_s be considered, excited by a random process with covariance function C(τ) = σ²R(τ) which has a relatively wide spectrum, so that τ_corr ≪ τ_s. In this case, one can assume that a white noise model is suitable. The equivalent spectral density N₀/2 of the noise is chosen to match the actual spectral density at zero frequency, which in turn can be calculated from the covariance function C(τ) as

\[
\frac{N_0}{2} = S(0) = \int_{-\infty}^{\infty} C(\tau)\,d\tau = 2\int_{0}^{\infty} C(\tau)\,d\tau = 2\sigma^2\tau_{\mathrm{corr}} \qquad (3.180)
\]

One of the prominent examples of white noise is so-called thermal noise, caused by the random movement of a large number of electrons inside metals. It is known that the spectral density of the thermal noise of a resistor R is defined by the so-called Nyquist formula

\[
S_a^{+}(\omega) = 4kTR, \qquad N_0 = 4kTR \qquad (3.181)
\]
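For instance, for the exponential covariance C(τ) = σ²e^{−|τ|/τ_corr} from Table 3.1, eq. (3.180) gives N₀/2 = 2σ²τ_corr, which a direct numerical integration confirms (a sketch assuming NumPy; the parameter values are arbitrary examples):

```python
import numpy as np

sigma2, tau_corr = 2.0, 0.05                        # example parameters
tau = np.linspace(0.0, 50 * tau_corr, 200_001)
C = sigma2 * np.exp(-tau / tau_corr)                # exponential covariance

# N0/2 = 2 * integral_0^inf C(tau) dtau, trapezoidal rule
half_N0 = 2.0 * np.sum((C[1:] + C[:-1]) / 2 * np.diff(tau))
print(half_N0)                                      # ~ 2 * sigma2 * tau_corr = 0.2
```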
where k = 1.38 × 10⁻²³ J/K is the Boltzmann constant (for T = 290 K, kT ≈ 4 × 10⁻²¹ W/Hz). It can be seen from the last equation that the PSD of thermal noise is constant, and it may appear that thermal noise is an ideal example of white noise. However, one has to take into account that eq. (3.181) is valid only for low frequencies. It is obtained from the quantum formula

\[
S_u(f) = 4kTR\,\frac{hf}{kT}\left[\exp\left(\frac{hf}{kT}\right) - 1\right]^{-1} \qquad (3.182)
\]

where h = 6.62 × 10⁻³⁴ J·s is the Planck constant. If hf/kT ≪ 1, i.e.

\[
f \ll \frac{kT}{h} \approx \frac{4\times10^{-21}\ \mathrm{W/Hz}}{6.62\times10^{-34}\ \mathrm{J\,s}} \approx 6\times10^{12}\ \mathrm{Hz} \qquad (3.183)
\]

then eq. (3.182) reduces to eq. (3.181); i.e. eq. (3.181) works well for frequencies up to 100 GHz at normal temperature. However, in order to calculate the variance of the thermal noise, one must use the exact expression (3.182) to obtain a finite result:

\[
\sigma_u^2 = 4kTR\int_0^{\infty}\frac{hf}{kT}\left[\exp\left(\frac{hf}{kT}\right)-1\right]^{-1} df = 4kTR\,\frac{kT}{h}\int_0^{\infty}\frac{x\,dx}{e^x - 1} = \frac{2\pi^2(kT)^2 R}{3h} \qquad (3.184)
\]
As another example, one can consider the covariance function of the random process

\[ s(t) = A_0\, n(t)\cos(\omega_0 t + \varphi) \qquad (3.185) \]

obtained by modulating a harmonic signal with a random uniformly distributed phase φ by white noise. Its covariance function can be obtained from eq. (3.138) and is given by

\[
C_s(\tau) = \frac{N_0}{2}\,\delta(\tau)\cdot\frac{A_0^2}{2}\cos\omega_0\tau = \frac{A_0^2 N_0}{4}\,\delta(\tau) \qquad (3.186)
\]

That is, such a process is also white.
REFERENCES

1. J. Doob, Stochastic Processes, New York: Wiley, 1990 (first published 1953).
2. A. Kolmogorov, Foundations of the Theory of Probability, New York: Chelsea Pub. Co., 1956.
3. J.K. Ord, Families of Frequency Distributions, London: Griffin, 1972.
4. D. Middleton, An Introduction to Statistical Communication Theory, New York: McGraw-Hill, 1960.
5. A. Stuart and K. Ord, Kendall's Advanced Theory of Statistics, London: Charles Griffin & Co., 1987.
6. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, New York: Dover, 1972.
7. C. Nikias and A. Petropulu, Higher-Order Spectra Analysis: A Non-Linear Processing Framework, Upper Saddle River: Prentice Hall, 1993.
8. A.N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations, Moscow: Sovetskoe Radio, 1978 (in Russian).
9. A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd edition, New York: McGraw-Hill, 1991.
10. O.V. Sarmanov, Maximum Correlation Coefficient (Symmetrical Case), DAN, Vol. 120, 1958, pp. 715–718.
11. O.V. Sarmanov, Maximum Correlation Coefficient (Non-Symmetrical Case), DAN, Vol. 121, 1958, pp. 52–55.
12. P. Bello, Characterization of Randomly Time-Variant Linear Channels, IEEE Trans. Communications, Vol. 11, No. 4, December 1963, pp. 360–393.
13. W. Siebert, Circuits, Signals and Systems, Boston: MIT Press, 1985.
14. S. Mallat, A Wavelet Tour of Signal Processing, San Diego: Academic Press, 1999.
15. L. Hanzo, W. Webb and T. Keller, Single- and Multi-Carrier Quadrature Amplitude Modulation: Principles and Applications for Personal Communications, WLANs and Broadcasting, New York: John Wiley & Sons, 2000.
16. J. Proakis, Digital Communications, Boston: McGraw-Hill, 2001.
17. H. Sakai, Theory of Cyclostationary Processes and Its Applications, in: T. Katayama and S. Sugimoto, Eds., Statistical Methods in Control and Signal Processing, New York: Marcel Dekker, 1997.
18. F. Riesz and B. Sz.-Nagy, Functional Analysis, New York: Ungar, 1955.
19. K. Miller, Multidimensional Gaussian Distributions, New York: Wiley, 1964.
20. V. Tikhonov, Statistical Radiotechnique, Moscow: 1966 (in Russian).
4 Advanced Topics in Random Processes

4.1 CONTINUITY, DIFFERENTIABILITY AND INTEGRABILITY OF A RANDOM PROCESS

Analysis and design of systems is often based on their representation by a system of differential or difference equations with random forcing functions. In order to provide a proper mathematical foundation for such analysis one has to operate with properties such as continuity, differentiability and integrability in a way similar to the deterministic case. Some basic definitions are presented below; more detailed considerations can be found in the literature [1–3].

4.1.1 Convergence and Continuity

Let {ξ_n} = ξ_1, ξ_2, …, ξ_n, … be a sequence of random variables. In order to define a notion of the limit of such a sequence one has to define a proper measure of the difference ξ_n − ξ. Based on the chosen measure, the following types of convergence can be defined.

1. Convergence with probability 1:

\[ \operatorname{Prob}\{\xi_n \to \xi\} = 1, \qquad n \to \infty \qquad (4.1) \]

2. Convergence in probability. In this case, for an arbitrary ε > 0, the probability of deviations larger than ε tends to zero:

\[ \lim_{n\to\infty}\operatorname{Prob}\{|\xi_n - \xi| > \varepsilon\} = 0 \qquad (4.2) \]

3. Mean square convergence. In this case it is required that the mean square deviation tends to zero:

\[ \lim_{n\to\infty} E\{|\xi_n - \xi|^2\} = 0 \qquad (4.3) \]

These definitions are related: convergence in probability follows both from convergence in the mean square sense and from convergence with probability 1.

Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres © 2004 John Wiley & Sons, Ltd. ISBN: 0-470-84741-7
A similar definition can be extended to the case of a random process ξ(t) by considering samples of the sequence of random processes {ξ_n(t)} at a given moment of time t. In addition, the notion of continuity can be introduced by considering values of the random process at two closely spaced moments of time t and t + h. In particular, a random process ξ(t) is said to be continuous with probability 1 at the moment of time t if ξ(t + h) → ξ(t) with probability 1 as h → 0. It is important to mention, though, that continuity with probability 1 does not imply that each realization of the random process ξ(t) is itself continuous at the same point.¹ In the following, mainly convergence (and continuity) in the mean square sense is used. Such an approach is applicable to so-called random processes of the second order, i.e. to processes whose second moment is finite, m₂(t) = E{ξ²(t)} < ∞, at any given moment of time t. A random process of the second order is continuous in the mean square sense at the moment of time t if

\[ \lim_{h\to0} E\{[\xi(t+h) - \xi(t)]^2\} = 0 \qquad (4.4) \]

In order to check the validity of this condition, it is necessary and sufficient to know the covariance function C_ξ(t₁, t₂) of the random process ξ(t): a random process ξ(t) is continuous in the mean square sense if and only if its mean is continuous at the moment of time t and the covariance function C_ξ(t₁, t₂) is continuous at t₁ = t₂ = t. Indeed, if the conditions above are satisfied then

\[
\lim_{h\to0} E\{[\xi(t+h)-\xi(t)]^2\} = \lim_{h\to0}\left[C_\xi(t+h,t+h) - 2C_\xi(t+h,t) + C_\xi(t,t) + (m_\xi(t+h)-m_\xi(t))^2\right] = 0 \qquad (4.5)
\]

and thus these conditions are sufficient. Necessity follows from the simple estimate

\[
E\{[\xi(t+h)-m_\xi(t+h) - \xi(t)+m_\xi(t)]^2\} = C_\xi(t+h,t+h) - 2C_\xi(t+h,t) + C_\xi(t,t) \geq 0 \qquad (4.6)
\]

Since the left-hand side of eq. (4.5) is the sum of this non-negative quantity and the non-negative term (m_ξ(t+h) − m_ξ(t))², convergence of the left-hand side to zero forces each term to converge to zero separately, thus proving necessity of the conditions.

4.1.2 Differentiability
The random process ξ(t) of the second order is called differentiable in the mean square sense at the moment of time t = t₀ if the limit

\[
\xi'(t_0) = \lim_{h\to0}\frac{\xi(t_0+h)-\xi(t_0)}{h} \qquad (4.7)
\]

exists in the mean square sense. The new random process ξ′(t) is called the derivative of the random process ξ(t) in the mean square sense if the derivative exists for all moments of time t considered. The necessary and sufficient conditions for differentiability of a process ξ(t) can be formulated as follows: a second order random process ξ(t) is differentiable at the moment of time t in the mean square sense if and only if its mean m_ξ(t) is differentiable at the moment of time t and the second mixed derivative

\[
\left.\frac{\partial^2}{\partial t_1\,\partial t_2}\,C_\xi(t_1,t_2)\right|_{t_1=t_2=t} \qquad (4.8)
\]

exists and is finite. In order to prove sufficiency of these conditions, one can consider the average of the following quantity:

\[
\left[\frac{\xi(t_0+h_1)-\xi(t_0)}{h_1} - \frac{\xi(t_0+h_2)-\xi(t_0)}{h_2}\right]^2 = \left[\frac{\xi(t_0+h_1)-\xi(t_0)}{h_1}\right]^2 - 2\,\frac{\xi(t_0+h_1)-\xi(t_0)}{h_1}\cdot\frac{\xi(t_0+h_2)-\xi(t_0)}{h_2} + \left[\frac{\xi(t_0+h_2)-\xi(t_0)}{h_2}\right]^2 \qquad (4.9)
\]

Averaging the middle term on the right-hand side results in

\[
E\left\{\frac{\xi(t_0+h_1)-\xi(t_0)}{h_1}\cdot\frac{\xi(t_0+h_2)-\xi(t_0)}{h_2}\right\} = \frac{m_\xi(t_0+h_1)-m_\xi(t_0)}{h_1}\cdot\frac{m_\xi(t_0+h_2)-m_\xi(t_0)}{h_2} + \frac{C_\xi(t_0+h_1,t_0+h_2) - C_\xi(t_0+h_1,t_0) - C_\xi(t_0,t_0+h_2) + C_\xi(t_0,t_0)}{h_1 h_2} \qquad (4.10)
\]

Taking the limit on both sides of eq. (4.10), one obtains

\[
\lim_{h_1,h_2\to0} E\left\{\frac{\xi(t_0+h_1)-\xi(t_0)}{h_1}\cdot\frac{\xi(t_0+h_2)-\xi(t_0)}{h_2}\right\} = \left[\frac{\partial}{\partial t}m_\xi(t)\right]^2 + \left.\frac{\partial^2}{\partial t_1\,\partial t_2}C_\xi(t_1,t_2)\right|_{t_1=t_2=t_0} \qquad (4.11)
\]

thus indicating that the average of the right-hand side of eq. (4.9) is finite. After calculating all the terms and taking the limit for h₁, h₂ → 0, one verifies that

\[
\lim_{h_1,h_2\to0} E\left\{\left[\frac{\xi(t_0+h_1)-\xi(t_0)}{h_1} - \frac{\xi(t_0+h_2)-\xi(t_0)}{h_2}\right]^2\right\} = 0 \qquad (4.12)
\]

Thus, the derivative of the random process ξ(t) exists in the mean square sense. Necessity can also be easily proven by observing that the right-hand side of eq. (4.9) is a sum of two non-negative terms. If the left-hand side approaches zero, then each of these terms approaches zero separately, which is equivalent to the requirements of existence of the derivatives of the mean and covariance functions.

¹ The definition based on the probability measure allows a certain number of realizations to have a discontinuity, as long as the set of such realizations is "small", or, more accurately, has measure zero.
4.1.3 Integrability

Finally, an integral of a random process in the mean square sense can be defined as follows. Let a time interval [a, b] be divided into sub-intervals by points a = t₀ < t₁ < ⋯ < t_n = b, and let f(t) be a deterministic function defined on the same interval [a, b]. A partial sum

\[
\eta_n = \sum_{i=1}^{n} \xi(t_i)\,f(t_i)\,(t_i - t_{i-1}) \qquad (4.13)
\]

defines a sequence of random variables {η_n}. If this sequence has a finite limit J in the mean square sense when max(t_j − t_{j−1}) → 0,

\[
J = \lim_{\max(t_j - t_{j-1})\to0}\ \sum_{i=1}^{n} \xi(t_i)\,f(t_i)\,(t_i - t_{i-1}) \qquad (4.14)
\]

it is called the Riemann mean square integral of the random process ξ(t) over the time interval [a, b]:

\[
J = J[a,b] = \int_a^b f(t)\,\xi(t)\,dt = \lim_{\max(t_j-t_{j-1})\to0}\ \sum_{i=1}^{n} \xi(t_i)\,f(t_i)\,(t_i - t_{i-1}) \qquad (4.15)
\]

Improper integrals can also be defined by considering the mean square limit of the integral over a finite but increasing time interval [a, b]. For example,

\[
\int_a^{\infty} f(t)\,\xi(t)\,dt = \lim_{b\to\infty}\int_a^b f(t)\,\xi(t)\,dt \qquad (4.16)
\]

The mean square integral of the Riemann type exists if the following two integrals exist:

\[
I_1 = \int_a^b f(t)\,m_\xi(t)\,dt, \qquad I_2 = \int_a^b\!\int_a^b f(t_1)f(t_2)\,C_\xi(t_1,t_2)\,dt_1\,dt_2 \qquad (4.17)
\]

In this case

\[
E\left\{\int_a^b f(t)\,\xi(t)\,dt\right\} = \int_a^b f(t)\,m_\xi(t)\,dt \qquad (4.18)
\]

and

\[
E\left\{\int_a^b\!\int_a^b f(t_1)f(t_2)\,\xi(t_1)\xi(t_2)\,dt_1\,dt_2\right\} = \int_a^b\!\int_a^b f(t_1)f(t_2)\,C_\xi(t_1,t_2)\,dt_1\,dt_2 + \left[\int_a^b f(t)\,m_\xi(t)\,dt\right]^2 \qquad (4.19)
\]
4.2 ELEMENTS OF SYSTEM THEORY

4.2.1 General Remarks
Analytical investigation of any realistic system requires an adequate mathematical model. There are three main principles which guide the choice of the model: (a) it must reflect real properties of the system; (b) it must be reasonably simple; and (c) it must allow well developed mathematical techniques to be applied to the analysis and the design. In order to build such a model, certain a priori information must be available. The amount of such information defines two significantly different approaches to modeling: (a) in some cases there is enough information to model a system's behaviour quite accurately; (b) only minimal information is available. In the former case, it is usually assumed that the mathematical model of the system is known (analysis) or its parameters must be identified (estimation), while the latter requires that such a model be synthesized. One of the most prominent examples of such synthesis is phenomenological modeling of the communication channel, considered in this book.

From the mathematical perspective, the operation of any system can be thought of in terms of input–output relationships, i.e. for an arbitrary input ξ(t) the output η(t) is defined as some function of the input:

\[ \eta(t) = T[\xi(t)] \qquad (4.20) \]

If such a description is possible, the mathematical description of the system is achieved by providing a proper functional T. The function η(t) is referred to as the reaction to the stimulus ξ(t). Depending on the nature of the input stimulus and the reaction, a great variety of systems can be considered. Of course, the stimulus can depend not only on time but on other variables such as frequency, space coordinates, etc. It is important to note that the structure of the operator T does not necessarily reflect physical processes taking place inside the real system. Such a description only allows accurate reproduction of the output of the system once the input is specified.

There is a great variety of possible structures of the system and the corresponding operator T which allow an input–output description. The input signal (stimulus) and the reaction can be vectors of dimension n and m respectively, thus describing multiple-input, multiple-output (MIMO) systems, as shown in Fig. 4.1(A). Such a description incorporates the simple cases of single-input single-output (SISO) systems (m = n = 1) and autonomous systems (n = 0, m = 1). The operator T can be deterministic or stochastic. The deterministic operator describes systems which produce the same response to the same stimulus if the same initial conditions

Figure 4.1  (A) MIMO systems, (B) SISO systems.
are enforced. In other words, the randomness in the response η(t) may appear only due to the randomness in the input signal ξ(t). On the contrary, the operator T is called stochastic if randomness in the response η(t) appears also due to the internal structure of the system. For example, it can be attributed to random initial conditions, a changing structure of the system, etc. If all physical phenomena are to be taken into account, most realistic systems are described by stochastic operators, due to such internal factors as thermal noise and fluctuating external conditions (temperature, humidity). However, in many cases the influence of these random factors is negligible and the system can be sufficiently described by a deterministic operator with randomness attributed only to the input stimuli. In many cases, internal randomness can also be properly represented by an equivalent random input. The majority of the discussion in this book deals with deterministic systems with random inputs, with the notable exception of systems with random structure, described in Chapter 6.

Further classification of operators can be achieved based on the type of output process η(t). In particular, one can distinguish between analog, discrete and digital systems, continuous systems with jumps, etc. It is also possible to distinguish between zero-memory non-linear (ZMNL) systems, linear systems, non-linear systems with memory, causal and non-causal systems,² etc. The system is called a zero-memory non-linear system if the input–output relationship can be represented in the following form:

\[ \eta(t) = g[\xi(t); t] \qquad (4.21) \]

where g[ξ(t); t] is a non-linear deterministic vector function. In other words, the output of a ZMNL system at the moment of time t can be predicted based on the value of the input signal ξ(t) at the same moment of time, and does not depend on the values of the input and the output signals at any previous moment of time t₀ < t.

A causal system is a system which transforms current and past values of the input signal into the current value of the system output: future values of the system input have no effect on the current output value. In other words, the output always lags the excitation. While causal systems are the only systems which can be physically implemented, non-causal systems are often a useful tool in theoretical analysis. As an example, one can mention an ideal low-pass filter [14].

It is often possible to represent a general non-linear system with memory as a cascaded connection of ZMNL and linear sub-systems, as shown in Fig. 4.2. In this case, non-linear effects of the original system are reproduced by the ZMNL blocks while the memory effects are attributed to the linear block.

Figure 4.2  Representation of a non-linear system with memory.

² It will be seen from the following considerations that there is a group of methods well tailored for each separate class of systems, thus justifying such classification.
Strictly speaking, an operator T = L is called a linear operator if the so-called superposition principle is valid for this operator, i.e. the following equation must be satisfied for any set of input stimuli ξ_k(t):

\[
\eta(t) = L\left[\sum_{k=1}^{n} c_k\,\xi_k(t)\right] = \sum_{k=1}^{n} c_k\,L[\xi_k(t)] \qquad (4.22)
\]

Here c_k are arbitrary constants, which are independent of time. All other operators are treated as non-linear.

4.2.2 Continuous SISO Systems

The deterministic linear operator L for a SISO system can be given in the form of a differential or difference equation with corresponding initial conditions. For a continuous system with constant parameters, the differential equation can be written in the form

\[
\sum_{k=0}^{m} a_k\,\frac{d^k}{dt^k}\eta(t) = \sum_{k=0}^{n} b_k\,\frac{d^k}{dt^k}\xi(t), \qquad n < m \qquad (4.23)
\]

or, in symbolic form,

\[ A_m(s)\,\eta(t) = B_n(s)\,\xi(t) \qquad (4.24) \]

where

\[
s = \frac{d}{dt}, \qquad A_m(s) = \sum_{k=0}^{m} a_k s^k, \qquad B_n(s) = \sum_{k=0}^{n} b_k s^k \qquad (4.25)
\]

are polynomials in s defined by the coefficients of the differential equation (4.23). Equation (4.23) must be augmented by a set of initial conditions in the form

\[
\eta_0 = \eta(t_0), \quad \eta_0' = \left.\frac{d}{dt}\eta(t)\right|_{t=t_0}, \quad \ldots, \quad \eta_0^{(m-1)} = \left.\frac{d^{m-1}}{dt^{m-1}}\eta(t)\right|_{t=t_0} \qquad (4.26)
\]

Instead of differential equations, one can use the following alternative but equivalent descriptions of the same linear system and its operator L.

1. The impulse response h(t, τ) of the system is the reaction of the linear system to a delta pulse applied at time t = τ:

\[ h(t,\tau) = L[\delta(t-\tau)] \qquad (4.27) \]

The response to an arbitrary stimulus ξ(t) can then be found as

\[
\eta(t) = L[\xi(t)] = L\left[\int_{-\infty}^{\infty}\xi(\tau)\,\delta(t-\tau)\,d\tau\right] = \int_{-\infty}^{\infty}\xi(\tau)\,L[\delta(t-\tau)]\,d\tau = \int_{-\infty}^{\infty}\xi(\tau)\,h(t,\tau)\,d\tau \qquad (4.28)
\]
Here the sifting property of the delta function is used in the third term and the superposition principle is used to exchange the order of integration and the action of the operator L. For a causal system, no response appears before the excitation, i.e.

\[ h(t,\tau) = 0 \quad \text{for } t < \tau \qquad (4.29) \]

thus transforming eq. (4.28) into

\[ \eta(t) = \int_{-\infty}^{t}\xi(\tau)\,h(t,\tau)\,d\tau \qquad (4.30) \]

Linear systems can be further divided into time-variant and time-invariant systems. For a time-invariant system, the impulse response is a function of the difference of its arguments, i.e. h(t, τ) = h(t − τ), and thus

\[ \eta(t) = \int_{-\infty}^{t}\xi(\tau)\,h(t-\tau)\,d\tau \qquad (4.31) \]

In other words, a time-invariant system is invariant to a shift of the origin of time:

\[ \eta(t-t_0) = L[\xi(t-t_0)] \qquad (4.32) \]

All other systems are called time-variant. The duration of the impulse response defines the memory of a system, since it reflects for how long a perfectly localized input affects the output.

A linear system is called stable if its impulse response is absolutely integrable, i.e.

\[ \int_0^{\infty}|h(t)|\,dt < \infty \qquad (4.33) \]

In this case a bounded input always produces a bounded response. Stability is widely investigated in control theory, with numerous criteria of stability available [5,6]. In particular, a system is stable if all roots of the characteristic polynomial

\[ \sum_{k=0}^{m} a_k\,\lambda^k = 0 \qquad (4.34) \]

have negative real parts.

2. An equivalent description of a stationary linear system can be achieved by means of the frequency response, or the transfer function, H(jω), defined as the Fourier transform of the impulse response:

\[
H(j\omega) = \int_{-\infty}^{\infty} h(\tau)\,\exp(-j\omega\tau)\,d\tau = |H(j\omega)|\,\exp[\,j\theta(j\omega)] \qquad (4.35)
\]
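The root criterion following eq. (4.34) is straightforward to check numerically, e.g. with numpy.roots (a sketch; the polynomial coefficients are arbitrary examples):

```python
import numpy as np

def is_stable(a):
    """Stability check: all roots of sum_k a_k lambda^k = 0 have negative
    real parts. `a` lists a_0 ... a_m (lowest order first)."""
    roots = np.roots(a[::-1])           # np.roots expects highest order first
    return bool(np.all(roots.real < 0))

print(is_stable([2.0, 3.0, 1.0]))       # lambda^2 + 3*lambda + 2: roots -1, -2 -> True
print(is_stable([-2.0, 1.0, 1.0]))      # lambda^2 + lambda - 2: root at +1    -> False
```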
or, equivalently,

\[
h(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} H(j\omega)\,\exp(\,j\omega t)\,d\omega \qquad (4.36)
\]

The reaction of a linear system to a harmonic signal of frequency ω is a harmonic signal of the same frequency, scaled by the value of the transfer function of the system, i.e.

\[
\eta(t) = L[\exp(\,j\omega t)] = \int_{-\infty}^{\infty} h(\tau)\,\exp[\,j\omega(t-\tau)]\,d\tau = H(j\omega)\,\exp(\,j\omega t) \qquad (4.37)
\]

Instead of the Fourier transform, one can use the Laplace transform to include signals which are not properly described by the Fourier transform. In this case

\[
H(s) = \int_0^{\infty} h(\tau)\,\exp(-s\tau)\,d\tau \qquad (4.38)
\]

A detailed discussion of the advantages of both methods can be found in [7].
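As an illustration of eq. (4.35), take the RC-type impulse response h(t) = (1/T)e^{−t/T}, t ≥ 0, whose transfer function is H(jω) = 1/(1 + jωT); a direct numerical Fourier transform reproduces this closed form (a sketch assuming NumPy; the filter and its time constant are arbitrary examples):

```python
import numpy as np

T = 0.1                                            # filter time constant
t = np.linspace(0.0, 20 * T, 200_001)
h = np.exp(-t / T) / T                             # impulse response of an RC filter
dt = t[1] - t[0]

for w in (0.0, 10.0, 50.0):                        # frequencies in rad/s
    H_num = np.sum(h * np.exp(-1j * w * t)) * dt   # eq. (4.35), rectangle rule
    H_ref = 1.0 / (1.0 + 1j * w * T)               # known closed form
    print(w, abs(H_num - H_ref))                   # small discretization error
```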
4.2.3 Discrete Linear Systems
Similar characteristics can be obtained for discrete systems. It is important to mention one property of linear systems: the characteristics of a discrete system can be obtained from the similar characteristics of an equivalent continuous system, while the inverse is not always possible [8]. However, the analysis of discrete systems is important in itself, without relation to the underlying continuous systems. Let ξ_n = ξ(t − nΔt) be samples of a random process ξ(t) at time instants separated by a fixed discretization interval Δt, and let ε be the shift (delay) operator

$$\varepsilon\, \xi_n = \xi_{n+1}, \qquad \varepsilon^m \xi_n = \xi_{n+m} \qquad (4.39)$$

The difference operator ∇ can then be defined as

$$\nabla \xi_n = \xi_n - \xi_{n+1} = (1 - \varepsilon)\, \xi_n \qquad (4.40)$$

Using this notation, a formal equivalent of eq. (4.23) is given by

$$(1 + \alpha_1 \nabla + \cdots + \alpha_p \nabla^p)\, \eta = (\beta_0 + \beta_1 \nabla + \cdots + \beta_q \nabla^q)\, \xi \qquad (4.41)$$
In applied problems, a few particular forms of eq. (4.41) are used: (1) autoregressive (AR); (2) moving average (MA); (3) autoregressive with moving average (ARMA); and (4) autoregressive integrated moving average (ARIMA).
108
ADVANCED TOPICS IN RANDOM PROCESSES
An AR model of the order M, or AR(M), produces the current sample η₀ of the centred output process³ as a linear combination of M past samples of the output process and the current sample of the input process ξ₀, i.e.

$$\eta_0 = \sum_{m=1}^{M} a_m \eta_m + \xi_0 \qquad (4.42)$$
This model has M parameters a_m, plus freedom in choosing the distribution of the input signal. Due to the analytical difficulties of treating higher order difference equations, it is common to limit consideration to the AR(1) and AR(2) models; despite such limitations, a large number of useful results can be obtained [9–12]. An MA model of order N, or MA(N), produces the current value of a centred output signal as a weighted sum of the current and N past values of the input signal, i.e.

$$\eta_0 = \xi_0 + \sum_{n=1}^{N} b_n \xi_n \qquad (4.43)$$
This model is also defined by the values of N coefficients b_n and by the choice of the distribution of the input process ξ. Both AR and MA models can be treated as discrete filters: the AR model is described by an Infinite Impulse Response (IIR) filter, while the MA model is described by a Finite Impulse Response (FIR) filter. The MA model is well suited for the description of relatively smooth processes, while the AR model is used when a certain impulse behaviour has to be properly represented. The ARMA(M, N) model is a generalization which has features of both the MA and AR models defined above. In this case, the current sample of the output signal is obtained as a sum of weighted averages of the past N samples of the input process, the past M samples of the output process and the current input sample. Mathematically, the equation describing ARMA(M, N) is thus

$$\eta_0 = \sum_{m=1}^{M} a_m \eta_m + \xi_0 + \sum_{n=1}^{N} b_n \xi_n \qquad (4.44)$$
ARMA models can also be treated as discrete linear filters. Finally, the ARIMA(M, N, K) model is a further generalization of ARMA(M, N), with the AR part driven by the increments

$$\zeta_n = \nabla^K \eta_n = (1 - \varepsilon)^K \eta_n \qquad (4.45)$$

of the output process:

$$\zeta_0 = \sum_{m=1}^{M} a_m \zeta_m + \xi_0 + \sum_{n=1}^{N} b_n \xi_n \qquad (4.46)$$

³ A non-zero mean value m_η can be added later. The variance of the sequence ξ_n can also be adjusted in such a way that the sample of the external excitation has the weight 1 in eq. (4.42).
Analysis of a discrete system is most conveniently performed using either the discrete Fourier transform (DFT) [13]

$$\Xi(e^{j\omega}) = \sum_{n=-\infty}^{\infty} \xi_n\, e^{-j\omega n} \qquad (4.47)$$

or the so-called z transform method [13]

$$\Xi(z) = \sum_{n=-\infty}^{\infty} \xi_n\, z^{-n} \qquad (4.48)$$

$$\Xi(e^{j\omega}) = \Xi(z)\big|_{z = \exp(j\omega)} \qquad (4.49)$$
Both methods transform difference equations of the form (4.42)–(4.44) into algebraic equations in the ω or z domain. Indeed, if η_n = ξ_{n−k}, its z transform is given by

$$H(z) = \sum_{n=-\infty}^{\infty} \eta_n z^{-n} = \sum_{n=-\infty}^{\infty} \xi_{n-k} z^{-n} = z^{-k} \sum_{n=-\infty}^{\infty} \xi_{n-k} z^{-(n-k)} = z^{-k}\, \Xi(z), \qquad H(e^{j\omega}) = e^{-j\omega k}\, \Xi(e^{j\omega}) \qquad (4.50)$$

which allows the ARMA equation to be rewritten as

$$H(z)\left[1 - \sum_{m=1}^{M} a_m z^{-m}\right] = \Xi(z)\left[1 + \sum_{n=1}^{N} b_n z^{-n}\right] \qquad (4.51)$$

and thus

$$H(z) = \frac{1 + \sum_{n=1}^{N} b_n z^{-n}}{1 - \sum_{m=1}^{M} a_m z^{-m}}\; \Xi(z) \qquad (4.52)$$

Here Ξ(z) is the z transform of the sequence ξ_n.
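The correspondence between the ARMA recursion (4.44) and the rational transfer function (4.52) can be checked numerically. A minimal sketch with assumed coefficient values (a1, b1 are illustrative, not from the text; time-series indexing is used, so the "past" sample is eta_{n-1}): for ARMA(1,1) the impulse response must match the power-series expansion of (1 + b1 z⁻¹)/(1 − a1 z⁻¹), i.e. h₀ = 1 and h_k = a1^{k−1}(a1 + b1) for k ≥ 1.

```python
# ARMA(1,1) recursion treated as a discrete IIR filter:
#   eta_n = a1*eta_{n-1} + xi_n + b1*xi_{n-1}
# Its impulse response should equal the series expansion of the transfer
# function H(z) = (1 + b1/z) / (1 - a1/z).
a1, b1 = 0.7, 0.4
N = 30

xi = [1.0] + [0.0] * (N - 1)          # unit impulse input
eta, xi_prev, eta_prev = [], 0.0, 0.0
for x in xi:
    y = a1 * eta_prev + x + b1 * xi_prev
    eta.append(y)
    xi_prev, eta_prev = x, y

expected = [1.0] + [a1 ** (k - 1) * (a1 + b1) for k in range(1, N)]
print(max(abs(u - v) for u, v in zip(eta, expected)))   # ~0
```

Because |a1| < 1 here, the impulse response decays geometrically, i.e. the filter is stable in the sense of eq. (4.33).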
4.2.4 MIMO Systems
Analysis of MIMO systems can be conducted within the same framework as in the previous sections. In general, it is impossible to provide an expression for the joint probability densities of the output vector process, but it is feasible to obtain expressions for moments and cumulants. Let ξ_n(t), 1 ≤ n ≤ N, be a random process acting at the n-th input of an N-input, M-output MIMO system. If the impulse response between the n-th input and the m-th output is denoted by h_{nm}(t), the response η_m(t) of the system at the m-th output is given by

$$\eta_m(t) = \sum_{n=1}^{N} \int_{0}^{t} h_{nm}(\tau_n)\, \xi_n(t - \tau_n)\, d\tau_n \qquad (4.53)$$
Thus, the average value m_{η_m}(t) of the m-th output and the mutual correlation function R_{η_{m_1}η_{m_2}}(t_1, t_2) between two outputs can be calculated as follows

$$m_{\eta_m}(t) = \sum_{n=1}^{N} \int_{0}^{t} h_{nm}(\tau_n)\, m_{\xi_n}(t - \tau_n)\, d\tau_n \qquad (4.54)$$

$$R_{\eta_{m_1}\eta_{m_2}}(t_1, t_2) = E\{\eta_{m_1}(t_1)\, \eta_{m_2}(t_2)\} = \sum_{n_2=1}^{N} \sum_{n_1=1}^{N} \int_{0}^{t_1}\!\!\int_{0}^{t_2} h_{m_1 n_1}(\tau_{n_1})\, h_{m_2 n_2}(\tau_{n_2})\, R_{\xi_{n_1}\xi_{n_2}}(t_1 - \tau_{n_1},\, t_2 - \tau_{n_2})\, d\tau_{n_1}\, d\tau_{n_2} \qquad (4.55)$$

The same equations hold for the centred processes, and thus for the mutual covariance functions C_{η_{m_1}η_{m_2}}(t_1, t_2):

$$C_{\eta_{m_1}\eta_{m_2}}(t_1, t_2) = \sum_{n_2=1}^{N} \sum_{n_1=1}^{N} \int_{0}^{t_1}\!\!\int_{0}^{t_2} h_{m_1 n_1}(\tau_{n_1})\, h_{m_2 n_2}(\tau_{n_2})\, C_{\xi_{n_1}\xi_{n_2}}(t_1 - \tau_{n_1},\, t_2 - \tau_{n_2})\, d\tau_{n_1}\, d\tau_{n_2} \qquad (4.56)$$
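The superposition sum in eq. (4.53) is easy to verify in discrete time. The following sketch uses a hypothetical 2-input, 1-output system (all impulse responses and signal lengths are assumptions for illustration): the output is the sum over inputs of the convolution of each input with the corresponding impulse response.

```python
import numpy as np

# Discrete counterpart of eq. (4.53): output = sum of per-input convolutions.
rng = np.random.default_rng(0)
xi1, xi2 = rng.normal(size=100), rng.normal(size=100)
h11 = 0.9 ** np.arange(20)            # impulse response, input 1 -> output 1
h21 = 0.5 ** np.arange(20)            # impulse response, input 2 -> output 1

eta1 = (np.convolve(xi1, h11) + np.convolve(xi2, h21))[:100]

# the same value written as the explicit superposition sum at one time index
t = 50
direct = sum(h11[k] * xi1[t - k] + h21[k] * xi2[t - k] for k in range(20))
print(abs(eta1[t] - direct))          # ~0
```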
If the input signals are jointly stationary, the expression for the mutual covariance function is somewhat simplified to read

$$C_{\eta_{m_1}\eta_{m_2}}(t_1, \tau) = \sum_{n_2=1}^{N} \sum_{n_1=1}^{N} \int_{0}^{t_1}\!\!\int_{0}^{t_1+\tau} h_{m_1 n_1}(\tau_{n_1})\, h_{m_2 n_2}(\tau_{n_2})\, C_{\xi_{n_1}\xi_{n_2}}(\tau + \tau_{n_1} - \tau_{n_2})\, d\tau_{n_1}\, d\tau_{n_2} \qquad (4.57)$$

Finally, if t₁ → ∞, the expression for the asymptotically stationary mutual covariance function becomes

$$C_{\eta_{m_1}\eta_{m_2}}(\tau) = \sum_{n_2=1}^{N} \sum_{n_1=1}^{N} \int_{0}^{\infty}\!\!\int_{0}^{\infty} h_{m_1 n_1}(\tau_{n_1})\, h_{m_2 n_2}(\tau_{n_2})\, C_{\xi_{n_1}\xi_{n_2}}(\tau + \tau_{n_1} - \tau_{n_2})\, d\tau_{n_1}\, d\tau_{n_2} \qquad (4.58)$$

4.2.5 Description of Non-Linear Systems
As was mentioned earlier, two classes of non-linear systems can be distinguished: zero memory (non-inertial) and non-zero memory (inertial) systems. Zero memory non-linearities are described by an operator of the type

$$\eta(t) = T[\xi(t)] = g(\xi(t)) \qquad (4.59)$$

where y = g(x) is a deterministic (or random) function which does not depend on time. Often a polynomial approximation

$$y(x) \approx \sum_{n=0}^{N} a_n (x - x_0)^n \qquad (4.60)$$
a piece-wise linear approximation

$$y(x) \approx a_i x + b_i, \quad x \in [x_i, x_{i+1}] \qquad (4.61)$$

or a transcendental approximation is used. In the latter case, the non-linearity g(x) is approximated by some simple and well studied transcendental functions such as the exponential, tangent, logarithm, etc. Each of these approximations has its advantages and disadvantages; it is often difficult to balance accuracy of approximation against simplicity of calculation. Mathematical models of non-linear systems with memory vary greatly. Often they are described by a system of non-linear differential equations (state–space equations), written in the Cauchy form

$$\frac{d}{dt}\, x(t) = f[x(t), t] + g[x(t), t]\, \xi(t) \qquad (4.62)$$
where f(·) and g(·) are deterministic functions which may depend on time. In contrast to linear systems, where the general structure of the solution is known, the form of the solution of the non-linear equation (4.62) depends on the order, type, excitation and many other factors. In most cases it is impossible to obtain an exact solution even for the autonomous case ξ(t) = 0. This is why the analysis of non-linear systems with memory is fundamentally different from the analysis of linear systems and zero memory non-linearities: each non-linear system must be analyzed separately. However, it will be seen that for some classes of non-linear systems it is possible to obtain significant information in a very general form. Instead of using differential equations, one can rewrite the state–space equations of type (4.62) using the so-called Volterra representation [14]

$$\eta(t) = h_0(t) + \sum_{i=1}^{\infty} \int_{R^i} h_i(t; \tau_1, \ldots, \tau_i)\, \xi(\tau_1) \cdots \xi(\tau_i)\, d\tau_1 \cdots d\tau_i \qquad (4.63)$$
Such a representation is legitimate if the function f(x) is well approximated by its Taylor series and the function g(x) does not depend on ξ(t). Furthermore, if the functions f(x) and g(x) do not depend explicitly on time, eq. (4.63) can be recast as

$$\eta(t) = h_0(t) + \sum_{i=1}^{\infty} \int_{R^i} h_i(\tau_1, \ldots, \tau_i)\, \xi(t - \tau_1) \cdots \xi(t - \tau_i)\, d\tau_1 \cdots d\tau_i \qquad (4.64)$$

The functions h_i(τ₁, …, τ_i) are called the Volterra kernels of the system. The first term h₀(t) can be interpreted as the internal dynamics of the system, since it can be obtained by setting the forcing function ξ(t) to zero. The second term is the reaction of a linear system with impulse response h₁(t) to the external signal ξ(t). The third term is more complicated and can be viewed as a two-dimensional convolution; it can also be treated as a combination of a linear system and a ZMNL system. The same is true for the higher order terms in eq. (4.64) [14]. Thus, in some cases, a non-linear system with memory can be represented by a combination of linear blocks and a ZMNL block, as shown in Fig. 4.3. As a result, analysis, and sometimes synthesis, of a non-linear system can be reduced to the analysis of linear systems and ZMNL systems.
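The linear-plus-ZMNL block interpretation can be made concrete in discrete time. In the sketch below (hypothetical kernels, chosen for illustration) the second-order kernel is separable, h₂[k₁, k₂] = g[k₁]g[k₂]; for such a kernel the quadratic Volterra term equals the square of a linearly filtered signal, i.e. a linear filter followed by a square-law ZMNL block, exactly the block structure of Fig. 4.3.

```python
import numpy as np

# Truncated discrete Volterra series (cf. eq. (4.64)) with a separable
# second-order kernel h2[k1, k2] = g[k1] * g[k2].
rng = np.random.default_rng(1)
x = rng.normal(size=200)
h1 = 0.8 ** np.arange(10)             # first-order (linear) kernel
g = 0.6 ** np.arange(10)              # factor of the second-order kernel

t = 100
lin = sum(h1[k] * x[t - k] for k in range(10))
quad = sum(g[k1] * g[k2] * x[t - k1] * x[t - k2]
           for k1 in range(10) for k2 in range(10))
y_volterra = lin + quad

# equivalent block structure: linear filter g, then square-law ZMNL
y_blocks = lin + sum(g[k] * x[t - k] for k in range(10)) ** 2
print(abs(y_volterra - y_blocks))     # ~0
```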
Figure 4.3 Representation of non-linear systems using the Volterra series.

4.3 ZERO MEMORY NON-LINEAR TRANSFORMATION OF RANDOM PROCESSES
In this section, the dependence of the moments and cumulants of the marginal PDF p_η(x) and of the second order joint PDF p_η(x, y) of a random process η = f(ξ) on the characteristics of the input process ξ(t) is studied. The zero memory non-linear transformation

$$y = f(x) \qquad (4.65)$$

is assumed to be known. Transformation of a random process ξ(t) by a ZMNL system can be reduced to the problem of transformation of a random vector, as considered in section 2.3. Indeed, if the joint PDF p_η(y₁, …, y_n; t₁, …, t_n) of the output process η(t) = f[ξ(t)], considered at n moments of time, is to be found, one can approach the problem in the following manner. Let x = [ξ(t₁), ξ(t₂), …, ξ(t_n)]ᵀ be the vector formed by the samples of the input random process ξ(t) and y = [η(t₁), η(t₂), …, η(t_n)]ᵀ be the vector formed by the samples of the output process η(t). Since the transformation has no memory, each component is transformed independently and according to the same rule, i.e.

$$\mathbf{y} = [y_1, y_2, \ldots, y_n]^T = f(\mathbf{x}) = [\,f(x_1), f(x_2), \ldots, f(x_n)]^T \qquad (4.66)$$

Thus, if the joint PDF p_ξ(x) is known, one can calculate the joint PDF p_η(y) using the rules of transformation of a random vector.
4.3.1 Transformation of Moments and Cumulants
At a fixed moment of time t, the values of the random processes ξ(t) and η(t) are random variables and, thus, the relations for the moments and the central moments of the output process are just given by

$$m_{\eta n}(t) = \int_{-\infty}^{\infty} y^n\, p_\eta(y; t)\, dy = \int_{-\infty}^{\infty} f^n(x)\, p_\xi(x; t)\, dx \qquad (4.67)$$

$$\mu_{\eta n}(t) = \int_{-\infty}^{\infty} [\,y - m_{\eta 1}(t)]^n\, p_\eta(y; t)\, dy = \int_{-\infty}^{\infty} [\,f(x) - m_{\eta 1}(t)]^n\, p_\xi(x; t)\, dx \qquad (4.68)$$
In principle, eqs. (4.67) and (4.68) solve the problem of finding the moments of the marginal PDF of the output process η(t) at any given instant of time. Of course, for a stationary process, all these quantities do not depend on time. In general, complete knowledge of the marginal PDF p_ξ(x; t) is required. Further simplification can be achieved for a polynomial non-linear transformation. In this case, the transformation is a sum of powers of the input, i.e.

$$f(x) = \sum_{i=0}^{N} a_i x^i \qquad (4.69)$$
Using this expression in (4.67) and setting n = 1, one obtains the expression for the average m_{η1}(t) of the output process

$$m_{\eta 1}(t) = \int_{-\infty}^{\infty} \left( \sum_{i=0}^{N} a_i x^i \right) p_\xi(x; t)\, dx = \sum_{i=0}^{N} a_i\, m_{\xi i}(t) \qquad (4.70)$$
Similarly, the expression for the second moment can be obtained by noting that

$$f^2(x) = \left( \sum_{i=0}^{N} a_i x^i \right)^2 = \sum_{i=0}^{N} a_i^2 x^{2i} + 2 \sum_{i=0}^{N} \sum_{j=i+1}^{N} a_i a_j\, x^{i+j} \qquad (4.71)$$

and, thus

$$m_{\eta 2}(t) = \sum_{i=0}^{N} a_i^2\, m_{\xi 2i}(t) + 2 \sum_{i=0}^{N} \sum_{j=i+1}^{N} a_i a_j\, m_{\xi(i+j)}(t) \qquad (4.72)$$
In general, the expression f^k(x) is a polynomial of order kN and, thus, only the moments m_{ξn}(t) of order up to kN are needed to calculate the k-th moment of the output:

$$f^k(x) = \sum_{i=0}^{kN} b_i x^i, \qquad m_{\eta k}(t) = \sum_{i=0}^{kN} b_i\, m_{\xi i}(t) \qquad (4.73)$$
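The moment propagation in eqs. (4.70) and (4.73) can be sketched in a few lines. The example non-linearity f(x) = 1 + 2x + 3x² and the standard Gaussian input are assumptions chosen so that the answer is known in closed form (the Gaussian moments are 0 for odd order and (k − 1)!! for even order).

```python
import numpy as np

# Output moments of a polynomial ZMNL device from the input moments alone.
coeffs = [1.0, 2.0, 3.0]              # f(x) = 1 + 2x + 3x^2 (assumed example)

def gauss_moment(k):                  # k-th moment of N(0, 1)
    return 0.0 if k % 2 else float(np.prod(np.arange(k - 1, 0, -2)))

# first output moment, eq. (4.70)
m1 = sum(a * gauss_moment(i) for i, a in enumerate(coeffs))

# second output moment, eq. (4.73): expand f^2(x), reuse the input moments
f = np.polynomial.Polynomial(coeffs)
f2 = f ** 2
m2 = sum(b * gauss_moment(i) for i, b in enumerate(f2.coef))

print(m1, m2)                         # 4.0 38.0
```

Only input moments up to order kN = 4 were needed for the second output moment, exactly as eq. (4.73) states.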
This approach works well for the calculation of low order moments or for computer simulations. However, sometimes a simpler way of calculating the moments can be found, based on the application of eqs. (2.255)–(2.257), which allow differential equations for ⟨f^n(x)⟩ in terms of the cumulants of the input process ξ(t) to be obtained. Indeed, plugging f^n(x) into eq. (2.257), one obtains

$$\frac{\partial}{\partial \kappa_s}\, m_{\eta k}(t) = \frac{1}{s!} \left\langle \frac{d^s}{dx^s} f^k(x) \right\rangle, \qquad \frac{\partial^2}{\partial \kappa_s\, \partial \kappa_p}\, m_{\eta k}(t) = \frac{1}{s!\, p!} \left\langle \frac{d^{s+p}}{dx^{s+p}} f^k(x) \right\rangle \qquad (4.74)$$

Example [15]: As an example, an exponential transformation is considered here, i.e.

$$\eta(t) = \exp[a\, \xi(t)] \qquad (4.75)$$
In this case

$$\frac{\partial}{\partial \kappa_s} \langle e^{ax} \rangle = \frac{a^s}{s!} \langle e^{ax} \rangle \qquad (4.76)$$

Integrating this equation, one obtains

$$\langle e^{ax} \rangle = A_s \exp\left( \frac{a^s}{s!}\, \kappa_s \right) \qquad (4.77)$$

where A_s is a constant to be determined. Since s in eq. (4.77) is an arbitrary integer, a constant A should exist such that

$$\langle e^{ax} \rangle = A \exp\left( \sum_{s=1}^{\infty} \frac{a^s}{s!}\, \kappa_s \right) \qquad (4.78)$$

Furthermore, if the random variable is identically zero, then κ_s = 0 for any s and thus A = ⟨e⁰⟩ = 1. Thus, finally

$$\langle e^{ax} \rangle = \exp\left( \sum_{s=1}^{\infty} \frac{a^s}{s!}\, \kappa_s \right) \qquad (4.79)$$

This result also allows one to calculate the moments m_{ηs}. Indeed,

$$m_{\eta s} = \langle e^{sax} \rangle = \exp\left( \sum_{k=1}^{\infty} \frac{(sa)^k}{k!}\, \kappa_k \right) \qquad (4.80)$$
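Equation (4.79) can be spot-checked by simulation. In the Gaussian case only κ₁ = m and κ₂ = D survive, so the series collapses to the familiar ⟨exp(ax)⟩ = exp(am + a²D/2); the parameter values below are assumptions for illustration.

```python
import numpy as np

# Monte Carlo check of eq. (4.79) for Gaussian input, where only the first
# two cumulants are non-zero: <exp(a*x)> = exp(a*m + a**2 * D / 2).
rng = np.random.default_rng(2)
a, m, D = 0.5, 1.0, 2.0
x = rng.normal(m, np.sqrt(D), size=1_000_000)

mc = np.exp(a * x).mean()
theory = np.exp(a * m + a ** 2 * D / 2)
print(mc, theory)                     # close
```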
If the input process ξ(t) is a Gaussian process, then the calculation of the moments can be further simplified, since the higher order cumulants vanish⁴. In this case it can be shown [15] that

$$m_{\eta}(t) = \sum_{s=0}^{\infty} \frac{1}{s!} \left( \frac{D_\xi}{2} \right)^s \left. \frac{d^{2s}}{dx^{2s}} f(x) \right|_{x = m_{\xi 1}} \qquad (4.81)$$

⁴ While the cumulants vanish, even moments do not. Thus, the application of the approach given by eqs. (4.67) and (4.68) is not "optimal".
Example [15]: Let a Gaussian random process be transformed by a square law detector, in this case η = f(ξ) = ξ². Using eq. (4.81), one can easily obtain expressions for the first and second moments

$$m_{\eta 1}(t) = m_{\xi 1}^2(t) + D_\xi(t) \qquad (4.82)$$

$$m_{\eta 2}(t) = m_{\xi 1}^4(t) + 6\, m_{\xi 1}^2(t)\, D_\xi(t) + 3\, D_\xi^2(t) \qquad (4.83)$$

Finally, the variance of the output process is just

$$D_\eta(t) = m_{\eta 2}(t) - m_{\eta 1}^2(t) = 4\, m_{\xi 1}^2(t)\, D_\xi(t) + 2\, D_\xi^2(t) \qquad (4.84)$$
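A quick Monte Carlo sketch confirms eq. (4.84); the mean and variance values are assumptions chosen for the test.

```python
import numpy as np

# Simulation check of eq. (4.84): eta = xi**2 for Gaussian xi with mean m
# and variance D gives D_eta = 4*m**2*D + 2*D**2.
rng = np.random.default_rng(3)
m, D = 1.5, 0.8
xi = rng.normal(m, np.sqrt(D), size=1_000_000)
eta = xi ** 2

theory = 4 * m ** 2 * D + 2 * D ** 2
print(eta.var(), theory)              # both close to 8.48
```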
The next task is to find an expression for the correlation function R_η(t₁, t₂) of the transformed process η(t)

$$R_\eta(t_1, t_2) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} y_1 y_2\, p_\eta(y_1, y_2; t_1, t_2)\, dy_1\, dy_2 = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x_1)\, f(x_2)\, p_\xi(x_1, x_2; t_1, t_2)\, dx_1\, dx_2 \qquad (4.85)$$

If both processes ξ(t) and η(t) are stationary, eq. (4.85) simplifies to

$$R_\eta(\tau) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x)\, f(x_\tau)\, p_\xi(x, x_\tau; \tau)\, dx\, dx_\tau \qquad (4.86)$$
Thus, in order to evaluate the correlation function at the output of a non-linear zero memory device, it is sufficient to know either the joint PDF p_ξ(x₁, x₂; t₁, t₂) at the input or its characteristic function Θ_ξ(ju₁, ju₂; t₁, t₂). As a result, there are two main methods developed for the calculation of the correlation function [5]: (a) the direct method, based on the known PDF, and (b) the Rice method, based on the orthogonal expansion of the characteristic function. Both methods are briefly summarized here.

4.3.1.1 Direct Method
In this section, the following orthogonal expansion of the joint PDF p_ξ(x, x_τ; τ), studied in section 2.2, is widely used

$$p_\xi(x, x_\tau; \tau) = p(x)\, p(x_\tau) \sum_{n=0}^{\infty} \beta_n(\tau)\, Q_n(x)\, Q_n(x_\tau) \qquad (4.87)$$

Here {Q_n(x)} is a set of orthogonal polynomials generated by the PDF p_ξ(x)⁵, and the coefficients β_n(τ) can be found as

$$\beta_n(\tau) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} Q_n(x)\, Q_n(x_\tau)\, p_\xi(x, x_\tau; \tau)\, dx\, dx_\tau \qquad (4.88)$$

⁵ While it is always possible to build a system of orthogonal polynomials with a given weight p_ξ(x), this system is not always complete. It can be shown that for the Pearson system of PDFs, such a system is indeed complete. For more details see [16].
Using expansion (4.87), the expression for the correlation function can be rewritten as

$$R_\eta(\tau) = \sum_{n=0}^{\infty} \beta_n(\tau) \int_{-\infty}^{\infty} f(x)\, Q_n(x)\, p(x)\, dx \int_{-\infty}^{\infty} f(x_\tau)\, Q_n(x_\tau)\, p(x_\tau)\, dx_\tau = \sum_{n=0}^{\infty} \beta_n(\tau)\, a_n^2 \qquad (4.89)$$

In this expansion, the coefficients a_n can be calculated through the marginal PDF p(x) as

$$a_n = \int_{-\infty}^{\infty} f(x)\, Q_n(x)\, p(x)\, dx \qquad (4.90)$$
It is important to mention here that both β_n(τ) and a_n depend on a particular choice of p(x), which, in turn, defines the set of orthogonal polynomials Q_n(x). The most popular choices are the Gaussian PDF and the Gamma PDF, as discussed in section 2.2. As an alternative, p(x) can be chosen as the marginal PDF of the input process, p(x) = p_ξ(x).
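For a Gaussian weight, the Q_n(x) are the (orthonormalized) probabilists' Hermite polynomials, and the coefficients (4.90) can be computed by Gauss–Hermite quadrature. The sketch below uses the assumed example f(x) = x³, for which a₁ = 3, a₃ = √6, so eq. (4.89) reproduces the known cubic-device result R_η(τ) = 9ρ + 6ρ³ (here β_n(τ) = ρⁿ(τ) for a Gaussian input).

```python
import numpy as np
from numpy.polynomial import hermite_e as He
from math import factorial, sqrt, pi

# Coefficients a_n of eq. (4.90) for f(x) = x^3 under the standard Gaussian
# weight, with orthonormal polynomials Q_n = He_n / sqrt(n!).
nodes, weights = He.hermegauss(40)    # quadrature for weight exp(-x^2/2)

def a(n):
    Qn = He.hermeval(nodes, [0] * n + [1]) / sqrt(factorial(n))
    return np.sum(weights * nodes ** 3 * Qn) / sqrt(2 * pi)

coeffs = [a(n) for n in range(6)]
print(np.round(coeffs, 6))            # ~[0, 3, 0, 2.449, 0, 0]

rho = 0.4
R_eta = sum(c ** 2 * rho ** n for n, c in enumerate(coeffs))
print(R_eta, 9 * rho + 6 * rho ** 3)  # equal
```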
4.3.1.2 The Rice Method
Let

$$F(ju) = \int_{-\infty}^{\infty} f(x)\, e^{-jux}\, dx \qquad (4.91)$$

be the Fourier transform of the non-linearity f(x) under consideration, so that

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(ju)\, e^{jux}\, du \qquad (4.92)$$
As a result, the expression for the correlation function R_η(τ) can be rewritten as

$$R_\eta(\tau) = \frac{1}{4\pi^2} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} F(ju)\, e^{jux}\, du \int_{-\infty}^{\infty} F(ju_\tau)\, e^{ju_\tau x_\tau}\, du_\tau \right] p_\xi(x, x_\tau; \tau)\, dx\, dx_\tau$$
$$= \frac{1}{4\pi^2} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} F(ju)\, F(ju_\tau) \left[ \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} p_\xi(x, x_\tau; \tau)\, e^{j(ux + u_\tau x_\tau)}\, dx\, dx_\tau \right] du\, du_\tau \qquad (4.93)$$

Since the expression in the square brackets is the characteristic function of the second order Θ_ξ(ju, ju_τ; τ), eq. (4.93) becomes

$$R_\eta(\tau) = \frac{1}{4\pi^2} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} F(ju)\, F(ju_\tau)\, \Theta_\xi(ju, ju_\tau; \tau)\, du\, du_\tau \qquad (4.94)$$
Calculations using eq. (4.94) can be simplified if ξ(t) is a zero mean Gaussian process. In this case

$$\Theta_\xi(u_1, u_2; \tau) = \exp\left\{ -\frac{\sigma^2}{2} \left[ u_1^2 + 2\rho(\tau)\, u_1 u_2 + u_2^2 \right] \right\} = \exp\left( -\frac{\sigma^2}{2} u_1^2 \right) \exp\left( -\frac{\sigma^2}{2} u_2^2 \right) \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \left( \sigma^2 \rho(\tau)\, u_1 u_2 \right)^n \qquad (4.95)$$

As a result, eq. (4.94) becomes

$$C_\eta(\tau) = \frac{1}{4\pi^2} \sum_{n=0}^{\infty} \frac{(-1)^n \sigma^{2n} \rho^n(\tau)}{n!} \left[ \int_{-\infty}^{\infty} F(ju)\, u^n \exp\left( -\frac{\sigma^2 u^2}{2} \right) du \right]^2 = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, d_n^2\, \rho^n(\tau) \qquad (4.96)$$

where

$$d_n = \frac{\sigma^n}{2\pi} \int_{-\infty}^{\infty} F(ju)\, u^n \exp\left( -\frac{\sigma^2 u^2}{2} \right) du \qquad (4.97)$$

Relations (4.89)–(4.90) and (4.96)–(4.97) in principle allow calculation of the correlation function of the output of a non-linear device. They are especially helpful for numerical simulations. However, for the purpose of analytical treatment, these equations may become intractable; in particular, it is difficult to trace the effect of the input correlation function on the shape of the output correlation function. An alternative approach, based on cumulant equations, is considered in the following section.
4.3.2 Cumulant Method
The cumulant method is based on the application of the following identity, known as the Price formula [4,14,15,17,18]

$$\frac{\partial^n R_\eta(\tau)}{\partial [R_\xi(\tau)]^n} = \left\langle f^{(n)}(x)\, f^{(n)}(x_\tau) \right\rangle \qquad (4.98)$$

This equation, which is valid for an arbitrary marginal PDF of the input process⁶, is most useful for the analysis of transformations of Gaussian random processes. Without loss of generality, it is assumed that both the input and the output processes have zero mean, i.e. m_{ξ1} = m_{η1} = 0. Also, for convenience, the following notation is used

$$g_n(\tau) \equiv \left\langle f^{(n)}(x)\, f^{(n)}(x_\tau) \right\rangle \qquad (4.99)$$

⁶ Originally it was derived only for the case of Gaussian processes.
In the general case, the output correlation function depends on all the cumulants of the input joint density p_ξ(x, x_τ; τ), i.e.

$$g_n(\tau) = F_n\left[ \kappa_2(0, \tau),\, \kappa_3(0, \tau, \tau), \ldots,\, \kappa_{p+q}(0_{[p]}, \tau_{[q]}), \ldots \right] \qquad (4.100)$$

Combining the Price formula (4.98) with (4.100), one obtains a differential equation written now in terms of the cumulants

$$\frac{\partial^n R_\eta(\tau)}{\partial [R_\xi(\tau)]^n} = F_n\left[ R_\xi(\tau),\, \kappa_3(0, \tau_{[2]}), \ldots \right] \qquad (4.101)$$

Let g_{n0}(τ) be the value of the function F_n(·) at zero value of R_ξ(τ), i.e.

$$g_{n0}(\tau) = F_n\left[ R_\xi(\tau),\, \kappa_3(0, \tau_{[2]}), \ldots \right]\Big|_{R_\xi(\tau)=0} \qquad (4.102)$$

As was shown in [17], the correlation function R_η(τ) can be represented as a power series in terms of R_ξ(τ). In particular

$$R_\eta(\tau) = \sum_{k=1}^{\infty} \frac{1}{k!} \left. \frac{d^k R_\eta(\tau)}{d[R_\xi(\tau)]^k} \right|_{R_\xi(\tau)=0} R_\xi^k(\tau) + g_0(\tau) = \sum_{k=1}^{\infty} \frac{1}{k!} \left\langle f^{(k)}(x)\, f^{(k)}(x_\tau) \right\rangle\Big|_{R_\xi(\tau)=0} R_\xi^k(\tau) + g_0(\tau) \qquad (4.103)$$

The unknown function g₀(τ) can be defined from the condition

$$R_\eta(\tau)\big|_{R_\xi = 0} \equiv 0 \qquad (4.104)$$

Equation (4.103) represents a general form of the dependence of the output correlation function on the input correlation function. This series is significantly simplified if the non-linear transformation is a polynomial function; in this case the infinite series (4.103) reduces to a sum of a finite number of terms. Equation (4.103) also allows an asymptotic value of the correlation function R_η(τ) for large time lags to be obtained. Indeed, for sufficiently large τ, the value of the input correlation function is small, so only the first term of the expansion (4.103) need be kept to provide an appropriate accuracy. In this case

$$\lim_{\tau \to \infty} R_\eta(\tau) \cong \left\langle f'(x) \right\rangle^2 R_\xi(\tau) \qquad (4.105)$$

4.4 CUMULANT ANALYSIS OF NON-LINEAR TRANSFORMATION OF RANDOM PROCESSES

4.4.1 Cumulants of the Marginal PDF
The cumulants of the marginal PDF of the output process can be found using a technique considered in section 2.8. A few examples are considered below.
Example 1: Let η = f(ξ) be a given deterministic ZMNL transformation. It is required to find the dependence of the variance D_η = κ_{η2} of the output process on the input cumulants κ_s. It follows from eq. (4.74) that

$$\frac{\partial D_\eta}{\partial \kappa_s} = \frac{1}{s!} \left[ \left\langle \frac{d^s}{dx^s} f^2(x) \right\rangle - 2\, \langle f(x) \rangle \left\langle \frac{d^s}{dx^s} f(x) \right\rangle \right] \qquad (4.106)$$

Setting the order s sequentially to 1, 2, …, one obtains the following system of equations

$$\frac{\partial D_\eta}{\partial m_1} = 2 \left\langle f(x),\, \frac{d}{dx} f(x) \right\rangle$$
$$\frac{\partial D_\eta}{\partial D} = \left\langle \frac{d}{dx} f(x)\, \frac{d}{dx} f(x) \right\rangle + \left\langle f(x),\, \frac{d^2}{dx^2} f(x) \right\rangle$$
$$\frac{\partial D_\eta}{\partial \kappa_3} = \left\langle \frac{d}{dx} f(x)\, \frac{d^2}{dx^2} f(x) \right\rangle + \frac{1}{3} \left\langle f(x),\, \frac{d^3}{dx^3} f(x) \right\rangle$$
$$\frac{\partial D_\eta}{\partial \kappa_4} = \frac{1}{4} \left\langle \frac{d^2}{dx^2} f(x)\, \frac{d^2}{dx^2} f(x) \right\rangle + \frac{1}{3} \left\langle \frac{d}{dx} f(x)\, \frac{d^3}{dx^3} f(x) \right\rangle + \frac{1}{12} \left\langle f(x),\, \frac{d^4}{dx^4} f(x) \right\rangle \qquad (4.107)$$

Example 2: Let η(t) = Aξ²(t). In this case, equations for the derivatives of the variance D_η with respect to the low order cumulants can be easily obtained from eq. (4.107) by noticing that f′(x) = 2Ax, f″(x) = 2A and all the higher order derivatives are equal to zero. Using the rules of manipulating cumulant brackets, listed in the appendix of chapter 2, one obtains

$$\frac{\partial D_\eta}{\partial m_1} = 2 \langle Ax^2,\, 2Ax \rangle = (4\kappa_3 + 8\kappa_1 \kappa_2)\, A^2$$
$$\frac{\partial D_\eta}{\partial D} = \langle 4A^2 x^2 \rangle + \langle Ax^2,\, 2A \rangle = 4A^2 \left[ \kappa_2 + \kappa_1^2 \right]$$
$$\frac{\partial D_\eta}{\partial \kappa_3} = \langle 2Ax \cdot 2A \rangle + \frac{1}{3} \langle Ax^2,\, 0 \rangle = 4A^2 \kappa_1$$
$$\frac{\partial D_\eta}{\partial \kappa_4} = \frac{1}{4} \langle 2A \cdot 2A \rangle + \frac{1}{3} \langle 2Ax \cdot 0 \rangle + \frac{1}{12} \langle Ax^2,\, 0 \rangle = A^2 \qquad (4.108)$$

Solving this system of equations, starting from the equation which corresponds to the fourth cumulant, one obtains

$$D_\eta = A^2 \left[ \kappa_4 + 4\kappa_1 \kappa_3 + 2\kappa_2^2 + 4\kappa_1^2 \kappa_2 \right] \qquad (4.109)$$
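Because eq. (4.109) holds for an arbitrary input PDF, it can be checked against a deliberately non-Gaussian input. The exponential distribution is an assumed test case: for ξ ~ Exp(1) the cumulants are κ_s = (s − 1)!, so (4.109) predicts D_η = A²[6 + 8 + 2 + 4] = 20A².

```python
import numpy as np

# Distribution-free check of eq. (4.109) with a non-Gaussian input.
rng = np.random.default_rng(4)
A = 1.0
xi = rng.exponential(1.0, size=2_000_000)
eta = A * xi ** 2

k1, k2, k3, k4 = 1.0, 1.0, 2.0, 6.0   # cumulants of Exp(1)
theory = A ** 2 * (k4 + 4 * k1 * k3 + 2 * k2 ** 2 + 4 * k1 ** 2 * k2)
print(eta.var(), theory)              # both ~20
```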
It is worth noting that the expansion (4.109) does not depend on the particular form of the PDF p_ξ(x) of the input process, which emphasizes the power of the cumulant method.

4.4.2 Cumulant Method of Analysis of Non-Gaussian Random Processes
In the previous section, cumulant equations were used to obtain relations between the cumulants of the marginal PDF at the input and the output of a zero memory non-linear transformer. Application of the Price formula allows calculation of the correlation function. It is possible to generalize this equation [4,14,15,17,18] to the case of higher order correlation (cumulant) functions. For example, it can be shown [15] that

$$\frac{D^n \left[ \kappa_{\eta 2}(\tau) \right]}{D \left[ \kappa_{\xi(p+q)}(0_{[p]}, \tau_{[q]}) \right]^n} = \frac{1}{p!\, q!} \left\langle f^{(pn)}(x)\, f^{(qn)}(x_\tau) \right\rangle, \qquad p \neq 0,\; q \neq 0 \qquad (4.110)$$
It can be seen that the Price equation (4.98) is a particular case of eq. (4.110) with p = q = n = 1. However, its application to the case of a non-Gaussian driving process ξ(t) is limited, since it does not provide a relation involving all the cumulant functions of the input process ξ(t). In order to illustrate this statement, the example of the square law transformation considered above can be used. Once again, the transformation η(t) = Aξ²(t) is considered. Since f″(x) = 2A and all higher derivatives vanish, the right-hand side of eq. (4.110) differs from zero only when pn ≤ 2 and qn ≤ 2, and eq. (4.110) produces the following five differential equations with respect to the cumulants

$$\frac{D^2 \left[ C_\eta(\tau) \right]}{D \left[ C_\xi(\tau) \right]^2} = \frac{A^2}{(1!\,1!)^2} \left\langle \frac{d^2 x^2}{dx^2}\, \frac{d^2 x_\tau^2}{dx_\tau^2} \right\rangle = 4A^2$$
$$\frac{D \left[ C_\eta(\tau) \right]}{D \left[ \kappa_4(0_{[2]}, \tau_{[2]}) \right]} = \frac{1}{(2!)^2} \langle 4A^2 \rangle = A^2$$
$$\frac{D \left[ C_\eta(\tau) \right]}{D \left[ C_\xi(\tau) \right]} = \frac{1}{1!\,1!} \langle 2Ax \cdot 2Ax_\tau \rangle = 4A^2 \left[ C_\xi(\tau) + m_{\xi 1}^2 \right]$$
$$\frac{D \left[ C_\eta(\tau) \right]}{D \left[ \kappa_3(0, \tau_{[2]}) \right]} = \frac{1}{2!\,1!} \langle 2Ax \cdot 2A \rangle = 2A^2 m_{\xi 1}$$
$$\frac{D \left[ C_\eta(\tau) \right]}{D \left[ \kappa_3(0_{[2]}, \tau) \right]} = 2A^2 m_{\xi 1} \qquad (4.111)$$
If all the cumulants κ_{ξs}, s ≥ 1, are equal to zero, then the random processes ξ(t) and η(t) are identically zero, ξ(t) = η(t) = 0. Having this in mind, the following expression for the covariance function can be obtained from eq. (4.111)

$$C_\eta(\tau) = A^2 \left\{ \kappa_4(0_{[2]}, \tau_{[2]}) + 2 m_{\xi 1} \left[ \kappa_3(0_{[2]}, \tau) + \kappa_3(0, \tau_{[2]}) \right] + 4 m_{\xi 1}^2\, C_\xi(\tau) + 2 \left[ C_\xi(\tau) \right]^2 \right\} \qquad (4.112)$$

If the input process is a Gaussian process, then all the cumulants of order s ≥ 3 are equal to zero and eq. (4.112) further simplifies to

$$C_\eta(\tau) = A^2 \left\{ 2 \left[ C_\xi(\tau) \right]^2 + 4 m_{\xi 1}^2\, C_\xi(\tau) \right\} \qquad (4.113)$$
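Equation (4.113) can be checked by simulation; the parameter values below are assumptions for the test.

```python
import numpy as np

# Simulation check of eq. (4.113): for jointly Gaussian xi(t), xi(t+tau) with
# mean m, variance D and covariance C, the square-law output eta = A*xi**2
# has C_eta = A**2 * (2*C**2 + 4*m**2*C).
rng = np.random.default_rng(5)
A, m, D, C = 1.0, 1.0, 1.0, 0.6
cov = np.array([[D, C], [C, D]])
x = rng.multivariate_normal([m, m], cov, size=1_000_000)
eta = A * x ** 2

mc = np.cov(eta[:, 0], eta[:, 1])[0, 1]
theory = A ** 2 * (2 * C ** 2 + 4 * m ** 2 * C)
print(mc, theory)                     # both ~3.12
```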
Of course, the same result can be obtained from the Price formula. Equation (4.113) remains valid if some non-Gaussian process has its skewness and kurtosis equal to zero, i.e. κ₃ = κ₄ = 0, even if all other higher order cumulants are non-zero.

Example: square law detection of a telegraph signal. Let the process ξ(t) be a random telegraph signal with the average bit rate R_b and equal probability of 1s and 0s, encoded by voltage values ±a. This signal has a symmetric PDF and a covariance function of the exponential type

$$C_\xi(\tau) = a^2 e^{-2 R_b |\tau|} \qquad (4.114)$$
The cumulant function of the fourth order was obtained in [17] as

$$\kappa_4(0_{[2]}, \tau_{[2]}) = -2 a^4 e^{-4 R_b |\tau|} \qquad (4.115)$$

Application of eq. (4.112) results in the following expression for the output covariance function:

$$C_\eta(\tau) = A^2 \left\{ -2 a^4 e^{-4 R_b |\tau|} + 2 a^4 e^{-4 R_b |\tau|} \right\} \equiv 0 \qquad (4.116)$$

This result is easy to understand, since the output process is the constant process η(t) = Aa². Thus, such compensation of cumulant functions is possible only for non-Gaussian processes: for an input Gaussian process, the output process is always random and, thus, C_η(τ) ≢ 0.

Example: exponential detector. In this case the transformation under consideration is

$$\eta(t) = \exp[a\, \xi(t)], \qquad f(x) = \exp(ax) \qquad (4.117)$$
It follows from eq. (4.110) that

$$\frac{D \left[ C_\eta(\tau) \right]}{D \left[ \kappa_{p+q}(0_{[p]}, \tau_{[q]}) \right]} = \frac{a^{p+q}}{p!\, q!}\; C_\eta(\tau) \qquad (4.118)$$

and its solution is given by

$$C_\eta(\tau) = \exp\left\{ \sum_{s=1}^{\infty} \frac{a^s}{s!} \sum_{i=0}^{s} C_s^i\, \kappa_s(0_{[s-i]}, \tau_{[i]}) \right\} \qquad (4.119)$$

Thus, in contrast to the square law detector, the correlation function of the output process η(t) depends on all the cumulant functions κ_s(0_{[s−i]}, τ_{[i]}) of the input process ξ(t). This can be explained by the fact that the exponential function has an infinite number of terms in its Taylor series; as a result, all the moments of the input process contribute to the resulting covariance function. It can be concluded that the method of cumulant equations is a very effective tool for the investigation of non-linear transformations of random processes.
4.5 LINEAR TRANSFORMATION OF RANDOM PROCESSES

4.5.1 General Expression for Moment and Cumulant Functions at the Output of a Linear System
The following quite general problem is to be solved in the analysis of the transformation of random processes by a linear (time-invariant) system: given the system impulse response h(t) and the statistical characteristics of the input process ξ(t), such as the joint PDF p_ξ(x₁, x₂, …, x_n) of order n, find the statistical characteristics of the output process η(t) = L[ξ(t)], such as the joint PDF p_η(y₁, y₂, …, y_k) of order k < n. There is no known general method to solve this problem, unless the transformation of a Gaussian, a spherically invariant or a Markov process is considered⁷. Instead, the problem of evaluation of moment and cumulant functions can be studied.
4.5.1.1 Transformation of Moment and Cumulant Functions
The solution of the general problem for a Gaussian driving process ξ(t) is significantly simplified, since it is known that the output process η(t) is also Gaussian, and one has to find only the corresponding average m_η(t) and the output covariance function C_η(t₁, t₂). Let ξ(t) be the input of an LTI system⁸ with the impulse response h(t) and zero initial conditions. Then the output random process is given by the following integral

$$\eta(t) = \int_{0}^{t} h(t - s)\, \xi(s)\, ds = \int_{0}^{t} h(s)\, \xi(t - s)\, ds \qquad (4.120)$$
This integral can be approximated by a partial sum as

$$\eta(t) \approx \sum_{k=1}^{K} h(t - s_k)\, \xi(s_k)\, \Delta s_k = \sum_{k=1}^{K} h(s_k)\, \xi(t - s_k)\, \Delta s_k, \qquad 0 < s_1 < s_2 < \cdots < s_K = t, \quad \Delta s_k = s_k - s_{k-1} \qquad (4.121)$$

Averaging both sides and taking into account that the average of a sum is the sum of the averages of the summands, one obtains

$$m_\eta(t) \approx \sum_{k=1}^{K} h(t - s_k)\, m_\xi(s_k)\, \Delta s_k = \sum_{k=1}^{K} h(s_k)\, m_\xi(t - s_k)\, \Delta s_k \qquad (4.122)$$
Since only deterministic functions are present in this relation, one can easily take the limit of both sides, assuming that all Δs_k → 0:

$$m_\eta(t) = \int_{0}^{t} h(t - s)\, m_\xi(s)\, ds = \int_{0}^{t} h(s)\, m_\xi(t - s)\, ds \qquad (4.123)$$
This property reflects the fact that the operations of statistical averaging and time integration are interchangeable.

⁷ There are also a number of particular examples which can be solved completely.
⁸ It is assumed here that ξ(t) = 0 if t < 0.
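Equation (4.123) is easy to validate numerically. The sketch below assumes h(t) = exp(−t) and a constant input mean m_ξ (both illustrative choices), for which the integral evaluates in closed form to m_η(t) = m_ξ(1 − e^{−t}).

```python
import numpy as np

# Riemann-sum evaluation of eq. (4.123) for an assumed exponential impulse
# response and a constant input mean; the exact answer is m_xi*(1 - exp(-t)).
dt = 1e-4
t = np.arange(0.0, 5.0, dt)
m_xi = 2.0

m_eta = m_xi * np.cumsum(np.exp(-t)) * dt      # running integral of (4.123)
exact = m_xi * (1.0 - np.exp(-t))
print(np.max(np.abs(m_eta - exact)))           # ~dt, discretization error
```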
Eq. (4.120) can be written for n moments of time, thus producing the following chain of equations

$$E\{\eta(t_1)\, \eta(t_2) \cdots \eta(t_n)\} = E\left\{ \int_0^{t_1} h(t_1 - s_1)\, \xi(s_1)\, ds_1 \int_0^{t_2} h(t_2 - s_2)\, \xi(s_2)\, ds_2 \cdots \int_0^{t_n} h(t_n - s_n)\, \xi(s_n)\, ds_n \right\}$$
$$= \int_0^{t_1}\!\!\int_0^{t_2}\!\cdots\!\int_0^{t_n} h(t_1 - s_1)\, h(t_2 - s_2) \cdots h(t_n - s_n)\, E\{\xi(s_1)\, \xi(s_2) \cdots \xi(s_n)\}\, ds_1\, ds_2 \cdots ds_n \qquad (4.124)$$

or, equivalently

$$m^{\eta,\eta,\ldots,\eta}_{1,1,\ldots,1}(t_1, t_2, \ldots, t_n) = \int_0^{t_1}\!\!\int_0^{t_2}\!\cdots\!\int_0^{t_n} h(t_1 - s_1)\, h(t_2 - s_2) \cdots h(t_n - s_n)\, m^{\xi,\ldots,\xi}_{1,1,\ldots,1}(s_1, s_2, \ldots, s_n)\, ds_1\, ds_2 \cdots ds_n$$
$$= \int_0^{t_1}\!\!\int_0^{t_2}\!\cdots\!\int_0^{t_n} h(s_1)\, h(s_2) \cdots h(s_n)\, m^{\xi,\ldots,\xi}_{1,1,\ldots,1}(t_1 - s_1, t_2 - s_2, \ldots, t_n - s_n)\, ds_1\, ds_2 \cdots ds_n \qquad (4.125)$$
¼ ¼
ðt ðt 0 0 ðt ðt 0 0
ðt 0 ðt 0
; ;...; hðt s1 Þhðt s2 Þ hðt sn Þm1; 1;...; 1 ðs1 ; s2 ; . . . ; sn Þds1 ds2 dsn ;;...; hðs1 Þhðs2 Þ hðsn Þm1;1;...;1 ðt s1 ; t s2 ; . . . ; t sn Þds1 ds2 . . . dsn
ð4:126Þ Characteristically, this moment depends on the mixed moment function m;1; ;...; 1;...; 1 ðs1 ; s2 ; . . . ; sn Þ considered at n moments of time rather than on the moment mn ðtÞ, thus illustrating the effect of memory. The effect of linearity is emphasized by the fact that the output moment still depends only on a single moment function, rather than a combination of moments of different orders. Equation (4.126) allows one to understand the level of complexity of the problem of evaluation of the moments of the output process. Indeed, it indicates that if the moment function of order n is to be found, one has to measure or derive a joint moment function of the order n and perform n-fold integration. In other words, the complexity of this problem is very high unless one limits considerations only to a few lower order moments. The expression for the correlation function can be obtained from eq. (4.126) by setting n ¼ 2, thus producing R ðt1 ; t2 Þ ¼ m; 1;1 ðt1 ; t2 Þ ¼
ð t1 ð t2 0
0
hðs1 Þhðs2 ÞR ðt1 s1 ; t2 s2 Þds1 ds2
ð4:127Þ
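The discrete-time analogue of eq. (4.127) can be checked directly. Assuming white input noise and a short FIR impulse response (both illustrative choices), the stationary output correlation at lag d reduces to the deterministic sum σ² Σ_k h[k] h[k + d].

```python
import numpy as np

# Discrete version of eq. (4.127): white noise with variance s2 through an
# FIR filter h gives R_eta[d] = s2 * sum_k h[k] * h[k + d].
rng = np.random.default_rng(6)
s2 = 1.0
h = np.array([1.0, 0.8, 0.5, 0.2])
xi = rng.normal(0.0, np.sqrt(s2), size=2_000_000)
eta = np.convolve(xi, h, mode="full")[: xi.size]

d = 1
mc = np.mean(eta[:-d] * eta[d:])
theory = s2 * np.sum(h[:-d] * h[d:])
print(mc, theory)                     # both ~1.3
```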
In a very similar manner, equations for the mutual moments between the input and the output processes can be written as

$$m^{\xi,\ldots,\xi,\eta,\ldots,\eta}_{1,\ldots,1,1,\ldots,1}(t'_1, \ldots, t'_m; t_1, \ldots, t_n) = E\{\xi(t'_1) \cdots \xi(t'_m)\, \eta(t_1) \cdots \eta(t_n)\}$$
$$= \int_0^{t_1}\!\!\int_0^{t_2}\!\cdots\!\int_0^{t_n} h(s_1)\, h(s_2) \cdots h(s_n)\, m^{\xi,\ldots,\xi}_{1,\ldots,1}(t'_1, \ldots, t'_m; t_1 - s_1, t_2 - s_2, \ldots, t_n - s_n)\, ds_1\, ds_2 \cdots ds_n \qquad (4.128)$$
In particular, the mutual correlation function R_{ξη}(t₁, t₂) can be expressed as

$$R_{\xi\eta}(t_1, t_2) = m^{\xi,\eta}_{1,1}(t_1, t_2) = \int_0^{t_2} h(s)\, R_\xi(t_1, t_2 - s)\, ds \qquad (4.129)$$
It is also possible to obtain expressions for the centred moments. The first step is to subtract eq. (4.123) from eq. (4.120) to obtain the processes ξ⁰(t) = ξ(t) − m_ξ(t) and η⁰(t) = η(t) − m_η(t) with zero mean:

$$\eta^0(t) = \int_0^t h(t - s)\, \xi(s)\, ds - \int_0^t h(t - s)\, m_\xi(s)\, ds = \int_0^t h(t - s) \left[ \xi(s) - m_\xi(s) \right] ds = \int_0^t h(t - s)\, \xi^0(s)\, ds \qquad (4.130)$$
Thus, the relation between the centred processes obeys the same equation as that between the non-centred processes. This means that all the derivations for moments and mutual moments remain valid for centred moments; the only difference is that centred moments must be used on the right-hand side of eqs. (4.124)–(4.129). In particular, the covariance C_η(t₁, t₂) and the mutual covariance C_{ξη}(t₁, t₂) can be obtained as follows

$$C_\eta(t_1, t_2) = \int_0^{t_1}\!\!\int_0^{t_2} h(t_1 - s_1)\, h(t_2 - s_2)\, C_\xi(s_1, s_2)\, ds_1\, ds_2 = \int_0^{t_1}\!\!\int_0^{t_2} h(s_1)\, h(s_2)\, C_\xi(t_1 - s_1, t_2 - s_2)\, ds_1\, ds_2 \qquad (4.131)$$

$$C_{\xi\eta}(t_1, t_2) = \int_0^{t_2} h(t_2 - s)\, C_\xi(t_1, s)\, ds = \int_0^{t_2} h(s)\, C_\xi(t_1, t_2 - s)\, ds \qquad (4.132)$$
Thus, the variance of the output process is given by

$$D_\eta(t) = \int_0^{t}\!\!\int_0^{t} h(t - s_1)\, h(t - s_2)\, C_\xi(s_1, s_2)\, ds_1\, ds_2 = \int_0^{t}\!\!\int_0^{t} h(s_1)\, h(s_2)\, C_\xi(t - s_1, t - s_2)\, ds_1\, ds_2 \qquad (4.133)$$

If a linear time-variant system is considered, the above remains valid if the impulse response h(t) is substituted by the system function h(t, τ).
4.5.1.2 Linear Time-Invariant System Driven by a Stationary Process
The results obtained in the previous section can be further simplified if the input process ξ(t) is stationary. In this case, the expressions for the mean, the covariance function, the mutual covariance and the variance can be written as

$$m_\eta(t) = m_\xi \int_0^{t} h(s)\, ds \qquad (4.134)$$

$$C_\eta(t_1, t_2) = \int_0^{t_1}\!\!\int_0^{t_2} h(s_1)\, h(s_2)\, C_\xi(t_2 - t_1 - s_2 + s_1)\, ds_1\, ds_2 \qquad (4.135)$$

$$C_{\xi\eta}(t_1, t_2) = \int_0^{t_2} h(s)\, C_\xi(t_2 - t_1 - s)\, ds \qquad (4.136)$$

$$D_\eta(t) = \int_0^{t}\!\!\int_0^{t} h(s_1)\, h(s_2)\, C_\xi(s_1 - s_2)\, ds_1\, ds_2 \qquad (4.137)$$
These equations show that even if the input process is stationary, the output process could be non-stationary. This is very similar to the deterministic case when a constant input, applied at t ¼ 0, produces a transient process and the steady state is achieved only at t ¼ 1. However, in most realistic applications the impulse response hðtÞ decays fast enough to justify the steady state and the stationarity achieved after a time greater than the time scale of the impulse response. For example, if the impulse response has an exponential form hðtÞ ¼ expðt=corr Þ, t 0, it is reasonable to assume that at the moment of time t corr the transient process has been completed and the steady state, or stationary regime, is in place. Indeed, in this case, the finite limit of integration in eq. (4.134) can be substituted by infinite limits, thus producing m m
ð1 0
hðsÞds ¼ m Hðj!Þj! ¼ 0
ð4:138Þ
That is, the mean value does not depend on time. In order to obtain the expression for the covariance function, one can introduce the time lag \(\tau=t_2-t_1\) and assume that \(t_1\gg\tau_{\mathrm{corr}}\). Changing the variable of integration and assuming infinite limits of integration, one obtains
\[
C_\eta(t_1,t_2) = \int_0^{\infty}\!\!\int_0^{\infty} h(s_1)h(s_2)\,C_\xi(\tau-s_1+s_2)\,ds_1 ds_2 = C_\eta(\tau) \tag{4.139}
\]
Therefore, after a transitional period, the output of a linear system driven by a wide sense stationary random process also becomes a wide sense stationary process; in other words, the output process is an asymptotically wide sense stationary process. Equation (4.139) also allows the relation between the power spectral densities of the input and the output processes to be obtained, by taking the Fourier transform of both sides.
ADVANCED TOPICS IN RANDOM PROCESSES
Taking the Fourier transform of both sides of eq. (4.139), one obtains the required relation:
\[
S_\eta(\omega) = \int_{-\infty}^{\infty} C_\eta(\tau)\exp(-j\omega\tau)\,d\tau
= \int_0^{\infty}\!\exp(\,j\omega s_1)\,h(s_1)\,ds_1 \int_0^{\infty}\!\exp(-j\omega s_2)\,h(s_2)\,ds_2 \int_{-\infty}^{\infty} C_\xi(\tau-s_1+s_2)\exp[-j\omega(\tau-s_1+s_2)]\,d\tau \tag{4.140}
\]
This can be rewritten in the equivalent form in terms of the transfer function \(H(j\omega)\):
\[
S_\eta(\omega) = H(j\omega)\,H(-j\omega)\,S_\xi(\omega) = |H(j\omega)|^2\, S_\xi(\omega) \tag{4.141}
\]
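Relation (4.141) is easy to check numerically. The following sketch (not from the book; the filter taps are an arbitrary assumption) drives a short discrete-time FIR filter with white noise and compares the averaged periodogram of the output against \(|H(j\omega)|^2 S_\xi(\omega)\):

```python
import numpy as np

# Sketch: verify S_eta(w) = |H(jw)|^2 S_xi(w) for a discrete-time FIR filter
# driven by unit-variance white noise (so S_xi(w) = 1).  Averaging many
# periodograms approximates the output PSD.
rng = np.random.default_rng(0)
h = np.array([0.5, 0.3, 0.2])          # assumed (arbitrary) impulse response
N, trials = 1024, 400
S_out = np.zeros(N)
for _ in range(trials):
    xi = rng.standard_normal(N)        # white input
    eta = np.convolve(xi, h)[:N]       # filtered output
    S_out += np.abs(np.fft.fft(eta))**2 / N
S_out /= trials
H = np.fft.fft(h, N)                   # transfer function on the same grid
S_theory = np.abs(H)**2                # eq. (4.141) with S_xi = 1
err = np.max(np.abs(S_out - S_theory)) / np.max(S_theory)
print(err)
```

With a few hundred averaged periodograms the empirical PSD matches \(|H|^2\) to within statistical fluctuations of a few per cent.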
Thus, the spectral density of the output process is that of the input multiplied by the squared absolute value of the transfer function. This fundamental result makes the PSD an effective tool for calculating the parameters of the output process of a linear system. It is interesting to note that the transformation of the spectral densities is invariant with respect to the phase characteristic of the linear system. In other words, it is impossible, in general, to recover the transfer function of the system by measuring only the PSDs of the input and the output. This limitation is related to the fact that the PSD is defined through averaging of the Fourier transform of the process and therefore does not completely represent the properties of the linear system. In many cases, however, the phase characteristic can be restored if some additional information about the properties of the system is available [13]. Equations (4.136) and (4.137) can be simplified to produce the following expressions for the variance and the mutual covariance function in the stationary regime:
\[
D_\eta = \int_0^{\infty}\!\!\int_0^{\infty} h(s_1)h(s_2)\,C_\xi(s_1-s_2)\,ds_1 ds_2 \tag{4.142}
\]
\[
C_{\xi\eta}(\tau) = \int_0^{\infty} h(s)\,C_\xi(\tau-s)\,ds \tag{4.143}
\]
The last equation indicates that the input and the output processes are stationarily related.

Example: filtering of WGN. One of the most frequently considered problems is the filtering of WGN by a filter with impulse response \(h(t)\) and transfer function \(H(j\omega)\). In this case the output process is a zero-mean Gaussian process, since \(m_\xi=0\) in eq. (4.138). The PSD of the output process follows from eq. (4.141) by noting that the PSD of WGN is the constant \(N_0/2\):
\[
S_\eta(\omega) = |H(j\omega)|^2 S_\xi(\omega) = \frac{N_0}{2}\,|H(j\omega)|^2 \tag{4.144}
\]
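A quick numerical sanity check (a sketch under the usual discrete approximation of WGN: i.i.d. Gaussian samples of variance \(N_0/(2\Delta t)\); the exponential impulse response is an assumed example, not taken from the text):

```python
import numpy as np

# Sketch: WGN with PSD N0/2 through an assumed filter h(t) = exp(-alpha*t).
# The steady-state output variance should approach (N0/2) * int h^2 dt.
rng = np.random.default_rng(1)
N0, alpha, dt = 2.0, 5.0, 0.01
t = np.arange(0, 2.0, dt)
h = np.exp(-alpha * t)
trials = 1000
xi = rng.standard_normal((trials, t.size)) * np.sqrt(N0 / (2 * dt))
eta_end = xi @ h[::-1] * dt            # output sample at the final time
D_sim = eta_end.var()
D_theory = N0 / 2 * np.sum(h**2) * dt  # discrete version of N0/2 * int h^2 dt
print(D_sim, D_theory)
```

Here `eta_end` is the convolution sum \(\sum_k h(t-s_k)\,\xi(s_k)\,\Delta t\) evaluated at the last time instant, well past the transient.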
The covariance function of the output process \(\eta(t)\) can be obtained either by taking the inverse Fourier transform of \(S_\eta(\omega)\) or by applying eq. (4.139) and taking into account the fact that WGN is delta correlated:
\[
C_\eta(\tau) = \frac{N_0}{2}\int_0^{\infty} h(s)\,h(\tau+s)\,ds, \qquad D_\eta = \frac{N_0}{2}\int_0^{\infty} h^2(s)\,ds \tag{4.145}
\]

Example: sliding averaging of a stationary random process. Let the output of a linear system be the integral of the input process over a finite interval of time, the so-called sliding average of the process:
\[
\eta(t) = \frac{1}{2T}\int_{t-T}^{t+T} \xi(s)\,ds \tag{4.146}
\]
The random process \(\xi(t)\) is assumed to be wide sense stationary with mean \(m_\xi\) and covariance function \(C_\xi(\tau)\). The mean value of the process \(\eta(t)\) is thus
\[
m_\eta = \frac{1}{2T}\int_{t-T}^{t+T} E\{\xi(s)\}\,ds = m_\xi \tag{4.147}
\]
The expression for the covariance function is found to be
\[
C_\eta(\tau) = E\{(\eta(t)-m_\eta)(\eta(t+\tau)-m_\eta)\} = \frac{1}{4T^2}\int_{t-T}^{t+T}\!\!\int_{t+\tau-T}^{t+\tau+T} C_\xi(s_1-s_2)\,ds_1 ds_2 \tag{4.148}
\]
Using the expression of the covariance function in terms of the PSD,
\[
C_\xi(s_1-s_2) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_\xi(\omega)\exp[\,j\omega(s_1-s_2)]\,d\omega \tag{4.149}
\]
equation (4.148) can be rewritten as
\[
C_\eta(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_\xi(\omega)\,d\omega\;\frac{1}{4T^2}\int_{t+\tau-T}^{t+\tau+T}\exp(\,j\omega s_1)\,ds_1 \int_{t-T}^{t+T}\exp(-j\omega s_2)\,ds_2
= \frac{1}{2\pi}\int_{-\infty}^{\infty} \left(\frac{\sin\omega T}{\omega T}\right)^{\!2} S_\xi(\omega)\exp(\,j\omega\tau)\,d\omega \tag{4.150}
\]
This also implies that the PSD \(S_\eta(\omega)\) of the output signal is
\[
S_\eta(\omega) = \left(\frac{\sin\omega T}{\omega T}\right)^{\!2} S_\xi(\omega) \tag{4.151}
\]
Since the function \(\sin\omega T/\omega T\) is concentrated around small values of the frequency, \(\omega \lesssim \pi/T\), the sliding average removes high-frequency components of the signal spectrum, thus eliminating fast, noise-like components.
The same results can be obtained by noting that the impulse response of the system described by eq. (4.146) is
\[
h(t) = \begin{cases} \dfrac{1}{2T}, & |t| < T \\[4pt] 0, & |t| > T \end{cases} \tag{4.152}
\]
and the corresponding transfer function is thus
\[
H(j\omega) = \frac{\sin\omega T}{\omega T} \tag{4.153}
\]
With this in mind, eq. (4.139) can now be written as
\[
C_\eta(\tau) = \frac{1}{2T}\int_{-2T}^{2T}\left(1-\frac{|s|}{2T}\right) C_\xi(\tau-s)\,ds \tag{4.154}
\]
Setting \(\tau=0\), the expression for the variance of the output process is obtained as
\[
D_\eta = C_\eta(0) = \frac{1}{T}\int_0^{2T}\left(1-\frac{s}{2T}\right) C_\xi(s)\,ds \tag{4.155}
\]
since \(C_\xi(\tau)\) is an even function. A slight modification of eq. (4.146) shows that the variance of the integrated process
\[
\eta(t) = \frac{1}{T}\int_0^{T} \xi(s)\,ds \tag{4.156}
\]
is given by
\[
D_\eta(T) = \frac{2}{T}\int_0^{T}\left(1-\frac{\tau}{T}\right) C_\xi(\tau)\,d\tau = \frac{2D_\xi}{T}\int_0^{T}\left(1-\frac{\tau}{T}\right) r_\xi(\tau)\,d\tau \tag{4.157}
\]
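Equation (4.157) can be checked by simulation. The sketch below (an assumed test case, not from the book) uses an exponentially correlated Gaussian process, \(C_\xi(\tau)=D\exp(-\alpha\tau)\), generated by a first-order autoregressive recursion, and compares the variance of the running average with the formula:

```python
import numpy as np

# Sketch of eq. (4.157) for an assumed process with C(tau) = D*exp(-alpha*tau):
# the variance of the time average (1/T) * int_0^T xi(s) ds over many
# realizations is compared against the formula.
rng = np.random.default_rng(2)
alpha, D, T, dt, trials = 1.0, 1.0, 5.0, 0.01, 4000
n = int(T / dt)
rho = np.exp(-alpha * dt)
x = rng.standard_normal(trials) * np.sqrt(D)      # stationary start
avg = np.zeros(trials)
for _ in range(n):                                # AR(1) recursion
    avg += x
    x = rho * x + np.sqrt(D * (1 - rho**2)) * rng.standard_normal(trials)
avg /= n
tau = np.arange(n) * dt
D_theory = 2 / T * np.sum((1 - tau / T) * D * np.exp(-alpha * tau)) * dt
print(avg.var(), D_theory)
```

As the covariance decays, both values shrink like \(2D/(\alpha T)\), illustrating wide sense ergodicity of the average.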
This equation is instrumental in the ergodic theory of random processes. In particular, it allows one to determine whether the variance of the time average approaches zero, i.e. whether the values of the average converge to the quantity under consideration. Equation (4.157) indicates that if the covariance function decays fast enough, the process is ergodic in the wide sense.

Example: reaction of an RC circuit to WGN. Let WGN drive the RC circuit shown in Fig. 4.4. The WGN \(\xi(t)\), applied at the moment of time \(t_0=0\), has constant PSD \(N_0/2\) and covariance function
\[
C_\xi(\tau) = \frac{N_0}{2}\,\delta(\tau) \tag{4.158}
\]
Figure 4.4 Reaction of an RC circuit.
The output voltage \(\eta(t)\) is measured across the capacitor \(C\) and obeys the following differential equation:
\[
\frac{d}{dt}\eta(t) + \alpha\,\eta(t) = \xi(t), \qquad \alpha = \frac{1}{RC} \tag{4.159}
\]
The general solution, given the initial condition \(\eta_0=\eta(t_0)\), is well known:
\[
\eta(t) = \eta_0\exp(-\alpha t) + \exp(-\alpha t)\int_0^{t}\exp(\alpha s)\,\xi(s)\,ds \tag{4.160}
\]
Since the output process \(\eta(t)\) is a Gaussian process for a fixed initial condition, it is only necessary to calculate its mean and covariance functions to describe it completely. In order to account for a random initial condition \(\eta(t_0)=\eta_0\), it is assumed that the distribution of the random variable \(\eta_0\) is given by a known PDF \(p_{\eta_0}(\eta)\). A deterministic initial condition is the particular case \(p_{\eta_0}(\eta)=\delta(\eta-\eta_0)\). The first step is to solve the problem for a deterministic \(\eta_0\). In this case, taking the expectation of both sides of the solution (4.160), one obtains the mean value \(m_\eta(t\,|\,\eta_0)\) of the output process:
\[
m_\eta(t\,|\,\eta_0) = \eta_0\exp(-\alpha t) \tag{4.161}
\]
since the average of the WGN is zero. Using the definition of the covariance function, one calculates
\[
C_\eta(t_1,t_2) = E\{[\eta(t_1)-m_\eta(t_1|\eta_0)][\eta(t_2)-m_\eta(t_2|\eta_0)]\}
= \frac{N_0}{2}\exp[-\alpha(t_1+t_2)]\int_0^{t_1}\!\!\int_0^{t_2}\exp[\alpha(s_1+s_2)]\,\delta(s_2-s_1)\,ds_1 ds_2 \tag{4.162}
\]
Setting \(t_2 = t_1+\tau = t+\tau\) and using the sifting property of the delta function, expression (4.162) becomes
\[
C_\eta(t,t+\tau) = \frac{N_0}{4\alpha}\exp(-\alpha|\tau|)\,[1-\exp(-2\alpha t)] \tag{4.163}
\]
This immediately produces an expression for the variance of the output process as a function of time:
\[
D_\eta(t) = \frac{N_0}{4\alpha}\,[1-\exp(-2\alpha t)] \tag{4.164}
\]
For the asymptotically stationary regime, i.e. for \(t\to\infty\), eqs. (4.161) and (4.163)–(4.164) reduce to
\[
m_\eta(t\,|\,\eta_0) = 0, \qquad D_\eta = \frac{N_0}{4\alpha}, \qquad C_\eta(\tau) = \frac{N_0}{4\alpha}\exp(-\alpha|\tau|) \tag{4.165}
\]
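A minimal simulation of the transient variance (4.164), using an Euler scheme for eq. (4.159) with the discrete-WGN approximation (step size and parameter values are assumptions for illustration):

```python
import numpy as np

# Sketch: Euler scheme for d(eta)/dt + alpha*eta = xi(t) driven by approximate
# WGN, checking the transient variance D(t) = N0/(4 alpha)*(1 - exp(-2 alpha t))
# of eq. (4.164) for a deterministic initial condition eta0 = 0.
rng = np.random.default_rng(3)
alpha, N0, dt, trials = 2.0, 4.0, 1e-3, 5000
steps = int(1.0 / dt)
eta = np.zeros(trials)
for _ in range(steps):
    xi = rng.standard_normal(trials) * np.sqrt(N0 / (2 * dt))
    eta += (-alpha * eta + xi) * dt
t = steps * dt
D_sim = eta.var()
D_theory = N0 / (4 * alpha) * (1 - np.exp(-2 * alpha * t))
print(D_sim, D_theory)
```

The empirical variance tracks the analytical transient and saturates at \(N_0/(4\alpha)\), in agreement with eq. (4.165).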
Thus, the stationary mean and variance do not depend on the value of the initial condition; the transient characteristics, however, do depend on \(\eta_0\). In order to account for the effect of random initial conditions, one can split solution (4.160) into two parts, \(\eta(t)=\eta_1(t)+\eta_2(t)\), where
\[
\eta_1(t) = \eta_0\exp(-\alpha t), \qquad \eta_2(t) = \exp(-\alpha t)\int_0^{t}\exp(\alpha s)\,\xi(s)\,ds \tag{4.166}
\]
The process \(\eta_2(t)\) is a linear transformation of a Gaussian process and is thus a Gaussian process itself. It is easy to see from the analysis above that this process has zero mean and the covariance function defined by eq. (4.165). The process \(\eta_1(t)\) has the PDF
\[
p_{\eta_1}(x) = \exp(\alpha t)\,p_{\eta_0}[\exp(\alpha t)\,x] \tag{4.167}
\]
which follows the shape of the PDF of the initial condition but with a scale that varies in time. The mean and the variance are given by
\[
m_{\eta_1}(t) = m_0\exp(-\alpha t), \qquad D_{\eta_1}(t) = D_0\exp(-2\alpha t) \tag{4.168}
\]
respectively. Here \(m_0\) and \(D_0\) are the mean and the variance of the initial condition \(\eta_0\). The resulting distribution can be found as the composition of the distributions of \(\eta_1(t)\) and \(\eta_2(t)\). It is important to mention that the situation described above is common to all linear systems driven by a Gaussian process: the stationary solution does not depend on the initial state of the system, and it is a Gaussian process.

Example: derivative of a Gaussian process. Properties of the derivative \(\eta(t)=\xi'(t)\) of a random process \(\xi(t)\) were considered above in section 4.1.2. To complete those considerations, the spectral domain analysis is presented here. Since the transfer function of the ideal differentiator is
\[
H(j\omega) = j\omega \tag{4.169}
\]
the expression for the PSD \(S_\eta(\omega)\) of the output process can be written as
\[
S_\eta(\omega) = |H(j\omega)|^2 S_\xi(\omega) = \omega^2 S_\xi(\omega) \tag{4.170}
\]
It is easy to see that this result is equivalent to eq. (4.10), derived there by different means.
4.5.2 Analysis of Linear MIMO Systems
Analysis of MIMO systems can be conducted within the same framework as in the previous section. In general, it is impossible to provide expressions for the joint probability densities of the output vector process, but it is feasible to obtain expressions for moments and cumulants. Let \(\xi_n(t)\), \(1\le n\le N\), be the random process acting at the \(n\)-th input of a MIMO system with \(N\) inputs and \(M\) outputs. If the impulse response between the \(n\)-th input and the \(m\)-th output is denoted by \(h_{mn}(t)\), the response \(\eta_m(t)\) of the system at the \(m\)-th output is given by
\[
\eta_m(t) = \sum_{n=1}^{N}\int_0^{t} h_{mn}(\tau_n)\,\xi_n(t-\tau_n)\,d\tau_n \tag{4.171}
\]
Thus, the average value \(m_{\eta_m}(t)\) of the \(m\)-th output and the mutual correlation function \(R_{\eta_{m_1}\eta_{m_2}}(t_1,t_2)\) between two outputs can be calculated as follows:
\[
m_{\eta_m}(t) = \sum_{n=1}^{N}\int_0^{t} h_{mn}(\tau_n)\,m_{\xi_n}(t-\tau_n)\,d\tau_n \tag{4.172}
\]
and
\[
R_{\eta_{m_1}\eta_{m_2}}(t_1,t_2) = E\{\eta_{m_1}(t_1)\,\eta_{m_2}(t_2)\}
= \sum_{n_2=1}^{N}\sum_{n_1=1}^{N}\int_0^{t_1}\!\!\int_0^{t_2} h_{m_1 n_1}(\tau_{n_1})\,h_{m_2 n_2}(\tau_{n_2})\,R_{\xi_{n_1}\xi_{n_2}}(t_1-\tau_{n_1},\,t_2-\tau_{n_2})\,d\tau_{n_1} d\tau_{n_2} \tag{4.173}
\]
The same equation holds for the centred processes and, thus, for the mutual covariance functions \(C_{\eta_{m_1}\eta_{m_2}}(t_1,t_2)\):
\[
C_{\eta_{m_1}\eta_{m_2}}(t_1,t_2) = \sum_{n_2=1}^{N}\sum_{n_1=1}^{N}\int_0^{t_1}\!\!\int_0^{t_2} h_{m_1 n_1}(\tau_{n_1})\,h_{m_2 n_2}(\tau_{n_2})\,C_{\xi_{n_1}\xi_{n_2}}(t_1-\tau_{n_1},\,t_2-\tau_{n_2})\,d\tau_{n_1} d\tau_{n_2} \tag{4.174}
\]
If the input signals are stationarily related, the expression for the mutual covariance function simplifies somewhat, with \(t_2=t_1+\tau\), to
\[
C_{\eta_{m_1}\eta_{m_2}}(t_1,t_1+\tau) = \sum_{n_2=1}^{N}\sum_{n_1=1}^{N}\int_0^{t_1}\!\!\int_0^{t_1+\tau} h_{m_1 n_1}(\tau_{n_1})\,h_{m_2 n_2}(\tau_{n_2})\,C_{\xi_{n_1}\xi_{n_2}}(\tau-\tau_{n_2}+\tau_{n_1})\,d\tau_{n_1} d\tau_{n_2} \tag{4.175}
\]
Finally, as \(t_1\to\infty\), the expression for the asymptotically stationary mutual covariance function becomes
\[
C_{\eta_{m_1}\eta_{m_2}}(\tau) = \sum_{n_2=1}^{N}\sum_{n_1=1}^{N}\int_0^{\infty}\!\!\int_0^{\infty} h_{m_1 n_1}(\tau_{n_1})\,h_{m_2 n_2}(\tau_{n_2})\,C_{\xi_{n_1}\xi_{n_2}}(\tau-\tau_{n_2}+\tau_{n_1})\,d\tau_{n_1} d\tau_{n_2} \tag{4.176}
\]
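A discrete-time illustration of eq. (4.174) (all impulse responses here are arbitrary assumed FIR taps, and the inputs are taken independent and white, so the stationary zero-lag cross-covariance collapses to a sum of tap products):

```python
import numpy as np

# Sketch of eq. (4.174) in discrete time: a 2-input/2-output FIR system driven
# by independent unit-variance white inputs.  The stationary zero-lag output
# cross-covariance reduces to sum_n sum_k h_{1n}[k]*h_{2n}[k].
rng = np.random.default_rng(4)
h = {(1, 1): np.array([1.0, 0.5]), (1, 2): np.array([0.2, 0.1]),
     (2, 1): np.array([0.3, 0.3]), (2, 2): np.array([0.8, -0.4])}
N = 200_000
xi = {n: rng.standard_normal(N) for n in (1, 2)}
eta = {m: sum(np.convolve(xi[n], h[(m, n)])[:N] for n in (1, 2)) for m in (1, 2)}
C_sim = np.mean(eta[1] * eta[2])                  # empirical cross-covariance
C_theory = sum(np.dot(h[(1, n)], h[(2, n)]) for n in (1, 2))
print(C_sim, C_theory)
```

The double sum over inputs mirrors the \(\sum_{n_1}\sum_{n_2}\) structure of eq. (4.174); the off-diagonal input terms vanish here only because the inputs are independent.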
4.5.3 Cumulant Method of Analysis of Linear Transformations
It was shown in section 4.2.2 that a transformation of a random process \(\xi(t)\) by a linear system with constant parameters can be described by a linear differential equation of the form
\[
N(p)\,\eta(t) = M(p)\,\xi(t), \qquad N(p)=\sum_{k=0}^{n} a_k p^k, \qquad M(p)=\sum_{k=0}^{m} b_k p^k \tag{4.177}
\]
Here \(N(p)\) and \(M(p)\) are polynomials of orders \(n\) and \(m\) in the differentiation operator \(p=d/dt\). If \(y(t)\) and \(x(t)\) are deterministic signals, then the auxiliary variable \(z(t)\), defined as
\[
z(t) = N(p)\,y(t) = M(p)\,x(t) = \sum_{k=0}^{n} a_k \frac{d^k}{dt^k}y(t) = \sum_{k=0}^{m} b_k \frac{d^k}{dt^k}x(t) \tag{4.178}
\]
is also a deterministic signal. It is known from the theory of ordinary differential equations that the solution of eq. (4.177) is the sum of a general solution \(y_0(t)\) of the homogeneous equation \(z(t)=0\) and a particular (forced) solution \(y_p(t)\) of the non-homogeneous equation:
\[
y(t) = y_0(t) + y_p(t) \tag{4.179}
\]
If the linear system is stable⁹, then the part of the response \(y_0(t)\) induced by a non-zero initial condition \(y_0=y(t_0)\) vanishes after a period of time greater than the characteristic time constant \(\tau_s\) of the system. Thus, for \(t-t_0\gg\tau_s\), the output of the system consists only of the forced solution
\[
y(t) = y_p(t) = \int_0^{\infty} z(t-\tau)\,h(\tau)\,d\tau \tag{4.180}
\]
Here \(h(t)\) is the impulse response of the system \(z(t)=N(p)\,y(t)\). The function \(h(t)\) is also the Green's function of the operator \(N(p)\), since it satisfies the condition \(N(p)\,h(t)=\delta(t)\). Thus, the operator (4.180) describes the stationary regime of a process transformed by a linear system. By the same argument, the stationary solution of eq. (4.177) can be written as
\[
y(t) = \int_0^{\infty} h_M(\tau)\,x(t-\tau)\,d\tau \tag{4.181}
\]
where
\[
h_M(t) = M\!\left(\frac{d}{dt}\right) h(t) = \sum_{k=0}^{m} b_k \frac{d^k}{dt^k}h(t) \tag{4.182}
\]
if \(m<n\). At this point it is important to mention that the introduction of the auxiliary function \(z(t)\) does not play a significant role in the derivation of eqs. (4.181)–(4.182); however, it will be extremely useful in the calculation of the cumulants of the transformed process.

⁹ See [5] for a discussion of different criteria of stability.
The correlation functions of the systems (4.177)–(4.178) can be defined as
\[
R_h(\tau) = \int_{-\infty}^{\infty} h(t)\,h(t+\tau)\,dt, \qquad R_{h_M}(\tau) = \int_{-\infty}^{\infty} h_M(t)\,h_M(t+\tau)\,dt \tag{4.183}
\]
respectively. These and similar characteristics are often used to describe communication channels [4,17]. Similarly, moment functions \(m_{h,k}(\tau_1,\tau_2,\ldots,\tau_k)\) of linear systems can be defined as
\[
m_{h,k}(\tau_1,\tau_2,\ldots,\tau_k) = \int_{-\infty}^{\infty} h(t)\,h(t+\tau_1)\,h(t+\tau_2)\cdots h(t+\tau_k)\,dt \tag{4.184}
\]
The next step is to define the characteristic time \(\tau_s\) quantitatively. While this definition is not unique, the energy-based definition is considered here:
\[
\tau_s = \frac{1}{h_0^2}\int_0^{\infty} h^2(t)\,dt \tag{4.185}
\]
where \(h_0=\max[h(t)]\). Similarly, the correlation interval \(\tau_h\) can be defined as
\[
\tau_h = \frac{1}{R_h^2(0)}\int_0^{\infty} R_h^2(\tau)\,d\tau \tag{4.186}
\]
Instead of a time-domain description of a linear system, one can use its description in the frequency domain. In this case, the system is completely described by its transfer function \(H(j\omega)\), which in turn is the Fourier transform of the impulse response \(h(t)\):
\[
H(j\omega) = \int_{-\infty}^{\infty} h(t)\exp(-j\omega t)\,dt \tag{4.187}
\]
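For a concrete feel of these definitions, the sketch below evaluates the energy-based time constant (4.185) and the effective bandwidth of eq. (4.215) below numerically for an assumed exponential impulse response \(h(t)=\exp(-\alpha t)\), for which the closed forms are \(\tau_s=1/(2\alpha)\) and \(\Delta f_{\mathrm{eff}}=\alpha/4\):

```python
import numpy as np

# Sketch: the energy definitions of the characteristic time and the effective
# bandwidth for h(t) = exp(-alpha*t), where |H(jw)|^2 = 1/(alpha^2 + w^2).
alpha, dt = 3.0, 1e-4
t = np.arange(0, 10 / alpha, dt)
h = np.exp(-alpha * t)
tau_s = np.sum(h**2) * dt / h.max()**2                       # eq. (4.185)
w = np.linspace(0, 2000 * alpha, 400_000)
H2 = 1.0 / (alpha**2 + w**2)                                 # |H(jw)|^2
f_eff = np.sum(H2) * (w[1] - w[0]) / H2.max() / (2 * np.pi)  # eq. (4.215)
print(tau_s, 1 / (2 * alpha), f_eff, alpha / 4)
```

Note that \(\tau_s\,\Delta f_{\mathrm{eff}}=1/8\) here, i.e. the two quantities are reciprocal up to a constant of order unity, in line with the uncertainty-type relation quoted later in eq. (4.216).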
The transfer function completely describes the system response in the steady state; however, some additional information (such as initial conditions) is needed to describe the reaction to non-zero initial conditions. Assuming that the transfer function is known, the next task is the calculation of the moment and cumulant functions. At this point, it is assumed that the moment and cumulant functions \(m_{\zeta,s}(t_1,t_2,\ldots,t_s)\) and \(\kappa_{\zeta,s}(t_1,t_2,\ldots,t_s)\) of the auxiliary process \(\zeta(t)=N(p)\,\eta(t)\) are also known. If the system has deterministic non-zero initial conditions, then it will produce a deterministic relaxation process¹⁰; the random component is thus only a result of the external driving force. Thus, in both stationary and non-stationary cases, the random output signal \(\eta(t)\) can be represented as
\[
\eta(t) = \int_0^{t} h(\tau)\,\zeta(t-\tau)\,d\tau \tag{4.188}
\]

¹⁰ This deterministic process would contribute only to the first moment (cumulant), \(m_{\eta,1}(t)=\kappa_{\eta,1}(t)\).

The latter allows one to immediately write expressions for the moment and cumulant functions \(m_{\eta,s}(t)\) and \(\kappa_{\eta,s}(t)\) of the marginal distribution. Indeed, using the definition of the moments, it is possible to write
\[
m_{\eta,s}(t_1,t_2,\ldots,t_s) = E\{\eta(t_1)\eta(t_2)\cdots\eta(t_s)\}
= E\left\{\int_0^{t_1}\! h(\tau_1)\,\zeta(t_1-\tau_1)\,d\tau_1 \int_0^{t_2}\! h(\tau_2)\,\zeta(t_2-\tau_2)\,d\tau_2 \cdots \int_0^{t_s}\! h(\tau_s)\,\zeta(t_s-\tau_s)\,d\tau_s\right\}
= \int_0^{t_1}\!\!\cdots\!\int_0^{t_s} h(\tau_1)h(\tau_2)\cdots h(\tau_s)\,m_{\zeta,s}(t_1-\tau_1,\,t_2-\tau_2,\ldots,t_s-\tau_s)\,d\tau_1 d\tau_2\cdots d\tau_s \tag{4.189}
\]
The cumulant functions can be found in several ways. First of all, one can calculate them from the moments obtained through eq. (4.189), using the relations between cumulants and moments; this works well for the lower order cumulants. In the general case, it is more convenient to use the relation between moment and cumulant brackets, as discussed in [15]. This approach produces the following set of equations:
\[
\kappa_{\eta,1}(t_1) = \int_0^{t_1} h(\tau_1)\,\kappa_{\zeta,1}(t_1-\tau_1)\,d\tau_1 \tag{4.190}
\]
\[
\kappa_{\eta,2}(t_1,t_2) = \int_0^{t_1}\!\!\int_0^{t_2} h(\tau_1)h(\tau_2)\,\kappa_{\zeta,2}(t_1-\tau_1,\,t_2-\tau_2)\,d\tau_1 d\tau_2 \tag{4.191}
\]
\[
\kappa_{\eta,s}(t_1,t_2,\ldots,t_s) = \int_0^{t_1}\!\!\cdots\!\int_0^{t_s} h(\tau_1)h(\tau_2)\cdots h(\tau_s)\,\kappa_{\zeta,s}(t_1-\tau_1,\,t_2-\tau_2,\ldots,t_s-\tau_s)\,d\tau_1 d\tau_2\cdots d\tau_s \tag{4.192}
\]
It follows from eqs. (4.189) and (4.192) that the relationships between the moment and cumulant functions at the input and the output of a linear system are linear and inertial. This means that the values of the cumulants \(\kappa_{\eta,s}\) of the marginal PDF depend only on the cumulant function of the same order, considered at separate time instants. Indeed, setting all the time arguments to a fixed moment of time, \(t_k=t\), one obtains
\[
\kappa_{\eta,s}(t) = \int_0^{t}\!\!\cdots\!\int_0^{t} h(\tau_1)h(\tau_2)\cdots h(\tau_s)\,\kappa_{\zeta,s}(t-\tau_1,\,t-\tau_2,\ldots,t-\tau_s)\,d\tau_1 d\tau_2\cdots d\tau_s \tag{4.193}
\]
thus indicating the non-zero memory of linear systems. This leads to a significant change of the PDF of the process \(\eta(t)\) compared to the PDF of the input signal \(\xi(t)\), with only one remarkable exception¹¹: a linear transformation of a Gaussian random process remains Gaussian! This follows from the fact that all higher order cumulant functions of the input process are zero. In general, if some cumulant function of the process \(\xi(t)\) is zero, the cumulant function of the same order of the process \(\eta(t)\) is also equal to zero. This constitutes the so-called cumulant invariance of linear transformations [15] and emphasizes the importance of the model approximation: if only the first few cumulants are substantially non-zero (two for

¹¹ This is also true for the so-called Spherically Invariant Processes considered in Section 4.8.
the Gaussian approximation and four for the model approximation), the same is valid for the output process, i.e. both these models are sufficient to describe both the input and the output process. However, the cumulant functions themselves can change significantly, as will be seen from the example below. The stationary regime is obtained from eq. (4.192) by letting \(t\to\infty\). In this case
\[
\kappa_{\eta,s}(0,\tau_2,\ldots,\tau_s) = \int_0^{\infty}\!\!\cdots\!\int_0^{\infty} h(u_1)\cdots h(u_s)\,\kappa_{\zeta,s}(0,\,\tau_2-u_2+u_1,\ldots,\tau_s-u_s+u_1)\,du_1\cdots du_s \tag{4.194}
\]
A similar expression can be derived for the moment functions \(m_{\eta,s}(0,\tau_2,\ldots,\tau_s)\). In order to obtain particular results, one has to specify the structure of the input random process; this is considered below. It was assumed in eqs. (4.192)–(4.194) that the characteristics of the auxiliary process \(\zeta(t)\) are known. However, it is worth recalling that
\[
\zeta(t) = M\!\left(\frac{d}{dt}\right)\xi(t) = \sum_{k=0}^{m} b_k \frac{d^k}{dt^k}\xi(t) \tag{4.195}
\]
That is, the properties of the process \(\zeta(t)\) should be expressed in terms of the cumulant and moment functions of the input process \(\xi(t)\). This is easily done if one recalls that the order of statistical averaging and differentiation with respect to time can be interchanged. With this in mind, the following relation is derived:
\[
m_{\zeta,s}(t_1,t_2,\ldots,t_s) = \langle \zeta(t_1)\zeta(t_2)\cdots\zeta(t_s)\rangle
= \left\langle M\!\left(\frac{\partial}{\partial t_1}\right)\xi(t_1)\; M\!\left(\frac{\partial}{\partial t_2}\right)\xi(t_2)\cdots M\!\left(\frac{\partial}{\partial t_s}\right)\xi(t_s)\right\rangle
= \prod_{i=1}^{s} M\!\left(\frac{\partial}{\partial t_i}\right) m_{\xi,s}(t_1,\ldots,t_s) \tag{4.196}
\]
Thus, the moments of the auxiliary process \(\zeta(t)\) are obtained by applying partial differential operators to the corresponding moments of the input random process. The same holds for the cumulants:
\[
\kappa_{\zeta,s}(t_1,t_2,\ldots,t_s) = \prod_{i=1}^{s} M\!\left(\frac{\partial}{\partial t_i}\right) \kappa_{\xi,s}(t_1,\ldots,t_s) \tag{4.197}
\]
Finally, substituting eqs. (4.196) and (4.197) into eqs. (4.189) and (4.192), one obtains the desired expressions for the moments and cumulants of the output process \(\eta(t)\) in terms of the moments and cumulants of the input process \(\xi(t)\). It can be seen that if a cumulant of the input process \(\xi(t)\) is equal to zero, so is the corresponding cumulant of the auxiliary process \(\zeta(t)\); thus the cumulant invariance is preserved for the process \(\zeta(t)\). If the input process \(\xi(t)\) is Gaussian, then the output process is a Gaussian process as well, and it is sufficient to find only the second order cumulants.
Example: response of a linear circuit to a Poisson pulse flow. A Poisson pulse flow is a flow of pulses of deterministic or random shape whose appearance times obey the Poisson distribution: the probability that there are exactly \(N\) pulses in an interval of duration \(t\) is given by
\[
P_N = \frac{(\nu t)^N\exp(-\nu t)}{N!} \tag{4.198}
\]
Here \(\nu\) is the average number of pulses per unit interval. Thus, the Poisson pulse flow can be represented as
\[
\xi(t) = \sum_{i=1}^{\infty} a_i\, F_i(t-t_i) \tag{4.199}
\]
where \(F_i(t)\) is the so-called elementary pulse (random or deterministic) and \(a_i\) is the pulse magnitude (random or deterministic). It is shown in [15] that the cumulants of this process are given by
\[
\kappa_{\xi,s}(0,\tau_2,\ldots,\tau_s) = \nu\langle a^s\rangle \int_{-\infty}^{\infty} F(u)\,F(u+\tau_2)\cdots F(u+\tau_s)\,du \tag{4.200}
\]
Now, the problem is reduced to the calculation of the cumulants of the output process using relation (4.193). If the linear system has the impulse response \(h(t)\), then its response to an elementary pulse is
\[
I(t) = \int_{-\infty}^{\infty} h(\tau)\,F(t-\tau)\,d\tau \tag{4.201}
\]
Using this notation, eq. (4.193) for the output cumulants can be rewritten as
\[
\kappa_{\eta,s}(0,\tau_2,\ldots,\tau_s) = \nu\langle a^s\rangle \int_{-\infty}^{\infty} I(u)\,I(u+\tau_2)\cdots I(u+\tau_s)\,du \tag{4.202}
\]
Comparing eqs. (4.200) and (4.202), one deduces that these equations are identical in structure; the only difference is that the pulse shape \(F(t)\) is replaced by the response waveform \(I(t)\) of the linear system to the pulse \(F(t)\). This implies that the output process is also a Poisson pulse process, defined as
\[
\eta(t) = \sum_{i=1}^{\infty} a_i\, I(t-t_i) \tag{4.203}
\]
The cumulant functions \(\kappa_{\eta,s}\) of the output process \(\eta(t)\) are thus given by eq. (4.202). Setting all the lag variables to zero, \(\tau_i=0\), one obtains
\[
\kappa_{\xi,s} = \nu\langle a^s\rangle \int_{-\infty}^{\infty} F^s(t)\,dt, \qquad \kappa_{\eta,s} = \nu\langle a^s\rangle \int_{-\infty}^{\infty} I^s(t)\,dt \tag{4.204}
\]
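The output cumulants in eq. (4.204) can be verified by direct simulation. The sketch below (assumed parameters; unit pulse amplitudes, delta-shaped pulses, exponential filter) generates filtered Poisson shot noise and compares the sample second and third cumulants with \(\kappa_{\eta,s}=\nu\langle a^s\rangle\int I^s(t)\,dt=\nu/(s\alpha)\):

```python
import numpy as np

# Sketch of eq. (4.204): Poisson delta pulses (rate nu, unit amplitudes)
# filtered by an assumed h(t) = exp(-alpha*t), so I(t) = exp(-alpha*t) and
# kappa_s = nu * int I^s dt = nu/(s*alpha).
rng = np.random.default_rng(5)
nu, alpha, dt, steps = 5.0, 1.0, 0.01, 800_000
decay = np.exp(-alpha * dt)
counts = rng.poisson(nu * dt, steps)   # pulse arrivals per step
eta = np.empty(steps)
x = 0.0
for i in range(steps):                 # recursive filtering of the pulse train
    x = x * decay + counts[i]
    eta[i] = x
eta = eta[5000:]                       # discard the start-up transient
c = eta - eta.mean()
k2_sim, k3_sim = np.mean(c**2), np.mean(c**3)
print(k2_sim, nu / (2 * alpha), k3_sim, nu / (3 * alpha))
```

For shot noise the third cumulant equals the third central moment, so both comparisons are direct.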
A characteristic time constant for the cumulant function \(\kappa_s(\tau)\) can be introduced in the very same manner as for the second order cumulant function. Specifically, the durations can be defined by
\[
T_s = \frac{1}{F_0^s}\int_{-\infty}^{\infty} F^s(t)\,dt, \qquad \tilde{T}_s = \frac{1}{I_0^s}\int_{-\infty}^{\infty} I^s(t)\,dt \tag{4.205}
\]
where \(F_0=\max\{F(t)\}\) and \(I_0=\max\{I(t)\}\). Using this notation, the corresponding cumulants can be written as
\[
\kappa_{\xi,s} = \nu\langle a^s\rangle\, F_0^s\, T_s, \qquad \kappa_{\eta,s} = \nu\langle a^s\rangle\, I_0^s\, \tilde{T}_s \tag{4.206}
\]
It is clear that eq. (4.206) can be used to define normalized cumulant coefficients \(\gamma_s\), similarly to the definition introduced in section 2.2. It can also be shown that the input cumulant coefficients \(\gamma_s^{\mathrm{in}}\) and the output cumulant coefficients \(\gamma_s^{\mathrm{out}}\) are related, under the assumption \(\langle a\rangle=0\), by
\[
\gamma_s^{\mathrm{out}} = \gamma_s^{\mathrm{in}}\;\frac{\tilde{T}_s\,[T_2]^{s/2}}{[\tilde{T}_2]^{s/2}\,T_s} \tag{4.207}
\]
It is shown in [15] that
\[
\gamma_{2s}^{\mathrm{out}} \approx \gamma_{2s}^{\mathrm{in}} \left[\frac{T_2}{\tilde{T}_2}\right]^{s-1} \tag{4.208}
\]
and
\[
\gamma_{2s-1}^{\mathrm{out}} \approx \gamma_{2s-1}^{\mathrm{in}} \left[\frac{T_2}{\tilde{T}_2}\right]^{s-\frac{1}{2}} \tag{4.209}
\]
Inspection of expression (4.207) shows that the main factor contributing to the change of the PDF of the output process is the ratio between the durations of the pulses \(F(t)\) and \(I(t)\). Since this ratio is determined by the length of the impulse response of the linear system under consideration, one concludes that systems with long memory (or, equivalently, with long response time) significantly change the PDF of the output process.
4.5.4 Normalization of the Output Process by a Linear System

In this section it is shown that a linear system tends to normalize the output process, independently of the distribution of the process at the input. For simplicity, the case of a symmetric distribution of the pulse amplitudes is considered. This implies that all odd moments of \(a_i\) are equal to zero, i.e.
\[
\langle a^{2s+1}\rangle = 0 \tag{4.210}
\]
In turn, eq. (4.210) implies that all odd moments of the output process are also equal to zero, \(\langle\eta^{2s+1}\rangle=0\), and the output PDF is also a symmetric curve. In this case, eqs. (4.208)–(4.209) can be rewritten to obtain the following expression for the even cumulant coefficients, keeping in mind that the odd coefficients are equal to zero, \(\gamma_{2s+1}^{\mathrm{out}}=0\):
\[
\gamma_{2s}^{\mathrm{out}} = \gamma_{2s}^{\mathrm{in}}\left[\frac{T_2}{\tilde{T}_2}\right]^{s-1} \tag{4.211}
\]
Since, for any linear system with memory, the impulse response has non-zero duration, \(T_2<\tilde{T}_2\), which immediately implies that the relative weight of the higher order cumulants decreases with order:
\[
\gamma_{2s}^{\mathrm{out}} < \gamma_{2s}^{\mathrm{in}}, \qquad \frac{\gamma_{2s}^{\mathrm{out}}}{\gamma_{2s+2}^{\mathrm{out}}} = \frac{\gamma_{2s}^{\mathrm{in}}}{\gamma_{2s+2}^{\mathrm{in}}}\;\frac{\tilde{T}_2}{T_2} \tag{4.212}
\]
This fact shows that a linear system with memory, \(\tilde{T}_2>T_2\), suppresses the higher order cumulants¹², i.e. it normalizes the output. If the memory of the linear system is long, \(\tilde{T}_2\gg T_2\), the output process is certainly very close to a Gaussian random process. The normalization property is clearly related to the scale of the time constant \(\tau_s\) of the linear system with respect to the correlation interval of the input signal. If \(\tau_s\gg T_2\), then during the correlation interval of the input signal the impulse response of the linear system remains virtually unchanged, and thus
\[
I(t) = \int_0^{\infty} h(\tau)\,F(t-\tau)\,d\tau \approx h(t)\int_0^{\infty} F(u)\,du = h(t)\,A_F \tag{4.213}
\]
where \(A_F\) is the area of a single pulse, and
\[
\tilde{T}_2 = \frac{1}{I_0^2}\int_{-\infty}^{\infty} A_F^2\, h^2(t)\,dt = \frac{1}{h_0^2}\int_{-\infty}^{\infty} h^2(t)\,dt = \tau_s \gg T_2 \tag{4.214}
\]
Thus, the larger \(\tau_s\) is with respect to \(T_2\), the stronger the normalization of the output process. This statement can also be recast in terms of the effective bandwidth of the linear system,
\[
\Delta f_{\mathrm{eff}} = \frac{1}{K_0^2}\int_0^{\infty} |H(j\omega)|^2\,\frac{d\omega}{2\pi} \tag{4.215}
\]
where \(K_0=\max|H(j\omega)|\).

¹² Since the duration of a pulse is defined in an equal-area sense, it is possible that a linear system with memory would shorten the duration of a pulse so defined.
Since in all physical systems the following relationship is valid [19],
\[
\tau_s \approx \frac{1}{\Delta f_{\mathrm{eff}}} \tag{4.216}
\]
it follows from eq. (4.214) that a deeper normalization is achieved by a system with a narrower bandwidth. Usually the normalization property of a linear system is explained in terms of the Central Limit Theorem [1]. Indeed, the number of pulses exciting the system during a time interval \(\tilde{T}_2\) is on average equal to \(\nu\tilde{T}_2\); during the same period of time, the output process can be represented as a sum of \(\nu\tilde{T}_2\) terms on average. If \(\tilde{T}_2\gg T_2\) and \(\nu\tilde{T}_2\gg1\), then the conditions of the Central Limit Theorem are satisfied and the output process approaches a Gaussian one.

Example: normalization of a process by a linear RC circuit. The impulse response of a linear RC circuit is an exponential function,
\[
h(t) = \exp(-\alpha t), \qquad t\ge0 \tag{4.217}
\]
where
\[
\alpha = \frac{1}{RC} \tag{4.218}
\]
defines the effective bandwidth of the filter. It is assumed that the moments \(\langle a^n\rangle\) of the pulse amplitudes are finite and that the excitation pulses in eq. (4.199) are delta functions, i.e.
\[
F(t) = \delta(t) \tag{4.219}
\]
It is known that in this case the cumulants of the input process are delta functions as well [15]:
\[
\kappa_{\xi,1}(t) = 0, \qquad \kappa_{\xi,s}(t_1,t_2,\ldots,t_s) = D_s\,\delta(t_2-t_1)\cdots\delta(t_s-t_1) \tag{4.220}
\]
where
\[
D_s = \nu\langle a^s\rangle \tag{4.221}
\]
Taking into account that all the cumulants are either zero or delta functions, integration of eq. (4.202) results in
\[
\kappa_{\eta,1} = 0, \qquad \kappa_{\eta,s} = \frac{D_s}{s\,\alpha}, \qquad s=2,3,\ldots \tag{4.222}
\]
It may seem that there is no normalization, since all the cumulants are still non-zero, while the time constant \(\tau_s=1/\alpha\) is much greater than \(T_2=0\). However, the cumulants of the filtered process are finite, while the cumulants of the original process are infinite. It is also characteristic that the normalized magnitude of the cumulants decays with order; thus it can be claimed that the filtered process \(\eta(t)\) is indeed closer to Gaussian than the original one:
\[
\gamma_s = \frac{(2\alpha)^{s/2}\, D_s}{s\,\alpha\, D_2^{s/2}}, \qquad s\ge3 \tag{4.223}
\]
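Eqs. (4.222)–(4.223) make the normalization effect explicit. A short arithmetic sketch (assuming unit amplitude moments, \(\langle a^2\rangle=\langle a^4\rangle=1\), which is an illustration rather than the book's setting) shows that the kurtosis coefficient \(\gamma_4=\kappa_4/\kappa_2^2=\alpha/\nu\) shrinks as the filter memory grows:

```python
# Sketch: gamma_4 = kappa_4/kappa_2^2 for Poisson delta pulses of rate nu
# through h(t) = exp(-alpha*t).  From eq. (4.222) with <a^2> = <a^4> = 1,
# kappa_s = nu/(s*alpha), so gamma_4 = alpha/nu: a longer memory (smaller
# alpha, narrower bandwidth) drives the output toward Gaussian.
nu = 10.0
gamma4 = {}
for alpha in (4.0, 1.0, 0.25):
    k2 = nu / (2 * alpha)
    k4 = nu / (4 * alpha)
    gamma4[alpha] = k4 / k2**2
print(gamma4)  # {4.0: 0.4, 1.0: 0.1, 0.25: 0.025}
```

The same scaling follows directly from eq. (4.211) with \(\tilde{T}_2\propto 1/\alpha\).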
4.6 OUTAGES OF RANDOM PROCESSES

4.6.1 General Considerations
The problem of boundary attainment has existed for a relatively long time. Among the first results in this area were the work of L. Pontryagin [20], who considered the calculation of the distribution of the first passage time of a Markov random process, and the pioneering work of S.O. Rice [21,22]; the latter essentially laid the foundation for the applied analysis of characteristics of random processes related to the level crossing problem. A comprehensive review of recent advances in the theory can be found in [20,23]. Level crossing problems are of great importance in communications. The following are just a few examples from a long list: calculation of the average duration of an outage of the fading carrier in channels with fading [24], loss of synchronization in phase-lock loops [25], the average level crossing rate and fade duration in channels with fading [26], evaluation of the average length of bursts of errors in discrete communication channels [27], etc.

Let \(\xi_1(t)\) be a realization of a continuous random process \(\xi(t)\) defined on a finite time interval \([0,T]\), and let \(\xi_{\min}<H<\xi_{\max}\) be a fixed (for now) level, as shown in Fig. 4.5.

Figure 4.5 Definition of the first passage time and the average level crossing rate.

Let \(\tau_+\) be the duration of a segment of the realization \(\xi_1(t)\) which exceeds the chosen level \(H\). This interval of time is called the duration of an upward excursion, and the portion of the trajectory above the threshold \(H\) is called a positive pulse. Similarly, the duration \(\tau_-\) of a downward excursion is defined as the duration of a portion of the trajectory below the threshold \(H\) (a negative pulse). In addition, let \(t_m\) be the moment of time when the trajectory achieves its maximum value \(\xi_{\max}\). Finally, let \(n\) be the number of pulses (both negative and positive). Distributions of the number of extrema and related statistics are studied in extreme value theory [26] and are not considered in this book. Here, the main focus of the discussion is on the evaluation of some statistical characteristics of the random variables \(\tau_+\), \(\tau_-\) and \(n\), due to their applied importance.
4.6.2 Average Level Crossing Rate and the Average Duration of the Upward Excursions
Let the random process \(\xi(t)\) be differentiable in the mean square sense (see Section 4.1.2 for more details). It is also assumed for now that the joint probability density \(p_{\xi\dot\xi}(x,\dot x;t)\) of the process \(\xi(t)\) and its mean square derivative \(\dot\xi(t)=d\xi(t)/dt\) at the same moment of time is known; details of its evaluation from the joint PDF are considered below. The process \(\xi(t)\) crosses the level \(H\) in the vicinity \([t_i-\Delta t/2,\;t_i+\Delta t/2]\) of the moment of time \(t_i\) if the following condition is met:
\[
\left[\xi\!\left(t_i-\frac{\Delta t}{2}\right)-H\right]\left[\xi\!\left(t_i+\frac{\Delta t}{2}\right)-H\right] < 0 \tag{4.224}
\]
This case is illustrated in Fig. 4.6. For now, the consideration is limited to upward excursions, i.e. the process \(\xi(t)\) is assumed to cross the threshold \(H\) from below.

Figure 4.6 Calculation of the LCR.

In this case, the derivative \(\dot\xi(t)\) of the process must be positive, i.e. \(\dot\xi(t_i)>0\). This is equivalent to the condition that \(\xi(t-\Delta t/2)<H\) and \(\xi(t+\Delta t/2)>H\). Let the interval \([0,T]\) be split into \(m\) non-intersecting sub-intervals of equal duration \(\Delta t\). The number \(m\) is chosen in such a way that there is no more than one upward excursion in each of the intervals; in other words, the length \(\Delta t\) of the interval is so short that the probability of more than one crossing is much smaller than the probability of a single upward excursion [21]. Inside this small interval, a continuous process \(\xi(t)\) can be approximated as
\[
\xi^{(i)}(t') \approx \xi^{(i)}(t) + \dot\xi^{(i)}(t)\,(t'-t), \qquad t<t'<t+\Delta t \tag{4.225}
\]
Here \(\xi^{(i)}(t)\) is the \(i\)-th realization of the random process \(\xi(t)\). Thus, on the interval \([t,t+\Delta t]\) there are two possibilities: with probability \(q_0\) there is a level crossing inside this interval,
or, with the probability $p_0 = 1 - q_0$, there is none. Having this in mind, a binary sequence of $m$ pulses $\eta_1^{(i)}, \eta_2^{(i)}, \ldots, \eta_m^{(i)}$ can be constructed, where $\eta_j^{(i)} = 1$ if there is an upward crossing of the level $H$ on the $j$-th sub-interval of the $i$-th realization and $\eta_j^{(i)} = 0$ otherwise. Thus, the number of upward excursions on the $i$-th realization can be calculated as
$$n^+(H, T) = \sum_{j=1}^{m} \eta_j^{(i)} \qquad (4.226)$$
Since this number varies from realization to realization, it is more appropriate to calculate the average number of level crossings as
$$N^+ = \langle n^+(H, T)\rangle = \sum_{j=1}^{m} \langle \eta_j^{(i)} \rangle \qquad (4.227)$$
The average value of an individual pulse can be calculated using the probability of each pulse. This produces
$$\langle \eta_j^{(i)} \rangle = 0 \cdot \mathrm{Prob}\{\eta_j = 0\} + 1 \cdot \mathrm{Prob}\{\eta_j = 1\} = q_0 \qquad (4.228)$$
Thus, the problem of finding the average number of upward excursions is reduced to finding the probability $q_0$. It can be seen by inspection of Fig. 4.6 and eq. (4.224) that the conditions for the appearance of a positive pulse on the interval $[t_j, t_j + \Delta t_j]$ can be rewritten as
$$H - \varepsilon_j < \xi(t_j) \le H, \quad \text{and} \quad \dot\xi(t_j) > 0 \qquad (4.229)$$
where
$$\varepsilon_j = \dot\xi(t_j)\,\Delta t_j \qquad (4.230)$$
This immediately implies that
$$q_0 = \mathrm{Prob}\{H - \varepsilon_j < \xi(t_j) \le H,\; \dot\xi(t_j) > 0\} \qquad (4.231)$$
Assuming that $\Delta t_j, \varepsilon_j \to 0$ and that the derivative of the process varies in a narrow interval $d\dot x$ around a fixed value $\dot x$, it is possible to approximate the right-hand side of eq. (4.231) by
$$\mathrm{Prob}(H - \varepsilon_j < \xi \le H,\; \dot x < \dot\xi(t_j) \le \dot x + d\dot x) \approx p_{\xi\dot\xi}(H, \dot x; t_j)\,\varepsilon_j\, d\dot x \approx p_{\xi\dot\xi}(H, \dot x; t_j)\,\Delta t_j\, \dot x\, d\dot x \qquad (4.232)$$
where $p_{\xi\dot\xi}(H, \dot x; t_j)$ is the joint PDF of the process $\xi(t)$ and its time derivative. Integrating eq. (4.232) over all possible positive slopes (only such trajectories will be able to achieve the boundary), one concludes that
$$q_0 = \Delta t \int_0^\infty \dot x\, p_{\xi\dot\xi}(H, \dot x; t)\, d\dot x \qquad (4.233)$$
Finally, the average number of positive pulses $N^+$ can be obtained by setting the number of intervals $m \to \infty$ and adding contributions from each interval:
$$N^+ = \int_0^T dt \int_0^\infty \dot x\, p_{\xi\dot\xi}(H, \dot x; t)\, d\dot x \qquad (4.234)$$
In the very same manner, the average number $N^-$ of negative pulses is
$$N^- = -\int_0^T dt \int_{-\infty}^0 \dot x\, p_{\xi\dot\xi}(H, \dot x; t)\, d\dot x \qquad (4.235)$$
Ultimately, the average number of level crossings is then
$$N = N^+ + N^- = \int_0^T dt \int_{-\infty}^\infty |\dot x|\, p_{\xi\dot\xi}(H, \dot x; t)\, d\dot x \qquad (4.236)$$
Furthermore, expressing the joint probability density $p_{\xi\dot\xi}(H, \dot x; t)$ in terms of the conditional density $p_{\dot\xi|\xi}(\dot x \mid H; t)$ as
$$p_{\xi\dot\xi}(H, \dot x; t) = p_{\dot\xi|\xi}(\dot x \mid H; t)\, p_\xi(H; t) \qquad (4.237)$$
expression (4.236) can be rewritten as
$$N = \int_0^T p_\xi(H; t)\, dt \int_{-\infty}^\infty |\dot x|\, p_{\dot\xi|\xi}(\dot x \mid H; t)\, d\dot x \qquad (4.238)$$
These equations were first obtained in [21,22]. If the process under consideration is a stationary process then its joint PDF does not depend on time, i.e.
$$p_{\xi\dot\xi}(x, \dot x; t) = p_{\xi\dot\xi}(x, \dot x) = p_{\dot\xi|\xi}(\dot x \mid x)\, p_\xi(x) \qquad (4.239)$$
In this case, eq. (4.238) can be further simplified to
$$N = \int_0^T p_\xi(H)\, dt \int_{-\infty}^\infty |\dot x|\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x = T\, p_\xi(H) \int_{-\infty}^\infty |\dot x|\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x \qquad (4.240)$$
Dividing both parts of the latter by the duration of the observation interval $T$, the average number $n$ of crossings per unit time can be obtained as
$$n = \frac{N}{T} = p_\xi(H) \int_{-\infty}^\infty |\dot x|\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x \qquad (4.241)$$
Similarly, the average numbers of upward $n^+$ and downward $n^-$ excursions per unit time are given by
$$n^+ = \frac{N^+}{T} = p_\xi(H) \int_0^\infty \dot x\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x \qquad (4.242)$$
$$n^- = \frac{N^-}{T} = -p_\xi(H) \int_{-\infty}^0 \dot x\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x \qquad (4.243)$$
It was shown earlier that the values of a stationary random process $\xi(t)$ and its derivative $\dot\xi(t)$ are uncorrelated at the same moment of time. This means that the values of the process and its velocity at the same level are only loosely related. In many cases, an even stronger statement is true: the process and its derivative are independent, i.e. $p_{\dot\xi|\xi}(\dot x \mid x) = p_{\dot\xi}(\dot x)$ and, thus, eqs. (4.240)–(4.243) can be further simplified to produce
$$n = \frac{N}{T} = p_\xi(H) \int_{-\infty}^\infty |\dot x|\, p_{\dot\xi}(\dot x)\, d\dot x = C\, p_\xi(H) \qquad (4.244)$$
$$n^+ = \frac{N^+}{T} = p_\xi(H) \int_0^\infty \dot x\, p_{\dot\xi}(\dot x)\, d\dot x = C^+ p_\xi(H), \qquad n^- = \frac{N^-}{T} = -p_\xi(H) \int_{-\infty}^0 \dot x\, p_{\dot\xi}(\dot x)\, d\dot x = C^- p_\xi(H) \qquad (4.245)$$
Thus, for the case of an independent derivative the level crossing rates (LCR) $n$, $n^+$, and $n^-$ of a level $H$ are proportional to the value of the PDF of the process at the level $H$. This fact can be used to measure the PDF of the process experimentally by counting the number of crossings at different levels. The next step is to calculate the average duration of upward (downward) excursions. Without loss of generality, only the case of upward excursions is considered. For a given time interval $[0, T]$, the average time spent by the process above a given threshold $H$ is approximately given by
$$T_p = T \int_H^\infty p_\xi(x)\, dx \qquad (4.246)$$
Since, during the same time interval, there are approximately $N^+ = n^+ T$ positive pulses, the average duration $T^+$ of a positive pulse is
$$T^+ = \frac{T_p}{N^+} = \frac{\int_H^\infty p_\xi(x)\, dx}{p_\xi(H) \int_0^\infty \dot x\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x} \qquad (4.247)$$
In the very same manner, the average duration of a negative pulse is
$$T^- = \frac{T_n}{N^-} = \frac{\int_{-\infty}^H p_\xi(x)\, dx}{-p_\xi(H) \int_{-\infty}^0 \dot x\, p_{\dot\xi|\xi}(\dot x \mid H)\, d\dot x} \qquad (4.248)$$
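The counting definitions behind eqs. (4.246)–(4.248) can be illustrated numerically. The sketch below (all parameters and the synthesized process are arbitrary illustrative choices) checks on a simulated sample path that the empirical product $n^+ T^+$ equals the fraction of time spent above $H$, which for a unit-variance Gaussian process approaches $\int_H^\infty p_\xi(x)\,dx$:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# A smooth, approximately Gaussian stationary process synthesized as a sum of
# random-phase sinusoids (illustrative choice; any smooth process would do).
K = 200
t = np.arange(0.0, 2000.0, 0.01)
freqs = rng.uniform(0.05, 0.5, K)
phases = rng.uniform(0.0, 2 * np.pi, K)
x = np.zeros_like(t)
for f, ph in zip(freqs, phases):
    x += np.cos(2 * np.pi * f * t + ph)
x *= np.sqrt(2.0 / K)          # unit variance by construction

H = 1.0
above = x > H
ups = np.count_nonzero(~above[:-1] & above[1:])   # upward crossings of H
T_total = t[-1] - t[0]
n_plus = ups / T_total                            # empirical LCR
T_plus = above.mean() * T_total / ups             # empirical AFD, cf. eq. (4.247)

# For x ~ N(0,1) the fraction of time above H approaches the Gaussian tail Q(H)
Q = 0.5 * (1 - erf(H / sqrt(2)))
print(n_plus * T_plus, above.mean(), Q)
```

The first two printed numbers agree identically (this is exactly the bookkeeping of eq. (4.247)); the third agrees up to Monte Carlo error.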
It is common to refer to the quantities defined by eqs. (4.245) and (4.248) as the average level crossing rate (LCR) and the average fade duration (AFD). These statistics are often misleadingly called higher order statistics (HOS) of signals. However, it is commonly agreed in statistical signal processing that higher order statistics refer to the cumulants, moments and their spectra of order higher than two [28]. The calculations above assume that the expression for the joint PDF $p_{\xi\dot\xi}(x, \dot x)$ of the process and its derivative is known. In the case of a stationary process, it can be shown that this function can be calculated from the joint PDF $p_\xi(x_1, x_2; \tau)$ of the process at two moments of time. Indeed, choosing
$$t_1 = t - \frac{\tau}{2}, \qquad t_2 = t + \frac{\tau}{2}, \qquad \tau = t_2 - t_1 \qquad (4.249)$$
and using continuity of the process $\xi(t)$, one can write
$$x_2 = \xi\left(t + \frac{\tau}{2}\right) \approx \xi(t) + \frac{\tau}{2}\dot\xi(t) = x + \frac{\tau}{2}\dot x, \qquad x_1 = \xi\left(t - \frac{\tau}{2}\right) \approx \xi(t) - \frac{\tau}{2}\dot\xi(t) = x - \frac{\tau}{2}\dot x \qquad (4.250)$$
This transformation allows the joint PDF of $\xi(t)$ and $\dot\xi(t)$ to be obtained as
$$p_{\xi\dot\xi}(x, \dot x) = \lim_{\tau \to 0} \tau\, p_\xi\left(x - \frac{\tau}{2}\dot x,\; x + \frac{\tau}{2}\dot x;\; \tau\right) \qquad (4.251)$$
Since the joint PDF $p_\xi(x_1, x_2)$ is a symmetric function of its arguments, eq. (4.251) implies that
$$p_{\xi\dot\xi}(x, \dot x) = p_{\xi\dot\xi}(x, -\dot x) \qquad (4.252)$$
Thus, if a process $\xi(t)$ is stationary and has a derivative, then the distribution of the velocity $\dot\xi(t)$ is symmetric with respect to zero. If it were not so, the bias in the velocity distribution would lead to non-stationarity, since the process would tend to drift either "upward" or "downward," thus causing a change in the mean value of the process over time.

4.6.3
Level Crossing Rate of a Gaussian Random Process
The problem of level crossing can be solved completely for the Gaussian case. Since $\xi(t)$ and $\dot\xi(t)$ are related by the linear operation of differentiation, the Gaussian nature of the process $\xi(t)$ implies that its derivative is also a Gaussian process. Due to the symmetry prescribed by eq. (4.252), the mean value of the process $\dot\xi(t)$ is equal to zero and it is only required to find its correlation function13 $R_{\dot\xi\dot\xi}(\tau)$. Using eqs. (4.131)–(4.132), it is easy to show that
$$R_{\xi\dot\xi}(\tau) = \frac{\partial}{\partial\tau} R_\xi(\tau), \qquad R_{\dot\xi\dot\xi}(\tau) = -\frac{\partial^2}{\partial\tau^2} R_\xi(\tau) \qquad (4.253)$$

13 Since the process $\xi(t)$ has zero mean, the correlation function and the covariance function are equal: $R_\xi(\tau) = C_\xi(\tau)$.
Since the correlation function $R_\xi(\tau)$ has a maximum at $\tau = 0$, this means that
$$R_{\xi\dot\xi}(0) = 0 \qquad (4.254)$$
For a Gaussian process, this implies that the process and its derivative are also independent and, thus,
$$p_{\xi\dot\xi}(x, \dot x) = p_\xi(x)\, p_{\dot\xi}(\dot x) \qquad (4.255)$$
The variance of the derivative $\dot\xi(t)$ can be found as the value of its correlation function at zero
$$D_{\dot\xi} = -\frac{\partial^2}{\partial\tau^2} R_\xi(\tau)\bigg|_{\tau=0} = -R''_\xi(0) = -D_\xi\, \rho''(0) \qquad (4.256)$$
Here $\rho(\tau)$ is the normalized covariance function (correlation coefficient) of the process $\xi(t)$ and $D_\xi$ is the variance of this process. Combining eqs. (4.255) and (4.256), one obtains an expression for the joint PDF of a Gaussian process and its derivative:
$$p_{\xi\dot\xi}(x, \dot x) = p_\xi(x)\, p_{\dot\xi}(\dot x) = \frac{1}{2\pi D_\xi \sqrt{-\rho''(0)}} \exp\left\{-\frac{1}{2 D_\xi}\left[x^2 + \frac{\dot x^2}{-\rho''(0)}\right]\right\} \qquad (4.257)$$
Finally, the level crossing rate of a Gaussian process with the covariance function $R_\xi(\tau)$ can be obtained from eq. (4.245)
$$\mathrm{LCR}(H) = \frac{\sqrt{-\rho''(0)}}{2\pi}\, e^{-\frac{H^2}{2 D_\xi}} \int_0^\infty s\, e^{-\frac{s^2}{2}}\, ds \qquad (4.258)$$
where
$$s = \frac{\dot x}{\sqrt{-D_\xi\, \rho''(0)}} \qquad (4.259)$$
Since the last integral in eq. (4.258) is equal to unity, one obtains
$$\mathrm{LCR}(H) = \frac{\sqrt{-\rho''(0)}}{2\pi}\, e^{-\frac{H^2}{2 D_\xi}} = \frac{\sqrt{-\rho''(0)}}{2\pi}\, e^{-\left(H/\sqrt{2 D_\xi}\right)^2} \qquad (4.260)$$
The values of $\rho''(0)$ for some important correlation functions are summarized in Table 3.1. Some of these values are infinite, thus indicating that a process with such a correlation function has a derivative with infinite variance. The expression for the level crossing rate (4.260) can now be used to obtain an expression for the average duration of the positive outage. Indeed, using the Laplace function $\Phi(x)$ [29]
$$\int_H^\infty p_\xi(x)\, dx = 1 - \Phi\left(\frac{H}{\sqrt{D_\xi}}\right) \qquad (4.261)$$
expression (4.260) can be transformed to produce
$$T^+ = \frac{2\pi\left[1 - \Phi\left(H/\sqrt{D_\xi}\right)\right] \exp\left(\dfrac{H^2}{2 D_\xi}\right)}{\sqrt{-\rho''(0)}} \qquad (4.262)$$
It can be seen from eqs. (4.260) and (4.262) that the expressions for the LCR and AFD are more conveniently expressed in terms of the normalized threshold
$$h = \frac{H}{\sqrt{D_\xi}} \qquad (4.263)$$
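The closed form (4.260) can be checked by Monte Carlo. The sketch below (all parameters illustrative) synthesizes a unit-variance Gaussian process with the Gaussian covariance $\rho(\tau) = \exp(-\tau^2/\tau_c^2)$, for which $\rho''(0) = -2/\tau_c^2$, counts upward crossings of a normalized threshold $h$, and compares with the theoretical rate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Spectral synthesis: rho(tau) = exp(-tau^2/tau_c^2) Fourier-pairs with a
# PSD proportional to exp(-(pi*f*tau_c)^2).
dt, n, tau_c = 0.02, 2**20, 1.0
f = np.fft.rfftfreq(n, dt)
psd = np.exp(-(np.pi * f * tau_c) ** 2)
spec = np.sqrt(psd) * (rng.standard_normal(f.size) + 1j * rng.standard_normal(f.size))
x = np.fft.irfft(spec, n)
x -= x.mean()
x /= x.std()                                       # unit variance (D_xi = 1)

h = 1.5                                            # normalized threshold
up = np.count_nonzero((x[:-1] < h) & (x[1:] >= h))
lcr_empirical = up / (n * dt)
lcr_theory = np.sqrt(2.0 / tau_c**2) / (2 * np.pi) * np.exp(-h**2 / 2)  # eq. (4.260)
print(lcr_empirical, lcr_theory)
```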
Dependence of the LCR and AFD on the normalized threshold $h$ is shown in Fig. 4.7 for two cases of covariance function:
$$\rho(\tau) = J_0(2\pi f_D \tau), \qquad \rho''(0) = -2\pi^2 f_D^2 \qquad (4.264)$$
and
$$\rho(\tau) = \exp(-\tau^2/\tau_{\mathrm{corr}}^2), \qquad \rho''(0) = -\frac{2}{\tau_{\mathrm{corr}}^2} \qquad (4.265)$$
Figure 4.7 Dependence of LCR (a) and AFD (b) on the normalized threshold and the type of correlation function with the same correlation interval.

It can be seen that the LCR and AFD are expressed in terms of the time characteristics of the random process; in particular, both quantities are functions of the value of the second derivative of the correlation coefficient $\rho''(0)$. The same calculation can be conducted in the frequency domain as well. Indeed, using the Parseval theorem and the properties of the Fourier transform, one obtains in terms of the one-sided PSD $G(f)$
$$\rho''(0) = -4\pi^2\,\frac{\int_0^\infty f^2\, G(f)\, df}{\int_0^\infty G(f)\, df} \qquad (4.266)$$
Example: Bandlimited (pink) noise. In the case of an ideal baseband filter with the cut-off frequency $F$, the one-sided PSD of filtered WGN is given by
$$\mathrm{PSD}(f) = \begin{cases} N_0 & 0 \le f \le F \\ 0 & f > F \end{cases} \qquad (4.267)$$
The value of $\rho''(0)$ for this process is thus
$$\rho''(0) = -4\pi^2\, \frac{\int_0^F f^2 N_0\, df}{\int_0^F N_0\, df} = -\frac{4\pi^2 N_0 F^3/3}{N_0 F} = -\frac{4(\pi F)^2}{3} \qquad (4.268)$$
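The ratio in eq. (4.268) is easy to verify by direct numerical quadrature (the cut-off frequency below is an arbitrary illustrative value; a small trapezoid helper keeps the code independent of the NumPy version):

```python
import numpy as np

def trapz(y, x):
    """Plain trapezoidal quadrature on a sorted grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

# Flat one-sided PSD on [0, F]: rho''(0) = -4*pi^2*<f^2> = -(4/3)*(pi*F)^2
F = 2.5e3                                  # illustrative cut-off frequency, Hz
f = np.linspace(0.0, F, 100001)
G = np.ones_like(f)                        # the level N0 cancels in the ratio
rho_dd = -4 * np.pi**2 * trapz(f**2 * G, f) / trapz(G, f)
print(rho_dd, -(4.0 / 3.0) * (np.pi * F) ** 2)
```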
Example: A Gaussian process with a Gaussian PSD. Let a Gaussian process $\xi(t)$ with zero mean be described by a Gaussian PSD:
$$\mathrm{PSD}(f) = N_0 \exp\left[-\left(\frac{f}{f_{\mathrm{eff}}}\right)^2\right] \qquad (4.269)$$
This PSD corresponds to a Gaussian correlation function
$$\rho(\tau) = \exp(-\pi^2 f_{\mathrm{eff}}^2 \tau^2) \qquad (4.270)$$
It is easy to check that both time-domain and frequency-domain methods produce the same answer:
$$\rho''(0) = -2(\pi f_{\mathrm{eff}})^2 \qquad (4.271)$$
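The agreement of the two routes can also be checked numerically, under the Gaussian PSD/correlation pair stated above: the frequency-domain ratio of eq. (4.266) and a finite-difference second derivative of $\rho(\tau)$ at $\tau = 0$ should both give $-2(\pi f_{\mathrm{eff}})^2$ (the value of $f_{\mathrm{eff}}$ is an arbitrary illustrative choice):

```python
import numpy as np

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

f_eff = 120.0                              # illustrative effective bandwidth, Hz

# Frequency domain, eq. (4.266): rho''(0) = -4*pi^2*<f^2> over the one-sided PSD
f = np.linspace(0.0, 20 * f_eff, 400001)
G = np.exp(-(f / f_eff) ** 2)
rho_dd_freq = -4 * np.pi**2 * trapz(f**2 * G, f) / trapz(G, f)

# Time domain: central second difference of rho(tau) = exp(-(pi*f_eff*tau)^2)
rho = lambda tau: np.exp(-(np.pi * f_eff * tau) ** 2)
eps = 1e-4 / f_eff
rho_dd_time = (rho(eps) - 2 * rho(0.0) + rho(-eps)) / eps**2

print(rho_dd_freq, rho_dd_time, -2 * (np.pi * f_eff) ** 2)
```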
Example: Clark’s PSD. It is commonly accepted that the one-sided PSD of the signal received by a moving receiver is provided by the following expression [30] PSDð f Þ ¼
2 1 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ; fD 1 f 2 =fD2
0 f fD
ð4:272Þ
where fD ¼ c f cos is the Doppler frequency of the signal moving with velocity and the angle of arrival of the signal at the receiver’s antenna is ; c is the speed of light in the air. Calculations using both eqs. (4.253) and (4.266) produce 00 ð0Þ ¼ 2fD2 4.6.4
ð4:273Þ
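Equation (4.273) can be verified from eq. (4.266) despite the integrable singularity of the Clarke PSD at $f = f_D$: the substitution $f = f_D\sin\theta$ cancels the $1/\sqrt{1 - f^2/f_D^2}$ factor exactly (the value of $f_D$ below is an arbitrary illustrative choice):

```python
import numpy as np

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

f_D = 80.0                                 # illustrative maximum Doppler shift, Hz

# With f = f_D*sin(theta), both integrals of the ratio in eq. (4.266)
# reduce to plain integrals over theta in [0, pi/2].
theta = np.linspace(0.0, np.pi / 2, 200001)
f = f_D * np.sin(theta)
mean_f2 = trapz(f**2, theta) / trapz(np.ones_like(theta), theta)   # = f_D^2 / 2
rho_dd = -4 * np.pi**2 * mean_f2
print(rho_dd, -2 * np.pi**2 * f_D**2)
```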
4.6.4
Level Crossing Rate of the Nakagami Process
The Nakagami process is often used to describe non-Rayleigh fading [31]. As was pointed out in Section 4.6.2, the LCR can be found using the joint PDF of the process at two separate moments of time. In contrast to a Gaussian random process and the Rayleigh process as the envelope of a complex Gaussian process, the Nakagami process in general cannot be described completely by a joint PDF of the second order. In addition, different joint PDFs can be derived for the same marginal PDF and correlation function. However, the following model is the most popular in practical applications [31]:
$$p_\zeta(x_1, x_2; \tau) = \frac{4(x_1 x_2)^m m^{m+1}}{\Gamma(m)(1 - \rho_0^2)\, \rho_0^{m-1}\, \Omega^{m+1}} \exp\left[-\frac{m(x_1^2 + x_2^2)}{\Omega(1 - \rho_0^2)}\right] I_{m-1}\left[\frac{2m\rho_0}{\Omega(1 - \rho_0^2)}\, x_1 x_2\right], \qquad m \ge \frac{1}{2} \qquad (4.274)$$
Here $\rho_0(\tau)$ is the correlation coefficient of the underlying Gaussian process [31], $\rho(\tau) = \rho_0^2(\tau)$ is the normalized covariance function of the squared process $\zeta^2(t)$ (square of the envelope) and $I_n(x)$ is the modified Bessel function of the first kind of order $n$ [29]. Thus, eq. (4.274) can be rewritten as
$$p_\zeta(x_1, x_2; \tau) = \frac{4(x_1 x_2)^m m^{m+1}}{\Gamma(m)(1 - \rho)\, \rho^{(m-1)/2}\, \Omega^{m+1}} \exp\left[-\frac{m(x_1^2 + x_2^2)}{\Omega(1 - \rho)}\right] I_{m-1}\left[\frac{2m\sqrt{\rho}}{\Omega(1 - \rho)}\, x_1 x_2\right] \qquad (4.275)$$
The next step is the calculation of the joint PDF of the process $\zeta(t)$ and its derivative $\dot\zeta(t)$ by means of eq. (4.251). At this stage, the time lag $\tau$ can be assumed to be a small quantity, thus allowing for the following approximate relations, derived from the Taylor expansion of the correlation coefficient and the fact that $\rho'(0) = 0$:
$$\rho(\tau) \approx 1 + \frac{\rho''(0)}{2}\tau^2 + O(\tau^3), \qquad 1 - \rho(\tau) \approx -\frac{\rho''(0)}{2}\tau^2 + O(\tau^3), \qquad \sqrt{\rho(\tau)} \approx 1 + \frac{\rho''(0)}{4}\tau^2 + O(\tau^3) \qquad (4.276)$$
Furthermore, it can be seen that the argument of the Bessel function approaches infinity since $1 - \rho$ approaches zero. In this case, the following asymptotic form of the Bessel function can be used [29]
$$I_{m-1}(x) \approx \frac{e^x}{\sqrt{2\pi x}} \qquad (4.277)$$
Having this in mind and using eq. (4.251), after substituting the expansions (4.276) and the asymptotic form (4.277) into eq. (4.275) and retaining the leading terms in $\tau$, one obtains
$$p_{\zeta\dot\zeta}(x, \dot x) = \lim_{\tau \to 0} \tau\, p_\zeta\left(x - \frac{\tau}{2}\dot x,\; x + \frac{\tau}{2}\dot x;\; \tau\right) = \frac{2 m^m x^{2m-1}}{\Gamma(m)\, \Omega^m} \exp\left(-\frac{m x^2}{\Omega}\right) \sqrt{\frac{2m}{-\pi\,\Omega\,\rho''(0)}} \exp\left(\frac{2m \dot x^2}{\Omega\, \rho''(0)}\right) \qquad (4.278)$$
It follows from the latter that the Nakagami process $\zeta(t)$ and its derivative $\dot\zeta(t)$ are independent and the derivative itself has a Gaussian PDF with zero mean and variance $\sigma_{\dot\zeta}^2 = -\Omega\,\rho''(0)/(4m)$:
$$p_{\zeta\dot\zeta}(x, \dot x) = p_\zeta(x)\, p_{\dot\zeta}(\dot x), \qquad p_{\dot\zeta}(\dot x) = \sqrt{\frac{2m}{-\pi\,\Omega\,\rho''(0)}} \exp\left(\frac{2m \dot x^2}{\Omega\, \rho''(0)}\right) \qquad (4.279)$$
Correctness of this equation is guaranteed since $\rho''(0) < 0$ implies that the expression under the square root is positive14. Finally, the LCR and AFD can be expressed as
$$N_1^+ = \frac{\sqrt{-\rho''(0)}}{\sqrt{2\pi}}\, \frac{C^{2m-1}}{\Gamma(m)}\, e^{-C^2} \qquad (4.280)$$
$$T_1^+ = \frac{\Gamma(m, C^2)}{\Gamma(m)\, N_1^+} = \frac{\sqrt{2\pi}\, \Gamma(m, C^2)}{\sqrt{-\rho''(0)}\, C^{2m-1}}\, e^{C^2} \qquad (4.281)$$
where $\Gamma(m, C^2)$ is the upper incomplete gamma function. In particular, for the Rayleigh case $m = 1$, eq. (4.280) reduces to
$$N_1^+ = \frac{\sqrt{-\rho''(0)}}{\sqrt{2\pi}}\, C\, e^{-C^2} \qquad (4.282)$$
where
$$C = \sqrt{\frac{m}{\Omega}}\, H \qquad (4.283)$$
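As an internal consistency check (a sketch with arbitrary numerical values, using the Gaussian derivative variance implied by eq. (4.279)): the closed form (4.280) must coincide with the general rate $n^+ = p_\zeta(H)\int_0^\infty \dot x\, p_{\dot\zeta}(\dot x)\, d\dot x$ of eq. (4.245) evaluated with the Nakagami marginal PDF:

```python
import numpy as np
from math import gamma

# Arbitrary illustrative parameters (rho''(0) must be negative)
m, Omega, H = 2.5, 1.3, 0.9
rho_dd0 = -150.0

# Nakagami-m marginal PDF at the threshold H
p_H = 2 * m**m * H**(2*m - 1) / (gamma(m) * Omega**m) * np.exp(-m * H**2 / Omega)

# Variance of the envelope derivative implied by eq. (4.279); for a zero-mean
# Gaussian derivative, int_0^inf xdot p(xdot) dxdot = sigma / sqrt(2*pi)
sigma2 = -Omega * rho_dd0 / (4 * m)
n_plus = p_H * np.sqrt(sigma2 / (2 * np.pi))

# Closed form of eq. (4.280)
C = np.sqrt(m / Omega) * H
n_plus_closed = np.sqrt(-rho_dd0) / np.sqrt(2 * np.pi) * C**(2*m - 1) / gamma(m) * np.exp(-C**2)
print(n_plus, n_plus_closed)
```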
It can be noted that for $m = 1$ these equations coincide with the results derived above for the Rayleigh process. It has been noted already that the Nakagami $m$ distribution is widely used to describe fading in communication channels. This also poses the problem of investigating the LCR and AFD in systems with combining. The combining methods can be viewed as techniques for forming a continuous communication channel with a much smaller variance of the PDF describing the received signal. This question was investigated as far back as [31]. It was shown there that if all $L$ branches of the combining scheme are identical, the resulting process is also approximately a Nakagami $m$ process with the following parameters
$$m_L = mL, \qquad \Omega_L = L\,\Omega \qquad (4.284)$$
This fact can be used to obtain rough estimates of the performance of the combining scheme under consideration. More detailed and accurate results are described in [35].

14 It is interesting to note that eq. (4.279) was probably first published in [32,33], the latter of which unfortunately is not widely available in English. Since then this equation has been rediscovered in a number of publications.
4.6.5
Concluding Remarks
All the attention in the previous sections has been devoted to the calculation of only the first moments of the number of crossings, duration of fades and overshoots, etc. However, such a description does not provide complete information about the quantities mentioned, which can be provided only by a marginal PDF. Rice [23] outlined an approach allowing calculation of the marginal PDF $p_F(\tau)$ of the fade duration. However, a general approach to this problem is still unknown. Even calculation of the variance $D_\tau$ appears to be a difficult problem. Often this quantity is estimated from experiments or numerical simulations. However, for the case of rare fades15 ($C \gg 1$), it can be assumed [20,22] that the outages obey the Poisson law and, thus,
$$p_{T^+}(\tau) = \frac{1}{T^+}\exp\left(-\frac{\tau}{T^+}\right), \qquad \tau > 0 \qquad (4.285)$$
In the other limiting case $C \to 0$, it can be shown that the fade duration of the Gaussian process obeys the Rayleigh law [22]. A number of authors suggest a similar representation for other distributions of the process $\xi(t)$; however, there are no convincing arguments to support such an approximation. Good approximations for a sum of Gaussian noise and a harmonic signal are obtained in [20,23]. However, these expressions are quite complicated and difficult to use in analysis and synthesis. A situation when such equations can be directly used is rarely found.
4.7
NARROW BAND RANDOM PROCESSES
Most communication systems work at a very high frequency (the carrier frequency $f_0$) and utilize a bandwidth $\Delta f$ (measured at the 3 dB level) which is usually much smaller than the carrier16
$$\frac{\Delta f}{f_0} \ll 1 \qquad (4.286)$$

15 That is, the outage process is an ordinary process.
16 However, recently great advances have been achieved in the area of so-called ultra-wide band systems, which utilize a relatively wide spectral band. For more detailed discussion see [36].
If a stationary wide band excitation such as WGN acts on the input of such a system, the asymptotically stationary output process has a spectral density which is concentrated in a narrow band around the carrier frequency $f_0$, according to eq. (4.144). It will be shown below that the covariance function of a narrow band random process $\xi(t)$ can be written in the following form
$$C_\xi(\tau) = D_\xi\, \rho(\tau) \cos[\omega_0 \tau + \gamma(\tau)] \qquad (4.287)$$
Here $\omega_0 = 2\pi f_0$, and $\rho(\tau)$ and $\gamma(\tau)$ are functions which vary slowly compared to $\cos\omega_0\tau$. In order to justify representation (4.287), one can introduce a new variable $f_D$, sometimes referred to as the Doppler shift, by the rule
$$f_D = f - f_0 \qquad (4.288)$$
Then, using the one-sided PSD $S^+(f)$, one can write
$$C_\xi(\tau) = \int_0^\infty S^+(f) \cos(2\pi f \tau)\, df = \int_{-f_0}^\infty S^+(f_0 + f_D) \cos[2\pi(f_0 + f_D)\tau]\, df_D \qquad (4.289)$$
Using the following notation
$$D_\xi\, \rho_c(\tau) = \int_{-f_0}^\infty S^+(f_0 + f_D) \cos(2\pi f_D \tau)\, df_D, \qquad D_\xi\, \rho_s(\tau) = \int_{-f_0}^\infty S^+(f_0 + f_D) \sin(2\pi f_D \tau)\, df_D \qquad (4.290)$$
one can rewrite eq. (4.289) in the following form
$$C_\xi(\tau) = D_\xi[\rho_c(\tau)\cos(2\pi f_0 \tau) - \rho_s(\tau)\sin(2\pi f_0 \tau)] = D_\xi\, \rho(\tau)\cos[\omega_0\tau + \gamma(\tau)] \qquad (4.291)$$
where
$$\rho(\tau) = \sqrt{\rho_c^2(\tau) + \rho_s^2(\tau)}, \qquad \tan\gamma(\tau) = \frac{\rho_s(\tau)}{\rho_c(\tau)} \qquad (4.292)$$
Since $\rho_c(\tau)$ is an even function and $\rho_s(\tau)$ is an odd function, $\rho(\tau)$ is always an even function and $\gamma(\tau)$ is always an odd function of $\tau$. If the spectral density $S^+(f)$ is a symmetric function around $f_0$, the lower limit of integration in eq. (4.290) can be replaced by $-\infty$, thus producing $\rho_s(\tau) = 0$ and $\gamma(\tau) = 0$:
$$C_\xi(\tau) = D_\xi\, \rho(\tau)\cos\omega_0\tau = \cos\omega_0\tau \int_{-f_0}^\infty S^+(f_0 + f_D)\cos(2\pi f_D \tau)\, df_D \qquad (4.293)$$
Thus, the covariance function of a random narrow band process with a symmetric PSD can be represented by a modulated cosine function (4.293). Since the spectral density $S^+(f_0 + f_D)$ is
almost completely concentrated at low frequencies (compared to $f_0$), the functions $\rho_c(\tau)$ and $\rho_s(\tau)$, and as a result $\rho(\tau)$ and $\gamma(\tau)$, are slow functions of the time shift $\tau$ compared to $\cos\omega_0\tau$.
4.7.1
Definition of the Envelope and Phase of Narrow Band Processes
It could also be said that a realization of such a random process looks like a modulated harmonic signal, which is why it is often referred to as a quasiharmonic or modulated signal. It is convenient to write a quasiharmonic process $\xi(t)$ as
$$\xi(t) = A(t)\cos[\omega_0 t + \phi(t)], \qquad A(t) \ge 0 \qquad (4.294)$$
where $A(t)$ and $\phi(t)$ are random functions varying slowly with respect to $\cos\omega_0 t$. The random process $A(t)$ is called the envelope and the process $\phi(t)$ is called the random phase. The representation (4.294) is not unique unless additional conditions are specified. This stems from the fact that for a very short observation interval it is impossible to estimate the central frequency $\omega_0$ of the random signal, thus causing uncertainties in the values of the phase and the magnitude. Moreover, the form of the equation does not change if the phase is shifted by $2\pi n$. In order to eliminate such uncertainties, one has to apply some meaningful restrictions on the phase $\phi(t)$ and the magnitude. First of all, it will be assumed that the phase is confined to an interval $[0, 2\pi]$ (or, equivalently, $[-\pi, \pi]$). The restriction on the magnitude is more complex and requires the definition of the analytical signal. In this case, the original process $\xi(t)$ is augmented by a conjugate process $\eta(t)$ defined by the so-called Hilbert transform [4]
$$\eta(t) = \frac{1}{\pi}\int_{-\infty}^\infty \frac{\xi(s)}{t - s}\, ds, \qquad \xi(t) = -\frac{1}{\pi}\int_{-\infty}^\infty \frac{\eta(s)}{t - s}\, ds \qquad (4.295)$$
The Hilbert transform of a signal $\xi(t)$ can be considered as the reaction of a filter with the impulse response $h(t) = 1/(\pi t)$ to the signal $\xi(t)$. In the frequency domain, this transform is described by the following transfer function
$$H(j\omega) = \frac{1}{\pi}\int_{-\infty}^\infty \frac{1}{t}\exp(-j\omega t)\, dt = -j\,\mathrm{sgn}\,\omega \qquad (4.296)$$
Since $|H(j\omega)| = 1$, this filter shifts the phase of all negative frequencies up by $\pi/2$ and shifts the phase of all positive frequencies down by $\pi/2$. Equivalently, any component of the form $\cos\omega t$ is transformed into $\sin\omega t$. The Hilbert filter is often called the quadrature filter, since it transforms in-phase components of signals into quadrature components of the same signal. The Hilbert transform filter is also an all-pass system, since the magnitude spectrum at the output of the system is unchanged with respect to that at the input. This immediately implies that the PSDs of both $\xi(t)$ and $\eta(t)$ are the same
$$S_\eta(\omega) = |H(j\omega)|^2 S_\xi(\omega) = S_\xi(\omega) \qquad (4.297)$$
so are the covariance functions:
$$C_\eta(\tau) = C_\xi(\tau) \qquad (4.298)$$
The mutual PSD and the covariance function can be directly calculated from eq. (4.143)
$$S_{\xi\eta}(\omega) = H(j\omega)\, S_\xi(\omega) = -j\, S_\xi(\omega)\, \mathrm{sgn}\,\omega \qquad (4.299)$$
$$C_{\xi\eta}(\tau) = \frac{1}{2\pi}\int_{-\infty}^\infty (-j)\, S_\xi(\omega)\, \mathrm{sgn}\,\omega\, \exp(j\omega\tau)\, d\omega = \frac{1}{\pi}\int_0^\infty S_\xi(\omega)\sin\omega\tau\, d\omega \qquad (4.300)$$
The latter shows that the processes $\xi(t)$ and $\eta(t)$ are uncorrelated if considered at the same moment of time, since
$$E\{\xi(t)\eta(t)\} = C_{\xi\eta}(0) = 0 \qquad (4.301)$$
For a Gaussian process this also means independence; however, for non-Gaussian processes it is not so. The in-phase component $\xi(t)$ and the quadrature component $\eta(t)$ obtained through the Hilbert transform can be combined in a single complex signal called the pre-envelope [4]
$$\zeta(t) = \xi(t) + j\eta(t) = A(t)\exp[j\Phi(t)] \qquad (4.302)$$
Thus, the envelope and the full phase can be defined as
$$A(t) = \sqrt{\xi^2(t) + \eta^2(t)}, \qquad \Phi(t) = \omega_0 t + \phi(t) = \mathrm{atan}\,\frac{\eta(t)}{\xi(t)} \qquad (4.303)$$
It can be seen that $A(t) \ge |\xi(t)| \ge 0$ and $A(t) = |\xi(t)|$ if $\eta(t) = 0$; thus, the process $A(t)$ envelopes the process $\xi(t)$. Equation (4.302) also allows the definition of a low frequency equivalent $\vartheta(t)$ of the complex pre-envelope as
$$\zeta(t) = A(t)\exp[j(\omega_0 t + \phi(t))] = A(t)\exp[j\phi(t)]\exp(j\omega_0 t) = \vartheta(t)\exp(j\omega_0 t) \qquad (4.304)$$
The frequency content of the process $\vartheta(t)$ is concentrated around zero frequency if the original process $\xi(t)$ is narrow band with spectrum concentrated around the frequency $f_0$. Furthermore, the low frequency equivalent can be obtained through a demodulation process as
$$\vartheta(t) = \zeta(t)\exp(-j\omega_0 t) = [\xi(t) + j\eta(t)][\cos\omega_0 t - j\sin\omega_0 t] = A_c(t) + jA_s(t) \qquad (4.305)$$
and, thus,
$$A_c(t) = A(t)\cos\phi(t) = \xi(t)\cos\omega_0 t + \eta(t)\sin\omega_0 t, \qquad A_s(t) = A(t)\sin\phi(t) = -\xi(t)\sin\omega_0 t + \eta(t)\cos\omega_0 t \qquad (4.306)$$
In order to find the characteristics of the envelope and the phase of a narrow band process, it is necessary to find the joint distribution $p_{\xi\eta}(x_1, x_2)$ of the in-phase and quadrature components.
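The pre-envelope and the envelope of eqs. (4.302)–(4.303) are routinely computed via the FFT: negative-frequency components are zeroed and positive ones doubled. The sketch below (signal parameters are arbitrary illustrative choices) recovers a slow AM envelope from a narrow band signal:

```python
import numpy as np

def pre_envelope(x):
    """Discrete analytic signal zeta(t) = xi(t) + j*eta(t) via the FFT."""
    n = x.size
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    return np.fft.ifft(X * h)

# AM-modulated carrier: xi(t) = A(t) cos(w0 t) with a slow, positive envelope
t = np.linspace(0.0, 1.0, 4096, endpoint=False)
A_true = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * t)     # 3 Hz envelope
xi = A_true * np.cos(2 * np.pi * 200 * t)          # 200 Hz carrier

zeta = pre_envelope(xi)
A_est = np.abs(zeta)                               # envelope, eq. (4.303)
print(np.max(np.abs(A_est - A_true)))              # small for a narrow band signal
```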
Such calculations are relatively simple to perform for a Gaussian process, as shown in the example below. An alternative definition of the conjugate process and the envelope can be obtained following [37]. In this case
$$\eta(t) = -\frac{1}{\omega_0}\frac{d}{dt}\xi(t), \qquad A(t) = \sqrt{\xi^2(t) + \frac{\dot\xi^2(t)}{\omega_0^2}} \qquad (4.307)$$
It can be seen that, for a deterministic harmonic signal $x(t) = A\cos\omega_0 t$, eq. (4.307) produces the exact result $y(t) = A\sin\omega_0 t$. However, for a finite bandwidth of the signal $\xi(t)$ this definition is an approximate one. Both definitions will be used in the considerations below.

4.7.2
The Envelope and the Phase Characteristics
One of the most important questions in the theory of narrow band processes is the relation between the characteristics of the random process itself and the characteristics of its envelope, especially relations between the marginal densities and the covariance functions. 4.7.2.1
Blanc-Lapierre Transformation
Let a narrow band process be represented in the following form
$$\xi_R(t) = A(t)\cos[\omega_0 t + \phi(t)] = A(t)\cos\Phi(t) \qquad (4.308)$$
where the envelope $A(t)$ and the phase $\phi(t)$ are slowly varying random functions with respect to $\cos\omega_0 t$ and are defined by eqs. (4.303). Jointly, the envelope $A(t)$ and the total phase $\Phi(t)$ can be described by the joint PDF $p_{A,\Phi}(a, \varphi)$. The marginal PDF $p_A(a)$ of the envelope and the marginal PDF $p_\Phi(\varphi)$ of the phase can be obtained by integration of the joint PDF over the auxiliary variable. The process $\xi_R(t)$ is described by its PDF $p_{\xi_R}(x)$. The goal of this section is to derive the relationship between $p_A(a)$ and $p_{\xi_R}(x)$ assuming that the process $\xi_R(t)$ is stationary. As a by-product of this analysis, it will be shown that the phase distribution must be uniform to obtain a stationary process:
$$p_\Phi(\varphi) = \frac{1}{2\pi} \qquad (4.309)$$
Following the technique discussed in Section 4.7.1, the conjugate process $\xi_I(t)$ and the pre-envelope $\zeta(t)$ can be defined as
$$\xi_I(t) = A(t)\sin[\omega_0 t + \phi(t)] = A(t)\sin\Phi(t) \qquad (4.310)$$
$$\zeta(t) = \xi_R(t) + j\xi_I(t) \qquad (4.311)$$
The complex process $\zeta(t)$ can be described by means of the joint PDF of $\xi_R(t)$ and $\xi_I(t)$, as discussed in Section 2.1
$$p_\zeta(x_1, x_2) = \frac{\partial^2}{\partial x_1\, \partial x_2}\, \mathrm{Prob}\{\xi_R(t) \le x_1,\; \xi_I(t) \le x_2\} \qquad (4.312)$$
Conversely, both $p_{\xi_R}(x)$ and $p_{\xi_I}(x)$ can be obtained from $p_\zeta(x_1, x_2)$ by integration, that is
$$p_{\xi_R}(x) = \int_{-\infty}^\infty p_\zeta(x, x_2)\, dx_2 \qquad (4.313)$$
$$p_{\xi_I}(x) = \int_{-\infty}^\infty p_\zeta(x_1, x)\, dx_1 \qquad (4.314)$$
An alternative description of $\zeta(t)$ can be achieved through its characteristic function
$$\Theta_\zeta(u, v) = \int_{-\infty}^\infty \int_{-\infty}^\infty p_\zeta(x_1, x_2)\exp[j(u x_1 + v x_2)]\, dx_1\, dx_2 \qquad (4.315)$$
As a particular case of eq. (4.315), one can see that
$$\Theta_\zeta(u, 0) = \int_{-\infty}^\infty \int_{-\infty}^\infty p_\zeta(x_1, x_2)\exp(j u x_1)\, dx_1\, dx_2 = \int_{-\infty}^\infty \exp(j u x_1)\, p_{\xi_R}(x_1)\, dx_1 = \Theta_{\xi_R}(u) \qquad (4.316)$$
In the very same way, the characteristic function of the quadrature component is given by
$$\Theta_\zeta(0, v) = \Theta_{\xi_I}(v) \qquad (4.317)$$
These last two equations allow one to determine the PDFs of $\xi_R(t)$ and $\xi_I(t)$ through $\Theta_\zeta(u, v)$. In fact
$$p_{\xi_R}(x_1) = \frac{1}{2\pi}\int_{-\infty}^\infty \Theta_{\xi_R}(u)\exp[-j u x_1]\, du = \frac{1}{2\pi}\int_{-\infty}^\infty \Theta_\zeta(u, 0)\exp[-j u x_1]\, du \qquad (4.318)$$
and
$$p_{\xi_I}(x_2) = \frac{1}{2\pi}\int_{-\infty}^\infty \Theta_{\xi_I}(v)\exp[-j v x_2]\, dv = \frac{1}{2\pi}\int_{-\infty}^\infty \Theta_\zeta(0, v)\exp[-j v x_2]\, dv \qquad (4.319)$$
As the next step, one can express the joint characteristic function $\Theta_\zeta(u, v)$, given by eq. (4.315), in terms of the joint PDF $p_{A,\Phi}(a, \varphi)$ of the magnitude and phase. In order to do so, the following change of variables can be used in eq. (4.315)
$$x_1 = a\cos\varphi, \qquad x_2 = a\sin\varphi \qquad (4.320)$$
The Jacobian $J$ of this transformation is just
$$J = \begin{vmatrix} \cos\varphi & -a\sin\varphi \\ \sin\varphi & a\cos\varphi \end{vmatrix} = a \qquad (4.321)$$
and, thus, the relationship between $p_{A,\Phi}(a, \varphi)$ and $p_\zeta(x_1, x_2)$ has the following form (see eq. (2.163) in Section 2.6)
$$p_{A,\Phi}(a, \varphi) = p_\zeta(a\cos\varphi, a\sin\varphi)\, a \qquad (4.322)$$
or
$$p_\zeta(a\cos\varphi, a\sin\varphi) = \frac{p_{A,\Phi}(a, \varphi)}{a} \qquad (4.323)$$
After changing the variables in eq. (4.315), using eqs. (4.320) and (4.323), one obtains
$$\Theta_{\xi_R}(u) = \Theta_\zeta(u, 0) = \int_{-\infty}^\infty \int_{-\infty}^\infty p_\zeta(x_1, x_2)\exp[j u x_1]\, dx_1\, dx_2 = \int_0^\infty \int_0^{2\pi} p_{A,\Phi}(a, \varphi)\exp[j u a\cos\varphi]\, da\, d\varphi \qquad (4.324)$$
The exponential term in the last equation can be expressed as a sum of Bessel functions of the first kind (eqs. (9.1.44)–(9.1.45) on page 361 in [29])
$$\exp[j u a\cos\varphi] = \sum_{n=-\infty}^\infty j^n J_n(u a)\exp[j n\varphi] \qquad (4.325)$$
so that eq. (4.324) becomes
$$\Theta_{\xi_R}(u) = \int_0^\infty \int_0^{2\pi} \sum_{n=-\infty}^\infty j^n J_n(u a)\exp[j n\varphi]\, p_{A,\Phi}(a, \varphi)\, da\, d\varphi = \sum_{n=-\infty}^\infty j^n \int_0^\infty J_n(u a)\, p_A(a) \left[\int_0^{2\pi} \exp[j n(\omega_0 t + \varphi)]\, p_{\phi|A}(\varphi \mid a)\, d\varphi\right] da \qquad (4.326)$$
Here, $p_{\phi|A}(\varphi \mid a)$ stands for the PDF of the phase $\phi(t)$, given a value $a$ of the magnitude
$$p_{\phi|A}(\varphi \mid a) = \frac{p_{A,\phi}(a, \varphi)}{p_A(a)} \qquad (4.327)$$
One can obtain $\Theta_{\xi_R}(u)$ independent of time (i.e. $\xi_R(t)$ will be a stationary process) if and only if17
$$p_{\phi|A}(\varphi \mid a) = \frac{1}{2\pi} \qquad (4.328)$$
This means that the distribution of the phase does not depend on the value of $a$, i.e. the magnitude and the phase are statistically independent18. Also, taking into account that
$$\int_0^{2\pi} \exp[j n(\omega_0 t + \varphi)]\, d\varphi = 0 \qquad (4.329)$$

17 Recall that the expression for the total phase $\Phi(t)$ contains the term $\omega_0 t$. This time dependence is gone if eq. (4.328) holds.
for $n \ne 0$, eq. (4.326) can be rewritten as
$$\Theta_{\xi_R}(u) = \int_0^\infty J_0(u a)\, p_A(a)\, da \qquad (4.330)$$
or, taking the inverse Fourier transform,
$$p_{\xi_R}(x_1) = \frac{1}{2\pi}\int_{-\infty}^\infty \Theta_{\xi_R}(u)\exp[-j x_1 u]\, du = \frac{1}{2\pi}\int_{-\infty}^\infty \int_0^\infty J_0(u a)\, p_A(a)\exp[-j x_1 u]\, da\, du \qquad (4.331)$$
In order to express $p_A(a)$ in terms of $p_{\xi_R}(x_1)$, it can be observed that $\Theta_{\xi_R}(u)$ is just the Hankel transform19 (see [38]) of $p_A(a)/a$ and thus can be easily inverted to produce
$$p_A(a) = a\int_0^\infty J_0(u a)\, \Theta_{\xi_R}(u)\, u\, du = a\int_0^\infty u\, J_0(u a)\int_{-\infty}^\infty p_{\xi_R}(x_1)\exp[j x_1 u]\, dx_1\, du \qquad (4.332)$$
The transform pair given by eqs. (4.331) and (4.332) is known as the Blanc-Lapierre transform [19]. An alternative expression can be obtained by rewriting eq. (4.331) as
$$p_{\xi_R}(x_1) = \frac{1}{2\pi}\int_0^\infty p_A(a)\, da \int_{-\infty}^\infty J_0(u a)\exp[-j x_1 u]\, du \qquad (4.333)$$
Since the inner integral is equal to (see [39])
$$\int_{-\infty}^\infty J_0(u a)\exp[-j x_1 u]\, du = \begin{cases} \dfrac{2}{\sqrt{a^2 - x_1^2}} & \text{if } a \ge |x_1| \\[2mm] 0 & \text{if } a < |x_1| \end{cases} \qquad (4.334)$$
one can write
$$p_{\xi_R}(x_1) = \frac{1}{\pi}\int_{|x_1|}^\infty \frac{p_A(a)}{\sqrt{a^2 - x_1^2}}\, da = \frac{1}{\pi}\int_0^\infty p_A(|x_1|\cosh a)\, da \qquad (4.335)$$

18 This condition can be relaxed if only wide sense stationarity is required.
19 A pair of Hankel transforms is defined by $F(u) = \int_0^\infty f(x)\, J_0(x u)\, x\, dx$, $f(x) = \int_0^\infty F(u)\, J_0(x u)\, u\, du$ [38].
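A quick numerical check of eq. (4.335), using the classical Rayleigh/Gaussian pair (parameter values illustrative): a Rayleigh-distributed envelope with uniform phase must yield a zero-mean Gaussian marginal for the process itself.

```python
import numpy as np

sigma = 0.7
# Rayleigh envelope PDF with per-component variance sigma^2
p_A = lambda a: (a / sigma**2) * np.exp(-a**2 / (2 * sigma**2))

def p_marginal(x, t_max=10.0, n=100001):
    # eq. (4.335): p(x) = (1/pi) * int_0^inf p_A(|x| cosh t) dt   (x != 0)
    t = np.linspace(0.0, t_max, n)
    y = p_A(np.abs(x) * np.cosh(t))
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2 / np.pi)

x = 0.5
pm = p_marginal(x)
gauss = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
print(pm, gauss)
```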
4.7.2.2
Kluyver Equation
Very often a narrow band random process $\xi(t)$ on a physical level is obtained as a sum of a number of narrow band processes with known characteristics. For example, multipath propagation of radio waves results in a weighted sum of randomly amplified and delayed replicas of the transmitted signal. A similar situation is encountered with sea radar, when the reflected wave is a combination of reflections from numerous irregularities of the sea surface. A great number of similar examples can be found elsewhere. In such problems, it is beneficial to find a relationship between the PDF of each summand and the PDF of the net narrow band process. This problem is treated in this section. Let a complex random variable $\zeta$ be a sum of $N$ independent complex random variables $\zeta_n$, $n = 1, 2, \ldots, N$:
$$\zeta = \xi + j\eta = \sum_{n=1}^N \zeta_n = \sum_{n=1}^N \xi_n + j\sum_{n=1}^N \eta_n \qquad (4.336)$$
where $\xi_n = \mathrm{Re}(\zeta_n)$, $\eta_n = \mathrm{Im}(\zeta_n)$. It is assumed that the joint PDF $p_{\xi_n\eta_n}(x_n, y_n)$ is known for each individual $\zeta_n$. This can be recast in terms of a joint PDF of magnitude $A_n$ and phase $\Phi_n$ as in eq. (4.323)
$$\zeta_n = A_n \exp(j\Phi_n), \qquad A_n = \sqrt{\xi_n^2 + \eta_n^2}, \qquad \Phi_n = \mathrm{atan}\,\frac{\eta_n}{\xi_n} \qquad (4.337)$$
$$p_{A_n,\Phi_n}(a_n, \varphi_n) = a_n\, p_{\xi_n\eta_n}(a_n\cos\varphi_n, a_n\sin\varphi_n) = \frac{1}{2\pi}\, p_{A_n}(a_n) \qquad (4.338)$$
The process $\zeta(t)$ itself can also be described by the joint PDF $p_\zeta(x, y)$ of its real and imaginary parts or, equivalently, by its characteristic function
$$\Theta_\zeta(u, v) = \int_{-\infty}^\infty \int_{-\infty}^\infty p_\zeta(x, y)\exp[j(u x + v y)]\, dx\, dy \qquad (4.339)$$
Since $\xi$ is a sum of $N$ independent random variables, its characteristic function is the product of the characteristic functions of each single term [18], i.e.
$$\Theta_\xi(u) = \prod_{n=1}^N \Theta_{\xi_n}(u) = \prod_{n=1}^N \int_0^\infty J_0(u a_n)\, p_{A_n}(a_n)\, da_n = \prod_{n=1}^N \langle J_0(u a_n)\rangle \qquad (4.340)$$
A similar equation holds for Θ_η(u). In turn, according to the Blanc-Lapierre transformation (eq. 4.332), p_A(a) can be expressed in terms of Θ_ξ(u) as

p_A(a) = a ∫_0^{∞} J_0(ua) Θ_ξ(u) u du = a ∫_0^{∞} J_0(ua) ∏_{n=1}^{N} ⟨J_0(u a_n)⟩ u du
     = a ∫_0^{∞} J_0(ua) [ ∏_{n=1}^{N} ∫_0^{∞} J_0(u a_n) p_{A_n}(a_n) da_n ] u du    (4.341)

NARROW BAND RANDOM PROCESSES

This equation relates the distribution of the magnitude of the sum to the distributions of the magnitudes of the individual components. In practice it is the magnitude squared, I = a², which is available for measurement. In this case the PDF of the intensity I is easily found to be

p_i(I) = p_A(√I)/(2√I) = (√I/(2√I)) ∫_0^{∞} J_0(u√I) ∏_{n=1}^{N} [ ∫_0^{∞} J_0(u a_n) p_{A_n}(a_n) da_n ] u du
     = (1/2) ∫_0^{∞} J_0(u√I) ∏_{n=1}^{N} ⟨J_0(u a_n)⟩ u du    (4.342)

The last equation is known as the Kluyver equation.
4.7.2.3 Relations Between Moments of p_{A_n}(a_n) and p_i(I)

The Kluyver equation (4.342) can rarely be used to produce analytical results. However, a useful relation between the moments of the sum and those of its components can be obtained analytically. In order to do so, one can make use of the moment generating function Θ(λ), as discussed in Section 2.1.1:

m_I^{(N)} = (−1)^N ∂^N/∂λ^N Θ(λ) |_{λ=0}    (4.343)

A Taylor-type expansion for Θ(λ) can be obtained directly from the Kluyver equation (4.342):

Θ(λ) = ∫_0^{∞} e^{−λI} p_i(I) dI = ∫_0^{∞} e^{−λI} (1/2) ∫_0^{∞} J_0(u√I) ∏_{n=1}^{N} ⟨J_0(u a_n)⟩ u du dI
     = (1/2) ∫_0^{∞} ∏_{n=1}^{N} ⟨J_0(u a_n)⟩ u du ∫_0^{∞} e^{−λI} J_0(u√I) dI    (4.344)
The first step is to calculate the inner integral in eq. (4.344). This can be achieved by using eq. (6.614-1) on page 709 in [39]

∫_0^{∞} e^{−αx} J_{2ν}(β√x) dx = (β√π/(4α^{3/2})) exp(−β²/(8α)) [ I_{(2ν−1)/2}(β²/(8α)) − I_{(2ν+1)/2}(β²/(8α)) ]    (4.345)

and setting α = λ, β = u, ν = 0, x = I, which implies that

β²/(8α) = u²/(8λ),  (2ν−1)/2 = −1/2,  (2ν+1)/2 = 1/2    (4.346)

and thus

∫_0^{∞} e^{−λI} J_0(u√I) dI = (u√π/(4λ^{3/2})) exp(−u²/(8λ)) [ I_{−1/2}(u²/(8λ)) − I_{1/2}(u²/(8λ)) ]    (4.347)

This last equation can be further simplified by observing that (using 10.2.13–14 on page 443 in [29])

I_{1/2}(z) = √(2z/π) (sinh z)/z = √(2/(πz)) sinh z    (4.348)
I_{−1/2}(z) = √(2z/π) (cosh z)/z = √(2/(πz)) cosh z    (4.349)

and thus

I_{−1/2}(u²/(8λ)) − I_{1/2}(u²/(8λ)) = √(16λ/(πu²)) exp(−u²/(8λ))    (4.350)

Taking eq. (4.350) into account, eq. (4.347) becomes

∫_0^{∞} e^{−λI} J_0(u√I) dI = (u√π/(4λ^{3/2})) exp(−u²/(8λ)) √(16λ/(πu²)) exp(−u²/(8λ)) = (1/λ) exp(−u²/(4λ))    (4.351)

Finally, the expression for Θ(λ) becomes

Θ(λ) = (1/(2λ)) ∫_0^{∞} exp(−u²/(4λ)) ∏_{n=1}^{N} ⟨J_0(u a_n)⟩ u du    (4.352)
As the next step, let us use the following substitution of variables in eq. (4.352):

x = u²/(4λ),  u = √(4λx),  du = √(λ/x) dx    (4.353)

so that u du = 2λ dx and eq. (4.352) becomes

Θ(λ) = ∫_0^{∞} ∏_{n=1}^{N} ⟨J_0(√(4λx) a_n)⟩ exp(−x) dx    (4.354)
The last integral cannot be calculated analytically in closed form; therefore one can use the Taylor expansion of J_0(√(4λx) a_n) with respect to the variable λ. Using eq. 9.1.10 in [29]

J_ν(z) = (z/2)^ν Σ_{k=0}^{∞} (−z²/4)^k / (k! Γ(ν + k + 1))    (4.355)

with ν = 0 and z = √(4λx) a_n, one can obtain

J_0(√(4λx) a_n) = Σ_{k_n=0}^{∞} (−1)^{k_n} λ^{k_n} x^{k_n} a_n^{2k_n} / (k_n! k_n!)    (4.356)

and, thus

⟨J_0(√(4λx) a_n)⟩ = Σ_{k_n=0}^{∞} (−1)^{k_n} λ^{k_n} x^{k_n} ⟨a_n^{2k_n}⟩ / (k_n! k_n!)    (4.357)
Furthermore,

∏_{n=1}^{N} ⟨J_0(√(4λx) a_n)⟩ = ∏_{n=1}^{N} Σ_{k_n=0}^{∞} (−1)^{k_n} λ^{k_n} x^{k_n} ⟨a_n^{2k_n}⟩/(k_n! k_n!)
  = Σ_{k_N=0}^{∞} Σ_{k_{N−1}=0}^{∞} ··· Σ_{k_1=0}^{∞} (−1)^m λ^m x^m ⟨a_N^{2k_N}⟩⟨a_{N−1}^{2k_{N−1}}⟩ ··· ⟨a_1^{2k_1}⟩ / (k_N! k_N! k_{N−1}! k_{N−1}! ··· k_1! k_1!)
  = Σ_{m=0}^{∞} (−1)^m c_m λ^m x^m    (4.358)

where

m = Σ_{n=1}^{N} k_n    (4.359)

and

c_m = Σ_{k_1 + ··· + k_N = m} ⟨a_N^{2k_N}⟩⟨a_{N−1}^{2k_{N−1}}⟩ ··· ⟨a_1^{2k_1}⟩ / (k_N! k_N! k_{N−1}! k_{N−1}! ··· k_1! k_1!)    (4.360)
Finally, integrating eq. (4.354) with respect to x and taking advantage of eq. (4.358), one obtains

Θ(λ) = ∫_0^{∞} ( Σ_{m=0}^{∞} (−1)^m c_m λ^m x^m ) exp(−x) dx = Σ_{m=0}^{∞} (−1)^m c_m m! λ^m    (4.361)

Comparing this with eq. (4.343), the following expression can be derived:

m_I^{(k)} = k! k! c_k = k! k! Σ_{k_1 + ··· + k_N = k} ⟨a_N^{2k_N}⟩⟨a_{N−1}^{2k_{N−1}}⟩ ··· ⟨a_1^{2k_1}⟩ / (k_N! k_N! k_{N−1}! k_{N−1}! ··· k_1! k_1!)    (4.362)
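Equation (4.362) is easy to exercise numerically. The sketch below (an illustration, not from the book) enumerates the constrained sum for two unit-amplitude phasors, for which the exact intensity moments are the central binomial coefficients C(2k, k):

```python
import math
from itertools import product as iproduct

def intensity_moment(k, amp_moments):
    """m_I^(k) = k! k! * sum over k_1+...+k_N = k of prod_n <a_n^{2 k_n}>/(k_n!)^2, eq. (4.362).

    amp_moments[n] is a callable returning <a_n^{2 k_n}> for the n-th component."""
    N = len(amp_moments)
    total = 0.0
    for ks in iproduct(range(k + 1), repeat=N):
        if sum(ks) != k:
            continue
        term = 1.0
        for n, kn in enumerate(ks):
            term *= amp_moments[n](kn) / (math.factorial(kn) ** 2)
        total += term
    return math.factorial(k) ** 2 * total

# Two unit-amplitude phasors: <a_n^{2k}> = 1, and E[I^k] = C(2k, k) exactly
unit = lambda kn: 1.0
for k in range(1, 7):
    assert round(intensity_moment(k, [unit, unit])) == math.comb(2 * k, k)
```

For random component amplitudes one simply replaces `unit` with the corresponding moment sequence ⟨a_n^{2k}⟩.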
4.7.2.4 The Gram–Charlier Series for p_R(x) and p_i(I)

As mentioned above, it is the distribution of the intensity, p_i(I), which is usually available from measurement. At the same time, it is important for numerical simulations to know and reproduce the statistics of the field amplitude p_R(x). The exact analytical form, given by eqs. (4.330)–(4.331), is rarely achievable (however, a number of results can be found in [19]). Important relations like (4.362) give insight into how the distribution behaves but do not really allow its numerical simulation. In this section, useful results allowing the reconstruction of the PDF p_R(x) from the PDF of the intensity p_i(I) are derived.

As the first step, the Blanc-Lapierre transformation (4.331) can be rewritten in terms of p_i(I), using the fact that

p_i(I) = p_A(√I)/(2√I)  ⟹  p_A(√I) = 2√I p_i(I)    (4.363)

and thus

p_R(x) = (1/2π) ∫_{−∞}^{∞} [ ∫_0^{∞} J_0(ua) p_A(a) da ] exp[−jxu] du
  = (1/2π) ∫_{−∞}^{∞} ∫_0^{∞} J_0(u√I) p_A(√I) exp[−jxu] du d√I
  = (1/2π) ∫_{−∞}^{∞} ∫_0^{∞} J_0(u√I) 2√I p_i(I) (dI/(2√I)) exp[−jxu] du
  = (1/2π) ∫_{−∞}^{∞} exp[−jxu] du ∫_0^{∞} J_0(u√I) p_i(I) dI    (4.364)
Let now the PDF p_i(I) be represented by its expansion in the Laguerre polynomials L_k(βI), as discussed in Section 2.2, i.e.

p_i(I) = β exp[−βI] Σ_{k=0}^{∞} α_k L_k(βI)    (4.365)

where

α_k = ∫_0^{∞} p_i(I) L_k(βI) dI    (4.366)

In this case, eq. (4.364) becomes

p_R(x) = (1/2π) ∫_{−∞}^{∞} exp[−jxu] du ∫_0^{∞} J_0(u√I) β exp[−βI] Σ_{k=0}^{∞} α_k L_k(βI) dI
  = (β/2π) Σ_{k=0}^{∞} α_k ∫_{−∞}^{∞} exp[−jxu] du ∫_0^{∞} J_0(u√I) L_k(βI) exp[−βI] dI
  = (β/2π) Σ_{k=0}^{∞} α_k ∫_{−∞}^{∞} F_k(u) exp[−jxu] du    (4.367)

where

F_k(u) = ∫_0^{∞} J_0(u√I) L_k(βI) exp[−βI] dI    (4.368)
In order to simplify eq. (4.368), one can use the following integral (2.19.12-7 on page 474 in [40]):

∫_0^{∞} x^{ν/2} e^{−px} J_ν(b√x) L_n^ν(cx) dx = J(p, c)    (4.369)

where

J(p, c) = (b/2)^ν (p − c)^n p^{−(ν+n+1)} exp(−b²/(4p)) L_n^ν( b²c/(4p(c − p)) ),  p ≠ c    (4.370)

J(c, c) = (1/(n! c^{ν+n+1})) (b/2)^{ν+2n} exp(−b²/(4c))    (4.371)

Using the last equation with ν = 0, p = c = β, n = k and b = u, one obtains

F_k(u) = (u^{2k}/(4^k k! β^{k+1})) exp(−u²/(4β))    (4.372)

Rewriting eq. (4.367), taking into account eq. (4.372) and substituting u = 2√β t, one obtains

p_R(x) = (β/2π) Σ_{k=0}^{∞} α_k ∫_{−∞}^{∞} F_k(u) exp[−jxu] du
  = (β/2π) Σ_{k=0}^{∞} (α_k/(4^k k! β^{k+1})) ∫_{−∞}^{∞} u^{2k} exp(−u²/(4β)) exp[−jxu] du
  = (√β/π) Σ_{k=0}^{∞} (α_k/k!) G_k(x)    (4.373)

where the following notation is used

G_k(x) = ∫_{−∞}^{∞} t^{2k} exp(−t²) exp[−j(2√β x)t] dt = exp[−βx²] ∫_{−∞}^{∞} t^{2k} exp[−(t + j√β x)²] dt    (4.374)

Using the standard integral 3.462-4 on page 338 in [39]

∫_{−∞}^{∞} x^n exp[−(x − γ)²] dx = (2j)^{−n} √π H_n(jγ)    (4.375)

one can obtain

G_k(x) = exp[−βx²] ∫_{−∞}^{∞} t^{2k} exp[−(t + j√β x)²] dt = 2^{−2k} (−1)^k √π exp[−βx²] H_{2k}(√β x)    (4.376)

and thus

p_R(x) = (√β/π) Σ_{k=0}^{∞} (α_k/k!) G_k(x) = (√β/√π) exp[−βx²] Σ_{k=0}^{∞} ((−1)^k α_k/(2^{2k} k!)) H_{2k}(√β x)    (4.377)
Here H_n(x) is the Hermite polynomial of order n, as discussed in Section 2.2. The last expression should not come as a surprise. If p_i(I) is an exponential distribution (χ² with two degrees of freedom), then α_0 = 1 and α_k = 0 for k > 0. This reduces eq. (4.377) to the Gaussian PDF, as it must. At the same time, eq. (4.377) is nothing but the Gram–Charlier expansion of the PDF p_R(x). The odd order terms are missing because p_R(x) is a symmetric PDF. Treatment of non-stationary processes becomes much more complicated, since independence between the phase and the magnitude no longer holds. It is possible to obtain a number of useful results for the Gaussian process using direct calculation of the related statistics. This approach is described in some detail in section 4.7.3 below. A number of examples for the non-Gaussian case are considered in section 4.7.4.
4.7.3 Gaussian Narrow Band Process

4.7.3.1 First Order Statistics
Let ξ_I(t) and ξ_Q(t) be two independent Gaussian processes with equal variance σ² and mean values m_I and m_Q respectively. These can be considered as the components of a complex random process

ζ(t) = ξ_I(t) + jξ_Q(t) = A(t) exp[jφ(t)]    (4.378)

The goal of this section is to derive the first order statistics of the envelope A(t) and phase φ(t) of this complex Gaussian process. Since the in-phase and quadrature components are independent, their joint PDF is the product of the marginal PDFs of each component, i.e.

p_{ξ_I ξ_Q}(x, y) = p_{ξ_I}(x) p_{ξ_Q}(y) = (1/(2πσ²)) exp[ −((x − m_I)² + (y − m_Q)²)/(2σ²) ]    (4.379)

Making a transformation of variables according to

x = A cos φ,  y = A sin φ    (4.380)

the Jacobian of this transformation is

J = | ∂x/∂A  ∂y/∂A ; ∂x/∂φ  ∂y/∂φ | = | cos φ  sin φ ; −A sin φ  A cos φ | = A    (4.381)
and, thus, the joint PDF of the envelope and phase is

p_{Aφ}(A, φ) = p_{ξ_I ξ_Q}(A cos φ, A sin φ) A
  = (A/(2πσ²)) exp[ −((A cos φ − m_I)² + (A sin φ − m_Q)²)/(2σ²) ]
  = (A/(2πσ²)) exp[ −(A² + m_I² + m_Q² − 2A(m_I cos φ + m_Q sin φ))/(2σ²) ]    (4.382)

Further simplification can be achieved by introducing two new parameters

m = √(m_I² + m_Q²),  θ = atan(m_Q/m_I)    (4.383)

and thus

m_I = m cos θ,  m_Q = m sin θ    (4.384)

As a result, eq. (4.382) can be rewritten as

p_{Aφ}(A, φ) = (A/(2πσ²)) exp[ −(A² + m² − 2Am(cos φ cos θ + sin φ sin θ))/(2σ²) ]
  = (A/(2πσ²)) exp[ −(A² + m² − 2Am cos(φ − θ))/(2σ²) ]    (4.385)
With this in hand, one can calculate the PDF of the envelope by integrating expression (4.385) over φ:

p_A(A) = ∫_{−π}^{π} p_{Aφ}(A, φ) dφ = (A/(2πσ²)) exp[ −(A² + m²)/(2σ²) ] ∫_{−π}^{π} exp[ Am cos(φ − θ)/σ² ] dφ
  = (A/σ²) exp[ −(A² + m²)/(2σ²) ] I_0(mA/σ²)    (4.386)

Here the following standard integral has been used:

∫_{−π}^{π} exp[ Am cos(φ − θ)/σ² ] dφ = ∫_{−π}^{π} exp[ Am cos(φ)/σ² ] dφ = 2π I_0(Am/σ²)    (4.387)

Thus, the distribution of the envelope is Rician. In the very same manner, the distribution of the phase can be found to be

p_φ(φ) = ∫_0^{∞} p_{Aφ}(A, φ) dA = ∫_0^{∞} (A/(2πσ²)) exp[ −(A² + m² − 2Am cos(φ − θ))/(2σ²) ] dA    (4.388)
Using the substitutions

z = (A − m cos(φ − θ))/σ,  dA = σ dz,  A = σz + m cos(φ − θ)    (4.389)

eq. (4.388) becomes

p_φ(φ) = ∫_{−m cos(φ−θ)/σ}^{∞} ((σz + m cos(φ − θ))/(2πσ)) exp[ −(σ²z² + m² sin²(φ − θ))/(2σ²) ] dz
  = (1/2π) exp(−m²/(2σ²)) + (m cos(φ − θ)/(√(2π) σ)) Φ(m cos(φ − θ)/σ) exp(−m² sin²(φ − θ)/(2σ²))    (4.390)

where Φ(s) is the CDF of the standard Gaussian PDF

Φ(s) = (1/√(2π)) ∫_{−∞}^{s} exp(−x²/2) dx    (4.391)

For the particular case of m = m_I = m_Q = 0,

p_A(A) = (A/σ²) exp(−A²/(2σ²))    (4.392)

p_φ(φ) = 1/(2π)    (4.393)
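As a quick numerical sanity check of the Rician result (4.386) (an illustration, not from the book; NumPy's `np.i0` supplies I_0, and the grid limits are arbitrary): the density should integrate to one and have second moment E[A²] = m² + 2σ².

```python
import numpy as np

def trap(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

sigma, m = 1.0, 2.0
A = np.linspace(0.0, 12.0, 120001)
# Rician envelope PDF, eq. (4.386); np.i0 is the modified Bessel function I_0
pA = (A / sigma**2) * np.exp(-(A**2 + m**2) / (2 * sigma**2)) * np.i0(m * A / sigma**2)

assert abs(trap(pA, A) - 1.0) < 1e-6                            # normalization
assert abs(trap(A**2 * pA, A) - (m**2 + 2 * sigma**2)) < 1e-5   # E[A^2] = m^2 + 2 sigma^2
```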
4.7.3.2 Correlation Function of the In-phase and Quadrature Components

Second order statistics of the in-phase and quadrature components can be easily found using the fact that

S_I(ω) = |H_I(ω)|² N_0/2,  S_Q(ω) = |H_Q(ω)|² N_0/2    (4.394)

where S_I(ω) and S_Q(ω) are the power spectral densities (PSD) of the in-phase and quadrature components. According to the Wiener–Khinchin theorem, we then have

R_I(τ) = (1/2π) ∫_{−∞}^{∞} S_I(ω) exp[jωτ] dω    (4.395)

Alternatively, the correlation function can be obtained through factorization, as described below. In general, the transfer function of a filter can be represented as a ratio of two polynomials

H(s) = B(s)/A(s) = (b_n s^n + b_{n−1} s^{n−1} + ··· + b_0)/(a_n s^n + a_{n−1} s^{n−1} + ··· + a_0)    (4.396)
One can expand H(s) using the method of partial fraction expansion:

H(s) = a_1/(s − p_1) + a_2/(s − p_2) + ··· + a_n/(s − p_n) + K(s)    (4.397)

where a_k are the residues, p_k are the poles and K(s) is the remaining direct term. Assuming that K(s) = 0, the impulse response of the filter is given by

h(t) = (1/2π) ∫_{−∞}^{∞} H(jω) exp[jωt] dω = Σ_{k=1}^{n} a_k exp[p_k t],  t ≥ 0    (4.398)

Now that the expression for h(t) is factorized, one can determine the autocorrelation function (all poles are assumed to have negative real parts):

R(τ) = ∫_0^{∞} h(t) h(t + τ) dt = Σ_{k=1}^{n} Σ_{l=1}^{n} a_k a_l exp[p_k τ] ∫_0^{∞} exp[(p_k + p_l)t] dt
  = −Σ_{k=1}^{n} Σ_{l=1}^{n} a_k a_l exp[p_k τ]/(p_k + p_l)    (4.399)

Finally, the expression for R(τ) has the following form

R(τ) = Σ_{k=1}^{n} c_k exp[p_k τ]    (4.400)

where the coefficients c_k are given by

c_k = −a_k Σ_{l=1}^{n} a_l/(p_k + p_l)    (4.401)
4.7.3.3 Second Order Statistics of the Envelope

In order to find the second order statistics of the envelope and the phase, one needs the fourth order joint probability density function of ξ_I(t), ξ_I(t+τ), ξ_Q(t) and ξ_Q(t+τ). Since ξ_I(t) and ξ_Q(t+τ) are independent at any moment of time, one obtains

p(x, y, x_τ, y_τ) = p_{ξ_I ξ_Iτ}(x, x_τ) p_{ξ_Q ξ_Qτ}(y, y_τ)
  = (1/(2πσ²√(1−ρ²(τ)))) exp[ −(x² − 2ρ(τ)x x_τ + x_τ²)/(2σ²(1−ρ²(τ))) ]
  × (1/(2πσ²√(1−ρ²(τ)))) exp[ −(y² − 2ρ(τ)y y_τ + y_τ²)/(2σ²(1−ρ²(τ))) ]    (4.402)

or

p(x, y, x_τ, y_τ) = (1/(4π²σ⁴(1−ρ²(τ)))) exp[ −(x² + y² − 2ρ(τ)(x x_τ + y y_τ) + x_τ² + y_τ²)/(2σ²(1−ρ²(τ))) ]    (4.403)

Here ρ(τ) is the normalized correlation function, such that ρ(0) = 1. Changing variables according to

x = A cos φ,  y = A sin φ,  x_τ = A_τ cos φ_τ,  y_τ = A_τ sin φ_τ    (4.404)

and noticing that the Jacobian of the transformation (4.404) is the product of the Jacobians of the two independent polar transformations,

J = A A_τ    (4.405)

the following joint PDF of the magnitudes and phases at two separate moments of time is obtained:

p(A, A_τ, φ, φ_τ) = (A A_τ/(4π²σ⁴(1−ρ²(τ)))) exp[ −(A² + A_τ² − 2ρ(τ) A A_τ cos(φ_τ − φ))/(2σ²(1−ρ²(τ))) ]    (4.406)

The joint probability density of the amplitudes at two separate moments of time can be calculated by integrating (4.406) over the phase variables, i.e.

p(A, A_τ) = ∫_{−π}^{π} ∫_{−π}^{π} p(A, A_τ, φ, φ_τ) dφ dφ_τ
  = (A A_τ/(σ⁴(1−ρ²(τ)))) exp[ −(A² + A_τ²)/(2σ²(1−ρ²(τ))) ] I_0( ρ(τ) A A_τ/(σ²(1−ρ²(τ))) )    (4.407)
Here, once again, the integral (4.387) has been used. The last expression allows one to calculate the correlation function of the envelope. The first step is to calculate the average value of the magnitude. Since it is already known that the marginal PDF is given by the Rayleigh distribution (4.392), the mean and variance are easily found to be

m_A = ∫_0^{∞} A p_A(A) dA = ∫_0^{∞} (A²/σ²) exp(−A²/(2σ²)) dA = √(π/2) σ    (4.408)

σ_A² = ∫_0^{∞} (A − m_A)² p_A(A) dA = (2 − π/2) σ²    (4.409)
Further simplification can be achieved by normalizing the magnitude variables by σ:

z_1 = A/σ,  z_2 = A_τ/σ,  J = σ²    (4.410)

and, thus

p(z_1, z_2) = (z_1 z_2/(1−ρ²(τ))) exp[ −(z_1² + z_2²)/(2(1−ρ²(τ))) ] I_0( ρ(τ) z_1 z_2/(1−ρ²(τ)) )    (4.411)

Using the bilinear expansion of the Bessel function in terms of the Laguerre polynomials (the Hille–Hardy formula)

Σ_{n=0}^{∞} L_n(z_1²/2) L_n(z_2²/2) ρ^{2n} = (1/(1−ρ²)) exp[ −(z_1² + z_2²)ρ²/(2(1−ρ²)) ] I_0( ρ z_1 z_2/(1−ρ²) )    (4.412)

the expression (4.411) can be transformed into

p(z_1, z_2) = z_1 z_2 exp[ −(z_1² + z_2²)/2 ] Σ_{n=0}^{∞} ρ^{2n}(τ) L_n(z_1²/2) L_n(z_2²/2)    (4.413)

The expansion (4.413) can now be used to calculate the second moment of the envelope:

⟨A A_τ⟩ = σ² ⟨z_1 z_2⟩ = σ² ∫_0^{∞} ∫_0^{∞} z_1 z_2 p(z_1, z_2) dz_1 dz_2
  = σ² Σ_{n=0}^{∞} ρ^{2n}(τ) [ ∫_0^{∞} z² exp(−z²/2) L_n(z²/2) dz ]²
  = σ² Σ_{n=0}^{∞} ρ^{2n}(τ) Γ²(n − 1/2)/(8 (n!)²)    (4.414)

Finally, since the n = 0 term equals m_A²,

C_AA(τ) = ⟨A A_τ⟩ − m_A² = σ² Σ_{n=1}^{∞} (Γ²(n − 1/2)/(8 (n!)²)) ρ^{2n}(τ)    (4.415)
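The expansion (4.413)–(4.415) can be verified against a direct two-dimensional quadrature of the joint PDF (4.411). The sketch below is an illustration, not from the book; ρ = 0.6, the grid and the truncation at 60 terms are arbitrary choices, and NumPy's `np.i0` supplies I_0:

```python
import math
import numpy as np

rho = 0.6
z = np.linspace(0.0, 8.0, 801)
dz = z[1] - z[0]
Z1, Z2 = np.meshgrid(z, z, indexing="ij")
q = 1.0 - rho**2
# joint PDF of the normalized envelopes, eq. (4.411)
pdf = (Z1 * Z2 / q) * np.exp(-(Z1**2 + Z2**2) / (2.0 * q)) * np.i0(rho * Z1 * Z2 / q)

quad = float(np.sum(Z1 * Z2 * pdf) * dz * dz)   # <z1 z2> by 2-D quadrature
series = sum(rho**(2 * n) * math.gamma(n - 0.5)**2 / (8.0 * math.factorial(n)**2)
             for n in range(60))                 # eq. (4.414)
assert abs(quad - series) < 1e-3
assert abs(float(np.sum(pdf) * dz * dz) - 1.0) < 1e-3   # normalization of (4.411)
```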
If only the first two terms are taken into account, then

C_AA(τ) = σ_A² (0.921 ρ²(τ) + 0.058 ρ⁴(τ) + ···) ≈ σ_A² ρ²(τ)    (4.416)

Introducing a new variable ψ = φ_τ − φ and integrating eq. (4.406) over A, A_τ and φ, one obtains the expression for the PDF of the phase difference at two moments of time separated by τ:

p_ψ(ψ) = ((1 − ρ²(τ))/(2π)) [ 1/(1 − x²) + x (π/2 + asin x)/(1 − x²)^{3/2} ],  x = ρ(τ) cos ψ    (4.417)

4.7.3.4 Level Crossing Rate
The level crossing rate (LCR) of a random process is defined as the number of crossings of a fixed threshold R per unit time in the positive direction (i.e. from a level below R to a level above R). In order to find the LCR, one needs the joint probability density of the random process and its derivative, p(x, ẋ). With this in hand, the LCR can be calculated as

LCR(R) = ∫_0^{∞} ẋ p(R, ẋ) dẋ    (4.418)

For a random process with Gaussian components,

p_{AȦ}(A, Ȧ) = p_A(A) p_Ȧ(Ȧ)    (4.419)

where

p_A(A) = (A/σ²) exp[ −(A² + m²)/(2σ²) ] I_0(mA/σ²)    (4.420)

and

p_Ȧ(Ȧ) = (1/√(2πσ²(−ρ_0''))) exp[ −Ȧ²/(2σ²(−ρ_0'')) ]    (4.421)

The following short notation is also used:

ρ_0'' = ∂²ρ(τ)/∂τ² |_{τ=0}    (4.422)
Thus, the LCR can be calculated as

LCR(R) = ∫_0^{∞} p_A(R) p_Ȧ(Ȧ) Ȧ dȦ = p_A(R) ∫_0^{∞} p_Ȧ(Ȧ) Ȧ dȦ = p_A(R) σ √(−ρ_0''/(2π))
  = √(−ρ_0''/(2π)) (R/σ) exp[ −(R² + m²)/(2σ²) ] I_0(mR/σ²)    (4.423)

In the particular case of m = 0, one obtains

LCR(R) = √(−ρ_0''/(2π)) (R/σ) exp(−R²/(2σ²))    (4.424)
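A minimal simulation check of (4.423)–(4.425) (an illustration, not from the book; the discrete line spectrum, seed, duration and tolerance are arbitrary choices): synthesize a narrow band Gaussian process as a sum of random-phase harmonics, count envelope upcrossings, and compare with the formula. The agreement is statistical, so a loose tolerance is used.

```python
import numpy as np

rng = np.random.default_rng(3)

# Symmetric discrete line spectrum around the carrier (symmetry is assumed by the LCR formula)
f = np.abs(rng.standard_normal(128))
om = np.concatenate([f, -f])              # frequency offsets, roughly unit RMS bandwidth
K = om.size
b2 = float(np.mean(om**2))                # -rho''(0) for this spectrum, cf. eq. (4.425)
sigma2 = 0.5                              # variance of each quadrature component

dt, T = 0.02, 4000.0
t = np.arange(0.0, T, dt)
g = np.zeros(t.size, dtype=complex)
for w, ph in zip(om, rng.uniform(0.0, 2.0 * np.pi, K)):
    g += np.exp(1j * (w * t + ph)) / np.sqrt(K)
A = np.abs(g)                             # envelope; approximately Rayleigh, E[A^2] = 1

R = 1.0
upcrossings = int(np.sum((A[:-1] < R) & (A[1:] >= R)))
lcr_sim = upcrossings / T
lcr_theory = np.sqrt(b2 / (2 * np.pi)) * (R / np.sqrt(sigma2)) * np.exp(-R**2 / (2 * sigma2))
assert abs(lcr_sim - lcr_theory) / lcr_theory < 0.15
```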
In many cases, it is easier to calculate the coefficient ρ_0'' through the power spectral density. Indeed, using properties of the Fourier transform, one obtains

−ρ_0'' = −∂²ρ(τ)/∂τ² |_{τ=0} = (1/(2πσ²)) ∫_{−∞}^{∞} ω² S(ω) dω    (4.425)

4.7.4 Examples of Non-Gaussian Narrow Band Random Processes

4.7.4.1 K Distribution
The K distribution

p_A(a) = (2b/Γ(ν)) (ba/2)^ν K_{ν−1}(ba),  ν > 0, a ≥ 0    (4.426)

is a rare case where the Blanc-Lapierre transformation can be calculated analytically. In order to do so, the characteristic function of the PDF (4.426) is found first, using the Bessel transform [19,38]:

Θ_R(s) = ∫_0^{∞} p_A(a) J_0(sa) da = (b^{ν+1}/(2^{ν−1} Γ(ν))) ∫_0^{∞} a^ν K_{ν−1}(ba) J_0(sa) da    (4.427)

It can be calculated using the standard integral (6.576-7 on page 694 in [39])

∫_0^{∞} x^{μ+ν+1} J_μ(ax) K_ν(bx) dx = 2^{μ+ν} a^μ b^ν Γ(μ + ν + 1)/(a² + b²)^{μ+ν+1}    (4.428)
where the following parameters should be used:

μ = 0,  ν → ν − 1,  a = s,  x = a    (4.429)

This yields

Θ_R(s) = (b^{ν+1}/(2^{ν−1} Γ(ν))) · 2^{ν−1} b^{ν−1} Γ(ν)/(s² + b²)^ν = b^{2ν}/(s² + b²)^ν    (4.430)

The PDF p_R(x) itself can be found by the inverse Fourier transform of the characteristic function (4.430) [40]:

p_R(x) = (1/2π) ∫_{−∞}^{∞} Θ_R(s) exp(−jsx) ds = (b^{2ν}/2π) ∫_{−∞}^{∞} exp(−jsx)/(s² + b²)^ν ds
  = (b^{2ν}/2π) ∫_{−∞}^{∞} (js + b)^{−ν} (−js + b)^{−ν} exp(−jsx) ds    (4.431)

The last integral is a table integral (3.384-9 on page 321 in [39]):

∫_{−∞}^{∞} (β + jx)^{−2μ} (β − jx)^{−2μ} e^{−jpx} dx = (2π/Γ(2μ)) (2β)^{−2μ} |p|^{2μ−1} W_{0, 1/2−2μ}(2β|p|)    (4.432)

Thus, setting

μ = ν/2,  β = b,  p → x,  x → s    (4.433)

the expression for the PDF becomes

p_R(x) = (b/(2^{2ν−1} Γ(ν))) (2b|x|)^{ν−1} W_{0, 1/2−ν}(2b|x|)    (4.434)
This equation can be further simplified using the following expression for the Whittaker function W_{0,μ}(z), first in terms of the Kummer function U(a; b; z) and then in terms of the modified Bessel function K of the second kind (eq. 13.1-33 on page 505 and eq. 13.6-21 in Table 13.6 on page 510 in [29]):

W_{0,μ}(z) = exp(−z/2) z^{μ+1/2} U(μ + 1/2; 1 + 2μ; z) = √(z/π) K_μ(z/2)    (4.435)

and thus, using W_{0,−μ}(z) = W_{0,μ}(z),

p_R(x) = (2^{1/2−ν} b/(√π Γ(ν))) (b|x|)^{ν−1/2} K_{ν−1/2}(b|x|)    (4.436)

Therefore, the distribution of instantaneous values corresponding to the K distribution of the magnitude is again expressed in terms of the modified Bessel function K.
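For integer ν the order ν − 1/2 in (4.436) is half-integer, so K reduces to elementary functions, which allows a self-contained numerical check of the chain (4.430) → (4.436). The sketch below is an illustration, not from the book; b = 1.3 and the integration grid are arbitrary choices:

```python
import numpy as np

def trap(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def p_R_numeric(x, b, nu):
    """Invert Theta_R(s) = b^{2 nu}/(s^2 + b^2)^nu, eq. (4.430), by a cosine transform."""
    s = np.linspace(0.0, 400.0, 400001)
    return trap(b**(2 * nu) * np.cos(s * x) / (s**2 + b**2)**nu, s) / np.pi

b = 1.3
for x in (0.4, 1.5):
    # closed forms of eq. (4.436) for half-integer-order K functions:
    #   nu = 1: p_R = (b/2) e^{-b|x|};   nu = 2: p_R = (b/4)(1 + b|x|) e^{-b|x|}
    assert abs(p_R_numeric(x, b, 1) - 0.5 * b * np.exp(-b * x)) < 1e-3
    assert abs(p_R_numeric(x, b, 2) - 0.25 * b * (1.0 + b * x) * np.exp(-b * x)) < 1e-4
```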
4.7.4.2 Gamma Distribution
Let the PDF of the intensity be the Gamma distribution

p_i(I) = (β^{ν+1}/Γ(ν + 1)) I^ν exp(−βI)    (4.437)

The coefficients α_m of the expansion (4.365) can be found as

α_m = ∫_0^{∞} p_i(I) L_m(βI) dI = (β^{ν+1}/Γ(ν + 1)) ∫_0^{∞} I^ν exp(−βI) L_m(βI) dI = (1/Γ(ν + 1)) ∫_0^{∞} x^ν exp(−x) L_m(x) dx    (4.438)

The last integral can be calculated using the table integral 7.414-11 on page 845 in [39]

∫_0^{∞} x^{γ−1} exp(−x) L_m^λ(x) dx = Γ(γ) Γ(1 + λ + m − γ)/(m! Γ(1 + λ − γ)),  Re{γ} > 0    (4.439)

with γ = ν + 1 and λ = 0:

α_m = (1/Γ(ν + 1)) · Γ(ν + 1) Γ(m − ν)/(m! Γ(−ν)) = Γ(m − ν)/(m! Γ(−ν))    (4.440)

Substituting eq. (4.440) into eq. (4.377) results in the following expression for the PDF of the in-phase component corresponding to the Gamma distribution of the intensity:

p_R(x) = (√β/√π) exp[−βx²] Σ_{k=0}^{∞} ((−1)^k Γ(k − ν)/(2^{2k} k! k! Γ(−ν))) H_{2k}(√β x)    (4.441)

A few examples showing the accuracy of the approximation of the Gamma PDF can be found in Fig. 4.8. Since the magnitude of the coefficients α_k = Γ(k − ν)/(k! Γ(−ν)) decays as k^{−(ν+1)}, one can expect the approximation (4.441) to converge to the exact PDF p_R(x) as fast as the corresponding approximation of p_i(I) converges to the exact Gamma PDF.
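The coefficient formula (4.440) can be confirmed by direct quadrature. The sketch below is an illustration, not from the book; it implements the Laguerre three-term recurrence rather than relying on a library, and ν = 0.4, β = 1 are arbitrary choices:

```python
import math
import numpy as np

def trap(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def laguerre(n, x):
    """Laguerre polynomial L_n(x) via the three-term recurrence."""
    Lprev, L = np.ones_like(x), 1.0 - x
    if n == 0:
        return Lprev
    for k in range(1, n):
        Lprev, L = L, ((2 * k + 1 - x) * L - k * Lprev) / (k + 1)
    return L

nu = 0.4
I = np.linspace(0.0, 60.0, 600001)
p_i = I**nu * np.exp(-I) / math.gamma(nu + 1.0)   # Gamma intensity PDF with beta = 1
for m in range(6):
    alpha_quad = trap(p_i * laguerre(m, I), I)              # eq. (4.366)
    alpha_exact = math.gamma(m - nu) / (math.factorial(m) * math.gamma(-nu))  # eq. (4.440)
    assert abs(alpha_quad - alpha_exact) < 1e-4
```

As a spot check, α_0 = 1 (normalization) and α_1 = −ν, consistent with α_1 = 1 − E[βI].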
4.7.4.3 Log-Normal Distribution
The log-normal distribution is also frequently used in the modeling of random EM fields. In this case

p_i(I) = (1/(√(2π) σ I)) exp[ −(ln(I) − a)²/(2σ²) ]    (4.442)
Figure 4.8 Approximation of the Gamma distribution by its Gram–Charlier series.
with moments given by

m_n^{(Ln)} = exp(an + n²σ²/2)    (4.443)

It is impossible to obtain an expression similar to eq. (4.440) in this case. However, the fact that the moments of this distribution are known analytically can be used as follows. Let L_k(βI) be rewritten as a power series in (βI):

L_k(βI) = Σ_{l=0}^{k} a_l^{(k)} (βI)^l    (4.444)

where the coefficients a_l^{(k)} can be calculated using eq. 22.3.9 on page 775 in [29]

a_l^{(k)} = (−1)^l (1/l!) (k choose l) = (−1)^l k!/((k − l)! l! l!)    (4.445)

Plugging eq. (4.444) into eq. (4.366), one obtains an explicit expression for the coefficients of the expansion (4.377):

α_k = ∫_0^{∞} p_i(I) Σ_{l=0}^{k} a_l^{(k)} (βI)^l dI = Σ_{l=0}^{k} a_l^{(k)} β^l m_l^{(Ln)} = Σ_{l=0}^{k} a_l^{(k)} β^l exp(al + l²σ²/2)    (4.446)

thus leading to the final expression for p_R(x) in the form

p_R(x) = (√β/√π) exp[−βx²] Σ_{k=0}^{∞} [ Σ_{l=0}^{k} ((−1)^{k−l} β^l/(2^{2k} (k − l)! l! l!)) exp(al + l²σ²/2) ] H_{2k}(√β x)    (4.447)
Figure 4.9 Approximation of the log-normal PDF by its Gram–Charlier series.

A few examples showing the accuracy of the approximation of the log-normal PDF can be found in Fig. 4.9.
4.7.4.4 A Narrow Band Process with Nakagami Distributed Envelope

In this case, the distribution of the envelope is described by the Nakagami PDF

p_A(a) = (2/Γ(m)) (m/Ω)^m a^{2m−1} exp(−ma²/Ω),  m ≥ 1/2, a ≥ 0    (4.448)

The characteristic function can be found by applying the Bessel transform [19,38] to the PDF (4.448):

Θ_R(s) = ∫_0^{∞} p_A(a) J_0(sa) da = (2/Γ(m)) (m/Ω)^m ∫_0^{∞} a^{2m−1} exp(−ma²/Ω) J_0(sa) da    (4.449)

This integral can be calculated using the standard integral (6.631-1 on page 698 in [39])

∫_0^{∞} x^μ exp(−αx²) J_ν(βx) dx = (β^ν Γ((μ + ν + 1)/2)/(2^{ν+1} α^{(μ+ν+1)/2} Γ(ν + 1))) ₁F₁( (μ + ν + 1)/2; ν + 1; −β²/(4α) )    (4.450)

by setting

ν = 0,  β = s,  α = m/Ω,  μ = 2m − 1    (4.451)
Here ₁F₁(a; b; z) is the so-called confluent hypergeometric function²⁰ [39]. As a result,

Θ_R(s) = ₁F₁( m; 1; −Ωs²/(4m) )    (4.452)

The distribution of the in-phase component can be found through the Fourier transform of the characteristic function, i.e.

p_R(x) = (1/2π) ∫_{−∞}^{∞} ₁F₁( m; 1; −Ωs²/(4m) ) exp(−jsx) ds = (1/π) ∫_0^{∞} ₁F₁( m; 1; −Ωs²/(4m) ) cos(sx) ds    (4.453)

Using the substitution z = (s/2)√(Ω/m), the last integral is converted to

p_R(x) = (2/π) √(m/Ω) ∫_0^{∞} ₁F₁( m; 1; −z² ) cos( 2x√(m/Ω) z ) dz    (4.454)
which, in turn, can be calculated with the help of the standard integral 7.642 on page 822 in [39]

∫_0^{∞} cos(2xy) ₁F₁( a; c; −x² ) dx = (√π Γ(c)/(2Γ(a))) y^{2a−1} exp(−y²) Ψ( c − 1/2; a + 1/2; y² )    (4.455)

by setting

c = 1,  y = x√(m/Ω),  a = m    (4.456)

Thus

p_R(|x|) = (m^m/(√π Γ(m) Ω^m)) |x|^{2m−1} exp(−mx²/Ω) Ψ( 1/2; m + 1/2; mx²/Ω )    (4.457)

Here Ψ(a; b; z) is the second confluent hypergeometric function, also known as the Kummer U(a; b; z) function [39]. Since the function Ψ(a; b; z) can be expressed in terms of the Whittaker function W_{κ,μ}(z) as

Ψ( μ − κ + 1/2; 1 + 2μ; z ) = z^{−(μ+1/2)} exp(z/2) W_{κ,μ}(z)    (4.458)

equation (4.457) can be rewritten in the form suggested in [41]:

p_R(|x|) = (1/(√π Γ(m))) (m/Ω)^{(2m−1)/4} |x|^{m−3/2} exp(−mx²/(2Ω)) W_{(2m−1)/4, (2m−1)/4}( mx²/Ω )    (4.459)

²⁰ The same function is known as the Kummer M function M(a; b; z) or Φ(a; b; z) [39].
Figure 4.10 PDF of the in-phase (quadrature) component of a stationary random process with the Nakagami distribution for different values of the severity parameter m.
The shape of the PDF of the in-phase component is shown in Fig. 4.10 for different values of the parameter m. Alternatively, one can obtain a series expansion for p_R(x) in terms of Hermite polynomials by following the procedure established in section 2.2. Indeed, the intensity I = A² follows a Gamma distribution and, using eqs. (2.65) and (4.366),

p_i(I) = (1/Γ(m)) (m/Ω)^m I^{m−1} exp(−mI/Ω) = (m/Ω) exp(−mI/Ω) [ 1 + Σ_{k=1}^{∞} α_k L_k(mI/Ω) ]    (4.460)

with

α_k = ∫_0^{∞} p_i(I) L_k(mI/Ω) dI = (1/Γ(m)) ∫_0^{∞} x^{m−1} exp(−x) L_k(x) dx = Γ(1 + k − m)/(k! Γ(1 − m))    (4.461)

one obtains that the series (4.377) for the PDF of the in-phase component becomes (with β = m/Ω)

p_R(x) = √(m/(πΩ)) exp(−mx²/Ω) Σ_{k=0}^{∞} ((−1)^k Γ(1 + k − m)/(2^{2k} k! k! Γ(1 − m))) H_{2k}( x√(m/Ω) )    (4.462)

For integer values m = n, n = 1, 2, …, expressions (4.452) and (4.457) can be further simplified since

₁F₁( n; 1; −Ωs²/(4n) ) = exp(−Ωs²/(4n)) L_{n−1}( Ωs²/(4n) ) = exp(−Ωs²/(4n)) [ 1 + Σ_{k=1}^{n−1} c_k (Ωs²/(4n))^k ]    (4.463)

where c_k are the power series coefficients of the Laguerre polynomial L_{n−1}.
and, thus

p_R(x) = (1/2π) ∫_{−∞}^{∞} exp(−Ωs²/(4n)) [ 1 + Σ_{k=1}^{n−1} c_k (Ωs²/(4n))^k ] exp(−jsx) ds
  = √(n/(πΩ)) exp(−nx²/Ω) [ 1 + Σ_{k=1}^{n−1} c_k ((−1)^k/2^{2k}) H_{2k}( x√(n/Ω) ) ]    (4.464)
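For m = n = 2 eq. (4.463) gives ₁F₁(2; 1; −z) = e^{−z}(1 − z), i.e. c_1 = −1, so (4.464) can be checked against a direct cosine-transform inversion of the characteristic function. The sketch below is an illustration, not from the book; Ω = 1.7 and the grid are arbitrary choices:

```python
import numpy as np

def trap(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

Omega, n = 1.7, 2                          # Nakagami with integer m = n = 2
s = np.linspace(0.0, 60.0, 240001)
z = Omega * s**2 / 8.0                     # Omega s^2/(4n) for n = 2
theta = np.exp(-z) * (1.0 - z)             # 1F1(2; 1; -z) = e^{-z} L_1(z), eq. (4.463)
for x in (0.0, 0.5, 1.2):
    direct = trap(theta * np.cos(s * x), s) / np.pi
    y = x * np.sqrt(2.0 / Omega)
    # eq. (4.464) with c_1 = -1: 1 + (1/4) H_2(y), H_2(y) = 4 y^2 - 2
    hermite = np.sqrt(2.0 / (np.pi * Omega)) * np.exp(-2.0 * x**2 / Omega) * (1.0 + 0.25 * (4.0 * y * y - 2.0))
    assert abs(direct - hermite) < 1e-5
```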
Knowledge of the characteristic function (4.452) of the in-phase process also allows one to derive an expression for the PDF of the envelope of the mixture of the narrow band process with the Nakagami envelope and a narrow band Gaussian noise n(t) with variance σ² = N_0 Δf. Indeed, in this case

ζ_R(t) = ξ_R(t) + n(t)    (4.465)

and, using the independence of the noise and the signal,

Θ_ζ(js) = Θ_R(js) Θ_n(js) = ₁F₁( m; 1; −Ωs²/(4m) ) exp(−σ²s²/2)    (4.466)

Taking the inverse Bessel transform (4.332), one obtains the PDF p_A(a) of the envelope of the process ζ(t):

p_A(a) = a ∫_0^{∞} Θ_ζ(js) J_0(as) s ds = a ∫_0^{∞} ₁F₁( m; 1; −Ωs²/(4m) ) exp(−σ²s²/2) J_0(as) s ds    (4.467)

This integral can be expressed in terms of Meijer G functions [39] if the exponent exp[−σ²s²/2] is represented by its Taylor series, using the table integral (7.663-1 on page 826 of [39])

∫_0^{∞} x^{2n+1} ₁F₁( a; b; −x² ) J_0(xy) dx = (2^{2n+1}/(Γ(a) y^{2n+2})) G^{2,1}_{2,3}( y²/4 | 1, b ; n+1, a, n+1 )    (4.468)

Setting

a = m,  b = 1    (4.469)

and rescaling the argument of ₁F₁ (x = s√(Ω/(4m)), so that y²/4 becomes ma²/Ω), one obtains

∫_0^{∞} s^{2n+1} ₁F₁( m; 1; −Ωs²/(4m) ) J_0(as) ds = (2^{2n+1}/(Γ(m) a^{2n+2})) G^{2,1}_{2,3}( ma²/Ω | 1, 1 ; n+1, m, n+1 )    (4.470)

and, thus

p_A(a) = a Σ_{k=0}^{∞} ((−1)^k/k!) (σ²/2)^k ∫_0^{∞} s^{2k+1} ₁F₁( m; 1; −Ωs²/(4m) ) J_0(as) ds
  = (1/Γ(m)) Σ_{k=0}^{∞} ((−1)^k/k!) (2σ²)^k (2/a^{2k+1}) G^{2,1}_{2,3}( ma²/Ω | 1, 1 ; k+1, m, k+1 )    (4.471)

Although this equation is convenient for calculations of the PDF itself,²¹ it is not useful in analysis.
²¹ The Meijer G functions are implemented in such products as Maple® and Mathematica®.

4.8 SPHERICALLY INVARIANT PROCESSES

4.8.1 Definitions

A formal definition of a spherically invariant random vector (SIRV) is as follows: a random vector ζ = [ζ_1, ζ_2, …, ζ_N]^T is called a spherically invariant random vector if its PDF is uniquely defined by the specification of a mean vector m_ζ, a covariance matrix C_ζ, and the characteristic PDF p_ν(s). Such a distribution is also known as an elliptically contoured distribution [42–68]. This definition does not provide any specific details about the general form of the PDF of a SIRV. However, the answer is given by the following.

Representation theorem [42]: If a random vector ζ is a SIRV, then there exists a non-negative random variable ν such that the PDF of the random vector ζ conditioned on ν is a multivariate Gaussian PDF:

ζ = ν ξ    (4.472)

A spherically invariant random process (SIRP) ζ(t) is a random process such that every random vector ζ_n obtained by sampling the random process ζ(t) is a SIRV.

It is important to distinguish between a SIRP and a compound process, obtained as a product of two random processes

ζ(t) = ν(t) ξ(t)    (4.473)

which will be extensively used in the following derivations. It is clear from the definition of the SIRV that any particular realization of a SIRV is a Gaussian process with a variance that changes from one realization to another. Thus, while the process ζ(t) is non-Gaussian and stationary, it is not an ergodic process, since the distribution of the variance cannot be estimated by observing a single realization of the process. On the contrary, a compound process defined by a similar equation, eq. (4.473), is an ergodic process. However, the structure of the compound process is more complicated, since its individual realizations are not Gaussian and it is not possible to use general expressions for the joint probability density of an arbitrary order. Nevertheless, it will be shown that a significant gain can be obtained from the representation of a compound process as a product of a Gaussian process and an exponentially correlated Markov process.
4.8.2 Properties

4.8.2.1 Joint PDF of a SIRV

It is often convenient to assume that the modulating random variable ν is independent of the underlying zero mean Gaussian vector ξ. In this case, it is relatively easy to obtain the joint PDF p_ζ(x) of the vector ζ in terms of the covariance matrix C_ξ of the Gaussian vector ξ and the characteristic PDF p_ν(s) of the modulating variable ν. Indeed, the PDF p_ξ(x) is given by

p_ξ(x) = (1/((2π)^{N/2} √|C_ξ|)) exp[ −x^T C_ξ^{−1} x / 2 ] = (1/((2π)^{N/2} √|C_ξ|)) exp[−p/2],  p = x^T C_ξ^{−1} x    (4.474)

Here p is a scalar parameter, defined as a quadratic form of the vector x. Equation (4.474) reflects a well-known property of a Gaussian vector: its PDF depends on x only through the quadratic form p. The conditional PDF of ζ is thus given by

p_{ζ|ν}(x | s) = (s^{−N}/((2π)^{N/2} √|C_ξ|)) exp[ −p/(2s²) ]    (4.475)

since multiplication by a fixed number s scales the covariance matrix by a factor s² (see section 2.3). Finally, the unconditional PDF p_ζ(x) can be obtained by integration of the PDF (4.475) over the range of ν with the weight p_ν(s):

p_ζ(x) = ∫_0^{∞} p_{ζ|ν}(x | s) p_ν(s) ds = (1/((2π)^{N/2} √|C_ξ|)) ∫_0^{∞} s^{−N} exp[ −p/(2s²) ] p_ν(s) ds = (1/((2π)^{N/2} √|C_ξ|)) h_N(p)    (4.476)

where

h_N(p) = ∫_0^{∞} s^{−N} exp[ −p/(2s²) ] p_ν(s) ds    (4.477)

Equations (4.476)–(4.477) show that the PDF of a SIRV is a function of the quadratic form p, as is the Gaussian distribution. However, the form of this dependence is more complicated than a simple exponent. Thus, a SIRV can be considered as a natural extension of a Gaussian random vector. It is shown in a number of publications [47–58] that a SIRV inherits a number of properties of a Gaussian vector, such as invariance under a deterministic linear transformation, properties of an optimal estimator, etc.
SPHERICALLY INVARIANT PROCESSES
183
It follows from eq. (4.472) that the covariance matrices M of a SIRV and the covariance matrix of the underlying Gaussian process M are M ¼ Ef 2 gM ¼ m2 M
ð4:478Þ
Without loss of generality, it is possible to assume that the second moment m2 is unity: m2 ¼ 1. In this case, the correlation matrix of the SIRV is the same as that of the underlying Gaussian process. Setting N ¼ 1 in eq. (4.476), the marginal PDF of the resulting process p ðyÞ is related to the marginal PDF p ðsÞ of the modulation process as ð1 1 y2 ds ð4:479Þ exp 2 2 p ðsÞ p ðyÞ ¼ pffiffiffiffiffiffiffiffi 2s s 2 0 Taking the Fourier transform of both sides, the following equation for the characteristic function of the output process can be easily derived ð1 ð1 ð 1 ds 1 y2 ðj!Þ ¼ p ðyÞ expð j!yÞdy ¼ pffiffiffiffiffiffiffiffi p ðsÞ exp 2 2 expð j!yÞdy s 1 2s 2 0 1 " # 2 2 2 ð1 ð1 1 2k 2k 2k X s ! ks ! p ðsÞ exp p ðsÞ ð1Þ ds ¼ ds ¼ 2 2k k! 0 0 k¼0 ¼1þ
1 X 2k !2k ð1Þk k mu2k 2 k! k¼1
ð4:480Þ
Comparing this expansion with the Taylor series of the characteristic function Θ_ζ(jω), the following relations between the moments of the modulating PDF and those of the resulting process can be easily deduced:

\Theta_\zeta(j\omega) = \sum_{l=0}^{\infty} \frac{(j)^l m_l}{l!}\, \omega^l = 1 + \sum_{k=1}^{\infty} (-1)^k \frac{\omega^{2k}}{2^k k!}\, \mu_{2k}    (4.481)

or, equivalently,

m_{2k-1} = 0, \qquad m_{2k} = \frac{(2k)!}{2^k k!}\, \mu_{2k} = \mu_{2k}\, m_{2k}^{(\xi)}    (4.482)

where m_{2k}^{(ξ)} = (2k)!/(2^k k!) is the 2k-th moment of the unit-variance Gaussian process.
The last equation can, of course, be obtained directly by manipulating moment brackets, as in section 2.8. Interestingly, only the even moments of the modulating PDF matter, since the odd moments are cancelled by the zero odd moments of the Gaussian distribution.
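The moment relation (4.482) is straightforward to verify by Monte Carlo: draw σ from a chosen mixing density, multiply by unit-variance Gaussian samples, and compare the even sample moments of the compound variable with μ₂ₖ(2k)!/(2ᵏk!). A minimal sketch, in which the Rayleigh-type mixing density (the Laplace row of Table 4.1 with b = 2) and the sample size are our own illustrative choices:

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
b = 2.0  # illustrative parameter of the mixing density

# sigma has the Rayleigh-type density p(s) = b^2 s exp(-b^2 s^2 / 2)
sigma = rng.rayleigh(scale=1.0 / b, size=n)
xi = rng.standard_normal(n)        # unit-variance Gaussian samples
zeta = sigma * xi                  # compound (SIRV) samples

for k in (1, 2):
    gauss_m2k = math.factorial(2 * k) // (2**k * math.factorial(k))  # (2k)!/(2^k k!)
    predicted = np.mean(sigma**(2 * k)) * gauss_m2k   # right side of eq. (4.482)
    print(k, np.mean(zeta**(2 * k)), predicted)       # the two columns agree
    print(k, np.mean(zeta**(2 * k - 1)))              # odd moments vanish
```

For this mixing density μ₂ = 1/2 and μ₄ = 1/2, so the sample moments converge to m₂ = 0.5 and m₄ = 1.5.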
4.8.2.2 Narrow Band SIRVs
If the underlying Gaussian process ξ(t) in eq. (4.472) is a narrow band process, then the resulting SIRV (or compound process) is also a narrow band process. In turn, the real Gaussian process ξ(t) can be considered as the real part of a complex process η(t):

\eta(t) = \eta_R(t) + j\eta_I(t), \qquad \xi(t) = \mathrm{Re}\{\eta(t)\}    (4.483)

where the in-phase component η_R(t) and the quadrature component η_I(t) are defined using the Hilbert transformation, as described in section 4.7.1. As a result, the process ζ(t) can be thought of as the real part of the complex process

\zeta(t) = \sigma(t)\,\eta(t) = \sigma(t)\,\eta_R(t) + j\,\sigma(t)\,\eta_I(t)    (4.484)

with the envelope defined as

A_\zeta(t) = |\zeta(t)| = \sigma(t)\,|\eta(t)| = \sigma(t)\, A_\xi(t)    (4.485)
Thus, the envelope of the resulting process is the envelope of a complex Gaussian process modulated by a random variable or a random process σ(t). Since the envelope of a narrow band Gaussian process is the Rayleigh process

p_{A_\xi}(a) = \frac{a}{\sigma_\xi^2} \exp\left(-\frac{a^2}{2\sigma_\xi^2}\right)    (4.486)

the PDF p_{A_ζ}(a) of the envelope of the process ζ(t) can be expressed as

p_{A_\zeta}(a) = \int_0^\infty p_\sigma(s)\, \frac{a}{\sigma_\xi^2 s^2} \exp\left(-\frac{a^2}{2\sigma_\xi^2 s^2}\right) ds    (4.487)

4.8.3 Examples
A number of analytically tractable examples are summarized in Table 4.1.
Table 4.1  Analytically tractable examples. For each row, p_ζ(y) is the marginal PDF of the SIRV, p_σ(s), s ≥ 0, is the PDF of the modulating variable, and h_{2N}(p) is the kernel defined in eq. (4.477).

- Gaussian:  p_ζ(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{y^2}{2\sigma^2}\right);  p_σ(s) = δ(s − 1);  h_{2N}(p) = \exp(-p/2)

- Two-sided Laplace:  p_ζ(x) = \frac{b}{2} \exp(-b|x|);  p_σ(s) = b^2 s \exp\left(-\frac{b^2 s^2}{2}\right);  h_{2N}(p) = b^{2N} (b\sqrt{p})^{-(N-1)} K_{N-1}(b\sqrt{p})

- Cauchy:  p_ζ(x) = \frac{b}{\pi(b^2 + x^2)};  p_σ(s) = \sqrt{\frac{2}{\pi}}\, \frac{b}{s^2} \exp\left(-\frac{b^2}{2s^2}\right);  h_{2N}(p) = \frac{2^N b\, \Gamma(N + \tfrac{1}{2})}{\sqrt{\pi}\, (b^2 + p)^{N + 1/2}}

- K-distribution:  p_ζ(x) = \frac{2b}{\Gamma(\nu)} \left(\frac{bx}{2}\right)^{\nu} K_{\nu-1}(bx);  p_σ(s): no analytical expression;  h_{2N}(p) = \frac{b^{2N} (b\sqrt{p})^{\nu - N}}{\Gamma(\nu)\, 2^{\nu-1}}\, K_{N-\nu}(b\sqrt{p})
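The Cauchy row of Table 4.1 can be checked by simulation. Assuming our reading of the table's mixing density, p_σ(s) = √(2/π)(b/s²)exp(−b²/2s²), the reciprocal 1/σ is half-normal with standard deviation 1/b, so σ can be sampled as b/|Z| with Z standard normal; the parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
b = 2.0  # illustrative Cauchy scale parameter

# 1/sigma is half-normal with standard deviation 1/b, so sigma = b / |Z|
z = rng.standard_normal(n)
sigma = b / np.abs(z)
zeta = sigma * rng.standard_normal(n)   # compound (SIRV) samples

# The marginal b / (pi (b^2 + x^2)) has its quartiles exactly at -b and +b.
q1, q3 = np.quantile(zeta, [0.25, 0.75])
print(q1, q3)
```

Quartiles (rather than moments) are used for the check because the Cauchy distribution has no finite moments.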
REFERENCES

1. W. Feller, An Introduction to Probability Theory and Its Applications, New York: Wiley, 1950–1966.
2. E. Dynkin, Theory of Markov Processes, Englewood Cliffs, NJ: Prentice-Hall, 1961.
3. J. Doob, Stochastic Processes, New York: Wiley, 1990.
4. J. Proakis, Digital Communications, Boston: McGraw-Hill, 2001.
5. L. Zadeh, System Theory, New York: McGraw-Hill, 1969.
6. R. Dorf (Ed.), The Engineering Handbook, CRC Press, 1995.
7. A. Papoulis, The Fourier Integral and Its Applications, New York: McGraw-Hill, 1962.
8. G. Box, G. Jenkins, and G. Reinsel, Time Series Analysis: Forecasting and Control, Englewood Cliffs, NJ: Prentice-Hall, 1994.
9. A. Elwalid, D. Heyman, T. Lakshman, D. Mitra, and A. Weiss, Fundamental Bounds and Approximations for ATM Multiplexers with Applications to Video Teleconferencing, IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, 1995, pp. 1004–1016.
10. M. Zorzi, Some Results on Error Control for Burst-Error Channels Under Delay Constraints, IEEE Transactions on Vehicular Technology, vol. 50, no. 1, 2001, pp. 12–24.
11. K. Baddour and N. Beaulieu, Autoregressive Models for Fading Channel Simulation, Proceedings of GLOBECOM 2001, November 2001, vol. 2, pp. 1187–1192.
12. P. Rao, D. Johnson, and D. Becker, Generation and Analysis of Non-Gaussian Markov Time Series, IEEE Transactions on Signal Processing, vol. 40, no. 4, April 1992, pp. 845–856.
13. A. Oppenheim and R. Schafer, Discrete-Time Signal Processing, Upper Saddle River, NJ: Prentice-Hall, 1999.
14. M. Schetzen, The Volterra and Wiener Theories of Nonlinear Systems, New York: Wiley, 1980.
15. A.N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations, Moscow: Sovetskoe Radio, 1978 (in Russian).
16. J.K. Ord, Families of Frequency Distributions, London: Griffin, 1972.
17. P. Bello, Characterization of Randomly Time-Variant Linear Channels, IEEE Transactions on Communications, vol. 11, no. 4, December 1963, pp. 360–393.
18. A. Papoulis, Probability, Random Variables, and Stochastic Processes, New York: McGraw-Hill, 1984.
19. S.M. Rytov, Yu.A. Kravtsov, and V.I. Tatarskii, Principles of Statistical Radiophysics, New York: Springer-Verlag, 1987–1989.
20. V. Tikhonov and V. Khimenko, Excursions of Trajectories of Random Processes, Moscow: Nauka, 1987 (in Russian).
21. S.O. Rice, Mathematical Analysis of Random Noise, Bell System Technical Journal, vol. 24, January 1945, pp. 46–156.
22. S.O. Rice, Statistical Properties of a Sine Wave Plus Random Noise, Bell System Technical Journal, vol. 27, January 1948, pp. 109–157.
23. M. Patzold, Mobile Fading Channels, John Wiley & Sons, 2002.
24. J. Lai and N.B. Mandayam, Minimum Duration Outages in Rayleigh Fading Channels, IEEE Transactions on Communications, vol. 49, October 2001, pp. 1755–1761.
25. W. Lindsey and M. Simon, Telecommunication Systems Engineering, Englewood Cliffs, NJ: Prentice-Hall, 1973.
26. M.D. Yacoub, M.V. Barbin, M.S. de Castro, and J.E. Vargas Bautista, Level Crossing Rate of Nakagami-m Fading Signals: Field Trials and Validation, Electronics Letters, vol. 36, no. 4, 2000, pp. 355–357.
27. J. Lai and N.B. Mandayam, Packet Error Rate for Burst-Error-Correcting Codes in Rayleigh Fading Channels, Proceedings of IEEE VTC, 1998, pp. 1568–1572.
28. C. Nikias and A. Petropulu, Higher-Order Spectra Analysis, New Jersey: Prentice-Hall, 1993.
29. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, New York: Dover, 1970.
30. W. Jakes, Microwave Mobile Communications, New York: Wiley, 1974.
31. M. Nakagami, The m-Distribution—A General Formula of Intensity Distribution of Rapid Fading, in: W.C. Hoffman (Ed.), Statistical Methods in Radio Wave Propagation, New York: Pergamon, 1960, pp. 3–36.
32. Y. Goldstein and V.Ya. Kontorovich, Influence of the Statistical Properties of Radio Channel on Error Grouping, Telecommunications and Radio Engineering, vol. 22, no. 8, 1968, pp. 6–8.
33. V.Ya. Kontorovich, Questions of Noise Immunity of Communication Systems Due to Gaussian and Non-Gaussian Interference, Ph.D. Thesis, Leningrad Bonch-Bruevich Telecommunications Institute, 1968 (in Russian).
34. F. Ramos, V. Kontorovich, and M. Lara-Barron, Simulation of Nakagami Fading with Given Autocovariance Function, IEEE International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2000, Hawaii, November 5–8, 2000, pp. 561–564.
35. C.-D. Iskander and P.T. Mathiopoulos, Analytical Level-Crossing Rates and Average Fade Durations for Diversity Techniques in Nakagami Fading Channels, IEEE Transactions on Communications, vol. 50, no. 8, August 2002, pp. 1301–1309.
36. R. Ertel, P. Cardieri, K. Sowerby, T.S. Rappaport, and J. Reed, Overview of Spatial Channel Models for Antenna Array Communication Systems, IEEE Personal Communications, vol. 5, no. 1, 1998, pp. 10–22.
37. V. Tikhonov, On a Markov Nature of the Envelope of the Narrowband Random Process, Radiotekhnika i Elektronika, vol. 6, no. 7, 1961 (in Russian).
38. A.H. Zemanian, Generalized Integral Transformations, New York: Dover, 1987.
39. I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series, and Products, New York: Academic Press, 1980.
40. A.P. Prudnikov, Yu.A. Brychkov, and O.I. Marichev, Integrals and Series, New York: Gordon and Breach Science Publishers, 1986.
41. D. Klovsky, V. Kontorovich, and S. Shirokov, Models of Continuous Communication Channels Based on Stochastic Differential Equations, Moscow: Radio i Sviaz, 1984 (in Russian).
42. K. Yao, A Representation Theorem and Its Applications to Spherically-Invariant Random Processes, IEEE Transactions on Information Theory, vol. 19, no. 5, 1973, pp. 600–608.
43. A. Abdi, H. Barger, and M. Kaveh, Signal Modeling in Wireless Fading Channels Using Spherically Invariant Processes, Proceedings of ICASSP, vol. 5, 2000, pp. 2997–3000.
44. G. Jacovitti and A. Neri, Special Techniques for the Estimation of the ACF of Spherically Invariant Random Processes, Spectrum Estimation and Modeling, Fourth Annual ASSP Workshop, 3–5 August 1988, pp. 379–382.
45. L. Izzo, L. Paura, and M. Tanda, Performance of a Square-Law Combiner for Reception of Nakagami Fading Orthogonal Signals in Spherically Invariant Noise, Proceedings of IEEE Symposium on Information Theory, June–1 July 1994, p. 91.
46. L. Izzo and M. Tanda, Diversity Reception of Fading Signals in Spherically Invariant Noise, IEE Proceedings on Communications, vol. 145, no. 4, 1998, pp. 272–276.
47. Y. Kung, A Representation Theorem and Its Applications to Spherically-Invariant Random Processes, IEEE Transactions on Information Theory, vol. 19, no. 5, September 1973, pp. 600–608.
48. M. Rangaswamy, D. Weiner, and A. Ozturk, Non-Gaussian Random Vector Identification Using Spherically Invariant Random Processes, IEEE Transactions on Aerospace and Electronic Systems, vol. 29, no. 1, 1993, pp. 111–124.
49. L. Izzo and M. Tanda, Array Detection of Random Signals Embedded in an Additive Mixture of Spherically Invariant and Gaussian Noise, IEEE Transactions on Communications, vol. 49, no. 10, 2001, pp. 1723–1726.
50. B. Picinbono, Spherically Invariant and Compound Gaussian Stochastic Processes, IEEE Transactions on Information Theory, vol. 16, no. 1, 1970, pp. 77–79.
51. G. Wise and N. Gallagher, Jr., On Spherically Invariant Random Processes, IEEE Transactions on Information Theory, vol. 24, no. 1, 1978, pp. 118–120.
52. L. Hoi and A. Cambanis, On the Rate Distortion Functions of Spherically Invariant Vectors and Sequences, IEEE Transactions on Information Theory, vol. 24, no. 3, 1978, pp. 367–373.
53. M. Rupp, The Behavior of LMS and NLMS Algorithms in the Presence of Spherically Invariant Processes, IEEE Transactions on Signal Processing, vol. 41, no. 3, 1993, pp. 1149–1160.
54. M. Rupp and R. Frenzel, Analysis of LMS and NLMS Algorithms with Delayed Coefficient Update Under the Presence of Spherically Invariant Processes, IEEE Transactions on Signal Processing, vol. 42, no. 3, 1994, pp. 668–672.
55. T. Barnard and D. Weiner, Non-Gaussian Clutter Modeling with Generalized Spherically Invariant Random Vectors, IEEE Transactions on Signal Processing, vol. 44, no. 10, 1996, pp. 2384–2390.
56. E. Conte, M. Lops, and G. Ricci, Adaptive Matched Filter Detection in Spherically Invariant Noise, IEEE Signal Processing Letters, vol. 3, no. 8, August 1996, pp. 248–250.
57. E. Conte, A. De Maio, and G. Ricci, Recursive Estimation of the Covariance Matrix of a Compound-Gaussian Process and Its Application to Adaptive CFAR Detection, IEEE Transactions on Signal Processing, vol. 50, no. 8, 2002, pp. 1908–1915.
58. M. Rangaswamy and J. Michels, A Parametric Multichannel Detection Algorithm for Correlated Non-Gaussian Random Processes, Proceedings of the IEEE National Radar Conference, 13–15 May 1997, pp. 349–354.
59. V. Aalo and J. Zhang, Average Error Probability of Optimum Combining with a Co-Channel Interferer in Nakagami Fading, Proceedings of the Wireless Communications and Networking Conference 2000, vol. 1, 2000, pp. 376–381.
60. J. Roberts, Joint Phase and Envelope Densities of a Higher Order, IEE Proceedings of Radar, Sonar and Navigation, vol. 142, no. 3, 1995, pp. 123–129.
61. E. Conte, M. Longo, and M. Lops, Modeling and Simulation of Non-Rayleigh Radar Clutter, IEE Proceedings of Radar and Signal Processing F, vol. 138, no. 2, 1991, pp. 121–130.
62. A. Gualtierotti, A Likelihood Ratio Formula for Spherically Invariant Processes, IEEE Transactions on Information Theory, vol. 22, no. 5, 1976, p. 610.
63. M. Rangaswamy, D. Weiner, and A. Ozturk, Computer Generation of Correlated Non-Gaussian Radar Clutter, IEEE Transactions on Aerospace and Electronic Systems, vol. 31, no. 1, 1995, pp. 106–116.
64. M. Rangaswamy, J. Michels, and D. Weiner, Multichannel Detection for Correlated Non-Gaussian Random Processes Based on Innovations, IEEE Transactions on Signal Processing, vol. 43, no. 8, 1995, pp. 1915–1922.
65. L. Izzo and M. Tanda, Asymptotically Optimum Diversity Detection in Correlated Non-Gaussian Noise, IEEE Transactions on Communications, vol. 44, no. 5, 1996, pp. 542–545.
66. K. Gerlach, Spatially Distributed Target Detection in Non-Gaussian Clutter, IEEE Transactions on Aerospace and Electronic Systems, vol. 35, no. 3, 1999, pp. 926–934.
67. I. Blake and J. Thomas, On a Class of Processes Arising in Linear Estimation Theory, IEEE Transactions on Information Theory, vol. 14, no. 1, 1968, pp. 12–16.
68. G. Gabor and Z. Gyorfi, On the Higher Order Distributions of Speech Signals, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 4, 1988, pp. 602–603.
69. R. Vijayan and J.M. Holtzman, Foundations for Level Crossing Analysis of Handoff Algorithms, Proceedings of ICC, 1993, pp. 935–939.
70. Ya. Fomin, Excursions of Random Processes, Moscow: Sovetskoe Radio, 1980 (in Russian).
71. J. Galambos, The Asymptotic Theory of Extreme Order Statistics, New York: Wiley, 1978.
72. T.H. Lehman, A Statistical Theory of Electromagnetic Fields in Complex Cavities, Interaction Note 494, May 1993.
73. R. Holland and R. St. John, Statistical Coupling of EM Fields to Cables in an Overmode Cavity, Proceedings of ACES, 1996, pp. 877–887.
74. R. Holland and R. St. John, Statistical Description of Cable Current Response Inside a Leaky Enclosure, Proceedings of ACES, 1997.
75. R. Holland and R. St. John, Statistical Response of EM-Driven Cables Inside an Overmoded Enclosure, accepted for IEEE Transactions on EMC.
76. R. St. John, Approximate Field Component Statistics of the Lehman, Overmoded Cavity Distribution, Internet.
77. R. Holland and R. St. John, Statistical Response of Enclosed Systems to HPM Environment, Proceedings of ACES, 1994, pp. 554–568.
78. R. Holland and R. St. John, Statistical Electromagnetics, Philadelphia: Taylor and Francis Scientific Publishers, in press.
79. D.A. Hill, Spatial Correlation Function for Fields in a Reverberation Chamber, IEEE Transactions on EMC, vol. 37, no. 1, 1995, p. 138.
80. J.G. Kostas and B. Boverie, Statistical Model for a Mode-Stirred Chamber, IEEE Transactions on EMC, vol. 33, no. 4, November 1991, pp. 366–370.
81. A. Molisch, Wideband Wireless Digital Communications, Upper Saddle River, NJ: Prentice-Hall, 2001.
5 Markov Processes and Their Description

5.1 DEFINITIONS
Let ξ(t) be a random process which depends on only one parameter, time¹. Conditional on the range of values which the random process assumes and on the nature of the parameter t, one can distinguish five major types of random process.

¹ Among the other possible choices, a spatial parameter is often used to model space–time variation of signals.

1. A discrete random sequence ϑ_m = ϑ(t_m), or a random chain (a discrete process in discrete time). In this case, the random process attains values from a fixed and countable set x_k, k = 1,...,K, and the time parameter is discrete as well, i.e. t belongs to a discrete countable set t_m, m = 1,...,M. It is important to note that both K and M can be infinite. Processes of this type are often encountered in practice: random tossing of an object, random telegraph signals and remote sensing are just a few examples. Random discrete sequences are often obtained as a result of sampling and quantization of a continuous random process. Most modern communication systems use such signals and thus can be described by random chains.

2. A random sequence ξ_m (a continuous process in discrete time). This kind of process differs from the discrete chain only in the range of attainable values: for a random sequence it is a continuum of values rather than a discrete set. Quite often a random sequence is obtained as the result of sampling of a continuous random process.

3. A discrete random process, or continuous time chain (a discrete process in continuous time). In this case the random process ϑ(t) assumes values from a countable set x_k, k = 1,...,K, while the parameter t changes continuously from t = t₀ to t₀ + T; it is also common to assume that t₀ = 0. The length of a serving queue and the number of users of a computer network are two examples of processes of this type. Often, a continuous time chain can be considered as the output of a quantizer driven by a continuous random process.

4. A continuous random process ξ(t). In this case, the random process attains values from a continuum, with the time parameter also changing continuously. It is also often
assumed that the trajectory of the continuous process does not experience large vertical jumps. Noise is one of quite a few examples of this kind of random process.

5. Mixed random processes. In this case, a random process is a mixture of the different types listed above. In particular, continuous processes with jumps can be considered as mixed: the random process resembles a continuous random process most of the time, but from time to time a rapid variation in value (a jump) appears, often due to changes in the system structure. Noise in communication systems using power lines, and channels with both impulsive and continuous noise, are examples of systems whose random structure can be described as a mixed process. Often, it is possible to describe a mixed process using a vector random process n(t) = [ξ(t), ϑ(t)]ᵀ, where ξ(t) is a continuous random process (type 4) and ϑ(t) is a discrete chain².

If, in addition, a process possesses the Markov property as defined below, one can distinguish Markov chains, Markov sequences, continuous time Markov chains, continuous Markov processes and mixed Markov processes. A few typical trajectories are shown in Figure 5.1. The classification can be extended to the case of a random vector n(t) = [ξ₁(t), ξ₂(t),...,ξ_r(t)]ᵀ; in this case the vector process is called r-dimensional. In general, each component may belong to a different type of process, and thus classification should be done on a component basis. Extensive literature exists on the theory of Markov processes and their applications ([1–6], to name a few). Only a brief review is included in this book to facilitate further considerations.
5.1.1 Markov Chains
Consider a physical system which has K possible states ϑ⁽¹⁾, ϑ⁽²⁾,...,ϑ⁽ᴷ⁾. The system can change its state randomly, depending on external factors. These changes appear in a sudden manner at time instants t₁ < t₂ < t₃ < ..., i.e. transitions ϑ₁ → ϑ₂ → ϑ₃ → ... take place only at these specified moments of time. Here ϑ_m = ϑ(t_m) stands for the state of the system after m − 1 transitions, and ϑ₁ = ϑ(t₁) is the initial state of the system (see Fig. 5.1(a)). Of course, all ϑ_m are chosen from the finite set {ϑ^{(l)}}, l = 1,...,K. A complete description of such a system can be achieved if the joint probability P(ϑ₁, ϑ₂,...,ϑ_m) can be calculated for an arbitrary m. Using the Bayes formula, one can write

P(\vartheta_{k_1}, \vartheta_{k_2}, \ldots, \vartheta_{k_m}) = P(\vartheta_1 = \vartheta^{(k_1)}, \ldots, \vartheta_m = \vartheta^{(k_m)})
 = P(\vartheta_1 = \vartheta^{(k_1)}, \ldots, \vartheta_{m-1} = \vartheta^{(k_{m-1})})\, P(\vartheta_m = \vartheta^{(k_m)} \mid \vartheta_1 = \vartheta^{(k_1)}, \ldots, \vartheta_{m-1} = \vartheta^{(k_{m-1})})    (5.1)
² The notation ξ(t) for a continuous random process and ϑ(t) for a discrete random process or chain is reserved for the presentation in this chapter.
Figure 5.1 Classification of Markov processes: (a) Markov chain; (b) Markov sequence; (c) continuous time Markov chain; (d) continuous Markov process.
where ϑ_{k_μ} is one of the possible states ϑ^{(·)}. In the very same manner, one can further expand P(ϑ₁ = ϑ^{(k_1)},...,ϑ_{m-1} = ϑ^{(k_{m-1})}) in terms of conditional probabilities to obtain

P(\vartheta_{k_1}, \vartheta_{k_2}, \ldots, \vartheta_{k_m}) = P_1(\vartheta_1) \prod_{\mu=2}^{m} \pi_\mu(\vartheta_\mu \mid \vartheta_{\mu-1}, \ldots, \vartheta_1)    (5.2)

where P₁(ϑ₁) is the probability of the initial state ϑ₁ = ϑ(t₁) and

\pi_\mu(\vartheta_{k_\mu} \mid \vartheta_{k_{\mu-1}}, \ldots, \vartheta_{k_1}) = P(\vartheta_\mu = \vartheta^{(k_\mu)} \mid \vartheta_1 = \vartheta^{(k_1)}, \ldots, \vartheta_{\mu-1} = \vartheta^{(k_{\mu-1})}), \qquad 2 \le \mu \le m    (5.3)
is the conditional probability of reaching the state ϑ_{k_μ} at the time instant t_μ, given that the previous states were ϑ₁, ϑ₂,...,ϑ_{μ-1}. The probability π_μ(ϑ_{k_μ} | ϑ_{k_{μ-1}},...,ϑ_{k_1}) is also known as the transitional probability. Thus, in order to calculate the joint probabilities P(ϑ₁, ϑ₂,...,ϑ_m), it is sufficient to know the initial state of the system and all the transitional probabilities π_μ, or the methodology of their calculation. There are a few specific cases in which such calculations are relatively simple.

1. Changes in all states are independent. In this case, the transition to the next state does not depend on the previous state of the system, i.e.

\pi_m(\vartheta_m \mid \vartheta_1, \ldots, \vartheta_{m-1}) = P_m(\vartheta_m)    (5.4)

and eq. (5.2) is significantly simplified to become

P(\vartheta_1, \vartheta_2, \ldots, \vartheta_m) = \prod_{\mu=1}^{m} P_\mu(\vartheta_\mu)    (5.5)
As can be seen, this case coincides with the case of independent trials [4,5]. It is also a degenerate case of the Markov chain considered below.

2. A more general model is obtained if the transitional probabilities depend only on the current state ϑ_{m-1} = ϑ(t_{m-1}) of the system and do not depend on its past states at the time instants t₁, t₂,...,t_{m-2}. This property is common to all types of Markov process; for this reason, Markov processes are often called processes without memory.³ Mathematically speaking, the following equality holds for any index m:

\pi_m(\vartheta_m \mid \vartheta_1, \ldots, \vartheta_{m-1}) = \pi_m(\vartheta_m \mid \vartheta_{m-1})    (5.6)
Conditional probabilities π_m(ϑ_m | ϑ_{m-1}) are also called transition probabilities. Equation (5.6) defines the so-called Markov property of a random chain (process). Using the chain rule, one can easily transform eq. (5.2) into

P(\vartheta_1, \vartheta_2, \ldots, \vartheta_m) = P_1(\vartheta_1) \prod_{\mu=2}^{m} \pi_\mu(\vartheta_\mu \mid \vartheta_{\mu-1})    (5.7)
As a result, the complete description of a Markov chain can be achieved by providing the probability P₁(ϑ₁) of the initial state and the transitional probabilities π_μ(ϑ_μ | ϑ_{μ-1}). It is important to emphasize that eq. (5.7) holds true for the case of two moments of time (m = 2) for any random process. However, for the case of three or more moments, this equation does not describe the general situation, i.e. not all random processes are Markov.³

³ In case (2), the process under consideration does not remember its past only, in contrast to a process considered in case (1), which remembers neither its past nor its present.
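The factorization (5.7) can be illustrated numerically: for a simple chain, the probability of any particular path is the product of the initial probability and the one-step transition probabilities, which can be checked against simulated relative frequencies. A sketch with an illustrative two-state chain (the numbers are our own, not from the text):

```python
import numpy as np

# Illustrative two-state chain: initial distribution P1 and one-step
# transition probabilities Pi[j, k] = P(next state = k | current state = j)
P1 = np.array([0.3, 0.7])
Pi = np.array([[0.9, 0.1],
               [0.4, 0.6]])

def path_prob(states):
    """Joint probability of a state sequence via the factorization (5.7)."""
    prob = P1[states[0]]
    for prev, cur in zip(states, states[1:]):
        prob *= Pi[prev, cur]
    return prob

# Monte Carlo check of the path (0, 0, 1)
rng = np.random.default_rng(0)
n = 1_000_000
s1 = rng.choice(2, size=n, p=P1)
s2 = (rng.random(n) >= Pi[s1, 0]).astype(int)   # sample the next state per row
s3 = (rng.random(n) >= Pi[s2, 0]).astype(int)
freq = np.mean((s1 == 0) & (s2 == 0) & (s3 == 1))
print(path_prob((0, 0, 1)), freq)   # 0.3 * 0.9 * 0.1 = 0.027 vs simulation
```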
3. In a similar manner, one can define a Markov process of order n (n ≥ 1): the transitional PDF depends only on the n previous states (including the current state) of the system, and does not depend on the more distant past. Mathematically speaking,

\pi_m(\vartheta_m \mid \vartheta_1, \ldots, \vartheta_{m-1}) =
\begin{cases}
\pi_m(\vartheta_m \mid \vartheta_{m-n}, \ldots, \vartheta_{m-1}), & m > n \\
\pi_m(\vartheta_m \mid \vartheta_1, \ldots, \vartheta_{m-1}), & 2 \le m \le n
\end{cases}    (5.8)
A regular Markov chain, as defined in case 2 above, corresponds to n = 1. A Markov chain of order n can be reduced to a simple vector Markov chain for a vector of dimension n. In order to do so, one can consider n sequential states of the system as components of a single n-dimensional vector h_m defined as

\mathbf{h}_m = \{\eta_{i,m},\; i = 1, \ldots, n\} = \{\vartheta_{m-n+1}, \ldots, \vartheta_m\}, \qquad \eta_{i,m} = \vartheta_{m-n+i}    (5.9)

On the previous time step, the vector h_{m-1} is expressed as

\mathbf{h}_{m-1} = \{\eta_{i,m-1},\; i = 1, \ldots, n\} = \{\vartheta_{m-n}, \ldots, \vartheta_{m-1}\}    (5.10)

and the following relations between the components of the two vectors are satisfied:

\eta_{i,m} = \eta_{i+1,m-1} = \vartheta_{m-n+i}, \qquad \eta_{1,m-1} = \vartheta_{m-n}, \qquad n < m    (5.11)
Thus, a Markov chain of order n is a particular case of a simple Markov vector of dimension n. In this case, the transitional PDF is defined as

\pi_m(\mathbf{h}_m \mid \mathbf{h}_{m-1}) = \pi_m(\eta_{n,m} \mid \eta_{1,m-1}, \ldots, \eta_{n,m-1}) \prod_{i=1}^{n-1} \delta(\eta_{i,m}, \eta_{i+1,m-1})    (5.12)

where δ(·,·) is the Kronecker symbol. This relation follows from the definition (5.11) of the components of the vector h_m. It is often necessary to evaluate the probabilities P_m(ϑ_m) and P_{m-1}(ϑ_{m-1}) of the system state at two sequential moments of time. For a simple Markov chain, the necessary relation can be obtained directly from eq. (5.7):

P_m(\vartheta_m) = \sum_{\vartheta_{m-1} = \vartheta^{(1)}}^{\vartheta^{(K)}} \pi_m(\vartheta_m \mid \vartheta_{m-1})\, P_{m-1}(\vartheta_{m-1}), \qquad m = 2, \ldots, M    (5.13)

Indeed, it follows from eq. (5.7) that

P(\vartheta_1, \ldots, \vartheta_m) = \pi_m(\vartheta_m \mid \vartheta_{m-1})\, P(\vartheta_1, \ldots, \vartheta_{m-1})    (5.14)

Summing both sides of eq. (5.14) over ϑ₁,...,ϑ_{m-1}, one immediately obtains eq. (5.13).
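The marginalization step leading to eq. (5.13) can be checked by brute force for a small chain: summing the joint probability (5.7) over all possible paths gives the same state probabilities as the recursion. A sketch with illustrative numbers of our own choosing:

```python
from itertools import product
import numpy as np

P1 = np.array([0.3, 0.7])          # illustrative initial distribution
Pi = np.array([[0.9, 0.1],
               [0.4, 0.6]])        # illustrative one-step transition matrix
M = 4                              # number of time steps

# Left side of (5.13): marginalize the joint probability (5.7) over all paths
marginal = np.zeros(2)
for path in product(range(2), repeat=M):
    prob = P1[path[0]]
    for prev, cur in zip(path, path[1:]):
        prob *= Pi[prev, cur]
    marginal[path[-1]] += prob

# Right side: the recursion (5.13), i.e. P_m = Pi^T P_{m-1}
P = P1.copy()
for _ in range(M - 1):
    P = Pi.T @ P

print(marginal, P)   # identical up to round-off
```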
If a simple vector Markov chain of dimension r is considered, h_m = {η_{i,m}, i = 1,...,r}, then the recurrent relation (similar to eq. (5.13)) has the following form:

P_m(\eta_{1,m}, \ldots, \eta_{r,m}) = \sum_{\eta_{1,m-1} = \vartheta_{1,1}}^{\vartheta_{1,K_1}} \cdots \sum_{\eta_{r,m-1} = \vartheta_{r,1}}^{\vartheta_{r,K_r}} \pi_m(\eta_{1,m}, \ldots, \eta_{r,m} \mid \eta_{1,m-1}, \ldots, \eta_{r,m-1})\, P_{m-1}(\eta_{1,m-1}, \ldots, \eta_{r,m-1})    (5.15)

or, in a more compact form,

P_m(\mathbf{h}_m) = \sum_{\mathbf{h}_{m-1} \in \Theta} \pi_m(\mathbf{h}_m \mid \mathbf{h}_{m-1})\, P_{m-1}(\mathbf{h}_{m-1})    (5.16)
and the summation is conducted over all possible states Θ = {ϑ_{i,k_i}, k_i = 1,...,K_i} of each component η_{i,m}, i = 1,...,r; here K_i is the number of states of the i-th component. For a Markov chain of order n, one can write an equation similar to eq. (5.7), based on eqs. (5.1) and (5.8):

P_m(\vartheta_{m-n+1}, \ldots, \vartheta_m) = \sum_{\vartheta_{m-n} = \vartheta^{(1)}}^{\vartheta^{(K)}} \pi_m(\vartheta_m \mid \vartheta_{m-n}, \ldots, \vartheta_{m-1})\, P_{m-1}(\vartheta_{m-n}, \ldots, \vartheta_{m-1})    (5.17)

where m > n. At the initial steps 2 ≤ m ≤ n, one has to set n = m − 1. In this case the following recurrent relation holds:

P_m(\vartheta_2, \ldots, \vartheta_m) = \sum_{\vartheta_1 = \vartheta^{(1)}}^{\vartheta^{(K)}} \pi_m(\vartheta_m \mid \vartheta_1, \ldots, \vartheta_{m-1})\, P_{m-1}(\vartheta_1, \ldots, \vartheta_{m-1})    (5.18)
Expressions (5.17) and (5.18) relate the joint probabilities of the last n states of the discrete random sequence {ϑ_m}. It can also be noted that the same relations can be obtained from the recurrent relation (5.16) for the probabilities of states of the Markov chain of order n, if one sets r = n and uses eqs. (5.10) and (5.12). In this case the state space of each component is identical, and thus

P_m(\mathbf{h}_m) = \sum_{\eta_{1,m-1} = \vartheta^{(1)}}^{\vartheta^{(K)}} \pi_m(\eta_{n,m} \mid \eta_{1,m-1}, \eta_{1,m}, \ldots, \eta_{n-1,m})\, P_{m-1}(\eta_{1,m-1}, \eta_{1,m}, \ldots, \eta_{n-1,m})
 = \sum_{\vartheta_{m-n} = \vartheta^{(1)}}^{\vartheta^{(K)}} \pi_m(\vartheta_m \mid \vartheta_{m-n}, \vartheta_{m-n+1}, \ldots, \vartheta_{m-1})\, P_{m-1}(\vartheta_{m-n}, \vartheta_{m-n+1}, \ldots, \vartheta_{m-1})    (5.19)

where n < m (m ≥ 2). The obtained relation coincides with eq. (5.17). As an example of the application of a simple Markov chain, one can consider the following problem. Let the initial state of a system be known at the initial moment of time t₁
and let the transition probabilities be specified as

p_k(m) = P(\vartheta_m = \vartheta^{(k)}), \qquad \pi_{jk}(\mu, m) = P\{\vartheta_m = \vartheta^{(k)} \mid \vartheta_\mu = \vartheta^{(j)}\}, \qquad j, k = 1, \ldots, K, \quad 1 \le \mu \le m    (5.20)
The interpretation of this notation is as follows: the probability p_k(m) describes the (unconditional) probability of finding the system in the state ϑ^{(k)} after m − 1 steps (i.e. at the moment of time t = t_m); the probability p_k(0) = p_k⁰ describes the initial distribution of the states of the system. The conditional probability π_{jk}(μ, m) defines the probability that the system resides in the state ϑ^{(k)} at the moment of time t_m, given that it resided in the state ϑ^{(j)} at the moment of time t = t_μ < t_m. This probability is often referred to as the transitional probability from the state ϑ^{(j)} to the state ϑ^{(k)} over m − μ steps. It is required to find the probabilities of the states of the system at the moment of time t_m > t₁, and to find their limit as m approaches infinity. It is clear that all the quantities defined by eq. (5.20) are non-negative and satisfy the following normalization conditions:

\sum_{k=1}^{K} p_k(m) = 1, \qquad p_k(m) \ge 0, \qquad m = 1, \ldots, M    (5.21)

\sum_{k=1}^{K} \pi_{jk}(\mu, m) = 1, \qquad \pi_{jk}(\mu, m) \ge 0, \qquad j = 1, \ldots, K    (5.22)
Using the Bayes theorem one can immediately write

p_k(m) = \sum_{j=1}^{K} p_j(\mu)\, \pi_{jk}(\mu, m), \qquad k = 1, \ldots, K, \quad 1 \le \mu \le m    (5.23)
Transition of the system from the state ϑ^{(j)} to the state ϑ^{(k)} over a few steps can be completed in a number of ways; in other words, the system can reside in different intermediate states ϑ^{(i)}. Using the Bayes rule and the definition of a Markov process, one deduces

\pi_{jk}(\mu, m) = \sum_{i=1}^{K} \pi_{ji}(\mu, n)\, \pi_{ik}(n, m), \qquad j, k = 1, \ldots, K, \quad \mu \le n \le m    (5.24)
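In matrix form, eq. (5.24) says that a multi-step transition matrix factors through any intermediate step. The sketch below checks this for a non-homogeneous chain built from random stochastic matrices; the construction (three states, five steps, a fixed seed) is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_stochastic(k):
    """A random k x k stochastic matrix: non-negative rows summing to one."""
    m = rng.random((k, k))
    return m / m.sum(axis=1, keepdims=True)

# A non-homogeneous chain: a different one-step matrix at every step
steps = [random_stochastic(3) for _ in range(5)]

def pi(mu, m):
    """Multi-step transition matrix pi(mu, m): product of one-step matrices."""
    out = np.eye(3)
    for one_step in steps[mu:m]:
        out = out @ one_step
    return out

# Markov (Chapman-Kolmogorov) equation (5.24): pi(mu, m) = pi(mu, n) pi(n, m)
lhs = pi(0, 5)
rhs = pi(0, 2) @ pi(2, 5)
print(np.allclose(lhs, rhs))               # True
print(np.allclose(lhs.sum(axis=1), 1.0))   # the product is still stochastic
```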
This relation for a chain with a finite number of states was developed by A. Markov and is known as the Markov equation. It is a particular case of the more general Kolmogorov–Chapman–Smoluchowski equation considered below. Equations (5.23) and (5.24) can be rewritten in a compact form using the following matrix notation:

\Pi(\mu, m) = \| \pi_{jk}(\mu, m) \|, \qquad \mathbf{P}(m) = \| p_k(m) \|    (5.25)
In this notation, eq. (5.23) becomes

\mathbf{P}(m) = \Pi^T(\mu, m)\, \mathbf{P}(\mu)    (5.26)

while eq. (5.24) takes the form

\Pi(\mu, m) = \Pi(\mu, n)\, \Pi(n, m)    (5.27)
By design, the matrix Π(μ, m) is a square matrix with non-negative elements, and the sum of the elements in any row equals unity. Such matrices are called Markov, or stochastic, matrices. A distinction can be made between homogeneous and non-homogeneous Markov chains. A homogeneous chain is characterized by the fact that its transitional matrix depends only on the difference of its arguments, i.e.

\Pi(\mu, m) = \Pi(m - \mu), \qquad \pi_{jk}(\mu, m) = \pi_{jk}(m - \mu), \qquad m > \mu    (5.28)
All other chains are non-homogeneous. For a homogeneous Markov chain, one can consider the one-step transitional matrix, which does not depend on any time parameter:

\Pi = \Pi(1), \qquad \pi_{jk} = \pi_{jk}(1)    (5.29)
There is a simple relation between the matrix Π and the transitional probability matrix Π(μ, m) = Π(m − μ) between the μ-th and the m-th time steps. Indeed, using property (5.27), one sequentially obtains

\Pi(m - \mu) = \Pi(m - \mu - 1)\,\Pi(1) = \Pi(m - \mu - 2)\,\Pi(1)\,\Pi(1) = \cdots = \Pi^{m-\mu}    (5.30)

Thus, the k-step transitional matrix is just the one-step transitional matrix raised to the power k:

\Pi(k) = \Pi^k    (5.31)

Furthermore, using a property of matrix transposition, it is possible to calculate the probabilities P(m + k) of the states of the system after m further steps if the probabilities P(k) are known. Indeed, using eq. (5.30) in eq. (5.26), one readily concludes that

\mathbf{P}(m + k) = \Pi^T(k, m + k)\, \mathbf{P}(k) = (\Pi^m)^T \mathbf{P}(k) = (\Pi^T)^m \mathbf{P}(k)    (5.32)
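Equation (5.32) can be tested directly: raising the transposed one-step matrix to the m-th power and applying it to the reference probabilities must agree with applying eq. (5.26) one step at a time. A sketch with an illustrative two-state chain (the numbers are our own):

```python
import numpy as np

Pi = np.array([[0.9, 0.1],
               [0.4, 0.6]])          # illustrative one-step matrix
P0 = np.array([0.5, 0.5])            # probabilities at the reference step
m = 8

# Eq. (5.32): probabilities after m further steps via the m-th matrix power
direct = np.linalg.matrix_power(Pi.T, m) @ P0

# The same result by applying eq. (5.26) step by step
stepwise = P0.copy()
for _ in range(m):
    stepwise = Pi.T @ stepwise

print(direct, stepwise)   # identical up to round-off
```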
The class of homogeneous Markov chains can be further restricted to stationary chains. For a stationary chain, the probabilities P(m) of the states do not depend on the step index m, i.e. they are equal at each step. All other chains are non-stationary. It follows from the definition of a stationary chain that

\mathbf{P}(m) = \mathbf{P}(1) = \mathbf{P} = \| p_k \|    (5.33)

This, combined with eq. (5.26), shows that

\mathbf{P} = \Pi^T \mathbf{P}    (5.34)
i.e. the vector of stationary probabilities is the eigenvector of the transposed transitional matrix Π^T corresponding to the eigenvalue λ = 1. It is important in many applications to decide whether there is a limiting stationary distribution P = lim_{m→∞} P(m) and, if it exists, how to find it. The following theorems provide an answer to this question; a detailed proof can be found in the literature [7]. It is said that a state ϑ^{(k)} is reachable from a state ϑ^{(j)} if, after a certain (finite) number of steps, there is a positive probability that the system reaches the state ϑ^{(k)}; in other words, there is a finite number m such that π_{jk}(m) > 0.

Theorem 1. For any Markov chain with a finite number of states and with all states reachable from all other states, i.e. such that for any j and k there is an integer m_{jk} with

\pi_{jk}(m_{jk}) > 0    (5.35)

the final probabilities

p_k = \lim_{M \to \infty} p_k(M)    (5.36)

exist.
These final probabilities do not depend on the initial probabilities p_k(1) at the first step. A Markov chain for which the final probabilities exist is called an ergodic Markov chain.

Theorem 2. If condition (5.35) is satisfied, then the final probabilities p_k, k = 1,...,K, are the unique solution of the system of linear equations

p_k = \sum_{j=1}^{K} p_j\, \pi_{jk}, \qquad k = 1, \ldots, K    (5.37)
subject to the additional (normalization) condition

\sum_{j=1}^{K} p_j = 1    (5.38)
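In practice, the final probabilities can be found by replacing one of the K linearly dependent balance equations (5.37) with the normalization (5.38) and solving the resulting linear system. A sketch with an illustrative three-state ergodic chain of our own choosing:

```python
import numpy as np

# Illustrative three-state ergodic chain (rows sum to one)
Pi = np.array([[0.5, 0.3, 0.2],
               [0.1, 0.7, 0.2],
               [0.2, 0.2, 0.6]])
K = Pi.shape[0]

# Replace the last (linearly dependent) balance equation of p = Pi^T p
# by the normalization condition sum(p) = 1, then solve the linear system.
A = np.vstack([(Pi.T - np.eye(K))[:-1], np.ones(K)])
rhs = np.zeros(K)
rhs[-1] = 1.0
p = np.linalg.solve(A, rhs)
print(p)         # final probabilities
print(p @ Pi)    # equals p: the distribution is stationary
```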
These final probabilities form a stationary probability vector (distribution) P = ‖p_k‖. A Markov chain is stationary if its initial distribution P(1) coincides with the final probabilities. It is easy to see that the matrix form of eq. (5.37) coincides with that of eq. (5.34). It can be seen from eqs. (5.37) and (5.38) that there are K + 1 equations for determining only K unknown final probabilities. However, the K equations in eq. (5.37) are linearly dependent: summing both sides over all possible values of the index k yields an identity, indicating that only K − 1 of them are really independent. As a result, the unknown probabilities p_k can be determined from any K − 1 of the equations (5.37) together with the additional eq. (5.38). In order to illustrate this fact, the following two important examples are considered.

Example: a symmetric Markov chain with two states (Figure 5.2). Let ϑ_m, m = 1,...,M, be a Markov chain with two possible states ϑ⁽¹⁾ and ϑ⁽²⁾. In the initial state the probability of the state ϑ⁽¹⁾ is p₁(0) = p₁⁰ = α, while the probability of the second
MARKOV PROCESSES AND THEIR DESCRIPTION
Figure 5.2 A symmetric Markov chain with two states.
state $\vartheta_2$ is $p_2(0) = p_{02} = \beta$, where $\alpha + \beta = 1$. One-step transitional probabilities are described by the following equalities: $\pi_{11} = \pi_{22} = p$ and $\pi_{12} = \pi_{21} = q$, where $p + q = 1$, $p > 0$. Since its one-step transitional matrix is symmetric, the corresponding Markov chain is called a symmetric Markov chain. The goal is to determine the transitional probabilities $\pi_{jk}(m-1)$ after $(m-1)$ steps, the probabilities $p_k(m)$ of each state at the $m$-th step, and the final probabilities $p_k$. One is also required to determine the reversed probabilities $\tilde\pi_{kj}(m-1) = P\{\xi_1 = \vartheta_j \mid \xi_m = \vartheta_k\}$; in particular, the probability $\tilde\pi_{11}(m-1) = P\{\xi_1 = \vartheta_1 \mid \xi_m = \vartheta_1\}$ is of interest. This symmetric Markov chain is homogeneous, since the transitional probabilities do not depend on the step $m$. Setting $\mu = 1$ and $n = 0$ in the Markov equation (5.27), one immediately obtains

$$\pi_{jk}(m-1) = \sum_{i=1}^{2} \pi_{ji}\,\pi_{ik}(m-2) \tag{5.39}$$
At the same time, the equation for the probabilities of the states at the $m$-th step can be found from expression (5.26):

$$p_k(m) = \sum_{j=1}^{2} p_j(1)\,\pi_{jk}(m-1) \tag{5.40}$$
The probabilities $\pi_{jk}(m-1)$ can be calculated in two ways: either by using eq. (5.31) and matrix calculus, or step by step using eq. (5.39). The second technique is investigated here in more detail. The normalization condition (5.22) can be written as

$$\pi_{11}(m-1) + \pi_{12}(m-1) = 1, \qquad \pi_{22}(m-1) + \pi_{21}(m-1) = 1 \tag{5.41}$$

Since there is complete symmetry in this problem, one can justify that $\pi_{12}(m-1) = \pi_{21}(m-1)$ and, consequently, $\pi_{11}(m-1) = \pi_{22}(m-1)$. Thus one needs to calculate only one of these probabilities in order to find all the others; for example, one can concentrate on calculating $\pi_{11}(m-1)$. In order to do that, relation (5.22) can be exploited to produce

$$\pi_{11}(m) = \pi_{11}\,\pi_{11}(m-1) + \pi_{12}\,\pi_{21}(m-1) = p\,\pi_{11}(m-1) + q\,[1 - \pi_{11}(m-1)]$$
$$= (p-q)\,\pi_{11}(m-1) + \frac{1-(p-q)}{2} = \frac{1}{2} + \frac{p-q}{2}\,\bigl[2\pi_{11}(m-1) - 1\bigr] \tag{5.42}$$
Since, for $m = 1$, the system is in the initial state, one deduces that

$$\pi_{11}(0) = \pi_{22}(0) = 1, \qquad \pi_{12}(0) = \pi_{21}(0) = 0 \tag{5.43}$$
Now it is possible to derive the expressions for all the transitional probabilities by applying eqs. (5.41) and (5.42) sequentially. Indeed, it follows from eqs. (5.42) and (5.43) that

$$\pi_{11}(1) = \frac{1}{2} + \frac{p-q}{2}\,[2\pi_{11}(0) - 1] = \frac{1}{2} + \frac{p-q}{2}$$
$$\pi_{11}(2) = \frac{1}{2} + \frac{p-q}{2}\,[2\pi_{11}(1) - 1] = \frac{1}{2} + \frac{(p-q)^2}{2}$$
$$\vdots$$
$$\pi_{11}(m) = \frac{1}{2} + \frac{p-q}{2}\,[2\pi_{11}(m-1) - 1] = \frac{1}{2} + \frac{(p-q)^m}{2} \tag{5.44}$$

Once $\pi_{11}(m)$ is found, the rest of the transitional probabilities can be found from eq. (5.41):

$$\pi_{22}(m) = \pi_{11}(m) = \frac{1}{2} + \frac{(p-q)^m}{2} \tag{5.45}$$

$$\pi_{12}(m) = \pi_{21}(m) = 1 - \pi_{11}(m) = \frac{1}{2} - \frac{(p-q)^m}{2} \tag{5.46}$$
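The closed-form results (5.44)–(5.46) can be checked against the matrix-power computation of eq. (5.31). A minimal sketch, with illustrative values $p = 0.8$, $q = 0.2$:

```python
import numpy as np

# Symmetric two-state chain: pi_11 = pi_22 = p, pi_12 = pi_21 = q
p, q = 0.8, 0.2
P = np.array([[p, q],
              [q, p]])

for m in [1, 2, 5, 20]:
    Pm = np.linalg.matrix_power(P, m)        # m-step transitional matrix, eq. (5.31)
    closed_form = 0.5 + 0.5 * (p - q) ** m   # pi_11(m), eq. (5.44)
    assert abs(Pm[0, 0] - closed_form) < 1e-12
    print(m, Pm[0, 0])
```

As $m$ grows, $(p-q)^m \to 0$ and every row of the matrix power approaches the final distribution $(0.5, 0.5)$.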
Finally, the probabilities of each state can be found through the probabilities of the initial state and the transitional probabilities (5.44)–(5.46):

$$p_1(m) = p_1(1)\,\pi_{11}(m-1) + p_2(1)\,\pi_{21}(m-1) = \alpha\left[\frac{1}{2} + \frac{(p-q)^{m-1}}{2}\right] + \beta\left[\frac{1}{2} - \frac{(p-q)^{m-1}}{2}\right] = \frac{1}{2} + \frac{\alpha-\beta}{2}\,(p-q)^{m-1} \tag{5.47}$$

$$p_2(m) = p_2(1)\,\pi_{22}(m-1) + p_1(1)\,\pi_{12}(m-1) = \frac{1}{2} - \frac{\alpha-\beta}{2}\,(p-q)^{m-1} \tag{5.48}$$

Since $p > 0$ and all the states are reachable, $|p-q| < 1$ and

$$\lim_{m\to\infty} p_1(m) = \lim_{m\to\infty} p_2(m) = \frac{1}{2} \tag{5.49}$$

This indicates that the chain is an ergodic Markov chain and the final probabilities $p_1$ and $p_2$ of the states can be found. The latter can be accomplished by solving a system of linear equations, obtained directly from eqs. (5.37) and (5.38):

$$p_1 = p\,p_1 + q\,p_2, \quad p_2 = q\,p_1 + p\,p_2 \qquad\text{or}\qquad p_1 = p\,p_1 + q\,p_2, \quad p_1 + p_2 = 1 \tag{5.50}$$

The only solution of either of these systems is $p_1 = p_2 = 0.5$. Further analysis of the expressions for the probabilities of the states shows that there is a transient process before the stationary (final) distribution of states is achieved. It also becomes obvious
that the approach to stationarity has an exponential nature, since the variable part of $p_1(m)$ is proportional to $(p-q)^m$. On the other hand, if $p = q = 0.5$, the final probabilities are achieved immediately and the process is stationary.

In order to find the reversed probabilities $\tilde\pi_{kj}(m-1) = P\{\xi_1 = \vartheta_j \mid \xi_m = \vartheta_k\}$, the Bayes theorem can be used. Indeed, according to this theorem

$$\tilde\pi_{kj}(m-1) = P\{\xi_1 = \vartheta_j \mid \xi_m = \vartheta_k\} = \frac{P\{\xi_1 = \vartheta_j,\ \xi_m = \vartheta_k\}}{p_k(m)} = \frac{\pi_{jk}(m-1)\,p_j(1)}{p_k(m)} \tag{5.51}$$

In particular, this leads to the following expression for the reversed transition probability $\tilde\pi_{11}(m-1)$:

$$\tilde\pi_{11}(m-1) = \frac{\pi_{11}(m-1)\,p_1(1)}{p_1(m)} = \frac{\alpha\,\bigl[1 + (p-q)^{m-1}\bigr]}{1 + (\alpha-\beta)(p-q)^{m-1}} \tag{5.52}$$
This clearly shows that, in general, the reversed transitional probability is not equal to the forward transitional probability.

Example: A quasi-random telegraph signal. Let $\xi(t)$ be a random process with two states $\pm\vartheta_0$ which changes state only at the fixed moments of time $t_n = nT_0 + \varepsilon$, where $T_0$ is a constant, $n = 0, 1, 2, \ldots$, and $\varepsilon$ is a random variable uniformly distributed on the interval $[0, T_0]$ and independent of $\xi(t)$. It is assumed that the conditions of the previous example also hold; in particular, $\vartheta_1 = \vartheta_0$ and $\vartheta_2 = -\vartheta_0$. Such a signal is known as a quasi-random telegraph signal. It is important to note that the change of state is allowed only at certain instants of time, in contrast to the example considered later in Section 5.1.3. It is required to find the correlation function and the power spectral density of such a signal in the stationary case, i.e. assuming that $p_1 = p_2 = 0.5$. The expectation $\langle\xi(t)\rangle$ of the process is equal to zero since

$$\langle\xi(t)\rangle = p_1\,\vartheta_0 + p_2\,(-\vartheta_0) = 0 \tag{5.53}$$

Thus, its covariance function $C(\tau)$ is just the average of the product of two values of the process at separate moments of time, i.e.

$$C(\tau) = \langle\xi(t)\,\xi(t+\tau)\rangle \tag{5.54}$$
The covariance function can first be evaluated assuming that the value of $\varepsilon$ is fixed. Let the interval of length $\tau$ contain $m$ points where the process can change its value, as shown in Fig. 5.3. Since the signal can assume only two values $\pm\vartheta_0$, the product $\xi(0)\xi(\tau)$ assumes only two possible values as well. This value is $\vartheta_0^2$ if both ends of the interval lie on the same side of the real axis, i.e. if either $\xi(0) = \xi(\tau) = \vartheta_0$ or $\xi(0) = \xi(\tau) = -\vartheta_0$. If the ends of the interval lie on pulses of different polarity, then $\xi(0)\xi(\tau) = -\vartheta_0^2$. Thus, one can write

$$C_m(\tau \mid \varepsilon) = \vartheta_0^2\,P\{\xi(0) = \vartheta_0,\ \xi(\tau) = \vartheta_0\} + \vartheta_0^2\,P\{\xi(0) = -\vartheta_0,\ \xi(\tau) = -\vartheta_0\}$$
$$-\ \vartheta_0^2\,P\{\xi(0) = \vartheta_0,\ \xi(\tau) = -\vartheta_0\} - \vartheta_0^2\,P\{\xi(0) = -\vartheta_0,\ \xi(\tau) = \vartheta_0\}$$
$$= \vartheta_0^2\,[p_1\pi_{11}(m) + p_2\pi_{22}(m)] - \vartheta_0^2\,[p_1\pi_{12}(m) + p_2\pi_{21}(m)] \tag{5.55}$$
Figure 5.3 Quasi-random telegraph signal.
Since for the stationary case $p_1 = p_2 = 0.5$ and, as has been shown in the example above,

$$\pi_{11}(m) = \pi_{22}(m) = \frac{1 + (p-q)^m}{2}, \qquad \pi_{12}(m) = \pi_{21}(m) = \frac{1 - (p-q)^m}{2} \tag{5.56}$$

the expression for the covariance function becomes

$$C_m(\tau \mid \varepsilon) = \vartheta_0^2\,(p-q)^m \tag{5.57}$$
If the values of $\tau$ and $m$ are such that $(m-1)T_0 \le \tau \le mT_0$, then there are two possibilities:

1. If $\varepsilon < mT_0 - \tau$, then the interval contains $(m-1)$ points where the system is able to change its state. As a result, the covariance function for these values of $\tau$ and $\varepsilon$ is

$$C_{m-1}(\tau \mid \varepsilon) = \vartheta_0^2\,(p-q)^{m-1} \tag{5.58}$$

2. If $\varepsilon > mT_0 - \tau$, then there are $m$ points where the system is able to change its state. As a result, the covariance function for these values of $\tau$ and $\varepsilon$ is

$$C_m(\tau \mid \varepsilon) = \vartheta_0^2\,(p-q)^m \tag{5.59}$$
Since the random variable $\varepsilon$ is uniformly distributed on the interval $[0, T_0]$, its probability density function is a constant

$$p_\varepsilon(\varepsilon) = \frac{1}{T_0}, \qquad \varepsilon \in [0, T_0] \tag{5.60}$$
and the unconditional covariance function $C(\tau)$ can be obtained by averaging the conditional covariance function $C(\tau \mid \varepsilon)$:

$$C(\tau) = \int_0^{T_0} C(\tau \mid \varepsilon)\,p_\varepsilon(\varepsilon)\,d\varepsilon = \frac{1}{T_0}\left[\int_0^{mT_0-\tau} C_{m-1}(\tau \mid \varepsilon)\,d\varepsilon + \int_{mT_0-\tau}^{T_0} C_m(\tau \mid \varepsilon)\,d\varepsilon\right]$$
$$= \vartheta_0^2\left[m - \frac{\tau}{T_0}\right](p-q)^{m-1} + \vartheta_0^2\left[\frac{\tau}{T_0} - (m-1)\right](p-q)^m \tag{5.61}$$

Given that $C(\tau)$ is an even function of $\tau$, the final expression for the covariance function takes the following form

$$C(\tau) = \vartheta_0^2\left[m - \frac{|\tau|}{T_0}\right](p-q)^{m-1} + \vartheta_0^2\left[\frac{|\tau|}{T_0} - (m-1)\right](p-q)^m \tag{5.62}$$

The power spectral density of the quasi-random telegraph signal can then be found by applying the Wiener–Khinchin theorem:

$$S(\omega) = 2\int_0^{\infty} C(\tau)\cos\omega\tau\,d\tau = \sum_{m=1}^{\infty} 2\vartheta_0^2\int_{(m-1)T_0}^{mT_0}\left\{\left[m - \frac{\tau}{T_0}\right](p-q)^{m-1} + \left[\frac{\tau}{T_0} - (m-1)\right](p-q)^m\right\}\cos\omega\tau\,d\tau$$
$$= \vartheta_0^2 T_0\,\frac{1-(p-q)^2}{1 - 2(p-q)\cos\omega T_0 + (p-q)^2}\left(\frac{\sin(\omega T_0/2)}{\omega T_0/2}\right)^2 \tag{5.63}$$

For the particular case of $p = q = 0.5$ this expression simplifies to

$$C(\tau) = \begin{cases}\vartheta_0^2\left(1 - \dfrac{|\tau|}{T_0}\right), & |\tau| < T_0\\[4pt] 0, & |\tau| \ge T_0\end{cases}, \qquad S(\omega) = \vartheta_0^2 T_0\left(\frac{\sin(\omega T_0/2)}{\omega T_0/2}\right)^2 \tag{5.64}$$

It is important to note that if the shift is deterministic and equal to zero, $\varepsilon = 0$, then the quasi-random telegraph signal becomes non-stationary. The easiest way to show this is to consider the case of $p = q = 0.5$. For this signal the covariance function is⁴

$$C(t_1, t_2) = \langle\xi(t_1)\,\xi(t_2)\rangle = \vartheta_0^2\,P\{\xi(t_1) = \vartheta_0,\ \xi(t_2) = \vartheta_0\} + \vartheta_0^2\,P\{\xi(t_1) = -\vartheta_0,\ \xi(t_2) = -\vartheta_0\}$$
$$-\ \vartheta_0^2\,P\{\xi(t_1) = \vartheta_0,\ \xi(t_2) = -\vartheta_0\} - \vartheta_0^2\,P\{\xi(t_1) = -\vartheta_0,\ \xi(t_2) = \vartheta_0\} = \begin{cases}\vartheta_0^2, & (m-1)T_0 < t_1, t_2 < mT_0\\ 0, & \text{otherwise}\end{cases} \tag{5.65}$$

⁴ This signal still has zero mean at any moment of time.
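The triangular covariance (5.64) lends itself to a quick Monte-Carlo check. The sketch below is illustrative, not from the text: it estimates $C(\tau)$ for $p = q = 0.5$ by drawing the random shift $\varepsilon$ and using the fact that, for independent symbols, the product $\xi(t)\xi(t+\tau)$ equals $\vartheta_0^2$ when both instants fall in the same symbol interval and is a zero-mean $\pm\vartheta_0^2$ variable otherwise:

```python
import numpy as np

rng = np.random.default_rng(0)
T0, theta0 = 1.0, 1.0
t = 10.0  # arbitrary observation instant (the randomly shifted signal is stationary)

def cov_estimate(tau, n_runs=200_000):
    """Monte-Carlo estimate of C(tau) = <xi(t) xi(t + tau)> for the
    quasi-random telegraph signal with p = q = 0.5 (independent symbols)."""
    eps = rng.uniform(0.0, T0, size=n_runs)
    # index of the symbol interval [k T0 + eps, (k+1) T0 + eps) containing each instant
    i1 = np.floor((t - eps) / T0).astype(int)
    i2 = np.floor((t + tau - eps) / T0).astype(int)
    # same interval -> product is theta0^2; different intervals -> the two
    # symbols are independent, so the product is a zero-mean +/-theta0^2 flip
    same = i1 == i2
    sign = np.where(same, 1.0, rng.choice([-1.0, 1.0], size=n_runs))
    return theta0 ** 2 * np.mean(sign)

for tau in [0.0, 0.25, 0.5, 0.75, 1.25]:
    theory = theta0 ** 2 * max(0.0, 1.0 - tau / T0)  # eq. (5.64)
    print(tau, round(cov_estimate(tau), 3), theory)
```

The estimates reproduce the triangle $1 - |\tau|/T_0$ within Monte-Carlo noise and vanish for $|\tau| > T_0$.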
Thus, the covariance function depends separately on the values of $t_1$ and $t_2$, and the process $\xi(t)$ is non-stationary.

5.1.2 Markov Sequences
A Markov sequence is a sequence of random variables $\xi_m = \xi(t_m)$ which attains a continuum of values at discrete moments of time $t_1 < t_2 < \cdots < t_M$. A sampled continuous Markov random process is an example of a Markov sequence. A Markov sequence is a good way to describe coherent digital communication systems, since they change their state only at certain fixed moments of time; at the same time, the number of possible states is so large that a description in terms of a discrete number of states is rather complicated. A sequence of random variables $\xi_m$ is called a Markov sequence if for any $m$ the conditional distribution function (or, equivalently, the conditional probability density) satisfies the following conditions

$$P_m(x_m \mid x_1, \ldots, x_{m-1}) = P_m(x_m \mid x_{m-1}) \tag{5.66}$$

$$\pi_m(x_m \mid x_1, \ldots, x_{m-1}) = \pi_m(x_m \mid x_{m-1}) \tag{5.67}$$
These relations reflect the fact that, for a Markov sequence, conditional probabilities and distributions at the future moment of time $t_m$ depend only on the state of the system at the current moment of time $t_{m-1}$, not on the past states of the system. It follows from eq. (5.67) that the joint probability density function $p_M(x_1, \ldots, x_M)$ can be expressed in terms of the one-step transitional probability densities $\pi_m(x_m \mid x_{m-1})$, $m = 2, \ldots, M$, and the probability density of the initial state $p_1(x_1)$:

$$p_m(x_1, \ldots, x_m) = p_1(x_1)\prod_{\nu=2}^{m}\pi_\nu(x_\nu \mid x_{\nu-1}) \tag{5.68}$$

The last equation can also be considered as a definition of a Markov sequence, since it follows from eq. (5.68) that

$$\pi_m(x_m \mid x_1, \ldots, x_{m-1}) = \frac{p_m(x_1, \ldots, x_m)}{p_{m-1}(x_1, \ldots, x_{m-1})} = \pi_m(x_m \mid x_{m-1}) \tag{5.69}$$
The following is a brief summary of the properties of a Markov sequence.

1. If the state of a Markov sequence is known at the current moment of time, its future states do not depend on any past states. This means that if the state $x_n$ is known, then for any $m > n > \nu$ the random variables $x_m$ and $x_\nu$ are conditionally independent given $x_n$:

$$p(x_m, x_\nu \mid x_n) = \pi_{mn}(x_m \mid x_n)\,\pi_{\nu n}(x_\nu \mid x_n) \tag{5.70}$$

Indeed, using eq. (5.67) one can write

$$p(x_m, x_\nu \mid x_n) = \frac{p(x_\nu, x_n, x_m)}{p(x_n)} = \frac{p(x_\nu, x_n)\,\pi_{mn}(x_m \mid x_n)}{p(x_n)} = \pi_{mn}(x_m \mid x_n)\,\pi_{\nu n}(x_\nu \mid x_n) \tag{5.71}$$

A similar statement can be derived for several moments in the past and in the future.
2. Any sub-sequence of a Markov sequence is also a Markov sequence, i.e. for given moments of time $t_{m_1} < t_{m_2} < \cdots < t_{m_n}$ the following equation is valid:

$$\pi(x_{m_n} \mid x_{m_1}, \ldots, x_{m_{n-1}}) = \pi(x_{m_n} \mid x_{m_{n-1}}) \tag{5.72}$$

3. A Markov sequence remains Markov if the time direction is reversed, i.e.

$$\pi(x_m \mid x_{m+1}, \ldots, x_{m+n}) = \pi(x_m \mid x_{m+1}) \tag{5.73}$$

This can be proven by expanding eq. (5.73) using the Bayes formula and property (5.67):

$$\pi(x_m \mid x_{m+1}, \ldots, x_{m+n}) = \frac{p(x_m, x_{m+1}, \ldots, x_{m+n})}{p(x_{m+1}, \ldots, x_{m+n})} = \frac{p(x_m, x_{m+1})\,p(x_{m+2}, \ldots, x_{m+n} \mid x_{m+1})}{p(x_{m+1})\,p(x_{m+2}, \ldots, x_{m+n} \mid x_{m+1})} = \pi(x_m \mid x_{m+1}) \tag{5.74}$$
4. The transitional probability density satisfies the equation

$$\pi(x_m \mid x_\nu) = \int_{-\infty}^{\infty}\pi(x_m \mid x_n)\,\pi(x_n \mid x_\nu)\,dx_n, \qquad m > n > \nu \tag{5.75}$$

which is a particular case of the Kolmogorov–Chapman equation. Indeed, for any Markov sequence, the joint probability density at three moments of time can be expressed as

$$p(x_\nu, x_n, x_m) = p(x_\nu)\,\pi(x_n \mid x_\nu)\,\pi(x_m \mid x_n) \tag{5.76}$$

Integrating the latter over $x_n$ one obtains

$$p(x_\nu, x_m) = \int_{-\infty}^{\infty} p(x_\nu, x_n, x_m)\,dx_n = p(x_\nu)\int_{-\infty}^{\infty}\pi(x_n \mid x_\nu)\,\pi(x_m \mid x_n)\,dx_n \tag{5.77}$$

which is equivalent to eq. (5.75).

A Markov sequence is called a homogeneous Markov sequence if the transitional probability density $\pi_m(x_m \mid x_{m-1})$ does not depend on $m$. A Markov sequence is called a stationary Markov sequence if it is a homogeneous sequence and the probabilities of the states do not depend on the instant of time

$$p_m(x_m) = p(x_m) = \mathrm{const.} \tag{5.78}$$
Similar to a Markov chain of $n$-th order, one can consider a Markov sequence of order $n \ge 1$, defined as

$$\pi_m(x_m \mid x_1, \ldots, x_{m-1}) = \begin{cases}\pi_m(x_m \mid x_{m-n}, \ldots, x_{m-1}), & m > n\\ \pi_m(x_m \mid x_1, \ldots, x_{m-1}), & 2 \le m \le n\end{cases} \tag{5.79}$$
For $n = 1$ the Markov sequence of order $n$ coincides with a simple Markov sequence. For $m \le n$ one has to formally assume $n = m - 1$ ($m \ge 2$). It is possible to introduce an $n$-dimensional random vector

$$\boldsymbol{\xi}_m^n = \{\xi_{i,m},\ i = 1, \ldots, n\} \tag{5.80}$$

with components on two sequential steps defined as

$$\xi_{i,m} = \xi_{i+1,m-1} = \xi_{m-n+i}, \qquad \xi_{1,m-1} = \xi_{m-n} \qquad (n < m) \tag{5.81}$$

It is easy to see that this new vector is a simple Markov vector. Its transitional probability density can be expressed as

$$\pi_m(\mathbf{X}_m \mid \mathbf{X}_{m-1}) = \pi_m(x_{n,m} \mid x_{1,m-1}, \ldots, x_{n,m-1})\prod_{i=1}^{n-1}\delta(x_{i,m} - x_{i+1,m-1}) \tag{5.82}$$
The probability density $p_m(x_{m-n+1}, \ldots, x_m)$ obeys the following recurrence equation, similar to eq. (5.13):

$$p_m(x_{m-n+1}, \ldots, x_m) = \int_{-\infty}^{\infty}\pi_m(x_m \mid x_{m-n}, \ldots, x_{m-1})\,p_{m-1}(x_{m-n}, \ldots, x_{m-1})\,dx_{m-n} \tag{5.83}$$

where $m > n$. On the steps $2 \le m \le n$ the following equality holds for a Markov vector

$$p_m(x_2, \ldots, x_m) = \int_{-\infty}^{\infty}\pi_m(x_m \mid x_1, \ldots, x_{m-1})\,p_{m-1}(x_1, \ldots, x_{m-1})\,dx_1 \tag{5.84}$$

In the case of a simple scalar Markov chain eq. (5.84) becomes

$$p_m(x_m) = \int_{-\infty}^{\infty}\pi_m(x_m \mid x_{m-1})\,p_{m-1}(x_{m-1})\,dx_{m-1} \tag{5.85}$$

Similarly, for the vector case eq. (5.84) becomes

$$p_m(\mathbf{X}_m) = \int_{-\infty}^{\infty}\pi_m(\mathbf{X}_m \mid \mathbf{X}_{m-1})\,p_{m-1}(\mathbf{X}_{m-1})\,d\mathbf{X}_{m-1} \tag{5.86}$$
where $m = 2, \ldots, M$ and the integration is performed over all the components $x_{i,m-1}$ ($i = 1, \ldots, n$). Proofs of eqs. (5.82)–(5.86) are similar to those for the case of a Markov chain.

Example. Let independent random variables $\eta_1, \eta_2, \ldots, \eta_m, \ldots$ have probability densities $p_m(x)$, respectively. It is possible to prove that the following sequence

$$\xi_1 = \eta_1, \qquad \xi_2 = \eta_1 + \eta_2, \ \ldots, \qquad \xi_m = \eta_1 + \eta_2 + \cdots + \eta_m, \ \ldots \tag{5.87}$$

is a Markov sequence.
Since the differences $\xi_{k+1} - \xi_k = \eta_{k+1}$ are independent, one can obtain the joint probability density function of the first $m$ random variables $\xi_k$ as

$$p(x_1, \ldots, x_m) = p_1(x_1)\,p_2(x_2 - x_1)\cdots p_m(x_m - x_{m-1}) \tag{5.88}$$

Using this relation, the expression for the transitional probability density becomes

$$\pi_m(x_m \mid x_1, \ldots, x_{m-1}) = \frac{p(x_1, \ldots, x_m)}{p(x_1, \ldots, x_{m-1})} = p_m(x_m - x_{m-1}) \tag{5.89}$$

Since this transitional density does not depend on $x_1, \ldots, x_{m-2}$, the considered sequence is a Markov sequence. If, in addition, all the random variables $\eta_m$ have zero mean, $\langle\eta_m\rangle = 0$, then $\langle\xi_m\rangle = 0$. Furthermore, given that $\xi_m = \eta_m + \xi_{m-1}$,

$$\langle\xi_m \mid \xi_{m-1}\rangle = \langle\eta_m \mid \xi_{m-1}\rangle + \langle\xi_{m-1} \mid \xi_{m-1}\rangle = \xi_{m-1} \tag{5.90}$$

since $\eta_m$ does not depend on $\xi_{m-1}$ and $\langle\eta_m\rangle = 0$. Finally, using the Markov property of the sequence it is possible to conclude that

$$\langle x_m \mid x_1, \ldots, x_{m-1}\rangle = x_{m-1} \tag{5.91}$$
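The martingale property (5.90)–(5.91) is easy to observe numerically. A minimal sketch (illustrative; Gaussian increments are an arbitrary choice) checks that the increment $\xi_m - \xi_{m-1}$ has zero mean and is uncorrelated with $\xi_{m-1}$:

```python
import numpy as np

rng = np.random.default_rng(42)
# Cumulative sums of independent zero-mean variables, eq. (5.87)
eta = rng.normal(0.0, 1.0, size=(100_000, 10))
xi = np.cumsum(eta, axis=1)

# Martingale property (5.91): E[xi_m | xi_{m-1}] = xi_{m-1}, i.e. the
# increment xi_m - xi_{m-1} has zero mean and is uncorrelated with xi_{m-1}
incr = xi[:, -1] - xi[:, -2]
print(np.mean(incr))                       # close to 0
print(np.corrcoef(incr, xi[:, -2])[0, 1])  # close to 0
```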
A sequence of random variables which has the property described by eq. (5.91) is called a martingale [8].

Example. Let $\eta_1, \ldots, \eta_M$ be a sequence of independent random variables, described by the probability density functions $p_m(x_m)$, $m = 1, \ldots, M$. A new sequence $\{\xi_m,\ m = 1, \ldots, M\}$ is defined such that it satisfies the following equation

$$\sum_{\nu=m-n}^{m} a_{m-\nu}\,\xi_\nu = \eta_m, \qquad m > n \tag{5.92}$$

where $\{a_\nu\}$ are known coefficients, $a_0 \ne 0$ and $a_n \ne 0$. It is clear that the sequence $\xi_m$ is a sequence of independent variables if $n = 0$ and a Markov sequence if $n = 1$. For $n > 1$ one has the case of a Markov chain of order $n$, which is described by the transitional probability density

$$\pi_m(x_m \mid x_{m-n}, \ldots, x_{m-1}) = a_0\,p_m\!\left(\sum_{\nu=m-n}^{m} a_{m-\nu}\,x_\nu\right) \tag{5.93}$$

The conditional expectation is then

$$\langle\xi_m \mid \xi_1, \ldots, \xi_{m-1}\rangle = \langle\xi_m \mid \xi_{m-n}, \ldots, \xi_{m-1}\rangle = a_0^{-1}\left[\langle\eta_m\rangle - \sum_{\nu=m-n}^{m-1} a_{m-\nu}\,\xi_\nu\right] \tag{5.94}$$
It can be seen from this equation that the conditional mean depends on the $n$ previous values of the process, $\xi_{m-n}, \ldots, \xi_{m-1}$.
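This can be illustrated numerically. The sketch below (the coefficients $a_0$, $a_1$, $a_2$ are illustrative choices, not from the text) generates the sequence (5.92) with $n = 2$ and recovers the conditional-mean coefficients of eq. (5.94) by least squares:

```python
import numpy as np

rng = np.random.default_rng(7)
a0, a1, a2 = 1.0, 0.5, 0.25  # illustrative coefficients (a0 != 0, a_n != 0)
N = 100_000

# Generate the sequence of eq. (5.92) with n = 2:
# a0 xi_m + a1 xi_{m-1} + a2 xi_{m-2} = eta_m
eta = rng.standard_normal(N)
xi = np.zeros(N)
for m in range(2, N):
    xi[m] = (eta[m] - a1 * xi[m - 1] - a2 * xi[m - 2]) / a0

# Eq. (5.94) with <eta_m> = 0 gives E[xi_m | past] = -(a1 xi_{m-1} + a2 xi_{m-2}) / a0;
# least squares on (xi_{m-1}, xi_{m-2}) should recover [-a1/a0, -a2/a0]
X = np.column_stack([xi[1:-1], xi[:-2]])
y = xi[2:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # approx [-0.5, -0.25]
```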
5.1.3 A Discrete Markov Process

Let the process $\xi(t)$ attain values from a finite set $\vartheta_1, \ldots, \vartheta_K$, with transitions from state to state appearing at random moments of time. The transitional probability

$$\pi_{ij}(t_0, t) = P\{\xi(t) = \vartheta_j \mid \xi(t_0) = \vartheta_i\}, \qquad t > t_0 \tag{5.95}$$

describes the probability that the system is in the state $\vartheta_j$ at the moment of time $t$, given that the system was in the state $\vartheta_i$ at the moment of time $t_0$. It is clear that

$$\sum_{j=1}^{K}\pi_{ij}(t_0, t) = 1, \qquad \pi_{ij}(t_0, t) \ge 0 \tag{5.96}$$

and

$$\pi_{ij}(t_0, t_0) = \delta_{ij} = \begin{cases}1, & i = j\\ 0, & i \ne j\end{cases} \tag{5.97}$$
The main problem in studying Markov processes is to calculate the transitional probabilities and the unconditional probabilities of the states, given the initial distribution of the states and the one-step transitional probabilities. The same considerations used to derive eq. (5.24) can be used to obtain the Kolmogorov–Chapman equation in the following form

$$\pi_{ij}(t_0, t + \Delta t) = \sum_{k=1}^{K}\pi_{ik}(t_0, t)\,\pi_{kj}(t, t + \Delta t), \qquad t > t_0, \quad \Delta t > 0 \tag{5.98}$$
In the case of discrete Markov processes and a small time interval $\Delta t$, the transitional probabilities can be written as

$$\pi_{kk}(t, t + \Delta t) = P\{\xi(t + \Delta t) = \vartheta_k \mid \xi(t) = \vartheta_k\} = 1 - a_{kk}(t)\,\Delta t + o(\Delta t)$$
$$\pi_{kj}(t, t + \Delta t) = P\{\xi(t + \Delta t) = \vartheta_j \mid \xi(t) = \vartheta_k\} = a_{kj}(t)\,\Delta t + o(\Delta t) \tag{5.99}$$

where $o(\Delta t)$ stands for terms of higher order in $\Delta t$. It can be seen from eq. (5.99) that it is a characteristic property of a Markov process that, for small increments of time $\Delta t$, the probability $\pi_{kk}$ of remaining in the current state is much higher than the probability $\pi_{kj}$ of changing state. The first of the two equations (5.99) is consistent with eq. (5.97) and reflects two facts: (1) for $\Delta t = 0$ the system is certainly in the state $\vartheta_k$; (2) the probability of a transition from the state $\vartheta_k$ to any other state in the future depends on the moment of time considered and, for a small interval $\Delta t$, these probabilities are proportional to the duration of this time interval. Since the probability of transition from one state to another is non-negative, $a_{kj}(t) \ge 0$. Furthermore, it follows from eq. (5.96) that the following normalization condition must also be satisfied

$$a_{kk}(t) = \sum_{j \ne k} a_{kj}(t) \ge 0, \qquad a_{kj}(t) \ge 0 \tag{5.100}$$
Substituting eq. (5.99) into the right-hand side of eq. (5.98), one obtains a system of differential equations for the transitional probabilities

$$\frac{\partial}{\partial t}\pi_{ij}(t_0, t) = -a_{jj}(t)\,\pi_{ij}(t_0, t) + \sum_{k \ne j} a_{kj}(t)\,\pi_{ik}(t_0, t) \tag{5.101}$$

where $a_{kj}(t)$ satisfies the condition (5.100). The solution of this system of differential equations with initial conditions (5.97) provides the dependence of the transitional probabilities on time $t$. If the number of possible states of the system is finite, then for any continuous functions $a_{ij}(t)$ satisfying the conditions (5.100), the system of differential equations (5.101) has a unique non-negative solution which defines a discrete Markov process. Similarly to a Markov sequence, a discrete Markov process remains Markov when the direction of time is reversed.

It can be shown that, in addition to the system of equations (5.101), an alternative system of equations can be written. In order to derive this new system one ought to consider a point in time close to the initial moment $t_0$ and rewrite eq. (5.98) in the following form

$$\pi_{ij}(t_0, t) = \sum_{k}\pi_{ik}(t_0, t_0 + \Delta t)\,\pi_{kj}(t_0 + \Delta t, t), \qquad t > t_0 + \Delta t > t_0 \tag{5.102}$$
The next step is to use the approximation of the probabilities for a small time interval $\Delta t$, similar to eq. (5.99)

$$\pi_{ii}(t_0, t_0 + \Delta t) = 1 - a_{ii}(t_0)\,\Delta t + o(\Delta t) \tag{5.103}$$

Substituting approximation (5.103) into eq. (5.102) and taking the limit $\Delta t \to 0$, one obtains the so-called reversed⁵ Kolmogorov equation for a discrete Markov process

$$\frac{\partial}{\partial t_0}\pi_{ij}(t_0, t) = a_{ii}(t_0)\,\pi_{ij}(t_0, t) - \sum_{k \ne i} a_{ik}(t_0)\,\pi_{kj}(t_0, t), \qquad t > t_0 \tag{5.104}$$
Equation (5.101) is satisfied not only by the transitional probabilities $\pi_{ij}(t_0, t)$ but also by the unconditional probabilities $p_j(t)$ of the states at any moment of time $t$. Indeed, once the initial probabilities $p_j(0) = p_j(t_0)$ are known, then, for a discrete Markov process, the following equations are satisfied

$$p_j(t) = \sum_i p_i(t_0)\,\pi_{ij}(t_0, t), \qquad p_j(t + \Delta t) = \sum_i p_i(t)\,\pi_{ij}(t, t + \Delta t) \tag{5.105}$$

After multiplying both sides of eq. (5.101) by $p_i(t_0)$ and summing over all possible states one obtains

$$\frac{\partial}{\partial t}p_j(t) = -a_{jj}(t)\,p_j(t) + \sum_{k \ne j} a_{kj}(t)\,p_k(t) \tag{5.106}$$

⁵ Equation (5.101) is called the ``forward'' Kolmogorov equation.
A unique solution of eq. (5.106) is obtained using the initial conditions

$$p_j(t)\big|_{t = t_0} = p_j(t_0) = p_{0j} \tag{5.107}$$

A discrete Markov process is called a homogeneous Markov process if its transitional probabilities $\pi_{ij}(t_0, t)$ depend only on the time difference $\tau = t - t_0$, i.e.

$$\pi_{ij}(t_0, t) = \pi_{ij}(\tau), \qquad \tau = t - t_0 \tag{5.108}$$
It follows from eq. (5.106) that, for a homogeneous Markov process, the coefficients $a_{kj}(t) = a_{kj}$ are constants (as functions of time) and eqs. (5.99)–(5.103) simplify to a system of linear differential equations with constant coefficients

$$\frac{d}{d\tau}\pi_{ij}(\tau) = -a_{jj}\,\pi_{ij}(\tau) + \sum_{k \ne j} a_{kj}\,\pi_{ik}(\tau) \tag{5.109}$$

$$\pi_{ij}(0) = \delta_{ij} \tag{5.110}$$
A homogeneous discrete Markov process is called ergodic if there is a limit

$$\lim_{\tau\to\infty}\pi_{ij}(\tau) = p_j \tag{5.111}$$

which does not depend on the initial state $\vartheta_i$. The probabilities $p_j$ define an equilibrium, or final, distribution of the states of the system. These probabilities can be found from the algebraic equations which are the limiting case of eq. (5.109)

$$-a_{jj}\,p_j + \sum_{k \ne j} a_{kj}\,p_k = 0, \qquad \sum_{k=1}^{K} p_k = 1 \tag{5.112}$$
The concept of the final probabilities is illustrated by the following examples.

Example: A discrete binary Markov process (signal). Let $\xi(t)$ be a discrete Markov process with two possible states $\vartheta_1 = 1$ and $\vartheta_2 = -1$. The probability of a transition from the state $\vartheta_1 = 1$ to the state $\vartheta_2 = -1$ during a short time interval $\Delta t$ is equal to $\alpha\,\Delta t$, and the probability of a transition from the state $\vartheta_2 = -1$ to the state $\vartheta_1 = 1$ is equal to $\beta\,\Delta t$. Initially, the distribution of the two states is described by the probabilities $\gamma = P\{\xi(t_0) = \vartheta_1\}$ and $P\{\xi(t_0) = \vartheta_2\} = 1 - \gamma$. It is required to determine the probabilities of transitions $\pi_{ij}(t_0, t)$, the final stationary probabilities of each state, as well as the mean value of the process $\xi(t)$ and its correlation function.

It is worth noting that if a binary Markov process $\eta(t)$ has two states $\vartheta_1 = \varphi_1$ and $\vartheta_2 = \varphi_2$, then it can be represented by a linear transformation of a binary process $\xi(t)$ with two states $\pm 1$:

$$\eta(t) = \frac{\varphi_1 + \varphi_2}{2} + \frac{\varphi_1 - \varphi_2}{2}\,\xi(t) \tag{5.113}$$
Thus, the statistical properties of the process $\eta(t)$ can be easily expressed in terms of the statistics of the process $\xi(t)$. It follows from the definition of $\xi(t)$ that the values of the constants in eq. (5.109) are $a_{11} = a_{12} = \alpha$ and $a_{22} = a_{21} = \beta$. Since all the coefficients $a_{ij}$ are constants, the process $\xi(t)$ is homogeneous⁶. The next step is to rewrite eq. (5.109) as

$$\frac{\partial}{\partial t}\pi_{i1}(t_0, t) = -\alpha\,\pi_{i1}(t_0, t) + \beta\,\pi_{i2}(t_0, t)$$
$$\frac{\partial}{\partial t}\pi_{i2}(t_0, t) = -\beta\,\pi_{i2}(t_0, t) + \alpha\,\pi_{i1}(t_0, t), \qquad i = 1, 2 \tag{5.114}$$

Using property (5.96) one obtains $\pi_{12}(t_0, t) = 1 - \pi_{11}(t_0, t)$, which, in turn, implies that the first of the eqs. (5.114) can be rewritten as

$$\frac{\partial}{\partial t}\pi_{11}(t_0, t) = -(\alpha+\beta)\,\pi_{11}(t_0, t) + \beta, \qquad t > t_0 \tag{5.115}$$

A general solution of this equation which satisfies the condition $\pi_{11}(t_0, t_0) = 1$ is given by

$$\pi_{11}(t_0, t) = \beta\int_{t_0}^{t}\exp[-(\alpha+\beta)(t-s)]\,ds + \exp[-(\alpha+\beta)(t-t_0)] = \frac{\beta}{\alpha+\beta} + \frac{\alpha}{\alpha+\beta}\exp[-(\alpha+\beta)(t-t_0)] \tag{5.116}$$
In the very same way solutions can be obtained for all the transitional probabilities. Here, only the final results are presented:

$$\pi_{11}(\tau) = \frac{\beta}{\alpha+\beta} + \frac{\alpha}{\alpha+\beta}\exp[-(\alpha+\beta)\tau], \qquad \pi_{12}(\tau) = \frac{\alpha}{\alpha+\beta}\bigl[1 - \exp[-(\alpha+\beta)\tau]\bigr] \tag{5.117}$$

$$\pi_{22}(\tau) = \frac{\alpha}{\alpha+\beta} + \frac{\beta}{\alpha+\beta}\exp[-(\alpha+\beta)\tau], \qquad \pi_{21}(\tau) = \frac{\beta}{\alpha+\beta}\bigl[1 - \exp[-(\alpha+\beta)\tau]\bigr] \tag{5.118}$$
These equations indicate that the process considered is homogeneous, since all the transitional probabilities depend only on the time difference $\tau = t - t_0$. In addition, this process is ergodic, since the transitional probabilities approach a limit as $t \to \infty$, thus defining the stationary probabilities

$$p_1 = \frac{\beta}{\alpha+\beta}, \qquad p_2 = \frac{\alpha}{\alpha+\beta} \tag{5.119}$$

⁶ See derivations below.
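The transitional probability (5.117) and the limit (5.119) can be verified by integrating eq. (5.115) numerically. A minimal sketch with illustrative rates $\alpha = 1.5$, $\beta = 0.5$, using a forward Euler step:

```python
import numpy as np

alpha, beta = 1.5, 0.5  # illustrative transition rates
lam = alpha + beta
tau = np.linspace(0.0, 5.0, 501)

# Closed form (5.117)
pi11_closed = beta / lam + (alpha / lam) * np.exp(-lam * tau)

# Forward Euler integration of eq. (5.115): d pi_11/dt = -(alpha+beta) pi_11 + beta
dt = tau[1] - tau[0]
pi11 = np.empty_like(tau)
pi11[0] = 1.0  # initial condition pi_11(0) = 1, eq. (5.97)
for k in range(len(tau) - 1):
    pi11[k + 1] = pi11[k] + dt * (-lam * pi11[k] + beta)

print(np.max(np.abs(pi11 - pi11_closed)))  # small discretization error
print(pi11[-1], beta / lam)                # both approach p1 of eq. (5.119)
```

The exponential relaxation towards $p_1 = \beta/(\alpha+\beta)$ is visible directly in the integrated curve.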
Given the initial distribution of the probabilities and the transitional probabilities, it is possible to find the unconditional probabilities of each state:

$$p_1(t_0 + s) = \gamma\,\pi_{11}(s) + (1-\gamma)\,\pi_{21}(s) = \frac{\beta}{\alpha+\beta} + \left[\gamma - \frac{\beta}{\alpha+\beta}\right]\exp[-(\alpha+\beta)s]$$
$$p_2(t_0 + s) = (1-\gamma)\,\pi_{22}(s) + \gamma\,\pi_{12}(s) = \frac{\alpha}{\alpha+\beta} - \left[\gamma - \frac{\beta}{\alpha+\beta}\right]\exp[-(\alpha+\beta)s] \tag{5.120}$$

The average $\langle\xi(t)\rangle$ of the process $\xi(t)$ can be evaluated directly from the definition of the average and expression (5.120) for the probabilities of the states at any moment of time

$$m_\xi(t) = \langle\xi(t)\rangle = 1\cdot p_1(t) + (-1)\cdot p_2(t) = \frac{\beta-\alpha}{\alpha+\beta} + 2\left[\gamma - \frac{\beta}{\alpha+\beta}\right]\exp[-(\alpha+\beta)\tau] \tag{5.121}$$
Here $\tau = t - t_0$. The next step is to find the covariance function

$$C_\xi(s, \tau) = \langle\xi(t_0+s)\,\xi(t_0+s+\tau)\rangle - \langle\xi(t_0+s)\rangle\langle\xi(t_0+s+\tau)\rangle, \qquad s > 0,\ \tau > 0 \tag{5.122}$$

Once again, using the definition of the average one obtains

$$\langle\xi(t_0+s)\,\xi(t_0+s+\tau)\rangle = p_1(t_0+s)\,\pi_{11}(\tau) + p_2(t_0+s)\,\pi_{22}(\tau) - p_1(t_0+s)\,\pi_{12}(\tau) - p_2(t_0+s)\,\pi_{21}(\tau), \qquad s > 0,\ \tau > 0 \tag{5.123}$$
Finally, using eqs. (5.117), (5.118) and (5.123), the expression for the covariance function is reduced to

$$C_\xi(s, \tau) = \left\{\frac{4\alpha\beta}{(\alpha+\beta)^2} - \frac{4(\beta-\alpha)}{\alpha+\beta}\left[\gamma - \frac{\beta}{\alpha+\beta}\right]\exp[-(\alpha+\beta)s] - 4\left[\gamma - \frac{\beta}{\alpha+\beta}\right]^2\exp[-2(\alpha+\beta)s]\right\}\exp[-(\alpha+\beta)\tau] \tag{5.124}$$

If the process starts from the stationary distribution of the states, i.e. $\gamma = p_1 = \beta/(\alpha+\beta)$, then the process is stationary from the initial moment of time $t_0$. Its mean does not depend on time

$$m_\xi(t) = \frac{\beta-\alpha}{\alpha+\beta} = \mathrm{const.} \tag{5.125}$$

and its covariance function depends only on $\tau$

$$C_\xi(\tau) = \frac{4\alpha\beta}{(\alpha+\beta)^2}\exp[-(\alpha+\beta)\tau], \qquad \tau > 0 \tag{5.126}$$
Finally, using the symmetry of the covariance function of a stationary process one obtains

$$C_\xi(\tau) = \frac{4\alpha\beta}{(\alpha+\beta)^2}\exp[-(\alpha+\beta)|\tau|] \tag{5.127}$$
If $\alpha = \beta$, the binary Markov process is called a symmetric binary Markov process, or a random telegraph signal. Using this condition, eqs. (5.117)–(5.127) can be significantly simplified to produce

$$\pi_{11}(\tau) = \pi_{22}(\tau) = \frac{1}{2}\left(1 + e^{-2\alpha\tau}\right), \qquad \pi_{12}(\tau) = \pi_{21}(\tau) = \frac{1}{2}\left(1 - e^{-2\alpha\tau}\right)$$
$$p_1 = p_2 = \frac{1}{2}, \qquad m_\xi = 0, \qquad K_\xi(\tau) = \exp[-2\alpha|\tau|] \tag{5.128}$$

5.1.4 Continuous Markov Processes
In this section a continuous Markov process is the subject of discussion. In this case the process can attain a continuum of values and is a function of a continuous parameter, time $t$. Let $\xi(t_0), \xi(t_1), \ldots, \xi(t_{m-1})$, and $\xi(t_m)$ be values of the random process $\xi(t)$ at the moments of time $t_0 < t_1 < \cdots < t_{m-1} < t_m$. The process $\xi(t)$ is called a continuous Markov process if its conditional transitional probability density

$$\pi_m(x_m, t_m \mid x_{m-1}, t_{m-1}; x_{m-2}, t_{m-2}; \ldots; x_0, t_0) = \frac{p_{m+1}(x_0, x_1, \ldots, x_m;\ t_0, t_1, \ldots, t_m)}{p_m(x_0, x_1, \ldots, x_{m-1};\ t_0, t_1, \ldots, t_{m-1})} \tag{5.129}$$

depends only on the value of the process at the last moment of time $\xi(t_{m-1})$, i.e.

$$\pi_m(x_m, t_m \mid x_{m-1}, t_{m-1}; \ldots; x_0, t_0) = \pi_m(x_m, t_m \mid x_{m-1}, t_{m-1}), \qquad m > 1 \tag{5.130}$$

As usual,

$$p_m(x_0, x_1, \ldots, x_{m-1};\ t_0, t_1, \ldots, t_{m-1}) = \frac{\partial^m}{\partial x_{m-1}\cdots\partial x_0}\,\mathrm{Prob}\{\xi(t_{m-1}) < x_{m-1}, \ldots, \xi(t_0) < x_0\} \tag{5.131}$$
In other words, the future of a Markov process does not depend on its past if its value at present is known. It follows from eqs. (5.129) and (5.130) that

$$p_{m+1}(x_0, x_1, \ldots, x_m;\ t_0, t_1, \ldots, t_m) = \pi_m(x_m, t_m \mid x_{m-1}, t_{m-1})\,p_m(x_0, x_1, \ldots, x_{m-1};\ t_0, t_1, \ldots, t_{m-1}) \tag{5.132}$$
Integration of the last equation over $x_0$ produces

$$p_m(x_1, \ldots, x_m;\ t_1, \ldots, t_m) = \pi_m(x_m, t_m \mid x_{m-1}, t_{m-1})\,p_{m-1}(x_1, \ldots, x_{m-1};\ t_1, \ldots, t_{m-1}) \tag{5.133}$$

This indicates that, for any $m > 1$, the expression for the joint probability density contains the same transitional probability density $\pi(x, t \mid x_0, t_0)$, $t > t_0$ (from the state $x_0$ to the state $x$). Sequential application of integration to eq. (5.130) leads to the following important representation of the joint probability density function, considered at any $m + 1$ moments of time $t_0 < t_1 < \cdots < t_{m-1} < t_m$:

$$p_{m+1}(x_0, x_1, \ldots, x_m;\ t_0, t_1, \ldots, t_m) = \prod_{k=0}^{m-1}\pi(x_{m-k}, t_{m-k} \mid x_{m-k-1}, t_{m-k-1})\;p(x_0, t_0) \tag{5.134}$$

In particular

$$p_2(x_0, x_1;\ t_0, t_1) = \pi(x_1, t_1 \mid x_0, t_0)\,p(x_0, t_0) \tag{5.135}$$
This shows that a complete description of a continuous Markov process can be achieved if its marginal and transitional probability densities are known. This explains why Markov processes are so important in applications: a limited amount of information (only second order joint probability densities) is needed to obtain a complete description. Using expression (5.130) and the Bayes formula one can show that a continuous valued Markov process remains Markov if time is reversed; in other words

$$\pi(x_0, t_0 \mid x_1, t_1; x_2, t_2; \ldots; x_m, t_m) = \pi(x_0, t_0 \mid x_1, t_1), \qquad t_0 < t_1 < \cdots < t_m \tag{5.136}$$
Using the definition of the transitional probability density $\pi(x, t \mid x_0, t_0)$ one can obtain the following properties of the transitional density.

1. The transitional probability density is non-negative and is normalized to one after integration over the destination state $x$:

$$\pi(x, t \mid x_0, t_0) \ge 0 \qquad\text{and}\qquad \int_{-\infty}^{\infty}\pi(x, t \mid x_0, t_0)\,dx = 1 \tag{5.137}$$

2. The transitional probability density reduces to a delta function for coinciding moments of time

$$\pi(x, t \mid x_0, t) = \delta(x - x_0) \tag{5.138}$$

3. The transitional probability density satisfies the Kolmogorov–Chapman equation [1,2]

$$\pi(x, t \mid x_0, t_0) = \int_{-\infty}^{\infty}\pi(x, t \mid x', t')\,\pi(x', t' \mid x_0, t_0)\,dx', \qquad t_0 < t' < t \tag{5.139}$$

This equation can be easily derived using eq. (5.134) for $m = 2$:

$$p_3(x_0, x_1, x;\ t_0, t_1, t) = \pi(x, t \mid x_1, t_1)\,\pi(x_1, t_1 \mid x_0, t_0)\,p(x_0, t_0) \tag{5.140}$$

Integration over $x_1$ produces the required relation (5.139).
A continuous Markov process is called a homogeneous Markov process if its transitional probability density depends only on the difference of the time instances, i.e.

$$\pi(x, t \mid x_1, t_1) = \pi(x, x_1 \mid t - t_1) = \pi(x, x_1 \mid \tau), \qquad \tau = t - t_1 \tag{5.141}$$

Furthermore, if $\pi(x, x_1 \mid \tau)$ approaches a limit $p_{st}(x)$ as $\tau \to \infty$

$$\lim_{\tau\to\infty}\pi(x, x_1 \mid \tau) = p_{st}(x) \tag{5.142}$$

and this limit is independent of the initial state $x_1$, then this Markov process is called ergodic.

It is worth noting that the Kolmogorov–Chapman equation in the form (5.139) describes all types of Markov processes considered above if the transitional density $\pi(x, t \mid x_0, t_0)$ is allowed to be a generalized function and the parameter $t$ is allowed to change on a discrete or continuous set. For example, for a Markov chain one can represent the transitional density as

$$\pi(x, t \mid x_0, t_0) = \pi(x, k\Delta t \mid x_0, (k-1)\Delta t) = \sum_{m=1}^{M}\sum_{n=1}^{M}P_{mn}\,\delta(x - x_m)\,\delta(x_0 - x_n) \tag{5.143}$$
5.1.5 Differential Form of the Kolmogorov–Chapman Equation
The Kolmogorov–Chapman equation (5.139) is an integral equation; under some conditions it can be converted to a differential equation with a more revealing structure. The following section follows closely Section 3.4 in [1]. It is assumed for now that, for all $\varepsilon > 0$:

1. The following limit exists uniformly in $x$, $x_0$, and $t$ for $|x - x_0| > \varepsilon$

$$\lim_{\Delta t\to 0}\frac{\pi(x, t + \Delta t \mid x_0, t)}{\Delta t} = W(x \mid x_0, t) \ge 0 \tag{5.144}$$

2. The following two limits are uniform in $x_0$, $\varepsilon$, and $t$

$$\lim_{\Delta t\to 0}\frac{1}{\Delta t}\int_{|x-x_0|<\varepsilon}(x - x_0)\,\pi(x, t + \Delta t \mid x_0, t)\,dx = K_1(x, t) + O(\varepsilon) \tag{5.145}$$

$$\lim_{\Delta t\to 0}\frac{1}{\Delta t}\int_{|x-x_0|<\varepsilon}(x - x_0)^2\,\pi(x, t + \Delta t \mid x_0, t)\,dx = K_2(x, t) + O(\varepsilon) \tag{5.146}$$
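Definitions (5.145)–(5.146) suggest a direct numerical estimator of drift and diffusion from small-time increments. The sketch below applies it to a simulated Ornstein–Uhlenbeck diffusion $dx = -ax\,dt + b\,dW$ (an illustrative process and parameters, not from the text), for which $K_1(x) = -ax$ and $K_2(x) = b^2$:

```python
import numpy as np

rng = np.random.default_rng(3)
a, b = 1.0, 0.5  # illustrative OU parameters: K1(x) = -a x, K2(x) = b**2
dt = 1e-3
x0 = 0.8
n = 1_000_000

# Many independent one-step transitions out of the fixed state x0
# (Euler-Maruyama step of dx = -a x dt + b dW):
x1 = x0 - a * x0 * dt + b * np.sqrt(dt) * rng.standard_normal(n)

K1_est = np.mean(x1 - x0) / dt           # eq. (5.145)
K2_est = np.mean((x1 - x0) ** 2) / dt    # eq. (5.146)
print(K1_est)  # close to -a*x0 = -0.8
print(K2_est)  # close to b**2 = 0.25
```

The small bias of order $dt$ in the second moment illustrates why the limits in (5.145)–(5.146) are taken as $\Delta t \to 0$.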
The condition (5.144) allows one to classify two types of continuous valued Markov processes. If the limit is equal to zero, i.e.

$$\lim_{\Delta t\to 0}\frac{\pi(x, t + \Delta t \mid x_0, t)}{\Delta t} = 0 \tag{5.147}$$

this implies that there is no possibility that a discontinuity of size $x - x_0$ appears at the moment of time $t$. Otherwise, the quantity

$$P_{\mathrm{jump}} = \lim_{\Delta t\to 0}\int_{|x-x_0|>\varepsilon}\frac{\pi(x, t + \Delta t \mid x_0, t)}{\Delta t}\,dx = \int_{|x-x_0|>\varepsilon}W(x \mid x_0, t)\,dx \tag{5.148}$$
is the probability that a jump of magnitude at least $\varepsilon$ may appear at the moment of time $t$. The former process is called a diffusion Markov process, and the latter a continuous process with jumps.

If a random process $\xi(t)$ is treated as the coordinate of some imaginary particle, the quantity $K_1(x, t)$ describes the average velocity of the process at the moment of time $t$. Large deviations of the particle position are excluded in eq. (5.145) to avoid contributions from jumps: the uniformity in $\varepsilon$ ensures that arbitrarily small deviations can be considered and, thus, contributions of larger velocities can be neglected. Using the mechanical analogy, the coefficient $K_1(x, t)$ is often called the drift of the Markov process and it is attributed to an external force (potential). Finally, the coefficient $K_2(x, t)$ describes the average energy of fluctuations of the particle around a fixed position $x$. Such deviations could be caused by external noise and, in general, do not pull the particle in a certain direction as an external force does. It is common to call $K_2(x, t)$ the diffusion, to reflect the fact that the particle can randomly change its position without an external force. Thus, the motion of the particle can be thought of as a superposition of drift, diffusion, and jumps.

Let $f(z)$ be a deterministic function which is at least twice differentiable and $\xi(t)$ be a continuous Markov process. In this case the rate of change of the average value of $f(\xi)$, given that $\xi(t_0) = y$, is defined by the following equation

$$\frac{\partial}{\partial t}\langle f(\xi) \mid y\rangle = \frac{\partial}{\partial t}\int_{-\infty}^{\infty} f(x)\,\pi(x, t \mid y, t_0)\,dx = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty} f(x)\bigl[\pi(x, t + \Delta t \mid y, t_0) - \pi(x, t \mid y, t_0)\bigr]\,dx$$
$$= \lim_{\Delta t\to 0}\frac{1}{\Delta t}\left\{\int_{-\infty}^{\infty} f(x)\int_{-\infty}^{\infty}\pi(x, t + \Delta t \mid z, t)\,\pi(z, t \mid y, t_0)\,dz\,dx - \int_{-\infty}^{\infty} f(z)\,\pi(z, t \mid y, t_0)\,dz\right\} \tag{5.149}$$
If $\varepsilon>0$ is fixed, the region of integration can be divided into two parts, $|x-z|\ge\varepsilon$ and $|x-z|<\varepsilon$, while the function $f(x)$ can be represented by its Taylor series

f(x) = f(z) + (x-z)\frac{d}{dz}f(z) + \frac{(x-z)^2}{2}\frac{d^2}{dz^2}f(z) + |x-z|^2 R(x,z) \qquad (5.150)

where $\lim_{|x-z|\to 0}|R(x,z)| = 0$. As a result, the Chapman--Kolmogorov equation (5.149) can be written as [2]

\frac{\partial}{\partial t}\langle f(\xi)\mid y\rangle = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\iint_{|x-z|<\varepsilon}\left[(x-z)\frac{\partial f}{\partial z}+\frac{1}{2}(x-z)^2\frac{\partial^2 f}{\partial z^2}\right]\pi(x,t+\Delta t\mid z,t)\,\pi(z,t\mid y,t_0)\,dx\,dz + P(x,z,t)+Q(x,z,t)+S(x,z,t) \qquad (5.151)
MARKOV PROCESSES AND THEIR DESCRIPTION
where

P(x,z,t) = \frac{1}{\Delta t}\iint_{|x-z|<\varepsilon} |x-z|^2 R(x,z)\,\pi(x,t+\Delta t\mid z,t)\,\pi(z,t\mid y,t_0)\,dx\,dz \qquad (5.152)

Q(x,z,t) = \frac{1}{\Delta t}\iint_{|x-z|\ge\varepsilon} f(x)\,\pi(x,t+\Delta t\mid z,t)\,\pi(z,t\mid y,t_0)\,dx\,dz \qquad (5.153)

S(x,z,t) = \frac{1}{\Delta t}\left[\iint_{|x-z|<\varepsilon} f(z)\,\pi(x,t+\Delta t\mid z,t)\,\pi(z,t\mid y,t_0)\,dx\,dz - \iint f(z)\,\pi(x,t+\Delta t\mid z,t)\,\pi(z,t\mid y,t_0)\,dx\,dz\right] \qquad (5.154)

Assuming uniform convergence allows one to interchange the limit operation with integration; thus the first line of eq. (5.151) can be rewritten with the help of eqs. (5.145) and (5.146) as

\int\left[A(z)\frac{\partial f}{\partial z}+\frac{1}{2}B(z)\frac{\partial^2 f}{\partial z^2}\right]\pi(z,t\mid y,t_0)\,dz + O(\varepsilon) \qquad (5.155)

The remainder term (5.152) vanishes, since the following estimate is valid

\left|\frac{1}{\Delta t}\iint_{|x-z|<\varepsilon}|x-z|^2 R(x,z)\,\pi(x,t+\Delta t\mid z,t)\,\pi(z,t\mid y,t_0)\,dx\,dz\right| \le \left[\frac{1}{\Delta t}\int_{|x-z|<\varepsilon}|x-z|^2\,\pi(x,t+\Delta t\mid z,t)\,dx\right]\max|R(x,z)| \to [B(z,t)+O(\varepsilon)]\max|R(x,z)| \to 0 \qquad (5.156)

It is easy to see that the remaining terms are simply

Q+S = \iint_{|x-z|\ge\varepsilon} f(x)\left[W(z\mid x,t)\,\pi(x,t\mid y,t_0) - W(x\mid z,t)\,\pi(z,t\mid y,t_0)\right]dx\,dz \qquad (5.157)
Finally, taking the limit of both sides of eq. (5.149) when $\varepsilon\to 0$ one obtains

\frac{\partial}{\partial t}\int_{-\infty}^{\infty} f(z)\,\pi(z,t\mid y,t_0)\,dz = \int_{-\infty}^{\infty}\left[A(z)\frac{\partial f}{\partial z}+\frac{1}{2}B(z)\frac{\partial^2 f}{\partial z^2}\right]\pi(z,t\mid y,t_0)\,dz
\qquad + \int_{-\infty}^{\infty} f(z)\,(P)\!\int_{-\infty}^{\infty}\left[W(z\mid x,t)\,\pi(x,t\mid y,t_0) - W(x\mid z,t)\,\pi(z,t\mid y,t_0)\right]dx\,dz \qquad (5.158)

where $(P)\int$ stands for the principal value of the integral, i.e.

(P)\int_{-\infty}^{\infty} F(x,z)\,dx = \lim_{\varepsilon\to 0}\int_{|x-z|>\varepsilon} F(x,z)\,dx \qquad (5.159)

The necessity of considering the principal value stems from the fact that $W(x\mid z,t)$ is defined only for $x\neq z$, thus leaving a possibility for $W(x\mid x,t)$ to be infinite.
Since equality (5.158) holds for an arbitrary twice differentiable function $f(z)$, one can find relations between the integrands on both sides of eq. (5.158), subject to certain boundary conditions. This can be accomplished by integrating by parts, which results in the following integro-differential equation, referred to as the differential Chapman--Kolmogorov equation [1]

\frac{\partial}{\partial t}\pi(z,t\mid y,t_0) = -\frac{\partial}{\partial z}\left[A(z,t)\,\pi(z,t\mid y,t_0)\right] + \frac{1}{2}\frac{\partial^2}{\partial z^2}\left[B(z,t)\,\pi(z,t\mid y,t_0)\right]
\qquad + \int_{-\infty}^{\infty}\left[W(z\mid x,t)\,\pi(x,t\mid y,t_0) - W(x\mid z,t)\,\pi(z,t\mid y,t_0)\right]dx \qquad (5.160)

This equation must be supplemented by the initial condition

\pi(z,t_0\mid y,t_0) = \delta(z-y) \qquad (5.161)

and by a set of boundary conditions which will be considered later. There are also some restrictions on the form of the coefficients⁷ $A(z)$ and $B(z)$. Some further classification of Markov processes can be achieved based on the representation (5.160): if $W(x\mid y)=0$ the process has continuous trajectories and is called a diffusion Markov process; otherwise it is a process with jumps. For example, if $A(z)=B(z)=0$ the trajectory of the process contains only jumps, which appear at random according to a certain distribution. In the case of a diffusion process the differential Chapman--Kolmogorov equation is known as the Fokker--Planck equation, while for a process with jumps it is known as the Master equation. More details can be found in [1,2] and in the following sections.
5.2 SOME IMPORTANT MARKOV RANDOM PROCESSES

5.2.1 One-Dimensional Random Walk

The theory of Markov processes allows a great number of applied problems to be solved, due to the relative ease of finding a joint probability density of any order. In this section the problem of a one-dimensional random walk is considered as an example. Let a particle move along an axis; its motion is represented by a homogeneous Markov chain with an infinite number of states. The particle may change its position only at fixed moments of time $t_m = m\,\Delta t$. A transition is allowed only to a neighbouring state, or the particle may remain in the current state. These transitions are described by the following probabilities

\pi_{j,\,j+1}(1) = p, \qquad \pi_{j,\,j}(1) = 1-p-q, \qquad \pi_{j,\,j-1}(1) = q \qquad (5.162)

Here, of course, $p+q\le 1$. The corresponding geometry is shown in Fig. 5.5. For simplicity it is also assumed that $\Delta t = 1$ (Figs. 5.4--5.6).
⁷ In what follows, the alternative notation $K_1(x) = A(x)$ and $K_2(x) = B(x)$ is also used.
Figure 5.4 The correlation function (a) and the power spectral density (b) of the quasi-random telegraph signal.

Figure 5.5 A one-dimensional random walk of a particle.

Figure 5.6 Different boundary conditions on a wall: (a) absorbing; (b) reflecting; (c) elastic.
At the initial moment of time $t = t_0 = 0$ the particle resides at $\vartheta = \vartheta_0$. At the next moment of time $t_1$ the position of the particle changes by a random increment $\zeta_1$ with three possible values $1$, $0$, and $-1$, with the following probabilities

P(\zeta_1 = 1) = p, \qquad P(\zeta_1 = 0) = 1-p-q, \qquad P(\zeta_1 = -1) = q \qquad (5.163)

At the moment of time $t_m = m$ the position of the particle is then given by

\vartheta_m = \vartheta_0 + \sum_{i=1}^{m}\zeta_i \qquad (5.164)

The particle under consideration can travel along the whole real axis $\vartheta\in(-\infty,\infty)$, or can be confined to a semi-infinite interval $\vartheta\in[c,\infty)$ (or $\vartheta\in(-\infty,c]$), or to a finite interval $\vartheta\in[d,c]$. The first case is known as an unrestricted random walk, while the other two cases are known as a random walk with one or two walls at $\vartheta=c$ and $\vartheta=d$, respectively. Depending on the boundary conditions imposed on a wall, one may distinguish between an absorbing wall, a reflecting wall, and an elastic wall. If a particle reaches an absorbing wall, it remains on the wall forever. For a reflecting wall, the position of the particle can only decrease (increase) during the next step. In the case of an elastic wall, the coordinate of the particle decreases (increases) with probability $r$ or remains the same with probability $1-r$. These three possibilities are shown in Fig. 5.6.
5.2.1.1 Unrestricted Random Walk

In the case of an unrestricted random walk the position $\vartheta$ of the particle can be anywhere on the real axis, i.e. it may vary from $-\infty$ to $\infty$. For simplicity it can be assumed that the initial position of the particle is $\vartheta_0 = 0$. Then, after $m$ steps, the position is given by

\vartheta_m = \sum_{i=1}^{m}\zeta_i \qquad (5.165)

The random variable $\vartheta_m$ takes integer values from the finite interval $[-m,m]$. In order to arrive at the point with coordinate $k$, the particle must experience $m_1\ge 0$ positive steps, $m_2\ge 0$ negative steps, and $m_0\ge 0$ zero steps, such that

k = m_1 - m_2 \quad\text{and}\quad m_0 = m-(m_1+m_2) \qquad (5.166)

Thus, the probability for the particle to reside at $\vartheta_m = k$ is given by the trinomial law

P(\vartheta_m = k) = \sum \frac{m!}{m_0!\,m_1!\,m_2!}\, p^{m_1}(1-p-q)^{m_0} q^{m_2} \qquad (5.167)

where the summation is taken over all possible indices $m_1$, $m_2$, and $m_0$ satisfying the restriction (5.166).
Let $\mu$ and $\sigma^2$ stand for the mean and the variance of the random variable $\zeta_i$ described above. It is clear that

\mu = 1\cdot p + (-1)\cdot q = p-q, \qquad \sigma^2 = (1)^2 p + (-1)^2 q - \mu^2 = p+q-(p-q)^2 \qquad (5.168)

The process $\vartheta_m$ is a Markov process, since the increments $\zeta_i$ are assumed to be independent. The average and the variance of the random variable $\vartheta_m$ are given by

\langle\vartheta_m\rangle = \left\langle\sum_{i=1}^{m}\zeta_i\right\rangle = m\mu = m(p-q), \qquad \sigma_m^2 = \langle(\vartheta_m-m\mu)^2\rangle = m\sigma^2 \qquad (5.169)

The next step is to calculate the probability $P(j\le\vartheta_m\le k)$ of the event that the particle resides at one of the coordinates $j, j+1, \ldots, k$, $j<k$. Since this probability is a sum of the probabilities of each state, an approximation can be obtained from the central limit theorem (CLT). According to the CLT, the random variable $\vartheta_m$ is approximately Gaussian for a large number $m$ of steps, with mean $m\mu$ and variance $m\sigma^2$. Thus

P(j\le\vartheta_m\le k) \approx \frac{1}{\sqrt{2\pi m\sigma^2}}\int_j^k \exp\left[-\frac{(x-m\mu)^2}{2m\sigma^2}\right]dx = \Phi\!\left(\frac{k-m\mu}{\sqrt{m}\,\sigma}\right) - \Phi\!\left(\frac{j-m\mu}{\sqrt{m}\,\sigma}\right) \qquad (5.170)

where

\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp\left(-\frac{t^2}{2}\right)dt \qquad (5.171)

is the probability integral [9]. More general results on the unrestricted random walk can be obtained by considering its generating function

G(y) = \langle y^{\zeta_i}\rangle = p\,y^{1} + (1-p-q)\,y^{0} + q\,y^{-1} = p\,y + (1-p-q) + \frac{q}{y} \qquad (5.172)

Since all increments are independent, one has $\langle y^{\vartheta_m}\rangle = G^m(y)$. The next step is to introduce a generating function $G(y,s)$, such that $|sG(y)|<1$ and

G(y,s) = \sum_{m=0}^{\infty} s^m\,[G(y)]^m = \frac{1}{1-sG(y)} = \frac{-y}{spy^2 - y(1-s+sp+sq) + sq} \qquad (5.173)

which contains all the information about the random walk, since the probabilities $P(\vartheta_m = k)$ are the coefficients in the series expansion of $G(y,s)$

G(y,s) = \sum_{m=0}^{\infty}\sum_{k=-\infty}^{\infty} P(\vartheta_m = k)\,s^m y^k \qquad (5.174)
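The trinomial law (5.167) and the moments (5.168)--(5.169) can be checked numerically by convolving the one-step law $m$ times; a minimal sketch (the values of $p$, $q$, and $m$ are illustrative choices):

```python
def walk_probabilities(m, p, q):
    """Exact distribution of the position theta_m of an unrestricted
    random walk after m steps, obtained by direct convolution of the
    one-step law P(+1)=p, P(0)=1-p-q, P(-1)=q (cf. eq. (5.167))."""
    probs = {0: 1.0}
    for _ in range(m):
        nxt = {}
        for pos, pr in probs.items():
            for step, w in ((1, p), (0, 1 - p - q), (-1, q)):
                nxt[pos + step] = nxt.get(pos + step, 0.0) + pr * w
        probs = nxt
    return probs

p, q, m = 0.3, 0.2, 50
probs = walk_probabilities(m, p, q)
mean = sum(k * pr for k, pr in probs.items())
var = sum(k * k * pr for k, pr in probs.items()) - mean ** 2
# Compare with eq. (5.169): <theta_m> = m(p-q), sigma_m^2 = m(p+q-(p-q)^2)
print(mean, m * (p - q))
print(var, m * (p + q - (p - q) ** 2))
```

The exact convolution reproduces the analytical mean and variance to machine precision, while for large $m$ the distribution itself approaches the Gaussian approximation (5.170).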
5.2.2 Markov Processes with Jumps
In this section a few useful Markov processes, often used in modelling various physical, medical, and economic phenomena, are discussed. A common feature of these processes is that they contain points where the trajectory experiences a finite jump.
5.2.2.1 The Poisson Process
Let $\xi(t)$ be a counting process, i.e. a process representing the number of appearances of a certain event in a given time interval $[t_0,t)$. It is clear that $\xi(t)$ attains only integer values and is a non-decreasing function of time. This implies that the transitional probability $\pi_{ij}(t_0,t)$ vanishes for $j<i$, i.e.

\pi_{ij}(t_0,t) = \mathrm{Prob}\{\xi(t)=j \mid \xi(t_0)=i\} = 0, \qquad j<i \qquad (5.175)

In many applications it is possible to justify that the probability of a change of state of the process $\xi(t)$ on a short interval of time $(t,t+\Delta t)$ is equal to $\lambda\,\Delta t + o(\Delta t)$, where $\lambda\,(=\mathrm{const})>0$. The probability for this process to remain in the original state during the same time interval is then $1-\lambda\,\Delta t + o(\Delta t)$. It is also assumed that the probability of more than one change of state during an arbitrarily short interval of time is of the order $o(\Delta t)$. Mathematically speaking,

\mathrm{Prob}\{\xi(t+\Delta t)=j \mid \xi(t)=j-1\} = \lambda\,\Delta t + o(\Delta t)
\mathrm{Prob}\{\xi(t+\Delta t)=j \mid \xi(t)=j\} = 1 - \lambda\,\Delta t - o(\Delta t) \qquad (5.176)

It is also reasonable to assume that the initial state of the process is zero, i.e.

p_{0j} \equiv p_j(t_0) = \mathrm{Prob}\{\xi(t_0)=j\} = \begin{cases} 1, & j=0 \\ 0, & j=1,2,\ldots \end{cases} \qquad (5.177)
The first task is to calculate the unconditional probabilities of each state, $p_j(t)=\mathrm{Prob}\{\xi(t)=j\}$. This can be accomplished by using eq. (5.106) with $a_{jj}=-\lambda$ and $a_{j,j-1}=\lambda$, thus leading to

\frac{\partial}{\partial t}p_j(t) = -\lambda p_j(t) + \lambda p_{j-1}(t), \qquad j\ge 1
\frac{\partial}{\partial t}p_0(t) = -\lambda p_0(t), \qquad j=0 \qquad (5.178)

Once again, a generating function can be used to solve this equation. Indeed, let

G(s,t) = \sum_{j=0}^{\infty} s^j p_j(t) \qquad (5.179)
be the generating function of the Poisson process. Multiplying both sides of eq. (5.178) by $s^j$ and summing over $j$ one obtains

\frac{\partial}{\partial t}G(s,t) = \frac{\partial}{\partial t}p_0(t) + \sum_{j=1}^{\infty} s^j\frac{\partial}{\partial t}p_j(t) = -\lambda p_0(t) - \sum_{j=1}^{\infty}\left(\lambda s^j p_j(t) - \lambda s^j p_{j-1}(t)\right) = -\lambda(1-s)\,G(s,t) \qquad (5.180)

The initial condition can be obtained from eq. (5.177) as

G(s,0) = \sum_{j=0}^{\infty} s^j p_j(0) = 1 \qquad (5.181)

The solution of eq. (5.180) with the initial condition (5.181) is then

G(s,t) = G(s,0)\exp[-\lambda(1-s)t] = \exp[-\lambda(1-s)t] \qquad (5.182)
Expanding the latter into a Taylor series and equating coefficients of equal powers of $s$ one obtains

G(s,t) = \sum_{n=0}^{\infty} s^n p_n(t) = \exp(-\lambda t)\sum_{n=0}^{\infty}\frac{(\lambda s t)^n}{n!} = \exp(-\lambda t)\sum_{n=0}^{\infty}\frac{(\lambda t)^n}{n!}\,s^n \qquad (5.183)

and thus

p_n(t) = \frac{(\lambda t)^n}{n!}\,e^{-\lambda t} \qquad (5.184)
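The law (5.184) is easy to confirm numerically by generating the counting process from independent exponential inter-arrival times; a minimal sketch (the rate `lam`, horizon `t`, and number of trials are illustrative choices):

```python
import math
import random

def poisson_pmf(n, lam, t):
    """p_n(t) = (lam*t)^n / n! * exp(-lam*t), eq. (5.184)."""
    return (lam * t) ** n / math.factorial(n) * math.exp(-lam * t)

def count_events(lam, t, rng):
    """Number of events in [0, t) for a rate-lam Poisson process,
    generated from i.i.d. exponential inter-arrival times."""
    n, clock = 0, rng.expovariate(lam)
    while clock < t:
        n += 1
        clock += rng.expovariate(lam)
    return n

rng = random.Random(1)
lam, t, trials = 2.0, 3.0, 20000
freq = sum(count_events(lam, t, rng) == 5 for _ in range(trials)) / trials
print(freq, poisson_pmf(5, lam, t))  # the two numbers should be close
```

The empirical frequency of the event $\{\xi(t)=5\}$ agrees with $p_5(t)$ to within Monte Carlo error, illustrating that the infinitesimal description (5.176) indeed produces the Poisson distribution.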
Since the parameter $\lambda$ depends neither on time nor on the state, the Poisson process is homogeneous. Using eqs. (5.101) and (5.109) one obtains the following forward and backward differential equations for the transitional probabilities

\frac{d}{d\tau}\pi_{ij}(\tau) = -\lambda\,\pi_{ij}(\tau) + \lambda\,\pi_{i,j-1}(\tau) \qquad (5.185)
\frac{d}{d\tau}\pi_{ij}(\tau) = -\lambda\,\pi_{ij}(\tau) + \lambda\,\pi_{i+1,j}(\tau) \qquad (5.186)

It can be shown that the solution of these equations with the initial conditions $\pi_{ij}(0)=\delta_{ij}$ is given by the following equation

\pi_{ij}(\tau) = \begin{cases} \dfrac{(\lambda\tau)^{j-i}}{(j-i)!}\exp[-\lambda\tau], & j\ge i \\ 0, & j<i \end{cases} \qquad (5.187)

It can be pointed out that for a non-homogeneous Poisson process the parameter $\lambda=\lambda(t)$ changes in time, and the expression for the probabilities of the states becomes

p_n(t_0,t) = \frac{1}{n!}\left[\int_{t_0}^{t}\lambda(\tau)\,d\tau\right]^n \exp\left[-\int_{t_0}^{t}\lambda(\tau)\,d\tau\right] \qquad (5.188)

5.2.2.2 A Birth Process
The next step is to consider a process $\xi(t)$ which also attains only the integer values $0,1,2,\ldots$, but for which, in contrast to the Poisson process, the probability of a change of state during a short interval $(t,t+\Delta t)$ depends on the current state. Such a process is known as a birth process and is defined as follows. If the system is in the state $\xi(t)=j$, $j=0,1,2,\ldots$, at some moment of time, then the probability of the transition $j\to j+1$ during a short interval of time $(t,t+\Delta t)$ is equal to $\lambda_j\,\Delta t + o(\Delta t)$; the probability of remaining in the same state is $1-\lambda_j\,\Delta t - o(\Delta t)$, where $\lambda_j>0$. The probability of a transition to any other state is of the order $o(\Delta t)$, and thus is negligible. Applying eq. (5.104) to the birth process one obtains a system of linear differential equations

\frac{d}{dt}p_j(t) = -\lambda_j p_j(t) + \lambda_{j-1} p_{j-1}(t), \qquad j=1,2,\ldots
\frac{d}{dt}p_0(t) = -\lambda_0 p_0(t) \qquad (5.189)

This system must be supplemented by initial conditions, which have the following form

p_{0j} = p_j(0) = \delta_{j,1} \qquad (5.190)

The simplest case of the birth process is the case where the rates $\lambda_j$ are proportional to the current state $j$:

\lambda_j = \lambda j, \qquad \lambda = \mathrm{const} > 0, \qquad j\ge 1 \qquad (5.191)
which reflects the fact that the probability of a transition is proportional to the number of specimens in the total population.⁸ Under this assumption eq. (5.189) can be rewritten as

\frac{d}{dt}p_j(t) = -\lambda j\,p_j(t) + \lambda(j-1)\,p_{j-1}(t), \qquad j\ge 1 \qquad (5.192)

This system of equations can be solved by sequential integration, leading to the following Yule--Furry distribution:

p_j(t) = \begin{cases} e^{-\lambda t}\left(1-e^{-\lambda t}\right)^{j-1}, & j\ge 1 \\ 0, & j=0 \end{cases} \qquad (5.193)

⁸ This is approximately true for a case of newborn species [10].
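The geometric form of the Yule--Furry law (5.193) can be verified by simulating the linear birth process directly: in state $j$ the waiting time to the next birth is exponential with rate $\lambda j$. A minimal sketch, with illustrative values of `lam` and `t`:

```python
import math
import random

def yule_population(lam, t, rng):
    """Population size at time t of a linear birth (Yule) process
    started from one individual: in state j the next birth occurs
    after an Exp(lam*j) waiting time."""
    j, clock = 1, 0.0
    while True:
        clock += rng.expovariate(lam * j)
        if clock > t:
            return j
        j += 1

rng = random.Random(7)
lam, t, trials = 1.0, 1.0, 20000
# Empirical P(j = 1) versus eq. (5.193): e^{-lam*t} (1 - e^{-lam*t})^{j-1}
emp = sum(yule_population(lam, t, rng) == 1 for _ in range(trials)) / trials
print(emp, math.exp(-lam * t))  # both close to exp(-1)
```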
The mean value and the variance of this process increase exponentially in time:

m_\xi(t) = e^{\lambda t}, \qquad \sigma_\xi^2(t) = e^{\lambda t}\left(e^{\lambda t}-1\right) \qquad (5.194)

Since the coefficients $\lambda_j$ do not explicitly depend on time, the process itself is homogeneous; the forward and backward Kolmogorov equations (5.109) and (5.110) can thus be written as

\frac{d}{d\tau}\pi_{ij}(\tau) = -\lambda_j\,\pi_{ij}(\tau) + \lambda_{j-1}\,\pi_{i,j-1}(\tau), \qquad \frac{d}{d\tau}\pi_{ij}(\tau) = -\lambda_i\,\pi_{ij}(\tau) + \lambda_i\,\pi_{i+1,j}(\tau) \qquad (5.195)

It can be shown [3] that the solution of these equations is given by

\pi_{ij}(\tau) = \begin{cases} C_{j-1}^{\,i-1}\, e^{-\lambda i\tau}\left(1-e^{-\lambda\tau}\right)^{j-i}, & j\ge i \\ 0, & j<i \end{cases} \qquad (5.196)

5.2.2.3 A Death Process
A death process $\xi(t)$ attains only integer values and is non-increasing. It is defined by the following conditions:

1. At the initial moment of time the system under consideration is in a state $j_0\ge 1$, i.e. $\xi(0)=j_0$.
2. If at the moment of time $t$ the system resides in a state $j$, i.e. $\xi(t)=j$, then the probability of the transition $j\to j-1$ during a short interval of time is equal to $\mu_j\,\Delta t + o(\Delta t)$, and the probability of remaining in the same state is $1-\mu_j\,\Delta t - o(\Delta t)$.
3. The probabilities of all other transitions are of the order $o(\Delta t)$.

Similarly to the case of the birth process, it is easy to obtain a differential equation for the probabilities of each state

\frac{d}{dt}p_j(t) = -\mu_j p_j(t) + \mu_{j+1} p_{j+1}(t) \qquad (5.197)

The solution of the death equation can be obtained in the very same way as the solution of the birth equation.
5.2.2.4 A Death and Birth Process

A birth and death process $\xi(t)$ attains integer values and is allowed to increase or decrease its value according to the following rules:

1. If the system is in the state $\xi(t)=j$ at the moment of time $t$, then the probability of the transition $j\to j+1$ during a short time interval $(t,t+\Delta t)$ is equal to $\lambda_j\,\Delta t + o(\Delta t)$.
2. If the system is in the state $\xi(t)=j$ at the moment of time $t$, then the probability of the transition $j\to j-1$ during a short time interval $(t,t+\Delta t)$ is equal to $\mu_j\,\Delta t + o(\Delta t)$.
3. The probability of remaining in the current state $\xi(t)=j$ is $1-(\lambda_j+\mu_j)\,\Delta t - o(\Delta t)$.
4. The probability of a change to a non-adjacent state is of the order $o(\Delta t)$.
5. The state $\xi(t)=0$ is an absorbing state, i.e. if the system reaches this state it remains there forever.

Once again resorting to eq. (5.104), one can obtain a system of differential equations describing a death and birth random process:

\frac{d}{dt}p_j(t) = \lambda_{j-1} p_{j-1}(t) - (\lambda_j+\mu_j)\,p_j(t) + \mu_{j+1} p_{j+1}(t) \qquad (5.198)
For $j=0$ one can write

\frac{d}{dt}p_0(t) = \mu_1 p_1(t) \qquad (5.199)

since $j=0$ is absorbing and only downward transitions into it are allowed. It is assumed that at the initial moment of time the system resides in some non-zero initial state $0<j_0<\infty$. As a result, the initial conditions are simply

p_j(0) = \delta_{j j_0} = \begin{cases} 1, & j=j_0 \\ 0, & j\neq j_0 \end{cases} \qquad (5.200)

In turn, eq. (5.109) for the transitional probabilities becomes

\frac{d}{dt}\pi_{ij}(t) = \lambda_{j-1}\,\pi_{i,j-1}(t) - (\lambda_j+\mu_j)\,\pi_{ij}(t) + \mu_{j+1}\,\pi_{i,j+1}(t) \qquad (5.201)
In general, the solution of eqs. (5.199)--(5.201) is difficult to obtain. However, the linear case $\lambda_j=\lambda j$ and $\mu_j=\mu j$, with $\lambda=\mathrm{const}>0$, $\mu=\mathrm{const}>0$ and the initial condition $\xi(0)=j_0$, can be treated analytically. It is shown in [3] that for $j_0=1$ the probabilities of the states can be found as

p_0(t) = \alpha(t), \qquad p_j(t) = [1-\alpha(t)][1-\beta(t)]\,\beta(t)^{j-1}, \qquad j=1,2,\ldots \qquad (5.202)

\pi_{ij}(t) = \sum_{n=0}^{\min(i,j)} C_i^{\,n}\, C_{i+j-n-1}^{\,i-1}\,[\alpha(t)]^{i-n}[\beta(t)]^{j-n}[1-\alpha(t)-\beta(t)]^{n} \qquad (5.203)

where

\alpha(t) = \frac{\mu\left(e^{(\lambda-\mu)t}-1\right)}{\lambda e^{(\lambda-\mu)t}-\mu}, \qquad \beta(t) = \frac{\lambda\left(e^{(\lambda-\mu)t}-1\right)}{\lambda e^{(\lambda-\mu)t}-\mu} \qquad (5.204)
The mean and the variance of the linear death and birth process are given by

m_\xi(t) = \exp[(\lambda-\mu)t], \qquad \sigma_\xi^2(t) = \frac{\lambda+\mu}{\lambda-\mu}\,e^{(\lambda-\mu)t}\left[e^{(\lambda-\mu)t}-1\right] \qquad (5.205)

In particular, if $\lambda=\mu$ and $\xi(0)=j_0=1$, the average growth rate is equal to zero, $m_\xi(t)=1$, and the variance grows linearly, $\sigma_\xi^2(t)=2\lambda t$. It follows from eqs. (5.202) and (5.204) that the probability that the system resides in the final state $\xi(t)=0$ at the moment of time $t$ is given by

p_0(t) = \frac{\mu\left(e^{(\lambda-\mu)t}-1\right)}{\lambda e^{(\lambda-\mu)t}-\mu} \qquad (5.206)

This expression allows one to obtain the condition defining survivability of the system in the long run. Indeed, if $\lambda<\mu$, then

\lim_{t\to\infty} p_0(t) = 1, \qquad \lambda<\mu \qquad (5.207)

and the system evolution stops in the absorbing state $\xi(t)=0$ with certainty. On the other hand, if $\lambda>\mu$, then

\lim_{t\to\infty} p_0(t) = \frac{\mu}{\lambda}, \qquad \lambda>\mu \qquad (5.208)

and the system has a non-zero probability $p=(\lambda-\mu)/\lambda$ of never reaching the absorbing state. This result can be easily visualized if $\lambda$ and $\mu$ represent the birth and death rates of a population $\xi$. If the birth rate is less than the death rate then, eventually, the population becomes extinct. If the birth rate exceeds the death rate, then it is possible that the population will never become extinct.

It is possible to show [3] that eqs. (5.202) and (5.203) remain valid for the more general case where the coefficients $\lambda$ and $\mu$ depend on time. In this case one has to use the following expressions for $\alpha(t)$ and $\beta(t)$

\alpha(t) = 1 - \frac{\exp[-\rho(t)]}{\varepsilon(t)}, \qquad \beta(t) = 1 - \frac{1}{\varepsilon(t)} \qquad (5.209)
where

\rho(t) = \int_0^t\left[\mu(\tau)-\lambda(\tau)\right]d\tau \quad\text{and}\quad \varepsilon(t) = \exp[-\rho(t)]\left[1+\int_0^t\mu(\tau)\exp[\rho(\tau)]\,d\tau\right] \qquad (5.210)
For such non-homogeneous processes, the mean and the variance can be expressed as [3]

m_\xi(t) = \exp[-\rho(t)], \qquad \sigma_\xi^2(t) = \exp[-2\rho(t)]\int_0^t\left[\lambda(\tau)+\mu(\tau)\right]\exp[\rho(\tau)]\,d\tau \qquad (5.211)
The probability $p_0(t)$ that the system reaches its absorbing state $\xi(t)=0$ is given by

p_0(t) = \frac{\int_0^t \mu(\tau)\exp[\rho(\tau)]\,d\tau}{1+\int_0^t \mu(\tau)\exp[\rho(\tau)]\,d\tau} \qquad (5.212)

It approaches unity whenever the integral $\int_0^t \mu(\tau)\exp[\rho(\tau)]\,d\tau$ diverges as $t\to\infty$. All the results above are valid for $j_0=1$. If $j_0>1$, the equations for the mean value and the variance are obtained by multiplying their right-hand sides by $j_0$. The probability of reaching the absorbing state $\xi(t)=0$ is still one if $\lambda<\mu$, and equals $(\mu/\lambda)^{j_0}$ if $\lambda>\mu$. If $\lambda$ and $\mu$ depend on time, then the probability $p_0(t)$ is obtained by raising expression (5.212) to the power $j_0$.
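The limiting extinction probability (5.208) can be confirmed by simulating the chain embedded at the jump times: from state $j>0$ the next move is a birth with probability $\lambda/(\lambda+\mu)$ and a death otherwise, and the extinction probability depends only on this embedded chain. A minimal sketch, where the cutoff `cap` is an illustrative choice (for $\lambda>\mu$, extinction from a large state is vanishingly likely):

```python
import random

def goes_extinct(lam, mu, rng, cap=200):
    """Linear birth-death chain embedded at jump times, started at j0 = 1.
    Births occur w.p. lam/(lam+mu), deaths w.p. mu/(lam+mu).  Returns True
    if state 0 is hit; once j exceeds `cap` (an illustrative cutoff),
    extinction is essentially impossible for lam > mu and we stop."""
    j = 1
    while 0 < j <= cap:
        j += 1 if rng.random() < lam / (lam + mu) else -1
    return j == 0

rng = random.Random(3)
lam, mu, trials = 2.0, 1.0, 5000
emp = sum(goes_extinct(lam, mu, rng) for _ in range(trials)) / trials
print(emp, mu / lam)  # eq. (5.208): limiting extinction probability mu/lam
```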
5.3 THE FOKKER–PLANCK EQUATION

5.3.1 Preliminary Remarks

The transitional probability density $\pi(x,t\mid x_0,t_0)$, $t>t_0$, of a continuous diffusion Markov process satisfies the following partial differential equations (PDE)

\frac{\partial}{\partial t}\pi(x,t\mid x_0,t_0) = -\frac{\partial}{\partial x}\left[K_1(x,t)\,\pi(x,t\mid x_0,t_0)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2(x,t)\,\pi(x,t\mid x_0,t_0)\right] \qquad (5.213)

-\frac{\partial}{\partial t_0}\pi(x,t\mid x_0,t_0) = K_1(x_0,t_0)\frac{\partial}{\partial x_0}\pi(x,t\mid x_0,t_0) + \frac{1}{2}K_2(x_0,t_0)\frac{\partial^2}{\partial x_0^2}\pi(x,t\mid x_0,t_0) \qquad (5.214)

The first of these two equations is known as the Fokker–Planck (FPE), or direct, equation. The second equation is called the Kolmogorov, or backward, equation. It is important to mention that the direct equation involves a time derivative with respect to $t$ (forward time), while the second one involves a time derivative with respect to $t_0$ (backward time). It will be shown later that these two equations form a conjugate pair.

5.3.2 Derivation of the Fokker–Planck Equation
The direct Fokker–Planck equation can be obtained directly from the differential Chapman–Kolmogorov equation (5.160) by setting $W(x\mid y,t)=0$. However, in this section an alternative proof is provided. It is assumed for the moment that the transitional probability density $\pi(x,t\mid x_0,t_0)$ has all the derivatives needed to formally write eqs. (5.213) and (5.214). Both equations can be derived from the Smoluchowski equation (5.139)

\pi(x,t\mid x_0,t_0) = \int_{-\infty}^{\infty}\pi(x,t\mid x',t')\,\pi(x',t'\mid x_0,t_0)\,dx' \qquad (5.215)

The direct equation can be obtained by choosing $t'$ close to $t$, while the backward equation can be obtained by choosing $t'$ close to $t_0$.

In order to obtain the Fokker–Planck equation, the Smoluchowski equation (5.139) can be rewritten in the following form

\pi(x,t+\Delta t\mid x_0,t_0) = \int_{-\infty}^{\infty}\pi(x,t+\Delta t\mid x',t)\,\pi(x',t\mid x_0,t_0)\,dx', \qquad t+\Delta t>t>t_0 \qquad (5.216)

where the time interval $\Delta t$ is very short. Let $\Theta(\omega,t+\Delta t\mid x',t)$ represent the conditional characteristic function of the random increment $\Delta x = x-x'$ over the short interval $\Delta t$, given $x'$. Using the definition of the characteristic function one can write
\Theta(\omega,t+\Delta t\mid x',t) = \left\langle \exp[j\omega(x-x')]\mid x',t\right\rangle = \int_{-\infty}^{\infty} e^{j\omega(x-x')}\,\pi(x,t+\Delta t\mid x',t)\,dx \qquad (5.217)

Using the inverse Fourier transform one obtains

\pi(x,t+\Delta t\mid x',t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-j\omega(x-x')}\,\Theta(\omega,t+\Delta t\mid x',t)\,d\omega \qquad (5.218)

The conditional characteristic function can be expanded into a Taylor series as

\Theta(\omega,t+\Delta t\mid x',t) = \sum_{n=0}^{\infty}\frac{m_n(x',t)}{n!}\,(j\omega)^n \qquad (5.219)

where

m_n(x',t) = \left\langle\{\xi(t+\Delta t)-\xi(t)\}^n \mid \xi(t)=x'\right\rangle = \int_{-\infty}^{\infty}(x-x')^n\,\pi(x,t+\Delta t\mid x',t)\,dx \qquad (5.220)
are the conditional moments of the increment $\Delta x = x-x'$. Using eq. (5.220) in the expansion (5.219) one can derive that

\pi(x,t+\Delta t\mid x',t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-j\omega(x-x')}\sum_{n=0}^{\infty}\frac{m_n(x',t)}{n!}(j\omega)^n\,d\omega = \sum_{n=0}^{\infty}\frac{m_n(x',t)}{n!}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-j\omega(x-x')}(j\omega)^n\,d\omega = \sum_{n=0}^{\infty}\frac{m_n(x',t)}{n!}\,(-1)^n\frac{\partial^n}{\partial x^n}\delta(x-x') \qquad (5.221)

Finally, substituting the series (5.221) into the Smoluchowski equation (5.216) one obtains

\pi(x,t+\Delta t\mid x_0,t_0) = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[m_n(x,t)\,\pi(x,t\mid x_0,t_0)\right] \qquad (5.222)
or, equivalently,

\pi(x,t+\Delta t\mid x_0,t_0) - \pi(x,t\mid x_0,t_0) = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[m_n(x,t)\,\pi(x,t\mid x_0,t_0)\right] \qquad (5.223)

Dividing both parts by $\Delta t$ and taking the limit as $\Delta t\to 0$, the following partial differential equation is obtained

\frac{\partial}{\partial t}\pi(x,t\mid x_0,t_0) = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[K_n(x,t)\,\pi(x,t\mid x_0,t_0)\right] \qquad (5.224)

Here

K_n(x_0,t) = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\left\langle\{\xi(t+\Delta t)-\xi(t)\}^n \mid \xi(t)=x_0\right\rangle = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}(x-x_0)^n\,\pi(x,t+\Delta t\mid x_0,t)\,dx \qquad (5.225)
It is worth noting that, since only the Chapman equation is used to derive eq. (5.224), the latter is valid for any Markov process as long as the quantities $K_n(x,t)$ are finite. Most attention in this book is paid to Markov processes with finite coefficients $K_1(x,t)$ and $K_2(x,t)$ and zero higher order coefficients $K_n(x,t)$:

K_n(x,t)\neq 0, \quad n=1,2; \qquad K_n(x,t)=0, \quad n>2 \qquad (5.226)

It follows from eq. (5.225) that condition (5.226) describes the rate of decay of the probability of large deviations with decreasing $\Delta t$. Large deviations of the process $\xi(t)$ are permitted, but they must appear in opposite directions with approximately the same probability. Thus, the average increment of the random process over a short time interval $\Delta t$ is of the order of $\sqrt{\Delta t}$ [4]. Finite jumps of this process appear with zero probability; all trajectories are continuous in the regular sense with probability one. It can be proven that if, for a Markov process with finite coefficients $K_n(x,t)<\infty$ for all $n$, some even coefficient vanishes, i.e. $K_{2r}(x,t)=0$ with $2r\ge 4$, then all the coefficients of order $n\ge 3$ are equal to zero. This statement is known as the Pawula theorem [2,11]; thus one may just require $K_4(x,t)=0$. Hence, for a Markov process with a continuous range to be a continuous Markov process, it is necessary and sufficient that $K_n(x,t)=0$ for all $n\ge 3$. For continuous Markov processes the Fokker–Planck equation becomes

\frac{\partial}{\partial t}\pi(x,t\mid x_0,t_0) = -\frac{\partial}{\partial x}\left[K_1(x,t)\,\pi(x,t\mid x_0,t_0)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2(x,t)\,\pi(x,t\mid x_0,t_0)\right] \qquad (5.227)
Historically, the Fokker–Planck equation in the form of eq. (5.227) was obtained to study Brownian motion and is sometimes called a diffusion equation [1,2]. The coefficient K1 ðx; tÞ describes the average value of the particle displacement and is also known as the drift. Likewise, K2 ðx; tÞ can be interpreted as a variance of a particle’s displacement from the average value and is called the diffusion coefficient.
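The defining limits (5.225) suggest a direct numerical reading of drift and diffusion: simulate short increments of a diffusion and divide their first and second conditional moments by $\Delta t$. A minimal sketch for an Ornstein–Uhlenbeck-type model (the coefficients $K_1(x)=-ax$, $K_2(x)=d$ and all parameter values are illustrative assumptions, not taken from the text):

```python
import random

def ou_increment(x, a, d, dt, rng):
    """One Euler step of the model dx = -a*x*dt + sqrt(d)*dW, whose
    drift and diffusion coefficients are K1(x) = -a*x and K2(x) = d."""
    return x + (-a * x) * dt + (d * dt) ** 0.5 * rng.gauss(0.0, 1.0)

rng = random.Random(11)
a, d, dt, x0, n = 1.0, 0.5, 1e-3, 1.0, 200000
inc = [ou_increment(x0, a, d, dt, rng) - x0 for _ in range(n)]
k1 = sum(inc) / (n * dt)                 # estimate of K1(x0) = -a*x0
k2 = sum(u * u for u in inc) / (n * dt)  # estimate of K2(x0) = d
print(k1, k2)  # approximately -1.0 and 0.5
```

The first-moment estimate recovers the drift $-ax_0$ and the second-moment estimate recovers the diffusion $d$, in line with the interpretation of $K_1$ and $K_2$ given above.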
Partial differential equation (5.227) is a linear partial differential equation of parabolic type, and a variety of methods can be used to obtain its solution. A few of these methods will be reviewed later; for a detailed analysis the interested reader can refer to [1,2]. Any solution $\pi(x,t\mid x_0,t_0)$ of the Fokker–Planck equation must satisfy the following conditions: a transitional probability density should be non-negative for all values of the arguments and be normalized to one after integration over the destination state, i.e.

\pi(x,t\mid x_0,t_0)\ge 0, \qquad \int_{-\infty}^{\infty}\pi(x,t\mid x_0,t_0)\,dx = 1 \qquad (5.228)

In addition, this density must reduce to a $\delta$ function at $t=t_0$, i.e.

\pi(x,t_0\mid x_0,t_0) = \delta(x-x_0) \qquad (5.229)

A solution of the Fokker–Planck equation for the unbounded space $x\in(-\infty,\infty)$ and the initial condition (5.229) is called the fundamental solution of the Fokker–Planck equation. If the initial value of the process $x_0$ is not fixed but distributed according to a probability density function

p(x,t_0) = p_0(x) \qquad (5.230)
then this probability density must be considered as a part of the initial conditions. The unconditional marginal probability density function can then be calculated in two ways.

1. Using the property that a joint probability density integrates to a marginal probability density, one readily obtains

p(x,t) = \int_{-\infty}^{\infty} p_2(x,t;x_0,t_0)\,dx_0 = \int_{-\infty}^{\infty} p(x_0,t_0)\,\pi(x,t\mid x_0,t_0)\,dx_0 \qquad (5.231)

This implies that, in order to find the marginal probability density, it is necessary to know the initial condition (5.230) and the fundamental solution $\pi(x,t\mid x_0,t_0)$ of the Fokker–Planck equation.

2. It is possible to calculate $p(x,t)$ directly from the Fokker–Planck equation. Indeed, multiplying both sides of eq. (5.227) by $p(x_0,t_0)$ and integrating over $x_0$ one obtains

\frac{\partial}{\partial t}p(x,t) = -\frac{\partial}{\partial x}\left[K_1(x,t)\,p(x,t)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2(x,t)\,p(x,t)\right] \qquad (5.232)

This indicates that the marginal probability density satisfies the same Fokker–Planck equation as the transitional probability density. If the initial condition is expressed by eq. (5.229), then the marginal probability density coincides with the fundamental solution of the Fokker–Planck equation; this statement follows directly from eq. (5.231). Taking the latter property into account, as well as the fact that it is the marginal probability density which is usually of interest in applications, it is possible to concentrate on studying the Fokker–Planck equation in the form (5.232). This equation must be solved with the initial conditions
(5.230). The solution must also satisfy the positivity and normalization conditions given by

p(x,t)\ge 0, \qquad \int_{-\infty}^{\infty} p(x,t)\,dx = 1 \qquad (5.233)

Furthermore, if the covariance function $C_\xi(t,t')$, $t>t'$, of the Markov random process $\xi(t)$ is to be found, one has to find both the transitional probability density $\pi(x,t\mid x',t')$ and the marginal probability density $p(x',t')$. Once this is accomplished, the desired covariance function can be found as

C_\xi(t,t') = \left\langle[\xi(t)-\langle\xi(t)\rangle][\xi(t')-\langle\xi(t')\rangle]\right\rangle = \iint [x-m_\xi(t)][x'-m_\xi(t')]\,\pi(x,t\mid x',t')\,p(x',t')\,dx\,dx' \qquad (5.234)

5.3.3 Boundary Conditions
In order to obtain the solution of the Fokker–Planck equation (5.232) with the initial conditions (5.230), one must also consider appropriate boundary conditions, which depend on the particular problem and vary from case to case.

In choosing proper boundary conditions the following interpretation of the Fokker–Planck equation can be helpful. One can associate the probability density with a quantity proportional to the number of particles in a flow of an abstract substance [2]; in this case the probability density $p(x,t)$ reflects the concentration of this substance at the point $x$ at the moment of time $t$. The flow (or current) of probability $G(x,t)$ along the axis $x$ consists of two parts: a local average flow $K_1(x,t)\,p(x,t)$ of particles moving in the field of a potential, and a random (diffusion) flow $-\frac{1}{2}\frac{\partial}{\partial x}[K_2(x,t)\,p(x,t)]$, i.e.

G(x,t) = K_1(x,t)\,p(x,t) - \frac{1}{2}\frac{\partial}{\partial x}\left[K_2(x,t)\,p(x,t)\right] \qquad (5.235)

Using the notion of the probability current $G(x,t)$, the Fokker–Planck equation can be rewritten as

\frac{\partial}{\partial t}p(x,t) + \frac{\partial}{\partial x}G(x,t) = 0 \qquad (5.236)

which can now be interpreted as a law of conservation of the total probability. The conservation equation (5.236) can be rewritten using small increments $\Delta x$ and $\Delta t$ instead of derivatives as

\frac{p(x,t+\Delta t)-p(x,t)}{\Delta t} + \frac{G(x+\Delta x,t)-G(x,t)}{\Delta x} \approx 0 \qquad (5.237)
or

\left[p(x,t+\Delta t)-p(x,t)\right]\Delta x \approx \left[G(x,t)-G(x+\Delta x,t)\right]\Delta t \qquad (5.238)

Thus, it can be said that the increment of the probability (substance) during a small interval of time $\Delta t$ at the location $x$ is equal to the difference between the incoming flow $G(x,t)$, entering the interval $[x,x+\Delta x]$ on the left, and the outgoing flow $G(x+\Delta x,t)$ leaving it, as shown in Fig. 5.7. The expression (5.238) can be interpreted as a law of conservation, or continuity, of probability. It also allows one to write similar expressions for the case of discontinuous Markov processes, both in continuous and in discrete time.
Figure 5.7 Illustration of the concept of probability current.
If a random process $\xi(t)$ attains all values on the real axis, i.e. $x\in(-\infty,\infty)$, then eq. (5.236) is valid on the whole real axis, and as boundary conditions one must specify the behaviour of the solution at $x=\pm\infty$. Integrating eq. (5.236) over $x$ from $-\infty$ to $\infty$ and using the normalization condition (5.233), one deduces that the following condition must be satisfied

G(-\infty,t) = G(\infty,t) \qquad (5.239)

However, in many practical problems the latter condition is substituted by the somewhat stronger conditions

G(-\infty,t) = G(\infty,t) = 0, \qquad p(-\infty,t) = p(\infty,t) = 0 \qquad (5.240)

which are known as zero boundary conditions, or natural conditions. In the case when the random process $\xi(t)$ attains values from a finite interval $[c,d]$, eq. (5.236) should be considered only on this interval, with the solution vanishing outside it. However, in this case a number of different boundary conditions can be considered, similar to those used in section 5.2.1. In particular, one can distinguish between the absorbing and the reflecting wall; for some problems periodic boundary conditions can be imposed [2]. Detailed discussions of the boundary conditions can be found in [1,2]. An absorbing boundary condition at $x=c$ or $x=d$ requires that the probability density vanish on this boundary at all times, i.e.

p(c,t) = 0 \quad\text{or}\quad p(d,t) = 0 \qquad (5.241)

while a reflecting boundary condition requires that the probability current vanish on the boundary

G(c,t) = 0 \quad\text{or}\quad G(d,t) = 0 \qquad (5.242)
Many problems can be described by an absorbing boundary on one end and a reflecting boundary on the other end. If one of the boundaries is at infinity, natural boundary conditions must be applied on that boundary. The periodic boundary conditions

p(x,t) = p(x+L,t), \qquad G(x,t) = G(x+L,t) \qquad (5.243)

can be found in such applications as the theory of matter, crystals, phase lock loops, etc. [1,2,12]. For the sub-class of stationary processes, the transitional probability density $\pi(x,t\mid x_0,t_0)$ depends only on the difference of the time arguments $\tau=t-t_0$, i.e.

\pi(x,t\mid x_0,t_0) = \pi(x,x_0,\tau) \qquad (5.244)

while the drift and the diffusion do not explicitly depend on time, i.e.

K_1(x,t) = K_1(x), \qquad K_2(x,t) = K_2(x) \qquad (5.245)
The marginal stationary probability density function p_st(x), if it exists, depends neither on time nor on the initial distribution p₀(x) = p(x, t₀). Thus, for a stationary distribution

\frac{\partial}{\partial t} p_{st}(x) = 0   (5.246)

and, due to the conservation property of the Fokker–Planck equation,

\frac{\partial}{\partial x} G(x) = 0, \qquad G(x) = G = \text{const.}   (5.247)

As a result, the Fokker–Planck equation reduces to the following ordinary differential equation

\frac{d}{dx}[K_2(x) p_{st}(x)] - 2 K_1(x) p_{st}(x) = -2G   (5.248)

The general solution of this equation can be found elsewhere (see [13] for example)

p_{st}(x) = \frac{C}{K_2(x)} \exp\left[ 2\int_{x_0}^{x} \frac{K_1(s)}{K_2(s)}\, ds \right] - \frac{2G}{K_2(x)} \int_{y_0}^{x} \exp\left[ 2\int_{y}^{x} \frac{K_1(s)}{K_2(s)}\, ds \right] dy   (5.249)

Here the constant C can be defined from the normalization property of the probability density function. The lower limit of integration is an arbitrary point on the real axis where the process ξ(t) is defined. For zero boundary conditions on the probability flow (G = 0), eq. (5.248) becomes

\frac{d}{dx}[K_2(x) p_{st}(x)] - 2 K_1(x) p_{st}(x) = 0   (5.250)
234
MARKOV PROCESSES AND THEIR DESCRIPTION
The general solution of this equation is

p_{st}(x) = \frac{C}{K_2(x)} \exp\left[ 2\int_{x_0}^{x} \frac{K_1(s)}{K_2(s)}\, ds \right]   (5.251)
and the constant C can be obtained by normalizing the probability density function p_st(x) to unity. The latter results illustrate the importance of the Fokker–Planck equation for applications. Indeed, once the drift K₁(x) and diffusion K₂(x) coefficients are known, it is possible to write an expression for the stationary probability density, and in many cases the integral in (5.251) can be evaluated in quadratures. Unfortunately, investigation of the non-stationary solution is much more complicated. However, as will be shown later, a number of useful results can be obtained.
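As a numerical aside (not from the book), eq. (5.251) can be evaluated by direct quadrature once K₁ and K₂ are specified. The sketch below uses an assumed Ornstein–Uhlenbeck-type pair K₁(x) = −x/τ_c and K₂ = 2σ²/τ_c, for which the quadrature must recover the Gaussian stationary density; all parameter values are illustrative.

```python
import numpy as np

def cumtrapz0(y, x):
    """Cumulative trapezoidal integral of y(x), equal to 0 at x[0]."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))

def stationary_pdf(K1, K2, x):
    """p_st(x) = (C / K2(x)) * exp(2 * int K1/K2 ds), eq. (5.251); the arbitrary
    lower limit of the integral is absorbed by the normalization constant C."""
    p = np.exp(cumtrapz0(2.0 * K1(x) / K2(x), x)) / K2(x)
    return p / cumtrapz0(p, x)[-1]

# assumed OU-type coefficients: K1 = -x/tau_c, K2 = 2*sigma^2/tau_c (constant)
tau_c, sigma = 1.0, 0.7
x = np.linspace(-5.0, 5.0, 4001)
p_num = stationary_pdf(lambda s: -s / tau_c,
                       lambda s: np.full_like(s, 2.0 * sigma**2 / tau_c), x)
p_exact = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
print(np.abs(p_num - p_exact).max())   # small discretization error
```

The same routine applies unchanged to any drift/diffusion pair with zero probability flow, e.g. the Nakagami model discussed later in this section.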
5.3.4
Discrete Model of a Continuous Homogeneous Markov Process
In order to visualize the nature of the time variation of a continuous homogeneous Markov process ξ(t) over small time increments Δt, it is useful to consider the approach of a discrete Markov process to its continuous limit. To do so, a discrete Markov chain with transitions only to one of the neighbouring states is considered. In other words, if the chain is in the state j at the moment of time t, then at the moment of time t + Δt the chain can be found in states j − 1, j, or j + 1. Let the probabilities of these transitions be β_j, 1 − α_j − β_j and α_j, respectively. In addition, let π_{ij}(m) be the probability of transition from the state i to the state j in m steps. Then

\pi_{ij}(m) = \alpha_{j-1}\,\pi_{i,j-1}(m-1) + (1 - \alpha_j - \beta_j)\,\pi_{i,j}(m-1) + \beta_{j+1}\,\pi_{i,j+1}(m-1)   (5.252)
In a similar way the continuous process can be approximated by a discrete process with increments Δx, 0, and −Δx taken with probabilities α(x), 1 − α(x) − β(x) and β(x), respectively. If π(x, t | x₀, t₀)Δx stands for the conditional probability of finding the trajectory of the process inside the interval (x, x + Δx) at the moment of time t, given that this trajectory was at x₀ at the moment of time t₀, then the equation for the evolution of π(x, t | x₀, t₀) in time can be written as

\pi(x, t \mid x_0, t_0)\,\Delta x = \pi(x - \Delta x, t - \Delta t \mid x_0, t_0)\,\Delta x\,\alpha(x - \Delta x) + \pi(x, t - \Delta t \mid x_0, t_0)\,\Delta x\,[1 - \alpha(x) - \beta(x)] + \pi(x + \Delta x, t - \Delta t \mid x_0, t_0)\,\Delta x\,\beta(x + \Delta x)   (5.253)

or

\pi(x, t \mid x_0, t_0) = \pi(x - \Delta x, t - \Delta t \mid x_0, t_0)\,\alpha(x - \Delta x) + \pi(x, t - \Delta t \mid x_0, t_0)\,[1 - \alpha(x) - \beta(x)] + \pi(x + \Delta x, t - \Delta t \mid x_0, t_0)\,\beta(x + \Delta x)   (5.254)
In order for the approximation to approach a continuous process, the time increment Δt has to approach zero, Δt → 0. It will be shown below that certain restrictions must be imposed on the probabilities α(x) and β(x). Since a particle located at x at the moment of time t is allowed only to remain in the current position or to move to the closest location separated by Δx, the average displacement and its variance during a small time interval Δt are

m(\Delta t) = [\alpha(x) - \beta(x)]\,\Delta x   (5.255)

\sigma^2(\Delta t) = \{[\alpha(x) + \beta(x)] - [\alpha(x) - \beta(x)]^2\}\,(\Delta x)^2   (5.256)

The drift and diffusion coefficients can now be obtained as the limits of m(Δt)/Δt and σ²(Δt)/Δt

K_1(x) = \lim_{\Delta x, \Delta t \to 0} [\alpha(x) - \beta(x)]\,\frac{\Delta x}{\Delta t}, \qquad K_2(x) = \lim_{\Delta x, \Delta t \to 0} \{[\alpha(x) + \beta(x)] - [\alpha(x) - \beta(x)]^2\}\,\frac{(\Delta x)^2}{\Delta t}   (5.257)

In order to obtain a finite drift and diffusion, certain restrictions must be imposed on how α(x) and β(x) behave for small Δt and Δx. Assuming that the diffusion coefficient K₂(x) is bounded,

K_2(x) < C = \text{const.}   (5.258)

and choosing Δx such that

(\Delta x)^2 = C\,\Delta t   (5.259)

one can easily verify that

\alpha(x) = \frac{1}{2C}[K_2(x) + K_1(x)\,\Delta x], \qquad \beta(x) = \frac{1}{2C}[K_2(x) - K_1(x)\,\Delta x]   (5.260)

satisfies eq. (5.257). Thus, in a discrete model of a continuous Markov process, the spatial increments Δx must be of the order of O(√Δt), and the probabilities α(x) and β(x) differ only by a quantity of the same order

\alpha(x) - \beta(x) = \frac{K_1(x)}{C}\,\Delta x = \frac{K_1(x)}{\sqrt{C}}\sqrt{\Delta t} \sim O(\sqrt{\Delta t})   (5.261)
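A small numerical check (illustrative, not from the book) confirms that the probabilities (5.260), with Δx chosen according to (5.259), reproduce the prescribed drift and diffusion in the limit (5.257); the OU-type coefficients below are assumed for the test.

```python
import numpy as np

tau_c, sigma, C = 1.0, 1.0, 4.0        # C bounds K2(x), eq. (5.258)
K1 = lambda x: -x / tau_c              # assumed drift
K2 = lambda x: 2.0 * sigma**2 / tau_c  # assumed constant diffusion, K2 < C

x = 0.5
for dt in (1e-2, 1e-4, 1e-6):
    dx = np.sqrt(C * dt)                              # eq. (5.259)
    alpha = (K2(x) + K1(x) * dx) / (2.0 * C)          # eq. (5.260)
    beta = (K2(x) - K1(x) * dx) / (2.0 * C)
    drift = (alpha - beta) * dx / dt                  # limits in eq. (5.257)
    diffusion = (alpha + beta - (alpha - beta)**2) * dx**2 / dt
    print(dt, drift, diffusion)
# drift equals K1(x) exactly; diffusion tends to K2(x) as dt -> 0
```

Note that the drift limit is attained exactly for any Δt, while the diffusion differs from K₂(x) by K₁²(x)Δt, vanishing as Δt → 0.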
5.3.5
On the Forward and Backward Kolmogorov Equations
Since the forward and the backward Kolmogorov equations (5.213) and (5.214) define the same transitional probability density π(x, t | x₀, t₀), these two equations are not independent. In particular, it can be shown that they define two adjoint operators. According to [14], two differential functionals

L[u(x)] = \frac{d}{dx}\left[k(x)\frac{d}{dx}u(x)\right] - l(x)\frac{d}{dx}u(x) + \frac{d}{dx}[\sigma(x)u(x)] - \chi(x)u(x)   (5.262)

and

M[v(x)] = \frac{d}{dx}\left[k(x)\frac{d}{dx}v(x)\right] + \frac{d}{dx}[l(x)v(x)] - \sigma(x)\frac{d}{dx}v(x) - \chi(x)v(x)   (5.263)

where k(x), l(x), σ(x) and χ(x) are given functions of x, and u(x) and v(x) are two functions which satisfy the following condition

v(x)L[u(x)] - u(x)M[v(x)] = \frac{d}{dx}\left\{ k(x)\left[v(x)\frac{d}{dx}u(x) - u(x)\frac{d}{dx}v(x)\right] + [\sigma(x) - l(x)]\,u(x)v(x) \right\}   (5.264)

are called conjugate. In this case the differential operator M[·] is uniquely defined by the differential operator L[·]. It is easy to show that if u(x) = π(x, t | x₀, t₀) in eq. (5.213) and v(x) = π(x, t | x₀, t₀) in eq. (5.214), then the direct and inverse Fokker–Planck operators are conjugate with k(x) = 0.5K₂(x, t), l(x) = 0, σ(x) = 0.5 ∂K₂(x, t)/∂x − K₁(x, t) and χ(x) = 0.

Depending on the particular problem, it is wise to use either the direct equation or the reversed equation. For example, if the initial state x₀ = x(t₀) is given, it is more appropriate to use the direct equation (5.213). However, if it is required to calculate the probability of the first level crossing, it is better to use the reversed equation (5.214), since the final value of the process is known. Furthermore, it can be found that if certain limitations are imposed on the solution x(t), the direct equation cannot be used, while the reversed equation is still valid. For example, one can consider the movement of a particle on a limited interval of its range. When the particle reaches one of the two boundaries of this interval, it remains on the boundary for a certain random time, distributed exponentially. At the end of this time the particle instantaneously returns to a point x inside the specified interval, described by a known probability density. Further motion of the particle is described by a regular Fokker–Planck equation. The described process is a Markov process. However, the direct equation cannot be immediately applied since the transitions are not local.⁹ Meanwhile, the reversed equation does not change and can be applied [1,2].
5.3.6
Methods of Solution of the Fokker–Planck Equation
Since the Fokker–Planck equation is an equation of a parabolic type, one can apply all the known methods [14]. Only four methods are considered below: (a) method of separation of variables, (b) the Laplace transform, (c) method of characteristic function, and (d) changing the independent variables.
5.3.6.1
Method of Separation of Variables
This method is appropriate when both the drift and the diffusion coefficients do not depend on time (homogeneous Markov process). In this case eq. (5.232) becomes

\frac{\partial}{\partial t} p(x, t) = -\frac{\partial}{\partial x}[K_1(x)p(x, t)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}[K_2(x)p(x, t)]   (5.265)

⁹ Since jumps are allowed.
The solution of this equation can be sought in the form of an ansatz [2,6]

p(x, t) = \phi(x)T(t)   (5.266)

where φ(x) and T(t) are functions of x and t only. Substituting eq. (5.266) into eq. (5.265) and dividing both sides by p(x, t) = φ(x)T(t), one obtains

\frac{1}{T(t)}\frac{\partial}{\partial t}T(t) = \frac{-\frac{\partial}{\partial x}[K_1(x)\phi(x)] + \frac{1}{2}\frac{\partial^2}{\partial x^2}[K_2(x)\phi(x)]}{\phi(x)}   (5.267)

The left-hand side of this equation depends only on t, while the right-hand side depends only on x. This implies that both sides of this equation are equal to some constant, say −λ². Thus, the partial differential equation can be replaced by an equivalent system of two ordinary differential equations

\frac{1}{T(t)}\frac{d}{dt}T(t) = -\lambda^2   (5.268)

\frac{1}{2}\frac{d^2}{dx^2}[K_2(x)\phi(x)] - \frac{d}{dx}[K_1(x)\phi(x)] + \lambda^2\phi(x) = 0   (5.269)

Differential equation (5.268) has an exponential solution¹⁰

T(t) = \exp(-\lambda^2 t)   (5.270)

and the second order differential equation (5.269) can be solved using traditional methods [13,14]. This solution φ(x; A, B, λ) depends on two arbitrary constants A and B as well as on the separation parameter λ. Since the Fokker–Planck equation is a linear equation, its general solution is a linear combination of particular solutions, i.e.

p(x, t) = \sum_{n=0}^{\infty} \phi(x; A_n, B_n, \lambda_n) \exp(-\lambda_n^2 t)   (5.271)
where the constants A_n, B_n and λ_n are defined by the corresponding boundary and initial conditions imposed by a particular physical problem. It will be seen that it is often possible to satisfy the boundary conditions only for particular values of λ, called the eigenvalues, while the corresponding function φ_n(x; A, B) = φ(x; A, B, λ_n) is called the eigenfunction corresponding to the eigenvalue λ_n. The set of all possible eigenvalues is also known as the spectrum of the Fokker–Planck equation. If there is a countable number of isolated eigenvalues, the spectrum is called a discrete spectrum; if there is a continuum of eigenvalues, the spectrum is called a continuous spectrum. Finally, if there are both isolated eigenvalues and a region of continuous eigenvalues, the spectrum is called a mixed
¹⁰ The weight of the exponent is arbitrarily chosen to be unity, with the proper constant absorbed by the function φ(x).
spectrum. A comprehensive treatment of the so-called eigenvalue problem can be found in [15]. It can be shown that if the flow G(x, t) across the boundaries is zero and the spectrum of the Fokker–Planck operator is discrete, the solution can be further simplified to

p(x, t) = P_0\, p_{st}(x) + \sum_{n=1}^{\infty} \phi_n(x)\,T_n \exp[-\lambda_n^2(t - t_0)], \qquad \phi_0(x) = p_{st}(x)   (5.272)
where φ_n(x) are the eigenfunctions of eq. (5.269) corresponding to the eigenvalues λ_n². P₀ and T_n are constant coefficients to be defined from the initial conditions. The eigenfunction which corresponds to the smallest eigenvalue λ₀ = 0 is the stationary solution p_st(x) of the stationary Fokker–Planck equation as given by eqs. (5.250) and (5.251). The functions φ_n(x) form an orthonormal set with weight 1/p_st(x). Mathematically speaking,

\int \frac{\phi_n(x)\phi_m(x)}{p_{st}(x)}\, dx = \delta_{mn} = \begin{cases} 1, & n = m \\ 0, & n \neq m \end{cases}   (5.273)

In particular

\int \frac{\phi_0(x)\phi_n(x)}{p_{st}(x)}\, dx = \int \phi_n(x)\, dx = 0 \quad \text{for } n > 0   (5.274)
If the initial probability density is specified, p(x, t₀) = p₀(x), then the coefficients T_n and P₀ can be obtained simply by setting t = t₀ in eq. (5.272), multiplying both parts of the latter by φ_m(x)/p_st(x), and integrating over x. Due to the orthogonality property (5.273) one obtains that the coefficients P₀ and T_n are uniquely defined by

P_0 = 1 \quad \text{and} \quad T_n = \int \frac{\phi_n(x)\,p_0(x)}{p_{st}(x)}\, dx, \qquad n > 0   (5.275)

If the initial value of the process is fixed at x₀, i.e. p₀(x) = δ(x − x₀), then the transitional density π(x, t | x₀, t₀) is recovered

p(x, t) = \pi(x, t \mid x_0, t_0) = \sum_{n=0}^{\infty} \frac{\phi_n(x)\phi_n(x_0)}{p_{st}(x_0)} \exp[-\lambda_n^2(t - t_0)]   (5.276)

It is sometimes beneficial to consider normalized eigenfunctions ψ_n(x), defined by the following rule¹¹

\psi_n(x) = \frac{\phi_n(x)}{p_{st}(x)}   (5.277)

¹¹ Another useful normalization is \tilde{\psi}_n(x) = \phi_n(x)/\sqrt{p_{st}(x)}.
In this case eqs. (5.272)–(5.276) can be rewritten as

p(x, t) = p_{st}(x)\left[1 + \sum_{n=1}^{\infty} \psi_n(x)\,T_n \exp[-\lambda_n^2(t - t_0)]\right], \qquad \psi_0(x) = 1   (5.278)

\int \psi_n(x)\psi_m(x)\,p_{st}(x)\, dx = \delta_{mn} = \begin{cases} 1, & n = m \\ 0, & n \neq m \end{cases}   (5.279)

T_n = \int \psi_n(x)\,p_0(x)\, dx, \qquad n > 0   (5.280)

and

p(x, t) = \pi(x, t \mid x_0, t_0) = p_{st}(x) \sum_{n=0}^{\infty} \psi_n(x)\psi_n(x_0) \exp[-\lambda_n^2(t - t_0)], \qquad t \geq t_0   (5.281)
Once the transitional density is known, one is able to calculate all parameters of the corresponding Markov process. For example, the joint PDF p₂(x, x₀; t, t₀) is thus

p_2(x, x_0; t, t_0) = \pi(x, t \mid x_0, t_0)\,p(x_0, t_0) = p_{st}(x)\,p(x_0, t_0) \sum_{n=0}^{\infty} \psi_n(x)\psi_n(x_0) \exp[-\lambda_n^2(t - t_0)]   (5.282)

and the correlation function is given by

R_\xi(t, t_0) = \iint x x_0\, p_2(x, x_0; t, t_0)\, dx\, dx_0 = \sum_{n=0}^{\infty} \exp[-\lambda_n^2(t - t_0)] \int x\,\psi_n(x)\,p_{st}(x)\, dx \int x_0\,\psi_n(x_0)\,p(x_0, t_0)\, dx_0   (5.283)

In the stationary case, p(x₀, t₀) = p_st(x₀) and the correlation function depends only on the time difference τ = t − t₀

R_\xi(\tau) = \sum_{n=0}^{\infty} h_n^2 \exp(-\lambda_n^2|\tau|) = h_0^2 + \sum_{n=1}^{\infty} h_n^2 \exp(-\lambda_n^2|\tau|)   (5.284)

where

h_n = \int x\,\psi_n(x)\,p_{st}(x)\, dx   (5.285)

Since ψ₀(x) = 1, one can see from eq. (5.285) that h₀ is just the mean value m_ξ of the process and, thus, the expression for the covariance function C_ξ(τ) becomes simply

C_\xi(\tau) = R_\xi(\tau) - m_\xi^2 = \sum_{n=1}^{\infty} h_n^2 \exp(-\lambda_n^2|\tau|)   (5.286)
Finally, taking the Fourier transform of both sides of eq. (5.286), one obtains the expression for the power spectral density of the Markov process ξ(t)

S_\xi(\omega) = \int_{-\infty}^{\infty} C_\xi(\tau)\exp(-j\omega\tau)\, d\tau = \sum_{n=1}^{\infty} \frac{2\lambda_n^2 h_n^2}{\omega^2 + \lambda_n^4}   (5.287)
Higher order statistics are derived in section 5.6.

Example: Gaussian process [6]. Let the following Fokker–Planck equation be considered

\frac{\partial}{\partial t} p(x, t) = \frac{1}{\tau_c}\frac{\partial}{\partial x}[x\,p(x, t)] + \frac{\sigma^2}{\tau_c}\frac{\partial^2}{\partial x^2} p(x, t)   (5.288)
with natural boundary conditions at x = ±∞, i.e. G(±∞, t) = 0. In this case eq. (5.269) becomes

\sigma^2\frac{\partial^2}{\partial x^2}\phi(x) + \frac{\partial}{\partial x}[x\,\phi(x)] + \lambda^2\tau_c\,\phi(x) = 0   (5.289)

This problem has a discrete spectrum given by [15,16]

\lambda_n^2 = \frac{n}{\tau_c}, \qquad n = 0, 1, 2, \ldots   (5.290)

and the eigenfunctions

\phi_n(x) = \frac{1}{\sigma\sqrt{n!}}\, F^{(n+1)}\!\left(\frac{x}{\sigma}\right)   (5.291)

where the function F^{(n+1)}(z) can be easily expressed in terms of the Hermitian polynomials using the Rodrigues formula [9]

F^{(n+1)}(z) = \frac{1}{\sqrt{2\pi}}\frac{d^n}{dz^n}e^{-z^2/2} = \frac{(-1)^n}{\sqrt{2\pi}}\, 2^{-n/2} H_n\!\left(\frac{z}{\sqrt{2}}\right) e^{-z^2/2}   (5.292)

The stationary solution corresponds to φ₀(x) and is thus

p_{st}(x) = \phi_0(x) = \frac{1}{\sigma} F^{(1)}\!\left(\frac{x}{\sigma}\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right)   (5.293)

i.e. the stationary distribution is a Gaussian one with zero mean and variance σ². Having this in mind, the normalized eigenfunctions ψ_n(x) are given by

\psi_n(x) = \frac{\phi_n(x)}{p_{st}(x)} = \frac{(-1)^n 2^{-n/2}}{\sqrt{n!}} H_n\!\left(\frac{x}{\sqrt{2}\,\sigma}\right)   (5.294)

This, in turn, provides the following expression for the transitional density

\pi(x, t \mid x_0, t_0) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right) \sum_{n=0}^{\infty} \frac{1}{2^n n!} H_n\!\left(\frac{x}{\sqrt{2}\,\sigma}\right) H_n\!\left(\frac{x_0}{\sqrt{2}\,\sigma}\right) \exp\left(-\frac{n\tau}{\tau_c}\right)   (5.295)

Taking into account that H₁(x) = 2x, one obtains that the expression for the covariance function reduces to a single exponent

C_\xi(\tau) = \sigma^2 \exp\left(-\frac{|\tau|}{\tau_c}\right)   (5.296)
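The series (5.295) can be verified numerically: by Mehler's summation formula for Hermite polynomials it must equal the familiar Gaussian (Ornstein–Uhlenbeck) transition density with mean x₀e^{−τ/τ_c} and variance σ²(1 − e^{−2τ/τ_c}). A sketch (parameter values assumed for illustration):

```python
from math import exp, factorial, pi, sqrt
from scipy.special import eval_hermite       # physicists' Hermite polynomials H_n

sigma, tau_c = 1.3, 1.0
tau, x, x0 = 0.5, 0.8, -0.4
rho = exp(-tau / tau_c)

# partial sum of the eigenfunction series (5.295); terms decay like rho**n
series = 0.0
for n in range(60):
    series += (rho**n / (2**n * factorial(n))
               * eval_hermite(n, x / (sqrt(2.0) * sigma))
               * eval_hermite(n, x0 / (sqrt(2.0) * sigma)))
series *= exp(-x**2 / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)

# closed form: Gaussian with mean rho*x0 and variance sigma^2*(1 - rho^2)
var = sigma**2 * (1.0 - rho**2)
closed = exp(-(x - rho * x0)**2 / (2 * var)) / sqrt(2 * pi * var)
print(series, closed)                        # the two values agree
```

The agreement confirms both the eigenvalues (5.290) and the normalization (5.294).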
Example: one-sided Gaussian distribution [6]. The Fokker–Planck equation (5.288), considered with different boundary conditions, leads to different expressions for the eigenvalues and eigenfunctions. Let there be a reflecting wall at x = 0 and natural boundary conditions at x = ∞, i.e.

G(0, t) = G(\infty, t) = 0   (5.297)

It is easy to verify that in this case one can keep only the eigenvalues (5.290) and eigenfunctions (5.291) with even indices, i.e.

\lambda_n^2 = \frac{2n}{\tau_c}   (5.298)

thus resulting in the following expression for the stationary PDF

p_{st}(x) = \sqrt{\frac{2}{\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right), \qquad x \geq 0   (5.299)

and the transitional density

\pi(x, t \mid x_0, t_0) = \sqrt{\frac{2}{\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right) \sum_{n=0}^{\infty} \frac{1}{2^{2n}(2n)!} H_{2n}\!\left(\frac{x}{\sqrt{2}\,\sigma}\right) H_{2n}\!\left(\frac{x_0}{\sqrt{2}\,\sigma}\right) \exp\left(-\frac{2n\tau}{\tau_c}\right)   (5.300)

Example: Nakagami distributed process [6]. Let the Fokker–Planck equation have the following form

\frac{\partial}{\partial t} p(x, t) = \frac{1}{2\tau_c}\frac{\partial}{\partial x}\left\{\left[x - \frac{(2m-1)\Omega}{2mx}\right] p(x, t)\right\} + \frac{\Omega}{4\tau_c m}\frac{\partial^2}{\partial x^2} p(x, t)   (5.301)

with the boundary conditions

G(0, t) = G(\infty, t) = 0   (5.302)
Separation of variables produces the following eigenvalue problem

\frac{d^2}{dx^2}\phi(x) + \frac{2m}{\Omega}\frac{d}{dx}\left\{\left[x - \frac{(2m-1)\Omega}{2mx}\right]\phi(x)\right\} + \frac{4m\tau_c\lambda^2}{\Omega}\phi(x) = 0   (5.303)

Using the following substitution

z = \frac{mx^2}{\Omega} \quad \text{and} \quad u(z) = z^{-(m-1)/2}\,\phi\!\left(\sqrt{\frac{\Omega z}{m}}\right)   (5.304)

eq. (5.303) can be transformed to a more standard form [6]

\frac{d^2}{dz^2}u(z) + \frac{d}{dz}u(z) + \left[\frac{\frac{m}{2} + \lambda^2\tau_c}{z} + \frac{\frac{1}{4} - \left(\frac{m-1}{2}\right)^2}{z^2}\right]u(z) = 0   (5.305)

Once again, using [9], one can obtain that the eigenvalues of eq. (5.305) are given by

\lambda_n^2 = \frac{n}{\tau_c}   (5.306)

with the corresponding eigenfunctions expressed in terms of the generalized Laguerre polynomials [9]

u_n(z) = z^{m/2}\exp(-z)\,L_n^{m-1}(z)   (5.307)

or, returning to the original variables and normalizing the eigenfunctions, one obtains [6]

\phi_n(x) = 2\sqrt{\frac{n!}{\Gamma(n+m)\,\Gamma(m)}}\left(\frac{m}{\Omega}\right)^m x^{2m-1}\exp\left(-\frac{mx^2}{\Omega}\right) L_n^{m-1}\!\left(\frac{mx^2}{\Omega}\right)   (5.308)

and the joint stationary PDF is given by

p_2(x, x_0; \tau) = \frac{4m^{2m}(xx_0)^{2m-1}}{\Gamma(m)\,\Omega^{2m}}\exp\left(-\frac{m(x^2 + x_0^2)}{\Omega}\right) \sum_{n=0}^{\infty} \frac{n!}{\Gamma(n+m)} L_n^{m-1}\!\left(\frac{mx^2}{\Omega}\right) L_n^{m-1}\!\left(\frac{mx_0^2}{\Omega}\right) \exp\left(-\frac{n|\tau|}{\tau_c}\right)   (5.309)

Thus, the stationary distribution described by the Fokker–Planck equation (5.301) is the Nakagami distribution

p_{st}(x) = \frac{2}{\Gamma(m)}\left(\frac{m}{\Omega}\right)^m x^{2m-1}\exp\left(-\frac{mx^2}{\Omega}\right)   (5.310)
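As a cross-check (illustrative, assuming the drift and diffusion coefficients as written in (5.301)), the quadrature formula (5.251) indeed reproduces the Nakagami density (5.310); parameter values are assumed.

```python
import numpy as np
from math import gamma

m, Omega, tau_c = 2.5, 1.5, 1.0
# coefficients read off eq. (5.301):
K1 = lambda x: -(x - (2*m - 1) * Omega / (2*m*x)) / (2.0 * tau_c)
K2 = Omega / (2.0 * m * tau_c)               # constant diffusion coefficient

x = np.linspace(0.05, 4.0, 16001)            # density is negligible below x = 0.05
integrand = 2.0 * K1(x) / K2
I = np.concatenate(([0.0], np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))))
p = np.exp(I) / K2                           # eq. (5.251), up to the constant C
p /= np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(x))   # normalize on the grid

p_ref = 2.0 * (m / Omega)**m * x**(2*m - 1) * np.exp(-m * x**2 / Omega) / gamma(m)
print(np.abs(p - p_ref).max())               # small numerical error
```

Analytically, 2K₁(x)/K₂ = (2m − 1)/x − 2mx/Ω, whose integral exponentiates exactly to x^{2m−1}exp(−mx²/Ω).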
5.3.6.2
The Laplace Transform Method
The effectiveness and popularity of the Laplace transform is explained by the fact that it eliminates the dependence of the solution on time t, so that the partial differential equation becomes an ordinary differential equation. This approach can be directly applied to the Fokker–Planck equation. In order to do so, it can be written as

\frac{\partial}{\partial t} p(x, t) = \frac{1}{2}K_2(x)\frac{\partial^2}{\partial x^2}p(x, t) + [K_2'(x) - K_1(x)]\frac{\partial}{\partial x}p(x, t) + \left[\frac{1}{2}K_2''(x) - K_1'(x)\right]p(x, t)   (5.311)

Next, let P(x, s) stand for the Laplace transform of the probability density p(x, t), i.e.

P(x, s) = \int_0^{\infty} p(x, t)e^{-st}\, dt   (5.312)

Taking the Laplace transform of both sides of the Fokker–Planck equation, one obtains the following ordinary differential equation

\frac{1}{2}K_2(x)\frac{d^2}{dx^2}P(x, s) + [K_2'(x) - K_1(x)]\frac{d}{dx}P(x, s) + \left[\frac{1}{2}K_2''(x) - K_1'(x) - s\right]P(x, s) + p_0(x) = 0   (5.313)

where p₀(x) is the initial PDF of the process under consideration, p(x, t₀) = p₀(x). The corresponding boundary conditions are also obtained by taking the Laplace transform of these conditions.

Example: thermally driven motion in a well with an absorbing wall. If a particle moves in a limited region with a reflecting wall at x = 0 and an absorbing wall at x = L under the influence of the thermal force, the corresponding Fokker–Planck equation can be written as

\frac{\partial}{\partial t} p(x, t) = \frac{\sigma^2}{\tau_c}\frac{\partial^2}{\partial x^2}p(x, t)   (5.314)

with the boundary conditions

\left.\frac{\partial}{\partial x}p(x, t)\right|_{x=0} = 0, \qquad p(L, t) = 0   (5.315)

Assuming that the initial position of the particle is uniformly distributed over the interval [0, L], the Laplace transform of eq. (5.314) becomes

sP(x, s) - \frac{1}{L} = \frac{\sigma^2}{\tau_c}\frac{\partial^2}{\partial x^2}P(x, s), \qquad \left.\frac{\partial}{\partial x}P(x, s)\right|_{x=0} = P(L, s) = 0   (5.316)
After simple manipulation, the solution of the problem (5.316) is given by

P(x, s) = \frac{1}{sL}\left[1 - \frac{\cosh(x\sqrt{s\tau_c}/\sigma)}{\cosh(L\sqrt{s\tau_c}/\sigma)}\right]   (5.317)

Since there is an absorbing wall, the particle eventually will be removed from the system. This conclusion is confirmed by the fact that¹²

\lim_{t\to\infty} p(x, t) = \lim_{s\to 0} sP(x, s) = 0   (5.318)

Furthermore, one can define a residency (survival) time distribution

p_r(t) = -\frac{d}{dt}\int_0^L p(x, t)\, dx   (5.319)

Since it is difficult to find the exact inverse Laplace transform of eq. (5.317), instead of finding the exact distribution p_r(t) one can settle for finding the average residency time T_c, which can be expressed in terms of P(x, s) as

T_c = \int_0^{\infty} t\,p_r(t)\, dt = \int_0^{\infty}\!\!\int_0^L p(x, t)\, dx\, dt = \lim_{s\to 0}\int_0^L P(x, s)\, dx = \frac{\tau_c L^2}{3\sigma^2}   (5.320)
The latter equation is derived using the final value theorem for the Laplace transform. As expected, the average residency time depends on the width of the well and the intensity of the noise. Higher order moments of the distribution p_r(t) can also be calculated in a similar manner.
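The value of T_c can be cross-checked without Laplace transforms (an illustration with assumed parameters): for the diffusion coefficient D = σ²/τ_c of eq. (5.314), the mean exit time T(x) from a start at x satisfies the boundary value problem D T''(x) = −1 with T'(0) = 0 (reflecting wall) and T(L) = 0 (absorbing wall), and averaging T(x) over the uniform initial position reproduces (5.320).

```python
import numpy as np

sigma, tau_c, L = 0.8, 2.0, 1.5
D = sigma**2 / tau_c                         # diffusion coefficient of eq. (5.314)

# finite-difference solve of D*T''(x) = -1, T'(0) = 0, T(L) = 0
N = 1000
x = np.linspace(0.0, L, N + 1)
h = x[1] - x[0]
A = np.zeros((N + 1, N + 1))
b = np.full(N + 1, -1.0 / D)
for i in range(1, N):
    A[i, i - 1] = A[i, i + 1] = 1.0 / h**2
    A[i, i] = -2.0 / h**2
A[0, 0], A[0, 1], b[0] = -1.0, 1.0, 0.0      # reflecting wall: T'(0) = 0
A[N, N], b[N] = 1.0, 0.0                     # absorbing wall: T(L) = 0
T = np.linalg.solve(A, b)

Tc = np.sum(0.5 * (T[1:] + T[:-1])) * h / L  # average over a uniform start in [0, L]
print(Tc, tau_c * L**2 / (3 * sigma**2))     # the two values agree
```

The closed-form solution of the boundary value problem is T(x) = (L² − x²)/(2D), whose uniform average is L²/(3D) = τ_cL²/(3σ²).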
5.3.6.3
Transformation to the Schrödinger Equations
Let the process considered have a constant diffusion coefficient K₂(x, t) = K₂ = const. In this case the eigenvalue problem for the Fokker–Planck equation becomes

-\frac{d}{dx}[K_1(x)\phi(x)] + \frac{K_2}{2}\frac{d^2}{dx^2}\phi(x) = -\lambda^2\phi(x)   (5.321)

or

\frac{K_2}{2}\frac{d^2}{dx^2}\phi(x) - K_1(x)\frac{d}{dx}\phi(x) - \left[\frac{d}{dx}K_1(x)\right]\phi(x) + \lambda^2\phi(x) = 0   (5.322)

¹² Here the initial and the final value theorems for the Laplace transform are used [17].

It is easy to see that the substitution [18]

\phi(x) = u(x)\exp\left[\int\frac{K_1(x)}{K_2}\, dx\right]   (5.323)

reduces eq. (5.322) to the canonical form

\frac{d^2}{dx^2}u(x) + \left[\frac{2\lambda^2}{K_2} - \frac{1}{K_2}\frac{d}{dx}K_1(x) - \frac{K_1^2(x)}{K_2^2}\right]u(x) = \frac{d^2}{dx^2}u(x) + [E - V(x)]u(x) = 0   (5.324)

where

E = \frac{2\lambda^2}{K_2}   (5.325)

and

V(x) = \frac{1}{K_2}\frac{d}{dx}K_1(x) + \frac{K_1^2(x)}{K_2^2}   (5.326)

The importance of such a representation is obvious when approximate methods of solution of the Fokker–Planck equation are to be found: there is a wealth of approximate methods developed for the Schrödinger equation in the form (5.324).

Example: Gaussian process and Hermitian polynomials. Let the drift K₁(x) be a linear function of its argument

K_1(x) = -\alpha x   (5.327)

and let the natural boundary conditions be used for the corresponding Fokker–Planck equation. In this case eq. (5.324) is reduced to

\frac{d^2}{dx^2}u(x) + \left[\frac{2\lambda^2 + \alpha}{K_2} - \frac{\alpha^2 x^2}{K_2^2}\right]u(x) = 0   (5.328)

which is well studied [15].
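Indeed, (5.328) is the quantum harmonic oscillator equation, whose solutions are Gaussians multiplied by Hermite polynomials: u₀(x) ∝ exp(−αx²/2K₂) with λ² = 0 and u₁(x) ∝ x·exp(−αx²/2K₂) with λ² = α. A quick finite-difference residual check (parameter values assumed for illustration):

```python
import numpy as np

alpha, K2 = 1.2, 0.8
x = np.linspace(-6.0, 6.0, 4001)
h = x[1] - x[0]

def residual(u, lam2):
    """Max residual of eq. (5.328) on interior grid points."""
    upp = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2       # central second difference
    E = 2.0 * lam2 / K2
    V = -alpha / K2 + (alpha * x[1:-1])**2 / K2**2      # V(x) of (5.326) with K1 = -alpha*x
    return np.abs(upp + (E - V) * u[1:-1]).max()

u0 = np.exp(-alpha * x**2 / (2.0 * K2))        # ground state, lambda^2 = 0
u1 = x * np.exp(-alpha * x**2 / (2.0 * K2))    # first excited state, lambda^2 = alpha
print(residual(u0, 0.0), residual(u1, alpha))  # both near zero
```

The pattern λ_n² = nα reproduces the spectrum (5.290) of the Gaussian example when α = 1/τ_c.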
5.4
STOCHASTIC DIFFERENTIAL EQUATIONS
Let a deterministic system be described by a differential equation known as the Liouville equation

\frac{d}{dt}x(t) = f(x, t) + g(x, t)u(t), \qquad x(t_0) = x_0   (5.329)

where x(t) is the state of the system, x₀ is the initial state of the system, f(x, t) and g(x, t) are some deterministic functions, and u(t) is an external force, considered deterministic for the moment. In this case the state evolution is purely deterministic and can be found through a closed form solution of eq. (5.329). Of course, such a solution is deterministic as well.¹³ Equation (5.329) can be rewritten in equivalent differential and integral forms as

dx(t) = f(x(t), t)\,dt + g(x(t), t)\,d\eta(t), \qquad x(t_0) = x_0   (5.330)

and

x(t) = x(t_0) + \int_{t_0}^{t} f(x(\tau), \tau)\, d\tau + \int_{t_0}^{t} g(x(\tau), \tau)\, d\eta(\tau)   (5.331)

where

d\eta(\tau) = u(\tau)\, d\tau   (5.332)
Randomness can be brought in in a number of ways: one can consider the initial condition x(t₀), as well as the functions f(x, t) and g(x, t), to be random. However, one of the most common situations is the case where the external excitation u(t) is itself a random process. In particular, if u(t) is a white Gaussian noise ξ(t) or a Poisson flow, the corresponding equation (5.329) is called a stochastic differential equation (SDE), expression (5.330) is called a stochastic differential, and eq. (5.331) is called a stochastic integral. Treatment of stochastic integrals is somewhat different from that of regular integrals and is considered in some detail below. Comprehensive discussions can be found in [6,8].
5.4.1
Stochastic Integrals
Let f(x, t) and g(x, t) be smooth functions of their arguments. In the case when both t and x are deterministic variables, the Riemann and the Stieltjes integrals are defined through the limits of the following sums

\int_{t_0}^{t} f(x(\tau), \tau)\, d\tau = \lim_{\Delta\to 0} \sum_{i=0}^{m-1} f(x(\tau_i), \tau_i)(t_{i+1} - t_i)   (5.333)

\int_{t_0}^{t} g(x(\tau), \tau)\, d\eta(\tau) = \lim_{\Delta\to 0} \sum_{i=0}^{m-1} g(x(\tau_i), \tau_i)[\eta(t_{i+1}) - \eta(t_i)]   (5.334)

Here t₀ < t₁ < ⋯ < t_m = t, Δ = max(t_{i+1} − t_i), and τ_i are some points belonging to the intervals [t_i, t_{i+1}], i.e. t_i ≤ τ_i ≤ t_{i+1}. For any piece-wise continuous function f(x(t), t) and continuous function g(x(t), t), the limits in eqs. (5.333) and (5.334) do not depend on the choice of the points τ_i inside each interval.¹³

¹³ In chaotic systems such a statement is not valid if one allows for small errors in knowledge of the initial conditions [19].
It can be shown [8] that integrals involving random functions can be defined using eqs. (5.333) and (5.334) if the following conditions are met.

1. The random functions f(x, t) and g(x, t) are uniformly continuous in the mean square sense on the time interval [t₀, t], i.e.

E\{[f(x(\tau + \Delta), \tau + \Delta) - f(x(\tau), \tau)]^2\} \to 0 \quad \text{as } \Delta \to 0

uniformly with respect to τ ∈ [t₀, t]. A similar equality must hold for g(x, t).

2. The following integrals exist and are finite

E\left[\int_{t_0}^{t} f^2(x(t), t)\, dt\right] < \infty, \qquad E\left[\int_{t_0}^{t} g^2(x(t), t)\, dt\right] < \infty   (5.335)

3. The limits in eqs. (5.333) and (5.334) are understood in the mean square sense.

The definition of an integral of a random function thus does not differ much from the definition of the integrals of deterministic functions. However, in striking contrast, the value of the integral (5.334) depends on the way in which the points τ_i are chosen. Definition (5.334) implies that a continuous function g(x(t), t) is substituted on the interval [t₀, t] by a piece-wise constant function g_Δ(t) such that

g_\Delta(\tau_i) = g(x(\tau_i), \tau_i), \qquad \tau_i \in [t_i, t_{i+1}]   (5.336)

and the τ_i are chosen according to the following rule

\tau_i = \tau_i^{(\lambda)} = (1 - \lambda)t_i + \lambda t_{i+1}, \qquad 0 \leq \lambda \leq 1   (5.337)

Since it is assumed that x(t) is a solution of an SDE in the form (5.331), its trajectory can be approximated by a straight line on a small time interval

x(\tau) = x(t_i) + \frac{x(t_{i+1}) - x(t_i)}{t_{i+1} - t_i}(\tau - t_i) + o(\Delta)   (5.338)

Using expression (5.337) in eq. (5.338) one obtains

x(\tau_i^{(\lambda)}) = (1 - \lambda)x(t_i) + \lambda x(t_{i+1}) + o(\Delta)   (5.339)

If one further assumes that the function g(x(t), t) is continuously differentiable, then the following approximation can be easily obtained from eqs. (5.337) and (5.339)

g(x(\tau_i), \tau_i) = g((1 - \lambda)x(t_i) + \lambda x(t_{i+1}),\ (1 - \lambda)t_i + \lambda t_{i+1}) + o(\Delta)   (5.340)

Finally, eq. (5.340) allows one to approximate the stochastic integral (5.334) as

J_\lambda = \int_{t_0}^{t} g(x(\tau), \tau)\, d_\lambda\eta(\tau) = \lim_{\Delta\to 0} \sum_{i=0}^{m-1} g(x(\tau_i), \tau_i)[\eta(t_{i+1}) - \eta(t_i)] = \lim_{\Delta\to 0} \sum_{i=0}^{m-1} g((1 - \lambda)x(t_i) + \lambda x(t_{i+1}),\ (1 - \lambda)t_i + \lambda t_{i+1})[\eta(t_{i+1}) - \eta(t_i)]   (5.341)
Here the sub-index λ emphasizes the choice of the time τ_i inside the interval [t_i, t_{i+1}]. If λ = 0, eq. (5.341) defines the so-called Ito stochastic integral [1,2]

J_0 = \lim_{\Delta\to 0} \sum_{i=0}^{m-1} g(x(t_i), t_i)[\eta(t_{i+1}) - \eta(t_i)]   (5.342)

which is widely used in the theory of Markov processes. The next step is to evaluate the difference J_λ − J₀ for an arbitrary value of λ from the interval [0, 1]. The assumption of differentiability of the function g(x, t) allows one to represent it by the first two terms of the corresponding Taylor expansion, i.e.

g((1 - \lambda)x(t_i) + \lambda x(t_{i+1}), t_i) \approx g(x(t_i), t_i) + \lambda\left.\frac{\partial}{\partial x}g(x, t_i)\right|_{x = x(t_i)}[x(t_{i+1}) - x(t_i)] + o[x(t_{i+1}) - x(t_i)]   (5.343)

Since the trajectory x(t) is the solution of a differential equation (5.329), one can substitute the corresponding differential (5.330) by a finite difference for a small time interval t_{i+1} − t_i

x(t_{i+1}) - x(t_i) = f(x(\tau_i), \tau_i)(t_{i+1} - t_i) + g(x(\tau_i), \tau_i)[\eta(t_{i+1}) - \eta(t_i)] + o(\Delta)   (5.344)

The latter can be used to transform eq. (5.341) to

J_\lambda - J_0 = \lim_{\Delta\to 0}\left\{\lambda\sum_{i=0}^{m-1}\frac{\partial g(x(t_i), t_i)}{\partial x}\,g(x(t_i), t_i)[\eta(t_{i+1}) - \eta(t_i)]^2 + o(\Delta)\right\}   (5.345)

If the external force ξ(t) is white Gaussian noise, this implies that (see section 3.15 or [2])

E\{[\eta(t_{i+1}) - \eta(t_i)]^2\} = \frac{N_0}{2}(t_{i+1} - t_i)   (5.346)

and eq. (5.345) approaches in the limit

J_\lambda - J_0 = \lambda\frac{N_0}{2}\int_{t_0}^{t} g(x(\tau), \tau)\frac{\partial}{\partial x}g(x(\tau), \tau)\, d\tau   (5.347)

Thus, the relationship between J_λ and the Ito stochastic integral J₀ is given by

J_\lambda = \int_{t_0}^{t} g(x(\tau), \tau)\, d_\lambda\eta(\tau) = \int_{t_0}^{t} g(x(\tau), \tau)\, d_0\eta(\tau) + \lambda\frac{N_0}{2}\int_{t_0}^{t} g(x(\tau), \tau)\frac{\partial}{\partial x}g(x(\tau), \tau)\, d\tau   (5.348)
In more general settings it can be proven that if x(t) is a solution of the stochastic differential equation (5.330), then for a function Φ(x(t), t) [8]

\int_{t_0}^{t} \Phi(x(\tau), \tau)\, d_\lambda\eta(\tau) = \int_{t_0}^{t} \Phi(x(\tau), \tau)\, d_0\eta(\tau) + \lambda\frac{N_0}{2}\int_{t_0}^{t} g(x(\tau), \tau)\frac{\partial}{\partial x}\Phi(x(\tau), \tau)\, d\tau   (5.349)

It is important to note that if the function g(x, t) does not explicitly depend on the current value of the random process x(t), then J_λ = J₀ for all λ, i.e. all definitions of stochastic integrals are identical. While the Ito integrals are mostly used in theoretical analysis due to the fact that they are martingales [4,8], other definitions (with λ ≠ 0) are very important. In particular, the case λ = 0.5 is known as the Stratonovich form of stochastic integrals [6]

J_{0.5} = \int_{t_0}^{t} g(x(\tau), \tau)\, d_{0.5}\eta(\tau) = \lim_{\Delta\to 0} \sum_{i=0}^{m-1} g\!\left(\frac{x(t_i) + x(t_{i+1})}{2}, \frac{t_i + t_{i+1}}{2}\right)[\eta(t_{i+1}) - \eta(t_i)]   (5.350)

The formal difference between the Ito and Stratonovich integrals lies in the fact that, in order to evaluate the values of the function g(x, t) in the pre-limit sum (5.334), one picks points at the beginning of the time intervals (Ito form) or in the middle of the time intervals (Stratonovich form). The Ito and Stratonovich integrals lead to different results for the integration of random functions. For example, integrals of a Wiener process can be obtained using the straightforward equality

w(t_i)[w(t_{i+1}) - w(t_i)] = \frac{1}{2}\{w^2(t_{i+1}) - w^2(t_i) - [w(t_{i+1}) - w(t_i)]^2\}   (5.351)

Thus

J_0 = \int_{t_0}^{t} w(\tau)\, d_0 w(\tau) = \frac{1}{2}[w^2(t) - w^2(t_0)] - \frac{N_0}{4}(t - t_0)   (5.352)

while the Stratonovich definition will produce

J_{0.5} = J_0 + \frac{N_0}{4}(t - t_0) = \frac{1}{2}[w^2(t) - w^2(t_0)]   (5.353)
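The difference between the two conventions is easy to reproduce in simulation. The sketch below assumes the standard Wiener normalization E{[dw]²} = dt (i.e. N₀ = 2 in the notation above), so that J₀ ≈ ½w²(t) − ½t while the midpoint sum telescopes exactly to ½w²(t):

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 1.0, 200_000
dt = T / n
dw = rng.normal(0.0, np.sqrt(dt), n)   # standard Wiener increments: E[dw^2] = dt
w = np.concatenate(([0.0], np.cumsum(dw)))

ito = np.sum(w[:-1] * dw)                        # lambda = 0: left endpoint
strat = np.sum(0.5 * (w[:-1] + w[1:]) * dw)      # lambda = 0.5: midpoint

print(ito, 0.5 * w[-1]**2 - 0.5 * T)    # cf. eq. (5.352) with N0 = 2
print(strat, 0.5 * w[-1]**2)            # cf. eq. (5.353): exact telescoping sum
```

The gap J₀.₅ − J₀ estimates the quadratic variation ½Σ[Δw]² ≈ T/2, in agreement with (5.353) for N₀ = 2.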
It is possible to prove that for any "good" function Φ(x(t), t) [8]

\int_{t_0}^{t} \Phi(x(\tau), \tau)\, d_{0.5}x(\tau) = \int_{t_0}^{t} \Phi(x(\tau), \tau)\, d_0 x(\tau) + \frac{N_0}{4}\int_{t_0}^{t} g^2(x(\tau), \tau)\frac{\partial}{\partial x}\Phi(x(\tau), \tau)\, d\tau   (5.354)
It follows from the discussion above that different definitions of the stochastic integrals would lead to a different solution of the stochastic differential equation, eqs. (5.329)– (5.331). That is why it is always necessary to indicate which form of stochastic integrals are
used. Sometimes, different notations are used for the Stratonovich and Ito forms, as discussed in Appendix C on the web. While the Ito form is convenient in theoretical studies, it can be seen from the example below that the usual rules of calculus of integrals do not apply to the Ito form of stochastic integrals. The rules of differentiation also become more complicated than those of classical calculus. In particular, it can be proven that

d\Phi(x(t), t) = \left[\frac{\partial}{\partial t}\Phi(x(t), t) + f(x(t), t)\frac{\partial}{\partial x}\Phi(x(t), t) + \frac{N_0}{4}g^2(x(t), t)\frac{\partial^2}{\partial x^2}\Phi(x(t), t)\right]dt + g(x(t), t)\frac{\partial}{\partial x}\Phi(x(t), t)\, d_0\eta(t)   (5.355)

while the usual rules of calculus result in

d\Phi(x(t), t) = \left[\frac{\partial}{\partial t}\Phi(x(t), t) + f(x(t), t)\frac{\partial}{\partial x}\Phi(x(t), t)\right]dt + g(x(t), t)\frac{\partial}{\partial x}\Phi(x(t), t)\, d\eta(t)   (5.356)

This fact makes Ito integrals unattractive for engineering calculations. Interestingly enough, the Stratonovich form of stochastic calculus preserves the rules of classical calculus for derivatives and substitution of variables. Further discussion will be provided later in this section.

The next step is to realize that a random process described by a stochastic differential equation (5.329) defines a diffusion Markov process if the excitation is white Gaussian noise. This fact is known as Doob's first theorem [8]. Indeed, let t₃ > t₂ > t₁ ≥ t₀ be three distinct moments of time. Using the stochastic integral (5.331) one can write

x(t_3) = x(t_2) + \int_{t_2}^{t_3} f(x(\tau), \tau)\, d\tau + \int_{t_2}^{t_3} g(x(\tau), \tau)\, dw(\tau)   (5.357)
Since increments of the Wiener process on non-intersecting intervals are independent, the value xðt3 Þ [given value of xðt2 Þ] is independent of the value xðt1 Þ, i.e. the process xðtÞ is indeed a Markov process. Since both the SDE and the Fokker–Planck describe a diffusion Markov process it is possible to relate the coefficients of SDE (5.329) with those of the Fokker–Planck equation. It will be assumed that the Ito form of stochastic differentials is used, i.e. dxðtÞ ¼ f ðxðtÞ; tÞdt þ gðxðtÞ; tÞd0 wðtÞ
ð5:358Þ
Integrating both sides of eq. (5.358) over a short interval ½t; t þ t one obtains xðt þ tÞ xðtÞ ¼
ð tþt t
f ðxð Þ; Þd þ
ð tþt gðxðtÞ; tÞd0 wðtÞ t
ð5:359Þ
STOCHASTIC DIFFERENTIAL EQUATIONS
251
For a short interval $\Delta t$, the conditional mean of the first integral in eq. (5.359) is just

$$E\left\{\int_t^{t+\Delta t} f(x(\tau),\tau)\,d\tau \,\Big|\, x(t)=x\right\} = f(x,t)\,\Delta t \qquad (5.360)$$

while from the definition of the Ito integral it follows that

$$E\left\{\int_t^{t+\Delta t} g(x(\tau),\tau)\,d_0 w(\tau) \,\Big|\, x(t)=x\right\} = 0 \qquad (5.361)$$

The latter is a consequence of the fact that the increment of the Wiener process over the time interval $[t, t+\Delta t]$ does not depend on the value of the function $x(t)=x$ at the moment of time $t$. This property also allows one to write the expression for the mean square value of the integral of $g(x,t)$

$$E\left\{\left[\int_t^{t+\Delta t} g(x(\tau),\tau)\,d_0 w(\tau)\right]^2 \Big|\, x(t)=x\right\} = \frac{N_0}{2}\,g^2(x,t)\,\Delta t \qquad (5.362)$$

Thus, the drift and diffusion of the solution of SDE (5.358) are given by

$$K_1^{(0)}(x,t) = \lim_{\Delta t\to 0} E\left\{\frac{x(t+\Delta t)-x(t)}{\Delta t}\right\} = f(x,t) \qquad (5.363)$$

$$K_2^{(0)}(x,t) = \lim_{\Delta t\to 0} E\left\{\frac{[x(t+\Delta t)-x(t)]^2}{\Delta t}\right\} = \frac{N_0}{2}\,g^2(x,t) \qquad (5.364)$$

In the case of a general $\ell$, the relation between $J_\ell$ and $J_0$ allows one to write the drift and diffusion as

$$K_1(x,t) = K_1^{(0)}(x,t) + \ell\,\frac{N_0}{2}\,g(x,t)\frac{\partial}{\partial x}g(x,t) = f(x,t) + \ell\,\frac{N_0}{2}\,g(x,t)\frac{\partial}{\partial x}g(x,t) \qquad (5.365)$$

$$K_2(x,t) = \lim_{\Delta t\to 0} E\left\{\frac{[x(t+\Delta t)-x(t)]^2}{\Delta t}\right\} = K_2^{(0)}(x,t) = \frac{N_0}{2}\,g^2(x,t) \qquad (5.366)$$

The one-to-one correspondence between the drift and the diffusion and the functions $f(x,t)$ and $g(x,t)$ allows one to rewrite SDE (5.358) directly in terms of the parameters of a Markov process:

$$dx(t) = \left[K_1(x,t) - \frac{\ell}{2}\frac{\partial}{\partial x}K_2(x,t)\right]dt + \sqrt{K_2(x,t)}\,d_\ell w(t) \qquad (5.367)$$

In particular, the Ito and the Stratonovich forms of the SDE thus become

$$dx(t) = K_1(x,t)\,dt + \sqrt{K_2(x,t)}\,d_0 w(t) \qquad (5.368)$$

$$dx(t) = \left[K_1(x,t) - \frac{1}{4}\frac{\partial}{\partial x}K_2(x,t)\right]dt + \sqrt{K_2(x,t)}\,d_{0.5} w(t) \qquad (5.369)$$
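The drift correction relating eqs. (5.368) and (5.369) can be sketched numerically. This is an illustrative snippet, not the book's code; the function names and the finite-difference helper are mine:

```python
# Sketch: converting the drift of the Ito SDE (5.368) into that of the
# Stratonovich SDE (5.369), K1 - (1/4) dK2/dx, for given K1(x) and K2(x).

def d_dx(F, x, h=1e-6):
    """Central finite-difference derivative of F at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

def stratonovich_drift(K1, K2, x):
    """Drift of the Stratonovich form (5.369): K1 - (1/4) dK2/dx."""
    return K1(x) - 0.25 * d_dx(K2, x)

# Illustrative coefficients: K1(x) = -x, K2(x) = x**2, so dK2/dx = 2x
# and the Stratonovich drift is -x - x/2 = -1.5*x.
K1 = lambda x: -x
K2 = lambda x: x ** 2
print(stratonovich_drift(K1, K2, 2.0))   # ≈ -3.0
```

The same helper with the sign of the correction reversed recovers the Ito drift from a Stratonovich model.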
252
MARKOV PROCESSES AND THEIR DESCRIPTION
Formula (5.367) also shows that equations with different $\ell$ can be transformed into each other simply by changing the value of the function $f(x,t)$. However, it also shows that the same Markov process can be described by various SDEs, and such ambiguity creates some uncertainty in using SDEs as models for realistic physical processes. As has been pointed out, the Ito form is attractive in theoretical investigations since it defines the solutions of SDEs as martingales. However, its rules of differentiation and change of variables are more complicated than those of deterministic calculus. The latter explains the popularity of the Stratonovich form among engineers and physicists: the rules of calculus are the same as in the deterministic case.

It is often important to evaluate the adequacy of the model (5.329) which uses white Gaussian noise as an excitation, since no realistic waveform can be WGN. In principle it is possible to avoid this problem by considering the excitation to be a correlated Gaussian noise produced by a linear SDE. In this case the Langevin equation becomes a system of two SDEs

$$\frac{d}{dt}x(t) = f(x,t) + g(x,t)\,u(t) \qquad (5.370a)$$

$$\frac{d}{dt}u(t) = -\frac{1}{\tau_c}u(t) + \sqrt{\frac{2\sigma^2}{\tau_c}}\,\xi(t) \qquad (5.370b)$$

It can be shown that the random vector $[x(t), u(t)]^T$ is a Markov vector and can be described in a similar way to a scalar Markov process (see section 5.7 for more details). As a result, it has been shown that if $\tau_c \to 0$ the solution of SDE (5.370a) approaches the Stratonovich form of the idealized SDE (5.329). Thus, the Stratonovich form of integrals is probably closer to a physical model of a system excited by a very wide band noise than the Ito form. This fact is an additional reason why the Stratonovich form is often used in applied research: one can use the deterministic state equations of the system under investigation and then simply assume that the excitation is WGN. No additional changes of the non-linear function $f(x,t)$ are then necessary.

Numerical simulation of SDEs is considered in more detail in Appendix C on the web and in much greater detail in a number of monographs [20,21]. It is important to mention here that different numerical schemes converge to different forms of the solution of an SDE: the second and the fourth order Runge–Kutta schemes converge to the Stratonovich form of the solution, while the Euler scheme converges to the Ito form of the solution. In general, on each step of simulation $[t_i, t_{i+1}]$ WGN is substituted by a Gaussian random variable $n_i$

$$n_i = \frac{1}{\Delta}\int_{t_i}^{t_{i+1}} \xi(t)\,dt, \qquad \Delta = t_{i+1} - t_i \qquad (5.371)$$

It is easy to see from the definition that this random variable has zero mean and variance $N_0/(2\Delta)$. Samples of the Gaussian random variable are independent of each other. For uniform sampling, WGN is thus substituted by an independent identically distributed (iid) Gaussian sequence with variance $N_0/(2\Delta t)$.

Example: log-normal process. Let the following equation be considered

$$dx(t) = x(t)\,du(t) \qquad (5.372)$$
Its solution satisfying the initial condition $x(t_0) = x_0$ is given by [13,18]

$$x(t) = x_0 \exp[u(t) - u(t_0)] \qquad (5.373)$$

For simplicity it is also assumed that $x_0 > 0$. If $\dot{u}(t) = \xi(t)$ is white Gaussian noise, the process $w(t) = u(t) - u(t_0)$ is the Wiener process, and its PDF is then

$$p_w(x) = \frac{1}{\sqrt{\pi N_0 (t-t_0)}}\exp\left[-\frac{x^2}{N_0 (t-t_0)}\right] \qquad (5.374)$$

Using the rules of zero memory non-linear transformation of densities one obtains the following expression for the PDF of the solution (5.373)

$$p(x,t) = \frac{1}{x\sqrt{\pi N_0 (t-t_0)}}\exp\left[-\frac{\ln^2(x/x_0)}{N_0 (t-t_0)}\right] \qquad (5.375)$$

Thus, the resulting process is distributed according to a log-normal law. It is interesting to find out which form of SDE produces the same solution. In order to do that one can obtain solutions in both Ito and Stratonovich forms and compare the results with eq. (5.375).

Stratonovich form of the solution. Let the time interval $[t_0, t]$ be partitioned by points $t_0 < t_1 < \cdots < t_m < t$. For the $i$-th sub-interval one can write

$$x(t_i) = x(t_{i-1}) + \frac{x(t_i) + x(t_{i-1})}{2}\,[w(t_i) - w(t_{i-1})] \qquad (5.376)$$

or, equivalently,

$$x(t_i) = x(t_{i-1})\,\frac{1 + \tfrac{1}{2}[w(t_i) - w(t_{i-1})]}{1 - \tfrac{1}{2}[w(t_i) - w(t_{i-1})]} \qquad (5.377)$$

The last equality, recursively applied $i$ times, leads to

$$x(t_i) = x(t_0)\prod_{i=1}^{m}\frac{1 + \tfrac{1}{2}[w(t_i) - w(t_{i-1})]}{1 - \tfrac{1}{2}[w(t_i) - w(t_{i-1})]} \qquad (5.378)$$

The limit in eq. (5.378) can be calculated by noting that if $\max|t_i - t_{i-1}| \to 0$ and $m \to \infty$ then $w(t_i) - w(t_{i-1}) \to 0$. Thus, taking the logarithm of both sides of eq. (5.378) and using the approximation $\ln(1+x) \approx x - x^2/2$ for small $x$ one obtains

$$\ln x(t) = \ln x_0 + \sum_{i=1}^{m}\left\{\ln\left[1 + \frac{w(t_i)-w(t_{i-1})}{2}\right] - \ln\left[1 - \frac{w(t_i)-w(t_{i-1})}{2}\right]\right\} \approx \ln x_0 + \sum_{i=1}^{m}[w(t_i) - w(t_{i-1})] = \ln x_0 + w(t) - w(t_0) \qquad (5.379)$$
and thus

$$x(t) = x(t_0)\exp[w(t) - w(t_0)] \qquad (5.380)$$

The latter coincides with eq. (5.373). Using the expressions for the univariate and bivariate characteristic functions of the Wiener process one can obtain expressions for the mean value and the covariance function of the non-stationary process $x(t)$:

$$m_x(t) = x_0\,E\{\exp[w(t) - w(t_0)]\} = x_0\exp\left[\frac{N_0}{4}(t - t_0)\right] \qquad (5.381)$$

$$C_{xx}(t_1,t_2) = x_0^2\,E\{\exp[(w(t_1)-w(t_0)) + (w(t_2)-w(t_0))]\} = x_0^2\exp\left\{\frac{N_0}{4}\left[t_1 + t_2 - 2t_0 + 2\min(t_1-t_0,\,t_2-t_0)\right]\right\} \qquad (5.382)$$
Ito solution: Since in this case the point is taken on the left of each sub-interval, one obtains

$$x(t_i) = x(t_{i-1}) + x(t_{i-1})[w(t_i) - w(t_{i-1})] = x(t_{i-1})[1 + w(t_i) - w(t_{i-1})] \qquad (5.383)$$

(Euler scheme) and, after recursive application of the latter,

$$x(t) = x(t_0)\prod_{i=1}^{m}\left[1 + w(t_i) - w(t_{i-1})\right] \qquad (5.384)$$

Taking the logarithm of both sides and using the approximation $\ln(1+x) \approx x - x^2/2$ once again one easily concludes that

$$\ln x(t) = \ln x_0 + \sum_{i=1}^{m}\ln[1 + w(t_i) - w(t_{i-1})] \approx \ln x_0 + w(t) - w(t_0) - \sum_{i=1}^{m}\frac{[w(t_i) - w(t_{i-1})]^2}{2} \qquad (5.385)$$

If $m \to \infty$ and $\max|t_i - t_{i-1}| \to 0$, then the sum $\sum_i [w(t_i)-w(t_{i-1})]^2$ approaches $N_0(t-t_0)/2$, thus leading to

$$x(t) = x_0\exp\left[w(t) - w(t_0) - \frac{N_0}{4}(t - t_0)\right] \qquad (5.386)$$

It is easy to confirm that this solution is described by the following PDF, average and correlation function

$$p(x,t) = \frac{1}{x\sqrt{\pi N_0(t-t_0)}}\exp\left\{-\frac{1}{N_0(t-t_0)}\left[\ln\frac{x}{x_0} + \frac{N_0}{4}(t-t_0)\right]^2\right\} \qquad (5.387)$$

$$m_x(t) = x_0 \qquad (5.388)$$

$$C_{xx}(t_1,t_2) = x_0^2\exp\left[\frac{N_0}{2}\min(t_1-t_0,\,t_2-t_0)\right] \qquad (5.389)$$
The same result can be obtained by changing variables in the original eq. (5.372). Since conventional rules of calculus do not apply to the Ito form of SDE, one has to use a special rule for substitution: if one considers the SDE (Ito form)

$$\frac{d}{dt}x(t) = f(x,t) + g(x,t)\,\xi(t) \qquad (5.390)$$

and the substitution $y(t) = \phi(x(t))$ is required, then the Ito form SDE for $y(t)$ can be written as

$$\frac{d}{dt}y(t) = f(x,t)\,\phi'(x(t)) + g(x,t)\,\phi'(x(t))\,\xi(t) + \frac{N_0}{4}\,g^2(x,t)\,\phi''(x(t)) \qquad (5.391)$$

Applying this rule to eq. (5.372) and choosing $y(t) = \ln[x(t)]$ one obtains

$$\frac{d}{dt}y(t) = -\frac{N_0}{4} + \xi(t) \qquad (5.392)$$

This, in turn, has an exact solution

$$y(t) = y(t_0) - \frac{N_0}{4}(t - t_0) + \int_{t_0}^{t}\xi(\tau)\,d\tau \qquad (5.393)$$

which immediately produces the result obtained earlier (5.386)

$$x(t) = x(t_0)\exp\left[w(t) - w(t_0) - \frac{N_0}{4}(t - t_0)\right] \qquad (5.394)$$

In a general case one can write

$$x(t_i) = x(t_{i-1}) + \left[(1-\ell)\,x(t_{i-1}) + \ell\,x(t_i)\right][w(t_i) - w(t_{i-1})] \qquad (5.395)$$

which leads to a solution in the form

$$x(t) = x(t_0)\exp\left[w(t) - w(t_0) + \left(\ell - \frac{1}{2}\right)\frac{N_0}{2}(t - t_0)\right] \qquad (5.396)$$

Thus, solutions of the SDE (5.372) obtained by methods applied to regular differential equations correspond to the Stratonovich form of SDE. Physically, this can be explained by the fact that the white Gaussian noise term $\xi(t)$ in the Stratonovich form of stochastic integrals can be treated as a zero mean Gaussian process with covariance function $C(\tau)$ with a very small correlation interval $\tau_c$ (much smaller than the characteristic time of the system under consideration, $\tau_s \gg \tau_c$). The impact of such wide band noise has approximately the same effect as pure WGN if one substitutes $C(\tau) \to (N_0/2)\,\delta(\tau)$, where

$$N_0 = 2\int_{-\infty}^{\infty} C(\tau)\,d\tau \qquad (5.397)$$
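The white noise equivalence of eq. (5.397) can be illustrated deterministically: the variance of the increment of the integrated noise over an interval $T$ is $T\int_{-T}^{T}(1-|\tau|/T)C(\tau)\,d\tau$, which tends to $(N_0/2)T$ as the correlation interval shrinks. A sketch with an assumed exponential covariance (normalized so that $2\int C = N_0$):

```python
# Numerical check: the increment variance of integrated wide band noise
# approaches (N0/2)*T as the correlation interval tau_c -> 0.
# C(tau) = N0/(4*tau_c) * exp(-|tau|/tau_c) satisfies 2*Int C = N0.
import numpy as np

N0, T = 2.0, 1.0

def increment_variance(tau_c, npts=200_001):
    tau = np.linspace(-T, T, npts)
    d = tau[1] - tau[0]
    C = (N0 / (4 * tau_c)) * np.exp(-np.abs(tau) / tau_c)
    return T * np.sum((1 - np.abs(tau) / T) * C) * d

for tau_c in (0.1, 0.01):
    print(tau_c, increment_variance(tau_c))   # tends to N0*T/2 = 1.0
```

As `tau_c` decreases the result approaches the Wiener process value $N_0T/2$, which is the mechanism exploited in the limiting argument below.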
However, substitution of WGN by a correlated process allows one to treat an SDE as a regular differential equation, thus explaining why the Stratonovich solution is obtained as a result of applying the regular rules of calculus. It is beneficial to illustrate this fact on the example considered above. Let the covariance function of $\xi(t)$ be represented in the form $C(\gamma\tau)$, where $\gamma$ is a time scale factor. For example, the exponential covariance function can be written as

$$C(\tau) = N_0 f_1 \gamma\exp(-\gamma f_1 |\tau|) \qquad (5.398)$$

where $\gamma = f/f_1$ and $f_1$ is the "bandwidth" corresponding to $\gamma = 1$. On a fixed sub-interval $[t_{i-1}, t_i]$ eq. (5.372) now can be rewritten as

$$\frac{d}{dt}\tilde{x}(t) = \tilde{x}(t)\,\tilde{\xi}(t) \qquad (5.399)$$

Its solution is given by a simple formula

$$\tilde{x}(t_i) = \tilde{x}(t_{i-1})\exp\left[\int_{t_{i-1}}^{t_i}\tilde{\xi}(t)\,dt\right] = \tilde{x}(t_{i-1})\exp[\tilde{w}(t_i) - \tilde{w}(t_{i-1})] \qquad (5.400)$$

If $\gamma$ approaches infinity, the process $\tilde{w}(t)$ approaches the Wiener process and the solution (5.400) approaches the Stratonovich solution (5.380). Indeed, the variance of the increment of the process $\tilde{w}(t)$ over a time interval $T$ is given by

$$E\{[\tilde{w}(t+T) - \tilde{w}(t)]^2\} = \int_t^{t+T}\!\!\int_t^{t+T} E\{\tilde{\xi}(\tau_1)\tilde{\xi}(\tau_2)\}\,d\tau_1\,d\tau_2 = T\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right)C(\tau)\,d\tau \to T\int_{-\infty}^{\infty} C(\tau)\,d\tau = \frac{N_0}{2}\,T \qquad (5.401)$$

if $\gamma \to \infty$, since the covariance then concentrates at $\tau = 0$ while its integral stays fixed by eq. (5.397). Thus the variance of the increments of $\tilde{w}(t)$ approaches that of the Wiener process, implying that $\tilde{w}(t)$ itself approaches the Wiener process. The same conclusion can be reached for a general SDE: if WGN is substituted by wide band noise, the solution of the corresponding SDE approaches the Stratonovich form of the solution as the bandwidth of the noise increases.

Finally, it is possible to summarize the feasibility of modelling real physical processes using stochastic differential equations in the following three key points.

1. Such modelling is possible if the bandwidth of the driving noise is much wider than the bandwidth of the system, or equivalently $\tau_c \ll \tau_s$, where $\tau_s$ is the characteristic time constant of the system and $\tau_c$ is the correlation interval of the noise. If a continuous time system without delays is described by a system of differential state equations under a deterministic input, the same equations can be used if the input is substituted by WGN. In this case one has to treat the obtained SDE as the Stratonovich form of SDE.
2. If a discrete system is described by a difference equation under a deterministic input, such as

$$x_{i+1} = F_i(x_i) + g_i(x_i)\,\xi_i \qquad (5.402)$$

where $\xi_i$ is an iid Gaussian sequence with variance $D = \sigma^2$ and discretization step $\Delta t$, such a system can be described by a continuous SDE

$$\frac{dx}{dt} \approx \frac{x_{i+1} - x_i}{\Delta t} = \frac{F_i(x_i) - x_i}{\Delta t} + g_i(x_i)\,\xi_i = f(x,t) + g(x,t)\,\xi(t) \qquad (5.403)$$

In this case SDE (5.403) must be treated as the Ito form of SDE, since the Euler numerical scheme is used (see Appendix C on the web). However, if the input sequence $\xi_i$ has a correlation interval $\tau_c$ exceeding $\Delta t$, i.e. $\tau_c > \Delta t$, one has to treat the approximate continuous SDE as the Stratonovich form of SDE.

3. If a system under consideration is driven by wide band noise (i.e. again $\tau_c \ll \tau_s$) but contains a delay $\tau_d$, one has to carefully examine the amount of delay. If such delays are small compared to the time constant of the system, $\tau_d \ll \tau_s$, they can be omitted from the differential equations describing the model. However, the delay must also be compared to the correlation interval of the noise. If $\tau_d \ll \tau_c$ then the system of SDEs inferred from the state model must be treated as the Stratonovich SDE, while in the opposite case the Ito form must be used.
5.5 TEMPORAL SYMMETRY OF THE DIFFUSION MARKOV PROCESS
In this section the so-called temporally symmetric property of diffusion Markov processes is considered. It has been found [22–26] that not all real world processes have this property. Fortunately, stationary processes in communication channels can be thought of as temporally symmetric. This enables one to apply diffusion Markov processes as models of the real ones. Temporal symmetry of a strict sense stationary stochastic process is defined in terms of its joint distributions. A stationary process $\xi(t)$, $t \in T$, is temporally symmetric if, for every $n$ and every $t_1, \ldots, t_n$ in $T$, the random vectors $\{\xi(t_1), \ldots, \xi(t_n)\}$ and $\{\xi(-t_1), \ldots, \xi(-t_n)\}$ have the same probability distributions [23]. The set of time indices, $T$, is assumed to be the entire real line for continuous-time processes, and the set of all integers for discrete-time processes. Thus, whenever $t_0$ is in $T$, so is $-t_0$. Although this facilitates the definition of temporal symmetry given above, it should be noted that if $T$ does not have this property, the definition can be suitably modified, essentially retaining the same ideas. It is said that a stationary process is second order temporally symmetric if the joint distributions of $\{\xi(t_1), \xi(t_2)\}$ and $\{\xi(-t_1), \xi(-t_2)\}$ are identical for every $t_1$ and $t_2$. The
relationship of second order temporal symmetry to the more general symmetry is similar to that of a wide sense stationary process to a strict sense stationary process. By analogy with [24] the following proposition is proven.

Proposition 1. A stationary process is second order temporally symmetric if and only if its bivariate distributions are symmetric.

Proof: If $\xi(t)$ is second order temporally symmetric, by definition

$$P(\xi(t_2), \xi(t_1)) = P(\xi(-t_2), \xi(-t_1)) \qquad (5.404)$$

for every $t_1$ and $t_2$, where $P(\xi(t_2), \xi(t_1))$ denotes a bivariate distribution function. Shifting the time origin by $t_1 + t_2$ and using the stationarity condition, one can write

$$P(\xi(-t_2), \xi(-t_1)) = P(\xi(t_1), \xi(t_2)) \qquad (5.405)$$

Combining eqs. (5.404) and (5.405) one obtains

$$P(\xi(t_2), \xi(t_1)) = P(\xi(t_1), \xi(t_2)) \qquad (5.406)$$

The converse can be proved by retracing the above steps in reverse order.

All stationary Gaussian processes are temporally symmetric [24]. This follows from the fact that all joint distributions of these processes depend only on the covariance matrices and that, due to stationarity, those matrices are Hermitian. It is possible to prove the following.

Proposition 2. A stationary Markov process $\xi(t)$, generated by the Fokker–Planck equation, is temporally symmetric.

Proof: We prove this proposition for the case of a discrete spectrum for a one-dimensional process. The proof in the case of a continuous spectrum, as well as the vector case, repeats the same steps. Taking into account the expression for the bivariate PDF (5.282)

$$p_\tau(x, x_0) = \sum_{m=0}^{\infty}\phi_m(x)\,\phi_m(x_0)\exp[-\lambda_m \tau] \qquad (5.407)$$

one can conclude that this bivariate PDF is symmetric. Then, following Proposition 1, it is possible to conclude that the Markov process generated by the corresponding SDE is temporally symmetric. This means that this approach does not allow one to model temporally asymmetric random processes. One possible approach to modelling asymmetric-in-time processes, staying within the SDE framework, is the use of a Poisson flow of delta pulses as the standard excitation.
5.6 HIGH ORDER SPECTRA OF MARKOV DIFFUSION PROCESSES
The high order spectra of random processes became a topic of great interest in recent decades [25]. Higher order statistical analysis involves high order distributions which, in
the case of Markov diffusion processes generated by proper SDEs, are defined by the transitional density

$$\pi_\tau(x, x_0) = \sum_{m=0}^{\infty}\frac{\phi_m(x)\,\phi_m(x_0)}{p_{\xi\,st}(x_0)}\exp[-\lambda_m|\tau|] \qquad (5.408)$$

Let $t_0$, $t_0+\tau_1$, and $t_0+\tau_1+\tau_2$ be three sequential moments of time and the corresponding values of the Markov diffusion process $\xi(t)$ be $x_0 = \xi(t_0)$, $x_1 = \xi(t_0+\tau_1)$, and $x_2 = \xi(t_0+\tau_1+\tau_2)$. In this case, expression (5.408), written for the two intervals $\tau_1$ and $\tau_2$, has the following form

$$\pi_{\tau_1}(x_1, x_0) = \sum_{m=0}^{\infty}\frac{\phi_m(x_1)\,\phi_m(x_0)}{p_{\xi\,st}(x_0)}\exp[-\lambda_m|\tau_1|] \qquad (5.409)$$

and

$$\pi_{\tau_2}(x_2, x_1) = \sum_{n=0}^{\infty}\frac{\phi_n(x_2)\,\phi_n(x_1)}{p_{\xi\,st}(x_1)}\exp[-\lambda_n|\tau_2|] \qquad (5.410)$$

respectively. The trivariate density can then be obtained by using the Markov property of the process $\xi(t)$, i.e.

$$p_{\xi\,\tau_2\tau_1}(x_2, x_1, x_0) = \pi_{\tau_2}(x_2, x_1)\,\pi_{\tau_1}(x_1, x_0)\,p_{\xi\,st}(x_0) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{\phi_n(x_2)\,\phi_n(x_1)\,\phi_m(x_1)\,\phi_m(x_0)}{p_{\xi\,st}(x_1)}\exp[-\lambda_n|\tau_2| - \lambda_m|\tau_1|] \qquad (5.411)$$

Knowledge of the trivariate density allows one to calculate the moment $m_{k_2 k_1 k_0}(\tau_2, \tau_1)$ of order $k_2, k_1, k_0$

$$m_{k_2 k_1 k_0}(\tau_2, \tau_1) = \iiint_{\Omega^3} x_2^{k_2} x_1^{k_1} x_0^{k_0}\,p_{\xi\,\tau_2\tau_1}(x_2, x_1, x_0)\,dx_2\,dx_1\,dx_0 = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\exp[-\lambda_n|\tau_2| - \lambda_m|\tau_1|]\int_\Omega x_2^{k_2}\phi_n(x_2)\,dx_2\int_\Omega x_1^{k_1}\frac{\phi_n(x_1)\,\phi_m(x_1)}{p_{\xi\,st}(x_1)}\,dx_1\int_\Omega x_0^{k_0}\phi_m(x_0)\,dx_0 = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} h_{n,k_2}\,h_{m,k_0}\,h_{n,m,k_1}\exp[-\lambda_n|\tau_2| - \lambda_m|\tau_1|] \qquad (5.412)$$
where $\Omega$ is the support of the marginal PDF $p_{\xi\,st}(x)$, and the following notation is used

$$h_{n,k} = \int_\Omega x^k\,\phi_n(x)\,dx \qquad (5.413)$$

$$h_{n,m,k} = \int_\Omega x^k\,\frac{\phi_n(x)\,\phi_m(x)}{p_{\xi\,st}(x)}\,dx \qquad (5.414)$$

Thus the following proposition is proven.

Proposition 3. The bivariate moment function of order $k_2, k_1, k_0$ of the Markov diffusion process has the form

$$m_{k_2 k_1 k_0}(\tau_2, \tau_1) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} h_{n,k_2}\,h_{m,k_0}\,h_{n,m,k_1}\exp[-\lambda_n|\tau_2| - \lambda_m|\tau_1|] \qquad (5.415)$$
Corollary. The third order moment of the diffusion Markov process is given by

$$m_3(\tau_1, \tau_2) = m_{111}(\tau_1, \tau_2) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} h_n\,h_m\,h_{nm}\exp[-\lambda_n|\tau_2| - \lambda_m|\tau_1|] \qquad (5.416)$$

where

$$h_n = \int_\Omega x\,\phi_n(x)\,dx \quad \text{and} \quad h_{nm} = \int_\Omega x\,\frac{\phi_n(x)\,\phi_m(x)}{p_{\xi\,st}(x)}\,dx \qquad (5.417)$$

Proposition 4. $h_{n,k} = h_{n,0,k}$.

Proof: Since $\phi_0(x) = p_{\xi\,st}(x)$, one obtains

$$h_{n,0,k} = \int_\Omega x^k\,\frac{\phi_n(x)\,\phi_0(x)}{p_{\xi\,st}(x)}\,dx = \int_\Omega x^k\,\phi_n(x)\,dx = h_{n,k} \qquad (5.418)$$

The two-dimensional moment spectra $M_{k_2 k_1 k_0}(\omega_2, \omega_1)$ can be found as a two-dimensional Fourier transform of $m_{k_2 k_1 k_0}(\tau_2, \tau_1)$

$$M_{k_2 k_1 k_0}(\omega_2, \omega_1) = \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} m_{k_2 k_1 k_0}(\tau_2, \tau_1)\exp[-j\omega_2\tau_2 - j\omega_1\tau_1]\,d\tau_2\,d\tau_1 = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{16\,\lambda_n\lambda_m\,h_{n,k_2}\,h_{m,k_0}\,h_{n,m,k_1}}{(\omega_2^2 + \lambda_n^2)(\omega_1^2 + \lambda_m^2)} \qquad (5.419)$$

thus proving the following proposition.

Proposition 5.

$$M_{k_2 k_1 k_0}(\omega_2, \omega_1) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{16\,\lambda_n\lambda_m\,h_{n,k_2}\,h_{m,k_0}\,h_{n,m,k_1}}{(\omega_2^2 + \lambda_n^2)(\omega_1^2 + \lambda_m^2)} \qquad (5.420)$$
Corollary. For the third order moment, the two-dimensional spectrum has the form

$$M_3(\omega_2, \omega_1) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{16\,\lambda_n\lambda_m\,h_n\,h_m\,h_{nm}}{(\omega_2^2 + \lambda_n^2)(\omega_1^2 + \lambda_m^2)} \qquad (5.421)$$

The third order cumulant can be expressed in terms of moments of order up to 3 as [25,26]

$$c_3(\tau_2, \tau_1) = m_3(\tau_2, \tau_1) - m_1[m_2(\tau_2) + m_2(\tau_1) + m_2(\tau_2 - \tau_1)] + 2m_1^3 \qquad (5.422)$$

or, after substitution of eq. (5.416) into eq. (5.422),

$$c_3(\tau_2, \tau_1) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} h_m\,h_n\,h_{mn}\exp[-\lambda_n|\tau_2| - \lambda_m|\tau_1|] - m_1\sum_{m=1}^{\infty} h_m^2\exp[-\lambda_m|\tau_2 - \tau_1|] \qquad (5.423)$$

and consequently the following proposition is proved.

Proposition 6. The bispectrum $C_3(\omega_2, \omega_1)$ [25] of the diffusion Markov process is given by the following formula

$$C_3(\omega_2, \omega_1) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{16\,\lambda_n\lambda_m\,h_m\,h_n\,h_{mn}}{(\omega_2^2 + \lambda_n^2)(\omega_1^2 + \lambda_m^2)} - m_1\sum_{m=1}^{\infty}\frac{16\,\lambda_m^2\,h_m^2}{(\omega_2^2 + \lambda_m^2)(\omega_1^2 + \lambda_m^2)} \qquad (5.424)$$
The coefficients $h_{n,m,k}$ and the eigenvalues $\lambda_m$ play a significant role in the third order statistics of the diffusion Markov process. We now show that the spectral properties of any order are defined by the same coefficients.

Proposition 7. All spectral properties of a diffusion Markov process are defined by the spectrum of eigenvalues $\lambda_m$ and the coefficients $h_{nm}$.

Proof: Consider $n$ samples of the process $\xi(t)$ at the moments $t_0,\, t_0+\tau_1,\, \ldots,\, t_0+\tau_1+\cdots+\tau_{n-1}$. Using the Markov property of the process, the $n$-variate density is given by

$$p_{\xi\,\tau_1,\ldots,\tau_{n-1}}(x_{n-1}, x_{n-2}, \ldots, x_0) = p_{\xi\,st}(x_0)\prod_{i=1}^{n-1}\pi_{\tau_i}(x_i, x_{i-1}) = \sum_{k_{n-1}=0}^{\infty}\cdots\sum_{k_1=0}^{\infty}\frac{\phi_{k_{n-1}}(x_{n-1})\,\phi_{k_{n-1}}(x_{n-2})\,\phi_{k_{n-2}}(x_{n-2})\,\phi_{k_{n-2}}(x_{n-3})\cdots\phi_{k_1}(x_0)}{p_{\xi\,st}(x_{n-2})\,p_{\xi\,st}(x_{n-3})\cdots p_{\xi\,st}(x_1)}\exp[-\lambda_{k_{n-1}}|\tau_{n-1}| - \lambda_{k_{n-2}}|\tau_{n-2}| - \cdots - \lambda_{k_1}|\tau_1|] \qquad (5.425)$$
To obtain the expression for the moment of order $n$, we integrate eq. (5.425) $n$ times over the real axis $R$

$$m_n(\tau_{n-1}, \tau_{n-2}, \ldots, \tau_1) = \int_{R^n} x_{n-1}x_{n-2}\cdots x_0\,p_{\xi\,\tau_1,\ldots,\tau_{n-1}}(x_{n-1}, \ldots, x_0)\,dx_{n-1}\cdots dx_0 = \sum_{k_{n-1}=0}^{\infty}\cdots\sum_{k_1=0}^{\infty}\exp[-\lambda_{k_{n-1}}|\tau_{n-1}| - \cdots - \lambda_{k_1}|\tau_1|]\int_R x_{n-1}\phi_{k_{n-1}}(x_{n-1})\,dx_{n-1}\int_R x_{n-2}\frac{\phi_{k_{n-1}}(x_{n-2})\,\phi_{k_{n-2}}(x_{n-2})}{p_{\xi\,st}(x_{n-2})}\,dx_{n-2}\cdots\int_R x_1\frac{\phi_{k_2}(x_1)\,\phi_{k_1}(x_1)}{p_{\xi\,st}(x_1)}\,dx_1\int_R x_0\,\phi_{k_1}(x_0)\,dx_0 \qquad (5.426)$$

thus

$$m_n(\tau_{n-1}, \tau_{n-2}, \ldots, \tau_1) = \sum_{k_{n-1}=0}^{\infty}\cdots\sum_{k_1=0}^{\infty} h_{k_{n-1}}\,h_{k_{n-1}k_{n-2}}\cdots h_{k_2 k_1}\,h_{k_1}\exp[-\lambda_{k_{n-1}}|\tau_{n-1}| - \cdots - \lambda_{k_1}|\tau_1|] \qquad (5.427)$$

Since

$$m_{k_{n-1}\ldots k_1 k_0}(\tau_{n-1}, \ldots, \tau_2, \tau_1) = m_{k_{n-1}+\cdots+k_1+k_0}(\tau_{n-1}, 0, \ldots, 0, \tau_{n-2}, \ldots, 0, \tau_1, \ldots, 0) \qquad (5.428)$$

an $n$-variate moment can be expressed in terms of a $(k_{n-1}+\cdots+k_1+k_0)$-variate moment with some of the time lags equal to zero. The last statement together with eq. (5.427) proves the proposition.

Example: The random process with the Rayleigh distribution is generated by the following SDE

$$\dot{x} = \frac{K}{2}\left(\frac{1}{x} - \frac{x}{\sigma^2}\right) + \sqrt{K}\,\xi(t) \qquad (5.429)$$

In this case [6]

$$\lambda_m = \frac{2Km}{\sigma^2} \qquad (5.430)$$

and

$$\phi_m(x) = \frac{1}{m!}\,\frac{x}{\sigma^2}\exp\left(-\frac{x^2}{2\sigma^2}\right)L_m\left(\frac{x^2}{2\sigma^2}\right) \qquad (5.431)$$
Table 5.1  Coefficients $h_{mn}/\sigma$ for the Rayleigh process (5.429)

m\n       0        1        2        3        4
0      1.2533   0.6267   0.0783   0.0131   0.0020
1      0.6267   2.1933   0.4308   0.0326   0.0039
2      0.0783   0.4308   0.7099   0.0873   0.0048
3      0.0131   0.0326   0.0873   0.0934   0.0084
4      0.0020   0.0039   0.0048   0.0084   0.0066

Then, using eq. (5.417), one obtains

$$h_{mn} = \frac{1}{n!\,m!\,\sigma^2}\int_0^{\infty} x^2\exp\left(-\frac{x^2}{2\sigma^2}\right)L_n\left(\frac{x^2}{2\sigma^2}\right)L_m\left(\frac{x^2}{2\sigma^2}\right)dx = \frac{\sqrt{2}\,\sigma}{n!\,m!}\int_0^{\infty} x^{1/2}\exp[-x]\,L_n(x)\,L_m(x)\,dx \qquad (5.432)$$

Values of the coefficients $h_{mn}$ in eq. (5.432) for some integers $n$ and $m$ are presented in Table 5.1. It can be seen from this table that the most important terms $h_{mn}$ are those whose indices do not exceed 3. Thus, even for higher order statistics, it is possible to take into account only the first few terms in the expansion (5.427).¹⁴
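The entries of Table 5.1 can be reproduced in closed form by expanding the Laguerre polynomials, $L_n(x) = \sum_k (-1)^k \binom{n}{k} x^k/k!$, and using $\int_0^\infty x^{k+1/2}e^{-x}\,dx = \Gamma(k+3/2)$ in eq. (5.432). The sketch below (my derivation, not the book's code) computes $h_{mn}/\sigma$ this way; the table appears to list magnitudes, since the signs of some coefficients (e.g. $h_{01}$) come out negative:

```python
# Closed-form evaluation of h_mn/sigma from eq. (5.432) via the Laguerre
# series and the Gamma function; magnitudes match Table 5.1.
from math import comb, factorial, gamma, sqrt

def h(m, n):
    total = 0.0
    for j in range(n + 1):
        cj = (-1)**j * comb(n, j) / factorial(j)     # coeff of L_n
        for k in range(m + 1):
            ck = (-1)**k * comb(m, k) / factorial(k)  # coeff of L_m
            total += cj * ck * gamma(j + k + 1.5)
    return sqrt(2) * total / (factorial(n) * factorial(m))

print(round(abs(h(0, 0)), 4))   # 1.2533
print(round(abs(h(1, 1)), 4))   # 2.1933
print(round(abs(h(1, 2)), 4))   # 0.4308
```

The symmetry $h_{mn} = h_{nm}$ visible in the table follows directly from the symmetry of the integrand in eq. (5.432).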
5.7 VECTOR MARKOV PROCESSES

5.7.1 Definitions
The notion of a scalar Markov process can be extended to vector processes. In particular, let $\boldsymbol{\xi}(t) = \{\xi_1(t), \xi_2(t), \ldots, \xi_M(t)\}$ be a random vector process of dimension $M$ which can be described by a series of probability density functions $p_{n+1}(\mathbf{x}_0, \ldots, \mathbf{x}_n; t_0, \ldots, t_n)$ of order $n$, as defined in Chapter 2, or in terms of a series of conditional probability densities

$$\pi_n(\mathbf{x}_n, t_n \mid \mathbf{x}_{n-1}, t_{n-1}; \ldots; \mathbf{x}_0, t_0) = \frac{p_{n+1}(\mathbf{x}_0, \ldots, \mathbf{x}_n; t_0, \ldots, t_n)}{p_n(\mathbf{x}_0, \ldots, \mathbf{x}_{n-1}; t_0, \ldots, t_{n-1})} \qquad (5.433)$$

The Markov property is expressed by the fact that the transitional density, defined by eq. (5.433), depends only on the value of the process at the moment of time $t_{n-1}$, i.e.

$$\pi_n(\mathbf{x}_n, t_n \mid \mathbf{x}_{n-1}, t_{n-1}; \ldots; \mathbf{x}_0, t_0) = \pi(\mathbf{x}_n, t_n \mid \mathbf{x}_{n-1}, t_{n-1}) \qquad (5.434)$$

All the properties of a scalar Markov process discussed in sections 5.1–5.5 can be readily extended to the vector case. In particular, one can show that the transitional density
¹⁴ For the case of the correlation function this result was considered in [26,29].
$\pi(\mathbf{x}, t \mid \mathbf{x}_0, t_0)$ must obey the Smoluchowski–Chapman equation

$$\pi(\mathbf{x}, t \mid \mathbf{x}_0, t_0) = \int_{R^M}\pi(\mathbf{x}, t \mid \mathbf{x}_1, t_1)\,\pi(\mathbf{x}_1, t_1 \mid \mathbf{x}_0, t_0)\,d\mathbf{x}_1, \qquad t_0 < t_1 < t \qquad (5.435)$$

Here the integration is taken over $R^M$. It is also possible to show that if $\boldsymbol{\xi}(t)$ is a Markov vector process with $M$ components and continuous trajectories, i.e. only the first two conditional moments of the increments are non-zero,

$$K_{1i}(\mathbf{x}, t) = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\,E\{[\xi_i(t+\Delta t) - \xi_i(t)] \mid \boldsymbol{\xi}(t) = \mathbf{x}\} \qquad (5.436)$$

$$K_{2ij}(\mathbf{x}, t) = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\,E\{[\xi_i(t+\Delta t) - \xi_i(t)][\xi_j(t+\Delta t) - \xi_j(t)] \mid \boldsymbol{\xi}(t) = \mathbf{x}\} \qquad (5.437)$$

then the marginal PDF $p(\mathbf{x}, t)$ and the transitional density $\pi(\mathbf{x}, t \mid \mathbf{x}_0, t_0)$ must obey the Fokker–Planck equation

$$\frac{\partial}{\partial t}\,p(\mathbf{x}, t) = L\{p(\mathbf{x}, t)\} \qquad (5.438)$$

where the Fokker–Planck operator is defined as

$$L\{p(\mathbf{x}, t)\} = -\sum_{m=1}^{M}\frac{\partial}{\partial x_m}\left[K_{1m}(\mathbf{x}, t)\,p(\mathbf{x}, t)\right] + \frac{1}{2}\sum_{m=1}^{M}\sum_{k=1}^{M}\frac{\partial^2}{\partial x_m\,\partial x_k}\left[K_{2mk}(\mathbf{x}, t)\,p(\mathbf{x}, t)\right] \qquad (5.439)$$

In addition to continuity of $K_{1m}(\mathbf{x}, t)$ and $K_{2mk}(\mathbf{x}, t)$ and their derivatives, it is required that the quadratic form

$$H(\mathbf{x}) = \sum_{m=1}^{M}\sum_{k=1}^{M} x_m x_k K_{2mk}(\mathbf{x}, t) \qquad (5.440)$$

be non-negative definite, i.e. $H(\mathbf{x}) \geq 0$ for all $\mathbf{x}$. Markov processes which satisfy the corresponding Fokker–Planck equations are called diffusion Markov processes. Partial differential equations (5.438) and (5.439) must be supplemented by a proper set of initial and boundary conditions, and their solution, even in the stationary case, is much more complicated than in the case of a scalar diffusion Markov process. In contrast to the one-dimensional case, an exact solution of the Fokker–Planck equation, even in the stationary case, is impossible in general [1,2]. In this section, we consider the solution for one particular case, called the potential isotropic case [6]. In this case the Fokker–Planck equation can be rewritten in the form [6]

$$\frac{\partial}{\partial t}\,p_x(\mathbf{x}, t) = -\sum_{i=1}^{M}\frac{\partial}{\partial x_i}\left[K_{1i}(\mathbf{x})\,p_x(\mathbf{x}, t)\right] + \frac{1}{2}\sum_{i,j=1}^{M}\frac{\partial^2}{\partial x_i\,\partial x_j}\left[K_{2ij}(\mathbf{x})\,p_x(\mathbf{x}, t)\right] = -\sum_{i=1}^{M}\frac{\partial}{\partial x_i}G_i(\mathbf{x}) \qquad (5.441)$$
where the following notation is used

$$G_i(\mathbf{x}) = K_{1i}(\mathbf{x})\,p_x(\mathbf{x}, t) - \frac{1}{2}\sum_{j=1}^{M}\frac{\partial}{\partial x_j}\left[K_{2ij}(\mathbf{x})\,p_x(\mathbf{x}, t)\right] \qquad (5.442)$$

It can be seen from eq. (5.441) that the stationary PDF, if it exists, satisfies the equation

$$\sum_{i=1}^{M}\frac{\partial}{\partial x_i}G_i(\mathbf{x}) = 0 \qquad (5.443)$$

In the vector case the probability current $\mathbf{G} = \{G_1, G_2, \ldots, G_M\}$ does not have to vanish inside the region $R$ under consideration, even if $\mathbf{G}$ satisfies zero boundary conditions [6]

$$G_i(\mathbf{x}) = 0, \qquad i = 1, \ldots, M \qquad (5.444)$$

since rotational probability flows can occur. In fact, the current $\mathbf{G}$ vanishes in the whole region $R$, i.e.

$$G_i(\mathbf{x}) = K_{1i}(\mathbf{x})\,p_x(\mathbf{x}, t) - \frac{1}{2}\sum_{j=1}^{M}\frac{\partial}{\partial x_j}\left[K_{2ij}(\mathbf{x})\,p_x(\mathbf{x}, t)\right] = 0 \qquad (5.445)$$

only in the special case, called the potential one [6], when

$$\frac{\partial}{\partial x_i}\left[\frac{K_{1j}(\mathbf{x})}{K_2(\mathbf{x})}\right] = \frac{\partial}{\partial x_j}\left[\frac{K_{1i}(\mathbf{x})}{K_2(\mathbf{x})}\right] \qquad (5.446)$$

If the conditions (5.446) are met, the stationary PDF can be represented in the form [6]

$$p_{x\,st}(\mathbf{x}) = \frac{C}{K_2(\mathbf{x})}\exp\left[2\int_{\mathbf{x}_1}^{\mathbf{x}}\sum_{i=1}^{M}\frac{K_{1i}(\mathbf{x})}{K_2(\mathbf{x})}\,dx_i\right] \qquad (5.447)$$

The non-stationary PDF $p_x(\mathbf{x}, t)$ can sometimes be obtained using Fourier methods in the same manner as in the one-dimensional case.¹⁵ By using the representation $p(\mathbf{x}, t) = T(t)\Phi(\mathbf{x})$ one can separate eq. (5.441) into the system

$$\dot{T}(t) = -\lambda T(t), \qquad -\sum_{i=1}^{M}\frac{\partial}{\partial x_i}\left[K_{1i}(\mathbf{x})\Phi(\mathbf{x})\right] + \frac{1}{2}\sum_{i,j=1}^{M}\frac{\partial^2}{\partial x_i\,\partial x_j}\left[K_{2ij}(\mathbf{x})\Phi(\mathbf{x})\right] + \lambda\Phi(\mathbf{x}) = 0 \qquad (5.448)$$

¹⁵ In the case of a potential isotropic system the differential operator of the generating SDE is a symmetric one and this means that application of the Fourier method is possible [14].
The transition density, bivariate density, and cross-covariance function between two deterministic functions $F_1(x)$ and $F_2(x)$ of the process $x(t)$ can be given in terms of the eigenvalues and eigenfunctions of the boundary problem as [27,29]¹⁶

$$\pi_\tau(x, x_0) = \sum_{m=0}^{\infty}\frac{\phi_m(x)\,\phi_m(x_0)}{p_{x\,st}(x_0)}\exp[-\lambda_m|\tau|] \qquad (5.449)$$

$$p_\tau(x, x_0) = \sum_{m=0}^{\infty}\phi_m(x)\,\phi_m(x_0)\exp[-\lambda_m|\tau|] \qquad (5.450)$$

$$C_{F_1 F_2}(\tau) = \sum_{m=0}^{\infty} h_{m1}\,h_{m2}\exp[-\lambda_m\tau] \qquad (5.451)$$

where

$$h_{mj} = \int_{x_1}^{x_2} F_j(x)\,\phi_m(x)\,dx, \qquad j = 1, 2 \qquad (5.452)$$
Having the transient and stationary (or initial) PDF, one can construct the PDF of any dimension using the Markov properties of the process $x(t)$. In addition to the Fourier method of solution of FPE (5.441), some other methods, such as the Laplace transform, transition to a Volterra integral equation, and different numerical schemes, can be considered [2]. However, these methods do not give sufficient information for synthesis of the SDE and therefore they are not considered here.

As was shown earlier, any diffusion Markov process can be considered as a solution of a properly defined stochastic differential equation, as discussed in Section 5.4. A similar statement can be made about vector diffusion Markov processes. Indeed, Doob proved [8] that the following system of SDEs

$$dx_i(t) = f_i(\mathbf{x}, t)\,dt + \sum_{m=1}^{M} g_{im}(\mathbf{x}, t)\,d_\ell w_m(t), \qquad \mathbf{x}(t_0) = \mathbf{x}_0, \qquad 0 \leq \ell \leq 1 \qquad (5.453)$$

defines a diffusion Markov process with the drift and diffusion defined as

$$K_{1i}(\mathbf{x}, t) = f_i(\mathbf{x}, t) + \ell\sum_{k=1}^{M}\sum_{j=1}^{M}\frac{N_k}{2}\,g_{jk}(\mathbf{x}, t)\frac{\partial}{\partial x_j}g_{ik}(\mathbf{x}, t), \qquad K_{2ij}(\mathbf{x}, t) = \frac{1}{2}\sum_{k=1}^{M} N_k\,g_{ik}(\mathbf{x}, t)\,g_{jk}(\mathbf{x}, t) \qquad (5.454)$$

Here the $w_k(t)$ are Wiener processes with zero mean and linear cross-covariance

$$E\{w_k(t)\} = 0, \qquad E\{[w_k(t_2) - w_k(t_1)][w_l(t_2) - w_l(t_1)]\} = \delta_{kl}\sqrt{N_k N_l}\,|t_2 - t_1| = \delta_{kl}\,N_k\,|t_2 - t_1|, \qquad 1 \leq k, l \leq M \qquad (5.455)$$
267
VECTOR MARKOV PROCESSES
Equivalently, wk ðtÞ can be thought of as a time integral of a zero mean white Gaussian noise wk ðtÞ ¼
ðt 1
k ðtÞdt; Efk ðt1 Þk ðt2 Þg ¼ Nn ðt2 t1 Þ; Efk ðt1 Þl ðt2 Þg ¼ 0;
k 6¼ l
ð5:456Þ
Having this in mind, one can formally rewrite the system of SDE (5.453) as M X d xi ðtÞ ¼ fi ðx; tÞ þ gim ðx; tÞm ðtÞ; xðt0 Þ ¼ x0 ; dt m¼1
01
ð5:457Þ
with the additional specification of the form of stochastic integrals used. The latter form of SDE is particularly useful if a deterministic system excited by a narrow band random input is studied. In this case one can reuse the deterministic state equations to write SDE (5.457), of course keeping in mind that such a procedure would lead to the solution of SDE (5.457) in the Stratonovich form.17 In particular, two types of the state equation are often found in applications. In the first case the state equation is resolved with respect to the higher order derivative, i.e. it is written in the following form dm xðtÞ ¼ Fðx; x0 ; . . . ; xðm1Þ ; t; ðtÞÞ dtm
ð5:458Þ
Using change of variables x0 ðtÞ ¼ xðtÞ, xk ðtÞ ¼ ðdk =dtk ÞxðtÞ, one can rewrite eq. (5.458) as a system of m first order SDEs, thus converting it to the form (5.457). In the second case, the system is linear and thus it can be described by a system of linear SDEs N X d xj ðtÞ ¼ ajk xk ðtÞ þ j ðtÞ; dt k¼1
1 < k;
j
ð5:459Þ
where ajk are some constant coefficients and j ðtÞ are elements of a zero mean vector Gaussian process with covariance matrix 1 pffiffiffiffiffiffiffiffiffi Cij ðt2 ; t1 Þ ¼ Rij Ni Nj ðt2 t1 Þ; 2
Rkk ¼ 1
ð5:460Þ
It is assumed that coefficients Rij and Ni do not depend on time. Since the system of equations describes linear transformation of a Gaussian process, the resulting process is also a Gaussian one. This significantly simplifies the solution of the problem since one has to find only the mean vector mðtÞ and the covariance matrix KðtÞ. In order to do so one can rewrite the system of SDEs (5.459) in the matrix form d XðtÞ ¼ AXðtÞ þ ðtÞ dt
ð5:461Þ
If the excitation is just an additive function, i.e. gij ðx; tÞ ¼ gij ðtÞ, both the Stratonovich and the Ito form coincide and one does not need to worry about proper interpretation of SDE (5.457).
17
268
MARKOV PROCESSES AND THEIR DESCRIPTION
where XðtÞ ¼ ½x1 ðtÞ; . . . ; xM ðtÞT , A ¼ faij g and ðtÞ ¼ ½1 ðtÞ; . . . ; M ðtÞ. The initial state of the system is described by deterministic initial conditions Xðt0 Þ ¼ X 0 . Averaging both parts of SDE (5.461) over all possible realizations, and taking into account that linear operations (such as differentiation and multiplication by a number) and averaging can be interchanged, one readily obtains equations for the mean vector d mðtÞ ¼ AmðtÞ þ EfðtÞg ¼ AmðtÞ dt
ð5:462Þ
mðtÞ ¼ m0 expðAtÞ
ð5:463Þ
lim mðtÞ ¼ 0
ð5:464Þ
Thus
and t!1
if the matrix A represents a stable system, i.e. all its eigenvalues have a negative real part. In other words, the mean value of the process XðtÞ in the stationary regime is equal to zero. In a similar manner one can derive a differential equation for the variance matrix Cðt; t þ Þ. Indeed d d Cðt; t þ Þ ¼ EfðXðtÞ mðtÞÞðXðt þ Þ mðt þ ÞÞT g dt dt d T ðXðtÞ mðtÞÞðXðt þ Þ mðt þ ÞÞ ¼E dt d T þ E ðXðtÞ mðtÞÞ ðXðt þ Þ mðt þ ÞÞ dt
ð5:465Þ
EfðAðXðtÞ mðtÞÞ þ ðtÞ ðXðt þ Þ mðt þ ÞÞT g þ EfðXðtÞ mðtÞÞðAðXðt þ Þ mðt þ ÞÞ þ ðt þ ÞÞT g ¼ ACðt; t þ Þ Cðt; t þ ÞAT þ B where B¼
1 pffiffiffiffiffiffiffiffiffi Rij Ni Nj 2
ð5:466Þ
A solution of the vector equation (5.465) can be obtained by known methods [28]. In particular, by setting ¼ 0 one can obtain matrix of variances DðtÞ ¼ Cðt; tÞ as a solution of a system of linear ordinary differential equations d DðtÞ ¼ ADðtÞ DðtÞAT þ B; Dðt0 Þ ¼ 0 dt ðt T DðtÞ ¼ eAs BeA s ds t0
ð5:467Þ ð5:468Þ
while the time-dependent covariance matrix $\mathbf{C}(t, \tau)$ is then given by

$\mathbf{C}(t, \tau) = \begin{cases} \mathbf{U}(t, \tau)\,\mathbf{D}(\tau), & t \ge \tau \\ \mathbf{D}(t)\,\mathbf{U}^T(t, \tau), & t < \tau \end{cases}$   (5.469)

where $\mathbf{U}(t, \tau)$ is the matrix satisfying the homogeneous matrix equation

$\dfrac{d}{dt}\mathbf{U}(t, \tau) = \mathbf{A}\,\mathbf{U}(t, \tau)$   (5.470)

with initial condition $\mathbf{U}(t_0, t_0) = \mathbf{I}$. Here $\mathbf{I}$ is the identity matrix. Some simplification can be achieved if the matrix $\mathbf{A}$ can be represented in the diagonal form

$\mathbf{A} = \mathbf{T}^{-1}\boldsymbol{\Lambda}\mathbf{T}, \qquad \boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_M)$   (5.471)

where $\mathbf{T}$ is the transformation matrix. In this case

$\exp(\mathbf{A}t) = \mathbf{T}^{-1}\exp(\boldsymbol{\Lambda}t)\,\mathbf{T} = \mathbf{T}^{-1}\,\mathrm{diag}\!\left(e^{\lambda_1 t}, e^{\lambda_2 t}, \ldots, e^{\lambda_M t}\right)\mathbf{T}$   (5.472)

Thus

$\mathbf{m}(t) = \mathbf{T}^{-1}\,\mathrm{diag}\!\left(e^{\lambda_1 t}, e^{\lambda_2 t}, \ldots, e^{\lambda_M t}\right)\mathbf{T}\,\mathbf{X}_0$   (5.473)

and

$\mathbf{C}(t) = \mathbf{T}^{-1}\mathbf{R}\,(\mathbf{T}^{-1})^T, \qquad \mathbf{R} = \{r_{ij}\}$   (5.474)

where

$r_{ij} = \dfrac{g_{ij}}{\lambda_i + \lambda_j}\left[e^{(\lambda_i + \lambda_j)t} - 1\right], \qquad \mathbf{G} = \{g_{ij}\} = \mathbf{T}\mathbf{B}\mathbf{T}^T$   (5.475)

Thus, calculation of the parameters of the PDF $p(\mathbf{x}, t)$ is reduced to matrix calculus.
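Equation (5.472) can be checked numerically. The sketch below (for a hypothetical stable matrix with distinct eigenvalues, so that the diagonalization exists) compares the eigendecomposition form of $\exp(\mathbf{A}t)$ with a truncated Taylor series of the matrix exponential:

```python
import numpy as np

# Hypothetical stable matrix with distinct eigenvalues; in the notation
# of eq. (5.471), A V = V Lambda, i.e. T^{-1} = V and T = V^{-1}.
A = np.array([[-3.0, 1.0], [0.0, -1.0]])
lam, V = np.linalg.eig(A)
t = 0.7

# exp(At) via eq. (5.472): T^{-1} diag(exp(lambda_i t)) T
expAt_eig = V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V)

# Reference: truncated Taylor series exp(At) = sum (At)^k / k!
expAt_ser, term = np.eye(2), np.eye(2)
for k in range(1, 30):
    term = term @ (A * t) / k
    expAt_ser += term

print(np.allclose(expAt_eig.real, expAt_ser))
```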
5.7.1.1 A Gaussian Process with a Rational Spectrum

A linear system with a rational transfer function

$H(j\omega) = \dfrac{P_m(j\omega)}{Q_n(j\omega)} = \dfrac{\beta_0 (j\omega)^m + \beta_1 (j\omega)^{m-1} + \cdots + \beta_m}{\alpha_0 (j\omega)^n + \alpha_1 (j\omega)^{n-1} + \cdots + \alpha_n}, \qquad m < n$   (5.476)

driven by a WGN with spectral density $N_0/2$ produces a Gaussian random process $\xi(t)$ with rational power spectral density given, according to eq. (4.141), by

$S_\xi(\omega) = \dfrac{N_0}{2}\,\dfrac{|P_m(j\omega)|^2}{|Q_n(j\omega)|^2}$   (5.477)

It is easy to see that such a system can be described by the following SDE

$\alpha_0 \dfrac{d^n}{dt^n}\xi(t) + \alpha_1 \dfrac{d^{n-1}}{dt^{n-1}}\xi(t) + \cdots + \alpha_n \xi(t) = \beta_0 \dfrac{d^m}{dt^m}\eta(t) + \beta_1 \dfrac{d^{m-1}}{dt^{m-1}}\eta(t) + \cdots + \beta_m \eta(t)$   (5.478)

It is possible to provide an equivalent system of $n$ first order differential equations instead of the single differential equation (5.478) of $n$-th order. This can be accomplished by introducing $(n-1)$ auxiliary processes
$\dot{\xi}_1(t) = \xi_2(t), \qquad \dot{\xi}_{n-m}(t) = \xi_{n-m+1}(t) + \gamma_{n-m}\,\eta(t), \qquad m = 1, \ldots, n-1$   (5.479)

The constants $\gamma_k$ can be chosen in such a way that no derivatives of the WGN $\eta(t)$ appear in the equivalent system. Substituting the new processes defined by eq. (5.479) into the original eq. (5.478) and equating the coefficients of the equal-order derivatives of $\eta(t)$, one obtains a recursive equation for calculating the coefficients $\gamma_k$

$\gamma_k = \beta_{k+m-n} - \sum_{i=1}^{k+m-n}\alpha_i\,\gamma_{k-i}, \qquad k = n-m, \ldots, n$   (5.480)

Given that, the solution $\xi(t)$ of the original eq. (5.478) can be represented as a component of the vector solution of the system (5.479) augmented with an additional equation

$\dot{\xi}_n(t) = -\sum_{i=1}^{n}\alpha_{n+1-i}\,\xi_i(t) + \gamma_n\,\eta(t)$   (5.481)
Since the linear system of SDEs defines a Markov vector, a Gaussian random process with a rational spectrum can always be considered as a component of a vector Markov process. Furthermore, if a deterministic differential system is driven by a coloured Gaussian noise $\xi_c(t)$ with a rational spectrum of order $n$

$\dot{x}(t) = f(x, t) + g(x, t)\,\xi_c(t)$   (5.482)

it is possible to represent the solution $x(t)$ as a component of a Markov vector $\{x(t), \xi_1(t), \ldots, \xi_n(t)\}^T$, where the $\xi_k(t)$ are obtained as components of the Markov vector representing the coloured noise. In addition, it is important to mention that if $\xi(t)$ is a Markov diffusion process, its zero-memory one-to-one non-linear transformation $\eta(t) = F[\xi(t)]$ is also a diffusion Markov process. Its characteristics can be found by standard rules, as described in this section.
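The augmentation idea can be illustrated numerically. Below, an OU-type coloured noise $\xi_c(t)$ drives a first order linear system, and the pair $(x, \xi_c)$ is simulated as a two-component Markov vector by the Euler–Maruyama scheme. The parameters and the closed-form stationary variance are specific to this illustrative linear example, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Coloured (OU-type) noise xi_c with bandwidth beta drives the linear
# system dx/dt = -alpha*x + xi_c; the pair (x, xi_c) is a Markov vector
# even though x alone is not Markov. Parameters are illustrative.
alpha, beta, sigma = 1.0, 5.0, 1.0
dt, n = 1e-3, 500000
w = sigma * np.sqrt(dt) * rng.standard_normal(n)
x = xi = 0.0
xs = np.empty(n)
for i in range(n):
    xi += -beta * xi * dt + w[i]       # colored-noise component
    x += (-alpha * x + xi) * dt        # driven system
    xs[i] = x

# For this linear pair, the stationary variance of x follows from the
# Lyapunov equation: sigma^2 / (2*alpha*beta*(alpha+beta)).
var_theory = sigma**2 / (2 * alpha * beta * (alpha + beta))
print(abs(xs[n // 2:].var() - var_theory) / var_theory < 0.3)
```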
5.8 ON PROPERTIES OF CORRELATION FUNCTIONS OF ONE-DIMENSIONAL MARKOV PROCESSES
It can be seen from eq. (5.286) that for a random process generated by the SDE (5.329) the correlation function is an infinite series of exponents.18 Nevertheless, the asymptotic property of the eigenvalues indicates that this series converges rapidly [29]. In fact, if one takes into account the asymptotic relationship

$\lambda_q = q^m \lambda_1, \qquad m = 1, 2$   (5.483)

between the eigenvalues of the Fokker–Planck operator [2], it follows that in the range $\tau > \tau_{corr}$, where $\tau_{corr}$ is the correlation interval of the process $x(t)$, one can restrict consideration to the first three terms in eq. (5.286) [29]. This fact raises the problem of the approximate exponential representation of the correlation function of a continuous Markov process which is a solution of the non-linear first order SDE (5.329). In order to achieve this goal, one can introduce the following function

$V_k(t - t_0) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x(t_0)\,x^{k-1}(t)\,p_x(x(t_0), x(t); t - t_0)\,d[x(t_0)]\,d[x(t)]$   (5.484)

which becomes the covariance function of the process $x(t)$

$V_2(t - t_0) = C_{xx}(\tau)$   (5.485)

where $\tau = t - t_0$, if $k = 2$ and $\langle x(t)\rangle = 0$. Taking into account eqs. (5.329) and (5.485) one obtains the following equation [6]

$\dfrac{\partial}{\partial t}V_2(t - t_0) = \left\langle x(t_0)\,\dot{x}(t)\right\rangle = \left\langle x(t_0)\left[f(x) + g(x)\,\xi(t)\right]\right\rangle$   (5.486)

It was shown in [6] that

$\left\langle x(t_0)\,g(x)\,\xi(t)\right\rangle = \dfrac{1}{2}\left\langle x(t_0)\,g(x)\dfrac{d}{dx}g(x)\right\rangle$   (5.487)

18 This fact proves that the random process generated by the SDE (5.329) has a monotone spectrum. Consequently, it is impossible to generate a narrow band process using SDE (5.329).
for the Stratonovich definition of the stochastic integral. Thus, taking into account that for the Stratonovich form of the SDE the drift is given by

$K_1(x) = f(x) + \dfrac{1}{2}\,g(x)\dfrac{d}{dx}g(x)$   (5.488)

one obtains

$\dfrac{\partial}{\partial t}V_2(t - t_0) = \left\langle x(t_0)\,K_1(x)\right\rangle$   (5.489)

Expanding $f(x)$ and

$g_1(x) = \dfrac{1}{2}\,g(x)\dfrac{d}{dx}g(x)$   (5.490)

into Taylor series, one can rewrite eq. (5.489) as

$\dfrac{\partial}{\partial t}V_2(t - t_0) = \sum_{k=2}^{\infty} a_{k-1}\,V_k(t - t_0)$   (5.491)

where [6]

$a_{k-1} = \dfrac{1}{(k-1)!}\,\dfrac{d^{k-1}}{dx^{k-1}}\left[f(x) + \dfrac{1}{2}\,g(x)\dfrac{d}{dx}g(x)\right]\Bigg|_{x=0}$   (5.492)

It can be shown that for a symmetrical PDF the function $V_3(t - t_0)$ is close to zero [6] and $V_4(t - t_0) \approx n_0\,V_2(t - t_0)$ [6], where

$n_0 = \dfrac{\int_{-\infty}^{\infty} x^4\,p_{x\,st}(x)\,dx}{\int_{-\infty}^{\infty} x^2\,p_{x\,st}(x)\,dx}$   (5.493)

Leaving only the first two even terms in eq. (5.491), one obtains the following approximate representation of the correlation function of the solution of the SDE (5.329) [29]

$\hat{C}_{xx}(\tau) \approx \sigma^2 \exp\left[(a_1 + a_3 n_0)\,|\tau|\right]$   (5.494)

The last formula has only a limited application and should not be considered as a proof of the fact that the correlation function of the Markov process generated by SDE (5.329) can be closely approximated by a single exponent. At the same time, if the drift of the Markov process can be accurately approximated by a linear function as

$K_1(x) = -\gamma x + \varepsilon(x)$   (5.495)
where $\gamma|x| \gg |\varepsilon(x)|$, then $C_{xx}(\tau)$ can indeed be approximated by a single exponent. In this case, according to eq. (5.489),

$\dfrac{d}{d\tau}C_{xx}(\tau) = -\gamma\,C_{xx}(\tau) + \left\langle x\,\varepsilon(x_\tau)\right\rangle$   (5.496)

The last term in eq. (5.496) is relatively small, so the solution of eq. (5.496) is close to the solution of the equation

$\dfrac{d}{d\tau}C_{xx}(\tau) = -\gamma\,C_{xx}(\tau)$   (5.497)

which is an exact exponent. A more detailed discussion of this problem, as well as some examples, can be found in Chapter 7.
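A quick numerical illustration of eq. (5.497): for the OU process, whose drift is exactly linear ($\varepsilon(x) = 0$), the simulated correlation function should match the single exponent $\sigma^2\exp(-\gamma|\tau|)$. All parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

# OU process dx = -gamma*x dt + sqrt(2*gamma)*sigma dW: the drift is
# exactly linear (epsilon(x) = 0), so C_xx(tau) = sigma^2*exp(-gamma*|tau|)
# holds exactly, as in eq. (5.497). Values are illustrative.
gamma, sigma, dt, n = 2.0, 1.0, 1e-3, 600000
w = sigma * np.sqrt(2 * gamma * dt) * rng.standard_normal(n)
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = x[i - 1] - gamma * x[i - 1] * dt + w[i]

tau = 0.5
lag = int(tau / dt)
# Correlation estimate from the second half of the realization
c_hat = np.mean(x[n // 2:-lag] * x[n // 2 + lag:])
c_theory = sigma**2 * np.exp(-gamma * tau)
print(abs(c_hat - c_theory) < 0.2)
```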
REFERENCES

1. C. W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, Berlin: Springer, 1994.
2. H. Risken, The Fokker–Planck Equation: Methods of Solution and Applications, Berlin: Springer, 1996.
3. A. T. Bharucha-Reid, Elements of The Theory of Markov Processes and Their Applications, New York: McGraw-Hill, 1960.
4. W. Feller, An Introduction to Probability Theory and Its Applications, New York: Wiley, 1966.
5. A. Papoulis and S. Pillai, Probability, Random Variables, and Stochastic Processes, Boston: McGraw-Hill, 2002.
6. R. L. Stratonovich, Topics in the Theory of Random Noise, New York: Gordon and Breach, 1967.
7. J. Kemeny and L. Snell, Finite Markov Chains, Princeton: Van Nostrand, 1960.
8. J. Doob, Stochastic Processes, New York: Wiley, 1953.
9. A. Tikhonov and A. Samarski, Partial Differential Equations of Mathematical Physics, San Francisco: Holden-Day, 1964.
10. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, New York: Dover, 1972.
11. J. Murray, Mathematical Biology, New York: Springer, 1993.
12. E. Kamke, Differentialgleichungen, Lösungsmethoden und Lösungen, Leipzig: Akademische Verlagsgesellschaft Geest und Portig, 1959.
13. A. Polianin and V. Zaitsev, Handbook of Exact Solutions for Ordinary Differential Equations, Boca Raton, FL: Chapman & Hall/CRC, 2003.
14. P. E. Kloeden and E. Platen, The Numerical Solution of Stochastic Differential Equations, Berlin: Springer-Verlag, 1992.
15. P. E. Kloeden, E. Platen and H. Schurz, The Numerical Solution of Stochastic Differential Equations through Computer Experiments, Berlin: Springer-Verlag, 1994.
16. V. Tikhonov and N. Kulman, Nonlinear Filtration and Quasicoherent Signal Reception, Moscow: Sovradio, 1975 (in Russian).
17. R. Pawula, Approximation of the linear Boltzmann equation by the Fokker–Planck equation, Physical Review, 162(1), 186–188, 1967.
18. W. C. Lindsey, Synchronization Systems in Communication and Control, Englewood Cliffs: Prentice-Hall, 1972.
19. E. Titchmarsh, Eigenfunction Expansions Associated with Second-order Differential Equations, Oxford: Clarendon Press, 1962.
20. I. S. Gradsteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, New York: Academic Press, 1980.
21. A. Berger, Chaos and Chance: An Introduction to Stochastic Aspects of Dynamics, New York: Walter de Gruyter, 2001.
22. J. Schiff, The Laplace Transform: Theory and Applications, New York: Springer, 1999.
23. G. Golub and C. Van Loan, Matrix Computations, Baltimore: Johns Hopkins University Press, 1996.
24. P. Rao, D. Johnson and D. Becker, Generation and analysis of non-Gaussian Markov time series, IEEE Transactions on Signal Processing, 40(4), 845–855, 1992.
25. G. Weiss, Time-reversibility of linear stochastic processes, Journal of Applied Probability, 12(2), 831–836, 1975.
26. R. Rao, Non-Gaussian Time Series, M.Sc. Thesis, Houston: Rice University, 1988.
27. C. Nikias and A. Petropulu, Higher-Order Spectra Analysis, New Jersey: Prentice-Hall, 1993.
28. A. N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations, Moscow: Sovetskoe Radio, 1978 (in Russian).
29. D. Klovsky, V. Kontorovich and S. Shirokov, Models of Continuous Communication Channels Based on Stochastic Differential Equations, Moscow: Radio i Sviaz, 1984 (in Russian).
6

Markov Processes with Random Structures

6.1 INTRODUCTION
Markov processes with random structure form a sub-class of mixed Markov processes. They can be characterized by significant changes of their ''local''1 probability density considered at different time instants. Changes of the local PDF appear at random. It is assumed here that the number of possible states (i.e. the number of different PDFs) is finite and is described by a discrete component $\vartheta(t)$ with $M$ possible states $\{\theta_i\}$, $i = 1, \ldots, M$. Changes in the properties of a Markov process with random structure can be attributed to changes in the state of this discrete component. In each state, the process can be either continuous or discrete, ''locally'' stationary in a strict or a wide sense, etc. An example of a realization of a process with random structure is shown in Fig. 6.1. Essentially, if no change appears on a given time interval $[t_i, t_{i+1}]$, the process described above can be considered as a piece-wise continuous or a piece-wise discrete random process. Thus for its description it is possible to utilize the well developed mathematical apparatus used to describe continuous and discrete Markov processes. This description should be augmented by assigning a particular index (the state index) to the trajectories of a particular continuous or discrete Markov process in each state, running from $i = 1$ to $i = M$, and providing the moments of time $t_j$ at which switching from one state to another takes place, as shown in Fig. 6.1. Referring to Fig. 6.2, the process $\xi^{(1)}(t)$ exists on the time intervals $[t_0, t_1]$ and $[t_3, t_4]$, the process $\xi^{(2)}(t)$ exists on the time interval $[t_2, t_3]$, the process $\xi^{(3)}(t)$ exists on the time interval $[t_1, t_2]$, etc. It is also assumed that the value of the process $\xi^{(l)}(t)$ is zero if $\vartheta(t) \ne \theta_l$. It can easily be seen that the probability density $p_{\xi,l}(x)$ of such an augmented process is related to the probability density $p^{(l)}(x)$ of the realization considered only in a given state by the following rule

$p_{\xi,l}(x) = P_l\,p^{(l)}(x) + (1 - P_l)\,\delta(x)$   (6.1)

1 The ''local'' properties here refer to the statistical properties (such as mean, variance, PDF, etc.) estimated through the time averaging of a segment of the realization of the process $\xi(t)$ for a fixed value of the switching process $\vartheta(t) = \theta_l$. These properties are assumed to be constant as long as the state of the switching process remains the same, but are allowed to change once the switching process moves to a new state.
Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres © 2004 John Wiley & Sons, Ltd ISBN: 0-470-84741-7
MARKOV PROCESSES WITH RANDOM STRUCTURES
Figure 6.1
A realization of a random process with random structure.
where $P_l$ is the probability of the state $\theta_l$. It is assumed that $\xi^{(l)}(t)$ and the switching process are both stationary. The order of change of the indices is random and is described by the state of the discrete component $\vartheta(t)$ with $M$ possible states $\{\theta_i\}$, $i = 1, \ldots, M$. In what follows it is assumed that the switching process $\vartheta(t)$ is also a Markov process. As a result, an observer would see pieces of trajectories of different random processes, commutated at random. The resulting process is called a Markov process with random structure. It follows from the definition of a process with random structure that it reflects an approach to the description of a non-stationary random process by using locally stationary pieces of random duration [1]. The transition from one stationary piece to another happens according to the statistical properties of the non-stationary process. The following examples emphasize reasons to consider processes with random structure as a useful tool in modeling. A number of physical situations can be found to illustrate the validity of such an approach.
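A minimal numerical sketch of such a process (all transition probabilities and ''local'' Gaussian PDFs below are invented for illustration): a two-state Markov chain $\vartheta(t)$ switches between two Gaussian regimes, and the empirical state occupancies and the mean of the resulting mixture can be checked against the weighting by $P_l$ in eq. (6.1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-state switching component theta(t) modeled as a discrete Markov
# chain; each state selects its own 'local' Gaussian PDF. All numbers
# here are invented for illustration.
P = np.array([[0.99, 0.01],    # per-step transition probabilities
              [0.02, 0.98]])
means, stds = np.array([0.0, 3.0]), np.array([1.0, 0.5])

n, state = 100000, 0
states = np.empty(n, dtype=int)
for i in range(n):
    states[i] = state
    state = rng.choice(2, p=P[state])
xs = rng.normal(means[states], stds[states])

# Stationary occupancies solve pi P = pi: here pi = (2/3, 1/3); the
# overall mean is the mixture mean pi_0*0 + pi_1*3 = 1.
pi = np.array([2.0, 1.0]) / 3.0
print(np.allclose(np.bincount(states) / n, pi, atol=0.05),
      abs(xs.mean() - pi @ means) < 0.15)
```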
Figure 6.2
A realization of the process $\xi(t)$ in a given state.
1. A receiver operating in an environment contaminated by interference and noise (unwanted radiation) from different sources can be considered as the first example. It is assumed that the receiver has bandwidth $B_R$ (or, equivalently, the response time $T_R$). Furthermore, it is assumed that the interference has an energy spectrum with a spread $\Delta\omega_I$ much smaller than the bandwidth of the receiver, $\Delta\omega_I \ll B_R$. Interference from a particular source goes on and off at random. Multiple interference at the input is treated as a sequence of individual interferences, as shown in Fig. 6.3. This problem can be treated as a narrow band process driving a wide band system, thus implying that the response of the receiver to different
Figure 6.3 Illustration of the impact of interference on a receiver. $R_i$ is the receiver's response to the interference $I_i$.
interferers does not overlap, or such overlap is not significant, since the transient time $T_R \approx 1/B_R$ introduced by the receiver itself is much smaller than the duration of the interference. The statistical characteristics of the interferers are different, and since they appear at the output of the receiver at separate moments of time, the output process can be treated as non-stationary. According to the classification suggested by D. Middleton, interference is called Class A interference if the switching between different sources occurs according to the Poisson law [2]. In practice, such situations can be encountered at the base station (wide band receiver) with cross-channel interference from individual mobiles. 2. The second example arises when the conditions on a communication channel vary greatly during one communication session (or frame, block or packet) duration. As an example, the Lutz model [3,4] of a satellite communication link takes into account the fact that during one telephone call the position of a satellite may change significantly enough to change fading from Rician to Rayleigh or vice versa. A proper model may include a number of states, each describing a particular severity of the channel conditions. These conditions may not only affect marginal distributions but also differ in second order statistics, for example in the level crossing rates [see Section 4.6]. 3. It is often important to model speech signals on a computer. Given the specifics of speech formation, one can suggest a model where a set of states reflects particular characteristics of the speech modality. A slow speech detector, i.e. a device which reacts to significant changes in speech behaviour, is modeled as a two (talk–silence) or three (talk–minitalk–silence) state continuous time Markov chain, with random transitions from one state to another [5]. More detailed analysis may track variations in articulation, pitch, instantaneous power, etc.
[6,7], resulting in models with a higher number of states.
4. It is possible to approximate a continuous non-stationary fading channel by a discrete process with random structure. In this case, the states of the channel are chosen based on the required quality of the reception, such as probability of error $P_{err}$, error burst duration, etc. A large class of models found in the literature (see Chapter 8) use the so-called Gilbert–Elliot model of a channel, which can be considered as a particular class of systems with random structure [8,9]. In this case, the channel under study is considered to reside in only one of two possible states: the ''Good'' state with a low probability of error and the ''Bad'' state with a higher probability of error. More details of this model are considered in Section 8.2. Of course, it is possible to provide an even greater number of examples where the suggested approach may be useful. In future discussions, it is assumed that a proper model of the process in each state is available. In other words, it is assumed that proper state variables are chosen, dynamic equations are known, and the parameters of the models are either known or properly estimated, etc. [8,10,11]. It is important to mention here that the concept of the Markov process with random structure can be completely considered in the framework of so-called Hidden Markov Models (HMM) [6], characterized by the following set of characteristics

$\{\mathbf{p}^{(l)}(0);\ \mathbf{T} = [p_{lj}];\ \mathbf{p}^{(l)}(x)\}, \qquad l = 1, \ldots, M$   (6.2)

Here $\mathbf{p}^{(l)}(0)$ is a column vector of initial distributions of the states of the HMM, and $p_{lj}$ are the elements of the transition matrix $\mathbf{T}$, describing the probability of transitions between the states (sub-structures) of the HMM. Finally, $\mathbf{p}^{(l)}(x)$ is a row vector of joint probability densities (of order greater than one in general) in each state. It can be seen from eq. (6.2) that the HMM approach is effective when a particular class of distributions $p^{(l)}(x)$ allows for an effective analysis of the system in each state separately. For example, if each state is described by a Gaussian distribution, this fact significantly broadens the class of problems which can be analyzed analytically based on the knowledge of second order moments. However, if the distributions in each state are non-Gaussian, the multivariate probability densities $p^{(l)}(x)$ are known only for a limited number of cases [12]. This problem can be somewhat simplified if Markov processes (or their components) are adopted as model processes in each of the $M$ states (sub-structures). It is assumed in the following that such models can be adequately represented by continuous Markov processes, and thus can be considered as solutions of certain generating SDEs. As has been discussed earlier, the solution of an SDE can be treated as the response of a dynamic system to random excitation. Extending this view to the case of systems with random structure, it is possible to visualize it as a set of $M$ separate systems, whose outputs are randomly connected (commutated) to an observer. Since such a view refers to a number of separate dynamic sub-systems, such a process is sometimes called a ''polystructure process'', which degenerates to a ''monostructure'' if there is only one state. In addition to the representation of the isolated systems, some description of the commutation process must be given. This is often accomplished by means of the matrix of transition probabilities $\{P_{lj}\}$. If the commutation processes are point processes with different intensities $\nu_{lj}$, then the matrix $\{P_{lj}\}$ can be expressed in terms of the $\nu_{lj}$ and is often referred to as the Q matrix [6,7]. The formation of the output signal is shown in Fig. 6.1.
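The Gilbert–Elliot channel mentioned above can be sketched in a few lines (the transition probabilities and per-state error rates are illustrative, not taken from the text); the long-run bit error rate is the stationary mixture of the two per-state rates:

```python
import numpy as np

rng = np.random.default_rng(3)

# Gilbert-Elliot channel: 'Good' (low error rate) and 'Bad' (high error
# rate) states; transition probabilities and BERs are illustrative.
p_gb, p_bg = 0.001, 0.01          # Good->Bad, Bad->Good (per bit)
ber = np.array([1e-3, 0.1])       # state 0 = Good, state 1 = Bad

n, s, errors = 300000, 0, 0
for _ in range(n):
    errors += rng.random() < ber[s]
    if rng.random() < (p_gb if s == 0 else p_bg):
        s = 1 - s

# Long-run BER = pi_G*ber_G + pi_B*ber_B with pi_B = p_gb/(p_gb + p_bg)
pi_b = p_gb / (p_gb + p_bg)
ber_avg = (1 - pi_b) * ber[0] + pi_b * ber[1]
print(abs(errors / n - ber_avg) / ber_avg < 0.25)
```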
6.2 MARKOV PROCESSES WITH RANDOM STRUCTURE AND THEIR STATISTICAL DESCRIPTION

6.2.1 Processes with Random Structure and Their Classification

This section is devoted to the mathematical description of Markov processes with random structure, similar to the one shown in Fig. 6.4.

Figure 6.4 Formation of a process with random structure.

A sudden transition from the $l$-th state to the $k$-th state can be interpreted as absorption (termination) of the $l$-th state trajectory $\xi^{(l)}(t)$ and regeneration (re-creation) of the $k$-th state trajectory $\xi^{(k)}(t)$. Since, by definition, these trajectories do not overlap in time (the system is in only one state at any given moment of time), one can formally write

$\xi(t) = \sum_{l=1}^{M}\xi^{(l)}(t)$   (6.3)

That is, it is assumed that $\xi^{(l)}(t) = 0$ if the system is in a state different from the $l$-th state. At the instant of time when the state of the system changes from $l$ to $k$, certain initial conditions for the regenerating process $\xi^{(k)}(t)$ must be specified. These conditions, random in general, can be described by certain probability densities

$q_{lk}(x^{(k)}, t\,|\,x^{(l)}, t) \equiv q_{lk}(x, t\,|\,x_0, t)$   (6.4)

conditional on the value of the absorbed process at the switching time, where the transitional density is normalized to unity2

$\int_{-\infty}^{\infty} q_{lk}(x, t\,|\,x_0, t)\,dx = 1$   (6.5)
The switching process $\vartheta(t)$ is an ordinary discrete process without memory, i.e. during a short interval of time $\Delta t$ the probability that $\vartheta(t)$ changes its state more than once is of the order of $(\Delta t)^2$. This assumption implies that the switching process $\vartheta(t)$ is a Markov process.

2 All integrals hereafter are to be treated as integrals over $R^n$ with $n = \max\{n^{(1)}, \ldots, n^{(M)}\}$.
Classification of Markov processes with random structure can be carried out based on the values of the process $\xi(t)$ at which a state change can occur. In particular, the process $\xi(t)$ is called a process with distributed transitions if the transition from the $l$-th state to the $k$-th state may occur at any value of the process $\xi^{(l)}(t)$. If the state change may occur only when the process $\xi^{(l)}(t)$ reaches certain threshold levels, forming a discrete set of values, such a process is called a process with lumped transitions. In the following, mainly processes with distributed transitions are considered.
6.2.2 Statistical Description of Markov Processes with Random Structure

A Markov model can be adopted to describe the process with random structure if the absorption and the regeneration of the trajectories are described as above. In this case, an extended state vector

$\boldsymbol{\zeta} = [\xi, \vartheta(t)]$   (6.6)

is a Markov vector and can be described by means of the marginal probability densities

$p_\zeta(x, l, t) = p^{(l)}(x, t)$   (6.7)

and the transitional PDF

$\pi_\zeta(x_2, l, t_2\,|\,x_1, l, t_1) = \pi^{(l)}(x_2, t_2\,|\,x_1, t_1)$   (6.8)

The PDF (6.8) describes the behaviour of the process under the assumption that there is no change of state (no absorption) on the interval $[t_1, t_2]$. Since the probability densities defined by eq. (6.7) are in fact particular cases of the joint PDF of the extended process $\zeta(t)$, integration over the $x$ variable alone does not produce unity:

$\int_{-\infty}^{\infty} p^{(l)}(x, t)\,dx \le 1 \qquad \text{and} \qquad \int_{-\infty}^{\infty}\pi^{(l)}(x_2, t_2\,|\,x_1, t_1)\,dx_2 \le 1$   (6.9)

This reflects the fact that a change of the current state $l$ may occur in the time interval $[t_1, t_2]$ and a part of the realizations could be absorbed. However, normalization to unity is obtained by summation over all possible states $l$, i.e.

$\sum_{l=1}^{M}\int_{-\infty}^{\infty} p^{(l)}(x, t)\,dx = 1$   (6.10)

and

$\sum_{l=1}^{M}\int_{-\infty}^{\infty}\pi^{(l)}(x_2, t_2\,|\,x_1, t_1)\,dx_2 = 1$   (6.11)
It can be seen that integration of the PDF $p^{(l)}(x, t)$ over $x$ alone produces the probability of the $l$-th state of the switching process $\vartheta(t)$:

$\int_{-\infty}^{\infty} p^{(l)}(x, t)\,dx = P_l(t), \qquad \text{and} \qquad \sum_{l=1}^{M}P_l(t) = 1$   (6.12)

The quantity $P_l(t)$ describes the probability that the trajectory of the process belongs to the $l$-th state and is not absorbed at the moment of time $t$, i.e. $P_l(t) = \mathrm{Prob}\{\xi^{(l)}(t),\ \vartheta = \theta_l\}$. The next step is to consider the conditions which can be imposed on the transitional density $q_{lk}(x_2, t_2\,|\,x_1, t_1)$, describing the process of regeneration of the trajectories. In many cases, it is reasonable to assume that the new trajectory corresponding to the $k$-th state picks up from the level achieved by the absorbed $l$-th trajectory3, i.e.

$q_{lk}(x_2, t_2\,|\,x_1, t_1) = \delta(x_2 - x_1)$   (6.13)

This condition, called continuous regeneration, is illustrated in Fig. 6.5. Of course, there are other ways to define this density; some examples can be found in [1].

Figure 6.5 Jump conditions for a random process with random structure. The regenerated trajectory $\xi_c^{(k)}(t)$ corresponds to the process with continuous regeneration of trajectories, while $\xi^{(k)}(t)$ corresponds to the general case. In the general case, $\mathrm{Prob}\{\xi^{(l)}(t) \in [x_0, x_0 + dx_0],\ \xi^{(k)}(t) \in [x, x + dx]\} = q_{kl}(x, t\,|\,x_0, t)\,dx\,dx_0$.
6.2.3 Generalized Fokker–Planck Equation for Random Processes with Random Structure and Distributed Transitions

It is often required to obtain an equation which describes the evolution of the probability density $p^{(l)}(x, t)$ as a function of time. The discussion in this section is limited to the case of a scalar random process $\xi(t)$ with random structure and distributed transitions.

3 In general, this assumption is not needed for the derivation of the generalized Fokker–Planck or Kolmogorov–Feller equations. However, only this case is treated here in some detail, due to its importance in practical applications. Another important case, often considered due to its relative simplicity, is the case when the initial value of the regenerated trajectory does not depend on the final value of the absorbed trajectory. In this case $q_{kl}(x, t\,|\,x_0, t_0) = p_k(x, t)$.
For an ordinary switching process, which governs the absorption and regeneration of the trajectories, one can write

$P_{lk}(t) = \nu_{lk}(t)\,\Delta t + o(\Delta t^2)$   (6.14)

Here $\nu_{lk}(t)$ is the average number of transitions per unit time (the intensity of transitions) from the $l$-th state into the $k$-th state at the moment of time $t$, and $P_{lk}(t)$ is the probability of a change of the state from $\vartheta = \theta_l$ to $\vartheta = \theta_k$. It is also assumed here that these intensities are independent of the process $\xi(t)$. In addition, the trajectories in each isolated state4 are assumed for now to be diffusion Markov processes, described by a set of Fokker–Planck equations

$\dfrac{\partial}{\partial t}\pi^{(l)}(x, t\,|\,x_0, \tau) = -\dfrac{\partial}{\partial x}\left[K_1^{(l)}(x, t)\,\pi^{(l)}(x, t\,|\,x_0, \tau)\right] + \dfrac{1}{2}\dfrac{\partial^2}{\partial x^2}\left[K_2^{(l)}(x, t)\,\pi^{(l)}(x, t\,|\,x_0, \tau)\right], \qquad l = 1, \ldots, M$   (6.15)

Here $\tau < t$, and $x_0 = \xi(\tau)$ is the value of the process at the given moment of time $\tau$. The drift $K_1^{(l)}(x, t)$ and diffusion $K_2^{(l)}(x, t)$ have the same meaning as for an ordinary Markov diffusion process, as described in Section 5.3.2. For a short interval of time $\Delta t = t - \tau$, the Fokker–Planck equation (6.15) can be rewritten as [13]

$\pi^{(l)}(x, t\,|\,x_0, \tau) = \delta(x - x_0) - \Delta t\,\dfrac{\partial}{\partial x}\left[K_1^{(l)}(x, t)\,\delta(x - x_0)\right] + \dfrac{\Delta t}{2}\,\dfrac{\partial^2}{\partial x^2}\left[K_2^{(l)}(x, t)\,\delta(x - x_0)\right] + o(\Delta t^2)$   (6.16)

Keeping this in mind, it is possible to obtain an equation describing the time evolution of $p^{(l)}(x, t)$, taking into account possible transitions from the $k$-th state to the $l$-th state. During a short interval of time $\Delta t$, a particular realization can either be absorbed, if a switching event occurs, or remain unabsorbed. Given the fact that the switching process is ordinary5, the probability of no switching is then

$P_N = 1 - P_S = 1 - \Delta t\sum_{k=1}^{M}\nu_{lk}(\tau) = 1 - \Delta t\,\nu_l(\tau)$   (6.17)

where

$\nu_l(\tau) = \sum_{k=1}^{M}\nu_{lk}(\tau)$   (6.18)

is the net intensity of transitions (switching) from the state $l$, and $\nu_{ll} = 0$. As has been discussed earlier, the process of regeneration proceeds according to a certain transitional

4 That is, assuming that there is no switching.
5 That is, the probability of more than two switchings in a short time interval $\Delta t$ is of the order of $(\Delta t)^2$ or less.
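Equation (6.17) can be verified by direct sampling: for a memoryless (ordinary) switching process the holding time in state $l$ is exponential with parameter $\nu_l$, so the probability of no switching in a short interval $\Delta t$ is $\exp(-\nu_l\,\Delta t) \approx 1 - \nu_l\,\Delta t$. The values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Memoryless switching: holding times are exponential with rate nu_l,
# so P(no switch in dt) = exp(-nu_l*dt) ~ 1 - nu_l*dt for small dt,
# as in eq. (6.17). Values are illustrative.
nu_l, dt = 3.0, 0.01
holding = rng.exponential(1.0 / nu_l, size=200000)

p_no_switch = np.mean(holding > dt)
print(abs(p_no_switch - (1 - nu_l * dt)) < 0.005)
```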
PDF $q_{kl}(x, t\,|\,x_0, t_0)$, which is now used to relate the probability densities $p^{(l)}(x, t)$ and $p^{(l)}(x_0, t_0)$ as

$p^{(l)}(x, t) \approx \left[1 - \Delta t\,\nu_l(\tau)\right]\int_{-\infty}^{\infty} p^{(l)}(x_0, \tau)\,\pi^{(l)}(x, t\,|\,x_0, \tau)\,dx_0 + \Delta t\sum_{k=1}^{M}\nu_{kl}(\tau)\int_{-\infty}^{\infty} p^{(k)}(x_0, \tau)\,q_{kl}(x, t\,|\,x_0, \tau)\,dx_0$   (6.19)

The first term in this expansion represents the evolution of the trajectory if no switching occurs, while each term under the summation sign represents the evolution of the remaining part of the trajectory after absorption, switching and regeneration. Furthermore, taking into account the approximation (6.16), this equation can be further simplified to the following form

$p^{(l)}(x, t) = p^{(l)}(x, \tau) - \Delta t\,\dfrac{\partial}{\partial x}\left[K_1^{(l)}(x, t)\,p^{(l)}(x, t)\right] + \dfrac{\Delta t}{2}\,\dfrac{\partial^2}{\partial x^2}\left[K_2^{(l)}(x, t)\,p^{(l)}(x, t)\right] - \Delta t\,\nu_l(t)\,p^{(l)}(x, t) + \Delta t\sum_{k=1}^{M}\nu_{kl}(\tau)\int_{-\infty}^{\infty} q_{kl}(x, t\,|\,x_0, \tau)\,p^{(k)}(x_0, \tau)\,dx_0$   (6.20)

Finally, subtracting $p^{(l)}(x, \tau)$ from both sides, dividing both sides of eq. (6.20) by $\Delta t$ and taking the limit as $\Delta t \to 0$, one obtains the so-called generalized Fokker–Planck equation for a process with random structure and distributed transitions:

$\dfrac{\partial}{\partial t}p^{(l)}(x, t) = \lim_{\Delta t\to 0}\dfrac{p^{(l)}(x, t) - p^{(l)}(x, \tau)}{\Delta t} = -\dfrac{\partial}{\partial x}\left[K_1^{(l)}(x, t)\,p^{(l)}(x, t)\right] + \dfrac{1}{2}\dfrac{\partial^2}{\partial x^2}\left[K_2^{(l)}(x, t)\,p^{(l)}(x, t)\right] - \nu_l(t)\,p^{(l)}(x, t) + \sum_{k=1}^{M}\nu_{kl}(t)\int_{-\infty}^{\infty} q_{kl}(x, t\,|\,x_0, t)\,p^{(k)}(x_0, t)\,dx_0$   (6.21)
The same equation can be written for every state $l$ of the switching process; thus eq. (6.21) should be considered as a system of equations. This is in striking contrast to the Fokker–Planck equation for a vector Markov process with continuous components, where the joint PDF satisfies a scalar partial differential equation. Unfortunately, this property also complicates the solution of the generalized Fokker–Planck equation. It should be mentioned here that by varying the form of the transitional density $q(x, t\,|\,x_0, t)$ it is possible to model a great variety of random processes with random structure. Detailed investigation of these possibilities is beyond the scope of this book. More detail can be found in [1]. The generalized eq. (6.21) can also be interpreted in terms of probability currents, as for an ordinary Fokker–Planck equation. Indeed, defining the absorption functions as

$U_l(x, t) = \sum_{k=1}^{M}U_{lk}(x, t) = \sum_{k=1}^{M}\nu_{lk}(t)\,p^{(l)}(x, t)$   (6.22)
and the regeneration functions as

$V_l^k(x, t) = \int_{-\infty}^{\infty}\nu_{kl}(t)\,p^{(k)}(x_0, t)\,q_{kl}(x, t\,|\,x_0, t)\,dx_0, \qquad V_l(x, t) = \sum_{k=1}^{M}V_l^k(x, t)$   (6.23)

the generalized Fokker–Planck equation can be rewritten in the form of conservation of probability currents

$\dfrac{\partial}{\partial t}p^{(l)}(x, t) = -\dfrac{\partial}{\partial x}G^{(l)}(x, t) - U_l(x, t) + V_l(x, t)$   (6.24)

That is, the changes in probability are due to changes in the diffusion current $G^{(l)}(x, t)$, minus the flow of absorption of the trajectories, plus the flow of the regenerated trajectories. This fact can be explained by the absence of dependence between the values of $\xi(t)$ and the moment of switching $t_k$. This also means that the statistical properties of the switching process do not depend on the prehistory of the process $\xi(t)$. In other words, the behaviour of the process before the switching and at the moment of switching are statistically independent. This fact allows one to assume that the structure of the equation for $p^{(l)}(x, t)$ would remain similar for a non-diffusion type of Markov process, such as a Markov process with jumps: the right-hand side of the generalized Fokker–Planck equation will contain a sum of an operator corresponding to each process without switching and an additional term describing switching of trajectories. The same can be said about the generalization of the FPE to the vector case. In this case, the FPE becomes

$\dfrac{\partial}{\partial t}p^{(l)}(\mathbf{x}, t) = -\sum_{i=1}^{n}\dfrac{\partial}{\partial x_i}\left[K_{1\,i}^{(l)}(\mathbf{x}, t)\,p^{(l)}(\mathbf{x}, t)\right] + \dfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\dfrac{\partial^2}{\partial x_i\,\partial x_j}\left[K_{2\,ij}^{(l)}(\mathbf{x}, t)\,p^{(l)}(\mathbf{x}, t)\right] - U_l(\mathbf{x}, t) + V_l(\mathbf{x}, t)$   (6.25)
Here $n = \max\{n_1, \ldots, n_M\}$ and

$U_l(\mathbf{x}, t) = \sum_{k=1}^{M}U_{lk}(\mathbf{x}, t) = \sum_{k=1}^{M}\nu_{lk}(\mathbf{x}, t)\,p^{(l)}(\mathbf{x}, t)$   (6.26)

$V_l(\mathbf{x}, t) = \sum_{k=1}^{M}V_l^k(\mathbf{x}, t) = \sum_{k=1}^{M}\int_{-\infty}^{\infty}\nu_{kl}(\mathbf{x}_0, t)\,q_{kl}(\mathbf{x}, t\,|\,\mathbf{x}_0, t)\,p^{(k)}(\mathbf{x}_0, t)\,d\mathbf{x}_0$   (6.27)

For the specific form of the transitional density $q_{kl}(x, t\,|\,x_0, t)$ given by eq. (6.13), the term representing the regeneration flow in eq. (6.27) can be further simplified to produce

$V_l(\mathbf{x}, t) = \sum_{k=1}^{M}\nu_{kl}(t)\,p^{(k)}(\mathbf{x}, t)$   (6.28)
If, on the other hand, $q_{kl}(x,t\,|\,x_0,t)=q_{kl}(x,t)$, the expression for the same term becomes
$$V_l(x,t)=\sum_{k=1}^{M}V_{kl}(x,t)=\sum_{k=1}^{M}q_{kl}(x,t)\int_{-\infty}^{\infty}\nu_{kl}(x_0,t)\,p^{(k)}(x_0,t)\,dx_0 \tag{6.29}$$
Furthermore, if the switching rates do not depend on the values of the process $x(t)$, i.e. if $\nu_{kl}(x_0,t)=\nu_{kl}(t)$, then eq. (6.29) is simplified even more to produce
$$V_l(x,t)=\sum_{k=1}^{M}q_{kl}(x,t)\,\nu_{kl}(t)\int_{-\infty}^{\infty}p^{(k)}(x_0,t)\,dx_0=\sum_{k=1}^{M}q_{kl}(x,t)\,\nu_{kl}(t)\,P_k(t) \tag{6.30}$$
Finally, if the jump conditions are such that the amount of change of the coordinate is distributed according to some PDF $p_A(a,t)$ which does not depend on the level of the process $x(t)$, then
$$q_{kl}(x,t\,|\,x_0,t)=p_A(x-x_0,t) \tag{6.31}$$
and the regeneration flow is described by
$$V_l(x,t)=\sum_{k=1}^{M}V_{kl}(x,t)=\sum_{k=1}^{M}\int_{-\infty}^{\infty}\nu_{kl}(x_0,t)\,p^{(k)}(x_0,t)\,p_A(x-x_0,t)\,dx_0 \tag{6.32}$$
The next step is to generalize the Fokker–Planck equation to the case of a Markov process with jumps. It is assumed that between two jumps the process is a diffusion Markov process and that the appearance of the jumps is described by a Poisson process with intensity $\nu(t)$. This process can be treated as a process with random structure with two states and distributed transitions. In each state, the statistical properties of the trajectories are the same and coincide with those of the process with jumps. The discontinuity is taken into account by the initial conditions immediately after the jump, which do not coincide with the final conditions of the absorbed trajectory. In other words, the form of the PDF $q_{kl}(x,t\,|\,x_0,t)=q_{lk}(x,t\,|\,x_0,t)=q(x,t\,|\,x_0,t)$ is different from the delta function. For each of the two states the following Fokker–Planck equations can be written
$$\frac{\partial}{\partial t}p^{(l)}(x,t)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[K_{1i}(x,t)p^{(l)}(x,t)\bigr]+\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\bigl[K_{2ij}(x,t)p^{(l)}(x,t)\bigr]-\nu(t)\,p^{(l)}(x,t)+\nu(t)\int_{-\infty}^{\infty}q(x,t\,|\,x_0,t)\,p^{(k)}(x_0,t)\,dx_0 \tag{6.33}$$
where $k,l=1,2$, $k\neq l$. Summing both sides of eq. (6.33) over the index $l$ and taking into account relation (6.18), the well known Kolmogorov–Feller equation can be obtained
$$\frac{\partial}{\partial t}p(x,t)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[a_i(x,t)p(x,t)\bigr]+\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\bigl[b_{ij}(x,t)p(x,t)\bigr]-\nu(t)\,p(x,t)+\nu(t)\int_{-\infty}^{\infty}q(x,t\,|\,x_0,t)\,p(x_0,t)\,dx_0 \tag{6.34}$$
since
$$p(x,t)=p^{(1)}(x,t)+p^{(2)}(x,t) \tag{6.35}$$
is the PDF of the process $x(t)$. If the diffusion (continuous) component is absent, then the Kolmogorov–Feller equation is reduced to
$$\frac{\partial}{\partial t}p(x,t)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[K_{1i}(x,t)p(x,t)\bigr]-\nu(t)\,p(x,t)+\nu(t)\int_{-\infty}^{\infty}q(x,t\,|\,x_0,t)\,p(x_0,t)\,dx_0 \tag{6.36}$$
If the jumps are induced by external $\delta$-pulses with magnitude distribution $p_A(A)$, then $q(x,t\,|\,x_0,t)=p_A(x-x_0)$. All of the above can be generalized to the vector case when the switching is performed between processes with jumps and no diffusion component
$$\frac{\partial}{\partial t}p^{(l)}(x,t)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[K_{1i}(x)p^{(l)}(x,t)\bigr]-\nu^{(l)}(t)\,p^{(l)}(x,t)+\nu^{(l)}(t)\int_{-\infty}^{\infty}p^{(l)}(x-A,t)\,p_A(A)\,dA+V_l(x,t)-U_l(x,t) \tag{6.37}$$
The elements $u_{lk}(x,t)$ and $v_{lk}(x,t)$ can be organized into the following matrices
$$U(x,t)=\begin{bmatrix}0 & u_{12}(x,t) & \ldots & u_{1M}(x,t)\\ \ldots & \ldots & \ldots & \ldots\\ u_{M1}(x,t) & \ldots & \ldots & 0\end{bmatrix} \tag{6.38}$$
$$V(x,t)=\begin{bmatrix}0 & v_{12}(x,t) & \ldots & v_{1M}(x,t)\\ \ldots & \ldots & \ldots & \ldots\\ v_{M1}(x,t) & \ldots & \ldots & 0\end{bmatrix} \tag{6.39}$$
known as the absorption and regeneration matrices. It can be easily seen that if the transitional density $q(x,t\,|\,x_0,t)$ is a delta function, as in eq. (6.13), then the absorption and regeneration functions are related by
$$V(x,t)=U^{T}(x,t) \tag{6.40}$$
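As a quick numerical illustration (not an example from the text): for a scalar process obeying the Kolmogorov–Feller eq. (6.36) with linear drift $K_1(x)=-x/\tau$ and jumps of mean amplitude $\mathrm E[A]$ arriving at rate $\nu$, multiplying both sides by $x$ and integrating gives $\dot m_1=-m_1/\tau+\nu\,\mathrm E[A]$, so the stationary mean is $\nu\tau\,\mathrm E[A]$. A minimal sketch, with illustrative parameter values:

```python
# Sanity check of the first-moment ODE implied by eq. (6.36) for linear
# drift K1(x) = -x/tau and Poisson jumps of rate nu with mean amplitude mA:
#   dm/dt = -m/tau + nu*mA,  stationary mean  m = nu*tau*mA.
tau, nu, mA = 0.5, 2.0, 0.3   # illustrative parameters
dt, m = 1e-4, 0.0             # Euler integration from m(0) = 0
for _ in range(200_000):      # integrate to t = 20 (40 time constants)
    m += dt * (-m / tau + nu * mA)

m_stationary = nu * tau * mA  # closed-form stationary mean
assert abs(m - m_stationary) < 1e-6
```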
The next step is to describe the evolution of the probabilities of the states $P_l(t)$ as defined by eq. (6.12). It is assumed that the intensities of the transitions do not depend on time $t$ or on the current state $x$ of the process, i.e. $\nu_{kl}(t)=\nu_{kl}=\mathrm{const}$. This assumption is not very restrictive from an application point of view and it significantly simplifies the analysis of the problem. Having this in mind and integrating both sides of eq. (6.25) over $x$, one obtains an equation for the time evolution of the probabilities of the states as
$$\dot P_l(t)=-\nu_l P_l(t)+\sum_{k=1}^{M}\nu_{kl}P_k(t) \tag{6.41}$$
This is consistent with the fact that $\vartheta(t)$ is a continuous time Markov chain. If $P_l(t)$ is known, then the conditional probability density $\tilde p^{(l)}(x,t)$ of the process in a given state can be expressed as
$$\tilde p^{(l)}(x,t)=\frac{p^{(l)}(x,t)}{P_l(t)} \tag{6.42}$$
Substituting expression (6.42) into the generalized Fokker–Planck eqs. (6.25) and (6.27), it is possible to obtain the Fokker–Planck equation for the conditional probability densities $\tilde p^{(l)}(x,t)$ of the process in a given state
$$\frac{\partial}{\partial t}\tilde p^{(l)}(x,t)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[K^{(l)}_{1i}(x)\tilde p^{(l)}(x,t)\bigr]+\frac12\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\bigl[K^{(l)}_{2ij}(x)\tilde p^{(l)}(x,t)\bigr]-\frac{d}{dt}\ln P_l(t)\,\tilde p^{(l)}(x,t)+\sum_{k=1}^{M}\frac{P_k(t)}{P_l(t)}\bigl[\tilde v_{kl}(x,t)-\tilde u_{kl}(x,t)\bigr] \tag{6.43}$$
where
$$\tilde v_{kl}(x,t)=\frac{v_{kl}(x,t)}{P_k(t)},\qquad \tilde u_{kl}(x,t)=\frac{u_{kl}(x,t)}{P_k(t)} \tag{6.44}$$
are the normalized regeneration and absorption flows. A similar generalization can be made for the Kolmogorov–Feller eq. (6.37), which becomes
$$\frac{\partial}{\partial t}\tilde p^{(l)}(x,t)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[a^{(l)}_i(x)\tilde p^{(l)}(x,t)\bigr]-\nu^{(l)}(t)\,\tilde p^{(l)}(x,t)+\nu^{(l)}(t)\int_{-\infty}^{\infty}\tilde p^{(l)}(x-A,t)\,p_A(A)\,dA-\frac{d}{dt}\ln P_l(t)\,\tilde p^{(l)}(x,t)+\sum_{k=1}^{M}\frac{P_k(t)}{P_l(t)}\bigl[\tilde v_{kl}(x,t)-\tilde u_{kl}(x,t)\bigr] \tag{6.45}$$
The transitional density $\pi^{(l)}(x,t+\tau\,|\,x_0,t)$ also satisfies eqs. (6.43)–(6.45) (as a function of $x$ and $t$) with the initial condition
$$\pi^{(l)}(x,t\,|\,x_0,t)=\delta(x-x_0) \tag{6.46}$$
Assuming that the resulting process is stationary, and multiplying both sides of eq. (6.43) by $\tilde p^{(l)}(x_0,t_0)$, the joint conditional PDF $p_2^{(l)}(x,x_0,\tau)$ considered at two moments of time must satisfy the following equation
$$\frac{\partial}{\partial t}p_2^{(l)}(x,x_0,\tau)=-\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\bigl[K^{(l)}_{1i}(x)p_2^{(l)}(x,x_0,\tau)\bigr]+\frac12\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\bigl[K^{(l)}_{2ij}(x)p_2^{(l)}(x,x_0,\tau)\bigr]-\nu_l\,p_2^{(l)}(x,x_0,\tau)+\sum_{k=1}^{M}\nu_{kl}\frac{P_k}{P_l}\,\tilde p^{(k)}(x)\,\tilde p^{(l)}(x_0)-\frac{d}{dt}\ln P_l(t)\,p_2^{(l)}(x,x_0,\tau) \tag{6.47}$$
To shorten the notation, the equations can be rewritten in the operator form
$$\frac{\partial}{\partial t}p_2^{(l)}(x,x_0,\tau)=L\bigl[p_2^{(l)}(x,x_0,\tau)\bigr] \tag{6.48}$$
where the notation $L[\,\cdot\,]$ is used to denote the right-hand side of eq. (6.47). The operator conjugate⁶ to $L[\,\cdot\,]$ is denoted $M[\,\cdot\,]$:
$$M\bigl[p_2^{(l)}(x,x_0,\tau)\bigr]=\sum_{i=1}^{n}K^{(l)}_{1i}(x)\frac{\partial}{\partial x_i}p_2^{(l)}(x,x_0,\tau)+\frac12\sum_{i=1}^{n}\sum_{j=1}^{n}K^{(l)}_{2ij}(x)\frac{\partial^2}{\partial x_i\,\partial x_j}p_2^{(l)}(x,x_0,\tau)-\nu_l\,p_2^{(l)}(x,x_0,\tau)+\sum_{k=1}^{M}\nu_{kl}\frac{P_k}{P_l}\,\tilde p^{(k)}(x)\,\tilde p^{(l)}(x_0)-\frac{d}{dt}\ln P_l(t)\,p_2^{(l)}(x,x_0,\tau) \tag{6.49}$$
In order to solve the system of equations (6.45), proper initial and boundary conditions must be specified. All the equations must be solved simultaneously, including the equations for the evolution of the switching process. Given that all the equations are interdependent, obtaining an analytical solution remains an open question. However, there are two limiting cases when a qualitative and quantitative analysis can be performed. In particular, this includes the cases of very slow and very fast switching, treated in [14]. The distinction between fast and slow switching can be formally expressed in terms of the product of the intensity of switching $\nu_l$ and the characteristic time $\tau_s^{(l)}$ of an isolated system in the $l$-th state. A switching process is considered slow if
$$\max_l\{\nu_l\tau_s^{(l)}\}\ll 1 \tag{6.50}$$
Similarly, the switching is considered to be fast if
$$\max_l\{\nu_l\tau_s^{(l)}\}\gg 1 \tag{6.51}$$
It can be seen from eqs. (6.20)–(6.21) that the terms containing the intensities of the transitions are small compared to the diffusion terms if the intensity of switching is low. The interdependence between the states will also become negligible. In other words, it is reasonable to expect that the conditional density describing the evolution of the process in a given state differs little from that of the isolated process in the same state. In the fast switching case, the particular contribution of the internal dynamics of the individual states becomes negligible.

6.2.4 Moment and Cumulant Equations of a Markov Process with Random Structure
Without loss of generality, equations for the evolution of the moments of a process with random structure can be derived for the scalar case. Using the same technique as in Section 4.4 and the

⁶ This implies that $\int_{-\infty}^{\infty}f_1(x)\,L\bigl[p_2^{(l)}(x,x_0,\tau)\bigr]\,dx=\int_{-\infty}^{\infty}p_2^{(l)}(x,x_0,\tau)\,M[f_1(x)]\,dx$.
following notation
$$\langle f(x,t)\rangle_l=\int_{-\infty}^{\infty}f(x,t)\,\tilde p^{(l)}(x,t)\,dx \tag{6.52}$$
the time derivative of the average of the function $f(x,t)$ in the $l$-th state can be written as
$$\frac{\partial}{\partial t}\langle f(x,t)\rangle_l=\int_{-\infty}^{\infty}\tilde p^{(l)}(x,t)\,M^{(l)}[f(x,t)]\,dx-\nu_l(t)\langle f(x,t)\rangle_l+\sum_{k=1}^{M}\nu_{kl}(t)\frac{P_k(t)}{P_l(t)}\langle f(x,t)\rangle_k-\frac{d}{dt}\ln P_l(t)\,\langle f(x,t)\rangle_l \tag{6.53}$$
or, equivalently,
$$\frac{\partial}{\partial t}\langle f(x,t)\rangle_l=\sum_{n=1}^{\infty}\frac{1}{n!}\Bigl\langle K_n^{(l)}(x,t)\frac{\partial^n}{\partial x^n}f(x,t)\Bigr\rangle_l-\Bigl[\nu_l+\frac{\dot P_l(t)}{P_l(t)}\Bigr]\langle f(x,t)\rangle_l+\sum_{k=1}^{M}\nu_{kl}\frac{P_k(t)}{P_l(t)}\langle f(x,t)\rangle_k \tag{6.54}$$
Setting $f(x)=x^s$ in eq. (6.54) allows one to obtain evolution equations for the moments of the distribution in each state
$$\dot m_s^{(l)}(t)=s\,\langle x^{s-1}K_1^{(l)}(x)\rangle+\frac{s(s-1)}{2}\langle x^{s-2}K_2^{(l)}(x)\rangle-\Bigl[\nu_l(t)+\frac{d}{dt}\ln P_l(t)\Bigr]m_s^{(l)}(t)+\sum_{k=1}^{M}\nu_{kl}(t)\frac{P_k(t)}{P_l(t)}m_s^{(k)}(t) \tag{6.55}$$
It can be seen that the structure of the moment equation (6.55) coincides with the structure of the corresponding Fokker–Planck and Kolmogorov–Feller equations for the conditional probability densities of the processes in each state. This is not surprising, since the functions $f(x)=x^k$ can be directly brought inside the Fokker–Planck or Kolmogorov–Feller operators⁷. Equation (6.55) is to be solved using the initial conditions
$$m_s^{(l)}(t)\big|_{t=t_0}=m_{s0}^{(l)} \tag{6.56}$$
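As an illustration of the structure of eq. (6.55) (not an example from the text), consider a single isolated state with Ornstein–Uhlenbeck drift and diffusion, $K_1(x)=-(x-m)/\tau$ and $K_2(x)=2\sigma^2/\tau$, and no switching. Then eq. (6.55) gives two coupled linear ODEs whose stationary values must be $m_1=m$ and $m_2=m^2+\sigma^2$. A minimal numerical sketch with illustrative parameters:

```python
# Moment equations (6.55) for one isolated Ornstein-Uhlenbeck state,
# no switching:  K1(x) = -(x - m)/tau,  K2(x) = 2*sigma**2/tau, so
#   dm1/dt = -(m1 - m)/tau
#   dm2/dt = -2*(m2 - m*m1)/tau + 2*sigma**2/tau
# The stationary values must be m1 = m and m2 = m**2 + sigma**2.
m, tau, sigma = 1.5, 0.4, 0.8     # illustrative parameters
m1, m2 = 0.0, 0.0                 # start from zero initial moments
dt = 1e-4
for _ in range(100_000):          # integrate to t = 10 (25 time constants)
    dm1 = -(m1 - m) / tau
    dm2 = -2.0 * (m2 - m * m1) / tau + 2.0 * sigma**2 / tau
    m1, m2 = m1 + dt * dm1, m2 + dt * dm2

assert abs(m1 - m) < 1e-6
assert abs(m2 - (m**2 + sigma**2)) < 1e-5
```

Note that, as the text observes, the equation for each moment only involves moments of equal or lower order, so the system can be integrated sequentially.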
All the equations must be solved simultaneously and jointly with the corresponding Chapman eq. (6.41) governing the probabilities of the states. Thus, this procedure is similar to that for the ordinary Fokker–Planck or Kolmogorov–Feller equations. It is important to mention that the structure of the equations, with the terms belonging to the partial process in the $l$-th state added to the terms reflecting absorption and regeneration, is to a certain degree the simplest possible structure accounting for both the internal dynamics of each isolated system and the interaction between different states.

⁷ Assuming that $\lim_{x\to a}x^k p(x)=0$ for $a=0$ or $a=\infty$.
It can be seen that unless
$$K_1^{(l)}(x)=a_0^{(l)}+a_1^{(l)}x \quad\text{and}\quad K_2^{(l)}(x)=b_0^{(l)}+b_1^{(l)}x+b_2^{(l)}x^2 \tag{6.57}$$
the system of moment equations is not closed and its solution, therefore, is impossible. The Pearson system of distributions⁸ in the isolated states, defined by equality (6.57), results in the following set of moment equations
$$\dot m_s^{(l)}(t)=s\bigl[a_1^{(l)}m_s^{(l)}(t)+a_0^{(l)}m_{s-1}^{(l)}(t)\bigr]+\frac{s(s-1)}{2}\bigl[b_2^{(l)}m_s^{(l)}(t)+b_1^{(l)}m_{s-1}^{(l)}(t)+b_0^{(l)}m_{s-2}^{(l)}(t)\bigr]-\Bigl[\nu_l(t)+\frac{d}{dt}\ln P_l(t)\Bigr]m_s^{(l)}(t)+\sum_{k=1}^{M}\nu_{kl}(t)\frac{P_k(t)}{P_l(t)}m_s^{(k)}(t) \tag{6.58}$$
It should be pointed out that the equation for $m_s^{(l)}$ contains only moments of equal or lower order, thus the system of linear eqs. (6.58) can be solved in a sequential manner, i.e. first one can obtain expressions for the mean values $m_1^{(l)}$, then for the second order moments, etc.

Example: evolution of the mean value of an exponentially correlated random process. If all the random processes in the isolated states are exponentially correlated random processes, then their drifts are linear functions⁹
$$K_1^{(l)}(x)=-\frac{x-m_l}{\tau_{s,l}} \tag{6.59}$$
where $m_l$ and $\tau_{s,l}$ are the mean value and the correlation interval of the process in the $l$-th isolated state. If the switching process achieves a stationary state, the probabilities $P_l(t)$ do not depend on time and, thus, the evolution equations for the mean values become
$$\dot m_1^{(l)}=-\Bigl[\frac{1}{\tau_{s,l}}+\nu_l\Bigr]m_1^{(l)}(t)+\sum_{k=1}^{M}\nu_{kl}\frac{P_k}{P_l}m_1^{(k)}(t)+\frac{m_l}{\tau_{s,l}} \tag{6.60}$$
or, in matrix form,
$$\dot{\mathbf m}_1(t,\nu)=C\,\mathbf m_1(t,\nu)+T\,\mathbf m_0 \tag{6.61}$$
where
$$C=\begin{bmatrix}-\Bigl(\nu_1+\dfrac{1}{\tau_{s,1}}\Bigr) & \nu_{21}\dfrac{P_2}{P_1} & \ldots & \nu_{M1}\dfrac{P_M}{P_1}\\[1.2ex] \nu_{12}\dfrac{P_1}{P_2} & -\Bigl(\nu_2+\dfrac{1}{\tau_{s,2}}\Bigr) & \ldots & \nu_{M2}\dfrac{P_M}{P_2}\\ \ldots & \ldots & \ldots & \ldots\\ \nu_{1M}\dfrac{P_1}{P_M} & \nu_{2M}\dfrac{P_2}{P_M} & \ldots & -\Bigl(\nu_M+\dfrac{1}{\tau_{s,M}}\Bigr)\end{bmatrix} \tag{6.62}$$

⁸ See Section 2.1.1.3.
⁹ See Section 7.2.4.1 for details.
and
$$\mathbf m_0=[m_1\;m_2\;\cdots\;m_M]^T,\qquad T=\mathrm{diag}\{1/\tau_{s,1},\ldots,1/\tau_{s,M}\} \tag{6.63}$$
Assuming that the initial conditions for the system of eqs. (6.61) are given by the stationary moments in the isolated states, i.e.
$$\mathbf m_1(0)=\mathbf m_0 \tag{6.64}$$
and that the matrix $C$ is invertible, one can easily obtain the solution of the system of equations as
$$\mathbf m_1(t,\nu)=-\bigl(I-e^{Ct}\bigr)C^{-1}T\mathbf m_0+e^{Ct}\mathbf m_1(0) \tag{6.65}$$
The stationary vector of mean values is thus given by
$$\mathbf m_1(\nu)=\lim_{t\to\infty}\mathbf m_1(t)=-C^{-1}T\mathbf m_0 \tag{6.66}$$
if the eigenvalues of $C$ are all negative. Since the time needed for the mean value to reach its steady state is the largest among all the moments, and it is of the order of the correlation interval, the smallest non-zero eigenvalue $\lambda_1$ of the matrix $C$ defines the characteristic time of a system with random structure. Some additional details about the behaviour of the moments can be extracted by considering the case of only two equations, $M=2$, with $P_2+P_1=1$, $\nu_{21}=\nu P_1$, $\nu_{12}=\nu P_2$. In this case
$$C=\begin{bmatrix}-\Bigl(\nu P_2+\dfrac{1}{\tau_{s1}}\Bigr) & \nu P_2\\[1.2ex] \nu P_1 & -\Bigl(\nu P_1+\dfrac{1}{\tau_{s2}}\Bigr)\end{bmatrix} \quad\text{and}\quad -C^{-1}=\frac{\tau_{s1}\tau_{s2}}{1+\nu(P_1\tau_{s2}+P_2\tau_{s1})}\begin{bmatrix}\dfrac{1}{\tau_{s2}}+\nu P_1 & \nu P_2\\[1.2ex] \nu P_1 & \dfrac{1}{\tau_{s1}}+\nu P_2\end{bmatrix} \tag{6.67}$$
i.e. the matrix $C$ is always invertible and $-C^{-1}$ has only positive entries. As a result, the steady-state values of the moments always exist and are given by the following expression
$$\mathbf m_1(\nu)=-C^{-1}T\mathbf m_0=\frac{1}{1+\nu(P_1\tau_{s2}+P_2\tau_{s1})}\begin{bmatrix}(1+\nu\tau_{s2}P_1)m_1+\nu\tau_{s1}P_2m_2\\ \nu\tau_{s2}P_1m_1+(1+\nu\tau_{s1}P_2)m_2\end{bmatrix} \tag{6.68}$$
If no switching appears, i.e. $\nu=0$, then $\mathbf m_1(0)=\mathbf m_0$, which is consistent with the fact that $\mathbf m_0$ represents the steady-state solution of the isolated systems. If the switching intensity is so large that $\min\{\nu\tau_{s2}P_1,\nu\tau_{s1}P_2\}\gg1$, then
$$\mathbf m_1(\infty)=m_{1\infty}\begin{bmatrix}1\\1\end{bmatrix},\qquad m_{1\infty}=\frac{P_1\tau_{s2}m_1+P_2\tau_{s1}m_2}{P_1\tau_{s2}+P_2\tau_{s1}} \tag{6.69}$$
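The closed-form entries of eq. (6.68) can be cross-checked against the matrix expression (6.66) numerically. A small sketch (all parameter values are illustrative):

```python
import numpy as np

# Stationary mean values of two switched OU states, eqs. (6.66)-(6.68).
P1, P2 = 0.3, 0.7
t1, t2 = 0.2, 0.3            # correlation intervals tau_s1, tau_s2
m_1, m_2 = 0.0, 2.0          # isolated-state means
nu = 1.5                     # switching intensity

C = np.array([[-(nu * P2 + 1.0 / t1),  nu * P2],
              [  nu * P1,             -(nu * P1 + 1.0 / t2)]])
T = np.diag([1.0 / t1, 1.0 / t2])
m0 = np.array([m_1, m_2])

m_inf = -np.linalg.inv(C) @ T @ m0          # eq. (6.66)

D = 1.0 + nu * (P1 * t2 + P2 * t1)          # denominator of eq. (6.68)
m_closed = np.array([(1 + nu * t2 * P1) * m_1 + nu * t1 * P2 * m_2,
                     nu * t2 * P1 * m_1 + (1 + nu * t1 * P2) * m_2]) / D
assert np.allclose(m_inf, m_closed)         # (6.68) matches -C^{-1} T m0
```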
i.e. the mean values are equal in both states¹⁰. In order to understand how the mean values behave for a finite value of $\nu$, one can rewrite eq. (6.68) as
$$\mathbf m_1(\nu)=\frac{1}{1+\nu(P_1\tau_{s2}+P_2\tau_{s1})}\,\mathbf m_0+\frac{\nu(P_1\tau_{s2}+P_2\tau_{s1})}{1+\nu(P_1\tau_{s2}+P_2\tau_{s1})}\,\mathbf m_{1\infty}=w_1(\nu)\,\mathbf m_0+[1-w_1(\nu)]\,\mathbf m_{1\infty} \tag{6.70}$$
Thus, for a finite value of $\nu$, the first moments are weighted sums of the moments in the isolated states and the values of the same moments in the case of infinite intensity of switching. The weighting function is given by
$$0\le w_1(\nu)=\frac{1}{1+\nu(P_1\tau_{s2}+P_2\tau_{s1})}\le 1 \tag{6.71}$$
and is the same for both sub-systems. Expression (6.71) also provides a true measure of the intensity of switching: it is slow if
$$\nu(P_1\tau_{s2}+P_2\tau_{s1})\ll 1 \tag{6.72}$$
and fast if
$$\nu(P_1\tau_{s2}+P_2\tau_{s1})\gg 1 \tag{6.73}$$
Condition (6.72) is equivalent to that used earlier, while condition (6.73) is much more conservative and depends on the probability of each state.

Example: evolution of second moments in the case of two linear SDEs. If in the isolated states both processes are Gaussian, they can be described by the following drifts and diffusions
$$K_1^{(1)}(x)=-\frac{x-m_1}{\tau_{s1}},\qquad K_1^{(2)}(x)=-\frac{x-m_2}{\tau_{s2}},\qquad K_2^{(1)}=\frac{2\sigma_1^2}{\tau_{s1}},\qquad K_2^{(2)}=\frac{2\sigma_2^2}{\tau_{s2}} \tag{6.74}$$
The evolution of the first moments has already been discussed in the previous example and is given by eq. (6.65). A differential equation for the second moments can then be written as
$$\dot{\mathbf m}_2(t,\nu)=C_2\,\mathbf m_2(t,\nu)+2\,TM\,\mathbf m_1(t)+2\,TR \tag{6.75}$$
where
$$\mathbf m_2(t)=\begin{bmatrix}m_2^{(1)}(t)\\ m_2^{(2)}(t)\end{bmatrix},\qquad M=\begin{bmatrix}m_1 & 0\\ 0 & m_2\end{bmatrix},\qquad C_2=\begin{bmatrix}-\Bigl(\dfrac{2}{\tau_{s1}}+\nu P_2\Bigr) & \nu P_2\\[1.2ex] \nu P_1 & -\Bigl(\dfrac{2}{\tau_{s2}}+\nu P_1\Bigr)\end{bmatrix},\qquad R=\begin{bmatrix}\sigma_1^2 & 0\\ 0 & \sigma_2^2\end{bmatrix} \tag{6.76}$$

¹⁰ This fact will be discussed in detail in Section 6.3.
with the other quantities having the same meaning as in the previous example. The stationary solution $\mathbf m_2(\nu)$ of this equation can be found by setting the right-hand side to zero and using the stationary values of the first moments, i.e. $\mathbf m_2(\nu)$ satisfies the following equation
$$C_2\,\mathbf m_2(\nu)+2\,TM\,\mathbf m_1(\nu)+2\,TR=0 \tag{6.77}$$
which can be solved to produce
$$\mathbf m_2(\nu)=-2C_2^{-1}TM\,\mathbf m_1(\nu)-2C_2^{-1}TR \tag{6.78}$$
It can be easily shown that
$$-2C_2^{-1}=\frac{\tau_{s1}\tau_{s2}}{2+\nu(P_1\tau_{s2}+P_2\tau_{s1})}\begin{bmatrix}\dfrac{2}{\tau_{s2}}+\nu P_1 & \nu P_2\\[1.2ex] \nu P_1 & \dfrac{2}{\tau_{s1}}+\nu P_2\end{bmatrix}=w_2(\nu)\,C_{20}^{-1}+[1-w_2(\nu)]\,C_{2\infty}^{-1} \tag{6.79}$$
where
$$w_2(\nu)=\frac{1}{1+\dfrac{\nu}{2}(P_1\tau_{s2}+P_2\tau_{s1})} \tag{6.80}$$
and
$$C_{20}^{-1}=\begin{bmatrix}\tau_{s1} & 0\\ 0 & \tau_{s2}\end{bmatrix},\qquad C_{2\infty}^{-1}=\frac{\tau_{s1}\tau_{s2}}{P_1\tau_{s2}+P_2\tau_{s1}}\begin{bmatrix}P_1 & P_2\\ P_1 & P_2\end{bmatrix} \tag{6.81}$$
It can be noted that the matrix $C_2$ is always invertible. Finally, the stationary solution can be represented as
$$\mathbf m_2(\nu)=\bigl(w_2(\nu)C_{20}^{-1}+[1-w_2(\nu)]C_{2\infty}^{-1}\bigr)TM\bigl(w_1(\nu)\mathbf m_0+[1-w_1(\nu)]\mathbf m_{1\infty}\bigr)+\bigl(w_2(\nu)C_{20}^{-1}+[1-w_2(\nu)]C_{2\infty}^{-1}\bigr)TR$$
$$=w_2(\nu)w_1(\nu)\,C_{20}^{-1}TM\mathbf m_0+w_2(\nu)[1-w_1(\nu)]\,C_{20}^{-1}TM\mathbf m_{1\infty}+w_1(\nu)[1-w_2(\nu)]\,C_{2\infty}^{-1}TM\mathbf m_0+[1-w_1(\nu)][1-w_2(\nu)]\,C_{2\infty}^{-1}TM\mathbf m_{1\infty}+w_2(\nu)\,C_{20}^{-1}TR+[1-w_2(\nu)]\,C_{2\infty}^{-1}TR$$
Without further calculations one can deduce the nature of the dependence of the second moments on the parameter $\nu$. Since $w_1(0)=w_2(0)=1$ and $w_1(\infty)=w_2(\infty)=0$, one obtains
$$\mathbf m_2(0)=C_{20}^{-1}TM\mathbf m_0+C_{20}^{-1}TR=\begin{bmatrix}m_1^2+\sigma_1^2\\ m_2^2+\sigma_2^2\end{bmatrix} \tag{6.82}$$
$$\mathbf m_2(\infty)=C_{2\infty}^{-1}TM\mathbf m_{1\infty}+C_{2\infty}^{-1}TR=\Bigl[m_{1\infty}^2+\frac{P_1\sigma_1^2\tau_{s2}+P_2\sigma_2^2\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}\Bigr]\begin{bmatrix}1\\1\end{bmatrix} \tag{6.83}$$
It can be noted that, for intermediate values of $\nu$, the second moments depend on the two weights $w_1(\nu)$ and $w_2(\nu)$, as well as on additional cross terms which cannot be attributed either to the isolated dynamics or to the properties of the system at an infinite rate of switching. This explains why it is relatively difficult to analyze the case of intermediate values of $\nu$.

Equations for the cumulants of the marginal PDF can be obtained in a way very similar to that shown in Section 4.4. However, taking into account that the moments and the cumulants are related through a non-linear transformation, it is impossible to retain the minimal structure of such equations [15]. This conjecture can be illustrated by the following examples. Since $\kappa_1^{(l)}=m_1^{(l)}$, the structure of the cumulant equation for the first order cumulant is preserved. However, for the second cumulant one can write
$$\kappa_2^{(l)}=m_2^{(l)}-\bigl[m_1^{(l)}\bigr]^2,\qquad \dot\kappa_2^{(l)}(t)=\dot m_2^{(l)}(t)-2\,\dot m_1^{(l)}(t)\,m_1^{(l)}(t) \tag{6.84}$$
Thus, the evolution equation for the second cumulant becomes
$$\dot\kappa_2^{(l)}=2\langle xK_1^{(l)}(x)\rangle-2\langle x\rangle_l\langle K_1^{(l)}(x)\rangle+\langle K_2^{(l)}(x)\rangle-\Bigl[\nu_l+\frac{d}{dt}\ln P_l(t)\Bigr]\bigl(m_2^{(l)}-2\bigl[m_1^{(l)}\bigr]^2\bigr)+\sum_{k=1}^{M}\nu_{kl}\frac{P_k(t)}{P_l(t)}\bigl(m_2^{(k)}-2m_1^{(k)}m_1^{(l)}\bigr) \tag{6.85}$$
Formally, eq. (6.85) can be rewritten as
$$\dot\kappa_2^{(l)}=2\langle x,K_1^{(l)}(x)\rangle+\langle K_2^{(l)}(x)\rangle-\Bigl[\nu_l+\frac{d}{dt}\ln P_l(t)\Bigr]\bigl(\kappa_2^{(l)}-\bigl[\kappa_1^{(l)}\bigr]^2\bigr)+\sum_{k=1}^{M}\nu_{kl}\frac{P_k(t)}{P_l(t)}\bigl[\kappa_2^{(k)}+\kappa_1^{(k)}\bigl(\kappa_1^{(k)}-2\kappa_1^{(l)}\bigr)\bigr] \tag{6.86}$$
However, its structure is different from the typical one. The situation becomes even more complicated for the third cumulant, since stronger non-linearities appear in the expressions relating the moments and cumulants of order three
$$\kappa_3=m_3-3m_1m_2+2m_1^3,\qquad \dot\kappa_3=\dot m_3-3\dot m_1m_2-3m_1\dot m_2+6m_1^2\dot m_1 \tag{6.87}$$
and the corresponding equation for $\kappa_3^{(l)}(t)$ becomes intractable. Thus, it can be concluded that the investigation of systems with random structure is better conducted either by using a Gaussian approximation or by obtaining the first four moments through eq. (6.55). The former requires a simultaneous solution of non-linear equations for $\kappa_1^{(l)}$, $\kappa_2^{(l)}$, $m_1^{(l)}$ and $m_2^{(l)}$ with the corresponding initial conditions. The latter is well known to produce a significant error, since reducing the number of equations by setting higher order moments equal to zero is not consistent with the fact that the magnitude of even higher order moments increases¹¹. Thus it seems difficult to obtain productive methods of investigating processes with random structure through cumulant analysis. This fact clearly distinguishes processes with random structure from regular Markov processes, where the description using lower order cumulants is a simple and effective means of analysis of non-linear systems.

¹¹ This statement is not true if a closed system of equations can be obtained. One such case is the Pearson system discussed above.
6.3 APPROXIMATE SOLUTION OF THE GENERALIZED FOKKER–PLANCK EQUATIONS

In this section, the question of the approximate solution of a system of generalized Fokker–Planck equations
$$\frac{\partial}{\partial t}p^{(l)}(x,t)=-\frac{\partial}{\partial x}\bigl[K_1^{(l)}(x)p^{(l)}(x,t)\bigr]+\frac{1}{2}\frac{\partial^2}{\partial x^2}\bigl[K_2^{(l)}(x)p^{(l)}(x,t)\bigr]-\nu_l\,p^{(l)}(x,t)+\sum_{k\neq l}^{M}\nu_{kl}\,p^{(k)}(x,t) \tag{6.88}$$
is considered. In particular, the dependence of the solution on the intensity parameters $\nu_{kl}$ is of interest here. Each isolated system ($\nu=0$ and no switching) produces a stationary solution of the form
$$p_{0\,st}^{(l)}(x)=\frac{C_l}{K_2^{(l)}(x)}\exp\Bigl[2\int^{x}\frac{K_1^{(l)}(s)}{K_2^{(l)}(s)}\,ds\Bigr] \tag{6.89}$$
and a transient solution of the form¹²
$$p_0^{(l)}(x,t)=p_{0\,st}^{(l)}(x)\Bigl[1+\sum_{n=1}^{\infty}\gamma_n e^{-\lambda_n t}\varphi_n(x)\Bigr] \tag{6.90}$$
Assuming that the switching process $\vartheta(t)$ of zero intensity is a stationary process, the stationary solution of eq. (6.88) in the isolated states is given by
$$p_{st}^{(l)}(x)=P_l\,p_{0\,st}^{(l)}(x) \tag{6.91}$$
$$p^{(l)}(x,t)=P_l\,p_{0\,st}^{(l)}(x)\Bigl[1+\sum_{n=1}^{\infty}\gamma_n e^{-\lambda_n t}\varphi_n(x)\Bigr] \tag{6.92}$$
The material in the next sections contains various approximations of the solution for low and high intensities of switching between the states.

¹² Assuming that the spectrum is discrete.
6.3.1 Gram–Charlier Series Expansion

6.3.1.1 Eigenfunction Expansion

In this section, an approximate method of solution based on the Gram–Charlier series is presented. While not the most efficient, this method allows one to understand certain aspects of the dependence of the solution on the switching intensity. Let $p_{0\,st}^{(l)}(x)$ be the stationary solution of the isolated FPE, and let us assume that the spectrum of the FPE is discrete and that the corresponding eigenfunctions $\psi_{n,l}(x)=p_{0\,st}^{(l)}(x)Q_{n,l}(x)$ form a basis¹³. In other words, any PDF $p(x)$ can be expanded into the following series
$$p(x)=\sum_{k=0}^{\infty}c_{k,l}\,\psi_{k,l}(x)=p_{0\,st}^{(l)}(x)\Bigl[1+\sum_{k=1}^{\infty}c_{k,l}\,Q_{k,l}(x)\Bigr] \tag{6.93}$$
where $c_{k,l}$ are some coefficients. Since the eigenvalues of the FPE operator are distinct, non-negative and real, and the eigenfunctions satisfy the normalization condition
$$\int_{-\infty}^{\infty}\frac{\psi_{n,l}(x)\,\psi_{m,l}(x)}{p_{0\,st}^{(l)}(x)}\,dx=\int_{-\infty}^{\infty}p_{0\,st}^{(l)}(x)\,Q_{n,l}(x)\,Q_{m,l}(x)\,dx=\delta_{mn} \tag{6.94}$$
multiplication of both sides of eq. (6.93) by $\psi_{k,l}(x)/p_{0\,st}^{(l)}(x)$ and integration over the real axis produces the following expression for the coefficients $c_{k,l}$
$$c_{k,l}=\int_{-\infty}^{\infty}p(x)\frac{\psi_{k,l}(x)}{p_{0\,st}^{(l)}(x)}\,dx=\int_{-\infty}^{\infty}p(x)\,Q_{k,l}(x)\,dx \tag{6.95}$$
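For the Gaussian example treated below, the $Q_{k,l}$ are normalized Hermite polynomials, and the orthonormality condition (6.94) can be verified by direct quadrature. A sketch; the normalization $Q_k(x)=H_k\bigl((x-m)/\sqrt2\sigma\bigr)/\sqrt{2^k k!}$ is an assumption made here so that eq. (6.94) holds, not a formula quoted from the book:

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval

# Quadrature check of the orthonormality condition (6.94) for a Gaussian
# stationary PDF, with Q_k(x) = H_k((x - m)/(sqrt(2)*sigma)) / sqrt(2^k k!)
# (physicists' Hermite polynomials; assumed normalization).
m, sigma = 0.5, 1.3

def Q(k, x):
    c = np.zeros(k + 1)
    c[k] = 1.0                       # select H_k in the Hermite series
    z = (x - m) / (math.sqrt(2) * sigma)
    return hermval(z, c) / math.sqrt(2.0**k * math.factorial(k))

x = np.linspace(m - 12 * sigma, m + 12 * sigma, 200_001)
dx = x[1] - x[0]
p = np.exp(-(x - m)**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

gram = np.array([[np.sum(p * Q(n, x) * Q(k, x)) * dx for k in range(4)]
                 for n in range(4)])
assert np.allclose(gram, np.eye(4), atol=1e-6)   # delta_{mn} of eq. (6.94)
```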
The next step is to represent the stationary solution of the generalized Fokker–Planck equation in each state as a series
$$p_{st}^{(m)}(x)=P_m\sum_{k=0}^{\infty}c_{k,l}^{(m)}\,\psi_{k,l}(x) \tag{6.96}$$
and to substitute approximation (6.96) into the $l$-th equation of the system of generalized FPEs. This produces
$$L_{FP}^{l}\bigl[p_{st}^{(l)}(x)\bigr]-\nu_l\,p_{st}^{(l)}(x)+\sum_{n\neq l}^{M}\nu_{nl}\,p_{st}^{(n)}(x)=-P_l\sum_{k=1}^{\infty}\lambda_{k,l}\,c_{k,l}^{(l)}\psi_{k,l}(x)-\nu_lP_l\sum_{k=0}^{\infty}c_{k,l}^{(l)}\psi_{k,l}(x)+\sum_{n\neq l}^{M}\nu_{nl}P_n\sum_{k=0}^{\infty}c_{k,l}^{(n)}\psi_{k,l}(x)=0 \tag{6.97}$$
where $-\lambda_{k,l}$ is the eigenvalue of $L_{FP}^{l}$ corresponding to $\psi_{k,l}(x)$, i.e. $L_{FP}^{l}[\psi_{k,l}(x)]=-\lambda_{k,l}\psi_{k,l}(x)$, with
$$0=\lambda_{0,l}<\lambda_{1,l}<\cdots<\lambda_{k,l}<\cdots \tag{6.98}$$
Collecting terms with equal $\psi_{k,l}(x)$, one obtains
$$-P_l(\lambda_{k,l}+\nu_l)\,c_{k,l}^{(l)}+\sum_{n\neq l}^{M}\nu_{nl}P_n\,c_{k,l}^{(n)}=0 \tag{6.99}$$
or, equivalently,
$$c_{k,l}^{(l)}=\frac{1}{\lambda_{k,l}+\nu_l}\sum_{n\neq l}^{M}\nu_{nl}\frac{P_n}{P_l}\,c_{k,l}^{(n)} \tag{6.100}$$
While, in general, eq. (6.100) expresses the unknown coefficients $c_{k,l}^{(l)}$ in terms of the still unknown coefficients $c_{k,l}^{(n)}$, it allows one to understand the effect of the switching intensity on the changes in the PDF of the solution due to the imbalance between the distributions in each state. Since $\lambda_{0,l}=0$ and $\psi_{0,l}(x)=p_{0\,st}^{(l)}(x)$, equality (6.95) produces
$$c_{0,l}^{(m)}=1 \tag{6.101}$$
for any $m$ and $l$. As a result, for $k=0$ eq. (6.99) is reduced to the identity defining the stationary probabilities of the states, as can be seen from eq. (6.41)
$$\sum_{n\neq l}^{M}\nu_{nl}P_n=P_l\,\nu_l \tag{6.102}$$

¹³ In the case of Pearson's PDF, $Q_{m,l}(x)$ are classical orthogonal polynomials [16,17].
6.3.1.2 Small Intensity Approximation

Since for a very small intensity of switching the perturbation of the solution is relatively small, it is possible to assume that
$$c_{k,l}^{(m)}\approx\beta_{k,l}^{(m)} \tag{6.103}$$
where
$$\beta_{k,l}^{(m)}=\int_{-\infty}^{\infty}p_{0\,st}^{(m)}(x)\frac{\psi_{k,l}(x)}{p_{0\,st}^{(l)}(x)}\,dx=\int_{-\infty}^{\infty}p_{0\,st}^{(m)}(x)\,Q_{k,l}(x)\,dx \tag{6.104}$$
are the coefficients of the expansion of the conditional PDF $p_{0\,st}^{(m)}(x)$ of the $m$-th isolated state. Thus the first order approximation of the PDF in the $l$-th state is
$$p^{(l)}(x)\approx P_l\,p_{0\,st}^{(l)}(x)\Bigl[1+\sum_{k=1}^{\infty}\frac{Q_{k,l}(x)}{\lambda_{k,l}+\nu_l}\sum_{n\neq l}^{M}\nu_{nl}\frac{P_n}{P_l}\beta_{k,l}^{(n)}\Bigr]\approx P_l\,p_{0\,st}^{(l)}(x)\Bigl[1+\sum_{k=1}^{\infty}\frac{Q_{k,l}(x)}{\lambda_{k,l}}\sum_{n\neq l}^{M}\nu_{nl}\frac{P_n}{P_l}\beta_{k,l}^{(n)}\Bigr] \tag{6.105}$$
This equation provides a good approximation if $\nu_l\ll\lambda_{1,l}<\lambda_{k,l}$, since in general the value of $\beta_{k,l}^{(n)}$ decreases with $k$. It is important to mention here that the expansion (6.105) suggested in this section is not an expansion in powers of a small parameter $\varepsilon=\max_l\{\nu_l/\lambda_{1,l}\}$ and thus cannot be refined to include second order (in $\varepsilon$) correction terms. Perturbation type solutions are considered below in Section 6.3.2.1.

Example: two Gaussian processes. Let two Gaussian processes, described in the isolated states by the following SDEs
$$\dot x=-\frac{x-m_1}{\tau_{s1}}+\sqrt{\frac{2\sigma_1^2}{\tau_{s1}}}\,\xi_1(t),\qquad \dot x=-\frac{x-m_2}{\tau_{s2}}+\sqrt{\frac{2\sigma_2^2}{\tau_{s2}}}\,\xi_2(t) \tag{6.106}$$
be commutated by a switching process described by $\nu_{21}=\nu P_1=\nu_2$, $\nu_{12}=\nu P_2=\nu_1$, $P_1+P_2=1$. In this case, the solution for an isolated system is given by
$$p_{0\,st}^{(1)}(x)=\frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\Bigl[-\frac{(x-m_1)^2}{2\sigma_1^2}\Bigr],\qquad p_{0\,st}^{(2)}(x)=\frac{1}{\sqrt{2\pi\sigma_2^2}}\exp\Bigl[-\frac{(x-m_2)^2}{2\sigma_2^2}\Bigr] \tag{6.107}$$
The corresponding stationary generalized Fokker–Planck equations are then
$$0=\frac{d}{dx}\Bigl[\frac{x-m_1}{\tau_{s1}}p^{(1)}(x)\Bigr]+\frac{d^2}{dx^2}\Bigl[\frac{\sigma_1^2}{\tau_{s1}}p^{(1)}(x)\Bigr]-\nu P_2\,p^{(1)}(x)+\nu P_1\,p^{(2)}(x)$$
$$0=\frac{d}{dx}\Bigl[\frac{x-m_2}{\tau_{s2}}p^{(2)}(x)\Bigr]+\frac{d^2}{dx^2}\Bigl[\frac{\sigma_2^2}{\tau_{s2}}p^{(2)}(x)\Bigr]-\nu P_1\,p^{(2)}(x)+\nu P_2\,p^{(1)}(x) \tag{6.108}$$
The probability density $p_{0\,st}^{(1)}(x)$ generates a system of Hermite polynomials $H_k\bigl(\frac{x-m_1}{\sqrt2\,\sigma_1}\bigr)$, i.e. the eigenfunctions $\psi_{k,1}(x)$ are given by
$$\psi_{k,1}(x)=\frac{1}{\sqrt{2^k k!}}\,\frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\Bigl[-\frac{(x-m_1)^2}{2\sigma_1^2}\Bigr]H_k\Bigl(\frac{x-m_1}{\sqrt2\,\sigma_1}\Bigr) \tag{6.109}$$
corresponding to the eigenvalues [17]
$$\lambda_{k,1}=\frac{k}{\tau_{s1}} \tag{6.110}$$
As a result of the application of eq. (6.104), the following expression for the coefficients $\beta_{k,1}^{(2)}$ of the expansion of $p_{0\,st}^{(2)}(x)$ can be obtained
$$\beta_{k,1}^{(2)}=\frac{1}{\sqrt{2^k k!}}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi\sigma_2^2}}\exp\Bigl[-\frac{(x-m_2)^2}{2\sigma_2^2}\Bigr]H_k\Bigl(\frac{x-m_1}{\sqrt2\,\sigma_1}\Bigr)dx=\frac{1}{\sqrt{2^k k!}}\,\frac{1}{\sqrt\pi}\int_{-\infty}^{\infty}e^{-z^2}H_k\Bigl(\frac{\sigma_2}{\sigma_1}z+\frac{m_2-m_1}{\sqrt2\,\sigma_1}\Bigr)dz \tag{6.111}$$
Here, the following substitution has been used
$$x=\sqrt2\,\sigma_2 z+m_2 \tag{6.112}$$
If $\sigma_1\neq\sigma_2$, expression (6.111) can be evaluated using the standard integral 7.374–8 in [18]
$$\int_{-\infty}^{\infty}\exp\bigl[-(x-y)^2\bigr]H_n(\alpha x)\,dx=\sqrt\pi\,(1-\alpha^2)^{n/2}H_n\Bigl(\frac{\alpha y}{\sqrt{1-\alpha^2}}\Bigr) \tag{6.113}$$
with
$$y=\frac{m_2-m_1}{\sqrt2\,\sigma_2},\qquad \alpha=\frac{\sigma_2}{\sigma_1} \tag{6.114}$$
This results in the following expression for the coefficients $\beta_{k,1}^{(2)}$
$$\beta_{k,1}^{(2)}=\begin{cases}\dfrac{1}{\sqrt{2^k k!}}\Bigl(1-\dfrac{\sigma_2^2}{\sigma_1^2}\Bigr)^{k/2}H_k\Bigl(\dfrac{m_2-m_1}{\sqrt2\sqrt{\sigma_1^2-\sigma_2^2}}\Bigr) & \text{if }\sigma_2<\sigma_1\\[2ex] \dfrac{1}{\sqrt{2^k k!}}\Bigl|1-\dfrac{\sigma_2^2}{\sigma_1^2}\Bigr|^{k/2}j^{k}\,H_k\Bigl(-j\dfrac{m_2-m_1}{\sqrt2\sqrt{\sigma_2^2-\sigma_1^2}}\Bigr) & \text{if }\sigma_2>\sigma_1\end{cases} \tag{6.115}$$
If $\sigma_2=\sigma_1=\sigma$, one can use the standard integral 7.374–6 in [18]
$$\int_{-\infty}^{\infty}\exp\bigl[-(x-y)^2\bigr]H_n(x)\,dx=\sqrt\pi\,2^n y^n \tag{6.116}$$
to obtain
$$\beta_{k,1}^{(2)}=\frac{1}{\sqrt{2^k k!}}\,\frac{1}{\sqrt\pi}\int_{-\infty}^{\infty}\exp\Bigl[-\Bigl(z-\frac{m_2-m_1}{\sqrt2\,\sigma}\Bigr)^2\Bigr]H_k(z)\,dz=\frac{1}{\sqrt{k!}}\Bigl(\frac{m_2-m_1}{\sigma}\Bigr)^{k} \tag{6.117}$$
Finally, eq. (6.105) produces a series for the solution of the generalized Fokker–Planck equation
$$p^{(1)}(x)\approx\frac{P_1}{\sqrt{2\pi\sigma_1^2}}\exp\Bigl[-\frac{(x-m_1)^2}{2\sigma_1^2}\Bigr]\Bigl[1+\nu P_2\sum_{k=1}^{\infty}\frac{\tau_{s1}}{k}\,\frac{H_k\bigl(\frac{x-m_1}{\sqrt2\,\sigma_1}\bigr)}{\sqrt{2^k k!}}\,\beta_{k,1}^{(2)}\Bigr]$$
$$p^{(2)}(x)\approx\frac{P_2}{\sqrt{2\pi\sigma_2^2}}\exp\Bigl[-\frac{(x-m_2)^2}{2\sigma_2^2}\Bigr]\Bigl[1+\nu P_1\sum_{k=1}^{\infty}\frac{\tau_{s2}}{k}\,\frac{H_k\bigl(\frac{x-m_2}{\sqrt2\,\sigma_2}\bigr)}{\sqrt{2^k k!}}\,\beta_{k,2}^{(1)}\Bigr] \tag{6.118}$$
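A numerical sketch of the truncated series (6.118) for the equal-variance case, using $\beta_k$ from eq. (6.117) and the normalized-Hermite convention assumed above. Since each correction term integrates to zero against the Gaussian, the approximate $p^{(1)}(x)$ keeps total mass $P_1$; all parameter values are illustrative:

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval

# First-order corrected PDF of eq. (6.118) for two switched Gaussian states
# with equal variances, using beta_k = ((m2 - m1)/sigma)^k / sqrt(k!) from
# eq. (6.117) (normalized-Hermite convention assumed).
P1, P2 = 0.3, 0.7
m1, m2, sigma = 0.0, 0.5, 1.0
tau1, nu = 0.2, 0.4                       # tau_s1 and switching intensity

x = np.linspace(-10, 10, 100_001)
z = (x - m1) / (math.sqrt(2) * sigma)
p0 = np.exp(-(x - m1)**2 / (2 * sigma**2)) / math.sqrt(2*math.pi*sigma**2)

corr = np.zeros_like(x)
for k in range(1, 8):                     # truncate the series at k = 7
    c = np.zeros(k + 1); c[k] = 1.0
    Qk = hermval(z, c) / math.sqrt(2.0**k * math.factorial(k))
    beta = ((m2 - m1) / sigma)**k / math.sqrt(math.factorial(k))
    corr += (tau1 / k) * Qk * beta        # lambda_{k,1} = k / tau_s1

p1 = P1 * p0 * (1.0 + nu * P2 * corr)     # eq. (6.118), truncated

dx = x[1] - x[0]
mass = p1.sum() * dx
assert abs(mass - P1) < 1e-3              # each Q_k integrates to zero
```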
where the expression for $\beta_{k,2}^{(1)}$ can be easily obtained from the expression for $\beta_{k,1}^{(2)}$ by permutation of the indices. It is worth noting that the correction terms depend not only on the marginal distributions in the isolated states but also on the correlation intervals of the partial processes. Characteristically, in the case $\sigma_2=\sigma_1$ and $m_2=m_1$ the coefficients $\beta_{k,2}^{(1)}$ and $\beta_{k,1}^{(2)}$ become zero, as follows from eq. (6.117), thus confirming the intuitive conclusion that switching between two processes with identical PDFs must preserve the marginal PDF; the difference will appear only in the second order statistics. Results of numerical simulations are shown in Fig. 6.6.

Example: switching between two Gamma distributed processes. Let each of the two isolated processes be described by the Gamma PDF
$$p_{0\,st}^{(l)}(x)=\frac{\mu_l^{\gamma_l}}{\Gamma(\gamma_l)}\,x^{\gamma_l-1}\exp(-\mu_l x),\qquad l=1,2 \tag{6.119}$$
In this case, the system of SDEs and the corresponding generalized Fokker–Planck equations can be written as
$$\dot x=-\frac{1}{\tau_{s1}}\Bigl(x-\frac{\gamma_1}{\mu_1}\Bigr)+\sqrt{\frac{2x}{\mu_1\tau_{s1}}}\,\xi_1(t),\qquad \dot x=-\frac{1}{\tau_{s2}}\Bigl(x-\frac{\gamma_2}{\mu_2}\Bigr)+\sqrt{\frac{2x}{\mu_2\tau_{s2}}}\,\xi_2(t) \tag{6.120}$$
and
$$0=\frac{d}{dx}\Bigl[\frac{1}{\tau_{s1}}\Bigl(x-\frac{\gamma_1}{\mu_1}\Bigr)p^{(1)}(x)\Bigr]+\frac{d^2}{dx^2}\Bigl[\frac{x}{\mu_1\tau_{s1}}p^{(1)}(x)\Bigr]-\nu P_2\,p^{(1)}(x)+\nu P_1\,p^{(2)}(x)$$
$$0=\frac{d}{dx}\Bigl[\frac{1}{\tau_{s2}}\Bigl(x-\frac{\gamma_2}{\mu_2}\Bigr)p^{(2)}(x)\Bigr]+\frac{d^2}{dx^2}\Bigl[\frac{x}{\mu_2\tau_{s2}}p^{(2)}(x)\Bigr]-\nu P_1\,p^{(2)}(x)+\nu P_2\,p^{(1)}(x) \tag{6.121}$$
Figure 6.6 Numerical simulation of two switched Gaussian processes. Parameters of the simulation: $P_1=0.3$, $P_2=0.7$, $m_1=0$, $m_2=2$, $\sigma_1=1$, $\sigma_2=0.7$, $\tau_1=0.2$, $\tau_2=0.3$; $\nu=0$ (*), $\nu=0.5$ (∇), $\nu=1$ (· · ·), $\nu=\infty$ (+).
respectively. Each isolated FPE defines a set of eigenfunctions [17]
$$\psi_{k,l}(x)=\sqrt{\frac{k!\,\Gamma(\gamma_l)}{\Gamma(\gamma_l+k)}}\,\frac{\mu_l^{\gamma_l}}{\Gamma(\gamma_l)}\,x^{\gamma_l-1}e^{-\mu_l x}\,L_k^{(\gamma_l-1)}(\mu_l x) \tag{6.122}$$
Application of eq. (6.104) produces
$$\beta_{k,1}^{(2)}=\sqrt{\frac{k!\,\Gamma(\gamma_1)}{\Gamma(\gamma_1+k)}}\,\frac{\mu_2^{\gamma_2}}{\Gamma(\gamma_2)}\int_{0}^{\infty}e^{-\mu_2 x}x^{\gamma_2-1}L_k^{(\gamma_1-1)}(\mu_1 x)\,dx=\sqrt{\frac{\Gamma(\gamma_1+k)}{k!\,\Gamma(\gamma_1)}}\,F\Bigl(-k,\gamma_2;\gamma_1;\frac{\mu_1}{\mu_2}\Bigr) \tag{6.123}$$
Here the following standard integral, 7.414–7 in [18], has been used
$$\int_{0}^{\infty}e^{-st}\,t^{\beta}L_n^{(\alpha)}(t)\,dt=\frac{\Gamma(\beta+1)\Gamma(\alpha+n+1)}{n!\,\Gamma(\alpha+1)}\,s^{-\beta-1}F\Bigl(-n,\beta+1;\alpha+1;\frac{1}{s}\Bigr) \tag{6.124}$$
where $F(a,b;c;z)={}_2F_1(a,b;c;z)$ is the hypergeometric function [17]. Results of a numerical simulation are shown in Fig. 6.7.

6.3.1.3 Form of the Solution for Large Intensity
The case of large intensities of switching also provokes great interest from the point of view of applications. While a perturbation analysis of this case is conducted in some detail in Section 6.3.3, here some conclusions are drawn based on the analysis of eq. (6.100)
$$c_{k,l}^{(l)}=\frac{1}{\lambda_{k,l}+\nu_l}\sum_{n\neq l}^{M}\nu_{nl}\frac{P_n}{P_l}\,c_{k,l}^{(n)}$$
Let $\nu_l$ be a large but finite number such that $\lambda_{1,l}\ll\nu_l$. Then, it is possible to find an index $N(\nu)\gg1$ such that $\lambda_{N(\nu),l}\sim\nu_l$. If $1<k\leq N(\nu)$, then $\lambda_{k,l}\ll\nu_l$ and, thus,
$$\nu_lP_l\,c_{k,l}^{(l)}\approx\sum_{n\neq l}^{M}\nu_{nl}P_n\,c_{k,l}^{(n)} \tag{6.125}$$
Comparing eq. (6.125) with eq. (6.102), one concludes that, for small values of $k$, the coefficients of the expansion (6.96) are equal for every state of the process with random structure, i.e.
$$c_{k,l}^{(n)}=c_{k,l}^{(l)} \tag{6.126}$$
Once $\nu_l$ approaches infinity, $N(\nu)$ must also approach infinity and thus the number of equal terms in the expansion (6.96) approaches infinity as well. As a result, it can be contended that for a very large intensity of switching the resulting distributions in each state are proportional to the same function.
Figure 6.7 Numerical simulation of two switched Gamma processes. Parameters of the simulation: $P_1=0.3$, $P_2=0.7$, and state parameters $0$, $3$, $1$, $0.5$, $\tau_1=0.2$, $\tau_2=0.3$; $\nu=0$ (*), $\nu=0.5$ (∇), $\nu=1$ (· · ·), $\nu=\infty$ (+).
6.3.2 Solution by the Perturbation Method for the Case of Low Intensities of Switching

In this section, a small perturbation solution of the time-variant system of two generalized Fokker–Planck equations is considered [19,20]. In order to obtain such an approximation, one has to find correction terms for the eigenvalues and eigenvectors in the isolated states. At this stage, it is assumed that the isolated Fokker–Planck operators are described by a discrete spectrum, i.e. there are two discrete sets of eigenvalues $\{\lambda_n\}$, $\{\kappa_n\}$ and eigenfunctions $\{\psi_n(x)\}$ and $\{\phi_n(x)\}$ such that
$$L_{1\,FP}[\psi_n(x)]=-\frac{d}{dx}\bigl[K_1^{(1)}(x)\psi_n(x)\bigr]+\frac12\frac{d^2}{dx^2}\bigl[K_2^{(1)}(x)\psi_n(x)\bigr]=-\lambda_n\psi_n(x)$$
$$L_{2\,FP}[\phi_n(x)]=-\frac{d}{dx}\bigl[K_1^{(2)}(x)\phi_n(x)\bigr]+\frac12\frac{d^2}{dx^2}\bigl[K_2^{(2)}(x)\phi_n(x)\bigr]=-\kappa_n\phi_n(x) \tag{6.127}$$
and
$$\int_{-\infty}^{\infty}\frac{\psi_n(x)\psi_m(x)}{p_{st}^{(1)}(x)}\,dx=\int_{-\infty}^{\infty}\frac{\phi_n(x)\phi_m(x)}{p_{st}^{(2)}(x)}\,dx=\delta_{mn}=\begin{cases}1 & \text{if }m=n\\ 0 & \text{if }m\neq n\end{cases} \tag{6.128}$$
The stationary solutions $\psi_0(x)=p_{st}^{(1)}(x)$ and $\phi_0(x)=p_{st}^{(2)}(x)$ correspond to the zero eigenvalue of each isolated equation, i.e. $\lambda_0=\kappa_0=0$.

6.3.2.1 General Small Parameter Expansion of Eigenvalues and Eigenfunctions
General Small Parameter Expansion of Eigenvalues and Eigenfunctions
The next step is to consider the eigenvalues and the eigenfunctions of the perturbed Fokker–Planck operators

$$L_1[\psi(x)] = L_{1\,FP}[\psi(x)] - \nu P_2\,\psi(x) + \nu P_1\,p^{(2)}(x)$$
$$L_2[\phi(x)] = L_{2\,FP}[\phi(x)] - \nu P_1\,\phi(x) + \nu P_2\,p^{(1)}(x) \qquad(6.129)$$

While both equations must be solved simultaneously, the assumption of a small $\nu$ allows one to treat them separately. Owing to the similar structure of both operators, the discussion is limited to the operator $L_1$ only. At this point the exact form of the function $p^{(2)}(x)$ is not specified; it is, however, assumed to be a known function. A further assumption is that the eigenvalues $\Lambda_n$ and the eigenfunctions $\Psi_n(x)$ of the perturbed operator are holomorphic functions of the small parameter $\nu\tau_s$, where

$$\tau_s = \frac{P_1P_2\,\tau_{s1}\tau_{s2}}{P_2\tau_{s1}+P_1\tau_{s2}} \qquad(6.130)$$

and $\tau_{s1}$ and $\tau_{s2}$ are the characteristic time constants of the isolated systems. The holomorphic property allows the following expansions for small values of $\nu\tau_s$:

$$\Lambda_n = \lambda_n + (\nu\tau_s)\lambda_n^{(1)} + (\nu\tau_s)^2\lambda_n^{(2)} + \cdots = \lambda_n + \sum_{k=1}^{\infty}(\nu\tau_s)^k\,\lambda_n^{(k)}$$
$$\Psi_n(x) = \psi_n(x) + (\nu\tau_s)\psi_n^{(1)}(x) + (\nu\tau_s)^2\psi_n^{(2)}(x) + \cdots = \psi_n(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\,\psi_n^{(k)}(x) \qquad(6.131)$$
where $\lambda_n^{(k)}$ and $\psi_n^{(k)}(x)$ are to be defined. Since $\Psi_n(x)$ is the eigenfunction corresponding to the eigenvalue $\Lambda_n$, the following equality must hold:

$$L_1[\Psi_n(x)] = L_{1\,FP}[\Psi_n(x)] - \nu P_2\,\Psi_n(x) + \nu P_1\,p^{(2)}(x)$$
$$= -\lambda_n\psi_n(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\,L_{1\,FP}[\psi_n^{(k)}(x)] - \nu P_2\,\psi_n(x) - \nu P_2\sum_{k=1}^{\infty}(\nu\tau_s)^k\,\psi_n^{(k)}(x) + \nu P_1\,p^{(2)}(x)$$
$$= -\Lambda_n\Psi_n(x) = -\left[\lambda_n\psi_n(x) + \nu\tau_s\,\lambda_n^{(1)}\psi_n(x) + \nu\tau_s\,\lambda_n\psi_n^{(1)}(x) + \sum_{m+l>1}(\nu\tau_s)^{m+l}\lambda_n^{(m)}\psi_n^{(l)}(x)\right] \qquad(6.132)$$

Equating terms with equal powers of $\nu\tau_s$ on both sides of this expansion, one obtains a system of equations for $\lambda_n^{(k)}$ and $\psi_n^{(k)}(x)$:

$$L_{1\,FP}[\psi_n^{(1)}(x)] + \lambda_n\psi_n^{(1)}(x) = \left(\frac{P_2}{\tau_s}-\lambda_n^{(1)}\right)\psi_n(x) - \frac{P_1}{\tau_s}\,p^{(2)}(x) \qquad(6.133)$$

$$L_{1\,FP}[\psi_n^{(k)}(x)] - \frac{P_2}{\tau_s}\,\psi_n^{(k-1)}(x) = -\sum_{m+l=k}\lambda_n^{(m)}\psi_n^{(l)}(x), \qquad k\ge 2 \qquad(6.134)$$

where, by convention, $\lambda_n^{(0)}\equiv\lambda_n$ and $\psi_n^{(0)}(x)\equiv\psi_n(x)$.

6.3.2.2 Perturbation of $\Psi_0(x)$
In the following it is assumed that the system of the Fokker–Planck equations (6.129) has a stationary solution. Thus one has to enforce that the perturbed zero eigenvalue remains zero for both operators, while the corresponding eigenfunction $\Psi_0(x)$ can still be expressed as

$$\Psi_0(x) = p_{st}^{(1)}(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\,\psi_0^{(k)}(x) \qquad(6.135)$$

A similar expansion must be valid for the perturbed stationary solution of the second Fokker–Planck equation, i.e.

$$\Phi_0(x) = p_{st}^{(2)}(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\,\phi_0^{(k)}(x) \qquad(6.136)$$
Taking this into account, one can rewrite the first of eqs. (6.129) as

$$L_{1\,FP}[\Psi_0(x)] - \nu P_2\,\Psi_0(x) + \nu P_1\,\Phi_0(x)$$
$$= L_{1\,FP}\!\left[p_{st}^{(1)}(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\psi_0^{(k)}(x)\right] - \nu P_2\left[p_{st}^{(1)}(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\psi_0^{(k)}(x)\right] + \nu P_1\left[p_{st}^{(2)}(x) + \sum_{k=1}^{\infty}(\nu\tau_s)^k\phi_0^{(k)}(x)\right] = 0 \qquad(6.137)$$
Since $L_{1\,FP}[p_{st}^{(1)}(x)] = 0$, one obtains the following system of perturbation equations:

$$L_{1\,FP}[\psi_0^{(1)}(x)] - \frac{P_2}{\tau_s}\,p_{st}^{(1)}(x) + \frac{P_1}{\tau_s}\,p_{st}^{(2)}(x) = 0 \qquad(6.138)$$

$$L_{1\,FP}[\psi_0^{(k)}(x)] - \frac{P_2}{\tau_s}\,\psi_0^{(k-1)}(x) + \frac{P_1}{\tau_s}\,\phi_0^{(k-1)}(x) = 0 \qquad(6.139)$$

Representing $\psi_0^{(1)}(x)$ on the basis $\{\psi_n(x)\}$ as

$$\psi_0^{(1)}(x) = \sum_{p=0}^{\infty}\alpha_{0,p}\,\psi_p(x) \qquad(6.140)$$

and plugging it into eq. (6.138), one derives

$$L_{1\,FP}[\psi_0^{(1)}(x)] = \sum_{p=0}^{\infty}\alpha_{0,p}\,L_{1\,FP}[\psi_p(x)] = -\sum_{p=0}^{\infty}\alpha_{0,p}\lambda_p\,\psi_p(x) = \frac{P_2}{\tau_s}\,p_{st}^{(1)}(x) - \frac{P_1}{\tau_s}\,p_{st}^{(2)}(x) \qquad(6.141)$$

Furthermore, multiplying both parts of eq. (6.141) by $\psi_m(x)/p_{st}^{(1)}(x)$ and integrating over $x$, the expression for the coefficients $\alpha_{0,m}$, $m>0$, can be deduced:

$$\alpha_{0,m} = \frac{P_1}{\tau_s\lambda_m}\int_{-\infty}^{\infty}\frac{p_{st}^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx, \qquad m\ge 1 \qquad(6.142)$$

with $\alpha_{0,0}$ still undefined. The latter can be obtained from the normalization condition, since

$$\int_{-\infty}^{\infty}p^{(1)}(x)\,dx = P_1\int_{-\infty}^{\infty}p_{st}^{(1)}(x)\,dx = P_1 \qquad(6.143)$$

which is equivalent to $\alpha_{0,0} = 0$. Thus, the first order approximation is

$$p_1(x) \approx p_{st}^{(1)}(x) + \nu P_1\sum_{m=1}^{\infty}\frac{\psi_m(x)}{\lambda_m}\int_{-\infty}^{\infty}\frac{p_{st}^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.144)$$

Furthermore, a first order approximation of $\Lambda_n$ and $\Psi_n(x)$ can be obtained directly from eq. (6.133). Since $\{\psi_n(x)\}$ forms a complete orthogonal basis, the function $\psi_n^{(1)}(x)$ can be expanded in terms of this basis:

$$\psi_n^{(1)}(x) = \sum_{p=0}^{\infty}\kappa_{n,p}\,\psi_p(x) \qquad(6.145)$$
In this case, eq. (6.133) can be rewritten as

$$L_{1\,FP}[\psi_n^{(1)}(x)] + \lambda_n\psi_n^{(1)}(x) = \sum_{p=0}^{\infty}\kappa_{n,p}(\lambda_n-\lambda_p)\,\psi_p(x) = \left(\frac{P_2}{\tau_s}-\lambda_n^{(1)}\right)\psi_n(x) - \frac{P_1}{\tau_s}\,p^{(2)}(x) \qquad(6.146)$$

Multiplying both sides of eq. (6.146) by $\psi_n(x)/p_{st}^{(1)}(x)$, integrating over $x$ and using orthogonality, one obtains

$$0 = \frac{P_2}{\tau_s} - \lambda_n^{(1)} - \frac{P_1}{\tau_s}\int_{-\infty}^{\infty}\frac{\psi_n(x)\,p^{(2)}(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.147)$$

i.e.

$$\lambda_n^{(1)} = \frac{P_2}{\tau_s} - \frac{P_1}{\tau_s}\int_{-\infty}^{\infty}\frac{\psi_n(x)\,p^{(2)}(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.148)$$

Furthermore, multiplying both sides of eq. (6.146) by $\psi_m(x)/p_{st}^{(1)}(x)$, $m\neq n$, integrating over $x$ and using orthogonality, one obtains

$$\kappa_{n,m}(\lambda_m-\lambda_n) = \frac{P_1}{\tau_s}\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.149)$$

and, thus,

$$\kappa_{n,m} = \frac{P_1}{\tau_s(\lambda_m-\lambda_n)}\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.150)$$

Returning to eq. (6.131), one derives

$$\Lambda_n = \lambda_n + \nu P_2 - \nu P_1\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_n(x)}{p_{st}^{(1)}(x)}\,dx$$
$$\Psi_n(x) = (1+\nu\tau_s\kappa_{n,n})\,\psi_n(x) + \nu P_1\sum_{\substack{m=0\\ m\neq n}}^{\infty}\frac{\psi_m(x)}{\lambda_m-\lambda_n}\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.151)$$
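The structure of eq. (6.151) — an unperturbed eigenvalue plus a first-order correction assembled from the perturbation and the unperturbed eigenfunctions — is the generic first-order result of perturbation theory for linear operators [20]. A small finite-dimensional sketch, with a symmetric matrix standing in for the Fokker–Planck operator (all values illustrative, not from the text):

```python
import numpy as np

# First-order eigenvalue perturbation: for a symmetric A perturbed to
# A + eps*B, the shift of eigenvalue lambda_n is eps * v_n^T B v_n,
# where v_n is the unperturbed eigenvector.
rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n)); A = (A + A.T) / 2.0
B = rng.normal(size=(n, n)); B = (B + B.T) / 2.0
eps = 1e-4

lam, V = np.linalg.eigh(A)                       # unperturbed spectrum
lam_pert = np.linalg.eigh(A + eps * B)[0]        # exact perturbed spectrum
first_order = lam + eps * np.einsum('in,ij,jn->n', V, B, V)

err = np.max(np.abs(lam_pert - first_order))     # residual is O(eps^2)
```

The residual `err` is of second order in `eps`, mirroring the role of the second-order terms discussed below.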
Integrating the second equation in (6.151) over $x$ and taking into account the normalization conditions, one obtains, for $n=0$,

$$1 = \int_{-\infty}^{\infty}\Psi_0(x)\,dx = 1 + \nu\tau_s\,\kappa_{0,0} \qquad(6.152)$$
which implies that $\kappa_{0,0}=0$ and, thus,

$$\Psi_0(x) = p_{st}^{(1)}(x) + \nu P_1\sum_{m=1}^{\infty}\frac{\psi_m(x)}{\lambda_m}\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.153)$$

In general $\kappa_{n,n}=0$ and, therefore,

$$\Lambda_n = \lambda_n + \nu P_2 - \nu P_1\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_n(x)}{p_{st}^{(1)}(x)}\,dx$$
$$\Psi_n(x) = \psi_n(x) + \nu P_1\sum_{\substack{m=0\\ m\neq n}}^{\infty}\frac{\psi_m(x)}{\lambda_m-\lambda_n}\int_{-\infty}^{\infty}\frac{p^{(2)}(x)\,\psi_m(x)}{p_{st}^{(1)}(x)}\,dx \qquad(6.154)$$

It is interesting to note how the intensity of switching changes the smallest non-zero eigenvalue $\Lambda_1$, which defines the correlation interval. Indeed, if switching is performed between two processes with the same PDF but different correlation properties, i.e. $p_{st}^{(1)}(x)=p_{st}^{(2)}(x)$ but $\lambda_1\neq\mu_1$, then it follows from eq. (6.154) that

$$\Lambda_1 = \lambda_1 + \nu P_2 = \lambda_1 + \nu_{12} > \lambda_1 \qquad(6.155)$$
thus providing a shorter correlation interval. The second order approximation can be obtained from eq. (6.134) by setting $k=2$:

$$L_{1\,FP}[\psi_n^{(2)}(x)] - \frac{P_2}{\tau_s}\,\psi_n^{(1)}(x) = -\sum_{m+l=2}\lambda_n^{(m)}\psi_n^{(l)}(x) = -\lambda_n\psi_n^{(2)}(x) - \lambda_n^{(1)}\psi_n^{(1)}(x) - \lambda_n^{(2)}\psi_n(x) \qquad(6.156)$$

or, equivalently,

$$L_{1\,FP}[\psi_n^{(2)}(x)] + \lambda_n\psi_n^{(2)}(x) = \left(\frac{P_2}{\tau_s}-\lambda_n^{(1)}\right)\psi_n^{(1)}(x) - \lambda_n^{(2)}\psi_n(x) \qquad(6.157)$$

Expanding $\psi_n^{(2)}(x)$ into a series over the basis $\{\psi_n(x)\}$,

$$\psi_n^{(2)}(x) = \sum_{l=0}^{\infty}\beta_{n,l}\,\psi_l(x) \qquad(6.158)$$

and plugging it into eq. (6.157), one obtains

$$L_{1\,FP}[\psi_n^{(2)}(x)] + \lambda_n\psi_n^{(2)}(x) = \sum_{l=0}^{\infty}\beta_{n,l}(\lambda_n-\lambda_l)\,\psi_l(x) = \left(\frac{P_2}{\tau_s}-\lambda_n^{(1)}\right)\psi_n^{(1)}(x) - \lambda_n^{(2)}\psi_n(x) \qquad(6.159)$$
Multiplication by $\psi_m(x)/p_{st}^{(1)}(x)$ and integration over $x$ produces, for $m=n$,

$$\lambda_n^{(2)} = 0 \qquad(6.160)$$

and, for $m\neq n$,

$$\beta_{n,m}(\lambda_n-\lambda_m) = \left(\frac{P_2}{\tau_s}-\lambda_n^{(1)}\right)\kappa_{n,m} \qquad(6.161)$$

As a result,

$$\beta_{n,m} = \frac{\left(P_2/\tau_s-\lambda_n^{(1)}\right)\kappa_{n,m}}{\lambda_n-\lambda_m}, \qquad m\neq n \qquad(6.162)$$

Thus, it can be seen that the second order approximation does not produce any significant correction terms. This fact justifies the application of the small perturbation analysis.

Example: two Gaussian processes. A case of commutation of two Gaussian processes is described by SDE (6.106). In this case

$$\lambda_n = \frac{n}{\tau_{s1}}, \qquad \psi_n(x) = \frac{P_1}{\sqrt{n!\,2^n}}\,\frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left[-\frac{(x-m_1)^2}{2\sigma_1^2}\right]H_n\!\left(\frac{x-m_1}{\sqrt{2}\,\sigma_1}\right) \qquad(6.163)$$

and

$$\mu_n = \frac{n}{\tau_{s2}}, \qquad \phi_n(x) = \frac{P_2}{\sqrt{n!\,2^n}}\,\frac{1}{\sqrt{2\pi\sigma_2^2}}\exp\left[-\frac{(x-m_2)^2}{2\sigma_2^2}\right]H_n\!\left(\frac{x-m_2}{\sqrt{2}\,\sigma_2}\right) \qquad(6.164)$$

Taking eq. (6.163) into account, the expression for the stationary PDF of the process in the first state becomes

$$\Psi_0(x) = p_{st}^{(1)}(x)\left[1 + \nu P_1\sum_{m=1}^{\infty}\frac{\tau_{s1}}{m}\,\frac{H_m\!\left(\frac{x-m_1}{\sqrt{2}\,\sigma_1}\right)}{\sqrt{2\pi\sigma_2^2}\;m!\,2^m}\int_{-\infty}^{\infty}\exp\left[-\frac{(x-m_2)^2}{2\sigma_2^2}\right]H_m\!\left(\frac{x-m_1}{\sqrt{2}\,\sigma_1}\right)dx\right]$$
$$\approx p_{st}^{(1)}(x)\left[1 + \frac{\nu P_1}{\sqrt{\pi}}\sum_{m=1}^{\infty}\frac{\tau_{s1}}{m}\,\frac{H_m\!\left(\frac{x-m_1}{\sqrt{2}\,\sigma_1}\right)}{m!\,2^m}\left(1-\frac{\sigma_2^2}{\sigma_1^2}\right)^{m/2}H_m\!\left(\frac{m_2-m_1}{\sqrt{2}\sqrt{\sigma_1^2-\sigma_2^2}}\right)\right] \qquad(6.165)$$

which is very similar to that given by eq. (6.118). Here the following standard integral (7.374.8 in [18]) has been used:

$$\int_{-\infty}^{\infty}\exp\left[-(x-y)^2\right]H_n(\alpha x)\,dx = \sqrt{\pi}\,(1-\alpha^2)^{n/2}\,H_n\!\left(\frac{\alpha y}{\sqrt{1-\alpha^2}}\right) \qquad(6.166)$$
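The tabulated integral (6.166) is easy to confirm numerically; the sketch below evaluates both sides with a dense Riemann sum (the particular values of n, α and y are arbitrary):

```python
import numpy as np

def hermite(n, z):
    """Physicists' Hermite polynomial H_n(z) via numpy's Hermite basis."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return np.polynomial.hermite.hermval(z, coeffs)

def both_sides(n, alpha, y):
    # left side: integral of exp(-(x-y)^2) * H_n(alpha*x) over the real line
    dx = 1e-3
    x = np.arange(y - 8.0, y + 8.0, dx)
    lhs = np.sum(np.exp(-(x - y) ** 2) * hermite(n, alpha * x)) * dx
    # right side: sqrt(pi) * (1-alpha^2)^(n/2) * H_n(alpha*y/sqrt(1-alpha^2))
    rhs = np.sqrt(np.pi) * (1 - alpha ** 2) ** (n / 2) \
        * hermite(n, alpha * y / np.sqrt(1 - alpha ** 2))
    return lhs, rhs

lhs, rhs = both_sides(3, 0.6, 0.8)
```

The check requires $|\alpha|<1$, which is exactly the regime needed in eq. (6.165), where $\alpha^2 = \sigma_2^2/\sigma_1^2$.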
The first approximation of the eigenvalues is given by eq. (6.151) as

$$\Lambda_n = \frac{n}{\tau_{s1}} + \nu P_2\left[1 - \frac{1}{\sqrt{\pi}\,n!}\left(1-\frac{\sigma_2^2}{\sigma_1^2}\right)^{n/2}H_n\!\left(\frac{m_2-m_1}{\sqrt{2}\sqrt{\sigma_1^2-\sigma_2^2}}\right)\right] \qquad(6.167)$$

6.3.3 High Intensity Solution

In this section, an approximation of the solution for a high intensity of switching is considered. The first step is to obtain conditions for the infinite intensity of switching; the problem is then converted into a small perturbation problem.
6.3.3.1 Zero Average Current Condition
It is easy to see that the Fokker–Planck equations can be written in the form of conservation of probability as

$$P_1\frac{\partial}{\partial t}p_1(x,t) = -P_1\frac{\partial}{\partial x}J_1(x,t) - \nu P_2\,p^{(1)}(x,t) + \nu P_1\,p^{(2)}(x,t)$$
$$P_2\frac{\partial}{\partial t}p_2(x,t) = -P_2\frac{\partial}{\partial x}J_2(x,t) - \nu P_1\,p^{(2)}(x,t) + \nu P_2\,p^{(1)}(x,t) \qquad(6.168)$$

where

$$J_1(x,t) = K_1^{(1)}(x)\,p_1(x,t) - \frac{1}{2}\frac{\partial}{\partial x}\left[K_2^{(1)}(x)\,p_1(x,t)\right]$$
$$J_2(x,t) = K_1^{(2)}(x)\,p_2(x,t) - \frac{1}{2}\frac{\partial}{\partial x}\left[K_2^{(2)}(x)\,p_2(x,t)\right] \qquad(6.169)$$

are the probability current densities created by the drift and the diffusion of the continuous Markov process, and

$$p^{(1)}(x,t) = P_1\,p_1(x,t), \qquad p^{(2)}(x,t) = P_2\,p_2(x,t) \qquad(6.170)$$

The additional probability flows $\nu P_2\,p^{(1)}(x,t)$ and $\nu P_1\,p^{(2)}(x,t)$ are due to the absorption and regeneration of trajectories. The probabilities $P_1$ and $P_2$ represent the proportion of time during which a given flow is observed. Adding both eqs. (6.168), one obtains

$$\frac{\partial}{\partial t}p(x,t) = -\frac{\partial}{\partial x}J(x,t) \qquad(6.171)$$

where the change in the average probability current

$$J(x,t) = P_1J_1(x,t) + P_2J_2(x,t) \qquad(6.172)$$

is now the only factor responsible for the variation in time of the probability density

$$p(x,t) = p^{(1)}(x,t) + p^{(2)}(x,t) = P_1\,p_1(x,t) + P_2\,p_2(x,t) \qquad(6.173)$$

For a stationary flow, this current must remain constant, i.e.

$$J(x) = P_1J_1(x) + P_2J_2(x) = P_1K_1^{(1)}(x)p_1(x) + P_2K_1^{(2)}(x)p_2(x) - \frac{P_1}{2}\frac{\partial}{\partial x}\left[K_2^{(1)}(x)p_1(x)\right] - \frac{P_2}{2}\frac{\partial}{\partial x}\left[K_2^{(2)}(x)p_2(x)\right] = J_0 = \text{const.} \qquad(6.174)$$

If $p_1(x)$ and $p_2(x)$ are defined on a semi-infinite or infinite interval, the probability densities $p_1(x)$ and $p_2(x)$, as well as their derivatives, must vanish at infinity, together with the probability current density $J(x)$. Thus, the constant $J_0$ must be equal to zero:

$$J_0 = 0 \qquad(6.175)$$

This condition is known as the natural boundary condition in the theory of continuous Markov processes [13, and section 5.3.3]. However, in the case of processes with random structure it is the average probability current density which vanishes; the individual current densities $J_1(x)$ and $J_2(x)$ may remain functions of $x$ in such a way that their sum is zero.

6.3.3.2 Asymptotic Solution $p_\infty(x)$
For a very high intensity $\nu$, all the probability densities $p^{(k)}(x)$ must approach a fixed distribution $p_\infty(x)$,

$$p^{(k)}(x) = a_k\,p_\infty(x) \qquad(6.176)$$

in order to satisfy the balance conditions

$$\sum_{k=1}^{M}\nu_{kl}\,p^{(k)}(x) \approx 0 \qquad(6.177)$$

for any $l$ and $x$. The normalizing coefficients are to be chosen to satisfy the normalization condition

$$P_k = \int_{-\infty}^{\infty}p^{(k)}(x)\,dx = a_k\int_{-\infty}^{\infty}p_\infty(x)\,dx = a_k \qquad(6.178)$$
As far as a particular form of the asymptotic distribution $p_\infty(x)$ is concerned, nothing can be deduced from eq. (6.177) itself. In order to obtain an expression for $p_\infty(x)$, one can take advantage of the fact that $p^{(l)}(x) = P_l\,p_\infty(x)$. Taking this into account, and adding all $M$ generalized Fokker–Planck equations, one obtains

$$\frac{\partial}{\partial t}\sum_{l=1}^{M}p^{(l)}(x,t) = \frac{\partial}{\partial t}p_\infty(x,t) = -\frac{d}{dx}\left[p_\infty(x,t)\sum_{l=1}^{M}P_l\,K_1^{(l)}(x)\right] + \frac{1}{2}\frac{d^2}{dx^2}\left[p_\infty(x,t)\sum_{l=1}^{M}P_l\,K_2^{(l)}(x)\right] \qquad(6.179)$$

or

$$\frac{\partial}{\partial t}p_\infty(x,t) = -\frac{d}{dx}\left[K_{1\infty}(x)\,p_\infty(x,t)\right] + \frac{1}{2}\frac{d^2}{dx^2}\left[K_{2\infty}(x)\,p_\infty(x,t)\right] \qquad(6.180)$$

Here the average drift $K_{1\infty}(x)$ and the average diffusion $K_{2\infty}(x)$ are given by

$$K_{1\infty}(x) = \sum_{l=1}^{M}P_l\,K_1^{(l)}(x), \qquad K_{2\infty}(x) = \sum_{l=1}^{M}P_l\,K_2^{(l)}(x) \qquad(6.181)$$
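The averaged coefficients of eq. (6.181) translate directly into a few lines of code. The sketch below (with two Gaussian, i.e. Ornstein–Uhlenbeck, states and illustrative parameter values assumed here) forms the averaged drift and diffusion and integrates the stationary solution of eq. (6.180) on a grid:

```python
import numpy as np

P = (0.4, 0.6)            # state probabilities P_l
m = (-1.0, 2.0)           # per-state means
sig = (1.0, 0.5)          # per-state standard deviations
tau = (1.0, 2.0)          # per-state time constants

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
K1 = -(P[0] * (x - m[0]) / tau[0] + P[1] * (x - m[1]) / tau[1])   # K_1inf(x)
K2 = 2.0 * (P[0] * sig[0] ** 2 / tau[0] + P[1] * sig[1] ** 2 / tau[1])  # K_2inf

# p_inf(x) proportional to (1/K2) * exp(2 * int_{-inf}^x K1/K2 ds)
integrand = 2.0 * K1 / K2
cumint = np.concatenate(([0.0],
                         np.cumsum((integrand[1:] + integrand[:-1]) / 2.0) * dx))
p = np.exp(cumint - cumint.max()) / K2
p /= p.sum() * dx          # normalize

mean = (x * p).sum() * dx
var = ((x - mean) ** 2 * p).sum() * dx
```

The computed mean and variance reproduce the time-constant-weighted averages of the per-state parameters, i.e. the limiting Gaussian discussed in the example that follows.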
The formal solution of eq. (6.180) is easy to obtain, since it is a regular Fokker–Planck equation:

$$p_\infty(x) = \frac{C}{K_{2\infty}(x)}\exp\left[2\int_{-\infty}^{x}\frac{K_{1\infty}(s)}{K_{2\infty}(s)}\,ds\right] = \frac{C}{K_{2\infty}(x)}\exp\left[2\int_{-\infty}^{x}\frac{\sum_{l=1}^{M}P_lK_1^{(l)}(s)}{\sum_{l=1}^{M}P_lK_2^{(l)}(s)}\,ds\right] \qquad(6.182)$$

Example: two Gaussian processes. Once again a system of two SDEs (6.106) is considered. In this case

$$K_1^{(1)}(x) = -\frac{x-m_1}{\tau_{s1}}, \quad K_1^{(2)}(x) = -\frac{x-m_2}{\tau_{s2}}, \quad K_2^{(1)}(x) = \frac{2\sigma_1^2}{\tau_{s1}}, \quad K_2^{(2)}(x) = \frac{2\sigma_2^2}{\tau_{s2}}$$

Thus, the averaged drift and diffusion are

$$K_{1\infty}(x) = -P_1\frac{x-m_1}{\tau_{s1}} - P_2\frac{x-m_2}{\tau_{s2}} = -\frac{x-m_\infty}{\tau_\infty} \qquad(6.183)$$

and

$$K_{2\infty}(x) = P_1\frac{2\sigma_1^2}{\tau_{s1}} + P_2\frac{2\sigma_2^2}{\tau_{s2}} = \frac{2\sigma_\infty^2}{\tau_\infty} \qquad(6.184)$$

respectively. Here

$$m_\infty = \frac{P_1m_1\tau_{s2}+P_2m_2\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}, \qquad \sigma_\infty^2 = \frac{P_1\sigma_1^2\tau_{s2}+P_2\sigma_2^2\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}, \qquad \tau_\infty = \frac{\tau_{s1}\tau_{s2}}{P_1\tau_{s2}+P_2\tau_{s1}} \qquad(6.185)$$

are parameters of the limiting PDF. It can be seen that the limiting distribution $p_\infty(x)$ itself is Gaussian with parameters defined by eq. (6.185):

$$p_\infty(x) = \frac{1}{\sqrt{2\pi\sigma_\infty^2}}\exp\left[-\frac{(x-m_\infty)^2}{2\sigma_\infty^2}\right] \qquad(6.186)$$

Example: two Gamma distributed processes. In this case, the system of two SDEs (6.120) is analyzed, with

$$K_1^{(1)}(x) = -\frac{x-\alpha_1/\beta_1}{\tau_{s1}}, \quad K_1^{(2)}(x) = -\frac{x-\alpha_2/\beta_2}{\tau_{s2}}, \quad K_2^{(1)}(x) = \frac{2x}{\beta_1\tau_{s1}}, \quad K_2^{(2)}(x) = \frac{2x}{\beta_2\tau_{s2}}$$

Thus, the averaged drift and diffusion are

$$K_{1\infty}(x) = -P_1\frac{x-\alpha_1/\beta_1}{\tau_{s1}} - P_2\frac{x-\alpha_2/\beta_2}{\tau_{s2}} = -\frac{x-\alpha_\infty/\beta_\infty}{\tau_\infty} \qquad(6.187)$$

and

$$K_{2\infty}(x) = P_1\frac{2x}{\beta_1\tau_{s1}} + P_2\frac{2x}{\beta_2\tau_{s2}} = \frac{2x}{\beta_\infty\tau_\infty} \qquad(6.188)$$

respectively. Here

$$\frac{\alpha_\infty}{\beta_\infty} = \frac{P_1(\alpha_1/\beta_1)\tau_{s2}+P_2(\alpha_2/\beta_2)\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}, \qquad \frac{1}{\beta_\infty} = \frac{P_1(1/\beta_1)\tau_{s2}+P_2(1/\beta_2)\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}, \qquad \tau_\infty = \frac{\tau_{s1}\tau_{s2}}{P_1\tau_{s2}+P_2\tau_{s1}} \qquad(6.189)$$

are the parameters of the limiting PDF. It can be seen that the limiting distribution $p_\infty(x)$ itself is a Gamma PDF with parameters defined by eq. (6.189):

$$p_\infty(x) = \frac{\beta_\infty^{\alpha_\infty}\,x^{\alpha_\infty-1}\exp(-\beta_\infty x)}{\Gamma(\alpha_\infty)} \qquad(6.190)$$

Taking into account that for the Gamma distribution

$$m = \frac{\alpha}{\beta} \qquad\text{and}\qquad \sigma = \frac{\sqrt{\alpha}}{\beta} \qquad(6.191)$$

one can rewrite eq. (6.189) as

$$m_\infty = \frac{P_1m_1\tau_{s2}+P_2m_2\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}, \qquad \sigma_\infty = \frac{P_1\sigma_1\tau_{s2}+P_2\sigma_2\tau_{s1}}{P_1\tau_{s2}+P_2\tau_{s1}}, \qquad \tau_\infty = \frac{\tau_{s1}\tau_{s2}}{P_1\tau_{s2}+P_2\tau_{s1}} \qquad(6.192)$$
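The mixing rules in eqs. (6.185) and (6.192) are the same time-constant-weighted average applied to different per-state parameters; a small helper makes this explicit (the function name is hypothetical, introduced here only for illustration):

```python
def limiting_param(P1, P2, v1, v2, tau1, tau2):
    """Weighted-average mixing rule of eqs. (6.185)/(6.192): the limiting
    value of a per-state parameter v under very fast switching."""
    return (P1 * v1 * tau2 + P2 * v2 * tau1) / (P1 * tau2 + P2 * tau1)

# e.g. the limiting mean of two states with means 0 and 3
m_inf = limiting_param(0.3, 0.7, 0.0, 3.0, 1.0, 0.5)
```

Note that the state with the shorter time constant is *down*-weighted: it decorrelates faster, so it contributes less to the limiting parameters.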
Thus the transformations of the mean value and the standard deviation follow the same rules as for the Gaussian process considered above. Another important conclusion concerns the influence of the correlation properties of the isolated systems on the resulting parameters of the distribution. When the distributions in the isolated states are identical, the resulting marginal PDF would, of course, remain the same as in the isolated states. However, if there is a difference, even in scale only (i.e. only in variance), the correlation properties of each isolated state begin to play a prominent role in the final result.

6.3.3.3 Case of a Finite Intensity
Of course, for a finite but large $\nu$, a certain correction term must be added. The goal of this section is to derive an appropriate expression for such a correction term. In order to keep the derivations compact, the case of two equations is considered below:

$$0 = -\frac{\partial}{\partial x}\left[K_1^{(1)}(x)p^{(1)}(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2^{(1)}(x)p^{(1)}(x)\right] - \nu P_2\,p^{(1)}(x) + \nu P_1\,p^{(2)}(x)$$
$$0 = -\frac{\partial}{\partial x}\left[K_1^{(2)}(x)p^{(2)}(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2^{(2)}(x)p^{(2)}(x)\right] + \nu P_2\,p^{(1)}(x) - \nu P_1\,p^{(2)}(x) \qquad(6.193)$$

This can be rewritten in vector form as

$$-\frac{\partial}{\partial x}\left[\mathbf{K}_1(x)\,\vec{p}(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\mathbf{K}_2(x)\,\vec{p}(x)\right] - \nu\mathbf{Q}\,\vec{p}(x) = 0 \qquad(6.194)$$

or

$$\mathbf{L}_{FP}[\vec{p}(x)] = \nu\mathbf{Q}\,\vec{p}(x) \qquad(6.195)$$

where

$$\mathbf{K}_1(x) = \begin{bmatrix}K_1^{(1)}(x) & 0\\ 0 & K_1^{(2)}(x)\end{bmatrix}, \qquad \mathbf{K}_2(x) = \begin{bmatrix}K_2^{(1)}(x) & 0\\ 0 & K_2^{(2)}(x)\end{bmatrix}, \qquad \mathbf{Q} = \begin{bmatrix}P_2 & -P_1\\ -P_2 & P_1\end{bmatrix} \qquad(6.196)$$

and

$$\mathbf{L}_{FP}[\vec{p}(x)] = -\frac{\partial}{\partial x}\left[\mathbf{K}_1(x)\,\vec{p}(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\mathbf{K}_2(x)\,\vec{p}(x)\right] \qquad(6.197)$$

is the Fokker–Planck operator written in matrix form. One may look for the solution of eq. (6.194) in the form of an infinite series

$$\vec{p}(x) = \mathbf{P}_\infty\,\vec{p}_\infty(x) + \sum_{m=1}^{\infty}\frac{\mathbf{P}_m}{\nu^m}\left[\vec{p}_m(x)-\vec{p}_\infty(x)\right] \qquad(6.198)$$

where

$$\mathbf{P}_\infty = \begin{bmatrix}P_1 & 0\\ 0 & P_2\end{bmatrix}, \qquad \vec{p}_\infty(x) = \begin{bmatrix}p_\infty(x)\\ p_\infty(x)\end{bmatrix} \qquad(6.199)$$

and $\mathbf{P}_m$ and $\vec{p}_m(x)$ are the matrices and the vector functions to be defined. Substituting a solution in the form (6.198) into eq. (6.194), and using the linearity of the Fokker–Planck operator, one obtains

$$\mathbf{L}_{FP}[\vec{p}(x)] = \mathbf{L}_{FP}[\mathbf{P}_\infty\vec{p}_\infty(x)] + \sum_{m=1}^{\infty}\frac{1}{\nu^m}\,\mathbf{L}_{FP}\{\mathbf{P}_m[\vec{p}_m(x)-\vec{p}_\infty(x)]\} = \nu\mathbf{Q}\left\{\mathbf{P}_\infty\vec{p}_\infty(x) + \sum_{m=1}^{\infty}\frac{\mathbf{P}_m}{\nu^m}[\vec{p}_m(x)-\vec{p}_\infty(x)]\right\}$$
$$= \sum_{m=1}^{\infty}\frac{\mathbf{Q}\mathbf{P}_m}{\nu^{m-1}}\left[\vec{p}_m(x)-\vec{p}_\infty(x)\right] \qquad(6.200)$$

since $\mathbf{Q}\mathbf{P}_\infty\vec{p}_\infty = 0$. By equating the terms with equal negative powers of $\nu$, one can obtain a sequence of equations:

$$\mathbf{Q}\mathbf{P}_1\left[\vec{p}_1(x)-\vec{p}_\infty(x)\right] = -\frac{\partial}{\partial x}\left[\mathbf{K}_1(x)\mathbf{P}_\infty\vec{p}_\infty(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\mathbf{K}_2(x)\mathbf{P}_\infty\vec{p}_\infty(x)\right], \qquad m=0 \qquad(6.201)$$

and

$$\mathbf{Q}\mathbf{P}_{m+1}\left[\vec{p}_{m+1}(x)-\vec{p}_\infty(x)\right] = -\frac{\partial}{\partial x}\left\{\mathbf{K}_1(x)\mathbf{P}_m\left[\vec{p}_m(x)-\vec{p}_\infty(x)\right]\right\} + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left\{\mathbf{K}_2(x)\mathbf{P}_m\left[\vec{p}_m(x)-\vec{p}_\infty(x)\right]\right\}, \qquad m>0 \qquad(6.202)$$

It follows from eq. (6.182) that

$$P_1\left\{-\frac{\partial}{\partial x}\left[K_1^{(1)}(x)p_\infty(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2^{(1)}(x)p_\infty(x)\right]\right\} = -P_2\left\{-\frac{\partial}{\partial x}\left[K_1^{(2)}(x)p_\infty(x)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2^{(2)}(x)p_\infty(x)\right]\right\} \qquad(6.203)$$

or, equivalently,

$$P_1L_{1\,FP}[p_\infty(x)] = -P_2L_{2\,FP}[p_\infty(x)] \qquad(6.204)$$

Thus, each of the vector equations (6.201)–(6.202) contains the same scalar equation repeated twice, i.e. a unique solution cannot be obtained from eqs. (6.201)–(6.202) alone. This is the consequence of the fact that the matrix $\mathbf{Q}$ is degenerate, i.e. $\det(\mathbf{Q})=0$. As a result, some additional conditions are required to obtain a unique solution of the original problem. This additional condition is the constant probability current condition (6.174), considered in section 6.3.3.1. To make proper use of this condition, the first scalar component of eq. (6.201) can be written as

$$P_1L_{1\,FP}[p_\infty(x)] - P_1P_2\left[p_1^{(1)}(x)-p_\infty(x)\right] + P_1P_2\left[p_1^{(2)}(x)-p_\infty(x)\right] = 0 \qquad(6.205)$$
or, equivalently,

$$p_1^{(1)}(x) - p_1^{(2)}(x) = \frac{1}{P_2}\,L_{1\,FP}[p_\infty(x)] = -\frac{1}{P_1}\,L_{2\,FP}[p_\infty(x)] \qquad(6.206)$$

Using the approximation (6.198), the zero current density condition can now be rewritten as

$$P_1K_1^{(1)}(x)p_\infty(x) + P_2K_1^{(2)}(x)p_\infty(x) - \frac{P_1}{2}\frac{\partial}{\partial x}\left[K_2^{(1)}(x)p_\infty(x)\right] - \frac{P_2}{2}\frac{\partial}{\partial x}\left[K_2^{(2)}(x)p_\infty(x)\right]$$
$$+ \frac{1}{\nu}\left\{P_1K_1^{(1)}(x)p_1^{(1)}(x) + P_2K_1^{(2)}(x)p_1^{(2)}(x) - \frac{P_1}{2}\frac{\partial}{\partial x}\left[K_2^{(1)}(x)p_1^{(1)}(x)\right] - \frac{P_2}{2}\frac{\partial}{\partial x}\left[K_2^{(2)}(x)p_1^{(2)}(x)\right]\right\} = 0 \qquad(6.207)$$

Furthermore, since $p_\infty(x)$ satisfies the condition

$$\left[P_1K_1^{(1)}(x)+P_2K_1^{(2)}(x)\right]p_\infty(x) - \frac{1}{2}\frac{\partial}{\partial x}\left\{\left[P_1K_2^{(1)}(x)+P_2K_2^{(2)}(x)\right]p_\infty(x)\right\} = 0 \qquad(6.208)$$

eq. (6.207) can be simplified using expression (6.206) to produce an equation for $p_1^{(1)}(x)$:

$$\left[P_1K_1^{(1)}(x)+P_2K_1^{(2)}(x)\right]p_1^{(1)}(x) - \frac{1}{2}\frac{\partial}{\partial x}\left\{\left[P_1K_2^{(1)}(x)+P_2K_2^{(2)}(x)\right]p_1^{(1)}(x)\right\} = K_1^{(2)}(x)L_{1\,FP}[p_\infty(x)] - \frac{1}{2}\frac{\partial}{\partial x}\left\{K_2^{(2)}(x)L_{1\,FP}[p_\infty(x)]\right\} \qquad(6.209)$$

Alternatively, using the notation of eq. (6.181), one obtains

$$K_{1\infty}(x)\,p_1^{(1)}(x) - \frac{1}{2}\frac{\partial}{\partial x}\left[K_{2\infty}(x)\,p_1^{(1)}(x)\right] = K_1^{(2)}(x)L_{1\,FP}[p_\infty(x)] - \frac{1}{2}\frac{\partial}{\partial x}\left\{K_2^{(2)}(x)L_{1\,FP}[p_\infty(x)]\right\} \qquad(6.210)$$

which is an ordinary non-homogeneous differential equation of the first order. The solution of the homogeneous equation is once again proportional to $p_\infty(x)$:

$$p(x) = C\,p_\infty(x) = \frac{C}{K_{2\infty}(x)}\exp\left[2\int_{-\infty}^{x}\frac{K_{1\infty}(s)}{K_{2\infty}(s)}\,ds\right] \qquad(6.211)$$

The next step is to find the solution $p_1^{(1)}(x)$ by variation of parameters, i.e. the solution is sought in the form

$$p_1^{(1)}(x) = \frac{C(x)}{K_{2\infty}(x)}\exp\left[2\int_{-\infty}^{x}\frac{K_{1\infty}(s)}{K_{2\infty}(s)}\,ds\right] \qquad(6.212)$$

where the function $C(x)$ has to be determined. Substituting expression (6.212) into eq. (6.210), one obtains a differential equation for the unknown function $C(x)$:

$$-\frac{1}{2}\exp\left[2\int_{-\infty}^{x}\frac{K_{1\infty}(s)}{K_{2\infty}(s)}\,ds\right]\frac{d}{dx}C(x) = K_1^{(2)}(x)L_{1\,FP}[p_\infty(x)] - \frac{1}{2}\frac{\partial}{\partial x}\left\{K_2^{(2)}(x)L_{1\,FP}[p_\infty(x)]\right\} \qquad(6.213)$$
whose general solution is

$$C(x) = C_0 + \int_{-\infty}^{x}\exp\left[-2\int_{-\infty}^{s}\frac{K_{1\infty}(y)}{K_{2\infty}(y)}\,dy\right]\left\{\frac{\partial}{\partial s}\left\{K_2^{(2)}(s)L_{1\,FP}[p_\infty(s)]\right\} - 2K_1^{(2)}(s)L_{1\,FP}[p_\infty(s)]\right\}ds \qquad(6.214)$$

Since $K_1^{(2)}(x)$ and $K_2^{(2)}(x)$ are related through the stationary probability density $p_{st}^{(2)}(x)$ of the second isolated state as

$$K_1^{(2)}(s) = \frac{1}{2}\,\frac{\dfrac{d}{ds}\left[K_2^{(2)}(s)\,p_{st}^{(2)}(s)\right]}{p_{st}^{(2)}(s)} \qquad(6.215)$$

the expression in braces can be further simplified to produce

$$\frac{\partial}{\partial s}\left\{K_2^{(2)}(s)L_{1\,FP}[p_\infty(s)]\right\} - 2K_1^{(2)}(s)L_{1\,FP}[p_\infty(s)] = \frac{\partial}{\partial s}\left\{K_2^{(2)}(s)L_{1\,FP}[p_\infty(s)]\right\} - \frac{\dfrac{d}{ds}\left[K_2^{(2)}(s)p_{st}^{(2)}(s)\right]}{p_{st}^{(2)}(s)}\,L_{1\,FP}[p_\infty(s)]$$
$$= K_2^{(2)}(s)\,p_{st}^{(2)}(s)\,\frac{d}{ds}\left\{\frac{L_{1\,FP}[p_\infty(s)]}{p_{st}^{(2)}(s)}\right\} \qquad(6.216)$$

Finally, eq. (6.214) can be written as

$$C(x) = C_0 + \int_{-\infty}^{x}\exp\left[-2\int_{-\infty}^{s}\frac{K_{1\infty}(y)}{K_{2\infty}(y)}\,dy\right]K_2^{(2)}(s)\,p_{st}^{(2)}(s)\,\frac{d}{ds}\left\{\frac{L_{1\,FP}[p_\infty(s)]}{p_{st}^{(2)}(s)}\right\}ds \qquad(6.217)$$

The values of the remaining constants can be determined through the normalization conditions.
6.4 CONCLUDING REMARKS
The following remarks can be added to the results obtained for processes with random structure:

1. The theory of Markov processes with random structure allows the same level of development as the theory of continuous Markov processes. In particular, these results can be extended to the Pontryagin equations in attainment problems [21], narrow band processes with random structure, etc.

2. Analytical results in a closed form can be obtained only for the cases of a very low or a very high intensity of switching. These limiting cases are especially attractive since they can be seen as perturbed solutions of isolated systems.

3. Despite the asymptotic nature of the results discussed, they can readily be applied to model processes with weak non-stationarity, mixture models such as Middleton Class A noise, etc. [22].

4. In general, the model of a process with random structure, with a proper choice of the absorption and restoration flows, allows for the modelling of a great variety of relatively little studied processes which are very important in applications, such as branching processes, etc.
REFERENCES

1. I. Kazakov and V. Artem'ev, Optimization of Dynamic Systems with Random Structure, Moscow: Nauka, 1980 (in Russian).
2. D. Middleton, Non-Gaussian Noise Models in Signal Processing for Telecommunications: New Methods and Results for Class A and Class B Noise Models, IEEE Trans. Information Theory, vol. 45, no. 5, May 1999, pp. 1129–1149.
3. E. Lutz, D. Cygan, M. Dippold, F. Dolainsky, and W. Papke, The Land Mobile Satellite Communication Channel — Recording, Statistics and Channel Model, IEEE Trans. Vehicular Technology, vol. 40, no. 2, 1991, pp. 375–380.
4. A. Brusentzov and V. Kontorovich, Radio Channel Modeling Based on Dynamic Systems with Random Structure, Proc. Int. Conf. Commsphere-91, Hertzlia, Israel, December 1991.
5. D. Goodman and S. Wei, Efficiency of Packet Reservation Multiple Access, IEEE Trans. Vehicular Technology, vol. 40, no. 1, February 1991, pp. 170–176.
6. L. R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proc. of the IEEE, vol. 77, February 1989, pp. 257–286.
7. J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-Time Processing of Speech Signals, New York: Macmillan, 1993.
8. M. Zorzi, Outage and Error Events in Bursty Channels, IEEE Trans. Communications, vol. 46, March 1998, pp. 349–356.
9. W. Turin and R. van Nobelen, Hidden Markov Modeling of Flat Fading Channels, IEEE Journal on Selected Areas in Communications, vol. 16, no. 12, December 1998, pp. 1809–1817.
10. F. Babich and G. Lombardi, A Measurement Based Markov Model for the Indoor Propagation Channel, Proc. IEEE Vehicular Technology Conference, vol. 1, Phoenix, AZ, May 1997, pp. 77–81.
11. F. Babich, G. Lombardi, S. Shiavon, and E. Valentinuzzi, Measurements and Analysis of the Digital DECT Propagation Channel, Proc. IEEE Conf. Wireless Personal Communications, Florence, Italy, October 1998, pp. 593–597.
12. K. T. Fang, S. Kotz, and K. W. Ng, Symmetric Multivariate and Related Distributions, New York: Chapman and Hall, 1990.
13. H. Risken, The Fokker–Planck Equation: Methods of Solution and Applications, Berlin: Springer, 1996, 472 p.
14. V. Kontorovich and V. Lyandres, Dynamic Systems with Random Structure: An Approach to the Generation of Non-Stationary Stochastic Processes, Journal of the Franklin Institute, vol. 336, 1999, pp. 939–954.
15. V. Ya. Kontorovich, S. Primak, and K. Almustafa, On Limits of Applicability of the Cumulant Method to Analysis of Dynamic Systems with Random Structure and Deterministic Systems with Chaotic Behaviour, Proc. DCDIS-2003, Guelph, Ontario, Canada, May 2003.
16. J. Ord, Families of Frequency Distributions, London: Griffin, 1972.
17. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, Dover, 1965.
18. I. Gradshteyn and I. Ryzhik, Tables of Integrals, Series, and Products, New York: Academic Press, 2000.
19. Yu. A. Mitropolskii and N. Van Dao, Applied Asymptotic Methods in Nonlinear Oscillations, Boston: Kluwer Academic Publishers, 1997.
20. T. Kato, Perturbation Theory for Linear Operators, New York: Springer-Verlag, 1966.
21. V. Kontorovich, Pontryagin Equations for Non-Linear Dynamic Systems with Random Structure, Nonlinear Analysis, vol. 47, 2001, pp. 1501–1512.
22. H. Inoue, K. Sasajima, and M. Tanaka, A Simulation of "Class A" Noise by P-CNG, Proc. IEEE Int. Symp. EMC, 1999, vol. 1, pp. 520–525.
23. O. A. Ladyzhenskaia and N. N. Ural'tseva, Linear and Quasilinear Elliptic Equations, trans. ed. L. Ehrenpreis, New York: Academic Press, 1968.
24. B. S. Everitt and D. J. Hand, Finite Mixture Distributions, New York: Chapman and Hall, 1981.
25. V. Kontorovich and V. Lyandres, Pulse Non-Stationary Process Generated by Dynamic Systems with Random Structure, Journal of the Franklin Institute, 2003.
26. V. Kontorovich, V. Lyandres, and S. Primak, Non-Linear Methods in Information Processing: Modeling, Numerical Simulation and Applications, DCDIS, Series B: Applications & Algorithms, vol. 10, October 2003, pp. 417–428.
27. A. N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations, Moscow: Sovetskoe Radio, 1978 (in Russian).
7 Synthesis of Stochastic Differential Equations

7.1 INTRODUCTION
Modeling of signals and interference in the fields of communications, radar, sonar and speech processing is usually based on an assumption that these processes may be considered as stationary (or at least locally stationary) and that experimental estimates of their simplest statistical characteristics, the autocovariance function (ACF) and the marginal probability density function (PDF), are available. While the generation of stationary random processes (sequences) with either a specified PDF or a required valid ACF does not present any principal difficulty, the solution of the joint problem requires much more effort. One of the most often used methods was suggested in [1]. This approach utilizes sequential combinations of linear filtering and zero memory non-linear transformations of White Gaussian Noise (WGN). A different technique is based on treatment of a process with the prescribed characteristics as a stationary solution of a proper system of stochastic differential equations (SDE) with WGN or a Poisson process as an external driving force [2,3]. Such an interpretation is attractive as it takes advantage of Markov process theory and is efficient in the modeling of correlated non-Gaussian processes. This approach has been found useful in the modeling of continuous and impulsive noise in communication systems [4], radar [5,6], biology [7], statistical electromagnetics [8], bursty internet traffic, Middleton Class A noise, intersymbol interference combined with additive noise [9,10–13], speech [14] and stochastic ratchets [15,16], etc. It is important to mention [2,64] that the SDE approach provides a unified framework for simulation of the above mentioned phenomena. For practical purposes it is important to distinguish between baseband processes with an exponential-type covariance function and narrow band processes, which often have exponentially decaying cosine covariance functions.
A class of processes with a certain PDF and an exponential ACF was originally considered by E. Wong [18,21] and A. Haddad [22,25] in the sixties and has been extended to narrow band processes in [26].
Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres # 2004 John Wiley & Sons, Ltd ISBN: 0-470-84741-7
The following two definitions are important in the further derivations.

Definition 7.1. A scalar stationary diffusion Markov process $\xi(t)$ is called a $\lambda$ process if its autocovariance function is a pure exponent, i.e.

$$C_{\xi\xi}(\tau) = \sigma^2\exp(-\lambda|\tau|) \qquad(7.1)$$

Definition 7.2. A stationary random process $\xi(t)$ is called a $(\lambda,\omega)$ process if it is a component of a diffusion Markov process and has a covariance function given by a decaying exponent,

$$C_{xx}(\tau) = \sigma^2\exp(-\lambda|\tau|)\cos\omega\tau \qquad(7.2)$$

Although $\lambda$ processes are usually referred to as wide sense stationary Markov (WSSM) [27] or separable class [25] processes, the terminology of the definitions used here is more physically revealing.
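The two covariance shapes of Definitions 7.1 and 7.2 are easy to tabulate side by side; a short sketch (all parameter values illustrative):

```python
import numpy as np

sigma2, lam, omega = 2.0, 0.5, 3.0
tau = np.linspace(0.0, 10.0, 1001)

# eq. (7.1): pure exponential covariance of a lambda process
C_lambda = sigma2 * np.exp(-lam * np.abs(tau))

# eq. (7.2): decaying-cosine covariance of a (lambda, omega) process
C_lambda_omega = sigma2 * np.exp(-lam * np.abs(tau)) * np.cos(omega * tau)
```

The first curve decays monotonically from $\sigma^2$, while the second oscillates inside the same exponential envelope, which is the qualitative difference between baseband and narrow band processes noted above.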
7.2 MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE

7.2.1 General Synthesis Procedure for the First Order SDE
The goal of this section is to design a procedure which allows one to synthesize a first order SDE (in the Ito form)

$$\frac{dx}{dt} = f(x) + g(x)\,\xi(t) \qquad(7.3)$$

such that its stationary solution has a given PDF $p_{x\,st}(x)$ and a given correlation interval $\tau_{corr}$. For the moment one can assume that a non-negative function $g(x) = \sqrt{K_2(x)}\ge 0$ is already chosen, and the goal is to find a proper drift $K_1(x) = f(x)$ such that

$$p_{x\,st}(x) = \frac{C}{K_2(x)}\exp\left[2\int_{-\infty}^{x}\frac{K_1(s)}{K_2(s)}\,ds\right] \qquad(7.4)$$

Multiplying both sides of eq. (7.4) by $K_2(x)=g^2(x)$, taking the natural logarithm and differentiating with respect to $x$, one obtains

$$2\,\frac{K_1(x)}{K_2(x)} = \frac{d}{dx}\ln\left[K_2(x)\,p_{x\,st}(x)\right] \qquad(7.5)$$

Solving eq. (7.5) with respect to $K_1(x)=f(x)$, the desired synthesis equation becomes [2]

$$f(x) = K_1(x) = \frac{1}{2}K_2(x)\frac{d}{dx}\ln\left[K_2(x)p_{x\,st}(x)\right] = \frac{1}{2}\left[K_2(x)\,\frac{\frac{d}{dx}p_{x\,st}(x)}{p_{x\,st}(x)} + \frac{d}{dx}K_2(x)\right] \qquad(7.6)$$
Recalling that $K_2(x)=g^2(x)$, eq. (7.6) can be rewritten as

$$f(x) = K_1(x) = \frac{1}{2}\left[g^2(x)\,\frac{\frac{d}{dx}p_{x\,st}(x)}{p_{x\,st}(x)} + 2g(x)\frac{d}{dx}g(x)\right] \qquad(7.7)$$

Thus, there are many SDEs which share the same marginal PDF [28]. It will be seen from the following analysis that the choice of the function $g(x)$ defines the higher order statistics, such as the covariance function, the correlation interval, etc. Further simplification can be achieved if the diffusion is assumed to be constant, i.e. $g(x)=\sqrt{K}=\text{const}$. In this case

$$f(x) = \frac{K}{2}\,\frac{\frac{d}{dx}p_{x\,st}(x)}{p_{x\,st}(x)} = \frac{K}{2}\,\frac{d}{dx}\ln p_{x\,st}(x) \qquad(7.8)$$

This was first suggested in [30] and first published in English in [2]. The only parameter yet to be determined is the scale parameter $K$, which defines the correlation interval. The value of the correlation interval can sometimes be estimated using an approximate formula derived in [2,31] (see also section 5.8):

$$\tau_{corr} \approx \frac{1}{a_1 + a_3 n_0} \qquad(7.9)$$

where

$$a_k = -\frac{1}{k!}\,\frac{d^k}{dx^k}K_1(x)\bigg|_{x=0} = -\frac{K}{2\,k!}\,\frac{d^{k+1}}{dx^{k+1}}\ln p_{x\,st}(x)\bigg|_{x=0} \qquad(7.10)$$

and

$$n_0 = \frac{m_4}{m_2} = \frac{\int_{-\infty}^{\infty}x^4\,p_{x\,st}(x)\,dx}{\int_{-\infty}^{\infty}x^2\,p_{x\,st}(x)\,dx} \qquad(7.11)$$

As a result,

$$K \approx -\frac{2}{\tau_{corr}}\left[\frac{d^2}{dx^2}\ln p_{x\,st}(x)\bigg|_{x=0} + \frac{n_0}{6}\,\frac{d^4}{dx^4}\ln p_{x\,st}(x)\bigg|_{x=0}\right]^{-1} \qquad(7.12)$$

This approximation works well when the drift is a symmetric non-linear function well approximated by a cubic polynomial, i.e. for $K_1(x)\approx -b_1x - b_3x^3$. In other cases alternative methods must be used. If the spectrum of the corresponding Fokker–Planck equation is discrete and known, it is appropriate to choose $K$ such that $\lambda_1 = 1/\tau_{corr}$.
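As a worked illustration of the synthesis rule (7.8), the sketch below synthesizes the drift for a non-Gaussian target PDF $p(x)\propto\exp(-x^4/4)$ and checks the simulated marginal against the target moments. The target PDF and all numeric values are illustrative choices, not from the text; for this target $E[x^2]=2\Gamma(3/4)/\Gamma(1/4)\approx 0.676$ and $E[x^4]=1$ exactly, independent of $K$ ($K$ only sets the correlation interval).

```python
import numpy as np

rng = np.random.default_rng(7)

# Target marginal PDF p(x) ~ exp(-x^4/4)  =>  ln p = -x^4/4 + const,
# so eq. (7.8) gives the drift f(x) = (K/2) d/dx ln p = -(K/2) x^3.
K = 1.0

def f(x):
    return -(K / 2.0) * x ** 3

g = np.sqrt(K)

# Euler-Maruyama integration of dx = f(x) dt + g dW
dt, n_steps, burn = 1e-3, 400_000, 20_000
x = 0.0
samples = np.empty(n_steps)
for k in range(n_steps):
    x += f(x) * dt + g * np.sqrt(dt) * rng.normal()
    samples[k] = x
samples = samples[burn:]

m2 = np.mean(samples ** 2)   # target: ~0.676
m4 = np.mean(samples ** 4)   # target: 1 exactly
```

The same recipe works for any smooth target PDF: only the line computing `f` changes.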
Example: Gaussian Markov process. In this case the stationary marginal PDF p_{x\,st}(x) is given by

\[ p_{x\,st}(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(x-m)^2}{2\sigma^2}\right] \tag{7.13} \]

and, thus,

\[ \frac{d}{dx}\ln p_{x\,st}(x) = -\frac{x-m}{\sigma^2} \tag{7.14} \]

In turn, eq. (7.12) provides the value of K equal to

\[ K = \frac{2\sigma^2}{\tau_{corr}} \tag{7.15} \]

Finally, plugging expressions (7.14) and (7.15) into eq. (7.8), one obtains the generating SDE for a Markov process with a Gaussian marginal PDF

\[ \dot{x} = -\frac{x-m}{\tau_{corr}} + \sqrt{\frac{2\sigma^2}{\tau_{corr}}}\,\xi(t) \tag{7.16} \]

The corresponding Fokker–Planck equation for the transitional density \pi(x,t\mid x_0,t_0) has been solved in Section 5.3.6; only the final result is shown below:

\[ \pi(x,t\mid x_0,t_0) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(x-m)^2}{2\sigma^2}\right]\sum_{k=0}^{\infty}\frac{1}{2^k k!}\,H_k\!\left(\frac{x-m}{\sqrt{2}\,\sigma}\right)H_k\!\left(\frac{x_0-m}{\sqrt{2}\,\sigma}\right)\exp\left(-\frac{k|\tau|}{\tau_{corr}}\right) \tag{7.17} \]

Here H_k(x) are the Hermite polynomials [32] and \tau = t - t_0. It is easy to see that the transitional density (7.17) gives rise to an exponential covariance function

\[ C_{xx}(\tau) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(x-m)(x_0-m)\,\pi(x,t\mid x_0,t_0)\,p_{x\,st}(x_0)\,dx\,dx_0 = \sigma^2\exp\left(-\frac{|\tau|}{\tau_{corr}}\right) \tag{7.18} \]
i.e. approximation (7.12) produces the exact result. It is interesting to note that the Markov process defined by SDE (7.16) is also a Gaussian process, i.e. the joint PDF at any number of time instances remains Gaussian. However, choosing a non-constant diffusion K_2(x) in eq. (7.7), one can obtain a non-Gaussian Markov process with a Gaussian marginal PDF. Such a process would also have a non-exponential covariance function.
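SDE (7.16) is easy to verify by simulation. The sketch below (illustrative only; m = 0, \sigma = 1, \tau_{corr} = 1 are assumed) integrates it with the Euler–Maruyama scheme and checks that the sample mean, variance and lag-\tau_{corr} correlation coefficient agree with m, \sigma^2 and e^{-1} from eq. (7.18).

```python
import math, random

def simulate_ou(m=0.0, sigma=1.0, tau_corr=1.0, dt=0.01, n=200_000, seed=1):
    """Euler-Maruyama integration of eq. (7.16):
       dx = -(x - m)/tau_corr * dt + sqrt(2*sigma^2/tau_corr) * dW."""
    rng = random.Random(seed)
    s = math.sqrt(2.0 * sigma ** 2 / tau_corr * dt)  # std of the noise increment
    x, path = m, []
    for _ in range(n):
        x += -(x - m) / tau_corr * dt + s * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

xs = simulate_ou()
mean = sum(xs) / len(xs)
var = sum((v - mean) ** 2 for v in xs) / len(xs)
lag = 100                       # lag of tau_corr = 1.0 with dt = 0.01
cov = sum((xs[i] - mean) * (xs[i + lag] - mean)
          for i in range(len(xs) - lag)) / (len(xs) - lag)
rho = cov / var                 # should be close to exp(-1) by eq. (7.18)
```

With a fixed seed the run is reproducible; the estimates match the theoretical values to within a few percent.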
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
325
Example: a random process with a two-sided Laplace marginal PDF. The two-sided Laplace distribution is defined by the following PDF

\[ p_{x\,st}(x) = \lambda\exp\left(-2\lambda|x|\right) \tag{7.19} \]

where \lambda > 0 is a parameter of the distribution. The PDF (7.19) is often used to model telephone noise [33] and other phenomena. It is easy to see from eq. (7.7) that the generating SDE has the following form

\[ \dot{x} = -\lambda K\,\mathrm{sgn}\,x + \sqrt{K}\,\xi(t) \tag{7.20} \]

with the constant diffusion coefficient K yet to be determined. Since further differentiation of the drift is impossible, approximation (7.12) cannot be used here. It is also known [38] that the corresponding Fokker–Planck equation has only one discrete eigenvalue \lambda_0 = 0, the rest of the spectrum being continuous; thus it is not possible to use an approximation based on the second smallest eigenvalue either. A fair estimate can be obtained through statistical linearization of the non-linear function f(x) in the generating SDE [36]. In other words, one needs to find a slope \beta such that the mean square error

\[ E\{(f(x)-\beta x)^2\} = \int_{-\infty}^{\infty}(f(x)-\beta x)^2\,p_{x\,st}(x)\,dx \tag{7.21} \]

is minimized. Taking the derivative of both sides with respect to \beta and equating it to zero, one obtains the optimal value of the slope to be

\[ \beta_{opt} = \frac{\int_{-\infty}^{\infty}x\,f(x)\,p_{x\,st}(x)\,dx}{\int_{-\infty}^{\infty}x^2\,p_{x\,st}(x)\,dx} = \frac{1}{\sigma_x^2}\int_{-\infty}^{\infty}x\,f(x)\,p_{x\,st}(x)\,dx = -\frac{1}{\tau_{corr}} \tag{7.22} \]

In the case of SDE (7.20) this estimate becomes

\[ \beta_{opt} = -\lambda^2 K = -\frac{1}{\tau_{corr}} \tag{7.23} \]
Thus, in order to provide a proper correlation interval one has to choose

\[ K = \frac{1}{\lambda^2\tau_{corr}} \tag{7.24} \]

and the generating SDE becomes

\[ \dot{x} = -\frac{1}{\lambda\tau_{corr}}\,\mathrm{sgn}\,x + \sqrt{\frac{1}{\lambda^2\tau_{corr}}}\,\xi(t) \tag{7.25} \]
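A short Monte-Carlo check of SDE (7.25) (an illustrative sketch; \lambda = 1 and \tau_{corr} = 1 are assumed): the stationary density should be \lambda e^{-2\lambda|x|}, so E|x| = 1/(2\lambda) = 0.5 and E[x^2] = 1/(2\lambda^2) = 0.5.

```python
import math, random

def simulate_laplace(lam=1.0, tau_corr=1.0, dt=0.005, n=300_000, seed=2):
    """Euler-Maruyama for eq. (7.25):
       dx = -sgn(x)/(lam*tau_corr) * dt + sqrt(1/(lam^2*tau_corr)) * dW."""
    rng = random.Random(seed)
    a = 1.0 / (lam * tau_corr)                  # drift magnitude
    s = math.sqrt(dt / (lam ** 2 * tau_corr))   # std of the noise increment
    x, path = 0.0, []
    for _ in range(n):
        x += -a * math.copysign(1.0, x) * dt + s * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

xs = simulate_laplace()
mean_abs = sum(abs(v) for v in xs) / len(xs)   # theory: 1/(2*lam) = 0.5
mean_sq = sum(v * v for v in xs) / len(xs)     # theory: 1/(2*lam^2) = 0.5
```

The discontinuous drift makes a small time step advisable; with dt = 0.005 the bias of the Euler scheme is well below the statistical tolerance.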
A large number of further examples can be constructed and some of them are summarized in Appendix A on the web.
7.2.2 Synthesis of an SDE with PDF Defined on a Part of the Real Axis
It can be seen from eq. (7.8) that the synthesis of SDEs implicitly assumes the existence of the logarithmic derivative on the whole real axis, i.e. p_{x\,st}(x) \ne 0. However, many useful PDFs have non-zero values only on a part of the real axis. In this case the direct application of eq. (7.8) is not appropriate, and simply ignoring the intervals where p_{x\,st}(x) = 0 leads to contradictions. For example, the linear SDE

\[ \dot{x} = -\gamma x + \xi(t) \tag{7.26} \]

describes both a Gaussian and a one-sided Gaussian process. Furthermore, numerical simulation of eq. (7.26), using for example the Euler scheme (see Appendix C on the web and [34]), would reproduce only the Gaussian PDF. Interestingly enough, the related Fokker–Planck equation defines the two distributions properly, since it is supplemented by a set of boundary conditions which are different for the Gaussian (natural boundary conditions at x = \pm\infty) and the one-sided Gaussian (reflecting boundary at x = 0 and natural boundary condition at x = \infty) cases [38]. Thus, certain modifications must be made to the synthesis eq. (7.8) to accommodate these boundary conditions. The following procedure has been suggested in [39]. Let p_{x\,st}(x) be defined only on a finite interval x \in [a,b] and let \varphi(x,\varepsilon) be a function which satisfies the following conditions: \varphi(x,\varepsilon) > 0 on the real axis; \tilde{p}(x,\varepsilon) = [\varepsilon + p_{x\,st}(x)]\varphi(x,\varepsilon) is a PDF for \varepsilon > 0; the logarithmic derivative of \tilde{p}(x,\varepsilon) is defined on the whole real axis; and

\[ \lim_{\varepsilon\to 0}\tilde{p}(x,\varepsilon) = p_{x\,st}(x) \]

One possible candidate for such a function, defined in [39], is

\[ \varphi(x,\varepsilon) = C(\varepsilon)\left[1 + \left(\frac{2x-a-b}{b-a}\right)^2\right]^{-2/\varepsilon} \tag{7.27} \]
Here C(\varepsilon) is chosen in such a way that the conditions above are satisfied. It can be easily seen that the logarithmic derivative is

\[ \frac{\partial}{\partial x}\ln\tilde{p}(x,\varepsilon) = \frac{\partial p_{x\,st}(x)/\partial x}{p_{x\,st}(x)+\varepsilon} + \frac{\partial}{\partial x}\ln\varphi(x,\varepsilon) \tag{7.28} \]

The second term in eq. (7.28) approaches a sum of delta functions as \varepsilon approaches zero:

\[ \lim_{\varepsilon\to 0}\frac{\partial}{\partial x}\ln\varphi(x,\varepsilon) = 2\delta(x-a) - 2\delta(x-b) \tag{7.29} \]

thus constituting two barriers in the function f(x). Therefore, if the PDF p_{x\,st}(x) is zero on some part of the real axis, one has to augment eq. (7.8) by the correction term (7.29), and thus the synthesis equation becomes

\[ \dot{x} = \frac{K}{2}\left[\frac{\partial}{\partial x}\ln p_{x\,st}(x) + 2\delta(x-a) - 2\delta(x-b)\right] + \sqrt{K}\,\xi(t) \tag{7.30} \]
If either a or b is infinite, one has to omit the corresponding delta function from the synthesis equation. The suggested modification of the SDE also requires modification of the corresponding numerical schemes, which are considered in Appendix C on the web.

Example: a uniformly distributed correlated process. A uniformly distributed process with the marginal PDF

\[ p_{x\,st}(x) = \frac{1}{b-a}, \qquad a \le x \le b \tag{7.31} \]

plays a significant role in the modeling of the phase of narrow band signals, as discussed in Section 3.12.3. In this case the synthesis eq. (7.30) becomes

\[ \dot{x} = \frac{K}{2}\left[2\delta(x-a) - 2\delta(x-b)\right] + \sqrt{K}\,\xi(t) \tag{7.32} \]

It is important to note that direct application of eq. (7.8) would result in the SDE for the Wiener process

\[ \dot{x} = \sqrt{K}\,\xi(t) \tag{7.33} \]

thus, once again, indicating the importance of the correction terms. The Fokker–Planck equation corresponding to SDE (7.32) is well known to be

\[ \frac{\partial}{\partial t}\pi(x,t) = \frac{K}{2}\,\frac{\partial^2}{\partial x^2}\pi(x,t), \qquad G(a,t) = G(b,t) = 0 \tag{7.34} \]

where G(x,t) is the probability current, and has the following exact solution

\[ \pi(x,t\mid x_0,t_0) = \frac{1}{b-a} + \frac{2}{b-a}\sum_{k=1}^{\infty}\cos\left(k\pi\frac{x-a}{b-a}\right)\cos\left(k\pi\frac{x_0-a}{b-a}\right)\exp\left[-\frac{\pi^2 k^2 K}{2(b-a)^2}\,\tau\right] \tag{7.35} \]
which implies that the covariance function is given by an infinite series of exponents

\[ C_{xx}(\tau) = \sum_{k=1}^{\infty} h_k^2\exp\left[-\frac{\pi^2 k^2 K}{2(b-a)^2}|\tau|\right] \tag{7.36} \]

where

\[ h_k = \left[(-1)^{k+1}+1\right]\frac{b-a}{\pi^2 k^2} \tag{7.37} \]
Using the exponent with the slowest decay (k = 1) one can obtain the following approximation of the coefficient K:

\[ K \approx \frac{2(b-a)^2}{\pi^2\tau_{corr}} \tag{7.38} \]

Thus, the generating SDE has the following form

\[ \dot{x} = \frac{2(b-a)^2}{\pi^2\tau_{corr}}\left[\delta(x-a) - \delta(x-b)\right] + \frac{b-a}{\pi}\sqrt{\frac{2}{\tau_{corr}}}\,\xi(t) \tag{7.39} \]
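In simulation, the delta-function barriers in eq. (7.39) are most easily realized by reflecting the trajectory at the boundaries. The sketch below (illustrative only; a = 0, b = 1 and \tau_{corr} = 1 are assumed) does this and checks the uniform moments E[x] = 1/2 and var = 1/12.

```python
import math, random

def simulate_uniform(a=0.0, b=1.0, tau_corr=1.0, dt=0.01, n=200_000, seed=3):
    """Brownian motion with K = 2(b-a)^2/(pi^2 tau_corr), reflected at a and b."""
    rng = random.Random(seed)
    K = 2.0 * (b - a) ** 2 / (math.pi ** 2 * tau_corr)
    s = math.sqrt(K * dt)
    x, path = 0.5 * (a + b), []
    for _ in range(n):
        x += s * rng.gauss(0.0, 1.0)
        # reflection implements the delta-function barriers of eq. (7.39)
        if x < a:
            x = 2.0 * a - x
        elif x > b:
            x = 2.0 * b - x
        path.append(x)
    return path

xs = simulate_uniform()
mean = sum(xs) / len(xs)                            # theory: (a+b)/2 = 0.5
var = sum((v - mean) ** 2 for v in xs) / len(xs)    # theory: (b-a)^2/12
```

The step size is small compared with (b - a), so at most one reflection per step is needed.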
Example: a Nakagami distributed correlated process. A Nakagami distributed process is one of the most important models of the envelope of fading communication signals; therefore its modeling is an important issue, especially for proper simulation of communication systems. Since the marginal PDF of the Nakagami process is given by

\[ p_{x\,st}(x) = \frac{2}{\Gamma(m)}\left(\frac{m}{\Omega}\right)^m x^{2m-1}\exp\left(-\frac{m x^2}{\Omega}\right) \tag{7.40} \]

the generating SDE becomes

\[ \dot{x} = \frac{K}{2}\left[\frac{2m-1}{x} - \frac{2mx}{\Omega} + 2\delta(x)\right] + \sqrt{K}\,\xi(t) \tag{7.41} \]

The solution for the transitional PDF is also known to be [31]

\[ \pi(x,t\mid x_0,t_0) = \frac{2}{\Gamma(m)}\left(\frac{m}{\Omega}\right)^m x^{2m-1}\exp\left(-\frac{m x^2}{\Omega}\right)\sum_{k=0}^{\infty}\frac{k!\,\Gamma(m)}{\Gamma(m+k)}\,L_k^{(m-1)}\!\left(\frac{m x^2}{\Omega}\right)L_k^{(m-1)}\!\left(\frac{m x_0^2}{\Omega}\right)\exp\left(-\frac{2mkK}{\Omega}|\tau|\right) \tag{7.42} \]

where L_k^{(m-1)}(\cdot) are the generalized Laguerre polynomials, and, thus,

\[ C_{xx}(\tau) = \frac{\Omega\,\Gamma^2(m+0.5)}{4\pi m\,\Gamma(m)}\sum_{k=1}^{\infty}\frac{\Gamma^2(k-0.5)}{\Gamma(m+k)\,k!}\exp\left(-\frac{2mkK}{\Omega}|\tau|\right) \tag{7.43} \]
Once again estimating the value of the coefficient K through the exponent with the slowest rate of decay, i.e. setting 2mK/\Omega = 1/\tau_{corr}, one obtains the generating SDE in the following form

\[ \dot{x} = \frac{\Omega}{4m\tau_{corr}}\left[\frac{2m-1}{x} - \frac{2mx}{\Omega} + 2\delta(x)\right] + \sqrt{\frac{\Omega}{2m\tau_{corr}}}\,\xi(t) \tag{7.44} \]
If m = 1 the Nakagami process coincides with the Rayleigh process, whose generating SDE is then

\[ \dot{x} = \frac{\Omega}{4\tau_{corr}}\left[\frac{1}{x} - \frac{2x}{\Omega} + 2\delta(x)\right] + \sqrt{\frac{\Omega}{2\tau_{corr}}}\,\xi(t) \tag{7.45} \]
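The Rayleigh SDE (7.45) can be verified in the same way; the delta barrier at x = 0 is again implemented by reflection. The sketch below is illustrative only (\Omega = 1, \tau_{corr} = 1 assumed, and the 1/x drift term is clipped near the origin to keep the Euler step stable, which is a numerical device, not part of the model). For a Rayleigh marginal, E[x^2] = \Omega and E[x] = \sqrt{\pi\Omega}/2.

```python
import math, random

def simulate_rayleigh(omega=1.0, tau_corr=1.0, dt=0.002, n=400_000, seed=4):
    """Euler-Maruyama for eq. (7.45); the 2*delta(x) barrier is a reflection at 0."""
    rng = random.Random(seed)
    c = omega / (4.0 * tau_corr)                 # drift prefactor Omega/(4*tau)
    s = math.sqrt(omega / (2.0 * tau_corr) * dt)
    x, path = math.sqrt(omega), []
    for _ in range(n):
        # guard against the 1/x singularity right at the reflecting origin
        x += c * (1.0 / max(x, 1e-2) - 2.0 * x / omega) * dt + s * rng.gauss(0.0, 1.0)
        if x < 0.0:
            x = -x                               # reflecting boundary at x = 0
        path.append(x)
    return path

xs = simulate_rayleigh()
mean = sum(xs) / len(xs)                         # theory: sqrt(pi*omega)/2
mean_sq = sum(v * v for v in xs) / len(xs)       # theory: omega = 1
```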
7.2.3 Synthesis of λ Processes
It has been seen from the examples above that the solution of a first order SDE with constant diffusion possesses an exponentially decaying covariance function. However, approximation of the latter by a single exponent does not always provide good results. In this section a technique is developed to generate a random process with an exact exponential correlation and an arbitrary marginal PDF (i.e. a λ process as defined on page 322). The desired process is sought as the solution of the following SDE with a further zero-memory monotonous non-linear transformation y = F(x):

\[ \dot{x} = f(x) + g(x)\xi(t), \qquad y = F(x) \tag{7.46} \]
At this stage it is assumed that the non-linear monotonous² transformation y = F(x) and its inverse x = F^{-1}(y) are given and that both PDFs p_{y\,st}(y) and p_{x\,st}(x) are also known.³

Proposition 1. The autocovariance function C_{xx}(\tau) of the random process generated by system (7.46) is an exact exponent with damping factor \lambda if and only if \lambda is an eigenvalue of the corresponding Fokker–Planck equation and the corresponding eigenfunction has the following form

\[ \varphi_\lambda(y) = (F(y)+\gamma)\,p_{y\,st}(y) \tag{7.47} \]

where

\[ \gamma = -\int_{y_1}^{y_2}F(y)\,p_{y\,st}(y)\,dy \tag{7.48} \]
Proof: Necessity. Without loss of generality it is possible to assume that \lambda corresponds to the first non-zero eigenvalue

\[ \lambda = \lambda_1 \tag{7.49} \]

² This requirement is needed to preserve the Markov nature of the transformed process.
³ This is not a very restrictive assumption, since p_{x\,st}(x) and p_{y\,st}(y) are related by p_{x\,st}(x) = p_{y\,st}[F(x)]F'(x) (see Section 2.3).
and the corresponding eigenfunction is

\[ \varphi_1(y) = p_{y\,st}(y)\left(F(y)+\gamma\right) \tag{7.50} \]

Using expression (7.50) and property (5.273) of the eigenfunctions one obtains

\[ h_m = \int_{y_1}^{y_2}F(y)\varphi_m(y)\,dy = \int_{y_1}^{y_2}(F(y)+\gamma)\varphi_m(y)\,dy = \int_{y_1}^{y_2}\frac{\varphi_m(y)\varphi_1(y)}{p_{y\,st}(y)}\,dy = 0 \tag{7.51} \]

for any constant \gamma and m \ne 1. Consequently, the expression for the ACF consists of only one term

\[ C_{xx}(\tau) = h_1^2\exp\left(-\lambda|\tau|\right) \tag{7.52} \]
no matter what the other eigenvalues are.

Sufficiency. By virtue of the fact that all the coefficients h_m^2 in expansion (5.276) are non-negative, and the assumption that the function (F(y)+\gamma)p_{y\,st}(y) is not an eigenfunction, one obtains that expansion (5.276) contains more than one exponential term. The contradiction proves the sufficiency.

Finally, it follows from eq. (5.273) that

\[ \int_{y_1}^{y_2}\varphi_1(y)\,dy = \int_{y_1}^{y_2}(F(y)+\gamma)p_{y\,st}(y)\,dy = \int_{y_1}^{y_2}F(y)p_{y\,st}(y)\,dy + \gamma = 0 \tag{7.53} \]

which, in turn, is equivalent to

\[ \gamma = -\int_{y_1}^{y_2}p_{y\,st}(y)F(y)\,dy \tag{7.54} \]
Proposition 2. If the drift K_1(y) and the diffusion K_2(y) are chosen according to the formulae

\[ K_2(y) = -\frac{2\lambda}{p_{y\,st}(y)\,\dfrac{d}{dy}F(y)}\int_{y_1}^{y}(F(s)+\gamma)\,p_{y\,st}(s)\,ds \tag{7.55} \]

\[ 2K_1(y) = K_2(y)\frac{d}{dy}\ln p_{y\,st}(y) + \frac{d}{dy}K_2(y) \tag{7.56} \]

then the process x(t) generated by the corresponding SDE is a λ process.

Proof: It follows from the synthesis eq. (7.8) that the drift K_1(y) can be expressed in terms of the diffusion K_2(y) and the desired PDF p_{y\,st}(y) as

\[ 2K_1(y) = K_2(y)\frac{d}{dy}\ln p_{y\,st}(y) + \frac{d}{dy}K_2(y) \tag{7.57} \]
Substituting expression (7.57) into eq. (7.7) and taking equality (7.49) into account, one can derive a differential equation for K_2(y) in the following form

\[ \frac{d^2}{dy^2}\left[K_2(y)(F(y)+\gamma)p_{y\,st}(y)\right] - \frac{d}{dy}\left[(F(y)+\gamma)\frac{d}{dy}\left(K_2(y)p_{y\,st}(y)\right)\right] = -2\lambda(F(y)+\gamma)p_{y\,st}(y) \tag{7.58} \]

or, equivalently,

\[ \frac{d}{dy}\left[K_2(y)\,\frac{d}{dy}F(y)\,p_{y\,st}(y)\right] = -2\lambda(F(y)+\gamma)p_{y\,st}(y) \tag{7.59} \]

The latter can be easily solved with respect to K_2(y) to produce

\[ K_2(y) = -\frac{2\lambda}{p_{y\,st}(y)\,\dfrac{d}{dy}F(y)}\int_{y_1}^{y}(F(s)+\gamma)\,p_{y\,st}(s)\,ds \tag{7.60} \]
Therefore the proposition has been proved. The next step is to prove that the function K_2(y) defined by eq. (7.60) is non-negative, so that there is no contradiction with the definition of the diffusion coefficient.

Proposition 3. K_2(y) \ge 0.

Proof: It is sufficient to show that if

\[ \Phi(y) = \int_{y_1}^{y}(F(s)+\gamma)\,p_{y\,st}(s)\,ds \tag{7.61} \]

then

\[ \Phi(y_1) = \Phi(y_2) = 0 \tag{7.62} \]

\[ \Phi'(y) = (F(y)+\gamma)\,p_{y\,st}(y) \tag{7.63} \]

and the equation \Phi'(y) = 0 has a unique solution

\[ y_* = F^{-1}(-\gamma) \tag{7.64} \]

Indeed, \Phi(y_1) = 0 by definition, and \Phi(y_2) = 0 by virtue of eq. (7.54). Since F(y) increases monotonously, the derivative \Phi'(y) changes its sign only once: it is less than zero when y < y_* and greater than zero when y > y_*. The last assertion is equivalent to the following inequalities

\[ F(y) < F(y_*) = -\gamma, \quad y\in[y_1,y_*); \qquad F(y) > F(y_*) = -\gamma, \quad y\in(y_*,y_2] \tag{7.65} \]

Thus \Phi(y) decreases on [y_1,y_*] and increases on [y_*,y_2]; together with the zero boundary values this implies \Phi(y) \le 0 on [y_1,y_2], and therefore K_2(y) \ge 0 in eq. (7.60). The proposition is proved.
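The construction of Propositions 2 and 3 is straightforward to implement numerically. The sketch below (illustrative only; F(x) = x and a two-sided Laplace target density are assumed) evaluates eq. (7.60) by midpoint quadrature on a finite grid, and the result can be compared with the closed-form diffusion K_2(x) = (2\lambda/\alpha^2)(1+\alpha|x|) obtained for this PDF in eq. (7.73) below.

```python
import math

def k2_numeric(pdf, F, lam, y, y1=-20.0, y2=20.0, n=40_000):
    """Eq. (7.60): K2(y) = -(2*lam/(p(y)*F'(y))) * int_{y1}^{y} (F(s)+gamma)*p(s) ds
       with gamma = -int_{y1}^{y2} F(s)*p(s) ds as in eq. (7.54)."""
    h = (y2 - y1) / n
    grid = [y1 + (i + 0.5) * h for i in range(n)]            # midpoint rule
    gamma = -sum(F(s) * pdf(s) for s in grid) * h
    integral = sum((F(s) + gamma) * pdf(s) for s in grid if s < y) * h
    dF = (F(y + 1e-5) - F(y - 1e-5)) / 2e-5                  # F'(y) by finite difference
    return -2.0 * lam * integral / (pdf(y) * dF)

alpha, lam = 1.0, 0.7
laplace = lambda x: 0.5 * alpha * math.exp(-alpha * abs(x))  # (alpha/2)exp(-alpha|x|)
k2 = k2_numeric(laplace, lambda x: x, lam, y=1.3)
exact = 2.0 * lam / alpha ** 2 * (1.0 + alpha * 1.3)         # closed form, eq. (7.73)
```

The quadrature agrees with the closed form to the accuracy of the grid, and the result is positive, as guaranteed by Proposition 3.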
Substituting eq. (7.55) into eq. (7.56) one obtains the expression for the drift K_1(y) and consequently for the coefficients f(y) and g(y) in SDE (7.3). It was shown in [25] that the simple exponential form of the ACF of a λ process is due to its linear drift. This property can also be proved using eqs. (7.55) and (7.56).

Example: exponentially correlated processes with a Pearson class PDF. As has been described in Section 2.2, any Pearson PDF p_{x\,st}(x) obeys the following differential equation

\[ \frac{d}{dx}\ln p_{x\,st}(x) = \frac{a_1 x + a_0}{b_2 x^2 + b_1 x + b_0} = \frac{A(x)}{B(x)} \tag{7.66} \]
It is easy to see that if F(x) = x, then the stationary PDF p_{x\,st}(x) satisfies eq. (7.66) and the desired diffusion is given by

\[ K_2(x) = \beta\left(b_2 x^2 + b_1 x + b_0\right) \tag{7.67} \]

with some constant \beta > 0. Equation (7.54) is satisfied as well. In fact, expression (7.59) can be rewritten in the form

\[ K_2(x)\frac{d}{dx}\ln p_{x\,st}(x) + \frac{d}{dx}K_2(x) = -2\lambda(x+\gamma) \tag{7.68} \]

if F(x) = x. Using the definition of the Pearson PDF (7.66) and the fact that

\[ \frac{d}{dx}K_2(x) = \beta\left(2b_2 x + b_1\right) \tag{7.69} \]

eq. (7.68) becomes

\[ \beta\left[(a_1+2b_2)x + a_0 + b_1\right] = -2\lambda(x+\gamma) \tag{7.70} \]

which is an identity if \lambda and \gamma are defined as

\[ \lambda = -\frac{\beta(a_1+2b_2)}{2}, \qquad \gamma = \frac{a_0+b_1}{a_1+2b_2} \tag{7.71} \]

It can be seen that for processes with a Pearson PDF, K_2(x) is a polynomial of order not higher than 2. All processes with a non-Pearson PDF would have a non-polynomial diffusion. A number of such examples can easily be obtained.
Example: symmetric (two-sided) Laplace distribution. It is often required to obtain an exponentially correlated process with a specified value of the parameter \alpha of the two-sided Laplace PDF. It is obvious that this distribution is not a Pearson one. Setting F(x) = x in eq. (7.46), one can write the PDF of x(t) as

\[ p_{x\,st}(x) = \frac{\alpha}{2}\exp\left(-\alpha|x|\right) \tag{7.72} \]
and consequently

\[ K_2(x) = \frac{2\lambda}{\alpha^2}\left(1+\alpha|x|\right) \tag{7.73} \]

\[ K_1(x) = -\lambda x \tag{7.74} \]

The corresponding generating SDE is

\[ dx = -\lambda x\,dt + \sqrt{\frac{2\lambda}{\alpha^2}\left(1+\alpha|x|\right)}\,dW(t) \tag{7.75} \]
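SDE (7.75) can be checked by direct simulation (illustrative sketch; \alpha = 1 and \lambda = 1 are assumed). The marginal should be Laplace with var = 2/\alpha^2 and, by construction of the λ process, the normalized covariance at lag \tau should be e^{-\lambda\tau}.

```python
import math, random

def simulate_laplace_lambda(alpha=1.0, lam=1.0, dt=0.005, n=300_000, seed=5):
    """Euler-Maruyama (Ito) for eq. (7.75):
       dx = -lam*x dt + sqrt(2*lam/alpha^2 * (1 + alpha*|x|)) dW."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        g = math.sqrt(2.0 * lam / alpha ** 2 * (1.0 + alpha * abs(x)))
        x += -lam * x * dt + g * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

xs = simulate_laplace_lambda()
mean = sum(xs) / len(xs)
var = sum((v - mean) ** 2 for v in xs) / len(xs)      # theory: 2/alpha^2 = 2
lag = 200                                             # lag tau = 1.0 with dt = 0.005
cov = sum((xs[i] - mean) * (xs[i + lag] - mean)
          for i in range(len(xs) - lag)) / (len(xs) - lag)
rho = cov / var                                       # theory: exp(-lam*1.0)
```

The tolerances in any check have to be generous because the Laplace distribution has heavy tails, which slows the convergence of moment estimates.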
Example: generalized gamma distribution. Let the stationary PDF of the process be the generalized gamma PDF

\[ p_{x\,st}(x) = \frac{c\,\mu^{\nu/c}}{2\Gamma(\nu/c)}\,|x|^{\nu-1}\exp\left(-\mu|x|^{c}\right) \tag{7.76} \]

which is used to model radar clutter [40–42], telephone signals [43], and ocean noise [44]. As particular cases, this PDF includes the Rayleigh PDF (\nu = 2, c = 2), the Nakagami PDF (\nu = 2m, c = 2) and the Weibull one (\nu = c). Using formula (7.55) one obtains the expression for the diffusion in the form

\[ K_2(x) = \frac{2\lambda}{c\,\mu^{(\nu+1)/c}}\,\Gamma\!\left(\frac{\nu+1}{c},\,\mu|x|^{c}\right)|x|^{1-\nu}\exp\left(\mu|x|^{c}\right) \tag{7.77} \]

where \Gamma(\cdot,\cdot) is the upper incomplete gamma function, and the generating SDE is thus

\[ dx = -\lambda x\,dt + \sqrt{\frac{2\lambda}{c\,\mu^{(\nu+1)/c}}\,\Gamma\!\left(\frac{\nu+1}{c},\,\mu|x|^{c}\right)|x|^{1-\nu}\exp\left(\mu|x|^{c}\right)}\,dW(t) \tag{7.78} \]
Example: bimodal distribution. Let the stationary PDF of the process now have the following form

\[ p_{x\,st}(x) = C(p,q)\exp\left(-p x^2 - q x^4\right) \tag{7.79} \]

where the constant C(p,q) is chosen to normalize p_{x\,st}(x). It can be shown that⁴

\[ C(p,q) = 2\sqrt{\frac{q}{p}}\,\exp\left(-\frac{p^2}{8q}\right)\bigg/ K_{1/4}\!\left(\frac{p^2}{8q}\right) \tag{7.80} \]

⁴ See integral 3.469-1 in [68].
Such a PDF has two modes when p < 0 and q > 0. As a particular case, it includes the Gaussian PDF when p > 0 and q = 0. Using eq. (7.55) and setting F(x) = x, it is easy to obtain the expression for the diffusion in the form

\[ K_2(x) = \frac{\lambda\sqrt{\pi}}{2\sqrt{q}}\,\exp\left[q\left(x^2+\frac{p}{2q}\right)^{2}\right]\mathrm{Erfc}\left[\sqrt{q}\left(x^2+\frac{p}{2q}\right)\right] \tag{7.81} \]

where

\[ \mathrm{Erfc}(z) = \frac{2}{\sqrt{\pi}}\int_{z}^{\infty}\exp\left(-t^2\right)dt \tag{7.82} \]

is the complementary error function [32]. The corresponding generating SDE is thus

\[ dx = -\lambda x\,dt + \sqrt{\frac{\lambda\sqrt{\pi}}{2\sqrt{q}}\,\exp\left[q\left(x^2+\frac{p}{2q}\right)^{2}\right]\mathrm{Erfc}\left[\sqrt{q}\left(x^2+\frac{p}{2q}\right)\right]}\,dW(t) \tag{7.83} \]
It is impossible to obtain the exact equation for the transitional probability function, except for a number of cases considered in [47]. However, for a small transitional time an approximate formula can be obtained. Indeed, since the solution of the SDE is a diffusion Markov process, it can be approximated with a Gaussian random process with the same local drift and local diffusion as in [47]. " # 1 ðx x0 K1 ðxÞÞ2 ðx j x0 ; Þ ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi exp 2K2 ðxÞ 2K2 ðxÞ
ð7:84Þ
or, taking eqs. (7.55) and (7.56) into account " # pffiffiffiffiffiffiffiffiffiffi ps ðxÞ ðx x0 þ ðx mx ÞÞ2 ps ðxÞ ðx j x0 ; Þ ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ð7:85Þ Ð x
ffi exp 4 Ð x ðm xÞp ðxÞdx x s 1 4 1 ðmx xÞps ðxÞdx This transitional probability can be used to numerically simulate the random process using the Markov chain method, as described in Appendix C on the web.
7.2.4
Non-Diffusion Markov Models of Non-Gaussian Exponentially Correlated Processes
As has been pointed out by Haddad [25] exponential correlation is due to linear drift of the generating SDE—a fact which can be directly seen from the following equation [49] d ðx mÞ 1 Cxx ðÞ ¼ hxK1 ðx Þi ¼ x Cxx ðÞ ¼ d corr corr
ð7:86Þ
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
thus leading to the unique solution in the form of an exponent 2 Cxx ðÞ ¼ exp corr
335
ð7:87Þ
Thus, in the case of linear drift, the correlation function is always exponential regardless of other kinetic coefficients Kn ðxÞ. This fact allows one to extend the synthesis of exponentially correlated processes to non-diffusion processes. In a general case, a non-linear system can be driven by a mixture of White Gaussian Noise ðtÞ and the Poisson flow of delta pulses ðtÞ dx ¼ f ðxÞ þ gðxÞðtÞ þ ðtÞ dt
ð7:88Þ
Statistical properties of the solution xðtÞ can be completely described by its transitional probability density ðx; t; x0 ; t0 Þ, which must obey the differential Chapman–Kolmogorov equation (see section 5.1.5 and [38]) @ @ 1 @2 ðx j x0 ; Þ ¼ ½K1 ðxÞðx; x0 ; t0 Þ þ ½K2 ðxÞðx; t; x0 ; t0 Þ @t @xð 2 @x2 þ
1
1
ð7:89Þ
½Wðx j z; tÞðz; t; x0 ; t0 Þ Wðz j x; tÞðz; t; x0 ; t0 Þdz
where drift K1 ðxÞ, diffusion K2 ðxÞ and probability of jumps Wðx j z; tÞ can be obtained from the corresponding parameters of eq. (7.88) as (Ito form of stochastic integrals) lim ðz; t; x0 ; t0 Þ ¼ Wðx j z; tÞ
tt0 !0 ð1 1
Wðx j z; tÞdx ¼ 1
ð7:90Þ ð7:91Þ
In the stationary case, if it exists, eq. (7.89) becomes an integro-differential equation of the following form ð1 ½Wðx j z; tÞðz j x0 ; Þ Wðz j x; tÞðx j x0 ; Þdz 1 ð7:92Þ @ 1 @2 ¼ ½K1 ðxÞðx j x0 ; Þ ½K2 ðxÞðx j x0 ; Þ @x 2 @x2 Different particular cases of SDE (7.88) have been considered in a number of publications [2,3,50,51]. The main goal here is to show how different Markov processes with the same non-Gaussian probability density and exponential covariance function can be obtained. 7.2.4.1
Exponentially Correlated Markov Chain—DAR(1) and Its Continuous Equivalent
In this section the discrete time scheme which generates a non-Gaussian Markov chain with an exponential correlation function and a given arbitrary PDF is considered.
336
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Following [52], it is assumed that the stationary distribution of the Markov chain yn with N states 1 < 2 < < N
ð7:93Þ
is described by the following probabilities of individual states qk ¼ Probfyn ¼ k g
ð7:94Þ
Any Markov chain can be completely described by its transitional probability matrix T ¼ ½Tk;l where Tk;l ¼ Probfym ¼ k j ym1 ¼ l g;
k; l ¼ 1; 2; . . . ; N
ð7:95Þ
which is the probability of the event ym ¼ k when ym1 ¼ l and satisfies the following conditions Tk;l 0;
N X
Tk;l ¼ 1; K ¼ 1; 2; . . . ; N
ð7:96Þ
k¼1
The stationary probabilities qk are obtained as the eigenvectors of the transition probability matrix T corresponding to the eigenvalue ¼ 1 (see section 5.1.1) N X
Tk; i qi ¼ qk ;
k ¼ 1; 2; . . . ; N
ð7:97Þ
i¼1
The same matrix defines the power spectrum in the stationary case. However, one may consider the inverse problem, having defined only the stationary probabilities of the states. For a case where the correlation function is an exponential one, the solution has been obtained in [52]. It is possible to follow this procedure to obtain the chain approximation with an infinite number of states as a limit of a finite state Markov chain. To achieve the first goal, the following matrix Q is defined in terms of the probabilities of the states 3 q1 q1 . . . q1 6 q2 q2 . . . q2 7 7 Q¼6 4... ... ... ...5 qN qN . . . qN |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl} 2
ð7:98Þ
N times
It is easy to check that Q2 ¼ Q
ð7:99Þ
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
337
and [52] det½Q I ¼ ð1 ÞðÞN1
ð7:100Þ
In terms of the matrix Q, the transition matrix {T} can be defined as [52] T ¼ Q þ dðI QÞ
ð7:101Þ
where 0 d < 1 will define the correlation properties (described below) and I is the identity matrix. At the same time T satisfies the condition (7.97). For any integer m one can obtain the following expression by making use of eq. (7.101) and eq. (7.99) T m ¼ Q þ dm ðI QÞ
ð7:102Þ
Since d is a positive number less than 1, one has lim T m ¼ Q
m!1
ð7:103Þ
which means that the Markov chain described by eq. (7.101) becomes ergodic [53] and has the stationary probability given by qk. The next step is to consider the correlation function Rm of the Markov chain ym . The average value and the average of the squared value can be obtained in terms of the stationary probability qk as hym i ¼
N X
k qk
ð7:104Þ
k2 qk
ð7:105Þ
i¼1
hy2m i ¼
N X i¼1
To calculate the covariance function, one has to consider the two-dimensional probability which may be obtained from eq. (7.102) as QðmÞ ðk; lÞ ¼ Probfym ¼ k ; y0 ¼ l g ¼ fqk þ d m ð k;l qk Þgql
ð7:106Þ
where m > 0 and QðmÞ ðk; lÞ stands for the m-step transitional probability. It is easy to find that eq. (7.106) fits the consistency relation N X k¼1
QðmÞ ðk; lÞ ¼
N X
QðmÞ ðl; kÞ ¼ ql
ð7:107Þ
k¼1
The covariance function Cm is an even function of m defined as Cm ¼ Cm ¼ hym y0 i hym i2
ð7:108Þ
338
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Substitution of eqs. (7.104)–(7.107) into eq. (7.108) produces Cm ¼ Cm ¼ hym y0 i hym i2 ¼
N X
k l Probfym ¼ k ; y0 ¼ l g hym i2
k; l ¼ 1
¼
N X
k l fqk þ dm ð k; l qk Þgq1 hym i2
k; l ¼ 1
¼
N X
k l qk ql þ d
m
k; l ¼ 1
N X
ð7:109Þ k l ð k; l qk Þql hym i
2
k; l ¼ 1
¼ hym i2 þ djmj ðhy2m i hym i2 Þ hym i2 ¼ d jmj ðhy2m i hym i2 Þ which is an exponential function with the correlation length Ncorr defined as Ncorr ¼ ð1Þ= ln d
ð7:110Þ
Formula (7.102) can be extended to the finite state continuous time Markov chain ym ðtÞ as in [52] to produce TðtÞ ¼ Q þ expð tÞðI QÞ
ð7:111Þ
The expression for the covariance function can be given in this case as Cyy ðÞ ¼ expð Þðhy2m i hym i2 Þ
ð7:112Þ
Before turning to the continuous time infinite state Markov chain (a Markov process with a continuum of states) let us point out an important property of the exponentially correlated Markov chain. It follows from the definition of the matrices T and TðtÞ as in eqs. (7.102) and (7.111) that the transition probability density does not depend on the current state. If N ! 1, then the Markov continuous time chain tends to a Markov process, which can be a non-diffusion one. In this case eq. (7.111) can be written as ðx j x0 ; Þ ¼ expð Þ ðx x0 Þ þ ð1 expð ÞÞps ðxÞ
ð7:113Þ
Here ps ðxÞ is the stationary distribution of the limit process. It is important to validate that the latter expression indeed defines a proper PDF of a Markov process. Positivity is obvious, since both summands are positive numbers, thus ðx j x0 ; Þ 0
ð7:114Þ
When approaches infinity, the transitional PDF must correspond to the stationary PDF of the process since the values of the process far away from the observation points are independent on this observation: lim ðx j x0 ; Þ ¼ lim ½expðÞ ðx x0 Þ þ ð1 expðÞÞps ðxÞ ¼ ps ðxÞ
!1
!1
ð7:115Þ
339
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
At the same time the limit of the PDF when approaches zero must be the delta function ðx x0 Þ, since the process cannot assume two different values at the same moment. Taking the limit of eq. (7.113) one obtains that this condition is indeed satisfied: lim ðx j x0 ; Þ ¼ lim ½expðÞ ðx x0 Þ þ ð1 expðÞÞps ðxÞ ¼ ðx x0 Þ
!0
!0
ð7:116Þ
Finally, in order to represent a Markov process, PDF ðx j x0 ; Þ must obey the Smoluchowski equation [48] ðx j x0 ; Þ ¼
ð1 1
ðx j x1 ; 2 Þðx1 j x0 ; 1 Þdx1
ð7:117Þ
with ¼ 1 þ 2 . It is easy to check that this is the case for PDF given by eq. (7.113). Indeed ð1 1
ðx j x1 ; 2 Þðx1 j x0 ; 1 Þdx1 ¼
ð1 1
½expð2 Þ ðx x1 Þ þ ð1 expð2 ÞÞps ðxÞ
½expð1 Þ ðx1 x0 Þ þ ð1 expð1 ÞÞps ðx1 Þdx1 ¼ expðð1 þ 2 ÞÞ ðx x0 Þ þ ð1 expðð1 þ 2 ÞÞÞps ðxÞ ¼ ðx j x0 ; 1 þ 2 Þ
ð7:118Þ
Thus, in fact, the PDF (7.113) defines a Markov process. The covariance function of this process is indeed exponential, since Cxx ðÞ ¼ ¼
ð1 ð1 1
1
ð1 ð1
xx1 ðx j x1 ; Þps ðx1 Þdxdx1 xx1 ½expðÞ ðx1 xÞps ðx1 Þdxdx1
1 1 ð1 ð1
þ
1 1
½ð1 expðÞÞps ðxÞps ðx1 Þdxdx1
¼ expðÞ½2x þ m2x þ ð1 expðÞÞm2x ¼ 2x expðÞ þ m2x
ð7:119Þ
The next step is to understand if the Markov process defined by the PDF (7.113) represents a diffusion Markov process or a process with jumps. In order to do this, one must calculate the following limit, which describes the non-diffusion part of any general Markov process (see Section 5.1.5 and eq. 3.4.1 in [48]) Wðx j x0 ; Þ ¼
lim !0 jx x0 j > "
ðx j x0 ; Þ ¼
lim !0 j x x0 j > "
ð1 expðÞÞps ðxÞ ¼ ps ðxÞ ð7:120Þ
340
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
The last equation implies that the Markov process defined by the PDF (7.115) is a process with jumps, since Wðx j x0 ; Þ 6¼ 0. However, it is important to note that the probability of jumps Wðx j x0 ; Þ does not depend on the current state x0 . This property is inherited from the fact that the pre-limit Markov chain fyðtÞg has the same property. It is interesting to find out which SDE, if any, generates such a process. In order to accomplish this, one has to calculate the drift K1 ðxÞ and the diffusion K2 ðxÞ coefficients, which are defined as [48] 1 K1 ðxÞ þ Oð"Þ ¼ lim !0 1 !0
ð ð
¼ lim
j xzj<"
jxzj<"
ðz xÞðz j x; Þdz ðz xÞ½expðÞ ðx zÞ þ ð1 expðÞÞps ðzÞdz
expðÞ !0
¼ lim
ð jxzj<"
ðz xÞ ðx zÞdz
ð1 expðÞÞ !0
ð
þ lim
jxzj<"
ðz xÞps ðzÞdz ¼ 0 þ Oð"Þ
ð7:121Þ
and 1 !0
ð
K2 ðxÞ þ Oð"Þ ¼ lim
1 ¼ lim !0
ð
jxzj<"
ðz xÞ2 ðz j x; Þdz
jxzj < "
ðz xÞ2 ½expðÞ ðx zÞ þ ð1 expðÞÞps ðzÞdz
expðÞ !0
¼ lim
ð jxzj < "
ðz xÞ2 ðx zÞdz
ð1 expðÞÞ þ lim !0
ð jxzj < "
ðz xÞ2 ps ðzÞdz ¼ 0 þ Oð"Þ
ð7:122Þ
The last two equations show that the SDE generating the continuous Markov process with transitional PDF given by eq. (7.113) is dx ¼ ðtÞ dt
ð7:123Þ
where ðtÞ is a stream of delta pulses
ðtÞ ¼
X
Ak ðt ktÞ
ð7:124Þ
tk
with amplitudes Ak distributed according to the stationary PDF ps ðxÞ and the time between two sequential arrivals is t. In order to obtain the continuous time chain, t must approach zero.
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
7.2.4.2
341
A Mixed Process With Exponential Correlation
Another possible class of non-Gaussian Markov processes with exponential correlation has been considered in [41] and is discussed in this section. In this case, the generating SDE is chosen in the form of a linear system excited by a train of delta functions, similar to eq. (7.124) dx ¼ x þ ðtÞ dt
ð7:125Þ
However, it was found in section 10.4 [51] that a relatively small class of non-Gaussian processes can be represented in this form. The goal here is to identify the class of represented processes. Without loss of generality one may assume that the process has zero mean since a constant value can be easily added to the zero mean random process to account for it. In this case the desired amplitude distribution and its intensity ðtÞ can be adjusted to obtain the desired properties. In the case of eq. (7.84), the differential Chapman–Kolmogorov equation (7.92) becomes
ð1 1
½Wðx j z; tÞðz j x0 ; Þ Wðz j x; tÞðx j x0 ; Þdz ¼
@ ½xðxjx0 ; Þ @x
ð7:126Þ
since the diffusion coefficient is zero, and the drift is a linear term. Multiplying both parts by ps ðx0 Þ and integrating over x0 one can obtain that
ð1 1
Wðx j z; tÞps ðzÞdz ¼
@ @ ð½xps ðxÞÞ þ ps ðxÞ ¼ ð Þps ðxÞ x ps ðxÞ @x @x ð7:127Þ
since ð1
ðx j x0 ; Þps ðx0 Þdx0 ¼ ps ðxÞ
ð7:128Þ
Wðz j x; tÞðx j x0 ; Þdz ¼ ðx j x0 ; Þ
ð7:129Þ
1
and, according to eq. (7.90) ð1 1
Since both Wðx j zÞ and ps ðxÞ must be non-negative functions, the following condition must be satisfied ð1
@ Wðx j z; tÞps ðzÞdz ¼ 1 ps ðxÞ x ps ðxÞ 0 @x 1
ð7:130Þ
This is the weakest test of the kind of distribution that can be implemented using this technique. Also it gives a lower bound on the intensity of the jumps needed for a stationary
342
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
distribution to exist (recall that the constant is defined by the required correlation interval corr ¼ 1= and cannot be chosen arbitrarily [2,41]): @ ps ðxÞ @x x 1 ps ðxÞ
ð7:131Þ
It is assumed in the following that the condition (7.131) is indeed satisfied. Since one has to choose an unknown function Wðx j zÞ of two variables having just one eq. (7.130), it is possible that this choice is not unique. Indeed, below, two possibilities are considered. Following the observation in Section 7.2.4.1, one can assume that the jump probability does not depend on the current state, i.e. Wðx j zÞ ¼ WðxÞ. As an alternative, a more commonly used kernel, depending on the difference between the current and future states, can be chosen, i.e. Wðx j zÞ ¼ Wðx zÞ. Both cases are investigated below. A. PDF Wðx j zÞ does not depend on the current state. In this case Wðx j zÞ ¼ WðxÞ
ð7:132Þ
and the eq. (7.127) is reduced to
ð1 1
Wðx j zÞps ðzÞdz ¼ WðxÞ
ð1 1
ps ðzÞdz ¼ WðxÞ ¼ ð Þps ðxÞ x
@ ps ðxÞ @x ð7:133Þ
or, equivalently @ WðxÞ ¼ 1 ps ðxÞ x ps ðxÞ @x
ð7:134Þ
Since it was assumed that eq. (7.131) is satisfied, the function WðxÞ is a positive function. The only additional condition would be its normalization to 1, i.e. ð ð 1 @ 1 @ WðxÞdx ¼ 1 ps ðxÞ x ps ðxÞ dx ¼ 1 x ps ðxÞdx ¼ 1 @x @x 1 1 1
ð1
ð7:135Þ or, equivalently ð1 x 1
@ ps ðxÞdx ¼ 1 @x
ð7:136Þ
Using integration by parts, the last integral can be transformed to ð1 x 1
@ ps ðxÞdx ¼ @x
ð1 1
xdps ðxÞ ¼ xps ðxÞjba
ð1 1
ps ðxÞdx ¼ xps ðxÞjba 1
ð7:137Þ
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
343
Here a and b > a are the boundaries of the interval ½a; b, where ps ðxÞ differs from zero. Both of them can be infinite. Comparing eq. (7.137) to eq. (7.136) one can conclude that the function WðxÞ represents a proper PDF if xps ðxÞjba ¼ 0
ð7:138Þ
This condition is satisfied automatically if both boundaries are infinite, since ps ðxÞ is integrable. If at least one of the boundaries is finite, then eq. (7.138) constitutes yet another restriction on the class of PDF which can be achieved. B. Probability depending on the difference between the current and future states. This case was originally considered in [51]. Equation (7.127) becomes an integral equation of convolution type ð1
@ Wðx zÞps ðzÞdz ¼ 1 ps ðxÞ x ps ðxÞ @x 1
ð7:139Þ
and can be solved using the Fourier transform technique. Indeed, let w ð j!Þ and x ðj!Þ be characteristic functions, corresponding to the PDF WðxÞ and ps ðxÞ, respectively w ð j!Þ ¼ x ð j!Þ ¼
ð1 ð1 1 1
WðxÞ expð j!xÞdx
ð7:140Þ
ps ðxÞ expð j!xÞdx
ð7:141Þ
Taking the Fourier transform of eq. (7.139) and recalling the convolution theorem one could obtain w ð j!Þx ð j!Þ ¼ x ð j!Þ þ
d ! x ð j!Þ d!
ð7:142Þ
and thus d d! x ð j!Þ d w ð j!Þ ¼ 1 þ ! ¼ 1 þ ! x ð j!Þ d! x ð j!Þ
ð7:143Þ
where x ð j!Þ is the log-characteristic (cumulant generating) function of the process xðtÞ. x ð j!Þ ¼ 1 þ
1 X
k
k! k¼1
ð j!Þk
Substitution of eq. (7.144) into eq. (7.143) produces " # 1 1 X d
k X
k k w ð j!Þ ¼ 1 þ ! 1þ ð j!Þ ¼ 1 þ ð j!Þk d! k! k!k k¼1 k¼1
ð7:144Þ
ð7:145Þ
344
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
thus implying that mkw ¼
k k
ð7:146Þ
The last equation allows one to synthesize an SDE if only a few cumulants of the desired output process are known. Since the characteristic function of a proper distribution must obey certain conditions [53], not all PDFs of the solution xðtÞ can be achieved. Example: exponentially correlated process with a Gaussian marginal PDF. In this case the stationary PDF is defined by 1 x2 ps ðxÞ ¼ pffiffiffiffiffiffiffiffiffiffi exp 2 2 22
ð7:147Þ
In order to synthesize a finite state Markov chain, proper quantization levels i , 1 i N þ 1 are to be chosen, for example, using the entropy method [54]. The corresponding probabilities qi of the states can be calculated using the error function [32]. Taking an infinite number of states, a continuous time model can be obtained as given by eq. (7.113). It is important to note that, despite the fact that the marginal PDF of this Markov process is a Gaussian one and the correlation function is exponential, the process itself is not Gaussian, i.e. the second order joint probability density is not a Gaussian one. In contrast, the diffusion process with the same distribution and correlation function is Gaussian [55]. The corresponding SDE of the form (7.55) is a linear one: x_ ¼ x þ 22 ðtÞ
ð7:148Þ
with transitional PDF equal to 1 1 x2 expðÞxx0 ðx j x0 ; Þ ¼ pffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi exp 2 2 ð1 expð2ÞÞ 22 1 expðÞ
ð7:149Þ
Condition (7.131) becomes
x2 1 2
ð7:150Þ
and is satisfied as long as . The corresponding density (7.134) of jumps then becomes x2 1 þ 2 @ x2 pffiffiffiffiffiffiffiffiffiffi exp 2 ; WðxÞ ¼ 1 ps ðxÞ x ps ðxÞ ¼ @x 2 22
>
ð7:151Þ
MODELING OF A SCALAR RANDOM PROCESS USING A FIRST ORDER SDE
345
It was shown in [41] that it is impossible to obtain a Gaussian process using an SDE of the form x_ ðtÞ ¼ x þ ðtÞ
ð7:152Þ
a Poisson flow of delta pulses. Indeed, in order to achieve a Gaussian process by a linear transformation, one requires the input to be Gaussian as well. However, the Poisson process is not a Gaussian one, unless its intensity is infinite. Example: Pearson class of PDFs. In this case the stationary PDF pst ðxÞ obeys the following equation d pst ðxÞ a1 x þ a0 dx ¼ b2 x 2 þ b1 x þ b0 pst ðxÞ
ð7:153Þ
It was shown in [32] that the following SDE (Ito form) allows one to generate all possible Pearson processes with exponential correlation function of the type ffi sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi dx a0 þ b1 ðb2 x2 þ b1 x þ b0 Þ ¼ x ðtÞ þ a1 þ 2b2 dt a1 þ 2b2
ð7:154Þ
In all cases the transition probability density can be expressed in terms of classical orthogonal polynomials [32] or in closed integral form as found in [47]. A few examples are given in Table 7.1. Table 7.1 Parameters of the continuous Markov process with exponential correlationa pst ðxÞ expðxÞ
x expðxÞ ð þ 1Þ
ð þ þ 2Þ ð þ 1Þð þ 1Þ
ð1 þ xÞ ð1 xÞ 2þþ1
d ln pst ðxÞ dx
ðx; x0 ; Þ
( " # h xx 1 i ðx x0 Þ2 0 pffiffiffiffiffiffi exp exp 1 2 4 2 4 " #) ð1 2 ðx þ x0 Þ 1 2 þ pffiffiffi ex ðx þ x0 Þ ez dz þ exp 4 pffiffiffi 2 =2 x 1 x x 1 e x0 expðÞ =2 pffiffiffiffiffiffiffi 2e xx0 x þ x0 e I exp 1 e 1 e 1 ð Þ ð þ Þx ð1 þ xÞ ð1 xÞ X enðnþþþ1Þ 2þþ1 1 x2 n¼0 ; An P; n ðx0 ÞPn ðxÞ
An ¼ a
ð2n þ þ þ 1Þðn þ þ þ 1Þ ðn þ þ 1Þðn þ þ 1Þn!
Here I ðxÞ is the modified Bessel function of the first kind and P; n ðxÞ are the Jacobi polynomials [32].
346
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Table 7.2
Parameters of Markov processes with exponential correlation and jumps d ln pst ðxÞ dx
pst ðxÞ
h
i þ x expðxÞ h i x expðxÞ 1 ð xÞ ð þ 1Þ ð Þx ð þ Þx2 1 1 x2
1
expðxÞ x expðxÞ ð þ 1Þ
x x
ð þ þ 2Þ ð þ 1Þð þ 1Þ
WðxÞ
ð Þ ð þ Þx 1 x2
ð1 þ xÞ ð1 xÞ 2þþ1
1
ð þ þ 2Þ ð1 þ xÞ ð1 xÞ ð þ 1Þð þ 1Þ 2þþ1
In order to obtain a convenient expression for the Pearson class of distributions, equation (7.134) can be rewritten as 3 @ p ðxÞ st @ 7 6 WðxÞ ¼ 1 pst ðxÞ x pst ðxÞ ¼ 4 1 x @x 5pst ðxÞ @x pst ðxÞ
2
a1 x 2 þ a0 x ¼ 1 pst ðxÞ b2 x 2 þ b1 x þ b0
ð7:155Þ
Table 7.2 contains some of the examples given in Table 7.1. Example: Khinchin PDF. As an example of a non-Pearson distribution one can consider the so-called Khinchin probability density pst ðxÞ ¼
1 cos x x2
ð7:156Þ
This distribution is interesting since none of its moments exist. Indeed, the characteristic function, corresponding to this distribution is ð!Þ ¼
0 j!j 1 1 j!j j!j < 1
ð7:157Þ
and it is not differentiable at ! ¼ 0. The last statement is equivalent to the fact that the PDF (7.156) does not have moments which converge. Nevertheless, it was shown in [41] that this distribution can be obtained if x y x y3 þ ðx yÞ sin expðÞ cos 1 5 4 A1 C x y þ Wðx j yÞ ¼ expðÞ 2 þ ðx yÞ2 2
ð7:158Þ
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
347
where C ðwÞ ¼
ð1 w
cosðzÞ dz z
ð7:159Þ
ð7:160Þ
and ¼
7.3 7.3.1
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS ON THE BASIS OF A VECTOR SDE Preliminary Comments
It was shown in Section 7.2 that it is possible to synthesize a one-dimensional SDE with the solution being a wide band random process with a given PDF and correlation interval. At the same time many real applications require modeling of random processes with a more complicated correlation function (spectral density). Of a special importance is the problem of modeling narrow band random processes frequently encountered in radio communications and other applications [56–58]. To remain in the SDE framework one should consider the SDE of an order higher than one. In this case a one-dimensional random process can be considered as a component of some vector Markov process.5 The first part of this section is devoted to the generation of narrow band random processes (processes with an oscillating correlation function) using two simple forms of an SDE of second order, the stochastic analog of the Duffing and Van der Pol equations [31]. The following material shows how the concept of an exponentially correlated process, combined with the concept of compound processes [40,42], can be used to generate one-dimensional processes with a given PDF and rational spectrum. 7.3.2
Synthesis Procedure of a ð; !Þ Process
Since the ACF of the first order SDE solution is always monotonous, a random process with the ACF defined by eq. (7.2) can be generated only by a system of at least two coupling equations. Such a narrow band stationary process, or ð; !Þ process, may be written as ðtÞ ¼ AðtÞ cos ’ðtÞ sinð!tÞ þ AðtÞ sin ’ðtÞ cosð!tÞ
ð7:161Þ
with uncorrelated baseband components (considered at the same time instant)
k ðtÞ ¼ AðtÞ cos ’ðtÞ
? ðtÞ ¼ AðtÞ sin ’ðtÞ h k ðtÞ ? ðtÞi 0
5
It is important to point out here that in this case the one-dimensional process is not a Markov one.
ð7:162Þ
348
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
and I ðtÞ ¼ k ðtÞ sin !t R ðtÞ ¼ ? ðtÞ cos !t
ð7:163Þ
The marginal distribution p ðxÞ defines, according to the Blanc-Lapierre transform6 [59], the marginal distribution pA ðAÞ of the envelope qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 ðtÞ ð7:164Þ AðtÞ ¼ k2 ðtÞ þ ? ð1 ð1 p st ðxÞ!J0 ð!AÞ expð j!xÞdx d! ð7:165Þ pA ðAÞ ¼ A 1
1
where J0 ð Þ stands for a Bessel function of order zero of the first kind [32]. The phase I ðtÞ ’ðtÞ ¼ atan ð7:166Þ R ðtÞ of ðtÞ does not depend on A and is uniformly distributed on ½0; 2. Thus pA’ ðA; ’Þ ¼
1 pA ðAÞ 2
ð7:167Þ
or, returning back to y and z7 variables, pffiffiffiffiffiffiffiffiffiffiffiffiffiffi p z2 þ y2 A 1 pffiffiffiffiffiffiffiffiffiffiffiffiffiffi pR I ðy; zÞ ¼ 2 z2 þ y 2
ð7:168Þ
Detailed derivation and discussion of the representation (7.161)–(7.168) is given in Section 4.7. Here it is just emphasized that the components R ðtÞ and I ðtÞ have identical distributions and autocovariance functions. Equation (7.168) allows one to obtain the bivariate PDF pR I ðy; zÞ corresponding to the given p ðxÞ. If one denotes ð1 2 ¼ x2 pI ðxÞdx ð7:169Þ 1
then the variance matrix of the vector process ¼ ðR ; I ÞT has the following form 2 0 ð7:170Þ ¼ 2 I C ¼ 0 2 The next step is to consider a system of two SDEs dR ¼ R dt þ !I dt þ g11 ðR ; I ÞdW1 ðtÞ þ g12 ðR ; I ÞdW2 ðtÞ dI ¼ !R dt I dt þ g21 ðR ; I ÞdW1 ðtÞ þ g22 ðR ; I ÞdW2 ðtÞ 6
See also Section 4.7. It is known that k ðtÞ, ? ðtÞ, R ðtÞ and I ðtÞ have the same PDF (see Section 4.7 for details).
7
ð7:171Þ
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
349
The stationary PDF should satisfy the following Fokker–Planck equation
@ @ 1 @2 ½K11 ðy; zÞpR I ðy; zÞ ½K12 ðy; zÞpR I ðy; zÞ þ ½K211 ðy; zÞpR I ðy; zÞ @y @z 2 @y2 1 @2 1 @2 ½K212 ðy; zÞpR I ðy; zÞ ¼ 0 ½K ðy; zÞp ðy; zÞ þ þ 222 R I 2 @z2 2 @z@y
ð7:172Þ
where the drifts K11 ðy; zÞ and K12 ðy; zÞ are given by K11 ðy; zÞ ¼ R þ !I and K12 ðy; zÞ ¼ !R I
ð7:173Þ
while the diffusion coefficients can be found according to eqs. (7.176) and (7.178) below. Since eqs. (7.171) form a system with linear drifts, the solution has the correlation matrix CðÞ which obeys the following differential equation C_ ðÞ ¼ !
! CðÞ
ð7:174Þ
Taking into account the initial conditions (7.170) one obtains the solution of eq. (7.174) as
exp½jj cos ! CðÞ ¼ exp½jj sin !
exp½jj sin ! Cð0Þ exp½jj cos !
ð7:175Þ
It can be checked by direct substitution that if the drift is chosen according to (7.173) and the diffusions as ðy sp ðs; zÞds pR I ðy; zÞ 1 R I ðz K222 ðy; zÞ ¼ sp ðy; sÞds pR I ðy; zÞ 1 R I ð z ðy ! K212 ðy; zÞ ¼ K221 ðy; zÞ ¼ spR I ðy; sÞds spR I ðs; zÞds pR I ðy; zÞ 1 1 K211 ðy; zÞ ¼
ð7:176Þ ð7:177Þ ð7:178Þ
then Fokker–Planck equation (7.172) is satisfied. Having these parameters defined by eq. (7.173) and eqs. (7.175)–(7.178), one can write the corresponding generating SDE8 as
d R I
8
¼ !
!
s ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi dw1 ðtÞ R K211 ðy; zÞK212 ðy; zÞ dt þ I K212 ðy; zÞK222 ðy; zÞ dW2 ðtÞ
ð7:179Þ
In general one has to solve the equation gðy; zÞgH ðy; zÞ ¼ K 2 . Using the fact that K 2 is a symmetric matrix according to eq. (7.178) and the symmetry in the properties of y and z mentioned on page 348, one can choose the symmetrical solution gðy; zÞ which is unique in this case. A more general consideration can be the subject of a separate investigation.
350
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Example: A Gaussian ð; !Þ process. In this case one can write 1 x2 px ðxÞ ¼ pffiffiffiffiffiffiffiffiffiffi exp 2 2 22
ð7:180Þ
and pA ðAÞ ¼
A A2 exp 22 2
ð7:181Þ
which results in the joint PDF of the in-phase and quadrature components pR I ðy; zÞ ¼
2 1 y þ z2 exp 22 22
ð7:182Þ
and the corresponding vector generating SDE is, respectively, pffiffiffiffiffiffiffiffiffiffi 22 dW1 ðtÞ pffiffiffiffiffiffiffiffiffiffi dI ¼ !R dt I dt þ 22 dW2 ðtÞ
dR ¼ R dt þ !zdt þ
ð7:183Þ
Example: A ð; !Þ process with uniformly distributed envelope. In this case one has pA ðAÞ ¼ p ðxÞ ¼ pR I ðy; zÞ ¼
1 ; A0
0 A a0
ð7:184Þ
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 ln A20 x2 þ A20 x2 þ 1 A0
ð7:185Þ
1 pffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2A0 y2 þ z2
ð7:186Þ
where the corresponding diffusions are 0sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 2 þ y2 A 0 1A K211 ðy; zÞ ¼ @ y 2 þ z2 0sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 2 2 A0 þ z 1A K222 ðy; zÞ ¼ @ y2 þ z2 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi! A20 þ y2 A20 þ z2 pffiffiffiffiffiffiffiffiffiffiffiffiffiffi K211 ðy; zÞ ¼ y2 þ z2
ð7:187Þ
ð7:188Þ
ð7:189Þ
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
351
and the generating SDE is v2 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi u sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 3 u A20 þ y2 A20 þ z2 u6 A20 þ y2 7" pffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 u6 " # " #" # # 2 2 7 y þz y2 þ z2 pffiffiffiu 7 dW1 ðtÞ R ! R u6 6 7 d ¼ dt þ u6 u6 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 7 ! I I 7 dW2 ðtÞ u4 A2 þ y2 A2 þ z2 A20 þ z2 5 t 0 pffiffiffiffiffiffiffiffiffiffiffiffiffiffi 0 1 y 2 þ z2 y 2 þ z2 ð7:190Þ
7.3.3
Synthesis of a Narrow Band Process Using a Second Order SDE
General properties of a stationary narrow band random process have been discussed in some detail in Section 4.7. It was shown there that a narrow band process ðtÞ can be represented as ðtÞ ¼ AðtÞ cos½!0 t þ ’ðtÞ
ð7:191Þ
where AðtÞ and ’ðtÞ are random amplitude and phase, varying slowly [60–62] compared to the ‘‘central frequency’’ !0 . In this case the spectrum width of the process ðtÞ satisfies the following condition 1 !0
ð7:192Þ
For a narrow band stationary process, the joint distribution of the amplitude and phase has the following form [63] pðA; ’Þ ¼ pA ðAÞp’ ð’Þ
ð7:193Þ
where, p’ ð’Þ ¼ 1=ð2Þ, ’ 2 ½; and the distribution pA ðAÞ of the envelope AðtÞ and the distribution px ðxÞ of process xðtÞ are related by the Blanc-Lapierre transform pair [59] p st ðxÞ ¼
1
ð1
pA ðAÞ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi dA A2 x 2 jxj
pA ðAÞ ¼
A 2
ð1
d!J0 ð!AÞ!
0
ð1 1
px st ðxÞej!x dx
ð7:194Þ
The covariance function of a narrow band process ðtÞ can be represented in a similar form C ðÞ ¼ BðÞ cosð!0 Þ where BðÞ is a slow function compared to cosð!0 Þ.
ð7:195Þ
352
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Assuming that the random process (7.191) is generated by the WGN-excited deterministic dynamic system of minimal order two, the general form of SDE can be written in the form [2] €x þ f1 ðx; x_ Þ_x þ f2 ðx; x_ Þ ¼ f3 ðx; x_ ÞðtÞ
ð7:196Þ
A full analytical investigation of eq. (7.196) is impossible. To make this problem more manageable only a few particular cases of this equation are considered below.
7.3.3.1
Synthesis of a Narrow Band Random Process Using a Duffing Type SDE
One of the simplest equations to consider is the so-called Duffing SDE [31] €x þ 2!0 x_ þ !20 FðxÞ ¼
pffiffiffiffi K !0 ðtÞ
ð7:197Þ
Here the power spectral density of the WGN is 1. It can be taken that eq. (7.197) represents a non-linear resonant contour excited by WGN (narrow band form filter). The block diagram of this form filter is given in Fig. 7.1. Equation (7.197) can be written in vector form as x_ ¼ y y_ ¼ 2!0 y !20 FðxÞ þ
pffiffiffiffi K !0 ðtÞ
ð7:198Þ
with the corresponding stationary Fokker–Planck equation 0¼
Figure 7.1
@ @ K @2 ½ypðx; yÞ þ ½ð2!0 y þ FðxÞ!20 Þpðx; yÞ þ pðx; yÞ @x @y 2 @y2
ð7:199Þ
A block diagram of a form filter, corresponding to the Duffing equation (7.197).
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
353
It can be easily verified [31] that the stationary solution of the Fokker–Planck equation (7.199) has the following one-dimensional joint PDF ð 2 2 4!0 x y FðsÞds ¼ px st ðxÞpy st ðyÞ pst ðx; yÞ ¼ C exp K x1 K!0
ð7:200Þ
From eq. (7.200) one can obtain the expression for the marginal PDF in the form ð 4!0 x FðsÞds px st ðxÞ ¼ C exp K x1
ð7:201Þ
and, respectively, the unknown non-linearity FðxÞ can be expressed as FðxÞ ¼
K d ln px st ðxÞ 4!0 dx
ð7:202Þ
Once the non-linearity FðxÞ is chosen, the parameters and K can be found from the linearized model of the type [17,66] pffiffiffiffi K !0 ðtÞ
ð7:203Þ
1 @ FðxÞ !20 @x
ð7:204Þ
2 ðK; FÞ!20 x ¼ €x þ 2!0 x_ þ ! where ¼ !;
2 ðK; FÞ
!
is defined both by K and by the nonIt is important to point out here that the parameter ! linearity FðxÞ, and this fact complicates the application of the Duffing equation as a model of 2 ðK; FÞ ¼ 1 would ensure that the narrow band processes: a choice of K which satisfies ! average frequency is equal to the required central frequency !0 but it is difficult to define its characteristics in more detail. Since the PDF px ðxÞ must be an even function of its argument9 x, its log-derivative is an odd function of x. If the function FðxÞ is well behaved in the vicinity of zero it can be expanded into a Taylor series with only odd powers of x FðxÞ ¼
1 X
2kþ1 x2kþ1
k¼0
Kmax X
2kþ1 x2kþ1
ð7:205Þ
k¼0
Thus, the average derivative of FðxÞ is then
9
d FðxÞ dx
Kmax X k¼0
ð2k þ 1Þ2kþ1 hx2k i ¼
Kmax X
ð2k þ 1Þ2kþ1 m2k
k¼0
A narrow band stationary process has a symmetric PDF—see Section 4.7.
ð7:206Þ
354
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
If such a derivative does not exist, one can approximate non-linearity in the mean square sense, i.e. minimize the values of the mean square error MSEðsÞ ¼
ð1
ð1 ½FðxÞ sx2 px st ðxÞdx ¼ ½FðxÞ2 px st ðxÞdx 1 ð1 ð1 1 2 FðxÞxpx st ðxÞdx þ s x2 px st ðxÞdx 2s 1
ð7:207Þ
1
with the minimum achieved at Ð1 d Ð1 px st ðxÞdx 1 x FðxÞxp ðxÞdx K K x st Ð 1 dx Ð1 sopt ðKÞ ¼ 1 ¼ !20 ¼ ¼ 2 2 4! 4! m x p ðxÞdx x p ðxÞdx 0 0 2x x st x st 1 1
ð7:208Þ
Solving for K one obtains K ¼ 4!30 m2x ¼ 4!30 2x
ð7:209Þ
The models which use the Duffing SDE also have an important property which is often assumed (or indeed, proved) that the derivative of a random process is independent of the value of the process itself at the same moment of time and is a Gaussian random variable. This fact makes models based on the Duffing equation attractive for use in fading modeling. Example: A Gaussian narrow band random process. In this case 1 x2 px st ðxÞ ¼ pffiffiffiffiffiffiffiffiffiffi exp 2 2 22 A A2 pA ðAÞ ¼ 2 exp 2 2
ð7:210Þ ð7:211Þ
Using formula (7.201) one obtains the function FðxÞ as FðxÞ ¼
K x 4!0 2
ð7:212Þ
If the central frequency !0 and bandwidth ! are desired, then the parameters K and should be chosen as10 K ¼ 42 !0 ;
¼ !
ð7:213Þ
and finally, the generating SDE is €x þ 2!0 x_ þ 10
!20 x
¼
2!20
sffiffiffiffiffiffiffi ! ðtÞ !0
The same result is obtained using the statistical linearization, (7.209).
ð7:214Þ
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
355
Example: A narrow band process with K0 -distributed envelope. Let the distribution of the envelope of a narrow band random process be given by pA ðAÞ ¼ 2 AK0 ðAÞ;
A0
ð7:215Þ
Then, substituting eq. (7.215) into eq. (7.194), one obtains that the distribution of instantaneous values obeys the two-sided Laplace distribution px ðxÞ ¼
exp½jxj 2
ð7:216Þ
and, thus, the corresponding non-linearity is FðxÞ ¼
K sgnx 8!0
ð7:217Þ
Since the derivative of expression (7.217) does not exist, one has to statistically linearize this expression in the mean square sense. The last function, being statistically linearized using eq. (7.197) [63] produces K¼
4!30 2
ð7:218Þ
Finally, the generating SDE is thus rffiffiffiffiffiffiffi ! !20 !0 2 €x þ 2!_x þ sgnx ¼ 2 !0 ðtÞ 2
ð7:219Þ
Example: A bimodal distribution. It has been pointed out in Section 7.2.3 that the bimodal distribution px st ¼ C expðqx4 px2 Þ
ð7:220Þ
approximates well the distribution of instantaneous values of processes with the Nakagami envelope. In this case the non-linearity FðxÞ becomes a simple third order polynomial FðxÞ ¼
K d K ln px st ðxÞ ¼ ð4qx3 þ 2pxÞ 4!0 dx 4!0
ð7:221Þ
Estimation of the parameter K can be achieved either by using eq. (7.206) K1 ¼
4!0 12qm2 þ 2p
ð7:222Þ
356
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
or by applying eq. (7.209) K2 ¼ 4!0 m2
ð7:223Þ
where the second moment m2 is calculated to be
m2 ¼
1 p K3=4 4q
2 p 8q
K1=4 2
K1=4
2 p 8q
p 8q
ð7:224Þ
It can be seen from Fig. 7.2 that both methods provide a similar11 estimate for jpj > 2q.
7.3.3.2
An SDE of the Van Der Pol Type
The Duffing SDE is convenient only when the Blanc-Lapierre transformation can be expressed analytically in terms of the known functions. At the same time, as was mentioned before, the central frequency of the power spectral density is strongly dependent on (Fig. 7.3) the function FðxÞ and is varying when the parameters of the distribution (7.193) are changed.
Figure 7.2
11
Dependence of K2 =K1 ¼ m2 ð12qm2 þ 2pÞ on p and q.
Accuracy of a statistical linearization is often acceptable if it does not exceed 10–15 % [36].
357
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
Figure 7.3
Block diagram for a form filter, corresponding to the Duffing eq. (7.219).
The following simple second order equations often produce better results [64,66] €x þ f1 ðx; x_ Þ_x þ !20 x ¼ KðtÞ
ð7:225Þ
€x þ f2 ðx; x_ Þ_x þ
ð7:226Þ
!20 x
¼ K x_ ðtÞ
where fi ðx; x_ Þ are some deterministic functions of x and x_ . The corresponding form filters are shown in Figs. 7.4 and 7.5. Once again the narrow band solutions of eqs. (7.225) and (7.226) can be represented in the form (7.191), xðtÞ ¼ AðtÞ cosð!0 t þ ’ðtÞÞ
Figure 7.4 Form filter corresponding to the Van der Pol SDE (7.225).
ð7:227Þ
358
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Figure 7.5 Form filter corresponding to the Van der Pol SDE (7.225).
and the adjoined process ^xðtÞ can be approximated by the Tikhonov formula [67] ^xðtÞ ¼
x_ ðtÞ !0
ð7:228Þ
instead of the exact Hilbert transform (see Section 4.7.1). In this case the envelope can be defined as [67] AðtÞ ¼
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi x2 ðtÞ þ ^x2 ðtÞ ¼ x2 ðtÞ þ x_ 2 ðtÞ
ð7:229Þ
Using approximation (7.228) and relationship (7.229) one can rewrite SDEs (7.225) and (7.226) in terms of the envelope AðtÞ and the process itself as [64,66] pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 8 2 2 > < x_ ¼ !0 A x 2 2 > : A_ ¼ A x F1 ðx; AÞ A 8 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 2 > < x_ ¼ !0 A x
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi A 2 x2 ðtÞ !0 A
2 2 2 2 > : A_ ¼ A x F2 ðx; AÞ þ A x ðtÞ A A
ð7:230Þ
ð7:231Þ
respectively. Here pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Fi ðx; AÞ ¼ fi ðx; !0 A2 x2 Þ; i ¼ 1; 2
ð7:232Þ
MODELING OF A ONE-DIMENSIONAL RANDOM PROCESS
359
The corresponding Fokker–Planck equations are then given by pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi @ @ p1 ðA; x; tÞ ¼ ½!0 A2 x2 p1 ðA; x; tÞ @t @x @ A2 x2 K 2 x2 F1 ðx; AÞ þ 2 3 p1 ðA; x; tÞ @A A 2!0 A 2 2 1 @2 2A x þ K p1 ðA; x; tÞ 2 @A2 !20 A2
ð7:233Þ
and i @ @ h pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p2 ðA; x; tÞ ¼ !0 A2 x2 p2 ðA; x; tÞ @t @x @ A2 x 2 K 2 x2 F2 ðx; AÞ þ 2 3 p2 ðA; xtÞ @A A 2!0 A " # 2 2 2 1 @2 2 ðA x Þ þ K p2 ðA; x; tÞ A2 2 @A2
ð7:234Þ
respectively. In the stationary case, when t ! 1, the joint PDF can be written as [64,66] pi ðAÞ pi ðx; AÞ ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ; A2 x 2
i ¼ 1; 2
ð7:235Þ
taking into account the uniformly distributed phase and eq. (7.193). Here pi ðAÞ is the stationary distribution of amplitude. After substituting eq. (7.235) into eqs. (7.233)–(7.234) and integrating over interval ðA; AÞ with respect to x, one obtains the equation for the stationary PDF of amplitude ð A pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi K2 @2 @ 1 K2 2 x2 dx p ðAÞ þ F ðx; AÞ A ðAÞ ¼0 p 1 1 1 @A A A 4!20 @A2 4!20 ð A pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi K2 @2 @ 1 K2 2 x2 dx p ðAÞ þ F ðx; AÞ A ðAÞ ¼0 p 2 2 2 @A A A 4!20 @A2 4!20
ð7:236Þ ð7:237Þ
In turn, eqs. (7.236) and (7.237) can be rewritten to have the unknown functions Fi ðA; xÞ on the left hand side, i.e. pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi K 2 d 2 2 F1 ðx; AÞ A x dx ¼ 2 1 A ln p1 ðAÞ ¼ F1 ðAÞ dA 4!0 A ðA pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi K 2 2 d A 1 þ 3A ln p2 ðAÞ ¼ F2 ðAÞ F2 ðx; AÞ A2 x2 dx ¼ dA 16 A ðA
ð7:238Þ ð7:239Þ
360
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
In order to simplify solution of the integral eqs. (7.238) and (7.239), additional requirements can be imposed on the form of unknown functions Fi ðx; AÞ. In the following, considerations are restricted to the case where Fi ðx; AÞ depends only on one variable. In particular, if it is assumed that they depend only on a single variable A, i.e. x_ 2 fi ðx; x_ Þ ¼ fi x2 þ ¼ Fi ðAÞ !0
ð7:240Þ
the left hand side of eqs. (7.238) and (7.239) can be transformed by noting that ðA
ðA pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi A2 Fi ðAÞ Fi ðx; AÞ A2 x2 dx ¼ Fi ðAÞ A2 x2 dx ¼ 2 A A
ð7:241Þ
and, thus Fi ðAÞ ¼
2Fi ðAÞ A2
ð7:242Þ
If dependence only on the variable x is assumed in Fi ðx; AÞ, i.e. fi ðx; x_ Þ ¼ fi ðxÞ ¼ Fi ðxÞ
ð7:243Þ
eqs. (7.238) and (7.239) are reduced to pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi K 2 d 2 2 F1 ðxÞ A x dx ¼ 2 1 A ln P1 ðAÞ ¼ F1 ðAÞ dA 4! A 0 ðA pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi K 2 2 d A 1 þ 3A ln P2 ðAÞ ¼ F2 ðAÞ F2 ðxÞ A2 x2 dx ¼ dA 16 A ðA
ð7:244Þ ð7:245Þ
It can be shown that the solution of eqs. (7.244) and (7.245) has the following form [64,66] 2 3 d d ðx F F ðAÞ ðAÞ i i 1 x d 6dA dA 7 þ fi ðxÞ ¼ lim dA 4 5 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 A!0 0 dA A A x A2
ð7:246Þ
thus providing a convenient tool for modeling narrow band processes with the prescribed PDF of the envelope [64,66]. Example: A narrow band process with the Nakagami PDF [64,66]. For a narrow band process with the envelope distributed according to the Nakagami law with parameters m and , the marginal PDF of the envelope is given by pA ðAÞ ¼
2 mm 2m1 mA2 A exp ðmÞ
ð7:247Þ
SYNTHESIS OF A ONE-DIMENSIONAL PROCESS WITH GAUSSIAN MARGINAL PDF
Function F2 ðAÞ according to eq. (7.239) is then K 2 6mA4 2 ð6m 3ÞA F2 ðAÞ ¼ 16
361
ð7:248Þ
and eq. (7.242) becomes K2 6mA2 6m 3 f2 ðAÞ ¼ 8
ð7:249Þ
Finally, the generating SDE is then K 2 6m 2 x_ 2 €x þ x þ 2 þ 3 6m x_ þ !20 x ¼ K x_ ðtÞ 8 !0 Example: given by
ð7:250Þ
A generalized Rayleigh distribution [64,66]. If the PDF of the envelope is pA ðAÞ ¼ CA exp½A2 A4
ð7:251Þ
then, according to eq. (7.238), function F 1 ðAÞ is defined by F1 ðAÞ ¼
K 2 ð2A2 þ 4A4 Þ 4!20
ð7:252Þ
The corresponding solution of eq. (7.246) is then f1 ðxÞ ¼
K 2 ð þ 3x3 Þ !20
ð7:253Þ
which, in turn, results in the following generating SDE €x þ
K 2 ð þ 3x3 Þ_x þ !20 x ¼ KðtÞ !20
ð7:254Þ
It is easy to see that, when ¼ 0 (Rayleigh distribution), eq. (7.254) becomes a linear equation coinciding with eq. (7.213). These models were used in a hardware simulator of HF communication channels [64,66].
7.4
SYNTHESIS OF A ONE-DIMENSIONAL PROCESS WITH A GAUSSIAN MARGINAL PDF AND NON-EXPONENTIAL CORRELATION
In this section synthesis of a one-dimensional process with Gaussian marginal PDF and a given covariance function is considered. If the covariance function is a pure exponent with
362
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
decay , its mean and variance are m and 2 respectively, then the first order SDE pffiffiffiffiffiffi x_ ¼ ðx mÞ þ 2ðtÞ ð7:255Þ is the generating SDE for this problem. The problem becomes more difficult if the correlation function differs from the exact exponent. In this case one has to represent this process as a component12 of the vector Markov Gaussian process which, in turn, is the solution of a vector linear SDE x_ ¼ Ax þ BðtÞ
ð7:256Þ
where the state vector x is composed of components xðtÞ and its derivatives with an order not higher than M 1 2 3 xðtÞ 6 x0 ðtÞ 7 7 ð7:257Þ x¼6 4 ... 5 ðM1Þ ðtÞ x A and B are some constant matrices of dimension M M and M 1, respectively. It is well known that spectral density ð!Þ of the process xðtÞ defined by eq. (7.256) is a rational function and it can be written in the form [55] ð!Þ ¼
Nð!2 Þ Dð!2 Þ
ð7:258Þ
where Nð!2 Þ and Dð!2 Þ are two polynomials in !2 . Any given covariance function Cxx ðÞ of the process xðtÞ can be approximated by a finite series of exponents [55] as13 Cxx ðÞ ¼
M X
i expði Þ
ð7:259Þ
i¼1
The last equality produces a rational spectrum according to the Wiener–Khinchin theorem. If the non-rational power spectral density is needed it can be approximated with some accuracy by the rational spectrum of the form (7.258) [55]. Let hr , and h r be zeros of the spectral density ð!Þ, and r and r be its poles. It is assumed that hr and r are chosen in such a way that their real parts are negative. Then the spectral density can be written as Qm ð!2 þ 2Reðhr Þ! þ jhr j2 Þ ð7:260Þ ð!Þ ¼ C Qnr¼1 2 2 s¼1 ð! þ 2Reðr Þ! þ jr j Þ 12
The process itself is, thus, non-Markov. Note that here, in contrast to the one-dimensional SDE, both i and i can be complex but are encountered in complex-conjugate pairs.
13
SYNTHESIS OF A ONE-DIMENSIONAL PROCESS WITH GAUSSIAN MARGINAL PDF
363
where C is a constant. The last expression can be factorized to produce Fð j!Þ 2 Fð j!ÞFðj!Þ ¼ ð!Þ ¼ Gð j!Þ Gð j!ÞGðj!Þ
ð7:261Þ
where FðzÞ and GðzÞ are polynomials of degree m and n, respectively, with all their zeros in the left half-plane. In order for the process ðtÞ to have a finite variance, the degree m of the numerator must be less than the degree of the denominator, m < n. Without loss of generality, one can fix the leading coefficient of GðzÞ as equal to 1, so that GðzÞ ¼
n Y
ðz r Þ
r¼1 m Y FðzÞ ¼ f0 ðz hr Þ
ð7:262Þ
r¼1
where f0 is the leading coefficient of FðzÞ, f0 ¼ c1=2 .14 Finally the process xðtÞ can be derived from a WGN ðtÞ of unit spectral density by the equation of n-th degree d d g xðtÞ ¼ F ðtÞ dx dx
ð7:263Þ
dwr r wr ¼ Mr ðtÞ dt n X wr ðtÞ xðtÞ ¼
ð7:264Þ
or in the Causchy form
r¼1
where the coefficients Mr can be found as [55]: ðp r ÞFðpÞ p!r Gð pÞ
Mr ¼ lim
ð7:265Þ
Another, but equivalent, form of eq. (7.264) can be obtained if the coefficients of the polynomials GðzÞ and FðzÞ are known. Let GðzÞ ¼ FðzÞ ¼
n X r¼0 m X
gr z r
ð7:266Þ
f s zs
ð7:267Þ
s¼0
14
It is assumed that all zeros and poles are distinct. If this is not so, the appropriate formulas can be derived by allowing certain zeros or poles to coalesce [55].
364
SYNTHESIS OF STOCHASTIC DIFFERENTIAL EQUATIONS
Then the generating SDE has a form 2
fn1
6 6 fn2 16 ... w_ ¼ 6 fn 6 6 4 f2 f1 w1 x¼ fn
fn
0
0 ... 0 0
...
fn . . . ... ... 0 0
... ...
3
3 2 fn gm 7 0 7 7 7 16 6 fn1 gm1 7 ...7 7ðtÞ 7w þ fn 6 4 ... 5 7 fn 5 fnm g0 0 0
ð7:268Þ
ð7:269Þ
where w is a vector of hidden states. If one wants to avoid the calculations, then the form (7.264) should be used if the correlation function is given (or approximated) as the sum of exponents, and the forms (7.268) and (7.269) should be used if the energetic spectrum is given (or approximated) as a rational function of the frequency. Example: a Gaussian process with a rational spectrum. Let the following power spectral density of a Gaussian process to be generated [36]: Sx ð!Þ ¼
N0 ða þ !0 Þb2 þ ða !0 Þ!2 b4 þ 2ða2 !20 Þ!2 þ !4
ð7:270Þ
where w20 ¼ b2 a2 . In this case the numerator has two roots w ¼ ib1 , b21 ¼ b2 ða þ !0 Þða !0 Þ1 , and the denominator has four roots ð!0 iaÞ. Selecting the roots disposed in the upper half-plane one obtains gðzÞ ¼ ðzÞ ¼ FðzÞ
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2K 2 ða !0 Þðz þ b1 Þ z2 þ 2az þ b2
ð7:271Þ
The corresponding generating SDE is then w_ ¼
0
b2 x ¼ w1
where q1 ¼
7.5
q1 wþ ðtÞ 2a q2 1
ð7:272Þ
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2K 2 ða !0 Þ and q2 ¼ 2K 2 ða !0 Þðb1 2aÞ
SYNTHESIS OF COMPOUND PROCESSES
This section builds on the theory of compound processes presented in Section 4.8. The results shown here are based on the synthesis of a process with exponential correlation and a Gaussian process with a rational spectrum, which have been discussed in Sections 7.2.3 and 7.4, respectively.
SYNTHESIS OF COMPOUND PROCESSES
7.5.1
365
A Compound Process
In this section we show how to obtain the exponentially correlated zero mean random process to be a compound one, and, consequently, to be a component of the vector random process: xðtÞ ¼ x1 ðtÞx2 ðtÞ
ð7:273Þ
If processes x1 ðtÞ and x2 ðtÞ in eq. (7.273) have exponential covariance function Cx1 ðÞ ¼ hx1 ðtÞx1 ðt þ Þi ¼ 2x1 expð1 jjÞ
ð7:274Þ
Cx2 ðÞ ¼ hx2 ðtÞx2 ðt þ Þi ¼ 2x2 expð2 jjÞ
ð7:275Þ
then the correlation function of the process xðtÞ is Cx ðÞ ¼ hxðtÞxðt þ Þi ¼ hx1 ðtÞx1 ðt þ Þðx2 ðtÞx2 ðt þ ÞÞi ¼ 2x1 expð1 jjÞ2x2 expð2 jjÞ ¼ 2x expðjjÞ
ð7:276Þ
where ¼ 1 þ 2 , and, consequently, the random process xðtÞ is again a process. Being exponentially correlated, processes x1 ðtÞ and x2 ðtÞ are solutions of the following SDEs x_ 1 ¼ 1 x1 þ g1 ðx1 Þ1 ðtÞ
ð7:277Þ
x_ 2 ¼ 2 x2 þ g2 ðx2 Þ2 ðtÞ
ð7:278Þ
and
respectively. Here ξ₁(t) and ξ₂(t) are independent WGN and g₁(x₁) and g₂(x₂) are to be defined so as to provide the proper probability densities of x₁(t) and x₂(t). Using the equality

    ẋ(t) = ẋ₁(t) x₂(t) + x₁(t) ẋ₂(t)    (7.279)

one can obtain the following vector SDE

    ẋ₁ = −α₁ x₁ + g₁(x₁) ξ₁(t)
    ẋ₂ = −α₂ x₂ + g₂(x₂) ξ₂(t)                                                (7.280)
    ẋ  = −α₂ x₂ x₁ − α₁ x₁ x₂ + x₁ g₂(x₂) ξ₂(t) + x₂ g₁(x₁) ξ₁(t)

One of the SDEs in (7.280) depends on the other two, so the system can be rewritten as a two-dimensional SDE

    ẋ₁ = −α₁ x₁ + g₁(x₁) ξ₁(t)
    ẋ  = −(α₁ + α₂) x + x₁ g₂(x/x₁) ξ₂(t) + (x/x₁) g₁(x₁) ξ₁(t)              (7.281)
or

    ẋ₂ = −α₂ x₂ + g₂(x₂) ξ₂(t)
    ẋ  = −(α₁ + α₂) x + x₂ g₁(x/x₂) ξ₁(t) + (x/x₂) g₂(x₂) ξ₂(t)              (7.282)
When representation (7.273) is found, the synthesis of the SDE (7.280)–(7.282) can be made by taking the following steps: choose the correlation time τ₁ = 1/α₁ > τ_corr, where τ_corr is the desired correlation time of the process x(t); find the generating SDE corresponding to p₁(x₁) and α₁ using eqs. (7.54)–(7.56); find the correlation time of the process x₂(t) according to

    τ₂ = 1/α₂ = 1/(α − α₁)

find the generating SDE corresponding to p₂(x₂) and α₂ using eqs. (7.54)–(7.56); finally, write the generating SDE in the form of eq. (7.281) or eq. (7.282).

Example: K₀-distributed process. The marginal PDF of a K₀-distributed random process is given by

    p_x(x) = (a²/16) |x| K₀(√(a|x|))    (7.283)
The direct application of the methodology of Section 7.2.3 to the synthesis of the corresponding process with correlation time τ_corr = 1/α leads to complicated calculations involving higher transcendental functions. At the same time, the product of two symmetric Gamma distributed random variables with PDFs

    p_x1(x₁) = (β²/2) |x₁| exp(−β|x₁|)    (7.284)

    p_x2(x₂) = (γ²/2) |x₂| exp(−γ|x₂|)    (7.285)

gives a K₀-distributed random variable with PDF

    p_x(x) = β²γ² |x| K₀(2√(βγ|x|))    (7.286)

The SDEs generating processes with PDFs (7.284) and (7.285) are

    ẋ₁ = −α₁ x₁ + √(α₁ F₁(x₁)) ξ₁(t)    (7.287)

    ẋ₂ = −α₂ x₂ + √(α₂ F₂(x₂)) ξ₂(t)    (7.288)
Here

    F₁(x) = −(2/p_x1(x)) ∫_{−∞}^{x} s p_x1(s) ds
          = { −2(β²x² − 2βx + 2)/(β³x),   if x < 0
            {  2(β²x² + 2βx + 2)/(β³x),   if x > 0    (7.289)

and

    F₂(x) = −(2/p_x2(x)) ∫_{−∞}^{x} s p_x2(s) ds
          = { −2(γ²x² − 2γx + 2)/(γ³x),   if x < 0
            {  2(γ²x² + 2γx + 2)/(γ³x),   if x > 0    (7.290)

By setting α₁ = α₂ = α/2 and β = γ = √a/2 one obtains the desired system of SDEs.
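The product construction above is easy to verify numerically. The sketch below, for simplicity, takes both components Gaussian (constant g₁, g₂, i.e. Ornstein–Uhlenbeck processes) rather than the state-dependent g_i of eqs. (7.54)–(7.56); the point being checked is only eq. (7.276), that the covariance of the product decays at the rate α₁ + α₂. All parameter values are illustrative.

```python
import numpy as np

# Compound construction (7.273): two independent exponentially correlated
# processes are multiplied; the covariance of the product should decay with
# rate alpha = alpha1 + alpha2, as in eq. (7.276).
rng = np.random.default_rng(1)
dt, n = 1e-3, 400_000
a1, a2 = 2.0, 3.0                       # decay rates alpha_1, alpha_2
w = rng.normal(0.0, np.sqrt(dt), (n, 2))
x1 = np.empty(n); x2 = np.empty(n)
x1[0] = x2[0] = 0.0
for k in range(1, n):                   # Euler-Maruyama for (7.277)-(7.278)
    x1[k] = x1[k-1] - a1*x1[k-1]*dt + np.sqrt(2*a1)*w[k, 0]  # unit variance
    x2[k] = x2[k-1] - a2*x2[k-1]*dt + np.sqrt(2*a2)*w[k, 1]  # unit variance
x = x1 * x2                             # compound process, eq. (7.273)

def acov(y, lag):                       # sample autocovariance at a given lag
    y = y - y.mean()
    return np.mean(y[:-lag] * y[lag:])

tau = 0.1
ratio = acov(x, int(tau/dt)) / acov(x, 1)
print(ratio)                            # close to exp(-(a1 + a2)*tau)
```

The printed ratio should be near exp(−(α₁ + α₂)τ) ≈ 0.61, confirming that the decay rates add under multiplication.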
7.5.2 Synthesis of a Compound Process With a Symmetrical PDF
In this section we show how the compound model can be used to obtain a symmetrical random process with a prescribed correlation function and marginal distribution. The following statement lays the foundation for such an approach:

Proposition 4. Let ξ(t) be a zero mean stationary compound random process with a symmetrical PDF and a correlation (covariance) function C_ξ(τ) which is bounded by some exponent

    C_ξ(τ) ≤ σ²_ξ exp(−λ_m|τ|)    (7.291)

Then the process ξ(t) can be represented as a component of a vector Markov random process which is the solution of a proper SDE, if it can be represented as a product of a Gaussian process and some modulating process.

Proof: By assumption it is possible to find a representation of the random process ξ(t) in the form of a product of a normally distributed random process ξ_n(t) and a zero mean symmetric random process s(t), properly distributed and independent of ξ_n(t):

    ξ(t) = ξ_n(t) s(t)    (7.292)

If p_s(s) is the PDF of the random process s(t), then the PDF of the process ξ(t) is [42]

    p_ξ(x) = (2πD_n)^{−1/2} ∫_{−∞}^{∞} |s|^{−1} exp(−x²/(2D_n s²)) p_s(s) ds
           = 2 (2πD_n)^{−1/2} ∫_{0}^{∞} s^{−1} exp(−x²/(2D_n s²)) p_s(s) ds    (7.293)

The corresponding relation for the characteristic functions is [42]

    Θ_ξ(φ) = 2φ^{−1/4} ∫_{0}^{∞} Θ_s(ω) ω^{1/2} J_{−1/2}(ω√φ) dω    (7.294)
where Θ_s(ω) and Θ_ξ(ω) are the characteristic functions of s(t) and ξ(t), respectively. As has been shown above, the covariance function of the compound process ξ(t) is simply the product of the covariance functions of its components ξ_n(t) and s(t)

    C_ξ(τ) = C_ξn(τ) C_s(τ)    (7.295)

If s(t) is a process with decay λ_s < λ_m, then ξ_n(t) is the normal process with covariance function

    C_ξn(τ) = C_ξ(τ) exp(λ_s|τ|)    (7.296)

It is easy to show that C_ξn(τ) → 0 when τ → ∞, so the random process ξ_n(t) can be represented as a component of the random process generated by the SDE

    ẇ = A w + B ξ(t)    (7.297)

where w₁(t) = ξ_n(t), and A and B are the matrices obtained using the algorithm described in the previous section. If the generating SDE for the process s(t) is

    ṡ = f(s) + g(s) ξ_s(t)    (7.298)

then the generating SDE for the process ξ(t) is

    ẇ = A w + B ξ₁(t)
    ẋ = (x/w₁) Σ_{i=1}^{n} a_{1i} w_i + w₁ f(x/w₁) + w₁ g(x/w₁) ξ₂(t) + b₁ (x/w₁) ξ₁(t)    (7.299)
Example: A narrow band K-distributed random process. Let us obtain a random process with marginal distribution

    p_ξ(x) = (a/π) K₀(a|x|)    (7.300)

and correlation (covariance) function of the form

    K_ξ(τ) = σ² exp(−α₀|τ|) cos(ωτ)    (7.301)

In this case, the desired process ξ(t) can be represented as a product of a normally distributed random process ξ_n(t) with covariance function

    C_ξn(τ) = σ² exp(−(α₀/2)|τ|) cos(ωτ)    (7.302)

and an m-distributed random process s(t) with m = 0.5 and Ω = 1.
In this case the process ξ_n(t) is generated by the SDE

    ξ̈_n + α₀ ξ̇_n + ω² ξ_n = ξ(t)    (7.303)

and s(t) is generated by¹⁵

    ṡ = −(α₀/2) s + √(α₀/2) F(s) ξ(t)    (7.304)

where F(s) is, according to equation (4.60),

    F(s) = (σ/√2) Γ(−1/2; 0; s²/(2σ²))    (7.305)

where Γ(α; β; x) is the generalized Gamma function [32].
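A quick way to see the effect of the construction of this section is to build the compound process directly in discrete time. The sketch below uses a second-order autoregression in the spirit of eq. (7.303) for the narrow band Gaussian component and, since the Nakagami distribution with m = 0.5 and Ω = 1 is half-normal, the magnitude of an independent slow OU process as a stand-in for the exact modulation SDE (7.304). All parameter values are illustrative; the check is that the product acquires the heavy-tailed (kurtosis ≈ 9) marginal expected of a Gaussian multiplied by an independent half-normal.

```python
import numpy as np

# Narrow band compound process of Section 7.5.2: Gaussian AR(2) carrier
# times an independent positive modulation (half-normal, i.e. Nakagami
# with m = 0.5).
rng = np.random.default_rng(7)
dt, n = 1e-2, 200_000
alpha0, omega = 2.0, 2*np.pi
r, th = np.exp(-alpha0/2*dt), omega*dt
c1, c2 = 2*r*np.cos(th), -r*r           # AR(2) with poles r*exp(+-j*th)
e = rng.normal(size=n)
xn = np.zeros(n)
for k in range(2, n):
    xn[k] = c1*xn[k-1] + c2*xn[k-2] + e[k]

lam = alpha0/2                          # decay of the modulation
du = rng.normal(0.0, np.sqrt(2*lam*dt), n)
u = np.zeros(n)
for k in range(1, n):
    u[k] = u[k-1] - lam*u[k-1]*dt + du[k]
xi = xn * np.abs(u)                     # compound process, eq. (7.292)

kurt = np.mean((xi - xi.mean())**4) / np.var(xi)**2
print(kurt)    # a Gaussian would give 3; the product is heavy-tailed (~9)
```

The excess kurtosis is the signature of the sub-Gaussian-envelope (K-type) marginal, while the oscillatory correlation of xn(t) is preserved by the slow modulation.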
7.6 SYNTHESIS OF IMPULSE PROCESSES
Let us now model a stationary impulse process with a given PDF p_st(x). Synthesis of the corresponding generating SDE model

    dx(t)/dt = f(x) + ξ(t)    (7.306)

is based on inversion of the stationary Kolmogorov–Feller equation (see Section 6.2.3)

    −(d/dx)[K₁(x) p_st(x)] + ν [ ∫_{−∞}^{∞} p_A(x − A) p_st(A) dA − p_st(x) ] = 0    (7.307)

or of the generalized Fokker–Planck equation

    −(d/dx)[K₁(x) p_st(x)] + (1/2)(d²/dx²)[K₂(x) p_st(x)] + ν [ ∫_{−∞}^{∞} p_A(x − A) p_st(A) dA − p_st(x) ] = 0    (7.308)

if a mixture of Gaussian noise and Poisson impulses is present. For the moment the focus of the discussion is on eq. (7.307). If it is also assumed that a particular form of the distribution p_A(A) of the excitation pulses is chosen, then the synthesis problem is reduced to a simple first order differential

¹⁵ This SDE is written in Ito form.
equation (7.307) with respect to K₁(x). Its solution is easy to obtain by simple integration, which produces

    K₁(x) p_st(x) = ν ∫_{−∞}^{x} [ ∫_{−∞}^{∞} p_A(s − A) p_st(A) dA − p_st(s) ] ds + C₀    (7.309)
The constant C₀ must be chosen in such a way that the net probability current vanishes, i.e. one has to choose C₀ = 0. This implies that

    K₁(x) = (ν/p_st(x)) ∫_{−∞}^{x} [ ∫_{−∞}^{∞} p_A(s − A) p_st(A) dA − p_st(s) ] ds = f(x)    (7.310)

7.6.1 Constant Magnitude Excitation
In this case all impulses have the same magnitude, i.e.

    p_A(A) = δ(A − A₀)    (7.311)

As a result, using the sifting property of the delta function, eq. (7.310) reduces to the simpler expression

    f(x) = (ν/p_st(x)) ∫_{−∞}^{x} [ p_st(s − A₀) − p_st(s) ] ds = ν [ P_st(x − A₀) − P_st(x) ] / p_st(x)    (7.312)
Example: An exponentially distributed process. We wish to synthesize an SDE whose solution has the exponential distribution

    p_st(x) = λ exp(−λx),  x ≥ 0    (7.313)

In this case

    P_st(x) = { 0,                 if x < 0
              { 1 − exp(−λx),      if x ≥ 0    (7.314)

and, thus, eq. (7.312) produces

    f(x) = K₁(x) = { −(ν/λ)[exp(λx) − 1],    if 0 ≤ x ≤ A₀
                   { −(ν/λ)[exp(λA₀) − 1],   if x ≥ A₀    (7.315)
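This example is easy to verify by direct simulation: integrate the deterministic drift (7.315) between Poisson impulse arrivals and check that the trajectory spends its time according to the exponential law (7.313), for example through the stationary mean 1/λ. The parameter values below are illustrative.

```python
import numpy as np

# Monte Carlo check of the constant-magnitude example: the drift (7.315),
# driven by Poisson impulses of magnitude A0 and rate nu, should leave the
# exponential stationary density (7.313), whose mean is 1/lam.
rng = np.random.default_rng(3)
lam = nu = A0 = 1.0
dt, n = 5e-3, 400_000

def f(x):                                # drift of eq. (7.315)
    return -(nu/lam) * (np.exp(lam*min(x, A0)) - 1.0)

x = np.empty(n); x[0] = 1.0
jumps = rng.random(n) < nu*dt            # Poisson impulse arrivals
for k in range(1, n):
    x[k] = max(x[k-1] + f(x[k-1])*dt, 0.0)   # drift decays towards zero
    if jumps[k]:
        x[k] += A0                           # constant-magnitude impulse
print(x.mean())                          # close to 1/lam = 1
```

The balance is exact in theory: the impulses inject probability mass at rate νA₀, and the drift (7.315) removes it at the same average rate under the exponential law.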
7.6.2 Exponentially Distributed Excitation

In this case the impulse magnitudes are exponentially distributed, i.e.

    p_A(A) = μ exp(−μA),  A ≥ 0    (7.316)
As a result, eq. (7.310) reduces to

    f(x) = (ν/p_st(x)) ∫_{−∞}^{x} [ μ exp(−μs) ∫_{0}^{s} exp(μA) p_st(A) dA − p_st(s) ] ds    (7.317)
Example: A Gamma distributed process [64]. Let us model an impulse process with the Gamma PDF

    p_st(x) = (μ^α/Γ(α)) x^{α−1} exp(−μx)    (7.318)

Setting the parameter μ in eq. (7.316) equal to μ in eq. (7.318), one can easily see that eq. (7.317) becomes

    f(x) = −(ν/α) x    (7.319)
More examples can be found in Table D.2 in Appendix D on the web.
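The Gamma example admits a particularly clean simulation, because the linear drift (7.319) can be integrated exactly between impulses. The sketch below (illustrative parameter values) checks the stationary mean, which for a Gamma(α, μ) law is α/μ.

```python
import numpy as np

# Gamma example (7.318)-(7.319): linear drift f(x) = -nu*x/alpha with
# exponentially distributed impulse magnitudes (mean 1/mu, rate nu) gives a
# Gamma(alpha, mu) stationary law with mean alpha/mu.
rng = np.random.default_rng(5)
alpha, mu, nu = 2.0, 1.0, 1.0
dt, n = 5e-3, 400_000
decay = np.exp(-(nu/alpha)*dt)           # exact integration of the drift
jumps = rng.random(n) < nu*dt            # Poisson impulse arrivals
amps = rng.exponential(1.0/mu, n)        # exponential impulse magnitudes
x = np.empty(n); x[0] = alpha/mu
for k in range(1, n):
    x[k] = x[k-1]*decay + (amps[k] if jumps[k] else 0.0)
print(x.mean())                          # close to alpha/mu = 2
```

The same balance argument as before applies: impulses add mass at rate ν/μ, the drift removes it at rate (ν/α)E[x], so E[x] = α/μ in the stationary regime.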
7.7 SYNTHESIS OF AN SDE WITH RANDOM STRUCTURE
In this section the inverse problem is considered for an SDE with random structure. In addition to the information required for the synthesis of an ordinary SDE, such as the marginal PDF of the process and the correlation function, one has to obtain a priori information about the number of states M (number of sub-structures) and the intensities of transitions ν_kl. Knowledge of the intensities allows one to solve the Kolmogorov–Chapman equation to obtain the stationary probabilities of the states, P_l. As the next step, the joint PDF p_st^(l)(x) can be obtained using the equation

    p_st^(l)(x) = P_l p_st^l(x)    (7.320)
In the following, only a scalar case is considered. To further simplify the discussion, systems with a constant diffusion coefficient are considered: K₂^(l)(x) = K₂^(l). A stationary solution, if it exists, must satisfy the following system of equations:

    p_st^(l)(x) (∂/∂x) K₁^(l)(x) + K₁^(l)(x) (∂/∂x) p_st^(l)(x) = Q(x)    (7.321)

where

    Q(x) = (K₂^(l)/2) (∂²/∂x²) p_st^(l)(x) + V_l(x) − U_l(x)    (7.322)

The next step is to rewrite eq. (7.321) as a differential equation with respect to K₁^(l)(x), i.e. in the form

    (d/dx) K₁^(l)(x) + K₁^(l)(x) (d/dx) ln p_st^(l)(x) = Q(x)/p_st^(l)(x)    (7.323)
which can be easily solved using the method of undetermined coefficients

    K₁^(l)(x) = (K₂^(l)/2) (d/dx) ln p_st^(l)(x)
              + (1/p_st^(l)(x)) [ −ν_l ∫_{−∞}^{x} p_st^(l)(s) ds + Σ_{k=1}^{M} ν_kl ∫_{−∞}^{x} p_st^(k)(s) ds + C ]    (7.324)
The constant of integration in eq. (7.324) can be found if the additional requirement

    K₁^(l)(−∞) = 0    (7.325)

is imposed on K₁^(l)(x). In this case, the synthesis equation becomes

    a^(l)(x) = (b^(l)/2) (d/dx) ln p_st^(l)(x) − ν_l F_l(x)/p_st^(l)(x) + Σ_{k=1}^{M} ν_kl P_k F_k(x)/(P_l p_st^(l)(x))    (7.326)
where

    F_l(x) = ∫_{−∞}^{x} p_st^(l)(s) ds    (7.327)
It can be seen that the expression for the drift coefficient K₁^(l)(x) contains a term present in the synthesis equation for an ordinary SDE (the first term in eq. (7.326)), while the random structure is represented by the additional (second and third) terms in eq. (7.326). These additional terms represent the contributions of the flows of absorption and regeneration of the trajectories of the process with random structure. The diffusion coefficient K₂^(l), as in the case of a monostructure, should be chosen in accordance with the correlation interval of the random process ξ(t). Quantitatively it can be obtained only from the expression for the covariance function, which in turn is available only for the cases of high and low intensity of the switching process, and under the assumption that the correlation function is close to exponential. For the case of a low intensity of switching this approach produces quite accurate results, since the effect of switching can be neglected or accounted for by first order correction terms only. In this case the synthesis of the SDE with random structure is no different from the synthesis of an ordinary SDE.
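The structure of eq. (7.326) can be illustrated numerically. The sketch below evaluates its three terms on a grid for a hypothetical two-state system with Gaussian conditional densities; all parameter values are illustrative. For a Gaussian density the ordinary-SDE part must reduce to −(b/2)(x − m)/σ², which serves as a consistency check.

```python
import numpy as np

# Numerical illustration of the synthesis equation (7.326), M = 2 states,
# Gaussian conditional densities (illustrative parameters only).
b = 1.0                                   # diffusion coefficient b^(l)
m_l, m_k, sig = 0.0, 2.0, 1.0             # conditional means and std
nu_l, nu_kl = 0.1, 0.1                    # absorption/regeneration intensities
P_l, P_k = 0.5, 0.5                       # stationary state probabilities

x = np.linspace(-4.0, 6.0, 2001)
dx = x[1] - x[0]
p_l = np.exp(-(x - m_l)**2/(2*sig**2)) / np.sqrt(2*np.pi*sig**2)
p_k = np.exp(-(x - m_k)**2/(2*sig**2)) / np.sqrt(2*np.pi*sig**2)
F_l, F_k = np.cumsum(p_l)*dx, np.cumsum(p_k)*dx        # eq. (7.327)

drift = 0.5*b*np.gradient(np.log(p_l), dx)             # ordinary-SDE term
a_l = drift - nu_l*F_l/p_l + nu_kl*P_k*F_k/(P_l*p_l)   # eq. (7.326)
err = np.max(np.abs(drift - (-(0.5*b)*(x - m_l)/sig**2)))
print(err)                                # finite-difference error only
```

The second and third terms grow like F(x)/p(x) in the tails, which is exactly the amplification of the absorption and regeneration flows where the conditional density is small.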
REFERENCES
1. B. Liu and D. Munson, Generation of a Random Sequence Having Jointly Specified Marginal Distribution and Autocovariance, IEEE Trans. Acoustics, Speech, Signal Processing, Vol. 30, 1982, pp. 973–983.
2. V. Kontorovich and V. Lyandres, Stochastic Differential Equations: An Approach to the Generation of Continuous Non-Gaussian Processes, IEEE Trans. Signal Processing, Vol. 43, 1995, pp. 2372–2385.
3. S. Primak and V. Lyandres, On the Generation of Baseband and Narrowband Non-Gaussian Processes, IEEE Trans. Signal Processing, Vol. 46, 1998, pp. 1229–1237.
4. D. Middleton, An Introduction to Statistical Communication Theory, New York: McGraw-Hill, 1960.
5. E. Conte, M. Longo, and M. Lops, Modeling and Simulation of Non-Rayleigh Radar Clutter, IEE Proc. F, Vol. 138, 1991, pp. 121–131.
6. M. Rangaswamy, D. Weiner, and A. Ozturk, Non-Gaussian Random Vector Identification Using Spherically Invariant Random Processes, IEEE Trans. Aerosp. Electron. Syst., Vol. 29, 1993, pp. 111–124.
7. A. Fulinski, Z. Grzywna, I. Mellor, Z. Siwy, and P. N. R. Usherwood, Non-Markovian Character of Ionic Current Fluctuations in Membrane Channels, Physical Review E, Vol. 58, 1998, pp. 919–924.
8. S. Primak and J. LoVetri, Statistical Modeling of External Coupling of Electromagnetic Waves to Printed Circuit Boards Inside a Leaky Cavity, Technical Report, CRC, Ottawa, Canada.
9. J. Lee, R. Tafazolli, and B. Evans, Erlang Capacity of OC-CDMA with Imperfect Power Control, IEE Electronics Letters, Vol. 33, 1997, pp. 259–261.
10. R. J. Polge, E. M. Holliday, and B. K. Bhagavan, Generation of a Pseudorandom Set with Desired Correlation and Probability Distribution, Simulation, 1973, pp. 153–158.
11. W. J. Szajnowski, The Generation of Correlated Weibull Clutter for Signal Detection Problems, IEEE Trans. Aerosp. Electron. Syst., Vol. AES-13, 1977, pp. 536–540.
12. V. G. Gujar and R. J. Kavanagh, Generation of Random Signals with Specified Probability Density Functions and Power Density Spectra, IEEE Trans. Autom. Contr., Vol. AC-13, 1968, pp. 716–719.
13. B. Liu and D. C. Munson, Generation of a Random Sequence Having a Jointly Specified Marginal Distribution and Autocovariation, IEEE Trans. Acoust., Speech, Signal Proc., Vol. ASSP-30, 1982, pp. 973–983.
14. R. Rosenfeld, Two Decades of Statistical Language Modeling: Where Do We Go From Here?, Proc. IEEE, Vol. 88, 2000, pp. 1270–1278.
15. J. Luczka, T. Czernik, and P. Hanggi, Symmetric White Noise Can Induce Directed Current in Ratchets, Physical Review E, Vol. 56, 1997, pp. 3968–3975.
16. M. Popescu, C. Arizmendi, A. Salas-Brito, and F. Family, Disorder Induced Diffusive Transport in Ratchets, Physical Review Letters, Vol. 85, 2000, pp. 3321–3324.
17. V. Z. Lyandres and M. Shahaf, Envelope Correlation Function of Narrow-Band Non-Gaussian Process, Int. J. Nonlinear Mechanics, Vol. 30, 1995, pp. 359–369.
18. E. Wong and J. B. Thomas, On Polynomial Expansion of Second-Order Distributions, J. Soc. Industr. Math., Vol. 10, 1962, pp. 507–516.
19. T. Ozaki, A Local Linearization Approach to Non-Linear Filtering, Int. Jour. Contr., Vol. 57, 1995, pp. 75–96.
20. S. Primak, V. Lyandres, O. Kaufman, and M. Kliger, On the Generation of Correlated Time Series with a Given PDF, submitted to Signal Processing, February 1997.
21. E. Wong, The Construction of a Class of Stationary Markov Processes, Proc. Symp. Appl. Math., Vol. 16, 1964, pp. 264–275.
22. T. T. Soong, Random Differential Equations in Science and Engineering, New York: Academic Press, 1971.
23. K. Ito, Stochastic Integrals, Proc. Imp. Acad., Tokyo, Vol. 40, 1944, pp. 519–524.
24. I. Stakgold, Boundary Value Problems of Mathematical Physics, New York: Macmillan, 1967.
25. A. H. Haddad, Dynamical Representation of Markov Processes of the Separable Class, IEEE Trans. Information Theory, Vol. 16, 1970, pp. 529–534.
26. S. Primak and V. Lyandres, On the Generation of the Baseband and Narrowband Non-Gaussian Processes, IEEE Trans. Signal Processing, Vol. 46, 1998, pp. 1229–1237.
27. J. L. Doob, Stochastic Processes, New York: Wiley, 1953.
28. V. Ya. Kontorovich and V. Z. Lyandres, Fluctuation Equations Generating a One-Dimensional Markovian Process, Radio Engineering and Electronics Physics, July 1972, pp. 1093–1096 (in Russian).
29. M. G. Kendall and A. Stuart, The Advanced Theory of Statistics, London: Griffin, 1958.
30. R. Gut and V. Kontorovich, On Markov Representation of Random Processes, Voprosi Radiotechniki, Vol. 6, 1971, pp. 91–94 (in Russian).
31. R. L. Stratonovich, Topics in the Theory of Random Noise, Vol. 1, New York: Gordon and Breach, 1967.
32. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, New York: Dover Publications, 1971.
33. B. Stuck and B. Kleiner, A Statistical Analysis of Telephone Noise, Bell System Technical J., Vol. 53, 1974, pp. 1236–1320.
34. H. Risken, The Fokker–Planck Equation: Methods of Solution and Applications, Berlin: Springer, 1996, 472 p.
35. P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations, New York: Springer-Verlag, 1992.
36. V. Pugachev and I. Sinitsin, Stochastic Differential Systems: Analysis and Filtering, New York: Wiley, 1987.
37. P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations, New York: Springer-Verlag, 1992.
38. C. W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, Berlin: Springer-Verlag, 1983.
39. V. Kontorovich, V. Lyandres, and S. Primak, The Generation of Diffusion Markovian Processes with Probability Density Function Defined on Part of the Real Axis, IEEE Signal Processing Letters, Vol. 3, 1996, pp. 19–21.
40. E. Conte and M. Longo, Characterization of Radar Clutter as a Spherically Invariant Random Process, IEE Proc., Part F, Vol. 134, 1987, pp. 191–201.
41. S. Primak, V. Lyandres, and V. Kontorovich, Markov Models of Non-Gaussian Exponentially Correlated Processes and Their Applications, Physical Review E, Vol. 63, 2001, 061103.
42. M. Rangaswamy, D. Weiner, and A. Ozturk, Non-Gaussian Random Vector Identification Using Spherically Invariant Random Processes, IEEE Trans. Aerosp. Electron. Syst., Vol. AES-29, 1993, pp. 111–124.
43. M. M. Sondhi, Random Processes with Specified Spectral Density and First Order Probability Density, Bell System Technical J., Vol. 62, 1983, pp. 679–701.
44. R. J. Webster, A Random Number Generator for Ocean Noise Statistics, IEEE J. Ocean Eng., Vol. 19, 1994, pp. 134–137.
45. S. Xu, Z. Huang, and Y. Yao, An Analytically Tractable Model for Videoconferencing, IEEE Trans. Circuits and Systems for Video Technology, Vol. 10, 2000, p. 63.
46. S. Xu and Z. Huang, A Gamma Autoregressive Video Model on ATM Networks, IEEE Trans. Circuits and Systems for Video Technology, Vol. 8, 1998, pp. 138–142.
47. E. Wong, The Construction of a Class of Stationary Markov Processes, Proc. Symp. Appl. Math., Vol. 16, 1964, pp. 264–275.
48. S. Primak, Generation of Compound Non-Gaussian Random Processes with a Given Correlation Function, Physical Review E, Vol. 61, 2000, p. 100.
49. A. N. Malakhov, Cumulantnii Analiz Sluchainich Processov i Ich Preobrazovanii (Cumulant Analysis of Random Processes and Their Transformations), Moscow: Radio, 1978 (in Russian).
50. V. Kontorovich and V. Lyandres, Impulsive Noise: A Nontraditional Approach, Signal Processing, Vol. 51, 1996, pp. 121–132.
51. G. Kotler, V. Kontorovich, V. Lyandres, and S. Primak, On the Markov Model of Shot Noise, Signal Processing, 1999, pp. 79–88.
52. J. Nakayama, Generation of Stationary Random Signals with Arbitrary Probability Distribution and Exponential Correlation, IEICE Trans. Fundamentals, Vol. E77-A, 1994, pp. 917–922.
53. S. Karlin, A First Course in Stochastic Processes, New York: Academic Press, 1968.
54. N. S. Jayant and P. Noll, Digital Coding of Waveforms, New Jersey: Prentice-Hall, 1984.
55. C. W. Helstrom, Markov Processes and Application in Communication Theory, in Communication Theory, New York: McGraw-Hill, 1968, pp. 26–86.
56. T. Rappaport, Wireless Communications: Principles and Practice, Upper Saddle River, NJ: Prentice Hall PTR, 2002.
57. G. Stüber, Principles of Mobile Communication, Boston: Kluwer Academic Publishers, 2001.
58. S. Primak, V. Lyandres, O. Kaufman, and M. Kliger, On the Generation of Correlated Time Series with a Given Probability Density Function, Signal Processing, Vol. 72, 1999, pp. 61–68.
59. A. Blanc-Lapierre, M. Savelli, and A. Tortrat, Etude de Modeles Statistiques Suggeres par la Consideration des Effets des Atmospheriques sur les Amplificateurs, Ann. Telecomm., Vol. 9, 1954, pp. 237–248.
60. T. Felbinger, S. Schiller, and J. Mlynek, Oscillation and Generation of Nonclassical States in Three-Photon Down-Conversion, Physical Review Letters, Vol. 80, 1998, pp. 492–495.
61. A. C. Marley, M. J. Higgins, and S. Bhattacharya, Flux Flow Noise and Dynamical Transitions in a Flux Line Lattice, Physical Review Letters, Vol. 74, 1995, pp. 3029–3032.
62. P. Rao, D. Johnson, and D. Becker, Generation and Analysis of Non-Gaussian Markov Time Series, IEEE Trans. Signal Processing, Vol. 40, 1992, pp. 845–855.
63. S. M. Rytov, Yu. A. Kravtsov, and V. I. Tatarskii, Principles of Statistical Radiophysics, Vol. 2: Correlation Theory of Random Processes, New York: Springer-Verlag, 1988.
64. D. Klovsky, V. Kontorovich, and S. Shirokov, Models of Continuous Communications Channels Based on Stochastic Differential Equations, Moscow: Radio i Sviaz, 1984 (in Russian).
65. I. Kazakov and V. Artem'ev, Optimization of Dynamic Systems with Random Structure, Moscow: Nauka, 1980 (in Russian).
66. A. Berdnikov, A. Brusentzov, V. Kontorovich, and V. Lyandres, Problems in the Theory and Design of Instruments for Analog Simulation of Multipath Channels, Telecomm. Radio Eng., Vol. 33–34, 1978, pp. 60–64.
67. V. Tikhonov and M. Mironov, Markov Processes, Moscow: Sov. Radio, 1977 (in Russian).
68. I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, Academic Press, 1980.
8 Applications

8.1 CONTINUOUS COMMUNICATION CHANNELS

8.1.1 A Mathematical Model of a Mobile Satellite Communication Channel
For some years many different organizations all over the world have been involved in land mobile satellite communication services [1]. The quality of this kind of communication suffers from strong variations of the received signal power due to satellite signal shadowing and multipath fading. The shadowing of the satellite signal by obstacles in the propagation path (buildings, bridges, trees, etc.) results in attenuation over the entire signal bandwidth. This attenuation increases with carrier frequency, i.e. it is more pronounced in the L-band than in the UHF band. The shadowed areas are larger for low satellite elevations than for high elevations. Multipath fading occurs because the satellite signal is received not only via the direct path but also after having been reflected from objects in the surroundings. Due to their different propagation distances, multipath signals can add destructively, resulting in a deep fade. In analysing the scattered field characteristics the following assumptions are usually considered as valid: a large number of partial waves; identical partial wave amplitudes; no correlation between different partial waves; no correlation between the phase and amplitude of one partial wave; phase distribution uniform on [0, 2π]. Physically, these assumptions are equivalent to a homogeneous diffuse scattered field resulting from randomly distributed point scatterers. With these assumptions, the central limit theorem leads to a complex stationary Gaussian process z = x + jy with zero mean, equal variance and uncorrelated components. The amplitude distribution thus obeys the Rayleigh PDF [2,3]

    p_A(A) = (A/σ²) exp(−A²/(2σ²))    (8.1)

Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres. © 2004 John Wiley & Sons, Ltd. ISBN: 0-470-84741-7
which depends on a single parameter σ², equal to the variance of the in-phase and quadrature components. In line of sight conditions, i.e. when a direct component is superimposed, p_A(A) is well described by the Rice distribution [2,3]

    p_A(A) = (A/σ²) exp(−(A² + a²)/(2σ²)) I₀(aA/σ²)    (8.2)

where I₀(·) is the modified Bessel function of zero order, and a is a parameter describing the strength of the Line-of-Sight (LoS) component (see Section 2.6). The Rice factor K, defined as

    K = a²/(2σ²)    (8.3)
not only defines the ratio of the power of the LoS component to the power of the diffuse component, but also can be used to compare different models, as shown in Table 8.1. In general, it is possible that none of the aforementioned assumptions is valid, due to correlation between a small number of partial waves. Numerous approximations have been suggested to describe the measurement results [4–17]. It was also found [3] that one of the most convenient and closely fitting distributions of the envelope is the Nakagami PDF

    p_A(A) = (2/Γ(m)) (m/Ω)^m A^{2m−1} exp(−mA²/Ω)    (8.4)
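A convenient property for simulation is that a Nakagami-m envelope can be sampled as the square root of a Gamma variate with shape m and scale Ω/m; for m = 1 the PDF (8.4) coincides with the Rayleigh PDF (8.1) with Ω = 2σ². The sketch below (illustrative parameters) checks this against the closed-form mean E[A] = (Γ(m + 1/2)/Γ(m)) √(Ω/m).

```python
import numpy as np
from math import gamma, sqrt, pi

# Sampling a Nakagami-m envelope via a Gamma variate and checking its
# moments; with m = 1, Omega = 2 this is Rayleigh with sigma = 1.
rng = np.random.default_rng(11)
m, omega, n = 1.0, 2.0, 200_000
A = np.sqrt(rng.gamma(m, omega/m, n))
mean_th = gamma(m + 0.5)/gamma(m) * sqrt(omega/m)   # = sqrt(pi/2) here
print(A.mean(), mean_th)
```

The second moment E[A²] = Ω holds for any m, which is why Ω is called the spread parameter.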
In scattering processes generating merely diffuse wave fields, m = 1 and the Nakagami distribution is identical to the Rayleigh one (8.1). In the case of a line of sight, the distribution (8.4) with m > 1 approximates the Rice distribution. Another important characteristic of the communication channels is the power spectrum density of the fading signal [2]. Consider an unmodulated carrier being detected by a mobile receiver. Each component arrives at the receiver via a different angle to the receiver velocity

Table 8.1 Different models of the fading envelope and their comparison

    Model      PDF p_A(A)                                                                          Rice factor K            Refs
    Nakagami   (2/Γ(m)) (m/Ω)^m A^{2m−1} exp(−mA²/Ω)                                               √(m² − m)/(m − √(m² − m))  [6]
    Suzuki     ∫₀^∞ (A/s²) exp(−A²/(2s²)) (1/(√(2π) σ s)) exp(−(ln s − μ)²/(2σ²)) ds               —                        [5]
    Loo        (A/σ₀²) ∫₀^∞ exp(−(A² + s²)/(2σ₀²)) I₀(As/σ₀²) (1/(√(2π) d s)) exp(−(ln s − μ)²/(2d²)) ds   a²/(2σ²)         [7]
vector, and thus each signal component acquires a certain Doppler shift that is related to the receiver speed v and the arrival angle ψ_i of the i-th ray according to [2]

    f_i = (v/λ) cos ψ_i = f_D cos ψ_i    (8.5)

Due to the Doppler effect, a single frequency carrier results in multiple signals at the receiver having comparable amplitudes, random phases and a relative frequency shift confined to the Doppler spread about the carrier frequency. The theoretically obtained spectrum S(f) (low pass equivalent) is given by [2]

    S(f) = 1/(π √(f_D² − f²)),  |f| < f_D    (8.6)

and the covariance function by

    C(τ) = J₀(2π f_D |τ|)    (8.7)
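The pair (8.6)–(8.7) can be verified numerically: substituting f = f_D sin θ into C(τ) = ∫ S(f) e^{j2πfτ} df removes the edge singularity and yields (1/π) ∫ cos(2π f_D τ sin θ) dθ, the classical integral representation of J₀. The check below evaluates J₀ by its power series so that it is self-contained.

```python
import numpy as np

# Numerical check that the Jakes spectrum (8.6) and the covariance (8.7)
# form a Fourier pair.
def J0(z, terms=40):
    # Truncated power series of the Bessel function J0 (fine for |z| <~ 10).
    z = np.asarray(z, dtype=float)
    out = np.ones_like(z)
    term = np.ones_like(z)
    for k in range(1, terms):
        term = term * (-(z/2)**2) / k**2
        out = out + term
    return out

fD = 1.0
tau = np.linspace(0.0, 1.0, 11)
N = 20000
theta = (np.arange(N) + 0.5)/N * np.pi - np.pi/2     # midpoint rule grid
C = np.array([np.mean(np.cos(2*np.pi*fD*t*np.sin(theta))) for t in tau])
err = np.max(np.abs(C - J0(2*np.pi*fD*tau)))
print(err)
```

The substitution trick is also useful in simulation, where the square-root singularity of (8.6) at f = ±f_D otherwise requires care.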
A number of alternative models have also been suggested, especially in the COST 207 report [4], which includes a combination of two Gaussian PDFs (see Table 3.2 in Chapter 3) and a combination of the Jakes PSD (8.6) and a delta function. These possibilities are summarized in Table 8.2. Finally, taking into account the multipath nature of the wave propagation, one can represent the low pass equivalent impulse response of the channel as a sum of N rays embedded in white Gaussian noise

    r(t) = Σ_{i=1}^{N} A_i(t) s(t − τ_i) + ξ(t)    (8.8)
Table 8.2 Typical PSD of a fading signal [4]

    Model     PSD S(f)                                              Normalized covariance function ρ(τ)
    Jakes     1/(π√(f_D² − f²))                                     J₀(2π f_D τ)
    Gauss I   0.909 exp(−(f + 0.8 f_D)²/(0.005 f_D²))               0.909 exp(−(π f_D τ)²/400) exp(−j 1.6π f_D τ)
              + 0.0909 exp(−(f − 0.4 f_D)²/(0.02 f_D²))             + 0.0909 exp(−(π f_D τ)²/20) exp(j 0.8π f_D τ)
    Gauss II  0.968 exp(−(f − 0.7 f_D)²/(0.02 f_D²))                0.968 exp(−(π f_D τ)²/200) exp(j 1.4π f_D τ)
              + 0.0316 exp(−(f + 0.4 f_D)²/(0.045 f_D²))            + 0.0316 exp(−(π f_D τ)²/40) exp(−j 0.8π f_D τ)
    Rice      0.1681/(π√(f_D² − f²)) + 0.8319 δ(f − 0.7 f_D)        0.1681 J₀(2π f_D τ) + 0.8319 exp(j 1.4π f_D τ)
where N is the number of rays considered, A_i(t) is the fading amplitude of the i-th ray, τ_i is the delay of the i-th ray, and ξ(t) is the additive WGN. The number of rays, the delay of each ray and their parameters vary greatly for different environments. COST 207 provides tables of corresponding parameters for 4, 6 and 12 path models in rural, bad urban, typical urban and hilly terrain environments. However, regardless of the particular environment, the general structure of the simulator of a mobile communication channel remains the same and is shown in Fig. 8.1.
Figure 8.1 General scheme of a mobile communication channel simulator.
8.1.2 Modeling of a Single-Path Propagation

From the diagram shown in Fig. 8.1 one can conclude that the main task in the simulation of the multipath channel is to model a single ray propagation with given characteristics. Two main schemes which allow one to maintain both the PDF of the envelope and the correlation function of the signal are suggested below. Numerous alternative methods can be found elsewhere [18–22].
8.1.2.1 A Process with a Given PDF of the Envelope and Given Correlation Interval

As a first approximation of the power spectrum (8.6) one can use a rational approximation with only one pole. Although this approximation is somewhat poor, it is considered here because this scheme facilitates modeling a non-Gaussian random process with any PDF of the envelope. Let the PDF p_A(A) of the envelope of a narrow band (with bandwidth Δf_i) signal be given. Then the bandwidth of the narrow band envelope is approximately 2Δf_i and, consequently, the correlation interval is

    τ_corr,i = 1/(2Δf_i)    (8.9)
If the phase φ(t) is uniformly distributed on the interval [−π, π] and does not depend on A(t), then

    x(t) = A(t) cos φ(t) cos 2πf₀t − A(t) sin φ(t) sin 2πf₀t    (8.10)

is a stationary narrow band random process with the envelope A(t) and power spectrum concentrated around the frequency f₀ (see Section 4.7). If A(t) is a process with the required PDF and correlation interval τ_corr,i, the problem is solved. The desired generating SDE can be obtained by applying the results of Chapter 7. The block diagram of the corresponding simulator is shown in Fig. 8.2.
Figure 8.2 Simulator of a narrow band process with given PDF of the envelope and bandwidth.
Example: Nakagami fading. In this case the distribution of the envelope is defined by eq. (8.4) and the corresponding SDE for the envelope is (see Section 7.2.2):

    Ȧ = (K²/2) [ (2m − 1)/A − (2m/Ω) A ] + K ξ(t)    (8.11)

The parameter K can be found from the correlation interval to be

    K = √(Ω/(m τ_corr)) = √(Ω Δf_i/m),    Δf_i = f₀ (v/c) cos ψ_i    (8.12)

where c is the speed of light. Finally, the generating SDE of the envelope has the form

    dA/dt = f₀ (v/c) cos ψ_i [ ((2m − 1)/(2m)) (Ω/A) − A ] + √( (Ω/(mc)) f₀ v cos ψ_i ) ξ(t)    (8.13)

The results of the numerical simulation are shown in Figs. 8.3 and 8.4.
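A simulation of the envelope SDE can be sketched as below with the Euler–Maruyama scheme. The parameter values are illustrative; for m = 1 the stationary law of (8.11) is Rayleigh, so the check is E[A²] = Ω. The drift is singular at A = 0, so the trajectory is reflected away from the origin (a standard numerical safeguard, not part of the model).

```python
import numpy as np

# Euler-Maruyama integration of the envelope SDE (8.11); with m = 1 the
# stationary distribution is Rayleigh and E[A^2] = Omega.
rng = np.random.default_rng(2)
m, omega, K = 1.0, 2.0, 1.0
dt, n = 1e-3, 400_000
dW = rng.normal(0.0, np.sqrt(dt), n)
A = np.empty(n); A[0] = np.sqrt(omega)
for k in range(1, n):
    drift = 0.5*K*K*((2*m - 1)/A[k-1] - 2*m*A[k-1]/omega)
    A[k] = max(A[k-1] + drift*dt + K*dW[k], 0.01)   # reflect near zero
print((A*A).mean())                                 # close to Omega
```

The same loop, with m < 1 or m > 1, reproduces sub-Rayleigh and Rice-like envelopes respectively, while the noise intensity K sets the correlation interval as in (8.12).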
Figure 8.3 Estimated (solid line) and theoretical (dashed line) PDF of the process generated by SDE (8.13); m = 1/2, Ω = 1.

Figure 8.4 Estimated (solid line) and theoretical (dashed line) correlation functions of the envelope generated by SDE (8.13); τ_corr = 0.1 s.
8.1.2.2 A Process with a Given Spectrum and Sub-Rayleigh PDF
A regular procedure to obtain a desired correlation function (power spectrum) is known only for the Gaussian random process. This synthesis procedure was considered earlier in Chapter 7. In the same chapter we showed that the compound representation of the random process can be used to obtain processes with a symmetrical PDF and any desired rational spectrum. In this case the complex (two-dimensional) Gaussian process x_n(t) is modulated by some positive process s(t)

    x(t) = x_n(t) s(t)    (8.14)

Here the Gaussian process x_n(t) defines the spectral (correlation) properties of the process x(t), and the modulating signal s(t) defines the distribution of x(t). Let p_Ax(A_x) be the distribution of the envelope of the process x(t). Then the distribution p_s(s) of the modulation is related to p_Ax(A_x) as

    p_Ax(A_x) = A_x ∫₀^∞ (1/(σ²s²)) exp(−A_x²/(2σ²s²)) p_s(s) ds    (8.15)

where σ² is the parameter of the Rayleigh envelope of the process x_n(t). If the modulation s(t) is Nakagami distributed, then the resulting process has a PDF of the envelope of the form

    p_Ax(A_x) = (4/Γ(m)) (m/(2σ²Ω))^{(m+1)/2} A_x^m K_{m−1}( √(2m/(σ²Ω)) A_x )    (8.16)
where 2 is the parameter of the Rayleigh envelope of the process xn ðtÞ. If the modulation sðtÞ is Nakagami distributed, then the resulting process has PDF of the envelope in the form rffiffiffiffiffiffiffiffiffi ! m mþ1 2m 2 m Ax Ax Km1 2 2 2 ð8:16Þ pAx ðAx Þ ¼ ðmÞ where Kv ð Þ is the modified Bessel function [23]. It is worth mentioning that the mean value of the modulation sðtÞ should be chosen to be 1, in order to keep the mean energy of the process xðtÞ the same as the mean energy of the Gaussian process xn ðtÞ. This leads to the following condition ð1 s ps ðsÞd s ¼ 1 ð8:17Þ 0
Which is satisfied if ¼m
2 ðmÞ 2 ðm þ 0:5Þ
ð8:18Þ
Finally, the PDF of the envelope is

    p_Ax(A_x) = (4/Γ(m)) ( Γ²(m + 0.5)/(2σ²Γ²(m)) )^{(m+1)/2} A_x^m K_{m−1}( (Γ(m + 0.5)/Γ(m)) (√2/σ) A_x )    (8.19)
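The construction (8.14)–(8.19) can be checked by sampling: a Rayleigh envelope is multiplied by an independent Nakagami modulation whose mean is forced to one through eq. (8.18), and the resulting product has the K-distributed envelope. The sketch below (illustrative parameters, sub-Rayleigh case m < 1) verifies E[s] = 1 and the energy condition E[A²] = 2σ²Ω.

```python
import numpy as np
from math import gamma

# Sub-Rayleigh envelope of eqs. (8.14)-(8.19): Rayleigh times a unit-mean
# Nakagami modulation gives a K-distributed envelope.
rng = np.random.default_rng(4)
sigma, m, n = 1.0, 0.7, 300_000
omega = m * gamma(m)**2 / gamma(m + 0.5)**2    # eq. (8.18)
s = np.sqrt(rng.gamma(m, omega/m, n))          # Nakagami modulation, E[s] = 1
An = rng.rayleigh(sigma, n)                    # envelope of the Gaussian part
A = An * s                                     # envelope of x = xn * s
print(s.mean(), (A*A).mean(), 2*sigma**2*omega)
```

The choice (8.18) is exactly what makes E[s] = Γ(m + 0.5)/Γ(m) · √(Ω/m) equal to one, so the modulation redistributes the envelope without changing the mean power.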
Figure 8.5 Examples of the K-PDF p_x(x) = b^{ν+1} x^{ν} K_{ν−1}(bx)/(2^{ν−1} Γ(ν)). (1) ν = 1, b = √2—solid line; (2) ν = 2, b = 2√2—dashed line; (3) ν = 3, b = 3√2—dashed-dotted line; (4) Rayleigh distribution with σ² = 1.
The PDF (8.19) defines the K-distributed envelope, which is a suitable model for subRayleigh fading. When m ¼ 1, PDF (8.19) becomes a Rayleigh one.1 Some plots of the K distribution are given in Fig. 8.5. The desired power density spectrum (8.6) can be approximated by the third order FIR filter response yðnÞ ¼ b0 xðnÞ þ b1 xðn 1Þ þ b2 xðn 2Þ þ b3 xðn 3Þ
ð8:20Þ
where the coefficients fbi g can be calculated as 2 Fd Fs b0 ¼ 1 ¼ 0:43
!p ¼
ð8:22Þ ð8:23Þ
b1 ¼
ð!p þ 2Þð2 !2p 8Þ þ ð!p 2Þð!2p þ 4 !p þ 4Þ ð!p þ 2Þð!2p þ 4 !p þ 4Þ
ð8:24Þ
b2 ¼
ð!p 2Þð2 !2p 8Þ þ ð!p þ 2Þð!2p 4 !p þ 4Þ ð!p þ 2Þð!2p þ 4 !p þ 4Þ
ð8:25Þ
b3 ¼
ð!p 2Þð!2p 4 !p þ 4Þ ð!p þ 2Þð!2p þ 4 !p þ 4Þ
ð8:26Þ
In this case the modulation sðtÞ becomes the deterministic function sðtÞ ¼ 1; ps ðsÞ ¼ ðs 1Þ.
1
ð8:21Þ
CONTINUOUS COMMUNICATION CHANNELS
385
Let z1 , z2 , z3 be the roots of the polynomial bðzÞ ¼ b3 z3 þ b2 z2 þ b1 z þ 1
ð8:27Þ
si ¼ i þ j !i ¼ Fs ln zi ; i ¼ 1; 2; 3
ð8:28Þ
and si and be defined as
¼
maxfi g 2
ð8:29Þ
This allows one to define a new form filter with poles sni ¼ si þ ¼ i þ þ j !i zni ¼ zi exp Fs
ð8:30Þ ð8:31Þ
which is again a stable filter. Then, if process xn ðtÞ has the spectrum defined by the form filter with poles given by eq. (8.31), and sðtÞ is a process with the decay of the correlation function defined by eq. (8.29), the process xðtÞ ¼ xn ðtÞsðtÞ has the spectrum defined by eqs. (8.20)–(8.26), i.e. it has the desired spectral properties. Results of the simulation are shown in Figs. 8.6–8.8. Example: Nakagami sub-Rayleigh fading.2 A similar technique can be used for Nakagami sub-Rayleigh fading, i.e. for the case m < 1. In this case the modulation process
Figure 8.6
2
Estimated (solid line) and theoretical PDF, corresponding to eq. (8.14).
Material for these examples was prepared in collaboration with Ms Vanja Subotic.
386
APPLICATIONS
Figure 8.7 Correlation function corresponding to spectrum (8.6) (solid line), and those obtained according to eq. (8.31) (dashed line).
is a square root of a beta distributed random process [24–26] ps2 ðyÞ ¼
ym1 ð1 yÞm ; Bðm; 1 mÞ
0m1
ð8:32Þ
Results of the simulation are shown in Figs. 8.6–8.8. Samples of fading have been used to produce an error sequence (see section 8.2) and estimate the so-called Pðn; mÞ characteristic, i.e. the probability that there are exactly n errors in a block of m symbols (Figs. 8.9 and 8.10).
Figure 8.8 Estimated (solid line) and desired (dashed line) normalized correlation function, defined by eq. (8.14).
CONTINUOUS COMMUNICATION CHANNELS
387
Figure 8.9 Nakagami cumulative distribution (CDF) and correlation functions for m ¼ 0:7, ¼ 1; fD ¼ 10 Hz and ¼ 1 103 s. Solid line—theory, dotted line—simulation.
Figure 8.10 Nakagami Pðm; nÞ characteristic for m ¼ 0:5 and m ¼ 0:7; SNR ¼ 5 dB; fD ¼ 10 Hz and ¼ 1 103 .
388
8.2
APPLICATIONS
AN ERROR FLOW SIMULATOR FOR DIGITAL COMMUNICATION CHANNELS
Most modern communications systems use discrete information sources and, consequently, discrete methods of modulation. A general scheme of a communication system of this type is given in Fig. 8.11 [3].
Figure 8.11
Digital communication system.
A large number of encoders–decoders (codecs) have been developed but their practical implementation sometimes leads to the results being far from the theoretical ones. The main reason for this failure is the fact that most scientific work in coding theory was applied to a binary symmetrical channel with independent errors—a model which cannot be considered as adequate for real communication conditions [27–29]. In recent years a number of works were devoted to the experimental investigation of real communication system behaviour and its description by different mathematical models [30– 97]. These models allow one to obtain statistical characteristics of the error flow, and then to calculate the efficiency of different correcting codes. It was found that simple mathematical models merely describe channel properties. The complexity of more accurate models arises from the fact that the discrete channel involves the transformation of signals from discrete to continuous, and back again. It can be seen from Fig. 8.11 that the discrete channel includes a continuous channel,3 and that the properties of the latter affect the properties of the discrete channel under consideration. The transition from continuous to discrete signals involves a modulator and demodulator and, consequently, a discrete communication channel should be defined by both blocks. Analytical investigation of such models is complicated, but the numerical simulation produces important information for channel designers.
3
This assertion allows the use of all models considered in the previous section as part of the discrete channel.
AN ERROR FLOW SIMULATOR FOR DIGITAL COMMUNICATION CHANNELS
8.2.1
389
Error Flow in Digital Communication Systems
In the performance analysis of the modulator–demodulator (modem) where the average error rate is considered as the key parameter, the physical sources of errors are mathematically reasonably tractable and can be found from a mathematical model of the continuous part of the channel. For the coding of a discrete channel involving a modem and the propagation medium, any meaningful analytical description of channel memory in terms of the characteristics of individual physical causes of errors is difficult. It is more convenient to work with sample error sequences obtained from test runs of data transmitted on a given coding channel to obtain statistical descriptions of the error structure, and to use them in the design of error control systems. The required statistical parameters may be obtained from error sequences either by direct processing of the sequences or by forming a parameterized mathematical model that would ‘‘generate’’ similar sequences. The latter involves the following [30]: visualizing a model representative of real channel behaviour; developing methods to parameterize the model using error sequences; deriving the statistics required for code evaluation from the model. The error process on a digital communication link can be viewed as a binary discrete-time stochastic process, i.e. a family of binary random variables fXt ; t 2 Ig where I is the denumerable set of integers and t denotes time. The sequence x ¼ . . . ; x1 ; x0 ; x1 ; . . .
ð8:33Þ
representing a realization of the error process can be related to the channel input and output as follows. Let fan g be the digital sequence, and fbn g be the corresponding output sequence. The perturbation due to the channel between the input and output can be considered as an introduction of a noise sequence n originated from a hypothetical random noise source. This may be represented mathematically as the addition of a noise digit to an input digit, i.e. bn ¼ an þ x n
ð8:34Þ
where fan g, fbn g and fxn g are elements of the Galois field GFðqÞ and the addition is over the same field. For binary communications, q ¼ 2, and the addition is modulo 2. An error is said to occur whenever xn is different from zero. Accordingly, an error sequence fxn g is a stochastic mapping of the noise sequence fn g onto f0; 1g. 8.2.2
A Model of Error Flow in a Digital Channel with Fading
For the time being a communication channel without fading is considered. In this case the received signal ^sðtÞ is given by [3] ^sðtÞ ¼ sk ðt Þ þ Kch ðtÞ
ð8:35Þ
390
APPLICATIONS
where sk ðtÞ is the transmitted signal (logical 0 or 1 in the case of a binary channel), is a constant, describing the general attenuation of the signal in the medium of propagation, ðtÞ 2 is the intensity of WGN in the channel. After demodulation is WGN of unit intensity and Kch and decoding, the received signal ^sðtÞ is transformed to s^k ðtÞ. Error occurs if s^k ðtÞ does not coincide with sk ðtÞ. The error probability perr depends on the Signal-to-Noise Ratio (SNR), defined as SNR ¼
2 Eðsk Þ ¼ h2 ¼ 2 Kch
ð8:36Þ
and on the demodulation scheme. A great number of modulations is treated in [3]. The error probability perr does not depend on time and is equal to the average number of errors per bit transmitted. The error flow in this case can be modeled simply as the Poisson flow with intensity ¼ perr. If fading is present in the channel, the received signal should be written as ^sðtÞ ¼ ðtÞsk ðt Þ þ KðtÞ
ð8:37Þ
Now ðtÞ is a random process and, consequently, SNR and the error probability should be changed in time (i.e. they both are random processes). In the case of slow flat fading, it can be assumed that SNR and perr are slow functions of time, depending only on ðtÞ, defined as hðtÞ ¼
2 ðtÞEðsk Þ 2 Kch
perr ðtÞ ¼ FðhðtÞÞ
ð8:38Þ ð8:39Þ
where the structure of the function Fð Þ is defined by demodulation methods. Corresponding error flow can now be modeled as a Poison flow with slow varying intensity ðtÞ, equal to the instant error probability ðtÞ ¼ perr ðtÞ
ð8:40Þ
A corresponding scheme of the error flow simulator is shown in Fig. 8.12.
Figure 8.12
Error flow simulator.
It is worth noting that in many cases it is possible to calculate probability of error taking fading into account [3]. In this case the average probability of error would define a constant error rate over the duration of simulation. The model suggested here, while on a large scale leading to the same results, represents a finer structure of error flow since it follows local variations in the signal level. Consequently, it could be more appropriate for simulation of performance of relatively short codes or codes with memory.
AN ERROR FLOW SIMULATOR FOR DIGITAL COMMUNICATION CHANNELS
Example: Error flow in a channel with Nakagami fading. channel experience Nakagami fading, described by PDF
391
Let a communication
2 mm 2 m1 m A2 pA ðAÞ ¼ A exp ðmÞ
ð8:41Þ
Thus, one can consider ðtÞ in eq. (8.37) as a random process generated by the following SDE sffiffiffiffiffiffiffiffiffiffiffiffi 2m 1 2m
_ ¼ þ ðtÞ 2 m corr
m corr
ð8:42Þ
Being proportional to the quantity zðtÞ ¼ 2 ðtÞ, the SNR ðtÞ can be thus generated by SDE 2m _ ¼ 2m 1 þ2 m corr
sffiffiffiffiffiffiffiffiffiffiffiffi ðtÞ m corr
ð8:43Þ
where =m is the instantaneous SNR. The solution of this SDE ðtÞ can be used to drive the intensity of a Poisson flow of errors.
8.2.3
SDE Model of a Buoyant Antenna–Satellite Link
Small ships often employ a buoyant antenna array that extends into a ship’s wake to support satellite communications at sea [98]. Due to the changing surface environment, the instantaneous gain and phase of each array element is unknown because of multiple paths, changing interelement geometry and wash-over effects. The complex nature of communication systems based on this type of antenna system requires efficient numerical simulation to realistically assess performance during the communication system design process. The aim of this section is to present a mathematical model of the fading in the buoyant antenna array–satellite link which is useful both for numerical simulation and analytical calculations.4
8.2.3.1
Physical Model
A buoyant antenna array is often used for ship–satellite communications, especially by small boats and submerged submarines. Under the ideal conditions, such an array is represented by N equally spaced antenna elements, all transmitting at full power. In practice, some fraction of the elements are washed over by the sea due to wave motion on the sea surface. This results in random fluctuations in the antenna pattern and the signal received at the satellite. For a relatively slow bit rate it is possible to adjust directivity of the antenna in such a way that there is always a direct line-of-sight (LoS) link; however, the energy level of this link varies. Additionally, there are also some diffusion components present due to reflection from the sea surface and random scattering in the atmosphere. Under such conditions it is possible 4
This section is written in collaboration with Mr J. Weaver.
392
APPLICATIONS
to assume that for a short period of time (when there is no change in the number and position of the active elements), the fading is described either by the Rice law [3] A A 2 þ a2 aA pA ðaÞ ¼ 2 exp I0 ; 2 2 2
A0
ð8:44Þ
or by Nakagami fading [3] pA ðaÞ ¼
2 mm 2 m1 m A2 A exp ; ðmÞ
m > 1;
A0
ð8:45Þ
with parameters a, , and m chosen to properly represent energy levels of the signal and noise. If only short messages are communicated, such as distress signals or very slow rate sensory information, a static model with constant parameters performs satisfactorily. If longterm communication is considered, these parameters cannot be treated as constants due to fluctuations of the geometry of the array which are themselves random processes. It is important to properly represent the effect of such fluctuations on the resulting quality of the communication channel. The following physical model of the channel is accepted here. The fluctuation of the signal level is described as a continuous random process with its parameters randomly changing over a finite number of possibilities or states. The latter is justified since there is a finite number (however large) of possible combinations of active and passive array elements [98]. Transitions from state to state are random and described by an underlying Markov chain as described in Chapter 6. At this stage it is assumed that the parameters of fading in each state are known, either from measurements or from detailed simulation, and the main engineering focus is the effect of state changes on the overall performance of communication systems. One of the most interesting questions that needs to be addressed is the effect of the intensity of state transitions. These transitions are defined by the roughness of the sea surface and the speed of the craft. While there is no exact formula which relates these parameters to the transitional probabilities of a Markov chain, it is reasonable to assume that faster transitions correspond to a rougher sea and a larger velocity. This chapter provides an analytical expression for the resulting PDF of the fading as a function of transitional intensity in the Markov chain. 
When experimental data becomes available it will be possible to trace a direct influence of the velocity and sea roughness on the bit error rate of the link. 8.2.3.2
Phenomenological Model
The mathematical model of the system is represented as M stochastic differential equations (SDEs), each representing a state of the antenna array x_ ¼
ðnÞ
K2 d ðnÞ ln pn ðxÞ þ K2 ðtÞ; 2 dx
n ¼ 1; . . . ; M
ð8:46Þ
These sequences are randomly switched, maintaining a single point continuity between adjacent sequences, according to the state of the underlying Markov chain. Here ðtÞ is the
AN ERROR FLOW SIMULATOR FOR DIGITAL COMMUNICATION CHANNELS
393
white Gaussian noise with unit spectral density, pn ðxÞ is the desired marginal PDF of the ðnÞ solution and K2 is a constant defining the correlation interval of the resulting process. The probability density function of the solution can be described by a system of generalized Fokker–Planck equations of the following form (see Section 6.2.3) @ ðlÞ @ ðlÞ p ðx; tÞ ¼ ½K1 p ðlÞ ðx; tÞ @t @x M X 1 @ 2 ðlÞ ðlÞ ðlÞ ½K p ðx; tÞ p ðx; tÞ þ kl p ðkÞ ðx; tÞ þ l 2 2 @x2 k6¼l
ð8:47Þ
Here kl is the intensity of transition from the k-th state of the array to the l-th state of the array, and, as the total switching intensity from the l-th state l ¼
M X
kl
ð8:48Þ
k 6¼ l
in the steady state, the probability densities do not depend on time and so eq. (8.47) is further simplified to become an ordinary linear differential equation of the second order. Transitional intensities kl define the probability of each state of the antenna array. In particular, the vector of such probabilities P ¼ ½P1 ; . . . ; PM T is the eigenvector of the matrix of transitional intensity T ¼ fkl g TP ¼ P
ð8:49Þ
In general, the solution to eq. (8.47) can only be found numerically. However, for the two limiting cases, it is possible to obtain accurate and simple analytical approximations. Let s n be a characteristic (correlation) time of the system in the n-th state. Then the switching is considered to be slow if maxfkl sk g 1 and fast if maxfkl sk g 1. In the case of slow fading, the PDF of the process is just a weighted sum of the PDF in isolated states (see Section 6.3.2) pðxÞ ¼
M X
Pn pn ðxÞ
ð8:50Þ
n¼1
such that the switching effects last approximately sn seconds and are subsequently dominated by internal dynamics of the system in the same fashion as in an isolated state. On the contrary, if fast switching is predominant, then all states in the distribution approach a single distribution given by the following expression (see Section 6.3.3) ðx C K 1 ðsÞ exp 2 ds ð8:51Þ pðxÞ ¼ K 2 ðxÞ x0 K 2 ðsÞ where K 1 ðsÞ ¼
M X n¼1
ðnÞ
Pn
K2 d ½ln pn ðxÞ 2 dx
ð8:52Þ
394
APPLICATIONS
and K 1 ðsÞ ¼
M X n¼1
ðnÞ
ð8:53Þ
Pn K2
To simplify the complexity of the model, a two-state Markov model is used to represent the possible states of the channel. The ‘‘Good’’ state represents the situation when almost all the elements of the array are active, and ‘‘Bad’’ represents situations with very few active arrays. In the former case, the fading could be described with the Nakagami law with a large value of the parameter mG > 1 and the average value of the signal level with respect to noise G . The latter case is described by a relatively small mB 1:5 and the average level of signal B < G . The probability of the Bad state is PB and the probability of the Good state is PG ¼ 1 PB . The transitional matrix T of the Markov chain is then defined as
ð1 ÞPG þ T¼ ð1 ÞPG
ð1 ÞPB ð1 ÞPB þ
ð8:54Þ
where the parameter 0 1 reflects correlation properties of the chain ( ¼ 0 corresponds to uncorrelated transitions). It is assumed that BPSK modulation [3] is used. In an isolated state (i.e. when no switching is present), it is possible to show that fading can be described by the following stochastic differential equation [29] K2G 2 mG 1 mG x þ K2G ðtÞ x_ ¼ x G 2 K B 2 mB 1 mB x x_ ¼ 2 þ K2B ðtÞ 2 x B
ð8:55Þ
For a small intensity of fading, the distribution of the magnitude is therefore mG mB 2 mG mG A 2 2 mB m B A2 2 mG1 2 mB1 A exp A exp pðAÞ ¼ PG þ PB G B ðmG Þ G ðmB Þ B ð8:56Þ For the fast switching it follows from eq. (8.52) that the distribution is again Nakagamidistributed m1 2 m1 m 1 A2 2 m1 1 pðAÞ ¼ A exp ðm1 Þ 1 1
ð8:57Þ
where ðGÞ
m1 ¼
PG mG K2
ðGÞ
PG K 2
ðBÞ
þ PB mB K2 ðBÞ
þ PB K2
ð8:58Þ
AN ERROR FLOW SIMULATOR FOR DIGITAL COMMUNICATION CHANNELS
395
and ðGÞ
1 ¼
ðBÞ
PG mG K2 þ PB mB K2 mG ðGÞ mB ðBÞ PG K þ PB K G 2 B 2
ð8:59Þ
Finally, since the expression for the bit error rate of BPSK in Nakagami fading is known in terms of the hypergeometric function [2,23] rffiffiffiffiffiffiffi 1 ðm þ 0:5Þ 2m þ 1 1 ; ; Perr ¼ ð8:60Þ 2 F1 2 m ðmÞ 2 2 m one can easily obtain the expression for the BER in fading with changing parameters. For the case of a slow switching, eq. (8.60) produces sffiffiffiffiffiffiffiffiffiffi 1 G ðmG þ 0:5Þ 2 mG þ 1 1 G ; ; P e ¼ PG 2 F1 mG ðmG Þ 2 2 2 mG sffiffiffiffiffiffiffiffiffiffi B ðmB þ 0:5Þ 2 mB þ 1 1 B ; ; 2 F1 mB ðmB Þ 2 2 mB
ð8:61Þ
For the fast switching, the following is produced 1 Pe ¼ 2
sffiffiffiffiffiffiffiffiffiffiffi 1 ðm1 þ 0:5Þ 2 m1 þ 1 1 1 ; ; F 2 1 2 2 m1 m1 ðm1 Þ
ð8:62Þ
Since parameters P1 , P2 and the intensity of transitions directly affect BER and are functions of the roughness of the sea and the velocity of the craft, the developed equations could be a useful tool for system designers.
8.2.3.3
Numerical Simulation
The modified Ozaki scheme (see Section C.3 in Appendix C on the web and [27–29]) could be used to simulate each non-linear SDE d ½ pst ðxn Þ dx 1 fn log 1 þ ½expð fn tÞ 1 Kt ¼ xn fn t sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi expð2 Kt tÞ 1 t ¼ 2 Kt fn ¼
xnþ1 ¼ j expðKt tÞxn þ t i j
ð8:63Þ ð8:64Þ ð8:65Þ ð8:66Þ
396
APPLICATIONS
where Kt is a factor inversely proportional to the correlation interval of the fading and i is a Gaussian iid process. The processes are then commutated by the two-state Markov process. Performance of a few representative signalling techniques is evaluated with the numerical model. Additionally, the two composite processes are compared on the basis of the Kolmogorov– Smirnov ‘‘Goodness-of-Fit’’ measure [99]. This measure maps the greatest absolute error from the empirical CDF to a reference CDF as shown below ^ obs ðxÞ Pref ðxÞj D ¼ supjP
ð8:67Þ
pffiffiffiffi 0:11 PðD > observedÞ ¼ Qks D N þ 0:12 þ pffiffiffiffi N
ð8:68Þ
and
where N is the number of bins in the empirical CDF and Qks is defined as Qks ðÞ ¼ 2
1 X
ð1Þj1 exp½2 j2 2
ð8:69Þ
j¼1
Using this measure, both fast and slow switching processes were highly likely to match their theoretical counterparts. In the slow switching case, the match was a virtual certainty ð < 0:4Þ and in the fast switching case the probability of a theoretical match was 98.5% ð 0:5Þ. Below, the results for the slow switching are reviewed first (PDF and K-S performance) followed by similar graphs for the fast switching case. Fig. 8.13 shows the results of numerical simulation of the suggested model.
Figure 8.13 Results of the numerical simulation of SDE (8.55) with random structure: (a) slow fading; (b) fast fading.
A SIMULATOR OF RADAR SEA CLUTTER WITH A NON-RAYLEIGH ENVELOPE
8.3
397
A SIMULATOR OF RADAR SEA CLUTTER WITH A NON-RAYLEIGH ENVELOPE
Clutter, being an unwanted phenomenon which usually accompanies target detection, has been the subject of numerous investigations related to the performance of sea microwave radar [100–153]. In many real situations (low grazing angles, requirement of high resolution) the probability density function of the clutter envelope (EPDF) does not fit the Rayleigh law [100–153], i.e. it exhibits higher tails and a larger standard deviation–to–mean ratio. Here two models of a simulator of radar clutter, utilizing a second order SDE considered earlier in Chapter 7, and generating clutter with the K or Weibull distribution of the envelope, are considered. Such simulators can be used in the testing of radar target discrimination schemes ([117–153] and many others).
8.3.1
Modeling and Simulation of the K-Distributed Clutter
The K-distribution of the clutter envelope, which is now widely applied both to the description of microwave scattering by a rough ocean surface and to the synthesis of the target detection algorithms [101–108], provides a better agreement with experimental data than the Rayleigh one. In this case the envelope of the signal is distributed according to the following PDF pA ðAÞ ¼
bþ1 A K1 ðb AÞ; 21 ðÞ
A0
ð8:70Þ
where ð Þ is the Euler Gamma function [23], is a positive shape parameter, Kv ð Þ is the modified second kind Bessel function of order [23] and b is related to the average power 2 by the formula b2 ¼ 2 =2
ð8:71Þ
which generalizes the Rayleigh law in the case ! 0. A non-Rayleigh, and in particular K-distributed, correlated clutter may be presented as a narrow band stationary process with a symmetrical non-Gaussian distribution of instantaneous values [101–108]. In addition to having its power spectrum concentrated around its central frequency, the corresponding generating system has to be at least of the second order, and the clutter itself can be considered as a component of the two-dimensional Markovian process, i.e. as a solution, for example, of the following Duffing stochastic SDE €x þ 2 _x þ FðxÞ ¼ K ðtÞ
ð8:72Þ
where ðtÞ is the white Gaussian noise (WGN) of a unit one-sided power spectral density. To find the parameters of eq. (8.72) the following steps have to be taken. 1. Choose a non-linear function FðxÞ which determines the desired distribution of the instantaneous values of the process xðtÞ.
398
APPLICATIONS
2. Estimate the damping factor and intensity K 2 of the WGN, which together define correlation properties of the process xðtÞ. The relation between the stationary distribution px st ðxÞ of the solution of SDE (8.72) and the non-linear term FðxÞ in its operator has been found to be (see Section 7.3.3, eq. (7.201)) ð 4 x FðsÞd s px stðxÞ ¼ C exp 2 k x0
ð8:73Þ
where C is the normalization condition constant. Function FðxÞ in SDE (8.72), generally non-linear, can be determined by inverting eq. (8.73) (see eq. (7.202)) FðxÞ ¼
K2 d ln px stðxÞ 4 dx
ð8:74Þ
In order to use the synthesis equation (8.74) the exact form of the PDF px st ðxÞ of the process xðtÞ must be evaluated from the known PDF of the envelope (8.70). Functions pA ðAÞ and px st ðxÞ are related by the Blanc-Lapierre transformation (4.331) for narrow band processes px st ðxÞ ¼
1 2
ð1
ð1 exp½i s xd s 1
pA ðAÞJ0 ðs AÞd A
ð8:75Þ
0
This has been accomplished in Section 4.7.4.1, eq. (4.436) 1
1 2þ2 b pffiffiffi ðbjxjÞ2 K 1 ðbjxjÞ px st ðxÞ ¼ 2 ðÞ
ð8:76Þ
To calculate the logarithmic derivative of the PDF (8.76) one can use the well known identity [23] d K ðzÞ Kþ1 ðzÞ ð8:77Þ ¼ d z z z which leads to K3=2 ðbjxjÞ K3=2 ðbjxjÞ p0x st ðxÞ ¼ b sgnðxÞ ¼ b sgnðxÞ K1=2 ðbjxjÞ K1=2 ðbjxjÞ px st ðxÞ
ð8:78Þ
As a result, the generating SDE of a narrow band random process with K-distributed envelope is given by €x þ x_ þ sgnðxÞ
K 2 bK3=2 ðbjxjÞ ¼ KðtÞ 4 K1=2 ðbjxjÞ
ð8:79Þ
399
A SIMULATOR OF RADAR SEA CLUTTER WITH A NON-RAYLEIGH ENVELOPE
The next problem is to estimate the mean value of the central frequency of the narrow band process xðtÞ. For this purpose one can approximate the non-linear function FðxÞ in eq. (8.79) by a polynomial series as in Section 7.3.3: €x þ x_ þ
1 X
! a2 iþ1 x
2i
!0 x ¼ KðtÞ
ð8:80Þ
i¼0
where a1 ¼ 1 and a2 iþ1 are the coefficients of FðxÞ’s power expansion. For this purpose one may use the following presentation of the modified Bessel function around zero [23] ðÞ21 ð1 Þ 2 x ; K ðxÞ 1þ x 4 ð2 Þ
x>0
ð8:81Þ
Taking expansion (8.81) into account, the non-linear function FðxÞ can be rewritten as 3 2 ð1=2 Þ 2 x 1 þ K 2 b2 ð 3=2Þ 6 4 ð1=2 Þ 7 7 FðxÞ ¼ x6 ð1=2 Þ 2 5 8 ð 1=2Þ 4 x 1þ 4 ð3=2 Þ K 2 b2 ð 3=2Þ ð1=2 Þ ð1=2 Þ b2 x2 x 1þ 8 ð 1=2Þ ð1=2 Þ ð3=2 Þ 4
ð8:82Þ
Fig. 8.14 shows the satisfactory quality of such an approximation. It is was shown in Section 4.7 that a narrow band process can be represented as xðtÞ ¼ AðtÞ cosððtÞÞ ¼ AðtÞ cos½!0 t þ ’ðtÞ
Figure 8.14 line).
ð8:83Þ
Non-linear term in eq. (8.79) (solid line) and its approximation in eq. (8.82) (dashed
400
APPLICATIONS
where AðtÞ and ’ðtÞ are the slowly varying envelope and phase, respectively, defined via the Hilbert transform ^xðtÞ of the process xðtÞ. In this case A2 ðtÞ ¼ ^x2 ðtÞ þ x2 ðtÞ ðtÞ ¼ atan
ð8:84Þ
^xðtÞ xðtÞ
Since for a narrow band process its Hilbert transform can be approximated by a derivative of the original process (see Section 4.7) x_ ¼ !0 AðtÞ sin ðtÞ
ð8:85Þ
an equivalent first order SDE can be obtained [154] " # 1 dA 1 X sin 2 iþ1 ¼ 2 sin A sin a2 iþ1 A cos ðtÞ dt 2 !0 i ¼ 0 !0 " # 1 dA 1 X cos 2 iþ1 ¼ 2 cos A sin a2 iþ1 A cos ðtÞ dt A 2 !0 i ¼ 0 A !0
ð8:86Þ ð8:87Þ
which, in turn, can be simplified to dA K2 K ¼ 2 s þ 1 ðtÞ þ dt 4 A 2 !0
ð8:88Þ
d K2 K ¼ 2 c þ þ 1 ðtÞ ¼ !ðtÞ 4 A!0 2 !0 dt
ð8:89Þ
where s and c are regular terms of eqs. (8.86) and (8.87), respectively [155], 1 ðtÞ, and 1 ðtÞ are independent white Gaussian noises with spectral density 0:5 K 2. Averaging both sides of eq. (8.89) over A, the mean angular frequency of the power spectrum of the process xðtÞ can be obtained as " ! ¼ !0 1 þ " ¼ !0 1 þ
ð 1 X a2 iþ1 ð2 iÞ! 1 i¼1
ði!Þ2 2i
1 X a2 iþ1 ð2 iÞ! i¼1
ði!Þ2 2i
1
# pA ðAÞA
2 iþ1
dA
#
m2 iþ1
ð8:90Þ
The initial moments of K-distribution are [101–103] ml ¼
ð1 0
bþ1 A pA ðAÞd A ¼ 1 2 ðÞ l
ð1 A 0
lþ
2 þ 2l 1 þ 12 K1 ðbAÞdA ¼ b l ðÞ
ð8:91Þ
A SIMULATOR OF RADAR SEA CLUTTER WITH A NON-RAYLEIGH ENVELOPE
401
Using the latter expression and taking into account the fact that b2 ¼ 2 =2 , one can obtain the following expressions for the first three even moments (8.91) þ1 m2 ¼ 22 pffiffiffi 6 ð þ 3=2Þ m3 ¼ 3 b ðÞ 82 ð þ 2Þð þ 1Þ m4 ¼ 2
ð8:92Þ
Approximation (8.82) now can be rewritten as K 2 b2 ð 3=2Þ x½1 þ a3 m3 x2 8 ð 1=2Þ
ð8:93Þ
ð1=2 Þ ð1=2 Þ b2 a3 ¼ ð1=2 Þ ð3=2 Þ 4
ð8:94Þ
FðxÞ where
and !20 ¼
K 2 b2 ð 3=2Þ 8 ð 1=2Þ
ð8:95Þ
Taking into account only terms of order 3 in eq. (8.80), one can arrive at the following expression for the central frequency of the solution of SDE (8.80) !c ¼ !0 ð1 þ a3 m3 Þ
ð8:96Þ
The next step is to show that if the parameter approaches infinity, the generating SDE (8.79) tends to a linear second order equation which generates a narrow band Gaussian process with the corresponding Rayleigh PDF. This can be accomplished by considering the following asymptotic expansion [23] rffiffiffiffiffiffi expð Þ K ð zÞ 2 ð1 þ z2 Þ1=4
ð8:97Þ
where
¼
pffiffiffiffiffiffiffiffiffiffiffiffi 1 þ z2 þ ln
z pffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 1 þ z2
ð8:98Þ
Using eq. (8.97), one can obtain that K ðbjxjÞ ¼ K ð zÞ; z ¼
bjxj
ð8:99Þ
402
APPLICATIONS
If x is fixed and z ! 0, eq. (8.98) can be simplified to
1 þ ln
z 2
ð8:100Þ
thus producing a simple approximation for K ð zÞ and Kv ðbjxjÞ in the following form rffiffiffiffiffiffi rffiffiffiffiffiffi z K ð zÞ exp ln ¼ expð0:307 Þz 2 2 2 rffiffiffiffiffiffi K ðbjxjÞ ¼ expð0:307 ÞbÞjxj 2
ð8:101Þ ð8:102Þ
Finally, substituting expression (8.102) into eq. (8.78), one obtains that K3=2 ðbjxjÞ p0x st ðxÞ ¼ b sgnðxÞ K1=2 ðbjxjÞ px st ðxÞ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 3 =ð2 3Þ expð0:307ð 3=2ÞÞb2 b pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 3 =ð2 1Þ expð0:307ð 1=2ÞÞb2 sffiffiffiffiffiffiffiffiffiffiffi ex 12 ¼ !20 r x ¼ b2 2 32
ð8:103Þ
The last expression corresponds to the linear function FðxÞ in SDE (8.79) €x þ x_ þ !20 r x ¼ KðtÞ
ð8:104Þ
which generates a narrow band Gaussian process with a central frequency !0 r and with an envelope distributed by the Rayleigh law. The next step is to tune the correlation function of the envelope. It is reasonable to assume that a small deviation of the PDF of the envelope does not significantly influence the correlation time, and to use the approximation of the K-distribution by the Nakagami law (8.4). Clearly, the most feasible approximation condition is the equality of the second and fourth moments of both distributions. In the case of the Nakagami PDF, they are equal to [6] mN 2 ¼
ð8:105Þ
and m4 N ¼
mþ1 2 m
ð8:106Þ
It follows from eq. (8.92) that þ1 m þ 1 2 8 2 ð þ 2Þð þ 1Þ ¼ m 2 ¼ 2 2
ð8:107Þ ð8:108Þ
A SIMULATOR OF RADAR SEA CLUTTER WITH A NON-RAYLEIGH ENVELOPE
403
and, thus m¼
þ1 þ2
ð8:109Þ
The SDE which generates the Nakagami distribution is well known (7.41) K2 2 m 1 2 m A A_ ¼ N þ KN N ðtÞ A 2
ð8:110Þ
where WGN N ðtÞ has unit intensity and KN ¼ 0:5 K. The correlation function of the SDE (8.110) solution may be written as (see eq. (7.43)) 1 2 m þ 12 X 2 i 12 2mki jj exp CAA ðÞ ¼ 4 m ðmÞ i ¼ 1 ðm þ 1Þ
ð8:111Þ
Taking into account only the first term in eq. (8.92), one can obtain an approximation for the envelope covariance function and correlation interval, respectively 2 m þ 12 2 m KN2 exp jj 16 m ðmÞðm þ 1Þ ¼ 2 KN2 m
CA ðÞ
ð8:112Þ
corr
ð8:113Þ
The latter allows one to complete the set of equations required to design a simulator. If the parameters !c , corr , and 2 of sea clutter are given, then the steps to generate the SDE model are the following: 1. Calculate the spectral density Kn of the excitation from eq. (8.113) sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 2 ð þ 2Þ KN ¼ corr
ð8:114Þ
2. Using eq. (8.96) calculate the frequency !c 1 þ a3 m3
ð8:115Þ
ð þ 2Þð 3=2Þ 2 !20 ð 1=2Þ
ð8:116Þ
!0 ¼ 3. Calculation of the damping factor
¼
404
APPLICATIONS
Figure 8.15
Figure 8.16
Form filter corresponding to SDE (8.79).
Desired (dashed line) and estimated (solid line) PDF obtained from SDE (8.79).
The form filter, corresponding to model (8.79), is shown in Fig. 8.15 and results of the numerical simulation are given in Fig. 8.16.
8.3.2
Modeling and Simulation of Weibull Clutter
A different kind of radar clutter is the Weibull distributed radar clutter [112,122–153]. In this case the PDF of the envelope has the following form

p_A(A) = \alpha\beta A^{\beta-1}\exp\left[-\alpha A^{\beta}\right]   (8.117)
where \alpha and \beta are related to the average power \sigma^2 of the quadrature components as

2\sigma^2 = \alpha^{-2/\beta}\,\Gamma\left(1 + \frac{2}{\beta}\right)   (8.118)

It is easy to see that, for the case of the Weibull distribution (8.117), eq. (7.244) can be rewritten in the form

F_1(A) = \frac{K^2}{4\omega_0^2}\left[\alpha\beta A^{\beta} - 1\right]   (8.119)

and, thus,

\frac{1}{A}\,\frac{d}{dA}F_1(A) = \frac{K^2\alpha\beta^2}{4\omega_0^2}\,A^{\beta-2}   (8.120)

\frac{d}{dA}\left[\frac{1}{A}\,\frac{d}{dA}F_1(A)\right] = \frac{K^2\alpha\beta^2(\beta-2)}{4\omega_0^2}\,A^{\beta-3}   (8.121)

The expression (8.120) has a zero limit when A \to 0 and \beta > 2 (the non-Rayleigh case) and is equal to K^2\alpha\beta^2/4\omega_0^2 in the Rayleigh case (\beta = 2). The integral on the left side of eq. (7.244) can easily be calculated by using eq. (8.121)

\int_0^x \frac{d}{dA}\left[\frac{1}{A}\,\frac{d}{dA}F_1(A)\right]\frac{A^3\,dA}{\sqrt{x^2 - A^2}} = \frac{K^2\alpha\beta^2(\beta-2)}{4\omega_0^2}\,\frac{\sqrt{\pi}}{2}\,\frac{\Gamma\left(\frac{\beta+1}{2}\right)}{\Gamma\left(\frac{\beta}{2}+1\right)}\,x^{\beta}   (8.122)

thus producing

f(x) = \frac{K^2\alpha\beta^2(\beta-2)}{4\omega_0^2}\,\frac{\sqrt{\pi}}{2}\,\frac{\Gamma\left(\frac{\beta+1}{2}\right)}{\Gamma\left(\frac{\beta}{2}+1\right)}\,x^{\beta-2}   (8.123)

and leading to a generating SDE of the van der Pol type (7.225)

\ddot{x} + \frac{K^2\alpha\beta^2(\beta-2)}{4\omega_0^2}\,\frac{\sqrt{\pi}}{2}\,\frac{\Gamma\left(\frac{\beta+1}{2}\right)}{\Gamma\left(\frac{\beta}{2}+1\right)}\,x^{\beta-2}\,\dot{x} + \omega_0^2 x = \omega_0^2\,\xi(t)   (8.124)
The next step is to find the correlation function of the solution x(t). For this purpose one can rewrite eq. (8.124) in the Cauchy form

\dot{x} = y
\dot{y} = -\varepsilon x^{\beta-2} y - \omega_0^2 x + \omega_0^2\,\xi(t)   (8.125)

where

\varepsilon = \frac{K^2\alpha\beta^2(\beta-2)}{4\omega_0^2}\,\frac{\sqrt{\pi}}{2}\,\frac{\Gamma\left(\frac{\beta+1}{2}\right)}{\Gamma\left(\frac{\beta}{2}+1\right)}   (8.126)

The drift coefficients K_{1x} and K_{1y} are thus defined as

K_{1x}(x, y) = y
K_{1y}(x, y) = -\varepsilon x^{\beta-2} y - \omega_0^2 x   (8.127)
The latter produces the following Gaussian approximation for the covariance functions [156,157]:

\frac{d}{d\tau}\begin{bmatrix}\langle x, x_\tau\rangle \\ \langle x, y_\tau\rangle\end{bmatrix} = \begin{bmatrix}\left\langle\dfrac{\partial}{\partial x}K_{1x}(x,y)\right\rangle & \left\langle\dfrac{\partial}{\partial y}K_{1x}(x,y)\right\rangle \\ \left\langle\dfrac{\partial}{\partial x}K_{1y}(x,y)\right\rangle & \left\langle\dfrac{\partial}{\partial y}K_{1y}(x,y)\right\rangle\end{bmatrix}\begin{bmatrix}\langle x, x_\tau\rangle \\ \langle x, y_\tau\rangle\end{bmatrix}   (8.128)

where \langle\cdot,\cdot\rangle denotes cumulant brackets (see Section 4.4.2) and the initial conditions are

\langle x, x_\tau\rangle|_{\tau=0} = \sigma^2, \qquad \langle x, y_\tau\rangle|_{\tau=0} = 0   (8.129)

Using eq. (8.127) one can rewrite the system of equations (8.128) in the following form

\frac{d}{d\tau}\begin{bmatrix}\langle x, x_\tau\rangle \\ \langle x, y_\tau\rangle\end{bmatrix} = \begin{bmatrix}0 & 1 \\ -\varepsilon(\beta-2)\langle x^{\beta-3}y\rangle - \omega_0^2 & -\varepsilon\langle x^{\beta-2}\rangle\end{bmatrix}\begin{bmatrix}\langle x, x_\tau\rangle \\ \langle x, y_\tau\rangle\end{bmatrix}   (8.130)

The solution \langle x, x_\tau\rangle of eq. (8.130) is given by

\langle x, x_\tau\rangle = \frac{\sigma^2}{\lambda_2 - \lambda_1}\left(\lambda_2 e^{\lambda_1\tau} - \lambda_1 e^{\lambda_2\tau}\right)   (8.131)
where \lambda_1 and \lambda_2 are the eigenvalues of the system matrix in eq. (8.130). Defining the correlation interval \tau_{corr} as

\sigma^2\tau_{corr} = \int_0^{\infty}\langle x, x_\tau\rangle\,d\tau   (8.132)

one can obtain

\tau_{corr} = -\frac{\lambda_1 + \lambda_2}{\lambda_1\lambda_2} = \frac{\varepsilon\langle x^{\beta-2}\rangle}{\varepsilon(\beta-2)\langle x^{\beta-3}y\rangle + \omega_0^2}   (8.133)
We are now ready to present an algorithm for the synthesis of an SDE whose solution has a Weibull distributed envelope with parameters \alpha, \beta, \sigma^2 related by formula (8.118) and correlation interval \tau_{corr}. In fact, solving eq. (8.133) with respect to \varepsilon,

\varepsilon = \frac{K^2\alpha\beta^2(\beta-2)}{4\omega_0^2}\,\frac{\sqrt{\pi}}{2}\,\frac{\Gamma\left(\frac{\beta+1}{2}\right)}{\Gamma\left(\frac{\beta}{2}+1\right)} = \frac{\omega_0^2\,\tau_{corr}}{\langle x^{\beta-2}\rangle - \tau_{corr}(\beta-2)\langle x^{\beta-3}y\rangle}   (8.134)

therefore

K^2 = \frac{8\omega_0^4\,\tau_{corr}}{\sqrt{\pi}\,\alpha\beta^2(\beta-2)}\,\frac{\Gamma\left(\frac{\beta}{2}+1\right)}{\Gamma\left(\frac{\beta+1}{2}\right)}\,\frac{1}{\langle x^{\beta-2}\rangle - \tau_{corr}(\beta-2)\langle x^{\beta-3}y\rangle}   (8.135)

Thus, all the parameters in eq. (8.124) are now known. The form filter corresponding to the model (8.124) is given in Fig. 8.17 and the results of the numerical simulation are given in Fig. 8.18.
Figure 8.17 Form filter corresponding to SDE (8.124).

Figure 8.18 Desired (dashed line) and estimated PDF obtained from SDE (8.124).
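The chain of results (8.131)–(8.133) can be checked numerically for any admissible 2x2 system of the form (8.130). The sketch below, with illustrative values for the entries a = \varepsilon\langle x^{\beta-2}\rangle and b = \varepsilon(\beta-2)\langle x^{\beta-3}y\rangle + \omega_0^2, integrates the closed form (8.131) and compares the result with the correlation interval of eq. (8.133):

```python
import cmath

# Covariance system (8.130) in the generic form
#   d/dtau [Cxx, Cxy] = [[0, 1], [-b, -a]] [Cxx, Cxy]
a, b, sigma2 = 0.8, 4.0, 1.0            # illustrative values
lam1 = (-a + cmath.sqrt(a * a - 4.0 * b)) / 2.0
lam2 = (-a - cmath.sqrt(a * a - 4.0 * b)) / 2.0

def cxx(t):
    """Closed form (8.131) for the covariance function (real by construction)."""
    val = sigma2 * (lam2 * cmath.exp(lam1 * t) - lam1 * cmath.exp(lam2 * t)) / (lam2 - lam1)
    return val.real

# Definition (8.132): sigma2 * tau_corr = integral of Cxx over [0, inf), here by trapezoid
h, T = 1e-3, 60.0
n = int(T / h)
integral = 0.5 * h * (cxx(0.0) + cxx(T)) + h * sum(cxx(i * h) for i in range(1, n))
tau_numeric = integral / sigma2
tau_formula = a / b     # eq. (8.133): -(lam1 + lam2)/(lam1 * lam2)
```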
8.4 MARKOV CHAIN MODELS IN COMMUNICATIONS

In contrast to the continuous models of fading channels considered above, this section deals with the attempt to model the error flow directly, i.e. focusing on the discrete channel itself. Such models are powerful tools in the investigation of coding and network performance [38–103,158–170]. Only a few topics are considered here. The reader is referred to [171] for more details and practical examples.

8.4.1 Two-State Markov Chain—Gilbert Model
Let a Markov chain with only two states, bad (B) and good (G) as shown in Fig. 8.19, be considered. This chain can be described by the following transitional matrix

P = \begin{bmatrix}1-p & p \\ r & 1-r\end{bmatrix}   (8.136)

Figure 8.19 Two-state Markov chain: (a) Zorzi notation [62], (b) original Gilbert paper notation.
In the good state there are no errors, while in the bad state an error appears with probability 1-h. We assume that the probability of error, P_{err}, and the parameter h are given. Since errors appear only in the bad state, one has

P_B = 1 - P_G = \frac{P_{err}}{1-h}, \qquad P_G = 1 - \frac{P_{err}}{1-h}   (8.137)
Furthermore, any two-state transitional matrix can be decomposed into the following form

T = (1-\rho)\begin{bmatrix}P_G & P_B \\ P_G & P_B\end{bmatrix} + \rho\begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix} = \begin{bmatrix}(1-\rho)P_G + \rho & (1-\rho)P_B \\ (1-\rho)P_G & (1-\rho)P_B + \rho\end{bmatrix}   (8.138)

where the parameter 0 \le \rho \le 1 reflects the correlation properties of the Markov chain: \rho = 0 corresponds to independent error occurrence, while \rho = 1 results in completely deterministic error occurrence, since the chain remains in only one of the states (see Table 8.3). Comparing eqs. (8.136) and (8.138) one obtains

r = (1-\rho)P_G = (1-\rho)\left(1 - \frac{P_{err}}{1-h}\right), \qquad p = (1-\rho)P_B = (1-\rho)\frac{P_{err}}{1-h}   (8.139)
Table 8.3 Correspondence between Markov chain parameters in different notations

Quantity                                        Zorzi [62]   Gilbert [31]   In terms of \rho, P_{err}, h
Prob{G -> G}                                    1-p          Q              1 - (1-\rho)\,P_{err}/(1-h)
Prob{G -> B}                                    p            P              (1-\rho)\,P_{err}/(1-h)
Prob{B -> G}                                    r            p              (1-\rho)(1 - P_{err} - h)/(1-h)
Prob{B -> B}                                    1-r          q              \rho + (1-\rho)\,P_{err}/(1-h)
Probability of a correct digit in the B state   h = 0        h              h
If the error-free run statistics are used to estimate the parameters of the Gilbert–Elliot model, one has to fit the experimental data P(0^m|1) by a double exponential curve

P(0^m|1) \approx A\,J^m + (1-A)\,L^m   (8.140)

In this case the parameters of the Gilbert model can be defined as [31]

h = \frac{LJ}{J - A(J-L)}, \qquad P = \frac{(1-L)(1-J)}{1-h}, \qquad p = A(J-L) + (1-J)\,\frac{L-h}{1-h}   (8.141)
For the more general Elliot model, the same approximation (8.140) of the error-free run statistics produces the following parameters of the model

k = \frac{P_{err} - 1 + JL + \sqrt{\left(P_{err} - 1 + JL\right)^2 - 4JL}}{2}   (8.142)

h = \frac{JL}{k}   (8.143)

p = \frac{(k-h)(J + L - h - k)}{JL}   (8.144)

P = 1 - p\,\frac{h}{k}   (8.145)

\rho = 1 - \frac{P_{err}\left[A(J-L) - J\right]}{JL}   (8.146)

8.4.2 Wang–Moayeri Model
One of the decisions which must be made by the simulator developer is the choice of a proper number of states of the Markov chain, N, and the calculation of the elements of the transitional matrix. It was pointed out in [44] that finite state models can produce unreasonable results if the model
Figure 8.20 Wang–Moayeri model. Only transitions to a neighbouring state are allowed.
does not represent the real physical characteristics of the continuous channel. One of the important features which must be represented is the so-called level crossing rate (LCR) of fading. Having this in mind, Wang and Moayeri have suggested a model [44] which is based on the assumption that transitions appear only between two neighbouring states, and all other transitions are not allowed (see Fig. 8.20). Such an assumption is quite reasonable for the case of very slow fading. Indeed, in this case the velocity

\xi_A(t) = \frac{d}{dt}A(t)   (8.147)

of the process is relatively small, and one can assume that the signal level A(t) is not able to change by more than a quantization step during a symbol interval T

|A(t+T) - A(t)| \approx |\xi_A(t)|\,T < \min(\Delta A_i)   (8.148)

Of course, this condition will be violated for faster fading. However, a large number of systems with a high bit rate and short bit duration T, operating without interleaving, can be accurately described by the Wang–Moayeri model, described below. The range of magnitude variation can be divided into N regions by a set of thresholds A_k such that

0 = A_0 < A_1 < \cdots < A_K = \infty   (8.149)
The fading is said to be in state s_k if the magnitude of the signal resides inside the interval [A_k, A_{k+1}). In each state, the probability of error e_k is assigned according to the modulation scheme used. The probability p_k of the channel being in state s_k can be calculated if the fading PDF p_A(A) is known. The Wang–Moayeri model assumes that

p_k = \int_{A_k}^{A_{k+1}} p_A(A)\,dA   (8.150)

Since only transitions into the neighbouring states are allowed, all transitional probabilities with indices differing by at least two are equal to zero:

p_{ij} = 0 \quad \text{for } |i-j| > 1, \quad 0 \le i, j   (8.151)
Now let one assume that the communication system considered transmits R symbols per second, i.e. the symbol rate is R. Then, on average, the number of symbols sent through the communication channel in state s_k is given by

R_k = p_k R   (8.152)

The slow fading assumption implies that the level crossing rate LCR(A_k), calculated at the level A_k, is much smaller than R_k. In this case the transition probability p_{k,k+1} can be approximated by the ratio of the upward level crossing rate at the level A_{k+1} to the average symbol rate in the state s_k. Indeed, given an arbitrary time interval T_0, there will be approximately LCR(A_{k+1})T_0 upward crossings of the level A_{k+1}, each corresponding to a channel state transition from s_k to s_{k+1} (see Fig. 8.20). Thus

p_{k,k+1} = \frac{LCR(A_{k+1})\,T_0}{R_k T_0} = \frac{LCR(A_{k+1})}{R_k} = \frac{LCR(A_{k+1})}{p_k R}, \qquad 0 \le k \le N-2   (8.153)

In the very same manner, the transitional probabilities p_{k,k-1} can be obtained as

p_{k,k-1} = \frac{LCR(A_k)}{p_k R}, \qquad 1 \le k \le N-1   (8.154)
The rest of the probabilities can be found as follows

p_{0,0} = 1 - p_{0,1}
p_{k,k} = 1 - p_{k,k-1} - p_{k,k+1}, \qquad 1 \le k \le N-2   (8.155)
p_{K-1,K-1} = 1 - p_{K-1,K-2}

The next task is to verify that the suggested assignment of transitional probabilities is consistent with the stationary probabilities p_k of the channel states. In other words, one has to prove that

p_0 p_{0,0} + p_1 p_{1,0} = p_0
p_{k-1}p_{k-1,k} + p_k p_{k,k} + p_{k+1}p_{k+1,k} = p_k, \qquad 1 \le k \le N-2   (8.156)
p_{K-2}p_{K-2,K-1} + p_{K-1}p_{K-1,K-1} = p_{K-1}

The first and the third equations can be easily verified. Indeed

p_0 p_{0,0} + p_1 p_{1,0} = p_0(1 - p_{0,1}) + p_1 p_{1,0} = p_0 - p_0 p_{0,1} + p_1 p_{1,0} = p_0
p_{K-2}p_{K-2,K-1} + p_{K-1}p_{K-1,K-1} = p_{K-2}p_{K-2,K-1} + p_{K-1}(1 - p_{K-1,K-2}) = p_{K-1}   (8.157)

since, according to eqs. (8.153) and (8.154),

p_0 p_{0,1} = p_1 p_{1,0} = \frac{LCR(A_1)}{R}, \qquad p_{K-2}p_{K-2,K-1} = p_{K-1}p_{K-1,K-2} = \frac{LCR(A_{K-1})}{R}   (8.158)
Finally, the second equation in eq. (8.156) can be rewritten as

p_{k-1}p_{k-1,k} + p_k\left(1 - p_{k,k-1} - p_{k,k+1}\right) + p_{k+1}p_{k+1,k} = p_k   (8.159)

or, equivalently,

p_{k-1}p_{k-1,k} + p_{k+1}p_{k+1,k} = p_k p_{k,k-1} + p_k p_{k,k+1}   (8.160)

But both sides of eq. (8.160) are equal to [LCR(A_k) + LCR(A_{k+1})]/R, which proves that the designed transitional matrix has the required stationary probabilities of the states. A number of useful simulators can be based on the Wang–Moayeri model. The original paper [44] presents a model of a Rayleigh fading channel. In this case the instantaneous SNR \gamma(t) is described by the exponential PDF

p_\gamma(\gamma) = \frac{1}{\bar\gamma}\exp\left(-\frac{\gamma}{\bar\gamma}\right)   (8.161)

Here \bar\gamma represents the average SNR over the fading channel. If the thresholds \gamma_k are chosen, then the probabilities of each state can be calculated through eq. (8.150) as

p_k = \int_{\gamma_k}^{\gamma_{k+1}}\frac{1}{\bar\gamma}\exp\left(-\frac{\gamma}{\bar\gamma}\right)d\gamma = \exp\left(-\frac{\gamma_k}{\bar\gamma}\right) - \exp\left(-\frac{\gamma_{k+1}}{\bar\gamma}\right)   (8.162)
The level crossing rates for such fading can also be easily calculated (see Section 4.6) to be

LCR(\gamma) = \sqrt{\frac{-\rho''(0)}{2\pi}}\,\sqrt{\frac{\gamma}{\bar\gamma}}\exp\left(-\frac{\gamma}{\bar\gamma}\right)   (8.163)

This quantity depends on the exact shape of the normalized correlation function \rho(\tau). For Clark's model one shall assume

\rho(\tau) = J_0^2(2\pi f_D\tau)   (8.164)

This results in

\rho''(0) = \left.\frac{\partial^2}{\partial\tau^2}\rho(\tau)\right|_{\tau=0} = -\left.(2\pi f_D)^2\left[2J_0(2\pi f_D\tau)\left(J_0(2\pi f_D\tau) - \frac{J_1(2\pi f_D\tau)}{2\pi f_D\tau}\right) - 2J_1^2(2\pi f_D\tau)\right]\right|_{\tau=0} = -4\pi^2 f_D^2   (8.165)

and, thus, the level crossing rate can be calculated as

LCR(\gamma) = \sqrt{\frac{2\pi\gamma}{\bar\gamma}}\,f_D\exp\left(-\frac{\gamma}{\bar\gamma}\right)   (8.166)
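Eqs. (8.153)–(8.155) together with (8.162) and (8.166) give everything needed to assemble the transition matrix for the Rayleigh case. A sketch with an illustrative threshold set follows; because of the detailed-balance structure in (8.158), the stationarity check (8.156) holds up to rounding:

```python
import math

def wang_moayeri_chain(thresholds, gamma_bar, f_d, rate):
    """Build the Wang-Moayeri transition matrix (8.153)-(8.155) for Rayleigh fading.
    thresholds: 0 = g_0 < g_1 < ... < g_K = inf (SNR levels)."""
    K = len(thresholds) - 1
    # state probabilities, eq. (8.162)
    p = [math.exp(-thresholds[k] / gamma_bar) - math.exp(-thresholds[k + 1] / gamma_bar)
         for k in range(K)]

    def lcr(g):                               # level crossing rate, eq. (8.166)
        return math.sqrt(2.0 * math.pi * g / gamma_bar) * f_d * math.exp(-g / gamma_bar)

    T = [[0.0] * K for _ in range(K)]
    for k in range(K - 1):                    # upward transitions cross g_{k+1}
        T[k][k + 1] = lcr(thresholds[k + 1]) / (p[k] * rate)
    for k in range(1, K):                     # downward transitions cross g_k
        T[k][k - 1] = lcr(thresholds[k]) / (p[k] * rate)
    for k in range(K):                        # diagonal terms, eq. (8.155)
        T[k][k] = 1.0 - sum(T[k])
    return p, T

th = [0.0, 0.5, 1.0, 2.0, 4.0, float('inf')]      # illustrative thresholds
p, T = wang_moayeri_chain(th, gamma_bar=1.0, f_d=40.0, rate=1e5)
```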
The next question to be resolved is a proper choice of the threshold levels \gamma_k and of the number of states N. There are two conflicting requirements which must be taken into account. On the one hand, the range of each state must be large enough to cover the signal level variation during the bit (or block, or packet) duration. In this case it is valid to assume that the same state characterizes the channel. In addition, a wide range ensures that the following state of the channel does not differ greatly from the current state, thus justifying the assumption that only transitions to the neighbouring states are allowed. This requirement dictates that the number of states should be small and that the probabilities of all states are approximately of the same order. On the other hand, a wide range of signal levels inside a single state may result in a situation where two separate bits (blocks, or packets) experience two significantly different SNR levels while formally belonging to the same state of the channel, thus resulting in a significant discrepancy in the error rates. From this point of view, a relatively large number of states is needed. It is suggested in [61] that an appropriate partition is made based on the average residence time \tau_k in each state. This quantity can be shown to be

\tau_k = \frac{p_k}{LCR(\gamma_k) + LCR(\gamma_{k+1})}   (8.167)

The mean residence time can be normalized to the bit (block, packet) duration T_b as

c_k = \frac{\tau_k}{T_b}   (8.168)

In the case of Clark's model, the expression (8.168) becomes

c_k = \frac{1}{f_m T_b}\cdot\frac{\exp\left(-\dfrac{\gamma_k}{\bar\gamma}\right) - \exp\left(-\dfrac{\gamma_{k+1}}{\bar\gamma}\right)}{\sqrt{\dfrac{2\pi\gamma_k}{\bar\gamma}}\exp\left(-\dfrac{\gamma_k}{\bar\gamma}\right) + \sqrt{\dfrac{2\pi\gamma_{k+1}}{\bar\gamma}}\exp\left(-\dfrac{\gamma_{k+1}}{\bar\gamma}\right)}   (8.169)

If the thresholds \gamma_k are known, then the average residence time c_k in each state can be easily calculated. The inverse problem can also be formulated: given c_k, find a partition of the SNR range, i.e. thresholds \gamma_k, such that the residence time is exactly c_k. In this case eq. (8.169) is an overdefined system, since there will be K equations with only K-1 unknown thresholds. One of the possibilities to circumvent this problem is to introduce an additional unknown scale factor c such that only the relative length \bar{c}_k of the intervals is fixed

c_k = \bar{c}_k\,c   (8.170)

For example, the authors of [61] suggest that all \bar{c}_k = 1, i.e. the residence time is the same in all states. For a fixed Doppler spread f_m T_b it is possible to vary the number of states K and investigate its effect on the average residence time. On the other hand, if the average residence time is confined to certain specific values, such a restriction allows one to determine the number of states of the Markov chain which satisfies this requirement. The results of the numerical solution of eqs. (8.169) and (8.170) with the constraints \bar{c}_k = 1, for f_m T_b = 0.0338 and f_m T_b = 0.0169, are shown in Fig. 8.21.
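For an equal-probability partition, the thresholds follow from inverting (8.162), and the normalized residence times then follow directly from (8.169). A small sketch (the state count and the value of f_m T_b are illustrative choices):

```python
import math

def equal_probability_thresholds(n_states, gamma_bar):
    """Thresholds gamma_k with equal state probabilities p_k = 1/N for the
    exponential SNR PDF (8.161): gamma_k = -gamma_bar * ln(1 - k/N)."""
    ths = [-gamma_bar * math.log(1.0 - k / n_states) for k in range(n_states)]
    ths.append(float('inf'))
    return ths

def residence_times(thresholds, gamma_bar, fm_tb):
    """Normalized mean residence times c_k of eq. (8.169) for Clark's model."""
    def lcr_over_fd(g):                       # LCR(gamma)/f_D from eq. (8.166)
        if not math.isfinite(g):
            return 0.0                        # the LCR vanishes at infinite level
        return math.sqrt(2.0 * math.pi * g / gamma_bar) * math.exp(-g / gamma_bar)

    cs = []
    for k in range(len(thresholds) - 1):
        gk, gk1 = thresholds[k], thresholds[k + 1]
        pk = math.exp(-gk / gamma_bar) - math.exp(-gk1 / gamma_bar)
        cs.append(pk / (fm_tb * (lcr_over_fd(gk) + lcr_over_fd(gk1))))
    return cs

ths = equal_probability_thresholds(8, gamma_bar=1.0)
c = residence_times(ths, gamma_bar=1.0, fm_tb=0.0338)
```

Solving the inverse problem of (8.170) amounts to iterating on such a partition until the computed c_k match the prescribed relative lengths.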
Figure 8.21 Relation between the number of states K and the average residence time c for a fixed fading velocity f_m T_b as a parameter. The residence time in each state is the same. (a) Linear scale, (b) inverse scale. The choice of parameters corresponds to carrier frequencies f_0 = 900 MHz and 1.9 GHz, a vehicular velocity of 5 km/h, and a transmission rate of 1 Mb/s or 100 kb/s.
It is interesting to observe that the number of states is approximately proportional to the inverse of the average residence time, which is emphasized by the linear fit curves in Fig. 8.21(b). A large number of states results in the chain changing from state to state more often, and thus in a shorter residence time; this conclusion has been reached above. To obtain a similar value of the residence time c, a larger number of states is required for a slower fading, i.e. for a smaller f_m T_b. Once the number of states and the corresponding thresholds are chosen, the transitional matrix is completely designed, and the next step is to take into account the modulation technique used and the average SNR over the channel, since both of these factors contribute to the probability of error in each state. The Wang–Moayeri model has proved to be very popular and useful. Babich and Lombardi [65] investigated a model for Rician fading described by the PDF

p_A(A) = \frac{A}{\sigma^2}\exp\left(-\frac{A^2 + A_s^2}{2\sigma^2}\right)I_0\left(\frac{A A_s}{\sigma^2}\right), \quad A \ge 0, \qquad K = \frac{A_s^2}{2\sigma^2}   (8.171)
The Rice factor K reflects the ratio between the power of the line-of-sight component with magnitude A_s and the power 2\sigma^2 of the diffusion component. Based on the general expression of the level crossing rate of a process with a statistically independent derivative, the level crossing rate can be expressed as (see Section 4.6)

LCR(\gamma) = \sqrt{\frac{2\pi(K+1)\gamma}{\bar\gamma}}\,f_D\exp\left(-K - \frac{(K+1)\gamma}{\bar\gamma}\right)I_0\left(2\sqrt{\frac{K(K+1)\gamma}{\bar\gamma}}\right)   (8.172)
Generalization to the case of Nakagami fading can be performed in two ways. One approach was suggested in [172] and is based on an approximation of the Nakagami process with severity parameter m > 1 by a Rice process with the Rice factor K defined as [3,172]

K = \frac{\sqrt{m^2 - m}}{m - \sqrt{m^2 - m}}   (8.173)

thus resulting in the expression obtained by substituting (8.173) into (8.172):

LCR(\gamma) = f_D\sqrt{\frac{2\pi(K+1)\gamma}{\bar\gamma}}\exp\left(-K - \frac{(K+1)\gamma}{\bar\gamma}\right)I_0\left(2\sqrt{\frac{K(K+1)\gamma}{\bar\gamma}}\right), \qquad K = \frac{\sqrt{m^2-m}}{m-\sqrt{m^2-m}}   (8.174)

Another and more direct approach can be based on the fact that, for a certain class of Nakagami processes, the derivative is indeed independent of the value of the process [3]. Thus, the general expression (see Section 4.6) can be used to produce

LCR(\gamma) = \sqrt{2\pi}\,f_D\,\frac{m^{m-\frac{1}{2}}}{\Gamma(m)}\left(\frac{\gamma}{\bar\gamma}\right)^{m-\frac{1}{2}}\exp\left(-\frac{m\gamma}{\bar\gamma}\right)   (8.175)

Of course, all three eqs. (8.172), (8.174) and (8.175) produce the same result for the case of Rayleigh fading, i.e. K = 0 and m = 1. It is suggested by Babich and Lombardi [65] that the distribution corresponding to the Rice fading power be approximated by means of the Weibull distribution

p_\gamma(\gamma) = \frac{\beta}{\alpha}\left(\frac{\gamma}{\alpha}\right)^{\beta-1}\exp\left[-\left(\frac{\gamma}{\alpha}\right)^{\beta}\right]   (8.176)

P_\gamma(\gamma) = \int_0^{\gamma}p_\gamma(\xi)\,d\xi = 1 - \exp\left[-\left(\frac{\gamma}{\alpha}\right)^{\beta}\right]   (8.177)

Here the scale and shape parameters \alpha and \beta are related to the parameters of the Rice distribution K, \sigma^2 and A_s^2 by means of the following equalities, obtained by equating the first two moments of both distributions

\frac{\Gamma\left(1 + \dfrac{2}{\beta}\right)}{\Gamma^2\left(1 + \dfrac{1}{\beta}\right)} = 1 + \frac{1 + 2K}{(K+1)^2}, \qquad 2\sigma^2 + A_s^2 = \alpha\,\Gamma\left(1 + \frac{1}{\beta}\right)   (8.178)

Here \Gamma(\cdot) is the Gamma function [23]. The advantage of using a Weibull PDF becomes apparent, since a simple expression can be obtained for the probabilities p_k of the states of the Markov chain once a set of thresholds \gamma_k is chosen. The reverse procedure is also analytically treatable. Indeed, if one is given a set \{p_k\} of the probabilities of the states of a Markov chain,
then the corresponding thresholds can be found by noting that

\sum_{i=1}^{k}p_i = \mathrm{Prob}\{\gamma \le \gamma_k\} = 1 - \exp\left[-\left(\frac{\gamma_k}{\alpha}\right)^{\beta}\right] = P_\gamma(\gamma_k)   (8.179)

and thus

\gamma_k = \alpha\left[\ln\frac{1}{1 - P_\gamma(\gamma_k)}\right]^{1/\beta} = \alpha\left[-\ln\left(1 - \sum_{i=1}^{k}p_i\right)\right]^{1/\beta}   (8.180)
Using the thresholds obtained through eq. (8.180), the level crossing rates can be estimated as

LCR(\gamma_k) = \sqrt{\frac{-\rho''(0)}{2\pi}}\left[-\ln\left(1 - \sum_{i=1}^{k}p_i\right)\right]^{1-\frac{1}{2\beta}}\left(1 - \sum_{i=1}^{k}p_i\right)   (8.181)

This can be further used in order to obtain simple analytical expressions for the elements of the transitional matrix of the Wang–Moayeri model.

Example: two-state model.^5 Approximation (8.181) can be readily applied to the case of the Gilbert–Elliot model. In this case p_1 = \varepsilon is the average probability of error in the channel, and the transitional probabilities r = p_{BG} from the "Bad" to the "Good" state and p = p_{GB} from the "Good" to the "Bad" state can now be approximated as [65]

r = T\sqrt{\frac{-\rho''(0)}{2\pi}}\,\frac{\left[-\ln(1 - \varepsilon)\right]^{\frac{1}{2}}(1 - \varepsilon)}{\varepsilon}   (8.182)

p = T\sqrt{\frac{-\rho''(0)}{2\pi}}\left[-\ln(1 - \varepsilon)\right]^{\frac{1}{2}}   (8.183)

Similar equations were derived independently in [25] for the case of the Nakagami and log-normal distributions and Clark's correlation function:

r_N = 0.4439\,f_D N_T\,\sqrt{\frac{8\pi m R}{\bar\gamma}}\,\frac{\left(\dfrac{2mR}{\bar\gamma}\right)^{m-1}\exp\left(-\dfrac{4mR}{\bar\gamma}\right)}{\Gamma(m) - \Gamma\left(m, \dfrac{2mR}{\bar\gamma}\right)}   (8.184)

r_L = \frac{c\,f_D N_T\exp\left[-\dfrac{(\ln R - m)^2}{2\sigma^2}\right]}{\sqrt{\pi}\,\sigma\left[\dfrac{1}{2} + \dfrac{1}{2}\,\mathrm{erf}\left(\dfrac{\ln(\sqrt{b}) - m}{\sqrt{2}\,\sigma}\right)\right]}   (8.185)

^5 This example was developed in collaboration with Ms Vanja Subotic.
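The inversion (8.180) is a one-line computation. The sketch below builds thresholds for an equal-probability partition and round-trips them through the CDF (8.177); the values of \alpha and \beta are illustrative:

```python
import math

def thresholds_from_state_probabilities(probs, alpha, beta):
    """Eq. (8.180): invert the Weibull CDF (8.177) at the cumulative
    state probabilities (the last state extends to infinity)."""
    thresholds, cumulative = [], 0.0
    for p in probs[:-1]:
        cumulative += p
        thresholds.append(alpha * (-math.log(1.0 - cumulative)) ** (1.0 / beta))
    return thresholds

alpha, beta = 1.2, 1.7
state_probs = [0.25, 0.25, 0.25, 0.25]
gammas = thresholds_from_state_probabilities(state_probs, alpha, beta)
# round trip through the CDF (8.177) recovers the cumulative probabilities
cdf_values = [1.0 - math.exp(-(g / alpha) ** beta) for g in gammas]
```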
Both studies [65] and [25] compared the average block failure rate predicted by eqs. (8.183)–(8.185) with those obtained from numerical simulation using the Jakes simulator and other simulators. It can be seen that there is good agreement between the numerical simulation and the prediction based on the two-state Markov model (Figs. 8.22–8.24).
Figure 8.22 Average block error burst length vs. the probability of the block error. Nakagami fading mean (left) and variance (right).
Figure 8.23 Average block error burst length vs. the probability of the block error. Log-normal fading mean values (left) and variance (right).
Figure 8.24 Average block error burst length vs. the probability of the block error. Two vehicles, Nakagami PDF: mean (left) and variance (right).
8.4.3 Independence of the Channel State Model on the Actual Fading Distribution
An interesting comment can be made about the fact that, after a Markov chain model is associated with a continuous fading process, i.e. a set of the state probabilities p_k is assigned, it is impossible to restore the original distribution of the fading. Indeed, any two distributions can be discretized in such a way as to produce the same set of state probabilities of the corresponding Markov chain. This may be considered as a sort of paradox. However, a proper model assigns a certain probability of error in each state,^6 which reflects the average SNR in this state and the modulation–demodulation techniques. Those probabilities will be different for different types of fading, since the same probabilities of the states of a Markov chain would require different quantization thresholds, thus producing different values of the average SNR in different states. Furthermore, using a state splitting technique [171], a hidden Markov model can be converted to a Markov model. However, due to the different error probability assignments, the results of such a conversion will be different for different types of fading.

8.4.4 A Rayleigh Channel with Diversity
Since the only characteristics required to obtain the Wang–Moayeri model are the marginal PDF of the signal and its level crossing rate, it is possible to adopt the same technique for the case of diversity channels with a single output signal. At this stage it can be realized that there would be a significant number of states which are described either by a very small probability of appearance or by a very small probability of errors. Both kinds of states could be excluded from further consideration to reduce the complexity of the model and speed up simulation.

^6 Thus leading to a Hidden Markov Model (HMM).
Following [91], one can consider a Rayleigh channel with N-branch pure selection combining with the following decision rule

R = \max(r_1, r_2, \ldots, r_N)   (8.186)

The simplest model assumes that the fading in different branches is independent and identically distributed. The cumulative distribution function of such a decision scheme is given by

F_R(R) = \mathrm{Prob}(r \le R) = \prod_{n=1}^{N}\mathrm{Prob}(r_n \le R) = \prod_{n=1}^{N}\left[1 - \exp\left(-\frac{R^2}{2\sigma_n^2}\right)\right] = \left[1 - \exp\left(-\frac{R^2}{2\sigma^2}\right)\right]^N   (8.187)

F_\gamma(\gamma) = \mathrm{Prob}(\mathrm{SNR} < \gamma) = \left[1 - \exp\left(-\frac{\gamma}{\bar\gamma}\right)\right]^N, \qquad \bar\gamma = 2\sigma^2   (8.188)

where \sigma_n^2 is the variance of the signal envelope in the n-th branch. Thus, the stationary probabilities of the states of the approximating Markov chain can be obtained as

p_k = F_\gamma(\gamma_{k+1}) - F_\gamma(\gamma_k) = \left[1 - \exp\left(-\frac{\gamma_{k+1}}{\bar\gamma}\right)\right]^N - \left[1 - \exp\left(-\frac{\gamma_k}{\bar\gamma}\right)\right]^N   (8.189)

It is shown in [91] that the level crossing rate of the combined signal is given by

LCR_N(\gamma) = N\prod_{n=2}^{N}\int p(\theta_{1n})\,d\theta_{1n}\prod_{n=2}^{N}\int_0^{\sqrt{\gamma}}p(r_n)\,dr_n\int_0^{\infty}\dot{r}_1\,p\left(\dot{r}_1\mid r_1 = \sqrt{\gamma}\right)d\dot{r}_1 = N\sqrt{\frac{2\pi\gamma}{\bar\gamma}}\,f_m\exp\left(-\frac{\gamma}{\bar\gamma}\right)\left[1 - \exp\left(-\frac{\gamma}{\bar\gamma}\right)\right]^{N-1}   (8.190)

LCR_N(\gamma_k) = N\sqrt{\frac{2\pi\gamma_k}{\bar\gamma}}\,f_m\exp\left(-\frac{\gamma_k}{\bar\gamma}\right)\left[1 - \exp\left(-\frac{\gamma_k}{\bar\gamma}\right)\right]^{N-1}   (8.191)

Here \theta_{1n} represents the phase of the signal in the n-th element of the antenna array. Combining eqs. (8.189) and (8.191) with eqs. (8.153) and (8.154), derived earlier for the elements of the transitional matrix T, one can obtain the desired Markov chain.
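The state probabilities (8.189) and the combined LCR (8.191) are direct to evaluate. For N = 1 the LCR reduces to the single-branch Rayleigh expression (8.166), which gives a convenient check; the thresholds and numerical values below are illustrative:

```python
import math

def state_probabilities(thresholds, gamma_bar, n_branches):
    """Eq. (8.189): stationary state probabilities for N-branch selection combining."""
    def cdf(g):                               # eq. (8.188)
        return (1.0 - math.exp(-g / gamma_bar)) ** n_branches
    return [cdf(thresholds[k + 1]) - cdf(thresholds[k])
            for k in range(len(thresholds) - 1)]

def lcr_combined(g, gamma_bar, f_m, n_branches):
    """Eq. (8.191): level crossing rate of the selection-combined signal."""
    e = math.exp(-g / gamma_bar)
    return (n_branches * math.sqrt(2.0 * math.pi * g / gamma_bar) * f_m
            * e * (1.0 - e) ** (n_branches - 1))

th = [0.0, 0.5, 1.0, 2.0, float('inf')]
p = state_probabilities(th, gamma_bar=1.0, n_branches=3)
single = lcr_combined(1.0, 1.0, 40.0, 1)      # N = 1: should match eq. (8.166)
```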
8.4.5 Fading Channel Models

Swarts and Ferreira carried out a comparison study [81] of the Gilbert, Elliot and Fritchman models as applied to GSM systems with the band-time product BT = 0.3, a maximum Doppler shift f_D = 40 Hz and average SNR levels \bar\gamma = 5, 10, 15 and 20 dB. The parameters of the models can be estimated based on the error distribution statistics obtained from the Jakes simulator and eqs. (8.141)–(8.146). Since only simple statistics were used, it appears that all three models perform almost identically.
Guan and Turner [47] suggested modelling the fading envelope by approximating it by a general N-state Markov chain of the first order. The interval of variation of the envelope [0, \infty) is divided into N sub-intervals by a set of N-1 points \lambda_n, 1 \le n \le N-1. It is suggested that the elements of the transitional probability matrix can be found from the joint probability density function p_{AA_\tau}(A, A_\tau; \tau) of the envelope considered at two separate moments of time

p(A_k \mid A_l) = \frac{p(A_k, A_l)}{p(A_l)} = \frac{\displaystyle\int_{\lambda_{k-1}}^{\lambda_k}\int_{\lambda_{l-1}}^{\lambda_l}p_{AA_\tau}(A, A_\tau; \tau)\,dA\,dA_\tau}{\displaystyle\int_{\lambda_{l-1}}^{\lambda_l}p_A(A)\,dA}   (8.192)
Such an assignment ensures that not only do the stationary probabilities of each state approximate the distribution of the continuous random process, but the second order statistics at two sequential moments of time are also well approximated. As an example of the application of their model, the authors of [47] consider modeling of a Nakagami distributed random envelope

p(A, A_\tau) = \frac{4(AA_\tau)^m}{\Gamma(m)(1-\rho)\,\rho^{\frac{m-1}{2}}}\left(\frac{m}{P}\right)^{m+1}\exp\left[-\frac{m(A^2 + A_\tau^2)}{P(1-\rho)}\right]I_{m-1}\left(\frac{2m\sqrt{\rho}\,A A_\tau}{P(1-\rho)}\right)   (8.193)

Here P = E\{A^2\} is the average power of the envelope (proportional to the average SNR in the AWGN channel), m \ge 0.5 is the Nakagami parameter and \rho is the correlation coefficient between two sequential samples of the envelope separated by the duration of a bit

\rho = J_0^2(2\pi f_D\tau)   (8.194)

While an analytical expression for the integral (8.192) cannot be obtained in quadratures, a good approximation has recently been derived [3]. To validate the model, a comparison with the Wang–Moayeri approximation (see Section 8.4.2 for more details) has been conducted for the case of m = 1, \tau = 10^{-5} s and f_D = 10, 100 Hz. A remarkable agreement between the two models can be observed: most of the values differ only in the fifth decimal digit. The authors also consider an interleaved channel and find that the approximation of the transitional probabilities calculated through the joint probability density (8.193) still produces good results, while the Wang–Moayeri approximation through the crossing rates becomes less accurate, due to the fact that transitions to non-neighbouring states appear with greater probability. At the same time, a larger time separation between samples, which is equivalent to a larger Doppler spread or a shorter correlation interval, also brings into consideration the non-Markov nature of Clark's model of fading. This clearly shows the limitations of simple Markov chain models for relatively fast fading. The question of choosing a proper order of the Markov chain is treated in [65]. Modeling of slow fading is considered in [72]. In this paper, the authors obtain measurements of the average signal intensity and its variation based on 1.9 GHz measurements collected in Austin, TX. The measurements were taken from a moving car. This data was later used to obtain fluctuating slow fading by means of numerical simulation of a random driving path, with the car turning randomly at each intersection. The value of the signal along each simulated path was then retrieved from the measurement database, filtered and decimated to produce a signal with a 1 Hz sampling rate. The slow fading signal trace obtained through this process was used as a reference signal for modeling.
An 80-state Markov chain was chosen to approximate the real data. Threshold levels were set in such a way that all states had equal probability p_i = 1/80 and spanned a region between 20 dB and 50 dB. The authors explained that such a choice would allow them to avoid situations when the chain resides in a few states for a long time. Transitional probabilities were estimated from the experimental data by counting the number of transitions between any two states and normalizing it by the total number of data points. The constructed model was then used to generate a sample of a Markov process and compare its characteristics with those obtained from the measurements. It was found that the cumulative distribution function was well approximated, since a sufficient number of states was selected. The authors also point out that the model is able to approximate the level crossing rate with reasonable accuracy. This can be explained by the fact that the model has been constructed based on the transitional probabilities, i.e. on level crossing rates. However, the model fails to approximate the power spectral density of the signal. The authors attribute this discrepancy to the fact that the actual channel does not have the Markov property. However, no further studies of the memory of the measurements were conducted. It is clear that more accuracy in Markov models can be achieved by both increasing the number of states and increasing the memory of the Markov process.^7 Such an increase in complexity is not desired for fast and approximate simulation. Rao and Chen [79] suggest a mechanism to reduce the number of states by aggregating a number of states into new superstates.

The authors use the lumpability property of a Markov chain to derive a two-state model of the fading process and compare the performance of such a simplified model with those produced by higher order models; in particular, a one-error-state Fritchman model^8 with four non-error states and different types of modulation schemes (FSK, DPSK, QPSK and 8-ary PSK), a hidden Markov model with six states (described in [173]) and the Wang and Moayeri model^9 with eight states. Many of the critics of Markov models point out that a simple Markov chain provides a poor fit to realistic correlation functions [170]. There is also an ongoing discussion on the applicability of modeling fading as a Markov process [166–170,173]. While most researchers agree that second order Markov models fit the canonical Jakes model of the envelope very well, there is no consensus on the applicability of simple Markov models. In the view of the authors of the current book, such models are at least very helpful in discovering general trends and in analyzing the performance of complicated codes.
8.4.6 Higher Order Models [170]
A number of methods allow one to extend Markov chain models to account for dependence between the future and the past of the process, given the current observation. Such dependence leads to a non-Markov nature of the process under consideration. Zorzi et al. suggest [170] that both the values of the random process and its derivative (velocity) are approximated by a Markov chain. If there are M levels m_i (0 \le i \le M-1) of the random process \gamma(t) to be considered, and there are S levels s_j (0 \le j \le S-1) of the velocity \dot\gamma(t) of the random process, then the Markov chain consists of M \cdot S states obtained
^7 That is, to consider Markov processes of order more than 1. See the chapter for more details.
^8 See Section 12.1.4.
^9 See Section 8.4.2.
as all possible pairs (m_i, s_j). It is quite clear that the complexity of such a description grows very quickly with the number of states. It is claimed in [170] that it is sufficient to choose only three levels of the velocity: S = 3, s_0 = 0, s_1 = -s_2 = \varepsilon. Here the velocity levels reflect the well known fact that, for many useful processes such as Rayleigh, Nakagami, etc., the velocity is a zero mean Gaussian process. A particular value of \varepsilon is chosen as a compromise between the accuracy of approximation of the flat early-time portion and of the oscillatory late-time portion of the correlation function of Clark's model. The authors proceed further to show an example of an ARQ system simulation that provides a significant improvement when compared to the model which approximates only the states of the process \gamma(t).
8.5
MARKOV CHAIN FOR DIFFERENT CONDITIONS OF THE CHANNEL
A number of applications use a finite state Markov chain to describe changing conditions of the communication channel [16]. It is characteristic of a wireless link to change if at least one of the mobile terminals changes position. For example, a pedestrian can move from a residential area, with relatively mild obstruction by two-storey wooden houses and trees, to either an open area with virtually no obstruction or to a downtown area with high rises completely shadowing the signal. Such dramatic changes can be modeled as separate states of a Markov chain with certain non-deterministic transitions. One such example is considered in [56], where a land mobile satellite service (LMSS) channel is modeled as a three-state Markov chain. All states have contributions from weak scattered waves. In addition, the state A has a contribution from a strong, unattenuated line-of-sight ray; the state B is characterized by significant attenuation of the line-of-sight; and, finally, the state C is described by a fully blocked line-of-sight. As a result, the fast fading statistics differ in each state. The probabilities of the states are P_A, P_B and P_C, respectively (Fig. 8.25). Of course

P_A + P_B + P_C = 1

(8.195)
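If the transitions between the channel conditions are specified by a transition probability matrix, the state probabilities satisfying (8.195) can be obtained as the stationary distribution of the chain. A minimal sketch (the transition matrix below is hypothetical, not taken from [56]):

```python
import numpy as np

# Hypothetical 3x3 transition matrix for states A, B, C (rows sum to 1)
P = np.array([[0.90, 0.08, 0.02],
              [0.10, 0.80, 0.10],
              [0.05, 0.15, 0.80]])

# The stationary distribution is the left eigenvector of P for the
# eigenvalue 1, normalized so that P_A + P_B + P_C = 1 as in (8.195).
w, v = np.linalg.eig(P.T)
p = np.real(v[:, np.argmin(np.abs(w - 1.0))])
p = p / p.sum()
print(p)
```

For an ergodic chain this left eigenvector is unique and strictly positive, so the normalization is well defined.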
The three-state model presented in [56] is considered more general than the two-state models of Lutz [164] (only states B and C were considered, with a non-Rayleigh model for the state C) and of Barts and Stutzman [165] (only states A and B were considered).

Figure 8.25 Single satellite link model suggested in [56].

In all states, the contribution from the weak diffuse scatterers is approximated by Rayleigh fading. In the state A, the Nakagami–Rice model is used. In the state B, the attenuation of the direct ray is described by a log-normal distribution; superimposed with Rayleigh fast fading, this results in the well known Loo model [163]. Fast fading in the state C is represented by Rayleigh fading alone. According to this model, the probability density function of the received signal is a weighted sum of the density functions in each state, with weights equal to the probabilities of the states

p(r) = P_A p_A(r; P_{r,A}) + P_B p_B(r; m_L, Σ_L, P_{r,B}) + P_C p_C(r; P_{r,C})
(8.196)
Here

p_A(r; P_{r,A}) = (2r/P_{r,A}) exp[-(1 + r^2)/P_{r,A}] I_0(2r/P_{r,A})    (8.197)

p_B(r; m_L, Σ_L, P_{r,B}) = [6.930 r/(Σ_L P_{r,B})] ∫_0^∞ (1/z) exp[-{20 log(z) - m_L}^2/(2Σ_L^2) - (r^2 + z^2)/P_{r,B}] I_0(2rz/P_{r,B}) dz    (8.198)

p_C(r; P_{r,C}) = (2r/P_{r,C}) exp(-r^2/P_{r,C})    (8.199)
and P_{r,A}, P_{r,B} and P_{r,C} are the mean powers of the multipath components normalized to the power of the direct ray; m_L and Σ_L are the mean and the standard deviation of the line-of-sight fading in dB. In the original paper [56] these values were obtained from measurements in different cities. Based on these measurements, the following approximation is suggested for the probability of the state A as a function of the elevation angle θ:

P_A = 1 - α(90° - θ)^2,    10° < θ < 90°    (8.200)

where α = 7.0 × 10^{-3} for urban areas and α = 1.66 × 10^{-4} for suburban areas.
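For numerical work the mixture density (8.196)–(8.199) is straightforward to evaluate. The sketch below uses SciPy; the exponentially scaled Bessel function i0e is used to avoid overflow, and all parameter values are hypothetical illustrations rather than the measured values of [56].

```python
import numpy as np
from scipy import integrate, special

def p_A(r, P):
    """Rician density (8.197), direct ray power normalized to 1.
    Rewritten with i0e(x) = exp(-x) I0(x) for numerical stability."""
    return 2 * r / P * np.exp(-(r - 1)**2 / P) * special.i0e(2 * r / P)

def p_B(r, mL, SL, P):
    """Loo density (8.198): log-normal LOS amplitude z (mL, SL in dB)
    averaged over a Rician conditional density."""
    f = lambda z: (1 / z) * np.exp(-(20 * np.log10(z) - mL)**2 / (2 * SL**2)
                                   - (r - z)**2 / P) * special.i0e(2 * r * z / P)
    val, _ = integrate.quad(f, 1e-9, np.inf)
    return 6.930 * r / (SL * P) * val

def p_C(r, P):
    """Rayleigh density (8.199)."""
    return 2 * r / P * np.exp(-r**2 / P)

def p_mix(r, probs, params):
    """Weighted mixture (8.196); probs = (P_A, P_B, P_C)."""
    PA, PB, PC = probs
    return (PA * p_A(r, params['P_rA'])
            + PB * p_B(r, params['mL'], params['SL'], params['P_rB'])
            + PC * p_C(r, params['P_rC']))

probs = (0.5, 0.3, 0.2)                                    # hypothetical
params = dict(P_rA=0.2, mL=-8.0, SL=3.0, P_rB=0.2, P_rC=0.3)
print(p_mix(1.0, probs, params))
```

A useful sanity check is that each component density integrates to one over r, so the mixture does as well for any weights satisfying (8.195).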
A similar idea has been suggested in [77]. In this paper the authors suggest that a satellite communication channel can find itself in three significantly different conditions, characterized by the Rayleigh, Rician and log-normal distributions of the signal's envelope. The transitions between the states are described by a Markov chain of order 3. The main contribution of this paper is the fact that the model is implemented on a TMS320C30 microprocessor and can be expanded to model wideband channels by means of parallel computations. One of the drawbacks of this method is that the number of processors required is equal to the number of rays to be simulated, thus increasing the cost of such simulators [174–188].
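A rough software sketch of such a state-switched simulator — a first-order three-state chain driving Rician, Loo-type and Rayleigh envelope draws — is shown below. The transition matrix, scattered powers and log-normal parameters are hypothetical and are not taken from [77] or [56].

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical transition matrix over states A (0), B (1), C (2)
T = np.array([[0.95, 0.04, 0.01],
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])
P_mult = np.array([0.1, 0.1, 0.3])   # scattered power per state (normalized)
mL, SL = -8.0, 3.0                   # log-normal LOS parameters for state B, dB

def draw_envelope(state):
    # complex scattered (Rayleigh) component of power P_mult[state]
    scat = (rng.standard_normal() + 1j * rng.standard_normal()) * np.sqrt(P_mult[state] / 2)
    if state == 0:        # A: unattenuated line-of-sight of unit amplitude
        los = 1.0
    elif state == 1:      # B: log-normally attenuated line-of-sight (Loo model)
        los = 10 ** ((mL + SL * rng.standard_normal()) / 20)
    else:                 # C: line-of-sight fully blocked
        los = 0.0
    return abs(los + scat)

N = 50_000
s = np.zeros(N, dtype=int)
for n in range(1, N):
    s[n] = rng.choice(3, p=T[s[n - 1]])
r = np.array([draw_envelope(st) for st in s])
print("state occupancy:", np.bincount(s, minlength=3) / N)
print("mean envelope per state:",
      [round(float(r[s == k].mean()), 3) for k in range(3)])
```

Unlike the hardware implementation of [77], all three conditional distributions here are drawn sequentially on one processor; the chain only selects which draw applies at each step.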
REFERENCES

1. B. Evans, Satellite Communication Systems, London: IEEE, 1999. 2. M. Jakes, Microwave Mobile Communications, New York: Wiley, 1974.
3. M. Simon and M. Alouini, Digital Communication over Fading Channels: A Unified Approach to Performance Analysis, New York: John Wiley & Sons, 2000. 4. COST 207, Digital Land Mobile Radio Communications, Final Report, Office for Official Publications of the European Communities, Luxembourg, 1989. 5. H. Suzuki, A statistical model for urban radio propagation, IEEE Trans. Communications, 25(7), 673–680, 1977. 6. M. Nakagami, The m-distribution—A general formula of intensity distribution of rapid fading, In Statistical Methods in Radio Wave Propagation, edited by W. C. Hoffman, New York: Pergamon, 1960, pp. 3–36. 7. C. Loo, A statistical model for a land mobile satellite link, IEEE Trans. Vehicular Technology, 34(3), 122–127, 1985. 8. A. Aboudebra, K. Tanaka, T. Wakabayashi, S. Yamamoto and H. Wakana, Signal fading in landmobile satellite communication systems: statistical characteristics of data measured in Japan using ETS-VI, Proc. IEE Microwaves, Antennas and Propagation, 146(5), 349–354, 1999. 9. P. Taaghol and R. Tafazolli, Correlation model for shadow fading in land-mobile satellite systems, IEE Electronics Letters, 33(15), 1287–1289, 1997. 10. H. Wakana, Propagation model for simulating shadowing and multipath fading in land-mobile satellite channel, IEE Electronics Letters, 33(23), 1925–1926, 1997. 11. R. Akturan and W. Vogel, Elevation angle dependence of fading for satellite PCS in urban areas, IEE Electronics Letters, 31(25), 2156–2157, 1995. 12. W. Vogel and J. Goldhirsh, Multipath fading at L band for low elevation angle, land mobile satellite scenarios, IEEE Journal Selected Areas in Communications, 13(2), 197–204, 1995. 13. F. Vatalaro, G. Corazza, C. Caini and C. Ferrarelli, Analysis of LEO, MEO, and GEO global mobile satellite systems in the presence of interference and fading, IEEE Journal Selected Areas in Communications, 13(2), 291–300, 1995. 14. Y. Chau and J. 
Sun, Diversity with distributed decisions combining for direct-sequence CDMA in a shadowed Rician-fading land-mobile satellite channel, IEEE Trans. Vehicular Technology, 45(2), 237–247, 1996. 15. C. Amaya, and D. Rogers, Characteristics of rain fading on Ka-band satellite-earth links in a Pacific maritime climate, IEEE Trans. Microwave Theory and Techniques, 50(1), 41–45, 2002. 16. P. Babalis and C. Capsalis, Impact of the combined slow and fast fading channel characteristics on the symbol error probability for multipath dispersionless channel characterized by a small number of dominant paths, IEEE Trans. Communications, 47(5), 653–657, 1999. 17. A. Ismail and P. Watson, Characteristics of fading and fade countermeasures on a satellite-earth link operating in an equatorial climate, with reference to broadcast applications, Proc. IEE Microwaves, Antennas and Propagation, 147(5), 369–373, 2000. 18. A. Verschoor, A. Kegel and J. Arnbak, Hardware fading simulator for a number of narrowband channels with controllable mutual correlation, IEE Electronics Letters, 24(22), 1367–1369, 1988. 19. G. Arredondo, W. Chriss and E. Walker, A multipath fading simulator for mobile radio, IEEE Trans. Communications, 21(11), 1325–1328, 1973. 20. E. Casas and C. Leung, A simple digital fading simulator for mobile radio, IEEE Trans. Vehicular Technology, 39(3), 205–212, 1990. 21. M. Pop and N. Beaulieu, Limitations of sum-of-sinusoids fading channel simulators, IEEE Trans. Communications, 49(4), 699–708, 2001. 22. P. Chengshan Xiao, Y. R. Zheng and N. C. Beaulieu, Second-order statistical properties of the WSS Jakes’ fading channel simulator, IEEE Trans. Communications, 50(6), 888–891, 2002. 23. M. Abramovitz and I. Stegun, Handbook of Mathematical Functions, New York: Dover, 1995. 24. K. Yip and T. Ng, A simulation model for Nakagami-m fading channels, m < 1, IEEE Trans. Communications, 48(2), 214–221, 2000. 25. V. Subotic and S. 
Primak, Markov approximation of a fading communications channel, Proc. 2002 European Microwave Week, Milan, 2002.
26. V. Subotic and S. Primak, Novel simulator of a sub-Rayleigh fading channel, IEE Electronics Letters, 2003. 27. S. Primak, V. Lyandres, O. Kaufman and M. Kliger, On generation of correlated time series with a given PDF, Signal Processing, 72(2), 61–68, 1999. 28. V. Kontorovich and V. Lyandres, Dynamic systems with random structure: an approach to the generation of nonstationary stochastic processes, Journal of The Franklin Institute, 336(6), 939– 954, 1999. 29. J. Weaver, V. Kontorovich and S. Primak, On analysis of systems with random structure and their applications to communications, Proc. VTC-2003, Fall, Orlando, Florida, 2003. 30. L. N. Kanal and A. Sastry, Models for channels with memory and their applications to error control, Proc. of the IEEE, 66(6), 724–744, 1978. 31. E. Gilbert, Capacity of a burst-noise channel, The Bell System Technical Journal, 1253–1265, 1960. 32. E. L. Cohen and S. Berkovits, Exponential distributions in Markov chain models for communication channels, Information and Control, 13, 134–139, 1968. 33. E. Elliot, Estimates of error rates for codes on burst-noise channels, Bell Systems Technical Journal, 42, 1977–1997. 34. P. Trafton, H. Blank and N. McAllister, Data transmission network computer-to-computer study, Proc. ACM/IEEE 2nd Symposium on Problems in the Optimization of Data Communication Systems, Palo Alto, October, 1971, pp. 183–191. 35. B. Fritchman, A binary channel characterization using partitioned Markov chains, IEEE Trans. on Information Theory, 13(2), 221–227, 1967. 36. I. Shji and T. Ozaki, Comparative study of estimation methods for continuous time stochastic processes, Journal of Time Series Analysis, 18, 485–506, 1997. 37. W. Press, S. Teukolsky, W. Vetterling and B. Flannery, Numerical Recipes in C, Cambridge, England: Cambridge University Press, 2002. 38. J. P. Adoul, B. Fritchman and L. Kanal, A critical statistic for channel with memory, IEEE Trans. Information Theory, 18(1), 133–141, 1972. 39. H. Blank and P. 
Trafton, ARQ analysis in channels with memory, National Telecommunication Conference, 1973, pp. 15B-1–15B-8. 40. J. Kemeny, Slowly spreading chains of the first kind, Journal of Mathematical Analysis and Applications, 15, 295–310, 1966. 41. S. Tsai, Evaluation of burst error correcting codes using a simple partitioned Markov chain, IEEE Trans. Communications, 21, 1031–1034, 1973. 42. P. McManamon, HF Markov chain models and measured error averages, IEEE Trans. Communication Technologies, 18(3), 201–208, 1970. 43. R. McCullough, The binary regenerative channel, Bell Systems Technical Journal, 47, 1713–1735, 1968. 44. H. Wang and N. Moayeri, Finite-state Markov channel—a useful model for radio communication channels, IEEE Trans. Vehicular Technology, 44(1), 163–171, 1995. 45. I. Oppermann, B. White and B. Vucetic, A Markov model for wide-band fading channel simulation in micro-cellular systems, IEICE Trans. Communications, E79-B(9), 1215–1220, 1996. 46. S. Tsai and P. Schmied, Interleaving and error-burst distribution, IEEE Trans. Communications, 20(3), 291–296, 1972. 47. Y. Guan and L. Turner, Generalized FSMC model for radio channels with correlated fading, IEE Proc. Communications, 146(2), 133–137, 1999. 48. H. Kong and E. Shwedyk, Markov characterization of frequency selective Rayleigh fading channels, IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 1995, pp. 359–362. 49. H. Viswanathan, Capacity of Markov channels with receiver CSI and delayed feedback, IEEE Trans. Information Theory, 45(2), 761–771, 1999.
50. J. Metzner, An interesting property of some infinite-state channels, IEEE Trans. Information Theory, 11(2), 310–312, 1965. 51. M. Muntner and J. Wolf, Predicted performance of error-control techniques over real channels, IEEE Trans. Information Theory, 14(5), 640–650, 1963. 52. F. Swarts and H. Ferreira, Markov characterization of channels with soft decision outputs, IEEE Trans. Communications, 41(5), 678–682, 1993. 53. C. Tan and N. Beaulieu, On first-order Markov modeling for the Rayleigh fading channel, IEEE Trans. Communications, 48(12), 2032–2040, 2000. 54. A. Haddad, S. Tsai, B. Goldberg and G. Ranieri, Interleaving and error-burst distribution, IEEE Trans. Communications, 23(11), 1189–1197, 1975. 55. H. Wakana, Propagation model for simulating shadowing and multipath fading in land-mobile satellite channel, Electronics Letters, 33(23), 1925–1926, 1997. 56. Y. Karasawa, K. Kimura and K. Minamisono, Analysis of availability improvement in LMSS by means of satellite diversity based on three-state propagation channel model, IEEE Trans. Vehicular Technologies, 46(4), 1047–1056, 1997. 57. H. Wang and P.-C. Chang, On verifying the first-order Markovian assumption for a Rayleigh fading channel model, IEEE Trans. Vehicular Technology, 45(2), 353–357, 1996. 58. A. Goldsmith and P. Varaya, Capacity, mutual information and coding for finite-state Markov channels, IEEE Trans. Information Theory, 42(3), 868–886, 1996. 59. W. Turin, Throughput analysis of the go-back-N protocol in fading channels, IEEE Journal Selected Areas in Communications, 17(5), 881–887, 1999. 60. S. Tsai, Markov characterization of the HF channel, IEEE Trans. Communication Technology, 17(1), 24–32, 1969. 61. Q. Zhang and S. Kassam, Hybrid ARQ with selective combining for fading channels, IEEE Journal Selected Areas in Communications, 17(5), 867–879, 1999. 62. M. Zorzi, R. Rao and L. Milstein, Error statistics in data transmission over fading channels, IEEE Trans. 
Communications, 46(11), 1468–1477, 1998. 63. M. Zorzi, A. Chockalingam and R. Rao, Throughput analysis of TCP on channels with memory, IEEE Trans. Vehicular Technology, 49(7), 1289–1300, 2000. 64. A. Chockalingam, W. Xu, M. Zorzi and B. Milstein, Throughput-delay analysis of a multichannel wireless access protocol in the presence of Rayleigh fading, IEEE Trans. Vehicular Technology, 49(2), 661–671, 2000. 65. F. Babich and G. Lombardi, A Markov model for the mobile propagation channel, IEEE Trans. Vehicular Technology, 49(1), 63–73, 2000. 66. F. Babich and O. Kelly, A simple digital fading simulator for mobile radio, IEEE Trans. Vehicular Technology, 39(3), 205–212, 1990. 67. H. Kong and E. Shwedyk, Sequence detection and channel state estimation over finite state Markov channels, IEEE Trans. Vehicular Technology, 48(3), 833–839, 1999. 68. Y. Kim and S. Li, Capturing important statistics of a fading/shadowing channel for network performance analysis, IEEE Journal of Selected Areas in Communications, 17(5), 888–901, 1999. 69. H. Bischl and E. Lutz, Packet error rate in the non-interleaved Rayleigh channel, IEEE Trans. Communications, 43(2–4), 1375–1382, 1995. 70. A. Chockalingam, M. Zorzi and L. Milstein, Performance of a wireless access protocol on correlated Rayleigh-fading channels with capture, IEEE Trans. Communications, 42(3), 868–886, 1996. 71. M. Zorzi, Outage and error events in bursty channels, IEEE Trans. Communications, 46, 349–356, 1998. 72. T. Su, H. Ling and W. Vogel, Markov modeling of slow fading in wireless mobile channels at 1.9 GHz, IEEE Transactions on Antennas and Propagation, 46(6), 947–948, 1998. 73. R. Rao, Higher layer perspectives on modeling the wireless channel, ITW, 1998, Killarney, Ireland, pp. 137–138.
74. W. Turin and R. van Nobelen, Hidden Markov modeling of fading channels, VTC 98, 1234–1238. 75. A. Fulinski, Z. Grzywna, I. Mellor, Z. Siwy and P. N. R. Usherwood, Non-Markovian character of ionic current fluctuations in membrane channels, Physical Review E, 58(1), 919–924, 1998. 76. H. Kong and A. Shwedyk, Sequence estimation for frequency non-selective channels using a hidden Markov model, IEEE Vehicular Technology Conference, 1251–1253, 1994. 77. J. An and M. Chen, Simulation techniques for mobile satellite channel, Proc. MILCOM ’94, IEEE, 1994, pp. 163–167. 78. A. Abdi, Comments on ‘‘On verifying the first-order Markovian assumption for a Rayleigh fading channel model’’, IEEE Trans. Vehicular Technology, 48(5), 1739, 1999. 79. A. Chen and R. Rao, On tractable wireless channel models, IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, 825–830, 1998. 80. A. Beverly and K. Shanmugan, Hidden Markov models for burst errors in GSM and DECT channels, GLOBECOM 1998, 6, 3692–3698, 1998. 81. J. Swarts and H. Ferreira, On the evaluation and application of Markov channel models in wireless communications, Proc. VTC 1999—Fall, 1999, pp. 117–121. 82. H. Steffan, Adaptive generative radio channel models, IEEE Conference on Personal, Indoor and Mobile Radio Communications, 1994, pp. 268–273. 83. F. Babich and G. Lombardi, On verifying a first-order Markovian model for the multi-threshold success/failure process for Rayleigh channel, IEEE Conference on Personal, Indoor and Mobile Radio Communications, PIMRC ’97, 1997, pp. 12–16. 84. N. Nefedov, Generative Markov models for discrete channel modeling, IEEE Conference on Personal, Indoor and Mobile Radio Communications, PIMRC ’97, 1997, pp. 7–11 85. Q. Cao and M. Gurkan, Markov model and delay analysis for group ALOHA systems, Proc. VTC1996, 3, 1996, pp. 1761–1765. 86. P. Soni and A. Chockalingam, Energy efficiency analysis of link layer backoff schemes on point-topoint Markov fading links, Proc. 
PIMRC 2000, 1, 416–420, 2000. 87. D. Huang and J. Shi, TCP over packet radio, Emerging Technologies Symposium: Broadband, Wireless Internet Access, 2000. 88. M. Hueda, A Markov-based model for performance evaluation in multimedia CDMA wireless transmission, Proc. IEEE-VTS Fall VTC 2000, 2, 2000, pp. 668–673. 89. M. Ouameur and D. Massicotte, A Markov chain and quadrature amplitude modulation fading based statistical discrete time model for multi-WSSUS multipath channel, Canadian Conference on Electrical and Computer Engineering, 2001, pp. 487–492. 90. B. Clerckx and D. Vanhoenacker-Janvier, Performance analysis of Rake receivers with maximum a posteriori channel state estimation over time varying frequency selective finite state Markov fading channels, Proc. IEEE RAWCON 2001, pp. 141–144. 91. J. Lu, K. Letaief, M. Liou and J. Chuang, Modeling correlated fading with diversity and its application to block codes performance estimation, Proc. IEEE GLOBECOM 1998, pp. 3657–3662. 92. D. Oosthuizen, H. Ferreira and F. Swarts, Markov models for block code error control systems on fading RF channels, Proc. COMSIG 1989, 1989, pp. 1–6. 93. T. Tao, J. Lu and J. Chuang, Hierarchical Markov model for burst error analysis in wireless communications, Proc. VTC 2001 Spring, 4, 2843–2847, 2001. 94. D. Oosthuizen, H. Ferreira and F. Swarts, On renewal inner channels and block code error control super channels, IEEE Trans. Communications, 42(9), 2645–2649, 1994. 95. C. Lee and Y. Su, Errors-and-erasures decoding of block codes in hidden Markov channels, Proc. IEEE VTC 2001 Spring, 1425–1429, 2001. 96. M. Hueda, On the equivalence of channel-state and block-error Markov models for the analysis of transmission over slow fading channels, Proc. IEEE VTC 2001 Fall, 2, 1043–1047, 2001. 97. J. Hsieh, H. Lin and C. Yang, A two-level, multi-state Markov model for satellite propagation channels, Proc. IEEE VTC 2001, 4, 3005–3009, 2001.
98. B. D. Carlson, and L.M. Goodman, Adaptive antenna system design for multi-element combining with transient interrupted channels, IEEE Trans. Vehicular Technology 51(5), 799–807, 2002. 99. R. D’Agostino and M. Stephens, Goodness-of-Fit Techniques, New York: Marcel Dekker, 1986. 100. M. Skolnik, A Review of Radar Sea Echo, Technical report, NRL 2025, Naval Research Laboratory, Washington, 1969. 101. E. Jakeman and P. Pusey, A model for non-Rayleigh sea echo, IEEE Trans. Antennas and Propagation, 40(4), 700–707, 1973. 102. E. Jakeman, Significance of K-distribution in scattering experiments, Physical Review Letters, 40(9), 546–550, 1978. 103. E. Jakeman, On statistics of K-distributed noise, Journal of Physics, A: Mathematics and General, 13(1), 31– 48, 1980. 104. D. Kreithen and G. Hagan, Statistical analysis of K-band sea clutter, Proc. Ocean-91, Hadlay, Hawaii, 128–132, 1991. 105. T. Nohara and S. Haykin, East coast radar trials and the K-distribution, IEE Proc. Radar and Signal Processing F, 138(2), 80–88, 1991. 106. F. Gini, M. Greco, M. Diani and L. Verrazzani, Performance analysis of two adaptive radar detectors against non-Gaussian real sea clutter data, IEEE Trans. Aerospace and Electronic Systems, 36(4), 1429–1439, 2000. 107. G. Tsihrintzis and C. Nikias, Evaluation of fractional, lower-order statistics-based detection algorithms on real radar sea-clutter data, IEE Proc. Radar, Sonar and Navigation, 144(1), 29– 38, 1997. 108. S. Haykin, R. Bakker and B. Currie, Uncovering nonlinear dynamics—the case study of sea clutter, Proc. of the IEEE, 90(5), 860–881, 2002. 109. C. Baker, K-distributed coherent sea clutter, IEE Proc. Radar and Signal Processing, 138(2), 89– 92, 1991. 110. T. Lamont-Smith, Translation to the normal distribution for radar clutter, IEE Proceedings Radar, Sonar and Navigation, 147(1), 17–22, 2000. 111. A. Farina, F. Gini, M. Greco and L. Verrazzani, High resolution sea clutter data: statistical analysis of recorded live data, IEE Proc. 
Radar, Sonar and Navigation, 144(3), 121–130, 1997. 112. T. Azzarelli, General class of non-Gaussian coherent clutter models, IEE Proc. Radar, Sonar and Navigation, 142(2), 61–70, 1995. 113. R. Tough, C. Baker and J. Pink, Radar performance in a maritime environment: single hit detection in the presence of multipath fading and non-Rayleigh sea clutter, IEE Proc. Radar and Signal Processing, 137(1), 33–40, 1990. 114. T. Hair, T. Lee and C. Baker, Statistical properties of multifrequency high range resolution sea reflections, IEE Proc. F, Radar and Signal Processing, 138(2), 75–79, 1991. 115. M. Sletten, D. Trizna and J. Hansen, Ultrawide-band radar observations of multipath propagation over the sea surface, IEEE Trans. Antennas and Propagation, 44 (5), 646, 1996. 116. G. Corsini, A. Mossa and L. Verrazzani, Signal-to-noise ratio and autocorrelation function of the image intensity in coherent systems. Sub-Rayleigh and super-Rayleigh conditions, IEEE Trans. Image Processing, 5(1), 132–141, 1996. 117. F. Gini and A. Farina, Vector subspace detection in compound-Gaussian clutter. Part I: survey and new results, IEEE Trans. Aerospace and Electronic Systems, 38(4), 1295–1311, 2002. 118. J. Nakayama, Generation of stationary random signals with arbitrary probability distribution and exponential correlation, IEICE Trans. Fundamentals, E77-A(5), 917–921, 1994. 119. G. Fikhtengolts, The Fundamentals of Mathematical Analysis, edited by I. Sneddon, Oxford: Pergamon Press, 1965. 120. J. Proakis, Digital Communications, New York: McGraw-Hill, 1995. 121. B. Mandelbrot, The Fractal Geometry of Nature, New York: WH Freeman, 1983. 122. H. Chan, Radar sea clutter at low grazing angles, IEE Proc. Radar and Signal Processing, 137(2), 102–112, 1990.
123. E. Conte, M. Longo, M. Lops and S. Ullo, Radar detection of signals with unknown parameters in K-distributed clutter, IEE Proc. F: Radar and Signal Processing, 138(2), 131–138, 1991. 124. C. Pimentel and I. Blake, Enumeration of Markov chain and burst error statistics for finite state channel models, IEEE Trans. Veh. Tech., 48(2), 415–427, 1999. 125. C. Leung, Y. Kikumoto and S. Sorensen, The throughput efficiency of the go-back-N ARQ scheme under Markov and related error structures, IEEE Trans. Communications, 36(2), 231–234, 1988. 126. H. Wang and P.-C. Chang, On verifying the first-order Markovian assumption for a Rayleigh fading channel model, IEEE Trans. Vehicular Technology, 45(2), 353–357, 1996. 127. U. Krieger, B. Muller-Clostermann and M. Sczittnick, Modeling and analysis of communication systems based on computational methods for Markov chains, IEEE J. Selected Areas in Communications, 8(9), 1630–1648, 1990. 128. A. Elwalid and D. Mitra, Effective bandwidth of general Markovian traffic source and admission control of high speed networks, IEEE/ACM Trans. on Networking, 1(3), 329–343, 1993. 129. J. Timmer and S. Klein, Testing the Markov condition in ion channel recording, Physical Review E, 55(3), 3306–3311, 1997. 130. M. Boguna, A. M. Berezhkovskii and G. Weiss, Residence time densities for non-Markovian systems. (I). The two-state systems, Physica A, 282, 475–485, 2000. 131. S. Sivaprakasam and K. S. Shanmugaan, An equivalent Markov model for burst errors in digital channels, IEEE Trans. Communications, 43, 1347–1355, 1995. 132. W. Turin, Fitting probabilistic automata via the EM algorithm, Commun. Statist.-Stochastic Models, 12(3), 405–424, 1996. 133. W. Turin and M. Sondhi, Modeling error sources in digital channels, IEEE Journal of Selected Areas in Communications, 11(3), 340–347, 1993. 134. H. Wang and P. Chang, Finite-state Markov channel—a useful model for radio communication channels, IEEE Trans. Vehicular Technology, 44, 163–171, 1995. 135. 
M. Zorzi and R. R. Rao, On the statistics of block errors in bursty channels, IEEE Trans. Communications, 45, 660–667, 1997. 136. H. Wang and N. Moayeri, Finite-state Markov channel—a useful model for radio communication channels, IEEE Trans. Vehicular Technology, 44(1), 163–171, 1995. 137. M. Zorzi and R. Rao, Bounds on the throughput performance of ARQ selective-repeat protocol in Markov channels, ICC’96, Dallas, Texas, 1996. 138. Y. M. Zorzi, R. Rao and L. Milstein, A Markov model for block errors on fading channels, Proc. PIMRC’96, Taiwan, 1996, pp. 1–5. 139. A. Anastasopuolus and K. Chugg, An efficient method for simulation of frequency selective isotropic Rayleigh fading, Proc. VTC, 1997, pp. 2084–2088. 140. P. Bello, A generic channel simulator for wireless channels, Proc. IEEE MILCOM, 1997, pp. 1575– 1579. 141. G. Burr, Wide-band channel modeling, using a spatial model, Proc. IEEE 5-th Int. Symp. on Spread Spectrum Tech. and Applications, 1998, pp. 255–257. 142. L.Clavier and M. Rachdi, Wide band 60 GHz indoor channel: characterization and statistical modeling, Proc. IEEE VTC, pp. 2090–2102. 143. S. Fechtel, A novel approach to modeling and efficient simulation of frequency selective radio channels, IEEE Journal on Selected Areas in Communications, 11(3), 422– 431, 1993. 144. J. Mastrangelo, J. Lemmon, L. Vogler, J. Hoffmeyer, L. Pratt and C. Behm, A new wideband high frequency channel simulation system, IEEE Trans. on Communications, 45(1), 26–34, 1997. 145. R. Parra-Michel, V. Kontorovich and A. Orozco-Lugo, Modeling wideband channels using orthogonalizations, IEICE Trans. on Electronics and communications, E85-C(3), 544–551, 2002. 146. R. Parra-Michel, V. Kontorovich and A. Orozco-Lugo, Simulation of wideband channels with nonseparable scattering functions, Proc. of ICASSP-2002, 2002, pp. 111.2829–111.2832. 147. K. Yip and T. 
Ng, Karhunen–Loève expansion of WSSUS channel output and its applications to efficient simulation, IEEE Journal on Selected Areas in Communications, 15(4), 640–646, 1997.
148. T. Zwick, D. Didascalou and W. Wysocki, A broadband statistical channel model for SDMA applications, Proc. of the 5th Int.Symp.on Spread Spectrum Tech. and Applications, 2, 527–531, 1998. 149. A. Chockalingam, M. Zorzi, L. Milstein and P. Venkataram, Performance of a wireless access protocol on correlated Rayleigh-fading channels with capture, IEEE Transactions on Communications, 46(5), 644–654, 1998. 150. E. Conte, M. Longo and M. Lops, Modeling and simulation of non-Rayleigh radar clutter, IEE Proc. F: Radar and Signal Processing, 138(2), 121–130, 1991. 151. E. Conte, M. Longo and M. Lops, Performance analysis of CA-CFAR in the presence of compound Gaussian clutter, IEE Electronics Letters, 24(13), 782–783, 1988. 152. M. Rangaswamy, D. Weiner and A. Ozturk, Non-Gaussian random vector identification using spherically invariant random processes, IEEE Trans. Aerospace and Electronic Systems, 29(1), 111–124, 1993. 153. M. Rangaswamy, D. Weiner and A. Ozturk, Computer generation of correlated non-Gaussian radar clutter, IEEE Trans. Aerospace and Electronic Systems, 31(1), 106–116, 1995. 154. R. L. Stratonovich, Topics in the Theory of Random Noise, New York: Gordon and Breach, 1967. 155. D. Klovsky, V. Kontorovich and S. Shirokov, Models of Continuous Communications Channels Based on Stochastic Differential Equations, Moscow: Radio i sviaz, 1984 (in Russian). 156. V. Pugachev and I. Sinitsin, Stochastic Differential Systems: Analysis and Filtering, New York: Wiley, 1987. 157. A. N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations, Moscow: Sovetskoe Radio, 1978 (in Russian). 158. F. Babich and G. Lombardi, A measurement based Markov model for the indoor propagation channel, Proc. IEEE VTC-1997, 77–81, 1997. 159. W. Feller, An Introduction to Probability Theory and Its Applications, 3rd Edition, New York: John Wiley & Sons, 1968. 160. L. Sharf and R. 
Behrens, First Course in Electrical and Computer Engineering: With MATLAB Programs and Experiments, Reading, MA: Addison-Wesley, 1990. 161. P. Stoica and R. Moses, Introduction to spectral analysis, Upper Saddle River, NJ: Prentice Hall, 1997. 162. E. Parzen, Stochastic Processes, San Francisco: Holden-day, 1962. 163. C. Loo, Digital transmission through a land mobile satellite channel, IEEE Trans. Communications, 38(5), 693–697, 1990. 164. E. Lutz, D. Cygan, M. Dippold, F. Dolainsky and W. Papke, The land mobile satellite communication channel—recording, statistics and channel model, IEEE Trans. Vehicular Technology, 40, 375–386, 1991. 165. R. Barts and W. Stutzman, Modeling and simulation of mobile satellite propagation, IEEE Trans. Antennas and Propagation, 40(4), 375–382, 1992. 166. J. Pierce, A Markov envelope process, IRE Transactions on Information Theory, 4, 163–166, 1958. 167. C. Helmstrom and C. Isley, Two notes on a Markov envelope process, IRE Transactions on Information Theory, 5, 139–140, 1959. 168. J. Pierce, Further comments on A Markov envelope process, IRE Transactions on Information Theory, 5, 186–189, 1958. 169. A. Gray, On Gaussian noise envelopes, IEEE Trans. Information Theory, 16(5), 522–528, 1970. 170. P. Bergamo, D. Maniezzo, A. Giovanardi, G. Mazzini and M. Zorzi, Improved Markov model for Rayleigh fading envelope, Electronics Letters, 38(10), 477–478, 2002. 171. W. Turin, Performance Analysis of Digital Transmission Systems, New York: Computer Science Press, 1990. 172. M. Pham, Performance Evaluation of LMDS Systems Supporting Mobile Users, Ms. Eng. Sci. Thesis, UWO, 2001.
173. S. Sivarprakasam and K. Shanmugan, An equivalent Markov model for burst errors in digital channels, IEEE Trans. Communications, 43(2–4), 1347–1355, 1995. 174. K. Watkins, Discrete Event Simulation in C, New York: McGraw-Hill, 1993. 175. T. Robertazzi, Computer Networks and Systems, New York: Springer, 2000. 176. A. Adas, Traffic models in broadband networks, IEEE Communication Magazine, 35(7), 82–89, 1997. 177. A. Baiocchi, F. Cuomo and S. Bolognesi, IP QoS delivery in a broadband wireless local loop: MAC protocol definition and performance evaluation, IEEE J. Selected Areas in Communications, 18(9), 1608–1622, 2000. 178. C. Ng, L. Yuan, W. Fu and L. Zang, Methodology for traffic modeling using two-state Markovmodulated Bernoulli process, Computer Communications, 99(3), 1266–1273, 1999. 179. D. Goodman and S. Wei, Efficiency of packet reservation multiple access, IEEE Transactions on Vehicular Technology, 40(1), 170–176, 1991. 180. D. Goodman, R. Valenzuela, K. Gayliard and B. Ramamurthi, Packet reservation multiple access for local wireless communication, IEEE Transactions on Communications, 37(8), 885–890, 1989. 181. F. Cali, M. Conti and E. Gregori, Dynamic IEEE 802.11: design, modeling and performance evaluation, Lecture Notes in Computer Science, CNUCE, Institute of National Research Council Via Alfiery 1–56127, Pisa, Italy. 182. F. Babich and G. Lombardi, A Markov model for the mobile propagation channel, IEEE Transactions on Vehicular Technology, 49(1), 63–73, 2000. 183. G. Sater, IEEE 802.16.1mc-00/21r, Media Access Control Layer Proposal for the 802.16.1 Air Interface Specification, July 07, 2000, http://www.ieee802.org/16/mac/index.html. 184. G. L. Stuber, Principles of Mobile Communication Second Edition, Norwell: Kluwer Academic Publishers, 2001. 185. Hao Xu, T. S. Rappaport, R. J. Boyle and J. H. 
Schaffner, Measurements and models for 38-GHz point-to-multipoint radiowave propagation, IEEE Journal Selected Areas in Communications, 18(9), 310–321, 2000. 186. S. Haykin, Communication Systems, New York: John Wiley & Sons, 1994. 187. J. Verhasselt and J. P. Martens, A fast and reliable rate of speech detector, Proceedings ICSLP-96, 4, 2258–2261, 1996. 188. J. Yee and E. Weldon, Jr., Evaluation of the performance of error-correcting codes on a Gilbert channel, IEEE Transactions on Communications, 43(8), 2316–2323, 1995.
INDEX

Analytical Signal 154
Average fade duration 145, 151
Birth process 223
Blanc-Lapierre transform 159, 348
Chapman equation 196, 204, 207, 214
Characteristic function 10, 15, 27, 63, 116, 228, 254, 343, 367
Clutter
  K-distributed 397
  Weibull 404
Compound process 181
Conditional PDF 28, 62, 182, 287
Correlation coefficient 31, 74
Correlation interval 76–77, 133, 255, 290, 322, 380
Covariance 31
  mutual 72
  of a modulated carrier 87
  of a periodic RP 85
  of a sum and product of RV 84
  of WGN 95
  properties 73
  relation to PSD 80, 83
Covariance function 67, 69, 71–72
  Jakes’ 379
  of process 322
  of ω 322
  of a Markov diffusion process 239
  of Gaussian Markov process 241
  of narrow band process 351
  of telegraph signal 200
  of the Wiener process 254
Covariance matrix 89, 181, 268
Cumulant
  brackets 48, 55
  definition 11, 68
  generating function 11
  linear transformation 135
non-linear transformation 52, 112 of a random vector 30 of a sum of random variables 38 relation to moments 16, 48, 68 Cumulant equations 49, 51, 118, 289 Death process 224 Diffusion 229, 235, 312, 330 Diffusion process 217, 260 Drift 215, 235, 312, 322, 330, 335 Envelope 154, 156, 170, 177, 328, 348, 380 Error flow 388 Fading 43, 44, 140, 149, 377 Loo 378 Nakagami 378, 382, 385, 391 Rayleigh 41, 44, 76, 384 Rice 41 Suzuki 378 Fokker–Planck equation 217, 227, 393 boundary conditions 231 derivation 227 generalized 287 large intensity approximation 302, 310 methods of solution 236 small intensity approximation 297, 304 FPE, boundary conditions 219, 231, 288, 326 absorbing 219, 232–233 natural 232–233 periodic 232–233 reflecting 232 Gaussian process 71, 88, 115, 130, 146, 155, 166, 240, 270, 298 Gaussian RV 32–34 Hilbert transform 154, 358, 400 Independence 29, 31, 44, 87, 89, 155, 282, 419
Stochastic Methods and Their Applications to Communications. S. Primak, V. Kontorovich, V. Lyandres © 2004 John Wiley & Sons, Ltd ISBN: 0-470-84741-7
Joint Probability Density 26, 29, 32, 88, 146, 149, 239, 280
Kolmogorov equation 218, 227, 235
Kolmogorov–Feller equations 285–287, 369
Kurtosis 18, 49, 120
Level crossing rate 140, 144, 149, 172, 410, 411
Log-characteristic function 10, 30, 77, 343
Markov chain 190
    ergodic 197
    Fritchman 419
    Gilbert–Elliot 408
    homogeneous 196
    models 408
    of order n 193
    simple 190, 287, 392
    stationary 196
    vector 193
    Wang–Moayeri 409
Markov process 212–213
    binary 209
    classification 189–190
    diffusion 215
    ergodic 214
    homogeneous 209
    with jumps 221
    with random structure 276
Markov sequence 190, 203
Master equation 217
Moment equations 51, 289
Moments 12, 64
Orthogonal polynomials 23, 115, 296, 345
PDF
    Cauchy 184
    Gaussian, see Gaussian Process
    K 173, 184, 368, 383, 397
    Nakagami 149, 177, 241, 328, 360, 378
    Poisson 9, 136
    Rayleigh 41, 262, 333, 361, 377
    Rice 41, 378
    Two-sided Laplace 184, 325
    uniform 10, 25, 41, 85
    Weibull 333, 397
Pearson PDF 10, 65, 115, 290, 322
Phase 25, 39, 65, 81, 126, 154, 168, 327
Poisson PDF 9
Poisson process 136, 221, 285, 345, 369, 390
Process with jump 215, 217, 221, 285, 296, 335, 339
Process with random structure 275
    classification 279
    examples 290, 292, 298, 300, 309, 312, 313, 391
    Fokker–Planck equation 281, 288
    high intensity approximation 302, 310
    low intensity approximation 298, 304
    moments equation 288
    statistical description 280
Random telegraph process 120, 200
Schrödinger equation 244
SDE 245
    Ito form 251
    relation to Fokker–Planck equation 251, 266
    relation to Kolmogorov–Feller 369
    Stratonovich form 251
    vector 266
Skewness 18
Smoluchowsky equation 195, 227, 339
Spherical invariant process 65, 181
Stochastic integrals 246
    Ito 248
    rules of change of variable 250
    Stratonovich 249
Stratonovich 249, 252
Temporal symmetry 257
Tikhonov 358
Transitional matrix 196, 278, 394, 408, 412, 414
Transitional PDF 199, 204, 280
Vector random variable and process 26, 112, 182, 205, 252, 263, 268
White Gaussian noise 88, 95, 246, 250, 252, 321, 379, 393
Wiener–Khinchin theorem 80
Yule–Furry process 224