+ 1T) will be negligible in comparison to -F( tPo) and, in (35c), "Ii, Qi, and 13; are defined by (14a)-(14c) with unless 1/10 is near ~ep + 1T. Thus, P{1/1 ~ l/Jo} = -F( l/Jo) in t/Ji= 1/10· Cases I and II, and in Case III, when YJo is not near ~q; + n, B. l/J.o- Ll 1 and 0, >'m > 0, ..., eJ>pk > o. If ri > o for 1 ~ i ~ n - 1, then node n has a path to each i, 1~ i ~ n - 1. By the definition of routing variables, each t has a path to n and consequently is irreducible. Thus (AI) has a unique solution, with positive t., if'i > 0 for 1 ~ i ~ n - 1. . Now let t = (f1 , ..., tn-I), r = (rl, ..', Tn-I), and let cI> be the n - 1 X n - I matrix with components ep,i (1 ~ i. I ~ n - 1). Equation (At) for 1 ~ i ~ n - lIs then t(l - <1» = r. Since this equation has a unique solution for 'i > 0, I - ~ must have an inverse, and (A4) Since the components of t are positive when the components of r are positive, components of t are nonnegative when the components of r are nonnegative. Differentiating (A4), we get the continuous function of and let m ~ 2 be the smallest integer such that 4>(m -1) ~ 4»3. We first show that m ~ n. Note first that for any l/J E <1>3, A changes l/Jj(i,k) only for i, j such that ti(J) = 0 and thus the node flows and link flows cannot change. aDT/a,;(j) can change, however, and as we shall see later, must change for some i, j if r/> does not minimize DT . Now consider 4>(1) (0 ~ 1 ~ m - 2, where ¢(O) denotes the original ct». Since >(l} E 4>3, A i k (l)(J) > 0 implies that t i ( /) = O. From (12), ¢ik(l)(/) = ilik(l)(j) and ' - (/>' I < €, and thus DT(Am'(q1) < DT(q'>'). Since (/>, = Am(<J» for some m, DT(Am '+m(l!J») < DT(cp')), contradicting (C45) and completing the proof. Proof of Theorem 4: Let be the set of loop free routing variables cp such that DT ( cf» ~ Do. We have verified that A maps loop free routing variables into loop free routing vari.. ables, and from Lemma 5, DT(A «(/») ~ D T ( cf» for tP E
"'0
$$W \cos(\Delta\Phi - \psi_i) > U(r \cos\psi_i + \lambda \sin\psi_i). \qquad (32)$$

The left-hand side of this inequality is largest when $\Delta\Phi - \psi_i = 0$. Furthermore, at $\psi_i = \Delta\Phi$, the minimum of $\alpha_i - \beta_i$ is zero, as can be seen from (14c), viz.
$$\alpha_i - \beta_i = \frac{W^2 \sin^2(\Delta\Phi - \psi_i)}{(\alpha_i + \beta_i) \sin^2\gamma_i}. \qquad (33)$$
Assuming the conditions for a minimum within the range of integration to be fulfilled, $\cos\theta$ in (13) can be approximated by $1 - \theta^2/2$, the range of integration extended to $\pm\infty$, and use made of the identity$^{9}$ [1, eq. (7.4.11)]
/~ -00
e-at'2 2
x +t
2
ii) Case II:
P{ '" ~ ~ + rr/2}
= ~ [1 -
(37)
(W/li)le(V/U, U)].
These follow directly from (15) and (16).
C. VIa - tiff>Near 1f/2
Simple asymptotic formulas in the vicinity of the point + 1f/2 evidently cannot be obtained, except in Case I, without further assumptions about the relative sizes of the system parameters. Fot Cases I and II, the asymptotic formulas are i) Case I: 1/1 0 = d4>
P{l/! ~ l/J o} ~ ie- U sin 1/10 [10 (U cos t/Jo)
+ Lo(U cos 1Po)] (38)
dt
2
=-x
'If
tJx erfc (xVi)
(34)
which then leads to the asymptotic formuias 10 i) Case 1:
1 + cos 1/1 0 - - - - erfe VU(1 - cos Vlo) 8 cos VJo
ii) Case II:
P{ 1/1 ~ i1
1/10 } '"
W -u . 1/1 /'((/2-6 0 e sm 0 dOePo c o S 8 4rrU
ii) Case II:
(35b)
1r /2 -
()0
(39)
where (30 = sgn (cos 1JJoWU cos 1/;0 + V sin VJo, 80 = sin- 1 (V/{3o), and V/U ~ 1. t o(-) is a modified Struve function [1, ch. 12]. These last two equations follow directly from (10) and (11) by approximating the denominators by their values at l/J = A + 1(/2. Equation (39) is not much simpler than the exact expression (13) and, hence, for computational purposes (13) is to be preferred. 2
(35a)
-
D. Large t/Jo -
2
2
2
~
Since the left-hand side
of (32) is negative for VI; - dq, near
n, (3i will have a negative sign in Cases I and II and also in
iii)Case III: P{ '" ~ l/Jo} '"
.
JOti+~i - - erfc va; 8~i
r sin VJa - A cos VJo sin '1f
f3i e-(cx;-fl;)
y81T(3i
(35c)
9 $\operatorname{erfc} z = (2/\sqrt{\pi}) \int_z^{\infty} \exp(-t^2)\, dt$.
10 Equations (35a), (35b), and (35c) can be further simplified with, however, loss of some accuracy, by use of the asymptotic behavior of the complementary error function; i.e., $\operatorname{erfc} z \sim (1/z\sqrt{\pi}) \exp(-z^2)$ for large $z$.
Case III for small correlations rand i\. Asymptotic formulas in these cases can be obtained directly from (13) by. noting that the major contributions to the integrals come from the endpoints; i.e., the minima of the exponent with a negative {3i. Expanding about these points, we are led to the asymptotic formulas (cf. [8, Appendix]): i) Case I: e: U ( tan 1/1 )
P{t/J~Vlo}~
-
(40)
0
21TU
ii) CaseII:
P{l/J ~ 1/Jo}~
e: U tan (d
2
(Ucosh V+ Vsinh V) (41)
Fifty Years of Communications and Networking
iii) Case III: Although the asymptotic formula can be worked out in this case, the end result is long and cumbersome and will not be given here in detail. It can be shown that

$$P\{\psi \ge \psi_0\} \sim F(\Delta\Phi + \pi) - F(\psi_0) \qquad (42)$$

where the asymptotic form of $F(\Delta\Phi + \pi)$ is the sum of two nonzero terms and that of $F(\psi_0)$ is composed of the sum of four.
IV. APPLICATIONS
This section presents some illustrative applications of the preceding results. In the first of these, new formulas for the symbol error probability in MDPSK are given, and are compared with previously known formulas. The second example shows how the distribution of the instantaneous frequency of a narrow-band wave can be obtained from the Case III simplified representation. In the third application, use of the Case III distribution is illustrated in the calculation of the bit error rate performance of narrow-band digital FM, with limiter-discriminator detection and partial-bit integration postdetection filtering. These results are also compared with those for an ideal frequency detector by making use of the results for the distribution of instantaneous frequency obtained in the second application. In a fourth application, a simplified expression is obtained for the BER in DPSK with a phase error in the reference signal. The final example considers the error probability in MPSK.
A. Symbol Error Rate in MDPSK
In M-ary differential phase shift keying (MDPSK), information is transmitted as the phase difference between two consecutive sinusoidal pulses, and thus the receiver decision variable can be viewed as being the phase difference between two vectors perturbed by Gaussian noise. For a maximum a posteriori probability receiver employing differentially coherent detection of equiprobable, equal energy MDPSK signals, the probability of a symbol error is [6, ch. 5]

$$P_E(M) = 2 \int_{\pi/M}^{\pi} p(\psi)\, d\psi \qquad (43)$$
in which the noise has been taken to be uncorrelated at the sampling instants. Using the results of Section II, for the Case I distribution with $\psi_2 = \pi$ and $\psi_1 = \pi/M$, we find the symbol error probability$^{11}$ to be

$$P_E(M) = \frac{\sin(\pi/M)}{2\pi} \int_{-\pi/2}^{\pi/2} \frac{e^{-\rho[1 - \cos(\pi/M)\cos t]}}{1 - \cos(\pi/M)\cos t}\, dt \qquad (44)$$

where $\rho$ is the symbol signal-to-noise ratio, related to the signal energy per bit-to-noise spectral density ratio by

$$\frac{E_b}{N_0} = \frac{\rho}{\log_2 M}. \qquad (45)$$

When $M = 2$, (44) gives, on inspection, $P_E(2) = \exp(-\rho)/2$. For large $\rho$, the asymptotic formula$^{12}$ for $P_E(M)$ follows from the results of Section III-A for Case I and small $\psi_0$, and is [cf. (35a)]

$$P_E(M) \sim \sqrt{\frac{1 + \cos(\pi/M)}{2\cos(\pi/M)}}\; \operatorname{erfc}\sqrt{\rho[1 - \cos(\pi/M)]}; \qquad M \ge 3. \qquad (46)$$

Several other approximations have been developed for $P_E(M)$ and are listed here for comparison with the above result. For large $\rho$, Fleck and Trabka [5] derived the approximation

$$P_E(M) \approx \operatorname{erfc} X + \frac{X e^{-X^2}}{4\sqrt{\pi}\,(\rho + 1/8)} \qquad (47)$$

where $X = \sqrt{\rho[1 - \cos(\pi/M)]}$. Arthurs and Dym [2] later gave the estimate

$$P_E(M) \approx \operatorname{erfc}\left[\sqrt{\rho}\, \sin\frac{\pi}{\sqrt{2}\,M}\right] \qquad (48)$$

and, finally, Bussgang and Leiter [3] obtained the upper bound
(49)
11 We are indebted to M. K. Simon for pointing out the following relation between the bit and symbol error probabilities in the case of quaternary DPSK, $M = 4$. In DQPSK, a symbol error results in a single bit error whenever $\pi/4 \le |\psi| < 3\pi/4$, and results in two bit errors when $3\pi/4 < |\psi| \le \pi$. Using these, we can show that the bit error probability $P_B(4)$ is related to the symbol error probability $P_E(4)$ by $P_B(4) = (1/2)P_E(4) + P\{3\pi/4 \le \psi \le \pi\}$. The probability $P\{3\pi/4 \le \psi \le \pi\}$ can be evaluated exactly from (9) and (10); however, it is sufficient for our purposes to use the asymptotic form given by (40), namely, $P\{3\pi/4 \le \psi \le \pi\} \sim e^{-\rho}/(2\pi\rho)$. Consequently, for the region of interest ($\rho$ greater than 10 dB), $P_B(4) \approx (1/2)P_E(4)$ with little error. This approximation is then better than the commonly used $P_B(4) = (2/3)P_E(4)$, which corresponds to orthogonal signals; i.e., $P_B(M) = M P_E(M)/[2(M - 1)]$ for $M = 4$.
12 Another asymptotic formula can be obtained from (44) by noting that the limits of integration can be extended to $\pm\pi$ with an error that becomes vanishingly small as $\rho$ gets infinitely large. This gives the asymptotic formula $P_E(M) \sim 1 - \sin(\pi/M)\, Ie(\cos(\pi/M), \rho)$; $M \ge 3$, which has been obtained previously when $M = 4$ [11, p. 247].
THE BEST OF THE BEST
which is also a good approximation for large $\rho$. Although all of these three approximations are good for large signal-to-noise ratios, none is asymptotic in the sense that the error in using it becomes vanishingly small as $\rho$ becomes infinitely large. The Fleck and Trabka approximation is closest in form to the asymptotic formula, (46). Neglecting the 1/8 in the denominator of (47), employing the asymptotic expansion for the erfc function in both (46) and (47), and then forming the quotient of these two gives (for $M \ge 3$)

$$\frac{P_E(M),\ \text{Fleck and Trabka}}{P_E(M),\ \text{eq. (46)}} \sim \frac{\sqrt{\cos(\pi/M)}\, [1 + \tfrac{1}{2}\sin^2(\pi/2M)]}{\cos(\pi/2M)} \qquad (50)$$

which is independent of $\rho$. For $M$ = 3, 4, 8, and 16, values of this ratio are

M    3       4       8       16
Q    0.9186  0.9768  0.9987  0.9999
Hence, although Fleck and Trabka's result is not asymptotic in the strict sense of the term, it differs in its leading term from the asymptotic formulas by less than 9 percent in the worst case when $M = 3$, and by less than 3 percent in the more ordinary situation when $M$ is a power of 2 [$M > 2$ is tacitly assumed, since an exact expression is known for $P_E(2)$].
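The comparison above can be checked numerically. The following is a minimal sketch, not the authors' program: it assumes the forms of (44), (46), (47), and (50) as they read in this section, evaluates (44) with a plain trapezoidal rule, and uses function names (`pe_mdpsk_exact`, `pe_mdpsk_asymptotic`, `pe_fleck_trabka`, `ratio_q`) that are introduced here purely for illustration.

```python
import math


def pe_mdpsk_exact(M, rho, n=2000):
    """Eq. (44): trapezoidal-rule estimate of the MDPSK symbol error probability."""
    c = math.cos(math.pi / M)
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / n

    def integrand(t):
        d = 1.0 - c * math.cos(t)
        return math.exp(-rho * d) / d

    total = 0.5 * (integrand(a) + integrand(b))
    total += sum(integrand(a + i * h) for i in range(1, n))
    return math.sin(math.pi / M) / (2 * math.pi) * h * total


def pe_mdpsk_asymptotic(M, rho):
    """Eq. (46): large-rho asymptotic form, intended for M >= 3."""
    c = math.cos(math.pi / M)
    return math.sqrt((1 + c) / (2 * c)) * math.erfc(math.sqrt(rho * (1 - c)))


def pe_fleck_trabka(M, rho):
    """Eq. (47): the Fleck and Trabka approximation."""
    x = math.sqrt(rho * (1 - math.cos(math.pi / M)))
    return math.erfc(x) + x * math.exp(-x * x) / (4 * math.sqrt(math.pi) * (rho + 0.125))


def ratio_q(M):
    """Eq. (50): limiting ratio of (47) to (46), independent of rho."""
    half = math.pi / (2 * M)
    return math.sqrt(math.cos(math.pi / M)) * (1 + 0.5 * math.sin(half) ** 2) / math.cos(half)


if __name__ == "__main__":
    for M in (3, 4, 8, 16):
        print(M, round(ratio_q(M), 4))
    rho = 10.0  # symbol signal-to-noise ratio (linear, not dB)
    for M in (4, 8):
        print(M, pe_mdpsk_exact(M, rho), pe_mdpsk_asymptotic(M, rho), pe_fleck_trabka(M, rho))
```

Under these assumptions, `ratio_q` should reproduce the tabulated ratios to four decimals, and `pe_mdpsk_exact(2, rho)` should agree with $\exp(-\rho)/2$. The number of trapezoidal panels `n` is an arbitrary choice, not a value taken from the paper.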
as T ~ 0 is to note first from (14a) that 1i -+ 11' as T -+ o. Then the limits of integration in the integrals of (13) are over 2rr ranges and, consequently, 1) the values of 0; in (13) are unimportant, 2) the algebraic sign of 13; is unimportant, and 3) the integrals in (13) are expressible in terms of /0 and Ie functions as in (18). Thus, employing (13) to evaluate the limit of F{WOT) as T~ 0, and usingthe result in (52), we find (cf, [11, eq. (6.6)]) P{w~wo}
= ~ [1 - sgn (c:P - wo)v'I-(32/Ot2/e(~/0I., 01.)]
+
B. Distribution of Instantaneous Frequency
The distribution of the instantaneous frequency of a narrow-band signal may be regarded as a limiting case of the distribution of the phase angle between two vectors perturbed by Gaussian noise, if the vectors are appropriately regarded as samples of a narrow-band waveform with vanishingly small time separation. The Case III simplified representation of Section II-C for $P\{\psi_1 \le \psi \le \psi_2\}$ provides a convenient mechanism for readily determining the probability $P\{\omega \le \omega_0\}$ that the instantaneous radian frequency lies below some value $\omega_0$. In the remainder of this section, we will consider only the case that the noise power is not time-dependent so that $\sigma_1 =$
in which and are
w -~
-(XI (f3)
0
.. 2WOI\~) 1/2 e 2(Wo -r2
Q
0
(53)
and f3 are the limits as T ~ 0 of (14b) and (14c),
(54)
(55)
02·
and A == dA(t)/dt and U = U(t) [cf. (1)]. Equation (53) can also be written as
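$P\{\omega - \omega_c \le \nu\}$

The instantaneous radian frequency of a narrow-band signal is the time derivative of its instantaneous phase; i.e.,

$$\omega = \dot{\theta}(t) = \lim_{\tau \to 0} \frac{\theta_2 - \theta_1}{\tau} = \lim_{\tau \to 0} \frac{\psi}{\tau} \qquad (51)$$

in which $\theta_1 = \theta(t)$, $\theta_2 = \theta(t + \tau)$ and, in the final limit,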
8 2 - (J1 has been replaced by its modulo 21T difference VJ = (8 2 - ( 1) mod 21T (cf. [11, ch. 6]). Then, formally, the probability P{w ~ wo} can be determined using(a) as the limit P{w~wo}
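$$= 1 - P\{\omega > \omega_0\} = 1 - \lim_{\tau \to 0} P\{\psi > \omega_0\tau\} = \begin{cases} \lim_{\tau \to 0} F(\omega_0\tau) - \lim_{\tau \to 0} F(\Delta\Phi + \pi); & \text{if } \omega_0 < \dot{\Phi}, \\ 1 + \lim_{\tau \to 0} F(\omega_0\tau) - \lim_{\tau \to 0} F(\Delta\Phi + \pi); & \text{if } \omega_0 > \dot{\Phi} \end{cases} \qquad (52)$$

where, by (4a), $\dot{\Phi} = \omega_c + \dot{\phi}$ and $\dot{\phi} \equiv \lim_{\tau \to 0} \tau^{-1}(\phi_2 - \phi_1)$. The limits in (52) can be evaluated using (13) by first regarding all of the parameters $U$, $V$, $W$, $\Delta\Phi$, $r$ and $\lambda$ as functions of $\tau$, and then letting $\tau \to 0$. For small $\tau$, $r(\tau) \approx 1 + \ddot{r}\tau^2/2$, where $\ddot{r} = [d^2 r(\tau)/d\tau^2]_{\tau=0}$; $\lambda(\tau) \approx \dot{\lambda}\tau$, where $\dot{\lambda} = [d\lambda(\tau)/d\tau]_{\tau=0}$; and (13) shows immediately that F(Δ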
where v = Wo -
We
and A, ;. have been expressed in terms of
+ Ws and,2
= -~ 2 - as 2 . The quantities Ws and 08 depend on the normalized baseband power spectrum S(n through Ws = f:cowS(f)df and OS2 = Coo (w - wS)2 S(i) d[. The expressions for Q and {3 corresponding to (S3a) can be obtained by replacing Wo - cP by J) ~ in (54) and Wo -;: - 2wo~ by as 2 + (v - wS)2 in (54) and (55). Equivalent results, expressed in terms of the Marcum Q-function, were obtained by SaIz and Stein [12] .13 Ws, OS2,
c.
and
We.
Here ~ = 2
We
Narrow-Band DigitalFM with Partial-Bit Integration
The bit error probability for narrow-band digital FM with limiter-discriminator detection and integrate and dump (I&D) output filtering is considered in detail in [8]; however, this reference exclusively treats the uncorrelated noises case of an output filter (see Fig. 4) which integrates over the entire bit time. The more involved problem of partial-bit integration, i.e., that in which the output filter integrates over less than the full bit time, requires additional analytical tools that are
13 The probability density function of the instantaneous frequency, which can be obtained by differentiating (53a), has been given by Gatkin et al. [21].
IF Filter H(f)
LimiterDiscriminator
z(t)
Fig. 4.
180
.p{ t) (+ clicks)
Narrow-band digital FM.
discriminator output noise [8]. Part of this calculation, the only part to be considered here, requires the evaluation of $P\{\psi \le 0\}$ for $\Delta\Phi > 0$. The desired result for this probability follows from $P\{\psi \le 0\} = F(0) - F(\Delta\Phi - \pi)$ and (12), and is
P{l/J provided by our results in Case III of the present paper. In this example, we briefly" indicate how the Case III results can be used in the BER calculation in the partial-bit case. In the "worst case" situation of an alternating one-zero data stream, the operation of narrow-band digital FM can be partly modeled in terms of the phase angle between two vectors perturbed by Gaussian noise, when the underlying sinusoid [cf. (1)] has an amplitude and phase of the forms (cf. [8, eqs. (12) and (13)] for a linear phase IF)
~O}
= sin del> 411
+
1"/2
1 - cos ~«Il cos t ] exp - U - - - - - [ dt 1- r cos t
1 - cos il
-1f/"-
r ~cI> /"/2 sin
411'
-11/2
a
-u i + r~::; tcos --=--------,;;;;."
exp [
dt
1 + r cos a4l cos t
(60) (56a)
A(t) = and
t at sin 1f-
=
T
ct>(t) = tan- 1
ao + 02
In the limit" T ~ 0, the output filter may be regarded as a sampler which samples the instantaneous radian frequency) say w, of the discriminator output. The above probability then becomes P{w ~ O}, which can be found in the limit of the above equation, but is more directly given by (53) with Wo = 0, i 0,..4 = 0, viz.
(56b)
t cos 211 -
T
P{w ~ O} =
(57)
a = U(O)
2
and, consequently, the parameters of (4) can be identified as
2
A (i) T') (T) (-2 -t/J - -2 , U(r) = 2a , V=O, w= U 2
(58) 0
2
(i2 /a2 Ie(flja, a)l
(61)
is the noise power after IF filtering. The noise
correlations of (5) are determined by the system "narrow-band" IF filter; and for $(\omega_c\tau) \bmod 2\pi = 0$ and for a filter with a symmetric low-pass equivalent $H(f)$, $\lambda = 0$ and $r = r(\tau)$ is
(1 _~2), ;'
(3:= U(O)
2
(1 + ~2), v
4>=---T(ao + 02)
and
r=
where
~1 -
where
where ao, at, and a2 are constants (ao > 0, 00 ~ a2) and Tis the bit time. Letting T, 0 ~ T ~ T, denote the partial-bit integration time, we can express the phase angle of interest between the noisy vectors as
~4>(1')=c/J
i [1 -
D~
[d 2r(T)/dT2 ] T=O.
The Effect ofPhase Error on DPSK BER
In a recent paper, Blachman [13] considers the effect of an error in the delay of the preceding signal, in calculating the error probability in decoding the present signal, in differential phase-shift keying (DPSK). By recasting this situation in terms of the phase angle between two vectors perturbed by Gaussian noise, a simplification results for the error probability. If we let $\theta$ denote a constant phase error in the reference signal, the error probability $P_B$ can be written as an infinite series, with terms involving modified Bessel functions, as [13, eq. (7)]
given by
F -1{IH(f)1 2 } r(T)= - - - - -
L~
(59)
2
IH(f) 1 df
The overall bit error probability is composed of averages over various bit patterns, and contains contributions from both the continuous and the "click" components of the
•
X
~
n
r;) + (p;)] I n+ I
[In(P;) +In (p;)J cos(2n + +1
1)0
(62)
where PI and P2 are the instantaneous signal-to-noise ratios of the two vectors. Using (18) with 1/1; = -tr/2, r = A == 0, and ~ ::::: (), for 10 I ~ 1r/2, the above equation admits to the
simpliflcation
where
v'l -
o
o
Jl
(67)
1r
P
e- + 21T
=-
. _e- p s1n·2 \lJ cos1/Jerfc(--v'iicos1/J)
411"
in which $\rho = \rho_1$ and $-\pi \le \psi \le \pi$ ($\Delta\Phi = 0$ since MPSK is implicitly coherent signaling). The limit as $\rho_2 \to \infty$ (or $\sigma_2 \to 0$) of the function $F(\psi)$ will first be shown to be consistent with this $p(\psi)$, and will then be applied to the MPSK symbol error probability calculation. Taking the limit $\rho_2 \to \infty$ of (11) (with $\Delta\Phi = 0$) leads to
-sgn 1/1 F(l/J) =- 21T
U- 1/e(Y, U)
t d~e-(x+y)tlo(~v'x2
_
dx_e-(x 2 +P - 2 x V P COS l/J )
(68)
(W/U)2 cos2 (j,
and $U$ and $W$ are given by (4). The equivalence between (62) and (63)$^{14}$ can be shown directly by the following steps. From (4), (C-1), and Neumann's addition theorem (cf. [1, eq. (9.1.79)]), $U^{-1} Ie(Y, U)$ has the representations
=
x
00
p(t/J) = /
(63)
y ::::
has the well-known forms [15]
+ y2 - 2xy cos 20)
(64)
i
cotl'IJ I
-00
dx
- -2 e- p ( 1+x 2 ) si n 2 '" 1+x
(69)
which can also be written as (70)
(65)
x == P1/2, Y = P2/2, EO = 1, and em = 2 for m > o. When (65) is used in (63), after some rearrangement (63) be-
in which
comes a Fourier series in cos (2n + 1)8, as is (62). Equality follows by equating coefficients and use of the identity (which can be verified by differentiation)
In taking the limit of (11) to get (69), it is to be noted that both $U$ and $V \to \infty$ as $\rho_2 \to \infty$ and that the integrand of (11) becomes peaked in the vicinity of $t = t_0$, $t_0 = \tan^{-1}\{V/(W \cos\psi)\}$. Expanding the integrand in powers of $(t - t_0)$ and passing to the limit gives (69). Equation (70) then follows by the change of variable $x = \tan\theta$. That (68) and (69) are consistent is readily verified by showing that these satisfy $p(\psi) = \partial F(\psi)/\partial\psi$.
$$= (2n + 1)\, e^{-(x+y)\xi}\, [I_n(x\xi) I_n(y\xi) - I_{n+1}(x\xi) I_{n+1}(y\xi)] \qquad (66)$$

to evaluate the remaining integral over $\xi$.

E. Symbol Error Rate in MPSK
A degenerate case of the distribution of the phase angle between two vectors perturbed by Gaussian noise is that in which one of the vectors is noise free. This situation is representative of MPSK; hence, appropriate limits of our previous results can be used to obtain the symbol error probability in MPSK. When the second vector, say, has zero noise, the probability density function of the phase angle between the two vectors

14 We are grateful to N. M. Blachman for the interesting bit of history that a special case of (63) was obtained, but apparently not published, by L. Lewandowski in 1961.

Turning now to MPSK: for equiprobable, equal energy MPSK signals, the symbol error probability for the optimum receiver is $P_E(M) = 2P\{\psi > \pi/M\}$. Consequently, employing (70) to evaluate $F(\pi) - F(\pi/M)$ gives

$$P_E(M) = \frac{1}{\pi} \int_{-\pi/2}^{\pi/2 - \pi/M} d\theta\, e^{-\rho \sin^2(\pi/M) \sec^2\theta} \qquad (71)$$

which is essentially the same as the result of Weinstein [17]. When $M = 2$, (71), with the aid of (69) and (34), reduces to the well-known form $P_E(2) = (1/2) \operatorname{erfc}(\sqrt{\rho})$. Also, from (71), it is straightforward to show that $P_E(M) \le \operatorname{erfc}[\sqrt{\rho} \sin(\pi/M)]$ for $M > 2$ (cf. [6, p. 231]).
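The single integral (71) is straightforward to evaluate numerically. The sketch below is illustrative only: it assumes (71) in the form $P_E(M) = \pi^{-1} \int_{-\pi/2}^{\pi/2 - \pi/M} \exp[-\rho \sin^2(\pi/M) \sec^2\theta]\, d\theta$, uses a midpoint rule so that the endpoint $\theta = -\pi/2$ (where $\sec\theta$ is unbounded) is never sampled, and the name `pe_mpsk` and the panel count are choices made here, not taken from the paper.

```python
import math


def pe_mpsk(M, rho, n=4000):
    """Eq. (71): midpoint-rule estimate of the MPSK symbol error probability.

    rho is the symbol signal-to-noise ratio (linear scale).
    """
    lower = -math.pi / 2
    upper = math.pi / 2 - math.pi / M
    k = rho * math.sin(math.pi / M) ** 2
    h = (upper - lower) / n
    total = 0.0
    for i in range(n):
        theta = lower + (i + 0.5) * h  # midpoint of the i-th panel
        total += math.exp(-k / math.cos(theta) ** 2)
    return h * total / math.pi


if __name__ == "__main__":
    rho = 4.0
    # M = 2 should agree with the closed form (1/2) erfc(sqrt(rho)).
    print(pe_mpsk(2, rho), 0.5 * math.erfc(math.sqrt(rho)))
    # For M > 2 the value should not exceed erfc(sqrt(rho) sin(pi/M)).
    for M in (4, 8, 16):
        print(M, pe_mpsk(M, rho), math.erfc(math.sqrt(rho) * math.sin(math.pi / M)))
```

The two checks printed here correspond to the statements in the text: the reduction to $(1/2)\operatorname{erfc}(\sqrt{\rho})$ at $M = 2$ and the bound $P_E(M) \le \operatorname{erfc}[\sqrt{\rho}\sin(\pi/M)]$ for $M > 2$.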
V. RELATION TO THE FOURIER SERIES APPROACH
In most of the preceding, definite integral expressions have been presented for the probability densities, distributions, and related quantities of interest. However, some of these densities and distributions can be obtained in a quite general way as Fourier series [22]. It is the goal of this section to briefly explore the series approach and show its relation to our previous work. Primarily the Fourier series method appears to be limited to Cases I and II in which there is no correlation between the noises perturbing the endpoints of the vectors; however, the method will be found to have some application in Case III. Additionally, the Fourier series method quite naturally allows for extensions to situations which fall outside the scope of the main part of this paper, situations in which the phase angle of interest is the result of the modulo $2\pi$ combination of more than two components. The starting point for the Fourier series approach is the Fourier series of the periodic extension of the Bennett probability density, of Section IV-E, for the phase noise of the angle of a single vector; i.e.,
L 00
pel/;) =€k Qk(PIP2k(P2)Qk(P3) cos kt/J; 21T k=O
Other combinations can be handled similarly. Probabilities stemming from the series representations of such densities can be determined by termwise integration. For example, Blachman's error probability PB given by (62) is just the termwise integral of (76) over the intervals (-1T, - 1r/2 - 8) and (tr/2 - 0, 1T), viz. (78)
(72) The Fourier series corresponding to this [51 (8) can be shown to be (cf. Prabhu [22], Middleton [18, p. 417] and Lindsey [19, p. 190]) EO
€k
= 1,
= 2 for k > 0
(73)
where1 5
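$$a_k(\rho) = \frac{\sqrt{\pi\rho}}{2}\, e^{-\rho/2} \left[ I_{k/2 - 1/2}\!\left(\frac{\rho}{2}\right) + I_{k/2 + 1/2}\!\left(\frac{\rho}{2}\right) \right] \qquad (74)$$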
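From this series, the series for the density of $\psi = (2\theta) \bmod 2\pi$ follows by noting that $p(\psi) = [p_1(\psi/2) + p_1(\psi/2 + \pi)]/2$, which leads to

$$p(\psi) = \frac{1}{2\pi} \sum_{k=0}^{\infty} \epsilon_k\, a_{2k}(\rho) \cos k\psi; \qquad \psi = (2\theta) \bmod 2\pi. \qquad (75)$$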
The second ingredient in the Fourier series approach is the observation, made from (6), that the Fourier series for the combination of two independent $\theta$'s, with densities of the form of (73) or (75), has $a$-coefficients which are just products of the $a$-coefficients of the component pdf's. Then, series for the densities of combinations such as $\psi = (\theta_2 - \theta_1) \bmod 2\pi$ and $\psi = (2\theta_2 - \theta_1 - \theta_3) \bmod 2\pi$$^{16}$ [all the $\theta$'s are independent with densities given by (73)] can be written by inspection, viz.
(76)
15 $a_1(\rho)$ is the well-known signal suppression factor [6, p. 50].
16 This combination arises in Manchester encoded digital FM [20].
Further, distributions like $P\{\psi \ge \psi_0\}$ in Cases I and II, of concern in previous parts of this paper, can be obtained as definite integral expressions by directly summing their corresponding Fourier series. However, the $a$-coefficient of the combination density becomes increasingly complicated as the modulo $2\pi$ combination gets more involved, so that it becomes more difficult to sum the Fourier series. For example, we have not as yet succeeded in summing either the series for $p(\psi)$ or its counterpart for $P\{\psi \ge \psi_0\}$ in the case of $\psi = (2\theta_2 - \theta_1 - \theta_3) \bmod 2\pi$.
VI. SUMMARY AND CONCLUSIONS
The probability distribution of the modulo $2\pi$ phase angle between two vectors perturbed by correlated Gaussian noise has been studied in detail. For various types of signal conditions and noise correlation, the distribution and its asymptotic behavior have been presented. Some of the results were applied to problems in angle modulation in communication systems, and in finding the distribution of instantaneous frequency. Even in the most complicated case, the expressions for the distribution require only the numerical evaluation of single integrals, and these are easily done on a digital computer or hand-held programmable calculator to ten digit accuracy using the methods described in [10]. Such methods were used to get numerical comparisons of the asymptotic formulas with exact results and, in general, the asymptotic results were found to be quite good for $U > 10$ dB and over the entire range of $\psi_0$.
By following the techniques used, the results can be extended to other special situations of interest in phase and frequency modulation (cf. [6, ch. 5]).
APPENDIX A
CHARACTERISTIC FUNCTION METHOD
Here we sketch the derivation of (21) and (22). The original development in [11] has been changed somewhat and rearranged into a step-by-step procedure.
1) Define a periodic sawtooth function $s(\xi)$ of period $2\pi$ such that
This function of $\psi$ is discontinuous at $\psi = \psi_1$ and $\psi = \psi_2$. It is equal to 1 in $\psi_1 < \psi \le \psi_2$ and to 0 elsewhere in $-\pi \le \psi \le \pi$. Then (cf. [11, eq. (2.26)])
=
P{l/1] ~ VJ ~ 1/J2}
~2
-
VI 1
211'
+-
1
21r
E(s(8 2 -6 1 - 1/1 2 +1r)
where the expectation extends over the random variables $\theta_1$ and $\theta_2$. By the periodicity of $s(\cdot)$, the mod $2\pi$ in $\psi = (\theta_2 - \theta_1) \bmod 2\pi$ is accounted for.
2) Note the basic relation [11, eq. (2.20)] $s(\xi)$
=
100 100 -a o
0
a~
+ 1T)] that can be used in (A-2) to get an expression for P{ l/J 1 ~ 1/1 ~ 1/12}· The desired equations (21) and (22) now follow when (20) is used to express E[Jo (· )] as ~(~~, U t 1/1;), the conditional characteristic function of Section II-E.
0 1 - VJi
and use it to construct the function
dx dy
JO
xy
(A-3)
which can be verified by changing to polar coordinates $x = R \cos\varphi$, $y = R \sin\varphi$. Then the integration with respect to $R$ can be performed immediately (since it is equivalent to integrating $J_0'(cR)$, $c$ being a constant). The integration with respect to $\varphi$ can be done by setting $2\varphi = \pi/2 - t$ and using
APPENDIX B
ALTERNATE DERIVATION OF $P\{\psi_1 \le \psi \le \psi_2\}$
This Appendix presents an alternate derivation of $P\{\psi_1 \le \psi \le \psi_2\}$ which follows more along classical lines than the derivation given in Section II-E. For the sake of simplicity, only Case II is considered in some detail. The derivation can be extended to Case III, although the algebra involved becomes tedious.
Case II: Initially, the derivation proceeds along the lines followed by Fleck and Trabka [5], who start with the four-dimensional joint Gaussian probability density function $p(x_1, y_1, x_2, y_2)$ of the four noise variables in (2). A change is made to polar coordinates, and the resulting radius vectors integrated over, leading to the joint density $p(\theta_1, \theta_2)$. Employing (6), the result for $p(\psi)$, $\psi = (\theta_2 - \theta_1) \bmod 2\pi$, can be written as (cf. [8, Appendix])

$$p(\psi) = \frac{1}{4\pi} \int_{-\pi/2}^{\pi/2} dt\, e^{-E} (1 + 2U - E) \cos t \qquad (B\text{-}1)$$
where E = U - V sin t - W cos (d
$$\int_0^{\pi/2} \frac{dt}{1 + \cos\xi \cos t} = \frac{\xi}{\sin\xi}, \qquad -\pi < \xi < \pi.$$
3) Replace ~ in (A-3) by ()2 - 8 1 variables of integration to u, v where x R 1 , R 2 > o. This carries (A-3) into
s(82
__1 -
-
(A-4)
~ and change the
= uR1, Y =
o
[00 du dv 0
• J o (Vu 2 R t 2
uv
(B-2)
so that multiplying and dividing the integrand of (B-1) by
vR2 and E, using (B-2), and integrating over (1/1 1, VJ2) gives
() 1 -~) 00
at + sec? t (dE)2 al/J ( aE)2
E(2U-E)=
1 41l'
::= -
~ a~
+ v 2R 2 2 + 2uvR 1R 2 cos (0 2 -8 1 -~)). (A-S)
Taking the expected value of both sides of (A-S) and setting ~ :=. 1/1; - 11 (so aja~ = aja1/li) gives a form 1 7 for E[s(8 2 17 Interchanging the order of expectation and integration can be justified by standard methods of analysis (see [16J). If the double integral is first expressed in the polar coordinates (R, » then the most difficult step to justify is the interchange of the expectation with the R-integration. The validity of this step can be shown by first .writing the R -integral as one from 0 to M plus a second from M to 00. Then the interchange can be made in the first part with the finite limits and, as M -+ 00, the contribution from the second part can be shown to be nil by use of the Riemann-Lebesgue lemma (or the asymptotic behavior of the Bessel function).
1"'2 dt/J 1'"/2 dtel/J
-Tr/2
1
+--1
1l/J 2at/! f1r'2
+-1 41T
Tr 2 / dt fl/J 2 -Tr/2 til 1
47T
l/J
i
1
-.,,/2
E
cos t
e (-cost - - -3E)(3eE
(
at
at
E aE) (de-dl/J )
--1- -
E cos t
- - d t)
al/J
Cll/J
(B-3)
Upon integrating the second integral by parts with respect to $t$, and the third integral by parts with respect to $\psi$, it will be found formally that all of the remaining double integrals cancel, and that there results the single integral
P{1{JI
E dt[ -e3E -.,,/2 E cos t a",
~1/J~1/J2} =..2. £1r 47T
1
2
J"'2,
'" 1 (B-4)
However, care must be taken if the singularity $E = 0$ lies within the range of integration, for then the steps performed in going between the last two equations would not be valid. This singularity occurs when
VI
= A4.>,
t
and
v
= tan- 1
and will be avoided if $\Delta\Phi$ lies outside the range of integration. If $\Delta\Phi$ lies within the range of integration, then the above derivation must be repeated for the complementary probability $P\{\psi < \psi_1 \text{ or } \psi > \psi_2\}$. Evaluating the integrand of (B-4) then leads to (9) and (11), which completes the proof.
Case III: In extending this derivation to Case III, the probability density function $p(\psi)$ becomes
J1I
2
(C..3)
(B-S)
W
1
where 10 ( . ) is a modified. Bessel function [see [9, eq. (A3-S)] for (C-2)] . 2) Hankel Exponential Integrals ([ 1, eq. (11.4.28)J:
(C-4) 3) The following two integrals can be obtained by use of the Euier-Legendre transformation, (31):
exp [
a + b sin t + c cos
dt
1 + d cos t
dt
-'It/2
e- E cos t
~
/11
21T
_ 1f
. dt
1 + d cos t
exp
[a + b sin t + ccos ~
where
(C-5)
+ c cos t
V(i2 - {32
cd-a a=--' I-d 2
r
and the analogof (B-2) is
(~~ + sec2t (:~ )2
(C-6)
(C-7) REFERENCES
[I]
[2]
E[2U-2W(rcosA4> + A sind4» - £.(I-r 2 -A 2 ) ]
[1 - (r cos VJ + Asin l/J) cos t] 2
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York: Dover, 1972. E. Arthurs and H. Dym, "On the optimum detection of digital signals in the presence of white Gaussian noise-A geometric interpretation and a study of three basic data transmission systems," IRE Trans. Commun, Syst., vol. CS-IO, pp. 336-372, Dec.
APPENDIXC
[4]
1962. J. J. Bussgang and M. Leiter, "Error rate approximations for differential phase-shift keying," IEEE Trans. Commun, Syst .• vol. CS-12, pp. 18-27, Mar. 1964. J. Edwards, A Treatise on the Integral Calculus, VoL. Il. London,
SOME INTEGRAL RESULTS
[5]
J. T. Fleck and E. A. Trabka, "Error probabilities of multiple-state
(8-7)
[3]
England: Macmillan. 1922.
This Appendix summarizes certain integral results needed in the main part of the paper. Also listed are two interesting definite integrals which are related to the analysis, and which do not appear to be given in most tables.
1) The $Ie(k, x)$ Function: This function is defined and tabulated in [9], and [11] contains additional tabulations.
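For readers without access to the tabulations, $Ie(k, x)$ can be computed directly. The sketch below is hedged: it assumes the definition $Ie(k, x) = \int_0^x e^{-t} I_0(kt)\, dt$ for (C-1) and the finite-range integral form $\pi^{-1} \int_0^{\pi} e^{-x(1 - k\cos\theta)} (1 - k\cos\theta)^{-1}\, d\theta = (1 - k^2)^{-1/2} - Ie(k, x)$ for (C-2), with $|k| < 1$. The helper names (`bessel_i0`, `ie_direct`, `ie_from_c2`) and the quadrature settings are introduced here for illustration only.

```python
import math


def bessel_i0(z, max_terms=500):
    """Modified Bessel function I0(z) from its power series sum_m (z^2/4)^m / (m!)^2."""
    q = 0.25 * z * z
    term = 1.0
    total = 1.0
    for m in range(1, max_terms):
        term *= q / (m * m)
        total += term
        if term < 1e-17 * total:
            break
    return total


def ie_direct(k, x, n=2000):
    """Assumed (C-1): Ie(k, x) = int_0^x exp(-t) I0(k t) dt, by Simpson's rule (n even)."""
    h = x / n

    def f(t):
        return math.exp(-t) * bessel_i0(k * t)

    total = f(0.0) + f(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


def ie_from_c2(k, x, n=2000):
    """Assumed (C-2), rearranged for Ie(k, x); trapezoidal rule on [0, pi], |k| < 1."""
    h = math.pi / n

    def g(theta):
        d = 1.0 - k * math.cos(theta)
        return math.exp(-x * d) / d

    total = 0.5 * (g(0.0) + g(math.pi)) + sum(g(i * h) for i in range(1, n))
    return 1.0 / math.sqrt(1.0 - k * k) - h * total / math.pi


if __name__ == "__main__":
    # The two routes should agree closely if the assumed forms are right.
    print(ie_direct(0.5, 3.0), ie_from_c2(0.5, 3.0))
```

The second route needs no Bessel function at all, which is one reason a finite-range form of this kind is convenient on small machines.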
f6] [7]
Forlkl
l o(f3)
Ck
where
E= U-Vsint-WCOS(AtP-l/J)coSt 1 ~ (r cos ljJ + A sin VI) cos t
=
"'~/',
=e-
1 +d cost
a + b sin t
1 1 == - leta/a a) - - - -
a
tJ
[8]
[X e: I o(kt) dt
(C-l)
t
0
111 e-x{l-k cosO) . d8
1
v'f=k2
1f
o
1 - k cosO
differentially coherent phase-shift keyed systems in the presence of white, Gaussian noise," in Investigation of Digital Data Communication Systems, Rep. UA-1420-S-1, J. G. Lawton, Ed., Cornell Aeronaut. Lab., Inc., Buffalo, NY, Jan. 1961, Detect Memo 2A; available as NTIS Doc. AD256584. W. C. Lindsey and M. K. Simon, Telecommunication Systems Engineering. Englewood Cliffs. NJ: Prentice-Hall, 1973. J. I. Marcum, ~'A statistical theory of target detection by pulsed radar," IRE Trans. Inform. Theory, vol. IT-6, Apr. 1960, pp. 59-267. R. F. Pawula, "On the theory of error rates for narrow-band digital n FM t IEEE Trans. Commun .. vol. COM-29, pp. 1634-1643, Nov.
1981. S. O. Rice, "Statistical properties of a sine wave plus random noise." Bell Syst. Tech. J., vol. 27, pp. 109-157, Jan. 1948. [ lO] - , "Efficient evaluation of integrals of analytic functions by the trapezoidal rule," Bell Syst . Tech. J .. vol. 52, pp. 707-722, May[9]
(C-2)
June 1973.
THE BEST OF THE BEST
[11] J. H. Roberts, Angle Modulation. England: Peregrinus, 1977.
[12] J. Salz and S. Stein, "Distribution of instantaneous frequency for signal plus noise," IEEE Trans. Inform. Theory, vol. IT-10, pp. 272-274, Oct. 1964.
[13] N. M. Blachman, "The effect of phase error on DPSK error probability," IEEE Trans. Commun., vol. COM-29, pp. 364-365, Mar. 1981.
[14] A. J. Rainal, "Monopulse radars excited by Gaussian signals," IEEE Trans. Aerosp. Electron. Syst., vol. AES-2, pp. 337-345, May 1966.
[15] W. R. Bennett, "Methods of solving noise problems," Proc. IRE, vol. 44, pp. 609-638, May 1956.
[16] E. C. Titchmarsh, The Theory of Functions. London, England: Oxford Univ. Press, 1939.
[17] F. S. Weinstein, "A table of the cumulative probability distribution of the phase of a sine wave in narrow-band normal noise," IEEE Trans. Inform. Theory, vol. IT-23, pp. 640-643, Sept. 1977.
[18] D. Middleton, An Introduction to Statistical Communication Theory. New York: McGraw-Hill, 1960.
[19] W. C. Lindsey, Synchronization Systems in Communication and Control. Englewood Cliffs, NJ: Prentice-Hall, 1972.
[20] I. Korn, "Comments on 'Limiter discriminator detection performance of Manchester and NRZ coded FSK,'" IEEE Trans. Aerosp. Electron. Syst., vol. AES-16, p. 415, May 1980.
[21] N. G. Gatkin, V. A. Geranin, M. I. Karnovskiy, L. G. Krasnyy, and N. I. Cherney, "Probability density of phase derivative of the sum of a modulated signal and Gaussian noise," Radio Eng. Electron. Phys. (USSR), no. 8, pp. 1223-1229, 1965.
[22] V. K. Prabhu, "Error-rate considerations for digital phase-modulation systems," IEEE Trans. Commun. Technol., vol. COM-17, pp. 33-42, Feb. 1969.
*
R. F. Pawula was born in Chicago, IL, on May 17, 1936. He attended the Illinois Institute of Technology, Chicago, and the Massachusetts Institute of Technology, Cambridge, and received the Ph.D. degree in electrical engineering from the California Institute of Technology, Pasadena, in 1965.
He served two years in the U.S. Army prior to his college enrollment. He was a tenured faculty member of the University of California, San Diego, when he retired in 1975. Since that time he has been a communications and radar Consultant and a commercial pilot. His special fields of interest include spread-spectrum communications, communication satellite system analysis, precision DME system design, intermodulation distortion, multipath, jamming, modulation and demodulation, and error correction coding and decoding. He is also interested in long-range avionics development.
Dr. Pawula is a member of Sigma Xi, Tau Beta Pi, Eta Kappa Nu, Phi Eta Sigma, Rho Epsilon, and Alpha Eta Rho. He was an Alfred Sloan Fellow at M.I.T. and a Howard Hughes Fellow at Caltech. He also belongs to the Aircraft Owners and Pilots Association, the Experimental Aircraft Association, and the Baja Bush Pilots.
*
S. O. Rice was born in Shedds, OR.
He received the B.S. degree in electrical engineering at Oregon State College, Corvallis, in 1929, did graduate work in 1930 at the California Institute of Technology, Pasadena, and received the D.Sc. (Hon.) degree from Oregon State College in 1961.
From 1930 to 1972 he was employed by Bell Telephone Laboratories, where he was a Consultant on mathematical problems and Head of the Communications Analysis Research Department. During this time he was concerned with various aspects of communication theory, particularly those areas involving random phenomena and noise. Since 1973 he has been a Research Physicist in the Department of Electrical Engineering and Computer Sciences at the University of California, San Diego.
*
J. H. Roberts received the degree (with honours) in mathematics from Reading University, Reading, Berks., England, in 1954.
Following military service, he joined the General Electric Company at their Wembley, England, research laboratories. Here he worked in close association with R. G. Medhurst on a variety of problems in trunk radio links and FM systems. In 1966 he joined the Plessey Company, Roke Manor, Romsey, Hants., England. His book Angle Modulation (IEE Series on Telecommunications, No. 5) was published in 1977.
Mr. Roberts received the IEE Electronics Premium for 1968 for two papers on FM threshold extension techniques.
Cochannel Interference Considerations in Frequency Reuse Small-Coverage-Area Radio Systems
DONALD C. COX, FELLOW, IEEE
Abstract-Frequency reuse small-coverage-area radio systems having hexagonal and square coverage areas are compared. Comparison is made on the basis of average signal to average interference ($\bar{S}/\bar{I}$) in the corners of the areas and on the basis of the expected probability of S/I exceeding some system threshold for at least one base station that is eligible to provide service. The difference in performance between square and hexagonal systems is small, smaller than the usual uncertainties in the propagation parameters needed in the performance estimates. Results suggest that, even if the signal strength decreases as slowly as the inverse cube of the distance and the standard deviation of the large scale signal variation is as large as 10 dB, good service probabilities (on the order of 99 percent) can be provided in small-coverage-area radio systems using 30-40 channel sets.
I. INTRODUCTION
The reuse of radio frequencies makes cochannel interference a fundamental consideration in small-coverage-area radio systems. A parameter sometimes used to evaluate the performance of such systems or to compare similar systems is the average signal-to-average interference ratio ($\bar{S}/\bar{I}$) produced at the corner of a coverage area by the first tier of cochannel interferers. Since both signal and interference vary randomly, a more definitive parameter is the expected probability that the signal-to-interference ratio is below some acceptable level [1]. Calculation of this more definitive parameter is more involved than calculation of $\bar{S}/\bar{I}$. Values of $\bar{S}/\bar{I}$ are obtained as an intermediate step in determining the expected probability.
Coverage areas in frequency reuse systems usually are idealized in the form of hexagons or squares for the purpose of analysis [1]-[4]. The square geometry is simpler but under some conditions the hexagonal coverage areas are expected to provide a better system configuration [4]. An early paper [1] considered some aspects of this problem for square and hexagonal coverage areas. These early comparisons lumped the effects of small scale Rayleigh distributed signal fluctuations caused by multipath [2] and the large scale signal variations caused by shadowing [2] into one Rayleigh distributed random variable. It is well known that several branches of diversity are quite effective in mitigating small scale multipath fluctuations [2]. Thus, the cochannel interference considerations for frequency reuse systems employing diversity are more strongly dependent on the large scale signal variations. At a fixed radius from an omnidirectional base station antenna, large scale signal variation can be represented by a log-normally distributed random variable [2] (i.e., the attenuation in decibels is normally distributed). The mean of the signal power $\bar{S}$ decreases with radial distance r from the base station and is inversely proportional to some power m of the distance, i.e., $\bar{S} = k/r^m$ where k is a proportionality constant [2].
In Section II of this paper, values of $\bar{S}/\bar{I}$ are estimated at coverage area corners. Estimates are made for different numbers of radio channel sets that correspond to different frequency reuse factors for hexagonal and square coverage areas. The values of $\bar{S}/\bar{I}$ are used in Section III to estimate the expected probability that the signal-to-interference ratio is less than a threshold appropriate for differential phase-shift-keyed (DPSK) digital signals. The values of propagation and system parameters used for the comparisons are based on preliminary estimates for a portable radio telephone environment. Interference from more than one tier of cochannel base stations is included.
Paper approved by the Editor for Communication Theory of the IEEE Communications Society for publication without oral presentation. Manuscript received November 10, 1980; revised July 13, 1981. The author is with Bell Laboratories, Holmdel, NJ 07733.
II. AVERAGE SIGNAL-TO-AVERAGE INTERFERENCE ESTIMATES FOR SQUARE AND HEXAGONAL COVERAGE AREAS
Distances between base stations and portable radio telephones (portaphones) will be short (on the order of 1000 ft) because of limited transmitter power (on the order of 10 mW). Therefore, the value for the exponent m in the distance dependence of average signal power is expected to be smaller than that for the greater distances involved in mobile radio [2]. Estimates made by extrapolating mobile radio measurements suggest that $m \sim 3$. This propagation parameter and the geometrical distance from the coverage area layout establish the ratio of average signal power $\bar{S}$ to average interference power $\bar{I}$ as
$$\bar{S}/\bar{I} = 1 \Big/ \sum_j \left(\frac{r}{d_j}\right)^m \tag{1}$$
where r is the distance to the portaphone from the base station providing service, $d_j$ is the distance to the portaphone from the jth cochannel interfering base station, and j ranges over all the interfering base stations. This assumes that the interference from all base stations is incoherent and independent so that addition of interference powers is appropriate.
Reprinted from IEEE Transactions on Communications, vol. COM-30, no. 1, January 1982.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. Layout of square coverage areas. Coverage areas numbered 1 are in the first cochannel interference tier from area 0. Areas numbered 2 are in the second tier.
The signal and interference from the first tier of interfering base stations for a layout of square coverage areas is depicted in Fig. 1. The squares numbered 1 are in the first tier of cochannel interferers surrounding the coverage area being considered and labeled 0. Squares labeled 2 are in the second tier, etc. A similar situation for hexagonal areas is depicted in Fig. 2. An approximate relationship for $\bar{S}/\bar{I}$ in the corner for square areas as a function of the number of tiers of interfering base stations is derived in Appendix A as
$$\bar{S}/\bar{I} \approx \frac{2^{m-2} R^m}{\left[1 + 2^{m/2}\right] \sum_{t=1}^{T} \dfrac{1}{t^{m-1}}} \tag{2}$$
where t is the tier number and T is the number of tiers of interfering base stations in the radio system. The reuse interval R measures the number of coverage areas between areas using the same radio channels.¹ This approximate relationship is compared with exact calculations for T = 1 in Appendix A and is shown to be within 1 dB for R > 3 and m < 4. The accuracy is much better for lower values of m, for larger values of R, and for larger values of T.
A similar approximate relationship for $\bar{S}/\bar{I}$ in corners for hexagonal coverage areas is derived in Appendix B and compared with the exact relationship for T = 1. For hexagonal areas
$$\bar{S}/\bar{I} \approx \frac{(3n)^m}{6 \sum_{t=1}^{T} \dfrac{1}{t^{m-1}}} \tag{3}$$
where n is a parameter related to reuse of radio channels. The center-to-center separation between coverage areas using the same channel is $3n r_h$, where $r_h$ is the "radius" of the hexagonal area measured from the center to a corner. Accuracy for this approximation is better than that for the square coverage areas.
Fig. 2. Layout of hexagonal coverage areas. Coverage areas numbered 1 are in the first cochannel interference tier from area 0. Areas 2 and 3 are in the second and third tiers.
Reuse of channels in a fixed channel assignment plan [2]-[4] implies that the total number of channels available to the system is divided into a number of separate channel sets. Only certain patterns of channel sets repeat over a plane surface for hexagonal or square coverage areas. Thus, only a restricted group of numbers of channel sets are realizable. For square areas, the allowable numbers of channel sets, $C_s$, are given by
$$C_s = k^2 + l^2 \tag{4}$$
where k and l range over the positive integers [1], [2]. For hexagonal areas, the allowable numbers of sets, $C_h$, are given by
$$C_h = k^2 + kl + l^2 \tag{5}$$
where again k and l range over the positive integers [1], [2], [4].
The reuse parameters R and n, used earlier, have integer values and are strictly applicable for only a subgroup of the allowable number of channel sets. For example, with R ranging over the integers for squares, the subgroup of numbers of channel sets ranges over 1, 4, 9, 16, 25, ..., $R^2$, ... whereas the total numbers of channel sets ranges over 1, 2, 4, 5, 8, 9, 10, 13, 16, 17, 20, 25, ..., $k^2 + l^2$, .... Curves based on calculations from the $R^2$ subgroup and interpolated between for the intermediate numbers of channel sets are sufficiently accurate for system comparisons for numbers of channel sets greater than 9 or 10.
¹ For example, R = 5 coverage areas in Fig. 1.
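The approximate corner relationships (2) and (3), together with the lists of realizable channel-set counts, lend themselves to a short numerical sketch. The Python below is illustrative only: the function names are hypothetical, the hexagonal rule is taken to be the usual $k^2 + kl + l^2$ form, and k and l are allowed to start at zero (an assumption, made so that values such as 1, 4, and 9 quoted in the text appear).

```python
import math


def square_channel_sets(limit):
    """Channel-set counts of the form k^2 + l^2 up to `limit`.

    Assumption: k and l start at 0 (not both zero) so that 1, 4, 9, ...
    are generated, matching the values listed in the text.
    """
    r = range(math.isqrt(limit) + 1)
    return sorted({k * k + l * l for k in r for l in r
                   if 0 < k * k + l * l <= limit})


def hexagonal_channel_sets(limit):
    """Channel-set counts of the form k^2 + k*l + l^2 up to `limit` (assumed rule)."""
    r = range(math.isqrt(limit) + 1)
    return sorted({k * k + k * l + l * l for k in r for l in r
                   if 0 < k * k + k * l + l * l <= limit})


def tier_sum(m, tiers):
    """Sum over tiers t = 1..T of 1 / t^(m-1)."""
    return sum(1.0 / t ** (m - 1) for t in range(1, tiers + 1))


def si_square_db(R, m, tiers=1):
    """Approximate corner S/I for square areas, following (2), in dB."""
    ratio = 2 ** (m - 2) * R ** m / ((1 + 2 ** (m / 2)) * tier_sum(m, tiers))
    return 10 * math.log10(ratio)


def si_hex_db(n, m, tiers=1):
    """Approximate corner S/I for hexagonal areas, following (3), in dB."""
    ratio = (3 * n) ** m / (6 * tier_sum(m, tiers))
    return 10 * math.log10(ratio)


if __name__ == "__main__":
    print(square_channel_sets(25))
    print(hexagonal_channel_sets(27))
    print(round(si_square_db(5, 3), 2), round(si_hex_db(3, 3), 2))
```

With these definitions, 25 square channel sets correspond to R = 5 and 27 hexagonal channel sets to n = 3, so the two printed ratios compare systems of roughly equal size.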
Fifty Years of Communications and Networking
systems infinite in extent involves large sums of random variables that have different means; however, from Fig. 3 it is evident that more than half of the interference is contributed by the first interference tier. Therefore, a reasonable approximation for $\sigma_i$ can be obtained by considering a random process that is the sum of eight identical log-normal random variables for square coverage areas and the sum of six random variables for hexagonal areas. Monte Carlo calculations of up to 100 000 sums and using three different normal random number generators resulted in $\sigma_i = 5$ dB for squares and $\sigma_i = 5.5$ dB for hexagons. This assumes a $\sigma$ of 10 dB for the basic process being summed. These values are in reasonable agreement with [6].⁴ Summing six random processes for hexagons with $\sigma = 7$ dB resulted in $\sigma_i = 3.7$ dB.
Equation (8) is plotted as the upper five curves in Fig. 4 for hexagonal and square areas, for $\sigma_s = 10$ dB and 7 dB, for a system margin of $\gamma = 8$ dB, and for m = 3 and 4. The calculations are based on $\bar{S}/\bar{I}$ from Fig. 3. The probability $P_1$ of not having adequate S/I from a single base station to a coverage area corner is quite large, even for m = 4 and for large numbers of channel sets. If there were only one possible serving base station the situation would be quite grim. However, as illustrated in Figs. 5 and 6, there are four base stations at the same minimum distance from a corner for square areas and three for hexagonal areas. These different base stations use different radio channels and are associated with a different set of interferers. Therefore, fluctuations on the signals and interferences should be independent and the probability $P_p$ of $s - i < \gamma$ for the set of q minimum distance "primary server" base stations is $P(s - i < \gamma)$ for one station raised to the qth power, that is, $P_{ps} = P_1^4 = P^4(s - i < \gamma)$ for squares or $P_{ph} = P_1^3 = P^3(s - i < \gamma)$ for hexagons. These probabilities⁵ are also plotted in Fig. 4.
Inspection of Fig. 5 shows that there are eight base stations surrounding the corner being considered that are at a distance only $\sqrt{5}$ times that of the primary servers. These eight secondary stations have a lower $\bar{S}/\bar{I}$ but still have a probability below 1.0 that $s - i < \gamma$. Since the direct paths between the eight secondary servers and the corner are separated by only a small angle from the paths to the primary servers, the signals
Fig. 4. Probability of not providing service in a coverage area corner because of excessive cochannel interference for systems with different numbers of channel sets. Curves are discussed in text. (Legend: square, $\sigma_s = 10$ dB, $\sigma_i = 5$ dB; hexagonal, $\sigma_s = 10$ dB, $\sigma_i = 5.5$ dB; hexagonal, $\sigma_s = 7$ dB, $\sigma_i = 3.7$ dB; for all curves $T = \infty$ and $\gamma = 8$ dB. Horizontal axis: number of channel sets, with tick marks for the allowable numbers of square and hexagonal channel sets.)
are probably not all independent. A reasonable estimate might be to say that the eight secondary servers are equivalent to four independent secondary servers. Then $P(s - i < \gamma)$ can be calculated using (6) and $\bar{S}/\bar{I}$ for the secondary servers. The contribution $P_{ss}$ of the four equivalent independent secondary servers is again this single station probability raised to the 4th power. The overall probability of $s - i < \gamma$ for the four primary and four secondary servers is then $P_{ps}P_{ss}$, which is also plotted in Fig. 4 for m = 3 and $\sigma_s = 10$ dB. The $\bar{S}/\bar{I}$ for the secondary servers was determined by adjusting $\bar{S}$ for the increased distance and assuming that $\bar{I}$ was the same as for the primary servers.
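The way independent groups of servers combine can be sketched in a few lines. The Python below is a minimal illustration, assuming full independence between the counted servers as the text does; the function name and the single-station probabilities 0.4 and 0.6 are hypothetical placeholders, not values from the paper.

```python
def outage_all_servers(groups):
    """Probability that every counted server has s - i below the threshold.

    `groups` is an iterable of (single_station_probability, count) pairs,
    where `count` is the number of servers treated as independent.
    """
    p = 1.0
    for prob, count in groups:
        p *= prob ** count
    return p


if __name__ == "__main__":
    # Hypothetical single-station probabilities: 0.4 for each of four primary
    # servers and 0.6 for each of four equivalent secondary servers.
    primary_only = outage_all_servers([(0.4, 4)])
    with_secondary = outage_all_servers([(0.4, 4), (0.6, 4)])
    print(primary_only, with_secondary)
```

Each added group multiplies the probability of no service by a factor below one, which is why even servers with poor individual probabilities still help.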
⁴ The Monte Carlo results were not quite as sensitive to truncation of the normal distribution as was expected from [6].
⁵ The probabilities $P_{ps}$ and $P_{ph}$ can be viewed as the probabilities of service that would result for these system parameters for a system using the four or three nearest base stations in a selection diversity configuration to combat the large-scale log-normal signal variation if the different base stations use different radio channels as assumed here.
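The Monte Carlo estimate of the interference standard deviation can be imitated with a small simulation. This is a sketch under stated assumptions, not the author's procedure: it uses Python's standard generator, no truncation of the normal distribution, zero-mean dB values for every term, and a hypothetical function name, so its output need not reproduce the 5 dB and 5.5 dB figures quoted in the text.

```python
import math
import random
import statistics


def sum_lognormal_sigma_db(n_terms, sigma_db, trials=20000, seed=1):
    """Estimate the dB standard deviation of a sum of identical log-normal terms.

    Each term is 10^(x/10) with x normally distributed (mean 0 dB,
    standard deviation `sigma_db`); the sum is converted back to dB.
    """
    rng = random.Random(seed)
    totals_db = []
    for _ in range(trials):
        total = sum(10 ** (rng.gauss(0.0, sigma_db) / 10) for _ in range(n_terms))
        totals_db.append(10 * math.log10(total))
    return statistics.pstdev(totals_db)


if __name__ == "__main__":
    print(sum_lognormal_sigma_db(8, 10.0))  # eight interferers (squares)
    print(sum_lognormal_sigma_db(6, 10.0))  # six interferers (hexagons)
```

The spread of the sum in decibels comes out well below that of a single term, which is the qualitative behavior the text relies on.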
Fig. 5. Square coverage areas illustrating possible center-located base stations that could provide service at a corner of four coverage areas. (Figure labels: 4 primary servers at r; 8 secondary servers at $\sqrt{5}\,r$; 4 third-order servers, correlated with primary.)
where log indicates a logarithm to the base 10. The variables s and i are normally distributed with probability densities
$$p(s) = \frac{1}{\sigma_s \sqrt{2\pi}} \exp\left[-\frac{(s - \bar{s})^2}{2\sigma_s^2}\right] \tag{7a}$$
and
$$p(i) = \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left[-\frac{(i - \bar{i})^2}{2\sigma_i^2}\right] \tag{7b}$$
Fig. 3. Average signal to average cochannel interference in a coverage area corner for systems with different numbers of channel sets. Curve parameters are defined in the text. (Horizontal axis: number of channel sets, with tick marks for the allowable numbers of square and hexagonal channel sets.)
The $\bar{S}/\bar{I}$ at corners of areas are shown in Fig. 3 as functions of numbers of channel sets for hexagonal and square coverage areas. The curves are interpolated between points calculated from (2) and (3). The tick marks along the horizontal axis indicate the allowable numbers of channel sets for squares and hexagons as indicated. The curves are for T = 1 and $T = \infty$, i.e., for a system with only one tier of cochannel base stations and for an infinite system. From Fig. 3, it is obvious that $\bar{S}/\bar{I}$ in corners is greater for hexagonal cells for all m, for any number of tiers, and for numbers of channel sets greater than 9 or 10. It is also evident in Fig. 3 that $\bar{S}/\bar{I}$ in corners is more sensitive to the propagation parameter m than to any other factor. This is unfortunate since the greatest uncertainty is in this empirically derived parameter. The knee in the curves between 10 and 30 channel sets suggests that systems requiring more than 30 channel sets may experience diminishing returns in numbers of channels invested in the system.
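The gap between the T = 1 and $T = \infty$ curves depends only on the tier sum $\sum_t 1/t^{m-1}$ that appears in both (2) and (3). The sketch below evaluates it numerically; the function names are hypothetical, a large finite number of tiers stands in for the infinite system, and m > 2 is assumed so that the sum converges.

```python
import math


def tier_sum(m, tiers):
    """Sum over tiers t = 1..T of 1 / t^(m-1), as used in (2) and (3)."""
    return sum(1.0 / t ** (m - 1) for t in range(1, tiers + 1))


def first_tier_fraction(m, tiers):
    """Fraction of the total average interference contributed by tier 1."""
    return tier_sum(m, 1) / tier_sum(m, tiers)


def extra_interference_db(m, tiers):
    """Reduction in average S/I (dB) when `tiers` tiers are counted instead of one."""
    return 10 * math.log10(tier_sum(m, tiers) / tier_sum(m, 1))


if __name__ == "__main__":
    for m in (3, 4):
        # 10000 tiers approximates an infinite system when m > 2.
        print(m, first_tier_fraction(m, 10000), extra_interference_db(m, 10000))
```

For m = 3 the first tier carries a little over 60 percent of the interference, and the share grows as m increases.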
III. COMPARISON OF HEXAGONAL AND SQUARE COVERAGE AREA SYSTEMS IN TERMS OF SERVICE PROBABILITY
Since both S and I are random variables, the fraction $P_1$ of corners (or alternately the fraction of area closely surrounding a corner) having an S/I less than some system threshold is the quantity that measures the quality of service provided.² As discussed earlier, S can be modeled as a log-normal random variable.³ The sum of log-normal variables, I, is also approximately log-normal [6], [7]. Then s and i can be defined as
$$s = 10 \log S, \qquad i = 10 \log I \tag{6}$$
² This is based on interference considerations alone. Of course, absolute signal level must be considered also.
³ Recall that diversity is assumed to mitigate the small scale signal variation so this analysis considers only the large scale variation.
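where $\bar{s}$ and $\bar{i}$ are the dB means and $\sigma_s$ and $\sigma_i$ are the dB standard deviations of the normal processes. The fraction $P_1$ is then given by
$$P_1 = P(s - i < \gamma) = \frac{1}{\sqrt{2\pi}} \int_{z=-\infty}^{x} \exp[-z^2/2]\, dz \tag{8}$$
where $\gamma$ is the system signal-to-interference threshold in decibels, and
$$x = \frac{\gamma - (\bar{s} - \bar{i})}{\sqrt{\sigma_s^2 + \sigma_i^2}}. \tag{9}$$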
It remains then to determine $\bar{s} - \bar{i}$ in terms of $\bar{S}/\bar{I}$ and to select appropriate values for $\gamma$, $\sigma_s$, and $\sigma_i$. A reasonable estimate for $\sigma_s$ obtained by combining standard deviations for mobile radio propagation [2] and for building penetration attenuation [8] is $\sigma_s \approx 10$ dB. As indicated in Section II, interference power I is the sum of interference power from many sources and may be expected to be noiselike. From [9], a signal-to-noise ratio of 8 dB results in an error probability of $10^{-3}$ for differential phase-shift-keyed (DPSK) signals. Since error rates of several parts in $10^{3}$ are acceptable for voice transmission, a value of $\gamma = 8$ dB will be used for the system signal-to-interference threshold.
If $u = \ln U$ and u is normally distributed with mean $\bar{u}$ and standard deviation $\sigma_u$, then [5], [7]
$$\bar{U} = \exp(\bar{u} + \sigma_u^2/2) \tag{10a}$$
and
$$\sigma_U^2 = \exp(2\bar{u} + \sigma_u^2)\,[\exp(\sigma_u^2) - 1] \tag{10b}$$
where $\bar{U}$ is the mean of U and $\sigma_U$ is the standard deviation of U. From (6) and (10a) it follows that
$$\bar{s} - \bar{i} = 10 \log(\bar{S}/\bar{I}) - (\sigma_s^2 - \sigma_i^2)(\ln 10)/20. \tag{11}$$
Estimating $\sigma_i$ is made difficult because of the inherent intractability of the problem of sums of log-normal processes [6], [7]. The sum of a number of log-normal random variables is a random variable that is also approximately log-normal [6], [7]. This fact has been verified by this author by Monte Carlo computer calculations. The total interference for radio
Fig. 6. Hexagonal coverage areas illustrating possible center-located base stations that could provide service at a corner of three coverage areas. (Figure labels: 3 primary servers at r; 3 secondary servers at 2r; 6 third-order servers.)
For the hexagonal arrangement in Fig. 6, it is evident that the three base stations twice as far from the corner to be served as the primary servers could also serve the corner. Their contribution to the probability that $s - i < \gamma$ is $P_{sh} = P^3(s - i < \gamma)$ where, in this case, $P(s - i < \gamma)$ is computed based on $\bar{S}/\bar{I}$ at the corner for one secondary server. The overall probability that $s - i < \gamma$ for the three primary and three secondary servers is then $P_{ph}P_{sh}$, which is plotted in Fig. 4 for m = 3 and $\sigma_s = 10$ dB.
The third-order set of servers in Fig. 5 are at a distance from the corner to be served that is three times the distance of the primary servers. These third-order base stations lie along the same paths from the corners as the primary servers and are likely to have signal variations that are highly correlated with the variations of the primary servers; thus, they cannot be expected to improve service to the corner significantly. Any farther out base stations would also be of questionable help and so will not be considered.
There are six third-order servers in the hexagonal arrangement in Fig. 6. The same argument about path proximity applies here as was used for the second-order servers for the square arrangement. Thus, these six third-order servers will be assumed to have the effect of three independent ones, with a probability that $s - i < \gamma$ of $P_{th} = P^3(s - i < \gamma)$. The overall probability of $s - i < \gamma$ for the hexagonal areas is then the product of the probabilities for first-, second-, and third-order servers, $P_{ph}P_{sh}P_{th}$, which is also plotted in Fig. 4 for m = 3 and $\sigma_s = 10$ dB.
Looking at Fig. 4 it is evident that, in considering first-order servers only, the square coverage area arrangement provides lower probability of no service, i.e., a higher probability of service, for the equivalent number of channel sets. However, the hexagonal arrangement appears to provide approximately equal service when second-order servers are considered and better service when third-order servers are included. The independence assumptions weigh heavily on these conclusions and since no data exist to confirm or refute the assumptions, this apparent superiority of the hexagonal arrangement must be viewed cautiously. In any event, it is obvious that the second-order serving base stations, for both squares and hexagons, contribute significantly to the service probabilities for the corners. For comparison in terms of probability of service, the parameters that matter most, i.e., m and $\sigma_s$ in the propagation law, are the parameters that are least well known.
IV. SERVICE PROBABILITY AT LOCATIONS OTHER THAN CORNERS
In this section, the probability that $s - i < \gamma$ discussed in the last section for coverage area corners will be considered at interior points in a coverage area. Consideration will be confined to the following set of conditions: hexagonal coverage areas, 27 channel sets, propagation exponent m = 3, system threshold $\gamma = 8$ dB, signal standard deviation $\sigma_s = 10$ dB, and interference standard deviation $\sigma_i = 5$ dB. The details will change somewhat for other parameter values but the overall trends are expected to be the same over the range of parameters that can be reasonably expected.
Fig. 7 illustrates the shortening of some paths to potential serving base stations and lengthening of other paths as the location needing service shifts from the corner of a coverage area toward the center. New values of $\bar{S}/\bar{I}$ can be calculated for each new set of path lengths and $P(s - i < \gamma)$ determined from each $\bar{S}/\bar{I}$. This has been done for the conditions stated on the figure. These results are plotted in Fig. 8 as a function of normalized distance from the center of the closest base station (see Fig. 7) along the line between the center and a corner, i.e., 1 is at the corner and 0 is at the center. The upper two curves in Fig. 8 are associated with the closest base station and the next two base stations as labeled. The combined probability that $s - i < \gamma$ for these three is the third curve down from the top.⁶ These three base stations were the set of primary servers for the corner and remain that for locations between the corner and the center. The fourth curve down from the top is the combined probability for the three primary and three secondary servers. The probability for the three primary, three secondary, and three of the six third-order servers discussed in the previous section is plotted as the bottom curve.
For the curves that included the three closest base stations, the decrease in probability associated with the closest base station tends to be offset by increases in other probabilities until the normalized distance decreases to about 0.5 or 0.6. Then the rapid decrease associated with the closest base station overwhelms the other factors. The single data point at 0.866 is for the midpoint of a coverage area edge, lying midway between two corners.
The curves in Fig. 8 are replotted in Fig. 9 as functions of the square of the distance from the center, i.e., the square of the horizontal axis values in Fig. 8. This parameter is related to the fraction of the area experiencing the corresponding probability value or lower. The single data point at 0.75 illustrates the very small difference in probability for the midpoint along a coverage area edge as compared to a point at the same distance from the center along the line between the center and a corner. This single point in Figs. 8 and 9 indicates that
6 As mentioned earlier, this probability also represents the performance of selection diversity against the large scale signal variations.
Fig. 7. Hexagonal coverage areas similar to Fig. 6 illustrating possible service to a location interior to a coverage area.
Fig. 9. Probability of not providing service as a function of the square of the distance from a center-located base station. The square of the distance is proportional to the area up to that distance. (Plot conditions: hexagonal areas, 27 channel sets, m = 3, $\gamma = 8$ dB, $\sigma_s = 10$ dB, $\sigma_i = 5$ dB; the isolated data point marks the midpoint of a side. Horizontal axis: square of normalized distance from center of area, i.e., fraction of area out to the normalized distance.)
the dominant factor is distance from the center, i.e., that direction is not important. Therefore, the distance squared is a measure of the fraction of the area affected. The curves in Fig. 9 illustrate, then, that the probability of $s - i < \gamma$, i.e., the probability of not receiving service, is nearly the value at the corner for 60-70 percent of the coverage area.
Fig. 8. Probability of not providing service in a coverage area as a function of distance from the base station located in the center of the area. (Plot conditions: hexagonal areas, 27 channel sets, m = 3, $\gamma = 8$ dB, $\sigma_s = 10$ dB, $\sigma_i = 5$ dB; the isolated data point marks the midpoint of a side. Horizontal axis: normalized distance from center of coverage area, from 0 at the center to 1.0 at the corner.)
V. CONCLUSIONS
Comparisons were made between hexagonal and square shaped coverage areas for frequency reuse small-coverage-area radio systems. In general, the differences in performance between systems with the two different shaped areas are smaller than differences due to uncertainties in other system parameters. When the systems are compared only on the basis of average signal-to-average interference $\bar{S}/\bar{I}$ at the coverage area corners, the hexagons are better; however, when compared on the basis of probability of receiving service from any one of the closest possible serving base stations, the squares are better. Comparison on the basis of service probability using nearly all possible serving base stations suggests that, using this criterion, the hexagons may be somewhat better than the squares. Probabilities calculated for these service comparisons also indicate the effectiveness of selection diversity in combating large-scale signal variations.
Radio systems are compared assuming an infinite set of cochannel frequency reuses, a required S/I of at least 8 dB, and log-normal large-scale signal variations with a 10 dB standard deviation. These systems provide good service probabilities (on the order of 99 percent) with either hexagonal or square coverage areas if the available radio channels are subdivided into 30-40 channel sets, even when the propagation law changes as slowly as inverse distance cubed and has a standard deviation as large as 10 dB. Significantly fewer channel sets are required or, alternately, a significantly higher service probability is provided as the propagation power law increases from third toward fourth power and as the standard deviation decreases from 10 dB. Service probabilities remain relatively constant over 60-70 percent of the coverage area that is furthest from the central base station.
APPENDIX A
Consider an arrangement of square coverage areas as depicted in Fig. 1 with omnidirectional antennas located at area centers. Let R be the reuse interval defined in Section II, $r_s$ the half width of an area, and t the tier number. The number of interfering cochannel base stations in a tier is 8t. The distance from the center of a coverage area boundary to the center of the closest interfering base station is $(2Rt - 1)r_s$. Also, the distance $d_c$ between a coverage area corner closest to the closest interfering base station and that interfering station is approximately $(2Rt - 1)r_s$ for large R. For large R, $d_c \approx 2Rt\,r_s$.
The shortest distance from a corner of a coverage area to an interfering base station at the adjacent corner of the interference tier (along the diagonal of the tier) is $d_d = (2Rt - 1)\sqrt{2}\,r_s$. Again, for large R this becomes $d_d \approx 2Rt\,r_s\sqrt{2}$. The level of interference power or signal power at a point is $k/d^m$ where d is distance between the point and a base station, m is the distance dependence of the attenuation law, and k is a constant depending on antennas, frequency, etc. The average interference from a tier is
$$\bar{I}_t \approx \frac{4k}{(2Rr_s)^m}\,\frac{1}{t^{m-1}}\,\frac{1 + 2^{m/2}}{2^{m/2}}$$
where half of the distances from a coverage area to the interference tier are approximated by $d_c$ and half by $d_d$. The total interference $\bar{I}$ contributed by the first T tiers is then
$$\bar{I} \approx \frac{4k\left[1 + 2^{m/2}\right]}{(2Rr_s\sqrt{2})^m} \sum_{t=1}^{T} \frac{1}{t^{m-1}}.$$
At the corner of a coverage area the average signal is $\bar{S} = k/[r_s^m\,2^{m/2}]$. Then at the corner the $\bar{S}/\bar{I}$ contributed by the first T tiers is
$$\bar{S}/\bar{I} \approx \frac{2^{m-2} R^m}{\left[1 + 2^{m/2}\right] \sum_{t=1}^{T} \dfrac{1}{t^{m-1}}}. \tag{2}$$
The approximate relationship in (2) slightly overestimates values of S/I. The approximation is best for small m and for large R and deteriorates as m increases or R decreases. These trends are indicated in Table I, which contains entries of the quantity (approx. S/I - exact S/I) in decibels for the least accurate case of one interference tier and ranging over values of m and R.

TABLE I
DECIBEL DIFFERENCES BETWEEN APPROXIMATE FORMULA AND ACTUAL S/I FOR CORNERS OF SQUARE COVERAGE AREAS. INTERFERENCE IS FROM THE FIRST TIER ONLY (t = 1).

Channel Sets   R    m = 2   m = 3   m = 3.5   m = 4
 9             3    .2      .4      .6        .8
16             4    .1      .25     .35       .5
25             5    .07     .2      .2        .3
36             6    .05     .1      .1        .2
64             8    .02     .06     .1        .1

TABLE II
DECIBEL DIFFERENCE BETWEEN APPROXIMATE FORMULA AND ACTUAL S/I FOR CORNERS OF HEXAGONAL COVERAGE AREAS. INTERFERENCE IS FROM THE FIRST TIER ONLY (t = 1).

Channel Sets   n    m = 2   m = 3   m = 3.5   m = 4
12             2    0.10    .2      .3        .4
27             3    0.04    .1      .15       .2
48             4    0.02    .06     .1        .1
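As a hedged illustration of how the entries of Table I arise, the sketch below evaluates the approximation (2) and compares it with a direct sum over the first-tier interferers. The geometry is an assumption made for this example: cochannel base stations are placed on a square lattice with spacing $2Rr_s$, and the mobile sits at the corner $(r_s, r_s)$ of the serving area. The function names are hypothetical and are not taken from the paper.

```python
import math


def sir_approx_db(R, m, tiers=1):
    """Corner S/I in dB from the approximation (2), using the first `tiers` tiers."""
    tier_sum = sum(t ** -(m - 1) for t in range(1, tiers + 1))
    sir = 2 ** (m - 2) * R ** m / ((1 + 2 ** (m / 2)) * tier_sum)
    return 10 * math.log10(sir)


def sir_exact_db(R, m, tiers=1, r_s=1.0):
    """Corner S/I in dB by summing k/d^m over every interferer in the tiers.

    Assumed geometry (not stated explicitly in the text): interferers sit at
    (2*R*r_s*i, 2*R*r_s*j); tier t holds the 8t lattice points with
    max(|i|, |j|) == t. The constant k cancels in the ratio.
    """
    corner_x, corner_y = r_s, r_s
    signal = (math.sqrt(2) * r_s) ** -m  # serving station is at the origin
    interference = 0.0
    for i in range(-tiers, tiers + 1):
        for j in range(-tiers, tiers + 1):
            if i == 0 and j == 0:
                continue  # skip the serving base station
            d = math.hypot(2 * R * r_s * i - corner_x, 2 * R * r_s * j - corner_y)
            interference += d ** -m
    return 10 * math.log10(signal / interference)


def overestimate_db(R, m, tiers=1):
    """(approx. S/I - exact S/I) in dB, the quantity tabulated in Table I."""
    return sir_approx_db(R, m, tiers) - sir_exact_db(R, m, tiers)


if __name__ == "__main__":
    for R in (3, 4, 5, 6, 8):
        row = ["%.2f" % overestimate_db(R, m) for m in (2, 3, 3.5, 4)]
        print(R * R, R, row)
```

Under these assumptions the R = 3 row comes out at roughly 0.2 dB for m = 2 and 0.8 dB for m = 4, in line with the corresponding entries of Table I, and the difference shrinks as R grows.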
APPENDIX B

Consider an arrangement of hexagonal coverage areas as depicted in Fig. 2. The distances $d_j$ from the centers of interfering coverage areas to the center of the serving area (area 0 in Fig. 2) are $d_j = 3ntr_h$, where $n$ is a parameter related to the reuse of radio channels, $r_h$ is the radius of the hexagonal areas (see Fig. 2), and $t$ is the tier number. For only a subset of the allowable hexagonal reuse patterns (channel sets), $n$ is an integer. The approximate S/I at a coverage area corner from a set of interferers is obtained by substituting into (1) the center-to-center distance, $d_j = 3ntr_h$, for $d_j$ and letting $r = r_h$. By noting that there are $6t$ interferers for each tier $t$, numbered as in Fig. 2, and considering only complete tiers, the sum over $j$ can be replaced by a sum over $t$, yielding (3). Some of the distances from interferers to the corner are overestimated and some are underestimated, but these effects balance out quite well, as illustrated in Table II. The entries in Table II are (approx. S/I - exact S/I) in decibels for the least accurate case of one interference tier and ranging over values of $m$ and $n$. The approximate formula slightly overestimates S/I.
REFERENCES

[1] J. S. Engel, "The effects of cochannel interference on the parameters of a small-cell mobile telephone system," IEEE Trans. Veh. Technol., vol. VT-18, pp. 110-116, Nov. 1969.
[2] W. C. Jakes, Jr., Microwave Mobile Communications. New York: Wiley, 1974, chs. 1, 2, 5-7.
[3] W. R. Young, "Introduction, background, and objectives," Bell Syst. Tech. J., Special Issue on Advanced Mobile Phone Service, vol. 58, pp. 1-14, Jan. 1979.
[4] V. H. MacDonald, "The cellular concept," Bell Syst. Tech. J., Special Issue on Advanced Mobile Phone Service, vol. 58, pp. 15-42, Jan. 1979.
[5] J. Aitchison and J. A. C. Brown, The Lognormal Distribution. Cambridge, England: Cambridge Univ. Press, 1957.
[6] I. Nasell, "Some properties of power sums of truncated normal random variables," Bell Syst. Tech. J., vol. 46, pp. 2091-2110, Nov. 1967.
[7] L. F. Fenton, "The sum of log-normal probability distributions in scatter transmission systems," IRE Trans. Commun. Syst., vol. CS-9, pp. 57-67, Mar. 1960.
[8] P. I. Wells, "The attenuation of UHF radio signals by houses," IEEE Trans. Veh. Technol., vol. VT-26, pp. 358-362, Nov. 1977.
[9] M. Schwartz, W. R. Bennett, and S. Stein, Communication Systems and Techniques. New York: McGraw-Hill, 1966, p. 299.
Donald C. Cox (S'57-M'60-SM'71-F'79) received the B.S. and M.S. degrees from the University of Nebraska, Lincoln, in 1959 and 1960, respectively, and the Ph.D. degree from Stanford University, Stanford, CA, in 1968, all in electrical engineering. From 1960 to 1963 he did microwave communications system design for the U.S.A.F. Dyna-Soar at Wright-Patterson AFB, OH. From 1963 to 1968 he was at Stanford University doing tunnel diode amplifier design and research on microwave propagation in the troposphere. From 1968 to 1973 he was a member of the Technical Staff of Bell Laboratories, Holmdel, NJ, doing research in mobile radio propagation and on high capacity mobile radio systems. He is now Supervisor of a group at Bell Laboratories doing propagation and systems research for portable-radio telephony and for millimeter-wave satellite communications.
Dr. Cox is a member of Commissions B, C, and F of USNC/URSI, Sigma Xi, Sigma Tau, Eta Kappa Nu, and Pi Mu Epsilon, and is a Registered Professional Engineer in Ohio and Nebraska.
GMSK Modulation for Digital Mobile Radio Telephony
KAZUAKI MUROTA, Member, IEEE, and KENKICHI HIRADE, Member, IEEE

Abstract-This paper is concerned with digital modulation for future mobile radio telephone services. First, the specific requirements on the digital modulation for mobile radio use are described. Then, premodulation Gaussian filtered minimum shift keying (GMSK) with coherent detection is proposed as an effective digital modulation for the present purpose, and its fundamental properties are clarified with the aid of machine computation. The constitution of modulator and demodulator is then discussed from the viewpoints of mobile radio applications. The superiority of this modulation is supported by some experimental test results.
I. INTRODUCTION

It is well known that voice transmission in many VHF and UHF mobile radio telephone systems has usually been made by using a single-channel-per-carrier (SCPC) analog FM transmission technique. However, in order to provide highly secure voice and/or high-speed data transmission by the use of large-scale integrated (LSI) transceivers, digital mobile radio transmission is currently being studied [1]-[7]. While digital transmission can surely bring many advantages, some technical problems must be solved. This paper is concerned with a digital modulation for future mobile radio communications.

From the viewpoint of mobile radio use, the out-of-band radiation power in the adjacent channel should generally be suppressed 60-80 dB below that in the desired channel. To satisfy this severe requirement, it is necessary to manipulate the RF output signal spectrum. Such a spectrum manipulation cannot usually be performed at the final RF stage in multichannel SCPC transceivers because the transmitted RF frequency is variable. Therefore, intermediate-frequency (IF) or baseband filtering with frequency up conversion is mostly used. However, when such a spectrum-manipulated signal is translated up and passed through a nonlinear class-C power amplifier, the required spectrum manipulation is violated by the nonlinearities. In order to mitigate the impairments, some narrow-band digital modulation schemes with constant or less fluctuated envelope property have been researched [8]-[10].

In this paper, premodulation Gaussian filtered minimum shift keying (GMSK) with coherent detection is proposed as an effective digital modulation for the present purpose, and its fundamental properties are analyzed with the aid of machine computation. The relationship between out-of-band radiation suppression and bit-error-rate (BER) performance is made clear. Constitution of the modulator and demodulator is then discussed. The superiority of this modulation is supported by some experimental test results.

II. GMSK MODULATION
A. Spectrum Manipulation of MSK
Minimum shift keying (MSK), which is binary digital FM with a modulation index of 0.5, has the following good properties: constant envelope, relatively narrow bandwidth, and coherent detection capability [11]-[13]. However, it does not satisfy the severe requirements with respect to out-of-band radiation for SCPC mobile radio. MSK can be generated by direct FM modulation. As is easily found, the output power spectrum of MSK can be manipulated by using a premodulation low-pass filter (LPF), keeping the constant envelope property, as shown in Fig. 1. To make the output power spectrum compact, the premodulation LPF should have the following properties:
1) narrow bandwidth and sharp cutoff
2) lower overshoot impulse response
3) preservation of the filter output pulse area which corresponds to a phase shift π/2.
Condition 1) is needed to suppress the high-frequency components, 2) is to protect against excessive instantaneous frequency deviation, and 3) is for coherent detection to be applicable as simple MSK.

Paper approved by the Editor for Communication Theory of the IEEE Communications Society for publication after presentation at the 29th IEEE Vehicular Technology Conference, Chicago, IL, March 1979. Manuscript received May 28, 1980; revised January 5, 1981.
The authors are with the Yokosuka Electrical Communication Laboratory, Nippon Telegraph and Telephone Public Corporation, Kanagawa-Ken, Japan.
Reprinted from IEEE Transactions on Communications, vol. COM-29, no. 7, July 1981.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

Fig. 1. Premodulation baseband-filtered MSK. (Block diagram: NRZ data, LPF, FM modulator, output.)

Generally, the introduction of the premodulation LPF violates the minimum frequency spacing constraint and the fixed-phase constraint of MSK. However, the above two constraints are not intrinsic requirements for effective coherent binary FM with modulation index 0.5. Such a premodulation-filtered MSK signal can be detected coherently because its pattern-averaged phase-transition trajectory does not deviate from that of simple MSK.
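Conditions 2) and 3) can be checked numerically for a Gaussian premodulation filter. The sketch below is illustrative only: it assumes a Gaussian impulse response whose 3 dB-down bandwidth is $B_b$, a unit-amplitude rectangular NRZ pulse of duration $T$, and the MSK peak deviation $1/(4T)$. The function names are hypothetical.

```python
import math


def gaussian_filtered_pulse(t, bt, T=1.0):
    """Response of a Gaussian LPF (3 dB bandwidth B_b = bt / T) to a unit
    rectangular pulse occupying [-T/2, T/2].

    Assumed impulse response: h(t) = (a / sqrt(pi)) * exp(-(a * t) ** 2),
    with a = pi * B_b * sqrt(2 / ln 2), so that |H(B_b)|^2 = 1/2.
    """
    a = math.pi * (bt / T) * math.sqrt(2.0 / math.log(2.0))
    return 0.5 * (math.erf(a * (t + T / 2)) - math.erf(a * (t - T / 2)))


def phase_shift_per_bit(bt, T=1.0, span=8.0, steps=16000):
    """Phase change produced by one filtered bit when the FM peak deviation
    is 1/(4T), i.e. modulation index 0.5 (midpoint-rule integration)."""
    dt = 2 * span * T / steps
    area = dt * sum(
        gaussian_filtered_pulse(-span * T + (k + 0.5) * dt, bt, T)
        for k in range(steps)
    )
    return 2 * math.pi * area / (4 * T)


if __name__ == "__main__":
    for bt in (0.2, 0.25, 0.5):
        print(bt, phase_shift_per_bit(bt) / math.pi)  # close to 0.5 in each case
```

Because the filter has unit gain at zero frequency, the pulse area stays equal to $T$ for any $B_bT$, so each bit still contributes a phase shift of π/2. The filtered pulse is also bounded between 0 and 1, which is the no-overshoot behavior asked for in condition 2).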
B. Fundamental Properties of GMSK
A Gaussian LPF satisfies all the above-described characteristics. Consequently, the modified MSK modulation using a premodulation Gaussian LPF can be expected to be an excellent digital modulation technique for the present purpose. Such a modified MSK is named Gaussian MSK or GMSK in connection with Gaussian low-pass filtering. Let us now investigate the GMSK modulation from various aspects.

Output Power Spectrum: Fig. 2 shows the machine-computed results of the output power spectrum of the GMSK signal versus the normalized frequency difference from the carrier center frequency $(f - f_c)T$, where the normalized 3 dB-down bandwidth of the premodulation Gaussian LPF, $B_bT$, is a parameter. The spectrum for GMSK with $B_bT = 0.2$ is nearly equal to that of TFM. The effective variable parameter $B_bT$ can be selected by the system designer considering overall spectrum efficiency of the cellular zone structure.

Fig. 2. Power spectra of GMSK. (Horizontal axis: normalized frequency $(f - f_c)T$.)

Fig. 3 shows the machine-computed results of the fractional power in the desired channel versus the normalized bandwidth of the predetection rectangular bandpass filter (BPF), $B_iT$. Table I shows the occupied bandwidth for the prescribed percentage of power where $B_bT$ is also a variable parameter. For comparison, the occupied bandwidth of TFM is also shown in Table I.

Fig. 3. Fractional power ratio of GMSK. (Horizontal axis: normalized bandwidth $B_iT$.)

TABLE I
OCCUPIED BANDWIDTH CONTAINING A GIVEN PERCENTAGE POWER

B_bT     90%     99%     99.9%   99.99%
0.2      0.52    0.79    0.99    1.22
0.25     0.57    0.86    1.09    1.37
0.5      0.69    1.04    1.33    2.08
MSK      0.78    1.20    2.76    6.00
TFM      0.52    0.79    1.02    1.37

Fig. 4 shows the machine-computed results of the ratio of the out-of-band radiation power in the adjacent channel to the total power in the desired channel, where the normalized channel spacing $f_sT$ is taken as the abscissa and both channels are assumed to have the ideal rectangular bandpass characteristics with $B_iT = 1$. The situation of $f_sT = 1.5$ and $B_iT = 1$ corresponds to the case of $f_s \approx 25$ kHz and $B_i = 16$ kHz when $f_b = 1/T = 16$ kbits/s. From Fig. 4, it is found that the GMSK with $B_bT = 0.28$ can be adopted as the digital modulation for conventional VHF and UHF SCPC mobile radio communications without carrier frequency drift, where the ratio of out-of-band radiation power in the adjacent channel to the total power in the desired channel must be lower than -60 dB. When a certain amount of carrier frequency drift (for example, $\Delta f = \pm 1.5$ kHz) exists, $B_bT = 0.2$ is needed.

Fig. 4. Adjacent channel interference of GMSK. (Horizontal axis: normalized channel separation $f_sT$.)

BER Performance: Let us now consider the theoretical BER performance of GMSK modulation using coherent detection in the presence of additive white Gaussian noise. Since the GMSK modulation of interest is a certain kind of binary digital modulation, its BER performance bound in the high SNR condition is approximately represented as

$$P_e \approx \frac{1}{2}\,\operatorname{erfc}\!\left(\frac{d_{\min}}{2\sqrt{N_0}}\right) \tag{1}$$

where $N_0$ is the power spectrum density of the additive white Gaussian noise and erfc( ) is the complementary error function given by

$$\operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} \exp(-u^2)\,du. \tag{2}$$

Furthermore, $d_{\min}$ is the minimum value of the signal distance $d$ between mark and space in Hilbert space observed during the time interval from $t_1$ to $t_2$, and $d$ is defined by

$$d^2 = \frac{1}{2} \int_{t_1}^{t_2} |u_m(t) - u_s(t)|^2\,dt \tag{3}$$

where $u_m(t)$ and $u_s(t)$ are the complex signal waveforms corresponding to the mark and the space transmissions, respectively.

While the BER performance bound given by (1) is attained only when the ideal maximum likelihood detection is adopted, it gives an approximate solution for the ideal BER performance of GMSK modulation with coherent detection.

Fig. 5 shows the machine-computed results for $d_{\min}$ of the GMSK signal versus $B_bT$, where $E_b$ denotes the signal energy per bit defined by

$$E_b = \frac{1}{2} \int_0^T |u_m(t)|^2\,dt = \frac{1}{2} \int_0^T |u_s(t)|^2\,dt. \tag{4}$$

Fig. 5. Normalized minimum signal distance of GMSK. (Horizontal axis: LPF bandwidth $B_bT$.)

In the case $B_bT \to \infty$, which corresponds to the simple MSK signal, Fig. 5 yields $d_{\min} = 2\sqrt{E_b}$, which is that of antipodal transmission. It is noticed that the meaningful observation time interval for the GMSK signal, $t_2 - t_1$, may be made longer than $2T$, which corresponds to that for the simple MSK signal, due to the intersymbol interference (ISI) effect on the phase transitions. Substituting the machine-computed results of $d_{\min}$ into (1), the BER performance of the GMSK modulation with coherent detection is obtained. Fig. 6 shows the performance degradation of GMSK from antipodal transmission due to the ISI effect of the premodulation LPF. This figure shows that the performance degradation is small and that the required $E_b/N_0$ of GMSK with $B_bT = 0.25$ exceeds that of antipodal transmission by no more than 0.7 dB.

III. IMPLEMENTATION

A. Modulator

The simplest method is to modulate the frequency of a VCO directly by the use of a baseband Gaussian pulse stream, as shown in Fig. 1. However, this modulator has the weak point that it is difficult to keep the center frequency within the allowable value under the restriction of maintaining the linearity and the sensitivity for the required FM modulation. Such a weak point can be removed by the use of an elaborate PLL modulator with precisely designed transfer characteristics or an orthogonal modulator with digital waveform generators [14]. Instead of such a modulator, a π/2-shift binary PSK (BPSK) modulator followed by a suitable PLL phase smoother, as shown in Fig. 7, is considered to be a prominent alternative, where the transfer characteristics of this PLL are also designed for the output power spectrum to satisfy the required condition.
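The direct-FM approach of Fig. 1 can be mimicked at complex baseband. The following is a minimal sketch under stated assumptions, not the modulator used in the paper: an ideal VCO is replaced by numerical phase integration, the Gaussian filter has 3 dB-down bandwidth $B_b$, the filtered pulse is truncated to a few bit periods, and $T = 1$. The names `gmsk_phase`, `sps`, and `span` are hypothetical.

```python
import cmath
import math


def gmsk_phase(bits, bt, sps=16, span=4):
    """Phase trajectory (radians) of a direct-FM GMSK modulator, T = 1.

    bits : iterable of 0/1 values, mapped to NRZ levels -1/+1
    bt   : normalized 3 dB bandwidth B_b * T of the Gaussian LPF
    sps  : samples per bit (assumed oversampling factor)
    span : each filtered pulse is truncated to +/- span bit periods
    """
    bits = list(bits)
    a = math.pi * bt * math.sqrt(2.0 / math.log(2.0))

    def pulse(t):
        # Gaussian-filtered unit rectangular pulse centered at t = 0
        return 0.5 * (math.erf(a * (t + 0.5)) - math.erf(a * (t - 0.5)))

    n = len(bits) * sps
    freq = [0.0] * n  # filtered NRZ waveform driving the frequency input
    for k, b in enumerate(bits):
        level = 1.0 if b else -1.0
        centre = k + 0.5
        for i in range(n):
            t = (i + 0.5) / sps - centre
            if abs(t) <= span:
                freq[i] += level * pulse(t)

    # Peak deviation 1/(4T): d(phase)/dt = (pi / 2) * filtered NRZ value
    phase, out = 0.0, []
    for f in freq:
        phase += (math.pi / 2) * f / sps
        out.append(phase)
    return out


if __name__ == "__main__":
    phases = gmsk_phase([1, 0, 0, 1, 1, 1, 0, 1], bt=0.25)
    envelope = [abs(cmath.exp(1j * p)) for p in phases]
    print(min(envelope), max(envelope))  # constant envelope
```

With a very wide filter the sketch approaches simple MSK, where each bit advances the phase by π/2. For small $B_bT$ the per-bit phase change is spread over neighboring bits, which is the ISI effect discussed in Section II-B.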
Fig. 6. Theoretical $E_b/N_0$ degradation of GMSK. (Horizontal axis: LPF bandwidth $B_bT$.)

Fig. 7. PLL-type GMSK modulator.

B. Demodulator

Similar to the simple MSK or TFM system, the orthogonal coherent detector is also applicable for the GMSK system. When realizing such an orthogonal coherent detector, one of the most important and difficult problems is how to recover the reference carrier and the timing clock. The most typical method is de Buda's [12]. In his method, the reference carrier is recovered by dividing by four the sum of the two discrete frequencies contained in the frequency doubler output, and the timing clock is directly recovered by their difference. Remembering that the action of the well-known Costas loop as a carrier recovery circuit for BPSK systems is equivalent to that of a PLL with a frequency doubler [15], de Buda's method is realized by the equivalent one shown in Fig. 8(a). This modified method can easily be implemented by conventional digital logic circuits, and its configuration is also shown in Fig. 8(b). In this configuration, two D flip-flops act as the quadrature product demodulators and both of the Exclusive-Or logic circuits are used for the baseband multipliers. Furthermore, the mutually orthogonal reference carriers are generated by the use of two D flip-flops, and the VCO center frequency is then set equal to four times the carrier center frequency. This configuration is considered to be especially suitable for the mobile radio unit, which must be simplified, miniaturized, and economized.

Fig. 8. Orthogonal coherent detector for MSK/GMSK. (a) Analog type. (b) Digital type.

IV. EXPERIMENTS

A. Test System

Fig. 9 shows the block diagram of the experimental test system, where the carrier frequency and the bit rate are $f_c = 70$ MHz and $f_b = 16$ kbits/s, respectively. A pseudonoise (PN) pulse sequence with a repetition period of $N = (2^{15} - 1)$ bits is generated by the 15-stage feedback shift register (FSR) and is used as a test pattern signal. After passing through a premodulation Gaussian LPF having a variable bandwidth $B_b$, the PN sequence is put into the synthesized RF signal generator having an external FM modulation capability. The frequency deviation of the RF signal generator is set equal to $\Delta f_d = \pm 4$ kHz, which corresponds to the MSK condition for the 16 kbits/s transmission. Then the GMSK signal of our choice is obtained as the RF signal generator output, and is transmitted into the receiver via the Rayleigh fading simulator [16]. Predetection bandpass filtering in the receiver is performed by the precisely designed Gaussian bandpass crystal filter. The bandpass-filtered output is demodulated by the digital orthogonal coherent detector shown in Fig. 8. The regenerated output is fed into the error-rate counter for the BER measurement.

Fig. 9. Block diagram of experimental test system.

B. Power Spectrum and Eye Pattern

Fig. 10 shows the measured power spectra of the RF signal generator output when $B_bT$ is a variable parameter. It is clearly seen that the measured results agree well with the machine-computed ones shown in Fig. 2. Moreover, GMSK with $B_bT = 0.25$ is shown to satisfy the severe requirements
of the out-of-band radiation of SCPC mobile radio communications. The corresponding eye pattern measured at the premodulation Gaussian LPF output is shown in Fig. 11. This figure shows that the above satisfactory performance of the out-of-band radiation can only be attained by the sacrifice of introducing severe ISI effects into the baseband waveform of the FM modulator input. It might be feared that such a severe ISI effect causes inferior transmission performance. However, this misgiving is happily unwarranted because the demodulator output of GMSK with $B_bT = 0.25$ degrades only slightly from that of simple MSK. This is easily seen from Fig. 12, which shows the respective eye patterns measured by the analog-type orthogonal coherent detector shown in Fig. 8(a). It is also confirmed by the BER performance test results described later.

Fig. 10. Measured power spectra of GMSK (V: 10 dB/div., H: 10 kHz/div.).

Fig. 11. Instantaneous frequency variation of GMSK.

Fig. 12. GMSK eye patterns demodulated by orthogonal coherent detector.

C. Static BER Performance

Fig. 13 shows experimental test results for static BER performance in the nonfading environment, where the normalized 3 dB-down bandwidth of the premodulation Gaussian LPF, $B_bT$, is a variable parameter and the normalized 3 dB-down bandwidth of the predetection Gaussian BPF is $B_iT \approx 0.63$, i.e., $B_i = 10$ kHz for $f_b = 1/T = 16$ kbits/s. The condition $B_iT \approx 0.63$ is nearly optimum, as shown in Fig. 14. From Fig. 13, performance degradation of GMSK with $B_bT = 0.25$ relative to simple MSK is found to be only 1.0 dB. Moreover, the measured static BER performance of simple MSK degrades by 0.7 dB from the theoretical one of the ideal antipodal binary transmission system. If $\gamma$ denotes the received signal energy-to-noise density ratio, i.e., $E_b/N_0$, the measured static BER performance in the nonfading environment can be approximated as

$$P_e(\gamma) \cong \frac{1}{2}\,\operatorname{erfc}\!\left(\sqrt{\alpha\gamma}\right) \tag{5}$$

where erfc( ) is the complementary error function given by (2) and $\alpha$ is a constant parameter determined as

$$\alpha \cong \begin{cases} 0.68 & \text{for GMSK with } B_bT = 0.25 \\ 0.85 & \text{for simple MSK } (B_bT \to \infty). \end{cases} \tag{6}$$

Fig. 13. Static BER performance. (Predetection BPF: $B_iT = 0.63$.)

Fig. 14. Degradation of required $E_b/N_0$ for obtaining BER of $10^{-3}$. (Horizontal axis: predetection BPF bandwidth $B_iT$.)
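The fitted model (5) with the constants in (6) is easy to evaluate, and doing so reproduces the decibel figures quoted in the text. The sketch below is illustrative; the dictionary keys and function names are hypothetical, and the reference value $\alpha = 1$ for ideal antipodal transmission is the standard coherent binary result rather than a number taken from (6).

```python
import math

# Fitted constants from (6); alpha = 1 is assumed for ideal antipodal signaling.
ALPHA = {"gmsk_bt_0.25": 0.68, "msk": 0.85, "antipodal": 1.0}


def static_ber(ebn0_db, alpha):
    """Static (nonfading) BER from (5) for a given Eb/N0 in dB."""
    gamma = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(alpha * gamma))


def penalty_db(alpha, alpha_ref=1.0):
    """Extra Eb/N0 in dB needed to match the BER of the reference scheme."""
    return 10 * math.log10(alpha_ref / alpha)


if __name__ == "__main__":
    print(penalty_db(ALPHA["msk"]))                         # MSK vs. antipodal
    print(penalty_db(ALPHA["gmsk_bt_0.25"], ALPHA["msk"]))  # GMSK vs. MSK
    for ebn0 in (6, 8, 10, 12):
        print(ebn0, static_ber(ebn0, ALPHA["gmsk_bt_0.25"]))
```

Since $\alpha$ simply scales $\gamma$ inside the square root, the ratio of two $\alpha$ values is a horizontal shift of the BER curve: about 0.7 dB between simple MSK and ideal antipodal transmission, and about 1.0 dB between GMSK with $B_bT = 0.25$ and simple MSK.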
The above-obtained results can be estimated by the degradation of the minimum signal distance shown in Figs. 5 and 6.

D. Dynamic BER Performance

In the practical VHF/UHF land mobile radio environment, signal transmission between a fixed base station and a moving vehicle is usually performed via random multiple propagation routes. Consequently, fast and deep multipath fading, which can generally be treated by the well-known Rayleigh fading model, appears on the received signals of both stations and degrades the signal transmission performance severely. In particular, when a quasi-stationary slow Rayleigh fading model is assumed, dynamic BER performance is given by

$$P_e(\Gamma) = \int_0^{\infty} P_e(\gamma)\,p(\gamma)\,d\gamma \tag{7}$$

where $\Gamma$ is the average $E_b/N_0$ and $p(\gamma)$ is the probability density function (pdf) of $\gamma$ given by

$$p(\gamma) = \frac{1}{\Gamma} \exp\!\left(-\frac{\gamma}{\Gamma}\right). \tag{8}$$

Substituting (5) and (8) into (7) yields

$$P_e(\Gamma) \cong \frac{1}{2}\left(1 - \sqrt{\frac{\alpha\Gamma}{\alpha\Gamma + 1}}\right) \cong \frac{1}{4\alpha\Gamma} \tag{9}$$

where $\alpha$ is the constant parameter given by (6). However, the dynamic BER performance in the fast Rayleigh fading environment, where the temporal variation effect of the fading cannot be neglected, has not yet been theoretically estimated because the tracking performance of the carrier recovery circuit in such an environment cannot be analyzed.

Fig. 15 shows the experimental test results of dynamic BER performance of the GMSK with $B_bT = 0.25$ in the simulated fast Rayleigh fading environment, where the maximum Doppler frequency, i.e., the fading rate $f_D$, is a variable parameter. For comparison, theoretically estimated dynamic BER performance in the quasi-stationary slow Rayleigh fading environment, i.e., $f_DT \to 0$, is also shown by the dashed line in the same figure.

Fig. 15. Dynamic BER performance. ($B_bT = 0.25$.)

V. CONCLUSION

As an effective digital modulation for mobile radio use, premodulation Gaussian-filtered minimum shift keying (GMSK) modulation with coherent detection has been proposed. The fundamental properties have been analyzed with the aid of machine computation. The constitution of modulator and demodulator has also been discussed. The superiority of this modulation has been supported by experimental results.

ACKNOWLEDGMENT

The authors wish to thank Dr. K. Miyauchi, S. Ito, K. Izumi, and Dr. S. Seki for their helpful guidance. They also are grateful to Dr. M. Ishizuka and H. Suzuki for their fruitful discussions.

REFERENCES

[1] O. Bellinger, "Digital speech transmission for mobile radio service," Elec. Commun., vol. 47, pp. 224-230, 1972.
[2] J. S. Bitler and C. O. Stevens, "A UHF mobile telephone system using digital modulation: Preliminary study," IEEE Trans. Vehic. Technol., vol. VT-22, pp. 78-81, Aug. 1973.
[3] N. S. Jayant, R. W. Schafer, and M. R. Karim, "Step-size-transmitting differential coders for mobile telephony," in Proc. IEEE Int. Conf. Commun., June 1975, pp. 30/6-30/10.
[4] D. L. Duttweiler and D. G. Messerschmitt, "Nearly instantaneous companding and time diversity as applied to mobile radio transmission," in Proc. IEEE Int. Conf. Commun., June 1975, pp. 40/12-40/15.
[5] J. C. Feggeler, "A study of digitized speech in mobile telephony," presented at the Symp. on Microwave Mobile Commun., session V-3, Boulder, CO, Sept.-Oct. 1976.
[6] H. M. Sachs, "Digital voice considerations for the land mobile radio services," in Proc. IEEE 27th Vehic. Technol. Conf., Mar. 1977, pp. 207-219.
[7] K. Hirade and M. Ishizuka, "Feasibility of digital voice transmission in mobile radio communications," Paper Tech. Group, IECE Japan, vol. CS78-2, Apr. 1978.
[8] F. G. Jenks, P. D. Morgan, and C. S. Warren, "Use of four-level phase modulation for digital mobile radio," IEEE Trans. Electromagn. Compat., vol. EMC-14, pp. 113-128, Nov. 1972.
[9] P. K. Kwan, "The effects of filtering and limiting a double-binary PSK signal," IEEE Trans. Aerosp. Electron. Syst., vol. AES-5, pp. 589-594, July 1969.
[10] S. A. Rhodes, "Effects of hardlimiting on bandlimited transmission with conventional and offset QPSK modulation," in Proc. IEEE Nat. Telecommun. Conf., 1972, pp. 20F/1-20F/7.
[11] H. C. van den Elzen and P. van der Wurf, "A simple method of calculating the characteristics of FSK signals with modulation index 0.5," IEEE Trans. Commun., vol. COM-20, pp. 139-147, Apr. 1972.
[12] R. de Buda, "Coherent demodulation of frequency shift keying with low deviation ratio," IEEE Trans. Commun., vol. COM-20, pp. 466-470, June 1972.
[13] H. Miyakawa et al., "Digital phase modulation scheme using continuous-phase waveform," Trans. IECE Japan, vol. 58-A, pp. 767-774, Dec. 1975.
[14] F. de Jager and C. B. Dekker, "Tamed frequency modulation, a novel method to achieve spectrum economy in digital transmission," IEEE Trans. Commun., vol. COM-26, pp. 534-542, May 1978.
[15] R. L. Didday and W. C. Lindsey, "Subcarrier tracking methods and communication system design," IEEE Trans. Commun. Technol., vol. COM-16, pp. 541-550, Aug. 1968.
[16] K. Hirade et al., "Fading simulator for land mobile radio communications," Trans. IECE Japan, vol. 58-B, pp. 449-459, Sept. 1975.
Continuous Phase Modulation-Part I: Full Response Signaling
TOR AULIN, MEMBER, IEEE, AND CARL-ERIK W. SUNDBERG, MEMBER, IEEE

Abstract-The continuous phase modulation (CPM) signaling scheme has gained interest in recent years because of its attractive spectral properties. Data symbol pulse shaping has previously been studied with regard to spectra, for binary data and modulation index 0.5. In this paper these results have been extended to the M-ary case, where the pulse shaping is over a one symbol interval, the so-called full response systems. Results are given for modulation indexes of practical interest, concerning both performance and spectrum. Comparisons are made with minimum shift keying (MSK), and systems have been found which are significantly better in $E_b/N_0$ for a large signal-to-noise ratio (SNR) without expanded bandwidth. Schemes with the same bit error probability as MSK but with considerably smaller bandwidth have also been found. Significant improvement in both power and bandwidth are obtained by increasing the number of levels M from 2 to 4.

I. INTRODUCTION

FOR digital transmission over bandlimited channels, the demand for bandwidth efficient constant envelope signaling schemes with good reliability has increased in recent years. A system often used in practice is multilevel phase shift keying, M-ary PSK, which has the drawback that although, for M equal to 2 or 4, the receiver sensitivity is acceptable, the signal is too wide-band because of discontinuous phase. Thus, RF filtering has to be performed before transmission, causing a nonconstant envelope signal and a decreased receiver sensitivity. The so-called minimum shift keying (MSK), or fast frequency shift keying (FFSK), binary signaling schemes opened new prospects since the error probability performance is the same as coherent 2- or 4-ary PSK but the spectrum is narrower for large frequencies. Choosing an M larger than 4 (e.g., M = 8 or M = 16) in the MPSK system makes the main lobe of the spectrum narrower, but the sensitivity to noise is considerably increased.

A general definition of continuous phase modulation (CPM) systems is given in the next section. Assume that each data symbol only affects the instantaneous frequency of the transmitted signal in one symbol interval and that the phase is a continuous function of time. This defines the subclass of full response CPM systems considered in this paper. In Part II more general CPM schemes are considered. In some cases the phase is allowed to be discontinuous while maintaining the coupling between the phase in successive symbol intervals. By increasing M, an interesting tradeoff between symbol error probability at large signal-to-noise ratio (SNR) and spectrum is achieved. This tradeoff is studied for modulation indexes of practical interest and also for systems where the instantaneous frequency is not constant over each symbol interval.

The channel noise is assumed to be additive, white, and Gaussian throughout the paper. The symbol error probability for an optimum detector at large SNR is calculated using the minimum Euclidean distance between any two signals in the signal space [3]. The optimum detector operates coherently, and due to the continuous phase, the detector must observe the received signal for more than one symbol interval to make a decision about a specific symbol [3].

Manuscript received March 19, 1980; revised September 19, 1980. This work was supported by the Swedish Board of Technical Development under Grant 79-3594.
The authors are with the Department of Telecommunication Theory, University of Lund, Fack, S-220 07 Lund, Sweden.

II. GENERAL SYSTEM DESCRIPTION
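For CPM systems, the transmitted signal is

$$s(t, \alpha) = \sqrt{\frac{2E}{T}} \cos\big(2\pi f_0 t + \varphi(t, \alpha) + \varphi_0\big) \qquad (1)$$

where the information carrying phase is

$$\varphi(t, \alpha) = 2\pi h \int_{-\infty}^{t} \sum_{i=-\infty}^{\infty} \alpha_i\, g(\tau - iT)\, d\tau; \qquad -\infty < t < \infty \qquad (2)$$

and $\alpha = \ldots \alpha_{-2}\, \alpha_{-1}\, \alpha_0\, \alpha_1 \ldots$ is an infinitely long sequence of uncorrelated M-ary data symbols, each taking one of the values

$$\alpha_i = \pm 1, \pm 3, \ldots, \pm(M-1); \qquad i = 0, \pm 1, \pm 2, \ldots \qquad (3)$$

with equal probability $1/M$. (M is assumed even.) E is the symbol energy, T is the symbol time, $f_0$ is the carrier frequency, and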
Reprinted from IEEE Transactions on Communications, vol. COM-29, no. 3, March 1981. The Best ofthe Best. Edited by W H. Tranter, D. P Taylor, R. E. Ziemer, N. F. Maxemchuk, and 1. W Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. Schematic modulator for CPM.
pulses. A schematic modulator is shown in Fig. 1. Note that a CPM signal always has a constant envelope. Defining the baseband phase response (phase pulse)
$$q(t) = \int_{-\infty}^{t} g(\tau)\, d\tau; \qquad -\infty < t < \infty \qquad (4)$$

it is seen that the phase of the CPM signal is formed by

$$\varphi(t, \alpha) = 2\pi h \sum_{i=-\infty}^{\infty} \alpha_i\, q(t - iT); \qquad -\infty < t < \infty. \qquad (5)$$
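As a minimal sketch of (5), the phase of a full response signal can be evaluated by summing shifted copies of the phase response. The function names `q_cpfsk` and `cpm_phase` are illustrative and not taken from the paper, the linear CPFSK phase response is assumed for q(t), and the symbol sequence is assumed to start at i = 0 (earlier symbols are simply left out).

```python
import math

def q_cpfsk(t, T=1.0):
    """Linear (CPFSK) phase response: 0 before the symbol, 1/2 after it."""
    if t <= 0.0:
        return 0.0
    if t >= T:
        return 0.5
    return t / (2.0 * T)

def cpm_phase(t, symbols, h, q=q_cpfsk, T=1.0):
    """Phase 2*pi*h * sum_i alpha_i * q(t - i*T) for symbols alpha_0, alpha_1, ..."""
    return 2.0 * math.pi * h * sum(a * q(t - i * T, T) for i, a in enumerate(symbols))

# With h = 1/2, one symbol +1 advances the phase by pi/2,
# and the pair +1, -1 brings it back to zero at t = 2T.
phase_after_one = cpm_phase(1.0, [+1], h=0.5)
phase_after_two = cpm_phase(2.0, [+1, -1], h=0.5)
```

Any other phase response with the same `q(t, T)` call signature can be passed in place of `q_cpfsk`.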
A causal CPM system is obtained if the frequency pulse g(t) satisfies

$$g(t) \equiv 0; \quad t < 0,\ t > LT \qquad\qquad g(t) \not\equiv 0; \quad 0 \le t \le LT \qquad (6)$$

where the pulse length L, measured in symbol intervals T, may be infinite. L = 1 yields the full response schemes considered in this part. The normalizing constraint for the frequency pulse g(t) can be expressed as

$$q(LT) = \frac{1}{2}. \qquad (7)$$

The CPFSK modulation schemes [5], [7], [13] are a subclass of the CPM signaling scheme where the instantaneous frequency is constant over each symbol interval. Thus, for a full response CPFSK modulation scheme, we have

$$q(t) = \begin{cases} 0; & t \le 0 \\ \dfrac{t}{2T}; & 0 \le t \le T \\ \dfrac{1}{2}; & t \ge T \end{cases} \qquad (8)$$
which corresponds to linear phase trajectories over each symbol interval (see Fig. 2). Note that although the scheme is full response, the actual phase in any specific symbol interval depends upon the previous data symbols. The CPM signal is assumed to be transmitted over an additive, white, and Gaussian channel having a one-sided noise power spectral density No. Thus, the signal available for observation is
$$r(t) = s(t, \alpha) + n(t); \qquad -\infty < t < \infty \qquad (9)$$

where $n(t)$ is a Gaussian random process having zero mean and one-sided power spectral density $N_0$. A detector which minimizes the probability of erroneous decisions must observe the received signal $r(t)$ over the entire time axis and choose the infinitely long sequence $\alpha$ which minimizes the error probability. This is referred to as maximum likelihood sequence estimation (MLSE).

In order to study the performance of an optimum MLSE detector, a suboptimum detector is studied instead. The limiting case of this suboptimum detector is the MLSE detector [7], [11]. This suboptimum detector observes the received signal $r(t)$ for N symbol intervals to make a decision about a specific data symbol, say $\alpha_0$. Thus, the receiver observes the signal

$$r(t) = s(t, \alpha) + n(t); \qquad 0 \le t \le NT \qquad (10)$$

and if we let $N \to \infty$, an MLSE detector is obtained. An optimum detector maximizes the likelihood function [3], [5]

$$\Lambda_N[r(t)] = \exp\left(-\frac{1}{N_0} \int_0^{NT} \big[r(t) - s(t, \tilde{\alpha})\big]^2\, dt\right) \qquad (11)$$

and since the quantities $\int_0^{NT} r^2(t)\, dt$ and $\int_0^{NT} s^2(t, \tilde{\alpha})\, dt$ are independent of $\tilde{\alpha}_N = \tilde{\alpha}_0, \tilde{\alpha}_1, \ldots, \tilde{\alpha}_{N-1}$, one can, as well, maximize

$$\Lambda_N'[r(t)] = \exp\left(\frac{2}{N_0} \int_0^{NT} r(t)\, s(t, \tilde{\alpha})\, dt\right) \qquad (12)$$

or the log likelihood function, which up to the positive constant $2/N_0$ is

$$\log\big(\Lambda_N'[r(t)]\big) \propto \int_0^{NT} r(t)\, s(t, \tilde{\alpha})\, dt. \qquad (13)$$

Fig. 2. Phase trajectories for a binary full response CPFSK system. Four bit time intervals are shown.
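Since there are $M^N$ sequences $\tilde{\alpha}_N = \tilde{\alpha}_0, \tilde{\alpha}_1, \ldots, \tilde{\alpha}_{N-1}$, but the detector is only interested in finding an estimate of $\alpha_0$, the $M^N$ sequences can be formed into M groups:

$$\tilde{\alpha}_{1,N}, \tilde{\alpha}_{3,N}, \ldots, \tilde{\alpha}_{(M-1),N}; \qquad \tilde{\alpha}_{-1,N}, \tilde{\alpha}_{-3,N}, \ldots, \tilde{\alpha}_{-(M-1),N} \qquad (14)$$

where

$$\tilde{\alpha}_{k,N} = k, \tilde{\alpha}_1, \tilde{\alpha}_2, \ldots, \tilde{\alpha}_{N-1}; \qquad k = \pm 1, \pm 3, \ldots, \pm(M-1) \qquad (15)$$

and it is not necessary for the detector to find the specific sequence $\tilde{\alpha}_N$ which maximizes (13) and to choose its first symbol as the estimate of $\alpha_0$; finding the group to which that sequence belongs is enough. The resulting symbol error probability $P_e$ is bounded by

$$P_e \le \frac{1}{M^N} \sum_{k} \sum_{l \ne k} Q\left[\frac{D(\alpha_{k,N}, \alpha_{l,N})}{\sqrt{2N_0}}\right] \qquad (16)$$

where

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-v^2/2}\, dv \qquad (17)$$

and the summation is taken over all pairs of sequences defined by (15), with the restriction that $k \ne l$; $k, l = \pm 1, \pm 3, \ldots, \pm(M-1)$. $D[\alpha_{k,N}, \alpha_{l,N}]$ is the Euclidean distance between the signals $s(t, \alpha_{k,N})$ and $s(t, \alpha_{l,N})$.

The squared Euclidean distance can be written

$$D^2(\alpha_{k,N}, \alpha_{l,N}) = \int_0^{NT} \big[s(t, \alpha_{k,N}) - s(t, \alpha_{l,N})\big]^2\, dt. \qquad (18)$$

Assuming $2\pi f_0 T \gg 1$, this can be written

$$D^2(\alpha_{k,N}, \alpha_{l,N}) = 2E\left(N - \frac{1}{T} \int_0^{NT} \cos\left[2\pi h \sum_{i=0}^{N-1} (\alpha_i^k - \alpha_i^l)\, q(t - iT)\right] dt\right). \qquad (19)$$

The superscript denotes the value of the first symbol in a sequence of N symbols, i.e.,

$$\alpha_0^k = k, \qquad \alpha_0^l = l. \qquad (20)$$

Equation (19) can be written

$$D^2 = 2E\left(N - \frac{1}{T} \int_0^{NT} \cos\big[\varphi(t, \alpha_{k,N} - \alpha_{l,N})\big]\, dt\right). \qquad (21)$$

Thus, it is sufficient to consider the difference sequence

$$\gamma_N = \alpha_{k,N} - \alpha_{l,N} \qquad (22)$$

instead of the pair of sequences $\alpha_{k,N}$ and $\alpha_{l,N}$. The approximation

$$P_e \approx \Gamma_0 \cdot Q\left[\frac{D_{\min,N}}{\sqrt{2N_0}}\right] \qquad (23)$$

of (16) is good for large $E/N_0$. It is now assumed that $E/N_0$ is sufficiently large for this approximation to be valid. The limitations of this assumption are considered in detail in [20]. $\Gamma_0$ is a positive constant, independent of $E/N_0$, and $D_{\min,N}$ is the minimum of $D(\alpha_{k,N}, \alpha_{l,N})$ with respect to the pair of sequences $\alpha_{k,N}$ and $\alpha_{l,N}$ with the restriction that $k \ne l$. This quantity can also be calculated using the difference sequence $\gamma_N$ through

$$D_{\min,N}^2 = 2E \cdot \min_{\gamma_N} \left\{ N - \frac{1}{T} \int_0^{NT} \cos\big[\varphi(t, \gamma_N)\big]\, dt \right\} \qquad (24)$$

with the restriction that

$$\gamma_0 = 2, 4, 6, \ldots, 2(M-1) \qquad (25)$$

$$\gamma_i = 0, \pm 2, \pm 4, \ldots, \pm 2(M-1); \qquad i = 1, 2, \ldots, N-1. \qquad (26)$$

In Part II, we will only deal with squared Euclidean distances normalized by bit energy

$$d^2 = \frac{D^2}{2E_b} \qquad (27)$$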
where $E_b$ is the bit energy. Thus, error probability comparisons for large SNR can be made directly in $E_b/N_0$ between systems even if they have different M. Only values of M that are powers of two (M = 2, 4, 8, ...) will be considered. As a reference point in the following, note that $d_{\min}^2 = 2$ for MSK, binary PSK (BPSK), and quaternary phase shift keying (QPSK). For the case of full response CPFSK systems, calculations of $d_{\min,N}^2$ have been considered for both the binary case [1], [5]-[7], [13] and also for M = 4 and M = 8 to some extent [10], [11], [13].
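The minimization over difference sequences described above can be sketched as a brute-force search for full response CPFSK. This is only an illustration under stated assumptions: the function names are hypothetical, the phase difference is taken to be linear within each symbol interval (true for CPFSK only), so each interval integral has a closed form, and the exhaustive search is practical only for small M and N.

```python
import itertools
import math

def interval_cos_integral(theta_start, theta_end):
    """(1/T) * integral of cos(phase difference) over one symbol interval,
    for a phase difference moving linearly from theta_start to theta_end."""
    delta = theta_end - theta_start
    if abs(delta) < 1e-12:
        return math.cos(theta_start)
    return (math.sin(theta_end) - math.sin(theta_start)) / delta

def d2_min_cpfsk(M, h, N):
    """Normalized squared minimum distance over N symbols for M-ary CPFSK."""
    first = range(2, 2 * M, 2)               # gamma_0 = 2, 4, ..., 2(M-1)
    later = range(-2 * (M - 1), 2 * M, 2)    # gamma_i = 0, +-2, ..., +-2(M-1)
    best = float("inf")
    for g0 in first:
        for tail in itertools.product(later, repeat=N - 1):
            theta, total = 0.0, 0.0
            for g in (g0,) + tail:
                nxt = theta + math.pi * h * g   # phase difference grows by pi*h*gamma_i
                total += interval_cos_integral(theta, nxt)
                theta = nxt
            best = min(best, math.log2(M) * (N - total))
    return best

msk_two_intervals = d2_min_cpfsk(2, 0.5, 2)      # MSK observed over two bits
cpfsk_three = d2_min_cpfsk(2, 0.715, 3)          # binary CPFSK, h = 0.715, N = 3
```

For M = 2 the search reproduces the reference value 2 for MSK with N = 2, and roughly 2.43 for h = 0.715 with N = 3.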
III. BOUNDS ON THE MINIMUM EUCLIDEAN DISTANCE

An important tool for the analysis of CPM systems is the so-called phase tree. This tree is formed by all phase trajectories $\varphi(t, \alpha)$ having a common start value zero at t = 0. The ensemble is over the sequence $\alpha$, and Fig. 2 shows a part of the phase tree for a binary full response CPFSK system. A more general case is shown in Fig. 3, where two phase trees for a quaternary CPM system having different frequency baseband pulses g(t) are shown.

To calculate the minimum squared Euclidean distance for an observation length of N symbols, all pairs of phase trajectories in the phase tree over N symbol intervals must be considered. The phase trajectories must not coincide over the first symbol interval, however. The Euclidean distance is calculated according to (21) for all these pairs, and the minimum of these Euclidean distances is the desired result. It is of great importance to remember that the phase must always be viewed modulo $2\pi$ in conjunction with distance calculations. A practical method to do this is to form a cylinder by folding the phase tree [16], [17]. Trajectories which seem to be far apart in the phase tree might actually be very close or even coincide when viewed modulo $2\pi$.

It is clear from (18) that, for a fixed pair of phase trajectories, the Euclidean distance is a nondecreasing function of the observation length N. If just a few pairs of infinitely long sequences are chosen, an upper bound on the minimum Euclidean distance at all values of the observation interval N is obtained. Good candidates for these infinitely long pairs are pairs that merge as soon as possible. Two phase trajectories merge at a certain time if they coincide all the time thereafter. These merges are called inevitable if they occur independently of h. Thus, an upper bound on the minimum Euclidean distance is obtained as a function of the modulation index h for all N.

Applying this method to the scheme in Fig. 2, it is seen that if a pair of sequences is chosen as
$$\alpha_{+1} = +1, -1, \alpha_2, \alpha_3, \ldots \qquad\qquad \alpha_{-1} = -1, +1, \alpha_2, \alpha_3, \ldots \qquad (28)$$

the two phase trajectories coincide for all $t \ge 2T$. Thus, the upper bound on the normalized minimum squared Euclidean distance is

$$d_B^2(h) = 2 - \frac{1}{T} \int_0^{2T} \cos\big[2\pi h\,(2q(t) - 2q(t - T))\big]\, dt \qquad (29)$$

where (21) was used. For the binary CPFSK system, i.e., linear phase trajectories, the result is

$$d_B^2(h) = 2\left(1 - \frac{\sin 2\pi h}{2\pi h}\right). \qquad (30)$$

Fig. 3. Phase trees for M = 4 CPM schemes with two different baseband pulses g(t). The $\alpha$, $\beta$ function (see (56)) with $\alpha = 0.25$, $\beta = 0$ is shown by a solid line and the HCS, half cycle sinusoid (see (58)), is shown by a dashed line.

It can be noted that instead of using the pair of sequences (28), the single difference sequence $\gamma = +2, -2, 0, 0, \ldots$ could be used together with (24) for calculation of the normalized squared Euclidean distance.

Turning to the quaternary case, we can find pairs of phase trajectories merging at t = 2T. Fig. 3 shows two examples of phase trees for this case. These merges occur at the points labeled A, B, C, D, and E in the phase tree. Unlike the binary case, there is more than one merge point, and two different pairs of phase trajectories can have the same merge point. There are only three phase differences, however, namely, those having phase difference $2h\pi$, $4h\pi$, and $6h\pi$ at t = T. It is easily seen that an upper bound on the minimum Euclidean squared distance for the M-ary case is obtained by using the difference sequences

$$\gamma = \gamma_0, -\gamma_0, 0, 0, 0, \ldots; \qquad \gamma_0 = 2, 4, 6, \ldots, 2(M-1) \qquad (31)$$

and taking the minimum of the resulting Euclidean distances, i.e.,

$$d_B^2(h) = \log_2(M) \cdot \min_{1 \le k \le M-1} \left\{ 2 - \frac{1}{T} \int_0^{2T} \cos\big[2\pi h \cdot 2k\,(q(t) - q(t - T))\big]\, dt \right\} \qquad (32)$$

which, for the M-ary CPFSK system, specializes to

$$d_B^2(h) = \log_2(M) \cdot \min_{1 \le k \le M-1} 2\left(1 - \frac{\sin k 2\pi h}{k 2\pi h}\right). \qquad (33)$$
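The closed form (33) is straightforward to evaluate numerically. The sketch below is illustrative only; the function name `d2_bound_cpfsk` is an assumption of this example, and h is taken to be strictly positive so that the division is defined.

```python
import math

def d2_bound_cpfsk(M, h):
    """Upper bound (33) on the normalized squared minimum distance, M-ary CPFSK, h > 0."""
    def term(k):
        x = 2.0 * math.pi * k * h
        return 2.0 * (1.0 - math.sin(x) / x)
    return math.log2(M) * min(term(k) for k in range(1, M))

bound_msk = d2_bound_cpfsk(2, 0.5)          # MSK: the bound equals 2
bound_binary_best = d2_bound_cpfsk(2, 0.715)
bound_quaternary = d2_bound_cpfsk(4, 0.914)
```

For M = 2 the minimum runs over the single term k = 1, so the function reduces to (30).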
Fig. 4 shows the minimum distance $d^2(h)$ for binary CPFSK as a function of h and N. The upper bound $d_B^2(h)$ is shown dashed. Note the peculiar behavior at h = 1. This will be discussed later.

It can be noticed that for all full response CPM systems with the property

$$q(T) = 0 \qquad (34)$$

a merge can occur at t = T and thus the difference sequences

$$\gamma = \gamma_0, 0, 0, 0, \ldots; \qquad \gamma_0 = 2, 4, 6, \ldots, 2(M-1) \qquad (35)$$

yield the upper bound

$$d_B^2(h) = \log_2(M) \cdot \min_{1 \le k \le M-1} \left\{ 1 - \frac{1}{T} \int_0^{T} \cos\big[2\pi h \cdot 2k\, q(t)\big]\, dt \right\}. \qquad (36)$$

This class of pulses is called weak [20], and is not considered because its distance properties are poor. Only positive pulses g(t) will be considered below. Furthermore, they are assumed to be symmetric with

$$g(t) = g(T - t); \qquad 0 \le t \le T. \qquad (37)$$

Fig. 4. Normalized squared Euclidean distance versus modulation index for binary CPFSK: upper bound (dashed line) and $d^2(h)$ for N = 1, 2, 3, and 4 bit decision intervals.
Weak Modulation Indexes, $h_c$

For the construction of the upper bound $d_B^2(h)$ on the minimum squared Euclidean distance, pairs of phase trajectories giving merges at t = 2T were used. These merges occur independently of the value of h. For all pulses g(t) except weak ones, the first inevitable (for all h) merge occurs at t = 2T. For specific values of the modulation index h, however, other merges are also possible. In the binary case (see Fig. 2), a merge can occur at t = T if the difference sequence is chosen to be $\gamma = +2, 0, 0, \ldots$ and the modulation index h is an integer. This is because the two points labeled A and B are a multiple of $2\pi$ apart, and thus coincide modulo $2\pi$. For an M-ary full response system, the phase trajectories take the values $\pm 2\pi h q(T), \pm 6\pi h q(T), \ldots, \pm 2(M-1)h\pi q(T)$, which for positive g(t) pulses reduces to $\pm h\pi, \pm 3h\pi, \ldots, \pm(M-1)h\pi$. Thus, there are M - 1 phase differences between the nodes in the phase tree at t = T, and merges occur at t = T for h-values given by

$$\pi h \cdot \gamma_0 = 2\pi \cdot n; \qquad \gamma_0 = 2, 4, 6, \ldots, 2(M-1); \qquad n = 1, 2, \ldots \qquad (38)$$

The modulation indexes [defined by (38)]

$$h_c = \frac{n}{k}; \qquad k = 1, 2, \ldots, M-1; \qquad n = 1, 2, \ldots \qquad (39)$$

are called weak modulation indexes of the first order. For these modulation indexes, the minimum Euclidean distance is normally below the upper bound for all values of N; see Fig. 4 for h = 1. Sometimes the minimum distance for weak modulation index values is considerably below $d_B^2(h_c)$. For such cases, $h_c$ is called a catastrophic modulation index value [17]-[20].

Merges of the weak (catastrophic) type occur at any observation length. Thus, weak h-values of the second order are defined by

$$\pi h\,(\gamma_0 + \gamma_1) = 2\pi n; \qquad \gamma_0 = 2, 4, 6, \ldots, 2(M-1); \qquad \gamma_1 = \pm 2, \pm 4, \pm 6, \ldots, \pm 2(M-1); \qquad n = 1, 2, \ldots \qquad (40)$$
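The first-order weak modulation indexes of (39) can be listed exactly with rational arithmetic. The helper below is a hypothetical sketch, not part of the paper; it enumerates $h_c = n/k$ up to a chosen limit and removes duplicates such as 2/2 = 1.

```python
from fractions import Fraction

def first_order_weak_indexes(M, h_max):
    """All distinct h_c = n/k with k = 1..M-1, n = 1, 2, ... and h_c <= h_max."""
    found = set()
    for k in range(1, M):
        n = 1
        while Fraction(n, k) <= h_max:
            found.add(Fraction(n, k))
            n += 1
    return sorted(found)

weak_binary = first_order_weak_indexes(2, 2)      # integer h only
weak_quaternary = first_order_weak_indexes(4, 2)  # 1/3, 1/2, 2/3, 1, 4/3, 3/2, 5/3, 2
```

For M = 2 only integer values of h appear, in agreement with the binary merge at t = T described above.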
As we will see later, the effect of weak modulation indexes of a higher order than one is of minor importance, since the corresponding Euclidean distance for the pairs of phase trajectories causing the merge is above the upper bound $d_B^2(h)$ [20]. For weak modulation indexes of the second order, the corresponding Euclidean distance might be on the upper bound $d_B^2(h)$; for the third order and higher it is strictly above. But from (39) it can be seen that the number of first-order weak modulation indexes grows rapidly with M.

For weak modulation indexes of the first order, it is sufficient to calculate the corresponding Euclidean distance over the first symbol interval. Since the calculation of the upper bound uses the first two symbol intervals, it can be concluded that $d^2(h_c)$ can be, and normally is, smaller than $d_B^2(h_c)$. Thus, for weak modulation indexes of the first order, $d_{\min}^2(h_c)$ might be smaller than $d_B^2(h_c)$ (for details, see [20]). Furthermore, it is shown in detail in [17], [20] that only $h_c$ of the first order can influence the minimum distance calculation for full response CPM systems.

Tightness of the Upper Bound $d_B^2(h)$
A powerful property of the upper bound on the minimum Euclidean distance is that, except for weak modulation indexes of the first order, the minimum Euclidean distance itself equals this bound if the observation interval is long enough [18], [20]. Denoting the minimum normalized squared Euclidean distance for an N symbol observation interval by $d_{\min,N}^2(h)$, we have that

$$d_{\min,N}^2(h) = d_B^2(h) \qquad (41)$$

if

$$N \ge N_B(h), \qquad h \ne h_c \ \text{(first order)}. \qquad (42)$$

$N_B(h)$ is the number of symbol intervals required to reach the upper bound for the specific modulation index h. If a specific pair of phase trajectories never merge, the Euclidean distance will grow without limit. This is true because the Euclidean distance calculated over each symbol interval is positive. Since the minimum distance was previously shown to be upper bounded, the pair of phase trajectories giving the minimum distance must eventually merge. For a modulation index near a first-order weak modulation index, the difference sequence (35) (with $\gamma_0 = 2k$) gives the smallest growing Euclidean distance with N:

$$d^2(h) = \log_2(M) \cdot \min_{1 \le k \le M-1} \left\{ 1 - \frac{1}{T} \int_0^{T} \cos\big[4\pi h k\, q(t)\big]\, dt + (N-1)\big(1 - \cos[2\pi h k]\big) \right\} \qquad (43)$$

and the smallest growth rate per symbol interval is

$$\log_2 M \cdot \min_{1 \le k \le M-1} \big\{ 1 - \cos[2\pi h k] \big\}. \qquad (44)$$

For sufficiently large N in (43), the minimum distance for the considered h is given by $d_B^2(h)$ since $d^2(h)$ will exceed $d_B^2(h)$. The tightness of the upper bound and the behavior of (43) are illustrated in all the minimum distance figures.

Optimization of the Upper Bound $d_B^2(h)$

For frequency pulses g(t) having the symmetry property (37), the expression for the upper bound can be written

$$d_B^2(h) = \log_2 M \cdot \min_{1 \le k \le M-1} \left\{ 2\left(1 - \frac{1}{T} \int_0^{T} \cos\big[4\pi h k\, q(t)\big]\, dt\right) \right\} \qquad (45)$$

and since $\cos(\cdot) \ge -1$, $d_B^2(h)$ can never exceed $4 \cdot \log_2 M$. Thus, at most, an improvement of 3 dB in $E/N_0$ for a large SNR might be obtained for the binary case compared to MSK. Furthermore, (45) can also be written

$$d_B^2(h) = \log_2 M \cdot \min_{1 \le k \le M-1} \left\{ 2\left(1 - 2\cos[\pi h k] \cdot \frac{1}{T} \int_0^{T/2} \cos\big[4\pi h k\, q_0(t) - \pi h k\big]\, dt\right) \right\} \qquad (46)$$

where

$$q_0(t) = q(t + T/2). \qquad (47)$$

The binary case (M = 2) will be considered first. In this case,

$$d_B^2(h) = 2\left(1 - 2\cos[\pi h] \cdot \frac{1}{T} \int_0^{T/2} \cos\big[4\pi h\, q_0(t) - \pi h\big]\, dt\right). \qquad (48)$$

It can at once be observed that $\cos \pi h = 0$ for h = 1/2, 3/2, 5/2, etc. Thus, the upper bound equals two, independent of the shape of the frequency pulse g(t). Actually, this is true for all pulses g(t) (except weak ones) [16]. This is of particular interest since much attention has been devoted to the case of h = 1/2 [6], [12], [14], [15]. To maximize $d_B^2(h)$ for the binary case, the last term in (48) must be minimized. Two different cases can be distinguished:

Case I:

$$0 \le h \le \tfrac{1}{2} \quad \text{where } \cos \pi h \ge 0. \qquad (49)$$

Case II:

$$\tfrac{1}{2} \le h \le 1 \quad \text{where } \cos \pi h \le 0. \qquad (50)$$

To maximize $d_B^2(h)$, the integral in (48) must be minimized for case I and maximized for case II. It is also clear that the
pulse which maximizes $d_B^2(h)$ for case I minimizes $d_B^2(h)$ for case II, and vice versa.

To make the integral in (48) as small as possible, the argument inside the cosine must be as close to $\pi$ as possible. This yields for the interval $0 \le h \le 1$

$$q_0(t) = \begin{cases} \tfrac{1}{2}; & \text{case I} \\ \tfrac{1}{4}; & \text{case II} \end{cases} \qquad (51)$$

with the resulting phase responses

$$q_1(t) = \begin{cases} 0; & 0 \le t < T/2 \\ \tfrac{1}{2}; & t > T/2 \end{cases} \qquad \text{case I} \qquad (52)$$

and

$$q_2(t) = \begin{cases} 0; & t = 0 \\ \tfrac{1}{4}; & 0 < t < T \\ \tfrac{1}{2}; & t \ge T \end{cases} \qquad \text{case II.} \qquad (53)$$

From the sign symmetry, to minimize $d_B^2(h)$ the phase responses above have to be interchanged with each other for the respective cases. The upper bound $d_B^2(h)$ for the two phase responses is now found as

$$d_B^2(h) = 1 - \cos 2\pi h; \quad \text{using } q_1(t) \qquad\qquad d_B^2(h) = 2(1 - \cos \pi h); \quad \text{using } q_2(t). \qquad (54)$$

Thus, for any binary scheme, $d_B^2(h)$ is bounded by

$$1 - \cos 2\pi h \ge d_B^2(h) \ge 2(1 - \cos \pi h); \qquad 0 \le h \le \tfrac{1}{2}$$

$$2(1 - \cos \pi h) \ge d_B^2(h) \ge 1 - \cos 2\pi h; \qquad \tfrac{1}{2} \le h \le 1. \qquad (55)$$

An analogous technique can be used to derive bounds for $h \ge 1$.

The two phase responses $q_1(t)$ and $q_2(t)$ are members of the class called $\alpha$, $\beta$ functions [16], [17] defined by

$$g(t) = \begin{cases} \dfrac{\beta}{2\alpha T}; & 0 \le t \le \alpha T \\ \dfrac{1 - 2\beta}{2T(1 - 2\alpha)}; & \alpha T \le t \le (1 - \alpha)T \\ \dfrac{\beta}{2\alpha T}; & (1 - \alpha)T \le t \le T \end{cases}$$

$$q(t) = \begin{cases} 0; & t \le 0 \\ \dfrac{\beta}{2\alpha} \dfrac{t}{T}; & 0 \le t \le \alpha T \\ \dfrac{\beta}{2} + \dfrac{1 - 2\beta}{2(1 - 2\alpha)} \left(\dfrac{t}{T} - \alpha\right); & \alpha T \le t \le (1 - \alpha)T \\ \dfrac{\beta}{2\alpha} \left(\dfrac{t}{T} - 1\right) + \dfrac{1}{2}; & (1 - \alpha)T \le t \le T \\ \dfrac{1}{2}; & t \ge T. \end{cases} \qquad (56)$$

Hence, when h = 1/2, $q_1(t)$ corresponds to binary PSK but with a T/2 time offset. The phase response $q_2(t)$ corresponds to a so-called plateau function; i.e., the phase changes only in the beginning and the end of the symbol interval, but remains constant (forms a plateau) in the middle of the symbol interval. Note that $q_1(t)$ and $q_2(t)$ give systems with discontinuous phase. However, the ensemble of possible phase trajectories is completely known to the receiver just as for all CPM systems. This gives a slight generalization of the considered class of systems. It is interesting to note that the plateau function with $\beta = 1/2$ gives the upper bound [17]

$$d_B^2(h) = 2\left[1 - (1 - 2\alpha)\cos \pi h - \alpha\, \frac{\sin 2\pi h}{\pi h}\right] \qquad (57)$$

which, for small values of $\alpha$, approaches a value of 4 near h = 1. Since h = 1 is a first-order weak modulation index, it is concluded that large values of N are required to make $d_{\min}^2(h)$ equal $d_B^2(h)$ near h = 1.

In practice, the phase responses $q_1(t)$ and $q_2(t)$ are not attractive because of their spectra. This will be discussed more in Section VI. From a spectral point of view, the phase during a symbol interval should change slowly and smoothly, and the following frequency pulses with corresponding phase responses are of interest. The first one is

$$g(t) = \begin{cases} 0; & t \le 0,\ t \ge T \\ \dfrac{\pi}{4T} \sin \dfrac{\pi t}{T}; & 0 \le t \le T \end{cases} \qquad (58)$$

with the corresponding phase response

$$q(t) = \begin{cases} 0; & t \le 0 \\ \dfrac{1}{4}\left(1 - \cos \dfrac{\pi t}{T}\right); & 0 \le t \le T \\ \dfrac{1}{2}; & t \ge T \end{cases} \qquad (59)$$

and like CPFSK this pulse g(t) has no continuous derivatives at the end points [14]. The pulse in itself is continuous, however, unlike CPFSK. Since the frequency pulse is a half cycle sinusoid, this scheme will be referred to as half cycle sinusoid (HCS) [see Fig. 3 (dashed tree) for the quaternary case]. Another pulse of interest is

$$g(t) = \begin{cases} 0; & t \le 0,\ t \ge T \\ \dfrac{1}{2T}\left(1 - \cos \dfrac{2\pi t}{T}\right); & 0 \le t \le T \end{cases} \qquad (60)$$

with the corresponding phase response

$$q(t) = \begin{cases} 0; & t \le 0 \\ \dfrac{1}{2}\left(\dfrac{t}{T} - \dfrac{1}{2\pi} \sin \dfrac{2\pi t}{T}\right); & 0 \le t \le T \\ \dfrac{1}{2}; & t \ge T. \end{cases} \qquad (61)$$
Fig. 5. Upper bound comparison for M = 2, $0 \le h \le 0.5$. The bound for $q_1(t)$ is the upper bound on all upper bounds $d_B^2(h)$, and the $\alpha = 0$ plateau function is the lower bound on all the upper bounds $d_B^2(h)$ in the considered interval.
This pulse g(t) has one continuous derivative at the end points t = 0 and t = T. Since this frequency pulse is a raised cosine function, it will be referred to as raised cosine (RC). When h = 1/2, this scheme has previously been referred to as SFSK [12].

Fig. 5 shows the upper and lower bounds on $d_B^2(h)$ for all binary schemes in the region $0 < h \le 1/2$, computed with (55). Fig. 5 also shows $d_B^2(h)$ for CPFSK, HCS (with formula in [18]), and $\alpha$, $\beta$ functions with $\alpha = 0.25$, $\beta = 0.5$ and $\alpha = 0.25$, $\beta = 0$. The bound for the RC scheme is in between that of $\alpha = 0.25$, $\beta = 0$ and HCS.

The problem of finding frequency pulses g(t) that optimize $d_B^2(h)$, given h and M, is far more complicated in the general M-ary case than for the binary case. This general problem has not been solved. The reason is that the upper bound is constructed from the minimum of more than one function, and when h varies with fixed M, different functions take the minimum value. This is also true for fixed h and M when the frequency pulse g(t) is varied. For an M-ary scheme with $0 < h \le 1/M$ the binary bounds (55) on $d_B^2(h)$ still apply after multiplication with $\log_2 M$. Of course, also for M-ary schemes $d_B^2(h) < 4 \log_2 M$. However, due to the fact that the bound $d_B^2(h)$ in this case is formed by taking the minimum of several component functions (32), this maximum value can never be reached [17], [20].

It is previously known [5], [7] that the h-value maximizing the minimum Euclidean distance ($N \ge 3$) for binary CPFSK is h = 0.715. This value of h also maximizes $d_B^2(h)$ for this scheme. The same h was also shown by Kotelnikov [1] to maximize the Euclidean distance when N = 1. It is possible to find the values of h which maximize $d_B^2(h)$ for M-ary CPFSK [see (33)], and they are given in Table I together with the maximum value of $d_B^2(h)$ for M = 2, 4, 8, 16, and 32. The optimum occurs for h-values slightly below h = 1. Unfortunately, h = 1 is a first-order weak modulation index for all M, but if N is made large enough ($N \ge N_B$), $d_{\min}^2(h_0)$ equals $d_B^2(h_0)$. For the quaternary case, the CPFSK scheme gives a maximum of $d_B^2(h_0) = 4.232$. In [17] it is shown that a scheme based on the $\alpha$, $\beta$ function with $\alpha = 0$, $\beta = 0.17$ for h = 0.62 gives a minimum distance of 4.62. This value came out of a nonexhaustive search for quaternary schemes yielding large distance values. However, better schemes may exist.
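The binary upper bound (29) can also be evaluated numerically for any phase response, which gives a quick check of the statement that all (non-weak) binary pulses yield the value 2 at h = 1/2. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the three phase responses are the CPFSK, HCS, and RC forms of (8), (59), and (61), and a plain midpoint rule with a fixed number of steps stands in for the integral.

```python
import math

def q_cpfsk(t, T=1.0):
    return 0.0 if t <= 0 else 0.5 if t >= T else t / (2.0 * T)

def q_hcs(t, T=1.0):
    return 0.0 if t <= 0 else 0.5 if t >= T else 0.25 * (1.0 - math.cos(math.pi * t / T))

def q_rc(t, T=1.0):
    if t <= 0:
        return 0.0
    if t >= T:
        return 0.5
    return 0.5 * (t / T - math.sin(2.0 * math.pi * t / T) / (2.0 * math.pi))

def d2_bound_binary(q, h, T=1.0, steps=4000):
    """Midpoint-rule evaluation of 2 - (1/T) * int_0^{2T} cos[2*pi*h*(2q(t) - 2q(t-T))] dt."""
    dt = 2.0 * T / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        acc += math.cos(2.0 * math.pi * h * (2.0 * q(t, T) - 2.0 * q(t - T, T))) * dt
    return 2.0 - acc / T

bounds_at_half = [d2_bound_binary(q, 0.5) for q in (q_cpfsk, q_hcs, q_rc)]
bound_hcs = d2_bound_binary(q_hcs, 0.626)   # close to the HCS value quoted from [18]
```

The step count is a convenience choice; a smoother pulse or a larger h may call for a finer grid.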
IV. NUMERICAL RESULTS ON THE MINIMUM EUCLIDEAN DISTANCE

In this section numerical results on the minimum normalized squared Euclidean distance will be given in the form of graphs. These graphs present the minimum Euclidean distance versus the modulation index h for specific schemes and different values of N, the number of received signal intervals observed. Thus, these graphs will show what is below the upper bound $d_B^2(h)$, and also how large N has to be made, in a specific situation, to make the minimum Euclidean distance $d_{\min}^2(h)$ equal to $d_B^2(h)$.

TABLE I
OPTIMUM h-VALUES AND CORRESPONDING NORMALIZED EUCLIDEAN DISTANCES FOR M-ARY CPFSK SCHEMES

M     Optimum h, h0    d_B^2(h0)    N_B
2     0.715            2.434        3
4     0.914            4.232        9
8     0.964            6.141        41
16    0.983            8.058        178
32    0.992            10.050       777
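The optimum modulation indexes for small M can be approximated by a simple grid search over (33). This is a rough sketch rather than the procedure used for Table I: the function names, the search interval 0.01 to 1, and the grid resolution are all assumptions of this example.

```python
import math

def d2_bound_cpfsk(M, h):
    """Upper bound (33) for M-ary CPFSK, h > 0."""
    def term(k):
        x = 2.0 * math.pi * k * h
        return 2.0 * (1.0 - math.sin(x) / x)
    return math.log2(M) * min(term(k) for k in range(1, M))

def best_h(M, h_lo=0.01, h_hi=1.0, steps=9900):
    """Grid search for the h in [h_lo, h_hi] that maximizes the bound."""
    best_value, best_index = -1.0, h_lo
    for i in range(steps + 1):
        h = h_lo + (h_hi - h_lo) * i / steps
        value = d2_bound_cpfsk(M, h)
        if value > best_value:
            best_value, best_index = value, h
    return best_index, best_value

h0_binary, d_binary = best_h(2)
h0_quaternary, d_quaternary = best_h(4)
```

With a grid spacing of 0.0001 the binary and quaternary results agree with the first two rows of Table I to the printed precision.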
Plateau Functions
As an example of a plateau function, a binary scheme with a phase response very similar to that of $\alpha = 0.05$ and $\beta = 1/2$ will be chosen. The difference between the chosen phase function and the $\alpha$, $\beta$ function is that the phase does not vary linearly in the intervals $0 < t < \alpha T$ and $(1 - \alpha)T < t < T$. Instead, the phase varies like a raised cosine (for an exact definition, see b-functions, b = 0.05 in [16]). Fig. 6 shows $d_{\min}^2(h)$ for this binary system. The number of observed bit intervals is N = 1, 2, ..., 7. The upper bound on the minimum Euclidean distance is also shown by a dashed line where the minimum Euclidean distance still does not equal $d_B^2(h)$.
Fig. 6. Normalized squared minimum distances $d^2(h)$ versus modulation index for a b-function with b = 0.05. This phase function is very similar to the $\alpha$, $\beta$ function with $\alpha = 0.05$, $\beta = 1/2$ (see [16] or [18]).

CPFSK and M-ary PSK

Fig. 4 shows the well-known minimum distance for binary CPFSK for N = 1, 2, 3, and 4 observed bit intervals. Also shown in Fig. 4 is the upper bound $d_B^2(h)$. It can be noted that h = 1/2 corresponds to MSK and gives $d_{\min}^2(1/2) = 2$, which is the same as antipodal signaling, e.g., BPSK. The required observation interval for PSK is one bit interval, and for detectors making bit by bit decisions, PSK is optimum [3]. The required observation interval for MSK is two bit intervals, and the asymptotic performance in terms of error probability is the same as that for PSK. The optimum modulation index for CPFSK is h = 0.715 when the number of observed symbol intervals is 3. This gives the minimum Euclidean distance $d_{\min}^2(0.715) = 2.43$ and thus a gain of 0.85 dB in terms of $E_b/N_0$ is obtained compared to MSK or PSK.

The minimum normalized squared Euclidean distance versus the modulation index h is shown in Fig. 7 for the quaternary CPFSK system (see also [10]). Note that the upper bound $d_B^2(h)$ (shown by a dashed line where it is not reached) is twice the minimum distance for a receiver observation interval of N = 1 symbol. This is because the rectangular frequency pulse g(t) has the symmetry property (37). The maximum value of $d_B^2(h)$ is approximately reached for N = 8 observed symbol intervals (compare to Table I). It is clear from (39) that the first-order weak modulation indexes in the interval $0 < h \le 2$ are $h_c$ = 1/3, 1/2, 2/3, 1, 4/3, 3/2, 5/3, and 2, and the effect of some of these early merges can be clearly seen in Fig. 7. Note that most of these weak indexes are catastrophic. The minimum Euclidean distance for these is no better than 2.

It is interesting to compare the minimum distance for the quaternary CPFSK system to QPSK (phase response $q_1(t)$, h = 1/4). As indicated in Fig. 7, the minimum squared distance for QPSK is $d_{\min}^2 = 2$, and for the quaternary CPFSK system it is slightly below this value for h = 1/4. This is a different relative performance level than that for M = 2. For M = 2, h = 1/2 all schemes have the minimum squared distance $d_{\min}^2 = 2$, CPFSK and PSK included.

The minimum distance for the octal (M = 8) CPFSK system is given in Fig. 8 for N = 1, 2, 3 and in some interval for N = 4 and 5. The upper bound $d_B^2(h)$, which, as usual, is shown by a dashed line where it is not reached, is, as in the quaternary case, reached with N = 2 observed symbol intervals for low modulation indexes. Compared to the quaternary case, the number of first-order weak (catastrophic) modulation indexes has increased in the interval of $0.3 \le h \le 1$. Larger values of N are required to reach $d_B^2(h)$, compared to the quaternary and especially the binary case. The scheme 8PSK ($q_1(t)$, h = 1/8), previously shown to maximize $d_B^2(1/8)$, is also indicated in Fig. 8, and it is seen that octal CPFSK yields the same minimum distance if h is chosen slightly larger than h = 1/8 and if N = 2. Much larger distances can be obtained for the CPFSK system by choosing, for instance, $h \approx 0.45$ and $N \ge 5$.

Distance properties of CPFSK schemes with larger values of M have been investigated in [17], [20]. The maximum attainable minimum distance value grows with M, but the number of first-order weak (catastrophic) modulation index values also grows with M, as does the length of the observation interval necessary for reaching the upper bound $d_B^2(h)$. However, for $h \le 0.3$, N = 2 is sufficient for all M. As an illustration of the behavior discussed above, Table I shows $N_B$, which is a lower bound on the observation interval for reaching the upper bound.

Fig. 7. Minimum normalized squared distance versus modulation index for M = 4 CPFSK.

Fig. 8. Minimum normalized squared Euclidean distance versus modulation index for M = 8 CPFSK.
HCS, RC, and M-ary PSK

The HCS system yields a phase tree where the phase trajectories are always raised cosine shaped over each symbol interval. Fig. 3 shows the quaternary case. The upper bound $d_B^2(h)$ for HCS, M = 2 is given by [18]

$$d_B^2(h) = 2\big(1 - \cos(\pi h)\, J_0(\pi h)\big) \qquad (62)$$

where $J_0(\cdot)$ is the Bessel function of the first kind and zero order. The maximum value of $d_B^2(h)$ is smaller than that for binary CPFSK, but this maximum value is still reached with N = 3 observed symbols ($d_{\min}^2 = 2.187$ for h = 0.626, [18]). For h = 1/2, $d_{\min}^2(h)$ equals that of MSK and PSK, as in all binary full response CPM systems. In the region of $0 < h < 1/2$, the upper bound is reached with N = 2 observed symbols as in binary CPFSK. Fig. 5 shows that HCS gives a larger minimum distance than binary CPFSK in this region.

The minimum Euclidean distances for the quaternary HCS system are given in Fig. 9 for N = 1, 2, 3, and 4 observed symbol intervals, and QPSK is also indicated. $d_B^2(1/4)$ is still smaller than the minimum distance for QPSK, of course, but since the upper bound for HCS is reached with $N \ge 2$ observed symbol intervals in the region of 0 ...
241
Fifty Years of Communications and Networking
4
4.5
3
2
1. 5
.5
o Fig. 9.
Fig. 10. .5
1.0
Minimum normalized squared distance versus modulation index h for M = 4, HCSsystem.
1.0
Minimum normalized squared Euclidean distance versus modulation index h for M = 8, HCS system.
commonly used function [14] , [15]
L oo
the pulse shaping. The results for the octal case (see Fig. 10) follow the same trend as for the CPFSK system; i.e., the upper bound is reached with N = 2 for low h-values and the number of catastrophic modulation indexes has increased compared to the quaternary case.
V. POWER SPECTRUM

The power spectral density for the full response CPM schemes considered in this paper can be calculated with formulas given in [2]. The data symbols are assumed to be independent and identically distributed. For the case of full response CPFSK systems, the spectrum can be expressed directly in terms of elementary functions [2] and for RC systems in terms of Bessel functions [12].

Spectra for systems with different values of M should be compared at the same bit rate. The bit rate normalized variable $f \cdot T_b$ is used where

$$T_b = \frac{T}{\log_2 M}. \qquad (63)$$

Hence, the power spectra $R_0(f)$ are plotted against the bit rate normalized frequency separation from carrier. Plots of the function

$$\frac{\int_B^{\infty} R_0(f)\,df}{\int_0^{\infty} R_0(f)\,df} \qquad (64)$$

which gives the fractional out of band power at the one-sided bandwidth B, will also be given.

Fig. 11 shows the power spectra (double-sided) for M-ary CPFSK with h = 1/M, M = 2, 4, 8. The corresponding fractional out of band power plots are shown in Fig. 12 (for other values of h, see [17], [20]). It is well known that for fixed M and g(t), the spectrum widens for increasing h. For certain h values discrete components occur. Fig. 11 and the distance figures illustrate the fact that for a roughly fixed distance, the spectral main lobe is decreasing with increasing M. The behavior of the spectra for large frequencies (i.e., the spectral tails) depends only on the number c of continuous derivatives of the instantaneous phase. It is shown in [8] that the tails decrease with f as $|f|^{-2(c+2)}$. For CPFSK, c equals 0.

Fig. 13 shows the spectra for the quaternary CPFSK, HCS, and RC schemes for h = 1/4. Note that for increased values of c the main lobe becomes larger. The main lobe widens intuitively due to the presence of higher phase slopes over a portion
Fig. 11. Normalized power spectral density in decibels, plotted against $f \cdot T_b$, for M-ary (M = 2, 4, and 8) CPFSK with modulation indexes h = 1/2, 1/4, and 1/8, respectively.
Fig. 12. Fractional out of band power in decibels, plotted against $B \cdot T_b$, for M-ary (M = 2, 4, and 8) CPFSK with modulation indexes h = 1/2, 1/4, and 1/8, respectively.
of the pulse for non-CPFSK schemes. The spectral tail of HCS behaves like $|f|^{-6}$ and like $|f|^{-8}$ for RC. Further spectra for these schemes are plotted in [17], [19], [20]. The spectra of schemes with plateau functions are investigated in [16], [18]. As might be expected, the rapid phase change in the beginning and the end of each symbol interval gives wide spectra. The previously mentioned spectral tail behavior versus c is also applicable in this case. However, f must be impractically large before this asymptotic behavior is dominating. Furthermore, it was concluded above that large $d_{\min}^2$ (close to 4) are reached with plateau functions with M = 2 and h close to 1. For h = 1, however, the power spectrum contains spectral lines.
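The fractional out of band power (64) and the percentage bandwidths used later can be evaluated numerically from any sampled spectrum. The following is a minimal sketch, not the authors' procedure: it assumes the well-known closed-form MSK baseband spectrum (not given in this paper) as a test case, and the function names `msk_psd` and `power_bandwidth` are hypothetical.

```python
import numpy as np

def msk_psd(x):
    """Unnormalized MSK power spectrum versus x = f*Tb (standard closed form,
    assumed here as a test case; f is the offset from the carrier)."""
    x = np.asarray(x, dtype=float)
    den = 1.0 - 16.0 * x**2
    singular = np.abs(den) < 1e-9            # removable singularity at x = 1/4
    val = (np.cos(2 * np.pi * x) / np.where(singular, 1.0, den)) ** 2
    return np.where(singular, (np.pi / 4) ** 2, val)

def power_bandwidth(psd, fraction, x_max=40.0, n=400001):
    """Double-sided normalized bandwidth 2*B*Tb containing `fraction` of the power.

    The in-band fraction is 1 minus the ratio in (64), computed with a
    cumulative trapezoidal rule on a uniform grid of x = f*Tb.
    """
    x = np.linspace(0.0, x_max, n)
    y = psd(x)
    dx = x[1] - x[0]
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1])))) * dx
    inband = cum / cum[-1]
    return 2.0 * x[np.searchsorted(inband, fraction)]

if __name__ == "__main__":
    for p in (0.90, 0.99):
        print(p, round(power_bandwidth(msk_psd, p), 2))
```

The spectrum is truncated at `x_max`; since the MSK tail falls off as $|f|^{-4}$, the neglected power is small for the 90 and 99 percent bandwidths, but a larger `x_max` would be needed for very small out of band fractions.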
VI. DISCUSSION AND CONCLUSIONS

From the distance and spectrum results above and in [17], [19], [20], it is evident that M-ary full response CPM schemes have both bandwidth compaction properties and yield gain in $E_b/N_0$ as compared to MSK. Schemes within this class of CPM systems can also be designed to give a large gain in $E_b/N_0$ with the same bandwidth as MSK, or considerably smaller bandwidth at the expense of an increased $E_b/N_0$. This holds, for example, for M-ary CPFSK. The same also holds for systems like HCS and RC, and in these cases the tradeoff between bandwidth and gain in $E_b/N_0$ at large SNR is even more attractive. In the binary case, plateau functions are a way to achieve considerable gains in terms of $E_b/N_0$, which unfortunately gives poor spectra.

In Table II comparisons between various CPFSK schemes are made, both concerning bandwidth and gain in terms of $E_b/N_0$ (dB) at large SNR. The reference system is MSK. Three different definitions of bandwidth will be used. The normalized bandwidth (double-sided) is defined as $2BT_b$, for which 90, 99, or 99.9 percent of the total signal power is within the frequency band $|f - f_0| \le B$. Table II also gives the number of observed symbols $N_B$ required to reach the given minimum squared Euclidean distance value. The quaternary scheme with h = 0.45 has approximately the same bandwidth as MSK (99 percent bandwidth) and yields a gain in $E_b/N_0$ of 2.56 dB. The octal scheme with h = 0.45 gives a slight bandwidth expansion when compared to MSK (at 99 percent bandwidth), but gives the gain 4.31 dB in terms of $E_b/N_0$. A more exhaustive comparison between different M-ary CPFSK systems can be found in Fig. 14. In this figure the gain in decibels of various schemes is shown versus the 99 percent bandwidth. The schemes are binary (indicated by ×), quaternary (indicated by •) and octal (indicated by □). Note the superior performance of quaternary and octal schemes.

In the binary case it was possible to find the frequency pulses g(t) maximizing the upper bound on the minimum Euclidean distance. This was not the case for multilevel systems, and it is believed that the optimum frequency pulse depends on M and h. In the interval 0 < h < 1/M, M-ary PSK was shown to yield the largest minimum Euclidean distance, but the HCS and RC schemes are not far from this optimum. However, the two latter schemes have much smaller spectral tails than M-ary PSK. It was shown that the number of first-order weak modulation indexes grows with M, thus putting a practical limit on how large an M should be chosen.

Fig. 13. Normalized power spectral densities in decibels for quaternary CPFSK with modulation index h = 1/4. The schemes are CPFSK (solid line), HCS (dashed line), and RC (dash-dotted line).

TABLE II
BANDWIDTH/DISTANCE TRADEOFF FOR SOME M-ARY CPFSK SYSTEMS

CPFSK           Bandwidth 2B·T_b            d_min^2/2E_b   Gain over    N_B
scheme          90%      99%      99.9%                    MSK, dB      symbols
M=2  h=.5       0.78     1.20     2.78      2.0             0           2
M=4  h=.25      0.42     0.80     1.42      1.45           -1.38        2
M=8  h=.125     0.30     0.54     0.96      0.60           -5.23        2
M=4  h=.40      0.68     1.0S     2.08      3.04            1.82        4
M=4  h=.45      0.76     1.18     2.20      3.60            2.56        5
M=8  h=.30      0.70     1.00     1.76      3.0             1.76        2
M=8  h=.45      1.04     1.40     2.36      5.40            4.31        5

Fig. 14. Bandwidth/performance comparison relative to MSK for various CPFSK systems (99 percent fractional out of band power bandwidth, see Table II). Markers: × for M = 2, • for M = 4, □ for M = 8.
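The gain column of Table II follows from the distance column: MSK has normalized squared distance 2, so the asymptotic gain of a scheme with normalized squared distance $d^2$ is $10 \log_{10}(d^2/2)$ dB. The short check below is a sketch under that reading; the dictionary keys and the name `gain_over_msk_db` are illustrative, and the distance and gain values are those read from Table II.

```python
import math

def gain_over_msk_db(d2):
    """Asymptotic Eb/N0 gain in dB relative to MSK (normalized squared distance 2)."""
    return 10.0 * math.log10(d2 / 2.0)

# (M, h): (normalized squared minimum distance, gain listed in Table II)
TABLE_II = {
    (2, 0.5): (2.0, 0.0),
    (4, 0.25): (1.45, -1.38),
    (8, 0.125): (0.60, -5.23),
    (4, 0.40): (3.04, 1.82),
    (4, 0.45): (3.60, 2.56),
    (8, 0.30): (3.0, 1.76),
    (8, 0.45): (5.40, 4.31),
}

if __name__ == "__main__":
    for (M, h), (d2, listed) in TABLE_II.items():
        print(f"M={M} h={h}: computed {gain_over_msk_db(d2):+.2f} dB, listed {listed:+.2f} dB")
```

The computed and listed gains agree to within the rounding of the tabulated distances.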
It is interesting to note that for the schemes considered in this paper, a gain in terms of $E_b/N_0$ is obtained without expanded bandwidth, compared to MSK. This is different from the case with a channel coded MSK system, where the spectrum must be expanded by a factor of 1/R where R is the code rate [3], [9]. For the CPM systems, no parity symbols are transmitted, and the total signal energy is devoted to the information symbols.

This paper explores the distance and bandwidth properties of full response CPM systems. In spite of the restriction that the schemes must be a full response type (i.e., the instantaneous frequency only depends on one data symbol), we have found considerable improvements. However, larger improvements are obtainable with partial response systems (the instantaneous frequency depends on more than one data symbol). This class of systems is considered in Part II. We have intentionally omitted all problems dealing with transmitter and receiver considerations. These problems will be treated in a unified manner in Part II.

REFERENCES

[1] V. A. Kotelnikov, The Theory of Optimum Noise Immunity. New York: Dover, 1960.
[2] R. R. Anderson and J. Salz, "Spectra of digital FM," Bell Syst. Tech. J., vol. 44, pp. 1165-1189, July-Aug. 1965.
[3] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.
[4] R. W. Lucky, J. Salz, and E. J. Weldon, Jr., Principles of Data Communication. New York: McGraw-Hill, 1968.
[5] M. G. Pelchat, R. C. Davis, and M. B. Luntz, "Coherent demodulation of continuous phase binary FSK signals," in Proc. Int. Telemetering Conf., Washington, DC, 1971, pp. 181-190.
[6] R. deBuda, "Coherent demodulation of frequency-shift keying with low deviation ratio," IEEE Trans. Commun., vol. COM-20, pp. 429-436, June 1972.
[7] W. P. Osborne and M. B. Luntz, "Coherent and noncoherent detection of CPFSK," IEEE Trans. Commun., vol. COM-22, pp. 1023-1036, Aug. 1974.
[8] T. J. Baker, "Asymptotic behaviour of digital FM spectra," IEEE Trans. Commun., vol. COM-22, pp. 1585-1594, Oct. 1974.
[9] W. C. Lindsey and M. K. Simon, Telecommunication Systems Engineering. Englewood Cliffs, NJ: Prentice-Hall, 1974.
[10] T. A. Schonhoff, "Symbol error probabilities for M-ary coherent continuous phase frequency-shift keying (CPFSK)," in Proc. IEEE Int. Conf. Commun. Conf. Record, San Francisco, CA, 1975, pp. 34.5-34.8.
[11] T. A. Schonhoff, "Bandwidth vs performance considerations for CPFSK," in Proc. IEEE National Telecommun. Conf. Record, 1975, pp. 38.1-38.5.
[12] F. Amoroso, "Pulse and spectrum manipulation in the minimum (frequency) shift keying (MSK) format," IEEE Trans. Commun., vol. COM-24, pp. 381-384, Mar. 1976.
[13] T. A. Schonhoff, "Symbol error probabilities for M-ary CPFSK: Coherent and noncoherent detection," IEEE Trans. Commun., vol. COM-24, pp. 644-652, June 1976.
[14] M. K. Simon, "A generalization of the minimum-shift-keying (MSK)-type signaling based upon input data symbol pulse shaping," IEEE Trans. Commun., vol. COM-24, pp. 845-856, Aug. 1976.
[15] M. Rabzel and S. Pasupathy, "Spectral shaping in minimum shift keying (MSK) type signals," IEEE Trans. Commun., vol. COM-26, pp. 189-195, Jan. 1978.
[16] T. Aulin and C.-E. Sundberg, "Binary CPFSK type of signaling with input data symbol pulse shaping-Error probability and spectrum," Telecommunication Theory, Techn. Rep. TR-99, Univ. Lund, Lund, Sweden, July 1978.
[17] --, "M-ary CPFSK type of signaling with input data symbol pulse shaping-Minimum distance and spectrum," Telecommunication Theory, Techn. Rep. TR-111, Univ. Lund, Lund, Sweden, Aug. 1978.
[18] --, "Bounds on the performance of binary CPFSK type of signaling with input data symbol pulse shaping," in Proc. IEEE Nat. Telecommun. Conf. Record, Birmingham, AL, 1978, pp. 6.5.1-6.5.5.
[19] --, "M-ary CPFSK type of signaling with input data symbol pulse shaping-Minimum distance and spectrum," in Proc. IEEE Int. Conf. Commun. Conf. Record, Boston, MA, 1979, pp. 42.3.1-42.3.6.
[20] T. Aulin, "CPM-A power and bandwidth efficient digital constant envelope modulation scheme," Dr. Techn. dissertation, Telecommunication Theory, Univ. Lund, Lund, Sweden, Nov. 1979.
[21] T. Aulin, N. Rydbeck, and C.-E. W. Sundberg, "Continuous phase modulation-Part II: Partial response signaling," this issue, pp. 210-225.
Continuous Phase Modulation-Part II: Partial Response Signaling

TOR AULIN, MEMBER, IEEE, NILS RYDBECK, AND CARL-ERIK W. SUNDBERG, MEMBER, IEEE

Abstract-An analysis of constant envelope digital partial response continuous phase modulation (CPM) systems is reported. Coherent detection is assumed and the channel is Gaussian. The receiver observes the received signal over more than one symbol interval to make use of the correlative properties of the transmitted signal. The systems are M-ary, and baseband pulse shaping over several symbol intervals is considered. An optimum receiver based on the Viterbi algorithm is presented. Constant envelope digital modulation schemes with excellent spectral tail properties are given. The spectra have extremely low sidelobes. It is concluded that partial response CPM systems have spectrum compaction properties. Furthermore, at equal or even smaller bandwidth than minimum shift keying (MSK), a considerable gain in transmitter power can be obtained. This gain increases with M. Receiver and transmitter configurations are presented.

I. INTRODUCTION

THIS paper presents constant envelope digital modulation systems having both good symbol error and spectral properties. The systems developed and studied are called partial response continuous phase modulation (CPM) systems. This is because the data symbols modulate the instantaneous phase of the transmitted signal and this phase is a continuous function of time. One single data symbol affects this phase over more than one symbol interval, an approach called partial response signaling [1], [3], [8]. The general class of signals is defined in Part I.

Previously, power spectra for systems of this type have been analyzed, for example, in [5], [9], and [14]. In this paper the analysis of the performance of the optimum detector for an additive white Gaussian channel is given. Some work on a suboptimum detector for a very special case of the CPM signaling scheme appears in [13].

As was treated in Part I of this paper, several attempts have been made to improve the asymptotic spectral properties for binary full response CPM systems with modulation index h = 1/2, compared to minimum shift keying (MSK). Considerable asymptotic improvement can be obtained by making the phase [...] between error probability performance and spectrum is achieved by using more levels than two and moderate smoothing of the phase at the symbol transition instants.

As will be shown in this paper, the use of partial response CPM systems yields a more attractive tradeoff between error probability and spectrum than does the full response systems. The spectral properties are improved at almost all frequencies and the first side-lobes are considerably lower. This spectral improvement takes place without increase of the probability of symbol error at practical signal to noise ratios. The improvement is obtained by introducing memory in the modulation process. The price for the improvements is system complexity, especially with respect to the optimum receiver. In practice, only rational values of the modulation index are useful, but it will be convenient here to imagine that h is real.

II. PROPERTIES OF THE MINIMUM EUCLIDEAN DISTANCE

In this section general properties of the minimum Euclidean distance as a function of the real valued modulation index h are given for partial response CPM systems. This distance is defined in Part I.

The Phase Tree and the Phase Difference Tree

An important tool for calculation of the minimum Euclidean distance is the so-called phase tree. This tree is formed by the ensemble of phase trajectories having a common start phase (root), say zero, at time t = 0. The data symbols for all the trajectories in the tree before this time are all equal. The value of these previous data symbols (the pre-history) can be chosen arbitrarily, but for unifying purposes they will be chosen to be M - 1; i.e., $\alpha_i = M - 1$; $i = -1, -2, \ldots$. An example of a phase tree for a binary partial response CPFSK system is shown in Fig. 1 and the frequency pulse g(t) is

$$g(t) = \begin{cases} \dfrac{1}{6T}; & 0 \le t \le 3T \\ 0; & \text{otherwise.} \end{cases} \qquad (1)$$
Reprinted from IEEE Transactions on Communications, vol. COM-29, no. 3, March 1981.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. Phase tree (ensemble of phase trajectories) when the frequency pulse g(t) is constant and of length L = 3. It is assumed that the binary (M = 2) data symbols prior to t = 0 are all +1. Note that A is a crossing and not a merge. B and C denote first and second merges, respectively. Note: characters with underbars appear boldface in text.
The phase difference trajectories are defined by

$$\varphi(t, \gamma) = 2\pi h \sum_{i=-\infty}^{\infty} \gamma_i\, q(t - iT); \qquad \gamma_i = 0, \pm 2, \pm 4, \ldots, \pm 2(M-1). \qquad (2)$$
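As an illustration of (2), the sketch below evaluates a phase difference trajectory for a rectangular frequency pulse of length L symbol intervals (the case of Figs. 1 and 2 when L = 3). It assumes the normalization q(LT) = 1/2 and difference symbols equal to zero outside the supplied sequence; the function names are hypothetical.

```python
import numpy as np

def q_rec(t, L, T=1.0):
    """Phase response of a constant frequency pulse of length L*T, with q(LT) = 1/2."""
    return np.clip(t, 0.0, L * T) / (2.0 * L * T)

def phase_difference(t, gamma, h, L, T=1.0):
    """Phase difference trajectory (2) for gamma = (gamma_0, gamma_1, ...).

    Difference symbols before index 0 and after the end of `gamma` are taken as zero.
    """
    t = np.asarray(t, dtype=float)
    return 2 * np.pi * h * sum(g * q_rec(t - i * T, L, T) for i, g in enumerate(gamma))

if __name__ == "__main__":
    h, L = 0.8, 3
    t = np.linspace(0.0, 6.0, 7)
    # (2, -2) returns to zero at t = (L + 1)T and stays there: a first merge
    print(np.round(phase_difference(t, (2, -2), h, L), 4))
    # (2,) settles at the constant 2*pi*h for t >= LT
    print(np.round(phase_difference(t, (2,), h, L), 4))
```

For L = 3 the sequence (2, -2) gives a trajectory that is identically zero from t = 4T onward, while the sequence (2) alone settles at the constant $2\pi h$.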
The phase difference tree is now obtained from the ensemble of phase difference trajectories. Since the first pair of data symbols must be different in the phase tree, the first difference symbol $\gamma_0$ must not equal zero in the phase difference tree. In the phase tree the prehistory is the same for all phase trajectories, and thus the difference symbols forming the prehistory in the phase difference tree must all be zero. This means that the phase difference tree does not depend on the prehistory and hence, neither do the Euclidean distances (see (24), Part I). Since there is a sign symmetry among the difference sequences and thus also among the phase difference trajectories, this can be removed for calculation of Euclidean distances. This is clear from (24) of Part I and the fact that cosine is an even function. For calculations of the minimum Euclidean distance, when the receiver observation interval equals N symbol intervals, it is sufficient to consider the phase difference tree, using the difference sequences $\gamma_N$ defined by

$$\begin{aligned} \gamma_0 &= 2, 4, 6, \ldots, 2(M-1) \\ \gamma_i &= 0, \pm 2, \pm 4, \ldots, \pm 2(M-1); \quad i = 1, 2, \ldots, N-1. \end{aligned} \qquad (3)$$

The merges are easily identified in the phase difference tree, since this corresponds to a phase difference trajectory which is identically equal to zero for all $t \ge t_m$. In Fig. 2 the phase difference tree with sign symmetry removed is given for the frequency pulse defined by (1). Merges are also shown. Note that in Fig. 1 pairs of phase trajectories must be considered in calculating Euclidean distance, but in Fig. 2 only single phase difference trajectories need to be.

Fig. 2. Phase difference tree when the frequency pulse g(t) is constant and of length L = 3 symbol intervals. Note that A is not a merge since the trajectory cannot be identically zero in the future. B and C denote merges. Characters with underbars appear boldface in text.

Weak Modulation Indices $h_c$

As was mentioned earlier, a phase difference trajectory identically equal to zero for all $t \ge t_m$ defines a merge. Since the phase difference tree always must be viewed modulo $2\pi$ in conjunction with distance calculations, a merge that depends upon the modulation index h is obtained if there exists a phase difference trajectory which equals a nonzero multiple of $2\pi$ for all $t \ge t_c$. The value of a phase difference trajectory depends on the modulation index h, and a situation like this occurs if

1) there exists a phase difference trajectory which is a constant not equal to zero for all $t \ge t_c$;
2) the modulation index h is chosen so that this constant is a multiple of $2\pi$.

Since any phase difference trajectory is achieved by feeding the difference sequence into a filter having the impulse response q(t) and multiplying the output of this filter by $2\pi h$, the first requirement is met for $t_c = LT$ by choosing

$$\gamma_i = \begin{cases} 0; & i < 0 \\ 2, 4, 6, \ldots, 2(M-1); & i = 0 \\ 0; & i > 0 \end{cases} \qquad (4)$$

since for frequency pulses g(t) of length L symbol intervals $q(t) = q(LT)$, $t \ge LT$. Sometimes this can occur also for $t_c = (L-1)T$, $(L-2)T$ and so on, but only for specific phase responses q(t). The second requirement above is satisfied if

$$2\pi h_c \gamma_0\, q(LT) = k \cdot 2\pi; \qquad k = 1, 2, \ldots; \quad \gamma_0 = 2, 4, 6, \ldots, 2(M-1) \qquad (5)$$

where it is assumed that $q(LT) \ne 0$. Thus, for these modulation indices, in the sequel called weak, $h_c$ satisfies

$$h_c = \frac{k}{\gamma_0\, q(LT)}; \qquad k = 1, 2, \ldots; \quad \gamma_0 = 2, 4, 6, \ldots, 2(M-1). \qquad (6)$$

A merge, modulo $2\pi$, occurs at t = LT. In a similar way, the phase difference trajectory can be made constant for all $t \ge t_c = (L + \Delta L)T$ by choosing

$$\gamma_i = \begin{cases} 0; & i < 0 \\ 2, 4, 6, \ldots, 2(M-1); & i = 0 \\ 0, \pm 2, \pm 4, \ldots, \pm 2(M-1); & i = 1, 2, \ldots, \Delta L \\ 0; & i > \Delta L \end{cases} \qquad (7)$$

and the weak modulation indices are

$$h_c = \frac{k}{q(LT) \sum_{i=0}^{\Delta L} \gamma_i}; \qquad k = 1, 2, 3, \ldots; \quad q(LT) \ne 0. \qquad (8)$$
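The candidate weak modulation indices of (6) and (8) can be listed by enumerating the admissible difference sequences of (7). The sketch below does this with exact rational arithmetic. It is only an enumeration of candidates: as discussed in the text, not every such index actually limits the minimum distance. The name `weak_indices` and the cut-off `h_max` are illustrative, and negative sums are folded onto positive ones by the sign symmetry, which is an assumption beyond the literal form of (8).

```python
from fractions import Fraction
from itertools import product

def weak_indices(M, q_LT, delta_L, h_max):
    """Candidate weak modulation indices h_c <= h_max from (8).

    M        number of levels
    q_LT     q(LT) as a Fraction (1/2 under the usual normalization)
    delta_L  number of free difference symbols after gamma_0, as in (7)
    """
    first = range(2, 2 * M, 2)                 # gamma_0 = 2, 4, ..., 2(M-1)
    later = range(-2 * (M - 1), 2 * M, 2)      # gamma_i = 0, +-2, ..., +-2(M-1)
    found = set()
    for g0 in first:
        for rest in product(later, repeat=delta_L):
            s = g0 + sum(rest)
            if s == 0:
                continue                       # zero sum: an ordinary, h-independent merge
            k = 1
            while Fraction(k) / (q_LT * abs(s)) <= h_max:
                found.add(Fraction(k) / (q_LT * abs(s)))
                k += 1
    return sorted(found)

if __name__ == "__main__":
    half = Fraction(1, 2)
    print(weak_indices(2, half, 0, 2))               # from (6)
    print(weak_indices(2, half, 1, Fraction(3, 2)))  # from (8) with delta_L = 1
```

Because every rational h eventually appears as delta_L grows, the enumeration is only useful together with a distance calculation that tells which candidates fall below the upper bound.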
In Fig. 3 the minimum normalized squared Euclidean distance is shown versus h when N = I, 2, ..., 10 observed symbol intervals for the binary CPM system given by (1). It is seen that for most h the minimum Euclidean distance increases to an upper bound d B 2(h) with N, but not for the weak modulation indices - 3 1 9 6 2. 3 (9) h e-4' '8 '5'7'2' 'It can also be seen that some of the "weak" modulation indices, e.g., he = 4/5, 7/8 do not affect the growth of the minimum Euclidean distance with N. This will be discussed in more detail later. However, indices such as he = 3/2 are catastrophic.
It can be noted that for a given finite interval of the modulation index, the number of weak modulation indices within this interval increases with M as in Part I.

An Upper Bound on the Minimum Euclidean Distance

A phase difference trajectory identically equal to zero for all $t \ge t_m$ does not increase the Euclidean distance if the observation interval is made longer than $N = t_m/T$ symbol intervals. By choosing any fixed difference sequence so that a merge is obtained, a limited upper bound on the minimum Euclidean distance is obtained. The first time instant for which any phase difference trajectory can be made identically zero ever after is in general t = (L + 1)T, where L is the length of the causal frequency pulse g(t). This is called the first merge, and the phase difference trajectories giving this merge are obtained by choosing the difference sequences

$$\gamma_i = \begin{cases} 0; & i < 0 \\ \gamma_0 = 2, 4, 6, \ldots, 2(M-1); & i = 0 \\ -\gamma_0; & i = 1 \\ 0; & i > 1 \end{cases} \qquad (10)$$
Fig. 3. The upper bound on the minimum Euclidean distance (dashed) and minimum distance curves (solid) for 1 ≤ N ≤ 10 observed symbol intervals. The pulse g(t) is constant and of duration 3T [equation (1)] and M = 2.
and thus the phase difference trajectories are

$$\varphi(t, \gamma) = \begin{cases} 0; & t \le 0 \\ \gamma_0 \cdot 2\pi h\, q(t); & 0 \le t \le T \\ \gamma_0 \cdot 2\pi h\, [q(t) - q(t - T)]; & T \le t \le (L+1)T \\ \gamma_0 \cdot 2\pi h\, [q((L+1)T) - q(LT)] = 0; & t > (L+1)T. \end{cases} \qquad (11)$$

By taking the minimum of the Euclidean distances between the signals having the phase difference trajectories above, an upper bound on the minimum Euclidean distance as a function of h, for fixed M and g(t), is achieved just as in Part I. The resulting bound is more complex, however. One can also consider second merges, that is, phase difference trajectories which merge at t = (L + 2)T. These phase difference trajectories are obtained from (2) by using

$$\gamma_i = \begin{cases} 0; & i < 0 \\ 2, 4, 6, \ldots, 2(M-1); & i = 0 \\ 0, \pm 2, \pm 4, \ldots, \pm 2(M-1); & i = 1, 2 \\ 0; & i > 2 \end{cases} \qquad (12)$$

satisfying

$$\sum_{i=0}^{2} \gamma_i = 0. \qquad (13)$$

It might happen that the Euclidean distance associated with these phase difference trajectories for a specific h is smaller than those giving the first merge. Hence, the former upper bound can be tightened by also taking the minimum of the Euclidean distances associated with the second merge. This new upper bound might also be further tightened by taking third merges into account, and so on. The exact number of merges needed to give an upper bound on the minimum Euclidean distance which cannot be further tightened is not known in the general case, but in no case treated below were merges later than the Lth needed [16]-[18], [21]. For the full response case (L = 1) it was shown in Part I that first merges give the tight bound.

Fig. 3 shows the resulting upper bound dashed for the CPM scheme defined by (1) together with actual minimum normalized Euclidean distances when N = 1, 2, ..., 10 observed symbol intervals. The upper bound is the minimum of three functions corresponding to the first three merges [21]. As was mentioned earlier, not all weak modulation indices affect the minimum Euclidean distance. In this case the phase difference trajectories associated with (7) give larger Euclidean distances than the upper bound. Only those weak modulation indices which have phase difference trajectories having distance smaller than the upper bound affect the minimum Euclidean distance [21].

Weak Systems

In general, the first inevitable (independent of h) merge occurs at t = (L + 1)T. However, there are partial response CPM systems which have earlier merges, depending on the shape of the frequency pulse g(t) and the number of levels M, but not the modulation index h. A partial response system is said to be weak of order $L_c$, if the first inevitable merge occurs at $t = (L + 1 - L_c)T$. $L_c$ cannot be larger than L. A class of first-order weak systems are those where the frequency pulse g(t) integrates to zero, i.e., q(LT) = 0. The sequence (10) of course still gives a merge, but not the earliest one. By choosing

$$\gamma_i = \begin{cases} 0; & i \ne 0 \\ 2, 4, 6, \ldots, 2(M-1); & i = 0 \end{cases} \qquad (14)$$

the phase difference trajectory now equals zero for all t > LT; see (2). Examples of weak systems are given in [18] and [21]. The partial response CPFSK system having the frequency pulse

$$g(t) = \begin{cases} \dfrac{1}{4T}; & \ldots \\ -\dfrac{1}{4T}; & \ldots \\ 0; & \text{otherwise} \end{cases} \qquad (15)$$

is a weak system of the first order [21]. The main conclusion concerning weak systems is that they should be avoided, since the potential maximum value of the minimum Euclidean distance, indicated by the upper bound $d_B^2(h)$, will never be reached due to the early merges. More results for weak systems can be found in [18]. We have found that there is little difficulty choosing pulses in such a way that the resulting schemes do not have the weak property considered above. This will be clearly illustrated in the following.
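The upper bound obtained from the first and second merges, (10) and (12)-(13), can be evaluated numerically. The sketch below is restricted to rectangular frequency pulses with q(LT) = 1/2 and is not the authors' program. It assumes the normalized squared distance $\log_2 M \cdot \frac{1}{T}\int [1 - \cos\varphi(t,\gamma)]\,dt$; the $\log_2 M$ factor is an assumption taken from the bit-energy normalization that reproduces the CPFSK distances of Table II in Part I, and it equals 1 for binary systems. The names `distance` and `upper_bound` are hypothetical.

```python
import itertools
import numpy as np

def q_rec(t, L, T=1.0):
    """Phase response of a constant frequency pulse of length L*T, with q(LT) = 1/2."""
    return np.clip(t, 0.0, L * T) / (2.0 * L * T)

def distance(gamma, h, L, M=2, T=1.0, K=2000):
    """Normalized squared distance of a merging difference sequence.

    Integrates 1 - cos(phi) by the midpoint rule over len(gamma) + L symbol
    intervals, after which the phase difference of a merging sequence is zero.
    """
    n_int = len(gamma) + L
    t = (np.arange(n_int * K) + 0.5) * T / K
    phi = 2 * np.pi * h * sum(g * q_rec(t - i * T, L, T) for i, g in enumerate(gamma))
    return np.log2(M) * n_int * (1.0 - np.mean(np.cos(phi)))

def upper_bound(h, L, M=2, merges=1):
    """Minimum distance over all sequences merging at (L + merges)T or earlier."""
    first = range(2, 2 * M, 2)
    later = range(-2 * (M - 1), 2 * M, 2)
    best = np.inf
    for g0 in first:
        for rest in itertools.product(later, repeat=merges):
            if g0 + sum(rest) == 0:            # condition (13), generalized
                best = min(best, distance((g0,) + rest, h, L, M, ))
    return best

if __name__ == "__main__":
    print(round(upper_bound(0.5, L=1), 4))               # MSK
    for h in (0.25, 0.5, 0.75, 1.0):
        print(h, round(upper_bound(h, L=3, merges=1), 3),
              round(upper_bound(h, L=3, merges=2), 3))
```

Since the second-merge family contains the first-merge sequences padded with a zero, including it can only lower (tighten) the bound.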
III. A SEQUENTIAL ALGORITHM FOR COMPUTATION OF THE MINIMUM EUCLIDEAN DISTANCE

A fast sequential algorithm for computation of the minimum Euclidean distance, for any real valued modulation index h and large numbers N of observed symbol intervals, has been developed. The algorithm is sequential in N and uses the basic properties of the minimum Euclidean distance, namely that it is upper bounded and is a nondecreasing function of N given g(t), h, and M. The sequential property is obtained from the fact that squared Euclidean distances for coherent CPM systems are additive. This means that if the squared Euclidean distance has been calculated for N symbol intervals, the squared Euclidean distance for N + 1 observed symbol intervals is obtained by just adding an increment to the previously calculated squared Euclidean distance. This holds for fixed difference sequences $\gamma$, and can be seen from the expression for the normalized squared Euclidean distance

$$d^2(\gamma_N, h) = N - \frac{1}{T} \sum_{n=0}^{N-1} \int_{nT}^{(n+1)T} \cos\left( 2\pi h \left[ \sum_{i=n-L+1}^{n} \gamma_i\, q(t - iT) + q(LT) \sum_{i=0}^{n-L} \gamma_i \right] \right) dt. \qquad (16)$$

Thus,

$$d^2(\gamma_{N+1}, h) = d^2(\gamma_N, h) + 1 - \frac{1}{T} \int_{NT}^{(N+1)T} \cos\left( 2\pi h \left[ \sum_{i=N-L+1}^{N} \gamma_i\, q(t - iT) + q(LT) \sum_{i=0}^{N-L} \gamma_i \right] \right) dt. \qquad (17)$$

A brute force method for calculating the minimum Euclidean distance for a given N is to compute $d^2(\gamma_N, h)$ for all sequences $\gamma_N$ defined by (3) and take the minimum of all the achieved quantities. The number of calculations required using this method grows exponentially with N since the number of difference sequences of length N is $(M-1)(2M-1)^{N-1}$ and even for M = 2 it is unrealistic to calculate the minimum Euclidean distance for moderate N-values.

A flowchart for a more efficient algorithm can be seen in Fig. 4. It is assumed that the minimum Euclidean distance is to be calculated for a CPM system when the modulation index is $h = h_{\min}, h_{\min} + \Delta h, \ldots, h_{\max} - \Delta h, h_{\max}$ and for $1 \le N \le N_{\max}$ observed symbol intervals. The first step is to compute the upper bound $d_B^2(h)$ on the squared minimum Euclidean distance for the given h-values. Now all the Euclidean distances for N = 1 observed symbol intervals are computed, i.e., the difference sequence is $\gamma_0 = 2, 4, \ldots, 2(M-1)$. If any of these distances is larger than the upper bound for that specific h-value, the entire subtree having the corresponding $\gamma_0$-value will never be used. This can be done since the Euclidean distance is a nondecreasing function of N for fixed h. The minimum of these Euclidean distances gives the minimum Euclidean distance over N = 1 observed symbol intervals.

When the minimum Euclidean distance for N = 2 observed symbol intervals is to be computed, only those $\gamma_0$-values whose corresponding distances when N = 1 did not exceed the upper bound are used. The second component of the difference sequence is chosen to be $\gamma_1 = -2(M-1), -2(M-2), \ldots, 2(M-1)$, and the Euclidean distance is computed through (17). Again, only those sequences whose corresponding Euclidean distances are not above the upper bound will be stored as possible difference sequences for N = 3. The algorithm continues like this up to the maximum N-value, $N_{\max}$. Through this procedure subtrees are continuously deleted. It is clear that any finite upper bound on the minimum Euclidean distance can be used for cutting subtrees. If the upper bound is too loose, however, the number of possible difference sequences will grow. The algorithm will naturally be as fast as possible if the smallest possible upper bound is used.

This algorithm has empirically been found to increase only linearly in computational complexity with N, and not exponentially as the brute force method. Without this algorithm the results on the minimum Euclidean distances presented in the next chapter would be impossible to achieve. Using this algorithm allows observation intervals up to a couple of hundred symbol intervals, although N ≤ 100 is usually sufficient.

IV. NUMERICAL RESULTS ON THE MINIMUM EUCLIDEAN DISTANCE

In the previous sections tools have been developed for evaluating the minimum Euclidean distance of any CPM system, and thus also the performance in terms of symbol error probability at large SNR. These tools will now be applied to selected classes of CPM systems. The chosen schemes are in no specific sense optimum. Systems are achieved, which are reasonably easy to describe and analyze and which have attractive properties. The first class of partial response CPM systems analyzed in this chapter is defined by the frequency pulse

$$g(t) = \begin{cases} \dfrac{1}{2LT} \left( 1 - \cos\left[ \dfrac{2\pi t}{LT} \right] \right); & 0 \le t \le LT \\ 0; & \text{otherwise,} \end{cases} \qquad (18)$$

i.e., the frequency pulse g(t) is a raised cosine over L symbol intervals. The class is obtained by varying M, L, and the modulation index h.

Another class of CPM systems which also will be considered are those having the Fourier transform of the frequency pulse g(t)

$$G(f) = F\{g(t)\} = \begin{cases} \dfrac{1}{4} \left[ 1 + \cos\left( \dfrac{\pi f L T}{2} \right) \right]; & |f| \le \dfrac{2}{LT} \\ 0; & \text{otherwise,} \end{cases} \qquad (19)$$
Fig. 4. (a) Flowchart for the sequential algorithm for computation of minimum distances $d_{\min,N}^2(h)$, $1 \le N \le N_{\max}$, $h_{\min} \le h \le h_{\max}$. (b) Continued.
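The procedure of Fig. 4 can be sketched compactly for a single value of h. This is a minimal illustration under stated assumptions, not the authors' program: the upper bound is taken from the first merges (10) only (any finite upper bound is valid for pruning, at the cost of keeping more sequences), the increment (17) is integrated with the midpoint rule, and a $\log_2 M$ factor is included in the normalized distance as an assumption (it equals 1 for M = 2). All function names are hypothetical.

```python
import numpy as np

def q_rec(t, L, T=1.0):
    """Phase response of a constant frequency pulse of length L*T."""
    return np.clip(t, 0.0, L * T) / (2.0 * L * T)

def q_rc(t, L, T=1.0):
    """Phase response of the raised cosine pulse (18), integrated in closed form."""
    tc = np.clip(t, 0.0, L * T)
    return (tc - L * T / (2 * np.pi) * np.sin(2 * np.pi * tc / (L * T))) / (2.0 * L * T)

def increment(gamma, n, h, q, L, M=2, T=1.0, K=400):
    """Distance increment over [nT, (n+1)T], following the split used in (17)."""
    t = (n + (np.arange(K) + 0.5) / K) * T
    lo = max(0, n - L + 1)
    settled = float(q(L * T, L, T)) * sum(gamma[:lo])      # symbols with i <= n - L
    active = sum(g * q(t - i * T, L, T)
                 for i, g in enumerate(gamma[lo:n + 1], start=lo))
    phi = 2 * np.pi * h * (active + settled)
    return np.log2(M) * (1.0 - np.mean(np.cos(phi)))

def min_distance(h, q, L, M=2, N_max=6, T=1.0):
    """Return (upper bound, [d_min^2 for N = 1..N_max]) using subtree pruning."""
    first = range(2, 2 * M, 2)
    later = range(-2 * (M - 1), 2 * M, 2)
    bound = min(
        sum(increment((g0, -g0) + (0,) * L, n, h, q, L, M, T) for n in range(L + 1))
        for g0 in first)
    alive = [((g0,), increment((g0,), 0, h, q, L, M, T)) for g0 in first]
    d_min = [min(d for _, d in alive)]
    for n in range(1, N_max):
        # delete every subtree whose distance already exceeds the upper bound
        alive = [(g, d) for g, d in alive if d <= bound + 1e-9]
        alive = [(g + (gn,), d + increment(g + (gn,), n, h, q, L, M, T))
                 for g, d in alive for gn in later]
        d_min.append(min(d for _, d in alive))
    return bound, d_min

if __name__ == "__main__":
    print(min_distance(0.5, q_rec, L=1, N_max=3))   # MSK
    print(min_distance(0.5, q_rc, L=2, N_max=4))    # binary 2RC, h = 1/2
```

The merging sequence that defines the bound is itself never pruned, so the list of surviving sequences cannot become empty. A tighter bound (for example one that also uses second and third merges) would prune more aggressively.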
i.e., the Fourier transform of the frequency pulse g(t) is raised cosine shaped [4]. The frequency pulse is

$$g(t) = \frac{1}{LT} \cdot \frac{\sin\left[ \dfrac{2\pi t}{LT} \right]}{\dfrac{2\pi t}{LT}} \cdot \frac{\cos\left[ \dfrac{2\pi t}{LT} \right]}{1 - 4\left( \dfrac{2t}{LT} \right)^2} \qquad (20)$$

which has an infinite duration.

The third class of CPM systems has previously been considered [13] (for the specific case h = 1/2, M = 2) and an approximation of the frequency pulse used is [17], [21]

$$g(t) = \frac{1}{8} \left[ g_0(t - T) + 2 g_0(t) + g_0(t + T) \right] \qquad (21)$$

where

$$g_0(t) \approx \frac{1}{T} \left[ \frac{\sin(\pi t/T)}{\pi t/T} - \frac{\pi^2}{24} \cdot \frac{2 \sin(\pi t/T) - (2\pi t/T) \cos(\pi t/T) - (\pi t/T)^2 \sin(\pi t/T)}{(\pi t/T)^3} \right]. \qquad (22)$$

This scheme will be referred to as the TFM (tamed frequency modulation) system as in [13]. It will be analyzed only for the binary case, but the modulation index will not be constrained to 1/2 as in the approximate analysis presented in [13].

The schemes given by (18) will be denoted 1RC (= RC), 2RC, 3RC, etc., since the frequency pulse g(t) is a raised cosine of length L = 1, 2, 3, etc. The schemes defined by (20) will be denoted 1SRC, 2SRC, etc., since the frequency pulse g(t) has a Fourier transform which is raised cosine shaped (Spectral Raised Cosine) and the width of the main-lobe of this pulse is L = 1, 2, 3, etc.
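The three pulse families (18), (20), and (21)-(22) can be written down directly and checked against the common normalization that the frequency pulse integrates to 1/2. The sketch below is illustrative only; the function names are hypothetical, the removable singularities of (20) and (22) are handled with their limiting values, and the infinite-duration pulses are integrated over a long but finite window.

```python
import numpy as np

def g_rc(t, L, T=1.0):
    """Raised cosine frequency pulse of length L*T, as in (18)."""
    t = np.asarray(t, dtype=float)
    inside = (t >= 0) & (t <= L * T)
    return np.where(inside, (1 - np.cos(2 * np.pi * t / (L * T))) / (2 * L * T), 0.0)

def g_src(t, L, T=1.0):
    """Spectral raised cosine frequency pulse, as in (20); infinite duration."""
    x = np.asarray(t, dtype=float) / (L * T)
    den = 1.0 - (4.0 * x) ** 2
    singular = np.abs(den) < 1e-9                  # t = +-LT/4, limit value 1/2
    main = np.sinc(2 * x) * np.cos(2 * np.pi * x) / np.where(singular, 1.0, den)
    return np.where(singular, 0.5, main) / (L * T)

def g0_tfm(t, T=1.0):
    """Approximate TFM prototype pulse g0(t), as in (22)."""
    x = np.pi * np.asarray(t, dtype=float) / T
    small = np.abs(x) < 1e-3
    xs = np.where(small, 1.0, x)
    second = (2 * np.sin(xs) - 2 * xs * np.cos(xs) - xs**2 * np.sin(xs)) / xs**3
    second = np.where(small, -1.0 / 3.0 + x**2 / 10.0, second)   # series near x = 0
    return (np.sinc(x / np.pi) - (np.pi**2 / 24.0) * second) / T

def g_tfm(t, T=1.0):
    """TFM frequency pulse, as in (21)."""
    return (g0_tfm(t - T, T) + 2 * g0_tfm(t, T) + g0_tfm(t + T, T)) / 8.0

def area(g, lo, hi, n=600001):
    """Trapezoidal integral of g over [lo, hi]."""
    t = np.linspace(lo, hi, n)
    y = g(t)
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * (t[1] - t[0]))

if __name__ == "__main__":
    print(area(lambda t: g_rc(t, 3), -1.0, 4.0))
    print(area(lambda t: g_src(t, 3), -300.0, 300.0))
    print(area(g_tfm, -300.0, 300.0))
```

All three integrals come out close to 1/2, which is consistent with q(LT) = 1/2 for the RC pulses and with G(0) = 1/4 · 2 = 1/2 in (19).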
reference poi n t
rsr~
Binary Systems
The RC class will now be analyzed by means of the minimum Euclidean distance. The simplest of these systems is naturally the 1RC scheme, which was considered in Part I. The minimum normalized squared Euclidean distance for the 2RC scheme is shown in Fig. 5 when the receiver observation interval is N = 1, 2, 3, 4, and 5 symbol intervals. The upper bound on the minimum Euclidean distance is shown dashed. For h = 1/2, all binary full response CPM systems including MSK have the minimum squared Euclidean distance 2 (see Part I). For this h-value the 2RC system yields almost the same distance when N = 3; the exact figure is 1.97. Hence, MSK and 2RC with h = 1/2 have almost the same performance in terms of symbol error probability for large SNR, but the optimum detector for the 2RC system must observe one symbol interval more than that for the MSK system. It can be expected, however, that the 2RC, h = 1/2 system has a more compact spectrum due to its smooth phase tree, and this will be discussed in Section V. For the binary full response CPFSK system, the maximum value of the minimum Euclidean distance is 2.43 when h = 0.715 and N = 3 [23]. Fig. 5 shows that larger values of the minimum Euclidean distance can be obtained for the 2RC system. When h ≈ 0.8 the minimum Euclidean distance is 2.65 for N = 4. From a spectral point of view, however, low modulation indices should be used. It can be seen that in the region 0 < h ≤ 1/2, the upper bound is reached with N = 3 observed symbol intervals.
The phase tree for the binary 3RC system is shown in Fig. 6 and the specific shape of the phase in a symbol interval depends upon the present and the two preceding data symbols. This makes the phase tree yet smoother than for the 2RC case [20]. The first merge occurs at t = 4T. Note the straight lines in the phase tree. These occur for all RC schemes and they can be used for synchronization [24], [28]. Note also that the slope in the phase tree is never larger than hπ/T (i.e., the slope of the full response CPFSK tree, Fig. 2 in Part I). This is true for all binary RC schemes. The minimum normalized squared Euclidean distances for this scheme are shown in Fig. 7 when N = 1, 2, ..., 6 and N = 15 observed symbol intervals. The upper bound $d_B^2(h)$ has been calculated through the method described in Section III, using the first, second, and third merges. For N = 1 observed symbol interval, the minimum Euclidean distance is very poor for almost every modulation index, but already with N = 2 it is increased significantly and when N = 4 the upper bound is reached in the region 0
Fig. 5. Minimum squared Euclidean distance versus modulation index h for the binary 2RC scheme.
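The distance figures quoted for the 2RC scheme can be approximated by brute force. The following is a minimal sketch, not the algorithm of Section III: it assumes the normalization from Part I, $d^2 = (\log_2 M / T)\int_0^{NT}[1 - \cos\Delta\varphi(t)]\,dt$, with $\Delta\varphi(t) = 2\pi h \sum_i \gamma_i q(t - iT)$ and difference symbols $\gamma_i \in \{0, \pm 2, \ldots, \pm 2(M-1)\}$, $\gamma_0 \neq 0$. The function names `q_rc` and `d2_min` are hypothetical, and the search is exhaustive, so it is only practical for small N.

```python
import itertools
import math


def q_rc(t, L, T=1.0):
    """Phase response q(t): running integral of the length-L raised cosine pulse."""
    if t <= 0.0:
        return 0.0
    if t >= L * T:
        return 0.5
    return t / (2 * L * T) - math.sin(2 * math.pi * t / (L * T)) / (4 * math.pi)


def d2_min(h, L, N, M=2, samples=200, T=1.0):
    """Exhaustive minimum of the normalized squared distance over N intervals.

    Assumed normalization (Part I): d^2 = log2(M)/T * integral of 1 - cos(dphi).
    Only gamma_0 > 0 is searched, since negating a whole difference
    sequence leaves the distance unchanged.
    """
    levels = range(-2 * (M - 1), 2 * (M - 1) + 1, 2)
    dt = T / samples
    best = math.inf
    for g0 in (g for g in levels if g > 0):
        for rest in itertools.product(levels, repeat=N - 1):
            gamma = (g0,) + rest
            acc = 0.0
            for k in range(N * samples):
                t = (k + 0.5) * dt  # midpoint rule
                dphi = 2 * math.pi * h * sum(
                    g * q_rc(t - i * T, L, T) for i, g in enumerate(gamma))
                acc += 1.0 - math.cos(dphi)
            best = min(best, math.log2(M) * acc * dt / T)
    return best


if __name__ == "__main__":
    print("1RC, h=1/2, N=2:", round(d2_min(0.5, 1, 2), 3))
    print("2RC, h=1/2, N=3:", round(d2_min(0.5, 2, 3), 3))
```

Under these assumptions the full response case with h = 1/2 should return 2, and the 2RC value for N = 3 should land close to the 1.97 quoted in the text.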
Fig. 6. Binary phase tree for 3RC, h = 4/5. The assignment of states is used in Section VI.
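The slope bound hπ/T and the straight lines in the 3RC phase tree can be checked numerically. The sketch below assumes the raised cosine frequency pulse $g(t) = \frac{1}{2LT}\left[1 - \cos(2\pi t / LT)\right]$ on $0 \le t \le LT$ (the RC pulse is defined earlier in the paper, outside this section) and evaluates the phase slope $2\pi h \sum_i \alpha_i g(t - iT)$ for every binary sequence of a given length. The names `g_rc` and `max_slope` are hypothetical.

```python
import itertools
import math


def g_rc(t, L, T=1.0):
    """Raised cosine frequency pulse of length L*T (assumed form)."""
    if t < 0.0 or t > L * T:
        return 0.0
    return (1.0 - math.cos(2 * math.pi * t / (L * T))) / (2 * L * T)


def max_slope(h, L, n_symbols=6, samples=50, T=1.0):
    """Largest |d(phase)/dt| over all binary sequences of n_symbols symbols."""
    worst = 0.0
    for alpha in itertools.product((-1, 1), repeat=n_symbols):
        for k in range(n_symbols * samples + 1):
            t = k * T / samples
            slope = 2 * math.pi * h * sum(
                a * g_rc(t - i * T, L, T) for i, a in enumerate(alpha))
            worst = max(worst, abs(slope))
    return worst


if __name__ == "__main__":
    h, L = 0.8, 3
    print("max slope:", max_slope(h, L), "bound h*pi/T:", h * math.pi)
```

The bound is met by runs of equal symbols: L overlapping raised cosine pulses then sum to the constant 1/(2T), which gives the straight lines mentioned in the text.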
There is an apparent weak modulation index when h = 2/3, but this is not a true weak modulation index since the minimum Euclidean distance in fact increases (by a small amount) when N is increased. This behavior is caused by a crossing and not a merge, and the phase trajectories are close after this crossing. The effect can be seen in the phase tree in Fig. 6, for this specific modulation index, if the phase is viewed modulo 2π. Phase trees and minimum distance graphs for 4RC, 5RC are found in [17], [20], and [21]. The distance/modulation index plot of the binary 6RC scheme is shown in Fig. 8. Note that the peak $d^2(h)$ value is larger than those for the shorter pulses and it occurs for a larger h-value. Also note that the distance value at h-values below 0.5 decreases with L. Fig. 8 also shows that longer pulses require longer observation intervals. For the binary 6RC system, minimum Euclidean distance values above 5 can be obtained when h ≈ 1.25 and N = 30 observed symbol intervals. This corresponds to an asymptotic gain in SNR of at least 4 dB compared to MSK.
Fig. 7. Minimum squared Euclidean distance versus modulation index h for binary 3RC. N is the receiver observation interval. The upper bound is shown dashed.
Fig. 8. Minimum squared Euclidean distance versus h for M = 2, 6RC.
These trends are summarized in Fig. 9 using data in [17] and [21], where upper bounds for the RC family 3 ≤ L ≤ 6 are shown. Except for a few weak h-values, the minimum distance equals $d_B^2(h)$ for sufficiently large N-values.
The binary system 8RC is second-order weak, i.e., the upper bound on the minimum Euclidean distance calculated by using the merge difference sequences can be further tightened by also taking into account the sequence giving the early merge at t = 7T. The two systems 7RC and 9RC are not weak according to the above definition, but since they are very close to the 8RC system they are "nearly weak." Thus, for 7RC and 9RC, N must be chosen very large for all modulation indices to make the minimum Euclidean distance equal to the upper bound $d_B^2(h)$, even for very low h-values. For further details, see [17] and [21].
From a spectral point of view it is intuitively appealing to use strictly bandlimited frequency pulses g(t). A frequently used pulse for digital AM systems is the SRC pulse [4]. This pulse is of course of infinite duration. The phase tree for the binary 3SRC system with the frequency pulse g(t) given by (20) with L = 3 truncated symmetrically to a total length of 7 symbol intervals is very similar to the 3RC tree in Fig. 6. It is believed that an optimum detector for the 3RC system works well if the transmitted signal is 3SRC, but the distortion in the 3SRC phase tree will appear as a slightly increased noise level. This assumption must of course be properly shown. Fig. 10 shows the minimum normalized squared Euclidean distances for the binary 3SRC scheme. The pulse g(t) has been truncated to a total length of 7T. By comparing these minimum Euclidean distances to those for the binary 3RC system (see Fig. 7), it is indeed seen that they are very similar. The number of observed symbol intervals required to reach the upper bound $d_B^2(h)$ must be increased for the 3SRC system, however. This is due to the tails of the 3SRC pulse.
A bandwidth efficient digital CPM system was recently developed and given the name tamed frequency modulation [13], or TFM. This binary system uses the bandlimited frequency pulse given by (21). This system has modulation index 1/2, and the detector is constructed with the assumption that the transmitted signal is linearly modulated, and is therefore suboptimum. The detector is simple, however, and simulations have indicated that the performance in terms of symbol error probability for a given SNR is very close to the optimum detector [13]. This system will now be analyzed by means of the minimum Euclidean distance; it has been generalized to include all real valued modulation indices. The result can be seen
Fig. 9. Summary of upper bounds for 3RC-6RC. MSK is shown for comparison.
Fig. 10. Minimum squared Euclidean distances versus h for a binary modulation system with a spectral raised cosine pulse 3SRC, truncated to a total length of 7T.
in Fig. 11. The subpulse $g_0(t)$ given by (22) has been truncated to a total length of five symbol intervals. The main-lobe of the frequency pulse g(t) used for the TFM system is approximately 3.7T wide. For h = 1/2 the upper bound on the minimum Euclidean distance is reached with N = 10 observed symbol intervals and equals 1.58. The value of the upper bound for 3RC and 4RC when h = 1/2 is 1.76 and 1.51, respectively.
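Distance values such as these translate into an asymptotic difference in $E_b/N_0$ relative to MSK, whose minimum squared distance is 2. The short sketch below assumes the relation $10\log_{10}(d^2/2)$, which is consistent with the gains quoted elsewhere in the paper (for example 3.17 giving 2.00 dB). The helper name `gain_db` is hypothetical.

```python
import math


def gain_db(d2, d2_ref=2.0):
    """Asymptotic Eb/N0 gain in dB relative to a reference distance (MSK: 2)."""
    return 10.0 * math.log10(d2 / d2_ref)


if __name__ == "__main__":
    for name, d2 in (("TFM", 1.58), ("3RC", 1.76), ("4RC", 1.51)):
        # Negative values are losses relative to MSK at h = 1/2.
        print(f"{name}: d^2 = {d2:.2f}, gain = {gain_db(d2):+.2f} dB")
```

With this relation the three h = 1/2 systems above all show a loss of roughly 0.5 to 1.2 dB relative to MSK, which is the price of their narrower spectra.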
Fig. 11. Minimum squared Euclidean distances versus h for TFM. The subpulse $g_0(t)$ is truncated to a length of 5T.
Quaternary and Octal Systems
In Part I it was found that multilevel systems (M = 4, 8, ...) yield larger minimum Euclidean distances than binary systems. The first quaternary partial response CPM system within the RC class is 2RC. Now four phase branches leave every node in the phase tree, depending on the data symbols ±1, ±3. The phase tree for the binary 2RC system forms a subtree of the quaternary system. The first merges occur at t = 3T. The results of the distance calculations, using the algorithm described in Section III, are shown in Fig. 12 for the M = 4, 3RC scheme. Distance calculations for various M-ary RC
schemes are reported in [17] and [21]. The trends already observed for the binary system again appear as the length of the frequency pulse g(t) is increased. The upper bound decreases for low modulation indices and increases for large modulation indices. Actually, very large values of the minimum Euclidean distance are obtained in the region h ≈ 1.3 and a receiver observation interval of N = 15 symbol intervals is enough to make the minimum Euclidean distance coincide with the upper bound. For these h-values the minimum Euclidean distance takes values around 6.28. This corresponds to an asymptotic gain of 5 dB compared to MSK or QPSK.
Minimum Euclidean distance calculations have also been performed for the octal (M = 8) and hexadecimal (M = 16) systems using the frequency pulses 2RC and 3RC. The results can be found in [17] and [21]. The main conclusion concerning these results is that the trend for increasing L holds, i.e., lower values of $d_B^2(h)$ for low modulation indices, larger values of $d_B^2(h)$ for large modulation indices and an increased number of symbol intervals required in order to make the actual minimum Euclidean distance equal to the upper bound $d_B^2(h)$. The octal 3RC system yields a minimum Euclidean distance of 9.03 when h = 1.38 [21] with N = 12 symbol intervals required. The asymptotic gain in terms of $E_b/N_0$ compared to MSK or QPSK is 6.5 dB.
Fig. 12. Minimum squared normalized Euclidean distance versus h for M = 4, 3RC. The upper bound is reached with N = 12 symbols in the interval 0.25 ≤ h ≤ 0.5.
V. POWER SPECTRA
The power spectra (double-sided) can be obtained by numerical calculations, simulations in software, and hardware measurements. Numerical calculations have been carried out by using formulas in [25]. These computer calculations are very time consuming, especially for large M and L values and for large fT values. The bulk of the spectra in this paper have been obtained by means of simulations, a much faster method. For each pulse shape, M, h, and L value, the simulated spectra have been compared to numerically calculated spectra and a close fit was observed [26]. Comparisons have also been made to previously published numerically calculated spectra in [5], [9], and [14].
The simulated spectra have been calculated by the use of a well-tested simulation program [17]. In these simulations, which are made in discrete time, the complex envelope $e^{j\varphi(t,\alpha)}$ has been sampled four times per symbol interval. The data symbol sequence is a randomly generated M-ary sequence 128 symbols long, and the estimated power spectrum is the squared magnitude of the discrete Fourier transform (DFT) applied to the complex envelope $e^{j\varphi(t,\alpha)}$. These limitations introduce some distortion in the spectra and in some cases simulations have been performed using eight samples per symbol interval of the complex envelope. No significant change in the result has been observed.
The behavior of the power spectra for large frequencies is also of interest. It is not feasible to use the above-mentioned methods to obtain the spectral tail behavior due to numerical inaccuracy. In [5] the asymptotic behavior of partial response CPM systems is given. The number of continuous derivatives of g(t) determines the fall-off rate for large frequencies [5], [26], as discussed below.
Fig. 13 shows the power spectra for the binary CPM systems 2RC, 3RC, and 4RC when h = 1/2. For comparison the spectrum for MSK is also shown. The frequency is normalized with the bit rate $1/T_b$, where $T = T_b \log_2 M$, and the spectra are shown in a logarithmic scale (dB). The spectrum for MSK falls off as $|f|^{-4}$ and for the RC systems as $|f|^{-8}$ for large frequencies [5]. For full response CPM systems the improved fall-off rate has to be paid for by increased first side-lobes. It can be seen from Fig. 13 that by also increasing the length of the frequency pulse g(t), the side-lobes can be made to decrease. The power spectra are thus compact and the first side-lobe very small with long pulses.
The effect of changing the modulation index h for the binary RC systems has been studied in detail in [17], [21], and [26]. The general trend is that the spectra become wider for increasing h-values, but normally maintain their smooth shape. As an example, see Fig. 14. For h-values equal to integer values, spectral lines occur [5], [25].
As can be expected, even more compact spectra can be obtained by using strictly bandlimited pulses g(t) of the SRC type. Fig. 15 shows the power spectra for the binary system 6SRC, when the modulation index is h = 0.4, 0.5, 0.8, and 1.2. In this simulation the infinitely long frequency pulse given by (20) has not been significantly truncated [17]. It can be seen that these spectra are more compact than those for the
Fig. 13. Power spectra (dB) for binary CPM schemes with various baseband pulses. h = 0.5.
Fig. 15. Power spectra (dB) for M = 2, 6SRC. No pulse truncation.
RC system for large frequencies. No side-lobes at all can be distinguished for the 6SRC spectra shown.
Bandwidth is not a quantity which is precisely defined for signals that are not strictly bandlimited. A common way of defining bandwidth for such signals is by means of the fractional out-of-band power (see Part I). Naturally, this definition of bandwidth can also be used for the CPM signals considered in this paper. However, since the power spectra are extremely compact and the out-of-band computation is not simple, bandwidth will instead be defined directly from the power spectra. Thus, the bandwidth $2BT_b$ is the bit rate normalized frequency for which the spectrum itself equals -60 dB when f = B. The level -60 dB has been chosen in order to illustrate constant envelope modulation schemes which have spectra sufficiently well-behaved to make bandlimiting radio-frequency filtering unnecessary.
Fig. 14. Power spectra for M = 2, 4RC. h = 0.5, 0.8, and 1.2.
Just like full response CPM systems, multilevel partial response CPM systems yield more compact spectra than binary systems for fixed frequency pulse and distance. An example of this behavior is shown in Fig. 16, where the power spectra for the quaternary system (M = 4, 3RC) are given for h = 0.25, 0.4, 0.5, and 0.6. By comparing to the corresponding binary systems, it can be seen that the spectra have been compressed by using four levels instead of two [17], [21]. Note that the frequency is normalized with the bit rate $1/T_b$ and not the symbol rate $1/T$. Power spectra for multilevel systems having strictly bandlimited frequency pulses g(t) are shown in [17] and [21]. The systems there are quaternary 3SRC and 4SRC, and octal 3SRC, respectively. For each of these systems spectra are shown for
various h-values. The same comparisons between the RC and the SRC schemes hold for the M-ary as for the binary case.
Fig. 16. Power spectra for M = 4, 3RC. Note that the frequency is normalized with the bit rate $1/T_b$ in all spectrum figures.
VI. TRANSMITTER AND RECEIVER STRUCTURES
It is assumed that the pulse g(t) has finite length LT, i.e., g(t) ≡ 0 for t < 0 and t > LT. Since g(t) is time limited, q(t) is 0 for t ≤ 0 and constant at q(LT) for t ≥ LT. For positive pulses g(t), e.g., the raised cosine pulses (18), q(LT) = 1/2. Thus, the information carrying phase (5) can be written as
$$\varphi(t,\alpha) = 2\pi h \sum_{i=-\infty}^{n} \alpha_i\, q(t-iT) = 2\pi h \sum_{i=n-L+1}^{n} \alpha_i\, q(t-iT) + h\pi \sum_{i=-\infty}^{n-L} \alpha_i, \qquad nT \le t \le (n+1)T. \tag{23}$$
Hence, for given h and g(t) and for any symbol interval n, the phase $\varphi(t,\alpha)$ is defined by $\alpha_n$, the correlative state vector $(\alpha_{n-1}, \alpha_{n-2}, \ldots, \alpha_{n-L+1})$ and the phase state $\theta_n$, where
$$\theta_n = \left[ h\pi \sum_{i=-\infty}^{n-L} \alpha_i \right] \bmod 2\pi. \tag{24}$$
The number of correlative states is finite and equal to $M^{L-1}$. For rational modulation indices the phase tree is reduced to a phase trellis [10], [12], [22], [28]. For h = 2k/p (k, p integers) there are p different phase states with values 0, 2π/p, 2·2π/p, ..., (p - 1)2π/p. The state is defined by the L-tuple $\sigma_n = (\theta_n, \alpha_{n-1}, \alpha_{n-2}, \ldots, \alpha_{n-L+1})$. The total number of states is $S = pM^{L-1}$. Transmitter and receiver structures utilizing these properties are described below.
Fig. 6 shows the phase tree for a binary system with a raised cosine pulse of length L = 3 (3RC) for h = 4/5. Phase states and correlative states are assigned to the nodes in the phase tree. The root node is arbitrarily given phase state 0. Each node in the tree is labeled with the state $(\theta_n, \alpha_{n-1}, \alpha_{n-2})$. The state trellis diagram can be derived from Fig. 6.
The transmitted signal can always be written
$$s(t,\alpha) = \sqrt{\frac{2E}{T}}\,\left[ I(t)\cos(2\pi f_0 t) - Q(t)\sin(2\pi f_0 t) \right] \tag{25}$$
where
$$I(t) = \cos[\varphi(t,\alpha)], \qquad Q(t) = \sin[\varphi(t,\alpha)]. \tag{26}$$
From (23) we have
$$\varphi(t,\alpha) = \theta(t,\alpha) + \theta_n; \qquad nT \le t \le (n+1)T \tag{27}$$
where
$$\theta(t,\alpha) = 2\pi h \sum_{i=n-L+1}^{n} \alpha_i\, q(t-iT). \tag{28}$$
Hence, for nT ≤ t ≤ (n + 1)T,
$$\begin{aligned} I(t) &= \cos[\theta(t,\alpha)]\cos\theta_n - \sin[\theta(t,\alpha)]\sin\theta_n \\ Q(t) &= \cos[\theta(t,\alpha)]\sin\theta_n + \sin[\theta(t,\alpha)]\cos\theta_n. \end{aligned} \tag{29}$$
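The decomposition of the phase into a correlative part and a phase state can be checked with a few lines of code. The sketch below is an illustration, not the transmitter of Fig. 17 or Fig. 18: it assumes the raised cosine phase response $q(t) = t/(2LT) - \sin(2\pi t/LT)/(4\pi)$ on $0 \le t \le LT$, takes symbols before index 0 to be absent, and compares I(t), Q(t) computed from (27)-(29) with the cosine and sine of the full phase sum in (23). The names `q_rc`, `phase_direct`, and `iq_from_states` are hypothetical.

```python
import math


def q_rc(t, L, T=1.0):
    """Phase response q(t) of the length-L raised cosine pulse (assumed form)."""
    if t <= 0.0:
        return 0.0
    if t >= L * T:
        return 0.5
    return t / (2 * L * T) - math.sin(2 * math.pi * t / (L * T)) / (4 * math.pi)


def phase_direct(t, alpha, h, L, T=1.0):
    """Full phase: first form of (23), summed over every symbol in alpha."""
    return 2 * math.pi * h * sum(
        a * q_rc(t - i * T, L, T) for i, a in enumerate(alpha))


def iq_from_states(t, alpha, h, L, T=1.0):
    """I(t), Q(t) from the phase state (24) and the L most recent symbols (28), (29)."""
    n = int(t // T)
    first = max(n - L + 1, 0)
    theta_n = (h * math.pi * sum(alpha[:first])) % (2 * math.pi)
    theta_t = 2 * math.pi * h * sum(
        alpha[i] * q_rc(t - i * T, L, T) for i in range(first, n + 1))
    i_t = math.cos(theta_t) * math.cos(theta_n) - math.sin(theta_t) * math.sin(theta_n)
    q_t = math.cos(theta_t) * math.sin(theta_n) + math.sin(theta_t) * math.cos(theta_n)
    return i_t, q_t


if __name__ == "__main__":
    alpha = [1, -1, -1, 1, 1, -1, 1, 1]
    for t in (0.3, 2.5, 4.75, 7.9):
        phi = phase_direct(t, alpha, h=0.8, L=3)
        print(t, (math.cos(phi), math.sin(phi)), iq_from_states(t, alpha, 0.8, 3))
```

Only the L most recent symbols and one of p phase state values are needed per interval, which is what makes a table-driven (ROM) generator possible.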
The basic transmitter structure is given by (25) and (26). The I- and Q-generators can be implemented in different ways. Fig. 17 shows an example based on (27) and (28). By also using (29), the ROM (read only memory) size can be reduced by a factor of p, but adders and multipliers must be used as in Fig. 18. Alternative structures are presented in [24]. These transmitters work for all rational h-values and time limited pulses g(t). Only the ROM contents change. An exact rational relationship between h and 1/T is obtained. Both ROM's in Fig. 18 (phase branch ROM's) are of size $N_s \cdot N_q \cdot M^L$ bits, where $N_s$ is the number of samples per symbol interval and $N_q$ is the number of bits per sample. The size of the $\cos(\theta_n)$ and $\sin(\theta_n)$ ROM's is $p \cdot N_q$. The phase states are accessed sequentially [24].
From a spectral point of view it is desirable to use strictly bandlimited pulses g(t), e.g., the SRC pulse given by (20). This pulse must be truncated to some length $L_T$, and hence the phase branch ROM's become a factor $M^{L_T - L}$ larger, compared to an RC pulse of length L.
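The state count and ROM sizes quoted above are simple products, and a small calculator makes the growth with L and M concrete. The sketch below only restates the expressions from the text; the function names and the example values of $N_s$ and $N_q$ are hypothetical.

```python
from fractions import Fraction


def phase_states(p):
    """The p phase state values for h = 2k/p, as multiples of pi."""
    return [Fraction(2 * i, p) for i in range(p)]


def num_states(p, M, L):
    """Total number of trellis states, S = p * M**(L - 1)."""
    return p * M ** (L - 1)


def branch_rom_bits(Ns, Nq, M, L):
    """Size of one phase branch ROM in Fig. 18: Ns * Nq * M**L bits."""
    return Ns * Nq * M ** L


def phase_rom_bits(p, Nq):
    """Size of the cos(theta_n) or sin(theta_n) ROM: p * Nq bits."""
    return p * Nq


def src_growth(M, L_T, L):
    """Factor by which a branch ROM grows for a pulse truncated to length L_T."""
    return M ** (L_T - L)


if __name__ == "__main__":
    # Binary 3RC with h = 4/5 (k = 2, p = 5); Ns = 4 and Nq = 8 are example values.
    print("phase states (units of pi):", [str(f) for f in phase_states(5)])
    print("S =", num_states(5, 2, 3))
    print("branch ROM bits:", branch_rom_bits(4, 8, 2, 3))
    print("phase ROM bits:", phase_rom_bits(5, 8))
    print("3SRC truncated to 7T vs 3RC:", src_growth(2, 7, 3))
```

For the binary 3RC, h = 4/5 example of Fig. 6 this gives 20 states, and truncating a 3SRC pulse to 7T makes each branch ROM 16 times larger than for 3RC.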
Fig. 17. I- and Q-generators. Characters with underbars appear boldface in text.
Fig. 18. I- and Q-generators with reduced ROM size. Characters with underbars appear boldface in text.
Receiver Structures
The receiver observes the signal $r(t) = s(t,\alpha) + n(t)$, where the noise n(t) is Gaussian and white (see Part I). The MLSE receiver maximizes the log likelihood function [2]
$$\log_e\left[ p_{r(t)\mid\tilde{\alpha}}\big(r(t)\mid\tilde{\alpha}\big) \right] \sim -\int_{-\infty}^{\infty} \left[ r(t) - s(t,\tilde{\alpha}) \right]^2 dt \tag{30}$$
with respect to the infinitely long estimated sequence $\tilde{\alpha}$. The maximizing sequence $\tilde{\alpha}$ is the maximum likelihood sequence estimate and $p_{r(t)\mid\tilde{\alpha}}$ is the probability density function for the observed signal r(t) conditioned on the infinitely long sequence $\tilde{\alpha}$. It is equivalent to maximize the correlation
$$J(\tilde{\alpha}) = \int_{-\infty}^{\infty} r(t)\, s(t,\tilde{\alpha})\, dt. \tag{31}$$
Now define
$$J_n(\tilde{\alpha}) = \int_{-\infty}^{(n+1)T} r(t)\, s(t,\tilde{\alpha})\, dt. \tag{32}$$
Thus, it is possible to write
$$J_n(\tilde{\alpha}) = J_{n-1}(\tilde{\alpha}) + Z_n(\tilde{\alpha}) \tag{33}$$
where
$$Z_n(\tilde{\alpha}) = \int_{nT}^{(n+1)T} r(t)\cos\big(\omega_0 t + \varphi(t,\tilde{\alpha})\big)\, dt. \tag{34}$$
Using the above formulas it is possible to calculate the function $J(\tilde{\alpha})$ recursively through (33) and the metric $Z_n(\tilde{\alpha})$. This metric is recognized as a correlation between the received signal over the nth symbol interval and an estimated signal from the receiver.
The Viterbi algorithm [7] is a recursive procedure to choose those sequences that maximize the log likelihood function up to the nth symbol interval. A trellis is used to choose among possible extensions of those sequences. The receiver computes $Z_n(\tilde{\alpha}_n, \tilde{\theta}_n)$ for all $M^L$ possible sequences $\tilde{\alpha}_n = \{\tilde{\alpha}_n, \tilde{\alpha}_{n-1}, \ldots, \tilde{\alpha}_{n-L+1}\}$ and all p possible $\tilde{\theta}_n$. This makes $p \cdot M^L$ different $Z_n$ values. Rewriting (34) using (27) yields
$$Z_n(\tilde{\alpha}_n, \tilde{\theta}_n) = \int_{nT}^{(n+1)T} r(t)\cos\big(\omega_0 t + \theta(t,\tilde{\alpha}_n) + \tilde{\theta}_n\big)\, dt. \tag{35}$$
It is seen that $Z_n(\tilde{\alpha}_n, \tilde{\theta}_n)$ is obtained by feeding the signal r(t) into a filter and sampling the output of the filter at t = (n + 1)T. In this case a bank of bandpass filters must be used. The noise n(t) is written
$$n(t) = x(t)\cos\omega_0 t - y(t)\sin\omega_0 t. \tag{36}$$
Using the basic quadrature receiver the received quadrature components are
$$\hat{I}(t) = \sqrt{\frac{2E}{T}}\, I(t) + x(t), \qquad \hat{Q}(t) = \sqrt{\frac{2E}{T}}\, Q(t) + y(t). \tag{37}$$
By inserting these components in (35) and omitting double frequency terms, we have (apart from a constant factor, which does not affect the maximization)
$$\begin{aligned} Z_n(\tilde{\alpha}_n, \tilde{\theta}_n) ={}& \cos(\tilde{\theta}_n)\int_{nT}^{(n+1)T} \hat{I}(t)\cos\big(\theta(t,\tilde{\alpha}_n)\big)\, dt + \cos(\tilde{\theta}_n)\int_{nT}^{(n+1)T} \hat{Q}(t)\sin\big(\theta(t,\tilde{\alpha}_n)\big)\, dt \\ &+ \sin(\tilde{\theta}_n)\int_{nT}^{(n+1)T} \hat{Q}(t)\cos\big(\theta(t,\tilde{\alpha}_n)\big)\, dt - \sin(\tilde{\theta}_n)\int_{nT}^{(n+1)T} \hat{I}(t)\sin\big(\theta(t,\tilde{\alpha}_n)\big)\, dt. \end{aligned} \tag{38}$$
This can be interpreted as $4M^L$ baseband filters with the impulse response
$$h_c(t,\tilde{\alpha}_n) = \begin{cases} \cos\left[ 2\pi h \sum_{j=-L+1}^{0} \tilde{\alpha}_j\, q\big((1-j)T - t\big) \right], & 0 \le t \le T \\ 0, & \text{for } t \text{ outside } [0, T] \end{cases} \tag{39}$$
and
$$h_s(t,\tilde{\alpha}_n) = \begin{cases} \sin\left[ 2\pi h \sum_{j=-L+1}^{0} \tilde{\alpha}_j\, q\big((1-j)T - t\big) \right], & 0 \le t \le T \\ 0, & \text{for } t \text{ outside } [0, T]. \end{cases} \tag{40}$$
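The structure of (38) can be illustrated in discrete time: the four correlations depend only on the hypothesized symbols, and the phase state enters afterwards through two multiplications. The sketch below replaces the integrals by sums over samples (a constant scale factor is ignored) and compares the result with the direct baseband form $\sum[\hat{I}\cos(\theta + \tilde{\theta}_n) + \hat{Q}\sin(\theta + \tilde{\theta}_n)]$. The function names and the sample values are hypothetical.

```python
import math


def correlations(i_hat, q_hat, theta):
    """The four baseband correlations of (38) over one symbol interval."""
    ic = sum(i * math.cos(th) for i, th in zip(i_hat, theta))
    is_ = sum(i * math.sin(th) for i, th in zip(i_hat, theta))
    qc = sum(q * math.cos(th) for q, th in zip(q_hat, theta))
    qs = sum(q * math.sin(th) for q, th in zip(q_hat, theta))
    return ic, is_, qc, qs


def metric_from_correlations(corr, theta_n):
    """Combine the correlations for one phase state, following the signs in (38)."""
    ic, is_, qc, qs = corr
    return math.cos(theta_n) * (ic + qs) + math.sin(theta_n) * (qc - is_)


def metric_direct(i_hat, q_hat, theta, theta_n):
    """Direct correlation with cos/sin of the full hypothesized phase."""
    return sum(i * math.cos(th + theta_n) + q * math.sin(th + theta_n)
               for i, q, th in zip(i_hat, q_hat, theta))


if __name__ == "__main__":
    # Arbitrary example samples of the quadrature components and of theta(t).
    i_hat = [0.9, 0.4, -0.2, -0.7]
    q_hat = [0.1, 0.8, 1.0, 0.6]
    theta = [0.05, 0.35, 0.9, 1.4]
    corr = correlations(i_hat, q_hat, theta)
    for m in range(5):  # p = 5 phase states, as for h = 4/5
        theta_n = 2 * math.pi * m / 5
        print(m, metric_from_correlations(corr, theta_n),
              metric_direct(i_hat, q_hat, theta, theta_n))
```

Because the correlations are computed once per symbol hypothesis and reused for every phase state, the filter bank size depends on $M^L$ but not on p.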
The number of filters required can be reduced by a factor of two by observing that every $\tilde{\alpha}$-sequence has a corresponding sequence with reversed sign. Fig. 19 shows an optimum receiver with $F = 2M^L$ matched filters. The outputs of these filters are sampled once every symbol interval. The metrics $Z_n(\tilde{\alpha}_n, \tilde{\theta}_n)$ are obtained by the use of (38) and used by the Viterbi algorithm. A delay of $N_T$ symbol intervals is introduced. Alternative receiver structures are given in [11], [12], and [21].
A method has been developed for calculation of upper bounds on the symbol error probability for a Viterbi detector having a path memory of length $N_T$ symbol intervals [21], [27]. Fig. 20 shows the result of this calculation for the binary 3RC scheme with h = 4/5 when $N_T$ = 1, 2, ..., 20 and $N_T = \infty$. A lower bound is also shown. As a reference the bit error probability for QPSK (BPSK) is shown dashed. The asymptotic behavior cannot be improved by making $N_T > N_B$ [27], where $N_B$ is the value of N where $d_{\min}^2 = d_B^2$. Compare the improvements of the upper bounds with $N_T$ in Fig. 20 to the growth of $d_{\min}^2$ with N for h = 4/5 in Fig. 7. The lower bound shown in Fig. 20 is based on the minimum Euclidean distance and it is seen that if $N_T \ge N_B$, the upper and lower bounds are close even for fairly large error probabilities. By further increasing $N_T$, improvement is achieved for low SNR. For very low SNR the upper bounds are loose. From the calculation of $d_{\min}^2$ yielding the value 3.17 (see Fig. 7), the asymptotic gain in $E_b/N_0$ is 2.00 dB compared to QPSK (MSK). From Fig. 20, it is seen that this gain is achieved even at fairly low SNR. This conclusion holds for a large variety of schemes and it can thus be concluded that the minimum Euclidean distance is sufficient for the characterization of performance in terms of symbol error probability [21], [27]. Even where the upper bound is loose the minimum distance appears to be accurate. For the scheme considered in Fig. 20 some simulation results are shown. In this simulation a coherent Viterbi detector with $N_T$ = 11 was used. We have obtained results similar in appearance for a wide variety of modulation systems [21].
Fig. 20. Upper bounds on the bit error probability for 3RC, M = 2, h = 4/5. A lower bound is also shown. Compare QPSK (dashed).
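The recursion (33) over the state trellis can be sketched as a small baseband Viterbi detector. This is a simplified illustration under several assumptions, not the receiver of Fig. 19: it works on the sampled complex envelope instead of matched filter outputs, uses the branch metric $\mathrm{Re}\sum r\,e^{-j\tilde{\varphi}}$ (the baseband counterpart of (34)), starts from phase state 0 with an all-zero correlative state, keeps full paths rather than a path memory of $N_T$ symbols, and is only exercised without noise. All function names are hypothetical.

```python
import cmath
import math


def q_rc(t, L, T=1.0):
    """Phase response q(t) of the length-L raised cosine pulse (assumed form)."""
    if t <= 0.0:
        return 0.0
    if t >= L * T:
        return 0.5
    return t / (2 * L * T) - math.sin(2 * math.pi * t / (L * T)) / (4 * math.pi)


def branch(theta_idx, window, h, p, L, ns):
    """Complex envelope samples for one interval; window = (a[n-L+1], ..., a[n])."""
    theta_n = 2 * math.pi * theta_idx / p
    out = []
    for k in range(ns):
        tau = (k + 0.5) / ns
        theta_t = 2 * math.pi * h * sum(
            a * q_rc(tau + (L - 1 - j), L) for j, a in enumerate(window))
        out.append(cmath.exp(1j * (theta_t + theta_n)))
    return out


def step(state, a, k, p):
    """State transition: the oldest symbol in the window moves into the phase state."""
    theta_idx, prev = state
    window = prev + (a,)
    return ((theta_idx + k * window[0]) % p, window[1:]), window


def modulate(symbols, k, p, L, ns=8):
    """Per-interval sample lists of the complex envelope for h = 2k/p."""
    h, state, out = 2 * k / p, (0, (0,) * (L - 1)), []
    for a in symbols:
        nxt, window = step(state, a, k, p)
        out.append(branch(state[0], window, h, p, L, ns))
        state = nxt
    return out


def viterbi(received, k, p, L, ns=8, alphabet=(-1, 1)):
    """Keep the best path into each state; metric accumulates as in (33)."""
    h = 2 * k / p
    survivors = {(0, (0,) * (L - 1)): (0.0, [])}
    for r in received:
        new = {}
        for state, (metric, path) in survivors.items():
            for a in alphabet:
                nxt, window = step(state, a, k, p)
                ref = branch(state[0], window, h, p, L, ns)
                z = sum((x * y.conjugate()).real for x, y in zip(r, ref))
                if nxt not in new or metric + z > new[nxt][0]:
                    new[nxt] = (metric + z, path + [a])
        survivors = new
    return max(survivors.values(), key=lambda mp: mp[0])[1]


if __name__ == "__main__":
    sent = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1]
    rx = modulate(sent, k=2, p=5, L=3)      # binary 3RC, h = 4/5
    print(viterbi(rx, k=2, p=5, L=3) == sent)
```

The number of survivors never exceeds $pM^{L-1}$, the state count given earlier, and each step evaluates M branches per survivor.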
VII. DISCUSSION AND CONCLUSIONS
In this paper classes of M-ary partial response CPM systems have been analyzed with respect to their minimum Euclidean distance, error probability, and spectral properties. Naturally it is desirable to have a system which yields both large values of the minimum Euclidean distance (small error probability) and a compact spectrum.
To start with binary systems, the RC schemes give lower values of the upper bound on the minimum Euclidean distance for low h-values and larger for large h-values. To have an RC system with main-lobe width L bit intervals with the same performance as MSK for large SNR, the modulation index must be chosen larger and larger as L is increased. For the 3RC system h must be 0.56 roughly, and for the 4RC system, about 0.60. Although the modulation index is slightly increased, the spectra for the two partial response systems are far more attractive than that for MSK. This is true also for the asymptotic spectral behavior. At $fT_b = 1$ the spectrum for 3RC with h = 0.56 is 23 dB lower than the MSK spectrum and the corresponding figure for 4RC with h = 0.6 is 33 dB. For the 5RC system, the modulation index must be h ≈ 0.63 to have the minimum Euclidean distance 2, i.e., the same as MSK. The spectral behavior of the system 5RC with h ≈ 0.63 is far more compact than for the MSK system and at $fT_b = 1$ there is a 40 dB difference in favor of the RC system. This is a clear trend: Using RC systems with longer frequency pulses and with the modulation index chosen to yield the same performance in terms of error probability for large SNR as MSK, spectrally more efficient systems are obtained. The number of observed symbol intervals must grow, however.
Binary RC systems can also be chosen which yield large gains in $E_b/N_0$. Take the 6RC with h = 1.28. With N = 30 observed bit intervals a gain of 4 dB in $E_b/N_0$ is obtained compared to MSK. When $|fT_b| \ge 0.8$ the spectrum for 6RC with h = 1.28 is below the MSK spectrum, and the asymptotic properties of the 6RC spectrum are of course more attractive.
If a smaller minimum Euclidean distance than two can be accepted, which is sometimes the case, still narrower band binary RC schemes can be considered. For example, the binary RC systems with minimum Euclidean distance 1/2 have a loss of 6 dB in $E_b/N_0$. Systems having this performance are 2RC, h ≈ 0.21; 3RC, h ≈ 0.25; 6RC, h ≈ 0.34. These systems are very narrow band.
The favorable minimum Euclidean distance versus bandwidth tradeoff pointed out for a few cases of binary partial response CPM is even more pronounced for systems using more levels than two. Quaternary RC systems having the same minimum Euclidean distance as MSK are 2RC, h ≈ 0.32, and 3RC, h ≈ 0.37. From [21], the quaternary systems give more compact spectra for the same distance. The octal system 2RC, h ≈ 0.25 and the hexadecimal system 2RC, h ≈ 0.21 yield the same minimum Euclidean distance as MSK. The modulation indices are very low and thus good spectral properties can be expected.
Large gains in terms of $E_b/N_0$ can be achieved especially for multilevel systems, and as an example the system M = 8, 3RC, h ≈ 1.38 is chosen. With N = 12 observed symbol intervals the minimum normalized squared Euclidean distance equals 9.03, and thus the gain in $E_b/N_0$ for large SNR is 6.5 dB.
Fig. 21. Bandwidth/power comparison between various partial response CPM systems. (Vertical axis: gain in dB relative to MSK. Horizontal axis: bandwidth $2BT_b$ at the -60 dB level. Markers distinguish M = 2, M = 4, and M = 8.)
A summary of results concerning the bandwidth-minimum Euclidean distance tradeoff is given in Fig. 21. The bandwidth has been calculated using the definition in Section V, i.e., double-sided bit rate normalized frequency where the double-sided power spectrum takes the value -60 dB. This figure shows the asymptotic gain in $E_b/N_0$ compared to MSK (and QPSK) versus the corresponding bandwidth for a number of M-ary partial response CPM schemes. For example, for the scheme M = 8, 3SRC a number of bandwidth and distance values are shown as individual points in the figure. To underline the fact that only h varies, but that the pulse shape and the number of levels are the same, these points are connected. The interaction between distance and bandwidth for a fixed scheme when h varies is clearly demonstrated. The figure also shows the clear improvement which is obtained by increasing L and/or M. Another trend shown in Fig. 21 is the connection between binary RC and SRC schemes for h = 1/2 and increasing L. Both bandwidth and distance change with L for fixed h. Plots like Fig. 21 can be drawn for alternative definitions of bandwidth, depending on the application. The plots will be different, but the relative positions of the different schemes will remain unchanged.
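The bandwidth figures above rest on the spectra of Section V, and the simulation procedure described there (four samples per symbol, 128 random symbols, squared magnitude of the DFT) can be sketched as follows. This is an illustrative reimplementation, not the simulation program of [17]: the raised cosine phase response, the fixed random seed, and all function names are assumptions, and a single 128-symbol periodogram is far too coarse to resolve a -60 dB level.

```python
import cmath
import math
import random


def q_rc(t, L, T=1.0):
    """Phase response q(t) of the length-L raised cosine pulse (assumed form)."""
    if t <= 0.0:
        return 0.0
    if t >= L * T:
        return 0.5
    return t / (2 * L * T) - math.sin(2 * math.pi * t / (L * T)) / (4 * math.pi)


def complex_envelope(symbols, h, L, ns=4):
    """Samples of exp(j*phi(t, alpha)), ns per symbol interval, T = 1."""
    out = []
    for n in range(len(symbols)):
        lo = max(0, n - L + 1)
        settled = math.pi * h * sum(symbols[:lo])  # symbols whose pulse has ended
        for k in range(ns):
            t = n + k / ns
            phi = settled + 2 * math.pi * h * sum(
                symbols[i] * q_rc(t - i, L) for i in range(lo, n + 1))
            out.append(cmath.exp(1j * phi))
    return out


def periodogram(x):
    """Squared magnitude of the DFT, divided by the number of samples."""
    n = len(x)
    w = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [abs(sum(x[m] * w[(k * m) % n] for m in range(n))) ** 2 / n
            for k in range(n)]


def in_band_fraction(psd, ns, limit=1.0):
    """Fraction of power at normalized frequencies |f*T| <= limit."""
    n = len(psd)
    inside = 0.0
    for k, pk in enumerate(psd):
        f = (k if k < n // 2 else k - n) * ns / n
        if abs(f) <= limit:
            inside += pk
    return inside / sum(psd)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    symbols = [rng.choice((-1, 1)) for _ in range(128)]
    psd = periodogram(complex_envelope(symbols, h=0.5, L=2))
    print("power within |f*T_b| <= 1:", in_band_fraction(psd, ns=4))
```

Averaging many such periodograms, or using longer sequences, would be needed before reading off a bandwidth at a fixed dB level.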
The main conclusion concerning the systems considered in this paper is that digital constant envelope modulation systems can be found which are both power and bandwidth efficient. Nonbinary (e.g., M = 4) systems are especially attractive. A specific system is achieved by specifying the number of levels M, the modulation index h, and the frequency pulse g(t). The modulation index should be rational for implementation reasons. The price that has to be paid for systems with optimum receivers is complexity.
REFERENCES
[1] A. Lender, "The duobinary technique for high speed data transmission," IEEE Trans. Commun. Electron., vol. COM-11, pp. 214-218, May 1963.
[2] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.
[3] A. Lender, "Correlative level coding for binary data transmission," IEEE Spectrum, vol. 3, pp. 104-115, Feb. 1966.
[4] R. W. Lucky, J. Salz, and E. J. Weldon, Principles of Data Communication. New York: McGraw-Hill, 1968.
[5] T. J. Baker, "Asymptotic behavior of digital FM spectra," IEEE Trans. Commun., vol. COM-22, pp. 1585-1594, Oct. 1974.
[6] W. C. Lindsey and M. K. Simon, Telecommunication Systems Engineering. Englewood Cliffs, NJ: Prentice-Hall, 1974.
[7] G. D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, pp. 268-278, Mar. 1973.
[8] P. Kabal and S. Pasupathy, "Partial response signaling," IEEE Trans. Commun., vol. COM-23, pp. 921-934, Sept. 1975.
[9] G. J. Garrison, "A power spectral density analysis for digital FM," IEEE Trans. Commun., vol. COM-23, pp. 1228-1243, Nov. 1975.
[10] J. B. Anderson and R. de Buda, "Better phase-modulation error performance using trellis phase codes," Electron. Lett., vol. 12, pp. 587-588, Oct. 1976.
[11] R. C. Davis, "An experimental 4-ary CPFSK modem for line-of-sight microwave digital data transmission," in EASCON Conf. Rec., Washington, DC, 1978, pp. 674-682.
[12] T. A. Schonhoff, H. E. Nichols, and H. M. Gibbons, "Use of the MLSE algorithm to demodulate CPFSK," in Proc. Int. Conf. Commun., Toronto, Canada, 1978, pp. 25.4.1-25.4.5.
[13] F. de Jager and C. B. Dekker, "Tamed frequency modulation, a novel method to achieve spectrum economy in digital transmission," IEEE Trans. Commun., vol. COM-26, pp. 534-542, May 1978.
[14] G. S. Deshpande and P. H. Wittke, "The spectrum of correlative encoded FSK," in Proc. Int. Conf. Commun., Toronto, Canada, 1978, pp. 25.3.1-25.3.5.
[15] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Bandwidth efficient digital FM with coherent phase tree demodulation," Telecommun. Theory, Univ. of Lund, Lund, Sweden, Tech. Rep. TR-102, May 1978.
[16] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Bandwidth efficient constant-envelope digital signalling with phase-tree demodulation," Electron. Lett., vol. 14, pp. 487-489, July 1978.
[17] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Further results on digital FM with coherent phase tree demodulation-minimum distance and spectrum," Telecommun. Theory, Univ. of Lund, Lund, Sweden, Tech. Rep. TR-119, Nov. 1978.
[18] T. Aulin and C-E. Sundberg, "Minimum distance properties of M-ary correlative encoded CPFSK," Telecommun. Theory, Univ. of Lund, Lund, Sweden, Tech. Rep. TR-120, Nov. 1978.
[19] N. Rydbeck and C-E. Sundberg, "Recent results on spectrally efficient constant envelope digital modulation methods," in Proc. IEEE Int. Conf. Commun., Boston, MA, 1979, pp. 42.1.1-42.1.6.
[20] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Bandwidth efficient digital FM with coherent phase tree demodulation," in Proc. IEEE Int. Conf. Commun., Boston, MA, 1979, pp. 42.4.1-42.4.6.
[21] T. Aulin, "CPM-A power and bandwidth efficient digital constant envelope modulation scheme," Ph.D. dissertation, Telecommun. Theory, Univ. of Lund, Lund, Sweden, Nov. 1979.
[22] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Performance of constant envelope M-ary digital FM-systems and their implementation," in Proc. Nat. Telecommun. Conf., Washington, DC, 1979, pp. 55.1.1-55.1.6.
[23] T. Aulin and C-E. Sundberg, "Continuous phase modulation-Part I: Full response signaling," this issue, pp. 196-209.
[24] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Transmitter and receiver structures for M-ary partial response FM. Synchronization considerations," Telecommun. Theory, Univ. of Lund, Lund, Sweden, Tech. Rep. TR-121, June 1979.
[25] R. R. Anderson and J. Salz, "Spectra of digital FM," Bell Syst. Tech. J., vol. 44, pp. 1165-1189, July-Aug. 1965.
[26] T. Aulin and C-E. Sundberg, "Digital FM spectra-Numerical calculations and asymptotic behaviour," Telecommun. Theory, Univ. of Lund, Lund, Sweden, Tech. Rep. TR-141, May 1980.
[27] T. Aulin, "Symbol error probability bounds for coherently Viterbi detected digital FM," Telecommun. Theory, Univ. of Lund, Lund, Sweden, Tech. Rep. TR-131, Oct. 1979.
[28] T. Aulin, N. Rydbeck, and C-E. Sundberg, "Transmitter and receiver structures for M-ary partial response FM," in Proc. 1980 Int. Zurich Seminar on Digital Commun., Mar. 4-6, 1980, pp. A2.1-A2.6.
Carrier and Bit Synchronization in Data Communication: A Tutorial Review

L. E. FRANKS, FELLOW, IEEE

Abstract: This paper examines the problems of carrier phase estimation and symbol timing estimation for carrier-type synchronous digital data signals, with tutorial objectives foremost. Carrier phase recovery for suppressed-carrier versions of double sideband (DSB), vestigial sideband (VSB), and quadrature amplitude modulation (QAM) signal formats is considered first. Then the problem of symbol timing recovery for a baseband pulse-amplitude modulation (PAM) signal is examined. Timing recovery circuits based on elementary statistical properties are discussed as well as timing recovery based on maximum-likelihood estimation theory. A relatively simple approach to evaluation of timing recovery circuit performance in terms of rms jitter of the timing parameters is presented.

Manuscript received June 28, 1979; revised March 26, 1980.
The author is with the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003.

I. INTRODUCTION

In digital data communication there is a hierarchy of synchronization problems to be considered. First, assuming that a carrier-type system is involved, there is the problem of carrier synchronization, which concerns the generation of a reference carrier with a phase closely matching that of the data signal. This reference carrier is used at the data receiver to perform a coherent demodulation operation, creating a baseband data signal. Next comes the problem of synchronizing a receiver clock with the baseband data-symbol sequence. This is commonly called bit synchronization, even when the symbol alphabet happens not to be binary.

Depending on the type of system under consideration, problems of word-, frame-, and packet-synchronization will be encountered further down the hierarchy. A feature that distinguishes the latter problems from those of carrier and bit synchronization is that they are usually solved by means of special design of the message format, involving the repetitive insertion of bits or words into the data sequence solely for synchronization purposes. On the other hand, it is desirable that carrier and bit synchronization be effected without multiplexing special timing signals onto the data signal, which would use up a portion of the available channel capacity. Only timing recovery problems of this type are discussed in this paper. This excludes those systems wherein the transmitted signal contains an unmodulated component of sinusoidal carrier (such as with "on-off" keying). When an unmodulated component or pilot is present, the standard approach to carrier synchronization is to use a phase-locked loop (PLL) which locks onto the carrier component, and has a narrow enough loop bandwidth so as not to be excessively perturbed by the sideband components of the signal. There is a vast literature on the performance and design of the PLL and there are several textbooks dealing with synchronous communication systems which treat the PLL in great detail [1]-[5]. Although we consider only suppressed-carrier signal formats here, the PLL material is still relevant since these devices are often used as component parts of the overall phase recovery system.

For modulation formats which exhibit a high bandwidth efficiency, i.e., which have a large "bits per cycle" figure of merit, we find the accuracy requirements on carrier and bit synchronization increasingly severe. Unfortunately, it is also in these high-efficiency systems that we find it most difficult to extract accurate carrier phase and symbol timing information by means of simple operations performed on the received signal. The pressure to develop higher efficiency data transmission has led to a dramatically increased interest in timing recovery problems and, in particular, in the ultimate performance that can be achieved with optimal recovery schemes.

We begin our review of carrier synchronization problems with a brief discussion of the major types of modulation format. In each case (DSB, VSB, or QAM), we assume coherent demodulation whereby the received signal is multiplied by a locally generated reference carrier and the product is passed through a low-pass filter. We can get some idea of the phase accuracy, or degree of coherency, requirements for the various modulation formats by examining the expressions for the coherent detector output, assuming a noise-free input. Let us assume that the message signal, say $a(t)$, is incorporated by the modulation scheme into the complex envelope $\beta(t)$ of the carrier signal¹
$$ y(t) = \operatorname{Re}\left[\beta(t) \exp(j\theta) \exp(j2\pi f_0 t)\right] \tag{1} $$

and the reference carrier $r(t)$ is characterized by a constant complex envelope

$$ r(t) = \operatorname{Re}\left[\exp(j\hat{\theta}) \exp(j2\pi f_0 t)\right]. \tag{2} $$

From (A-8), the output of the coherent detector is

$$ z_1(t) = \tfrac{1}{2} \operatorname{Re}\left[\beta(t) \exp(j\theta - j\hat{\theta})\right]. \tag{3} $$

For the case of DSB modulation, we have $\beta(t) = a(t) + j0$, so $z_1(t)$ is simply proportional to $a(t)$. The phase error $\hat{\theta} - \theta$ in the reference carrier has only a second-order effect on detector performance. The only loss is that phase error causes a reduction, proportional to $\cos^2(\hat{\theta} - \theta)$, in signal-to-noise ratio at the detector output when additive noise is present on the received signal.

¹ See the Appendix for definitions and basic relations concerning complex envelope representation of signals.

Reprinted from IEEE Transactions on Communications, vol. COM-28, no. 8, August 1980.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

For VSB modulation, however, phase error produces a more severe distortion. In this case $\beta(t) = a(t) + j\tilde{a}(t)$, where $\tilde{a}(t)$ is related to $a(t)$ by a time-invariant filtering operation which causes a cancellation of a major portion of one of the sidebands. In the limiting case of complete cancellation of a sideband (SSB), we have $\tilde{a}(t) = \hat{a}(t)$, the Hilbert transform of $a(t)$ [6]. The coherent detector output (3) for the VSB signal is

$$ z_1(t) = \tfrac{1}{2} a(t) \cos(\theta - \hat{\theta}) - \tfrac{1}{2} \tilde{a}(t) \sin(\theta - \hat{\theta}) \tag{4} $$
and the second term in (4) introduces an interference called quadrature distortion when $\hat{\theta} \neq \theta$. As $\tilde{a}(t)$ has roughly the same power level as $a(t)$, a relatively small phase error must be maintained for low distortion, e.g., about 0.032 radian error for a 30 dB signal-to-distortion ratio.

In the QAM case, two superimposed DSB signals at the same carrier frequency are employed by making $\beta(t) = a(t) + jb(t)$, where $a(t)$ and $b(t)$ are two separate, possibly independent, message signals. A dual coherent detector, using a reference carrier and its $\pi/2$ phase-shifted version, separates the received signal into its in-phase (I) and quadrature (Q) components. Again considering only the noise-free case, these components are

$$
\begin{aligned}
c_I(t) &= \tfrac{1}{2} a(t) \cos(\theta - \hat{\theta}) - \tfrac{1}{2} b(t) \sin(\theta - \hat{\theta}) \\
c_Q(t) &= \tfrac{1}{2} b(t) \cos(\theta - \hat{\theta}) + \tfrac{1}{2} a(t) \sin(\theta - \hat{\theta}).
\end{aligned}
\tag{5}
$$

From (5) it is clear that $\hat{\theta} \neq \theta$ introduces a crosstalk interference into the I and Q channels. As $a(t)$ and $b(t)$ can be expected to be at similar power levels, the phase accuracy requirements for QAM are high compared to straight DSB modulation. From the previous discussion we see that the price for the approximate doubling of bandwidth efficiency in VSB or QAM, relative to DSB, is a greatly increased sensitivity to phase error. The problem is compounded by the fact that carrier phase recovery is much more difficult for VSB and QAM, compared to DSB.
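The sensitivity figures quoted in this section are easy to check numerically. The following sketch assumes that the wanted and the interfering quadrature messages have equal power, so that by (4) and (5) the signal-to-distortion ratio is the squared cotangent of the phase error. The function names are illustrative and do not come from the paper.

```python
import math


def dsb_snr_loss_db(phase_error_rad):
    """SNR loss for DSB: the detected amplitude scales by cos(phase error),
    so the signal power (and hence the SNR) scales by cos^2."""
    return -20.0 * math.log10(abs(math.cos(phase_error_rad)))


def quadrature_sdr_db(phase_error_rad):
    """Signal-to-distortion ratio for VSB or QAM, assuming (hypothetically)
    equal power in the wanted message and the interfering quadrature message."""
    wanted = math.cos(phase_error_rad)
    interference = math.sin(phase_error_rad)
    return 20.0 * math.log10(abs(wanted / interference))


if __name__ == "__main__":
    phi = 0.032  # radians, the example value quoted in the text
    print(f"DSB SNR loss: {dsb_snr_loss_db(phi):.4f} dB")
    print(f"VSB/QAM SDR : {quadrature_sdr_db(phi):.1f} dB")
```

Under these assumptions a 0.032 rad error costs a DSB system a negligible fraction of a decibel, while the same error limits a VSB or QAM system to roughly 30 dB signal-to-distortion ratio.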
II. CARRIER PHASE RECOVERY

Before examining specific carrier recovery circuits for the suppressed-carrier format, it is helpful to ask, "What properties must the carrier signal $y(t)$ possess in order that operations on $y(t)$ will produce a good estimate of the phase parameter $\theta$?" A general answer to this question lies in the cyclostationary nature of the $y(t)$ process.² A cyclostationary process has statistical moments which are periodic in time, rather than constant as in the case of stationary processes [2], [6], [7]. To a large extent, synchronization capability can be characterized by the lowest-order moments of the process, such as the mean and autocorrelation. The $y(t)$ process is said to be cyclostationary in the wide sense if $E[y(t)]$ and $k_{yy}(t + \tau, t) = E[y(t + \tau) y(t)]$ are both periodic functions of $t$. A process modeled by (1) is typically cyclostationary with a period of $1/f_0$ or $1/2f_0$. The statistical moments of this process depend upon the value of the phase parameter $\theta$ and it is not surprising that efficient phase estimation procedures are similar to moment estimation procedures. It is important to note here that we are regarding $\theta$ as an unknown but nonrandom parameter. If instead we regarded $\theta$ as a random parameter uniformly distributed over a $2\pi$ interval, then the $y(t)$ process would typically be stationary, not cyclostationary.

² In [2], these processes are called periodic nonstationary.

A general property of cyclostationary processes is that there may be a correlation between components in different frequency bands, in contrast to the situation for stationary processes [8]. For carrier-type signals, the significance lies in the correlation between message components centered around the carrier frequency ($+f_0$) and the image components around ($-f_0$). This correlation is characterized by the cross-correlation function $k_{\beta\beta^*}(\tau) = E[\beta(t + \tau)\beta(t)]$ for a $y(t)$ process as in (1) when $\beta(t)$ is a stationary process.³ Considering first the DSB case with $\beta(t) = a(t) + j0$, and using (A-10), we have

$$ k_{yy}(t + \tau, t) = \tfrac{1}{2} \operatorname{Re}\left[k_{aa}(\tau) \exp(j2\pi f_0 \tau)\right] + \tfrac{1}{2} \operatorname{Re}\left[k_{aa}(\tau) \exp(j4\pi f_0 t + j2\pi f_0 \tau + j2\theta)\right] \tag{6} $$
where the second term in (6) exhibits the periodicity in $t$ that makes $y(t)$ a cyclostationary process.

We are assuming that $y(t)$ contains no periodic components. Consider what happens, however, when $y(t)$ is passed through a square-law device. We see immediately from (6) that the output of the squarer has a periodic mean value, since

$$ E[y^2(t)] = k_{yy}(t, t) = \tfrac{1}{2} k_{aa}(0) + \tfrac{1}{2} k_{aa}(0) \operatorname{Re}\left[\exp(j2\theta + j4\pi f_0 t)\right]. \tag{7} $$

If the squarer output is passed through a bandpass filter with transfer function $H(f)$ as shown in Fig. 1, and if $H(f)$ has a unity-gain passband in the vicinity of $f = 2f_0$, then the mean value of the filter output is a sinusoid with frequency $2f_0$, phase $2\theta$, and amplitude $\tfrac{1}{2}E[a^2(t)]$. In this sense, the squarer has produced a periodic component from the $y(t)$ signal. It is often stated that the effect of the squarer is to produce a discrete component (a line at $2f_0$) in the spectrum of its output signal. This statement lacks precision and can lead to serious misinterpretations because $y^2(t)$ is not a stationary process, so the usual spectral density concept has no meaning. A stationary process can be derived from $y^2(t)$ by phase randomizing [6], but then the relevance to carrier phase recovery is lost because the discrete component has a completely indeterminate phase.

³ Despite its appearance, this is not an autocorrelation function, due to the definition of autocorrelation for complex processes; see (A-11).
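The way a squarer exposes the carrier phase can be illustrated with a short simulation. This is only a sketch under stated assumptions: the message is a rectangular four-level sequence, the carrier completes a whole number of cycles per symbol, and the bandpass filter at $2f_0$ is replaced by a plain correlation against $\exp(-j4\pi f_0 t)$. The names `make_dsb` and `squarer_phase_estimate` and all parameter values are hypothetical.

```python
import cmath
import math
import random


def make_dsb(theta, f0, fs, n_symbols, samples_per_symbol, seed=0):
    """Sampled DSB signal y(t) = a(t) cos(2 pi f0 t + theta) with a
    rectangular four-level message (an arbitrary choice for this sketch)."""
    rng = random.Random(seed)
    samples = []
    for k in range(n_symbols):
        a = rng.choice((-3.0, -1.0, 1.0, 3.0))
        for i in range(samples_per_symbol):
            n = k * samples_per_symbol + i
            samples.append(a * math.cos(2.0 * math.pi * f0 * n / fs + theta))
    return samples


def squarer_phase_estimate(samples, f0, fs):
    """Square the signal, correlate with exp(-j 4 pi f0 t) (a crude stand-in
    for the bandpass filter at 2 f0) and halve the resulting angle.
    The estimate is only defined modulo pi."""
    acc = 0j
    for n, y in enumerate(samples):
        acc += (y * y) * cmath.exp(-1j * 4.0 * math.pi * f0 * n / fs)
    return 0.5 * cmath.phase(acc)


if __name__ == "__main__":
    y = make_dsb(theta=0.7, f0=1000.0, fs=16000.0,
                 n_symbols=200, samples_per_symbol=16)
    print(squarer_phase_estimate(y, 1000.0, 16000.0))
```

Although the signal itself has zero mean and no component at $f_0$, the squared signal correlates strongly at $2f_0$ with angle $2\theta$. Shifting $\theta$ by $\pi$ leaves the estimate unchanged, which is the polarity ambiguity inherent in this kind of recovery.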
Fifty Years of Communications and Networking

Fig. 1. Timing recovery circuit (squarer followed by a bandpass filter BPF($2f_0$), producing the timing wave).
The output of the bandpass filter in Fig. 1 can be used directly to generate a reference carrier. Assuming that $H(f)$ completely suppresses the low-frequency terms [see (A-8)], the filter output is the reference waveform

$$ w(t) = \tfrac{1}{2} \operatorname{Re}\left\{[\omega \otimes \beta^2](t) \exp(j2\theta) \exp(j4\pi f_0 t)\right\} \tag{8} $$
where the convolution product $[\omega \otimes \beta^2]$ represents the filtering action of $H(f)$ in terms of its low-pass equivalent $\Omega(f)$ in (A-5). For the DSB case, $\beta^2(t) = a^2(t)$ is real and $\omega(t)$ is real⁴ if $H(f)$ has a symmetric response about $2f_0$. Then the phase of the reference waveform is $2\theta$ and the amplitude of the reference waveform fluctuates slowly [depending on the bandwidth of $H(f)$]. The reference carrier can be obtained by passing $w(t)$ through an infinite-gain clipper which removes the amplitude fluctuations. The square wave from the clipper can drive a frequency divider circuit which halves the frequency and phase. Alternatively, the bandpass filter output can be tracked by a PLL and the PLL oscillator output passed through the frequency-divider circuit.

There is another tracking loop arrangement, called the Costas loop, where the voltage-controlled oscillator (VCO) operates directly at $f_0$. We digress momentarily to describe the Costas loop and to point out that it is equivalent to the squarer followed by a PLL [1]-[3]. The equivalence is established by noting that the inputs to the loop filters in the two configurations shown in Fig. 2 are identical. In the PLL quiescent lock condition, the VCO output is in quadrature with the input signal, so we introduce a $\pi/2$ phase shift into the VCO in the configurations of Fig. 2. Then using (A-8) to get the output of the multiplier/low-pass filter combinations, we see that the input to the loop filter is

$$ v(t) = \tfrac{1}{8} \operatorname{Re}\left[A^2 \beta^2(t) \exp(j2\theta - j2\hat{\theta} - j\pi/2)\right] \tag{9} $$

in both configurations if the amplitude of the VCO output is taken as $A^2$ in the squarer/PLL configuration, and taken as $A$ in the Costas loop.

Going back to (8), we see that phase recovery is perfect if $[\omega \otimes \beta^2]$ is real. Assuming $\omega(t)$ real, a phase error will result only if a quadrature component [relative to $\beta^2(t)$] appears at the output of the squarer. This points out the error, from a different viewpoint, of using the phase randomized spectrum of the squarer output to analyze the phase recovery performance, because the spectrum approach obliterates the distinction between I and Q components. For the DSB case, a quadrature component will appear at the squarer output only if there is a quadrature component of interference added to the input signal $y(t)$. We can demonstrate this effect by considering the input signal to be $z(t) = y(t) + n(t)$ where $n(t)$ is white noise with a double-sided spectral density of $N_0$ W/Hz. We can represent $n(t)$ by the complex envelope $[u_I(t) + ju_Q(t)] \exp(j\theta)$ where, from (A-15), the I and Q noise components relative to a phase $\theta$ are uncorrelated and have a spectral density of $2N_0$. The resulting phase of the reference waveform (8) is

$$ 2\hat{\theta} = 2\theta + \tan^{-1}\left[\frac{2\,\omega \otimes (\beta u_Q + u_I u_Q)}{\omega \otimes (\beta^2 + 2\beta u_I + u_I^2 - u_Q^2)}\right]. \tag{10} $$

⁴ A real $\omega(t)$ corresponds to the case where the cross-coupling paths between input and output I and Q components in Fig. 10 are absent. If the bandpass function $H(f)$ does not exhibit the symmetrical amplitude response and antisymmetrical phase response about $2f_0$ for a real $\omega(t)$, then there simply is a fixed phase offset introduced by the bandpass filter.

Fig. 2. Carrier phase tracking loops. (a) Squarer/PLL. (b) Costas loop.
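The behavior of the Costas loop of Fig. 2(b) can be sketched at complex baseband. The sketch below is not the circuit of the paper: it assumes a noise-free input, a binary $\pm 1$ message, ideal arm filters, and a loop filter that is a pure gain (a first-order loop). The function name and the gain value are hypothetical.

```python
import math
import random


def costas_loop(theta, theta_hat_start, gain, n_steps, seed=0):
    """Noise-free first-order Costas-type loop at baseband.

    Each step forms the in-phase and quadrature arm outputs for a random
    +/-1 message sample and multiplies them. The product equals
    0.5 * sin(2 * (theta - theta_hat)), so the message sign cancels and the
    loop is driven by twice the phase error, as with a squarer plus PLL.
    """
    rng = random.Random(seed)
    theta_hat = theta_hat_start
    for _ in range(n_steps):
        a = rng.choice((-1.0, 1.0))                 # message sample
        i_arm = a * math.cos(theta - theta_hat)     # in-phase arm output
        q_arm = a * math.sin(theta - theta_hat)     # quadrature arm output
        theta_hat += gain * i_arm * q_arm           # VCO phase update
    return theta_hat


if __name__ == "__main__":
    print(costas_loop(theta=1.0, theta_hat_start=0.0, gain=0.2, n_steps=200))
    print(costas_loop(theta=1.0, theta_hat_start=3.5, gain=0.2, n_steps=200))
```

Started near the true phase, the loop settles on $\theta$; started more than a quarter cycle away, it settles on $\theta + \pi$. Both are stable lock points because the error signal depends only on $2(\theta - \hat{\theta})$.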
We can approximate the phase error $\varphi \equiv \hat{\theta} - \theta$ (also called phase jitter because $\hat{\theta}$ is a quantity that fluctuates with time) by neglecting the noise × noise term in the numerator and both signal × noise and noise × noise terms in the denominator in (10). Furthermore, we replace $\omega \otimes \beta^2$ by its expected value (averaging over the message process) and use the $\tan^{-1} x \approx x$ approximation. With all these simplifications, which are valid at sufficiently high signal-to-noise ratio and with sufficiently narrow-band $H(f)$, it is easy to derive an expression for the variance of the phase jitter:

$$ \operatorname{var} \varphi = \frac{2N_0 B}{S} \tag{11a} $$

$$ \phantom{\operatorname{var} \varphi} = \left(\frac{B}{W}\right)\left(\frac{S}{N}\right)^{-1} \tag{11b} $$

where

$$ B \triangleq \int_{-\infty}^{\infty} |\Omega(f)|^2 \, df = \int_{0}^{\infty} |H(f)|^2 \, df $$

is the noise bandwidth of the bandpass filter, recalling that we have set $\Omega(0) = 1$. The message signal power is $S = E[a^2(t)]$ and for the second version of the jitter formula (11b) we have assumed a signal bandwidth of $W$ Hz and have defined a noise power over this band of $N = 2N_0 W$. This allows the satisfying physical interpretation of jitter variance being inversely proportional to signal-to-noise ratio and directly proportional to the bandwidth ratio of the phase recovery circuit and the message signal. For the smaller signal-to-noise ratios, the accuracy and convenience of the expression can be maintained by incorporating a correction factor known as the squaring loss [3].
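A small worked example makes the scaling concrete. The sketch assumes the high signal-to-noise form of the jitter variance that follows from the approximations described above, namely $2N_0B/S$, or equivalently $(B/W)/(S/N)$ with $N = 2N_0W$. The squaring loss is ignored, and the function names and numerical values are illustrative only.

```python
import math


def jitter_variance(n0, noise_bandwidth_hz, signal_power):
    """Phase jitter variance in rad^2, assuming var = 2 * N0 * B / S."""
    return 2.0 * n0 * noise_bandwidth_hz / signal_power


def jitter_variance_from_snr(snr_db, bandwidth_ratio):
    """Same quantity written as (B / W) / (S / N), with S/N given in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return bandwidth_ratio / snr_linear


if __name__ == "__main__":
    w_hz = 2400.0            # assumed message bandwidth W
    b_hz = 24.0              # assumed recovery-filter noise bandwidth B
    s = 1.0                  # assumed message power S
    n0 = 0.01 / (2.0 * w_hz) # chosen so that N = 2*N0*W gives S/N = 20 dB
    var_a = jitter_variance(n0, b_hz, s)
    var_b = jitter_variance_from_snr(20.0, b_hz / w_hz)
    print(var_a, var_b, math.sqrt(var_a))
```

With a 20 dB signal-to-noise ratio and a recovery bandwidth one hundredth of the message bandwidth, both forms give a variance of about $10^{-4}$ rad², an rms jitter of about 0.01 rad.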
When the signal itself carries a significant quadrature component, as in the case of the VSB signal, there will be a quadrature component at the squarer output that interferes with the phase recovery operation even at high signal-to-noise ratios. Let us suppose that the VSB signal is obtained by filtering a DSB signal with a bandpass filter with a real transfer function (no phase shift) and with a cutoff in the vicinity of $f_0$. The resulting quadrature component for the VSB signal is $\tilde{a}(t) = [p_Q \otimes a](t)$ and $p_Q(t)$ is derived from the low-pass equivalent transfer function for the bandpass filter in accordance with (A-7). The real transfer function condition makes $p_Q(t)$ an odd function of time, which also makes the cross-correlation function for $a(t)$ and $\tilde{a}(t)$ an odd function. The result is that, for $\beta(t) = a(t) + j\tilde{a}(t)$, the autocorrelation for the VSB signal is

$$
\begin{aligned}
k_{yy}(t + \tau, t) = {} & \tfrac{1}{2} \operatorname{Re}\left[\{k_{aa}(\tau) + k_{\tilde{a}\tilde{a}}(\tau) + j2k_{\tilde{a}a}(\tau)\} \exp(j2\pi f_0 \tau)\right] \\
& + \tfrac{1}{2} \operatorname{Re}\left[\{k_{aa}(\tau) - k_{\tilde{a}\tilde{a}}(\tau)\} \exp(j4\pi f_0 t + j2\pi f_0 \tau + j2\theta)\right].
\end{aligned}
\tag{12}
$$
Comparing (12) with (6), we see that the second, cyclostationary, term is much smaller for the VSB case than the DSB case since the autocorrelation functions for $a(t)$ and $\tilde{a}(t)$ differ only to the extent that some of the low-frequency components in $a(t)$ are missing because of the VSB rolloff characteristic. Although the jitter performance will be poorer, the phase recovery circuit in Fig. 1 can still be used since the mean value of the reference waveform is a sinusoid exhibiting the desired phase, but with an amplitude which is proportional to the difference in power levels in $a(t)$ and $\tilde{a}(t)$.

$$ E[w(t)] = \tfrac{1}{2}\left[k_{aa}(0) - k_{\tilde{a}\tilde{a}}(0)\right] \operatorname{Re}\left[\exp(j4\pi f_0 t + j2\theta)\right]. \tag{13} $$

However, it is not possible to get a very simple formula for the variance of phase jitter, as in (11), because the power spectral density of the quadrature component of $\beta^2(t)$, which is proportional to $a(t)\tilde{a}(t)$, vanishes at $f = 0$, unlike in the additive noise case. An accurate variance expression must take into account the particular shape of the filtering function as well as the shape of the VSB rolloff characteristic.

Our examination of phase recovery for DSB (with additive noise) and VSB modulation formats has indicated that rms phase jitter can be made as small as desired by making the width of $\Omega(f)$ sufficiently small. The corresponding parameter in the case of the tracking loop configuration is called the loop bandwidth [3]. These results, however, are for steady-state phase jitter since the signals at the receiver input were presumed to extend into the remote past. The difficulty with a very narrow phase recovery bandwidth is that excessive time is taken to get to the steady-state condition when a new signal process begins. This time interval is referred to as the acquisition time of the recovery circuit and in switched communication networks or polling systems it is usually very important to keep this interval small, even at the expense of the larger steady-state phase jitter. One way to accommodate the conflicting objectives in designing a carrier recovery circuit is to specify a minimum phase-recovery bandwidth and then adjust other parameters of the system to minimize the steady-state phase jitter.

Another problem with a very narrow-band bandpass filter is in the inherent mistuning sensitivity, where mistuning is a result of inaccuracies in filter element values or a result of small inaccuracies or drift in the carrier frequency. This problem is avoided with tracking loop configurations since they lock onto the carrier frequency. On the other hand, tracking loops have some problems also, one of the more serious being the "hangup" problem [9] whereby the nonlinear nature of the loop can produce some greatly prolonged acquisition times.

Although we have modeled the phase recovery problem in terms of a constant unknown carrier phase, it may be important in some situations to consider the presence of fairly rapid fluctuations in carrier phase (independent of the message process). Such fluctuations are often called phase noise and if the spectral density of these fluctuations has a greater bandwidth than that of the phase recovery circuits, there is a phase error due to the inability to track the carrier phase. Phase error of this type, even in steady state, becomes larger as the bandwidth of the recovery circuits decreases.

Another practical consideration is a $\pi$-radian phase ambiguity in the phase recovery circuits we have been discussing. The result is a polarity ambiguity in the coherently demodulated signal. In many cases this polarity ambiguity is unimportant, but otherwise some a priori knowledge about the message signal will have to be used to resolve the ambiguity.

For a QAM signal with $\beta(t) = a(t) + jb(t)$, where $a(t)$ and $b(t)$ are independent zero-mean stationary processes, we get

$$
\begin{aligned}
k_{yy}(t + \tau, t) = {} & \tfrac{1}{2} \operatorname{Re}\left[\{k_{aa}(\tau) + k_{bb}(\tau)\} \exp(j2\pi f_0 \tau)\right] \\
& + \tfrac{1}{2} \operatorname{Re}\left[\{k_{aa}(\tau) - k_{bb}(\tau)\} \exp(j4\pi f_0 t + j2\pi f_0 \tau + j2\theta)\right]
\end{aligned}
\tag{14}
$$

and the situation is very similar to the VSB case (12). In this case where $a(t)$ and $b(t)$ are uncorrelated, the mean reference waveform has the correct phase, but the amplitude vanishes if the power levels in the I and Q channels are the same.

$$ E[w(t)] = \tfrac{1}{2}\left[k_{aa}(0) - k_{bb}(0)\right] \operatorname{Re}\left[\exp(j4\pi f_0 t + j2\theta)\right]. \tag{15} $$
Hence, unless the QAM format is intentionally unbalanced, the squaring approach in Fig. 1 does not work.

We briefly examine what happens when the squarer is replaced by a fourth-power device in the recovery schemes we have been considering. From (1), we can obtain

$$
\begin{aligned}
y^4(t) = {} & \tfrac{1}{8} \operatorname{Re}\left[\beta^4(t) \exp(j8\pi f_0 t + j4\theta)\right] \\
& + \tfrac{1}{2} \operatorname{Re}\left[|\beta(t)|^2 \beta^2(t) \exp(j4\pi f_0 t + j2\theta)\right] + \tfrac{3}{8} |\beta(t)|^4.
\end{aligned}
\tag{16}
$$
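The coefficients in the expansion of $y^4(t)$ can be checked with a short calculation. Writing $z(t) = \beta(t)\exp(j\theta)\exp(j2\pi f_0 t)$, so that $y = \tfrac{1}{2}(z + z^*)$ from (1), the binomial expansion gives

$$
\begin{aligned}
y^4 &= \tfrac{1}{16}\left(z^4 + 4z^3 z^* + 6z^2 z^{*2} + 4z z^{*3} + z^{*4}\right) \\
    &= \tfrac{1}{8}\operatorname{Re}\left[z^4\right] + \tfrac{1}{2}\operatorname{Re}\left[|z|^2 z^2\right] + \tfrac{3}{8}|z|^4 .
\end{aligned}
$$

Since $|z| = |\beta|$, the three terms are, in order, a component at $4f_0$ with phase $4\theta$, a component at $2f_0$ with phase $2\theta$, and a baseband term.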
Now if we use a bandpass filter tuned to $4f_0$ which passes only the first term in (16), then the mean reference waveform at the filter output is

$$ E[w(t)] = \tfrac{1}{4} \operatorname{Re}\left[\left\{\overline{a^4} - 3\left(\overline{a^2}\right)^2\right\} \exp(j8\pi f_0 t + j4\theta)\right] \tag{17} $$

still assuming independent $a(t)$ and $b(t)$ and a balanced QAM format, i.e., $k_{aa}(0) = k_{bb}(0) = \overline{a^2}$. Hence, a mean reference waveform exists even in the balanced QAM case if a fourth-power device is used.⁵

One very popular QAM format is quadriphase-shift keying (QPSK), where the standard carrier recovery technique is to use a fourth-power device followed by a PLL or to use an equivalent "double" Costas loop configuration [3]. The QPSK format, with independent data symbols, can be regarded as two independent binary phase-shift-keyed (BPSK) signals in phase quadrature. In a nonbandlimited situation each BPSK signal can be regarded as DSB-AM where the message waveform has a rectangular shape characterized by $a(t) = \pm 1$. In this case, the complex envelope of the QPSK signal is characterized by $\beta(t) = (\pm 1 \pm j)/\sqrt{2}$ or $\beta(t) = \exp(j(\pi/4) + j(\pi/2)k)$ with $k = 0, 1, 2,$ or $3$. The result is that $\beta^4(t) = -1$ and the $4f_0$ component in (16) is a pure sinusoid with no fluctuations in either phase or amplitude. For PSK systems with a larger alphabet of phase positions, the result of (17) cannot generally be used as the I and Q components are no longer independent. Analysis of the larger alphabet cases shows that higher-order nonlinearities are required for successful phase recovery [3], [10].

For any balanced QAM format, such as QPSK, the phase recovery circuits discussed here give a $\pi/2$-radian phase ambiguity. This problem is often handled by use of a differential PSK scheme, whereby the information is transmitted as a sequence of phase changes rather than absolute values of phase.
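The QPSK case lends itself to a direct numerical check. The sketch below works on complex-envelope samples rather than on the passband signal, and it replaces the $4f_0$ bandpass filter or PLL with a plain average of the fourth power. Both are simplifying assumptions, and the function names are hypothetical.

```python
import cmath
import math
import random


def qpsk_symbols():
    """The four complex-envelope values exp(j(pi/4 + k*pi/2)), k = 0..3."""
    return [cmath.exp(1j * (math.pi / 4.0 + k * math.pi / 2.0)) for k in range(4)]


def received_envelope(theta, n_symbols, seed=0):
    """Random QPSK symbols rotated by the unknown carrier phase theta."""
    rng = random.Random(seed)
    symbols = qpsk_symbols()
    return [rng.choice(symbols) * cmath.exp(1j * theta) for _ in range(n_symbols)]


def fourth_power_phase_estimate(envelope_samples):
    """Sum z^4 over the received samples z = beta * exp(j*theta).
    Because beta^4 = -1 for every QPSK symbol, the sum is proportional to
    -exp(j*4*theta), so theta is recovered modulo pi/2."""
    acc = sum(z ** 4 for z in envelope_samples)
    return cmath.phase(-acc) / 4.0


if __name__ == "__main__":
    print([s ** 4 for s in qpsk_symbols()])
    print(fourth_power_phase_estimate(received_envelope(0.3, 100)))
    print(fourth_power_phase_estimate(received_envelope(0.3 + math.pi / 2.0, 100)))
```

Every symbol raised to the fourth power gives the same value, so the data modulation is removed entirely. A carrier phase offset by $\pi/2$ yields the same estimate, which is the ambiguity that differential encoding is used to sidestep.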
⁵ Unless $a(t)$ and $b(t)$ are Gaussian processes, for then $\overline{a^4} = 3\left(\overline{a^2}\right)^2$.

III. PAM TIMING RECOVERY

The receiver synchronization problem in baseband PAM transmission is to find the correct sampling instants for extracting a sequence of numerical values from the received signal. For a synchronous pulse sequence with a pulse rate of $1/T$, the sampler operates synchronously at the same rate and the problem is to determine the correct sampling phase within a $T$-second interval. The model for the baseband PAM signal is

$$ x(t) = \sum_{k=-\infty}^{\infty} a_k \, g(t - kT - \tau) \tag{18} $$

where $\{a_k\}$ is the message sequence and $g(t)$ is the signaling pulse. We want to make an accurate determination of $\tau$ from operations performed on $x(t)$. We assume that $g(t)$ is so defined that the best sampling instants are at $t = kT + \tau$; $k = 0, \pm 1, \pm 2, \ldots$. The objective is to recover a close replica of the message sequence $\{a_k\}$ in terms of the sequence $\{\hat{a}_k = x(kT + \hat{\tau})\}$, assuming a normalization of $g(0) = 1$. In the noise-free case, the difference between $a_k$ and $\hat{a}_k$ is due to intersymbol interference, which can be minimized by proper shaping of the data pulse $g(t)$. With perfect timing ($\hat{\tau} = \tau$), the intersymbol interference is

$$ \hat{a}_k - a_k = \sum_{n \neq k} a_n \, g(kT - nT) \tag{19} $$

and this term can be made to vanish for pulses satisfying the Nyquist criterion, i.e., $g(nT) = 0$ for $n \neq 0$. For bandlimited Nyquist pulses, the intersymbol interference will not be zero when $\hat{\tau} \neq \tau$, and if the bandwidth is not significantly greater than the Nyquist bandwidth ($1/2T$) the intersymbol interference can be quite severe even for small values of timing error. The problem is especially acute for multilevel (nonbinary) data sequences where timing accuracy of only a few percent of the symbol period is often required.

Symbol timing recovery is remarkably similar in most respects to carrier phase recovery and we find that similar signal processing will yield suitable estimates of the parameter $\tau$. In the discussion to follow, we assume that $\{a_k\}$ is a zero-mean stationary sequence with independent elements. The resulting PAM signal (18) is a zero-mean cyclostationary process, although there are no periodic components present [6]. The square of the PAM signal does, however, possess a periodic mean value.

$$ E[x^2(t)] = \overline{a^2} \sum_{k} g^2(t - kT - \tau). \tag{20} $$

Using the Poisson Sum Formula [6], we can express (20) in the more convenient form of a Fourier series whose coefficients are given by the Fourier transform of $g^2(t)$.

$$ E[x^2(t)] = \frac{\overline{a^2}}{T} \sum_{l} A_l \exp\left(\frac{j2\pi l}{T}(t - \tau)\right) \tag{21} $$

where

$$ A_l = \int_{-\infty}^{\infty} G\left(\frac{l}{T} - f\right) G(f) \, df. $$

For high bandwidth efficiency, we are often concerned with data pulses whose bandwidth is at most equal to twice the Nyquist bandwidth. Then $|G(f)| = 0$ for $|f| \gt 1/T$ and there are only three nonzero terms ($l = 0, \pm 1$) in (21).

This result suggests the use of a timing recovery circuit of the same form as shown in Fig. 1, where now the bandpass filter is tuned to the symbol rate, $1/T$. Alternate zero crossings of $w(t)$, a timing wave analogous to the reference waveform in Section II, are used as indications of the correct sampling instants. Letting $H(1/T) = 1$, the mean timing wave is a sinusoid with a phase of $-2\pi\tau/T$, for a real $G(f)$.

$$ E[w(t)] = \frac{2\,\overline{a^2}}{T} \operatorname{Re}\left[A_1 \exp\left(\frac{j2\pi}{T}(t - \tau)\right)\right]. \tag{22} $$

We see that the zero crossings of the mean timing wave are at a fixed time offset ($T/4$) relative to the desired sampling instants.
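The amplitude of the mean timing wave is set by the Fourier coefficient of $g^2(t)$ at the symbol rate, i.e., by the overlap integral of $G(f)$ with $G(1/T - f)$. As an illustration, the sketch below evaluates that integral numerically for a raised-cosine spectrum with rolloff factor `alpha`. The raised-cosine pulse, the normalization $G(0) = T$, and the function names are assumptions made for this example; they are not taken from the paper.

```python
import math


def raised_cosine_spectrum(f, T, alpha):
    """Raised-cosine G(f) with G(0) = T and rolloff factor 0 <= alpha <= 1."""
    af = abs(f)
    f1 = (1.0 - alpha) / (2.0 * T)
    f2 = (1.0 + alpha) / (2.0 * T)
    if af <= f1:
        return T
    if af >= f2:
        return 0.0
    return 0.5 * T * (1.0 + math.cos(math.pi * T / alpha * (af - f1)))


def timing_coefficient(l, T, alpha, n_points=20000):
    """Midpoint-rule estimate of the integral of G(l/T - f) * G(f) over f.
    The integration range |f| <= 1/T covers the support of G(f)."""
    lo, hi = -1.0 / T, 1.0 / T
    df = (hi - lo) / n_points
    total = 0.0
    for i in range(n_points):
        f = lo + (i + 0.5) * df
        total += raised_cosine_spectrum(l / T - f, T, alpha) * raised_cosine_spectrum(f, T, alpha)
    return total * df


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 1.0):
        print(alpha, timing_coefficient(1, 1.0, alpha))
```

For this pulse family the coefficient at $l = 1$ works out to $\alpha T/8$, so it is zero with no excess bandwidth and grows in proportion to the rolloff factor. The numerical values can be compared against that closed form.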
This timing offset can be handled by counting logic in the clock circuitry, or by designing H(j) to incorporate a n/2 phase shift at f = lIT. The actual zero crossings of w(t) fluctuate about the desired sampling instants because the timing wave depends on the actual realization the entire data sequence. Different zero crossings result for different data sequences and for this reason the fluctuatiori in zero crossings is sometimes called pattern-dependent jitter to distinguish it from jitter produced by additive noise on the pAM signal. To evaluate the statistical nature of the pattern-dependent jitter, we need to calculate the variance of the timing wave. This is a fairly complicated expression in terms of Hif) and G(j) but it can be evaluated numerically to study the effects of a variety of parameters (bandwidth, mistuning, rolloff shape, etc.) relating to data pulse shape and the bandpass filter transfer function [11]. For a 'relatively narrow-band real H(j) and real G(f) bandlimited as mentioned previously, the variance expression has the form
of
, 41T varw(t)=Co +C1 cos-(t-r) T
(23)
Fig. 3.
Baseband PAM receiver with timing recovery.
off characteristic can result in very small values of AIAlthough the class IV partial response format exhibits a relatively high tolerance to timing error 1i), it is likely that some other recovery scheme may have to be used. Some of the proposed schemes [13], [14] closely resemble the dataaided approach discussed in Section IV. Calculation of the statistical properties of the actual zero crossings of the timing wave is difficult. A useful approxima.. tion can be obtained by locating the zero crossings by linear extrapolatiori using the mean slope at the mean zero crossing. When this approach is used, the expression for timing jitter variance becomes [11]
t
where Co ~ C 1 > 0 are constants depending on G(j) and H(f). The cyclostationarity of the timing ,waveis apparent from this expression. As the bandwidth of H(f) approaches zero, the (24) value of Ct approaches Co so that the variance has a great fluctuation over ~ne symbol period. Note that the minimum variance occurs just at the instant of the mean zero crossings, In order to reduce this pattern-dependent jitter, there is hence the fluctuations in zero crossings are much less than fortunately an attractive alternative to making the bandwidth would be expected from a consideration of the average variance of the timing wave over a symbol period. This again points of H(j) very small, which increases acquisition time in the out the error in disregarding the cyclostationary nature of the same manner as for carrier phase recovery circuits, or to mak.. timing wav,e process as, for example, 'in using the power spec- ing the bandwidth of G(j) very large. There are symmetry tral density of the squarer output to analyze the jitter phe- conditions that can be imposed upon H(/) and G(j) that rriake C. = Co in (24), resulting in nonfluctuating zero crossings. nomenon. The mean timing wave (22) can be regarded as a kind of These conditions are simply that G(f) be a bandpass characdiscriminator characteristic or Sscurve for measuring the teristic symmetric about 1/2T, with a-bandwidth not exceedparameter T. For the bandlimited case we are discussing here, ing 1/2T, and H(f) be symmetric about lIT. The symmetry in this S-curve is just a sinusoid, with a zero crossing at the true G(f) can. be accomplished by prefiltering the PAM signal value of the parameter. Discrimination is enhanced by increas- before it enters the squarer [~1] , [15] . Since the timing recoving the slope at the zero crossing. 
As this slope is proportional ery path is distinct from the data signal path, the prefiltering to AI' we can see .how the shape of the data pulse g(t) affects can be performed without influencing the data signal equalizatiming recovery. From (21) we see that the value of Al tion, as shown in the baseband receiver configuration of Fig. 3. Although we are dealing with a baseband signal process, it is depends on the amount of overlap of the functions G(f) and interes~ing to observe that the timing jitter problem can be G(I/T - f), and hence it depends on the amount by which studied by means of complex envelopes and decomposition the bandwidth of G(j) exceeds the i/'4T Nyquist bandwidth. into I and Q components, as in the carrier phase recovery case With no excess bandwidth, Al = 0 and this method of timing recovery fails. The situation improves rapidly as the excess [16). One way to do this is to let 'Y(t) be the complex envebandwidth factor increases from 0 to 100 percent. With very , lope of g(t), relative to a frequency fo ~ 1/2T. This makes r(f) large increases in bandwidth there are mote harmonic compon- bandlimited to If I < .1/2T. Then, taking T = 0 for convenents the mean timing wave; and its zero crossing slope can ience, the output of the squarer is be further increased without increasing signal level by proper phasing of these components. On the other hand, there are systems where the spectral distribution is such that the fractional amount of energy above 1/2T is very small. An important case is that of (class N) partial response signaling where the pulse shape is chosen to produce a spectral null at 1/2T. (25) This spectral null in combination with a sharp baseband roll-
The second term in (25) can be disregarded as not being passed by $H(f)$. The first term is expressed by a complex envelope relative to $f_0 = 1/T$. It is the quadrature component (imaginary part) of this complex envelope that produces timing jitter. This component is

$$b_Q(t) = \sum_k \sum_m a_k a_m (-1)^{k+m}\, c_I(t - kT)\, c_Q(t - mT) \tag{26}$$

where $c_I(t)$ and $c_Q(t)$ are the real and imaginary parts of $\gamma(t)$. The Fourier transform of (26), evaluated at $f = 0$, is
where $M(f) = \sum_k a_k \exp(-j2\pi kTf)$ is the transform of the data sequence. The integral (27) vanishes because, for a real $G(f)$, i.e., a real $\Gamma(f)$, the integrand is an odd function. The situation is similar to the VSB carrier signal case, where the spectrum of the quadrature component at the squarer output vanished at $f = 0$. We see here also that the particular shape of $H(f)$ will have a major influence when calculating the jitter variance because the spectrum of the jitter-producing component goes to zero just at the center of its passband.

IV. MAXIMUM-LIKELIHOOD PARAMETER ESTIMATION

The foregoing carrier- and bit-synchronization circuits were developed on a rather heuristic basis, and a natural question arises as to how much improvement in parameter estimation could result from the choice of other circuit configurations or circuit parameters. It seems natural to regard $\theta$ and $\tau$ as unknown but nonrandom parameters, which suggests the maximum-likelihood (ML) estimator as the preferred strategy [17]. Some authors have used the maximum a posteriori probability (MAP) receiver by modeling $\theta$ and $\tau$ as random parameters with specified a priori probability density functions. However, in most situations the a priori knowledge about $\theta$ is only accurate to within many carrier cycles, or in the case of $\tau$, to within many symbol periods. As our concern is with estimation modulo $2\pi$ for $\theta$ or modulo $T$ for $\tau$, we would use a "folded" version of the a priori density functions, resulting in a nearly uniform distribution over the interval. In this case, the ML estimates and MAP estimates would be essentially identical.

We find that the phase and timing-recovery circuits based on the ML approach may not be drastically different from the circuits already considered. In fact, under the proper conditions, the circuits we have examined can be close approximations to ML estimators. One of the main advantages of the ML approach, in addition to suggesting appropriate circuit configurations, is that simple lower bounds on jitter performance can be developed to serve as benchmarks for evaluating performance of the actual recovery circuits employed.

In this section we begin with discussion of ML carrier phase recovery with a rather general specification of the message signal process. We show that the Costas loop, or the equivalent squarer/BPF, can be designed to closely approximate the ML phase estimator. Then we present a similar development for ML estimation of symbol timing for a baseband PAM signal. We introduce the idea of using information about the data sequence to aid the timing recovery process and we later make comparisons to show the effectiveness of such data-aided schemes. Extension of the idea to joint recovery of both carrier phase and symbol timing parameters is discussed in Section V.

To formulate the problem in terms of ML estimation, we require that the receiver perform operations on a $T_0$-second record of the received signal, $z(t) = y(t, \theta) + n(t)$, to estimate the parameter $\theta$, assumed essentially constant over the $T_0$-second interval. This interval is called the observation interval and the $T_0$ parameter would be selected in accordance with acquisition time requirements. Estimation procedures based on data from a single observation interval will be referred to as one-shot estimation. We find that the one-shot ML estimators lead to the simplest methods for evaluating jitter performance. On the other hand, the preferred implementation of recovery circuits is usually in the form of tracking loops where the parameter estimates are being continuously updated. Fortunately, it is a relatively simple matter to relate the rms one-shot estimation error to the steady-state error of the tracking loop, and the loop bandwidth is directly related to the $T_0$ parameter.

We shall assume that the additive noise $n(t)$ is Gaussian and white with a double-sided spectral density of $N_0$ W/Hz. Initially we consider the situation where $y(t, \theta)$ is completely known except for the parameter $\theta$. The resulting likelihood function, with argument $\hat{\theta}$ which can be regarded as a trial estimate of the parameter, is given by
$$L(\hat{\theta}) = \exp\left\{ -\frac{1}{2N_0} \int_{T_0} \left[ z(t) - y(t, \hat{\theta}) \right]^2 dt \right\} \tag{28}$$
The ML estimate is the value of $\hat{\theta}$ which minimizes the integral in (28). This integral expresses the signal space distance between the functions $z(t)$ and $y(t, \hat{\theta})$ defined on the interval $T_0$ [6], [17]. Expanding the binomial term in (28), we see that

$$\Lambda(\hat{\theta}) = \ln L(\hat{\theta}) \cong \frac{1}{N_0} \int_{T_0} z(t)\, y(t, \hat{\theta})\, dt + \text{constant} \tag{29}$$
since $z^2(t)$ is independent of $\hat{\theta}$, and if $\theta$ is a time shift or phase shift parameter, then the integral of $y^2(t, \hat{\theta})$ over a relatively long $T_0$ interval would have only a small variation with $\hat{\theta}$. The first term in (29) is often called the correlation between the received signal $z(t)$ and the reference signal $y(t, \hat{\theta})$, so that in this "known-signal" case the ML receiver is a correlator, and $\hat{\theta}$ is varied so as to maximize the correlation.

When $y(t, \theta)$ contains random message parameters, the appropriate likelihood function for estimating $\theta$ is obtained by averaging $L(\hat{\theta})$, not $\ln L(\hat{\theta})$, over these message parameters. We shall illustrate the method using the example of carrier phase estimation on a DSB signal where the modulating signal $a(t)$ is a zero-mean, Gaussian random process with a substantially flat spectrum bandlimited to $W$ Hz. Finding the expectation of $L(\hat{\theta})$ with respect to the Gaussian message process can be done without great difficulty by making a Karhunen-Loeve expansion of the process to give a series representation with independent coefficients [18]. The result of this averaging gives a log-likelihood function closely approximated by

$$\Lambda(\hat{\theta}) = \int_{T_0} \left[ \operatorname{Re}\, \alpha(t) \exp(-j\hat{\theta}) \right]^2 dt = \tfrac{1}{2} \operatorname{Re}\left[ \exp(-j2\hat{\theta}) \int_{T_0} \alpha^2(t)\, dt \right] + \tfrac{1}{2} \int_{T_0} |\alpha(t)|^2\, dt \tag{30}$$

where $\alpha(t)$ is the complex envelope, relative to $f_0$, of the received signal. We ignore the second integral in (30) as it is independent of $\hat{\theta}$.

Equation (30) suggests a practical implementation of the ML phase estimator. Consider a receiver structure which produces the complex signal

$$\lambda(t) = \int_{t-T_0}^{t} \alpha^2(s)\, ds \triangleq \rho(t) \exp\left(j2\theta(t)\right). \tag{31}$$

The integral in (31) is the convolution product of $\alpha^2(t)$ and a $T_0$-second rectangle; hence $\lambda(t)$ can be regarded as the complex envelope of the output of the squarer/bandpass filter configuration of Fig. 1. In this case, $H(f)$ corresponds to a $\operatorname{sinc}(T_0 f)$ shape centered at $2f_0$. Writing $\lambda(t)$ in polar form as shown in (31), we see that the corresponding term in (30) is maximized, at any $t$, by choosing $\hat{\theta} = \theta(t)$. In other words, by suitably designing the shape of the bandpass transfer function, the simple structure of Fig. 1 is a ML phase estimator, in the sense that the instantaneous phase of the timing wave output is the best estimate of the DSB carrier phase based on observations over only the past $T_0$ seconds. The phase jitter can be evaluated approximately by the same method leading to (11), with the result that

$$\operatorname{var} \phi = \frac{2N_0}{S T_0} = \frac{1}{W T_0} \left( \frac{S}{N} \right)^{-1} \tag{32}$$

where $S = k_{aa}(0)$ is the signal power and $N = 2N_0 W$ is noise power over the signal band.

The tracking loop version of this phase estimator is developed by forming a loop error signal proportional to the derivative of $\Lambda$ with respect to $\hat{\theta}$. Then, as the loop action tends to drive the error signal to zero, the resulting value of $\hat{\theta}$ should correspond to a maximum of $\Lambda$. Since the control voltage $v(t)$ for a VCO normally controls frequency, rather than phase, we suppress the integration in (30) and let

$$v(t) = \frac{\partial}{\partial \hat{\theta}}\, \tfrac{1}{2} \operatorname{Re}\left[ \alpha^2(t) \exp(-j2\hat{\theta}) \right] = \operatorname{Re}\left[ \alpha^2(t) \exp(-j2\hat{\theta} - j\pi/2) \right] \tag{33}$$

which is the same control voltage that appears in the Costas loop (9) and Fig. 2(b), with the normalization $A = 1$. If the VCO had a voltage-controlled phase, rather than frequency, we would include the $T_0$-second integration effect by means of the loop filter. However, in this case we simply let $F(f) = 1$ and rely on the integration inherent in the VCO. The parameter $T_0$ is related to loop performance by adjusting the loop gain factor $M$, which is proportional to loop bandwidth, so that the steady-state jitter variance for the loop is identical to (32) for the nontracking implementation. For the DSB signal with additive noise, we have

$$\alpha(t) = \left[ a(t) + u_I(t) + j u_Q(t) \right] \exp(j\theta) \tag{34}$$

where $u_I$ and $u_Q$ are the I and Q components of noise relative to the carrier phase $\theta$. Letting the VCO gain constant be $M$ (hertz/volt) so that $\dot{\hat{\theta}}(t) = 2\pi M v(t)$, and assuming a high signal-to-noise ratio so that the second-order noise effects can be neglected, a linearized loop equation for phase error, $\phi = \hat{\theta} - \theta$, appears as

$$\dot{\phi}(t) + 2\pi B_L \phi(t) = \frac{2\pi B_L}{S}\, u_Q(t)\, a(t) - \frac{2\pi B_L}{S}\, b(t)\, \phi(t). \tag{35}$$

The difficult part of solving this equation to get the steady-state variance of $\phi$ is the second driving term, where $b(t) \triangleq a^2(t) - S$ and $\phi(t)$ are clearly not independent. It turns out, however, that if the loop bandwidth parameter $B_L = 2MS$ is sufficiently small compared to the signal bandwidth $W$, then this term can be neglected. The other excitation term $u_Q a$ can be treated as white noise with a spectral density of $2N_0 S$, and the steady-state variance of $\phi$ can be determined by conventional frequency-domain techniques. The result is

$$\operatorname{var} \phi = \frac{2\pi B_L N_0}{S} \tag{36}$$

and, equating (36) and (32), we find that $B_L = 1/\pi T_0$ is the relation sought between observation interval and tracking loop bandwidth.

Turning now to ML timing recovery for the baseband PAM signal, with $\tau$ replacing $\theta$, and using (18) for $y(t, \tau)$, the log-likelihood function for the case of a known signal (29) becomes

$$\Lambda(\hat{\tau}) = \frac{1}{N_0} \sum_k a_k\, q_k(\hat{\tau}) \tag{37}$$

where

$$q_k(\hat{\tau}) = \int_{T_0} z(t)\, g(t - kT - \hat{\tau})\, dt.$$
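The known-signal timing estimator of (37) can be illustrated numerically. The following is a minimal sketch, not the receiver described in the paper: it assumes a Gaussian-shaped pulse for $g(t)$, binary symbols, a simple grid search over trial values of $\hat{\tau}$, and drops the $1/N_0$ factor because it does not move the maximum. The function names `pam_waveform` and `da_timing_estimate` are hypothetical.

```python
import numpy as np

def pam_waveform(symbols, tau, t, T=1.0, width=0.3):
    """Return sum_k a_k g(t - kT - tau) for an assumed Gaussian pulse g."""
    k = np.arange(len(symbols))
    arg = t[None, :] - k[:, None] * T - tau      # one row per symbol
    pulses = np.exp(-0.5 * (arg / width) ** 2)   # illustrative pulse shape
    return symbols @ pulses

def da_timing_estimate(z, symbols, t, trial_taus, T=1.0):
    """Grid search for the trial tau that maximizes sum_k a_k q_k(tau)."""
    dt = t[1] - t[0]
    # By linearity, sum_k a_k q_k(tau_hat) is the correlation of z(t)
    # with the reference waveform y(t, tau_hat).
    scores = np.array([np.sum(z * pam_waveform(symbols, tau_hat, t, T)) * dt
                       for tau_hat in trial_taus])
    return trial_taus[np.argmax(scores)], scores

rng = np.random.default_rng(0)
K, true_tau = 32, 0.2                      # preamble length and true offset
t = np.arange(-3.0, K + 3.0, 0.01)         # observation interval (in units of T)
symbols = rng.choice([-1.0, 1.0], size=K)  # known preamble symbols
z = pam_waveform(symbols, true_tau, t) + 0.1 * rng.standard_normal(t.size)

trial_taus = np.linspace(-0.5, 0.5, 101)
tau_hat, scores = da_timing_estimate(z, symbols, t, trial_taus)
print(f"true tau = {true_tau}, estimate = {tau_hat:.2f}")
```

A grid search is used here only for transparency; a tracking loop would instead drive the derivative of the same quantity toward zero.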
It is possible to use this expression directly for timing recovery in a situation where a relatively long sequence, say $K$, of known symbols is transmitted as a preamble to the actual message sequence. The receiver would store the $K$-symbol sequence and attempt to establish the correct timing before the end of the preamble. The idea can also be used during message transmission if the symbols are digitized, so that the receiver makes decisions as to which of the finite number of possible symbols have been transmitted. The receiver decisions are then assumed to be correct, at least for the purposes of timing recovery. This bootstrap type of operation is referred to as decision-directed or data-aided timing recovery, and it has received extensive study for both symbol timing and carrier phase recovery [19]-[22]. In the following, we shall use the term "data-aided" to refer to both modes of operation, i.e., the start-up mode where a known data sequence is being transmitted and the tracking mode where the symbol detector output sequence is used.

For recovery strategies which are not data aided, we need to average the likelihood function (37) over the random data variables. If we assume that the $\{a_k\}$ are independent Gaussian random variables and also that the data pulses have unit energy and are orthogonal over the $T_0$ interval, i.e.,

$$\int_{T_0} g(t - kT - \tau)\, g(t - mT - \tau)\, dt = \delta_{km} \tag{38}$$

then the log-likelihood function is given by

(39)

where the $q_k$ are the same quantities defined in (37). Although the Gaussian density is obviously not an accurate model for digital data signals, we want to consider it here because it provides the link between the ML estimators and the estimators of Sections II and III based on statistical moment properties. It is the Gaussian assumption that leads to the square-law type of nonlinearity. If we consider equiprobable binary data, for example, the corresponding log-likelihood function is [23]-[25]

$$\Lambda(\hat{\tau}) = \sum_k \ln \cosh\left[ \frac{1}{N_0}\, q_k(\hat{\tau}) \right] \tag{40}$$

and since $\ln \cosh x \approx \tfrac{1}{2} x^2$ for small $x$, the square-law nonlinearity is near optimum at the lower signal-to-noise ratios. The log-likelihood function for equiprobable independent multilevel data has also been derived [25], [26].

When the Gaussian assumption is used for the data, it is also possible to consider correlated data as well as nonorthogonal pulses, i.e., when (38) does not hold. Both of these effects can be dealt with by replacing the $a_k$-sequence in (39) by a linear discrete-time filtered version of this sequence [27]. In summary, we find that recovery circuits based on the Gaussian-distributed data assumption are somewhat simpler than the optimum circuits, and in most situations the jitter performance is not appreciably worse. We note that the method for evaluating rms jitter, presented in Section VI, does not depend on the particular kind of density function used to characterize the data.

When it comes to implementation of receivers based on (37) and (39) for the data-aided (DA) or nondata-aided (NDA) strategies, we usually resort to an approximation which involves replacing the infinite sum by a $K$-term sum, where $KT = T_0$, and replacing the finite integration interval by an infinite interval. Then the approximate, implementable log-likelihood function in the DA case is taken as

$$\tilde{\Lambda}(\hat{\tau}) = \frac{1}{N_0} \sum_{k=0}^{K-1} a_k\, \tilde{q}_k(\hat{\tau}) \tag{41}$$

where

$$\tilde{q}_k(\hat{\tau}) = \int_{-\infty}^{\infty} z(t)\, g(t - kT - \hat{\tau})\, dt.$$

With this approximation, the integral is a convolution integral, and $\tilde{q}_k$ can be interpreted as the sampled (at $t = kT + \hat{\tau}$) output of a matched filter having the impulse response $g(-t)$. The same approximation is used for the NDA case (39), and the orthogonality condition (38) can be interpreted to mean that the matched filter response to a single data pulse is a pulse satisfying the Nyquist criterion. This approximation, which leads to relatively simple implementations for the recovery circuits, does introduce a degradation from the idealized ML performance. An interesting interpretation of the effect of the approximation is that it introduces a pattern-dependent component of jitter, as discussed in Section VI.

For tracking loop implementation of these timing recovery strategies, we use a voltage-controlled clock (VCC) driven by

(42)

for the DA case, and

(43)

for the NDA case. The $K$-term summation is suppressed, being replaced by the integration action of the VCC, as in the case of the Costas loop phase recovery circuit discussed earlier. Similar also is the relation between $T_0$ and the loop bandwidth, the loop gain being adjusted so that the steady-state variance of timing jitter is the same as for one-shot estimation in a single observation interval. The result is also $B_L = 1/\pi T_0$ [26]. The tracking loop configurations are evident from inspection of (42) and (43). One structure will serve for both strategies by incorporating a DA/NDA mode switch as shown in Fig. 4. This could be quite useful in a system that uses the DA strategy on a message preamble, then switches to the NDA strategy when the message symbols begin. Notice that the NDA configuration is remarkably like a Costas loop, which suggests the existence of an equivalent realization using a square-law device. This alternative and equivalent form is shown in Fig. 5. The corresponding implementation of (40) for NDA recovery with binary data involves the same structure as shown in Fig. 4, except that a $\tanh(\cdot)$ nonlinearity is incorporated into the upper path of the NDA loop [25].
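The relation between the binary-data nonlinearity of (40) and the square-law device can be checked numerically. The sketch below is illustrative only: the sample values in `q` are made up, and the helper names `log_cosh`, `binary_nda_metric`, and `square_law_metric` are hypothetical. It compares $\sum_k \ln\cosh(q_k/N_0)$ with $\tfrac{1}{2}\sum_k (q_k/N_0)^2$ at a low and a high signal-to-noise setting, and confirms that the slope of $\ln\cosh$ is $\tanh$.

```python
import numpy as np

def log_cosh(x):
    """Numerically stable ln cosh x, computed as logaddexp(x, -x) - ln 2."""
    x = np.asarray(x, dtype=float)
    return np.logaddexp(x, -x) - np.log(2.0)

def binary_nda_metric(q, n0):
    """Sum over k of ln cosh(q_k / N0), the form appearing in (40)."""
    return float(np.sum(log_cosh(np.asarray(q) / n0)))

def square_law_metric(q, n0):
    """Small-argument approximation: half the sum of (q_k / N0)^2."""
    return float(0.5 * np.sum((np.asarray(q) / n0) ** 2))

q = np.array([0.05, -0.08, 0.10, -0.02])   # hypothetical matched-filter samples

# Large N0 (low SNR): arguments are small, the two metrics nearly coincide.
low_snr = (binary_nda_metric(q, 1.0), square_law_metric(q, 1.0))
# Small N0 (high SNR): ln cosh grows only linearly, so the metrics diverge.
high_snr = (binary_nda_metric(q, 0.01), square_law_metric(q, 0.01))

# The derivative of ln cosh is tanh, checked here by central differences.
x = np.linspace(-3.0, 3.0, 13)
h = 1e-6
slope = (log_cosh(x + h) - log_cosh(x - h)) / (2 * h)

print("low SNR :", low_snr)
print("high SNR:", high_snr)
```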
Fig. 4. ML baseband PAM timing recovery circuit.

Fig. 5. Alternative implementation of NDA baseband timing recovery.

Fig. 6. Receiver implementation of the $q_k(\theta, \tau)$ test statistic.
V. JOINT RECOVERY OF CARRIER PHASE AND SYMBOL TIMING
When a carrier system, such as VSB/PAM or QAM/PAM, is used to transmit a digital data signal, we have the possibility of jointly estimating the carrier phase and symbol timing parameters. Such a strategy certainly cannot be worse than estimating the parameters individually and, in some cases, joint estimation gives remarkable improvements. Some authors have extended the idea to joint estimation of the data sequence and the two timing parameters [28]-[30]. We shall not consider this latter possibility here, but shall consider both DA and NDA joint parameter estimation. Our conjecture is that, in the majority of applications, DA recovery performance differs little from that of joint estimation of data and timing parameters.

We consider first the QAM/PAM data signal case where we want to estimate $\theta$ and $\tau$ in

$$y(t; \theta, \tau) = \operatorname{Re}\left[ \left\{ \sum_k a_k\, g(t - kT - \tau) + j b_k\, h(t - kT - \tau) \right\} \exp(j\theta) \exp(j2\pi f_0 t) \right] \tag{44}$$

from receiver measurements on $z(t) = y(t) + n(t)$ over a $T_0$-second observation interval. The implementable version of the log-likelihood function for the DA case is

(45)

where

$$q_k(\hat{\theta}, \hat{\tau}) = \operatorname{Re}\left[ \exp(-j\hat{\theta}) \int_{-\infty}^{\infty} \alpha(t)\, g(t - kT - \hat{\tau})\, dt \right]$$

$$p_k(\hat{\theta}, \hat{\tau}) = \operatorname{Re}\left[ -j \exp(-j\hat{\theta}) \int_{-\infty}^{\infty} \alpha(t)\, h(t - kT - \hat{\tau})\, dt \right]$$

and $\alpha(t)$ is the complex envelope of the received signal. The $q_k$ quantities are interpreted as the sampled (at $t = kT + \hat{\tau}$) output of a coherent demodulator (operating at a phase $\hat{\theta}$) whose input is a bandpass filtered version of the received signal. The receiver implementation for these quantities is shown in Fig. 6, and a similar implementation would provide the $p_k$ quantities. For the joint tracking loop, the partial derivatives of $\Lambda$ with respect to $\hat{\theta}$ and $\hat{\tau}$, without the $K$-term summation, are used to update the VCO and VCC frequencies once every $T$ seconds. For the normal QAM case we let $h(t) = g(t)$ and some simplifications result, for then $\partial q_k / \partial \hat{\theta} = p_k$ and $\partial p_k / \partial \hat{\theta} = -q_k$. The $\partial q_k / \partial \hat{\tau}$ and $\partial p_k / \partial \hat{\tau}$ quantities are obtained by differentiating the I and Q baseband signals before sampling. The complete tracking loop implementation for the DA case is shown in Fig. 7.

Fig. 7. Data-aided QAM joint tracking loop for carrier phase and symbol timing.

For balanced QAM/PAM with identical pulse shapes and statistically identical independent data in the I and Q channels, the NDA mode of recovery fails [27] because the NDA log-likelihood function is

$$\Lambda(\hat{\theta}, \hat{\tau}) = \frac{1}{2N_0} \sum_{k=0}^{K-1} \left[ q_k^2 + p_k^2 \right] \tag{46}$$
and this is independent of $\hat{\theta}$ under the previous assumptions. Fortunately, a simple modification makes the NDA mode effective. This modification is $h(t) = g(t \pm T/2)$, and the format is called staggered QAM (SQAM). The implementation is similar to, but somewhat more complex than, that shown in Fig. 7 because additional samplers are needed for sampling at both $kT + \hat{\tau}$ and $kT + \hat{\tau} \pm T/2$.

Fig. 8. Energy spectral density for VSB data pulse.
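The failure of the NDA mode for balanced QAM follows from a one-line identity, which the sketch below checks numerically. It assumes $h(t) = g(t)$, so that both test statistics derive from the same complex matched-filter samples $x_k = \int \alpha(t)\, g(t - kT - \hat{\tau})\, dt$; the array `x` holds made-up values standing in for those samples, and the function names are hypothetical. The constant factor in front of the sum is omitted since it does not affect the dependence on $\hat{\theta}$.

```python
import numpy as np

def qam_statistics(x, theta_hat):
    """q_k and p_k for h(t) = g(t), given complex matched-filter samples x_k."""
    rotated = np.exp(-1j * theta_hat) * x
    q = rotated.real             # Re[exp(-j theta) x_k]
    p = (-1j * rotated).real     # Re[-j exp(-j theta) x_k] = Im[exp(-j theta) x_k]
    return q, p

def nda_metric(x, theta_hat):
    """Sum of q_k^2 + p_k^2, proportional to the NDA log-likelihood."""
    q, p = qam_statistics(x, theta_hat)
    return float(np.sum(q ** 2 + p ** 2))

rng = np.random.default_rng(1)
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)  # hypothetical samples

# q_k^2 + p_k^2 = |x_k|^2 for every trial phase, so the metric is flat in theta.
metrics = [nda_metric(x, theta) for theta in np.linspace(0.0, np.pi, 7)]
print(metrics)
```

With staggered pulses the two statistics no longer come from the same samples, which is what restores the phase dependence.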
Considering now the one-dimensional VSB/PAM case, the log-likelihood functions are the same as (45) and (46) with the $b_k$ and $p_k$ quantities omitted, and with

$$q_k(\hat{\theta}, \hat{\tau}) = \operatorname{Re}\left[ \exp(-j\hat{\theta}) \int_{-\infty}^{\infty} \alpha(t)\, \gamma^*(t - kT - \hat{\tau})\, dt \right] \tag{47}$$

In (47), $\gamma(t) = g(t) + j\tilde{g}(t)$ is the complex envelope of a single, unit-amplitude carrier data pulse. The orthogonality condition corresponding to (38) for the NDA case can be satisfied by a pulse whose energy spectrum $|\Gamma(f)|^2$ has a shape of the form shown in Fig. 8, exhibiting a Nyquist type of symmetry in both of its rolloff regions.
In contrast to the QAM case, we find that the VCO and VCC frequency-control voltages should be derived as linear combinations of the partial derivatives of $\Lambda$ with respect to $\hat{\theta}$ and $\hat{\tau}$, i.e., there is a coupling between the parameter estimates. To show this, we consider the approximate solution for the one-shot estimator based on a Taylor series expansion of $\Lambda$ about trial values of $\theta_0$ and $\tau_0$. If these values are sufficiently close to the true values, then we can take as refined estimates, $\theta_1$ and $\tau_1$, the solutions of

$$\begin{bmatrix} \Lambda_{\theta\theta}(\theta_0, \tau_0) & \Lambda_{\theta\tau}(\theta_0, \tau_0) \\ \Lambda_{\theta\tau}(\theta_0, \tau_0) & \Lambda_{\tau\tau}(\theta_0, \tau_0) \end{bmatrix} \begin{bmatrix} \theta_1 - \theta_0 \\ \tau_1 - \tau_0 \end{bmatrix} = \begin{bmatrix} -\Lambda_\theta(\theta_0, \tau_0) \\ -\Lambda_\tau(\theta_0, \tau_0) \end{bmatrix} \tag{48}$$

where the subscripts in (48) denote partial derivatives. The solution of (48) is greatly simplified if the $2 \times 2$ matrix is replaced by its mean value, and this is valid at moderately high signal-to-noise ratio and moderately long observation intervals. Then we have a simple form of estimation given by

$$\begin{bmatrix} \theta_1 - \theta_0 \\ \tau_1 - \tau_0 \end{bmatrix} = -A^{-1} \begin{bmatrix} \Lambda_\theta(\theta_0, \tau_0) \\ \Lambda_\tau(\theta_0, \tau_0) \end{bmatrix} \tag{49}$$

where the $2 \times 2$ matrix $A$ is the expected value of the matrix in (48). The $A$ matrix can be regarded as a generalization of the $A_1$ quantity in (22), (24) for the single-parameter recovery problem. For the joint tracking loop, the VCO and VCC control voltages are linear combinations of the $\Lambda_\theta$ and $\Lambda_\tau$ quantities (without the $K$-term summation) as characterized by the inverse of the $A$ matrix. In the QAM and SQAM cases, the $A$ matrix is diagonal so that no coupling is required, but for VSB there are strong off-diagonal terms. It has been shown [30]-[32] that loop convergence rates can be substantially improved by incorporating this coupling on the control signals.

A block diagram for the one-dimensional joint tracking loop is shown in Fig. 9. A DA/NDA mode switch is shown in the diagram, but it must be recognized that the $A^{-1}$ coupling matrix is a compromise value in either one or both modes because the $A$ matrix is quite different for the DA and NDA cases, as discussed in the next section. Another practical consideration is that the configuration of Fig. 9 can also be used for QAM and SQAM with some loss in performance. Here we just eliminate the $p_k$ quantities in (45) or (46) and use the matched BPF for the I-channel pulse, i.e., let $\gamma^*(-t) = g(-t)$. The loss in performance will be about 3 dB or greater, depending on signal-to-noise ratio and on which parameter is considered, as there are some cancellations of pattern-dependent jitter in the configuration of Fig. 7 which are not possible in that of Fig. 9.

Fig. 9. One-dimensional joint tracking loop.
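The role of the off-diagonal terms in the coupling matrix can be seen in a small numerical sketch of the refinement step in (48) and (49). Everything here is an assumption made for illustration: the log-likelihood is replaced by a hypothetical quadratic surface with curvature matrix `P`, and `refine_estimates` is an invented name. For such a surface, one step with the full matrix lands on the maximum, while a step that ignores the off-diagonal terms does not.

```python
import numpy as np

def refine_estimates(theta0, tau0, grad, A):
    """One refinement step: [theta1, tau1] = [theta0, tau0] - A^{-1} grad."""
    step = -np.linalg.solve(A, grad)
    return theta0 + step[0], tau0 + step[1]

# Hypothetical quadratic surface: Lambda(x) = -0.5 (x - x_true)^T P (x - x_true)
P = np.array([[4.0, 1.5],
              [1.5, 2.0]])            # off-diagonal terms model the VSB coupling
x_true = np.array([0.30, -0.10])      # (theta, tau) at the maximum
x0 = np.array([0.25, -0.05])          # trial values (theta0, tau0)

grad = -P @ (x0 - x_true)             # [Lambda_theta, Lambda_tau] at the trial point
A = -P                                # matrix of second derivatives

coupled = refine_estimates(x0[0], x0[1], grad, A)
uncoupled = refine_estimates(x0[0], x0[1], grad, np.diag(np.diag(A)))

print("with coupling   :", coupled)
print("without coupling:", uncoupled)
```

In a tracking loop the same matrix inverse would weight the two error signals at every update rather than being applied once.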
VI. PERFORMANCE OF TIMING RECOVERY SCHEMES

A convenient approach to evaluating timing recovery circuit performance is to derive expressions for rms phase and timing jitter directly from (49), or its one-dimensional counterpart for individual estimation of $\theta$ or $\tau$. For these calculations we assume $\theta_0$ and $\tau_0$ are the true values of the parameters, so that the left-hand side of (49) gives the jitter variables. The equation is linearized in the sense that the jitter variables depend linearly on the receiver measurements $\Lambda_\theta$ and $\Lambda_\tau$. As in all analyses of this type, the results are accurate if the jitter is relatively small, which in this case generally means a moderately high signal-to-noise ratio and a moderately long observation interval. This approach affords an effective means to study jitter performance with respect to the values of all system parameters, such as signal-to-noise ratio, $T_0$ (or $K$), excess bandwidth, and pulse shape. It also allows comparison of jitter performance of the various modulation formats and evaluation of the DA strategy relative to the NDA strategy.
Another important aspect that can be examined from the rms jitter calculations is the effect of the implementation approximations $\Lambda \to \tilde{\Lambda}$ and $q_k \to \tilde{q}_k$. Let us take as an illustrative example the one-dimensional case of baseband timing recovery. For the implementable DA case we get

$$\operatorname{var} \hat{\tau} = \operatorname{var} \Lambda_\tau(\tau) \left[ E\{\Lambda_{\tau\tau}(\tau)\} \right]^{-2} = \frac{N_0}{\overline{a^2}\, K D} + \frac{\displaystyle\sum_{k=0}^{K-1} \sum_{m \in K'} \dot{r}^2(mT - kT)}{K^2 D^2} \tag{50}$$

where

$$r(t) = \int_{-\infty}^{\infty} g(s + t)\, g(s)\, ds$$

and

$$D = -\ddot{r}(0) = \int_{-\infty}^{\infty} \dot{g}^2(t)\, dt$$
is the energy in the time derivative of a single data pulse. The notation $m \in K'$ in (50) means that the sum is taken over all indices less than 0 and all indices greater than or equal to $K$, i.e., just the terms not used in the other sum. The first term in (50) is seen to vary inversely with $K$, signal-to-noise ratio, and the energy in $\dot{g}(t)$. From this it is obvious that "sharp-edged" pulses can give excellent timing recovery performance. In fact, the entire denominator in the first term can be taken approximately as the expected value of the energy of the time derivative of the received PAM signal over the $T_0$-second observation interval.

Another significant aspect of the first term in (50) is that it gives the entire jitter variance if $\Lambda$ and $q_k$ are used instead of $\tilde{\Lambda}$ and $\tilde{q}_k$. In other words, the second term in (50) gives the additional jitter variance resulting from the practical implementation considerations. This term does not depend on the noise level and it can be regarded as the effect of the pattern-dependent component of jitter. It varies inversely with $K^2$ since the numerator is essentially a constant even for moderately small values of $K$. Thus we see that if severe requirements are placed on acquisition time (small $K$), then the effect of this pattern-dependent term is apt to dominate. Otherwise, for larger $K$, the first term may be dominant and the difference between true ML estimation and its implementable approximation may be negligible. Although the variance expressions are somewhat different, the same general conclusions about the two types of jitter terms hold for the NDA timing recovery case, and for joint timing and carrier phase recovery [26].

Another physical interpretation, in the case of carrier phase recovery, is as follows. There are two random interference components producing jitter in the carrier phase tracking loop; one is due to the additive noise on the input signal, and the other is due to the message sidebands of the carrier signal. It is primarily the quadrature components of these interferences that cause the jitter. The quadrature noise has a flat spectrum about the carrier frequency, so the jitter variance due to this interference should vary in direct proportion to the loop bandwidth. The quadrature data-dependent interference has a spectral null in the vicinity of the carrier frequency, so we would expect jitter variance due to this effect to increase faster than linearly with loop bandwidth. Hence, for a given signal-to-noise ratio, and for a large enough loop bandwidth (rapid acquisition), we would expect the pattern-dependent term to dominate.

The relative performance of the different modulation formats is governed primarily by the size of the elements of the $A$ matrix. Also, the $A$ matrix almost completely characterizes the difference in performance of the DA and NDA strategies. For example, in the NDA/VSB case, the $A$ matrix elements contain terms proportional to the integral of the product of $|\Gamma(f)|^2$ and $|\Gamma(1/T - f)|^2$ [26], [27]. Thus the size of the terms depends on the amount of overlap of the pulse energy spectrum and its frequency-translated version, and this depends on the amount of excess bandwidth available. For the staircase shape shown in Fig. 8, the term is directly proportional to the excess bandwidth factor $\epsilon$. As a result, the rms jitter has a $1/\epsilon$ behavior and performance is unacceptable at very small excess bandwidth. On the other hand, the $A$ matrix for DA/VSB has a completely different dependence on $\Gamma(f)$, and it results in a finite jitter variance at $\epsilon = 0$ and a much slower rate of decrease for increasing $\epsilon$. In fact, with excess bandwidths over about 30 percent, the difference between DA and NDA jitter is usually small enough so that it may not be worth the additional circuit complexity to implement the DA strategy [26].

Finally, the variance expressions can be used to compare performance with different pulse shapes or to solve for optimum pulse shapes. There is no universally optimal pulse shape to cover the variety of cases discussed here. For one thing, we can see from (50) that the optimal pulse shape can depend on the signal-to-noise ratio. It is found, however, that the staircase, or "double-jump," rolloff pictured in Fig. 8 is optimal in certain cases and tends to be desirable in all cases. For example, it is better than the familiar "raised-cosine" rolloff by a factor of approximately 2 in jitter variance [26]. It is interesting to note that such pulse shaping is also optimal from the standpoint of providing maximal immunity to timing or phase offsets [33], [34]. This fact accentuates the importance of proper pulse shaping for overall system performance.
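The different scaling of the two jitter contributions with $K$ can be illustrated with a short calculation. This sketch rests on assumptions that should be read as such: the noise term is taken as $N_0 / (\overline{a^2} K D)$ and the pattern-dependent term as a double sum of $\dot{r}^2$ over lags divided by $K^2 D^2$, matching the verbal description of (50); the pulse is an illustrative Gaussian shape (neither unit-energy nor Nyquist), chosen only because $r(t)$ and $D$ have closed forms; and the function names and the truncation `span` are hypothetical.

```python
import numpy as np

def pattern_numerator(rdot, K, T=1.0, span=20):
    """Sum of rdot((m - k) T)^2 over k = 0..K-1 and m outside 0..K-1.

    The outer index set is truncated to `span` symbols on either side.
    """
    k = np.arange(K)
    m = np.concatenate([np.arange(-span, 0), np.arange(K, K + span)])
    lags = (m[None, :] - k[:, None]) * T
    return float(np.sum(rdot(lags) ** 2))

def da_timing_jitter_variance(n0, a2, K, D, numerator):
    """Return (noise term, pattern-dependent term) of the jitter variance."""
    noise_term = n0 / (a2 * K * D)
    pattern_term = numerator / (K ** 2 * D ** 2)
    return noise_term, pattern_term

s = 0.3   # width of the assumed pulse g(t) = exp(-t^2 / (2 s^2)), with T = 1

def rdot(t):
    """Derivative of r(t) = s sqrt(pi) exp(-t^2 / (4 s^2)) for that pulse."""
    r = s * np.sqrt(np.pi) * np.exp(-t ** 2 / (4 * s ** 2))
    return -t / (2 * s ** 2) * r

D = np.sqrt(np.pi) / (2 * s)          # energy of the pulse derivative, -r''(0)

results = {}
for K in (16, 32):
    numerator = pattern_numerator(rdot, K)
    results[K] = da_timing_jitter_variance(0.01, 1.0, K, D, numerator)
    print(K, results[K])
```

Doubling $K$ halves the noise term but quarters the pattern-dependent term, because the numerator is set by a few lags near the edges of the observation interval and does not grow with $K$.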
APPENDIX COMPLEX ENVELOPE REPRESENTATION OF SIGNALS
A straightforward extension of the familiar two-dimensional phasor representation for sinusoidal signals has proven to be a great convenience for dealing with carrier-type data signals where properties of amplitude and phase shift are of special significance. As a supplement to this paper only the most basic relationships are presented. More details and the derivations of the formulas can be found in some texts on communication systems or in [6, chs. 4 and 7]. An arbitrary signal $x(t)$ can be represented exactly by a complex envelope $\gamma(t)$ relative to a "center" frequency $f_0$, which for modulated-carrier signals is usually, but not necessarily, taken as the frequency of the unmodulated carrier.

$$x(t) = \operatorname{Re}\left[ \gamma(t) \exp(j2\pi f_0 t) \right]. \tag{A-1}$$

Expressing the complex number $\gamma(t)$ in polar form reveals directly the instantaneous amplitude $\rho(t)$ and phase $\theta(t)$ of the signal.

$$\gamma(t) = \rho(t) \exp[j\theta(t)] = c_I(t) + j c_Q(t). \tag{A-2}$$

In some situations, the rectangular form of $\gamma(t)$ in (A-2) has a more direct bearing on the problem as it decomposes the signal into its in-phase and quadrature (I and Q) components.

$$x(t) = c_I(t) \cos 2\pi f_0 t - c_Q(t) \sin 2\pi f_0 t. \tag{A-3}$$

Equation (A-1) might be regarded as one part of a transform pair. The other equation, i.e., how to get $\gamma(t)$, given $x(t)$, presents a small problem. Due to the nature of the "real part of" operator Re, there is not a unique $\gamma(t)$ for a given $x(t)$. We solve this problem by making the definition

$$\gamma(t) = \left[ x(t) + j\hat{x}(t) \right] \exp(-j2\pi f_0 t) \tag{A-4}$$
where x(t) is the Hilbert transform of x{t). The prescription for getting 1(t) from x(t) is especially simple in the frequency domain. The Fourier transform r(j) is obtained by doubling X(f), suppressing all negative-frequency values, and frequencytranslating the result downward by an amount 10. Incidentally, using this approach no narrow-band approximations concerning x(t) are necessary, and an arbitrary value of 10 can be selected. We now characterize the two most important signal processing operations, filtering and multiplication, in terms of equivalent operations on complex envelopes. Consider first the timeinvariant bandpass filtering operation in Fig. 10. We express the bandpass transfer function H(f) in terms of an equivalent low-pass transfer function f'l,(f), according to
Ca(t)
Fig. 10. Bandpass filtering and low-pass equivalent operation on com.. plex envelopesignals. x(t):Re[Y(t)exp(j2",f ot)]
y(t)
=Re[,8(t)exp(j2J1f
Fig. 11.
o fl]
z2(t)
=iRe[Y(t)~(t)e)(p(j47Tfot)]
Low-frequency and 2/0 terms of product of two bandpass signals.
Notice that if H(f) is symmetric about fo, then PO(/) = 0 (this is the definition of symmetry for a bandpass filter) and there is no cross coupling of the! and Q components in the filtering operation. Next we consider the output of a multiplier circuit, z(t) = x(t)y(t), when the two inputs are expressed in complex envelope notation. From (A-8), the multiplier output consists of two terms, one representing low-frequency components and the other representing components around 2/0'
z(t) = Re ['Y(t) exp U21Tfot)] Re
=t
Re ['){t)P*(t)]
+t
[~(t) exp
(j21Tfo t)]
Re [7'<:t)P(t) exp (j41Tfo t ]·
(A-8)

Straightforward manipulation shows that the input-output relation for complex envelopes is also a time-domain convolution

$$\beta(t) = [w \otimes \gamma](t) \tag{A-6}$$

and this result is general because of our particular method for defining the complex envelope in (A-4). If we express $w(t)$ in terms of its real and imaginary parts, $w(t) = p_I(t) + j\,p_Q(t)$, then the two-port bandpass filtering operation can be represented by a real four-port filter with separate ports for the I and Q input and output. The four-port filter is a lattice configuration involving the transfer functions $P_I(f)$ and $P_Q(f)$ as shown in Fig. 10.

$$P_I(f) = \tfrac{1}{2}\,\Omega(f) + \tfrac{1}{2}\,\Omega^*(-f), \qquad P_Q(f) = \tfrac{1}{2j}\,\Omega(f) - \tfrac{1}{2j}\,\Omega^*(-f). \tag{A-7}$$

$\Omega(f)$ is not necessarily a physical transfer function. If $H(f)$ exhibits asymmetry about $f_0$, then $\Omega(f)$ is asymmetric about $f = 0$ and the corresponding impulse response $w(t)$ is complex. In fact, $w(t)$ is precisely the complex envelope of $2h(t)$, where $h(t)$ is the real impulse response of the bandpass filter.

In most applications a multiplier is followed by either a lowpass filter (LPF) or a bandpass filter (BPF), as shown in Fig. 11, in order to select either the first or second term in (A-8) and completely reject the other term. In our application we may regard $y(t)$ as the reference carrier; then the LPF output $z_1(t)$ is the response of a coherent demodulator to $x(t)$. If $y(t) = x(t)$, so that the multiplier is really a squarer circuit, the BPF output $z_2(t)$ can be used for carrier phase recovery. Its complex envelope, relative to $2f_0$, is proportional to $\gamma^2(t)$.

Finally, when the bandpass signal is modeled as a random process, we use the same correspondence, (A-1) and (A-4), between the real process $x(t)$ and the complex envelope process $\gamma(t)$. It is of interest to relate the statistical properties of $x(t)$ to those of its in-phase and quadrature components, relative to some $f_0$. First we note that $E[x(t)] = \mathrm{Re}\{E[\gamma(t)] \exp(j2\pi f_0 t)\}$; hence for a wide-sense stationary (WSS) $x(t)$ process, $\gamma(t)$ must be a zero-mean process, in order that $E[x(t)]$ be independent of $t$. Proceeding to an examination of second-order moments, it is a simple matter to show that $\gamma(t)$ must be a WSS process if $x(t)$ is to be a WSS process. The converse is not true. A WSS $\gamma(t)$ may produce a nonstationary $x(t)$, indicated as follows. Rewriting (A-1) as

$$x(t) = \tfrac{1}{2}\,\gamma(t) \exp(j2\pi f_0 t) + \tfrac{1}{2}\,\gamma^*(t) \exp(-j2\pi f_0 t) \tag{A-9}$$
the autocorrelation for $x(t)$ can be expressed as

$$k_{xx}(t+\tau, t) = E[x(t+\tau)\,x(t)] = \tfrac{1}{2}\,\mathrm{Re}\,[k_{\gamma\gamma}(\tau) \exp(j2\pi f_0 \tau)] + \tfrac{1}{2}\,\mathrm{Re}\,[k_{\gamma\gamma^*}(\tau) \exp(j4\pi f_0 t + j2\pi f_0 \tau)] \tag{A-10}$$

where, for complex WSS processes, we define the autocorrelation of $\gamma(t)$ as

$$k_{\gamma\gamma}(\tau) = E[\gamma(t+\tau)\,\gamma^*(t)]. \tag{A-11}$$
The quantity $k_{\gamma\gamma^*}(\tau) = E[\gamma(t+\tau)\,\gamma(t)]$ in (A-10) can be regarded as the cross correlation between signal components centered at $+f_0$ and at $-f_0$. If $x(t)$ is WSS, then this cross correlation must vanish in order that the $t$-dependent term in (A-10) vanish. Otherwise $x(t)$ is a cyclostationary process.
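If we let $\gamma(t) = u(t) + j\,v(t)$, where the I and Q processes, $u(t)$ and $v(t)$, are jointly WSS, then we have

$$k_{\gamma\gamma}(\tau) = k_{uu}(\tau) + k_{vv}(\tau) + j\,[k_{vu}(\tau) - k_{uv}(\tau)]$$
$$k_{\gamma\gamma^*}(\tau) = k_{uu}(\tau) - k_{vv}(\tau) + j\,[k_{vu}(\tau) + k_{uv}(\tau)] \tag{A-12}$$

and the condition for stationarity of $x(t)$ requires that

$$k_{uu}(\tau) = k_{vv}(\tau), \qquad k_{vu}(\tau) = -k_{uv}(\tau). \tag{A-13}$$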
Thus for a WSS bandpass process, the I and Q components are balanced in the sense that they have the same autocorrelation function. Also, the cross correlation of the I and Q components must be an odd function, since $k_{vu}(\tau) = k_{uv}(-\tau)$ for any pair of WSS processes. For example, $v(t) = u(t)$ would satisfy the autocorrelation condition in (A-13), but not the cross correlation condition. The size of $k_{\gamma\gamma^*}(\tau)$ indicates the degree of cyclostationarity of a bandpass process. In the extreme case where either the I or Q component is missing, as in DSB-AM, we would have $k_{\gamma\gamma^*}(\tau) = \pm k_{\gamma\gamma}(\tau)$, e.g., for $v(t) = 0$,

$$k_{xx}(t+\tau, t) = k_{uu}(\tau) \cos(2\pi f_0 t) \cos(2\pi f_0 t + 2\pi f_0 \tau). \tag{A-14}$$
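The step from (A-10) to (A-14) can be checked numerically. The following is a minimal sketch, assuming a hypothetical autocorrelation $k_{uu}(\tau) = e^{-|\tau|}$ and an arbitrary carrier frequency; neither value comes from the paper. With $v(t) = 0$ both $k_{\gamma\gamma}$ and $k_{\gamma\gamma^*}$ reduce to $k_{uu}$, so the two expressions should agree for every $t$ and $\tau$.

```python
import cmath
import math


def kxx_a10(t, tau, f0, k_gg, k_gg_star):
    """Autocorrelation of x(t) from (A-10): stationary term plus t-dependent term."""
    w0 = 2.0 * math.pi * f0
    stationary = 0.5 * (k_gg(tau) * cmath.exp(1j * w0 * tau)).real
    cyclic = 0.5 * (k_gg_star(tau) * cmath.exp(1j * (2.0 * w0 * t + w0 * tau))).real
    return stationary + cyclic


def kxx_a14(t, tau, f0, k_uu):
    """DSB-AM special case (A-14), valid when the Q component v(t) is zero."""
    w0 = 2.0 * math.pi * f0
    return k_uu(tau) * math.cos(w0 * t) * math.cos(w0 * t + w0 * tau)


def k_uu(tau):
    """Hypothetical I-component autocorrelation, chosen only for illustration."""
    return math.exp(-abs(tau))


F0 = 10.0  # carrier frequency in Hz, an arbitrary assumption

for t in (0.0, 0.013, 0.2):
    for tau in (0.0, 0.05, 0.3):
        lhs = kxx_a10(t, tau, F0, k_uu, k_uu)
        rhs = kxx_a14(t, tau, F0, k_uu)
        assert abs(lhs - rhs) < 1e-12
```

Evaluating `kxx_a14` at $\tau = 0$ for different $t$ shows the variance of $x(t)$ swinging between its maximum and zero over a quarter carrier period, which is the cyclostationary behavior described above.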
In modeling an additive noise process $n(t)$ on received signals, we often use the white-noise assumption wherein $k_{nn}(\tau) = N_0\,\delta(\tau)$. If we let $r(t) + j\,s(t)$ be the complex envelope of the process relative to any $f_0$ which is significantly larger than the passband width of the signals, then the white-noise process is equivalently modeled by I and Q processes whose correlation functions are given by

$$k_{rs}(\tau) = 0. \tag{A-15}$$
L. E. Franks (S'48-M'61-SM'71-F'77) was born in San Mateo, CA, on November 8, 1931. He received the B.S. degree in electrical engineering from Oregon State University, Corvallis, in 1952, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1953 and 1957, respectively.

In 1958 he joined Bell Laboratories, Murray Hill, NJ, working on filter design and signal analysis problems. He moved to Bell Laboratories, North Andover, MA, in 1962 to serve as Supervisor of the Data Systems Analysis Group. In 1969 he became a Faculty Member at the University of Massachusetts, Amherst, where he is currently Professor of Electrical and Computer Engineering and is engaged in research and teaching in signal processing and communication systems. He served as Chairman of the Department between 1975 and 1978. He was Academic Visitor at Imperial College, London, England, in 1979.

Dr. Franks is the author of the textbook Signal Theory (Englewood Cliffs, NJ: Prentice-Hall, 1969). He is a member of the URSI Commission C, the Communication Theory and Data Communication Systems Committees of the IEEE Communications Society, and currently is the Associate Editor for Communications for the IEEE TRANSACTIONS ON INFORMATION THEORY.
Tamed Frequency Modulation, A Novel Method to Achieve Spectrum Economy in Digital Transmission

FRANK de JAGER AND CORNELIS B. DEKKER, MEMBER, IEEE

Abstract- This paper describes a new type of frequency modulation, called Tamed Frequency Modulation (TFM), for digital transmission. The desired constraint of a constant envelope signal is combined with a maximum of spectrum economy, which is of great importance, particularly in radio channels. The out-of-band radiation is substantially less as compared with other known constant envelope modulation techniques. With synchronous detection, a penalty of only 1 dB in error performance is encountered as compared with four-phase modulation. The idea behind TFM is the proper control of the frequency of the transmitter oscillator, such that the phase of the modulated signal becomes a smooth function of time with correlative properties. Simple and flexible implementation schemes are described.
I. INTRODUCTION

In the last fifteen years numerous modulation systems for efficient digital transmission via telephone lines have been introduced. In almost all cases the resultant modulated signals exhibit amplitude variations. Such systems are implemented with linear amplifiers and linear modulators. For radio communication constant-envelope modulated signals are preferable due to existing system constraints in power economy and the consequent use of non-linear power amplifiers. Quite naturally this leads to the use of frequency modulation.

However, the spectrum of an FM signal is relatively wide. In order to narrow the spectrum, a channel filter with a precisely prescribed attenuation and phase characteristic may be used [...]

[...] some parameters of the transfer characteristic on the TFM spectrum will be investigated. The description of the receiver and the calculation of the receiving filters follows next. It will then be shown that the implementation can be simple and flexible. The last section presents a further evaluation to obtain other types of synchronous frequency modulation.

II. THE CONCEPT OF TFM

To start with, we will consider four-phase modulation ($4\phi$), well known from digital transmission via telephone lines. With this modulation method information can be transmitted efficiently in a relatively narrow bandwidth. In the two-dimensional representation of its signal space diagram in Fig. 1(a), four signal points are defined on a circle. Every two bits of the incoming binary data stream are encoded into one of these points. Between two sampling moments, the phasor $v(t)$, representing the modulation, moves from one point to another. The way in which this takes place determines the spectrum width of the modulated signal. The magnitude of the phasor is often deliberately not kept constant between the sampling moments, in order to reduce the spectrum width. However, as mentioned in the introduction, we prefer a constant amplitude for our communication applications.

The coherent orthogonal demodulator is the optimum receiver structure for $4\phi$ [...]
Reprinted from IEEE Transactions on Communications, vol. COM-26, no. 5, May 1978.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. Signal space constellations of a $4\phi$ signal (a), an MSK signal (b) and a TFM signal (c).

Fig. 2. Phase behavior of MSK (- - -), MSK with sinusoidal smoothing (. . . .) and TFM (solid line).

Fig. 3. Power spectral density functions of MSK (- - -), MSK with sinusoidal smoothing (. . . .) and TFM (solid line). Vertical axis: PSD × $f_{bit}$ (dB); horizontal axis: $|f - f_c| / f_{bit}$.

In the literature considerable attention has been devoted to finding the optimal smoothing, in the sense of giving a PSDF which is as narrow as possible. Amoroso [4] used the sinusoidal smoothing represented by the dotted line in Fig. 2. The sharp edges have disappeared but the maximum slope of the phase path has considerably increased. The corresponding PSDF, given in Fig. 3 by the dotted line, therefore shows but little improvement.

We found that it is possible to obtain a considerably better spectrum efficiency by prescribing a phase path as depicted by the solid line in Fig. 2. We named this modulation method Tamed Frequency Modulation (TFM). The phase path is characterized by the fact that the values of the phase at the successive sampling moments $t = (m-1)T,\ mT,\ (m+1)T, \ldots$ are obtained from the data via a different code rule from that for [...]

$$a(t) = \sum_{n=-\infty}^{\infty} a_n\,\delta(t - nT), \quad \text{with } a_n = +1 \text{ or } -1, \tag{1}$$

then for MSK

$$\phi(mT + T) - \phi(mT) = (\pi/2) \cdot a_m \tag{2}$$

and for TFM

$$\phi(mT + T) - \phi(mT) = (\pi/2) \cdot (a_{m-1}/4 + a_m/2 + a_{m+1}/4). \tag{3}$$

III. SPECIFICATION OF THE PREMODULATION FILTER

The wanted fluctuations of the phase as a function of time are obtained when the signal is applied to a frequency modulator as shown in Fig. 4. The premodulation filter $G(\omega)$ has to shape the data signal $a(t)$ so as to obtain the wanted smooth phase fluctuations. Let the deviation sensitivity of the modulator be $K_0$ radians/volt/second. With the definition of $a(t)$ in (1) and the impulse response $g(t)$ of the filter, the phase can be written as

$$\phi(t) = K_0 \sum_{n=-\infty}^{\infty} a_n\,x(t - nT) + C, \tag{4}$$

where

$$x(t) = \int_{-\infty}^{t} g(\tau)\,d\tau$$

and $C$ is a constant. At the sampling moment $t = mT$ the phase becomes

$$\phi(mT) = K_0 \sum_{n=-\infty}^{\infty} a_n\,x(mT - nT) + C$$

and

$$\phi(mT + T) - \phi(mT) = K_0 \sum_{n=-\infty}^{\infty} a_n\,[x(mT + T - nT) - x(mT - nT)]$$

or

$$\phi(mT + T) - \phi(mT) = K_0 \sum_{l=-\infty}^{\infty} a_{m-l}\,[x(lT + T) - x(lT)]. \tag{5}$$

Fig. 5. Network with a transfer characteristic $S(\omega)$.
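The difference between the code rules (2) and (3) is easy to see on a short data sequence. The sketch below is only an illustration: the function names and the sample sequence are assumptions, and the TFM steps are computed only where both neighbouring symbols are available.

```python
import math


def msk_phase_steps(a):
    """Phase change per bit interval for MSK, code rule (2)."""
    return [(math.pi / 2) * a_m for a_m in a]


def tfm_phase_steps(a):
    """Phase change per bit interval for TFM, code rule (3).

    Needs a[m-1] and a[m+1], so steps are returned for m = 1 .. len(a) - 2.
    """
    return [
        (math.pi / 2) * (a[m - 1] / 4 + a[m] / 2 + a[m + 1] / 4)
        for m in range(1, len(a) - 1)
    ]


# Hypothetical data sequence of +1/-1 symbols.
data = [-1, -1, -1, +1, -1, +1, +1, +1]

msk = msk_phase_steps(data)
tfm = tfm_phase_steps(data)

# Express the TFM steps in units of pi/4 to make the allowed values visible.
tfm_in_quarter_pi = [round(step / (math.pi / 4)) for step in tfm]
print(tfm_in_quarter_pi)  # [-2, -1, 0, 0, 1, 2]
```

MSK always moves the phase by a full $\pm\pi/2$, whereas the TFM steps take the values $\pm\pi/2$, $\pm\pi/4$ or $0$ depending on the neighbouring symbols, which is what smooths the phase path.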
From (3) it can be seen that the right-hand side of (5) can be equal to $\pm\pi/2$, $\pm\pi/4$ or $0$ radians. Writing the code rule in (3) in more detail as

$$\phi(mT + T) - \phi(mT) = (\pi/2) \cdot (\ldots + a_{m-2} \cdot 0 + a_{m-1}/4 + a_m/2 + a_{m+1}/4 + a_{m+2} \cdot 0 + \ldots), \tag{6}$$

then, with (5), we obtain

$$x(lT + T) - x(lT) = \begin{cases} \pi/(8K_0) & \text{for } l = 1, \\ \pi/(4K_0) & \text{for } l = 0, \\ \pi/(8K_0) & \text{for } l = -1, \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$

Combining this with the definition of $x(t)$ in (4) we obtain

$$\int_{lT}^{(l+1)T} g(t)\,dt = \begin{cases} \pi/(8K_0) & \text{for } l = 1, \\ \pi/(4K_0) & \text{for } l = 0, \\ \pi/(8K_0) & \text{for } l = -1, \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$

The relation in (8) gives the condition for the impulse response $g(t)$ needed to ensure that the phase goes through the values at the sampling moments shown in Fig. 2 for TFM. This condition is certainly fulfilled if $g(t)$ is derived from a single pulse $h(t)$, satisfying the third Nyquist criterion* [6], [7], by simple scaling and delay operations with a network $S(\omega)$ as is shown in Fig. 5. The transfer characteristic of this network is given by

$$S(\omega) = [\pi/(8K_0)] \cdot e^{-j\omega T} + [\pi/(4K_0)] + [\pi/(8K_0)] \cdot e^{j\omega T} = [\pi/(2K_0)] \cdot \cos^2(\omega T/2). \tag{9}$$

The overall shape factor $G(\omega)$ of the premodulation filter can now be written as

$$G(\omega) = H(\omega) \cdot S(\omega) = [\pi/(2K_0)] \cdot H(\omega) \cdot \cos^2(\omega T/2) \tag{10}$$

where $H(\omega)$ applies to any filter giving a pulse satisfying the third Nyquist criterion.

* An impulse response $h(t)$ satisfies the third Nyquist criterion if, for any integer $l$:

$$\int_{(2l-1)T/2}^{(2l+1)T/2} h(t)\,dt = \begin{cases} 1 & \text{for } l = 0, \\ 0 & \text{otherwise.} \end{cases}$$

IV. DIFFERENT TFM SYSTEMS

In the previous sections we have seen that the combination of a premodulation filter $G(\omega)$, defined in (10), and a frequency modulator with deviation sensitivity $K_0$, gives rise to a phase behavior as given in (5), thus yielding a TFM system. The filter $S(\omega)$, being a part of $G(\omega)$, determines the total amount of phase increase or decrease during a sampling interval, while the other part of $G(\omega)$, namely $H(\omega)$, prescribes the phase path from $\phi(mT)$ to $\phi(mT + T)$. In this section we will look at the influence of different functions for $H(\omega)$ on the TFM spectrum. Generally, $H(\omega)$ can be written as [7]:

$$H(\omega) = [(\omega T)/(2 \sin(\omega T/2))] \cdot N_1(\omega), \tag{11}$$

where $N_1(\omega)$ is the Fourier spectrum of a function satisfying the first Nyquist criterion. A class of Nyquist characteristics which has been extensively used and studied is the raised-cosine characteristic [1]:

$$N_1(\omega) = \begin{cases} 1 & \text{for } 0 \le |\omega| \le \pi(1-\alpha)/T, \\ \tfrac{1}{2}\,[1 - \sin((T|\omega| - \pi)/2\alpha)] & \text{for } \pi(1-\alpha)/T \le |\omega| \le \pi(1+\alpha)/T, \\ 0 & \text{otherwise.} \end{cases} \tag{12}$$
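Equations (9) to (12) can be collected into a few small functions. This is a minimal sketch with $T$ and $K_0$ normalised to 1 by assumption; the function names are illustrative and do not come from the paper. It checks that the three-tap form of $S(\omega)$ equals the closed form in (9) and builds $G(\omega)$ from (10) to (12).

```python
import cmath
import math


def s_taps(w, T=1.0, K0=1.0):
    """Three-tap form of S(w) from (9): weights pi/8K0, pi/4K0, pi/8K0."""
    c = math.pi / (8.0 * K0)
    return c * cmath.exp(-1j * w * T) + 2.0 * c + c * cmath.exp(1j * w * T)


def s_closed(w, T=1.0, K0=1.0):
    """Closed form of S(w) from (9)."""
    return (math.pi / (2.0 * K0)) * math.cos(w * T / 2.0) ** 2


def n1_raised_cosine(w, alpha, T=1.0):
    """Raised-cosine Nyquist characteristic (12) with roll-off factor alpha."""
    aw = abs(w)
    if aw <= math.pi * (1.0 - alpha) / T:
        return 1.0
    if aw <= math.pi * (1.0 + alpha) / T:
        return 0.5 * (1.0 - math.sin((T * aw - math.pi) / (2.0 * alpha)))
    return 0.0


def g_shape(w, alpha, T=1.0, K0=1.0):
    """Premodulation filter G(w) = H(w) S(w), using (10), (11) and (12)."""
    n1 = n1_raised_cosine(w, alpha, T)
    if n1 == 0.0:
        return 0.0
    half = w * T / 2.0
    # (wT/2) / sin(wT/2) tends to 1 as w tends to 0.
    ratio = 1.0 if abs(half) < 1e-12 else half / math.sin(half)
    return ratio * n1 * s_closed(w, T, K0)


for w in (0.0, 0.4, 1.7, 3.0):
    assert abs(s_taps(w) - s_closed(w)) < 1e-12
```

With these definitions `g_shape(0.0, alpha)` equals $\pi/(2K_0)$ for any roll-off factor, and the response vanishes beyond $|\omega| = \pi(1+\alpha)/T$.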
The only variable left is the roll-off factor $\alpha$ ($0 \le \alpha \le 1$). For various values of $\alpha$ the PSDF's of the corresponding TFM systems will be calculated. In following the calculation method of Garrison [5], we truncate the length of the impulse response $g(t)$ of the network $G(\omega)$ to five symbol intervals ($5T$), and we henceforth approximate the impulse response with eight samples per bit interval $T$. If Garrison's method is applied to a system in which $G(\omega)$ is implemented by analog means, then the results incorporate an approximation error. However, for a digital implementation this error does not occur.
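The truncation to $5T$ with eight samples per bit can be tried out numerically. The sketch below is an illustration under stated assumptions, not Garrison's method: it takes $\alpha = 0$, normalises $T = K_0 = 1$, uses the zero-phase form $G(\omega) = [\pi/(2K_0)] \cdot [(\omega T/2)/\sin(\omega T/2)] \cdot \cos^2(\omega T/2)$ on $|\omega| \le \pi/T$, and obtains $g(t)$ by a simple midpoint-rule inverse Fourier integral. Because this $g(t)$ is centred on $t = 0$, the bit windows are centred on $lT$ as in the third Nyquist criterion; the areas should then come out close to $\pi/(4K_0)$ for $l = 0$, $\pi/(8K_0)$ for $l = \pm 1$ and $0$ for $l = \pm 2$.

```python
import math

T = 1.0   # bit interval, normalised (assumption of this sketch)
K0 = 1.0  # deviation sensitivity, normalised (assumption of this sketch)


def G(w):
    """Shape factor for alpha = 0 inside the band |w| <= pi/T."""
    half = w * T / 2.0
    return (math.pi / (2.0 * K0)) * (half / math.sin(half)) * math.cos(half) ** 2


def g(t, n=2000):
    """Impulse response by a midpoint-rule inverse Fourier integral.

    G(w) is real and even, so g(t) = (1 / 2pi) * integral of G(w) cos(wt) dw.
    With an even number of intervals no midpoint falls on w = 0.
    """
    h = (2.0 * math.pi / T) / n
    acc = 0.0
    for k in range(n):
        w = -math.pi / T + (k + 0.5) * h
        acc += G(w) * math.cos(w * t)
    return acc * h / (2.0 * math.pi)


SAMPLES_PER_BIT = 8
SPAN_BITS = 5  # truncation length 5T
dt = T / SAMPLES_PER_BIT

# Sample instants at the centres of the 40 sub-intervals of [-2.5T, 2.5T].
times = [-SPAN_BITS * T / 2.0 + (i + 0.5) * dt
         for i in range(SAMPLES_PER_BIT * SPAN_BITS)]
taps = [g(t) for t in times]


def bit_area(l):
    """Approximate integral of g(t) over the bit window centred on l*T."""
    first = (l + SPAN_BITS // 2) * SAMPLES_PER_BIT
    return sum(taps[first:first + SAMPLES_PER_BIT]) * dt


areas = {l: bit_area(l) for l in range(-2, 3)}
```

The 40 values in `taps` are the kind of coefficient set a digital premodulation filter would store; a longer span gives a closer match to the untruncated response at the cost of more hardware.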
Fig. 6. PSDF's of the TFM signal with different roll-off factors in the transfer characteristic of the premodulation filter. Vertical axis: PSD × $f_{bit}$ (dB); horizontal axis: $|f - f_c| / f_{bit}$.

Fig. 7. Impulse response $g(t)$ of the premodulation filter.
The resulting power spectral density functions**, plotted in Fig. 6, show no great improvement of the out-of-band radiation when the roll-off factor $\alpha$ is made smaller than 0.25.

A certain truncation of the length of the impulse response $g(t)$ also has to be accepted, so as to keep the amount of hardware for the implementation of $G(\omega)$ small. In the following, the PSDF of TFM is calculated three times for different truncation lengths ($3T$, $5T$, $7T$). For $H(\omega)$ we have chosen a filter with the smallest bandwidth ($\alpha = 0$) [8]:

$$H(\omega) = \begin{cases} \omega T/(2 \sin(\omega T/2)) & \text{for } |\omega| \le \pi/T, \\ 0 & \text{otherwise.} \end{cases} \tag{13}$$

The shape factor of $G(\omega)$ with this $H(\omega)$ is

$$G(\omega) = \begin{cases} [\pi \omega T/(4K_0 \cdot \sin(\omega T/2))] \cdot \cos^2(\omega T/2) & \text{for } |\omega| \le \pi/T, \\ 0 & \text{otherwise} \end{cases} \tag{14}$$

and the corresponding $g(t)$ is shown in Fig. 7. The PSDF's of TFM systems with different truncation lengths of $g(t)$, depicted in Fig. 8, show a considerable reduction of the out-of-band radiation if the length is increased. When we draw the PSDF of the $7T$-version of TFM in Fig. 3 (solid line), we see finally the great improvement obtained by TFM in comparison with MSK and MSK with sinusoidal smoothing.

** The 0 dB reference is a constant but arbitrary value throughout the paper.

Fig. 8. PSDF's of the TFM signal with different truncations ($3T$, $5T$, $7T$) of the length of the impulse response of the premodulation filter.

V. RECEIVER STRUCTURE

From the signal space diagram of TFM in Fig. 1(c) it can be seen that an orthogonal coherent demodulator can be used as
receiver (Fig. 9). The incoming TFM signal $\sin(\omega_c t + \phi(t))$ is multiplied by respectively $\sin(\omega_c t)$ and $\cos(\omega_c t)$. This results in baseband signals $\cos[\phi(t)]$ and $\sin[\phi(t)]$ which follow from the phase function in Fig. 2. The observed eye patterns are shown in Fig. 10(a) and these signals arrive at the input of the low-pass filters $A_1(\omega)$ and $A_2(\omega)$, together denoted as $A_{1,2}(\omega)$. These filters provide for minimization of the error probability. Finally, a decoder of the same configuration as used by de Buda [3] in his MSK demodulator can produce the output data signal. In the MSK case, the demodulated signals in de Buda's decoder can only have the amplitude 1 at the sampling moments, while in TFM the amplitude can be 1 or 0.707. This is due to the smoothed phase path occurring in TFM, which however does not affect the polarity of the demodulated signals. This can be shown in a straightforward way by comparing the phase paths of both FFSK and TFM (Fig. 2). The deterioration of the eye opening of TFM will cause a somewhat higher error probability, but in this section we will see that the penalty is only 1 dB as compared with $4\phi$-modulation. The question of error propagation and the need for differential encoding for TFM are the same as for MSK [3]. Also, carrier recovery and clock recovery for TFM and MSK are quite similar. We will come back to this in the following section.

Fig. 9. Orthogonal coherent receiver structure.

Fig. 10. Observed eye patterns of the demodulated signals (a, b) and eye patterns of the approximating signals (c).

The low-pass filters $A_{1,2}(\omega)$ need some special attention. The filters have to minimize the error probability. This can be done by simply minimizing the noise variance $\sigma^2$ in the output signal, if the intersymbol interference is taken to be zero. The solution to this problem is described in the following. If we suppose for the moment that the transfer characteristics from the data input of the transmitter to the inputs of the filters can be regarded to be the transfer characteristics of two linear networks $I_1(\omega)$ and $I_2(\omega)$, then the optimum receiver filters can be found according to Lucky et al. [1]. Since it is not at all obvious that $I_1(\omega)$ and $I_2(\omega)$ for TFM exist and, even so, how they should be derived analytically, we make approximating descriptions, which considerably simplifies this problem. The observed eye patterns of Fig. 10(a) are schematically redrawn in Fig. 10(b). The eye opening at the sampling moment can be seen to be $\sqrt{2}$ ($\approx 1.4$). Two slightly different signals $i_1'(t)$ and $i_2'(t)$ with the same opening are shown in Fig. 10(c). These signals can be constructed by superimposing impulse responses $z(t)$, which can be written as:

$$z(t) = \begin{cases} [\sqrt{2}\,(1 + \cos(\pi t/(2T)))/4] & \text{for } |t| \le 2T, \\ 0 & \text{otherwise.} \end{cases} \tag{15}$$

Now the linear expressions for $i_1'(t)$ and $i_2'(t)$ can be given as

$$i_1'(t) = \sum_{p=-\infty}^{\infty} u_p \cdot \{z(t - 2Tp)\}, \tag{16a}$$

where $u_p = +1$ or $-1$, and

$$i_2'(t) = \sum_{p=-\infty}^{\infty} w_p \cdot \{z(t - T - 2Tp)\}, \tag{16b}$$

where $w_p = +1$ or $-1$.

The corresponding shape factors $I_1'(\omega)$ and $I_2'(\omega)$ are

$$I_1'(\omega) = I_2'(\omega) = I'(\omega) = [\sin(2\omega T)] / [2\omega T\,(1 - (2\omega T/\pi)^2)]. \tag{17}$$

The power spectral density function $[I'(\omega)]^2$ of the signals in (16a) and (16b) is shown in Fig. 11 and approximates the PSDF, $[I(\omega)]^2$, derived from Fig. 8, of the real input signals. With this $I'(\omega)$ rather than $I_1(\omega)$ and $I_2(\omega)$ a useful, nearly optimum filter $A_1(\omega) = A_2(\omega) = A(\omega)$ can easily be found, leading to the transfer characteristic which is shown in Fig. 12. Its equivalent noise bandwidth $\omega_N$ is found to be:

$$\omega_N = \int_0^{\infty} A^2(\omega)\,d\omega \approx 0.7\pi/T. \tag{18}$$
With this approximate filter the calculated bit error probability is only 1 dB worse for TFM (Fig. 13) in comparison with $4\phi$ modulation as calculated by Lucky et al. [1]. In practice the difference will be even smaller because ideal rectangular low-pass filters, used by Lucky et al. in their description, are not practically feasible.

One final observation has to be made. The foregoing calculation of the filters is based on the assumption of white noise, but in practice the out-of-band radiation of transmitters in neighboring channels has to be added and the total disturbing signal received cannot be interpreted as white noise. If this effect is non-negligible the filter $A(\omega)$ has to be re-optimized.

VI. IMPLEMENTATION

a) Transmitter

In this section we give two different implementation diagrams, each having its own advantages and disadvantages. The diagram of the transmitter in Fig. 4 is the basis for the first implementation. In the description of the TFM signal we have
Fig. 11. PSDF's of the demodulated signals and the approximate signals.

Fig. 12. Transfer characteristic of the low-pass filter $A(\omega)$.

Fig. 13. Error probability curves for $4\phi$ modulation and TFM with ideal recovered carrier and clock. The variable $\eta$ is the signal-to-noise ratio at the input of the receiver in a bandwidth $1/T$. Horizontal axis: $10 \cdot \log(\eta)$ (dB).
assumed the center frequency $\omega_c$ and the deviation sensitivity $K_0$ to be invariant. In practice, however, they are insufficiently constant. Extra measures have to be taken, as shown in Fig. 14(a), to keep these parameters at the prescribed values. An adder is inserted for control of the center frequency, while a multiplier can be used to keep the deviation of the frequency modulator at the correct value when $K_0$ varies. Both are controlled by means of a detector. This circuit has to generate the two control signals to make the phase $\phi(t)$ in the output TFM signal $\sin[\omega_c t + \phi(t)]$ pass through the prescribed values at the sampling moments, as shown in Fig. 2. In addition, the center frequency has to be kept at the specified value. The detector therefore receives as input parameters the input data signal $a(t)$, the sampling moments and the value of the center frequency. These parameters give the information about the center frequency needed and the increase or decrease of the phase per bit interval. The control circuit can be thought of as consisting of two cooperating phase-locked loops. The analytical optimization of this system is difficult since it is a two-dimensional control process.

By combining the filters $H(\omega)$ and $S(\omega)$, the premodulation filter can be easily implemented by means of a digital filter [9]. An extra low-pass filter H(ω) has to be added to reject spurious signals around multiples of the sampling frequency $f_s$ of the digital filter. Fig. 14(b) shows the implementation of the control circuit and the filters. The advantage of this type of TFM transmitter is that the output TFM signal can exactly meet the constant amplitude condition. A disadvantage is the presence of a feedback system which might cause instabilities.

Fig. 14. Diagram of the TFM transmitter with control circuit (a) and the implementation (b) without the oscillator.

The other type of transmitter, which is shown in Fig. 15(a), is based on a quadrature modulator. Two signals $\sin[\phi(t)]$ and $\cos[\phi(t)]$ are fed from a network E to two product modulators operating in quadrature. It will be seen that the output signal is $\sin[\omega_c t + \phi(t)]$, i.e., the wanted TFM signal. This signal can thus be applied to a class-C power amplifier, without introducing extra out-of-band radiation. A more detailed implementation diagram is given in Fig. 15(b). The data $a_{n+3}$ enter a shift register with a length of $q$ bits. The value $q$ corresponds with the number of bit intervals to which the length of the impulse response $g(t)$ is truncated ($q \ge 3$), as described in the previous section. For the moment we have taken $q = 5$. From equation (3) it can be seen that the difference in phase
between two sampling moments does not exceed $\pm\pi/2$ radians. The cross-over to another quadrant takes place at the sampling moments. Within each quadrant, the phase path is completely determined by the impulse response $g(t)$, truncated over $5T$, and the values $a_{n-2}$, $a_{n-1}$, $a_n$, $a_{n+1}$ and $a_{n+2}$ which are present in the shift register. Moreover, it is necessary to remember in which quadrant the phase path is located. From equation (3) it can be deduced that the phase shifts to a following quadrant in the phase diagram if two successive data symbols have the same value $+1$. It shifts in the opposite direction if the value is $-1$. It remains in the original quadrant if the two data symbols do not have the same value. A modified up/down counter, here called quadrant counter Q, can perform this task. The number of the quadrant is represented by two output bits. The 7 bits, 2 for the quadrant information and 5 for the phase path, form the address for two digital memories called the sine table and the cosine table. The $\sin[\phi(t)]$ and $\cos[\phi(t)]$, corresponding to the 7 bits, are stored in these memories. The size of the memories increases with $q$.

These "tables" are read out with a sampling frequency $f_s$. Generally speaking, $f_s = 1/T_s = L \cdot f_{bit}$, where $L$ is an interpolation factor [10]. The sampled values $\sin[\phi(nT + mT_s)]$ and $\cos[\phi(nT + mT_s)]$, with $m = 0, 1, 2, \ldots, L - 1$, are supplied via digital-to-analog converters and low-pass filters $B'(\omega)$ to the modulators. The accuracy of the converters is limited. This means that we get some distortion which can be considered as noise. At the moment, the accuracy required of the D/A converters for a certain permissible amount of out-of-band radiation in a neighboring channel has been determined heuristically.

Fig. 15. Basic diagram of the TFM transmitter without a feedback system (a) and a more detailed diagram (b).

Fig. 16. Implementation of the digital signal processing part E of the TFM transmitter in Fig. 15.

The implementation of the digital signal processing part described above is shown in Fig. 16.
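The quadrant counter and the 7-bit table address described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the function names, the choice of which two successive symbols drive the counter, the mapping of $+1$/$-1$ to bits and the bit ordering are not given in the text.

```python
def next_quadrant(q, a_prev, a_cur):
    """Update the quadrant number (0..3) from two successive data symbols.

    Both +1: move to the following quadrant; both -1: move the other way;
    different values: stay in the same quadrant.
    """
    if a_prev == 1 and a_cur == 1:
        return (q + 1) % 4
    if a_prev == -1 and a_cur == -1:
        return (q - 1) % 4
    return q


def table_address(q, window):
    """Form a 7-bit address: 2 quadrant bits followed by 5 data bits.

    `window` holds the five symbols in the shift register; the mapping
    +1 -> 1, -1 -> 0 and the bit order are assumptions of this sketch.
    """
    bits = 0
    for a in window:
        bits = (bits << 1) | (1 if a == 1 else 0)
    return (q << 5) | bits


# Hypothetical run over a short data sequence.
data = [1, 1, -1, -1, -1, 1, -1, 1, 1]
q = 0
addresses = []
for n in range(2, len(data) - 2):
    q = next_quadrant(q, data[n - 1], data[n])
    addresses.append(table_address(q, data[n - 2:n + 3]))
```

Each address would select one stored segment of $\sin[\phi(t)]$ and $\cos[\phi(t)]$; with $q = 5$ register bits this gives $2^7 = 128$ entries per table, and the count doubles for every extra bit of truncation length.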
The accuracy of the D/A converters is eight bits and the interpolation factor $L = 8$.

In order to suppress the spurious signals around $\omega_c \pm 2\pi f_s \cdot r$ ($r$ integer) in the modulated signal, low-pass filters $B'(\omega)$ are used. The group delays of these filters should be frequency-invariant and equal in the pass-band. If the cut-off frequency is too low, unwanted variations of the amplitude and phase of the TFM signal will occur. If the interpolation factor $L$ is taken to be large enough, e.g., eight or sixteen, an acceptable cut-off frequency may be $4 \cdot f_{bit}$ or $8 \cdot f_{bit}$.

The two parts of the quadrature modulator should have the same flat amplitude and phase characteristic in the frequency band concerned. If they are not the same, amplitude variations and unwanted phase variations will occur, which cannot be eliminated by linear means. For 70 MHz, at which we implemented our TFM system, inequalities of 0.5% have to be taken into account.

The power spectral density function measured at the output of the TFM transmitter, shown in Figs. 15 and 16, is given in Fig. 17 and shows good similarity to the corresponding calculated power spectral density function in Fig. 8 ($5T$ version). The flat part in the measured spectrum for $|(f - f_c)/f_{bit}| > 1$ is caused by the distortion of the 8-bit D/A converters. The advantage of this type of transmitter is the absence of a feedback system. Its disadvantage is the occurrence of small amplitude variations in practice.

b) Receiver
In the implementation of the receiver according to Fig. 9, a carrier recovery and a clock recovery system have to be provided. These can be similar to the ones suggested by de Buda [3]. The generation of clock and phase reference is based on the fact that the maximal amount of phase change per symbol interval $T$ for both TFM and MSK equals $\pm\pi/2$ radians. Using a frequency doubling circuit, two discrete frequencies are generated. The difference of the frequencies corresponds to the clock frequency and the sum corresponds to four times the carrier frequency.

VII. FURTHER EVALUATION

In the previous sections we have seen that the use of a system according to Fig. 4, where the premodulation filter is defined by $G(\omega) = H(\omega) \cdot S(\omega) = [\pi/(2K_0)] \cdot H(\omega) \cdot \cos^2(\omega T/2)$, gives rise to a TFM signal. The influence of different functions for $H(\omega)$ on the outgoing TFM spectrum has been investigated. In this section we compare the spectrum of TFM with spectra of other types of synchronous frequency modulation, which also exhibit a constant amplitude. Originally for $H(\omega)$ the function of equation (13) was used, giving rise to a continuous modulating signal. Now, for $H(\omega)$ a different function is chosen,

$$H(\omega) = \frac{2 \sin(\omega T/2)}{\omega T}, \qquad -\infty < \omega < \infty, \tag{19}$$

producing a non-continuous modulating signal. In this case the impulse response $h(t)$ is of rectangular form and obviously satisfies the third Nyquist criterion. To clearly distinguish both cases, we talk about keying versus modulation (Table I). Several of these systems have been described in literature before, e.g., Fast Frequency Shift Keying [2, 3], Duobinary Frequency Shift Keying m = 0.5 [5, 11] and Duobinary Frequency Modulation m = 0.5 [5]***. In Fig. 18 the phase behaviors for the three frequency shift keying systems are depicted for a certain data stream. In each of these systems the phase changes linearly with time between the transition points. The outgoing PSDF's of these systems are given in Fig. 19, showing a slight improvement in spectrum economy when a high-order correlative coding is used.

*** It should be noted that in literature the name Duobinary Frequency Modulation m = 0.5 does not uniquely define $H(\omega)$.

TABLE I

$H(\omega) = \sin(\omega T/2)/(\omega T/2)$:
  $S(\omega) = \pi/(2K_0)$: (a) Fast Freq. Shift Keying
  $S(\omega) = [\pi \cos(\omega T/2)]/(2K_0)$: (b) Duobinary Freq. Shift Keying, m = 0.5
  $S(\omega) = [\pi \cos^2(\omega T/2)]/(2K_0)$: (c) Tamed Freq. Shift Keying

$H(\omega) = (\omega T/2)/\sin(\omega T/2)$ for $|\omega| \le \pi/T$, 0 otherwise:
  $S(\omega) = \pi/(2K_0)$: (a') Fast Freq. Modulation
  $S(\omega) = [\pi \cos(\omega T/2)]/(2K_0)$: (b') Duobinary Freq. Modulation, m = 0.5
  $S(\omega) = [\pi \cos^2(\omega T/2)]/(2K_0)$: (c') Tamed Freq. Modulation

Fig. 17. Power spectral density function measured at the output of the TFM transmitter shown in Figs. 15 and 16, with 8-bit D/A converters, $L = 8$, truncation to 5T and $\alpha = 0$. (Horizontal axis: $|f - f_c|/f_{bit}$.)

Fig. 18. Phase behavior of FFSK or MSK (a), Duobinary FSK with m = 0.5 (b) and Tamed FSK (c). (Horizontal axis: time $t$; the input data stream is shown above the phase traces.)
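The role of the $\cos^2(\omega T/2)$ factor can be illustrated with a short numerical sketch. Since $\cos^2(\omega T/2) = \tfrac14 e^{-j\omega T} + \tfrac12 + \tfrac14 e^{j\omega T}$, the factor acts as a three-tap correlative code with weights 1/4, 1/2, 1/4 on neighbouring bits. The code below assumes that the phase change over one bit interval is $\pi/2$ times this weighted sum, a scaling chosen here only so that the largest step equals the $\pm\pi/2$ quoted above; the function names and the test data stream are illustrative and are not taken from the paper.

```python
import math


def tfm_phase_steps(bits):
    """Phase change per interior bit interval (radians).

    Assumption: step_m = (pi/2) * (a[m-1]/4 + a[m]/2 + a[m+1]/4),
    i.e. the 1/4, 1/2, 1/4 weights implied by cos^2(wT/2).
    """
    steps = []
    for m in range(1, len(bits) - 1):
        weighted = bits[m - 1] / 4 + bits[m] / 2 + bits[m + 1] / 4
        steps.append(math.pi / 2 * weighted)
    return steps


def msk_phase_steps(bits):
    """Phase change per bit interval for FFSK/MSK: always +pi/2 or -pi/2."""
    return [math.pi / 2 * b for b in bits]


# Hypothetical data stream of +1/-1 symbols.
data = [-1, -1, -1, +1, -1, +1, +1, +1, -1, -1]
tfm = tfm_phase_steps(data)
msk = msk_phase_steps(data)

print([round(s / math.pi, 3) for s in tfm])  # steps in units of pi
```

Under these assumptions an alternating pattern produces no phase change at all, while a run of equal bits produces the full $\pi/2$ step, which is the "taming" of the phase trajectory relative to MSK.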
Fig. 19. PSDF's of FFSK or MSK (a), Duobinary FSK with m = 0.5 (b), and Tamed FSK (c). (Horizontal axis: $|f - f_c|/f_{bit}$.)
According to the classification of Kretzmer [12], several different functions for $S(\omega)$ can be taken, but only the three functions used give rise to a two-level eye opening if an orthogonal coherent demodulator is used as the receiver. The spectra of these three systems can be made much narrower, however, by the use of a filter $H(\omega)$ defined in relation (13). This is shown in Fig. 20.

Fig. 20. PSDF's of Fast Frequency Modulation (a'), Duobinary Frequency Modulation with m = 0.5 (b') and Tamed Frequency Modulation (c'). In each system the truncation interval equals 5T. (Horizontal axis: $|f - f_c|/f_{bit}$; vertical axis in dB, 0 to -100.)

CONCLUSION

In this paper we have described a novel and promising type of frequency modulation, named Tamed Frequency Modulation, for digital transmission. A very low out-of-band radiation is obtained as compared with other constant-envelope modulation techniques. In this way the severe constraints of the radio field can be met with the receiving filter shown in Fig. 12 and the PSDF of TFM in Fig. 3. For example, with a radio channel spacing of 25 kHz and a required data rate of 16 kbits/s the power radiated into the adjacent channel can be 85 dB lower than the power radiated into the wanted channel. The detection quality is almost the same as for four-phase modulation. Finally, it is shown that the implementation of the TFM transmitter and receiver can be relatively simple.
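The figures quoted in this example can be put on the normalized frequency axis used in the PSDF plots with a few lines of arithmetic. The sketch below only converts the stated numbers (25 kHz spacing, 16 kbit/s, 85 dB); the variable names are illustrative.

```python
# Values quoted in the conclusion.
channel_spacing_hz = 25_000.0
bit_rate_hz = 16_000.0
suppression_db = 85.0

# Centre of the adjacent channel on the |f - f_c| / f_bit axis.
offset_in_bit_rates = channel_spacing_hz / bit_rate_hz

# 85 dB expressed as a linear power ratio (adjacent / wanted).
power_ratio = 10.0 ** (-suppression_db / 10.0)

print(offset_in_bit_rates)
print(power_ratio)
```

The adjacent channel is therefore centred about 1.56 bit rates away from the carrier, and 85 dB corresponds to a power ratio of roughly three parts in a billion.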
ACKNOWLEDGMENT

The authors would like to thank D. Muilwijk and B. van de Ham of Philips Telecommunication Industries for their contributions to the investigation.
REFERENCES

[1] R. W. Lucky, J. Salz, E. J. Weldon Jr., Principles of Data Communication, McGraw-Hill Book Company, New York, 1968.
[2] H. C. van den Elzen, P. van de Wurf, A Simple Method of Calculating the Characteristics of FSK Signals with Modulation Index 0.5, IEEE Transactions on Communications, vol. COM-20, No. 2, pp. 139-147, April 1972.
[3] R. de Buda, Coherent Demodulation of Frequency Shift Keying with Low Deviation Ratio, IEEE Transactions on Communications, vol. COM-20, No. 3, pp. 429-435, June 1972.
[4] F. Amoroso, Pulse and Spectrum Manipulation in the Minimum (Frequency) Shift Keying (MSK) Format, IEEE Transactions on Communications, vol. COM-24, No. 3, pp. 381-384, March 1976.
[5] G. J. Garrison, A Power Spectral Density Analysis for Digital FM, IEEE Transactions on Communications, vol. COM-23, No. 11, pp. 1228-1243, Nov. 1975.
[6] H. Nyquist, Certain Topics in Telegraph Transmission Theory, AIEE Trans., vol. 47, pp. 617-644, April 1928.
[7] S. Pasupathy, Nyquist's Third Criterion, Proceedings of the IEEE, vol. 62, No. 6, pp. 860-861, June 1974.
[8] W. R. Bennett, J. R. Davey, Data Transmission, McGraw-Hill Book Company, 1965.
[9] A. D. Sypherd, Design of Digital Filters Using Read-Only Memories, Proceedings of the NEC, Chicago, vol. 25, 8-10 Dec. 1969, pp. 691-693.
[10] F. A. M. Snijders, N. A. M. Verhoeckx, H. A. van Essen, P. J. van Gerwen, Digital Generation of Linearly Modulated Data Waveforms, IEEE Transactions on Communications, vol. COM-23, No. 11, pp. 1259-1270, Nov. 1975.
[11] A. Lender, A Synchronous Signal with Dual Properties for Digital Communications, IEEE Transactions on Communication Technology, vol. COM-13, No. 2, pp. 202-208, June 1965.
[12] E. R. Kretzmer, Generalization of a Technique for Binary Data Communication, IEEE Transactions on Communication Technology, vol. COM-14, No. 1, pp. 67-68, Febr. 1966.
*
Frank de Jager was born in Amsterdam, The Netherlands, on June 13, 1919. He was graduated from the Technical University of Delft in 1946, when he joined the Philips Research Laboratories in Eindhoven and was assigned to the Telecommunications Department where he worked in the fields of carrier telephony, delta modulation, vocoders, companders, data transmission, automatic equalization and, finally, in radio communication. In 1958 he received, together with Johannes A. Greefkes, the Veder-Award for work on speech transmission with low signal-to-noise ratios. In 1972, together again with Johannes A. Greefkes, he received the 1972 IEEE Award in International Communication in honor of Hernand and Sosthenes Behn for contributions to communications systems research, in particular, for inventions in the delta-modulation area.
* Cornelis B. Dekker (M'76) was born in The Netherlands, on May 23, 1950 . He received the degree in electrical engineering from the Technical University, Delft, The Netherlands, in 1973 . After his military service he joined the Telecommunications Department of the Philips Research Laboratories where he is now working on digital communication via radio channels . Mr. Dekker is a member of the Netherlands Electronics and Radio Society.
Performance Evaluation for Phase-Coded Spread-Spectrum Multiple-Access Communication-Part I: System Analysis

MICHAEL B. PURSLEY, MEMBER, IEEE

Abstract-An analysis of an asynchronous phase-coded spread-spectrum multiple-access communication system is presented. The results of this analysis reveal which code parameters have the greatest impact on communication performance and provide analytical tools for use in preliminary system design. Emphasis is placed on average performance rather than worst-case performance and on code parameters which can be computed easily.

Manuscript received January 4, 1976; revised March 17, 1977. This work was supported in part by the National Science Foundation under Grant ENG75-22621 and in part by the Joint Services Electronics Program under Contract DAAB-07-72-C-0259. A portion of this paper was presented at the International Telemetering Conference, Los Angeles, CA, September 28-30, 1976. The author is with the Coordinated Science Laboratory and the Department of Electrical Engineering, University of Illinois, Urbana, IL 61801.

I. INTRODUCTION

IN RECENT YEARS there has been increased interest in a class of multiple-access techniques known as code-division multiple access (CDMA). The CDMA techniques are those multiple-access methods in which the multiple-access capability is due primarily to coding and in which, unlike traditional time- and frequency-division multiple access, there is no requirement for precise time or frequency coordination between the transmitters in the system. CDMA techniques have been considered for a variety of satellite systems including the NASA tracking and data-relay system [17], systems to provide communication to aircraft and other mobile users [11], air traffic control systems [18], and military satellite communication systems. In certain satellite communication systems, CDMA techniques can be designed to provide multiple-access capability and, simultaneously, to reduce the effects of multipath distortion [12].

The most common form of CDMA is spread-spectrum multiple access (SSMA) in which each user is assigned a particular code sequence which is modulated on the carrier along with the digital data. The SSMA techniques are characterized by the use of a high-rate code (i.e., many code symbols per data symbol) which has the effect of spreading the bandwidth of the data signal. The two most common forms of SSMA are frequency-hopped SSMA and phase-coded SSMA. The first of these two was used in the TATS modulation system for the Lincoln Experimental Satellites and is described in detail in [11]. Phase-coded SSMA (also known as direct-sequence spread spectrum [5], [6]) utilizes the most common form of spread-spectrum modulation: the carrier is phase modulated by the digital data sequence and the code sequence.
Although phase-coded spread-spectrum modulation has been considered for a wide variety of purposes [5], [9], [12], we are concerned in this paper only with its use in achieving multiple-access capability. The main topic of this paper is phase-coded SSMA system analysis. We concentrate on communication performance rather than on acquisition and tracking performance, so that the performance measures of interest are error rate and signal-to-noise ratio.

Although various aspects of phase-coded SSMA communication were discussed in a number of publications which appeared in the mid-1960's (e.g., [1], [2], [4], [10], [16]), there were very few analytical results on asynchronous phase-coded systems and little had been done to identify the important code parameters for asynchronous phase-coded SSMA applications. Most of this work implicitly or explicitly assumed a synchronous model and therefore dealt only with the periodic cross-correlation properties of the code sequence. Further, nearly all of the results on cross-correlation properties of sequences dealt with only the periodic correlation (e.g., [8]).

One of the first detailed investigations of asynchronous phase-coded SSMA system performance which dealt with aperiodic cross-correlation effects was published in 1969 by Anderson and Wintz [3]. They obtained a bound on the signal-to-noise ratio at the output of the correlation receiver for a SSMA system with a hard-limiter in the channel. The need for considering the aperiodic cross-correlation properties of the code sequences is clearly demonstrated in their paper [3, p. 286]. Since that time, many additional results have been obtained (e.g., [12], [13], [19]) which help clarify the role of aperiodic correlation in asynchronous phase-coded SSMA communication. In this paper, we present some of these results and their implications.

II. THE PHASE-CODED SSMA SYSTEM MODEL

The SSMA system model that we will consider is shown in Figure 1 for K users.
The k-th user's data signal $b_k(t)$ is a sequence of unit amplitude, positive and negative, rectangular pulses of duration $T$. This signal represents the k-th user's binary information sequence. The k-th user is assigned a code waveform $a_k(t)$ which consists of a periodic sequence of unit amplitude, positive and negative, rectangular pulses of duration $T_c$. If $(a_j^{(k)})$ is the corresponding sequence of elements of $\{+1, -1\}$ then we write $a_k(t)$ as

$$a_k(t) = \sum_{j=-\infty}^{\infty} a_j^{(k)}\, p_{T_c}(t - jT_c)$$
Reprinted from IEEE Transactions on Communications, vol. COM-25, no. 8, August 1977.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
where $p_T(t) = 1$ for $0 \le t < T$ and $p_T(t) = 0$ otherwise. We assume that the k-th user's code sequence $(a_j^{(k)})$ has period $N = T/T_c$ so that there is one code period $a_0^{(k)}, a_1^{(k)}, \ldots, a_{N-1}^{(k)}$ per data symbol. The results presented can easily be generalized to multiple code periods per data symbol. The data signal $b_k(t)$ is modulated onto the phase-coded carrier $c_k(t)$, which is given by

$$c_k(t) = \sqrt{2P}\,\sin\!\big(\omega_c t + \theta_k + (\pi/2)\,a_k(t)\big) = \sqrt{2P}\,a_k(t)\cos(\omega_c t + \theta_k).$$

Thus, the transmitted signal for the k-th user is

$$s_k(t) = \sqrt{2P}\,\sin\!\big(\omega_c t + \theta_k + (\pi/2)\,a_k(t)\,b_k(t)\big) = \sqrt{2P}\,a_k(t)\,b_k(t)\cos(\omega_c t + \theta_k).$$

In the above expressions $\theta_k$ represents the phase of the k-th carrier, $\omega_c$ represents the common center frequency, and $P$ represents the common signal power. The results that follow can easily be modified for unequal center frequencies and power levels.

If the SSMA system is completely synchronized, then the time delays $\tau_k$ shown in the model of Figure 1 can be ignored (i.e., $\tau_k = 0$ for $k = 1, 2, \ldots, K$). This would require a common timing reference for the K transmitters and it would necessitate compensation for delays in the various transmission paths. This is generally not feasible and hence the transmitters are not time-synchronous. For asynchronous systems the received signal $r(t)$ in Figure 1 is given by

$$r(t) = n(t) + \sum_{k=1}^{K} \sqrt{2P}\,a_k(t - \tau_k)\,b_k(t - \tau_k)\cos(\omega_c t + \phi_k)$$

where $\phi_k = \theta_k - \omega_c\tau_k$ and $n(t)$ is the channel noise process (white Gaussian noise with two-sided spectral density $N_0/2$).

Fig. 1. Phase-coded spread-spectrum multiple-access system model.

If the received signal $r(t)$ is the input to a correlation receiver matched to $s_i(t)$, the output is

$$Z_i = \int_0^T r(t)\,a_i(t)\cos\omega_c t\,dt.$$

In all that follows we assume $\omega_c \gg T^{-1}$ since the frequency response of a realistic hardware implementation of the correlation receiver is such that we can then ignore the double frequency component of $r(t)\cos\omega_c t$. The condition $\omega_c \gg T^{-1}$ is always satisfied in a practical SSMA communication system. The data signal $b_k(t)$ can be expressed as

$$b_k(t) = \sum_{l=-\infty}^{\infty} b_{k,l}\, p_T(t - lT)$$

where $b_{k,l} \in \{+1, -1\}$. The output of the correlation receiver at $t = T$ is given by

$$Z_i = \sqrt{P/2}\,\Big\{ b_{i,0}T + \sum_{\substack{k=1 \\ k \ne i}}^{K} \big[ b_{k,-1}R_{k,i}(\tau_k) + b_{k,0}\hat{R}_{k,i}(\tau_k) \big]\cos\phi_k \Big\} + \int_0^T n(t)\,a_i(t)\cos\omega_c t\,dt \tag{1}$$

where $R_{k,i}$ and $\hat{R}_{k,i}$ are the continuous-time partial cross-correlation functions defined by

$$R_{k,i}(\tau) = \int_0^{\tau} a_k(t - \tau)\,a_i(t)\,dt, \qquad \hat{R}_{k,i}(\tau) = \int_{\tau}^{T} a_k(t - \tau)\,a_i(t)\,dt \tag{2}$$

for $0 \le \tau \le T$. It is easy to see that for $0 \le lT_c \le \tau \le (l + 1)T_c \le T$, these two cross-correlation functions can be written as

$$R_{k,i}(\tau) = C_{k,i}(l - N)\,T_c + \big[C_{k,i}(l + 1 - N) - C_{k,i}(l - N)\big](\tau - lT_c) \tag{3}$$

and

$$\hat{R}_{k,i}(\tau) = C_{k,i}(l)\,T_c + \big[C_{k,i}(l + 1) - C_{k,i}(l)\big](\tau - lT_c) \tag{4}$$

where the discrete aperiodic cross-correlation function $C_{k,i}$ for the sequences $(a_j^{(k)})$ and $(a_j^{(i)})$ is defined by

$$C_{k,i}(l) = \begin{cases} \sum_{j=0}^{N-1-l} a_j^{(k)} a_{j+l}^{(i)}, & 0 \le l \le N - 1 \\ \sum_{j=0}^{N-1+l} a_{j-l}^{(k)} a_j^{(i)}, & 1 - N \le l < 0 \\ 0, & |l| \ge N. \end{cases}$$

The periodic cross-correlation function $\theta_{k,i}$ is given by

$$\theta_{k,i}(l) = \sum_{j=0}^{N-1} a_j^{(k)} a_{j+l}^{(i)}$$

for any integer $l$. Notice that $\theta_{k,i}(l) = C_{k,i}(l) + C_{k,i}(l - N)$ for $0 \le l < N$. We also define

$$\hat{\theta}_{k,i}(l) = C_{k,i}(l) - C_{k,i}(l - N)$$
for $0 \le l < N$. The function $\hat{\theta}_{k,i}$ is called the odd cross-correlation function by Massey and Uhran [12] since it has the property $\hat{\theta}_{i,k}(l) = -\hat{\theta}_{k,i}(N - l)$, whereas the periodic (or even) cross-correlation function satisfies $\theta_{i,k}(l) = \theta_{k,i}(N - l)$. Both of these relationships follow from the observation that $C_{i,k}(l) = C_{k,i}(-l)$.

If $v_{k,i}(\tau_k)$ is defined by

$$v_{k,i}(\tau_k) = \big[b_{k,-1}R_{k,i}(\tau_k) + b_{k,0}\hat{R}_{k,i}(\tau_k)\big]\cos\phi_k$$

then $\sqrt{P/2}\,v_{k,i}(\tau_k)$ is the contribution of the k-th signal to the output $Z_i$ of the correlation receiver matched to $s_i(t)$. For fixed $\tau_k$, $v_{k,i}(\tau_k)$ depends only on $\phi_k$, the data symbols $b_{k,-1}$ and $b_{k,0}$, and the aperiodic cross-correlation function (or, the periodic and odd cross-correlation functions). Specifically, if $l_k$ is an integer for which $l_kT_c \le \tau_k \le (l_k + 1)T_c$ and $b_{k,0} = b_{k,-1}$, then

$$v_{k,i}(\tau_k) = b_{k,0}\big\{\theta_{k,i}(l_k)T_c + [\theta_{k,i}(l_k + 1) - \theta_{k,i}(l_k)](\tau_k - l_kT_c)\big\}\cos\phi_k. \tag{5}$$

On the other hand, if $b_{k,0} \ne b_{k,-1}$, then

$$v_{k,i}(\tau_k) = b_{k,0}\big\{\hat{\theta}_{k,i}(l_k)T_c + [\hat{\theta}_{k,i}(l_k + 1) - \hat{\theta}_{k,i}(l_k)](\tau_k - l_kT_c)\big\}\cos\phi_k. \tag{6}$$

III. SYSTEM ANALYSIS: WORST-CASE PERFORMANCE

Up to this point we have not explicitly indicated which code sequence parameters should be optimized. The ideal situation would be to find a code for which the error probabilities $\Pr(Z_i > 0 \mid b_{i,0} = -1)$ and $\Pr(Z_i < 0 \mid b_{i,0} = +1)$ are small for all values of the parameters $\tau_k$, $\phi_k$, $b_{k,-1}$, and $b_{k,0}$. It is clear from symmetry considerations that, for any code, the set of values that one of the two probabilities takes on as the parameters are varied is the same as the corresponding set for the other probability. In particular, the two probabilities have the same maximum value $P_{\max}^{(i)}$ for any given code. One code selection procedure that is often suggested is to choose the code that gives the smallest value of $P_{\max}^{(i)}$; that is, the maximum value of the error probability is minimized. This approach is open to the usual criticism of minimax methods, which is that too much emphasis is placed on the worst-case parameter values. However, the minimax approach is warranted for certain systems so we will pursue it further before suggesting an alternative.

If $b_{i,0} = -1$, $P_{\max}^{(i)}$ depends on the maximum value of the sum of the $v_{k,i}(\tau_k)$ over all $k \ne i$. From (5) and (6) it is clear that the maximum value of $v_{k,i}(\tau_k)$ is achieved when $\tau_k$ is an integer multiple of $T_c$ and when $\phi_k = 0$. That is, the maximum value of $\tilde{v}_{k,i}(\tau_k) \triangleq v_{k,i}(\tau_k)/T_c$ is of the form

$$\big[b_{k,-1}C_{k,i}(l - N) + b_{k,0}C_{k,i}(l)\big] \tag{7}$$

for $l \in \{0, 1, 2, \ldots, N - 1\}$, $b_{k,-1} \in \{+1, -1\}$, and $b_{k,0} \in \{+1, -1\}$. For a fixed $l$, this quantity has four possible values, $\pm\theta_{k,i}(l)$ and $\pm\hat{\theta}_{k,i}(l)$. The maximum of these values over all values of $l$ is $\lambda_{k,i} \triangleq \max\{\gamma_{k,i}, \hat{\gamma}_{k,i}\}$ where $\gamma_{k,i} = \max_l |\theta_{k,i}(l)|$ and $\hat{\gamma}_{k,i} = \max_l |\hat{\theta}_{k,i}(l)|$.

From the above discussion we conclude that if $b_{i,0} = -1$, the maximum error probability for the i-th receiver corresponds to the maximum value of $\tilde{v}_{k,i}(\tau_k)$ for each $k \ne i$ and that this maximum value is $\lambda_{k,i}$. The same argument can be applied for $b_{i,0} = +1$, in which case the maximum error probability corresponds to the minimum value of $\tilde{v}_{k,i}(\tau_k)$ for each $k \ne i$ and $\min \tilde{v}_{k,i}(\tau_k) = -\lambda_{k,i}$. Thus, $P_{\max}^{(i)}$ is minimized if the quantity $\Lambda_i = \sum_{k \ne i} \lambda_{k,i}$ is minimized. In fact

$$P_{\max}^{(i)} = 1 - \Phi\big([1 - (\Lambda_i/N)]\sqrt{2E/N_0}\big)$$

where $\Phi$ is the standard (i.e., zero mean, unit variance) Gaussian cumulative distribution function and $E = PT$ is the energy per data bit. We define $P_{\max} = \max_i P_{\max}^{(i)}$ and $\Lambda = \max_i \Lambda_i$, where the maximization is over $i$, and notice that

$$P_{\max} = 1 - \Phi\big([1 - (\Lambda/N)]\sqrt{2E/N_0}\big) \tag{8}$$

$$\le 1 - \Phi\big([1 - (K - 1)(\lambda/N)]\sqrt{2E/N_0}\big) \tag{9}$$

where $\lambda$ is the maximum of $\lambda_{k,i}$ over all $i$ and $k$ such that $1 \le i < k \le K$. The property $\lambda_{k,i} = \lambda_{i,k}$ was used to obtain (9). An additional code parameter that is of interest is the maximum magnitude of the aperiodic cross-correlation

$$C_c = \max\{|C_{k,i}(l)| : 1 - N \le l \le N - 1,\ 1 \le i < k \le K\}.$$

Notice that $\lambda \le 2C_c$ and hence

$$P_{\max} \le 1 - \Phi\big([1 - (K - 1)(2C_c/N)]\sqrt{2E/N_0}\big). \tag{10}$$
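The correlation functions and the worst-case quantities of this section can be checked numerically with a short sketch. It follows the definitions of $C_{k,i}$, $\theta_{k,i}$, $\hat{\theta}_{k,i}$, $\lambda_{k,i}$, and $\Lambda_i$ given above; the two length-7 sequences and the value of $E/N_0$ are arbitrary illustrations, and the function names are not from the paper.

```python
import math


def aperiodic(x, y, l):
    """C_{x,y}(l) for two +/-1 sequences of equal length N."""
    n = len(x)
    if 0 <= l <= n - 1:
        return sum(x[j] * y[j + l] for j in range(n - l))
    if 1 - n <= l < 0:
        return sum(x[j - l] * y[j] for j in range(n + l))
    return 0  # |l| >= N


def periodic(x, y, l):
    """theta_{x,y}(l) computed directly from the periodic extension."""
    n = len(x)
    return sum(x[j] * y[(j + l) % n] for j in range(n))


def even_odd(x, y, l):
    """(theta, theta_hat) at shift 0 <= l < N, built from C(l) and C(l - N)."""
    n = len(x)
    c_l, c_shift = aperiodic(x, y, l), aperiodic(x, y, l - n)
    return c_l + c_shift, c_l - c_shift


def worst_case_lambda(x, y):
    """lambda_{x,y}: largest magnitude of theta or theta_hat over all shifts."""
    pairs = (even_odd(x, y, l) for l in range(len(x)))
    return max(max(abs(t), abs(o)) for t, o in pairs)


def phi(z):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_max_user(i, codes, e_over_n0):
    """Worst-case error probability for receiver i (E/N0 as a linear ratio)."""
    n = len(codes[i])
    big_lambda = sum(worst_case_lambda(codes[k], codes[i])
                     for k in range(len(codes)) if k != i)
    return 1.0 - phi((1.0 - big_lambda / n) * math.sqrt(2.0 * e_over_n0))


# Hypothetical code sequences, chosen only for illustration.
x = [1, 1, 1, -1, -1, 1, -1]
y = [1, -1, -1, 1, -1, 1, 1]
lam = worst_case_lambda(x, y)
c_c = max(abs(aperiodic(x, y, l)) for l in range(1 - len(x), len(x)))
p_worst = p_max_user(0, [x, y], 10.0)
print(lam, c_c, p_worst)
```

For sequences this short $\Lambda_i/N$ is not small, so the resulting worst-case probability is mainly a consistency check rather than a realistic design figure.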
Expressions (8)-(10) provide upper bounds on the worst-case error probability which may be useful for certain SSMA systems. However, unless the period $N$ of the code sequences is much larger than the number of users $K$, the terms $\Lambda/N$, $\lambda(K - 1)/N$, and $2C_c(K - 1)/N$ which appear in these bounds will often be greater than unity. For this situation, not only are the bounds of no value, but also the maximum error probability itself is not a useful performance parameter. If in such situations the large cross-correlation values arise for only a few values of the delay parameters $\tau_1, \tau_2, \ldots, \tau_K$, it is more meaningful to consider the average performance rather than the worst case. Two important measures of average performance are the average error probability, which is discussed in [19], and the average signal-to-noise ratio, which is discussed in the next section.

IV. SYSTEM ANALYSIS: AVERAGE SIGNAL-TO-NOISE RATIO

In this section we present an alternative approach to phase-coded SSMA system analysis which leads to a new parameter upon which to base code-sequence selection and evaluation. In this approach we treat the phase shifts, time delays, and data symbols as mutually independent random variables. The interference terms appearing in (1) are random and are treated as additional noise. The signal-to-noise ratio, $\mathrm{SNR}_i$, at the output of the i-th correlation receiver is one of the most important performance measures that can be obtained with a reasonable amount of computation. We should point out that this signal-to-noise ratio is computed by means of probabilistic averages (expectations) with respect to the phase shifts, time delays, and data symbols. However, such averages can also be interpreted as time averages since, in practice, these variables are actually slowly varying time functions which can be modeled as stationary ergodic random processes.

As in the previous section, there is no loss of generality in assuming $\phi_i = 0$ and $\tau_i = 0$ when considering $Z_i$, the output of the i-th correlation receiver. Also, because of the symmetry involved we need consider only $b_{i,0} = +1$. The desired signal component of $Z_i$ is then $\sqrt{P/2}\,T$ while the variance of the noise component of $Z_i$ is
$$\mathrm{Var}\{Z_i\} = \frac{P}{4T}\sum_{\substack{k=1 \\ k \ne i}}^{K}\int_0^T \big[R_{k,i}^2(\tau) + \hat{R}_{k,i}^2(\tau)\big]\,d\tau + \tfrac{1}{4}N_0T = \frac{P}{4T}\sum_{\substack{k=1 \\ k \ne i}}^{K}\sum_{l=0}^{N-1}\int_{lT_c}^{(l+1)T_c} \big[R_{k,i}^2(\tau) + \hat{R}_{k,i}^2(\tau)\big]\,d\tau + \tfrac{1}{4}N_0T \tag{11}$$

where the expectation has been computed with respect to the mutually independent random variables $\phi_k$, $\tau_k$, $b_{k,-1}$, and $b_{k,0}$ for $k \ne i$. Substituting (3) and (4) into (11) and carrying out the integration gives

$$\mathrm{Var}\{Z_i\} = \frac{PT^2}{12N^3}\sum_{\substack{k=1 \\ k \ne i}}^{K} r_{k,i} + \tfrac{1}{4}N_0T \tag{12}$$

where

$$r_{k,i} = \sum_{l=0}^{N-1}\big\{C_{k,i}^2(l - N) + C_{k,i}(l - N)\,C_{k,i}(l - N + 1) + C_{k,i}^2(l - N + 1) + C_{k,i}^2(l) + C_{k,i}(l)\,C_{k,i}(l + 1) + C_{k,i}^2(l + 1)\big\}.$$

This last expression can be written in terms of the cross-correlation parameters $\mu_{k,i}(n)$ which are defined by

$$\mu_{k,i}(n) = \sum_{l=1-N}^{N-1} C_{k,i}(l)\,C_{k,i}(l + n). \tag{13}$$

Notice that

$$\mu_{k,i}(0) = \sum_{l=1-N}^{N-1} C_{k,i}^2(l) = \sum_{l=0}^{N-1}\big[C_{k,i}^2(l - N) + C_{k,i}^2(l)\big] = \sum_{l=0}^{N-1}\big[C_{k,i}^2(l - N + 1) + C_{k,i}^2(l + 1)\big]$$

and

$$\mu_{k,i}(1) = \sum_{l=1-N}^{N-1} C_{k,i}(l)\,C_{k,i}(l + 1) = \sum_{l=0}^{N-1}\big[C_{k,i}(l - N)\,C_{k,i}(l - N + 1) + C_{k,i}(l)\,C_{k,i}(l + 1)\big].$$

Therefore,

$$r_{k,i} = 2\mu_{k,i}(0) + \mu_{k,i}(1). \tag{14}$$

The signal-to-noise ratio is $\sqrt{P/2}\,T$ divided by the rms noise $\sqrt{\mathrm{Var}\{Z_i\}}$, which is

$$\mathrm{SNR}_i = \Big\{(6N^3)^{-1}\sum_{\substack{k=1 \\ k \ne i}}^{K} r_{k,i} + \frac{N_0}{2E}\Big\}^{-1/2}. \tag{15}$$

It is shown in [14] that $\mu_{k,i}(n)$ can be computed directly from the aperiodic autocorrelation functions for $(a_j^{(k)})$ and $(a_j^{(i)})$. Thus, the signal-to-noise ratio can be evaluated without knowledge of the cross-correlation functions. Note that for $K = 1$, (15) reduces to $\mathrm{SNR}_i = \sqrt{2E/N_0}$, which has associated error probability $P_e = 1 - \Phi(\sqrt{2E/N_0})$. In general for $K > 1$ the error probability will not be exactly $1 - \Phi(\mathrm{SNR}_i)$, but this is typically a very good approximation for values of $N$ and $K$ of interest in practical systems. Quantitative results on the accuracy of this approximation have been obtained by Yao [19]. Numerical results on the evaluation of the signal-to-noise ratio for sequences of period $N = 511$ and for $K = 10, 20, 30$ and $40$ are given in [13].

Finally, we should mention that for preliminary system design it is useful to be able to carry out a tradeoff between the parameters $K$, $N$, and $E/N_0$. Such a tradeoff can be based on the approximation

$$(6N^3)^{-1}\sum_{\substack{k=1 \\ k \ne i}}^{K} r_{k,i} \approx (K - 1)/3N \tag{16}$$

which yields

$$\mathrm{SNR}_i \approx \Big\{\frac{K - 1}{3N} + \frac{N_0}{2E}\Big\}^{-1/2}. \tag{17}$$

In [15], it is shown that the right-hand side of (16) is actually the expectation of the left-hand side when random sequences are employed. The main use of (17) would be to first determine roughly what code-sequence length $N$, bit energy $E$, and noise density $N_0/2$ are required to achieve a given signal-to-noise ratio for a given number of users $K$. A more detailed investigation of the performance can then be carried out using (15) for specific code sequences.
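The two-step procedure just described can be sketched in a few lines: (17) gives a first estimate of the code length, and (15) then evaluates specific sequences through $r_{k,i} = 2\mu_{k,i}(0) + \mu_{k,i}(1)$. The sketch below is only an illustration under the definitions of this section; the function names, the example sequences, and the numerical targets are assumptions, and $E/N_0$ is taken as a linear ratio rather than in dB.

```python
import math


def aperiodic(x, y, l):
    """C_{x,y}(l) for two +/-1 sequences of equal length N."""
    n = len(x)
    if 0 <= l <= n - 1:
        return sum(x[j] * y[j + l] for j in range(n - l))
    if 1 - n <= l < 0:
        return sum(x[j - l] * y[j] for j in range(n + l))
    return 0  # |l| >= N


def mu(x, y, shift):
    """mu_{x,y}(n) of eq. (13): sum over l of C(l) * C(l + n)."""
    n = len(x)
    return sum(aperiodic(x, y, l) * aperiodic(x, y, l + shift)
               for l in range(1 - n, n))


def r_param(x, y):
    """r_{x,y} of eq. (14)."""
    return 2 * mu(x, y, 0) + mu(x, y, 1)


def snr_exact(i, codes, e_over_n0):
    """SNR_i of eq. (15) for a specific set of code sequences."""
    n = len(codes[i])
    total = sum(r_param(codes[k], codes[i])
                for k in range(len(codes)) if k != i)
    return 1.0 / math.sqrt(total / (6.0 * n ** 3) + 1.0 / (2.0 * e_over_n0))


def snr_approx(num_users, n, e_over_n0):
    """SNR_i of eq. (17), which does not depend on the particular codes."""
    return 1.0 / math.sqrt((num_users - 1) / (3.0 * n) + 1.0 / (2.0 * e_over_n0))


def min_code_length(num_users, e_over_n0, target_snr):
    """Smallest N for which (17) reaches target_snr, or None if E/N0 is too low."""
    margin = 1.0 / target_snr ** 2 - 1.0 / (2.0 * e_over_n0)
    if margin <= 0:
        return None
    return max(1, math.ceil((num_users - 1) / (3.0 * margin)))


# Hypothetical design target: 10 users, E/N0 = 10 (linear), SNR of at least 3.
n_required = min_code_length(10, 10.0, 3.0)
print(n_required, snr_approx(10, n_required, 10.0))

# Hypothetical short sequences, used only to exercise eq. (15).
x = [1, 1, 1, -1, -1, 1, -1]
y = [1, -1, -1, 1, -1, 1, 1]
print(snr_exact(0, [x, y], 10.0), snr_approx(2, len(x), 10.0))
```

Comparing `snr_exact` with `snr_approx` for candidate sequences shows how far a particular code set departs from the random-sequence average that (16) represents.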
REFERENCES

[1] J. M. Aein, "Multiple access to a hard-limiting communication-satellite repeater," IEEE Transactions on Space Electronics and Telemetry, vol. SET-10, pp. 159-167, December 1964.
[2] J. M. Aein and J. W. Schwartz (editors), "Multiple access to a communication satellite with a hard-limiting repeater-Volume II: Proceedings of the IDA multiple access summer study," Institute for Defense Analysis, Report R-108, 1965.
[3] D. R. Anderson and P. A. Wintz, "Analysis of a spread-spectrum multiple-access system with a hard limiter," IEEE Transactions on Communication Technology, vol. COM-17, pp. 285-290, April 1969.
[4] H. Blasbalg, "A comparison of pseudo-noise and conventional modulation for multiple-access satellite communications," IBM Journal, vol. 9, pp. 241-255, July 1965.
[5] R. C. Dixon, Spread Spectrum Systems. New York: Wiley, 1976.
[6] R. C. Dixon (editor), Spread Spectrum Techniques. New York: IEEE Press, 1976.
[7] L. A. Gerhardt (lecture series director), "Spread Spectrum Communications," AGARD Lecture Series No. 58, NATO, July 1973.
[8] R. Gold, "Optimal binary sequences for spread spectrum multiplexing," IEEE Transactions on Information Theory, vol. IT-13, pp. 619-621, October 1967.
[9] S. W. Golomb (editor), Digital Communications with Space Applications. Englewood Cliffs, N. J.: Prentice-Hall, 1964.
[10] J. Kaiser, J. W. Schwartz, and J. M. Aein, "Multiple access to a communication satellite with a hard-limiting repeater-Volume I: Modulation techniques and their applications," Institute for Defense Analysis, Report R-108, January 1965.
[11] I. L. Lebow, K. L. Jordan, and P. R. Drouilhet, Jr., "Satellite communications to mobile platforms," Proceedings of the IEEE, vol. 59, pp. 139-159, February 1971.
[12] J. L. Massey and J. J. Uhran, "Sub-baud coding," Proceedings of the Thirteenth Annual Allerton Conference on Circuit and System Theory, pp. 539-547, October 1975 (see also "Final report for multipath study," Department of Electrical Engineering, University of Notre Dame, 1969).
[13] M. B. Pursley, "Evaluating performance of codes for spread spectrum multiple access communications," Proceedings of the Twelfth Annual Allerton Conference on Circuit and System Theory, pp. 765-774, October 1974 (see also "Tracking and data relay satellite system configuration and tradeoff study," Volume 4, Appendix D, Hughes Aircraft Company, Space and Communications Group, El Segundo, California, Report 20642R, September 1972).
[14] M. B. Pursley and D. V. Sarwate, "Performance evaluation for phase-coded spread-spectrum multiple-access communication-Part II: Code sequence analysis," this issue, pp. 800-803.
[15] H. F. A. Roefs and M. B. Pursley, "Correlation parameters of random sequences and maximal length sequences for spread-spectrum multiple-access communication," 1976 IEEE Canadian Communications and Power Conference, pp. 141-143, October 1976.
[16] J. W. Schwartz, J. M. Aein, and J. Kaiser, "Modulation techniques for multiple access to a hard-limiting satellite repeater," Proceedings of the IEEE, vol. 54, pp. 763-777, May 1966.
[17] R. A. Stampfl and A. E. Jones, "Tracking and data relay satellites," IEEE Transactions on Aerospace and Electronic Systems, vol. AES-6, pp. 276-289, May 1970.
[18] I. G. Stiglitz, "Multiple-access considerations-A satellite example," IEEE Transactions on Communications, vol. COM-21, pp. 577-582, May 1973.
[19] K. Yao, "Error probability of asynchronous spread spectrum multiple access communication systems," this issue, pp. 803-809.
* Michael B. Pursley (S'68-M'68-S'72-M'74) was born in Winchester, Indiana on August 10, 1945. He studied electrical engineering at Purdue University where he received the B.S. degree with highest distinction in 1967 and the M.S. degree in 1968. In 1974 he received the Ph.D. degree in electrical engineering from the University of Southern California. He held a summer position in the Laser and Radar Electronics Section of the Hughes Aircraft Company, Los Angeles, California, in 1967 and a position in the Systems Analysis Section of the Nortronics Division of Northrop Corporation in 1968. In December of 1968 he rejoined the Hughes Aircraft Company as a Member of the Technical Staff and was involved in satellite communication systems design and analysis; he was promoted to Staff Engineer in 1973. He was a Hughes Doctoral Fellow at the University of Southern California from 1971 until 1974 and a Research Assistant during 1973. From January through June of 1974 he was an Acting Assistant Professor at the University of California, Los Angeles. Since June 1974 he has been an Assistant Professor in the Department of Electrical Engineering and the Coordinated Science Laboratory at the University of Illinois, Urbana, Illinois, where his research work has been in the general area of information theory and stochastic processes with applications to source coding and communication systems. His current interests are in universal source coding, spread-spectrum multiple-access communication systems, and multiple-user information theory. Dr. Pursley is a member of Phi Eta Sigma, Tau Beta Pi, and the Institute of Mathematical Statistics. He was treasurer and is presently secretary of the Joint Chapter of the Chicago, Central Illinois, Central Indiana, and South Bend Sections of the IEEE Information Theory Group.
The Throughput of Packet Broadcasting Channels

NORMAN ABRAMSON, FELLOW, IEEE

Abstract-Packet broadcasting is a form of data communications architecture which can combine the features of packet switching with those of broadcast channels for data communication networks. Much of the basic theory of packet broadcasting has been presented as a byproduct in a sequence of papers with a distinctly practical emphasis. In this paper we provide a unified presentation of packet broadcasting theory. In Section II we introduce the theory of packet broadcasting data networks. In Section III we provide some theoretical results dealing with the performance of a packet broadcasting network when the users of the network have a variety of data rates. In Section IV we deal with packet broadcasting networks distributed in space, and in Section V we derive some properties of power-limited packet broadcasting channels, showing that the throughput of such channels can approach that of equivalent point-to-point channels.

Manuscript received January 19, 1976; revised June 11, 1976. This work was supported by The ALOHA System, a research project at the University of Hawaii which is supported by the Advanced Research Projects Agency of the Department of Defense and monitored by NASA Ames Research Center under Contract NAS2-8590. The views and conclusions contained in this paper are those of the author and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Advanced Research Projects Agency of the United States Government. The author is with The ALOHA System, University of Hawaii, Honolulu, HI 96822.

I. INTRODUCTION

A. Packet Switching and Packet Broadcasting

THE transition of packet-switched computer networks from experimental [1] to operational [2] status during 1975 provides convincing evidence of the value of this form of communications architecture. Packet switching, or statistical multiplexing [3], can provide a powerful means of sharing communications resources among a large number of data communications users when those users can be characterized by a high ratio of peak to average data rates. Under such circumstances, data from each user are buffered, address and control information is added in a "header," and the resulting bit sequence, or "packet," is routed through a shared communications resource by a sequence of node switches [4], [5].

Packet-switched networks, however, still employ point-to-point communication channels and large multiplexing switches for routing and flow control in a fashion similar to conventional circuit switched networks. In some situations [6]-[10] it is desirable to combine the efficiencies achievable by a packet communications architecture with other advantages obtained by use of broadcast communication channels. Among these advantages are elimination of routing and network switches, system modularity, and overall system simplicity. In addition, certain kinds of channels available to the communications systems designer, notably satellite channels, are basically broadcast in their structure. In such cases use of these channels in their natural broadcast mode can lead to significant system performance advantages [11], [12].
B. Outline of Results

Packet broadcasting is a form of data communications architecture which can combine the features of packet switching with those of broadcast channels for data communication networks. Much of the basic theory of packet broadcasting has been presented as a byproduct in a sequence of papers with a distinctly practical emphasis. In this paper we provide a unified presentation of packet broadcasting theory. In Section II we introduce the theory of packet broadcasting as implemented in the ALOHA System at the University of Hawaii; also in Section II we explain a modification of the basic ALOHA method, called slotting. In Section III we provide some theoretical results dealing with the performance of a packet broadcasting channel when the users of the channel have a variety of data rates. In Section IV we deal with packet broadcasting networks distributed in space, and present some incomplete results on the theoretical properties of such networks. Finally, in Section V we derive some properties of power-limited packet broadcasting channels, showing that the throughput of such channels can approach that of equivalent point-to-point channels. This result is of importance in satellite systems using small earth stations, since it implies that the multiple access capability and the complete connectivity (in the topological sense) of packet broadcasting channels can be obtained at no price in average throughput.
Reprinted from IEEE Transactions on Communications, vol. COM-25, no. 1, January 1977. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

II. PACKET BROADCASTING CHANNELS

A. Operation of a Packet Broadcasting Channel

Consider a number of widely separated users, each wanting to transmit short packets over a common high-speed channel. Assume that the rate at which users generate packets is such that the average time between packets from a single user is much greater than the time needed to transmit a single packet. In Fig. 1 we indicate a sequence of packets transmitted by a typical user.

Fig. 1. Packets from a typical user.

Conventional time or frequency multiplexing methods (TDMA or FDMA) or some kind of polling scheme could be employed to share the channel among the users. Some of the disadvantages of these methods for users with high peak-to-average data rates are discussed by Carleial and Hellman [13]. In addition, under certain conditions polling may require unacceptable system complexity and extra delay. In a packet broadcasting system the simplest possible solution to this multiplexing problem is employed. Each user transmits its packets over the common broadcast channel in a completely unsynchronized (from one user to another) manner. If each individual user of a packet broadcasting channel is required to have a low duty cycle, the probability of a packet from one user interfering with a packet from another user is small as long as the total number of users on the common channel is not too large. As the number of users increases, however, the number of packet overlaps increases and the probability that a packet will be lost due to an overlap also increases. The question of how many users can share such a channel and the analysis of various methods of dealing with packets lost due to overlap are the primary concerns of this paper. In Fig. 2 we show a packet broadcasting channel with two overlapping packets. Since the first packet broadcasting channel was put into operation in the ALOHA System radio-linked computer network at the University of Hawaii [6], such channels have been referred to as ALOHA channels.

Fig. 2. Packets from several users on an ALOHA channel.

B. ALOHA Capacity

A transmitted packet can be received incorrectly or lost completely because of two different types of errors: 1) random noise errors and 2) errors caused by packet overlap. In this paper we assume that the first type of error can be ignored, and we shall be concerned only with errors caused by packet overlap. In Section II-D we describe several methods of dealing with the problem of packets lost due to overlap, but first we derive the basic results which tell us how many packets can be transmitted with no overlap. Assume that the start times of packets in the channel comprise a Poisson point process with parameter $\lambda$ packets/second. If each packet lasts $\tau$ seconds, we can define the normalized channel traffic $G$ where

$$G = \lambda\tau. \tag{1}$$

If we assume that only those packets which do not overlap with any other packet are received correctly, we may define $\lambda' < \lambda$ as the rate of occurrence of those packets which are received correctly. Then we define the normalized channel throughput $S$ by

$$S = \lambda'\tau. \tag{2}$$

The probability that a packet will not overlap a given packet is just the probability that no packet starts $\tau$ seconds before or $\tau$ seconds after the start time of the given packet. Then, since the point process formed from the start times of all packets in the channel was assumed Poisson, the probability that a packet will not overlap any other packet is $e^{-2\lambda\tau}$, or $e^{-2G}$. Therefore

$$S = G e^{-2G} \tag{3}$$

and we may plot the channel throughput versus channel traffic for an ALOHA channel (Fig. 3).

Fig. 3. Channel throughput versus channel traffic for an ALOHA channel.

From Fig. 3 we see that as the channel traffic increases, the throughput also increases until it reaches its maximum at $S = 1/2e = 0.184$. This value of throughput is known as the capacity of an ALOHA channel, and it occurs for a value of channel traffic equal to 0.5. If we increase the channel traffic above 0.5, the throughput of the channel will decrease.
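The relation (3) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates $S = Ge^{-2G}$ and compares it with a small simulation in which packet start times form a Poisson process and a packet counts as received only if no other packet starts within one packet duration on either side. The function names, the seed, and the number of simulated packets are illustrative assumptions.

```python
import math
import random


def pure_aloha_throughput(G: float) -> float:
    """Normalized throughput S = G * exp(-2G) of an unslotted channel, eq. (3)."""
    return G * math.exp(-2.0 * G)


def simulate_pure_aloha(G: float, tau: float = 1.0,
                        n_packets: int = 100_000, seed: int = 1) -> float:
    """Estimate S by simulation (illustrative; G must be positive).

    Start times form a Poisson process of rate G / tau. A packet counts as
    received only if no other packet starts within tau seconds before or
    after its own start time.
    """
    rng = random.Random(seed)
    rate = G / tau                      # lambda, packets per second
    starts, t = [], 0.0
    for _ in range(n_packets):
        t += rng.expovariate(rate)      # exponential gaps give Poisson starts
        starts.append(t)
    good = sum(
        1
        for i in range(1, n_packets - 1)
        if starts[i] - starts[i - 1] >= tau and starts[i + 1] - starts[i] >= tau
    )
    observed_time = starts[-1] - starts[0]
    return good * tau / observed_time   # lambda' * tau, as in eq. (2)


if __name__ == "__main__":
    for G in (0.1, 0.25, 0.5, 1.0):
        print(f"G={G:4.2f}  formula S={pure_aloha_throughput(G):.4f}  "
              f"simulated S={simulate_pure_aloha(G):.4f}")
```

Evaluating the formula on either side of $G = 0.5$ shows the throughput falling away from its peak of $1/2e$, in line with Fig. 3.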
C. Application of an ALOHA Channel

In order to indicate the capabilities of such a channel for use in an interactive network of alphanumeric computer terminals, consider the 9600 bits/s packet broadcasting channel used in the ALOHA System. From the results of Section II-B we see that the maximum average throughput of this channel is 9600 bits/s times 1/2e, or about 1600 bits/s. If we assume the conservative [14] figure of 5 bits/s as the average data rate (including overhead) from each active¹ terminal in the network, this channel can handle the traffic of over 300 active terminals, and each terminal will operate at a peak data rate of 9600 bits/s. Of course, the total number of terminals in such a network can be much larger than 300, since only a fraction of all terminals will be active and a terminal consumes no channel resources when it is not active.

¹ A terminal is defined as active from the time a user transmits an attempt to log on until he transmits a log off message.

D. Recovery of Lost Packets

Since the packet broadcasting technique we have described will result in some packets being lost due to packet overlaps, it is necessary to introduce some technique to compensate for this loss. We may list four different packet recovery techniques for dealing with the problem of lost packets. The first three make use of a feedback channel to the packet transmitter and the repetition of lost packets, while the fourth is based on coding.

1) Positive Acknowledgments (POSACKS): Perhaps the most direct way to handle lost packets is to require the receiver of the packet to acknowledge correct receipt of the packet. Each packet is transmitted and then stored in the transmitter's buffer until a POSACK is received from the receiver. If a POSACK is not received in a given amount of time, the transmitter can repeat the transmission and continue to repeat until a POSACK is received or until some other criterion is met. The POSACK can be transmitted on a separate channel (as in the ALOHANET [6]) or transmitted on the same channel as the original packets (as in the ARPA packet radio system [15]). An error detection code and a packet numbering system can be used to increase the reliability of this technique.

2) Transponder Packet Broadcasting: Certain communication channels, notably communication satellite channels, transmit packets on one frequency to a transponder which retransmits the packets on a second frequency. In such cases all units in a packet broadcasting network can receive their own packet retransmissions, determine whether a packet overlap has occurred, and repeat the packet if necessary. This technique has been employed in ATS-1 satellite experiments in the Pacific Educational Computer Network (PACNET) [16] and in the ARPA Atlantic INTELSAT IV packet broadcasting experiments [17].

3) Carrier Sense Packet Broadcasting: For ground-based packet broadcasting networks where the signal propagation time over the furthest transmission path is much less than the packet duration, it is feasible to provide each transmission unit with a device to inhibit packet transmission while another unit is detected transmitting. A carrier sense capability can increase the channel throughput, even if these conditions are not met, when used in conjunction with other packet recovery methods. Carrier sense systems have been analyzed by Tobagi [18] and by Kleinrock and Tobagi [19]. A comprehensive yet compact analysis of such systems is provided in [42].

4) Packet Recovery Codes: When a user employs a packet broadcasting channel to transmit long files by breaking them into large numbers of packets, it is possible to encode the files so that packets lost due to broadcasting overlap can be recovered. It is clear that some of the existing classes of multiple burst error-correcting codes [20] and cyclic product codes [21] can be used for packet recovery in transmissions of long files.
It is also clear that these codes are not as efficient as possible for packet recovery and that considerable work remains to be done in this area.
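As a numerical companion to Sections II-C and II-D, the sketch below redoes the terminal-count arithmetic and evaluates the ratio $G/S = e^{2G}$ implied by (3). Reading that ratio as the average number of transmissions per correctly received packet assumes that retransmitted packets are counted in the channel traffic $G$, which is an interpretation rather than something stated in Section II-D. All names are illustrative. The exact product 9600/2e is a little under 1770 bits/s, so the rounded figure quoted in Section II-C is on the conservative side and the count of over 300 terminals is unaffected.

```python
import math

CHANNEL_RATE_BPS = 9600.0   # peak channel rate quoted in Section II-C
TERMINAL_RATE_BPS = 5.0     # assumed average rate per active terminal [14]


def aloha_capacity_bps(channel_rate_bps: float) -> float:
    """Maximum average throughput: channel rate times the capacity 1/(2e)."""
    return channel_rate_bps / (2.0 * math.e)


def supported_terminals(channel_rate_bps: float, terminal_rate_bps: float) -> int:
    """Whole number of active terminals whose average traffic fits the capacity."""
    return int(aloha_capacity_bps(channel_rate_bps) // terminal_rate_bps)


def transmissions_per_delivered_packet(G: float) -> float:
    """Ratio G / S = exp(2G) from eq. (3).

    Assumption: retransmissions are included in G, so the ratio can be read
    as the mean number of transmissions per correctly received packet.
    """
    return math.exp(2.0 * G)


if __name__ == "__main__":
    print(f"capacity  = {aloha_capacity_bps(CHANNEL_RATE_BPS):.0f} bits/s")
    print(f"terminals = {supported_terminals(CHANNEL_RATE_BPS, TERMINAL_RATE_BPS)}")
    print(f"G/S at G = 0.5: {transmissions_per_delivered_packet(0.5):.3f}")
```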
E. Slotted Channels

It is possible to modify the completely unsynchronized use of the ALOHA channel described above in order to increase the maximum throughput of the channel. In the pure ALOHA channel each user simply transmits a packet when ready without any attempt to coordinate his transmission with those of other users. While this strategy has a certain elegance, it does lead to somewhat inefficient channel utilization. If we establish a time base and require each user to start his packet only at certain fixed instants, it is possible to increase the maximum value of the channel throughput. In this kind of channel, called a slotted ALOHA channel, a central clock establishes a time base for a sequence of "slots" of the same duration as a packet transmission [41]. Then when a user has a packet to transmit, he synchronizes the start of his transmission to the start of a slot. In this fashion, if two messages conflict they will overlap completely, rather than partially.

To analyze the slotted ALOHA channel, define $G_i$ as the probability that the $i$th user will transmit a packet in some slot. Assume that each user operates independently of all other users, and that whether or not a user transmits a packet in a given slot does not depend upon the state of any previous slot. If we have $n$ users, we can define the normalized channel traffic for the slotted channel $G$ where

$$G = \sum_{i=1}^{n} G_i. \tag{4}$$

Note that $G$ may be greater than 1. As before, we can also consider the rate at which a user sends packets which do not experience an overlap with other user packets. Define $S_i \le G_i$ as the probability that a user sends a packet and that this packet is the only packet in its slot. If we have $n$ users, then we define the normalized channel throughput for the slotted channel $S$ where

$$S = \sum_{i=1}^{n} S_i. \tag{5}$$

Note that $S$ is less than or equal to 1 and $S \le G$. For the slotted ALOHA channel with $n$ independent users, the probability that a packet from the $i$th user will not experience an interference from one of the other users is

$$\prod_{\substack{j=1 \\ j \ne i}}^{n} (1 - G_j).$$

Therefore we may write the following relationship between the message rate and the traffic rate of the $i$th user:

$$S_i = G_i \prod_{\substack{j=1 \\ j \ne i}}^{n} (1 - G_j). \tag{6}$$

If all users are identical, we have

$$S_i = \frac{S}{n} \tag{7}$$

and

$$G_i = \frac{G}{n} \tag{8}$$

so that (6) can be written

$$S = G\left(1 - \frac{G}{n}\right)^{n-1} \tag{9}$$

and in the limit as $n \to \infty$, we have

$$S = G e^{-G}. \tag{10}$$

Equation (10) is plotted in Fig. 4 (curve labeled "slotted ALOHA"). Note that the message rate of the slotted ALOHA channel reaches a maximum value of $1/e = 0.368$, twice the capacity of the pure ALOHA channel. This result for slotted ALOHA channels was first derived by Roberts [41] using a different method.
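The slotted-channel relations (6), (9), and (10) are straightforward to evaluate. The sketch below is illustrative only (the function names are assumptions): it computes the per-user throughputs from an arbitrary list of transmission probabilities, and the total throughput for $n$ identical users together with its large-$n$ limit.

```python
import math
from typing import List, Sequence


def slotted_user_throughputs(G: Sequence[float]) -> List[float]:
    """Eq. (6): S_i = G_i * prod_{j != i} (1 - G_j) for each user i."""
    result = []
    for i, g_i in enumerate(G):
        others_silent = 1.0
        for j, g_j in enumerate(G):
            if j != i:
                others_silent *= 1.0 - g_j
        result.append(g_i * others_silent)
    return result


def slotted_total_identical(G_total: float, n: int) -> float:
    """Eq. (9): total throughput for n identical users, each with G_i = G/n."""
    return G_total * (1.0 - G_total / n) ** (n - 1)


def slotted_limit(G_total: float) -> float:
    """Eq. (10): limit of eq. (9) as n grows without bound."""
    return G_total * math.exp(-G_total)


if __name__ == "__main__":
    for n in (2, 5, 10, 100, 1000):
        print(f"n={n:5d}  S at G=1: {slotted_total_identical(1.0, n):.4f}")
    print(f"limit    S at G=1: {slotted_limit(1.0):.4f}")
```

For a finite number of users the peak throughput at $G = 1$ lies above $1/e$ and approaches it from above as $n$ grows.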
Fig. 4. Traffic versus throughput for an ALOHA channel and a slotted ALOHA channel.

III. PACKET BROADCASTING WITH MIXED DATA RATES

A. Unslotted Case: Variable Packet Lengths

In Section II we were concerned with the analysis of ALOHA channels carrying a homogeneous mix of packets. If some channel users have a higher average data rate than others, however, the high-rate users must either transmit packets more frequently or transmit longer packets. In this section we shall analyze the unslotted ALOHA channel when carrying packets of different lengths, and we shall analyze the slotted ALOHA channel when the probability of transmitting in a given slot varies from user to user.

Let us assume an unslotted ALOHA channel with two different possible packet durations, $\tau_2$ and $\tau_1$. Assume $\tau_2 \ge \tau_1$, and therefore we refer to the two different length packets as long packets and short packets, respectively. Assume also that the start times of the long packets and short packets form two Poisson point processes with parameters $\lambda_2$ and $\lambda_1$ packets/second, and that the two Poisson point processes are mutually independent. Then we can define the normalized channel traffic for those packets of duration $\tau_i$:

$$G_i = \lambda_i \tau_i, \qquad i = 1, 2. \tag{11}$$

Again assume that only those packets which do not overlap with any other packet are received correctly and define $\lambda_i' < \lambda_i$ as the rate of occurrence of those packets of duration $\tau_i$ which are received correctly. Define the normalized throughput of packets of duration $\tau_i$ as

$$S_i = \lambda_i' \tau_i, \qquad i = 1, 2. \tag{12}$$

Since we assumed two independent Poisson point processes, the probability that a short packet will be received correctly is

$$\exp\left[-2\lambda_1\tau_1 - \lambda_2(\tau_1 + \tau_2)\right] \tag{13}$$

and if we define

$$G_{12} \triangleq \lambda_1 \tau_2 \tag{14a}$$

$$G_{21} \triangleq \lambda_2 \tau_1 \tag{14b}$$

(13) becomes

$$\exp\left[-2G_1 - G_{21} - G_2\right]. \tag{15}$$

Therefore

$$S_1 = G_1 \exp\left[-2G_1 - G_{21} - G_2\right] \tag{16a}$$

and, by a similar argument, the throughput of long packets is

$$S_2 = G_2 \exp\left[-2G_2 - G_{12} - G_1\right]. \tag{16b}$$

For any given values of $\lambda_1$ and $\lambda_2$ we may calculate $G_1$, $G_2$, $G_{12}$, and $G_{21}$; substitution of these values into (16a) and (16b) will allow calculation of the throughputs $S_1$ and $S_2$. Therefore (16a) and (16b) may be used to define an allowable set of throughput pairs $(S_1, S_2)$ in the $(S_1, S_2)$ plane. To determine the boundary of this region we define

$$\alpha \triangleq \frac{\tau_2}{\tau_1}. \tag{17}$$

Note that $\alpha \ge 1$. We may rewrite (16a) and (16b) in terms of $\alpha$, the ratio of long packet duration to short packet duration:

$$S_1 = G_1 \exp\left[-2G_1 - \left(1 + \frac{1}{\alpha}\right)G_2\right] \tag{18a}$$

$$S_2 = G_2 \exp\left[-(1 + \alpha)G_1 - 2G_2\right]. \tag{18b}$$

The boundary of the set of allowable $(S_1, S_2)$ pairs in the $(S_1, S_2)$ plane is defined by setting the Jacobian

$$J = \left|\frac{\partial S_i}{\partial G_j}\right|, \qquad i, j = 1, 2 \tag{19}$$

equal to zero. A simple calculation shows that the Jacobian is zero when

$$G_2 = \frac{\alpha(1 - 2G_1)}{2\alpha + (1 - \alpha)^2 G_1}. \tag{20}$$

Note that this checks for $G_1 = 0$ and for $\alpha = 1$. We need only substitute this expression for $G_2$ into (18a) and (18b) to obtain two equations for $S_1$, the short packet throughput, and $S_2$, the long packet throughput, in terms of the single parameter $G_1$; and as $G_1$ varies from 0 (all long packets) to 1/2 (all short packets), we will trace out the boundary of the achievable values of throughput in the $(S_1, S_2)$ plane. These achievable throughput regions are indicated for several values of $\alpha$ in Fig. 5.

Fig. 5. Achievable throughput regions in an unslotted ALOHA channel (short packet throughput versus long packet throughput; curves for $\alpha$ = 1, 2, 4, 16, and 128).

The basic conclusion of this analysis is that the total channel throughput can undergo a significant decrease if all packets are not of the same length. Thus if the two different packet lengths differ by a large factor, it is often preferable to break up long packets into many shorter packets as long as the overhead necessary to transmit the text in each packet is small. Ferguson [23] has generalized these results to show that channel throughput is maximized over all possible packet length distributions with fixed length packets.

In view of this discouraging result, we might conclude that an inhomogeneous mix of users inevitably leads to a decrease in the maximum value of channel throughput. Surprisingly, this conclusion is not warranted, and we shall show in Section III-B that a mix of users of varied data rates can lead to an increase in the maximum values of channel throughput.
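The boundary of the achievable region in Section III-A can be traced numerically. The sketch below is a hedged illustration: it uses the zero-Jacobian condition obtained by differentiating (18a) and (18b), namely $(1 - 2G_1)(1 - 2G_2) = G_1 G_2 (1 + \alpha)^2/\alpha$, solved for $G_2$, and then evaluates (18a) and (18b) as $G_1$ runs from 0 to 1/2. The function names and the number of sample points are assumptions made for the example.

```python
import math
from typing import List, Tuple


def boundary_G2(G1: float, alpha: float) -> float:
    """Long-packet traffic at which the Jacobian (19) vanishes, for given G1."""
    return alpha * (1.0 - 2.0 * G1) / (2.0 * alpha + (1.0 - alpha) ** 2 * G1)


def throughputs(G1: float, G2: float, alpha: float) -> Tuple[float, float]:
    """Short- and long-packet throughputs from eqs. (18a) and (18b)."""
    S1 = G1 * math.exp(-2.0 * G1 - (1.0 + 1.0 / alpha) * G2)
    S2 = G2 * math.exp(-(1.0 + alpha) * G1 - 2.0 * G2)
    return S1, S2


def boundary(alpha: float, points: int = 51) -> List[Tuple[float, float]]:
    """Sample the boundary as G1 runs from 0 (all long) to 1/2 (all short)."""
    samples = []
    for k in range(points):
        G1 = 0.5 * k / (points - 1)
        samples.append(throughputs(G1, boundary_G2(G1, alpha), alpha))
    return samples


if __name__ == "__main__":
    for alpha in (1.0, 2.0, 4.0, 16.0, 128.0):
        worst = min(s1 + s2 for s1, s2 in boundary(alpha))
        print(f"alpha={alpha:6.1f}  smallest S1+S2 on boundary = {worst:.4f}")
```

For $\alpha = 1$ every boundary point has $S_1 + S_2 = 1/2e$; for larger $\alpha$ the interior of the boundary dips below that value, which is the throughput penalty for mixed packet lengths discussed in the text.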
B. Slotted Case: Variable Packet Rates

In this section we shall consider a slotted ALOHA channel used by $n$ users, possibly with different values of channel traffic $G_i$. From (6) we have a set of $n$ nonlinear equations relating the channel traffics and the channel throughputs for these $n$ users:

$$S_i = G_i \prod_{\substack{j=1 \\ j \ne i}}^{n} (1 - G_j), \qquad i = 1, 2, \ldots, n. \tag{21}$$

Define

$$\alpha \triangleq \prod_{j=1}^{n} (1 - G_j); \tag{22}$$

then (21) can be written

$$S_i = \frac{G_i}{1 - G_i}\,\alpha, \qquad i = 1, 2, \ldots, n. \tag{23}$$

For any set of $n$ acceptable traffic rates $G_1, G_2, \ldots, G_n$, these $n$ equations define a set of channel throughputs $S_1, S_2, \ldots, S_n$ or a region in an $n$-dimensional space whose coordinates are the $S_i$. In order to find the boundary of this region, we calculate the Jacobian:

$$J = \left|\frac{\partial S_j}{\partial G_k}\right|, \qquad j, k = 1, 2, \ldots, n. \tag{24}$$

Since

$$\frac{\partial S_j}{\partial G_k} = \begin{cases} \displaystyle\prod_{\substack{l=1 \\ l \ne j}}^{n} (1 - G_l), & j = k \\[2ex] \displaystyle -G_j \prod_{\substack{l=1 \\ l \ne j,k}}^{n} (1 - G_l), & j \ne k \end{cases} \tag{25}$$

after some algebra we may write the Jacobian as

$$J = \alpha^{n-2}\left(1 - \sum_{i=1}^{n} G_i\right). \tag{26}$$

Thus the condition for maximum channel throughputs is

$$\sum_{i=1}^{n} G_i = 1. \tag{27}$$

This condition can then be used to define a boundary to the $n$-dimensional region of allowable throughputs $S_1, S_2, \ldots, S_n$. Consider the special case of two classes of users with $n_1$ users in class 1 and $n_2$ users in class 2:

$$n_1 + n_2 = n. \tag{28}$$

Let $S_1$ and $G_1$ be the throughputs and traffic rates for users in class 1, and let $S_2$ and $G_2$ be the throughputs and traffic rates for users in class 2. Then the $n$ equations (21) can be written as the two equations

$$S_1 = G_1 (1 - G_1)^{n_1 - 1} (1 - G_2)^{n_2} \tag{29a}$$

$$S_2 = G_2 (1 - G_2)^{n_2 - 1} (1 - G_1)^{n_1}. \tag{29b}$$

For any pair of acceptable traffic rates $G_1$ and $G_2$, these two equations define a pair of channel throughputs $S_1$ and $S_2$ or a region in the $(S_1, S_2)$ plane. From (27) we know that the boundary of this region is defined by the condition

$$n_1 G_1 + n_2 G_2 = 1. \tag{30}$$

We can use (30) to substitute for $G_1$ in (29a) and (29b) and obtain two equations for $S_1$ and $S_2$ in terms of a single parameter $G_2$. Then as $G_2$ varies from 0 to 1, the resulting $(S_1, S_2)$ pairs define the boundary of the region we seek. These achievable regions are indicated for various values of $n_1$ and $n_2$ in Figs. 6 and 7.

The important point to notice from Figs. 6 and 7 is that in a lightly loaded slotted ALOHA channel, a single large user can transmit data at a significant percentage of the total channel data rate, thus allowing use of the channel at rates well above the limit of $1/e$ or 37 percent obtained when all users have the same message rate. A throughput data rate above the $1/e$ limit has been referred to as "excess capacity" [24]. Excess capacity is important for a lightly loaded packet broadcasting network consisting of many interactive terminal users and a small number of users who send large but infrequent files over the channel. Operation of the channel in a lightly loaded condition, of course, may not be desirable in a bandwidth-limited channel. For a communications satellite where the average power in the satellite transponder limits the channel, however, operation in a lightly loaded packet-switched mode is an attractive alternative. Since the satellite will transmit power only when it is relaying a packet, the duty cycle in the transponder will be small and the average power used will be low (see Section V).

Finally, we note that it is possible to deal with certain limiting cases in more detail, to obtain equations for the boundary of the allowable $(S_1, S_2)$ region.

1) For $n_1 = n_2 = 1$: Upon using (30) in (29), we obtain

$$S_1 = G_1^2 \tag{31a}$$

$$S_2 = (1 - G_1)^2. \tag{31b}$$

2) For $n_2 \to \infty$ (with $S_2$ read as the total throughput $n_2 S_2$ of the class-2 users in the limit):

$$S_1 = G_1 (1 - G_1)^{n_1 - 1} \exp\left[-(1 - n_1 G_1)\right] \tag{32a}$$

$$S_2 = (1 - n_1 G_1)(1 - G_1)^{n_1} \exp\left[-(1 - n_1 G_1)\right]. \tag{32b}$$

(33a)

(33b)

Additional details dealing with excess capacity and the delay experienced with this kind of use of a slotted ALOHA channel may be found in [11] and [25]. A different view of the use of slotted packet broadcasting for different sources may be found in [43].

Fig. 6. Allowable channel throughputs ($n_1$ users at rate $S_1$, $n_2$ users at rate $S_2$; curves labeled by $(n_1, n_2)$).

Fig. 7. Allowable channel throughputs ($n_1$ users at rate $S_1$, $n_2$ users at rate $S_2$; curves labeled by $(n_1, n_2)$).

IV. SPATIAL PROPERTIES OF PACKET BROADCASTING NETWORKS

A. Packet Repeaters
In this section we deal with certain spatial properties of packet broadcasting networks. Not long after the initial units of the ALOHA System went into operation, it was realized that the range of the network could be extended beyond the range of a single radio link in the network (about 200 km) by the use of packet repeaters. A packet repeater operates in much the same manner as a conventional radio repeater with one major exception. Since radio transmission in a packet broadcasting network is intermittent, a packet repeater can receive a packet and retransmit that packet in the same frequency band by turning off its receiver during a retransmission burst. Thus a packet repeater can sidestep many of the frequency allocation and spatial cell problems [26] of conventional land-based repeater networks. The use of packet repeaters leads to the consideration of packet broadcasting networks with more than one central station distributed over very large areas. Users transmit a packet, and if the packet cannot be received directly by its destination, it is forwarded to its destination by one or more packet repeaters according to some routing algorithm [27]. The study of such networks has led to the analysis of two communication theory issues related to the performance of the networks: 1) capture effect and 2) the distribution of packet traffic and packet throughput in space.
B. Capture Effect

Up to this point we have analyzed packet broadcasting channels under the pessimistic assumption that if two packets overlap at the receiver, both packets are lost. In fact, this assumption provides a lower bound to the performance of real packet broadcasting channels, since in many receivers the stronger of two overlapping packets may capture the receiver and may be received without error. Metzner [40] has used this fact to derive an interesting result, showing that by dividing users into two groups (one transmitting at high power and the other at low power), the maximum throughput can be increased by about 50 percent. This result is of importance for packet broadcasting networks with a mixture of data and packetized speech traffic.

In order to include the effect of capture in a packet broadcasting network, we consider a distribution of packet generators over a two-dimensional plane and a single packet broadcasting receiver which receives packets from these generators [41]. The receiver then may be viewed as a "packet sink" and the packet generators as a distribution of "packet sources" in the plane. We assume that the rate of generation of packets in a given area depends only on $r$, the distance from the packet sink, and is independent of direction $\theta$. Then we may define a traffic density and a throughput density analogous to the normalized traffic $G$ and normalized throughput $S$ defined in Section II-B.

$G(r)$ = normalized packet traffic per unit area at a distance $r$.
$S(r)$ = normalized packet throughput per unit area at a distance $r$.

The traffic due to all packet generators in a differential ring of width $dr$ at a radius $r$ is

$$G(r)\,2\pi r\,dr. \tag{34}$$

We assume that packets from different users are generated so that the packet starting times of all packets generated in the differential ring constitute a Poisson point process. Then since the sum of two independent Poisson processes is a Poisson point process, if users in different rings are independent, the start times of all packets generated in a circle of radius $r$ also constitute a Poisson point process, and the total traffic generated by all users within a distance $r$ of the center is

$$\int_0^r G(x)\,2\pi x\,dx. \tag{35}$$

If we assume that a packet from a user at a distance $r$ from the center will be received correctly unless it is overlapped by a packet sent from a user at a distance $\alpha r$ or less ($\alpha \ge 1$), then using the results of Section II-B the probability that such a packet will be received correctly is

$$\exp\left[-4\pi \int_0^{\alpha r} G(x)\,x\,dx\right]. \tag{36}$$

Any packet generated from a packet source in the circle of radius $\alpha r$ shown in Fig. 8 will interfere with packets generated from a source in the circle of radius $r$. A packet generated outside the circle of radius $\alpha r$ will not interfere with packets generated from a source in the circle of radius $r$.

Fig. 8. Regions of interfering packets.

We can relate the normalized packet throughput to the normalized packet traffic in the usual way:

$$2\pi r S(r)\,dr = 2\pi r G(r) \exp\left[-4\pi \int_0^{\alpha r} G(x)\,x\,dx\right] dr$$

or

$$S(r) = G(r) \exp\left[-4\pi \int_0^{\alpha r} G(x)\,x\,dx\right]. \tag{37}$$

If we take a derivative of (37) with respect to $r$ and use (37) to substitute for the exponential, we get

$$S'(r)G(r) = G'(r)S(r) - 4\pi r \alpha^2 S(r)G(r)G(\alpha r). \tag{38}$$

We have not found a general solution of (38) for relating $S(r)$ to $G(r)$ in the presence of capture. We have been able to analyze two special cases, however.
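Although (38) has no general closed-form solution, (37) can always be evaluated numerically for a given traffic density. The sketch below is an illustration under stated assumptions: the traffic density is supplied as an ordinary Python function, the integrals are approximated by the midpoint rule, and the step counts and function names are arbitrary choices made for the example.

```python
import math
from typing import Callable

Density = Callable[[float], float]


def success_probability(r: float, alpha: float, G: Density, steps: int = 200) -> float:
    """Eq. (36): exp[-4*pi * integral_0^{alpha r} G(x) x dx], midpoint rule."""
    h = alpha * r / steps
    integral = sum(G((k + 0.5) * h) * (k + 0.5) * h for k in range(steps)) * h
    return math.exp(-4.0 * math.pi * integral)


def throughput_density(r: float, alpha: float, G: Density) -> float:
    """Eq. (37): S(r) = G(r) times the probability of correct reception."""
    return G(r) * success_probability(r, alpha, G)


def total_throughput(r_max: float, alpha: float, G: Density, rings: int = 400) -> float:
    """Integrate S(r) * 2*pi*r dr over 0 <= r <= r_max by the midpoint rule."""
    h = r_max / rings
    return sum(
        throughput_density((k + 0.5) * h, alpha, G) * 2.0 * math.pi * (k + 0.5) * h
        for k in range(rings)
    ) * h


if __name__ == "__main__":
    # Assumed example: constant density with unit total traffic inside radius 1.
    def constant_density(r: float) -> float:
        return 1.0 / math.pi

    for alpha in (1.0, 1.5, 2.0):
        total = total_throughput(4.0, alpha, constant_density)
        print(f"alpha={alpha:3.1f}  total throughput within r=4: {total:.4f}")
```

With the constant density assumed in the example and $\alpha = 1$, the total settles near one half; larger values of $\alpha$ (weaker capture) reduce it.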
C. Two Solutions

In the first of these special cases we assume a constant traffic density $G(r)$. We can then show that the throughput density $S(r)$ has a Gaussian form, due to the fact that those packets generated further from the receiver will be received correctly less frequently than those packets generated close to the receiver. In the second special case analyzed we assume a constant packet throughput density $S(r)$ and perfect capture ($\alpha = 1$). Under these assumptions, the packet traffic density will increase as the distance from the receiver increases. We show that there exists a radius $r_0$ such that the packet traffic density is finite within a circle of radius $r_0$ around the receiver, while the packet traffic density becomes unbounded on the circle of radius $r_0$.

For the important case of a packet broadcasting channel distributed over some geographical area and using a packet retransmission policy (Section II-D), this result has an interesting interpretation. In such a situation any packet transmitted from a terminal located within the circle of radius $r_0$ will be received correctly with probability one (after a finite number of retransmissions), while the expected number of retransmissions required for a packet transmitted from a terminal further from the center than $r_0$ will be unbounded. Thus there exists a circle of radius $r_0$ such that terminals transmitting from within this circle can get their packets into the central receiver, while terminals transmitting from outside this circle spend all their time retransmitting their packets in vain. We call $r_0$ the Sisyphus distance of the ALOHA channel.

1) Constant Packet Traffic Density: Assume the density of normalized packet traffic is constant over the plane

$$G(r) = G_0 \tag{39}$$

and define the distance $r_1$ as the radius of a circle within which the total packet traffic is unity:

$$G_0 \pi r_1^2 = 1. \tag{40}$$

Then (38) reduces to

$$S'(r) = -\frac{4 r \alpha^2}{r_1^2}\,S(r) \tag{41a}$$

with the boundary condition

$$S(0) = G_0 \tag{41b}$$

so that the packet throughput density is

$$S(r) = G_0 \exp\left[-2\alpha^2 \left(\frac{r}{r_1}\right)^2\right] \tag{42}$$

and the total normalized packet throughput from a circle of radius $r$ is

$$S = \int_0^r S(x)\,2\pi x\,dx \tag{43}$$

and

$$\lim_{r \to \infty} S = \frac{1}{2\alpha^2}. \tag{44}$$

Note that the total throughput which can be supported by a single packet sink with "perfect capture" ($\alpha = 1$) is equal to one half.

2) Constant Packet Throughput Density: Another case of interest where we have found a solution for (38) is that of constant packet throughput density in the plane. Assume

$$S(r) = S_0 \tag{45}$$

over the region in the plane where $S(r)$ and $G(r)$ are bounded. Then (38) becomes

$$G'(r) = 4\pi r \alpha^2 G(r) G(\alpha r). \tag{46}$$

For the case of $\alpha = 1$ (perfect capture), (46) becomes

$$G'(r) = 4\pi r G^2(r) \tag{47}$$

with the boundary condition

$$G(0) = S_0 \tag{48}$$

so that

$$G(r) = \frac{S_0}{1 - 2\pi r^2 S_0}. \tag{49}$$

Note that the normalized packet traffic per unit area is finite for

$$0 \le r < r_0 \tag{50}$$

where

$$r_0 = \frac{1}{\sqrt{2\pi S_0}} \tag{51}$$

and $r_0$ is the Sisyphus distance mentioned in Section IV-C. Note that the Sisyphus distance also has the property that

$$S_0 \pi r_0^2 = \frac{1}{2}. \tag{52}$$

As in the previous case, the total packet throughput which can be supported by a single packet sink operating with perfect capture is one half.

Fig. 9. Region of constant packet throughput $S_0$ for a single packet sink.

V. PACKET BROADCASTING WITH AVERAGE POWER LIMITATIONS
A . SatellitePacket Broadcasting
In previous sections we have analyzed the performance of packet broadcasting channels and compared the performance of these channels to that of conventional point-to-point channels operating at the same peak data rate. Such a comparison is of interest in the case of channels limited by multiple access interference rather than noise, since an increase in the transmitted power of such channels will not lead to improved performance. But just as the average data rate of a packet broadcasting channel can be well below its peak data rate when it is operated at a low duty cycle, the average transmitted power of a packet broadcasting channel can be well below its peak transmitted power. In this section we analyze the throughput of a packet broadcasting channel when compared to that of a conventional point-to-point channel of the same average power. This analysis is of interest in the case of satellite information systems employing thousands of small earth stat ions. For a satellite system the fundamental limitation in the downlink is the average power available in the satellite transponder rather than the peak power. Our results show that in the limit of large numbers of small earth stations, the .packet throughput approaches 100 percent of the point-to-point capacity . Thus the multiple access capability and the complete connectivity (in the topological sense) of an ALOHA channel can be obtained at no price in average throughput. Furthermore, since our results suggest the use of higher peak power in the satellite
301
FiftyYears ofCommunications andNetworking transponder (while the average power is kept constant), the small earth stations may use smaller antennas and simpler receivers and modems than would be necessary in a conventional system. In existing satellite systems the TWT output power in each transponder cannot be varied dynamically. In such systems the advantages implied by our analysis may be realized by frequency -division sharing a single transponder among several voice users and a single channel, operating in an ALOHA mode or some other burst mode, and occupying a frequency band equivalent to one or more voice users. The type of operation implied by our analysis also suggests investigation of high peak power satellite burst transponders (perhaps employing power devices similar to those used in radar systems) for use in information systems composed of large numbers of ultra-small earth stations.
B. Burst Power and Average Power

The capacity of a satellite channel can be calculated by the classical Shannon equation

$$C = W \log\left(1 + \frac{P}{N}\right) \tag{53}$$

where $C$ is the capacity in bits per second (if the log is a base two logarithm), $W$ is the channel bandwidth, $P$ is the average received signal power at the earth station, and $N$ is the average noise power at the earth station. Equation (53) expresses the capacity of the satellite channel under the assumption that the transponder transmits continuously. If the channel is used in burst mode the transponder will emit power only when a data burst occurs, and the average power out of the transponder will be less than the burst power. Let $D$ be the ratio of the average power transmitted to the power transmitted during a data burst. For a linear transponder $D$ will equal the channel traffic $G$, and for a hard-limiting transponder $D$ will equal the duty cycle of the channel. For both the unslotted and slotted ALOHA channel the duty cycle is $1 - e^{-G}$. Thus for a linear transponder

$$D = G, \tag{54a}$$

while for a hard-limiting transponder

$$D = 1 - e^{-G}. \tag{54b}$$

Footnote 2: Our analysis is of significance only for $G < 1$. The analysis is formally correct, however, for all $G$, even though the designation of the power transmitted during bursts as "peak power" becomes inappropriate for the linear transponder case when $G > 1$. (In such a situation the "peak power" is less than the average power.)

[Fig. 10. Linear transponder; unslotted channel. Curves are labeled by signal-to-noise ratio (dB); horizontal axis: channel traffic G.]

Note that in the case of a hard-limiting transponder with small values of channel traffic, the duty cycle approaches that of a linear transponder. If we retain $P$ as the notation for the average signal power received at the earth station, the power received during a data burst will be $P/D$. Thus (53) should be modified in two ways.

1) We replace $W$ by $SW$ to account for the fact that the channel is only used intermittently.
2) We replace $P$ in (53) by $P/D$ to keep the average power of the channel fixed at $P$.

We should note that when we make these changes, we are assuming that the packet length of the system is long enough so that the asymptotic assumptions which are used to derive (53) still apply. In practice, this is not a problem. With these two changes then, we have four different cases.

1) Unslotted channel, linear transponder:

$$C_1 = G e^{-2G}\, W \log\left(1 + \frac{P}{GN}\right). \tag{55a}$$

2) Unslotted channel, limiting transponder:

$$C_2 = G e^{-2G}\, W \log\left(1 + \frac{P}{(1 - e^{-G})N}\right). \tag{55b}$$

3) Slotted channel, linear transponder:

$$C_3 = G e^{-G}\, W \log\left(1 + \frac{P}{GN}\right). \tag{55c}$$

4) Slotted channel, limiting transponder:

$$C_4 = G e^{-G}\, W \log\left(1 + \frac{P}{(1 - e^{-G})N}\right). \tag{55d}$$
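As a numerical illustration of (55a)-(55d), the following Python sketch evaluates the normalized capacities $C_i/C$. It assumes, from the prefactors appearing in (55a)-(55d), that the throughput is $S = Ge^{-2G}$ for the unslotted channel and $S = Ge^{-G}$ for the slotted channel; the function names and the choice $G = 0.1$ are illustrative and are not taken from the paper.

```python
import math


def throughput(G, slotted):
    """Throughput S assumed from (55a)-(55d): G*exp(-G) slotted, G*exp(-2G) unslotted."""
    return G * math.exp(-G if slotted else -2.0 * G)


def power_ratio(G, linear):
    """D of (54a)/(54b): G for a linear transponder, 1 - exp(-G) for a hard-limiting one."""
    return G if linear else 1.0 - math.exp(-G)


def normalized_capacity(G, snr_db, slotted, linear):
    """C_i / C, with C from (53) and C_i from (55a)-(55d); W and the log base cancel."""
    snr = 10.0 ** (snr_db / 10.0)  # P/N as a power ratio
    S = throughput(G, slotted)
    D = power_ratio(G, linear)
    return S * math.log1p(snr / D) / math.log1p(snr)


# Tabulate C_1/C, C_2/C, C_3/C, C_4/C at G = 0.1 (an arbitrary illustrative value).
for snr_db in (-20, -10, 0, 10, 20):
    row = [normalized_capacity(0.1, snr_db, slotted, linear)
           for slotted in (False, True) for linear in (True, False)]
    print(snr_db, " ".join("%.3f" % value for value in row))
```

Because $W$ and the base of the logarithm cancel in the ratio, only $G$ and $P/N$ enter the computation.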
We have calculated the normalized capacities $C_i/C$ for $i = 1, 2, 3, 4$ for different values of $P/N$, the signal-to-noise ratio of the earth station when the transponder operates continuously. The normalized capacities are plotted in Figs. 10, 11, 12, and 13 for $P/N$ equal to -20, -10, 0, 10, and 20 dB. Of particular interest in these curves is the fact that the highest values of $C_i/C$ occur just where we would want them to occur: for small values of channel traffic ($G$) and for small earth stations (low $P/N$). In the limit we have (for a fixed value of $G$)
$$\lim_{P/N \to 0} \frac{C_i}{C} = \frac{S}{D}, \qquad i = 1, 2, 3, 4 \tag{56}$$

so that

1) unslotted channels, linear transponder

$$\lim_{P/N \to 0} \frac{C_1}{C} = e^{-2G} \tag{57a}$$

2) unslotted channels, limiting transponder

$$\lim_{P/N \to 0} \frac{C_2}{C} = \frac{G e^{-2G}}{1 - e^{-G}} \tag{57b}$$

3) slotted channel, linear transponder

$$\lim_{P/N \to 0} \frac{C_3}{C} = e^{-G} \tag{57c}$$

4) slotted channel, limiting transponder

$$\lim_{P/N \to 0} \frac{C_4}{C} = \frac{G e^{-G}}{1 - e^{-G}} \tag{57d}$$

and in all cases

$$\lim_{G \to 0}\, \lim_{P/N \to 0} \frac{C_i}{C} = 1. \tag{58}$$

Thus this multiplexing technique allows a network of small inexpensive earth stations to achieve the maximum value of channel capacity, at the same time providing complete connectivity and multiple access capability.

[Fig. 11. Limiting transponder; unslotted channel. Curves are labeled by signal-to-noise ratio (dB); horizontal axis: channel traffic G.]

[Fig. 12. Linear transponder; slotted channel. Curves are labeled by signal-to-noise ratio (dB); horizontal axis: channel traffic G.]

[Fig. 13. Limiting transponder; slotted channel. Curves are labeled by signal-to-noise ratio (dB); horizontal axis: channel traffic G.]

VI. BACKGROUND AND ACKNOWLEDGMENT
The term "packet broadcasting" was first coined by Robert Metcalfe in his Ph.D. dissertation [28]. As is often the case with simple ideas, the concept of combining burst transmission and Poisson user statistics to provide random access to a channel has occurred independently to a number of investigators. The first attempt at an analysis of such a system of which I am aware is contained in an internal Bell Laboratories memorandum by Schroeder [29], suggested by an earlier paper by Pierce and Hopper [30]. Two other early related papers were written by Costas [31] and Fulton [32]. Of course, a theoretical analysis is not necessary in order to build such a system, and anyone who has sat in a taxi listening to the staccato voice bursts of a radio dispatcher and a set of taxi drivers sharing a single voice channel will recognize the operation of a voice packet broadcasting channel using a carrier sense protocol. And even after an analysis is available, the concept of packet broadcasting may be suggested without reference to the theory [33].

The first papers analyzing packet broadcasting in the form implemented in the ALOHA System [6] assumed fixed packet throughput and a retransmission protocol as described in Section II-D-1). This approach leads to a number of questions involving optimum retransmission policy [28], the behavior of the channel with a finite number of users [39], stability of the channel [13], and transmission of long files by means of various reservation schemes [34], [44]. A comprehensive treatment of these as well as other interesting packet broadcasting questions may be found in Kleinrock [42]. In this paper we have taken a different approach by assuming a given packet traffic rather than throughput. With such a starting point, the questions mentioned above do not assume key importance in the theory, although their practical importance is not diminished.

Much of the theory of packet broadcasting was developed in two working groups sponsored by the Advanced Research Projects Agency of the Department of Defense. These groups circulated a private series of working papers, the ARPANET Satellite System notes (ASS notes) and the Packet Radio Temporary notes (PRT notes), where many of the theoretical results described or referenced in this paper appeared for the first time. Unfortunately, the several references to ASS notes in papers subsequently published in the open literature may have produced some confusion in the minds of those trying to trace the references. Among the most significant of the ASS note and PRT note results was the first derivation of the capacity of a slotted ALOHA channel and the first analysis of the use of the capture effect in packet broadcasting, both by Larry Roberts. That note has since been republished in the open literature [41]. The results of Section III-A dealing with two different packet lengths were suggested by an ASS note written by Tom Gaarder, and the results of Section III-B dealing with the excess capacity of a slotted channel were suggested by an ASS note written by Randy Rettberg. Other problems which were first analyzed in ASS notes or PRT notes but not emphasized in this paper include various packet broadcasting reservation systems [22], [35], [36], carrier sense packet broadcasting [18], [19], and questions dealing with packet routing and protocol issues in a network of repeaters [37]. The reader interested in theoretical network protocol questions should also see Gallager [38], although this work did not originate in an ASS note or PRT note.

The first system to employ packet broadcasting techniques was the ALOHA System computer network at the University of Hawaii in 1970. Subsequently, packet repeaters were added to the network and packet broadcasting by satellite was demonstrated in the system. Some of the people involved in the implementation and development of the system were Richard Binder, Chris Harrison, Alan Okinaka, and David Wax. The historical relevance of [29] and [32] was pointed out to me by Joe Aein, to whom I am indebted, in spite of my embarrassment at having forgotten I was thesis supervisor on the second of these papers.
REFERENCES

[1] L. G. Roberts and B. D. Wessler, "Computer network development to achieve resource sharing," in 1970 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 36. Montvale, NJ: AFIPS Press, 1970, pp. 543-549.
[2] L. G. Roberts, "Data by the packet," IEEE Spectrum, vol. 11, pp. 46-51, Feb. 1974.
[3] W. W. Chu, "A study of asynchronous time division multiplexing for time-sharing computer systems," in 1969 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 35. Montvale, NJ: AFIPS Press, 1969, pp. 669-678.
[4] F. Heart, R. Kahn, S. Ornstein, W. Crowther, and D. Walden, "The interface message processor for the ARPA computer network," in 1970 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 36. Montvale, NJ: AFIPS Press, 1970, pp. 551-567.
[5] H. Frank, I. Frisch, and W. Chou, "Topological considerations in the design of the ARPA computer network," in 1970 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 36. Montvale, NJ: AFIPS Press, 1970, pp. 581-587.
[6] N. Abramson, "The ALOHA system-Another alternative for computer communications," in 1970 Fall Joint Comput. Conf., AFIPS Conf. Proc., vol. 37. Montvale, NJ: AFIPS Press, 1970, pp. 281-285.
[7] R. E. Kahn, "The organization of computer resources into a packet radio network," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44, May 1975, pp. 177-186.
[8] N. Abramson and E. R. Cacciamani, Jr., "Satellites: Not just a big cable in the sky," IEEE Spectrum, vol. 12, pp. 36-40, Sept. 1975.
[9] I. T. Frisch, "Technical problems in nationwide networking and interconnection," IEEE Trans. Commun., vol. COM-23, pp. 78-88, Jan. 1975.
[10] R. M. Metcalfe and D. R. Boggs, "ETHERNET: Distributed packet switching for local computer networks," Commun. Ass. Comput. Mach., to be published.
[11] N. Abramson, "Packet switching with satellites," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42, 1973, pp. 695-702.
[12] R. D. Rosner, "Optimization of the number of ground stations in a domestic satellite system," in EASCON'75 Rec., Sept. 29-Oct. 1, 1975, pp. 64A-64J.
[13] A. B. Carleial and M. E. Hellman, "Bistable behavior of ALOHA-type systems," IEEE Trans. Commun., vol. COM-23, pp. 401-410, Apr. 1975.
[14] P. E. Jackson and C. D. Stubbs, "A study of multiaccess computer communications," in 1969 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 34. Montvale, NJ: AFIPS Press, 1969, pp. 491-504.
[15] T. J. Klein, "A tactical packet radio system," in Proc. Nat. Telecommun. Conf., New Orleans, LA, Dec. 1975.
[16] K. Ah Mai, "Organizational alternatives for a Pacific educational computer-communication network," ALOHA Syst. Tech. Rep. CN74-27, Univ. Hawaii, Honolulu, May 1974.
[17] R. Binder, R. Rettberg, and D. Walden, The Atlantic Satellite Packet Broadcast and Gateway Experiments. Cambridge, MA: Bolt Beranek and Newman, 1975.
[18] L. Kleinrock and F. A. Tobagi, "Packet switching in radio channels: Part I-Carrier sense multiple-access modes and their throughput-delay characteristics," IEEE Trans. Commun., vol. COM-23, pp. 1400-1416, Dec. 1975.
[19] F. A. Tobagi and L. Kleinrock, "Packet switching in radio channels: Part II-The hidden terminal problem in carrier sense multiple-access and the busy-tone solution," IEEE Trans. Commun., vol. COM-23, pp. 1417-1433, Dec. 1975.
[20] R. T. Chien, L. R. Bahl, and D. T. Tang, "Correction of two erasure bursts," IEEE Trans. Inform. Theory, vol. IT-15, pp. 186-187, Jan. 1969.
[21] N. Abramson, "Cyclic code groups," Problems of Inform. Transmission, Acad. Sci. USSR, Moscow, vol. 6, no. 2, 1970.
[22] L. G. Roberts, "Dynamic allocation of satellite capacity through packet reservation," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42, June 1973, pp. 711-716.
[23] M. J. Ferguson, "A study of unslotted ALOHA with arbitrary message lengths," in Proc. 4th Data Commun. Symp., Quebec, Canada, Oct. 7-9, 1975, pp. 5-20-5-25.
[24] R. Rettberg, "Random ALOHA with slots-excess capacity," ARPANET Satellite Syst. Note 18, NIC Document 11865, Stanford Res. Inst., Menlo Park, CA, Oct. 11, 1972.
[25] N. Abramson, "Excess capacity of a slotted ALOHA channel (continued)," ARPANET Satellite Syst. Note 30, NIC Document 13044, Stanford Res. Inst., Menlo Park, CA, Dec. 6, 1972.
[26] L. Schiff, "Random-access digital communication for mobile radio in a cellular environment," IEEE Trans. Commun., vol. COM-22, pp. 688-692, May 1974.
[27] H. Frank, R. M. Van Slyke, and I. Gitman, "Packet radio network design-System considerations," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44, May 1975, pp. 217-232.
[28] R. M. Metcalfe, "Packet communication," Rep. MAC TR-114, Project MAC, Massachusetts Inst. Technol., Cambridge, July 1973.
[29] M. R. Schroeder, "Nonsynchronous time multiplex system for speech transmission," Bell Lab. Memo., Jan. 19, 1959.
[30] J. R. Pierce and A. L. Hopper, "Nonsynchronous time division with holding and random sampling," Proc. IRE, vol. 40, pp. 1079-1088, Sept. 1952.
[31] J. P. Costas, "Poiss ... IRE, vol. 47, p. 2058, Dec. 1959.
[32] F. F. Fulton, Jr., "Channel utilization by intermittent transmitters," Tech. Rep. 2094-2, Stanford Electron. Lab., Stanford Univ., Stanford, CA, May 12, 1961.
[33] K. D. Levin, "The overlapping problem and performance degradation of mobile digital communication systems," IEEE Trans. Commun., vol. COM-23, pp. 1342-1347, Nov. 1975.
[34] R. Binder, "A dynamic packet-switching system for satellite broadcast channels," in Conf. Rec., Int. Conf. Commun., vol. III, June 1975, pp. 41-1-41-5.
[35] S. S. Lam and L. Kleinrock, "Packet switching in a multiaccess broadcast channel: Dynamic control procedures," IEEE Trans. Commun., vol. COM-23, pp. 891-904, Sept. 1975.
[36] L. Kleinrock and S. S. Lam, "Packet switching in a multiaccess broadcast channel: Performance evaluation," IEEE Trans. Commun., vol. COM-23, pp. 410-423, Apr. 1975.
[37] I. Gitman, "On the capacity of slotted ALOHA networks and some design problems," IEEE Trans. Commun., vol. COM-23, pp. 305-317, Mar. 1975.
[38] R. G. Gallager, "Basic limits on protocol information in data communication networks," IEEE Trans. Inform. Theory, vol. IT-22, pp. 385-398, July 1976.
[39] M. J. Ferguson, "On the control, stability, and waiting time in a slotted ALOHA random-access system," IEEE Trans. Commun., vol. COM-23, pp. 1306-1311, Nov. 1975.
[40] J. J. Metzner, "On improving utilization in ALOHA networks," IEEE Trans. Commun., vol. COM-24, pp. 447-448, Apr. 1976.
[41] L. G. Roberts, "ALOHA packet system with and without slots and capture," Comput. Commun. Rev., vol. 5, pp. 28-42, Apr. 1975.
[42] L. Kleinrock, Queueing Systems, Volume 2: Computer Applications. New York: Wiley, 1976, pp. 360-407.
[43] I. Gitman, R. M. Van Slyke, and H. Frank, "On splitting random accessed broadcast communication channels," in Proc. 7th Hawaii Int. Conf. Syst. Sci.-Suppl. on Comput. Nets, Western Periodicals, Jan. 1974, pp. 81-85.
[44] W. Crowther, R. Rettberg, D. Walden, S. Ornstein, and F. Heart, "A system for broadcast communication: Reservation ALOHA," in Proc. 6th Hawaii Int. Conf. Syst. Sci., Western Periodicals, Jan. 1973, pp. 371-374.
Norman Abramson (S'55-M'59-F'73) received the A.B. degree in physics from Harvard University, Cambridge, MA, the M.A. degree in physics from the University of California, Los Angeles, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA.

In 1966 he was appointed Professor of Electrical Engineering and Professor of Information Sciences at the University of Hawaii, Honolulu. Before moving to Hawaii he taught communication theory at Stanford, Berkeley, and Harvard. From 1968 to 1970 he served as Chairman of the Information Sciences Program at the University of Hawaii. He is now Director of The ALOHA System, a university research project concerned with new forms of computer communication networks. He has served as a consultant in communication theory, satellite data transmission, and computer networks to several government and industrial laboratories in the United States. He has also served as a UN expert in computer networks at the International Computer Education Center, Budapest, Hungary. Dr. Abramson is the author of Information Theory and Coding (New York: McGraw-Hill) and co-editor of Computer-Communication Networks (Englewood Cliffs, NJ: Prentice-Hall). He is a past Chairman of the Administrative Committee of the IEEE Information Theory Group and a member of the Editorial Board of IEEE SPECTRUM. In 1972 he was the recipient of the IEEE Sixth Region Achievement Award.
Maximum Likelihood Receiver for Multiple Channel Transmission Systems

W. VAN ETTEN

Abstract-A maximum likelihood (ML) estimator for digital sequences disturbed by Gaussian noise, intersymbol interference (ISI) and interchannel interference (ICI) is derived. It is shown that the sampled outputs of the multiple matched filter (MMF) form a set of sufficient statistics for estimating the input vector sequence. Two ML vector sequence estimation algorithms are presented. One makes use of the sampled output data of the multiple whitened matched filter and is called the vector Viterbi algorithm. The other one is a modification of the vector Viterbi algorithm and uses directly the sampled output of the MMF. It appears that, under a certain condition, the error performance is asymptotically as good as if both ISI and ICI were absent.
I. INTRODUCTION

It was first pointed out by Shnidman [1] that intersymbol interference (ISI) and crosstalk between multiplexed signals are essentially identical phenomena. Kaye and George have worked out this idea by investigating the transmission of multiplexed signals over multiple channel and diversity systems [2]. The author of the present concise paper has presented a unified theory for treating intersymbol interference and interchannel interference (ICI) as one type of disturbance [3]. We will call the combined effect of these disturbances multidimensional interference (MDI). In the following the essentials of [3] are summarized. The generalized Nyquist criterion formulated by Shnidman is restated in matrix notation. Furthermore an optimal linear receiver is derived, consisting of a multiple matched filter (MMF) followed by a multiple tapped delay line (MTDL). Minimum error probability is used as the optimization criterion, and it appears that this optimum linear receiver has the same structure as the receiver derived by Kaye and George under the minimum mean-square error criterion. For a suboptimum criterion (minimum Pr(e) under the constraint that the multidimensional Nyquist criterion is satisfied) a theorem is given to calculate the tap coefficients. Up to this point it appears that several concepts known from ISI literature can be generalized for MDI. Recently maximum likelihood (ML) sequence estimation of data disturbed by noise and ISI received considerable attention [4]-[6]. Now the question arises whether these concepts can also be generalized for sequences transmitted over multiple channel systems where the output data are disturbed by noise and MDI. This concise paper gives a positive answer to this question.

II. THE MULTIPLE CHANNEL COMMUNICATION MODEL
The transmission system to be considered in this concise paper has $M$ inputs and $M$ outputs. To each input $j$ a data sequence $\sum_l a_{l,j}\,\delta(t - lT)$ is applied which we want to estimate at the receiving end of the communication system. The symbols $a_{l,j}$ are elements of the alphabet $\{0, 1, \ldots, L-1\}$. Symbols that are applied to the several inputs of the system at the same instant $lT$ are ordered systematically in the input vector

$$x_l \triangleq \begin{pmatrix} a_{l,1} \\ a_{l,2} \\ \vdots \\ a_{l,M} \end{pmatrix}. \tag{1}$$

With the input vector sequence we associate the vector $D$ transform

$$x(D) \triangleq \sum_l x_l D^l \tag{2}$$

where $D$ is the delay operator. In our investigations a linear, dispersive, and time-invariant multiple channel model is assumed (Fig. 1). This means that a linear relation exists between each input and each output signal and that the output signal due to the excitation of more than one input is the sum of the individual responses to the inputs in question. The relation between input $j$ and output $i$ is denoted by the impulse response $c_{ij}(t)$. All these responses are assumed to be square-integrable and of finite duration. Further we assume that the output signals are disturbed by MDI and additive, zero-mean, white Gaussian noise. Each output $i$ is corrupted by a different noise signal $n_i(t)$, but it is assumed that these noise signals are uncorrelated and all have the same double-sided spectral density $N_0$. These assumptions are not a restriction of the generality as is shown in [3].

Paper approved by the Associate Editor for European Contributions of the IEEE Communications Society for publication without oral presentation. Manuscript received April 14, 1975; revised July 16, 1975. The author is with the Department of Electrical Engineering, Eindhoven Institute of Technology, Eindhoven, The Netherlands.

Reprinted from IEEE Transactions on Communications, February 1976. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

III. THE STATISTICAL SUFFICIENCY OF THE MMF OUTPUT

In this section we shall show that if the MMF, as defined in [3], is used as multiple linear receiving filter, then the sampled outputs of this MMF form a set of sufficient statistics for estimating the vector input sequence $x(D)$. The impulse response $c_{ij}(t)$ is considered as an element of a matrix

$$C(t) \triangleq \begin{pmatrix} c_{11}(t) & \cdots & c_{1M}(t) \\ \vdots & & \vdots \\ c_{M1}(t) & \cdots & c_{MM}(t) \end{pmatrix} \tag{3}$$

which defines the behavior of the multiple channel system. If the MMF is described in an analog way, it will be clear that its response is denoted by $C^T(-t)$. Assume that the multiple channel system is excited by an arbitrary, single input vector $x$. Defining in this case the signal at output $i$ of the multiple channel system by $s_i(t)$, we can write the total system output as a vector as follows:

$$s(t) \triangleq \begin{pmatrix} s_1(t) \\ \vdots \\ s_M(t) \end{pmatrix} \tag{4}$$

called the vector output signal. The noise is also given as a vector
$$n(t) \triangleq \begin{pmatrix} n_1(t) \\ \vdots \\ n_M(t) \end{pmatrix} \tag{5}$$

called the vector noise.

[Fig. 1. Multiple channel communication model.]

In the following we shall use several times the inner product of matrices, the components of which consist of time functions. Such a product is denoted as

$$\langle A(t), B(t) \rangle \triangleq \int_{-\infty}^{\infty} A(t) B(t)\, dt. \tag{6}$$

The sampled output of the MMF, in the absence of noise, is given by the signal vector

$$s = \langle C^T(t), s(t) \rangle \tag{7}$$

whereas the inverse transformation from signal vector to output vector signal is

$$s(t) = [C^T(t)]^T G s = C(t) G s \tag{8}$$

where $G$ is a matrix to be determined. Substituting (8) in (7) gives

$$G = \langle C^T(t), C(t) \rangle^{-1}. \tag{9}$$

So the systems to be treated must have the property that the matrix $G$ exists. This requirement, however, is quite trivial, because for systems not possessing this property it is impossible to recover even a single input vector from the sampled MMF outputs. The sampled output noise, if the signal is absent, can be written as

$$n = \langle C^T(t), n(t) \rangle. \tag{10}$$

According to (10) the relevant vector noise, being that part of the input vector noise that is left after projection of $n(t)$ onto the signal space, is denoted by

$$n_r(t) = [C^T(t)]^T G n = C(t) G n. \tag{11}$$

By means of the definition

$$v \triangleq s + n \tag{12}$$

the equivalent received vector signal is written as

$$v(t) = C(t) G v \tag{13}$$

which means that for the sampled output there is no difference whether the true received vector signal $s(t) + n(t)$ or the vector signal $v(t)$ is supplied to the input of the MMF. Writing out (13) yields

$$v(t) = C(t) G v = C(t) G s + C(t) G n = s(t) + n_r(t). \tag{14}$$

Thus $C(t)$ is a basis for the signal space spanned by both $s(t)$ and $n_r(t)$ [7, ch. 4], which proves that the sampled MMF output is a sufficient statistic for estimation of a single input vector $x$. Now the following theorem can be stated.

Theorem 1: If at each instant $lT$ a vector $x_l$ is transmitted, then the vector output sequence

$$v(D) \triangleq \sum_l v_l D^l \tag{15}$$

forms a set of sufficient statistics for estimating the vector input sequence $x(D)$ (see [4] and [7]).

IV. THE MULTIPLE WHITENED MATCHED FILTER

Now consider the system consisting of the channel in cascade with the MMF as a multiple channel system with $M$ inputs and $M$ outputs. The impulse response from input $j$ to output $n$ of this system is called $v_{nj}(t)$ and can be written as

$$v_{nj}(t) = \sum_{i=1}^{M} c_{in}(-t) * c_{ij}(t) \tag{16}$$

where $*$ means convolution. Define

$$V_l \triangleq \begin{pmatrix} v_{11}(lT) & \cdots & v_{1M}(lT) \\ v_{21}(lT) & \cdots & v_{2M}(lT) \\ \vdots & & \vdots \\ v_{M1}(lT) & \cdots & v_{MM}(lT) \end{pmatrix} \tag{17}$$

and

$$V(D) \triangleq \sum_l V_l D^l. \tag{18}$$

By means of (16) it is easy to see that (18) is equivalent to

$$V(D) = \langle C^T(D^{-1},t), C(D,t) \rangle \tag{19}$$

where $C(D,t)$ is a matrix with components consisting of the chip $D$ transforms [4] of the components of $C(t)$. The crosscorrelation of the output noise signals at outputs $n$ and $m$ is given by

(20)

Sampling this function we define its $D$ transform as follows:

$$\phi_{nm}(D) \triangleq \sum_l \phi_{nm}(lT) D^l. \tag{21}$$

If all $\phi_{nm}(D)$ are collected in a matrix we get the spectral matrix
$$\Phi(D) = N_0 \langle C^T(D,t), C(D^{-1},t) \rangle. \tag{22}$$

Relation (22) can easily be verified by means of (20). In [8] and [9] it is shown that a matrix $H(D^{-1})$ can be found such that

$$\Phi(D) = N_0 H(D) H^T(D^{-1}) \tag{23}$$

with both $H(D^{-1})$ and $H^{-1}(D^{-1})$ stable and nonanticipatory. Comparing (19) and (22) it is obvious that

$$V(D) = H(D^{-1}) H^T(D). \tag{24}$$

Now we conclude that the sampled output of the MMF can be written as

$$v(D) = H(D^{-1}) H^T(D) x(D) + H(D^{-1}) n(D) \tag{25}$$

where $n(D)$ is the sampled input noise vector sequence. The output noise

$$n'(D) = H(D^{-1}) n(D) \tag{26}$$

is colored Gaussian with spectral matrix $\Phi(D)$. This follows from

$$\Phi(D) = E[H(D) n(D^{-1}) (H(D^{-1}) n(D))^T] = E[H(D) n(D^{-1}) n^T(D) H^T(D^{-1})] = N_0 H(D) H^T(D^{-1}). \tag{27}$$

From (25) it is seen that the output noise is whitened by the following operation:

$$z(D) \triangleq H^{-1}(D^{-1}) v(D) = H^T(D) x(D) + n(D) = y(D) + n(D) \tag{28}$$

which means physically that an MTDL [3] with transfer $H^{-1}(D^{-1})$ is placed after the MMF. It has been mentioned in the foregoing that $H^{-1}(D^{-1})$ is stable and nonanticipatory and thus realizable. The MMF followed by the MTDL is called multiple whitened matched filter and is characterized by its chip $D$ transform

(29)

If the impulse response from input $n$ to output $m$ is denoted by $w_{mn}(t)$, the set of functions $w_{mn}(t - kT)$ is orthonormal in both the time and space dimension as is seen from

$$H^{-1}(D) \langle C^T(D,t), C(D^{-1},t) \rangle \{H^{-1}(D^{-1})\}^T = H^{-1}(D) V(D^{-1}) \{H^T(D^{-1})\}^{-1} = H^{-1}(D) H(D) H^T(D^{-1}) \{H^T(D^{-1})\}^{-1} = I. \tag{30}$$

In the foregoing section we concluded that $v(D)$ forms a set of sufficient statistics for estimation of $x(D)$, but $z(D)$ is found by the reversible linear transformation $H^{-1}(D^{-1})$ on $v(D)$. Thus $z(D)$ forms a set of sufficient statistics for estimating $x(D)$ also. This section is summarized in the following theorem.

Theorem 2: Let $C(t)$ be the matrix of impulse responses of the multiple channel transmission system and $H(D^{-1}) H^T(D)$ a factorization of

$$V(D) = \langle C^T(D^{-1},t), C(D,t) \rangle \tag{31}$$

such that both $H(D^{-1})$ and $H^{-1}(D^{-1})$ are stable and nonanticipatory. Then the multiple filter whose chip $D$ transform is

(32)

is realizable and is called a multiple whitened matched filter and its sampled outputs give a vector sequence

$$z(D) = H^T(D) x(D) + n(D) \tag{33}$$

in which $n(D)$ is a white Gaussian noise vector sequence, and which is a set of sufficient statistics for estimation of the vector input sequence $x(D)$, where "white Gaussian" for $n(D)$ is to be interpreted in both the discrete time and space dimension.

The multiple whitened matched filter found in this section is a generalized version of the whitened matched filter derived in [4]. This generalized filter is capable of optimizing the signal-to-noise ratio of the outputs of a multiple channel transmission system in which both ISI and ICI together with noise contribute to the disturbance, under the constraint that the output noise must be white in the two-dimensional sense.

V. THE VECTOR VITERBI ALGORITHM

In the preceding sections we have derived a structure giving a set of sufficient statistics for estimating the input vector sequence of a multiple channel transmission system from the observations of the output. This output is disturbed by MDI and noise. The noisy parts of the multiple whitened matched filter output samples are shown to be uncorrelated and thus independent, since we have assumed that the noise is Gaussian. From this it follows that the Viterbi algorithm is a powerful tool to perform ML estimation of the input vector sequence $x(D)$. The vector Viterbi algorithm is a vector version of the algorithm used to make ML estimations on digital sequences and which is extensively described in [4] and [5]. The vector sequence $y(D)$ may be considered to be generated by a multiple finite state machine, driven by an input vector sequence $x(D)$ (see Fig. 2). As the state of this finite state machine we define

$$s_l \triangleq (x_{l-1}, x_{l-2}, \ldots, x_{l-N}) \tag{34}$$

where $N$ is the degree of the matrix polynomial $H^T(D)$. The state can take on $L^{NM}$ different values. We can depict the successive states of the multiple finite state machine, together with all allowable transitions, in a trellis diagram. Each transition $T_k$ in this trellis diagram is associated with an input vector $x_{k,l}$ and a certain value of the output signal $y_{k,l}$. Given the observation $z_l$, the log likelihood of transition $T_k$ is given by

$$\ln p(z_l - y_{k,l}) = -\ln \left(\sqrt{2\pi N_0}\right)^M - \frac{1}{2N_0} \sum_{i=1}^{M} (z_{l,i} - y_{k,l,i})^2 \tag{35}$$

where $z_{l,i}$ and $y_{k,l,i}$ are the $i$th components of, respectively, $z_l$ and $y_{k,l}$. In ML sequence estimation the first term of the right member of (35), being independent of the transition, can be omitted and the same holds for the factor $1/2N_0$ in the second term. The squared distance of an observation at instant $lT$ to a certain allowable transition $T_k$, characterized by $y_{k,l}$, is defined by
$$D_{k,l}^2 \triangleq \sum_{i=1}^{M} (z_{l,i} - y_{k,l,i})^2. \tag{36}$$

[Fig. 2. Model of a multiple finite state machine.]
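The relation between the log likelihood (35) and the squared distance (36) can be illustrated with a short Python sketch. The observation, the candidate branch outputs, and the value of $N_0$ below are hypothetical numbers chosen only for illustration; the function names are not from the paper.

```python
import math


def squared_distance(z, y):
    """Euclidean squared distance (36) between an observation z_l and a branch output y_{k,l}."""
    return sum((zi - yi) ** 2 for zi, yi in zip(z, y))


def log_likelihood(z, y, n0):
    """Log likelihood (35) for M independent Gaussian noise components of variance n0."""
    m = len(z)
    return -m * math.log(math.sqrt(2.0 * math.pi * n0)) - squared_distance(z, y) / (2.0 * n0)


# Hypothetical example: M = 2 channels and three candidate transitions.
z = [0.9, 0.2]
candidates = {"T0": [0.0, 0.0], "T1": [1.0, 0.0], "T2": [1.0, 1.0]}
n0 = 0.5

best_by_likelihood = max(candidates, key=lambda k: log_likelihood(z, candidates[k], n0))
best_by_distance = min(candidates, key=lambda k: squared_distance(z, candidates[k]))
print(best_by_likelihood, best_by_distance)
```

Since the first term of (35) and the factor $1/2N_0$ are the same for every transition, ranking transitions by largest log likelihood and by smallest squared distance gives the same choice, which is why only the squared distance needs to be accumulated.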
mate X(D) for x(D) that vector sequence which maximizes In p [u(t) t E(D)] , which means minimizing over all allowable ~(D)
The updating of the metrics, belonging to the several states, together with the updating of the corresponding path-registers proceeds as follows. 1) The metrics of all the states that belong to transitions, terminating in the same state, are increased with the corresponding squared distance D h ,2 . 2) Select the smallest increased metric as survivor-metric for the new state. The path-register ~f this new state is to be filled with the content of the path-register of the old state of the selected transition, This 'new path-register content then updated with the elements of the input" alphabet that belong to the selected transition. The vector Viterbi algorithm does not differ fundamentally from the scalar version; the only differences, which are in fact generalizations, are as follows. 1) The operation "squared distance computation" is a computation in the vector sense defined by the Euclidean squared distance (36). 2) Each element of the path-register consists of M components and must be shifted and· updated parallel in the pathregister. At this point the vector Viterbi algorithm is in fact reduced to the scalar version and we refer to [4] and (5) for further details. It will be clear that in a multiple channel transmission system the number of states is growing exponentially with the number of channels.
VI. THE VECTOR UNGERBOECK ALGORITHM
Ungerboeck has given an alternative recursive algorithm for making ML sequence estimations on data that are disturbed by ISI and white Gaussian noise [6]. Using this algorithm, the tapped delay line is omitted and the sampled output of the matched filter is directly used as input for the algorithm. In the following we shall generalize the Ungerboeck algorithm for ML vector sequence estimation of data that are disturbed by MDI and white Gaussian noise. If a vector sequence $x(D)$ is transmitted, the corresponding received vector signal is defined as follows:
$$u(t) = \sum_l C(t - lT)\, x_l + n(t). \qquad (37)$$
Among all possible input sequences $\xi(D)$ we choose as estimate the one for which
$$J = \Big\| u(t) - \sum_l C(t - lT)\, \xi_l \Big\|^2 = \Big\langle \Big[ u(t) - \sum_i C(t - iT)\, \xi_i \Big]^T, \Big[ u(t) - \sum_k C(t - kT)\, \xi_k \Big] \Big\rangle \qquad (38)$$
is minimum. Writing out (38) the expression for J becomes
$$J = \langle u^T(t), u(t) \rangle - \Big\langle u^T(t), \sum_k C(t - kT)\, \xi_k \Big\rangle - \Big\langle \sum_l \xi_l^T C^T(t - lT), u(t) \Big\rangle + \Big\langle \sum_l \xi_l^T C^T(t - lT), \sum_k C(t - kT)\, \xi_k \Big\rangle. \qquad (39)$$
Define
$$v_l \triangleq \langle C^T(t - lT), u(t) \rangle. \qquad (40)$$
This vector is interpreted as the sampled output of the MMF. By means of definition (40) J is written as
$$J = \langle u^T(t), u(t) \rangle - 2 \sum_l \xi_l^T v_l + \sum_l \sum_k \xi_l^T V_{l-k}\, \xi_k. \qquad (41)$$
The first term of (41) is independent of $\xi$, and thus may be ignored during the minimization process. The metric $J(\xi(D))$ can be calculated in a recursive manner.
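As a numerical sanity check on the step from (38) to (41), the sketch below evaluates both forms of J on a sampled time grid and compares them. Everything here is an assumption made for illustration: the inner product is approximated by a sum over samples, $V_{l-k}$ is taken to be $\langle C^T(t - lT), C(t - kT) \rangle$, the channel pulse is random, and the helper names are hypothetical.

```python
import numpy as np


def shifted(C, l, ns, total):
    """Samples of C(t - l*T) on a grid of `total` points, shape (total, M, M)."""
    out = np.zeros((total,) + C.shape[1:])
    out[l * ns : l * ns + C.shape[0]] = C
    return out


def direct_metric(u, C, xi, ns):
    """Squared distance (38): || u(t) - sum_l C(t - lT) xi_l ||^2."""
    total = u.shape[0]
    s = sum(shifted(C, l, ns, total) @ xi[l] for l in range(len(xi)))
    return float(np.sum((u - s) ** 2))


def expanded_metric(u, C, xi, ns):
    """Expanded form (41), built from MMF samples v_l and matrices V_{l-k}."""
    total = u.shape[0]
    Cs = [shifted(C, l, ns, total) for l in range(len(xi))]
    energy = float(np.sum(u ** 2))                       # <u^T, u>
    v = [np.einsum("tij,ti->j", Cl, u) for Cl in Cs]     # v_l = <C^T(t-lT), u>
    linear = sum(float(xi[l] @ v[l]) for l in range(len(xi)))
    quad = 0.0
    for l in range(len(xi)):
        for k in range(len(xi)):
            # <C^T(t-lT), C(t-kT)>, which depends only on l - k.
            V_lk = np.einsum("tij,tik->jk", Cs[l], Cs[k])
            quad += float(xi[l] @ V_lk @ xi[k])
    return energy - 2.0 * linear + quad


rng = np.random.default_rng(0)
M, ns, pulse_len, n_sym = 2, 4, 10, 5
total = (n_sym - 1) * ns + pulse_len
C = rng.normal(size=(pulse_len, M, M))          # made-up channel pulse
x = rng.choice([-1.0, 1.0], size=(n_sym, M))    # transmitted sequence
xi = rng.choice([-1.0, 1.0], size=(n_sym, M))   # candidate sequence
u = sum(shifted(C, l, ns, total) @ x[l] for l in range(n_sym))
u = u + 0.1 * rng.normal(size=u.shape)          # additive white noise

J_direct = direct_metric(u, C, xi, ns)
J_expanded = expanded_metric(u, C, xi, ns)
```

The two values should agree up to floating-point error, which illustrates why only the MMF samples $v_l$ and the fixed matrices $V_{l-k}$ are needed once the energy term is dropped.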
(42)
Fifty Years of Communications and Networking
with
$\varepsilon:\ e(D) = e_0 + e_1 D + \cdots + e_H D^H$
F(vz;
~l-N' •••, ~l)
= ~IT[2VI~VO~I-2
with 11
i: ~ ~l-h]
(43)
k==l
where N is determined by the length of the V(D) matrix sequence, according to
$$V(D) = \sum_{l=-N}^{N} V_l D^l. \qquad (44)$$
ei
(48)
11 2 ~ 00,
where $\delta_0$ denotes the minimum nonzero value of the Euclidean norm of the error vector $e_i$. This value equals the minimum distance
DO = min {I al,i i*i
I}.
- aZ,i
(49)
From [6] we know that the error event probability is written as
l=-N
Here the survivor-metric follows:
Jz is introduced, which is defined as where the subevents €1,
min
{J1( " ', ~l-N,
{···'~l-i..N}
and €2' are defined as follows.
+ e(D) is an allowable data vector sequence; noise vector sequence is such that x(D) + e(D) has ML (within the observation interval); and noise vector sequence is such that x(D) + e(D) has greater likelihood than x(D), but not necessarily ML.
xeD) is such that xeD)
J,(81) ~Jl ((,+l-iV, '.., (1)
g
€2)
(l-N+l
t
".,
~,)},
(45)
The sequence $(\ldots, \xi_{l-N})$ which results in a minimum of (45) is called the path history of the survivor state
(46)
It is easy to see that there are again $L^{NM}$ different survivor states. One can imagine that these survivor states correspond to the states of a finite state machine. From this point of view the principles of the Ungerboeck algorithm coincide from now on with those of the Viterbi algorithm. For further details see [6]. Expression (43) is now to be used for the calculation of the squared distance of an observation to an allowable transition, and the finite state machine has as many different states as the finite state machine of the Viterbi algorithm. Although at first glance the metric calculation of the Ungerboeck algorithm seems more complicated than that of the Viterbi algorithm, a second inspection of (43) shows that the metric updating is a rather simple operation from a programming point of view. Namely, the quantity
From the preceding section it is concluded that $\Pr(\varepsilon_2' \mid \varepsilon_1)$ is the probability that
$$J(x(D)) > J(x(D) + e(D)). \qquad (51)$$
It can be proven that inequality (51) is identical with
$$\delta^2(e) < 2\, \|V_0^{-1}\|_2 \sum_{l=0}^{H} e_l^T n_l' \qquad (52)$$
where $n_l'$ are the sample values of the noise at the output of the MMF. The quantity $\delta(e)$ is called the distance of the error event $\varepsilon$. Consider the random variable $\alpha$ given by the right member of (52):
$$\alpha \triangleq 2\, \|V_0^{-1}\|_2 \sum_{l=0}^{H} e_l^T n_l'. \qquad (53)$$
This random variable is Gaussian distributed with zero-mean and variance
only depends on the channel response, which is assumed to be fixed, and on the transitions to be considered. So this value can be stored in a memory and need not be calculated in real time.
$$E[\alpha^2] = 4 N_0\, \|V_0^{-1}\|_2\, \delta^2(e).$$
From this it follows that
VII. THE ERROR PERFORMANCE OF THE ML RECEIVER
(55)
The investigations given in this section are closely related to the methods given in [4] and [6]. Remember that $x(D)$ represents the transmitted vector sequence and that the vector sequence estimated by the ML receiver is denoted by $\hat{x}(D)$. Then $e(D) \triangleq \hat{x}(D) - x(D)$
(54)
(47)
defines the error vector sequence. Assuming stationarity, the starting point of an error event $\varepsilon$ can be associated with $t = 0$:
where the well-known $Q(\cdot)$ function is defined in [7]. Let $E$ be the set of all possible error events $\varepsilon$. Then the probability that any error event occurs becomes
$$\Pr(E) = \sum_{\varepsilon \in E} \Pr(\varepsilon). \qquad (56)$$
Fig. 3. Received signal set for the example.
Let $\Delta$ be the set of all possible $\delta(e)$ and $E_\delta$ the subset of error events for which $\delta(e) = \delta$. Then from (50) the error event probability is bounded by
(57) Because of the exponential behavior of the $Q(\cdot)$ function for large argument values, this expression will already at moderate signal-to-noise ratios be dominated by the term involving the minimum value $\delta_{\min}$ out of the set $\Delta$. At moderate and large signal-to-noise ratios $\varepsilon_2'$ implies $\varepsilon_2$ with a probability almost equal to one. For these SNR values $\Pr(E)$ is approximated by
$$\Pr(E) \approx Q\!\left( \frac{\delta_{\min}}{2 \sqrt{N_0}\, \|V_0^{-1}\|_2^{1/2}} \right) \sum_{\varepsilon \in E_{\delta_{\min}}} \Pr(\varepsilon_1).$$
M
H
= II II
i=1 1=0
L
-I el,i
(59)
L
II Vo- 112
t:
1=-00
"
I
L
(61)
As an example we take a multiple channel with M = 2. The components of the transmission matrix C(t) are as given in Fig. 3. We take T = 1 and for this system the V(D) matrix polynomial is as follows:
Vo =
5 [37 12]
144 12 1
[37
72
12
V1=V- 1 = -
37
12]
(62)
37
One can
1[6
IfI'(D) = -
II Vz 1l 2
flo
L-l
easily verify that this V(D) satisfies condition (60). Decomposition of V(D) according to (24) yields
with eZ,i the ith component of e" In the Appendix it is shown that under the constraint 1
ett
VIII. AN EXAMPLE
mm
)
) 1\21 12
Since $\delta_0^2 / \|V_0^{-1}\|_2$ is the total amount of energy that is measured at the receiving end at transmission of a single symbol out of the set $E_{\delta_0}$, the symbol error probability is not increased by MDI.
(58)
Assuming the input symbols a'J to be independent of each other and equiprobable, the probability of €1 is written as Pr (El)
<S o
Pr(e)~Q ( 2VNo 1\ VO-l
(60)
no error event has a smaller distance than the single error events with distance $\delta_0$. With a single error event we mean an error sequence that consists of one error vector ($e(D) = e_0$) of which only one component differs from zero. In this situation the single error events with distance $\delta_0$ dominate the expression for the error event probability, and the error event probability equals the symbol error probability
12
1
:] (2
+ D).
(63)
By means of this matrix sequence the given system is simulated on a minicomputer. In Fig. 4 the error probability for a binary alphabet $\{+1, -1\}$ is plotted as a function of the signal-to-noise ratio, together with the Pr(e) for isolated pulses. The two curves merge at a Pr(e) of about $10^{-4}$. So, for error probabilities smaller than $10^{-4}$, the performance of the ML receiver is as good as if MDI were absent. In the case of larger error probabilities the difference between the two curves is
Fig. 4. Symbol error probability Pr(e) versus signal-to-noise ratio.
Curve A: single pulse;
Curve B: monochannel with linear correction and bit-by-bit detection;
Curve C: multiple channel with M = 2, linear correction and bit-by-bit detection;
Curve D: monochannel with ML sequence estimation; and
Curve E: multiple channel with M = 2 and ML vector sequence estimation.
at most 1.2 dB. These results are compared with those of an optimum constrained linear receiver [3]. The difference between the linear receiver and the single-pulse performance is 1.7 dB, showing the superiority of the vector ML receiver. We also simulated an ML receiver for a monochannel with impulse response $c_{11}(t)$. Now the maximum difference with the single-pulse performance appears to be 1 dB, whereas the two curves also merge at a Pr(e) value of about $10^{-4}$. Linear correction with bit-by-bit detection gives an increase of 2.2 dB in this case.
N.B.: In the simulations the path-register length was 16 bits in all cases. The number of transmissions was chosen such that the real error probability lies, with probability 0.9, within an interval of 10 percent around the plotted value.
IX. SUMMARY AND CONCLUSIONS
It is shown that the MMF outputs form a set of sufficient statistics for estimating the transmitted vector sequence over a multiple channel system. A multiple whitened matched filter is derived, the output of which is used to perform ML vector sequence estimation by means of the vector version of the Viterbi algorithm. A modified algorithm, pointed out by Ungerboeck, is also generalized to combat the noise and MDI. If this algorithm is used, the MTDL is omitted and the sampled output of the MMF is directly used as input data for the algorithm. Finally, the error performance of the ML receiver for a multiple channel system disturbed by noise and MDI is calculated. From the latter investigations it follows that, under a certain constraint, for moderate and large SNR's the error performance is not substantially influenced by MDI, i.e., the symbol error probability is approximated by the value found if a single pulse is transmitted. It is concluded from this concise paper that ICI plays the same role as ISI. If these two disturbances are considered simultaneously, then MDI can, under the given constraints, be treated as a generalization of ISI, and the concepts of ML sequence estimation on data disturbed by noise and ISI are also generalized for noise and MDI.
APPENDIX
Let (64)
and let
lV,li=-oo be given and assume 00
1/ Vo- 1 lh ~' 1/ 1==-00
V,
1/2 .,.;;; 1.
(65)
The matrix $V_0$ equals
l)2(e)
~
(to -III
= II Vo-l \12
II elf
112 2
VO-l
112
·Ito
H
~
ekTVOek
k=O
lie,. 11
-
80 2 )
l~~ IIV llh 1
22-6021 + 62 0
(71)
(66)
Consider the first term of (66). Because Vo is positive definite we have the inequality (67)
where $\lambda_{\min}(V_0)$ is the smallest eigenvalue of $V_0$. Moreover,
This last inequality holds if (65) is satisfied.
ACKNOWLEDGMENT
The author would like to thank Prof. J. van der Plaats and Prof. J. P. M. Schalkwijk for the stimulating discussions on the subject. He also wants to acknowledge L. S. de Jong for giving the proof of the Appendix.
(68)
REFERENCES [1]
From (67) and (68) it follows
[2]
[3]
H
=E Ile k=O
2
(69)
k ll2 •
[4]
Consider now the second term of (66). Due to the Schwarz inequality and from what is given we have [51
[6}
<;11 Vo- 1 1l 2
~ III
H
E'
Z=-H
VO-l 11 2
it V,lI 2
l~~ II
V,
H
E
k=O
to
11
112
lI e ,+ k T 11 2 ·Uek
1\2
[8]
2 2
lie,. 11 -So2!. (70)
From (69) and (70) it follows
[7]
[9) [10]
[1] D. A. Shnidman, "A generalized Nyquist criterion and optimum linear receiver for a pulse modulation system," Bell Syst. Tech. J., vol. 46, pp. 2163-2177, Nov. 1967.
[2] A. R. Kaye and D. A. George, "Transmission of multiplexed PAM signals over multiple channel and diversity systems," IEEE Trans. Commun. Technol., vol. COM-18, pp. 520-526, Oct. 1970.
[3] W. van Etten, "An optimum linear receiver for multiple channel digital transmission systems," IEEE Trans. Commun. (Concise Papers), vol. COM-23, pp. 828-834, Aug. 1975.
[4] G. D. Forney, Jr., "Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-18, pp. 363-378, May 1972.
[5] J. K. Omura, "Optimal receiver design for convolutional codes and channels with memory via control theoretical concepts," Inform. Sci., vol. 3, pp. 243-266, 1971.
[6] G. Ungerboeck, "Adaptive maximum-likelihood receiver for carrier-modulated data-transmission systems," IEEE Trans. Commun., vol. COM-22, pp. 624-636, May 1974.
[7] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.
[8] D. N. Prabhakar Murthy, "Factorization of discrete-process spectral matrices," IEEE Trans. Inform. Theory (Corresp.), vol. IT-19, pp. 693-696, Sept. 1973.
[9] P. R. Motyka and J. A. Cadzow, "The factorization of discrete-process spectral matrices," IEEE Trans. Automat. Contr., vol. AC-12, pp. 698-707, Dec. 1967.
[10] F. B. Hildebrand, Methods of Applied Mathematics. Englewood Cliffs, NJ: Prentice-Hall, 1952.
An Optimum Linear Receiver for Multiple Channel Digital Transmission Systems
W. VAN ETTEN
Abstract-An optimum linear receiver for multiple channel digital transmission systems is developed for the minimum $P_e$ and for the zero-forcing criterion. A multidimensional Nyquist criterion is defined together with a theorem on the optimality of a finite length multiple tapped delay line. Furthermore, an algorithm is given to calculate the tap settings of this multiple tapped delay line. This algorithm simplifies in those cases where the noise is so small that it can be neglected. Finally, as an example the transmission of binary data over a cable, consisting of four identical wires symmetrically situated inside a cylindrical shield, is considered.
I. INTRODUCTION
In this paper we shall investigate the transmission of digital signals over a multiple channel system, where each channel is used to transmit a data sequence. This configuration is included in the more general structure considered by Kaye and George [1]. We, however, use a technique that leads to an optimum structure for both the zero-forcing and minimum error probability criterion, instead of the minimum mean-square error criterion Kaye and George used. Besides intersymbol interference (ISI), interchannel interference (ICI) can be one of the major problems in such a multiple channel digital transmission system. ISI is disturbance of an output signal by symbols that originate from the corresponding input but that are shifted in time with respect to the symbol of interest. ICI is disturbance of an output signal by symbols that do not originate from the corresponding input but from input symbols that belong to neighboring channels. We introduce the name multidimensional interference (MDI) for the combined effect of ISI and ICI. Because the equalization of ISI also changes the ICI at the output, and the other way round, only a simultaneous treatment of these two phenomena can be successful in combating the overall degradation.
In the following we generalize for MDI some techniques known from the ISI literature. As examples of systems where these methods can be applied, we mention multiwire cables and multichannel radio systems that make use of perpendicularly polarized waves in a common frequency band.
II. THE MULTIPLE CHANNEL COMMUNICATION MODEL
The multiple channel transmission system to be treated in this paper has M inputs and M outputs, where to each input j a data sequence $\sum_l a_{jl}\, \delta(t - lT)$ is applied which we want to detect at output j. The symbols $a_{jl}$ are elements of the alphabet $\{0, 1, \ldots, L - 1\}$ and are chosen equiprobable and independent of each other. In our investigations a linear, dispersive, and time-invariant channel model is assumed (Fig. 1); this means that a linear relation exists between each input and each output signal and that the output signal due to the excitation of more than one input is the sum of the individual responses to the several inputs. The relation between input j and output i is denoted by the impulse response $r_{ij}(t)$. It is assumed that the output signals are disturbed by MDI and that zero-mean white Gaussian noise is added to them. Each output is corrupted by a different noise signal $n_i(t)$.
III. THE OPTIMUM LINEAR RECEIVER
By means of an optimum linear receiver and bit-by-bit detection on each channel output we make an estimate of the several input sequences. The receiving filter is assumed to be linear in the sense described in the preceding section. The linear relation between input i and output n of this filter is denoted by the impulse response $h_{ni}(t)$ (see Fig. 2). The following method yields an optimum solution for the linear multiple channel receiving filter for both the zero-forcing (zero MDI) and minimum bit error probability criterion. Assuming that the several noise sample functions $n_i(t)$ are independent of each other, the noise variance at output n of the receiving filter can be written as follows:
Paper approved by the Associate Editor for European Contributions of the IEEE Communications Society for publication without oral presentation. Manuscript received November 27, 1974; revised April 3, 1975.
The author is with the Eindhoven University of Technology, Eindhoven, The Netherlands.
Reprinted from IEEE Transactions on Communications, August 1975.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
and the assumption that the noise functions ni (t) are uncorrelated are not a restriction of the generality, as is shown in Appendix II. With
",(t>
(7)
(6) reduces to
$$h_{ni}(t) = \sum_{j=1}^{M} \sum_{l} c_{njl}\, r_{ij}(t_s + lT - t). \qquad (8)$$
The structure of the entire receiving filter follows from this equation.
Fig. 1. Multiple channel communication model.
Fig. 2. Multiple linear receiving filter.
$$\sigma_n^2 = \sum_{i=1}^{M} N_i \int_0^\infty h_{ni}^2(\tau)\, d\tau \qquad (1)$$
--.w....--n
M
L L
where $N_i$ is the density of the noise spectrum of $n_i(t)$. In investigating the optimum structure of the linear receiving filter, a technique is used that is presented in [4] and [5]. This means that all signal values that contribute to the possible sample values of the signal at output n are fixed. Then the noise variance $\sigma_n^2$ is minimized subject to these constraints. Defining the input vector
Each $h_{ni}(t)$ consists of a bank of matched filters, the outputs of which are added, and the output signals of all $h_{ni}(t)$ which belong to the same receiving filter output n are added again. Assuming that $t_s$ is larger than the largest duration of all $s_n^k(t)$, a reduction of the receiving filter is possible; Fig. 3 depicts the result for M = 3, for instance. For ease of notation the time axis is shifted such that $t_s = 0$. At each filter input i we see an array of filters matched to the particular responses at channel output i due to the individual excitation of the several inputs. Then all the outputs of the filters matched to the responses due to the same input are summed to form the primed outputs 1'-2'-3'. This part of the filter we call the multiple matched filter (MMF) (inputs 1-2-3 and outputs 1'-2'-3'). Each primed output is followed by a delay line with elements D giving a delay T. The rest of the receiving filter consists of M summing circuits, and from each delayed primed output there is a weighted connection (with weighting coefficient $c_{njl}$) to each adder. This part of the filter we call the multiple tapped delay line (MTDL) (inputs 1'-2'-3' and outputs 1''-2''-3''). The weighting coefficients $c_{njl}$ have to be chosen such as to meet the optimization criterion. In the case of the minimum $P_e$ criterion it is impossible to find an analytical solution for the set $\{c_{njl}\}$. By means of a steepest descent method one can find an approximation. Zero MDI offers the possibility to calculate the tap coefficients in a rather easy way, as will be shown in Section V, and to check the practical realization
(2)
the constraints are found by considering the sample values of the signals at output n due to the $L^M$ possible input vectors $x^k$. The latter sample values are found in the following way. Assuming that at time t = 0 the vector $x^k$ is applied to the inputs of the channel, the response at output n of the receiving filter is given by $s_n^k(t)$
At the instant t,
8,/'(t,
M 1a> = LM al' L hni(r)rij(t i-I
+ IT, this response has the value
+ IT) = L M
a,J:
i-I
LM1ClO hni(r)ri;(t + ll
0
i-1
111 (IT)
/12 (iT)
121 (IT)
12! (IT)
iT - T) dr,
(4)
kept constant, therefore we have to minimize the functional I
n
100
i-1
0
= L Ni
hn ,.2 (1') dr - 2
LM
L L Ankl L U,A i-I
I
·1: 1 00
i-1
0
hfti(T)ri;(ta
with $f_{nj}(t)$ the response at output n of this system as a result of a delta excitation on input j. Further we define $F(D) \triangleq$
LM
i; (t) = N. ~ t
k-l
:E I
+ IT -
T) dT.
(5)
AI
Xnkl
L a/'rii «: + IT i-I
For the sake of simplicity we take N,
=N
L: FIDl l
M
L: L
Applying the calculus of variations to expression (5) yields I
(9)
t).
(10)
where D is the delay operator. A measure for MDI is now defined as follows:
M
k-l
f2M(lT)
iMN (IT)
In the minimization process these values for all k and l must be M
•
(3)
- T) dr,
0
i-1
by means of the eye pattern, while the error probability $P_e$ is of the same order of magnitude as when the minimum $P_e$ criterion is used, especially for large signal-to-noise ratios. Considering the cascade connection of the channel, the MMF, and the MTDL, the impulse responses of this overall system evaluated at the discrete instants $lT$ are denoted by
(6)
for all i, This assumption
$$I_n \triangleq \frac{\sum_{j=1}^{M} \sum_{l} |f_{nj}(lT)| - |f_{nn}(0)|}{|f_{nn}(0)|} \qquad (11)$$
which is the worst case distortion due to MDI on output n. The overall worst case MDI distortion is given by
$$I_0 = \max_n (I_n). \qquad (12)$$
Fig. 3. Structure of the multiple linear receiving filter.
The terms "zero MDI" and "zero-forcing" are used here if $I_0 = 0$. By means of (10) and (12) we formulate a multidimensional Nyquist criterion which fits Shnidman's generalized Nyquist criterion [2].
Theorem 1: A multiple channel transmission system described by (10) satisfies the multidimensional Nyquist criterion if
$$F(D) = I$$
$$v_{mj}(t) = \sum_{i=1}^{M} r_{ij}(t) * r_{im}(-t) \qquad (14)$$
where $*$ means convolution. Define
f·
(IT)
VIM
(IT)
ViM (IT)
V22(lT)
(15)
vl~l
VMI
(IT)
From the definitions (10), (16), and (18) it follows F(D) == C(D)-V(D).
IV. THE ERROR PROBABILITY OF THE EQUALIZED SYSTEM
If in a multiple channel transmission system it is possible to satisfy the multidimensional Nyquist criterion and the system has an optimum constraint receiver as described in the foregoing, the mean error probability of channel n of such a system is denoted by
O'n2
V(D) ~
r: VZDI.
=
M
l
The MTDL is also a multiple linear filter. For this system we define
i-I i-I k-1
l
Cut
•
• elM'
eMU
CM21
-
-
CMMl
lco Tilc(lT -co
'T)ri;(mT - 1") d.,..
M
=L L L l
(17)
NCnj".Cnld
For the equalized system the impulse response from input j to output n, evaluated at the instant mT, can be written as fn;(mT)
CIMl
(20)
(21)
M
Cuz
M
M
L L: L L L m
(16)
- 1 Q (~) 2u n
L
where the well-known Q(.) function is defined in [6, p. 82] and d is the smallest difference between two output levels. As the smallest difference between two elements of the input alphabet is taken unity and because of (13), d equals one. The noise variance at output n is calculated from (1) and (8)
VM2(lT)
and
(19)
In Section V we shall give a procedure to calculate the tap coefficients described by C(D).
r; = 2 L
Vl1 (IT)
V21
(18)
(13)
where $I$ is the $M \times M$ identity matrix. It will be clear from the foregoing that for a system satisfying the multidimensional Nyquist criterion the MDI will be zero. Now consider the channel in cascade with the MMF as a multiple channel system with M inputs and M outputs. The impulse response from input j to output m of this system is called $v_{mj}(t)$ and can be written as
and
CnkZ
i-I k-l
·i:
ra:(lT - 'r)r'i(mT - 1") d1"
= S",Bni
(22)
as is derived in [2]. Substituting (22) reduces (21) to the simple form
$$\sigma_n^2 = N c_{nn0} \qquad (23)$$
which, if substituted in (20), gives for the error probability of channel n
$$P_e = 2\, \frac{L-1}{L}\, Q\!\left( \frac{1}{2\, (N c_{nn0})^{1/2}} \right). \qquad (24)$$
where 0 is the all zero matrix. To satisfy Theorem 2 we have the relation (28)
This equation is further simplified if we look at (14), (15), and (26). It is easy to see that
V. THE OPTIMUM REALIZABLE MTDL
The index $l$ of the $C(D)$ sequence runs from minus infinity to
Lz'
eK
and
VI
V2K
V_l
Vo
V2K-l
V_~
V-I
V 2K - 2 -
(26)
rI2(lT)
riM (IT)
(iT)
r22(lT)
r2M(lT)
r21
rMl(lT)
rMM(lT)
and CD
R (D) ~ ~ Z-O
tut».
(32)
By applying Theorem 1 to this system it follows that the tap coefficients are determined by the recurrence relation
$$C_l = -R_0^{-1} \sum_{i=0}^{l-1} R_{l-i}\, C_i, \qquad l \ge 1. \qquad (33)$$
It will be clear that Theorem 2 is also valid now, with the restriction that l has only positive values and the length of the MTDL is K. In this case the MTDL is also realizable as M shift registers with resistance matrices at the sending end. One can derive that, in doing so, the expression (33) for the $C_l$ matrices stays unchanged. Decision feedback is another possibility to eliminate MDI [7]. Then the MMF is followed by a "forward" MTDL and a "feedback" MTDL.
(27)
1
o Ro =
o LO.J
rM2(lT)
As an example we implemented the transmission of binary data over a multiwire cable, consisting of four identical wires which are symmetrically situated within a cylindrical shield (see Fig. 4). The cable has a length of 1 km and the bit rate is taken 5 Mbit/s. In this example the length of the cable, the bit rate and the sending pulses are such that the noise can be neglected, thus the relations (33) are used for calculating the tap coefficients. We have measured the following matrices:
o I
(31)
Rz~
VI. AN EXAMPLE
o
E~
rl1(lT)
Vo
V -2K V -2K+l
(30)
The solution of this equation decomposes into M times the solution of a set of linear equations, one for each column of C as the wanted vector with the corresponding column of E as the known vector. In systems where the noise does not play an important role, MDI distortion correction can directly be applied to the channel response. In this situation (29) is not true in general, but it is sometimes possible to choose $t_s < T$, giving a simplification of the expression for $C_l$. It is easy to see that the matrix sequence $C_l$ now starts at $l = 0$ and runs to plus infinity. Analogous to (15) and (16) we define
(25)
Vo
Vo
= E.
VO
$C_0 = R_0^{-1}$
C- K +1
V~
(29)
so that
plus infinity and as a consequence the MTDL becomes infinitely long. In practice we have to make it of finite length and in this case (13) cannot be satisfied exactly. If the MTDL is of length 2K the optimum tap settings are given by the following theorem.
Theorem 2: If $V_0 = I$ and ${\sum_l}' \|V_l\| < 1$, then an upper bound of the MDI distortion $I_0$ is minimal for those tap setting matrices which cause $F_l = 0$, $|l| \le K$, $l \ne 0$; where the primed summation excludes the term with $l = 0$ and the infinite norm is taken (which is the maximum over all rows of the sum of the absolute values of the components of the rows).
This theorem will be proven in Appendix I. In the special case that ${\sum_l}' \|F_l\|$ represents the worst case MDI $I_0$, this distortion itself is minimized. Under these constraints this theorem is a generalization of a theorem derived by Lucky for ISI [3, p. 138]. If $V_0 \ne I$ we can force $V_0$ to equal the identity matrix by placing between the MMF and the MTDL a multiple channel system with matrix D-transform $V_0^{-1}$. To apply the theorem all $V_l$ matrices must then be replaced by $V_0^{-1} V_l$. In the case that ${\sum_l}' \|V_l\|$ represents the worst case MDI at the MMF outputs, a sufficient condition to satisfy the requirement ${\sum_l}' \|V_l\| < 1$ is that at none of the MMF outputs the eye pattern is closed if $a_{jl} \in \{+1, -1\}$. The tap settings as stated in Theorem 2 are calculated as follows. Define the composite matrices
VT = V
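$$R_0 = \begin{bmatrix} 1 & 0.24 & 0.24 & 0.13 \\ 0.24 & 1 & 0.13 & 0.24 \\ 0.24 & 0.13 & 1 & 0.24 \\ 0.13 & 0.24 & 0.24 & 1 \end{bmatrix}, \quad R_1 = 0.26\, I, \quad R_2 = 0.11\, I, \quad R_3 = 0.07\, I, \quad R_4 = 0.04\, I. \qquad (34)$$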
Fig. 4. Cross section of the 4-wire cable within cylindrical shield.
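One can verify that ${\sum_l}' \|R_l R_0^{-1}\| < 1$, thus Theorem 2 can be applied. The calculated $C_l$ matrices are
$$C_0 = \begin{bmatrix} 1 & -0.21 & -0.21 & -0.03 \\ -0.21 & 1 & -0.03 & -0.21 \\ -0.21 & -0.03 & 1 & -0.21 \\ -0.03 & -0.21 & -0.21 & 1 \end{bmatrix}, \quad C_1 = \begin{bmatrix} -0.31 & 0.12 & 0.12 & -0.01 \\ 0.12 & -0.31 & -0.01 & 0.12 \\ 0.12 & -0.01 & -0.31 & 0.12 \\ -0.01 & 0.12 & 0.12 & -0.31 \end{bmatrix}. \qquad (35)$$
Fig. 5. Eye pattern of the unequalized system.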
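The recurrence (33), started from $C_0 = R_0^{-1}$, can be tried out on the measured matrices of (34). The sketch below is an illustration, not the author's program: the function name `mtdl_taps` is hypothetical, the layout of $R_0$ (0.13 coupling between diagonally opposed wires) is read from the symmetry of (35), and the division by the first diagonal element of $C_0$ is an assumed normalization introduced only to compare with the unit-diagonal $C_0$ printed in (35).

```python
import numpy as np


def mtdl_taps(R, num_taps):
    """Tap matrices from C_0 = R_0^{-1} and the recurrence (33)."""
    R0_inv = np.linalg.inv(R[0])
    C = [R0_inv]
    for l in range(1, num_taps):
        acc = np.zeros_like(R0_inv)
        for i in range(l):
            if l - i < len(R):          # R_m is taken as zero beyond the measured ones
                acc += R[l - i] @ C[i]
        C.append(-R0_inv @ acc)
    return C


R0 = np.array([
    [1.00, 0.24, 0.24, 0.13],
    [0.24, 1.00, 0.13, 0.24],
    [0.24, 0.13, 1.00, 0.24],
    [0.13, 0.24, 0.24, 1.00],
])
I4 = np.eye(4)
R = [R0, 0.26 * I4, 0.11 * I4, 0.07 * I4, 0.04 * I4]

C = mtdl_taps(R, num_taps=3)
scale = C[0][0, 0]                      # assumed normalization: unit diagonal in C_0
C0_scaled, C1_scaled = C[0] / scale, C[1] / scale
print(np.round(C0_scaled, 2))
print(np.round(C1_scaled, 2))

# Condition checked before applying Theorem 2 (infinity norm = maximum row sum).
R0_inv = np.linalg.inv(R0)
eye_sum = sum(np.linalg.norm(Rl @ R0_inv, ord=np.inf) for Rl in R[1:])
```

Under these assumptions the scaled matrices come out close to the two-decimal values listed in (35), and `eye_sum` stays below one, in line with the statement that Theorem 2 can be applied.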
Because of the several kinds of symmetry in both the $R_l$ and $C_l$ matrices, ${\sum_l}' \|R_l R_0^{-1}\|$ represents the worst case MDI before the MTDL. Moreover, the output matrices $F_l$ show the same symmetry and thus ${\sum_l}' \|F_l\|$ represents the worst case MDI at the output, so that Theorem 2 is valid in its full consequence. In the realization of the $C_l$ matrices, tap coefficients equal to or smaller than 3 percent are omitted because these values do not give a substantial improvement of the eye opening. All components of $C_2$, $C_3$, etc., are smaller than 3 percent, which is why they are not given in (35). Only $C_0$ and $C_1$ are realized, and in these matrices the connections between a certain wire and the diagonally opposed one are omitted too. This MTDL is implemented as 4 shift registers at the sending end which are connected to the cable by means of resistance matrices forming the tap coefficients. Fig. 5 shows the eye pattern at the receiving end of the cable if all wires are excited, and it is seen that the unequalized system has a fully closed eye, as is calculated from (34). Fig. 6 shows the eye pattern of the system characterized by $R(D) R_0^{-1}$, which means that a multiple channel system with matrix D-transform $R_0^{-1}$ is placed between the transmitter and the sending end of the cable. The eye pattern of this system is not closed, which shows that ${\sum_l}' \|R_l R_0^{-1}\| < 1$. Finally, Fig. 7 shows the eye pattern of the equalized system, and it appears that the multidimensional Nyquist criterion is satisfied rather well.
VII. CONCLUSIONS
It is shown that for a multiple channel transmission system both the optimum linear receiver (minimum $P_e$) and the optimum linear constraint receiver (minimum $P_e$ under zero-forcing condition) have the same structure as the optimum linear receiver found by Kaye and George applying the minimum mean-square error criterion.
Moreover, it appears that by means of the multidimensional Nyquist criterion and the generalization for MDI of a theorem by Lucky for ISI it is rather easy to find the optimum tap settings for a finite length MTDL. The algorithm to calculate the tap settings is further simplified in the case that the noise is unimportant and the sampling instant is smaller than the bit time.
Fig. 6. Eye pattern of the system $R(D) R_0^{-1}$.
By means of the methods developed in this paper it is shown that MDI is the generalization of ISI.
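APPENDIX I
PROOF OF THEOREM 2
Let $\{V_i\}_{i=-\infty}^{\infty}$ be given with $V_0 = I$ and let
$$M_0 = {\sum_{i=-\infty}^{\infty}}' \|V_i\| < 1. \qquad (36)$$
Let
$$A = {\sum_{n=-\infty}^{\infty}}' \Big\| \sum_{j=-N}^{N} C_j V_{n-j} \Big\| \qquad (37)$$
under the constraint
$$\sum_{j=-N}^{N} C_j V_{-j} = I. \qquad (38)$$
We shall prove that a minimum for A exists and that this minimum occurs if
$$\sum_{j=-N}^{N} C_j V_{n-j} = 0, \qquad n = -N, \ldots, -1, 1, \ldots, N. \qquad (39)$$
Proof: Due to (38), (37) can be written as follows:
$$A = {\sum_{n=-\infty}^{\infty}}' \Big\| \sum_{j=-N}^{N} C_j (V_{n-j} - V_{-j} V_n) + V_n \Big\|. \qquad (40)$$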
Fig. 7. Eye pattern of the equalized system.
Fig. 10. Multiple channel system disturbed by CCGN in cascade with the multiple whitening matched filter.
V....,Vi )
This is possible because the inverse of (1 besides tho.t
"(1- V-kVi)-l l! ~ 1-11
(41)
1
V-kll.1l
exists and
(45)
ViII
By means of (44), (42) becomes
From (40) it follows N
.
R(s)
4>!!(s) = I
A· ~ A.
C/(Vn-i - V_;V n)
:$A"'-I!Hk "' ll +
~
Fig. 9. System $R(s)$ disturbed by CCGN is replaced by the system $Q^{-1}(s) R(s)$ disturbed by WUGN.
Let A be minimal in (C -N·," ·,CN·) and let ita value there be A"'. Consider A in the point (C_N·,···,Ci • + E~,···,CN·) and let its value there be A. Now we must have
..
H HrWUGN
H
-1 .("
2:' " 2:
--_o~
CCGN
Fig. 8. Multiple noise whitening filter.
~Q'C(S)=a(-s)a '(5)
A
WUGN
+ v, + E
1(Vn-k
- V -kVn) "
.,
,,_CI) ;~~
+ 2:' II v, 11·11 E~ 11·11 v...., II
:$ 1
n--CD :",.~
(42)
where
H1'" ~
L:
C;·Vi_i·
s II Hk· 1! [ -1 + 1 _II V_: 11.11 +
L' !IV...... II·IlEill
N
A-A·
(43)
j-N
(M o -II ViiI>
II V_k III+ 1 -
(44)
II
0]
611 H1"'1I _II Lil/.II ViII [Mo -II V ..... II +Moll
V.....
-II Vill·1I V-1I1-1 + II V11l·11 V.... I/J allHJ,"'II :$ 1 _ Il V -111-11 ViII (Mo - 1) (1 + II v.. . U)).
II
(46)
From (46) it follows tha.t
II H."'1l
Choose
Vi Il IMo -II V....,
= 0,
because otherwise there is a contradiction with (41).
(47)
APPENDIX II

In this Appendix we prove that the assumptions that the noise functions $n_i(t)$ are white and uncorrelated are not a restriction of the generality; i.e., a system not satisfying these assumptions can be transformed into a system that meets these requirements. The proof starts with the remark that the spectral matrix (which is the Laplace transform of the correlation matrix) of the input noise can be factored, according to [8], in the following way:

$$\Phi_{nn}(s) = Q(-s)\,Q^T(s) \qquad (48)$$

where $s$ is the bilateral Laplace variable. Assume that we have a system with transfer matrix $P(s)$ such that the spectral matrix of the output noise is the identity matrix if the input spectral matrix is given by (48). Then the spectral matrix of the output $y$ of $P(s)$ is written as follows [8]:

$$\Phi_{yy}(s) = P(-s)\,Q(-s)\,Q^T(s)\,P^T(s) \qquad (49)$$

(see Fig. 8). From this it follows that

$$P(s) = Q^{-1}(s) \qquad (50)$$

satisfies the requirement of white, uncorrelated output noise. A procedure for finding a $Q(s)$ such that both $Q(s)$ and $Q^{-1}(s)$ are stable is also given in [8].

Now we shall further investigate the MMF for colored, correlated Gaussian noise (CCGN). The several impulse responses $r_{ij}(t)$ of the multiple channel system are written in a matrix $R(t)$. From (50) it follows that the multiple channel transmission system with transfer matrix $R(s)$ disturbed by CCGN with spectral matrix $\Phi_{nn}(s)$ can be replaced by a multiple channel transmission system with transfer matrix $Q^{-1}(s)R(s)$ disturbed by white, uncorrelated, Gaussian noise (WUGN) (see Fig. 9). The MMF for this latter system is given by

$$[Q^{-1}(-s)R(-s)]^T = R^T(-s)\,[Q^T(-s)]^{-1}. \qquad (51)$$
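The factorization idea behind (48)-(50) can be illustrated with a frequency-independent analog. The sketch below is not the procedure of [8]; it assumes a constant 2x2 noise covariance matrix `phi` (a hypothetical example value), factors it as `Q Q^T` with a Cholesky decomposition, and checks that `P = Q^{-1}` turns the covariance into the identity matrix. All function names are illustrative.

```python
import math


def cholesky(a):
    """Lower-triangular q with q q^T = a, for a symmetric positive-definite a."""
    n = len(a)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(q[i][k] * q[j][k] for k in range(j))
            q[i][j] = math.sqrt(acc) if i == j else acc / q[j][j]
    return q


def invert_lower(q):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(q)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for row in range(col, n):
            rhs = 1.0 if row == col else 0.0
            rhs -= sum(q[row][k] * inv[k][col] for k in range(col, row))
            inv[row][col] = rhs / q[row][row]
    return inv


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Hypothetical covariance of two correlated noise sources (static stand-in for (48)).
phi = [[2.0, 0.6], [0.6, 1.0]]
q = cholesky(phi)      # phi = q q^T
p = invert_lower(q)    # static analog of P(s) = Q^{-1}(s) in (50)
whitened = matmul(matmul(p, phi), transpose(p))  # static analog of (49)
```

In the dynamic case of the Appendix the factor $Q(s)$ depends on frequency and must be chosen so that $Q(s)$ and $Q^{-1}(s)$ are both stable, which a plain Cholesky factorization does not address.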
Note that the MMF for the system with impulse response matrix $R(t)$ disturbed by WUGN is given by $R^T(-t)$, so that the MMF for the original system can be written as

$$R^T(-s)\,[Q^T(-s)]^{-1}\,Q^{-1}(s) = R^T(-s)\,[\Phi_{nn}^T(s)]^{-1} \qquad (52)$$

(see Fig. 10). This MMF we call the multiple whitening matched filter (MWMF).

ACKNOWLEDGMENT

The author wishes to thank J. van der Plaats for initiating and stimulating the work on interchannel interference and L. S. de Jong for giving the proof of Theorem 2.
REFERENCES

[1] A. R. Kaye and D. A. George, "Transmission of multiplexed PAM signals over multiple channel and diversity systems," IEEE Trans. Commun. Technol., vol. COM-18, pp. 520-525, Oct. 1970.
[2] D. A. Shnidman, "A generalized Nyquist criterion and optimum linear receiver for a pulse modulation system," Bell Syst. Tech. J., pp. 2163-2177, Nov. 1967.
[3] R. W. Lucky, J. Salz, and E. J. Weldon, Jr., Principles of Data Communication. New York: McGraw-Hill, 1968.
[4] M. R. Aaron and D. W. Tufts, "Intersymbol interference and error probability," IEEE Trans. Inform. Theory, vol. IT-12, pp. 26-34, Jan. 1966.
[5] D. W. Tufts, "Nyquist's problem: The joint optimization of transmitter and receiver in pulse amplitude modulation," Proc. IEEE, vol. 53, pp. 248-259, Mar. 1965.
[6] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.
[7] M. E. Austin, "Equalization of dispersive channels using decision feedback," Quarterly Progress Rep., M.I.T. Res. Lab. Electron., Cambridge, Mass., no. 84, pp. 227-243, 1967.
[8] M. C. Davis, "Factoring the spectral matrix," IEEE Trans. Automat. Contr., vol. AC-8, pp. 296-305, Oct. 1963.
Adaptive Maximum-Likelihood Receiver for Carrier-Modulated Data-Transmission Systems

GOTTFRIED UNGERBOECK
Abstract-A new look is taken at maximum-likelihood sequence estimation in the presence of intersymbol interference. A uniform receiver structure for linear carrier-modulated data-transmission systems is derived which for decision making uses a modified version of the Viterbi algorithm. The algorithm operates directly on the output signal of a complex matched filter and, in contrast to the original algorithm, requires no squaring operations; only multiplications by discrete pulse-amplitude values are needed. Decoding of redundantly coded sequences is included in the consideration. The reason and limits for the superior error performance of the receiver over a conventional receiver employing zero-forcing equalization and symbol-by-symbol decision making are explained. An adjustment algorithm for jointly approximating the matched filter by a transversal filter, estimating intersymbol interference present at the transversal filter output, and controlling the demodulating carrier phase and the sample timing, is presented.

Paper approved by the Associate Editor for Communication Theory of the IEEE Communications Society for publication without oral presentation. Manuscript received May 18, 1973; revised December 21, 1973. The author is with the IBM Zurich Research Laboratory, Rüschlikon, Switzerland.

Reprinted from IEEE Transactions on Communications, vol. COM-22, no. 5, May 1974.

The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

I. INTRODUCTION

The design of an optimum receiver for synchronous data-transmission systems that employ linear carrier-modulation techniques [1] is discussed. All forms of digital amplitude modulation (AM), phase modulation (PM), and combinations thereof, are covered in a unified manner. Without further mention in this paper, the results are also applicable to baseband transmission.

In synchronous data-transmission systems intersymbol interference (ISI) and noise, along with errors in the demodulating carrier phase and the sample timing, are the primary impediments to reliable data reception [1]. The goal of this paper is to present a receiver structure that deals with all these effects in an optimum way and an adaptive manner. In deriving the receiver the concept of maximum-likelihood (ML) sequence estimation [2], [3] will be applied. This assures that the receiver is optimum in the sense of sequence-error probability, provided that data sequences have equal a priori probability.

The modulation schemes considered in this paper can be viewed in the framework of digital quadrature amplitude modulation (QAM) [1]. They can therefore be represented by an equivalent linear baseband model that differs from a real baseband system only by the fact that signals and channel responses are complex functions [2], [4], [5].

Conventional receivers for synchronous data signals
comprise a linear receiver filter or equalizer, a symbol-rate sampler, and a quantizer for establishing symbol-by-symbol decisions. A decoder, possibly with error-detection and/or error-correction capability, may follow. The purpose of the receiver filter is to eliminate intersymbol interference while maintaining a high signal-to-noise ratio (SNR). It has been observed by several authors [1], [6]-[9] that for various performance criteria the optimum linear receiver filter can be factored as a matched filter (MF) and a transversal filter with tap spacings equal to the symbol interval. The MF establishes an optimum SNR irrespective of the residual ISI at its output. The transversal filter then eliminates or at least reduces intersymbol interference at the expense of diminishing the SNR.

If symbols of a data sequence are correlated by some coding law, a better way than making symbol-by-symbol decisions is to base decisions on the entire sequence received. The same argument holds true if data sequences are disturbed by ISI. The correlation introduced by ISI between successive sample values is of discrete nature as in the case of coding, in the sense that a data symbol can be disturbed by adjacent data symbols only in a finite number of ways. ISI can even be viewed as an unintended form of partial response coding [1]. Receivers that perform sequence decisions or in some other way exploit the discreteness of ISI exhibit highly nonlinear structures. Decision feedback equalization [10], [11] represents the earliest step in this direction. Later, several other nonlinear receiver structures were described [12]-[17]. In view of the present state of the art, many of these approaches can be regarded as attempts to avoid, by nonlinear processing methods, noise enhancement which would otherwise occur if ISI were eliminated by linear filtering.

A new nonlinear receiver structure was introduced by Forney [18]. The receiver consists of a "whitened" MF (i.e., an MF followed by a transversal filter that whitens the noise), a symbol-rate sampler, and a recursive nonlinear processor that employs the Viterbi algorithm in order to perform ML sequence decisions. The Viterbi algorithm was originally invented for decoding of convolutional codes [19]. Soon thereafter it was shown that the algorithm yields ML sequence decisions and that it can be regarded as a specific form of dynamic programming [20]-[22]. Its applicability to receivers for channels with intersymbol interference and correlative level coding was noticed by Omura [23] and Kobayashi [24]-[26]. A survey on the Viterbi algorithm is given by Forney [27].

Very recently, adaptive versions of Forney's receiver have been proposed [28], [29], and its combination with decision-feedback equalization has been suggested [30]. In Forney's [18] receiver, whitening of the noise is essential because the Viterbi algorithm requires that noise components of successive samples be statistically independent. In this paper a receiver similar to that of Forney will be described. The receiver employs a modified Viterbi algorithm that operates directly on the MF output without whitening the noise. In a different form the algorithm has already been used for estimating and subtracting ISI terms from disturbed binary data sequences [17]. Here the algorithm is restated and extended so that it performs ML decisions for complex-valued multilevel sequences. Apparently, Mackechnie [31] has independently found the same algorithm.

In Section II we define the modulation scheme. The general structure of the maximum-likelihood receiver (MLR) is outlined in Section III. In Section IV we derive the modified Viterbi algorithm. The error performance of the MLR is discussed in Section V and compared with the error performance of the conventional receiver. Finally, a fully adaptive version of the MLR is presented in Section VI.
II. MODULATION SCHEME

We consider a synchronous linear carrier-modulated data-transmission system with coherent demodulation of the general form shown in Fig. 1. By combining in-phase and quadrature components into complex-valued signals (indicated by heavy lines in Fig. 1), all linear carrier-modulation schemes can be treated in a concise and uniform manner [2], [4], [5]. Prior to modulation with carrier frequency $\omega_c$, the transmitter establishes a pulse-amplitude modulated (PAM) signal of the form

$$x(t) = \sum_n a_n f(t - nT) \qquad (1)$$

where the sequence $\{a_n\}$ represents the data symbols, $T$ is the symbol spacing, and $f(t)$ denotes the transmitted baseband signal element. Generally, $\{a_n\}$ and $f(t)$ may be complex (but usually only one of them is; see Table I).¹ The data symbols are selected from a finite alphabet and may possibly succeed one another only in accordance with some redundant coding rule. Assuming a linear dispersive transmission medium with impulse response $g_c(t)$ and additive noise $w_c(t)$, the receiver will observe the real signal

$$y_c(t) = g_c(t) * \mathrm{Re}\{\sqrt{2}\,x(t)\exp(j\omega_c t)\} + w_c(t) \qquad (2)$$

where $*$ denotes convolution. One side of the spectrum of $y_c(t)$ is redundant and can therefore be eliminated without loss of information; the remaining part must be transposed back into the baseband. In Fig. 1 we adhere to the conventional approach of demodulating by transposing first and then eliminating components around twice the carrier frequency. The demodulated signal thus becomes

$$y(t) = \sum_n a_n h(t - nT) + w(t) \qquad (3)$$

where

$$h(t) = [g_c(t)\exp(-j\omega_c t - j\varphi_c)] * f(t) = g(t) * f(t) \qquad (4)$$

¹ A more general class of PAM signals is conceivable where $f(t)$ depends on $n$ or $a_n$.
and

$$w(t) = \sqrt{2}\,w_c(t)\exp(-j\omega_c t - j\varphi_c). \qquad (5)$$

In (5) the effect of low-pass filtering the transposed noise is neglected since it affects only noise components outside the signal bandwidth of interest. Our channel model does not include frequency offset and phase jitter. It is understood that the demodulating carrier phase $\varphi_c$ accounts for these effects.

Fig. 1. General linear carrier-modulated data-transmission system (transmitter, transmission medium, receiver). Thin lines: real signal; heavy lines: complex signal.

TABLE I

Modulation Scheme    {a_n}      f(t)
DSB-AM               real       real
SSB, VSB-AM          real       complex (a)
PM                   complex    real
AM-PM                complex    real

(a) SSB: $f(t) = f_1(t) \pm j\mathcal{H}\{f_1(t)\}$, where $\mathcal{H}$ is the Hilbert transform [1].

III. STRUCTURE OF THE MAXIMUM-LIKELIHOOD RECEIVER

The objective of the receiver is to estimate $\{a_n\}$ from a given signal $y(t)$. Let the receiver observe $y(t)$ within a time interval $I$ which is supposed to be long enough so that the precise conditions at the boundaries of $I$ are insignificant for the total observation. Let $\{\alpha_n\}$ be a hypothetical sequence of pulse amplitudes transmitted during $I$. The MLR by its definition [2], [3] determines as the best estimate of $\{a_n\}$ the sequence $\{\alpha_n\} = \{\hat a_n\}$ that maximizes the likelihood function $p[y(t), t \in I \mid \{\alpha_n\}]$.

In the following paragraphs the shape of the signal element $h(t)$ and the exact timing of received signal elements are assumed to be known. The noise of the transmission medium is supposed to be stationary Gaussian noise with zero mean and autocorrelation function $W_c(\tau)$. From (5) the autocorrelation function of $w(t)$ is obtained as

$$W(\tau) = E[\bar w(t)\,w(t + \tau)] = \bar W(-\tau) = 2W_c(\tau)\exp(-j\omega_c\tau). \qquad (6)$$

For example, if $w_c(t)$ is white Gaussian noise (WGN) with double-sided spectral density $N_0$, then $W_c(\tau) = N_0\delta(\tau)$ and $W(\tau) = 2N_0\delta(\tau)$. In view of this important case it is appropriate to introduce

$$W(\tau) = 2N_0 K(\tau), \qquad \text{WGN: } K(\tau) = \delta(\tau). \qquad (7)$$

If $\{\alpha_n\}$ were the actual sequence of the pulse amplitudes transmitted during $I$, then

$$w(t \mid \{\alpha_n\}) = y(t) - \sum_{nT \in I} \alpha_n h(t - nT), \qquad t \in I \qquad (8)$$

must be the realization of the noise signal $w(t)$. Hence, owing to the Gaussian-noise assumption, the likelihood function becomes (apart from a constant of proportionality) [32]

$$p[y(t), t \in I \mid \{\alpha_n\}] = p[w(t \mid \{\alpha_n\})] \sim \exp\Bigl\{-\frac{1}{4N_0}\int_I\!\int_I \bar w(t_1 \mid \{\alpha_n\})\,K^{-1}(t_1 - t_2)\,w(t_2 \mid \{\alpha_n\})\,dt_1\,dt_2\Bigr\} \qquad (9)$$

where $K^{-1}(\tau)$ is the inverse of $K(\tau)$:

$$K(\tau) * K^{-1}(\tau) = \delta(\tau). \qquad (10)$$

The correctness of (9) for the complex-signal case is proven in Appendix I. Substituting (8) into (9) and considering only terms that depend on $\{\alpha_n\}$, yields

$$p[y(t), t \in I \mid \{\alpha_n\}] \sim \exp\Bigl\{\frac{1}{4N_0}\Bigl[\sum_{nT \in I} 2\,\mathrm{Re}(\bar\alpha_n z_n) - \sum_{iT \in I}\sum_{kT \in I} \bar\alpha_i s_{i-k}\alpha_k\Bigr]\Bigr\} \qquad (11)$$

where

$$z_n = \int_I\!\int_I \bar h(t_1 - nT)\,K^{-1}(t_1 - t_2)\,y(t_2)\,dt_1\,dt_2 \qquad (12)$$

$$s_l = \int_I\!\int_I \bar h(t_1 - iT)\,K^{-1}(t_1 - t_2)\,h(t_2 - kT)\,dt_1\,dt_2 = \bar s_{-l}, \qquad l = i - k. \qquad (13)$$

The quantities $z_n$ and $s_l$ can be interpreted as sample values taken at the output of a complex MF with impulse response function²

$$g_{MF}(t) = \bar h(-t) * K^{-1}(t). \qquad (14)$$

² Here we are not concerned with realizability.

The derivation presented is mathematically weak in that it assumes $K^{-1}(t)$ exists. This is not the case if the spectral power density of the noise becomes zero somewhere along the frequency axis. The difficulty can be avoided by defining $z_n$ and $s_l$ in terms of the reproducing-kernel Hilbert space (RKHS) approach [33], [34]. Here it is sufficient to consider the frequency-domain equivalent of (14) given by

$$G_{MF}(f) = \bar H(f)/K(f) \qquad (15)$$

where $G_{MF}(f)$, $H(f)$, and $K(f)$ are the Fourier transforms of $g_{MF}(t)$, $h(t)$, and $K(t)$, respectively. It follows from (15) that $g_{MF}(t)$ exists if the spectral power density
of the noise does not vanish within the frequency band where signal energy is received. Only in this case have we a truly probabilistic (nonsingular) receiver problem. It can easily be shown that

$$z_n = g_{MF}(t) * y(t)\big|_{t=nT} = \sum_l a_{n-l}s_l + r_n \qquad (16)$$

$$s_l = g_{MF}(t) * h(t)\big|_{t=lT} = \bar s_{-l} \qquad (17)$$

and that the covariance of the noise samples $r_n$ reads

$$R_l = E[\bar r_n r_{n+l}] = 2N_0 s_l. \qquad (18)$$

Fig. 2. MLR structure: the sampled MF output $\{z_n\}$ is passed to the maximum-likelihood sequence estimator (MLSE), which delivers $\{\hat a_n\}$.

Since the noise of the transmission medium does not exhibit distinct properties relative to the demodulating carrier phase, the following relations must hold:

$$E[\mathrm{Re}(r_n)\,\mathrm{Re}(r_{n+l})] = E[\mathrm{Im}(r_n)\,\mathrm{Im}(r_{n+l})] = N_0\,\mathrm{Re}(s_l) \qquad (19)$$

$$E[\mathrm{Re}(r_n)\,\mathrm{Im}(r_{n+l})] = -E[\mathrm{Im}(r_n)\,\mathrm{Re}(r_{n+l})] = N_0\,\mathrm{Im}(s_l). \qquad (20)$$

The similarity of $s_l$ and $R_l$ expressed by (18) implies that $s_0 \ge |s_l|$ and that the Fourier transform of the sampled signal element $\{s_l\}$ is a real nonnegative function

$$S^*(f) = \sum_l s_l \exp(-j2\pi f l T) \ge 0 \qquad (21)$$

with period $1/T$. Clearly, the MF performs a complete phase equalization, but does not necessarily eliminate ISI (ISI: $s_l \ne 0$ for $l \ne 0$). The main effect of the MF is that it maximizes the SNR, which we define as

$$S/N_{MF} = \frac{\text{instantaneous peak power of a single signal element}}{\text{average power of the real part of the noise}} = \frac{s_0^2}{E\{\mathrm{Re}(r_0)^2\}} = \frac{s_0}{N_0}. \qquad (22)$$
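The properties of $s_l$ stated around (17), (18), and (21) can be checked numerically on a small example. The following is a minimal sketch under assumptions that are not in the paper: the noise is white ($K(\tau) = \delta(\tau)$) and the signal element is replaced by a short list of hypothetical symbol-spaced complex taps `h`, so that $s_l$ reduces to the sampled autocorrelation $\sum_k \bar h_k h_{k+l}$.

```python
import cmath


def sampled_autocorrelation(h):
    """Return {lag: s_lag} with s_lag = sum_k conj(h[k]) * h[k + lag]."""
    n = len(h)
    return {
        lag: sum(h[k].conjugate() * h[k + lag] for k in range(n) if 0 <= k + lag < n)
        for lag in range(-(n - 1), n)
    }


def folded_spectrum(s, f_times_t):
    """S*(f) of (21), evaluated at the normalized frequency f*T."""
    return sum(s_lag * cmath.exp(-2j * cmath.pi * f_times_t * lag)
               for lag, s_lag in s.items())


# Hypothetical symbol-spaced taps standing in for the signal element h(t).
h = [1.0 + 0.0j, 0.5 - 0.3j, -0.2 + 0.1j]
s = sampled_autocorrelation(h)

# S*(f) sampled over one period 1/T; it should be real and nonnegative.
spectrum = [folded_spectrum(s, i / 64) for i in range(64)]
```

Under these assumptions $S^*(f)$ equals the squared magnitude of the transfer function of the taps, which is why it cannot become negative.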
The part of the receiver which in Fig. 1 was left open can now be specified, as indicated in Fig. 2. From (16) it comprises an MF and a symbol-rate sampling device sampling at times $nT$. It is followed by a processor, called the maximum-likelihood sequence estimator (MLSE), that determines as the most likely sequence transmitted the sequence $\{\alpha_n\} = \{\hat a_n\}$ that maximizes the likelihood function given by (11), or equivalently, that assigns the maximum value to the metric

$$J_I(\{\alpha_n\}) = \sum_{nT \in I} 2\,\mathrm{Re}(\bar\alpha_n z_n) - \sum_{iT \in I}\sum_{kT \in I} \bar\alpha_i s_{i-k}\alpha_k. \qquad (23)$$

The values of $s_l$ are assumed to be known. The sequence $\{z_n\}$ contains all relevant information available about $\{a_n\}$ and hence forms a so-called set of sufficient statistics [2], [3]. The main difficulty in finding $\{\hat a_n\}$ lies in the fact that $\{\hat a_n\}$ must only be sought among discrete sequences $\{\alpha_n\}$ which comply with the coding rule. The exact solution to this discrete maximization problem is presented in Section IV.

Solving the problem approximately by first determining the nondiscrete sequence $\{\alpha_n\} = \{z_{Ln}\}$ that maximizes (23) and then quantizing the elements of $\{z_{Ln}\}$ in independent symbol-by-symbol fashion, leads to the optimum conventional receiver [5]. Applying the familiar calculus of variation to (23), one finds that $\{z_{Ln}\}$ is obtained from $\{z_n\}$ by a linear transversal filter which, having the transfer function $1/S^*(f)$, eliminates ISI. The series arrangement of the MF and the transversal filter is known as the optimum linear equalizer, which for zero ISI yields maximum SNR [6], [8]. Calculating the SNR at the transversal filter output in a manner equivalent to (22), gives

$$S/N_L = \frac{S/N_{MF}}{T^2 \int_0^{1/T} S^*(f)\,df \int_0^{1/T} [1/S^*(f)]\,df} \le S/N_{MF}. \qquad (24)$$
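To get a feeling for the size of the loss in (24), the denominator can be evaluated numerically. The sketch below assumes a hypothetical three-sample signal element with real $s_1 = s_{-1}$, so that $S^*(f) = s_0 + 2s_1\cos(2\pi fT)$, and requires $2|s_1| < s_0$ so that $S^*(f)$ stays positive. The function name and the example values are illustrative only.

```python
import math


def snr_loss_factor(s0, s1, steps=4096):
    """Denominator of (24) for S*(f) = s0 + 2*s1*cos(2*pi*f*T), real s1."""
    mean_s = 0.0
    mean_inv = 0.0
    for i in range(steps):
        theta = 2 * math.pi * (i + 0.5) / steps   # 2*pi*f*T over one period
        value = s0 + 2 * s1 * math.cos(theta)
        mean_s += value / steps          # T * integral of S*(f) df
        mean_inv += 1.0 / value / steps  # T * integral of 1/S*(f) df
    return mean_s * mean_inv


factor = snr_loss_factor(1.0, 0.4)       # hypothetical s0 = 1, s1 = 0.4
loss_db = 10 * math.log10(factor)        # SNR loss of the linear equalizer in dB
no_isi = snr_loss_factor(1.0, 0.0)       # no ISI: no loss
```

For this example the closed form of the second integral is $1/\sqrt{s_0^2 - 4s_1^2}$, so the factor is $1/0.6$, a loss of roughly 2.2 dB. The factor grows without bound as the minimum of $S^*(f)$ approaches zero.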
This reflects the obvious fact that, since the MF provides the absolutely largest SNR, elimination of ISI by a subsequent filter must diminish the SNR. Equation (24) indicates, however, that a significant loss will occur only if somewhere along the frequency axis $S^*(f)$ dips considerably below the average value.

For systems that transmit only real pulse amplitudes, i.e., double-sideband amplitude modulation (DSB-AM), vestigial-sideband amplitude modulation (VSB-AM), and single-sideband amplitude modulation (SSB-AM), it follows from (23) that only the real output of the MF is relevant. In those cases $S^*(f)$ should be replaced by $\tfrac{1}{2}[S^*(-f) + S^*(f)]$ without further mention throughout the paper.

IV. MAXIMUM-LIKELIHOOD SEQUENCE ESTIMATION

In this section the exact solution to the discrete maximization problem of (23) is presented. The MLSE algorithm that will be derived determines the most likely sequence $\{\hat a_n\}$ among sequences $\{\alpha_n\}$ that satisfy the coding rule. Clearly, the straightforward approach of computing $J_I(\{\alpha_n\})$ for all sequences allowed, and selecting the sequence that yields the maximum value, is impracticable in view of the length and number of possible messages. Instead, by applying the principles of dynamic programming [22], we shall be able to conceive a nonlinear recursive algorithm that performs the same selection with greatly reduced computational effort. The MLSE algorithm thus obtained represents a modified version of the well-known Viterbi algorithm [18]-[21], [23]-[30].

The algorithm is obtained, observing $s_l = \bar s_{-l}$, by first realizing from (23) that $J_I(\{\alpha_n\})$ can iteratively be computed by the recursive relation

$$J_n(\ldots, \alpha_{n-1}, \alpha_n) = J_{n-1}(\ldots, \alpha_{n-1}) + \mathrm{Re}\Bigl[\bar\alpha_n\Bigl(2z_n - s_0\alpha_n - 2\sum_{k \le n-1} s_{n-k}\alpha_k\Bigr)\Bigr]. \qquad (25)$$

Conditions concerning the boundaries of $I$ are not needed since once we have a recursive relationship, the length of $I$ becomes unimportant. We now assume that at the MF output ISI from a particular signal element is limited to $L$ preceding and $L$ following sampling instants:

$$s_l = 0, \qquad |l| > L. \qquad (26)$$

Changing indices we obtain from (25) and (26)

$$J_n(\ldots, \alpha_{n-1}, \alpha_n) = J_{n-1}(\ldots, \alpha_{n-1}) + \mathrm{Re}\Bigl[\bar\alpha_n\Bigl(2z_n - s_0\alpha_n - 2\sum_{l=1}^{L} s_l\alpha_{n-l}\Bigr)\Bigr]. \qquad (27)$$

We recall that sequences may be coded. For most transmission codes a state representation is appropriate. Let $\mu_j$ be the state of the coder after $\alpha_j$ has been transmitted. The coding state $\mu_j$ determines which sequences can be further transmitted. Given $\mu_j$ and an allowable sequence $\alpha_{j+1}, \alpha_{j+2}, \ldots, \alpha_{j+k}$, the state $\mu_{j+k}$ is uniquely determined as

$$\mu_{j+k} = \mu_{j+k}(\mu_j;\ \alpha_{j+1}, \alpha_{j+2}, \ldots, \alpha_{j+k}). \qquad (28)$$

The sequence of states $\{\mu_j\}$ is Markovian, in the sense that $\Pr(\mu_j \mid \mu_{j-1}, \mu_{j-2}, \ldots) = \Pr(\mu_j \mid \mu_{j-1})$. Let us now consider the metric

$$\tilde J_n(\sigma_n) = \tilde J_n(\mu_{n-L};\ \alpha_{n-L+1}, \ldots, \alpha_n) = \max_{\{\ldots, \alpha_{n-L-1}, \alpha_{n-L}\} \to \mu_{n-L}} \{J_n(\ldots, \alpha_{n-L}, \alpha_{n-L+1}, \ldots, \alpha_n)\} \qquad (29)$$

where the maximum is taken over all allowable sequences $\{\ldots, \alpha_{n-L-1}, \alpha_{n-L}\}$ that put the coder into the state $\mu_{n-L}$. In accordance with the VA literature, $\tilde J_n$ is called survivor metric. There exist as many survivor metrics as there are survivor states

$$\sigma_n \triangleq (\mu_{n-L}:\ \alpha_{n-L+1}, \ldots, \alpha_n). \qquad (30)$$

Clearly, the succession of states $\{\sigma_n\}$ is again Markovian. Associated with each $\sigma_n$ is a unique path history, namely, the sequence $\{\ldots, \alpha_{n-L-1}, \alpha_{n-L}\}$, which in (29) yields the maximum. With respect to $\sigma_n$ this sequence is credited ML among all other sequences. It is not difficult to see that the further one looks back from time $n - L$, the less will a path history depend on the specific $\sigma_n$ to which it belongs. One can therefore expect that all $\sigma_n$ will have a common path history up to some time $n - L - m$, $m$ being a nonnegative random variable. Obviously, the common portion of the path histories concurs with the most likely sequence $\{\hat a_n\}$ for which we are looking.

The final step in deriving the MLSE algorithm is to apply the maximum operation defined by (29) to (27). Introducing the notation of a survivor metric also on the right-hand side, we obtain

$$\tilde J_n(\sigma_n) = 2\,\mathrm{Re}(\bar\alpha_n z_n) + \max_{\{\sigma_{n-1}\} \to \sigma_n} \{\tilde J_{n-1}(\sigma_{n-1}) - F(\sigma_{n-1}, \sigma_n)\} \qquad (31)$$

where the maximum is taken over all states $\{\sigma_{n-1}\}$ that have $\sigma_n$ as a possible successor state, and

$$F(\sigma_{n-1}, \sigma_n) = \bar\alpha_n s_0 \alpha_n + 2\,\mathrm{Re}\Bigl(\bar\alpha_n \sum_{l=1}^{L} s_l\alpha_{n-l}\Bigr). \qquad (32)$$
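The recursion (31)-(32) can be tried out on a toy problem and compared with an exhaustive search over the metric (23). The sketch below makes several simplifying assumptions that are not part of the paper: the system is uncoded, so the survivor state is just the last $L$ symbols; symbols before time 0 are taken as zero; each survivor keeps its complete path instead of a truncated history; and the values of `s`, `sent`, and `noise` are hypothetical.

```python
from itertools import product


def metric(alpha, z, s):
    """Metric (23) for a finite sequence alpha, with s[-l] = conj(s[l])."""
    total = sum(2 * (a.conjugate() * zn).real for a, zn in zip(alpha, z))
    for i, ai in enumerate(alpha):
        for k, ak in enumerate(alpha):
            total -= (ai.conjugate() * s.get(i - k, 0) * ak).real
    return total


def viterbi_mf(z, s, alphabet, memory):
    """Recursion (31)-(32), uncoded case; state = last `memory` symbols."""
    survivors = {(0j,) * memory: (0.0, ())}  # state -> (survivor metric, path)
    for zn in z:
        updated = {}
        for state, (score, path) in survivors.items():
            for a in alphabet:
                # ISI term sum_{l=1..L} s_l * alpha_{n-l}; state[-l] is alpha_{n-l}.
                isi = sum(s.get(l, 0) * state[-l] for l in range(1, memory + 1))
                f = (a.conjugate() * s[0] * a).real + 2 * (a.conjugate() * isi).real
                candidate = score + 2 * (a.conjugate() * zn).real - f
                new_state = state[1:] + (a,)
                if new_state not in updated or candidate > updated[new_state][0]:
                    updated[new_state] = (candidate, path + (a,))
        survivors = updated
    return max(survivors.values(), key=lambda item: item[0])


# Hypothetical example: L = 1, four-phase alphabet, fixed noise samples.
s = {0: 1.0 + 0j, 1: 0.3 + 0.2j, -1: 0.3 - 0.2j}
alphabet = [1 + 0j, 1j, -1 + 0j, -1j]
sent = [1 + 0j, 1j, -1 + 0j, -1j, 1 + 0j, -1 + 0j]
noise = [0.21 - 0.13j, -0.34 + 0.08j, 0.05 + 0.27j,
         -0.11 - 0.19j, 0.16 + 0.02j, -0.07 + 0.23j]
z = [sum(s.get(n - k, 0) * sent[k] for k in range(len(sent))) + noise[n]
     for n in range(len(sent))]           # MF output samples as in (16)

best_score, best_path = viterbi_mf(z, s, alphabet, memory=1)
brute_path = max(product(alphabet, repeat=len(z)), key=lambda seq: metric(seq, z, s))
```

Note that the inner loop contains only multiplications of $z_n$ and $s_l$ by alphabet values; no squared Euclidean distances to noise-whitened samples are formed.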
Verifying (31), the reader will observe that $L$ is just the minimum number of pulse amplitudes that must be associated with $\sigma_n$. Thus, $L$ takes on the role of a constraint length inherent to ISI. Equation (31) enables us to calculate survivor metrics and path histories in recursive fashion. The path history of a particular $\sigma_n$ is obtained by extending the path history of the $\sigma_{n-1}$, which in (31) yields the maximum, by the $\alpha_{n-L}$ associated with the selected $\sigma_{n-1}$. At each sampling instant $n$, survivor metrics and path histories must be calculated for all possible states $\sigma_n$. Instead of expressing path histories in terms of pulse amplitudes $\alpha_{n-L}$, they could also be represented in any other one-to-one related terms. This concludes the essential part in the derivation of the MLSE algorithm. The algorithm can be extended to provide maximum a posteriori probability (MAP) decisions, as is shown in Appendix II. However, for reasons given there, the performance improvement which thereby can be attained will usually be insignificant.

The algorithm is identical to the original Viterbi algorithm if there is no ISI at the MF output, i.e., $F(\sigma_{n-1}, \sigma_n) = \bar\alpha_n s_0 \alpha_n$. In the presence of ISI, the algorithm differs from the original Viterbi algorithm in that it operates directly on the MF output where noise samples are correlated according to (18). For the original Viterbi algorithm, statistical independence of the noise samples is essential. Forney [18] proposed to decorrelate the MF output noise by a transversal filter which thereby reduces the number of nonzero sample values of the signal element from $2L + 1$ to $L + 1$. The constraint length inherent to ISI is therefore the same, namely, $L$, for both the Viterbi algorithm as used by Forney and its modified version presented in this section.³ Clearly, this indicates some fundamental lower limit to the complexity of MLSE. Yet the modified Viterbi algorithm offers computational advantages in that the large number of squaring operations needed for the original Viterbi algorithm [29] are no longer required. Only the simple multiplications by discrete pulse amplitude values occurring in $\mathrm{Re}(\bar\alpha_n z_n)$ must be executed in real time. It was observed by Price [35] that Forney's whitened MF is identical to the optimum linear input section of a decision-feedback receiver. In Section VI we shall see that basically the same principle can be applied in order to realize the MF and the whitened MF in adaptive form. Hence in this respect the two algorithms are about equal.

³ G. D. Forney, private communication.

We shall now illustrate the algorithm by a specific example. Let us consider a simple binary run-length-limited code with $\alpha_n \in \{0, 1\}$ and runs no longer than two of the same symbol. The state-transition diagram of this code is shown in Fig. 3(a). According to (30), with $L = 1$ the following survivor states and allowed transitions between them are obtained:

$\sigma^1 \triangleq (\mu^1: 1) \to (\mu^3: \{0,1\}) \triangleq \{\sigma^4, \sigma^5\}$
$\sigma^2 \triangleq (\mu^2: 0) \to (\mu^1: 1) \triangleq \sigma^1$
$\sigma^3 \triangleq (\mu^2: 1) \to (\mu^3: \{0,1\}) \triangleq \{\sigma^4, \sigma^5\}$
$\sigma^4 \triangleq (\mu^3: 0) \to (\mu^2: \{0,1\}) \triangleq \{\sigma^2, \sigma^3\}$
$\sigma^5 \triangleq (\mu^3: 1) \to (\mu^4: 0) \triangleq \sigma^6$
$\sigma^6 \triangleq (\mu^4: 0) \to (\mu^2: \{0,1\}) \triangleq \{\sigma^2, \sigma^3\}$.

Fig. 3. (a) State-transition diagram of binary ($\alpha_n \in \{0,1\}$) run-length-limited code with runs $\le 2$. (b) State-transition diagram of corresponding survivor states for $L = 1$.
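The list of survivor states can be generated mechanically from the coding rule. The sketch below assumes a labeling of the coder states that is consistent with the transitions listed above but is not spelled out in the text: `mu1` = two zeros in a row, `mu2` = a single zero, `mu3` = a single one, `mu4` = two ones in a row. Each coder state is represented as a hypothetical pair (last symbol, run length).

```python
# Assumed coder-state labels: (last symbol, current run length).
CODER_STATES = {"mu1": (0, 2), "mu2": (0, 1), "mu3": (1, 1), "mu4": (1, 2)}


def next_coder_state(state, symbol):
    """Coder state after sending `symbol`, or None if the run would exceed two."""
    last, run = state
    if symbol == last:
        return (last, run + 1) if run < 2 else None
    return (symbol, 1)


def survivor_states():
    """All pairs (coder state, allowed next symbol), i.e. (30) with L = 1."""
    return [(name, a) for name, st in CODER_STATES.items() for a in (0, 1)
            if next_coder_state(st, a) is not None]


def successors(sigma):
    """Survivor states that may follow the survivor state sigma."""
    name, symbol = sigma
    reached = next_coder_state(CODER_STATES[name], symbol)
    reached_name = next(n for n, st in CODER_STATES.items() if st == reached)
    return [(reached_name, a) for a in (0, 1)
            if next_coder_state(reached, a) is not None]


transitions = {sigma: successors(sigma) for sigma in survivor_states()}
```

Under this labeling `survivor_states()` lists the six states in the order $\sigma^1, \ldots, \sigma^6$, and `transitions` reproduces the six lines of the list above.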
The corresponding state-transition diagram is depicted in Fig. 3(b). By introducing the time parameter explicitly, we obtain Fig. 4, in which allowed transitions are indicated by dashed lines (the so-called trellis picture [20]). The solid lines, as an example, represent the path histories of the six possible states $\sigma_n$ and demonstrate their tendency to merge at some time $n - L - m$ into a common path. As the algorithm is used to compute the path histories of the states $\sigma_{n+i}$, $i = 1, 2, \ldots$, new path-history branches appear on the right, whereas certain existing branches are not continued further and disappear. In this way, with some random time lag $L + m$, $m \ge 0$, a common path history develops from left to right.

Fig. 4. Time-explicit representation of Fig. 3(b) (dashed lines) and illustration of path histories at time $n$ (solid lines).

In order to obtain the output sequence $\{\hat a_n\}$ only the last, say, $M$, pulse amplitudes of each path history have to be stored. $M$ should be chosen such that the probability for $m > M$ is negligible compared with the projected error probability of the ideal system (infinite $M$). Then, at time $n$, the $\alpha_{n-L-M}$ of all path histories will with high probability be identical; hence any one of them can be taken as $\hat a_{n-L-M}$. The path histories can now be shortened by the $\alpha_{n-L-M}$. Thus the path histories are kept at length $M$, and a constant delay through the MLSE of $L + M$ symbol intervals results. For decoding of convolutional codes the value of $M$ has been discussed in the literature [20], [36], [37]. At the present time no such results are available for the ISI case. A refined way of selecting $\hat a_{n-L-M}$, which allows for a reduction of $M$, is to take $\alpha_{n-L-M}$ from the path history corresponding to the largest survivor metric.

From (31) it is clear that without countermeasures the survivor metrics would steadily increase in value. A suitable method of confining the survivor metrics to a finite range is to subtract the largest $\tilde J_{n-1}$ from all $\tilde J_n$ after each iteration.
V. ERROR PERFORMANCE
Since the receiver of this paper realizes the same decision rule as Forney's receiver [18], it is not surprising that identical error performance will be found. In this section, following closely Forney's approach, we present a short derivation of the error-event probability for the modified Viterbi-algorithm case. The influence of ISI present at the MF output on the error performance of the MLR is discussed, and bounds for essentially no influence are given in explicit form. The results are compared with the error performance of the optimum conventional receiver.

We recall that $\{a_n\}$ represents the data sequence transmitted, whereas $\{\hat a_n\}$ denotes the sequence estimated by the receiver. Then

$$\{e_n\} = \{\hat a_n\} - \{a_n\} \qquad (33)$$

is the error sequence. Since consecutive symbol errors are generally not independent of each other, the concept of sequence error events must be used. Hence, as error events we consider short sequences of symbol errors that intuitively are short compared with the mean time between them and that occur independently of each other. Presuming stationarity, the beginning of a specific error event $\mathcal{E}$ can arbitrarily be aligned with time 0:

$$\mathcal{E}: \{e_n\} = \ldots, 0, 0, e_0, e_1, \ldots, e_H, 0, 0, \ldots; \qquad |e_0|, |e_H| \ge \delta_0, \quad H \ge 0. \qquad (34)$$

Here $\delta_0$ denotes the minimum symbol error distance

$$\delta_0 = \min_{i \ne k} \{|a^{(i)} - a^{(k)}|\}. \qquad (35)$$

We are not concerned with the meaning of error events in terms of erroneous bits of information.

Let $E$ be the set of events $\mathcal{E}$ permitted by the transmission code. For a distinct event $\mathcal{E}$ to happen, two subevents must occur:

$\mathcal{E}_1$: $\{a_n\}$ is such that $\{a_n\} + \{e_n\}$ is an allowable data sequence;
$\mathcal{E}_2$: the noise terms are such that $\{a_n\} + \{e_n\}$ has ML (within the observation interval).

It is useful to define beyond that the subevent

$\mathcal{E}_2'$: the noise terms are such that $\{a_n\} + \{e_n\}$ has greater likelihood than $\{a_n\}$, but not necessarily ML.

Then we have

$$\Pr(\mathcal{E}) = \Pr(\mathcal{E}_1)\Pr(\mathcal{E}_2 \mid \mathcal{E}_1) < \Pr(\mathcal{E}_1)\Pr(\mathcal{E}_2' \mid \mathcal{E}_1) \qquad (36)$$

where $\Pr(\mathcal{E}_1)$ depends only on the coding scheme. Events $\mathcal{E}_1$ are generally not mutually exclusive. Note that in (36) conditioning of $\mathcal{E}_2$ and $\mathcal{E}_2'$ on $\mathcal{E}_1$ tightens the given bound, since prescribing $\mathcal{E}_1$ reduces the number of other events that could, when $\mathcal{E}_2'$ occurs, still have greater likelihood, so that $\mathcal{E}_2$ would not be satisfied.

From (23) we conclude that $\Pr(\mathcal{E}_2' \mid \mathcal{E}_1)$ is the probability that

$$J_I(\{a_n\}) < J_I(\{a_n\} + \{e_n\}). \qquad (37)$$

By substituting (16) into (23), and observing (33) and $s_l = \bar s_{-l}$, (37) becomes

$$\delta^2(\mathcal{E}) \triangleq \frac{1}{s_0}\sum_{i=0}^{H}\sum_{k=0}^{H} \bar e_i s_{i-k} e_k < \frac{2}{s_0}\,\mathrm{Re}\Bigl[\sum_{i=0}^{H} \bar e_i r_i\Bigr]. \qquad (38)$$

We call $\delta(\mathcal{E})$ the distance of $\mathcal{E}$. The right-hand side of (38) is a normally distributed random variable with zero mean and, from (18), (19), and (20), variance

$$\frac{4N_0}{s_0}\,\delta^2(\mathcal{E}). \qquad (39)$$

Hence, observing (22), the probability of (37) being satisfied is given by

$$\Pr(\mathcal{E}_2' \mid \mathcal{E}_1) = Q[(S/N_{MF})^{1/2}\,\delta(\mathcal{E})/2] \qquad (40)$$

where

$$Q(x) = \frac{1}{(2\pi)^{1/2}}\int_x^{\infty} \exp(-y^2/2)\,dy \approx \frac{1}{x(2\pi)^{1/2}}\exp(-x^2/2), \qquad x > 3.5. \qquad (41)$$

We continue as indicated by Forney [18]. Let $E(\delta)$ be the subset of $E$ containing all events $\mathcal{E}$ with distance $\delta(\mathcal{E}) = \delta$. Let $\Delta$ be the set of the possible values of $\delta$. From (36) and (40) the probability that any error event $\mathcal{E}$ occurs becomes

$$\Pr(E) = \sum_{\mathcal{E} \in E} \Pr(\mathcal{E})$$

and is upper bounded by

$$\Pr(E) < \sum_{\delta \in \Delta} Q((S/N_{MF})^{1/2}\,\delta/2) \sum_{\mathcal{E} \in E(\delta)} \Pr(\mathcal{E}_1). \qquad (42)$$

Owing to the steep decrease of $Q(x)$, the right-hand side of (42) will already at moderate SNR be dominated by the term involving the smallest value in $\Delta$, denoted by $\delta_{\min}$. Likewise, the bound given by (36) becomes tight for all $\mathcal{E} \in E(\delta_{\min})$, as then $\mathcal{E}_2'$ very likely implies $\mathcal{E}_2$. Consequently, as the SNR is increased, $\Pr(E)$ approaches asymptotically

$$\Pr(E) \approx Q((S/N_{MF})^{1/2}\,\delta_{\min}/2) \sum_{\mathcal{E} \in E(\delta_{\min})} \Pr(\mathcal{E}_1) \qquad (43)$$

where

$$\delta_{\min}^2 = \min_{\mathcal{E} \in E} \delta^2(\mathcal{E}). \qquad (44)$$
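The quantities in (40) and (41) are easy to evaluate numerically. The following minimal sketch uses the complementary error function from Python's standard library; the function names and the example values (a binary alphabet with $\delta = 2$ and an SNR of 10 dB) are illustrative assumptions, not values from the paper.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) of (41), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def q_approx(x):
    """Large-argument approximation of Q(x) given in (41)."""
    return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi))


def pairwise_error_probability(snr_mf, delta):
    """Pr(E2' | E1) of (40) for MF output SNR s0/N0 and event distance delta."""
    return q_function(math.sqrt(snr_mf) * delta / 2)


snr_mf = 10 ** (10 / 10)      # hypothetical S/N_MF of 10 dB
p_event = pairwise_error_probability(snr_mf, delta=2.0)
ratio_at_4 = q_approx(4.0) / q_function(4.0)   # quality of the approximation
```

The approximation always overestimates $Q(x)$; at $x = 4$ the relative error is already below ten percent, and it shrinks further as $x$ grows.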
Equations (43) and (44) differ only in notation from Forney's original finding. , In the following this result should be discussed in more detail. Specifically, we are interested in the influence of IS] on the value of Dmin. Using Parseval's theorem, from (21) and (38), 02(e) can be rewritten in the form T 02(8) = .,So
fliT S*( f)E*( f) df.
(45)
0
where E* (f) is the energy density spectrum of the error sequence fen}, E*(f) =
II
Jl
L: L: eiexp [j2'71/(i -
k)T]ek
~ O.
(46)
i-O k=O
If S*( f) were constant (no lSI at the 1\1F output), (45) becomes 02(8) = T
1/7'
~
El e, H
E*(f) df =
2
1
~ wIl(8)o02
(47)
where 'WH > 1 denotes the number of nonzero symbol errors of e.ln this case Omin would simply be the smallest Euclidian distance between any two allowed data se-
THE BEST OF THE BEST
328
quences. In a noncoded system, where pulse amplitudes of a given alphabet may occur in arbitrary succession, single error events (tVH = 1) with minimum distance Omin = 00 would be the dominating error events. The value of Omin can be increased by redundant sequence coding, e.g., by convolutional encoding [36J. If S*( f) is not constant, lSI at the :\IF output distorts the space in which error-event distances are measured. Depending on the weighting of E* (f) by S* ( f), error-event distances can become smaller or larger. By sequence coding one can prevent that error sequences are allowed which have spectral peaks where S* (f) is small. This is precisely what is accomplished by correlative-level (partial-response) coding [2,5]. Clearly, if S* (f) vanishes on one side of f = 0, as in SSB or VSB systems, only (real) data sequences with symmetric error-sequence spectra E*(f) = E*( -f) can be transmitted. We now limit our attention to noncoded systems. As long as lSI does not exceed limits discussed further in the following paragraphs, we have Omin = 00. From (43) the probability of occurrence of the then dominating single error events becomes
$$\Pr(\mathcal{E}) \approx Q\!\left[(S/N_{MF})^{1/2}\,\delta_0/2\right]\cdot\sum_{|e_0|=\delta_0}\Pr(e_0\ \text{allowed}), \qquad \delta_{\min}=\delta_0. \qquad (48)$$
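As a rough numerical illustration of (48), the sketch below evaluates the approximation for a given matched-filter SNR. It assumes that $Q(\cdot)$ is the Gaussian tail integral and that $S/N_{MF}$ is supplied as a linear power ratio; the function names and the argument `allowed_sum` (standing for the sum of Pr ($e_0$ allowed) over $|e_0| = \delta_0$) are hypothetical and not taken from the paper.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x) = P{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def single_error_event_prob(snr_mf, delta0, allowed_sum):
    """Approximation in the spirit of (48).

    snr_mf      : S/N at the matched-filter output (linear ratio, assumed)
    delta0      : minimum symbol error distance
    allowed_sum : sum of Pr(e0 allowed) over all e0 with |e0| = delta0
    """
    return q_func(math.sqrt(snr_mf) * delta0 / 2.0) * allowed_sum
```

Under these assumptions the estimate falls monotonically with increasing SNR, as expected of a Q-function bound.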
For comparison, the error performance of the optimum conventional receiver is given by

$$\Pr(\mathcal{E}) \approx Q\!\left[(S/N_L)^{1/2}\,\delta_0/2\right]\sum_{|e_0|=\delta_0}\Pr(e_0\ \text{allowed}), \qquad S/N_L \le S/N_{MF}. \qquad (49)$$

We note that in (48) ISI at the MF output has essentially no influence on the error performance of the MLR, whereas in (49) ISI affects the error performance of the conventional receiver through the loss of SNR expressed by (24). The evaluation of $\Pr(\mathcal{E})$ is shown by Fig. 5 for a specific octal AM-PM scheme.

Fig. 5. (a) Minimum symbol error distance in a specific octal pulse-amplitude alphabet. (b) Minimum symbol errors and probabilities that they may occur (assuming that pulse amplitudes are transmitted with equal probability). $\sum_{|e_0|=\delta_0}$ Pr ($e_0$ allowed) = 2(t + i) = t, Pr ($\mathcal{E}$) ≈ tQ((2·S/N)^{1/2}).

In order to determine the degree of ISI up to which (48) holds, we must look for multiple error events ($w_H \ge 2$) with distance smaller than $\delta_0$. Such error events would then be more probable than the minimum single error events. A first condition for the nonexistence of such events can be derived from (45) and the inequality expressed in (47). Noting that

$$\delta^2(\mathcal{E}) \ge \frac{1}{s_0}\min\{S^*(f)\}\,w_H(\mathcal{E})\,\delta_0^2 \qquad (50)$$

it follows that if

$$\min\{S^*(f)\}\,w_H(\mathcal{E}) \ge s_0 = T\int_0^{1/T} S^*(f)\,df \qquad (51)$$

is satisfied, no event $\mathcal{E}$ can have smaller distance than $\delta_0$. Hence, a sufficient but not necessary condition for the nonexistence of multiple error events with distance smaller than $\delta_0$ is that $S^*(f)$ dips nowhere more than 6 dB [20 log ($w_H = 2$)] below average value. A second generally less restrictive condition is that

$$\frac{1}{s_0}\sum_{l\ne 0}|s_l| < 1 \qquad (52)$$

which is the familiar condition for peak distortion at the MF output being smaller than unity. In order to prove this sufficient but again not necessary condition for the nonexistence of error events $\mathcal{E}$ with distance smaller than $\delta_0$, one should first realize from (34) and the Schwarz inequality that

$$\left|\sum_{k=0}^{H}\bar e_{l+k}\,e_k\right| \le \sum_{k=0}^{H}|e_k|^2 - \delta_0^2, \qquad l \ne 0. \qquad (53)$$

Condition (52) can then be verified by transforming and bounding $\delta^2(\mathcal{E})$ as follows:

$$\delta^2(\mathcal{E}) = \frac{1}{s_0}\sum_{l} s_l\sum_{k=0}^{H}\bar e_{l+k}\,e_k \ge \sum_{k=0}^{H}|e_k|^2 - \frac{1}{s_0}\sum_{l\ne 0}|s_l|\left|\sum_{k=0}^{H}\bar e_{l+k}\,e_k\right| \ge \delta_0^2 \quad [\text{holds, if (52) is true}]. \qquad (54)$$
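The bound in (54) can be checked numerically for small examples. The following is a minimal sketch, assuming the distance is the quadratic form $(1/s_0)\sum_i\sum_k \bar e_i s_{i-k} e_k$ that (54) starts from, that $s_{-l} = \bar s_l$, and that $s_l = 0$ beyond the listed lags. The function names are hypothetical.

```python
def event_distance_sq(e, s):
    """delta^2 = (1/s_0) * sum_i sum_k conj(e_i) * s_{i-k} * e_k.

    e : list of (complex) symbol errors e_0..e_H
    s : list [s_0, s_1, ..., s_L]; s_{-l} = conj(s_l) is assumed,
        and s_l = 0 for |l| > L.
    """
    def s_at(l):
        if abs(l) >= len(s):
            return 0.0
        return s[l] if l >= 0 else complex(s[-l]).conjugate()

    total = 0.0
    for i, ei in enumerate(e):
        for k, ek in enumerate(e):
            total += complex(ei).conjugate() * s_at(i - k) * ek
    return (total / s[0]).real


def peak_distortion(s):
    """Left-hand side of (52): (1/s_0) * sum over l != 0 of |s_l|."""
    return 2.0 * sum(abs(v) for v in s[1:]) / abs(s[0])
```

For example, with $s_0 = 1$, $s_{\pm 1} = 0.4$ and the double error event $e = (2, -2)$ (so $\delta_0 = 2$), the peak distortion is 0.8, below unity, and the distance stays above $\delta_0^2 = 4$, in line with (54).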
Comparing (52) with the definition of $S^*(f)$ in (21) reveals that at distinct frequencies, $S^*(f)$ may approach zero level without this significantly affecting the error performance of the MLR. This was first observed by Kobayashi [25] for ML decoding of $(2m-1)$-ary correlative-level encoded signals. A signal of this kind can just as well be interpreted as a noncoded $m$-ary signal with intentionally introduced ISI, which causes $S^*(f)$ to become zero (usually) at $f = 0$ and/or $1/2T$. For the MLR the two concepts are equivalent. A conventional receiver, however, can interpret such signals only as $(2m-1)$-ary coded sequences and thereby loses in the limit 3 dB, unless
error correcting schemes are used. But even then the loss can only partly be compensated, since the hard decisions made by the symbol-by-symbol decision circuit of the conventional receiver cause an irreversible loss of information.

VI. AUTOMATIC RECEIVER ADAPTATION

So far the exact signal and timing characteristics have been assumed to be known. However, in a realistic case the MLR must at least be able to extract the carrier phase and sample timing from the signal received. Beyond that, automatic adjustment of the MF will often be desirable or necessary. In this section we present an algorithm that simultaneously adjusts the demodulating carrier phase and the sample timing, approximates the MF by a transversal filter, and estimates ISI present at the approximated MF output. The algorithm works in decision-directed mode in much the same way as described by Kobayashi [5] and Qureshi and Newhall [29].

In the proposed fully adaptive MLR the MF is approximated by a transversal filter, similar to the familiar adaptive equalizers described by Lucky et al. [1], [38], [39] and others [5], [40]-[45]. An analog implementation will be assumed. Assuming $N + 1$ taps equally spaced by $\tau_p$ seconds with tap gains $g_i$, $0 \le i \le N$, the output signal of the transversal filter at the $n$th sampling instant $nT + \tau_s$, where $\tau_s$ denotes the sampling phase, becomes

$$z_n = \sum_{i=0}^{N} g_i\,y(nT + \tau_s - i\tau_p, \varphi_c) \triangleq \sum_{i=0}^{N} g_i\,y_{ni}(\tau_s, \varphi_c). \qquad (55)$$
Note that according to (4) and (.5) ,ve have y(t,
y (t,C{Jc
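must be minimized as a function of these parameters, with $s_0$ held constant. Differentiating (57) and applying the Robbins-Monro stochastic approximation method [46] leads to the stochastic steepest-descent algorithm comprising the following recursive relations:

$$g_i^{(n+1)} = g_i^{(n)} - \alpha_g^{(n)}\,r_n\,\bar y_{ni}, \qquad 0 \le i \le N \qquad (58)$$

$$s_l^{(n+1)} = s_l^{(n)} + \alpha_s^{(n)}\,r_n\,\bar a_{n-l}, \qquad 1 \le |l| \le L \qquad (59)$$

$$\tau_s^{(n+1)} = \tau_s^{(n)} - \alpha_\tau^{(n)}\,\mathrm{Re}\,(\bar r_n \dot z_n) \qquad (60)$$

$$\varphi_c^{(n+1)} = \varphi_c^{(n)} + \alpha_\varphi^{(n)}\,\mathrm{Im}\,(\bar r_n z_n). \qquad (61)$$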
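The tap-gain and ISI-estimate recursions (58) and (59) can be sketched as a single decision-directed update step. The code below is only an illustration under stated assumptions: the residual is taken to be $r_n = z_n - \sum_{l=-L}^{L} s_l a_{n-l}$, the conjugates follow the usual complex-gradient convention, and the function name, argument layout, and constant step sizes are hypothetical. Timing and carrier-phase updates (60), (61) are left out.

```python
import numpy as np


def adapt_step(g, s, y_taps, a_win, alpha_g, alpha_s):
    """One decision-directed update of tap gains (58) and ISI estimates (59).

    g       : complex array, N+1 tap gains g_0..g_N
    s       : complex array, 2L+1 values s_{-L}..s_{L}; the centre s_0 is held fixed
    y_taps  : complex array, N+1 tap signals y_{n,0}..y_{n,N}
    a_win   : complex array, 2L+1 decisions ordered a_{n+L}..a_{n-L},
              so that a_win[j] pairs with s[j] (lag l = j - L)
    """
    L = (len(s) - 1) // 2
    z = np.dot(g, y_taps)                        # filter output, cf. (55)
    r = z - np.dot(s, a_win)                     # residual (assumed form)
    g_new = g - alpha_g * r * np.conj(y_taps)    # cf. (58)
    s_new = s + alpha_s * r * np.conj(a_win)     # cf. (59)
    s_new[L] = s[L]                              # s_0 is not adapted
    return g_new, s_new, r
```

With a single active tap, $g_0 = 0.5$, $y_{n0} = 1$, $s_0 = 1$, $a_n = 1$ and step size 0.1, the residual is $-0.5$ and the tap gain moves to 0.55, which reduces the residual magnitude on the next step.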
The step-size gains $\alpha_g$, $\alpha_s$, $\alpha_\tau$, and $\alpha_\varphi$ must be positive and may depend on $n$. In (60) $\dot z_n$ denotes the time derivative of the transversal filter output at the $n$th sampling instant. As the algorithm adjusts the transversal filter as MF, the adapted values $s_l$ approach the values required by the MLSE algorithm. Equations (58) and (59) differ from the corresponding equations of an adaptive decision-feedback equalizer, or whitened MF, only by the fact that here at the transversal filter output $L$ preceding and $L$ trailing ISI terms are considered. The algorithm will force ISI outside this interval to zero. The true MF characteristic may often require a large value of $L$. However, since the complexity of the MLSE algorithm increases exponentially with $L$, a compromise suggests itself for the choice of $L$ [29]. In many practical cases a good approximation of the ideal MF characteristic will be obtained already with $L = 1$ or $L = 2$. The potential advantages of MLSE can thus be exploited to a commensurate degree at a still manageable receiver complexity. Introducing the symmetry condition $s_{-l} = \bar s_l$ into (56), we obtain instead of (59)

… $1 \le l \le L$. (62)
This modification has the desirable effect of forcing the transversal filter to produce at its output a symmetric signal element even if $L$ and the transversal filter parameters are not fully adequate to achieve therewith the ideal MF characteristic. Equations (60) and (61) have been reported by Kobayashi [5]. They describe the operation of two first-order phase-locked loops. Theoretically, if by (58) the (complex) tap gains are adapted, the adjustment of $\tau_s$ and $\varphi_c$ appears to be not really necessary. In practice, however, these phases must be controlled in order to compensate carrier and sampling frequency offsets. In case of considerable offset one might even add second-order terms to (60) and (61). The structure of the proposed MLR is seen in Fig. 6. It is basically a combination of the approaches of Kobayashi [5] and Qureshi and Newhall [29], except that here

$$r_n = z_n - \sum_{l=-L}^{L} s_l\,a_{n-l} \qquad (56)$$

represents the estimated noise component of $z_n$. In order to adjust the parameters $g_i$, $s_l$, $\tau_s$, and $\varphi_c$, the variance of $r_n$,

$$\mathrm{var}\,(r_n) = \sum_{i=0}^{N}\sum_{k=0}^{N} \bar g_i\,E[\bar y_{ni}(\tau_s,\varphi_c)\,y_{nk}(\tau_s,\varphi_c)]\,g_k - 2\,\mathrm{Re}\sum\sum g_i\,E[y_{ni}(\tau_s, \ldots$$
Fig. 6. Adaptive MLR.

Section IV). In order to shorten the feedback delay, tentative decisions taken from the path history with largest survivor metric are employed in the feedback paths as suggested by Qureshi [30]. With this approach, delays of $M$ symbol intervals must be used in the forward paths. Not shown in Fig. 6 is the possibility of incorporating decision-feedback cancellation of further trailing ISI in the receiver [30]. In the remainder of this section we discuss topics related to convergence and convexity of the adjustment algorithm. It must be assumed that sufficiently reliable decisions are already available. To begin with, $\tau_s$ and $\varphi_c$ are considered as given constant values. With sufficiently small step-size gains $\alpha_g$ and $\alpha_s$, convergence from arbitrary initial settings $g_i^{(0)}$ and $s_l^{(0)}$ towards globally optimum settings $g_i^*(\tau_s,$

signal element at the MF output. With the receiver working in decision-directed mode and disregarding phase ambiguity, $\tau_s$ and $\varphi_c$ must in principle be close to the optimum settings. Convergence towards $\tau_s^*$ and $\varphi_c^*$ should therefore generally not be a problem.

VII. SUMMARY AND CONCLUSIONS
A uniform fully adaptive receiver structure has been derived for synchronous data-transmission systems that employ linear carrier-modulation techniques. The structure realizes the ML sequence rule. In the receiver, first an information reduction to a set of sufficient statistics takes place by the demodulation, matched filtering, and symbol-rate sampling process. Sequence estimation is performed by a modified Viterbi algorithm that exhibits the same performance characteristics as the original scheme. The algorithm represents an attractive design alternative due to the fact that squaring operations are no longer needed. Besides add and compare operations, only a few simple multiplications by discrete pulse-amplitude values must be performed in real time. In addition to performance gains realized by the MLSE principle, one may expect that the approximation of the MF will generally require fewer filter taps than are needed for the zero-forcing equalizer of a conventional receiver. The proposed adaptation scheme permits compromise solutions between the conventional receiver and the ideal ML receiver.

The choice of the decoding delay of MLSE in the presence of ISI and the dynamics of the presented adjustment algorithm have not been discussed in detail. Also the issues of effective QAM coding and joint transmitter-receiver design have not been addressed. These could be fruitful areas for further research. For example, how should the transmitter filter be designed for a given channel characteristic in order to attain, with (52) as secondary condition, maximum SNR at the matched filter output? The specific implementation of ML receivers will be another interesting topic. Recent progress in circuit technology will allow here for much more complex designs than we are still used to.
APPENDIX I
PROOF OF (9)
Owing to the one-to-one relation between 'tv(t) expressed by (fi) we have [32]
'We (t)
and
p[wc(t) ,t E I] = p['lv(t)
,t E I]
""' exp {-
~ ~ ~ 1t'c(t1) Wc-l(t l - ~)tv.(t.l) lltl d~} . (AI)
It follows from (6), (7), and (10) that We-leT) =
~ K-l(T) No
exp (+jWcT).
(A2)
Substituting (A2) into (AI) and observing (,5) we obtain p[w(t)
,t E I]
values sz* which minimize var (r) must satisfy N
"'-' exp {-
4~o ~ ~ 1v(ll)K-l(ll - ~)w(l:!) dl
L
1
dl:!} (A3)
APPENDIX II
L
L
E(YiYk)gk* -
k=O
(A8)
and N
EXTENSION OF THE l\iLSE ALGORITHl\1
E(Y1,a-l)sl* = 0,
l=-L
L
E(a-lYi)gi* -
i=O
L
L
E(a_Za-k),sk*
k~-L
TO THE iVIAP RULE
We proceed as indicated by Forney [27J. To satisfy the l\1AP rule the algorithm has to determine the sequence -[ an} which maximizes Pr [{ an} I y (t) ,t E I]
rov
I {an}] Pr
p[y (t) ,t E I
[{ an} ].
(A4) Since there is a one-to-one correspondence between {an} and the sequence of survivor states {Un}, and since {
Pr [{ an}] = Pr [{ 0"n } ]
rov
II Pr
«r a
(
(A5)
var [r I {Ui*(Ts,tpc) Lfsl*(T.)}] Yi( 'fa) =
L akh( T,
- iTp
f
kT)
-
k
= X.
(AlO
w( r, - iTp )
.
(All
With power series notations
= L h(t + kT)D"
(A12
k
(A6)
A(D) =
max
fJ'" n-l ( Un-l) - F( Un-l,
l
+ 4N o In CPr (
L
E(aOak)Dk
(A13
SZ*Dl
(A14.
k
Taking the logarithm of (A4) and observing (A5) , it is seen that transitions (Un-l,Un) are to be weighted by In CPr (Un I Un-I)]. In this way the IVIAP version of (31) becomes
+
= var (r IT.)
0
From (3) and (55) we have
h(t,D)
Pr (Un I Un-I) = Pr (Iln tUn-i).
2 R e ( anZ n)
l=O
where A acts as Lagrangian multiplier. Substitution (A8) and (A9) into (57) yields
It follows from (30) that
() = J nUn,
1°'-h,
=
rio,
L
=
L
I--L
(A8) and (A9) can be rewritten in the form N
(A7)
The factor $4N_0$ follows from (11) and, according to (22), is inversely proportional to the SNR at the MF output. The ML rule and the MAP rule are therefore equivalent for infinite SNR. The MAP rule can offer a significant advantage only at very low SNR's and when a code is used that leads to considerable differences among the conditional probabilities $\Pr(\sigma_n \mid \sigma_{n-1})$.
APPENDIX III

First, we show that minimizing var $(r_n)$ with $s_0$ held constant indeed adjusts the transversal filter as MF, and that thereby the values of $s_l$ are provided to a sufficient degree of approximation. Second, we study the convexity of var $(r_n)$ relative to the sampling phase $\tau_s$. Assuming sampling instant $n = 0$, we drop the index $n$ in the following calculations. To begin with, the sampling phase $\tau_s$ and the demodulating carrier phase $\varphi_c$ are considered as given constant values. It follows from (57) that the optimum tap gains $g_i^*$ and
{h(1', - iTp,D-l)'L A(D)h(T s
-
k1'p,D)gk*IDo
k=O N
+ L: W[Ci -
k)Tp]gk*
k=O
= {h(r, - irp,n- 1 ) A (D)s*(D)
l»o,
os
i ~ N
(A15
III s L.
(A16
and N
II: A(D)h(T
8
-
kTp,D)Yk*IDl
k=O
=
(A(D)s*(D) - X}DZ,
Here $\{\cdot\}_{D^l}$ indicates the coefficient belonging to $D^l$. With $L$ not being too restricted, and $W_{ik} = W[(i-k)\tau_p]$, substitution of (A16) into (A15) yields in good approximation

$$\sum_{k=0}^{N} W_{ik}\,g_k^* \approx \lambda\,\bar h(\tau_s - i\tau_p). \qquad (A17)$$

Let $W_{ki}^{-1}$ be the elements of the inverse of the $(N+1)\times(N+1)$ matrix with elements $W_{ik}$. Then from (A17),

$$g_k^* \approx \lambda\sum_{i=0}^{N}\bar h(\tau_s - i\tau_p)\,W_{ki}^{-1}, \qquad 0 \le k \le N. \qquad (A18)$$
Substituting (AIS) into (A16), and observing (A12) and (A14), we obtain N
N
L L
h(T s
-
iT
p) H'ki- 1h(T8
kTp + IT)
-
i-O k-O
! II
~ L.
(A19)
For moderate SNR we can neglect the last term in (A19), which involves minor approximation. Comparison of (A18) with (14) and of (A19) with (17) shows that with adequate values of $N$, $\tau_p$, $L$, and $\tau_s$, the desired adjustment of the adaptive MLR will be achieved. We investigate now the convexity of var $(r)$ relative to $\tau_s$. Considering (A10) and determining $\lambda$ from (A19) with $l = 0$, we find

$$\mathrm{var}\,(r \mid \tau_s) \approx \frac{s_0}{\sum_{i=0}^{N}\sum_{k=0}^{N}\bar h(\tau_s - i\tau_p)\,W_{ki}^{-1}\,h(\tau_s - k\tau_p)}. \qquad (A20)$$
The denominator of (A20) expresses the weighted energy of the values of $h(t)$ seen at time $t = \tau_s$ at the transversal filter taps. We assume that the length of the transversal filter delay line $N\tau_p$ exceeds or at least corresponds to the duration of $h(t)$. var $(r \mid \tau_s)$ will then, on the whole, be convex within an interval comparable to $N\tau_p$. Unless the tap spacing $\tau_p$ is very small, however, there will be some ripple within this region. Suppose now that for some $\tau_s^*$ and
L (Ji*(T *,
i=O
8 )
~ first term
+ third term
L
i=-O
SI*~A
var (r I LiT
iTp
+
IT,,!,c)
where set) denotes the time-continuous signal element at the l\11~" output [s(l7') = Sl], (57) becomes
var (r I ~'8,~'P(J ~ first term
+ third term D
- 2Ile
ILL: s(lT)E(akal) I.=-L
·S(~T8
+ leT)
k
exp ( -jA
Minimizing (A22) with respect to ~
(A22)
- 21 .L L 1=-£ k
s(lT)E(akal)s(Als
+ kT) I. (A23)
The form of (A23) permits local minima to occur within short distance from $\tau_s^*$. However, with the receiver working in decision-directed mode, $|\Delta\tau_s| < T/2$ can be assumed, and hence convergence towards $\tau_s^*$ and $\varphi_c^*$ can hardly be a problem.

REFERENCES
[1] R. W. Lucky, J. Salz, and E. J. Weldon, Jr., Principles of Data Communication. New York: McGraw-Hill, 1968.
[2] C. W. Helstrom, Statistical Theory of Signal Detection, revised ed. New York: Pergamon, 1968.
[3] H. L. Van Trees, Detection, Estimation and Modulation Theory, pt. 1. New York: Wiley, 1968.
[4] a) H. B. Voelcker, "Toward a unified theory of modulation-Part I: Phase-envelope relationships," Proc. IEEE, vol. 54, pp. 340-353, Mar. 1966. b) H. B. Voelcker, "Toward a unified theory of modulation-Part II: Zero manipulation," Proc. IEEE, vol. 54, pp. 735-755, May 1966.
[5] H. Kobayashi, "Simultaneous adaptive estimation and decision algorithm for carrier modulated data transmission systems," IEEE Trans. Commun. Technol., vol. COM-19, pp. 268-280, June 1971.
[6] D. W. Tufts, "Nyquist's problem-The joint optimization of transmitter and receiver in pulse amplitude modulation," Proc. IEEE, vol. 53, pp. 248-259, Mar. 1965.
[7] M. R. Aaron and D. W. Tufts, "Intersymbol interference and error probability," IEEE Trans. Inform. Theory, vol. IT-12, pp. 26-34, Jan. 1966.
[8] T. Berger and D. W. Tufts, "Optimum pulse amplitude modulation, Part I: Transmitter-receiver design and bounds from information theory," IEEE Trans. Inform. Theory, vol. IT-13, pp. 196-208, Apr. 1967.
[9] T. Ericson, "Structure of optimum receiving filters in data transmission systems," IEEE Trans. Inform. Theory (Corresp.), vol. IT-17, pp. 352-353, May 1971.
[10] M. E. Austin, "Decision-feedback equalization for digital communication over dispersive channels," Sc.D. thesis, Mass. Inst. Technol., Cambridge, May 1967.
[11] D. A. George, R. R. Bowen, and J. R. Storey, "An adaptive decision-feedback equalizer," IEEE Trans. Commun. Technol., vol. COM-19, pp. 281-293, June 1971.
[12] R. W. Chang and J. C. Hancock, "On receiver structures for channels having memory," IEEE Trans. Inform. Theory, vol. IT-12, pp. 463-468, Oct. 1966.
[13] K. Abend, T. J. Harley, B. D. Fritchman, and C. Gumacos, "On optimum receivers for channels having memory," IEEE Trans. Inform. Theory (Corresp.), vol. IT-14, pp. 819-820, Nov. 1968.
[14] R. R. Bowen, "Bayesian decision procedure for interfering digital signals," IEEE Trans. Inform. Theory (Corresp.), vol. IT-15, pp. 506-507, July 1969.
[15] R. A. Gonsalves, "Maximum-likelihood receiver for digital data transmission," IEEE Trans. Commun. Technol., vol. COM-16, pp. 392-398, June 1968.
[16] K. Abend and B. D. Fritchman, "Statistical detection for communication channels with intersymbol interference," Proc. IEEE, vol. 58, pp. 779-785, May 1970.
[17] G. Ungerboeck, "Nonlinear equalization of binary signals in Gaussian noise," IEEE Trans. Commun. Technol., vol. COM-19, pp. 1128-1137, Dec. 1971.
[18] G. D. Forney, "Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-18, pp. 363-378, May 1972.
[19] A. J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Trans. Inform. Theory, vol. IT-13, pp. 260-269, Apr. 1967.
[20] G. D. Forney, "Review of random tree codes," NASA Ames Res. Cen., Moffett Field, Calif., Contract NAS2-3637, NASA CR 73176, Final Rep., Appendix A, Dec. 1967.
[21] J. K. Omura, "On the Viterbi decoding algorithm," IEEE Trans. Inform. Theory (Corresp.), vol. IT-15, pp. 177-179, Jan. 1969.
[22] R. Bellman, Dynamic Programming. Princeton, N. J.: Princeton Univ. Press, 1957.
[23] J. K. Omura, "On optimum receivers for channels with intersymbol interference" (Abstract), presented at the IEEE Int. Symp. Information Theory, Noordwijk, Holland, 1970.
[24] H. Kobayashi, "Application of probabilistic decoding to digital magnetic recording systems," IBM J. Res. Develop., vol. 15, pp. 64-74, Jan. 1971.
[25] H. Kobayashi, "Correlative level coding and maximum-likelihood decoding," IEEE Trans. Inform. Theory, vol. IT-17, pp. 586-594, Sept. 1971.
[26] H. Kobayashi, "A survey of coding schemes for transmission or recording of digital data," IEEE Trans. Commun. Technol., vol. COM-19, pp. 1087-1100, Dec. 1971.
[27] G. D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, pp. 268-278, Mar. 1973.
[28] F. R. Magee, Jr., and J. G. Proakis, "Adaptive maximum-likelihood sequence estimation for digital signaling in the presence of intersymbol interference," IEEE Trans. Inform. Theory (Corresp.), vol. IT-19, pp. 120-124, Jan. 1973.
[29] S. U. H. Qureshi and E. E. Newhall, "An adaptive receiver for data transmission over time-dispersive channels," IEEE Trans. Inform. Theory, vol. IT-19, pp. 448-457, July 1973.
[30] S. U. H. Qureshi, "An adaptive decision-feedback receiver using maximum-likelihood sequence estimation," presented at the 1973 Int. Communications Conf., Seattle, Wash.
[31] L. K. Mackechnie, "Receivers for channels with intersymbol interference" (Abstract), presented at the IEEE Int. Symp. Information Theory, 1972, p. 82.
[32] M. Schwartz, "Abstract vector spaces applied to problems in detection and estimation theory," IEEE Trans. Inform. Theory, vol. IT-12, pp. 327-336, July 1966.
[33] J. Capon, "Hilbert space methods for detection theory and pattern recognition," IEEE Trans. Inform. Theory, vol. IT-11, pp. 247-259, Apr. 1965.
[34] T. Kailath, "RKHS approach to detection and estimation problems-Part I: Deterministic signals in Gaussian noise," IEEE Trans. Inform. Theory, vol. IT-17, pp. 530-549, Sept. 1971.
[35] R. Price, "Nonlinearly feedback-equalized PAM versus capacity for noisy filter channels," presented at the 1972 Int. Conf. Communications, Philadelphia, Pa.
[36] A. J. Viterbi, "Convolutional codes and their performance in communication systems," IEEE Trans. Commun. Technol., vol. COM-19, pp. 751-772, Oct. 1971.
[37] J. A. Heller and I. M. Jacobs, "Viterbi decoding for satellite and space communications," IEEE Trans. Commun. Technol., vol. COM-19, pp. 835-848, Oct. 1971.
[38] R. W. Lucky, "Automatic equalization for digital communication," Bell Syst. Tech. J., vol. 44, pp. 547-588, Apr. 1965.
[39] R. W. Lucky, "Techniques for adaptive equalization of digital communication systems," Bell Syst. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[40] M. J. DiToro, "Communication in time-frequency spread media using adaptive equalization," Proc. IEEE, vol. 56, pp. 1653-1679, Oct. 1968.
[41] A. Gersho, "Adaptive equalization of highly dispersive channels for data transmission," Bell Syst. Tech. J., vol. 48, pp. 55-70, Jan. 1969.
[42] J. G. Proakis and J. H. Miller, "An adaptive receiver for digital signaling through channels with intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-15, pp. 484-497, July 1969.
[43] D. Hirsch and W. J. Wolf, "A simple adaptive equalizer for efficient data transmission," IEEE Trans. Commun. Technol., vol. COM-18, pp. 5-11, Feb. 1970.
[44] K. Mohrmann, "Einige Verfahren zur adaptiven Einstellung von Entzerrern für die schnelle Datenübertragung," Nachrichtentech. Z., vol. 24, pp. 18-24, Jan. 1971.
[45] G. Ungerboeck, "Theory on the speed of convergence in adaptive equalizers for digital communication," IBM J. Res. Develop., vol. 16, pp. 546-555, Nov. 1972.
[46] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Stat., pp. 400-407, 1951.
Gottfried Ungerboeck was born in Vienna, Austria, in 1940. He received the Dipl.Ing. degree in telecommunications from the Technische Hochschule, Vienna, in 1964 and the Ph.D. degree from the Swiss Federal Institute of Technology, Zurich, Switzerland, in 1970. After graduating from the Technische Hochschule, he joined the technical staff of the Wiener Schwachstromwerke, Vienna. In 1967 he became a member of the IBM Zurich Research Laboratory, Rüschlikon, Switzerland. He worked in speech processing, data switching, and data transmission theory. His present interests cover signal processing and detection and estimation theory.
Error Probability in the Presence of Intersymbol Interference and Additive Noise for Multilevel Digital Signals

SERGIO BENEDETTO, GIROLAMO DE VINCENTIIS, AND ANGELO LUVISON

Abstract-A new method is presented to compute the average probability of error in the presence of intersymbol interference and additive noise for multilevel pulse-amplitude-modulation (PAM) and partial-response-coded (PRC) signaling schemes. The method is based upon nonclassical Gauss quadrature rules (GQR) and suffers no limitation on noise statistics, so that it applies also for non-Gaussian noise. Moreover it yields some remarkable advantages as compared with other methods, in particular, with the series expansion method that has recently received considerable attention. Expressions for the truncation error are also given and their derivation is reported in the Appendix. Finally, examples of applications are presented and comparisons with other methods are carried out.
I. INTRODUCTION

THIS paper deals with the evaluation of the average probability of error in multilevel pulse-amplitude-modulation (PAM) and partial-response-coded (PRC) data transmission systems in the presence of both intersymbol interference and additive noise. As this problem is one of the most important in the determination of the system performance, it has received considerable attention by the researchers in the area of digital communication theory.

The first obvious approach (exhaustive method) is to consider a truncated $M$-pulse-train approximation of the real channel. The evaluation of the probability of error is performed by computing the conditional error probability for each of the $L^M$ possible sequences of data, $L$ being the number of levels, and then averaging over all the sequences. However, the total number of such sequences is limited by the computer time needed to perform the average. As a consequence only a few interfering samples can be taken into account and the approximation of the true channel becomes very poor, especially when dealing with multilevel systems.

When the additive noise is Gaussian, some authors evaluate an upper bound to the error probability by either the worst case sequence [1], or the Chernoff inequality [2], [3]; other authors [4], [5], [7] compute the error probability for binary and multilevel systems by means of a Hermite polynomials series expansion, or, only for binary systems, by a Gram-Charlier expansion [6], [8]. All these methods suffer some disadvantages: 1) the former gives bounds that are often too loose; 2) the latter provides oscillating results when the channel distortion, or the signal-to-noise ratio, or the number of levels increase, in spite of the absolute convergence of the series that has been theoretically proved.

In this paper we present a new method to evaluate the average probability of error within any desired accuracy. The proposed procedure is based on nonclassical Gauss quadrature rules (GQR) and suffers no limitation on noise statistics, so that it applies as well to non-Gaussian noise. Moreover it always assures accurate results and very satisfactory performances even when other methods fail [4], [7]. The computer time required in the numerical evaluation is shorter, especially when the error probability is computed for many values of the signal-to-noise ratio.

In Section II the problem is stated and a general expression for the error probability is derived. In Section III the method to compute the probability of error is explained, together with a brief description of GQR's. In Section IV the truncation error due to the finite number of points in the quadrature rule is analyzed and an upper bound for the error is found. A discussion about the roundoff errors is also given. Finally, in Section V, the method is applied to some particular examples of additive Gaussian noise channels.

Paper approved by the Data Communications Committee of the IEEE Communications Society for publication without oral presentation. Manuscript received February 3, 1972. S. Benedetto is with the Istituto di Elettronica e Telecomunicazioni del Politecnico di Torino, Turin, Italy. G. De Vincentiis and A. Luvison are with the Centro Studi e Laboratori Telecomunicazioni (CSELT), Turin, Italy.

II. STATEMENT OF THE PROBLEM

The received signal at the input to the decision device for an $L$-level digital PAM system is well known to be

$$y(t) = \sum_{h=-\infty}^{+\infty} a_h\,r(t - hT) + n(t), \qquad (1)$$

where $n(t)$ is a noise process, $r(t)$ the impulse response of the overall time-invariant linear system, $T$ the signaling period, and the sequence of $a_h$ the random stream of symbols that can assume the values

$$\pm d, \pm 3d, \pm 5d, \cdots, \pm(L-1)d \qquad (L\ \text{even}) \qquad (2)$$

with probabilities

$$p_k \triangleq P\{a_h = [2k - \mathrm{sgn}\,(k)]\,d\}, \qquad -L/2 \le k \le L/2,\ k \ne 0 \qquad (3)$$
Reprinted from IEEE Transactions on Communications, vol. COM-21, no. 3, March 1973. The Best ofthe Best. Edited by W H. Tranter, D. P Taylor, R. E. Ziemer, N. F. Maxemchuk, and 1. W Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
such that

$$p_k = p_{-k}. \qquad (4)$$
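As a point of reference, the exhaustive method described in the Introduction can be sketched in a few lines for one special case. The code below is only an illustration under explicit assumptions: Gaussian noise of standard deviation `sigma`, equiprobable and independent symbols (stronger than the symmetry condition (4)), decision thresholds midway between adjacent levels, and a positive main sample `r0`. All function and parameter names are hypothetical.

```python
import itertools
import math


def q_func(x):
    """Gaussian tail probability Q(x) = P{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def exhaustive_pam_error(r0, isi, L, d, sigma):
    """Average symbol error probability by enumerating all L**M sequences.

    r0    : main sample of the overall impulse response (assumed > 0)
    isi   : list of the M interfering samples r_h, h != 0
    L     : number of levels (even), amplitudes +-d, +-3d, ..., +-(L-1)d
    d     : half the spacing between adjacent amplitudes
    sigma : standard deviation of the zero-mean Gaussian noise sample
    """
    levels = [(2 * k - L + 1) * d for k in range(L)]
    total = 0.0
    count = 0
    for seq in itertools.product(levels, repeat=len(isi)):
        x = sum(a * r for a, r in zip(seq, isi))   # ISI value for this sequence
        p_up = q_func((d * r0 - x) / sigma)        # P{x + n0 >  d*r0}
        p_dn = q_func((d * r0 + x) / sigma)        # P{x + n0 < -d*r0}
        # inner levels can err on both sides, each outer level on one side only
        total += (p_up + p_dn) * (1.0 - 1.0 / L)
        count += 1
    return total / count
```

The loop runs over $L^M$ sequences, which is exactly the cost that limits this approach to a few interfering samples.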
At the detector, $y(t)$ is sampled every $T$ seconds to determine the amplitude of the transmitted symbols. At the sampling time $t_0$, the signal can be written in the form

$$y_0 = a_0 r_0 + {\sum_{h=-\infty}^{+\infty}}' a_h r_h + n_0, \qquad (5)$$

where

$$y_0 = y(t_0), \quad r_0 = r(t_0), \quad r_h = r(t_0 - hT), \quad n_0 = n(t_0) \qquad (6)$$

and $\sum'$ does not include the term $h = 0$.

The first term in (5) is the desired signal, whereas the second and the third terms represent the intersymbol interference and the noise, respectively. The slicing levels at the detector are

$$-(L-2)dr_0, \cdots, -2dr_0, 0, 2dr_0, \cdots, (L-2)dr_0. \qquad (7)$$

Let us assume that

$$X = {\sum_{h=-\infty}^{+\infty}}' a_h r_h \qquad (8)$$

is a random variable; so the average error probability has the following expression [7]:

$$P(e) = {\sum_{k=-L/2}^{L/2}}' p_k\,P\{|X + n_0| > dr_0\} - p_{L/2}\,P\{X + n_0 > dr_0\} - p_{-L/2}\,P\{X + n_0 < -dr_0\}$$
$$= P\{|X + n_0| > dr_0\} - p_{L/2}\,P\{X + n_0 > dr_0\} - p_{-L/2}\,P\{X + n_0 < -dr_0\}. \qquad (9)$$

Because of the hypothesis in (4), the probability density function (pdf) of $X$ is an even function. Also, if the pdf of the noise sample $n_0$ is even, (9) becomes

$$P(e) = 2(1 - p_{L/2})\,P\{X + n_0 > dr_0\}. \qquad (10)$$

Denoting with $D(\cdot)$ the distribution function of the random variable (RV) $n_0$, and observing that

$$P\{X + n_0 > dr_0\} = E[P\{x + n_0 > dr_0 \mid X = x\}] = E[1 - D(dr_0 - x)] \qquad (11)$$

the average error probability assumes the following form

$$P(e) = 2(1 - p_{L/2})\,E[1 - D(dr_0 - x)]. \qquad (12)$$

We suppose henceforth that the impulse response has only a finite number $M$ of interfering samples, that is

$$r_h = 0, \qquad h < -p,\ h > q,\ p + q = M. \qquad (13)$$

This is a reasonable approximation for any actual channel; moreover it allows us to avoid the mathematical difficulties arising from the problem of the convergence of a series of RV's to an RV (see [4] or [6] for a discussion about this problem). As a matter of fact, under this hypothesis, the pdf $w(\cdot)$ of the RV $X$ always exists, being a finite sum of the kind

$$w(x) = \sum_{i=1}^{L^M} q_i\,\delta(x - x_i) \qquad (14)$$

where $\delta(\cdot)$ is the Dirac delta function and $x_i$ the discrete values taken by $X$, and $q_i = P\{X = x_i\}$. Thus (12) can be represented formally in two ways: with the aid of generalized functions, which are involved in (14), we have

$$P(e) = 2(1 - p_{L/2})\int_{[X]} [1 - D(dr_0 - x)]\,w(x)\,dx, \qquad (15)$$

where $[X]$ is the range of the RV $X$; or by introducing the Stieltjes integral we are led to write

$$P(e) = 2(1 - p_{L/2})\int_{[X]} [1 - D(dr_0 - x)]\,dF(x). \qquad (16)$$
The last representation is based on the fact that f(x) ~s the distribution function of the RV X, thus having bounded variations in [~]. . . In Section III we shall give a method to compute p(e), through (15) or (16), overcoming the difficulties due to the fact that the pdf of X cannot be known explicitly unlessa direct enumeration of all possible sequences is performed. III. COMPU~ATION OF THE PROBABILITY OF ERROR Equation (15) or (16) shows that the problem to be solved is the averaging of [1 - D(dro - x)] with respect to the RV X whose pdf is unknown. In the case of Gaussian noise, several authors expanded the error probability into a series approximation in terms of the moments of X and of Hermite polynomials for multilevel signaling schemes [5], [7], or Hermite functions for binary signaling schemes [6], [8]. These techniques provide values of the probability of error more accurate than the Chernoff bound. Moreover the computation is much faster than using the exhaustive method. Unfortunately, the series expansions behave critically, that is, they provide oscillating results, when either the signal-to-noise ratio or the intersymbol interference increases. Following a different approach, we have computed the ~v erage in (15) or (16) by means of GQR's [9] -(11], which guarantee, in the area of approximate integration, the highest degree of precision. We are concerned with an integral of the form
$$\int_a^b f(x)\, w(x)\, dx, \tag{17}$$

where w(x) is a "weight" function (possibly unknown), f(x) an arbitrary function of some wide class, and [a, b] any finite or infinite interval. In order to compute numerically the integral (17), it is sufficient to assume that the weight function satisfies these conditions:

1) w(x) is nonnegative, integrable in [a, b], with

$$\int_a^b w(x)\, dx > 0; \tag{18}$$

2) the products $x^k w(x)$, for any nonnegative integer k, are such that the integrals

$$\int_a^b x^k w(x)\, dx \tag{19}$$

are definite and finite.

Though approximation theory makes good use of several wider classes of functions, we shall only consider the class of functions f(x) that have 2m continuous derivatives in the interval [a, b], that is $f(x) \in C^{2m}[a, b]$. The pdf w(x) in (14) satisfies the previously stated conditions 1) and 2), so we shall use in the following a representation of the integrals as in (15), though the Stieltjes integral notation could also be used [9, pp. 15-16] for a completely equivalent approach.

A. Gauss Quadrature Rules Approach

The most widely investigated method to approximate the integral (17) uses a linear combination of values of the function f(x), i.e.,

$$\int_a^b f(x)\, w(x)\, dx \approx \sum_{i=1}^{m} w_i f(x_i). \tag{20}$$

It is usually said that this quadrature rule has degree of precision n if it is exact whenever f(x) is a polynomial of degree $\le n$.

The $x_i$ are called the abscissas of the formula and the $w_i$ the coefficients or weights, so the set $\{w_i, x_i\}_{i=1}^{m}$ is called a quadrature rule corresponding to the weight function w(x). If w(x) satisfies conditions 1) and 2), previously stated, then m abscissas and weights can be found to make (20) exact for all polynomials of degree $n \le 2m - 1$; this is the highest degree of precision that can be obtained using m points. As such formulas were originally considered by Gauss, a quadrature rule like (20) is called a GQR if it is an exact equality whenever f(x) is a polynomial of degree 2m - 1 or lower. Beyond the advantages of accuracy, GQR's do not require an explicit knowledge of the weight function. In fact, as will be shown later on, a rule $\{w_i, x_i\}_{i=1}^{m}$ can be obtained from the moments of w(x), defined as

$$M_k = \int_a^b x^k w(x)\, dx, \qquad k = 0, 1, \ldots, 2m \tag{21}$$

and this fact is of great practical usefulness. Therefore, the accuracy of the moments is the key point of this computational approach.

A systematic introduction to the theory of Gauss quadrature formulas is given in Krylov [10]; in the following, we shall sketch the essential properties that are used in the construction of GQR's, with particular regard to the case of unknown weight functions. Let w(x) satisfy conditions 1) and 2). For w(x) it is possible to define a sequence of polynomials $\{P_n(x)\}$ that are orthonormal with respect to w(x) and in which $P_n(x)$ is of exact degree n so that

$$\int_a^b P_n(x) P_j(x)\, w(x)\, dx = \delta_{nj}. \tag{22}$$

The polynomial

$$P_n(x) = K_n \prod_{i=1}^{n} (x - x_i), \qquad K_n > 0$$

has n real roots $x_i$ located in the interval [a, b]. The roots of the orthogonal polynomials play an important role in GQR's. In fact, the following theorem holds [11, p. 35].

Theorem: Let $f(x) \in C^{2m}[a, b]$, then

$$\int_a^b f(x)\, w(x)\, dx = \sum_{i=1}^{m} w_i f(x_i) + \frac{f^{(2m)}(\xi)}{K_m^2\, (2m)!}, \qquad a < \xi < b \tag{23}$$

where

$$w_i = -\frac{K_{m+1}}{K_m} \cdot \frac{1}{P_{m+1}(x_i)\, P'_m(x_i)}, \qquad P'_m(x_i) = \left.\frac{dP_m(x)}{dx}\right|_{x = x_i}, \qquad i = 1, \ldots, m. \tag{24}$$

Moreover it can be proved that each set of orthonormal polynomials satisfies the following three-term recurrence relationship

$$x P_{n-1}(x) = \beta_{n-1} P_{n-2}(x) + \alpha_n P_{n-1}(x) + \beta_n P_n(x), \qquad n = 1, 2, \ldots, m, \tag{25}$$

where $P_{-1}(x) \equiv 0$ and $P_0(x) \equiv 1$. By observing that the remainder of (23) is zero for all polynomials of degree $\le 2m - 1$, we conclude that (23) is a GQR. Quadrature rules $\{w_i, x_i\}_{i=1}^{m}$ have several important convergence properties; in fact these formulas converge to the true value of the integral for almost any conceivable function that can be met in practice [9, p. 15], even if $f(x) \notin C^{2m}[a, b]$; namely

$$\lim_{m \to \infty} \sum_{i=1}^{m} w_i f(x_i) = \int_a^b f(x)\, w(x)\, dx. \tag{26}$$
Hopefully, the quadrature rule $\{w_i, x_i\}_{i=1}^{m}$ corresponding to the weight function w(x) is available in tabulated form, but more likely it is not; so the constructive aspect of the formulas becomes very important. Several algorithms have been proposed in order to compute $\{w_i, x_i\}_{i=1}^{m}$. The procedure generally recommended consists in generating the set of orthonormal polynomials associated with the weight function, thus obtaining the $x_i$ as the zeros of $P_m(x)$, and the $w_i$ starting from these orthonormal polynomials. An alternative approach, suggested by Golub and Welsch [12], performs the computation of $\{w_i, x_i\}_{i=1}^{m}$ using the first 2m + 1 moments of w(x). This approach seemed particularly suitable for practical purposes, because it requires no knowledge of w(x) itself. In the rest of the subsection we shall briefly outline this method; the development of the algorithm, the mathematical details and proofs are reported in [12]. Basically, the algorithm consists of two steps: 1) evaluation of the coefficients $\{\alpha_n\}$ and $\{\beta_n\}$ of the three-term recurrence relationship (25) by means of the first 2m + 1 moments of w(x) defined in (21) and 2) generation of a symmetric tridiagonal matrix, whose elements depend on the coefficients $\{\alpha_n\}$ and $\{\beta_n\}$. The abscissas $x_i$ are the eigenvalues of this matrix, whereas the weights $w_i$ are found from the first components of the corresponding normalized eigenvectors (as their squares, scaled by the zeroth moment $M_0$).
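The two steps just outlined can be sketched in a few lines. The following is a minimal illustration, not the routine used for the results in this paper: it assumes NumPy, uses the Cholesky factor of the Hankel moment matrix to obtain the recurrence coefficients (the route taken in [12]), and the function name `gauss_rule_from_moments` is hypothetical.

```python
import numpy as np

def gauss_rule_from_moments(moments):
    """Build an N-point Gauss rule from the moments M_0 ... M_2N of w(x).

    Sketch of the two-step procedure: (1) recurrence coefficients from the
    Cholesky factor of the Hankel moment matrix, (2) eigen-decomposition of
    the symmetric tridiagonal (Jacobi) matrix.
    """
    mu = np.asarray(moments, dtype=float)
    n = (len(mu) - 1) // 2                       # number of quadrature points
    hankel = np.array([[mu[i + j] for j in range(n + 1)] for i in range(n + 1)])
    r = np.linalg.cholesky(hankel).T             # upper-triangular factor, H = R^T R

    # Step 1: coefficients alpha_1..alpha_N and beta_1..beta_{N-1} of (25).
    alpha = np.empty(n)
    for j in range(n):
        alpha[j] = r[j, j + 1] / r[j, j]
        if j > 0:
            alpha[j] -= r[j - 1, j] / r[j - 1, j - 1]
    beta = np.array([r[j + 1, j + 1] / r[j, j] for j in range(n - 1)])

    # Step 2: abscissas are the eigenvalues of the Jacobi matrix; weights are
    # M_0 times the squared first components of the normalized eigenvectors.
    jacobi = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    nodes, vectors = np.linalg.eigh(jacobi)
    weights = mu[0] * vectors[0, :] ** 2
    return nodes, weights

# Moments of a zero-mean, unit-variance Gaussian density: 1, 0, 1, 0, 3, 0, 15.
nodes, weights = gauss_rule_from_moments([1, 0, 1, 0, 3, 0, 15])
```

With these seven moments the sketch yields a three-point rule that integrates polynomials up to degree five exactly against the Gaussian density. The Hankel matrix becomes ill conditioned as N grows, which is one reason the accuracy of the moments matters so much.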
B. Application to the Evaluation of the Error Probability

The previously described procedure fits well into our problem, if we identify the weight function of (17) with the pdf w(x) of the RV X. In fact the average error probability (15) can be computed as

$$P(e) \approx 2(1 - P_{L/2}) \sum_{i=1}^{N} w_i [1 - D(d r_0 - x_i)], \tag{27}$$

$\{w_i, x_i\}_{i=1}^{N}$ being an N-point GQR. A sequence of formulas similar to (27) does converge to the exact value of P(e), when $N \to \infty$, except for the roundoff errors. For the particular case of Gaussian noise with variance $\sigma_n^2$, (27) becomes

$$P(e) \approx (1 - P_{L/2}) \sum_{i=1}^{N} w_i\, \mathrm{erfc}\left(\frac{d r_0 - x_i}{\sqrt{2}\, \sigma_n}\right). \tag{28}$$

We must emphasize that this technique is independent of the noise statistics, whereas the methods known up to the present [4]-[8] are strictly based on the "Gaussianity" of the noise process. So the proposed approach seems to be very powerful, even in the presence of other types of disturbances, e.g., phase jitter [13] or channels with fading, and when dealing with different systems of data transmission, for which the conditional error probability is easily written but cannot be expanded in series as in [4]-[8]. Moreover the method just described for multilevel PAM transmission systems applies as well in the case of PRC signaling schemes [14], [15]. Some caution must be used in the interpretation of (12), as was explained in [7] for the series expansion method, namely: 1) the term $P_{L/2}$ is now the probability of the outermost level that the signal can have before the detector; for instance, in the case of class-4 PRC systems, we have $P_{L/2} = 1/L^2$, if the source symbols are equally likely and 2) the vector r of the interfering impulse response samples $\{r_i\}_{i=-p,\, i \ne 0}^{q}$ must be modified according to the class of PRC, because we must take into account only the undesired intersymbol interference. In Table I we give the expressions of $P_{L/2}$ and r for all the classes of PRC when the source symbols are equally likely. With the aid of this table one can evaluate the error probability for both PAM and PRC systems by means of the same computational procedure.

C. Evaluation of the Moments

In the preceding section we showed how the error probability can be evaluated by an N-point GQR if the first 2N + 1 moments of the RV X, which represents the intersymbol interference, are known. Then our aim is the accurate evaluation of the moments (29) without resorting to the trivial direct enumeration, which requires too much computer time. We shall describe two methods that allow a recurrent evaluation of the moments. The first, described in detail in [7], leads to the following expression

$$M_{2k} = \sum_{j=1}^{k} \binom{2k-1}{2j-1} (-1)^j M_{2(k-j)}\, f^{(2j-1)}(0), \tag{30}$$

where $M_0 = 1$, k = 1, 2, ..., N and

$$f^{(2j-1)}(0) = (2j-1)! \sum_{i=-p}^{q}{}' \; C_{2j-1}^{(i)}, \tag{31}$$

where

$$C_{2j-1}^{(i)} = \frac{(-1)^j (d r_i)^{2j}}{(2j-1)!} \sum_{n=1}^{L/2} 2 p_n (2n-1)^{2j} - \sum_{k=1}^{j-1} C_{2j-2k-1}^{(i)}\, \frac{(-1)^k (d r_i)^{2k}}{(2k)!} \sum_{m=1}^{L/2} 2 p_m (2m-1)^{2k}. \tag{32}$$

Since the source symbols $a_h$ are independently distributed RV's with zero mean, the odd order moments are all equal to zero. When the source symbols are equally likely, (30)-(32) reduce easily to the following formula [5]

$$M_{2k} = \sum_{j=1}^{k} \binom{2k-1}{2j-1} M_{2(k-j)}\, \frac{2^{2j} (L^{2j} - 1)}{2j}\, B_{2j} \sum_{h=-p}^{q}{}' \; (d r_h)^{2j}, \tag{33}$$

where $B_{2j}$ are the Bernoulli numbers.
Fifty Years of Communications and Networking TABLE I Slpal
-1
ClASS 1 pac
LZ 1
CLASS U PRC
CLASS
·L(ZL -
Ii
1
m pac
1:i2:L=rr
C'LASS IV PRC
CLASS V
IDter8ymbol vector .!
PL/1.
7
1
1 L(ZL-l)
pac
PAM
L
j
·.1
!.
{".p.·... ".I."l - "0' "1' ....
!.
f".p....·".I· top. "I - "0/.1, .... "'I.}
.!:. {".p" ·.. "-1 •
.!.
f·.p'...
Z"o' "1 + "0' ....
".1' "1' r z + "0'
.I~ "of.'
.!:. {".p....."
-...
{" -.... p,". l'
"1 +
....
"'I}
"'I}
"of.' ....
"'I}
"1' .... ".• }
Intersymbol vector according to the coded signal class of partial response.
The second method was proposed by Prabhu [16]. Let us number from 1 to M the M interfering samples that are significantly different from zero, so that

$$X = \sum_{h=1}^{M} X_h. \tag{34}$$

If we let

$$Y_h = \sum_{i=1}^{h} X_i, \tag{35}$$

the even order moments of the RV X are given by

$$E[Y_M^{2k}] = \sum_{j=0}^{k} \binom{2k}{2j} E[Y_{M-1}^{2j}]\, E[X_M^{2k-2j}], \qquad k = 0, 1, \ldots, N \tag{36}$$

because of the statistical independence of $Y_{M-1}$ and $X_M$. Equation (36) allows us to evaluate, by recurrence, the moments of X through the knowledge of the moments

$$E[X_h^{2k}], \qquad h = 1, 2, \ldots, M, \quad k = 0, 1, \ldots, N \tag{37}$$

and these are given by [see (3)]

$$E[X_h^{2k}] = (d r_h)^{2k} \sum_{i=-L/2}^{L/2}{}' \; p_i\, [2i - \mathrm{sgn}(i)]^{2k}. \tag{38}$$

IV. ESTIMATION OF COMPUTATIONAL ERRORS

In the numerical evaluation of the probability of error using (27) three different types of error occur, two of them due to truncation and one to roundoff. In this section we give an upper bound to the error depending on truncation of the quadrature rule, and briefly discuss the other two types of errors.

A. Truncation Errors

Two different truncation errors would have to be taken into account: the first is the error due to the truncation of the impulse response and the second is the error involved in the quadrature formula (27). An analysis of the first error is given in [8]. Moreover it is possible to determine how many samples significantly contribute to the error probability within a preassigned amount by a preliminary computer run. Another way is to increase the number of the interfering samples until P(e) stops changing. The second truncation error is analyzed in the Appendix for the case of Gaussian noise. Therefore the error probability can be evaluated by taking a finite number of terms N

$$P(e) = (1 - P_{L/2}) \sum_{i=1}^{N} w_i\, \mathrm{erfc}\left(\frac{d r_0 - x_i}{\sqrt{2}\, \sigma_n}\right) + R_N, \tag{39}$$

where $R_N$ represents the truncation error and is upper bounded by

$$|R_N| < \begin{cases} A \exp\left[-\dfrac{(d|r_0| - \max\{\xi\})^2}{4\sigma_n^2}\right], & \max\{\xi\} < d|r_0| \\[2ex] A, & \max\{\xi\} \ge d|r_0| \end{cases} \qquad \xi \in \mathcal{X} \tag{40}$$

where

$$A = \frac{B\, [(2N-1)!]^{1/2}\, 2^{N+1/2}\, (1 - P_{L/2})}{\pi^{1/2}\, (2N)!\, K_N^2\, (2\sigma_n^2)^N}, \qquad B \approx 1.086435.$$

It must be noticed that this bound is useful only if it is sufficiently tight, which in some cases does not happen. In these cases, however, there is a very effective way, usually followed in numerical computation, to judge the accuracy of the obtained result. In fact, since convergence to the true value of P(e) is assured, it is sufficient to check how many significant digits remain unchanged as N increases and to continue the iteration until the desired accuracy is reached. In all cases we tried, no oscillation was observed and the convergence was obtained, at least in the first three significant digits. The bound to the truncation error in the series expansion method sometimes suffers the same drawback, but in that case the result can oscillate because of roundoff errors, when the convergence to the true value is not reached with a low number of terms; thus any reasonable stopping criterion is prevented. Examples of the truncation error bound and of the other way to check the accuracy will be given in Section V.
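The recurrence of Prabhu's method, in which the moments of the partial sums are built up one interfering sample at a time, is easy to prototype. The sketch below is only illustrative: it assumes equally likely symbols with levels ±1, ±3, ..., ±(L - 1), and the function names are hypothetical. A brute-force enumeration is included solely as a cross-check for short responses.

```python
from itertools import product
from math import comb

def isi_even_moments(samples, levels, n_max, d=1.0):
    """Even moments [M_0, M_2, ..., M_2n] of X = sum_h a_h * d * r_h.

    Each pass of the loop adds one interfering sample r_h, combining the
    moments of the running sum with those of the new independent term.
    """
    amplitudes = range(1, levels, 2)             # positive levels 1, 3, ..., L-1
    # E[a^{2k}] for one symbol; equally likely levels is an assumption here.
    symbol = [2.0 / levels * sum(a ** (2 * k) for a in amplitudes)
              for k in range(n_max + 1)]
    partial = [1.0] + [0.0] * n_max              # moments of the empty sum
    for r in samples:
        term = [symbol[k] * (d * r) ** (2 * k) for k in range(n_max + 1)]
        partial = [sum(comb(2 * k, 2 * j) * partial[j] * term[k - j]
                       for j in range(k + 1))
                   for k in range(n_max + 1)]
    return partial

def isi_even_moments_exhaustive(samples, levels, n_max, d=1.0):
    """Same moments by direct enumeration of all symbol sequences."""
    amplitudes = range(-(levels - 1), levels, 2)
    values = [d * sum(a * r for a, r in zip(seq, samples))
              for seq in product(amplitudes, repeat=len(samples))]
    return [sum(v ** (2 * k) for v in values) / len(values)
            for k in range(n_max + 1)]

moments = isi_even_moments([0.2, -0.1, 0.05], levels=4, n_max=3)
```

Every term in the inner sum is nonnegative, so the recurrence involves no cancellation. Its cost grows linearly with the number of samples, whereas the enumeration grows as L to the power M.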
B. Roundoff Error

In the evaluation of the right-hand side of (27) the limited accuracy of the digital computer also introduces a roundoff error. This error is usually negligible in Gauss quadrature formulas of low order, especially with regard to the evaluation of the probability of error, whose accuracy is satisfactory when 2 or 3 digits are correctly known. On the other hand the high speed of present computers allows one to use high-order formulas, thus making the roundoff error significant. In order to reduce this drawback, one may resort to double precision computation, evaluating integer quantities in integer arithmetic, as far as possible. The knowledge of the moments with high precision is an important condition to further improve the accuracy in the computation of $w_i$ and $x_i$. To this purpose, the computational method proposed by Prabhu seems to be quite satisfactory, since relationship (36) involves the summation of positive quantities only. A mathematical discussion of the roundoff error effects is given in [17]; it is shown that, roughly speaking, the lower the quantity $\min_i \{w_i\}$, the more sensitive the GQR is to the loss of accuracy with increasing N. So the values of the coefficients give a significant insight into the overall precision. Summarizing, the main limiting factor is due to the values of the moments that are affected by roundoff errors because they are computed by recurrence formulas. Due to this fact, the coefficients $\{\alpha_i\}$ and $\{\beta_i\}$ of (25) may fail to be computed from the moments, when N becomes large. Nevertheless our algorithm, though applied in many cases different from those analyzed in Section V [13], [18], [19], never failed. In fact, GQR's always converged for N ranging between 6 and 15.

It may be noticed that the roundoff error does not occur only in our method of computing P(e). In fact, the same arguments apply to any series expansion method, where each term of the approximation is obtained by multiplying a value of a Hermite polynomial (or function) obtained recurrently by a term (such as a moment) that is computed iteratively as well. Moreover the terms in the series expansion may be very large and have opposite signs, whereas the approximation (28) uses only positive and very regular quantities. When particular attention should be devoted to the roundoff error in special applications, the proposed procedure could be modified in two ways: 1) the first lies in the field of numerical differentiation of analytic functions [20], basing the kth moment on the corresponding derivative of the characteristic function of the RV X; 2) the second one consists in applying a recent procedure [21], which allows one to construct GQR's starting from "modified moments," such as

$$m_k = \int_a^b p_k(x)\, w(x)\, dx,$$

where $p_k(x)$ belongs to a set of orthogonal polynomials suitably chosen. Extensions of the proposed method to evaluate P(e) in these ways seem possible; however such possibilities have not yet been pursued.

Fig. 1. Scheme of the routine to compute average error probability P(e).

V. EXAMPLES AND COMPARISONS

The examples reported hereafter have been chosen to show the differences between the method of GQR's and the series expansion rules [4], [5], [7]. When it was possible, the exhaustive method was also used as a basis of comparison. The noise process was supposed to be Gaussian. The computational procedure requires the following steps: 1) evaluation of the first 2N + 1 moments $\{M_k\}_{k=0}^{2N}$ using (30)-(33) or (36) and (38) starting from the vector r of the interfering samples; 2) determination of the three-term recurrence relationship through the coefficients $\{\alpha_i\}_{i=1}^{N}$ and $\{\beta_i\}_{i=1}^{N}$ that appear in (25); 3) generation of the weights $\{w_i\}_{i=1}^{N}$ and abscissas $\{x_i\}_{i=1}^{N}$ of the GQR by means of the numerical algorithm described in Section III-A; and 4) evaluation of the average error probability through (28). A block diagram of the procedure outlined above is drawn in Fig. 1.
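The four steps can be strung together in a short script. What follows is a rough sketch under stated assumptions rather than the routine of Fig. 1: it assumes multilevel PAM with equally likely symbols (so that $P_{L/2} = 1/L$), Gaussian noise, NumPy, and hypothetical function names. The exhaustive average is included only as a reference for small examples.

```python
import numpy as np
from itertools import product
from math import comb, erfc, sqrt

def isi_moments(samples, levels, n, d=1.0):
    """Step 1: moments M_0..M_2N of the ISI, built one sample at a time."""
    amps = range(1, levels, 2)
    sym = [2.0 / levels * sum(a ** (2 * k) for a in amps) for k in range(n + 1)]
    even = [1.0] + [0.0] * n
    for r in samples:
        x = [sym[k] * (d * r) ** (2 * k) for k in range(n + 1)]
        even = [sum(comb(2 * k, 2 * j) * even[j] * x[k - j] for j in range(k + 1))
                for k in range(n + 1)]
    mu = np.zeros(2 * n + 1)
    mu[0::2] = even                     # odd moments vanish by symmetry
    return mu

def gauss_rule(mu):
    """Steps 2 and 3: recurrence coefficients, then abscissas and weights."""
    n = (len(mu) - 1) // 2
    hankel = np.array([[mu[i + j] for j in range(n + 1)] for i in range(n + 1)])
    r = np.linalg.cholesky(hankel).T
    alpha = np.array([r[j, j + 1] / r[j, j]
                      - (r[j - 1, j] / r[j - 1, j - 1] if j else 0.0)
                      for j in range(n)])
    beta = np.array([r[j + 1, j + 1] / r[j, j] for j in range(n - 1)])
    jacobi = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    x, v = np.linalg.eigh(jacobi)
    return x, mu[0] * v[0] ** 2

def pe_gqr(r0, samples, levels, sigma, n, d=1.0):
    """Step 4: weighted sum of erfc terms over the N quadrature points."""
    x, w = gauss_rule(isi_moments(samples, levels, n, d))
    p_outer = 1.0 / levels              # assumed P_{L/2} for plain PAM
    return (1 - p_outer) * sum(wi * erfc((d * r0 - xi) / (sqrt(2) * sigma))
                               for xi, wi in zip(x, w))

def pe_exhaustive(r0, samples, levels, sigma, d=1.0):
    """Reference value: average over every interfering symbol sequence."""
    amps = range(-(levels - 1), levels, 2)
    vals = [erfc((d * r0 - d * sum(a * r for a, r in zip(seq, samples)))
                 / (sqrt(2) * sigma))
            for seq in product(amps, repeat=len(samples))]
    return (1 - 1.0 / levels) * sum(vals) / len(vals)

isi = [0.10, -0.05, 0.05, -0.02, 0.03, 0.01]     # illustrative samples only
print(pe_gqr(1.0, isi, 2, 0.3, 5), pe_exhaustive(1.0, isi, 2, 0.3))
```

Only the last function depends on the noise level, so the rule can be computed once and reused when P(e) is needed for many signal-to-noise ratios.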
Fig. 2. Error probability and upper bound to truncation error versus number N of points of the quadrature formula. 11-pulse truncation approximation. Ideal band-limited signal of (41). Sampling time deviation 0.05T. Exhaustive-method probability of error is also reported.

Fig. 3. Comparison of error probabilities versus number N of quadrature points or number N_t of terms of the series expansion. 11-pulse truncation approximation of ideal band-limited signal. Sampling time deviation 0.2T. Exhaustive-method error probability is also reported.

A. Ideal Band-Limited Pulse

We consider the case of binary PAM transmission when the received pulse is assumed to have the form

$$r(t) = \frac{\sin(\pi t / T)}{\pi t / T}. \tag{41}$$

For a truncated 11-pulse train approximation the exhaustive-method error probability, the error probability computed with our method and the bound (40) to the truncation error are shown in Fig. 2 as a function of the number N of points of the quadrature formula. The sampling time deviation from the nominal value is 0.05T and the signal-to-noise ratio is taken to be 16 dB. In this subsection and in the following, the number of interfering samples has been chosen small enough to allow the exhaustive method to compute the exact probability of error, for the sake of comparison. In Fig. 3 the exact error probability and the error probability evaluated either with the series expansion method or with GQR are reported for a sampling time deviation of 0.2T. N_t is the number of terms of the series expansion and N is the order of the GQR according to (28). The curve giving the results of the series expansion method ends with the eighth term because the successive approximation of nine terms gives a negative value for P(e). In order to check the accuracy and the convergence, the numerical values of P(e) are reported in Table II.

TABLE II

Exhaustive method: P(e) = 2.7614 (-3)

 N    Quadrature rule P(e)    N_t   Series expansion P(e)
 2    3.77      (-5)           2    4.20  (-6)
 3    8.86      (-4)           3    5.98  (-5)
 4    2.4       (-3)           4    4.71  (-4)
 5    2.[?]     (-3)           5    2.14  (-3)
 6    2.8       (-3)           6    5.46  (-3)
 7    2.74      (-3)           7    6.56  (-3)
 8    2.75      (-3)           8    5.06  (-4)
 9    2.766     (-3)           9    -
10    2.7617    (-3)          10    -
11    2.7615    (-3)          11    -
12    2.76164   (-3)          12    -
13    2.761639  (-3)          13    -
14    2.761636  (-3)          14    -

Numerical values of the probabilities of error obtained with the Gauss quadrature rule, the series expansion, and the exhaustive methods. 11-pulse truncation approximation of the ideal band-limited signal. Sampling time deviation 0.2T.
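For readers who wish to reproduce the setting of this example, the sampled pulse of (41) with a timing offset can be generated as follows. This is a small sketch under assumptions: time is measured in units of T, the "11-pulse" train is taken to mean the main sample plus five interfering samples on each side, and the function name is hypothetical.

```python
import numpy as np

def sinc_samples(n_pulses=11, offset=0.05):
    """Samples of r(t) = sin(pi t/T)/(pi t/T) at t = (h + offset) T.

    Returns the main sample r_0 and the array of interfering samples r_h,
    h != 0. The symmetric window around h = 0 is an assumption.
    """
    half = (n_pulses - 1) // 2
    h = np.arange(-half, half + 1)
    r = np.sinc(h + offset)            # np.sinc(x) = sin(pi x) / (pi x)
    return r[half], np.delete(r, half)

r0, isi = sinc_samples(11, 0.05)       # offset 0.05T as in Fig. 2
```

The arrays returned here can serve as the main sample and the vector r of interfering samples in the moment evaluation of Section III-C.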
B. A 4-Level PRC System

In Fig. 4 the results of the computation of P(e) in a class-4 PRC system with four levels by both exhaustive and GQR methods are presented for a truncated 9-pulse train approximation of the following impulse response:

r(/) = sin (27ft) cos (21ry/) _ sin (27f6t) cos (2m'lt) 211"1 [1 - (4'Yt)2] 27ft [1 - (45t)2] , (42)

where the parameters γ and δ have the meaning described in the same figure. The signal-to-noise ratio is chosen equal to 19 dB. The curves are very significant, and show that the GQR method behaves very well; in fact the convergence to the exact value is fast and accurate (5 correct significant digits). The exhaustive method, used to evaluate the exact value of P(e), required about 2 h of computer time, while the GQR method needed just a few seconds. In this case we found that the series expansion method [5], [7] behaved critically, giving very inaccurate results.

Fig. 4. Error probability versus number N of quadrature points. 9-pulse truncation approximation of PRC signal of (42). Exhaustive-method error probability is also reported.

C. An Equalized Channel

The method described in this paper has been used extensively to evaluate the performance of multilevel class-4 PRC data transmission systems with a 16-tap zero forcing automatic equalizer. In this case, it was necessary to consider a truncated impulse response of 60 samples, so the exhaustive method was inapplicable because of the required computer time. Thus, we restricted our comparisons to the series approximation and GQR methods. In Fig. 5 the case of an 8-level system is reported, having a signal-to-noise ratio equal to 19 dB. The curves show that the GQR method converges to a final value without oscillations, while the series approximation does not, and it gives a negative value for P(e) at the 25th iteration. The channel we are considering presents an amplitude characteristic similar to that drawn in the same figure, and a parabolic group delay. This kind of channel model is described in detail in [18]. In Fig. 6 an example of the curves of P(e) versus the signal-to-noise ratio is presented, in the case of 4- and 8-level class-4 PRC systems. It is interesting to observe that the curves of P(e) given by these two methods coincide in the case of L = 4. On the contrary, with L = 8, the values of P(e) are almost the same only for small values of the signal-to-noise ratio, while they diverge as the signal-to-noise ratio increases. These results agree with the statements reported in the Introduction. It must be noticed also that a remarkable saving of computer time may be achieved using the GQR method, when the error probability has to be computed for many values of the signal-to-noise ratio (as in this case). In fact the first three steps in the computational procedure are the same for all the values of the signal-to-noise ratio and therefore are performed only once.

Fig. 5. Comparison of error probabilities versus number N of quadrature points and of terms N_t of series expansion. 8-level source symbols are class-4 PRC and passed through the channel of the figure.

Fig. 6. Comparison of error probabilities versus signal-to-noise ratio, with number of levels as a parameter. Channel is the same as Fig. 4.

VI. CONCLUSIONS

A new method has been presented to compute the average probability of error for multilevel PAM and PRC data transmission systems with intersymbol interference and additive noise. The method is based upon nonclassical GQR's and suffers no limitation on noise statistics. After the explanation of the computational technique and the extension to the PRC case, some examples of application have been given either for binary PAM or multilevel PRC signaling schemes. The results were quite satisfactory even when the other known methods were found to fail, either for the required computer time (exhaustive method) or because the obtained results were oscillating (series expansion method). The GQR algorithm applied to the computation of P(e) has been proved to be rapidly convergent in all practical cases that were considered. Finally it must be noticed that the proposed algorithm can be applied iteratively to construct GQR's. So the integration with respect to other random variables, whose moments are known, may be performed. This technique has been used, for instance, to compute the error probability in the presence of a random phase jitter [13] where a further integration over the corresponding random variable had to be carried out. Moreover the method does not apply only to baseband pulse transmission, but it can be extended to deal with modulation systems, such as the N-phase PSK. Many of these problems are being investigated and satisfactory results are hopefully expected.

ACKNOWLEDGMENT

The authors gratefully acknowledge the help of L. Lambarelli in organizing the computer work and results.

APPENDIX

In this Appendix, we shall find an upper bound to the error that occurs in the truncation of GQR's, when the noise process is Gaussian. Rewriting (39) according to (23), we have

$$P(e) = (1 - P_{L/2}) \sum_{i=1}^{N} w_i\, \mathrm{erfc}\left(\frac{d r_0 - x_i}{\sqrt{2}\, \sigma_n}\right) + \frac{(1 - P_{L/2})}{(2N)!\, K_N^2} \left. \frac{d^{2N}}{dx^{2N}}\, \mathrm{erfc}\left(\frac{d r_0 - x}{\sqrt{2}\, \sigma_n}\right) \right|_{x = \xi} \tag{43}$$

where $K_N$ is shown to be

$$K_N = \prod_{i=1}^{N} (\beta_i)^{-1} \tag{44}$$

and is easily computed during step 2) of the computational procedure described in Section V. The error involved in truncating the quadrature rule to the first N terms is thus

$$R_N(\xi) = \frac{(1 - P_{L/2})}{(2N)!\, K_N^2} \left. \frac{d^{2N}}{dx^{2N}}\, \mathrm{erfc}\left(\frac{d r_0 - x}{\sqrt{2}\, \sigma_n}\right) \right|_{x = \xi}. \tag{45}$$

But [22, p. 298]

$$\mathrm{erfc}^{(n)}(z) = (-1)^{n}\, 2 \pi^{-1/2}\, H_{n-1}(z) \exp(-z^2), \tag{46}$$

where $H_{n-1}(\cdot)$ is the Hermite polynomial of order n - 1. Therefore

$$|R_N(\xi)| = \frac{2 (1 - P_{L/2})}{\pi^{1/2}\, (2N)!\, K_N^2\, (\sqrt{2}\, \sigma_n)^{2N}} \left| H_{2N-1}\left(\frac{d r_0 - \xi}{\sqrt{2}\, \sigma_n}\right) \right| \exp\left[-\frac{(d r_0 - \xi)^2}{2 \sigma_n^2}\right]. \tag{47}$$

We know that [22, p. 787]

$$|H_n(z)| < B \exp(z^2/2)\, 2^{n/2}\, (n!)^{1/2}, \qquad B \approx 1.086435. \tag{48}$$

Hence, substituting (48) in (47) and letting n = 2N - 1, one obtains

$$|R_N(\xi)| < A \exp\left[-\frac{(d r_0 - \xi)^2}{4 \sigma_n^2}\right], \tag{49}$$

where the coefficient A is given by

$$A = \frac{B\, [(2N-1)!]^{1/2}\, 2^{N+1/2}\, (1 - P_{L/2})}{\pi^{1/2}\, (2N)!\, K_N^2\, (2 \sigma_n^2)^N} \tag{50}$$

and does not depend on ξ. If we choose ξ in order to maximize the right-hand side of (49), we have

$$|R_N| < A \max\left\{ \exp\left[-\frac{(d r_0 - \xi)^2}{4 \sigma_n^2}\right] \right\} \tag{51}$$

with

$$\max\left\{ \exp\left[-\frac{(d r_0 - \xi)^2}{4 \sigma_n^2}\right] \right\} = \begin{cases} \exp\left[-\dfrac{(d|r_0| - \max\{\xi\})^2}{4 \sigma_n^2}\right], & \max\{\xi\} < d|r_0| \\[2ex] 1, & \max\{\xi\} \ge d|r_0| \end{cases} \tag{52}$$

and

$$\max\{\xi\} = \sum_{h=-p}^{q}{}' \; (L - 1)\, d\, |r_h|. \tag{53}$$

Equation (51) with (50), (52), and (53) gives the desired bound to the truncation error.

REFERENCES

[1] R. W. Lucky, J. Salz, and E. J. Weldon, Principles of Data Communication. New York: McGraw-Hill, 1968, p. 44.
[2] B. R. Saltzberg, "Intersymbol interference error bounds with application to ideal bandlimited signaling," IEEE Trans. Inform. Theory, vol. IT-14, pp. 563-568, July 1968.
[3] R. Lugannani, "Intersymbol interference and probability of error in digital systems," IEEE Trans. Inform. Theory, vol. IT-15, pp. 682-688, Nov. 1969.
[4] E. Y. Ho and Y. S. Yeh, "A new approach for evaluating the error probability in the presence of intersymbol interference and additive Gaussian noise," Bell Syst. Tech. J., vol. 49, pp. 2249-2265, Nov. 1970.
[5] E. Y. Ho and Y. S. Yeh, "Error probability of a multilevel digital system with intersymbol interference and Gaussian noise," Bell Syst. Tech. J., vol. 50, pp. 1017-1023, Mar. 1971.
[6] M. I. Celebiler and O. Shimbo, "The probability of error due to intersymbol interference and Gaussian noise in digital communication systems," COMSAT, Tech. Memo, May 5, 1970.
[7] S. Benedetto, E. Biglieri, and R. Dogliotti, "Probabilita di errore per trasmissione numerica a piu livelli e codificazione lineare," Alta Freq., vol. 40, pp. 725-732, Sept. 1971.
[8] O. Shimbo and M. I. Celebiler, "The probability of error due to intersymbol interference and Gaussian noise in digital communication systems," IEEE Trans. Commun. Technol., vol. COM-19, pp. 113-119, Apr. 1971.
[9] A. H. Stroud and D. Secrest, Gaussian Quadrature Formulas. Englewood Cliffs, N.J.: Prentice-Hall, 1966.
[10] V. I. Krylov, Approximate Calculation of Integrals. New York: Macmillan, 1962.
[11] P. J. Davis and P. Rabinowitz, Numerical Integration. Waltham, Mass.: Blaisdell, 1967.
[12] G. H. Golub and J. H. Welsch, "Calculation of Gauss quadrature rules," Math. Comput., vol. 23, pp. 221-230, Apr. 1969.
[13] S. Benedetto, G. De Vincentiis, and A. Luvison, "The effect of phase jitter on the performances of automatic equalizers," in Conf. Rec. 1972 IEEE Int. Conf. Communications, Philadelphia, Pa., June 1972.
[14] A. Lender, "The duobinary technique for high-speed data transmission," IEEE Trans. Commun. Electron., vol. 82, pp. 214-218, May 1963.
[15] E. R. Kretzmer, "Binary data communication by partial response transmission," in IEEE Conf. Rec. Ann. Communication Conv., pp. 451-455, Boulder, Colo., June 1965.
[16] V. K. Prabhu, "Some considerations of error bounds in digital systems," Bell Syst. Tech. J., vol. 50, pp. 3127-3151, Dec. 1971.
[17] W. Gautschi, "Construction of Gauss-Christoffel quadrature formulas," Math. Comput., vol. 22, pp. 251-270, Apr. 1968.
[18] S. Benedetto, V. Castellani, C. Cianci, and U. Mazzei, "On the efficient bandwidth utilization in digital transmission," Advis. Group Aerosp. Res. Dev. (NATO), 23rd Meeting Avionics Panel, London, England, May 1972.
[19] S. Benedetto, V. Castellani, and G. De Vincentiis, "Error probability in the presence of intersymbol interference and additive noise for correlated digital signals," to be published.
[20] J. N. Lyness, "Differentiation formulas for analytic functions," Math. Comput., vol. 22, pp. 352-362, Apr. 1968.
[21] W. Gautschi, "On the construction of Gaussian quadrature rules from modified moments," Math. Comput., vol. 24, pp. 245-260, Apr. 1970.
[22] M. Abramowitz and I. A. Stegun, Eds., Handbook of Mathematical Functions. Washington, D.C.: Nat. Bur. Stand., 1967.
Sergio Benedetto was born in Turin, Italy, on January 18, 1945. He received the Dr.Ing. degree in electronic engineering from the Politecnico di Torino, Turin, Italy, where he is Associate Professor in Electrical Communications. His current interests involve statistical communication theory and data communication systems.

Girolamo De Vincentiis was born in Taranto, Italy, on June 26, 1946. He received the Dr.Ing. degree in electronic engineering from the Politecnico di Torino, Turin, Italy, on December 23, 1969. He joined the Centro Studi e Laboratori Telecomunicazioni (CSELT), Turin, Italy, in 1970 and is currently a Research Engineer in the Scientific Section of the Scientific Secretariat. At CSELT he worked in the field of digital filtering and data transmission. Other areas of work include nonlinear coding theory and computer-aided design of digital circuits.

Angelo Luvison was born in Turin, Italy, on November 30, 19~4. He received the Dr.Ing. degree in electronic engineering from the Politecnico di Torino, Turin, Italy, on January 30, 1969. In 1969 he joined the Communication and Information Theory Section in the Transmission Department of the Centro Studi e Laboratori Telecomunicazioni (CSELT), Turin, Italy. He is a Research Engineer in the Scientific Section of the Scientific Secretariat of CSELT. He has been engaged in PCM transmission on radio links, and specifically in computerized system analysis and optimization. He has also worked on a variety of problems of high-speed data transmission, digital signal processing, and numerical analysis. Mr. Luvison is a member of the Associazione Elettrotecnica ed Elettronica Italiana (AEI).
Coherent Demodulation of Frequency-Shift Keying with Low Deviation Ratio RUDldeBUDA Member, IEEE
AbstrQct-A coherent binary F8K modulation system is discussed, that 'has the following properties: 1) it is phase coherent; 2) it has a low deviation ratio, h = !; 3) it occupies a small RF bandwidth, typically only 0.75 times the bit rate, without need for intersymbol interference correction; 4) it uses as receiver a self-synchronizing circuit and. a phase detector, which together achieve optimal decisions; and 5) its error performance is about 3 dB better than that of conventional FSK.
I. INTRODUCTION
This paper describes a scheme to transmit binary signals over an RF link. Usually such digital signals are encoded either into frequency-shift keying (FSK) or phase-shift keying (PSK). While FSK has the advantage of simpler circuitry, it is generally conceded that PSK has the advantage of better utilization of bandwidth and signal-to-noise ratio. It will, however, be shown in this paper that the coherent FSK with the deviation ratio 0.5 has an optimal detector, which is also simple to construct. Using this detector, one can receive faster pulse trains than with any other binary FSK or PSK of equal bandwidth and signal-to-noise ratio.
II. PHASE COHERENCE
We consider a phase-coherent binary frequency-shift-keyed (FSK) signal and call a pulse at frequency $f_1$ a "mark" and at $f_2 > f_1$ a "space." Each pulse has duration $T$. If the signal is narrow band, e.g., at IF or RF, then we may describe it by its preenvelope [1] $u(t)$:
$$u(t) = \exp j\{\pi(f_1 + f_2)t + \phi(t)\}, \tag{1}$$
where the phase function $\phi(t)$ is continuous if the FSK is coherent. Comparison of (1) with the preenvelope of a space,
$$u_{\text{space}} = \exp j[2\pi f_2 t + \phi(0)], \qquad 0 \le t \le T,$$
gives
$$\phi(T) - \phi(0) = \pi(f_2 - f_1)T. \tag{2}$$
Paper approved by the Communications Theory Committee of the IEEE Communications Society after presentation at the 1971 International Conference on Communications, Montreal, Que., Canada, June 14-16, 1971. Manuscript received July 21, 1971; revised December 10, 1971. The author is with the Canadian General Electric Company, Ltd., Toronto, Ont., and McMaster University, Hamilton, Ont., Canada.
Reprinted from IEEE Transactions on Communications, June 1972.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
For a mark, the phase decreases by this same amount. This suggests introducing, by definition, the dimensionless parameter $h$:
$$h = (f_2 - f_1)T, \tag{3}$$
which is called modulation index [2] or frequency deviation ratio. A space increases the phase by $\pi h$ and a mark reduces it by $\pi h$. After $s$ spaces and $m$ marks, i.e., at time $(s + m)T$, we have then the phase shift [3, fig. 1]
$$\phi\{(s + m)T\} - \phi(0) = (s - m)h\pi, \tag{4}$$
which is an even (odd) multiple of $\pi h$ when $(s + m)$ is even (odd). The phase at all other times is obtained by linear interpolation. The possible values of $\phi(t)$ are shown in Fig. 1. Since the phase ...

Fig. 1. Possible values $\phi(t) - \phi(0)$. (Vertical axis marked at $2\pi h$, $\pi h$, $0$, $-\pi h$, $-2\pi h$.)

From here on we will restrict our attention to coherent FSK with small modulation index $h$ and in particular to the three cases
$$h = 1.0, \qquad h = 0.71, \qquad h = 0.5.$$
The FSK with $h = 1.0$ has been described in detail by Sunde [6]; it has not found wide acceptance. It uses signal power inefficiently because it transmits some power at the spectral lines $f_1$ and $f_2$ [7]. More on this topic will be found in Section VII.

The case $h = 0.71$ has been claimed to be optimum [8], [9] under certain conditions, which will be discussed later. This system has been extensively studied by Tjhung, who shows in particular that, in a very narrow band, coherent FSK with $h = 0.7$ is better than phase-shift keying (PSK), because the FSK has lower intersymbol interference [10].

The system that will occupy most of this report is the coherent FSK with deviation ratio of exactly $h = \tfrac{1}{2}$. It has a spectrum that falls off smoothly but rapidly [3, fig. 2], [4, fig. 4] and it has been singled out as a good example of band conservation [5], [11]. Apart from this, the superior qualities of this modulation seem to have attracted little attention until very recently [12]-[14] and one finds often that PSK is accepted as superior to FSK [15], [16].

After some comments on the so-called optimum at $h = 0.71$, the coherent FSK with deviation ratio $h = 0.5$ will be taken up again. First its optimal decision structure will be given, which requires a coherent demodulation; and self-synchronizing circuits will then be described, which are essential to obtain the optimal performance. With such a receiver, the signal requires less bandwidth than binary PSK and less power than any other binary FSK and comes close to four-phase PSK, which has much more complex receiver synchronization requirements. These advantages might be indicated by dubbing this FSK modulation the "fast FSK."

IV. DISCUSSION OF THE OPTIMIZATION OF h
The statement that this fast FSK is better than all other FSK seems to be at variance with the accepted value of $h = 0.715\ldots$ for optimum error probability. Therefore, it will be instructive to repeat here the derivation of the optimum at $h = 0.715$, so that we can see more readily where a false restriction has been applied [17]. We assume that at time $t = 0$ the phase of the received signal is given, say $\phi(0)$, and that for $0 \le t \le T$ either space or mark is transmitted. We assume further that the IF center frequency is sufficiently high. Then the narrow-band assumption is valid, so that the signals can be expressed by their preenvelopes:
$$u_{\text{mark}} = \exp j[2\pi f_1 t + \phi(0)] \quad\text{or}\quad u_{\text{space}} = \exp j[2\pi f_2 t + \phi(0)].$$
Next, white Gaussian noise is added to the signal. Then a decision must be made at the receiver as to whether mark or space has been transmitted (see Fig. 2). The lowest error of this decision is obtained when the distance between the two signals in Hilbert space is largest [18]. Since mark or space are transmitted only for $0 \le t \le T$, one evaluates thus first the integral
$$D^2(h) = \int_0^T |u_s - u_m|^2\,dt \tag{5}$$
and then maximizes $D(h)$ with respect to $h$. Simple substitution gives, with (3),
$$|u_s - u_m|^2 = 4\sin^2\frac{\pi h t}{T}$$
and
$$D^2(h) = 4\int_0^T \sin^2\frac{\pi h t}{T}\,dt = 2T[1 - \operatorname{sinc} 2h]. \tag{6}$$
Woodward's sinc function [1, p. 29] is known to have its first and lowest minimum ($-0.217$) when
$$\tan 2\pi h = 2\pi h. \tag{7}$$
The maximum of $D(h)$ occurs therefore when $h = 0.715$, so that this is assumed to be the optimum value for $h$.

Fig. 2. Hilbert space representation of the mark-space decision.

The fallacy in the preceding derivation lies in the arbitrary selection of the limits $0 \le t \le T$ in the integral that defines the distance $D(h)$, rather than integrating over $(-\infty, \infty)$. The same applies to the experimental work, which seems to confirm the theory, but bases the decision also only upon the interval $(0, T)$. This is inadequate, because the interval $(0, T)$ is not the complete time that is available to make the decision. It is true that $T$ is the duration of the transmission of one bit and we must make one decision per transmitted bit; but the bit phases are interrelated and it is a basic tenet of information theory that better performance is obtained when several signals are jointly investigated over a longer period.

Now low modulation index coherent FSK retains useful phase information beyond the time of the bit transmission [15]. In the binary case this extends over two time intervals. (For suitably selected m-ary FSK, the phase coherence can extend over several time intervals, but this will not be pursued here.) In particular, the phase in the succeeding time interval $T \le t \le 2T$ contains unused information about the transmitted frequency of the earlier time interval $0 \le t \le T$; this information increases when $h$ decreases (below 0.71) and becomes particularly important at $h = 0.5$. A receiver that fails to use this information will therefore achieve at best a local maximum, but not an optimum, and thus we should expect a receiver with lower $h$ and a decision interval $2T$ to perform better than the so-called optimum at $h = 0.715$.

V. OPTIMUM RECEPTION OF COHERENT FSK
If we inspect Fig. 1, or (4), then it is evident that the phase at times that are odd (even) multiples of $T$ will be odd (even) multiples of $\pi h$. Since all phases are modulo $2\pi$, the case $h = \tfrac{1}{2}$ stands out, because then the phase can take only the two values $\pm\pi/2$ at odd times and only the two values $0$ and $\pi$ at even times. Thus one may design a receiver to make a decision between the two permitted phase values. It will be shown that this is an optimal decision scheme, provided that the decision is made after observing not the signal in one time interval, but the real (imaginary) part of its preenvelope in two time intervals: the one before and the one after the phase that we wish to estimate. This can be seen by elementary methods. Let $\omega = \pi(f_1 + f_2)$ and let $\phi(0)$ be either $0$ or $\pi$. Then
$$u(t)\exp(-j\omega t) = \begin{cases} +\exp j\left(\pm\dfrac{\pi t}{2T}\right), & \phi(0) = 0 \\[1ex] -\exp j\left(\pm\dfrac{\pi t}{2T}\right), & \phi(0) = \pi \end{cases} \qquad 0 \le t \le T. \tag{8}$$
The sign within the exponent is positive for space, negative for mark. A similar result holds for $-T \le t \le 0$, except that the sign within the exponent is not necessarily the same in both intervals; but this is immaterial, because we can form
$$\operatorname{Re}\exp j\left(\pm\frac{\pi t}{2T}\right) = \cos\frac{\pi t}{2T}$$
regardless of whether mark or space is being sent in either interval. Consequently, in the given interval of duration $2T$,
$$\operatorname{Re}\{u(t)\exp(-j\omega t)\} = \begin{cases} +\cos\dfrac{\pi t}{2T}, & \phi(0) = 0 \\[1ex] -\cos\dfrac{\pi t}{2T}, & \phi(0) = \pi \end{cases} \qquad -T \le t \le T. \tag{9}$$
The function $\operatorname{Re}\{u(t)\exp(-j\omega t)\}$ depends only on $\phi(0)$, not on any other $\phi(nT)$; thus all unwanted information has been removed. The decision between $\phi(0) = 0$ and $\phi(0) = \pi$ can be optimally made, based on the waveform received in $(-T, T)$, since the signals $\pm\cos(\pi t/2T)$, $-T \le t \le T$, are antipodal and contain each the energy of one bit [18]. The probability of error is therefore the same as that of a coherent binary receiver (of same energy per bit), i.e., as function of the SNR $\gamma$,
$$P_{\text{FFSK}} = \tfrac{1}{2}\operatorname{erfc}\sqrt{\gamma} \approx \frac{1}{2\sqrt{\pi\gamma}}\exp(-\gamma).$$
This is over 3 dB better than the error probability of conventional FSK [15],
$$P_{\text{FSK}} = \tfrac{1}{2}\exp(-\gamma/2).$$
The optimal receiver structure is a clocked matched filter, for instance implemented by an integrate-dump circuit in baseband, which performs
$$J(0) = \int_{-T}^{T}\cos\frac{\pi t}{2T}\,\operatorname{Re}\{u(t)\exp(-j\omega t)\}\,dt \tag{10}$$
and decides
$$\hat\phi(0) = 0 \ \text{ if } J(0) > 0, \qquad \hat\phi(0) = \pi \ \text{ if } J(0) < 0.$$
Corresponding decisions in every second interval pair give $\hat\phi(2nT)$ and this yields half the number of transmitted bits. The other half comes from the imaginary part of $u$, where a similar relation holds:
$$J(T) = \int_{0}^{2T}\sin\frac{\pi t}{2T}\,\operatorname{Im}\{u(t)\exp(-j\omega t)\}\,dt, \qquad \hat\phi(T) = \begin{cases} \pi/2, & J(T) > 0 \\ -\pi/2, & J(T) < 0 \end{cases} \tag{11}$$
and so on. Finally, the transmitted bit stream can be recovered from the resulting $\hat\phi(nT)$.

No further effort to optimize the receiver is required; we can concentrate on implementing the mathematical receiver structure with practical circuits.
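The decision rules (10) and (11) can be tried out numerically. The following is a minimal sketch, not the receiver circuit of the paper: it works directly on the complex baseband signal $u(t)\exp(-j\omega t)$, assumes $\phi(0) = 0$, ideal timing, and a sampled approximation of the integrals. The function names, the sample rate `sps`, and the noise level are illustrative assumptions.

```python
import numpy as np

def fast_fsk_phase(bits, sps):
    """Piecewise-linear phase phi(t) for h = 1/2, sampled sps times per bit.

    bits: +1 for space (phase rises by pi/2 per bit), -1 for mark.
    phi(0) = 0 is assumed; the result has len(bits) * sps + 1 samples.
    """
    steps = np.repeat(bits, sps) * (np.pi / 2) / sps
    return np.concatenate(([0.0], np.cumsum(steps)))

def demodulate(z, n_bits, sps):
    """Decide phi(kT) from two-bit correlations as in (10)/(11), then recover bits.

    z is the complex baseband signal u(t) * exp(-j*omega*t), possibly noisy.
    """
    tau = np.arange(-sps, sps + 1)            # sample offsets covering (k-1)T..(k+1)T
    weight = np.cos(np.pi * tau / (2 * sps))  # cos(pi * (t - kT) / 2T)
    phases = []
    for k in range(n_bits + 1):
        idx = k * sps + tau
        inside = (idx >= 0) & (idx < len(z))  # clip the window at the record ends
        channel = z.real if k % 2 == 0 else z.imag
        J = np.sum(weight[inside] * channel[idx[inside]])
        if k % 2 == 0:
            phases.append(0.0 if J > 0 else np.pi)
        else:
            phases.append(np.pi / 2 if J > 0 else -np.pi / 2)
    # A space raises the phase by pi/2, a mark lowers it by pi/2 (modulo 2*pi).
    return np.where(np.sin(np.diff(phases)) > 0, 1, -1)

rng = np.random.default_rng(0)
n_bits, sps = 200, 16
bits = rng.choice([-1, 1], size=n_bits)
z = np.exp(1j * fast_fsk_phase(bits, sps))
z_noisy = z + 0.3 * (rng.standard_normal(z.shape) + 1j * rng.standard_normal(z.shape))
recovered = demodulate(z_noisy, n_bits, sps)
print("bit errors:", int(np.sum(recovered != bits)))
```

Each phase decision uses the waveform on both sides of the instant $kT$, which is the two-interval observation the section argues for; the windows at the two ends of the record are simply truncated.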
VI. SHIFTING INTO BASEBAND
$\operatorname{Re}\{u(t)\exp(-j\omega t)\}$ and $\operatorname{Im}\{u(t)\exp(-j\omega t)\}$ are obtained from the received signal
$$s(t) = \operatorname{Re}\{u(t)\}$$
by beating it into baseband. This requires generating a phase-locked carrier at frequency $\omega/2\pi$, multiplying $s(t)$ with $\cos\omega t$ and $\sin\omega t$, respectively, in two balanced modulators, and then removing the second-harmonic terms from the two outputs, which by then are usually called I channel and Q channel.

The other operations of (10) and (11) are straightforward. They require at the receiver two auxiliary signals: the clock that is needed for integrating over an interval of length $2T$ and the precisely phase-locked carrier signals $\cos\omega t$ and $\sin\omega t$. There are various ways to transport both signals, but the most elegant method is to regenerate them at the receiver by a self-synchronizing scheme, rather than wasting power and bandwidth for transmitting phase and clock reference signals.

Fig. 3 illustrates the overall schematic of a self-synchronizing receiver; after a preamplifier, the signals are split into two paths: one extracts the timing and phasing information, necessarily at a much slower rate than the signaling information [19]; and the other path contains a coherent detector of the signals, using the reference signals from the self-synchronizing subsystem.

Fig. 3. Self-synchronizing receiver.
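The mixing step described in this section can be sketched in a few lines. This is an illustration under stated assumptions: the carrier reference is taken as perfectly phase locked, and a moving average over one carrier cycle stands in for the filters that remove the second-harmonic terms. The function name `shift_to_baseband` and all numerical parameters are hypothetical.

```python
import numpy as np

def shift_to_baseband(s, t, f_carrier, samples_per_cycle):
    """Estimate Re and Im of u(t) * exp(-j*omega*t) from s(t) = Re{u(t)}."""
    omega = 2 * np.pi * f_carrier
    # Averaging over one carrier cycle spans two cycles of the second-harmonic
    # term and so suppresses it; this stands in for the low-pass filters.
    kernel = np.ones(samples_per_cycle) / samples_per_cycle
    i_chan = 2 * np.convolve(s * np.cos(omega * t), kernel, mode="same")
    q_chan = -2 * np.convolve(s * np.sin(omega * t), kernel, mode="same")
    return i_chan, q_chan

T = 1.0
cycles_per_bit, samples_per_cycle = 20, 16
sps = cycles_per_bit * samples_per_cycle
bits = np.array([1, 1, -1, 1, -1, -1, 1, -1])          # +1 = space, -1 = mark
t = np.arange(len(bits) * sps) * T / sps
phi = np.cumsum(np.repeat(bits, sps)) * (np.pi / 2) / sps
s = np.cos(2 * np.pi * (cycles_per_bit / T) * t + phi)  # received signal Re{u(t)}

i_chan, q_chan = shift_to_baseband(s, t, cycles_per_bit / T, samples_per_cycle)
inner = slice(samples_per_cycle, -samples_per_cycle)    # ignore filter edge effects
print("max I-channel error:", np.max(np.abs(i_chan[inner] - np.cos(phi[inner]))))
print("max Q-channel error:", np.max(np.abs(q_chan[inner] - np.sin(phi[inner]))))
```

The factors $2$ and $-2$ follow from $\cos(\omega t + \phi)\cos\omega t = \tfrac{1}{2}\cos\phi + \tfrac{1}{2}\cos(2\omega t + \phi)$ and $\cos(\omega t + \phi)\sin\omega t = -\tfrac{1}{2}\sin\phi + \tfrac{1}{2}\sin(2\omega t + \phi)$.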
VII. SELF-SYNCHRONIZATION OF THE RECEIVER
The self-synchronization of the fast FSK is most easily explained by reference to Sunde's FSK [6], [7], which has $h = 1.0$. Since $h$ is a (normalized) frequency difference, a frequency-doubling circuit doubles $h$ and thus transforms the fast FSK into Sunde's FSK.

Now Sunde's FSK has the stated disadvantage that part of its power is in the spectral lines of the two carriers and thus not available for signaling. But we do not need the signaling (this is done at $h = \tfrac{1}{2}$); on the contrary, we turn the spectral lines of Sunde's FSK into an advantage, since they are at $2f_1$ and $2f_2$ and therefore ideally suited to generate for the fast FSK the phase reference.

In principle, this could be done with the circuit in Fig. 4. A frequency doubler feeds two phase-locked loops (PLL), which extract $2f_1$ and $2f_2$ and send them to a balanced modulator. The low-frequency output $2f_2 - 2f_1$ of the modulator is the clock,
$$1/T = 2f_2 - 2f_1;$$
the high-frequency output $2f_1 + 2f_2$ is divided by 4 to generate the two phase reference signals. All the reference signals can be generated practically error free, by narrowing the PLL bandwidth, provided that sufficient time is available for initial synchronization. The phase and clock signals are then used for a conventional coherent demodulation and detection of the signals in the main path, which do not pass through the doubler stage, but remain at $h = \tfrac{1}{2}$; and this, in principle, is the receiver.

If certain periodic signals are used as modulation, then the frequency doubler may generate a line at $2f_c = f_1 + f_2$. This frequency should not be used in the self-synchronizing circuit, since a spectral line at $2f_c$ is in general not available. Some care is necessary to prevent false locking of the PLL if they should find $2f_c$ in the presence of such a periodic modulation.

A different problem is the possible loss of lock if a long string of only mark (or only space) is received. Some improvement can be obtained by modulating the remaining line with the clock to regenerate the missing line, but this works only as long as the clock itself does not drift out of sync. This problem is common to all FSK. The best remedy, if available, is suitable source encoding to prevent such long strings from occurring.

VIII. 90° PHASE AMBIGUITY
The divide-by-four circuit generates a phase ambiguity of the reference phases by some multiple of 90°. Unlike the 4-phase PSK, the fast FSK can resolve this ambiguity, for instance by comparing the coherent output with a bit stream from the frequency doubler. This bit stream is a conventional FSK, except with some noise added (by the doubling circuit) and it is not desirable to use it directly in the receiver, but it still contains enough good bits to resolve the ambiguity if comparison is made over several time intervals.

An alternative method is easier to implement. We treat the ambiguity in two ways. The 90° ambiguity can be removed by a more careful layout of the self-synchronizing circuit, keeping in mind that we really have a binary system and should not allow fourth-order effects to cause phase ambiguities. Fig. 5 illustrates the circuitry.

The circuit of Fig. 5 has the same doubler, PLL, and master clock as the circuit of Fig. 4, but it generates the phase references differently. The outputs of the PLL are divided by two and then added or subtracted. Since the phase at the divider output is ambiguous by 180°, we do not know which adder gives the sum and which the difference, but this does not matter since the outputs are, respectively, $\pm\cos\omega t\cos(\pi t/2T)$ and $\pm\sin\omega t\sin(\pi t/2T)$, and both and the clock are the required reference signals, when they are gated between the nulls of their respective modulation.
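The doubling step shared by the circuits of Figs. 4 and 5 can be checked numerically. The sketch below is only an illustration: it assumes an ideal square-law device as the doubler, a noise-free random bit stream, and frequencies chosen to fall on exact FFT bins. It looks for the two strongest spectral lines of the squared signal, which should sit at $2f_1$ and $2f_2$, spaced by the clock frequency $1/T$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bits, sps, T = 512, 64, 1.0
f_c = 8.0 / T                                    # hypothetical center frequency
f1, f2 = f_c - 0.25 / T, f_c + 0.25 / T          # mark and space: (f2 - f1) * T = 1/2

bits = rng.choice([-1, 1], size=n_bits)          # +1 = space, -1 = mark
steps = np.repeat(bits, sps) * np.pi / (2 * sps)
phi = np.concatenate(([0.0], np.cumsum(steps)))[:-1]   # continuous phase, phi(0) = 0
t = np.arange(n_bits * sps) * T / sps
s = np.cos(2 * np.pi * f_c * t + phi)            # fast FSK signal

doubled = s ** 2                                 # ideal square-law frequency doubler
spectrum = np.abs(np.fft.rfft(doubled - doubled.mean()))
freqs = np.fft.rfftfreq(len(doubled), d=T / sps)
lines = np.sort(freqs[np.argsort(spectrum)[-2:]])
print("strongest lines:", lines, " expected:", [2 * f1, 2 * f2])
print("line spacing:", lines[1] - lines[0], " clock 1/T:", 1 / T)
```

The fast FSK signal itself carries no such lines; they appear only after doubling, when the modulation index has become $h = 1$.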
IX. 180° PHASE AMBIGUITY
The only ambiguity left is the $\pm$ sign in front of the generated reference and this can be taken care of by the method of differential encoding, as is usually done in coherent binary PSK.

Therefore we relate the transmitted bit stream $a_n = \pm 1$, $n = 1, 2, \ldots$ to the received signals in the I and Q channels. First we notice that because of the 180° ambiguities only the phase differences $\phi[(n - 1)T] - \phi[(n + 1)T]$ can be used. Inspection of Fig. 1 shows that a sequence mark-space or space-mark leaves $\phi$ unchanged, while a sequence mark-mark or space-space changes $\phi$ by 180° after 2 bit intervals. Thus, at time $t = 2nT$, a transition at the input corresponds to no transition at the output of the I channel and at time $t = (2n + 1)T$, a transition at the input corresponds to no transition in the Q channel, and vice versa.

The circuit of Fig. 6 shows the various logic operations that recover the differentially encoded input from the outputs of the I and Q channels. The circuit decodes the differentially encoded input $a_n \cdot a_{n+1}$ when $a_n$ was sent.

Fig. 4. Generation of clock and phase reference. (Input: fast FSK with $h = \tfrac{1}{2}$; $\omega = 2\pi(f_1 + f_2)/2$; outputs: clock, $\pm\cos\omega t$ and $\pm\sin\omega t$ to the coherent receiver.)

Fig. 5. Alternative generation of references from the PLL outputs of Fig. 4. (PLL outputs $\cos(2\omega_1 t + 2\alpha)$ and $\cos(2\omega_2 t + 2\alpha)$; references $2\cos(\omega t + \alpha)\cos(\pi t/2T)$ to the I channel and $2\sin(\omega t + \alpha)\sin(\pi t/2T)$ to the Q channel.)

Fig. 6. Digital outputs when $\{a_n\}$ was transmitted. (I-channel and Q-channel decisions, each compared by exclusive OR with its own value delayed by $2T$, followed by clocked interleaving gates.)
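The sign independence described in this section can be illustrated with a short noise-free sketch. It is not a model of the logic circuit of Fig. 6: decisions are represented as $\pm 1$ values, their product plays the role of the exclusive OR, and the unknown reference polarities are modeled by the hypothetical parameters `sign_i` and `sign_q`.

```python
import numpy as np

def channel_decisions(bits, sign_i=1, sign_q=1):
    """Noise-free I/Q decisions at t = kT: cos(phi) for even k, sin(phi) for odd k.

    sign_i and sign_q model the unknown polarity of the two regenerated references.
    """
    phase = np.concatenate(([0], np.cumsum(bits))) * (np.pi / 2)  # phi(kT), phi(0) = 0
    k = np.arange(len(phase))
    samples = np.where(k % 2 == 0, sign_i * np.cos(phase), sign_q * np.sin(phase))
    return np.rint(samples).astype(int)

def decode(decisions):
    """Compare each decision with the one 2T earlier in the same channel."""
    return -decisions[2:] * decisions[:-2]

bits = np.array([1, 1, -1, 1, -1, -1, -1, 1, 1, -1])   # a_n: +1 = space, -1 = mark
expected = bits[:-1] * bits[1:]                          # a_n * a_(n+1)
results = {}
for sign_i in (1, -1):
    for sign_q in (1, -1):
        out = decode(channel_decisions(bits, sign_i, sign_q))
        results[(sign_i, sign_q)] = np.array_equal(out, expected)
        print(sign_i, sign_q, results[(sign_i, sign_q)])
```

Because every product involves two decisions from the same channel, any polarity error of that channel's reference cancels; recovering $a_n$ itself from $a_n \cdot a_{n+1}$ then requires the usual differential encoding at the transmitter.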
Fig. 7. Generator for fast FSK.

X. EFFECT OF FILTERING AND INTERSYMBOL INTERFERENCE
If the fast FSK signals are passed through a narrow-band filter, at the transmitter and/or at the receiver front end, then the signals available for the phase decisions will no longer be independent. Two types of interference may occur: 1) within each channel and 2) between the I and the Q channel.

The second type can be avoided by careful circuit designs. There will be no such interference if the overall filter action is that of a symmetrical bandpass,
$$G(f_c + f) = G^*(f_c - f),$$
because such a bandpass has a weighting function $g(t)\exp(-2\pi j f_c t)$, where $g(t)$ is a real function. The bandpass action is then the correlation of $\exp j\phi(t)$ with $g(t)$ and this leaves the real part of $\exp j\phi(t)$ real and the imaginary part imaginary, so that no interference between the I channel and the Q channel occurs.

The intersymbol interference within the I (or Q) channel cannot be removed so easily. It limits the speed with which we can transmit bits through a given system. But, in each of the two baseband channels, bits occur at half-speed $(1/2T)$ and the signaling function is continuous, since switching occurs only at the nulls of $\cos(\pi t/2T)$. Thus the intersymbol interference will be considerably smaller than when rectangular pulses are transmitted in an IF bandwidth of $1.2/T$. When the IF bandwidth is $0.75/T$, then the intersymbol interference of the fast FSK is still negligible. Higher bit rates are possible by making the intersymbol correction by a transversal filter [20], [21] at baseband, which is by now a well-established technique. Using only an intersymbol correction before decomposition into I and Q channel seems to give poorer results [22].

The output of the narrow-band filter will show amplitude fluctuations, mainly increases near the mark-space transition. This is unavoidable. If one tries to remove the amplitude by a limiter, then the sidebands are regenerated and the effective filter bandwidth is widened [12]. There is also the usual small reduction of SNR due to hard limiter action but the probability of error does not increase significantly. The decision is based upon the quadrant in which the received signal is in a given interval. If we now make no intersymbol interference correction, but ideally investigate per interval only one sample of signal plus noise plus interference, then the decision made from this sample, and hence the error probability, is not affected by hard limiting after the filter.

XI. DIGITAL GENERATION OF THE FAST FSK SIGNAL
When generating the fast FSK, for good performance it is required not only to generate a coherent FSK, but also to hold the normalized frequency excursion $h$ to precisely the value of $h = \tfrac{1}{2}$. This can be achieved by a suitable feedback circuit around a conventional coherent FSK; or by a circuit that generates two sine waves at $f_1$ and $f_2 = f_1 + h/T$, each with positive and negative polarity, and a logic circuit, which, on frequency shifting, selects that polarity of the new frequency that maintains phase coherence.

A more elegant generator of the fast FSK can be constructed as follows. First Sunde's FSK [6] is generated at IF, then sent through a flip-flop, which divides by 2; then the harmonics of the resulting square wave are removed by a zonal filter and the resulting signal is the fast FSK, since Sunde's FSK has a normalized frequency shift of $h = 1$, which, after the frequency divider stage, becomes $h = \tfrac{1}{2}$.

If the desired frequency range allows it, then Sunde's FSK can be obtained in an all-digital circuit as follows. Let $n$ be some small integer. Generate a master clock with frequency $n(n + 1)/T$. Count down to derive three pulse trains, the first with $n/T$ pulses, the other with $(n + 1)/T$ pulses, and the third train with $1/T$ pulses. This third train serves as clock for the bit-stream encoder. In each time interval $T$ we select $n$ pulses from the first train if we transmit mark, or $(n + 1)$ pulses from the second train if we transmit space. This can be done by a digital circuit under control of the master clock. The fundamental of the resulting pulse train is then an FSK with $h = 1$; but it is not necessary to obtain the fundamental because the train itself is ideally suited to drive a divide-by-two flip-flop. The fundamental at the flip-flop output is then the fast FSK. Fig. 7 illustrates a digital circuit that implements this scheme for $n = 4$.
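The countdown arithmetic of the all-digital generator can be verified with exact fractions. This is a bookkeeping sketch only; it does not model the gating logic of Fig. 7, and the function name `clock_plan` and the dictionary keys are illustrative.

```python
from fractions import Fraction

def clock_plan(n, T=Fraction(1)):
    """Pulse rates derived from a master clock of frequency n(n + 1)/T."""
    master = Fraction(n * (n + 1)) / T
    mark_train = master / (n + 1)          # n / T pulses per second
    space_train = master / n               # (n + 1) / T pulses per second
    bit_clock = master / (n * (n + 1))     # 1 / T, clock for the bit-stream encoder
    f_mark = mark_train / 2                # fundamental after the divide-by-two flip-flop
    f_space = space_train / 2
    return {
        "master": master,
        "mark_train": mark_train,
        "space_train": space_train,
        "bit_clock": bit_clock,
        "f_mark": f_mark,
        "f_space": f_space,
        "h_before_divider": (space_train - mark_train) * T,
        "h_after_divider": (f_space - f_mark) * T,
    }

plan = clock_plan(4)                       # n = 4, as in Fig. 7
for name, value in plan.items():
    print(f"{name}: {value}")
```

With $T$ normalized to 1, the three pulse trains come out as integer divisions of the master clock, and the normalized frequency shift is 1 before the flip-flop and $\tfrac{1}{2}$ after it.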
XII. CONCLUSION
The fast FSK is a binary FSK with these properties. 1) It is phase coherent. 2) It has a low deviation ratio, $h = \tfrac{1}{2}$. 3) It occupies a small bandwidth. 4) It uses a phase detector as demodulator.

First, with the help of a modification of the Costas receiver [23], self-synchronization of phase and clock is achieved; then the signals are coherently demodulated and detected. Such a receiver shows a distinct improvement over the bandwidth and SNR required for conventional reception of binary FSK signals. For instance, when no correction for intersymbol interference is used, then the fast FSK can be transmitted in an RF bandwidth of only 0.75 times the bit rate, yet received with about 3 dB less power than conventional FSK at equal bit rate and error probability.

The theory of the scheme is fairly simple, once it is recognized where the information, which makes this improvement possible, is to be found; and the implementation is straightforward.

REFERENCES
[1] P. M. Woodward, Probability and Information Theory, With Applications to Radar. New York: Pergamon, 1953, pp. 40-42.
[2] W. Postl, "Die spektrale Leistungsdichte bei Frequenzmodulation eines Traegers mit einem stochastischen Telegraphie-signal," Frequenz, vol. 17, pp. 107-110, Mar. 1963.
[3] M. G. Pelchat, "The autocorrelation function and power spectrum of PCM/FM with random binary modulating waveforms," IEEE Trans. Space Electron. Telem., vol. SET-10, pp. 39-44, Mar. 1964.
[4] W. R. Bennett and S. O. Rice, "Spectral density and autocorrelation functions associated with binary frequency-shift keying," Bell Syst. Tech. J., vol. 42, pp. 2355-2385, Sept. 1963.
[5] H. J. von Baeyer, "Band limitation and error rate in digital UHF-FM transmission," IEEE Trans. Commun. Syst., vol. CS-11, pp. 110-117, Mar. 1963.
[6] E. D. Sunde, "Ideal binary pulse transmission by AM and FM," Bell Syst. Tech. J., vol. 38, pp. 1357-1426, Nov. 1959.
[7] W. R. Bennett and J. Salz, "Binary data transmission by FM over a real channel," Bell Syst. Tech. J., vol. 42, pp. 2387-2426, Sept. 1963.
[8] V. A. Kotel'nikov, The Theory of Optimum Noise Immunity (in Russian, 1947). New York: McGraw-Hill, 1959, p. 40.
[9] E. F. Smith, "Attainable error probabilities in demodulation of random binary PCM/FM waveforms," IRE Trans. Space Electron. Telem., vol. SET-8, pp. 290-291, Dec. 1962.
[10] T. T. Tjhung and P. H. Wittke, "Carrier transmission of binary data in a restricted band," IEEE Trans. Commun. Technol., vol. COM-18, pp. 295-304, Aug. 1970.
[11] M. L. Doelz et al., "Minimum-shift data communication system," U.S. Patent 2 977 417, Mar. 1961.
[12] J. R. Boykin, "Spectrum economy for filtered and limited FSK signals," Conf. Rec., 1970 IEEE Conf. Communications, pt. 1, pp. 21.29-21.34.
[13] W. A. Sullivan, "High-capacity microwave system for digital data transmission," Conf. Rec., Int. Conf. Communications, Montreal, Que., Canada, June 1971, pp. 23.4-23.8.
[14] H. C. van den Elzen and P. van der Wurf, "A simple method of calculating the characteristics of FSK signals with modulation index 0.5," IEEE Trans. Communications, vol. COM-20, pp. 139-147, Apr. 1972.
[15] M. Schwartz, W. R. Bennett, and S. Stein, Communication Systems and Techniques. New York: McGraw-Hill, 1966, pp. 298-302, eq. 7.5.8; pp. 341-342.
[16] B. P. Lathi, Communication Systems. New York: Wiley, 1968, p. 419.
[17] R. de Buda and H. Anto, "About FSK with low modulation index," Can. Gen. Elec. Co., Ltd., Tech. Inf. Ser. Rep. RQ69EE11, Dec. 1969.
[18] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965, pp. 248-250.
[19] R. de Buda, "The phaselock to a suppressed carrier in additive Gaussian noise," Can. Gen. Elec. Co., Ltd., Rep. TIS RQ70EE7, Sept. 1970.
[20] R. W. Lucky, "Techniques for adaptive equalization of digital communication systems," Bell Syst. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[21] D. A. George, D. C. Coll, A. R. Kaye, and R. R. Bowen, "Channel equalization for data transmission," Eng. J., vol. 53, pp. 20-31, May 1970.
[22] J. L. Pearce and P. H. Wittke, "Optimum reception of digital FM signals," Dig. 1970 IEEE Symp. Communications, Montreal, Que., Canada, Nov. 12-13, 1970.
[23] J. P. Costas, "Synchronous communications," Proc. IRE, vol. 44, pp. 1713-1718, Dec. 1956.
Data Transmission by Frequency-Division Multiplexing Using the Discrete Fourier Transform
S. B. WEINSTEIN, MEMBER, IEEE, AND PAUL M. EBERT, MEMBER, IEEE

Abstract-The Fourier transform data communication system is a realization of frequency-division multiplexing (FDM) in which discrete Fourier transforms are computed as part of the modulation and demodulation processes. In addition to eliminating the banks of subcarrier oscillators and coherent demodulators usually required in FDM systems, a completely digital implementation can be built around a special-purpose computer performing the fast Fourier transform. In this paper, the system is described and the effects of linear channel distortion are investigated. Signal design criteria and equalization algorithms are derived and explained. A differential phase modulation scheme is presented that obviates any equalization.

I. INTRODUCTION
Data are usually sent as a serial pulse train, but there has long been interest in frequency-division multiplexing with overlapping subchannels as a means of avoiding equalization, combating impulsive noise, and making fuller use of the available bandwidth. These "parallel data" systems, in which each member of a sequence of $N$ digits modulates a subcarrier, have been studied in [2] and [4]. Multitone systems are widely used and have proved to be effective in [3], [8], and [9]. Fig. 1 compares the transmissions of a serial and a parallel system.

Fig. 1. Comparison of waveforms in serial and parallel data transmission systems. (a) Serial stream of six binary digits. (b) Typical appearance of baseband serial transmission. (c) Typical appearance of waveforms that are summed to create parallel data signals.

For a large number of channels, the arrays of sinusoidal generators and coherent demodulators required in parallel systems become unreasonably expensive and complex. However, it can be shown [1] that a multitone data signal is effectively the Fourier transform of the original serial data train, and that the bank of coherent demodulators is effectively an inverse Fourier transform generator. This point of view suggests a completely digital modem built around a special-purpose computer performing the fast Fourier transform (FFT). Fourier transform techniques, although not necessarily the signal format described in this paper, have been incorporated into several military data communication systems [5]-[7].

Because each subchannel covers only a small fraction of the original bandwidth, equalization is potentially simpler than for a serial system. In particular, for very narrow subchannels, soundings made at the centers of the subchannels may be used in simple transformations of the receiver output data to produce excellent estimates of the original data. Further, a simple equalization algorithm will minimize mean-square distortion on each subchannel, and differential encoding of the original data may make it possible to avoid equalization altogether.

Paper approved by the Data Communications Committee of the IEEE Communication Technology Group for publication after presentation at the 1971 IEEE International Conference on Communications, Montreal, Que., Canada, June 14-16. Manuscript received January 15, 1971; revised March 29, 1971. The authors are with the Advanced Data Communications Department, Bell Telephone Laboratories, Holmdel, N. J.
Reprinted from IEEE Transactions on Communication Technology, vol. COM-19, no. 5, October 1971.

II. FREQUENCY-DIVISION MULTIPLEXING AS A DISCRETE TRANSFORMATION
Consider a data sequence $(d_0, d_1, \ldots, d_{N-1})$, where each $d_n$ is a complex number $d_n = a_n + jb_n$. If a discrete Fourier transform (DFT) is performed on the vector $\{2d_n\}_{n=0}^{N-1}$, the result is a vector $S = (S_0, S_1, \ldots, S_{N-1})$ of $N$ complex numbers, with
$$S_m = \sum_{n=0}^{N-1} 2d_n e^{-j(2\pi nm/N)} = 2\sum_{n=0}^{N-1} d_n e^{-j2\pi f_n t_m}, \qquad m = 0, 1, \ldots, N - 1, \tag{1}$$
where
$$f_n \triangleq \frac{n}{N\,\Delta t}, \tag{2}$$
$$t_m \triangleq m\,\Delta t, \tag{3}$$
and $\Delta t$ is an arbitrarily chosen interval. The real part of the vector $S$ has components
$$Y_m = 2\sum_{n=0}^{N-1}\left(a_n\cos 2\pi f_n t_m + b_n\sin 2\pi f_n t_m\right), \qquad m = 0, 1, \ldots, N - 1. \tag{4}$$
If these components are applied to a low-pass filter at time intervals $\Delta t$, a signal is obtained that closely approximates the frequency-division multiplexed signal
$$y(t) = 2\sum_{n=0}^{N-1}\left(a_n\cos 2\pi f_n t + b_n\sin 2\pi f_n t\right), \qquad 0 \le t \le N\,\Delta t. \tag{5}$$

Fig. 2. Fourier transform communication system in absence of channel distortion.
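The distortionless system of Fig. 2 can be sketched with NumPy's FFT routines. This is an illustrative check rather than the authors' implementation: the transmitter side confirms that the real part of the DFT of $\{2d_n\}$ equals the sum in (4), and the receiver side samples $y(t)$ of (5) at intervals $\Delta t/2$ and applies a $2N$-point DFT scaled by $1/2N$, the scaling being an assumption chosen so that the data reappear with unit gain. Function names and the choice $N = 8$ are hypothetical.

```python
import numpy as np

def transmit_samples(d):
    """Y_m of (4), computed as the real part of the DFT of {2 d_n} as in (1)."""
    return np.real(np.fft.fft(2 * d))

def direct_sum(d):
    """Y_m of (4), evaluated term by term with f_n * t_m = n * m / N."""
    n = np.arange(len(d))
    angle = 2 * np.pi * np.outer(n, n) / len(d)       # angle[m, n] = 2*pi*f_n*t_m
    return 2 * (np.cos(angle) @ d.real + np.sin(angle) @ d.imag)

def receive(d):
    """Sample y(t) of (5) at t = k*dt/2, k = 0..2N-1, and take a scaled 2N-point DFT."""
    N = len(d)
    angle = 2 * np.pi * np.outer(np.arange(2 * N), np.arange(N)) / (2 * N)
    samples = 2 * (np.cos(angle) @ d.real + np.sin(angle) @ d.imag)
    return np.fft.fft(samples) / (2 * N)

rng = np.random.default_rng(0)
N = 8
d = rng.choice([-1.0, 1.0], size=N) + 1j * rng.choice([-1.0, 1.0], size=N)

print("DFT real part matches (4):", np.allclose(transmit_samples(d), direct_sum(d)))
z = receive(d)
print("z_l = a_l - j b_l for l = 1..N-1:", np.allclose(z[1:N], np.conj(d[1:])))
print("z_0 = 2 a_0:", np.isclose(z[0], 2 * d[0].real))
```

Sampling at twice the rate is what makes the $2N$-point transform separate $a_l$ and $b_l$: with only $N$ real samples, the components at indices $l$ and $N - l$ would overlap.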
A block diagram of the conuuunication system in which y (t) is the transmitted signal appears in Fig. 2. Demodulation at the receiver is carried out via a. discrete Fourier transformatiori ot a vector of samples of the received signal. Because only the real part of the Fourier transform has been transmitted, it is necessary to sample twice as fast as expected, i.e., at intervals at/2. When there is no channel distortion, the receiver DFT operates on the 2N samples
=
Yk
J1
t) = 2 t;
N- t (
y( k 2
K= where definitions (2) and (3) (4). The DFT' yields
z; - -
1
E
2N-l
- 2N
f2ao,
El k=O
have been substituted into
la
1 -
Jb 1 ,
e;(2rmk/2N)
=
{I, 0,
Fig. 3. Power density spectra of subchannel components of y(t).
III.
l = 0
.
where the equality
2N
(6)
k
irrelevant,
~2
0, 1, ... ,"2N - 1,
Ye- i ( 2 1r l k / 2 N )
k-O
=
k+ b. sin '~ ), k) ,
C)
a. cos - :
l
=
1,2,
l
>
N - 1,
m
= 0,
,N - 1
±2N, ±4N,
(7)
(8)
otherwise
has been employed. The original data a, and b, are available (except for l == 0) as the real and imaginary components, respectively, of Zl, as Indicated in Fig. 2. A synchronising signal is required, but one or several channels of the transmitted signal can readily be utilized for this purpose. Because the sinusoidal components of the parallel data signal y (t) are truncated in time, the power density spectrum of 11 (t) consists of [sin (f) / f] 2- shaped spectra, as sketched in Fig. 3. Nevertheless, the data, on the different subchanneIs can be completely separated by the DFT' operation of (7). This will not be exactly true when linear channel distortion affects the received signal, but it will be shown later that a modest reduction in transmission rate eliminates most interferences.
III. EQUALIZATION BY USE OF CHANNEL SOUNDINGS
Except for the added linear channel distortion and final equalizer, the Fourier transform data communication system shown in Fig. 4 is identical to that of Fig. 2. Ideally, the discrete Fourier transformation in the receiver should be replaced by another linear transformation, derived in [1], which minimizes the error in the receiver output. However, it is preferable, if possible, to retain the DFT with its "fast" implementations and carry out suboptimal but adequate correctional transformations at the receiver output. The system of Fig. 4 performs this approximate equalization. Consider the waveform at the receiver input,

$$r(t) = y(t) * h(t), \tag{9}$$

where the asterisk denotes convolution. This waveform is a collection of truncated sinusoids modified by a linear filter. If the sinusoid $\cos 2\pi f_n t$ were not truncated, then the result of passing it through a channel with transfer function $H(f)$ would be $H_n\cos(2\pi f_n t + \phi_n)$, where $H_n = |H(f_n)|$ and

$$\phi_n = \tan^{-1}\!\left(\frac{\operatorname{Im} H(f_n)}{\operatorname{Re} H(f_n)}\right). \tag{10}$$
The sinusoids in the transmitted signal $y(t)$ [see (5)] are truncated to the interval $(0, N\Delta t)$, so that the $n$th subchannel must accommodate a $[\sin N\pi(f - f_n)\Delta t]/[N\pi(f - f_n)\Delta t]$ spectrum instead of the impulse at $f_n$, which would correspond to a pure sinusoid. However, if $1/(N\Delta t)$ is small compared with the total transmission bandwidth, then $H(f)$ does not change significantly over the subchannel and an approximate expression for the received signal $r(t)$ is

$$r(t) \approx 2\sum_{n=1}^{N-1} H_n\left[a_n\cos(2\pi f_n t + \phi_n) + b_n\sin(2\pi f_n t + \phi_n)\right] + 2H_0a_0,\qquad 0 \le t \le N\,\Delta t. \tag{11}$$

As indicated in Fig. 4, $r(t)$ is sampled at times $k(\Delta t/2)$, $k = 0, 1, \ldots, 2N-1$, and the samples $\{r_k\}$ are applied to a discrete Fourier transformer. The output of the DFT is

$$z_l = \frac{1}{2N}\sum_{k=0}^{2N-1} r_k\, e^{-j(2\pi lk/2N)} \approx \begin{cases} 2H_0a_0, & l = 0 \\ H_l\left[a_l\cos\phi_l + b_l\sin\phi_l\right] - jH_l\left[b_l\cos\phi_l - a_l\sin\phi_l\right], & l = 1, 2, \ldots, N-1 \\ \text{irrelevant}, & l > N-1. \end{cases} \tag{12}$$

Estimates of $a_l$ and $b_l$ are obtained from the computations

$$\hat a_l = \frac{1}{H_l}\left[\operatorname{Re}(z_l)\cos\phi_l + \operatorname{Im}(z_l)\sin\phi_l\right],\qquad \hat b_l = \frac{1}{H_l}\left[\operatorname{Re}(z_l)\sin\phi_l - \operatorname{Im}(z_l)\cos\phi_l\right],\qquad l = 1, 2, \ldots, N-1. \tag{13a}$$

In complex notation, the appropriate computation is

$$\hat a_l - j\hat b_l = z_l w_l, \tag{13b}$$

where

$$w_l = \frac{1}{H_l}\left[\cos\phi_l - j\sin\phi_l\right]. \tag{13c}$$

Fig. 4. Fourier transform communication system including linear channel distortion and final equalization.

Equations (13a-c) describe a 2 × 2 transformation to be performed on each of the DFT outputs $z_l$, $l = 1, 2, \ldots, N-1$. For a reasonably large $N$ and a typical communication channel, the approximation of $H(f)$ by a constant over each subchannel, which leads to (13a-c), may be adequate. However, linear rather than constant approximations to the amplitude and phase of the channel transfer function as it affects each subchannel waveform are much closer to reality. The following section examines the consequences of these approximations. It is shown that the truncated subchannel sinusoids are delayed by differing amounts, and that distortion is concentrated at the on-off transitions of these waveforms. Further, the magnitude of the distortion is proportional to the abruptness of the transitions. Hence a "guard space," consisting of a modest increase in the signal duration together with a smoothing of the on-off transitions, will eliminate most interference among channels and between adjacent transmission blocks. The individual channels can then be equalized in accord with (13a-c).

IV. APPROXIMATE ANALYSIS OF THE EFFECTS OF CHANNEL DISTORTION

The transmitted signal $y(t)$ as given by (5) exists only on the interval $(0, N\Delta t)$, so that each subchannel must, as noted earlier, accommodate a $\sin f/f$ type spectrum. As suggested in the last section, let this spectrum be narrowed by increasing the signal duration to some $T > N\Delta t$ and requiring gradual rather than abrupt rolloffs of the transmitted waveform. Specifically, the transmitted signal will be redefined as

$$y(t) = 2g_a(t)\sum_{n=0}^{N-1}\left[a_n\cos(2\pi f_n t) + b_n\sin(2\pi f_n t)\right], \tag{14}$$

where an appropriate $g_a(t)$ is

$$g_a(t) = \begin{cases} \dfrac{1}{2}\left[1 + \cos\dfrac{\pi t}{2aT}\right], & -2aT \le t < 0 \\ 1, & 0 \le t < T \\ \dfrac{1}{2}\left[1 + \cos\dfrac{\pi(t - T)}{2aT}\right], & T \le t \le T + 2aT \\ 0, & \text{elsewhere.} \end{cases} \tag{15}$$

The "window function" $g_a(t)$ is sketched in Fig. 5. When $y(t)$ is passed through the channel filter with impulse response $h(t)$, the received signal is

$$r(t) = \sum_{n=0}^{N-1}\left\{a_n\left[2h(t) * g_a(t)\cos 2\pi f_n t\right] + b_n\left[2h(t) * g_a(t)\sin 2\pi f_n t\right]\right\} = \sum_{n=0}^{N-1}\left\{a_n q_a^{(n)}(t) + b_n q_b^{(n)}(t)\right\}, \tag{16}$$
where

$$q_a^{(n)}(t) = 2h(t) * \left[g_a(t)\cos 2\pi f_n t\right],\qquad q_b^{(n)}(t) = 2h(t) * \left[g_a(t)\sin 2\pi f_n t\right]. \tag{17}$$

In Appendix I, linear approximations to the amplitude and phase of $H(f)$ around $f = \pm f_n$ result in the following approximate expression for $q_a^{(n)}(t)$:

$$q_a^{(n)}(t) \approx 2H_n\cos\left[2\pi f_n t + \phi_n\right]g_a(t - \beta_n) + \frac{\alpha_n}{\pi}\sin\left[2\pi f_n t + \phi_n\right]\frac{d}{dt}g_a(t - \beta_n), \tag{18}$$

where $(H_n, \phi_n)$ is the channel sounding at frequency $f_n$, and $\alpha_n$ and $\beta_n$ are the slopes at $f_n$ of the linear approximations to amplitude and phase of $H(f)$, as shown in Fig. 6. A similar expression results for $q_b^{(n)}(t)$. The first term on the right-hand side of (18) is the $n$th cosine element in the transmitted signal (14), except that it is modified by a channel sounding $(H_n, \phi_n)$ and its window is delayed by $\beta_n$.

Fig. 5. Window $g_a(t)$ multiplying all subchannel sinusoids in transmitted signal.

Fig. 6. Linear approximations to amplitude and phase of $H(f)$.
Suppose that the signal duration $T$ satisfies

$$T > (2N-1)\frac{\Delta t}{2} + \max_n(\beta_n) - \min_n(\beta_n), \tag{19}$$

as pictured in Fig. 7. Then for all $n$ there exists a time $t_0 > \max_n \beta_n$ such that

$$g_a(t - \beta_n) = 1,\qquad t_0 \le t \le (2N-1)\frac{\Delta t}{2} + t_0. \tag{20}$$

Fig. 7 shows where the interval $(t_0,\ t_0 + (2N-1)(\Delta t/2))$ is located with respect to the minimum and maximum values of the time shift $\beta_n$. Thus,

$$q_a^{(n)}(t) \approx 2H_n\cos(2\pi f_n t + \phi_n),\qquad t_0 \le t \le t_0 + (2N-1)\frac{\Delta t}{2}. \tag{21}$$

By a similar derivation,

$$q_b^{(n)}(t) \approx 2H_n\sin(2\pi f_n t + \phi_n),\qquad t_0 \le t \le t_0 + (2N-1)\frac{\Delta t}{2}. \tag{22}$$

Therefore, substituting (21) and (22) into (16) (except for $n = 0$),

$$r(t) \approx 2\sum_{n=1}^{N-1} H_n\left[a_n\cos(2\pi f_n t + \phi_n) + b_n\sin(2\pi f_n t + \phi_n)\right] + 2H_0a_0,\qquad t_0 \le t \le t_0 + (2N-1)\frac{\Delta t}{2}. \tag{23}$$
Except for the shifted domain of definition, (23) is identical to (11), which led to (13) for retrieval of the data. It can be shown that initiating sampling of $r(t)$ at $t = t_0$ instead of at $t = 0$ is equivalent to incrementing each phase $\phi_n$ by $2\pi f_n t_0$.

Fig. 7. Shifted versions of $g_a(t)$ corresponding to subchannels with minimum and maximum delay, and locations of samples taken by receiver. Here $t_{\min} = \min_n \beta_n$ and $t_{\max} = \max_n \beta_n$.
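Since (23) has the same form as (11), the per-subchannel correction of (13b)-(13c) still applies. The sketch below, in plain Python, illustrates that correction under the idealized model of (11), in which each subchannel sees only a gain $H_n$ and a phase $\phi_n$. The function names and the channel values are hypothetical and chosen only for illustration.

```python
import cmath
import math

def received_samples(a, b, H, phi):
    """Samples r_k = r(k*dt/2) of (11); note 2*pi*f_n*k*dt/2 = pi*n*k/N."""
    N = len(a)
    samples = []
    for k in range(2 * N):
        s = 2 * H[0] * a[0]
        for n in range(1, N):
            angle = math.pi * n * k / N + phi[n]
            s += 2 * H[n] * (a[n] * math.cos(angle) + b[n] * math.sin(angle))
        samples.append(s)
    return samples

def dft(r):
    """z_l = (1/2N) * sum_k r_k exp(-j 2 pi l k / 2N), l = 0..N-1, as in (12)."""
    M = len(r)
    return [
        sum(r[k] * cmath.exp(-2j * math.pi * l * k / M) for k in range(M)) / M
        for l in range(M // 2)
    ]

def equalize(z, H, phi):
    """Apply (13b)-(13c): a_hat - j*b_hat = z_l * w_l with w_l = exp(-j*phi_l)/H_l."""
    return [z[l] * cmath.exp(-1j * phi[l]) / H[l] for l in range(1, len(z))]

# Hypothetical data and channel soundings for N = 4.
a = [0.5, 1.0, -1.0, 1.0]
b = [0.0, -1.0, 1.0, 1.0]
H = [1.0, 0.8, 0.5, 0.3]
phi = [0.0, 0.4, -1.1, 2.0]
estimates = equalize(dft(received_samples(a, b, H, phi)), H, phi)
# estimates[l-1] approximates a_l - j*b_l for l = 1..N-1.
```

The sketch ignores the window transitions and interchannel interference analyzed above; it only confirms that one complex multiplication per DFT output undoes the modeled gain and phase.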
V. ALGORITHM FOR MINIMIZING MEAN-SQUARE DISTORTION
Under the assumption, supported by the results of the last section, that interchannel interference is negligible, a simple algorithm can be devised for determination of the parameters $\cos\phi_l/H_l$ and $\sin\phi_l/H_l$ of (13a-c). The estimates are formed as

$$\hat a_l = T_{1l}\operatorname{Re} z_l + T_{2l}\operatorname{Im} z_l,\qquad \hat b_l = T_{2l}\operatorname{Re} z_l - T_{1l}\operatorname{Im} z_l, \tag{24}$$

which resemble (13a-c), except that the $T$ coefficients are to be chosen to minimize the estimation error.¹ Mean-square distortion is defined by

$$\varepsilon = E\sum_{l=1}^{N-1}\left[e_{la}^2 + e_{lb}^2\right] = E\sum_{l=1}^{N-1}\left[(T_{1l}\operatorname{Re} z_l + T_{2l}\operatorname{Im} z_l - a_l)^2 + (T_{2l}\operatorname{Re} z_l - T_{1l}\operatorname{Im} z_l - b_l)^2\right], \tag{25}$$

where

$$e_{la} = \hat a_l - a_l,\qquad e_{lb} = \hat b_l - b_l. \tag{26}$$

It is shown in Appendix II that $\varepsilon$ is a convex function of the vector $\mathbf{T}$, where

$$\mathbf{T} = (T_{11}, T_{21}, T_{12}, T_{22}, \ldots, T_{1,N-1}, T_{2,N-1}). \tag{27}$$

Thus a steepest descent algorithm is sure to converge to the vector $\mathbf{T}_0$ yielding the minimum mean-square distortion. The $T_{1l}$ and $T_{2l}$ components of the gradient of $\varepsilon$ with respect to $\mathbf{T}$ are

$$\frac{\partial\varepsilon}{\partial T_{1l}} = 2E\left[e_{la}\operatorname{Re} z_l - e_{lb}\operatorname{Im} z_l\right],\qquad \frac{\partial\varepsilon}{\partial T_{2l}} = 2E\left[e_{la}\operatorname{Im} z_l + e_{lb}\operatorname{Re} z_l\right]. \tag{28}$$

The steepest descent algorithm makes changes at the end of each block transmission in a direction opposite to the gradient:

$$\Delta T_{1l} = -k\left[e_{la}\operatorname{Re} z_l - e_{lb}\operatorname{Im} z_l\right],\qquad \Delta T_{2l} = -k\left[e_{la}\operatorname{Im} z_l + e_{lb}\operatorname{Re} z_l\right], \tag{29}$$

where $k$ controls the step size. A block diagram of the implementation of (29) is shown in Fig. 8. The initial value of $\mathbf{T}$ is probably best obtained from crude channel soundings, or specified as some "typical" vector quantity. It is expected that the first round of adjustments, made at the end of the first block transmission, will suffice to reduce the error to a low level, if it is not already low with the initial value of $\mathbf{T}$. At $\Delta t = 0.5$ ms, the length of one block before addition of a guard space will vary from about 8 ms (16 subchannels) to about 64 ms (128 subchannels). This block length, plus the guard space necessary to minimize interference, is a transmission delay that cannot be avoided.

The equalization algorithm given here only equalizes distortion due to cochannel interference (channels on the same frequency) and completely ignores interchannel interference. An unpublished analysis by the authors shows that for this equalizer, the interchannel interference becomes arbitrarily small as the number of subchannels increases. This is true even without any of the signal modification described in Section IV.

¹ The vector $\mathbf{T}$ and its subscripted components should not be confused with the signal duration $T$ used earlier.

Fig. 8. Automatic equalizer for Fourier transform receiver. One such apparatus is required for each of the $(N - 1)$ usable outputs of the discrete Fourier transformer.

VI. TRANSMISSION WITHOUT EQUALIZATION

We have shown that for narrow subchannels the channel can be equalized by multiplying $z_l$, the receiver output for the $l$th subcarrier channel, by a number $w_l$ [see (13c)]. This is simply compensation for attenuation and phase shift in that particular subchannel. The interference among subchannels is made small by using a guard time and smooth transitions between blocks. For binary transmission, the attenuation need not be compensated, and if differential phase transmission between subchannels is used, no phase equalization is needed. In order for this technique to work, the difference in phase of the transmission channel transfer function $H(f)$ between adjacent subchannels should be small. Assume this is the case and let

$$d_n = (a_n - jb_n)\,d_{n-1},\qquad n = 1, 2, \ldots, N-1, \tag{30}$$

where $a_n$ and $b_n$ are binary information digits on the $n$th subchannel. For the first block transmission, $d_0$ is necessarily a fixed reference. At the output of the DFT in the receiver, form the product

$$z_n z_{n-1}^* = d_n h_n d_{n-1}^* h_{n-1}^* = |d_{n-1}|^2(a_n - jb_n)|h_n|^2 + (a_n - jb_n)\,h_n(h_{n-1}^* - h_n^*)\,|d_{n-1}|^2,\qquad n = 1, 2, \ldots, N-1, \tag{31}$$

where $h_n = H(f_n)$ is the complex channel transfer function at the center frequency of the $n$th subchannel. The last part of (31) is the information signal times an unknown amplitude term, plus an error term depending on $h_n - h_{n-1}$. For binary transmission, the information signal can be reliably recovered if the second term is less than half of the first term. This is equivalent to saying that the phase of $h$ does not change by more than 30° between the centers of adjacent subchannels. Because $a_n$ and $b_n$ are binary, $(a_n - jb_n)$ may be recovered by determining which quadrant of the complex plane contains $z_n z_{n-1}^*$, even though $h_n$ is unknown. For the second and all subsequent block transmissions, $d_0$ can carry information by comparing it with $d_0$ from the previous $T$-second transmission, i.e.,

$$d_0^{\text{current}} = (a_0 - jb_0)\,d_0^{\text{previous}}. \tag{32}$$

APPENDIX I

RECEIVED SIGNAL UNDER LINEAR APPROXIMATIONS TO AMPLITUDE AND PHASE OF THE CHANNEL TRANSFER FUNCTION

The received signal is given by (16) as

$$r(t) = \sum_{n=0}^{N-1}\left(a_n q_a^{(n)}(t) + b_n q_b^{(n)}(t)\right), \tag{33}$$

where

$$q_a^{(n)}(t) = 2h(t) * \left[g_a(t)\cos(2\pi f_n t)\right],\qquad q_b^{(n)}(t) = 2h(t) * \left[g_a(t)\sin(2\pi f_n t)\right]. \tag{34}$$

Restricting attention for the time being to $q_a^{(n)}(t)$, the Fourier transform of this function is

$$Q_a^{(n)}(f) = H(f)\left[G_a(f - f_n) + G_a(f + f_n)\right], \tag{35}$$

where $G_a(f)$ is the Fourier transform of the "window function" $g_a(t)$. We now make linear approximations to the amplitude and phase of $H(f)$ as they affect the separate spectra $G_a(f - f_n)$ and $G_a(f + f_n)$ in (35). For $G_a(f - f_n)$, the approximation is

$$H(f) \approx \left[H_n + \alpha_n(f - f_n)\right]e^{j[\phi_n - 2\pi\beta_n(f - f_n)]}, \tag{36}$$

and for $G_a(f + f_n)$ the approximation is

$$H(f) \approx \left[H_n - \alpha_n(f + f_n)\right]e^{-j[\phi_n + 2\pi\beta_n(f + f_n)]}. \tag{37}$$

The validity of these approximations depends on the narrowness of $G_a(f)$ and the size of $f_n$, as illustrated in Fig. 6. One can always select a window function $g_a(t)$ for which the approximations of (36) and (37) lead to arbitrarily accurate expressions for $q_a^{(n)}(t)$ and $q_b^{(n)}(t)$ except for the lowest values of $n$. Substituting approximations (36) and (37) into (35),

$$Q_a^{(n)}(f) \approx \left[H_n + \alpha_n(f - f_n)\right]e^{j[\phi_n - 2\pi\beta_n(f - f_n)]}G_a(f - f_n) + \left[H_n - \alpha_n(f + f_n)\right]e^{-j[\phi_n + 2\pi\beta_n(f + f_n)]}G_a(f + f_n). \tag{38}$$

Thus

$$q_a^{(n)}(t) = \int_{-\infty}^{\infty} Q_a^{(n)}(f)\,e^{j2\pi ft}\,df. \tag{39}$$

After appropriate changes of variable, these Fourier transforms can be evaluated by recalling the following Fourier transform pairs:

$$x(t - t_0) \Longleftrightarrow e^{-j2\pi ft_0}X(f),\qquad \frac{d}{dt}x(t - t_0) \Longleftrightarrow j2\pi f\,e^{-j2\pi ft_0}X(f). \tag{40}$$

Thus

$$q_a^{(n)}(t) \approx 2H_n\cos\left[2\pi f_n t + \phi_n\right]g_a(t - \beta_n) + \frac{\alpha_n}{\pi}\sin\left[2\pi f_n t + \phi_n\right]\frac{d}{dt}g_a(t - \beta_n). \tag{41}$$

This approximation appears as (18) in the main body of this paper. A similar derivation yields an analogous approximation to $q_b^{(n)}(t)$.

APPENDIX II

CONVEXITY OF MEAN-SQUARE ERROR

The mean-square error has been defined as

$$\varepsilon(\mathbf{T}) = E\sum_{l=1}^{N-1}\left\{\left[T_{1l}\operatorname{Re} z_l + T_{2l}\operatorname{Im} z_l - a_l\right]^2 + \left[T_{2l}\operatorname{Re} z_l - T_{1l}\operatorname{Im} z_l - b_l\right]^2\right\}, \tag{42}$$

where the vector $\mathbf{T}$ is defined as

$$\mathbf{T} = (T_{11}, T_{21}, T_{12}, T_{22}, \ldots, T_{1,N-1}, T_{2,N-1}). \tag{43}$$

Taking a single term of the sum, we have

$$\begin{aligned} &T_{1l}^2(\operatorname{Re} z_l)^2 + T_{2l}^2(\operatorname{Im} z_l)^2 + 2\operatorname{Re} z_l\operatorname{Im} z_l\,T_{1l}T_{2l} - 2a_l\left[T_{1l}\operatorname{Re} z_l + T_{2l}\operatorname{Im} z_l\right] + a_l^2 \\ &\quad + T_{2l}^2(\operatorname{Re} z_l)^2 + T_{1l}^2(\operatorname{Im} z_l)^2 - 2\operatorname{Re} z_l\operatorname{Im} z_l\,T_{1l}T_{2l} - 2b_l\left[T_{2l}\operatorname{Re} z_l - T_{1l}\operatorname{Im} z_l\right] + b_l^2 \\ &= (T_{1l}^2 + T_{2l}^2)|z_l|^2 - 2T_{2l}(a_l\operatorname{Im} z_l + b_l\operatorname{Re} z_l) + 2T_{1l}(b_l\operatorname{Im} z_l - a_l\operatorname{Re} z_l) + a_l^2 + b_l^2, \end{aligned}$$

which is the sum of convex functions and is thus convex itself. Since $\varepsilon(\mathbf{T})$ is the sum of convex functions, it too is convex.
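Because $\varepsilon(\mathbf{T})$ is convex, the block-by-block update (29) settles at the coefficients $\cos\phi_l/H_l$ and $\sin\phi_l/H_l$ of (13a-c). The following is a minimal sketch for a single subchannel, not the authors' implementation. It assumes a noise-free DFT output $z = H e^{j\phi}(a - jb)$, known binary training digits serving as the ideal reference, and hypothetical values for $H$, $\phi$, and the step size $k$; the function name `adapt` is likewise an assumption.

```python
import cmath
import math
import random

def adapt(H, phi, k=0.5, blocks=200, seed=0):
    """Iterate the update (29) for one subchannel with known training digits."""
    rng = random.Random(seed)
    T1, T2 = 1.0, 0.0                       # crude initial coefficients
    for _ in range(blocks):
        a = rng.choice((-1, 1))             # binary training digits
        b = rng.choice((-1, 1))
        z = H * cmath.exp(1j * phi) * complex(a, -b)   # noise-free DFT output
        e_a = T1 * z.real + T2 * z.imag - a            # errors of (26) via (24)
        e_b = T2 * z.real - T1 * z.imag - b
        T1 += -k * (e_a * z.real - e_b * z.imag)       # update (29)
        T2 += -k * (e_a * z.imag + e_b * z.real)
    return T1, T2

H, phi = 0.5, 0.7                            # hypothetical channel sounding
T1, T2 = adapt(H, phi)
# T1 approaches cos(phi)/H and T2 approaches sin(phi)/H.
```

In this noise-free setting each step shrinks the coefficient error by the factor $1 - k|z|^2$, so $k$ must be small enough that this factor has magnitude below one; with noise, a smaller $k$ would trade convergence speed for steadier coefficients.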
ACKNOWLEDGMENT

The authors are indebted to J. E. Mazo for comments on approximation techniques, and to J. Salz for earlier work on this project.

REFERENCES

[1] J. Salz and S. B. Weinstein, "Fourier transform communication system," presented at the Ass. Comput. Machinery Conf. Computers and Communication, Pine Mountain, Ga., Oct. 1969.
[2] B. R. Saltzberg, "Performance of an efficient parallel data transmission system," IEEE Trans. Commun. Technol., vol. COM-15, Dec. 1967, pp. 805-811.
[3] M. L. Doelz, E. T. Heald, and D. L. Martin, "Binary data transmission techniques for linear systems," Proc. IRE, vol. 45, May 1957, pp. 656-661.
[4] R. W. Chang and R. A. Gibby, "A theoretical study of performance of an orthogonal multiplexing data transmission scheme," IEEE Trans. Commun. Technol., vol. COM-16, Aug. 1968, pp. 529-540.
[5] E. N. Powers and M. S. Zimmerman, "TADIM-A digital implementation of a multichannel data modem," presented at the 1968 IEEE Int. Conf. Communications, Philadelphia, Pa.
[6] W. W. Abbott, R. C. Benoit, Jr., and R. A. Northrup, "An all-digital adaptive data modem," presented at the IEEE Computers and Communication Conf., Rome, N. Y., 1969.
[7] W. W. Abbott, L. W. Blocker, and G. A. Bailey, "Adaptive data modem," Page Commun. Eng., Inc., RADC-TR-69-296, Sept. 1969.
[8] M. S. Zimmerman and A. L. Kirsch, "The AN/GSC-10 (KATHRYN) variable rate data modem for HF radio," IEEE Trans. Commun. Technol., vol. COM-15, Apr. 1967, pp. 197-204.
[9] J. L. Holsinger, "Digital communication over fixed time-continuous channels with memory, with special applications to telephone channels," M.I.T. Res. Lab. Electron., Cambridge, Rep. 430, 1964.
S. B. Weinstein (S'59-M'66) was born in New York, N. Y., on November 25, 1938. He received the B.S. degree from the Massachusetts Institute of Technology, Cambridge, in 1960, the M.S. degree from the University of Michigan, Ann Arbor, in 196?, and the Ph.D. degree from the University of California, Berkeley, in 1966. Upon finishing his graduate studies he worked for approximately one year with the Philips Research Laboratories in Eindhoven, the Netherlands, before joining the Advanced Data Communications Department at Bell Telephone Laboratories, Holmdel, N. J., in 1968. He works in the areas of statistical communication theory and data communications.

Paul M. Ebert (M'60) was born in Madison, Wis., on December 30, 1935. He received the B.S. degree from the University of Wisconsin, Madison, in 1958, and the M.S. and Sc.D. degrees from the Massachusetts Institute of Technology, Cambridge, in 1962 and 1965, respectively. From 1958 to 1960 he was a member of the Airborne Communications Division at the Radio Corporation of America, Camden, N. J. Since 1965 he has been a member of the Advanced Data Communications Department, Bell Telephone Laboratories, Holmdel, N. J. He has worked in the fields of information theory, coding, and digital filtering.
Viterbi Decoding for Satellite and Space Communication

JERROLD A. HELLER, Member, IEEE, and IRWIN MARK JACOBS, Member, IEEE

Abstract-Convolutional coding and Viterbi decoding, along with binary phase-shift keyed modulation, is presented as an efficient system for reliable communication on power limited satellite and space channels. Performance results, obtained theoretically and through computer simulation, are given for optimum short constraint length codes for a range of code constraint lengths and code rates. System efficiency is compared for hard receiver quantization and 4 and 8 level soft quantization. The effects on performance of varying certain parameters relevant to decoder complexity and cost are examined. Quantitative performance degradation due to imperfect carrier phase coherence is evaluated and compared to that of an uncoded system. As an example of decoder performance versus complexity, a recently implemented 2-Mbit/s constraint length 7 Viterbi decoder is discussed. Finally a comparison is made between Viterbi and sequential decoding in terms of suitability to various system requirements.

Paper approved by the Communication Theory Committee of the IEEE Communication Technology Group for publication without oral presentation. This work was supported in part by NASA Ames Research Center under Contract NAS2-6024. Manuscript received June 10, 1971. The authors are with the Linkabit Corporation, San Diego, Calif.

Reprinted from IEEE Transactions on Communication Technology, vol. COM-19, no. 5, October 1971.

The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

I. INTRODUCTION

The satellite and space communication channels are likely candidates for the cost-effective use of coding to improve communication efficiency. The primary additive disturbance on these channels can usually be accurately modeled by Gaussian noise which is "white" enough to be essentially independent from one bit time interval to the next, and, particularly on the space channel but also in many instances on satellite channels, sufficient bandwidth is available to permit moderate bandwidth expansion. Two effective decoding algorithms for independent noise (memoryless) channels have been developed and refined, namely sequential and Viterbi decoding of convolutional codes. These theoretical accomplishments, combined with real communication needs and the availability of low-cost complex digital integrated circuits, make possible practical and powerful high-speed decoders for satellite and space communication.

Communication from a distant and isolated object in space to a ground-based station presents certain system problems which are not nearly as critical in earth-based communication systems. The most obvious among these is the high cost of space-platform power. It is desirable to design a system which is as efficient as practical in order to minimize the spacecraft weight necessary to generate power. The modulated signal power at a ground station receiver front end $P$ depends upon the transmitted power, the transmitting and receiving antenna gains, and propagation path losses. Primarily due to thermal activity at the receiver front end, wideband noise is added to the received signal, resulting in a received signal power-to-noise ratio $(P/N_0)$, where $N_0$ is the single-sided noise spectral density. The noise is usually accurately modeled as being both white and Gaussian. Other perturbations caused by uncertainty in carrier phase at the demodulator and inaccuracies in receiver AGC are treated in Sections IV and V.

The efficiency of a communication system is usefully measured by the received energy per bit to noise ratio $(E_b/N_0)$ required to achieve a specified system bit error rate. The $E_b/N_0$ is expressible in terms of the modulating signal power by the relationship

$$\frac{E_b}{N_0} = \frac{P}{N_0}\cdot\frac{1}{R}, \tag{1}$$

where $R$ is the information rate in bits per second. Alternatively, (1) can be written as

$$R = \frac{P/N_0}{E_b/N_0}. \tag{2}$$

The payoff for using modulation and/or coding techniques which reduce the $E_b/N_0$ required for a given bit error probability is an increase in allowable data rate and/or a decrease in necessary received $P/N_0$. As a point of reference, it is traditional to compare the efficiency of modulation-coding schemes with that of a hypothetical system operating at channel capacity. Channel capacity for an infinite bandwidth white Gaussian noise channel with average power $P$ is [1]

$$C_\infty = \frac{P}{N_0\ln 2}\ \text{bit/s}. \tag{3}$$

From (1), when $R = C_\infty$,

$$\left.\frac{E_b}{N_0}\right|_{\min} = \ln 2. \tag{4}$$

Thus, the lower bound on achievable $E_b/N_0$ is about -1.6 dB. Without coding, required $E_b/N_0$ can be minimized by selecting an efficient modulation technique. For example, 180° binary phase-shift keying (BPSK) is more efficient than binary frequency shift keying (BFSK). For a desired bit error rate of $10^{-5}$, an $E_b/N_0$ of 9.6 dB is required using BPSK (antipodal) modulation, whereas 12.6 dB is required with BFSK (orthogonal) modulation. Quadraphase-shift keying (QPSK) is often used to conserve bandwidth. Under the assumption of perfect phase coherence, QPSK has the same performance as BPSK.

In designing a communication system to operate at a specified data rate, the improvement in efficiency to be realized using coding must be weighed against the relative costs. Potential alternatives include increasing the transmitted power, increasing the transmitting antenna gain and/or the receiving antenna area, and accepting a higher probability of bit error. In many applications, a minimum $P_e$ is required and the incremental cost per decibel increase in $P/N_0$ is now greater (often much greater) than the cost of reducing the needed $E_b/N_0$ through coding. Soft decision Viterbi and hard decision sequential decoding can provide a relatively inexpensive 4-6-dB improvement in required $E_b/N_0$ (at a $10^{-5}$ bit error rate), even at multimegabit data rates. Sequential decoding is extensively discussed in [5]. In Section VII, we compare these techniques. Sections II and III examine various aspects of Viterbi decoding and present curves permitting system tradeoffs. In Section VI, a particular implementation of a Viterbi decoder is discussed to provide one benchmark for cost-complexity discussions.

In the discussion that follows, we assume that the channel is power limited rather than bandwidth limited. This assumption is realistic for many present day and future systems; however, the trend, especially in satellite repeaters, is to larger $P/N_0$ without a proportional increase in available bandwidth. For this reason, we will limit consideration to codes which involve a "bandwidth expansion" of 3 or less; that is, we assume that from 1 to 3 binary symbols can be transmitted over the channel for each bit of information communicated without appreciable intersymbol interference.
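The relations (1)-(4) are easy to evaluate numerically. The sketch below is illustrative only: the received $P/N_0$ of 60 dB-Hz and the 5-dB coding gain are assumed values (the gain lies within the 4-6-dB range quoted above), and the helper names are hypothetical.

```python
import math

def to_db(x):
    """Convert a power ratio to decibels."""
    return 10 * math.log10(x)

def from_db(x_db):
    """Convert decibels back to a power ratio."""
    return 10 ** (x_db / 10)

def data_rate(p_over_n0_db, ebn0_db):
    """Equation (2): R = (P/N0) / (Eb/N0), with both arguments in decibels."""
    return from_db(p_over_n0_db - ebn0_db)

# Equation (4): the minimum Eb/N0 is ln 2, about -1.6 dB.
shannon_limit_db = to_db(math.log(2))

# Assumed link: P/N0 = 60 dB-Hz, uncoded BPSK needing 9.6 dB at 1e-5 error rate.
uncoded_rate = data_rate(60.0, 9.6)
# Assumed 5-dB coding gain at the same received P/N0.
coded_rate = data_rate(60.0, 9.6 - 5.0)
rate_gain = coded_rate / uncoded_rate
```

Under these assumptions the allowable data rate grows by the factor $10^{0.5}$, a little over three, for the same received power.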
II. SYSTEM

A. Convolutional Encoder
Fig. 1 shows a general binary-input binary-output convolutional coder. The encoder consists of a $kK$-stage binary shift register and $v$ mod-2 adders. Each of the mod-2 adders is connected to certain of the shift register stages. The pattern of connections specifies the code. Information bits are shifted into the encoder shift register $k$ bits at a time. After each $k$ bit shift, the outputs of the mod-2 adders are sampled sequentially yielding the code symbols. These code symbols are then used by the modulator to specify the waveforms to be sent over the channel. Since $v$ code symbols are generated for each set of $k$ information bits, the code rate $R_N$ is $k/v$ information bits per code symbol, where $k < v$. The constraint length of the code is $K$, since that is the number of $k$ bit shifts over which a single information bit can influence the encoder output. The state of the convolutional encoder is the contents of the first $k(K - 1)$ shift register stages. The encoder state together with the next $k$ input bits uniquely specify the $v$ output symbols.

Fig. 1. Rate $k/v$ convolutional encoder.

As an example, a $K = 3$, $k = 1$, $v = 2$ encoder is shown in Fig. 2(a). The first two coder stages specify the state of the encoder; thus, there are 4 possible states. The code words, or sequences of code symbols, generated by the encoder for various input information bit sequences are shown in the code "trellis" [2] of Fig. 2(b). The code trellis is really just a state diagram for the encoder of Fig. 2(a). The four states are represented by circled binary numbers corresponding to the contents of the first two stages of the encoder. The lines or "branches" joining states indicate state transitions due to the input of single information bits. Dashed and solid lines correspond to "1" and "0" input information bits, respectively. The trellis is drawn under the assumption that the encoder is in state 00 at time 0. If the first information bit were a 1, the encoder would go to state 10 and would output the code symbols 11. Code symbols generated are shown adjacent to the trellis branches. As an example, the input data sequence 101 ... generates the code symbol sequence 111000 .... Further interpretations of the encoder state diagram and a discussion of "good" convolutional codes are presented in [3].

B. Modulation

The binary symbols output by the encoder are used to modulate an RF carrier sinusoid. Here we restrict our attention to the case of 180° BPSK modulation. Each code symbol results in the transmission of a pulse of carrier at either of two 180° separated phases. A sequence of code symbols produces a uniformly spaced sequence of biphase pulses. The signal component of the received waveform thus has the form

$$s(t) = \sqrt{2E_s}\,\cos(2\pi f_c t + \theta)\sum_i x_i\,p(t - iT_s). \tag{5}$$

Here $x_i$ is ±1 depending on whether the $i$th code symbol is 0 or 1. The function $p(t)$ is a convenient unit energy low-pass pulse waveform, $f_c$ is the carrier frequency, $E_s$ is the energy per pulse, and $T_s$ is the time between successive code symbols. $E_s$ and $T_s$ are defined by the relationships

$$E_s = R_N E_b \tag{6}$$

and

$$T_s = \frac{R_N}{R}. \tag{7}$$

There are several reasons for restricting attention to BPSK modulation. Three important ones are as follows.

1) BPSK signals are convenient to generate and amplify. Traveling wave tube amplifiers operate most efficiently at or near saturation. This nonlinear amplification would degrade performance with multilevel amplitude modulated waveforms.
2) It can be shown that antipodal (BPSK) modulation results in little increase in required $E_b/N_0$ compared to optimum signaling when $E_s/N_0$ is low [4].
3) BPSK modulation of quadrature carriers is equivalent to quadraphase (QPSK) modulation of one carrier. Thus, QPSK need not be separately treated except for synchronization and phase error requirements.
C. Demodulation. and
Qua:nb:zat1~on
At the receiver, the signal s(t) of (5), is observed added to white Gaussian noise. When the carrier phase 8 is known, the optimum demodulator consists of an integrate and dump filter matched to p(t) cos (21rfct + 8). At time jT" the demodulator outputs data r i relevant to the jth code symbol. Normalizing the matched filter output by dividing by yields
V!iJ2
ri
= Xi
V2E s / N o + n,
(8)
when nj is a zero-mean unit variance Gaussian random variable. Each n, is independent of all others. To facilitate digital processing by the decoder, the continuous Tj must be quantized. The simplest quantization is a hard decision with 0 output if 'rj is greater than zero and 1 output otherwise. Here, the received data are represented by only one bit per code symbol, Without coding, the matched filter sampler hard quantizer is an optimum receiver. When coding is used, hard quantization of the received data usually entails a loss of about 2 dB in Eb/N o compared with infinitely fine quantization [4], [5]. Much of this loss can be recouped by quantizing r; to 4 or 8 levels instead of merely 2. Adding additional levels of quantization necessitates a 2- or 3-bit representation of each 1~j. Fig. 3 (a) and (b) shows two quantization schemes with 4 and 8 levels, respectively. Here the quantization level thresholds are spaced evenly. The spacing is 1.0 for 4 levels and 0.5 for 8 levels. Uniform quantization threshold spacings of 1.0 and 0.5 can be shown by analytical means and through simulation to be very close to optimum for 4- and 8-level quantiza-
364
THE BEST OF THE BEST
CODE
INFORMATION BITS
SYMBOLS
COMMUTATOR
(a)
€V
@)
00
@
00
@
00
@
00
\
\ \
\
\
11
\
\
11
\
\
\
\
\
\ 11 \
\
@
\
11
\
\
®
11 \
",
\
\
00
®
\ 11 \
\
"-, , 00
\
,
"
"-
• • •
1Q \
" , '\ '\
"
"
""
"
" '\
" 01 '\
"
0
,
'\
" @~
, ,
'\
<,
01
,
.... 01
"-
", '\
" , ---------" --------@ 10
10
11
11
11
2
3
4
TRELLIS DEPTH
~
(b)
Fig. 2.
3
, I
(a) K
=
3, RN
= 1/2 convolutional encoder. (b) Code trellis diagram. o
2
-1.."0
1.0
(8)
7
l
-1.5
5
6 -1.0
,
o
2
3
4
0.5
-0.6
1.0
1.5
(b)
two inputs (0 or 1, the code symbols) and 2, 4, or 8 outputs. The channel transition probabilities are a function only of the symbol signal-to-noise ratio E,/N o• For example, with 8-level quantization, the probability of receiving interval 6 given that a O-code symbol is sent is the probability that a unit-variance Gaussian random variable with mean V2E,/N o lies betwee~ -1.0 and ., In value. .
1.5
III. VITERBI DECODING
Fig. 3. Receiver quantization thresholds and intervals for (a) 4-level and (b) 8-}evelquantization.
A. Basic Algorithm
tion. Furthermore, 8-I~vel quantization results in a loss of less than 0.25 dB compared to infinitely fine quantization; therefore, quantization to more than 8 levels can yield little performance improvement. We confine our attention to hard decision quantization and the 4 and 8 level schemes shown in Fig. 3. Receiver quantization converts the modulator, Gaussian channel, and demodulator into a discrete channel with
The maximum likelihood or Viterbi decoding algorithm was discovered and analyzed by Viterbi [6] in 1967. Viterbi decoding was first shown to be an efficient and practical decoding technique for short constraint length codes by Heller [7], [8]. Forney [~] and Omura [12] demonstrated 'that the algorithm was in fact maximum likelihood. A thorough discussion of the Viterbi decoding algorithm is presented by Viterbi [3]. Here, it will suffice to briefly
Fifty Years ofCommunications and Networking
review the algorithm and elaborate on those features and parameters which bear on decoder performance and complexity on satellite and space communication channels.

Referring to the code trellis diagram of Fig. 2(b), a brute-force maximum likelihood decoder would calculate the likelihood of the received data for code symbol sequences on all paths through the trellis. The path with the largest likelihood would then be selected, and the information bits corresponding to that path would form the decoder output. Unfortunately, the number of paths for an L bit information sequence is $2^L$; thus, this brute force decoding quickly becomes impractical as L increases.

With Viterbi decoding, it is possible to greatly reduce the effort required for maximum likelihood decoding by taking advantage of the special structure of the code trellis. Referring to Fig. 2(b), it is clear that the trellis assumes a fixed periodic structure after trellis depth 3 (in general, K) is reached. After this point, each of the 4 states can be entered from either of two preceding states. At depth 3, for instance, there are 8 code paths, 2 entering each state. For example, state 00 at level 3 has the two paths entering it corresponding to the information sequences 000 and 100. These paths are said to have diverged at state 00, depth 0 and remerged at state 00, depth 3. Paths remerge after 2 [in general, $k(K-1)$] consecutive identical information bits. A Viterbi decoder calculates the likelihood of each of the $2^k$ paths entering a given state and eliminates from further consideration all but the most likely path that leads to that state. This is done for each of the $2^{k(K-1)}$ states at a given trellis depth; after each decoding operation only one path remains leading to each state. The decoder then proceeds one level deeper into the trellis and repeats the process.

For the K = 3 code trellis of Fig. 2(b), there are 8 paths at depth 3. Decoding at depth 3 eliminates 1 path entering each state. The result is that 4 paths are left. Going on to depth 4, the decoder is again faced with 8 paths. Decoding again eliminates 4 of these paths, and so on. Note that in eliminating the less likely paths entering each state, the Viterbi decoder will not reject any path which would have been selected by the brute force maximum likelihood decoder.

The decoder as described thus far never actually decides upon one most likely path. It always retains a set of $2^{k(K-1)}$ paths after each decoding step. Each retained path is the most likely path to have entered a given encoder state. One way of selecting a single most likely path is to periodically force the encoder into a prearranged state by inputting a K - 1 bit fixed information sequence to the encoder after each set of L information bits. The decoder can then select that path leading to the known encoder state as its (L bit) output.

The great advantage of the Viterbi maximum likelihood decoder is that the number of decoder operations performed in decoding L bits is only $L\,2^{k(K-1)}$, which is linear in L. Of course, Viterbi decoding as a practical technique is limited to relatively short constraint length codes due to the exponential dependence of decoder operations per bit decoded on K. Fortunately, as will be shown, excellent decoder performance is possible with good short constraint length codes.
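The contrast between the two operation counts can be made concrete with a few lines of arithmetic. The following is only an illustrative sketch; the function names are hypothetical and the values of L and K are arbitrary examples, not figures taken from the text.

```python
def brute_force_paths(L):
    """Number of trellis paths a brute-force decoder must score for L information bits."""
    return 2 ** L


def viterbi_operations(L, K, k=1):
    """Decoder operations for L bits: L * 2**(k*(K-1)), linear in L, exponential in K."""
    return L * 2 ** (k * (K - 1))


if __name__ == "__main__":
    # Illustrative values only: K = 3 as in the Fig. 2 example, rate 1/v code (k = 1).
    for L in (10, 20, 40):
        print(L, brute_force_paths(L), viterbi_operations(L, K=3))
```

For the K = 3 example, each added information bit costs 4 more operations, whereas the brute-force path count doubles.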
B. Path Memory
In order to make the Viterbi algorithm a practical decoding technique, certain refinements on the basic algorithm are desirable. First of all, periodically forcing the encoder into a known state by using preset sequences multiplexed into the data stream is neither operationally desirable nor necessary. It can be shown [2], [9] that with high probability the $2^{k(K-1)}$ decoder selected paths will not be mutually disjoint very far back from the present decoding depth. All of the $2^{k(K-1)}$ paths tend to have a common stem which eventually branches off to the various states. This suggests that if the decoder stores enough of the past information bit history of each of the $2^{k(K-1)}$ paths, then the oldest bits on all paths will be identical. If a fixed amount of path history storage is provided, the decoder can output the oldest bit on an arbitrary path each time it steps one level deeper into the trellis. The amount of path storage required, u, is equal to the number of states, $2^{k(K-1)}$, multiplied by the length of the information bit path history per state, h:

$$u = h\,2^{k(K-1)} \qquad (9)$$

Since the path memory represents a significant portion of the total cost of a Viterbi decoder, it is desirable to minimize the required path history length h. One refinement which allows for a smaller value of h is to use the oldest bit on the most likely of the $2^{k(K-1)}$ paths as the decoder output, rather than the oldest bit on an arbitrary path. It has been demonstrated theoretically [2] and through simulation [9] that a value of h of 4 or 5 times the code constraint length is sufficient for negligible degradation from optimum decoder performance. Simulation results showing performance degradation incurred with smaller path history lengths are presented and discussed in Section IV.
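As a rough worked example of the storage relation (9), the sketch below evaluates u for a few constraint lengths. The helper name is hypothetical, and the choice h = 5K is simply the upper end of the "4 or 5 times the constraint length" guideline quoted above.

```python
def path_storage_bits(K, h, k=1):
    """Path memory u = h * 2**(k*(K-1)) bits: one h-bit history per encoder state."""
    return h * 2 ** (k * (K - 1))


if __name__ == "__main__":
    # Assumed rule of thumb from the text: h of about 5 constraint lengths.
    for K in range(3, 9):
        print(K, path_storage_bits(K, h=5 * K))
```

With a 32-bit history, a K = 7 rate 1/2 decoder would need 32 x 64 = 2048 bits of path memory under this formula.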
C. State and Branch Metric Quantization
The path comparisons made for paths entering each state require the calculation of the likelihood of each path involved for the particular received information. Since the channel is memoryless, the path likelihood function is the product of the likelihoods of the individual code symbols [3]
$$P(\mathbf{r}^* \mid \mathbf{x}^l) = \prod_i P(r_i^* \mid x_i^l) \qquad (10)$$

where $\mathbf{r}^* = (r_1^*, r_2^*, \ldots, r_i^*, \ldots)$ is the vector of quantized receiver outputs and $\mathbf{x}^l = (x_1^l, x_2^l, \ldots, x_i^l, \ldots)$ is the code symbol vector for the lth trellis path. In order to avoid multiplication, the logarithm of the likelihood is a preferable path metric
THE BEST OF THE BEST
$$M_l = \log P(\mathbf{r}^* \mid \mathbf{x}^l) = \sum_i \log P(r_i^* \mid x_i^l) = \sum_i m_i^l \qquad (11)$$
where $M_l$ is the metric of the lth path and $m_i^l$ is the metric of the ith code symbol on the lth path. With this type of additive metric, when a path is extended by one branch, the metric of the new path is the sum of the new branch symbol metrics and the old path metric. To facilitate this calculation, the path metric for the best path leading to each state must be stored by the decoder as a state metric. This is in addition to the path information bit history storage required.

Viterbi decoder operation can then be summarized as follows, taking the K = 3 case of Fig. 2 as an example.
1) The metrics for the 2 paths entering state 00 are calculated by adding the previous state metrics of states 00 and 01 to the branch metrics of the upper and lower branches entering state 00, respectively.
2) The larger of the two new path metrics is stored as the new state metric for state 00. The new path history for state 00 is the path history of the state on the winning path augmented by a 0 or 1 depending on whether state 00 or 01 was on the winning path.
3) This add-compare-select (ACS) operation is performed for the paths entering each of the other 3 states.
4) The oldest bit on the path with the largest new path metric forms the decoder output.

Since the code symbol metrics must be represented in digital form in the decoder, the effects of metric quantization come into question. Simulation has shown that decoder performance is quite insensitive to symbol metric quantization. In fact, use of the integers as symbol metrics instead of log likelihoods results in a negligible performance degradation with 2-, 4-, or 8-level receiver quantization [7], [9]. Fig. 4 shows such a set of metrics for the 8-level quantized channel. Use of these symbol metrics implies that symbol metrics as well as the received symbols themselves may be represented by 1, 2, or 3 bits for 2-, 4-, and 8-level receiver quantization, respectively.
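The four steps above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the decoder used for the reported simulations: the generator taps (octal 7, 5) are an assumed K = 3, rate 1/2 code because Fig. 2 is not reproduced here; the mapping of quantizer levels to code symbols (level 7 favors symbol 1) is likewise assumed; and each survivor stores the information bit itself rather than the predecessor-identifying bit of step 2, which is a simplification. All function names are hypothetical.

```python
K = 3                          # constraint length of the example
GENERATORS = (0b111, 0b101)    # ASSUMED taps (octal 7, 5); not taken from Fig. 2
N_STATES = 1 << (K - 1)


def parity(x):
    return bin(x).count("1") & 1


def encode(bits):
    """Rate 1/2 convolutional encoder; state holds the K-1 most recent bits."""
    state, symbols = 0, []
    for u in bits:
        reg = (u << (K - 1)) | state          # newest bit on the left
        symbols += [parity(reg & g) for g in GENERATORS]
        state = reg >> 1                      # drop the oldest bit
    return symbols


def symbol_metric(level, code_symbol):
    """Integer metric for an 8-level sample (0..7); assumed orientation."""
    return level if code_symbol else 7 - level


def viterbi_decode(levels, history=15):
    """Add-compare-select over all states with a fixed-length path history."""
    NEG = -10 ** 9                            # marks states unreachable at start
    metrics = [0] + [NEG] * (N_STATES - 1)    # encoder assumed to start in state 00
    paths = [[] for _ in range(N_STATES)]
    output = []
    n = len(GENERATORS)
    for t in range(0, len(levels), n):
        received = levels[t:t + n]
        new_metrics = [None] * N_STATES
        new_paths = [None] * N_STATES
        for state in range(N_STATES):
            for u in (0, 1):
                reg = (u << (K - 1)) | state
                nxt = reg >> 1
                branch = sum(symbol_metric(q, parity(reg & g))
                             for q, g in zip(received, GENERATORS))
                candidate = metrics[state] + branch                   # add
                if new_metrics[nxt] is None or candidate > new_metrics[nxt]:  # compare
                    new_metrics[nxt] = candidate                      # select
                    new_paths[nxt] = paths[state] + [u]
        metrics, paths = new_metrics, new_paths
        if len(paths[0]) > history:
            # Step 4: emit the oldest bit on the currently most likely path.
            best = max(range(N_STATES), key=lambda s: metrics[s])
            output.append(paths[best][0])
            paths = [p[1:] for p in paths]
    best = max(range(N_STATES), key=lambda s: metrics[s])
    return output + paths[best]               # flush the remaining history


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]  # trailing zeros return the encoder to 00
    levels = [7 * c for c in encode(message)]
    levels[3] = 7 - levels[3]                 # one hard symbol error
    print(viterbi_decode(levels) == message)
```

The state metrics here grow without bound, which Python integers tolerate; a hardware decoder would need some form of metric normalization, a detail this sketch leaves out.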
D. Unknown Starting State
It has been assumed thus far that a Viterbi decoder has knowledge of the encoder starting state before decoding begins. Thus, in Fig. 2(b), the starting state is assumed to be 00. A known starting state may be operationally undesirable since it requires that the decoder know when transmission commences. In reality, it has been found through simulation that a Viterbi decoder may start decoding at any arbitrary point in a transmission, if all state metrics are initially reset to zero. The first 3-4 constraint lengths worth of data output by the decoder will be more or less unreliable because of the unknown encoder starting state. However, after about 4 constraint lengths, the state metrics with high probability have values independent of the starting values and steady-state reliable operation results.
[Figure: table of code symbol versus quantization level; the integer metrics run 7, 6, ..., 0 for one code symbol and 0, 1, ..., 7 for the other.]
Fig. 4. Integer code symbol metrics for 8-level receiver quantization.
IV. SIMULATION AND ERROR PROBABILITY BOUND RESULTS
A. Tradeoffs Between Bit Error Probability and $E_b/N_0$ for Rate 1/2 Codes
Viterbi has derived tight upper bounds to bit error probability for Viterbi decoding based on the convolutional code transfer function [3]. These bounds are particularly tight for the white Gaussian noise channel for error probabilities less than about $10^{-4}$. This bound has been numerically evaluated over a range of $E_b/N_0$ for a variety of codes. The upper bound is presented along with some of the 8-level receiver quantized simulation results for comparison. The upper bound also provides performance data at very low bit error rates, where simulation results are not available due to the excessive computer time required. In comparing the upper bounds to the simulation results, it is important to keep in mind that the upper bound was derived for an infinitely finely quantized receiver output.

The convolutional codes used in the simulations were found through exhaustive computer search [9], [10]. The search criterion was maximization of the minimum free distance for a given code constraint length [3]. Where two codes had the same minimum free distance, the number of codewords at that distance and the higher order free distances were used for code selection. Simulations have consistently shown that the free distance criterion yields codes with the minimum error probability.

The principal results of the simulations and code transfer function bounds are shown in Figs. 5, 6, and 7. All of these figures show bit error rate versus $E_b/N_0$ for Viterbi decoders using optimum rate 1/2 convolutional codes. In all cases, the decoder path history length was 32 bits. In all simulation runs, at least 25 error events contributed to the compiled statistics.

B. Performance Depending on Quantization, Path History, and Receiver Automatic Gain Control
The simulation results in Figs. 5 and 6 are for soft (8-level) receiver quantization. Equally spaced demodulation thresholds are used as shown in Fig. 3(b). This choice of 8-level quantizer thresholds is within a broad range of near optimum values, as will be shown presently. The transfer function bound is for infinitely finely quantized received data, although tight bounds for any degree of quantization can be obtained. Allowing for the 0.20-0.25 dB loss usually associated with 8-level receiver quantization compared with infinite quantization, the transfer function bound curves are in excellent agreement with simulation results in the $10^{-4}$ to $10^{-5}$ bit error rate range. Since the accuracy of the transfer function bound increases with $E_b/N_0$, decoder performance can be ascertained accurately in the $10^{-5}$ to $10^{-8}$ region even in the absence of simulation. The symbol metrics used in the simulation were the equally spaced integers as shown in Fig. 4.

Fig. 5. Bit error rate versus $E_b/N_0$ for rate 1/2 Viterbi decoding. 8-level quantized simulations with 32-bit paths, and infinitely finely quantized transfer function bound, K = 3, 5, 7.

Fig. 6. Bit error rate versus $E_b/N_0$ for rate 1/2 Viterbi decoding. 8-level quantized simulations with 32-bit paths, and infinitely finely quantized transfer function bound, K = 4, 6, 8.

Fig. 7 gives the simulation results for Viterbi decoding with hard receiver quantization. The same optimum rate 1/2, K = 3 through K = 8 codes were used here as in the 8-level quantized simulations.

Fig. 7. Bit error rate versus $E_b/N_0$ for rate 1/2 Viterbi decoding. Hard quantized received data with 32-bit paths; K = 3 through 8.

The following points are obvious from the performance curves.
1) 2-level quantization is everywhere close to 2 dB inferior to 8-level quantization.
2) Each increment in K provides an improvement in efficiency of something less than 0.5 dB at a bit error rate of $10^{-5}$.
3) Performance improvement versus K increases with decreasing bit error rate.

To observe the effects of varying receiver quantization more closely, simulation performance data are presented in Fig. 8 for the K = 5, rate 1/2 code, with 2-, 4-, and 8-level receiver quantization. The Q = 8- and Q = 4-level thresholds are those of Fig. 3.

Fig. 8. Performance comparison of Viterbi decoding using rate 1/2, K = 5 code with 2-, 4-, and 8-level quantization. Path length 32 bits.

Fig. 9 shows bit error rate performance versus $E_b/N_0$ for three values of path history length (8, 16, and 32) using the rate 1/2, K = 5 code, for both 2- and 8-level received data quantization. (The length 32 path curve is identical to the K = 5 curve in Fig. 5.) Performance with length 32 paths is essentially identical to that of an infinite path decoder. Even for a path length of only 16, there is only a small degradation in performance. As previously mentioned, other simulations have shown that a path length of 4-5 constraint lengths is sufficient for other constraint lengths as well.

Coded systems that make use of receiver outputs quantized to more than two levels require an analog-to-digital converter at the modem matched filter output, with thresholds that depend on correct measurement of the noise variance. Since the level settings are effectively controlled by the automatic gain control (AGC) circuitry in the modem, it is of interest to investigate the sensitivity of decoder performance to an inaccurate or drifting AGC signal. Fig. 10 shows the decoder performance variation as a function of A-D converter level threshold spacing. In all cases, the thresholds are uniformly spaced. These simulations use the K = 5, rate 1/2 code with $E_b/N_0$ = 3.5 dB. It is evident that Viterbi decoding performance is quite insensitive to wide variations in AGC gain. In fact, performance is essentially constant over a range of spacing from 0.5 to 0.7. This allows for a variation in AGC gain of better than ±20 percent with no significant performance degradation.
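The link between AGC gain and threshold spacing can be illustrated with a small quantizer sketch. It assumes an 8-level quantizer with seven uniformly spaced thresholds centered on zero and a received sample r expressed in the same units as the spacing (Fig. 3 is not reproduced here, so the exact normalization is an assumption); the function name is hypothetical. Scaling the input by a gain g has the same effect as dividing the spacing by g.

```python
import bisect


def quantize_8level(r, spacing):
    """Map a matched-filter sample to a level 0..7 using uniform thresholds.

    Thresholds sit at -3, -2, ..., +3 times the spacing (assumed layout).
    """
    thresholds = [spacing * m for m in range(-3, 4)]
    return bisect.bisect_right(thresholds, r)


if __name__ == "__main__":
    sample, spacing, gain = 0.55, 0.5, 1.2   # illustrative values only
    # An AGC gain error of +20 percent acts like a narrower threshold spacing.
    print(quantize_8level(gain * sample, spacing),
          quantize_8level(sample, spacing / gain))
```

Under this model, an AGC drift simply moves the operating point along the horizontal axis of a plot such as Fig. 10.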
Fig. 9. Performance comparison of Viterbi decoding using rate 1/2, K = 5 code with 8-, 16-, and 32-bit path lengths and 2- and 8-level quantization.

Fig. 10. Viterbi decoder bit error rate performance as a function of quantizer threshold level spacing; K = 5, rate 1/2, $E_b/N_0$ = 3.5 dB, 8-level quantization with equally spaced thresholds.

C. Performance of Codes of Other Rates
The preceding simulation results have concentrated on Viterbi decoding of rate 1/2 convolutional codes. The results on performance fluctuation due to decoder parameter variation carry over to other code rates with minor changes. Code rates less than 1/2 buy improved performance at the expense of increased bandwidth expansion and more difficult symbol tracking due to decreased symbol energy-to-noise ratios. Rates above 1/2 conserve bandwidth but are less efficient in energy.

Fig. 11 shows bit error rate versus $E_b/N_0$ performance obtained from simulations of Viterbi decoding with optimum rate 1/3, K = 4, 6, and 8 codes, and 8-level quantization. Figs. 12 and 13 show numerical bound and simulation performance results for rate 2/3, K = 3 and K = 4 codes, respectively. Simulation curves are for 2- and 8-level quantization, while the numerical bound curves are for infinitely fine receiver quantization.

Fig. 11. Performance of rate 1/3, K = 4, 6, and 8 codes with Viterbi decoding.

Fig. 12. Performance of rate 2/3, K = 3 code with Viterbi decoding. Numerical bound and simulation results.

Fig. 13. Performance of rate 2/3, K = 4 code with Viterbi decoding. Numerical bound and simulation results.

Comparing the performance data obtained through simulations of Viterbi decoders with rate 1/2 (Figs. 5, 6, and 7) and rate 1/3 codes, it is apparent that the latter offers a 0.3-to-0.5-dB improvement over the former for fixed K, in the range reported. This is close to the improvement in efficiency of a channel with capacity 1/3 compared with one of capacity 1/2, and is therefore expected.

Comparison of the higher rate codes with the rate 1/2 codes may also be made over the range spanned by the simulation and analytical data. The fairest comparison is probably between decoders with a similar number of states, and hence similar decoder complexity. Thus, the K = 3, rate 2/3 data should be compared with the K = 5, rate 1/2 data. Fig. 14 shows the union bounds on performance for the rate 2/3, K = 3, and rate 1/2, K = 5 codes. Both encoders have 16 states. The free distance $d_f$ equals 7 for the rate 1/2 code and 5 for the rate 2/3 code. At very high $E_b/N_0$, the rate 1/2 code must be superior. This is because asymptotically, at high $E_b/N_0$, the error probability varies as

$$P_e \sim n_e \exp(-d_f E_s/N_0) = n_e \exp(-d_f R_N E_b/N_0)$$

where $n_e$ is the number of bit errors contributed by codewords at distance $d_f$. This gives the rate 1/2 code an advantage of about 0.2 dB in the limit. In Fig. 14, the difference between the two curves is about 0.1 dB in the error probability range of $10^{-6}$ to $10^{-9}$. This small difference is due to the fact that the rate 2/3 code used happens to be a particularly good code; the value of $n_e$ is smaller for it than for the rate 1/2 code and this difference is significant even for $P_e$ as small as $10^{-9}$.
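The "about 0.2 dB" figure follows directly from the exponent $d_f R_N$ in the asymptotic expression. The short calculation below is a sketch with hypothetical helper names; it compares only the exponents and ignores the multiplier $n_e$, which is why it describes the limit rather than the 0.1 dB gap observed in Fig. 14.

```python
import math


def encoder_states(K, k=1):
    """Number of encoder states, 2**(k*(K-1))."""
    return 2 ** (k * (K - 1))


def asymptotic_exponent(d_free, rate):
    """Coefficient of Eb/N0 in the high-SNR exponent: d_f * R_N."""
    return d_free * rate


if __name__ == "__main__":
    half = asymptotic_exponent(7, 1 / 2)         # rate 1/2, K = 5, d_f = 7
    two_thirds = asymptotic_exponent(5, 2 / 3)   # rate 2/3, K = 3, d_f = 5
    advantage_db = 10 * math.log10(half / two_thirds)
    print(encoder_states(5), encoder_states(3, k=2))   # both 16
    print(round(advantage_db, 2))                      # about 0.21 dB
```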
Fig. 14. Bit error probability bound for rate 1/2, K = 5, and rate 2/3, K = 3 code.

V. IMPERFECT CARRIER PHASE COHERENCE
Thus far it has been assumed that carrier phase is known exactly at the receiver. In real systems this is usually not the case. Oscillator instabilities and uncompensated Doppler shifts necessitate closed loop carrier phase tracking at the receiver. Since the carrier loop tracks a noisy received signal, the phase reference it provides for demodulation will not be perfect.

An inaccurate carrier phase reference at the demodulator will degrade system performance. In particular, a constant error $\phi$ in the demodulator phase will cause the signal component of the matched filter output to be suppressed by the factor $\cos\phi$:

$$r_i = \pm\sqrt{\frac{2E_s}{N_0}}\,\cos\phi + n_i. \qquad (12)$$

The effect of an imperfect carrier phase reference on performance is always worse for coded than uncoded systems. This is because coded systems are characterized by steeper error probability versus $E_b/N_0$ curves than uncoded systems. An imperfect carrier phase reference causes an apparent loss in received energy-to-noise ratio. Since the coded curve is steeper, the loss in $E_b/N_0$ degrades error probability to a greater extent. Furthermore, unless care is taken in the design of the phase-tracking loop, the phase error might be higher for the coded system than for an uncoded system, since loop performance may depend upon $E_s/N_0$, which is significantly smaller for coded than uncoded systems.

For convolutional coding with phase coherent demodulation and Viterbi decoding, exact analytical expressions for bit error rate $P_e$ versus $E_b/N_0$ are not attainable. The simulation results of the preceding section, however, define a relationship between $P_e$ and $E_b/N_0$ that can be written formally as

$$P_e = f\!\left(\frac{E_b}{N_0}\right) \qquad (13)$$

for a given code, receiver quantization, and Viterbi decoder.

Since the carrier phase is being tracked in the presence of noise, the phase error $\phi$ will vary with time. To simplify analysis, assume that the data rate is large compared to the carrier loop bandwidth so that the phase error does not vary significantly during perhaps 20-30 information bit times. Viterbi decoder output errors are typically several bits in length and are very rarely longer than 10-20 bits when the overall decoder bit error probability is less than $10^{-3}$. Therefore, the phase error is assumed to be constant over the length of almost any decoder error. This being the case, the bit error probability for a constant phase error $\phi$ can be written as

$$P_e(\phi) = f\!\left(\frac{E_b}{N_0}\cos^2\phi\right) \qquad (14)$$

from (12) and (13). This result uses the fact that received signal energy is degraded by $\cos^2\phi$. If $\phi$ is a random variable with distribution $p(\phi)$, the resulting error probability averaged on $\phi$ is

$$P_e' = \int_{-\pi}^{\pi} p(\phi)\,P_e(\phi)\,d\phi. \qquad (15)$$

For the second-order phase-locked loop

$$p(\phi) = \frac{e^{\alpha\cos\phi}}{2\pi I_0(\alpha)}, \qquad \alpha \gg 1 \qquad (16)$$

where $I_0(\cdot)$ is the zeroth order modified Bessel function and $\alpha$ is the loop signal-to-noise ratio [11]. Using this distribution and the $P_e$ versus $E_b/N_0$ curve for the K = 7, rate 1/2 code of Fig. 5, the $P_e'$ integral of (15) has been evaluated for several values of $\alpha$. The results are shown in Fig. 15 as curves of $P_e'$ versus $E_b/N_0$ with $\alpha$ as a parameter (the K = 7, rate 1/2 simulation curve of Fig. 5 was extrapolated to get the high $E_b/N_0$ results shown in this figure). These curves exhibit the same general shape as those for uncoded binary PSK modulation with phase coherence provided by a carrier tracking loop. As expected, the losses due to imperfect coherence are somewhat greater with than without coding.

Fig. 15. Performance curves for rate 1/2, K = 7 Viterbi decoder with 8-level quantization as a function of carrier phase tracking loop signal-to-noise ratio $\alpha$.

Fig. 16 shows the additional $E_b/N_0$ required to maintain a $10^{-5}$ bit error rate as a function of loop signal-to-noise ratio $\alpha$. Curves are shown for the case of uncoded BPSK and rate 1/2, K = 7 convolutional encoding-Viterbi decoding.
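The averaging in (15) is straightforward to evaluate numerically. The sketch below is illustrative only: because the simulated $P_e$ curve f of (13) is not available here, the uncoded BPSK error rate is substituted as a stand-in for f, so the numbers it produces are not those of Fig. 15. The density of (16) is normalized numerically, which avoids evaluating the Bessel function; the function names and the sample values of $E_b/N_0$ and $\alpha$ are assumptions.

```python
import math


def bpsk_error_rate(ebn0):
    """Stand-in for f in (13): uncoded BPSK, Q(sqrt(2*Eb/N0)). NOT the coded curve."""
    return 0.5 * math.erfc(math.sqrt(ebn0))


def averaged_error_rate(f, ebn0_db, alpha, steps=2000):
    """Evaluate (15) with the density (16) by the midpoint rule on [-pi, pi]."""
    ebn0 = 10 ** (ebn0_db / 10)
    width = 2 * math.pi / steps
    numerator = 0.0
    normalizer = 0.0
    for i in range(steps):
        phi = -math.pi + (i + 0.5) * width
        # exp(alpha*cos(phi)) scaled by exp(-alpha) to avoid overflow; the
        # common factor cancels when dividing by the normalizer.
        weight = math.exp(alpha * (math.cos(phi) - 1.0))
        numerator += weight * f(ebn0 * math.cos(phi) ** 2)   # equation (14)
        normalizer += weight
    return numerator / normalizer


if __name__ == "__main__":
    for alpha in (10, 15, 30):   # illustrative loop signal-to-noise ratios
        print(alpha, averaged_error_rate(bpsk_error_rate, 6.0, alpha))
```

Replacing `bpsk_error_rate` with an interpolation of a measured coded curve would follow the procedure the text describes for the K = 7 code.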
Fig. 16. Comparison of increase in $E_b/N_0$ due to imperfect phase coherence necessary to maintain $10^{-5}$ bit error rate for uncoded BPSK and K = 7, rate 1/2, Q = 8 Viterbi decoding.

VI. IMPLEMENTATION OF A VITERBI DECODER
It is convenient to break the basic Viterbi decoder into three functional units: an input or branch metric calculation section, an ACS arithmetic section, and a path memory and output section. Information can be thought of as passing successively from one section to the next.

The branch metric calculation section accepts the input data and calculates (or looks up) the metric for each distinct branch. For a rate 1/2 code, four branches are possible corresponding to transmission of 00, 01, 10, and 11. For a rate 1/3 or rate 2/3 code, eight distinct branch metrics are possible. Note that this is the only section of the decoder that is directly concerned with the number of bits of quantization of the received data, and hence, the only section whose complexity is directly dependent on quantization. (The complexity of the ACS section also depends on quantization indirectly, in that the number of bits required for storing state metrics increases with the number of bits of quantization.) The input section is generally not critical in terms of either complexity or speed limitations. Its complexity does double, however, for each increase of the denominator of the rate $R_N$ by one.

The ACS sections perform the basic arithmetic calculations of the decoder. For a rate 1/v decoder, an ACS is used to add the state metrics for two states to the appropriate branch metrics, to compare the resulting two sums, and to select the larger. The decision is transmitted to the path memory section and the larger of the two sums becomes a new state metric. One ACS function must be performed for each of the $2^{K-1}$ states. In a fully parallel very-high-speed decoder, $2^{K-1}$ ACS units are required. In general, the speed of the ACS unit places an upper bound on the speed of the decoder. For slower decoders, e.g., R less than several megabits per second for T²L logic, ACS units may be time shared, decreasing decoder cost significantly. Complexity of the ACS unit is strongly dependent upon required decoder speed. It should be noted that implementation of Viterbi decoders is greatly simplified by the fact that all ACS units perform identical functions and can be realized by a set of identical circuits.

The path memory section must store about a 4 constraint length history of decisions for each state. The memory requirements are thus nontrivial. Considerable advantage can be taken of new integrated-circuit memories to keep the equipment cost small. However, the complexity of the path memory and the ACS units both increase by a factor slightly larger than 2 for each increase in constraint length of 1. Thus, an increase in system performance of about 0.4 dB at a bit error rate of $10^{-5}$, which can be achieved by increasing K by 1, comes at a cost of slightly more than doubling decoder complexity.

A complete decoder also must include interface circuits, synchronization circuits, timing circuits, and generally an encoder. A recent implementation¹ of a K = 7, rate 1/2 self-synchronized Viterbi decoder capable of operating at up to R = 2 Mbit/s with 2-, 4-, or 8-level quantized data required a total of 356 TTL integrated circuits for all functions. As noted in Fig. 5, this relatively simple decoder provides over 5-dB $E_b/N_0$ advantage over an uncoded BPSK system at $P_e = 10^{-5}$, and 6-dB advantage at $P_e = 10^{-8}$, when soft quantization is used.

VII. COMPARISON OF SEQUENTIAL AND VITERBI DECODING
Both sequential and Viterbi decoding offer practical alternatives to a communications engineer designing a high-performance efficient communication system. The two decoders have significant differences which are noted below. Both are capable of very-high-speed operation.²

¹ The Linkabit LV7026 decoder is designed for use with differentially encoded BPSK or QPSK systems. It automatically resolves demodulator phase ambiguities and establishes node synchronization without manual intervention.
² The Linkabit LS4157 sequential decoder is capable of operation at data rates up to R = 50 Mbit/s. It uses a constraint length K = 41, rate 1/2 code and accepts only hard quantized data. The decoder is fully self-synchronizing. The coding advantage over uncoded data is 4.4 dB at $P_e = 10^{-5}$ at R = 50 Mbit/s and greater than 6 dB at $P_e = 10^{-8}$. The coding advantage is larger at lower data rates.
A. Error Probability
It should be recalled that, since the complexity of sequential decoders is relatively independent of constraint length, the constraint length is typically made quite large to provide a very small probability of undetected error. Usually the important contributor of errors is received data buffer overflow due to a computational overload. Such an event causes a long burst of rather noisy output data until the decoder reestablishes code synchronization. During this burst, the probability of bit error is that of the raw channel, perhaps $P_e = 3 \times 10^{-2}$. Error from a Viterbi decoder occurs in short bursts of length at most 10 to 20. Systems that are sensitive to long bursts of errors should thus use Viterbi decoding. Systems that can tolerate occasional long bursts, with an error indication provided if desired by the decoder, should consider sequential decoding. The curve of error probability versus $E_b/N_0$ tends to be much steeper for a sequential decoder than for a Viterbi decoder because of the difference in K. Thus, the sequential decoding advantage tends to increase as lower probabilities of bit error are demanded, although, as before, many errors tend to come in widely separated noisy bursts.
B. Decoder Delay
Sequential decoders tend to require long buffers of at least 200 bits and as much as several thousand bits to smooth out the variations in computational load. Viterbi decoders require a path memory of at most 64 bits. Thus the decoding delay differs by up to two orders of magnitude.
C. Long Tail Required to Terminate Sequences
In time-division multiplexed systems, bursts of separately encoded data may be received at the same decoder from different sources. In these instances, it may be desirable to time share the decoder. As noted in Section III, termination of encoding can be achieved by transmitting a known sequence of length K - 1, thus causing the encoder to enter a known state. Since K is typically larger for sequential decoding, the "tailing off" of the encoded sequence can cause a significant degradation in system efficiency. The tailing off of the short constraint length codes for Viterbi decoding causes a much smaller degradation. If time and implementation permit the storage of the decoder state without code termination, then the cost of tailing off can be ignored. The design of such a time-shared sequential decoder remains for future work.
D. Rates Other than 1/2 and Soft Quantization
Viterbi decoders for rate 1/3 and 8-level quantization are not significantly more complex than those for rate 1/2 and 4- or 2-level quantization. The chief costs occur in the input section of the decoder as discussed in Section VI. In particular, the soft quantized data are processed in the input section and then incorporated in the branch and state metrics. No storage is required. A sequential decoder, on the other hand, must store several thousand branches of received data, each branch containing $(\log_2 Q)/R_N$ bits for rate $R_N$ and Q level quantization. Although the possibility exists of gaining 0.4 dB by using rate 1/3 rather than rate 1/2 and of gaining 2 dB by using soft decisions rather than hard, these advantages are bought in sequential decoding at a formidable storage and processing cost. In general, then, practical high-rate sequential decoders are limited to rate 1/2 and hard decisions. (It is conceivable that this cost could be minimized by operating the decoder at a very high ratio of computation rate to average bit rate, thereby minimizing the number of branches required in the buffer.)

A second argument against soft quantization with sequential decoding involves the sensitivity of the probability of buffer overflow to channel variations. In Fig. 10, it was demonstrated that changes in receiver AGC of ±20 percent had negligible effect on the performance of a Viterbi decoder. The degradation is much more pronounced for sequential decoding, since the computational load is very sensitive to changes in channel parameters. Thus, part of the 2-dB gain anticipated for soft decisions might be lost unless great care was exercised in controlling receiver AGC precisely.

In comparing sequential decoding and Viterbi decoding, it thus appears fair to consider soft decisions only for the Viterbi decoder. Under these conditions, the efficiency advantage of a long constraint length sequential decoder is considerably diluted. Consequently, performance of a rate 1/2, K = 41 sequential decoder is no better than a rate 1/2 Viterbi decoder of constraint length 5 to 7 (depending on the speed factor, that is, the ratio of computation rate to bit rate) at a $P_e$ of $10^{-5}$. The sequential decoder does show a distinct advantage for $P_e$ of $10^{-8}$ or smaller.

On the other hand, building a system without receiver quantization lowers system costs, since a considerably more crude AGC may be used.
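The storage penalty can be put in numbers with a short calculation. This sketch assumes rate 1/v codes, so that $1/R_N$ is the integer number of code symbols per branch, and a buffer of 2000 branches chosen purely for illustration ("several thousand" in the text); the function names are hypothetical.

```python
def bits_per_branch(q_levels, symbols_per_branch):
    """(log2 Q)/R_N for a rate 1/v code: v symbols of log2(Q) bits each."""
    bits_per_symbol = (q_levels - 1).bit_length()   # log2(Q) for Q = 2, 4, 8
    return bits_per_symbol * symbols_per_branch


def buffer_bits(branches, q_levels, symbols_per_branch):
    """Received-data buffer size for a sequential decoder, in bits."""
    return branches * bits_per_branch(q_levels, symbols_per_branch)


if __name__ == "__main__":
    branches = 2000                          # ASSUMED illustrative buffer depth
    print(buffer_bits(branches, 2, 2))       # rate 1/2, hard decisions
    print(buffer_bits(branches, 8, 3))       # rate 1/3, 8-level decisions
```

Moving from rate 1/2 with hard decisions to rate 1/3 with 8-level decisions raises the per-branch storage from 2 bits to 9 bits under these assumptions.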
E. Sensitivity to Phase Error and Bursty Conditions on the Channel
The performance of Viterbi decoding under slowly fluctuating phase error was presented in Fig. 15. A similar calculation would indicate much greater degradation in the case of sequential decoding, since the error probability curve is much steeper. Furthermore, this estimate would be optimistic in the case of sequential decoding, since the assumption that the phase varied so slowly that errors occurred independently would probably not hold for sequential decoding. Thus, more careful design of the phase-tracking loop is indicated for a system utilizing sequential decoding rather than Viterbi decoding.
VIII. CONCLUSIONS
Viterbi decoding has been shown to be a practical method for improving satellite and space communication efficiency by 4-6 dB, at a bit error rate of $10^{-5}$. The successful implementation of 2-Mbit/s constraint-length-7 Viterbi decoders effectively demonstrates that the technique is well beyond the stage of being a theoretical curiosity. In fact, a major effort has been under way for the past 2-3 years with the aim of modifying and adapting the algorithm for minimum complexity implementation without sacrificing performance significantly.

In addition, Viterbi decoding has been shown to "degrade gracefully" in the presence of adverse channel or receiver conditions. In particular, the error probability does not change precipitously with $E_b/N_0$ as is the case with coding techniques that use longer codes and/or require variable decoding effort, such as sequential decoding. This ensures that performance degradation due to an imperfect phase or bit timing reference, or a slight correlation between noise samples, will be minimal. Requirements on AGC accuracy, even for soft decisions, were shown to be quite loose.

Finally, the results presented here should provide the communication engineer with the information necessary to evaluate the applicability of Viterbi decoding to space and satellite communication systems with a wide range
[12]
J. K. Omura, "On the Viterbi decoding algorithm," IEEE
Trans. Iniorm. Theory (Corresp.), vol. IT-15, Jan. 1969, pp. 177-179.
Jerrold A. Heller (1\1'68) was born in New York, N. Y., on June 30, 1941. He received the B.E.E. degree in 1963 from Syracuse University, Syracuse, N. Y., and the M.S. PHOTO and Ph.D. degrees in electrical engineering NOT from the Massachusetts Institute of Technology 7 Cambridge, in 1964 and 1967, reAVAILABLE spectively. During his first year at IV1.1.T. he was a National Science Foundation Fellow. In 1965 he joined the l\1.I.T. Research Laboratory of Electronics where he was a Xerox Fellow for the following two years. He held summer positions in 1962 at the Bell Telephone Laboratories, New York, N. Y., and in 1963 at the IBM Research Center, Yorktown Heights, N. Y., where he worked on the logical design of digital systems. From 1967 to 1969 he was with the Communications Research Section of the Jet Propulsion Laboratory, Pasadena, Calif., where his work centered on the application of coding to deep-space communication. He is presently Director of Technical Operations for the Linkabit Corporation, San Diego, Calif. Currently, his work is concerned with coding for space, IIF, and communication satellite channels. Dr. Heller is a member of Tau Beta Pi, Eta Kappa Nu, and Sigma Xi.
of requirements and constraints. REFERENCES
[1] C. E. Shannon, "Communication is the presence of noise," Proc. IRE, vol, 37. Jan. 1949, pp. 10-21. [2] Codex Corp., Final Rep. "Coding system design for advanced solar missions," Contract NAS 2-3637, NASA Ames Res. Cent., Moffett Field, Calif. [3] A. J. Viterbi, "Convolutional codes and their performance in communication systems," this issue, pp. 751-772. [4] J. M. Wozencraft and I. M. Jacobs, Principles of Communicaiion. Engineering. New York: Wiley, 1965. [5] I. 1\1. Jacobs. "Sequential decoding for efficient communication from deep space," IEEE Trans. Commun. Techmol., vol. COM-15, Aug. 1967, pp. 492-501. [6] A. J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Trans. Inform. Theory, vol. IT-13, Apr. 1967, pp. 260-269. [7] J. A. Heller, "Short constraint length convolutional codes," Jet Propulsion Lab., California. Inst. Technol., Space Programs Summary 37-M, vol. III, Oct.jNov., 1968, pp. 171177. [8] --~ "Improved performance of short, constraint length
convolutional codes," Jet Propulsion Lab., California Inst. Technol., Space Programs Summary 37-56, vol. III, Feb'; Mar. 1969, pp. 83-84. [9] Linkabit Corp., Final Rep., "Coding systems study for high data rate telemetry links ,H Contract N AS2-6024, NASA Ames Res. Ctr. Rep. CR-114278, Moffett Field, Calif. (10] J. P. Odenwalder, "Optimum decoding of convolutional codes," Ph.D. dissertation, Syst. Sci. Dep., Univ. California! Los Angeles, 1970.
[11] A. J. Viterbi, Principles of Coherent Communication. New 'York: McGraw-Hin, 1966.
Irwin Mark Jacobs (S'55-M'60) was born in New Bedford, Msss., on October 18, 1933. He received the B.E.E. degree from Cornell University, Ithaca, N. Y., in 1956, and the PHOTO S.M. and Bc.D. degrees from the MassachuNOT setts Institute of Technology, Cambridge, in 1957 and 1959, respectively. He was the AVAILABLE recipient of a McMullin Regional Scholarship and a General Electric Teachers Conference Scholarship .at Cornell and participated in the engineering cooperative program in association with the Cornell Aeronautical Laboratory, Buffalo, N. Y. In graduate school, he was a General Electric Fellow and an Industrial Fellow of Electronics. In 1959, he was appointed Assistant Professor of Electrical Engineering at lVI.I.T. and was a Member of the staff of the Research Laboratory of Electronics. He was promoted to Associate Professor in 1964. On leave from M.I.T., he spent the academic year 1964-1965 as a NASA Resident Research Fellow at the Jet Propulsion Laboratory, Pasadena, Calif., and was concerned principally with coding for deep-space communications. In 1966, he accepted an appointment as Associate Professor of Applied Physics and Information Science at the University of California, San Diego. In 1970 he was promoted to full Professor. In 1968 he cofounded Linkabit Corporation, of which he is now President. He is presently on leave from the University of California and devoting full time to Linkabit Corporation. He is currently working in the area of information and computer science. Dr. Jacobs is a member of Phi Kappa Phi, Sigma Xi, Eta Kappa Nu, Tau Beta Pi, and the Association for Computing Machinery.
Convolutional Codes and Their Performance in Communication Systems ANDREW J. VITERBI Senior Member, IEEE
Abstract-This tutorial paper begins with an elementary presentation of the fundamental properties and structure of convolutional codes and proceeds with the development of the maximum likelihood decoder. The powerful tool of generating function analysis is demonstrated to yield for arbitrary codes both the distance properties and upper bounds on the bit error probability for communication over any memoryless channel. Previous results on code ensemble average error probabilities are also derived and extended by these techniques. Finally, practical considerations concerning finite decoding memory, metric representation, and synchronization are discussed.
Paper approved by the Communication Theory Committee of the IEEE Communication Technology Group for publication without oral presentation. Manuscript received January 7, 1971; revised June 11, 1971. The author is with the School of Engineering and Applied Science, University of California, Los Angeles, Calif. 90024, and the Linkabit Corporation, San Diego, Calif.
Reprinted from IEEE Transactions on Communications Technology, vol. COM-19, no. 5, October 1971. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
I. INTRODUCTION
Although convolutional codes, first introduced by Elias [1], have been applied over the past decade to increase the efficiency of numerous communication systems, where they invariably outperform block codes of the same order of complexity, there remains to date a lack of acceptance of convolutional coding and decoding techniques on the part of many communication technologists. In most cases, this is due to an incomplete understanding of convolutional codes, whose cause can be traced primarily to the sizable literature in this field, composed largely of papers which emphasize details of the decoding algorithms rather than the more fundamental unifying concepts, and which, until recently, have been divided into two nearly disjoint subsets. This malady is shared by the block-coding literature, wherein the algebraic decoders and probabilistic decoders have been at odds for a considerably longer period. The convolutional code dichotomy owes its origins to the development of sequential (probabilistic) decoding by Wozencraft [2] and of threshold (feedback, algebraic) decoding by Massey [3]. Until recently the two disciplines flourished almost independently, each with its own literature, applications, and enthusiasts. The Fano sequential decoding algorithm [4] was soon found to
greatly outperform earlier versions of sequential decoders both in theory and practice. Meanwhile the feedback decoding advocates were encouraged by the burst-error correcting capabilities of the codes which render them quite useful for channels with memory. To add to the confusion, yet a third decoding technique emerged with the Viterbi decoding algorithm [9], which was soon thereafter shown to yield maximum likelihood decisions (Forney [12], Omura [17]). Although this approach is probabilistic and emerged primarily from the sequential-decoding oriented discipline, it leads naturally to a more fundamental approach to convolutional code representation and performance analysis. Furthermore, by emphasizing the decoding-invariant properties of convolutional codes, one arrives directly at the maximum likelihood decoding algorithm and from it at the alternate approaches which lead to sequential decoding on the one hand and feedback decoding on the other. This decoding algorithm has recently found numerous applications in communication systems, two of which are covered in this issue (Heller and Jacobs [24], Cohen et al. [25]). It is particularly desirable for efficient communication at very high data rates, where very low error rates are not required, or where large decoding delays are intolerable.
Foremost among the recent works which seek to unify these various branches of convolutional coding theory is that of Forney [12], [21], [22], et seq., which includes a three-part contribution devoted, respectively, to algebraic structure, maximum likelihood decoding, and sequential decoding. This paper, which began as an attempt to present the author's original paper [9] to a broader audience,¹ is another such effort at consolidating this discipline. It begins with an elementary presentation of the fundamental properties and structure of convolutional codes and proceeds to a natural development of the maximum likelihood decoder.
The relative distances among codewords are then determined by means of the generating function (or transfer function) of the code state diagram. This in turn leads to the evaluation of coded communication system performance on any memoryless channel. Performance is first evaluated for the specific cases of the binary symmetric channel (BSC) and the additive white Gaussian noise (AWGN) channel with biphase (or quadriphase) modulation, and finally generalized to other memoryless channels. New results are obtained for the evaluation of specific codes (by the generating function technique), rather than the ensemble average of a class of codes, as had been done previously, and for bit error probability, as distinguished from event error probability. The previous ensemble average results are then extended to bit error probability bounds for the class of time-varying convolutional codes by means of a generalized generating function approach; explicit results are obtained for the limiting case of a very noisy channel and compared with the corresponding results for block codes. Finally, practical considerations concerning finite memory, metric representation, and synchronization are discussed. Further and more explicit details on these problems and detailed results of performance analysis and simulation are given in the paper by Heller and Jacobs [24].
While sequential decoding is not treated explicitly in this paper, the fundamentals and techniques presented here lead naturally to an elegant tutorial presentation of this subject, particularly if, following Jelinek [18], one begins with the stack sequential decoding algorithm recently proposed independently by Jelinek and Zigangirov [7], which is far simpler to describe and understand than the original sequential algorithms. Such a development, which proceeds from maximum likelihood decoding to sequential decoding, exploiting the similarities in performance and analysis, has been undertaken by Forney [22]. Similarly, the potentials and limitations of feedback decoders can be better understood with the background of the fundamental decoding-invariant convolutional code properties previously mentioned, as demonstrated, for example, by the recent work of Morrissey [15].
¹ This material first appeared in unpublished form as the notes for the Linkabit Corp., "Seminar on convolutional codes," Jan. 1970.
II. CODE REPRESENTATION
A convolutional encoder is a linear finite-state machine consisting of a K-stage shift register and n linear algebraic function generators. The input data, which is usually, though not necessarily, binary, is shifted along the register b bits at a time. An example with K = 3, n = 2, b = 1 is shown in Fig. 1. The binary input data and output code sequences are indicated on Fig. 1. The first three input bits, 0, 1, and 1, generate the code outputs 00, 11, and 01, respectively. We shall pursue this example to develop various representations of convolutional codes and their properties. The techniques thus developed will then be shown to generalize directly to any convolutional code. It is traditional and instructive to exhibit a convolutional code by means of a tree diagram as shown in Fig. 2. If the first input bit is a zero, the code symbols are those shown on the first upper branch, while if it is a one, the output code symbols are those shown on the first lower branch. Similarly, if the second input bit is a zero, we trace the tree diagram to the next upper branch, while if it is a one, we trace the diagram downward. In this manner all 32 possible outputs for the first five inputs may be traced. From the diagram it also becomes clear that after the first three branches the structure becomes repetitive. In fact, we readily recognize that beyond the third branch the code symbols on branches emanating from the two nodes labeled a are identical, and similarly for all the
Fig. 1. Convolutional coder for K = 3, n = 2, b = 1.
Fig. 2. Tree-code representation for coder of Fig. 1.
Fig. 3. Trellis-code representation for coder of Fig. 1.
identically labeled pairs of nodes. The reason for this is obvious from examination of the encoder. As the fourth input bit enters the coder at the right, the first data bit falls off on the left end and no longer influences the output code symbols. Consequently, the data sequences 100xy... and 000xy... generate the same code symbols after the third branch and, as is shown in the tree diagram, both nodes labeled a can be joined together. This leads to redrawing the tree diagram as shown in Fig. 3. This has been called a trellis diagram [12], since a trellis is a tree-like structure with remerging branches. We adopt the convention here that code branches produced by a "zero" input bit are shown as solid lines and code branches produced by a "one" input bit are shown dashed. The completely repetitive structure of the trellis diagram suggests a further reduction in the representation of the code to the state diagram of Fig. 4. The "states" of the state diagram are labeled according to the nodes of the trellis diagram. However, since the states correspond merely to the last two input bits to the coder we may use these bits to denote the nodes or states of this diagram. We observe finally that the state diagram can be drawn directly by observing the finite-state machine properties of the encoder and particularly the fact that a four-state directed graph can be used to represent uniquely the input-output relation of the eight-state machine. For the nodes represent the previous two bits while the present bit is indicated by the transition branch; for example, if the encoder (machine) contains 011, this is represented in the diagram by the transition from state b = 01 to state d = 11 and the corresponding branch indicates the code symbol outputs 01.
Fig. 4. State-diagram representation for coder of Fig. 1.
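The encoder and its state diagram can be sketched in a few lines of Python. This is only an illustrative sketch: Fig. 1 is not recoverable here, so the adder connections (first adder on all three register stages, second adder on the newest and oldest stages) are an assumption inferred from the example outputs quoted in the text, and the names `step`, `encode`, and `STATE_NAMES` are hypothetical. A state is written as (older bit, newer bit), which matches the labels b = 01 and d = 11 used above.

```python
from itertools import product

# Assumed tap connections for the K = 3, n = 2, b = 1 example coder:
# adder 1 sums all three stages, adder 2 sums the newest and oldest stages.
STATE_NAMES = {(0, 0): "a", (0, 1): "b", (1, 0): "c", (1, 1): "d"}


def step(state, bit):
    """One transition. state = (older bit, newer bit); returns (next_state, output)."""
    older, newer = state
    output = (bit ^ newer ^ older, bit ^ older)
    return (newer, bit), output


def encode(bits, state=(0, 0)):
    """Encode a bit sequence, returning one two-symbol string per branch."""
    branches = []
    for bit in bits:
        state, (c1, c2) = step(state, bit)
        branches.append(f"{c1}{c2}")
    return branches


if __name__ == "__main__":
    # First three input bits of the example in the text.
    print(encode([0, 1, 1]))
    # The state diagram as a transition list: state --input/output--> next state.
    for state, name in STATE_NAMES.items():
        for bit in (0, 1):
            nxt, out = step(state, bit)
            print(f"{name} --{bit}/{out[0]}{out[1]}--> {STATE_NAMES[nxt]}")
    # Number of distinct output sequences for the first five inputs.
    print(len({tuple(encode(u)) for u in product((0, 1), repeat=5)}))
```

Under these assumed connections the sketch reproduces the outputs 00, 11, 01 for inputs 0, 1, 1, and the transition from b to d carries output 01, as stated in the text.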
III. MINIMUM DISTANCE DECODER FOR BINARY SYMMETRIC CHANNEL
On a BSC, errors which transform a channel code symbol 0 to 1 or 1 to 0 are assumed to occur independently from symbol to symbol with probability p. If all input (message) sequences are equally likely, the decoder which minimizes the overall error probability for any code, block or convolutional, is one which examines the error-corrupted received sequence $y_1 y_2 \cdots y_j \cdots$ and chooses the data sequence corresponding to the transmitted code sequence $x_1 x_2 \cdots x_j \cdots$ which is closest to the received sequence in the sense of Hamming distance; that is, the transmitted sequence which differs from the received sequence in the minimum number of symbols.
Referring first to the tree diagram, this implies that we should choose that path in the tree whose code sequence differs in the minimum number of symbols from the received sequence. However, recognizing that the transmitted code branches remerge continually, we may equally limit our choice to the possible paths in the trellis diagram of Fig. 3. Examination of this diagram indicates that it is unnecessary to consider the entire received sequence (which conceivably could be thousands or millions of symbols in length) at one time in deciding upon the most likely (minimum distance) transmitted sequence. In particular, immediately after the third branch we may determine which of the two paths leading to node or state a is more likely to have been sent. For example, if 010001 is received, it is clear that this is at distance 2 from 000000 while it is at distance 3 from 111011 and consequently we may exclude the lower path into node a. For, no matter what the subsequent received symbols will be, they will affect the distances only over subsequent branches after these two paths have remerged and consequently in exactly the same way. The same can be said for pairs of paths merging at the other three nodes after the third branch. We shall refer to the minimum distance path of the two paths merging at a given node as the "survivor." Thus it is necessary only to remember which was the minimum distance path from the received sequence (or survivor) at each node, as well as the value of that minimum distance. This is necessary because at the next node level we must compare the two branches merging at each node level, which were survivors at the previous level for different nodes; e.g., the comparison at node a after the fourth branch is among the survivors of comparisons at nodes a and c after the third branch.
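The comparison of merging paths after the third branch can be checked by brute force. The sketch below is illustrative only: it assumes the same adder connections as the example coder (inferred from the outputs quoted in the text, not taken from Fig. 1), and the function names are hypothetical. It groups all eight three-branch paths by the state they end in and reports the Hamming distance of each from the received sequence.

```python
from itertools import product


def step(state, bit):
    """Assumed example coder: state = (older bit, newer bit)."""
    older, newer = state
    return (newer, bit), (bit ^ newer ^ older, bit ^ older)


def hamming(a, b):
    """Number of positions in which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def paths_into_states(received, depth):
    """Group all depth-branch paths from state (0, 0) by their final state."""
    table = {}
    for data in product((0, 1), repeat=depth):
        state, code = (0, 0), ""
        for bit in data:
            state, (c1, c2) = step(state, bit)
            code += f"{c1}{c2}"
        table.setdefault(state, []).append((hamming(code, received), code, data))
    return table


if __name__ == "__main__":
    for state, candidates in sorted(paths_into_states("010001", 3).items()):
        # The first entry after sorting is the survivor at this state.
        print(state, sorted(candidates))
```

For the received sequence 010001 this lists, at state a = (0, 0), the two candidates 000000 and 111011 discussed above; the smaller distance in each pair identifies the survivor.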
For example, if the received sequence over the first four branches is 01000111, the survivor at the third node level for node a is 000000 with distance 2 and at node c it is 110101, also with distance 2. In going from the third node level to the fourth the received sequence agrees precisely with the survivor from c but has distance 2 from the survivor from a. Hence the survivor at node a of the fourth level is the data sequence 1100 which produced the code sequence 11010111 which is at (minimum) distance 2 from
the received sequence. In this way we may proceed through the received sequence and at each step for each state preserve one surviving path and its distance from the received sequence, which is more generally called metric. The only difficulty which may arise is the possibility that in a given comparison between merging paths, the distances or metrics are identical. Then we may simply flip a coin as is done for block codewords at equal distances from the received sequence. For even if we preserved both of the equally valid contenders, further received symbols would affect both metrics in exactly the same way and thus not further influence our choice. This decoding algorithm was first proposed by Viterbi [9] in the more general context of arbitrary memoryless channels.
Another description of the algorithm can be obtained from the state-diagram representation of Fig. 4. Suppose we sought that path around the directed state diagram, arriving at node a after the kth transition, whose code symbols are at a minimum distance from the received sequence. But clearly this minimum distance path to node a at time k can be only one of two candidates: the minimum distance path to node a at time k - 1 and the minimum distance path to node c at time k - 1. The comparison is performed by adding the new distance accumulated in the kth transition by each of these paths to their minimum distances (metrics) at time k - 1. It appears thus that the state diagram also represents a system diagram for this decoder. With each node or state we associate a storage register which remembers the minimum distance path into the state after each transition as well as a metric register which remembers its (minimum) distance from the received sequence. Furthermore, comparisons are made at each step between the two paths which lead into each node. Thus four comparators must also be provided. There remains only the question of truncating the algorithm and ultimately deciding on one path rather than four. This is easily done by forcing the last two input bits to the coder to be 00. Then the final state of the code must be a = 00 and consequently the ultimate survivor is the survivor at node a, after the insertion into the coder of the two dummy zeros and transmission of the corresponding four code symbols. In terms of the trellis diagram this means that the number of states is reduced from four to two by the insertion of the first zero and to a single state by the insertion of the second. The diagram is thus truncated in the same way as it was begun.
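The procedure just described (one path register and one metric register per state, an add-compare-select at every node, and two dummy zeros to terminate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the hardware organization discussed in the paper: the adder connections are the ones inferred for the example coder, the function names are hypothetical, and ties are resolved by keeping the first candidate examined instead of flipping a coin.

```python
def step(state, bit):
    """Assumed example coder: state = (older bit, newer bit)."""
    older, newer = state
    return (newer, bit), (bit ^ newer ^ older, bit ^ older)


def encode(bits):
    """Encode from state a = (0, 0); returns a list of 2-symbol tuples."""
    state, out = (0, 0), []
    for bit in bits:
        state, symbols = step(state, bit)
        out.append(symbols)
    return out


def viterbi_survivors(received):
    """Return {state: (metric, data bits)} after processing all received branches."""
    survivors = {(0, 0): (0, [])}  # decoding starts in state a
    for y in received:
        nxt = {}
        for state, (metric, data) in survivors.items():
            for bit in (0, 1):
                new_state, out = step(state, bit)
                # Add: branch Hamming distance accumulated on this transition.
                m = metric + (out[0] != y[0]) + (out[1] != y[1])
                # Compare-select: keep the smaller metric (first one on a tie).
                if new_state not in nxt or m < nxt[new_state][0]:
                    nxt[new_state] = (m, data + [bit])
        survivors = nxt
    return survivors


def decode(received, tail=2):
    """Decode a sequence terminated by `tail` dummy zeros; drop the tail bits."""
    _, data = viterbi_survivors(received)[(0, 0)]
    return data[:-tail]


if __name__ == "__main__":
    # Received sequence 01 00 01 11 from the example in the text.
    print(viterbi_survivors([(0, 1), (0, 0), (0, 1), (1, 1)])[(0, 0)])
```

For the received sequence 01000111 the survivor at node a after the fourth branch is the data sequence 1100 with metric 2, in agreement with the worked example above.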
We shall proceed to generalize these code representations and optimal decoding algorithm to general convolutional codes and arbitrary memoryless channels, including the Gaussian channel, in Sections V and VI. However, first we shall exploit the state diagram further to determine the relative distance properties of binary convolutional codes.
IV. DISTANCE PROPERTIES OF CONVOLUTIONAL CODES
We continue to pursue the example of Fig. 1 for the sake of clarity; in the next section we shall easily generalize results. It is well known that convolutional codes are group codes. Thus there is no loss in generality in computing the distance from the all zeros codeword to all the other codewords, for this set of distances is the same as the set of distances from any specific codeword to all the others. For this purpose we may again use either the trellis diagram or the state diagram. We first of all redraw the trellis diagram in Fig. 5 labeling the branches according to their distances from the all zeros path. Now consider all the paths that merge with the all zeros for the first time at some arbitrary node j.
It is seen from the diagram that of these paths there will be just one path at distance 5 from the all zeros path and this diverged from it three branches back. Similarly there are two at distance 6 from it, one which diverged 4 branches back and the other which diverged 5 branches back, and so forth. We note also that the input bits for the distance 5 path are 00...0100 and thus differ in only one input bit from the all zeros, while the distance 6 paths are 00...01100 and 00...010100 and thus each differs in 2 input bits from the all zeros path. The minimum distance, sometimes called the minimum "free" distance, among all paths is thus seen to be 5. This implies that any pair of channel errors can be corrected, for two errors will cause the received sequence to be at distance 2 from the transmitted (correct) sequence but it will be at least at distance 3 from any other possible code sequence. It appears that with enough patience the distance of all paths from the all zeros (or any arbitrary) path can be so determined from the trellis diagram.
Fig. 5. Trellis diagram labeled with distances from all zeros path.
However, by examining instead the state diagram we can readily obtain a closed form expression whose expansion yields directly and effortlessly all the distance information. We begin by labeling the branches of the state diagram of Fig. 4 either $D^2$, $D$, or $D^0 = 1$, where the exponent corresponds to the distance of the particular branch from the corresponding branch of the all zeros path. Also we split open the node a = 00, since circulation around this self-loop simply corresponds to branches of the all zeros path whose distance from itself is obviously zero. The result is Fig. 6. Now as is clear from examination of the trellis diagram, every path which arrives at state a = 00 at node level j must have at some previous node level (possibly the first) originated at this same state a = 00. All such paths can be traced on the modified state diagram. Adding branch exponents we see that path a b c a is at distance 5 from the correct path, paths a b d c a and a b c b c a are both at distance 6, and so forth, for the generating functions of the output sequence weights of these paths are $D^5$ and $D^6$, respectively.
Fig. 6. State diagram labeled according to distance from all zeros path.
Now we may evaluate the generating function of all paths merging with the all zeros at the jth node level simply by evaluating the generating function of all the weights of the output sequences of the finite-state machine.² The result in this case is

$$T(D) = \frac{D^5}{1 - 2D} = D^5 + 2D^6 + 4D^7 + \cdots + 2^k D^{k+5} + \cdots. \tag{1}$$

This verifies our previous observation and in fact shows that among the paths which merge with the all zeros at a given node there are $2^k$ paths at distance k + 5 from the all zeros.
Of course, (1) holds for an infinitely long code sequence; if we are dealing with the jth node level, we must truncate the series at some point. This is most easily done by considering the additional information indicated in the modified state diagram of Fig. 7. The L terms will be used to determine the length of a given path; since each branch has an L, the exponent of the L factor will be augmented by one every time a branch is passed through. The N term is included only if that branch transition was caused by an input data "one," corresponding to a dotted branch in the trellis diagram. The generating function of this augmented state diagram is then
$$T(D, L, N) = \frac{D^5 L^3 N}{1 - DL(1 + L)N} = D^5 L^3 N + D^6 L^4 (1 + L) N^2 + D^7 L^5 (1 + L)^2 N^3 + \cdots + D^{5+k} L^{3+k} (1 + L)^k N^{1+k} + \cdots. \tag{2}$$
Thus we have verified that of the two distance 6 paths one is of length 4 and the other is of length 5 and both differ in 2 input bits from the all zeros.³ Also, of the distance 7 paths, one is of length 5, two are of length 6, and one is of length 7; all four paths correspond to input sequences with three ones. If we are interested in the jth node level, clearly we should truncate the series such that no terms of power greater than $L^j$ are included. We have thus fully determined the properties of all paths in the convolutional code. This will be useful later in evaluating error probability performance of codes used over arbitrary memoryless channels.
² Alternatively, this can be regarded as the transfer function of the diagram regarded as a signal flow graph.
³ Thus if the all zeros was the correct path and the noise causes us to choose one of the incorrect paths, two bit errors will be made.
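The coefficients of (1) and (2) can be cross-checked by enumerating paths on the split state diagram directly. The following is a sketch, not the generating-function method itself: the branch table is transcribed from the branch distances described in the text for Fig. 6 (with the input bit added as in Fig. 7), the names `BRANCHES`, `first_return_paths`, and `by_distance` are hypothetical, and paths are truncated at a chosen maximum length.

```python
from collections import Counter

# Split state diagram: state -> [(next state, branch distance, input bit)].
# The initial branch a -> b (distance 2, input 1) is handled as the start.
BRANCHES = {
    "b": [("c", 1, 0), ("d", 1, 1)],
    "c": [("a", 2, 0), ("b", 0, 1)],
    "d": [("c", 1, 0), ("d", 1, 1)],
}


def first_return_paths(max_length):
    """Count paths a -> ... -> a (no intermediate a) by (distance, length, ones)."""
    counts = Counter()
    frontier = Counter({("b", 2, 1, 1): 1})  # after the first branch a -> b
    while frontier:
        nxt = Counter()
        for (state, dist, length, ones), n in frontier.items():
            if length == max_length:
                continue  # truncate: do not extend beyond max_length branches
            for dst, weight, bit in BRANCHES[state]:
                key = (dist + weight, length + 1, ones + bit)
                if dst == "a":
                    counts[key] += n  # path has remerged with the all zeros
                else:
                    nxt[(dst,) + key] += n
        frontier = nxt
    return counts


def by_distance(counts):
    """Collapse (distance, length, ones) counts to counts per distance."""
    totals = Counter()
    for (dist, _length, _ones), n in counts.items():
        totals[dist] += n
    return totals


if __name__ == "__main__":
    counts = first_return_paths(9)
    print(sorted(by_distance(counts).items()))
```

Because a path at distance 5 + k has length at most 3 + 2k, a truncation length of 9 gives complete counts only up to distance 8; larger distances are partially counted and should be ignored or the truncation length increased.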
Fig. 7. State diagram labeled according to distance, length, and number of input ones.
Fig. 8. Coder for K = 2, b = 2, n = 3, and R = 2/3.
V. GENERALIZATION TO ARBITRARY CONVOLUTIONAL CODES
The generalization of these techniques to arbitrary binary-tree (b = 1) convolutional codes is immediate. That is, a coder with a K-stage shift register and n mod-2 adders will produce a trellis or state diagram with $2^{K-1}$ nodes or states and each branch will contain n code symbols. The rate of this code is then

$$R = \frac{1}{n} \ \text{bits/code symbol}.$$
The example pursued in the previous sections had rate R = 1/2. The primary characteristic of the binary-tree codes is that only two branches exit from and enter each node. If rates other than 1/n are desired we must make b > 1, where b is the number of bits shifted into the register at one time. An example for K = 2, b = 2, n = 3, and consequently rate R = 2/3 is shown in Fig. 8 and its state diagram is shown in Fig. 9. It differs from the binary-tree codes only in that each node is connected to four other nodes, and for general b it will be connected to $2^b$ nodes. All the preceding techniques, including the trellis and state-diagram generating function analysis, are still applicable. It must be noted, however, that the minimum distance decoder must make comparisons among all the paths entering each node at each level of the trellis and select one survivor out of four (or out of $2^b$ in general).
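As a small worked illustration of these parameters, the sketch below computes the rate, the number of states, and the number of branches per node from K, b, and n. The function name is hypothetical, and the state count $2^{b(K-1)}$ for b > 1 is an assumption made here: it reduces to the $2^{K-1}$ stated in the text for b = 1 and gives four states for the K = 2, b = 2 example, but the general formula is not stated in this section.

```python
from fractions import Fraction


def code_parameters(K, b, n):
    """Rate, state count, and branches per node for a K-stage, b-bit-shift coder."""
    return {
        "rate": Fraction(b, n),              # b input bits per n code symbols
        "states": 2 ** (b * (K - 1)),        # assumed generalization of 2**(K-1)
        "branches_per_node": 2 ** b,         # paths compared at each node
    }


if __name__ == "__main__":
    print(code_parameters(3, 1, 2))  # example of Fig. 1
    print(code_parameters(2, 2, 3))  # example of Fig. 8
```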
Fig. 9. State diagram for code of Fig. 8.
VI. GENERALIZATION OF OPTIMAL DECODER TO ARBITRARY MEMORYLESS CHANNELS
Fig. 10 exhibits a communication system employing a convolutional code. The convolutional encoder is precisely the device studied in the preceding sections. The data sequence is generally binary ($a_j$ = 0 or 1) and the code sequence is divided into subsequences where $x_j$ represents the n code symbols generated just after the input bit $a_j$ enters the coder; that is, the symbols of the jth branch. In terms of the example of Fig. 1, $a_3 = 1$ and $x_3 = 01$. The channel output or received sequence is similarly denoted: $y_j$ represents the n symbols received when the n code symbols of $x_j$ were transmitted. This model includes the BSC wherein the $y_j$ are binary n-vectors each of whose symbols differs from the corresponding symbol of $x_j$ with probability p and is identical to it with probability 1 - p.
For completely general channels it is readily shown [6], [14] that if all input data sequences are equally likely, the decoder which minimizes the error probability is one which compares the conditional probabilities, also called likelihood functions, $P(y \mid x^{(m)})$, where $y$ is the overall received sequence and $x^{(m)}$ is one of the possible transmitted sequences, and decides in favor of the maximum. This is called a maximum likelihood decoder. The likelihood functions are given or computed from the specifications of the channel. Generally it is more convenient to compare the quantities $\log P(y \mid x^{(m)})$, called the log-likelihood functions, and the result is unaltered since the logarithm is a monotonic function of its (always positive) argument. To illustrate, let us consider again the BSC. Here each transmitted symbol is altered with probability p < 1/2. Now suppose we have received a particular N-dimensional binary sequence $y$ and are considering a possible transmitted N-dimensional code sequence $x^{(m)}$ which differs in $d_m$ symbols from $y$ (that is, the Hamming distance between $x^{(m)}$ and $y$ is $d_m$). Then since the channel is memoryless (i.e., it affects each symbol independently of all the others), the probability
381
Fifty Years ofCommunications and Networking
....-
DATA SEQUENCE
-. CODE
CONVO~UTIONAL
MEMORVLESS
SeQUENCE
ENCODER
CHANNEL
RECEIVED
SEQUENCE . - - - ......
(INCLUDING MODEM)
'1,12· .. 8t...
Fig. 10. Communication system employing convolutional .codes.
that this x(fn) was transformed to the specific received y at distance d". from it is P(y
I x(m». =
pd"'(l _
I x(fn»
=
-d m log (1 -
pip)
+ N log (1 -
I - -........-
p)
Now if we compute this quantity for each possible transmitted sequence, it is clear that the second term is constant in each case. Furthermore, since we may assume $p < 1/2$ (otherwise the role of 0 and 1 is simply interchanged at the receiver), we may express this as

$$\log P(y \mid x^{(m)}) = -\alpha d_m - \beta \tag{3}$$
where $\alpha$ and $\beta$ are positive constants and $d_m$ is the (positive) distance. Consequently, it is clear that maximizing the log-likelihood function is equivalent to minimizing the Hamming distance $d_m$. Thus for the BSC to minimize the error probability we should choose that code sequence at minimum distance from the received sequence, as we have indicated and done in preceding sections.
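The equivalence between maximizing the BSC log-likelihood and minimizing Hamming distance can be checked on a toy example. The sketch below is only illustrative: the four candidate sequences and the received word are hypothetical and are not taken from any code in this paper, and the decoder is a brute-force search rather than the trellis decoder described in the text.

```python
import math

def hamming(a, b):
    """Number of positions in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def log_likelihood_bsc(y, x, p):
    """log P(y | x) on a BSC with crossover probability p: d errors, N - d correct."""
    d = hamming(y, x)
    return d * math.log(p) + (len(y) - d) * math.log(1 - p)

def ml_decode_bsc(y, candidates, p):
    """Brute-force maximum likelihood choice among the candidate sequences."""
    return max(candidates, key=lambda x: log_likelihood_bsc(y, x, p))

def min_distance_decode(y, candidates):
    """Brute-force minimum Hamming distance choice among the candidates."""
    return min(candidates, key=lambda x: hamming(y, x))

# Hypothetical candidate set and received sequence (assumptions for illustration).
candidates = [(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (0, 0, 1, 1, 1), (1, 1, 0, 1, 1)]
received = (1, 0, 1, 0, 0)

print(ml_decode_bsc(received, candidates, p=0.1))
print(min_distance_decode(received, candidates))
```

For any $p < 1/2$ the two functions select the same sequence, because the log-likelihood is a decreasing linear function of the distance.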
We now consider a more physical practical channel: the AWGN channel with biphase⁴ phase-shift keying (PSK) modulation. The modulator and optimum demodulator (correlator or integrate-and-dump filter) for this channel are shown in Fig. 11. We use the notation that $x_{jk}$ is the kth code symbol for the jth branch. Each binary symbol (which we take here for convenience to be ±1) modulates the carrier by ±π/2 radians for T seconds. The transmission rate is, therefore, 1/T symbols/s or b/nT = R/T bits/s. The quantity $\mathcal{E}_s$ is the energy transmitted for each symbol. The energy per bit is, therefore, $\mathcal{E}_b = \mathcal{E}_s/R$. The white Gaussian noise is a zero-mean random process of one-sided spectral density $N_0$ W/Hz, which affects each symbol independently. It then follows directly that the channel output symbol $y_{jk}$ is a Gaussian random variable whose mean is $\sqrt{\mathcal{E}_s}\,x_{jk}$ (i.e., $+\sqrt{\mathcal{E}_s}$ if $x_{jk} = 1$ and $-\sqrt{\mathcal{E}_s}$ if $x_{jk} = -1$) and whose variance is $N_0/2$. Thus the conditional probability density (or likelihood) function of $y_{jk}$ given $x_{jk}$ is
$$p(y_{jk} \mid x_{jk}) = \frac{\exp\left[-(y_{jk} - \sqrt{\mathcal{E}_s}\,x_{jk})^2 / N_0\right]}{\sqrt{\pi N_0}}. \tag{4}$$
4 The results are the same for quadriphase PSK with coherent reception. The analysis proceeds in the same way, if we treat quadriphase PSK as two parallel independent biphase PSK channels.
Fig. 11. Modem for additive white Gaussian noise PSK-modulated memoryless channel. (Block diagram: code symbols $x_{jk}$, PSK modulator, additive white Gaussian noise $n(t)$, correlator demodulator.)
The likelihood function for the jth branch of a particular code path $x_j^{(m)}$ is

$$P(y_j \mid x_j^{(m)}) = \prod_{k=1}^{n} p(y_{jk} \mid x_{jk}^{(m)})$$
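since each symbol is affected independently by the white Gaussian noise, and thus the log-likelihood function for the jth branch is

$$\begin{aligned}
\ln P(y_j \mid x_j^{(m)}) &= \sum_{k=1}^{n} \ln p(y_{jk} \mid x_{jk}^{(m)}) \\
&= -\frac{1}{N_0}\sum_{k=1}^{n}\left[y_{jk} - \sqrt{\mathcal{E}_s}\,x_{jk}^{(m)}\right]^2 - \frac{n}{2}\ln \pi N_0 \\
&= \frac{2\sqrt{\mathcal{E}_s}}{N_0}\sum_{k=1}^{n} y_{jk}x_{jk}^{(m)} - \frac{\mathcal{E}_s}{N_0}\sum_{k=1}^{n}\left[x_{jk}^{(m)}\right]^2 - \frac{1}{N_0}\sum_{k=1}^{n} y_{jk}^2 - \frac{n}{2}\ln \pi N_0 \\
&= C\sum_{k=1}^{n} y_{jk}x_{jk}^{(m)} - D
\end{aligned} \tag{5}$$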
where C and D are independent of m, and we have used the fact that $[x_{jk}^{(m)}]^2 = 1$. Similarly, the log-likelihood⁵ function for any path is the sum of the log-likelihood functions for each of its branches. We have thus shown that the maximum likelihood decoder for the memoryless AWGN biphase (or quadriphase) modulated channel is one which forms the inner product between the received (real number) sequence and the code sequence (consisting of ±1) and chooses the path corresponding to the greatest. Thus the metric for this channel is the inner product (5) as contrasted with the distance⁶ metric used for the BSC.
5 We have used the natural logarithm here, but obviously a change of base results merely in a scale factor. 6 Actually it is easily shown that maximizing an inner product is equivalent to minimizing the Euclidean distance between the corresponding vectors.
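As a small numerical illustration of the inner-product metric and of the remark that it is equivalent to minimum Euclidean distance, the following sketch compares the two rules by brute force. The ±1 candidate sequences and the received values are hypothetical, chosen only for the example.

```python
def correlation(y, x):
    """Inner product between received values y and a +/-1 code sequence x."""
    return sum(a * b for a, b in zip(y, x))

def squared_distance(y, x):
    """Squared Euclidean distance between y and x."""
    return sum((a - b) ** 2 for a, b in zip(y, x))

def decode_by_correlation(y, candidates):
    """Choose the candidate with the greatest inner product."""
    return max(candidates, key=lambda x: correlation(y, x))

def decode_by_distance(y, candidates):
    """Choose the candidate closest to y in Euclidean distance."""
    return min(candidates, key=lambda x: squared_distance(y, x))

# Hypothetical +/-1 sequences and demodulator outputs (assumptions for illustration).
candidates = [(1, 1, 1, 1), (1, -1, 1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1)]
received = (0.9, -0.2, 1.3, -0.4)

print(decode_by_correlation(received, candidates))
print(decode_by_distance(received, candidates))
```

The two rules agree because every ±1 sequence of a given length has the same energy, so the squared distance differs from minus twice the inner product only by terms that do not depend on the candidate.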
THE BEST OF THE BEST
For convolutional codes the structure of the code paths was described in Sections II-V. In Section III the optimum decoder was derived for the BSC. It now becomes clear that if we substitute the inner product metric $\sum y_{jk}x_{jk}^{(m)}$ for the distance metric $\sum d_{jk}^{(m)}$, used for the BSC, all the arguments used in Section III for the latter apply equally to this Gaussian channel. In particular the optimum decoder has a block diagram represented by the code state diagram. At step j the stored metric for each state (which is the maximum of the metrics of all the paths leading to this state at this time) is augmented by the branch metrics for branches emanating from this state. The comparisons are performed among all pairs of (or in general sets of $2^b$) branches entering each state and the maxima are selected as the new most likely paths. The history (input data) of each new survivor must again be stored and the decoder is now ready for step j + 1.
Clearly, this argument generalizes to any memoryless channel and we must simply use the appropriate metric $\ln P(y \mid x^{(m)})$, which may always be determined from the statistical description of the channel. This includes, among others, AWGN channels employing other forms of modulation.⁷
In the next section, we apply the analysis of convolutional code distance properties of Section IV to determine the error probabilities of specific codes on more general memoryless channels.
7 Although more elaborate modulators, such as multiple FSK or multiphase modulators, might be employed, Jacobs [11] has shown that the most effective as well as the simplest system for wide-band space and satellite channels is the binary PSK modulator considered in the example of this section. We note again that the performance of quadriphase modulation is the same as for biphase modulation, when both are coherently demodulated.
VII. PERFORMANCE OF CONVOLUTIONAL CODES ON MEMORYLESS CHANNELS
In Section IV we analyzed the distance properties of convolutional codes employing a state-diagram generating function technique. We now extend this approach to obtain tight upper bounds on the error probability of such codes. We shall consider the BSC, the AWGN channel and more general memoryless channels, in that order. We shall obtain both the first-event error probability, which is the probability that the correct path is excluded (not a survivor) for the first time at the jth step, and the bit error probability, which is the expected ratio of bit errors to total number of bits transmitted.
A. Binary Symmetric Channel
The first-event error probability is readily obtained from the generating function T(D). Taking the all zeros path to be the correct (transmitted) path, a first-event error is made at the jth step if this path is excluded by selecting another path merging with the all zeros at node a at the jth level. Now suppose that the previous-level survivors were such that the path compared with the all zeros at step j is the path whose data sequence is 00 ··· 0100, corresponding to nodes a ··· a a b c a (see Fig. 4). This differs from the correct (all zeros) path in five symbols. Consequently an error will be made in this comparison if the BSC caused three or more errors in these particular five symbols. Hence the probability of an error in this specific comparison is

$$P_5 = \sum_{e=3}^{5} \binom{5}{e} p^e (1 - p)^{5-e}. \tag{6}$$

On the other hand, there is no assurance that this particular distance five path will have previously survived so as to be compared with the correct path at the jth step. If either of the distance 6 paths were compared instead, then four or more errors in the six different symbols will definitely cause an error in the survivor decision, while three errors will cause a tie which, if resolved by coin flipping, will result in an error only half the time. Then the probability if this comparison is made is

$$P_6 = \frac{1}{2}\binom{6}{3} p^3 (1 - p)^3 + \sum_{e=4}^{6} \binom{6}{e} p^e (1 - p)^{6-e}. \tag{7}$$

Similarly, if the previously surviving paths were such that a distance k path is compared with the correct path at the jth step, the resulting error probability is

$$P_k = \begin{cases}
\displaystyle\sum_{e=(k+1)/2}^{k} \binom{k}{e} p^e (1 - p)^{k-e}, & k \text{ odd} \\[2ex]
\displaystyle\frac{1}{2}\binom{k}{k/2} p^{k/2} (1 - p)^{k/2} + \sum_{e=k/2+1}^{k} \binom{k}{e} p^e (1 - p)^{k-e}, & k \text{ even.}
\end{cases} \tag{8}$$
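The pairwise error probability of (8) is straightforward to evaluate numerically. The following is a minimal sketch, with a function name of our own choosing, that treats the odd and even cases together and counts ties for even k with weight one half.

```python
from math import comb

def pairwise_error_bsc(k, p):
    """P_k of (8): probability that a path at distance k beats the correct path
    on a BSC with crossover probability p, ties being broken by a fair coin."""
    # More than half of the k differing symbols in error: a certain wrong decision.
    total = sum(comb(k, e) * p**e * (1 - p)**(k - e)
                for e in range(k // 2 + 1, k + 1))
    if k % 2 == 0:
        # Exactly k/2 errors: a tie, resolved wrongly half the time.
        total += 0.5 * comb(k, k // 2) * (p * (1 - p))**(k // 2)
    return total

for k in (5, 6, 7):
    print(k, pairwise_error_bsc(k, 0.01))
```

A quick sanity check is that at $p = 1/2$ the channel output carries no information, so the function returns 1/2 for every k.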
Now at step j, since there is no simple way of determining previous survivors, we may overbound the probability of a first-event error by the sum of the error probabilities for all possible paths which merge with the correct path at this point. Note this union bound is indeed an upper bound because two or more such paths may both have distance closer to the received sequence than the correct path (even though only one has survived to this point) and thus the events are not disjoint. For the example with generating function (1) it follows that the first-event error probability⁸ is bounded by
$$P_E < P_5 + 2P_6 + 4P_7 + \cdots + 2^k P_{k+5} + \cdots \tag{9}$$
where $P_k$ is given by (8). In Section VII-C it will be shown that (8) can be upper bounded by (see (39))

$$P_k < 2^k p^{k/2} (1 - p)^{k/2}. \tag{10}$$

8 We are ignoring the finite length of the path, but the expression is still valid since it is an upper bound.
Using this, the first-event error probability bound (9) can be more loosely bounded by

$$P_E < \sum_{k=5}^{\infty} 2^{k-5}\, 2^k p^{k/2} (1 - p)^{k/2} = \frac{\left[2\sqrt{p(1-p)}\right]^5}{1 - 4\sqrt{p(1-p)}} = T(D)\Big|_{D = 2\sqrt{p(1-p)}} \tag{11}$$
where T(D) is just the generating function of (1). It follows easily that for a general binary-tree (b = 1) convolutional code with generating function

$$T(D) = \sum_{k=d}^{\infty} a_k D^k \tag{12}$$

the first-event error probability is bounded by the generalization of (9)

$$P_E < \sum_{k=d}^{\infty} a_k P_k \tag{13}$$

where $P_k$ is given by (8), and more loosely upper bounded by the generalization of (11)

$$P_E < T(D)\Big|_{D = 2\sqrt{p(1-p)}}. \tag{14}$$

Whenever a decision error occurs, one or more bits will be incorrectly decoded. Specifically, those bits in which the path selected differs from the correct path will be incorrect. If only one error were ever made in decoding an arbitrarily long code path, the number of bits in error in this incorrect path could easily be obtained from the augmented generating function T(D, N) (such as given by (2) with factors in L deleted). For the exponents of the N factors indicate the number of bit errors for the given incorrect path arriving at node a at the jth level.
After the first error has been made, the incorrect paths no longer will be compared with a path which is overall correct, but rather with a path which has diverged from the correct path over some span of branches (see Fig. 12). If the correct path x has been excluded by a decision error at step j in favor of path x', the decision at step j + 1 will be between x' and x''. Now the (first-event) error probability of (13) or (14) is for a comparison, at any step, between path x and any other path merging with it at that step, including path x'' in this case. However, since the metric⁹ for path x' is greater than the metric for x, for on this basis the correct path was excluded at step j, the probability that path x'' metric exceeds path x' metric at step j + 1 is less than the probability that path x'' exceeds the (correct) path x metric at this point. Consequently, the probability of a new incorrect path being selected after a previous error has occurred is upper bounded by the first-event error probability at that step.
Fig. 12. Example of decoding decision after initial error has occurred. (Diagram: correct path x, incorrect survivor x'.)
Moreover, when a second error follows closely after a first error, it often occurs (as in Fig. 12) that the erroneous bit(s) of path x'' overlap the erroneous bit(s) of path x'. With this in mind, we now show that for a binary-tree code if we weight each term of the first-event error probability bound at any step by the number of erroneous bits for each possible erroneous path merging with the correct path at that node level, we upper bound the bit error probability. For, a given step decision corresponds to decoder action on one more bit of the transmitted data sequence; the first-event error probability union bound with each term weighted by the corresponding number of bit errors is an upper bound on the expected number of bit errors caused by this action. Summing the expected number of bit errors over L steps, which as was just shown may result in overestimating through double counting, gives an upper bound on the expected number of bit errors in L branches for arbitrary L. But since the upper bound on expected number of bit errors is the same at each step, it follows, upon dividing the sum of L equal terms by L, that this expected number of bit errors per step is just the bit error probability $P_B$, for a binary-tree code (b = 1). If b > 1, then we must divide this expression by b, the number of bits encoded and decoded per step.
To illustrate the calculation of $P_B$ for a convolutional code, let us consider again the example of Fig. 1. Its transfer function in D and N is obtained from (2), letting L = 1, since we are not now interested in the lengths of incorrect paths, to be

$$T(D, N) = \frac{D^5 N}{1 - 2DN} = D^5 N + 2D^6 N^2 + \cdots + 2^k D^{k+5} N^{k+1} + \cdots \tag{15}$$

The exponents of the factors in N in each term determine the number of bit errors for the path(s) corresponding to that term. Since $T(D) = T(D, N)|_{N=1}$ yields the first-event error probability $P_E$, each of whose terms must be weighted by the exponent of N to obtain $P_B$, it follows that we should first differentiate T(D, N) at N = 1 to obtain

$$\frac{dT(D, N)}{dN}\bigg|_{N=1} = D^5 + 2 \cdot 2D^6 + 3 \cdot 4D^7 + \cdots + (k + 1)2^k D^{k+5} + \cdots = \frac{D^5}{(1 - 2D)^2}. \tag{16}$$

9 Negative distance from the received sequence for the BSC, but clearly this argument generalizes to any memoryless channel.
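The union bound (9), its looser closed form (11), and the series identity (16) can be compared numerically for the example code. The sketch below is illustrative only: the helper names and the truncation lengths are our own choices, and the infinite sums are simply cut off after a fixed number of terms, which is adequate when $2D < 1$.

```python
from math import comb, sqrt

def pairwise_error_bsc(k, p):
    """P_k of (8), with ties for even k counted with weight one half."""
    total = sum(comb(k, e) * p**e * (1 - p)**(k - e)
                for e in range(k // 2 + 1, k + 1))
    if k % 2 == 0:
        total += 0.5 * comb(k, k // 2) * (p * (1 - p))**(k // 2)
    return total

def first_event_union_bound(p, terms=40):
    """Truncated version of (9): sum of 2^k P_{k+5}."""
    return sum(2**k * pairwise_error_bsc(k + 5, p) for k in range(terms))

def first_event_closed_form(p):
    """Closed form (11): T(D) = D^5 / (1 - 2D) at D = 2 sqrt(p(1-p))."""
    D = 2 * sqrt(p * (1 - p))
    if 2 * D >= 1:
        raise ValueError("series diverges: need 4*sqrt(p(1-p)) < 1")
    return D**5 / (1 - 2 * D)

def bit_weight_series(D, terms=200):
    """Truncated left-hand side of (16): sum of (k+1) 2^k D^(k+5)."""
    return sum((k + 1) * 2**k * D**(k + 5) for k in range(terms))

p = 0.01
print(first_event_union_bound(p), first_event_closed_form(p))
print(bit_weight_series(0.2), 0.2**5 / (1 - 2 * 0.2)**2)
```

At small p the bound built from the exact $P_k$ is noticeably tighter than the closed form, which is the price paid for the simpler expression.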
Then from this we obtain, as in (9), that for the BSC

$$P_B < P_5 + 2 \cdot 2P_6 + 3 \cdot 4P_7 + \cdots + (k + 1)2^k P_{k+5} + \cdots \tag{17}$$

where $P_k$ is given by (8). If for $P_k$ we use the upper bound (10) we obtain the weaker but simpler bound

$$P_B < \sum_{k=5}^{\infty} (k - 4)2^{k-5}\left[4p(1 - p)\right]^{k/2} = \frac{dT(D, N)}{dN}\bigg|_{N=1,\,D=2\sqrt{p(1-p)}} = \frac{\left[2\sqrt{p(1-p)}\right]^5}{\left[1 - 4\sqrt{p(1-p)}\right]^2}. \tag{18}$$

More generally for any binary-tree (b = 1) code used on the BSC if

$$\frac{dT(D, N)}{dN}\bigg|_{N=1} = \sum_{k=d}^{\infty} c_k D^k \tag{19}$$

then corresponding to (17)

$$P_B < \sum_{k=d}^{\infty} c_k P_k \tag{20}$$

and corresponding to (18) we have the weaker bound

$$P_B < \frac{dT(D, N)}{dN}\bigg|_{N=1,\,D=2\sqrt{p(1-p)}}. \tag{21}$$

For a nonbinary-tree code (b ≠ 1), all these expressions must be divided by b. The results of (14) and (18) will be extended to more general memoryless channels, but first we shall consider one more specific channel of particular interest.
B. AWGN Biphase-Modulated Channel
As was shown in Section VI the decoder for this channel operates in exactly the same way as for the BSC, except that instead of Hamming distance it uses the metric

$$\sum_{i}\sum_{j=1}^{n} x_{ij} y_{ij}$$

where $x_{ij} = \pm 1$ are the transmitted code symbols, $y_{ij}$ the corresponding received (demodulated) symbols, and j runs over the n symbols of each branch while i runs over all the branches in a particular path. Hence, to analyze its performance we may proceed exactly as in Section VII-A except that the appropriate pairwise-decision errors $P_k$ must be substituted for those of (6) to (8). As before we assume, without loss of generality, that the correct (transmitted) path x has $x_{ij} = +1$ for all i and j (corresponding to the all zeros if the input symbols were 0 and 1). Let us consider an incorrect path x' merging with the correct path at a particular step, which has k negative symbols ($x_{ij}' = -1$) and the remainder positive. Such a path may be incorrectly chosen only if it has a higher metric than the correct path, i.e.,

$$\sum_{i}\sum_{j=1}^{n} x_{ij}' y_{ij} \ge \sum_{i}\sum_{j=1}^{n} x_{ij} y_{ij}$$

where i runs over all branches in the two paths. But since, as we have assumed, the paths x and x' differ in exactly k symbols, wherein $x_{ij} = 1$ and $x_{ij}' = -1$, the pairwise error probability is just

$$\begin{aligned}
P_k &= \Pr\left\{\sum_{i}\sum_{j=1}^{n} x_{ij}' y_{ij} \ge \sum_{i}\sum_{j=1}^{n} x_{ij} y_{ij}\right\}
= \Pr\left\{\sum_{i}\sum_{j=1}^{n} (x_{ij}' - x_{ij}) y_{ij} \ge 0\right\} \\
&= \Pr\left\{\sum_{r=1}^{k} (x_r' - x_r) y_r \ge 0\right\}
= \Pr\left\{-2\sum_{r=1}^{k} y_r \ge 0\right\}
= \Pr\left\{\sum_{r=1}^{k} y_r \le 0\right\}
\end{aligned} \tag{22}$$

where r runs over the k symbols wherein the two paths differ. Now it was shown in Section VI that the $y_{ij}$ are independent Gaussian random variables of variance $N_0/2$ and mean $\sqrt{\mathcal{E}_s}\,x_{ij}$, where $x_{ij}$ is the actually transmitted code symbol. Since we are assuming that the (correct) transmitted path has $x_{ij} = +1$ for all i and j, it follows that $y_{ij}$ or $y_r$ has mean $\sqrt{\mathcal{E}_s}$ and variance $N_0/2$. Therefore, since the k variables $y_r$ are independent and Gaussian, the sum $Z = \sum_{r=1}^{k} y_r$ is also Gaussian with mean $k\sqrt{\mathcal{E}_s}$ and variance $kN_0/2$. Consequently,

$$P_k = \Pr(Z < 0) = \int_{-\infty}^{0} \frac{\exp\left[-(z - k\sqrt{\mathcal{E}_s})^2 / kN_0\right]}{\sqrt{\pi k N_0}}\,dz
= \int_{\sqrt{2k\mathcal{E}_s/N_0}}^{\infty} \frac{\exp(-x^2/2)}{\sqrt{2\pi}}\,dx \triangleq \operatorname{erfc}\sqrt{\frac{2k\mathcal{E}_s}{N_0}}. \tag{23}$$

We recall from Section VI that $\mathcal{E}_s$ is the symbol energy, which is related to the bit energy by $\mathcal{E}_s = R\,\mathcal{E}_b$, where R = b/n. The bound on $P_E$ then follows exactly as in Section VII-A and we obtain the same general bound as (13)

$$P_E < \sum_{k=d}^{\infty} a_k P_k \tag{24}$$

where $a_k$ are the coefficients of

$$T(D) = \sum_{k=d}^{\infty} a_k D^k \tag{25}$$

and where d is the minimum distance between any two paths in the code. We may simplify this procedure considerably while loosening the bound only slightly for this channel by observing that for $x \ge 0$, $y \ge 0$,

$$\operatorname{erfc}\sqrt{x + y} \le \exp\left(-\frac{y}{2}\right) \operatorname{erfc}\sqrt{x}. \tag{26}$$
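Note that the function written erfc in (23) is the Gaussian tail integral, not the complementary error function of most numerical libraries; the two are related by a change of variable. The following sketch, whose function names are our own, expresses the tail through Python's `math.erfc` and spot-checks the inequality $\operatorname{erfc}\sqrt{x+y} \le e^{-y/2}\operatorname{erfc}\sqrt{x}$ on a few sample points. It is a numerical check, not a proof.

```python
from math import erfc, exp, sqrt

def gaussian_tail(x):
    """Integral from x to infinity of exp(-t^2/2)/sqrt(2*pi), i.e. the
    function the text denotes by erfc; expressed via the library erfc."""
    return 0.5 * erfc(x / sqrt(2.0))

def pairwise_error_awgn(k, es_over_n0):
    """P_k of (23) for a path differing in k symbols, given E_s/N_0."""
    return gaussian_tail(sqrt(2.0 * k * es_over_n0))

def tail_inequality_holds(x, y):
    """Check tail(sqrt(x+y)) <= exp(-y/2) * tail(sqrt(x)) at one point."""
    return gaussian_tail(sqrt(x + y)) <= exp(-y / 2.0) * gaussian_tail(sqrt(x))

samples = (0.0, 0.5, 2.0, 8.0)
print(all(tail_inequality_holds(x, y) for x in samples for y in samples))
print(pairwise_error_awgn(5, 1.0))
```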
Consequently, for $k \ge d$, letting $l = k - d$, we have from (23)

$$P_k = \operatorname{erfc}\sqrt{\frac{2k\mathcal{E}_s}{N_0}} = \operatorname{erfc}\sqrt{\frac{2(d + l)\mathcal{E}_s}{N_0}} \le \exp\left(-\frac{l\,\mathcal{E}_s}{N_0}\right) \operatorname{erfc}\sqrt{\frac{2d\,\mathcal{E}_s}{N_0}} \tag{27}$$

whence the bound of (24), using (27), becomes

$$P_E < \sum_{k=d}^{\infty} a_k P_k < \operatorname{erfc}\sqrt{\frac{2d\,\mathcal{E}_s}{N_0}} \sum_{k=d}^{\infty} a_k \exp\left[-\frac{(k - d)\mathcal{E}_s}{N_0}\right]$$

or

$$P_E < \operatorname{erfc}\sqrt{\frac{2d\,\mathcal{E}_s}{N_0}}\, \exp\left(\frac{d\,\mathcal{E}_s}{N_0}\right) T(D)\Big|_{D = \exp(-\mathcal{E}_s/N_0)}. \tag{28}$$

The bit error probability can be obtained in exactly the same way. Just as for the BSC [(19) and (20)] we have that for a binary-tree code

$$P_B < \sum_{k=d}^{\infty} c_k P_k \tag{29}$$

where $c_k$ are the coefficients of

$$\frac{dT(D, N)}{dN}\bigg|_{N=1} = \sum_{k=d}^{\infty} c_k D^k. \tag{30}$$

Thus following the same arguments which led from (24) to (28) we have for a binary-tree code

$$P_B < \operatorname{erfc}\sqrt{\frac{2d\,\mathcal{E}_s}{N_0}}\, \exp\left(\frac{d\,\mathcal{E}_s}{N_0}\right) \frac{dT(D, N)}{dN}\bigg|_{N=1,\,D=\exp(-\mathcal{E}_s/N_0)}. \tag{31}$$

For b > 1, this expression must be divided by b. To illustrate the application of this result we consider the code of Fig. 1 with parameters K = 3, R = 1/2, whose transfer function is given by (15). For this case, since R = 1/2 and $\mathcal{E}_s = \mathcal{E}_b/2$, we obtain

$$P_B < \frac{\operatorname{erfc}\sqrt{5\mathcal{E}_b/N_0}}{\left(1 - 2e^{-\mathcal{E}_b/2N_0}\right)^2}. \tag{32}$$

Since the number of states in the state diagram grows exponentially with K, direct calculation of the generating function becomes unmanageable for K > 4. On the other hand, a generating function calculation is basically just a matrix inversion (see Appendix I), which can be performed numerically for a given value of D. The derivative at N = 1 can be upper bounded by evaluating the first difference $[T(D, 1 + \epsilon) - T(D, 1)]/\epsilon$, for small $\epsilon$. A computer program has been written to evaluate (31) for any constraint length up to K = 10 and all rates R = 1/n as well as R = 2/3 and R = 3/4. Extensive results of these calculations are given in the paper by Heller and Jacobs [24], along with the results of simulations of the corresponding codes and channels. The simulations verify the tightness of the bounds.
In the next section, these bounding techniques will be extended to more general memoryless channels, from which (28) and (31) can be obtained directly, but without the first two factors. Since the product of the first two factors is always less than one, the more general bound is somewhat weaker.
C. General Memoryless Channels
As was indicated in Section VI, for equally likely input data sequences, the minimum error probability decoder chooses the path which maximizes the log-likelihood function (metric)

$$\ln P(y \mid x^{(m)})$$

over all possible paths $x^{(m)}$. If each symbol is transmitted (or modulates the transmitter) independent of all preceding and succeeding symbols, and the interference corrupts each symbol independently of all the others, then the channel, which includes the modem, is said to be memoryless¹⁰ and the log-likelihood function is

$$\ln P(y \mid x^{(m)}) = \sum_{i}\sum_{j=1}^{n} \ln P(y_{ij} \mid x_{ij}^{(m)})$$

where $x_{ij}^{(m)}$ is a code symbol of the mth path, $y_{ij}$ is the corresponding received (demodulated) symbol, j runs over the n symbols of each branch, and i runs over the branches in the given path. This includes the special cases considered in Sections VII-A and -B. The decoder is the same as for the BSC except for using this more general metric. Decisions are made after each set of new branch metrics has been added to the previously stored metrics. To analyze performance, we must merely evaluate $P_k$, the pairwise error probability for an incorrect path which differs in k symbols from the correct path, as was done for the special channels of Sections VII-A and -B. Proceeding as in (22), letting $x_{ij}$ and $x_{ij}'$ denote symbols of the correct and incorrect paths, respectively, we obtain

$$\begin{aligned}
P_k(x, x') &= \Pr\left[\sum_{i}\sum_{j=1}^{n} \ln P(y_{ij} \mid x_{ij}') > \sum_{i}\sum_{j=1}^{n} \ln P(y_{ij} \mid x_{ij})\right] \\
&= \Pr\left\{\sum_{r=1}^{k} \ln \frac{P(y_r \mid x_r')}{P(y_r \mid x_r)} > 0\right\}
= \Pr\left\{\prod_{r=1}^{k} \frac{P(y_r \mid x_r')}{P(y_r \mid x_r)} > 1\right\}
\end{aligned} \tag{33}$$

where r runs over the k code symbols in which the paths differ. This probability can be rewritten as

$$P_k(x, x') = \sum_{y \in Y_k} \prod_{r=1}^{k} P(y_r \mid x_r) \tag{34}$$

where $Y_k$ is the set of all vectors $y = (y_1, y_2, \ldots, y_k)$ for which
10 Often more than one code symbol in a given branch is used to modulate the transmitter at one time. In this case, provided the interference still affects succeeding branches independently, the channel can still be treated as memoryless but now the symbol likelihood functions are replaced by branch likelihood functions and (33) is replaced by a single sum over i.
$$\prod_{r=1}^{k} \frac{P(y_r \mid x_r')}{P(y_r \mid x_r)} > 1. \tag{35}$$

But if this is the case, then

$$P_k(x, x') < \sum_{y \in Y_k} \prod_{r=1}^{k} P(y_r \mid x_r)\left[\frac{P(y_r \mid x_r')}{P(y_r \mid x_r)}\right]^{1/2} < \sum_{\text{all } y \in Y} \prod_{r=1}^{k} P(y_r \mid x_r)^{1/2} P(y_r \mid x_r')^{1/2} \tag{36}$$

where Y is the entire space of received vectors.¹¹ The first inequality is valid because we are multiplying the summand by a quantity greater than unity,¹² and the second because we are merely extending the sum of positive terms over a larger set. Finally we may break up the k-dimensional sum over y into k one-dimensional summations over $y_1, y_2, \ldots, y_k$, respectively, and this yields

$$P_k(x, x') < \sum_{y_1}\sum_{y_2}\cdots\sum_{y_k} \prod_{r=1}^{k} P(y_r \mid x_r)^{1/2} P(y_r \mid x_r')^{1/2} = \prod_{r=1}^{k} \sum_{y_r} P(y_r \mid x_r)^{1/2} P(y_r \mid x_r')^{1/2}. \tag{37}$$

To illustrate the use of this bound we consider the two specific channels treated above. For the BSC, $y_r$ is either equal to $x_r$, the transmitted symbol, or to $\bar{x}_r$, its complement. Now $y_r$ depends on $x_r$ through the channel statistics. Thus

$$P(y_r = x_r) = 1 - p, \qquad P(y_r = \bar{x}_r) = p. \tag{38}$$

For each symbol in the set r = 1, 2, ···, k, by definition $x_r \ne x_r'$. Hence for each term in the sum if $x_r = 0$, $x_r' = 1$ or vice versa. Hence, whatever $x_r$ and $x_r'$ may be,

$$\sum_{y_r=0}^{1} P(y_r \mid x_r)^{1/2} P(y_r \mid x_r')^{1/2} = 2p^{1/2}(1 - p)^{1/2}$$

and the product (37) of k identical factors is

$$P_k < 2^k p^{k/2} (1 - p)^{k/2} \tag{39}$$

for all pairs of correct and incorrect paths. This was used in Section VII-A to obtain the bounds (11) and (21).
For the AWGN channel of Section VII-B we showed that the likelihood functions (probability densities) were

$$P(y_r \mid x_r) = \frac{\exp\left[-(y_r - \sqrt{\mathcal{E}_s}\,x_r)^2 / N_0\right]}{\sqrt{\pi N_0}} \tag{40}$$

where

$$x_r = +1 \text{ or } -1 \quad\text{and}\quad x_r + x_r' = 0. \tag{41}$$

Since $y_r$ is a real variable, the space of $y_r$ is the real line and the sum in (37) becomes the integral

$$\int_{-\infty}^{\infty} P(y_r \mid x_r)^{1/2} P(y_r \mid x_r')^{1/2}\,dy_r = \frac{1}{\sqrt{\pi N_0}} \int_{-\infty}^{\infty} \exp\left[-\frac{y_r^2 + \mathcal{E}_s}{N_0}\right] dy_r = \exp\left(-\frac{\mathcal{E}_s}{N_0}\right)$$

where we have used (41) and $x_r^2 = x_r'^2 = 1$. The product of these k identical terms is, therefore,

$$P_k < \exp\left(-\frac{k\,\mathcal{E}_s}{N_0}\right) \tag{42}$$

for all pairs of correct and incorrect paths. Inserting these bounds in the general expressions (24) and (29), and using (25) and (30), yields the bounds on first-event error probability and bit error probability

$$P_E < T(D)\Big|_{D = \exp(-\mathcal{E}_s/N_0)} \tag{43}$$

$$P_B < \frac{dT(D, N)}{dN}\bigg|_{N=1,\,D=\exp(-\mathcal{E}_s/N_0)} \tag{44}$$

which are somewhat (though not exponentially) weaker than (28) and (31).
A characteristic feature of both the BSC and the AWGN channel is that they affect each symbol in the same way independent of its location in the sequence. Any memoryless channel has this property provided it is stationary (statistically time invariant). For a stationary memoryless channel (37) reduces to

$$P_k(x, x') < D_0^{\,k} \tag{45}$$

where¹³

$$D_0 = \sum_{y_r} P(y_r \mid x_r)^{1/2} P(y_r \mid x_r')^{1/2} < 1. \tag{46}$$

While this bound on $P_k$ is valid for all such channels, clearly it depends on the actual values assumed by the symbols $x_r$ and $x_r'$ of the correct and incorrect path, and these will generally vary according to the pairs of paths x and x' in question. However, if the input symbols are binary, $x$ and $\bar{x}$, whenever $x_r = x$, then $x_r' = \bar{x}$,
11 This would be the set of all $2^k$ k-dimensional binary vectors for the BSC, and Euclidean k space for the AWGN channel. Note also that the bound of (36) may be improved for asymmetric channels by changing the two exponents of 1/2 to s and 1 - s, respectively, where 0 < s < 1.
12 The square root of a quantity greater than one is also greater than one.
13 For an asymmetric channel this bound may be improved by changing the two exponents 1/2 to s and 1 - s, respectively, where 0 < s < 1.
so that for any input-binary memoryless channel (46) becomes

$$D_0 = \sum_{y} P(y \mid x)^{1/2} P(y \mid \bar{x})^{1/2} \tag{47}$$

and consequently

$$P_E < T(D)\Big|_{D = D_0} \tag{48}$$

$$P_B < \frac{dT(D, N)}{dN}\bigg|_{N=1,\,D=D_0} \tag{49}$$

where $D_0$ is given by (47). Other examples of channels of this type are FSK modulation over the AWGN (both coherent and noncoherent) and Rayleigh fading channels.
VIII. SYSTEMATIC AND NONSYSTEMATIC CONVOLUTIONAL CODES
The term systematic convolutional code refers to a code on each of whose branches one of the code symbols is just the data bit generating that branch. Thus a systematic coder will have its stages connected to only n - 1 adders, the nth being replaced by a direct line from the first stage to the commutator. Fig. 13 shows an R = 1/2 systematic coder for K = 3. It is well known that for group block codes, any nonsystematic code can be transformed into a systematic code which performs exactly as well. This is not the case for convolutional codes. The reason for this is that, as was shown in Section VII, the performance of a code on any channel depends largely on the relative distances between codewords and particularly on the minimum free distance d, which is the exponent of D in the leading term of the generating function. Eliminating one of the adders results in a reduction of d. For example, the maximum free distance systematic code for K = 3 is that of Fig. 13 and this has d = 4, while the nonsystematic K = 3 code of Fig. 1 has minimum free distance d = 5. Table I shows the maximum minimum free distance for systematic and nonsystematic codes for K = 2 through 5. For large constraint lengths the results are even more widely separated. In fact, Bucher and Heller [19] have shown that for asymptotically large K, the performance of a systematic code of constraint length K is approximately the same as that of a nonsystematic code of constraint length K(1 - R). Thus for R = 1/2 and very large K, systematic codes have the performance of nonsystematic codes of half the constraint length, while requiring exactly the same optimal decoder complexity. For R = 3/4, the constraint length is effectively divided by 4.
Fig. 13. Systematic convolutional coder for K = 3 and R = 1/2.
TABLE I
MAXIMUM-MINIMUM FREE DISTANCE
K    Systematic    Nonsystematic (a)
2    3             3
3    4             5
4    4             6
5    5             7
(a) We have excluded catastrophic codes (see Section IX); R = 1/2.
IX. CATASTROPHIC ERROR PROPAGATION IN CONVOLUTIONAL CODES
Massey and Sain [13] have defined a catastrophic error as the event that a finite number of channel symbol errors causes an infinite number of data bit errors to be decoded. Furthermore, they showed that a necessary and sufficient condition for a convolutional code to produce catastrophic errors is that all of the adders have tap sequences, represented as polynomials, with a common factor.
In terms of the state diagram it is easily seen that catastrophic errors can occur if and only if any closed loop path in the diagram has a zero weight (i.e., the exponent of D for the loop path is zero). To illustrate this, we consider the example of Fig. 14. Assuming that the all zeros is the correct path, the incorrect path a b d d ··· d c a has exactly 6 ones, no matter how many times we go around the self loop d. Thus for a BSC, for example, four channel errors may cause us to choose this incorrect path and consequently make an arbitrarily large number of bit errors (equal to two plus the number of times the self loop is traversed). Similarly for the AWGN channel this incorrect path with arbitrarily many corresponding bit errors will be chosen with probability $\operatorname{erfc}\sqrt{6\mathcal{E}_b/N_0}$.
Another necessary and sufficient condition for catastrophic error propagation, recently found by Odenwalder [20], is that any nonzero data path in the trellis or state diagram produces K - 1 consecutive branches with all zero code symbols.
We observe also that for binary-tree (R = 1/n) codes, if each adder of the coder has an even number of connections, then the self loop corresponding to the all ones (data) state will have zero weight and consequently the code will be catastrophic.
The main advantage of a systematic code is that it can never be catastrophic, since each closed loop must contain at least one branch generated by a nonzero data bit and thus having a nonzero code symbol. Still it can be shown [23] that only a small fraction of nonsystematic codes is catastrophic (in fact, $1/(2^n - 1)$ for binary-tree R = 1/n codes). We note further that if catastrophic errors are ignored, nonsystematic codes with even larger free distance than those of Table I exist.
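The common-factor condition for rate 1/n codes can be tested mechanically with a polynomial greatest common divisor over GF(2). The sketch below makes several assumptions of its own: tap polynomials are encoded as integers whose bit i is the coefficient of $D^i$; a common factor that is a pure delay $D^l$ is treated as harmless, which is the usual precise form of the condition; and the two tap sets in the example are illustrative choices, not a transcription of the figures.

```python
def gf2_mod(a, b):
    """Remainder of polynomial a divided by b over GF(2); bit i <-> D^i."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_gcd(a, b):
    """Greatest common divisor of two GF(2) polynomials (Euclid's algorithm)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def is_catastrophic(generators):
    """True if the tap polynomials of a rate 1/n coder share a common factor
    other than a pure delay D^l (a single-term polynomial)."""
    g = 0
    for poly in generators:
        g = gf2_gcd(g, poly)
    return g & (g - 1) != 0  # more than one nonzero coefficient

# Hypothetical K = 3, R = 1/2 tap sets (assumptions for illustration):
print(is_catastrophic([0b111, 0b101]))  # 1 + D + D^2 and 1 + D^2
print(is_catastrophic([0b011, 0b101]))  # 1 + D and 1 + D^2 = (1 + D)^2
```

In the second tap set every adder has an even number of connections, so the check agrees with the observation above that such coders are catastrophic.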
Fig. 14. Coder displaying catastrophic error propagation.
X. PERFORMANCE BOUNDS FOR BEST CONVOLUTIONAL CODES FOR GENERAL MEMORYLESS CHANNELS AND COMPARISON WITH BLOCK CODES
We begin by considering the path structure of a binary-tree¹⁴ (b = 1) convolutional code of any constraint length K, independent of the specific coder used. For this purpose we need only determine T(L), the generating function for the state diagram with each branch labeled merely by L, so that the exponent of each term of the infinite series expansion of T(L) determines the length over which an incorrect path differs from the correct path before merging with it at a given node level. (See Fig. 1 and (2) with D = N = 1.) After some manipulation of the state-transition matrix of the state diagram of a binary-tree convolutional code of constraint length K, it is shown in Appendix I¹⁵ that

$$T(L) = \frac{L^K(1 - L)}{1 - 2L + L^K} < \frac{L^K}{1 - 2L} = L^K\left(1 + 2L + 4L^2 + \cdots + 2^k L^k + \cdots\right). \tag{50}$$
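The coefficients of the exact generating function in (50) can be generated by a short recurrence and compared term by term with the bounding series. This is a sketch under our own naming; it expands $(1 - L)/(1 - 2L + L^K)$ as a power series, so that entry k of the result is the coefficient of $L^{K+k}$ in T(L).

```python
def merging_path_counts(K, terms):
    """Power-series coefficients c_0, c_1, ... of (1 - L)/(1 - 2L + L^K).
    c_k multiplies L^(K+k) in T(L): the number of incorrect paths that
    differ from the correct path over K + k branches before merging."""
    numerator = [1, -1]  # coefficients of 1 - L
    coeffs = []
    for m in range(terms):
        c = numerator[m] if m < len(numerator) else 0
        if m >= 1:
            c += 2 * coeffs[m - 1]   # from the -2L term of the denominator
        if m >= K:
            c -= coeffs[m - K]       # from the +L^K term of the denominator
        coeffs.append(c)
    return coeffs

counts = merging_path_counts(K=3, terms=8)
print(counts)
print(all(c <= 2**k for k, c in enumerate(counts)))
```

For K = 3 the denominator factors as $(1 - L)(1 - L - L^2)$, so the counts follow the Fibonacci recurrence, well below the $2^k$ of the bounding series.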
The inequality in (50) indicates that more paths are being counted than actually exist. The expression (50) indicates that of the paths merging with the correct path at a given node level there is no more than one of length K, no more than two of length K + 1, no more than four of length K + 2, etc.
We have purposely avoided considering the actual code or coder configuration so that the preceding expressions are valid for all binary-tree codes of constraint length K. We now extend our class of codes to include time-varying convolutional codes. A time-varying coder is one in which the tap positions may be changed after each shift of the bits in the register. We consider the ensemble of all possible time-varying codes, which includes as a subset the ensemble of all fixed codes, for a given constraint length K. We further impose a uniform probabilistic measure on all codes in this ensemble by randomly reselecting each tap position after each shift of the register. This can be done by hypothetically flipping a coin nK times after each shift, once for each stage of the register and for each of the n adders. If the outcome is a head we connect the particular stage to the particular adder; if it is a tail we do not. Since this is repeated for each new branch, the result is that for each branch of the trellis the code sequence is a random binary n-dimensional vector. Furthermore, it can be shown that the distribution of these random code sequences is the same for each branch at each node level except for the all zeros path, which must necessarily produce the all zeros code sequence on each branch. To avoid treating the all zeros path differently, we ensure statistical uniformity by requiring further that after each shift a random binary n-dimensional vector be added to each branch¹⁶ and that this also be reselected after each shift. (This additional artificiality is unnecessary for input-binary channels but is required to prove our result for general memoryless channels.) Further details of this procedure are given in Viterbi [9].
We now seek a bound on the average error probability of this ensemble of codes relative to the measure (random-selection process) imposed. We begin by considering the probability that after transmission over a memoryless channel the metric of one of the fewer than $2^k$ paths merging with the correct path after differing in K + k branches is greater than the correct metric. Let $x_i$ be the correct (transmitted) sequence and $x_i'$ an incorrect sequence for the ith branch of the two paths. Then following the argument which led to (37) we have that the probability that the given incorrect path may cause an error is bounded by

$$P_{K+k}(x, x') < \prod_{i=1}^{K+k} \sum_{y_i} P(y_i \mid x_i)^{1/2} P(y_i \mid x_i')^{1/2} \tag{51}$$

where the product is over all K + k branches in the path. If we now average over the ensemble of codes constructed above we obtain

$$\bar{P}_{K+k} < \prod_{i=1}^{K+k} \sum_{x_i}\sum_{x_i'}\sum_{y_i} q(x_i) P(y_i \mid x_i)^{1/2}\, q(x_i') P(y_i \mid x_i')^{1/2} \tag{52}$$

where q(x) is the measure imposed on the code symbols of each branch by the random selection, and because of the statistical uniformity of all branches we have

$$\bar{P}_{K+k} < \left\{\sum_{y}\left[\sum_{x} q(x) P(y \mid x)^{1/2}\right]^2\right\}^{K+k} = 2^{-(K+k)nR_0} \tag{53}$$

14 Although for clarity all results will be derived for b = 1, the extension to b > 1 is direct and the results will be indicated at the end of this section.
15 This generating function can also be used to obtain error bounds for orthogonal convolutional codes all of whose branches have the same weight, as is shown in Appendix I.
16 The same vector is added to all branches at a given node level.
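where

$$R_0 = -\frac{1}{n} \log_2 \Big\{ \sum_{\mathbf{y}} \Big[ \sum_{\mathbf{x}} q(\mathbf{x}) P(\mathbf{y} \mid \mathbf{x})^{1/2} \Big]^2 \Big\}. \tag{54}$$

Note that the random vectors $\mathbf{x}$ and $\mathbf{y}$ are $n$ dimensional. If each symbol is transmitted independently on a memoryless channel, such as was the case in the channels of Sections VII-A and -B, (54) is reduced further to

$$R_0 = -\log_2 \Big\{ \sum_{y} \Big[ \sum_{x} q(x) P(y \mid x)^{1/2} \Big]^2 \Big\} \tag{55}$$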
where $x$ and $y$ are now scalar random variables associated with each code symbol. Note also that because of the statistical uniformity of the code, the results are independent of which path was transmitted and which incorrect path we are considering. Proceeding as in Section VII, it follows that a union bound on the ensemble average of the first-event error probability is obtained by substituting $P_{K+k}$ for $L^{K+k}$ in (50). Thus

$$P_E < \sum_{k=0}^{\infty} 2^k P_{K+k} < \sum_{k=0}^{\infty} 2^k\, 2^{-(K+k) R_0/R} = \frac{2^{-K R_0/R}}{1 - 2^{-(R_0/R - 1)}} \tag{56}$$

where $R = 1/n$ bits/symbol.

To bound the bit error probability we must weight each term of (56) by the number of bit errors for the corresponding incorrect path. This could be done by evaluating the transfer function $T(L, N)$ as in Section VII (see also Appendix I), but a simpler approach, which yields a simpler bound which is nearly as tight, is to recognize that an incorrectly chosen path which merges with the correct path after $K + k$ branches can produce no more than $k + 1$ bit errors. For, any path which merges with the correct path at a given level must be generated by data which coincides with the correct path data over the last $K - 1$ branches prior to merging, since only in this way can the coder register be filled with the same bits as the correct path, which is the condition for merging. Hence the number of incorrect bits due to a path which differs from the correct path in $K + k$ branches can be no greater than $K + k - (K - 1) = k + 1$. Hence we may overbound $P_B$ by weighting the $k$th term of (56) by $k + 1$, which results in

$$P_B < \sum_{k=0}^{\infty} (k + 1)\, 2^{-k(R_0/R - 1)}\, 2^{-K R_0/R} = \frac{2^{-K R_0/R}}{[1 - 2^{-(R_0/R - 1)}]^2}. \tag{57}$$

The bounds of (56) and (57) are finite only for rates $R < R_0$, and $R_0$ can be shown to be always less than the channel capacity.

To improve on these bounds when $R > R_0$, we must improve on the union bound approach by obtaining a single bound on the probability that any one of the fewer than $2^k$ paths which differ from the correct path in $K + k$ branches has a metric higher than the correct path at a given node level. This bound, first derived by Gallager [5] for block codes, is always less than $2^k$ times the bound for each individual path. Letting $Q_{K+k} \triangleq \Pr(\text{any one of } 2^k \text{ incorrect path metrics} > \text{correct path metric})$, Gallager [5] has shown that its ensemble average for the code ensemble is bounded by

$$Q_{K+k} < 2^{k\rho}\, 2^{-(K+k) E_0(\rho)/R}, \qquad 0 < \rho \le 1 \tag{58}$$

where

$$E_0(\rho) = -\frac{1}{n} \log_2 \sum_{\mathbf{y}} \Big[ \sum_{\mathbf{x}} q(\mathbf{x}) P(\mathbf{y} \mid \mathbf{x})^{1/(1+\rho)} \Big]^{1+\rho}, \qquad 0 < \rho \le 1 \tag{59}$$

where $\rho$ is an arbitrary parameter which we shall choose to minimize the bound. It is easily seen that $E_0(0) = 0$, while $E_0(1) = R_0$, in which case $Q_{K+k} = 2^k P_{K+k}$, the ordinary union bound of (56). We bound the overall ensemble first-event error probability by the probability of the union of these composite events given by (58). Thus we find

$$P_E < \sum_{k=0}^{\infty} Q_{K+k} < \frac{2^{-K E_0(\rho)/R}}{1 - 2^{-(E_0(\rho)/R - \rho)}}. \tag{60}$$
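As a numerical illustration of (55), (59), and (60), the following minimal sketch evaluates the exponent function and the first-event bound for one specific case. It assumes a BSC with crossover probability `p` and equiprobable inputs $q(0) = q(1) = 1/2$, used in the scalar (per-symbol) form; the function names `e0_bsc` and `first_event_bound` are hypothetical and chosen only for this example.

```python
import math


def e0_bsc(rho: float, p: float) -> float:
    """E_0(rho) of (59), per code symbol, for a BSC with crossover p.

    Assumes equiprobable inputs q(0) = q(1) = 1/2 (an assumption of this
    sketch, not a choice made in the text).
    """
    s = 1.0 / (1.0 + rho)
    # The inner sum over x is the same for y = 0 and y = 1 by symmetry.
    inner = 0.5 * (1.0 - p) ** s + 0.5 * p ** s
    return -math.log2(2.0 * inner ** (1.0 + rho))


def first_event_bound(K: int, n: int, p: float, rho: float = 1.0) -> float:
    """Right-hand side of (60) for rate R = 1/n; rho = 1 gives (56).

    Returns infinity when the geometric series does not converge,
    i.e. when E_0(rho)/R - rho <= 0.
    """
    R = 1.0 / n
    e0 = e0_bsc(rho, p)
    delta = e0 / R - rho
    if delta <= 0.0:
        return math.inf
    return 2.0 ** (-K * e0 / R) / (1.0 - 2.0 ** (-delta))


if __name__ == "__main__":
    p = 0.05
    print("R_0 =", e0_bsc(1.0, p))
    print("rate 1/3, rho = 1  :", first_event_bound(5, 3, p))
    print("rate 1/2, rho = 1  :", first_event_bound(5, 2, p))
    print("rate 1/2, rho = 0.5:", first_event_bound(5, 2, p, rho=0.5))
```

For this channel $R_0$ is slightly below $1/2$, so the union bound ($\rho = 1$) diverges at rate $1/2$, while a smaller $\rho$ still gives a finite (though for so short a constraint length not useful) value.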
Clearly (60) reduces to (56) when $\rho = 1$.

To determine the bit error probability using this approach, we must recognize that $Q_{K+k}$ refers to $2^k$ different incorrect paths, each with a different number of incorrect bits. However, just as was observed in deriving (57), an incorrect path which differs from the correct path in $K + k$ branches prior to merging can produce at most $k + 1$ bit errors. Hence weighting the $k$th term of (60) by $k + 1$, we obtain

$$P_B < \sum_{k=0}^{\infty} (k + 1) Q_{K+k} < \sum_{k=0}^{\infty} (k + 1)\, 2^{-k(E_0(\rho)/R - \rho)}\, 2^{-K E_0(\rho)/R} = \frac{2^{-K E_0(\rho)/R}}{[1 - 2^{-(E_0(\rho)/R - \rho)}]^2}, \qquad 0 < \rho \le 1. \tag{61}$$
Clearly (61) reduces to (57) when $\rho = 1$.

Before we can interpret the results of (56), (57), (60), and (61) it is essential that we establish some of the properties of $E_0(\rho)$ $(0 < \rho \le 1)$ defined by (59). It can be shown [5], [14] that for any memoryless channel, $E_0(\rho)$ is a concave monotonic nondecreasing function as shown in Fig. 15 with $E_0(0) = 0$ and $E_0(1) = R_0$. Where the derivative $E_0'(\rho)$ exists, it decreases with $\rho$ and it follows easily from the definition that

$$\lim_{\rho \to 0} E_0'(\rho) = \frac{1}{n} \sum_{\mathbf{x}} q(\mathbf{x}) \sum_{\mathbf{y}} P(\mathbf{y} \mid \mathbf{x}) \log_2 \frac{P(\mathbf{y} \mid \mathbf{x})}{\sum_{\mathbf{x}'} q(\mathbf{x}') P(\mathbf{y} \mid \mathbf{x}')} = \frac{1}{n} I(X^n; Y^n) \triangleq C \tag{62}$$

the mutual information of the channel¹⁷ where $X^n$ and $Y^n$ are the channel input and output spaces, respectively, for each branch sequence. Consequently, it follows that to minimize the bounds (60) and (61), we must make $\rho \le 1$ as large as possible to maximize the exponent of the numerator, but at the same time we must ensure that

$$R < \frac{E_0(\rho)}{\rho}$$

in order to keep the denominator positive. Thus since $E_0(1) = R_0$ and $E_0(\rho) < R_0$ for $\rho < 1$, it follows that for $R < R_0$ and sufficiently large $K$ we should choose $\rho = 1$, or equivalently use the bounds (56) and (57). We may thus combine all the above bounds into the expressions

$$P_E < \frac{2^{-K E(R)/R}}{1 - 2^{-\delta(R)}} \tag{63}$$

$$P_B < \frac{2^{-K E(R)/R}}{[1 - 2^{-\delta(R)}]^2} \tag{64}$$

where

$$E(R) = \begin{cases} R_0, & 0 \le R < R_0 \\ E_0(\rho), & R_0 \le R < C, \quad 0 < \rho \le 1 \end{cases} \tag{65}$$

$$\delta(R) = \begin{cases} R_0/R - 1, & 0 \le R < R_0 \\ E_0(\rho)/R - \rho, & R_0 \le R < C, \quad 0 < \rho \le 1. \end{cases} \tag{66}$$

Fig. 15. Example of $E_0(\rho)$ function for general memoryless channel.

Fig. 16. Typical limiting value of exponent of (67).

To minimize the numerators of (63) and (64) for $R > R_0$ we should choose $\rho$ as large as possible, since $E_0(\rho)$ is a nondecreasing function of $\rho$. However, we are limited by the necessity of making $\delta(R) > 0$ to keep the denominator from becoming zero. On the other hand, as the constraint length $K$ becomes very large we may choose $\delta(R) = \delta$ very small. In particular, as $\delta$ approaches 0, (65) approaches

$$\lim_{\delta \to 0} E(R) = \begin{cases} R_0, & 0 \le R < R_0 \\ E_0(\rho), & R_0 \le R < C, \end{cases} \qquad \text{where } R = \frac{E_0(\rho)}{\rho}, \quad 0 < \rho \le 1. \tag{67}$$

Fig. 15 demonstrates the graphical determination of $\lim_{\delta \to 0} E(R)$ from $E_0(\rho)$.

It follows from the properties of $E_0(\rho)$ described, that for $R > R_0$, $\lim_{\delta \to 0} E(R)$ decreases from $R_0$ to 0 as $R$ increases from $R_0$ to $C$, but that it remains positive for all rates less than $C$. The function is shown for a typical channel in Fig. 16.

It is particularly instructive to obtain specific bounds, in the limiting case, for the class of "very noisy" channels, which includes the BSC with $p = 1/2 - \gamma$ where $|\gamma| \ll 1$ and the biphase modulated AWGN with $E_s/N_0 \ll 1$. For this class of channels it can be shown [5] that

$$E_0(\rho) = \frac{\rho}{1 + \rho}\, C \tag{68}$$

and consequently $R_0 = E_0(1) = C/2$. (For the BSC, $C = 2\gamma^2/\ln 2$ while for the AWGN, $C = E_s/(N_0 \ln 2)$.) For the very noisy channel, suppose we let $\rho = C/R - 1$, so that using (68) we obtain $E_0(\rho) = C - R$. Then in the limit as $\delta \to 0$ (65) becomes for a very noisy channel

$$\lim_{\delta \to 0} E(R) = \begin{cases} C/2, & 0 \le R \le C/2 \\ C - R, & C/2 \le R \le C. \end{cases} \tag{69}$$

This limiting form of $E(R)$ is shown in Fig. 17.

The bounds (63) and (64) are for the average error probabilities of the ensemble of codes relative to the measure induced by random selection of the time-varying coder tap sequences. At least one code in the ensemble must perform better than the average. Thus the bounds (63) and (64) hold for the best time-varying binary-tree convolutional coder of constraint length $K$. Whether there exists a fixed convolutional code with this performance is an unsolved problem. However, for small $K$ the results of Section VII seem to indicate that these bounds are valid also for fixed codes.

To determine the tightness of the upper bounds, it is useful to have lower bounds for convolutional code error probabilities. It can be shown [9] that for all $R < C$

$$P_B \ge P_E > 2^{-K[E_L(R)/R - o(K)]} \tag{70}$$

where

$$E_L(R) = E_0(\rho), \qquad R = \frac{E_0(\rho)}{\rho}, \qquad 0 < \rho < \infty \tag{71}$$

and $o(K) \to 0$ as $K \to \infty$.

¹⁷ $C$ can be made equal to the channel capacity by properly choosing the ensemble measure $q(\mathbf{x})$. For an input-binary channel the random binary convolutional coder described above achieves this. Otherwise a further transformation of the branch sequence into a smaller set of nonbinary sequences is required [9].
Comparison of the parametric equations (67) with (71) shows that $E_L(R) = \lim_{\delta \to 0} E(R)$ for $R > R_0$ but is greater for low rates. For very noisy channels, it follows easily from (71) and (68) that

$$E_L(R) = C - R, \qquad 0 \le R \le C.$$

Actually, however, tighter lower bounds for $R < C/2$ (Viterbi [9]) show that for very noisy channels

$$E_L(R) = \begin{cases} C/2, & 0 \le R \le C/2 \\ C - R, & C/2 \le R < C, \end{cases} \tag{72}$$

which is precisely the result of (69) or of Fig. 17. It follows that, at least for very noisy channels, the exponential bounds are asymptotically exact.

All the results derived in this section can be extended directly to nonbinary ($b > 1$) codes. It is easily shown (Viterbi [9]) that the same results hold with $R = b/n$, $R_0$ and $E_0(\rho)$ multiplied by $b$, and all event probability upper bounds multiplied by $2^b - 1$, and bit probability upper bounds multiplied by $(2^b - 1)/b$.

Clearly, the ensemble of codes considered here is nonsystematic. However, by a modification of the arguments used here, Bucher and Heller [19] restricted the ensemble to systematic time-varying convolutional codes (i.e., codes for which $b$ code symbols of each branch correspond to the data which generates the branch) and obtained all the above results modified only to the extent that the exponents $E(R)$ and $E_L(R)$ are multiplied by $1 - R$. (See also Section VIII.)

Finally, it is most revealing to compare the asymptotic results for the best convolutional codes of a given constraint length with the corresponding asymptotic results for the best block codes of a given block length. Suppose that $K$ bits are coded into a block code of length $N$ so that $R = K/N$ bits/code symbol. Then it can be shown (Gallager [5], Shannon et al. [8]) that for the best block code, the bit error probability is bounded above and below by

$$2^{-K[E_{Lb}(R)/R - o(K)]} < P_B < 2^{-K E_b(R)/R} \tag{73}$$

where

$$E_b(R) = \max_{0 \le \rho \le 1} [E_0(\rho) - \rho R], \qquad E_{Lb}(R) \le \max_{0 \le \rho} [E_0(\rho) - \rho R].$$

Both $E_b(R)$ and $E_{Lb}(R)$ are functions of $R$ which for all $R > 0$ are less than the exponents $E(R)$ and $E_L(R)$ for convolutional codes [9]. In particular, for very noisy channels they both become [5]

$$E_b(R) = E_{Lb}(R) = \begin{cases} C/2 - R, & 0 \le R \le C/4 \\ (\sqrt{C} - \sqrt{R})^2, & C/4 \le R \le C. \end{cases} \tag{74}$$

Fig. 17. Limiting values of $E(R)$ for very noisy channels.

This is plotted as a dotted curve in Fig. 17. Thus it is clear by comparing the magnitudes of the negative exponents of (73) and (64) that, at least for very noisy channels, a convolutional code performs much better asymptotically than the corresponding block code of the same order of complexity. In particular at $R = C/2$, the ratio of exponents is 5.8, indicating that to achieve equivalent performance asymptotically the block length must be over five times the constraint length of the convolutional code. Similar degrees of relative performance can be shown for more general memoryless channels [9]. More significant from a practical viewpoint, for short constraint lengths also, convolutional codes considerably outperform block codes of the same order of complexity.
XI. PATH MEMORY TRUNCATION, METRIC QUANTIZATION, AND SYNCHRONIZATION
A major problem which arises in the implementation of a maximum likelihood decoder is the length of the path history which must be stored. In our previous discussion we ignored this important point and therefore implicitly assumed that all past data would be stored. A final decision was made by forcing the coder into a known (all zeros) state. We now remove this impractical condition. Suppose we truncate the path memories after $M$ bits (branches) have been accumulated, by comparing all $2^K$ metrics for a maximum and deciding on the bit corresponding to that path (out of $2^K$) with the highest metric $M$ branches forward. If $M$ is several times as large as $K$, the additional bit errors introduced in this way are very few, as we shall now demonstrate using the asymptotic results of the last section.

An additional bit error may occur due to memory truncation after $M$ branches, if the bit selected is from an incorrect path which differed from the correct path $M$ branches back and which has a higher metric, but which would ultimately be eliminated by the maximum likelihood decoder. But for a binary-tree code there can be no more than $2^M$ distinct paths which differ from the correct path $M$ branches back. Of these we need concern ourselves only with those which have not merged with the correct path in the intervening nodes. As was originally shown by Forney [12], using the ensemble arguments of Section X we may bound the average probability of this event by [see (58)]

$$\Pr(\text{truncation error}) < 2^{-M[E_0(\rho)/R - \rho]}, \qquad 0 < \rho \le 1. \tag{75}$$

To minimize this bound we should maximize the exponent $E_0(\rho)/R - \rho$ with respect to $\rho$ on the unit interval. But this yields exactly $E_b(R)/R$, with $E_b(R)$ the upper bound exponent of (73) for block codes. Thus

$$\Pr(\text{truncation error}) < 2^{-M E_b(R)/R} \tag{76}$$

where $E_b(R)$ is the block coding exponent. We conclude therefore that the memory truncation error is less than the bit error probability bound without truncation, provided the bound of (76) is less than the bound of (64). This will certainly be assured if

$$M E_b(R) > K E(R). \tag{77}$$
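For very noisy channels we have from (69) and (74) or Fig. 17, that

$$\frac{M}{K} > \begin{cases} \dfrac{1}{1 - 2R/C}, & 0 \le R \le C/4 \\[2ex] \dfrac{1}{2\,(1 - \sqrt{R/C})^2}, & C/4 \le R \le C/2 \\[2ex] \dfrac{1 - R/C}{(1 - \sqrt{R/C})^2}, & C/2 \le R < C. \end{cases}$$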
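The piecewise condition above is easy to tabulate. The following minimal sketch evaluates the lower limit on $M/K$ for a very noisy channel as a function of the normalized rate $r = R/C$; the function name `truncation_ratio` is a hypothetical label introduced only for this example.

```python
import math


def truncation_ratio(r: float) -> float:
    """Lower limit on M/K for a very noisy channel, with r = R/C.

    Implements the three-branch expression that follows from (77):
    the ratio E(R)/E_b(R) of the convolutional and block exponents.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r = R/C must satisfy 0 <= r < 1")
    root = math.sqrt(r)
    if r <= 0.25:
        return 1.0 / (1.0 - 2.0 * r)
    if r <= 0.5:
        return 1.0 / (2.0 * (1.0 - root) ** 2)
    return (1.0 - r) / (1.0 - root) ** 2


if __name__ == "__main__":
    for r in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"R/C = {r:4.2f}  ->  M/K > {truncation_ratio(r):6.2f}")
```

The required path memory grows quickly as the rate approaches capacity, which is why the depth needed in practice depends strongly on the operating rate.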
For example, at $R = C/2$ this indicates that it suffices to take $M > (5.8)K$.

Another problem faced by a system designer is the amount of storage required by the metrics (or log-likelihood functions) for each of the $2^K$ paths. For a BSC this poses no difficulty since the metric is just the Hamming distance, which is at most $n$, the number of code symbols, per branch. For the AWGN, on the other hand, the optimum metric is a real number, the analog output of a correlator, matched filter, or integrate-and-dump circuit. Since digital storage is generally required, it is necessary to quantize this analog metric. However, once the components $y_{jk}$ of the optimum metric of (5), which are the correlator outputs, have been quantized to $Q$ levels, the channel is no longer an AWGN channel. For biphase modulation, for example, it becomes a binary-input $Q$-ary-output discrete memoryless channel, whose transition probabilities are readily calculated as a function of the energy-to-noise density and the quantization levels. The optimum metric is not obtained by replacing $y_{jk}$ by its quantized value $Q(y_{jk})$ in (5) but rather it is the log-likelihood function $\log P(\mathbf{y} \mid \mathbf{x}^{(m)})$ for the binary-input $Q$-ary-output channel.

Nevertheless, extensive simulation [24] indicates that for 8-level quantization even use of the suboptimal metric $\sum_k Q(y_{jk})\, x_{jk}^{(m)}$ results in a degradation of no more than 0.25 dB relative to the maximum likelihood decoder for the unquantized AWGN, and that use of the optimum metric is only negligibly superior to this. However, this is not the case for sequential decoding, where the difference in performance between optimal and suboptimal metrics is significant [11].

In a practical system other considerations than error performance for a given degree of decoder complexity often dictate the selection of a coding system. Chief among these are often the synchronization requirements. Convolutional codes utilizing maximum likelihood decoding are particularly advantageous in that no block synchronization is ever required. For block codes, decoding cannot begin until the initial point of each block has been located. Practical systems often require more complexity in the synchronization system than in the decoder. On the other hand, as we have by now amply illustrated, a maximum likelihood decoder for a convolutional code does not require any block synchronization because the coder is free running (i.e., it performs identical operations for each successive input bit and does not require that $K$ bits be input before generating an output). Furthermore, the decoder does not require knowledge of past inputs to start decoding; it may as well assume that all previous bits were zeros. This is not to say that initially the decoder will operate as well, in the sense of error performance, as if the preceding bits of the correct path were known. On the other hand, consider a decoder which starts with an initially known path but makes an error at some point and excludes the correct path. Immediately thereafter it will be operating as if it had just been turned on with an unknown and incorrectly chosen previous path history. That this decoder will recover and stop making errors within a finite number of branches follows from our previous discussions in which it was shown that, other than for catastrophic codes, error sequences are always finite.
Hence our initially unsynchronized decoder will operate just like a decoder which has just made an error and will thus always achieve synchronization and generally will produce correct decisions after a limited number of initial errors. Simulations have demonstrated that synchronization generally takes no more than four or five constraint lengths of received symbols.

Although, as we have just shown, branch synchronization is not required, code symbol synchronization within a branch is necessary. Thus, for example, for a binary-tree rate $R = 1/2$ code, we must resolve the two-way ambiguity as to where each two code-symbol branch begins. This is called node synchronization. Clearly if we make the wrong decisions, errors will constantly be made thereafter. However, this situation can easily be detected because the mismatch will cause all the path metrics to be small, since in fact there will not be any correct path in this case. We can thus detect this event and change our decision as to node synchronization (cf. Heller and Jacobs [24]). Of course, for an $R = 1/n$ code, we may have to repeat our choice $n$ times, once for each of the symbols on a branch, but since $n$ represents the redundancy factor or bandwidth expansion, practical systems rarely use $n > 4$.
XII. OTHER DECODING ALGORITHMS FOR CONVOLUTIONAL CODES

This paper has treated primarily maximum likelihood decoding of convolutional codes. The reason for this was two-fold: 1) maximum likelihood decoding is closely related to the structure of convolutional codes and its consideration enhances our understanding of the ultimate capabilities, performance, and limitation of these codes; 2) for reasonably short constraint lengths ($K < 10$) its implementation is quite feasible¹⁸ and worthwhile because of its optimality. Furthermore for $K \le 6$, the complexity of maximum likelihood decoding is sufficiently limited that a completely parallel implementation (separate metric calculators) is possible. This minimizes the decoding time per bit and affords the possibility of extremely high decoding speeds [24].

Longer constraint lengths are required for extremely low error probabilities at high rates. Since the storage and computational complexity are proportional to $2^K$, maximum likelihood decoders become impractical for $K > 10$. At this point sequential decoding [2], [4], [6] becomes attractive. This is an algorithm which sequentially searches the code tree in an attempt to find a path whose metric rises faster than some predetermined, but variable, threshold. Since the difference between the correct path metric and any incorrect path metric increases with constraint length, for large $K$ generally the correct path will be found by this algorithm. The main drawback is that the number of incorrect path branches, and consequently the computation complexity, is a random variable depending on the channel noise. For $R < R_0$, it is shown that the average number of incorrect branches searched per decoded bit is bounded [6], while for $R > R_0$ it is not; hence $R_0$ is called the computational cutoff rate. To make storage requirements reasonable, it is necessary to make the decoding speed (branches/s) somewhat larger than the bit rate, thus somewhat limiting the maximum bit rate capability. Also, even though the average number of branches searched per bit is finite, it may sometimes become very large, resulting in a storage overflow and consequently relatively long sequences being erased. The stack sequential decoding algorithm [7], [18] provides a very simple and elegant presentation of the key concepts in sequential decoding, although the Fano algorithm [4] is generally preferable practically.

For a number of reasons, including buffer size requirements, computation speed, and metric sensitivity, sequential decoding of data transmitted at rates above about 100 kbits/s is practical only for hard-quantized binary received data (that is, for channels in which a hard decision, 0 or 1, is made for each demodulated symbol). For the biphase modulated AWGN channel, of course, hard quantization (2 levels or 1 bit) results in an efficiency loss of approximately 2 dB compared with soft quantization (8 or more levels, 3 or more bits). On the other hand, with maximum likelihood decoding, by employing a parallel implementation, short constraint length codes ($K \le 6$) can be decoded at very high data rates (10 to 100 Mbits/s) even with soft quantization. In addition, the insensitivity to metric accuracy and simplicity of synchronization render maximum likelihood decoding generally preferable when moderate error probabilities are sufficient. In particular, since sequential decoding is limited by the overflow problem to operate at code rates somewhat below $R_0$, it appears that for the AWGN the crossover point above which maximum likelihood decoding is preferable to sequential decoding occurs at values of $P_B$ somewhere between $10^{-3}$ and $10^{-5}$, depending on the transmitted data rate. As the data rate increases the $P_B$ crossover point decreases.

A third technique for decoding convolutional codes is known as feedback decoding, with threshold decoding [3] as a subclass. A feedback decoder basically makes a decision on a particular bit or branch in the decoding tree or trellis based on the received symbols for a limited number of branches beyond this point. Even though the decision is irrevocable, for limited constraint lengths (which are appropriate considering the limited number of branches involved in a decision) errors will propagate only for moderate lengths. When transmission is over a binary symmetric channel, by employing only codes with certain algebraic (orthogonal) properties, the decision on a given branch can be based on a linear function of the received symbols, called the syndrome, whose dimensionality is equal to the number of branches involved in the decision. One particularly simple decision criterion based on this syndrome, referred to as threshold decoding, is mechanizable in a very inexpensive manner. However, feedback decoders in general, and threshold decoders in particular, have an error-correcting capability equivalent to very short constraint length codes and consequently do not compare favorably with the performance of maximum likelihood or sequential decoding.

However, feedback decoders are particularly well suited to correcting error bursts which may occur in fading channels. Burst errors are generally best handled by using interleaved codes: that is, employing $L$ convolutional codes so that the $j$th, $(L + j)$th, $(2L + j)$th, etc., bits are encoded into one code for each $j = 0, 1, \ldots, L - 1$. This will cause any burst of length less than $L$ to be broken up into random errors for the $L$ independently operating decoders. Interleaving can be achieved by simply inserting $L - 1$ stage delay lines between stages of the convolutional encoder; the resulting single encoder then generates the $L$ interleaved codes. The significant advantage of a feedback or threshold decoder is that the same technique can be employed in the decoder, resulting in a single (time-shared) decoder rather than $L$ decoders, providing feasible implementations for hard-quantized channels, even for protection against error bursts of thousands of bits. Details of feedback decoding are treated extensively in Massey [3], Gallager [14], and Lucky et al. [16].

¹⁸ Performing metric calculations and comparisons serially.
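The burst-spreading property of interleaving described in this section can be checked with a few lines of code. The sketch below is only an illustration of the index arithmetic, not of any encoder or decoder: bit position $t$ is assigned to substream $t \bmod L$, and the helper names `interleave_streams` and `burst_hits_per_stream` are hypothetical.

```python
from typing import List, Sequence


def interleave_streams(bits: Sequence[int], L: int) -> List[List[int]]:
    """Split a sequence into L substreams: stream j gets bits j, L+j, 2L+j, ..."""
    return [list(bits[j::L]) for j in range(L)]


def burst_hits_per_stream(start: int, burst_len: int, L: int) -> List[int]:
    """Count how many positions of a channel burst fall into each substream."""
    counts = [0] * L
    for pos in range(start, start + burst_len):
        counts[pos % L] += 1
    return counts


if __name__ == "__main__":
    L = 8
    print(interleave_streams(list(range(16)), 4))
    # A burst shorter than L touches each substream at most once.
    print(burst_hits_per_stream(start=13, burst_len=7, L=L))
    # A burst longer than L hits some substream more than once.
    print(burst_hits_per_stream(start=13, burst_len=11, L=L))
```

Each of the $L$ decoders therefore sees at most a single isolated error from any burst shorter than $L$, which is the sense in which the burst is "broken up into random errors."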
APPENDIX I

GENERATING FUNCTION FOR STRUCTURE OF A BINARY-TREE CONVOLUTIONAL CODE FOR ARBITRARY K AND ERROR BOUNDS FOR ORTHOGONAL CODES

We derive here the distance-invariant ($D = 1$) generating function $T(L, N)$ for any binary tree ($b = 1$) convolutional code of arbitrary constraint length $K$. It is most convenient in the general case to begin with the finite-state machine state-transition matrix for the linear equations among the state (node) variables. We exhibit this in terms of $N$ and $L$ for a $K = 4$ code as follows:

$$\begin{bmatrix}
1 & 0 & 0 & -NL & 0 & 0 & 0 \\
-L & 1 & 0 & 0 & -L & 0 & 0 \\
-NL & 0 & 1 & 0 & -NL & 0 & 0 \\
0 & -L & 0 & 1 & 0 & -L & 0 \\
0 & -NL & 0 & 0 & 1 & -NL & 0 \\
0 & 0 & -L & 0 & 0 & 1 & -L \\
0 & 0 & -NL & 0 & 0 & 0 & 1 - NL
\end{bmatrix}
\begin{bmatrix} X_{001} \\ X_{010} \\ X_{011} \\ X_{100} \\ X_{101} \\ X_{110} \\ X_{111} \end{bmatrix}
=
\begin{bmatrix} NL \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \tag{78}$$

This pattern can be easily seen to generalize to a $2^{K-1} - 1$ dimensional square matrix of this form for any binary-tree code of constraint length $K$, and in general the generating function

$$T(L, N) = L X_{100 \cdots 0} \tag{79}$$

where $100 \cdots 0$ contains $(K - 2)$ zeros.

From this general pattern it is easily shown that the matrix can be reduced to a dimension of $2^{K-2}$. First combining adjacent rows, from the second to the last, pairwise, one obtains the set of $2^{K-2} - 1$ relations

$$N X_{j_1 j_2 \cdots j_{K-2} 0} = X_{j_1 j_2 \cdots j_{K-2} 1} \tag{80}$$

where $j_1, j_2, \ldots, j_{K-2}$ runs over all binary vectors except for the all zeros. Substitution of (80) into (78) yields a $2^{K-2}$-dimensional matrix equation. The result for $K = 4$ is

$$\begin{bmatrix}
1 & 0 & -L & 0 \\
-NL & 1 & -NL & 0 \\
0 & -L & 1 & -L \\
0 & -NL & 0 & 1 - NL
\end{bmatrix}
\begin{bmatrix} X_{001} \\ X_{011} \\ X_{101} \\ X_{111} \end{bmatrix}
=
\begin{bmatrix} NL \\ 0 \\ 0 \\ 0 \end{bmatrix}. \tag{81}$$

Defining the new variable

$$X'_{00 \cdots 01} = NL\, X_{00 \cdots 01} + X_{00 \cdots 11} \tag{82}$$

(which corresponds to adding the second row to $NL$ times the first), we obtain finally a $2^{K-2} - 1$ dimensional matrix equation, which for $K = 4$ is

$$\begin{bmatrix}
1 & -(NL + NL^2) & 0 \\
-L & 1 & -L \\
-NL & 0 & 1 - NL
\end{bmatrix}
\begin{bmatrix} X'_{001} \\ X_{101} \\ X_{111} \end{bmatrix}
=
\begin{bmatrix} N^2 L^2 \\ 0 \\ 0 \end{bmatrix}. \tag{83}$$

Note that (83) is the same as (78) for $K$ reduced by unity, but with modifications in two places, both in the first row; namely, the first component on the right side is squared, and the middle term of the first row is reduced by an amount $NL^2$. Although we have given the explicit result only for $K = 4$, it is easily seen to be valid for any $K$.

Since in all respects, except these two, the matrix after this sequence of reductions is the same as the original but with its dimension reduced corresponding to a reduction of $K$ by unity, we may proceed to perform this sequence of reductions again. The steps will be the same except that now in place of (80), we have

(80')

and in place of (82)

(82')

while in place of (81) the right of center term of the first row is $-(L + L^2)$ and the first component on the right side is $N^2 L^2$. Similarly in place of (83) the center term of the first row is $-(NL + NL^2 + NL^3)$ and the first component on the right side is $N^3 L^3$. Continuing in this manner, after $K - 3$ such reductions and one further substitution of the form (80), we are left with

$$\begin{bmatrix}
1 & -(L + L^2 + \cdots + L^{K-2}) \\
-NL & 1 - NL
\end{bmatrix}
\begin{bmatrix} X^{(K-3)}_{00 \cdots 01} \\ X_{11 \cdots 1} \end{bmatrix}
=
\begin{bmatrix} (NL)^{K-2} \\ 0 \end{bmatrix} \tag{84}$$

whence it follows that

$$X_{11 \cdots 1} = \frac{(NL)^{K-1}}{1 - N(L + L^2 + \cdots + L^{K-1})}.$$

Applying (79) and the $K - 2$ extensions of (80) and (80') we find

$$T(L, N) = L X_{100 \cdots 00} = L N^{-1} X_{100 \cdots 01} = L N^{-2} X_{100 \cdots 011} = \cdots = L N^{-(K-2)} X_{11 \cdots 1} \tag{85}$$

$$= \frac{N L^K}{1 - N(L + L^2 + \cdots + L^{K-1})} = \frac{N L^K (1 - L)}{1 - L(1 + N) + N L^K}. \tag{86}$$

If we require only the path length structure, and not the number of bit errors corresponding to any incorrect path, we may set $N = 1$ in (86) and obtain

$$T(L) = \frac{L^K (1 - L)}{1 - 2L + L^K}. \tag{87}$$

If we denote as an upper bound an expression which is the generating function of more paths than exist in our state diagram, we have

(88)

As an additional application of this generating function technique, we now obtain bounds on $P_E$ and $P_B$ for the class of orthogonal convolutional (tree) codes introduced by Viterbi [10]. For this class of codes, to each of the $2^K$ branches of the state diagram for constraint length $K$ there corresponds one of $2^K$ orthogonal signals. Given that each signal is orthogonal to all others in $n \ge 1$ dimensions, corresponding to $n$ channel symbols or transmission times (as, for example, if each signal consists of $n$ different pulses out of $2^K$ possible positions), then the weight of each branch is $n$. Consequently, if we replace $L$, the path length enumerator, by $D^n$ in (86) we obtain for orthogonal codes

$$T(D, N) = \frac{N D^{nK} (1 - D^n)}{1 - D^n (1 + N) + N D^{nK}}. \tag{89}$$

Then using (48) and (49), the first-event error probability for orthogonal codes is bounded by

$$P_E < \frac{D_0^{nK} (1 - D_0^n)}{1 - 2 D_0^n + D_0^{nK}} \tag{90}$$

and the bit error probability bound is

$$P_B < \left. \frac{dT(D, N)}{dN} \right|_{N=1,\, D=D_0} = \frac{D_0^{nK} (1 - D_0^n)^2}{(1 - 2 D_0^n + D_0^{nK})^2} < \frac{D_0^{nK} (1 - D_0^n)^2}{(1 - 2 D_0^n)^2} \tag{91}$$

where $D_0$ is a function of the channel transition probabilities or energy-to-noise ratio and is given by (46).
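The closed form (86) can be checked numerically against the state equations from which it is derived. The sketch below builds the linear system of the form (78) for an arbitrary constraint length, solves it exactly in rational arithmetic, and compares $L X_{100 \cdots 0}$ with (86). It rests on assumptions of this example rather than on anything stated in the text: states are encoded as $(K-1)$-bit integers with the newest input bit in the least significant position, and the function names `transfer_function` and `closed_form` are hypothetical.

```python
from fractions import Fraction


def transfer_function(K: int, L: Fraction, N: Fraction) -> Fraction:
    """Solve the state equations of the form (78) and return T = L * X_{100...0}."""
    m = (1 << (K - 1)) - 1              # number of nonzero states
    mask = m                            # keeps the K-1 most recent bits
    # Augmented matrix: m equations, m unknowns, last column is the right side.
    A = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for s in range(1, m + 1):
        A[s - 1][s - 1] += 1
    for s in range(0, m + 1):
        for u in (0, 1):
            t = ((s << 1) | u) & mask   # next state after shifting in bit u
            if t == 0:
                continue                # remerging with (or staying on) the zero path
            gain = L * (N if u else 1)  # every branch has length 1; a '1' input adds N
            if s == 0:
                A[t - 1][m] += gain     # source term from the all-zeros state
            else:
                A[t - 1][s - 1] -= gain
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(m):
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        inv = 1 / A[col][col]
        A[col] = [v * inv for v in A[col]]
        for r in range(m):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    x_100 = A[(1 << (K - 2)) - 1][m]    # state 100...0
    return L * x_100


def closed_form(K: int, L: Fraction, N: Fraction) -> Fraction:
    """Right-hand side of (86)."""
    return N * L**K * (1 - L) / (1 - L * (1 + N) + N * L**K)


if __name__ == "__main__":
    L, N = Fraction(1, 3), Fraction(1, 2)
    for K in range(2, 7):
        print(K, transfer_function(K, L, N), closed_form(K, L, N))
```

Because the arithmetic is exact, agreement of the two columns at a generic rational point is a strong consistency check, although it is of course not a proof of (86).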
ACKNOWLEDGMENT

The author gratefully acknowledges the considerable stimulation he has received over the course of writing the several versions of this paper from Dr. J. A. Heller, whose recent work strongly complements and enhances this effort, for numerous discussions and suggestions and for his assistance in its presentation at the Linkabit Corporation "Seminars on Convolutional Codes." This tutorial approach owes part of its origin to Dr. G. D. Forney, Jr., whose imaginative and perceptive reinterpretation of my original work has aided immeasurably in rendering it more comprehensible. Also, thanks are due to Dr. J. K. Omura for his careful and detailed reading and correction of the manuscript during his presentation of this material in the UCLA graduate course on information theory.

REFERENCES
[1] P. Elias, "Coding for noisy channels," in 1956 IRE Nat. Conv. Rec., vol. 3, pt. 4, pp. 37-46.
[2] J. M. Wozencraft, "Sequential decoding for reliable communication," in 1957 IRE Nat. Conv. Rec., vol. 5, pt. 2, pp. 11-25.
[3] J. L. Massey, Threshold Decoding. Cambridge, Mass.: M.I.T. Press, 1963.
[4] R. M. Fano, "A heuristic discussion of probabilistic decoding," IEEE Trans. Inform. Theory, vol. IT-9, Apr. 1963, pp. 64-74.
[5] R. G. Gallager, "A simple derivation of the coding theorem and some applications," IEEE Trans. Inform. Theory, vol. IT-11, Jan. 1965, pp. 3-18.
[6] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.
[7] K. S. Zigangirov, "Some sequential decoding procedures," Probl. Peredach. Inform., vol. 2, no. 4, 1966, pp. 13-25.
[8] C. E. Shannon, R. G. Gallager, and E. R. Berlekamp, "Lower bounds to error probability for coding on discrete memoryless channels," Inform. Contr., vol. 10, 1967, pt. I, pp. 65-103, pt. II, pp. 522-552.
[9] A. J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Trans. Inform. Theory, vol. IT-13, Apr. 1967, pp. 260-269.
[10] A. J. Viterbi, "Orthogonal tree codes for communication in the presence of white Gaussian noise," IEEE Trans. Commun. Technol., vol. COM-15, Apr. 1967, pp. 238-242.
[11] I. M. Jacobs, "Sequential decoding for efficient communication from deep space," IEEE Trans. Commun. Technol., vol. COM-15, Aug. 1968, pp. 492-501.
[12] G. D. Forney, Jr., "Coding system design for advanced solar missions," submitted to NASA Ames Res. Ctr. by Codex Corp., Watertown, Mass., Final Rep., Contract NAS2-3637, Dec. 1967.
[13] J. L. Massey and M. K. Sain, "Inverses of linear sequential circuits," IEEE Trans. Comput., vol. C-17, Apr. 1968, pp. 330-337.
[14] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[15] T. N. Morrissey, "Analysis of decoders for convolutional codes by stochastic sequential machine methods," Univ. Notre Dame, Notre Dame, Ind., Tech. Rep. EE-682, May 1968.
[16] R. W. Lucky, J. Salz, and E. J. Weldon, Principles of Data Communication. New York: McGraw-Hill, 1968.
[17] J. K. Omura, "On the Viterbi decoding algorithm," IEEE Trans. Inform. Theory, vol. IT-15, Jan. 1969, pp. 177-179.
[18] F. Jelinek, "Fast sequential decoding algorithm using a stack," IBM J. Res. Dev., vol. 13, no. 6, Nov. 1969, pp. 670-685.
[19] E. A. Bucher and J. A. Heller, "Error probability bounds for systematic convolutional codes," IEEE Trans. Inform. Theory, vol. IT-16, Mar. 1970, pp. 219-224.
[20] J. P. Odenwalder, "Optimal decoding of convolutional codes," Ph.D. dissertation, Dep. Syst. Sci., Sch. Eng. Appl. Sci., Univ. California, Los Angeles, 1970.
[21] G. D. Forney, Jr., "Coding and its application in space communications," IEEE Spectrum, vol. 7, June 1970, pp. 47-58.
[22] G. D. Forney, Jr., "Convolutional codes I: Algebraic structure," IEEE Trans. Inform. Theory, vol. IT-16, Nov. 1970, pp. 720-738; "II: Maximum likelihood decoding," and "III: Sequential decoding," IEEE Trans. Inform. Theory, to be published.
[23] W. J. Rosenberg, "Structural properties of convolutional codes," Ph.D. dissertation, Dep. Syst. Sci., Sch. Eng. Appl. Sci., Univ. California, Los Angeles, 1971.
[24] J. A. Heller and I. M. Jacobs, "Viterbi decoding for satellite and space communication," this issue, pp. 835-848.
[25] A. R. Cohen, J. A. Heller, and A. J. Viterbi, "A new coding technique for asynchronous multiple access communication," this issue, pp. 849-855.
Andrew J. Viterbi (S'54-M'58-SM'63) was born in Bergamo, Italy, on March 9, 1935. He received the B.S. and M.S. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1957, and the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, in 1962. While attending M.I.T., he participated in the cooperative program at the Raytheon Company. In 1957 he joined the Jet Propulsion Laboratory where he became a Research Group Supervisor in the Communications Systems Research Section. In 1963 he joined the faculty of the University of California, Los Angeles, as an Assistant Professor. In 1965 he was promoted to Associate Professor and in 1969 to Professor of Engineering and Applied Science. He was a cofounder in 1968 of Linkabit Corporation, of which he is presently Vice President.
Dr. Viterbi is a member of the Editorial Boards of the PROCEEDINGS OF THE IEEE and of the journal Information and Control. He is a member of Sigma Xi, Tau Beta Pi, and Eta Kappa Nu and has served on several governmental advisory committees and panels. He is the coauthor of a book on digital communication and author of another on coherent communication, and he has received three awards for his journal publications.
An Adaptive Decision Feedback Equalizer
DONALD A. GEORGE, MEMBER, IEEE, ROBERT R. BOWEN, MEMBER, IEEE, AND JOHN R. STOREY, MEMBER, IEEE
Abstract-An adaptive decision feedback equalizer to detect digital information transmitted by pulse-amplitude modulation (PAM) through a noisy dispersive linear channel is described, and its performance through several channels is evaluated by means of analysis, computer simulation, and hardware simulation. For the channels considered, the performance of both the fixed and the adaptive decision feedback equalizers is found to be notably better than that obtained with a similar linear equalizer. The fixed equalizer, which may be used when the channel characteristics are known, exhibits performance which is close to that of the optimum, but impractical, Bayesian receiver and is considerably superior to that of the linear equalizer. The adaptive decision feedback equalizer, which is used when the channel impulse response is unknown or time varying, has a better transient and steady-state performance than the adaptive linear equalizer. The sensitivity of the receiver structure to adjustment and quantization errors is not pronounced.
Paper approved by the Data Communication Systems Committee of the IEEE Communication Technology Group for publication after presentation at the 1970 IEEE International Conference on Communications, San Francisco, Calif., June 8-10. Manuscript received September ~5, 1970; revised December 23, 1970.
D. A. George is with the Department of Electrical Engineering, Carleton University, Ottawa, Ont., Canada.
R. R. Bowen and J. R. Storey are with Communications Research Center, Department of Communications, Ottawa, Ont., Canada.

I. INTRODUCTION

AN ADAPTIVE decision feedback equalizer is described, and its performance is discussed in this paper. The equalizer is used to recover a sequence of digits that has been transmitted at a high rate over a noisy dispersive linear communications channel by some linear modulation process. The channel is used efficiently by sending the digital information at such a high rate that there is intersymbol interference at the receiver input between several successive digits. The receiver is able to combat both the additive noise and the intersymbol interference, and also to adapt itself to an unknown or slowly varying channel without the aid of a training digit sequence. Thus it can "track" a continual slow drift in channel characteristics without interrupting the message transmission. Past decisions about the digits are used in minimizing the intersymbol interference by coherently subtracting the interference from previously detected digits, and also are used in adapting the equalizer parameters to a change in channel characteristics. It is shown that this receiver is insensitive to quantization of the input signal and quantization and adjustment of its own parameters, and so can be constructed at reasonable cost.

The adaptive linear transversal equalizer [1]-[3] has been developed in recent years to accomplish the task outlined previously. With that receiver it has been possible to utilize unknown or slowly varying dispersive channels much more effectively than was possible with fixed lumped-parameter equalizers. Concurrently, however, it has been shown [4]-[7] that the statistically optimum receiver for the recovery of the digit sequence, when the dispersive channel is known, is nonlinear. At high data rates the performance of this receiver is much better than that possible with the transversal equalizer, which is the optimum linear receiver. Unfortunately, the statistically optimum receiver is very complex when there is a large amount of intersymbol interference, and is not practical with today's technology. This suggests that one seek a statistically suboptimum receiver that is practical and has a performance that is significantly better than that of any receiver that is constrained to be linear.

A decision feedback equalizer, described by Austin [8], is such a receiver. It is shown in Fig. 1. This equalizer is not adaptive but an adaptive version may readily be obtained, as is shown in this paper. The decision feedback equalizer is similar to the transversal equalizer in that both have a filter matched to the isolated received pulse, followed by a baud-rate tapped delay line. However, it makes use of the fact that at the transversal equalizer output there is intersymbol interference caused by both undetected digits and previously detected digits. If the previous decisions are correct, they can be used to coherently subtract the intersymbol interference caused by the previously detected digits. This is done by passing the past decisions through the feedback tapped delay line. The feedback delay line tap values are chosen on the assumption that these past decisions are all correct.
The matched filter and the forward tapped delay line are used to minimize the effects of the additive noise and the intersymbol interference from undetected digits. Errors at the output of this equalizer occur in bursts, of course, because a decision error in the feedback delay line tends to cause yet more incorrect decisions. However, the equalizer is able to recover spontaneously from this condition. Simulation studies show that the performance of the decision feedback equalizer can be considerably better than that of the linear equalizer even though its output errors occur in bursts.

In Section II of this paper the decision feedback equalizer is described, and its performance is compared with the performance of a number of other receivers. This comparison is done both analytically and by digital computer simulation. It is shown by example that the decision feedback equalizer is an attractive compromise between what is theoretically possible and what is now in use. Next, in Section III, it is demonstrated that the decision feedback equalizer may be made adaptive to an unknown channel. Two different adaptation algorithms are described and compared by hardware simulation studies. In the cases considered, the adaptive decision feedback equalizer significantly outperformed the adaptive linear equalizer, and a training sequence was not required for adaptation. Rather, the decisions can be used for adaptation as well as to coherently subtract the intersymbol interference. Finally, in Section IV, the practical nature of the equalizer is demonstrated by showing that the number of delay line taps that it requires is modest, and that its digital implementation requires no finer quantization than the linear equalizer.

Fig. 1. Decision feedback equalizer. (Block diagram: matched filter, forward tapped delay line, feedback tapped delay line.)

Reprinted from IEEE Transactions on Communication Technology, vol. COM-19, no. 3, June 1971. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
II. FIXED EQUALIZER
In this section the fixed nonadaptive decision feedback equalizer for a known dispersive channel will be examined. The error rate of this equalizer is a lower bound on the error rate of an equalizer that must also adapt to random changes in the channel characteristics.

The basic assumption made in deriving the receiver is that the decisions made by the receiver as to the transmitted signal samples are essentially correct. Given the bit error rate requirements in modern communication systems, this is a valid assumption. It is furthermore assumed that the analog signal from the communications channel has been demodulated, filtered, and sampled at the digit baud rate with the appropriate phase. Previous work [1], [8] has indicated the desirability of a matched filter before the sampler, as shown in Fig. 1. In the approach taken in this paper, any suitable band-limiting filter may be used, at the price of some loss in performance. It remains to determine the tap gains $\{a(k);\ k = 0,1,\ldots,N\}$ and $\{b(m);\ m = 1,2,\ldots,M\}$, as illustrated in Fig. 1. In the adaptive form of the equalizer these taps are automatically adjusted; for the fixed equalizer the tap gains must be calculated and manually adjusted after the channel characteristics are determined.

The equalizer makes the estimate

$$\hat{\theta}(j) = \sum_{k=0}^{N} a(k)\,y(j+k) - \sum_{l=1}^{M} b(l)\,\tilde{\theta}(j-l) \tag{1}$$

about $\theta(j)$, the digit that is sent at time $t = jT$, and then converts this estimate to a final decision $\tilde{\theta}(j)$ with a nonlinear memoryless circuit. (If the digits $\{\theta(j)\}$ are binary this circuit is a clipping circuit with zero bias. If the digits are m-ary the circuit is an m-output quantizer.) In (1), $y(j+k)$ is the output of the initial filter at time $t = (j+k)T$.

One method of choosing the tap values is to adjust for the minimization of the probability that $\tilde{\theta}(j) \neq \theta(j)$. However, this direct optimization is difficult because an analytical expression for the error probability in terms of the equalizer tap values is not known. A practicable way to "optimize" the tap values is to choose them such that the output mean-square error $E[e^2(j)]$ is minimized, where

$$e(j) = \hat{\theta}(j) - \theta(j). \tag{2}$$

As shown later, this leads to a set of linear equations that specify the tap values. This method of optimization is also attractive because it can be used to make the equalizer self-optimizing or adaptive to an unknown or a slowly varying channel. While this optimization does not minimize the digit error probability directly, computer simulation studies [6] have shown that the probability density function of the error $e(j)$ is close to Gaussian, and so the two performance criteria are similar. The process of determining the tap values starts with the evaluation of the mean-square error:
$$E[e^2(j)] = E\big[\{\hat{\theta}(j) - \theta(j)\}^2\big] = E\Big[\Big\{\sum_{k=0}^{N} a(k)\,y(j+k) - \sum_{l=1}^{M} b(l)\,\tilde{\theta}(j-l) - \theta(j)\Big\}^2\Big] \tag{3}$$
where the signal sample $y(j+k)$ is

$$y(j+k) = \sum_{i=0}^{M} \theta(j+k-i)\,x(i) + n(j+k) \tag{4}$$

and $x(i)$ is the value of the impulse response of the linear modulator, the channel, and the initial receiver filter after a delay $iT$, and $n(j+k)$ is the additive noise at the output of the initial filter at time $t = (j+k)T$. The forward tap $a(n)$ is optimum when
$$\frac{\partial E[e^2(j)]}{\partial a(n)} = 2E\Big[\Big\{\sum_{k=0}^{N} a(k)\,y(j+k) - \sum_{l=1}^{M} b(l)\,\tilde{\theta}(j-l) - \theta(j)\Big\}\,y(j+n)\Big] = 2E[e(j)\,y(j+n)] = 0. \tag{5}$$

Similarly, the feedback tap $b(m)$ is optimum when

$$\frac{\partial E[e^2(j)]}{\partial b(m)} = -2E\Big[\Big\{\sum_{k=0}^{N} a(k)\,y(j+k) - \sum_{l=1}^{M} b(l)\,\tilde{\theta}(j-l) - \theta(j)\Big\}\,\tilde{\theta}(j-m)\Big] = -2E[e(j)\,\tilde{\theta}(j-m)] = 0, \qquad m = 1,2,\ldots,M. \tag{6}$$

The $M + N + 1$ equations (5) and (6) can be written as a set of $M + N + 1$ linear equations with the unknowns $a(i)$, $i = 0,1,\ldots,N$ and $b(k)$, $k = 1,2,\ldots,M$. This is done by reversing the order of the summation and the averaging in (5) and (6) and by assuming that $\tilde{\theta}(j) = \theta(j)$ and that $E[\theta(j)\,\theta(j+k)]$ is zero if $k \neq 0$. The resulting equations are

$$\sum_{i=0}^{N} a(i)\,\{\psi(i,k) + \phi_n(k-i)\} = x(k), \qquad k = 0,1,\ldots,N \tag{7}$$

and

$$b(m) = \sum_{i=0}^{N} a(i)\,x(m+i), \qquad m = 1,2,\ldots,M \tag{8}$$

where $\phi_n(k-i)$ is the autocorrelation function of the noise $n(t)$ at delay $\tau = (k-i)T$, and $\psi(i,k)$ is defined by the equation

$$\psi(i,k) \triangleq \sum_{l=0}^{i} x(l)\,x(l+k-i). \tag{9}$$

In the particular case where a matched filter is used ahead of the tapped delay lines, $x(i)$ is the autocorrelation function of the impulse response of the modulator and channel in cascade, and (7)-(9) become equivalent to those given by Austin [8].

In general, it is quite difficult to calculate the digit error probability at the equalizer output. The calculation is particularly difficult because the assumption that all past digits are strictly correct will, of course, be violated, and the errors may tend to occur in bursts. Nonetheless, some idea of the improved performance of the decision feedback equalizer over the transversal equalizer can be obtained by assuming an ideal equalizer with an infinite number of taps and a matched filter in an environment of white additive noise with spectral density $N_0$. In this case the equations for the optimum tap values $c(j)$ of a transversal equalizer are

$$\sum_{j=-\infty}^{\infty} c(j)\,\{\phi_q(j-k) + N_0\,\delta(j-k)\} = \delta(k), \qquad \text{for all } k \tag{10}$$

where $\delta(\cdot)$ is the Kronecker delta function and

$$\phi_q(j-k) \triangleq \int_{-\infty}^{\infty} q(\tau)\,q(\tau - (j-k)T)\,d\tau \tag{11}$$

where $q(\tau)$ is the impulse response of the modulator and the channel in cascade. The mean-square error of the output of this equalizer is

$$E[e^2(j)] = N_0\,c(0). \tag{12}$$

In the same situation the forward taps of the decision feedback equalizer are given by

$$\sum_{j=0}^{\infty} a(j)\,\{\phi_q(j-m) + N_0\,\delta(j-m)\} = \delta(m), \qquad \text{for } m \geq 0. \tag{13}$$

As before, the feedback tap values are given by (8). If the past decisions $\{\tilde{\theta}(j-m)\}$ are all correct, the decision feedback equalizer output mean-square error is

$$E[e^2(j)] = N_0\,a(0). \tag{14}$$

The performances of the two equalizers can now be compared by comparing (12) with (14) through the medium of an "equivalent received pulse" which has samples $p(i)$ at the sampling instants $t = iT$. Use of sampled data notation and the z transform is convenient here. The transformed pulse shape $P(z)$ is defined to be

$$P(z) \triangleq \sum_{i=0}^{\infty} p(i)\,z^{-i}. \tag{15}$$
With this notation, the equivalent received pulse for the problem at hand is given by

$$P(z)\,P(z^{-1}) = \Phi_q(z) + N_0 \tag{16}$$

where

$$\Phi_q(z) = \sum_{i=-\infty}^{\infty} \phi_q(i)\,z^{-i}. \tag{17}$$

The quantity of significance here is the inverse of the "equivalent received pulse," which is defined by

$$R(z) = \frac{1}{P(z)} \tag{18}$$

and the $r(i)$ can be determined by simple long division. Substituting the results of this series of definitions back into (12) and (14) gives the mean-square error

$$E[e^2(j)] = N_0 \sum_{i=0}^{\infty} r^2(i) \tag{19}$$

for the transversal equalizer and

$$E[e^2(j)] = N_0\,r^2(0) \tag{20}$$

for the decision feedback equalizer. It should be noted that these results are for an idealized situation, and as such form a lower bound on the actual mean-square error; however, they do allow ready comparisons. For example, for the simple case where $p(i) = P\exp(-i\gamma)$, $i \geq 0$, the advantage of the decision feedback equalizer is limited to 3 dB, since

$$\sum_{i=0}^{\infty} r^2(i) = P^{-2}\{1 + \exp(-2\gamma)\} \tag{21}$$

and

$$r^2(0) = P^{-2}. \tag{22}$$

Fig. 2. Comparison of receiver error probabilities as functions of signal to background noise ratio. (Error probability $P_e$ versus $E/N_0$ in dB; pulse $q(\tau) = \tau e^{-\tau}$, baud interval $T = 1.0$.)
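The comparison in (19)-(22) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it carries out the long division of (18) on a truncated pulse and compares (19) with (20). The function names `inverse_series` and `mse_ratio_db`, and the values chosen for P and gamma, are illustrative assumptions.

```python
import math

def inverse_series(p, n_terms):
    """First n_terms coefficients r(i) of R(z) = 1/P(z), by long division.

    p holds the samples p(0), p(1), ... of the equivalent received pulse.
    The recursion enforces sum_k p(k) r(i-k) = delta(i).
    """
    r = []
    for i in range(n_terms):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, min(i, len(p) - 1) + 1):
            acc -= p[k] * r[i - k]
        r.append(acc / p[0])
    return r

def mse_ratio_db(p, n_terms):
    """Ratio of (19) to (20) in dB: transversal MSE over decision feedback MSE."""
    r = inverse_series(p, n_terms)
    ratio = sum(v * v for v in r) / (r[0] * r[0])
    return 10.0 * math.log10(ratio)

# Exponential pulse p(i) = P exp(-i*gamma); P and gamma are arbitrary choices.
P, gamma, n = 2.0, 0.5, 40
p = [P * math.exp(-i * gamma) for i in range(n)]
r = inverse_series(p, n)
ratio = sum(v * v for v in r) / (r[0] * r[0])
print("ratio of (19) to (20):", ratio)
print("advantage in dB:", mse_ratio_db(p, n))
```

Because only the first two coefficients of the inverse are nonzero for this pulse, the ratio equals 1 + exp(-2*gamma), which can never exceed 2; that is the 3-dB limit stated above.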
In contrast, if the background noise spectral density $N_0$ is small and the actual pulse $q(\tau)$ is a rectangular pulse of length $L$, where the interpulse period $T$ is $\alpha L$, $1/2 \leq \alpha \leq 1$, then the ratio of the mean-square errors is

$$\frac{\sum_{i=0}^{\infty} r^2(i)}{r^2(0)} = \frac{1}{1-\beta^2} \tag{23}$$

where

$$\frac{\beta}{1+\beta^2} = 1 - \alpha. \tag{24}$$

The ratio becomes very large as $\alpha \to 1/2$. Thus the advantage achieved by using the decision feedback equalizer depends on the channel impulse response, and can, in some cases, be quite large.

The performances of the equalizers were also compared by Monte Carlo simulation of the two equalizers on a digital computer. In addition, the statistically optimum or Bayesian demodulator [6] was simulated to determine how close the performances of the statistically suboptimum, but much simpler, equalizers were to the optimum performance. In one series of simulation tests the isolated received pulse was $A\tau e^{-\alpha\tau}$ and the additive noise was Gaussian and white with power spectral density $N_0$. The message was a sequence of independent binary digits. The measured error probabilities at the output of the decision feedback equalizer, the linear equalizer, the Bayesian demodulator, and a matched filter with clipper are shown in Fig. 2, as a function of the signal to background noise level $E/N_0$. ($E$ is the isolated pulse energy, equal to $\phi_q(0)$ of (11).) Also shown is the error function curve, the performance that one could achieve if there was no intersymbol interference. In this series of tests the channel parameter $\alpha$ was unity. A filter matched to $A\tau e^{-\alpha\tau}$ was used as part of both the linear and the decision feedback equalizer. Approximately one hundred errors were observed in making each error probability measurement.

A sequence of tests was done with different numbers of forward and feedback equalizer taps. It was found that the linear equalizer performance improved as the number of taps was increased to 5, but a further increase did not appreciably improve the performance. In the same way, 3 forward taps and 6 feedback taps were found to suffice for the decision feedback equalizer. (The question of the number of taps is discussed in more detail in Section IV.) As expected, it was observed that the errors at the decision feedback equalizer output occurred in short bursts. At low signal to background noise ratios, $\leq 6$ dB in this case, the bursts occur so frequently that the linear equalizer performance is better. However, at higher signal-to-noise ratios (SNRs) the improved ability of the decision feedback equalizer to reduce the intersymbol interference is more important than the tendency to produce errors in bursts.

Fig. 3. Effect of data transmission rate change on receiver performances. (Curves for the linear equalizer, the decision feedback equalizer, and the Bayesian demodulator at $P_e = 10^{-3}$, plotted against $10\log_{10}$ of the transmission rate $R$.)

Fig. 4. Impulse response of coaxial cable PCM channel. (Horizontal axis: time, nanoseconds.)

It was found that at low error probabilities, wherever the digit error probability was less than $10^{-2}$, the performance of the decision feedback equalizer, the linear equalizer, and the Bayesian demodulator could all be accurately described by the empirical formula
$$P[e] = 0.5\left\{1 - \operatorname{erf}\left[\sqrt{\frac{\eta(R)\,E}{2N_0}}\,\right]\right\} \tag{25}$$

where $R$ is the data transmission rate, defined for this example to be equal to $(\alpha T)^{-1}$. $\eta(R)$ may be considered to be the efficiency of the modem, and must be in the range $0 \leq \eta(R) \leq 1.0$. In all cases $\eta \to 0$ as $R \to \infty$ and $\eta \to 1.0$ as $R \to 0$. $\eta(R)$ for the three demodulators, measured at $10^{-3}$ digit error probability, is shown in Fig. 3 as a function of $R$. At all transmission rates the efficiency of the decision feedback equalizer was greater than that of the linear equalizer and less than that of the Bayesian demodulator. At high transmission rates the efficiency of both nonlinear receivers decreased by 4.0 dB when the rate was doubled. The efficiency of the Bayesian demodulator was 2.0 dB better than that of the decision feedback equalizer at all high rates. In contrast, the efficiency of the linear equalizer decreased by 9.0 dB when the rate was doubled. This difference in rate of efficiency decrease becomes very important, of course, if the channel is used at very high rates and at high SNRs.

The usefulness of the decision feedback equalizer in a coaxial cable pulse-code modulation (PCM) link was also evaluated with a computer Monte Carlo simulation program. In this series of tests the "channel" included the transmitter, a solid coaxial cable with an air dielectric, and a fixed lumped-parameter equalizer. This channel has no dc response, a 100-MHz bandwidth at the -3.0-dB points, a 240-MHz bandwidth at the -20.0-dB points, and a 70.0-dB per octave roll-off at higher frequencies. (This contrasts with the previous example, where the roll-off was 12.0 dB per octave.) The channel impulse response is shown in Fig. 4. The nominal data rate through this channel without further equalization is 225 Mbit/s. Simulation tests showed that inclusion of a linear transversal equalizer with a matched filter would allow one to increase the data rate to 400 Mbit/s, but not beyond. In contrast, the decision feedback equalizer with a matched filter can be used at 450 Mbit/s with only 6.0 dB more signal strength than that required at low data rates, and at even higher rates if the signal strength is increased further. The efficiency $\eta(R)$ of the linear and the decision feedback equalizer for this channel example, measured at $10^{-3}$ digit error probability, is shown in Fig. 5.

Fig. 5. Effect of data transmission rate change through PCM channel. (Coaxial PCM channel, error probability $10^{-3}$; curves for the linear equalizer and the decision feedback equalizer against data rate $R$ in megabits/second.)

Thus both the performance analysis and the simulation results indicate that the decision feedback equalizer performance is considerably better than that of the linear equalizer. Moreover, in the examples in which the much more complex statistically optimum demodulator was also evaluated, the decision feedback equalizer performance was quite close to this limiting performance. However, these results cannot be extended to other channels without either a simulation study in each case or the development of an appropriate analysis technique.
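The tap calculation of this section can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions, not the authors' simulation program: the noise samples are taken to be uncorrelated with variance `noise_var`, so that the noise autocorrelation in (7) reduces to a diagonal term; the sampled response x(i) is causal and no longer than the feedback line; and all function names are hypothetical.

```python
def solve_linear(A, rhs):
    """Solve A u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (aug[r][n] - s) / aug[r][r]
    return u

def x_at(x, i):
    """Sampled response x(i), taken as zero outside the stored range."""
    return x[i] if 0 <= i < len(x) else 0.0

def dfe_taps(x, n_fwd, n_fbk, noise_var):
    """Forward taps a(0..N) from (7) and (9), feedback taps b(1..M) from (8)."""
    def psi(i, k):  # eq. (9)
        return sum(x_at(x, l) * x_at(x, l + k - i) for l in range(i + 1))
    A = [[psi(i, k) + (noise_var if i == k else 0.0) for i in range(n_fwd + 1)]
         for k in range(n_fwd + 1)]
    rhs = [x_at(x, k) for k in range(n_fwd + 1)]
    a = solve_linear(A, rhs)
    b = [sum(a[i] * x_at(x, m + i) for i in range(n_fwd + 1))
         for m in range(1, n_fbk + 1)]
    return a, b

def channel(digits, x):
    """Noise-free samples of (4): y(j) = sum_i theta(j - i) x(i)."""
    n = len(digits) + len(x) - 1
    return [sum(x[i] * digits[j - i] for i in range(len(x))
                if 0 <= j - i < len(digits)) for j in range(n)]

def dfe_detect(y, a, b, n_digits):
    """Apply (1) digit by digit, with a zero-bias clipper as the decision circuit."""
    decisions = []
    for j in range(n_digits):
        est = sum(a[k] * (y[j + k] if j + k < len(y) else 0.0)
                  for k in range(len(a)))
        est -= sum(b[m - 1] * decisions[j - m]
                   for m in range(1, len(b) + 1) if j - m >= 0)
        decisions.append(1 if est >= 0 else -1)
    return decisions

# Two-sample response, two forward taps (N = 1), one feedback tap (M = 1).
a, b = dfe_taps([1.0, 0.5], n_fwd=1, n_fbk=1, noise_var=0.1)
print("forward taps:", a, "feedback taps:", b)
```

With `noise_var` set to zero the solution collapses to a single unit forward tap and a feedback tap equal to the trailing sample of the pulse, which is the intuitive noise-free answer for this causal two-sample response.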
... /
TYPICAL RESPONSE
"'I:
\
0::
o
a:
\
ct:
w w
a:
(AVERAGE \
«::::> o
rn
"
.....
z « w
RESPONSE
" ..... _ jS:'
~
TIME
r
III.
ADAPTIVE EQUALIZER
Adapt~tion
to step change in channel impulse response.
tap value error is d(k) ~ a(k) - ao(k),
In this section it is shown that the decision feedback equalizer can be made adaptive to unknown or slowly varying channels, i.e., channels in which the impulse response does not change appreciably during the transmission of several hundred digits. The dynamic performance of the decision feedback equalizer, that is, the performance of the equalizer while it is adapting, is described. A method by which the decision feedback equalizer can be made adaptive can be seen from (2), (6), and (7). For any set of tap values
aE[e~(~)J = 2E[e(j)y(j + n)J aa n
Fig. 6.
~
k = 0,1,·· ·,N
b( - k) - bo ( - k) ,
k = -1,· -,-M.
I jet us also define a set of signals {z ( j
z(j
+ i)
+ i)}
i = 0,1,-· ·,N
= O(i +
i = -1,· - ·,-lll.
i),
(29)
by
= y(j + i),
(30)
Then it. can be shown that if the tap values are in error the mean-square error is
M
(26)
_ KNOWN CHANNEL PERFORMANCE LOWER BOUND
ADAPTATION TIME
N
+ L L
i==-ltf k=::-M
d(i)d(k)E[z(j
+ i)z(j +
k)J (31)
and
where eo (j) would be the error if the tap values were all correct. (The assumption was made that B( j - 'In) = ab(1n) 8(j - 1n), in = 1,2,. - ·,M, to derive (31).) Thus the mean-square error is a quadratic function of the tap gain If e (j) of (2) is replaced by errors.iin the same way that the mean-square error of the linear equalizer is a quadratic function of its tap gain (28) e(j) = B(j) - D(j) errors. Because of this, there are no "locally optimum" by assuming that the decisions are correct, then all the . tap gain settings, and a hill-climbing adaptation can be signals in (26) and (27) are available at the receiver. made to readily converge close to the correct set of tap When the error probability is low this substitution does values given by (7) and (8). not change the value of (26) or (27) appreciably. By Of course, the cross correlations E[e(j)y(j + k)J and changing the forward tap values by amounts approxi- E[e( j)8( j - 111,) ] cannot be measured exactly in a finite mately proportional to -E[e(j)y(j + k)], and the feed- time, so any particular sequence of tap adjustments is a back tap values by amounts approximately proportional sample function of a random process. A Robbins-Monro to E[e( j)O( j - 1n)], the taps are automatically adjusted procedure [9J would be applicable if the channel were to near their optimum .values, Thus the forward taps are . unknown but time invariant. However, if the channel is adjusted by means of measurement of the cross correla- slowly varying then the adaptive algorithm must be cation between the error and the input signals, just as for pable of "tracking" a slowly varying channel and of the linear transversal equalizer. On the other hand, ad- "learning" the optimum tap values for an unknown chanjustment of the feedback taps makes use of the cross nel. 
In that case a procedure such as the Robbins-Monro correlation between the error and the output signal, i.e., procedure is not applicable, and a compromise between the decisions. a smaller steady-state mean-square error and a shorter The potential of this type of algorithm can be seen by adaptation time to channel changes is necessary. observing how the mean-square error E[e 2 ( j) ] depends There are a .number of adaptation algorithms available, on tap value errors. Let laCk); k = 0,1,·· ·,N} and in which the exact details of the algorithm are somewhat {b(l); l = 1,2,·· -,M} be the actual tap values, and different, A typical response of an adaptive receiver to a {ao(k);k = O,I,···,N} and {bo(l);l = 1,2, ••• ,M} be the step change in the channel characteristics when any of optimum tap values, specified by (7) and (8). Then the these algorithms is used is shown in Fig. 6. The mean-
aE[e2 ( j) J'
-2E[e(j)8(j - 1n)J.
(27)
403
Fifty Years ofCommunications and Networking
square error at the receiver output is plotted as a func- The tap values are thereby shown to be equally sensitive tion of the .number of baud" intervals after a step change to adjustment, and this point combined with that of the in the channel characteristics. It is a nonstationary ran- previous paragraph .implies that the feedback taps can be dom variable, of course, since it is a function of the equal- adjusted more quickly and/or more accurately than the izer tap values, which in turn are nonstationary random forward taps or the taps of a transversal equalizer. Thus variables. Also shown in Fig. 6 is the average of many : the decision feedback equalizer would be expected to have such adaptation curves. The two most important char- a better adaptation performance than the transversal acteristics of such a curve are the "steady-state" mean- . equalizer. This has also been observed experimentally, 88 square error, and the "adaptation time," the time required : will be described. The advantages of using the cross correlation between to reach the steady-state mean-square error after a specific change in the channel characteristics. "The adaptation the error and the decisions suggest the use of this same curves that were obtained in a series of simulation studies measurement for adjustment of the transversal or forward are discussed later in the paper, but some general obser- tap values. Of course, the taps would not converge to the correct value to minimize the error due to the addivations may be made now. First, since a cross correlation is being measured, the tive noise and the intersymboI interference, but in some variance of the square of the signals involved contribute cases at least the taps converge to a value close to the significantly to the measurement error. Particularly when correct value. 
As these taps may be subject to less error binary signals are involved, there is a notable difference due to the cross correlation measurement, improved adaptbetween the input samples .{ y ( i)} and the output deci- ive behavior can result. Suppose we specify the taps {a (i)} by sions {8( j)} in this regard. In particular, (32) when
E[e(j)8(j
and
N
l: a(i)x(i
E[{y2(j) - E[{y2(j)JJ2J
i-:o
(33)
(The additive noise samples n( j) are assumed to be Gaussian in the caleulation.) Certainly then in the case of strong intersymbol interference, L: q2(i ) is large and so 82( j ) is much less "noisy" than y2(j). Thus an estimate of E[e( j)O( j - m) ] would usually involve less error than an estimate of E[e ( j) y (j k) ]. Whether this implies that the decision feedback equalizer can adapt more rapidly or more closely to the performance possible from a fixed optimum equalizer than can the linear transversal equalizer, and that the feedback taps can adapt more easily than the forward taps, depends on the sensitivity of the tap adjustments. The partial derivatives of (5) and (6) indicate this sensitivity. Since the absolute signal levels are, of course, arbitrary, the decisions 8(j) are taken to be ± 1 and the average signal power E[y2 (j) ] is taken as unity.. This effectively means that the taps {aCk)} and {b(l)} are of the same order of magnitude. This done, the sensitivity of the tap values can be evaluated, giving:
+
o2E[e2 ( J') ]
aa2 (n )
= 2E[Y2( J. + n)J =2
(34)
=
(35)
2.
l
= 0,1,·· ·,N
(36)
rather than by (5). Then the tap values are the solution to the equations
8(j)=±1
= 2{l: q2(i»2 + E2[n2 ( j ) J}.
+ l)J = 0,
- l)
=
o(l),
l
= 0,1,·· ·,N
(37)
rather than to (7). Thus adjustment of the forward taps by cross correlation between the error and the decisions "forces zeros" in the overall transmission characteristic. If the equalizer had an infinite number of correctly spaced taps, specified by (37), the result would be an inverse filter. In the limiting situation of no additive noise and a similarly ideal equalizer, an equalizer adjusted by the minimum mean-square error approach would also yield an inverse filter. Consequently, it is not surprising that in some high SNR situations where effective equalization is being obtained, the two methods give similar results.

A potential difficulty with this "zero-forcing" algorithm is that only as many system impulse response zeros can be forced as there are taps in the delay line, with one additional tap reserved to force a unit response at the desired time. The overall system impulse response can become large both before and after this interval over which the response is forced to zero. In contrast, when the equalizer is adjusted by the minimum output mean-square error approach, the mean-square contribution of the total system impulse response is minimized, not just the response over an interval as large as the equalizer delay line. Note, however, that if the taps of the transversal filter of the decision feedback equalizer are adjusted by the zero-forcing algorithm, adjustment of the taps of the feedback filter will automatically cancel any large impulse response after the main pulse, without causing a large impulse response at an even greater delay. This is the basic idea behind the decision feedback equalizer, based on the assumption that the decisions in the feedback delay line are correct. Thus only the system impulse response before the area in which zeros are forced can contribute to the output error. From this and from considerations of the errors associated with measuring E[e(j)θ(j + k)] and E[e(j)y(j + k)], it is hypothesized that the decision feedback equalizer can use the cross correlation between the error and the decisions to advantage for adjustment of both forward and feedback taps. The experimental results that are described later substantiate this hypothesis.

Fig. 7. [Block diagram: computer-controlled nonrecursive digital filters and filter tap control; labeled signals include e(j) and y(j + N).]

The actual adaptation algorithm that was used in the experimental investigation will now be discussed. As shown, the transversal or forward tap gain should be changed by an amount proportional to a measure of -E[e(j)y(j + k)] or -E[e(j)θ(j + k)], and the feedback tap by an amount proportional to a measure of E[e(j)θ(j - m)]. Actually, rather than taking a fixed finite time average of these products and then changing the tap values, the adaptation is done indirectly with an algorithm similar to that developed by Lucky [2] for adaptation of the linear equalizer. The tap values are changed in the following way.

1) An accumulator for the tap is set equal to zero.
2) Each time a digit is processed, the product e(j)y(j + k) for the forward tap a(k), or -e(j)θ(j - m) for the feedback tap b(m), is added to the accumulator. Only the signs of e(j), y(j + k), and θ(j - m) are used in this calculation to simplify the equalizer synthesis.
3) If the accumulator contents exceed a threshold V, then the tap value is decreased by Δ and step 1 is repeated. If the contents become less than -V, the tap value is increased by Δ and step 1 is repeated. If the contents remain between +V and -V, then step 2 is repeated.

In the alternate procedure previously discussed, the e(j)y(j + k) of step 2 are replaced by e(j)θ(j + k). Both the transversal and decision feedback equalizer were tested using each of these adaptation algorithms.
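The three adaptation steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the class name `TapAdapter`, the helper `sign`, the treatment of zero as positive, and the default values of the threshold and step are assumptions made for the example.

```python
def sign(x):
    """Hard limiter: +1 for x >= 0, -1 otherwise (zero handling is an assumption)."""
    return 1 if x >= 0 else -1


class TapAdapter:
    """Hypothetical helper that adapts one tap with the accumulator rule."""

    def __init__(self, tap=0.0, threshold=20, delta=2 ** -6):
        self.tap = tap
        self.threshold = threshold  # V in the text
        self.delta = delta          # tap increment
        self.acc = 0                # step 1: accumulator starts at zero

    def update(self, error, reference, feedback=False):
        # Step 2: add the product of signs; a feedback tap uses the
        # negated product, as stated in the text.
        product = sign(error) * sign(reference)
        self.acc += -product if feedback else product
        # Step 3: compare the accumulator with +V and -V.
        if self.acc > self.threshold:
            self.tap -= self.delta
            self.acc = 0            # back to step 1
        elif self.acc < -self.threshold:
            self.tap += self.delta
            self.acc = 0            # back to step 1
        return self.tap
```

For a forward tap, `reference` would be y(j + k), or the decision θ(j + k) in the alternate procedure; for a feedback tap it would be θ(j - m) with `feedback=True`.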
The adaptive equalizers were evaluated by observing their ability to adapt to an unknown but fixed channel, rather than to a time-varying channel. This was done by measuring the mean-square error at the equalizer output as a function of adaptation time. A hardware simulator under the control of a PDP-5 digital computer was used to do this. A block diagram of the simulator is shown in Fig. 1. The transmitted message was the pseudorandom output of the m-sequence generator corresponding to the polynomial x^31 + x^28 + x^27 + x^24 + x^17 + x^16 + x^9 + x^8 + 1. In some cases a time-invariant analog filter was used to simulate the channel. The filter output was sampled at the baud rate after the addition of filtered Gaussian noise. In other tests, a 32-tap 12-bit baud rate nonrecursive digital filter was used to simulate the channel filter. In this case the additive noise was sampled at the baud rate and then added to the dispersed signal. In both cases, the composite sampled signal was processed with a 7-bit baud rate digital filter, as shown in Fig. 7. The sum of the number of taps in the two nonrecursive filters could be as great as eleven, with any division of taps between the two. These filter tap values were under direct computer control. At the beginning of each adaptation test, all taps of the decision feedback equalizer except the last forward tap were set to zero, so that the output would be θ(j) if there were no noise or intersymbol interference. The adaptive transversal equalizer was tested in a similar way. The digital computer was used to change the tap values, and took the sequences {y(j)}, {e(j)}, and {θ(j)} directly as inputs. This method was used to avoid construction of adaptation circuitry for each tap. As a result, only a few of the binary digits that were processed by the equalizer were used for adaptation processes. The digits that were used are called "independent digits", because the time between successive observations is long compared to the times over which the autocorrelation functions of e(j) and y(j) are significant. A specified number of these independent digits, usually 100, were processed according to the preceding algorithm to change the taps.
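As an aside on the pseudorandom message source, a 31-stage feedback shift register for the polynomial x^31 + x^28 + x^27 + x^24 + x^17 + x^16 + x^9 + x^8 + 1 might be sketched as follows. The mapping of the polynomial onto the register (the recurrence in the docstring), the function name `pn_sequence`, and the seed convention are assumptions; the text does not describe the hardware arrangement, and the sketch does not verify that the sequence has maximal length.

```python
from collections import deque

DEGREE = 31
# Exponents of the lower-order terms of the polynomial quoted in the text.
EXPONENTS = (28, 27, 24, 17, 16, 9, 8, 0)


def pn_sequence(count, seed=1):
    """Yield `count` antipodal digits (+1/-1) from a 31-stage shift register.

    Assumed recurrence: s[n+31] = XOR of s[n+k] for k in EXPONENTS.
    """
    if not 0 < seed < 2 ** DEGREE:
        raise ValueError("seed must be a nonzero 31-bit integer")
    # state[k] holds s[n+k]; state[0] is the oldest bit and is output next.
    state = deque((seed >> k) & 1 for k in range(DEGREE))
    for _ in range(count):
        out = state[0]
        new_bit = 0
        for k in EXPONENTS:
            new_bit ^= state[k]
        state.popleft()
        state.append(new_bit)
        yield 1 if out else -1
```

A call such as `list(pn_sequence(2000, seed=12345))` would then give a block of ±1 digits to drive a simulated channel.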
Then 2000 digits were used to estimate E[e²(j)], without changing either the tap values or the accumulator contents. Then 100 more samples were used for adaptation purposes, followed by another measurement of E[e²(j)]. This sequence continues until it is evident that the equalizer has reached its "steady-state" mode of operation where the trend in mean-square error is no longer
Fig. 8. Impulse response of simulated telephone cable channel. [Plot; horizontal axis: time in multiples of the 139 µs baud interval.]
changing with time. The results of 50 such adaptation runs are then averaged to give a mean adaptation curve. Both the signal samples {y(j)} and the equalizer tap values were quantized with a maximum accuracy of 7 bits. (More will be said about quantization accuracy requirements in Section IV.) The least significant bit of the tap gain values was changed during adaptation each time the threshold V or -V was exceeded. Thus the adaptation parameter Δ in these tests is 2^-6 ≈ 0.016.

Tests were carried out to determine whether the decision feedback equalizer can adapt better than the linear equalizer to an unknown channel, whether the results are valid for a variety of channels, whether use of a learning sequence is necessary or even advantageous, and whether or not use of an estimate of E[e(j)θ(j + k)] results in better adaptation than an estimate of E[e(j)y(j + k)] for the forward taps {a(k)}. The channels that were simulated in these tests included a channel with an exponential impulse response, a coaxial cable PCM channel, and an audio-loaded telephone line.

One series of tests was made to compare the performances of the adaptive linear and decision feedback equalizers in an audio-loaded telephone cable system, and to determine the advantage that could be gained by using a training sequence. The channel filter was a lumped-parameter filter designed by Bell Canada to simulate a 15 000 ft audio-loaded telephone cable and was terminated in 600 ohms. The impulse response of the filter is shown in Fig. 8. Binary information was transmitted through this channel at 7200 bit/s. Neither intentional additive Gaussian noise nor a filter matched to the isolated received pulse was used in this series of tests. Because of the resulting mismatch, choice of the third of 11 taps as the "main" tap minimized the linear equalizer output mean-square error. Similarly, it was found that the decision feedback equalizer with 4 forward taps and 7 feedback taps made the best use of the 11 available taps. The adaptation thresholds were set at ±20 during this series of tests. It
Fig. 9. Adaptation curves with and without test sequence. [Plot: mean-square error versus number of digits processed (0 to 2500); unknown channel: simulated 15 000 ft audio-loaded telephone line; data rate: 7200 bit/s binary; curves: linear equalizer with test signal, decision feedback equalizer without test signal, decision feedback equalizer with test signal.]
was found that adaptation based on the measurement of E[e(j)θ(j + k)] resulted in a better performance than measurement of E[e(j)y(j + k)]. The mean-square errors at the equalizer outputs, using the former measurement, are shown in Fig. 9 as a function of the number of independent digits that were processed. As shown, the steady-state mean-square error of the decision feedback equalizer is 5 dB better than that of the linear equalizer. This is consistent with the fixed equalizer tests that were
Fig. 10. Adaptation performances at various threshold values, high signal-to-background noise ratio example. [Plot: mean-square error versus adaptation time in number of baud intervals (0 to 6000); unknown channel; minimum V = 2, ΔV = 2; curves: linear equalizer and decision feedback equalizer, each with input measurement and with output measurement.]

Fig. 11. Adaptation performances at various threshold values, low signal-to-background noise ratio example. [Plot: mean-square error versus adaptation time in number of baud intervals (0 to 6000); unknown channel, E/N0 = 13.0 dB; minimum V = 2, ΔV = 2; curves: linear equalizer and decision feedback equalizer, each with input measurement and with output measurement.]
described in the previous section. (The results shown in Fig. 9, and the results to follow, are normalized in amplitude such that |θ(j)| = 1.) The decision feedback equalizer converged to its steady-state performance after 1500 digits were processed, or in about 0.2 s if every digit was used for adaptation. It is notable that knowledge of the correct digit sequence did not improve the adaptation of either equalizer significantly; use of the decisions gave almost the same performance. However, the digit sequence was the output of a pseudorandom generator. It is likely that the transmission of certain redundant sequences could cause a significant increase in adaptation time, especially if the "eye" at the equalizer output were initially closed.

Two important characteristics that can be taken from a graph such as Fig. 9 are the "steady-state" mean-square error and the "adaptation time". Unfortunately, it is not possible to improve both these characteristics simultaneously by changing the adaptation parameters V and Δ. By increasing V, or decreasing Δ, or doing both, the mean-square error can be reduced, but the adaptation time is increased. A series of tests was made to determine how the mean-square error and the adaptation time depend on the threshold V. In these tests the channel impulse response was Ae^(-0.5τ/T). A filter matched to this pulse was used as part of the equalizer in this series of tests. Measurements were made at high SNR, approximately 70 dB, and at 13.0 dB. The effects of using both "input" measurements e(j)y(j + k) and "output" measurements e(j)θ(j + k) were tested with both the decision feedback equalizer and the linear equalizer. The effect of changing V from 2 to 24 in increments of 2 is shown in Figs. 10 and 11. As is shown, the performance of the decision feedback equalizer is better than that of the linear equalizer when either tap-error measurement is used. Also, the performance of both equalizers when adaptation is based on e(j)θ(j + k) is better than when it is based on e(j)y(j + k), except for the decision feedback case when E/N0 = 13 dB and a large threshold was used. In all cases a much quicker adaptation could be achieved by using e(j)θ(j + k). In all cases a minimum product of adaptation time and mean-square error was achieved when V was 4.

Thus both the experimental results and the analysis indicate that the decision feedback equalizer can be made adaptive, and that its adaptive performance is better than that of the linear equalizer. Estimates of either E[e(j)y(j + k)] or E[e(j)θ(j + k)] can be used to modify the forward taps. The experimental results described previously show that the latter measurement is better in many cases. However, the work of Hirsch and Wolf [10] indicates that when other channels are used measurement of E[e(j)θ(j + k)] cannot be used to make the linear equalizer adaptive. Further investigation is required to determine which is the better measurement to adapt the decision feedback equalizer to such channels.
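The timing and step-size figures quoted for the telephone cable tests can be checked with a few lines of arithmetic. The sketch below assumes that one of the 7 quantization bits is a sign bit and that tap values lie in [-1, 1), which is what makes the least significant bit equal to 2^-6; the text does not state the number format explicitly, and the constant names are chosen for the example.

```python
BIT_RATE = 7200            # bit/s, binary transmission over the cable channel
DIGITS_TO_CONVERGE = 1500  # digits processed before steady state was reached
WORD_LENGTH = 7            # bits used for signal samples and tap values

baud_interval_us = 1e6 / BIT_RATE
adaptation_time_s = DIGITS_TO_CONVERGE / BIT_RATE
# Assumption: one sign bit, values in [-1, 1), so the LSB is 2**-(7 - 1).
delta = 2.0 ** -(WORD_LENGTH - 1)

print(f"baud interval   = {baud_interval_us:.1f} us")
print(f"adaptation time = {adaptation_time_s:.3f} s")
print(f"tap step        = {delta:.6f}")
```

Under these assumptions the baud interval agrees with the 139 µs time scale of Fig. 8, the adaptation time with the quoted "about 0.2 s", and the tap step with 2^-6 ≈ 0.016.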
IV. PRACTICAL CONSIDERATIONS
It has been shown previously that the performance of the decision feedback equalizer is considerably better than that of the linear equalizer, both when the channel impulse response is known and when it is not. This result is particularly important when one realizes that the decision feedback equalizer is no more complex than the linear equalizer. For instance, in the telephone cable example the decision feedback equalizer with 4 forward taps and 7 feedback taps outperformed the linear equalizer with 11 taps. In both cases the input signal and the tap values were quantized to 7 bits. In fact, the decision feedback equalizer is potentially simpler, since the feedback delay line can be a single-bit shift register if the data is binary.

It was determined from computer simulation studies that the decision feedback equalizer is no more sensitive to signal quantization errors or tap quantization errors than the linear equalizer, even though it achieves its superior performance by coherent subtraction of the intersymbol interference. This is consistent with the tap sensitivity analysis of the previous section, (34) and (35). In the computer simulation study the matched filter output y(j) and the equalizer tap values could be independently quantized to any specified accuracy. The results of a typical test are shown in Fig. 12. In this test the channel impulse response was Aτe^(-0.8τ/T), the signal-to-background noise ratio was 16.0 dB, the linear equalizer had 15 taps, and the decision feedback equalizer had 3 forward taps with 8 feedback taps. Two curves of Fig. 12, one for each equalizer, show the error probability as a function of the number of quantization bits when all quantities are quantized with the same accuracy. As shown, the error probability starts to increase when the number of quantization bits is reduced to 8. In contrast, when the quantization of the signal was held at 10 bits, the tap gain quantization of both equalizers could be reduced to 6 bits before any significant increase in error probability was observed.
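To make the notion of "quantization to n bits" concrete, a uniform quantizer can be sketched as below. The rounding rule, the [-1, 1) range, and the function name `quantize` are assumptions for illustration; the text does not say how the simulation quantized the signal or the taps.

```python
def quantize(value, bits):
    """Round `value` to a signed fixed-point grid with `bits` bits in [-1, 1).

    Illustrative assumption: one sign bit, uniform steps, saturation at
    the ends of the range.
    """
    step = 2.0 ** -(bits - 1)
    lowest = -(2 ** (bits - 1))      # most negative integer code
    highest = 2 ** (bits - 1) - 1    # most positive integer code
    code = round(value / step)
    code = max(lowest, min(highest, code))
    return code * step


# The step, and hence the worst-case rounding error, halves with each added bit.
for bits in (6, 8, 10):
    print(bits, quantize(0.3, bits))
```

With this model, going from 6-bit to 8-bit accuracy reduces the maximum rounding error by a factor of four, which gives a feel for the 2-bit difference between tap and signal requirements discussed in this section.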
Fig. 12. Effect of signal and tap quantization on receiver performance. [Plot: error probability versus number of quantization bits; q(τ) = Aτe^(-0.8τ/T), E/N0 = 16.0 dB; curves for the linear equalizer and the decision feedback equalizer with all quantities quantized equally, and for both equalizers with 10-bit signal quantization.]
In a separate experiment, quantization of only the signal values {y(j)} resulted in a performance very similar to that achieved when all quantities were quantized equally. It is believed that the reason for the 2-bit difference in quantization requirements is that the signal includes the additive noise and the intersymbol interference, and so the quantization error is a larger percentage of the desired signal than of the total signal y(t). Similar results were observed when the coaxial cable PCM channel was simulated. Over that channel at 360 Mbit/s it was necessary to use 7-bit accuracy for the signal and 6-bit accuracy for the taps. At 450 Mbit/s the linear equalizer could not effectively equalize the channel, even with as many as 21 taps and with no quantization error in either the signal or the tap values. The decision feedback equalizer with 5 forward taps and 6 feedback taps required 9-bit signal quantization accuracy and 7-bit tap accuracy. These results are directly applicable if a digital synthesis method is used. They show that the decision feedback equalizer is no more sensitive to quantization inaccuracies than the linear equalizer, and so is no more expensive to construct. If an analog synthesis method is used, these results indicate that the decision feedback equalizer is no more sensitive to component inaccuracies or signal distortion than the linear equalizer.
TABLE I
STEADY-STATE OUTPUT SNRs AS FUNCTION OF NUMBER OF TAPS
Number of Forward Taps: 2, 3, 4, 5
Number of Feedback Taps: 1, 2, 3, 4, 5, 6, 7, 8
9.6 11.4
10.7 13.3 14.5 14.3 14.7 14.9 14.9 14.8
11.1 14.9 15.7 15.9 16.0 16.1
11.0
11.9 11.9 11.8 11.8
11.7 11.7
14.2
15.4
16.0 16.1
16.2
The required number of equalizer taps was also examined. Computer simulation tests showed that, when the channel had a simple impulse response such as Ae^(-aτ) or Aτe^(-aτ), both equalizers equalized the channel as well with a few transversal taps as with many, but that the decision feedback equalizer required several feedback taps to realize its full potential, and as many as 10 or 12 at very high rates. Note however, from Figs. 2 and 3, that the more complex decision feedback equalizer could attain a performance not possible with a linear equalizer with any number of taps. In the more complex channel examples the linear equalizer did not retain this advantage in simplicity. Computer simulation tests of the coaxial cable PCM channel at 360 Mbit/s, with a 20.0-dB signal-to-background noise ratio, showed that the linear equalizer required 9 taps to realize its full potential, and the decision feedback equalizer required 4 forward taps and 5 feedback taps.

A series of hardware simulation tests was carried out to determine the number of taps required by the adaptive decision feedback equalizer. In these tests 450-Mbit/s transmission over the PCM channel at high signal-to-background noise ratio was simulated. The threshold V was held at 16 during these tests. The steady-state output SNR in dB is shown as a function of the number of taps in Table I. As shown, 4 forward taps and 4 to 6 feedback taps is a good compromise between better performance and higher cost. Use of more than 6 feedback taps degrades the performance, because the amount of quantization noise and adaptation noise that the tap introduces is more than the amount of intersymbol interference that is removed.

From these experimental results it is concluded that the decision feedback equalizer is a practical as well as a very high performance receiver.

V. CONCLUSIONS

It has been shown that the decision feedback equalizer can be used to detect digital information that has been sent at high rates over an unknown or slowly varying dispersive channel. The equalizer's ability to combat intersymbol interference caused by several channels was investigated experimentally. In each of the examples that were investigated the digit error probability of the decision feedback equalizer was considerably less than that of the linear equalizer. As well, at any specified low error probability the decision feedback equalizer allowed data transmission at rates beyond that possible with the linear equalizer. Two practical algorithms are described that make the decision feedback equalizer adaptive to unknown or slowly varying channels. It was found experimentally that the algorithm based on the cross correlations between the estimated error and the estimated digits gave the better performance, and that a training sequence was not necessary for adaptation. Also, it was found that the decision feedback equalizer is no more sensitive to quantization errors than the linear equalizer. Because of these advantages, and because its performance is close to that of the much more complex Bayesian demodulator when the channel is known, the decision feedback equalizer should be considered whenever linear modulation techniques are used to transmit digital information over dispersive linear channels at a high rate. This is additionally verified in the theoretical portions of the paper where it is shown that the idealized decision feedback equalizer will always yield smaller mean-square error than the transversal equalizer. As well, theoretical considerations indicate that the adaptive properties of the decision feedback equalizer will tend to be superior.

REFERENCES
[1] D. C. Coll and D. A. George, "A receiver for time-dispersed pulses," in Conf. Rec., 1965 IEEE Int. Conf. Communications, pp. 753-758.
[2] R. W. Lucky, "Techniques for adaptive equalization of digital communication systems," Bell Syst. Tech. J., vol. 45, Feb. 1966, pp. 255-268.
[3] J. G. Proakis and J. H. Miller, "An adaptive receiver for digital signaling through channels with intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-15, July 1969, pp. 484-497.
[4] R. A. Gonsalves, "Maximum-likelihood receiver for digital transmission," IEEE Trans. Commun. Technol., vol. COM-16, June 1968, pp. 392-398.
[5] R. R. Bowen, "Bayesian decision procedure for interfering digital signals," IEEE Trans. Inform. Theory (Corresp.), vol. IT-15, July 1969, pp. 506-507.
[6] R. R. Bowen, "Bayesian detection of noisy time-dispersed pulse sequences," Ph.D. dissertation, Carleton Univ., Ottawa, Ont., Canada, Sept. 1969.
[7] K. Abend and B. D. Fritchman, "Statistical detection for communication channels with intersymbol interference," Proc. IEEE, vol. 58, May 1970, pp. 779-785.
[8] M. E. Austin, "Decision feedback equalization for digital communication over dispersive channels," M.I.T./R.L.E. Tech. Rep. 461, Aug. 11, 1967.
[9] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Statist., vol. 22, 1951, pp. 400-407.
[10] D. Hirsch and W. J. Wolf, "A simple adaptive equalizer for efficient data transmission," IEEE Trans. Commun. Technol., vol. COM-18, Feb. 1970, pp. 5-12.
Donald A. George (S'54-M'59) was born in Galt, Ont., Canada, on April 24, 1932. He received the B.Eng. degree in engineering physics from McGill University, Montreal, P. Q., Canada, in 1955, the M.S. degree in electrical engineering from Stanford University, Stanford, Calif., in 1956, and the Sc.D. degree in electrical engineering from Massachusetts Institute of Technology, Cambridge, in 1959. From 1959 to 1962 he was an Assistant Professor of Electrical Engineering, University of New Brunswick, Fredericton, Canada. Since then he has been a member of the Faculty of Engineering, Carleton University, Ottawa, Ont., Canada. While teaching, he has been a Consultant to a number of organizations, principally the Defence Research Telecommunications Establishment (now the Communications Research Center) and Canadian Westinghouse Company, Ltd. Also, he spent three summers and a nine-month sabbatical period with the Communications Research Center. At Carleton University, while being concerned with the development of the engineering program in general, he has been particularly involved in building up graduate activity in communications. His recent research activity has been in the area of optimum and adaptive PAM systems and in signal processing with small computers. At present, he is a Professor of Engineering and Dean of Engineering at Carleton University. His research interests are in communication and information theory, cybernetics and systems, and signal processing.
Robert R. Bowen (S'57-M'61) was born in Peterborough, Ont., Canada, on June 10, 1935. He received the B.Sc. degree in engineering physics and the M.Sc. degree in electrical engineering from Queen's University, Kingston, Ont., in 1958 and 1960, and the Ph.D. degree in electrical engineering from Carleton University, Ottawa, Ont., in 1970. In 1960 he joined the Defence Research Telecommunications Establishment, which has since become the Communications Research Center of the Department of Communications, Ottawa, Ont., and worked on radar signal processing problems. In 1967 he returned to the Communications Research Center, where he continued his work on the transmission of data through dispersive media. More recently, he and two colleagues have developed a computer language for simulation of adaptive communication systems.
John R. Storey (M'68) was born in Trelewis, Wales, on SeptemberIfl, 1926. He graduated from Lewis School, Pengam, Wales. After graduation he joined the Royal Navy for a period of three years. In 1952 he joined Decca Radar, Ltd., Surrey, England, working primarily on the development of a baseband communication system. He emigrated to Canada in 1955 and joined the Ferranti Packard Company, where his main interest was in HF communications using meteor trail reflections. In 1957 he joined the Avro Aircraft Company, Malton, and was involved in data processing during the flight trials of the Avro Arrow aircraft. He is now with the Communications Research Center, Ottawa, Ont., Canada (formerly the Defence Research Telecommunications Establishment), having joined them in 1959. He has worked on projects involved with pulse compression techniques for ionospheric sounding and on research in the field of adaptive receivers for data transmission. He now heads a communication engineering group currently working on an automated system to measure noise in the HF communication spectrum.
Performance of Optimum and Suboptimum Synchronizers
PAUL A. WINTZ, MEMBER, IEEE, AND EDGAR J. LUECKE, MEMBER, IEEE
REFERENCE: Wintz, P. A., and Luecke, E. J.: PERFORMANCE OF OPTIMUM AND SUBOPTIMUM SYNCHRONIZERS,¹ School of Electrical Engineering, Purdue University, Lafayette, Ind. 47907 and School of Electrical Engineering, Valparaiso University, Valparaiso, Ind. 46383. Rec'd 12/16/68. Paper 69TP22-COM, approved by the IEEE Communication Theory Committee for publication without oral presentation. IEEE TRANS. ON COMMUNICATION TECHNOLOGY, 17-3, June 1969, pp. 380-389.
ABSTRACT: The optimum (maximum-likelihood) synchronizer for extracting bit synchronization directly from a binary data stream is presented along with some simple suboptimum synchronizers that perform almost as well. The manner in which the performances of these systems depend on the pertinent system parameters, as determined by a combined program of analysis, simulation, and laboratory experimentation, is reported. Both synchronizer jitter and the degradation in error rate due to jitter are considered.
I. INTRODUCTION

DIGITAL transmission systems require receivers that are synchronized to the incoming signal format. Three levels of synchronization are usually required. Bit synchronization is required since optimum (matched-filter) detectors require knowledge of the start and stop times of each incoming symbol in order to make bit decisions. Word synchronization is also required in order to sort out bits into their appropriate words. Finally, frame synchronization is required by the data user if he is to distinguish between word groups, or frames. Further comments on these synchronization problems are given in [1]. This paper is concerned with the problem of bit synchronization.

There are, of course, a number of strategies for obtaining bit synchronization by transmitting extra power and/or bandwidth. For example, the standard teletype format uses start and stop signals to synchronize the encoder and decoder. Approximately 33 percent of the transmission time is allotted to symbol synchronization. In other systems, synchronization words are inserted with the data words at regular intervals, and the receiver first obtains word synchronization by locking to these synchronization sequences. If word synchronization is obtained accurately enough, it can be divided down to obtain bit synchronization. Sometimes a separate subchannel is used exclusively for transmitting synchronization information.

¹ This research was supported by NASA under Grant NsG-553.
On the other hand, self-bit synchronization systems extract bit synchronization directly from the data stream with no additional power, bandwidth, etc., added for synchronization purposes. We restrict our comments to this class of synchronization strategies. The ease of extracting timing information directly from the data stream depends on the amount of energy in the incoming waveform at the frequency corresponding to the bit rate. Since a random sequence of equally likely antipodal signals contains no energy at the bit rate frequency, some type of nonlinear processing is required. The analysis of nonlinear systems with noisy inputs is not a trivial problem. However, some progress has been made. Van Horn [2] suggested a synchronizer consisting of a bank of correlators similar to some radar systems, but made no claims about its performance or practicality. Stiffler [3] proposed an optimum (maximum-likelihood) strategy that operates on the infinite past or enough of the past to make the truncation error negligible. Van Trees [4] analyzed a system consisting of a data channel and a synchronization channel and concluded that the optimum strategy for dividing the transmitter power between these two channels devotes all available transmitter power to the data channel and derives synchronization from it. (An analogous result relative to word synchronization was established by Chase [5], who showed that the channel capacity for the discrete memoryless unsynchronized channel is the same as the capacity for the synchronized channel.) Wintz and Hancock [6] evaluated the performance of an adaptive estimator-correlator detection system for signals of unknown arrival times. The error rate was given in terms of the signal-to-noise ratio (SNR) and the rms timing error, but no synchronizer structure was suggested.
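The remark that a random antipodal sequence has no energy at the bit rate, so that a nonlinearity is needed, can be illustrated numerically. The sketch below is only a demonstration under stated assumptions: noise-free half-sine pulses, 16 samples per bit, a square-law nonlinearity, and the function names are all choices made for the example and are not taken from the paper.

```python
import cmath
import math
import random


def bit_rate_component(samples, samples_per_bit):
    """Magnitude of the normalized Fourier component at the bit rate."""
    total = 0j
    for n, x in enumerate(samples):
        total += x * cmath.exp(-2j * math.pi * n / samples_per_bit)
    return abs(total) / len(samples)


def antipodal_waveform(num_bits, samples_per_bit, rng):
    """Random +/- half-sine pulses, one per bit interval (noise-free)."""
    wave = []
    for _ in range(num_bits):
        d = rng.choice((-1, 1))
        wave.extend(d * math.sin(math.pi * n / samples_per_bit)
                    for n in range(samples_per_bit))
    return wave


rng = random.Random(0)
x = antipodal_waveform(num_bits=400, samples_per_bit=16, rng=rng)
raw_line = bit_rate_component(x, 16)
squared_line = bit_rate_component([v * v for v in x], 16)
print("bit-rate component before squaring:", raw_line)
print("bit-rate component after squaring: ", squared_line)
```

Before squaring, the contributions of the random signs largely cancel, so the component at the bit rate is small and shrinks as more bits are averaged. After squaring, every bit contributes the same sin² pulse, and a steady line appears at the bit rate that a synchronizer could lock to.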
In this paper we report the results of an investigation aimed at determining the important parameters associated with self-bit synchronizing systems and the physical relationships between these parameters. Toward this end we present the structure of the optimum (maximum-likelihood) self-bit synchronizer and also some suboptimum structures that perform nearly as well as the optimum but are much easier to implement. The manner in which the performances of these optimum and suboptimum systems depend on the pertinent system parameters is determined and the performances compared. The important parameters were found to be the input SNR, the shape of the signaling waveform, the memory time of the synchronizer, the synchronizer nonlinearity, and the bandwidth of the low-pass filter preceding the synchronizer. The performances of some suboptimum systems (properly designed) are quite close to the performance of the optimum system and, therefore, represent a reasonable engineering solution to the synchronization problem. Synchronizer performance is very dependent on signal waveform. In particular, rounded pulses result in significantly better performance (approximately 1 order of magnitude) than square pulses; the best nonlinearity is "log cosh", but "square law" and "magnitude" nonlinearities perform nearly as well for rounded pulses while the hard limiter is better for square pulses. We present only a discussion of the results of this study. Detailed derivations, etc., can be found in [7]. The problem of bit slippage is not considered.

Reprinted from IEEE Transactions on Communication Technology, vol. COM-17, no. 3, June 1969. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
II. MODEL

Fig. 1. (a) Positive symbol waveform. (b) Random processes S(t;r,τ) and X(t).
A mathematical model for the synchronization problem is presented in Fig. 1. A random sequence of positive and negative pulses is transmitted. The positive symbol waveform is $s(t)$, $t \in (0,T)$, and its energy is $E$, as illustrated in Fig. 1(a). For noiseless reception the received signal sequence would appear as $S(t;r,\tau)$ in Fig. 1(b), and for the more realistic case (with noise) as $X(t)$ in Fig. 1(b). The received (noiseless) signal is characterized by the random process $S(t;r,\tau)$ with two sources of randomness indicated by the two parameters $r$ and $\tau$. $r$ is a parameter that governs the particular sequence transmitted. We assume that all possible sequences are equally likely. $\tau$ is a parameter that indicates epoch [the first zero crossing of $S(t;r,\tau)$] relative to some point (chosen arbitrarily and labeled $t = 0$) on the time scale. We assume that $\tau$ is equally likely to be any number between $-T/2$ and $+T/2$. The observed random process $X(t;\tau)$ is the sum of the signal process and a noise process, i.e.,

$$X(t;\tau) = S(t;r,\tau) + N(t) \tag{1}$$

where the noise process $N(t)$ is assumed to be a Gaussian noise process with noise power $N_0$ W/Hz.

A synchronizer can be modeled as a transformation $G$ of the random process $X(t;\tau)$ into the random variable $\hat{\tau}$, i.e.,

$$\hat{\tau} = G[X(t;\tau)]. \tag{2}$$

We define the synchronizer error (jitter) to be

$$\epsilon = \tau - \hat{\tau} \tag{3}$$

where $\epsilon$ is a random variable whose range is $(-T/2, +T/2)$.

Fig. 2. (a) Synchronizer. (b) Detection system.

III. PERFORMANCE MEASURES

On what basis should we say that one synchronizer is better (or worse) than another? Two different performance measures for synchronizers are illustrated in Fig. 2. In the first, illustrated in Fig. 2(a), the synchronizer is considered a system by itself and some measure of the jitter at the synchronizer output is the performance measure. One possibility is the rms error $(\overline{\epsilon^2})^{1/2}$. However, because we found it easier to measure in the laboratory, we have chosen the mean magnitude error $\overline{|\epsilon|}$ to be the performance measure. For typical unimodal probability density functions encountered in practice, $\overline{|\epsilon|} \approx 0.8\,(\overline{\epsilon^2})^{1/2}$.

A complete digital detection system consists of a matched filter (or correlator), a sample-hold-dump circuit, and a synchronizer, as illustrated in Fig. 2(b). In this system the synchronizer has no utility of its own. Rather, its sole purpose is to provide timing information (the start and stop times of each incoming signal) to the sample and dump switches. If the bit timing information is not exact, the detector performance is degraded, i.e., the error rate is increased relative to what it would be for perfect synchronization. Hence a reasonable performance measure for this system is the fractional increase in transmitter power required to achieve the same probability of error that would be obtained with perfect synchronization. That is, suppose that with perfect synchronization, the digital detection system operates at an average SNR $E/N_0$ so that the corresponding probability of detection error is given by [14]

$$P_E = \tfrac{1}{2}\left(1 - \operatorname{erf}\sqrt{\frac{E}{N_0}}\right). \tag{4}$$

Then with synchronization jitter it can be shown (after considerable computation) that the probability of detection error is given by [7]
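$$P_E^* = \int_{-T/2}^{T/2}\left\{\frac{1}{2} - \frac{1}{4}\operatorname{erf}\left[\sqrt{\frac{E}{N_0}}\,\frac{R(\epsilon) + R(T-\epsilon)}{R(0)}\right] - \frac{1}{4}\operatorname{erf}\left[\sqrt{\frac{E}{N_0}}\,\frac{R(\epsilon) - R(T-\epsilon)}{R(0)}\right]\right\} p(\epsilon)\,d\epsilon \tag{5}$$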
where $p(\epsilon)$ is the probability density function of the synchronizer error and $R(\epsilon)$ is the autocorrelation function of the positive symbol waveform, i.e.,

$$R(\epsilon) = \int_0^T s(t)\,s(t+\epsilon)\,dt. \tag{6}$$

Equation (5) takes into account the intersymbol interference effect due to timing errors, i.e., a timing error of $\epsilon$ seconds results in the decision being based on the last $\epsilon$ seconds of one signal and the first $T - \epsilon$ seconds of the following signal. It is easy to show that $P_E^* \ge P_E$ with $P_E^* = P_E$ only for $p(\epsilon) = \delta(\epsilon)$ (perfect synchronization). We now define the degradation due to jitter to be the fractional increase in SNR required at the input to the system of Fig. 2(b) in order to decrease $P_E^*$ to $P_E$. That is, in (5) replace $E/N_0$ by $E/N_0 + \Delta$ and find $\Delta$ such that $P_E^* = P_E$. Then the fractional increase in SNR, $\Delta/(E/N_0)$, is called the degradation. Note that the degradation depends on the input SNR $E/N_0$, the signal waveform through its autocorrelation function $R(\epsilon)$, and the synchronizer through $p(\epsilon)$. $p(\epsilon)$, in turn, depends on the input SNR $E/N_0$ and the signal waveform $s(t)$ as well as the synchronizer structure $G$.

IV. OPTIMUM SYNCHRONIZER

The synchronization problem was presented graphically in Fig. 1(b), $X(t)$. A random sequence of anticorrelated signaling waveforms (in noise) is observed; the problem is to determine the epoch $\tau$. We use the maximum-likelihood parameter estimation technique to derive the structure of the optimum self-bit synchronizer, i.e., the maximum-likelihood estimator of the epoch parameter $\tau$. The basic assumptions and notation are illustrated in Fig. 3. We assume a record of received data precisely $KT$ seconds long, where $T$ is the known symbol duration ($T^{-1}$ is the bit rate in bits per second) and $K$ is an integer called the memory ($K$ is the number of bits over which the estimate is made). Even though the record is exactly $K$ symbols long, it contains parts of $K + 1$ consecutive symbols since the last $\tau$ seconds of one symbol are included at the beginning of the record and the first $T - \tau$ seconds of another symbol are included at the end of the record. Let the received data (signal plus noise) in the $K + 1$ time intervals be represented by the $K + 1$ column matrices $X_0, X_1, \ldots, X_K$ as in Fig. 3. (The representation of waveforms by vectors is discussed in Appendix B of [14].) Hence we write

$$X_i = S_i + N_i, \qquad i = 0, 1, \ldots, K \tag{7}$$

Fig. 3. Received signal without and with noise.
where $S_i$ is a column matrix representing the signal waveform in the $i$th interval ($S_i = \pm S$, where $S$ represents the positive symbol waveform, except that $S_0$ and $S_K$ represent the last $\tau$ seconds and the first $T - \tau$ seconds, respectively, of the positive symbol waveform) and $N_i$ is a column matrix representing the noise waveform in the $i$th interval. We assume that the noises in the $i$th and $j$th time intervals are uncorrelated for $i \ne j$, and that in each interval the noise has zero mean and known covariance $\Phi_i$, i.e.,

$$\overline{N_i} = 0, \qquad \overline{N_i N_i^t} = \Phi_i, \qquad i = 0, 1, \ldots, K$$
$$\overline{N_i N_j^t} = 0, \qquad i, j = 0, 1, \ldots, K, \quad i \ne j \tag{8}$$

where $N_i^t$ denotes the transpose of $N_i$.
Under these assumptions (and those stated in Section II) it can be shown that the a posteriori probability density function $p(\tau \mid X_0, X_1, \ldots, X_K)$ can be written in the form [7]

$$p(\tau \mid X_0, \ldots, X_K) = C\,\frac{p(\tau)}{p(X_0, \ldots, X_K)} \prod_{j=0}^{K} \cosh\left[X_j^t \Phi_j^{-1} S_j(\tau)\right] \tag{9}$$

where $C$ is a constant, $p(\tau)$ is the a priori probability density on $\tau$, and $S(\tau)$ is a column matrix representing the positive symbol waveform shifted by $\tau$ seconds. The maximum-likelihood estimate of $\tau$ is the value of $\tau$ that maximizes $p(\tau \mid X_0, \ldots, X_K)$. Assuming no a priori knowledge of $\tau$ [$p(\tau) = T^{-1}$; $-T/2 \le \tau \le T/2$], the maximum-likelihood estimate is the value of $\tau$ that maximizes $\prod_{j=0}^{K} \cosh[X_j^t \Phi_j^{-1} S_j(\tau)]$. Or, since this quantity is always positive and the logarithm is a monotonically increasing function for positive arguments, an equivalent expression for the maximum-likelihood estimator is $\hat{\tau}$ such that
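$$\max_{\tau}\left\{\sum_{j=0}^{K} \log\cosh\left[X_j^t \Phi_j^{-1} S_j(\tau)\right]\right\} = \sum_{j=0}^{K} \log\cosh\left[X_j^t \Phi_j^{-1} S_j(\hat{\tau})\right]. \tag{10}$$

For white noise (10) reduces to

$$\max_{\tau}\left\{\sum_{j=0}^{K} \log\cosh\left[X_j^t S_j(\tau)\right]\right\} = \sum_{j=0}^{K} \log\cosh\left[X_j^t S_j(\hat{\tau})\right]. \tag{11}$$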
Fig. 4. Signaling waveforms; all three waveforms have energy $E$. Each waveform is zero for $t \notin (0,T)$; for $t \in (0,T)$:
half-sine: $s(t) = \sqrt{2E/T}\,\sin(\pi t/T)$
raised cosine: $s(t) = \sqrt{2E/3T}\,[1 - \cos(2\pi t/T)]$
square: $s(t) = \sqrt{E/T}$

Fig. 5. Optimum synchronizer performance (mean magnitude error versus $2E/N_0$). (a) Raised cosine pulses. (b) Half-sine, raised cosine, and square pulses.

Fig. 6. Optimum synchronizer performance (degradation): periods of memory versus degradation, raised cosine symbol.

Fig. 7. Synchronizer nonlinearities.

To implement the optimum synchronizer for the white noise case we need a device that chooses a value for $\tau$, say $\tau = \tau^*$, and then correlates the first $\tau^*$ seconds of the record with the last $\tau^*$ seconds of the positive symbol waveform, computes the log hyperbolic cosine of this quantity, and stores the resulting number. Next, the $T$-second portion of the record from $\tau^*$ to $\tau^* + T$ is correlated with the positive symbol waveform, the log hyperbolic cosine of this number is computed, and the result is added to the number previously obtained. The same procedure is followed for each of the remaining full $T$-second intervals ($K - 1$ full intervals in all). For the last interval, the last $T - \tau^*$ seconds of the record are correlated with the first $T - \tau^*$ seconds of the positive symbol waveform, the log hyperbolic cosine is taken, and this number is added to the sum of the first $K$ numbers. The device has now evaluated (11) for $\tau = \tau^*$ and obtained the number

$$\sum_{j=0}^{K} \log\cosh\left[X_j^t S_j(\tau^*)\right].$$

This procedure must now be repeated for all $-T/2 \le \tau \le T/2$ to obtain $\max_{\tau}\{\sum_{j=0}^{K} \log\cosh[X_j^t S_j(\tau)]\}$ and
the corresponding estimate $\hat{\tau}$. For colored noise the procedure is the same except that the data $X_j$ are correlated with the matrix $Q_j(\tau) = \Phi_j^{-1} S_j(\tau)$ rather than $S_j(\tau)$.

Because of its complexity, the optimum (maximum-likelihood) synchronizer does not appear to represent a realistic engineering solution to the synchronization problem. However, its performance is of interest. Even though we would not build the optimum synchronizer, knowledge of its performance gives us a standard against which to judge the performances of various suboptimum systems which are easier to implement. It is not, of course, feasible to evaluate (11) for all $-T/2 \le \tau \le T/2$. However, an approximate solution can be obtained by evaluating (11) for a large but finite number of values for $\tau$. Using this procedure and the Monte Carlo technique, the performance of the optimum synchronizer was evaluated on a digital computer for the three symbol waveforms shown in Fig. 4, various memory times $K$, and SNRs $E/N_0$. Results for the raised cosine symbol are presented in Fig. 5(a) for memory times of $K = 1, 2, \ldots, 8$ symbol durations. The curves indicate that $\overline{|\epsilon|}$ varies inversely with the square root of the SNR $E/N_0$ and with the square root of memory time $K$. This result is typical of adaptive systems (see, for example, [14]). Results for the raised cosine, half-sine, and square pulses are compared in Fig. 5(b) for $K = 8$.

Fig. 6 shows the amount of degradation that the optimum synchronizer introduces in the error performance of the matched filter detection system of Fig. 2(b). The abscissa gives the fractional amount of SNR that must be added to achieve the same error performance that would be achieved with perfect synchronization. The ordinate gives the memory required to achieve the specified degradation. All of the curves increase without bound as the degradation approaches zero. This is to be expected since for no degradation in error performance perfect synchronization is required; this can be obtained only with an infinite memory. Hence we are forced to accept some degradation. Fig. 6 gives the memory required by the optimum synchronizer to achieve any specified degradation for input SNRs of 2, 4, 8, 16.
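The grid evaluation of (11) described above can be sketched in a few lines. This is a minimal illustration, not the authors' program: the sampled-waveform representation, the function names, the restriction of the epoch to whole samples in $[0, T)$, and the explicit scale factor `sigma2` (standing in for $\Phi_j = \sigma^2 I$ in (10)) are all assumptions introduced here.

```python
import math


def raised_cosine(ns, energy=1.0):
    """Sampled raised cosine symbol with ns samples, scaled so sum(s^2) == energy."""
    shape = [1.0 - math.cos(2.0 * math.pi * n / ns) for n in range(ns)]
    scale = math.sqrt(energy / sum(v * v for v in shape))
    return [scale * v for v in shape]


def log_cosh(x):
    """Numerically stable log(cosh(x))."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def make_record(symbols, pulse, tau):
    """Noiseless record of K symbol durations (K = len(symbols) - 1).

    The record begins with the last `tau` samples of the first symbol and
    ends with the first ns - tau samples of the last one, as in Fig. 3.
    """
    ns = len(pulse)
    k = len(symbols) - 1
    stream = [a * v for a in symbols for v in pulse]
    start = ns - tau
    return stream[start:start + k * ns]


def log_likelihood(record, pulse, tau, sigma2=1.0):
    """Evaluate the sum in (11) for one candidate epoch tau (in samples)."""
    ns = len(pulse)
    k = len(record) // ns
    bounds = [0] + [tau + m * ns for m in range(k)] + [k * ns]
    total = 0.0
    for j in range(k + 1):
        lo, hi = bounds[j], bounds[j + 1]
        if j == 0:
            template = pulse[ns - tau:]   # last tau samples of the symbol
        elif j == k:
            template = pulse[:ns - tau]   # first ns - tau samples of the symbol
        else:
            template = pulse              # a full symbol
        corr = sum(x * s for x, s in zip(record[lo:hi], template))
        total += log_cosh(corr / sigma2)
    return total


def estimate_epoch(record, pulse, sigma2=1.0):
    """Grid search: return the candidate epoch that maximizes (11)."""
    ns = len(pulse)
    return max(range(ns), key=lambda tau: log_likelihood(record, pulse, tau, sigma2))
```

For example, `estimate_epoch(make_record([1, -1, -1, 1, -1, 1, 1, -1, 1], raised_cosine(16, 24.0), 5), raised_cosine(16, 24.0))` searches the 16 candidate epochs for a noiseless record with $K = 8$. Adding Gaussian samples to the record and repeating over many trials gives a Monte Carlo estimate of the jitter in the spirit of the evaluation described above.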
Fig. 8. Suboptimum synchronizer.

V. SUBOPTIMUM SYNCHRONIZERS
In Section IV we showed the structure of the optimum (maximum-likelihood) synchronizer and interpreted it in terms of physical operations on the received data. In this section we use the optimum structure as a guide to a (suboptimum) synchronizer structure that is easier to implement. Referring to (11), we recall that the first operation performed on the received data is a correlation or matched filter operation. For the pulse type signals of Fig. 4 a single-pole low-pass RC filter with time constant approximately equal to the symbol duration ($RC \approx T$) is a reasonably good approximation to the matched filters. The log cosh nonlinearity is nearly a square law device for low-level inputs and a magnitude device for high-level inputs, as illustrated in Fig. 7. We choose to use a square law device rather than a log cosh device because it is easier to analyze. (We later investigate the effect of the nonlinearity on system performance.) Finally, the summation in (11) can be interpreted as an averaging operation on the nonlinearity outputs. In our analog system a bandpass filter can serve to time average the nonlinearity outputs due to all past inputs. Hence we arrive at the suboptimum system presented in Fig. 8.

The input sequence of positive and negative symbols (and noise) is passed into the low-pass filter, which eliminates some of the noise, thereby increasing the SNR. Unfortunately, it also introduces some intersymbol interference because of its memory. (Each input pulse is smeared into the subsequent pulses.) The filter output is passed into the nonlinearity, which flips the negative pulses (and noise) positive. The resulting waveform is not periodic (it would not be periodic even in the noiseless case because of the smearing effects of the low-pass filter on the random input sequence), but has some energy at the bit rate frequency. The bandpass filter is centered at the frequency corresponding to the bit rate and passes only the energy near the bit rate frequency. The synchronizer output is a sine wave (approximately) whose zero crossings are the synchronizer estimates of the start times of each incoming pulse. (The output could be fed into a hard limiter followed by a differentiator to obtain timing pulses whose leading edges contain the synchronization information.)

Fig. 9. Effect of bandwidth of low-pass filter (raised cosine pulse, square law nonlinearity; jitter versus $2E/N_0$).

Note that the memory of the suboptimum synchronizer can be adjusted by changing the bandwidth of the bandpass filter. The optimum system weighted the data from all past intervals equally, but the bandpass filter is characterized by an exponential weighting into the past. (The envelope of its impulse response decreases exponentially.) Therefore, in order to compare the memory times of the optimum and suboptimum systems an effective memory time (in symbol durations) was used as follows: (12)
where $C_i$ is the relative weighting assigned to the $i$th preceding time interval. (This definition of effective measurement time is due to Price [8]. See also [9].)

A mathematical analysis of the suboptimum synchronizer has been completed; it is quite complex and the computations are very tedious. Hence we present here only a brief overview; details can be found in [7]. The input to the synchronizer is a sample function from a random process which is itself the sum of two random processes, i.e., the random sequence of signals and the noise. The power spectrum of the signal process can be computed by drawing on results due to Huggins [10], Zadeh [11], and Barnard [12]. Since the low-pass filter is a linear device, the power spectra at the filter output due to signal and noise can be computed separately. Furthermore, they are simply related to the input processes. The signal spectrum and the noise spectrum (including signal-cross-noise terms) at the output of the square law device
were computed by the direct method for nonlinear devices [13]. From these spectra the SNR at the input to the bandpass filter can be determined. Next, the probability density function for the phase of the output of the bandpass filter was assumed to have the same functional form as the density function of the phase of a sine wave in additive Gaussian noise, i.e.,

$$p(\theta) = \frac{1}{2\pi}\exp(-d)\left\{1 + \left[\sqrt{\pi d}\,\cos\theta\right]\left[\exp\left(\sqrt{d}\,\cos\theta\right)^2\right]\left[1 + \operatorname{erf}\left(\sqrt{d}\,\cos\theta\right)\right]\right\}. \tag{13}$$
(This assumption was verified by experiment.) This is a one-parameter density function depending only on the SNR parameter $d$; $d$ is related by a constant to the SNR computed at the input to the bandpass filter. This analysis was used to determine the dependence of the performance of the suboptimum synchronizer on the various system parameters and to optimize the suboptimum synchronizer relative to these parameters. For example: What is the optimum bandwidth (3-dB frequency) of the low-pass filter, and how sensitive is the performance to variations in this parameter? Fig. 9 shows the jitter $\overline{|\epsilon|}$ for the raised cosine symbol and an equivalent memory time of 60 ($K = 60$) symbol durations. $f_c$ is the ratio of the filter cutoff frequency to the bit rate frequency. To interpret these curves we recall that the low-pass filter affects both the signal and the noise. (It eliminates noise, but also smears the signals.) Hence at high SNRs performance is limited by the signal smearing effect and the optimum bandwidth is $f_c = 1.0$ (or more). At low SNRs the noise effect predominates and performance is enhanced by accepting some smearing in order to eliminate more noise. However, since the effect of the noise can also be mitigated by increasing the memory time (decreasing the bandwidth of the bandpass filter), we conclude that the low-pass filter bandwidth should be set equal to the bit rate frequency.
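As a numerical check on the one-parameter density (13), the sketch below evaluates it in its standard normalized form (with the factor $1/2\pi$ for $\theta$ on $(-\pi, \pi)$) and integrates it by the midpoint rule. The function names are hypothetical, and the conversion from phase to timing jitter, $\overline{|\epsilon|}/T = \overline{|\theta|}/2\pi$, is an assumption made here (one bit period corresponding to one cycle of the bit rate sine wave), not a statement from the paper.

```python
import math


def phase_density(theta, d):
    """Density of the phase of a sine wave in Gaussian noise, SNR parameter d.

    exp(-d) * exp(d cos^2) is written as exp(-d sin^2) and 1 + erf(c) as
    erfc(-c) so that large d does not overflow.
    """
    c = math.sqrt(d) * math.cos(theta)
    peak = math.sqrt(math.pi) * c * math.exp(-d * math.sin(theta) ** 2) * math.erfc(-c)
    return (math.exp(-d) + peak) / (2.0 * math.pi)


def density_integral(d, n=4000):
    """Midpoint-rule integral of the density over (-pi, pi); should be close to 1."""
    h = 2.0 * math.pi / n
    return h * sum(phase_density(-math.pi + (i + 0.5) * h, d) for i in range(n))


def mean_abs_phase(d, n=4000):
    """Mean magnitude of the phase error, in radians."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi + (i + 0.5) * h
        total += abs(theta) * phase_density(theta, d)
    return h * total


def mean_abs_jitter_fraction(d):
    """Mean magnitude jitter as a fraction of T (assumed mapping theta -> T*theta/2pi)."""
    return mean_abs_phase(d) / (2.0 * math.pi)
```

With $d = 0$ the density is uniform and the mean magnitude jitter is a quarter of a symbol; it shrinks as $d$ grows, which is the qualitative dependence on SNR and memory discussed in the text.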
Fig. 10. Effect of nonlinearity (jitter versus $2E/N_0$; $f_c = 1.0$, 36 symbols of memory). (a) Raised cosine pulse. (b) Raised cosine and square pulses.

Fig. 11. Effect of signal waveform (jitter versus $2E/N_0$; $f_c = 1.0$, absolute value nonlinearity). (a) 36 symbols of memory. (b) 18 symbols of memory.
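The signal path of Fig. 8 (low-pass filter, square law device, narrowband filter at the bit rate) can be imitated in discrete time as a rough sketch. Several pieces below are assumptions rather than the authors' design: the one-pole recursion used for the RC filter, the single complex resonator used in place of the analog bandpass filter (it tracks the bit rate component with an exponentially decaying weighting), the `memory_symbols` time constant (which is not the effective memory time of (12)), and all function names. The returned value locates the bit rate component only up to a fixed offset set by the pulse shape and the filter delay.

```python
import cmath
import math
import random


def raised_cosine(ns):
    """Sampled raised cosine symbol with ns samples (unnormalized)."""
    return [1.0 - math.cos(2.0 * math.pi * n / ns) for n in range(ns)]


def make_record(symbols, pulse, tau, n_symbols):
    """Noiseless record of n_symbols symbol durations, delayed by tau samples.

    Requires len(symbols) >= n_symbols + 1.
    """
    ns = len(pulse)
    stream = [a * v for a in symbols for v in pulse]
    start = ns - tau
    return stream[start:start + n_symbols * ns]


def timing_offset(record, ns, fc=1.0, memory_symbols=36.0):
    """Return the phase of the bit-rate component, expressed in samples [0, ns).

    fc is the low-pass cutoff as a fraction of the bit rate.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * fc / ns)   # one-pole RC low-pass
    rho = math.exp(-1.0 / (memory_symbols * ns))        # exponential weighting
    y = 0.0
    c = 0j
    for n, x in enumerate(record):
        y += alpha * (x - y)                            # low-pass filter
        z = y * y                                       # square law device
        c = rho * c + z * cmath.exp(-2j * math.pi * n / ns)  # bit-rate resonator
    return (-cmath.phase(c) * ns / (2.0 * math.pi)) % ns
```

Because of the unknown fixed offset, the sketch is most useful for comparing records: delaying the same symbol sequence by a few samples should move the returned offset by about the same number of samples (modulo `ns`).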
The effect of low-pass filter bandwidth for the half-sine and square symbols was also investigated. The results for the half-sine symbol are similar to the results for the raised cosine symbol but not quite as pronounced at low SNRs. Hence the same conclusion is reached (set $f_c = 1.0$). For square pulses low-pass filtering is absolutely essential. Indeed, for the case of no noise and no low-pass filter, the output of the square law device would be a constant voltage. In order to generate energy at the bit rate frequency, some distortion must be introduced. We found that performance is essentially independent of low-pass filter bandwidth for sufficiently high SNRs; for low SNRs lower cutoff frequencies are better. Again, since the effect of the noise can be mitigated by increasing the memory time, $f_c = 1.0$ appears to be a reasonable setting. Detailed results for both cases (half-sine and square symbols) are given in [7]. A prototype of the suboptimum system was constructed in the laboratory, and the effect of the low-pass filter bandwidth on the jitter $\overline{|\epsilon|}$ was measured. These experimental results agreed quite well with those predicted by the analysis. Detailed results are given in [7]. The effect of the nonlinearity was also investigated by experiment since the square law nonlinearity is the only one amenable to analysis. Laboratory measurements were obtained for the absolute value and square root nonlinearities as well as for the square law device (see Fig. 7). Fig. 10(a) is typical of the results obtained for the rounded pulse shapes (raised cosine and half-sine). (Results for
the other symbol waveforms can be found in [7].) We conclude that for rounded pulses the specific shape of the nonlinearity makes little difference so long as it is an $n$th law device with $1 < n < 2$. We also took measurements with a hard limiter (infinite clipper) nonlinearity since it is commonly used, especially for square pulses. The performances for the hard limiter and absolute value nonlinearities are compared in Fig. 10(b) for the raised cosine and square symbol waveforms. For the raised cosine symbol the absolute value nonlinearity yields significantly better performance, while the reverse is true for the square symbol. Note, however, that the shape of the symbol waveform is much more significant than the shape of the nonlinearity, especially at SNRs of practical interest.

Further experimental data were obtained to determine the effect of the symbol waveform on the performance of the suboptimum synchronizer. Performances for raised cosine, half-sine, ramp, and square pulses are presented in Fig. 11 for the system with the absolute value nonlinearity. Again we note that the square pulses are very inefficient (approximately one order of magnitude in SNR) compared to the other waveforms. The effect of symbol waveform is even more pronounced when the synchronizer is incorporated into the detection system, as illustrated in Fig. 2(b). The memory times required to achieve any specified degradation are presented in Fig. 12 for the raised cosine, half-sine, and square symbols for SNRs $2E/N_0 = 4, 9, 16$. For 0.1-dB degradation and an SNR of 9, the system requires two orders of magnitude more memory for square pulses than for rounded pulses (raised cosine or half-sine). Also note that this effect cannot be eradicated by increasing the SNR.

Fig. 12. Effect of symbol waveform (memory versus degradation of $2E/N_0$ in dB).

This drastic dependence of detection system performance on waveshape is due to two effects, one concerning the synchronizer and the other the correlation detector. We have already noted that the synchronizer performs significantly better with rounded pulses. (For the same amount of jitter $\overline{|\epsilon|}$, rounded pulses outperform square pulses by an order of magnitude in SNR.) Hence we focus our attention on the correlation detector. Now, the probability of a detection error depends only on the SNR at the output of the correlation detector. The output of the correlation detector depends on the input signal, the input noise, and the timing error. Since it is a linear device, the outputs due to signal and noise can be considered separately. It is easy to show that the output due to noise is independent of the timing error. That is, the rms value of the noise depends only on the signal energy (not the waveform) and the correlation time, and for reasonable values of synchronizer memory the correlation times do not change significantly from bit to bit. Hence timing error affects only the correlator output due to signal.

The correlator output due to signal depends on the timing error and the autocorrelation function of the signal waveform. Suppose, for the moment, that only a single signal is transmitted and that the timing error is $\epsilon$. Then the correlator output due to signal is $R(\epsilon)$, the autocorrelation function evaluated at $\epsilon$ [see (6)]. When a sequence of signals is transmitted, the output due to signal also contains a component due to the preceding signal (for positive $\epsilon$) or the succeeding signal (for negative $\epsilon$), i.e., the correlator output is now given by $R(\epsilon) \pm R(T - |\epsilon|)$. If both signals have the same sign, the signal component at the correlator output is increased, while if one signal is a positive symbol and the other negative, the signal component is decreased. Since these two events occur with equal probability, and since the probability of error is nearly a linear function of SNR for small SNR perturbations, we conclude that this has a small effect on system performance. Hence the primary effect is due to the first term $R(\epsilon)$. In Fig. 13 we have sketched $R(\epsilon)$ versus $\epsilon$ for the three symbol waveforms. Note that for the rounded pulses $R(\epsilon)$ is quite flat for small values of $\epsilon$, while for the square pulse $R(\epsilon)$ decreases much faster with $\epsilon$. Since $R(\epsilon)$ is essentially the SNR (except for a constant scale factor) at the correlator output, the dependence of detection system performance on symbol waveform is quite clear. For a given timing error $\epsilon$ (if the system is to be at all practical, $\epsilon$ is reasonably small) rounded pulses perform significantly better than square pulses. Furthermore, we recall that rounded pulses result in significantly smaller timing errors than square pulses. The combined effect yields the result illustrated in Fig. 12.

Fig. 13. Signal autocorrelation functions.

VI. OPTIMUM AND SUBOPTIMUM SYNCHRONIZERS COMPARED

In Section IV we presented the optimum (maximum-likelihood) synchronizer and showed how its performance depends on the basic system parameters. In Section V we suggested a suboptimum synchronizer and showed how its performance depends on the basic system parameters. We have also pointed out that the suboptimum synchronizer is relatively simple to implement while the optimum synchronizer does not appear to be practical. In this section we compare the performances of the optimum and suboptimum synchronizers.

In Fig. 14(a) we compare the mean magnitude jitter performances of the optimum and suboptimum synchronizers [Fig. 2(a)] for $K = 8$ for the raised cosine, half-sine, and square symbol waveforms. Note first that for the raised cosine symbol the suboptimum system performs nearly as well as the optimum system (within 1 dB).
The same is true for the half-sine symbol. On the other hand, we note an order of magnitude difference between the performances of the optimum and suboptimum systems for the square pulse. From this and from additional curves given in [7] we conclude that for all practical purposes the suboptimum synchronizer performs as well as the optimum synchronizer for rounded pulses, but not for square pulses. As indicated in Fig. 10(b), the performance for square pulses can be improved somewhat by using the hard limiter nonlinearity.

Fig. 14. Comparison of optimum and suboptimum synchronizers. (a) Mean magnitude jitter versus $2E/N_0$. (b) Memory versus degradation of $2E/N_0$ in dB.

The performances of the detection system [Fig. 2(b)] utilizing both the optimum and suboptimum synchronizers are compared on the basis of SNR degradation for the three symbol waveforms in Fig. 14(b). Note that for the rounded pulses the suboptimum detection system is inferior to the optimum system by approximately a factor of two in the memory time required for the same degradation. For the square symbol the discrepancy between the optimum and suboptimum detection systems is approximately a factor of 10 in equivalent memory time.
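The degradation measure used in these comparisons, defined in Section III through (4) and (5), can be explored numerically. The sketch below is illustrative only: it normalizes $T = 1$ and $E = 1$, computes $R(\epsilon)$ of (6) by the midpoint rule, and, as a simplifying assumption, replaces the jitter density $p(\epsilon)$ by a single fixed timing error of magnitude `eps` (smaller than $T/2$). Function names are hypothetical.

```python
import math


def half_sine(t):
    return math.sqrt(2.0) * math.sin(math.pi * t) if 0.0 < t < 1.0 else 0.0


def raised_cosine(t):
    return math.sqrt(2.0 / 3.0) * (1.0 - math.cos(2.0 * math.pi * t)) if 0.0 < t < 1.0 else 0.0


def square(t):
    return 1.0 if 0.0 < t < 1.0 else 0.0


def autocorr(pulse, eps, n=2000):
    """R(eps) of (6) with T = 1, by the midpoint rule over the overlap region."""
    eps = abs(eps)
    if eps >= 1.0:
        return 0.0
    h = (1.0 - eps) / n
    return h * sum(pulse((i + 0.5) * h) * pulse((i + 0.5) * h + eps) for i in range(n))


def p_error(snr):
    """Equation (4): error probability with perfect synchronization, snr = E/N0."""
    return 0.5 * (1.0 - math.erf(math.sqrt(snr)))


def p_error_jitter(snr, a, b):
    """Integrand of (5) for one timing error; a = R(eps)/R(0), b = R(T-eps)/R(0)."""
    root = math.sqrt(snr)
    return 0.5 - 0.25 * math.erf(root * (a + b)) - 0.25 * math.erf(root * (a - b))


def degradation_db(snr, pulse, eps):
    """SNR increase (dB) needed to bring the jittered error rate back to (4)."""
    if abs(eps) >= 0.5:
        raise ValueError("this sketch assumes |eps| < T/2")
    r0 = autocorr(pulse, 0.0)
    a = autocorr(pulse, eps) / r0
    b = autocorr(pulse, 1.0 - abs(eps)) / r0
    target = p_error(snr)
    if p_error_jitter(snr, a, b) <= target:
        return 0.0
    lo, hi = snr, 2.0 * snr
    while p_error_jitter(hi, a, b) > target:   # bracket the solution
        hi *= 2.0
    for _ in range(60):                        # bisection on snr + Delta
        mid = 0.5 * (lo + hi)
        if p_error_jitter(mid, a, b) > target:
            lo = mid
        else:
            hi = mid
    return 10.0 * math.log10(hi / snr)
```

Calling `degradation_db(4.0, square, 0.05)` and `degradation_db(4.0, raised_cosine, 0.05)` compares the two pulse shapes at the same timing error; the square pulse needs the larger SNR increase because its $R(\epsilon)$ falls off linearly while the rounded pulses are flat near $\epsilon = 0$. A full evaluation of (5) would average `p_error_jitter` over a measured or assumed density $p(\epsilon)$ instead of a single value.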
VII. CONCLUSIONS

We have determined the performances of the optimum (maximum-likelihood) and some suboptimum (Fig. 8) self-bit synchronization systems as functions of the pertinent system parameters. This was accomplished through a combined program of analysis, simulation, and laboratory experimentation. Two criteria were considered: synchronizer jitter and detection system SNR degradation.

The most significant parameters (in terms of system performance) were found to be the input SNR, the shape of the signal waveform, and the memory time of the synchronizer. System performance is less sensitive to the bandwidth of the synchronizer input filter and, within limits, the synchronizer nonlinearity.

The dependences on input SNR and synchronizer memory are roughly equivalent in the sense that doubling the memory has the same effect on performance as doubling the input SNR. This is in agreement with some general conclusions on adaptive systems stated in [14]. Synchronizer jitter and detection system performance are very sensitive to the shape of the signal waveform for both the optimum and suboptimum synchronizers considered. Synchronizer performance can be improved by one order of magnitude, and detection system performance can be improved by two orders of magnitude, by using rounded pulses rather than square pulses. That is, with square pulses the input SNR must be increased by a factor of 10 in order to achieve the same amount of synchronizer jitter incurred with rounded pulses. With square pulses the input SNR must be increased by a factor of 100 in order to achieve the same error rate obtained with rounded pulses. These comparisons were made with square and rounded pulses having equal energy or average power. On a peak power basis the difference in performance is not quite so great, but rounded pulses still yield significantly better performance.

We also found that the best nonlinearity for use with rounded pulses in the suboptimum system is an $n$th law device with $1 < n < 2$; for square pulses a hard limiter gives better performance.

Finally, we determined that the suboptimum synchronizer, properly adjusted, performs almost as well as the optimum synchronizer for rounded pulses; for square pulses this is not the case.

REFERENCES

[1] S. W. Golomb, J. R. Davey, I. S. Reed, H. L. Van Trees, and J. J. Stiffler, "Synchronization," IEEE Trans. Communications Systems, vol. CS-11, pp. 481-491, December 1963.
[2] J. H. Van Horn, "A theoretical synchronization system for use with noisy digital signals," IEEE Trans. Communication Technology, vol. COM-12, pp. 82-90, September 1964.
[3] J. J. Stiffler, "Maximum likelihood symbol synchronization," JPL Space Programs Summary, vol. 4, pp. 349-357, October 1965.
[4] H. L. Van Trees, "Optimum power division in coherent communication systems," IEEE Trans. Space Electronics and Telemetry, vol. SET-10, pp. 1-9, March 1964.
[5] D. Chase, "Communication over noise channels with no a priori synchronization information," M.I.T. Res. Lab. of Electronics, Cambridge, Mass., Tech. Rept. 463, February 1968.
[6] P. A. Wintz and J. C. Hancock, "An adaptive receiver approach to the time synchronization problem," IEEE Trans. Communication Technology, vol. COM-13, pp. 90-90, March 1965.
[7] P. A. Wintz and E. J. Luecke, "Performances of self-bit synchronization systems," School of Electrical Engineering, Purdue University, Lafayette, Ind., Tech. Rept. TR-EE68-1, January 1968.
[8] R. Price, "Error probabilities for adaptive multichannel reception of binary signals," M.I.T. Lincoln Lab., Lexington, Mass., Tech. Rept. 258, July 1962.
[9] J. G. Proakis, P. R. Drouilhet, Jr., and R. Price, "Performance of coherent detection systems using decision-directed channel measurement," IEEE Trans. Communications Systems, vol. CS-12, pp. 54-63, March 1964.
[10] W. H. Huggins, "Signal-flow graphs and random signals," Proc. IRE, vol. 45, pp. 74-86, January 1957.
[11] L. A. Zadeh and W. H. Huggins, "Signal-flow graphs and random signals," Proc. IRE (Correspondence), vol. 45, pp. 1413-1414, October 1957.
[12] R. D. Barnard, "On the discrete spectral densities of Markov pulse trains," Bell Sys. Tech. J., vol. 43, pp. 233-259, January 1964.
[13] W. Davenport and W. Root, Random Signals and Noise. New York: McGraw-Hill, 1958.
[14] J. C. Hancock and P. A. Wintz, Signal Detection Theory. New York: McGraw-Hill, 1966.

Paul A. Wintz (S'61-M'64), for a photograph and biography, please see page 290 of the April, 1969, issue of this TRANSACTIONS.

Edgar J. Luecke (M'59) was born in Cleveland, Ohio, on October 6, 1933. He received the B.S.E.E. degree from Valparaiso University, Valparaiso, Ind., in 1955, the M.S.E.E. degree from the University of Notre Dame, Notre Dame, Ind., in 1957, and the Ph.D. degree in electrical engineering from Purdue University, Lafayette, Ind., in 1968. From 1955 to 1963 he was an Instructor and Assistant Professor of Electrical Engineering at Valparaiso University. During 1963 he was an NSF Faculty Fellow at Purdue University and from 1964 to 1966 a Research Instructor. Since 1966 he has been Associate Professor of Electrical Engineering at Valparaiso University. Dr. Luecke is a member of Tau Beta Pi, Eta Kappa Nu, and the American Society for Engineering Education.
Correlative Digital Communication Techniques ADAM LENDER,
AbstTQCt-A new method for the transmission of intelligence by means of a signal having certain correlation properties has been evolved. The theoretical and practical aspects of this concept are presented. An important advantage of these techniques is that for a fixed performance criteria, considerably higher speeds are possible compared to the presently known methods. In addition, the implementation. is simple and straightforward. An unus~al property of these techniques is the capability of error detection without the introduction of redundancy into the original data. Finally, expressions for spectral distributions and error performance as well as methods for practical implementation, including the errordetection process, are presented.
I. REVIEW OF MULTILEVEL TECHNIQUES
ACCORDING TO the well-known postulate [1] of information theory, it is possible to attain virtually error-free transmissions up to the maximum rate of

$$C = W \log_2\left(1 + \frac{S}{W N_0}\right)\ \text{bits per second (b/s)} \tag{1}$$
where W represents the channel bandwidth, S the average signal power, and $N_0$ the Gaussian noise power in a one-cycle band. The theory also states that to achieve such a capacity, an exceedingly complex signaling system is required. In existing systems only a small fraction of this maximum possible channel capacity has ever been obtained. This paper proposes methods and describes practical approaches which are felt to represent a step toward realizing this theoretical goal.

In high-speed digital communications, multilevel techniques have been known and used for some time; recently an experimental system with 32 discrete signaling levels was described [2]. Multilevel techniques employ b discrete signaling levels and represent $\log_2 b$ binary channels. For a fixed peak power, a b-level system has approximately $20 \log_{10}(b - 1)$ dB penalty relative to binary, but at the same time has $\log_2 b$ times greater capacity. As a result, the net noise penalty of a b-level multilevel system relative to binary is

$$10 \log_{10} \frac{(b-1)^2}{\log_2 b}\ \text{dB} \tag{2}$$
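As a rough numerical companion to (1) and (2), the following Python sketch evaluates the capacity bound and the net multilevel penalty. The function names and the sample level counts are illustrative choices and do not come from the paper.

```python
import math


def channel_capacity(bandwidth_hz, signal_power, noise_density):
    """Equation (1): C = W log2(1 + S / (W N0)), in bits per second."""
    return bandwidth_hz * math.log2(
        1.0 + signal_power / (bandwidth_hz * noise_density)
    )


def multilevel_net_penalty_db(b):
    """Equation (2): 10 log10((b - 1)^2 / log2 b), in dB, for b >= 2 levels."""
    return 10.0 * math.log10((b - 1) ** 2 / math.log2(b))


if __name__ == "__main__":
    # Tabulate the net penalty for a few (arbitrarily chosen) level counts.
    for levels in (2, 4, 8, 16, 32):
        print(levels, round(multilevel_net_penalty_db(levels), 2))
```

For b = 2 the penalty is 0 dB by construction, since a two-level system is the binary reference.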
Equation (2), however, does not take into account the noise impairment due to intersymbol interference which, in multilevel systems, increases rapidly with the number of signaling levels. Moreover, multilevel systems require complex equipment. Recognizing the shortcomings of multilevel systems, new techniques have been explored to obtain more efficient digital communications in terms of both performance and equipment. The common characteristic of existing multilevel codes is the absence of correlation between the digits. Attention was directed to the possibility of utilizing discrete signaling levels which would be correlated in the process of generating such levels, yet be treated independently in the detection process. One such technique has been described in an earlier paper [3]. Generation of a time wave consisting of correlated signaling levels permits overall spectrum shaping in addition to individual pulse shaping. It is, for instance, possible to redistribute the spectral energy so as to concentrate most of it at low frequencies or to eliminate any power at low frequencies. Such techniques are termed correlative.

Manuscript received July 21, 1964. Presented at the 1964 IEEE International Convention and published originally in the 1964 IEEE International Convention Record, vol. 12, pt. 5, pp. 45-53.
The author is with Lenkurt Electric Co., Inc., San Carlos, Calif.
II. THE POLYBINARY¹ CONCEPT

Let a binary message with two signaling levels (MARK and SPACE) represented by NRZ (nonreturn to zero) digits be transformed into a signal with b signaling levels numbered consecutively from zero to (b - 1) starting at the bottom. All even-numbered levels are identified as SPACE, and all odd-numbered ones as MARK, although this labeling can, of course, be easily reversed. Both the original message and the polybinary signal have an identical digit duration of T seconds. There are no restrictions on the number of levels, b. Such a signal, termed polybinary, is capable of transmitting (b - 1) binary channels over a bandwidth which normally accommodates only a single binary channel.

The binary message is transformed into the polybinary signal in two steps. In the first step the original sequence $a_n$, consisting of MARKS and SPACES, is converted into another binary sequence $d_n$ in such a manner that the present binary digit of sequence $d_n$ represents the modulo 2 sum of (b - 2) immediately preceding digits of sequence $d_n$ and the present digit of $a_n$. The second step involves the transformation of the binary sequence $d_n$ (in which 1 and 0 now no longer represent MARK and SPACE) into the polybinary pulse train $p_n$ by adding algebraically (not modulo 2) the present digit of sequence $d_n$ to the (b - 2) preceding digits of sequence $d_n$. One result of the transformation from sequence $a_n$ to $p_n$ is that SPACE and MARK in sequence $a_n$ are mapped into even- and odd-numbered levels respectively in sequence $p_n$. This is significant, since each digit in sequence $p_n$ can be independently detected in spite of the strong correlation properties. The primary consequence of such properties is the redistribution of the spectral density of the original sequence $a_n$ into energy compressed near low frequencies for the new sequence $p_n$.

¹ I.e., a plurality of binary channels.
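The two-step transformation just described can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function name and the `initial_register` argument are assumptions, and the register of past $d_n$ digits is taken to be all zeros unless stated otherwise.

```python
def polybinary_encode(a, b, initial_register=None):
    """Encode binary digits a (1 = MARK, 0 = SPACE) into b polybinary levels.

    Returns (d, p): the intermediate binary sequence d_n and the polybinary
    levels p_n. initial_register holds the (b - 2) digits of d_n preceding
    the message, oldest first (assumed all zeros when omitted).
    """
    if b < 2:
        raise ValueError("b must be at least 2")
    if initial_register is None:
        register = [0] * (b - 2)
    else:
        register = list(initial_register)
    if len(register) != b - 2:
        raise ValueError("initial_register must hold b - 2 digits")
    d, p = [], []
    for a_n in a:
        past = sum(register)        # the (b - 2) preceding digits of d_n
        d_n = (a_n + past) % 2      # step 1: modulo-2 sum
        p_n = d_n + past            # step 2: algebraic sum, a level in 0..b-1
        d.append(d_n)
        p.append(p_n)
        register = (register + [d_n])[1:]   # shift the new digit in
    return d, p


if __name__ == "__main__":
    message = [int(c) for c in "0011101010011000"]
    coded, levels = polybinary_encode(message, 3)
    print("".join(map(str, coded)))   # 0010110011101111
    print(levels)
```

With b = 3 and a zero starting register, the 16-bit input 0011101010011000 gives the coded pattern 0010110011101111, which is the pair of patterns printed with the b = 3 AM-PSK waveforms of Fig. 8. Every level has the same parity as the input digit it carries, so each digit can be read independently.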
Reprinted from IEEE Transactions on Communication Technology, December 1964. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
It is well known that the continuous component of the spectral density of the sequence $a_n$, consisting of uncorrelated binary digits, is

$$W_1(f) = \frac{1}{T}\,|G(f)|^2\, pq \tag{3}$$

where p is the probability of MARK and q = (1 - p) is the probability of SPACE, G(f) is the Fourier transform of the pulse shape, and 1/T is the speed in b/s. Similarly, it is shown in Appendix I that the spectral density for the sequence $d_n$ is

$$W_2(f) = \begin{cases} W_1(f)\,Z_1 & \text{for } b = 3 \\ W_1(f)\,Z_2 & \text{for } b > 3 \end{cases} \tag{4}$$

where

$$Z_1 = \frac{1}{(1-2p)^2 - 2(1-2p)\cos 2\pi fT + 1} \tag{5}$$

$$Z_2 = \frac{1 + (1-2p)^2}{(1-2p)^4 - 2(1-2p)^2 \cos\left[(b-1)\,2\pi fT\right] + 1} \tag{6}$$

Finally, the spectral density of the sequence $p_n$ as shown in Appendix I is

$$W_3(f) = \begin{cases} \dfrac{1}{T}\,|G(f)|^2 \left[\dfrac{\sin (b-1)\pi fT}{\sin \pi fT}\right]^2 pq\,Z_1 & \text{for } b = 3 \\[2ex] \dfrac{1}{T}\,|G(f)|^2 \left[\dfrac{\sin (b-1)\pi fT}{\sin \pi fT}\right]^2 pq\,Z_2 & \text{for } b > 3 \end{cases} \tag{7}$$

[Figure: the weighting factors $Z_1$ (b = 3) and $Z_2$ (b > 3) plotted against frequency for p = 0.25, 0.5, and 0.75.]
Fig. 1. Weighting factors of spectral densities.
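The closed forms (5) and (6) can be cross-checked numerically against the cosine series of Appendix I. The Python sketch below does this under stated assumptions: the function names are illustrative, frequency enters only through the product fT, and the series is simply truncated after a fixed number of terms. The covariance values used are those that follow from Appendix I, namely $R(k) - m_1^2 = \tfrac14 (1-2p)^{2k/(b-1)}$ at the contributing lags and zero elsewhere.

```python
import math


def z1(p, f_t):
    """Weighting factor (5) for b = 3; f_t is the product f*T."""
    s = 1.0 - 2.0 * p
    return 1.0 / (s * s - 2.0 * s * math.cos(2.0 * math.pi * f_t) + 1.0)


def z2(p, f_t, b):
    """Weighting factor (6) for b > 3; f_t is the product f*T."""
    r = (1.0 - 2.0 * p) ** 2
    angle = (b - 1) * 2.0 * math.pi * f_t
    return (1.0 + r) / (r * r - 2.0 * r * math.cos(angle) + 1.0)


def z_from_series(p, f_t, b, terms=2000):
    """Truncated series (18) for sequence d_n, divided by p*q (0 < p < 1).

    For b = 3 every lag k contributes; for b > 3 only lags that are
    multiples of (b - 1) contribute.
    """
    s = 1.0 - 2.0 * p
    step = 1 if b == 3 else b - 1
    total = 0.25                       # R(0) - m1^2 = 1/2 - 1/4
    for m in range(1, terms + 1):
        k = m * step
        exponent = (2 * k) // (b - 1)
        total += 0.5 * s ** exponent * math.cos(2.0 * math.pi * k * f_t)
    return total / (p * (1.0 - p))


if __name__ == "__main__":
    print(z1(0.75, 0.1), z_from_series(0.75, 0.1, 3))
    print(z2(0.25, 0.1, 5), z_from_series(0.25, 0.1, 5))
```

For p = 0.5 both factors reduce to 1 at every frequency, so that $W_2(f) = W_1(f)$.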
It is interesting to note that the weighting factor $Z_2$ of the spectral density in (6) and (7) is a symmetrical function of p and q, where q = 1 - p; when p ≠ 0.5, the energy density is more concentrated near low frequencies than for p = 0.5. The case for b = 3 is an exception. Both weighting factors shown in (5) and (6) are plotted for representative values of p in Fig. 1. That the spectrum of the polybinary sequence $p_n$ is compressed by the factor of (b - 1) relative to the binary sequence $a_n$ can be shown for the simplest case of equal likelihood of MARK and SPACE and rectangular NRZ pulses of unit height, i.e., when

$$p = q = 1/2 \tag{8}$$

$$G(f) = T\,\frac{\sin \pi fT}{\pi fT} \tag{9}$$

Then $Z_1 = Z_2 = 1$, and substitution of (8) and (9) into (3) and (7) yields

$$W_1(f) = \frac{T}{4} \left(\frac{\sin \pi fT}{\pi fT}\right)^2 \quad \text{(for binary)} \tag{10}$$

$$W_3(f) = \frac{(b-1)^2\,T}{4} \left[\frac{\sin (b-1)\pi fT}{(b-1)\pi fT}\right]^2 \quad \text{(for polybinary)} \tag{11}$$
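A short Python check, offered only as a sketch with illustrative function names, confirms that (11) follows from (7) under conditions (8) and (9) and that the polybinary spectrum is the binary spectrum compressed in frequency by (b - 1).

```python
import math


def sinc_sq(x):
    """(sin(pi x) / (pi x))^2, with the limiting value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2


def w1_binary(f, T):
    """Equation (10): binary spectrum for p = q = 1/2 and NRZ pulses."""
    return (T / 4.0) * sinc_sq(f * T)


def w3_polybinary(f, T, b):
    """Equation (11): polybinary spectrum for p = q = 1/2 and NRZ pulses."""
    return ((b - 1) ** 2 * T / 4.0) * sinc_sq((b - 1) * f * T)


def w3_from_eq7(f, T, b):
    """Equation (7) with (8), (9) and Z = 1; f*T must not be an integer."""
    g_sq = (T * math.sin(math.pi * f * T) / (math.pi * f * T)) ** 2
    ratio = math.sin((b - 1) * math.pi * f * T) / math.sin(math.pi * f * T)
    return g_sq * ratio ** 2 * 0.25 / T


if __name__ == "__main__":
    T, b = 1.0, 5
    for f in (0.05, 0.13, 0.2):
        print(f, w3_polybinary(f, T, b), w3_from_eq7(f, T, b))
```

The first spectral null of (11) falls at f = 1/((b - 1)T), compared with 1/T for the binary signal of (10).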
In practical situations the (b - 1) compression factor can be closely approached by judicious choice of network characteristics depending upon the value of b and the maximum permissible intersymbol interference. The net noise penalty of the polybinary signal relative to binary is approximately

$$10 \log_{10}(b-1)\ \text{dB} \tag{12}$$

The intersymbol interference is inherently small, since the only possible transitions in two successive digit-time slots occur between adjacent signaling levels. Equation (12) should be compared with (2) for multilevel systems. However, (12) is pessimistic and does not take into account the peculiar property of the b-level polybinary sequence, namely, that errors occur only when an even-numbered level is changed to an odd-numbered level, or vice versa. An exact expression for the probability of error for any value of b in the presence of Gaussian noise for the polybinary system is developed in Appendix II and is

$$P_e = \frac{1}{2} + 2^{1-b} \sum_{k=1}^{b-1} \binom{b-1}{k} \sum_{i=1}^{k} (-1)^i \operatorname{erf}\left\{(2i-1)\left[\frac{E}{2bN_0}\right]^{1/2}\right\} \tag{13}$$

where $E/N_0$ is the normalized signal-to-noise ratio (SNR). Equation (13) is plotted for several values of b in Fig. 2.

Fig. 2. Probability of error vs. average signal power per bit per second divided by noise power density in dB.

III. THE POLYBIPOLAR² CONCEPT

In some applications it is desirable to transmit a baseband signal which has no dc component and only a small amount of energy at low frequencies. A well-known signal with this property is the bipolar [4]. It has three levels, occupies approximately the same bandwidth as binary and represents a single binary channel. However, when greater capacity is desired, the polybipolar concept is useful. As in the case of a bipolar signal, there is no dc component. The polybipolar transformation process is similar to the polybinary transformation described in Section II. It maps the original binary sequence $a_n$ into the polybipolar sequence $g_n$ with b levels representing (b - 1)/2 binary channels. The only restriction is that b must be an odd number. Binary sequence $a_n$ is converted into binary sequence $d_n$ as before. In the second step, each digit of the sequence $g_n$ is formed by adding algebraically to the present digit of the sequence $d_n$, (b - 3)/2 immediately preceding digits and subtracting the remaining (b - 1)/2 digits. As a result, the center level of the sequence $g_n$ always represents SPACE of the original sequence $a_n$. The adjacent positive and negative levels are consecutively numbered ±1, ±2, etc., starting from the center level so that even-numbered levels again represent SPACES and odd-numbered MARKS. From Appendix I the spectrum of the polybipolar sequence $g_n$ for b > 3 is

$$W_4(f) = \frac{1}{T}\,|G(f)|^2\, \frac{4 \sin^4\left[\frac{b-1}{2}\,\pi fT\right]}{\sin^2 \pi fT}\; pq\,Z_2 \tag{14}$$

The conventional bipolar spectrum is merely (14) with b = 3 and the weighting factor $Z_1$, i.e.,

$$W_4(f)\Big|_{b=3} = \frac{4}{T}\,|G(f)|^2 \sin^2(\pi fT)\; pq\,Z_1 \tag{15}$$

where $Z_1$ and $Z_2$ are defined in (5) and (6). When the conditions in expressions (8) and (9) prevail, the polybipolar spectrum is compressed exactly by the factor (b - 1)/2 relative to bipolar. Such a compression is reasonably well approached in practice for more general conditions. Polybipolar signals have greater intersymbol interference than polybinary because transitions in two successive digit slots involve one or two steps between signaling levels. All other characteristics of polybipolar signals such as independent detection of digits, orderly distribution of even and odd levels (SPACES and MARKS, respectively), as well as the probability of error developed in Appendix II and shown in Fig. 2 are similar to the properties of polybinary signals.

² I.e., a plurality of bipolar channels.

IV. IMPLEMENTATION OF POLYBINARY AND POLYBIPOLAR SYSTEMS AND EXPERIMENTAL RESULTS

Implementation of the correlative systems as shown in Fig. 3 is straightforward and simple. The transmission filter H(ω) is a passive LC filter with nominal half-amplitude point at $f_1$ Hz. It represents to a rough approximation the conversion characteristics combined with pulse shaping. The conversion characteristics correspond to the transformation rules for the sequence $d_n$ to $p_n$ for polybinary and the sequence $d_n$ to $g_n$ for polybipolar. The input data rate at A of Fig. 3 is $2(b - 1)f_1$ b/s for polybinary and $(b - 1)f_1$ b/s for polybipolar. At point B the coder transforms the original data sequence $a_n$ to the binary sequence $d_n$ which in turn is converted to the band-limited polybinary or polybipolar time function at C. In the reconversion at the receiving end, each signaling level is clearly associated with MARK or SPACE (1 or 0) regardless of previous decisions; this permits the use of one of the two methods shown in Fig. 3. Such an experimental system was implemented for specific values of b.

B of Fig. 4 shows the polybinary waveform for a 16-bit pattern with five levels representing four binary channels and corresponding to the original data input in A of Fig. 4 at 4800 b/s. These waveforms appear at points A and C of Fig. 3. To indicate the degree of intersymbol interference, an eye pattern is shown for random data input at the same speed. For comparison, eye patterns for b = 3 and 4 are also shown at speeds of 2400 and 3600 b/s with appropriate coding through the same filter H(ω) as in the system for b = 5.

Results of the polybipolar experimental system with b = 5 at 2400 b/s appear in Fig. 5. The bandwidth is approximately the same as for the polybinary system with b = 5 at 4800 b/s. No dc is present, and a higher intersymbol interference in the eye pattern is evident due to one- and two-step transitions. For comparison, the bipolar signal is also shown at 1200 b/s through the same filter.

V. ERROR DETECTION IN POLYBINARY AND POLYBIPOLAR SYSTEMS
Based on previous considerations, a b-level correlative time function has recurrent correlation properties extending over (b - 1) digits. These properties are not utilized in the signal detection but can be employed to check pattern violations without introducing redundant digits into the original binary message. Such violations result in errors which are easily detected. Error detection is performed at the receiving end by comparing the received polybinary or polybipolar waveform with the decoded binary data. A simple, yet efficient method for implementation is shown in Fig. 6. The regenerated binary data sequence $a_n$ at A is converted to the binary sequence $d_n$ at B. As a result, the shift register always contains (b - 2) past digits of sequence $d_n$. Every time the polybinary or polybipolar signal is in one of the extreme signaling levels, an error check is performed. The top and bottom slicers in Fig. 6 sense these extreme levels and provide respective indications to gates G3 or G4. When either of the extreme states occurs, the pattern of (b - 2) digits in the shift register is examined. For example, when the top state is indicated for the polybipolar signal, the first (b - 3)/2 stages of the register should be in binary state 1 and the remaining (b - 1)/2 stages in binary state 0. If there is a disagreement, an error indication is provided, and the register is reset to the proper states. Consequently, for polybipolar signals inverter A inverts only the first (b - 3)/2 states of the shift register and inverter B the last (b - 1)/2 states. For polybinary signals inverter B is absent and inverter A inverts all the states of the shift register; at the same time, reset Y resets all stages to binary 1 and reset W to binary 0. In the case of polybipolar signals, reset Y resets the first (b - 3)/2 stages to binary 1 and the remaining to binary 0; reset W does exactly the opposite.

An interesting aspect of the error-detection process is demonstrated when a violation which does not result in an error occurs; e.g., when one even-numbered level is changed to another even-numbered level. In general, the type of error detector described will not give an indication except for the singular case when such violations result in a transition to one of the extreme signaling levels. It has been experimentally verified that such cases occur rarely and have relatively little bearing on the error-detection process.

Fig. 3. Polybinary or polybipolar system. (a) Applicable to any value of b. (b) Applicable when b = 2^n + 1 and n is an integer.

Fig. 4. Polybinary waveforms for b = 3, 4, 5, over identical low-pass filter. [Panels: A, input data pattern for b = 5 at 4800 bits/s: 1100010110111100; B, output 5-level pattern: 3322234332111122, followed by 110001233 etc. at the beginning of a new period; C, D, E, waveforms at 4800, 3600, and 2400 bits/s.]

Fig. 5. Polybipolar waveforms for b = 3, 5, over identical band-pass filter. [Panels: A, input data pattern for b = 5 at 2400 bits/s: 1100101010001000; B, output 5-level pattern: -1, -1, 0, 0, +1, +2, +1, 0, -1, -2, 0, +2, +1, 0, 0, 0; C, waveform at 2400 bits/s; D, waveform at 1200 bits/s.]

Fig. 6. Error-detection system for b-level signal.
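The polybinary case of the check described above can be sketched in Python. This is an illustrative software analogue of the shift-register arrangement, not a model of the gates and inverters of Fig. 6; the function names are assumptions, and the coder and detector registers are both assumed to start at all zeros.

```python
def polybinary_encode(a, b):
    """Polybinary coder of Section II with an all-zero starting register."""
    register = [0] * (b - 2)
    levels = []
    for a_n in a:
        past = sum(register)
        d_n = (a_n + past) % 2
        levels.append(d_n + past)
        register = (register + [d_n])[1:]
    return levels


def detect_polybinary_errors(levels, b):
    """Decode received polybinary levels and flag pattern violations.

    Returns (decoded, flags). flags[n] is True when an extreme level arrives
    while the register of (b - 2) past digits of d_n is not all ones (top
    level) or all zeros (bottom level).
    """
    register = [0] * (b - 2)       # assumed to match the coder's start state
    decoded, flags = [], []
    for level in levels:
        a_hat = level % 2          # odd level = MARK, even level = SPACE
        violation = False
        if level == b - 1 or level == 0:
            expected = 1 if level == b - 1 else 0
            if any(stage != expected for stage in register):
                violation = True
                register = [expected] * (b - 2)   # reset to the proper states
        d_hat = (a_hat + sum(register)) % 2       # regenerate d_n from a_n
        register = (register + [d_hat])[1:]
        decoded.append(a_hat)
        flags.append(violation)
    return decoded, flags


if __name__ == "__main__":
    sent = polybinary_encode([1, 1, 0, 1, 0, 0, 1], 3)   # [1, 1, 0, 1, 2, 2, 1]
    received = list(sent)
    received[1] = 2                                      # one level in error
    print(detect_polybinary_errors(received, 3))
```

In this small b = 3 example the corrupted digit is not flagged when it arrives, because the register happens to agree with the top level; the violation shows up one digit later, when the bottom level arrives with a 1 still in the register.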
VI. THE POLYBINARY AM-PSK PROCESS WITH ENVELOPE DETECTION

The correlative baseband techniques already described are suitable for any type of carrier modulation. However, the most interesting signal characteristics result from the unique combination of the polybinary technique with AM-PSK modulation. The restriction on the number of levels is that b must be odd. Suppose the center level of the polybinary signal is represented by the absence of carrier, the (b - 1)/2 levels above the center by equal increments of carrier and the (b - 1)/2 levels below the center also by equal increments but with carrier reversed by 180°. Remembering that b is odd and that inherent symmetry exists with respect to the center level, the original b amplitude levels of the polybinary signal are reduced to (b + 1)/2 amplitude levels of the AM-PSK carrier signal. The demodulator at the receiving end completely disregards phase reversals and detects only the envelope of the carrier. Since, however, the signal follows the polybinary rules, the bandwidth is compressed by a factor of (b - 1) relative to the straight binary AM. For example, the system has only two carrier amplitude states for b = 3, yet it requires only half the bandwidth of conventional on-off AM. Conversely, for a fixed bandwidth, the polybinary AM-PSK for b = 3 has twice the bit capacity of binary AM. In fact, there is a 3-dB noise advantage for this case over binary AM, since AM-PSK still has two levels but requires half the bandwidth.

The system should not be confused with AM-PSK multilevel systems [5]. For instance, in the case of four binary channels the AM-PSK multilevel system requires 16 states, all of which must be distinguished at the receiver; two separate detectors are needed to recover two amplitude levels and eight phases for a total of 16 possible conditions. In polybinary AM-PSK, to accommodate four binary channels, b = 5, but (b + 1)/2 calls for only three amplitude levels; the 180° phase reversal is ignored by the receiver.

Figure 7 shows the general implementation for the AM-PSK polybinary system. The coder again converts the original binary sequence $a_n$ at A into the sequence $d_n$ at B. Phase modulation is generated digitally to provide a binary two-phase modulated wave at C. The H(ω) band-pass filter performs the polybinary conversion and shaping so that at D the polybinary AM-PSK wave appears. The receiver is a conventional envelope detector followed by slicers. For such a digitally generated phase-modulated wave with one carrier cycle of unit amplitude per bit and 0° and 180° shift at C, the spectral density at point D can be represented by (7) with

$$G(f) = T\,\frac{\left(\sin \pi fT/2\right)^2}{\pi fT/2} \tag{16}$$

Fig. 7. AM-PSK polybinary system.

Fig. 8. AM-PSK polybinary waveforms with b = 3 at 2400 bits per second. [Input data pattern: 0011101010011000; coded pattern: 0010110011101111.]

Fig. 9. AM-PSK polybinary waveforms with b = 5 at 2400 bits per second. [Input data pattern: 1100010110001101; coded pattern: 1001111010100001.]
If a unit amplitude sine wave carrier with n cycles per bit were employed in the generation of the phase modulation, the corresponding G(f) in (7) would be

$$G(f) = \frac{T}{\pi n}\,\frac{\sin \pi fT}{1 - (fT/n)^2} \tag{17}$$
The system was implemented for b = 3 and 5 at 2400 b/s with carrier at 2400 Hz. For b = 3 the band-pass filter attenuation increases rapidly above 3000 Hz and below 1800 Hz, and for b = 5 the bandwidth of this filter is reduced by approximately a factor of two. Figure 8 shows the AM-PSK waveforms for a 16-bit pattern and b = 3. The letter designations correspond to those in Fig. 7. The last waveform represents an eye pattern for a random input. Similarly, Fig. 9 shows the AM-PSK waveforms for b = 5 with the same designations. It should be noted that the 3-level pattern in F of Fig. 9 as well as the eye pattern for a random binary input are not duobinary [3] in spite of the fact that MARKS are at the center level and SPACES at the extreme levels.
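The folding of b polybinary levels into (b + 1)/2 envelope levels, and the recovery of MARK and SPACE from the envelope alone, can be illustrated with a small Python sketch. The function names and the integer "envelope step" representation are assumptions made for illustration; filtering, carrier generation, and slicing thresholds are not modeled.

```python
def am_psk_envelope(levels, b):
    """Map polybinary levels 0..b-1 (b odd) to envelope steps 0..(b-1)/2.

    The center level corresponds to no carrier; levels above and below it
    differ only by a 180-degree phase reversal, which the envelope ignores.
    """
    if b % 2 == 0:
        raise ValueError("b must be odd for polybinary AM-PSK")
    center = (b - 1) // 2
    return [abs(level - center) for level in levels]


def decode_envelope(envelope, b):
    """Recover MARK (1) / SPACE (0) from the envelope steps alone."""
    center = (b - 1) // 2
    return [(step + center) % 2 for step in envelope]


if __name__ == "__main__":
    for b in (3, 5):
        all_levels = list(range(b))
        env = am_psk_envelope(all_levels, b)
        print(b, env, decode_envelope(env, b))
```

For b = 5 the envelope takes three values, with MARK at the middle envelope step and SPACE at the two extremes, in line with the remark about the 3-level pattern in F of Fig. 9. For b = 3 the carrier is either present (SPACE) or absent (MARK).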
VII. CONCLUSIONS

The correlative digital communication techniques described in this paper indicate that high data transmission rates per cycle of bandwidth can be achieved with much less SNR penalty compared to the present multilevel techniques. Moreover, the correlation properties of the coded wave, representing the binary message, can be used to detect errors without introducing redundant digits into the original data stream. Signals of this type can be easily generated with or without dc and low frequency components. A unique combination with AM-PSK results in a technique which is superior to other digital modulation techniques, such as, e.g., a multilevel AM-PSK combination. Finally, the circuit implementation is straightforward and simple.

APPENDIX I

SPECTRAL DENSITIES OF POLYBINARY AND POLYBIPOLAR SEQUENCES

The continuous component of the spectral density [6] of a pulse train is

$$W(f) = \frac{1}{T}\,|G(f)|^2 \left\{ R(0) - m_1^2 + 2 \sum_{k=1}^{\infty} \left[R(k) - m_1^2\right] \cos 2\pi k fT \right\} \tag{18}$$

where $R(k) = \operatorname{ave}(d_n d_{n+k})$ = autocovariance of the binary sequence $d_n$. The binary sequences $a_n$ and $d_n$, consisting of 1's and 0's, as well as their relationship are defined in Section II and can be expressed as

$$a_j = \sum_{i=0}^{b-2} d_{j+i} \pmod 2 \tag{19}$$

where $a_j$ is the jth digit of sequence $a_n$. Also,

$$m_1 = \operatorname{ave}(d_n) = 1/2 \tag{20}$$

$$R(0) = \operatorname{ave}(d_n^2) = 1/2 \tag{21}$$

To evaluate R(k) of sequence $d_n$, the probabilities of patterns with (k + 1) digits will be considered. The probability of each pattern is $2^{2-b}$ multiplied by the product of the conditional probabilities of each of the remaining (k + 3 - b) digits. This is in agreement with the physical process of generating sequence $d_n$, since the first (b - 2) digits in the shift register of Fig. 3 are arbitrary with equal likelihood for 1 and 0. The product of the remaining (k + 3 - b) conditional probabilities equals the product of the probabilities of MARKS and SPACES for the pattern in sequence $a_n$ which corresponds to the pattern in sequence $d_n$. Such probabilities are denoted, respectively, by p and q, where q = 1 - p; and they are represented in sequence $a_n$ by the symbols 1 and 0. The only binary patterns in sequence $d_n$ which contribute to nonzero terms in R(k) are those that start and end with binary 1. For such an arrangement we seek the relations, if any, which are imposed upon the sequence $a_n$ by the value of the modulo 2 sum of the first and last digits of sequence $d_n$; i.e., by the value of $d_0 \oplus d_k$, where $\oplus$ indicates modulo 2 addition. When b = 3, a relation exists for any value of k and amounts merely to

$$\sum_{i=0}^{k-1} a_i = d_0 \oplus d_k \pmod 2 \tag{22}$$

When b > 3, a relation exists only when k is an integral multiple of (b - 1), i.e., k = n(b - 1), and is

$$\sum_{i=0}^{n-1} \left[a_{i(b-1)} + a_{i(b-1)+1}\right] = d_0 \oplus d_k \pmod 2 \tag{23}$$
Equation (23) is represented by the summation pattern in Fig. 10. Each horizontal line segment represents (b - 1) digits of sequence $d_n$ necessary to form a digit in sequence $a_n$ in accordance with (19). The horizontal lines are summed up along the vertical axis corresponding to the right-hand side of (23). Since addition is modulo 2, all parallel line segments cancel leaving only single line segments corresponding to $d_0$ and $d_{n(b-1)}$. It follows from the pattern in Fig. 10 that unless the number of digits in a pattern of sequence $d_n$ is a multiple of (b - 1), it is not possible to obtain cancellation of all but the first and last digits. The summation of specific digits of sequence $a_n$, taken vertically, corresponds to the left-hand side of (23). The case of b = 3 in Fig. 10 is unique, since all, rather than the selected digits of sequence $a_n$, are summed up.

Fig. 10. Pattern diagrams for sequences $a_n$ and $d_n$.

The case when k = n(b - 1) is first considered. Since the only patterns of interest in sequence $d_n$ are such that $d_0 = d_k = 1$, (23) is zero, and the positions of sequence $a_n$ indicated by (23) have an even number of 1's. There are 2k/(b - 1) such positions. Moreover, sequence $d_n$ has $2^{k-1}$ distinct patterns and sequence $a_n$ only $2^{k+2-b}$ out of a possible $2^{k+3-b}$. But each pattern in sequence $d_n$ has a corresponding one in sequence $a_n$, therefore sequence $a_n$ has $(2^{k-1})(2^{k+2-b})^{-1} = 2^{b-3}$ identical sets. Based on previous considerations, binary 1 in sequence $a_n$ symbolizes the probability of MARK and binary 0 the probability of SPACE. Remembering that the product of the end digits in patterns of sequence $d_n$ is 1, the autocovariance of sequence $d_n$ for k = n(b - 1) is

$$R(k) = (1/2)^{b-2}\,(2^{b-3}) \sum_{i=0,2,4,\ldots}^{2k/(b-1)} \binom{2k/(b-1)}{i}\, p^i\, q^{\,2k/(b-1) - i} \tag{24}$$

The remaining (k + 3 - b) - 2k/(b - 1) positions in sequence $a_n$ form all possible binary patterns, and the corresponding summation yields unity in (24). When k ≠ n(b - 1) and b > 3, all possible binary combinations of sequence $a_n$ exist, and there are $(2^{k-1})(2^{k+3-b})^{-1} = 2^{b-4}$ identical sets each of which has all possible binary patterns. Hence, for this case the autocovariance of sequence $d_n$ is

$$R(k) = (1/2)^{b-2}\,(2^{b-4}) = 1/4 \tag{25}$$

Based on (22), the case of b = 3 is included in (24). Next, (20), (21), (24), and (25) are substituted into (18) to yield the spectral density of the binary sequence $d_n$

$$W_2(f) = \begin{cases} \dfrac{1}{T}\,|G(f)|^2\, pq\,Z_1 & \text{for } b = 3 \\[2ex] \dfrac{1}{T}\,|G(f)|^2\, pq\,Z_2 & \text{for } b > 3 \end{cases} \tag{26}$$

where p is the probability of MARK (binary 1) in sequence $a_n$, and $Z_1$ and $Z_2$ are expressed in (5) and (6). The polybinary sequence $p_n$ is obtained from sequence $d_n$ by adding algebraically consecutive groups of (b - 1) digits. This is equivalent to multiplying $W_2(f)$ in (26) by

$$\left|\sum_{m=0}^{b-2} e^{-im\omega T}\right|^2 = \left[\frac{\sin (b-1)\pi fT}{\sin \pi fT}\right]^2 \tag{27}$$

so that the spectral density of sequence $p_n$ is

$$W_3(f) = W_2(f) \left[\frac{\sin (b-1)\pi fT}{\sin \pi fT}\right]^2 \tag{28}$$

Similarly, the polybipolar spectral density is obtained by adding consecutively the first (b - 1)/2 digits of sequence $d_n$ and subtracting the remaining (b - 1)/2 digits, so that $W_2(f)$ in (26) is multiplied by

$$\left|\sum_{m=0}^{(b-3)/2} e^{-im\omega T} - \sum_{m=(b-1)/2}^{b-2} e^{-im\omega T}\right|^2 = \frac{4 \sin^4\left[\frac{b-1}{2}\,\pi fT\right]}{\sin^2 \pi fT} \tag{29}$$

and the spectral density of sequence $g_n$ is

$$W_4(f) = W_2(f)\, \frac{4 \sin^4\left[\frac{b-1}{2}\,\pi fT\right]}{\sin^2 \pi fT} \tag{30}$$

APPENDIX II

PROBABILITY OF ERROR FOR BASEBAND POLYBINARY AND POLYBIPOLAR SIGNALS

In a b-level polybinary system with a peak voltage of A volts, the power dissipated in a one-ohm resistor for the kth level is

$$\left(\frac{kA}{b-1}\right)^2 \tag{31}$$

Assuming MARK and SPACE being equally likely or p = q = 1/2, the probability $P_k$ of the kth level in the polybinary system is

$$P_k = (1/2)^{b-1} \binom{b-1}{k} \tag{32}$$

where k = 0, 1, 2, ..., b - 1. Similarly, in the polybipolar system the probability of the jth level is

$$P_j = (1/2)^{b-1} \sum_{i=j}^{(b-1)/2} \binom{\frac{b-1}{2}}{i} \binom{\frac{b-1}{2}}{i-j} \tag{33}$$

where j = 0, ±1, ±2, ..., ±(b - 1)/2. Equation (33) reduces to (32) by using the identity (see [7])

$$\sum_{i=j}^{(b-1)/2} \binom{\frac{b-1}{2}}{i} \binom{\frac{b-1}{2}}{i-j} = \binom{b-1}{j + \frac{b-1}{2}} \tag{34}$$

and changing the variable j + (b - 1)/2 to k so that the results are applicable to both polybinary and polybipolar systems. From (31) and (32) the average signal power S in a b-level system is

$$S = \sum_{k=0}^{b-1} (1/2)^{b-1} \binom{b-1}{k} \left(\frac{kA}{b-1}\right)^2 = \frac{A^2 b}{4(b-1)} \tag{35}$$

The key point in calculating the probability of error is that the only time an error occurs is when an even-numbered amplitude level is changed into an odd-numbered level, or vice versa. Hence the probability of error is the summation of the probabilities of noise being present in specific amplitude zones when the kth level is present. To illustrate this point, let the distance between any signaling level and the adjacent slicing level be α, where α = A/[2(b - 1)]. If, for example, the zero level is present, only noise in the amplitude zones α to 3α, 5α to 7α, etc., will result in an error. Assuming additive Gaussian noise with noise power $N = \sigma^2$ and signal power given by (35), the probability of error $P_e$ is

$$P_e = \frac{1}{2} + \sum_{k=0}^{b-1} P_k Y_k \tag{36}$$
THE BEST OF THE BEST
where Pk is defined in (32) and
f; (-l)'erf
b-k-"l
Y k = 1/2
.
[SIN
{.
(21. - 1) 2b(b _ 1)
k + 1/2 ~ (~l)ierf { (2i
erf(x)
= ~;.
e
[
-
Finally, combining (39) and (40), the probability of error vs. the normalized SNR is
J1h}
SIN
1) 2b(b _ 1)
J1h}
1''' e-~ dt
r, (37)
=
1/2
+2
1
6 (b - 1) b( -1rerf b-l
-
b
k
k
E . { (2i - 1) [ 2bN
(38)
-1/2} o
J
(41)
After simplification of (36) and (37)
r,
= 1/2 + 21 _b
~
~ 1)
t
REFERENCES
(-l)ierf
SIN
· { (2i - 1)[ 2b(b _ 1)
Jl/!}
[1] Shannon, C. E., A mathematical theory of communication,
Bell Sys. Tech. J., vol 27, Oct 1948, PP 623-656.
(39)
If EINo represents the ratio of the average signal power per bit to the noise power density, then
EINi) ~ SWTIN (b -
1).
Here liT is the speed in b/s for the binary channel, and the noise bandwidth W Hz is assumed to equal the frequency of the first zero 'v hen signaling at this rate so that WT = 1 and
E
S
N{) - N(b -
1)
(40)
[2] Lebow, 1. L., et al., Application of sequential decoding to high-rate data communication on a telephone line, IEEE Trans. on Information. Theory (Correspondence), vol IT-9, Apr 1963, PP 124-126. [3J Lender, A., The duobinary technique for high speed data transmission, IEEE Trans. on Communication. and Electronics, vol 82, May 1963, pp 214-218. [4] Aaron, IV!. R., PCM transmission in the exchange plant, Bell Sys. Tech. J., vol 41, Jan 1962, PP 99-141. [5] Hancock, J. C., and R. W. Lucky, Performance of combined amplitude and phase-modulated communication systems, IRE Trans. on Communication. Suetems, vol C8-8, Dec 1960, PP 232-236. [6] Bennett, W. R., Statistics of regenerative digital transmission, Bell Sys. Tech. J., vol 37, Nov 1958, pp 1501-1543. [7] Ryshik, I. M., and I. S. Gradstein, Tables of series, products and integrals. Berlin: Veb Deutscher Verlag der Wissenschaften, 1963, p 4.
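As a numerical companion to the error-probability expression (41) of Appendix II, the following Python sketch evaluates $P_e$ for a given number of levels and normalized SNR. The function name is illustrative, the SNR is taken as a plain ratio rather than in dB, and `math.comb` assumes Python 3.8 or later.

```python
import math


def polybinary_error_probability(b, e_over_n0):
    """Equation (41): probability of error of a b-level polybinary system.

    e_over_n0 is the normalized SNR E/N0 as a plain ratio (not in dB).
    """
    x = math.sqrt(e_over_n0 / (2.0 * b))
    total = 0.0
    for k in range(1, b):
        inner = sum(
            (-1) ** i * math.erf((2 * i - 1) * x) for i in range(1, k + 1)
        )
        total += math.comb(b - 1, k) * inner
    return 0.5 + 2.0 ** (1 - b) * total


if __name__ == "__main__":
    for snr_db in (6, 10, 14):
        ratio = 10.0 ** (snr_db / 10.0)
        row = [polybinary_error_probability(b, ratio) for b in (2, 3, 5)]
        print(snr_db, row)
```

For b = 2 the double sum collapses to a single term and the expression reduces to one half of the complementary error function of $[E/(4N_0)]^{1/2}$, which gives a quick sanity check on the formula.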
Characterization of Randomly Time-Variant Linear Channels
PHILIP A. BELLO, SENIOR MEMBER, IEEE

Summary-This paper is concerned with various aspects of the characterization of randomly time-variant linear channels. At the outset it is demonstrated that time-varying linear channels (or filters) may be characterized in an interesting symmetrical manner in time and frequency variables by arranging system functions in (time-frequency) dual pairs. Following this a statistical characterization of randomly time-variant linear channels is carried out in terms of correlation functions for the various system functions. These results are specialized by considering three classes of practically interesting channels. These are the wide-sense stationary (WSS) channel, the uncorrelated scattering (US) channel, and the wide-sense stationary uncorrelated scattering (WSSUS) channel. The WSS and US channels are shown to be (time-frequency) duals. Previous discussions of channel correlation functions and their relationships have dealt exclusively with the WSSUS channel. The point of view presented here of dealing with the dually related system functions and starting with the unrestricted linear channels is considerably more general and places in proper perspective previous results on the WSSUS channel. Some attention is given to the problem of characterizing radio channels. A model called the Quasi-WSSUS channel is presented to model the behavior of such channels.

All real-life channels and signals have an essentially finite number of degrees of freedom due to restrictions on time duration and bandwidth. This fact may be used to derive useful canonical channel models with the aid of sampling theorems and power series expansions. Several new canonical channel models are derived in this paper, some of which are dual to those of Kailath.

I. INTRODUCTION

... complement the time-variant impulse response which is a time domain method of characterization. Further interesting work on the characterization of time-varying linear filters in terms of system functions has been done by Kailath, who has pointed out that a third type of impulse response may be defined in addition to the two already used for time-variant linear filters. He has defined single and double Fourier transforms of these impulse responses in order to demonstrate that certain variables may be identified with frequencies at the filter input and output and certain variables may be identified with the rate of variation of the filter. However, (excepting the Time-Variant Transfer Function) only the impulse responses and their double Fourier transforms were demonstrated to be system functions; i.e., filter input-output relations were derived which used only impulse responses and their double Fourier transforms.

In Section II we demonstrate that time-varying linear channels (or filters) may be characterized in an interesting symmetrical manner in time and frequency variables by arranging system functions in (time-frequency) dual pairs. Most of these system functions (which include, among others, those introduced by Zadeh and Kailath) are shown
to imply circuit model interpretations or representations of the time-varying linear channels. The relationship between these system functions is demonstrated in a simple way with the aid of a graph involving duality and Fourier transfonnations. When the filter becomes randomly time-variant the various system functions become random processes. An exact statistical characterization of a randomly timevariant linear channel in terms of multidimensional probability density distributions for system functions, while necessary for some theoretical investigations, presupposes more knowledge than is likely to be available in physical situations. A less ambitious but more practical goal involves a statistical characterization in terms of correlation functions for the various system functions, since knowledge of these correlation functions allows a
DRING RECENT YEARS there has been an increasing amount of attention given to the study of randomly time-variant linear channels. This attention has been motivated to a large extent by the advent of troposcatter, ionoscatter, chaff and moon communication links and radar astronomy systems. The determination of optimum modulation and demodulation techniques and the analytical determination of the efficacy of optimum and suboptimum communication (or radar) techniques for such channels depends heavily upon a satisfactory characterization of the transmission channel. Thus, the characterization of randomly timevariant linear channels is of some interest. The characterization of time-variant linear filters (whether random or not) in terms of system functions 2 T. Kailath, "Sampling Models for Linear Time-Variant Filters," received its first general analytical treatment by Zadeh, 1 M.l.T. Research Lab. of Electronics. Cambridge, Mass., Rept. No. May 25, 1959. who introduced the Time-Variant Transfer Function 352;3 P. A~ Bello, "Time-frequency duality,'.' IEEE TRANS. ON Inand the Bi-Frequency Function as frequency domain FORMATION THEORY, vol. IT-10, pp. 18-33; January, 1964. Section V-D of this paper is essentially identical to Section II of the present methods of characterizing time-variant linear filters to one. Note added in J'Jroof: Since the present material has been acReceived May 20, 1963. The work reported in this paper was performed, in part, under Subcontract 480~117D with ITT Communications Systems, Inc., Paramus, N. J. The author is with ADCOM, Ine., Cambridge, Mass. I L. A. Zadeh, "Frequency analysis of variable networks," PROC. IRE, vol. 38, pp. 291-299; March, 1950.
,cepted for publication the author has discovered that A. J. Gersho ["Characterization of time-varying linear systems," Proc, IEEE, (Correspondence), vol. 51, p. 238; January, 1963] also has determined a symmetrical formulation of system functions for timevariant linear channels. His formulation omits the Delay-Doppler Spread and Doppler-Delay Spread Functions and two Kernel System Functions.
Reprinted from IEEE Transactions on Communications Systems, December 1963.
The Best ofthe Best. Edited by W H. Tranter, D. P.Taylor, R. E. Ziemer, N. F. Maxemchuk, and 1. W Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
429
430
determination of the autocorrelation function of the channel output. In Section IV we define and determine relationships between the correlation functions of the various system functions for the general randomly time-variant linear channel. These results are specialized by considering three classes of practically interesting channels. These are the WSS channel, the US channel, and the WSSUS channel. The WSS and US channels are shown to be (time-Irequeney) duals. Previous diacussions':" of channel correlation functions and their relationships have dealt exclusively with the WSSUS channel. Our point of view dealing with the dually related system functions and starting with the unrestricted linear channel is considerably more general and places in proper perspective previous results on the WSSUS channel. Virtually all radio transmission media may be regarded as randomly time-variant linear channels. In the case of the transmission of digital signals over radio transmission media certain simplifications may be effected in channel characterization when the channel contains very slow fluctuations superimposed upon more rapid fluctuations, the latter of which exhibit an approximate statistical stationarity. In Section V we introduce the quasiwidesense stationary uncorrelated scattering (QWSSUS) channel as a means for characterizing such channels. All real-life channels and signals have an essentially finite number of degrees of freedom due to restrictions on time duration and bandwidth. This fact has been exploited by Kailath" to derive canonical channel models for the cases in which the channel band-limits signals at its input or output and in which the channel impulse response is time-limited. With the aid of the dual system functions derived in Section III we derive new canonic sampling models in Section V, SODle of which may be identified as dual to those of Kailath. 
As might be expected, these dual models are particularly useful under the dual time- frequency constraints, namely when the input or output time functions are time-limited or when the channel fading rate is band-limited. In addition we derive two new dually related canonic' channel models, called f-power series and t-power series models. The i-power series model is of particular use in evaluating the effect of frequency selective fading on a signal whose bandwidth is less than the correlation bandwidth of a scatter channel, The t-power series model will be of use in the dual situation, i.e., in evaluating the effect of (time-selective) fading on a pulse whose duration is less than the correlation time constant of a scatter channel. • T. Hagfors "Some properties of radio waves reflected from the moon and their relation to the lunar surface," J. Geophys. Res., vol. 66, pp. 777-785; March, 1961.. .. 6 R. Price and P. E. Green, Jr., "SIgnal Processing In Radar Astronomy." M.l.T. Lincoln Lab., Lexington, Mass., Rept. No. 234; October 6, 1960. . . " 6 P. E. Green, Jr., "Radar measur~mentof target characteristics, in "Radar Astronomy," J. V. Harrington and J. V. Evans, Eds., Chapter 9; to be published.
II. COMPLEX ENVELOPES
A process $x(t)$ whose spectral components cover a band of frequencies which is small compared to any frequency in the band may be expressed as

$$x(t) = \operatorname{Re}\{\gamma(t)\,e^{j\omega_c t}\} \tag{1}$$

where Re{ } is the usual real part notation, $\omega_c$ is some (angular) frequency within the band and $\gamma(t)$ is the complex envelope of $x(t)$. This name for $\gamma(t)$ derives from the fact that the magnitude of $\gamma(t)$ is the conventional envelope of $x(t)$ while the angle of $\gamma(t)$ is the conventional phase of $x(t)$ measured with respect to carrier phase $\omega_c t$. The nonnarrow-band case may be handled with the complex notation also by the use of Hilbert transforms.⁷⁻¹⁰ However, the complex envelope will then no longer have the simple interpretation described above.

Complex envelope notation will be used extensively for the remainder of this paper. However, it should be understood that there is always implied the existence of a center or reference frequency $\omega_c$ which via an equation such as (1) converts the complex time functions under discussion into physical narrow-band signals.

When dealing with problems in which there are wide-band filters (time-variant included) whose inputs and outputs are narrow-band (when expressed with reference to the same center frequency), it is possible to replace these filters with equivalent narrow-band filters which leave the input-output relations invariant. This fact becomes obvious when it is realized that by preceding and following a wide-band filter with narrow-band filters which have flat transfer functions over the range of input and output frequencies of interest, one produces a composite filter which is narrow-band and of course cannot change the input-output relations for the properly restricted class of input and output narrow-band signals.

It is readily demonstrated that (except for an unimportant constant of one-half) the complex envelope of a narrow-band signal at the output of a narrow-band filter due to a narrow-band input may be obtained by passing the complex envelope of the input through an "equivalent" low-pass filter whose impulse response is just equal to the complex envelope of the narrow-band filter impulse response.
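As an illustration of the complex-envelope representation (1), the following minimal Python sketch builds a narrow-band signal from an envelope and checks it against the conventional amplitude-and-phase form. The particular amplitude, phase, and carrier frequency are assumptions chosen for the example and do not come from the paper.

```python
import cmath
import math

# Assumed carrier: 100 Hz, far above the ~1 Hz rate of the envelope below.
OMEGA_C = 2 * math.pi * 100.0


def amplitude(t):
    """Hypothetical slowly varying (always positive) envelope magnitude."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * 1.0 * t)


def phase(t):
    """Hypothetical slowly varying phase relative to the carrier phase."""
    return 0.3 * math.sin(2 * math.pi * 0.5 * t)


def gamma(t):
    """Complex envelope: magnitude = envelope, angle = phase."""
    return amplitude(t) * cmath.exp(1j * phase(t))


def narrowband_signal(t):
    """Real signal x(t) = Re{gamma(t) exp(j omega_c t)}, as in (1)."""
    return (gamma(t) * cmath.exp(1j * OMEGA_C * t)).real


def amplitude_phase_form(t):
    """The same signal written as A(t) cos(omega_c t + phi(t))."""
    return amplitude(t) * math.cos(OMEGA_C * t + phase(t))
```

Evaluating `narrowband_signal(t)` and `amplitude_phase_form(t)` at the same instant should agree to rounding error, and `abs(gamma(t))` recovers the envelope.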
In defining the autocorrelation function of the complex envelope of a random process a certain difficulty appears that is not generally appreciated, namely, that two autocorrelation functions are needed in order to uniquely specify the autocorrelation function of the original real process. This fact is demonstrated by direct calculation of the autocorrelation function of $x(t)$, (1), as

$$\overline{x(t)x(s)} = \tfrac{1}{2}\operatorname{Re}\{\overline{\gamma^*(t)\gamma(s)}\,e^{j\omega_c(s-t)}\} + \tfrac{1}{2}\operatorname{Re}\{\overline{\gamma(t)\gamma(s)}\,e^{j\omega_c(s+t)}\}. \tag{2}$$

Thus the two autocorrelation functions

$$R_\gamma(t,s) = \overline{\gamma^*(t)\gamma(s)}, \qquad \tilde{R}_\gamma(t,s) = \overline{\gamma(t)\gamma(s)} \tag{3}$$

are needed to specify the autocorrelation function of the real process. Fortunately, in most applications the narrow-band process is so constituted that

$$\tilde{R}_\gamma(t,s) = 0. \tag{4}$$

In fact, from (2), one may readily deduce that (4) is necessary if $x(t)$ is to be wide-sense stationary. A simple physical test of $x(t)$ (deterministic components removed) to determine whether (4) is satisfied is to multiply it by itself delayed and examine the sum frequency component for the presence of a deterministic component. According to (2) the complex amplitude of this component is $\tfrac{1}{2}\tilde{R}_\gamma(t,s)$, so that the presence of a deterministic component would mean that (4) is violated.

In the subsequent discussion involving complex envelopes we shall deal only with that form of autocorrelation function which involves the conjugate under the expectation sign. It should be kept in mind, however, that an analogous discussion applies for $\tilde{R}_\gamma(t,s)$ in those cases where it is nonzero.

The above discussion of complex envelopes, equivalent noises and equivalent filters is supplied as a physical justification for our subsequent use of "low-pass" complex time functions, complex white noise and low-pass filters with complex impulse responses.

III. SYSTEM FUNCTIONS FOR TIME-VARIANT LINEAR FILTERS

A. Dual Operators and Kernel System Functions

The concept of "time-frequency" duality is discussed at some length by Bello.³ For the purposes of this section it will be sufficient to define the concept of dual operators. A device which processes communication signals may be thought of in mathematical terms as an operator which transforms input signals into output signals. The inputs and outputs of such a device may be described in either the time or frequency domain according to convenience. Since either time or frequency domain descriptions may be used at the input and output, a two-terminal device (a single-input single-output device) may be described by any one of four operators. If we define time and frequency domain descriptions of processes as dual descriptions, then these four operators may be grouped into dual pairs with the aid of the following definition: Two operators associated with a particular two-terminal-pair device are defined as duals when dual descriptions are used for corresponding inputs and outputs.

If $z(t)$, $Z(f)$ denote the input time function and spectrum, and $w(t)$, $W(f)$ denote the output time function and spectrum of a device, then the four possible operators are described by the equations

$$w(t) = O_{tt}[z(t)], \qquad W(f) = O_{ff}[Z(f)],$$
$$w(t) = O_{tf}[Z(f)], \qquad W(f) = O_{ft}[z(t)] \tag{5}$$

where the operator pairs $O_{tt}$, $O_{ff}$ and $O_{tf}$, $O_{ft}$ individually consist of dual operators.

In the case of a linear device, such as a linear time-variant channel, the four equations in (5) may be formally expressed¹¹ as linear integral operators with associated kernels; i.e.,

$$w(t) = \int z(s)\,K_1(t,s)\,ds, \qquad W(f) = \int Z(l)\,K_2(f,l)\,dl \tag{6}$$

$$w(t) = \int Z(f)\,K_3(t,f)\,df, \qquad W(f) = \int z(t)\,K_4(f,t)\,dt. \tag{7}$$

These kernels are, in effect, system functions and we shall call them kernel system functions to distinguish them from other classes of system functions to be described. It is clear that the system function pairs $K_1$, $K_2$ and $K_3$, $K_4$ may be considered as dual system function pairs. The system functions $K_1(t,s)$ and $K_2(f,l)$ may be recognized as the Time-Variant Impulse Response and the Bi-Frequency Function respectively, used by Zadeh.¹ The system functions $K_3(t,f)$ and $K_4(f,t)$ have not been defined previously.

Without difficulty it may be established that $K_1(t,s)$ and $K_2(f,l)$, besides being duals, are double Fourier transform pairs, and similarly that the dual pairs $K_3(t,f)$ and $K_4(f,t)$ are double Fourier transform pairs. Also $K_1(t,s)$ and $K_3(t,f)$ are single Fourier transform pairs with $t$ considered as a parameter, while $K_2(f,l)$ and $K_4(f,t)$ are single Fourier transform pairs with $f$ considered as a parameter. It is worth noting that $K_1$, $K_2$ and $K_3$, $K_4$ are the only dual pairs of system functions among those to be presented which are related directly as double Fourier transform pairs.

The kernel system functions have simple physical interpretations in terms of the response of the channel to impulses and cissoids. Thus, it is readily determined that if the channel is excited with a unit impulse at $t = s$, the resulting channel output is the time function $K_1(t,s)$ with spectrum $K_4(f,s)$, while if the channel is excited with the cissoid $e^{j2\pi lt}$ (i.e., frequency impulse at $f = l$), the resulting channel output is the time function $K_3(t,l)$ with spectrum $K_2(f,l)$.

The present discussion of kernel system functions has been included primarily in the interest of making our discussion of system functions as complete as possible, and in clarifying our subsequent discussion of system functions. We shall actually make little use of the kernel system functions in the remainder of this paper, primarily because we are interested in circuit model descriptions of the time-variant linear channel and the kernel system functions do not lend themselves readily to such phenomenological descriptions.

7 P. M. Woodward, "Probability and Information Theory," McGraw-Hill Book Co., Inc., New York, N. Y.; 1953.
8 J. Dugundji, "Envelopes and pre-envelopes of real waveforms," IRE TRANS. ON INFORMATION THEORY, vol. IT-4, pp. 53-57; March, 1958.
9 R. Arens, "Complex processes for envelopes of normal noise," IRE TRANS. ON INFORMATION THEORY, vol. IT-3, pp. 204-207; September, 1957.
10 D. Gabor, "Theory of communications," J. IEE, Part III, vol. 93, pp. 429-457; November, 1946.
11 In order to include linear differential operators one must assume that the kernels may include singularity functions.
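The Fourier transform relations among the four kernel system functions of Section III A can be checked numerically in a sampled, finite-length setting. The sketch below is only an illustrative analogue under stated assumptions: time and frequency are replaced by N-point grids, the integrals in (6) and (7) by matrix products, and the kernel values and input samples are arbitrary made-up numbers.

```python
import cmath

N = 4  # assumed number of time (and frequency) samples


def dft_matrix(n, sign):
    """n x n matrix with entries exp(sign * j * 2*pi * r * c / n)."""
    return [[cmath.exp(sign * 2j * cmath.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, v):
    """Matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]


# Hypothetical sampled Time-Variant Impulse Response K1[t][s] and input z[s].
K1 = [[complex((t + 1) * (s + 2) % 5, t - s) for s in range(N)] for t in range(N)]
z = [1 + 0j, 2 - 1j, 0.5j, -1 + 0j]

F = dft_matrix(N, -1)                                      # time -> frequency
F_INV = [[v / N for v in row] for row in dft_matrix(N, +1)]  # frequency -> time

Z = matvec(F, z)    # input spectrum
w = matvec(K1, z)   # discrete form of w(t) = sum_s z(s) K1(t, s)
W = matvec(F, w)    # output spectrum

K2 = matmul(matmul(F, K1), F_INV)  # double transform of K1: W = K2 Z
K3 = matmul(K1, F_INV)             # single transform on the input index: w = K3 Z
K4 = matmul(F, K1)                 # single transform on the output index: W = K4 z
```

In this discrete analogue, applying `K2` to `Z` and `K4` to `z` both reproduce `W`, while applying `K3` to `Z` reproduces `w`, mirroring the four input-output relations in (6) and (7).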
B. Delay-Spread and Doppler-Spread Functions

From a strictly mathematical point of view the kernel system functions are sufficient to describe the time-frequency input-output relations for a time-variant linear channel. From a physical intuitive point of view they are not as satisfactory, since they do not readily allow one to grasp by inspection the way in which the time-variant filter affects input signals to produce output signals. In the remainder of Section III we will be concerned with system functions which, via circuit model analogies, provide a somewhat more physical interpretation of the action of the linear time-variant channel.

Consider first the following input-output relationship for a linear time-variant channel obtained from the first equation in (6) by the transformation $s = t - \xi$:

$$w(t) = \int z(t-\xi)\,g(t,\xi)\,d\xi \tag{8}$$

where

$$g(t,\xi) = K_1(t,\,t-\xi). \tag{9}$$

Eq. (8) leads to a physical picture of the channel as a continuum of nonmoving scintillating scatterers, with $g(t,\xi)\,d\xi$ equal to the (complex) modulation produced by hypothetical elemental "scatterers" that provide delays in the range $(\xi,\,\xi+d\xi)$. Fig. 1 illustrates such a physical picture with the aid of a densely tapped delay line. Note that the input signal is first delayed and then multiplied by the differential scattering gain. We shall call $g(t,\xi)$ the Input Delay-Spread Function to distinguish it from another system function called the Output Delay-Spread Function, to be described below, which leads to a channel representation similar to $g(t,\xi)$ except that the delay occurs on the output side of the channel (and the multiplication on the input).

[Fig. 1. A differential circuit model representation for linear time-variant channels using the Input Delay-Spread Function (densely tapped delay line feeding a summing bus).]

If we consider $z(t)$ to be first multiplied by a differential gain function $h(t,\xi)\,d\xi$ and then delayed by $\xi$ with a continuum of $\xi$ values, we obtain the input-output relationship

$$w(t) = \int z(t-\xi)\,h(t-\xi,\,\xi)\,d\xi \tag{10}$$

and the circuit model representation of Fig. 2. By comparing (10) with (8) and (6) we quickly find that

$$h(t,\xi) = K_1(t+\xi,\,t) = g(t+\xi,\,\xi). \tag{11}$$

[Fig. 2. A differential circuit model representation for linear time-variant channels using the Output Delay-Spread Function (distribution bus feeding a densely tapped delay line).]

From the fact that $K_1(t,s)$ is the channel response at time $t$ due to a unit impulse input at $t = s$, it is seen from (9) and (11) that $g(t,\xi)$ may be interpreted as the response at time $t$ to a unit impulse input $\xi$ seconds in the past, and $h(t,\xi)$ may be interpreted as the response $\xi$ seconds in the future to a unit impulse input at time $t$. Since a physical channel (without internal sources) may not have an output before the input arrives, $K_1(t,s)$ must vanish for $t < s$ and $g(t,\xi)$, $h(t,\xi)$ must vanish for $\xi < 0$. These physical realizability conditions may be explicitly indicated by appropriate limits on the integrals defining input-output relations. However, for simplicity of presentation we will assume that the integral limits are $(-\infty, \infty)$, with the integrand being taken as zero in the appropriate intervals to assure physical realizability.

An entirely dual and just as general channel characterization exists in terms of frequency variables by employing the Input and Output Doppler-Spread Functions, the system functions which are the (time-frequency) duals of the Input and Output Delay-Spread Functions, respectively. Consider first the dual of the Input Delay-Spread Function. Such a system function must relate the channel output spectrum to the channel input spectrum in a manner identical in form to the way $g(t,\xi)$ relates the input and output time functions. This dual characterization involves a representation of the output spectrum $W(f)$ as a superposition of infinitesimal Doppler-shifted (the dual of delayed) and filtered (the dual of modulated) replicas of the input spectrum $Z(f)$. Thus we have

$$W(f) = \int Z(f-\nu)\,H(f,\nu)\,d\nu \tag{12}$$

where $H(f,\nu)$ is the Input Doppler-Spread Function. Eq. (12) may be interpreted physically with the aid of a model dual to that in Fig. 1. To construct such a dual it is necessary to note that the dual of a tapped delay line is a "frequency conversion chain," i.e., a string of frequency converters arranged so that the output of one converter is not only the input to the next converter but is also the "local" frequency shifted output. Fig. 3 illustrates such an interpretation of (12) using a "dense" frequency conversion chain. Note that the quantity $H(f,\nu)\,d\nu$ is to be interpreted as the transfer function associated with hypothetical Doppler-shifting elements that provide frequency shifts in the range $(\nu,\,\nu+d\nu)$. By comparing (12) and the last equation in (6) we find that the Input Doppler-Spread Function is related to Zadeh's Bi-Frequency Function by

$$H(f,\nu) = K_2(f,\,f-\nu) \tag{13}$$

which is an equation dual to (9).

From (10) we deduce that the dual of the Output Delay-Spread Function must provide the input spectrum-output spectrum relationship

$$W(f) = \int Z(f-\nu)\,G(f-\nu,\,\nu)\,d\nu \tag{14}$$

where $G(f,\nu)$ is defined as the Output Doppler-Spread Function. Whereas the Input Doppler-Spread Function leads to a cascaded Doppler shifter-filter realization as indicated in Fig. 3, the Output Doppler-Spread Function leads to a cascaded filter-Doppler shifter realization as shown in Fig. 4. The quantity $G(f,\nu)\,d\nu$ is the transfer function of a hypothetical differential filter at the input which is associated with a Doppler shift of $\nu$ cps at the channel output. By comparing (14) with (13) and (6) we find that

$$G(f,\nu) = K_2(f+\nu,\,f) = H(f+\nu,\,\nu), \tag{15}$$

which is a set of equations dual to (11). Since $K_2(f,l)$ is the value of the spectral response of the channel at a frequency $f$ due to a cissoidal excitation of frequency $l$ cps, it is quickly seen from (13) and (15) that $H(f,\nu)$ may be interpreted as the spectral response of the channel at $f$ cps due to a cissoidal input $\nu$ cycles below $f$, and $G(f,\nu)$ may be interpreted as the spectral response of the channel at a frequency $\nu$ cps above the cissoidal input at the frequency $f$ cps.
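The two tapped-delay-line pictures of Section III B (Figs. 1 and 2) can be compared in discrete time. The following Python sketch is a minimal illustration, assuming integer delays, a three-tap channel, and made-up tap gains and input samples; it evaluates the input form (8) and the output form (10), with the Output Delay-Spread Function obtained from the Input Delay-Spread Function through (11).

```python
import cmath

TAPS = 3  # assumed number of discrete delay taps (delays 0, 1, 2)


def g(n, k):
    """Hypothetical Input Delay-Spread samples g(t=n, xi=k).

    Each tap has a fixed magnitude and its own fading (rotation) rate;
    taps outside 0..TAPS-1 are zero, so the channel is causal.
    """
    if k < 0 or k >= TAPS:
        return 0j
    return (0.5 ** k) * cmath.exp(2j * cmath.pi * 0.05 * (k + 1) * n)


def h(n, k):
    """Output Delay-Spread samples via (11): h(t, xi) = g(t + xi, xi)."""
    return g(n + k, k)


def z(n):
    """Hypothetical finite-length input sequence (zero outside 0..5)."""
    return complex(n + 1, -n) if 0 <= n < 6 else 0j


def output_input_form(n):
    """Discrete (8): delay the input by k, then weight by the gain at output time n."""
    return sum(z(n - k) * g(n, k) for k in range(TAPS))


def output_output_form(n):
    """Discrete (10): weight by the gain at input time n - k, then delay by k."""
    return sum(z(n - k) * h(n - k, k) for k in range(TAPS))
```

Both functions should return the same output sample for every n, which is the discrete counterpart of the statement that the two circuit models describe the same channel.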
!!!lli
(U--======Z=::O:======OE=NS=ELY
TAPPED FREQUENCY CONVERSION CHAIN
c.
Time-Varianl; Transfer Function and FrequencyDependent M odulaiiot: Function
The characterizations of a time-variant channel in terms of the Delay-Spread Functions g(t, ~) and h(t, ,t) and the Time-Variant Impulse Response K1(t, 8) are strictly time domain approaches, while the characterizations in terms of the Doppler-Spread Functions Hi], v) and G(f, v) and the Bi-Frequency Function K 2 (f, l) are entirely frequency domain approaches. In the former cases the output time function is directly related to the input time function, while in. the latter cases the output spectrum is directly related to the input spectrum. As discussed in Section III A and exemplified by the dual kernel system functions Ka(t, f) and K 4 (f, t), two other approaches are possible. These involve an expression of the output time function directly in terms of the input spectrum in one case, and an expression of the output spectrum directly in terms of the input time function in the other. An example of the former approach was first introduced by l Zadeh with the aid of the Time-Variant Transfer Function. In this section we will introduce a new system function called the Frequency-Dependent Modulation Function, which is the (time-frequency) dual of the Time-Variant Transfer Function. This system function relates the output spectrum to the input time function. Assuming we have an input z(t), which may be represented as a summation of infinitesimal cissoidal time functions, i.e.,
=
z(t)
JZ(f)e
i2
o< /l
df
(16)
where Z(f) is the spectrum of z(t), one may determine the channel output by superposing the separate responses to the infinitesimal cissoidal components. The response of the channel to the cissoidal time function exp [j21rlt] (or spectral impulse 0(1 - l)) is given by [see (9)]
f
ei 2 o< W
- fl
g(t,
~) d~ = ei 2r ll T (l , t)
(17)
where (18)
_ _ _...............~ - - - - - - w { t ) ~ SUMMING BUS
Fig. 3-A differential circuit model representation for linear timevariant channels using the Input Doppler-Spread Function.
..!.!'!e!1.I
is the Fourier transform of the Input Delay-Spread Function with respect to the delay parameter. By superposition the network output is given by
DISTRIBUTION BUS
~(t)------~--,---
w(t) =
----..a.-----.. . .
-,,(t)~
DENSELY TAPPED FREQUENCY CONV£RSION CHAIN
Fig. 4-A differential circuit model representation for linear timevariant channels using the Output Doppler-Spread Function.
JZ(f)T(j, t)e
1h f l
df·
(19)
Eq.. (19) shows that even though the channel may be time-variant, one may determine the output by exaeUy the same frequency domain techniques as for time-variant (linear) channels.. This involves, basically, a multiplication of the input spectrum by a system function followed by an inverse Fourier transformation with respect to the frequency variable. For time-variant channels, however, the
434
THE BEST OF THE BEST
system function is a function of the time variable. This explains use of the name Time-Variant Transfer Function to denote T(j, t). By using (14) to determine the spectrum of the response to the frequency impulse ~(f - l), and then inverse Fourier transforming to obtain the corresponding time response, it may be quickly determined that
the spectrum of the response, it may be quickly determined that (26)·
i.e., that the Frequency-Dependent Modulation Function is the Fourier transform of the Output Delay-Spread Function with respect to the delay variable. T(j, t) = G(t, V),prr>1 dv, (20) Also, either by noting, as discussed in Section III A, that K 4 (f 8) is the spectrum of the channel response to an i.e., that the Time-Variant Transfer Function is the inverse excitatio~ oCt - 8), or by comparing (25) and (7), it is Fourier transform of the Output Doppler-Spread Function readily seen that with respect to the Doppler-shift variable. (27) Also, either by noting, as discussed in Section III A, that K 3 (t, l) may be interpreted as the channel response at In the case of time-invariant linear filters the transtime t to an excitation ei2rlt, or by comparing (7) and (15), mission frequency characteristic of the filter can be it is readily seen that determined by direct measurement as the cissoidal reKg(t, f) = ei27fftT(f, t). (21) sponse or else indirectly as the spectrum of the impulse response. For time-variant linear filters these measureTo develop the dual system function we assume that ments procedures yield different results, as exemplified we have an input whose spectrum Z(f) may be represented by the fact that T(f, t), which corresponds to the cissoidal as a summation of infinitesimal cissoidal frequency func- measurement, differs from M (t, f) which corresponds to tions; i.e., impulse response measurement followed by spectral analysis. Z(f} = Z(t)e-i2..-/1 dt. (22) Moreover, as we have shown above, only 'I'I], t) may properly be considered a transmission frequency characterThe spectrum of the response of the channel to the cis- istic, the proper interpretation of M (t, f) being that of 8. soidal frequency function exp [-j21r18] (i.e., to the time channel "modulator.. " function oCt - 8) whose spectrum is exp [-j2'nJs]) is given by [see (12)] D. 
Delay-Doppler-Spread and Doppler-Delay-Spread Functions e-ihCU-')H(j, v) dv = e-;h/'M(s, f) (23) In Section III B it was demonstrated that any linear time-varying channel may be interpreted either as a where continuum of nonmoving scintillating scatterers with the aid of the Delay-Spread Functions, or as a continuum (24) of hypothetical Doppler-shifting elements with associated filters with the aid of the Doppler-Spread Functions. We is the Fourier transform of the Input Doppler-Spread demonstrate in this section that any linear time-varying Function with respect to the Doppler-shift variable. By channel may be represented as a continuum of elements superposition the network output spectrum is given by which simultaneously provide both a corresponding delay and Doppler shift. i h 1t W(j) = z(t)M(t, f)edt. (25) As in Section III B, we can consider system functions classified according to whether the corresponding phenomEq. (25) shows that, even though the channel may be a enological channel model has its delay operation or general time-variant linear filter, one may determine the Doppler-shift operation at the channel input or output. output spectrum by exactly the same time domain Since delay and Doppler shift both occur in the model to techniques as for a channel which acts as a pure complex be described, only two possibilities exist, i.e., input-delay multiplier (or modulator). This involves, basically, a output-Doppler-shift and input-Doppler-shift outputmultiplication of the input time function by a complex delay, rather than the four possibilities of Section III B. time function characterizing the channel, followed by a To determine the system function corresponding to the Fourier transformation with respect to the time variable. input delay output Doppler-shift channel model, we For general time-variant channels, however, the complex express the Input Delay-Spread Function get, ~) as the multiplier is frequency-dependent. 
This explains our use inverse Fourier transform of its spectrum (where ~ is of the name Frequency-Dependent Modulation Function considered to be a fixed parameter), i.e., to denote M(t, f). By using (6) to determine the time function response to g(t,~) = U(~, v)eih.1 dv, (28) the input o(t - s) and then Fourier transforming to obtain
J
J
J
J
J
435
Fifty Years ofCommunications and Networking
and then use (28) in (8) to obtain the following inputoutput relationship: w(t) =
Jf
~)e;2UIU(~, v) dv d~.
z(t -
(29)
Examination of (29) shows that the output is represented as a sum of delayed and then Doppler-shifted elements, the element providing delays in the interval (~, E + d~) and Doppler shifts ill the interval (v, v + dv) having a differential scattering amplitude U(~, p)dlld~. For this reason we call U(t, v) the Delay-Doppler-Spread Function. In all entirely analogous way, in order to determine the dual system function, i.e., that corresponding to the input Doppler-shift output-delay channel model, we express the Input Doppler-Spread Function as a Fourier transform
E. Relationship Benoeen Susiem Functions
$$H(f, \nu) = \int V(\nu, \xi)\, e^{-j2\pi \xi f}\, d\xi \tag{30}$$

and then use (30) in (12) to obtain the input-output relationship

$$W(f) = \iint Z(f - \nu)\, e^{-j2\pi \xi f}\, V(\nu, \xi)\, d\xi\, d\nu. \tag{31}$$

Examination of (31) shows that the output is represented as a sum of Doppler-shifted and then delayed elements, the element providing Doppler shifts in the interval $(\nu, \nu + d\nu)$ and delays in the interval $(\xi, \xi + d\xi)$ having a differential scattering amplitude $V(\nu, \xi)\, d\xi\, d\nu$. For this reason we call $V(\nu, \xi)$ the Doppler-Delay-Spread Function.

If we Fourier transform both sides of (29) with respect to $t$ and inverse Fourier transform both sides of (31) with respect to $f$ we obtain the equations

$$W(f) = \iint Z(f - \nu)\, e^{-j2\pi (f - \nu)\xi}\, U(\xi, \nu)\, d\nu\, d\xi \tag{32}$$

and

$$w(t) = \iint z(t - \xi)\, e^{j2\pi \nu (t - \xi)}\, V(\nu, \xi)\, d\xi\, d\nu. \tag{33}$$

A comparison of (31) and (32) or (29) and (33) reveals that $U(\xi, \nu)$ and $V(\nu, \xi)$ are simply related; i.e.,

$$U(\xi, \nu) = V(\nu, \xi)\, e^{-j2\pi \nu \xi}. \tag{34}$$

If the integration with respect to $\xi$ is carried out in (28) and the integration with respect to $\nu$ is carried out in (33), one finds that

$$h(t, \xi) = \int V(\nu, \xi)\, e^{j2\pi \nu t}\, d\nu \tag{35}$$

and

$$G(f, \nu) = \int U(\xi, \nu)\, e^{-j2\pi f \xi}\, d\xi. \tag{36}$$

At this point the reader may be somewhat bewildered by the variety of system functions that have been introduced. In addition to the four kernel system functions, we have discussed eight other system functions. The relationships between the kernel system functions are rather clearly outlined in Section III A. The relationships between the other eight system functions can be simply portrayed by grouping them according to duality and Fourier transform relationships. This grouping is illustrated in Fig. 5, in which the dashed line labeled D signifies that the system functions occupying mirror image positions with respect to the dashed line are duals, and the line labeled F signifies that the system functions at the terminals of the line are related by single Fourier transforms. Since the system functions involve two variables, any two system functions connected by an F must have a common variable which should be regarded as a fixed parameter in employing the Fourier transform relationships involving the other two variables. Note that one of these latter two variables is a time variable and the other is a frequency variable. To make the F notation unique we have employed the convention that in transforming from a time to a frequency variable a negative exponential is used in the Fourier integral, while in transforming from a frequency to a time variable a positive exponential is used.

Fig. 5-Relationships between system functions for time-variant linear channels.

Examination of Fig. 5 reveals that the Time-Variant Transfer Function $T(f, t)$ and the Delay-Doppler-Spread Function are double Fourier transforms; i.e.,

$$T(f, t) = \iint U(\xi, \nu)\, e^{-j2\pi f \xi}\, e^{j2\pi \nu t}\, d\xi\, d\nu. \tag{37}$$

Also the dual relationship exists. The Frequency-Dependent Modulation Function $M(t, f)$ and the Doppler-Delay-Spread Function are double Fourier transforms; i.e.,

$$M(t, f) = \iint V(\nu, \xi)\, e^{-j2\pi f \xi}\, e^{j2\pi \nu t}\, d\xi\, d\nu. \tag{38}$$
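The exponential factor in (34) can be checked numerically for a channel made of a few discrete scatterers. The sketch below is only illustrative: the pulse `z`, the scatterer list, and the function names are hypothetical, and the "delay, then Doppler shift" output is assumed to take the form $\sum_k u_k\, z(t - \xi_k)\, e^{j2\pi \nu_k t}$, which is the form consistent with (32). The "Doppler shift, then delay" output follows (33) with $V = U e^{j2\pi\nu\xi}$.

```python
import cmath
import math


def z(t):
    """Hypothetical input complex envelope: Gaussian pulse with a linear phase."""
    return math.exp(-t * t) * cmath.exp(1j * 2 * math.pi * 0.3 * t)


# Hypothetical discrete scatterers: (delay xi_k, Doppler shift nu_k, amplitude u_k),
# i.e. U(xi, nu) = sum_k u_k * delta(xi - xi_k) * delta(nu - nu_k).
scatterers = [(0.0, 0.5, 1.0 + 0.0j), (0.7, -1.2, 0.4 - 0.3j), (1.5, 2.0, -0.2 + 0.6j)]


def output_delay_doppler(t):
    """Assumed U-form output: each element delays the input, then Doppler-shifts it."""
    return sum(u * z(t - xi) * cmath.exp(1j * 2 * math.pi * nu * t)
               for xi, nu, u in scatterers)


def output_doppler_delay(t):
    """V-form output as in (33): each element Doppler-shifts, then delays."""
    total = 0j
    for xi, nu, u in scatterers:
        v = u * cmath.exp(1j * 2 * math.pi * nu * xi)  # (34) solved for V
        total += v * z(t - xi) * cmath.exp(1j * 2 * math.pi * nu * (t - xi))
    return total


# Largest discrepancy between the two representations on a grid of time instants.
max_diff = max(abs(output_delay_doppler(0.1 * n) - output_doppler_delay(0.1 * n))
               for n in range(-30, 31))
print(max_diff)
```

The two outputs agree to rounding error, since the phase $e^{j2\pi\nu\xi}$ absorbed into $V$ cancels the extra factor $e^{-j2\pi\nu\xi}$ produced by applying the Doppler shift before the delay.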
Since $T(f, t)$ and $M(t, f)$ are double Fourier transforms of system functions which differ only by the simple exponential factor $\exp[-j2\pi\nu\xi]$ [see (34)] it might be supposed that they also are related in a similar simple fashion. However, this is not the case. The analytic relationship between $T(f, t)$ and $M(t, f)$ is quickly obtained from (21) and (27) with the aid of the fact that $K_3(t, f)$ and $K_4(f, t)$ are double Fourier transform pairs. This relationship is

$$\begin{aligned}
M(t, f) &= \iint T(f', t')\, e^{j2\pi (f - f')(t - t')}\, df'\, dt' \\
T(f, t) &= \iint M(t', f')\, e^{-j2\pi (f - f')(t - t')}\, df'\, dt'.
\end{aligned} \tag{39}$$

We also note from Fig. 5 that the Input Delay-Spread Function $g(t, \xi)$ and the Output Doppler-Spread Function $G(f, \nu)$ are double Fourier transforms. Also, the dual relationship exists; i.e., the Output Delay-Spread Function $h(t, \xi)$ and the Input Doppler-Spread Function $H(f, \nu)$ are double Fourier transform pairs. Other analytic relationships between system functions are readily obtained by using (11) and (15) and the Fourier transform relationships indicated in Fig. 5.

IV. CHANNEL CORRELATION FUNCTIONS

When the channel is randomly time-variant the system functions discussed in Section III become stochastic processes. An exact statistical characterization of a randomly time-variant linear channel in terms of multidimensional probability density distributions for system functions, while necessary for some theoretical investigations, presupposes more knowledge than is likely to be available in physical situations. A less ambitious but more practical goal involves a statistical characterization in terms of correlation functions for the various system functions since (as will be shown below) these correlation functions allow a determination of the autocorrelation function of the channel output. In this section we will be concerned with defining correlation functions for these system functions and showing their interrelationships. Special attention will be given to simplifications that result for certain classes of channels of practical interest.

In general, a randomly time-variant channel has a mixed deterministic and random behavior. Thus, for example, the Input Delay-Spread Function $g(t, \xi)$ may be separated into the sum of a purely random part and a deterministic part [equal to the ensemble average of $g(t, \xi)$]. This separation implies a representation of the channel as the parallel combination of a deterministic channel and a purely random channel. For simplicity of discussion, we shall only be concerned in this paper with the correlation properties associated with the purely random part of the channel. Thus it should be understood in subsequent discussions of correlation functions that each of the system functions has a zero ensemble average.

A. General Case

We shall confine our discussion of channel correlation functions to the eight system functions shown in Fig. 5, since it is felt that these system functions provide a better picture of the operation of a time-varying linear channel than the kernel system functions. The correlation functions for the system functions in Fig. 5 will be defined as follows (an overbar denotes the ensemble average):

$$\begin{aligned}
\overline{g^*(t, \xi)\, g(s, \eta)} &= R_g(t, s; \xi, \eta) & \overline{H^*(f, \nu)\, H(l, \mu)} &= R_H(f, l; \nu, \mu) \\
\overline{T^*(f, t)\, T(l, s)} &= R_T(f, l; t, s) & \overline{M^*(t, f)\, M(s, l)} &= R_M(t, s; f, l) \\
\overline{G^*(f, \nu)\, G(l, \mu)} &= R_G(f, l; \nu, \mu) & \overline{h^*(t, \xi)\, h(s, \eta)} &= R_h(t, s; \xi, \eta) \\
\overline{U^*(\xi, \nu)\, U(\eta, \mu)} &= R_U(\xi, \eta; \nu, \mu) & \overline{V^*(\nu, \xi)\, V(\mu, \eta)} &= R_V(\nu, \mu; \xi, \eta)
\end{aligned} \tag{40}$$
where correlation functions for dual system functions have been placed in the same row and correlation functions for Fourier-transform-related system functions have been placed in the same column. It is readily appreciated that the relationships between correlation functions in any column are double and quadruple Fourier transform relationships since the corresponding system functions are related by single and double Fourier transforms, respectively. As an illustration, consider the derivation of the relationship between $R_g(t, s; \xi, \eta)$ and $R_T(f, l; t, s)$. The Fourier transform relationship between $g(t, \xi)$ and $T(f, t)$ is shown explicitly in (18). Using this equation we find that

$$T^*(f, t)\, T(l, s) = \iint g^*(t, \xi)\, g(s, \eta)\, e^{j2\pi (\xi f - \eta l)}\, d\xi\, d\eta. \tag{41}$$

Then, taking the ensemble average of both sides of (41) (and assuming the validity of interchanging the order of integration and ensemble averaging), we find that

$$R_T(f, l; t, s) = \iint R_g(t, s; \xi, \eta)\, e^{j2\pi (\xi f - \eta l)}\, d\xi\, d\eta \tag{42}$$

and by inverting the Fourier transform relationship

$$R_g(t, s; \xi, \eta) = \iint R_T(f, l; t, s)\, e^{-j2\pi (\xi f - \eta l)}\, df\, dl. \tag{43}$$
If an identical procedure is followed to determine the other Fourier transform relationships between channel correlation functions, one finds that these relationships may be portrayed as shown in Fig. 6, wherein a double line labeled F indicates a double Fourier transform relationship between the connected correlation functions. The meaning of the dashed line labeled D is similar to the corresponding dashed line in Fig. 5, namely, the correlation functions which occupy mirror image positions with respect to the dashed line are correlation functions of dual system functions.

Fig. 6-Relationships between channel correlation functions.

Since the channel correlation functions involve four variables, any two correlation functions connected by an F must have two common variables [such as $t$, $s$ in (42) and (43)] which should be regarded as fixed parameters in employing the double Fourier transform relationship involving the remaining variables. Note that the four variables of a correlation function are divided into two pairs separated by a semicolon. One of these pairs is involved directly in the Fourier transform relationship while the other pair is fixed. Note also that the double Fourier transform relationship always connects a pair of time (or delay) variables and a pair of frequency (or Doppler-shift) variables [e.g., pairs $\xi$, $\eta$ and $f$, $l$ in (42) and (43)]. In order to make the Fourier transform symbolism in Fig. 6 unique we have employed the convention that in Fourier transforming from a pair of time variables to a pair of frequency variables a positive exponential is to be used to connect the first variables in each pair and a negative exponential to connect the second variables [e.g., $\exp[j2\pi\xi f]$ and $\exp[-j2\pi\eta l]$, respectively, in (42)], while in transforming from a pair of frequency variables to a pair of time variables the opposite signing procedure is to be used [e.g., $\exp[-j2\pi\xi f]$ and $\exp[j2\pi\eta l]$ in (43)].

Examination of Fig. 6 reveals that the pairs of channel correlation functions $(R_g, R_G)$, $(R_U, R_T)$ and their duals $(R_H, R_h)$, $(R_V, R_M)$ are quadruple Fourier transform pairs. The actual fourfold integral relating any of these pairs is readily obtained by performing two successive double Fourier transforms as indicated in Fig. 6.

From (34) it is quickly determined that

$$R_U(\xi, \eta; \nu, \mu) = e^{j2\pi (\nu\xi - \mu\eta)}\, R_V(\nu, \mu; \xi, \eta) \tag{44}$$

is the relationship between the correlation functions of the Delay-Doppler-Spread and Doppler-Delay-Spread system functions. Eq. (39) may be used to determine that the relationship between the correlation functions of the Time-Variant Transfer Function and the Frequency-Dependent Modulation Function is as shown below,

$$\begin{aligned}
R_M(t, s; f, l) &= \iiiint R_T(f', l'; t', s')\, \exp[\,j2\pi (l - l')(s - s') - j2\pi (f - f')(t - t')\,]\, df'\, dl'\, dt'\, ds' \\
R_T(f, l; t, s) &= \iiiint R_M(t', s'; f', l')\, \exp[\,j2\pi (f - f')(t - t') - j2\pi (l - l')(s - s')\,]\, df'\, dl'\, dt'\, ds'.
\end{aligned} \tag{45}$$

From (11) and (15) we find the following relationships between the Delay-Spread and Doppler-Spread Functions:

$$\begin{aligned}
R_h(t, s; \xi, \eta) &= R_g(t + \xi, s + \eta; \xi, \eta) \\
R_G(f, l; \nu, \mu) &= R_H(f + \nu, l + \mu; \nu, \mu).
\end{aligned} \tag{46}$$

Other analytic relationships between channel correlation functions on either side of the dashed line in Fig. 6 are quickly obtained by using (44), (45) and (46) in conjunction with the Fourier transform relationships indicated in Fig. 6.

With the aid of the input-output relationship and the correlation function associated with each system function one may readily determine a corresponding double integral relating the autocorrelation function of the output to the autocorrelation function of the system function. Thus consider the system function $g(t, \xi)$. Using (8) to form the product $w^*(t)\, w(s)$ as follows:

$$w^*(t)\, w(s) = \iint z^*(t - \xi)\, z(s - \eta)\, g^*(t, \xi)\, g(s, \eta)\, d\xi\, d\eta, \tag{47}$$

and then averaging, one finds that

$$R_w(t, s) = \iint z^*(t - \xi)\, z(s - \eta)\, R_g(t, s; \xi, \eta)\, d\xi\, d\eta \tag{48}$$

where we have defined

$$R_w(t, s) = \overline{w^*(t)\, w(s)} \tag{49}$$

as the autocorrelation function of the output time function. When the input is random rather than deterministic, as in (48), the output autocorrelation function becomes

$$R_w(t, s) = \iint R_z(t - \xi, s - \eta)\, R_g(t, s; \xi, \eta)\, d\xi\, d\eta \tag{50}$$

where we have defined

$$R_z(t, s) = \overline{z^*(t)\, z(s)}. \tag{51}$$

The dual system function $H(f, \nu)$ leads to the following expression for the autocorrelation function of the output spectrum

$$R_W(f, l) = \overline{W^*(f)\, W(l)} = \iint Z^*(f - \nu)\, Z(l - \mu)\, R_H(f, l; \nu, \mu)\, d\nu\, d\mu \tag{52}$$

when the input is deterministic and

$$R_W(f, l) = \iint R_Z(f - \nu, l - \mu)\, R_H(f, l; \nu, \mu)\, d\nu\, d\mu \tag{53}$$

when the input is random, where we have defined

$$R_Z(f, l) = \overline{Z^*(f)\, Z(l)} \tag{54}$$

as the autocorrelation function of the input spectrum.12 The reader may readily form the input-output correlation function relationships corresponding to the remaining system functions.
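The step from (47) to (48) can be illustrated with a discrete tapped-delay-line stand-in for $g(t, \xi)$ and a small, equally likely ensemble of channel realizations. This is a sketch under stated assumptions, not part of the original development: the input samples, the tap gains, and the model $g(t, k) = a_k e^{j2\pi\nu_k t}$ are all hypothetical, and the integrals are replaced by sums over integer delays. Sign-flipped copies of each realization are included so that the ensemble average of $g$ is zero, as assumed in this section.

```python
import cmath
import math

z = [1 + 0j, 0.5 - 0.5j, -0.3j, 0.2 + 0.1j]  # hypothetical input samples z(0..3)
K = 3                                         # number of integer delay taps


def z_at(n):
    """Input sample at integer time n (zero outside the stored support)."""
    return z[n] if 0 <= n < len(z) else 0j


# Each realization lists (amplitude a_k, Doppler shift nu_k) per tap (hypothetical values).
base = [
    [(1.0 + 0.2j, 0.05), (0.3 - 0.4j, -0.10), (0.1 + 0.1j, 0.20)],
    [(-0.5 + 0.7j, 0.05), (0.6 + 0.1j, -0.10), (-0.2 - 0.3j, 0.20)],
]
# Add sign-flipped copies so that the ensemble average of g(t, k) is zero.
ensemble = base + [[(-a, nu) for a, nu in taps] for taps in base]


def g(taps, t, k):
    """Discrete stand-in for the Input Delay-Spread Function g(t, xi) at delay k."""
    a, nu = taps[k]
    return a * cmath.exp(1j * 2 * math.pi * nu * t)


def w(taps, t):
    """Channel output for one realization: sum over delays of z(t - k) g(t, k)."""
    return sum(z_at(t - k) * g(taps, t, k) for k in range(K))


def R_g(t, s, i, j):
    """Ensemble average of g*(t, i) g(s, j), as in the first entry of (40)."""
    return sum(g(taps, t, i).conjugate() * g(taps, s, j) for taps in ensemble) / len(ensemble)


def R_w_formula(t, s):
    """Discrete counterpart of (48)."""
    return sum(z_at(t - i).conjugate() * z_at(s - j) * R_g(t, s, i, j)
               for i in range(K) for j in range(K))


def R_w_direct(t, s):
    """Ensemble average of w*(t) w(s), as in (49)."""
    return sum(w(taps, t).conjugate() * w(taps, s) for taps in ensemble) / len(ensemble)


max_err = max(abs(R_w_formula(t, s) - R_w_direct(t, s))
              for t in range(6) for s in range(6))
print(max_err)
```

The agreement is exact up to rounding because (48) is only the ensemble average of (47) taken inside the double sum. The dual relationship (52) could be checked the same way with spectra in place of time functions.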
B. The Wide-Sense Stationary Channel

In many physical channels the fading statistics may be assumed approximately stationary for time intervals sufficiently long to make it meaningful to define a subclass of channels, called Wide-Sense Stationary (WSS) Channels. A WSS channel has the property that the channel correlation functions $R_g(t, s; \xi, \eta)$, $R_h(t, s; \xi, \eta)$, $R_T(f, l; t, s)$ and $R_M(t, s; f, l)$ are invariant under a translation in time; i.e., these correlation functions depend on the variables $t$, $s$ only through the difference $\tau = s - t$. Thus for the WSS channel

$$\begin{aligned}
R_g(t, t + \tau; \xi, \eta) &= R_g(\tau; \xi, \eta) \\
R_h(t, t + \tau; \xi, \eta) &= R_h(\tau; \xi, \eta) \\
R_T(f, l; t, t + \tau) &= R_T(f, l; \tau) \\
R_M(t, t + \tau; f, l) &= R_M(\tau; f, l).
\end{aligned} \tag{55}$$
The restricted nature of the four channel correlation functions in (55) constrains the remaining four channel correlation functions in Fig. 6 to have a singular behavior in the Doppler-shift variables. As an example consider the double Fourier transform relationship between $R_g(t, s; \xi, \eta)$ and $R_U(\xi, \eta; \nu, \mu)$:

$$R_U(\xi, \eta; \nu, \mu) = \iint R_g(t, s; \xi, \eta)\, e^{j2\pi (\nu t - \mu s)}\, dt\, ds. \tag{56}$$

Upon making the change in variable $\tau = s - t$ in (56) and using the first equation in (55), one finds that

$$R_U(\xi, \eta; \nu, \mu) = \int e^{j2\pi t (\nu - \mu)}\, dt \int R_g(\tau; \xi, \eta)\, e^{-j2\pi \mu \tau}\, d\tau. \tag{57}$$

The first integral in (57) is recognized as a unit impulse at $\nu = \mu$, i.e., $\delta(\nu - \mu)$. It follows that $R_U(\xi, \eta; \nu, \mu)$ may be expressed in the form

$$R_U(\xi, \eta; \nu, \mu) = P_U(\xi, \eta; \nu)\, \delta(\nu - \mu) \tag{58}$$

where $P_U(\xi, \eta; \nu)$ is the Fourier transform of $R_g(\tau; \xi, \eta)$ with respect to the variable $\tau$; i.e.,

$$P_U(\xi, \eta; \nu) = \int R_g(\tau; \xi, \eta)\, e^{-j2\pi \nu \tau}\, d\tau. \tag{59}$$
In an analogous fashion it is readily determined that

$$\begin{aligned}
R_G(f, l; \nu, \mu) &= P_G(f, l; \nu)\, \delta(\nu - \mu) \\
R_V(\nu, \mu; \xi, \eta) &= P_V(\nu; \xi, \eta)\, \delta(\nu - \mu) \\
R_H(f, l; \nu, \mu) &= P_H(f, l; \nu)\, \delta(\nu - \mu)
\end{aligned} \tag{60}$$

where $P_G(f, l; \nu)$, $P_V(\nu; \xi, \eta)$, and $P_H(f, l; \nu)$ are Fourier transforms of $R_T(f, l; \tau)$, $R_h(\tau; \xi, \eta)$, and $R_M(\tau; f, l)$, respectively, with respect to the delay variable $\tau$.

12 See Bello, op. cit. 3, for a discussion of the spectrum of a random process.
The singular behavior of the channel correlation functions in (58) and (60) has interesting implications with regard to the behavior of the associated circuit models. Thus the forms of $R_H$ and $R_G$ as shown in (60) imply that in Figs. 3 and 4 the transfer functions of the random filters associated with different Doppler shifts are uncorrelated. Similarly the forms of $R_V$ and $R_U$ in (60) and (58) imply that in the associated channel models consisting of a number of differential "scatterers" producing delay and Doppler shifts, the complex scattering amplitudes of two different elements are uncorrelated if these elements cause different Doppler shifts.

When the system functions are normally distributed stochastic processes, complete lack of correlation between two processes implies statistical independence. Then wide-sense stationarity implies strict-sense stationarity, and in the circuit models of Figs. 3 and 4 the transfer functions of random filters associated with different Doppler shifts are statistically independent. Similarly in the models consisting of a number of differential "scatterers" producing delay and Doppler shifts, the complex scattering amplitudes of two different elements are statistically independent if these elements cause different Doppler shifts.

The singular behavior of the correlation functions $R_H$, $R_G$, $R_V$ and $R_U$ might have been expected a priori from the observation that the corresponding system functions are interpretable as (complex) amplitude spectra of random processes and from the fact that the cross-correlation function between the amplitude spectra of two wide-sense stationary noises is an impulse located at zero frequency shift with a complex area equal to the cross-power spectral density between the original processes. Thus, when considered as a function of the Doppler-shift variable $\nu$, the functions $P_U(\xi, \eta; \nu)$, $P_G(f, l; \nu)$, $P_V(\nu; \xi, \eta)$ and $P_H(f, l; \nu)$ may be interpreted as cross-power spectral densities between the pairs of time functions $[g(t, \xi), g(t, \eta)]$, $[T(f, t), T(l, t)]$, $[h(t, \xi), h(t, \eta)]$ and $[M(t, f), M(t, l)]$, respectively. In the particular case that $\xi = \eta$, $P_U(\xi, \xi; \nu)$ and $P_V(\nu; \xi, \xi)$ may be interpreted as power spectral densities of the functions $g(t, \xi)$ and $h(t, \xi)$, respectively; while for $f = l$, $P_G(f, f; \nu)$ and $P_H(f, f; \nu)$ may be interpreted as power spectral densities of the functions $T(f, t)$ and $M(t, f)$, respectively. In view of the above it is clear that the system functions $U(\xi, \nu)$, $G(f, \nu)$, $V(\nu, \xi)$ and $H(f, \nu)$ will behave like nonstationary white noises in the Doppler-shift variable when the channel is WSS.
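The passage from (56) to (58) has an exact finite analogue that may help fix ideas. In the sketch below, a periodic time axis of `N` samples stands in for the continuous one, so the impulse $\delta(\nu - \mu)$ becomes a Kronecker delta weighted by `N`. The lag correlation `r` is a hypothetical choice, and the delay variables $\xi$, $\eta$ are held fixed and suppressed.

```python
import cmath
import math

N = 8  # samples on a periodic, discrete stand-in for the time axis

# Hypothetical stationary correlation: R(t, s) depends only on the lag (s - t) mod N.
r = [0.9 ** min(k, N - k) for k in range(N)]


def R_time(t, s):
    """Correlation in the time variables; stationary by construction, as in (55)."""
    return r[(s - t) % N]


def R_doppler(nu, mu):
    """Discrete analogue of (56): kernel exp(j 2 pi (nu t - mu s) / N)."""
    return sum(R_time(t, s) * cmath.exp(1j * 2 * math.pi * (nu * t - mu * s) / N)
               for t in range(N) for s in range(N))


def P(nu):
    """Discrete analogue of (59): transform of the lag correlation."""
    return sum(r[k] * cmath.exp(-1j * 2 * math.pi * nu * k / N) for k in range(N))


# All correlation between different Doppler bins should vanish ...
off_diagonal = max(abs(R_doppler(nu, mu))
                   for nu in range(N) for mu in range(N) if nu != mu)
# ... and the diagonal should equal N * P(nu), the discrete form of (58).
diagonal_error = max(abs(R_doppler(nu, nu) - N * P(nu)) for nu in range(N))
print(off_diagonal, diagonal_error)
```

Both printed quantities are at rounding-error level. Replacing `R_time` by a correlation that depends on `t` and `s` separately would populate the off-diagonal terms, which is the nonstationary case.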
In Fig. 7 we have summarized the relationships between the channel correlation functions, using only the corresponding density function when the correlation function has an impulsive behavior. Note that the Fourier transform notations of Figs. 5 and 6 have been used. Let us now consider some analytical relationships between functions on the opposite side of the dashed line in Fig. 7. From (44) and (60) we find that

$$P_U(\xi, \eta; \nu) = e^{j2\pi \nu (\xi - \eta)}\, P_V(\nu; \xi, \eta) \tag{61}$$

while from (46), (55), (58) and (60) we find that

$$\begin{aligned}
R_h(\tau; \xi, \eta) &= R_g(\tau + \eta - \xi; \xi, \eta) \\
P_G(f, l; \nu) &= P_H(f + \nu, l + \nu; \nu).
\end{aligned} \tag{62}$$

Fig. 7-Relationships between channel correlation functions for WSS channel.

The relationship between the correlation functions of the Time-Variant Transfer Function and the Frequency-Dependent Modulation Function is readily determined from (45) and (55) to be

$$R_M(\tau; f, l) = \iint R_T(f', f' + l - f; \tau')\, e^{j2\pi (f - f')(\tau - \tau')}\, df'\, d\tau'. \tag{63}$$

Other analytic relationships between the channel correlation functions on either side of the dashed line in Fig. 7 are quickly obtained by using (61), (62) and (63) in conjunction with the Fourier transform relationships indicated in Fig. 7.

C. The Uncorrelated Scattering Channel

For several physical channels (e.g., troposcatter, chaff, moon reflection) the channel may be modeled approximately as a continuum of uncorrelated scatterers. The mathematical counterpart of this statement is embodied in the following forms for the autocorrelation function of the Input and Output Delay-Spread Functions:

$$\begin{aligned}
R_g(t, s; \xi, \eta) &= P_g(t, s; \xi)\, \delta(\eta - \xi) \\
R_h(t, s; \xi, \eta) &= P_h(t, s; \xi)\, \delta(\eta - \xi).
\end{aligned} \tag{64}$$

Because of the intimate relationship between the Input and Output Delay-Spread Functions, one of the equations in (64) implies the other. Moreover, these equations imply that the autocorrelation functions of the Doppler-Delay-Spread and Delay-Doppler-Spread Functions must have the form

$$\begin{aligned}
R_U(\xi, \eta; \nu, \mu) &= P_U(\xi; \nu, \mu)\, \delta(\eta - \xi) \\
R_V(\nu, \mu; \xi, \eta) &= P_V(\nu, \mu; \xi)\, \delta(\eta - \xi).
\end{aligned} \tag{65}$$

The singular behavior of the channel correlation functions in (64) and (65) has implications with regard to the behavior of the associated circuit models. Thus the forms of $R_g$ and $R_h$ as shown in (64) imply that in Figs. 1 and 2 the complex gain functions associated with different path delays are uncorrelated. Similarly the forms of $R_V$ and $R_U$ in (65) imply that in the associated channel models consisting of a number of differential "scatterers" producing delay and Doppler shifts, the complex scattering amplitudes of two different scatterers are uncorrelated if these elements cause different delays. A channel whose system functions have correlation functions of the form shown in (64) and (65) will be called an Uncorrelated Scattering (US) channel. When the system functions are normally distributed stochastic processes, the uncorrelatedness properties mentioned above for complex gain functions and scattering amplitudes become independence properties.

The forms of the correlation functions for the remaining four system functions, which are readily determined from the Fourier transform relationships indicated in Fig. 6, are given by

$$\begin{aligned}
R_T(f, f + \Omega; t, s) &= R_T(\Omega; t, s) \\
R_M(t, s; f, f + \Omega) &= R_M(t, s; \Omega) \\
R_G(f, f + \Omega; \nu, \mu) &= R_G(\Omega; \nu, \mu) \\
R_H(f, f + \Omega; \nu, \mu) &= R_H(\Omega; \nu, \mu).
\end{aligned} \tag{66}$$

A comparison of the channel correlation functions for the WSS and US channels reveals an interesting fact: the correlation function of a particular system function of the WSS channel and the correlation function of the dual system function of the US channel have identical analytical forms as a function of dual variables. For this reason one may consider the class of WSS channels to be the dual of the class of US channels.13

13 Using the definitions developed in Bello, op. cit. 3, one may state that the wide-sense dual of a WSS channel is a US channel and vice versa.

As a consequence of this duality we note that the US channel may be regarded as a WSS channel in the frequency variable since, from (66), the channel correlation functions depend upon the frequency variables $f$, $l$ only through the difference frequency $\Omega = l - f$. Similarly the WSS channel may be regarded as a form of US channel in the Doppler-shift variable.

While the Input and Output Doppler-Spread Functions have the character of nonstationary white noise as a function of the Doppler-shift variable in the case of the WSS channel, the dual system functions, i.e., the Input and Output Delay-Spread Functions, the Delay-Doppler-Spread and Doppler-Delay-Spread Functions, respectively, have the character of nonstationary white noise as a function of the dual variable, i.e., the delay variable, in the case of the US channel.

By analogy with the dual functions in the WSS channel, the functions $P_g(t, s; \xi)$, $P_h(t, s; \xi)$, $P_U(\xi; \nu, \mu)$ and $P_V(\nu, \mu; \xi)$, when considered as a function of $\xi$, may be regarded as cross-power spectral densities while $P_g(t, t; \xi)$, $P_h(t, t; \xi)$, $P_U(\xi; \nu, \nu)$, $P_V(\nu, \nu; \xi)$ may be regarded as power spectral densities as a function of the delay variable.

In Fig. 8 we have summarized the relationships between the channel correlation functions for the US channel using only the corresponding cross-power density function when the correlation function has an impulsive behavior. We will now obtain some analytical relationships between functions on the opposite side of the dashed line in Fig. 8. First we have the relationship dual to (61) which may be obtained by using (65) in (44),
$$P_U(\xi; \nu, \mu) = e^{j2\pi \xi (\nu - \mu)}\, P_V(\nu, \mu; \xi). \tag{67}$$

Then, using (64) and the last two equations of (66) in (46) we obtain the dual to (62) as

$$\begin{aligned}
P_h(t, s; \xi) &= P_g(t + \xi, s + \xi; \xi) \\
R_G(\Omega; \nu, \mu) &= R_H(\Omega + \mu - \nu; \nu, \mu).
\end{aligned} \tag{68}$$

The relationships between the Time-Variant Transfer Function and the Frequency-Dependent Modulation Function dual to (63) are obtained by using the first two equations in (66) in (45) and carrying out two integrations of the appropriate quadruple integrals in (45),

$$\begin{aligned}
R_T(\Omega; t, s) &= \iint R_M(t', t' + s - t; \Omega')\, e^{-j2\pi (t - t')(\Omega - \Omega')}\, dt'\, d\Omega' \\
R_M(t, s; \Omega) &= \iint R_T(\Omega'; t', t' + s - t)\, e^{j2\pi (t - t')(\Omega - \Omega')}\, dt'\, d\Omega'.
\end{aligned} \tag{69}$$

As we have mentioned in a similar vein for the other classes of channels, further analytical relationships may be obtained between system functions on the opposite side of the dotted line in Fig. 8 by using (67), (68) and (69) and the Fourier transform relationships indicated in Fig. 8.
Fig. 8-Relationships between channel correlation functions for US channel.

D. The Wide-Sense Stationary Uncorrelated Scattering Channel

The simplest type of randomly time-variant linear channel to describe in terms of channel correlation functions, and one which, fortunately, is of practical interest is the WSSUS channel. As might be suspected from its name, the WSSUS channel is both a WSS and a US channel. Thus, the channel correlation functions of the WSSUS channel must have forms characteristic of both the WSS channel [(55), (58) and (60)] and the US channel [(64), (65) and (66)]. An examination of the correlation functions of the WSS and US channels reveals that for the WSSUS channel, the correlation functions of the Delay-Spread Functions must have the form

$$\begin{aligned}
R_g(t, t + \tau; \xi, \eta) &= P_g(\tau, \xi)\, \delta(\eta - \xi) \\
R_h(t, t + \tau; \xi, \eta) &= P_h(\tau, \xi)\, \delta(\eta - \xi)
\end{aligned} \tag{70}$$

while the correlation functions of the Doppler-Spread Functions must have the form

$$\begin{aligned}
R_G(f, f + \Omega; \nu, \mu) &= P_G(\Omega, \nu)\, \delta(\mu - \nu) \\
R_H(f, f + \Omega; \nu, \mu) &= P_H(\Omega, \nu)\, \delta(\mu - \nu).
\end{aligned} \tag{71}$$

The equations in (70) show that for the WSSUS channel, the system functions $g(t, \xi)$ and $h(t, \xi)$ have the character of nonstationary white noise in the delay variable and wide-sense stationary noise in the time variable. In terms of the channel models of Figs. 1 and 2, the WSSUS channel has a representation as a continuum of uncorrelated randomly scintillating scatterers with wide-sense stationary statistics. The equations in (71) show that for the WSSUS channel the system functions $G(f, \nu)$ and $H(f, \nu)$ have the character of nonstationary white noise in the Doppler-shift variable and wide-sense stationary noise in the frequency variable. In terms of the channel models of Figs. 3 and 4, the WSSUS channel has a representation as a continuum of uncorrelated Doppler-shifting filtering (or filtering-Doppler shifting) elements with each filter having a transfer function with wide-sense stationary statistics in the frequency variable.

For the WSSUS channel the correlation functions of the Delay-Doppler-Spread and Doppler-Delay-Spread Functions simplify to

$$\begin{aligned}
R_U(\xi, \eta; \nu, \mu) &= P_U(\xi, \nu)\, \delta(\mu - \nu)\, \delta(\eta - \xi) \\
R_V(\nu, \mu; \xi, \eta) &= P_V(\nu, \xi)\, \delta(\mu - \nu)\, \delta(\eta - \xi).
\end{aligned} \tag{72}$$

Eq. (72) shows that for the WSSUS channel the system functions $U(\xi, \nu)$ and $V(\nu, \xi)$ have the character of nonstationary white noise in both the delay and Doppler-shift variables, i.e., a form of two-dimensional nonstationary white noise. It follows that in terms of the corresponding channel models, the WSSUS channel may be represented as a collection of nonscintillating uncorrelated scatterers which cause both delays and Doppler shifts.

Finally, in the case of the WSSUS channel, the correlation functions for the Time-Variant Transfer Function and the Frequency-Dependent Modulation Function take the simple forms

$$\begin{aligned}
R_M(t, t + \tau; f, f + \Omega) &= R_M(\tau, \Omega) \\
R_T(f, f + \Omega; t, t + \tau) &= R_T(\Omega, \tau);
\end{aligned} \tag{73}$$
i.e., the system functions $T(f, t)$ and $M(t, f)$ are wide-sense stationary processes in both the time and frequency variables.

From previous discussions in Sections IV B and IV C, we know that when considered as a function of the Doppler-shift variable $\nu$ the functions $P_G(\Omega, \nu)$ and $P_H(\Omega, \nu)$ may be interpreted as the cross-power spectral densities of the pairs of time functions $T(f, t)$, $T(f + \Omega, t)$ and $M(t, f)$, $M(t, f + \Omega)$, respectively. Similarly, when considered as a function of the delay variable $\xi$, the functions $P_g(\tau, \xi)$ and $P_h(\tau, \xi)$ may be interpreted as cross-power spectral densities of the frequency functions $T(f, t)$, $T(f, t + \tau)$ and $M(t, f)$, $M(t + \tau, f)$, respectively. The functions $P_U(\xi, \nu)$ and $P_V(\nu, \xi)$ may be interpreted as a sort of two-dimensional power density spectrum as a function of delay and Doppler shift corresponding to the combined time and frequency functions $T(f, t)$ and $M(t, f)$, respectively.

In the case of the WSSUS channel the relationships between correlation functions on opposite sides of the dashed line in Fig. 6 become trivial. Thus, use of (72) in (44) shows that

$$P_U(\xi, \nu) = P_V(\nu, \xi) \equiv S(\xi, \nu); \tag{74}$$

i.e., that the two-dimensional power spectral densities in delay and Doppler shift associated with the Doppler-Delay-Spread and Delay-Doppler-Spread functions are identical. We have used the function $S(\xi, \nu)$ to denote this common function which is identical to the Target Scattering Function $\sigma(\xi, \nu)$ defined by Price and Green in their work on radar astronomy. We shall call $S(\xi, \nu)$ the Scattering Function since it has more general applications than to radar problems.

If (71) is used in the last equation of (46) one finds immediately that

$$P_G(\Omega, \nu) = P_H(\Omega, \nu) \equiv P(\Omega, \nu). \tag{75}$$

Thus the Doppler cross-power spectral densities associated with the channel models of Figs. 3 and 4 become equal in the case of the WSSUS channel. We shall call this common function the Doppler Cross-Power Spectral Density and denote it by the function $P(\Omega, \nu)$. In the particular case that $\Omega = 0$, the cross-power spectral densities become simply power spectral densities. Thus we define

$$P(0, \nu) \equiv P(\nu) \tag{76}$$

where $P(\nu)$ is called the Doppler Power Density Spectrum. (This function is called the Echo Power Spectrum by Green.6)

If (70) is used in the first equation of (46) one finds that

$$P_g(\tau, \xi) = P_h(\tau, \xi) \equiv Q(\tau, \xi). \tag{77}$$

Thus the delay cross-power spectral densities associated with the channel models of Figs. 1 and 2 become equal in the case of the WSSUS channel. We shall call this common function the Delay Cross-Power Spectral Density and denote it by the function $Q(\tau, \xi)$. In the particular case that $\tau = 0$, the cross-power spectral densities become simple power spectral densities. Thus we define

$$Q(0, \xi) \equiv Q(\xi) \tag{78}$$

where $Q(\xi)$ is called the Delay Power Density Spectrum. (This function has been called the Power Impulse Response by Green6 and the Delay Spectrum by Hagfors.4)

Finally, if (73) is used in (45) one finds that the quadruple integrals in (45) vanish leaving the interesting result

$$R_M(\tau, \Omega) = R_T(\Omega, \tau) \equiv R(\Omega, \tau). \tag{79}$$

Thus in the case of the WSSUS channel, the correlation functions of the Time-Variant Transfer Function and the Frequency-Dependent Modulation Function become identical, i.e.,

$$\overline{T^*(f, t)\, T(f + \Omega, t + \tau)} = \overline{M^*(t, f)\, M(t + \tau, f + \Omega)} = R(\Omega, \tau). \tag{80}$$

We shall call this common function the Time-Frequency Correlation Function and denote it by $R(\Omega, \tau)$. (This function has been called the Spaced-Time Spaced-Frequency Correlation Function by Green.6) Two correlation functions of practical interest derivable from $R(\Omega, \tau)$ are the Frequency Correlation Function $q(\Omega)$ (called the Spaced-Frequency Correlation Function by Green6) given by

$$\overline{T^*(f, t)\, T(f + \Omega, t)} = \overline{M^*(t, f)\, M(t, f + \Omega)} = R(\Omega, 0) \equiv q(\Omega) \tag{81}$$

and the Time Correlation Function $p(\tau)$ (called the Echo Correlation Function by Green6) given by

$$\overline{T^*(f, t)\, T(f, t + \tau)} = \overline{M^*(t, f)\, M(t + \tau, f)} = R(0, \tau) \equiv p(\tau). \tag{82}$$

The relationships between channel correlation functions are shown in Fig. 9. Note that $Q(\tau, \xi)$ and $P(\Omega, \nu)$ are double Fourier transform pairs as are $S(\xi, \nu)$ and $R(\Omega, \tau)$. The double Fourier transform relationship between the Scattering Function and the Time-Frequency Correlation Function appears to have been first pointed out by Hagfors.4 It is readily determined from the Fourier transform relationships in Fig. 9 that $q(\Omega)$, $Q(\xi)$ and $p(\tau)$, $P(\nu)$ are single Fourier transform pairs.

Fig. 9-Relationships between channel correlation functions for WSSUS channel.
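The double Fourier transform pair formed by $S(\xi, \nu)$ and $R(\Omega, \tau)$ is easy to evaluate when the Scattering Function is concentrated on a few discrete scatterers. The sketch below assumes $S(\xi, \nu) = \sum_k P_k\, \delta(\xi - \xi_k)\, \delta(\nu - \nu_k)$ with hypothetical delays, Doppler shifts and powers, and uses the sign convention stated for Fig. 5 (negative exponential from delay to frequency, positive exponential from Doppler shift to time), which gives $R(\Omega, \tau) = \sum_k P_k\, e^{-j2\pi\Omega\xi_k}\, e^{j2\pi\nu_k\tau}$. The function names are illustrative only.

```python
import cmath
import math

# Hypothetical two-path Scattering Function:
# (delay xi_k in seconds, Doppler shift nu_k in hertz, power P_k).
scattering = [(0.0, 10.0, 0.5), (1e-3, -10.0, 0.5)]


def R(Omega, tau):
    """Time-Frequency Correlation Function for discrete scatterers (assumed form)."""
    return sum(P_k
               * cmath.exp(-1j * 2 * math.pi * Omega * xi)
               * cmath.exp(1j * 2 * math.pi * nu * tau)
               for xi, nu, P_k in scattering)


def q(Omega):
    """Frequency Correlation Function, q(Omega) = R(Omega, 0), as in (81)."""
    return R(Omega, 0.0)


def p(tau):
    """Time Correlation Function, p(tau) = R(0, tau), as in (82)."""
    return R(0.0, tau)


total_power = R(0.0, 0.0)

# Two equal-power paths 1 ms apart: |q(Omega)| = |cos(pi * Omega * 1e-3)|,
# so tones spaced 500 Hz apart fade without correlation.
q_at_500_hz = abs(q(500.0))

# Equal powers at +10 Hz and -10 Hz: p(tau) = cos(2 pi * 10 * tau),
# so the time correlation first vanishes at tau = 25 ms.
p_at_25_ms = abs(p(0.025))

print(abs(total_power), q_at_500_hz, p_at_25_ms)
```

With these illustrative numbers the frequency correlation depends only on the delay profile (the discrete counterpart of $Q(\xi)$) and the time correlation only on the Doppler profile (the counterpart of $P(\nu)$), in line with the single Fourier transform pairs noted for Fig. 9.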
V. RADIO CHANNEL CHARACTERIZATION
Virtually all radio transmission media may be characterized as linear in regard to their influence upon communication signals. Thus, from a phenomenological point of view, radio channels may be regarded as special cases of random time-variant linear filters. In the case of the transmission of digital signals over radio channels it appears that certain simplifications may be effected in the general characterization of randomly time-variant channels developed in the previous sections of this report. These simplifications arise when the time and frequency selective behavior of the channel may be regarded as widesense stationary for time and frequency intervals much greater than the durations and bandwidths, respectively, of the signaling waveforms of interest. Such a situation arises in practice when the channel contains very slow fluctuations superimposed upon more rapid fluctuations, the latter of which exhibit the desired statistical stationarity properties. Most radio channels do, in fact, appear to exhibit such" quasi-stationary" behavior. Moreover, the more rapid fluctuations often appear to be characterizable in terms of appropriately defined Gaussian statistics. Since a Gaussian process can be completely described statistically if its correlation function is known, it follows that a fairly complete statistical description of many quasi-stationary radio channels should be achievable by measuring the channel correlation functions for time and frequency intervals small compared to the fluctuation intervals of the slow channel variations, and then measuring the statistical behavior of these quasi-stationary channel correlation functions as caused by the slow channel variations. 
In this way one may compute quasi-stationary error probabilities for digital transmission which would accurately reflect the short-time error rate behavior of the channel.¹⁴⁻¹⁶ The long-time error rate behavior of the channel could then be predicted by averaging the short-time error rate behavior over the long-time fading statistics of the channel. To sum up, our thesis is that a useful way to perform measurements on radio channels is to determine the long-time statistics of short-time channel correlation functions. The resulting data should be sufficient to provide a fairly complete statistical description of some radio channels.

14 P. A. Bello and B. D. Nelin, "The influence of fading spectrum on the binary error probabilities of incoherent and differentially coherent matched filter receivers," IRE TRANS. ON COMMUNICATIONS SYSTEMS, vol. CS-10, pp. 160-168; June, 1962.
15 P. A. Bello and B. D. Nelin, "The effect of frequency selective fading on the binary error probabilities of incoherent and differentially coherent matched filter receivers," IEEE TRANS. ON COMMUNICATION SYSTEMS, vol. CS-11, pp. 170-186; June, 1963.
16 P. A. Bello and B. D. Nelin, "The Effect of Combined Time and Frequency Selective Fading on the Binary Error Probabilities of Incoherent Matched Filter Receivers," ADCOM, Inc., Cambridge, Mass., Res. Rept. No. 7, March, 1963.

To make these ideas more precise we shall now present a mathematical exposition of the above ideas. The time and frequency selective behavior of a random time-variant linear channel may be described with the aid of several of the system functions described in Section III. For purposes of discussion it is sufficient to start with an examination of T(f, t), the Time-Variant Transfer Function of the channel. It will be recalled that T(f, t) is the complex envelope of the response of the channel to an excitation cos 2π(f_c + f)t, where f_c is the "center" or "carrier" frequency at which the channel is being excited. Thus, the magnitude of T(f, t) is the envelope of the channel response and the angle of T(f, t) is the phase of the channel response measured with respect to the phase 2π(f_c + f)t. For a general input signal with complex envelope z(t), the channel output complex envelope w(t) is given by (19). Although the input time function z(t) may be deterministic (nonrandom), the output time function w(t) will be a random process since T(f, t) is a random process. It is possible that for some radio channels T(f, t) will contain a deterministic component so that w(t) will also contain a deterministic component. However, for the purpose of the present discussion it is sufficient to confine our attention to the purely random part of T(f, t) and w(t). Thus, to avoid introducing unnecessary notation we shall assume in this section that w(t) and T(f, t) are purely random.

The time and frequency selective behavior of the channel is evidenced by the way T(f, t) changes with changes in f and t. As far as conventional usage is concerned the concept of "fading" is associated only with the fact that T(f, t) varies with the time variable t. However, it appears desirable to extend this definition to include variation of T(f, t) with f, and thus talk of "frequency fading," the dual of conventional "time fading." From a statistical point of view the simplest way to describe the sensitivity of T(f, t) to changes in f and t is to form the correlation function
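$$R_{f,t}(\tau, \Omega) = \overline{T^*\!\left(f - \tfrac{\Omega}{2},\, t - \tfrac{\tau}{2}\right) T\!\left(f + \tfrac{\Omega}{2},\, t + \tfrac{\tau}{2}\right)}.$$
In terms of the notation in Section IV,
$$R_{f,t}(\tau, \Omega) = R_T\!\left(f - \tfrac{\Omega}{2},\, f + \tfrac{\Omega}{2};\; t - \tfrac{\tau}{2},\, t + \tfrac{\tau}{2}\right).$$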
When f and t are fixed, say at f = f', t = t', R_{f',t'}(τ, Ω) describes the way in which the Time-Variant Transfer Function becomes decorrelated for a frequency interval Ω and a time interval τ centered on the "local" time-frequency coordinates f', t'. In the case of the WSSUS channel R_{f,t}(τ, Ω) becomes independent of f, t; i.e.,
$$R_{f,t}(\tau, \Omega) = R(\tau, \Omega). \tag{85}$$
From an analytical point of view the WSSUS channel is the simplest nondegenerate channel that exhibits both time and frequency selective behavior. As discussed in Section IV, such a channel may be modeled as a continuum of uncorrelated scatterers such that each infinitesimal scatterer providing Doppler shifts in the range ν, ν + dν and delays in the range ξ, ξ + dξ has a scattering cross section of S(ξ, ν) dξ dν, where
(86)
is the Fourier transform of R(τ, Ω). We will now demonstrate that the simplicity of the WSSUS channel can be transferred to a practically interesting class of channels which we shall call Quasi-WSSUS (or QWSSUS). This class contains two subclasses which are (time-frequency) duals. For the purposes of the present discussion we need only introduce that subclass which is based upon the properties of R_{f,t}(τ, Ω). The dual subclass is based upon the Frequency-Dependent Modulation Function and may be readily constructed by the reader. To define the QWSSUS channel of interest here one must assume that the typical input signaling waveform has a constraint on bandwidth and that the resulting output waveform has a constraint on time duration. This bandwidth and time constraint can be centered anywhere, but for simplicity of discussion (and with no loss in generality) it is convenient to assume that the input bandwidth constraint is centered at f = 0 (i.e., at the carrier frequency) and the output time constraint is centered at t = 0. Any other centering can be handled by a redefinition of carrier frequency and time origin. The QWSSUS channel is defined as that subclass for which certain gross channel parameters have specified inequality relations with respect to the input-bandwidth and output-time constraints defined above. The gross channel parameters of concern here are measures of the maximum rate at which R_{f,t}(τ, Ω) varies in the f and t directions. Let the maximum rate of fluctuation of R_{f,t}(τ, Ω) in the f and t directions be denoted by γ_max sec and θ_max cps, respectively. Let W, T denote the bandwidth and time duration of the input signal. Let Δ denote the multipath spread of the channel. Then the QWSSUS channel under discussion is a channel for which the following inequalities hold:
$$W \ll \frac{1}{\gamma_{\max}} \tag{87}$$
$$T + \Delta \ll \frac{1}{\theta_{\max}} \tag{88}$$
or, in other words, one for which R_{f,t}(τ, Ω) changes negligibly over "f" intervals equal to the input signal bandwidth (W) and over "t" intervals equal to the output signal time duration (T + Δ).

It will now be demonstrated that if inequalities (87) and (88) are satisfied, the actual channel may be replaced by a hypothetical WSSUS channel at least as far as the determination of the correlation function of the channel output is concerned. However, when the channel has Gaussian statistics, the actual channel may be replaced by a hypothetical WSSUS channel as far as the determination of any output statistics is concerned, since then the output will be a Gaussian process and thus completely determined statistically from knowledge of its correlation function.

From (19) we determine that the output correlation function is given by
$$\overline{w^*(t)\, w(s)} = \iint Z^*(f)\, Z(l)\, R_T(f, l; t, s)\, e^{-j2\pi(ft - ls)}\, df\, dl. \tag{89}$$
Upon making the transformations
$$f \to f - \frac{\Omega}{2}, \qquad l \to f + \frac{\Omega}{2}, \qquad t \to t - \frac{\tau}{2}, \qquad s \to t + \frac{\tau}{2}$$
in (89) one finds that
$$\overline{w^*\!\left(t - \tfrac{\tau}{2}\right) w\!\left(t + \tfrac{\tau}{2}\right)} = \iint Z^*\!\left(f - \tfrac{\Omega}{2}\right) Z\!\left(f + \tfrac{\Omega}{2}\right) R_{f,t}(\tau, \Omega)\, e^{j2\pi f\tau}\, e^{j2\pi\Omega t}\, df\, d\Omega. \tag{90}$$
If we consider the integration with respect to f first in (90) we note that the integrand is nonzero over an interval of f values of maximum width W centered on f = 0, because by hypothesis Z(f) is zero for values of f outside this interval and thus Z*(f - Ω/2)Z(f + Ω/2) must also be zero outside this interval. According to inequality (87), however, R_{f,t}(τ, Ω) will vary negligibly for values of f in this interval and thus for values of f for which the integrand in (90) is nonzero. It follows that insignificant error will result in (90) if we use R_{0,t}(τ, Ω) in place of R_{f,t}(τ, Ω). Furthermore we note that, since w(t) is constrained by hypothesis to be nonzero only over an interval of t values of width T + Δ centered on t = 0, then the left-hand side of (90) must perforce exhibit the same property as a function of t. Since the double integral in (90) must vanish for values of t outside an interval of width T + Δ centered on t = 0, and since by inequality (88) R_{f,t}(τ, Ω) in the integrand can vary negligibly in this interval, it follows that little error can result by replacing R_{f,t}(τ, Ω) by R_{f,0}(τ, Ω) in the integrand. It then follows that if both inequalities (87) and (88) are valid, we have the close approximation
$$\overline{w^*\!\left(t - \tfrac{\tau}{2}\right) w\!\left(t + \tfrac{\tau}{2}\right)} \approx \int R_{0,0}(\tau, \Omega)\, \chi(\Omega, \tau)\, e^{j2\pi\Omega t}\, d\Omega \tag{91}$$
where
$$\chi(\Omega, \tau) = \int Z^*\!\left(f - \tfrac{\Omega}{2}\right) Z\!\left(f + \tfrac{\Omega}{2}\right) e^{j2\pi f\tau}\, df \tag{92}$$
is the ambiguity function of the transmitted signal. The expression (91) for the output signal correlation function is identical in form to that for a WSSUS channel in which
$$R(\tau, \Omega) = R_{0,0}(\tau, \Omega). \tag{93}$$
It will be recalled that initially we had assumed the spectrum of the input and the output time function were centered at f = 0 and t = 0 respectively. If we had assumed instead that they were centered at f = f' and t = t', the satisfaction of the inequalities (87) and (88) would still have led us to conclude that the output correlation function can be determined by replacing the actual channel by a hypothetical WSSUS channel. However, instead of (93) we must use
$$R(\tau, \Omega) = R_{f',t'}(\tau, \Omega). \tag{94}$$
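The QWSSUS conditions (87) and (88) can be checked numerically once the gross channel parameters are known. The following is a minimal sketch, not part of the original analysis: the function name, the `margin` parameter (which turns "much less than" into a concrete factor), and the numbers in the example are all assumptions made for illustration.

```python
def is_quasi_wssus(W, T, delta, gamma_max, theta_max, margin=10.0):
    """Check inequalities (87) and (88), reading '<<' as 'smaller by `margin`'.

    W          input signal bandwidth (cps)
    T          input signal time duration (sec)
    delta      multipath spread of the channel (sec)
    gamma_max  maximum fluctuation rate of R_{f,t} in the f direction (sec)
    theta_max  maximum fluctuation rate of R_{f,t} in the t direction (cps)
    margin     assumed safety factor standing in for '<<' (hypothetical choice)
    """
    ok_87 = W * gamma_max * margin <= 1.0              # W << 1 / gamma_max
    ok_88 = (T + delta) * theta_max * margin <= 1.0    # T + delta << 1 / theta_max
    return ok_87 and ok_88


# Hypothetical numbers, chosen only to exercise the check.
print(is_quasi_wssus(W=3e3, T=1e-3, delta=2e-3, gamma_max=1e-6, theta_max=1e-2))
```

The choice of `margin` is a judgment call; the text only requires that R_{f,t}(τ, Ω) vary negligibly over the signal bandwidth and over the output duration T + Δ.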
We are now in a position to consider the application of the preceding analytical results to the characterization of radio channels. As discussed at the beginning of this section, many radio channels seem to exhibit a combination of fast fading of a nearly Gaussian nature¹⁷ and very slow fading of a generally non-Gaussian nature. We shall assume this to be the case for radio channels using the extended concept of fading described previously. Thus we assume that T(f, t) "fades" along the frequency axis with a "fast" frequency fading superimposed upon a "slow" frequency fading. Let us momentarily consider that the very slow variations are deterministic by selecting a particular member function of the stochastic process defining the slow fading, and let us assume that the fast fading is Gaussian. Then all statistical information concerning the output of the channel may be obtained once the correlation function R_{f,t}(τ, Ω) is known, since then (90) may be used to determine the correlation function of the output Gaussian process w(t). We may ascribe the variations of R_{f,t}(τ, Ω) with f, t as due to the (temporarily deterministic) slow (time and frequency) fading of the channel. In practical channels it appears sufficient to consider
$$R_{f,t}(0, 0) = \overline{|T(f, t)|^2} \tag{95}$$
in order to obtain a feeling as to the degree to which R_{f,t}(τ, Ω) varies with f and t. From the definition of T(f, t)
we see that R_{f,t}(0, 0) is equal to twice the average power received from a unit amplitude sine wave at time t and of frequency f cps away from carrier frequency, as measured by averaging along an ensemble of channels all with the same deterministic slow variations. In practice we do not have available an ensemble of channels. However, we may obtain an approximate measurement of (95) with the determination of the time average
$$\frac{1}{T_1} \int_{t - T_1/2}^{t + T_1/2} |T(f, t_1)|^2\, dt_1 \equiv \left\langle |T(f, t)|^2 \right\rangle_{T_1} \tag{96}$$
where the averaging time T_1 is (hopefully) long enough to produce negligible measurement fluctuations due to the fast fading but yet short enough to reflect the long-term fading behavior of the channel.

17 Some confusion may exist in the reader's mind as to precisely what is meant by Gaussian fading, since fading is usually stated to be Rayleigh distributed. By Gaussian fading it is meant that the transmission of a sinusoid results in the reception of a narrow-band Gaussian process with a possible nonfading specular component present. It is the envelope which will be Rayleigh or Rice distributed depending upon the nonexistence or existence of the specular component.
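The time average in (96) is a sliding mean of |T(f, t)|² over a window of length T_1. A minimal sketch is given below, assuming uniformly spaced samples of T(f, t) at one fixed frequency offset f; the function name, the sampling interval, and the synthetic record are assumptions for illustration and are not measurements.

```python
import numpy as np


def short_time_power(T_samples, dt, T1):
    """Sliding average of |T(f, t)|^2 over a window of length T1 seconds.

    T_samples: complex samples of T(f, t) at one fixed f, spaced dt apart.
    Returns one estimate per window position that fits inside the record.
    """
    n = max(1, int(round(T1 / dt)))              # samples per averaging window
    power = np.abs(np.asarray(T_samples)) ** 2   # instantaneous |T|^2
    window = np.ones(n) / n                      # rectangular averaging window
    return np.convolve(power, window, mode="valid")


# Synthetic record (hypothetical): unit-power complex Gaussian fast fading.
rng = np.random.default_rng(0)
T_ft = (rng.standard_normal(20000) + 1j * rng.standard_normal(20000)) / np.sqrt(2)
estimate = short_time_power(T_ft, dt=1e-3, T1=5.0)
print(estimate.min(), estimate.max())  # spread shrinks as T1 grows
```

Lengthening T_1 reduces the fluctuation of the estimate caused by the fast fading, at the cost of smearing the slow variations that the measurement is meant to track.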
Measurements such as (96) seem to indicate that the inequalities (87) and (88) will be satisfied for a large percentage of radio channels for operating frequencies and signaling element bandwidths and time durations of practical interest. Thus it appears that a useful model for several radio channels is the QWSSUS channel with Gaussian statistics. Measurement of the correlation function R_{f,t}(τ, Ω) on a short-time basis will then provide the necessary statistical information to evaluate the short-time performance of a digital system, assuming the statistics of any additive interferences are known. This short-time performance will be a functional of R_{f,t}(τ, Ω). Since R_{f,t}(τ, Ω) is in effect a random process due to the slow channel fluctuations (we have removed the deterministic assumption), the performance index (say error probability) computed on a short-time basis assuming a Gaussian QWSSUS channel must be averaged over the long-term statistics of R_{f,t}(τ, Ω) to determine a long-time basis performance index.
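The two-stage averaging described above can be sketched as follows. This is an illustrative simplification, not the procedure of the paper: it assumes that the short-time channel state is summarized by a single mean signal-to-noise ratio, and it uses the standard flat slow Rayleigh fading result for binary noncoherent orthogonal FSK, P_e = 1/(2 + mean SNR), as the short-time performance index. The function names and the sample SNR values are hypothetical.

```python
import numpy as np


def pe_ncfsk_rayleigh(mean_snr):
    """Short-time error probability of binary noncoherent orthogonal FSK
    in flat, slow Rayleigh fading (assumed model for this sketch)."""
    return 1.0 / (2.0 + np.asarray(mean_snr, dtype=float))


def long_term_error_rate(short_time_pe, observed_states):
    """Average a short-time performance index over observed long-term states."""
    return float(np.mean(short_time_pe(observed_states)))


# Hypothetical short-time mean SNRs (linear scale) seen over a long observation.
mean_snrs = np.array([5.0, 20.0, 80.0, 10.0])
print(long_term_error_rate(pe_ncfsk_rayleigh, mean_snrs))
```

In the general case the short-time index is a functional of the whole correlation function R_{f,t}(τ, Ω), so `observed_states` would hold measured correlation functions rather than scalar SNRs.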
VI. CANONICAL CHANNEL MODELS
All practical channels and signals have an essentially finite number of degrees of freedom due to restrictions on time duration, fading rate, bandwidth, etc. These restrictions allow a simplified representation of linear time-varying channels in terms of canonical elements or building blocks. Such channel representations, called canonical channel models, can simplify the analysis of the performance of communication systems which employ time-variant linear channels. Two general classes of channel models, called Sampling Models and Power Series Models, will be considered in this paper. The Sampling Models apply when a system function vanishes for values of an independent variable (time t, frequency f, delay ξ, or Doppler shift ν) outside some interval or when the input or output time function is time-limited or band-limited. The conditions for the applicability of the Power Series Models are not so simply stated. Stated briefly, they require the existence of a power series expansion of a system function in an independent variable and, depending upon the channel model, the existence of derivatives of the input function or spectrum.

A. Sampling Models

In this section we will develop the various sampling canonical channel models referred to above. The models developed by Kailath² will also be included not only for completeness but because the significance of some of the new sampling models is enhanced since they are dual to those of Kailath. It will be convenient to divide our discussion into two parts, one involving time and frequency constraints and the other involving delay and Doppler-shift constraints.

1) Time and Frequency Constraints: A quick understanding of the time and frequency limitations that are relevant in the case of the sampling channel models may be arrived at by examining the input-output relationships corresponding to the Time-Variant Transfer Function T(f, t) and the Frequency-Dependent Modulation Function M(t, f). These relationships are repeated below for convenience:
$$\begin{aligned} w(t) &= \int Z(f)\, T(f, t)\, e^{j2\pi f t}\, df \\ W(f) &= \int z(t)\, M(t, f)\, e^{-j2\pi f t}\, dt. \end{aligned} \tag{97}$$
If a time-variant linear filter is preceded by a time-invariant linear filter with transfer function T_i(f) and is followed with a multiplication by a time function M_o(t), a combination time-variant linear filter which includes the input filter and output multiplier has a Time-Variant Transfer Function T'(f, t) given by
$$T'(f, t) = T_i(f)\, T(f, t)\, M_o(t). \tag{98}$$
Eq. (98) is quickly deduced from the first equation in (97) by noting that preceding the original filter by a time-invariant linear filter with transfer function T_i(f) is equivalent to changing the input spectrum from Z(f) to Z(f)T_i(f), while following the original filter by a multiplication with M_o(t) is equivalent to multiplying both sides of the first equation in (97) by M_o(t).

Since a constraint on the bandwidth of the input signal to a frequency region of width W_i centered on f_i, i.e., to f_i - W_i/2 < f < f_i + W_i/2 cps, may be represented by means of a band-limiting filter at the channel input, it is clear from (98) that such a constraint can be handled conveniently by defining a hypothetical channel whose Time-Variant Transfer Function T'(f, t) is given by
$$T'(f, t) = \mathrm{Rect}\!\left(\frac{f - f_i}{W_i}\right) T(f, t) \tag{99}$$
where T(f, t) is the actual system function and
$$\mathrm{Rect}(x) = \begin{cases} 1, & |x| < \tfrac{1}{2} \\ 0, & |x| \ge \tfrac{1}{2}. \end{cases} \tag{100}$$
Since the Input Delay-Spread Function g(t, ξ) is the inverse Fourier transform of the Time-Variant Transfer Function T(f, t) with respect to the frequency variable, it follows that corresponding to the hypothetical system function T'(f, t) in (99) there is a hypothetical Input Delay-Spread Function g'(t, ξ) given by
$$g'(t, \xi) = \int e^{j2\pi f_i(\xi - \eta)}\, W_i\, \mathrm{sinc}\,[W_i(\xi - \eta)]\, g(t, \eta)\, d\eta \tag{101}$$
where
$$\mathrm{sinc}\; y = \frac{\sin \pi y}{\pi y}. \tag{102}$$
Eq. (101) is obtained from (99) by noting that multiplication corresponds to convolution in the transform domain and that the inverse transform of Rect([f - f_i]/W_i) is e^{j2πf_i t} W_i sinc[W_i t]. Thus the case wherein the channel responds (i.e., has a nonzero output) only to input frequencies in a given range, say f_i - W_i/2 < f < f_i + W_i/2, and the case wherein the channel responds to other frequencies but has an input signal limited to frequencies in the range f_i - W_i/2 < f < f_i + W_i/2 may be handled by the same analytical approach. In deriving the sampling model relevant to an input signal bandwidth limitation or an input frequency response limitation it will be convenient to assume that T(f, t) is nonzero only for values of f within the relevant frequency interval. It should be kept in mind, however, that when the input signal rather than the input frequency response of the channel is band-limited, one must eventually use an equation such as (99) or (101) in order to express the parameters of the canonical channel model in terms of the true channel system function.

If it is desired to observe the channel output for some finite time interval, say t_o - T_o/2 < t < t_o + T_o/2, or if, due to some gating operation in the receiver, only a finite time segment of the received waveform in the same interval is available, one has a constraint on the time duration of the channel output. It is clear from (98) that such a constraint may be handled analytically by defining a hypothetical channel whose Time-Variant Transfer Function T'(f, t) is given by
$$T'(f, t) = T(f, t)\, \mathrm{Rect}\!\left(\frac{t - t_o}{T_o}\right). \tag{103}$$
Since the Output Doppler-Spread Function G(f, ν) is the Fourier transform of the Time-Variant Transfer Function with respect to the time variable, it follows that corresponding to the hypothetical system function T'(f, t) in (103) there is a hypothetical Output Doppler-Spread Function G'(f, ν) given by
$$G'(f, \nu) = \int e^{-j2\pi t_o(\nu - \mu)}\, T_o\, \mathrm{sinc}\,[T_o(\nu - \mu)]\, G(f, \mu)\, d\mu. \tag{104}$$
Eq. (104) follows from (103) for the same reasons that (101) followed from (99). Thus the case wherein the channel output vanishes outside some time interval, say t_o - T_o/2 < t < t_o + T_o/2, and the case where the channel has outputs outside this range but the receiver observes the received waveform only in this range may be handled by the same analytical approach. In deriving the sampling model relevant to a limitation in a receiver observation time it will be convenient to assume that T(f, t) is nonzero only for values of t within the relevant time interval. However, it should be kept in mind that when the output observation time is limited rather than the output time response of the channel, one must eventually use an equation such as (103) or (104) in order to express the parameters of the canonical channel model in terms of the true channel system function.

A discussion entirely dual to the one above concerning input frequency and output time constraints and dealing with T(f, t) may be carried through for an input time and output frequency constraint by dealing with M(t, f), the Frequency-Dependent Modulation Function. Thus, if a time-variant filter is preceded by a multiplier which multiplies the input by M_i(t) and is followed with a time-invariant linear filter of transfer function T_o(f), a combination time-variant linear filter which includes the input multiplier and output filter has a Frequency-Dependent Modulation Function given by
$$M'(t, f) = M_i(t)\, M(t, f)\, T_o(f). \tag{105}$$
Eq. (105) is quickly deduced from the second equation in (97) by noting that preceding the original filter with a multiplication by M_i(t) is equivalent to changing the input time function from z(t) to z(t)M_i(t), while following the original filter by a filter with transfer function T_o(f) is equivalent to multiplying both sides of the second equation in (97) by T_o(f).

A constraint on the duration of the input waveform to, say, t_i - T_i/2 < t < t_i + T_i/2 can be handled analytically by using a hypothetical Frequency-Dependent Modulation Function M'(t, f) given by
$$M'(t, f) = \mathrm{Rect}\!\left(\frac{t - t_i}{T_i}\right) M(t, f), \tag{106}$$
or equivalently by using a hypothetical Input Doppler-Spread Function H'(f, ν) given by
$$H'(f, \nu) = \int e^{-j2\pi t_i(\nu - \mu)}\, T_i\, \mathrm{sinc}\,[T_i(\nu - \mu)]\, H(f, \mu)\, d\mu \tag{107}$$
in place of the actual system function and then assuming an internal input time constraint. A constraint on the frequency interval over which the channel output is observed, say to f_o - W_o/2 < f < f_o + W_o/2 cps, may be handled analytically by using
$$M'(t, f) = M(t, f)\, \mathrm{Rect}\!\left(\frac{f - f_o}{W_o}\right) \tag{108}$$
in place of the actual Frequency-Dependent Modulation Function M(t, f), or equivalently by using
$$h'(t, \xi) = \int e^{j2\pi f_o(\xi - \eta)}\, W_o\, \mathrm{sinc}\,[W_o(\xi - \eta)]\, h(t, \eta)\, d\eta \tag{109}$$
in place of the actual Output Delay-Spread Function h(t, ξ) and then assuming an internal output bandwidth constraint.

Having completed our preliminary discussion of time and frequency constraints we may now proceed to the determination of the corresponding sampling canonical channel models. All the sampling models are derived by application of the Sampling Theorem,¹⁸ which states that if a function h(x) is zero for values of x outside an interval -X/2 < x < X/2, then its Fourier transform H(y) may be expressed as the following series
$$H(y) = \sum_k H\!\left(\frac{k}{X}\right) \mathrm{sinc}\!\left[X\!\left(y - \frac{k}{X}\right)\right] \tag{110}$$
where
$$\mathrm{sinc}\; y = \frac{\sin \pi y}{\pi y} \tag{111}$$
and
$$H(y) = \int h(x)\, e^{-j2\pi x y}\, dx \quad \text{or} \quad \int h(x)\, e^{j2\pi x y}\, dx. \tag{112}$$
When h(x) vanishes outside an interval which is centered on x_1, i.e., when h(x) vanishes outside the interval x_1 - X/2 < x < x_1 + X/2, the Sampling Theorem becomes
$$H(y) = \sum_k H\!\left(\frac{k}{X}\right) e^{\mp j2\pi x_1 (y - k/X)}\, \mathrm{sinc}\!\left[X\!\left(y - \frac{k}{X}\right)\right] \tag{113}$$
where the sign of the exponential used in (113) agrees with the sign of the exponential used in the Fourier transform definition of H(y).

18 P. M. Woodward, op. cit., pp. 33-34.

a) Sampling Models for Input Time and Frequency Constraints: In this subsection we will derive sampling models appropriate for input time and frequency constraints. Consider first the case of an input time constraint. As discussed above, such a case may be described by stating that M(t, f) vanishes for values of t outside some interval, say, t_i - T_i/2 < t < t_i + T_i/2. Since the Input Doppler-Spread Function H(f, ν) is the Fourier transform of M(t, f) with respect to t (see Fig. 5), it follows from (113) that
$$H(f, \nu) = \sum_n H\!\left(f, \frac{n}{T_i}\right) e^{-j2\pi t_i(\nu - n/T_i)}\, \mathrm{sinc}\!\left[T_i\!\left(\nu - \frac{n}{T_i}\right)\right]. \tag{114}$$
If the summation in (114) is used in place of H(f, ν) in the input-output relationship [(12)]
$$W(f) = \int Z(f - \nu)\, H(f, \nu)\, d\nu,$$
one finds that in the case of an input time constraint the output spectrum is given by
$$W(f) = \sum_n H_n(f) \int Z(f - \nu)\, T_i\, e^{-j2\pi t_i(\nu - n/T_i)}\, \mathrm{sinc}\!\left[T_i\!\left(\nu - \frac{n}{T_i}\right)\right] d\nu \tag{115}$$
where
$$H_n(f) = \frac{1}{T_i}\, H\!\left(f, \frac{n}{T_i}\right). \tag{116}$$
Note that the integral in (115) is just the convolution of the input spectrum Z(f) with e^{-j2πt_i(f - n/T_i)} T_i sinc[T_i(f - n/T_i)]. Since convolution becomes multiplication in the transform domain, and since the time function corresponding to e^{-j2πt_i(f - n/T_i)} T_i sinc[T_i(f - n/T_i)] is
exp[j2πn(t/T_i)] Rect([t - t_i]/T_i), the time function corresponding to the convolution integral in (115) is just the product z(t) exp[j2πnt/T_i] Rect([t - t_i]/T_i). Thus, (115) states that the channel output may be obtained by gating the input with the time function Rect([t - t_i]/T_i), frequency shifting the gated input by harmonics of 1/T_i cps, filtering each of these gated frequency-shifted waveforms with an appropriate filter for each harmonic, and then summing the result. This series of operations immediately suggests the channel model shown in Fig. 10.

Fig. 10. Canonical channel model for input time constraint, output filter version.

Although, in theory, an infinite number of frequency converter-filtering elements would be required, in practice a finite number will suffice since for a physical channel the range of Doppler shifts is finite and thus H(f, ν) must effectively vanish for ν outside some interval.

When the channel is randomly time-variant the filters H_n(f) become random filters. The correlation properties of these filters are defined by the average
$$\overline{H_n^*(f)\, H_m(l)} = \left(\frac{1}{T_i}\right)^{2} R_H\!\left(f, l;\, \frac{n}{T_i}, \frac{m}{T_i}\right). \tag{117}$$
For the case of the US channel (90) simplifies to [see (66)]
$$\overline{H_n^*(f)\, H_m(f + \Omega)} = \left(\frac{1}{T_i}\right)^{2} R_H\!\left(\Omega;\, \frac{n}{T_i}, \frac{m}{T_i}\right). \tag{118}$$
It is not difficult to see that a channel may not have an internal input (or output) time constraint and be a WSS (or WSSUS) channel, since the existence of a time constraint is incompatible with the existence of stationarity. However, an external input (or output) time constraint may be associated with any type of channel.

The correlation properties of the random filters as expressed in (117) and (118) are pertinent to the case of an internal input time constraint. In order to obtain from (85) the corresponding cross-correlation function between the random filters for the case of an external input time constraint, i.e., the case of a time-limited input waveform (e.g., a pulse input), we replace the actual Input Doppler-Spread Function by a hypothetical one as indicated in (107). It is quickly seen that instead of (116) we have
$$H_n(f) = \int e^{j2\pi t_i(\nu - n/T_i)}\, \mathrm{sinc}\!\left[T_i\!\left(\nu - \frac{n}{T_i}\right)\right] H(f, \nu)\, d\nu \tag{119}$$
as the expression for the transfer function of the random filter in the canonic channel model of Fig. 10 when the input time constraint is external to the channel proper. Also, instead of (117) we have
$$\overline{H_n^*(f)\, H_m(l)} = \iint e^{-j2\pi t_i(\nu - n/T_i - \mu + m/T_i)}\, \mathrm{sinc}\!\left[T_i\!\left(\nu - \frac{n}{T_i}\right)\right] \mathrm{sinc}\!\left[T_i\!\left(\mu - \frac{m}{T_i}\right)\right] R_H(f, l; \nu, \mu)\, d\nu\, d\mu \tag{120}$$
as the cross-correlation function between the random filters for the general channel. By using the appropriate form for the correlation functions, (120) may be specialized for the US, WSS, and WSSUS channels. We mention only the simplified form it takes in the case of the WSSUS channel,
$$\overline{H_n^*(f)\, H_m(f + \Omega)} = e^{-j2\pi t_i(m - n)/T_i} \int \mathrm{sinc}\!\left[T_i\!\left(\nu - \frac{n}{T_i}\right)\right] \mathrm{sinc}\!\left[T_i\!\left(\nu - \frac{m}{T_i}\right)\right] P_H(\Omega, \nu)\, d\nu. \tag{121}$$
It will be recalled that P_H(Ω, ν) is equal to the cross-power spectral density between the processes M(t, f) and M(t, f + Ω). From (121) it is readily seen that if P_H(Ω, ν) changes very little for changes in ν of the order of the reciprocal of the duration of the input waveform, 1/T_i, we have the approximation
$$\overline{H_n^*(f)\, H_m(f + \Omega)} \approx \begin{cases} 0, & m \ne n \\ \dfrac{1}{T_i}\, P_H\!\left(\Omega, \dfrac{n}{T_i}\right), & m = n. \end{cases} \tag{122}$$
Thus in the case of the WSSUS channel, when P_H(Ω, ν) varies little for changes in ν of the order of the reciprocal of the duration T_i of the input time constraint, the various random filters become uncorrelated and the frequency correlation function of an individual filter transfer function H_n(f) becomes proportional to the value of the density function P_H(Ω, ν) at ν = n/T_i.

We will now determine the channel model appropriate to the case of an input bandwidth restriction.¹⁹ This restriction is dual to that of an input time limitation and, as will be seen, leads to a dual channel model.

Let us assume that the channel responds only to frequencies within the interval f_i - W_i/2 < f < f_i + W_i/2. From the discussion in Subsection VI-A.1), we see that this is equivalent to stating that T(f, t) is zero for values of f outside this interval. Since g(t, ξ) is the inverse Fourier transform of T(f, t), an application of the Sampling Theorem shows that g(t, ξ) may be represented by the series
$$g(t, \xi) = \sum_n g\!\left(t, \frac{n}{W_i}\right) e^{j2\pi f_i(\xi - n/W_i)}\, \mathrm{sinc}\!\left[W_i\!\left(\xi - \frac{n}{W_i}\right)\right]. \tag{123}$$

19 This channel model has previously been derived by Kailath, op. cit.
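The Sampling Theorem (110), on which the expansions (114) and (123) rest, can be checked numerically for a simple time-limited function. The sketch below is an illustration under stated assumptions: it uses a triangular h(x) of base width X centered on x = 0, whose Fourier transform is (X/2) sinc²(Xy/2), and truncates the series to |k| ≤ K. The value of X, the test point, and the truncation length are hypothetical choices.

```python
import numpy as np

X = 2.0  # support width of h(x); hypothetical value


def H(y):
    """Fourier transform of the triangle h(x) = 1 - 2|x|/X on |x| < X/2.

    np.sinc(u) is sin(pi u) / (pi u), the same convention as (111).
    """
    return 0.5 * X * np.sinc(0.5 * X * y) ** 2


def sampling_series(y, K=200):
    """Right-hand side of (110), truncated to the terms with |k| <= K."""
    k = np.arange(-K, K + 1)
    return float(np.sum(H(k / X) * np.sinc(X * (y - k / X))))


y0 = 0.37  # arbitrary point between the sample locations k / X
print(H(y0), sampling_series(y0))  # the two values should agree closely
```

Because the samples H(k/X) of this example decay as 1/k², a modest K already gives close agreement; a function with slower spectral decay would need more terms.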
If the summation in (123) is used in place of g(t, ξ) in the input-output relationship
$$w(t) = \int z(t - \xi)\, g(t, \xi)\, d\xi,$$
one finds that in the case of an input frequency constraint the output time function may be represented in the form
$$w(t) = \sum_n g_n(t) \int z(t - \xi)\, W_i\, \mathrm{sinc}\!\left[W_i\!\left(\xi - \frac{n}{W_i}\right)\right] e^{j2\pi f_i(\xi - n/W_i)}\, d\xi \tag{124}$$
where
$$g_n(t) = \frac{1}{W_i}\, g\!\left(t, \frac{n}{W_i}\right). \tag{125}$$
Note that the integral in (124) is just the convolution of the input with a time function W_i exp[j2πf_i(t - n/W_i)] sinc[W_i(t - n/W_i)]. Since the spectrum corresponding to this latter time function is exp[-j2πn(f/W_i)] Rect([f - f_i]/W_i), it follows that the spectrum corresponding to the convolution in (124) is just the product Z(f) Rect([f - f_i]/W_i) exp[-j2πn(f/W_i)]. Thus (124) states that the channel output may be obtained by passing the input through a band-limiting filter with transfer function Rect([f - f_i]/W_i), delaying the resultant by multiples of a basic delay 1/W_i, multiplying each of these delayed functions by a multiplier g_n(t) appropriate to the delay n/W_i, and then summing the result. This series of operations immediately suggests the channel model shown in Fig. 11.

Fig. 11. Canonical channel model for input frequency constraint, output multiplier version.

Although, in theory, an infinite number of taps would be required, in practice a finite number will suffice, since for a physical channel the spread of path delays is finite and thus g(t, ξ) must effectively vanish for ξ outside some interval.

When the channel is randomly time variant the multipliers g_n(t) become random processes. The correlation properties of these multipliers are defined by the average
$$\overline{g_n^*(t)\, g_m(s)} = \left(\frac{1}{W_i}\right)^{2} R_g\!\left(t, s;\, \frac{n}{W_i}, \frac{m}{W_i}\right). \tag{126}$$
For the case of the WSS channel,
$$\overline{g_n^*(t)\, g_m(t + \tau)} = \left(\frac{1}{W_i}\right)^{2} R_g\!\left(\tau;\, \frac{n}{W_i}, \frac{m}{W_i}\right). \tag{127}$$
It is not difficult to see that a channel may not have an internal input (or output) bandwidth constraint and be a US (or WSSUS) channel. To understand this fact recall [see discussion following (66)] that the US channel has a wide-sense stationarity property in the frequency variable. Such a property is clearly incompatible with an internal input (or output) frequency constraint. However, an external frequency constraint may, of course, be associated with any type of channel.

The correlation properties of the random multipliers as expressed in (126) and (127) are pertinent to the case of an internal input bandwidth constraint. In order to obtain the corresponding cross-correlation function for the case of an external input bandwidth constraint, i.e., a band-limited input signal, we replace the actual Input Delay-Spread Function by a hypothetical one as indicated in (99). It is readily seen that instead of (125), we have
$$g_n(t) = \int e^{-j2\pi f_i(\xi - n/W_i)}\, \mathrm{sinc}\!\left[W_i\!\left(\xi - \frac{n}{W_i}\right)\right] g(t, \xi)\, d\xi \tag{128}$$
as the expression for the multiplier associated with a delay of n/W_i sec in the canonic model of Fig. 11 when the input frequency constraint is external to the channel proper. Instead of (127), we have
$$\overline{g_n^*(t)\, g_m(s)} = \iint e^{j2\pi f_i(\xi - \eta - n/W_i + m/W_i)}\, \mathrm{sinc}\!\left[W_i\!\left(\xi - \frac{n}{W_i}\right)\right] \mathrm{sinc}\!\left[W_i\!\left(\eta - \frac{m}{W_i}\right)\right] R_g(t, s; \xi, \eta)\, d\xi\, d\eta. \tag{129}$$
By using the appropriate form for the correlation functions (129) may be specialized for the US, WSS, and WSSUS channels. We mention only the simplified form it takes in the case of the WSSUS channel,
$$\overline{g_n^*(t)\, g_m(t + \tau)} = e^{j2\pi f_i(m - n)/W_i} \int \mathrm{sinc}\!\left[W_i\!\left(\xi - \frac{n}{W_i}\right)\right] \mathrm{sinc}\!\left[W_i\!\left(\xi - \frac{m}{W_i}\right)\right] P_g(\tau, \xi)\, d\xi. \tag{130}$$
Examination of (130) shows that if P_g(τ, ξ) changes little for changes in ξ of the order of 1/W_i we have the approximation
$$\overline{g_n^*(t)\, g_m(t + \tau)} \approx \begin{cases} 0, & m \ne n \\ \dfrac{1}{W_i}\, P_g\!\left(\tau, \dfrac{n}{W_i}\right), & m = n. \end{cases} \tag{131}$$
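The tapped-delay-line structure of Fig. 11 can be sketched in discrete time. The code below is an illustration under several assumptions that are not in the text: the band center is taken as f_i = 0, the already band-limited input is sampled at the tap spacing 1/W_i, only nonnegative delays n/W_i are used, and the time-variant multipliers g_n(t) are supplied as arrays. The function name and array layout are hypothetical.

```python
import numpy as np


def tapped_delay_line(z, taps):
    """Sum over n of g_n(t) * z(t - n / W_i), evaluated at t = k / W_i.

    z:    complex samples of the band-limited input, spaced 1 / W_i apart.
    taps: array of shape (N, len(z)) with N <= len(z);
          taps[n, k] is the multiplier g_n at time k / W_i.
    """
    z = np.asarray(z, dtype=complex)
    taps = np.asarray(taps, dtype=complex)
    w = np.zeros(len(z), dtype=complex)
    for n in range(taps.shape[0]):
        # Delay the input by n tap spacings, filling the start with zeros.
        delayed = np.concatenate([np.zeros(n, dtype=complex), z[:len(z) - n]])
        w += taps[n] * delayed   # time-variant gain applied after the delay
    return w


# With time-invariant taps the model reduces to an ordinary convolution.
z = np.array([1, 2, 3, 4], dtype=complex)
c = np.array([0.5, -0.25j])
static_taps = np.repeat(c[:, None], len(z), axis=1)
print(tapped_delay_line(z, static_taps))
```

For a randomly time-variant channel each row of `taps` would be a sample function of a random process whose correlation properties are those of the multipliers g_n(t) described above.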
There exist alternate channel models closely related to those of Figs. 10 and 11. Consider first the case of an input time constraint and the channel model of Fig. 10. If we define Z'(f) as the spectrum of the time function z(t) Rect([t - t_i]/T_i), which is the input to the frequency conversion chain, it is readily seen that the spectrum of the channel output W(f) may be expressed as
$$W(f) = \sum_n H_n(f)\, Z'\!\left(f - \frac{n}{T_i}\right). \tag{132}$$
If we define the filter
$$G_n(f) = H_n\!\left(f + \frac{n}{T_i}\right) \tag{133}$$
then the output spectrum may be expressed as
$$W(f) = \sum_n G_n\!\left(f - \frac{n}{T_i}\right) Z'\!\left(f - \frac{n}{T_i}\right) \tag{134}$$
which states that the output may be obtained by gating, filtering, and frequency shifting as indicated in Fig. 12. With the aid of (133) and (116) and (119) one may determine expressions for G_n(f) in terms of the system functions for the cases of internal and external time constraints. Also the correlation properties of G_n(f) are readily determined from those of H_n(f) with the aid of (133).

Fig. 12. Canonical channel model for input time constraint, input filter version.

Similarly, if we define z''(t) as the time function resulting from band-limiting the input as shown in Fig. 11, it is seen that the channel output w(t) may be expressed as
$$w(t) = \sum_n g_n(t)\, z''\!\left(t - \frac{n}{W_i}\right). \tag{135}$$
If we define the time function
$$h_n(t) = g_n\!\left(t + \frac{n}{W_i}\right) \tag{136}$$
then the output time function may be expressed as
$$w(t) = \sum_n h_n\!\left(t - \frac{n}{W_i}\right) z''\!\left(t - \frac{n}{W_i}\right) \tag{137}$$
which states that the output may be obtained by band-limiting, multiplying, and delaying as indicated in Fig. 13. With the aid of (136) and (125) and (128) one may determine expressions for h_n(t) in terms of the system functions for the cases of internal and external bandwidth constraints. Also the correlation properties of h_n(t) are readily determined from those of g_n(t) with the aid of (136).

Fig. 13. Canonical channel model for input frequency constraint, input multiplier version.

b) Sampling Models for Output Time and Frequency Constraints: The development of channel models for output time and frequency constraints parallels the previous development of channel models for input time and frequency constraints. When considering an output time constraint one specifies that T(f, t) vanishes for t outside some interval, say, t_o - T_o/2 < t < t_o + T_o/2, while for an output frequency constraint one specifies that M(t, f) vanishes for f outside an interval f_o - W_o/2 < f < f_o + W_o/2. Then, for an output time constraint, one makes a sampling expansion of G(f, ν) [the Fourier transform of T(f, t) with respect to t] in the ν variable, while for an output frequency constraint one makes a sampling expansion of h(t, ξ) [the inverse Fourier transform of M(t, f) with respect to f] in the ξ variable. By using these expansions in the appropriate input-output relations and examining the resulting series expressions for the output (as was done for the case of input time and frequency constraints) one may obtain the appropriate canonical channel models for the case of output time and frequency constraints.

We shall not give the details of the derivations because of their similarity to the derivations in the previous section. The resulting canonical models are shown in Figs. 14 to 17.²⁰ Note that the output time constraint models differ in form from the input time constraint models only in having an output time gate instead of an input time gate. Similarly, the output frequency constraint models differ from the input frequency constraint models only in having an output rather than input band-pass filter. The filter transfer functions in Figs. 14 and 15 are given by
$$\begin{aligned} H_n(f) &= \frac{1}{T_o}\, G\!\left(f - \frac{n}{T_o}, \frac{n}{T_o}\right) \\ G_n(f) &= \frac{1}{T_o}\, G\!\left(f, \frac{n}{T_o}\right) = H_n\!\left(f + \frac{n}{T_o}\right) \end{aligned} \tag{138}$$
in the case of internal output time constraints. When the output time constraint is external, however, one must use
$$\begin{aligned} G_n(f) &= \int e^{j2\pi t_o(\nu - n/T_o)}\, \mathrm{sinc}\!\left[T_o\!\left(\nu - \frac{n}{T_o}\right)\right] G(f, \nu)\, d\nu \\ H_n(f) &= G_n\!\left(f - \frac{n}{T_o}\right). \end{aligned} \tag{139}$$
The gain functions in Figs. 16 and 17 are given by (140) in the case of internal output bandwidth constraints.

20 The model in Fig. 16 has been previously derived by Kailath, op. cit.
c) Sampling Models [or Combined Time and Frequency Constraints: In Subsections VA. 1) a) and b), we have
zU)--
RECT(
';'0)
_~-_-~t--_ ---~'ltl Fig. I4-Canonical channel model for output time constraint, output filter version.. zlO- - -
---+--------.--
REcn';t o )
----J~.111 Fig. I5-Canonical model for output time constraint, input filter version.
developed canonic channel models for the case of a single constraint on time or frequency at the input or output of the channel. However, it is possible for certain combinations of these constraints to exist for the same channel. Two interesting combinations, which are dually related, are the cases of a combined input time constraint and output frequency constraint and a combined input frequency constraint and output time constraint. Other combinations are either impossible or do not lead to new models. The impossible combinations are combined internal time and frequency constraints on the same end of the channel, since (as a study of the meaning of an internal constraint readily reveals) such combined constraints imply the existence of functions which are both time- and band-limited. Consider now the case of an internal input time and output frequency constraint. Mathematically we can represent such a combined constraint by stating that the Frequency-Dependent Modulation Function M (t, f) vanishes for values of t and f outside the rectangle t, T i/2 < t < i, T i/2, 10 - W o/2 < f < fo + W o/2. Thus, M (t, f) satisfies the equation
+
z(t)---
- - _ .......... -
_ _ ----+-_ --
--I.
M(t, f-fO)~.Ctl Wo
ECTC
Fig. 16-Canonical channel model for output frequency constraint, output multiplier version.
f)
= Rect (t ~. to)MCt, f) Reet (t ;"0to).
(142)
But it is readily seen that we can also write
M(t, f)
= Rect
e
(t~. t')MCt, f) Rect ;'oto)
(143)
where
M(t, f)
co
L L
m,n--c:o
---i.
M(t -
-a.
f - mWo)
(144)
since only the m = 0, n = 0 term in the sum (144) defining M(t, f) contributes nonzero values to the left side of (143). Fig. 17-Canomcal channel model for output frequency constraint, Eq. (143) states that the channel under discussion input multiplier version. may be represented as the cascade of three operations. The first is an input gating operation with the function Rect ([t - ti]/Ti ) , the second is a filtering operation by For external output bandwidth constraints the gain means of a time variant filter with Frequency-Dependent functions become Modulation Function 1Vl(t, f) and the last is a band-pass filtering operation with transfer function Rect ([f - foll h,.(t) = e- i 2 r ' o
I
Wo(~ -
ECTl f-fo wo I
~.(t)
;J}Ct, ~) d~ ;J
For randomly time-varying channels the correlation properties of the filter transfer functions in Figs. 14 and 15 and the gain functions in Figs. 16 and 17 are quickly determined in terms of the appropriate channel correlation functions, as was done for the case of input time and frequency constraints.
Rect ([f - !o]/Wo). It will be recalled (see Fig. 5 for example) that the
Doppler-Delay-Spread Function V(v, Fourier transform of llI(t, f), V(II, ~)
=
IfM(t,
f)e-
i2
~)
r< h - IU
is the double
dt
u.
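Thus the Doppler-Delay-Spread Function corresponding to M̃(t, f) is given by

$$\tilde{V}(\nu, \xi) = \sum_{m,n} \iint M(t - mT_i,\; f - nW_o)\, e^{-j2\pi(\nu t - \xi f)}\, dt\, df = V(\nu, \xi) \sum_{m} e^{-j2\pi \nu m T_i} \sum_{n} e^{j2\pi \xi n W_o} \tag{145}$$

where the second equation follows from the first by making the changes of variable t - mT_i → t, f - nW_o → f in the double integral in the first equation. The sums in the second equation in (145) may be recognized as Fourier series expansions of periodic impulse trains, i.e.,

$$\sum_{m} e^{-j2\pi \nu m T_i} = \frac{1}{T_i} \sum_{m} \delta\left(\nu - \frac{m}{T_i}\right), \qquad \sum_{n} e^{j2\pi \xi n W_o} = \frac{1}{W_o} \sum_{n} \delta\left(\xi - \frac{n}{W_o}\right) \tag{146}$$

so that

$$\tilde{V}(\nu, \xi) = \frac{1}{T_i W_o} \sum_{m,n} V\left(\frac{m}{T_i}, \frac{n}{W_o}\right) \delta\left(\nu - \frac{m}{T_i}\right) \delta\left(\xi - \frac{n}{W_o}\right). \tag{147}$$

Fig. 18-Canonical channel model for input time-output frequency constraint.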
If we let z'(t) and w'(t) respectively denote the input and output of the channel with Doppler-Delay-Spread Function Ṽ(ν, ξ), we can express the channel output as

$$w'(t) = \sum_{m,n} V_{mn}\, z'\left(t - \frac{n}{W_o}\right) e^{j2\pi (m/T_i)(t - n/W_o)} \tag{148}$$

where the complex amplitude

$$V_{mn} = \frac{1}{T_i W_o} V\left(\frac{m}{T_i}, \frac{n}{W_o}\right). \tag{149}$$

Thus, apart from the input gate and the output bandlimiting filter, the channel representation corresponding to an input-time output-frequency limitation is a discrete number of point "scatterers," each providing first a fixed Doppler shift which is some multiple of the reciprocal input time duration constraint and then a fixed delay which is some multiple of the reciprocal output bandwidth constraint. The complex amplitude of the reflection from the point scatterer is just 1/T_iW_o times the Doppler-Delay-Spread Function sampled at the same value of delay and Doppler shift provided by the scatterer. Fig. 18 demonstrates the realization of such a channel by tapped delay lines and frequency conversion chains.

The canonic model of Fig. 18 can be derived in a somewhat different manner than described above by making use of canonic channel models previously derived for single constraints on input time and output frequency. Thus, consider the canonic channel model for the case of an input time constraint, Fig. 10, and note that when an output frequency constraint exists the filters H_n(f) must have this same constraint, i.e., H_n(f) must vanish for values of f outside the interval f_o - W_o/2 < f < f_o + W_o/2. This latter fact becomes quite clear by examination of Fig. 10, since these filters are the only elements in the model which provide frequency selectivity. An analytical proof is readily obtained by noting that H_n(f) = (1/T_i)H(f, n/T_i) and that the f variables in H(f, ν) and M(t, f) are the same variable, since H(f, ν) and M(t, f) are Fourier transform pairs in the t, ν variables.

Since the filter H_n(f) has an output frequency constraint (and also an input frequency constraint since it is time-invariant), one may represent this filter by means of a canonic channel model consisting of a tapped delay line as shown in Fig. 11, but with time-invariant gains. When this representation of H_n(f) is made one arrives at the canonic channel model shown in Fig. 18.

When the channel is randomly varying the coefficients V_mn become random variables. The correlation between these random gains is given by

$$\overline{V_{mn}^{*} V_{rs}} = \left(\frac{1}{T_i W_o}\right)^{2} R_V\left(\frac{m}{T_i}, \frac{r}{T_i};\, \frac{n}{W_o}, \frac{s}{W_o}\right). \tag{150}$$

The expressions for the gain (149) and its correlation properties (150) are applicable for the case of internal constraints on the input time and output frequency. For the case of external constraints the same canonical channel model applies, but in determining V_mn the actual Frequency-Dependent Modulation Function M(t, f) should be replaced by a hypothetical one given by

$$M'(t, f) = \mathrm{Rect}\left(\frac{t - t_i}{T_i}\right) M(t, f)\, \mathrm{Rect}\left(\frac{f - f_o}{W_o}\right) \tag{151}$$

or, equivalently, the actual Doppler-Delay-Spread Function V(ν, ξ) should be replaced by a hypothetical one V'(ν, ξ) given by

$$V'(\nu, \xi) = \iint \mathrm{Rect}\left(\frac{t - t_i}{T_i}\right) e^{-j2\pi \nu t}\, \mathrm{Rect}\left(\frac{f - f_o}{W_o}\right) e^{j2\pi \xi f}\, M(t, f)\, dt\, df. \tag{152}$$
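The point-scatterer representation (148) can be evaluated directly. The following is a minimal sketch, assuming the gains V_mn are already known and supplied as a dictionary keyed by (m, n); the function name, the dictionary layout, and the example numbers are illustrative assumptions, not part of the original paper.

```python
import cmath

def scatterer_channel_output(z, gains, T_i, W_o, t):
    """Evaluate w'(t) = sum_{m,n} V_mn z'(t - n/W_o) exp(j 2 pi (m/T_i)(t - n/W_o)).

    z     : callable returning the complex input envelope z'(t)
    gains : dict mapping (m, n) -> complex gain V_mn (hypothetical layout)
    T_i   : input time constraint (seconds); Doppler shifts are multiples of 1/T_i
    W_o   : output bandwidth constraint (Hz); delays are multiples of 1/W_o
    """
    total = 0j
    for (m, n), v_mn in gains.items():
        delay = n / W_o        # fixed delay contributed by this scatterer
        doppler = m / T_i      # fixed Doppler shift contributed by this scatterer
        total += v_mn * z(t - delay) * cmath.exp(2j * cmath.pi * doppler * (t - delay))
    return total

# Usage: two scatterers acting on a complex tone (illustrative values only).
example_gains = {(0, 0): 1.0, (1, 2): 0.5j}
tone = lambda t: cmath.exp(2j * cmath.pi * 50.0 * t)
w = scatterer_channel_output(tone, example_gains, T_i=0.01, W_o=1000.0, t=0.003)
```

In practice only the (m, n) pairs with non-negligible gain need to be listed, which is what makes the discrete model attractive.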
With the aid of (108) and (109) the integral with respect to f can be expressed as
Similarly the integration with respect to t may be expressed as
Thus for the WSSUS channel and a sufficiently smooth Scattering Function, the gains of the point "scatterers" in the canonical channel model become uncorrelated and the strength of the reflection from a particular scatterer becomes proportional to the amplitude of the Scattering Function at the same value of delay and Doppler shift provided by the scatterer. A somewhat different canonical channel model may be derived for the case of an input-time output-frequency constraint by using the relationship [see (34)]
m) u(~ T r r, XT 0 't'
(154) which results in the following expression for V' (v,
~):
.1
=
t'
V(l1t'1 .
1
~)
.
,.
'w'"0
-i2r(m/Ti)(n/Wo)
e
(159)
in (121) to show that w'(t) =
Ti~VO ~.f= u(;o ';:) (160)
·sinc [TiCv - It)] sino [Wo(E - 1])] V(~, TJ) dp. d1J.
(155)
The gain V mn in the case of an external input timeoutput frequency constraint is given by
V
= 1nn
_1_
TiW0
·v,(1nT
i '
~) =
If
woe
i2..-t,(Jf-m/T,)
[T ( T,m)J
-e -i2 ...fo
-
The correlation $\overline{V_{mn}^{*} V_{rs}}$ is readily determined as a fourfold integral involving R_V(ν, μ; ξ, η) by using the integral representation, (156), for V_mn and V_rs and then averaging under the integral sign. It does not appear desirable to take the space to present this fourfold integral. In the case of the WSSUS channel, for which [(72) and (74)]
Rv(v, p,; t, 11)
=
Examination of (160) shows that w'(t) is obtained by first delaying z'(t) by multiples of 1/W_o and then Doppler-shifting by multiples of 1/T_i. Thus, this model will differ from the one shown in Fig. 18 only in that the order of delay and Doppler shift is reversed and the complex amplitude of the reflection from the point scatterer is equal to (1/T_iW_o) U(n/W_o, m/T_i) rather than (1/T_iW_o) V(m/T_i, n/W_o).

To derive the canonical channel model for the case of an input-frequency output-time constraint we may proceed in a manner entirely analogous to that in the case of the dual constraint, i.e., the input time-output frequency constraint. We have omitted this derivation because of its similarity to that for the dual case. The resulting channel model is shown in Fig. 19, in which (161)
in the case of internal constraints and
S(~, v)o(1J - v)o(1] - ~),
the fourfold integral becomes the double integral
·JJ
sine [
-sine [
T.(v - ~) ] sine [ T.& - ;)]
Wo(~ -
;)
J
sine [
Wo(~ - ~JJS(~, v) dv dE. (157)
in the case of external constraints. The correlation between the gains is given by
(163) When the" Scattering Function S(~, v) varies very little for changes in ~ of the order of l/W o and changes in v of for the case of internal time and frequency constraints. the order of l/T i , (157) simplifies to As in the dual situation, the correlation between the gains in the case of external constraint may be expressed m ¢ n, r ~ 8 in terms of a fourfold integral involving the Delay(158) Doppler-Spread Function. We present only the expresm =n, r = 8 sion for the case of WSSUS channel,
(164)

When the Scattering Function varies very little for changes in ξ of the order of 1/W_i and changes in ν of the order of 1/T_o, (164) simplifies to

(165)

yielding a collection of uncorrelated scatterers for the canonic channel model. As in the dual case, a canonical channel model may be found for the case of an input-frequency output-time constraint which differs from that shown in Fig. 19 only in a reversal of the order of delay and Doppler shift. In this model the gain of a scatterer is set equal to (1/T_oW_i) V(n/T_o, m/W_i) rather than (1/T_oW_i) U(m/W_i, n/T_o).

Fig. 19-Canonical model for input frequency-output time constraint.

2) Delay- and Doppler-Shift Constraints: For physical channels the spread of path delays and the spread of Doppler shifts are effectively limited to finite values. According to our definition of system functions, a limitation in the spread of path delays means that all system functions containing the delay variable ξ vanish for values of ξ outside some specified interval. Similarly, a limitation in the spread of Doppler shifts means that all system functions containing the Doppler-shift variable ν vanish for values of ν outside some specified interval. This situation is somewhat different from the cases of time and frequency constraints discussed above, where different physical interpretations (input as opposed to output constraints) might be associated with a specification that system functions vanish for values of the variables t or f outside specified ranges. The difference in behavior may be traced to the fact that the dual system functions U(ξ, ν) and V(ν, ξ) are so simply related that both must vanish over the same intervals of ξ and ν, while the dual system functions M(t, f) and T(f, t), with their more complicated relationship, (39), need not vanish over the same intervals of f and t.

In the following subsections we shall develop canonical models for channels which are limited in either path delay spread or Doppler spread or both Doppler- and delay-spread.

a) Sampling Models for Delay-Spread Constraint:²¹ If we assume that a channel provides path delays only in an interval Δ seconds wide centered at ξ_o seconds, then g(t, ξ), U(ξ, ν), h(t, ξ), and V(ν, ξ) must vanish for values of ξ outside this interval. It follows that the Sampling Theorem, (86), may be applied to the Fourier transforms of these system functions with respect to the delay variable, i.e., to the system functions T(f, t), G(f, ν), M(t, f), and H(f, ν), respectively. To derive the canonical channel models appropriate to a delay-spread limitation it is sufficient to deal with T(f, t) and M(t, f). Thus, according to (113), we find that T(f, t) and M(t, f) have the following expansions

$$T(f, t) = \sum_{m} T\left(\frac{m}{\Delta}, t\right) e^{-j2\pi \xi_o (f - m/\Delta)}\, \mathrm{sinc}\left[\Delta\left(f - \frac{m}{\Delta}\right)\right] \tag{166}$$

and

$$M(t, f) = \sum_{m} M\left(t, \frac{m}{\Delta}\right) e^{-j2\pi \xi_o (f - m/\Delta)}\, \mathrm{sinc}\left[\Delta\left(f - \frac{m}{\Delta}\right)\right]. \tag{167}$$

Upon using the expansion (166) to represent T(f, t) in the input-output relationship (19), we find the series representation for the channel output to be

$$w(t) = \sum_{m} T\left(\frac{m}{\Delta}, t\right) \int Z(f)\, e^{-j2\pi \xi_o (f - m/\Delta)}\, \mathrm{sinc}\left[\Delta\left(f - \frac{m}{\Delta}\right)\right] e^{j2\pi f t}\, df. \tag{168}$$

Examination of (168) shows that the channel output is represented as the sum of the outputs of a number of elementary parallel channels, each of which filters the input with a transfer function of the form

$$e^{-j2\pi \xi_o (f - m/\Delta)}\, \mathrm{sinc}\left[\Delta\left(f - \frac{m}{\Delta}\right)\right]$$

for some value of m and then multiplies the resultant by a gain function T(m/Δ, t). The impulse response of the filter, i.e., the inverse Fourier transform of this transfer function, is readily found to be

$$\exp\left[j2\pi \frac{m}{\Delta} t\right] \frac{1}{\Delta}\, \mathrm{Rect}\left(\frac{t - \xi_o}{\Delta}\right),$$

which is the complex envelope of a rectangular RF pulse of frequency f_c + m/Δ (where f_c is the carrier frequency) and of width Δ seconds centered on t = ξ_o. Such a filter has frequently been called a band-pass integrator. If we let

$$I_m(f) = \exp\left[-j2\pi \xi_o \left(f - \frac{m}{\Delta}\right)\right] \mathrm{sinc}\left[\Delta\left(f - \frac{m}{\Delta}\right)\right], \tag{169}$$

then we can represent the canonical channel model corresponding to (168) as shown in Fig. 20. An alternate channel representation which involves multipliers on the input side rather than the output side may be derived by using the series (167) to represent M(t, f) in the input-output relationship (21). The resulting series expression for the output spectrum is given by

$$W(f) = \sum_{m} I_m(f) \int z(t)\, M\left(t, \frac{m}{\Delta}\right) e^{-j2\pi f t}\, dt. \tag{170}$$

Fig. 20-Canonical channel model for delay-spread limited channel, output multiplier version.

Fig. 21-Canonical channel model for delay-spread limited channel, input multiplier version.

21 This case has been treated previously by Kailath, op. cit.
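The sampling expansion behind the delay-spread limited models can be checked numerically. The sketch below assumes a static (time-invariant) channel made of a few discrete paths whose delays lie inside the interval of width Δ centered on ξ_o, and compares the exact transfer function with a truncated form of the series T(f) = Σ_m T(m/Δ) exp[-j2πξ_o(f - m/Δ)] sinc[Δ(f - m/Δ)]. The path values, the truncation limit, and all function names are illustrative assumptions.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def transfer_function(f, paths):
    """Exact T(f) = sum_k a_k exp(-j 2 pi f xi_k) for static paths (a_k, xi_k)."""
    return sum(a * cmath.exp(-2j * math.pi * f * xi) for a, xi in paths)

def sampled_expansion(f, paths, xi_o, delta, max_index=400):
    """Truncated sampling series using samples of T at the frequencies m / delta."""
    total = 0j
    for m in range(-max_index, max_index + 1):
        f_m = m / delta                      # sample frequency of the m-th branch
        offset = f - f_m
        band_pass_integrator = cmath.exp(-2j * math.pi * xi_o * offset) * sinc(delta * offset)
        total += transfer_function(f_m, paths) * band_pass_integrator
    return total

# Two paths inside a 1 ms wide delay window centered on 1 ms (assumed values).
paths = [(1.0, 0.9e-3), (0.5j, 1.2e-3)]
exact = transfer_function(1234.5, paths)
approx = sampled_expansion(1234.5, paths, xi_o=1.0e-3, delta=1.0e-3)
error = abs(exact - approx)
```

The sinc series converges slowly (terms fall off only as 1/m), so the truncated sum matches the exact value to within a small residual rather than exactly; delays closer to the edge of the window need more terms.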
Examination of (170) shows that the mth term in the sum involves a multiplication of the input by a (complex) gain function M(t, m/Δ) followed by a filtering operation with a filter having transfer function I_m(f). Such a representation is shown in Fig. 21.

As with the previous channel models, although an infinite number of elements is involved, only a finite number is needed in practice. Thus, in Fig. 20, since the approximate bandwidth of each band-pass integrator is 1/Δ cps, and since adjacent integrators are separated by 1/Δ cps, an input signal of bandwidth W would require somewhat more than WΔ, perhaps 10WΔ, judiciously selected adjacent multiplier-filter channels to produce a very close approximation to the channel output. More elementary channels may be needed in the model in Fig. 21, where the multipliers precede the filters, because the time-varying gains M(t, m/Δ); m = 0, ±1, ±2, etc., spread the spectra of the inputs to the corresponding filters.

When the channel is randomly time-variant the gain functions T(m/Δ, t) and M(t, m/Δ) become random processes. It is clear that the correlation properties of these gain functions are completely determined from the correlation functions of the Time-Variant Transfer Function and Frequency-Dependent Modulation Function. Since these correlation functions have been discussed in detail in Section III, there is no need for further discussion here. However, it is interesting to note that only for the WSSUS channel do the correlation properties of the gain functions in Figs. 20 and 21 become identical, i.e., for the WSSUS channel

$$\overline{M^{*}(t, m/\Delta)\, M(t + \tau, n/\Delta)} = \overline{T^{*}(m/\Delta, t)\, T(n/\Delta, t + \tau)} = R\left(\frac{n - m}{\Delta}, \tau\right) \tag{171}$$

where R(Ω, τ) is the Time-Frequency Correlation Function defined in Section IV-D.

b) Sampling Models for Doppler-Spread Constraint: The Doppler-spread constraint is dual to the delay-spread constraint and the derivations and resulting canonic models follow a dual pattern. Thus, if we assume that a channel provides Doppler shifts only in an interval Γ cps wide centered on ν_o cps, then H(f, ν), V(ν, ξ), G(f, ν), and U(ξ, ν) must vanish for values of ν outside this interval. Application of the sampling theorem then produces the expansions

$$T(f, t) = \sum_{n} T(f, n/\Gamma)\, e^{j2\pi \nu_o (t - n/\Gamma)}\, \mathrm{sinc}[\Gamma(t - n/\Gamma)] \tag{172}$$

and

$$M(t, f) = \sum_{n} M(n/\Gamma, f)\, e^{j2\pi \nu_o (t - n/\Gamma)}\, \mathrm{sinc}[\Gamma(t - n/\Gamma)]. \tag{173}$$

Upon using (172) in (15) and (173) in (21), we obtain the following expansions for the channel output:

$$w(t) = \sum_{n} e^{j2\pi \nu_o (t - n/\Gamma)}\, \mathrm{sinc}[\Gamma(t - n/\Gamma)] \int Z(f)\, T(f, n/\Gamma)\, e^{j2\pi f t}\, df \tag{174}$$

and

$$W(f) = \sum_{n} M(n/\Gamma, f) \int z(t)\, e^{j2\pi \nu_o (t - n/\Gamma)}\, \mathrm{sinc}[\Gamma(t - n/\Gamma)]\, e^{-j2\pi f t}\, dt. \tag{175}$$

The nth term of the series (174) may be interpreted as the result of a filtering operation with filter transfer function T(f, n/Γ), followed by a multiplication with a gain function exp[j2πν_o(t - n/Γ)] sinc Γ(t - n/Γ). If we let

$$P_n(t) = \exp[j2\pi \nu_o (t - n/\Gamma)]\, \mathrm{sinc}[\Gamma(t - n/\Gamma)], \tag{176}$$
then the canonical channel model which follows from the above interpretation of (174) is as shown in Fig. 22. In an entirely analogous fashion (175) leads to the model shown in Fig. 23.

Whereas in the dual cases described in the previous section a finite number of multiplier-filter combinations is satisfactory for representing the channel for a bandlimited input, in the present cases a finite number of multiplier-filter combinations may be used when the input is time-limited. This fact may be appreciated by noting that the gain function P_n(t) acts as a "gate" of duration 1/Γ and that the "gates" of adjacent elementary channels are separated by 1/Γ seconds. Thus, an input signal of duration T will require anywhere from, say, TΓ to 10TΓ judiciously selected adjacent elementary channels to characterize the channel.

When the channel is randomly time-variant the filters T(f, n/Γ) and M(n/Γ, f) become random processes in the frequency variable. The correlation properties of these filters may be determined from the results of Section IV, which deals with the correlation functions of the various system functions. It is interesting to note, as in the dual case, that only for the WSSUS channel do the correlation properties of the random filters in Figs. 22 and 23 become identical.

c) Sampling Models for Combined Delay-Spread and Doppler-Spread Constraints: When both a Delay-Spread and Doppler-Spread constraint exist, the sampling theorem may be applied twice to the system functions T(f, t) and M(t, f), i.e., once when they are considered as time functions with the frequency variables fixed and once when they are considered as frequency functions with the time variables fixed. In this manner one may form sampling expansions and determine corresponding canonical channel models. However, this procedure is unnecessary since the desired models may be obtained by inspection of Figs. 20-23 by combining models appropriate to delay-spread and Doppler-spread constraints.

To demonstrate this latter approach, examine the model of Fig. 20, which is appropriate to a delay-spread constraint. If we require that a Doppler-spread constraint also exist, the multiplication operation in each parallel channel is in effect a subchannel with a Doppler-spread constraint and may be represented by the canonical model of Fig. 22 or 23 where the f variable is set equal to m/Δ for the mth branch in Fig. 20. If this procedure is followed using the model of Fig. 22 in Fig. 20, the model of Fig. 24 appears. In an entirely analogous fashion one may generate three additional models, one by using the model of Fig. 23 in Fig. 20 and two more by using the models of Figs. 22 and 23 in Fig. 21. It is readily demonstrated that for waveforms which are effectively limited in time and frequency duration, i.e., waveforms which have most of their energy located in a finite time-frequency interval, only a finite number of filters and multipliers need be used in the models of Figs. 24 and 25 to provide a close approximation to the actual channel output.
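The rule-of-thumb branch counts quoted above (roughly WΔ to 10WΔ branches for the delay-spread models, and TΓ to 10TΓ for the Doppler-spread models) are simple products. The helper below is only a sizing sketch under those stated rules of thumb; the function names, the `margin` parameter, and the example numbers are assumptions for illustration.

```python
import math

def branches_delay_spread(bandwidth_hz, delay_spread_s, margin=1.0):
    """Approximate branch count for Figs. 20/21: margin * W * Delta, rounded up."""
    # round() guards against floating-point noise such as 20.000000000000004
    return max(1, math.ceil(round(margin * bandwidth_hz * delay_spread_s, 9)))

def branches_doppler_spread(duration_s, doppler_spread_hz, margin=1.0):
    """Approximate branch count for Figs. 22/23: margin * T * Gamma, rounded up."""
    return max(1, math.ceil(round(margin * duration_s * doppler_spread_hz, 9)))

# Example: 10 kHz input bandwidth over a 2 ms delay spread (assumed values).
low = branches_delay_spread(10_000.0, 0.002)
high = branches_delay_spread(10_000.0, 0.002, margin=10.0)
# Example: 0.5 s input duration over a 10 Hz Doppler spread (assumed values).
gates = branches_doppler_spread(0.5, 10.0)
```

For a combined model such as Fig. 24, the two counts multiply, since every delay branch carries its own set of Doppler branches.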
The complex gain constants T(m/Δ, n/Γ), M(n/Γ, m/Δ) become random variables when the channel is randomly time-variant. Their correlation properties are just sampled values of the correlation functions of T(f, t) and M(t, f) respectively.

Fig. 22-Canonical channel model for Doppler-spread limited channel, output multiplier version.

Fig. 23-Canonical channel model for Doppler-spread limited channel, input multiplier version.

Fig. 24-A canonical channel model for combined delay-spread and Doppler-spread limited channel.

Fig. 25-A canonical channel model for combined input-time and delay-spread constraints.

3) Combined Time and Delay-Spread or Frequency and Doppler-Spread Constraints: In Section VIA. 1) we have developed canonical channel models for time and frequency constraints, i.e., for situations in which system functions vanish for values of t and/or f outside specified intervals. In Section VIA. 2) we have developed canonical channel models for delay-spread and Doppler-spread constraints, i.e., for situations in which system functions vanish for values of ξ and/or ν outside specified intervals. Here we consider certain combinations of the constraints in Sections VIA. 1) and 2), namely, combined time and delay-spread constraints and combined frequency and Doppler-spread constraints. Other combinations are not possible since they are equivalent to requiring that a function be limited in both time and frequency.

The combined constraint models may be derived by combining the models appropriate to the individual constraints, as was done in Section VIA. 2) c). One may combine a delay-spread constraint with either an input or output frequency constraint. We shall present here only one model for a combined Doppler-spread and frequency constraint and one model for a combined delay-spread and time constraint. The remaining possible models may be quickly constructed by the reader.

To construct a model appropriate to an input time constraint and a delay-spread constraint we may make use of the models of Figs. 10 and 21. We note first that if the channel is delay-spread limited then the filters H_n(f) in Fig. 10 are also delay-spread limited. Thus, each of these filters may be represented by means of a canonical model of the form of Fig. 21. Note, however, that since these filters are time-invariant, the gain functions in Fig. 21 are also time-invariant. To determine the value of the (time-invariant) gains it should be noted that a time-invariant filter with transfer function H(f) has

$$M(t, f) = H(f). \tag{177}$$
Then, using the model of Fig. 21 to represent each filter in Fig. 10, we arrive at the model shown in Fig. 25.

To construct a model appropriate to an input frequency constraint and a Doppler-spread constraint we may use the Doppler-spread constraint model of Fig. 22 and input frequency constraint model of Fig. 11. In this connection one should note that a multiplier g(t) is a degenerate time-variant linear filter with

$$T(f, t) = g(t). \tag{178}$$

It is readily determined that the model of Fig. 26 results when the model of Fig. 22 is used to represent each multiplier g_n(t) in Fig. 11.

Fig. 26-A canonical channel model for combined input-frequency and Doppler-spread constraints.

B. Power Series Models

In this section we will derive certain canonical channel models which arise from power series expansions of T(f, t) and M(t, f) in either the time or frequency variables. The two models arising from expansions in the f variable will be called f-power series models, while those arising from expansions in the t variable will be called t-power series models. As might be expected, the f-power series models are dual to the t-power series models and are useful in dual situations.

1) f-Power Series Models: The starting point for our discussion is the input-output relationship corresponding to the Time-Variant Transfer Function T(f, t) (19), which is repeated below:

$$w(t) = \int Z(f)\, T(f, t)\, e^{j2\pi f t}\, df.$$

If the input spectrum Z(f) is confined primarily to a specified frequency interval over which T(f, t) varies little in f, with a minimum fluctuation period much greater than the bandwidth of Z(f), then a Taylor series representation of T(f, t) in f will provide a rapidly convergent expansion of Z(f)T(f, t) and w(t). Since the existence of a mean path delay ξ_g, i.e., a value of ξ about which g(t, ξ) may be considered centered, produces a factor exp[-j2πfξ_g] in T(f, t) which can fluctuate with f quite rapidly, it is desirable to expand only that portion of T(f, t) which does not include this factor. To this end we may define a shifted Input Delay-Spread Function g_o(t, ξ) in which the mean path delay ξ_g has been removed, i.e.,

$$g_o(t, \xi) = g(t, \xi + \xi_g) \tag{179}$$

where ξ_g is a mean multipath delay defined according to some convenient criterion. Then

$$T(f, t) = e^{-j2\pi f \xi_g}\, T_o(f, t) \tag{180}$$

where T_o(f, t) is the Time-Variant Transfer Function of the medium after the mean path delay has been removed, i.e.,

$$T_o(f, t) = \int g_o(t, \xi)\, e^{-j2\pi f \xi}\, d\xi. \tag{181}$$

In the most general situation the input spectrum Z(f) may not be centered at f = 0. Thus, assuming that Z(f) is centered at f = f_i, the most rapid convergence of Z(f)T_o(f, t) will be obtained by expanding T_o(f, t) about f = f_i, i.e.,

$$T_o(f, t) = \sum_{n=0}^{\infty} T_n(t)\, (2\pi j)^n (f - f_i)^n \tag{182}$$

where

$$T_n(t) = \frac{1}{n!\,(2\pi j)^n} \left[\frac{\partial^n T_o(f, t)}{\partial f^n}\right]_{f = f_i} = \frac{(-1)^n}{n!} \int \xi^n g_o(t, \xi)\, e^{-j2\pi f_i \xi}\, d\xi. \tag{183}$$
A filter with transfer function (2πj)^n f^n is an nth order differentiator. We shall define a filter with transfer function (2πj)^n (f - f_i)^n as an offset differentiator with an offset of f_i cps. If we let D^n_{f_i} be an operator denoting such an offset differentiation, it is quickly demonstrated that

$$D^n_{f_i}[f(t)] = e^{j2\pi f_i t}\, \frac{d^n}{dt^n}\left\{ f(t)\, e^{-j2\pi f_i t} \right\}. \tag{184}$$

Fig. 27-f-power series channel model, output multiplier version.

Then use of (182) and (184) in (19) yields the following series representation of the channel output:

$$w(t) = \sum_{n} T_n(t)\, D^n_{f_i}[z(t - \xi_g)] = e^{j2\pi f_i t} \sum_{n} T_n(t)\, \frac{d^n}{dt^n}\left\{ z(t - \xi_g)\, e^{-j2\pi f_i t} \right\}. \tag{185}$$
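The behavior of the f-power series can be illustrated on a small numerical case. The sketch below assumes a static channel made of a few discrete paths, so that the shifted delay-spread function is a sum of impulses and each T_n reduces to a constant, T_n = ((-1)^n / n!) Σ_k a_k d_k^n exp(-j2πf_i d_k) with d_k = ξ_k - ξ_g. For a single-tone input, the offset differentiator of order n simply multiplies the delayed tone by (j2π(f - f_i))^n. The power-weighted centroid used for ξ_g is one convenient choice; the path values, frequencies, and function names are all illustrative assumptions.

```python
import cmath
import math

def mean_delay(paths):
    """Power-weighted centroid of the path delays (one convenient choice of xi_g)."""
    power = sum(abs(a) ** 2 for a, _ in paths)
    return sum(abs(a) ** 2 * xi for a, xi in paths) / power

def series_gain(n, paths, xi_g, f_i):
    """T_n for static discrete paths: ((-1)^n / n!) sum_k a_k d_k^n exp(-j 2 pi f_i d_k)."""
    return ((-1) ** n / math.factorial(n)) * sum(
        a * (xi - xi_g) ** n * cmath.exp(-2j * math.pi * f_i * (xi - xi_g))
        for a, xi in paths
    )

def exact_output(t, f, paths):
    """Exact response to the tone exp(j 2 pi f t): sum_k a_k exp(j 2 pi f (t - xi_k))."""
    return sum(a * cmath.exp(2j * math.pi * f * (t - xi)) for a, xi in paths)

def series_output(t, f, paths, f_i, n_terms):
    """Partial sum of the power series for a tone input, keeping n_terms terms."""
    xi_g = mean_delay(paths)
    delayed_tone = cmath.exp(2j * math.pi * f * (t - xi_g))
    # For a tone, the order-n offset differentiator scales by (j 2 pi (f - f_i))^n.
    return sum(
        series_gain(n, paths, xi_g, f_i) * (2j * math.pi * (f - f_i)) ** n * delayed_tone
        for n in range(n_terms)
    )

# Three paths with microsecond delays; tone offset 100 kHz from the expansion point.
paths = [(1.0, 1.0e-6), (0.6, 1.5e-6), (0.3j, 2.5e-6)]
f_i, f, t = 1.0e6, 1.1e6, 3.0e-6
err_flat = abs(series_output(t, f, paths, f_i, 1) - exact_output(t, f, paths))
err_linear = abs(series_output(t, f, paths, f_i, 2) - exact_output(t, f, paths))
err_many = abs(series_output(t, f, paths, f_i, 12) - exact_output(t, f, paths))
```

Keeping one term corresponds to treating the channel as non-frequency-selective over the input band, and keeping two terms adds a correction linear in frequency; the residuals `err_flat`, `err_linear`, and `err_many` show how quickly the partial sums approach the exact output when the product of frequency offset and delay spread is small.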
Examination of the last equation in (185) indicates that the channel output may be represented as the parallel combination of the outputs of an infinite number of elementary channels, each consisting of a differentiation of some order followed by a time-variant gain, with all channels preceded by a delay ξ_g and then a frequency translation of -f_i cps and all channels followed by a frequency translation of +f_i cps. Study of the first equation in (185) shows that the channel output may also be represented as the parallel combination of elementary channels where now the typical channel consists of an offset differentiation of some order followed by the same time-variant gain, with all channels preceded by a delay ξ_g. This latter channel representation is shown in Fig. 27, where the offset differentiators of different subchannels have been combined into a chain of offset differentiators.

For simplicity, in the following discussion of power series models, we shall present the particular forms that are simplest to diagram. It should be realized, however, that equations such as (185) may have a variety of interpretations in terms of channel models.

An understanding of the conditions leading to rapid convergence of the series (185) may be obtained by first defining a normalized shifted Input Delay-Spread Function ĝ_o(t, ξ) whose "width" in the ξ direction is unity and a shifted normalized input time function ẑ_o(t) which is located at f = 0 and has unit bandwidth (using any convenient bandwidth criterion). These normalized functions are defined implicitly by the relations

$$z(t) = \hat{z}_o(B_i t)\, e^{j2\pi f_i t}, \qquad g_o(t, \xi) = \frac{1}{\Delta}\, \hat{g}_o\left(t, \frac{\xi}{\Delta}\right)$$

where Δ is a measure of multipath spread given by the "width" of g_o(t, ξ) in the ξ direction and B_i is the bandwidth of the input. With the aid of the normalized functions one readily finds that the series (185) may be expressed in the form
SZ~n)ill = __1_ (j211")n
crto(t) =
(;21r)n dt~
J rZ~
0
(f)
/21(1t
e
df
(188)
in which Ẑ_o(f) is the spectrum of ẑ_o(t). Examination of (188) reveals that T_n(t) and [1/(j2π)^n] ẑ_o^{(n)}(B_i t - B_i ξ_g) are both moments of functions having unit "duration." If, indeed, we assume ĝ_o(t, ξ) (as a function of ξ) and Ẑ_o(f) are zero outside the unit interval (centered at ξ = 0 and f = 0, respectively), then it is readily demonstrated that these moments may not increase and in practice will most likely decrease with increasing n. Examination of (187) then indicates that if

$$2\pi B_i \Delta \ll 1 \tag{189}$$

the series will be rapidly convergent. In the general case ĝ_o(t, ξ) and Ẑ_o(f) will have "tails" extending outside the unit interval. For (189) still to represent a useful convergence criterion, these tails must drop to zero sufficiently rapidly so that the moments do not increase too rapidly with increasing n.²²
When the channel is randomly time-variant, the multiplier functions become random processes with correlation properties defined by

$$\overline{T_n^{*}(t)\, T_m(s)} = \frac{(-1)^{m+n}}{n!\, m!} \iint \xi^n \eta^m\, R_g(t, s;\, \xi + \xi_g,\, \eta + \xi_g)\, e^{j2\pi f_i (\xi - \eta)}\, d\xi\, d\eta \tag{190}$$

for the general channel. For the WSSUS channel, the cross-correlation function (190) specializes to (see Fig. 9)

(191)

In the case of the WSSUS channel, a desirable choice for ξ_g is given by

$$\xi_g = \frac{\int \xi\, Q(\xi)\, d\xi}{\int Q(\xi)\, d\xi} \tag{192}$$

where Q(ξ) is the Delay Power Density Spectrum, since such a choice not only minimizes $\overline{|T_1|^2}$ relative to $\overline{|T_0|^2}$ but also leads to T_1(t) and T_0(t) being uncorrelated. The ratio of the strength of T_1 relative to T_0 then takes the simple form

(193)

where the width parameter in (193) may be called the rms width of Q(ξ). When the frequency-selective fading in the channel is sufficiently slow, only the first term in the series (185) will be sufficient to characterize the channel output, i.e.,

$$w(t) = T_0(t)\, z(t - \xi_g) \tag{194}$$

which may be recognized as a "flat-fading" or non-frequency-selective channel model. If the first two terms are used,

$$w(t) = T_0(t)\, z(t - \xi_g) + T_1(t)\, D_{f_i}[z(t - \xi_g)] \tag{195}$$

which may be called a "linearly frequency-selective fading" channel since it corresponds to approximating T_o(f, t) by a linear term in the frequency variable. One may continue and define a "quadratically frequency-selective fading" channel, etc., depending upon the degree of approximation required.

22 The series will diverge if these tails fall too slowly. If the product of the two moments increases exponentially with n as a^n, one may modify (189) into 2πaB_iΔ ≪ 1 to obtain a suitable convergence criterion.

We shall now investigate the error incurred in using a finite number of terms in the expansion (185) for the case of a WSSUS channel. If we assume the existence of derivatives of T_o(f, t) with respect to f as high as Nth order, then we may expand T_o(f, t) in a finite Taylor series expansion

$$T_o(f, t) = \sum_{n=0}^{N-1} T_n(t)\, (2\pi j)^n (f - f_i)^n + \frac{(f - f_i)^N}{N!} \left[\frac{\partial^N T_o(f, t)}{\partial f^N}\right]_{f = f'} \tag{196}$$

where f' lies between f_i and f. Then using (196) in (15) and making use of (183), one obtains the following series expression for the channel output:

$$w(t) = \sum_{n=0}^{N-1} T_n(t)\, D^n_{f_i}[z(t - \xi_g)] + R_N(t) \tag{197}$$
where RN(t) is a remainder term given by
It follows that 2
)RN(t) 1
IJ (·ff (_~)N(-17tgW, ~)gO(t,
=
21f1Y (f - fit(21f1)N(l - fi)N (;!)2 Z*(f)Z(l)
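(199)

where $f''$ lies between $f_i$ and $l$. For the WSSUS channel

(200)

and for a wide-sense stationary $z(t)$

$$\overline{Z^*(f)\,Z(l)} = P_z(f)\,\delta(l - f) \qquad (201)$$

where $P_z(f)$ is the power spectrum of $z(t)$. Using (200) and (201) in (199) we readily find that

$$\overline{|R_N(t)|^2} = \frac{(2\pi)^{2N}}{(N!)^2} \int (f - f_i)^{2N} P_z(f)\, df \int (\xi - \xi_g)^{2N} Q(\xi)\, d\xi = \frac{(2\pi B_i \Delta_g)^{2N}}{(N!)^2} \int f^{2N} \hat{P}_z(f)\, df \int \xi^{2N} \hat{Q}(\xi)\, d\xi \qquad (202)$$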
where $\hat{P}_z(f)$ is the power spectrum of the normalized input signal $\hat{z}(t)$ (unit bandwidth and centered at zero frequency) and $\hat{Q}(\xi)$ is the Delay Power Density Spectrum associated with the normalized Delay-Spread Function $\hat{g}_0(t, \xi)$. One may show that the right-hand side of (202) is also just equal to the average magnitude squared of the $(N+1)$th term in the series (185) for the case of a WSSUS channel. Thus, we have the simple error criterion that the average magnitude squared of the error incurred by using only a finite number of terms in (185) is just equal to the average magnitude squared of the first omitted term when the channel is WSSUS and the input is wide-sense stationary.

We shall now derive a channel model which differs from Fig. 27 principally in a reversal of the order of the operations of differentiation and multiplication. This channel model is derived by making use of a Taylor series expansion of $M(t, f)$ in the frequency variable. The input-output relationship corresponding to $M(t, f)$ is given by (25), which is repeated below:
$$W(f) = \int z(t)\, M(t, f)\, e^{-j2\pi f t}\, dt.$$
If the output spectrum $W(f)$ is confined primarily to a specified frequency interval over which $M(t, f)$ varies little in $f$, with a minimum fluctuation period much greater than the bandwidth of $W(f)$, then a Taylor series representation of $M(t, f)$ in $f$ can be used in (25) to obtain a rapidly convergent expansion of $W(f)$. Since $M(t, f)$ is the Fourier transform of $h(t, \xi)$, the Input Delay-Spread Function, the presence of a nonzero value of $\xi$, say $\xi_h$, about which $h(t, \xi)$ is "centered" will result in a factor $\exp[-j2\pi f \xi_h]$ in $M(t, f)$ which can
Fifty Years of Communications and Networking
fluctuate with $f$ quite rapidly. Thus, it is desirable to expand only that portion of $M(t, f)$ which does not include this factor. If this portion of $M(t, f)$ is denoted by $M_0(t, f)$, we have
(203)

Fig. 28-f-power series channel model, input multiplier version.

where

(204)

$$M(t, f) = M_0(t, f)\, e^{-j2\pi f \xi_h}. \qquad (205)$$

It may also be shown, in the same manner as was shown for the series (185), that the average magnitude squared error incurred by using a finite number of terms in the series (209) is just equal to the average magnitude squared of the first term omitted when the input is wide-sense

The most rapidly convergent expansion of W
$$2\pi B_o \Delta_h \ll 1. \qquad (210)$$
Even when $w(t)$ is not band-limited and $h(t, \xi)$ is not $\xi$-limited, (210) will still be a useful convergence criterion so long as the "tails" of $W(f)$ and $h(t, \xi)$ (as a function of $\xi$) drop to zero rapidly enough.
$$M_n(f) = \left[\frac{1}{n!\,(2\pi j)^n}\, \frac{\partial^n}{\partial t^n} \left\{ M(t, f)\, e^{-j2\pi \nu_H t} \right\}\right]_{t = t_i} = \frac{1}{n!} \int \nu^n H(f, \nu + \nu_H)\, e^{j2\pi \nu t_i}\, d\nu \qquad (212)$$
Fig. 29-t-power series model, output filter version.

Use of (211) in (25) leads to the following expansion for the output spectrum:

$$W(f) = \sum_{n=0}^{\infty} M_n(f) \int z(t)\, e^{-j2\pi (f - \nu_H) t}\, (2\pi j)^n (t - t_i)^n\, dt = \sum_{n=0}^{\infty} M_n(f)\, D^n[Z(f - \nu_H)] \qquad (213)$$

Examination of the channel model in Fig. 29 readily reveals that the summed outputs of its elementary channels are identical to the series (186). When the input time function is limited to a finite time interval $T_i$ seconds and $H(f, \nu)$ is zero for values of $\nu$ outside an interval of duration $\beta_H$ cps, one may show that the series (186) will be rapidly convergent if

$$2\pi T_i \beta_H \ll 1 \qquad (214)$$

[cf. (189)]. Even if $z(t)$ is not time-limited and $H(f, \nu)$ is not $\nu$-limited, (214) will still be a satisfactory convergence criterion if the "tails" of $z(t)$ and $H(f, \nu)$ (as a function of $\nu$) drop to zero sufficiently rapidly.

When the channel is randomly time-variant the filter transfer functions $M_n(f)$ become random processes in the frequency variable with correlation properties defined by

$$\overline{M_n^*(f)\, M_m(l)} = \frac{1}{n!\,m!} \iint \nu^n \mu^m\, R_H(f, l;\ \nu + \nu_H,\ \mu + \nu_H)\, e^{-j2\pi t_i(\nu - \mu)}\, d\nu\, d\mu \qquad (215)$$

for the general channel. In the case of the WSSUS channel, (215) simplifies to

$$\overline{M_n^*(f)\, M_m(f + \Omega)} = \frac{1}{n!\,m!} \int \nu^{m+n} P(\Omega, \nu + \nu_H)\, d\nu \qquad (216)$$

(see Fig. 9). A desirable choice for $\nu_H$ in the case of the WSSUS channel is given by

$$\nu_H = \frac{\int \nu\, P(\nu)\, d\nu}{\int P(\nu)\, d\nu} \qquad (217)$$

where $P(\nu)$ is the Doppler Power Density Spectrum [see (76)], since in this case $\overline{|M_1(f)|^2}$ is minimized relative to $\overline{|M_0(f)|^2}$ and

$$\overline{M_1^*(f)\, M_0(f)} = 0. \qquad (218)$$

The ratio of the strength of $M_1(f)$ relative to $M_0(f)$ then takes the simple form

$$\frac{\overline{|M_1(f)|^2}}{\overline{|M_0(f)|^2}} = \frac{\int (\nu - \nu_H)^2 P(\nu)\, d\nu}{\int P(\nu)\, d\nu} = \beta^2 \qquad (219)$$

where here $\beta$ is the rms width of $P(\nu)$. When the fading is sufficiently slow only the first term in (213) will be sufficient to characterize the channel output, i.e., one may use

$$W(f) = M_0(f)\, Z(f - \nu_H) \qquad (220)$$

which, apart from the frequency shift of $\nu_H$ cps, represents the channel as a time-invariant linear filter with transfer function $M_0(f)$. If the first two terms are used,

(221)

which may be called a "linearly time-selective fading" channel since it corresponds to approximating $M(t, f)\, e^{-j2\pi \nu_H t}$ by a linear term in the time variable. One may continue and define a "quadratically time-selective fading" channel, etc., depending upon the degree of approximation required.

An exact expression dual to (203) is readily formulated for the average magnitude squared error incurred by using only $N - 1$ terms in (213). However, this expression would be applicable in the dual situation, namely, when the input spectrum (rather than time function) is a wide-sense stationary process and the channel is WSSUS. Since such an input is not very common, the corresponding error expression may not be as useful as in the dual case. Thus, we present a different derivation which yields an upper bound on the average magnitude squared error for the case of arbitrarily specified $z(t)$ and a WSSUS channel. We first express $M(t, f)\, e^{-j2\pi \nu_H t}$ in a finite Taylor series,

$$M(t, f)\, e^{-j2\pi \nu_H t} = \sum_{n=0}^{N-1} M_n(f)\,(2\pi j)^n (t - t_i)^n + \frac{1}{N!}\,(t - t_i)^N \left[\frac{\partial^N}{\partial t^N} \left\{ M(t, f)\, e^{-j2\pi \nu_H t} \right\}\right]_{t = t'} \qquad (222)$$

where $t'$ lies between $t_i$ and $t$. Then using (212) and (222) in (213) we obtain the finite series representation of the output spectrum

$$W(f) = \sum_{n=0}^{N-1} M_n(f)\, D^n[Z(f - \nu_H)] + E_N(f) \qquad (223)$$

where the remainder term $E_N(f)$ is given by

$$E_N(f) = \frac{(2\pi j)^N}{N!} \iint (t - t_i)^N z(t)\, e^{j2\pi \nu_H t}\, \nu^N H(f, \nu + \nu_H)\, e^{j2\pi \nu t'}\, e^{-j2\pi f t}\, dt\, d\nu. \qquad (224)$$
It follows that

$$\overline{|E_N(f)|^2} = \frac{(2\pi)^{2N}}{(N!)^2} \iiiint (t - t_i)^N (s - t_i)^N z^*(t)\, z(s)\, e^{-j2\pi \nu_H (t - s)}\, \nu^N \mu^N\, \overline{H^*(f, \nu + \nu_H)\, H(f, \mu + \nu_H)}\, e^{-j2\pi(\nu t' - \mu t'')}\, e^{j2\pi f(t - s)}\, d\nu\, d\mu\, dt\, ds \qquad (225)$$

where $t''$ lies between $t_i$ and $s$. For the WSSUS channel,

$$\overline{H^*(f, \nu + \nu_H)\, H(f, \mu + \nu_H)} = P(\nu + \nu_H)\, \delta(\mu - \nu). \qquad (226)$$

Using (226) in (225), we find

$$\overline{|E_N(f)|^2} = \frac{(2\pi)^{2N}}{(N!)^2} \iiint (t - t_i)^N (s - t_i)^N z^*(t)\, z(s)\, e^{-j2\pi \nu_H (t - s)}\, \nu^{2N} P(\nu + \nu_H)\, e^{-j2\pi \nu (t' - t'')}\, d\nu\; e^{j2\pi f(t - s)}\, dt\, ds. \qquad (227)$$

Noting that the magnitude of an integral is less than the integral of the magnitude,

$$\overline{|E_N(f)|^2} \le \frac{(2\pi)^{2N}}{(N!)^2} \iiint |t - t_i|^N\, |s - t_i|^N\, |z(t)|\, |z(s)|\, P(\nu + \nu_H)\, \nu^{2N}\, d\nu\, dt\, ds$$

$$= \frac{(2\pi)^{2N}}{(N!)^2} \int (\nu - \nu_H)^{2N} P(\nu)\, d\nu \left[\int |t - t_i|^N |z(t)|\, dt\right]^2 = \frac{(2\pi T_i \beta_H)^{2N}}{(N!)^2} \int \nu^{2N} \hat{P}(\nu)\, d\nu \left[\int |t|^N |\hat{z}(t)|\, dt\right]^2 \qquad (228)$$

where $\hat{P}(\nu)$ is the Doppler Power Density Spectrum corresponding to a normalized Doppler-Spread Function $\hat{H}(f, \nu)$ which differs from $H(f, \nu)$ in being translated and scaled along the $\nu$ axis so that it has $\beta_H = 1$ and $\nu_H = 0$. Similarly, $\hat{z}(t)$ is a shifted scaled version of the input which has unit duration and is located at $t = 0$.

The channel model dual to that in Fig. 28 may be arrived at with the aid of the following expansion:

$$T(f, t) = e^{j2\pi \nu_G t} \sum_{n=0}^{\infty} T_n(f)\,(2\pi j)^n (t - t_0)^n \qquad (229)$$

where $t_0$ is a time instant about which the output may be assumed "centered" and $\nu_G$ is the value of $\nu$ about which $G(f, \nu)$ may be assumed "centered." The frequency function $T_n(f)$ is given by

$$T_n(f) = \left[\frac{1}{n!\,(2\pi j)^n}\, \frac{\partial^n}{\partial t^n} \left\{ T(f, t)\, e^{-j2\pi \nu_G t} \right\}\right]_{t = t_0} = \frac{1}{n!} \int \nu^n G(f, \nu + \nu_G)\, e^{j2\pi \nu t_0}\, d\nu. \qquad (230)$$

Use of (229) in (19) leads to the following expansion for the output time function:

$$w(t) = e^{j2\pi \nu_G t} \sum_{n=0}^{\infty} (2\pi j)^n (t - t_0)^n \int T_n(f)\, Z(f)\, e^{j2\pi f t}\, df \qquad (231)$$

from which we readily infer the channel model shown in Fig. 30.

Fig. 30-t-power series model, input filter version.

When the output time function is limited to a finite interval of duration $T_o$ sec and $G(f, \nu)$ is zero for values of $\nu$ outside an interval of duration $\beta_G$ cps, one may show that the series (231) will be rapidly convergent if

(232)

Even if $w(t)$ is not time-limited and $G(f, \nu)$ is not $\nu$-limited, (232) will still be a satisfactory convergence criterion if the "tails" of $w(t)$ and $G(f, \nu)$ (as a function of $\nu$) drop to zero sufficiently rapidly.

When the channel is randomly time-variant the filter transfer functions in Fig. 30, like those in Fig. 29, become random processes in the frequency variable. Relationships analogous to those in (215) to (228) are readily constructed for the model in Fig. 30. In particular, for the WSSUS channel the correlation properties of the random filters in Fig. 30 become identical to those of the random filters in Fig. 29.

3) ft- and tf-Power Series Models: In this section we present two channel models, one arising from an expansion of $T(f, t)$ and the other from an expansion of $M(t, f)$ in the $t$ and $f$ variables. As in the previous power series models, it is desirable to remove mean path delays and Doppler shifts before expanding these functions. Thus, we define

(233)

where $\xi_0$, $\nu_0$ are a mean delay and Doppler shift defined as the values of $\xi$ and $\nu$ about which the Delay-Doppler-Spread Function $U(\xi, \nu)$ may be assumed "centered", i.e., $U(\xi + \xi_0, \nu + \nu_0)$ is "centered" at $\xi = \nu = 0$.

Since in $T(f, t)$ the variable $f$ is associated directly with the spectrum of the input signal and the variable $t$ is associated with the output time function, we expand $T_{00}(f, t)$ in the following double power series:

$$T_{00}(f, t) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} T_{mn}\,(2\pi j)^{m+n} (f - f_i)^m (t - t_0)^n \qquad (234)$$

where

(235)
Using (234) in (19), we find that the output time function is represented by the series

$$w(t) = e^{j2\pi \nu_0 t} \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} T_{mn}\,(2\pi j)^n (t - t_0)^n\, D^m[z(t - \xi_0)]. \qquad (236)$$

It is readily seen by examination of (236) that an appropriate channel model whose output is given by (236) may be obtained by using the f-Power Series Model of Fig. 27 with each gain function represented by the t-Power Series Model of Fig. 30. We leave it to the reader to sketch this channel model, which we shall call the ft-Power Series Channel Model.

We may make statements concerning the convergence properties which are similar to those pertinent to the output series expansions of the t- and f-Power Series Models. However, since the ft-Power Series Model is, in essence, both a t- and an f-Power Series Model, the convergence requirements of both models need to be imposed. Thus, it is readily shown that when $U(\xi, \nu)$ is zero outside a rectangle whose sides are $\beta_0$ cps long in the $\nu$ direction and $\Delta_0$ sec long in the $\xi$ direction and when $z(t)$ is band-limited to a bandwidth $B_i$ cps, a sufficient condition for convergence of (236) is the satisfaction of the inequalities

$$2\pi\, |t - t_0|\, \beta_0 \ll 1 \qquad (237)$$

(238)

It is clear from (237) that the series (236) may not converge for all values of $t$. However, if the significant values of the output are confined to an interval of duration $T_0$ one may change (237) to the inequality

(239)

If the finite Taylor series

$$T_{00}(f, t) = \sum_{m=0}^{M} \sum_{n=0}^{N} T_{mn}\,(2\pi j)^{m+n} (f - f_i)^m (t - t_0)^n$$

is used in (19), one readily finds that for the WSSUS channel the average magnitude squared error incurred by using terms up to $m = M - 1$ and $n = N - 1$ is bounded by

$$\overline{|E_{MN}|^2} \le \frac{(2\pi T_0 \beta_0)^{2N} (2\pi B_i \Delta_0)^{2M}}{(M!)^2 (N!)^2} \left[\int |f|^M\, |\hat{Z}(f)|\, df\right]^2 \iint \xi^{2M} \nu^{2N}\, \hat{S}(\xi, \nu)\, d\xi\, d\nu \qquad (241)$$

where $\hat{Z}(f)$ is a shifted scaled version of the input spectrum with unit bandwidth and located at zero frequency, and $\hat{S}(\xi, \nu)$ is the Scattering Function [see (74)] associated with a shifted scaled version of $U(\xi, \nu)$ which is zero outside a unit square centered at $\xi = \nu = 0$. It is assumed in (241) that only those values of $t$ are of interest for which $|t - t_0| \le T_0$.

A discussion entirely dual to the one above may be formulated by expanding $M(t, f)$ rather than $T(f, t)$ in a Taylor series and using the resultant series to derive a series expansion for the output spectrum. Because the analytical procedure is identical to that above, except for a replacement of functions and variables by their duals, we shall not present these derivations. We note, however, that the resulting channel model, which we call the tf-Power Series Model, may be obtained by using the t-Power Series Model of Fig. 29, with each filter represented by the f-Power Series Model of Fig. 28.
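As a numerical illustration of how a truncated f-power series such as (197) behaves, the following sketch may help. It is not taken from the paper. It assumes a time-invariant channel made of three discrete paths (hypothetical gains and delays), takes $f_i = 0$, interprets $D^n$ as the $n$th time derivative, and uses a power-weighted mean delay as a discrete stand-in for (192). Under those assumptions the coefficients reduce to $T_n = \frac{(-1)^n}{n!}\sum_k a_k(\tau_k - \xi_g)^n$, and the one-term case corresponds to the flat-fading model (194).

```python
import math


def mean_delay(gains, delays):
    """Power-weighted centroid of the path delays (discrete analogue of (192))."""
    powers = [a * a for a in gains]
    return sum(p * d for p, d in zip(powers, delays)) / sum(powers)


def series_coefficients(gains, delays, xi_g, n_terms):
    """T_n = ((-1)^n / n!) * sum_k a_k (tau_k - xi_g)^n, assuming f_i = 0."""
    return [
        (-1) ** n / math.factorial(n)
        * sum(a * (d - xi_g) ** n for a, d in zip(gains, delays))
        for n in range(n_terms)
    ]


def tone_derivative(f0, t, n):
    """n-th time derivative of the test input z(t) = cos(2*pi*f0*t)."""
    w = 2.0 * math.pi * f0
    return w ** n * math.cos(w * t + n * math.pi / 2.0)


def exact_output(gains, delays, f0, t):
    """Exact channel output: a sum of delayed, scaled copies of the input tone."""
    return sum(a * math.cos(2.0 * math.pi * f0 * (t - d))
               for a, d in zip(gains, delays))


def series_output(gains, delays, f0, t, n_terms):
    """Truncated series: sum over n < n_terms of T_n * z^(n)(t - xi_g)."""
    xi_g = mean_delay(gains, delays)
    coeffs = series_coefficients(gains, delays, xi_g, n_terms)
    return sum(c * tone_derivative(f0, t - xi_g, n)
               for n, c in enumerate(coeffs))


def max_truncation_error(gains, delays, f0, n_terms, samples=200):
    """Largest absolute error over one period of the input tone."""
    period = 1.0 / f0
    return max(
        abs(exact_output(gains, delays, f0, k * period / samples)
            - series_output(gains, delays, f0, k * period / samples, n_terms))
        for k in range(samples)
    )


# Hypothetical channel: delay spread of 2.5 microseconds, 10 kc input tone,
# so that 2*pi*f0*(delay spread) is well below 1.
GAINS = [1.0, 0.6, 0.3]
DELAYS = [0.0, 1.0e-6, 2.5e-6]
F0 = 10.0e3
ERRORS = [max_truncation_error(GAINS, DELAYS, F0, n) for n in (1, 2, 3, 4)]
```

With the product of bandwidth and delay spread small, each added term should reduce the error, in the spirit of the convergence criteria discussed in this section; with a larger product the truncated series need not improve.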
On the Optimum Detection of Digital Signals in the Presence of White Gaussian Noise - A Geometric Interpretation and a Study of Three Basic Data Transmission Systems*

E. ARTHURS†, MEMBER, IRE, AND
Summary-This paper considers the problem of optimally detecting digital waveforms in the presence of additive white Gaussian noise. A technique for representing the transmitted signals and the additive noise which leads to a geometric interpretation of the detection problem is presented on a tutorial level. Subsequently, this technique is used to derive the optimum detector for each of three basic data transmission systems: m-level Phase Shift Keyed, m-level Amplitude Shift Keyed and m-level Frequency Shift Keyed. Corresponding probability of error curves are derived, compared and discussed with reasonable detail.
I. INTRODUCTION

THIS PAPER WAS written to fulfill two objectives. The first objective is tutorial; the paper is intended to serve as an introduction to some of the ideas of modern statistical communication theory; in particular, the problem of detecting a known signal in a white noise background with minimum probability of error is introduced. The second objective is to analyze and compare in detail the performance of three basic data transmission systems, namely, m-level Phase Shift Keyed, m-level Amplitude Shift Keyed and m-level (orthogonal) Frequency Shift Keyed.

The approach adopted within this paper stresses the geometric viewpoint. Specifically, advantage is taken of the fact that in the white noise case it is possible to choose a convenient (orthonormal) representation for the transmitted signals and yet still be guaranteed that the noise can be decomposed suitably (i.e., as described in Theorem II). This freedom of choice in signal representation leads in a natural way to a geometric interpretation of the detection problem which is both analytically correct and intuitively plausible. Furthermore, it is felt that the presented approach which, strictly speaking, is only applicable to the white noise case can serve as a useful
* Received June 15, 1962. This work has been supported by the Mitre Corporation, Bedford, Mass., under Contract No. AF 33(600) 39852.
† Bell Telephone Laboratories, Murray Hill, N. J. On leave from Dept. of Elec. Engrg., M. I. T. Research Laboratory of Electronics, Cambridge, Mass. Formerly Consultant to the Mitre Corporation, Bedford, Mass.
‡ M. I. T. Dept. of Mathematics and Research Laboratory of Electronics, Cambridge, Mass. On leave from the Mitre Corporation, Bedford, Mass.
H. DYM‡, MEMBER, IRE
introduction to the more general treatment wherein the transmitted signals are represented in terms of a Karhunen-Loève expansion.1,2

The concepts developed in the early portions of the paper are used subsequently to derive the optimum detector for each of three systems mentioned above (both under the assumption of phase coherence and phase incoherence) and to derive the corresponding expressions for the probability of error. Several sets of curves are presented and discussed in reasonable detail in the final sections of the paper.

In the course of the presentation, references to additional articles and books on related subject matter are cited. However, appreciating the difficulty involved in sifting through many references each with its own peculiar notation, an effort has been made to write this paper as a complete unit. Accordingly, theoretical results which are utilized within the body of this paper without proof are discussed at length in the appendixes, which, with the exception of Appendix II, do not depend critically on outside source material.

The treatment of Section V, which is concerned with deriving the probability of error for the systems under consideration, is somewhat different. The objective therein is to present a complete description of the calculations involved and the types of estimation which can be resorted to. It is believed that some of the presented results are new, although this is difficult to ascertain without an extensive search of literature. Principally, however, it is felt that the value of this paper lies in the presentation; considerable insight into the significant factors which contribute to error is gained by the geometric approach emphasized. References to some alternate techniques for calculating the probability of error are presented where thought to be of interest or where they have been of direct help to the authors. We remark that, as pointed out in the text, Sections V-A to V-F may be skipped by the reader without loss of continuity.

1 W. B. Davenport and W. L. Root, "An Introduction to the Theory of Random Signals and Noise," McGraw-Hill Book Co., Inc., New York, N. Y., pp. 96-99, 338-345; 1958.
2 C. W. Helstrom, "Statistical Theory of Signal Detection," Pergamon Press, Inc., New York, N. Y., pp. 95-109; 1960.
Reprinted from IEEE Transactions on Communications Systems, December, 1962. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
BASIC GEOMETRIC CONCEPTS
TRANSMITTER
I ;-SSAGE- SOURCE I _,
A. Discussion of ASSU1ned Model The analysis of data transmission systems is commonly based on the following modeL There is assumed to exist a message source generating a stream of equally likely messages, M 1 , M 2 , ••• , M m , into a waveform generator having available an alphabet of m distinct waveforms, Sl(t), S2(t), ... , Sm(t), each of duration T (and necessarily finite energy). One waveform is transmitted every T seconds, the choice of waveform depending in some fashion on the incoming message and possibly on the waveforms transmitted in preceding time slots. The medium coupling the transmitter to the receiver is assumed to add stationary-white-zero mean-Gaussian noise to the transmitted signal but otherwise is assumed to be distortion free. It is generally further assumed that the receiver is time synchronized with the transmitter (synchronous detection). Sometimes it is also assumed that the receiver is phase locked to the transmitter (coherent detection). In this report we shall always assume time synchronism but shall distinguish between coherent and incoherent detection. The problem we are generally interested in solving, given this model (see Fig. 1), is how to design the receiver so that it makes as few errors as possible. Furthermore, assuming that an optimum receiver (optimum in the sense that it will make fewer errors in the long run than any other receiver) is constructed, we are interested in calculating its error rate.
Fig. 1-Idealized model of data transmission system.

B. Geometric Representation of a Known Set of Waveforms

One purpose of this report is to point out that all problems of the type mentioned above may be transformed into geometric problems with considerable simplification of detail. Basic to the geometric viewpoint are two theorems, the first of which we shall now state.

Theorem I

Any finite set of physically realizable waveforms of duration $T$, say $S_1(t), S_2(t), \ldots, S_m(t)$, may be expressed as a linear combination of $k$ orthonormal waveforms $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$; that is,

$$S_i(t) = \sum_{j=1}^{k} a_{ij}\, \varphi_j(t), \qquad i = 1, 2, \ldots, m \qquad (1)$$

where the $a_{ij}$ are real numbers given by the formula

$$a_{ij} = \int_0^T S_i(t)\, \varphi_j(t)\, dt \qquad (2)$$

and the $\varphi_j(t)$, $j = 1, 2, \ldots, k$, are waveforms having the property (definition of orthonormal) that

$$\int_0^T \varphi_i(t)\, \varphi_j(t)\, dt = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j. \end{cases} \qquad (3)$$

The proof of this theorem is presented in Appendix I. Note that the conventional Fourier Series expansion of a waveform of duration $T$ is an example of a particular expansion of this type. There are, however, two very important distinctions we wish to make.
1) The form of the $\varphi_j(t)$ has not been specified. That is to say, we have not confined the expansion to be in terms of sinusoids and cosinusoids.
2) The expansion of $S_i(t)$ in terms of a finite number of terms is not an approximation wherein only the first $k$ terms are significant but rather an exact expression where $k$ and only $k$ terms are significant. The number $k$, incidentally, is referred to as the dimension of the signal alphabet of waveforms.
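One way to see Theorem I at work is to build an orthonormal set numerically. The sketch below is only an illustration, not the construction of Appendix I: it applies the Gram-Schmidt procedure to sampled waveforms, with integrals replaced by Riemann sums. The three test waveforms, the sample count and the tolerance `tol` are hypothetical choices; the third waveform is deliberately a linear combination of the first two, so the dimension found should be $k = 2$ even though $m = 3$.

```python
import math


def inner(u, v, dt):
    """Riemann-sum approximation of the integral of u(t) * v(t) over [0, T]."""
    return sum(a * b for a, b in zip(u, v)) * dt


def gram_schmidt(signals, dt, tol=1e-9):
    """Return sampled orthonormal waveforms spanning the given signals."""
    basis = []
    for s in signals:
        residual = list(s)
        # Remove the components along the waveforms already found.
        for phi in basis:
            c = inner(residual, phi, dt)
            residual = [r - c * p for r, p in zip(residual, phi)]
        energy = inner(residual, residual, dt)
        # Keep the residual only if it adds a new dimension.
        if energy > tol:
            norm = math.sqrt(energy)
            basis.append([r / norm for r in residual])
    return basis


def coordinates(signal, basis, dt):
    """Coefficients a_ij of (2), one per orthonormal waveform."""
    return [inner(signal, phi, dt) for phi in basis]


def reconstruct(coeffs, basis):
    """Rebuild a waveform from its coefficients, as in (1)."""
    n = len(basis[0])
    return [sum(a * phi[i] for a, phi in zip(coeffs, basis)) for i in range(n)]


T = 1.0
N = 1000
DT = T / N
s1 = [1.0] * N                                        # constant pulse
s2 = [1.0 if i < N // 2 else 0.0 for i in range(N)]   # half-width pulse
s3 = [a - 2.0 * b for a, b in zip(s1, s2)]            # dependent on s1 and s2
SIGNALS = [s1, s2, s3]
BASIS = gram_schmidt(SIGNALS, DT)
COORDS = [coordinates(s, BASIS, DT) for s in SIGNALS]
```

Here `len(BASIS)` plays the role of $k$, and `reconstruct(COORDS[i], BASIS)` should return the samples of the $i$th waveform up to rounding, which is the sense in which the expansion is exact rather than approximate.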
From the form of the expansion

$$S_i(t) = \sum_{j=1}^{k} a_{ij}\, \varphi_j(t), \qquad i = 1, 2, \ldots, m, \qquad (4)$$

it is apparent that each signal waveform may actually be specified uniquely in terms of the coefficients of the $\varphi_j(t)$ ($j = 1, \ldots, k$). Thus, we can represent $S_i(t)$ by the $k$-tuple $(a_{i1}, a_{i2}, \ldots, a_{ik})$. Furthermore, if we conceptually extend our conventional notion of 2 and 3 dimensional Euclidean spaces to a $k$-dimensional Euclidean space, we can think of the numbers $a_{i1}, a_{i2}, \ldots, a_{ik}$ as the $k$ coordinate projections of the signal point $S_i$ on a $k$-dimensional Euclidean space. Thus, for example, if $k = 3$ we may plot the point $S_i$ corresponding to the waveform

$$S_i(t) = a_{i1}\varphi_1(t) + a_{i2}\varphi_2(t) + a_{i3}\varphi_3(t)$$

as a point in a 3-dimensional Euclidean space with coordinates $(a_{i1}, a_{i2}, a_{i3})$ as shown in Fig. 2. We shall subsequently refer to the $k$-dimensional space on which $S_i$ is plotted as the signal space.

Fig. 2-Geometric representation of the signal waveform $S_i(t)$.

There are some interesting relationships between the energy content of a signal and the distance between a signal point and the origin of the signal space. The distance between a pair of points, $X$, $Y$, with coordinates $(x_1, x_2, \ldots, x_k)$ and $(y_1, y_2, \ldots, y_k)$, respectively, is given by the formula

$$d(X, Y) = \sqrt{\sum_{j=1}^{k} (x_j - y_j)^2}. \qquad (5)$$

It follows readily that the distance between a signal point $S_i$ with coordinates $(a_{i1}, a_{i2}, \ldots, a_{ik})$ and the origin of the signal space is given by

$$d(S_i, 0) = \sqrt{\sum_{j=1}^{k} a_{ij}^2}. \qquad (6)$$

Now, by (4), we may write

$$\int_0^T S_i^2(t)\, dt = \int_0^T \left[\sum_{j=1}^{k} a_{ij}\, \varphi_j(t)\right]^2 dt$$

but since the $\varphi_j(t)$ are orthonormal [see (3)], this latter equation reduces simply to

$$\int_0^T S_i^2(t)\, dt = \sum_{j=1}^{k} a_{ij}^2. \qquad (7)$$

That is to say, the energy content of $S_i(t)$, $E_i$, is equal to

$$E_i = d^2(S_i, 0) = \sum_{j=1}^{k} (a_{ij})^2. \qquad (8)$$

It may similarly be shown that

$$\int_0^T [S_i(t) - S_v(t)]^2\, dt = \sum_{j=1}^{k} (a_{ij} - a_{vj})^2. \qquad (9)$$

Though not essential to the development we might point out an interesting sidelight. Namely, if we consider the plane formed by lines joining the signal points $S_i$, $S_v$ ($i \ne v$) and the origin, then by the law of cosines we may express $\cos\theta$ (see Fig. 3) as

$$\cos\theta = \frac{d^2(0, S_i) + d^2(0, S_v) - d^2(S_i, S_v)}{2\, d(0, S_i)\, d(0, S_v)}. \qquad (10)$$

Fig. 3-Planar section in the signal space determined by the origin and the signal points $S_i$ and $S_v$, $i \ne v$.

If, in particular, $\theta = \pi/2$, then $\cos\theta = 0$ and (10) reduces to

$$d^2(0, S_i) + d^2(0, S_v) - d^2(S_i, S_v) = 0.$$

It follows, therefore, from (6) and (7) that

$$\int_0^T S_i^2(t)\, dt + \int_0^T S_v^2(t)\, dt - \int_0^T (S_i - S_v)^2\, dt = 0$$

which, however, may be simplified to yield

$$\int_0^T S_i(t)\, S_v(t)\, dt = 0.$$

Thus, we can conclude that if the signal points corresponding to the pair of waveforms $S_i(t)$ and $S_v(t)$ are orthogonal to each other, the angle between a pair of signal points being defined as in Fig. 3, then

$$\int_0^T S_i(t)\, S_v(t)\, dt = 0 \qquad (i \ne v).$$

If, further, each of the waveforms $S_i(t)$, $i = 1, 2, \ldots, m$, is suitably scaled, that is, normalized, so that

$$\int_0^T S_i^2(t)\, dt = 1$$

for $i = 1, 2, \ldots, m$, then the set of waveforms $S_i(t)$, $i = 1, 2, \ldots, m$, is termed an orthonormal set. [See also (3).] Thus, we see that there is a rather simple geometric interpretation which can be given to the notion of orthonormal waveforms.

C. Detection of Signals in the Presence of Noise

Now, returning to the main development, we wish to point out that the coefficients

$$a_{ij} = \int_0^T S_i(t)\, \varphi_j(t)\, dt \qquad (2)$$

may be calculated electrically by a series of product integrators (properly synchronized to the waveform generators) as shown in Fig. 4. Such a series of product integrators can, in fact, be used as the first stage of a detector in a data transmission system. The function of the second stage or decision stage, as we shall term it, is then to decide, on the basis of the $k$ outputs of the product integrators, what signal was actually sent.
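The product integrators of (2), together with the energy and distance relations (7) to (10), can be checked numerically. The sketch below is illustrative only and makes several assumptions that are not in the paper: the orthonormal pair is taken to be one cycle of a cosine and a sine over $[0, T]$, integrals are replaced by midpoint sums, and the two test waveforms (coordinates $(3, 4)$ and $(-4, 3)$) are hypothetical, chosen so that the corresponding signal points are at right angles.

```python
import math

T = 1.0
N = 2000
DT = T / N
TIMES = [(i + 0.5) * DT for i in range(N)]  # midpoint samples of [0, T]


def integrate(values):
    """Midpoint-rule approximation of an integral over [0, T]."""
    return sum(values) * DT


# Assumed orthonormal waveforms: one cycle of cosine and sine, scaled by sqrt(2/T).
PHI1 = [math.sqrt(2.0 / T) * math.cos(2.0 * math.pi * t / T) for t in TIMES]
PHI2 = [math.sqrt(2.0 / T) * math.sin(2.0 * math.pi * t / T) for t in TIMES]


def waveform(a1, a2):
    """Signal with signal-space coordinates (a1, a2)."""
    return [a1 * p1 + a2 * p2 for p1, p2 in zip(PHI1, PHI2)]


def product_integrators(signal):
    """Outputs of the two product integrators, i.e. the coefficients of (2)."""
    return (
        integrate([s * p for s, p in zip(signal, PHI1)]),
        integrate([s * p for s, p in zip(signal, PHI2)]),
    )


def energy(signal):
    """Integral of the squared waveform, the left side of (7)."""
    return integrate([s * s for s in signal])


def squared_distance(u, v):
    """Integral of the squared difference, the left side of (9)."""
    return integrate([(a - b) ** 2 for a, b in zip(u, v)])


S_I = waveform(3.0, 4.0)
S_V = waveform(-4.0, 3.0)
COORDS_I = product_integrators(S_I)
E_I, E_V = energy(S_I), energy(S_V)
D2 = squared_distance(S_I, S_V)
# Law of cosines (10), with d^2(0, S) equal to the energy by (8).
COS_THETA = (E_I + E_V - D2) / (2.0 * math.sqrt(E_I * E_V))
```

Under these assumptions the integrator outputs should reproduce the coordinates $(3, 4)$, both energies should equal $3^2 + 4^2 = 25$, the squared distance should equal $7^2 + 1^2 = 50$, and `COS_THETA` should vanish, which is the orthogonality condition derived above.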
The decision problem is complicated by the fact that the transmitted signal is perturbed by noise. (We are assuming, for the present, coherent detection.) Typically, the noise is assumed to be additive white-stationary-zero mean-Gaussian, the reasons for this assumption being that

1) it makes calculations more tractable, and
2) it is a reasonable description of the type of noise present in many communication channels.

We shall now outline, briefly, the meaning of each of the terms used in the description of the noise. Classifying the noise as additive implies simply that the received signal, which we shall designate as $z(t)$, consists of a noise term in addition to the originally transmitted signal. That is, if $S_i(t)$ was transmitted, the received signal

$$z(t) = S_i(t) + n(t). \qquad (11)$$

Fig. 4-Set of product integrators which may be used to calculate the signal space coordinates of the signal $S_i(t)$.

Correspondingly, the output of the $j$th product integrator equals

$$\int_0^T z(t)\, \varphi_j(t)\, dt = a_{ij} + n_j, \qquad j = 1, 2, \ldots, k \qquad (12)$$

where

$$a_{ij} = \int_0^T S_i(t)\, \varphi_j(t)\, dt \qquad (2)$$

$$n_j = \int_0^T n(t)\, \varphi_j(t)\, dt. \qquad (13)$$
The amplitude of the term $n_j$ will be dependent upon the particular noise sample which perturbed the transmitted signal waveform. Since there is an infinite number of such possible noise samples, each of which could have perturbed the transmitted signal (see Fig. 5), there is a correspondingly infinite number of values which the term $n_j$ can take on. Accordingly, the amplitude of $n_j$ cannot be specified in advance and can at best be described in a probabilistic sense.

Fig. 5-Samples of possible noise voltages which might be superimposed on the transmitted signal.

The fact that the noise is stationary tells us that the statistics of the noise are independent of the particular time we choose to start transmitting data. In particular, referring to Fig. 5, the choice of the point $t = 0$ is arbitrary as far as the noise is concerned since all joint probability density functions will depend only upon time differences and not upon the actual values of time with respect to some absolute reference. If, in particular, the noise model used is assumed to be stationary Gaussian with zero mean, it may be shown (Appendix II) that the probability density function of the noise perturbation $n_j$ is Gaussian with zero mean. That is to say, the probability that $a \le n_j < b$ is equal to

$$P(a \le n_j < b) = \frac{1}{\sqrt{2\pi}\, \sigma_j} \int_a^b e^{-x^2 / 2\sigma_j^2}\, dx, \qquad j = 1, 2, \ldots, k. \qquad (14)$$
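The integral in (14) has no elementary closed form, but it can be written in terms of the error function. The sketch below is an illustration only; the function names and the value of `SIGMA` are hypothetical. It evaluates (14) with `math.erf` and, as a cross-check, by direct midpoint integration of the Gaussian density.

```python
import math


def gaussian_interval_probability(a, b, sigma):
    """P(a <= n_j < b) for a zero-mean Gaussian with standard deviation sigma."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    return cdf(b) - cdf(a)


def gaussian_interval_numeric(a, b, sigma, steps=20000):
    """Same probability by midpoint integration of the density in (14)."""
    dx = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx
        total += math.exp(-x * x / (2.0 * sigma * sigma))
    return total * dx / (math.sqrt(2.0 * math.pi) * sigma)


SIGMA = 0.5  # hypothetical standard deviation of the noise perturbation
P_CLOSED = gaussian_interval_probability(-SIGMA, SIGMA, SIGMA)
P_NUMERIC = gaussian_interval_numeric(-SIGMA, SIGMA, SIGMA)
```

For the interval from $-\sigma_j$ to $+\sigma_j$ both routes should give the familiar value of about 0.6827, independent of the particular $\sigma_j$ chosen.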
Evaluation of the integral described in (14) requires knowledge of the quantity $\sigma_j^2$, that is, the variance of the noise perturbation $n_j$. Since the noise is stationary, the variance of each noise perturbation is determined by the spectral density. In fact, if the noise is specified to be white, which implies that the spectral density $W(f)$ equals $N_0$ watts/cps for all $f$ (positive and negative), it may be shown (Appendix II) that the variance of $n_j$ is

$$\sigma_j^2 = N_0, \qquad j = 1, 2, \ldots, k.$$

It may further be shown (Appendix II) that the perturbations are independent. That is, the probability of the joint event that, say, $a_1 \le n_1 < b_1$ and $a_2 \le n_2 < b_2$ ... and $a_k \le n_k < b_k$ is equal to simply the product of the probabilities of the individual events:

$$P(a_1 \le n_1 < b_1,\ a_2 \le n_2 < b_2,\ \ldots,\ a_k \le n_k < b_k) = P(a_1 \le n_1 < b_1)\, P(a_2 \le n_2 < b_2) \cdots P(a_k \le n_k < b_k). \qquad (15)$$
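The variance and independence statements above can be made plausible with a small Monte Carlo experiment. The sketch below is not from the paper and rests on stated assumptions: white noise of two-sided density $N_0$ is approximated by independent Gaussian samples of variance $N_0 / \Delta t$, the orthonormal pair is one cycle of cosine and sine over $[0, T]$, and the product integrators are replaced by sums. The seed, trial count and value of `N0` are arbitrary. Note that the experiment only estimates the correlation between the two outputs; for jointly Gaussian variables zero correlation is equivalent to independence.

```python
import math
import random


def sampled_basis(n_samples, duration):
    """Midpoint samples of an assumed orthonormal cosine/sine pair on [0, T]."""
    dt = duration / n_samples
    times = [(i + 0.5) * dt for i in range(n_samples)]
    scale = math.sqrt(2.0 / duration)
    phi1 = [scale * math.cos(2.0 * math.pi * t / duration) for t in times]
    phi2 = [scale * math.sin(2.0 * math.pi * t / duration) for t in times]
    return dt, phi1, phi2


def simulate_projections(n0, trials, n_samples=50, duration=1.0, seed=1):
    """Draw noise records and return the pairs (n_1, n_2) defined by (13)."""
    rng = random.Random(seed)
    dt, phi1, phi2 = sampled_basis(n_samples, duration)
    # Discrete stand-in for white noise of two-sided density n0.
    sigma_sample = math.sqrt(n0 / dt)
    pairs = []
    for _ in range(trials):
        noise = [rng.gauss(0.0, sigma_sample) for _ in range(n_samples)]
        n1 = sum(x * p for x, p in zip(noise, phi1)) * dt
        n2 = sum(x * p for x, p in zip(noise, phi2)) * dt
        pairs.append((n1, n2))
    return pairs


def summarize(pairs):
    """Sample variances of n_1 and n_2 and their correlation coefficient."""
    count = len(pairs)
    mean1 = sum(p[0] for p in pairs) / count
    mean2 = sum(p[1] for p in pairs) / count
    var1 = sum((p[0] - mean1) ** 2 for p in pairs) / count
    var2 = sum((p[1] - mean2) ** 2 for p in pairs) / count
    cov = sum((p[0] - mean1) * (p[1] - mean2) for p in pairs) / count
    return var1, var2, cov / math.sqrt(var1 * var2)


N0 = 2.0  # hypothetical spectral density
VAR1, VAR2, CORR = summarize(simulate_projections(N0, trials=4000))
```

With a few thousand trials the two sample variances should lie within a few percent of `N0` and the correlation coefficient should be close to zero; the exact figures depend on the seed.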
We wish to point out that the $n_1, n_2, \ldots, n_k$ do not serve to completely characterize the noise but only that portion of the noise which interacts with the product integrators. That is, $n(t)$ cannot be expanded simply in terms of the $\varphi_j(t)$ alone but, rather, must be expressed as

$$n(t) = n_1\varphi_1(t) + n_2\varphi_2(t) + \cdots + n_k\varphi_k(t) + h(t) \qquad (16)$$

where $h(t)$ is a sort of remainder term which must be included on the right to preserve the equality. [Contrast this with the expansion of $S_i(t)$ in (1).] Utilizing the fact that the $\varphi_j(t)$ are orthonormal and that

$$n_j = \int_0^T n(t)\, \varphi_j(t)\, dt, \qquad j = 1, 2, \ldots, k \qquad (13)$$

it may be deduced from (16) that

$$\int_0^T h(t)\, \varphi_j(t)\, dt = 0, \qquad j = 1, 2, \ldots, k. \qquad (17)$$
This is, of course, no more than the statement that $h(t)$ does not have any components on the signal space. The results of the preceding few pages may be summarized as the second basic theorem.

Theorem II

Given a set of orthonormal waveforms, $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$, which characterize a signal space and a stationary-white-zero mean-Gaussian noise source, $n(t)$, with spectral density $N_0$, the noise may be decomposed into two portions, the first, $n_1\varphi_1(t) + n_2\varphi_2(t) + \cdots + n_k\varphi_k(t)$, consisting of the projection of the noise on the signal space and the second consisting of that portion of the noise which is orthogonal to the signal space. [See (16) and (17).] The $n_j$, $j = 1, 2, \ldots, k$, which are defined by (13), are independent Gaussian random variables with zero mean and variance $N_0$.

That is to say, the $n_1, n_2, \ldots, n_k$ represent the $k$ coordinate projections of the noise on the signal space and represent that portion of the noise which will interfere with the detection process. The remaining portion of the noise [$h(t)$] may be thought of as being effectively tuned out by the detector.

III. COHERENT DETECTION

A. Statement of Detection Problem in Geometric Terms
Summarizing the results of Section II, we note that a received signal, $z(t)$, may be represented by a point in a Euclidean space of the appropriate dimension. The coordinates of the point are calculated by a series of product integrators which make up the first stage of our conceptual detector. Each coordinate, as may be deduced from (12), consists of two components: one due to the transmitted signal and the other due to the noise which has been superimposed on the signal in the channel coupling the transmitter to the receiver. The function of the decision stage of the detector is to guess which signal was transmitted from the position of the received "noisy" point.

We emphasize the fact that the best the detector can do in the presence of a statistical perturbation such as the additive noise model assumed is to guess at the transmitted message. As a consequence, one reasonable measure for the performance of a detector is the number of times it guesses wrong in a long typical sequence of messages. Or, more precisely, since by assumption the a priori probability for transmission of each signal waveform $S_i(t)$ is known, we can calculate for each detector the probability of making an error.

B. Optimum Decision Rule
In the coherent case, the coordinates of each possible transmitted signal may be calculated by the detector. Thus, m points, each of which corresponds to a transmitted signal, may be plotted in the detector signal space. We shall subsequently refer to these m points as the message points or the transmitted signal points. Note that the received signal point will be displaced from the transmitted signal point due to the addition of noise. Since, as may be deduced from the bell-shaped curve, small noise perturbations are much more likely than large ones in the Gaussian case, a reasonable decision rule to adopt is to assume that the signal whose message point lies closest to the received point was actually transmitted. In fact, in Appendix III the following theorem is verified.
Theorem III
If each signal waveform is transmitted with equal probability and if the received signal is perturbed by additive stationary-white-zero mean-Gaussian noise, then, for the case of coherent detection, that decision rule which selects the message point closest to the received point minimizes the probability of error.
A detector which embodies the decision rule of Theorem III is often referred to as a maximum likelihood detector. It should be noted, however, that a maximum likelihood detector will only minimize the probability of error when, as in this case, it is assumed that each possible signal waveform is transmitted with equal probability.3 We shall now illustrate this rule for three cases of practical interest: coherent Phase Shift Keyed, coherent Amplitude Shift Keyed and coherent Frequency Shift Keyed.
C. Coherent PSK
This modulation scheme is characterized by the fact that the information carried by the transmitted waveform is contained in the phase. A typical set of message waveforms is described by
$$S_i(t) = \begin{cases} \sqrt{\dfrac{2E}{T}} \cos\left(\omega_0 t + \dfrac{2\pi i}{m}\right), & 0 \le t \le T \\ 0, & \text{elsewhere} \end{cases} \qquad i = 1, 2, \ldots, m \tag{18}$$
3 For further discussion, see Davenport and Root, op. cit., pp. 317-324; R. M. Fano, "Transmission of Information," M.I.T. Press, Cambridge, Mass., and John Wiley and Sons, Inc., New York, N. Y., p. 184; 1961.
where $E$ is the energy content of $S_i(t)$ and $\omega_0 = 2\pi n_0 / T$ for some fixed integer $n_0$.
Now, recognizing that each $S_i(t)$ may be written in terms of a sinusoid and cosinusoid, which are orthogonal, and then suitably scaling to fulfill the conditions of (3), we conclude that the appropriate form for the orthonormal waveforms $\varphi_1(t)$ and $\varphi_2(t)$ to be used in the product integrators of Fig. 4 (alternatively, we could have used the techniques described in Appendix I) is
$$\varphi_1(t) = \sqrt{\frac{2}{T}} \cos \omega_0 t, \qquad \varphi_2(t) = \sqrt{\frac{2}{T}} \sin \omega_0 t. \tag{19}$$
The coordinates of the message points may be calculated by (2), (18) and (19).
$$a_{i1} = \int_0^T \sqrt{\frac{2E}{T}} \cos\left(\omega_0 t + \frac{2\pi i}{m}\right) \sqrt{\frac{2}{T}} \cos \omega_0 t \, dt = \sqrt{E} \cos \frac{2\pi i}{m}$$
$$a_{i2} = \int_0^T \sqrt{\frac{2E}{T}} \cos\left(\omega_0 t + \frac{2\pi i}{m}\right) \sqrt{\frac{2}{T}} \sin \omega_0 t \, dt = -\sqrt{E} \sin \frac{2\pi i}{m} \tag{20}$$
Note that for the particular case m = 2 [often termed phase reversal since $S_i(t) = \sqrt{2E/T} \sin(\omega_0 t \pm \pi/2)$], $a_{i2} = 0$. Accordingly, we can dispense with the second product integrator in the two-level case. The probability of error is calculated for various values of m in Section V. Curves and discussions are presented in Section VI.
Fig. 6-Optimum partitioning of detector signal space for a 4-level PSK coherent system.
D. Coherent ASK
In this modulation scheme the information carried by the transmitted waveform is contained in the amplitude. A typical set of message waveforms is described by
$$S_i(t) = \begin{cases} \sqrt{\dfrac{2E_i}{T}} \cos \omega_0 t, & 0 \le t \le T \\ 0, & \text{elsewhere} \end{cases} \qquad i = 1, 2, \ldots, m \tag{21}$$
where $E_i$ is the energy content of $S_i(t)$ and $\omega_0 = 2\pi n_0 / T$ for some fixed integer $n_0$. It should be clear that each transmitted waveform may be expanded in terms of the single orthonormal waveform
$$\varphi_1(t) = \sqrt{\frac{2}{T}} \cos \omega_0 t \tag{22}$$
and that
$$a_{i1} = \int_0^T \sqrt{\frac{2E_i}{T}} \cos \omega_0 t \, \sqrt{\frac{2}{T}} \cos \omega_0 t \, dt = \sqrt{E_i}. \tag{23}$$
Fig. 7-Optimum partitioning of detector signal space for a 3-level ASK coherent system.
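As a numerical cross-check of the message-point coordinates in (20) and (23), the product integrals can be approximated directly. The following is a minimal sketch, not part of the original paper: the function names, the midpoint-rule integrator and the values chosen for E, T and n0 are illustrative assumptions.

```python
import math

def product_integrate(signal, basis, T, steps=2000):
    """Midpoint-rule approximation of the integral of signal(t)*basis(t) over [0, T]."""
    dt = T / steps
    return sum(signal((k + 0.5) * dt) * basis((k + 0.5) * dt) for k in range(steps)) * dt

def psk_message_point(i, m, E=1.0, T=1.0, n0=5):
    """Coordinates (a_i1, a_i2) of the i-th PSK message point, following (18)-(20)."""
    w0 = 2 * math.pi * n0 / T  # carrier frequency, an integer number of cycles in T
    s = lambda t: math.sqrt(2 * E / T) * math.cos(w0 * t + 2 * math.pi * i / m)
    phi1 = lambda t: math.sqrt(2 / T) * math.cos(w0 * t)
    phi2 = lambda t: math.sqrt(2 / T) * math.sin(w0 * t)
    return product_integrate(s, phi1, T), product_integrate(s, phi2, T)

def ask_message_point(E_i, T=1.0, n0=5):
    """Single coordinate a_i1 of an ASK message point with energy E_i, following (21)-(23)."""
    w0 = 2 * math.pi * n0 / T
    s = lambda t: math.sqrt(2 * E_i / T) * math.cos(w0 * t)
    phi1 = lambda t: math.sqrt(2 / T) * math.cos(w0 * t)
    return product_integrate(s, phi1, T)
```

Under these assumptions, `psk_message_point(i, m)` should agree with $(\sqrt{E}\cos(2\pi i/m), -\sqrt{E}\sin(2\pi i/m))$ and `ask_message_point(E_i)` with $\sqrt{E_i}$, up to rounding error.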
The possible transmitted signal points are illustrated for the case m = 3 in Fig. 7. The signal space is partitioned into 3 distinct detection zones according to the techniques just discussed. Thus, for example, zone 2 consists of the set of points in the signal space which lie closer to $S_2$ than to $S_1$ or $S_3$. Probability of error calculations for this case (under the further assumption of average power limitations and uniform amplitude spacing starting with zero) are presented in Section V. Curves and discussion appear in Section VI.
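The zone partitioning described above amounts to choosing the message point nearest to the received point. A minimal sketch of that rule follows; the function name and the example constellations (unit spacing for ASK, E = 1 for PSK) are assumptions made for illustration only.

```python
import math

def nearest_message_point(received, message_points):
    """Return the 1-based index of the message point closest to `received`.

    Points are given as coordinate tuples; ties go to the lowest index.
    """
    return 1 + min(range(len(message_points)),
                   key=lambda k: math.dist(received, message_points[k]))

# Hypothetical 3-level ASK constellation with unit spacing, starting at zero.
ask_points = [(0.0,), (1.0,), (2.0,)]

# Hypothetical 4-level PSK constellation with E = 1, using the coordinates of (20).
psk_points = [(math.cos(2 * math.pi * i / 4), -math.sin(2 * math.pi * i / 4))
              for i in range(1, 5)]
```

For instance, `nearest_message_point((0.6,), ask_points)` selects zone 2, since 0.6 lies closer to 1.0 than to 0.0 or 2.0.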
E. Coherent FSK
This modulation scheme is characterized by the fact that the information carried by the transmitted signal is contained in the frequency. A typical set of signal waveforms is described by
$$S_i(t) = \begin{cases} \sqrt{\dfrac{2E}{T}} \cos \omega_i t, & 0 \le t \le T \\ 0, & \text{elsewhere} \end{cases} \tag{24}$$
where $E$ is the energy content of $S_i(t)$, $\omega_i = \dfrac{2\pi}{T}(n_0 + i)$ for some fixed integer $n_0$, and $i = 1, 2, \ldots, m$. Following the procedure of Appendix I or observing directly that the $S_i(t)$ are orthogonal (not orthonormal), it may be deduced that the most useful form for the orthonormal waveforms $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$ is
$$\varphi_j(t) = \sqrt{\frac{2}{T}} \cos \omega_j t, \qquad j = 1, 2, \ldots, k; \quad k = m. \tag{25}$$
Fig. 8-Optimum partitioning of detector signal space for a 3-level FSK coherent system.
Correspondingly,
$$a_{ij} = \int_0^T \sqrt{\frac{2E}{T}} \cos \omega_i t \, \sqrt{\frac{2}{T}} \cos \omega_j t \, dt = \begin{cases} \sqrt{E} & \text{if } i = j \\ 0 & \text{otherwise.} \end{cases} \tag{26}$$
That is to say that the ith signal point is located on the ith coordinate axis at a displacement of $\sqrt{E}$ from the origin of the signal space. It should also be noted that in this modulation scheme the distance between any two signal points $S_i$ and $S_j$ is constant, since by (5) and (26) we have
$$d(S_i, S_j) = \sqrt{2E}, \qquad i \ne j.$$
The detection rule is illustrated for the case m = 3 in Fig. 8. Calculations for the probability of error of an m-level FSK coherent modulation scheme are presented in Section V. Curves and discussion appear in Section VI.
F. Remarks
The procedure we have followed in the last three examples is to partition the detector signal space into m distinct regions, each region containing one and only one message point and consisting of those points in the signal space which are closer to the contained message point than to any other message point. The received signal point will (with probability one) fall into one, and only one, of these regions. The optimum decision rule (when the hypothesis of Theorem III is satisfied) is simply to identify the region in which the received signal point falls and assume that the signal corresponding to the contained message point was actually transmitted.
IV. INCOHERENT DETECTION
Incoherent systems differ from coherent systems in that no provisions have been made to phase synchronize the receiver with the transmitter. Accordingly, if the waveform
$$S(t) = \sqrt{\frac{2E}{T}} \cos(\omega_0 t + \phi) \tag{27}$$
is transmitted, the received signal z(t) will be of the form
$$z(t) = \sqrt{\frac{2E}{T}} \cos(\omega_0 t + \phi + \alpha) + n(t) \tag{28}$$
where the angle $\alpha$ is unknown and is usually considered to be a random variable uniformly distributed between 0 and $2\pi$. It may readily be deduced that the detection schemes presented previously are inadequate for the incoherent case, for if the received signal takes the form described by (28), the outputs of the product integrators will be functions of the unknown angle $\alpha$. We shall now discuss in turn the modifications which must be introduced to the PSK, ASK and FSK systems discussed previously.
A. PSK Incoherent
It should be clear from (28) that the presence of the random phase angle in the argument of the cosine prevents the receiver from deriving any information from the phase of the incoming signal alone, namely, $(\phi + \alpha)$. If, however, $\alpha$ varies slowly (that is, slowly enough so that it may be considered constant over the period of time required to transmit two waveforms, 2T), then the relative phase difference between two successive waveforms will be independent of $\alpha$ [i.e., $(\phi_1 + \alpha) - (\phi_2 + \alpha) = \phi_1 - \phi_2$]. Thus, if the detector was equipped with storage, it could measure the phase difference between successive signals regardless of the value of $\alpha$. This suggests that we modify the coding scheme at the transmitter as follows. To send the ith message (i = 1, 2, ..., m), phase advance the current signal waveform by $2\pi i/m$ radians over the previous waveform.
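The differential coding idea can be illustrated in the noise-free case. The sketch below is an assumption-laden illustration rather than the paper's receiver: it works directly on phase angles instead of waveforms, the function names are hypothetical, and the decoder simply rounds each measured phase difference to the nearest multiple of $2\pi/m$.

```python
import math

TWO_PI = 2 * math.pi

def differential_phases(messages, m, start_phase=0.0):
    """Phases of successive waveforms; message i advances the phase by 2*pi*i/m."""
    phases = [start_phase]
    for i in messages:
        phases.append((phases[-1] + TWO_PI * i / m) % TWO_PI)
    return phases

def decode_differences(received_phases, m):
    """Recover messages 1..m from the phase differences of successive waveforms."""
    decoded = []
    for prev, cur in zip(received_phases, received_phases[1:]):
        diff = (cur - prev) % TWO_PI
        k = round(diff / (TWO_PI / m)) % m
        decoded.append(m if k == 0 else k)  # an advance of 2*pi encodes message m
    return decoded

# An unknown but constant channel phase (here 1.234 rad) cancels in the differences.
sent = [1, 3, 4, 2, 2]
received = [(p + 1.234) % TWO_PI for p in differential_phases(sent, 4)]
recovered = decode_differences(received, 4)
```

Because only differences of successive phases are used, the same messages are recovered for any constant offset added to every received phase.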
Correspondingly, the detector should (at least, conceptually) calculate the coordinates of the incoming signals by product integrating them with the locally generated waveforms $\sqrt{2/T} \cos \omega_0 t$ and $\sqrt{2/T} \sin \omega_0 t$. It should then plot the received signal points and measure the angle between the currently received signal point and the previously received signal point which has been stored. It may be shown (see Appendix IV) that the best rule for the detector to follow is to quantize the measured angle in steps of $2\pi/m$ and guess that the corresponding message was transmitted. Thus, for example, if the ith message was transmitted, a pair of successively received signals $z_1(t)$ and $z_2(t - T)$ will be of the form
$$z_1(t) = \sqrt{\frac{2E}{T}} \cos(\omega_0 t + \alpha) + n(t) \tag{29a}$$
$$z_2(t - T) = \sqrt{\frac{2E}{T}} \cos\left(\omega_0 t + \alpha + \frac{2\pi i}{m}\right) + n(t - T) \tag{29b}$$
where the angle $\alpha$ is unknown and is assumed to be uniformly distributed over a $2\pi$ interval (symmetric with respect to some mean which may be unknown). The coordinates of the corresponding signal points $z_1$ and $z_2$, which we designate as $(X_1, Y_1)$ and $(X_2, Y_2)$, will be of the form
$$X_1 = \int_0^T z_1(t) \varphi_1(t) \, dt = \sqrt{E} \cos \alpha + n_{11} \tag{30a}$$
$$Y_1 = \int_0^T z_1(t) \varphi_2(t) \, dt = -\sqrt{E} \sin \alpha + n_{12} \tag{30b}$$
$$X_2 = \int_T^{2T} z_2(t - T) \varphi_1(t - T) \, dt = \sqrt{E} \cos\left(\alpha + \frac{2\pi i}{m}\right) + n_{21} \tag{30c}$$
$$Y_2 = \int_T^{2T} z_2(t - T) \varphi_2(t - T) \, dt = -\sqrt{E} \sin\left(\alpha + \frac{2\pi i}{m}\right) + n_{22} \tag{30d}$$
where $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$ are independent Gaussian random variables, each having zero mean and variance $N_0$. Suppose now that for some particular combination of received signals, the random variables $X_1, Y_1, X_2, Y_2$ take on the values $a_1, b_1, a_2, b_2$, respectively. The optimum rule for the detector to follow (see Appendix IV) is to measure the angle $\theta$ between the two points $(a_1, b_1)$ and $(a_2, b_2)$, which are shown plotted in Fig. 9, round off to the nearest integral multiple of $2\pi/m$ and guess that the signal corresponding to that phase rotation was transmitted. A basic difference between coherent and incoherent PSK systems is that in the coherent case the received signal is being compared with a clean reference, that is, the known position of the transmitted point. In the incoherent case, however, two noisy signals are being compared with each other. Thus, we might, after a crude fashion, say that there is twice as much noise present in the incoherent case as in the coherent and, consequently, there will be a 3-db degradation in performance. This latter statement will in fact turn out to be approximately true under the appropriate restrictions (namely, high signal-to-noise ratio and m > 2) as will be discussed in Section V-D. We wish also to point out that under the proposed detection scheme two product integrators will be necessary for the two-level case as well as for the multilevel case. Recall that only one product integrator is required for the coherent two-level case.
Fig. 9-Illustration of detection rule for PSK incoherent.
B. ASK Incoherent
In the ASK incoherent case the received signal z(t) is of the form
$$z(t) = \sqrt{\frac{2E_i}{T}} \cos(\omega_0 t + \alpha) + n(t), \qquad i = 1, 2, \ldots, m \tag{31}$$
where $\alpha$ is unknown and is assumed to be uniformly distributed over a $2\pi$ interval. Consider for the present the signal portion of a particular received signal which we shall designate as $z^*(t)$, for which it is known that $\alpha = \lambda$. That is,
$$z^*(t) = \sqrt{\frac{2E_i}{T}} \cos(\omega_0 t + \lambda). \tag{32}$$
Any waveform of this form may be expressed as a linear combination of the pair of orthonormal waveforms $\sqrt{2/T} \cos \omega_0 t$ and $\sqrt{2/T} \sin \omega_0 t$. Product integrating $z^*(t)$ with $\sqrt{2/T} \cos \omega_0 t$ and $\sqrt{2/T} \sin \omega_0 t$, respectively, yields
$$a = \int_0^T z^*(t) \sqrt{\frac{2}{T}} \cos \omega_0 t \, dt = \sqrt{E_i} \cos \lambda \tag{33a}$$
$$b = \int_0^T z^*(t) \sqrt{\frac{2}{T}} \sin \omega_0 t \, dt = -\sqrt{E_i} \sin \lambda. \tag{33b}$$
In accordance with our previous discussion $z^*(t)$ may be represented by the point $z^*$ with coordinates (a, b) shown plotted in Fig. 10. It may readily be deduced that the line segment drawn from the origin to the point $z^*$ has length $\sqrt{E_i}$ and is displaced $\lambda$ radians below the abscissa. As far as recovery of the transmitted information is concerned, only the distance of the point from the origin, namely, $\sqrt{E_i}$, is significant. We wish to point out, however, that Fig. 10 lends itself to a simple geometric interpretation of the difference between ASK coherent and ASK incoherent systems. In both cases the set of message points lies on a straight line (see, e.g., Fig. 7). In the coherent case, however, the receiver knows the orientation of this line (i.e., the angle $\lambda$) and, thus, need only make one measurement (along the line) in order to deduce what message point was sent. In the incoherent case the receiver does not know the position of the line and must, therefore, perform its analysis in the plane containing all possible rotations of the line. In particular, to calculate the distance of a point from the origin it must first measure the projections of the point on each of two perpendicular axes lying on the plane and passing through the origin and then take the square root of the sum of the squares of the two projections. Accordingly, it should be noted that the ASK coherent detector requires only one product integrator whereas the ASK incoherent detector requires two. (The logic following the product integrators will, of course, be different in the two cases.)
Fig. 10-Plot of the point z* in the detector signal space.
Owing to the presence of noise the actual outputs of the product integrators will be of the form
$$x = \int_0^T z(t) \varphi_1(t) \, dt = \sqrt{E_i} \cos \alpha + n_1 \tag{34a}$$
$$y = \int_0^T z(t) \varphi_2(t) \, dt = -\sqrt{E_i} \sin \alpha + n_2 \tag{34b}$$
where $n_1$ and $n_2$ are independent Gaussian random variables each having zero mean and variance $N_0$ and $\alpha$ is assumed to be a uniformly distributed random variable over a $2\pi$ interval. It may be shown (see Appendix IV) that a reasonable decision rule for the receiver to adopt is to measure the outputs of the two product integrators, calculate the rms amplitude, quantize in steps of $\sqrt{E_i}$ and guess that the corresponding signal was transmitted. Thus, for example, if, in particular, x = a and y = b, then the receiver should guess that the message corresponding to the value of i which minimizes the quantity
$$\left| \sqrt{a^2 + b^2} - \sqrt{E_i} \right|, \qquad i = 1, 2, \ldots, m$$
was sent. This is equivalent to saying that the two-dimensional space corresponding to all possible outputs of the two product integrators should be partitioned into m distinct regions (zones), each of which is associated with a particular message, where the ith region corresponding to the ith message consists of those points with coordinates (x, y), for which
$$\left| \sqrt{x^2 + y^2} - \sqrt{E_i} \right| < \left| \sqrt{x^2 + y^2} - \sqrt{E_j} \right|, \qquad \text{all } j \ne i. \tag{35}$$
Such a partitioning of the detector signal space is illustrated in Fig. 11 for the three-level case. We wish to point out that this decision rule is not the optimum one to adopt but approaches the optimum rule in the case of high signal-to-noise ratio. The rationale for adopting this rule is that it is considerably easier to instrument than the optimum rule for which the decision regions are functions of the signal-to-noise ratio. The decision rule is discussed in Appendix IV. Probability of error calculations for the particular case of uniformly spaced signal amplitudes starting with zero and an average power limited transmitter are presented in Section V. Curves and discussion appear in Section VI.
Fig. 11-Illustration of detection rule for 3-level ASK incoherent.
C. FSK Incoherent
In the FSK incoherent case, if the ith message is transmitted, the received signal z(t) will be of the form
$$z(t) = \sqrt{\frac{2E}{T}} \cos(\omega_i t + \alpha) + n(t), \qquad i = 1, 2, \ldots, m \tag{36}$$
where the unknown angle $\alpha$ is assumed to be a random variable uniformly distributed over a $2\pi$ interval. Although each of the transmitted signals may be represented by a point in an m-dimensional space, the presence of the unknown angle $\alpha$ makes it necessary to resolve the incoming signal in terms of the 2m orthonormal waveforms
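$$\sqrt{\frac{2}{T}} \cos \omega_1 t, \quad \sqrt{\frac{2}{T}} \cos \omega_2 t, \quad \ldots, \quad \sqrt{\frac{2}{T}} \cos \omega_m t$$
$$\sqrt{\frac{2}{T}} \sin \omega_1 t, \quad \sqrt{\frac{2}{T}} \sin \omega_2 t, \quad \ldots, \quad \sqrt{\frac{2}{T}} \sin \omega_m t.$$
Correspondingly, the 2m product integrator outputs will be of the form
$$X_j = \int_0^T z(t) \sqrt{\frac{2}{T}} \cos \omega_j t \, dt = \begin{cases} n_{xj}, & j \ne i \\ \sqrt{E} \cos \alpha + n_{xj}, & j = i \end{cases} \tag{37a}$$
$$Y_j = \int_0^T z(t) \sqrt{\frac{2}{T}} \sin \omega_j t \, dt = \begin{cases} n_{yj}, & j \ne i \\ -\sqrt{E} \sin \alpha + n_{yj}, & j = i \end{cases} \tag{37b}$$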
where the 2m noise perturbations $n_{xj}$, $n_{yj}$, j = 1, 2, ..., m, are independent random variables with zero mean and variance $N_0$ and $\alpha$ is a uniformly distributed random variable over a $2\pi$ interval. If, at some instant, the random variables $X_1, X_2, \ldots, X_m, Y_1, Y_2, \ldots, Y_m$ take on the particular values $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_m$, respectively, it may be shown (Appendix IV) that the optimum decision rule for the receiver to follow is to find that value of j, j = 1, 2, ..., m, for which the quantity $\sqrt{a_j^2 + b_j^2}$ is a maximum and guess that the corresponding signal was transmitted. That is to say, the detector should calculate the rms amplitude associated with each possibly transmitted frequency and select the largest one. The probability of error for an orthogonal FSK incoherent system is calculated in Section V. Curves and discussion appear in Section VI.
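The FSK incoherent rule just described reduces to a largest-envelope comparison over the m frequencies. A minimal sketch follows; the function name and the sample integrator outputs are hypothetical values chosen for illustration, not data from the paper.

```python
import math

def fsk_incoherent_decision(a, b):
    """Return the 1-based index j maximizing sqrt(a_j**2 + b_j**2).

    `a` and `b` hold the cosine and sine product-integrator outputs for each
    of the m candidate frequencies.
    """
    envelopes = [math.hypot(aj, bj) for aj, bj in zip(a, b)]
    return 1 + max(range(len(envelopes)), key=envelopes.__getitem__)

# Hypothetical outputs for m = 3 with message 2 sent, E = 4, alpha = 2.0 rad,
# plus small made-up noise terms on every output.
alpha = 2.0
a = [0.10, 2.0 * math.cos(alpha) + 0.05, -0.20]
b = [-0.10, -2.0 * math.sin(alpha) - 0.10, 0.15]
decision = fsk_incoherent_decision(a, b)
```

Note that the unknown angle only rotates the pair $(a_j, b_j)$ for the transmitted frequency, so its envelope stays near $\sqrt{E}$ whatever the value of $\alpha$.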
D. Interpretation of Decision Rules in Light of Appropriate Minimum Distance Criteria
In concluding our discussion of incoherent systems we wish to point out that the decision rule adopted in each case can be interpreted as a minimum distance type rule although the spaces in which the distances are measured are not simply related to the k-dimensional Euclidean spaces which characterize the k product-integrator outputs. Thus, in the PSK incoherent case the space of interest is that corresponding to the possible values of the relative phase difference between a pair of successively received signals, namely, an interval of length $2\pi$. The possibly transmitted phase differences determine a set of m message points with coordinate displacements $2\pi i/m$, i = 1, 2, ..., m, respectively, and the optimum decision rule is equivalent to measuring the phase difference, plotting the resultant number in the space and selecting the closest message point.
In the ASK incoherent case the space of interest is that corresponding to the possible values of the rms amplitude of the received signal. This space can be represented geometrically by a semi-infinite line running from zero through the positive real numbers to infinity. The set of possibly transmitted signals define a set of m message points with coordinate displacements $\sqrt{E_i}$, i = 1, 2, ..., m, respectively, and the adopted decision rule (which is only asymptotically optimum) is equivalent to calculating the rms amplitude of the received signal, plotting the resultant value in the space and selecting the closest message point.
In the FSK incoherent case the space of interest is an m-dimensional Euclidean space (or, to be more exact, the positive "quadrant" of that space) wherein each direction in that space is associated with one of the possibly transmitted frequencies. The received signal may be considered to be an m-dimensional vector whose coordinate projection in each direction is equal to the rms amplitude of the outputs of the sine and cosine product integrators associated with that direction (frequency). The m distinct points, one on each coordinate axis, displaced $\sqrt{E}$ units from the origin, constitute the message points. Correspondingly, the optimum decision rule is equivalent to measuring the distance (in this space) from the received point to each message point and selecting the closest one.
V. PROBABILITY OF ERROR CALCULATIONS
For the systems under consideration it is possible to obtain exact expressions for the probability of error in integral form. Unfortunately, however, in many cases the integrals in question are not simply integrable nor have they been tabulated over the ranges of interest. When this is the case it is sometimes possible to obtain upper and lower bounds on the probability of error which are usually adequate to predict the signal-to-noise ratio (within a decibel or so) required to maintain a prescribed error rate. The approximations which can be made fall into two categories, namely, simplification of the integrand and simplification of the region of integration. The latter procedure is especially useful in the coherent case where the regions of integration are fixed relative to the signal space and the noise is symmetric Gaussian with zero mean. In fact, Theorem IV may be shown (see Appendix V).
Theorem IV
Given M message waveforms, each transmitted with equal probability and perturbed by additive stationary-white-zero mean-Gaussian noise with double-sided spectral density $N_0$ watts/cps, then the average probability of error for a maximum likelihood coherent detector is bounded by4
$$\frac{1}{\sqrt{2\pi}} \int_{\rho / 2\sqrt{N_0}}^{\infty} e^{-x^2/2} \, dx < P_e < \frac{M - 1}{\sqrt{2\pi}} \int_{\rho^* / 2\sqrt{N_0}}^{\infty} e^{-x^2/2} \, dx \tag{38}$$
where $\rho^*$ and $\rho$ are defined in terms of $\rho_i$, i = 1, 2, ..., m, the distance between message point i and its closest neighbor. That is,
$$\rho^* = \text{minimum}(\rho_i).$$
Actually, it is often possible to establish tighter bounds than those presented in Theorem IV. The results presented, however, are indicative of the type of bounds which may be achieved by overestimating and underestimating the regions of integration.
4 The upper bound is similar to one presented by E. N. Gilbert, "A comparison of signaling alphabets," Bell Sys. Tech. J., vol. 31, pp. 504-522 (Theorem 3); 1952. Gilbert's Lower Bound is incorrect.
Similar reasoning may be employed to estimate the probability of error for the ASK incoherent and FSK incoherent systems where, again, we are dealing with fixed regions of integration independent of the incoming signal (though the probability of error for the latter system may be evaluated exactly). Unfortunately, however, these techniques are not readily applicable to
the PSK incoherent case. Therein it was only possible, with the exception of the two-level case (for which the probability of error may be evaluated exactly), to obtain approximate expressions (in simple form) which represent neither an upper bound nor a lower bound to the probability of error. Sections V-A to V-F are devoted to calculating the probability of error and approximations thereof for each of the six systems under consideration. They may be skipped without loss of continuity. Before proceeding to this section, however, we wish to point out that, in the ASK systems, the FSK systems and the coherent PSK system, an error occurs whenever the ith signal waveform $S_i(t)$ is transmitted and the received signal point does not land in the region associated with the message point $S_i$. Designating this region by $R_i$ and the received signal point by z, the event z falling inside the region $R_i$ will be written symbolically as $z \in R_i$ whereas the event z falling outside the region $R_i$ will be denoted $z \notin R_i$. Averaging over all possibly transmitted signals it is readily seen that the average probability of error $P_e$ equals
$$P_e = \sum_{i=1}^{m} P(S_i \text{ sent}) \, P(z \notin R_i \mid S_i) \tag{39}$$
where we are using standard notation to denote the probability of an event and the conditional probability of an event.5 Now let us consider each of the six systems individually.
5 See, e.g., Davenport and Root, op. cit., pp. 7-13.
A. Coherent PSK
In the coherent PSK case, if $S_i(t)$ is transmitted, the received signal point z has coordinates [see (11), (12) and (20)]
$$x = \int_0^T z(t) \varphi_1(t) \, dt = \sqrt{E} \cos \frac{2\pi i}{m} + n_x \tag{40a}$$
$$y = \int_0^T z(t) \varphi_2(t) \, dt = -\sqrt{E} \sin \frac{2\pi i}{m} + n_y. \tag{40b}$$
Thus, x and y are independent Gaussian random variables with means $\sqrt{E} \cos(2\pi i/m)$ and $-\sqrt{E} \sin(2\pi i/m)$, respectively, and common variance $N_0$. Consequently, the probability that z lands in $R_i$ when $S_i(t)$ is transmitted is given by
$$P(z \in R_i \mid S_i) = \frac{1}{2\pi N_0} \iint_{R_i} \exp\left\{ -\frac{\left(x - \sqrt{E} \cos \frac{2\pi i}{m}\right)^2 + \left(y + \sqrt{E} \sin \frac{2\pi i}{m}\right)^2}{2N_0} \right\} dx \, dy$$
which, transforming to polar coordinates with $x = \rho \sqrt{N_0} \cos \theta$ and $y = \rho \sqrt{N_0} \sin \theta$, may be written as
$$P[z \in R_i \mid S_i] = \frac{1}{2\pi} \iint_{R_i} \exp\left\{ -\frac{1}{2} \left[ \rho^2 - 2\rho \sqrt{\frac{E}{N_0}} \cos\left(\theta + \frac{2\pi i}{m}\right) + \frac{E}{N_0} \right] \right\} \rho \, d\rho \, d\theta. \tag{41}$$
But $R_i$, as may be deduced from Fig. 12, is simply the set of points satisfying the two conditions
$$\rho > 0, \qquad -\frac{2\pi i}{m} - \frac{\pi}{m} < \theta < -\frac{2\pi i}{m} + \frac{\pi}{m}.$$
Thus, substituting the appropriate limits of integration in (41), we get
$$P(z \in R_i \mid S_i) = \frac{1}{2\pi} \int_{-2\pi i/m - \pi/m}^{-2\pi i/m + \pi/m} d\theta \int_0^\infty d\rho \, \rho \exp\left\{ -\frac{1}{2} \left[ \rho^2 - 2\rho \sqrt{\frac{E}{N_0}} \cos\left(\theta + \frac{2\pi i}{m}\right) + \frac{E}{N_0} \right] \right\}$$
$$= \frac{1}{2\pi} \int_{-\pi/m}^{+\pi/m} d\theta \int_0^\infty d\rho \, \rho \exp\left\{ -\frac{1}{2} \left[ \rho^2 - 2\rho \sqrt{\frac{E}{N_0}} \cos \theta + \frac{E}{N_0} \right] \right\}. \tag{42}$$
Note that (42) is independent of the choice of i. That is to say, the probability of interpreting the received signal correctly is the same regardless of which particular signal was transmitted. Therefore, the probability of error for the m-level PSK coherent system is simply equal to one minus the right-hand side of (42). That is,
$$P_e = 1 - \frac{1}{2\pi} \int_{-\pi/m}^{+\pi/m} d\theta \int_0^\infty d\rho \, \rho \exp\left\{ -\frac{1}{2} \left[ \rho^2 - 2\rho \sqrt{\frac{E}{N_0}} \cos \theta + \frac{E}{N_0} \right] \right\}. \tag{43}$$
Now, if we complete the square in the exponent by writing
$$\rho^2 - 2\rho \sqrt{\frac{E}{N_0}} \cos \theta + \frac{E}{N_0} = \left( \rho - \sqrt{\frac{E}{N_0}} \cos \theta \right)^2 + \frac{E}{N_0} \sin^2 \theta$$
and substitute this result into (43), we get
$$P_e = 1 - \frac{1}{2\pi} \int_{-\pi/m}^{+\pi/m} d\theta \, \exp\left\{ -\frac{1}{2} \frac{E}{N_0} \sin^2 \theta \right\} \int_0^\infty d\rho \, \rho \exp\left\{ -\frac{1}{2} \left[ \rho - \sqrt{\frac{E}{N_0}} \cos \theta \right]^2 \right\}$$
$$= 1 - \frac{1}{2\pi} \int_{-\pi/m}^{+\pi/m} d\theta \, \exp\left\{ -\frac{E}{2N_0} \sin^2 \theta \right\} \left\{ \exp\left\{ -\frac{E}{2N_0} \cos^2 \theta \right\} + \sqrt{\frac{E}{N_0}} \cos \theta \int_{-\sqrt{E/N_0} \cos \theta}^{\infty} \exp\left\{ -\frac{t^2}{2} \right\} dt \right\}. \tag{44}$$
Fig. 12-Illustration of the detection region $R_i$ corresponding to the signal $S_i(t)$ for the multilevel PSK coherent case.
Eq. (44) may be bounded quite easily for m > 2. For if m > 2, then $-\pi/2 < \theta < \pi/2$, which implies that $\cos \theta > 0$ and, therefore, that6
$$\int_{-\sqrt{E/N_0} \cos \theta}^{\infty} \exp\left\{ -\frac{t^2}{2} \right\} dt > \sqrt{2\pi} - \frac{\exp\left\{ -\frac{E}{2N_0} \cos^2 \theta \right\}}{\sqrt{\frac{E}{N_0}} \cos \theta}. \tag{45}$$
6 This result follows from some elementary inequalities presented in W. Feller, "An Introduction to Probability Theory and its Applications," John Wiley and Sons, Inc., New York, N. Y., vol. 1, 2nd ed., pp. 164-166; 1959.
Combining (44) and (45) results in the bound
$$P_e < 1 - \frac{1}{\sqrt{2\pi}} \int_{-\pi/m}^{+\pi/m} \exp\left\{ -\frac{E}{2N_0} \sin^2 \theta \right\} \sqrt{\frac{E}{N_0}} \cos \theta \, d\theta. \tag{46}$$
If we now let $x = \sqrt{E/N_0} \sin \theta$, then the right-hand side of (46) may be written as
$$1 - \frac{1}{\sqrt{2\pi}} \int_{-\sqrt{E/N_0} \sin \pi/m}^{+\sqrt{E/N_0} \sin \pi/m} e^{-x^2/2} \, dx = \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx.$$
Therefore,
$$P_e < \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx. \tag{47}$$
Thus, we have established a simple upper bound to the probability of error for the case m > 2. We remark that if $\sqrt{E/N_0} \gg 1$, which is usually the case, the bound in (47) represents a good approximation to $P_e$. A lower bound for the probability of error for m > 2 may be established quite easily by geometric reasoning. In particular, it may be deduced from the symmetry of the message points [though it has been shown formally in the discussion following (42)] that the average probability of error is equal to the probability of landing outside detection region i when message point i is sent. The probability of landing outside $R_i$, however, is larger than the probability of landing in the shaded planar region of Fig. 13. But the received point z will only fall in the planar region if the component of noise perpendicular to the boundary line of the planar region exceeds $\sqrt{E} \sin \pi/m$. That is, designating the planar region by the symbol $B_1$,
$$P[z \in B_1] = \frac{1}{\sqrt{2\pi N_0}} \int_{\sqrt{E} \sin \pi/m}^{\infty} e^{-x^2/2N_0} \, dx = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx$$
but
$$P_e > P[z \in B_1],$$
yielding finally that
$$P_e > \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx. \tag{48}$$
Note that if we were to consider also the shaded planar region of Fig. 14, which we shall denote $B_2$, then it should be clear that
$$P_e < P[z \in B_1] + P[z \in B_2],$$
from which we can conclude that
$$P_e < \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx.$$
The upper bound so calculated is identical to the one established previously [see (47)] with considerably more effort. Now let us consider separately the cases m = 2 and m = 4, for which $P_e$ may be evaluated exactly. If m = 2, it may be deduced from Fig. 15 that the probability of error is equal to the probability that a Gaussian random variable of mean zero and variance $N_0$ exceeds $\sqrt{E}$. That is, when m = 2,
$$P_e = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0}}^{\infty} e^{-x^2/2} \, dx. \tag{49}$$
The probability of error for the case m = 4 may be calculated most easily by resolving the noise into the two orthogonal directions x' and y' indicated on Fig. 16. It follows readily that an error will not occur if $n_{x'}$ and $n_{y'}$, the noise components in directions x' and y', satisfy the conditions
$$-\sqrt{\frac{E}{2}} \le n_{x'} < \infty \qquad \text{and} \qquad -\sqrt{\frac{E}{2}} \le n_{y'} < \infty$$
but since $n_{x'}$ and $n_{y'}$ are independent noise vectors with zero mean and variance $N_0$, the probability of this event is simply
$$\left[ \frac{1}{\sqrt{2\pi N_0}} \int_{-\sqrt{E/2}}^{\infty} e^{-x^2/2N_0} \, dx \right]^2 = \left[ 1 - \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2} \, dx \right]^2.$$
Thus, the probability of error for the 4-level case equals
$$P_e = 1 - \left[ 1 - \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2} \, dx \right]^2 = \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2} \, dx - \left[ \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2} \, dx \right]^2. \tag{50}$$
Fig. 13-Illustration of detection region $R_i$ for the multilevel coherent PSK case and the planar region $B_1$.
Fig. 14-Illustration of detection region $R_i$ for the multilevel PSK coherent case and the planar region $B_2$.
The results of the preceding calculations for the probability of error of an m-level PSK coherent system are summarized below. An exact expression for the probability of error in integral form is, by (44),
$$P_e = 1 - \frac{1}{2\pi} \int_{-\pi/m}^{+\pi/m} d\theta \, e^{-(E/2N_0) \sin^2 \theta} \left\{ e^{-(E/2N_0) \cos^2 \theta} + \sqrt{\frac{E}{N_0}} \cos \theta \int_{-\sqrt{E/N_0} \cos \theta}^{\infty} e^{-t^2/2} \, dt \right\}.$$
If m > 2, simple bounds for the probability of error are given by (47) and (48). That is,
$$\frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx < P_e < \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx.$$
The geometrical reasoning used to establish these bounds is similar to that used to establish Theorem IV (see Appendix V). Theorem IV, however, yields a set of slightly weaker bounds when applied to this case, namely,
$$\frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx < P_e < \frac{m - 1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0} \sin \pi/m}^{\infty} e^{-x^2/2} \, dx.$$
Fig. 15-Illustration of detection regions for 2-level PSK coherent system.
Fig. 16-Illustration of the detection region $R_4$ for a 4-level PSK coherent system.
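The exact expression (44) and the bounds (47) and (48) can be compared numerically. The following is a rough sketch under stated assumptions: `snr` stands for the ratio $E/N_0$, the Gaussian tail integral is expressed through `math.erfc`, the outer integral is approximated with a midpoint rule, and all function names are illustrative.

```python
import math

def q(x):
    """Gaussian tail (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def psk_exact(m, snr, steps=20000):
    """Midpoint-rule evaluation of (44) for m-level coherent PSK; snr = E/N0."""
    a = math.sqrt(snr)
    h = (2 * math.pi / m) / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / m + (k + 0.5) * h
        c = a * math.cos(theta)
        # integral of exp(-t^2/2) from -c to infinity equals sqrt(2*pi) * (1 - q(c))
        inner = math.exp(-0.5 * c * c) + c * math.sqrt(2 * math.pi) * (1 - q(c))
        total += math.exp(-0.5 * snr * math.sin(theta) ** 2) * inner * h
    return 1 - total / (2 * math.pi)

def psk_bounds(m, snr):
    """Return (lower bound (48), upper bound (47), Theorem IV upper bound)."""
    tail = q(math.sqrt(snr) * math.sin(math.pi / m))
    return tail, 2 * tail, (m - 1) * tail
```

For m > 2 one would expect `psk_exact(m, snr)` to fall strictly between the first two values returned by `psk_bounds(m, snr)`, with the Theorem IV upper bound at least as large as the bound from (47).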
We might point out, however, that Theorem IV is valid even for m = 2, in which case both the upper and lower bounds coincide, yielding a check for (49), namely,
$$P_e = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0}}^{\infty} e^{-x^2/2} \, dx.$$
For the particular case m = 4, the probability of error is given by (50) as
$$P_e = \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2} \, dx - \left( \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2} \, dx \right)^2.$$
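The closed form for m = 4 can be checked against a simple simulation of the minimum-distance detector. This is a sketch only: the noise variance is normalized to $N_0 = 1$ so that `snr` plays the role of $E/N_0$, and the trial count, seed and function names are arbitrary choices made for illustration.

```python
import math
import random

def q(x):
    """Gaussian tail (1/sqrt(2*pi)) * integral from x to infinity of exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def psk4_exact(snr):
    """Probability of error for 4-level coherent PSK from (50); snr = E/N0."""
    p = q(math.sqrt(snr / 2))
    return 2 * p - p * p

def simulate_psk4(snr, trials=100_000, seed=1):
    """Monte Carlo estimate of the error rate with N0 normalized to 1."""
    rng = random.Random(seed)
    root_e = math.sqrt(snr)
    # Message points from (20): (sqrt(E) cos(2*pi*i/4), -sqrt(E) sin(2*pi*i/4))
    points = [(root_e * math.cos(2 * math.pi * i / 4),
               -root_e * math.sin(2 * math.pi * i / 4)) for i in range(1, 5)]
    errors = 0
    for _ in range(trials):
        sent = rng.randrange(4)
        x = points[sent][0] + rng.gauss(0.0, 1.0)
        y = points[sent][1] + rng.gauss(0.0, 1.0)
        decided = min(range(4),
                      key=lambda k: (x - points[k][0]) ** 2 + (y - points[k][1]) ** 2)
        errors += decided != sent
    return errors / trials
```

With enough trials the simulated error rate should settle close to `psk4_exact(snr)`; the agreement is statistical, so small discrepancies are expected.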
As a final point it should be noted that both (49) and (50) could have been derived from (44).7
B. Coherent ASK
In the ASK coherent case, if $S_i(t)$ is transmitted, the received signal point z has a single coordinate [see (11), (12) and (23)],
$$x = \int_0^T z(t) \varphi_1(t) \, dt = \sqrt{E_i} + n. \tag{51}$$
Thus, x is a Gaussian random variable with mean $\sqrt{E_i}$ and variance $N_0$. Let us assume in particular that the message points are uniformly spaced, starting with $\sqrt{E_1} = 0$. That is,
$$\sqrt{E_i} = (i - 1) \Delta, \qquad i = 1, 2, \ldots, m. \tag{52}$$
The corresponding detector signal space is shown in Fig. 17. It is readily deduced from Fig. 17 that if 1 < i < m, then the probability of interpreting the transmitted signal incorrectly is simply
$$P[z \notin R_i \mid S_i] = \frac{2}{\sqrt{2\pi}} \int_{\Delta / 2\sqrt{N_0}}^{\infty} e^{-x^2/2} \, dx. \tag{53}$$
On the other hand, if i = 1 or i = m, then the probability of interpreting the transmitted signal incorrectly is
$$P[z \notin R_1 \mid S_1] = P[z \notin R_m \mid S_m] = \frac{1}{\sqrt{2\pi}} \int_{\Delta / 2\sqrt{N_0}}^{\infty} e^{-x^2/2} \, dx. \tag{54}$$
Now, combining (39), (53) and (54), we can compute the average probability of error $P_e$ to equal
$$P_e = \frac{2(m - 1)}{m} \frac{1}{\sqrt{2\pi}} \int_{\Delta / 2\sqrt{N_0}}^{\infty} e^{-x^2/2} \, dx. \tag{55}$$
If we further assume that the transmitter is subject to an average power limitation of E/T (watts), it follows that
$$\frac{1}{m} \sum_{i=1}^{m} \frac{E_i}{T} = \frac{E}{T}. \tag{56}$$
Substituting (52) into (56) and making use of the equality8
$$\sum_{i=1}^{m} (i - 1)^2 = \frac{(m - 1)(m)(2m - 1)}{6}, \tag{57}$$
it is readily seen that the quantity $\Delta$ is constrained to equal
$$\Delta = \sqrt{\frac{6E}{(m - 1)(2m - 1)}}. \tag{58}$$
Consequently, the average probability of error (valid for all m) may be written as
$$P_e = \frac{2(m - 1)}{m} \frac{1}{\sqrt{2\pi}} \int_{\sqrt{6E / [4N_0 (m - 1)(2m - 1)]}}^{\infty} e^{-x^2/2} \, dx. \tag{59}$$
Fig. 17-Illustration of detector signal space for the coherent ASK case, assuming uniformly spaced message points starting with $\sqrt{E_1} = 0$.
C. Coherent FSK
In the coherent FSK case, when $S_i(t)$ is transmitted, the received signal point z has coordinates [see (11), (12) and (26)]
$$x_j = \int_0^T z(t) \varphi_j(t) \, dt = \begin{cases} n_j, & j \ne i \\ \sqrt{E} + n_j, & j = i \end{cases} \qquad j = 1, 2, \ldots, m. \tag{60}$$
The $x_j$ are independent Gaussian random variables with mean zero if $j \ne i$, with mean $\sqrt{E}$ if j = i, and each having variance $N_0$. The decision rule, namely, to choose the message point closest to the received signal point, is equivalent to choosing that value of j for which $x_j$ is largest. This may be deduced by noting that if the received signal point z, with coordinates $x_1, x_2, \ldots, x_m$, is closer to, say, the signal point $S_i$ than to any other signal point, then
$$d^2(z, S_i) < d^2(z, S_j), \qquad \text{all } j \ne i.$$
7 An ingenious, though somewhat involved, derivation of (50) from a form of (44) has been presented by E. A. Trabka, "Embodiments of the Maximum Likelihood Receiver for Detection of Coherent Phase Shift Keyed Signals," Detect Memo. No. 5A (Appendix) in "Investigation of Digital Data Communication Systems," J. G. Lawton, Ed., Cornell Aeronautical Lab., Inc., Ithaca, N. Y., Rept. No. UA-1420-S-1; January, 1961.
8 This equality may be verified by induction. See, e.g., G. Birkhoff and S. MacLane, "A Survey of Modern Algebra," The Macmillan Co., New York, N. Y., revised ed.; 1960. Note, in particular, example 5a, p. 13.
Fifty Years ofCommunications and Networking
That is to say, [by (5), (26) and (60)]

\[ x_1^2 + \cdots + (x_i - \sqrt{E})^2 + \cdots + x_j^2 + \cdots + x_m^2 < x_1^2 + \cdots + x_i^2 + \cdots + (x_j - \sqrt{E})^2 + \cdots + x_m^2, \]

but this is true if, and only if,

\[ -2\sqrt{E}\,x_i < -2\sqrt{E}\,x_j; \]

that is, if, and only if, $x_j < x_i$. Since, when $S_i(t)$ is sent, the $x_j$ ($j \neq i$) are independent-Gaussian-random variables with mean zero and variance $N_0$, the probability that each of the $(m-1)$ $x_j$ is less than $x_i$ is simply

\[ P[x_j < x_i,\ \text{all } j \neq i \,/\, x_i, S_i] = \left[ \frac{1}{\sqrt{2\pi N_0}} \int_{-\infty}^{x_i} e^{-u^2/2N_0}\,du \right]^{m-1}. \tag{61} \]

Hence, the probability of a correct decision when $S_i$ is sent is equal to

\[ \begin{aligned} P[x_j < x_i,\ \text{all } j \neq i \,/\, S_i] &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi N_0}} \exp\left\{ -\frac{(x_i - \sqrt{E})^2}{2N_0} \right\} \left[ \frac{1}{\sqrt{2\pi N_0}} \int_{-\infty}^{x_i} \exp\left\{ -\frac{u^2}{2N_0} \right\} du \right]^{m-1} dx_i \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left\{ -\frac{(x - \sqrt{E/N_0})^2}{2} \right\} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{ -\frac{u^2}{2} \right\} du \right]^{m-1} dx. \end{aligned} \tag{62} \]

The right-hand side of (62) is independent of the choice of $i$ and is, in fact, equal to the average probability of a correct decision. It follows, therefore, that the average probability of error

\[ P_e = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left\{ -\frac{(x - \sqrt{E/N_0})^2}{2} \right\} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{ -\frac{u^2}{2} \right\} du \right]^{m-1} dx. \tag{63} \]

The integral appearing in (63) does not appear to be solvable in terms of standard functions for $m > 2$. It should be noted, however, that the integral has been tabulated for several values of $m$ and $\sqrt{E/N_0}$ by Urbano,9 although not over the ranges which are considered to be of interest within this report. Fortunately, simple bounds for the average probability of error may be found quite easily by applying Theorem IV. In particular, noting [by (5) and (26)] that the distance between any two distinct message points $S_i$, $S_j$ is $\sqrt{2E}$, it follows that

\[ \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2}\,dx \le P_e \le (m-1)\,\frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{64} \]

Actually a tighter upper bound for $P_e$ has been derived by Fano10 for use in a channel capacity argument. For our purposes, however, the considerably simpler, if less sophisticated, bounds presented above will be adequate. Note that if, in particular, $m = 2$, the upper and lower bounds coincide. It follows, therefore, that in the two-level case,

\[ P_e = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/2N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{65} \]

D. PSK Incoherent
A distinguishing feature of the PSK incoherent case is the fact that there are no pre-assigned detection regions in the signal space, each of which corresponds to a particular transmitted signal. The decision, rather, is based on the phase angle between successively received signals. If the $i$th message has been sent, such a pair of successively received signals will, by (29a) and (29b), be of the form

\[ z_1(t) = \sqrt{\frac{2E}{T}} \cos(\omega_0 t + \alpha) + n(t) \]

\[ z_2(t - T) = \sqrt{\frac{2E}{T}} \cos\left(\omega_0 t + \alpha + \frac{2\pi i}{m}\right) + n(t - T). \]

Correspondingly, the measurement will (at least conceptually) be based on the pair of signal points $z_1$ and $z_2$, having coordinates $(x_1, y_1)$ and $(x_2, y_2)$, respectively, where, by (30a)-(30d), it is known that $x_1$, $y_1$, $x_2$, $y_2$ are of the form

\[ \begin{aligned} x_1 &= \sqrt{E} \cos\alpha + n_{11} \\ y_1 &= -\sqrt{E} \sin\alpha + n_{12} \\ x_2 &= \sqrt{E} \cos\left(\alpha + \frac{2\pi i}{m}\right) + n_{21} \\ y_2 &= -\sqrt{E} \sin\left(\alpha + \frac{2\pi i}{m}\right) + n_{22} \end{aligned} \]

where $n_{11}$, $n_{12}$, $n_{21}$, $n_{22}$ are independent-Gaussian-random variables with zero mean and variance $N_0$ and $\alpha$ is a uniformly distributed random variable over a $2\pi$ interval. A possible set of received signal points corresponding to the case $x_1 = a_1$, $y_1 = b_1$, $x_2 = a_2$, $y_2 = b_2$, $\alpha = A$ is shown plotted in Fig. 18 as vectors. Each vector is represented as the sum of a signal vector and a noise vector. The decision will be based on the angle $\psi$ which equals

\[ \psi = \frac{2\pi i}{m} + \phi_2^* - \phi_1^*. \]

9 R. H. Urbano, "Analysis and Tabulation of the M Positions Experiment Integral and Related Error Function Integrals," AF Cambridge Res. Ctr., Bedford, Mass., Tech. Rept. No. AFCRC TR-55-100; April, 1955.
10 R. M. Fano, op. cit.,3 pp. 200-206.
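The decision procedure just described can be illustrated with a small Monte Carlo sketch. It draws the coordinates $x_1$, $y_1$, $x_2$, $y_2$ from the model above, forms the angle between the two received vectors, and rounds it to the nearest multiple of $2\pi/m$. The function name, the normalization $N_0 = 1$, and the trial counts are assumptions of this example, not part of the paper.

```python
import math
import random


def simulate_dpsk_error_rate(m, e_over_n0, trials, seed=0):
    """Monte Carlo estimate of the waveform error rate; N0 is normalized to 1."""
    rng = random.Random(seed)
    amplitude = math.sqrt(e_over_n0)  # sqrt(E) when N0 = 1
    errors = 0
    for _ in range(trials):
        i = rng.randrange(m)                     # transmitted message index
        alpha = rng.uniform(0.0, 2.0 * math.pi)  # unknown carrier phase
        shift = 2.0 * math.pi * i / m
        x1 = amplitude * math.cos(alpha) + rng.gauss(0.0, 1.0)
        y1 = -amplitude * math.sin(alpha) + rng.gauss(0.0, 1.0)
        x2 = amplitude * math.cos(alpha + shift) + rng.gauss(0.0, 1.0)
        y2 = -amplitude * math.sin(alpha + shift) + rng.gauss(0.0, 1.0)
        # Because y = -sqrt(E) sin(.), the polar angle of z1 minus that of z2
        # equals 2*pi*i/m plus the noise-induced angular errors.
        psi = math.atan2(y1, x1) - math.atan2(y2, x2)
        decided = round(psi * m / (2.0 * math.pi)) % m
        if decided != i:
            errors += 1
    return errors / trials


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, simulate_dpsk_error_rate(m, e_over_n0=10.0, trials=20000))
```

Rounding to the nearest multiple of $2\pi/m$ is the same as declaring an error whenever the noise rotates the phase difference by more than $\pi/m$.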
THE BEST OF THE BEST
Fig. 18-Illustration of a pair of successively received signal points with coordinates $(a_1, b_1)$ and $(a_2, b_2)$, respectively.
An erroneous decision will be made if, and only if, the noise is such that

\[ |\phi_2^* - \phi_1^*| > \frac{\pi}{m}. \tag{66} \]

Note that the angles $\phi_1^*$ and $\phi_2^*$ are random variables and that the probability density of the random variable

\[ \eta = |\phi_2^* - \phi_1^*| \tag{67} \]

can be calculated. Designating this density function by $p(\eta)$, it follows that the average probability of error

\[ P_e = \int_{\pi/m}^{\pi} p(\eta)\,d\eta. \tag{68} \]

The approach to calculating the probability of error which has just been outlined has, in fact, been used by Fleck and Trabka.11 They have shown that12

\[ p(\eta) = \frac{1}{\pi} \int_0^{\pi/2} \sin\psi \left[ 1 + \frac{E}{2N_0}(1 + \cos\eta\,\sin\psi) \right] \exp\left\{ -\frac{E}{2N_0}(1 - \cos\eta\,\sin\psi) \right\} d\psi \tag{69} \]

and, correspondingly, that

\[ P_e = \frac{1}{\pi} \int_{\pi/m}^{\pi} \int_0^{\pi/2} \sin\psi \left[ 1 + \frac{E}{2N_0}(1 + \cos\eta\,\sin\psi) \right] \exp\left\{ -\frac{E}{2N_0}(1 - \cos\eta\,\sin\psi) \right\} d\psi\,d\eta. \tag{70} \]

Since the manipulations required to establish (69) are rather involved and since, furthermore, the derived expression for the probability of error, (70), is awkward to work with (except if $m = 2$, in which case it reduces to a more tractable form) unless some simplifying approximations are introduced, it is worthwhile to consider an alternate procedure for estimating the probability of error. This we shall now do. Initially let us calculate the probability density of the random variable $\phi_1$, corresponding to the angle $\phi_1^*$ defined in Fig. 18. The probability density of the random variable $\phi_2$ is, of course, the same. It is convenient to resolve the noise component of the corresponding received signal vector into components which are parallel and perpendicular to the signal component as illustrated in Fig. 19. Designating the projections of the received signal on the directions parallel to and perpendicular to the signal component of the received signal as $\bar{x}$ and $\bar{y}$, respectively, it follows readily from Fig. 19 that

\[ \bar{x} = \sqrt{E} + n_1' \tag{71a} \]

\[ \bar{y} = n_2' \tag{71b} \]

where $n_1'$ and $n_2'$ are observed samples of the independent-Gaussian-random variables $n_1$, $n_2$, each of which has zero mean and variance $N_0$. That is to say, $\bar{x}$ and $\bar{y}$ are sample values of a pair of independent random variables which we shall denote by $x$ and $y$. Since $x$ is Gaussian with mean $\sqrt{E}$ and variance $N_0$ and $y$ is Gaussian with mean zero and variance $N_0$, the joint probability density is given by

\[ p(x, y) = \frac{1}{2\pi N_0} \exp\left\{ -\frac{(x - \sqrt{E})^2 + y^2}{2N_0} \right\}. \tag{72} \]

Fig. 19-Decomposition of received signal vector into components parallel and perpendicular to the signal component of the received signal.

Now, transforming to polar coordinates by means of the relationships

\[ x = v\sqrt{N_0} \cos\phi_1 \tag{73a} \]

\[ y = v\sqrt{N_0} \sin\phi_1, \tag{73b} \]

we can express (72) in terms of the random variables $v$ and $\phi_1$. Since the Jacobian of the transformation is equal to $vN_0$, the joint probability density of $v$ and $\phi_1$, which we shall denote by $q(v, \phi_1)$, is equal to13

\[ q(v, \phi_1) = p(v\sqrt{N_0} \cos\phi_1,\ v\sqrt{N_0} \sin\phi_1) \cdot vN_0. \]

That is, by (72),

\[ q(v, \phi_1) = \frac{v}{2\pi} \exp\left\{ -\frac{1}{2}\left[ v^2 - 2v\sqrt{\frac{E}{N_0}} \cos\phi_1 + \frac{E}{N_0} \right] \right\}. \tag{74} \]

11 J. T. Fleck and E. A. Trabka, "Error Probabilities of Multiple-State Differentially Coherent Phase Shift Keyed Systems in the Presence of White Gaussian Noise," Detect Memo. No. 2A in "Investigation of Digital Data Communication Systems," J. G. Lawton, Ed., Cornell Aeronautical Lab., Inc., Ithaca, N. Y., Rept. No. UA-1420-S-1; January, 1961.
12 Ibid., see (32) and (33) from which (69) of this paper follows. It should be noted that $p(\eta) = 2h(\eta)$ and $R = E/2N_0$ since we are using a double-sided noise spectrum.
13 Davenport and Root, op. cit., pp. 37-38.
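As an aside on (68)-(70): although the double integral is awkward analytically, it is straightforward to evaluate numerically. The sketch below applies the midpoint rule to both integrals. The function names, the parameter `snr` (standing for $E/2N_0$), and the step counts are assumptions of this example; the integrand is the Fleck and Trabka expression as reconstructed in (69).

```python
import math


def phase_difference_density(eta, snr, steps=400):
    """p(eta) from (69) by the midpoint rule; snr stands for E/(2*N0)."""
    h = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps):
        s = math.sin((k + 0.5) * h)
        c = math.cos(eta) * s
        total += s * (1.0 + snr * (1.0 + c)) * math.exp(-snr * (1.0 - c))
    return total * h / math.pi


def dpsk_error_probability(m, snr, steps=400):
    """P_e from (68): integrate p(eta) over pi/m <= eta <= pi."""
    lower = math.pi / m
    h = (math.pi - lower) / steps
    return h * sum(
        phase_difference_density(lower + (k + 0.5) * h, snr, steps)
        for k in range(steps)
    )


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, dpsk_error_probability(m, snr=5.0))
```

The midpoint rule is used only because it needs no external library; any standard quadrature routine would serve equally well.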
Integrating out the $v$ dependency, we get, finally, the probability density of $\phi_1$, $q(\phi_1)$, equal to

\[ q(\phi_1) = \int_0^{\infty} \frac{v}{2\pi} \exp\left\{ -\frac{1}{2}\left[ v^2 - 2v\sqrt{\frac{E}{N_0}} \cos\phi_1 + \frac{E}{N_0} \right] \right\} dv. \tag{75} \]

The reader might find it interesting to compare the right-hand side of (75) with the integrand of (42) and to accordingly note that the probability of a correct decision in the PSK coherent case is equal to the probability that the angle between the transmitted signal vector and the received signal vector is less than $\pi/m$ radians in magnitude. Now, recall [see (66) and preceding discussion] that a correct decision will be made by the detector if, and only if, $|\phi_1 - \phi_2| \le \pi/m$. The region in the $\phi_1 \times \phi_2$ space corresponding to a correct decision is illustrated by cross hatchings in Fig. 20 (we are assuming that $\phi_1$ and $\phi_2$ are restricted to lie between $-\pi$ and $\pi$ modulo $2\pi$). Neglecting edge effects, it may be deduced from Fig. 20 that the correct decision region is characterized by the condition

\[ |\phi_2'| \le \frac{\pi}{\sqrt{2}\,m}. \tag{76} \]

Fig. 20-Illustration of correct decision region in $\phi_1 \times \phi_2$ space.

Thus, the probability of a correct decision is approximately equal to the probability that inequality (76) is satisfied. Unfortunately, the exact probability density of $\phi_2'$ is not readily determinable. For small $\phi_2'$, however, (75) yields a reasonably good approximation to the probability density of $\phi_2'$, as we shall now demonstrate. It is clear from (75) that $q(\phi_1)$ is an even function of $\phi_1$ and that in the interval $|\phi_1| \le \pi$, $q(\phi_1)$ takes on its maximum value at $\phi_1 = 0$ and decreases monotonically with $|\phi_1|$ to its minimum value at $|\phi_1| = \pi$. Furthermore, following the procedure that was used to transform (43) into (44), (75) can be rewritten in the form

\[ q(\phi_1) = \frac{1}{2\pi} \left\{ e^{-E/2N_0} + \sqrt{\frac{E}{N_0}} \cos\phi_1\, e^{-(E/2N_0)\sin^2\phi_1} \int_{-\sqrt{E/N_0}\cos\phi_1}^{\infty} e^{-t^2/2}\,dt \right\}. \tag{77} \]

Substituting some particular values of $\phi_1$ into (77) to gauge the way in which $q(\phi_1)$ decreases as $|\phi_1|$ increases, we get14

\[ q(0) = \frac{1}{2\pi} \left[ e^{-E/2N_0} + \sqrt{\frac{E}{N_0}} \int_{-\sqrt{E/N_0}}^{\infty} e^{-t^2/2}\,dt \right] \tag{78a} \]

\[ q\left(\pm\frac{\pi}{4}\right) = \frac{1}{2\pi} \left[ e^{-E/2N_0} + e^{-E/4N_0} \sqrt{\frac{E}{2N_0}} \int_{-\sqrt{E/2N_0}}^{\infty} e^{-t^2/2}\,dt \right] \tag{78b} \]

\[ q\left(\pm\frac{\pi}{2}\right) = \frac{1}{2\pi}\, e^{-E/2N_0} \tag{78c} \]

\[ q(\pm\pi) = \frac{1}{2\pi} \left[ e^{-E/2N_0} - \sqrt{\frac{E}{N_0}} \int_{\sqrt{E/N_0}}^{\infty} e^{-t^2/2}\,dt \right]. \tag{78d} \]

The point we wish to make is that in the case of high signal-to-noise ratio, say $E/2N_0 \gg 20$, $q(\phi_1)$ falls off quite rapidly as $|\phi_1|$ departs from the origin. If, in particular, $|\phi_1| < \pi/2$, then inequality (45) is valid. Treating this inequality as an approximate equality (the approximation improves as $\sqrt{E/N_0} \cos\phi_1$ increases) and substituting into (77), we get

\[ q(\phi_1) \approx \sqrt{\frac{E}{2\pi N_0}}\, e^{-(E/2N_0)\sin^2\phi_1} \cos\phi_1 \qquad (\text{if } |\phi_1| < \pi/2). \tag{79} \]

In the neighborhood of the origin $\sin\phi_1 \approx \phi_1$ and $\cos\phi_1 \approx 1$ and, correspondingly,

\[ q(\phi_1) \approx \sqrt{\frac{E}{2\pi N_0}}\, e^{-(E/2N_0)\phi_1^2}. \tag{80} \]

That is to say, the probability density of $\phi_1$ is approximately Gaussian in the region where it has the most weight, namely, near the origin. Correspondingly, the joint distribution of $\phi_1$ and $\phi_2$ is approximately circularly symmetric in the same region and, thus, for small $\phi_2'$ the probability density of $\phi_2'$ is approximately given by (79). Since the probability of error for the case $m = 2$ may be evaluated exactly as will be shown below, we shall only assume (79) to be a valid representation for the probability density of $\phi_2'$ when $m \ge 4$ or, correspondingly, by (76) when

\[ |\phi_2'| \le \frac{\pi}{4\sqrt{2}}. \]

14 Some plots of $q(\phi_1)$ for various values of $E/2N_0$ ($= S/N$ in his notation) have been presented by C. R. Cahn, "Performance of digital phase-modulation communication systems," IRE TRANS. ON COMMUNICATIONS SYSTEMS, vol. CS-7, pp. 3-6 (Fig. 2); May, 1959.
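To get a feel for how good the approximations (79) and (80) are, the following sketch evaluates them next to the closed form (77). It assumes only the Python standard library; the function names and the argument `e_over_n0` (standing for $E/N_0$) are choices made for this example.

```python
import math


def q_exact(phi, e_over_n0):
    """Density of the phase angle, closed form (77)."""
    a = math.sqrt(e_over_n0) * math.cos(phi)
    # int_{-a}^{inf} exp(-t^2/2) dt = sqrt(pi/2) * erfc(-a / sqrt(2))
    tail = math.sqrt(math.pi / 2.0) * math.erfc(-a / math.sqrt(2.0))
    peak = a * math.exp(-0.5 * e_over_n0 * math.sin(phi) ** 2) * tail
    return (math.exp(-0.5 * e_over_n0) + peak) / (2.0 * math.pi)


def q_high_snr(phi, e_over_n0):
    """Approximation (79), intended for |phi| < pi/2."""
    scale = math.sqrt(e_over_n0 / (2.0 * math.pi))
    return scale * math.exp(-0.5 * e_over_n0 * math.sin(phi) ** 2) * math.cos(phi)


def q_gaussian(phi, e_over_n0):
    """Gaussian approximation (80), intended for small |phi|."""
    scale = math.sqrt(e_over_n0 / (2.0 * math.pi))
    return scale * math.exp(-0.5 * e_over_n0 * phi * phi)


if __name__ == "__main__":
    for phi in (0.0, 0.1, 0.3, 0.6):
        print(phi, q_exact(phi, 40.0), q_high_snr(phi, 40.0), q_gaussian(phi, 40.0))
```

At high signal-to-noise ratio the three columns agree closely near the origin and drift apart as $|\phi_1|$ grows, which is the behaviour the text relies on when restricting (79) to $m \ge 4$.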
The probability of a correct decision when $m \ge 4$ is thus

\[ 1 - P_e \approx P\left[ |\phi_2'| \le \frac{\pi}{\sqrt{2}\,m} \right] = \sqrt{\frac{E}{2\pi N_0}} \int_{-\pi/\sqrt{2}m}^{\pi/\sqrt{2}m} e^{-(E/2N_0)\sin^2\phi_2'} \cos\phi_2'\,d\phi_2'. \tag{81} \]

Transposing (81) and introducing a new variable

\[ u = \sqrt{\frac{E}{N_0}} \sin\phi_2', \]

we get, finally, that for $m \ge 4$ the probability of error is approximately equal to

\[ P_e \approx \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/N_0}\,\sin(\pi/\sqrt{2}m)}^{\infty} e^{-u^2/2}\,du. \tag{82} \]

We shall now consider the case $m = 2$. From our previous discussion it should be clear that the probability of error is dependent only on the angles $\phi_1$ and $\phi_2$ and not on the phase difference of $2\pi i/m$ introduced between successive signals at the transmitter. Accordingly, in examining the two-level case it is sufficient to calculate the probability of error for the particular case where the transmitter keeps sending the same waveform. That is to say, we shall consider the case for which the signal components of the successively received signal points $z_1$ and $z_2$ coincide. Assuming now that the angle $\phi_1$ is known and is equal to, say, $\phi_1^*$ and that the signal point $z_1$ has coordinates $(a_1, b_1)$, it may be deduced from Fig. 21 that an error will occur if, and only if, the noise component of $z_2$ in a direction parallel to the orientation of the stored vector $z_1$ exceeds $\sqrt{E} \cos\phi_1^*$.15 That is,

\[ P[\text{error} / \phi_1 = \phi_1^*] = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0}\cos\phi_1^*}^{\infty} e^{-x^2/2}\,dx. \tag{83} \]

Fig. 21-Illustration of necessary and sufficient conditions for an error to occur in the 2-level PSK incoherent case, given that $\phi_1 = \phi_1^*$ and that $i = m$.

The average probability of making an error is, however, equal to

\[ P_e = \int_{-\pi}^{\pi} P[\text{error} / \phi_1 = \phi_1^*]\, q(\phi_1^*)\,d\phi_1^*. \tag{84} \]

Substituting (77) and (83) into (84) thus yields

\[ P_e = \int_{-\pi}^{\pi} \left\{ \frac{1}{\sqrt{2\pi}} \int_{\sqrt{E/N_0}\cos\phi_1^*}^{\infty} e^{-x^2/2}\,dx \right\} \cdot \left\{ \frac{e^{-E/2N_0}}{2\pi} + \frac{1}{2\pi} \sqrt{\frac{E}{N_0}} \cos\phi_1^*\, e^{-(E/2N_0)\sin^2\phi_1^*} \int_{-\sqrt{E/N_0}\cos\phi_1^*}^{\infty} e^{-t^2/2}\,dt \right\} d\phi_1^*. \tag{85} \]

Defining the quantity

\[ \ldots \tag{86} \]

we can rewrite (85) as

\[ \ldots \tag{87} \]

Now, by simple symmetry arguments it may be deduced that

\[ \ldots \]

The expression for the average probability of error, (87), thus reduces to

\[ P_e = \frac{1}{2}\, e^{-E/2N_0}. \tag{91} \]

15 Ibid., this approach to calculating the probability of error has been suggested here. An alternate technique involving a reduction of (70) is presented in Fleck and Trabka, op. cit.

E. ASK Incoherent
In the ASK incoherent case, if $S_i(t)$ is transmitted, the received signal point $z$ has coordinates of the form [see (34a) and (34b)]

\[ x = \int_0^T z(t)\,\varphi_1(t)\,dt = \sqrt{E_i} \cos\alpha + n_1 \]

\[ y = \int_0^T z(t)\,\varphi_2(t)\,dt = -\sqrt{E_i} \sin\alpha + n_2 \]

where $n_1$ and $n_2$ are independent-Gaussian-random variables with zero mean and variance $N_0$ and $\alpha$ is a uniformly distributed random variable over a $2\pi$ interval. The conditional joint density function of the random variables $x$ and $y$, given that $\alpha = A$ and that $S_i(t)$ was
transmitted, is thus equal to

\[ p_i(x, y / \alpha = A) = \frac{1}{2\pi N_0} \exp\left\{ -\frac{(x - \sqrt{E_i}\cos A)^2 + (y + \sqrt{E_i}\sin A)^2}{2N_0} \right\}. \tag{92} \]

Correspondingly, averaging over all possible values of $A$, we get the conditional joint density function of $x$ and $y$, given only that $S_i(t)$ was transmitted, equal to

\[ \begin{aligned} p_i(x, y) &= \int_{\theta}^{\theta + 2\pi} p_i(x, y / \alpha = A)\,\frac{dA}{2\pi} \\ &= \frac{1}{2\pi N_0} \exp\left\{ -\frac{x^2 + y^2 + E_i}{2N_0} \right\} \cdot \frac{1}{2\pi} \int_{\theta}^{\theta + 2\pi} \exp\left\{ \frac{x\sqrt{E_i}\cos A - y\sqrt{E_i}\sin A}{N_0} \right\} dA \\ &= \frac{1}{2\pi N_0} \exp\left\{ -\frac{x^2 + y^2 + E_i}{2N_0} \right\} I_0\left( \frac{\sqrt{x^2 + y^2}\,\sqrt{E_i}}{N_0} \right) \end{aligned} \tag{93} \]

where $I_0$ is the modified Bessel function of the first kind of zero order.16 As previously noted [see discussion preceding and following (35)], the decision rule for this case is simply to round off the measured value of $|z| = \sqrt{x^2 + y^2}$ to the nearest value of $\sqrt{E_i}$ and guess that the corresponding signal was transmitted. Thus, if $S_i(t)$ is transmitted, the received signal will be interpreted correctly if, and only if,

\[ \frac{\sqrt{E_{i-1}} + \sqrt{E_i}}{2} < \sqrt{x^2 + y^2} < \frac{\sqrt{E_i} + \sqrt{E_{i+1}}}{2}, \qquad i = 1, 2, \ldots, m \tag{94} \]

where we define $\sqrt{E_0} = -\sqrt{E_1}$ and $\sqrt{E_{m+1}} = \infty$. Now the probability that inequality (94) is satisfied, i.e., the probability that the received signal point $z$ falls into the detection zone $R_i$ when $S_i(t)$ is transmitted, may be expressed as

\[ P[z \in R_i / S_i] = \iint_{R_i} p_i(x, y)\,dx\,dy = \iint_{R_i} \frac{1}{2\pi N_0} \exp\left\{ -\frac{x^2 + y^2 + E_i}{2N_0} \right\} I_0\left( \frac{\sqrt{x^2 + y^2}\,\sqrt{E_i}}{N_0} \right) dx\,dy. \tag{95} \]

Transforming to polar coordinates with

\[ x = v\sqrt{N_0}\cos\phi \quad \text{and} \quad y = v\sqrt{N_0}\sin\phi, \tag{96} \]

we get

\[ P[z \in R_i / S_i] = \frac{1}{2\pi} \iint_{R_i} \exp\left\{ -\left( \frac{v^2}{2} + \frac{E_i}{2N_0} \right) \right\} I_0\left( v\sqrt{\frac{E_i}{N_0}} \right) v\,dv\,d\phi. \]

Since the region $R_i$ is defined by the equations

\[ \frac{\sqrt{E_{i-1}} + \sqrt{E_i}}{2\sqrt{N_0}} < v < \frac{\sqrt{E_i} + \sqrt{E_{i+1}}}{2\sqrt{N_0}} \tag{97a} \]

\[ 0 \le \phi \le 2\pi, \tag{97b} \]

it follows readily that

\[ P[z \in R_i / S_i] = \int_{(\sqrt{E_{i-1}} + \sqrt{E_i})/2\sqrt{N_0}}^{(\sqrt{E_i} + \sqrt{E_{i+1}})/2\sqrt{N_0}} \exp\left\{ -\left( \frac{v^2}{2} + \frac{E_i}{2N_0} \right) \right\} I_0\left( v\sqrt{\frac{E_i}{N_0}} \right) v\,dv. \tag{98} \]

Unfortunately, however, this integral is not simply solvable except for the particular case $\sqrt{E_i} = 0$ nor has it been tabulated over the regions of interest.17 It is possible, however, to bound integrals of this type from above and below by analyzing the integral geometrically and then modifying the regions of integration. Thus, for example, we might note that the integral in question represents the probability that a two-dimensional spherically symmetric Gaussian noise vector with mean zero and variance $N_0$, originating from some point lying on the circumference of a circle of radius $\sqrt{E_i}$, falls inside the ring determined by the circles (concentric with the first) of radius

\[ \frac{\sqrt{E_{i-1}} + \sqrt{E_i}}{2} \quad \text{and} \quad \frac{\sqrt{E_i} + \sqrt{E_{i+1}}}{2}, \]

respectively. The probability of the noise vector falling inside the ring, however, is certainly larger than the probability of it falling inside the circle of radius $\mathcal{R}_i$, centered on the noise origin (see Fig. 22), where $\mathcal{R}_i$ is chosen as large as possible subject only to the constraint that the circle must lie in the ring. On the other hand, the probability of landing in the ring is certainly less than the probability of landing in the shaded region of Fig. 22(c). Both these probabilities are easily evaluated. The probability that the noise lands inside the circle of radius $\mathcal{R}_i$ [see Fig. 22(b)], which we shall denote by $\underline{P}(i)$, is equal to

\[ \underline{P}(i) = \iint_{n_1^2 + n_2^2 \le \mathcal{R}_i^2} \frac{1}{2\pi N_0} \exp\left\{ -\frac{n_1^2 + n_2^2}{2N_0} \right\} dn_1\,dn_2 = 1 - \exp\left\{ -\frac{\mathcal{R}_i^2}{2N_0} \right\}. \tag{99} \]

16 F. Bowman, "Introduction to Bessel Functions," Dover Publications, Inc., New York, N. Y., pp. 41-42; 1958.
17 Eq. (98) may be expressed as the difference of 2 Q functions. The Q function is defined in J. I. Marcum and P. Swerling, "Studies of Target Detection by Pulsed Radar," IRE TRANS. ON INFORMATION THEORY (Special Monograph Issue), vol. IT-6, p. 159; April, 1960. The Q function has been tabulated, though not within the ranges of interest of this report, by J. I. Marcum, "Table of Q Functions," Rand Corp., Santa Monica, Calif., Rept. RM-339; January 1, 1950.
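The integral (98) is not available in closed form, but it can be evaluated numerically and compared with the inscribed-circle bound (99). The sketch below does this for uniformly spaced amplitudes $\sqrt{E_i} = (i-1)\Delta$, for which the largest inscribed circle has radius $\Delta/2$. The function names, the midpoint quadrature, and the numerical evaluation of $I_0$ through its integral representation are assumptions of this example rather than the paper's procedure.

```python
import math


def scaled_bessel_i0(x, steps=200):
    """exp(-x) * I0(x), using I0(x) = (1/pi) * int_0^pi exp(x cos t) dt."""
    h = math.pi / steps
    return sum(
        math.exp(-x * (1.0 - math.cos((k + 0.5) * h))) for k in range(steps)
    ) / steps


def radius_cdf(b, a, steps=400):
    """int_0^b v exp(-(v^2 + a^2)/2) I0(a v) dv by the midpoint rule."""
    if b <= 0.0:
        return 0.0
    h = b / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * h
        # exp(-(v^2 + a^2)/2) * I0(a v) = exp(-(v - a)^2 / 2) * [exp(-a v) I0(a v)]
        total += v * math.exp(-0.5 * (v - a) ** 2) * scaled_bessel_i0(a * v)
    return total * h


def ring_probability(i, m, delta, n0):
    """(98) for amplitudes sqrt(E_i) = (i - 1) * delta, i = 1, ..., m."""
    root_n0 = math.sqrt(n0)
    a = (i - 1) * delta / root_n0
    lower = max(0.0, (2 * i - 3) * delta / (2.0 * root_n0))
    if i == m:  # outer radius is infinite for the last ring
        return 1.0 - radius_cdf(lower, a)
    upper = (2 * i - 1) * delta / (2.0 * root_n0)
    return radius_cdf(upper, a) - radius_cdf(lower, a)


def circle_bound(delta, n0):
    """(99) with the inscribed circle of radius delta / 2."""
    return 1.0 - math.exp(-delta * delta / (8.0 * n0))


if __name__ == "__main__":
    m, delta, n0 = 4, 2.0, 1.0
    for i in range(1, m + 1):
        print(i, ring_probability(i, m, delta, n0), circle_bound(delta, n0))
```

For $i = 1$ the ring degenerates to a disc centered on the noise origin, so the two columns coincide; for the other values of $i$ the ring probability exceeds the circle bound.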
Fig. 22-Geometric interpretation of (98). (a) The right-hand side of (98) is equal to the probability that the two-dimensional spherically symmetric Gaussian noise vector, with coordinates $n_1$, $n_2$, each having zero mean and variance $N_0$, lands inside the shaded region. (b) The probability of this event is certainly larger than the probability of landing in this shaded region. (c) The probability of this event is certainly smaller than the probability of landing in this shaded region.

The probability that the noise lands in the shaded region of Fig. 22(c), $\overline{P}(i)$, is equal to

\[ \overline{P}(i) = 1 - \frac{1}{\sqrt{2\pi}} \int_{(\sqrt{E_{i+1}} - \sqrt{E_i})/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\,dx. \tag{100} \]

Note that

\[ \underline{P}(i) < P[z \in R_i / S_i] < \overline{P}(i). \tag{101} \]

Now if, in particular, we assume that the message points are uniformly spaced starting with zero, then, by (52),

\[ \sqrt{E_i} = (i - 1)\Delta, \qquad i = 1, 2, \ldots, m \]

where, assuming the same average power constraints as in the ASK coherent case [see (58)],

\[ \Delta = \sqrt{\frac{6E}{(m-1)(2m-1)}}. \]

It follows readily by direct substitution that

\[ \underline{P}(i) = 1 - e^{-\Delta^2/8N_0}. \tag{102} \]

The average probability of a correct decision is equal to

\[ 1 - P_e = \frac{1}{m} \sum_{i=1}^{m} P[z \in R_i / S_i] \tag{103} \]

and is, therefore, by (101) and (102), bounded from below by

\[ 1 - P_e > 1 - e^{-\Delta^2/8N_0}. \tag{104} \]

An upper bound to $1 - P_e$ can be obtained in a similar fashion by evaluating (100) for different values of $i$. The resultant expression is, however, rather cumbersome. A satisfactory upper bound can be obtained quite simply by noting that since

\[ P[z \notin R_i / S_i] > 0, \]

it is certainly true that [see (39)]

\[ P_e = \frac{1}{m} \sum_{i=1}^{m} P[z \notin R_i / S_i] > \frac{1}{m} P[z \notin R_1 / S_1]. \]

That is,

\[ P_e > \frac{1 - P[z \in R_1 / S_1]}{m}, \tag{105} \]

but, for the case $i = 1$, (98) reduces to

\[ P[z \in R_1 / S_1] = 1 - e^{-\Delta^2/8N_0}. \]

Substituting this latter result into (105) yields

\[ P_e > \frac{1}{m}\, e^{-\Delta^2/8N_0}. \tag{106} \]

Finally combining (104), (106) and (58), we get the average probability of error bounded by

\[ \frac{1}{m} \exp\left\{ -\frac{6E}{8N_0(m-1)(2m-1)} \right\} < P_e < \exp\left\{ -\frac{6E}{8N_0(m-1)(2m-1)} \right\}. \tag{107} \]

The bounds are valid for all $m \ge 2$.

F. FSK Incoherent
In the FSK incoherent case, if $S_i(t)$ is transmitted, the received signal point $z$ has, by (37a) and (37b), coordinates of the form

\[ x_j = \begin{cases} \sqrt{E}\cos\alpha + n_{xj}, & j = i \\ n_{xj}, & j \neq i \end{cases} \qquad y_j = \begin{cases} -\sqrt{E}\sin\alpha + n_{yj}, & j = i \\ n_{yj}, & j \neq i \end{cases} \]

where the $n_{xj}$ and $n_{yj}$, $j = 1, 2, \ldots, m$ are independent-Gaussian-random variables with mean zero and variance $N_0$ and $\alpha$ is a uniformly distributed random variable over a $2\pi$ interval. A correct decision will be made by the detector when $S_i(t)$ is sent, if, and only if, for all $j \neq i$ the inequality

\[ \sqrt{x_j^2 + y_j^2} < \sqrt{x_i^2 + y_i^2} \tag{108} \]
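is satisfied. The probability density of the random variable $\sqrt{x_j^2 + y_j^2}$, $j = 1, 2, \ldots, m$ may be established readily by converting the joint density of $x_j$ and $y_j$ to polar form and then integrating out the angular dependency. In particular, if $j = i$, the joint conditional density of $x_i$, $y_i$, given that $S_i(t)$ was transmitted and that $\alpha = A$, is equal to

\[ p_i(x_i, y_i / \alpha = A) = \frac{1}{2\pi N_0} \exp\left\{ -\frac{1}{2N_0}\left[ (x_i - \sqrt{E}\cos A)^2 + (y_i + \sqrt{E}\sin A)^2 \right] \right\}. \tag{109} \]

Averaging over all possible values of $A$ to get the joint conditional density of $x_i$ and $y_i$, given only that $S_i(t)$ was transmitted, yields

\[ \begin{aligned} p_i(x_i, y_i) &= \int_{\theta}^{\theta + 2\pi} p_i(x_i, y_i / \alpha = A)\,\frac{dA}{2\pi} \\ &= \frac{1}{2\pi N_0} \exp\left\{ -\frac{1}{2N_0}\left[ x_i^2 + y_i^2 + E \right] \right\} \cdot \frac{1}{2\pi} \int_{\theta}^{\theta + 2\pi} \exp\left\{ \frac{x_i\sqrt{E}\cos A - y_i\sqrt{E}\sin A}{N_0} \right\} dA \\ &= \frac{1}{2\pi N_0} \exp\left\{ -\frac{1}{2N_0}\left[ x_i^2 + y_i^2 + E \right] \right\} I_0\left( \frac{\sqrt{x_i^2 + y_i^2}\,\sqrt{E}}{N_0} \right) \end{aligned} \tag{110} \]

where $I_0$ is the modified Bessel function of the first kind of zero order.16

Letting $x_i = v_i\sqrt{N_0}\cos\phi_i$, $y_i = v_i\sqrt{N_0}\sin\phi_i$ and noting that the Jacobian of the transformation is equal to $v_i N_0$, the joint probability density of the random variables $v_i$ and $\phi_i$, $q(v_i, \phi_i)$ which is equal to13

\[ q(v_i, \phi_i) = v_i N_0 \cdot p_i(v_i\sqrt{N_0}\cos\phi_i,\ v_i\sqrt{N_0}\sin\phi_i), \]

may be written as

\[ q(v_i, \phi_i) = \frac{v_i}{2\pi} \exp\left\{ -\frac{1}{2}\left( v_i^2 + \frac{E}{N_0} \right) \right\} I_0\left( v_i\sqrt{\frac{E}{N_0}} \right). \tag{111} \]

Therefore, the probability density of $v_i$, $q(v_i)$ is equal to

\[ q(v_i) = v_i \exp\left\{ -\frac{1}{2}\left( v_i^2 + \frac{E}{N_0} \right) \right\} I_0\left( v_i\sqrt{\frac{E}{N_0}} \right). \tag{112} \]

If $j \neq i$, then the joint density of the random variables $x_j$ and $y_j$ is equal to the right-hand side of (110) with $E$ set equal to zero. Correspondingly, the probability density of the random variable $v_j$ is equal to the right-hand side of (112) with $E$ set equal to zero. That is,

\[ q(v_j) = v_j\, e^{-v_j^2/2}. \tag{113} \]

Condition (108) is equivalent to requiring that

\[ v_j < v_i \tag{114} \]

for all $j \neq i$. Since the $v_j$ are independent random variables, the probability that inequality (114) is satisfied under the condition that $v_i$ is known is simply

\[ P[v_j < v_i,\ \text{all } j \neq i \,/\, v_i] = \prod_{j \neq i} P[v_j < v_i / v_i]. \tag{115} \]

But the probability that for any particular $j$, $v_j < v_i$, given $v_i$, is equal to [by (113)]

\[ P[v_j < v_i / v_i] = \int_0^{v_i} v_j\, e^{-v_j^2/2}\,dv_j = 1 - e^{-v_i^2/2}. \tag{116} \]

Substituting (116) into (115) yields

\[ P[v_j < v_i,\ \text{all } j \neq i \,/\, v_i] = \left( 1 - e^{-v_i^2/2} \right)^{m-1}. \tag{117} \]

Consequently, the probability of a correct decision when $S_i(t)$ is sent, which is given by

\[ P[v_j < v_i,\ \text{all } j \neq i] = \int_0^{\infty} P[v_j < v_i,\ \text{all } j \neq i \,/\, v_i]\, q(v_i)\,dv_i, \]

is equal to [see (112) and (117)]

\[ P[v_j < v_i,\ \text{all } j \neq i] = \int_0^{\infty} \left( 1 - e^{-v_i^2/2} \right)^{m-1} v_i \exp\left\{ -\frac{1}{2}\left( v_i^2 + \frac{E}{N_0} \right) \right\} I_0\left( v_i\sqrt{\frac{E}{N_0}} \right) dv_i. \tag{118} \]

Utilizing the binomial expansion, we can write

\[ \left[ 1 - e^{-v_i^2/2} \right]^{m-1} = \sum_{k=0}^{m-1} \binom{m-1}{k} (-1)^k e^{-k v_i^2/2} \]

which result, when combined with (118), yields

\[ P[v_j < v_i,\ \text{all } j \neq i] = \exp\left\{ -\frac{E}{2N_0} \right\} \sum_{k=0}^{m-1} \binom{m-1}{k} (-1)^k \int_0^{\infty} v_i \exp\left\{ -\frac{v_i^2 (k+1)}{2} \right\} I_0\left( v_i\sqrt{\frac{E}{N_0}} \right) dv_i. \tag{119} \]

The integral appearing in the right-hand side of (119) is a standard form whose solution is18

\[ \int_0^{\infty} v_i \exp\left\{ -\frac{v_i^2 (k+1)}{2} \right\} I_0\left( v_i\sqrt{\frac{E}{N_0}} \right) dv_i = \frac{1}{k+1} \exp\left\{ \frac{E}{2N_0(k+1)} \right\}. \tag{120} \]

18 G. N. Watson, "A Treatise on the Theory of Bessel Functions," 2nd ed., Cambridge University Press, Cambridge, England, Sec. 13.3, p. 393, Eq. (1); 1958.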
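The standard integral (120) can be spot-checked numerically. The sketch below compares a midpoint-rule evaluation of the left-hand side with the closed form on the right. The function names, the truncation point `v_max`, and the integral representation used for $I_0$ are assumptions of this example.

```python
import math


def scaled_bessel_i0(x, steps=200):
    """exp(-x) * I0(x), using I0(x) = (1/pi) * int_0^pi exp(x cos t) dt."""
    h = math.pi / steps
    return sum(
        math.exp(-x * (1.0 - math.cos((n + 0.5) * h))) for n in range(steps)
    ) / steps


def bessel_gaussian_integral_numeric(k, e_over_n0, v_max=12.0, steps=1200):
    """Left-hand side of (120), truncated at v_max, by the midpoint rule."""
    a = math.sqrt(e_over_n0)
    h = v_max / steps
    total = 0.0
    for n in range(steps):
        v = (n + 0.5) * h
        # I0(a v) is written as exp(a v) * [exp(-a v) I0(a v)] to avoid overflow.
        total += v * math.exp(-0.5 * (k + 1) * v * v + a * v) * scaled_bessel_i0(a * v)
    return total * h


def bessel_gaussian_integral_closed(k, e_over_n0):
    """Right-hand side of (120)."""
    return math.exp(e_over_n0 / (2.0 * (k + 1))) / (k + 1)


if __name__ == "__main__":
    for k in range(4):
        print(k, bessel_gaussian_integral_numeric(k, 4.0),
              bessel_gaussian_integral_closed(k, 4.0))
```

Setting $E = 0$ reduces both sides to $1/(k+1)$, which gives a quick sanity check on the closed form.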
Combining (119) and (120), we get the probability of a correct decision when $S_i(t)$ is transmitted equal to

\[ P[v_j < v_i,\ \text{all } j \neq i] = \exp\left\{ -\frac{E}{2N_0} \right\} \sum_{k=0}^{m-1} \binom{m-1}{k} \frac{(-1)^k}{k+1} \exp\left\{ \frac{E}{2N_0(k+1)} \right\}. \tag{121} \]

The right-hand side of (121) is, however, independent of the choice of $i$ and is, therefore, in fact, equal to the average probability of a correct decision. The average probability of error for an $m$-level incoherent (orthogonal) FSK system is, thus, equal to

\[ P_e = 1 - \exp\left\{ -\frac{E}{2N_0} \right\} \sum_{k=0}^{m-1} \binom{m-1}{k} \frac{(-1)^k}{k+1} \exp\left\{ \frac{E}{2N_0(k+1)} \right\}. \tag{122} \]

Noting that (122) can be written as

\[ P_e = -\exp\left\{ -\frac{E}{2N_0} \right\} \sum_{k=1}^{m-1} \binom{m-1}{k} \frac{(-1)^k}{k+1} \exp\left\{ \frac{E}{2N_0(k+1)} \right\} \]

and that

\[ \frac{1}{k+1} \binom{m-1}{k} = \frac{1}{m} \binom{m}{k+1}, \]

we can, introducing a new summation index, $q = k + 1$, rewrite the expression for the average probability of error in final form as

\[ P_e = \frac{1}{m} \exp\left\{ -\frac{E}{4N_0} \right\} \sum_{q=2}^{m} \binom{m}{q} (-1)^q \exp\left\{ \frac{E(2 - q)}{4N_0\,q} \right\}. \tag{123} \]

VI. CONCLUSIONS
A. Review of Basic Assumptions
In the preceding sections of this paper, an approach to the problem of optimally detecting a set of known waveforms in a stationary-white-Gaussian environment has been presented. Utilizing this approach, three basic-data transmission systems have been studied and a set of error characteristics has been derived. Curves wherein the probability of error (actually $\log_{10} P_e$) is plotted as a function of signal-to-noise ratio (in decibels), the number of levels $m$ appearing as a parameter, are presented in Figs. 23-28 (pages 358-359). Before proceeding to a detailed discussion of these curves, however, we wish to review explicitly the assumptions upon which they are based. In particular we have assumed that:
1) Each signal waveform is transmitted with equal probability.
2) The transmitter is subject to an average power limitation, $E/T$ (watts), where $T$ is the duration of each transmitted signal waveform.
3) The received signal is the sum of the transmitted signal and a noise term, the noise being stationary-white-zero mean-Gaussian with double-sided spectral density $N_0$ (watts/cps). That is to say, the noise power passed by an ideal filter with unit gain and (positive) bandwidth $W$ is $2N_0 W$ watts.
4) The receiver is in time synchronism with the transmitter, by which we mean to say that the receiver knows when to sample and when to quench the product integrators. When, in addition, it is assumed that the receiver is phase locked to the transmitter, the system is referred to as coherent.
5) The received signal is processed by a maximum-likelihood detector except in the ASK incoherent case. For reasons of simplicity, an approximation to the maximum-likelihood detector which approaches the true maximum-likelihood detector in a high signal-to-noise ratio environment was chosen for this case.
6) In the FSK case and the PSK case, the transmitted signal waveforms, which are sinusoidal pulses, each contain equal energy $E$ whereas in the ASK case the amplitudes (that is, the square root of the energy) of the transmitted pulses are uniformly spaced starting with zero. Furthermore, in the FSK cases the transmitted waveforms are orthogonal.

B. Physical Significance of $P_e$
It should be particularly noted that the calculated quantity designated as $P_e$ represents the average probability of misinterpreting the transmitted waveforms. That is to say, if, in a long period of time $KT$, $K$ waveforms are transmitted and, say, $L$ of them are misinterpreted by the detector, then $P_e$ will (almost always) be approximately equal to

\[ P_e \approx \frac{L}{K}, \]

the approximation becoming better as $K \to \infty$. The point we wish to emphasize is that $P_e$, which is sometimes also referred to as the character error, is not, in general, equal to the probability that a single binary symbol is received incorrectly or that a binary sequence of some arbitrary length is received incorrectly. Thus, we might note, for example, that if, in an 8-level system the waveform corresponding to the binary sequence 001 is misinterpreted for the waveform corresponding to the binary sequence 011, a single character error has been made but two out of three of the binary digits have been received correctly. It is, of course, possible for a single
Fig. 23-Probability of waveform error (m-level PSK coherent), assuming that the duration of each signal is fixed independently of m.

Fig. 24-Probability of waveform error (m-level ASK coherent), assuming that the duration of each signal is fixed independently of m.

Fig. 25-Probability of waveform error (m-level FSK coherent), assuming that the duration of each signal is fixed independently of m.

Fig. 26-Probability of waveform error (m-level PSK incoherent), assuming that the duration of each signal is fixed independently of m.
character error to correspond to two or three binary errors (in the 8-level case). Whether or not such a character error is more serious than the other kind depends upon the particular coding scheme used and cannot be predicted in advance. In one coding scheme, for example, we might be interested in the probability that binary sequences of length 6 are received correctly (alpha-numeric code). If an 8-level system is utilized in this situation, each binary sequence of length 6 must be associated with a pair of waveforms which will then be transmitted in succession. Correspondingly, the sequence will be received correctly if, and only if, both the associated waveforms are received correctly. The probability of receiving such a sequence incorrectly is, therefore,

\[ P_{\text{seq}} = 1 - (1 - P_e)^2 \]

which for small $P_e$ is approximately equal to $2P_e$.

In the last cited example, it was a relatively simple task to derive an expression for the probability of the event of interest, namely, that a binary sequence of length 6 is received incorrectly, in terms of $P_e$. In general, however, the events of interest will not be simply related to $P_e$. In fact, it may be impossible to calculate the probability of certain events such as the probability of a binary error without resorting to a detailed analysis of the probability with which various types of waveform errors can occur19 (e.g., what is the probability that if waveform number one is sent it is interpreted as waveform number two, number three, etc.?). Nevertheless, a set of inequalities which adequately relate the probability of the event of interest to the probability of a character error may frequently be established quite easily. Thus, if we are interested in the probability of a binary error we might note that if $K$ waveforms of an $m$-level system are sent in a period of time $KT$, then $K \log_2 m$ (which is assumed to be an integer) binary symbols are sent in that time. If $L$ out of $K$ waveforms are misinterpreted by the receiver, then at least $L$ but no more than $L \log_2 m$ binary symbols are received incorrectly. Consequently, the ratio $r$ of the number of binary symbols received incorrectly to the number transmitted is bounded by

\[ \frac{L}{K \log_2 m} \le r \le \frac{L \log_2 m}{K \log_2 m}. \]

Now, as $K$ becomes large, $L/K$ approaches $P_e$, the probability of a waveform error, and $r$ approaches $P_b$, the probability of a binary error. In the limit, therefore, as $K$ approaches infinity we have

\[ \frac{P_e}{\log_2 m} \le P_b \le P_e. \tag{124} \]

Fig. 27-Probability of waveform error (m-level ASK incoherent), assuming that the duration of each signal is fixed independently of m.

Fig. 28-Probability of waveform error (m-level FSK incoherent), assuming that the duration of each signal is fixed independently of m.

19 Some special cases have been considered by J. K. Wolf, "Comparison of N-ary Transmission Systems," Rome Air Dev. Ctr., Rome, N. Y., Rept. No. RADC-TN-60-210; December, 1960.
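The two relations just derived, the sequence error probability and the bounds (124) on the binary error probability, are simple enough to tabulate directly. The sketch below is a minimal illustration; the function names and the generalization from two waveforms per sequence to an arbitrary count (which assumes waveform errors are independent) are assumptions of this example.

```python
import math


def sequence_error_probability(p_e, waveforms_per_sequence=2):
    """1 - (1 - P_e)^n, assuming independent waveform errors (n = 2 in the text)."""
    return 1.0 - (1.0 - p_e) ** waveforms_per_sequence


def binary_error_bounds(p_e, m):
    """Lower and upper bounds (124) on the binary error probability."""
    return p_e / math.log2(m), p_e


if __name__ == "__main__":
    p_e = 1.0e-3  # hypothetical waveform error probability
    print(sequence_error_probability(p_e))  # close to 2 * p_e for small p_e
    print(binary_error_bounds(p_e, m=8))
```

For the 8-level example in the text, a waveform error rate of $P_e$ therefore corresponds to a binary error rate somewhere between $P_e/3$ and $P_e$.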
C. Accuracy of Presented Curves
Regarding the accuracy of the curves presented, it should be noted that in all cases except PSK incoherent ($m > 2$) either an exact expression for $P_e$ or an upper and lower bound to $P_e$ is plotted. Unfortunately, for the PSK incoherent case simple bounds could not be found with the exception of the 2-level case which was evaluated exactly. Correspondingly, the curves presented for the PSK incoherent case ($m > 2$) represent simply an approximation to the actual probability of error, the approximation being quite good for large values of $m$. We might point out that the plotted curves cannot be used to sharply determine the probability of error for a particular modulation scheme, given $E/2N_0$ and $m$. The reader can verify this himself simply by trial and error, noting in particular that a small uncertainty in $\log P_e$ results in a much larger uncertainty in $P_e$. The inverse problem, which is perhaps the more natural, can be handled much more satisfactorily. That is to say, given a value of $P_e$ which it is desired to maintain and a value of $m$, the needed signal-to-noise ratio ($E/2N_0$) can be determined quite closely.
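The inverse problem mentioned above can be sketched numerically for any case with an exact expression. The example below uses the incoherent FSK result (123), rewritten with its two exponentials combined, and finds the required $E/2N_0$ by bisection. The function names, the search interval, and the iteration count are assumptions of this example; `math.comb` requires Python 3.8 or later.

```python
import math


def fsk_incoherent_error(m, snr):
    """P_e for m-level incoherent orthogonal FSK; snr stands for E/(2*N0).

    This is (123) with the exponents combined:
    (1/m) * sum_{q=2}^{m} C(m, q) * (-1)^q * exp(-snr * (q - 1) / q).
    """
    return sum(
        math.comb(m, q) * (-1) ** q * math.exp(-snr * (q - 1) / q)
        for q in range(2, m + 1)
    ) / m


def required_snr_db(m, target_pe, lo=0.0, hi=200.0, iterations=100):
    """Bisection on E/(2*N0); P_e decreases as the ratio increases."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if fsk_incoherent_error(m, mid) > target_pe:
            lo = mid
        else:
            hi = mid
    return 10.0 * math.log10(hi)


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, required_snr_db(m, target_pe=1.0e-4))
```

Because $P_e$ falls off roughly exponentially with $E/2N_0$, a factor-of-two change in the target error probability moves the required ratio by only a fraction of a decibel, which is why the inverse problem is the better conditioned of the two.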
10·C--
- - , IO· ,C.OOO
-
t £ I - " SII: IH(OH[RINT
£ I-ASIC COH(AEHT { E1-;51< (;()HUtlNT I
10
.'~
•
101OO~
'0
.
~: If )~
PlOT
or Aftf
(xACT [k'AUS!OH
-,
'0
D. Comparisons Assuming Fixed Waveform Duration

Now, referring to the curves appearing in Figs. 23-28, there are a few general conclusions which can be drawn. In the first place, note that increasing m (the number of waveforms which may be transmitted in any time T) tends to increase the probability of error, whereas increasing the energy content in each transmitted signal (i.e., the signal-to-noise ratio) tends to decrease the probability of error. Furthermore, it should be noted that increasing m introduces the most degradation in the ASK case, somewhat less degradation in the PSK case and comparatively little degradation in the FSK case. Geometrically, the reason for this is clear. In all cases, increasing m increases the number of message points. In the ASK case, these points lie in a one-dimensional space; in the PSK case, they lie in a two-dimensional space, whereas in the FSK case the dimension of the space increases linearly with m. Correspondingly, in the first two cases the points (since we are subject to an average power limitation) become crowded together and the frequency of errors increases. In the FSK case, however, all message points may be maintained equidistant. Consequently, there is little deterioration in performance with increasing m. In fact, we might expect that if m is sufficiently large, the average probability of error of an FSK system should be smaller than that of a PSK or an ASK system (if it is not so already). That this is indeed the case may readily be deduced from Figs. 29-31, wherein some cross plots of the error rates for various systems are presented for given values of m. For the sake of clarity only four curves are presented in each diagram rather than the full six. The selected curves are, however, representative, as the following comparison of the performance of coherent systems vs incoherent systems will demonstrate.

Fig. 29-Comparison of error performance (2-level systems).

Fig. 30-Comparison of error performance (4-level systems).

Fig. 31-Comparison of error performance (8-level systems).

E. Coherent vs Incoherent

Of the three basic systems under study, the PSK system suffers the most degradation in performance due to lack of coherence. The calculations performed in Section V show that for m > 2 the probability of error for a PSK coherent system and a PSK incoherent system are approximately given by [see inequality (47) and the remark following it and (82)]

$$P_{e,\,\mathrm{coherent}} \approx \frac{2}{\sqrt{2\pi}} \int_{\sqrt{E/N_0}\,\sin(\pi/m)}^{\infty} e^{-u^2/2}\,du$$

From these equations it may be seen that for large m (i.e., where $\sin(\pi/m) \approx \pi/m$) the cost of incoherence is a 3-db degradation in signal-to-noise ratio. For small m the degradation is somewhat less. In fact, for m = 2 the degradation approaches zero in the high signal-to-noise ratio case. This latter point may be verified by comparing the expressions for the probability of error when m = 2 [see (49) and (91)]. A graphical comparison is presented in Fig. 32. In the ASK and FSK cases effective comparisons between the analytic expressions for the probability of error in the coherent case and the incoherent case do not seem feasible. Yet, examination of Figs. 33 and 34 indicates that in those regions where the probability of error is small enough to be of interest (say, $P_e < 10^{-6}$) the degradation between coherent and incoherent is of the order of a decibel or less. Consequently, either the coherent curves or the incoherent curves may be taken as a representative set of curves for either of these two systems.

F. Comparisons Assuming Fixed Signalling Rate

Returning now to the discussion of degradation in performance with increasing m, we wish to emphasize the fact that any conclusions drawn regarding the relative performance of various systems must take into account the constraints under which the comparisons are made. In particular, it should be noted that the curves presented in Figs. 23-28 are drawn under the assumption that the duration of each transmitted signal is maintained at a fixed value, T, independent of the choice of m. The relative positions of these curves might change considerably if, instead, we considered the equally valid constraint of fixed rate R. (We are still assuming that the transmitter is average-power limited.) Under this constraint we can allow a longer time duration for each waveform in a multilevel system and, hence, increase the energy content of the transmitted signals. Thus, for example, if each waveform employed in a 2-level scheme has a duration T, the waveforms employed in the corresponding 4-level version of the scheme can have a duration 2T. Doubling the allotted time duration doubles the energy content of the signal and, hence, is equivalent to a 3-db boost in signal-to-noise ratio $E/2N_0$. Now, in considering the general case let us attach the subscript m to the parameters E and T to indicate the number of levels which are under discussion. Assuming that the rate at which data is being transmitted,

$$R = \frac{\log_2 m}{T_m} \quad \text{(bits/sec)} \tag{125}$$

is maintained constant, it follows that

$$\frac{\log_2 m}{T_m} = \frac{1}{T_2} \tag{126}$$

and, therefore, that

$$T_m = T_2 \log_2 m. \tag{127}$$

Since the transmitter is average-power limited,

$$\frac{E_m}{T_m} = \frac{E_2}{T_2}. \tag{128}$$

Combining (127) and (128) yields

$$E_m = E_2 \log_2 m \tag{129}$$

from which it follows that

$$10 \log_{10}\left(\frac{E_m}{2N_0}\right) = 10 \log_{10}\left(\frac{E_2}{2N_0}\right) + 10 \log_{10}(\log_2 m). \tag{130}$$
Fig. 34-Comparison of coherent FSK and incoherent FSK.
Fig. 32-Comparison of coherent PSK and incoherent PSK.
Fig. 33-Comparison of coherent ASK and incoherent ASK.
That is to say, under the assumption of constant rate an m-level system has a signal-to-noise ratio advantage of $10 \log_{10}(\log_2 m)$ db over its 2-level counterpart. Making use of this fact, a set of error characteristics for the assumption of constant rate may be derived from the curves presented in Figs. 23-28 (which were drawn under the assumption that the duration time of each transmitted waveform was fixed) simply by translating the m-level characteristic to the left by $10 \log_{10}(\log_2 m)$ db. A sample set of constant rate curves which were so constructed is presented in Figs. 35-38. It may be noted from these curves that multilevel PSK and ASK systems are inferior in performance to their two-level counterparts, whereas FSK systems seem to improve in performance with increasing m. In both the PSK and the ASK cases, however, the difference in performance between the multilevel and the corresponding two-level schemes is smaller under the present assumption of constant rate than under the previous assumption of constant time duration. These results are completely consistent with the geometric picture. In all cases, increasing m increases the number of message points. Under the restriction of constant rate, however, the energy content of the transmitted signals increases with m and, hence, the volume of the sphere within which the message points are constrained to lie also increases with m. This tends to partially counteract the crowding together of message points which occurs when m is increased in the PSK and ASK cases. In the FSK case, the message points are actually moved further apart as m increases, the result being an improvement in performance.
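The size of the horizontal shift applied to each fixed-duration curve is easy to tabulate. The sketch below assumes only the constant-rate relation stated above, namely an advantage of 10 log10(log2 m) db over the 2-level system; the function name `constant_rate_snr_gain_db` is illustrative.

```python
import math


def constant_rate_snr_gain_db(m: int) -> float:
    """Signal-to-noise advantage, in db, of an m-level system over its
    2-level counterpart when the data rate is held constant."""
    if m < 2:
        raise ValueError("m must be at least 2")
    return 10.0 * math.log10(math.log2(m))


# Shift (in db) to apply to the m-level curves of the fixed-duration plots.
for m in (2, 4, 8, 16):
    print(f"m = {m:2d}: shift left by {constant_rate_snr_gain_db(m):.2f} db")
```

The 4-level entry reproduces the 3-db boost obtained by doubling the waveform duration.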
Fig. 35-Probability of waveform error (m-level PSK coherent), assuming that signal duration is adjusted to keep the data rate constant.
Fig. 36-Probability of waveform error (m-level PSK incoherent), assuming that signal duration is adjusted to keep the data rate constant.
Fig. 37-Probability of waveform error (m-level ASK coherent), assuming that signal duration is adjusted to keep the data rate constant.
Fig. 38-Probability of waveform error (m-level FSK incoherent), assuming that signal duration is adjusted to keep the data rate constant.
G. Discussion of Bandwidth

A significant factor in the comparison of different communication systems is the quality of the channel required by the system to maintain a "satisfactory" flow of information between the transmitter and receiver sites. To this point in the discussion, all comparisons have been made under the assumption that a distortionless "wideband" Gaussian channel (which is completely specified by the spectral density of the noise $N_0$) is available. Unfortunately, channels of this type, while most amenable to analysis, are hard to come by in practice. In particular, the bandwidth allotted to any one transmitter is generally limited and, hence, the relative efficiency with which it uses the available bandwidth is of prime interest. As a measure of bandwidth efficiency we shall introduce the parameter r defined as the ratio of the rate at which information is being transmitted, $R = (\log_2 m)/T$ bits/sec, to the Nyquist rate of transmission, 2B bits/sec. That is,

$$r = \frac{R}{2B} = \frac{\log_2 m}{2BT} \tag{131}$$

where

m = the number of different waveforms which may be transmitted
T = the duration time of a waveform
B = bandwidth required to maintain satisfactory operation.

It is quite difficult to define B precisely. For the purposes of the present discussion it will be adequate to assume simply that a sinusoid of duration T and frequency $f_0$ will be passed with negligible distortion by an ideal filter with pass band 1.5/T centered around $f_0$. It follows, therefore, that for the PSK and ASK modulation schemes, wherein the frequency of the pulses sent in each time slot is fixed, the required transmitter bandwidth is $B \approx 1.5/T$ and, correspondingly, the bandwidth efficiency is

$$r \approx \frac{\log_2 m}{3}. \tag{132}$$

In the FSK case, however, assuming a separation of 1/T between adjacent tones, an m-level transmitter requires a bandwidth (see Fig. 39)

$$B \approx \frac{m + 0.5}{T},$$

in which case

$$r \approx \frac{\log_2 m}{2m + 1}. \tag{133}$$

Fig. 39-Illustration of transmission bandwidth required by an m-level orthogonal FSK incoherent system.

In the FSK coherent case, however, adjacent signals need only be separated by a frequency difference of 1/2T to maintain orthogonality and, hence, the required bandwidth may be reduced to

$$B \approx \frac{m + 2}{2T},$$

resulting in a bandwidth efficiency of

$$r \approx \frac{\log_2 m}{m + 2}. \tag{134}$$

Fig. 40-Plot of bandwidth efficiency as a function of the number of levels.
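The three bandwidth-efficiency expressions can be compared numerically. The sketch below assumes the idealized bandwidths adopted above (1.5/T for PSK and ASK, (m + 0.5)/T for FSK with tone spacing 1/T, and (m + 2)/2T for coherent FSK with tone spacing 1/2T); the function name and the scheme labels are illustrative choices, not notation from the paper.

```python
import math


def bandwidth_efficiency(m: int, scheme: str) -> float:
    """Bandwidth efficiency r = log2(m) / (2 B T) for an m-level system."""
    bits = math.log2(m)
    if scheme in ("PSK", "ASK"):
        return bits / 3.0              # B ~ 1.5 / T
    if scheme == "FSK incoherent":
        return bits / (2 * m + 1)      # B ~ (m + 0.5) / T
    if scheme == "FSK coherent":
        return bits / (m + 2)          # B ~ (m + 2) / (2 T)
    raise ValueError(f"unknown scheme: {scheme}")


for m in (2, 4, 8, 16):
    row = ", ".join(
        f"{s}: {bandwidth_efficiency(m, s):.3f}"
        for s in ("PSK", "FSK incoherent", "FSK coherent")
    )
    print(f"m = {m:2d} -> {row}")

# Information rate R = 2 B r for a hypothetical 2-kc channel and 4-level PSK.
print(2 * 2000 * bandwidth_efficiency(4, "PSK"), "bits/sec")
```

Under these assumptions the PSK and ASK efficiency grows with m, while both FSK efficiencies fall once m exceeds a few levels.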
Eqs. (132), (133) and (134) are plotted for purposes of comparison in Fig. 40. It is apparent from Fig. 40 that simple multilevel orthogonal FSK systems, even under idealized operating conditions, are inefficient users of bandwidth. Physically, the reason is clear; in any time period T only a fraction of the total system bandwidth, namely, that occupied by the particular tone transmitted, is utilized. Thus, we see that although the number of levels and, correspondingly, the rate (if we keep the duration time of each waveform T fixed) of such an FSK system may be increased with relatively little degradation in performance (see Figs. 25 and 28), there is, correspondingly, an increase in the bandwidth required by the system to operate. Consequently, the utility of simple multilevel orthogonal FSK systems is limited to situations where conservation of bandwidth is not of principal concern.

It is clear from Fig. 40 that multilevel PSK and ASK modulation schemes utilize bandwidth more efficiently than do the FSK modulation schemes. For these systems the bandwidth utilization factor r increases monotonically with the number of levels m. The probability of error, however, as may be deduced from Figs. 23, 24, 26 and 27, also increases with m. There is, thus, an effective upper limit to r beyond which the error rate becomes intolerable. This upper limit will be a function of the signal-to-noise ratio on the channel.

H. Numerical Examples
A feel for the types of tradeoff involved can perhaps best be gained by considering a numerical example. Let us, therefore, investigate the feasibility of operating one of the modulation schemes discussed above at the Nyquist rate, that is, at r = 1. We assume for the sake of definiteness that the maximum acceptable error rate is $10^{-5}$ (characters per second) and that the signal-to-noise ratio available is 15 db. The curves of Fig. 40 indicate that a transmission rate corresponding to r = 1 limits our choice of systems to 8-level PSK or 8-level ASK. Furthermore, an examination of Fig. 23 reveals that if the signal-to-noise ratio is limited to 15 db, the error rate of an 8-level PSK coherent system is of the order of $10^{-3}$ character/sec, which is unacceptable, whereas the error rates of PSK incoherent and ASK modulation schemes are even higher (see Fig. 31). It follows, therefore, that under the assumed constraints none of the systems considered will operate satisfactorily at the Nyquist rate. If, on the other hand, the available signal-to-noise ratio were increased to 22 db, both the PSK coherent and the PSK incoherent modulation schemes would meet the stated requirements.

As a second example let us consider in a semiqualitative way the type of performance we might expect from one particular modulation scheme, namely, 4-level PSK incoherent on a telephone channel. Assuming a usable bandwidth of about 2 kc, it follows from Fig. 40 that, for a 4-level PSK system, $r \approx 0.68$ and, thus, the rate at which information may be transmitted is of the order of $R \approx (4000)(0.68) \approx 2700$ bits/sec. Correspondingly, from Fig. 26 we may deduce that this system will operate at a character error rate of $10^{-5}$ with a signal-to-noise ratio of about 15½ db and at a character error rate of about $10^{-20}$ with a signal-to-noise ratio of 22 db.

I. Final Remarks
Since typically the signal-to-noise ratio on a telephone channel is well in excess of 22 db and yet the error rates of existing 4-phase systems are more nearly on the order of $10^{-5}$, one can only conclude that the principal sources of errors on a telephone channel are non-Gaussian. Indeed, Kelly and Mercurio²⁰ have noted that the major sources of errors on a telephone channel appear to be impulse noise and dropouts.

There are, of course, additional factors which tend to limit the performance of actual communication systems which have been ignored in the present analysis. Among these are distortion in the received signal and intersymbol interference due to the nonlinear delay, bandlimiting and gain fluctuation of the medium coupling the transmitter to the receiver. Furthermore, the received signal is processed in less than ideal fashion by the detector due to imperfections in the hardware and timing recovery. It is to be noted that all these factors ultimately manifest themselves at the detector simply as a perturbation in the position of the transmitted message point. As such, the effects are similar to those produced by the noise and may largely be compensated for by an additional margin of signal-to-noise ratio at the detector.²¹ Conversely, we might, loosely speaking, say that some fraction of the total signal-to-noise ratio available at the detector is needed to compensate for effects of the type listed above which were not accounted for in the basic analysis. Consequently, only the remaining fraction of signal-to-noise ratio is available for combating Gaussian noise. It is to be expected, therefore, that any predictions of the probability of error based on estimates of the total signal-to-noise ratio available at the detector will be unduly optimistic.

20 J. P. Kelly and J. F. Mercurio, "Comparative Performance of Digital Data Modems," Mitre Corp., Bedford, Mass., Tech. Memo. TM-3037, p. 4; April 14, 1961.

21 An assessment of the effects of delay distortion in terms of a parameter related to signal-to-noise ratio for a particular data transmission scheme has been carried out by R. A. Gibby, "An evaluation of AM data system performance by computer simulation," Bell Sys. Tech. J., vol. 39, pp. 675-704, Ref. 1; May, 1960.

APPENDIX I

The following is devoted to a proof of Theorem I. Within the course of the proof, a technique for calculating the orthonormal functions $\varphi_1(t), \varphi_2(t), \ldots,$

If the set of functions is not linearly dependent, it is said to be linearly independent and vice versa. Consider the given set of waveforms $S_1(t), S_2(t), \ldots, S_m(t)$. Either this set is linearly independent or it is not linearly independent. If not, then (by definition) there exists a set of constants, $b_1, b_2, \ldots, b_m$, not all equal to zero, such that

$$b_1 S_1(t) + b_2 S_2(t) + \cdots + b_m S_m(t) \equiv 0.$$

Suppose, in particular, that $b_m \neq 0$. Then,

$$S_m(t) = -\left[\frac{b_1}{b_m} S_1(t) + \frac{b_2}{b_m} S_2(t) + \cdots + \frac{b_{m-1}}{b_m} S_{m-1}(t)\right].$$

That is to say, $S_m(t)$ can be expressed in terms of the remaining (m - 1) waveforms.
Consider now the set of waveforms $S_1(t), S_2(t), \ldots, S_{m-1}(t)$. Either this set is linearly independent or it is not. If not, there exists a set of constants, $c_1, c_2, \ldots, c_{m-1}$, not all equal to zero, such that

$$c_1 S_1(t) + c_2 S_2(t) + \cdots + c_{m-1} S_{m-1}(t) \equiv 0.$$

Suppose that $c_{m-1} \neq 0$. Then,

$$S_{m-1}(t) = -\left[\frac{c_1}{c_{m-1}} S_1(t) + \cdots + \frac{c_{m-2}}{c_{m-1}} S_{m-2}(t)\right]$$

which implies that $S_{m-1}(t)$ can be expressed as a linear combination of the remaining (m - 2) waveforms. Now, examining the set of waveforms $S_1(t), S_2(t), \ldots, S_{m-2}(t)$ for linear independence and continuing in this fashion, it is clear that we will eventually end up with a linearly independent subset of the original set of waveforms, say, $S_1(t), S_2(t), \ldots, S_k(t)$, $k \le m$. (The indices of the given set of waveforms can always be permuted in such a fashion that the first k waveforms, $S_1(t), S_2(t), \ldots, S_k(t)$, will be linearly independent.) Note that each of the given waveforms $S_1(t), S_2(t), \ldots, S_m(t)$ may be expressed as a linear combination of these k waveforms.

We shall now, utilizing the Gram-Schmidt process, show that if the given waveforms are physically realizable (or, to be more precise, $L^2$ functions), which condition guarantees the existence of the integrals in question, it is possible to construct a set of k orthonormal waveforms, $\varphi_1(t), \varphi_2(t), \ldots, \varphi_k(t)$, from the derived linearly independent waveforms $S_1(t), S_2(t), \ldots, S_k(t)$. As a starting point, set

$$\varphi_1(t) = \frac{S_1(t)}{\sqrt{\int_0^T S_1^2(t)\,dt}}. \tag{136}$$

It is clear that

$$\int_0^T \varphi_1^2(t)\,dt = 1. \tag{137}$$

Now, define a new intermediate function,

$$h_2(t) = S_2(t) - \lambda \varphi_1(t) \tag{138}$$

where $\lambda$ is some constant which is yet to be determined. Since, by (137) and (138),

$$\int_0^T h_2(t)\varphi_1(t)\,dt = \int_0^T S_2(t)\varphi_1(t)\,dt - \lambda, \tag{139}$$

it is clear that if we set

$$\lambda = \int_0^T S_2(t)\varphi_1(t)\,dt \tag{140}$$

and

$$\varphi_2(t) = \frac{h_2(t)}{\sqrt{\int_0^T h_2^2(t)\,dt}} \tag{141}$$

then

$$\int_0^T \varphi_1(t)\varphi_2(t)\,dt = 0$$

and

$$\int_0^T \varphi_2^2(t)\,dt = 1.$$

That is to say, $\varphi_1(t)$ and $\varphi_2(t)$ form an orthonormal set. Continuing in the same fashion, set

$$h_i(t) = S_i(t) - \gamma_1 \varphi_1(t) - \gamma_2 \varphi_2(t) - \cdots - \gamma_{i-1} \varphi_{i-1}(t)$$

and the constants $\gamma_j$, $j = 1, 2, \ldots, i - 1$ equal to

$$\gamma_j = \int_0^T S_i(t)\varphi_j(t)\,dt. \tag{143}$$

It follows readily that the set of functions

$$\varphi_i(t) = \frac{h_i(t)}{\sqrt{\int_0^T h_i^2(t)\,dt}}, \qquad i = 1, 2, \ldots, k \tag{144}$$

form an orthonormal set. Since each waveform of the derived subset $S_i(t)$, $i = 1, 2, \ldots, k$ may be expressed as a linear combination of the $\varphi_j(t)$, $j = 1, 2, \ldots, k$, it follows that each of the originally given waveforms $S_i(t)$, $i = 1, 2, \ldots, m$ may be expressed as a linear combination of the $\varphi_j(t)$, $j = 1, 2, \ldots, k$. That is to say, we can write

$$S_i(t) = \sum_{j=1}^{k} a_{ij}\varphi_j(t), \qquad i = 1, 2, \ldots, m \tag{145}$$

where the $a_{ij}$ are constants. Furthermore, multiplying both sides of (145) by $\varphi_j(t)$ and integrating from 0 to T, we can deduce the fact that

$$a_{ij} = \int_0^T S_i(t)\varphi_j(t)\,dt, \qquad i = 1, 2, \ldots, m; \quad j = 1, 2, \ldots, k. \tag{146}$$

We remark that the results of this appendix are abstracted from the general theory of vector spaces. The interested reader would do well to refer to some of the standard texts in the area.²²
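The Gram-Schmidt construction of this appendix can be tried numerically on sampled waveforms. The following is a minimal sketch, assuming the integrals over (0, T) are approximated by sums over equally spaced samples; the function name `gram_schmidt`, the tolerance `tol` and the three test waveforms are illustrative assumptions. Linearly dependent waveforms are skipped as they are met, which plays the role of first selecting the independent subset.

```python
import numpy as np


def gram_schmidt(waveforms: np.ndarray, dt: float, tol: float = 1e-10) -> np.ndarray:
    """Orthonormalize sampled waveforms (one per row).

    The inner product is the sampled version of the integral of f(t) g(t)
    over (0, T): sum(f * g) * dt.
    """
    basis = []
    for s in waveforms:
        h = s.astype(float).copy()
        for phi in basis:
            gamma = np.sum(s * phi) * dt   # gamma_j = integral of S_i * phi_j
            h -= gamma * phi               # h_i = S_i - sum_j gamma_j * phi_j
        energy = np.sum(h * h) * dt
        if energy > tol:                   # a dependent waveform leaves h ~ 0
            basis.append(h / np.sqrt(energy))
    return np.array(basis)


# Three waveforms on (0, T); the third is a combination of the first two.
T, n = 1.0, 1000
dt = T / n
t = (np.arange(n) + 0.5) * dt
S = np.vstack([np.ones(n), t, 2.0 + 3.0 * t])

phi = gram_schmidt(S, dt)
a = S @ phi.T * dt                         # a_ij = integral of S_i * phi_j

print("number of orthonormal functions k:", phi.shape[0])
print("Gram matrix:\n", np.round(phi @ phi.T * dt, 6))
print("max reconstruction error:", np.max(np.abs(a @ phi - S)))
```

Here m = 3 waveforms yield k = 2 orthonormal functions, and every waveform is recovered from its coefficients.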
22 P. R. Halmos, "Finite Dimensional Vector Spaces," D. Van Nostrand Inc., Princeton, N. J., 1958; G. Birkhoff and S. MacLane, "A Survey of Modern Algebra," The Macmillan Co., New York, N. Y.; 1960.

APPENDIX II

In this appendix we wish to investigate the properties of the quantities $n_j$, $j = 1, 2, \ldots, k$ defined in (13). In so doing, we shall use the notation of Davenport and Root¹ and shall also make use of some of the results derived therein. Now, by (13),

$$n_j = \int_0^T n(t)\varphi_j(t)\,dt, \qquad j = 1, 2, \ldots, k$$

where the $\varphi_j(t)$ form an orthonormal set.

If n(t) is a Gaussian random process, then $n_j$ is a Gaussian random variable²³ and is thus characterized completely by its mean and variance. In particular, the mean of $n_j$ is equal to

$$\bar{n}_j = E[n_j] = \int_0^T E[n(t)]\varphi_j(t)\,dt \tag{147}$$

and the variance of $n_j$ is equal to

$$\sigma_j^2 = E\left[\int_0^T dt \int_0^T ds\; n(t)\varphi_j(t)n(s)\varphi_j(s)\right] - \bar{n}_j^2 = \int_0^T dt \int_0^T ds\; E[n(t)n(s)]\varphi_j(t)\varphi_j(s) - \bar{n}_j^2. \tag{148}$$

If n(t) is a zero mean process, then

$$E[n(t)] = 0 \tag{149}$$

which in turn, by (147), implies

$$\bar{n}_j = E[n_j] = 0. \tag{150}$$

By definition, the statistical autocorrelation function of the random process n(t) is equal to

$$R(t, s) = E[n(t)n(s)]. \tag{151}$$

If n(t) is stationary, then the autocorrelation function is a function of the time difference t - s alone and not of the particular choice of t and s per se. Summarizing the results to this point, we note that if n(t) is a stationary Gaussian random process with zero mean, then $n_j$ is a Gaussian random variable with zero mean and variance $\sigma_j^2$ equal to

$$\sigma_j^2 = \int_0^T dt \int_0^T ds\; R(t - s)\varphi_j(t)\varphi_j(s). \tag{152}$$

For a stationary random process, the spectral density and the statistical autocorrelation function form a Fourier transform pair.²⁴ In particular,

$$R(\tau) = \int_{-\infty}^{\infty} W(f) e^{+j2\pi f\tau}\,df \tag{153}$$

where W(f) represents the spectral density of the stationary random process in question. For white noise, $W(f) = N_0$ for all f and, correspondingly,

$$R(\tau) = N_0\,\delta(\tau) \tag{154}$$

where $\delta(\tau)$ is the unit impulse.²⁵ Consequently, substituting this result into the expression for the variance, (152), we get

$$\sigma_j^2 = N_0 \int_0^T dt \int_0^T ds\; \delta(t - s)\varphi_j(t)\varphi_j(s) = N_0 \int_0^T dt\; \varphi_j^2(t) = N_0 \tag{155}$$

where we have utilized the sifting integral²⁶ and the fact that the $\varphi_j(t)$, $j = 1, 2, \ldots, k$ form an orthonormal set. It follows readily that if $j \neq k$,

$$E[n_j n_k] = E\left[\int_0^T dt \int_0^T ds\; n(t)\varphi_j(t)n(s)\varphi_k(s)\right] = \int_0^T dt \int_0^T ds\; R(t - s)\varphi_j(t)\varphi_k(s) = N_0 \int_0^T dt\; \varphi_j(t)\varphi_k(t) = 0. \tag{156}$$

This suffices to prove that the Gaussian random variables $n_j$ and $n_k$ are independent if $j \neq k$.²⁷

We might further point out that successive noise outputs in time are independent. That is to say, if

$$n = \int_0^T n(t)\varphi(t)\,dt \tag{157}$$

and

$$n^* = \int_T^{2T} n(t)\varphi(t - T)\,dt \tag{158}$$

where it is assumed that $\varphi(t)$ is nonzero only for $0 \le t \le T$, then

$$E[nn^*] = 0. \tag{159}$$

An easy way to see this is to let

$$g_1(t) = \begin{cases} \varphi(t), & 0 \le t < T \\ 0, & T \le t < 2T \end{cases} \tag{160}$$

$$g_2(t) = \begin{cases} 0, & 0 \le t < T \\ \varphi(t - T), & T \le t < 2T \end{cases} \tag{161}$$

and then to note that (157) and (158) can be rewritten in the form

$$n = \int_0^{2T} n(t)g_1(t)\,dt \tag{162}$$

$$n^* = \int_0^{2T} n(t)g_2(t)\,dt. \tag{163}$$

Now, noting further that

$$\int_0^{2T} g_1(t)g_2(t)\,dt = 0, \tag{164}$$

it follows readily that

$$E[nn^*] = E\left[\int_0^{2T} dt \int_0^{2T} ds\; n(t)g_1(t)n(s)g_2(s)\right] = \int_0^{2T} dt \int_0^{2T} ds\; N_0\,\delta(t - s)g_1(t)g_2(s) = N_0 \int_0^{2T} dt\; g_1(t)g_2(t) = 0.$$

23 Davenport and Root, op. cit., pp. 155-156.
24 Davenport and Root, op. cit., p. 104.
25 Davenport and Root, op. cit., pp. 365-368.
26 Davenport and Root, op. cit., pp. 365-368.
27 Davenport and Root, op. cit., pp. 55-58.
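The two conclusions of this appendix, that each noise coordinate has variance N0 and that distinct coordinates are uncorrelated, can be checked by simulation. The sketch below is only an approximation under stated assumptions: white noise of spectral density N0 is modeled by independent Gaussian samples of variance N0/dt, the integrals are replaced by sums, and the two orthonormal functions (a sine and a cosine over one period) are arbitrary illustrative choices.

```python
import numpy as np


def project_white_noise(n_trials=20000, n_samples=200, T=1.0, N0=2.0, seed=0):
    """Project sampled white Gaussian noise onto two orthonormal functions.

    Returns the arrays of coordinates n_1 and n_2, one entry per trial.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_samples
    t = (np.arange(n_samples) + 0.5) * dt

    # Two functions that are orthonormal on (0, T) under sum(f * g) * dt.
    phi1 = np.sqrt(2.0 / T) * np.sin(2.0 * np.pi * t / T)
    phi2 = np.sqrt(2.0 / T) * np.cos(2.0 * np.pi * t / T)

    # Discrete stand-in for white noise with R(tau) = N0 * delta(tau).
    noise = rng.normal(0.0, np.sqrt(N0 / dt), size=(n_trials, n_samples))

    n1 = noise @ phi1 * dt   # n_1 = integral of n(t) * phi_1(t)
    n2 = noise @ phi2 * dt   # n_2 = integral of n(t) * phi_2(t)
    return n1, n2


n1, n2 = project_white_noise()
print("variance of n_1:", np.var(n1))
print("variance of n_2:", np.var(n2))
print("E[n_1 n_2]     :", np.mean(n1 * n2))
```

With N0 = 2 the two sample variances should come out close to 2 and the cross moment close to zero, up to Monte Carlo fluctuation.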
APPENDIX III

This appendix is devoted to a proof of Theorem III. The proof will proceed in two steps. Initially, we shall show that a maximum-likelihood detector minimizes the probability of error if each possible signal is transmitted with equal probability. Secondly, we shall show that if the transmitted signals are perturbed by additive stationary white zero-mean Gaussian noise, then, in the coherent case, maximum-likelihood detection is equivalent to picking the message point closest to the received signal point and guessing that the corresponding signal was transmitted.

Assume that each time slot T one of the m possible signals $S_1(t), S_2(t), \ldots, S_m(t)$ is transmitted with equal probability, namely, 1/m. Assume further that, whenever a signal is transmitted, a point (or vector) y is observed at the detector. Denoting the set of all possibly observed y by Y, the observation space, we suppose that the conditional probability density of y [under the condition that $S_i(t)$ is sent], $p_i(y)$, $i = 1, 2, \ldots, m$, is defined on Y. Our objective is to establish a rule for partitioning the space Y into a set of disjoint regions, $Y_1, Y_2, \ldots, Y_m$, such that if we guess that $S_i(t)$ was transmitted whenever y lands in the region $Y_i$, $i = 1, 2, \ldots, m$, the probability of error is at a minimum. Note initially that the probability that y lands in the region $Y_j$ when $S_k(t)$ is transmitted may be written in the following equivalent ways:

$$P[y \in Y_j / S_k] = \int_{Y_j} p_k(y)\,dy.$$

Now, assuming that we do partition the observation space Y into a set of disjoint regions, $Y_1, Y_2, \ldots, Y_m$, and then guess that $S_i(t)$ was transmitted if y lands in the region $Y_i$, $i = 1, 2, \ldots, m$, the probability of incorrect decision $P_e$ is equal to

$$P_e = \frac{1}{m}\sum_{i=1}^{m} P[y \notin Y_i / S_i] = 1 - \frac{1}{m}\sum_{i=1}^{m} P[y \in Y_i / S_i] = 1 - \frac{1}{m}\sum_{i=1}^{m} \int_{Y_i} p_i(y)\,dy.$$

It is clear that $P_e$ will be minimized if $\sum_{i=1}^{m} \int_{Y_i} p_i(y)\,dy$ is maximized. A little reflection, however, indicates that this latter sum will take on its maximum value if we set $Y_i$ equal to the set of points y in Y for which

$$p_i(y) > p_q(y), \qquad q = 1, 2, \ldots, m; \quad q \neq i.$$

(In case there is a point $y_0$ in Y for which, say,

$$p_1(y_0) = p_2(y_0) = p_3(y_0) > p_q(y_0), \qquad q = 4, 5, \ldots, m,$$

the point $y_0$ can be assigned to either $Y_1$ or $Y_2$ or $Y_3$.) The decision rule embodied in this partitioning of the observation space is equivalent to taking the observed point, say $y_0$, finding that value of i for which $p_i(y_0)$ is a maximum and then guessing that the corresponding signal $S_i(t)$ was transmitted. That is to say, this decision rule is equivalent to maximum-likelihood detection. We conclude, therefore, that if each signal is transmitted with equal probability, a maximum-likelihood detector will minimize the probability of error.

In the text it has been shown that, in the case of coherent detection, if $S_i(t)$ is transmitted, the received signal can be characterized by a point in a k-dimensional Euclidean space with coordinates [see (12)]

$$y_j = a_{ij} + n_j, \qquad j = 1, 2, \ldots, k$$

where the $a_{ij}$ are the coordinates of the transmitted signal and the $n_j$ are independent Gaussian random variables with zero mean and variance $N_0$. The decision as to what signal was sent will be based on the coordinates of the received point. That is to say, in this case the observation space Y is a k-dimensional Euclidean space. Accordingly, let us designate the k-dimensional random vector corresponding to the received signal by y (the symbol z was used in the paper). Since each coordinate of y, namely, $y_j = a_{ij} + n_j$; $j = 1, 2, \ldots, k$, is an independent Gaussian random variable with mean $a_{ij}$ when $S_i(t)$ is transmitted and variance $N_0$, the conditional probability density function $p_i(y)$ is equal to

$$p_i(y) = \left(\frac{1}{2\pi N_0}\right)^{k/2} \exp\left\{-\frac{1}{2N_0}\sum_{j=1}^{k}(y_j - a_{ij})^2\right\}.$$

But, by (5),

$$\sum_{j=1}^{k}(y_j - a_{ij})^2 = d^2(y, S_i)$$

where $d(y, S_i)$ is equal to the distance between the received point y and the transmitted point $S_i$.

Now, suppose a particular signal is observed at the receiver; that is to say, $y = y_0$. The maximum-likelihood detection rule is simply to choose that value of i for which

$$p_i(y_0) = \left(\frac{1}{2\pi N_0}\right)^{k/2} \exp\left\{-\frac{1}{2N_0}d^2(y_0, S_i)\right\}$$

is a maximum and guess that the corresponding signal $S_i(t)$ was transmitted. This is, however, equivalent to selecting that value of i for which $d(y_0, S_i)$ is a minimum or selecting the message point closest to the received signal point.
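The equivalence between maximizing the Gaussian likelihood and minimizing the distance to the message points can be illustrated in a few lines. This is a sketch under the assumptions of this appendix (equally likely signals, independent noise coordinates of variance N0); the function name `ml_detect` and the four-point constellation are hypothetical examples, not taken from the paper.

```python
import numpy as np


def ml_detect(y: np.ndarray, message_points: np.ndarray, n0: float) -> tuple[int, int]:
    """Return (index maximizing p_i(y), index minimizing d(y, S_i))."""
    d2 = np.sum((message_points - y) ** 2, axis=1)       # d^2(y, S_i)
    k = message_points.shape[1]
    # log p_i(y) = -(k/2) log(2 pi N0) - d^2(y, S_i) / (2 N0)
    log_p = -0.5 * k * np.log(2.0 * np.pi * n0) - d2 / (2.0 * n0)
    return int(np.argmax(log_p)), int(np.argmin(d2))


# Hypothetical 4-phase message points in a two-dimensional signal space.
points = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])

rng = np.random.default_rng(1)
for _ in range(5):
    sent = int(rng.integers(len(points)))
    y = points[sent] + rng.normal(0.0, np.sqrt(0.1), size=2)
    by_likelihood, by_distance = ml_detect(y, points, n0=0.1)
    print(sent, by_likelihood, by_distance)
```

The two returned indices agree because the likelihood depends on i only through the distance, and the exponential is monotonic in it.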
APPENDIX IV
The following is devoted to discussing the decision rules which have been adopted for the incoherent systems. As shown in Appendix III, a maximum-likelihood detector will minimize the average probability of error if each possible message is transmitted with equal probability. Accordingly, we shall, in each case, derive the decision rule corresponding to maximum-likelihood detection but shall, for the purpose of simplicity, modify the decision rule derived for the ASK case.

A. PSK Incoherent

In the PSK incoherent case the decision as to what message was sent is based on the successive outputs of the two product integrators. Assuming, in particular, that the ith message has been sent, the decision will be based on the four quantities described by (30), namely,

$$x_1 = \sqrt{E}\cos\alpha + n_{11}$$
$$y_1 = -\sqrt{E}\sin\alpha + n_{12}$$
$$x_2 = \sqrt{E}\cos(\alpha + 2\pi i/m) + n_{21}$$
$$y_2 = -\sqrt{E}\sin(\alpha + 2\pi i/m) + n_{22}$$

where $n_{11}$, $n_{12}$, $n_{21}$, $n_{22}$ are independent Gaussian random variables with zero mean and variance $N_0$ and $\alpha$ is assumed to be uniformly distributed over a $2\pi$ interval. That is to say,

$$p(\alpha) = \begin{cases} \dfrac{1}{2\pi}, & \theta \le \alpha < \theta + 2\pi \\ 0, & \text{elsewhere.} \end{cases} \tag{165}$$

The conditional joint density function of the random variables $x_1$, $y_1$, $x_2$, $y_2$, given, say, that $\alpha = A$ and that the ith message was transmitted, may be written as

$$p_i(x_1, y_1, x_2, y_2/\alpha = A) = \left(\frac{1}{2\pi N_0}\right)^2 \exp\left\{-\frac{1}{2N_0}\left[(x_1 - \sqrt{E}\cos A)^2 + (y_1 + \sqrt{E}\sin A)^2 + \left(x_2 - \sqrt{E}\cos\left(A + \frac{2\pi i}{m}\right)\right)^2 + \left(y_2 + \sqrt{E}\sin\left(A + \frac{2\pi i}{m}\right)\right)^2\right]\right\} = D \exp\{P\cos A + Q\sin A\} \tag{166}$$

where

$$D = \left(\frac{1}{2\pi N_0}\right)^2 \exp\left\{-\frac{1}{2N_0}\left[x_1^2 + y_1^2 + x_2^2 + y_2^2 + 2E\right]\right\} \tag{167}$$

$$P = \frac{\sqrt{E}}{N_0}\left[x_1 + x_2\cos\frac{2\pi i}{m} - y_2\sin\frac{2\pi i}{m}\right] \tag{168}$$

$$Q = \frac{\sqrt{E}}{N_0}\left[-y_1 - x_2\sin\frac{2\pi i}{m} - y_2\cos\frac{2\pi i}{m}\right]. \tag{169}$$

The joint conditional density for $x_1$, $y_1$, $x_2$, $y_2$, assuming only that the ith message was sent, is equal to

$$p_i(x_1, y_1, x_2, y_2) = \int_{\theta}^{\theta + 2\pi} p_i(x_1, y_1, x_2, y_2/\alpha = A)\,p(A)\,dA \tag{170}$$

where we are averaging over all possible A. Substituting (165) and (166) into (170) and performing the indicated integration, we get

$$p_i(x_1, y_1, x_2, y_2) = D\,I_0\left(\sqrt{P^2 + Q^2}\right) \tag{171}$$

where $I_0$ is the modified Bessel function of the first kind of zero order.¹⁶ Now, in principle, to decide what signal was sent, the decision box of the detector should substitute the measured values of $x_1$, $y_1$, $x_2$, $y_2$ into the right-hand side of (171) and evaluate this expression for all values of i, $i = 1, 2, \ldots, m$. It should then select that value of i which yielded the largest value and assume that the corresponding signal was sent. However, since the quantity D is independent of the choice of i and the Bessel function increases monotonically with its argument, it is sufficient to find the value of i which maximizes the quantity $P^2 + Q^2$. By (168) and (169), we find that

$$P^2 + Q^2 = \frac{E}{N_0^2}\left\{x_1^2 + y_1^2 + x_2^2 + y_2^2 + 2(x_1 x_2 + y_1 y_2)\cos\frac{2\pi i}{m} + 2(y_1 x_2 - x_1 y_2)\sin\frac{2\pi i}{m}\right\}. \tag{172}$$

The right-hand side of (172) can be interpreted geometrically. Before doing so, however, we wish to point out that since [as can be deduced from (29a), (30a) and (30b)]

$$\int_0^T \varphi_1(t)\sqrt{\frac{2E}{T}}\cos(\omega_0 t + \alpha)\,dt = \sqrt{E}\cos\alpha$$

$$\int_0^T \varphi_2(t)\sqrt{\frac{2E}{T}}\cos(\omega_0 t + \alpha)\,dt = -\sqrt{E}\sin\alpha,$$

it is necessary to consider angular displacements in the clockwise direction to be positive if the output of the
x = p cos () y
=
-p
sin 8.
In Fig. 41, $x$, $y$, $\rho$ and $\theta$ are defined.

Fig. 41-Definition of polar and rectangular coordinate systems adopted in Section A, Appendix IV.

Now, let us consider a pair of successively received signal points with coordinates $(a_1, b_1)$ and $(a_2, b_2)$, respectively. That is to say, we are assuming that the random variables $x_1$, $y_1$, $x_2$, $y_2$ take on the particular values $a_1$, $b_1$, $a_2$, $b_2$, respectively. The two signal points are shown plotted in Fig. 42.

Fig. 42-Geometric interpretation of (172) and (177).

If we denote the point $(a_3, b_3)$ as the rotation of the point $(a_2, b_2)$ in the counterclockwise direction by $2\pi i/m$ radians, it is apparent, since

\[
a_2 = \sqrt{a_2^2 + b_2^2}\,\cos\theta_2, \qquad b_2 = -\sqrt{a_2^2 + b_2^2}\,\sin\theta_2,
\]

that

\[
a_3 = \sqrt{a_2^2 + b_2^2}\,\cos\Big(\theta_2 - \frac{2\pi i}{m}\Big) = a_2\cos\frac{2\pi i}{m} - b_2\sin\frac{2\pi i}{m} \tag{173}
\]

and

\[
b_3 = -\sqrt{a_2^2 + b_2^2}\,\sin\Big(\theta_2 - \frac{2\pi i}{m}\Big) = b_2\cos\frac{2\pi i}{m} + a_2\sin\frac{2\pi i}{m}. \tag{174}
\]

Furthermore, applying the law of cosines to the angle $\psi$ defined in Fig. 42, we get

\[
\cos\psi = \frac{a_1 a_3 + b_1 b_3}{\sqrt{a_1^2 + b_1^2}\,\sqrt{a_3^2 + b_3^2}}. \tag{175}
\]

Substituting (173) and (174) into (175) yields

\[
\cos\psi = \frac{(a_1 a_2 + b_1 b_2)\cos\dfrac{2\pi i}{m} + (b_1 a_2 - a_1 b_2)\sin\dfrac{2\pi i}{m}}{\sqrt{a_1^2 + b_1^2}\,\sqrt{a_2^2 + b_2^2}}. \tag{176}
\]

Substituting this result into (172) yields (recall we are considering that particular case where $x_1 = a_1$, $y_1 = b_1$, $x_2 = a_2$, $y_2 = b_2$)

\[
P^2 + Q^2 = \frac{E}{N_0^2}\Big\{a_1^2 + b_1^2 + a_2^2 + b_2^2 + 2\sqrt{a_1^2 + b_1^2}\,\sqrt{a_2^2 + b_2^2}\,\cos\psi\Big\}. \tag{177}
\]

The only term in the right-hand side of (177) which is dependent upon $i$, the index of the transmitted signal, is $\cos\psi$. Accordingly, that value of $i$, $i = 1, 2, \ldots, m$, which maximizes $\cos\psi$ will also maximize $(P^2 + Q^2)$. Equivalently, we can solve for that value of $i$ which minimizes $|\psi|$. That is to say, the decision rule, as may be deduced from Fig. 42, reduces simply to measuring the phase difference between successively received signals, rounding off the measured value to the nearest value of $2\pi i/m$, $i = 1, 2, \ldots, m$, and guessing that the corresponding signal was transmitted.

B. ASK Incoherent

In the ASK incoherent case, if $S_i(t)$ is transmitted, the decision at the receiver as to what signal was transmitted is based on the random variables $x$ and $y$ described in (34a) and (34b), namely,

\[
x = \sqrt{E_i}\cos\alpha + n_1, \qquad y = -\sqrt{E_i}\sin\alpha + n_2,
\]

where $n_1$, $n_2$ are independent Gaussian random variables with zero mean and variance $N_0$ and $\alpha$ is assumed to be uniformly distributed over a $2\pi$ interval. The joint conditional probability density function of the random variables $x$ and $y$, given that $\alpha = A$ and that $S_i(t)$ was transmitted, is equal to

\[
p_i(x, y \mid \alpha = A) = \frac{1}{2\pi N_0}\exp\Big\{-\frac{1}{2N_0}\Big[(x - \sqrt{E_i}\cos A)^2 + (y + \sqrt{E_i}\sin A)^2\Big]\Big\}. \tag{178}
\]

Therefore, averaging over all possible values of $A$, the conditional joint density function of the random variables $x$ and $y$, given only that $S_i(t)$ was transmitted, is equal to

\[
p_i(x, y) = \int_{\theta}^{\theta + 2\pi} p_i(x, y \mid \alpha = A)\,\frac{dA}{2\pi}
= \frac{1}{2\pi N_0}\exp\Big\{-\frac{1}{2N_0}\big(x^2 + y^2 + E_i\big)\Big\}\, I_0\!\left(\frac{\sqrt{E_i\,(x^2 + y^2)}}{N_0}\right) \tag{179}
\]
where $I_0$ is the modified Bessel function of the first kind of zero order. The optimum decision rule for the detector to follow is to substitute the measured value of $x$ and $y$ into the right-hand side of (179), find the value of $i$ which maximizes the resultant expression and assume that the corresponding signal was transmitted. Unfortunately, this decision rule is rather complicated to instrument and we shall adopt, instead, its asymptotic form although we shall not always be operating in ranges where the resultant rule is optimum. The asymptotic expansion of the Bessel function is given by²⁸

\[
I_0(z) \sim \frac{e^{z}}{\sqrt{2\pi z}}\Big[1 + \frac{1}{8z} + \cdots\Big]. \tag{180}
\]

Substituting the first term of the expansion into the right-hand side of (179), we get

\[
p_i(x, y) \approx \frac{1}{2\pi\sqrt{2\pi N_0}\,\big[E_i(x^2 + y^2)\big]^{1/4}}\exp\Big\{-\frac{1}{2N_0}\Big[\sqrt{x^2 + y^2} - \sqrt{E_i}\Big]^2\Big\}. \tag{181}
\]

In the regions where this expansion is valid, the behavior of the right-hand side of (181) is dominated by the exponential term. It follows, therefore, that the set of $(x, y)$ points for which $p_i(x, y) > p_j(x, y)$, all $j \ne i$, corresponds approximately to the set of $(x, y)$ points for which

\[
\Big|\sqrt{x^2 + y^2} - \sqrt{E_i}\Big| < \Big|\sqrt{x^2 + y^2} - \sqrt{E_j}\Big| \qquad \text{all } j \ne i. \tag{182}
\]

That is to say, it is approximately true that if, in particular, $x = a$ and $y = b$, the value of $i$ which maximizes $p_i(a, b)$ corresponds to the value of $i$ which minimizes $\big|\sqrt{a^2 + b^2} - \sqrt{E_i}\big|$. The indicated decision rule is, therefore, to calculate the rms amplitude of the received signal, round off to the nearest value of $\sqrt{E_i}$, $i = 1, 2, \ldots, m$, and guess that the corresponding signal was transmitted. It should be noted that this decision rule, although seemingly a natural one to adopt, is only optimum in the region where it is legitimate to approximate the Bessel function by the first term of its asymptotic expansion, that is, for those values of $x$, $y$, $\sqrt{E_i}$ and $N_0$ for which $\sqrt{x^2 + y^2}\,\sqrt{E_i}/N_0 \gg 1$.

28 Bowman, op. cit., p. 84.

C. FSK Incoherent

In the FSK incoherent case, if $S_i(t)$ is transmitted, the detector will base its decision as to what was sent on the $2m$ random variables [see (37a) and (37b)]

\[
x_j = \begin{cases} \sqrt{E}\cos\alpha + n_{xj}, & j = i \\ n_{xj}, & j \ne i \end{cases}
\qquad
y_j = \begin{cases} -\sqrt{E}\sin\alpha + n_{yj}, & j = i \\ n_{yj}, & j \ne i \end{cases}
\]

where the $n_{xj}$ and $n_{yj}$ are independent Gaussian random variables, each having zero mean and variance $N_0$, and $\alpha$ is a random variable uniformly distributed over a $2\pi$ interval. The conditional joint probability density of the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$, given that $S_i(t)$ was sent and that $\alpha = A$, is equal to

\[
\begin{aligned}
p_i(x_1, \ldots, x_m, y_1, \ldots, y_m \mid \alpha = A)
&= \left(\frac{1}{2\pi N_0}\right)^{m} \exp\Big[-\frac{1}{2N_0}\Big\{\sum_{j \ne i}(x_j^2 + y_j^2) + (x_i - \sqrt{E}\cos A)^2 + (y_i + \sqrt{E}\sin A)^2\Big\}\Big] \\
&= L \exp\Big\{\frac{\sqrt{E}}{N_0}\,(x_i\cos A - y_i\sin A)\Big\}
\end{aligned} \tag{183}
\]

where

\[
L = \left(\frac{1}{2\pi N_0}\right)^{m} \exp\Big\{-\frac{1}{2N_0}\Big[\sum_{j=1}^{m}(x_j^2 + y_j^2) + E\Big]\Big\}. \tag{184}
\]

Now, averaging over all possible values of $A$ to find the joint conditional probability density of the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$, given only that $S_i(t)$ was sent, we get

\[
p_i(x_1, \ldots, x_m, y_1, \ldots, y_m) = \int_{\theta}^{\theta + 2\pi} p_i(x_1, \ldots, x_m, y_1, \ldots, y_m \mid \alpha = A)\,\frac{dA}{2\pi}
= L\, I_0\!\left(\frac{\sqrt{E}\,\sqrt{x_i^2 + y_i^2}}{N_0}\right) \tag{185}
\]

where $I_0$ is the modified Bessel function of the first kind of zero order. Clearly, since $L$ is independent of the choice of $i$ and $I_0$ is a monotonically increasing function of its argument, the set of points $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$ for which $p_i > p_j$, all $j \ne i$, is equal to the set of points for which

\[
\sqrt{x_i^2 + y_i^2} > \sqrt{x_j^2 + y_j^2} \qquad \text{all } j \ne i. \tag{186}
\]

Thus, if, at some instant, the random variables $x_1, x_2, \ldots, x_m, y_1, y_2, \ldots, y_m$ take on the particular values $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_m$, the receiver should calculate the $m$ rms amplitudes $\sqrt{a_j^2 + b_j^2}$, $j = 1, 2, \ldots, m$ (each of which is associated with a particular frequency), select the largest one and guess that the corresponding signal was transmitted.
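The three incoherent decision rules derived in this appendix (phase-difference rounding for PSK, nearest amplitude for ASK, largest envelope for FSK) can be sketched in a few lines. This is only an illustrative sketch: the function names and argument layout are assumptions introduced here, not part of the paper, and the PSK rule assumes the clockwise-positive angle convention $x = \rho\cos\theta$, $y = -\rho\sin\theta$ of Section A.

```python
import math


def psk_incoherent_decide(p1, p2, m):
    """Phase-comparison rule of Section A (hypothetical helper).

    p1, p2 are successive (x, y) outputs of the product integrators.
    Angles follow x = rho*cos(theta), y = -rho*sin(theta), so the
    measured phase advance theta2 - theta1 estimates 2*pi*i/m.
    Returns the message index i in 1..m.
    """
    theta1 = math.atan2(-p1[1], p1[0])
    theta2 = math.atan2(-p2[1], p2[0])
    step = 2.0 * math.pi / m
    diff = (theta2 - theta1) % (2.0 * math.pi)
    k = round(diff / step) % m
    return m if k == 0 else k  # a full turn corresponds to i = m


def ask_incoherent_decide(x, y, energies):
    """Asymptotic rule (182): pick the sqrt(E_i) nearest the envelope."""
    r = math.hypot(x, y)
    best = min(range(len(energies)),
               key=lambda i: abs(r - math.sqrt(energies[i])))
    return best + 1


def fsk_incoherent_decide(xs, ys):
    """Rule (186): pick the frequency with the largest envelope."""
    env = [math.hypot(x, y) for x, y in zip(xs, ys)]
    return max(range(len(env)), key=env.__getitem__) + 1
```

With noise-free inputs each rule returns the transmitted index; with noise, only the FSK rule is exactly the maximum-likelihood rule, the PSK rule follows from maximizing (177), and the ASK rule is the asymptotic approximation discussed above.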
APPENDIX V

This appendix is devoted to a proof of Theorem IV. Recall that the decision rule for the maximum-likelihood detector in the coherent case is simply to choose the message point closest to the received signal point and to guess that the corresponding signal was transmitted. Accordingly, an error will occur if, and only if, when $S_i(t)$ is transmitted, the received signal point lies closer to one of the message points $S_j$ $(j \ne i)$ than to the message point $S_i$. Denoting the distance between message points $S_i$ and $S_j$ by $\rho_{ij}$ and the noise component originating at the point $S_i$ and directed towards the point $S_j$ by $n_{ij}$, it follows that the probability of this event, which equals the probability of an error, $P_{ei}$, when $S_i(t)$ is transmitted, is equal to

\[
P_{ei} = P\Big[n_{ij} > \frac{\rho_{ij}}{2} \text{ for at least one } j \ne i\Big] \tag{187}
\]

where $n_{ij}$ is a Gaussian random variable with mean zero and variance $N_0$. Since the events $n_{ij} > \rho_{ij}/2$ are not necessarily mutually exclusive, $P_{ei}$ is certainly less than or equal to

\[
P_{ei} \le \sum_{j \ne i} P\Big[n_{ij} > \frac{\rho_{ij}}{2}\Big]. \tag{188}
\]

Now, noting that

\[
P\Big[n_{ij} > \frac{\rho_{ij}}{2}\Big] = \frac{1}{\sqrt{2\pi}}\int_{\rho_{ij}/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\, dx \tag{189}
\]

and defining

\[
H\Big(\frac{\rho_{ij}}{2\sqrt{N_0}}\Big) = \frac{1}{\sqrt{2\pi}}\int_{\rho_{ij}/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\, dx, \tag{190}
\]

it follows, since $H(\rho_{ij}/2\sqrt{N_0})$ is a monotonically decreasing function of its argument, that

\[
P_{ei} \le \sum_{j \ne i} H\Big(\frac{\rho_{ij}}{2\sqrt{N_0}}\Big) \le (m - 1)\, H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big) \tag{191}
\]

where

\[
\rho_i = \min_{j \ne i}\,[\rho_{ij}]. \tag{192}
\]

The quantity $\rho_i$ defined in (192) is equal to the distance between the signal point $S_i$ and its nearest neighbor. It is clear [by (187) and (192)] that

\[
P_{ei} \ge H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big) \tag{193}
\]

which result, when combined with (191), yields

\[
H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big) \le P_{ei} \le (m - 1)\, H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big). \tag{194}
\]

The average probability of error, $P_e$, which is equal to

\[
P_e = \frac{1}{m}\sum_{i=1}^{m} P_{ei} \tag{195}
\]

is, therefore, by (194), bounded by

\[
\frac{1}{m}\sum_{i=1}^{m} H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big) \le P_e \le \frac{m - 1}{m}\sum_{i=1}^{m} H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big). \tag{196}
\]

Since the second derivative of $H(\rho_i/2\sqrt{N_0})$ (with respect to its argument) exists and is greater than or equal to zero if $\rho_i \ge 0$, it follows that $H(\rho_i/2\sqrt{N_0})$ is a convex function²⁹ for $\rho_i \ge 0$. That is to say,

\[
\frac{1}{m}\sum_{i=1}^{m} H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big) \ge H\Big(\frac{1}{m}\sum_{i=1}^{m}\frac{\rho_i}{2\sqrt{N_0}}\Big). \tag{197}
\]

Defining the symbol $\bar\rho$ as the average of $\rho_i$, we have

\[
\bar\rho = \frac{1}{m}\sum_{i=1}^{m}\rho_i \tag{198}
\]

and combining (197) and (198) with the left-hand inequality of (196), we get

\[
P_e \ge H\Big(\frac{\bar\rho}{2\sqrt{N_0}}\Big). \tag{199}
\]

Further, defining the symbol

\[
\rho^* = \min_i\,[\rho_i] \tag{200}
\]

and noting that

\[
H\Big(\frac{\rho_i}{2\sqrt{N_0}}\Big) \le H\Big(\frac{\rho^*}{2\sqrt{N_0}}\Big) \tag{201}
\]

we get, by combining (201) with the right-hand inequality of (196),

\[
P_e \le (m - 1)\, H\Big(\frac{\rho^*}{2\sqrt{N_0}}\Big). \tag{202}
\]

Thus, by the inequalities of (199) and (202), we have

\[
H\Big(\frac{\bar\rho}{2\sqrt{N_0}}\Big) \le P_e \le (m - 1)\, H\Big(\frac{\rho^*}{2\sqrt{N_0}}\Big)
\]

which, by (190), may be rewritten in the form presented in the text, namely,

\[
\frac{1}{\sqrt{2\pi}}\int_{\bar\rho/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\, dx \le P_e \le \frac{m - 1}{\sqrt{2\pi}}\int_{\rho^*/2\sqrt{N_0}}^{\infty} e^{-x^2/2}\, dx. \tag{203}
\]

29 G. H. Hardy, J. E. Littlewood and G. Polya, "Inequalities," Cambridge University Press, Cambridge, England; 1959. In particular, note Secs. 3.5 and 3.10.
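The bounds in (203) are easy to evaluate for a concrete set of message points. The following is a minimal sketch, assuming the message points are given as coordinate tuples and that the noise variance per component is $N_0$ as in the appendix; the function names are illustrative and do not appear in the paper. $H$ is computed from the complementary error function, since $H(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

```python
import math


def gaussian_tail(x):
    """H(x) of (190): upper tail of the unit-variance Gaussian."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probability_bounds(points, n0):
    """Lower and upper bounds of (203) on the average error probability.

    points: list of message-point coordinates (tuples of equal length).
    n0: noise variance N0 of each noise component.
    """
    m = len(points)
    # rho_i of (192): distance from each point to its nearest neighbor.
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    rho_bar = sum(nearest) / m   # average nearest-neighbor distance, (198)
    rho_star = min(nearest)      # smallest nearest-neighbor distance, (200)
    scale = 2.0 * math.sqrt(n0)
    lower = gaussian_tail(rho_bar / scale)
    upper = (m - 1) * gaussian_tail(rho_star / scale)
    return lower, upper
```

For four points at $(\pm 1, \pm 1)$ with $N_0 = 0.25$, every nearest-neighbor distance is 2, so both bounds are built from $H(2)$, and the exact error probability $1 - (1 - H(2))^2$ of this square arrangement lies between them.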
Performance of Combined Amplitude and Phase-Modulated Communication Systems*

J. C. HANCOCK†, MEMBER, IRE, AND R. W. LUCKY†

Summary-The performance of two types of digital phase- and amplitude-modulated systems is investigated for the high signal-to-noise ratio region. Approximate expressions for the probability of error and channel capacity of the more optimum of these two systems are compared with corresponding expressions for probability of error and channel capacity for a digital phase-modulated system. It is shown that the phase- and amplitude-modulated systems show a definite power advantage over the phase-only system when the information content per transmitted symbol must be greater than 3 bits. From a channel capacity standpoint, the phase- and amplitude-modulated systems make more efficient use of the channel for signal-to-noise ratios greater than 11 db. The more optimum of the two phase and amplitude systems has only a 3-db advantage over the less optimum and is considerably more difficult to instrument.

* Received by the PGCS, June 6, 1960.
† School of Elec. Engrg., Purdue University, Lafayette, Ind.

Reprinted from IRE Transactions on Communications Systems, December 1960.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

INTRODUCTION

Recent papers have investigated the performance of digital phase-modulation systems and have suggested the possibility of a digital system which is both phase and amplitude modulated [1]-[3]. Fig. 1 shows a typical transmitted signal in such a system. During each pulse length of $T$ seconds the phase and amplitude of the transmitted signal assume values chosen from a discrete set of possible phases and amplitudes. Each combination of a particular amplitude level and phase position represents a transmitted symbol, and the totality of such combinations is the alphabet size. Since this transmitted signal is contaminated by noise in the channel, the received symbols cannot be detected with certainty.

Fig. 1-Digital phase and amplitude modulation.

Fig. 2 shows the decision levels at the receiver for a system with two possible amplitude positions $A_1$ and $A_2$ and four possible phase positions $\phi_1$-$\phi_4$. The phase and amplitude of an incoming pulse are detected, and the decision as to which symbol has been sent is made on the basis of which of the eight possible segments the resulting received phasor is in. In a system such as this, where there are the same number of phase positions available regardless of the amplitude level, the phase and amplitude channels are independent, and a transmitted symbol may be expressed as $A_i$-$\phi_j$, where $A_i$ and $\phi_j$ may be chosen independently from separate discrete sets $A$ and $\phi$. Such a system will be designated a type I system.

Fig. 2-Decision levels in a type I system.

It is possible to make a more efficient system by removing the restriction that the phase and amplitude channels be independent. Since the probability of an error in phase decreases with increasing amplitude levels, it is possible to increase the number of phase positions available with increasing amplitude levels, while maintaining a constant probability of phase error. Thus there are more symbols available for transmission for a given peak power. A typical system shown in Fig. 3 has three phase positions on the first amplitude level and nine on the second, for a total of twelve transmission symbols. This kind of system will be designated a type II system. While the type II system is theoretically superior, the type I system is much easier to instrument.

In the first section of this paper the structure of the type II system (i.e., the placing of amplitude and phase levels) is derived so as to make the probability of error equal for each transmitted symbol. Such a system has its maximum capacity when all symbols are equally likely to be transmitted. Subsequent sections of the paper compare this system and a digital phase system with respect to probability of error and with respect to channel capacity. The latter comparison is particularly useful, since by using channel capacity both systems can be compared with the theoretical upper limit as given by Shannon [4].

As pointed out by Cahn [1], two types of phase demodulation are possible: coherent detection using a synchronized local reference, and phase comparison of two successive received samples. Only the use of coherent detection is considered in this paper. Similar results could be obtained for noncoherent detection and would show some degradation in performance.
Cahn [2] has shown that in the case of digital phase systems this degradation approaches 3 db. Approximations have been derived by Cahn for the probability of error in detecting the phase of a signal in the presence of Gaussian noise [2], [3]. These expressions are asymptotic in the high signal-to-noise ratio region and are useful when dealing with high accuracy systems where the probability of error must be small. Since these expressions have been used here, it should be noted that all results apply only to the high signal-to-noise ratio region. In most of the results given here it has also been necessary to employ other approximations rather freely. While there is some justification for each of these approximations, none of the stated results are intended to be exact.

Fig. 3-Decision levels in a type II system ($L = 2$, $m = 12$).

MATHEMATICAL DESCRIPTION OF SYSTEM
We first consider a type I system. For large signal-to-noise ratios the probability of a phase error at the $k$th amplitude is given by the asymptotic approximation [1]-[3],

\[
P_{e\phi} = \frac{e^{-(A_k^2/2)\sin^2(\pi/n_k)}}{\sqrt{\pi}\,\dfrac{A_k}{\sqrt{2}}\sin\dfrac{\pi}{n_k}}, \tag{1}
\]

where $A_k$ is the normalized amplitude of the $k$th amplitude level based on a noise power of unity in the channel, and where $n_k$ is the number of phase positions at this level. (For a type I system $n_k$ is equal to $n_1$.) It should be noted that this expression is a function of the argument, i.e.,

\[
P_{e\phi} = f\Big(\frac{A_k}{\sqrt{2}}\sin\frac{\pi}{n_k}\Big). \tag{2}
\]

It can be assumed that the marginal amplitude distribution of the detected envelope is Gaussian for large signal-to-noise ratios. Using these distributions, Cahn has shown that for a type I system the placing of the amplitude levels is given by [3]

\[
A_k = A_1\Big[1 + 2(k - 1)\sin\frac{\pi}{n}\Big], \tag{3}
\]

where there are $n$ possible phase positions. When the amplitude levels are placed accordingly, the probability of an amplitude error at each amplitude level is equal to the probability of a phase error on the first amplitude level. The placing of $A_1$ is determined by the number of phase positions $n$ and the desired probability of error.

For a type II system the number of phase positions on the higher amplitude levels is increased so as to keep the probability of a phase error constant and equal to the probability of an amplitude error. Thus the channel is symmetrical with respect to probability of error in a received symbol. By (2), $A_k \sin \pi/n_k$ must be kept constant for a constant probability of error. Thus we must have

\[
\frac{A_1}{\sqrt{2}}\sin\frac{\pi}{n_1} = \frac{A_k}{\sqrt{2}}\sin\frac{\pi}{n_k}.
\]

Eq. (3), relating $A_k$ to $A_1$ for a type I system, may also be used for the type II system, since the amplitude levels are not changed. Using (3) for $A_k$,

\[
\sin\frac{\pi}{n_1} = \sin\frac{\pi}{n_k}\Big[1 + 2(k - 1)\sin\frac{\pi}{n_1}\Big]. \tag{4}
\]

Approximating the sine by its argument,

\[
n_k \approx n_1 + 2\pi(k - 1). \tag{5}
\]

Since a fractional phase position is impossible,

\[
n_k = n_1 + 6(k - 1). \tag{6}
\]

Since most systems cannot have more than four or five amplitude levels without making the spacing between phase positions too close to be of practical value, (6) usually gives the integer closest to the true value of $n_k$ as computed from (4). The alphabet size $m$ is obtained by summing the number of phase positions on all amplitude levels:

\[
m = \sum_{k=1}^{L} n_k. \tag{7}
\]

Using (6) for $n_k$,

\[
m = 3L(L - 1) + n_1 L. \tag{8}
\]

Thus the channel is determined by choice of the number of amplitude levels $L$, the number of phase positions on the first amplitude level $n_1$, and one additional factor which can be average or peak power, probability of error, or first amplitude level position $A_1$.

PROBABILITY OF ERROR VS SIGNAL-TO-NOISE RATIO

It is desired to find the minimum probability of error that can be attained for a given average power and alphabet size and the system parameters, $A_1$, $n_1$, and $L$, of the system which achieves this minimum error rate. The probability of error for a received symbol is the sum of the probability of a phase error and the probability of an amplitude error less the probability of both errors occurring. Neglecting the latter probability as small, the probability of error is just twice the probability of a phase error as given by (1). (All amplitude and phase errors are equally probable.) Thus,

\[
P_e = \frac{2\,e^{-(A_1^2/2)\sin^2(\pi/n_1)}}{\sqrt{\pi}\,\dfrac{A_1}{\sqrt{2}}\sin\dfrac{\pi}{n_1}}. \tag{9}
\]

Since this error is a monotonically decreasing function of $A_1 \sin \pi/n_1$, the error may be minimized by maximizing $A_1 \sin \pi/n_1$ subject to the average power and alphabet size constraints. Since all symbols are equally probable, the average power is given by

\[
P = \sum_{k=1}^{L}\frac{n_k}{m}\,\frac{A_k^2}{2}. \tag{10}
\]

Using (6) for $n_k$ and (3) for $A_k$, and performing the summation,

\[
P = \frac{A_1^2}{2m}\Big\{n_1 L + 3L(L - 1) + \big[2n_1 L(L - 1) + 4L(L - 1)(2L - 1)\big]\sin\frac{\pi}{n_1} + \big[\tfrac{2}{3}n_1 L(L - 1)(2L - 1) + 6L^2(L - 1)^2\big]\sin^2\frac{\pi}{n_1}\Big\}. \tag{11}
\]

It should be noted that because the amplitudes have been normalized on the basis of unity noise power, this average power $P$ is equivalent to the average signal-to-noise ratio. Now $A_1$ may be solved in terms of $P$, $m$, and $n_1$ by use of (11) and (8), and the function to be maximized, $A_1 \sin \pi/n_1$, may be expressed as a rather lengthy function of $n_1$, $P$, and $m$; i.e.,

\[
A_1\sin\frac{\pi}{n_1} = g(n_1, P, m), \qquad P,\ m = \text{given constraints}. \tag{12}
\]

Setting $\partial g/\partial n_1 = 0$ to find the maximum yields approximately

\[
n_1 = 3.
\]

Using this value for $n_1$ in (8) and (11) gives

\[
L = \sqrt{\frac{m}{3}}, \tag{13}
\]

\[
\frac{A_1^2}{2} = \frac{P}{\tfrac{2}{3}m - 1}. \tag{14}
\]

With these optimum values, the minimum probability of error (9) becomes

\[
P_{e,\min} = \frac{2\,e^{-P/(\frac{8}{9}m - \frac{4}{3})}}{\sqrt{\pi}\,\sqrt{\dfrac{P}{\frac{8}{9}m - \frac{4}{3}}}}. \tag{15}
\]

Cahn has drawn error curves for phase modulation on a normalized signal-to-noise ratio abscissa, $P\sin^2(\pi/m)$ [2], [3]. Fig. 4 shows (15) plotted together with Cahn's curve for phase modulation systems using this normalized coordinate. The signal-to-noise ratio improvement obtained
by the addition of amplitude modulation to phase-modulation systems is a function of the alphabet size. The improvement increases approximately 3 db every time the alphabet size is doubled. It is interesting to note that the minimization of peak power for a given alphabet size and error rate also leads to the result that $n_1 = 3$ regardless of the power or alphabet size.

Fig. 4-Probability of error vs signal-to-noise ratio. (Curves: phase and amplitude; phase only. Abscissa: normalized signal-to-noise ratio, db.)

CHANNEL CAPACITY

A. Phase Modulation Only

In order to provide a comparison with a type II phase and amplitude modulation system, the channel capacity will first be found for a digital phase-modulation system. In this system there is one amplitude level $A$, and $m$ phase positions giving an alphabet size of $m$ symbols. In general the channel capacity as defined by Shannon can be found by

\[
C = \max\,[H(y) - H_x(y)], \tag{16}
\]

where the maximum is with respect to all possible information sources [4]. In a symmetrical system such as digital phase modulation, the maximum is obtained by making all input alphabet symbols equally probable; the expression reduces to [4]

\[
C = \log m + \sum_{i=1}^{m} p_i \log p_i. \tag{17}
\]

The $p_i$ are the transitional probabilities from any one input symbol to each output symbol, and $m$ is the total number of input symbols. The probability of an error of more than one phase position is small, and it is assumed that the total probability of error is concentrated in the two neighboring phase positions with a transitional probability of

\[
\beta = \tfrac{1}{2}P_e = \frac{e^{-(A^2/2)\sin^2(\pi/m)}}{2\sqrt{\pi}\,\dfrac{A}{\sqrt{2}}\sin\dfrac{\pi}{m}} \tag{18}
\]

to each. (See Fig. 5.) The probability of correct transmission is $(1 - 2\beta)$ and the channel capacity is

\[
C = \log m + 2\beta\log\beta + (1 - 2\beta)\log(1 - 2\beta). \tag{19}
\]

Fig. 5-Transitional probabilities in a phase-only system.

For small $\beta$, this can be approximated by

\[
C \approx \log m + 2\beta(\log\beta - 1). \tag{20}
\]

Substituting (18) for $\beta$ gives $C$ as a function of $A$ and $m$. For a given average power $A$ is a constant, and the expression may be differentiated with respect to $m$ to give an optimum number of phase positions. Differentiation yields a transcendental equation of the form (21). This equation may be solved by trial and error to give (22) and (23). Using this value for $m$ in (20) gives an expression for channel capacity as a function of power (signal-to-noise ratio):¹

\[
C = \tfrac{1}{2}\log P + 0.814 \ \text{bits/symbol}. \tag{24}
\]

1 The use of (18) as an approximation for the probability of error causes the channel capacity maximum to occur at the knee of the true $C$ vs $m$ curve instead of at $m = \infty$. As $m$ becomes infinite $C = \tfrac{1}{2}\log P + 1.10$, but the probability of error approaches unity. Increasing $m$ beyond the value given in (23) results in very little gain in channel capacity and only serves to complicate the coding problem.
B. Phase and Amplitude Modulation

Fig. 6 shows a typical portion of a phase and amplitude scheme. It can be seen that the channel is no longer symmetrical and that appreciable transitional probabilities exist to more than the four neighboring states of any transmitted symbol. Approximations have been found to take into consideration these asymmetries in calculating channel capacities of sample systems, and numerous channel capacities have been calculated using these approximations. However, these capacities do not differ appreciably from capacities calculated assuming that the channel is symmetrical with equal probability of error in each of four directions.

Fig. 6-Transitional probabilities in a phase and amplitude system.

To solve for the optimum system parameters, $A_1$, $n_1$, and $L$, for a given average power from the point of view of channel capacity, (8) and (9) are used in an expression entirely similar to (20). A Lagrangian multiplier is used to add (11) as a constraint. The resulting equation is differentiated with respect to $A_1$, $n_1$, $L$, and the Lagrangian multiplier. Unfortunately, the resulting simultaneous transcendental equations cannot be solved in closed form. If, however, it is assumed that $n_1 = 3$ (a not unreasonable assumption in view of the earlier results on probability of error and peak power minimization), new equations which may be solved may be constructed by the same method. For this case these results are obtained:

\[
n_1 = 3 \ \text{(assumption)}, \tag{25}
\]

\[
L = \tfrac{1}{2}\sqrt{P + 2}. \tag{26}
\]

(The result of these two equations is that

\[
m = \tfrac{3}{4}P.) \tag{27}
\]

\[
A_1 = 2, \tag{28}
\]

\[
C = \log(P + 2) - 1.19 \ \text{bits/symbol}. \tag{29}
\]

For the general case where no value is assumed for $n_1$, a digital computer was programmed to find the maximum of $C$ over all integer combinations of $n_1$ and $L$ while changing $A_1$ to hold average power constant. The computed maxima of $C$ fall quite closely on the curve of (29) while the optimum values of $m$ and $L$ were those given by (27) and (26). The optimum value of $n_1$ is then found by using the integer values obtained for $m$ and $L$. Thus a better specification of optimum system parameters is

\[
m = \tfrac{3}{4}P \ \text{(nearest integer)}, \tag{30}
\]

\[
L = \tfrac{1}{2}\sqrt{P + 2} \ \text{(nearest integer)}, \tag{31}
\]

\[
n_1 = \frac{m}{L} - 3(L - 1). \tag{32}
\]

It is now of interest to compare the channel capacities of these two systems with each other and with Shannon's upper limit [4]:

\[
C = W\log(1 + S/N). \tag{33}
\]

In order to compare these capacities with Shannon's result it is necessary to find the bandwidth associated with a given symbol rate in the case of phase modulation and combined phase and amplitude modulation. Both cases have spectral densities with a $(\sin^2 x)/x^2$ envelope. The width between first zeros is $2/T$ cps where $T$ is the pulse length. Thus, the bandwidth is twice the number of symbols transmitted per second. The three channel capacities are plotted against signal-to-noise ratio for unit bandwidth in Fig. 7.²

Fig. 7-Channel capacity vs signal-to-noise ratio.

2 Although an additional improvement can be realized through bandwidth reduction by sampling, the results of Fig. 7 demonstrate the relative capacities of the two systems. For the sampled system the channel capacity scale for both the phase and amplitude system and the phase-only system in Fig. 7 would be doubled.
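The type II structure rules (3), (6), (7) and (10), together with the symbol error approximation (9), can be tabulated with a short script. This is a sketch under the paper's own approximations; the function names and return layout are assumptions made for illustration only.

```python
import math


def type2_structure(n1, levels, a1):
    """Phase counts, amplitudes, alphabet size and average power.

    n1: phase positions on the first amplitude level.
    levels: number of amplitude levels L.
    a1: first amplitude level A1 (normalized to unit noise power).
    """
    s = math.sin(math.pi / n1)
    # (6): six more phase positions on each successive level.
    phases = [n1 + 6 * (k - 1) for k in range(1, levels + 1)]
    # (3): amplitude levels spaced for equal amplitude-error probability.
    amps = [a1 * (1.0 + 2.0 * (k - 1) * s) for k in range(1, levels + 1)]
    m = sum(phases)  # (7)
    # (10): average power with all m symbols equally probable.
    avg_power = sum(n * a * a / 2.0 for n, a in zip(phases, amps)) / m
    return phases, amps, m, avg_power


def symbol_error_probability(n1, a1):
    """Approximation (9): twice the phase-error probability of (1)."""
    arg = (a1 / math.sqrt(2.0)) * math.sin(math.pi / n1)
    return 2.0 * math.exp(-arg * arg) / (math.sqrt(math.pi) * arg)
```

Calling `type2_structure(3, 2, a1)` reproduces the arrangement of Fig. 3, with three phase positions on the first level and nine on the second for twelve symbols, and the returned alphabet size agrees with the closed form (8) for any `n1` and `levels`.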
CONCLUSION

The structure, probability of error, and channel capacity of type II phase- and amplitude-modulated communication systems have been presented. In this type of system the phase and amplitude channels are dependent and cannot transmit separate information sources. A system in which the phase and amplitude channels are independent has been referred to as a type I system. Obviously the type II system is more difficult to instrument than the type I system, and both systems are in turn more complex than the phase-modulation-only system. The progressive complexity of the three systems must be balanced against the resulting savings in power.

The probability of error curves (Fig. 4) show a 2.5-db advantage to the type II phase and amplitude system over the phase-only system for an alphabet size of 16 symbols (4 bits per symbol). This advantage increases to about 5.5 db for an alphabet size of 32 symbols (5 bits per symbol) and nearly 3 db for each bit increase thereafter.

The channel capacity curves (Fig. 7) show that the phase and amplitude system makes much more efficient use of the channel for high signal-to-noise ratios. This is because it is more efficient to use larger alphabet sizes at these powers and carry more bits of information per sample; this constitutes the power advantage for the phase and amplitude system. The channel capacity curves cross at a signal-to-noise ratio of about 11 db. Below this power there is no advantage to the phase and amplitude system over the phase-only system. At this crossover point of 11 db the optimum alphabet size for the two systems as given by (23) and (30) is about 8 symbols each.

The type II phase and amplitude system can be easily compared with the type I system on the basis of peak power required for a given probability of error and alphabet size. This comparison shows approximately a 1-db saving for an alphabet size of 8 symbols and a 2-db saving for an alphabet size of 16 symbols. The saving approaches 3 db asymptotically. Thus while both type I and type II systems increase their power advantage over the phase-only system for larger alphabet sizes, there is about a constant 3-db saving between the two.

In summary, combined phase- and amplitude-modulation systems are advantageous whenever an information capacity of more than 3 bits per symbol is used. For a signal-to-noise ratio of greater than 11 db, it is efficient from the point of view of channel capacity to use such information rates. If the channel bandwidth is not restricted, a phase-only system of nearly the same efficiency can be made by increasing the bandwidth and sending more symbols per second with each symbol carrying 3 bits or less of information.

BIBLIOGRAPHY

[1] C. R. Cahn, "Performance of digital phase-modulation communication systems," IRE TRANS. ON COMMUNICATIONS SYSTEMS, vol. CS-7, pp. 3-6; May, 1959.
[2] C. R. Cahn, "Performance of Digital Phase-Modulation Communications Systems," Ramo-Wooldridge Corp., Los Angeles, Calif., Tech. Rept. No. M110-9U5; April, 1959.
[3] C. R. Cahn, "Combined digital phase- and amplitude-modulation communication systems," IRE TRANS. ON COMMUNICATIONS SYSTEMS, vol. CS-8, pp. 150-154; September, 1960.
[4] C. E. Shannon, "The mathematical theory of communication," Bell Sys. Tech. J., vol. 27, pp. 379-423, 623-656; July-October, 1948.
SYNCHRONOUS COMMUHICATIOHS J. P. Costas
General Electric Comp~ Syracuse, H.Y.
SUD8!7
singled out as the logical replaceJl8Dt tor
It can be shown that present Usage ot amplitude modulation does not permit the in)lerent capabilities of the modulat~on process 'to be re8l1zed. In order to achieve the ultimate performance ot which AM 18 capable S)'DChroDOUB or coherent detection techniques must be used at the receiver and carrier suppression Dl\18t be employed at the
transmitter•
When a pert01'll8l1Ce comparison 18 made between a qncbronous AM 8~.. tem and a single-sideband qstem it 18 shawn that ID8IJY ot the advantages norma1.l7 attributed to single-sidebalJd DO longer exist. SSB has no power advantage over the synchronous AM (DSB) qatem and SSB is shown to be more susceptible to j.-dDg. The performance ot the two systems' with regard to lIlultipath or selective fading conditions 18 also discussed. The DSB system shows a decided advantage OYer SSB with regard to tVstem complexl.t7J espec1,ally at the transmitter. The band.v1cith saving ot SSB Oftr DSB 18 considered and 1 t 18 shown that factors other than s1gnal ~dtb IlUSt be considered. The number of usable channels 18 not nece88ar1l7 doubled the SSB and in ~
br
useot
prac'tical situations no increase in the nuaber of usable channels result8 trom the use of SSB. The transmitting and receiving equipnent
which has been developed under Air Force SPODsorship 18 discussed. The receiving system design involves a local oscillator phase-control &Jatem
vb1cb derives carrier phase information trom tbe sidebands alone and does Dot require the use of
a pilot carrier or syncbron1z1ng tone. The avoidance of superheterodyne techniques 1rl this receiver ia explained and the versat1l1 ty ot· 8uch a receinag S7stem with regard to the reception at
many different types
ot signals 1s pointed out.
System 1iest results to date are presented and
discussed.
Introduction

For a good many years a very large percentage of all military and commercial communications systems have employed amplitude modulation for the transmission of information. In spite of certain well-known shortcomings of conventional AM its use has been continued mainly due to the simplicity of this system as compared to other modulation methods which have been proposed. During the last few years, however, it has been felt by many responsible engineers that the increased demands being made on communications facilities could not be met by the use of conventional AM and that new modulation techniques would have to be employed in spite of the additional system complexity. Of these new techniques single-sideband has been singled out as the logical replacement for conventional AM and a great deal of publicity and financial support has been given SSB as a consequence.

Many technical reasons have been given to support the claim that SSB is better than AM and these points will be discussed in some detail later in this paper. In addition, experiments have been performed which also indicate a superiority for SSB over AM. Some care must be taken, however, in drawing conclusions from the above statements. We cannot conclude that SSB is superior to AM because we have no assurance whatever that conventional AM systems make efficient use of the modulation process employed. In other words, AM as a modulation process may be capable of far better performance than that which is obtained in conventional AM systems. If an analysis is made of AM and SSB systems it will be found that existing SSB systems are very nearly optimum with respect to the modulation process employed whereas conventional AM systems fall far short of realizing the full potential of the modulation process employed. In fact it could honestly be said that we have been misusing rather than using AM in the past.

Realization of the above situation raises some immediate questions: What are the equipment requirements of the optimum AM system? How does the performance of the optimum AM system compare with that of SSB? Which shows the greater promise of fulfilling future military and commercial communications requirements, optimum AM or SSB? The remainder of this paper will be devoted mainly to answering these questions.
Synchronous Communications - The Optimum AM System
Receiver

Conventional AM systems fail to obtain the full benefits of the modulation process for two main reasons: inefficient use of generated power at the transmitter and inefficient detection methods at the receiver. Starting with the receiver it can be shown that if maximum receiver performance is to be obtained the detection process must involve the use of a phase-locked oscillator and a synchronous or coherent detector. The basic synchronous receiver is shown in Figure 1. The incoming signal is mixed or multiplied with the coherent local oscillator signal in the detector and the demodulated audio output is thereby directly produced. The audio signal is then filtered and amplified. The local oscillator must be maintained at proper phase so that the audio output contributions of the upper and lower sidebands reinforce one another. If the oscillator phase is 90 degrees away from the optimum value a null in audio output will result which is typical of detectors of this type. The actual method of phase control will be explained shortly but for the purpose of this discussion maintenance of correct oscillator phase shall be assumed.

Reprinted from IRE Transactions on Communications Systems, March 1957.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

Fig. 1 - Basic synchronous receiver.

Fig. 2 - Two-phase synchronous receiver.

In spite of the simplicity of this type of receiver there are several important advantages worthy of note. To begin with no IF system is employed which eliminates completely the problem of image responses. The opportunity to use effectively post-detector filtering allows extreme selectivity to be obtained without difficulty. The selectivity curve of such a receiver will be found to be the low-pass filter characteristic mirror-imaged about the operating frequency. Not only is a high order of selectivity obtained in this manner but the selectivity of the receiver may be easily changed by low-pass filter switching. The carrier component of the AM signal is not in any way involved in the demodulation process and need not be transmitted when using such a receiver.
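The behaviour of the basic synchronous receiver can be sketched numerically. The following is a minimal illustration, not the circuit of Figure 1: the sample rate, carrier frequency, test tone, and the ideal FFT low-pass filter are all assumptions chosen so that every tone falls exactly on an FFT bin.

```python
import numpy as np

FS = 100_000.0   # sample rate in Hz (illustrative assumption)
FC = 10_000.0    # carrier frequency in Hz (illustrative assumption)
N = 2_000        # 20 ms of samples, so every tone used here sits on an FFT bin
t = np.arange(N) / FS


def lowpass(x, cutoff=3_000.0):
    """Idealized post-detector filter: zero every FFT bin above the cutoff."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / FS)
    spectrum[freqs > cutoff] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def synchronous_detect(signal, lo_phase):
    """Multiply by the local oscillator, then low-pass down to audio."""
    local_oscillator = np.cos(2 * np.pi * FC * t + lo_phase)
    return lowpass(signal * local_oscillator)


audio = np.cos(2 * np.pi * 500.0 * t)           # 500 Hz test tone
signal = audio * np.cos(2 * np.pi * FC * t)     # both sidebands, no carrier

# Peak audio output for three local oscillator phase errors.
peak_in_phase = np.max(np.abs(synchronous_detect(signal, 0.0)))
peak_60_degrees = np.max(np.abs(synchronous_detect(signal, np.pi / 3)))
peak_quadrature = np.max(np.abs(synchronous_detect(signal, np.pi / 2)))
```

Under these assumptions the detected audio is the modulation scaled by one half of the cosine of the phase error, so the three peaks come out near 0.5, 0.25 and zero. The last case is the 90 degree null mentioned above, and no carrier term was needed in the transmitted signal.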
Furthermore, detection may be accomplished at very low level and consequently the bulk of total receiver gain may be at audio frequencies. This permits an obvious application of transistors but more important it allows the selectivity determining low-pass filter to be inserted at a low-level point in the receiver which aids immeasurably in protecting against spurious responses from very strong undesired signals.

Phase Control

To obtain a practical synchronous receiving system some additions to the basic receiver of Figure 1 are required. A more complete synchronous receiver is shown in Figure 2. The first thing to be noted about this diagram is that we have essentially two basic receivers with the same input signal but with local oscillator signals in phase quadrature to each other. To understand the operation of the phase control circuit consider that the local oscillator signal is of the same phase as the carrier component of the incoming AM signal. Under these conditions the in-phase or I audio amplifier output will contain the demodulated audio signal while the quadrature or Q audio amplifier will have no output due to the quadrature null effect of the Q synchronous detector. If now the local oscillator phase drifts from its proper value by a few degrees the I audio will remain essentially unaffected but there will now appear some audio output from the Q channel. This Q channel audio will have the same polarity as the I channel audio for one direction of local oscillator phase drift and opposite polarity for the opposite direction of local oscillator phase drift. The Q audio level is proportional to the magnitude of the local oscillator phase angle error for small errors. Thus by simply combining the I and Q audio signals in the audio phase discriminator a D.C. control signal is obtained which automatically corrects for local oscillator phase errors. It should be noted that phase control information is derived entirely from the sideband components of the AM signal and that the carrier if present is not used in any way. Thus since both synchronization and demodulation are accomplished in complete independence of carrier, suppressed-carrier transmissions may be employed.
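The phase-control action just described can be imitated in a few lines. This is a hedged sketch under stated assumptions, not the discriminator of Figure 2: the I times Q product is averaged over a 20 ms block to stand in for the D.C. control signal, the oscillator phase is stepped once per block, and the loop gain, frequencies, and ideal filters are illustrative choices.

```python
import numpy as np

FS = 100_000.0   # sample rate in Hz (illustrative assumption)
FC = 10_000.0    # carrier frequency in Hz (illustrative assumption)
N = 2_000        # 20 ms block; all tones below fall on FFT bins
t = np.arange(N) / FS


def lowpass(x, cutoff=3_000.0):
    """Idealized audio filter: zero every FFT bin above the cutoff."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / FS)
    spectrum[freqs > cutoff] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def iq_audio(signal, lo_phase):
    """Two synchronous detectors fed by quadrature local oscillator signals."""
    i_audio = lowpass(signal * np.cos(2 * np.pi * FC * t + lo_phase))
    q_audio = lowpass(signal * np.sin(2 * np.pi * FC * t + lo_phase))
    return i_audio, q_audio


def track_phase(signal, lo_phase, gain=4.0, steps=200):
    """Step the oscillator phase using the D.C. term of the I*Q product."""
    for _ in range(steps):
        i_audio, q_audio = iq_audio(signal, lo_phase)
        control = np.mean(i_audio * q_audio)   # sign follows the phase error
        lo_phase -= gain * control
    return lo_phase


message = np.cos(2 * np.pi * 400.0 * t) + 0.5 * np.cos(2 * np.pi * 1100.0 * t)
carrier_phase = 0.7                                            # unknown to the receiver
signal = message * np.cos(2 * np.pi * FC * t + carrier_phase)  # no carrier term

locked_phase = track_phase(signal, lo_phase=0.0)
final_error = carrier_phase - locked_phase
i_locked, q_locked = iq_audio(signal, locked_phase)
```

With these choices the averaged product is proportional to the sine of twice the phase error, so the oscillator is pulled toward the carrier phase even though no carrier is transmitted. At lock the Q channel is nulled and the I channel carries the audio.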
It is unfortunate that many engineers tend to avoid phase-locked systems. It is true that a certain amount of stability is a prerequisite but it has been determined by experiment that for this application the stability requirements of single-sideband voice are more than adequate. Once a certain degree of stability is obtained the step to phase-lock is a simple one. It is interesting to note that this phase-control system can be modified quite readily to correct for large frequency errors when receiving AM due to doppler shift in air-to-air or ground-to-air links.

It is apparent that phase control ceases with modulation and that phase lock will have to be reestablished with the reappearance of modulation. This has not proved to be a serious problem since lock-up normally occurs so rapidly that no perceptible distortion results when receiving voice transmissions. It should be further noted that such a phase control system is inherently immune to carrier capture or jamming. In addition it has been found that due to the narrow noise bandwidth of the phase-control loop, synchronization is maintained at noise levels which render the channel useless for voice communications.

Interference Suppression

The post-detector filters provide the sharp selectivity which of course contributes significantly to interference suppression. However, these filters cannot protect against interfering signal components which fall within the pass-band of the receiver. Such interference can be reduced and sometimes eliminated by proper combination of the I and Q channel audio signals. To understand this process consider that the receiver is properly locked to a desired AM signal and that an undesired signal appears, some of whose components fall within the receiver pass-band. Under these conditions the I channel will contain the desired audio signal plus an undesired component due to the interference. The Q channel will contain only an interference component also arising from the presence of the interfering signal. In general the interference component in the I channel and the interference component in the Q channel are related to one another or they may be said to be correlated. Advantage may be taken of this correlation by treating the I and Q voltages with the I and Q networks and adding these network outputs. If properly done this process will reduce and sometimes eliminate the interfering signal from the receiver output as a result of destructive addition of the I and Q interference voltages.

The design of these networks is determined by the spectrum of the interfering signal and the details of network design may be found in the literature [1]. Although such details cannot be given here it is interesting to consider one special interference case. If the interfering signal spectrum is confined entirely to one side of the desired signal carrier frequency the optimum I and Q networks become the familiar 90 degree phasing networks common in single-sideband work. Such operation does not however result in single-sideband reception of the desired signal since both desired signal sidebands contribute to receiver output at all times. This can be seen by noting that the Q channel contains no desired signal component so that network treatment and addition affects only the undesired audio signal components. The phasing networks are optimum only for the interference condition assumed above. If there is an overlap of the carrier frequency by the undesired signal spectrum the phasing networks are no longer optimum and a different network design is required for the greatest interference suppression.

This two-phase method of AM signal reception can aid materially in reducing interference. As a matter of fact it can be shown that the true anti-jam characteristics of AM cannot be realized unless a receiving system of the type discussed above is used. If we now compare the anti-jam characteristics of single-sideband and suppressed-carrier AM properly received it will be found that intelligent jamming of each type of signal will result in a two-to-one power advantage for AM. The bandwidth reduction obtained with single-sideband does not come without penalty. One of the penalties as we see here is that single-sideband is more easily jammed than double-sideband.
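The special case of an interfering signal lying entirely on one side of the carrier can be checked with a short sketch. The assumptions are mine, not the paper's network design: the receiver is taken as perfectly locked, the interferer is a single tone 700 Hz above the carrier, and the pair of phasing networks is replaced by an ideal 90 degree shift applied to the Q channel alone.

```python
import numpy as np

FS = 100_000.0   # sample rate in Hz (illustrative assumption)
FC = 10_000.0    # carrier frequency in Hz (illustrative assumption)
N = 2_000        # 20 ms block; all tones below fall on FFT bins
t = np.arange(N) / FS


def lowpass(x, cutoff=3_000.0):
    """Idealized audio filter: zero every FFT bin above the cutoff."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / FS)
    spectrum[freqs > cutoff] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def shift_90(x):
    """Ideal 90 degree lag at every audio frequency (a Hilbert transform)."""
    spectrum = np.fft.rfft(x)
    spectrum[0] = 0.0
    spectrum[1:] *= -1j
    return np.fft.irfft(spectrum, n=len(x))


def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))


message = np.cos(2 * np.pi * 400.0 * t) + 0.5 * np.cos(2 * np.pi * 1100.0 * t)
desired = message * np.cos(2 * np.pi * FC * t)                   # both sidebands
interferer = 2.0 * np.cos(2 * np.pi * (FC + 700.0) * t + 0.3)    # above carrier only
received = desired + interferer

i_audio = lowpass(received * np.cos(2 * np.pi * FC * t))   # desired + interference
q_audio = lowpass(received * np.sin(2 * np.pi * FC * t))   # interference only

combined = i_audio - shift_90(q_audio)   # the sign depends on the interferer's side

residual_before = rms(i_audio - 0.5 * message)
residual_after = rms(combined - 0.5 * message)
```

In this idealized setting the interference in the I channel has an RMS value of about 0.71 before combination and is cancelled to numerical precision afterwards, while the desired audio is untouched because the Q channel contains none of it. An interferer below the carrier would call for the opposite sign in the combination.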
Transmitter

The synchronous receiver described above is capable of receiving suppressed-carrier AM transmissions. If a carrier is present as in standard AM this will cause no trouble but the receiver obviously makes no use whatever of the carrier component. The opportunity to employ carrier-suppressed AM transmissions can be used to good advantage in transmitter design. There are many ways in which to generate carrier-suppressed AM signals and one of the more successful methods is shown in Figure 3. A pair of class-C beam power amplifiers are screen-modulated by a push-pull audio signal and are driven in push-pull from an R.F. exciter. The screens are returned to ground or to some negative bias value by means of the driver transformer center-tap. Thus in the absence of modulation no R.F. output results and during modulation the tubes conduct alternately with audio polarity change. The circuit is extremely simple and a given pair of tubes used in such a transmitter can easily match the average R.F. power output of the same pair of tubes used in SSB linear amplifier service. The circuit is self-neutralizing and the tune-up procedure is very much the same as in any other class-C R.F. power amplifier. The excitation requirements are modest and as an example the order of eight watts of audio are required to produce a sideband power output equivalent to a standard AM carrier output of one kilowatt. Modulation linearity is good and the circuit is amenable to various feedback techniques for obtaining very low distortion which may be required for multiplex transmissions.

This transmitter circuit is by no means new. The information is presented here to indicate the equipment simplicity which can be realized by use of synchronous AM communications.
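The carrier suppression obtained from the push-pull arrangement can be seen in a highly idealized model. The sketch below assumes perfectly linear screen modulation, plates combined in parallel, and a tank circuit that removes all class-C harmonics; none of these details is taken from Figure 3, and the frequencies are illustrative.

```python
import numpy as np

FS = 100_000.0   # sample rate in Hz (illustrative assumption)
FC = 10_000.0    # R.F. drive frequency in Hz (illustrative assumption)
N = 2_000        # 20 ms block; bin spacing FS / N = 50 Hz
t = np.arange(N) / FS

audio = np.cos(2 * np.pi * 500.0 * t)   # push-pull audio on the two screens
rf = np.cos(2 * np.pi * FC * t)         # push-pull R.F. drive on the two grids

# Each tube conducts only while its own screen is driven positive.
tube_a = np.maximum(audio, 0.0) * rf        # screen sees +audio, grid sees +rf
tube_b = np.maximum(-audio, 0.0) * (-rf)    # screen sees -audio, grid sees -rf
output = tube_a + tube_b


def level_at(freq_hz):
    """Amplitude of the output component at an exact FFT bin frequency."""
    spectrum = np.abs(np.fft.rfft(output)) * 2.0 / N
    return float(spectrum[int(round(freq_hz * N / FS))])


carrier_level = level_at(FC)
lower_sideband = level_at(FC - 500.0)
upper_sideband = level_at(FC + 500.0)
```

In this model the two tube contributions add up to the audio multiplied by the R.F. drive, so the carrier component vanishes and each sideband has half the audio amplitude. With the audio set to zero neither tube conducts, which corresponds to the absence of R.F. output without modulation.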
Prototype Equipment

A synchronous receiver covering the frequency range of 2-32 mc. is shown in Figure 4. The theory of operation of this receiver is essentially that of the two-phase synchronous receiver discussed earlier. This is a direct conversion receiver and the superheterodyne principle is not used. A rather unusual frequency synthesis system is employed to give high stability with very low spurious response. Only one crystal is used and this is a 100 kc. oven-controlled unit. This receiver will demodulate standard AM, suppressed-carrier AM, single-sideband, narrow-band FM, phase modulation, and CW signals in an optimum manner. This versatility is a natural by-product of the synchronous detection system and no great effort is required to obtain this performance.

Figure 5 shows a suppressed-carrier AM transmitter using a pair of 6146 tubes in the final. This unit is capable of 150 watts peak sideband power output for continuous sine-wave modulation. The modulator is a single 12BH7 miniature double triode. Figure 6 shows a transmitter capable of one-thousand watts peak sideband power output under continuous sine wave audio conditions. The final tubes are 4-250-A's and the modulator uses a pair of 6146's. Both of these transmitters are continuously tuneable over 2-30 mc.

Fig. 3 - Suppressed-carrier AM transmitter.

Fig. 4 - The AN/FRR-48 (XW-1) synchronous receiver.

Fig. 5 - The Alf,TR'r-h9 (XW-1) suppressed-carrier AM transmitter.

Fig. 6 - The AN/FRT-30 (XW-1) suppressed-carrier AM transmitter.

A Comparison of Synchronous AM and Single-Sideband

It is interesting at this point to compare the relative advantages and disadvantages of synchronous AM and single-sideband systems. Although single-sideband has a clear advantage over conventional AM this picture is radically changed when synchronous AM is considered.

Signal-To-Noise Ratio

If equal average powers are assumed for SSB and synchronous AM it can easily be shown that identical S/N ratios will result at the receiver. The additional noise involved from the reception of two sidebands is exactly compensated for by the coherent addition of these sidebands. The 9 db advantage often quoted for SSB is based on a full AM carrier and a peak power comparison. Since we have eliminated the carrier and since a given pair of tubes will give the same average power in suppressed-carrier AM or SSB service there is actually no advantage either way. If intelligent jamming rather than noise is considered there exists a clear advantage of two-to-one in average power in favor of synchronous AM.

System Complexity
Since the receiver described is also capable of SSB reception it would appear that synchronous AM and SSB systems involve roughly the same receiver complexity. This is not altogether true since much tighter design specifications must be imposed if high quality SSB reception is to be obtained. If AM reception only is considered these specifications may be relaxed considerably without materially affecting performance. The synchronous receiver described earlier may possess important advantages over conventional superheterodyne receivers but this point is not an issue here.

The suppressed-carrier AM transmitter is actually simpler than a conventional AM transmitter. It is of course far simpler than any SSB transmitter. There are no linear amplifiers, filters, phasing networks, or frequency translators involved. Personnel capable of operating or maintaining standard AM equipment will have no difficulty in adapting to suppressed-carrier AM.

The military and commercial significance of this situation is rather obvious and further discussion of this point is not warranted.

Long-Range Communications

The selective fading and multipath conditions encountered in long-range circuits tend to vary the amplitude and phase of one sideband component relative to the other. This would perhaps tend to indicate an advantage for SSB but tests to date do not confirm this. Synchronous AM reception of standard AM signals over long paths has been consistently as good as SSB reception of the same signal. In some cases it was noted that the SSB receiver output contained a serious flutter which was only slightly discernible in the synchronous receiver output. Some attempt has been made to explain these results but as yet no complete explanation is available. One interesting fact about the synchronous receiver is that the local oscillator phase changes as the sidebands are modified by the medium since phase control is derived directly from the sidebands. In a study of special cases of signal distortion it was found that the oscillator orients itself in phase in such a way as to attempt to compensate for the distortion caused by the medium. This may partially explain the good results which have been obtained. Perhaps another point of view would be that the synchronous receiver is taking advantage of the inherent diversity feature provided by the two AM sidebands.

Test results to date indicate that synchronous AM and single-sideband provide much the same performance for long-range communications. The AM system has been found on occasion to be better but since extensive tests have not been performed and since a complete explanation of these results is not yet available it would be unfair to claim any advantage at this time for AM.

Spectrum Utilization
In theory single-sideband transmissions require only half the bandwidth of equivalent AM transmissions and this fact has led to the popular belief that conversion to single-sideband will result in an increase in usable channels by a factor of two. If a complete conversion to single-sideband were made those who believe that twice the number of usable channels would be available might be in for a rather rude awakening. There are many factors which determine frequency allocation besides modulation bandwidth. Under many conditions it actually turns out that modulation bandwidth is not a consideration. This is a complicated problem and only a few of the more pertinent points can be discussed briefly here.
To begin with the elimination of one sideband is a complicated and delicate business. Any one of several misadjustments of the SSB transmitter will result in an empty sideband which is not actually empty. We are not thinking here of a telephone company point-to-point system staffed by career personnel but rather we have in mind the majority of military and commercial field installations. This is in no way meant to be a criticism but the technical personnel problem faced by the military especially in time of war is a serious one and this simple fact of life cannot be ignored in future system planning. Thus we must concede that single-sideband transmissions will in practice not always be confined to one sideband and that those who allocate frequencies must take this into consideration. There may be those who would argue that SSB transmitting equipment can be designed for simple operation. This is probably true but in general operational simplicity can only be obtained at the expense of additional complexity in manufacture and maintenance.

This of course trades one set of problems for another but if we assume ideal SSB transmission we are still faced with an even more serious allocation problem. We refer here to the problem of receiver non-linearity which becomes a dominant factor when trying to receive a weak signal in the presence of one or more near-frequency strong signals. Under such conditions the single-signal selectivity curves often shown by manufacturers are next to meaningless. This strong undesired-weak desired signal situation often arises in practice especially in the military where close physical spacing of equipment is mandatory as in the case of ships or aircraft and where signal environment changes due to changing locations of these vehicles. Because of this situation allocations to some extent must be made practically independent of modulation bandwidth and the theoretical spectrum conservation of single-sideband cannot always be advantageously used.

The problem of receiver non-linearity is especially serious in multiple conversion superheterodyne receivers for obvious reasons. This was the dominant factor in choosing a direct conversion scheme in the synchronous receiver described earlier. Although this approach has given good results and continued refinement has indicated that significant advances over prior art can be obtained, it cannot be said however that the receiver problem is solved. This problem will probably remain a serious one until new materials and components are made available. This is a relatively slow process and it is not at all absurd to consider that by the time this problem is eliminated new modulation processes will have appeared which will eclipse both of those now being considered.

In short the spectrum economies of SSB which exist in theory cannot always be realized in practice as there exist important military and commercial communications situations in which no increase in usable channels will result from the adoption of single-sideband.
Jamming

The reduction of transmission bandwidth afforded by single-sideband must be paid for in one form or another. A system has yet to be proposed which offers nothing but advantages. One of the prices paid for this reduction in bandwidth is greater susceptibility to jamming as was previously mentioned. There is an understandable tendency at times to ignore jamming since the systems with which we are usually concerned provide us with ample worries without any outside aid. Jamming of course cannot be ignored and from a military point of view this raises a very serious question. If we concede for the moment that by proper frequency allocation single-sideband offers a normal channel capacity advantage over AM, what will happen to this advantage when we have the greatest need for communications? It is almost a certainty that at the time of greatest need jamming will have to be reckoned with. Under these conditions any channel capacity advantage of SSB could easily vanish. A definite statement to this effect cannot be made of course without additional study but this is a factor well worth considering.

Concluding Remarks
There is an undeniable need for improved communications and to date it appears that single-sideband has been almost exclusively considered to supplant conventional AM. It has been the main purpose of this paper to point out that the improved performance needed can be obtained in another way. The synchronous AM system can compete more than favorably with single-sideband when all factors are taken into account.

Acknowledgement

Much of the work reported here was sponsored by the Rome Air Development Center of the Air Research and Development Command under Air Force contract AF 30(602)-584. The author wishes to acknowledge the support, cooperation, and encouragement which has been extended by the personnel of the Rome Air Development Center.

Bibliography

1. J. P. Costas, "Interference Filtering", Technical Report No. 185, Research Laboratory of Electronics, M.I.T.
NETWORKING
On the Self-Similar Nature of Ethernet Traffic (Extended Version)

Will E. Leland, Member, IEEE, Murad S. Taqqu, Member, IEEE, Walter Willinger, and Daniel V. Wilson, Member, IEEE

Abstract-We demonstrate that Ethernet LAN traffic is statistically self-similar, that none of the commonly used traffic models is able to capture this fractal-like behavior, that such behavior has serious implications for the design, control, and analysis of high-speed, cell-based networks, and that aggregating streams of such traffic typically intensifies the self-similarity ("burstiness") instead of smoothing it. Our conclusions are supported by a rigorous statistical analysis of hundreds of millions of high quality Ethernet traffic measurements collected between 1989 and 1992, coupled with a discussion of the underlying mathematical and statistical properties of self-similarity and their relationship with actual network behavior. We also present traffic models based on self-similar stochastic processes that provide simple, accurate, and realistic descriptions of traffic scenarios expected during B-ISDN deployment.

I. INTRODUCTION

IN THIS PAPER¹, we use the LAN traffic data collected by Leland and Wilson [14] who were able to record hundreds of millions of Ethernet packets without loss (irrespective of the traffic load) and with recorded time-stamps accurate to within 100 μs. The data were collected between August 1989 and February 1992 on several Ethernet LAN's at the Bellcore Morristown Research and Engineering Center. Leland and Wilson [14] present a preliminary statistical analysis of this unique high-quality data and comment in detail on the presence of "burstiness" across an extremely wide range of time scales: traffic "spikes" ride on longer-term "ripples," that in turn ride on still longer term "swells," etc. This self-similar or fractal-like behavior of aggregate Ethernet LAN traffic is very different both from conventional telephone traffic and from currently considered formal models for packet traffic (e.g., pure Poisson or Poisson-related models such as Poisson-batch or Markov-Modulated Poisson processes (see [11]), packet train models (see [13]), fluid flow models (see [1]), etc.) and requires a new look at modeling traffic and performance of broadband networks.

The main objective of this paper is to establish in a statistically rigorous manner the self-similarity characteristic of the very high quality, high time-resolution Ethernet LAN traffic measurements presented in [14]. Moreover, we illustrate some of the most striking differences between self-similar models and the standard models for packet traffic currently considered in the literature. For example, our analysis of the Ethernet data shows that the generally accepted argument for the "Poisson-like" nature of aggregate traffic, namely, that aggregate traffic becomes smoother (less bursty) as the number of traffic sources increases, has very little to do with reality. In fact, using the degree of self-similarity (which typically depends on the utilization level of the Ethernet and can be defined via the Hurst parameter) as a measure of "burstiness," we show that the burstiness of LAN traffic typically intensifies as the number of active traffic sources increases, contrary to commonly held views.

The term "self-similar" was coined by Mandelbrot. He and his co-workers (e.g., see [21]-[23]) brought self-similar processes to the attention of statisticians, mainly through applications in such areas as hydrology and geophysics. For further applications and references on the probability theory of self-similar processes, see the extensive bibliography in [27]. For an early application of the self-similarity concept to communications systems, see the seminal paper by Mandelbrot [18].

The paper is organized as follows. In Section II, we describe the available Ethernet traffic measurements and comment on the changes of the Ethernet population, applications, and environment during the measurement period from August 1989 to February 1992. In Section III, we give the mathematical definition of self-similarity, identify classes of stochastic models which are capable of accurately describing the self-similar behavior of the traffic measurements at hand, and illustrate statistical methods for analyzing self-similar data sets. Section IV describes our statistical analysis of the Ethernet data, with emphasis on testing for self-similarity. Finally, in Section V we discuss the significance of self-similarity for traffic engineering, and for operation, design, and control of B-ISDN environments.

Manuscript received July 1, 1993; revised January 15, 1994; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor Jonathan Smith. This work was supported in part by Boston University, under ONR Grant N00014-90-J-1287.
W. E. Leland, W. Willinger, and D. V. Wilson are with Bellcore, Morristown, NJ 07962-1910 (email: [email protected]@bellcore.com. [email protected]).
M. S. Taqqu is with the Dept. of Mathematics, Boston University, Boston, MA 02215-2411 (email: [email protected]).
IEEE Log Number 9300098.
¹ An abbreviated version of this paper appeared in [15].
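The "smoothing under aggregation" argument questioned in the Introduction can be made concrete with synthetic Poisson counts. The sketch below is illustrative only and uses no measured Ethernet data: it assumes the standard variance-time relation, in which the variance of block averages over m samples decays like m^(2H-2), and the function names, arrival rate, and block sizes are hypothetical choices.

```python
import numpy as np


def aggregated_variance(counts, block_sizes):
    """Variance of the block-averaged series for each block size m."""
    variances = []
    for m in block_sizes:
        n_blocks = len(counts) // m
        block_means = counts[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        variances.append(block_means.var())
    return np.array(variances)


def hurst_from_variance(block_sizes, variances):
    """Fit log-variance against log-m; the slope equals 2H - 2 (assumed relation)."""
    slope = np.polyfit(np.log10(block_sizes), np.log10(variances), 1)[0]
    return 1.0 + slope / 2.0


rng = np.random.default_rng(0)
poisson_counts = rng.poisson(lam=20.0, size=1_000_000)   # packets per time unit

sizes = np.array([1, 10, 100, 1000])
variances = aggregated_variance(poisson_counts, sizes)
h_poisson = hurst_from_variance(sizes, variances)
```

For independent Poisson counts the variance of the block averages falls roughly in proportion to 1/m, which corresponds to a Hurst parameter near 0.5. A self-similar trace would show a visibly slower decay and a fitted value between 0.5 and 1, which is the kind of behavior the paper attributes to the measured Ethernet traffic.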
II. TRAFFIC MEASUREMENTS
2.1. The Traffic Monitor The monitoring system used to collect the data for the present study was custom-built by one of the authors (Wilson) in 1987/88 and has been in use to the present day with one upgrade. For each packet seen on the Ethernet under study, the monitor records a timestamp (accurate to within lOOj.ts-to within 20 JjS in the updated version of the monitor), the packet
Reprinted from IEEE /ACMTransactions on Networking, vol. 2, no. 1, February 1994.
The Best ofthe Best. Edited by W H. Tranter, D. ~ Taylor, R. E. Ziemer, N. F. Maxernchuk, and 1. W Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
517
518
THE BEST OF THE BEST TABLE I QUALITATIVE DEsCRIPTION OF SETS OF ETHBRNET TRAFFIC MEASURBMENTS USED IN THE Al\ALYSIS IN SECTION IV
Traces of Ethernet Traffic Measurements Total Number Total Number Ethernet of Bytes Utilization of Packets Data Set Measurement Period 11448753 134 27901984 9.3% Total (27.45 h) AUG89.LB 224315439 AUGUST 1989 Low Hour 5,0% AUG89.LP 652909 (6:25 am-7:2S am) Start of Trace: 380889404 AUG89.MB Aug. 29, 11 :25 am Normal Hour 8.5% End of Trace: AUG89.:MP 96863l. (2:25 pm-3:25 pm) Aug. 30t 3:10 pm BUsy Hour 67771.5381 AUG89.HB 15.1% 1404444AUG89.HP 4:25 pm-S:2S pm) 14774694236 27915376 15.7% Total (20.86 h) 468355006 OCT89.LB OCTOBER 1989 Low Hour 10.4%, 978911 OCT89.LP (2:00'am-3:00 am) Start of Trace: 827287174 Oct. 5, 11:00 am Normal Hour OCT89.MB 18.4% 1359656 End of Trace: OCT89.MP (5:00 pm-6:00 pm) Oct. 6. 7:51 pm 1382483551 OCT89.HB Busy Hour 30.7% 2141245 OCT89.HP (11:00 am-12:00 am) 7122417589 27954961 3.9% Total (40.16 h) JAN90.LB 87299639 JANUARY 1990 Low Hour 1.9% 310038 JAN90.LP (Jan. 11t 8:32 pm-9:32 pm) Start of Trace: 182636845 Jan. 10, 6:07 am Nanna) Hour JAN90.MB 4.1% End of Trace: 643451 (Jan. 10,9:32 am-lO:32 am) JAN90.MP Jan. II, 10:17 pm Busy Hour 711529370 JAN90.HB 15.8% FEBRUARY 1992 Start of trace: Feb. 18, 5:22 am End of Trace: Feb.
20~
5:16 am
(10:32 am-I 1:32 am) Total (47.91 h)
JAN90.HP
Low Hour (Feb. 20, 1:21 am-2:21 am) Normal Hour (Feb. 18, '8:21 pm-9:21 pm) Busy Hour (Feb. 18. 11:21 am-12:21 am)
FEB92.LB FEB92.LP FEB92.MB FEB92.MP FEB92.HB
FEB92.HP
6585355731 56811435
1391718 27674814
231823 154626159 225066741
524458 947662
3.1% 1.3% 3.4%
5.0%
length, the status of the Ethernet interface and the first 60 bytes of data in each packet (header information). As we will show in Section IV, the high-accuracy timestamps of the Ethernet packets produced by this monitor are crucial for our statistical analyses of the data. A detailed discussion of the capabilities of the original monitoring system, including extensive testing of its capacity and accuracy, can be found in [14].

2.2. The Network Environment at Bellcore
The network environment at the Bellcore Morris Research and Engineering Center (MRE) where the traffic measurements used for the analysis presented later were collected is probably typical of a research or software development environment where workstations are the primary machines on people's desks. It is also typical in that much of the original installation was well thought out and planned but then grew haphazardly. For the purposes of this study, this haphazard growth is not necessarily a liability as we are able to study the traffic on a network that is evolving over time. Table I gives a summary description of the traffic data analyzed later in the paper. We consider four sets of traffic measurements, each representing between 20 and 40 consecutive hours of Ethernet traffic and each consisting of tens of millions of Ethernet packets. The data were collected on different intracompany LAN networks at different times over the course of approximately four years (August 1989, October 1989, January 1990, and February 1992).

2.2.1. Workgroup Network Traffic Data: Four data sets will be considered in this paper. A summary description of these data sets is given in Table I. The first two sets of traffic measurements, taken in August and October of 1989 (see first two rows in Table I), were from an Ethernet network serving a laboratory of researchers engaged in everything from software development to prototyping new services for the telephone system. The traffic was mostly from services that used the Internet Protocol (IP) suite for such capabilities as remote login or electronic mail, and the Network File System (NFS) protocol for file service from servers to workstations. There were some unique services, though; for example, the audio of a local radio station was μ-law encoded and distributed over the network during portions of the day. While it is not our intent to provide here a detailed description of the particular MRE network segments under study, some words about the types of traffic on them are appropriate.

A snapshot of the network configuration at the time of collection of the earliest data set being used (August 1989) is given in Fig. 1: there were about 140 hosts and routers connected to this intra-laboratory network at that time, of which 121 spoke up during the 27 h monitoring period. This network consisted of two cable segments connected by a bridge, implying that not all the traffic on the network as a whole was visible from our monitoring point. During the period this data was collected, among the 25 most active hosts were two DEC 3100 fileservers, one Sun-4 fileserver, six Sun-3 fileservers, two VAX 8650 minicomputers, and one CCI Power6 minicomputer. At that time, the less active hosts were mainly diskless Sun-3 machines and a smattering of Sun-4's, DEC 3100's, personal computers, and printers.
Fifty Years of Communications and Networking
Fig. 1. Network from which the August 1989 and October 1989 measurements were taken.
During the latter part of 1989 when the first two data sets were collected, a revolution was taking place on this network. The older Sun-3 class workstations were rapidly replaced with RISC-based workstations such as the SPARCstation-1 and DEC 3100. Many of the new workstations were "dataless" (where the operating system is stored on a local disk but user data on a server) instead of "diskless" (where all files for the user and for the operating system are stored on a remote server). Because of the increased computing power of the machines connected to this segment, the network load increased appreciably, in spite of the trend towards dataless workstations. Note, for example, that the "busy hour" from the October 1989 data set is indeed busy: 30.7% utilization as compared to 15.1% during the August 1989 busy hour; similar increases can also be observed for the low and normal hours. Not long after this data was taken, this logical Ethernet segment was again segmented by adding yet a third cable and a bridge, and moving some user workstations and their fileserver to that new cable. The above network has always been isolated from the rest of the Bellcore world by one or more routers. The other sides of these routers were connected to a large corporate internet consisting at that time of many Ethernet segments and T-1 point-to-point links connected together with bridges. Less than 5% of the total traffic on this workgroup network during either of the traces went out to either the rest of Bellcore or outside of the company.

2.2.2. Workgroup and External Traffic: The third data set, taken in January 1990 (row 3 in Table I), came from an Ethernet cable that linked the two wings of the MRE facility that were occupied by a second laboratory (see Fig. 2). At the time this data set was collected, this second laboratory comprised about 160 people, engaged in work similar to the first laboratory. This particular segment was unique in that it was also the segment serving Bellcore's link to the outside Internet world. Thus the traffic on this cable was from several sources: (i) two very active file servers directly connected to the segment; (ii) traffic (file service and remote login) between the two wings of this laboratory; (iii) traffic between the laboratory and the rest of Bellcore; and (iv) traffic between Bellcore as a whole and the larger Internet world. This last type of traffic we term external traffic; in 1990 it could come from conversations between machines in any part of Bellcore and the outside world. This Ethernet segment was specifically monitored to capture this external traffic. In Section IV, we
Fig. 2. Network for second laboratory from which the January 1990 measurements were taken.
Fig. 3. Backbone network for MRE facility from which the February 1992 measurements were taken.
will be considering the aggregate and external traffic from this data set separately. This segment was separated from both the Bellcore internet and the two wings of the laboratory by bridges, and from the outside world by a vendor-controlled router programmed to pass anything with a Bellcore address as source or destination. In contrast to the two earlier data sets, over 1200 hosts spoke up during the 40 h monitoring period on this segment. The last data set, from February 1992 (see row 4 in Table I), was taken from the building-wide Ethernet backbone in MRE after security measures mandated by the "Morris worm" (described in detail in [26]) had been put into place (see Fig. 3). This cable carried all traffic going between laboratories within MRE, traffic from other Bellcore buildings destined for MRE, and all traffic destined for locations outside of Bellcore.
THE BEST OF THE BEST
[Fig. 4 panels, packets per time unit versus time: (a), (a') time unit = 100 seconds; (b), (b') time unit = 10 seconds; (c), (c') time unit = 1 second; (d), (d') time unit = 0.1 second; (e), (e') time unit = 0.01 second.]
Fig. 4. Pictorial "proof" of self-similarity: Ethernet traffic (packets per time unit) on five different time scales (a)-(e). For comparison, synthetic traffic from an appropriately chosen compound Poisson model on the same five different time scales (a')-(e').
Some hosts were still directly connected to this companywide network in early 1992, but the trend to move them from the Bellcore internet to workgroup cables connected to the Bellcore internet via routers continues to the present. Because this cable had very little host to file server traffic, the overall traffic levels were much lower than for the other three sets. On the other hand, the percentage of remote login and mail traffic was higher. This cable also carried the digitized radio traffic between the two laboratories under discussion. The most radical difference between this data set and the others is that the traffic is primarily router to router rather than host to host. In fact, about 600 hosts spoke up during the measurement period (down from about 1200 active hosts during the January '90 measurement period), and the five most active hosts were routers.
III. SELF-SIMILAR STOCHASTIC PROCESSES
3.1. A Picture is Worth a Thousand Words
For 27 consecutive hours of monitored Ethernet traffic from the August 1989 measurements (first row in Table I), Fig. 4(a)-(e) depicts a sequence of simple plots of the packet counts (i.e., number of packets per time unit) for five different choices of time units. Starting with a time unit of 100 s (Fig. 4(a)), each subsequent plot is obtained from the previous one by increasing the time resolution by a factor of 10 and by concentrating on a randomly chosen subinterval (indicated by a darker shade). The time unit corresponding to the finest time scale (e) is 10 ms. In order to avoid the visually irritating quantization effect associated with the finest resolution level, plot (e) depicts a "jittered" version of the number of packets per 10 ms, i.e., a small amount of noise has been added to the actual arrival rate. Observe that with the possible exception of plot (a), which suggests the presence of a daily cycle, all plots are intuitively very "similar" to one another (in a distributional sense), that is, Ethernet traffic seems to look the same in the large (min, h) as in the small (s, ms). In particular, notice the absence of a natural length of a "burst": at every time scale ranging from milliseconds to minutes and hours, bursts consist of bursty subperiods separated by less bursty subperiods. This scale-invariant or "self-similar" feature of Ethernet traffic is drastically different from both conventional telephone traffic and from stochastic models for packet traffic currently considered in the literature. The latter typically produce plots of packet counts which are indistinguishable from white noise after aggregating over a few hundred milliseconds, as illustrated in Fig. 4 with
the sequence of plots (a')-(e'); this sequence was obtained in the same way as the sequence (a)-(e), except that it depicts synthetic traffic generated from a comparable (in terms of average packet size and arrival rate) compound Poisson process. (Note that while the choice of a compound Poisson process is admittedly not very sophisticated, even more complicated Markovian arrival processes would produce plots indistinguishable from Fig. 4(a')-(e').) Fig. 4 provides a surprisingly simple method for distinguishing clearly between our measured data and traffic generated by currently used models and strongly suggests the use of self-similar stochastic processes for traffic modeling purposes. Below, we give a brief description of the concept of self-similar processes, discuss their most important mathematical and statistical properties, mention some modeling approaches, and outline statistical methods for analyzing self-similar data. For a more detailed presentation and references, see [17], [4], or [2].

3.2. Definitions and Properties
Let $X = (X_t : t = 0, 1, 2, \ldots)$ be a covariance stationary stochastic process with mean $\mu$, variance $\sigma^2$, and autocorrelation function $r(k)$, $k \ge 0$. In particular, we assume that $X$ has an autocorrelation function of the form

$$r(k) \sim k^{-\beta} L(k), \quad \text{as } k \to \infty, \tag{1}$$

where $0 < \beta < 1$ and $L$ is slowly varying at infinity, i.e., $\lim_{t \to \infty} L(tx)/L(t) = 1$, for all $x > 0$. (For our discussion below, we assume for simplicity that $L$ is asymptotically constant.) For each $m = 1, 2, 3, \ldots$, let $X^{(m)} = (X_k^{(m)} : k = 1, 2, 3, \ldots)$ denote the new covariance stationary time series (with corresponding autocorrelation function $r^{(m)}$) obtained by averaging the original series $X$ over non-overlapping blocks of size $m$. That is, for each $m = 1, 2, 3, \ldots$, $X^{(m)}$ is given by $X_k^{(m)} = \frac{1}{m}(X_{km-m+1} + \cdots + X_{km})$, $k \ge 1$. The process $X$ is called (exactly) second-order self-similar with self-similarity parameter $H = 1 - \beta/2$ if for all $m = 1, 2, \ldots$, $\mathrm{var}(X^{(m)}) = \sigma^2 m^{-\beta}$ and

$$r^{(m)}(k) = r(k), \quad k \ge 0. \tag{2}$$

$X$ is called (asymptotically) second-order self-similar with self-similarity parameter $H = 1 - \beta/2$ if for all $k$ large enough,

$$r^{(m)}(k) \to r(k), \quad \text{as } m \to \infty, \tag{3}$$

with $r(k)$ given by (1). In other words, $X$ is exactly or asymptotically second-order self-similar if the corresponding aggregated processes $X^{(m)}$ are the same as $X$ or become indistinguishable from $X$, at least with respect to their autocorrelation functions.

Mathematically, self-similarity manifests itself in a number of equivalent ways: (i) the variance of the sample mean decreases more slowly than the reciprocal of the sample size (slowly decaying variances), i.e., $\mathrm{var}(X^{(m)}) \sim a_2 m^{-\beta}$, as $m \to \infty$, with $0 < \beta < 1$ (here and below, $a_2, a_3, \ldots$ denote finite positive constants); (ii) the autocorrelations decay hyperbolically rather than exponentially fast, implying a non-summable autocorrelation function $\sum_k r(k) = \infty$ (long-range dependence), i.e., $r(k)$ satisfies relation (1); and (iii) the spectral density $f(\cdot)$ obeys a power-law near the origin ($1/f$-noise), i.e., $f(\lambda) \sim a_3 \lambda^{-\gamma}$, as $\lambda \to 0$, with $0 < \gamma < 1$ and $\gamma = 1 - \beta$.

Intuitively, the most striking feature of (exactly or asymptotically) second-order self-similar processes is that their aggregated processes $X^{(m)}$ possess a nondegenerate correlation structure, as $m \to \infty$. This intuition is best illustrated with the sequence of plots in Fig. 4: if $X$ represents the number of Ethernet packets per 10 ms (plot (e)), then plots (d)-(a) depict segments of the time series $mX^{(m)}$, $m = 10, 100, 1000, 10000$ (i.e., number of Ethernet packets per 0.1, 1, 10, 100 s), respectively. Note that all plots look "similar" and distinctively different from pure noise. The existence of a nondegenerate correlation structure for the processes $X^{(m)}$, as $m \to \infty$, is in stark contrast to typical packet traffic models currently considered in the literature, all of which have the property that their aggregated processes $X^{(m)}$ tend to second-order pure noise, i.e., for all $k \ge 1$,

$$r^{(m)}(k) \to 0, \quad \text{as } m \to \infty. \tag{4}$$

Equivalently, packet traffic models currently considered in the literature can be characterized by (i) a variance of the sample mean that decreases like the reciprocal of the sample size, i.e., $\mathrm{var}(X^{(m)}) \sim a_4 m^{-1}$, as $m \to \infty$, (ii) an autocorrelation function that decreases exponentially fast (i.e., $r(k) \sim \rho^k$, $0 < \rho < 1$), implying a summable autocorrelation function $\sum_k r(k) < \infty$ (short-range dependence), or (iii) a spectral density that is bounded at the origin.

Historically, the importance of self-similar processes lies in the fact that they provide an elegant explanation and interpretation of an empirical law that is commonly referred to as the Hurst effect. Briefly, for a given set of observations $(X_k : k = 1, 2, \ldots, n)$ with sample mean $\bar{X}(n)$ and sample variance $S^2(n)$, the rescaled adjusted range statistic (or R/S statistic) is given by

$$\frac{R(n)}{S(n)} = \frac{1}{S(n)}\left[\max(0, W_1, W_2, \ldots, W_n) - \min(0, W_1, W_2, \ldots, W_n)\right],$$

with $W_k = (X_1 + X_2 + \cdots + X_k) - k\bar{X}(n)$, $k \ge 1$. While many naturally occurring time series appear to be well represented by the relation $E[R(n)/S(n)] \sim a_5 n^H$, as $n \to \infty$, with Hurst parameter $H$ "typically" about 0.7, observations $X_k$ from a short-range dependent model are known to satisfy $E[R(n)/S(n)] \sim a_6 n^{0.5}$, as $n \to \infty$. This discrepancy is generally referred to as the Hurst effect.

3.3. Modeling of Self-Similar Phenomena
Since in practice we are always dealing with finite data sets, it is in principle not possible to decide whether the above asymptotic relationships (e.g., (1)-(4)) hold or not. For processes that are not self-similar in the sense that their aggregated series converge to second-order pure noise (see (4)), the correlations will eventually decrease exponentially, continuity of the spectral density function at the origin will eventually show up, the variances of the aggregated processes will eventually decrease as $m^{-1}$, and the rescaled adjusted range will eventually increase as $n^{0.5}$. For finite sample sizes, distinguishing between these asymptotics and the ones corresponding to self-similar processes is, in general,
problematic. In the present context of Ethernet measurements, we typically deal with time series with hundreds of thousands of observations and are therefore able to employ statistical and data analytic techniques that are impractical for small data sets. Moreover, with such sample sizes, parsimonious modeling becomes a necessity due to the large number of parameters needed when trying to fit a conventional process to a "truly" self-similar model. Modeling, for example, long-range dependence with the help of short-range dependent processes is equivalent to approximating a hyperbolically decaying autocorrelation function by a sum of exponentials. Although always possible, the number of parameters needed will tend to infinity as the sample size increases, and giving physically meaningful interpretations for the parameters becomes more and more difficult. In contrast, the long-range dependence component of the process can be modeled (by a self-similar process) with only one parameter. Moreover, from a modeling perspective, it would be very unsatisfactory to use for a single empirical time series two different models, one for a short sequence, another one for a long sequence.

Two formal mathematical models that yield elegant representations of the self-similarity phenomenon but do not provide any physical explanation of self-similarity are fractional Gaussian noise and the class of fractional autoregressive integrated moving-average (ARIMA) processes. Fractional Gaussian noise $X = (X_k : k \ge 0)$ with parameter $H \in (0, 1)$ has been introduced in [22] and is a stationary Gaussian process with mean $\mu$, variance $\sigma^2$, and autocorrelation function $r(k) = \frac{1}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$, $k > 0$. Simple calculations show that fractional Gaussian noise is exactly second-order self-similar with self-similarity parameter $H$, as long as $1/2 < H < 1$. Methods for estimating the three unknown parameters $\mu$, $\sigma^2$, and $H$ are known and will be addressed below.
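The exact second-order self-similarity of fractional Gaussian noise can be checked numerically. The following is a minimal sketch, not taken from the paper: it evaluates the autocorrelation function of fractional Gaussian noise and computes the variance of the block mean $X^{(m)}$ directly from it, which should equal $\sigma^2 m^{-\beta}$ with $\beta = 2 - 2H$. The function names `fgn_autocorrelation` and `aggregated_variance` are illustrative choices.

```python
def fgn_autocorrelation(k: int, hurst: float) -> float:
    """Autocorrelation r(k) of fractional Gaussian noise with Hurst parameter H."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * (abs(k + 1) ** two_h - 2.0 * abs(k) ** two_h + abs(k - 1) ** two_h)


def aggregated_variance(m: int, hurst: float, sigma2: float = 1.0) -> float:
    """Variance of the block mean X^(m), computed directly from r(k).

    var(X^(m)) = sigma^2 / m^2 * (m + 2 * sum_{k=1}^{m-1} (m - k) * r(k)).
    """
    total = float(m)
    for k in range(1, m):
        total += 2.0 * (m - k) * fgn_autocorrelation(k, hurst)
    return sigma2 * total / (m * m)


if __name__ == "__main__":
    H = 0.8                 # illustrative value
    beta = 2.0 - 2.0 * H    # from H = 1 - beta/2
    for m in (1, 10, 100):
        # The two printed numbers should agree up to rounding error.
        print(m, aggregated_variance(m, H), m ** (-beta))
```

For $H = 1/2$ the autocorrelations vanish for all $k \ge 1$ and the block-mean variance falls off as $m^{-1}$, which is the short-range dependent case described in (4).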
Fractional ARIMA(p, d, q) processes are a natural generalization of the widely used class of Box-Jenkins models [3] by allowing the parameter d to take non-integer values. They were introduced by Granger and Joyeux [10] and Hosking [12] who showed that fractional ARIMA(p, d, q) processes are asymptotically second-order self-similar with self-similarity parameter $d + 1/2$, as long as $0 < d < 1/2$. Fractional ARIMA processes are much more flexible with regard to the simultaneous modeling of the short-term and long-term behavior of a time series than fractional Gaussian noise, mainly because the latter, having only the three parameters $\mu$, $\sigma^2$, and $H$, has a very rigid correlation structure and is not capable of capturing the wide range of low-lag correlation structures encountered in practice. This flexibility can already be observed when considering the simplest processes of the fractional ARIMA(p, d, q) family, namely the two-parameter models ARIMA(1, d, 0) and ARIMA(0, d, 1).

Finally, we briefly mention a construction of self-similar processes (due to Mandelbrot [19] and later extended by Taqqu and Levy [28]), based on aggregating many simple renewal reward processes exhibiting inter-renewal times with infinite variances. Although the construction was originally cast in an economic framework involving commodity prices, it is particularly appealing in the context of high-speed packet traffic, and we will return to this construction in Section V when attempting to provide a "phenomenological" explanation for the observed self-similar nature of aggregate Ethernet traffic. In its simplest form, this construction requires a sequence of i.i.d. integer-valued random variables $U_0, U_1, U_2, \ldots$ ("inter-renewal times") with "heavy tails," i.e., with the property

$$P[U \ge u] \sim u^{-\alpha} h(u), \quad \text{as } u \to \infty, \tag{5}$$

where $h$ is slowly varying at infinity and $0 < \alpha < 2$. For example, the stable (Pareto) distribution with parameter $1 < \alpha < 2$ satisfies the "heavy-tail" property (5). Furthermore, let $W_0, W_1, W_2, \ldots$ be an i.i.d. sequence ("rewards") with mean zero and finite variance, independent of the $U$'s. Next, let $S_k = S_0 + \sum_{j=1}^{k} U_j$, $k \ge 0$, denote the delayed renewal sequence derived from $(U_j)_{j \ge 0}$, where $S_0$ is chosen such that the sequence $(S_k)_{k \ge 0}$ is stationary. The renewal reward process $W = (W(t) : t = 0, 1, 2, \ldots)$ is then defined by $W(t) := \sum_{k=0}^{\infty} W_k I_{(S_{k-1}, S_k]}(t)$, with $I_A(\cdot)$ denoting the indicator function of the set $A$. By aggregating $M$ i.i.d. copies $W^{(1)}, W^{(2)}, \ldots, W^{(M)}$ of $W$, we obtain the model of interest, namely the process $W^*$ given by $W^*(T, M) = \sum_{t=1}^{T} \sum_{m=1}^{M} W^{(m)}(t)$ with $W^*(0, M) = 0$. In [19] and [28] it is shown that for $T$ and $M$ both large with $T \ll M$, $W^*$ behaves like fractional Brownian motion; in other words, properly normalized, $W^*(T, M)$ converges to the integrated version of fractional Gaussian noise, i.e., to a mean-zero Gaussian process $B_H = (B_H(s) : s \ge 0)$, $1/2 < H < 1$, with correlation function $R(s, t) = \frac{1}{2}\left(s^{2H} + t^{2H} - |s - t|^{2H}\right)$. For more details concerning fractional Brownian motion, see [22] and [21]. As an immediate consequence of Taqqu and Levy's result, we have that for $T$ and $M$ both large with $T \ll M$, the increment process of $W^*$ behaves like fractional Gaussian noise.
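The renewal reward construction can be sketched in a few lines of standard-library Python. This is an illustration under simplifying assumptions, not the construction as analyzed in [19] and [28]: inter-renewal times are drawn as the ceiling of a Pareto variate (so that $P[U \ge u]$ decays like $u^{-\alpha}$), rewards are standard Gaussian, and each path starts with a renewal at time zero instead of choosing $S_0$ to make the renewal sequence stationary. The names `renewal_reward_path` and `aggregate_sources` are hypothetical.

```python
import math
import random


def renewal_reward_path(T, alpha, rng):
    """One path W(1), ..., W(T): reward W_k is held over the k-th renewal interval."""
    path = []
    while len(path) < T:
        # Heavy-tailed, integer-valued inter-renewal time (assumption: ceil of Pareto).
        u = math.ceil(rng.paretovariate(alpha))
        # Mean-zero, finite-variance reward (assumption: standard Gaussian).
        w = rng.gauss(0.0, 1.0)
        path.extend([w] * u)
    return path[:T]


def aggregate_sources(T, M, alpha, seed=0):
    """Return (increments, w_star): the sum over M i.i.d. copies at each time t,
    and the cumulative process W*(t, M) for t = 0, ..., T with W*(0, M) = 0."""
    rng = random.Random(seed)
    increments = [0.0] * T
    for _ in range(M):
        for t, w in enumerate(renewal_reward_path(T, alpha, rng)):
            increments[t] += w
    w_star = [0.0]
    for inc in increments:
        w_star.append(w_star[-1] + inc)
    return increments, w_star


if __name__ == "__main__":
    # Illustrative sizes only; the limit result requires T and M both large, T << M.
    increments, w_star = aggregate_sources(T=1000, M=200, alpha=1.5, seed=1)
    print(len(increments), len(w_star), w_star[-1])
```

The `increments` sequence plays the role of the increment process of $W^*$; for large $T$ and $M$ it is the quantity that the text says behaves like fractional Gaussian noise.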
3.4. Inference for Self-Similar Processes
Since slowly decaying variances, long-range dependence, and a spectral density obeying a power-law are different manifestations of one and the same property of the underlying covariance stationary process $X$, namely that $X$ is asymptotically or exactly second-order self-similar, we can approach the problem of testing for and estimating the degree of self-similarity from three different angles: (1) time-domain analysis based on the R/S-statistic, (2) analysis of the variances of the aggregated processes $X^{(m)}$, and (3) periodogram-based analysis in the frequency-domain. The following gives a brief description of the corresponding statistical and graphical tools. For an engineering-based graphical tool that is related to the variance property of the aggregated processes, see Section 5.2. The objective of the R/S analysis of an empirical record is to infer the degree of self-similarity $H$ (Hurst parameter), via the Hurst effect, for the self-similar process that presumably generated the record under consideration. Graphical R/S analysis consists of taking logarithmically spaced values of $n$ (starting with $n \approx 10$) and plotting $\log(R(n)/S(n))$ versus $\log(n)$; the result is the rescaled adjusted range plot (also called the pox diagram of R/S). When $H$ is well defined, a typical rescaled adjusted range plot starts with a transient zone representing the nature of short-range dependence in
the sample, but eventually settles down and fluctuates in a straight "street" of a certain slope. Graphical R/S analysis is used to determine whether such asymptotic behavior appears supported by the data. In the affirmative, an estimate $\hat{H}$ of $H$ is given by the street's asymptotic slope, which can take any value between 1/2 and 1. For practical purposes, the most useful and attractive feature of the R/S analysis is its relative robustness against changes of the marginal distribution. This feature allows for practically separate investigations of the self-similarity property of a given data set and of its distributional characteristics.

We have observed that for second-order self-similar processes, the variances of the aggregated processes $X^{(m)}$, $m \ge 1$, decrease linearly (for large $m$) in log-log plots against $m$ with slopes arbitrarily flatter than $-1$. The so-called variance-time plots are obtained by plotting $\log(\mathrm{var}(X^{(m)}))$ against $\log(m)$ ("time") and by fitting a simple least squares line through the resulting points in the plane, ignoring the small values for $m$. Values of the estimated asymptotic slope $-\hat{\beta}$ between $-1$ and 0 suggest self-similarity, and an estimate for the degree of self-similarity is given by $\hat{H} = 1 - \hat{\beta}/2$.

The absence of any limit law results for the statistics corresponding to the R/S analysis or the variance-time plot makes them inadequate for a more refined data analysis (e.g., confidence intervals for $H$). In contrast, a more refined data analysis is possible for maximum likelihood-type estimates (MLE) and related methods based on the periodogram $I(x) = (2\pi n)^{-1}\left|\sum_{j=1}^{n} X_j e^{ijx}\right|^2$, $0 \le x \le \pi$, of $X = (X_1, X_2, \ldots, X_n)$ and its distributional properties. In particular, for Gaussian or approximately Gaussian processes, Whittle's approximate MLE has been studied extensively and has been shown to have desirable statistical properties.
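The two graphical estimators can be sketched as follows. This is a minimal standard-library illustration, not the authors' software: `aggregate` forms the block means $X^{(m)}$, `variance_time_hurst` fits a least squares line to $\log_{10} \mathrm{var}(X^{(m)})$ versus $\log_{10} m$ and converts the slope $-\hat{\beta}$ into $\hat{H} = 1 - \hat{\beta}/2$, and `rescaled_adjusted_range` evaluates $R(n)/S(n)$ for a single block. The helper names, the choice of aggregation levels, and the use of the $1/n$ form of the sample variance are assumptions made for the example.

```python
import math
import random


def aggregate(x, m):
    """Block means X^(m) over non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def sample_variance(x):
    """Sample variance with divisor n (assumption; the text does not fix the divisor)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least squares line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


def variance_time_hurst(x, levels):
    """Estimate H = 1 - beta/2, where -beta is the variance-time slope."""
    log_m = [math.log10(m) for m in levels]
    log_var = [math.log10(sample_variance(aggregate(x, m))) for m in levels]
    beta_hat = -least_squares_slope(log_m, log_var)
    return 1.0 - beta_hat / 2.0


def rescaled_adjusted_range(x):
    """R(n)/S(n) for one block of n observations."""
    n = len(x)
    mean = sum(x) / n
    w, partial = [0.0], 0.0          # the leading 0 accounts for max(0, ...) and min(0, ...)
    for v in x:
        partial += v - mean          # W_k = (X_1 + ... + X_k) - k * mean
        w.append(partial)
    return (max(w) - min(w)) / math.sqrt(sample_variance(x))


if __name__ == "__main__":
    rng = random.Random(42)
    noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    # For independent noise the estimate should be close to 0.5.
    print(variance_time_hurst(noise, [1, 2, 5, 10, 20, 50, 100]))
```

A pox plot would apply `rescaled_adjusted_range` to many blocks at logarithmically spaced block lengths and fit the slope of the resulting cloud; in line with the text, the smallest aggregation levels would normally be excluded from the variance-time fit.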
Combined, Whittle's approximate MLE approach and the aggregation method discussed earlier give rise to an operational procedure for obtaining confidence intervals for the self-similarity parameter $H$. Briefly, for a given time series, consider the corresponding aggregated processes $X^{(m)}$ with $m = 100, 200, 300, \ldots$. For each of the aggregated series, estimate the self-similarity parameter $H^{(m)}$ via Whittle's method. This procedure results in point estimates $\hat{H}^{(m)}$ of $H^{(m)}$ and corresponding 95%-confidence intervals of the form $\hat{H}^{(m)} \pm 1.96\,\hat{\sigma}_{\hat{H}^{(m)}}$, where $\hat{\sigma}^2_{\hat{H}^{(m)}}$ is given by a known central limit theorem result (for references, see [17]). Plots of $\hat{H}^{(m)}$ (together with their 95%-confidence intervals) versus $m$ will typically vary for small aggregation levels, but will stabilize after a while and fluctuate around a constant value, our final estimate of the self-similarity parameter $H$.

IV. ETHERNET TRAFFIC IS SELF-SIMILAR
While Fig. 4 gives a pictorial "proof" of the self-similar nature of the traffic measurements described in Section II, using the statistical and graphical tools presented above, we establish in this section the self-similar nature of Ethernet traffic (and some of its major components, such as external traffic or external TCP traffic) in a statistically more rigorous manner. For each of the four measurement periods described in Table I, we identified typical low-, medium-, and high-activity hours. With the resulting data sets, we are able to investigate features of the observed traffic that persist across the network as well as across time, irrespective of the utilization level of the Ethernet. Only one LAN could be monitored at any one time (making it impossible to study correlations in the activity on different LAN's) and all data were collected from LAN's in the same company (making it not representative for all LAN traffic). For a similar analysis that uses different data sets from Table I, see [16].
4.1. Ethernet Traffic over a 27-Hour Period
In order to check for the possible self-similarity of the August 1989 Ethernet traffic data, we apply the graphical tools described in the previous section, namely, variance-time plots, pox plots of R/S, and periodogram plots, to the three subsets AUG89.LB, AUG89.MB, and AUG89.HB of the August '89 trace that correspond to a typical "low hour," "normal hour," and "busy hour" traffic scenario, respectively (see Table I). Each sequence contains 360000 observations, and each observation represents the number of bytes sent over the Ethernet per 10 ms. As an illustration of the usefulness of the graphical tools for detecting self-similarity in an empirical record, Fig. 5 depicts the variance-time curve (a), the pox plot of R/S (b), and the periodogram plot (c) corresponding to the sequence AUG89.MB. The variance-time curve, which has been normalized by the corresponding sample variance, shows an asymptotic slope that is distinctly different from $-1$ (dotted line) and is easily estimated to be about $-0.40$, resulting in an estimate $\hat{H}$ of the Hurst parameter $H$ of about $\hat{H} \approx 0.80$. Estimating the Hurst parameter directly from the corresponding pox plot of R/S leads to a practically identical estimate; the value of the asymptotic slope of the R/S plot is clearly between 1/2 and 1 (lower and upper dotted line, respectively), with a simple least-squares fit resulting in $\hat{H} \approx 0.79$. Finally, looking at the periodogram plot, we observe that although there are some pronounced peaks in the high-frequency domain of the periodogram, the low-frequency part is characteristic for a power-law behavior of the spectral density around zero. In fact, by fitting a simple least-squares line using only the lowest 10% of all frequencies, we obtain a slope estimate $\hat{\gamma} \approx 0.64$, which results in a Hurst parameter estimate $\hat{H}$ of about 0.82. Thus, together the three graphical methods suggest that the sequence AUG89.MB is self-similar with self-similarity parameter $H \approx 0.80$. Moreover, Fig. 5(d) indicates that the normal hour Ethernet traffic of the August 1989 data is, for practical purposes, exactly self-similar: it shows the estimates of the Hurst parameter $H$ for selected aggregated time series derived from the sequence AUG89.MB, as a function of the aggregation level $m$. For aggregation levels $m = 1, 5, 10, 50, 100, 500, 1000$, we plot the Hurst parameter estimate $\hat{H}^{(m)}$ (based on the pox plots of R/S ("*"), the variance-time curves ("○"), and the periodogram plots ("□")) for the aggregated time series $X^{(m)}$ against the logarithm of the aggregation level $m$. Notice that the estimates are extremely stable and practically constant over the depicted range of aggregation levels $1 \le m \le 1000$. Because the range includes small values of $m$, the sequence AUG89.MB
[Fig. 5 panels (a)-(d). Axis labels: log10(m), log10(d), log10(frequency); panel (d) vertical scale from 0.5 to 1.0 against log10(m).]
Fig. 5. Graphical methods for checking the self-similarity property of the sequence AUG89.MB.
can be regarded as exactly self-similar. Similar results are obtained for the sequences AUG89.LB and AUG89.HB, and for the corresponding packet count processes AUG89.LP, AUG89.MP, and AUG89.HP. Together, these observations show that Ethernet traffic over approximately a 24-hour period is self-similar, with the degree of self-similarity increasing as the utilization of the Ethernet increases.
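The arithmetic behind the three estimates quoted above follows from $H = 1 - \beta/2$ and $\gamma = 1 - \beta$, which give $H = (1 + \gamma)/2$: a variance-time slope of $-0.40$ yields $\hat{H} = 0.80$, and a periodogram slope estimate $\hat{\gamma} = 0.64$ yields $\hat{H} = 0.82$. The sketch below, written under stated assumptions and not taken from the paper, also shows one way to obtain $\hat{\gamma}$: regress $\log_{10} I(\lambda_j)$ on $\log_{10} \lambda_j$ over the lowest 10% of the Fourier frequencies $\lambda_j = 2\pi j/n$ and take the negated slope. The function names and the choice of Fourier frequencies are illustrative.

```python
import cmath
import math
import random


def periodogram(x, freq):
    """I(freq) = |sum_{j=1..n} X_j exp(i*j*freq)|^2 / (2*pi*n), for 0 <= freq <= pi."""
    n = len(x)
    total = sum(v * cmath.exp(1j * (j + 1) * freq) for j, v in enumerate(x))
    return abs(total) ** 2 / (2.0 * math.pi * n)


def hurst_from_beta(beta):
    """H = 1 - beta/2 (variance-time slope is -beta)."""
    return 1.0 - beta / 2.0


def hurst_from_gamma(gamma):
    """H = (1 + gamma)/2, obtained from gamma = 1 - beta and H = 1 - beta/2."""
    return (1.0 + gamma) / 2.0


def periodogram_hurst(x, fraction=0.10):
    """Least squares fit of log10 I(lambda) on log10 lambda over the lowest
    `fraction` of the Fourier frequencies; the fitted slope estimates -gamma."""
    n = len(x)
    n_freq = max(2, int(fraction * (n // 2)))
    log_f, log_i = [], []
    for j in range(1, n_freq + 1):
        lam = 2.0 * math.pi * j / n
        log_f.append(math.log10(lam))
        log_i.append(math.log10(periodogram(x, lam)))
    mean_f = sum(log_f) / n_freq
    mean_i = sum(log_i) / n_freq
    slope = (sum((f - mean_f) * (i - mean_i) for f, i in zip(log_f, log_i))
             / sum((f - mean_f) ** 2 for f in log_f))
    return hurst_from_gamma(-slope)


if __name__ == "__main__":
    print(hurst_from_beta(0.40), hurst_from_gamma(0.64))
    rng = random.Random(7)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    # Independent noise has a flat spectrum, so the estimate should scatter around 0.5.
    print(periodogram_hurst(noise))
```

The direct evaluation of the periodogram costs time proportional to $n$ per frequency, which is adequate for a sketch; for series of the length analyzed in the paper a fast Fourier transform would be the natural substitute.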
4.2. Ethernet Traffic Over a Four-Year Period
In order to examine in detail the nature of Ethernet traffic across time as well as across the network under consideration, we now consider the remaining data sets described in Table I. In contrast to Section 4.1, our analysis below results in estimates of the self-similarity parameter $H$ together with their respective 95%-confidence intervals. As discussed in Section 3.4, such a refined analysis is possible if maximum likelihood type estimates (MLE) or related estimates based on the periodogram are used instead of the mostly heuristic graphical estimation methods illustrated in the previous section. Plots (a)-(d) of Fig. 6 show the result of the MLE-based estimation method when combined with the method of aggregation. For each of the four sets of traffic measurements described in Table I, we use the time series representing the packet counts during normal traffic conditions (i.e., AUG89.MP in Fig. 6(a), OCT89.MP in (b), JAN90.MP in (c), and FEB92.MP in (d)), and consider the corresponding aggregated time series $X^{(m)}$ with $m = 100, 200, 300, \ldots, 1900, 2000$ (representing the packet counts per $1, 2, \ldots, 19, 20$ s, respectively). We plot the Hurst parameter estimates $\hat{H}^{(m)}$ of $H^{(m)}$ obtained from the aggregated series $X^{(m)}$, together with their 95%-confidence intervals, against the aggregation level $m$. Fig. 6 shows that for the packet counts during normal traffic loads (irrespective of the measurement period), the values of $\hat{H}^{(m)}$ are quite stable and fluctuate only slightly in the 0.85 to 0.95 range throughout the aggregation levels considered. The same holds for the 95%-confidence interval bands, indicating strong statistical evidence for self-similarity of these four time series with degrees of self-similarity ranging from about 0.85 to about 0.95. The relatively stable behavior of the estimates $\hat{H}^{(m)}$ for the different aggregation levels $m$ also confirms our earlier finding that Ethernet traffic during normal traffic hours can be considered to be exactly self-similar rather than asymptotically self-similar. For exactly self-similar time series, determining a single point estimate for $H$ and the corresponding 95%-confidence interval is straightforward and can be done by visual inspection of plots such as the ones in Fig. 6 (see below). Notice that in each of the four plots in Fig. 6, we added two lines corresponding to the Hurst parameter estimates obtained from the pox diagrams of R/S and the variance-time plots, respectively. Typically, these lines fall well within the 95%-confidence interval bands, which confirms our earlier argument that for the long time series considered here, graphical estimation methods based on R/S or variance-time plots can be expected to be very accurate.

In addition to the four normal hour packet data time series, we also applied the combined MLE/aggregation method to the other traffic data sets described in Table I. Fig. 7(a) depicts all Hurst parameter estimates (together with the 95%-confidence interval corresponding to the choice of $m$ discussed earlier) for each of the 12 packet data time series, while Fig. 7(b) summarizes the same information for the time series representing the number of bytes. We also include in these summary plots the Hurst parameter estimates obtained via the variance-time plots ("○") and R/S analysis ("*") in order to indicate the accuracy of these essentially heuristic estimators when compared to the statistically more rigorous Whittle estimator ("•"). Concentrating first on the packet data, i.e., Fig. 7(a), we see that despite the transition from mostly host-to-host workgroup traffic during the August 1989 and October 1989 measurement
Fifty Years of Communications and Networking
[Plots for Fig. 6 (panels (a)-(d): estimated Hurst parameter, 0.5 to 1.0, against aggregation level m, up to 2000) and Fig. 7 (panels (a) and (b): estimated Hurst parameter, 0.5 to 1.0, against measurement period AUG89, OCT89, JAN90, FEB92) are not reproduced.]
Fig. 6. Periodogram-based MLE/aggregation method for the sequences AUG89.MP, OCT89.MP, JAN90.MP, and FEB92.MP.
Fig. 7. Summary plot of Hurst parameter estimates for all data sets in Table I.
periods, to a mixture of host-to-host and router-to-router traffic during the January 1990 measurement period, to the predominantly router-to-router traffic of the February 1992 data set, the Hurst parameters corresponding to the typical normal and busy hours, respectively, are comparable, with slightly higher H-values for the busy hours than for the normal traffic hours. This latter observation might be surprising in light of conventional traffic modeling where it is commonly assumed that as the number of sources (Ethernet users) increases, the resulting aggregate traffic becomes smoother and smoother. In contrast to this generally accepted argument for the "Poisson-like" nature of aggregate traffic, our analysis of the Ethernet data shows that, in fact, the aggregate traffic tends to become less smooth (or more bursty) as the number of active sources increases (see also our discussion in Section 5.1). While there were about 120 hosts that spoke up during the August 1989 or October 1989 busy hour, we heard from an order of magnitude more hosts (about 1200) during the January 1990 high traffic hour; the comparable number of active hosts during the February 1992 busy hour was around 600. The major difference between the early (pre-1990) measurements and the later ones (post-1990) can be seen during the low traffic hours. Intuitively, low period router-to-router traffic consists mostly of machine-generated packets which tend to form a much smoother arrival process than low period host-to-host traffic which is typically produced by a smaller than average number of actual Ethernet users, e.g., researchers working late hours.
Next, turning our attention to Fig. 7(b), we observe that as in the case of the packet data, H increases as we move from low to normal to high traffic hours. Moreover, while there is practically no difference between the two post-1990 data sets, the two pre-1990 sets clearly differ from one another but follow a similar pattern as the post-1990 ones. The difference between the August 1989 and October 1989 measurements can be explained by the transition from diskless to "dataless" workstations that occurred during the latter part of 1989 (see Section 2.2). Except during the low hours, the increased computing power of many of the Ethernet hosts causes H to increase and gives rise to a bit rate that closely matches the self-similar feature of the corresponding packet process. Also note that the 95%-confidence intervals corresponding to the Hurst parameter estimates for the low traffic hours are typically wider than those corresponding to the estimates of H for the normal and high traffic hours. This widening indicates
THE BEST OF THE BEST
that Ethernet traffic during low traffic periods is asymptotically self-similar rather than exactly self-similar. We also notice in Fig. 7 that some of the analyzed time series result in estimated Hurst parameters close to 1, i.e., their corresponding 95%-confidence intervals include the value H = 1. When finding an H-estimate close to 1, it is advisable to analyze the time series further to ensure that the observed high degree of self-similarity is genuine and cannot be explained by elementary arguments (see for example [21]). To illustrate, we consider the sequences JAN90.HP and FEB92.HP; visual inspection of both time series and comparisons with traces of fractional Gaussian noise with H ≈ 0.9 (see, for example, the plots in [23] and [21]) show no obvious signs of nonstationarity; the mean seems to be changing with time but the overall mean appears constant and although, locally, there clearly exist spurious trends and cycles of varying frequencies, these "typical" features of nonstationarity are characteristic of stationary long-range dependent processes. Moreover, the variance-time plots as well as the pox diagrams of the adjusted range R (without rescaling by S) of the two time series yield slope estimates (not shown) that are consistent with the observed high H-values. As discussed in [2], this consistency is a strong indication that the given time series cannot be regarded as nonstationary due to a lack of differencing. Further tests for non-stationarity (e.g., due to nonhomogeneities of H) can be found in [17].
4.3. External Ethernet Traffic
The Ethernet traffic analyzed so far is also called internal traffic and consists of all packets on a LAN. An important component of internal Ethernet traffic is the so-called remote or external Ethernet traffic, consisting of all those Ethernet packets that originate on one LAN but are routed to another LAN. That is, for the traffic measurements at hand, an external packet is defined to be an IP (Internet protocol) packet with a source or destination address that is not on any of the Bellcore networks. This external traffic can be viewed as representative for LAN interconnection services, which are expected to contribute significantly to future broadband traffic.
Table II summarizes the external Ethernet traffic data analyzed in the process of this study. We consider the two most recent measurement traces, i.e., the January 1990 and February 1992 data sets, and for ease of comparison, we analyze for both measurement periods the time series consisting of the number of external packets (bytes) per 10 ms during the same low-, normal-, and high-hours of (internal) Ethernet traffic as considered in Table I. The last column in Table II shows that external traffic (in terms of packets or bytes) makes up between 1-10% of the internal traffic during the low hours in January 1990 and February 1992, about 2-25% during the corresponding busy hours, and up to 56% during the February 1992 normal hour. As a result, it is reasonable to expect external traffic to behave very similarly to the overall traffic analyzed earlier in this section. Differences (if any) between the internal and external traffic can, in general, be attributed to NFS traffic between workstations and file servers which is missing completely in the external traffic.
TABLE II
QUALITATIVE DESCRIPTION OF THE SETS OF EXTERNAL ETHERNET TRAFFIC MEASUREMENTS USED IN THE ANALYSIS IN SECTION 4.3
Traces of Ethernet Traffic Measurements
Measurement period: JANUARY 1990 (Start of Trace: Jan. 10, 6:07 am; End of Trace: Jan. 11, 10:17 pm)
Data Set     Total Number   Total Number   Internal Traffic Data   Percentage of
             of Bytes       of Packets     Set (see Table I)       Internal Traffic
JAN90E.LB    1105876                       JAN90.LB                1.27%
JAN90E.LP                   9369           JAN90.LP                3.02%
JAN90E.MB    16536148                      JAN90.MB                9.0?%
JAN90E.MP                   87307          JAN90.MP                13.?7%
JAN90E.HB    13023016                      JAN90.HB                2.00%
JAN90E.HP                   68405          JAN90.HP                4.96%
Measurement period: FEBRUARY 1992 (Start of Trace: Feb. 18, 5:22 am; End of Trace: Feb. 20, 5:16 am)
FEB92E.LB    2319881                       FEB92.LB                4.08%
FEB92E.LP                   25247          FEB92.LP                10.89%
FEB92E.MB    86283283                      FEB92.MB                55.80%
FEB92E.MP                   270636         FEB92.MP                51.60%
FEB92E.HB    55154789                      FEB92.HB                24.50%
FEB92E.HP                   202367         FEB92.HP                21.35%
Repeating the same laborious analysis of Section 4.2 for the data sets described in Table II, we find that in terms of its self-similar nature, external traffic does not differ from the internal traffic studied earlier. More specifically, the Hurst parameters for the external traffic during normal and high (internal) traffic hours (or during previously identified stationary parts of the corresponding data sets) are only slightly smaller than the ones depicted in Fig. 7. For instance, even though the portion of external packets during the high (internal) traffic hour of the January 1990 data is only 2% of all the packets seen during this period, the data set JAN90E.HP seems to be well described by an H-value that changes from H = 0.82 for the first 30 min to H = 0.94 for the second 30 min; recall that the corresponding data set of internal traffic, i.e., the sequence JAN90.HP, has an estimated Hurst parameter of 0.98. A more significant change in the Hurst parameter occurs during the low traffic hours. While the internal traffic data (JAN90.LB, JAN90.LP, FEB92.LB, and FEB92.LP) yield a Hurst parameter of about 0.70, the sequences JAN90E.LB, JAN90E.LP, FEB92E.LB, and FEB92E.LP have H ≈ 0.55, and the corresponding 95%-confidence intervals contain the value H = 0.5. These are the only cases in all the data sets considered in this paper where an H-value of 0.5 (i.e., conventionally used short-range dependent models such as Poisson, batch-Poisson, or Markov-Modulated Poisson Processes) seems to describe the data accurately. For all other data sets described in Tables I and II, the 95%-confidence intervals for the Hurst parameter estimates do not even come
close to covering the value H = 0.5. As already mentioned in our discussion of Fig. 7, the low hour traffic in the January 1990 and February 1992 data is mostly machine-generated and produces traffic that is typically smoother (i.e., less bursty) than traffic that is generated during the normal and busy hours by humans using their workstations. This argument applies even more when considering low hour external traffic.
We also looked at the portion of external traffic using the Transmission Control Protocol (TCP) and IP. There were two main reasons for this. First, the traditional services offered by the Internet are for the most part based around TCP, which offers reliable delivery of data and protection against data loss due to lost or corrupted packets. These services include remote login, file transfer (including anonymous file transfer for making information and programs publicly available to any Internet user), electronic mail, and more recently the delivery of the electronic bulletin board known as Netnews. The second reason is that application programs using the TCP protocol have significantly less control over how their data is actually sent than do applications using the User Datagram Protocol (UDP) or their own protocol. The TCP protocol has significant control over how the user data is segmented and a great deal of control over the spacing of the packets as they are sent out.
When investigating the external TCP traffic, we found that there was little point in doing a separate analysis. For instance, in the heavy traffic hour from the MRE backbone taken in 1992 (FEB92E.HP), 87% of the packets were TCP packets, and a plot of the external TCP traffic is practically indistinguishable from the corresponding plot of the entire external traffic. Of those TCP packets of the FEB92E.HP data set, about 66% of the packets were for file transfer, 9% for remote login/TELNET, 11% for electronic mail, and 13% for netnews delivery. The 12% of non-TCP traffic simply had no effect on the results of our analysis for this data set; external TCP traffic is practically identical to the external traffic, and our findings for the external traffic apply directly to external TCP traffic.
V. ENGINEERING FOR SELF-SIMILAR NETWORK TRAFFIC
The fact that one can distinguish clearly (with respect to second-order statistical properties) between the existing models for Ethernet traffic and our measured data is surprising and clearly challenges some of the modeling assumptions that have been made in the past. While this distinction is obvious from a statistical perspective, potential traffic engineering implications of this distinction are currently under intense scrutiny. Below, we concentrate on three implications of self-similar network traffic for traffic engineering purposes: modeling individual sources such as Ethernet hosts, inadequacy of conventional notions of "burstiness," and the generation of synthetic traces of self-similar traffic. For a simulation study of the effects of self-similar packet traffic on congestion control and management for B-ISDN, we refer to [7].
5.1. On the Nature of Traffic Generated by Individual Ethernet Hosts
In Section IV, we showed that irrespective of when and where the Ethernet measurements were collected, the traffic is self-similar, with different degrees of self-similarity depending on the load on the network. We did so without first studying and modeling the behavior of individual Ethernet users (sources). Although historically, accurate source modeling has been considered a prerequisite for successful modeling of aggregate traffic, we show here that in the case of self-similar packet traffic, knowledge of fundamental characteristics of the aggregate traffic can provide new insight into the nature of traffic generated by an individual user. Thus, in this section we attempt to give a phenomenological explanation for the visually obvious (see Fig. 4) and statistically significant (see Fig. 7) self-similarity property of aggregate Ethernet LAN traffic in terms of the behavior of individual Ethernet users.
To this end, we recall Mandelbrot's construction of fractional Brownian motion (see Section 3.3) and interpret the renewal reward process $W^{(m)} = (W^{(m)}(t) : t = 0, 1, 2, \ldots)$ introduced in Section 3.3 as the amount of information (in bits, bytes, or packets) generated by Ethernet host m at time t ($1 \le m \le M$, $t \ge 0$). In fact, if bits or bytes are the preferred units, the renewal reward process source model resembles the popular class of fluid models (see [1]). On the other hand, if we think of packets as the underlying unit of information, the renewal reward process is basically a packet train model in the sense of [13]. For ease of presentation, we can assume that the "rewards" $W_0, W_1, W_2, \ldots$ take only the values 1 and 0 (or, to keep E[W] = 0, +1 and -1) with equal probabilities, where the value 1/0 during a renewal interval indicates an active/inactive period during which the source sends 1/0 unit(s) of information every time unit. The crucial property that distinguishes the renewal reward process source model from the above mentioned models is that the inter-renewal intervals (i.e., the lengths of the active/inactive periods) are heavy-tailed in the sense of (5) or, using Mandelbrot's terminology, exhibit the infinite variance syndrome. Intuitively, (5) states that with relatively high probability, the active/inactive periods are very long, i.e., each $W^{(m)}$ can assume the same value for a long period of time. While this heavy-tailed property of the active/inactive periods seems plausible in light of the way a typical workstation user contributes to the overall traffic on the Ethernet, we have not yet analyzed the traffic generated by individual Ethernet users in order to validate the simple renewal reward source model assumption. However, evidence in support of the infinite variance syndrome in packet traffic measurements already exists. For example, in a recent study of traffic measurements from an ISDN office automation application, Meier-Hellstern et al. [24] observed that the extreme variability in the data (e.g., interarrival times of packets, number of successive packet arrivals in certain states) cannot be adequately captured using traditional packet traffic models but, instead, seems to be best described with the help of heavy-tailed distributions of the form (5). These authors subsequently propose an elaborate and highly parameterized model for the measured traffic. In contrast, the renewal reward source model for the traffic generated by an individual workstation user is extremely simple; moreover, we have seen in Section 3.3 that when aggregating the traffic of many such source models, the resulting superposition process is a fractional Brownian motion
with self-similarity parameter H = (3 - α)/2, where α is given in (5), and that the time series representing, for example, the total number of bytes or Ethernet packets every 10 ms, behaves like fractional Gaussian noise with the same H-value. In this sense, our analysis in Section IV suggests that a simple renewal reward process is an adequate traffic source model for an individual Ethernet user and that often, a more detailed source modeling might not be needed since the convergence result in Section 3.3 shows that many of the details disappear during the process of aggregating the traffic of many sources and only property (5) is required for the fractional Brownian motion behavior of the superposition process to hold. Note that we have reached this conclusion by treating the Ethernet packets essentially as black boxes, i.e., we did not look into the packet header fields or distinguish packets based on their source or destination. Further work on extracting the relevant source-destination addresses from our measurements and on statistically validating the infinite variance property of the inter-renewal periods of a single source is currently in progress.
[Plots for Fig. 8, panels (a) and (b): IDC against log10(L) (in seconds), for L from 0.01 to 10.0, are not reproduced.]
Fig. 8. Index of dispersion for counts (IDC) as a function of the length L of the time interval over which the IDC is calculated, for the high traffic hours of the January 1990 and February 1992 data sets.
5.2. On Measuring "Burstiness" for Self-Similar Network Traffic
On an intuitive level, the results of our statistical analysis of the Ethernet traffic measurements in Section IV can be summarized by saying that typically, the higher the load on the Ethernet, the higher the estimated Hurst parameter H, i.e., the degree of self-similarity in the arrival rate process (in terms of packets or bytes). Visual comparisons between the different traces also suggest that the larger H, the "burstier" the corresponding trace appears. Trying to capture the intuitive notion of "burstiness" with the help of the Hurst parameter H becomes particularly appealing in light of the relation H = (3 - α)/2 mentioned in the previous section between the self-similarity parameter H and the parameter α that characterizes the "thickness" (see (5)) of the tail of the inter-renewal time distribution (i.e., of the lengths of the active/inactive periods). Clearly, the heavier the tail in (5) (i.e., the closer α gets to 1), the greater the variability of the active/inactive periods and hence, the burstier the traffic generated by an individual source. Going from α to H relates burstiness of an individual source to burstiness of the aggregate traffic: the higher the H, the burstier the aggregate traffic. The fact that the Hurst parameter H seems to capture the intuitive notion of burstiness through the concept of self-similarity and, at the same time, also seems to agree well with the visual assessment of bursty behavior challenges the feasibility of some of the most commonly used measures of "burstiness." The latter include the index of dispersion (for counts), the peak-to-mean ratio, and the coefficient of variation (of inter-renewal times).
A commonly used measure for capturing the variability of traffic over different time scales is provided by the index of dispersion (for counts) and has recently attracted considerable attention (see for example [11]). For a given time interval of length L, the index of dispersion for counts (IDC) is given by the variance of the number of arrivals during the interval of length L divided by the expected value of that same quantity. Fig. 8 depicts the IDC as a function of L in log-log coordinates; it shows the IDC for both internal (solid lines) and external (dashed lines) traffic from the high traffic hour of the January 1990 (Fig. 8(a)) and February 1992 data (b). Note in particular that the IDC increases monotonically throughout a time span that covers 4-5 orders of magnitude. This behavior is in stark contrast to conventional traffic models such as Poisson or Poisson-like processes and the popular Markov-modulated Poisson processes where the IDC is either constant or converges to a fixed value quite rapidly. On the other hand, self-similar traffic models are easily shown to produce a monotonically increasing IDC. In fact, assume for simplicity that the process X, representing the total number of packets seen in every 10 ms interval, is fractional Gaussian noise (with positive drift) with self-similarity parameter H. Then we have $\mathrm{IDC}(L) = \operatorname{var}\left(\sum_{j=1}^{L} X_j\right) / E\left[\sum_{j=1}^{L} X_j\right] \sim c L^{2H-1}$ (where c is a finite positive constant that does not depend on L), and plotting log(IDC(L)) against log(L) results in an asymptotic straight line with slope 2H - 1. The dotted lines in Fig. 8 represent the IDC curves predicted by self-similar traffic models with H ≈ 0.94 (JAN90.HP) and H ≈ 0.96 (FEB92.HP), respectively. Similarly striking agreement between the empirical and theoretical IDC curves can be observed for the corresponding external traffic data sets. Notice that plotting the IDC curve and estimating its slope provides a quick and simple engineering-based approach to testing for self-similarity of a set of traffic measurements.
Leland and Wilson [14] have pointed out the problem with using the peak-to-mean ratio as a measure for "burstiness" in the presence of self-similar traffic.
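Before the peak-to-mean ratio is examined in detail, the IDC slope test described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' analysis code. The function names are hypothetical, NumPy is assumed, and the synthetic input is a superposition of 0/1 sources whose active/inactive period lengths are drawn from a Pareto distribution with minimum 1 and an illustrative α = 1.4. On finite traces the fitted slope only roughly approaches 2H - 1 with H = (3 - α)/2.

```python
import numpy as np

def block_sums(counts, L):
    """Number of arrivals in non-overlapping intervals of L base time units."""
    counts = np.asarray(counts, dtype=float)
    n = len(counts) // L
    return counts[:n * L].reshape(n, L).sum(axis=1)

def idc(counts, L):
    """Index of dispersion for counts: variance over mean of arrivals per interval."""
    s = block_sums(counts, L)
    return s.var(ddof=1) / s.mean()

def idc_slope(counts, lengths):
    """Least-squares slope of log10 IDC(L) against log10 L (about 2H - 1 if self-similar)."""
    values = [idc(counts, L) for L in lengths]
    return np.polyfit(np.log10(lengths), np.log10(values), 1)[0]

def peak_to_mean(counts, L):
    """Peak arrival rate over intervals of length L divided by the mean rate."""
    rates = block_sums(counts, L) / L
    return rates.max() / rates.mean()

def on_off_source(T, alpha, rng):
    """Hypothetical 0/1 source with Pareto(alpha) active/inactive period lengths."""
    d = np.ceil(rng.pareto(alpha, size=T) + 1.0)
    d = np.minimum(d, T).astype(np.int64)          # cap: no period longer than the trace
    k = int(np.searchsorted(np.cumsum(d), T)) + 1  # number of periods needed to cover T
    states = (np.arange(k) + rng.integers(0, 2)) % 2
    return np.repeat(states, d[:k])[:T]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, M, alpha = 2**16, 20, 1.4                   # illustrative values only
    lengths = [2**k for k in range(9)]
    heavy = sum(on_off_source(T, alpha, rng) for _ in range(M))
    poisson = rng.poisson(heavy.mean(), size=T)
    for name, x in (("on/off", heavy), ("poisson", poisson)):
        slope = idc_slope(x, lengths)
        print(name, "IDC slope:", round(slope, 2), "implied H:", round((slope + 1) / 2, 2))
        print(name, "peak/mean, L=1 vs L=256:", peak_to_mean(x, 1), peak_to_mean(x, 256))
```

The Poisson trace should give a slope near zero, while the heavy-tailed superposition should give a clearly positive one. The `peak_to_mean` helper makes the interval dependence of that ratio visible, since the value at L = 1 can never be smaller than the value at a longer interval when the trace length is a multiple of both.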
The observed ratio of peak bandwidth (i.e., peak arrival rate of, say, bytes) to mean bandwidth depends critically on the time interval over which the peak and mean bandwidth is determined, i.e., essentially any peak-to-mean ratio is possible, depending on the length of the measurement interval. For a two-week long trace of the October 1989 measurements, they show that the peak rate
in bytes for the external traffic observed in any 5 s interval is about 150 times the mean arrival rate, while the peak rate observed in any 5 ms interval is about 710 times the mean. The dependence of this burstiness measure on the choice of the time interval is clearly undesirable.
Finally, we remark that the use of the coefficient of variation (for interarrival times), i.e., the ratio of the standard deviation of the interarrival time to the expected value of the interarrival time, as a measure of "burstiness" becomes questionable because of the potential "heavy-tailedness" (in the sense of (5)) of the interarrival times and the implied infinite variance property. Although the empirical standard deviation can always be calculated, it will depend crucially on the sample size and can attain practically any value as the sample size increases.
5.3. On Generating Synthetic Traces of Self-Similar Traffic
As we have noted in Section IV, exactly self-similar models such as fractional Gaussian noise, or some nonlinear transformation of fractional Gaussian noise (in order to ensure, for example, that the process takes only positive values) or asymptotically self-similar models such as fractional ARIMA processes can be used to fit hour-long traces of Ethernet traffic very well. Parameter estimation techniques for these models are known but they often turn out to be computationally too intensive to work for large data sets. However, we have illustrated in Section IV how to estimate the Hurst parameter H for large data sets, and methods to adapt the existing parameter estimation techniques and to apply them to long time series are currently being studied (for references, see [17]). Notice also that our analysis of the measured data has shown that the Hurst parameter can be expected to change during a measurement period of an hour or more and that refinements such as modeling the change points of H may be needed in the future in order to produce more realistic traffic models. For other approaches to modeling self-similar packet traffic, see the recent articles by Erramilli and Singh [6], who use deterministic nonlinear chaotic maps in order to mimic the fractal-like properties of Ethernet traffic, and Veitch [29], whose work is motivated by the early paper of Mandelbrot [18].
An important requirement of practical traffic modeling is to generate synthetic data sequences that exhibit similar features as the measured traffic. While exact methods for generating synthetic traces from fractional Gaussian noise and fractional ARIMA models exist (see for example [12]), they are, in general, only appropriate for short traces (about 1000 observations). For longer time series, short memory approximations have been proposed such as the fast fractional Gaussian noise by Mandelbrot [20]. However, such approximations also often become inappropriate when the sample size becomes exceedingly large. Here, we briefly discuss two methods for generating asymptotically self-similar observations. The first method simulates the buffer occupancy in an M/G/∞ queue, where the service time distribution G satisfies the heavy-tail condition (5), i.e., G has infinite variance. Cox [4] showed that an infinite variance service time distribution results in an asymptotically self-similar buffer occupancy process, and he relates the tail-behavior of the former to the degree of self-similarity of the latter. Generating a time series of length 100000 this way requires about 2 h of CPU-time on a Sun SPARCstation 2. The second method exploits a convergence result obtained by Granger [9] who showed that when aggregating many simple AR(1)-processes, where the AR(1) parameters are chosen from a beta-distribution on [0,1] with shape parameters p and q, then the superposition process is asymptotically self-similar; Granger also showed that the Hurst parameter H depends linearly on the shape parameter q of the beta-distribution. This method is well-suited for parallel computers, and producing a synthetic trace of length 100000 on a MasPar MP-1216, a massively parallel computer with 16384 processors, takes only a few minutes. In contrast, Hosking's method to produce 100000 observations from a fractional ARIMA(0, d, 0) model requires about 10 h of CPU time on a Sun SPARCstation 2. Implementations of and experimentations with these and some other methods are currently under way.
VI. DISCUSSION
Understanding the nature of traffic in high-speed, high-bandwidth communications systems such as B-ISDN is essential for engineering, operations, and performance evaluation of these networks. In a first step toward this goal, it is important to know the traffic behavior of some of the expected major contributors to future high-speed network traffic. In this paper, we analyze LAN traffic offered to a high-speed public network supporting LAN interconnection, an important and rapidly growing B-ISDN service. The main findings of our statistical analysis of hundreds of millions of high quality, high time-resolution Ethernet LAN traffic measurements are that (i) Ethernet LAN traffic is statistically self-similar, irrespective of when during the four-year data collection period 1989-1992 the data were collected and where they were collected in the network, (ii) the degree of self-similarity measured in terms of the Hurst parameter H is typically a function of the overall utilization of the Ethernet and can be used for measuring the "burstiness" of the traffic (namely, the burstier the traffic the higher H), (iii) major components of Ethernet LAN traffic such as external LAN traffic or external TCP traffic share the same self-similar characteristics as the overall LAN traffic, and (iv) the packet traffic models currently considered in the literature are not able to capture the self-similarity property and can therefore be clearly distinguished from our measured data.
For the purpose of modeling this self-similar or fractal-like nature of the Ethernet traffic data, we introduce novel methods based on self-similar stochastic processes. The motivation for these methods is the desire for an accurate and relatively simple (i.e., parsimonious) description of the complex packet traffic generation process. These modeling approaches typically yield a single parameter (i.e., the Hurst parameter) that describes the fractal nature of the measured traffic and appears to capture the intuitive notion of "burstiness" where conventional measures of burstiness no longer apply. From the point of view of queueing/performance analysis, the proposed modeling approaches pose new and challenging problems which are likely to require new sets of mathematical tools. Ultimately, in the context of traffic engineering, it is the predicted performance of appropriately chosen queueing systems that will decide the relevance of self-similar traffic models. However, indications of the impact of the self-similar nature of packet traffic for engineering, operations, and performance evaluation of high-speed networks are already ample: (i) source models for individual Ethernet users are expected to show extreme variability in terms of interarrival times of packets (i.e., the infinite variance syndrome), (ii) commonly used measures for "burstiness" such as the index of dispersion (for counts), the peak-to-mean ratio, or the coefficient of variation (for interarrival times) are no longer meaningful for self-similar traffic but can be replaced by the Hurst parameter, (iii) the nature of congestion produced by self-similar network traffic models differs drastically from that predicted by standard formal models and displays a far more complicated picture than has been typically assumed in the past, and (iv) first analytic results show a clear distinction between predicted performance of certain queueing models with traditional input streams and the same queueing models with self-similar inputs (see for example [25] and [5]). Finally, in light of the same fractal-like behavior recently observed in VBR video traffic (see [2] and [8]), another major contributor to future high-speed network traffic, the more complicated nature of congestion due to the self-similar traffic behavior can be expected to persist even when we move toward a more heterogeneous B-ISDN environment. Thus, we believe based on our measured traffic data that the success or failure of, for example, a proposed congestion control scheme for B-ISDN will depend on how well it performs under a self-similar rather than under one of the standard formal traffic scenarios.
ACKNOWLEDGMENT
This work could not have been done without the help of J. Beran and R. Sherman, who provided the S-functions that made the statistical analysis of an abundance of data possible. The authors also acknowledge many helpful discussions with A. Erramilli about his dynamical systems approach to packet traffic modeling.

REFERENCES
[1] D. Anick, D. Mitra, and M. M. Sondhi, "Stochastic theory of a data-handling system with multiple sources," Bell System Tech. J., vol. 61, pp. 1871-1894, 1982.
[2] J. Beran, R. Sherman, M. S. Taqqu, and W. Willinger, "Variable-bit-rate video traffic and long-range dependence," accepted for publication in IEEE Trans. Commun., 1993.
[3] G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, 2nd ed. San Francisco, CA: Holden Day, 1976.
[4] D. R. Cox, "Long-range dependence: A review," in Statistics: An Appraisal, H. A. David and H. T. David, eds. Ames, IA: The Iowa State University Press, 1984, pp. 55-74.
[5] N. G. Duffield and N. O'Connell, "Large deviations and overflow probabilities for the general single-server queue, with applications," preprint, 1993.
[6] A. Erramilli and R. P. Singh, "Chaotic maps as models of packet traffic," in Proc. 14th ITC, Antibes Juan-les-Pins, France, 1994 (to appear).
[7] H. J. Fowler and W. E. Leland, "Local area network traffic characteristics, with implications for broadband network congestion management," IEEE J. Select. Areas Commun., vol. 9, pp. 1139-1149, 1991.
[8] M. W. Garrett and W. Willinger, "Analysis, modelling, and generation of self-similar VBR video traffic," preprint, 1994.
[9] C. W. J. Granger, "Long memory relationships and the aggregation of dynamic models," J. Econometr., vol. 14, pp. 227-238, 1980.
[10] C. W. J. Granger and R. Joyeux, "An introduction to long-memory time series models and fractional differencing," J. Time Series Anal., vol. 1, pp. 15-29, 1980.
[11] H. Heffes and D. M. Lucantoni, "A Markov modulated characterization of packetized voice and data traffic and related statistical multiplexer performance," IEEE J. Select. Areas Commun., vol. SAC-4, pp. 856-868, 1986.
[12] J. R. M. Hosking, "Fractional differencing," Biometrika, vol. 68, pp. 165-176, 1981.
[13] R. Jain and S. A. Routhier, "Packet trains: Measurements and a new model for computer network traffic," IEEE J. Select. Areas Commun., vol. SAC-4, pp. 986-995, 1986.
[14] W. E. Leland and D. V. Wilson, "High time-resolution measurement and analysis of LAN traffic: Implications for LAN interconnection," in Proc. IEEE INFOCOM '91, Bal Harbour, FL, 1991, pp. 1360-1366.
[15] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, "On the self-similar nature of Ethernet traffic," in Proc. ACM Sigcomm '93, San Francisco, CA, 1993, pp. 183-193.
[16] -, "Statistical analysis of high time-resolution Ethernet LAN traffic measurements," in Proc. 25th Interface, San Diego, CA, 1993.
[17] -, "Self-similarity in high-speed packet traffic: Analysis and modeling of Ethernet traffic measurements," Statistical Science, 1994 (to appear).
[18] B. B. Mandelbrot, "Self-similar error clusters in communication systems and the concept of conditional stationarity," IEEE Trans. Commun. Techn., vol. COM-13, pp. 71-90, 1965.
[19] -, "Long-run linearity, locally Gaussian processes, H-spectra and infinite variances," Intern. Econom. Rev., vol. 10, pp. 82-113, 1969.
[20] -, "A fast fractional Gaussian noise generator," Water Resources Research, vol. 7, pp. 543-553, 1971.
[21] B. B. Mandelbrot and M. S. Taqqu, "Robust R/S analysis of long run serial correlation," in Proc. 42nd Session ISI, 1979, pp. 69-99.
[22] B. B. Mandelbrot and J. W. Van Ness, "Fractional Brownian motions, fractional noises and applications," SIAM Rev., vol. 10, pp. 422-437, 1968.
[23] B. B. Mandelbrot and J. R. Wallis, "Computer experiments with fractional Gaussian noises," Water Resources Research, vol. 5, pp. 228-267, 1969.
[24] K. Meier-Hellstern, P. E. Wirth, Y.-L. Yan, and D. A. Hoeflin, "Traffic models for ISDN data users: Office automation application," in Teletraffic and Datatraffic in a Period of Change (Proc. 13th ITC, Copenhagen, 1991), A. Jensen and V. B. Iversen, eds. Amsterdam, The Netherlands: North-Holland, 1991, pp. 167-172.
[25] I. Norros, "Studies on a model for connectionless traffic, based on fractional Brownian motion," COST24TD(92)041, 1992.
[26] E. H. Spafford, "The Internet worm incident," in Proc. ESEC 89 and Lecture Notes in Computer Science 87. New York: Springer-Verlag, 1989.
[27] M. S. Taqqu, "A bibliographical guide to self-similar processes and long-range dependence," in Dependence in Probability and Statistics, E. Eberlein and M. S. Taqqu, eds. Basel: Birkhauser, 1985, pp. 137-165.
[28] M. S. Taqqu and J. B. Levy, "Using renewal processes to generate long-range dependence and high variability," in Dependence in Probability and Statistics, E. Eberlein and M. S. Taqqu, eds. Boston, MA: Birkhauser, 1986, vol. 11, pp. 73-89.
[29] D. Veitch, "Novel models of broadband traffic," in Proc. 7th Australian Teletraffic Research Seminar, Murray River, Australia, 1992.
Will Leland (M'82/ACM'77) received the Ph.D. degree in computer science from the University of Wisconsin, Madison. He is a Member of Technical Staff at Bellcore, where he works in the Network Systems Research Department.
Murad Taqqu (M'92) received the B.A. degree in mathematics and physics in 1966 from the Universite de Lausanne-Ecole Polytechnique and the Ph.D. degree in statistics in 1972 from Columbia University, New York. Since 1985, he has been Professor in the Department of Mathematics at Boston University. Dr. Taqqu is a Guggenheim Fellow and a Fellow of the Institute of Mathematical Statistics. He is currently an Associate Editor for Stochastic Processes and their Applications and coauthor of the book Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance (Chapman and Hall, 1994).
Walter Willinger received the Diplom (Dipl. Math.) in 1980 from the ETH Zurich, Switzerland, and the M.S. and Ph.D. degrees in 1984 and 1987, respectively, from the School of ORIE, Cornell University, Ithaca, NY. He is a Member of Technical Staff at Bellcore, where he works in the Computing and Communications Research Department. Dr. Willinger is currently an Associate Editor for The Annals of Applied Probability.

Daniel V. Wilson (M'85/ACM'85) received the M.S. degree in electrical engineering from Stanford University in 1983 and the B.S. degree in physics and mathematics from Southwest Missouri State University in 1977. He is a Member of Technical Staff at Bellcore, where he works on network monitoring and analysis.
A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks: The Single-Node Case

Abhay K. Parekh, Member, IEEE, and Robert G. Gallager, Fellow, IEEE

Abstract-The problem of allocating network resources to the users of an integrated services network is investigated in the context of rate-based flow control. The network is assumed to be a virtual circuit, connection-based packet network. We show that the use of Generalized Processor Sharing (GPS), when combined with Leaky Bucket admission control, allows the network to make a wide range of worst-case performance guarantees on throughput and delay. The scheme is flexible in that different users may be given widely different performance guarantees, and is efficient in that each of the servers is work conserving. We present a practical packet-by-packet service discipline, PGPS (first proposed by Demers, Shenker, and Keshav [7] under the name of Weighted Fair Queueing), that closely approximates GPS. This allows us to relate results for GPS to the packet-by-packet scheme in a precise manner. In this paper, the performance of a single-server GPS system is analyzed exactly from the standpoint of worst-case packet delay and burstiness when the sources are constrained by leaky buckets. The worst-case session backlogs are also determined. In the sequel to this paper, these results are extended to arbitrary topology networks with multiple nodes.

Manuscript received June 1992; revised February and April 1992; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor Moshe Sidi. This paper was presented in part at IEEE INFOCOM '92. The research of A. Parekh was partly funded by a Vinton Hayes Fellowship and a Center for Intelligent Control Systems Fellowship. The research of R. Gallager was funded by the National Science Foundation under 8802991-NCR and by the Army Research Office under DAAL03-86-K-0171.
A. K. Parekh is with the IBM T. J. Watson Research Center, Yorktown Heights, NY 10598.
R. G. Gallager is with the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA.
IEEE Log Number 9211033.

I. INTRODUCTION
This paper and its sequel [17] focus on a central problem in the control of congestion in high-speed integrated services networks. Traditionally, the flexibility of data networks has been traded off with the performance guarantees given to its users. For example, the telephone network provides good performance guarantees but poor flexibility, while packet switched networks are more flexible but only provide marginal performance guarantees. Integrated services networks must carry a wide range of traffic types and still be able to provide performance guarantees to real-time sessions such as voice and video. We will investigate an approach to reconcile these apparently conflicting demands when the short-term demand for link usage frequently exceeds the usable capacity. We propose the combined use of a packet service discipline based on Generalized Processor Sharing and Leaky Bucket
rate control to provide flexible, efficient, and fair use of the links. Neither Generalized Processor Sharing nor its packet-based version, PGPS, is new. Generalized Processor Sharing is a natural generalization of uniform processor sharing [14], and the packet-based version (while developed independently by us) was first proposed in [7] under the name of Weighted Fair Queueing. Our contribution is to suggest the use of PGPS in the context of integrated services networks and to combine this mechanism with Leaky Bucket admission control in order to provide performance guarantees in a flexible environment. A major part of our work is to analyze networks of arbitrary topology using these specialized servers, and to show how the analysis leads to implementable schemes for guaranteeing worst-case packet delay. In this paper, however, we will restrict our attention to sessions at a single node, and postpone the analysis of arbitrary topologies to the sequel.

Our approach can be described as a strategy for rate-based flow control. Under rate-based schemes, a source's traffic is parametrized by a set of statistics such as average rate, maximum rate, and burstiness, and is assigned a vector of values corresponding to these parameters. The user also requests a certain quality of service that might be characterized, for example, by tolerance to worst-case or average delay. The network checks to see if a new source can be accommodated and, if so, takes actions (such as reserving transmission links or switching capacity) to ensure the quality of service desired. Once a source begins sending traffic, the network ensures that the agreed-upon values of traffic parameters are not violated.

Our analysis will concentrate on providing guarantees on throughput and worst-case packet delay. While packet delay in the network can be expressed as the sum of the processing, queueing, transmission, and propagation delays, we will focus exclusively on how to limit queueing delay.
We will assume that rate admission control is done through leaky buckets [20]. An important advantage of using leaky buckets is that this allows us to separate the packet delay into two components: delay in the leaky bucket and delay in the network. The first of these components is independent of the other active sessions, and can be estimated by the user if the statistical characterization of the incoming data is sufficiently simple (see [1, Sect. 6.3] for an example). The traffic entering the network has been "shaped" by the leaky bucket in a manner that can be succinctly characterized (we will do this in Section V), and so the network can upper bound the second component of packet delay through this characterization. This
Reprinted from IEEE/ACM Transactions on Networking, vol. 1, no. 3, June 1993.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
upper bound is independent of the statistics of the incoming data, which is helpful in the usual case where these statistics are either complex or unknown. A similar approach to the analysis of interconnection networks has been taken by Cruz [5]. From this point on, we will not consider the delay in the leaky bucket.

Generalized Processor Sharing (GPS) is defined and explained in Section II. In Section III, we present the packet-based scheme, PGPS, and show that it closely approximates GPS. Results obtained in this section allow us to translate session delay and buffer requirement bounds derived for a GPS server system to a PGPS server system. We propose a virtual time implementation of PGPS in the next section. Then, PGPS is compared to weighted round robin, virtual clock multiplexing [21], and stop-and-go queueing [9]-[11].

Having established PGPS as a desirable multiplexing scheme, we turn our attention to the rate enforcement function in Section V. The Leaky Bucket is described and proposed as a desirable strategy for admission control. We then proceed with an analysis, in Sections VI-VIII, of a single GPS server system in which the sessions are constrained by leaky buckets. The results obtained here are crucial in the analysis of arbitrary topology and multiple node networks, which we will present in the sequel to this paper.

II. GPS MULTIPLEXING
The choice of an appropriate service discipline at the nodes of the network is key to providing effective flow control. A good scheme should allow the network to treat users differently, in accordance with their desired quality of service. However, this flexibility should not compromise the fairness of the scheme, i.e., a few classes of users should not be able to degrade service to other classes, to the extent that performance guarantees are violated. Also, if one assumes that the demand for high bandwidth services is likely to keep pace with the increase in usable link bandwidth, time and frequency multiplexing are too wasteful of the network resources to be considered candidate multiplexing disciplines. Finally, the service discipline must be analyzable so that performance guarantees can be made in the first place. We now present a flow-based multiplexing discipline called Generalized Processor Sharing that is efficient, flexible, and analyzable, and that therefore seems very appropriate for integrated services networks. How...

... for any session $i$ that is continuously backlogged in the interval $(\tau, t]$. Summing over all sessions $j$:

...

and session $i$ is guaranteed a rate of

$g_i = \frac{\phi_i}{\sum_j \phi_j}\, r.$  (2)

GPS is an attractive multiplexing scheme for a number of reasons:
• Define $r_i$ to be the session $i$ average rate. Then, as long as $r_i \le g_i$, the session can be guaranteed a throughput of $\rho_i$ independent of the demands of the other sessions. In addition to this throughput guarantee, a session $i$ backlog will always be cleared at a rate $\ge g_i$.
• The delay of an arriving session $i$ bit can be bounded as a function of the session $i$ queue length, independent of the queues and arrivals of the other sessions. Schemes such as FCFS, LCFS, and Strict Priority do not have this property.
• By varying the $\phi_i$'s, we have the flexibility of treating the sessions in a variety of different ways. For example, when all $\phi_i$'s are equal, the system reduces to uniform processor sharing. As long as the combined average rate of the sessions is less than $r$, any assignment of positive $\phi_i$'s yields a stable system. For example, a high-bandwidth delay-insensitive session $i$ can be assigned $g_i$ much less than its average rate, thus allowing for better treatment of the other sessions.
• Most importantly, it is possible to make worst-case network queueing delay guarantees when the sources are constrained by leaky buckets. We will present our results on this later. Thus, GPS is particularly attractive for sessions sending real-time traffic such as voice and video.

Fig. 1 illustrates generalized processor sharing. Variable-length packets arrive from both sessions on infinite capacity links and appear as impulses to the system. For $i = 1, 2$, let $A_i(0, t)$ be the amount of session $i$ traffic that arrives at the system in the interval $(0, t]$ and, similarly, let $S_i(0, t)$ be the amount of session $i$ traffic that is served in the interval $(0, t]$. We assume that the server works at rate 1. When $\phi_1 =$ ...
TABLE I
HOW GPS AND PGPS COMPARE FOR THE EXAMPLE IN FIG. 1

                               Session 1        Session 2
Packet information   Arrival   1  2  3  11      0  5  9
                     Size      1  1  2  2       3  2  2
φ1 = φ2              GPS       3  5  9  13      5  9  11
                     PGPS      4  5  7  13      3  9  11
2φ1 = φ2             GPS       4  5  9  13      4  8  11
                     PGPS      4  5  9  13      3  7  11

The lower portion of the table gives the packet departure times under both schemes.

[Figure: panels for Session 1 and Session 2; axes: packet size versus time.]
Fig. 1. An example of generalized processor sharing.
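As a rough check on the GPS rows of Table I, the fluid model can be simulated directly. The sketch below is not taken from the paper: the function name `gps_departures`, the event-stepping loop, and the use of exact fractions are illustrative assumptions. It relies only on the property that each backlogged session is served at rate r·φ_i divided by the sum of the weights of the currently backlogged sessions.

```python
from collections import deque
from fractions import Fraction


def gps_departures(packets, weights, rate=1):
    """Fluid GPS finish times (hypothetical helper, not from the paper).

    packets: list of (session, arrival_time, size) tuples.
    weights: dict mapping session -> phi (positive number).
    Returns the GPS finish time of each packet, in input order.
    """
    order = sorted(range(len(packets)), key=lambda k: packets[k][1])
    queues = {s: deque() for s in weights}  # per-session FIFO of [index, remaining work]
    finish = [None] * len(packets)
    t = Fraction(0)
    nxt = 0
    while nxt < len(order) or any(queues.values()):
        # Admit every packet that has arrived by time t.
        while nxt < len(order) and packets[order[nxt]][1] <= t:
            k = order[nxt]
            queues[packets[k][0]].append([k, Fraction(packets[k][2])])
            nxt += 1
        busy = [s for s in queues if queues[s]]
        if not busy:
            # Server idle: jump to the next arrival.
            t = Fraction(packets[order[nxt]][1])
            continue
        total = sum(weights[s] for s in busy)
        speed = {s: Fraction(rate) * weights[s] / total for s in busy}
        # Advance to the next head-of-line completion or the next arrival.
        dt = min(queues[s][0][1] / speed[s] for s in busy)
        if nxt < len(order):
            dt = min(dt, packets[order[nxt]][1] - t)
        for s in busy:
            queues[s][0][1] -= speed[s] * dt
            if queues[s][0][1] == 0:
                finish[queues[s].popleft()[0]] = t + dt
        t += dt
    return finish


# Arrival pattern of Table I: (session, arrival, size).
TABLE_I = [(1, 1, 1), (1, 2, 1), (1, 3, 2), (1, 11, 2),
           (2, 0, 3), (2, 5, 2), (2, 9, 2)]
equal = gps_departures(TABLE_I, {1: 1, 2: 1})    # phi_1 = phi_2
skewed = gps_departures(TABLE_I, {1: 1, 2: 2})   # 2 phi_1 = phi_2
print([int(x) for x in equal], [int(x) for x in skewed])
```

The first four entries of each result belong to session 1 and the last three to session 2, so they can be compared against the GPS rows of the table.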
may not be the case when the better-treated session is steady. Thus, when combined with appropriate rate enforcement, the flexibility of GPS multiplexing can be used effectively to control packet delay.

Fig. 2. The effect of increasing $\phi_i$ for a steady session $i$.

III. A PACKET-BY-PACKET TRANSMISSION SCHEME-PGPS
A problem with GPS is that it is an idealized discipline that does not transmit packets as entities. It assumes that the server can serve multiple sessions simultaneously and that the traffic is infinitely divisible. In this section, we present a simple packet-by-packet transmission scheme that is an excellent approximation to GPS even when the packets are of variable length. Our idea is identical to the one used in [7]. We will adopt the convention that a packet has arrived only after its last bit has arrived.

Let $F_p$ be the time at which packet $p$ will depart (finish service) under Generalized Processor Sharing. Then, a very good approximation of GPS would be a work-conserving scheme that serves packets in increasing order of $F_p$. Now, suppose that the server becomes free at time $\tau$. The next packet to depart under GPS may not have arrived at time $\tau$ and, since the server has no knowledge of when this packet will arrive, there is no way for the server to be both work conserving and serve the packets in increasing order of $F_p$. The server picks the first packet that would complete service in the GPS simulation if no additional packets were to arrive after time $\tau$. Let us call this scheme PGPS for packet-by-packet Generalized Processor Sharing. As stated earlier, this mechanism was originally called Weighted Fair Queueing [7].

Table I shows how PGPS performs for the example in Fig. 1. Notice that when ...
... $p'$ if there are no arrivals after time $\tau$. Then, packet $p$ will also complete service before packet $p'$ for any pattern of arrivals after time $\tau$.
Proof: The sessions to which packets $p$ and $p'$ belong are both backlogged from time $\tau$ until one completes transmission. By (1), the ratio of the service received by these sessions is independent of future arrivals. □

A consequence of this lemma is that if PGPS schedules a packet $p$ at time $\tau$ before another packet $p'$ that is also backlogged at time $\tau$, then in the simulated GPS system, packet $p$ cannot leave later than packet $p'$. Thus, the only packets that are delayed more in PGPS are those that arrive too late to be transmitted in their GPS order. Intuitively, this means that only the packets that have a small delay under GPS are delayed more under PGPS.

Now let $\hat{F}_p$ be the time at which packet $p$ departs under PGPS. We show that
Theorem 1: For all packets $p$,

$\hat{F}_p - F_p \le \frac{L_{\max}}{r},$

where $L_{\max}$ is the maximum packet length and $r$ is the rate of the server.
Proof: Since both GPS and PGPS are work-conserving disciplines, their busy periods coincide, i.e., the GPS server is in a busy period iff the PGPS server is in a busy period. Hence, it suffices to prove the result for each busy period. Consider any busy period and let the time that it begins be time zero. Let $p_k$ be the $k$th packet in the busy period to depart under PGPS, and let its length be $L_k$. Also, let $t_k$ be the time that $p_k$ departs under PGPS and $u_k$ be the time that $p_k$ departs under GPS. Finally, let $a_k$ be the time that $p_k$ arrives. We now show that

$t_k \le u_k + \frac{L_{\max}}{r}$

for $k = 1, 2, \ldots$. Let $m$ be the largest integer that satisfies both $0 < m \le k - 1$ and $u_m > u_k$. Thus,

$u_m > u_k \ge u_i$ for $m < i < k$.  (3)

Then, packet $p_m$ is transmitted before packets $p_{m+1}, \ldots, p_k$ under PGPS but after all these packets under GPS. If no such integer $m$ exists, then set $m = 0$. Now, for the case $m > 0$, packet $p_m$ begins transmission at $t_m - \frac{L_m}{r}$; so, from Lemma 1,

$\min\{a_{m+1}, \ldots, a_k\} > t_m - \frac{L_m}{r}.$  (4)

Since $p_{m+1}, \ldots, p_{k-1}$ arrive after $t_m - \frac{L_m}{r}$ and depart before $p_k$ does under GPS,

$u_k \ge \frac{1}{r}(L_k + L_{k-1} + L_{k-2} + \cdots + L_{m+1}) + t_m - \frac{L_m}{r} \;\Rightarrow\; u_k \ge t_k - \frac{L_m}{r}.$

If $m = 0$, then $p_{k-1}, \ldots, p_1$ all leave the GPS server before $p_k$ does, and so $u_k \ge t_k$. □

Note that if $N$ maximum-size packets leave simultaneously in the reference system, they can be served in arbitrary order in the packet-based system. Thus, $F_p - \hat{F}_p$ can reach $(N - 1)\frac{L_{\max}}{r}$ even if the reference system is tracked perfectly.

Let $S_i(\tau, t)$ and $\hat{S}_i(\tau, t)$ be the amount of session $i$ traffic (in bits, not packets) served under GPS and PGPS in the interval $[\tau, t]$.
Theorem 2: For all times $\tau$ and sessions $i$:

$S_i(0, \tau) - \hat{S}_i(0, \tau) \le L_{\max}.$

Proof: The slope of $\hat{S}_i$ alternates between $r$ when a session $i$ packet is being transmitted, and 0 when session $i$ is not being served. Since the slope of $S_i$ also obeys these limits, the difference $S_i(0, t) - \hat{S}_i(0, t)$ reaches its maximal value when session $i$ packets begin transmission under PGPS. Let $t$ be some such time, and let $L$ be the length of the packet going into service. Then, the packet completes transmission at time $t + \frac{L}{r}$. Let $\tau$ be the time at which the given packet completes transmission under GPS. Then, since session $i$ packets are served in the same order under both schemes,

$S_i(0, \tau) = \hat{S}_i\left(0, t + \frac{L}{r}\right).$

From Theorem 1,

$\tau \ge \left(t + \frac{L}{r}\right) - \frac{L_{\max}}{r}$  (5)
$\Rightarrow\; S_i\left(0, t + \frac{L - L_{\max}}{r}\right) \le S_i(0, \tau) = \hat{S}_i\left(0, t + \frac{L}{r}\right)$  (6)
$= \hat{S}_i(0, t) + L.$  (7)

Since the slope of $S_i$ is at most $r$, the theorem follows. □

Let $\hat{Q}_i(\tau)$ and $Q_i(\tau)$ be the session $i$ backlog (in units of traffic) at time $\tau$ under PGPS and GPS, respectively. Then, it immediately follows from Theorem 2 that
Corollary 1: For all times $\tau$ and sessions $i$,

$\hat{Q}_i(\tau) - Q_i(\tau) \le L_{\max}.$

Theorem 1 generalizes the result shown for the uniform processing case by Greenberg and Madras [12]. Notice that
• Theorem 1 and Corollary 1 can be used to translate bounds on GPS worst-case packet delay and backlog to corresponding bounds on PGPS.
• Variable packet lengths are easily handled by PGPS. This is not true of weighted round robin.
• The results derived so far can be applied to provide an alternative solution to a problem studied in [4], [19], [2], [8], [3]: There are $N$ input links to a multiplexer; the peak rate of the $i$th link is $C_i$, and the rate of the multiplexer is $C \ge \sum_{i=1}^{N} C_i$. Since up to $L_{\max}$ bits from a packet may be queued from any link before the packet has "arrived," at least $L_{\max}$ bits of buffer must be allocated to each link. In fact, in [3] it is shown that at least $2L_{\max}$ bits are required, and that a class of buffer policies called Least Time to Reach Bound (LTRB) meets this bound. It is easy to design a PGPS policy that meets this bound as well: Setting $\phi_i = C_i$, it is clear that the resulting GPS server ensures
that no more than $L_{\max}$ bits are ever queued at any link. The bound of Corollary 1 guarantees that no more than $2L_{\max}$ bits need to be allocated per link under PGPS. In fact, if $L_i$ is the maximum allowable packet size for link $i$, then the bound on the link $i$ buffer requirement is $L_i + L_{\max}$. Further, various generalizations of the problem can be solved: For example, suppose the link speeds are arbitrary, but no more than $f_i(t) + r_i t$ bits can arrive on link $i$ in any interval of length $t$ (for each $i$). Then, if $\sum_i r_i \le C$, setting $\phi_i = r_i$ for each $i$ yields a PGPS service discipline for which the buffer requirement is $L_{\max} + \max_{t \ge 0}(f_i(t) - r_i t)$ bits for each link $i$.
• There is no constant $c \ge 0$ such that
...

$\hat{S}_1\left(0, \frac{(K-1)L_{\max}}{r}\right) = (K-1)L_{\max}$

and

$S_1\left(0, \frac{(K-1)L_{\max}}{r}\right) = \frac{K(K-1)L_{\max}}{N+K-1}.$

This yields

$\hat{S}_1\left(0, \frac{(K-1)L_{\max}}{r}\right) - S_1\left(0, \frac{(K-1)L_{\max}}{r}\right) = (K-1)L_{\max}\left(1 - \frac{K}{N+K-1}\right).$  (9)

For any given $K$, the RHS of (9) can be made to approach $(K-1)L_{\max}$ arbitrarily closely by increasing $N$.

A. Virtual Time Implementation of PGPS
In this section, we will use the concept of Virtual Time to track the progress of GPS that will lead to a practical implementation of PGPS. Our interpretation of virtual time generalizes the innovative one considered in [7] for uniform processor sharing. In the following, we assume that the server works at rate 1.

Denote as an event each arrival and departure from the GPS server, and let $t_j$ be the time at which the $j$th event occurs (simultaneous events are ordered arbitrarily). Let the time of the first arrival of a busy period be denoted as $t_1 = 0$. Now observe that, for each $j = 2, 3, \ldots$, the set of sessions that are busy in the interval $(t_{j-1}, t_j)$ is fixed, and we may denote this set as $B_j$. Virtual time $V(t)$ is defined to be zero for all times when the server is idle. Consider any busy period, and let the time that it begins be time zero. Then, $V(t)$ evolves as follows:

$V(0) = 0$
$V(t_{j-1} + \tau) = V(t_{j-1}) + \frac{\tau}{\sum_{i \in B_j} \phi_i}, \quad \tau \le t_j - t_{j-1}, \; j = 2, 3, \ldots$  (10)

The rate of change of $V$, namely $\frac{\partial V(t_j + \tau)}{\partial \tau}$, is $\frac{1}{\sum_{i \in B_j} \phi_i}$, and each backlogged session $i$ receives service at rate $\frac{\phi_i}{\sum_{k \in B_j} \phi_k}$.

...
to be paid for these advantages is some overhead in keeping track of sets $B_j$, which is essential in the updating of virtual time.

Define Next($t$) to be the real time at which the next packet will depart the GPS system after time $t$ if there are no more arrivals after time $t$. Thus, the next virtual time update after $t$ will be performed at Next($t$) if there are no arrivals in the interval $[t, \text{Next}(t)]$. Now, suppose a packet arrives at some time $t$ (let it be the $j$th event) and that the time of the event just prior to $t$ is $\tau$ (if there is no prior event, i.e., if the packet is the first arrival in a busy period, then set $\tau = 0$). Then, since the set of busy sessions is fixed between events, $V(t)$ may be computed from (10) and the packet stamped with its virtual time finishing time. Next($t$) is the real time corresponding to the smallest virtual time packet finishing time at time $t$. This real time may be computed from (10) since the set of busy sessions, $B_j$, remains fixed over the interval $[t, \text{Next}(t)]$: Let $F_{\min}$ be the smallest virtual time finishing time of a packet in the system at time $t$. Then, from (10),

$F_{\min} = V(t) + \frac{\text{Next}(t) - t}{\sum_{i \in B_j} \phi_i} \;\Rightarrow\; \text{Next}(t) = t + (F_{\min} - V(t)) \sum_{i \in B_j} \phi_i.$

Given this mechanism for updating virtual time, PGPS is defined as follows: When a packet arrives, virtual time is updated and the packet is stamped with its virtual time finishing time. The server is work conserving and serves packets in an increasing order of timestamp.
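The stamping rule just described can be sketched in code. What follows is an illustration under stated assumptions, not the authors' implementation: the server rate is 1 as in (10); a packet's finish stamp is taken to be max(previous stamp of its session, V at arrival) plus size/φ_i, which is the usual Weighted Fair Queueing rule; virtual time is advanced only at arrivals, stepping through the instants at which a session's last stamp is reached (a simplification of the Next(t) bookkeeping); and ties between equal stamps are broken by session identifier. The name `pgps_departures` is hypothetical.

```python
import heapq
from fractions import Fraction


def pgps_departures(packets, weights):
    """PGPS departure times via virtual-time stamps (illustrative sketch).

    packets: list of (session, arrival_time, size); server rate is 1.
    weights: dict mapping session -> phi.
    """
    n = len(packets)
    order = sorted(range(n), key=lambda k: packets[k][1])
    v = Fraction(0)        # virtual time V(t)
    t_last = Fraction(0)   # real time of the last virtual-time update
    last_f = {s: Fraction(0) for s in weights}  # latest finish stamp per session
    ready = []             # heap of (finish stamp, session, packet index)
    depart = [None] * n

    def advance(t):
        """Bring V up to real time t using the piecewise-linear rule (10)."""
        nonlocal v, t_last
        while True:
            # A session is busy in GPS while its latest stamp exceeds V.
            busy = [s for s in weights if last_f[s] > v]
            if not busy:
                # GPS is idle: a new busy period starts with V = 0.
                v = Fraction(0)
                for s in last_f:
                    last_f[s] = Fraction(0)
                break
            total = sum(weights[s] for s in busy)
            f_min = min(last_f[s] for s in busy)
            reach = t_last + (f_min - v) * total  # real time at which V hits f_min
            if reach > t:
                v += (t - t_last) / total
                break
            v, t_last = f_min, reach
        t_last = t

    now, i = Fraction(0), 0
    while i < n or ready:
        # Stamp every packet that has arrived by the time the server is free.
        while i < n and packets[order[i]][1] <= now:
            k = order[i]
            s, a, size = packets[k]
            advance(Fraction(a))
            start = max(last_f[s], v)
            last_f[s] = start + Fraction(size) / weights[s]
            heapq.heappush(ready, (last_f[s], s, k))
            i += 1
        if ready:
            # Non-preemptive service in increasing order of timestamp.
            _, _, k = heapq.heappop(ready)
            now += packets[k][2]
            depart[k] = now
        else:
            now = Fraction(packets[order[i]][1])
    return depart


# Arrival pattern of Table I: (session, arrival, size).
TABLE_I = [(1, 1, 1), (1, 2, 1), (1, 3, 2), (1, 11, 2),
           (2, 0, 3), (2, 5, 2), (2, 9, 2)]
print([int(x) for x in pgps_departures(TABLE_I, {1: 1, 2: 1})])
```

Run on the arrival pattern of Table I, the sketch can be compared against the PGPS rows of that table; the tie between the two packets stamped equally at time 5 is resolved in favor of session 1 here, which is a choice of this sketch.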
IV. COMPARING PGPS TO OTHER SCHEMES
Under weighted round robin, every session $i$ has an integer weight $w_i$ associated with it. The server polls the sessions according to a precomputed sequence in an attempt to serve session $i$ at a rate of $\frac{w_i}{\sum_j w_j}$. If an empty buffer is encountered, the server moves to the next session in the order instantaneously. When an arriving session $i$ packet just misses its slot in a frame, it cannot be transmitted before the next session $i$ slot. If the system is heavily loaded in the sense that almost every slot is utilized, the packet may have to wait almost $N$ slot times to be served, where $N$ is the number of sessions sharing the server. Since PGPS approximates GPS to within one packet transmission time regardless of the arrival patterns, it is immune to such effects. PGPS also handles variable-length packets in a much more systematic fashion than does weighted round robin. However, if $N$ or the packet sizes are small, then it is possible to approximate GPS well by weighted round robin. Hahne [13] has analyzed round robin in the context of providing fair rates to users of networks that utilize hop-by-hop window flow control.

Zhang proposes an interesting scheme called virtual clock multiplexing [21]. Virtual clock multiplexing allows a guaranteed rate and (average) delay for each session, independent of the behavior of other sessions. However, if a session produces a large burst of data, even while the system is lightly loaded, that session can be "punished" much later when the other sessions become active. Under PGPS, the delay of a session $i$ packet can be bounded in terms of the session $i$ queue size seen by that packet upon arrival, even in the absence of any rate control. This enables sessions to take advantage of lightly loaded network conditions. We illustrate this difference with
a numerical example: Suppose there are two sessions that submit fixed-size packets of one unit each. The rate of the server is one, and the packet arrival rate is 1/2 for each session. Starting at time zero, 1000 session 1 packets begin to arrive at a rate of 1 packet/second. No session 2 packets arrive in the interval (0, 900) but, at time 900, 450 session 2 packets begin to arrive at a rate of one packet/second. Now if the sessions are to be treated equally, the virtual clock for each session will tick at a rate of 1/2, and the PGPS weight assignment will be $\phi_1 = \phi_2$. Since both disciplines are work conserving, they will serve session 1 continuously in the interval [0, 900). At time 900-, there are no packets in queue from either session; the session 1 virtual clock will read 1800 and the session 2 virtual clock will read 900. The 450 session 2 packets that begin arriving at this time will be stamped 900, 902, 904, ..., 1798, while the 100 session 1 packets that arrive after time 900 will be stamped 1800, 1802, ..., 1998. Thus, all of the session 2 packets will be served under Virtual Clock before any of the session 1 packets are served. The session 1 packets are being punished since the session used the server exclusively in the interval [0, 900). Note, however, that this exclusive use of the server was not at the expense of any session 2 packets. Under PGPS, the sessions are served in round robin fashion from time 900 on, which results in much less delay to the session 1 packets.
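The stamps quoted in this example can be reproduced with a few lines of code. The stamping convention below is an assumption chosen to match the numbers in the text (each packet is stamped with the larger of the session's virtual clock and its arrival time, after which the clock advances by one tick of 2, the reciprocal of the rate 1/2); the Virtual Clock algorithm as specified in [21] may differ in detail, and `virtual_clock_stamps` is a hypothetical helper name.

```python
def virtual_clock_stamps(arrival_times, tick):
    """Stamp packets under an assumed Virtual Clock convention.

    Each packet receives max(session clock, arrival time); the session
    clock then advances by `tick` (the reciprocal of the session rate).
    """
    stamps = []
    clock = 0
    for arrival in arrival_times:
        stamp = max(clock, arrival)
        stamps.append(stamp)
        clock = stamp + tick
    return stamps


# Session 1: 1000 packets arriving one per second from time 0.
session1 = virtual_clock_stamps(range(0, 1000), tick=2)
# Session 2: 450 packets arriving one per second from time 900.
session2 = virtual_clock_stamps(range(900, 1350), tick=2)

late_session1 = session1[900:]   # the 100 session 1 packets arriving from time 900 on
print(late_session1[0], late_session1[-1])   # first and last late session 1 stamps
print(session2[0], session2[-1])             # first and last session 2 stamps
```

Because every session 2 stamp is smaller than every stamp of the late session 1 packets, a server that transmits in stamp order serves all 450 session 2 packets first, which is the "punishment" effect described above.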
The lack of a punishment feature is an attractive aspect of PGPS since, in our scheme, the admission of packets is regulated at the network periphery through leaky buckets and it does not seem necessary to punish users at the internal nodes as well. Note, however, that in this example PGPS guarantees a throughput of 1/2 to each session even in the absence of access control.

Stop-and-Go Queueing is proposed in [9]-[11] and is based on a network-wide time slot structure. It has two advantages over our approach: it provides better jitter control and is probably easier to implement. A finite number of connection types are defined, where a type $g$ connection is characterized by a fixed frame size of $T_g$. Since each connection must conform to a predefined connection type, the scheme is somewhat less flexible than PGPS. The admission policy under which delay and buffer size guarantees can be made is that no more than $r_i T_g$ bits may be submitted during any type $g$ frame. If sessions $1, 2, \ldots, N$ are served by a server of capacity 1, it is stipulated that $\sum_{i=1}^{N} r_i \le 1$, where the sum is only taken over the real-time sessions. The delay guarantees grow linearly with $T_g$, so in order to provide low delay one has to use a small slot size. The service discipline is not work conserving and is such that each packet may be delayed up to $2T_g$ time units, even when there is only one active session at the server. Observe that for a single-session PGPS system in which the peak rate does not exceed the rate of the server, each arriving packet is served immediately upon arrival. Also, since it is work conserving, PGPS will provide better average delay than stop-and-go for a given access control scheme.

It is clear that $r_i$ is the average rate at which the source $i$ can send data over a single slot. The relationship between delay and slot size may force Stop-and-Go to allocate bandwidth by peak to satisfy delay-sensitive sessions. This may also happen under PGPS, but not to the same degree. To see this, consider an on/off periodic source that fluctuates between values $C - \epsilon$ and 0. (As usual, $\epsilon$ is small.) The on period is equal to the off period, say they are $B$ seconds in duration. We assume that $B$ is large. Clearly, the average rate of this session is $0.5(C - \epsilon)$. We are interested in providing this session low delay under Stop-and-Go and PGPS. To do this, one has to pick a slot size smaller than $B$, which forces $r = C - \epsilon$. The remaining capacity of the server that can be allocated is $\epsilon$. Under PGPS, we allocate a large value of ...
Fig. 3 depicts the Leaky Bucket scheme [20] that we will use to describe the traffic that enters the network. Tokens or permits are generated at a fixed rate, $\rho$, and packets can be released into the network only after removing the required number of tokens from the token bucket. There is no bound on the number of packets that can be buffered, but the token bucket contains at most $\sigma$ bits worth of tokens. In addition to securing the required number of tokens, the traffic is further constrained to leave the bucket at a maximum rate of $C > \rho$.
Fig. 3. A Leaky Bucket.
The constraint imposed by the leaky bucket is as follows: If $A_i(\tau, t)$ is the amount of session $i$ flow that leaves the leaky bucket and enters the network in time interval $(\tau, t]$, then
$$A_i(\tau, t) \le \min\{(t - \tau)C_i,\ \sigma_i + \rho_i(t - \tau)\}, \quad \forall\, t \ge \tau \ge 0, \qquad (12)$$
for every session $i$. We say that session $i$ conforms to $(\sigma_i, \rho_i, C_i)$, or $A_i \sim (\sigma_i, \rho_i, C_i)$.
This model for incoming traffic is essentially identical to the one recently proposed by Cruz [5], [6], and it has also been used in various forms to represent the inflow of parts into manufacturing systems by Kumar [18], [15]. The arrival constraint is attractive since it restricts the traffic in terms of average sustainable rate ($\rho$), peak rate ($C$), and burstiness ($\sigma$ and $C$). Fig. 4 shows how a fairly bursty source might be characterized using the constraints.
Fig. 4. $A_i(t)$ and $l_i(t)$.
Represent $A_i(0, t)$ as in Fig. 4. Let there be $l_i(t)$ bits worth of tokens in the session $i$ token bucket at time $t$. We assume that the session starts out with a full bucket of tokens. If $K_i(t)$ is the total number of tokens accepted at the session $i$ bucket in the interval $(0, t]$ (it does not include the full bucket of tokens that session $i$ starts out with, and does not include arriving tokens that find the bucket full), then
$$K_i(t) = \min_{0 \le \tau \le t} \{A_i(0, \tau) + \rho_i(t - \tau)\}. \qquad (13)$$
Thus, for all $\tau \le t$,
$$K_i(t) - K_i(\tau) \le \rho_i(t - \tau). \qquad (14)$$
We may now express $l_i(t)$ as
$$l_i(t) = \sigma_i + K_i(t) - A_i(0, t). \qquad (15)$$
From (15) and (14), we obtain the useful inequality
$$A_i(\tau, t) \le l_i(\tau) + \rho_i(t - \tau) - l_i(t). \qquad (16)$$
VI. ANALYSIS
In this section, we analyze the worst-case performance of single-node GPS systems for sessions that operate under Leaky Bucket constraints, i.e., the session traffic is constrained as in (12).
There are $N$ sessions, and the only assumptions we make about the incoming traffic are that $A_i \sim (\sigma_i, \rho_i, C_i)$ for $i = 1, 2, \ldots, N$ and that the system is empty before time zero. The server is work conserving (i.e., it is never idle if there is work in the system), and operates at the fixed rate of 1.
Let $S_i(\tau, t)$ be the amount of session $i$ traffic served in the interval $(\tau, t]$. Note that $S_i(0, t)$ is continuous and nondecreasing for all $t$ (see Fig. 5). The session $i$ backlog at time $\tau$ is defined to be
$$Q_i(\tau) = A_i(0, \tau) - S_i(0, \tau).$$
The session $i$ delay at time $\tau$ is denoted by $D_i(\tau)$, and is the amount of time that it would take for the session $i$ backlog to clear if no session $i$ bits were to arrive after time $\tau$. Thus,
$$D_i(\tau) = \inf\{t \ge \tau : S_i(0, t) = A_i(0, \tau)\} - \tau. \qquad (17)$$
Fig. 5. $A_i(0, t)$, $S_i(0, t)$, $Q_i(t)$ and $D_i(t)$.
From Fig. 5, we see that $D_i(\tau)$ is the horizontal distance between curves $A_i(0, t)$ and $S_i(0, t)$ at the ordinate value of $A_i(0, \tau)$.
Clearly, $D_i(\tau)$ depends on the arrival functions $A_1, \ldots, A_N$. We are interested in computing the maximum delay over all time, and over all arrival functions that are consistent with (12). Let $D_i^*$ be the maximum delay for session $i$. Then,
$$D_i^* = \max_{(A_1, \ldots, A_N)} \max_{\tau \ge 0} D_i(\tau).$$
Similarly, we define the maximum backlog for session $i$, $Q_i^*$:
$$Q_i^* = \max_{(A_1, \ldots, A_N)} \max_{\tau \ge 0} Q_i(\tau).$$
The problem we will solve in the following sections is: Given $\phi_1, \ldots, \phi_N$ for a GPS server of rate 1 and given $(\sigma_j, \rho_j, C_j)$, $j = 1, \ldots, N$, what are $D_i^*$ and $Q_i^*$ for every
session $i$? We will also be able to characterize the burstiness of the output traffic for every session $i$, which will be especially useful in our analysis of GPS networks in the sequel.
A. Definitions and Preliminary Results
We introduce definitions and derive inequalities that are helpful in our analysis. Some of these notions are general enough to be used in the analysis of any work-conserving service discipline (that operates on sources that are Leaky Bucket constrained).
Given $A_1, \ldots, A_N$, let $\sigma_i^\tau$ be defined for each session $i$ and time $\tau \ge 0$ as
$$\sigma_i^\tau = Q_i(\tau) + l_i(\tau) \qquad (18)$$
where $l_i(\tau)$ is defined in (15). Thus, $\sigma_i^\tau$ is the sum of the number of tokens left in the bucket and the session backlog at the server at time $\tau$. If $C_i = \infty$, we can think of $\sigma_i^\tau$ as the maximum amount of session $i$ backlog at time $\tau^+$ over all arrival functions that are identical to $A_1, \ldots, A_N$ up to time $\tau$. Observe that $\sigma_i^0 = \sigma_i$ and
$$Q_i(\tau) = 0 \Rightarrow \sigma_i^\tau \le \sigma_i. \qquad (19)$$
Recall (16):
$$A_i(\tau, t) \le l_i(\tau) + \rho_i(t - \tau) - l_i(t).$$
Substituting for $l_i(\tau)$ and $l_i(t)$ from (18):
$$Q_i(\tau) + A_i(\tau, t) - Q_i(t) \le \sigma_i^\tau - \sigma_i^t + \rho_i(t - \tau). \qquad (20)$$
Now notice that
$$S_i(\tau, t) = Q_i(\tau) + A_i(\tau, t) - Q_i(t). \qquad (21)$$
Combining (20) and (21), we establish the following useful result:
Lemma 2: For every session $i$, $\tau \le t$:
$$S_i(\tau, t) \le \sigma_i^\tau - \sigma_i^t + \rho_i(t - \tau). \qquad (22)$$
Define a system busy period to be a maximal interval $B$ such that for any $\tau, t \in B$, $\tau \le t$:
$$\sum_{i=1}^{N} S_i(\tau, t) = t - \tau.$$
Since the system is work conserving, if $B = [t_1, t_2]$, then $\sum_{i=1}^{N} Q_i(t_1) = \sum_{i=1}^{N} Q_i(t_2) = 0$.
Lemma 3: When $\sum_j \rho_j < 1$, the length of a system busy period is at most
$$\frac{\sum_{i=1}^{N} \sigma_i}{1 - \sum_{i=1}^{N} \rho_i}.$$
Proof: Suppose $[t_1, t_2]$ is a system busy period. By assumption,
$$\sum_{i=1}^{N} Q_i(t_1) = \sum_{i=1}^{N} Q_i(t_2) = 0.$$
Thus,
$$\sum_{i=1}^{N} A_i(t_1, t_2) = \sum_{i=1}^{N} S_i(t_1, t_2) = t_2 - t_1.$$
Substituting from (12) and rearranging terms:
$$t_2 - t_1 \le \frac{\sum_{i=1}^{N} \sigma_i}{1 - \sum_{i=1}^{N} \rho_i}. \qquad \Box$$
A simple consequence of this lemma is that all system busy periods are bounded. Since session delay is bounded by the length of the largest possible system busy period, the session delays are bounded as well. Thus, the interval $B$ is finite whenever $\sum_{i=1}^{N} \rho_i < 1$ and may be infinite otherwise.
We end this section with some comments valid only for the GPS system: Let a session $i$ busy period be a maximal interval $B_i$, contained in a single system busy period, such that for all $\tau, t \in B_i$:
$$\frac{S_i(\tau, t)}{S_j(\tau, t)} \ge \frac{\phi_i}{\phi_j}, \quad j = 1, 2, \ldots, N. \qquad (23)$$
Notice that it is possible for a session to have zero backlog during its busy period. However, if $Q_i(\tau) > 0$ then $\tau$ must be in a session $i$ busy period. We have already shown in (2) that
Lemma 4: For every interval $[\tau, t]$ that is in a session $i$ busy period
$$S_i(\tau, t) \ge (t - \tau)\frac{\phi_i}{\sum_{j=1}^{N} \phi_j}.$$
Notice that when $\phi = \phi_i$ for all $i$, the service guarantee reduces to
$$S_i(\tau, t) \ge \frac{t - \tau}{N}.$$
B. Greedy Sessions
Session $i$ is defined to be greedy starting at time $\tau$ if
$$A_i(\tau, t) = \min\{C_i(t - \tau),\ l_i(\tau) + (t - \tau)\rho_i\}, \quad \text{for all } t \ge \tau. \qquad (24)$$
In terms of the Leaky Bucket, this means that the session uses as many tokens as possible (i.e., sends at maximum possible rate) for all times $\ge \tau$. At time $\tau$, session $i$ has $l_i(\tau)$ tokens left in the bucket, but it is constrained to send traffic at a maximum rate of $C_i$. Thus, it takes $\frac{l_i(\tau)}{C_i - \rho_i}$ time units to deplete the tokens in the bucket. After this, the rate will be limited by the token arrival rate $\rho_i$.
Define $A_i^\tau$ as an arrival function that is greedy starting at time $\tau$ (see Fig. 6). From inspection of the figure [and from
(24)], we see that if a system busy period starts at time zero, then
$$A_i^0(0, t) \ge A_i(0, t), \quad \forall\, A_i \sim (\sigma_i, \rho_i, C_i),\ t \ge 0.$$
Fig. 6. A session $i$ arrival function that is greedy from time $\tau$.
The major result in this section is the following:
Theorem 3: Suppose that $C_j \ge r$ for every session $j$, where $r$ is the rate of a GPS server. Then, for every session $i$, $D_i^*$ and $Q_i^*$ are achieved (not necessarily at the same time) when every session is greedy starting at time zero, the beginning of a system busy period.
This is an intuitively pleasing and satisfying result. It seems reasonable that if a session sends as much traffic as possible at all times, it is going to impede the progress of packets arriving from the other sessions. Notice, however, that we are claiming a worst-case result, which implies that it is never more harmful for a subset of the sessions to "save up" their bursts and to transmit them at a time greater than zero. While there are many examples of service disciplines for which this "all-greedy regime" does not maximize delay, the amount of work required to establish Theorem 3 is still somewhat surprising. Our approach is to prove the theorem for the case when $C_i = \infty$ for all $i$; this implies that the links carrying traffic to the server have infinite capacity. This is the easiest case to visualize since we do not have to worry about the input links. Further, it bounds the performance of the finite link speed case since any session can "simulate" a finite speed input link by sending packets at a finite rate over the link. After we have understood the infinite capacity case, it will be shown that a simple extension in the analysis yields the result for finite link capacities as well.
C. Generalized Processor Sharing with Infinite Incoming Link Capacities
When all input link speeds are infinite, the arrival constraint (12) is modified to
$$A_i(\tau, t) \le \sigma_i + \rho_i(t - \tau), \quad \forall\, 0 \le \tau \le t, \qquad (25)$$
for every session $i$. We say that session $i$ conforms to $(\sigma_i, \rho_i)$ or $A_i \sim (\sigma_i, \rho_i)$. Further, we stipulate that $\sum_i \rho_i < 1$ to ensure stability.
By relaxing our constraint, we allow step or jump arrivals, which create discontinuities in the arrival functions $A_i$. Our convention will be to treat the $A_i$ as left-continuous functions (i.e., continuous from the left). Thus, a session $i$ impulse of size $\Delta$ at time 0 yields $Q_i(0) = 0$ and $Q_i(0^+) = \Delta$. Note also that $l_i(0) = \sigma_i$, where $l_i(\tau)$ is the maximum amount of session $i$ traffic that could arrive at time $\tau^+$ without violating (25). When session $i$ is greedy from time $\tau$, the infinite capacity assumption ensures that $l_i(t) = 0$ for all $t > \tau$. Thus, (16) reduces to
$$A_i^\tau(\tau, t) = l_i(\tau) + (t - \tau)\rho_i, \quad \text{for all } t > \tau. \qquad (26)$$
Note also that if the session is greedy after time $\tau$, $l_i(t) = 0$ for any $t > \tau$.
Defining $\sigma_i^\tau$ as before (from (18)), we see that it is equal to $Q_i(\tau^+)$ when session $i$ is greedy starting at time $\tau$.
An all-greedy GPS system: Theorem 3 suggests that we should examine the dynamics of a system in which all the sessions are greedy starting at time 0, the beginning of a system busy period. This is illustrated in Fig. 7. From (26), we know that
$$A_i(0, \tau) = \sigma_i + \rho_i \tau, \quad \forall\, \tau \ge 0$$
and let us assume, for clarity of exposition, that $\sigma_i > 0$ for all $i$.
Define $e_1$ as the first time at which one of the sessions, say $L(1)$, ends its busy period. Then, in the interval $[0, e_1]$, each session $i$ is in a busy period (since we assumed that $\sigma_i > 0$ for all $i$) and is served at rate $\frac{\phi_i}{\sum_{k=1}^{N} \phi_k}$. Since session $L(1)$ is greedy after 0, it follows that
$$\rho_{L(1)} < \frac{\phi_i}{\sum_{k=1}^{N} \phi_k},$$
where $i = L(1)$. (We will show that such a session must exist in Lemma 5.) Now each session $j$ still in a busy period will be served at rate
$$\frac{(1 - \rho_{L(1)})\,\phi_j}{\sum_{k=1}^{N} \phi_k - \phi_{L(1)}}$$
until a time $e_2$ when another session, $L(2)$, ends its busy period. Similarly, for each $k$:
$$\rho_{L(k)} < \frac{\left(1 - \sum_{j=1}^{k-1} \rho_{L(j)}\right)\phi_i}{\sum_{j=1}^{N} \phi_j - \sum_{j=1}^{k-1} \phi_{L(j)}}, \quad k = 1, 2, \ldots, N,\ i = L(k). \qquad (27)$$
As shown in Fig. 7, the slopes of the various segments that comprise $S_i(0, t)$ are $s_1^i, s_2^i, \ldots$. From (27)
$$s_k^i = \frac{\left(1 - \sum_{j=1}^{k-1} \rho_{L(j)}\right)\phi_i}{\sum_{j=1}^{N} \phi_j - \sum_{j=1}^{k-1} \phi_{L(j)}}, \quad k = 1, 2, \ldots, L(i).$$
It can be seen that $\{s_k^i\}$, $k = 1, 2, \ldots, L(i)$ forms an increasing sequence.
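The all-greedy dynamics described above can be traced numerically. The following is a minimal sketch, not the authors' procedure: it assumes a fluid GPS server, infinite incoming link capacities (so each backlog jumps to $\sigma_i$ at time $0^+$), and $\sum_i \rho_i$ below the server rate. The function name `all_greedy_breakpoints` and its parameters are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple


def all_greedy_breakpoints(sigma: List[float], rho: List[float],
                           phi: List[float], rate: float = 1.0
                           ) -> Tuple[List[int], List[float]]:
    """Return (order, times): sessions in the order L(1), L(2), ... in which
    their busy periods end, with the matching end times e_1 <= e_2 <= ...
    Assumption: fluid model, all sessions greedy from time 0, infinite links."""
    n = len(sigma)
    if sum(rho) >= rate:
        raise ValueError("need sum(rho) < rate for a finite busy period")
    backlog = list(sigma)      # Q_i(0+) = sigma_i under the all-greedy regime
    busy = set(range(n))
    t = 0.0
    order: List[int] = []
    times: List[float] = []
    while busy:
        idle_rho = sum(rho[j] for j in range(n) if j not in busy)
        phi_busy = sum(phi[j] for j in busy)
        # Service rate of each busy session on the current segment.
        slope = {i: (rate - idle_rho) * phi[i] / phi_busy for i in busy}
        # Time until each draining backlog would reach zero at these rates.
        drain = {i: backlog[i] / (slope[i] - rho[i])
                 for i in busy if slope[i] > rho[i]}
        dt = min(drain.values())
        for i in busy:
            backlog[i] -= (slope[i] - rho[i]) * dt
        t += dt
        # Sessions whose backlog empties now (ties allowed) leave the busy set.
        finished = sorted(i for i in drain if drain[i] - dt <= 1e-12)
        for i in finished:
            backlog[i] = 0.0
            busy.remove(i)
            order.append(i)
            times.append(t)
    return order, times


if __name__ == "__main__":
    order, times = all_greedy_breakpoints([1.0, 2.0], [0.2, 0.3], [1.0, 1.0])
    print(order, times)
```

In this example the last breakpoint coincides with $\sum_i \sigma_i / (1 - \sum_i \rho_i) = 3/0.5 = 6$, the busy period bound of Lemma 3.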
Fig. 7. Session $i$ arrivals and departures after 0, the beginning of a system busy period.
Fig. 8. The dynamics of an all-greedy GPS system. The arrival functions are scaled so that a universal service curve, $S(0, t)$, can be drawn. After time $e_i$, session $i$ has a backlog of zero until the end of the system busy period, which is at time $e_5$. The vertical distance between the dashed curve corresponding to session $i$ and $S(0, \tau)$ is $\frac{1}{\phi_i} Q_i(\tau)$, while the horizontal distance yields $D_i(\tau)$ just as it does in Fig. 5.
Note that:
• We only require that $0 \le e_1 \le e_2 \le \cdots \le e_N$, allowing for several $e_i$ to be equal.
• We only care about $t \le e_{L(i)}$ since the session $i$ buffer is always empty after this time.
• Session $L(i)$ has exactly one busy period: the interval $[0, e_i]$.
• $e_N$ is the maximum busy period length, i.e., it meets the bound of Lemma 3.
Any ordering of the sessions that meets (27) is known as a feasible ordering. Thus, sessions $1, \ldots, N$ follow a feasible ordering if and only if:
$$\rho_k < \frac{\left(1 - \sum_{j=1}^{k-1} \rho_j\right)\phi_k}{\sum_{j=k}^{N} \phi_j}, \quad k = 1, 2, \ldots, N. \qquad (28)$$
Lemma 5: At least one feasible ordering exists if $\sum_{i=1}^{N} \rho_i < 1$.
Proof: By contradiction, suppose there exists an index $i$, $1 \le i \le N$, such that we can label the first $i - 1$ sessions of a feasible ordering $\{1, \ldots, i - 1\}$ but (28) does not hold for any of the remaining sessions when $k = i$. Then, denoting $L_{i-1} = \{1, \ldots, i - 1\}$, we have for every session $k \notin L_{i-1}$:
$$\rho_k \ge \left(1 - \sum_{j \in L_{i-1}} \rho_j\right) \frac{\phi_k}{\sum_{j \notin L_{i-1}} \phi_j}.$$
Summing over all such $k$, we have:
$$\sum_{k \notin L_{i-1}} \rho_k \ge 1 - \sum_{j \in L_{i-1}} \rho_j \ \Rightarrow\ \sum_{j=1}^{N} \rho_j \ge 1,$$
which is a contradiction, since we assumed that $\sum_{j=1}^{N} \rho_j < 1$. Thus, no such index $i$ can exist and the lemma is proven. $\Box$
In general, there are many feasible orderings possible, but the one that comes into play at time 0 depends on the $\sigma_i$'s. For example, if $\rho_j = \rho$ and $\phi_j = \phi$, $j = 1, 2, \ldots, N$, then there are $N!$ different feasible orderings. Similarly, there are $N!$ different feasible orderings if $\rho_i = \phi_i$ for all $i$. To simplify the notation, let us assume that the sessions are labeled so that $j = L(j)$ for $j = 1, 2, \ldots, N$. Then, for any two sessions $i, j$ indexed greater than $k$ we can define a "universal slope" $s_k$ by:
$$s_k = \frac{s_k^i}{\phi_i} = \frac{s_k^j}{\phi_j} = \frac{1 - \sum_{m=1}^{k-1} \rho_m}{\sum_{m=k}^{N} \phi_m}, \quad i, j > k,\ k = 1, 2, \ldots, N.$$
This allows us to describe the behavior of all sessions in a single figure as is depicted in Fig. 8. Under the all-greedy regime, the function $V(t)$ (described in Section III-A) corresponds exactly to the universal service curve $S(0, t)$ shown in Fig. 8. It is worth noting that the virtual time function $V(t)$ captures this notion of generalized service for arbitrary arrival functions.
In the remainder of this section, we will prove a tight lower bound on the amount of service a session receives when it is in a busy period. Recall that, for a given set of arrival functions $A = \{A_1, \ldots, A_N\}$, $A^\tau = \{A_1^\tau, \ldots, A_N^\tau\}$ is the set such that for every session $k$, $A_k^\tau(0, s) = A_k(0, s)$ for $s \in [0, \tau)$ and session $k$ is greedy starting at time $\tau$.
Lemma 6: Assume that session $i$ is in a busy period in the interval $[\tau, t]$. Then,
i) For any subset $M$ of $m$ sessions, $1 \le m \le N$, and any time $t \ge \tau$:
$$S_i(\tau, t) \ge \frac{\left(t - \tau - \sum_{j \notin M} \left(\sigma_j^\tau + \rho_j(t - \tau)\right)\right)\phi_i}{\sum_{j \in M} \phi_j}. \qquad (29)$$
ii) Under $A^\tau$, there exists a subset of the sessions, $M^t$, for every $t \ge \tau$ such that equality holds in (29).
Proof: For compactness of notation, let $\phi_{ji} = \frac{\phi_j}{\phi_i}$, $\forall i, j$.
i) From (22),
$$S_j(\tau, t) \le \sigma_j^\tau + \rho_j(t - \tau)$$
for all $j$. Also, since the interval $[\tau, t]$ is in a session $i$ busy period:
$$S_j(\tau, t) \le \phi_{ji} S_i(\tau, t).$$
Thus,
$$S_j(\tau, t) \le \min\{\sigma_j^\tau + \rho_j(t - \tau),\ \phi_{ji} S_i(\tau, t)\}.$$
Since the system is in a busy period, the server serves exactly $t - \tau$ units of traffic in the interval $[\tau, t]$. Thus,
$$t - \tau \le \sum_{j=1}^{N} \min\{\sigma_j^\tau + \rho_j(t - \tau),\ \phi_{ji} S_i(\tau, t)\}$$
$$\Rightarrow\ t - \tau \le \sum_{j \notin M} \left(\sigma_j^\tau + \rho_j(t - \tau)\right) + \sum_{j \in M} \phi_{ji} S_i(\tau, t)$$
for any subset of sessions $M$. Rearranging the terms yields (29).
ii) Since all sessions are greedy after $\tau$ under $A^\tau$, every session $j$ will have a session busy period that begins at $\tau$ and lasts up to some time $e_j$. As we showed in the discussion leading up to Fig. 8, $Q_j(t) = 0$ for all $t \ge e_j$. The system busy period ends at time $e^* = \max_j e_j$. Define
$$M^t = \{j : e_j \ge t\}.$$
By the definition of GPS, we know that session $j \in M^t$ receives exactly $\phi_{ji} S_i(\tau, t)$ units of service in the interval $(\tau, t]$. A session $k$ is not in $M^t$ only if $e_k < t$, so we must have $Q_k(t) = 0$. Thus, for $k \notin M^t$,
$$S_k(\tau, t) = \sigma_k^\tau + \rho_k(t - \tau),$$
and equality is achieved in (29). $\Box$
D. An Important Inequality
In the previous section, we examined the behavior of the GPS system when the sessions are greedy. Here, we prove an important inequality that holds for any arrival functions that conform to the arrival constraints (25).
Theorem 4: Let $1, \ldots, N$ be a feasible ordering. Then, for any time $t$ and session $p$:
$$\sum_{k=1}^{p} \sigma_k^t \le \sum_{k=1}^{p} \sigma_k.$$
We want to show that at the beginning of a session $p$ busy period, the collective burstiness of sessions $1, \ldots, p$ will never be more than what it was at time 0. The interesting aspect of this theorem is that it holds for every feasible ordering of the sessions. When $\rho_j = \rho$ and $\phi_j = \phi$ for every $j$, it says that the collective burstiness of any subset of sessions is no more than what it was at the beginning of the system busy period.
The following three lemmas are used to prove the theorem. The first says (essentially) that if session $p$ is served at a rate smaller than its average rate during a session $p$ busy period, then the sessions indexed lower than $p$ will be served correspondingly higher than their average rates. Note that this lemma is true even when the sessions are not greedy.
Lemma 7: Let $1, \ldots, N$ be a feasible ordering, and suppose that session $p$ is busy in the interval $[\tau, t]$. Further, define $x$ to satisfy
$$S_p(\tau, t) = \rho_p(t - \tau) - x. \qquad (30)$$
Then,
$$\sum_{k=1}^{p-1} S_k(\tau, t) > (t - \tau) \sum_{k=1}^{p-1} \rho_k + x\left(1 + \sum_{j=p+1}^{N} \frac{\phi_j}{\phi_p}\right). \qquad (31)$$
Proof: For compactness of notation, let $\phi_{ij} = \frac{\phi_i}{\phi_j}$, $\forall i, j$. Now because of the feasible ordering,
$$\rho_p < \frac{1 - \sum_{j=1}^{p-1} \rho_j}{\sum_{i=p}^{N} \phi_{ip}}.$$
Thus,
$$S_p(\tau, t) < (t - \tau)\frac{1 - \sum_{j=1}^{p-1} \rho_j}{\sum_{i=p}^{N} \phi_{ip}} - x. \qquad (32)$$
Also, $S_j(\tau, t) \le \phi_{jp} S_p(\tau, t)$ for all $j$. Thus,
$$\sum_{j=p}^{N} S_j(\tau, t) \le S_p(\tau, t) \sum_{j=p}^{N} \phi_{jp} < (t - \tau)\left(1 - \sum_{j=1}^{p-1} \rho_j\right) - x \sum_{j=p}^{N} \phi_{jp}.$$
Since $[\tau, t]$ is in a system busy period,
$$\sum_{j=p}^{N} S_j(\tau, t) = (t - \tau) - \sum_{j=1}^{p-1} S_j(\tau, t).$$
Thus,
$$(t - \tau) - \sum_{j=1}^{p-1} S_j(\tau, t) < (t - \tau)\left(1 - \sum_{j=1}^{p-1} \rho_j\right) - x \sum_{j=p}^{N} \phi_{jp}$$
$$\Rightarrow\ \sum_{j=1}^{p-1} S_j(\tau, t) > (t - \tau) \sum_{j=1}^{p-1} \rho_j + x\left(1 + \sum_{j=p+1}^{N} \phi_{jp}\right),$$
since $\phi_{pp} = 1$. $\Box$
Lemma 8: Let $1, \ldots, N$ be a feasible ordering, and suppose that session $p$ is busy in the interval $[\tau, t]$. Then, if $S_p(\tau, t) \le \rho_p(t - \tau)$:
$$\sum_{k=1}^{p} S_k(\tau, t) > (t - \tau) \sum_{k=1}^{p} \rho_k. \qquad (33)$$
Proof: Let $S_p(\tau, t) = \rho_p(t - \tau) - x$, $x \ge 0$. Then, from (31), we are done since $x \sum_{j=p+1}^{N} \frac{\phi_j}{\phi_p} \ge 0$. $\Box$
Lemma 9: Let $1, \ldots, N$ be a feasible ordering, and suppose that session $p$ is busy in the interval $[\tau, t]$. Then, if $S_p(\tau, t) \le \rho_p(t - \tau)$:
$$\sum_{k=1}^{p} \sigma_k^t \le \sum_{k=1}^{p} \sigma_k^\tau.$$
Proof: From Lemma 2, for every $k$,
$$\sigma_k^\tau + \rho_k(t - \tau) - S_k(\tau, t) \ge \sigma_k^t.$$
Summing over $k$ and substituting from (33), we have the result. $\Box$
If we choose $\tau$ to be the beginning of a session $p$ busy period, then Lemma 9 says that if $S_p(\tau, t) \le \rho_p(t - \tau)$ then
$$\sigma_p^t + \sum_{k=1}^{p-1} \sigma_k^t \le \sigma_p + \sum_{k=1}^{p-1} \sigma_k^\tau. \qquad (34)$$
Now we will prove Theorem 4.
Proof (of Theorem 4): We proceed by induction on the index of session $p$.
Basis: $p = 1$. Define $\tau$ to be the last time at or before $t$ such that $Q_1(\tau) = 0$. Then, session 1 is in a busy period in the interval $[\tau, t]$, and we have
$$S_1(\tau, t) \ge (t - \tau)\frac{\phi_1}{\sum_{k=1}^{N} \phi_k} > (t - \tau)\rho_1.$$
The second inequality follows since session 1 is first in a feasible order, implying that $\rho_1 < \frac{\phi_1}{\sum_{k=1}^{N} \phi_k}$. From Lemma 2,
$$\sigma_1^t \le \sigma_1^\tau + \rho_1(t - \tau) - S_1(\tau, t) \le \sigma_1^\tau \le \sigma_1.$$
This shows the basis.
Inductive Step: Assume the hypothesis for $1, 2, \ldots, p - 1$ and show it for $p$. Observe that if $Q_i(t) = 0$ for any session $i$ then $\sigma_i^t \le \sigma_i$. Now consider two cases:
Case 1: $\sigma_p^t \le \sigma_p$: By the induction hypothesis:
$$\sum_{i=1}^{p-1} \sigma_i^t \le \sum_{i=1}^{p-1} \sigma_i.$$
Thus,
$$\sum_{i=1}^{p} \sigma_i^t \le \sum_{i=1}^{p} \sigma_i.$$
Case 2: $\sigma_p^t > \sigma_p$: Session $p$ must be in a session $p$ busy period at time $t$, so let $\tau$ be the time at which this busy period begins. Also, from (22): $S_p(\tau, t) < \rho_p(t - \tau)$. Applying (34):
$$\sigma_p^t + \sum_{k=1}^{p-1} \sigma_k^t \le \sigma_p + \sum_{k=1}^{p-1} \sigma_k^\tau \le \sum_{k=1}^{p} \sigma_k, \qquad (35)$$
where, in the last inequality, we have used the induction hypothesis. $\Box$
E. Proof of the Main Result
In this section, we will use Lemma 6 and Theorem 4 to prove Theorem 3 for infinite capacity incoming links. Let $\hat{A}_1, \ldots, \hat{A}_N$ be the set of arrival functions in which all the sessions are greedy from time 0, the beginning of a system busy period. For every session $p$, let $\hat{S}_p(\tau, t)$ and $\hat{D}_p(t)$ be the session $p$ service and delay functions under $\hat{A}$. We first show
Lemma 10: Suppose that time $t$ is contained in a session $p$ busy period that begins at time $\tau$. Then
$$\hat{S}_p(0, t - \tau) \le S_p(\tau, t). \qquad (36)$$
Proof: Define $\mathcal{B}$ as the set of sessions that are busy at time $t - \tau$ under $\hat{A}$. From Lemma 6:
$$\hat{S}_p(0, t - \tau) = \frac{\left(t - \tau - \sum_{i \notin \mathcal{B}} \left(\sigma_i + \rho_i(t - \tau)\right)\right)\phi_p}{\sum_{j \in \mathcal{B}} \phi_j}.$$
Since the order in which the sessions become inactive is a feasible ordering, Theorem 4 asserts that:
$$\hat{S}_p(0, t - \tau) \le \frac{\left(t - \tau - \sum_{i \notin \mathcal{B}} \left(\sigma_i^\tau + \rho_i(t - \tau)\right)\right)\phi_p}{\sum_{j \in \mathcal{B}} \phi_j} \le S_p(\tau, t)$$
(from Lemma 6) and (36) is shown. $\Box$
Lemma 11: For every session $i$, $D_i^*$ and $Q_i^*$ are achieved (not necessarily at the same time) when every session is greedy starting at time zero, the beginning of a system busy period.
Proof: We first show that the session $i$ backlog is maximized under $\hat{A}$: Consider any set of arrival functions $A = \{A_1, \ldots, A_N\}$ that conforms to (25), and suppose that for a session $i$ busy period that begins at time $\tau$:
$$Q_i(t^*) = \max_{t \ge \tau} Q_i(t).$$
From Lemma 10,
$$\hat{S}_i(0, t^* - \tau) \le S_i(\tau, t^*).$$
Also, $\sigma_i^\tau \le \sigma_i$. Thus,
$$\sigma_i + \rho_i(t^* - \tau) - \hat{S}_i(0, t^* - \tau) \ge \sigma_i^\tau + \rho_i(t^* - \tau) - S_i(\tau, t^*),$$
i.e.,
$$\hat{Q}_i(t^* - \tau) \ge Q_i(t^*).$$
The case for delay is similar: Consider any set of arrival functions $A = \{A_1, \ldots, A_N\}$ that conforms to (25); for a session $i$ busy period that begins at time $\tau$, let $t^*$ be the smallest time in that busy period such that:
$$D_i(t^*) = \max_{t \ge \tau} D_i(t).$$
From the definition of delay in (17):
$$A_i(\tau, t^*) - S_i(\tau, t^* + D_i(t^*)) = 0.$$
Let us denote $d_i = t^* - \tau$. From Lemma 10,
$$\hat{S}_i(0, d_i + D_i(t^*)) \le S_i(\tau, t^* + D_i(t^*))$$
and, since $\sigma_i \ge \sigma_i^\tau$:
$$\hat{A}_i(0, d_i) \ge A_i(\tau, t^*).$$
Thus,
$$\hat{A}_i(0, d_i) - \hat{S}_i(0, d_i + D_i(t^*)) \ge A_i(\tau, t^*) - S_i(\tau, t^* + D_i(t^*)) = 0 \ \Rightarrow\ \hat{D}_i(d_i) \ge D_i(t^*).$$
Thus, we have shown Theorem 3 for infinite capacity incoming links. $\Box$
VII. GENERALIZED PROCESSOR SHARING WITH FINITE LINK SPEEDS
In the infinite link capacity case, we were able to take advantage of the fact that a session could use up all of its outstanding tokens instantaneously. In this section, we include the maximum rate constraint, i.e., for every session $i$, the incoming session traffic can arrive at a maximum rate of $C_i \ge 1$. Although this can be established rigorously [16], it is not hard to see that Theorem 3 still holds: Consider a given set of arrival functions for which there is no peak rate constraint. Now consider the intervals over which a particular session $i$ is backlogged when the arrivals reach the server through (a) infinite capacity input links and (b) input links such that $1 \le C_j$ for all $j$ and $C_k < \infty$ for at least one session $k$. Since the server cannot serve any session at a rate of greater than 1, the set of intervals over which session $i$ is backlogged is identical for the two cases. This argument holds for every session in the system, implying that the session service curves are identical for cases (a) and (b). Thus, Lemma 10 continues to hold, and Theorem 3 can be established easily from this fact.
We have not been able to show that Theorem 3 holds when $C_j < 1$ for some sessions $j$, but delay bounds calculated for the case $C_j = 1$ (or $C_j = \infty$) apply to such systems since any link of capacity 1 (or $\infty$) can simulate a link of capacity less than 1.
VIII. THE OUTPUT BURSTINESS $\sigma_i^{out}$
In this section, we focus on determining, for every session $i$, the least quantity $\sigma_i^{out}$ such that
$$S_i \sim (\sigma_i^{out}, \rho_i, r),$$
where $r$ is the rate of the server. This definition of output burstiness is due to Cruz [5]. (To see that this is the best possible characterization of the output process, consider the case in which session $i$ is the only active session and is greedy from time zero. Then, a peak service rate of $r$ and a maximum sustainable average rate of $\rho_i$ are both achieved.) By characterizing $S_i$ in this manner, we can begin to analyze networks of servers, which is the focus of the sequel to this paper. Fortunately, there is a convenient relationship between $\sigma_i^{out}$ and $Q_i^*$:
Lemma 12: If $C_j \ge r$ for every session $j$, where $r$ is the rate of the server, then for each session $i$:
$$\sigma_i^{out} = Q_i^*.$$
Proof: First, consider the case $C_i = \infty$. Suppose that $Q_i^*$ is achieved at some time $t^*$, and session $i$ continues to send traffic at rate $\rho_i$ after $t^*$. Further, for each $j \ne i$, let $t_j$ be the time of arrival of the last session $j$ bit to be served before time $t^*$. Then, $Q_i^*$ is also achieved at $t^*$ when the arrival functions of all sessions $j \ne i$ are truncated at $t_j$, i.e., $A_j(t_j, t) = 0$, $j \ne i$. In this case, all other session queues are empty at time $t^*$ and, beginning at time $t^*$, the server will exclusively serve session $i$ at rate 1 for $\frac{Q_i^*}{1 - \rho_i}$ units of time, after which session $i$ will be served at rate $\rho_i$. Thus,
$$S_i(t^*, t) = \min\{t - t^*,\ Q_i^* + \rho_i(t - t^*)\}, \quad \forall\, t \ge t^*.$$
From this, we have
$$\sigma_i^{out} \ge Q_i^*.$$
We now show that the reverse inequality holds as well: For any $\tau \le t$:
$$S_i(\tau, t) = A_i(\tau, t) + Q_i(\tau) - Q_i(t) \le l_i(\tau) + \rho_i(t - \tau) + Q_i(\tau) - Q_i(t) = \sigma_i^\tau - Q_i(t) + \rho_i(t - \tau)$$
(since $C_i = \infty$). This implies that
$$\sigma_i^{out} \le \sigma_i^\tau - Q_i(t) \le \sigma_i^\tau \le Q_i^*.$$
Thus,
$$\sigma_i^{out} = Q_i^*.$$
Now suppose that $C_i \in [r, \infty)$. Since the traffic observed under the all-greedy regime is indistinguishable from a system in which all incoming links have infinite capacity, we must have $\sigma_i^{out} = Q_i^*$ in this case as well. $\Box$
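Taken together, Theorem 3 and Lemma 12 suggest a simple way to estimate $\sigma_i^{out}$ numerically: find the largest backlog each session sees when every session is greedy from time zero. The sketch below does this under stated assumptions (fluid GPS server, infinite incoming link capacities, $\sum_i \rho_i$ below the server rate). The function name `all_greedy_max_backlog` is hypothetical, and the code is an illustration rather than the authors' algorithm.

```python
from typing import List


def all_greedy_max_backlog(sigma: List[float], rho: List[float],
                           phi: List[float], rate: float = 1.0) -> List[float]:
    """Peak backlog of each session when all sessions are greedy from time 0.
    Backlogs are piecewise linear in time, so peaks occur at breakpoints."""
    n = len(sigma)
    if sum(rho) >= rate:
        raise ValueError("need sum(rho) < rate for a finite busy period")
    backlog = list(sigma)          # Q_i(0+) = sigma_i with infinite-capacity links
    peak = list(sigma)
    busy = set(range(n))
    while busy:
        idle_rho = sum(rho[j] for j in range(n) if j not in busy)
        phi_busy = sum(phi[j] for j in busy)
        # Net drain rate: GPS service share minus the token arrival rate.
        net = {i: (rate - idle_rho) * phi[i] / phi_busy - rho[i] for i in busy}
        drain = {i: backlog[i] / net[i] for i in busy if net[i] > 0}
        dt = min(drain.values())
        for i in busy:
            backlog[i] -= net[i] * dt      # grows when net[i] is negative
            peak[i] = max(peak[i], backlog[i])
        for i in [k for k in drain if drain[k] - dt <= 1e-12]:
            backlog[i] = 0.0
            busy.remove(i)
    return peak


if __name__ == "__main__":
    # Session 1 has a small weight, so its backlog keeps growing until
    # session 0 finishes its busy period.
    print(all_greedy_max_backlog([1.0, 1.0], [0.1, 0.6], [3.0, 1.0]))
```

In the example, session 1 is initially served at rate 0.25 while sending at rate 0.6, so its backlog grows beyond $\sigma_1$ before it starts to drain; the returned peak is the value that Lemma 12 identifies with its output burstiness.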
REFERENCES
[1] D. Bertsekas and R. Gallager, Data Networks. Englewood Cliffs, NJ: Prentice Hall, 1991.
[2] A. Birman, P. C. Chang, J. S. C. Chen, and R. Guerin, "Buffer sizing in an ISDN frame relay switch," Tech. Rep. RC14 386, IBM Res., Aug. 1989.
[3] A. Birman, H. R. Gail, S. L. Hantler, Z. Rosberg, and M. Sidi, "An optimal policy for buffer systems," Tech. Rep. RC16 641, IBM Res., Mar. 1991.
[4] I. Cidon, I. Gopal, G. Grover, and M. Sidi, "Real time packet switching: A performance analysis," IEEE J. Select. Areas Commun., vol. SAC-6, pp. 1576-1586, 1988.
[5] R. L. Cruz, "A calculus for network delay. Part I: Network elements in isolation," IEEE Trans. Inform. Theory, vol. 37, pp. 114-131, 1991.
[6] R. L. Cruz, "A calculus for network delay. Part II: Network analysis," IEEE Trans. Inform. Theory, vol. 37, pp. 132-141, 1991.
[7] A. Demers, S. Keshav, and S. Shenker, "Analysis and simulation of a fair queueing algorithm," Internet. Res. and Exper., vol. 1, 1990.
[8] H. R. Gail, G. Grover, R. Guerin, S. L. Hantler, Z. Rosberg, and M. Sidi, "Buffer size requirements under longest queue first," Tech. Rep. RC14 486, IBM Res., Jan. 1991.
[9] S. J. Golestani, "Congestion-free transmission of real-time traffic in packet networks," in Proc. IEEE INFOCOM '90, San Francisco, CA, 1990, pp. 527-536.
[10] S. J. Golestani, "A framing strategy for connection management," in Proc. SIGCOMM '90, 1990.
[11] S. J. Golestani, "Duration-limited statistical multiplexing of delay sensitive traffic in packet networks," in Proc. IEEE INFOCOM '91, 1991.
[12] A. C. Greenberg and N. Madras, "How fair is fair queueing?" J. ACM, vol. 3, 1992.
[13] E. Hahne, "Round robin scheduling for fair flow control," Ph.D. thesis, Dept. Elect. Eng. and Comput. Sci., M.I.T., Dec. 1986.
[14] L. Kleinrock, Queueing Systems Vol. 2: Computer Applications. New York: Wiley, 1976.
[15] C. Lu and P. R. Kumar, "Distributed scheduling based on due dates and buffer prioritization," Tech. Rep., Univ. of Illinois, 1990.
[16] A. K. Parekh, "A generalized processor sharing approach to flow control in integrated services networks," Ph.D. thesis, Dept. of Elect. Eng. and Comput. Sci., M.I.T., Feb. 1992.
[17] A. K. Parekh and R. G. Gallager, "A generalized processor sharing approach to flow control: The multiple node case," Tech. Rep. 2076, Lab. for Inform. and Decision Syst., M.I.T., 1991.
[18] J. R. Perkins and P. R. Kumar, "Stable distributed real-time scheduling of flexible manufacturing systems," IEEE Trans. Aut. Contr., vol. AC-34, pp. 139-148, 1989.
[19] G. Sasaki, "Input buffer requirements for round robin polling systems," in Proc. Allerton Conf. Commun., Contr., and Comput., 1989.
[20] J. Turner, "New directions in communications, or Which way to the information age?" IEEE Commun. Mag., vol. 24, pp. 8-15, 1986.
[21] L. Zhang, "A new architecture for packet switching network protocols," Ph.D. thesis, Dept. Elect. Eng. and Comput. Sci., M.I.T., Aug. 1989.
Abhay K. Parekh (M'92) received the B.E.S. degree in mathematical sciences from Johns Hopkins University, the S.M. degree in operations research from the Sloan School of Management, and the Ph.D. degree in electrical engineering and computer science from the Massachusetts Institute of Technology in 1992. He was involved in private network design as a Member of Technical Staff at AT&T Bell Laboratories from 1985 to 1987. From February to June 1992, he was a Postdoctoral Fellow at the Laboratory for Computer Science at M.I.T., where he was associated with the Advanced Network Architecture Group. In October 1992, he joined the High Performance Computing and Communications Group at IBM as a Scientific Staff Member. His current research interests are in application-driven quality of service for integrated services networks, and in distributed protocols for global client-server computing. While a student at M.I.T., he was a Vinton Hayes Fellow and a Center for Intelligent Control Fellow. A paper from his Ph.D. dissertation, jointly authored with Prof. Robert Gallager, won the INFOCOM '93 best paper award.
Robert G. Gallager (S'58-M'61-F'68) received the B.S.E.E. degree in electrical engineering from the University of Pennsylvania in 1953, and the S.M. and Sc.D. degrees in electrical engineering from the Massachusetts Institute of Technology in 1957 and 1960, respectively. Following two years at Bell Telephone Laboratories and two years in the U.S. Signal Corps, he has been at M.I.T. since 1956. He is currently the Fujitsu Professor of Electrical Engineering and Co-Director of the Laboratory for Information and Decision Systems. His early work was on information theory, and his textbook Information Theory and Reliable Communication (New York: Wiley, 1968) is still widely used. Later research focused on data networks. Data Networks (Englewood Cliffs, NJ: Prentice Hall, 1992), coauthored with D. Bertsekas, helps provide a conceptual foundation for this field. Recent interests include multiaccess information theory, radio networks, and all-optical networks. He has been a consultant at Codex Motorola since its formation in 1962. He was on the IEEE Information Theory Society's Board of Governors from 1965 to 1970 and 1979 to 1988, and was its president in 1971. He was elected a member of the National Academy of Engineering in 1979 and a member of the National Academy of Sciences in 1992. He was the recipient of the IEEE Medal of Honor in 1990, awarded for fundamental contributions to communications coding techniques.
DQDB Networks with and without Bandwidth Balancing Ellen L. Hahne, Member, IEEE, Abhijit K. Choudhury, Member, IEEE, and Nicholas F. Maxemchuk, Fellow, IEEE
Abstract-This paper explains why long distributed queue dual bus (DQDB) networks without bandwidth balancing can have fairness problems when several nodes are performing large file transfers. The problems arise because the network control information is subject to propagation delays that are much longer than the transmission time of a data segment. Bandwidth balancing is then presented as a simple solution. By constraining each node to take only a certain fraction of the transmission opportunities offered to it by the basic DQDB protocol, bandwidth balancing gradually achieves a fair allocation of bandwidth among simultaneous file transfers. We also propose two ways to extend this procedure effectively to multipriority traffic.
I. INTRODUCTION
Paper approved by the Editor for Wide Area Networks of the IEEE Communications Society. Manuscript received October 8, 1989; revised October 30, 1990 and August 6, 1991. This paper was presented in part at IEEE INFOCOM '90, San Francisco, CA, June 1990 and IEEE INFOCOM '91, Bal Harbour, FL, April 1991. The authors are with AT&T Bell Laboratories, Murray Hill, NJ 07974. IEEE Log Number 9201110.
Fig. 1. DQDB network architecture.
The distributed queue dual bus (DQDB) [1], [2] is a metropolitan area network that has recently been standardized by the IEEE [3]. The dual-bus topology is identical to that used in Fasnet [4] and is depicted in Fig. 1. The two buses support unidirectional communications in opposite directions. Nodes are connected to both buses and communicate by selecting the proper bus. In both DQDB and Fasnet a special unit at the head-end of each bus generates slots; however, the protocols for acquiring slots differ significantly. Fasnet uses a protocol similar to that of a token ring, where each station is given an opportunity to transmit in order. DQDB resembles a slotted ring with free access, where stations transmit in every empty slot if they have data. Both token access and free access have performance drawbacks, so Fasnet and DQDB use the channel in the opposite direction from which they are sending data to derive performance improvements over the earlier networks. In a token-passing network, a significant fraction of the bandwidth can be wasted as the token circulates among a small number of active stations. Therefore, Fasnet includes several techniques for using the reverse channel to reduce the token circulation time and to let stations use slots that would otherwise have been wasted. In a slotted ring network with free access, one station may take all the slots and prevent the others from transmitting. This fairness problem is exacerbated when the ring topology is replaced by a bus because the station closest to the bus head-end always has first access to slots. Therefore DQDB uses the reverse channel to reserve slots for stations that are farther from the head-end, as explained in Section II. The aim of DQDB's reservations is to improve access fairness without the bandwidth wastage of a circulating token. Moreover, by associating priority levels with these reservations, DQDB can offer multipriority service.
Unfortunately, the DQDB reservation process is imperfect. The network span (up to 50 km), the transmission rate (assumed to be 150 Mbps in this paper), and the slot size (53 bytes) of DQDB allow many slots to be in transit between the nodes. Therefore, the nodes can have inconsistent views of the reservation process. If this happens, and if the access protocol is too efficient and tries never to waste a slot, then bandwidth can be divided unevenly among nodes simultaneously performing long file transfers. Since the network does not provide the same throughput to all of the nodes, in that sense it is unfair. This is the main problem to be addressed in this paper. In Section III we offer a novel analysis of a closed queueing network model for this scenario. (Other DQDB fairness studies appear in [5]-[31].) In Section V we present a simple enhancement to the basic DQDB protocol, called "bandwidth balancing," that equalizes the throughput. (Other recent fairness proposals can be found in [28]-[37].) Bandwidth balancing intentionally wastes a small amount of bus bandwidth in order to facilitate coordination among the nodes currently using that bus, but it divides the remaining bandwidth equally among those nodes. The key idea (adapted from Jaffe [38]) is that the maximum permissible nodal throughput rate is proportional to the unused bus capacity; each node can determine this unused capacity by observing the volume of busy slots and reservations. The throughput is equalized gradually over an interval several times longer than the propagation delay between competing nodes. Bandwidth balancing is easy to implement: each node
Reprinted from IEEE Transactions on Communications, vol. 40, no. 7, July 1992.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
permits itself to use only a certain fraction of the transmission opportunities offered to it by the basic DQDB protocol. If the traffic is all of one priority, then bandwidth balancing requires no additional control information to be transmitted, and it requires only one additional counter (per bus) in each node.

Bandwidth balancing has been incorporated into the DQDB standard as a required feature that is enabled by default; the ability to disable this feature is also required. In the standard, bandwidth balancing conforms to the priority structure of the basic DQDB protocol, which is explained in Section II below. In particular, a node is aware of the priority levels of incoming reservations, but not the priority levels of the data in busy slots. This asymmetry means that different nodes have different information about the traffic on a bus, making it difficult to control multipriority traffic. The version of bandwidth balancing specified in the standard guarantees equal allocations of bus bandwidth to nodes with traffic of the lowest priority level. A node with higher-priority traffic is guaranteed at least as much bandwidth as a lowest-priority node, but no further guarantees are possible. Furthermore, when there are nodes with different priorities, the throughputs achieved by nodes can depend on their relative positions on the bus [27], [39].

In Sections VI and VII of this paper we propose two better ways to extend the unipriority bandwidth balancing procedure of Section V to multipriority traffic. (Additional proposals appear in [40] and [41].) The first step is to correct the asymmetry of the priority information, either by adding priority information about the data in busy slots, or by removing priority information from the reservations. The former method we call the "global" approach, because the priority information for all traffic is available to all nodes by reading the data and reservation channels. The latter method we call the "local" approach because a node is only aware of the priority of its own locally generated data, and the node does not disseminate this information over the network. Section VI presents a version of bandwidth balancing based on local priority information. (Similar "local" versions of bandwidth balancing have been proposed by Damodaram [42], Spratt [43], and Hahne and Maxemchuk [44], [45].) Section VII presents a version of bandwidth balancing based on global priority information. (A cruder version of this scheme appears in [46].)

Both multipriority schemes presented in this paper produce bandwidth allocations that are independent of the nodes' relative positions on the bus. Moreover, both schemes are fair, i.e., they allocate equal bandwidth shares to all nodes active at the same priority level. The schemes differ in the way they allocate bandwidth across the various priority levels. Either scheme could easily be included in a future version of the standard, and nodes satisfying the old standard could share the same network with the new nodes, provided that the old nodes only generate traffic of the lowest priority level.
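As a rough check on the Introduction's remark that many slots can be in transit between nodes, the arithmetic below uses the figures quoted there (53-byte slots, 150 Mbps, a span of up to 50 km). The propagation delay of 5 µs/km is an assumption typical of cable and fiber, not a value given in the paper.

```latex
\begin{align*}
T_{\text{slot}} &= \frac{53 \times 8\ \text{bits}}{150 \times 10^{6}\ \text{bits/s}} \approx 2.83\ \mu\text{s} \\
T_{\text{prop}} &= 50\ \text{km} \times 5\ \mu\text{s/km} = 250\ \mu\text{s}
  \qquad \text{(assumed propagation delay)} \\
\frac{T_{\text{prop}}}{T_{\text{slot}}} &\approx \frac{250}{2.83} \approx 88\ \text{slots in transit, one way}
\end{align*}
```

Under that assumption, on the order of 88 slots fit on a bus end to end, which is comparable in scale to the internode distance of 50 slots (about 29 km) used for the results in Section III.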
II. DQDB WITHOUT BANDWIDTH BALANCING

DQDB allows some network capacity to be set aside for synchronous services such as voice, but we will assume that no such traffic is present. DQDB supports asynchronous traffic of several priority levels, which for convenience we will number from 1 (least important) through P (most important).¹

DQDB uses the dual-bus topology depicted in Fig. 1. The two buses support unidirectional communications in opposite directions. Nodes are connected to both buses and communicate by selecting the proper bus. The transmission format is slotted, and each bus is used to reserve slots on the other bus in order to make the access fairer. Each slot contains one request bit for each priority level and a single busy bit. The busy bit indicates whether another node has already inserted a segment of data into the slot. The request bits on one bus are used to notify nodes with prior access to the data slots on the other bus that a node is waiting. When a node wants to transmit a segment on a bus, it waits for an empty request bit of the appropriate priority on the opposite bus and sets it, and it waits for an empty slot on the desired bus to transmit the data. The IEEE 802.6 Working Group is currently considering optional procedures for erasing the data from a slot once it passes the destination, so that the slot can be reused [47], [48], [49]. However, this paper assumes that once a data segment has been written into a slot, that data is never erased or overwritten. This paper only discusses how data segments are written onto a bus, since reading data from the bus is straightforward.

The operation for data transmission in both directions is identical. Therefore, for the remainder of this paper, the operation in only one direction is described. One bus will be considered the data bus. Slots on this bus contain a busy bit and a payload of one segment of data. These slots are transmitted from upstream nodes to downstream nodes. The other bus is considered the request bus. Each slot on this bus contains one request bit for each priority level, and slots are transmitted from downstream nodes to upstream nodes.

Fig. 2 shows how a DQDB node operates with bandwidth balancing disabled. We model each node as composed of P sections, one to manage the writing of requests and data for each priority level. We assume that the sections act like separate nodes: each section has its own attachments to the buses, and the data bus passes through the sections in order, starting with priority 1, while the request bus passes through the sections in the opposite order, starting with priority P. While this layout does not correspond to the actual physical implementation of DQDB,² for the purpose of this paper the two are functionally equivalent.³

¹ The DQDB standard calls for three priority levels, i.e., P = 3, but it labels them differently: 0, 1, 2.

² In an actual DQDB node, the sections of the various priority levels all read the request bus at the same place, before any of them has a chance to write. Write-conflicts on the request bus are impossible, though, because the priority-p request bit can only be set by the priority-p node section. There is some internal linkage among the sections: whenever a section generates a request, it notifies all lower-priority sections within the node as well as writing the request onto the bus. Similarly, all sections of the node read the data bus at the same place, before any of them can write. Nevertheless, because of the internal linkage just described, the protocol can guarantee that two node sections will not try to write data into the same empty slot.

³ Figs. 2, 4, 8, and 10 and associated text are intended to be functional descriptions. The physical implementation of these ideas will not be discussed in this paper.
Fig. 2. DQDB node architecture without bandwidth balancing.

In Fig. 2 the details of the priority-p section are shown. This section has a local FIFO queue to store priority-p data segments generated by local users while these segments wait for the data inserter (DI) to find the appropriate empty slots for them on the data bus. The data inserter operates on one local data segment at a time; once the local FIFO queue forwards a segment to the data inserter, the local FIFO queue may not forward another segment until the data inserter has written the current segment onto the data bus. When the data inserter takes a segment from the local FIFO queue, first it orders the request inserter (RI) to send a priority-p request on the request bus. Then the data inserter determines the appropriate empty slot for the local segment by inserting the segment into the data inserter's transmit queue (TQ). All the other elements of this queue are requests of priority p or greater from downstream nodes. (The data inserter ignores all requests of priority less than p.) The transmit queue orders its elements according to their priority level, with elements of equal priority ordered by the times they arrived at the data inserter. The data inserter serves its transmit queue whenever an empty slot comes in on the data bus. If the element at the head of the queue is a request, then the data inserter lets the empty slot pass. If the head element is the local data segment, then the busy bit is set and the segment is transmitted in that slot.

The transmit queue is implemented with two counters, called the request counter and the countdown counter. When there is no local data segment in the queue, the request counter keeps track of the number of unserved reservations from downstream nodes in the transmit queue. When the data inserter accepts a local data segment, the request counter value is moved to the countdown counter, which counts the number of reservations that are ahead of the local data segment in the transmit queue, and the request counter is then used to count reservations behind the local data segment.

The request inserter sends one reservation of priority p for each data segment taken by the data inserter from the local FIFO queue. Since the incoming priority-p request bits may have been set already by downstream nodes, the request inserter sometimes needs to queue the internally generated reservations until vacant request bits arrive. Thus, it is possible for a data segment to be transmitted before its reservation is sent.

Perfect operation of the DQDB protocol without bandwidth balancing would occur if the system had no propagation or processing delays, and if it included an idealized reservation channel with no slotting, queueing, or transmission delays. Under these conditions [50]:

• Slots are never wasted.
• The priority mechanism is absolute (i.e., a data segment can only be transmitted when there are no higher priority segments waiting anywhere in the network).
• Nodes with traffic at the current highest priority level are served one-segment-per-node in round-robin fashion.

However, if the propagation delay between nodes is much longer than the transmission time of a data segment, then performance deteriorates. This is the subject of the next section.
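The two-counter transmit queue described above can be sketched for a single priority level as follows. This is a minimal illustration under simplifying assumptions, not the standard's implementation: the class and method names are hypothetical, the request inserter and the multipriority ordering are omitted, and each call stands for one event seen by the data inserter.

```python
class DataInserter:
    """Single-priority sketch of the request/countdown counter pair (hypothetical names)."""

    def __init__(self):
        self.request_counter = 0  # reservations not ahead of any queued local segment
        self.countdown = None     # reservations still ahead of the local segment; None = no segment held

    def on_request(self):
        """A request bit set by a downstream node passes on the request bus."""
        self.request_counter += 1

    def accept_local_segment(self):
        """The local FIFO hands over a segment; reservations seen so far go ahead of it."""
        if self.countdown is not None:
            raise RuntimeError("data inserter already holds a local segment")
        self.countdown = self.request_counter
        self.request_counter = 0

    def on_empty_slot(self):
        """An empty slot arrives on the data bus. Returns True if the node transmits in it."""
        if self.countdown is None:
            # No local segment: the slot serves one downstream reservation, if any.
            if self.request_counter > 0:
                self.request_counter -= 1
            return False
        if self.countdown > 0:
            self.countdown -= 1   # let the slot pass for a reservation ahead of the segment
            return False
        self.countdown = None     # the local segment is at the head of the queue
        return True


# Usage: two reservations arrive, then a local segment, then one more reservation.
node = DataInserter()
node.on_request()
node.on_request()
node.accept_local_segment()
node.on_request()
decisions = [node.on_empty_slot() for _ in range(4)]
```

In this run the node lets two empty slots pass for the earlier reservations, transmits in the third, and uses the fourth to serve the reservation that arrived after its own segment.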
III. THROUGHPUT FAIRNESS OF DQDB WITHOUT BANDWIDTH BALANCING
When the network propagation delay is larger than a slot transmission time, the DQDB access protocol without bandwidth balancing is unfair, in the sense that nodes simultaneously performing large file transfers can obtain different throughputs. The severity of the problem depends upon the propagation delay, the network utilization, and the lengths of the messages submitted to the network. In this section, we assume that users often submit messages consisting of a great many segments. This model (suggested to us by M. Rodrigues) seems to be increasingly appropriate as diskless workstations abound, because large files are typically transferred between a workstation and its file server. Network "overloads" are caused by as few as two users simultaneously performing file transfers and hence could be quite typical. We model an overloaded DQDB system as a closed network of queues and study its fairness through a novel approximate analysis. We will examine scenarios similar to those explored in Wong's study [10] of an earlier version of DQDB. Consider two nodes that are transmitting very long messages of the same priority. Call the upstream node 1 and the downstream node 2. Ideally, each node should obtain half the bandwidth of the data channel, but this rarely happens. Suppose that the propagation delay between the nodes equals D slot transmission times where D is an integer. Let $\Delta$ be the difference in the starting times of the two nodes, i.e., the time when node 2 wants to begin transmission minus the time when node 1 is ready; $\Delta$ is measured in slot times and is assumed to be an integer. Once both nodes are active, node 1 leaves slots idle only in response to requests from node 2. Therefore, once node 2 begins to receive segments transmitted by node 1, the only idle slots node 2 receives are in response to its earlier requests. Each idle slot received by node 2 results in a segment being transmitted, a new segment being queued, and a new reservation being transmitted.
Therefore, the number X of requests plus idle slots circulating between the two nodes is fixed. (Some of these requests may be stored in node 1's transmit queue.) Let us call these conserved entities permits. This quantity X determines the throughput of the downstream
node. Unfortunately, X depends strongly on D and $\Delta$:⁴

$$ X = 1 + D - c(\Delta) \tag{3.1} $$

where c is a function that clips its argument to the range [-D, D]. To clarify this claim, let us explain the system behavior for extreme values of $\Delta$. If the first segment from node 1 has already been received at node 2 by the time node 2 becomes active, i.e., if $\Delta \ge D$, then node 2 inserts one data segment in its transmit queue and transmits one reservation. The segment will not be transmitted until the reservation is received by node 1 and an idle slot is returned. In this instance, there is one permit in the network. At the other extreme, consider $\Delta \le -D$. Initially, only node 2 is active. It inserts its first segment in its transmit queue and sends its first reservation upstream. The first segment is transmitted immediately in the first slot. Then the second segment is queued, the second reservation sent, and the second segment is transmitted in the second slot, etc. The request channel is already carrying D requests when node 1 begins transmission, and in the D time slots that it takes for node 1's first segment to reach node 2, node 2 injects another D requests, so that $X \approx 2D$.

Now we will show the relationship between X and the nodal throughputs. Recall that permits can be stored in the request channel, in the data channel, and in the transmit queue of the upstream node. This transmit queue also includes a single data segment from node 1. When the second file transfer begins, there is a transient phase in which the transmit queue length moves to a steady-state average value Q. More precisely, we should distinguish between Q(1), the average queue length observed by a data segment from node 1 just after it has been inserted into node 1's transmit queue, and Q(2), the average queue length observed by a request (permit) from node 2 just after it has been inserted into node 1's transmit queue. The difference between these two views of the queue will be explained shortly. The network's steady-state behavior can be determined approximately by simultaneously solving the following equations involving X, Q(1), Q(2), the nodal throughput rates r(1) and r(2), and the average round-trip delay T experienced by a permit. (Throughput rates are measured in segments per slot time, and round-trip delays are measured in slot times.)

$$ r(1) + r(2) = 1 \tag{3.2} $$
$$ r(1) = 1/Q(1) \tag{3.3} $$
$$ r(2) = X/T \tag{3.4} $$
$$ T = 2D + Q(2) \approx 2D + Q(1). \tag{3.5} $$

Before solving the equations above, let us discuss the approximation in the last equation. The difference between Q(1) and Q(2) is most pronounced when the internode distance D is large and only one permit circulates. In this case, the queue length observed by the permit is always two (itself plus node 1's data segment), so Q(2) = 2. The queue length observed by the data segment, however, is usually one (itself) and occasionally two (itself plus the permit), so $Q(1) \approx 1$. Even though Q(1) and Q(2) differ by almost a factor of two in this example, recall that D is large, so that the approximation shown above for the round-trip delay T is still justified.

Solving (3.1)-(3.5) for the steady-state throughput rates yields:

$$ r(1) \approx \frac{2}{2 - D - c(\Delta) + \sqrt{(D - c(\Delta) + 2)^2 + 4 D\, c(\Delta)}} $$
$$ r(2) = 1 - r(1). $$

Note that if the nodes are very close together (D = 0) or if they become active at the same time ($\Delta = 0$), then each node gets half the bandwidth. However, if D is very large and the downstream node starts much later, its predicted throughput rate is only about 1/(2D). Node 1 is also penalized for starting late (though not as severely as node 2): the worst case upstream rate is roughly $1/\sqrt{2D}$.

The predicted throughputs match simulated values very well. Fig. 3 compares our approximate analysis with simulation results for an internode distance of D = 50 slots ≈ 29 km. The analysis can easily be generalized to multiple nodes, provided that these nodes are clustered at only two distinct bus locations.

Fig. 3. Bandwidth division for two users without bandwidth balancing. (Throughput versus $\Delta$ = downstream node start time minus upstream node start time; analysis and simulation.)

The analysis and simulation studies in this section show perfect fairness when the propagation delay is negligible. However, moderate unfairness can be demonstrated even for D = 0 if the timing assumptions in the previous footnote are changed [51], [52]. Hence, bandwidth balancing may prove useful for fairness enhancement even on short networks.

⁴ The analysis depends on some detailed timing assumptions. We have assumed that the bus synchronization is such that a node reads a busy bit on the data channel immediately after reading a request bit on the reservation channel. We also assume that there are no processing delays in the nodes, and that a node is permitted to insert a new data segment into the transmit queue and send the corresponding reservation as soon as it begins to transmit the previous data segment.

IV. DEFINITIONS

This section gives various background assumptions and definitions to be used in the remainder of this paper. Recall that we are focusing on data transmission over one bus only. (Of course, the other bus is needed to carry requests for the use of the primary bus.) The term parcel will be used to denote the traffic originating at one node at one priority level for transmission over one bus. All traffic rates will be measured in data segments per slot time. We will usually assume that the
traffic demand of each node n has some fixed rate $\rho(n)$. This offered load may be stochastic, as long as it has a well-defined average rate. The offered load of the traffic parcel of priority level p at node n will be denoted $\rho_p(n)$. It is possible that not all this offered load can be carried. The actual long-term average throughput of node n will be denoted by r(n), and that of its parcel p by $r_p(n)$. The unused bus capacity will be denoted by U:

$$ U = 1 - \sum_m r(m) = 1 - \sum_m \sum_q r_q(m) $$

while $U_{p+}$ will be the bus capacity left over by parcels of priority p and greater:

$$ U_{p+} = 1 - \sum_m \sum_{q \ge p} r_q(m). $$

Of course, any individual node n at any instant t will not have direct knowledge of the long-term average rates defined above. All the node can see is: B(n, t), the rate of busy slots coming into node n at time t from nodes upstream; R(n, t), the rate of requests coming into node n at time t from nodes downstream; and S(n, t), the rate at which node n serves its own data segments at time t. In one of our proposed protocols, the node can break these observations down by priority level p, in which case they are denoted $B_p(n, t)$, $R_p(n, t)$, and $S_p(n, t)$. These observations can be used to determine U(n, t), the bus capacity unallocated by node n at time t. By "unallocated," we mean the capacity that is neither used by nodes upstream of n, nor requested by nodes downstream of n, nor taken by node n itself:

$$ U(n, t) = 1 - B(n, t) - R(n, t) - S(n, t) = 1 - \sum_q B_q(n, t) - \sum_q R_q(n, t) - \sum_q S_q(n, t). $$

If node n can observe priority levels, then it can also measure $U_{p+}(n, t)$, the bus capacity not allocated by node n at time t to parcels of priority p or greater:

$$ U_{p+}(n, t) = 1 - \sum_{q \ge p} B_q(n, t) - \sum_{q \ge p} R_q(n, t) - \sum_{q \ge p} S_q(n, t). $$

All the access control protocols described in this paper have a parameter M called the bandwidth balancing modulus;⁵ in some schemes the modulus is different for each priority level p and is denoted $M_p$. For convenience we assume that the bandwidth balancing moduli are integers, though rational numbers could also be used.

Finally, let us define an alternative queueing discipline called deference scheduling for the data inserter's transmit queue. With deference scheduling, the transmit queue for the node section of priority p still holds at most one local data segment at a time, still accepts no requests of priority less than p, and still accepts all requests of priority p or greater. The difference is that all requests (of priority p or greater) are served before the local data segment, even those priority-p requests that entered the transmit queue after the local data segment. The local data segment is served only when there are no requests (of priority p or greater) in the transmit queue. Deference scheduling uses a request counter but needs no countdown counter. By itself, deference scheduling makes little sense because it offers no guarantee that the local data segment in the transmit queue will ever be served. This discipline does make sense, however, when used in conjunction with bandwidth balancing, as explained in the next section. To contrast with deference scheduling, we will call the two-counter discipline of Section II distributed queueing.

⁵ In the DQDB standard the bandwidth balancing modulus is tunable, with a default value of 8.

V. BANDWIDTH BALANCING FOR UNIPRIORITY TRAFFIC

We contend that the unfairness problem discussed in Section III arises because the DQDB protocol pushes the system too hard. If it attempts to use every slot on the bus, DQDB can inadvertently lock the network into unfair configurations for the duration of a sustained overload. In this section, we introduce the concept of bandwidth balancing: the protocol of Section II is followed, except that a node takes only a fraction of the slots that are not reserved or busy. This rate control mechanism lets the system relax a bit so it can work as intended. In this section, we focus on traffic of one priority level only. In Sections VI and VII, we propose ways to extend bandwidth balancing effectively to multipriority traffic.

Section V-A presents our definition of fairness and the bandwidth balancing concept. In Section V-B, the implementation of unipriority bandwidth balancing is described. It requires only one extra counter, and either distributed queueing or deference scheduling may be used. The performance of bandwidth balancing is investigated in Section V-C through analysis and simulation. There we show the existence of a tradeoff between bus utilization and the rate of convergence to a fair operating point. In this paper, we consider only throughput performance during overloads involving two or three active nodes. Other simulation studies of bandwidth balancing using more nodes, different traffic models, and a variety of performance measures appear in [24]-[27].
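To put numbers on the unfairness that motivates this section, the closed-form rates obtained in Section III from (3.1)-(3.5) can be evaluated directly. The sketch below is illustrative only; the function names are hypothetical, and it inherits every approximation of that analysis (two nodes, very long messages, integer D and Δ).

```python
import math


def clip(delta, d):
    """The clipping function c of (3.1): restrict delta to the range [-d, d]."""
    return max(-d, min(d, delta))


def two_node_throughput(d, delta):
    """Approximate steady-state rates (r1, r2) without bandwidth balancing.

    d     : propagation delay between the two nodes, in slot times
    delta : start time of node 2 minus start time of node 1, in slot times
    """
    c = clip(delta, d)
    r1 = 2.0 / (2.0 - d - c + math.sqrt((d - c + 2.0) ** 2 + 4.0 * d * c))
    return r1, 1.0 - r1


# Usage: the D = 50 slot case, with the downstream node starting much later,
# at the same time, and much earlier than the upstream node.
late = two_node_throughput(50, 70)
same = two_node_throughput(50, 0)
early = two_node_throughput(50, -70)
```

With D = 50, a late-starting downstream node is held to about 1/(2D) of the bus, simultaneous starts give an even split, and a late-starting upstream node gets roughly 1/sqrt(2D), in line with the limits stated in Section III.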
A. Concept
When the bus is overloaded, we want to divide its bandwidth fairly. Our ideal of throughput fairness is that nodes with sufficient traffic to warrant rate control should all obtain the same throughput, called the control rate. As discussed in Section III, the performance of the DQDB protocol without bandwidth balancing can diverge from this ideal when the bus is long. One obvious solution is a centralized approach where nodes inform some controller node of their offered loads; the controller then computes the control rate and disseminates it. We will present an alternative way to do this, one that requires no controller node, no offered load measurements, and no explicit communication of the control rate. Our method intentionally wastes a small amount of bus bandwidth, but evenly divides the remaining bandwidth among the nodes. The key idea is that the control rate is implicitly communicated through the idle bus capacity; since each node can
determine this quantity by observing the passing busy bits from upstream and request bits from downstream, control coordination across the system can be achieved. This idea has also been suggested for congestion control in wide-area mesh-topology networks [38], but the problem there is more complex, since flow control must also be coordinated along multihop paths; in our case, the implementation is quite simple, adding very little to the complexity of DQDB. More specifically, each node limits its throughput to some multiple M of the unused bus capacity; nodes with less demand than this may have all the bandwidth they desire:
$$ r(n) = \min[\rho(n),\, M \cdot U] = \min\Big[\rho(n),\, M \cdot \Big(1 - \sum_m r(m)\Big)\Big]. \tag{5.1} $$
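Because (5.1) defines the carried loads only implicitly (the unused capacity U depends on the rates it limits), a small numerical sketch may help. The approach below is one possible way to solve it, not a method taken from the paper: it bisects on U, using the fact that 1 - U minus the total claimed rate decreases as U grows. The function name and the example loads are illustrative.

```python
def carried_loads(offered, modulus, iterations=60):
    """Solve (5.1) for the carried loads by bisection on the unused capacity U.

    offered : offered loads rho(n), in segments per slot time
    modulus : bandwidth balancing modulus M
    Returns (list of carried loads r(n), unused capacity U).
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        u = (lo + hi) / 2.0
        claimed = sum(min(p, modulus * u) for p in offered)
        if 1.0 - claimed > u:
            lo = u  # more capacity is left over than this guess allows: U is larger
        else:
            hi = u
    u = (lo + hi) / 2.0
    return [min(p, modulus * u) for p in offered], u


# Usage: three nodes with M = 9; the node offering 0.24 is below the control rate.
loads, unused = carried_loads([0.24, 0.40, 0.50], 9)
```

For these inputs the two heavier nodes are both limited to M times the unused capacity, so they end up with equal shares while the light node is carried in full.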
This scheme is fair in the sense that all rate-controlled nodes get the same bandwidth. Given the offered loads $\rho(n)$ and the bandwidth balancing modulus M, (5.1) can be solved for the carried loads r(n). If there are N rate-controlled nodes, then the throughput of each is

$$ r(n) = \frac{M}{1 + M \cdot N} \cdot (1 - S) \tag{5.2} $$

and the total bus utilization is $(S + M \cdot N)/(1 + M \cdot N)$, where S is the utilization due to the nodes that are not rate-controlled. (It takes some trial and error to determine which nodes are rate-controlled.) The worst case bandwidth wastage is 1/(1 + M), which occurs when only one node is active. For example, if there are three nodes whose average offered loads are 0.24, 0.40, and 0.50 segments per slot time and if M = 9, then only the last two nodes are rate-controlled, the carried loads are 0.24, 0.36, and 0.36, respectively, and the wasted bandwidth is 0.04. One desirable feature of this scheme is that it automatically adapts to changes in network load.

B. Implementation
In order to implement unipriority bandwidth balancing, the slot header need only contain the busy bit and a single request bit. In theory, a node can determine the bus utilization by summing the rate of busies on one bus, the rate of requests
on the other bus, and the node's own transmission rate. In the long run, this sum should be the same at every node (though the individual components will differ from node to node). In other words, each node n has enough information available to implement (5.1). Fortunately, it is not necessary for the node to measure the bus utilization rate over some lengthy interval. As the analysis and simulation of Section V-C will show, it is sufficient for node n to respond to arriving busy bits and request bits in such a way that

$$ S(n, t) \le M \cdot U(n, t) = M \cdot [1 - B(n, t) - R(n, t) - S(n, t)] \tag{5.3} $$

or equivalently:

$$ S(n, t) \le \frac{M}{1 + M} \cdot [1 - B(n, t) - R(n, t)]. \tag{5.4} $$

Fig. 4. Implementation of bandwidth balancing.
In other words, the node takes only a fraction M/(1 + M) of the slots that are not reserved or busy at any point in time. One simple way to implement (5.3) and (5.4) is to add a bandwidth balancing counter (BC) to the data inserter, as shown in Fig. 4. The bandwidth balancing counter counts local data segments transmitted on the bus. After M segments have been transmitted, the bandwidth balancing counter resets itself to zero and generates a signal that the data inserter treats exactly like a request from a downstream node. This artificial request causes the data inserter to let a slot go unallocated. (The request inserter is not aware of this signal; hence the node does not send any extra requests upstream corresponding to the extra idle slots it sends downstream.)

The data inserter may use either distributed queueing or deference scheduling in serving its transmit queue. (Since bandwidth balancing by all nodes ensures some spare system capacity, the local data segment in the transmit queue will eventually be served, regardless of the queue's scheduling discipline.) The advantages of deference scheduling are that it uses one counter (rather than two) and that it is easier to analyze, as we shall show shortly. However, we prefer distributed queueing for the following reasons.

1) While we will show that both versions of bandwidth balancing have the same throughput performance under sustained overload, the delay performance under moderate load is frequently better with distributed queueing [24].

2) Many DQDB networks will have no significant fairness problems (e.g., if the buses are short, or if the transmission rate is low, or if the application is point-to-point rather than multi-access). In these cases, one would want to disable bandwidth balancing (because it wastes some bandwidth) and use the DQDB protocol as described in Section II, which works only with distributed queueing. It is convenient to build one data inserter that can be used with or without bandwidth balancing, and this would have to be the distributed queueing version.
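The bandwidth balancing counter of this subsection can be sketched as follows. For brevity the sketch assumes the simpler deference scheduling variant (a single request counter), a single priority level, and a node that always has data to send; the class and method names are hypothetical, and the request inserter is omitted.

```python
class BalancedInserter:
    """Deference-scheduling data inserter with a bandwidth balancing counter (sketch)."""

    def __init__(self, modulus):
        self.modulus = modulus      # bandwidth balancing modulus M
        self.request_counter = 0    # genuine plus artificial requests awaiting service
        self.bwb_counter = 0        # local segments sent since the last artificial request

    def on_request(self):
        """A request from a downstream node arrives on the request bus."""
        self.request_counter += 1

    def on_empty_slot(self, has_local_segment=True):
        """An empty slot arrives on the data bus. Returns True if the node transmits in it."""
        if self.request_counter > 0:
            self.request_counter -= 1   # deference: every request is served first
            return False
        if not has_local_segment:
            return False
        self.bwb_counter += 1
        if self.bwb_counter == self.modulus:
            self.bwb_counter = 0
            self.request_counter += 1   # artificial request: the next empty slot is let go
        return True


# Usage: a lone, always-busy node with M = 8 over 900 empty slots.
node = BalancedInserter(8)
sent = sum(node.on_empty_slot() for _ in range(900))
```

With no other traffic the node transmits in 8 of every 9 slots, which is the fraction M/(1 + M) of available slots given by (5.4); the value M = 8 is the default modulus that the paper attributes to the DQDB standard.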
C. Performance
1) Analysis: This section analyzes the transient behavior of bandwidth balancing. In preparation, we first present our modeling assumptions and a useful bound on the value of a node's request counter. Suppose the propagation delays between nodes are all integer numbers of slot transmission times. Let the propagation delay from the most upstream node to the most downstream node be $D_{MAX}$ slot times. Assume
the bus synchronization is such that a node reads a busy bit on the data channel immediately after reading a request bit on the reservation channel. Let the nodes use deference scheduling. To simplify the counting, imagine that deference scheduling serves a node's transmit queue as follows: first all genuine requests [i.e., those from downstream nodes) are served in order of arrival, then the artificial request from the node's bandwidth balancing counter is served, then the node's local data segment is served. This viewpoint yields the following two upper bounds on the time when a genuine request r is served at a node n: (i) the time when r is served at the node immediately upstream from n, plus the one-way nodeto-node propagation delay; (ii) the arrival time of 1" at n, plus the round-trip propagation delay between n and the network's upstream end. (Bound (i) can be proved by induction on the requests. Bound (ii) follows from (i) and induction on the nodes.) Bound (ii) shows that the number of genuine requests in a node's transmit queue can be at most 2D MAX + 1. Adding in one possible artificial request from the bandwidth balancing counter bounds the request counter value at 2D rvl AX. + 2. We will now offer an approximate analysis of bandwidth balancing during simultaneous file transfers by two nodes separated by a propagation delay of D slots. Call the upstream node 1 and the downstream node 2. We will show that the bandwidth balancing scheme converges to the steadystate throughputs given by (5.2), independent of the initial conditions created by the previous history of the system. We will also determine the rate of convergence, Although the analysis assumes deference scheduling, simulations will show that the distributed queueing implementation also achieves the desired steady-state throughputs. First we show that the request counters of both active nodes drain rapidly. 
Suppose that both file transfers have started, and that all other nodes have been inactive for at least $D_{MAX}$ slot times, so that the effects of these other nodes have disappeared from the buses (though not necessarily from the request counters). In every $M + 1$ slot times, nodes 1 and 2 each transmit at most $M$ data segments and at most $M$ requests, i.e., node 1 leaves at least one idle data slot and node 2 leaves at least one vacant request bit. Each of these holes gives the other node a chance to decrement its request counter. Since the request counter values started at $2D_{MAX} + 2$ or less, they will drain to zero within $(2D_{MAX} + 2) \cdot (M + 1)$ slot times, and thereafter they will never increase above one.

Now we can show how the throughput rates converge to a fair allocation. Assume that at time 0, the request counters have already drained. Also assume that a fraction $f_B$ of the $D$ busy bits in transit between nodes 1 and 2 on the data bus and a fraction $f_R$ of the $D$ request bits in transit between the nodes are set. For convenience, define

$$\alpha = \frac{M}{1 + M}.$$
Each node transmits in a fraction $\alpha$ of the idle slots available to it for its own data transmission. Consequently, in the first $D$ slot times, node 1 will transmit in $\alpha(1 - f_R)D$ slots and node 2 will transmit in $\alpha(1 - f_B)D$ slots. In the next $D$ slot times, node 1 transmits in $\alpha[1 - \alpha(1 - f_B)]D$ slots, while node 2 transmits in $\alpha[1 - \alpha(1 - f_R)]D$ slots. The throughput of a node over half a round-trip time depends on the other node's throughput in the previous half round-trip time. (This analysis is approximate; in the interval $D$ a node actually acquires an integer number of slots.)

Let $\gamma(1,k)$ and $\gamma(2,k)$ be the fraction of the bandwidth acquired by nodes 1 and 2, respectively, during slots $kD$ to $(k+1)D$, where $k = 0, 1, 2, \ldots$. The analyses for the two nodes are similar and we shall concentrate on the bandwidth acquired by node 1. Consider the sequence $\gamma(1,k)$ to be composed of two subsequences: a subsequence of even terms $\gamma^e(1,m) = \gamma(1,2m)$ and a subsequence of odd terms $\gamma^o(1,m) = \gamma(1,2m+1)$, for $m = 0, 1, 2, \ldots$. Both subsequences $\gamma^e(1,m)$ and $\gamma^o(1,m)$ satisfy the same difference equation, for $m = 1, 2, 3, \ldots$,

$$\gamma^e(1,m) = \alpha(1 - \alpha) + \alpha^2 \gamma^e(1,m-1)$$
$$\gamma^o(1,m) = \alpha(1 - \alpha) + \alpha^2 \gamma^o(1,m-1)$$

but they have different initial conditions:

$$\gamma^e(1,0) = \alpha(1 - f_R), \qquad \gamma^o(1,0) = \alpha[1 - \alpha(1 - f_B)].$$

The throughput of node 1 over half round-trip times can be found by separate Z-transform analyses of the even and odd subsequences:

$$\gamma(1,k) = \begin{cases} \dfrac{\alpha}{1+\alpha} - \left[f_R - \dfrac{\alpha}{1+\alpha}\right]\alpha^{k+1}, & k \text{ even} \\[2ex] \dfrac{\alpha}{1+\alpha} + \left[f_B - \dfrac{\alpha}{1+\alpha}\right]\alpha^{k+1}, & k \text{ odd} \end{cases} \tag{5.5}$$
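Similarly, the throughput of node 2 over half round-trip times is given by

$$\gamma(2,k) = \begin{cases} \dfrac{\alpha}{1+\alpha} - \left[f_B - \dfrac{\alpha}{1+\alpha}\right]\alpha^{k+1}, & k \text{ even} \\[2ex] \dfrac{\alpha}{1+\alpha} + \left[f_R - \dfrac{\alpha}{1+\alpha}\right]\alpha^{k+1}, & k \text{ odd} \end{cases} \tag{5.6}$$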
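The closed forms (5.5) and (5.6) can be cross-checked against the half-round-trip coupling described above, in which each node takes a fraction α of whatever the other node left unused in the previous interval. The following Python sketch does this with exact rational arithmetic. The function names and the particular values of f_B and f_R are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction


def iterate_throughputs(alpha, f_b, f_r, steps):
    """Iterate the coupling between the two nodes over `steps` half round trips.

    gamma(1, k) = alpha * (1 - gamma(2, k - 1)), and symmetrically for node 2.
    """
    g1 = [alpha * (1 - f_r)]   # node 1 defers to the request bits already in transit
    g2 = [alpha * (1 - f_b)]   # node 2 sees the busy bits already in transit
    for _ in range(1, steps):
        prev1, prev2 = g1[-1], g2[-1]
        g1.append(alpha * (1 - prev2))
        g2.append(alpha * (1 - prev1))
    return g1, g2


def closed_form(alpha, f_b, f_r, k):
    """Evaluate (5.5) and (5.6) for interval k; returns (gamma1, gamma2)."""
    steady = alpha / (1 + alpha)
    decay = alpha ** (k + 1)
    if k % 2 == 0:
        return steady - (f_r - steady) * decay, steady - (f_b - steady) * decay
    return steady + (f_b - steady) * decay, steady + (f_r - steady) * decay


if __name__ == "__main__":
    alpha = Fraction(9, 10)                      # corresponds to M = 9
    f_b, f_r = Fraction(1, 3), Fraction(1, 4)    # arbitrary illustrative initial conditions
    g1, g2 = iterate_throughputs(alpha, f_b, f_r, 12)
    for k in range(12):
        assert (g1[k], g2[k]) == closed_form(alpha, f_b, f_r, k)
    print([round(float(x), 4) for x in g1])
```

Because the iteration and the closed form agree term by term, the sketch also makes it easy to try other initial conditions and watch both sequences approach α/(1 + α).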
We can use the model developed above to analyze various possible scenarios in the simultaneous transfer of two files, some of which are listed below.
• Both nodes turn on at the same time: $f_B = f_R = 0$.
• The upstream node turns on at least half a round-trip time before the downstream node: $f_B = \alpha$, $f_R = 0$.
• The downstream node turns on at least half a round-trip time before the upstream node: $f_B = 0$, $f_R = \alpha$.

The approximate throughput expressions are found to match simulation results reasonably well. Fig. 5 compares the analysis with simulation results for the case where the two active nodes are separated by 38 slots (≈ 22 km), the upstream node starts transmitting at least half a round-trip time before the downstream node, and $\alpha = 0.9$. The plotted throughputs are measured over successive full round-trip times (i.e., successive 76 slot intervals). Simulation results are shown for both deference scheduling and distributed queueing.

Fig. 5. Throughputs for two users with bandwidth balancing ($\alpha = 0.9$, $M = 9$). [Plot not reproduced. Axes: average throughput versus time, in units of the round-trip delay between the upstream and downstream nodes. Curves: analysis with deference scheduling; simulation with deference scheduling; simulation with distributed queueing.]

Let us make a few remarks on (5.5) and (5.6). Note that in steady state the nodal throughputs are each $\alpha/(1+\alpha) = M/(1+2M)$ and the amount of system bandwidth wasted is $(1-\alpha)/(1+\alpha) = 1/(1+2M)$, in accord with (5.2). For example, if $\alpha = 0.9$, 5.3% of the bandwidth is wasted. Note, moreover, that the steady-state nodal throughputs are independent of the initial conditions $f_B$ and $f_R$, in marked contrast to the behavior of DQDB without bandwidth balancing, shown in Fig. 3. Finally note that, while the exact transient depends on $f_B$ and $f_R$, the rate of convergence depends only on $\alpha$. For example, if $\alpha = 0.9$, then the error (i.e., the unfairness) in each nodal throughput $\gamma(n,k)$ shrinks by a factor of $0.9^{22} \approx 0.1$ every $22D$ slot times. In other words, each nodal throughput moves 90% of the way to its steady-state value every 11 round-trip times. A lower $\alpha$ results in faster convergence but more bandwidth wastage. The effect of different values of $\alpha$ on the convergence rate and on the steady-state throughputs is shown in Fig. 6, for the same scenario as Fig. 5.

Fig. 6. Throughput performance of bandwidth balancing for various values of $\alpha$. [Plot not reproduced. Axes: average throughput versus time, in units of the round-trip delay between the upstream and downstream nodes. Curves: $\alpha = 0.95$ ($M = 19$); $\alpha = 0.90$ ($M = 9$); $\alpha = 0.80$ ($M = 4$).]

2) Simulation: Fig. 7 depicts simultaneous file transfers by three nodes, with 28 slots (≈ 16 km) between successive nodes. The plot shows the average nodal throughputs measured over successive 112 slot intervals. Bandwidth balancing is used, with $M = 9$. The system starts in an idle state. The most upstream node comes up first and immediately achieves a throughput of 9/10, in accord with (5.2). The most downstream node turns on next and contends with the upstream node for an equal share of the bandwidth, viz., 9/19. The middle node turns on next, and the system again adjusts so that all three nodes achieve the throughput of 9/28 predicted by (5.2). The most downstream node and then the middle node complete their file transfers, and in each case the system adjusts rapidly and redistributes the available bandwidth equally. Note that the amount of wasted bandwidth decreases as the number of active nodes increases. (The simulation for Fig. 7 used distributed queueing; the simulation was also performed with deference scheduling and the results were virtually indistinguishable.)

Fig. 7. Throughputs for three users with bandwidth balancing ($M = 9$). [Plot not reproduced. Axes: average throughput versus time, in units of the round-trip delay between the upstream and downstream nodes.]

An interesting feature of bandwidth balancing is that nodes whose offered load is less than the control rate are not rate-controlled. The remaining bandwidth is distributed equally among the rate-controlled nodes, in accord with (5.2). The simulation results in Table I show the distribution of bandwidth among three active nodes, when the upstream and downstream nodes are involved in long file transfers and the middle node is a low-rate user, with either Poisson or periodic segment arrivals. The results are the same whether distributed queueing or deference scheduling is used.
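The steady-state figures quoted in this subsection can be reproduced with a few lines of arithmetic. The sketch below assumes that (5.2), which is not restated here, amounts to each rate-controlled node taking M times the spare capacity U, so that N rate-controlled nodes sharing the bus with low-rate users of total load ρ each obtain M(1 − ρ)/(1 + N·M). The function names are illustrative only.

```python
from fractions import Fraction


def controlled_share(modulus, n_controlled, low_rate_load=Fraction(0)):
    """Steady-state throughput of each rate-controlled node.

    Assumption: each rate-controlled node carries r = M * U, where the spare
    capacity is U = 1 - low_rate_load - n_controlled * r. Solving for r gives
    the expression returned below.
    """
    return Fraction(modulus) * (1 - low_rate_load) / (1 + n_controlled * modulus)


def error_factor(alpha, half_round_trips):
    """Factor by which the throughput error shrinks over the given number of
    half round-trip times, following the alpha**(k+1) terms in (5.5)-(5.6)."""
    return alpha ** half_round_trips


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "active node(s):", controlled_share(9, n))
    for load in (Fraction(1, 5), Fraction(1, 4), Fraction(3, 10)):
        share = controlled_share(9, 2, load)
        print("middle-node load", float(load), "->", round(float(share), 3))
    print("error factor over 22 half round trips:", round(error_factor(0.9, 22), 3))
```

With M = 9 the first loop corresponds to the one-, two-, and three-node stages of the scenario in Fig. 7, and the second loop corresponds to the low-rate middle node of Table I.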
VI. BANDWIDTH BALANCING USING LOCAL PRIORITY INFORMATION
A. Concept
Now let us introduce multipriority traffic. As before, our bandwidth balancing procedure will guarantee that there is some unused bus capacity and ask each parcel to limit its throughput to some multiple of that spare capacity. Now, however, the proportionality factor will depend on the priority level of the parcel. Specifically, the parcel of priority $p$ is asked to limit its throughput to a multiple $M_p$ of the spare bus capacity; parcels with less demand than this may have all the bandwidth they desire:

$$r_p(n) = \min\left[\rho_p(n),\; M_p \cdot U\right] = \min\left[\rho_p(n),\; M_p \cdot \left(1 - \sum_m \sum_q r_q(m)\right)\right]. \tag{6.1}$$
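Equation (6.1) defines the carried loads only implicitly, because the spare capacity U depends on the carried loads themselves. One way to solve it numerically, offered here as a minimal sketch and not as the authors' method, is to bisect on U: the quantity U plus the sum of min(ρ, M_p·U) over all parcels increases with U, so there is a single value at which it equals the bus capacity of 1. The function name and the representation of a parcel as an (offered load, modulus) pair are assumptions made for illustration.

```python
def solve_carried_loads(parcels, tol=1e-12):
    """Solve (6.1) for a list of parcels.

    parcels: list of (offered_load, modulus) pairs, one pair per parcel,
             with loads expressed as fractions of the bus capacity.
    Returns (carried_loads, spare_capacity).
    """
    def excess(u):
        # Total bus usage implied by spare capacity u, minus the capacity 1.
        return u + sum(min(rho, m * u) for rho, m in parcels) - 1.0

    lo, hi = 0.0, 1.0            # excess(0) = -1 < 0 and excess(1) >= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    u = (lo + hi) / 2
    return [min(rho, m * u) for rho, m in parcels], u


if __name__ == "__main__":
    # Two heavily loaded parcels (moduli 2 and 4) and one light parcel (modulus 8).
    loads, spare = solve_carried_loads([(1.0, 2), (1.0, 4), (0.1, 8)])
    print([round(x, 4) for x in loads], round(spare, 4))
```

In the example, the light parcel carries its full offered load of 0.1, and the two heavy parcels split the rest in the ratio of their moduli.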
Note that every active parcel in the network gets some bandwidth. This scheme is fair in the sense that all rate-controlled parcels of the same priority level get the same bandwidth. Parcels of different priority levels are offered bandwidth in proportion to their bandwidth balancing moduli $M_p$. Given the offered loads $\rho_p(n)$ and the bandwidth balancing moduli $M_p$, (6.1) can be solved for the carried loads $r_p(n)$. In the special case where all $N_p$ parcels of priority level $p$ have heavy demand, the solution has an especially simple form:

$$r_p(n) = \frac{M_p}{1 + \sum_q M_q \cdot N_q}. \tag{6.2}$$

Suppose, for example, that there are three priority levels and $M_1 = 2$, $M_2 = 4$, and $M_3 = 8$. If there is one active parcel of each priority, then the parcels' throughput rates are 2/15, 4/15, and 8/15, and the unused bandwidth is 1/15 of the bus capacity.

B. Implementation

For this "local" version of bandwidth balancing, the slot header need only contain the busy bit and a single request bit. In order to implement (6.1), the node should respond to arriving busy bits and request bits in such a way that

$$S_p(n,t) \le M_p \cdot U(n,t). \tag{6.3}$$

The most straightforward way to implement (6.3) is to construct a separate section for each priority level $p$, similar to Fig. 2, then add a bandwidth balancing counter with modulus $M_p$ to that section. A more compact implementation is shown in Fig. 8. Here the node has only one section with one data inserter and one request inserter to manage data of all priority levels, but a separate local FIFO queue for each priority is required. The data inserter may serve its transmit queue using either distributed queueing or deference scheduling. A gate controls the movement of local data segments from the local FIFO queues to the data inserter. A local data segment must be authorized (as explained below) before it may pass through the gate, and the data inserter may only accept and process one authorized segment at a time. Whenever the data inserter observes an unallocated slot, it authorizes $M_p$ local data segments for each priority level $p$. (If fewer than $M_p$ segments of priority $p$ are available, then all these available segments are authorized and the extra authorizations expire.) The order in which authorized segments pass through the gate is unimportant, as long as FIFO order is preserved among segments of the same priority level. When all authorized segments of all priority levels have been transmitted, the data inserter is temporarily prevented from processing any more local data. Because the other nodes are following the same discipline, however, the data inserter will eventually detect an unallocated slot and create more authorizations.

Fig. 8. Bandwidth balancing with local priority information. [Block diagram not reproduced: local FIFO queues for each priority feed a single node section through a gate; the section exchanges BUSY/DATA and REQ signals with the buses.]

TABLE I
THROUGHPUTS FOR HETEROGENEOUS USERS WITH BANDWIDTH BALANCING ($M = 9$)

Arrivals at      Offered load at    Throughputs
middle node      middle node        Upstream    Middle    Downstream    Total
deterministic    0.2                0.379       0.200     0.379         0.958
deterministic    0.25               0.355       0.250     0.355         0.961
deterministic    0.3                0.332       0.300     0.332         0.963
random           0.2                0.379       0.200     0.379         0.958
random           0.25               0.355       0.250     0.355         0.961
random           0.3                0.332       0.300     0.332         0.963
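The gate discipline of the compact implementation can be illustrated with a toy model. The sketch below is a deliberately simplified assumption, not the simulator used in the paper: it models a single node that sees only empty slots, receives no requests from downstream nodes, and has no propagation delay. Under those assumptions every cycle consists of one unallocated slot followed by the transmission of all newly authorized segments. All names are hypothetical.

```python
from collections import deque


def simulate_local_gate(moduli, backlog, num_slots):
    """Toy single-node model of the authorization gate.

    moduli:  {priority: M_p}
    backlog: {priority: number of local segments waiting in that FIFO queue}
    Returns {priority: segments transmitted} after num_slots slots have passed.
    """
    waiting = {p: backlog[p] for p in moduli}   # segments not yet authorized
    authorized = deque()                        # priorities of authorized segments
    sent = {p: 0 for p in moduli}
    for _ in range(num_slots):
        if authorized:
            # The data inserter uses this slot for one authorized segment.
            sent[authorized.popleft()] += 1
        else:
            # An unallocated slot goes by: authorize up to M_p segments per
            # priority level; unused authorizations simply expire.
            for p in sorted(moduli, reverse=True):
                grant = min(moduli[p], waiting[p])
                waiting[p] -= grant
                authorized.extend([p] * grant)
    return sent


if __name__ == "__main__":
    moduli = {1: 2, 2: 4, 3: 8}
    heavy = {p: 10**6 for p in moduli}
    sent = simulate_local_gate(moduli, heavy, 1500)
    print({p: sent[p] / 1500 for p in sorted(sent)})
```

With moduli 2, 4, and 8 and heavy demand at every level, the toy model repeats a 15-slot cycle, so the measured shares can be compared directly with the values given by (6.2) for one parcel per priority. Serving the highest priority first within a cycle is an arbitrary choice here, since the text notes that the order is unimportant.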
C. Performance
Fig. 9 shows simulation results for the bandwidth balancing scheme with local priority information, using the compact implementation described above and using distributed queueing. As in the simulation of Fig. 7, the bus is shared by three nodes spaced apart by 28 slots (≈ 16 km) for a round-trip delay of 112 slot times. Fig. 9 shows the nodal throughputs over successive round-trip times. There are three priority levels of traffic, and their bandwidth balancing moduli are 2, 4, and 8. First the node farthest upstream begins transmitting a long message of medium priority. As predicted by (6.2), this node acquires 4/5 of the bus bandwidth. Later the downstream node gets a high-priority message to transmit, and after several round-trip times it achieves a throughput rate of 8/13, while the medium-priority parcel is cut back to 4/13, again as predicted by (6.2). Finally, the middle node becomes active at low priority, and the nodal throughputs shift to 8/15, 4/15, and 2/15, in accord with (6.2).
VII. BANDWIDTH BALANCING USING GLOBAL PRIORITY INFORMATION
A. Concept
Now we assume that every node can determine the bus utilization due to traffic of each priority level. Each parcel is asked to limit its throughput to some multiple $M$ of the spare bus capacity not used by parcels of equal or greater priority; parcels with less demand than this may have all the bandwidth they desire:

$$r_p(n) = \min\left[\rho_p(n),\; M \cdot U_{p+}\right] = \min\left[\rho_p(n),\; M \cdot \left(1 - \sum_m \sum_{q \ge p} r_q(m)\right)\right]. \tag{7.1}$$

This scheme is fair in the sense that all rate-controlled parcels of the same priority level get the same bandwidth. Allocation of bandwidth across the various priority levels is as follows. First, the entire bus capacity is bandwidth-balanced over the highest priority parcels, as though the lower priority parcels did not exist. Bandwidth balancing ensures that some bus capacity will be left unused by the highest priority parcels. This unused bandwidth is then bandwidth-balanced over the second highest priority parcels. The bandwidth left over after the two highest priorities have been processed is then bandwidth-balanced over the third highest priority parcels, etc. We emphasize that with this scheme, in contrast to the scheme of Section VI, the throughput attained by a parcel of a given priority is independent of the presence of lower priority parcels anywhere in the network. Given the offered loads $\rho_p(n)$ and the bandwidth balancing modulus $M$, (7.1) can be solved for the carried loads $r_p(n)$. In the special case where all $N_p$ parcels of priority level $p$ have heavy demand, the solution has a simple form:

$$r_p(n) = \frac{M}{\prod_{q \ge p} (1 + M \cdot N_q)}. \tag{7.2}$$

For example, if $M = 4$ and there are three active parcels of three different priorities, then the parcels' throughput rates are 4/5, 4/25, and 4/125, and the unused bandwidth is 1/125 of the bus capacity.

B. Implementation
For this "global" version of bandwidth balancing, the slot header must contain the busy bit, an indication of the priority level of the data segment in a busy slot,¹¹ and one request bit for each priority level. By reading these fields, each node can determine the priority level of all traffic on the bus (i.e., there is "global" priority information). In order to implement (7.1), node $n$ should respond to arriving busy and request information in such a way that

$$S_p(n,t) \le M \cdot U_{p+}(n,t). \tag{7.3}$$

As shown in Fig. 10, the node needs a separate section to manage data for each priority level. Each section has its own data inserter, request inserter, local FIFO queue, and gate. Each data inserter may serve its transmit queue using either distributed queueing or deference scheduling. Inequality (7.3) can be implemented by the node section of priority $p$ as follows. Only authorized segments may pass through the gate, from the local FIFO queue into the data inserter. The data inserter is still restricted to processing only one (authorized) local data segment at a time. Whenever the data inserter observes a slot that is not allocated to traffic of priority $p$ or higher, it authorizes up to $M$ additional local data segments that were not previously authorized. (If fewer than $M$ unauthorized segments of priority $p$ are available, then all these available segments are authorized and the extra authorizations expire.) Note that there are two circumstances under which the data inserter observes such a slot: a) the slot is already busy with a segment of priority less than $p$ when the slot arrives at the data inserter, or b) the slot arrives empty and finds the transmit queue inside the data inserter also empty, holding no local data segment and holding no requests from downstream nodes.

Fig. 9. Bandwidth balancing with local priority information ($M_1 = 2$, $M_2 = 4$, $M_3 = 8$). [Plot not reproduced. Axes: average throughput versus time, in units of the round-trip delay between the upstream and downstream nodes.]

Fig. 10. Bandwidth balancing with global priority information. [Block diagram not reproduced: one node section per priority level, each with its own local data queue, reading BUSY, PRIO, and DATA fields and the request bits of higher, equal, and lower priority.]

C. Performance
Fig. 11 shows simulation results for this scheme with distributed queueing. As in the preceding simulation, the bus is shared by three nodes spaced apart by 28 slots (≈ 16 km) for a round-trip delay of 112 slot times. There are three priority levels of traffic, and the bandwidth balancing modulus is 4. The arrival times and priority levels of the messages match those for Fig. 9. There are two key differences from the preceding simulation, however. First note that the throughput of the downstream node's high-priority parcel overshoots its target value of 4/5 before settling down at time 31. This happens because the parcel receives less than its correct bandwidth when it first becomes active; the authorizations it accumulates at that time allow it to compensate later. The second thing to notice is that a parcel's steady-state throughput is independent of the presence of lower-priority parcels: when the low-priority parcel turns on at time 45, the high-priority and medium-priority parcels surrender no bandwidth. Once again, for each interval over which the number of active parcels is constant, the steady-state parcel throughputs are close to the values predicted by (7.2).

Fig. 11. Bandwidth balancing with global priority information ($M = 4$). [Plot not reproduced. Axes: average throughput versus time. Annotated events include the upstream node turning on at medium priority and the downstream node turning on at high priority.]

¹¹ In the current DQDB standard, the access control field of the slot header does not include the data priority level. However, the field has enough spare bits that this information could be added in a future version of the standard.
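The level-by-level allocation of the "global" scheme can be worked through numerically. The sketch below follows the description in Section VII-A: the full bus capacity is bandwidth-balanced over the highest priority, and whatever is left over is then balanced over the next priority down. It assumes heavy demand from every parcel, and it assumes that priorities are represented as integers with larger values meaning higher priority; both the representation and the function name are illustrative.

```python
from fractions import Fraction


def global_rates(modulus, parcels_per_priority):
    """Per-parcel steady-state throughput at each priority level.

    parcels_per_priority: {priority: N_p}, where a larger key is a higher priority.
    Returns ({priority: throughput of each parcel}, unused_capacity).
    """
    rates = {}
    spare = Fraction(1)                    # capacity not yet claimed by higher levels
    for p in sorted(parcels_per_priority, reverse=True):
        n_p = parcels_per_priority[p]
        # Each of the N_p parcels takes M times what is left after this level:
        # r = M * (spare - N_p * r), hence r = M * spare / (1 + M * N_p).
        rates[p] = modulus * spare / (1 + modulus * n_p)
        spare = spare / (1 + modulus * n_p)
    return rates, spare


if __name__ == "__main__":
    rates, unused = global_rates(4, {3: 1, 2: 1, 1: 1})
    print(rates, unused)
```

Removing the lowest-priority entry from the input leaves the rates of the higher levels unchanged, which is the independence property emphasized in the text.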
VIII. CONCLUSION

If a dual-bus network uses the DQDB protocol with bandwidth balancing disabled, unfair operating conditions can occur during overloads. If the bus speed is 150 Mbps, then when two nodes separated by 30 km perform simultaneous file transfers, one can obtain a hundred times the throughput of the other. Fortunately, bandwidth balancing is a simple technique that corrects this problem reasonably well. By having each node use the request and busy bits of the DQDB protocol to gauge competing traffic, and by constraining the node to acquire no more than a certain fraction of the remaining bandwidth, we can make the protocol converge to a fair operating point when several nodes are performing large file transfers. How rapidly the network converges to fair operation depends upon how much bandwidth we are willing to waste. The larger the waste, the faster the convergence. The bandwidth efficiency and convergence time of bandwidth balancing cannot match that attainable by a centralized scheduler. This difference suggests that there is still room for improvement in distributed schemes like bandwidth balancing, although there is likely to be a tradeoff between performance and complexity. The implementation of bandwidth balancing is quite simple. For instance, a node can restrict itself to 90% of the available slots by letting one extra (i.e., unrequested) idle slot go by every time it transmits nine of its own data segments. Bandwidth balancing somewhat reduces the importance of the discipline used by a node to schedule its own data segments among the requests from downstream nodes. The distributed queueing discipline of DQDB may be retained, or a streamlined discipline called deference scheduling may be substituted.

In this paper we also proposed two ways to balance the bandwidth of multipriority traffic more effectively than does the current standard. The two new methods differ in the priority information they use. In the "local" scheme, each node only knows the priority level of its own data; this scheme uses no priority information from the buses. In the "global" scheme, however, the slot headers need to show the priority of data segments in busy slots, as well as the request priorities. Both methods determine the carried loads of the nodes based only on their offered loads, not on their relative bus locations. (The version of multipriority bandwidth balancing in the current standard is not so robust.) Moreover, both methods are fair in the sense that traffic parcels of the same priority level get the same bandwidth in steady state. (The current standard lacks this level of fairness.) The methods differ in their allocation of bus bandwidth across the various priority levels. In the "local" scheme, each priority level has an associated bandwidth balancing modulus, and the scheme allocates bandwidth to all traffic parcels in proportion to their bandwidth balancing moduli. The steady-state throughput of a traffic parcel is affected by the presence of every other parcel for that bus. The "global" scheme, roughly speaking, allocates bandwidth first to the higher-priority parcels, then allocates the leftovers to the lower-priority parcels. Lower-priority parcels have no effect on the steady-state throughputs of higher-priority parcels.
Either multipriority technique could be implemented through modest changes in the DQDB standard. The current slot header already contains more than enough control information to drive the "local" scheme, and while the header does not currently include all the information needed for the "global" scheme, there are spare header bits that could be used for that purpose in a future version of the standard. If either scheme were included in a future standard, then nodes satisfying the old standard could share the same network with the new nodes, provided that the old nodes only generate traffic of the lowest priority level. In fact, a crude variant of the "local" scheme (cf. [44] and the "local per-node" procedure in [45]) can be realized within the framework of the current standard as follows: each node executes the DQDB protocol and bandwidth balancing as if all its traffic were at the lowest priority level, while constantly varying its bandwidth balancing modulus according to the true priority level of its current traffic.

It also bears mention that all the bandwidth balancing schemes presented in this paper have an extra dimension of flexibility: different bandwidth balancing moduli may be assigned to different nodes [43]. By this means, extra bandwidth may be allocated to high-volume nodes such as file servers.
ACKNOWLEDGMENT

Our many discussions with U. Mukherji greatly increased our understanding of the performance of the DQDB protocol.
REFERENCES

[1] R. M. Newman, Z. L. Budrikis, and J. L. Hullett, "The QPSX MAN," IEEE Commun., vol. 26, pp. 20-28, Apr. 1988.
[2] Z. L. Budrikis, J. L. Hullett, R. M. Newman, D. Economou, F. M. Fozdar, and R. D. Jeffery, "QPSX: A queue packet and synchronous circuit exchange," 8th Int. Conf. Comput. Commun., Munich, Germany, Sept. 15-19, 1986, published by North-Holland, pp. 288-293.
[3] IEEE 802.6 Working Group, "IEEE standard 802.6: Distributed Queue Dual Bus (DQDB) subnetwork of a metropolitan area network (MAN)," Final Draft D15, approved by IEEE Standards Board on Dec. 6, 1990.
[4] J. O. Limb and C. Flores, "Description of Fasnet - A unidirectional local area communications network," Bell Syst. Tech. J., vol. 61, no. 7, pp. 1413-1440, Sept. 1982.
[5] H. Kaur and G. Campbell, "DQDB - An access delay analysis," in IEEE INFOCOM '90, San Francisco, CA, June 5-7, 1990, pp. 630-635.
[6] R. M. Newman, "Distributed queueing: Performance characterisation," Contribution 802.6-88/11 to the IEEE 802.6 Working Group, 1988.
[7] M. Conti, E. Gregori, and L. Lenzini, "DQDB media access control protocol: Performance evaluation and unfairness analysis," in Third IEEE Workshop MAN's, San Diego, CA, Mar. 28-30, 1989, pp. 375-408.
[8] M. W. Garrett, "Simulation study of dual bus MAN's," in Third IEEE Workshop MAN's, San Diego, CA, Mar. 28-30, 1989, pp. 409-419.
[9] P. Davids and T. Welzel, "Performance analysis of DQDB based on simulation," in Third IEEE Workshop MAN's, San Diego, CA, Mar. 28-30, 1989, pp. 431-445.
[10] J. W. Wong, "Throughput of DQDB networks under heavy load," EFOC/LAN-89, Amsterdam, The Netherlands, June 14-16, 1989, pp. 146-151.
[11] M. Zukerman and P. Potter, "The effect of eliminating the standby state on DQDB performance under overload," Int. J. Digital Analog Cabled Syst., vol. 2, 1989, pp. 179-186.
[12] M. N. Huber, K. Sauer, and W. Schodl, "QPSX and FDDI-II: Performance study of high speed LAN's," EFOC/LAN'88, Amsterdam, June 29-July 1, 1988, pp. 316-321.
[13] R. M. Newman and J. L. Hullett, "Distributed queueing: A fast and efficient packet access protocol for QPSX," 8th Int. Conf. Comput. Commun., Munich, Germany, Sept. 15-19, 1986, published by North-Holland, pp. 294-299.
[14] H. R. van As, J. W. Wong, and P. Zafiropulo, "Fairness, priority and predictability of the DQDB MAC protocol under heavy load," in Int. Zurich Sem. Digital Commun., Zurich, Switzerland, March 5-8, 1990, pp. 410-411.
[15] P. Tran-Gia and T. Stock, "Approximate performance analysis of the DQDB access protocol," Paper no. 16.1, ITC Specialist Seminar, Adelaide, Australia, 1989.
[16] B. T. Doshi and A. A. Fredericks, "Visual modeling and analysis with application to IEEE 802.6 DQDB protocol," Paper no. 16.2, ITC Specialist Seminar, Adelaide, Australia, 1989.
[17] K. Sauer and W. Schodl, "Performance aspects of the DQDB protocol," Paper no. 16.3, ITC Specialist Seminar, Adelaide, Australia, 1989.
[18] M. Zukerman and P. Potter, "The DQDB protocol and its performance under overload traffic conditions," Paper no. 16.4, ITC Specialist Seminar, Adelaide, Australia, 1989.
[19] O. Gihr, "The fairness issue in high speed local area networks," in Fourth Australian Teletraffic Res. Sem., Bond Univ., Queensland, Australia, Dec. 4-5, 1989.
[20] P. Martini, "Fairness issues of the DQDB protocol," in 14th Conf. Local Comput. Networks, Minneapolis, MN, Oct. 10-12, 1989, pp. 160-170.
[21] Y. Yaw, Y.-K. Yea, W.-D. Ju, and P. A. Ng, "Analysis on access fairness and a technique to extend distance for 802.6," in 14th Conf. Local Comput. Networks, Minneapolis, MN, Oct. 10-12, 1989, pp. 171-176.
[22] C. Bisdikian, "Waiting time analysis in a single buffer DQDB (802.6) network," in IEEE INFOCOM '90, San Francisco, CA, June 5-7, 1990, pp. 610-616.
[23] H. Adiseshu and U. Mukherji, "An approximate performance analysis of the distributed queue dual bus metropolitan area network medium access control protocol," in Workshop Signal Processing, Commun., Networking, Indian Inst. Sci., Bangalore, India, July 23-26, 1990.
[24] E. L. Hahne, A. K. Choudhury, and N. F. Maxemchuk, "Improving the fairness of distributed-queue dual-bus networks," in IEEE INFOCOM '90, San Francisco, CA, June 5-7, 1990, pp. 175-184.
[25] M. Conti, E. Gregori, and L. Lenzini, "DQDB under heavy load: Performance evaluation and fairness analysis," in IEEE INFOCOM '90, San Francisco, CA, June 5-7, 1990, pp. 313-320.
[26] S. Fdida and H. Santoso, "Simulation and approximate performance analysis of DQDB," Tech. Rep. 90-18, MASI Lab., Pierre and Marie Curie Univ., Paris, France, May 1990.
[27] H. R. van As, "Performance evaluation of bandwidth balancing in the DQDB MAC protocol," EFOC/LAN 90, Munich, Germany, June 27-29, 1990, pp. 231-239.
[28] A. Myles, "DQDB simulation and MAC protocol analysis," Electron. Lett., vol. 25, no. 9, Apr. 27, 1989, pp. 616-618.
[29] A. Baiocchi, M. Carosi, M. Listanti, G. Pacifici, A. Roveri, and R. Winkler, "The ACCI access protocol for a twin bus ATM metropolitan area network," in IEEE INFOCOM '90, San Francisco, CA, June 5-7, 1990, pp. 165-174.
[30] K. M. Khalil and M. E. Koblentz, "A fair distributed queue dual bus access method," in 14th Conf. Local Comput. Networks, Minneapolis, MN, Oct. 10-12, 1989, pp. 180-188.
[31] J. Filipiak, "Access protection for fairness in a distributed queue dual bus metropolitan area network," IEEE ICC '89, Boston, MA, June 1989, pp. 635-639.
[32] H. R. Muller, M. M. Nassehi, J. W. Wong, E. Zurfluh, W. Bux, and P. Zafiropulo, "DQMA and CRMA: New access schemes for Gbit/s LANs and MANs," in IEEE INFOCOM '90, San Francisco, CA, June 5-7, 1990, pp. 185-191.
[33] B. Mukherjee and J. S. Meditch, "The pi-persistent protocol for unidirectional broadcast bus networks," IEEE Trans. Commun., vol. 36, pp. 1277-1295, Dec. 1988.
[34] J. O. Limb, "Load-controlled scheduling of traffic on high-speed metropolitan area networks," IEEE Trans. Commun., vol. 37, pp. 1144-1150, Nov. 1989.
[35] M. A. Rodrigues, "A fair MAC access scheme," Contribution 802.6-88/62 to the IEEE 802.6 Working Group, July 5, 1988.
[36] P. Potter and M. Zukerman, "Cyclic request control for provision of guaranteed bandwidth within the DQDB framework," ISS '90, Stockholm, Sweden, May 1990, Paper no. A4.1.
[37] Y.-S. Leu and D. H. C. Du, "Cycle compensation protocol: A completely fair protocol for the unidirectional twin bus architecture," in 15th Conf. Local Comput. Networks, Minneapolis, MN, Sept. 30-Oct. 3, 1990, pp. 416-425.
[38] J. Jaffe, "Bottleneck flow control," IEEE Trans. Commun., vol. COM-29, pp. 954-962, July 1981.
[39] M. Spratt, "A problem in the multi-priority implementation of the bandwidth balancing mechanism," Contribution to the IEEE 802.6 Working Group, Nov. 1989.
[40] V. Phung and R. Breault, "Enhancement to the bandwidth balancing mechanism (version 2)," Contribution 802.6-90/25 to the IEEE 802.6 Working Group, Mar. 1990.
[41] M. Spratt, "Implementing the priorities in the bandwidth balancing mechanism," Contribution 802.6-90/05 to the IEEE 802.6 Working Group, Jan. 1990.
[42] R. Damodaram, verbal proposal at the IEEE 802.6 Working Group meeting of Sept. 1989.
[43] M. Spratt, "A non-unity ratio bandwidth allocation mechanism - A simple improvement to the bandwidth balancing mechanism," Contribution to the IEEE 802.6 Working Group, Nov. 1989.
[44] E. L. Hahne and N. F. Maxemchuk, "Bandwidth balancing with local priority information," Contribution 802.6-90/06 to the IEEE 802.6 Working Group, Jan. 22, 1990.
[45] E. L. Hahne and N. F. Maxemchuk, "Fair access of multi-priority traffic to distributed-queue dual-bus networks," in IEEE INFOCOM '91, Bal Harbour, FL, Apr. 9-11, 1991, pp. 889-900.
[46] E. L. Hahne and N. F. Maxemchuk, "Bandwidth balancing with global priority information," Contribution 802.6-90/07 to the IEEE 802.6 Working Group, Jan. 22, 1990.
[47] A. M. Perdikaris and M. A. Rodrigues, "Destination release of segments in the IEEE 802.6 protocol," Contribution 802.6-88/61 to the IEEE 802.6 Working Group, July 1988.
[48] R. Breault and V. Phung, "DQDB performance improvement with erasure nodes," Contribution 802.6-90/21 to the IEEE 802.6 Working Group, Mar. 1990.
[49] M. Zukerman and P. G. Potter, "A protocol for eraser node implementation within the DQDB framework," in IEEE GLOBECOM '90, San Diego, CA, Dec. 1990, pp. 1400-1404.
[50] P. Potter and M. Zukerman, "A discrete shared processor model for DQDB," ITC Specialist Sem., Adelaide, Australia, 1989, Paper No. 3.4.
[51] N. L. Golding, "DQDB D10 fairness analysis," Contribution 802.6-90/01 to the IEEE 802.6 Working Group, Jan. 1990.
[52] M. Zukerman and G. Shi, "Throughput analysis for a DQDB subnetwork with two close sources under overload traffic conditions," SNRB Branch Paper 205, Telecom Australia Res. Labs., Clayton, Australia, Jan. 1991.
Ellen L. Hahne (M'87) was born in 1956 in Pittsburgh, PA. She received the B.S. degree in electrical engineering from Rice University, Houston, TX, in 1978 and the S.M., E.E., and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, MA, in 1981, 1984, and 1987. Since 1978 she has worked for AT&T Bell Laboratories, Murray Hill, NJ. Her primary research interest is the control of traffic in networks. Past projects have dealt with scheduling and flow control in wide-area data networks, access control in local- and metropolitan-area data networks, detection and control of focused overloads in telephone networks, and dynamic routing in automated factories. Dr. Hahne was named a Presidential Scholar in 1974. She is a member of Phi Beta Kappa, Tau Beta Pi, and Sigma Xi.

Abhijit K. Choudhury (M'91) received the B.Tech. degree in electrical engineering from the Indian Institute of Technology, Kanpur, India, in 1986 and the M.S. degree in electrical engineering from S.U.N.Y. at Stony Brook, New York, in 1987. He received the Ph.D. degree in electrical engineering from the University of Southern California in 1991. During the summer of 1988, he worked at AT&T Bell Laboratories, Murray Hill, NJ, where he was part of a team that designed the bandwidth balancing technique for solving the unfairness problem in DQDB networks. In the summer of 1989, he was engaged in research at AT&T Bell Laboratories, Murray Hill, that led to an understanding of the effect of a finite reassembly buffer on the performance of deflection routing. His dissertation work dealt with deflection routing in high-speed computer networks. He is currently working in the Distributed Systems Research Department of AT&T Bell Laboratories, Murray Hill. His current areas of interest are routing and flow control in high-speed networks, performance evaluation of communication networks and computer systems, multiple access protocols, and network design. Dr. Choudhury is a member of the IEEE Communications and Computer Societies and of Eta Kappa Nu.

Nicholas F. Maxemchuk (F'89) received the B.S.E.E. degree from the City College of New York, and the M.S.E.E. and Ph.D. degrees from the University of Pennsylvania. He is currently the Head of the Distributed Systems Research Department at AT&T Bell Laboratories, where he has been since 1976. Prior to joining Bell Laboratories he was at the RCA David Sarnoff Research Center in Princeton, NJ, for eight years. Dr. Maxemchuk has been on the adjunct faculties of Columbia University and the University of Pennsylvania. He has served as the Editor for Data Communications for the IEEE TRANSACTIONS ON COMMUNICATIONS, as a Guest Editor for the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, and is currently on JSAC's Editorial Board. He has been on the program committee for numerous conferences and workshops, and was the Program Chairman for the 1987 and 1990 workshops on Metropolitan Area Networks. He was awarded the RCA Laboratories Outstanding Achievement Award in 1970, the Bell Laboratories Distinguished Technical Staff Award in 1984, and the IEEE's 1985 and 1987 Leonard G. Abraham Prize Paper Award.
Input Versus Output Queueing on a Space-Division Packet Switch

MARK J. KAROL, MEMBER, IEEE, MICHAEL G. HLUCHYJ, MEMBER, IEEE, AND SAMUEL P. MORGAN, FELLOW, IEEE

Abstract-Two simple models of queueing on an $N \times N$ space-division packet switch are examined. The switch operates synchronously with fixed-length packets; during each time slot, packets may arrive on any inputs addressed to any outputs. Because packet arrivals to the switch are unscheduled, more than one packet may arrive for the same output during the same time slot, making queueing unavoidable. Mean queue lengths are always greater for queueing on inputs than for queueing on outputs, and the output queues saturate only as the utilization approaches unity. Input queues, on the other hand, saturate at a utilization that depends on $N$, but is approximately $(2 - \sqrt{2}) = 0.586$ when $N$ is large. If output trunk utilization is the primary consideration, it is possible to slightly increase utilization of the output trunks (up to $(1 - e^{-1}) = 0.632$ as $N \to \infty$) by dropping interfering packets at the end of each time slot, rather than storing them in the input queues. This improvement is possible, however, only when the utilization of the input trunks exceeds a second critical threshold, approximately $\ln(1 + \sqrt{2}) = 0.881$ for large $N$.

Paper approved by the Editor for Local Area Networks of the IEEE Communications Society. Manuscript received August 8, 1986; revised May 14, 1987. This paper was presented at GLOBECOM '86, Houston, TX, December 1986.
M. J. Karol is with AT&T Bell Laboratories, Holmdel, NJ 07733.
M. G. Hluchyj was with AT&T Bell Laboratories, Holmdel, NJ 07733. He is now with the Codex Corporation, Canton, MA 02021.
S. P. Morgan is with AT&T Bell Laboratories, Murray Hill, NJ 07974.
IEEE Log Number 8717486.

I. INTRODUCTION

SPACE-DIVISION packet switching is emerging as a key component in the trend toward high-performance integrated communication networks for data, voice, image, and video [1], [2] and multiprocessor interconnects for building highly parallel computer systems [3], [4]. Unlike present-day packet switch architectures with throughputs measured in 1's or at most 10's of Mbits/s, a space-division packet switch can have throughputs measured in 1's, 10's, or even 100's of Gbits/s. These capacities are attained through the use of a highly parallel switch fabric coupled with simple per packet processing distributed among many high-speed VLSI circuits.

Conceptually, a space-division packet switch is a box with $N$ inputs and $N$ outputs that routes the packets arriving on its inputs to the appropriate outputs. At any given time, internal switch points can be set to establish certain paths from inputs to outputs; the routing information used to establish input-output paths is often contained in the header of each arriving packet. Packets may have to be buffered within the switch until appropriate connections are available; the location of the buffers and the amount of buffering required depend on the switch architecture and the statistics of the offered traffic.

Clearly, congestion can occur if the switch is a blocking network, that is, if there are not enough switch points to provide simultaneous, independent paths between arbitrary pairs of inputs and outputs. A Banyan switch [3]-[5], for example, is a blocking network. In a Banyan switch, even when every input is assigned to a different output, as many as $\sqrt{N}$ connections may be
contending for use of the same center link. The use of a blocking network as a packet switch is feasible only under light loads or, alternatively, if it is possible to run the switch substantially faster than the input and output trunks.

In this paper, we consider only nonblocking networks. A simple example of a nonblocking switch fabric is the crossbar interconnect with $N^2$ switch points (Fig. 1). Here it is always possible to establish a connection between any idle input-output pair. Examples of other nonblocking switch fabrics are given in [3]. Even with a nonblocking interconnect, some queueing in a packet switch is unavoidable, simply because the switch acts as a statistical multiplexor; that is, packet arrivals to the switch are unscheduled. If more than one packet arrives for the same output at a given time, queueing is required. Depending on the speed of the switch fabric and its particular architecture, there may be a choice as to where the queueing is done: for example, on the input trunk, on the output trunk, or at an internal node.

We assume that the switch operates synchronously with fixed-length packets, and that during each time slot, packets may arrive on any inputs addressed to any outputs (Fig. 2). If the switch fabric runs $N$ times as fast as the input and output trunks, all the packets that arrive during a particular input time slot can traverse the switch before the next input slot, but there will still be queueing at the outputs [Fig. 1(a)]. This queueing really has nothing to do with the switch architecture, but is due to the simultaneous arrival of more than one input packet for the same output. If, on the other hand, the switch fabric runs at the same speed as the inputs and outputs, only one packet can be accepted by any given output line during a time slot, and other packets addressed to the same output must queue on the input lines [Fig. 1(b)]. For simplicity, we do not consider the intermediate case where some packets can be queued at internal nodes, as in the Banyan topology.

It seems intuitively reasonable that the mean queue lengths, and hence the mean waiting times, will be greater for queueing on inputs than for queueing on outputs. When queueing is done on inputs, a packet that could traverse the switch to an idle output during the current time slot may have to wait in queue behind a packet whose output is currently busy.

The intuition that, if possible, it is better to queue on the outputs than the inputs of a space-division packet switch also pertains to the following situation. Consider a single road leading to both a sports arena and a store [Fig. 3(a)]. Even if there are no customers waiting for service in the store, some shoppers might be stuck in stadium traffic. A simple bypass road around the stadium is the obvious solution [Fig. 3(b)].

This paper quantifies the performance improvements provided by output queueing for the following simple model. Independent, statistically identical traffic arrives on each input trunk. In any given time slot, the probability that a packet will arrive on a particular input is $p$. Thus, $p$ represents the average utilization of each input. Each packet has equal probability $1/N$ of being addressed to any given output, and successive packets are independent. With output queueing, all arriving packets in a time slot are
Reprinted from IEEE Transactions on Communications, vol. COM-35, no. 12, December 1987.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. (a) An $N \times N$ crossbar switch with output queueing. (b) An $N \times N$ crossbar switch with input queueing.

Fig. 2. Fixed-length packets arrive synchronously to a time-slotted packet switch.

Fig. 3. "Output queueing" (b) is superior to "input queueing" (a). In (a), even if there are no customers waiting for service in the store, some shoppers might be stuck in stadium traffic. In (b), a bypass road around the stadium serves cars traveling to the store.
cleared before the beginning of the next time slot. For example, a crossbar switch fabric that runs $N$ times as fast as the inputs and outputs can queue all packet arrivals according to their output addresses, even if all $N$ inputs have packets destined for the same output [Fig. 1(a)]. If $k$ packets arrive for one output during the current time slot, however, only one can be transmitted over the output trunk. The remaining $k - 1$ packets go into an output FIFO (first-in, first-out queue) for transmission during subsequent time slots. Since the average utilization of each output trunk is the same as the utilization of each input trunk, namely $p$, the system is stable and the mean queue lengths will be finite for $p < 1$, but they will be greater than zero if $p > 0$.

A crossbar interconnect with the switch fabric running at the same speed as the inputs and outputs exemplifies input queueing [Fig. 1(b)]. Each arriving packet goes, at least momentarily, into a FIFO on its input trunk. At the beginning of every time slot, the switch controller looks at the first packet in each FIFO. If every packet is addressed to a different output, the controller closes the proper crosspoints and all the packets go through. If $k$ packets are addressed to a particular output, the controller picks one to send; the others wait until the next time slot, when a new selection is made among the packets that are then waiting. Three selection policies are discussed in Section III: random selection, in which one of the $k$ packets is chosen at random, each selected with equal probability $1/k$; longest queue selection, in which the controller sends the packet from the longest input queue;¹ and fixed priority selection, in which the $N$ inputs have fixed priority levels and, of the $k$ packets, the controller sends the one with highest priority.

¹ A random selection is made if, of the $k$ input queues with packets addressed to a particular output, several queues have the same maximum length.

Solutions of these two queueing problems are given in Sections II and III. Curves showing mean waiting time as a function of $p$ are plotted for various values of $N$. As expected, the mean waiting times are greater for queueing on inputs than for queueing on outputs. Furthermore, the output queues saturate only as $p \to 1$. Input queues, on the other hand, saturate at a value of $p$ less than unity, depending weakly on $N$; for large $N$, the critical value of $p$ is approximately $(2 - \sqrt{2}) = 0.586$ with the random selection policy. When the utilization $p$ of the input trunks exceeds the critical value, the steady-state queue sizes are infinite, packets experience infinite waiting times, and the output trunk utilization is limited to approximately 0.586 (for large $N$).

In the saturation region, however, it is possible to increase utilization of the output trunks (up to $(1 - e^{-1}) = 0.632$ as $N \to \infty$) by dropping packets, rather than storing them in the input queues. This improvement is possible, however, only when the utilization of the input trunks exceeds a second critical threshold, approximately $\ln(1 + \sqrt{2}) = 0.881$ for large $N$. Consequently, if the objective is maximum output utilization, rather than 100 percent packet delivery, then below the second threshold, it is better to queue packets until they are successful, whereas above the second threshold, it is better to reduce input queue blocking by dropping packets whenever there are conflicts. With high probability, new packets (with new destinations) will quickly arrive to replace the dropped packets.

Comparing the random and longest queue selection policies of input queueing, the mean waiting times are greater with random selection. This is expected because the longest queue selection policy reduces the expected number of packets blocked (behind other packets) from traversing the switch to idle outputs. For fairness, the fixed priority discipline should be avoided because the lowest priority input queue suffers large delays and is sometimes unstable, even when the other two selection policies guarantee stability.

II. QUEUES ON OUTPUTS

Much of the following analysis of the output queueing scheme involves well-known results for discrete-time queueing systems [6]. Communication systems have been modeled by discrete-time queues in the past (e.g., [7]); we sketch our analysis and present results for later comparison to the input queueing analysis.

We assume that packet arrivals on the $N$ input trunks are governed by independent and identical Bernoulli processes. Specifically, in any given time slot, the probability that a packet will arrive on a particular input is $p$. Each packet has equal probability $1/N$ of being addressed to any given output, and successive packets are independent.

Fixing our attention on a particular output queue (the "tagged" queue), we define the random variable $A$ as the number of packet arrivals at the tagged queue during a given time slot.² It follows that $A$ has the binomial probabilities

$$a_i \triangleq \Pr[A = i] = \binom{N}{i} (p/N)^i (1 - p/N)^{N-i}, \qquad i = 0, 1, \ldots, N \tag{1}$$

with probability generating function (PGF)

$$A(z) \triangleq \sum_{i=0}^{N} z^i \Pr[A = i] = \left(1 - \frac{p}{N} + z\,\frac{p}{N}\right)^{N}. \tag{2}$$

As $N \to \infty$, the number of packet arrivals at the tagged queue during each time slot has the Poisson probabilities

$$a_i \triangleq \Pr[A = i] = \frac{p^i e^{-p}}{i!}, \qquad i = 0, 1, 2, \ldots \tag{3}$$

with probability generating function (PGF)

$$A(z) \triangleq \sum_{i=0}^{\infty} z^i \Pr[A = i] = e^{-p(1-z)}. \tag{4}$$
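The convergence of the binomial arrival probabilities (1) to the Poisson probabilities (3) can be checked numerically. The following is a minimal Python sketch; the function names `binomial_arrivals` and `poisson_arrivals` are illustrative choices and do not come from the paper.

```python
from math import comb, exp

def binomial_arrivals(N, p):
    """a_i for i = 0..N from equation (1): each of N inputs sends a packet
    to the tagged output with probability p/N."""
    q = p / N
    return [comb(N, i) * q**i * (1 - q)**(N - i) for i in range(N + 1)]

def poisson_arrivals(p, imax):
    """a_i for i = 0..imax from equation (3), built with the ratio
    a_i = a_{i-1} * p / i to avoid large factorials."""
    a = [exp(-p)]
    for i in range(1, imax + 1):
        a.append(a[-1] * p / i)
    return a

if __name__ == "__main__":
    p = 0.8
    for N in (4, 16, 64):
        a_bin = binomial_arrivals(N, p)
        a_poi = poisson_arrivals(p, N)
        gap = max(abs(x - y) for x, y in zip(a_bin, a_poi))
        print(f"N={N:3d}  largest |binomial - Poisson| = {gap:.5f}")
```

The printed gap shrinks as the switch size grows, which is the sense in which (3) is the large-$N$ limit of (1).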
Letting $Q_m$ denote the number of packets in the tagged queue at the end of the $m$th time slot, and $A_m$ denote the number of packet arrivals during the $m$th time slot, we have

$$Q_m = \max(0,\; Q_{m-1} + A_m - 1). \tag{5}$$

When $Q_{m-1} = 0$ and $A_m > 0$, one of the new packets is immediately transmitted during the $m$th time slot; that is, a packet flows through the switch without suffering any delay. The queue size $Q_m$ is modeled by a discrete-time Markov chain; Fig. 4 illustrates the state transition diagram. Using (5) and following a standard approach in queueing analysis (see, for example, [8, sect. 5.6]), we obtain the PGF for the steady-state queue size:

$$Q(z) = \frac{(1-p)(1-z)}{A(z) - z}. \tag{6}$$

Fig. 4. The discrete-time Markov chain state transition diagram for the output queue size.

Finally, substituting the right-hand side of (2) into (6), we obtain

$$Q(z) = \frac{(1-p)(1-z)}{\left(1 - \dfrac{p}{N} + z\,\dfrac{p}{N}\right)^{N} - z}. \tag{7}$$

Now, differentiating (7) with respect to $z$ and taking the limit as $z \to 1$, we obtain the mean steady-state queue size $\bar{Q}$ given by

$$\bar{Q} = \frac{N-1}{N} \cdot \frac{p^2}{2(1-p)} = \frac{N-1}{N}\, \bar{Q}_{M/D/1} \tag{8}$$

where $\bar{Q}_{M/D/1}$ denotes the mean queue size for an M/D/1 queue. Hence, as $N \to \infty$, $\bar{Q} \to \bar{Q}_{M/D/1}$.

We can make the even stronger statement that the steady-state probabilities for the queue size converge to those of an M/D/1 queue. Taking the limit as $N \to \infty$ on both sides of (7) yields

$$\lim_{N \to \infty} Q(z) = \frac{(1-p)(1-z)}{e^{-p(1-z)} - z} \tag{9}$$

which corresponds to the PGF for the steady-state queue size of an M/D/1 queue. Expanding (9) in a Maclaurin series [9] yields the asymptotic (as $N \to \infty$) queue size probabilities³

$$\Pr(Q = 0) = (1-p)\,e^{p} \tag{10}$$

$$\Pr(Q = 1) = (1-p)\,e^{p}\,(e^{p} - 1 - p) \tag{11}$$

$$\Pr(Q = n) = (1-p) \sum_{j=1}^{n+1} (-1)^{n+1-j}\, e^{jp} \left[ \frac{(jp)^{n+1-j}}{(n+1-j)!} + \frac{(jp)^{n-j}}{(n-j)!} \right] \quad \text{for } n \ge 2 \tag{12}$$

where the second factor in (12) is ignored for $j = n + 1$.

² We use the phrase "arrivals at the tagged queue during a given time slot" to indicate that packets do not arrive instantaneously, in their entirety, at the output. Packets have a nonzero transmission time.
³ The steady-state probabilities in [9, sect. 5.1.5] are for the total number of packets in an M/D/1 system. We are interested in queue size; hence, the modification to (10)-(12).

Although it is mathematically pleasing to have closed-form expressions, directly using (12) to compute the steady-state probabilities leads to inaccurate results for large $n$. When $n$ is large, the alternating series (12) expresses small steady-state probabilities as the difference between very large positive numbers. Accurate values are required if one is interested in the tail of the distribution; for example, to compute the probability that the queue size exceeds some value $M$. Numerically, a more accurate algorithm is obtained directly from the Markov chain (Fig. 4) balance equations. Equations (13)-(15) numerically provide the steady-state queue size probabilities.

$$q_0 \triangleq \Pr(Q = 0) = \frac{1-p}{a_0} \tag{13}$$

$$q_1 \triangleq \Pr(Q = 1) = \frac{1 - a_0 - a_1}{a_0} \cdot q_0 \tag{14}$$

$$q_n \triangleq \Pr(Q = n) = \frac{1 - a_1}{a_0} \cdot q_{n-1} - \sum_{i=2}^{n} \frac{a_i}{a_0} \cdot q_{n-i}, \qquad n \ge 2 \tag{15}$$
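The recursion (13)-(15) translates almost line for line into code. The sketch below is one possible Python rendering, not the authors' program; the names `arrival_probs` and `queue_size_probs` are hypothetical, and passing `N=None` is an assumed convention for the Poisson limit (3). The mean of the computed distribution can be compared with (8) as a sanity check.

```python
from math import comb, exp

def arrival_probs(N, p, imax):
    """a_0..a_imax: binomial (1) for finite N, Poisson (3) when N is None."""
    if N is None:
        a = [exp(-p)]
        for i in range(1, imax + 1):
            a.append(a[-1] * p / i)
        return a
    q = p / N
    return [comb(N, i) * q**i * (1 - q)**(N - i) if i <= N else 0.0
            for i in range(imax + 1)]

def queue_size_probs(N, p, nmax):
    """Steady-state q_0..q_nmax from the balance equations (13)-(15)."""
    a = arrival_probs(N, p, nmax)
    q = [(1 - p) / a[0]]                                  # equation (13)
    if nmax >= 1:
        q.append((1 - a[0] - a[1]) / a[0] * q[0])         # equation (14)
    for n in range(2, nmax + 1):                          # equation (15)
        s = sum(a[i] * q[n - i] for i in range(2, n + 1))
        q.append(((1 - a[1]) * q[n - 1] - s) / a[0])
    return q

if __name__ == "__main__":
    p = 0.8
    q = queue_size_probs(None, p, 60)
    mean = sum(n * x for n, x in enumerate(q))
    print("mean from recursion:", mean)
    print("mean from (8), N = infinity:", p**2 / (2 * (1 - p)))
```

Because each step only combines probabilities of comparable size, the recursion avoids the cancellation between large alternating terms that affects (12).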
In (13)-(15), the $a_i$ are given by (1) and (3) for $N < \infty$ and $N = \infty$, respectively.

We are now interested in the waiting time for an arbitrary (tagged) packet that arrives at the tagged output FIFO during the $m$th time slot. We assume that packet arrivals to the output queue in the $m$th time slot are transmitted over the output trunk in random order. All packets arriving in earlier time slots, however, must be transmitted first. The tagged packet's waiting time $W$ has two components. First, the packet must wait $W_1$ time slots while packets that arrived in earlier time slots are transmitted. Second, it must wait an additional $W_2$ time slots until it is randomly selected out of the packet arrivals in the $m$th time slot. Since packets require one time slot for transmission over the output trunk, $W_1$ equals $Q_{m-1}$. Consequently, from (6), the PGF for the steady-state value of $W_1$ is

$$W_1(z) = \frac{(1-p)(1-z)}{A(z) - z}. \tag{16}$$

We must be careful when we compute $W_2$, the delay due to the transmission of other packet arrivals in the $m$th time slot. Burke points out in [10] that many standard works on queueing theory are in error when they compute the delay of single-server queues with batch input. Instead of working with the size of the batch to which the tagged packet belongs, it is tempting to work with the size of an arbitrary batch. Errors result when the batches are not of constant size. The probability that our tagged packet arrives in a batch of size $i$ is given by $i a_i / \bar{A}$; hence, the random variable $W_2$ has the probabilities

$$\Pr[W_2 = k] = \sum_{i=k+1}^{\infty} \frac{1}{i} \cdot \frac{i\, a_i}{\bar{A}} = \frac{1}{p} \sum_{i=k+1}^{\infty} a_i, \qquad k = 0, 1, 2, \ldots \tag{17}$$

where $\bar{A}$ $(= p)$ is the expected number of packet arrivals at the tagged output during each time slot, and the $a_i$ are given by (1) and (3) for $N < \infty$ and $N = \infty$, respectively. The PGF for the steady-state value of $W_2$ follows directly from (17).

$$W_2(z) = \frac{1 - A(z)}{p(1-z)}. \tag{18}$$

Finally, since $W$ is the sum of the independent random variables $W_1$ and $W_2$, the PGF for the steady-state waiting time is

$$W(z) = Q(z) \cdot \frac{1 - A(z)}{p(1-z)}, \tag{19}$$

where $A(z)$ is given by (2) and (4) for $N < \infty$ and $N = \infty$, respectively. Differentiating (19) with respect to $z$ and taking the limit as $z \to 1$, we obtain the mean steady-state waiting time given by

$$\bar{W} = \bar{Q} + \frac{1}{2p} \left[ \overline{A^2} - \bar{A} \right]. \tag{20}$$

Since $\bar{A} = p$ and $\overline{A^2} = p^2 + p(1 - p/N)$, substituting the right-hand side of (8) into (20) yields

$$\bar{W} = \frac{N-1}{N} \cdot \frac{p}{2(1-p)} = \frac{N-1}{N}\, \bar{W}_{M/D/1} \tag{21}$$

where $\bar{W}_{M/D/1}$ denotes the mean waiting time for an M/D/1 queue. The mean waiting time $\bar{W}$, as a function of $p$, is shown in Fig. 5 for several values of $N$. Notice that Little's result and (8) generate the same formula for $\bar{W}$.

Fig. 5. The mean waiting time for several switch sizes $N$ with output queueing. (Horizontal axis: input trunk utilization $p$; vertical axis: mean waiting time.)

Rather than take the inverse transform of $W(z)$, it is easier to compute the steady-state waiting time probabilities from

$$\Pr[W = k] = \sum_{n=0}^{k} \Pr[W_1 = n] \cdot \Pr[W_2 = k - n] = \frac{1}{p} \sum_{n=0}^{k} q_n \sum_{i=k+1-n}^{\infty} a_i = \frac{1}{p} \sum_{n=0}^{k} q_n \left[ 1 - \sum_{i=0}^{k-n} a_i \right], \qquad k = 0, 1, \ldots \tag{22}$$
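A short Monte Carlo experiment gives an independent check on the mean waiting time (21). The following Python sketch simulates one tagged output queue under the stated assumptions (Bernoulli arrivals, uniform addressing, random service order within a batch); the function names, slot count, and seed are arbitrary illustrative choices, not part of the paper.

```python
import random

def simulate_output_queue(N, p, slots, seed=0):
    """Estimate the mean waiting time (in slots) at one tagged output queue."""
    rng = random.Random(seed)
    queue = 0        # Q_{m-1}: packets left over at the end of the previous slot
    total_wait = 0
    packets = 0
    for _ in range(slots):
        # Each input delivers a packet to the tagged output with probability p/N.
        arrivals = sum(rng.random() < p / N for _ in range(N))
        # The batch is served in random order behind the backlog, so its
        # members wait queue, queue + 1, ..., queue + arrivals - 1 slots.
        total_wait += arrivals * queue + arrivals * (arrivals - 1) // 2
        packets += arrivals
        queue = max(0, queue + arrivals - 1)   # equation (5)
    return total_wait / packets

def mean_wait_formula(N, p):
    """Closed-form mean waiting time from equation (21)."""
    return (N - 1) / N * p / (2 * (1 - p))

if __name__ == "__main__":
    for N in (2, 4, 16):
        print(N, simulate_output_queue(N, 0.6, 200_000, seed=1),
              mean_wait_formula(N, 0.6))
```

The simulated and closed-form values should agree to within sampling error, and both approach the M/D/1 value $p / (2(1-p))$ as $N$ grows.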
In (22), the $q_n$ are given by (13)-(15) and the $a_i$ are given by (1) and (3) for $N < \infty$ and $N = \infty$, respectively.

III. QUEUES ON INPUTS

The interesting analysis occurs when the switch fabric runs at the same speed as the input and output trunks, and packets are queued at the inputs. How much traffic can the switch accommodate before it saturates, and how much does the mean waiting time increase when we queue packets at the inputs rather than at the outputs?

As in the previous section, we assume that packet arrivals on the $N$ input trunks are governed by independent and identical Bernoulli processes. In any given time slot, the probability that a packet will arrive on a particular input is $p$; each packet has equal probability $1/N$ of being addressed to any given output. Each arriving packet goes, at least momentarily, into a FIFO on its input trunk. At the beginning of every time slot, the switch controller looks at the first packet in each FIFO. If every packet is addressed to a different output, the controller closes the proper crosspoints and all the packets go through. If $k$ packets are addressed to a particular output, one of the $k$ packets is chosen at random, each selected with equal probability $1/k$. The others wait until the next time slot, when a new selection is made among the packets that are then waiting.
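The contention rule just described is easy to simulate. The Python sketch below assumes the saturated regime, in which every input FIFO always has a packet at its head, and measures the resulting throughput per output; `saturation_throughput` and its parameters are hypothetical names chosen for illustration. For $N = 2$ the estimate can be compared with the value 0.7500 listed in Table I.

```python
import random

def saturation_throughput(N, slots, seed=0):
    """Estimate per-output throughput of a saturated input-queued switch
    with random selection among contending head-of-line packets."""
    rng = random.Random(seed)
    # Destination of the head-of-line packet at each (always backlogged) input.
    hol = [rng.randrange(N) for _ in range(N)]
    sent = 0
    for _ in range(slots):
        # Group inputs by the output their head-of-line packet requests.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for inputs in contenders.values():
            winner = rng.choice(inputs)       # each chosen with probability 1/k
            hol[winner] = rng.randrange(N)    # next packet moves to the head
            sent += 1
    return sent / (N * slots)

if __name__ == "__main__":
    for N in (2, 4, 8, 32):
        print(N, saturation_throughput(N, 20_000, seed=1))
```

Losing packets keep their destination and block everything queued behind them, which is why the estimates fall well below 1 as $N$ increases.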
A. Saturation Analysis-Random Selection Policy

Suppose the input queues are saturated so that packets are always waiting in every input queue. Whenever a packet is transmitted through the switch, a new packet immediately replaces it at the head of the input queue. We define $B_m^i$ as the number of packets at the heads of input queues that are "blocked" for output $i$ at the end of the $m$th time slot. In other words, $B_m^i$ is the number of packets destined for output $i$, but not selected by the controller during the $m$th time slot. We also define $A_m^i$ as the number of packets moving to the head of "free" input queues during the $m$th time slot and destined for output $i$. An input queue is "free" during the $m$th time slot if, and only if, a packet from it was transmitted during the $(m-1)$st time slot. The new packet "arrival" at the head of the queue has equal probability $1/N$ of being addressed to any given output. It follows that

$$B_m^i = \max(0,\; B_{m-1}^i + A_m^i - 1). \tag{23}$$

Although $B_m^i$ does not represent the occupancy of any physical queue, notice that (23) has the same mathematical form as (5). $A_m^i$, the number of packet arrivals during the $m$th time slot to free input queues and destined for output $i$, has the binomial probabilities

$$\Pr[A_m^i = k] = \binom{F_{m-1}}{k} (1/N)^k (1 - 1/N)^{F_{m-1} - k}, \qquad k = 0, 1, \ldots, F_{m-1} \tag{24}$$

where

$$F_{m-1} \triangleq N - \sum_{i=1}^{N} B_{m-1}^i. \tag{25}$$

$F_{m-1}$ is the number of free input queues at the end of the $(m-1)$st time slot, representing the total number of packets transmitted through the switch during the $(m-1)$st time slot. Therefore, $F_{m-1}$ is also the total number of input queues with new packets at their heads during the $m$th time slot. That is,

$$F_{m-1} = \sum_{i=1}^{N} A_m^i. \tag{26}$$

Notice that $\bar{F}/N = \rho_0$, where $\bar{F}$ is the mean steady-state number of free input queues and $\rho_0$ is the utilization of the output trunks (i.e., the switch throughput). As $N \to \infty$, the steady-state number of packets moving to the head of free input queues each time slot, and destined for output $i$, $(A^i)$ becomes Poisson at rate $\rho_0$ (see Appendix A). These observations and (23) together imply that we can use the results of Section II to obtain an expression for the mean steady-state value of $B^i$ as $N \to \infty$. Modifying (8), we have

$$\overline{B^i} = \frac{\rho_0^2}{2(1 - \rho_0)}. \tag{27}$$

However, using (25) and $\bar{F}/N = \rho_0$, we also have

$$\overline{B^i} = 1 - \rho_0. \tag{28}$$

It follows from (27) and (28) that $\rho_0 = (2 - \sqrt{2}) = 0.586$ when the switch is saturated and $N = \infty$. (Equating the two right-hand sides gives $\rho_0^2 - 4\rho_0 + 2 = 0$, whose only root between 0 and 1 is $2 - \sqrt{2}$.)

It is interesting to note that this same asymptotic saturation throughput has also been obtained in an entirely different context. Consider the problem of memory interference in synchronous multiprocessor systems [11], [12] in which $M$ memories are shared by $N$ processors. Memory requests are presented at the beginning of memory cycles; a conflict occurs if more than one simultaneous request is made to a particular memory. In the event of a conflict, one request is accepted, and the other requests are held for future memory cycles. If $M = N$ and processors always make a new memory request in the cycle immediately following their own satisfied request, then our saturation model for input queueing is identical to the multiprocessor model. As $N \to \infty$, the expected number of busy memories (per cycle) is $(2 - \sqrt{2}) \cdot N$ [11].

When the input queues are saturated and $N < \infty$, the switch throughput is found by analyzing a Markov chain model. Under saturation, the model is identical to the Markov chain analysis of memory interference in [12]. Unfortunately, the number of states grows exponentially with $N$, making the model useful only for small $N$. The results presented in Table I,⁴ however, illustrate the rapid convergence to the asymptotic throughput of 0.586. In addition, saturation throughputs obtained by simulation⁵ (Fig. 6) agree with the analysis.

⁴ The entries in Table I were obtained by normalizing (dividing by $N$) the values from [12, Table III].
⁵ Rather than plot the simulation results as discrete points, the saturation throughputs obtained for $N$ between 2 and 100 are simply connected by straight line segments. No smoothing is done on the data.

B. Increasing the Switch Throughput by Dropping Packets

Whenever $k$ packets are addressed for a particular output in a time slot, only one can be transmitted over the output trunk. We have been assuming that the remaining $k - 1$ packets wait in their input queues until the next time slot, when a new selection is made among the packets that are then waiting. Unfortunately, a packet that could traverse the switch to an idle output during the current time slot may have to wait in queue behind a packet whose output is currently busy. As shown in Section III-A, input queue blocking limits the switch throughput to approximately 0.586 for large $N$.

Instead of storing the remaining $k - 1$ packets in input queues, suppose we just drop them from the switch (i.e., we eliminate the input queues). Dropping packets obviously reduces the switch throughput when the input trunk utilization $p$ is small; more time slots on the output trunks are empty because new packets do not arrive fast enough to replace dropped packets. Although dropping a significant number of packets (say, more than 1 out of 1000) may not be realistic for a packet switch, it is interesting to note that as the input utilization $p$ increases, the reduction in input queue blocking when packets are dropped eventually outweighs the loss associated with dropping the packets.

We define $A_m^i$ as the number of packet arrivals during the $m$th time slot that are addressed for output $i$. $A_m^i$ has the binomial probabilities

$$\Pr[A_m^i = k] = \binom{N}{k} (p/N)^k (1 - p/N)^{N-k}, \qquad k = 0, 1, \ldots, N. \tag{29}$$

We also define the indicator function $I_m^i$ as follows:

$$I_m^i = \begin{cases} 1 & \text{if output trunk } i \text{ transmits a packet during the } m\text{th time slot} \\ 0 & \text{otherwise.} \end{cases} \tag{30}$$

When we drop packets, only those that arrive during the $m$th time slot have a chance to be transmitted over output trunks during the $m$th time slot. If they are not selected in the slot in which they arrive, they are dropped. Consequently, for each output trunk $i$, the random variables $I_r^i$ and $I_s^i$ $(r \ne s)$ are independent and identically distributed, with probabilities

$$\Pr[I_m^i = 1] = \Pr[A_m^i > 0] = 1 - (1 - p/N)^N. \tag{31}$$
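The probability in (31) can be checked by direct simulation of the packet-dropping switch. The Python sketch below draws independent Bernoulli arrivals with uniformly chosen destinations and counts how many distinct outputs are requested in each slot; the function names, slot count, and seed are illustrative assumptions rather than anything specified in the paper.

```python
import random

def drop_throughput(N, p, slots, seed=0):
    """Fraction of output-trunk slots that carry a packet when every
    losing packet is dropped at the end of its arrival slot."""
    rng = random.Random(seed)
    busy = 0
    for _ in range(slots):
        requested = set()
        for _ in range(N):
            if rng.random() < p:                 # a packet arrives on this input
                requested.add(rng.randrange(N))  # uniformly addressed output
        busy += len(requested)   # one packet per requested output gets through
    return busy / (N * slots)

def drop_throughput_formula(N, p):
    """Probability that a given output is busy, from equation (31)."""
    return 1 - (1 - p / N) ** N

if __name__ == "__main__":
    for N in (2, 8, 64):
        print(N, drop_throughput(N, 0.9, 20_000, seed=1),
              drop_throughput_formula(N, 0.9))
```

Since no state carries over between slots in this model, each slot is an independent trial, which is why a plain per-slot average suffices.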
566
THE BEST OF THE BEST
TABLE I THE MAXIMUM THROUGHPUT ACHIE VABLE WITH INPUT Q UEUEING
N
Satu ration T hroughput
1
1.0000
2
0.7500
3
1.0
0.9
Z
0.8
0.6825
~:J
4
0.6553
0.7
::>
5
0.6399
0
>=
'" z
::>
a:
l-
6
0.6302
7
0.6234
8
0.6184
06
N=CXI - - - - - - - - -
N = 10 -
I-
::>
N=4 - - - N= 3 - - - -
0..
I-
::>
Q
0.5
::> 0..
e
-~_"'~
-~.o'-":;..'
- ...... ...,..,"'
0.4
::>
0.5858
00
- -
N=2 - - - N= l----,
I-
r
- - -
0
a: r r
0.3
~
0.2
I-
g
0.8
0.75
0:1
0.7
0
I-
::>
c..
:x:
o
0
::> 0
0.2
ex:
:x: 0.65 z 0 ;:: I-
«
ex:
::>
I-
Fig. 7.
0.6
0.8
The switch throughput when packets are dropped, rather than queued at the inputs.
0.6 TABLE II THE STRATEGY (AS A FuNCTION OF p ) , INP UT Q UEUEING, OR PAC KET DROPPING THAT YIELDS THE LARGER SWITCH THROUGHPUT
0.586
«(f)
0.55
N 0.5
~....I.....I-L..J.....L....L...l...J......I....L....L....I....J.....I....I.....1...L....L.:::I
o
20
40
60
80
1
100
SWITCH SIZE . N
2
Fig. 6. The maximum throughput achievable with input queueing.
3 4
By symmetry, I - (l - p/N)N is the utilization of each output trunk; the switch throughput Po is given by
5
(32)
6
Po= l-(l-p/N) N. As N ......
04
INPUT TRUNK UTILIZATION (p)
7
00,
Po=l-e- p •
(33)
The probability that an arbitrary packet will be dropped from the switch is simply $1 - \rho_0/p$. The switch throughput $\rho_0$, as a function of p, is shown in Fig. 7 for several values of N. When the utilization p of the input trunks exceeds a critical threshold, the switch throughput $\rho_0$ is larger when we drop packets [(32) and (33)] than when we queue them on the input trunks (Table I). For example, when N = ∞ and $p > \ln(1 + \sqrt{2})$, the switch throughput when we drop packets is greater than $2 - \sqrt{2}$, the throughput with input queues. Table II lists, as a function of p, which of the two strategies yields the larger switch throughput.

TABLE II
N    Queues on Inputs,       Queues on Inputs,       Drop Packets
     Finite Queue Sizes      Saturated Queues
1    0 ≤ p < 1               p = 1                   -
2    0 ≤ p < 0.750           0.750 ≤ p ≤ 1           -
3    0 ≤ p ≤ 0.682           0.683 ≤ p ≤ 0.953       0.954 ≤ p ≤ 1
4    0 ≤ p ≤ 0.655           0.656 ≤ p ≤ 0.935       0.936 ≤ p ≤ 1
5    0 ≤ p ≤ 0.639           0.640 ≤ p ≤ 0.923       0.924 ≤ p ≤ 1
6    0 ≤ p ≤ 0.630           0.631 ≤ p ≤ 0.916       0.917 ≤ p ≤ 1
7    0 ≤ p ≤ 0.623           0.624 ≤ p ≤ 0.911       0.912 ≤ p ≤ 1
8    0 ≤ p ≤ 0.618           0.619 ≤ p ≤ 0.907       0.908 ≤ p ≤ 1
∞    0 ≤ p ≤ 0.585           0.586 ≤ p ≤ 0.881       0.882 ≤ p ≤ 1

C. Waiting Time-Random Selection Policy

Below saturation, packet waiting time is a function of the service discipline the switch uses when two or more input queues are waiting to transmit packets to the same output. In this section, we derive an exact formula for the mean waiting
time under the random selection policy for the limiting case of N = ∞. The waiting time is obtained by simulation for finite values of N. In Section III-D, numerical results are compared to the mean waiting time under the longest queue and fixed priority selection policies.

When the input queues are not saturated, there is a significant difference between our analysis of a packet switch with input queues and the analysis of memory interference in synchronous multiprocessor systems. The multiprocessor application assumes that new memory requests are generated only after a previous request has been satisfied. A processor never has more than one memory request waiting at any time. In our problem, however, packet queueing on the input trunks impacts the switch performance.

Fig. 8. The discrete-time Geom/G/1 input queueing model used to derive an exact formula for the mean waiting time for the limiting case of N = ∞.

A discrete-time Geom/G/1 queueing model (Fig. 8) is used to determine the expected packet waiting time for the limiting case of N = ∞. The arrival process is Bernoulli: in any given time slot, the probability that a packet will arrive on a particular input is p, where $0 < p < 2 - \sqrt{2}$. Each packet has equal probability 1/N of being addressed to any given output, and successive packets are independent.

To obtain the service distribution, suppose the packet at the head of input queue i is addressed for output j. The "service time" for the packet consists of the wait until it is randomly selected by the switch controller, plus one time slot for its transmission through the switch and onto output trunk j. As $N \to \infty$, successive packets in input queue i experience the same service distribution because their destination addresses are independent and distributed uniformly over all N outputs. Furthermore, the steady-state number of packet "arrivals" to the heads of input queues and addressed for output j becomes Poisson at rate p (see footnote 6). Consequently, the service distribution for the discrete-time Geom/G/1 model is itself the packet delay distribution of another queueing system: a discrete-time M/D/1 queue with customers served in random order. Analysis of the discrete-time M/D/1 queue, with packets served in random order, is given in Appendix B.

Using [6, eq. (39)], the mean packet delay for a discrete-time Geom/G/1 queue is

$$\bar{D} = \frac{p\,\overline{S(S-1)}}{2(1 - p\bar{S})} + \bar{S} \tag{34}$$

where S is a random variable with the service time distribution given in Appendix B and mean value $\bar{S}$. The mean waiting time is $\bar{W} = \bar{D} - 1$:

$$\bar{W} = \frac{p\,\overline{S(S-1)}}{2(1 - p\bar{S})} + \bar{S} - 1. \tag{35}$$

$\overline{S(S-1)}$ and $\bar{S}$ are determined numerically using the method in Appendix B.

Fig. 9. A comparison of the mean waiting time for input queueing and output queueing for the limiting case of N = ∞. (Horizontal axis: offered load p.)

The mean waiting time $\bar{W}$, as a function of p, is shown in Fig. 9 for both input queueing and output queueing, in the limit as $N \to \infty$. As expected, waiting times are always greater for queueing on inputs than for queueing on outputs. Packet waiting times for input queueing and finite values of N, obtained by simulation (see footnote 7), agree with the asymptotic analytic results (Fig. 10).
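The following is a minimal numerical sketch of how (35) might be evaluated. It assumes the Appendix B recursion takes the form $P_{1,k} = 1/k$ and $P_{m,k} = \frac{k-1}{k}\sum_{j \ge 0} P_{m-1,k-1+j}\, e^{-\lambda}\lambda^j/j!$, and that the tagged packet finds the packets left over from the previous slot plus a Poisson number of companions that arrive with it. The function names, the truncation limits `kmax` and `mmax`, and the use of repeated iteration to find the queue size probabilities are illustrative choices, not the authors' procedure.

```python
import math


def poisson_pmf(lam, jmax):
    """Poisson(lam) probabilities for j = 0..jmax."""
    pmf = [math.exp(-lam)]
    for j in range(1, jmax + 1):
        pmf.append(pmf[-1] * lam / j)
    return pmf


def queue_size_probs(lam, kmax, sweeps=400):
    """Steady-state probabilities of the number left in a discrete-time M/D/1
    queue at the end of a slot, found by iterating Q' = max(Q + A - 1, 0)."""
    a = poisson_pmf(lam, kmax + 1)
    q = [1.0] + [0.0] * kmax
    for _ in range(sweeps):
        nxt = [0.0] * (kmax + 1)
        for n, qn in enumerate(q):
            for j, aj in enumerate(a):
                nxt[min(max(n + j - 1, 0), kmax)] += qn * aj
        q = nxt
    return q


def service_time_pmf(lam, kmax=40, mmax=200):
    """Pr[S = m] for m = 0..mmax (entry 0 is zero) with random-order service."""
    a = poisson_pmf(lam, kmax)
    q = queue_size_probs(lam, kmax)
    # k packets present just after the tagged packet arrives: n left over,
    # the tagged packet itself, and k - 1 - n others from its Poisson batch.
    seen = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        seen[k] = sum(q[n] * a[k - 1 - n] for n in range(k))
    row = [0.0] + [1.0 / k for k in range(1, kmax + 1)]  # P_{1,k}
    pmf = [0.0, sum(row[k] * seen[k] for k in range(1, kmax + 1))]
    for _ in range(2, mmax + 1):
        new = [0.0] * (kmax + 1)  # P_{m,1} = 0 for m > 1
        for k in range(2, kmax + 1):
            tail = sum(row[k - 1 + j] * a[j] for j in range(kmax - k + 2))
            new[k] = (k - 1) / k * tail
        pmf.append(sum(new[k] * seen[k] for k in range(1, kmax + 1)))
        row = new
    return pmf


def mean_waiting_time(p, pmf):
    """Mean waiting time from (35), given the service time probabilities."""
    s_mean = sum(m * pr for m, pr in enumerate(pmf))
    s_fact2 = sum(m * (m - 1) * pr for m, pr in enumerate(pmf))
    return p * s_fact2 / (2.0 * (1.0 - p * s_mean)) + s_mean - 1.0
```

A convenient sanity check is that the computed mean service time should agree with 1 + p/(2(1 − p)), since the mean delay of this M/D/1 queue does not depend on the order of service.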
Footnote 6: This follows from the proof in Appendix A.
Footnote 7: Rather than plot the simulation results as discrete points, the simulation results are simply connected by straight line segments; no smoothing is done on the data. The same comment applies to Figs. 11, 12, and 13.

Fig. 10. The mean waiting time for several switch sizes N with input queueing. (Horizontal axis: input trunk utilization p.)
D. Longest Queue and Fixed Priority Selection Policies

Until now, we have assumed that if k packets are addressed to a particular output, one of the k packets is chosen at random, each selected with equal probability 1/k. In this section, we consider two other selection policies: longest queue selection, and fixed priority selection. Under the longest queue selection policy, the controller sends the packet from the longest input queue. A random selection is made if, of the k input queues with packets addressed to a particular output, several queues have the same maximum length. Under the fixed priority selection policy, the N inputs have fixed priority levels, and of the k packets, the controller sends the one with highest priority.

Fig. 11. The mean waiting time for input queueing with the random and longest queue selection policies. (Curves shown for random and longest queue selection with N = 2 and N = 16; horizontal axis: input trunk utilization p.)

Simulation results for the longest queue policy indicate smaller packet waiting times than those expected with random service (Fig. 11). This is anticipated because the longest queue selection policy reduces the expected number of packets blocked (behind other packets) from traversing the switch to idle outputs.

For the fixed priority service discipline, our simulation results show that the lowest priority input queue suffers large delays and is sometimes saturated, even when the other two service disciplines guarantee stability. Although the saturation throughput is 0.6553 under the random selection policy when N = 4 (see Table I), it is shown in Fig. 12 that the lowest priority input queue saturates at approximately 0.55 under the fixed priority discipline. Fig. 13 illustrates the family of waiting time curves for N = 8. These results are interesting because imposing a priority scheme on a single server queueing system usually does not affect its stability; the system remains work conserving. For the N × N packet switch, however, packet blocking at the low priority input queues does impact stability. More work remains to characterize the stability region.
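The three selection policies can be compared with a small slotted simulation. The sketch below is only illustrative: the function name, the timing convention (packets arrive at the start of a slot and may be served in the same slot, so a packet that is never blocked waits zero slots), and the choice of input 0 as the highest priority input are assumptions, not details taken from the paper's simulator.

```python
import random
from collections import deque


def simulate_switch(n, p, policy, slots=20000, seed=1):
    """Mean waiting time (in slots) per input of an n x n input-queued switch.

    policy: "random", "longest", or "priority" (input 0 has highest priority).
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(n)]  # entries: (arrival slot, output)
    waited = [0] * n
    served = [0] * n
    for t in range(slots):
        # Bernoulli arrivals, each addressed to a uniformly chosen output
        for q in queues:
            if rng.random() < p:
                q.append((t, rng.randrange(n)))
        # group inputs by the output their head-of-line packet requests
        contenders = {}
        for i, q in enumerate(queues):
            if q:
                contenders.setdefault(q[0][1], []).append(i)
        # each requested output accepts exactly one head-of-line packet
        for inputs in contenders.values():
            if policy == "random":
                winner = rng.choice(inputs)
            elif policy == "longest":
                longest = max(len(queues[i]) for i in inputs)
                winner = rng.choice([i for i in inputs if len(queues[i]) == longest])
            elif policy == "priority":
                winner = min(inputs)
            else:
                raise ValueError(policy)
            arrival, _ = queues[winner].popleft()
            waited[winner] += t - arrival
            served[winner] += 1
    return [w / s if s else 0.0 for w, s in zip(waited, served)]


if __name__ == "__main__":
    for policy in ("random", "longest", "priority"):
        print(policy, simulate_switch(4, 0.5, policy))
```

Under these conventions the highest priority input never waits, while the lowest priority input absorbs the blocking, which is the qualitative effect described above.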
IV. CONCLUSION

Using Markov chain models, queueing theory, and simulation, we have presented a thorough comparison of input versus output queueing on an N × N nonblocking space-division packet switch. What the present exercise has done, for a particular solvable example, is to quantify the intuition that better performance results with output queueing than with input queueing. Besides performance, of course, there are other issues, such as switch implementation, that must be considered in designing a space-division packet switch. The Knockout Switch [13] is an example of a space-division packet switch that places all buffers for queueing packets at the outputs of the switch, thus enjoying the performance advantages of output queueing. Furthermore, the switch fabric runs at the same speed as the input and output trunks.

Fig. 12. The mean waiting time for input queueing with the fixed priority service discipline and N = 4. (Horizontal axis: input trunk utilization p.)

Fig. 13. The mean waiting time for input queueing with the fixed priority service discipline and N = 8. (Horizontal axis: input trunk utilization p.)
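APPENDIX A
POISSON LIMIT OF PACKETS MOVING TO THE HEAD OF FREE INPUT QUEUES

For the input queueing saturation analysis presented in Section III-A, as $N \to \infty$, we show that the steady-state number of packets moving to the head of free input queues each time slot, and destined for output i, ($A^i$) becomes Poisson at rate $\rho_0 = \bar{F}/N$. To make clear the dependency of F and $\rho_0$ on the number of inputs N, we define F(N) as the steady-state number of free input queues and $\rho_0(N)$ as the output trunk utilization for a given N, where $\rho_0(N) = \bar{F}(N)/N$. We can write

$$\mathrm{Var}\left[\frac{F(N)}{N}\right] = \frac{1}{N}\Pr[\text{input queue } r \text{ is free}] + \left(1 - \frac{1}{N}\right)\Pr[\text{input queue } r \text{ is free, input queue } s\ (s \ne r) \text{ is free}] - \left(\Pr[\text{input queue } r \text{ is free}]\right)^2. \tag{A1}$$

As $N \to \infty$, the events {input queue r is free} and {input queue s is free} become independent for $s \ne r$. Therefore, from (A1),

$$\lim_{N \to \infty} \mathrm{Var}\left[\frac{F(N)}{N}\right] = 0. \tag{A2}$$

Given $\epsilon > 0$, we define the set $S_N$ by

$$S_N \triangleq \{L, L+1, \ldots, U-1, U\} \tag{A3}$$

where

$$L \triangleq \max\{1, \lfloor \bar{F}(N) - \epsilon N \rfloor\}, \tag{A4}$$

$$U \triangleq \min\{N, \lceil \bar{F}(N) + \epsilon N \rceil\}, \tag{A5}$$

and $\lfloor x \rfloor$ ($\lceil x \rceil$) denotes the greatest (smallest) integer less than (greater than) or equal to x. By the Chebyshev inequality,

$$\Pr[F(N) \notin S_N] \le \frac{\mathrm{Var}[F(N)/N]}{\epsilon^2}. \tag{A6}$$

Therefore,

$$\Pr[A^i = a] = \sum_{f=\max(a,1)}^{N} \binom{f}{a} (1/N)^a (1 - 1/N)^{f-a} \Pr[F(N) = f] \le \sum_{f \in S_N} \binom{f}{a} (1/N)^a (1 - 1/N)^{f-a} \Pr[F(N) = f] + \frac{\mathrm{Var}[F(N)/N]}{\epsilon^2}. \tag{A7}$$

Case I (a = 0): If $f \in S_N$, then

$$(1 - 1/N)^U \le \binom{f}{0} (1/N)^0 (1 - 1/N)^{f-0} \le (1 - 1/N)^L. \tag{A8}$$

Therefore, from (A6), (A7), and (A8), we obtain

$$\left[1 - \frac{\mathrm{Var}[F(N)/N]}{\epsilon^2}\right] (1 - 1/N)^U \le \Pr[A^i = 0] \le (1 - 1/N)^L + \frac{\mathrm{Var}[F(N)/N]}{\epsilon^2}. \tag{A9}$$

As $N \to \infty$,

$$e^{-(\rho_0 + \epsilon)} \le \lim_{N \to \infty} \Pr[A^i = 0] \le e^{-(\rho_0 - \epsilon)}. \tag{A10}$$

Since this holds for arbitrarily small $\epsilon > 0$,

$$\lim_{N \to \infty} \Pr[A^i = 0] = e^{-\rho_0}. \tag{A11}$$

Case II (a > 0): Since $\binom{f}{a}(1/N)^a(1 - 1/N)^{f-a}$ is a nondecreasing function of f for $1 \le f \le N$ and a > 0, for $f \in S_N$ we have

$$\binom{L}{a} (1/N)^a (1 - 1/N)^{L-a} \le \binom{f}{a} (1/N)^a (1 - 1/N)^{f-a} \le \binom{U}{a} (1/N)^a (1 - 1/N)^{U-a}. \tag{A12}$$

Therefore, from (A6), (A7), and (A12), we obtain

$$\left[1 - \frac{\mathrm{Var}[F(N)/N]}{\epsilon^2}\right] \binom{L}{a} (1/N)^a (1 - 1/N)^{L-a} \le \Pr[A^i = a] \le \binom{U}{a} (1/N)^a (1 - 1/N)^{U-a} + \frac{\mathrm{Var}[F(N)/N]}{\epsilon^2}. \tag{A13}$$

As $N \to \infty$,

$$\frac{(\rho_0 - \epsilon)^a e^{-(\rho_0 - \epsilon)}}{a!} \le \lim_{N \to \infty} \Pr[A^i = a] \le \frac{(\rho_0 + \epsilon)^a e^{-(\rho_0 + \epsilon)}}{a!}. \tag{A14}$$

Since this holds for arbitrarily small $\epsilon > 0$,

$$\lim_{N \to \infty} \Pr[A^i = a] = \frac{\rho_0^a e^{-\rho_0}}{a!}. \tag{A15}$$

APPENDIX B
DISCRETE-TIME M/D/1 QUEUE-PACKETS SERVED IN RANDOM ORDER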
In this Appendix, we present a simple numerical method for computing the delay distribution of a discrete-time M/D/1 queue, with packets served in random order. The number of packet arrivals at the beginning of each time slot is Poisson distributed with rate $\lambda$, and each packet requires one time slot for service. We fix our attention on a particular "tagged" packet in the system, during a given time slot. Let $P_{m,k}$ denote the probability, conditioned on there being a total of k packets in the system during the given time slot, that the remaining delay is m time slots until the tagged packet completes service. It is easy to obtain $P_{m,k}$ by recursion on m.

$$P_{1,1} = 1 \tag{B1}$$

$$P_{m,1} = 0, \qquad m \ne 1 \tag{B2}$$

$$P_{1,k} = \frac{1}{k}, \qquad k > 1 \tag{B3}$$

$$P_{m,k} = \frac{k-1}{k} \sum_{j=0}^{\infty} P_{m-1,\,k-1+j}\, \frac{e^{-\lambda}\lambda^j}{j!}, \qquad m > 1,\ k > 1. \tag{B4}$$

Averaging over k, the packet delay D has the probabilities

$$\Pr[D = m] = \sum_{k=1}^{\infty} P_{m,k} \cdot \Pr[k \text{ packets in system immediately after the tagged packet arrives}] \tag{B5}$$

where the probability that k packets are in the system immediately after the tagged packet arrives is expressed in terms of the $Q_n$, the steady-state queue size probabilities given by (13)-(15). The variance and mean of the packet delay distribution are determined numerically from the delay probabilities in (B5).

REFERENCES

[1] J. S. Turner and L. F. Wyatt, "A packet network architecture for integrated services," in GLOBECOM'83 Conf. Rec., Nov. 1983, pp. 45-50.
[2] J. J. Kulzer and W. A. Montgomery, "Statistical switching architectures for future services," in Proc. Int. Switching Symp., May 1984.
[3] T.-Y. Feng, "A survey of interconnection networks," Computer, vol. 14, pp. 12-27, Dec. 1981.
[4] D. M. Dias and M. Kumar, "Packet switching in N log N multistage networks," in GLOBECOM'84 Conf. Rec., Nov. 1984, pp. 114-120.
[5] Y.-C. Jenq, "Performance analysis of a packet switch based on a single-buffered Banyan network," IEEE J. Select. Areas Commun., vol. SAC-1, pp. 1014-1021, Dec. 1983.
[6] T. Meisling, "Discrete-time queueing theory," Oper. Res., vol. 6, pp. 96-105, Jan.-Feb. 1958.
[7] I. Rubin, "Access-control disciplines for multiaccess communication channels: Reservation and TDMA schemes," IEEE Trans. Inform. Theory, vol. IT-25, pp. 516-536, Sept. 1979.
[8] L. Kleinrock, Queueing Systems, Vol. 1: Theory. New York: Wiley, 1975.
[9] D. Gross and C. M. Harris, Fundamentals of Queueing Theory. New York: Wiley, 1974.
[10] P. J. Burke, "Delays in single-server queues with batch input," Oper. Res., vol. 23, pp. 830-833, July-Aug. 1975.
[11] F. Baskett and A. J. Smith, "Interference in multiprocessor computer systems with interleaved memory," Commun. ACM, vol. 19, pp. 327-334, June 1976.
[12] D. P. Bhandarkar, "Analysis of memory interference in multiprocessors," IEEE Trans. Comput., vol. C-24, pp. 897-908, Sept. 1975.
[13] Y. S. Yeh, M. G. Hluchyj, and A. S. Acampora, "The knockout switch: A simple, modular architecture for high-performance packet switching," in Proc. Int. Switching Symp., Phoenix, AZ, Mar. 1987, pp. 801-808.

Mark J. Karol (S'79-M'85) was born in Jersey City, NJ, on February 28, 1959. He received the B.S. degree in mathematics and the B.S.E.E. degree in 1981 from Case Western Reserve University, Cleveland, OH, and the M.S.E., M.A., and Ph.D. degrees in electrical engineering from Princeton University, Princeton, NJ, in 1982, 1984, and 1985, respectively.
Since 1985 he has been a member of the Network Systems Research Department at AT&T Bell Laboratories, Holmdel, NJ. His current research interests include local and metropolitan area lightwave networks, and wide-band circuit- and packet-switching architectures.

Michael G. Hluchyj (S'75-M'82) was born in Erie, PA, on October 23, 1954. He received the B.S.E.E. degree in 1976 from the University of Massachusetts, Amherst, and the S.M., E.E., and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1978, 1978, and 1981, respectively.
From 1977 to 1981 he was a Research Assistant in the Data Communication Networks Group at the M.I.T. Laboratory for Information and Decision Systems, where he investigated fundamental problems in packet radio networks and multiple access communications. In 1981 he joined the Technical Staff at Bell Laboratories, where he worked on the architectural design and performance analysis of local area networks. In 1984 he transferred to the Network Systems Research Department at AT&T Bell Laboratories, performing fundamental and applied research in the areas of high-performance, integrated communication networks and multiuser lightwave networks. In June 1987 he assumed his current position as Director of Networking Research and Advanced Development at Codex Corporation. His current research interests include wide-band circuit- and packet-switching architectures, integrated voice and data networks, and local area network interconnects.
Dr. Hluchyj is active in the IEEE Communications Society and is a member of the Technical Editorial Board for the IEEE NETWORK Magazine.

Samuel P. Morgan (SM'55-F'63) was born in San Diego, CA, on July 14, 1923. He received the B.S., M.S., and Ph.D. degrees, all in physics, from the California Institute of Technology, Pasadena, in 1943, 1944, and 1947, respectively.
He is a Distinguished Member of the Technical Staff at AT&T Bell Laboratories, Murray Hill, NJ. He joined Bell Laboratories in 1947, and for a number of years he was concerned with applications of electromagnetic theory to microwave antennas and to problems of waveguide and coaxial cable transmission. From 1959 to 1967 he was Head, Mathematical Physics Department, and from 1967 to 1982 he was Director, Computing Science Research Center. His current interests include queueing and congestion theory in computer-communication networks.
Dr. Morgan is a member of the American Physical Society, the Society for Industrial and Applied Mathematics, the Association for Computing Machinery, the American Association for the Advancement of Science, and Sigma Xi.
Routing in the Manhattan Street Network

NICHOLAS F. MAXEMCHUK, SENIOR MEMBER, IEEE

Abstract-The Manhattan Street Network is a regular, two-connected network, designed for packet communications in a local or metropolitan area. It operates as a slotted system, similar to conventional loop networks. Unlike loop networks, routing decisions must be made at every node in this network. In this paper, several distributed routing rules are investigated that take advantage of the regular structure of the network. In an operational network, irregularities occur in the structure because of the addressing mechanisms, adding single nodes, and failures. A fractional addressing scheme is described that makes it possible to add new rows or columns to the network without changing the addresses of existing nodes. A technique is described for adding one node at a time to the network, while changing only two existing links. Finally, two procedures are described that allow the network to adapt to node or link failures. The effect that irregularities have on routing mechanisms designed for a regular structure is investigated.
I. INTRODUCTION

The Manhattan Street Network (MSN) [1], Fig. 1, is a two-connected, regular network with unidirectional links. The links are arranged in a structure that resembles the streets and avenues in Manhattan. The MSN topology is being applied to a local or metropolitan area packet communication system. The nodes in the MSN are described in Section II. The structure of the nodes and the access strategy are similar to those in a slotted loop system. The principal difference between the MSN and a loop network is that there are two links arriving at and leaving each node instead of a single link, and a routing decision must be made for each packet transmitted at each node. An experimental network is being constructed with a 50 Mbit/s transmission rate on each link and 128-bit fixed-size packets. More than 750 000 routing decisions per second may have to be made at each node in this network. In this type of network, the routing rule must be simple.

In this paper, distributed routing rules for the MSN are investigated. Simple routing rules that use the regular structure of the network are compared to shortest path algorithms and random routing strategies. In the MSN, shown in Fig. 1, the number of rows and columns completely defines the network, and if these numbers are known, the shortest path between any pair of nodes can be determined. In addition, because of the cyclic structure of the MSN, routing is only dependent upon the relative location of the current node with respect to the destination, as defined in Section III-B, and the same routing rule can be used at every node. In Section IV-A, a distributed rule is described that finds the shortest path. In Sections IV-B and IV-C, two simplifications of the shortest path rule are described. The simplified rules do not always find the shortest path, and the effect that these rules have on the average path length is investigated in Section IV-E.

In Section III-A, a fractional addressing scheme is described. This addressing scheme has two advantages over the integer addressing scheme in Fig. 1.
Paper approved by the Editor for Wide Area Networks of the IEEE Communications Society. Manuscript received November 25, 1985; revised December 11,1986. The author is with AT&T Bell Laboratories, Murray Hill, NJ 07974. IEEE Log Number 8613925.
Fig. 1. 36-node MSN.
1) New rows or columns are added to the network without changing the addresses of existing nodes.
2) The distributed routing rules are independent of the number of rows or columns in the network.

A disadvantage of fractional addressing is that the distributed routing rules must operate without knowing the position of all of the rows and columns in the network and cannot always find the shortest path to the destination. The effect that this addressing scheme has on the average distance between nodes is investigated in Section IV-E.

The MSN is a regular structure with an even number of rows and columns and is not defined for an arbitrary number of nodes. A realistic network, in which nodes are added and other nodes or links fail, may be approximated by the MSN, but it is unlikely that it will exactly correspond to the regular structure. The routing rule that is selected for the MSN must operate in networks with irregularities. The effect that irregularities have on the routing rules depends upon the techniques used to add nodes and remove failed components.

In Section V-A, a technique is described for adding one node at a time to the network. When this technique is used, only two links must be changed when a new node is added to the network. As nodes are added, a row or column may not have a full complement of nodes. The routing rules operate without knowing which rows or columns are incomplete. In Sections V-B and V-C, procedures are described that allow the network to adapt to node or link failures. The adaptations guarantee that the nodes continue to operate without losing packets at any of the surviving nodes. The routing rules operate without knowing which nodes or links have failed.

Reprinted from IEEE Transactions on Communications, vol. COM-35, no. 5, May 1987.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

II. SYSTEM DESCRIPTION

The MSN, Fig. 1, is a member of a class of multiply-connected, regular, mesh-configured networks. There is an even number of rows and columns with two links arriving at and two links leaving each node. Logically, the links form a grid on the surface of a torus, with links in adjacent rows or columns traveling in opposite directions. In [2], it has been demonstrated that, because of the increased connectivity, mesh networks can achieve higher throughputs and support more sources than conventional loop [3]-[5] and bus [6], [7] networks. This occurs because of the following.

1) On the average, a smaller fraction of the links in the network are used to interconnect a source and destination.
2) Sources that communicate frequently can be clustered into communities of interest that do not interfere with one another.

In a network with several paths arriving at a node, messages from more than one incoming link may be destined for the same outgoing link. Data from several links can be concentrated onto one link by storing the data, forwarding them when the link is available, and establishing protocols to recover messages that are lost because of buffer overflows. In [2], a slotted system is described that does not require buffering on the output links and does not lose packets because of buffer overflows. The structure of a node in this network is shown in Fig. 2.

Fig. 2. The structure of a node in a two-connected network with fixed size packets.

The packets in the slotted system are a fixed size. A node periodically transmits a packet from an input line, a packet from the source, or an empty packet on each output line. At each node, the packets from the input lines are delayed so that they arrive at the switch at the time that the node transmits a packet. The node switches each of the incoming packets not destined for the node to one of the output links. If the buffer for an output link is full, and two incoming packets are destined for this link, one of the packets is forced to take the other output link. This strategy guarantees that packets are never lost because of buffer overflow, even if the output buffer size at the nodes is reduced to zero; however, the larger the buffers at a node, the less likely it is that a packet must be misdirected.

Packets from the source are only transmitted when there is an empty slot on an output link. The node controls the source so that packets do not arrive faster than they are transmitted, and the rate available to the source decreases when the network is busy. Packets that are misdirected take a longer path to their destination and prevent more new packets from entering the network. Therefore, there is a tradeoff between buffer size and the throughput of the network.

In the MSN, each time a packet is misdirected, the length of the path to the destination is increased by at most four links. In addition, there are many nodes for which either outgoing link provides the same path length to the destination, and when a packet may take either link, the probability any packet will have to be misdirected decreases. A recently published analysis and simulation of the MSN [8] indicates that the MSN operates reasonably efficiently without buffers on the outgoing links, and the experimental system that is being implemented does not have buffers.

III. ADDRESSING NODES IN THE NETWORK
Each node in the network has a unique address that is called the node's absolute address. To simplify the routing rule, the absolute address of a node reflects the regular structure of the MSN. Because of the cyclic nature of the MSN, routing only depends on the relative position of the current node and the destination, called the current node's relative address, and not on the absolute address of any node. Relative addresses allow the same routing rule to be used at each node.

A. Absolute Addresses

In Fig. 1, the rows in the MSN are sequentially numbered from 0 to m - 1, the columns are numbered from 0 to n - 1, and the absolute address of a node is its row and column. The odd-numbered rows have links in one direction and the even-numbered rows have links in the opposite direction. New rows or columns are added in pairs to preserve the alternating directions, and the address of an existing row or column changes when new elements are added in the middle of the network. To reduce the effect of changing addresses on the communications routines at the source, there must be a transformation between a logical address by which the source refers to the destination and a physical address that is the destination node's current row and column. The transformation need only be performed at the source node of a packet; however, a transformation table must be maintained at every node in the network, and a protocol must be developed to update the table as physical addresses change.

An alternative to changing the address of a node when new rows and columns are added is to plan for expansion by not using all of the addresses initially. For instance, in the initial implementation of the network, the rows may be numbered 0, 11, 22, ... so that ten new rows can be added between each of the initial rows. By leaving an even number of integers between assigned rows or columns, the alternating direction can be retained as new rows are added. The spacing between rows can be decreased when nodes in the same community of interest are in adjacent rows, and it can be increased where new communities of interest may be inserted. This approach requires careful planning because the network growth must be predicted when the initial network is designed. The planning can be reduced by using fractional addresses rather than integer addresses. Fractional addresses allow an arbitrary number of pairs of rows to be added at any position in the network.

The fractional addressing scheme that has been selected is shown in Fig. 3. The first two rows or columns are labeled 0 and 1. Rows are added in pairs and are labeled as two fractions, 1/3 of the way between two other rows. For instance, two rows added between 0 and 1 are labeled 1/3 and 2/3, and two rows added between 2/3 and 1 are labeled 7/9 and 8/9. New rows that are added between 1 and 0 are considered to be between 1 and 2 so that they have different addresses from the rows between 0 and 1. For instance, two rows added between 1 and 0 are labeled 4/3 and 5/3, and two rows between 5/3 and 0 are labeled 16/9 and 17/9. Fractional addressing does not constrain the total number of rows that can be added to the network or the number of rows that can be added to a community of interest in a particular area of the network. In addition, the fractional addressing scheme selected guarantees that all rows with an even numerator have links in one direction and all rows with an odd numerator have links in the opposite direction, as in the integer addressed system.
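The labeling rule can be sketched in a few lines, reading "1/3 of the way between two other rows" as placing the new pair at one third and two thirds of the gap between adjacent rows. The function names are hypothetical; the labels it produces are the ones quoted in the paragraph above.

```python
from fractions import Fraction


def new_row_pair(lower, upper):
    """Labels for two rows inserted between adjacent rows `lower` and `upper`."""
    lower, upper = Fraction(lower), Fraction(upper)
    if upper <= lower:
        # wrapping from the last row back to row 0: treat row 0 as 2
        upper += 2
    step = (upper - lower) / 3
    return lower + step, lower + 2 * step


def numerator_is_even(label):
    """Direction indicator: rows with even numerators share one link direction."""
    return Fraction(label).numerator % 2 == 0


if __name__ == "__main__":
    print(new_row_pair(0, 1))               # rows between 0 and 1
    print(new_row_pair(Fraction(2, 3), 1))  # rows between 2/3 and 1
    print(new_row_pair(1, 0))               # rows between 1 and 0 (wrap)
```

Because every denominator is a power of three, reducing a fraction to lowest terms never changes the parity of its numerator, so adjacent rows keep alternating directions as pairs are inserted.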
Fig. 3. Fractional addressing in the MSN.

Fig. 4. Relative addresses in a 36-node MSN.

Fig. 5. Assignment of rows to quadrants in a network with eight rows.
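The relative addresses of Fig. 4 and the quadrants of Fig. 5 follow (1) and (2) of Section III-B. A minimal sketch of that calculation is given here, assuming the source and destination subscripts are "fr" and "to", that D_c is set by the parity of the destination column and D_r by the parity of the destination row, and that the four quadrants partition the network (Q4 taken as r ≤ 0, c > 0). The function names are hypothetical.

```python
from fractions import Fraction


def _direction(value):
    """+1 for an even row or column label (even numerator if fractional), else -1."""
    return 1 if Fraction(value).numerator % 2 == 0 else -1


def relative_address(frm, to, m=None, n=None):
    """Relative address (r, c) of node `frm` with respect to destination `to`.

    With m and n given, uses (1) for an m x n integer-addressed network;
    otherwise uses (2) for a fractionally addressed network.
    """
    (r_fr, c_fr), (r_to, c_to) = frm, to
    d_c, d_r = _direction(c_to), _direction(r_to)
    if m is not None:
        half_m, half_n, mod_m, mod_n = Fraction(m, 2), Fraction(n, 2), m, n
    else:
        half_m = half_n = Fraction(1)
        mod_m = mod_n = 2
    r = half_m - (half_m - d_c * (Fraction(r_fr) - Fraction(r_to))) % mod_m
    c = half_n - (half_n - d_r * (Fraction(c_fr) - Fraction(c_to))) % mod_n
    return r, c


def quadrant(r, c):
    """Quadrant number 1..4 of a relative address."""
    if r > 0:
        return 1 if c > 0 else 2
    return 3 if c <= 0 else 4


if __name__ == "__main__":
    print(relative_address((0, 0), (2, 3), m=6, n=6))
    print(relative_address((Fraction(7, 9), 0), (Fraction(4, 3), Fraction(1, 3))))
```

Only the modulus changes between the two cases, which is why the fractional form does not depend on the number of rows or columns.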
B. Relative Addresses

Because of the cyclic structure of the MSN, any node can be considered to be in the center of the network. The relative address (r, c) of a node with absolute address $(r_{fr}, c_{fr})$ with respect to the destination node with absolute address $(r_{to}, c_{to})$ is defined so that the destination node is approximately at the center of the network, has relative address (0, 0), and has both row and column links directed toward decreasing numbered nodes, as in Fig. 4. The relative address in an m × n integer-addressed network is

$$r = \frac{m}{2} - \left(\left(\frac{m}{2} - D_c (r_{fr} - r_{to})\right) \bmod m\right)$$
$$c = \frac{n}{2} - \left(\left(\frac{n}{2} - D_r (c_{fr} - c_{to})\right) \bmod n\right) \tag{1}$$

and in a fractionally addressed network is

$$r = 1 - \{(1 - D_c (r_{fr} - r_{to})) \bmod 2\}$$
$$c = 1 - \{(1 - D_r (c_{fr} - c_{to})) \bmod 2\} \tag{2}$$

where $D_c$ and $D_r$ are dependent upon the direction of the links at the destination node. In a network with the links in the even and odd rows and columns directed toward increasing or decreasing rows and columns, as in Fig. 1, $D_c = +1$ when $c_{to}$ (the numerator of $c_{to}$ in a fractionally addressed network) is even, $D_c = -1$ when $c_{to}$ is odd, $D_r = +1$ when $r_{to}$ is even, and $D_r = -1$ when $r_{to}$ is odd.

The definition of the relative coordinates in (1) and (2) limits the relative address of the current node to $-(m/2) < r \le m/2$ and $-(n/2) < c \le n/2$ for an integer-addressed network and $-1 < r, c \le 1$ for a fractionally addressed network. A node is in Q1 when r > 0 and c > 0, Q2 when r > 0 and c ≤ 0, Q3 when r ≤ 0 and c ≤ 0, and Q4 when r ≤ 0 and c > 0. The quadrant of the current node indicates the direction in which to proceed to get to the destination. Because the network has unidirectional links, this routing strategy must be modified when the current node is at the boundary of the quadrants, as discussed in Section IV. Fixing the orientation of the links at the destination allows the same routing decisions to apply at the boundaries.

An advantage of fractional addressing over integer addressing is that the relative addresses are independent of the number of rows or columns in the network. In an integer-addressed network, the arithmetic unit that calculates the relative address must be changed whenever the number of rows or columns changes. This arithmetic unit does not change in the fractionally addressed network.

A disadvantage of fractional addressing is that the destination is sometimes displaced from the center of the network when relative addresses are calculated. For instance, in Fig. 5(a), a possible assignment of rows to quadrants is shown for a network with eight rows and only two of a possible 12 rows in the 1/9th addressing level. Five rows are assigned to quadrants 1 or 2 and only three to quadrants 3 or 4, and as a result, packets routed from nodes with a relative address (1, X) may take a longer path to the destination. In Fig. 5(b), the assignment of rows to quadrants that places the destination in the center is shown. (The panels of Fig. 5 are a) actual assignment of rows to quadrants and b) expected assignment of rows to quadrants.) It is evident from this example that new rows should be added uniformly, when possible, in order to calculate the quadrant correctly. However, the quadrant is most likely to be calculated incorrectly for the nodes that are furthest apart. In large networks, with many small communities of interest, expanding the network with nonuniform addresses that keep nodes in their communities of interest is preferable to forcing nodes to join distant parts of the network. If most packets remain within the community of interest and are not directed to the nodes that are furthest away, the distance between nodes that communicate frequently is kept small and the effect of nonuniform addresses is less than in a network with uniform traffic requirements.

IV. DISTRIBUTED ROUTING RULES

In Sections IV-A, B, and C, three distributed routing rules are described that use the regular structure of the MSN to select a path to the destination. Rule 1 determines all shortest paths to the destination for integer addressed MSN's. Rules 2 and 3 reduce the number of calculations that are performed at each node, but occasionally take longer paths. Rules 1 and 2 are dependent upon the addresses of the adjacent nodes to which a node is connected; rule 3 is not.

In complete, integer-addressed networks, the address of adjacent nodes is known. In fractionally addressed networks or in networks with partially full rows or columns, as described in Section V-A, the address of adjacent nodes is not known. If rule 1 or 2 is used, a technique must be used to determine the node to which each node is connected. In the experimental network, the nodes to which a node is connected are stored locally and changed manually when the network connectivity
574
THE BEST OF THE BEST
------------~~~-------j-~ ,
I I I
,
, ,, I
I
I
Qr
I I I
--.
,, I
I
1
t I
t
~-~-------~------------~I
'2----'
I
CJ
1 Q]-
---..
I
CI
Qr
,:~ I
I I I I I
4---
--.
I
1
:
--._----_._----_._-----~
I
1
QI
! ......-I I I I I
..--
-------
'4
--._~-----------------------
f Q)
Q..-
1 Q.-
c.f
C4
I I
~
!
~
'4
----~-----------------------
.......-
C2
I I
'2---'
I
t
I
.,
+:
!
1
~
I
...L:.
Q.-
I I
,
I I I I I I
4---
C2
I I I I I
----~-------------------_.--
']~
Fig. 6.
Preferred paths in Rule 1.
changes. When the procedure described in Section V-A is used to add nodes, the connectivity for only two nodes changes when a new node is added to the network. Therefore, a manual rather than an automatic procedure is reasonable. When nodes or links fail and are bypassed automatically, as in Sections V-B and C, the connectivity information at a node is not changed, and it is incorrect. These three rules are referred to as deterministic routing rules. In addition, two random routing rules are described in Section IV-D. Rule A is independent of the address of adjacent nodes, and rule B routes packets correctly when the destination is one node away. In Section IV-E, the path lengths resulting from the routing rules are compared in integer and fractionally addressed networks.
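Since every deterministic rule starts from the relative address of Section III-B, a minimal sketch of (1), (2), and the quadrant test may help. The function names (`link_sign`, `relative_address`, `quadrant`) are illustrative and not from the paper, and passing no network size is used here as a hypothetical way to select fractional addressing.

```python
from fractions import Fraction


def link_sign(dest_coord):
    """Return +1 when the destination coordinate (or its numerator, for a
    fractional address) is even, and -1 when it is odd."""
    return 1 if Fraction(dest_coord).numerator % 2 == 0 else -1


def relative_address(cur, dest, m=None, n=None):
    """Relative address (r, c) of `cur` with respect to `dest`.

    Pass the network size m x n for integer addressing, as in (1).
    Omit m and n for fractional addressing, as in (2).
    """
    r_fr, c_fr = cur
    r_to, c_to = dest
    d_c = link_sign(c_to)  # D_c depends on the destination column
    d_r = link_sign(r_to)  # D_r depends on the destination row
    if m is None or n is None:
        half_m, mod_m, half_n, mod_n = 1, 2, 1, 2
    else:
        half_m, mod_m, half_n, mod_n = Fraction(m, 2), m, Fraction(n, 2), n
    # Python's % returns a nonnegative result for a positive modulus,
    # which matches the "mod" used in (1) and (2).
    r = half_m - ((half_m - d_c * (r_fr - r_to)) % mod_m)
    c = half_n - ((half_n - d_r * (c_fr - c_to)) % mod_n)
    return r, c


def quadrant(r, c):
    """Quadrant number (1 to 4) of a relative address."""
    if r > 0:
        return 1 if c > 0 else 2
    return 4 if c > 0 else 3
```

In a 4 x 4 integer-addressed network with the destination at (0, 0), the node with absolute address (3, 0) maps to relative address (-1, 0), which lies in Q3. The fractional form needs no network size, which is the advantage of fractional addressing noted in Section III-B.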
A. Deterministic Rule 1
The solid arrows in Fig. 6 show the preferred direction of travel from the relative positions in the network to the destination for the first routing rule. In this figure, $r_1$, $r_2$, $r_3$, and $r_4$ are rows at the edges of the quadrants, and $c_1$, $c_2$, $c_3$, and $c_4$ are columns at the edges of the quadrants. The first routing rule is as follows.

Rule 1:
• Select the preferred path if there is one preferred path from a node.
• Select either path if there are zero or two preferred paths from a node.

To implement the first rule, the relative addresses of the current node $(r, c)$, the next node along the column $(r_{nxt}, c)$, and the next node along the row $(r, c_{nxt})$ are calculated. The quadrant is determined from $(r, c)$ as in Section III-B, the direction of the link along the row is determined from $c - c_{nxt}$, and the direction of the link along the column from $r - r_{nxt}$. A node in Q4 is in row $r_4$ if r = 0, and a node in Q2 is in column $c_2$ when c = 0. When a node in Q2 is also in $r_2$, the link directed down is not preferred. These links can be determined because $r_{nxt} = 0$ and $c_{nxt} \neq 0$. Similarly, a link is directed to the left from $c_4$ if (r, c) is in Q4, $c_{nxt} = 0$, and $r_{nxt} \neq 0$. Row $r_1$ is at the outside edge of the network. A node in Q1 is in $r_1$ and has a preferred link that is not preferred in Q1 if $c_{nxt} \neq 0$ and $c_{nxt}$ is in Q2. Similarly, when a node is in $r_3$, $c_1$, or $c_3$ and has a preferred link that is not preferred in the rest of the quadrant, an adjacent node is also in a different quadrant.

Fig. 7. Preferred paths in Rule 2.

In the Appendix, it is shown that this routing rule selects the shortest path from any node to the destination in an integer addressed network. Furthermore, when there are several shortest paths, every path is selected as one of the alternatives; therefore, this rule has the maximum number of instances in which either link may be selected.
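The selection step of Rule 1 can be sketched as follows. This is only an illustration: the two Boolean flags are assumed to come from a lookup of the preferred directions in Fig. 6, which is not reproduced here, and the name `select_link` is hypothetical.

```python
import random


def select_link(row_preferred, col_preferred, rng=random):
    """Pick the outgoing link for one packet at one node.

    row_preferred / col_preferred say whether the row link or the column
    link is a preferred path (assumed to be looked up from Fig. 6).
    """
    if row_preferred != col_preferred:
        # Exactly one preferred path: take it.
        return "row" if row_preferred else "column"
    # Zero or two preferred paths: either link may be selected.
    return rng.choice(["row", "column"])
```

When the result is a free choice, a node could also use it to resolve contention between two packets that arrive at the same time, rather than choosing at random.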
B. Deterministic Rule 2
Rule 2 is the same as Rule 1 except that the preferred paths are those shown in Fig. 7 instead of those in Fig. 6. Rule 2 has the advantage that there are fewer calculations than in Rule 1 because the special cases when nodes are in $r_1$, $c_1$, $r_3$, and $c_3$ are not determined. However, this rule has the disadvantage that nodes in these special rows and columns take a slightly longer path to the destination. In Rule 2, it is still necessary to know $c_{nxt}$ and $r_{nxt}$ in order to determine when a node is in $r_2$ or $c_4$.

The routing rule is simplified in this manner because it should have a relatively small effect on the average path length. The nodes that are affected are those that are furthest from the destination. Fewer packets are affected by changes in routing rules in these nodes than elsewhere in the network because packets from other nodes are not intentionally routed through these nodes to get to the destination, and in a network with communities of interest, nodes are more likely to communicate with nodes that are nearby. By contrast, a change in the routing rule in $c_2$ and $r_4$ would affect every packet headed for the destination. In addition, incorrect paths are not selected at all of the nodes in the affected rows and columns. From Fig. 6, preferred paths in the quadrants are also preferred paths in the special rows and columns at the edges of the network. Incorrect decisions may only be made at nodes where neither path is thought to be preferred and one of the paths is shorter. Furthermore, from Table V in the Appendix, the longer paths at the edge of the network are two greater than the preferred paths, while elsewhere in the network, they are four greater. The effect of longer paths on the average shortest path length is shown in Section IV-E.
C. Deterministic Rule 3
The solid arrows in Fig. 8 show the preferred paths and the dashed arrows show the alternate paths. The routing rule is as follows.

Rule 3:
• Select the preferred path if there is one preferred path from a node.
• Select the alternate path if there is no preferred path and one alternate path from the node.
TABLE I
THE EFFICIENCY OF ROUTING RULES RELATIVE TO THE SHORTEST PATH ALGORITHM
Efficiency of Routing Rules Deterministic
Network
Short.
Integer Addr. 1 I 2,3
Path
1.00 1.00
4x4 4x6 6x6
2.93 3.30 3.71
1.00
6x8 8x8 8x10
4.34 5.02 5.42
1.00 1.00
10x10 10x12 12x12 12x14 14x14
1.00 .91 .91
Random
Fractional Addr. 1 I 2,3
A
I
8
.95 .91 1.00
.94 .95 .97
.21 .14 .10
.79 .30 .21
.14
.11
1.00
.98 1.00 .99
.99 1.00
.97 .98
1.00
.98
.09 .07 .06
.11
5.84 6.42 7.02
1.00 1.00
.99
1.00
.99
.99
.99
1.00
1.00
1.00
1.00
.99
.05 .05 .04
.09 .08 .07
7.45
1.00
1.00
1.00 1.00
.99 .99
.04 .03
.06 .06
7.89
1.00
.99
• Select either path if neither path is a preferred or alternate path or if both paths are preferred.

The advantage of Rule 3 is that it uses fewer calculations than Rule 2 and is not dependent upon $r_{nxt}$ or $c_{nxt}$. From Fig. 8, the regions of interest in Rule 3 depend only upon the relative address of the current node. The direction of the links at the current node is determined by assuming that a node has a link directed to the left when r is even and to the right when r is odd, and a link directed down when c is even and up when c is odd.

The disadvantage of Rule 3 is that there are fewer instances in which either path from the node may be selected. In Q2 and Q4 where Rule 2 may select either path, Rule 3 is constrained to select one of the paths. This increases the number of times when two packets arriving at the node conflict. There are also instances in incomplete networks, in Section V-A, where Rule 3 cannot get to a specific destination while Rule 2 can.

In complete networks, Rules 2 and 3 result in the same distance from any node to the destination. Whenever there are one or two preferred paths in Rule 2, one of these paths is selected by Rule 3. When there are two preferred paths in Rule 3, both paths have the same distance to the destination. Therefore, the path length is the same for both rules.

Fig. 8. Preferred and alternate paths in Rule 3.

D. Random Routing
Two random routing rules have been considered. Rule A is completely random: a packet selects either link with equal probability and, at each node, checks to see if it is at the destination. Rule B assumes that the two nodes to which a node is connected are known. At each node, if the destination is one node away, the packet is directed there; otherwise, a path is selected at random.

There are two advantages to using random routing rules rather than deterministic rules. First, they are extremely easy to implement. In the random routing rules, it is not necessary to calculate the relative address of a node, its quadrant, or the direction of the links emanating from the nodes. Second, random routing rules are extremely tolerant of network irregularities. If nodes are added or fail in a perverse manner, the network may bear little resemblance to the regular structure, and the deterministic rules may not work. The random rules provide an alternative to the deterministic rules when this occurs. These random routing rules were first investigated by Prosser [9] as a routing mechanism for survivable networks. The disadvantage of random routing rules is that they use more links to get between a source and destination, and this results in a smaller network throughput.
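The two random rules are simple enough to sketch directly. The topology builder below is an assumption made for illustration, not the paper's code: it models a complete m x n MSN in which even rows point toward decreasing columns, odd rows toward increasing columns, even columns toward decreasing rows, and odd columns toward increasing rows. All function names are hypothetical.

```python
import random


def msn_links(m, n):
    """Out-links of a complete m x n MSN model.

    Returns {node: (row_neighbor, column_neighbor)} under the assumed
    link orientation described in the lead-in.
    """
    links = {}
    for r in range(m):
        for c in range(n):
            row_next = (r, (c - 1) % n) if r % 2 == 0 else (r, (c + 1) % n)
            col_next = ((r - 1) % m, c) if c % 2 == 0 else ((r + 1) % m, c)
            links[(r, c)] = (row_next, col_next)
    return links


def rule_a(node, dest, links, rng):
    """Rule A: pick either outgoing link with equal probability."""
    return rng.choice(links[node])


def rule_b(node, dest, links, rng):
    """Rule B: go straight to the destination if it is one node away."""
    if dest in links[node]:
        return dest
    return rng.choice(links[node])


def path_length(src, dest, links, rule, rng, max_hops=100_000):
    """Hops taken by one packet; max_hops guards against endless walks."""
    node, hops = src, 0
    while node != dest and hops < max_hops:
        node = rule(node, dest, links, rng)
        hops += 1
    return hops


def mean_path_length(m, n, rule, trials=20, seed=0):
    """Average hop count over all source-destination pairs."""
    rng = random.Random(seed)
    links = msn_links(m, n)
    total = count = 0
    for src in links:
        for dest in links:
            if src != dest:
                for _ in range(trials):
                    total += path_length(src, dest, links, rule, rng)
                    count += 1
    return total / count
```

Calling `mean_path_length(4, 4, rule_a)` and `mean_path_length(4, 4, rule_b)` gives Monte Carlo estimates that can be compared with the shortest-path average; the exact figures depend on the seed and the number of trials.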
E. Comparison
A comparison of the deterministic routing rules in integer addressed and fractionally addressed networks is presented in Table I. The average distance between nodes for a routing rule is calculated by determining the average distance between each source and destination in the network. The efficiency of the routing rule is the average of the shortest distance between nodes over the average distance between nodes using the routing rule. In the comparisons, there is no contention, and a packet always takes the path specified by the routing rule. When the rule decides that both paths are equivalent, either path is selected with probability 0.5. Because of this random component, a packet does not always take the same length path from a source to the destination. To compensate for the random component, the efficiency is calculated by determining the average distance between each node several times and averaging the result. The number of times that the average distance is determined is varied so that the span of values representing a 95 percent confidence interval is less than 1 percent of the average value.

Table I shows that Rule 1 determines the shortest path in integer addressed networks, and that Rules 2 and 3 result in the same average distance between nodes. Rule 2 selects longer paths than Rule 1 when the relative location of a node is at the edge of the network, and this effect is also seen in the table. In the simulations, fractionally addressed rows and columns are added to the network two at a time in the order shown in Table II, which makes the depth of the row and column addresses as uniform as possible. It is evident that fractionally addressed rows can be added to large networks in a way that has a small effect on the average path length.

Random rules are inefficient. In the networks in Table I, the average path length using random routing can be 33 times longer than the path lengths resulting from the deterministic rules.
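The efficiency measure defined above can be sketched with a breadth-first search for the shortest-path average. The topology model is an assumption made for illustration (even rows toward decreasing columns, odd rows toward increasing columns, even columns toward decreasing rows, odd columns toward increasing rows), and the function names are hypothetical.

```python
from collections import deque


def msn_links(m, n):
    """Out-links {node: (row_neighbor, column_neighbor)} of the assumed model."""
    links = {}
    for r in range(m):
        for c in range(n):
            row_next = (r, (c - 1) % n) if r % 2 == 0 else (r, (c + 1) % n)
            col_next = ((r - 1) % m, c) if c % 2 == 0 else ((r + 1) % m, c)
            links[(r, c)] = (row_next, col_next)
    return links


def distances_from(src, links):
    """Breadth-first search over the directed links, starting at src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in links[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def average_shortest_path(m, n):
    """Average shortest distance over all ordered source-destination pairs."""
    links = msn_links(m, n)
    total = pairs = 0
    for src in links:
        dist = distances_from(src, links)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


def efficiency(shortest_avg, rule_avg):
    """Average shortest distance over the average distance using a rule."""
    return shortest_avg / rule_avg
```

For the 4 x 4 model this search gives 44/15, about 2.93, and for the 6 x 6 model 130/35, about 3.71, which agree with the shortest-path entries reported for those networks.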
It is inadvisable to use random routing when a network has some regularity to its structure. However, a hybrid random and deterministic rule can be used to obtain the efficiency of the deterministic rule in a regular network and the survivability of the random rule. For instance, a random component can be inserted in the routing rule after a packet has traversed a larger number of nodes than expected. The number of nodes a packet has traversed must be tracked in any practical network because when a node fails, packets destined for this node must be purged from the network.

TABLE II
THE ORDER IN WHICH ROWS AND COLUMNS ARE ADDED TO THE NETWORK
Number of Rows or Cols.: 4 6 8 10 12 14
Address of Rows or Columns Added:
0 1 1/3 4/3
1/9 10/9 4/9 13/9
2/3
5/3 2/9 11/9 5/9 14/9

V. NETWORK IRREGULARITIES
In addition to getting packets quickly between nodes in complete, regular MSN's, the routing rules must continue to function in irregular networks. The irregularities investigated in this section occur when the number of nodes that are added to the network is not sufficient to completely fill a row or column and when nodes or links fail and are deleted from the network. The effect of the irregularities on the routing rules depends upon the procedures used to add and delete nodes and links from the network. In Section V-A, a procedure for adding one node at a time to a network is described. This procedure has the characteristic that only two existing links must be changed to add a new node. In Sections V-B and C, procedures for deleting failed nodes and failed links are described. These procedures are similar to those used in loop networks [10] and can be implemented automatically. The source is not informed when a packet cannot be delivered to the destination and not all failures are detected. Therefore, a higher level acknowledgment protocol is still required to guarantee that packets are delivered.
[Fig. 9 panels: Step 0, Step 1, Step 2, Step 3, Step 4, Step 5.]
A. Adding Nodes One at a Time
A procedure is shown in Fig. 9 for adding one node at a time to an MSN. Each time a node is added, two links must be changed. The two links that will be changed when the next node is added are shown by dashed lines. When this procedure is followed, two complete new rows or columns are eventually added to the network.

Adding one node at a time makes the network less regular and affects the ability of the distributed routing rules to find the shortest path to a destination. The effect on a 6 x 6 network is shown in Table III. When 12 nodes are added, a 6 x 8 network is formed. In this table, the efficiency is calculated as in Section IV-E. The italicized numbers in parentheses indicate the fraction of source destination pairs that are unable to communicate.

There are several cases for which Rule 3 cannot find a path. The reason this routing rule fails is seen by examining Step 4 in Fig. 9. Assume that a packet at node A is destined for node B. Node A is in an odd-numbered column in Q4; therefore, in Rule 3, the link along the column is assumed to be directed upward and is selected. Unfortunately, the column is not complete and the packet ends up at node C. At node C, which is also in an odd-numbered column in Q4, the upward-directed path is selected, and the packet arrives back at node A. At node A, the path to node C is again selected, and the packet is stuck in a loop. In Rules 1 and 2, at node A it is known that the next node along the column is node C and that both links are directed away from the destination. Therefore, at node C, either link is selected with probability 0.5 and the loop is avoided.
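The loop described above comes from the way Rule 3 infers link directions from the parity of the relative address instead of from knowledge of the next node. A minimal sketch of that inference, with a hypothetical function name, is:

```python
def assumed_link_directions(r, c):
    """Link directions that Rule 3 assumes at relative address (r, c).

    The row link is taken to point left when r is even and right when r
    is odd; the column link is taken to point down when c is even and up
    when c is odd. No information about the actual next node is used.
    """
    row_direction = "left" if r % 2 == 0 else "right"
    column_direction = "down" if c % 2 == 0 else "up"
    return row_direction, column_direction
```

For any node in an odd-numbered column, such as nodes A and C in the example, the column link is assumed to point up. In an incomplete column that assumption can be wrong, which is why Rules 1 and 2, which know the next node, avoid the loop.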
B. Node Failures
Loop systems have active components in the path at each node, and if one of these components fails, the loop is broken. When nodes fail, they are bypassed so that the remainder of the loop continues to operate. Loss of power at the node is a common failure because power is usually obtained from a local source. This type of failure is automatically corrected by using a relay to create a path around the node [10]. The relay is open when there is power and closes to bypass the node when power is lost.

When nodes fail in the MSN, the system is not completely disabled as in a loop; however, packets that arrive at the node are lost and the node should be bypassed to prevent this from occurring. Loss of power at a node in the MSN can operate relays as in a loop system; however, there are two links entering and leaving each node and two relays must be used. The failure recovery procedure selected connects the row through and the column through, as shown in Fig. 10.

Fig. 9. Adding two columns to an existing network, one node at a time.

TABLE III
THE EFFICIENCY OF LOCAL ROUTING RULES RELATIVE TO THE SHORTEST PATH ALGORITHM AS SINGLE NODES ARE ADDED TO A 6 X 6 NETWORK

                    Efficiency of Routing Rules
          Short.    Deterministic                  Random
Network   Path      1       2       3              A       B
6x6       3.71      1.00    .97     .97            .10     .20
add 1     3.77      .98     .95     .93            .10     .21
add 2     3.80      .95     .94     .93 (.005)     .10     .20
add 3     3.91      .92     .91     .91 (.005)     .10     .19
add 4     3.95      .92     .90     .91 (.013)     .10     .19
add 5     3.99      .89     .89     .88 (.007)     .09     .19
add 6     4.04      .88     .87     .87            .09     .18
add 7     4.07      .92     .90     .91            .09     .19
add 8     4.16      .95     .93     .93            .09     .18
add 9     4.18      .94     .93     .93            .09     .19
add 10    4.22      .94     .93     .93            .09     .18
add 11    4.26      .97     .95     .95            .09     .17
6x8       4.34      1.00    .97     .97            .09     .17
Fig. 10. Operation of the MSN when nodes fail.
Fig. 11. Operation of the MSN when links fail.
C. Link Failures
In loop systems, a transmitter sends bits continuously, even when there is no information to send. This allows the receiver to retain bit synchronization between packets. Broken links and certain node failures are detected by the loss of signal. More subtle failures can be detected when the periodic start of slot does not occur in systems with fixed size slots or when there are violations of the pseudoternary modulation rules used in wire systems. Loop systems can be designed to bypass segments with failed links by constructing the loop as a series of subloops that start and end at a central location [10]. When the loss of signal is detected on a subloop, the subloop is bypassed and the signal from the previous subloop is switched to the next subloop. This allows a large part of the loop system to operate when links on one of the subloops fail.

The MSN is a slotted system with continuous transmission on each of the links; therefore, failures can be detected on the links arriving at a node as in a loop system. When a link has failed and is detected at the termination node, the origination node of the link must be informed. Otherwise, packets that are transmitted on the inoperable link will be lost, and if packets between a pair of nodes are always routed along that link, the pair of nodes will not be able to communicate. In addition, the implementation of the MSN described in Section II does not lose packets because the node can transmit as many packets as it receives. In order to preserve this characteristic, when a link leaving a node fails, data on one of the incoming links must stop so that the in-degree and out-degree of the node remain the same.

One way to prevent transmission on a link that has failed and to keep the in-degree equal to the out-degree at every node is to stop transmitting on a directed cycle of links that includes the link that has failed. The in-degree and out-degree of each node in the cycle are reduced by one and the link that has failed is not used. To minimize the effect that this strategy has on the throughput and connectivity of the network, the number of links in the cycle must be kept as small as possible and the cycle should not pass through any node twice.

A simple rule that meets these conditions most of the time is to stop transmitting on a row if a signal is not received on a column and to stop transmitting on a column when a signal is not received on a row. The operation of this rule is shown in Fig. 11. In this example, the dotted link from node 2,2 to node 2,3 fails. According to the rule, the dashed links are taken out of service. That is,
• Node 2,3 receives no signal on the row; therefore, it stops transmitting on the column.
• Node 3,3 receives no signal on the column; therefore, it stops transmitting on the row.
• Node 3,2 receives no signal on the row; therefore, it stops transmitting on the column.
• Node 2,2 receives no signal on the column; therefore, it does not try to transmit on the failed link on the row.
When the failed link is restored, the cycle is returned to service by forcing transmission on this link.

When a single failure occurs in a complete MSN, this procedure removes four links, which is the minimum number of links in a cycle. Also, at most one link is removed at each node. Therefore, this simple rule has the desirable characteristics in this instance. However, when there are multiple link and node failures or if the network has partially full rows or columns, these characteristics are not always obtained. For instance, if the link from 4,4 to 3,4 also fails, both links to node 3,3 will stop, and an operable node is removed from the network. This removal rule is simple, and it works well when there are a few removals. Since a network will be repaired when removals occur, it is unlikely that there will be many removals, and this simple rule is adequate.
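The removal rule can be traced with a short simulation. The sketch below is an illustration under stated assumptions, not the paper's implementation: the link orientation is chosen to reproduce the Fig. 11 example (even rows toward increasing columns, odd rows toward decreasing columns, even columns toward decreasing rows, odd columns toward increasing rows), the network is a complete 6 x 6 MSN, and the function names are hypothetical.

```python
def build_msn(m, n):
    """Outgoing row and column links of a complete m x n MSN model."""
    out_row, out_col = {}, {}
    for r in range(m):
        for c in range(n):
            out_row[(r, c)] = (r, (c + 1) % n) if r % 2 == 0 else (r, (c - 1) % n)
            out_col[(r, c)] = ((r - 1) % m, c) if c % 2 == 0 else ((r + 1) % m, c)
    return out_row, out_col


def links_out_of_service(m, n, failed_links):
    """Apply the rule until no more links go silent.

    A node that receives no signal on its row stops transmitting on its
    column, and a node that receives no signal on its column stops
    transmitting on its row. Links are (source, destination) pairs.
    """
    out_row, out_col = build_msn(m, n)
    in_row = {dst: src for src, dst in out_row.items()}
    in_col = {dst: src for src, dst in out_col.items()}
    silent = set(failed_links)
    changed = True
    while changed:
        changed = False
        for node in out_row:
            if (in_row[node], node) in silent and (node, out_col[node]) not in silent:
                silent.add((node, out_col[node]))
                changed = True
            if (in_col[node], node) in silent and (node, out_row[node]) not in silent:
                silent.add((node, out_row[node]))
                changed = True
    return silent
```

With the single failure from 2,2 to 2,3, the simulation takes exactly the four-link cycle through 2,3, 3,3, and 3,2 out of service. Adding the failure from 4,4 to 3,4 silences both links into node 3,3, as in the example.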
D. Effect on Routing Rules
Simulations were conducted to determine the effect that failures have on the distributed routing rules, and the results are presented in Table IV. The fraction of nodes that are in the network, but cannot communicate using the distributed routing rules, are italicized. In the simulations, a random selection of nodes or links fail in a 10 x 12 network and the bypass or link removal rules are applied. Each experiment is repeated ten times. This results in the span of values in the 95 percent confidence interval for the average path length being less than 1 percent of the mean for node failures. The span of values in the 95 percent confidence intervals for link failures ranges from 1 to 5 percent of the mean. The efficiency is calculated as in Section IV-E.

When nodes are added to each network in Section V-A, it is assumed that the routing rules that use information about the next node know about the changes. This is reasonable because adding nodes is a planned activity. When links or nodes fail, the network is modified automatically, without operator intervention. After failures occur, if the node that each node is connected to is to be known, a protocol must be developed to distribute this information. In the simulations, the change in network connectivity after failures is not known, and the routing rules operate with incorrect information.
TABLE IV
THE EFFICIENCY OF LOCAL ROUTING MECHANISMS RELATIVE TO THE SHORTEST PATH ALGORITHM WHEN NODES OR LINKS FAIL IN A 10 X 12 NETWORK
Failures Size
Short. Path
I
1
8 Nodes 4 Nodes 2 Nodes 1 Node
5.94 6.15 6.28 . 6.34
1 Link
6.51 6.59 6.75
2 Links 4 Links
Random
T
A
3
B
.91 .94 .96 .97
.92 .95 .96 .97
.05 .05 .05 .05
.09 .08 .08 .08
1.00
.99
.99
.05
.08
.05
.08 .08
.93
6.42 .
I
2
.98 .98
.96
None
Efficiency of Routing Rules Deterministic
.93
c
.89" 00 7) .81 (,OU)
.92
.93
.88 (.000 .80 (,OOJJ
.87 (.00]) .79 (,OO7)
.05 .04
.07
The simulations show that the distributed routing rules operate reasonably efficiently when up to eight nodes fail and are bypassed. When up to four links fail and up to 16 links are removed from the network, the distributed routing rules still operate reasonably efficiently. However, a small fraction of the nodes in the network, shown by italicized numbers, cannot communicate until the network is repaired. In the first deterministic rule, a greater fraction of the nodes cannot communicate compared to the other two rules. The first rule uses information about the edges of the network to improve routing decisions. When this information is incorrect, routing failures can occur.

VI. CONCLUSIONS
The objective of this paper is to study simple mechanisms for routing packets in the MSN. The routing rules must not only operate in complete rectangular networks, but must also operate when single nodes are added and when failures occur.

Three distributed routing rules are described in Section IV that use the regular structure of the MSN to simplify routing. The first rule provides the shortest path between any source and destination in an integer addressed network. The second is simpler to implement than the first rule, but results in slightly longer paths. The third rule is the simplest to implement; it has the same path length as the second rule in complete networks, but it does not determine equal length shortest paths as well as the second rule. The nodes with longer path lengths in the second and third rules are those that are furthest from the destination. Networks are divided into communities of interest and nodes communicate more frequently with nodes that are nearby than with nodes that are further away. Therefore, the simpler routing rules are preferable to the first rule.

In Section III-A, a fractional addressing scheme is described that does not change network addresses when new rows or columns are added. In addition, in a fractionally addressed network, the routing rules are independent of the number of rows or columns in the network. This makes the arithmetic unit that calculates relative addresses simpler to implement in a fractionally addressed network than in an integer-addressed network. The path selected between nodes in a fractionally addressed network may be longer than in an integer-addressed network; however, in Section IV-E, it is shown that new rows or columns can be added to a fractionally addressed network in a way that has very little effect on the average distance between nodes. In addition, the nodes that are most affected by fractional addressing are those that are furthest away from the destination. Therefore, fractional addressing is preferred to integer addressing.

In Section V-A, a procedure is described for adding new nodes to a network. When the third routing rule is used in networks that use this procedure, there are nodes that cannot communicate. Since this is not a condition that can be repaired, and the network must operate for all combinations of nodes, the second routing rule is preferable to the third rule, even though the third rule is simpler to implement.

In Sections V-B and C, procedures are described to automatically bypass nodes or links that fail. The procedure for bypassing links that fail guarantees that packets are not transmitted on the failed link as well as guaranteeing that all of the packets that arrive at a node can be transmitted. There is a small fraction of the nodes that cannot communicate when multiple link failures occur. Unlike adding nodes, multiple failures should be repaired, and it is unlikely that the network will operate in this mode frequently. Therefore, this condition does not preclude using these failure recovery mechanisms.

APPENDIX
Theorem: The first routing rule, Section IV-A, selects all possible shortest paths to the destination in a complete, integer addressed MSN.

To prove this theorem, 1) a hypothetical distance function $d_{sp}(r, c)$ from every node to the destination is defined, 2) a path from (r, c) to the destination that has this distance is found, 3) it is shown that a shorter path does not exist, and 4) it is shown that the routing rule can select every path with this distance, and does not select any paths with a larger distance. The distance function that will be tested is

$$d_{sp}(r, c) = |r| + |c| + x_{sp}(r, c)$$

where

$$\begin{aligned}
x_{sp}(r, c) = {} & 2\,(r \bmod 2)\,(c \bmod 2)\,\bigl(1 - \delta(r - \tfrac{m}{2})\bigr)\bigl(1 - \delta(c - \tfrac{n}{2})\bigr)\,\bigl(1 - U(-r)\bigr)\bigl(1 - U(-c)\bigr) \\
& + 2\,\bigl(1 - (r \bmod 2)\bigr)\,(c \bmod 2)\,\bigl(1 - U(-r)\bigr)\,U(-c) \\
& + \Bigl[2 + 2\,\bigl(1 - (r \bmod 2)\bigr)\bigl(1 - (c \bmod 2)\bigr)\bigl(1 - \delta(r + \tfrac{m}{2} - 1)\bigr)\bigl(1 - \delta(c + \tfrac{n}{2} - 1)\bigr)\Bigr]\,U(-r)\,U(-c) \\
& + 2\,(r \bmod 2)\,\bigl(1 - (c \bmod 2)\bigr)\,U(-r)\,\bigl(1 - U(-c)\bigr) \\
& - 4\,\delta(r)\,\delta(c)
\end{aligned}$$

and

$$U(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 & \text{for } x \ge 0 \end{cases} \qquad \text{and} \qquad \delta(x) = \begin{cases} 0 & \text{for } x \ne 0 \\ 1 & \text{for } x = 0. \end{cases}$$

The correction factor $x_{sp}(r, c)$ is included in the distance function to account for longer paths that must be taken because of the unidirectional links. It adds two to $d_{sp}(r, c)$ for nodes that only have links directed away from the destination. There are two additional links added to $d_{sp}(r, c)$ for all nodes in Q3 because packets from this quadrant must pass the destination and return. In addition, there is a modification in $x_{sp}(r, c)$ at the edges of the first and third quadrant that is caused by the destination being displaced from the exact center of the network. A node (r, c) is connected to node $(r_{nxt}, c)$ along the column and $(r, c_{nxt})$ along the row where
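$$r_{nxt} = r - \bigl(1 - (c \bmod 2)\bigr)\Bigl(1 - m\,\delta\bigl(r + \tfrac{m}{2} - 1\bigr)\Bigr) + (c \bmod 2)\Bigl(1 - m\,\delta\bigl(r - \tfrac{m}{2}\bigr)\Bigr)$$

$$c_{nxt} = c - \bigl(1 - (r \bmod 2)\bigr)\Bigl(1 - n\,\delta\bigl(c + \tfrac{n}{2} - 1\bigr)\Bigr) + (r \bmod 2)\Bigl(1 - n\,\delta\bigl(c - \tfrac{n}{2}\bigr)\Bigr).$$

The change in this distance function between (r, c) and $(r_{nxt}, c)$ is

$$\Delta_c(r, c) = d_{sp}(r_{nxt}, c) - d_{sp}(r, c) = (|r_{nxt}| - |r|) + \bigl(x_{sp}(r_{nxt}, c) - x_{sp}(r, c)\bigr)$$

and between (r, c) and $(r, c_{nxt})$ is

$$\Delta_r(r, c) = d_{sp}(r, c_{nxt}) - d_{sp}(r, c) = (|c_{nxt}| - |c|) + \bigl(x_{sp}(r, c_{nxt}) - x_{sp}(r, c)\bigr).$$

In an m x n, integer-addressed network, the regions in Fig. 6 are

TABLE V
COMPARISON OF ROUTING DECISIONS FROM RULE 1 AND THE CHANGE IN DISTANCE FROM $d_{sp}(r, c)$
Column headings: Region; Current Node (r mod 2, c mod 2); Column $(r_{nxt}, c)$: $R_c$, $\Delta_c$; Row $(r, c_{nxt})$: $R_r$, $\Delta_r$
3-2&(,-(
Ql01
+1
CI
"1+11
00
ell
Q.-
QI
10
'. cl+l
X
-I
11
Q2
Q)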
m·
(
I.
X
-I
'1
"X
-I
+1
CI
'2
3-21(,- ml2 ) )-2i(c+ n/2 -I)
C2
-1
Q2-+'2
10
Qr
00 01
Qr+C2 QQ3-
-I
X
01
'J
X
11
c'J+l) Q3 Ql-
-I
10
c]
Qr
+1
X
-1 X
3-2a(,-
",/2 )
3-6(c+ n/2-1)
3-2cHc+ n 12-2) +1
3-26(,+ m /2 -2) +1
X
-I
Q4
X
·1
00
'3
11
Q4-+C4
01
Q4-
00 10
Q"-+'4 Q,,-
+1
I]
c.
X
-1
'4
3-2&«('- n~) 3-2&(,+ m/2 -1)
3-2I(c- n/2). 3-2&Cr+ mI2-l) ·1
X
·1
+1
C]
nxt ) -
X
')+h
c) - xsp(r, c)
c
-1
1
11
is
+ (xsp(r,
~
)-2I(c-( n/2 -1» +1
Qt-
and
~
m/2 -1»
X
·1
X
n]
Ql= (r, c) : lsrS2" , 1~CS2
Ql-=fQl n (rl U c] U II>}
[
r,>
l:sc
[(r, c): lsr<~ ,c=iJ
Cj=
Q4-
= {Q4 n
(r4 U C4)}
=(; ,i)
r4= (r. C) : r=O, 1 scs~)
Q2= (r, c) : 1 srs; , -~+ I, SCSo]
C4= «r, c): -;+lsr
11
Q2- ={Q2
n (r2 U
C2)}
r2= (r. c): r=l, -~+lSC
+ 1:s;rsO,
= {Q3 n
-i+ 1 scsO n r=c=o]
('3 U C3 U 13)}
13= «r, c) :r= - i + 1, -~+ 1
In Table V, $\Delta_c(r, c)$ and $\Delta_r(r, c)$ are listed for r and c even and odd in each of the regions. Those table entries that are not possible, such as r even in $r_2$, have been eliminated. An "X" in column $R_c$ indicates that the first routing rule selects the link to $(r_{nxt}, c)$ and an "X" in column $R_r$ indicates that the routing rule selects the link to $(r, c_{nxt})$. Table V shows the following.

1) $\Delta_r(r, c) = -1$ or $\Delta_c(r, c) = -1$ for all $(r, c) \ne (0, 0)$. Therefore, each node $(r, c) \ne (0, 0)$ is connected to at least one node that is one closer to the destination. It is possible to travel from any node to the destination in the number of steps specified by $d_{sp}(r, c)$ by selecting a link for which $\Delta_r(r, c) = -1$ or $\Delta_c(r, c) = -1$ at each node along the path. Therefore, $d_{sp}(r, c)$ is a valid distance function from every node (r, c) to the destination.

2) $\Delta_r(r, c) \ge -1$ and $\Delta_c(r, c) \ge -1$ for all $(r, c) \ne (0, 0)$. Therefore, $d_{sp}(r, c) = \text{Minimum}\,(d_{sp}(r_{nxt}, c) + 1,\ d_{sp}(r, c_{nxt}) + 1)$ for all $(r, c) \ne (0, 0)$. Since it is not possible to select a link from any node that results in a shorter path to the destination, $d_{sp}(r, c)$ is the shortest path to the destination. This is the basis for many shortest path algorithms and is proven in [11, pp. 193-195].

3) $R_c = X$ or $R_r = X$ for every entry in the table. Therefore, Rule 1 selects at least one outgoing link at every node in the network.

4) $R_c = X$ if and only if $\Delta_c(r, c) = -1$ and $R_r = X$ if and only if $\Delta_r(r, c) = -1$ for all $(r, c) \ne (0, 0)$. Since $R_c = X$ only if $\Delta_c(r, c) = -1$ and $R_r = X$ only if $\Delta_r(r, c) = -1$, routing Rule 1 selects a shortest path to the destination. Since $R_c = X$ whenever $\Delta_c(r, c) = -1$ and $R_r = X$ whenever $\Delta_r(r, c) = -1$, routing Rule 1 can find every shortest path to the destination.
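The distance function used in the proof can be checked numerically on small networks. The sketch below is an illustration rather than the paper's method: it codes $d_{sp}(r, c)$ and the next-node relations $r_{nxt}$ and $c_{nxt}$ in relative coordinates, as read from the Appendix, and compares $d_{sp}$ with a breadth-first search toward the destination. The function names are hypothetical.

```python
from collections import deque


def U(x):
    return 1 if x >= 0 else 0


def delta(x):
    return 1 if x == 0 else 0


def d_sp(r, c, m, n):
    """Distance function |r| + |c| + x_sp(r, c) for an m x n network."""
    hm, hn = m // 2, n // 2
    r_odd, c_odd = r % 2, c % 2          # Python's % is nonnegative here
    r_even, c_even = 1 - r_odd, 1 - c_odd
    x_sp = (
        2 * r_odd * c_odd * (1 - delta(r - hm)) * (1 - delta(c - hn))
        * (1 - U(-r)) * (1 - U(-c))                       # Q1 term
        + 2 * r_even * c_odd * (1 - U(-r)) * U(-c)        # Q2 term
        + (2 + 2 * r_even * c_even * (1 - delta(r + hm - 1))
           * (1 - delta(c + hn - 1))) * U(-r) * U(-c)     # Q3 term
        + 2 * r_odd * c_even * U(-r) * (1 - U(-c))        # Q4 term
        - 4 * delta(r) * delta(c)                         # destination
    )
    return abs(r) + abs(c) + x_sp


def next_nodes(r, c, m, n):
    """Nodes reached along the column and along the row from (r, c)."""
    hm, hn = m // 2, n // 2
    r_nxt = r - (1 - c % 2) * (1 - m * delta(r + hm - 1)) + (c % 2) * (1 - m * delta(r - hm))
    c_nxt = c - (1 - r % 2) * (1 - n * delta(c + hn - 1)) + (r % 2) * (1 - n * delta(c - hn))
    return [(r_nxt, c), (r, c_nxt)]


def bfs_distances(m, n):
    """Shortest distance from every node to the destination (0, 0)."""
    nodes = [(r, c) for r in range(1 - m // 2, m // 2 + 1)
             for c in range(1 - n // 2, n // 2 + 1)]
    preds = {node: [] for node in nodes}
    for node in nodes:
        for nxt in next_nodes(*node, m, n):
            preds[nxt].append(node)
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        node = queue.popleft()
        for p in preds[node]:
            if p not in dist:
                dist[p] = dist[node] + 1
                queue.append(p)
    return dist


def formula_matches_bfs(m, n):
    dist = bfs_distances(m, n)
    return len(dist) == m * n and all(
        d == d_sp(r, c, m, n) for (r, c), d in dist.items())
```

Agreement between `d_sp` and the search on a given network is a numerical counterpart of items 1) and 2) above: every node other than the destination has a link with a change of -1, and no link has a change below -1.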
REFERENCES
[1] N. F. Maxemchuk, "The Manhattan street network," in Proc. GLOBECOM'85, New Orleans, LA, Dec. 1985, pp. 255-261.
[2] N. F. Maxemchuk, "Regular and mesh topologies in local and metropolitan area networks," AT&T Tech. J., vol. 64, pp. 1659-1686, Sept. 1985.
[3] E. H. Steward, "A loop transmission system," in Conf. Rec. Int. Conf. Commun., San Francisco, CA, June 1970, pp. 36-1 to 36-9.
[4] B. K. Penney and A. A. Baghdadi, "Survey of computer communications loop networks: Part 1," Comput. Commun., vol. 2, pp. 165-180, Aug. 1979.
[5] B. K. Penney and A. A. Baghdadi, "Survey of computer communications loop networks: Part 2," Comput. Commun., vol. 2, pp. 224-241, Oct. 1979.
[6] N. Abramson, "The Aloha system: Another alternative for computer communications," in Fall Joint Comput. Conf., AFIPS Conf. Proc., vol. 37, 1970, pp. 281-285.
[7] R. M. Metcalfe and D. R. Boggs, "Ethernet: Distributed packet switching for local computer networks," Commun. ACM, vol. 19, pp. 395-404, July 1976.
[8] A. G. Greenberg and J. Goodman, "Sharp approximate analysis of adaptive routing in mesh networks," submitted to Proc. Int. Sem. Teletraffic Anal. and Comput. Performance Eval., The Netherlands, June 1986.
[9] R. T. Prosser, "Routing procedures in communications networks, Part I: Random procedures," IRE Trans. Commun. Syst., pp. 322-329, Dec. 1962.
[10] H. E. White and N. F. Maxemchuk, "An experimental TDM data loop exchange," in Proc. ICC '74.
[11] H. Frank and I. T. Frisch, Communication, Transmission and Transportation Networks. Reading, MA: Addison-Wesley, 1971.
Nicholas F. Maxemchuk (M'72-SM'85) received the B.S.E.E. degree from the City College of New York, New York, NY, and the M.S.E.E. and Ph.D. degrees from the University of Pennsylvania, Philadelphia.
He is the Head of the Distributed Systems Research Department at AT&T Bell Laboratories, Murray Hill, NJ. He has been at Bell Laboratories since 1976, and his research interests include local and metropolitan area networks, protocols, speech editing, and picture processing. Prior to joining Bell Laboratories, he was at RCA Labs, Princeton, NJ, for eight years where his research interests included local area networks, error-correcting codes, and graphics compression. He has been on the adjunct faculty at Columbia University, New York, NY, where he was associated with the Center for Telecommunications Research, and the University of Pennsylvania, where he taught courses in computer communications networks.
Dr. Maxemchuk has served as the Editor for Data Communications for the IEEE TRANSACTIONS ON COMMUNICATIONS, as Guest Editor for the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, and on the Program Committees of numerous conferences. He was awarded the RCA Laboratories Outstanding Achievement Award, the Bell Laboratories Distinguished Technical Staff Award, and the IEEE's Leonard G. Abraham Prize Paper Award.
Bottleneck Flow Control
JEFFREY M. JAFFE, MEMBER, IEEE
Abstract-The problem of optimally choosing message rates for users of a store-and-forward network is analyzed. Multiple users sharing the links of the network each attempt to adjust their message rates to achieve an ideal network operating point or an "ideal tradeoff point between high throughput and low delay." Each user has a fixed path or virtual circuit. In this environment, a basic definition of "ideal delay-throughput tradeoff" is given and motivated. This definition concentrates on a fair allocation of network resources at network bottlenecks. This "ideal policy" is implemented via a decentralized algorithm that achieves the unique set of optimal throughputs. All sharers constrained by the same bottleneck are treated fairly by being assigned equal throughputs. A generalized definition of ideal tradeoff is then introduced to provide more flexibility in the choice of message rates. With this definition, the network may accommodate users with different types of message traffic. A transformation technique reduces the problem of optimizing this performance measure to the problem of optimizing the basic measure.
Paper approved by the Editor for Computer Communication of the IEEE Communications Society for publication after presentation at the 5th International Conference on Computer Communication, Atlanta, GA, October 1980. Manuscript received April 25, 1980; revised January 6, 1981. This research was supported in part by the National Science Foundation under Grant £D8-79-25092. The author is with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598.

Reprinted from IEEE Transactions on Communications, vol. COM-29, no. 7, July 1981.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.

I. INTRODUCTION
VARIOUS store-and-forward packet-switched computer networks have been developed in recent years. The primary function of these networks is to route messages or packets from one network location to another. Typically, the source of a message dispatches a packet to a neighboring location or node, which relays the message to another node and so forth, until the message arrives at the destination.

There are a number of disciplines used by networks to funnel a large number of packets from one source to a given destination. For example, ARPANET handles each packet individually [1], trying to find the shortest path for each packet based on changing network characteristics. In this paper we assume a fixed route approach whereby all messages from a given "session" are assigned to a fixed unique route. This approach is currently used in TYMNET [2], [3], IBM's network architecture [4], and various other networks (e.g., [5]). Many sessions may share a given route.

The total time required for transmission of a packet is called its delay. Assuming small nodal processing time, there are two major components to message delay. Since communication links take some time to transmit a message, there is a transmission delay component. Also, if a communication link needs to transmit too many packets at once, it temporarily buffers some of them, leading to a queueing delay component. The queueing delay clearly depends on the amount of network traffic, and roughly speaking, increases with greater traffic.

Flow control regulates the amount of traffic to maintain good system performance. For example, if the buffers at a link are almost full, some mechanism is needed to slow down the rate of incoming traffic. Otherwise, the buffers would overflow, causing severe queueing delays or even deadlock. Another purpose of flow control is to maintain a good throughput-delay tradeoff. If a user is sending a high average message rate (in our studies this is equated with throughput), the resulting delays may be intolerably long. On the other hand, the user would not want to sacrifice too much throughput in order to achieve low delay. Related to this is the notion of fairly dividing network resources between competing network users.

In this paper we discuss methods to achieve a well-defined notion of system performance which results in fairness to users and a good delay-throughput tradeoff. We concentrate on network access means of flow control [6] where external inputs are throttled based on measurements of internal network congestion. The buffer depletion problem (see [7]) is ignored so that we may concentrate on delay and throughput. Formally, when our model is specified (in Section II), infinite buffers at each link are assumed.

This paper primarily concentrates on the fundamental questions of "what is optimum performance?" and "what notions of optimality are accomplishable in a decentralized environment?". No new method of constraining the input of messages is proposed; it is assumed that message rate is regulated by a simple rate mechanism, i.e., some "black box" at each route which chooses the message rate for that route.

Network access flow control schemes include the isarithmic scheme [8], input buffer limiting [9], and the choke packet scheme [10]. Other schemes are discussed in [6] and [11]. The isarithmic scheme limits the total number of packets allowable in the network. Input buffer limiting locally restricts input traffic in favor of transit traffic.

The "bottleneck flow control" presented here may be viewed as a generalization and abstraction of both the choke packet scheme and certain ideas presented in [9]. A common feature with the choke packet scheme is that the decision to decrease message rate is a function of congestion in the bottleneck links. The relationship between the two is further developed throughout this paper. The main difference is that, while optimality is defined in a similar way, the control mechanisms are different. As a result, the choke packet scheme has no explicit way of ensuring a specified notion of fairness. On the other hand, bottleneck flow control uses fairness criteria related to those that are described in [9].

In Section III we define and motivate a notion of "optimal tradeoff." An adaptive algorithm is given in Section IV which attempts to achieve this tradeoff in a network that is experiencing changes in traffic patterns and numbers of users. Due
to the changing nature of such a network, it is difficult to state specific "steady-state" properties of the algorithm. We thus restate the problem somewhat to reflect a static network. In that environment it is easier to discuss properties of the "optimal tradeoff" and an algorithm that implements it. In particular, the following is achieved:
• A "decentralized" algorithm is given that always achieves the optimal tradeoff (Sections V and VII).
• The algorithm obtains the tradeoff in linear time in the number of users (Section VII).
• The "optimal tradeoff" defines a unique set of throughputs that the users of the network must achieve (Section VIII).
• The unique set of optimal throughputs has important "fairness" properties (Section IX).

Section X generalizes these results to the situation where different user classes have different network performance requirements. The main result of Section X is that the techniques developed earlier in the paper may be applied directly to the more general case by a simple transformation technique.

We briefly explain and motivate the notion of a "decentralized" algorithm for flow control. When a user chooses its throughput, the inputs to the process should consist of information locally available to it. The user might be permitted to use information about the interfering traffic on its path, but not about global topology. Basically, in a decentralized algorithm, information not readily available on a user's path should not be usable for throughput determination.

In [12] it is shown that a single user may optimize its power (ratio of throughput to delay) using only such local information. However, in [13] it is shown that, under certain conditions, no decentralized algorithm maximizes power in a multiple user system. Since certain optimality criteria are nondecentralizable, the importance of the decentralizable criterion discussed here is enhanced.

We further remark that the criterion expressed here has other advantages over the power concept. It is shown in [14] that, in some network configurations, optimizing power implies that certain users must choose zero throughput. A corollary of the fairness property of Section IX is that no users are required to have zero throughput at optimal performance. This fact is still true for the generalization of Section X where users are not handled identically in terms of throughput allotment.

II. NETWORK MODEL
We model a data network as a graph (N, L) with vertex (or node) set N and edge (or link) set L. Each link $l \in L$ has a service rate of $s(l)$ bits/s. A path p in the network is a sequence $p = (n_1, \ldots, n_k)$ with $n_i \in N$ such that for $i = 1, \ldots, k - 1$, $l_i = (n_i, n_{i+1}) \in L$. The set $\{l_1, \ldots, l_{k-1}\}$ is denoted $l(p)$, the links of p. A path p models a fixed route that is used by one of the "users" of the network.

In order to evaluate the delays on the links, a queueing model is needed which relates throughputs to delay. We use a simple model ([15, Sect. 5.6]) which, as indicated above, has infinite buffers. Specifically, we assume that each link may be modeled as an M/M/1 queue, the average message length is b bits/message, there is no nodal processing time, and Kleinrock's independence assumption applies [15].

Define the capacity of link l, $c(l)$, by $c(l) = s(l)/b$. Assume that there are K users, all of whose fixed routes use a link l. Let $\gamma_i$ denote the message rate of the ith user. In that case, the average steady-state delay for the packets (of each user) that traverse link l is $d_l(\boldsymbol{\gamma}) = 1/(c(l) - (\gamma_1 + \cdots + \gamma_K))$. The average total delay of packets sent by user i, $D_i(\boldsymbol{\gamma})$, is the sum of the average delays experienced at the individual links.

III. OPTIMALITY CRITERION
In this section an optimality criterion is presented using several levels of description. First, optimum throughput is defined in terms of link capacity. We explain why our definition might be considered "the optimum operating point of a network." Next, the definition is reformulated to express a tradeoff between user throughput and delay. Section IV gives an adaptive algorithm for optimizing the criterion in a "dynamically changing" network. It is difficult, however, to present any concrete analysis for a rapidly changing network. Starting with Section V we analyze the optimality criterion in a "static" environment.

Recall that $c(l)$ is the capacity of the link l. Let $\gamma(l)$ denote the sum of the throughputs of all users of link l. The maximum value that $\gamma(l)$ can be is $c(l)$ or else messages are generated at a faster rate than they can be transmitted. Certainly, $\gamma(l) > c(l)$ is not a situation we would like to encourage for any link. In fact, it is probably not even desirable to have $\gamma(l) = c(l)$ for two reasons. First of all, if $\gamma(l) = c(l)$ the system "never reaches steady state"; the delays of the messages increase over time due to the fact that buffer occupancy approaches infinity. Also, choosing $\gamma(l) = c(l)$ leaves no room for fluctuations in the network. One user may be forced by certain considerations to increase his throughput or new users may attempt to open up new routes sharing link l. For that reason, optimum $\gamma(l)$ is chosen to be somewhat less than $c(l)$, as we proceed to describe. This distance is parameterized by a variable x. This variable permits designers of different systems to choose somewhat different notions of "ideal throughput-delay tradeoff." If they are throughput-oriented, they choose x large; if delay-oriented, then x should be small.

Define the residual capacity of l by $r(l) = c(l) - \gamma(l)$. Let $\gamma$ denote the throughput of a user whose path includes link l. The user saturates l if $\gamma = x(r(l))$. The user overloads l if $\gamma > x(r(l))$. A user is overloaded if it overloads any link on its path. A user is saturated if it is not overloaded and it saturates at least one link on its path. These preliminaries prepare us for the following.

Definition: Given a data network (as modeled in Section II) with several paths through the network (corresponding to users of the network), and a rate assigned to each user, the rate assignment is optimal if all users are saturated.

Remarks: The way that we keep $\gamma(l)$ somewhat less than $c(l)$ is to guarantee that no user overloads any links. Thus, for each link l, $x(r(l)) \geq \gamma_{max}$ where $\gamma_{max}$ is the largest throughput of any user of link l. In addition to keeping $\gamma(l)$ somewhat less than $c(l)$, we also desire a large measure of throughput in the network. Thus, each user must not only prevent
overload; it also must be saturated. Each user would then have the largest possible throughput subject to x and the residual capacities.

To contrast this with the Cyclades choke packet proposal, we remind the reader that optimality in [10] basically requires that no link exceeds a certain threshold of utilization. For instance, $\gamma(l)$ should not exceed $(0.8)(c(l))$ if the threshold equals 0.8. We feel that it is better to force saturation of each user and choose $\gamma(l)$ as a function of $\gamma_{max}$ for a few reasons. The primary reason is that the choke packet scheme has no regard for the number or types of users of the link, and therefore loses the ability to fairly allocate resources. By fixing the requirement that no link should exceed a certain utilization, one loses the ability to predict transients in future utilization based on current utilization. This is developed further in Section IX.

Also assume that $x(r(l)) = \gamma_{max}$. Then, with our definition, if x = 1, we can accommodate one new user with throughput $\gamma_{max}$ without causing $\gamma(l) > c(l)$. Similarly, choosing $r(l) = \gamma_{max}/x$ protects the network against percentage changes in each user's throughput due to transients. If a user increases his throughput by a factor of $1/x$, the inequality $c(l) \geq \gamma(l)$ still applies. Methods of obtaining an optimality criterion similar to "80 percent of utilization," as a limiting case of saturation, are discussed in Section XII.

Next, we motivate saturation as a means of expressing an "optimal delay-throughput tradeoff." Recall that the delay at l is given by $d_l = 1/(c(l) - \gamma(l))$. Thus, saturation for user p is equivalent to

$$\gamma = \min_{l:\, l \in l(p)} \frac{x}{d_l(\boldsymbol{\gamma})}. \tag{1}$$
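The definitions of "saturates," "overloads," and "saturated" can be made concrete with a short sketch. It assumes the M/M/1 delay of Section II, so that $x/d_l$ equals $x \cdot r(l)$; the two-link network, the rates, and the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def link_loads(rates, paths, capacity):
    """gamma(l): total throughput of all users whose path includes link l."""
    return {l: sum(rates[u] for u in paths if l in paths[u]) for l in capacity}


def classify(user, rates, paths, capacity, x):
    """Return 'overloaded', 'saturated', or 'unsaturated' for one user."""
    load = link_loads(rates, paths, capacity)
    # x * r(l) on each link of the path; with d_l = 1 / (c(l) - gamma(l))
    # this is the same quantity as x / d_l in (1).
    bounds = [x * (capacity[l] - load[l]) for l in paths[user]]
    if rates[user] > min(bounds):
        return "overloaded"      # exceeds x * r(l) on at least one link
    if rates[user] == min(bounds):
        return "saturated"       # no link overloaded, bottleneck link met exactly
    return "unsaturated"


# Hypothetical network: user 1 uses link A only, user 2 uses links A and B.
capacity = {"A": Fraction(3), "B": Fraction(10)}
paths = {1: ["A"], 2: ["A", "B"]}
x = Fraction(1)

balanced = {1: Fraction(1), 2: Fraction(1)}      # r(A) = 1, so both users saturate A
light = {1: Fraction(1, 2), 2: Fraction(1)}      # r(A) = 3/2, room left for both
heavy = {1: Fraction(2), 2: Fraction(1)}         # r(A) = 0, both overload A

for name, rates in (("balanced", balanced), ("light", light), ("heavy", heavy)):
    print(name, [classify(u, rates, paths, capacity, x) for u in paths])
```

Exact rational arithmetic is used so that the equality test for saturation is meaningful; with floating-point rates a tolerance would be needed.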
From (1) it is evident that saturation is a direct method of expressing a delay-throughput tradeoff for the users of the network. A user may increase its throughput until the delay on its "bottleneck" link is too large. As delay increases, $\gamma$ is constrained by (1).

Note the role played by the parameter x in all viewpoints of the optimality criterion. From the network point of view, it indicates the amount of traffic fluctuation that is to be protected against. From the user viewpoint, it indicates the amount of effect that increased delay should have on throughput.

There is a third viewpoint of saturation. Using Little's theorem [16], the average number of messages waiting at a link l when the throughput of a user is $\gamma$, and the delay is $d_l$, is $\gamma \cdot d_l$. Now if $\gamma \leq x/d_l$ for every link l in the path of a given user, the user is willing to tolerate x messages waiting at each link, and a total of x times #(user's links) messages waiting in the system. Thus, the average number of waiting messages that a user will tolerate varies linearly with the length of his path: if the path is longer, the user may have more messages in transit.

To review, the features of optimum network operation based on the use of the saturation measure are
1) protection for the network against changes in users' rates
2) protection for the network from arrivals of new users
3) establishment of delay/throughput tradeoff at the bottleneck link
4) use of the parameter (x) to permit flexibility in the definition of optimum performance
5) protection for the buffers in an average sense
6) fair allocation of resources (Section IX).

In addition to stating what optimal performance is (all users saturated), it might be helpful to evaluate how far suboptimal solutions are from optimal. To do this, it is useful to have an objective function which characterizes the quality of a set of throughput assignments. Assume that there are m users with throughputs $\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_m)$. Define

$$f(\boldsymbol{\gamma}) = \sum_{i=1}^{m} \left| \gamma_i - \min_{l:\, l \in l(i)} \frac{x}{d_l(\boldsymbol{\gamma})} \right|. \tag{2}$$

If each user is saturated at $\boldsymbol{\gamma}$, then for all i, $\gamma_i = \min_{l:\, l \in l(i)} x/d_l(\boldsymbol{\gamma})$ and $f(\boldsymbol{\gamma}) = 0$. Conversely, if $f(\boldsymbol{\gamma}) = 0$, all users are saturated. Thus, the goal of saturating all users may be conveniently restated as an attempt to minimize f.
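As a rough illustration of the objective function, the sketch below evaluates f for a few rate assignments on a single shared link. It assumes the M/M/1 delay of Section II, under which $x/d_l(\boldsymbol{\gamma}) = x\,(c(l) - \gamma(l))$; the network and the name `objective` are hypothetical.

```python
from fractions import Fraction


def objective(rates, paths, capacity, x):
    """f(gamma): sum over users of |gamma_i - min over the user's links of x / d_l|."""
    load = {l: sum(rates[u] for u in paths if l in paths[u]) for l in capacity}
    total = Fraction(0)
    for user, path in paths.items():
        # Under the M/M/1 model, x / d_l = x * (c(l) - gamma(l)).
        target = min(x * (capacity[l] - load[l]) for l in path)
        total += abs(rates[user] - target)
    return total


# Hypothetical example: two users share one link of capacity 3, with x = 1.
capacity = {"A": Fraction(3)}
paths = {1: ["A"], 2: ["A"]}
x = Fraction(1)

f_saturated = objective({1: Fraction(1), 2: Fraction(1)}, paths, capacity, x)
f_underused = objective({1: Fraction(1, 2), 2: Fraction(1, 2)}, paths, capacity, x)
f_uneven = objective({1: Fraction(1), 2: Fraction(1, 2)}, paths, capacity, x)
print(f_saturated, f_underused, f_uneven)
```

Only the assignment in which both users send at rate 1 gives f = 0; the other two assignments leave residual capacity that at least one user could still claim.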
IV. AN ADAPTIVE DISTRIBUTED ALGORITHM
An adaptive distributed algorithm which attempts to saturate all paths without overloading any is now given. Each user adjusts its message rate based on information sent to it by the links and nodes on its path. The information needed by a user with path p is
1) its current throughput $\gamma$
2) $\min_{l:\, l \in l(p)} r(l)$.

We do not specify the mechanics of when this information is made available and in what form the information arrives. Each link may know to dispatch information to all users of the link at regular intervals, or alternatively, information gathering may be prompted by a signal from the user. Each link may compute $r(l)$ or estimate it based on buffer occupancy. Also, the links may send the throughputs of the individual users of the links, and let user p calculate $r(l)$. The algorithm executed by user p each time it desires to recalculate its message rate $\gamma'$ from the old rate $\gamma$ is

$$\gamma' \leftarrow \min_{l:\, l \in l(p)} \frac{x\,(r(l) + \gamma)}{1 + x}. \tag{3}$$
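A single application of the update rule can be sketched as follows. The rule is read here as $\gamma' = \min_l x\,(r(l) + \gamma)/(1 + x)$, which is the form required for the updating user to come out exactly saturated while the other users hold their rates; the function name and the one-link example are illustrative assumptions.

```python
from fractions import Fraction


def adaptive_update(rate, residuals, x):
    """New rate for one user, given its old rate and r(l) for each link on its path.

    The residuals are measured while the user is still sending at the old rate.
    """
    return x * (min(residuals) + rate) / (1 + x)


# Hypothetical example: one link of capacity 3 shared by users a and b, with x = 1.
capacity = Fraction(3)
x = Fraction(1)
rate_a, rate_b = Fraction(1, 2), Fraction(1)

residual = capacity - (rate_a + rate_b)              # r(l) = 3/2 before the update
new_rate_a = adaptive_update(rate_a, [residual], x)

# After the update, user a meets its bound exactly: gamma' = x * r'(l).
new_residual = capacity - (new_rate_a + rate_b)
print(new_rate_a, x * new_residual)
```

In this example user a moves from 1/2 to 1, the residual capacity drops to 1, and user a is saturated; an update by user b at that point would leave its rate unchanged.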
The following explains why we say that the above algorithm attempts to achieve saturation. First, note that after executing one step of the algorithm, the user is saturated. This can be seen as follows. For a link l, the new sum of throughputs is $\gamma'(l) = \gamma(l) - \gamma + \gamma'$. Thus, $x(c(l) - \gamma'(l)) = xc(l) - x\gamma'(l) = xc(l) - x\gamma(l) + x\gamma - x\gamma' = \gamma'$ by (3) for the link at which $r(l)$ was minimized. Also, $x(c(l) - \gamma'(l)) \geq \gamma'$ for all other links l, i.e., none is overloaded.

If there were no transients, such as no new users entering the system, and each user converged to a steady-state throughput, then those throughputs that are converged to will saturate all users. Any unsaturated or overloaded user must change its throughput! Unfortunately, we are unable to show, even
without transients and new users, that each user does converge. To clearly express an algorithm that saturates all users, we spend the rest of this paper discussing a static case, i.e., no new users.

As a practical matter, the above algorithm would need to be modified in an adaptive situation. Choosing $\gamma'$ by (3) may cause large deviations in certain users' message rates, leading to instabilities in the system. A better way is to have users slowly change rates in the direction (increase or decrease) implied by (3). The reader is referred to [14] for an algorithm to coordinate user updates, so that many users do not change their rates at once.

V. ALGORITHM TO SATURATE ALL USERS
In this section an algorithm is presented which saturates all users in a static network with a fixed set of users. It is assumed that if a user is assigned by the algorithm to send messages at a rate $\gamma$, that indeed its average throughput is $\gamma$. (Variations of this are described in Section XI.) The algorithm is decentralized in the sense described above. Each user chooses its throughput based on information provided from its links. In fact, the execution of the algorithm will be presented in a manner which distributes the computation even more: the links (or whatever controls the links) will do some computation in the algorithm. The link computation provides a concise description of the current traffic on the link.

There are a number of idealizations used in this section. It is assumed that each link may accurately calculate message rates of users that use the link. Also, in order to conveniently discuss the convergence time of the algorithm, a synchronous algorithm is assumed (i.e., a clock at each node permits all updates to occur at once). However, the main feature of using "local information," i.e., information accumulated along a user's path, is preserved. In practice, one would probably use a hybrid of the algorithm of Section IV and the algorithm that we proceed to present here.

The algorithm proceeds in iterations. Consider a link l which is shared by a number of users, exactly j of which are not saturated before the ith iteration. Let $\gamma_{sat}(l, i)$ denote the sum of the throughputs of the users of link l that are saturated before the ith iteration. Then the saturation allocation of l at i, denoted $\gamma(l, i)$, is

$$\gamma(l, i) = x \left( \frac{c(l) - \gamma_{sat}(l, i)}{1 + jx} \right). \tag{4}$$

Intuitively, if each unsaturated user of link l chooses the saturation allocation as its throughput, and each saturated user leaves its throughput unchanged, then all unsaturated users become saturated. This follows from the fact that $r(l)$ in that case would be $(c(l) - \gamma_{sat}(l, i))/(1 + jx)$.

The following is the algorithm for the ith iteration. Initially, all throughputs are 0 and each link knows how many users have paths which use it.

Saturation Algorithm (ith Iteration)
1) Each link l calculates $\gamma(l, i)$.
2) Each link sends the value $\gamma(l, i)$ to all users of l.
3) Each user sets its new throughput $\gamma$ to the smallest value of $\gamma(l, i)$ among links l that it uses.
4) Each link l determines which of its users are now saturated at l and informs each such user.
5) Each user that is saturated at any link informs all of its links that it is saturated.

There are basically two computations done at each iteration. After receiving $\gamma(l, i)$ from each link l on its path, a user readjusts its throughput by taking the minimum allocation [step 3)]. Also, each link must calculate $\gamma(l, i)$. The information needed for this calculation is the number of saturated users [obtained in step 5)] and $\gamma_{sat}(l, i)$ (obtained in some way by measuring each saturated user's throughput).

One method whereby a link can determine $\gamma_{sat}(l, i)$ without explicitly finding out which user sent each message is briefly described. Let each saturated user set a bit in the message header to 1 and each unsaturated user to 0. Then $\gamma_{sat}(l, i)$ is just the average rate of messages arriving with header bit equal to 1. Further elaboration on implementation is omitted.

The key properties of the algorithm (proved in Section VII) follow.
• Any user that is saturated after iteration i remains saturated after iteration i + 1.
• If not all users are saturated at the beginning of an iteration, then at least one becomes saturated at the iteration.

From the above two facts it is immediate that if there are m users, they are all saturated after no more than m iterations.

VI. AN EXAMPLE
Consider the network of Fig. 1. The following is a trace of the iterations of the algorithm with x = 1. The labels of the links are the capacities.

              $\gamma_1$           $\gamma_2$    $\gamma_3$    $\gamma_4$   $\gamma_5$
Iteration 1   1/2 (from link D)    1 1/2 (F)     1 1/2 (F)     10 (A)       3 1/3 (C)
Iteration 2   1/2                  7/4 (E)       11/6 (F)      10           19/4 (C)
Iteration 3   1/2                  7/4           15/8 (F)      10           19/4

User 1 is saturated at link D, 2 at E, 3 at F, 4 at A, and 5 at C.

VII. PROOF OF CORRECTNESS
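Before the formal argument, the synchronous saturation algorithm of Section V can be sketched in code. This is a minimal sketch under stated assumptions: the allocation is taken to be $\gamma(l, i) = x\,(c(l) - \gamma_{sat}(l, i))/(1 + jx)$, the message exchange between links and users is collapsed into a central loop, and the two-link network is a hypothetical example (it is not the network of Fig. 1, whose topology is not reproduced here).

```python
from fractions import Fraction


def saturation_algorithm(paths, capacity, x):
    """Run synchronous iterations on a static network.

    paths:    {user: [links on its fixed route]}
    capacity: {link: c(l)}
    Returns the final rates and the number of iterations used.
    """
    rates = {u: Fraction(0) for u in paths}
    saturated = set()
    iterations = 0
    for _ in range(len(paths)):            # at most m iterations are expected
        if len(saturated) == len(paths):
            break
        iterations += 1
        # Steps 1-2: each link computes its saturation allocation.
        allocation = {}
        for link, c in capacity.items():
            users = [u for u in paths if link in paths[u]]
            gamma_sat = sum((rates[u] for u in users if u in saturated), Fraction(0))
            j = sum(1 for u in users if u not in saturated)
            allocation[link] = x * (c - gamma_sat) / (1 + j * x)
        # Step 3: each user takes the smallest allocation on its path.
        rates = {u: min(allocation[l] for l in paths[u]) for u in paths}
        # Steps 4-5: determine which users are now saturated.
        load = {l: sum(rates[u] for u in paths if l in paths[u]) for l in capacity}
        for u in paths:
            bounds = [x * (capacity[l] - load[l]) for l in paths[u]]
            if rates[u] == min(bounds):
                saturated.add(u)
    return rates, iterations


# Hypothetical network: link A (capacity 10) carries users 1 and 2,
# link B (capacity 2) carries users 2 and 3.
paths = {1: ["A"], 2: ["A", "B"], 3: ["B"]}
capacity = {"A": Fraction(10), "B": Fraction(2)}
rates, iterations = saturation_algorithm(paths, capacity, Fraction(1))
print(rates, iterations)
```

In this example users 2 and 3 are saturated at link B after the first iteration with rate 2/3 each, and user 1 is saturated at link A after the second iteration with rate 14/3, so three users need only two iterations.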
The main result of this section is the following.
Theorem 1: Fix a network with m paths. Define

$$f(\boldsymbol{\gamma}) = \sum_{i=1}^{m} \left| \gamma_i - \min_{l:\, l \in l(i)} \frac{x}{d_l(\boldsymbol{\gamma})} \right|. \tag{5}$$
If the saturation algorithm is executed, then after at most m iterations, the resulting value of $\boldsymbol{\gamma}$ satisfies $f(\boldsymbol{\gamma}) = 0$. Furthermore, $\boldsymbol{\gamma}$ is unchanged by subsequent iterations of the algorithm.

Proof: As mentioned in Section V, this is proved by showing that saturated users stay saturated and that each iteration produces at least one saturated user. (Recall that $f(\boldsymbol{\gamma}) = 0$ if all users are saturated at $\boldsymbol{\gamma}$.) The main technical result
needed to prove Theorem 1 may be stated informally as "$\gamma(l, i)$ is a nondecreasing function of i." This fact, and the fact that saturated users stay saturated, are proved inductively in the following lemma.

Lemma 1:
1) For all $l \in L$, all $i \in Z^+$, $\gamma(l, i + 1) \geq \gamma(l, i)$.
2) If any user becomes saturated at link l during the ith iteration, then all users of l that were not saturated before the ith iteration become saturated at l during the ith iteration.
3) If a user is saturated after the ith iteration with throughput $\gamma$, he remains saturated after the (i + 1)st iteration with throughput $\gamma$.

The proof of Lemma 1 is given in the Appendix. To complete the proof of Theorem 1 we prove the following.

Lemma 2: At each iteration which starts with some unsaturated users, at least one user becomes saturated.

Proof: For each link at which not all users are saturated at a given iteration, consider the saturation allocation of the link. Some link must have minimal allocation among all such links. All unsaturated users of that link choose that allocation. Since all saturated users of the link do not change their throughputs [3) of Lemma 1], all of the unsaturated users of that link become saturated.

Theorem 1 follows directly from Lemma 2 and 3) of Lemma 1. At each iteration at least one user becomes saturated, and saturated users stay saturated.

Corollary (Existence): Given any network and set of users of the network, there is a throughput assignment $\boldsymbol{\gamma}$ which saturates all of the users.

Note that the saturation algorithm determines the optimal throughputs exactly. In contrast, even when the adaptive algorithm of Section IV converges to an optimal solution, it does not converge exactly. Rather, the sequence of throughputs achieved by the users converges (in a Cauchy sense) to the optimal throughputs.

The fact that linear time is actually required by our algorithm in the worst case is proved by the example of Fig. 2. Basically, the $\gamma_i$ may be chosen so that each user converges at a different step. See [17] for details.

Fig. 1. Example network for execution of algorithm.

Fig. 2. Worst case network (in terms of number of steps).

VIII. UNIQUENESS
In this section it is shown that for any network and any set of users there is a unique way to saturate all users. This is a "well-defined" result for the saturation measure: two different throughput assignments cannot both be optimal for the same network configuration. We first separate out a simple lemma which we refer to later.

Lemma 3: Assume user i is saturated at link l at optimum solution $\boldsymbol{\gamma}$, with throughput $\gamma_i$, and user j uses link l and has throughput $\gamma_j$. Then $\gamma_i \geq \gamma_j$.

Proof: Since user i is saturated, $\gamma_i = x(r(l))$. Since user j is not overloaded, $\gamma_j \leq x(r(l)) = \gamma_i$.

Theorem 2: The value $\boldsymbol{\gamma}$ obtained from the saturation algorithm uniquely minimizes the objective function f.

Proof: We prove by induction on the iteration number that all users saturated at step i must obtain the same throughput assignment in any optimal solution. The basis step is similar to the inductive step and is left to the reader.

Consider all users saturated at the ith step. By Lemma 1, part 2), a user may only be saturated if it takes the saturation allocation at some link l, and all other not previously saturated users also take their saturation allocations at l (and get saturated). Thus, we may study all users that are saturated at the ith step by looking at all links at which all nonsaturated users take the saturation allocation.

Assume, contrary to the hypothesis, that it is possible for the users saturated at step i to get different assignments in some optimal assignment $\boldsymbol{\gamma}^*$. Consider a link l, which is saturated at the ith iteration and has some of its saturated users with different assignments in $\boldsymbol{\gamma}^*$. By induction, recall that all users that share l and are saturated before the ith iteration must receive the same throughputs in any optimal solution.

We first claim that at least one user saturated at l at the ith iteration must obtain less than $\gamma(l, i)$ in $\boldsymbol{\gamma}^*$. For if all of them receive $\gamma(l, i)$ or more, and the users saturated before iteration i receive the same amounts, then $x(c(l) - \gamma^*(l)) < \gamma(l, i)$. But then, all those users that receive $\gamma(l, i)$ or more are overloaded at l in $\boldsymbol{\gamma}^*$, and thus $\boldsymbol{\gamma}^*$ is not optimal.

Thus, one may consider a user which obtains throughput $\gamma^*$ in $\boldsymbol{\gamma}^*$ where $\gamma^* < \gamma(l, i)$. Assume that the user is saturated in $\boldsymbol{\gamma}^*$ at link $l'$. Note that $\gamma(l', i) > \gamma^*$ since $\gamma(l', i) \geq \gamma = \gamma(l, i) > \gamma^*$. Consider the sharers of $l'$. Those saturated before iteration i may not change their throughput in $\boldsymbol{\gamma}^*$ by induction. The r other users must have throughputs in $\boldsymbol{\gamma}^*$ of at most $\gamma^*$ each, by Lemma 3. Thus, $x(c(l') - \gamma^*(l')) \geq x(c(l') - \gamma_{sat}(l', i) - r\gamma^*)$. But $\gamma^* < \gamma(l', i)$ implies that $\gamma^* < x(c(l') - \gamma_{sat}(l', i))/(1 + rx)$. Thus,

$$x(c(l') - \gamma^*(l')) > x\left( c(l') - \gamma_{sat}(l', i) - \frac{rx}{1 + rx}\,\big(c(l') - \gamma_{sat}(l', i)\big) \right) = \frac{x}{1 + rx}\,\big(c(l') - \gamma_{sat}(l', i)\big) > \gamma^*. \tag{6}$$

This contradicts the fact that the user is saturated at $l'$ in $\boldsymbol{\gamma}^*$.
IX. FAIRNESS

One aspect of a flow control optimality criterion which is difficult to evaluate is the elusive notion of fairness. One version of fairness is to insist that all users obtain equal throughputs. In a network with different users, using links of different capacities, it is unlikely that such a policy would be desirable. Recall that flow control is instituted not only to protect a user against high delay due to traffic, but also to equitably divide network resources among competing users. The notion of fairness provided by saturation relates to the equitable division of resources. Briefly, saturation is "fair" because
• each user's throughput is at least as large as that of all other users that share its bottleneck link (Lemma 3)
• the only factor that prevents a user from obtaining higher throughput is the bottleneck link (which essentially divides resources equally).

X. GENERALIZATIONS

The fact that our algorithm saturates all users is interesting in a network with a homogeneous user set, but suffers in that it provides too restrictive a notion of fairness. The property that "all users are treated equally" may not be desirable in practical networks. One user may be more important and thus deserving of a higher message rate. Alternatively, a user that interferes with many other users would probably deserve special treatment. This is only one deficiency that results from the definition of saturation.

A different problem arises if many ($n$) users share a single link. If the link is the bottleneck link for each, then (at $x = 1$) they each choose a message rate of $c(l)/(n + 1)$. As $n \to \infty$, the total rate approaches $c(l)$; thus, there is excessive throughput and disastrous delay. (This particular problem is dealt with both here and in Section XII.)

A final problem with the definition of saturation is that it may not be desirable to have a network-wide value of $x$ as defined.
Recall that one reason to choose $\gamma = x \min_l r(l)$ was to protect the network against transients in a user's message rate which were as large as a factor of $1/x$. Clearly, the variability in rates of different users is different. A user that has large variability would need a larger relative amount of residual capacity on its links.

This section solves the above problems by reformulating the definition of saturation. With user $p$, one associates a number $x_p$, the throughput priority of user $p$. User $p$'s throughput priority expresses the desired message rate of user $p$ as compared to the rates of interfering users. In particular, user $p$ is saturated at $l$ if $\gamma_p = x_p r(l)$. If users $p$ and $q$ are both saturated at $l$, the ratio of their throughputs is $x_p/x_q$. This generalization clearly treats users differently. Optimum performance is again equated with rate assignments that saturate all users.

In practice, some higher level protocol would decide what the relative values of $x_p$ should be. If $x_p$ were chosen as a function of the number of interfering users, some network manager could prevent the excessive use of an $n$-user bottleneck. Similarly, a network manager could decide how to appropriately allocate relative priorities to competing users. In some network environments, each user might make a local decision, choosing $x_p$ based on the expected variability of its message rate to protect the network. A network manager is not needed if some convention is adopted by network users for determination of their throughput priorities.

We proceed to explain how the variable throughput priority case may be effectively reduced to the equal throughput priority case. In particular, the following questions are addressed:
• Is there a static algorithm to saturate all users?
• Is there a unique way to saturate all users?
• Is there an appropriate adaptive, distributed algorithm such as the one described in Section IV?
• What delay/throughput tradeoff is implied by the new definition of saturation?
• What fairness properties are implied?

First, consider the case that $x_p$ is an integer for all $p$. We assume that each link knows the value of $x_p$ for each user of the link.
In this case, the variable $x_p$ case is reduced to the $x = 1$ case as follows. A user with priority $x_p$ is treated as $x_p$ users each with $x = 1$ and identical paths. Initially, if there are $j$ users of $l$ with priorities $x_1, \ldots, x_j$, then

$$\gamma(l, 1) = \frac{c(l)}{1 + S} \qquad (7)$$

where $S = \sum_{i=1}^{j} x_i$. If $\gamma(l, 1)$ is the minimal allocation for user $k$ (with priority $x_k$), then user $k$ chooses $\gamma = x_k \gamma(l, 1)$. In subsequent steps, $\gamma_{sat}$ is measured as before, and

$$\gamma(l, i) = \frac{c(l) - \gamma_{sat}(l, i)}{1 + S(l, i)} \qquad (8)$$

where $S(l, i)$ is the sum of the $x_p$'s for users of link $l$ that are not saturated before iteration $i$. It can be shown that with this modified algorithm, the value of $\gamma(l, i)$ for every $l$ and every $i$ is identical to the case where each user with priority $x_p$ is replaced by $x_p$ users with priority 1. Also, the message rate $\gamma$ of a user with
priority $x_p$ after iteration $i$ equals the sum of the rates of the $x_p$ users with $x = 1$. These facts are proved trivially by induction on $i$. From this it follows that there is a static algorithm to saturate all users, and that saturation is unique. Actually, using (7) and (8) uniquely saturates all users even if $x_p$ is not an integer. The proof of this follows in a manner similar to the proof of Section VII. Continuing with the aforementioned questions, the appropriate adaptive algorithm remains roughly the same as in Section IV; each user saturates itself based on current conditions (perhaps changing message rate slowly for stability reasons). The delay-throughput tradeoff defined for user $p$ is
$$\gamma_p = \min_{l \in l(p)} \frac{x_p}{d(l)}. \qquad (9)$$
The relevant fairness statements are as follows.
• Each user's throughput is only constrained by its bottleneck link.
• At its bottleneck link a user gets at least "its share of capacity" based on its throughput priority. That is, the rate $\gamma_p$ of user $p$ satisfies $\gamma_p \ge (x_p/x_q)\gamma_q$ if $q$ shares $p$'s bottleneck link.
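The allocations (7) and (8) can be illustrated with a short static sketch. This is not the paper's algorithm verbatim: as a simplifying assumption, each round saturates only the users on the links whose allocation is globally smallest, and the names `saturate`, `capacity`, `paths`, and `priority` are hypothetical. Every path is assumed non-empty and every priority positive.

```python
def saturate(capacity, paths, priority):
    """Return a rate per user from allocations of the form (7) and (8).

    capacity: dict link -> c(l)
    paths:    dict user -> list of links on the user's fixed path (non-empty)
    priority: dict user -> throughput priority x_p (need not be an integer)
    """
    rate = {}
    unsaturated = set(paths)
    while unsaturated:
        alloc = {}
        for link, c in capacity.items():
            sharers = [u for u in paths if link in paths[u]]
            # S(l, i): sum of priorities of users not yet saturated on this link
            s = sum(priority[u] for u in sharers if u in unsaturated)
            if s == 0:
                continue  # no unsaturated user left on this link
            gamma_sat = sum(rate[u] for u in sharers if u not in unsaturated)
            alloc[link] = (c - gamma_sat) / (1 + s)
        # Simplification: only the links with the smallest allocation act as
        # bottlenecks in this round.
        g_min = min(alloc.values())
        bottlenecks = {l for l, g in alloc.items() if g == g_min}
        newly = {u for u in unsaturated if bottlenecks.intersection(paths[u])}
        for u in newly:
            rate[u] = priority[u] * g_min
        unsaturated -= newly
    return rate


# Two users with priorities 1 and 2 sharing one link of capacity 12.
print(saturate({"A": 12.0}, {"p": ["A"], "q": ["A"]}, {"p": 1, "q": 2}))
```

In the usage line, the two rates come out in the ratio $x_p/x_q = 1/2$, and the residual capacity of the link equals the per-unit-priority allocation. With $n$ equal-priority users on one link the same function gives each user $c(l)/(n + 1)$, which is the situation that motivates choosing $x_p$ as a function of $n$.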
XI. LOW THROUGHPUT USERS

The saturation algorithm provides each user with an "optimum" throughput, but requires one special assumption to do so. It is assumed that each user has a throughput equal to that assigned in the algorithm. In practice, however, a user may not have enough data to send at the high rate. In this section we briefly discuss the required modifications to handle this case. Assume that $\gamma$ is the maximum possible rate for a user based on incoming data rate considerations. Then the user "pretends" that on its path is a "virtual link" of capacity $\gamma(1 + x)/x$, which is shared with no one. If all other links have saturation allocation larger than $\gamma$, then the rate chosen on the basis of the virtual link is $\gamma$. For example, if $x = 1$, the virtual link has capacity $2\gamma$ and the user is saturated if its rate is $\gamma$. Thus, by slightly modifying the network, the inherent throughput constraints of each user are taken into account, without changing the algorithms and their properties.

XII. RESTRICTING THE PERCENTAGE UTILIZATION OF A LINK

Assume that it was desired that no link exceed a fraction $y$ of its capacity. This might be used to prevent $\gamma(l) \to c(l)$ as $n \to \infty$ in the case of $n$ users sharing a bottleneck link. Section XI prevents $\gamma(l) \to c(l)$ by suggesting that the values $x_p$ should be chosen as a function of $n$. In this section a more direct approach is used. This approach leads to a derivation of the "optimum Cyclades performance" as a limiting case of saturation. Define the effective capacity of $l$, $e(l) = y\,c(l)$. This is the largest amount of capacity of $l$ that should be used. If $e(l)$ is used instead of $c(l)$ in the algorithms to saturate all users, then the capacity of any link utilized is restricted to be at most $e(l)$.

This is not quite the Cyclades notion of optimality: they require that $e(l)$ not be exceeded, but place no other restrictions on the message rates (such as $\gamma \le r(l)$). To effectively remove the restriction $\gamma \le r(l)$, let $x \to \infty$; $\gamma \le x\,r(l)$ is then trivially accomplished. To review, a utilization of $y$ at bottleneck links is accomplished by using $e(l)$ instead of $c(l)$, letting $x \to \infty$, and saturating all users. This accomplishes the desired utilization of bottlenecks, and also provides fairness not usually provided by just restricting link utilization. In this case, letting $x \to \infty$ does not strongly degrade delay at the cost of throughput, since the rates are all chosen based on $e(l)$, not $c(l)$.

XIII. CONCLUSIONS

We have presented a "fair" motivatable network performance criterion. Two algorithms have been presented to optimize performance, one of which is guaranteed to find the unique optimal throughput assignments in a static environment.

APPENDIX
PROOF OF LEMMA 1 (BY INDUCTION ON $i$)

$i = 1$: 1) $\gamma(l, 1) = x \cdot c(l)/(1 + jx)$ where $j$ is the number of users that share $l$. $\gamma(l, 2) = x(c(l) - \gamma_{sat}(l, 2))/(1 + rx)$ where $r$ is the number of users of $l$ not saturated at the first iteration. By the way that throughputs are assigned, $\gamma_{sat}(l, 2) \le (j - r)\,\gamma(l, 1) = (j - r)(x \cdot c(l))/(1 + jx)$. Thus,

$$\gamma(l, 2) = \frac{x c(l) - x \gamma_{sat}(l, 2)}{1 + rx} \ge \frac{x c(l) - \dfrac{x (j - r)(x \cdot c(l))}{1 + jx}}{1 + rx} = \frac{x c(l)\left(1 - \dfrac{x(j - r)}{1 + jx}\right)}{1 + rx} = \frac{x c(l)}{1 + rx}\left(\frac{1 + jx - jx + rx}{1 + jx}\right) = \frac{x c(l)}{1 + jx} = \gamma(l, 1). \qquad (A1)$$
2) Recall that the saturation allocation is designed to guarantee saturation if all unsaturated users of a link choose the saturation allocation and all saturated users keep the same throughput. Before the first iteration, there are no saturated users, and each user chooses at most the saturation allocation. From this, 2) follows immediately.

3) Similar to the inductive step (below).

Inductive Step: Assume 1), 2), and 3) for $k < i$ and prove 1) and 2) for $k = i$. Then, using 1) and 2) for $k = i$ and 3) for $k < i$, prove 3) for $k = i$ as follows.
1) $\gamma(l, i + 1) = x(c(l) - \gamma_{sat}(l, i + 1))/(1 + rx)$ and $\gamma(l, i) = x(c(l) - \gamma_{sat}(l, i))/(1 + sx)$, where there are $r$ nonsaturated users of $l$ before the $(i + 1)$st iteration and $s$ before the $i$th. By induction on 3), any user saturated before the $i$th iteration remains saturated before the $(i + 1)$st (i.e., after the $i$th) with the same throughput. Thus, $\gamma_{sat}(l, i + 1) = \gamma_{sat}(l, i) + \gamma_{new}$ where $\gamma_{new}$ is the sum of the throughputs of the $s - r$ users that become saturated at the $i$th iteration. Note that $\gamma_{new} \le (s - r)\,\gamma(l, i)$ since each newly saturated user has message rate at most $\gamma(l, i)$. Thus,

$$\gamma(l, i + 1) = \frac{x(c(l) - \gamma_{sat}(l, i + 1))}{1 + rx} = \frac{x(c(l) - \gamma_{sat}(l, i))}{1 + rx} - \frac{x \gamma_{new}}{1 + rx} \ge \gamma(l, i)\left(\frac{sx + 1}{rx + 1}\right) - \frac{x(s - r)\,\gamma(l, i)}{1 + rx} = \gamma(l, i)\,\frac{sx + 1 - sx + rx}{1 + rx} = \gamma(l, i). \qquad (A2)$$

2) By induction on 3), all users saturated before the $i$th iteration choose the same throughput at the $i$th iteration. Since each unsaturated user chooses, at most, the saturation allocation at $l$, by the definition of $\gamma(l, i)$, a user becomes saturated at $l$ at iteration $i$ only if all other unsaturated users choose $\gamma(l, i)$ and become saturated.

3) Fix a user that is saturated after the $i$th iteration with throughput $\gamma$. We must show that at the $(i + 1)$st iteration, it chooses the same throughput and remains saturated. Consider a link $l$ at which the user is saturated after the $i$th iteration. Using 2) for the iteration number $k$ at which the user was first saturated at $l$ ($k \le i$), all users that share $l$ are either saturated before the $k$th iteration or become saturated at the $k$th iteration. By induction on 3), it follows that all are saturated after the $k$th iteration. Also, the ones that were previously saturated use the same throughput as before the $k$th iteration. This continues through iteration $i$. Since the user is saturated at $l$, its throughput $\gamma$ satisfies $\gamma = x(r(l))$. Also, since all users of $l$ are saturated, $\gamma_{sat}(l, i + 1) = \gamma(l)$ and $\gamma(l, i + 1) = x(c(l) - \gamma(l)) = \gamma$. Thus, due to the saturation allocation at $l$, the user chooses a throughput of at most $\gamma$ at iteration $i + 1$. Since for every link $l'$ in the user's path $\gamma(l', i + 1) \ge \gamma(l', i)$ [by 1)], the user chooses exactly $\gamma$.

The above argument may be repeated for each user saturated after the $i$th iteration. Returning to the user fixed above, it is apparent that the user is saturated at $l$ at the $(i + 1)$st iteration, since all users that share $l$ do not change their throughputs. Thus, $\gamma(l)$ is unchanged and $\gamma = x(r(l))$ still holds. To prove that the user is still saturated after the $(i + 1)$st iteration, it suffices to show that it is not overloaded on any other link on its path.

To prove that the user is not overloaded at a link $l'$, it suffices to show $\gamma \le x(c(l') - \gamma(l'))$ where $\gamma(l')$ is the sum of throughputs of users of $l'$ after the $(i + 1)$st iteration. Consider the iteration (iteration $k$) at which the user became saturated (with rate $\gamma$). If $l'$ is on its path, $\gamma(l', k) \ge \gamma$ by the way $\gamma$ is chosen. By induction on 1), $\gamma(l', i + 1) \ge \gamma$. Recall that $\gamma(l', i + 1) = x(c(l') - \gamma_{sat}(l', i + 1))/(1 + jx)$ (if $j$ users are not saturated before the $(i + 1)$st iteration). Note, the value of $\gamma(l')$ after iteration $i + 1$ is given by

$$\gamma(l') \le \gamma_{sat}(l', i + 1) + j\,\gamma(l', i + 1).$$

Thus,

$$x(c(l') - \gamma(l')) \ge x(c(l') - \gamma_{sat}(l', i + 1) - j\,\gamma(l', i + 1)) = x\left(c(l') - \gamma_{sat}(l', i + 1) - \frac{jx}{1 + jx}\,(c(l') - \gamma_{sat}(l', i + 1))\right) = x(c(l') - \gamma_{sat}(l', i + 1))\left(\frac{1}{1 + jx}\right). \qquad (A3)$$

By the expression recalled above, the right-hand side of (A3) equals $\gamma(l', i + 1)$, which is at least $\gamma$, as required.

ACKNOWLEDGMENT

The author acknowledges helpful conversations with K. Bharath-Kumar, F. H. Moss, and M. Schwartz.

REFERENCES

[1] J. M. McQuillan, "Adaptive routing algorithms for distributed computer networks," Bolt Beranek and Newman Rep. 2831, NTIS AD 781467, May 1974.
[2] L. Tymes, "TYMNET-A terminal-oriented communication network," in AFIPS Conf. Proc., Spring Joint Comput. Conf., vol. 38, 1971, pp. 211-216.
[3] M. Schwartz, Computer Communication Network Design and Analysis. Englewood Cliffs, NJ: Prentice-Hall.
[4] J. P. Gray and T. B. McNeill, "SNA multiple-system networking," IBM Syst. J., vol. 18, no. 2, 1979.
[5] A. Danet, R. Despres, A. Lakest, G. Pichon, and S. Ritzentheler, "The French public packet switching service: The Transpac network," in Proc. 3rd Int. Conf. Comput. Commun., Toronto, Ont., Canada, Aug. 1976, pp. 251-260.
[6] M. Gerla and L. Kleinrock, "Flow control: A comparative survey," IEEE Trans. Commun., vol. COM-28, pp. 553-575, Apr. 1980.
[7] V. Ahuja, "Routing and flow control in systems network architecture," IBM Syst. J., vol. 18, no. 2, 1979.
[8] D. W. Davies, "The control of congestion in packet switching networks," IEEE Trans. Commun., vol. COM-20, June 1972.
[9] E. Raubold and J. Haenle, "A method of deadlock-free resource allocation and flow control in packet networks," in Proc. 3rd Int. Conf. Comput. Commun., Toronto, Ont., Canada, Aug. 1976.
[10] J. C. Majithia et al., "Experiments in congestion control techniques," in Proc. Int. Symp. Flow Contr. Comput. Networks, Versailles, France, Feb. 1979.
[11] Proc. Int. Symp. Flow Contr. Comput. Networks, Versailles, France, Feb. 1979.
[12] K. Bharath-Kumar, "Optimum end-to-end flow control in networks," in Proc. Int. Conf. Commun., Seattle, WA, June 1980.
[13] J. M. Jaffe, "Flow control power is non-decentralizable," IBM Res. Rep. RC8343, July 1980; also to be published, IEEE Trans. Commun., 1981.
[14] K. Bharath-Kumar and J. M. Jaffe, "A new approach to performance oriented flow control," IBM Res. Rep. RC8307, May 1980; also, IEEE Trans. Commun., vol. COM-29, pp. 427-435, Apr. 1981.
[15] L. Kleinrock, Queueing Systems, vol. 2. New York: Wiley, 1976.
[16] D. C. Little, "A proof of the queueing formula: L = λW," Oper. Res., vol. 9, pp. 383-387, 1961.
[17] J. M. Jaffe, "A decentralized, 'optimal,' multiple user flow control algorithm," in Proc. 5th Int. Conf. Comput. Commun., Oct. 1980.
Jeffrey M. Jaffe (M'80) received the B.S. degree in mathematics and the M.S. and Ph.D. degrees in computer science from the Massachusetts Institute of Technology, Cambridge, in 1976, 1977, and 1979, respectively. He is currently employed by IBM Research, Yorktown Heights, NY, where he is engaged in research on network algorithms and combinatorial optimization. Dr. Jaffe is a member of ACM and Phi Beta Kappa, and was a National Science Foundation Fellow while a graduate student.
Routing and Flow Control in TYMNET

LA ROY W. TYMES, MEMBER, IEEE

(Invited Paper)

Abstract-TYMNET uses two mechanisms for moving data: a tree structure for supervisory control of the original network and a virtual circuit approach for everything else. Each mechanism is described. The routing and flow control is contrasted with ideal routing and flow control, and also with conventional packet-switched networks. One of the mechanisms described, the virtual circuit as implemented in TYMNET, is compared to the ideal. This mechanism avoids several inefficiencies found in other packet networks. The tree structure is shown to have several problems which increase roughly with the square of the size of the network.

Manuscript received April 23, 1980; revised August 15, 1980. The author is with the Data Network Division, Tymshare, Inc., Cupertino, CA 95014.

INTRODUCTION

ROUTING and flow control are the two most important factors in determining the performance of a network. They determine the response time seen by the user, bandwidth available to the user, and, in part, the efficiency of node and link utilization. TYMNET has several years' experience with two different mechanisms of moving data through the network. The routing of one of them, the virtual circuit, is described in detail with emphasis on performance considerations. Flow control is then discussed with emphasis on heavy load conditions.

TYMNET is a commercial value added network (VAN) which has been in operation since 1971 [1]. The original network, now called TYMNET I, was designed to interface low-speed (10-30 character/s) terminals to a few (less than 30) time-sharing computers. The data rate was expected to be low, the size of the network small (less than 100 nodes), and the log-on rate low (less than 10 new users/min). High efficiency of the 2400 bit/s lines interconnecting the nodes was required along with good response time for the user, who typically interacted with full duplex terminals on a character-by-character basis. Echo control with full duplex terminals was to be passed smoothly back and forth between host and network. Finally, memory in the nodes was expensive, and little money was available for development and deployment.

With these considerations, the nodes were made to be as small as possible, with very little buffering. All complexity that could be centralized, such as routing control, was put into a supervisor program which ran on a time-sharing system. A virtual circuit scheme was devised which allowed a smooth flow of data to the users with the small buffers and which allowed the multiplexing of data from several users into a single physical packet. This way a user could type one character at a time without generating a lot of overhead on the network lines. When a supervisor took control of the net, it generated a tree structure with itself at the root. Through this tree, it could read out the tables which defined the existing circuits and make new entries in them to define new circuits. This kept the software in the nodes very simple at the cost of complexity in the supervisor.

As the years passed, all the design considerations except the need for efficiency and interactive response were completely reversed. In particular, with many terminals requiring much higher bandwidth, with the number of nodes approaching 1000 worldwide, and the log-on rate measured in log-ons per second, many of the design decisions for TYMNET I became inappropriate. In TYMNET II, which is displacing TYMNET I in high-density areas and new installations, the features which worked well in TYMNET I have been enhanced and generalized. The supervisor control tree, however, did not scale up very well, and TYMNET II uses virtual circuits exclusively.

TYMNET II is a more recent technology than TYMNET I, not really a different network. In fact, both technologies can and do exist in the same network. TYMNET I was implemented in Varian 16-bit minicomputers. TYMNET II was implemented in the TYMNET Engine [2], a 32-bit byte-addressable machine developed by Tymshare, Inc. specifically for network applications. A basic difference between the two technologies is that in TYMNET I the supervisor maintains an image of the internal routing tables of all the nodes and explicitly reads and writes the tables in the nodes, whereas in TYMNET II the nodes maintain their own tables, and there is much less interaction between node and supervisor.
IDEAL ROUTING AND FLOW CONTROL

Before getting into the details of the machinery, we should be clear about what it is supposed to do. In routing, the first consideration is to find the "optimum" path, where the definition of the word "optimum" can be rather complex. It should also be fast, so that the impatient human user is not kept waiting. The ideal routing algorithm should use little CPU time and network bandwidth, since these are both scarce resources. It must base its decisions on the current state of the network, not on the state of the network a minute ago, since many things can change in a minute. Finally, it must not spend a lot of network bandwidth to keep its data up to date. The practice of passing tables from node to node becomes expensive as the network, and hence the tables, grows large.

The ideal flow control mechanism must assure a smooth flow of data to the low-speed user. In TYMNET, it must do this with small buffers in the nodes. If the user wishes to abort his output (a common thing for an interactive user to do), the data buffered in the network must disappear quickly. An additional consideration for a commercial network is that the network lines must be used efficiently. This means that all forms of overhead, such as end-to-end acknowledgment, deadlocks, discarding of packets, retransmission of data (other than for line errors), and so on, must be eliminated or greatly minimized, even if the user wishes to send just one byte at a time. Interactive users will want fast response, so queuing delays must be kept short. Finally, when bandwidth on a link is oversubscribed, it must be partitioned among the various users in some satisfactory manner, without causing lockups, deadlocks, or other network malfunctions.

Reprinted from IEEE Transactions on Communications, vol. COM-29, no. 4, April 1981.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. A simple TYMNET virtual circuit. (Figure not reproduced; labels: PORT 4, NODE A, NODE B, NODE C.)

THE TYMNET VIRTUAL CIRCUIT

The TYMNET virtual circuit has been documented elsewhere [3], [4]. For the purposes of this paper, let us define it as a full duplex data path between two ports in the network. Data are always treated as a stream of 8-bit bytes. The two ports are usually, but not always, on two different nodes, and the circuit must therefore be routed through the network. The routing is normally done only once, when the user first requests that the circuit be built. If the circuit is rebuilt later (because of a node failure on the original path, for instance), the rebuild procedure is the same except for the recovery of data that may have been lost when the first circuit failed.

Each link between two adjacent nodes is divided into channels, and a circuit passing over that link is assigned to a channel. The communication between these nodes is concerned with channels, not circuits. Only the network supervisor knows about circuits. Data from various channels may be combined, or multiplexed, into one physical packet to share the overhead of checksums and packet headers among several low-speed channels. A high-speed channel with much data to send may use the whole physical packet by itself.

An example of a simple circuit is given in Fig. 1. Port 4 on node A is connected through node B to port 7 on node C. A pair of buffers (represented by a circle), one buffer for each direction of data flow, is associated with port 4 in node A and another pair with port 7 in node C. Another pair is used in node B for traffic passing through it. Each buffer is elastic, like a balloon, so it takes little memory when it is empty. The data are stored in a threaded list of bufferlets of 16 bytes each (14 data bytes plus a 2-byte pointer to the next bufferlet). When more data are put into a buffer, additional bufferlets are taken from a common pool. When data are removed from a buffer, the empty bufferlets are returned to the common pool. An empty buffer has no bufferlets associated with it. Buffers in TYMNET are empty almost all the time.

Each node has a table for each of its links, associating channels on that link with internal buffer pairs. Entry number 5 in a table in node A refers to the buffer pair for port 4, so data from port 4 are sent out on channel 5 on the link between A and B. Entry 5 in node B for that link refers to the passthrough buffer pair. Entry 12 in another table in node B also refers to the same passthrough buffer pair. This second table is for the link from B to C, so data from port 4 of node A will use channel 12 when going from B to C. Entry 12 in node C refers to the buffer pair associated with port 7, which completes the circuit.

All routing is done by the supervisor. It begins when the user identifies himself with a name and password, and presents a request to build a virtual circuit (log-in to a host). The supervisor hashes the user name into the MUD (master user directory) to get the attributes needed for access control, accounting information, and so on. Then the supervisor assigns a "cost" to each link in the net. This cost reflects the desirability of including that link in the circuit. This cost is based on link bandwidth, link load factor, satellite versus land-based lines, terminal type, and other factors. For instance, if the user has a low-speed interactive terminal, a 9600 bit/s land link will be assigned a lower cost than a 56000 bit/s satellite because satellites add a delay which is undesirable to interactive users. If the circuit is to be used to pass files between computers, however, the satellite will be assigned the lower cost because bandwidth is more important than response time.

A definition of link load factor will depend on what type of circuit is being routed. For instance, if a user wishes to pass files between background processes on two computers, response time is not required. High bandwidth is not needed either if the user is in no hurry. The objective is to move the data at the minimum cost to the network, which means giving the user the bandwidth left over from other users. A file moving from one computer to another can pass at a high enough rate to saturate any link. Such a link is not overloaded, however, since there is no reason not to build more circuits over it.

A second user may wish to run a line printer. He cannot saturate a link by himself because after the line printer is going full speed, it can use no more bandwidth. If several such users share the same link, they may saturate it and compete with each other for bandwidth. If the printers can no longer run at full speed, then the link has reached "high speed overload," and it is not desirable to route more such users over this link. A longer route which avoids the congested link may be preferred, even though it increases cost to the network and increases response time. Printer users do not care about response time. It is all right to send interactive users over a link with high-speed overload because they require so little bandwidth that they will not interfere very much with the printers. The printers, on the other hand, will not interfere with the response time of the interactive user.

The most severe form of overload is "low-speed overload." The formal definition of low-speed overload is that the average delay for a particular channel on a link to be serviced exceeds t s several times in a 4-min period. When
this happens, low-speed interactive users begin to experience degradation in response times. A high cost is given to such a link to avoid building more circuits on it.

Legal considerations also affect "cost." For instance, there may be prior agreements that the traffic between two countries be divided among the interconnecting links in a particular way. These rules can be enforced by assigning unacceptable costs to the unallowed options.

This assigning of costs is not compute intensive. It is mostly a matter of indexing into the correct table. Once the correct table has been selected, there remains the problem of finding the path of lowest cost through the net. This is complicated by the fact that some users may have more than one possible target. For instance, there may be several gateways to another network, and the cost of the gateways and the other network may have to be considered. The one path which produces the "best" load balancing is preferred.

The problem of finding the best path through a network has been investigated by many people [5]-[7]. The algorithm that TYMNET has been using is as follows.

T1: Initialize the cost of reaching the source node from the source node to 0. Initialize the cost of reaching all other nodes from the source node to an unacceptably large, finite number.

T2: Initialize a list of nodes to contain only the source node.

T3: If the list is empty, done. Otherwise, remove the next node from the list. For each neighbor of this node, consider the cost of going to that neighbor (the cost associated with this node plus the cost of the link to the neighbor). If this cost is less than the cost currently associated with that neighbor, then the newly computed cost becomes the cost associated with that neighbor and a path pointer for that neighbor is set to point to this node. If that neighbor is not on the list and the new cost is less than the cost associated with the target node, add the neighbor to the list. Repeat T3.
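Steps T1 to T3 can be written down compactly. The following is a minimal sketch, not TYMNET's code: the function name `min_cost_path`, the first-in first-out list discipline, and the sample link costs are assumptions. The topology of the paper's figure is not recoverable from the text, so the sample graph was simply chosen to be consistent with the costs the text reports for its worked example (A to D at cost 3 via C).

```python
from collections import deque

UNREACHABLE = 99  # the "unacceptably large, finite number" of step T1


def min_cost_path(links, source, target):
    """links: dict node -> dict neighbor -> link cost."""
    cost = {node: UNREACHABLE for node in links}
    cost[source] = 0                       # T1
    back = {}                              # path pointers
    pending = deque([source])              # T2
    while pending:                         # T3: stop when the list is empty
        node = pending.popleft()
        for neighbor, link_cost in links[node].items():
            new_cost = cost[node] + link_cost
            if new_cost < cost[neighbor]:
                cost[neighbor] = new_cost
                back[neighbor] = node
                # Only keep exploring from nodes that could still beat the
                # best known cost of reaching the target.
                if neighbor not in pending and new_cost < cost[target]:
                    pending.append(neighbor)
    if target != source and target not in back:
        return None, cost                  # target was never reached
    path = [target]
    while path[-1] != source:
        path.append(back[path[-1]])
    return path[::-1], cost


# Hypothetical link costs, chosen only to match the worked example's numbers.
links = {
    "A": {"B": 2, "C": 1},
    "B": {"A": 2, "D": 6},
    "C": {"A": 1, "D": 2, "E": 7},
    "D": {"B": 6, "C": 2},
    "E": {"C": 7, "F": 1},
    "F": {"E": 1, "G": 1},
    "G": {"F": 1},
}
path, cost = min_cost_path(links, "A", "D")
print(path, cost["D"])
```

Because a neighbor is added to the list only when its new cost is below the target's current cost, nodes F and G are never examined in this sample graph, which is the pruning effect the algorithm relies on.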
When the algorithm completes, the path of minimum cost is defined by the backward pointers. Furthermore, the mini.. mum cost is precisely known. If the cost is high, the supervisor may elect to reject the user rather than tax the network to provide poor service. A trivial example is given in Fig. 2. The problem is to find the path of least cost from node A to node D. The cost associated with node A is set to 0, and for all the other nodes it is set to 99. Node A has two neighbors, Band C, and the cost of each of them can be improved by going to them from node A, so both are added to the list. When node B is considered, the cost of neighbor D is reduced by reaching it from B, but it is not added to the list because the new cost is not less than the cost of reaching the target node (since D is the target node). Finally, when C is processed, the cost for node D is further reduced, redefining the best path from A to D, and the cost for node E is also reduced. Node E is not added to the list because its cost is not less than the cost already associated with the target node D. Nodes F and G were never considered. The best path is seen to be from A to eto D, and has a cost of3. There are other algorithms which require less CPU time
NODE
BEING
NODE
COSTS
PROCESSED It BCD E F G LIST 9. 99 9. 9_ 99 89 A A B 0 2 1 99 99 99 99 B,C
o
C DONE
Fig. 2.
o
o
2
2
1 1
8 98 89 8' C 8 89 98 EMPTY
3
Routing example. Find the best path from A to D.
than this one, but of those known to the author) all are more complex and require more memory. This algorithm has been satisfactory for TYMNET so far. More important than the algorithm itself is the mechanism for correctly determining link costs. If the link costs are incorrect or inappropriate, the resulting circuit path will be unsatisfactory. Any time the network capacilities change, e.g., link failure or overload, the supervisor is notified immediately and can use this information for the next circuit to be built. In TYMNET II, the next step is to send a needle to the originating port. This needle contains the routing information and threads its way through the net, building the circuit as it goes, with the user data following behind it. In TYMNET I, the entries in the routing tables are explicitly made by the supervisor for every node. The TYMNET II needle is a list of node numbers, together with accounting information and some flags indicating circuit class, encoded as a string of 8-bit bytes. When data enter a TYMNET II node from link on an unassigned channel, it is checked to see if it is a needle. If it is not, it is discarded. Otherwise, the channel is assigned and the needle checked to see what to do next. If the circuit terminates in this node, it is attached to a port. If the next node is a neighbor of this one, a channel on the link to that node is assigned and the needle, together with any other data behind it, is sent on its way. If the next neighbor is unknow~ (because of recent link failure or some other error), the data are destroyed and the circuit is zapped back to its origin. Note that a needle followed by some data followed by a nongobbling zapper is similar to a datagram, one which contains explicit routing information. Thus, TYMNET II has a type of datagram capability, although it is not used.
VIRTUAL CIRCUIT FLOW CONTROL

The following analogy will help make clear the dynamics of TYMNET virtual circuit flow control. A building with many water faucets in it is supplied by one water pipe coming from a water main. If a faucet is turned on, water immediately flows out of it and water begins flowing in from the main. The main can supply water at a much faster rate, but the rate of flow is restricted by the faucet. When the faucet is turned off, the resulting backpressure stops the flow from the main instantly. If enough faucets are turned on simultaneously, the capacity of the pipe from the main may be oversubscribed. When this happens, faucets which are turned on only a little still get all the water they want, but faucets turned on all the way will not flow at their maximum rate. If any faucet is turned off, the water it was consuming is now available to the remaining faucets. Note that there is no machinery in the water pipe from the main to allocate the water. It is just a pipe and does not know or care where the water goes. Also note that when the capacity is oversubscribed, all faucets that want water still get some. Some faucets get less than they want, but they do not stop. Finally, note that no water is wasted. It does not have to be "retransmitted" to compensate for water spilled.

For each channel on a link between two nodes, there is a quota of 8-bit bytes which may be transmitted. This quota is assigned when the virtual circuit is built, and varies with the expected peak data rate of the circuit; in other words, the throughput class of the circuit. Once a node has used up the quota for a given channel, it may not send any more on that channel until the quota is refreshed from the other node. In TYMNET II, the receiving node remembers approximately how many characters it has received on a channel, so it knows when the quota is running low. It also knows how much data it is buffering for this circuit, and whether its total buffer space is ample. This last consideration is almost always true in TYMNET because at any given instant, most circuits are idle, at least in one direction, and no memory is needed for their empty buffers. When the quota is low or exhausted and the receiving node does not have enough data buffered for the circuit to assure smooth data flow, it sends back permission to refresh the quota for this channel. This permission is highly encoded so as not to require much bandwidth. Note the passive nature of this backpressure scheme.
Doing nothing is the way to stop the influx of data, so if a node is overloaded, one effect of the overload is to reduce the load. Also note that this mechanism does not know or care about the destination of the data on each channel. Backpressure propagates from node to node back to the source and effectively shuts it off. It does not matter whether the cause of the backpressure is inability of the destination to consume the data as fast as the source can supply it or congestion within the net. Either way, the source is quickly slowed down or shut off. Finally, note that only circuits which are actively moving data need attention. At any instant, this is a small percentage of the total number of circuits.

A complication arises when an interactive user realizes that what is printing on his terminal is of no interest to him and he wishes to stop it. He can type an abort command and the host computer may stop outputting, but there are still many characters buffered in the network. To clean out the circuit, the host can send a character gobbler. The character gobbler ignores backpressure and goes through the circuit at full speed, gobbling all characters in front of it.

Another exception to normal backpressure convention is the circuit zapper. When a session is over and one or both ports disconnect, a circuit zapper is released which not only gobbles up the characters, but releases the buffer pairs and clears the table entries as well to free up these resources for new circuits. A zapper must be able to override backpressure because some circuits may stay backpressured for a long time. Suppose, for instance, that the terminal is an IBM 2741 with the keyboard unlocked. It cannot accept output in this state, so output data remain buffered and backpressured in the net, waiting for the user to lock his keyboard. The zapper will not wait, but will clear his circuit and disconnect him.

TYMNET I has only one circuit zapper, but TYMNET II has a family of them. Each is generated by a different terminating condition, so that the port at the other end knows why the circuit is being zapped. There are hard zappers and soft zappers. A hard zapper disconnects the port, but a soft zapper allows the port to request a circuit rebuild. A soft zapper might be generated by a link failure, for instance, and a rebuilt circuit will allow the session to continue. A short history of characters sent may be kept at each end, so that when the circuit is rebuilt, data lost when the old circuit failed can be retransmitted. This is transparent to the user, and there is normally no indication that an outage has happened unless it is a special host that wishes to monitor such things. No user data are lost or allowed to get out of order. Note that there is no overhead on the links to provide this feature except briefly when the circuit is rebuilt.

In theory, TYMNET II could use this rebuild mechanism to redistribute network load, but in practice, it is not needed. Circuits come and go often enough that the supervisor has no difficulty redistributing load by proper routing of the new circuits. When the numbers of users are very large, as they are in TYMNET, a statistical approach to load leveling works well. The load on any small area of the net changes very little from one minute to the next.
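The per-channel quota scheme described earlier in this section can be illustrated with a small model. It is a sketch under assumptions rather than the actual TYMNET logic: the class name `Channel`, the `low_water` threshold, and the rule that the quota counts as "running low" once half of it has been received are all hypothetical choices made for the example, and the check on total buffer space is left out.

```python
class Channel:
    """One direction of one channel on a link, modelled from both ends."""

    def __init__(self, quota, low_water):
        self.quota = quota                 # bytes per refresh (throughput class)
        self.low_water = low_water         # assumed threshold for "enough data buffered"
        self.sender_credit = quota         # bytes the sender may still transmit
        self.received_since_refresh = 0    # receiver's count of bytes since last refresh
        self.buffered = 0                  # bytes held at the receiver, not yet forwarded

    def send(self, nbytes):
        """Sender: transmit no more than the remaining quota; return bytes sent."""
        sent = min(nbytes, self.sender_credit)
        self.sender_credit -= sent
        self.received_since_refresh += sent
        self.buffered += sent
        return sent

    def drain(self, nbytes):
        """Receiver: pass buffered bytes on downstream; return bytes drained."""
        drained = min(nbytes, self.buffered)
        self.buffered -= drained
        return drained

    def maybe_refresh(self):
        """Receiver: grant a new quota only when the quota is running low and
        too little is buffered. Not calling this at all is the backpressure."""
        quota_low = self.received_since_refresh >= self.quota // 2   # assumed rule
        if quota_low and self.buffered < self.low_water:
            self.sender_credit = self.quota
            self.received_since_refresh = 0
            return True
        return False


ch = Channel(quota=32, low_water=8)
ch.send(40)                  # only 32 bytes go out; the sender is now blocked
print(ch.maybe_refresh())    # receiver still holds 32 bytes, so no refresh
ch.drain(30)                 # downstream consumes most of the buffer
print(ch.maybe_refresh())    # buffer is nearly empty, so the quota is refreshed
```

The point of the model is the passive behaviour: a congested receiver never has to send a "stop" message, it simply withholds the refresh until its buffer drains.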
PACKET TRANSMISSION AND BANDWIDTH ALLOCATION

To move data from one node to another, they must be assembled into packets. Since the greatest value of a value-added network is the sharing of line costs among many users, this must be done as efficiently as possible. The packet maker is a process which builds physical packets to send over a link. It will build a packet when there are data to send and the window of outstanding packets for the link is not full. The packet may contain data from several channels or from just one channel if only one channel has data to send or if a channel has so much data to send that it can fill a full-length packet. A channel may send data if it has data to send, its backpressure quota is not zero, and if it has not had a turn recently. Once it is serviced, even if for only one byte, it is not serviced again until all other channels have had a chance (unless it is flagged as a priority channel, in which case it may be given extra turns to give it more bandwidth). On any particular turn, a channel is limited in the amount of data it may send by the amount of data in its buffer, its backpressure quota, and the amount of room left in the packet when it is serviced. Thus, when bandwidth is oversubscribed, channels which only need a little get what they want with little or no queuing delays, while channels that want all they can get share the remaining bandwidth equally (except for
priority channels, which get extra bandwidth at the expense of nonpriority channels).

Packet making and teardown are link-related processes. Once a packet is made, it is handed over to the packet transmitting and receiving processes, which are line-related. A line is a physical connection between nodes, and is subject to noise and outages. There may be several lines on one link, for instance, three 9600-bit/s lines, all passing data in both directions simultaneously. There are several different packet transmitters and receivers because there are several different kinds of hardware for moving data between nodes. They differ in data rate, interface requirements, checksumming and formatting, and in methods for getting data in and out of memory. Window size, which is the number of packets one may send before getting an acknowledgment, varies from 4 to 128, depending on the maximum number of outstanding unacknowledged packets likely in the absence of errors. When the window size is exhausted, the oldest packet is retransmitted (on all lines, if there is more than one line on this link) until it is acknowledged. Packets may be sent and received in any order, but are always built and torn down in sequence.

It is instructive to contrast TYMNET with ARPANET in an overload situation. In Fig. 3, assume that node A has high bandwidth access to sources of data bound for ports on nodes B, D, and E. Also assume that the links shown are of equal bandwidth and that nodes B, D, and E have equal appetites for data. In a packet-switched network, data from the sources to node A will be in packets to be distributed equally among nodes B, D, and E. Two thirds of these packets must be sent to node C, and one third to node B. However, the bandwidths to nodes B and C are the same, so node A can send only half as much data to B as to C. When A fills up with packets, it must reject all incoming packets, not just those bound for nodes D and E. Thus, the link to node B runs at half speed because of congestion on the link to node C.

Now suppose that the link from C to D goes out. C will wait a few seconds to be sure the link is out before discarding packets for D. In the meantime, it fills up with packets for node D and stops receiving packets from node A. Node A is already full of packets for nodes D and E, so it st
Fig. 3. Local section of loaded network.
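The "half speed" figure in the packet-switched case follows from simple proportions, which the following sketch works through. It rests on a simplifying assumption that is not spelled out in the text: node A accepts incoming traffic in fixed proportions per outgoing link, so the busiest link saturates and every other link is scaled down by the same factor. The function name `packet_switched_link_loads` is hypothetical.

```python
from fractions import Fraction


def packet_switched_link_loads(link_bandwidth, shares):
    """Return the load carried on each outgoing link of node A.

    shares maps a link name to the fraction of A's incoming traffic that
    must leave on that link. The link with the largest share runs at full
    bandwidth; the others carry proportionally less.
    """
    busiest = max(shares.values())
    return {link: link_bandwidth * share / busiest
            for link, share in shares.items()}


# Traffic for B leaves on the link to B; traffic for D and E leaves via C.
shares = {"A-B": Fraction(1, 3), "A-C": Fraction(2, 3)}
loads = packet_switched_link_loads(Fraction(1), shares)
print(loads["A-B"], loads["A-C"])
```

With one third of the traffic bound for B and two thirds routed through C, the link to C is full while the link to B carries half of its bandwidth, matching the ratio stated in the text.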
In TYMNET, since the flow control applies to each direction of each channel of each link separately, the traffic bound for nodes D and E would be backpressured independently from the traffic bound for node B. The link from A to B would therefore run at full speed. When the link from C to D went out, the traffic for D would backpressure to the sources for D, and the bandwidth on the link from A to C would be completely available for traffic bound for E. When C realized that the link to D was down, the circuits for D would be zapped back to the sources for D. These sources would then either give up or request a rebuild from the supervisor. The supervisor would reject the rebuild requests on the grounds that there is no longer a path to node D. At no point would any bandwidth be compromised. No deadlocks would occur. In fact, nothing like a deadlock has ever been observed in TYMNET circuits. Situations like this one occur several times a day in TYMNET, and the only harm done is that, when a link is oversubscribed, some users get less bandwidth than they would like.

After many years' experience with this method of routing and flow control, the only complaint we have with it is the amount of CPU time required to assemble and disassemble the physical packets. Packet-switched networks do not have this overhead since the packets merely pass through the node without being torn apart. We have implemented the more compute-bound portions of these processes in microcode in our TYMNET Engine [2] and can sustain throughputs in excess of 25 000 bytes/s (25 000 in and 25 000 out), which has been adequate so far.

A proposed enhancement, so far not implemented, is to allow a circuit to enter a "blast" mode in which it only uses full-sized physical packets. These packets would be so tagged and buffered separately, to avoid the need to tear them down and reassemble them. Through any one node, only a few circuits could be in blast mode at a time because of limited buffer space, and a circuit would be in blast mode only while moving data at a high rate. Regulation of the buffer space would be handled by the standard flow control procedure. The primary application for this special mode would be file transfers between computers over very high bandwidth communication lines. In this mode, the throughput of an Engine might be about 200 000 bytes/s. The exact limit would depend on the method of getting the data in and out of memory.
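The packet maker's turn-taking rule described earlier in this section (each channel is limited by its buffer, its backpressure quota, and the room left in the packet, and is not serviced again until the others have had a chance) can be sketched as follows. This is an illustrative model only: the function name `build_packet` and the dictionary layout are hypothetical, priority channels are left out, and per-channel header overhead inside the packet is ignored.

```python
from collections import deque


def build_packet(channels, order, packet_size):
    """Fill one physical packet from several channels in round-robin order.

    channels: dict mapping a channel name to {"buffer": bytes waiting,
              "quota": bytes the backpressure quota still allows}
    order:    deque of channel names; the front has waited longest
    Returns the list of (name, nbytes) segments placed in the packet.
    """
    segments = []
    room = packet_size
    for _ in range(len(order)):
        if room == 0:
            break                          # packet full; remaining channels keep their place
        name = order[0]
        ch = channels[name]
        take = min(ch["buffer"], ch["quota"], room)
        if take > 0:
            ch["buffer"] -= take
            ch["quota"] -= take
            room -= take
            segments.append((name, take))
        order.rotate(-1)                   # this channel has had its turn
    return segments


channels = {
    "terminal": {"buffer": 5, "quota": 100},     # needs only a little
    "bulk": {"buffer": 1000, "quota": 100},      # wants all it can get
    "blocked": {"buffer": 3, "quota": 0},        # backpressured
}
order = deque(["terminal", "bulk", "blocked"])
print(build_packet(channels, order, packet_size=60))
```

In this example the light channel gets everything it asked for, the heavy channel takes whatever room is left, and the backpressured channel sends nothing, which is the sharing behaviour the text describes for an oversubscribed link.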
FLOW CONTROL THROUGH GATEWAYS

TYMNET has many gateways, some to private or experimental networks using TYMNET technology, and some to networks quite alien to TYMNET. Fig. 4 illustrates the case where both networks use TYMNET technology. The node in the center is called "schizoid" and has two identities, one for each network. Within each network, each node number must be unique. The schizoid node has two numbers. In this case, it is known as node 12 to the supervisor of network A and node 2073 to the supervisor of network B. Each supervisor claims the node as its own and sees the other network as a host computer, which can originate and terminate circuits. It is possible for one supervisor to see the schizoid as a TYMNET I node, while the other sees it as a TYMNET II node. Each supervisor is responsible for circuit routing and resource allocation within its own network.

Fig. 5 illustrates a TYMNET gateway to a different type of network. The actual interface is commonly an X.75 format on a high-speed synchronous line. Again, the TYMNET supervisor sees the other network as a host. The structure of any particular gateway is dictated by the sophistication of the other network and the availability of manpower to design the interface. In theory, it is certainly possible to build a schizoid node between TYMNET and any other network. This would be the most efficient interconnection, and would make the flow control work as well as the other network would allow, but so far the X.75 approach has been satisfactory. It has the advantage of keeping the two networks truly separated from each other, both organizationally and technically.
TREE STRUCTURE FOR TYMNET I CONTROL

In the original TYMNET, the nodes were very small and the software primitive. They had no ability to process circuit-building needles. Instead, the supervisor maintained their internal tables directly. When the supervisor took control of the network, it first took control of the node nearest to itself, then the neighbors of that node, then the neighbors of those nodes, and so on. When a node was taken over on a link, that link became its upstream direction. A special channel on each link was dedicated to supervisory traffic. All such traffic consisted of 48-bit messages, either to the supervisor (upstream) or to the node (downstream). In this tree structure, upstream was always well defined. Any node wishing to send something to the supervisor sent it in the upstream direction, and it would proceed to the root of the tree. Downstream, however, had many branches. To get a message to a particular node, the supervisor had to first set switches in the intervening nodes so that each node would know which of its links was the downstream link.

When the net was small, the scheme worked fairly well. It was simple to implement and required little software. However, as the network grew beyond 200 nodes, problems occurred. One obvious problem was that the routing was haphazard. It simply happened as a result of the order in which the nodes were taken over. If a major control link failed, the nodes which were downstream of the failure were retaken from another direction, perhaps a less optimum direction than before. The time required to get data to and from the nodes could be excessive.

The second problem was the flow control. At first there
Fig. 4. TYMNET schizoid gateway.
Fig. 5. TYMNET to alien network gateway.
was no flow control. The volume of data was so small that none was needed. As the nodes grew in size, so did their tables. At the same time, the numbers of nodes grew. Because TYMNET I required the supervisor to know the contents of the routing tables in all the nodes, one thing the supervisor had to do when it took over the network was read the tables. The data volume grew to hundreds of thousands of bytes. Obviously, something was needed to prevent nodes near the supervisor from running out of buffer space at takeover time. The ad hoc solution that was applied was for a node to backpressure the upstream supervisor channel when the node had too much supervisory data in it. The node could still send data to the supervisor, but the supervisor could no longer send data to it. Since most supervisory traffic is either generated by the supervisor or generated in response to the supervisor, this had the effect of allowing the data to drain out of the congested area of the net. Temporarily, the supervisor was cut off from that portion of the net. What is worse is that the backpressure could easily back up into the supervisor itself, thus cutting it off from the entire network.

Another problem was the misbehaving node. If some node had a software bug which caused it to generate a lot of supervisory traffic, it could flood the tree structure, turn on backpressure at some points, and, in general, make it difficult for the supervisor to retain control. In a tree structure, there is no good way to selectively apply backpressure to a node of the tree without affecting other nodes downstream
of it.

The most troublesome problem for TYMNET I was the "deadend circuit." To build a circuit, the supervisor had to send commands to each node involved with the circuit to set up the routing tables. If, because of loss of data in the supervisory tree, e.g., due to a link failure, some commands made it and some did not, the result was that only pieces of the circuit could get built. Not only would the user not be connected, but some unreachable (therefore unzappable) circuit fragments would remain to clutter the tables. Furthermore, the image in the supervisor of the routing tables would no longer agree with the tables in the nodes.

For these reasons, the tree structure was abandoned for TYMNET II. The supervisor builds a separate virtual circuit to each node at takeover time and controls the node through that circuit.

SUMMARY

Since TYMNET was brought up in 1971, it has never been down. Nodes, lines, and even the supervisor have all had their failures, but there was never a moment when some part of the network was not running and able to do useful work. The TYMNET virtual circuit, with its associated routing and flow control, has stood the test of time in a commercial environment. Its efficiency and robustness when faced with a wide variety of overload and failure conditions closely matches the ideal. It has scaled very well from a rather small network to a very large one. The size of TYMNET provides abundant feedback to any changes in the machinery, which allows a quick check of theory against reality. The evidence so far is that the virtual circuit approach is entirely satisfactory.

The tree structure, on the other hand, did not scale very well. While quite acceptable in the original network, the absence of controlled routing and sloppy flow control proved inadequate as the volume of supervisory data grew and the average distance between node and supervisor increased. For this reason (and several others, such as the difficulty of maintaining synchronism between the tables in the nodes and the images of those tables in the supervisor), this type of control structure is not used in TYMNET II.

The routing takes into account all factors known at the time the circuit is to be built and attempts to find the "optimum" path, both for the user and for the network. The network overhead required to keep the supervisor informed about link outages and other factors that might affect routing is small. The overhead required to build the circuit is small, especially in TYMNET II. Normally, the response time to a circuit build request is less than two seconds, which is acceptable to most users. Load balancing is accomplished through proper routing of new circuits, according to the expected demands of the new circuits and the current network load. The centralized routing control, where all information required for optimum global strategy is available to a single program on a single computer (not counting the inactive backup supervisors), has proven to be fast, efficient, and versatile.

How large TYMNET can be and still be controlled by a single centralized supervisor is difficult to answer. For TYMNET I the answer appears to have been from 500 to 600 nodes because of the volume of data transferred between nodes and supervisor, combined with the damage done when any of that data was lost. For TYMNET II, however, it is probably at least 5000 nodes. When that limit is reached, regionalized networks connected with gateways are the most obvious answer and would require no new development. A possibly more efficient approach would be a "super" supervisor managing global strategy while delegating local routing to "sub" supervisors.

The question may become academic before any limit is reached. Perhaps 5000 nodes is enough to cover the entire market. Also, value-added networks may soon be obsolete. Telephone networks of the future will be digital. Bandwidth between central offices will be very large and inexpensive, and the offices themselves will be automated. The cost of the telephone network will be largely in the local loops to the customer locations, since installation and maintenance of the local loops is labor-intensive and will not benefit from technology advances as much as the rest of the system. In such a situation, the "value" of the value-added network disappears, and straightforward digital circuit switching will be the most cost-effective way to connect the intelligent terminal with the sophisticated computer.

REFERENCES

[1] L. Tymes, "TYMNET-A terminal oriented communications network," in Proc. AFIPS Nat. Comput. Conf., Spring 1971, pp. 211-216.
[2] L. Tymes and J. Rinde, "The TYMNET II Engine," in Proc. ICCC 78, Int. Council Comput. Commun., Sept. 1978.
[3] J. Rinde, "TYMNET I: An alternative to packet switching," in Proc. 3rd Int. Conf. Comput. Commun., 1976.
[4] M. Schwartz, Computer Communication Network Design and Analysis, ch. 2. Englewood Cliffs, NJ: Prentice-Hall, 1977.
[5] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numer. Math., vol. 1, pp. 269-271, 1959.
[6] J. Gilsinn and C. Witzgall, "A performance comparison of labeling algorithms for calculating shortest path trees," Nat. Bureau of Standards, Washington, DC, Tech. Note 772, 1973.
[7] D. R. Shier, "On algorithms for finding the k shortest paths in a network," Networks, vol. 9, pp. 195-214, 1979.
La Roy W. Tymes (M'79) received the B.S. degree in mathematics in 1966 and the M.A. degree, also in mathematics, in 1968, both from California State College at Hayward. He joined Tymshare, Inc., in 1968 and has been involved in network development ever since. His duties have included design and implementation of the original TYMNET, supervision of network development, design and microcoding of the TYMNET Engine, and systems programming of Tymshare's timesharing systems. He has also been a consultant for numerically controlled machine tools and development of custom LSI. His research interests include subnanosecond LSI, supercomputers, and wide-band digital communication.
OSI Reference Model-The ISO Model of Architecture for Open Systems Interconnection

HUBERT ZIMMERMANN

(Invited Paper)

Abstract-Considering the urgency of the need for standards which would allow constitution of heterogeneous computer networks, ISO created a new subcommittee for "Open Systems Interconnection" (ISO/TC97/SC16) in 1977. The first priority of subcommittee 16 was to develop an architecture for open systems interconnection which could serve as a framework for the definition of standard protocols. As a result of 18 months of studies and discussions, SC16 adopted a layered architecture comprising seven layers (Physical, Data Link, Network, Transport, Session, Presentation, and Application). In July 1979 the specifications of this architecture, established by SC16, were passed under the name of "OSI Reference Model" to Technical Committee 97 "Data Processing" along with recommendations to start officially, on this basis, a set of protocols standardization projects to cover the most urgent needs. These recommendations were adopted by TC97 at the end of 1979 as the basis for the following development of standards for Open Systems Interconnection within ISO. The OSI Reference Model was also recognized by CCITT Rapporteur's Group on "Layered Model for Public Data Network Services."

This paper presents the model of architecture for Open Systems Interconnection developed by SC16. Some indications are also given on the initial set of protocols which will likely be developed in this OSI Reference Model.
I. INTRODUCTION
IN 1977, the International Organization for Standardization (ISO) recognized the special and urgent need for standards for heterogeneous informatic networks and decided to create a new subcommittee (SC16) for "Open Systems Interconnection."

The initial development of computer networks had been fostered by experimental networks such as ARPANET [1] or CYCLADES [2], immediately followed by computer manufacturers [3], [4]. While experimental networks were conceived as heterogeneous from the very beginning, each manufacturer developed his own set of conventions for interconnecting his own equipments, referring to these as his "network architecture."

The universal need for interconnecting systems from different manufacturers rapidly became apparent [5], leading ISO to decide for the creation of SC16 with the objective to come up with standards required for "Open Systems Interconnection." The term "open" was chosen to emphasize the fact that by conforming to those international standards, a system will be open to all other systems obeying the same standards throughout the world.

The first meeting of SC16 was held in March 1978, and initial discussions revealed [6] that a consensus could be reached rapidly on a layered architecture which would satisfy most requirements of Open Systems Interconnection with the capacity of being expanded later to meet new requirements. SC16 decided to give the highest priority to the development of a standard Model of Architecture which would constitute the framework for the development of standard protocols. After less than 18 months of discussions, this task was completed, and the ISO Model of Architecture called the Reference Model of Open Systems Interconnection [7] was transmitted by SC16 to its parent Technical Committee on "Data Processing" (TC97) along with recommendations to officially start a number of projects for developing on this basis an initial set of standard protocols for Open Systems Interconnection. These recommendations were adopted by TC97 at the end of 1979 as the basis for following development of standards for Open Systems Interconnection within ISO. The OSI Reference Model was also recognized by CCITT Rapporteur's Group on Public Data Network Services.

The present paper describes the OSI Architecture Model as it has been transmitted to TC97. Sections II-V introduce concepts of a layered architecture, along with the associated vocabulary defined by SC16. Specific use of those concepts in the OSI seven layers architecture are then presented in Section VI. Finally, some indications on the likely development of OSI standard protocols are given in Section VII.

Note on an "Interconnection Architecture"

The basic objective of SC16 is to standardize the rules of interaction between interconnected systems. Thus, only the external behavior of Open Systems must conform to OSI Architecture, while the internal organization and functioning of each individual Open System is out of the scope of OSI standards since these are not visible from other systems with which it is interconnected [8].

It should be noted that the same principle of restricted visibility is used in any manufacturer's network architecture in order to permit interconnection of systems with different structures within the same network.

These considerations led SC16 to prefer the term of "Open Systems Interconnection Architecture" (OSIA) to the term of "Open Systems Architecture" which had been used previously and was felt to be possibly misleading. However, for unclear reasons, SC16 finally selected the title "Reference Model of Open Systems Interconnection" to refer to this Interconnection Architecture.

Manuscript received August 5, 1979; revised January 16, 1980. The author is with IRIA/Laboria, Rocquencourt, France.
Reprinted from IEEE Transactions on Communications, vol. COM-28, no. 4, April 1980.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 1. Network layering.
Fig. 2. An example of OSI representation of layering.
II. GENERAL PRINCIPLES OF LAYERING

Layering is a structuring technique which permits the network of Open Systems to be viewed as logically composed of a succession of layers, each wrapping the lower layers and isolating them from the higher layers, as exemplified in Fig. 1. An alternative but equivalent illustration of layering, used in particular by SC16, is given in Fig. 2 where successive layers are represented in a vertical sequence, with the physical media for Open Systems Interconnection at the bottom.

Each individual system itself is viewed as being logically composed of a succession of subsystems, each corresponding to the intersection of the system with a layer. In other words, a layer is viewed as being logically composed of subsystems of the same rank of all interconnected systems. Each subsystem is, in turn, viewed as being made of one or several entities. In other words, each layer is made of entities, each of which belongs to one system. Entities in the same layer are termed peer entities.

For simplicity, any layer is referred to as the (N) layer, while its next lower and next higher layers are referred to as the (N - 1) layer and the (N + 1) layer, respectively. The same notation is used to designate all concepts relating to layers, e.g., entities in the (N) layer are termed (N) entities, as illustrated in Figs. 3 and 4.

The basic idea of layering is that each layer adds value to services provided by the set of lower layers in such a way that the highest layer is offered the set of services needed to run distributed applications. Layering thus divides the total problem into smaller pieces. Another basic principle of layering is to ensure independence of each layer by defining services provided by a layer to the next higher layer, independent of how these services are performed. This permits changes to be made in the way a layer or a set of layers operate, provided they still offer the same service to the next higher layer. (A more comprehensive list of criteria for layering is given in Section VI.) This technique is similar to the one used in structured programming where only the functions performed by a module (and not its internal functioning) are known by its users.

Except for the highest layer which operates for its own purpose, (N) entities distributed among the interconnected Open Systems work collectively to provide the (N) service
Fig. 3. Systems, layers, and services.
Fig. 4. Entities, service access points (SAP's), and protocols.
to (N + 1) entities as illustrated in Fig. 4. In other words, the (N) entities add value to the (N - 1) service they get from the (N - 1) layer and offer this value-added service, i.e., the (N) service, to the (N + 1) entities. Communication between the (N + 1) entities makes exclusive use of the (N) services. In particular, direct communication between the (N + 1) entities in the same system, e.g., for sharing resources, is not visible from outside of the system and thus is not covered by the OSI Architecture. Entities in the lowest layer communicate through the Physical Media for OSI, which could be considered as forming the (0) layer of the OSI Architecture. Cooperation between the (N) entities is ruled by the (N) protocols which precisely define how the (N) entities work together using the (N - 1) services to perform the (N) functions which add value to the (N - 1) service in order to offer the (N) service to the (N + 1) entities. The (N) services are offered to the (N + 1) entities at the (N) service access points, or (N) SAP's for short, which represent the logical interfaces between the (N) entities and the (N + 1) entities. An (N) SAP can be served by only one
Fifty Years of Communications and Networking
Fig. 5. Connections and connection endpoints (CEP's).
(N) entity and used by only one (N + 1) entity, but one (N) entity can serve several (N) SAP's and one (N + 1) entity can use several (N) SAP's. A common service offered by all layers consists of providing associations between peer SAP's which can be used in particular to transfer data (it can, for instance, also be used to synchronize the served entities participating in the association). More precisely (see Fig. 5), the (N) layer offers (N) connections between (N) SAP's as part of the (N) services. The most usual type of connection is the point-to-point connection, but there are also multiendpoint connections which correspond to multiple associations between entities (e.g., broadcast communication). The end of an (N) connection at an (N) SAP is called an (N) connection endpoint or (N) CEP for short. Several connections may coexist between the same pair (or n-tuple) of SAP's.
Note: In the following, for the sake of simplicity, we will consider only point-to-point connections.
III. IDENTIFIERS
Objects within a layer or at the boundary between adjacent layers need to be uniquely identifiable, e.g., in order to establish a connection between two SAP's, one must be able to identify them uniquely. The OSI Architecture defines identifiers for entities, SAP's, and connections as well as relations between these identifiers, as briefly outlined below.
Each (N) entity is identified with a global title¹ which is unique and identifies the same (N) entity from anywhere in the network of Open Systems. Within more limited domains, an (N) entity can be identified with a local title which uniquely identifies the (N) entity only in the domain. For instance, within the domain corresponding to the (N) layer, (N) entities are identified with (N) global titles which are unique within the (N) layer.
Each (N) SAP is identified with an (N) address which uniquely identifies the (N) SAP at the boundary between the (N) layer and the (N + 1) layer. The concepts of titles and addresses are illustrated in Fig. 6.
Bindings between (N) entities and the (N - 1) SAP's they use (i.e., SAP's through which they can access each other and communicate) are translated into the concept of (N) directory which indicates correspondence between global titles of (N) entities and (N) addresses through which they can be reached, as illustrated in Fig. 7.
¹ The term "title" has been preferred to the term "name" which is viewed as bearing a more general meaning. A title is equivalent to an entity name.
Fig. 6. Titles, addresses, and CEP-identifiers.

(N)-title    (N-1)-address
A            352
B            237
B            015
C            015

Fig. 7. Example of an (N)-directory.
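A directory of this kind can be pictured as a lookup table from titles to the addresses through which an entity can be reached. The following is only a minimal sketch under stated assumptions: the titles and addresses are hypothetical placeholders, and the names `Directory`, `bind`, and `resolve` are illustrative, not terms defined by the OSI Architecture.

```python
class Directory:
    """Toy (N)-directory: maps a global title to the addresses it is bound to."""

    def __init__(self):
        self._bindings = {}          # title -> list of addresses, in binding order

    def bind(self, title, address):
        """Record that the entity `title` can be reached through `address`."""
        addresses = self._bindings.setdefault(title, [])
        if address not in addresses:
            addresses.append(address)

    def resolve(self, title):
        """Return every address bound to `title`; raise KeyError if unknown."""
        if title not in self._bindings:
            raise KeyError(f"no binding for title {title!r}")
        return list(self._bindings[title])


# Hypothetical entries, chosen only for illustration.
directory = Directory()
directory.bind("entity-X", "100")
directory.bind("entity-Y", "200")
directory.bind("entity-Y", "201")    # one entity may be reachable at several addresses
```

Here `directory.resolve("entity-Y")` yields both addresses, reflecting the statement that one entity can use several SAP's.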
Correspondence between (N) addresses served by an (N) entity and the (N - 1) addresses used for this purpose is performed by an (N) mapping function. In addition to the simplest case of one-to-one mapping, mapping may, in particular, be hierarchical with the (N) address being made of an (N - 1) address and an (N) suffix. Mapping may also be performed "by table." Those three types of mapping are illustrated in Fig. 8.
Each (N) CEP is uniquely identified within its (N) SAP by an (N) CEP identifier which is used by the (N) entity and the (N + 1) entity on both sides of the (N) SAP to identify the (N) connection as illustrated in Fig. 6. This is necessary since several (N) connections may end at the same (N) SAP.
IV. OPERATION OF CONNECTIONS
A. Establishment and Release
When an (N + 1) entity requests the establishment of an (N) connection from one of the (N) SAP's it uses to another (N) SAP, it must provide at the local (N) SAP the (N) address of the distant (N) SAP. When the (N) connection is established, both the (N + 1) entity and the (N) entity will use the (N) CEP identifier to designate the (N) connection.
(N) connections may be established and released dynamically on top of (N - 1) connections. Establishment of an (N) connection implies the availability of an (N - 1) connection between the two entities. If not available, the (N - 1) connection must be established. This requires the availability of an (N - 2) connection. The same consideration applies downwards until an available connection is encountered.
In some cases, the (N) connection may be established simultaneously with its supporting (N - 1) connection provided
THE BEST OF THE BEST
Fig. 8. Mapping between addresses (one-to-one, hierarchical, and by table).
the (N - 1) connection establishment service permits (N) entities to exchange information necessary to establish the (N) connection.
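The downward search for an available supporting connection can be sketched as a short recursion. This is an illustrative sketch only: layers are represented by the integers 1 (lowest) upward, the set `available` of layers that already have a connection is a hypothetical stand-in for real connection state, and the function name `establish` is not taken from the OSI documents.

```python
def establish(layer, available, log=None):
    """Ensure a connection exists at `layer`, setting up lower ones first.

    `available` is the set of layer numbers that already have a connection;
    it is updated in place. Returns the layers established, in setup order.
    """
    if log is None:
        log = []
    if layer in available:
        return log                   # an available connection was encountered
    if layer > 1:
        # The (N) connection needs an (N - 1) connection underneath it.
        establish(layer - 1, available, log)
    available.add(layer)
    log.append(layer)
    return log


# Layers 1 and 2 are already connected; layer 4 is requested.
order = establish(4, {1, 2})
```

In this run `order` is `[3, 4]`: the recursion stops at layer 2, then builds layer 3 before layer 4. The simultaneous establishment mentioned in the text is not modeled.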
Fig. 9. Correspondence between connections.
B. Multiplexing and Splitting
Three particular types of construction of (N) connections on top of (N - 1) connections are distinguished.
1) One-to-one correspondence, where each (N) connection is built on one (N - 1) connection.
2) Multiplexing (referred to as "upward multiplexing" in [7]), where several (N) connections are multiplexed on one single (N - 1) connection.
3) Splitting (referred to as "downward multiplexing" in [7]), where one single (N) connection is built on top of several (N - 1) connections, the traffic on the (N) connection being divided between the various (N - 1) connections.
These three types of correspondence between connections in adjacent layers are illustrated in Fig. 9.
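The difference between multiplexing and splitting can be made concrete with a small sketch. The mechanisms used here, tagging each unit with a connection identifier for multiplexing and with a sequence number for splitting, are assumptions made for illustration; the text does not prescribe how the (N) entities keep the traffic apart or restore its order.

```python
def multiplex(upper):
    """Carry several (N) connections on one (N - 1) connection.

    `upper` maps an (N) connection id to its list of data units.
    Each unit is tagged with its connection id so the peer can separate them.
    """
    return [(conn_id, unit) for conn_id, units in upper.items() for unit in units]


def demultiplex(lower):
    """Inverse of multiplex: regroup tagged units by (N) connection id."""
    upper = {}
    for conn_id, unit in lower:
        upper.setdefault(conn_id, []).append(unit)
    return upper


def split(units, n_lower):
    """Divide one (N) connection's traffic round-robin over n_lower (N - 1) connections.

    Each unit carries a sequence number so the original order can be restored.
    """
    lower = [[] for _ in range(n_lower)]
    for seq, unit in enumerate(units):
        lower[seq % n_lower].append((seq, unit))
    return lower


def recombine(lower):
    """Inverse of split: merge the (N - 1) connections back into sequence order."""
    tagged = [item for connection in lower for item in connection]
    return [unit for _, unit in sorted(tagged, key=lambda item: item[0])]
```

For example, `split(list("abcde"), 2)` places units a, c, e on the first lower connection and b, d on the second, and `recombine` returns them in their original order.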
C. Data Transfer
Information is transferred in various types of data units between peer entities and between entities attached to a specific service access point. The data units are defined below and the interrelationship among several of them is illustrated in Fig. 10.
(N) Protocol Control Information is information exchanged between two (N) entities, using an (N - 1) connection, to coordinate their joint operation.
(N) User Data is the data transferred between two (N) entities on behalf of the (N + 1) entities for whom the (N) entities are providing services.
An (N) Protocol Data Unit is a unit of data which contains (N) Protocol Control Information and possibly (N) User Data.
(N) Interface Control Information is information exchanged between an (N + 1) entity and an (N) entity to coordinate their joint operation.
(N) Interface Data is information transferred from an (N + 1) entity to an (N) entity for transmission to a correspondent (N + 1) entity over an (N) connection, or conversely, information transferred from an (N) entity to an (N + 1) entity which has been received over an (N) connection from a correspondent (N + 1) entity.
(N) Interface Data Unit is the unit of information transferred across the service access point between an (N + 1) entity and an (N) entity in a single interaction. The size of (N)
Fig. 10. Interrelationship between data units.
interface data units is not necessarily the same at each end of the connection.
(N - 1) Service Data Unit is the amount of (N - 1) interface data whose identity is preserved from one end of an (N - 1) connection to the other. Data may be held within a connection until a complete service data unit is put into the connection.
Expedited (N - 1) service data unit is a small (N - 1) service data unit whose transfer is expedited. The (N - 1) layer ensures that an expedited data unit will not be delivered after any subsequent service data unit or expedited data unit sent on that connection. An expedited (N - 1) service data unit may also be referred to as an (N - 1) expedited data unit.
Note: An (N) protocol data unit may be mapped one-to-one onto an (N - 1) service data unit (see Fig. 11).
V. MANAGEMENT ASPECTS
Even though a number of resources are managed locally, i.e., without involving cooperation between distinct systems, some management functions do. Examples of such management functions are configuration information, cold start/termination, monitoring, diagnostics, reconfiguration, etc. The OSI Architecture considers management functions as applications of a specific type. Management entities located in the highest layer of the architecture may use the complete set of services offered to all applications in order to perform
Fig. 11. Logical relationship between data units in adjacent layers (PDU = protocol data unit, SDU = service data unit, PCI = protocol control information).
management functions. This organization of management functions within the OSI Architecture is illustrated in Fig. 12.
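Returning briefly to the data units of Section IV-C, the relationship between protocol control information, protocol data units, and service data units can be sketched as follows. The sketch assumes the one-to-one mapping of an (N) protocol data unit onto an (N - 1) service data unit mentioned in the Note, and it assumes that each layer's control information is a fixed byte string placed in front of the user data; both the placement and the example values are illustrative choices, not requirements of the OSI Architecture.

```python
def encapsulate(user_data, pci_per_layer):
    """Pass user data down through the layers listed in `pci_per_layer`.

    `pci_per_layer` lists each layer's protocol control information (PCI),
    highest layer first. Each layer forms its PDU as PCI + SDU and hands
    that PDU to the next lower layer as a single SDU.
    """
    sdu = user_data
    for pci in pci_per_layer:
        pdu = pci + sdu              # (N) PDU = (N) PCI + (N) user data
        sdu = pdu                    # the (N) PDU becomes the (N - 1) SDU
    return sdu


def decapsulate(data, pci_per_layer):
    """Reverse of encapsulate: each layer, lowest first, strips its own PCI."""
    for pci in reversed(pci_per_layer):
        if not data.startswith(pci):
            raise ValueError("unexpected protocol control information")
        data = data[len(pci):]
    return data


# Hypothetical PCI markers for three layers, highest layer first.
pci = [b"T|", b"N|", b"D|"]
wire = encapsulate(b"hello", pci)
```

With these placeholder values `wire` is `b"D|N|T|hello"`: the lowest layer's control information ends up outermost, and `decapsulate(wire, pci)` recovers the original user data.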
VI. THE SEVEN LAYERS OF THE OSI ARCHITECTURE
A. Justification of the Seven Layers
ISO determined a number of principles to be considered for defining the specific set of layers in the OSI architecture, and applied those principles to come up with the seven layers of the OSI Architecture. Principles to be considered are as follows.
1) Do not create so many layers as to make difficult the system engineering task describing and integrating these layers.
2) Create a boundary at a point where the services description can be small and the number of interactions across the boundary is minimized.
3) Create separate layers to handle functions which are manifestly different in the process performed or the technology involved.
4) Collect similar functions into the same layer.
5) Select boundaries at a point which past experience has demonstrated to be successful.
6) Create a layer of easily localized functions so that the layer could be totally redesigned and its protocols changed in a major way to take advantage of new advances in architectural, hardware, or software technology without changing the services and interfaces with the adjacent layers.
7) Create a boundary where it may be useful at some point in time to have the corresponding interface standardized.
8) Create a layer when there is a need for a different level of abstraction in the handling of data, e.g., morphology, syntax, semantics.
9) Enable changes of functions or protocols within a layer without affecting the other layers.
10) Create for each layer interfaces with its upper and lower layer only.
11) Create further subgrouping and organization of functions to form sublayers within a layer in cases where distinct communication services need it.
12) Create, where needed, two or more sublayers with a
Fig. 12. A representation of management functions.
common, and therefore minimum, functionality to allow interface operation with adjacent layers.
13) Allow bypassing of sublayers.
B. Specific Layers
The following is a brief explanation of how the layers were chosen.
1) It is essential that the architecture permits usage of a realistic variety of physical media for interconnection with different control procedures (e.g., V.24, V.35, X.21, etc.). Application of principles 3, 5, and 8 leads to identification of a Physical Layer as the lowest layer in the architecture.
2) Some physical communications media (e.g., telephone line) require specific techniques to be used in order to transmit data between systems despite a relatively high error rate (i.e., an error rate not acceptable for the great majority of applications). These specific techniques are used in data-link control procedures which have been studied and standardized for a number of years. It must also be recognized that new physical communications media (e.g., fiber optics) will require different data-link control procedures. Application of principles 3, 5, and 8 leads to identification of a Data Link Layer on top of the Physical Layer in the architecture.
3) In the Open Systems Architecture, some systems will act as final destination of data. Some systems may act only as intermediate nodes (forwarding data to other systems). Application of principles 3, 5, and 7 leads to identification of a Network Layer on top of the Data Link Layer. Network-oriented protocols such as routing, for example, will be grouped in this layer. Thus, the Network Layer will provide a connection path (network connection) between a pair of transport entities (see Fig. 13).
4) Control of data transportation from source end system to destination end system (which need not be performed in
Fig. 13. The seven layers OSI architecture (Application, Presentation, Session, Transport, Network, Data Link, and Physical layers over the physical media for interconnection).
intermediate nodes) is the last function to be performed in order to provide the totality of the transport service. Thus, the upper layer in the transport-service part of the architecture is the Transport Layer, sitting on top of the Network Layer. This Transport Layer relieves higher layer entities from any concern with the transportation of data between them.
5) In order to bind/unbind distributed activities into a logical relationship that controls the data exchange with respect to synchronization and structure, the need for a dedicated layer has been identified. So the application of principles 3 and 4 leads to the establishment of the Session Layer which is on top of the Transport Layer.
6) The remaining set of general interest functions are those related to representation and manipulation of structured data for the benefit of application programs. Application of principles 3 and 4 leads to identification of a Presentation Layer on top of the Session Layer.
7) Finally, there are applications consisting of application processes which perform information processing. A portion of these application processes and the protocols by which they communicate comprise the Application Layer as the highest layer of the architecture.
The resulting architecture with seven layers, illustrated in Fig. 13, obeys principles 1 and 2. A more detailed definition of each of the seven layers identified above is given in the following sections, starting from the top with the application layer described in Section VI-C1) down to the physical layer described in Section VI-C7).
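The resulting stack and the rule that each layer interfaces only with its upper and lower neighbor (principle 10) can be written down compactly. This is a sketch for orientation only: the numeric values 1 to 7, counted from the Physical Layer upward, follow common convention rather than anything stated in this section, and the helper names `service_provider` and `service_user` are hypothetical.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The seven layers, numbered from the bottom (numbering is an assumption)."""
    PHYSICAL = 1
    DATA_LINK = 2
    NETWORK = 3
    TRANSPORT = 4
    SESSION = 5
    PRESENTATION = 6
    APPLICATION = 7


def service_provider(layer):
    """Return the (N - 1) layer whose service `layer` uses, or None at the bottom."""
    return Layer(layer - 1) if layer > Layer.PHYSICAL else None


def service_user(layer):
    """Return the (N + 1) layer that uses the service of `layer`, or None at the top."""
    return Layer(layer + 1) if layer < Layer.APPLICATION else None
```

For instance, `service_provider(Layer.TRANSPORT)` is `Layer.NETWORK` and `service_user(Layer.TRANSPORT)` is `Layer.SESSION`, matching items 4) and 5) above.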
C. Overview of the Seven Layers of the OSI Architecture
1) The Application Layer: This is the highest layer in the OSI Architecture. Protocols of this layer directly serve the end user by providing the distributed information service appropriate to an application, to its management, and to system management. Management of Open Systems Interconnection comprises those functions required to initiate, maintain, terminate, and record data concerning the establishment of connections for data transfer among application processes. The other layers exist only to support this layer. An application is composed of cooperating application processes which intercommunicate according to application layer protocols. Application processes are the ultimate source and sink for data exchanged. A portion of an application process is manifested in the application layer as the execution of application protocol (i.e., application entity). The rest of the application process
is considered beyond the scope of the present layered model. Applications or application processes may be of any kind (manual, computerized, industrial, or physical).
2) The Presentation Layer: The purpose of the Presentation Layer is to provide the set of services which may be selected by the Application Layer to enable it to interpret the meaning of the data exchanged. These services are for the management of the entry, exchange, display, and control of structured data. The presentation service is location-independent and is considered to be on top of the Session Layer which provides the service of linking a pair of presentation entities. It is through the use of services provided by the Presentation Layer that applications in an Open Systems Interconnection environment can communicate without unacceptable costs in interface variability, transformations, or application modification.
3) The Session Layer: The purpose of the Session Layer is to assist in the support of the interactions between cooperating presentation entities. To do this, the Session Layer provides services which are classified into the following two categories.
a) Binding two presentation entities into a relationship and unbinding them. This is called session administration service.
b) Control of data exchange, delimiting, and synchronizing data operations between two presentation entities. This is called session dialogue service.
To implement the transfer of data between presentation entities, the Session Layer may employ the services provided by the Transport Layer.
4) The Transport Layer: The Transport Layer exists to provide a universal transport service in association with the underlying services provided by lower layers. The Transport Layer provides transparent transfer of data between session entities. The Transport Layer relieves these session entities from any concern with the detailed way in which reliable and cost-effective transfer of data is achieved.
The Transport Layer is required to optimize the use of available communications services to provide the performance required for each connection between session entities at a minimum cost.
5) The Network Layer: The Network Layer provides functional and procedural means to exchange network service data units between two transport entities over a network connection. It provides transport entities with independence from routing and switching considerations.
6) The Data Link Layer: The purpose of the Data Link Layer is to provide the functional and procedural means to establish, maintain, and release data links between network entities.
7) The Physical Layer: The Physical Layer provides mechanical, electrical, functional, and procedural characteristics to establish, maintain, and release physical connections (e.g., data circuits) between data link entities.
VII. OSI PROTOCOLS DEVELOPMENTS
The model of OSI Architecture defines the services provided by each layer to the next higher layer, and offers concepts to be used to specify how each layer performs its specific functions. Detailed functioning of each layer is defined by the protocols specific to the layer in the framework of the Architecture model. Most of the initial effort within ISO has been placed on the model of OSI. The next step consists of the definition of standard protocols for each layer. This section contains a brief description of a likely initial set of protocols, corresponding to specific standardization projects recommended by SC16.
A. Protocols in the Physical Layer
Standards already exist within CCITT defining: 1) interfaces with physical media for OSI, and 2) protocols for establishing, controlling, and releasing switched data circuits. Such standards are described in other papers in this issue [9], [10], e.g., X.21, V.24, V.35, etc. The only work to be done will consist of clearly relating those standards to the OSI Architecture model.
B. Protocols in the Data Link Layer
Standard protocols for the Data Link Layer have already been developed within ISO, which are described in other papers within this issue [11], [12]. The most popular Data Link Layer protocol is likely to be HDLC [13], without ruling out the possibility of using also other character-oriented standards. Just as for the Physical Layer, the remaining work will consist mainly of clearly relating these existing standards to the OSI Architecture model.
C. Protocols in the Network Layer
An important basis for protocols in the network layer is level 3 of the X.25 interface [14] defined by CCITT and described in another paper in this issue. It will have to be enhanced in particular to permit interconnection of private and public networks. Other types of protocols are likely to be standardized later in this layer, and in particular, protocols corresponding to Datagram networks [10].
D. Protocols in the Transport Layer
No standard exists at present for this layer; a large amount of experience has been accumulated in this area and several proposals are available. The most widely known proposal is the Transport Protocol proposed by IFIP and known as INWG 96.1 [15], which could serve as a basis for defining an international standard.
E. Protocols for the Session Layer
No standard exists and no proposal has been currently available, since in most networks, session functions were often considered as part of higher layer functions such as Virtual Terminal and File Transfer. A standard Session Layer Protocol can easily be extracted from existing higher layer protocols.
F. Presentation Layer Protocol
So far, Virtual Terminal Protocols and part of Virtual File are considered the most urgent protocols to be developed in the Presentation Layer. A number of VTP's are available (e.g., [16], [17]), many of them being very similar, and it should be easy to derive a Standard VTP from these proposals, also making use of the ISO standard for "Extended Control Characters for I/O Imaging Devices" [18]. These protocols are reviewed in another paper in this issue [19]. The situation is similar for File Transfer Protocols.
G. Management Protocols
Most of the work within ISO has been done so far on the architecture of management functions, and very little work has been done on management protocols themselves. Therefore, it is too early to give indications on the likely results of the ISO work in this area.
VIII. CONCLUSION
The development of OSI Standards is a very big challenge, the result of which will impact all future computer communication developments. If standards come too late or are inadequate, interconnection of heterogeneous systems will not be possible or will be very costly. The work collectively achieved so far by SC16 members is very promising, and additional efforts should be expended to capitalize on these initial results and come up rapidly with the most urgently needed set of standards which will support initial usage of OSI (mainly terminals accessing services and file transfers). The next set of standards, including OSI management and access to distributed data, will have to follow very soon.
Common standards between ISO and CCITT are also essential to the success of standardization, since new services announced by PTT's and common carriers are very similar to data processing services offered as computer manufacturer products, and duplication of noncompatible standards could simply cause the standardization effort to fail. In this regard, acceptance of the OSI Reference Model by CCITT Rapporteur's Group on Layered Architecture for Public Data Networks Services is most promising.
It is essential that all partners in this standardization process expend their best effort so it will be successful, and the benefits can be shared by all users, manufacturers of terminals and computers, and the PTT's/common carriers.
ACKNOWLEDGMENT
The OSI Architecture model briefly described in this paper results from the work of more than 100 experts from many countries and international organizations. Participation in this collective work was really a fascinating experience for the author who acknowledges the numerous contributions from SC16 members which have been merged in the final version of the OSI Architecture briefly presented here.
REFERENCES
[1] L. G. Roberts and B. D. Wessler, "Computer network development to achieve resource sharing," in Proc. SJCC, 1970, pp. 543-549.
[2] L. Pouzin, "Presentation and major design aspects of the CYCLADES computer network," in Proc. 3rd ACM-IEEE Commun. Symp., Tampa, FL, Nov. 1973, pp. 80-87.
[3] J. H. McFayden, "Systems network architecture: An overview," IBM Syst. J., vol. 15, no. 1, pp. 4-23, 1976.
[4] G. E. Conant and S. Wecker, "DNA, An Architecture for heterogeneous computer networks," in Proc. ICCC, Toronto, Ont., Canada, Aug. 1976, pp. 618-625.
[5] H. Zimmermann, "High level protocols standardization: Technical and political issues," in Proc. ICCC, Toronto, Ont., Canada, Aug. 1976, pp. 373-376.
[6] ISO/TC97/SC16, "Provisional model of open systems architecture," Doc. N34, Mar. 1978.
[7] ISO/TC97/SC16, "Reference model of open systems interconnection," Doc. N227, June 1979.
[8] H. Zimmermann and N. Naffah, "On open systems architecture," in Proc. ICCC, Kyoto, Japan, Sept. 1978, pp. 669-674.
[9] H. V. Bertine, "Physical level protocols," this issue, pp. 433-444.
[10] H. C. Folts, "Procedures for circuit-switched service in synchronous public data networks," and "X.25 transaction-oriented features-Datagram and fast select," this issue, pp. 489-496.
[11] J. W. Conard, "Character oriented data link control protocols," this issue, pp. 445-454.
[12] D. E. Carlson, "Bit-oriented data link control procedures," this issue, pp. 455-467.
[13] ISO, "High level data link control-elements of procedure," IS 4335, 1977.
[14] CCITT, "X.25," Orange Book, vol. VIII-2, 1977, pp. 70-108.
[15] IFIP-WG 6.1, "Proposal for an internetwork end-to-end transport protocol," INWG Note 96.1; also, doc. ISO/TC97/SC16/N24, 46 pp., Mar. 1978.
[16] IFIP-WG 6.1, "Proposal for a standard virtual terminal protocol," doc. ISO/TC97/SC16/N23, 56 pp., Feb. 1978.
[17] EURONET, "Data entry virtual terminal protocol for EURONET," VTP/D-Issue 4, doc. EEC/WGS/165.
[18] ISO, "Extended control characters for I/O imaging devices," DP 6429.
[19] J. Day, "Terminal protocols," this issue, pp. 585-593.

Hubert Zimmermann received degrees in engineering from Ecole Polytechnique, Paris, France, in 1963, and from Ecole Nationale Superieure des Telecommunications, Paris, France, in 1966. He is presently in charge of the computer communications group at IRIA, Rocquencourt, France. He was involved in development of command and control systems before joining IRIA in 1972 to start the CYCLADES project with L. Pouzin. Within CYCLADES, he was mainly responsible for design and implementation of host protocols. Dr. Zimmermann is a member of IFIP WG 6.1 [International Network Working Group (INWG)]. He also chaired the Protocol Subgroup and coauthored several proposals for international protocols. He is an active participant in the development of standards for Open Systems Interconnection (OSI) within ISO, where he chairs the working group on OSI architecture.
Deadlock Avoidance in Store-and-Forward Networks-I: Store-and-Forward Deadlock
PHILIP M. MERLIN, MEMBER, IEEE, AND PAUL J. SCHWEITZER
Abstract-Store-and-forward deadlock in store-and-forward networks may be avoided by forwarding messages from buffer to buffer in accordance with a loop-free directed buffer graph which accommodates all the desired message routes. Schemes for designing such buffer graphs are presented, together with methods for using them to forward the messages in an efficient and deadlock-free manner. These methods can be implemented by a set of counters at each node. Such an implementation increases the efficiency of buffer use, and simplifies jumping between normal low-overhead operation when deadlock is far and more careful operation when deadlock is near. The proposed deadlock avoidance mechanism works for any network topology and any finite routing algorithm.
Fig. 1. Example of store-and-forward deadlock: all buffers deadlocked.
I. INTRODUCTION
In most cases, the occurrence of network deadlock has a highly objectionable impact upon network users. When the deadlocks are discovered, they are frequently corrected in an ad hoc fashion [1]. Since changes in existing implementations can be costly, it is preferable to incorporate deadlock avoidance (or recognition-and-recovery) procedures into the design as early as possible and in an orderly fashion. A primary concern with any deadlock avoidance procedure is that it have minimal effect upon system performance during normal operating conditions.
In this paper, a general method is presented for the design and implementation of store-and-forward networks [2] free from "store-and-forward deadlock" (or circular deadlock) [3]. This method has minimal overhead during normal operations, and switches into a more careful mode of operation when deadlock is near. A companion paper [4] extends this method to other types of deadlock.

Store-and-forward deadlock refers to the situation in which there is a set of buffers, all of which hold messages waiting to be forwarded, and these messages can be forwarded only to other buffers of the set. The result is standstill. Figs. 1 and 2 show two examples of store-and-forward deadlock. In Fig. 1 there are two network nodes, each full with messages for the other, and therefore neither node can receive or eject any messages. Fig. 2 shows a more complex example. All buffers marked x are occupied by messages waiting for other buffers marked x to become available. Since all x buffers are full, the result is standstill. Notice there may be other empty buffers at these nodes, e.g., buffers y, but permanently allocated for other purposes.

Fig. 2. Example of store-and-forward deadlock: smaller deadlocked set of buffers.

Paper approved by the Editor for Computer Communication of the IEEE Communications Society for publication after presentation at the Third Jerusalem Conference on Information Technology, Jerusalem, Israel, August 1978. Manuscript received November 7, 1977; revised June 11, 1979. This work was performed mainly at the IBM T. J. Watson Research Center, Yorktown Heights, NY, and was completed shortly before P. M. Merlin's premature death; by his request, it is dedicated to the medical staff at the B-Internal Disease Department and at the Institute of Oncology at the Rambam Hospital in Haifa, Israel, and in particular to Dr. Yoram Cohen. P. M. Merlin was with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel. He is now deceased. P. J. Schweitzer is with the Graduate School of Management, University of Rochester, Rochester, NY 14627. Reprinted from IEEE Transactions on Communications, vol. COM-28, no. 3, March 1980.

It can be shown that store-and-forward deadlock implies a cycle of buffer requests [5], [6]. Basically, our method is based on preventing cycles of requests in an efficient way. The proposed method generalizes schemes for deadlock avoidance proposed by [7] which are now incorporated in the GMD-net protocol, and which can also be used for congestion control [8]-[10]. Conceptually we use a directed graph
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
as in [11] to represent the possible forwarding of messages between buffers. However, we differentiate between "guaranteed paths" and "optional paths." This permits generation of deadlock-free schemes which would otherwise be impossible to generate or difficult to visualize.

Several assumptions are made in this paper.
a1) All nodes, communication channels, and destinations are operational, i.e., they will process a message in finite time. (This is relaxed later.)
a2) No messages are lost. (This is relaxed later.)
a3) Messages are transmitted as complete units, i.e., no packetizing or reassembly. (Reassembly deadlock is treated in [4].)
a4) No protocol prevents the destination node from turning messages over to the user as soon as they arrive. (This can be relaxed as shown for the pacing protocol in [4].)
a5) Any topology with a finite number of nodes is permitted. Any node-to-node routing algorithm is permitted which achieves message delivery within a bounded number of hops, e.g., a bounded number of repeat visits to a node is allowed. Adaptive routing can be employed provided route lengths are bounded.
a6) All message sizes are bounded; hence, no message requires more than, say, M buffers at a node. Throughout most of this paper, any message is assumed to fit into one buffer; the extension to multibuffer messages is given in Section V.

The deadlock avoidance method described in this paper affects only the transfer of messages between adjacent network nodes. Provisions can be made such that no alterations are required in the protocols by which network nodes accept messages from attached users, except the ability to slow the admittance rate, if necessary, until the appropriate network buffers become available. The implementation can be arranged to preserve FIFO for all messages following a fixed route.

Section II presents the basic method for deadlock avoidance. Section III extends the method to allow for better utilization of the buffers as well as to provide more flexibility in the ways in which messages can be forwarded. Section IV presents several schemes for designing the directed graphs used by the proposed method. Section V presents an approach in which buffer "counters" implement the method efficiently.
II. GUARANTEED PATHS
Suppose one is given a store-and-forward network with prescribed nodes, prescribed communication channels between some pairs of nodes, a prescribed finite set of buffers at each node, and a prescribed set of all message routes permitted by the routing algorithm. Each route consists of a finite sequence of nodes and communication channels from the source node to the destination node. At each node the message is assumed to fit into one buffer. (Section V extends the method to multibuffer messages.)

A buffer graph (BG) is a directed graph whose nodes are a subset of the buffers, and where directed arcs connect some pairs of buffers, indicating permitted message flow from one buffer to the next. Arcs are permitted only between buffers in the same node, or between buffers in distinct nodes which are connected by communication channels.
Fig. 3. Network for Example 2.1.
Fig. 4. BG for Example 2.1.
The BG is required to have the following properties: 1) there is at least one directed path, called a guaranteed path, in the BG corresponding to each route in the network; and 2) the BG contains no directed loops. A fresh message accepted by the network is placed, at the source node, in a buffer from which there is at least one guaranteed path in the buffer graph to its destination node, which conforms to the given routing algorithm. The message makes its way from source to destination, using the buffers of such a guaranteed path. In this way, each message in the network is waiting for the next buffer in the path. However, since the BG is loop-free, waiting loops cannot form and, as proven below, deadlocks never occur.

Example 2.1: For illustration, the network of Fig. 3 is given. In this example, any route which does not visit a node more than once is permitted. Fig. 4 shows a corresponding BG in which the squares denote buffers (numbered for convenience). The reader can verify that, in both directions, any route visiting all four nodes is contained in some path in the buffer graph. Routes involving fewer than four nodes are clearly included within the routes which visit all four nodes. If, for example, a message is to be transmitted from B to A to D to C, then it will be first placed in buffer 2 of B, moved along the path 2 → 3 → 6 → 9, and absorbed by the destination. A message in the reverse direction will travel in buffers 5 → 6 → 7 → 8, and a direct message from C to B will choose between the guaranteed paths 1 → 2 and 9 → 10. Note that the graph of Fig. 4 has no directed loops and therefore cannot have deadlocks.
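The loop-free requirement on a buffer graph is easy to check mechanically. The following is a minimal sketch, not part of the original paper: the function name `is_loop_free` is hypothetical, and the arc list contains only the arcs named in the text of Example 2.1 (a subset of Fig. 4, which is not reproduced here).

```python
from collections import defaultdict

def is_loop_free(arcs):
    """Return True if the directed graph given by (tail, head) arcs has no directed loop."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))
    # Kahn's algorithm: repeatedly remove buffers that have no incoming arcs left.
    ready = [n for n in nodes if indeg[n] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    # Every buffer is removed exactly when no directed loop exists.
    return removed == len(nodes)

# Arcs taken from the guaranteed paths named in Example 2.1:
# 2 -> 3 -> 6 -> 9, 5 -> 6 -> 7 -> 8, 1 -> 2, and 9 -> 10.
example_arcs = [(2, 3), (3, 6), (6, 9), (5, 6), (6, 7), (7, 8), (1, 2), (9, 10)]

print(is_loop_free(example_arcs))             # loop-free subset of the BG
print(is_loop_free(example_arcs + [(9, 2)]))  # an extra arc 9 -> 2 would close a loop
```

Adding the hypothetical arc 9 → 2 closes the loop 2 → 3 → 6 → 9 → 2, which is exactly the kind of waiting cycle the buffer graph is designed to exclude.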
Fifty Years of Communications and Networking
In many cases it is difficult to graphically construct buffer graphs. The following schematic example from [7] will demonstrate that BG's can be constructed without actually drawing the directed graph. Furthermore, the example will demonstrate that for any given network, a corresponding BG can be constructed, provided that the routing algorithm puts a bound on the maximal number of hops a message may take.

Example 2.2-Hops-So-Far Scheme: Let K denote the maximal number of hops a message can make in the net. At each node, assign K + 1 buffers, say [B0, B1, B2, ..., BK]. At each node, from each buffer Bi there is an arc to the buffers of type B(i + 1) located at nodes with communication channels to this node. A message is placed initially in B0 at the source node, transferred to buffer B1 at the next node on the route, buffer B2 at the following node, etc. At each node, buffer Bi holds messages which have made exactly i hops so far. Clearly, all possible routes involving at most K hops are accommodated, and the monotonicity of the index i ensures that loops are not formed. The hops-so-far scheme has the convenient features of easy implementation and a simple criterion for admitting external messages to the network: whether or not buffer B0 is available.

Theorem 1: Any message accepted by the network will be delivered to the destination in a finite time, and all buffers occupied by the message will be released in a finite time, provided that the following conditions hold.
C1) Messages are accepted by the network only if there is an empty buffer in the source node from which there is at least one guaranteed path in the BG to its destination node.
C2) Any message in a buffer at other than its destination node is characterized by a nonempty set of buffers it can be transferred to. This set is called the waiting set, and each of its buffers lies on a guaranteed path from the buffer the message occupies to its destination. The waiting set is constant in time. If any buffer in the waiting set becomes vacant and grants permission for this message to enter, the message is transferred within a finite period of time.
C3) If a buffer becomes vacant, then within a finite period of time, one of the messages having this buffer in its waiting set is granted admission to the buffer.
C4) Consider any buffer in the waiting set for a given message. Then only a finite number of other messages can be admitted to this buffer before this message is granted admission ("fairness in admissions").
C5) When a message is transmitted from one buffer to another, the previous buffer is released within a finite time.
C6) A message occupying a buffer at its destination node is not forwarded any further; it is delivered to the user, and the buffer is released, within a finite time.
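The hops-so-far scheme of Example 2.2 can be sketched in a few lines. This is an illustrative sketch only: the function names, the representation of a buffer as a `(node, i)` pair, and the four-node ring used as a test network are assumptions, not taken from the paper.

```python
def hops_so_far_arcs(channels, K):
    """Arcs of a hops-so-far buffer graph.

    channels: iterable of (x, y) node pairs joined by a bidirectional channel.
    A buffer is represented as (node, i), meaning buffer Bi at that node.
    """
    arcs = set()
    for x, y in channels:
        for i in range(K):
            # From Bi at one end of a channel to B(i+1) at the other end.
            arcs.add(((x, i), (y, i + 1)))
            arcs.add(((y, i), (x, i + 1)))
    return arcs

def buffer_path(route):
    """Buffers used by a message that follows `route` (a list of nodes)."""
    return [(node, hops) for hops, node in enumerate(route)]

# Hypothetical four-node ring; K = 3 allows routes visiting all four nodes.
channels = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
arcs = hops_so_far_arcs(channels, K=3)

path = buffer_path(["B", "A", "D", "C"])
print(path)
# Each step of the path is an arc of the buffer graph.
print(all(step in arcs for step in zip(path, path[1:])))
# Every arc raises the hop index by one, so no directed loop can form.
print(all(head[1] == tail[1] + 1 for tail, head in arcs))
```

Because the index i strictly increases along every arc, the loop-free property holds by construction and needs no separate graph search.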
Schematic Proof:
1) Since the BG is loop-free, each buffer of the BG can be labeled by its buffer class, defined as the maximum number of arcs of all directed paths to this buffer from buffers lacking incoming arcs (e.g., in Fig. 4, buffer j is in class j - 1, for j = 1, ..., 11). With this definition of buffer class, the forwarding of messages involves buffers having strictly increasing buffer classes.
2) Let the highest buffer class be class r. Any message occupying a buffer of class r is at a buffer with no outgoing arcs; hence, it is a message at its destination. By C6), this message will be delivered within a finite time and its buffer released. By C5), within a finite time no copies of this message will exist anywhere in the BG.
3) An inductive proof proceeds downward on j = r - 1, r - 2, ..., 0. Any message in a buffer of class j is either at its destination [and will be delivered within a finite time with release of all buffers holding copies, via C6) and C5)] or is waiting for some buffer of class exceeding j [which will become vacant within a finite time, and will both become available to, and be used by, this message within a finite time, via C3), C4), and C2)]. Hence, any message not at its destination will be forwarded within a finite amount of time. Since the message always lies on a guaranteed path in the BG, all of which have finite length, it must be delivered within a finite time with release of all buffers containing copies.

Remark 1: Our proof carries over to some more complicated cases where the waiting set is not constant in time [12].
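The buffer class used in the proof of Theorem 1 is the length of the longest directed path reaching a buffer from any buffer without incoming arcs. A minimal sketch of that computation follows; the function name and the example graph (a chain of 11 buffers plus two shortcut arcs) are assumptions chosen so that buffer j lands in class j - 1, as the proof states for Fig. 4. The actual arcs of Fig. 4 are not reproduced here.

```python
from functools import lru_cache

def buffer_classes(arcs):
    """Map each buffer to its class: the longest path length from a source buffer.

    Assumes the graph is loop-free; otherwise the recursion would not terminate.
    """
    preds = {}
    for u, v in arcs:
        preds.setdefault(v, []).append(u)
        preds.setdefault(u, [])

    @lru_cache(maxsize=None)
    def cls(b):
        # A buffer with no incoming arcs is in class 0.
        return max((cls(p) + 1 for p in preds[b]), default=0)

    return {b: cls(b) for b in preds}

# Hypothetical loop-free graph: chain 1 -> 2 -> ... -> 11 plus shortcuts 3 -> 6 and 6 -> 9.
chain_arcs = [(j, j + 1) for j in range(1, 11)] + [(3, 6), (6, 9)]
classes = buffer_classes(chain_arcs)

print(classes[1], classes[6], classes[11])
# Every arc leads to a strictly higher class, which is what the induction relies on.
print(all(classes[u] < classes[v] for u, v in chain_arcs))
```

The final check mirrors step 1) of the proof: forwarding along any arc moves a message into a buffer of strictly higher class.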
Remark 2: Assumptions a1) and a2) can be relaxed. If messages are lost, the method still guarantees deadlock-free operation, but clearly, a lost message will not be delivered. The method can also cope with nonoperational nodes or links, provided that alternate routes are supplied or messages with unreachable destinations are discarded [4]. Under normal operations, however, deadlock is avoided without resorting to throwaway.

III. BUFFER GRAPH EXTENSIONS AND GENERALIZED USAGE
The proposed method restricts the set of messages which can use some of the buffers at a node. Therefore, at certain times there may be empty buffers at a node which cannot be used by messages waiting at adjacent nodes. In other words, the buffer usage is restricted for the sake of deadlock avoidance. The frequency of such situations can be reduced to a very modest level by using the methods described in this section.
A. Additional Guaranteed Paths
One may add arcs to the BG, provided the loop-free property is maintained, and provided these new arcs connect either buffers in the same node or buffers in nodes having a communication channel between them. (The direction of such an arc is dictated if the original BG contained a path between these two buffers, and is otherwise arbitrary.) This creates alternative guaranteed paths in the BG while maintaining the deadlock-free property.

As an illustration of additional paths, consider Example 2.2. One may add arcs from buffer Bi at one node to every buffer Bj, j > i, at the same node or at nodes connected by communications channels to this node. This may achieve better performance than restricting j to i + 1.

The buffers lacking arcs can be put into use by adding arcs. This can be employed to augment the BG by parallel disjoint paths (e.g., for different traffic types) or to replace a buffer by a set of interchangeable buffers, all having the same incoming and outgoing arcs.
B. The Use of Path Switching
Suppose some buffer, say, u1, holds a message on a guaranteed path, and in the same or an adjacent node there is an empty buffer, say, v1, also lying on some guaranteed path for this message. Then, the message can be transferred from u1 to v1, even if there is no directed arc in the buffer graph from u1 to v1, provided that this transfer is allowed by the routing algorithm. This "path switching" does not compromise deadlock avoidance since from v1 there is also a guaranteed path in the BG to the destination. Path switching increases the number of possible buffers that a message can use at each node. For instance, a message one hop away from its destination may use any empty buffer at the destination node.

Example 3.1: Suppose that in the network of Figs. 3 and 4, a message is sent on route C → D → A → B. The message will be accepted by the network only if buffer 5 is empty because the only path in the BG for this route is 5 → 6 → 7 → 8. If, when the message resides in 6, it happens that 7 is full but 3 is empty, then since 3 has a path in the same route to the destination (i.e., 3 → 4), it is possible to undertake a path switch by transferring the message to buffer 3. Although 7 is guaranteed to become available within a finite time, the use of 3 reduces the waiting time. Note that when the message arrives at 6, and 3 happens to be in use, it is not guaranteed that 3 will be available to the message in 6, for example, if 3 holds a message waiting for the release of buffer 6. Therefore, to avoid deadlock, a message is not allowed to wait indefinitely only for buffers which are reachable exclusively via path switching. That is, if in a finite time, buffers for path switching do not become available, the message should start waiting also for buffers which lie on a guaranteed path in the BG from the currently occupied buffer to the destination. (This is the motivation for Condition C8) of Theorem 2 below.)
Example 3.2: As another example, possible transfers also exist in the hops-so-far scheme in Example 2.2. For a given message, let t denote the total number of hops from source to destination. When the message has made h < t hops it can make a transfer to any buffer Bg at an adjacent node (lying on the message route) such that

$$0 \le g \le K - (t - h) + 1.$$

From each of those Bg's there is a path to the destination:

$$Bg \to B(g + 1) \to B(g + 2) \to \cdots \to B(g + (t - h) - 1),$$

and by definition of g, g + (t - h) - 1 ≤ K.

Assuming that messages are forwarded from source to destination through a sequence of buffers, each of which has a guaranteed path in the BG from itself to the destination which satisfies the routing constraints, then the following theorem can be established.

Theorem 2: Any message accepted by the network will be delivered to the destination in a finite time, and all buffers
occupied by the message will be released in a finite time, provided that C1), C2) (without requiring the waiting sets to be constant in time), C3), C4), C5), and C6) hold as in Theorem 1, and also that the following hold.
C7) A message cannot wait indefinitely only for path switching. Within a finite time after the message enters the buffer, the waiting set must include buffers on guaranteed paths from that buffer.
C8) For each message in the current node, let U denote a subset of the set of buffers being waited for in adjacent nodes (i.e., buffers in a node connected to the current node by a communications channel, with each such buffer having a guaranteed path in the BG to the destination). Within a finite time after the message enters the current node, from this time onwards until the message leaves the node:
a) U is nonempty;
b) there exists a path in the BG from the buffer now holding the message at the current node to some buffer in U;
c) elements of U are never deleted, even if the message is transferred among buffers in the current node.
This condition permits arbitrarily many path switchings within a node, but ensures departure from the node within finite time. If unbounded intranode transfers are not required, this condition can be simplified.

Schematic Proof:
1) The buffers of the BG can be labeled into classes 0, 1, 2, ..., r as in Theorem 1. Now messages can move into a buffer of either a lower or higher class. But because of C7), within a finite time after a message enters a buffer, either the message leaves the buffer or the waiting set for this message must include buffers of higher class.
2) As in Theorem 1, buffers of class r will empty in finite time and all copies of these messages will disappear in finite time.
3) The inductive proof of Theorem 1 shows each message must transfer buffers within finite time (since a higher class buffer will become available), but not always to a buffer of higher class if path switching occurs.
4) A message cannot stay forever in one node (via internal buffer transfers); it must leave the node within finite time. This holds because C8) ensures its waiting set eventually includes buffers (on guaranteed paths) in other nodes, and part 3) of this proof shows these will become available within a finite time. Then C2), C3), and C4) ensure that one of these buffers will receive the message in finite time.
5) A given message will travel only in buffers, each of which has a guaranteed path in the BG (from itself to the destination) which satisfies the routing constraints. Part 4) shows that only a finite time is spent in each node, and a5) limits the number of nodes visited. Hence the message must reach its destination in finite time.

Remark: If path switching is permitted, as above, into buffers lying on a guaranteed path to the destination, but relaxing the requirement that the original routing algorithm be satisfied, then new routes can be traversed but deadlock-free operation is still maintained. However, in this case, infinite looping is possible and additional precautions should be taken to ensure message delivery in a finite time.
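Returning to Example 3.2, the range of buffer indices g that a message may switch into under the hops-so-far scheme follows directly from the bound 0 ≤ g ≤ K - (t - h) + 1. The sketch below simply evaluates that bound; the function name and the sample values of K, t, and h are illustrative assumptions.

```python
def switch_targets(K, t, h):
    """Indices g of buffers Bg at the next node that a message may enter.

    K: maximal number of hops allowed in the network,
    t: total number of hops on this message's route,
    h: hops made so far (h < t).
    """
    if not 0 <= h < t <= K:
        raise ValueError("need 0 <= h < t <= K")
    # Bound from Example 3.2: 0 <= g <= K - (t - h) + 1.
    return list(range(0, K - (t - h) + 2))

K, t, h = 5, 3, 1          # hypothetical values
targets = switch_targets(K, t, h)
print(targets)

# The basic scheme (no path switching) uses g = h + 1, which is always admissible.
print(h + 1 in targets)
# From any admissible Bg the remaining hops end in a buffer of index at most K.
print(all(g + (t - h) - 1 <= K for g in targets))
```

Any g in this range leaves enough higher-indexed buffers to finish the remaining t - h - 1 hops after entering Bg, which is why the switch preserves a guaranteed path.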
C. The Common Buffer Pool
Via path switching, the buffers lacking arcs in the BG can be used more flexibly than as described in Section III-A. These buffers can be organized as a common buffer pool which can be used arbitrarily. Conceptually, this is done by adding an arc from each buffer in the common buffer pool to every buffer of the same node belonging to the original buffer graph and to every one of their immediate successors in the BG. These arcs ensure that from every buffer in the buffer pool, the message can be continued along any of the paths in the BG which pass through this node. Therefore, using the path switching mechanism, any buffer in the buffer pool can host any of the messages traveling through the node. Adding the common buffer pool to the BG maintains the loop-free property because those buffers have only exiting arcs.

Buffer availability is improved by reducing the number of buffers belonging only to special paths in the BG, and shifting these buffers into the common buffer pool. Section IV illustrates schemes for constructing BG's with a nearly minimal number of buffers at each node. This permits either shifting buffers into the buffer pool, or system design with small buffer requirements.

It is desirable for a node to move messages from the common buffer pool into the nonbuffer-pool buffers whenever possible, because this will increase the number of empty buffers which can be used arbitrarily. This shifting of messages is called "reclassification" and may be efficiently implemented via the logical buffer scheme of Section V.
D. Discussion
Design Freedom: Great flexibility in implementation exists because of design freedom in constructing the buffer graph (see Section IV), selecting among guaranteed paths, and selecting among possible path switchings.

Multiple Interpretations: The buffer graph concept requires only that the buffers and their allowed connections be known, and does not require that the buffers at each node be explicitly labeled in some special manner. Since buffers can be labeled in a variety of ways, multiple explanations can be given for how they are being used. For example, the hops-so-far scheme described in Example 2.2 has a BG which permits an i-hop message to travel either in buffers B0 → B1 → B2 → ... → Bi or in, say, buffers B(K - i) → B(K - i + 1) → ... → B(K - 1) → BK. If one now renames the buffers so that Bj becomes B(K - j), then the same use of the buffer graph may be newly interpreted as remaining hops, because the message now travels in buffers BK → B(K - 1) → B(K - 2) → ... → B(K - i) or in buffers Bi → B(i - 1) → ... → B1 → B0. Although an explanation (or interpretation) of the BG is unnecessary, it is sometimes convenient because it permits a straightforward implementation.

Reversed Routings: If the reversal of every permitted route is also a permitted route, then reversing the arc directions in a given BG produces another BG (e.g., the hops-so-far scheme can be converted into the hops-remaining scheme).

Internode Transfer Implementation: Link-level protocols are described in [12] which implement the proposed deadlock avoidance method. These protocols can be implemented in a way which guarantees FIFO for all messages on the same fixed route.

Fig. 5. BG for tree networks. (Buffers at each node: Bu, Bd.)
IV. SCHEMES FOR CONSTRUCTING BUFFER GRAPHS
This section describes several families of buffer graphs which are practical both regarding ease of implementation and having small buffer requirements. In [7], hop-type schemes which lead to practical buffer graphs are described. Below we generalize some of these schemes and describe new ones. Some of the schemes which follow exploit the network topology explicitly. Buffer graphs may also be constructed exploiting the specific given set of routes [13], but these may be more difficult to implement.
A. The Tree Scheme
Suppose a network with a tree topology is given, and the routing rules forbid a message to make repeated visits to any node. It is known [7] that deadlock-free operation in such a tree can be achieved using two buffers at each node. Let one node be arbitrarily designated as the root, and suppose "up" means toward the root while "down" means away from it. At each node two buffers, Bu and Bd, are allocated for traffic going up or down, respectively. Because no node can be visited more than once, when a message starts going down it cannot turn up. On the other hand, a message going up may at certain nodes turn down to a different branch of the tree. Therefore, for each node, Bd is connected to the Bd's of its sons, and Bu is connected to both Bu of its parent and Bd of the same node. Fig. 5 shows the BG for such a tree. Note that only one buffer, named B0, is needed at the root node. On the other hand, in the special case in which messages never turn down after starting up the tree, and if the root node has two buffers Bu and Bd, then only one buffer BL is necessary at each leaf node (i.e., nodes without "sons"). This buffer is used for both down-moving messages and, when empty, for accepting external traffic to the leaf node. The BG of Fig. 5 will be used in following sections as a basic building block of loop-free BG's for networks with more complex topologies.
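The arc rules of the tree scheme can be written down directly from the description above. The following sketch is an illustration under stated assumptions: the tree is given as a hypothetical child-to-parent map, buffers are `(node, name)` pairs, and the root holds the single buffer B0.

```python
def tree_buffer_graph(parent):
    """Arcs of a tree-scheme buffer graph.

    parent: dict mapping each non-root node to its parent (exactly one root assumed).
    Non-root nodes hold buffers Bu and Bd; the root holds a single buffer B0.
    """
    nodes = set(parent) | set(parent.values())
    root = next(n for n in nodes if n not in parent)

    def up(n):
        return (n, "B0") if n == root else (n, "Bu")

    def down(n):
        return (n, "B0") if n == root else (n, "Bd")

    arcs = set()
    for child, par in parent.items():
        arcs.add((up(child), up(par)))       # Bu to Bu of the parent (B0 at the root)
        arcs.add((down(par), down(child)))   # Bd (or B0) to Bd of each son
        arcs.add((up(child), down(child)))   # Bu to Bd at the same node (turning down)
    return arcs

# Hypothetical tree: a is the root; b and c are sons of a; d and e are sons of b.
parent = {"b": "a", "c": "a", "d": "b", "e": "b"}
arcs = tree_buffer_graph(parent)

# Route d -> b -> a -> c climbs to the root and then descends.
via_root = [("d", "Bu"), ("b", "Bu"), ("a", "B0"), ("c", "Bd")]
# Route d -> b -> e turns down at b into a different branch.
turn_at_b = [("d", "Bu"), ("b", "Bu"), ("b", "Bd"), ("e", "Bd")]
print(all(s in arcs for s in zip(via_root, via_root[1:])))
print(all(s in arcs for s in zip(turn_at_b, turn_at_b[1:])))
# No arc ever leaves a down buffer for anything but another down buffer.
print(all(head[1] == "Bd" for tail, head in arcs if tail[1] in ("Bd", "B0")))
```

The last check reflects why the graph is loop-free: once a message is in a down buffer it can only move further from the root.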
B. The Mesh and Tree Scheme
In some practical cases [14], two conditions may hold: 1) the network decomposes into a mesh plus a collection of trees (T1, T2, ...) such that each tree joins the mesh at one node (the node common to the mesh and tree is called the root), and any two trees are disjoint; 2) the routing of each message does not permit repeated visits to a node in a tree, and permits only a finite number of total visits to mesh nodes. In such cases, a BG can be constructed by connecting several independent BG's as follows.
1) A buffer graph BG0 is constructed for the mesh part (including the roots) in which the trees are disregarded (i.e., routes terminating (originating) in a tree are treated as terminating (originating) in the tree root). Any scheme can be used to construct BG0, provided it accommodates, in a loop-free manner, all of the desired routes within the mesh.
2) For each tree Ti, a buffer graph BGi is constructed as shown in Fig. 5, with the root node chosen to be the node in the mesh.
3) Each BGi is connected to BG0 as follows:
a) Let Su (Sd) denote the set of up (down) buffers in BGi at all nodes which are "sons" of the root node. Let S0 denote the set of buffers in BG0 at the root node for Ti.
b) In BGi, delete the root buffer (node) and all arcs attached to it.
c) Add an arc from each buffer in Su to each buffer in S0. Add an arc from each buffer in S0 to each buffer in Sd.

The combined BG is loop-free because each buffer graph BG0, BG1, ... is loop-free, the new arcs add loop-free paths from buffers Bu to buffers Bd (via buffers in BG0) but not the opposite, and a path from Bu to Bd does not close loops. The combined BG accommodates all permitted routes.

For illustration, if the mesh nodes are managed by the hop scheme of Example 2.2, then each mesh node has buffers (B0, B1, ..., Bq) where q is the maximum number of hops within the mesh. The nodes in each tree, excluding the root, have buffers and arcs as in Fig. 5. The connections between the tree buffers and the mesh buffers are shown in Fig. 6.

Because the topology is exploited, the use of such mixed schemes reduces the buffer requirements. Such mixed schemes can be created not only for joining trees and mesh, but also for many other particular cases, as shown in Section IV-E.
Fig. 6. BG for mesh-and-trees networks.

C. The "Loop Breaking" Scheme
This scheme for the construction of loop-free buffer graphs can be applied to any network topology. It is especially attractive for cases in which there is a small set of nodes whose removal will leave the remaining network loop-free.

Given a network, suppose that a set of nodes G = (G1, G2, ..., GL) is marked. These nodes are chosen such that if all of them are removed along with all communication channels connected to them, then the remaining network has no loops. (It is immaterial whether the remaining network is connected.) The remainder of the network will consist of one or more unconnected trees, in each of which at most two buffers per node are enough to avoid deadlock (see Section IV-A) for messages moving exclusively within the tree.

The proposed buffer graph is defined as follows. We assume that no route of interest has repeated visits to any node, and that q ≤ L denotes the maximal number of nodes in G which can be visited on any route. Each node in G is allocated exactly q buffers labeled (B1, B2, ..., Bq). Each nonroot node in a tree is allocated 2(q + 1) buffers labeled (Bu0, Bu1, ..., Buq, Bd0, Bd1, ..., Bdq), where buffer Bui (Bdi) is reserved for messages currently heading up (down) the tree and which have already made i visits to nodes in G. The root node in a tree has q + 1 buffers labeled (Bud0, Bud1, ..., Budq), where buffer Budi is reserved for messages which have already made i visits to G and which can be headed either up (i.e., out of) the tree or down the tree.

Fig. 7 schematically illustrates the buffer graph. The bottom plane holds the buffers for nodes in the trees which are reserved for messages which have not yet visited G: (Bu0, Bd0, Bud0). The first visit to G is represented by the second plane, which holds buffers B1 for nodes in G. The third plane is essentially a replication of the first plane, and holds buffers (Bu1, Bd1, Bud1) for messages in the tree nodes which have made exactly one visit to G. In general, copies of the first two planes are alternated, ending with a replication (Buq, Bdq, Budq) of the first plane. Buffers in each plane for tree nodes have directed arcs only to buffers in the same plane or to the next higher plane of buffers for G nodes. The G buffers have arcs to the next higher plane for tree nodes and/or to the next level of plane of buffers for G nodes. Such a structure is loop-free because each plane is loop-free and the planes are connected only in one direction.

This scheme requires q buffers per node in G, q + 1 buffers per root node in a tree, and 2q + 2 buffers per nonroot node in a tree. If q is small, the total number of buffers can be significantly smaller than the K + 1 buffers per node (K = maximum number of hops for any route, for any given set of routes) required for the hop number scheme shown in Example 2.2. For illustration, Fig. 8 shows an example of a network with 9 nodes where a loop-free path exists which visits all the nodes; hence, K = 8.
The hop number scheme requires 9 buffers per node or 81 buffers total. By taking q = 2 and G = (G1, G2) as indicated, the removal of G leaves one tree consisting of
Fig. 7. BG for loop-breaking scheme. (Planes, from bottom to top: trees before visit to set G; first visit to set G; trees after first visit to set G; second visit to set G; ...; qth visit to set G; trees after qth visit to set G.)

Fig. 8. Network illustrating loop-breaking scheme.
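The per-node buffer counts stated for the loop-breaking scheme (q per node in G, q + 1 per tree root, 2q + 2 per nonroot tree node) can be totaled and compared with the K + 1 buffers per node of the hop number scheme. The sketch below is plain arithmetic; the function names are hypothetical, and the split of the nine-node example into two G nodes, one tree root, and six nonroot tree nodes is an assumption based on the statement that removing G leaves one tree.

```python
def loop_breaking_total(q, n_g, n_roots, n_nonroot):
    """Total buffers: q per G node, q + 1 per tree root, 2q + 2 per nonroot tree node."""
    return q * n_g + (q + 1) * n_roots + (2 * q + 2) * n_nonroot

def hop_number_total(K, n_nodes):
    """Total buffers for the hop number scheme: K + 1 per node."""
    return (K + 1) * n_nodes

# Nine-node example with K = 8 and q = 2.
# Assumed split: 2 nodes in G, and the other 7 nodes form one tree (1 root, 6 nonroot).
print(hop_number_total(K=8, n_nodes=9))
print(loop_breaking_total(q=2, n_g=2, n_roots=1, n_nonroot=6))
```

Under these assumptions the loop-breaking total is roughly half of the hop number total, and the gap widens as K grows while q stays small.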
Another example consists of a ring of R > 2 nodes, with any routes permitted which do not visit a node more than once. The hop-number scheme requires R buffers per node, while the loop-breaking scheme with q = L = 1 requires 1 buffer at node G1, 2 buffers at the root node of the tree, and 4 buffers at each of the R - 2 nonroot tree nodes. For large rings, this is significantly better than the hop-number scheme.

D. The "Valley Counting" Scheme
This scheme can be applied to construct a buffer graph for any network topology. This scheme is based on valleys in the encountered node numbers, and requires half as many buffers as the "peaks and valleys" deadlock avoidance method given in [7]. A "peaks" scheme can be developed which completely parallels the "valley" scheme to be described below.

Given any N-node network, assign distinct static numbers n1, n2, n3, ..., nN to the nodes. Any static or adaptive routing algorithm is permitted which ensures message delivery within a bounded number of hops, i.e., a bounded number of repeated visits to any node is allowed. A route which visits three consecutive nodes, say a → b → c, is said to have a valley at node b if nodes a and c have higher node numbers than node b. Since all messages depart after a bounded number of hops, there exists an integer K ≥ 0 such that no route has more than K valleys.

A loop-free buffer graph is constructed with 2K + 2 buffers per node (the method in [7] requires 4K + 2 buffers per node). The buffers at each node are named (u0, d0, u1, d1, ..., uK, dK). For adjacent nodes ns and nt, with ns having the lower node number, the arcs are as follows.
1) At each node, buffer ui (0 ≤ i ≤ K) is connected to buffer di at the same node.
2) At each node, buffer di (0 ≤ i ≤ K - 1) is connected to buffer u(i + 1) at the same node.
3) At node ns, buffer ui (0 ≤ i ≤ K) is connected to buffer ui at node nt.
4) At node nt, buffer di (0 ≤ i ≤ K) is connected to buffer di of node ns.

The BG is loop-free because of the following.
1) For fixed i, a path in the BG consisting exclusively of ui buffers involves monotonically increasing node numbers; hence, it cannot form a loop.
2) Similarly, for fixed i, a path in the BG consisting exclusively of di buffers cannot form a loop.
3) As shown in Fig. 9, the BG consists of alternating layers of buffers (u0), (d0), (u1), (d1), ..., (uK), (dK), and arcs between layers lead only from one layer to the next in this order.
Fig. 9. BG for valley-counting scheme.

It remains to be shown that any route in the physical network has a corresponding path in the BG. Let the sequence of node numbers m1, m2, m3, ..., mq denote any arbitrary route. The corresponding path in the buffer graph starts in buffer u0 of node m1, and continues in buffer u0 of successive nodes as long as the node numbers are increasing. If a node mp is encountered whose successor has a smaller node number, i.e., node mp is the first "peak" in the route, then the path shifts from buffer u0 to buffer d0 within node mp, and then to buffer d0 of node m(p + 1). The path continues through d0 buffers of successive nodes as long as the node numbers are decreasing. If the path encounters a node mv whose successor node has a higher node number, i.e., node mv is a valley, then the path shifts from buffer d0 to buffer u1 within node mv, and continues to buffer u1 of the successor node m(v + 1).

The path continues similarly to the end, with up-steps taken in buffers uj, with switches at a peak node from buffer uj to buffer dj, with down-steps taken in buffers dj, and with switches at a valley node from buffer dj to u(j + 1). Since at most K valleys are encountered, each node requires at most K + 1 each of d-buffers and u-buffers.

E. Interconnection of Subnetworks

A hierarchical scheme may be used in many practical situations to generate efficient new buffer graphs from old ones. See also [7, Theorem 5].

Consider a network of N nodes, communication channels between some pairs of nodes, and suppose that a finite set R of permitted routes is given. Suppose that the nodes of the network are partitioned into m < N nonempty subnetworks s1, s2, ..., sm. That is, the network may be abstractly viewed as consisting of "aggregate nodes" s1, s2, ..., sm with a "communication channel" between aggregate nodes si and sj if and only if a communication channel exists between some pair of the original nodes in si and sj.

Let R1 denote the set of all permitted routes among the subnetworks, i.e., a route in R1 is deduced from the corresponding route in R by identifying the sequence of subnetworks visited by a message following the route in R. Let R2i, 1 ≤ i ≤ m, denote the set of routes within subnetwork si which are used by R, i.e., any list of consecutive nodes in a route of R, all of which lie in si, belongs to R2i.

The buffer graph BG for the given network can be constructed in the following three steps. First, construct a loop-free buffer graph BG1 which accommodates all routes in R1 for the abstract network of aggregate nodes. Second, construct BG as the collection, over all buffers b of BG1, of buffer graphs BG(b) defined next. Each BG(b), where b is at, say, aggregate node si, is a loop-free buffer graph which accommodates all the routes of R2i. Third, if in BG1, buffer b is connected to buffer b' ≠ b, then in BG, an arc is added from a buffer in BG(b) to a buffer in BG(b') whenever those buffers lie in the same node or in nodes connected by a communications channel.

The combined BG accommodates all routes R because any route in R is a concatenation of routes of R2i, 1 ≤ i ≤ m, according to one of the sequences of R1. The combined BG is also loop-free because each BG(b) is loop-free, and since BG1 is loop-free, the arcs added in the third step cannot close loops.

The mesh and tree scheme of Section IV-B can be constructed by an interconnection of subnetworks. In this case, the mesh, and each tree node (excluding the root node), are considered as subnetworks. Here BG1 will be as in Fig. 5, each BG(b) will be a single buffer, and BG will have the form of Fig. 6. The loop-breaking scheme of Section IV-C may also be interpreted as an interconnection of subnetworks. In this case, the nodes of the set G are taken as distinct subnetworks, and all of the remaining nodes are taken as a single subnetwork. Here BG1 agrees with Fig. 7 provided each plane labeled "trees" is regarded as a single buffer.

A simple deadlock-free connection of several subnetworks can be accomplished by providing buffers at each node with a two-part index (s, t) where s (for BG1) indicates, for instance, the valleys-so-far in subnetwork number, and t [for BG(b)] indicates, for instance, the hops-so-far within the current subnetwork.

The interconnection of subnetworks can be recursively applied, in the sense that while constructing BG(b) for some subnetwork, this subnetwork can itself be decomposed into subnetworks, applying the method again to each subnetwork. In other words, the original network can be hierarchically decomposed into subnetworks. Conversely, any given set of deadlock-free networks can be combined into a deadlock-free supernetwork while preserving its internal buffer scheme [the BG(b)] by replicating it as many times as BG1 demands.

V. LOGICAL BUFFERS

The schemes in previous sections assume that real buffers are actually earmarked (reserved) to match the buffers in the buffer graph, and that a message fits into one real buffer. Here we describe a way of managing logical buffers at each
node so that real buffers are not physically earmarked, and so that multibuffer messages can be accommodated. The use of counters to manage logical buffers also permits rapid reclassification of messages from one buffer type to another, thereby improving performance within the network.

Suppose a loop-free BG is given which accommodates all the desired routes. For a given node in the network, let [...] buffers must be provided at the node. Furthermore, if TCi denotes the number of physical buffers currently containing messages assigned to logical buffer ci, then RCi = max(M - TCi, 0) empty physical buffers must be reserved for messages of logical class ci. These buffers are kept empty until either arrival of additional messages of class ci or reclassification of a current message into class ci.

Because of the possibilities of path switching and of multiple guaranteed paths, a message at a node might conceivably fit in any of several logical buffers. However, each message at a node must be actually associated with exactly one logical buffer ci, and all physical buffers actually occupied by this message will also be considered to be currently of class ci until this message departs or is reclassified. Consequently,

$$TC_i = \sum \{x_j \mid m_j \text{ is assigned to logical buffer class } c_i\}$$

where {m1, m2, ..., mp} denotes the messages in the node and {x1, x2, ..., xp} denotes the numbers of physical buffers these messages occupy. The total number of reserved empty buffers at the node is

$$R = \sum_{i=1}^{t} RC_i.$$
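The counter bookkeeping above is easy to mis-read, so a small worked sketch may help. It is illustrative only: the function name `reserved_buffers` and the data layout are hypothetical, and M is assumed here to be the per-class reservation target measured in physical buffers (taken to be the size of the largest message).

```python
def reserved_buffers(M, classes, messages):
    """Return (TC, RC, R) for one node.

    M        : per-class reservation target in physical buffers
               (assumption: the size of the largest message)
    classes  : names of the logical buffer classes c1..ct at the node
    messages : list of (class_name, x) pairs, where x is the number of
               physical buffers the message occupies
    """
    # TC_i: physical buffers holding messages currently assigned to class ci
    TC = {c: 0 for c in classes}
    for c, x in messages:
        TC[c] += x
    # RC_i = max(M - TC_i, 0): empty buffers still reserved for class ci
    RC = {c: max(M - TC[c], 0) for c in classes}
    # R: total number of reserved empty buffers at the node
    R = sum(RC.values())
    return TC, RC, R


if __name__ == "__main__":
    TC, RC, R = reserved_buffers(
        M=3,
        classes=["c1", "c2", "c3"],
        messages=[("c1", 2), ("c1", 2), ("c2", 1)],
    )
    print(TC, RC, R)
```

In this example class c1 already holds more than M buffers' worth of messages, so nothing further is reserved for it, while the empty class c3 keeps a full reservation of M buffers.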
If E denotes the actual number of empty physical buffers at the node, then in order to guarantee that the network is deadlock-free, we must ensure that R ≤ E holds at all times. This implies that the node could now accept messages of a given logical class ci, 1 ≤ i ≤ t, which occupy no more than Ji = E - R + RCi physical buffers. Similarly, it could now accept messages which occupy up to J0 = E - R + min {RCi | 1 ≤ i ≤ t} physical buffers, irrespective of their logical classes.

A polling scheme which implements the deadlock avoidance method is described below. We assume also that the node is not allowed to discard incoming messages, and therefore it can issue a poll only when it can commit buffer space for the messages which can be received in response. A poll will be responded to within finite time, either with messages or with a null response if there is no message to send. For simplicity, we assume that a new poll can be sent only after the previous poll has been responded to, but the same ideas can be generalized to accommodate multiple outstanding polls. We assume also that the length of a message is not known by the polling node until the message is received.

When J0/M is larger than the maximum number of messages which can be sent in response to a poll, then "send everything" polling is used. (Hopefully, this situation occurs most of the time.) If not, but J0/M ≥ 1, then the node can issue a poll for at most ⌊J0/M⌋ messages of arbitrary classes. If J0 < M, the node must resort to selective polling for a single message of any class ci such that Ji ≥ M. Whenever every Ji < M, the node will suspend polling. For nonpolling implementation, see [12].

The buffer counters TCi are incremented or decremented upon message arrivals, departures, or reclassifications, with recomputation of RCi, Ji, and R. As described in Section III-C, the main goal of reclassification is to increase the threshold for selective polling by increasing J0 (or decreasing R).

Although this section has not explicitly treated the common buffer pool, all of its properties have been preserved in the implementation. The buffer counter scheme has the additional advantage of automatically putting as much as possible of an incoming (or already-present) class ci message into the reserved empty buffers RCi, and then putting the remainder of the message, if any, into the buffer pool. This is a simple form of reclassification which postpones the threshold for selective polling.

VI. CONCLUSIONS

The schemes for constructing buffer graphs, in conjunction with the different possibilities of using them for the actual forwarding of the messages, provide a large number of alternative ways of applying to actual networks the deadlock avoidance ideas presented in this paper. For conventional networks, many of these alternatives are a viable, practical solution to the store-and-forward-deadlock problem. The examples throughout the paper demonstrate that the buffer requirements for typical buffer graphs are modest, and that low overhead implementations can be designed.

REFERENCES
[1] L. Kleinrock, "ARPANET lessons," in Conf. Rec., Int. Conf. Commun., Philadelphia, PA, June 14-16, 1976.
[2] D. W. Davies and D. L. A. Barber, Communication Networks for Computers. New York: Wiley, 1973.
[3] R. E. Kahn and W. R. Crowther, "Flow control in a resource-sharing computer network," IEEE Trans. Commun., vol. COM-20, pp. 539-546, 1972.
[4] P. M. Merlin and P. J. Schweitzer, "Deadlock avoidance in store-and-forward networks-II: Other deadlock types," this issue, pp. 355-360.
[5] E. G. Coffman, Jr., M. J. Elphick, and A. Shoshani, "System deadlocks," ACM Comput. Surveys, vol. 3, pp. 67-78, 1971.
[6] R. C. Holt, "Some deadlock properties of computer systems," ACM Comput. Surveys, vol. 4, pp. 179-196, 1972.
[7] K. D. Gunther, "Prevention of buffer deadlocks in packet-switching networks," rep. presented at the IFIP-IIASA Workshop on Data Commun., Laxenburg, Austria, Sept. 15-19, 1975.
[8] E. Raubold and J. Haenle, "A method of deadlock-free resource allocation and flow control in packet networks," in Proc. 3rd Int. Conf. Comput. Commun., Toronto, Ont., Canada, Aug. 3-6, 1976.
[9] W. L. Price and J. D. Haenle, "Some comments on simulated datagram store-and-forward networks," Comput. Networks, vol. 2, pp. 70-73, 1978.
[10] A. Giessler, J. Hanle, A. Konig, and E. Pade, "Free buffer allocation-An investigation by simulation," Comput. Networks, vol. 2, pp. 191-208, 1978.
[11] R. C. Chen, "Bus communication systems," Ph.D. dissertation, Dep. Comput. Sci., Carnegie-Mellon Univ., Pittsburgh, PA, Jan. 1974, NTIS-PB-235 897.
[12] P. M. Merlin and P. J. Schweitzer, "Deadlock avoidance in store-and-forward networks I: Store-and-forward deadlock," IBM T. J. Watson Res. Cen., Yorktown Heights, NY, Rep. RC-6624, July 1977.
[13] B. Gavish, P. M. Merlin, and P. J. Schweitzer, "Minimal buffer requirements for deadlock avoidance in store-and-forward networks," IBM J. Res. Develop., to be published.
[14] Program Product, "Introduction to advanced communication function. Multiple system data communication networks," IBM Corp., Dep. E01, P. O. Box 12195, Research Triangle Park, NC 27709, GC30-3033-0, Oct. 1976.
Philip M. Merlin (S'74-M'76) received the B.S. (cum laude) and M.S. degrees in electrical engineering from the Technion-Israel Institute of Technology, Haifa, and the Ph.D. degree in information and computer science from the University of California, Irvine, in 1971, 1973, and 1974, respectively.
From 1971 to 1973 he led several projects on digital systems and computer communications at the Technion. During 1974 he was employed by the University of California, Irvine; from 1974 to 1977 he was with the IBM T. J. Watson Research Center; and from 1977 to 1979 he was a faculty member with the Department of Electrical Engineering, Technion. Since 1974 he had been engaged in research activities on distributed computer systems, computer networks, communication protocols, Petri nets, and recoverability. He is now deceased.
Dr. Merlin was several times a Visiting Faculty Member at the IBM T. J. Watson Research Center; a British Science Research Council Senior Visiting Fellow at the University of Newcastle-upon-Tyne; and a consultant with Intel. He was a member of the Association for Computing Machinery and the IEEE Computer Society.
Paul J. Schweitzer was born in New York, NY, on January 16, 1941. He received B.Sc. degrees in physics and mathematics from M.I.T., Cambridge, in 1961, and the Sc.D. degree in physics from M.I.T. in 1965.
He was a member of the Institute for Defense Analyses, Arlington, VA, from 1965 to 1970; a Visiting Associate Professor of Operations Research in the Technion-Israel Institute of Technology, Haifa, from 1970 to 1972; and a Research Staff member in the Computer Science Department, IBM Watson Research Center, Yorktown Heights, NY, from 1972 until 1977. He has been a Professor in the Graduate School of Management at the University of Rochester, Rochester, NY, since 1977. His research interests include performance evaluation of telecommunications networks, computer protocols, queueing networks, optimization algorithms, and Markovian decision processes.
A Minimum Delay Routing Algorithm Using Distributed Computation

ROBERT G. GALLAGER, FELLOW, IEEE

Abstract-An algorithm is defined for establishing routing tables in the individual nodes of a data network. The routing table at a node i specifies, for each other node j, what fraction of the traffic destined for node j should leave node i on each of the links emanating from node i. The algorithm is applied independently at each node and successively updates the routing table at that node based on information communicated between adjacent nodes about the marginal delay to each destination. For stationary input traffic statistics, the average delay per message through the network converges, with successive updates of the routing tables, to the minimum average delay over all routing assignments. The algorithm has the additional property that the traffic to each destination is guaranteed to be loop free at each iteration of the algorithm. In addition, a new global convergence theorem for noncontinuous iteration algorithms is developed.

Manuscript received March 16, 1976; revised September 15, 1976. This work was supported in part by the Advanced Research Projects Agency of the Department of Defense under Grant N00014-75-C-1183, in part by the National Science Foundation under Grant NSF-ENG7514103, and in part by Codex Corporation, Newton, MA 02195. This paper was presented at the International Conference on Communications, Philadelphia, PA, June 14-16, 1976. The author is with the Department of Electrical Engineering and Computer Science and the Electronic Systems Laboratory/Research Laboratory for Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139.
INTRODUCTION

THE problem of routing assignments has been one of the most intensively studied areas in the field of data networks in recent years. These routing problems can be roughly classified as static routing, quasi-static routing, and dynamic routing. Static routing can be typified by the following type of problem. One wishes to establish a new data network and makes various assumptions about the node locations, the link locations, and the capacities of the links. Given the traffic between each source and destination, one can calculate the traffic on each link as a function of the routing of the traffic. If one approximates the queueing delays on each link as a function of the link traffic, one can calculate the expected delay per message in the network. The problem then is to choose routes in such a way as to minimize expected delay. This is a multicommodity flow problem, and the reader is referred to Cantor and Gerla [1] for a particularly elegant algorithm and for other references. Quasi-static routing problems can be typified by the following situation. A data network is in operation, but over
Reprinted from IEEE Transactions on Communications, vol. COM-25, no. 1, January 1977.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
time, new source-receiver pairs establish data transmission sessions and old sessions are terminated. It is necessary at the very least to establish routes for these new sessions and it might in addition be desirable to occasionally change routes for established sessions or to change the fraction of the traffic for a session that takes different routes. Over a longer range time scale, links or nodes fail, new links and nodes are added, and routings must be changed accordingly. The usual approach to this problem is to have a special node in the network that makes all decisions about routings. In principle such a node periodically gets information from all the other nodes about traffic requirements and uses this information to solve the current static routing problem. Such a strategy seems simple and straightforward, but in fact it is not. First there is the need for protocols for the nodes in the network to send updating information to the control node. Similarly protocols are required for the control node to send its routing decisions to the other nodes. There is also a serious problem about what to do when nodes or links in the network fail. The routes by which notification of such catastrophes is sent to the control node might in fact be destroyed by the catastrophe. Finally, there is the possibility that a failure of the control node may cause the whole network to fail. The point of this is not that central node routing is unworkable, but rather to convince the reader that the problem of communicating information about routing through a network is conceptually as difficult as making routing decisions once all the information is available.
Finally, dynamic routing refers to the kinds of problems that arise in a network when messages or packets are routed according to the instantaneous states of the queues at the links of the network. The routing of a particular message or packet is not determined when it enters the network; instead, each node that receives the message selects the next node to which the message is routed on its path to the destination. Here, in addition to the problem of determining an algorithm to make these decisions, there is also the problem of conveying information about queue lengths through the network and the problem of coping with lost messages and messages which arrive out of order at the destination node.

Our major interest here is in distributed algorithms for quasi-static routing, i.e., in algorithms in which each node constructs its own routing tables based on periodic updating information from neighboring nodes. We first develop a number of theoretical results that should be applicable to any such algorithm and then we develop a particular algorithm. The analysis is based on a static model with stationary traffic inputs and an unchanging network. We show that the average delay per message converges under these conditions to the minimum over all routing assignments. We have not addressed the problem of how well the algorithm adapts to variations in the input traffic or the network. Qualitatively, an algorithm's ability to adapt to variations is intimately connected with its speed of convergence in the static case and with its robustness. We feel that distributed algorithms have important advantages in both these areas. A distributed algorithm can react rapidly to a local disturbance at the point of the disturbance with slower "fine tuning" in the rest of the network. The robustness comes from lack of reliance on a central node that might
fail and from avoiding the "chicken and egg" problem of centralized routing where one needs routes to transmit the routing information required to establish routes.

The algorithm here is quite similar to the algorithm used in the Advanced Research Projects Agency Network (ARPANET) [2]. The major difference is that the ARPANET attempts to send each packet over a route that minimizes that packet's delay with no regard to other packets' delays, whereas here packets are sent over routes to minimize the overall delay of all messages. This difference between "user optimization" and "system optimization" was evidently first noticed by Pigou [3], later used by Dafermos and Sparrow [4], and then by Agnew [5], [6]. Agnew analyzed a network with a single source and destination and described an algorithm very similar to that described here. Kahn and Crowther [7] also developed a distributed algorithm which meters traffic so as to change routes slowly in response to quasi-static variations. Stern [8] developed another distributed algorithm based on an electrical network analogy of a communication network. Finally our algorithm has similarities to the centralized flow deviation strategy of Fratta et al. [8]. Their algorithm was the first to effectively exploit the marginal change in network delay with a change in link flow, a notion which we also use extensively.

One important characteristic of the algorithm, not possessed by any other routing algorithm to our knowledge, is its property of being loop free at every iteration. Aside from reducing delay, it appears that loop freedom can be important in simplifying higher level protocols. In fact, the major reason for building loop freedom into the algorithm was to prevent a potential deadlock in the protocol for communicating update information between the nodes.

FORMULATION OF THE MODEL
Let the nodes of an $n$-node network be represented by the integers $1, 2, \ldots, n$ and let a link from node $i$ to node $k$ be represented by $(i,k)$. Let $L$ be the set of links, $L = \{(i,k)$: a link goes from $i$ to $k\}$. In order to discuss traffic flow, we distinguish link $(i,k)$ from $(k,i)$, but assume that if one exists the other does also. Let $r_i(j) \ge 0$ be the expected traffic, in bits/s, entering the network at node $i$ and destined for node $j$ (see Fig. 1). We assume that this input traffic forms an ergodic process such as, for example, a Poisson process of message arrivals with a geometric distribution on message lengths. Let $t_i(j)$ be the total expected traffic (or node flow) at node $i$ destined for node $j$. Thus $t_i(j)$ includes both $r_i(j)$ and the traffic from other nodes that is routed through $i$ for destination $j$. Finally let $\phi_{ik}(j)$ be the fraction of the node flow $t_i(j)$ that is routed over link $(i,k)$. The node flows then satisfy

$$t_i(j) = r_i(j) + \sum_l t_l(j)\,\phi_{li}(j), \qquad \text{all } i, j. \tag{1}$$
Fig. 1. Nodes, links, and inputs in a data network.

Equation (1) implicitly expresses the conservation of flow at each node; the expected traffic into a node for a given destination is equal to the expected traffic out of the node for that destination. Note that (1) deals with expected traffic and thus does not preclude the existence of traffic queues at the nodes. Now let $f_{ik}$ be the expected traffic, in bits/s, on link $(i,k)$ (with $f_{ik} = 0$ if $(i,k) \notin L$). Since $t_i(j)\phi_{ik}(j)$ is the traffic destined for $j$ on $(i,k)$, we have

$$f_{ik} = \sum_j t_i(j)\,\phi_{ik}(j). \tag{2}$$
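Equations (1) and (2) form a linear system, so for a fixed destination the node flows can be obtained by solving $(I - \Phi^T)t = r$ and the link flows follow by multiplication. The sketch below is illustrative only: the function names `node_flows` and `link_flows`, the list-of-lists layout, and the three-node example are assumptions, and plain Gauss-Jordan elimination is used simply to keep the example free of external libraries.

```python
def node_flows(r, phi):
    """Solve equation (1) for one destination j.

    r   : r[i] is the input rate r_i(j) at node i
    phi : phi[i][k] is the fraction of node i's flow for j sent on link (i, k)
    Returns the list t with t[i] = r[i] + sum_l t[l] * phi[l][i].
    """
    n = len(r)
    # Augmented matrix of the system (I - phi^T) t = r
    a = [[(1.0 if i == l else 0.0) - phi[l][i] for l in range(n)] + [r[i]]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting, then eliminate the column from every other row
        piv = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def link_flows(t_by_dest, phi_by_dest, n):
    """Equation (2): f[i][k] = sum over destinations j of t_i(j) * phi_ik(j)."""
    f = [[0.0] * n for _ in range(n)]
    for j, t in t_by_dest.items():
        for i in range(n):
            for k in range(n):
                f[i][k] += t[i] * phi_by_dest[j][i][k]
    return f


if __name__ == "__main__":
    # Hypothetical 3-node network, single destination j = 2.
    phi = [[0.0, 0.5, 0.5],   # node 0 splits its traffic between 1 and 2
           [0.0, 0.0, 1.0],   # node 1 sends everything to 2
           [0.0, 0.0, 0.0]]   # the destination forwards nothing
    t = node_flows([4.0, 2.0, 0.0], phi)
    print(t)
    print(link_flows({2: t}, {2: phi}, 3))
```

In the example node 1 carries its own input of 2 plus half of node 0's input of 4, so its node flow is 4, and the destination absorbs the full 6 units.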
In what follows we refer to the set of expected inputs $\{r_i(j)\}$ as the input set $r$, the set of expected total node flows $\{t_i(j)\}$ as the node flow set $t$, the set of fractions $\{\phi_{ik}(j)\}$ as the routing variable set $\phi$, and the set of expected link traffics $\{f_{ik}\}$ as the link flow set $f$. We have seen for an arbitrary strategy of routing (subject to the existence of the expectations $\{t_i(j)\}$ and the conservation of flow) that $r$, $t$, $\phi$, and $f$ all have meaning and satisfy (1) and (2). We are interested in distributed routing algorithms in which each node $i$ chooses its own routing variables $\phi_{ik}(j)$ for each $k$, $j$. The question then arises whether the inputs $r$ and the routing variable set $\phi$ uniquely specify $t$ and $f$. Before answering this question, we define $\phi$ precisely, adding one additional constraint.

Definition: A routing variable set $\phi$ for an $n$-node network with links $L$ is a set of nonnegative numbers $\phi_{ik}(j)$, $1 \le i, k, j \le n$, satisfying the following conditions.

1) $\phi_{ik}(j) = 0$ if $(i,k) \notin L$ or if $i = j$.
2) $\sum_k \phi_{ik}(j) = 1$.
3) For each $i$, $j$ ($i \ne j$) there is a routing path from $i$ to $j$, which means there is a sequence of nodes $i, k, l, \ldots, m, j$ such that $\phi_{ik}(j) > 0$, $\phi_{kl}(j) > 0$, ..., $\phi_{mj}(j) > 0$.
[...] inputs or from other nodes outside the set, then there are multiple solutions to (1); 2) otherwise there is no solution to (1). Physically, the first case above corresponds to a set of nodes which have no traffic for a given destination coming in or going out, but which might have some messages circulating around within the set. The second case corresponds to traffic coming into the set for a given destination, but none going out, leading to an infinite build-up of queues or lost traffic.

The more customary way to treat routing in a network is to regard it as a multicommodity flow problem (see, for example, Frank and Chou [9]). The traffic flow to each destination can be regarded as a commodity, and then (1) is equivalent to the multicommodity flow constraints. Our restrictions on the routing variables $\phi$ are somewhat more restrictive than the usual multicommodity flow constraints. In particular $\phi_{jk}(j) = 0$ prevents traffic at a destination $j$ from looping back into the network, and the existence of routing paths prevents the isolated looping referred to in case 1) above.

We have seen that any routing policy, subject to the previously mentioned restrictions, leads to the sets $t$, $\phi$, and $f$, and any distributed algorithm in which $\phi$ is selected by the individual nodes leads to a unique $t$, $f$. We now turn our attention to delay of messages in the network. Let $D_{ik}$ be the expected number of messages/s transmitted on link $(i,k)$ times the expected delay/message (including queueing delays at the link input). Assume that $D_{ik}$ is a function only of the link flow $f_{ik}$, i.e., that $D_{ik}$ depends on the routing variables only through $f_{ik}$. We also make the assumption that messages are delayed only by the links of the network. This is reasonable if the processing time at an intermediate node is associated partly with the link on which the message arrives and partly with the link on which it departs.

It can now be seen with a little thought (or see Kleinrock [11]) that the total expected delay per message times the total expected number of message arrivals/s is given by

$$D_T = \sum_{i,k} D_{ik}(f_{ik}). \tag{3}$$
Since $f_{ik} = 0$ for $(i,k) \notin L$, we also take $D_{ik}(f_{ik}) = 0$ for $(i,k) \notin L$. Since the total message arrival rate is independent of the routing algorithm, we can minimize the expected delay/message on the network by minimizing $D_T$ over all choices of routing variables (recall that $f$ is a function of $r$ and $\phi$). The algorithm we describe subsequently will be an iterative algorithm for performing this minimization.

Before proceeding, however, we should point out some of the consequences of our assumption that $D_{ik}$ is a function only of $f_{ik}$. Suppose that there are two paths from node $i$ to $j$ and that half the traffic is sent over each path, but the delay is greater on one path than the other. Then we could reduce the delay/message by sending the short messages over the small-delay path and the long messages over the long-delay path. Keeping the same traffic (in bits/s) on each path, we would have more messages on the short path than the long, and thus would reduce delay/message. The assumption that $D_{ik}$ is a function only of $f_{ik}$ restricts us from comparing such alternatives. Another consequence arises with dynamic routing, where one would hope to reduce the queueing delays on the links
without reducing the long-term expected link flow. This, however, would change the functions $D_{ik}(f_{ik})$. Thus our assumption effectively masks the distinctions between dynamic and quasi-static routing (and for this reason makes the problem analytically tractable).

Kleinrock [11] showed that if queueing delays are the only nonnegligible source of delay in a network, and if each link traffic can be modeled as Poisson message arrivals with independent exponentially distributed lengths, then $D_{ik}(f_{ik}) = f_{ik}/(C_{ik} - f_{ik})$ where $C_{ik}$ is the capacity of link $(i,k)$. This formula has also been refined to account for overhead and propagation delays (Kleinrock [12]). For our purposes, it is immaterial what function $D_{ik}$ is, although we shall make the reasonable assumption that $D_{ik}$ is increasing and convex $\cup$ in $f_{ik}$. Before describing the algorithm, we develop necessary and sufficient conditions on $\phi$ to minimize $D_T$.
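As a concrete instance of the delay model, the sketch below evaluates Kleinrock's formula $D_{ik}(f_{ik}) = f_{ik}/(C_{ik} - f_{ik})$ and sums it over links as in (3). The derivative $C_{ik}/(C_{ik} - f_{ik})^2$ follows by differentiating that formula and is positive and increasing for $0 \le f_{ik} < C_{ik}$, consistent with the assumption that $D_{ik}$ is increasing and convex. The function names, the dictionary layout keyed by link, and the capacities are hypothetical choices for illustration.

```python
def link_delay(f, c):
    """Kleinrock's formula D(f) = f / (C - f), valid for 0 <= f < C."""
    if not 0 <= f < c:
        raise ValueError("link flow must satisfy 0 <= f < C")
    return f / (c - f)


def marginal_link_delay(f, c):
    """Derivative of f / (C - f) with respect to f, i.e. C / (C - f)^2."""
    if not 0 <= f < c:
        raise ValueError("link flow must satisfy 0 <= f < C")
    return c / (c - f) ** 2


def total_delay(flows, capacities):
    """Equation (3): sum of the link delays over all links (i, k)."""
    return sum(link_delay(flows[link], capacities[link]) for link in flows)


if __name__ == "__main__":
    # Hypothetical link flows and capacities, in the same units (e.g. bits/s).
    flows = {(0, 1): 2.0, (0, 2): 2.0, (1, 2): 4.0}
    capacities = {(0, 1): 8.0, (0, 2): 8.0, (1, 2): 8.0}
    print(total_delay(flows, capacities))
    print(marginal_link_delay(4.0, 8.0))
```

Any other increasing convex delay function could be substituted for `link_delay` without changing `total_delay`.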
NECESSARY AND SUFFICIENT CONDITIONS FOR MINIMUM DELAY

First we calculate the partial derivatives of the total delay $D_T$ with respect to the inputs $r$ and the routing variables $\phi$.
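Assume a small increment $\epsilon$ in the input $r_i(j)$. For each adjacent node $k$, an increment $\epsilon\phi_{ik}(j)$ of this new incoming traffic will flow over $(i,k)$, and to first order, this will cause an incremental delay on that link of

$$\epsilon\,\phi_{ik}(j)\,D_{ik}'(f_{ik}), \tag{4}$$

where $D_{ik}'(f_{ik})$ denotes the derivative of $D_{ik}$ with respect to $f_{ik}$. If node $k$ is not the destination node, then the increment $\epsilon\phi_{ik}(j)$ of extra traffic at node $k$ causes, to first order, a further incremental delay of $\epsilon\phi_{ik}(j)\,\partial D_T/\partial r_k(j)$ from node $k$ onward. Summing over all adjacent nodes $k$, we find that

$$\frac{\partial D_T}{\partial r_i(j)} = \sum_k \phi_{ik}(j)\left[D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k(j)}\right]. \tag{5}$$

We take $\partial D_T/\partial r_j(j) = 0$ in this and subsequent equations and also take terms for which $(i,k) \notin L$ to be 0. Theorem 2, which follows, gives a rigorous justification of (5). Agnew [5], [6] develops an equation similar to (5) but omits the final term $\partial D_T/\partial r_k(j)$; his algorithm, however, effectively includes the effect of this term.

Next consider $\partial D_T/\partial\phi_{ik}(j)$. An increment $\epsilon$ in $\phi_{ik}(j)$ causes an increment $\epsilon t_i(j)$ in the traffic destined for $j$ on link $(i,k)$ and, if $k \ne j$, in the traffic arriving at node $k$. By the same argument as for (5),

$$\frac{\partial D_T}{\partial\phi_{ik}(j)} = t_i(j)\left[D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k(j)}\right]. \tag{6}$$

Theorem 2: Let a network have inputs $r$ and routing variables $\phi$ [...]. Furthermore, (6) is valid and both $\partial D_T/\partial r_i(j)$ and $\partial D_T/\partial\phi_{ik}(j)$ for $i \ne j$, $(i,k) \in L$ are continuous in $r$ and $\phi$.

This theorem is proved in Appendix A. The appendix also gives explicit expressions for $\partial D_T/\partial r_i(j)$ and $\partial D_T/\partial\phi_{ik}(j)$, but it turns out that the implicit forms in (5) and (6) are needed in the algorithm to be presented. One might now hope that all that is required to minimize $D_T$ is to find a stationary point for $D_T$ with respect to variations in $\phi$. Using Lagrange multipliers for the constraint $\sum_k \phi_{ik}(j) = 1$, and taking into account the constraint $\phi_{ik}(j) \ge 0$, the necessary conditions for $\phi$ to be a stationary point are, for all $i \ne j$, $(i,k) \in L$,

$$\frac{\partial D_T}{\partial\phi_{ik}(j)} \begin{cases} = \lambda_{ij}, & \phi_{ik}(j) > 0 \\ \ge \lambda_{ij}, & \phi_{ik}(j) = 0. \end{cases} \tag{7}$$

This states that for a given $i$, $j$, all links $(i,k)$ for which $\phi_{ik}(j) > 0$ must have the same marginal delay $\partial D_T/\partial\phi_{ik}(j)$ [...]

$$D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k(j)} \ge \frac{\partial D_T}{\partial r_i(j)}. \tag{8}$$

This theorem is proved in Appendix B. Note that the theorem does not assert the existence of a minimum; the conditions of the theorem do not even assert that $\psi$ is nonempty. Note also that if we multiply both sides of (8) by $\phi_{ik}(j)$ and sum over $k$, then we see from (5) that (8) must be satisfied with equality for $\phi_{ik}(j) > 0$ [...]

$$D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k(j)} - \min_{m:(i,m)\in L}\left[D_{im}'(f_{im}) + \frac{\partial D_T}{\partial r_m(j)}\right] \ge 0 \tag{9}$$

for all $i \ne j$, $(i,k) \in L$, with equality for $\phi_{ik}(j)$ greater than 0.

THE ALGORITHM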
Fig. 2. Inflection point in $D_T(\phi)$.

The general structure of an algorithm to minimize $D_T$ (assuming stationary traffic inputs) should now be clear. Each node $i$ must incrementally decrease those routing variables [...]
value $\partial D_T/\partial r_i(j)$ separately on a link, and indeed the inefficiency would be high if each such number required an individual packet. However, the routing update information could easily be piggybacked on other packets, requiring very little overhead. One might also object to the time required for the updating to propagate through the network, but speed is relatively unimportant in a quasi-static algorithm.

We shall later define one small but important detail that has been omitted so far in the updating protocol between nodes; a small amount of additional information is necessary for the algorithm to maintain loop freedom. It turns out to be necessary, for each destination j and each node i, to specify a set $B_i(j)$ of blocked nodes k for which $\phi_{ik}(j) = 0$ and the algorithm is not permitted to increase $\phi_{ik}(j)$ from 0. The algorithm A maps the current routing variables $\phi$ into a new set $\phi^1 = A(\phi)$ as follows. For $k \in B_i(j)$,

$$\phi_{ik}^1(j) = 0, \qquad \Delta_{ik}(j) = 0. \qquad (10)$$

For $k \notin B_i(j)$, define

$$a_{ik}(j) = D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k(j)} - \min_{m \notin B_i(j)}\left[D_{im}'(f_{im}) + \frac{\partial D_T}{\partial r_m(j)}\right] \qquad (11)$$

$$\Delta_{ik}(j) = \min\left[\phi_{ik}(j),\ \frac{\eta\, a_{ik}(j)}{t_i(j)}\right] \qquad (12)$$

where $\eta$ is a scale parameter of A to be discussed later. Let $k_{\min}(i,j)$ be a value of m that achieves the minimization in (11). Then

$$\phi_{ik}^1(j) = \begin{cases} \phi_{ik}(j) - \Delta_{ik}(j), & k \ne k_{\min}(i,j) \\ \phi_{ik}(j) + \sum_{m \ne k_{\min}(i,j)} \Delta_{im}(j), & k = k_{\min}(i,j). \end{cases} \qquad (13)$$

The algorithm reduces the fraction of traffic sent on nonoptimal links and increases the fraction on the best link. The amount of reduction, given by $\Delta_{ik}(j)$, is proportional to $a_{ik}(j)$, with the restriction that $\phi_{ik}^1(j)$ cannot be negative. In turn $a_{ik}(j)$ is the difference between the marginal delay to node j using link (i,k) and using the best link. Note that as the sufficiency condition (9) is approached, the changes get small, as desired. The amount of reduction is also inversely proportional to $t_i(j)$. The reason for this is that the change in link traffic is related to $\Delta_{ik}(j)t_i(j)$. Thus when $t_i(j)$ is small, $\phi_{ik}(j)$ can be changed by a large amount without greatly affecting the marginal link delays. Finally the changes depend on the scale factor $\eta$. For $\eta$ very small, convergence of the algorithm is guaranteed, as shown in Theorem 5, but rather slow. As $\eta$ increases, the speed of convergence increases but the danger of no convergence also increases. It is not difficult to develop heuristic improvements on this algorithm to speed up its convergence; we have settled on this particular version since it allows us to prove convergence.

We now must complete the definition of algorithm A by defining the sets $B_i(j)$. First define a routing variable $\phi_{ik}(j)$ to be improper if $\phi_{ik}(j) > 0$ and $\partial D_T/\partial r_i(j) \le \partial D_T/\partial r_k(j)$. We have already said that $B_i(j)$ includes only k for which $\phi_{ik}(j) = 0$, and thus, from (5),
$$\min_{m \notin B_i(j)}\left[D_{im}'(f_{im}) + \frac{\partial D_T}{\partial r_m(j)}\right] \le \frac{\partial D_T}{\partial r_i(j)}. \qquad (14)$$
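The update rule (10)-(13) for a single node i and a single destination j can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `update_routing`, the dictionary layout, and the handling of the case $t_i(j) = 0$ are choices made here, not part of the paper, and the marginal delays are taken as given inputs rather than measured.

```python
from typing import Dict, Set


def update_routing(phi: Dict[int, float],
                   link_marginal: Dict[int, float],
                   downstream_marginal: Dict[int, float],
                   t_i: float,
                   eta: float,
                   blocked: Set[int]) -> Dict[int, float]:
    """Apply (10)-(13) once at node i for a single destination j.

    phi[k]                  current routing fraction phi_ik(j)
    link_marginal[k]        marginal link delay D'_ik(f_ik)
    downstream_marginal[k]  dD_T/dr_k(j) as reported by neighbor k
    t_i                     node flow t_i(j)
    eta                     scale parameter
    blocked                 the blocked set B_i(j)
    """
    allowed = [k for k in phi if k not in blocked]
    # Marginal delay to j through each unblocked neighbor.
    cost = {k: link_marginal[k] + downstream_marginal[k] for k in allowed}
    k_min = min(allowed, key=lambda k: cost[k])

    new_phi = {k: 0.0 for k in phi if k in blocked}   # (10)
    moved = 0.0
    for k in allowed:
        if k == k_min:
            continue
        a_ik = cost[k] - cost[k_min]                  # (11)
        if t_i > 0.0:
            delta = min(phi[k], eta * a_ik / t_i)     # (12)
        else:
            # Assumption made for this sketch: with zero node flow,
            # eta * a / t is treated as unbounded whenever a > 0.
            delta = phi[k] if a_ik > 0.0 else 0.0
        new_phi[k] = phi[k] - delta                   # (13), k != k_min
        moved += delta
    new_phi[k_min] = phi[k_min] + moved               # (13), k == k_min
    return new_phi


if __name__ == "__main__":
    print(update_routing({1: 0.5, 2: 0.5}, {1: 1.0, 2: 2.0},
                         {1: 0.0, 2: 0.0}, t_i=1.0, eta=0.1, blocked=set()))
```

In the small example, neighbor 1 is the best link, so a fraction $\eta\,a_{i2}(j)/t_i(j)$ of the traffic is shifted from neighbor 2 to neighbor 1 and the fractions still sum to one.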
Assuming positive marginal link delays, $\partial D_T/\partial r_i(j) < \partial D_T/\partial r_k(j) + D_{ik}'(f_{ik})$ if $\phi_{ik}(j)$ is improper; thus, from (14), $a_{ik}(j) > 0$ and the algorithm always reduces improper routing variables. In fact, since $\partial D_T/\partial r_i(j)$ is the marginal delay from i to j, we would expect marginal delay to decrease as we move downstream, and improper routing variables should be rather atypical. For a given destination node j, the set of marginal delays $\partial D_T/\partial r_i(j)$ ($i \ne j$) forms an ordering of the nodes i. Note that if there are no improper routing variables, this ordering is consistent with the downstream partial ordering. Fig. 3 illustrates these orderings. The horizontal axis represents marginal delay (for the given destination node j = 5) and the solid lines show the downstream partial ordering by denoting the links for which $\phi_{ik}(5) > 0$. The dotted lines are examples of links (l,k) for which loops would form if

Fig. 3. Marginal delay ordering, downstream partial ordering, and possible loop formation. (Horizontal axis: decreasing $\partial D_T/\partial r_i(5)$.)
(15)

Note that the definition permits k to be identical to l. The reason for (15) can be seen from (14) and (12). If (15) is not satisfied, then $\Delta_{lm}(j) = \phi_{lm}(j)$ and $\phi_{lm}^1(j) = 0$, so that (l,m) cannot be part of a loop for destination j.

Theorem 4: If the marginal link delays $D_{ik}'$ are positive and $\phi$ is loop free, then $\phi^1 = A(\phi)$ is loop free.

Proof: Assume to the contrary that $\phi^1$ has a loop, say with respect to destination j. Then from condition 2) above there is some link (l,m) on the loop for which $\phi_{lm}^1(j) > 0$. This implies that (15) is satisfied. Now move backward around the assumed loop to the first link (i,k) for which

$$\lim_{m\to\infty} D_T(\phi^m) = \min_{\phi} D_T(\phi) \qquad (16)$$

where $\phi^m = A(\phi^{m-1})$ for all $m \ge 1$. This is proved in Appendix C. Note that $\eta$ depends on some upper bound $D_0$ to $D_T$; this is natural, since when the link flows are very close to capacity, small changes in the link flows cause large changes in marginal delay. The proof uses a ridiculously small value of $\eta$ to guarantee convergence under all conditions and experimental work is necessary to determine practical values for $\eta$.

USE OF THE ALGORITHM FOR QUASI-STATIC ROUTING
We have shown in the last section that the algorithm A must eventually converge to the minimum average delay for a network with stationary inputs and links. The algorithm is really intended, however, for quasi-static applications where the input statistics are slowly changing with time and where occasionally links or nodes fail or are added to the network. Under these more general conditions, it is clear that the loop freedom of the algorithm is maintained since this is a mathematical property that is independent of the marginal link delays and node flows, which are the only inputs to the algorithm (note that the inputs r play a role in the theoretical development, but do not appear in the algorithm and need not be estimated).

The question of whether the algorithm can adapt fast enough to keep up with changing statistics is difficult and requires more study. Clearly, the faster the statistics change, the more frequently the algorithm should be updated, but frequent updating has two undesirable effects. First, frequent updates require more updating protocol, thus reducing the effective link capacities available for data, and second, frequent updates will necessitate noisier measurements of marginal link delays and node flows. Experimentation would be helpful both in determining update rates and the scale parameter $\eta$.

Another open question is that of a starting rule for the algorithm (finding a loop free $\phi$ to start with). One possibility is to start with shortest paths; that is, set $\phi_{ik}(j) = 1$ for the link (i,k) that leads to j from i with the smallest number of links. Such a strategy might well lead to link flows which mathematically exceed capacity, but in this case a well designed flow control would limit the input to the network, thus yielding large but finite marginal delays on such links and allowing the routing algorithm to gradually adapt.

The problem of dropped links or nodes is somewhat more complicated. Some of the problems here must be solved by higher order protocols, since if the network becomes disconnected, there is no way to route data between disconnected parts of the network. However the routing algorithm should still adapt by finding routes for any data that can be sent. Each node i at the end of a link (i,k) that has failed or whose opposite node has failed should signal the fact that an update should start throughout the network. In addition, i should no longer regard k as being downstream with respect to any destination j, and if k was the only downstream neighbor, then i should broadcast $\partial D_T/\partial r_i(j) = \infty$. This latter broadcast prevents upstream nodes from waiting indefinitely for update information to propagate through a failed link or node. The exact details of updating protocols in the presence of link and node failures are a subject for further research.

APPENDIX A

Proof of Theorem 1: Without loss of generality, take the destination node j to be the nth of the n nodes and drop the argument j from (1),

$$t_i = r_i + \sum_{l=1}^{n-1} t_l \phi_{li}, \qquad 1 \le i \le n. \qquad (A1)$$

Summing both sides over i, we see that any solution to (A1) satisfies

$$t_n = \sum_{i=1}^{n} r_i. \qquad (A2)$$

Temporarily let $\phi_{ni} = r_i/t_n$ and substitute this in (A1).

$$t_i = \sum_{l=1}^{n} t_l \phi_{li}. \qquad (A3)$$

Any solution to (A3) and (A2) satisfies (A1) and vice versa. Let $\Phi$ be the n × n matrix with components $\phi_{il}$. $\Phi$ is stochastic (i.e.,

(A5)

Using (A5) in (A4), the solution to (A1) is conveniently expressed, for any r, as

$$t_i = \sum_{l} r_l \frac{\partial t_i}{\partial r_l}. \qquad (A6)$$

Finally, differentiating (A1) with respect to $\phi_{km}$, we get

$$\frac{\partial t_i}{\partial\phi_{km}} = \sum_{l=1}^{n-1} \frac{\partial t_l}{\partial\phi_{km}}\,\phi_{li} + \delta_{im}\, t_k$$

where $\delta_{im} = 1$ for i = m and 0 otherwise. For fixed k, m, this is the same set of equations as (A1), so that the solution, continuous in $\phi$, is

$$\frac{\partial t_i}{\partial\phi_{km}} = \frac{\partial t_i}{\partial r_m}\, t_k. \qquad (A7)$$

Proof of Theorem 2: First we show that (5), repeated below with the destination node again taken to be n, has a unique solution.

$$\frac{\partial D_T}{\partial r_i} = \sum_{k} \phi_{ik}\left[D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k}\right]. \qquad (A8)$$
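$b_{n-1}$). Let $\nabla D_T$ be the column vector $(\partial D_T/\partial r_1, \ldots, \partial D_T/\partial r_{n-1})$. Then (A8) can be rewritten as

$$\nabla D_T = b + \Phi\,\nabla D_T. \qquad (A9)$$

We saw in the proof of Theorem 1 that $I - \Phi$ has a unique inverse with components given by (A5). Thus the unique solution to (A9) is

(A10) (A11)

Differentiating $D_T$ directly with (2) and (3), we get the same unique solution, which, from Theorem 1, is continuous in

$$\frac{\partial D_T}{\partial\phi_{ik}} = \sum_{l,m} D_{lm}'(f_{lm})\,\phi_{lm}\,\frac{\partial t_l}{\partial\phi_{ik}} + D_{ik}'(f_{ik})\,t_i. \qquad (A12)$$

We have used (A7) and (A10) to derive (A12), which is the same as (6). This is clearly continuous in $\phi$ given the continuity of $t_i$ and $\partial D_T/\partial r_i$, and the proof is complete.

APPENDIX B

Proof of Theorem 3: First we show that (7) is a necessary condition to minimize $D_T$ by assuming that $\phi$ does not satisfy (7). This means that there is some i, j, k, and m such that

$$\frac{\partial D_T(\phi)}{\partial\phi_{ik}(j)} > \frac{\partial D_T(\phi)}{\partial\phi_{im}(j)}. \qquad (B1)$$

Since these derivatives are continuous, a sufficiently small increase in $\phi_{im}(j)$ and corresponding decrease in $\phi_{ik}(j)$ will decrease $D_T$, thus establishing that $\phi$ does not minimize $D_T$. Next we show that (8), repeated below, is a sufficient condition to minimize $D_T$.

$$D_{ik}'(f_{ik}) + \frac{\partial D_T}{\partial r_k(j)} \ge \frac{\partial D_T}{\partial r_i(j)}, \qquad \text{all } i, j, k. \qquad (B2)$$

Suppose that $\phi$ satisfies (B2) and has node flows t and link flows f. Let $\phi^*$ be any other set of routing variables with node flows $t^*$ and link flows $f^*$. Define

$$f_{ik}(\lambda) = (1-\lambda)f_{ik} + \lambda f_{ik}^* \qquad (B3)$$

$$D_T(\lambda) = \sum_{i,k} D_{ik}(f_{ik}(\lambda)). \qquad (B4)$$

There is a set of routing variables $\phi(\lambda)$ which gives rise to $f(\lambda)$, but they are not linear in $\lambda$ and their existence is not relevant to our proof. Since each link delay $D_{ik}$ is a convex ∪ function of the link flow, $D_T(\lambda)$ is convex ∪ in $\lambda$, and hence

$$\left.\frac{dD_T(\lambda)}{d\lambda}\right|_{\lambda=0} \le D_T(\phi^*) - D_T(\phi). \qquad (B5)$$

Since $\phi^*$ is arbitrary, proving that $dD_T(\lambda)/d\lambda \ge 0$ at $\lambda = 0$ will complete the proof. From (B4) and (B3),

$$\left.\frac{dD_T(\lambda)}{d\lambda}\right|_{\lambda=0} = \sum_{i,k} D_{ik}'(f_{ik})\,(f_{ik}^* - f_{ik}). \qquad (B6)$$

We now show that

$$\sum_{i,k} D_{ik}'(f_{ik})\,f_{ik}^* \ge \sum_{i,j} r_i(j)\,\frac{\partial D_T(\phi)}{\partial r_i(j)}. \qquad (B7)$$

Multiplying both sides of (B2) by $\phi_{ik}^*(j)$ and summing over k gives

$$\sum_{k} D_{ik}'(f_{ik})\,\phi_{ik}^*(j) \ge \frac{\partial D_T(\phi)}{\partial r_i(j)} - \sum_{k}\phi_{ik}^*(j)\,\frac{\partial D_T(\phi)}{\partial r_k(j)}. \qquad (B8)$$

Multiplying both sides of (B8) by $t_i^*(j)$, summing over i, j, and recalling that $f_{ik}^* = \sum_j t_i^*(j)\phi_{ik}^*(j)$, we obtain

$$\sum_{i,k} D_{ik}'(f_{ik})\,f_{ik}^* \ge \sum_{i,j} t_i^*(j)\,\frac{\partial D_T(\phi)}{\partial r_i(j)} - \sum_{i,j,k} t_i^*(j)\,\phi_{ik}^*(j)\,\frac{\partial D_T(\phi)}{\partial r_k(j)}. \qquad (B9)$$

From (1), $\sum_i t_i^*(j)\phi_{ik}^*(j) = t_k^*(j) - r_k(j)$. Substituting this into the rightmost term of (B9) and canceling, we get (B7). Note that the only inequality used here was (B8), and that if $\phi$ is substituted for $\phi^*$, this becomes an equality from the equation for $\partial D_T/\partial r_i(j)$ in (5). Thus

$$\sum_{i,k} D_{ik}'(f_{ik})\,f_{ik} = \sum_{i,j} r_i(j)\,\frac{\partial D_T(\phi)}{\partial r_i(j)}. \qquad (B10)$$

Substituting (B10) and (B7) into (B6), we see that $dD_T(\lambda)/d\lambda \ge 0$ at $\lambda = 0$, completing the proof. We note in passing that (B10) is valid for any set of routing variables and appears to be a rather fundamental conservation equation.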
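The conservation equation (B10), that the sum over links of $D_{ik}'(f_{ik})f_{ik}$ equals the sum over nodes of $r_i(j)\,\partial D_T/\partial r_i(j)$, can be checked numerically on a small loop-free network. The sketch below is illustrative only: the four-node topology, the inputs, the capacity value, and the link delay $D(f) = f/(C - f)$ are all assumptions chosen for the example, and a single destination is used so the argument j is dropped.

```python
def node_flows(r, phi, order):
    """Solve (1) for one destination; `order` lists nodes upstream-first."""
    t = {i: r.get(i, 0.0) for i in order}
    for i in order:
        for k, frac in phi.get(i, {}).items():
            t[k] += t[i] * frac
    return t


def marginal_delays(phi, d_prime, order, dest):
    """Evaluate (5) by working backward from the destination."""
    dr = {dest: 0.0}
    for i in reversed(order):
        if i == dest:
            continue
        dr[i] = sum(frac * (d_prime[(i, k)] + dr[k])
                    for k, frac in phi[i].items())
    return dr


def check_b10():
    dest = 4
    order = [1, 2, 3, 4]                      # topological (upstream-first) order
    r = {1: 1.0, 2: 0.5}                      # hypothetical inputs
    phi = {1: {2: 0.4, 3: 0.6}, 2: {3: 0.5, 4: 0.5}, 3: {4: 1.0}}
    capacity = 4.0                            # hypothetical link capacity
    t = node_flows(r, phi, order)
    f = {(i, k): t[i] * frac
         for i, nbrs in phi.items() for k, frac in nbrs.items()}
    # Assumed delay D(f) = f / (C - f), so D'(f) = C / (C - f)^2.
    d_prime = {link: capacity / (capacity - flow) ** 2
               for link, flow in f.items()}
    dr = marginal_delays(phi, d_prime, order, dest)
    lhs = sum(d_prime[link] * f[link] for link in f)
    rhs = sum(r.get(i, 0.0) * dr[i] for i in order)
    return lhs, rhs


if __name__ == "__main__":
    print(check_b10())
```

The two returned numbers agree to rounding error for any choice of loop-free routing fractions, which is what (B10) asserts.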
APPENDIX C

We prove Theorem 4 through a sequence of seven lemmas. The first five establish the descent properties of the algorithm, the sixth establishes a type of continuity condition, showing that if $\phi$ does not minimize $D_T$, then for any $\phi^*$ in a neighborhood of $\phi$, $D_T(A^m(\phi^*)) < D_T(\phi)$ for some m. The seventh lemma is a new global convergence theorem which does not require continuity in the algorithm A; Lemmas 6 and 7 together establish Theorem 4.

Let $\phi$ be an arbitrary set of routing variables satisfying $D_T(\phi) < D_0$ for some $D_0$. Let $\phi^1 = A(\phi)$ and let t, f, $t^1$, $f^1$ be the node and link flows corresponding to $\phi$ and $\phi^1$, respectively. Let $f^\lambda$ ($0 \le \lambda \le 1$) be defined by $f_{ik}^\lambda = (1 - \lambda)f_{ik} + \lambda f_{ik}^1$, and let $D_T(\lambda) = \sum_{i,k} D_{ik}(f_{ik}^\lambda)$.
using(1) and (2)~ we get
~ d ik(j)aik(j)t i 1(j)
i;i,k
(C6)
(C1)
i,k
= ~ (fik
From the Taylor remainder theorm, DT(tj)l)- DT(
+ .!-d_2_D_T_(A_) 2
til (j)) summing, and
dX2
-hh 1)D ik '(fi k )
i,k
= _ dDT(A) dX
I
(e2)
A=A*
where $\lambda^*$ is some number, $0 \le \lambda^* \le 1$. The continuity of the second derivative above will be obvious from the proof of Lemma 4, which upper bounds that term. The first three lemmas deal with $dD_T(\lambda)/d\lambda|_{\lambda=0}$.

Lemma 1:
(C7)
I
(C8)
A=O
We have used (B10) to get (C7), and (C8) follows from (C1), completing the proof. Lemma 2: dLJ (X) _T__ , ' d 1\
I
A=O
1 d 2 . 2 · ( _ 1)3 ~ i Uv, (j) 1111 . 1,J
~
"" -
(C9)
where t1;(j) = ~ t1i k (j).
(CI0)
k
(C3)
Proof: From the definition of A.ik(j) in (12),
-ti(j)~ik(j)111.Substituting
Proof>: Usingthe definitions of QikU) and t1i k (j) in (11) and (12)
dDT(X) -dX
I
(C5)
In (C4), we have used (13) to extend the sum over all k and in (C5), we have used (5). Multiplying both sides of (C5) by
2 As mentioned before, there is a $\phi^\lambda$ corresponding to $f^\lambda$, but it is nonlinear in $\lambda$, and $dD_T(\lambda)/d\lambda$ cannot be calculated in a straightforward way by differentiating with respect to $\phi^\lambda$.
-aik(j) ~
1
A=O
~-- ~ dik2(j)ti(j)til(j) 11 i.i.k 1
~
(C4)
this into (C3) yields
~ A. i 2(j)ti(j)ti1 (j)
(n - 1)71 i.i
(C11)
where (C11) follows from Cauchy's inequality, $(\sum_k \alpha_k\beta_k)^2 \le (\sum_k \alpha_k^2)(\sum_k \beta_k^2)$, with $\alpha_k = 1$, $\beta_k = \Delta_{ik}(j)$, and the sum over $k \ne i$. Now define $t_i^*(j)$ as the total flow at node i destined for j if the routing variables $\phi_{ik}(j)$ (for $k \ne k_{\min}(i,j)$) are reduced by $\Delta_{ik}(j)$ but $\phi_{ik}(j)$ for $k = k_{\min}(i,j)$ is not increased. Mathematically $t_i^*(j)$ satisfies

$$t_i^*(j) = \sum_{l} t_l^*(j)\left[\phi_{li}(j) - \Delta_{li}(j)\right] + r_i(j). \qquad (C12)$$
This has a unique solution because of the loop freedom of
l
(C13)
From (A6), using 'Lt,*(j)D.1i(j) for ri(j),
ti(j) - ti*(j)
= ~ ati(~) ~ tk *(j)t:.k 1(;)' l'
arl(J) k
(C14)
Since 1> is loop free, atiU)/ar,(i) ~ 1. Also if ati(j)/arl(j) > 0, then 1 is upstream of i for destination j and ¢il(j) (and hence tlil(i» is ~ero. Thus I
d?DTCA) _ . " A d"A 2 - ~Djk (fik Hfik1 i,k
lilU) - ti(j)
I
+ ~ t,V) [4>li 1 (j) - ¢'f(j)]
(C15)
I
Multiplying the left side by Ai(j) ~ 1 preserves the inequality, yielding
••
1
~ a·{3.~~a·2 ~ II 2 ~,.
m
i=l
I
where we have used
oj
k
We can lower bound (C23) in the same way, considering only terms in which ¢~ll(j) - ¢kl(j) < 0, and this leads to
I li1U) - t;(1') I ~ ~ tk(j)Lik(j)
(CI7)
Qi ~ ~i
1
ak 2
for all k.
,
0
1
~ Li,2(J')t·(J·)t.*(J') ~ (n- q2 ~ f'. f' ~i2(j")t,2(')J. I
I
A
.
hk = ~ [ti 1 (j) -
-
i
lfik 1 -
o.,
. I
j
+
E lj(j) I
ihk 1
-
fik IZ
w~
get
(C25)
~i
fik 1 - fik
1 (;)]?
j,l
2
1
~ t;2(;)[l/lik
~ n(n - 1)1 Lt
l
1(j)
-l/lik(;)]2!
2 (j)A 2 U) 1
i,l
+2
Taking the second derivative,
I.
~ nin - 1) l~tI2(j)t:.12(j)[l/lik +
k
Proof: The pound M must exist because D i k " (hk A) is a continuous function of Aover the compact region 0 ~ A ~ I"
The double SUJTI in (e2S) has at most (n - 1)2 nonzero terms (j =1= i, I =1= j) and the second sum at most n - 1 terms. Using Cauchy's inequality on both terms together, we get
(C20)
Since f.;*(1') ~ til (j), we can substitute (C20) into (C11), getting (C9) and completing the proof of Lemma 2. Lemma 4:' Let M be a~ upper bound to "(fih A) over all t. k and over 0 ~ A~ 1. Then for any A, 0 ~ ~ ~ 1,
lj(j)]
hk I~ ~ ~ t IU)d 1(j )4>ikl(j)
(C19)
This implies (C l7)~ completing the proof of Lemma 3. Now let (Xi = ti(j)Ai(j) aI1~ (3i = ti*(j)~i(j). Since these terms are nonzero only for i =1= i. we can take m == ~ - 1. Since the conditions of the lemma ar~ satisfied for this choice,
I
1
(C18)
and then Cauchy's inequality.
(C24)
h
fik
for ~11 k,
rn
0
h
ti 1(1') - ti(j) ~ ~ tk(j)LikU),
i
~Qi~i~ ~~i2~;(L~ir ~ (X.a. ~ ~ 'w,
J
(e23)
Proof of Lemma 3.-
Si~ce ~~i ~ CXh
'1
Since 0 ~ atiI (j)f a'l(j) ~ 1, we can upper bound this by
Since the right-hand side of (C14) is nonnegative, we also have t;(1')J1 i U) ~ t i *(1')A;(1') , We interrupt the proof now for a short technical1 lemma. Lemma 3: Let ai' ~; (I ~ i ~ m) be nonnegative numbers satisfying Cti < ~k~k ;a:i ~ ~i for 1 ~ i < Ill. Then m
~ qatil(~)') ~ tk (j)(l/lk 8j) -I/lk/V)].
(C16)
ok
o
[t,l(j) - t,(j)] >li 1(j)
=~
k=l=i
ti(j)Ai(j) ~ ~ tk *(j)Ak(j)·
i.k
We now upper bound Ihk 1 - fik I by first ~pper bounding (til(j) - tiel) I. As in the proof of Lemma 2, we have
0
h;#:i
fikJ2~LMUikl -Jik]2. (e22)
ti(j) - ti*(j) ~ ~ ~ t~ *(j)Akl(1') = ~ lk *:(j)~k(j), l
-
~ tNj)t:.;2(j) I.
(C26)
Summing over i and substituting the result in (C22), we get (C21), completing the proof.
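Lemma 5: For given $D_0$, define

$$M = \max_{i,k}\ \max_{f:\,D_{ik}(f) \le D_0} D_{ik}''(f) \qquad (C27)$$

$$\eta = [Mn^6]^{-1}. \qquad (C28)$$

Then for all $\phi$ such that $D_T(\phi) \le D_0$,

$$D_T(A(\phi)) - D_T(\phi) \le -\frac{1}{2\eta(n-1)^3}\sum_{i,j}\Delta_i^2(j)\,t_i^2(j). \qquad (C29)$$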
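A quick arithmetic check shows how conservative the step size (C28) is. The sketch below assumes that the bracketed coefficient in (C30) reads $-1/(\eta(n-1)^3) + Mn(n-1)(n+2)/2$, as the damaged text suggests; the function names are hypothetical. It confirms that with $\eta = [Mn^6]^{-1}$ the second term is at most half the magnitude of the first, which is the step that turns (C30) into (C29).

```python
def step_size(M: float, n: int) -> float:
    """Scale parameter eta from (C28)."""
    return 1.0 / (M * n ** 6)


def c30_terms(M: float, n: int, eta: float):
    """The two terms of the bracket in (C30), as read here (an assumption)."""
    first = -1.0 / (eta * (n - 1) ** 3)
    second = M * n * (n - 1) * (n + 2) / 2.0
    return first, second


if __name__ == "__main__":
    M = 1.0  # hypothetical bound on the second derivative of link delay
    for n in (2, 5, 10, 50):
        eta = step_size(M, n)
        first, second = c30_terms(M, n, eta)
        # The descent bound (C29) needs: second <= |first| / 2.
        print(n, eta, second <= abs(first) / 2.0)
```

For a ten-node network with M = 1 the resulting $\eta$ is one millionth, which illustrates why practical values have to be found by experiment.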
Proof: Temporarily let M be as defined in Lemma 4. Combining Lemmas 2 and 4, D T (!/>l ) - D T (»
~ [_ +
1 l1(n - 1)3
Mn(n -
where Ai(j) and t i(1') correspond to the given 4>. Choose € small enough so that (C33) is satisfied for I' 4> - 4>* t < € and also so that
1)(n + 2)] ~ b.i2(j )ti2(j).
2
Z,]
Combining this with (C33), we have (C31) for m = 1. Case 2: Blocking occurs. For any ¢, we can use (5) to lower bound aik(j) by
aik(j) ~ Dik'(!ik)
b.ik(j)t i(j) ;;;. min
(C30) For 1'/ = [Mn 6 ] -1, the second term in brackets above is less than half the magnitude of the first term, yielding (C29). It follows that DT(l/>l ) ~ DT(t/J) ~ Do. By convexity then Di h ifA) ~ Do for 0 ~ A ~ 1. Thus M as given in (e2?) satisfies the condition on M in Lemma 4, compleing the proof. Lemma 6: Let the scale factor '11 satisfy (e28) for a given Do and Jet 4> be an arbitrary set of routing variables that does not minimize DT and that satisfies D T (, there exists an e > 0 and an m, 1 ~ m ~ n, such that for all 1* satisfying Ilf> - (/>* I < €,
+ oDTlarkU) -
I~ik
+ aD
(j)ti(j), 71
T _anT]
ark (j)
>
min
(C32)
l~m~n
which is continuous in 4>. It follows from (12) that aoz(j) is continuous in If>, and the upper bound to DT(A(ep)) - DT (4)) in (C29) is continuousf in 4>. Since by assumption the bound in (C29) is strictly negative, there is a neighborhood of l/J* around if> for which 3 As a precaution against being too casual about these arguments, one should note that if the minimizing m in (C32) is not unique, then A(l/» is not continuous in 4>.
(C34)
~ik' (fik)
I
(C35)
The lower bounds above are continuous functions of
(C36) and (C37)
(C31)
Proof: We consider three cases. The first is the typical case in which no blocking occurs and DT(A(cf») < DT((j», the second is the case in which blocking occurs, and the third is the situation typified by Fig. 2 in which DT(A«(/») = DT (¢). Case 1: No blocking; ai(j)ti(j) 0 for some i, j. If no nodes are blocked for l/>, then by the definition of blocking (15), there is a neighborhood of t/J* around l/> for which no blocking occurs. In this neighborhood,
'iJri(j)
aDT/oriU)
Combining (C35) to (C37),
~ih (j)t i(j) ~ l1Dih'(!ik)'
(C38)
Since the right-hand side of (C35) is continuous in 3 be the set of cP for which dik(j)ti(j) = 0 for all i, j, k. Let 4>(1) = A I(
(C40)

Since this equation is satisfied for all l, $0 \le l \le m - 2$, we see that $\partial D_T(\phi^{(l)})/\partial r_i(j)$ can be reduced on iteration l only if $\partial D_T(\phi^{(l-1)})/\partial r_k(j)$ is reduced on iteration l - 1 for some k such that $\partial D_T(\phi^{(l-1)})/\partial r_k(j) < \partial D_T(\phi^{(l)})/\partial r_i(j)$. This reduction at node k however implies a reduction at some node k' of smaller differential delay at iteration l - 2 and so forth. Since this sequence of differential delays is decreasing with decreasing l and since (from (C40)) the differential delay at a given node is nondecreasing with decreasing l, each node in the sequence must be distinct. Since there are n - 1 nodes other than the given destination available for such a sequence, the initial l in such a sequence satisfies $l \le n - 2$. On the other hand, if $\partial D_T(\phi^{(l)})/\partial r_i(j)$ is unchanged for all i, j, we see from (C40) that $\phi^{(l)}$ satisfies the sufficient conditions to minimize $D_T$, and then $\phi$ also minimizes $D_T$, contrary to our hypothesis; thus we must have $m \le n$.4 Now observe that the middle expression in (C40), for l = 0, is a continuous function of $\phi$ and consequently $\partial D_T(\phi^{(1)})/\partial r_i(j)$ is a continuous function of $\phi$ for all i, j. It follows by induction that $\partial D_T(\phi^{(l)})/\partial r_i(j)$ is a continuous function of $\phi$ for all i, j and for $l \le m - 1$. Finally $\phi^{(m-1)} \notin \Phi_3$, so it must satisfy the conditions of case 1 or 2; it will be observed that the analyses there apply equally to $\phi^{(m-1)}$ because of the continuity of $\partial D_T(\phi^{(m-1)})/\partial r_i(j)$ as a function of $\phi$. This completes the proof.

4 It can be seen from this that the algorithm converges in at most n steps to a $\phi$ satisfying the sufficient conditions (8) if $D_{ik}$ is linear in $f_{ik}$ for each i, k (in this case, from (C28), $\eta = \infty$).

Our last lemma will be stated in greater generality than required since it is a global convergence theorem for algorithms that avoids the usual continuity constraint on the algorithm (see Luenberger [16] for a good discussion of global convergence).

Lemma 7: Let $\Phi$ be a compact region of Euclidean N space. Let A be a mapping from $\Phi$ into $\Phi$ and let $D_T$ be a continuous real valued function in

Proof: Since $\Phi$ is compact, the sequence $\{A^m(\phi)\}$ has a convergent subsequence, say $\{\phi^l\}$, with

$$\phi' = \lim_{l\to\infty}\phi^l; \qquad \phi' \in \Phi. \qquad (C42)$$

Since $D_T$ is continuous,

$$\lim_{l\to\infty} D_T(\phi^l) = D_T(\phi'). \qquad (C43)$$

Furthermore, by assumption, $D_T(A^m($

$$D_T(A^m(\phi)) \ge D_T(\phi'), \qquad \text{all } m \ge 1. \qquad (C45)$$

To complete the proof, we must show that $\phi' \in \Phi_{\min}$; we assume the contrary and demonstrate a contradiction. By assumption then, there is an $\epsilon > 0$ and an m' associated with $\phi'$ such that $D_T(A^{m'}(\phi^*)) < D_T(\phi')$ for all $\phi^* \in$

ACKNOWLEDGMENT

The author would like to thank L. Kleinrock, A. Segall, J. Wozencraft, and two anonymous reviewers for a number of helpful comments on an earlier version of this paper.

REFERENCES

[1] D. G. Cantor and M. Gerla, "Optimal routing in a packet switched computer network," IEEE Trans. Comput., vol. C-23, pp. 1062-1069, Oct. 1974.
[2] F. E. Heart, R. E. Kahn, S. M. Ornstein, W. R. Crowther, and D. C. Walden, "The interface message processor for the ARPA Computer Network," in Conf. Rec. 1970 Spring Joint Comput. Conf., AFIPS Conf. Proc., 1970, pp. 551-566.
[3] A. C. Pigou, The Economics of Welfare. London, England: MacMillan, 1920.
[4] S. C. Dafermos and F. T. Sparrow, "The traffic assignment problem for a general network," J. Res. Nat. Bureau of Standards-B Math., vol. 73B, no. 2, pp. 91-118, 1969.
[5] C. Agnew, "On the optimality of adaptive routing algorithms," in Conf. Rec. Nat. Telecommun. Conf., 1974, pp. 1021-1025.
[6] -, "On quadratic adaptive routing algorithms," Commun. Ass. Comput. Mach., vol. 19, no. 1, pp. 18-22, 1976.
[7] R. E. Kahn and W. R. Crowther, "A study of the ARPA Network design and performance," BBN rep. 2161, Aug. 1971.
[8] T. E. Stern, "A class of decentralized routing algorithms using relaxation," to be published.
[9] L. Fratta, M. Gerla, and L. Kleinrock, "The flow deviation method: An approach to store-and-forward communication network design," Networks, vol. 3, pp. 97-133, 1973.
[10] H. Frank and W. Chou, "Routing in computer networks," Networks, vol. 1, pp. 99-122, 1971.
[11] L. Kleinrock, Communication Nets: Stochastic Message Flow and Delay. New York: McGraw-Hill, 1964.
[12] -, "Analytic and simulation methods in computer network design," in Conf. Rec., Spring Joint Comput. Conf., AFIPS Conf. Proc., 1970, pp. 569-579.
[13] A. Segall, "The modeling of adaptive routing in data communication networks," this issue, pp. 85-95.
[14] M. Bello, S.M. thesis, Dep. Elec. Eng. and Comput. Sci., Massachusetts Inst. Technol., Cambridge, Sept. 1976.
[15] F. R. Gantmacher, Matrix Theory, vol. 2. New York: Chelsea, 1959.
[16] D. G. Luenberger, Introduction to Linear and Nonlinear Programming. Reading, MA: Addison Wesley, 1973.

Robert G. Gallager (S'58-M'61-F'68) was born in Philadelphia, PA on May 29, 1931. He received the S.B. degree in electrical engineering from the University of Pennsylvania, Philadelphia, in 1953 and the S.M. and Sc.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1957 and 1960, respectively. From 1953 to 1954 he was a member of the technical staff at Bell Laboratories and from 1954 to 1956 was in the signal corps of the U.S. Army. He has been at the Massachusetts Institute of Technology since 1956 and was Associate Chairman of the Faculty from 1973 to 1975. He is currently a Professor of Electrical Engineering and Computer Science and is the Associate Director of the Electronic Systems Laboratory. He is also a consultant to Codex Corporation, Newton, MA. He is the author of the textbook Information Theory and Reliable Communication (New York: Wiley, 1968), and was awarded the IEEE Baker Prize Paper Award in 1966 for the paper "A Simple Derivation of the Coding Theorem and Some Applications." Mr. Gallager was a member of the Administrative Committee of the IEEE group on Information Theory from 1965 to 1970 and was Chairman of the group in 1971. His major research interests are data networks, information theory, and computer architecture.
Packet Switching in Radio Channels: Part I-Carrier Sense Multiple-Access Modes and Their Throughput-Delay Characteristics LEONARD KLEINROCK,
LEONARD KLEINROCK, FELLOW, IEEE, AND FOUAD A. TOBAGI

Abstract-Radio communication is considered as a method for providing remote terminal access to computers. Digital byte streams from each terminal are partitioned into packets (blocks) and transmitted in a burst mode over a shared radio channel. When many terminals operate in this fashion, transmissions may conflict with and destroy each other. A means for controlling this is for the terminal to sense the presence of other transmissions; this leads to a new method for multiplexing in a packet radio environment: carrier sense multiple access (CSMA). Two protocols are described for CSMA and their throughput-delay characteristics are given. These results show the large advantage CSMA provides as compared to the random ALOHA access modes.

Paper approved by the Associate Editor for Computer Communication of the IEEE Communications Society for publication after presentation at the National Computer Conference, Anaheim, Calif., 1975. Manuscript received December 5, 1974; revised June 11, 1975. This work was supported in part by the Advanced Research Projects Agency, Department of Defense under Contract DAHC1573-C-0368. The authors are with the Computer Science Department, University of California, Los Angeles, Calif. 90024.

I. INTRODUCTION

LARGE COMPUTER installations, enormous data banks, and extensive national computer networks are now becoming available. They constitute large expensive resources which must be utilized in a cost/effective fashion. The constantly growing number of computer applications and their diversity render the problem of accessing these large resources a rather fundamental one.

Prior to 1970, wire connections were the principal means for communication among computers and between users and computers. The reasons were simple: dial-up and leased telephone lines were available and could provide inexpensive and reasonably reliable communications for short distances, using a readily available and widespread technology. It was long recognized that this technology was inadequate for the needs of a computer-communication system which is required to handle bursty traffic (i.e., large peak to average data rates). For example, the inadequacies included the long dial-up and connect time, the minimum three-minute tariff structure, the fixed and limited data rates, etc. However, it was not until 1969 that the cost to switch communication bandwidth dropped below the cost of the bandwidth being switched [1]. At that time, the new technology of packet-switched computer networks emerged and developed a cost/effective means for connecting computers together over long-distance high-speed lines. However, these networks did not solve the local interconnection problem, namely, how can one efficiently provide access from the user to the network itself? Certainly, one solution is to use wire connections here also. An alternate solution is the subject of this paper, namely, ground radio packet switching.

We wish to consider broadcast radio communications as an alternative for computer and user communications. The ALOHA System [2] appears to have been the first such system to employ wireless communications. The advantages in using broadcast radio communications are many: easy access to central computer installations and computer networks; collection and dissemination of data over large distributed geographical areas independent of the availability of preexisting (telephone) wire networks; the suitability of wireless connections for communications with and among mobile users (a constantly growing area of interest and applications); easily bypassed hostile terrain; etc. Perhaps, this broadcast property is the key feature in radio communication.

The Advanced Research Projects Agency (ARPA) of the Department of Defense recently undertook a new effort whose goal is to develop new techniques for packet radio communication among geographically distributed, fixed or mobile, user terminals and to provide improved frequency management strategies to meet the critical shortage of RF spectrum. The research presented in this paper is an integral part of the total design effort of this system which encompasses many other research topics [3]-[9].

Consider an environment consisting of a number of (possibly mobile) users in line-of-sight and within range of each other, all communicating over a (broadcast) radio channel in a common frequency band. The classical approach for satisfying the requirement of two users who need to communicate is to provide a communication channel for their use so long as their need continues (line switching). However, the measurements of Jackson and Stubbs [10] show that such allocation of scarce communication resources is extremely wasteful. Rather than providing channels on a user-pair basis, we much prefer to provide a single high-speed channel to a large number of users which can be shared in some fashion. This, then, allows us to take advantage of the powerful "large number laws" which state that with very high probability, the demand at any instant will be approximately equal to
Reprinted from IEEE Transactions on Communications, vol. COM-23, no. 12, December 1975. The Best ofthe Best. Edited by W H. Tranter,D. P Taylor, R. E. Ziemer,N. F. Maxemchuk, and 1. W Mark. Copyright© 2007 The Institute of Electrical and ElectronicsEngineers, Inc.
the sum of the average demands of that population. We wish to take advantage of these gains due to resource sharing.

Of interest to this paper is the consideration of radio channels for packet switching (also called packet radio channels). A packet is merely a package of data prepared by one user for transmission to some other user in the system. As soon as we deal with shared channels in a packet-switching mode, then we must be prepared to resolve conflicts which arise when more than one demand is simultaneously placed upon the channel. In packet radio channels, whenever a portion of one user's transmission overlaps with another user's transmission, the two collide and "destroy" each other. The existence of some acknowledgment scheme permits the transmitter to determine if his transmission was successful or not. The problem we are faced with is how to control the access to the channel in a fashion which produces, under the physical constraints of simplicity and hardware implementation, an acceptable level of performance. The difficulty in controlling a channel which must carry its own control information gives rise to the so-called random-access modes.

A simple scheme, known as "pure ALOHA," permits users to transmit any time they desire. If, within some appropriate time-out period, they receive an acknowledgment from the destination, then they know that no conflicts occurred. Otherwise, they assume a collision occurred and they must retransmit. To avoid continuously repeated conflicts, some scheme must be devised for introducing a random retransmission delay, spreading the conflicting packets over time. A second method for using the radio channel is to modify the completely unsynchronized use of the ALOHA channel by "slotting" time into segments whose duration is exactly equal to the transmission time of a single packet (assuming constant-length packets). If we require each user to start his packets only at the beginning of a slot, then when two packets conflict, they will overlap completely rather than partially, providing an increase in channel efficiency. This method is referred to as "slotted ALOHA" [11]-[13].

The radio channel as considered in this paper is characterized as a wide-band channel with a propagation delay between any source-destination pair which is very small compared to the packet transmission time.1 This suggests a third approach for using the channel; namely, the carrier sense multiple-access (CSMA) mode. In this scheme one attempts to avoid collisions by listening to (i.e., "sensing") the carrier due to another user's transmission. Based on this information about the state of the channel, one may think of various actions to be taken by the terminal. Two protocols will be described and analyzed which we call "persistent" CSMA protocols: the nonpersistent and the p-persistent CSMA. Below, we present the protocols, discuss the assumptions, and finally establish and display the throughput-delay performance for each.

II. CSMA TRANSMISSION PROTOCOLS AND SYSTEM ASSUMPTIONS

The various protocols considered below differ by the action (pertaining to packet transmission) that a terminal takes after sensing the channel. However, in all cases, when a terminal learns that its transmission was unsuccessful, it reschedules the transmission of the packet according to a randomly distributed retransmission delay. At this new point in time, the transmitter senses the channel and repeats the algorithm dictated by the protocol. At any instant a terminal is called a ready terminal if it has a packet ready for transmission at this instant (either a new packet just generated or a previously conflicted packet rescheduled for transmission at this instant).

A terminal may, at any one time, either be transmitting or receiving (but not both simultaneously). However, the delay incurred to switch from one mode to the other is negligible. Furthermore, the time required to detect the carrier due to packet transmissions is negligible (that is, a zero detection time is assumed). All packets are of constant length and are transmitted over an assumed noiseless channel (i.e., the errors in packet reception caused by random noise are not considered to be a serious problem and are neglected in comparison with errors caused by overlap interference). The system assumes noncapture (i.e., the overlap of any fraction of two packets results in destructive interference and both packets must be retransmitted). We further simplify the problem by assuming the propagation delay (small compared to the packet transmission time) to be identical for all source-destination pairs.

We first consider the nonpersistent CSMA. The idea here is to limit the interference among packets by always rescheduling a packet which finds the channel busy upon arrival. More precisely, a ready terminal senses the channel and operates as follows.

1) If the channel is sensed idle, it transmits the packet.
2) If the channel is sensed busy, then the terminal schedules the retransmission of the packet to some later time according to the retransmission delay distribution. At this new point in time, it senses the channel and repeats the algorithm described.

A slotted version of the nonpersistent CSMA can be

1 Consider, for example, 1000-bit packets transmitted over a channel operating at a speed of 100 kbits/s. The transmission time of a packet is then 10 ms. If the maximum distance between the source and the destination is 10 mi, then the (speed of light) packet propagation delay is of the order of 54 μs. Thus the propagation delay constitutes only a very small fraction (a = 0.005) of the
transmission time of a packet. On the contrary, when one considers
satellite channels [13} the propagation delay is a relatively large multiple of the packet transmission time (a » 1). 2 Sensing carrier prior to transmission is a well-known concept in use for (voice) aircraft communication. In the context of packet radio channels, it was originally suggested by D.Wax of the University of Hawaii in an internal memorandum dated Mar. 4, 1971.
3 Each terminal has the capability of sensing carrier on the channel. The practical problems of feasibility and implementation of sensing, however, are not addressed here. 4 The detection time is considered negligible for relatively wideband channels (100 kHz). In Part II [l9] the detection time on the "busy-tone" narrow-band channels (on the order of 2 kHz) will be accounted for in the analysis. 6 By considering this constant propagation delay equal to the largest possible, one gets lower (i.e., pessimistic) bounds on performance.
considered in which the time axis is slotted and the slot size is $\tau$ seconds (the propagation delay). All terminals are synchronized⁶ and are forced to start transmission only at the beginning of a slot. When a packet's arrival occurs during a slot, the terminal senses the channel at the beginning of the next slot and operates according to the protocol described above.

We next consider the p-persistent CSMA protocol. However, before treating the general case (arbitrary $p$), we introduce the special case of $p = 1$.

The 1-persistent CSMA protocol is devised in order to (presumably) achieve acceptable throughput by never letting the channel go idle if some ready terminal is available. More precisely, a ready terminal senses the channel and operates as follows.
1) If the channel is sensed idle, it transmits the packet with probability one.
2) If the channel is sensed busy, it waits until the channel goes idle (i.e., persisting on transmitting) and only then transmits the packet (with probability one; hence the name 1-persistent).

A slotted version of this 1-persistent CSMA can also be considered by slotting the time axis and synchronizing the transmission of packets in much the same way as for the previous protocol.

The above 1-persistent and nonpersistent protocols differ by the probability (one or zero) of not rescheduling a packet which upon arrival finds the channel busy. In the case of a 1-persistent CSMA, we note that whenever two or more terminals become ready during a transmission period (TP), they wait for the channel to become idle (at the end of that transmission) and then they all transmit with probability one. A conflict will also occur with probability one! The idea of randomizing the starting time of transmission of packets accumulating at the end of a TP suggests itself for interference reduction and throughput improvement.
The scheme consists of including an additional parameter $p$, the probability that a ready packet persists ($1 - p$ being the probability of delaying transmission by $\tau$ seconds). The parameter $p$ will be chosen so as to reduce the level of interference while keeping the idle periods between any two consecutive nonoverlapped transmissions as small as possible. This gives rise to the p-persistent CSMA, which is a generalization of the 1-persistent CSMA.

More precisely, the protocol consists of the following: the time axis is finely slotted where the (mini) slot size is $\tau$ seconds. For simplicity of analysis, we consider the system to be synchronized such that all packets begin their transmission at the beginning of a (mini) slot. Consider a ready terminal. If the channel is sensed idle, then: with probability $p$, the terminal transmits the packet; or with probability $1 - p$, the terminal delays the transmission of the packet by $\tau$ seconds (i.e., one slot). If at this new point in time, the channel is still detected idle, the same process is repeated. Otherwise, some packet must have started transmission, and our terminal schedules the retransmission of the packet according to the retransmission delay distribution (i.e., acts as if it had conflicted and learned about the conflict). If the ready terminal senses the channel busy, it waits until it becomes idle (at the end of the current transmission) and then operates as above.

⁶ In this paper, the practical problems involved in synchronizing terminals are not addressed.
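The three sensing rules described in this section can be summarized as small decision functions. The following Python fragment is only an illustrative sketch: the function names, the action labels, and the `already_deferred` flag are assumptions introduced here for illustration and do not appear in the paper. It captures a single decision of a ready terminal, not the timing of the channel.

```python
import random

# Action labels (hypothetical names, chosen for this sketch only).
TRANSMIT = "transmit"
RESCHEDULE = "reschedule"            # draw a new time from the retransmission delay distribution
WAIT_UNTIL_IDLE = "wait_until_idle"  # persist until the current transmission ends
DEFER_ONE_SLOT = "defer_one_slot"    # delay by one (mini) slot of tau seconds


def nonpersistent_action(channel_idle):
    """Nonpersistent CSMA: transmit if idle, otherwise reschedule."""
    return TRANSMIT if channel_idle else RESCHEDULE


def one_persistent_action(channel_idle):
    """1-persistent CSMA: transmit if idle, otherwise wait and then transmit."""
    return TRANSMIT if channel_idle else WAIT_UNTIL_IDLE


def p_persistent_action(channel_idle, p, already_deferred, rng):
    """p-persistent CSMA decision for one (mini) slot.

    already_deferred is True when the terminal found the channel idle earlier
    and delayed by one slot; a busy channel then means another packet started,
    so the terminal acts as if it had conflicted.
    """
    if channel_idle:
        return TRANSMIT if rng.random() < p else DEFER_ONE_SLOT
    if already_deferred:
        return RESCHEDULE
    return WAIT_UNTIL_IDLE


if __name__ == "__main__":
    rng = random.Random(1)
    print(nonpersistent_action(False))
    print(one_persistent_action(False))
    print(p_persistent_action(True, 0.1, False, rng))
```

With $p = 1$ the p-persistent rule never defers on an idle channel, which is the sense in which it generalizes the 1-persistent rule.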
III. TRAFFIC MODEL: ASSUMPTIONS AND NOTATION
In the previous section, we identified the system protocols, operating procedures, and assumptions. Here we characterize the traffic source and its underlying assumptions.

We assume that our traffic source consists of an infinite number of users who collectively form an independent Poisson source with an aggregate mean packet generation rate of $\lambda$ packets/s. This is an approximation to a large but finite population in which each user generates packets infrequently and each packet can be successfully transmitted in a time interval much less than the average time between successive packets generated by a given user. Each user in the infinite population is assumed to have at most one packet requiring transmission at any time (including any previously blocked packet).

In addition, we characterize the traffic as follows. We have assumed that each packet is of constant length requiring $T$ seconds for transmission. Let $S = \lambda T$. $S$ is the average number of packets generated per transmission time, i.e., it is the input rate normalized with respect to $T$. Under steady-state conditions, $S$ can also be referred to as the channel throughput rate. Now, if we were able to perfectly schedule the packets into the available channel space with absolutely no overlap or gaps between the packets, we could achieve a maximum throughput equal to 1; therefore we also refer to $S$ as the channel utilization. Because of the interference problem inherent in the random nature of the access modes, the achievable throughput will always be less than 1. The maximum achievable throughput for an access mode is called the capacity of the channel under that mode.

Since conflicts can occur, some acknowledgment scheme is necessary to inform the transmitter of its success or failure. We assume a positive acknowledgment scheme⁷: if within some specified delay (an appropriate time-out period) after the transmission of a packet, a user does not receive an acknowledgment, he knows he has conflicted.
If he now retransmits immediately, and if all users behave likewise, then he will definitely be interfered with again (and forever!). Consequently, as mentioned above, each user delays the transmission of a previously collided packet by some random time whose mean is $\bar{X}$ (chosen, for example, uniformly between 0 and $X_{\max} = 2\bar{X}$). The traffic offered to the channel from our collection of users consists not only of new packets but also of previously collided packets: this increases the mean offered traffic rate which we denote by $G$ (packets per transmission time $T$) where $G \ge S$.

⁷ The channel for acknowledgment is assumed to be separate from the channel we are studying (i.e., acknowledgments arrive reliably and at no cost).
Our two further assumptions are the following.

Assumption 1: The average retransmission delay $\bar{X}$ is large compared to $T$.

Assumption 2: The interarrival times of the point process defined by the start times of all the packets plus retransmissions are independent and exponentially distributed.

It is clear that Assumption 2 is violated in the protocols we consider. (We have introduced it for analytic simplicity.) However, in Section V, some simulation results are discussed which show that performance results based on this assumption are excellent approximations, particularly when the average retransmission delay $\bar{X}$ is large compared to $T$. Moreover, in the context of slotted ALOHA it was analytically shown [14], in the limit as $\bar{X} \to \infty$, that Assumption 2 is satisfied; furthermore, simulation results showed that only the first moment of the retransmission delay distribution had a noticeable effect on the average throughput-delay performance.

So far, we have defined the following important system variables: $S$ (throughput), $G$ (offered channel traffic rate), $T$ (packet transmission time), $\bar{X}$ (average retransmission delay), $\tau$ (propagation delay), and $p$ (p-persistent parameter). Without loss of generality, we choose $T = 1$. This is equivalent to expressing time in units of $T$. We express $\bar{X}$ and $\tau$ in these normalized time units as $\delta = \bar{X}/T$ and $a = \tau/T$.
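As a worked instance of the normalization $a = \tau/T$, the numbers quoted in footnote 1 (1000-bit packets, a 100 kbit/s channel, a 10 mi maximum range) can be plugged in directly. The short Python sketch below is illustrative only; the function name and constants are assumptions made here, and free-space propagation at the speed of light is assumed, as in the footnote.

```python
# Physical constants used for the example (exact SI / international values).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
METERS_PER_MILE = 1609.344


def normalized_propagation_delay(packet_bits, rate_bits_per_s, distance_m):
    """Return (T, tau, a): transmission time, propagation delay, and a = tau / T."""
    T = packet_bits / rate_bits_per_s            # packet transmission time in seconds
    tau = distance_m / SPEED_OF_LIGHT_M_PER_S    # one-way propagation delay in seconds
    return T, tau, tau / T


if __name__ == "__main__":
    T, tau, a = normalized_propagation_delay(1000, 100e3, 10 * METERS_PER_MILE)
    print(f"T = {T * 1e3:.1f} ms, tau = {tau * 1e6:.1f} us, a = {a:.4f}")
```

The result agrees with the order of magnitude given in the footnote: a transmission time of 10 ms, a propagation delay of roughly 54 μs, and $a$ close to 0.005.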
Fig. 1. Throughput in ALOHA channels. [Plot not reproduced; horizontal axis: G (offered channel traffic).]
IV. THROUGHPUT ANALYSIS

We wish to solve for the channel capacity of the system for all of the access protocols described above. This we do by solving for $S$ in terms of $G$ (as well as the other system parameters). The channel capacity is then found by maximizing $S$ with respect to $G$. $S/G$ is merely the probability of a successful transmission and $G/S$ is the average number of times a packet must be transmitted (or scheduled) until success. In Section V, we discuss delay and give the throughput-delay tradeoff for these protocols.

This analysis is based on renewal theory and probabilistic arguments requiring independence of random variables provided by Assumption 2. Moreover, steady-state conditions are assumed to exist. However, from the $(S, G)$ relationships found below one can see that steady state may not exist because of inherent instability of these random-access techniques. This instability is simply explained by the fact that when statistical fluctuations in $G$ increase the level of mutual interference among transmissions, then the positive feedback causes the throughput to decrease to 0. Nevertheless, the results are useful for the following reasons.
1) They are meaningful for a finite (and possibly long) period of time. (Simulations supporting these analytic results showed no saturation over the simulated period of time when $\bar{X}$ was large enough; see Section V.)
2) In finite population cases, stable situations are possible for which steady-state results prevail over an infinite time horizon. (See [14] and [16].)
3) Control procedures have been prescribed for the slotted ALOHA random access [14] which stabilize unstable channels, achieving performance very close to the equilibrium results.

A. ALOHA Channels

In the pure ALOHA access mode, each terminal transmits its packet over the data channel in a completely unsynchronized manner. Under the system and model assumptions (mainly Assumption 2), we have

$$S = G P_s,$$

where $P_s$ is the probability that an arbitrary offered packet is successful. A given packet will overlap with another packet if there exists at least one start of transmission within $T$ seconds before or after the start time of the given packet (i.e., over a "vulnerable" interval of length $2T$). Using the Poisson traffic assumption, Abramson [2] first showed that

$$S = G e^{-2G}. \tag{1}$$

Thus, we see that pure ALOHA achieves a maximum throughput of $1/(2e) \approx 0.184$ (at $G = 1/2$). In the slotted ALOHA, if two packets conflict, they will overlap completely rather than partially (i.e., a vulnerable interval only of length $T$). The throughput equation then becomes

$$S = G e^{-G} \tag{2}$$

and was first obtained by Roberts [12], who extended Abramson's result in (1). With this simple change, the maximum throughput is increased by a factor of two to $1/e \approx 0.368$ (at $G = 1$). In Fig. 1, we plot the throughput versus the offered traffic $G$ for these two systems. From these results, it is all too evident that a significant fraction of the channel's ultimate capacity ($C = 1$) is not utilized with the ALOHA access modes; we recover a major portion of this loss with the CSMA protocols, as we now show.
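The two ALOHA capacities quoted above can be checked numerically. The Python sketch below evaluates (1) and (2) on a grid and locates the maximizing offered traffic; the function names and the grid search helper are assumptions made for this illustration (the maxima also follow directly from setting $dS/dG = 0$).

```python
import math


def pure_aloha_throughput(G):
    """Equation (1): S = G * exp(-2G)."""
    return G * math.exp(-2.0 * G)


def slotted_aloha_throughput(G):
    """Equation (2): S = G * exp(-G)."""
    return G * math.exp(-G)


def argmax_on_grid(f, lo, hi, steps=100_000):
    """Return (G*, S*) maximizing f over an evenly spaced grid on [lo, hi]."""
    best_G, best_S = lo, f(lo)
    for i in range(1, steps + 1):
        G = lo + (hi - lo) * i / steps
        S = f(G)
        if S > best_S:
            best_G, best_S = G, S
    return best_G, best_S


if __name__ == "__main__":
    for name, f in [("pure", pure_aloha_throughput), ("slotted", slotted_aloha_throughput)]:
        G_star, S_star = argmax_on_grid(f, 0.0, 5.0)
        print(f"{name} ALOHA: capacity {S_star:.3f} at G = {G_star:.3f}")
```

The grid search is deliberately crude; it is meant to confirm the closed-form maxima rather than to replace them.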
B. Nonpersistent CSMA

The basic equation for the throughput $S$ is expressed in terms of $a$ (the ratio of propagation delay to packet transmission time) and $G$ (the offered traffic rate) as follows:

$$S = \frac{G e^{-aG}}{G(1 + 2a) + e^{-aG}}. \tag{3}$$

Proof: $G$ denotes the arrival rate of new and rescheduled packets. All arrivals, in this case, do not necessarily result in actual transmissions (a packet which finds the channel in a busy state is rescheduled without being transmitted). Thus, $G$ constitutes the "offered" channel traffic and only a fraction of it constitutes the channel traffic itself.

Consider the time axis⁸ (see Fig. 2)⁹ and let $t$ be the time of arrival of a packet which senses the channel idle and such that no other packet is in the process of transmission. Any other packet arriving between $t$ and $t + a$ will find (sense) the channel as unused, will transmit, and hence will cause a conflict. If no other terminal transmits a packet during these $a$ seconds (the "vulnerable" period), then the first packet will be successful. Let $t + Y$ be the time of occurrence of the last packet arriving between $t$ and $t + a$. The transmission of all packets arriving in $(t, t + Y)$ will be completed at $t + Y + 1$. Only $a$ seconds later will the channel be sensed unused. Now, any terminal becoming ready between $t + a$ and $t + Y + 1 + a$ will sense the channel busy and hence will reschedule its packet. The interval between $t$ and $t + Y + 1 + a$ is called a transmission period (TP). Note that there can be at most one successful transmission during a TP.

Define an idle period to be the period of time between two consecutive TP's (also called busy periods in this simple case). A busy period plus the following idle period constitute a cycle. Let $\bar{B}$ be the expected duration of the busy period, $\bar{I}$ the expected duration of the idle period, and $\bar{B} + \bar{I}$ the expected length of a cycle. Let $\bar{U}$ denote the time during a cycle that the channel is used without conflicts. Using renewal theory arguments, the average channel utilization is simply given by

$$S = \frac{\bar{U}}{\bar{B} + \bar{I}}. \tag{4}$$

Fig. 2. Nonpersistent CSMA: Busy and idle periods. [Timing diagram not reproduced; it shows an unsuccessful and a successful transmission period, the busy period, and the idle period along the normalized time axis.]

⁸ The reference time axis considered in this and subsequent proofs is the transmitter's time. Shifting all transmissions by $\tau$ seconds will give a description of events on the station's time axis. Any time overlap in transmission on the station's time axis results in packet interference.
⁹ In this and other figures, a vertical arrow represents a terminal becoming ready.

The probability that a TP is successful is simply the probability that no terminal transmits during the first $a$ seconds of the period and is equal to $e^{-aG}$. Therefore

$$\bar{U} = e^{-aG}. \tag{5}$$

The average duration of an idle period is simply $1/G$. The average duration of a busy interval is $1 + \bar{Y} + a$, where $\bar{Y}$ is the expected value of $Y$. The distribution function for $Y$ is

$$F_Y(y) \triangleq \Pr\{Y \le y\} = \Pr\{\text{no arrival occurs in an interval of length } a - y\} = \exp\{-G(a - y)\}, \qquad (y \le a). \tag{6}$$

The average of $Y$ is therefore given by

$$\bar{Y} = a - \frac{1}{G}\left(1 - e^{-aG}\right). \tag{7}$$

Applying (4) and using the expressions found for $\bar{U}$, $\bar{B}$, and $\bar{I}$, we get (3). Q.E.D.

It is easy to prove that the throughput equation for the slotted nonpersistent CSMA is given by

$$S = \frac{aG e^{-aG}}{(1 - e^{-aG}) + a}. \tag{8}$$

Note that for both cases we have

$$\lim_{a \to 0} S = \frac{G}{1 + G}. \tag{9}$$
Equation (9) shows that when $a = 0$, a throughput of 1 can theoretically be attained for an offered channel traffic equal to infinity. $S$ versus $G$ for various values of $a$ is plotted in Fig. 3.

Fig. 3. Throughput in nonpersistent CSMA. [Plot not reproduced; horizontal axis: G (offered channel traffic).]

C. 1-Persistent CSMA

The throughput equation for this protocol is given by

$$S = \frac{G\left[1 + G + aG\left(1 + G + aG/2\right)\right] e^{-G(1+2a)}}{G(1 + 2a) - \left(1 - e^{-aG}\right) + (1 + aG)\, e^{-G(1+a)}}. \tag{10}$$

Proof: Consider Fig. 4 and again let $t$ be the time of arrival of a packet which senses the channel to be idle with no other packet in the process of transmission. In this protocol, any packet arriving in the interval $[t + a,\ t + Y + 1 + a]$ will sense the channel busy and hence must wait until the channel is sensed idle (at time $t + 1 + Y + a$), at which time they will all transmit! The number of packets accumulated at the end of a TP is the number of arrivals in $1 + Y$ seconds. If this total is equal to or greater than two, then a conflict occurs in the next TP with probability 1.

Define a busy period to be the time between $t$ and the end of that TP during which no packets accumulate. Define an idle period to be the period of time in which the channel is idle and no packets are present awaiting transmission. A busy period plus the following idle period constitute a cycle. Let $\bar{B}$ be the expected duration of the busy period, $\bar{I}$ the expected duration of the idle period, and $\bar{B} + \bar{I}$ the expected length of a cycle.

Fig. 4. 1-persistent CSMA: TP's, busy periods, and idle periods. [Timing diagram not reproduced.]

Let us now consider the transmission of an arbitrary packet. Three situations must be considered.
1) If the packet arrives to an idle system, then its transmission is successful if and only if no packets arrive during its first $a$ seconds; its probability of success is therefore $e^{-aG}$.
2) If the packet arrives during the first $a$ seconds of a TP, then its probability of success is 0.
3) If the packet arrives during the channel busy period (excluding the first $a$ seconds of the TP), then it is successful (in the next TP) if and only if it is the only packet to arrive during this TP and no packets arrive during its first $a$ seconds. To calculate this probability of success, we observe that a TP is of random length equal to $1 + a + Y$, where $Y$ is a random variable. Let $B'$ denote the time during a cycle that the channel is in its busy period excluding the first $a$ seconds of each TP. $B'$ is a sequence of segments of random length $1 + Y \triangleq Z$ separated by periods of $a$ seconds. Knowing that a packet arrives in $B'$, this packet is more likely to arrive in a longer segment $Z$ than in a shorter one (due to the "paradox of residual life" [17]). Let $\hat{Z}$ denote the segment in which the arrival occurred, and $\hat{q}_0$ (derived below) be the probability that no arrival occurs in $\hat{Z}$; the probability of success of the packet is therefore $\hat{q}_0 e^{-aG}$.

Only cases 1) and 3) contribute to a successful transmission. Let $\bar{B}'$ be the expected value of $B'$. From renewal theory arguments, the probability that an arrival finds the channel idle [case 1)] is given by $\bar{I}/(\bar{B} + \bar{I})$, and the probability that an arrival finds the channel in situation 3) is $\bar{B}'/(\bar{B} + \bar{I})$; then the probability of success of the packet is given by

$$P_s \triangleq \Pr\{\text{success}\} = \frac{\bar{I}}{\bar{B} + \bar{I}}\, e^{-aG} + \frac{\bar{B}'}{\bar{B} + \bar{I}}\, \hat{q}_0 e^{-aG}. \tag{11}$$

The determination of $\bar{I}$, $\bar{B}$, $\bar{B}'$, and $\hat{q}_0$ follows. Since the traffic is Poisson, it is clear that the average idle period is given by

$$\bar{I} = 1/G. \tag{12}$$

For $\bar{B}$, $\bar{B}'$, and $\hat{q}_0$ we must first obtain some intermediate results as follows. The distribution function for $Y$ and its average are given in (6) and (7), respectively. The Laplace transform of the probability density function of $Y$, defined as

$$F_Y^*(s) \triangleq \int_0^\infty e^{-sy}\, dF_Y(y),$$

is given by

$$F_Y^*(s) = e^{-aG} + \frac{G}{G - s}\left(e^{-as} - e^{-aG}\right). \tag{13}$$

Let us now find the distribution of the number of packets accumulated at the end of a TP. Let

$$q_m(y) \triangleq \Pr\{m \text{ packets accumulated at end of TP} \mid Y = y\}$$

and

$$q_m = \int_0^a q_m(y)\, dF_Y(y).$$

Let $Q(z)$ denote the generating function of $q_m$ defined by

$$Q(z) \triangleq \sum_{m=0}^{\infty} q_m z^m.$$

The number of packets accumulated at the end of a TP is equal to the number of packets arriving during a period of time equal to $1 + y$. Let $m_1$ denote the number of packets arriving in $T = 1$, and $m_2$ the number of packets arriving in $Y$. Let $Q_1(z)$ and $Q_2(z)$ denote the generating functions of the probability distributions for $m_1$ and $m_2$, respectively. Since the arrival process is Poisson, the random variables $m_1$ and $m_2$ are independent and the generating function $Q(z)$ of $q_m$, where $m = m_1 + m_2$, is given by

$$Q(z) = Q_1(z)\, Q_2(z).$$

We have [17]

$$Q_1(z) = \exp\{G(z - 1)\}$$

and

$$Q_2(z) = F_Y^*(G - Gz).$$

From (13) we get

$$Q(z) = \exp\{G(z - 1)\}\, \exp\{-aG\}\left[1 + \frac{\exp\{aGz\} - 1}{z}\right]. \tag{14}$$

We may invert this explicit expression for $Q(z)$; in particular we find that the probability of zero packets accumulated at the end of a TP is

$$q_0 = Q(z)\big|_{z=0} = \exp\{-G(1 + a)\}\left[1 + aG\right]. \tag{15}$$

To find the average busy period, we let $Y_i$ denote the random variable $Y$ defined above corresponding to the $i$th TP in the busy period. All $Y_i$, $i = 1, 2, \ldots$, are independent and identically distributed. It is easy to see that the number of TP's in a busy period is geometrically distributed with mean $1/q_0$. Conditioned on the fact that we have exactly $k$ TP's in the busy period and that $Y_i = y_i$ for $i = 1, 2, \ldots, k$, the average busy period is

$$\bar{B}(y_1, y_2, \ldots, y_k) = k(1 + a) + y_1 + y_2 + \cdots + y_k.$$

Therefore, by removing the conditions on $k$ and $y_i$ we get

$$\bar{B} = \sum_{k=1}^{\infty} \int_{y_1=0}^{a} \cdots \int_{y_k=0}^{a} \left[k(1 + a) + y_1 + \cdots + y_k\right] q_0(y_k) \prod_{j=1}^{k-1}\left(1 - q_0(y_j)\right) dF_{Y_1}(y_1) \cdots dF_{Y_k}(y_k).$$

It is easy to see that by inverting the order of summation and integration, the contribution of the term $k(1 + a)$ reduces to $(1 + a)/q_0$ and the contribution of the generic term $y_j$ simply reduces to $\bar{Y}(1 - q_0)^{j-1}$. Finally, we have

$$\bar{B} = \frac{1 + a}{q_0} + \sum_{j=1}^{\infty} \bar{Y}(1 - q_0)^{j-1} = \frac{1 + a + \bar{Y}}{q_0}. \tag{16}$$

Since the average number of TP's is $1/q_0$, from the distribution of $B'$ we have

$$\bar{B}' = \frac{1 + \bar{Y}}{q_0}. \tag{17}$$

In (11) for $P_s$, it remains only to compute $\hat{q}_0$. The probability density function of $Z = 1 + Y$ is easily obtained from the distribution of $Y$. From (6), the probability density function of $Y$ can be expressed as

$$f_Y(y) = \exp\{-aG\}\, u_0(y) + G \exp\{-aG\} \exp\{Gy\}, \qquad 0 \le y \le a,$$

where $u_0(y)$ is the unit impulse at $y = 0$. Thus we have

$$f_Z(x) = \exp\{-aG\}\, u_0(x - 1) + G \exp\{-aG\} \exp\{G(x - 1)\}, \qquad 1 \le x \le 1 + a.$$

The probability density function of $\hat{Z}$ is given by [17]

$$f_{\hat{Z}}(x) = \frac{x f_Z(x)}{\bar{Z}} = \frac{x e^{-aG}}{1 + \bar{Y}}\, u_0(x - 1) + \frac{G x e^{-aG} e^{G(x-1)}}{1 + \bar{Y}}, \qquad 1 \le x \le 1 + a.$$

Finally, the probability that no arrival occurs (from our Poisson source) in the interval $\hat{Z}$ is simply

$$\hat{q}_0 = \int_1^{1+a} \exp\{-Gx\}\, f_{\hat{Z}}(x)\, dx = \frac{\exp\{-G(1 + a)\}}{1 + \bar{Y}}\left[1 + aG\left(1 + a/2\right)\right]. \tag{18}$$

Using our expressions for $\bar{I}$, $\bar{B}$, $\bar{B}'$, and $\hat{q}_0$ in (12), (16), (17), and (18), respectively, we immediately obtain from (11)

$$P_s = \frac{\left[\dfrac{1}{G} + \dfrac{1 + \bar{Y}}{q_0}\, \hat{q}_0\right] e^{-aG}}{\dfrac{1 + a + \bar{Y}}{q_0} + \dfrac{1}{G}}.$$

Substituting the expressions obtained for $q_0$, $\hat{q}_0$, and $\bar{Y}$, and recalling that $S = G P_s$, we have finally established (10).
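The final substitution can be verified numerically by assembling $P_s$ from its components and comparing $G P_s$ with the closed form (10). The Python sketch below is a check under the formulas as stated in (7), (11), (12), and (15) through (18); the function names are assumptions introduced for this illustration.

```python
import math


def one_persistent_closed_form(G, a):
    """Equation (10)."""
    num = G * (1.0 + G + a * G * (1.0 + G + a * G / 2.0)) * math.exp(-G * (1.0 + 2.0 * a))
    den = (G * (1.0 + 2.0 * a) - (1.0 - math.exp(-a * G))
           + (1.0 + a * G) * math.exp(-G * (1.0 + a)))
    return num / den


def one_persistent_from_components(G, a):
    """S = G * P_s with P_s built from (11) and its ingredients."""
    Y = a - (1.0 - math.exp(-a * G)) / G                        # (7)
    I = 1.0 / G                                                  # (12)
    q0 = math.exp(-G * (1.0 + a)) * (1.0 + a * G)                # (15)
    B = (1.0 + a + Y) / q0                                       # (16)
    B_prime = (1.0 + Y) / q0                                     # (17)
    q0_hat = (math.exp(-G * (1.0 + a))
              * (1.0 + a * G * (1.0 + a / 2.0)) / (1.0 + Y))     # (18)
    P_s = (I + B_prime * q0_hat) * math.exp(-a * G) / (B + I)    # (11)
    return G * P_s


if __name__ == "__main__":
    for G in (0.1, 1.0, 5.0):
        for a in (0.0, 0.01, 0.1):
            print(G, a, one_persistent_from_components(G, a), one_persistent_closed_form(G, a))
```

Agreement of the two columns for a range of $G$ and $a$ is a consistency check on the algebra, not an independent validation of the model assumptions.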
Q.E.D.

Slotted 1-persistent CSMA: Let us now consider the slotted version of 1-persistent CSMA. The throughput equation for this case is given by

$$S = \frac{G \exp\{-G(1 + a)\}\left[1 + a - \exp\{-aG\}\right]}{(1 + a)\left(1 - \exp\{-aG\}\right) + a \exp\{-G(1 + a)\}}. \tag{19}$$

Proof: In this slotted version, as in slotted ALOHA, if two packets conflict, they will overlap completely. The length of a TP is always equal to $1 + a$. (We have assumed that the packet transmission time is an integer multiple of the propagation delay.) Since the traffic process is an independent one (Assumption 2), the number of slots in an idle period is geometrically distributed with a mean equal to $1/(1 - e^{-aG})$. Thus the average idle period is given by

$$\bar{I} = \frac{a}{1 - e^{-aG}}. \tag{20}$$

Using a similar argument, we find that the average busy period is given by

$$\bar{B} = \frac{1 + a}{\exp\{-G(1 + a)\}}. \tag{21}$$

Let $\bar{U}$ again denote the expected time during a cycle that the channel is used without conflicts. In order to find $\bar{U}$ we need to determine the probability of success over each TP in the busy period. The probability of success over the first TP is given by

$$\Pr\{\text{success over first TP}\} = \Pr\{\text{only one packet arrives during the last slot of the preceding idle period} \mid \text{some arrival occurred}\} = \frac{aG e^{-aG}}{1 - e^{-aG}}.$$

Similarly we have:

$$\Pr\{\text{success over any other TP}\} = \frac{G(1 + a) \exp\{-G(1 + a)\}}{1 - \exp\{-G(1 + a)\}}.$$

The number of TP's in a busy period is geometrically distributed with a mean equal to $\exp\{G(1 + a)\} = 1/q_0$, thus

$$\bar{U} = \frac{aG \exp\{-aG\}}{1 - \exp\{-aG\}} + \left(\frac{1}{q_0} - 1\right) \frac{G(1 + a) \exp\{-G(1 + a)\}}{1 - \exp\{-G(1 + a)\}}. \tag{22}$$

Applying (4) and using the expressions found for $\bar{U}$, $\bar{I}$, and $\bar{B}$, we get (19). Q.E.D.

The ultimate performance in the ideal case ($a = 0$), for both slotted and unslotted versions, is

$$S = \frac{G e^{-G}(1 + G)}{G + e^{-G}}. \tag{23}$$

For any value of $a$, the maximum throughput $S$ will occur at an optimum value of $G$. In Fig. 5 we show $S$ versus $G$ for the nonslotted version of 1-persistent CSMA for various values of $a$.

Fig. 5. Throughput in 1-persistent CSMA. [Plot not reproduced.]

Fig. 6. p-persistent CSMA: TP's, busy periods, and idle periods. [Timing diagram not reproduced.]

D. p-Persistent CSMA

For a given offered traffic $G$ and a given value of the parameter $p$, we can determine the throughput $S$ as

$$S(G, p, a) = \frac{\left(1 - e^{-aG}\right)\left[P_s' \pi_0 + P_s (1 - \pi_0)\right]}{\left(1 - e^{-aG}\right)\left[a\bar{t}' \pi_0 + a\bar{t}(1 - \pi_0) + 1 + a\right] + a\pi_0} \tag{24}$$
where $P_s'$, $P_s$, $\bar{t}'$, $\bar{t}$, and $\pi_0$ are defined in the following proof in (37), (34), (36), (30), and (25), respectively.

Proof: Consider a TP and assume that some packets arrive during the period as shown in Fig. 6. These packets sense the channel busy and accumulate at the end of the TP, at which point they randomize the starting times of their transmission according to the randomizing process described in Section II. This randomization creates a random delay before a TP starts, called the initial random transmission delay (IRTD), during which time the channel is "wasted." If, at the start of a new TP, two or more terminals decide to transmit, then a conflict will certainly occur. All other packets which have delayed their transmission by $\tau$ seconds will then sense the channel busy and will have to be rescheduled for transmission by incurring a retransmission delay $\delta$. Thus, at the expense of creating this IRTD, we greatly improve the probability of success over a TP.

Consider Fig. 6 in which we observe two TP's separated by an IRTD. One can also define busy periods and idle periods in much the same way as before. An idle period is that period of time during which the channel is idle and no packets are ready for transmission. A busy period consists of a sequence of transmission periods such that some packets arrive during each transmission period except the last one. Let $J_i$ denote the $i$th TP of a busy period. In order to find the channel utilization, we once again apply (4), which requires identifying and determining the average busy and idle periods, the gaps between TP's, as well as the condition for success over each TP. This we do as follows.

Recall that we require the system to be (mini-) slotted (the slot size equal to $a$, the normalized propagation delay) and all transmissions to start at the beginning of a slot. Here again we consider the transmission time of a packet to be an integer number $1/a$ of slots (recall $T = 1$). Let $g = aG$; $g$ is the average arrival rate of new and rescheduled packets during a (mini) slot. We first determine the distribution of the number of packets accumulated at the end of a TP. Let $N$ denote this number and let $\pi_n \triangleq \Pr\{N = n\}$. According to the
639
Fifty Years of Communications and Networking
protocol described in Section II, only those packets arriving during a TP will accumulate at the end of that TP. Therefore, by Assumption 1, \ve have 1r n
=
[(1
+ a)GJn
exp f - (1
11.1
+ a)G},
n
2
O.
{in>
k} =
.
q(k+l)n
IIk[ClOL j=zl
exp {-g}
gm
]
~ qm(Jc-i+ 1)
m==O
Pr {L n =
l\
00
=
L
k=l
= q(k+l)n II exp {y(qk-i+ 1 - I)} j-l
>
and, therefore, for k IJr {tn = k) = Pr {tn =
exp {g (
-
k
(26)
0 we have
>
k - 1) - Pr {tn
>
q( l - qk-l) )} p - (k - 1)
=
O} = 1 - qn.
(27)
(28)
The average IRTD is given by CIO
in
=
To find the distribution of the IRTD between two successive TP's in the same busy period, we condition on $N = n$ and let $t_n$ be the number of slots elapsed until some packet is transmitted. Let $q = 1 - p$. It is easy to see that

$$\Pr\{t_n > k\} = q^{(k+1)n} \exp\left\{ g\left( \frac{q(1-q^k)}{p} - k \right) \right\} \tag{26}$$

and, for $k > 0$,

$$\Pr\{t_n = k\} = q^{kn}\left[1 - q^n \exp\{-g(1-q^k)\}\right] \exp\left\{ g\left( \frac{q(1-q^{k-1})}{p} + 1 - k \right) \right\} \tag{27}$$

and for $k = 0$, $\Pr\{t_n = 0\} = 1 - q^n$. The average IRTD is

$$\bar t_n = \sum_{k=0}^{\infty} \Pr\{t_n > k\}. \tag{29}$$

Removing the condition on $N$, we get

$$\bar t = \sum_{n=1}^{\infty} \bar t_n \frac{\pi_n}{1-\pi_0}. \tag{30}$$

$\bar t$ is the average gap between two consecutive TP's in a busy period. In order to find the probability of success over a TP $\mathcal{T}_i$ one has to distinguish two cases: $i = 1$ and $i \neq 1$. We first treat the second case, $i \neq 1$. Given $N = n$, define$^{10}$:

$L_n$ = the number of packets present at the starting time of $\mathcal{T}_i$;
$P_s(n)$ = the probability of success over $\mathcal{T}_i$.

$L_n - n$ is merely the number of packets arriving during the gap $t_n$. By the Poisson assumption we have

$$\Pr\{L_n = l \mid t_n = k\} = \frac{(kg)^{l-n}}{(l-n)!}\, e^{-kg}, \qquad l \ge n. \tag{31}$$

Removing the condition on $t_n$,

$$\Pr\{L_n = l\} = (1 - q^n)\,\delta_{l,n} + \sum_{k=1}^{\infty} \frac{(kg)^{l-n}}{(l-n)!}\, e^{-kg} \Pr\{t_n = k\}, \qquad l \ge n \tag{32}$$

where $\delta_{i,j}$ is the Kronecker delta. The probability of success over $\mathcal{T}_i$ is equal to the probability that exactly one of the $L_n$ packets transmits, given that at least one does:

$$P_s(n) = \sum_{l=n}^{\infty} \frac{l p q^{l-1}}{1 - q^l} \Pr\{L_n = l\}. \tag{33}$$

Removing the condition on $N$, we get

$$P_s = \sum_{n=1}^{\infty} P_s(n) \frac{\pi_n}{1-\pi_0}. \tag{34}$$

For the probability of success over $\mathcal{T}_1$ we note that the number of packets present at the beginning of a busy period, denoted by $N'$, is the number of packets arriving in the last slot of the previous idle period. We then have

$$\pi_n' \triangleq \Pr\{N' = n\} = \frac{g^n e^{-g}}{n!\,(1 - e^{-g})}, \qquad n \ge 1. \tag{35}$$

Given $N' = n$, let $t_n'$ denote the first initial random transmission delay of the busy period, and $P_s'(n)$ denote the probability of success over $\mathcal{T}_1$. The distribution of $t_n'$ and its average $\bar t_n'$ are the same as for $t_n$ [(27) and (29)]. $P_s'(n)$ is the same as $P_s(n)$ [see (33)]. Removing the condition on $N'$, we get

$$\bar t' = \sum_{n=1}^{\infty} \bar t_n' \pi_n' \tag{36}$$

$$P_s' = \sum_{n=1}^{\infty} P_s'(n)\, \pi_n'. \tag{37}$$

It remains to compute $\bar B$, $\bar U$, and $\bar I$. It is clear that the number of TP's in a busy period is equal to $m$ with probability $\pi_0 (1 - \pi_0)^{m-1}$. Consider a busy period with $m$ TP's. Let $N_i$ denote the number of packets accumulated at the end of the $i$th TP. We know that $N_m = 0$, and that all other $N_i \ge 1$ are independent and identically distributed random variables. Conditioned on the fact that $N_i = n_i$, $i = 1, \dots, m-1$, the average busy period is given by

$$B_m(n_1, \dots, n_{m-1}) = a \bar t' + \sum_{i=1}^{m-1} a \bar t_{n_i} + m(1 + a). \tag{38}$$

The expected time, during the busy period, that the channel is used without conflicts is given by

$$U_m(n_1, \dots, n_{m-1}) = P_s' + \sum_{i=1}^{m-1} P_s(n_i). \tag{39}$$

10 The quantities $P_s(n)$, $P_s$, and $L_n$ need no index $i$ since they are identical for all $\mathcal{T}_i$, $i \neq 1$.

On the other hand, we know that
$$\Pr\{N_i = n_i\} = \frac{\pi_{n_i}}{1 - \pi_0}, \qquad n_i \ge 1, \quad i = 1, 2, \dots, m-1. \tag{40}$$

Therefore, removing the conditions $N_i = n_i$ in (38) and (39), we get

$$\bar B_m = a \bar t' + (m-1)\, a \bar t + m(1 + a) \tag{41}$$

$$\bar U_m = P_s' + (m-1) P_s \tag{42}$$

and removing the condition on $m$ we get

$$\bar B = \sum_{m=1}^{\infty} \bar B_m\, \pi_0 (1-\pi_0)^{m-1} = a \bar t' + a \bar t\, \frac{1-\pi_0}{\pi_0} + \frac{1+a}{\pi_0} \tag{43}$$

$$\bar U = P_s' + \frac{1-\pi_0}{\pi_0} P_s. \tag{44}$$

The idle period is geometrically distributed with mean $1/(1 - e^{-g})$ slots; its average is

$$\bar I = \frac{a}{1 - e^{-g}}. \tag{45}$$

Finally, using (4) and substituting for $\bar B$, $\bar U$, and $\bar I$ the expressions found in (43), (44), and (45), respectively, we get the throughput $S$; it is a function of $G$, $p$, and $a = \tau/T$ and is expressed as

$$S(G,p,a) = \frac{P_s' + \dfrac{1-\pi_0}{\pi_0} P_s}{a \bar t' + a \bar t\, \dfrac{1-\pi_0}{\pi_0} + \dfrac{1+a}{\pi_0} + \dfrac{a}{1 - e^{-g}}} \tag{46}$$

which reduces to (24). Q.E.D.

In order to evaluate $S(G,p,a)$, a PL/1 program was written and run on the IBM 360/91 of the Campus Computing Network at UCLA. For small values of $p$ ($0.01 \le p \le 0.1$), the numerical computation as suggested by (24) becomes time consuming and requires an extremely large amount of storage. Fortunately some approximations have been found useful which lead to a closed-form solution for the throughput (see the derivations of $S'(G,p,a)$ in Appendix A).

Special case a = 0: Let us now consider the special case $a = 0$. For finite $G$, $g = aG = 0$. Equation (26) becomes

$$\Pr\{t_n > k\} = q^{(k+1)n}.$$

The average IRTD is then given by (29), and is expressed as

$$\bar t_n = \sum_{k=0}^{\infty} \Pr\{t_n > k\} = \frac{q^n}{1 - q^n}.$$

It is important to note that $\bar t_n$ is finite, and so is $\bar t$. On the other hand the idle period given in (45) becomes

$$\bar I = \frac{1}{G}.$$

Since $\bar t$ and $\bar t'$ are finite, by letting $a \to 0$ in (46) we get

$$S(G,p,a=0) = \frac{P_s' + \dfrac{1-\pi_0}{\pi_0} P_s}{\dfrac{1}{\pi_0} + \dfrac{1}{G}}. \tag{47}$$

To compute $P_s$, we have to get back to (31) through (34). With $a = 0$ we have

$$\Pr\{L_n = l \mid t_n = k\} = 1, \qquad l = n$$

and $\Pr\{L_n = n\} = 1$. Therefore

$$P_s(n) = \frac{n p q^{n-1}}{1 - q^n} \tag{48}$$

and

$$P_s = \sum_{n=1}^{\infty} \frac{n p q^{n-1}}{1 - q^n}\, \frac{\pi_n}{1 - \pi_0} \tag{49}$$

where

$$\pi_n = \frac{G^n}{n!}\, e^{-G}. \tag{50}$$

By the same token, we see from (35) that $\pi_n' = \delta_{n,1}$ and that

$$P_s' = P_s(1) = 1.$$

With these considerations, the throughput is given by

$$S(G,p,a=0) = \frac{G\,[\pi_0 + (1 - \pi_0) P_s]}{G + \pi_0} \tag{51}$$

where $P_s$ and $\pi_n$ are given in (49) and (50), respectively.

When $p = 1$, we have, from (48),

$$P_s(1) = 1, \qquad P_s(n) = 0, \quad n > 1$$

and therefore $P_s = \pi_1/(1 - \pi_0) = G e^{-G}/(1 - e^{-G})$. Equation (51) then becomes

$$S(G, p = 1, a = 0) = \frac{G(1 + G)\, e^{-G}}{G + e^{-G}}$$

which is (and should be) identical to the 1-persistent CSMA when $a = 0$ [see (23)].

Let us now consider $p \to 0$. Since $1 - q^n \approx np$, (48) then becomes $P_s(n) = q^{n-1}$.
Summing over $n$ as in (49) then gives

$$P_s = \frac{e^{-G}\,(e^{qG} - 1)}{q\,(1 - e^{-G})}.$$

In particular, $p \to 0$ gives $P_s(n) \to 1$ for all $n$, and $P_s \to 1$. In this limit the throughput is then given by

$$S(G, p \to 0, a = 0) = \frac{G}{G + e^{-G}} \tag{52}$$

which shows that a channel capacity of 1 can be achieved when $G \to \infty$.

For each value of $a$, one can plot a family of curves $S$ versus $G$ with parameter $p$ [as shown in Fig. 7(a)-(d)]. The channel capacity for each value of $p$ can be numerically determined at an optimum value of $G$. In Fig. 8 we show the channel capacity as a function of $p$, for $a$ = 0, 0.01, 0.05, and 0.1. We note that the capacity is not very sensitive to small variations of $p$; for $a = 0.01$, it reaches its highest value (i.e., the channel capacity for this protocol) at a value $p = 0.03$. When $p = 1$, the (slotted) p-persistent CSMA reduces to the slotted 1-persistent CSMA. Indeed we can check that, when $p = 1$, (24) reduces to (19), since $P_s'$, $\bar t$, and $\bar t'$ then become

$$P_s' = \frac{aG\, e^{-aG}}{1 - e^{-aG}}$$

and

$$\bar t' = \bar t = 0.$$

[Fig. 7. Channel throughput in p-persistent CSMA: S versus G (offered channel traffic) for several values of p. (a) a = 0. (b) a = 0.01. (c) a = 0.05. (d) a = 0.1.]
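The special case a = 0 lends itself to a quick numerical check. The sketch below evaluates the throughput expression (51), $S = G[\pi_0 + (1-\pi_0)P_s]/(G + \pi_0)$, with $P_s$ summed as in (49). It assumes the Poisson form $\pi_n = G^n e^{-G}/n!$ for (50); the function names and the truncation limit `n_max` are illustrative choices, not part of the paper.

```python
import math


def success_prob(G: float, p: float, n_max: int = 400) -> float:
    """P_s as in (49): sum over n >= 1 of [n p q^(n-1) / (1 - q^n)] * pi_n / (1 - pi_0).

    Assumes pi_n = G^n e^{-G} / n! (the a = 0 case); n_max truncates the sum.
    """
    q = 1.0 - p
    pi_0 = math.exp(-G)
    pi_n = pi_0  # running Poisson term, updated to pi_n inside the loop
    total = 0.0
    for n in range(1, n_max + 1):
        pi_n *= G / n
        total += n * p * q ** (n - 1) / (1.0 - q ** n) * pi_n
    return total / (1.0 - pi_0)


def throughput_a0(G: float, p: float) -> float:
    """Throughput S(G, p, a = 0) as in (51)."""
    pi_0 = math.exp(-G)
    return G * (pi_0 + (1.0 - pi_0) * success_prob(G, p)) / (G + pi_0)


if __name__ == "__main__":
    for p in (1.0, 0.1, 0.01):
        print(f"p = {p}: S(G = 2, a = 0) = {throughput_a0(2.0, p):.4f}")
```

With p = 1 the function should reproduce $G(1+G)e^{-G}/(G + e^{-G})$, and for very small p it should approach the limit (52), $G/(G + e^{-G})$.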
E. Performance Comparison and Sensitivity of Capacity to the Parameter a

To summarize, we plot in Fig. 9, for a = 0.01, S versus G for the various access modes introduced so far and thus show the relative performance of each, as also indicated in Table I. While the capacity of ALOHA channels does not depend on the propagation delay, the capacity of a CSMA channel does. An increase in a increases the vulnerable period of a packet. This also results in "older" channel state information from sensing. In Fig. 10 we plot, versus a, the channel capacity for all of the above random-access modes. We note that the capacities for nonpersistent and p-persistent CSMA are more sensitive to increases in a, as compared to the 1-persistent scheme. Nonpersistent CSMA drops below 1-persistent for larger a. Also, for large a, slotted ALOHA (and even "pure" ALOHA) is superior to any CSMA mode since decisions based on partially obsolete data are deleterious: this effect is due in part to our assumption about the constant propagation delay. (For p-persistent, numerical results are shown only for a ≤ 0.1. Clearly, for larger a, optimum p-persistent is lower-bounded by 1-persistent.)

[Fig. 8. p-persistent CSMA: effect of p on channel capacity.]

[Fig. 9. Throughput for the various access modes (a = 0.01): S versus G (offered channel traffic).]

[Fig. 10. Effect of propagation delay on channel capacity. Curves: slotted nonpersistent CSMA, optimum p-persistent CSMA, nonpersistent CSMA, slotted 1-persistent CSMA, 1-persistent CSMA, slotted ALOHA, pure ALOHA.]

TABLE I
CAPACITY C FOR THE VARIOUS PROTOCOLS CONSIDERED (a = 0.01)

Protocol                        Capacity C
Pure ALOHA                      0.184
Slotted ALOHA                   0.368
1-Persistent CSMA               0.529
Slotted 1-Persistent CSMA       0.531
0.1-Persistent CSMA             0.791
Nonpersistent CSMA              0.815
0.03-Persistent CSMA            0.827
Slotted Nonpersistent CSMA      0.857
Perfect Scheduling              1.000
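Three of the Table I entries can be reproduced with a few lines of code. The sketch below searches a grid of G for the maximum of each throughput curve. The throughput formulas are not restated in this section and are assumed here in their usual forms: $S = Ge^{-2G}$ for pure ALOHA, $S = Ge^{-G}$ for slotted ALOHA, and $S = Ge^{-aG}/[G(1+2a) + e^{-aG}]$ for nonpersistent CSMA (taken to be the expression the text cites as (3)). The helper name `capacity` and the grid parameters are illustrative.

```python
import math


def pure_aloha(G: float) -> float:
    """Assumed pure ALOHA throughput, S = G e^{-2G}."""
    return G * math.exp(-2.0 * G)


def slotted_aloha(G: float) -> float:
    """Assumed slotted ALOHA throughput, S = G e^{-G}."""
    return G * math.exp(-G)


def nonpersistent_csma(G: float, a: float) -> float:
    """Assumed nonpersistent CSMA throughput, S = G e^{-aG} / (G(1 + 2a) + e^{-aG})."""
    return G * math.exp(-a * G) / (G * (1.0 + 2.0 * a) + math.exp(-a * G))


def capacity(throughput, g_max: float = 100.0, steps: int = 100_000) -> float:
    """Largest throughput found on a uniform grid of G in (0, g_max]."""
    return max(throughput(g_max * i / steps) for i in range(1, steps + 1))


if __name__ == "__main__":
    print("pure ALOHA        ", round(capacity(pure_aloha), 3))
    print("slotted ALOHA     ", round(capacity(slotted_aloha), 3))
    print("nonpersistent CSMA", round(capacity(lambda G: nonpersistent_csma(G, 0.01)), 3))
```

A plain grid search is used rather than an optimizer because the curves are smooth and single-peaked, and the table reports only three decimal places.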
V. DELAY CONSIDERATIONS

A. Delay Model

In the previous section, we analyzed the performance of CSMA modes in terms of maximum achievable throughput. We now introduce the expected packet delay $D$, defined as the average time from when a packet is generated until it is successfully received. Our principal concern in this section is to investigate the tradeoff between the average delay $D$ and the throughput $S$.

As we have already stated, for the correct operation of the system, a positive acknowledgment scheme is needed. If an acknowledgment is not received by the sender of a packet within a specified time-out period, then the packet is retransmitted (incurring the random retransmission delay $X$, introduced to avoid repeated conflicts). For the present study, we assume the following.

Assumption 3: The acknowledgment packets are always correctly received with probability one.

The simplest way to accomplish this is to create a separate channel$^{11}$ (assumed to be available) to handle acknowledgment traffic. If sufficient bandwidth is provided to this channel, overlaps between acknowledgment packets are avoided, since a positive acknowledgment packet is created only when a packet is correctly received, and there will be at most one such packet at any given time. Thus, if $T_a$ denotes the transmission time of the acknowledgment packet on the separate channel, then the time-out for receiving a positive acknowledgment is $T + \tau + T_a + \tau$, provided that we make the following assumption.

Assumption 4: The processing time needed to perform the sumcheck and to generate the acknowledgment packet is negligible.

Assumption 2 further simplifies our delay model by implicitly assuming that the probability of a packet's success is the same whether the packet is new or has been blocked, or interfered with any number of times before; this probability is simply given by the throughput equation, i.e.,

$$P_s = \frac{S}{G} = \frac{\text{throughput}}{\text{offered traffic}}.$$

Bearing these assumptions in mind, we can write the delay equations for each of the previous access modes.

As an example let us consider the ALOHA mode. Let $R$ be the average delay between two consecutive transmissions (i.e., a retransmission) of a given packet. $R$ consists of the transmission time of the packet, the transmission time of the acknowledgment packet, the round-trip propagation delay, and the average retransmission delay, that is

$$R = T + \tau + T_a + \tau + \bar X.$$

Using our normalized time units, we have

$$R = 1 + 2a + \alpha + \delta \tag{53}$$

where $\alpha = T_a/T$. Since $(G/S - 1)$ is the average number of retransmissions required, the average delay is given by

$$D = \left(\frac{G}{S} - 1\right) R + 1 + a. \tag{54}$$

(Special attention must be devoted to the CSMA modes in which packets may incur pretransmission delays, and in which all arrivals do not necessarily correspond to actual transmissions. The delay equations and their derivations are given in Appendix B.)

Let us begin with some comments concerning the above delay equations. First, $G/S$ as obtained from the throughput equations rests on two important and strong Assumptions 1 and 2; namely, that we have an independent Poisson point process and that $\delta$ is infinite, or large compared to the transmission time (in which case delays are also large and unacceptable). On the other hand, $\delta$ cannot be arbitrarily small. It is intuitively clear that when a certain backlog of packets is present, the smaller $\delta$ is, the higher is the level of interference and hence the larger is the offered channel traffic $G$. Thus, $G = G(S,\delta)$ is a decreasing function of $\delta$ such that the average number of transmissions per packet, $G(S,\delta)/S$, decreases with increasing values of $\delta$, and reaches the asymptotic value predicted by the throughput equation. Thus, for each $S$, a minimum delay can be achieved by choosing an optimal $\delta$. Such an optimization problem is difficult to solve analytically, and simulation techniques have been employed in our evaluations below.$^{12}$

Before we proceed with the discussion of simulation results, we compare the various access modes in terms simply of the average number of transmissions (or average number of schedulings$^{13}$) $G/S$. For this purpose, we plot $G/S$ versus $S$ in Fig. 11 for the ALOHA and CSMA modes, when $a = 0.01$. Note that CSMA modes are superior in that they provide lower values for $G/S$ than the ALOHA modes. Furthermore, for each value of the throughput, there exists a value of $p$ such that p-persistent is optimal. For small values of $S$, $p = 1$ (i.e., 1-persistent) is optimal. As $S$ increases, the optimum $p$ decreases.

11 The reader is referred to [16] for a study of the effect of acknowledgment traffic on channel throughput when acknowledgment packets are carried by the same channel.

12 We have been able to solve the problem analytically in the case of the nonpersistent CSMA when we are in presence of a large population but with a finite number of users; all conclusions obtained from simulation in Section V-B have been verified by the analysis. For this the reader is referred to reference [16].

13 For the nonpersistent and p-persistent CSMA, G measures the offered channel traffic and not the actual channel traffic. G/S represents, then, the average number of times a packet was scheduled for transmission before success.

[Fig. 11. G/S versus throughput (a = 0.01).]
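The ALOHA delay expressions (53) and (54) are simple enough to evaluate directly. The following sketch computes $R = 1 + 2a + \alpha + \delta$ and $D = (G/S - 1)R + 1 + a$ in normalized time units. The function names and the sample parameter values are illustrative, and the slotted ALOHA relation $S = Ge^{-G}$ used in the example is an assumption brought in only to obtain a value of $G/S$.

```python
import math


def retransmission_cycle(a: float, alpha: float, delta: float) -> float:
    """R as in (53): packet time (1) + round trip (2a) + ack time (alpha) + mean retransmission delay (delta)."""
    return 1.0 + 2.0 * a + alpha + delta


def aloha_delay(G: float, S: float, a: float, alpha: float, delta: float) -> float:
    """D as in (54): (G/S - 1) retransmissions, each costing R, plus the final successful transmission."""
    return (G / S - 1.0) * retransmission_cycle(a, alpha, delta) + 1.0 + a


if __name__ == "__main__":
    a, alpha, delta = 0.01, 0.1, 10.0  # illustrative values only
    for G in (0.1, 0.5, 1.0):
        S = G * math.exp(-G)           # assumed slotted ALOHA throughput
        print(f"S = {S:.3f}  D = {aloha_delay(G, S, a, alpha, delta):.2f}")
```

When $G = S$ (no retransmissions) the delay collapses to $1 + a$, the transmission time plus one propagation delay, which is a convenient sanity check on the formula.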
B. Simulation Results

The simulation model is based on all system assumptions presented in Section II. However, we relax Assumptions 1 and 2 concerning the retransmission delay and the independence of arrivals for the offered channel traffic. That is, in the simulation model, only the newly generated packets are derived independently from a Poisson distribution; collisions and uniformly distributed random retransmissions are accounted for without further assumptions. In general, our simulation results indicate the following.

1) For each value of the input rate $S$, there is a minimum value $\delta_0$ for the average retransmission delay variable, such that below that value it is impossible to achieve a throughput equal to the input rate.$^{14}$ The higher $S$ is, the larger $\delta_0$ must be to prevent a constantly increasing backlog, i.e., to prevent the channel from saturating. In other words, the maximum achievable throughput (under assumed stable conditions) is a function of $\delta$, and the larger $\delta$ is, the higher is the maximum throughput.

2) Recall that the throughput equations were based on the assumption that $\bar X/T = \delta \gg 1$. Simulation shows that for finite values of $\delta$, $\delta > \delta_0$, but not too large compared to 1, the system already "reaches" the asymptotic results ($\delta \to \infty$). That is, for some finite values of $\delta$, Assumption 2 is excellent and delays are acceptable. Moreover, the comparison of the $(S,G)$ relationship as obtained from simulation and the results obtained from the analytic model exhibits an excellent match.

Simulation experiments were also conducted to find the optimal delay; that is, the value of $\delta(S)$ which allows one to achieve the indicated throughput with the minimum delay. Finally, in Fig. 12$^{15}$ we give the throughput-minimum delay tradeoff for the ALOHA and CSMA modes (when $a = 0.01$). This is the basic performance curve. We conclude that the optimum p-persistent CSMA provides us with the best performance; on the other hand the performance of the (simple) nonpersistent CSMA is quite comparable.

14 Such behavior is characteristic of random multiple-access modes. Similar results were already encountered by Kleinrock and Lam [13] when studying slotted ALOHA in the context of a satellite channel.

15 In Fig. 12, the curve corresponding to slotted ALOHA is obtained from the analytical model developed in [13] successfully verified by simulation.

[Fig. 12. Throughput-delay tradeoffs from simulation (a = 0.01). Horizontal axis: S (throughput).]

VI. SUMMARY AND DISCUSSION

We have introduced and evaluated the new CSMA mode and have shown it to be an efficient means for randomly accessing packet switched radio channels which have a small ratio of propagation delay to packet transmission time. Just as with most "contention" systems, these random multi-access broadcast channels (ALOHA, CSMA) are characterized by the fact that the throughput goes to zero for large values of channel traffic. At an optimum traffic level, we achieve a maximum throughput which we define to be the system capacity. This and the throughput-delay performance were obtained by a steady-state analysis under the assumption of equilibrium conditions. However, these channels exhibit unstable behavior at most input loads as shown by Kleinrock and Lam [18]. In this last reference, the dynamic behavior and stability of an ALOHA channel are considered; quantitative estimates for the relative stability of the channel are given, which indicate the need for special control procedures to avoid a collapse. Optimal control procedures have been found [14], [15] and similar procedures are necessary for CSMA as well, since it can be shown [16] that CSMA exhibits similar unstable behavior.

Throughout the paper, it was assumed that all terminals are within range and in line-of-sight of each other. A common situation consists of a population of terminals, all within range and communicating with a single "station" (computer center, gate to a network, etc.) in line-of-sight of all terminals. Each terminal, however, may not be able to hear all the other terminals' traffic. This gives rise to what is called the "hidden-terminal" problem. The latter badly degrades the performance of CSMA as shown in Part II of this paper [19]. Fortunately, in a single-station environment, the hidden-terminal problem can be eliminated by dividing the available bandwidth into two separate channels: a busy tone channel and a message channel. As long as the station is receiving a signal on the message channel, it transmits a busy tone signal on the busy tone channel (which terminals sense for channel state information). The CSMA with a busy tone under a nonpersistent protocol has been analyzed. It is shown to provide a maximum channel capacity of approximately 0.65 when a = 0.01 for a channel bandwidth W of 100 kHz (modulated at 1 bit/Hz); when W = 1 MHz and a = 0.01, the channel capacity is 0.71 [19]. These values compare favorably with the capacity of 0.815 for nonpersistent CSMA with no hidden terminals.

APPENDIX A
SMALL p APPROXIMATIONS IN p-PERSISTENT CSMA

We claim, for small $p$, that $S(G,p,a)$ may be approximated by

$$S'(G,p,a) = \frac{\hat P_s' + \dfrac{1-\pi_0}{\pi_0} \hat P_s}{a \hat t' + a \hat t\, \dfrac{1-\pi_0}{\pi_0} + \dfrac{1+a}{\pi_0} + \dfrac{a}{1 - e^{-g}}} \tag{A1}$$

where $\hat t$, $\hat P_s$, $\hat t'$, and $\hat P_s'$ are defined hereafter in the proof.

Proof: We show here that, with some approximations, we can get a closed-form solution for the throughput when $p$ has small values ($p < 0.1$). These approximations are validated by comparing the results obtained in this section with those obtained from Section IV-D for $p = 0.1$. For the distribution of idle time between two TP's, we have from (26)

$$\Pr\{t_n > k\} = q^{(k+1)n} \exp\left\{ g\left( \frac{q(1-q^k)}{p} - k \right) \right\}. \tag{A2}$$

When $p$ is small, we may make the following approximation (actually a lower bound):

$$q^k = (1 - p)^k \approx 1 - kp \tag{A3}$$

and therefore we may rewrite (A2) as

$$\Pr\{t_n > k\} \approx q^{(k+1)n} e^{-kpg} = q^n \left[q^n e^{-pg}\right]^k. \tag{A4}$$

Let $t_n^{>*}(z)$ and $t_n^{*}(z)$ be the generating functions defined by

$$t_n^{>*}(z) \triangleq \sum_{k=0}^{\infty} \Pr\{t_n > k\}\, z^k \tag{A5}$$

$$t_n^{*}(z) \triangleq \sum_{k=0}^{\infty} \Pr\{t_n = k\}\, z^k. \tag{A6}$$

We have

$$t_n^{>*}(z) = \frac{q^n}{1 - q^n e^{-pg} z}. \tag{A7}$$
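The quality of the small-p approximation (A3)-(A4) can be examined numerically. The sketch below compares the exact tail (A2) with the geometric approximation (A4), and compares the mean IRTD obtained by summing the exact tail as in (29) with the closed form $q^n/(1 - q^n e^{-pg})$ that follows from summing the geometric series in (A4). Function names, the truncation limit `k_max`, and the sample parameters are illustrative.

```python
import math


def tail_exact(k: int, n: int, p: float, g: float) -> float:
    """Pr{t_n > k} as in (A2)."""
    q = 1.0 - p
    return q ** ((k + 1) * n) * math.exp(g * (q * (1.0 - q ** k) / p - k))


def tail_approx(k: int, n: int, p: float, g: float) -> float:
    """Pr{t_n > k} as in (A4), using q^k ~ 1 - kp."""
    q = 1.0 - p
    return q ** n * (q ** n * math.exp(-p * g)) ** k


def mean_irtd_exact(n: int, p: float, g: float, k_max: int = 10_000) -> float:
    """Mean IRTD from (29): truncated sum of the exact tail probabilities."""
    return sum(tail_exact(k, n, p, g) for k in range(k_max))


def mean_irtd_approx(n: int, p: float, g: float) -> float:
    """Closed-form sum of the geometric series (A4) over k >= 0."""
    q = 1.0 - p
    return q ** n / (1.0 - q ** n * math.exp(-p * g))


if __name__ == "__main__":
    n, p, g = 3, 0.05, 0.01  # illustrative values only
    print("exact mean :", mean_irtd_exact(n, p, g))
    print("approx mean:", mean_irtd_approx(n, p, g))
```

Because $(1-p)^k \ge 1 - kp$, replacing $q^k$ by $1 - kp$ can only increase the exponent in (A2), so the approximate tail is never below the exact one; the two means should nonetheless agree closely for small p and g.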
Since Pr {t~
k}
=
Pr {tn
=
>k-
1} - Pr {tn
:> k J,
k>O
= qn-l
(1 - e- fJP )q2..;,.,,.n- l
(A17)
1 - q"e- 2 g p
and.
Pr {tn =
0) = 1 - Pr {tn > O},
Here again, (34) defines
we have
t,,*(z) = 1
+
(z - l)t,,>*(z) = 1
qn(z - 1)
+1-
.
(AS)
qne-Pllz
The averages defined in (29) can now be written as
I
t. = at~ *(z)
dZ "
which does not lead to a closed-form expression. Instead, we replace P, by fiB; which is defined as C
A
t:-lln n/
Equation (30), which defines 1 as i (1 - 7ro), does not lead to a closed-form expression. Instead, replace 1 by l, which is defined as
C
~
where C is as expressed in (All), and
,,·e
t..:..:..-"--- 1 ~ Ce-Po
(AID)
where q = l:::i qn 1rnl(1 - 1I"d). (i is smaller than l since l; = .qnj (1 - qne~poj is a convex function of qn.)
(At8)
B
(A9)
z=l
(1 - e-fJP)C'
P = ------q q(l - Ce- gp)
co
l_q2 _
· , = ". 2 _ 7r _ n 11"'0 - - 11"'0 L..J qn = C n-l 1 - 1ro I - 1to
(A19)
Finally, P can be expressed as shown in the following equation: B
....
P,
7I"o? -
=.
11"'0
.
q(1 - '7ro)
q(1 - 11'"0) ..:- q exp {-2gp} (1I"0P ~ 'Jro) .
C can be expressed .as
c = exp. { .
(A20)
-=- (i 0+ a)p~l 1-
71"0
= 71"0" -
71"0
(All)
1 - 1ro
1J"o
The quantities i' and p/ are readily obtained from (A12), and (A20) , respectively; by replacing
and therefore,
11"0 ~ exp {- G (1
+ a) }
.
~
.
1r:OP -
.
t =="
1 --
7r~
-
71"0
'. '7ro) e- P O
(1r()p -
(A12)
To find .the probability of success over TP G we first define the following generating functions:
1· ,
".
i
Ln*(z) ~
L. Pr iz, 00
=
~ ~ 1, \
(AI3)
l}Zl
4
A
:...
A
by the quantity e:«. The substitution of Pi; 1, Pi', and l' for P; t, P/, and l', respectively, in (46) provides us with
a closed-form solution for S(G,p,a) when p is small. In Table II, w~ compare' for p = 0.1 the "exact" results obtained from Section IV.~D to those obtained by the proximation; note that the closed-form solution is quite satisfactory for p < 0.1.
ap-
l:::oon-l
APPENDIX B
00
Ln*(z/ic) ~ L Pr {L n = tit; = klz l .
(A14)
1=-n-1
DELAY EQUATIONS
It is clear that [.In* (z
Ik)
A. iVonper8i~tent C8MA =
exp {kg (z - I)} zn-l.
(A15)
Removing the condition on k, we get L; *(z ) =
ao
.
L t; *(z / k) •Pr
k-o
ac
= zn-i
L
k=O
R
{tn = lc}
1 =
1
+ ex + 2a + 0,
0,
(i.e., senses the channel busy). We have 1_ P = a
1) })
== .. q"(e~p .{y(z.~ .1) i
-
b
l)z,,-I
1 - qn exp {-pg} exp fg(z - 1) }
+ Z..-I.
if the packet is transmitted (Bl) if the packet is blocked.
Let PiJ be the probability that an arrival gets blocked
.
exp {Jcg(z ~ 1)} -Pr {tn = k}
= zn~ltn *(exp {g(z -
In this case, the average delay R between t\VO successive sense points of the same packet is
(Ai6)
The probability of success Pa(n), defined in (33), IS now simply expressed (since 1 . . :. ql ~ lp) as
+ _1/G C
_ - 1
i + aG . + G(1 + a + Y)
·
(B2)
Under the traffic independence assumption, the rate of
TABLE II
COMPARISON OF RESULTS FOR THROUGHPUT S OBTAINED FROM THE EXACT ANALYSIS (24) AND RESULTS OBTAINED FROM THE APPROXIMATION (APPENDIX A) WHEN p = 0.1
= 0.01
a
G 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3
a
Approxi-
=
mate
Exact
0.773 0.778 0.781 0.783 0.784
0.098 0.192 0.279 0.358 0.428 0.490 0.544 0.590 0.630 0.663 0.691 0.714 0.733 0.749 0.761 0.771 0.778 0.784 0.787 0.790 0.791
0.095 0.179 0.252 0.316 0.370 0.417 0.457 0.490 0.519 0.54.3 0.563 0.580 0.594 0.606 0.616 0.624 0.630 0.635 0.639 0.642
0.783
0.790
0.646
Exact
0.098
0.192
0.279 0.358 0.428 0.490 0.544 0.589 0.628 0.661 0.689 0.711 0.730 0.745
0.757 0.766
0.784
0.05 Approximate
0.094 0.178 0.251 0.314 0.367 0.413 0.453 0.486 0.515 0.539 0.560 0.578 0.593
:F") 2 = 1
+ 2Y +
Y2.
From the distribution of Y given in (6) we then have
Z2
=
1
+ a + 2(1 2
- 1jG)Y.
(B6)
Therefore the average pretransmission delay j\ can be easily expressed as f\ =
0.605
1
+
a2
+ 2(1 - l/G) Y 2(1 + Y) · Pr {the packet finds the channel busy}
0.638 0.643
1
0.647 0.649 0.651 0.653
0.645
+
Z2 = (1
0.616 0.625 0.632
0.644
0.791
where $\bar B$, $\bar I$, $\bar Y$, and $q_0$ are given in (16), (12), (7), and (15), respectively. Under the condition that the packet found the channel busy, the average waiting time until the channel is detected idle (i.e., until the end of the TP) is simply equal [17] to $\overline{Z^2}/2\bar Z$ by the Poisson assumption. The second moment of $Z$ is simply given by
+ a + 2(1 - l/G) Y 2qo(B + 1) 2
(B7)
Finally, the expected packet delay is D = (GIS - 1)(1
+ 2a + a + (j + Tt) + Tl + 1 + a (BS)
actual transmissions is given by
where OJS is given by the I-persistent CSl\'lA throughput equation (10).
H = G(l - Ph).
Since (HI S) - 1 represents the average number of actual retransmissions per packet, the average delay D is therefore
D
=
(HIS - 1)[1
+ a + 2a + oJ[(O -
H)/S]b
+ 1+a (B3)
where GIS is given by the nonpersistent CS1VI A throughput equation (3). If we choose to treat all packet arrivals in a uniform manner, we may assume that when a packet is blocked, it behaves as if it could transmit, and learned about its blocking only T'; seconds after the end of its "virtual' transmission. With this simplification, the delay equation IS
D = (G/ S - 1) (1
+ 2a + a +
(j)
+ 1+ a
(B4)
thus introducing an additional delay equal to (OP b / S)
[1
+ a + 2a].
c.
p-Persistent CS"A!f.A.
Similar to the special case of 1-persistent CSMA, a packet in this general scheme incurs an initial delay which we denote by $r_p$. In order to compute its expected value $\bar r_p$, one must consider the following situations.

1) An arbitrary packet, upon arrival, will find the channel idle with probability $\bar I/(\bar B + \bar I)$, in which case its average initial wait is $a \bar t'$.

2) An arbitrary packet, upon arrival, will find the channel in the first IRTD (first $t'$ seconds) of a busy period with probability $a \bar t'/(\bar B + \bar I)$. In this case, its average initial delay is $a\, \overline{t'^2}/2 \bar t'$.

3) An arbitrary packet, upon arrival, will find the channel in the remaining part of a busy period with probability $(\bar B - a \bar t')/(\bar B + \bar I)$, in which case the average initial wait is $\overline{(1 + a + at)^2}/2(1 + a + a \bar t)$.

Therefore
I
B. l-Persisieni CSMA
fp
_,
at'
= B + 1 at + B + f
at'2
2t'
+
B -
at'
(1
B + 1 · 2 (1
+ a + at) + a + at)
2
.
Unlike the ALOHA channel, a packet on a CSMA channel incurs an additional pretransmission delay $r_1$ if, upon its arrival, that packet detects the channel busy. Recall that the probability of finding the channel busy is given by (see Section IV-C)
(by introducing artificial delays due to "virtual" transmissions and acknowledgment}, the expected delay can
Pr {a packet finds the channel busy}
simply be expressed as
fJ - ajqo
B+I
+ iT qo(B + 1) 1
(B;l)
(B9) Treating all transmissions and schedulings uniformly
D = (GIS - 1)[1
+ 2a + () + f + 1 + a + T p]
p
(BIO)
where G/S is given by the p-persistent CSMA throughput equation (24).

REFERENCES

[1] L. G. Roberts, "Data by the packet," IEEE Spectrum, vol. 11, pp. 46-51, Feb. 1974.
[2] N. Abramson, "The ALOHA System-Another alternative for computer communications," in 1970 Fall Joint Comput. Conf., AFIPS Conf. Proc., vol. 37. Montvale, N. J.: AFIPS Press, 1970, pp. 281-285.
[3] R. E. Kahn, "The organization of computer resources into a packet radio network," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 177-186.
[4] L. Kleinrock and F. Tobagi, "Random access techniques for data transmission over packet-switched radio channels," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 187-201.
[5] R. Binder, N. Abramson, F. Kuo, A. Okinaka, and D. Wax, "ALOHA packet broadcasting-A retrospect," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 203-215.
[6] H. Frank, I. Gitman, and R. Van Slyke, "Packet radio system-Network considerations," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 217-231.
[7] S. Fralick and J. Garrett, "Technological considerations for packet radio networks," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 233-243.
[8] J. Burchfiel, R. Tomlinson, and M. Beeler, "Functions and structure of a packet radio station," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 245-251.
[9] S. Fralick, D. Brandin, F. Kuo, and C. Harrison, "Digital terminals for packet broadcasting," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 253-261.
[10] P. E. Jackson and C. D. Stubbs, "A study of multi-access computer communications," in 1969 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 34. Montvale, N. J.: AFIPS Press, 1969, pp. 491-504.
[11] N. Abramson, "Packet switching with satellites," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42. Montvale, N. J.: AFIPS Press, 1973, pp. 695-702.
[12] L. Roberts, "ARPANET Satellite System," Notes 8 (NIC Document 11290) and 9 (NIC Document 11291), available from the ARPA Network Information Center, Stanford Research Institute, Menlo Park, Calif.
[13] L. Kleinrock and S. Lam, "Packet-switching in a slotted satellite channel," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42. Montvale, N. J.: AFIPS Press, 1973, pp. 703-710.
[14] S. Lam, "Packet switching in a multi-access broadcast channel with application to satellite communications in a computer network," School of Eng. and Appl. Sci., Univ. of California, Los Angeles, rep. UCLA-ENG 7429, Apr. 1974.
[15] S. Lam and L. Kleinrock, "Dynamic control schemes for a packet switched multi-access broadcast channel," in Nat. Comput. Conf., AFIPS Conf. Proc., vol. 44. Montvale, N. J.: AFIPS Press, 1975, pp. 143-153.
[16] F. Tobagi, "Random access techniques for data transmission over packet switched radio networks," Ph.D. dissertation, Comput. Sci. Dep., School of Eng. and Appl. Sci., Univ. of California, Los Angeles, rep. UCLA-ENG 7499, Dec. 1974.
[17] L. Kleinrock, Queueing Systems, Vol. I, Theory; Vol. II, Computer Applications. New York: Wiley Interscience, 1975.
[18] L. Kleinrock and S. S. Lam, "Packet switching in a multiaccess broadcast channel: Performance evaluation," IEEE Trans. Commun., vol. COM-23, pp. 410-423, Apr. 1975.
[19] F. A. Tobagi and L. Kleinrock, "Packet switching in radio channels: Part II-The hidden terminal problem in carrier sense multiple access and the busy tone solution," IEEE Trans. Commun., this issue, pp. 1417-1433.
Leonard Kleinrock (S'55-M'64-SM'71-F'73), for a biography, see page 662.

Fouad A. Tobagi was born in Beirut, Lebanon on July 18, 1947. He received the Engineering Degree from Ecole Centrale des Arts et Manufactures, Paris, France, in 1970 and the M.S. and Ph.D. degrees in computer science from the University of California, Los Angeles, in 1971 and 1974, respectively.

From 1971 to 1974 he was with the University of California, Los Angeles, where he participated in the ARPA Network Project as a Postgraduate Research Engineer and did research on packet radio communication. During the summer of 1972 he was with the Communications Systems Evaluation and Synthesis Group, IBM T. J. Watson Research Center, Yorktown Heights, N.Y. Since December 1974 he has been a Research Staff Project Manager with the ARPA project, Computer Science Department, University of California, Los Angeles. His current research interests include computer communication networks, packet switching over radio, and satellite networks. From 1967 to 1970 he held a scholarship from the Ministry of Foreign Affairs of the French government. During the academic year 1972-1973 he held an Earl Anthony Fellowship.
Packet Switching in a Multiaccess Broadcast Channel: Performance Evaluation

LEONARD KLEINROCK, FELLOW, IEEE, AND SIMON S. LAM, MEMBER, IEEE

Abstract: In this paper, the rationale and some advantages for multiaccess broadcast packet communication using satellite and ground radio channels are discussed. A mathematical model is formulated for a "slotted ALOHA" random access system. Using this model, a theory is put forth which gives a coherent qualitative interpretation of the system stability behavior which leads to the definition of a stability measure. Quantitative estimates for the relative instability of unstable channels are obtained. Numerical results are shown illustrating the trading relations among channel stability, throughput, and delay. These results provide tools for the performance evaluation and design of an uncontrolled slotted ALOHA system. Adaptive channel control schemes are studied in a companion paper.
INTRODUCTION

In this and a forthcoming paper [1], a packet switching technique based upon the random access concept of the ALOHA System [2] will be studied in detail. This technique, referred to as slotted ALOHA random access, enables efficient sharing of a data communication channel by a large population of users, each with a bursty data stream. This packet switching technique may be applied to the use of satellite and ground radio channels for computer-computer and terminal-computer communications, respectively [3]-[10]. The multiaccess broadcast capabilities of these channels render them attractive solutions to two problems: 1) large computer-communication networks with nodes distributed over wide geographic areas, and 2) large terminal access networks with potentially mobile terminals.

The objective of this study is to develop analytic models and methods for the evaluation and optimization of the channel performance of a slotted ALOHA system. The problem of performance evaluation is addressed in this paper. In [1], we present dynamic channel control procedures as solutions to some of the issues considered herein.

In this paper, the rationale for multiaccess broadcast packet communication is first discussed. The mathematical model to be considered is then described. Following that, a theory is proposed which explains the dynamic and stochastic channel behavior. In particular, we display the delay-throughput performance curves obtained under the assumption of equilibrium conditions [6]. We then demonstrate that a slotted ALOHA channel often exhibits "unstable behavior." A stability definition is proposed which characterizes stable and unstable channels. A stability measure (FET) is then defined which quantifies the relative instability of unstable channels. An algorithm is given for the calculation of FET. Finally, numerical results are shown which illustrate the trading relations among channel stability, channel throughput, and average packet delay. Our main concern in this paper is the consideration of the stability issue and its effect on the channel throughput-delay performance.

Paper approved by the Associate Editor for Computer Communication of the IEEE Communications Society for publication after presentation at the 7th Hawaii International Conference on System Sciences, Honolulu, Hawaii, January 8-10, 1974. Manuscript received June 30, 1974; revised September 30, 1974. This research was supported by the Advanced Research Projects Agency of the Department of Defense under Contract DAHC 15-73-C-0368. L. Kleinrock is with the Department of Computer Science, University of California, Los Angeles, Calif. 90024. S. S. Lam is with the IBM Thomas J. Watson Research Center, Yorktown Heights, N. Y. 10598.

MULTIACCESS BROADCAST PACKET COMMUNICATION

Rationale

For almost a century, circuit switching dominated the design of communication networks. Only with the higher speed and lower cost of modern computers did packet communication become competitive. It was not until approximately 1970 that the computer (switching) cost dropped below the communication (bandwidth) cost in a packet switching network [11]. This also marked the first appearance of packet switched computer-communication networks [2], [12].

Circuit switching is relatively inefficient for computer
Reprinted from IEEE Transactions on Communications, April 1975.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
communications, especially over long distances. Measurement studies [13] conducted on time-sharing systems indicate that both computer and terminal data streams are bursty. Depending on the channel speed, the ratio between the peak and the average data rates may be as high as 2000 to 1 [5]. Consequently, if a high-speed point-to-point channel is used, the channel utilization may be extremely low since the channel is idle most of the time. On the other hand, if a low-speed channel is used, the transmission delay is large.

The above dilemma is caused by channel users imposing bursty random demands on their communication channels. By the law of large numbers in probability theory, the total demand at any instant from a large population of independent users is, with high probability, approximately equal to the sum of their average demands (i.e., a nearly deterministic quantity). Thus, if a channel is dynamically shared in some fashion among many users, the required channel bandwidth to satisfy a given delay constraint may be much less than if the users are given dedicated channels. This concept is known as statistical load averaging and has been applied in many computer-communication schemes to various degrees of success. These schemes include: polling systems [14], loop systems [15], asynchronous time division multiplexing (ATDM) [16], and the store-and-forward packet switching concepts [17]-[19] implemented in the ARPA network [12].

We are currently facing an enormous growth in computer networks [20]. To design cost-effective computer-communication networks for the future, new techniques are needed which are capable of providing efficient high-speed computer-computer and terminal-computer communications in a large network environment. The application of packet switching techniques to radio communication (both satellite and ground radio channels) appears to provide a solution.

Radio is a multiaccess broadcast medium. That is, a signal generated by a radio transmitter may be received over a wide area by any number of receivers. This is referred to as the broadcast capability. Furthermore, any number of users may transmit signals over the same channel. This is referred to as the multiaccess capability. (However, if two signals at the same carrier frequency overlap in time at a radio receiver,¹ we assume that neither is received correctly. This destructive interference is the key issue in studying the multiaccess radio channel used in a packet switching mode.) Thus, a single ground radio channel provides a completely connected network topology for a large number of nodes within range of each other.

¹ This event will be referred to as a channel collision.

Similarly, a satellite transponder in a geostationary orbit above the earth acts as a radio repeater. Any number of earth stations may transmit signals up to the satellite at one carrier frequency (the multiaccess channel). Any signal received by the satellite transponder is beamed back to earth at another frequency (the broadcast channel). This broadcasted signal may be received by all earth stations covered by the transponder beam. Thus, a satellite channel (consisting of both carrier frequencies) provides a completely connected network topology for all earth stations covered by the transponder beam.

Consider the use of packet communication in a computer-communication network environment to support large populations of (bursty) users over a wide area. We can then identify and summarize the following advantages of satellite and ground radio channels over conventional wire communications.

1) Elimination of Complex Topological Design and Routing Problems: Topological design and routing problems are very complex in networks with a large population of users. Existing implementations suitable for a (say) 50 node network may become totally inappropriate for a 500 node network required to perform the same functions [21]. On the other hand, ground radio and satellite channels used in the multiaccess broadcast mode provide a completely connected network topology, since every user may access any other user covered by the broadcast.

2) Wide Geographical Areas: Wire communications become expensive over long distances (e.g., transcontinental, transoceanic). Even on a local level, the communication cost for an interactive user on an alphanumeric console over distances of over 100 miles may easily exceed the cost of computation [2]. On the other hand, satellite and radio communications are relatively distance independent, and are especially suitable for geographically scattered users.

3) Mobility of Users: Since radio is a multiaccess broadcast medium, it is possible for users to move around freely. This consideration will soon become important in the development of personal terminals in future telecommunication systems [22] as well as in aeronautical and maritime applications [23].

4) Large Population of Active and Inactive Users: In wire communications, the system overhead usually increases with the number of users (e.g., polling schemes). The maximum number of users is often bounded by some hardware limitation (e.g., the fan-in of a communications processor). In radio communication, since each user is merely represented by an ID number, the number of active users is bounded only by the channel capacity and there is no limitation to the number of inactive (but potentially active) users beyond that of a finite address space.

5) Flexibility in System Design: A radio packet communication system can become operational with two or three users. The size of the user population can be increased up to the channel capacity. More users can be accommodated by increasing the radio channel bandwidth. In other words, the communication system can be expanded or contracted without major changes in the basic system design and operational schemes.

6) Statistical Load Averaging: Wire communication links are more efficiently utilized in a store-and-forward packet switched network than in a circuit switched network. However, at any instant, there may be unused channel capacity in some parts while congestion exists in
Fifty Years of Communications and Networking
other parts of the network. The application of packet switching techniques to a single high-speed satellite or radio channel permits the total demand of all user input sources to be statistically averaged at the channel. Note also that each user transmits data at the wide-band channel rate.

7) Multiaccess Broadcast Capability: This capability in radio communication may be useful for certain multipoint-to-multipoint communication applications.
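The statistical load averaging argument can be illustrated with a small calculation. The sketch below is not from the paper: it assumes a hypothetical population of independent on/off users, each active with probability `duty_cycle` at any instant, and uses the 2000-to-1 peak-to-average ratio quoted earlier as the duty cycle. Under that assumption the aggregate demand is binomial, and its standard deviation relative to its mean shrinks as the inverse square root of the population size.

```python
import math

def relative_fluctuation(num_users, duty_cycle):
    """Standard deviation divided by mean of the total instantaneous demand.

    Assumes (for illustration only) independent on/off users, each active
    with probability duty_cycle, so the total demand is binomial.
    """
    mean = num_users * duty_cycle
    variance = num_users * duty_cycle * (1.0 - duty_cycle)
    return math.sqrt(variance) / mean

duty = 1 / 2000                                   # peak-to-average ratio of 2000 to 1
single = relative_fluctuation(1, duty)            # one user on a dedicated channel
shared = relative_fluctuation(200_000, duty)      # many users sharing one channel

print(f"dedicated channel: std/mean = {single:.1f}")
print(f"shared channel:    std/mean = {shared:.3f}")
```

With these assumed numbers, a single user's demand fluctuates by many times its mean, while the pooled demand of 200 000 such users fluctuates by less than a tenth of its mean, which is the sense in which the total demand becomes "nearly deterministic."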
The Multiaccess Channel Model

Consider a radio communication system such as a packet switched satellite system [5]-[10] or the ALOHA System [2]. In each case, there is a broadcast channel for point-to-multipoint communication and a multiaccess channel shared by a large number of users. Since the broadcast channel is used by a single transmitter, no transmission conflict will arise. All nodes covered by the radio broadcast can receive on the same frequency, picking out packets addressed to themselves and discarding packets addressed to others.

The problem we are faced with is how to effect time-sharing of the multiaccess channel among all users in a fashion which produces an acceptable level of performance. As soon as we introduce the notion of sharing in a packet switching mode, we must be prepared to resolve conflicts which arise when simultaneous demands are placed upon the channel. There are two obvious solutions to this problem: the first is to form a queue of conflicting demands and serve them in some order; the second is to "lose" any demands which are made while the channel is in use. The former approach is taken in ATDM and in store-and-forward networks assuming that storage may be provided economically at the point of conflict. The latter approach is adopted in the ALOHA System random access scheme: in this system, in fact, all simultaneous demands made on the radio channel are lost.

Let us define channel throughput rate $S_{\text{out}}$ to be the average number of correctly received packet transmissions per packet transmission time (assuming stationary conditions). We also define channel capacity $S_{\max}$ to be the maximum possible channel throughput rate. The channel capacity of a pure ALOHA multiaccess channel was shown by Abramson to be $1/(2e) \approx 18$ percent for a fixed packet size [2]. Under similar assumptions, Gaarder showed that a pure ALOHA channel with a fixed packet size is always superior (in terms of channel capacity) to one with different packet sizes [24]. Roberts suggested that the channel may be slotted by requiring all users to synchronize² the leading edges of their packet transmissions to coincide with an imaginary time slot boundary at the multiaccessed radio receiver [25]. The duration of a channel time slot is chosen to be equal to a packet transmission time. The resulting scheme will be referred to as "slotted ALOHA random access" or "slotted ALOHA." In this scheme, the users transmit newly generated packets into channel time slots independently. In the event of a channel collision, the collided packets are retransmitted after random retransmission delays. (See Fig. 1.) The channel capacity of a slotted ALOHA channel was shown to be $1/e \approx 36$ percent [25]. To achieve a channel throughput rate larger than the 36 percent limitation, various other multiaccess broadcast packet switching schemes have been proposed to take advantage of special system and traffic characteristics. The reader is referred to the references [3], [7], [26] for description of these schemes.

² The problem of synchronizing channel users is a nontrivial one. It will not be addressed in this paper.

Consider a slotted ALOHA channel. The channel input in a time slot is defined to be a random variable representing the total number of new packets transmitted by all users in that time slot. Assuming stationary conditions, the channel input rate $S$ is the average number of new packet transmissions per time slot. The channel traffic in a time slot is defined to be a random variable representing the total number of packet transmissions (both new and previously collided packets) by all users in that time slot. Assuming stationary conditions, the channel traffic rate $G$ is the average number of packet transmissions per time slot. The channel throughput (or output) in a time slot is defined to be a random variable representing the number (0 or 1) of successful packet transmissions in that time slot. Assuming stationary conditions, the channel throughput (output) rate $S_{\text{out}}$ is the probability of exactly one packet transmission in a channel time slot.

The retransmission delay (RD) incurred by an unsuccessful packet transmission may be regarded as the sum of a deterministic component (R) and a random component. The random component is
necessary since if collided packets are retransmitted after the same deterministic delay, they will collide again for sure. In a ground radio system, RD corresponds to the positive acknowledgment time-out interval [2]. In a satellite system, since each channel user listens to the satellite broadcast, one round-trip propagation time after transmitting a packet he knows whether he was successful or if a channel collision occurred. In this case, the deterministic component corresponds to a round-trip satellite propagation delay. We shall assume a noise-free channel such that a packet is received incorrectly if and only if it suffered a channel collision.

In [6], a uniform probability distribution is assumed for the random component of RD such that each user retransmits a previously collided packet at random during one of the next $K$ slots (each such slot being chosen with probability $1/K$). Thus, retransmission will take place either $R + 1$, $R + 2$, ..., or $R + K$ slots after the previous transmission. This is said to be the uniform retransmission randomization scheme. Under this scheme, equilibrium throughput-delay tradeoffs have been obtained for a slotted ALOHA channel with a Poisson input source (the infinite population model). Such throughput-delay contours are shown here in Fig. 2 for different values of $K$. Note that the minimum envelope of these contours defines the optimum channel performance. These results correspond to the use of a 50 KBPS satellite channel, 1125 bits per packet, and a satellite round-trip propagation delay of 0.27 s for all users. Thus $R$ is equal to 12 slots and there are 44.4 slots in one second. (These numbers will be assumed throughout this paper.) In Fig. 2, $D$ represents the average packet delay in slots. Note that the channel input rate $S$ is equal to the channel throughput rate $S_{\text{out}}$ under the assumption of channel equilibrium. The channel capacity $S_{\max}$ approaches $1/e$ in the limit as $K \to \infty$. For $K = 15$, it is almost there. For values of $K$ between 8 and 15, the equilibrium throughput-delay tradeoffs are very close to the optimum performance envelope over a wide range of $S$.

Fig. 1. Slotted ALOHA random access. (Timing diagram for User 1 and User 2; legend: successful packet transmission, transmission conflict, random retransmission delay.)

Fig. 2. Equilibrium throughput-delay tradeoff. (Average packet delay in slots versus S.)

The analytic results presented so far are based upon the assumption that the channel is in equilibrium. Referring to Fig. 2, we see that given $S$ and $K$ (say $K = 40$), there are two possible equilibrium solutions for $D$! They correspond to a small delay value $D_A$ and a much larger delay value $D_B$. (We shall refer to the equilibrium point given by $S$, $K$, and $D_A$ as the channel operating point, since this is the desired channel performance given $S$ and $K$.) This observation suggests that the assumption of equilibrium conditions adopted in most previous analytic models [4]-[7] may not be valid.

In order to study the dynamic behavior of these channels, simulations were performed for the infinite population model [10]. Each simulation run was observed to behave in the following manner. Starting from an initially empty system, the channel stays in equilibrium at the channel operating point for a finite period of time until stochastic fluctuations give rise to some high channel traffic rate which reduces the channel throughput rate which in turn further increases the channel traffic rate. As this vicious cycle continues, the channel becomes inundated with collisions and retransmissions. At the same time, the channel throughput rate vanishes rapidly to zero. This phenomenon will be referred to as channel saturation. Thus, we realize that the equilibrium throughput-delay tradeoffs are not sufficient to characterize the performance of the infinite population model. A more accurate measure of channel performance must reflect the trading relations among channel stability, throughput and delay.

A mathematical model with a simpler structure than that used in [6] will be defined below. This model is similar to the one studied by Metcalfe [4]. Using this model, the concepts of channel saturation and stability in a slotted ALOHA random access channel have been characterized [8], [10].

STABILITY-THROUGHPUT-DELAY TRADEOFF PERFORMANCE
In this section, a Markovian model is first formulated for a population of $M$ channel users. The variable $M$ is assumed to be large and may be either finite or infinite. A theory is then proposed which characterizes the instability phenomenon in the following ways.

1) Stable and unstable channels are defined.

2) In a stable channel, equilibrium throughput-delay results (as shown in Fig. 2) are achievable over an infinite time horizon. In an unstable channel, such channel performance is achievable only for some finite time period before the channel goes into saturation.

3) For unstable channels, a stability measure is defined and an efficient computational procedure for its calculation is given.

4) Using the above stability measure, the stability-throughput-delay tradeoff for unstable channels is examined.
The Markovian Model

We consider a slotted ALOHA channel with a user population consisting of $M$ users. Each such user can be in one of two states: blocked or thinking. In the thinking state, a user generates and transmits a new packet in a time slot with probability $\sigma$. A packet which had a channel collision and is waiting for retransmission is said to be backlogged. The retransmission delay RD of each backlogged packet is assumed to be geometrically distributed, i.e., each backlogged packet retransmits in the current time slot with probability $p$. Assuming bursty users, we must have $p \gg \sigma$. From the time a user generates a packet until that packet is successfully received, the user is blocked in the sense that he cannot generate (or accept from his input source) a new packet for transmission.

Let $N^t$ be a random variable (called the channel backlog) representing the total number of backlogged packets at time $t$. The channel input rate at time $t$ is $S^t = (M - N^t)\sigma$. Note that $S^t$ decreases linearly as $N^t$ increases. The vector $(N^t, S^t)$ will be denoted as the channel state vector. In this context, both $M$ and $\sigma$ may be functions of time. We shall assume $M$ and $\sigma$ to be time-invariant unless stated otherwise. In this case, $N^t$ is a Markov process (chain) with stationary transition probabilities and serves as the state description for the system. The discrete state space will now consist of the set of integers $\{0, 1, 2, \ldots, M\}$. The one-step state transition probabilities of $N^t$ are, for $i = 0, 1, 2, \ldots, M$,

\[
p_{ij} = \text{Prob}[N^{t+1} = j \mid N^t = i] =
\begin{cases}
0, & j \le i - 2 \\
i p (1-p)^{i-1} (1-\sigma)^{M-i}, & j = i - 1 \\
(1-p)^{i} (M-i)\sigma (1-\sigma)^{M-i-1} + [1 - i p (1-p)^{i-1}](1-\sigma)^{M-i}, & j = i \\
(M-i)\sigma (1-\sigma)^{M-i-1} [1 - (1-p)^{i}], & j = i + 1 \\
\binom{M-i}{j-i} \sigma^{j-i} (1-\sigma)^{M-j}, & j \ge i + 2.
\end{cases}
\tag{1}
\]

For the infinite population model in which $M \to \infty$ and $\sigma \to 0$ such that $M\sigma = S$, which is constant and finite, the above equation becomes

\[
p_{ij} =
\begin{cases}
0, & j \le i - 2 \\
i p (1-p)^{i-1} \exp(-S), & j = i - 1 \\
(1-p)^{i} S \exp(-S) + [1 - i p (1-p)^{i-1}] \exp(-S), & j = i \\
S \exp(-S) [1 - (1-p)^{i}], & j = i + 1 \\
\dfrac{S^{j-i}}{(j-i)!} \exp(-S), & j \ge i + 2.
\end{cases}
\tag{2}
\]

The assumption that RD has a memoryless geometric distribution permits a simple state description for the mathematical model. However, this assumption implies that RD has a zero deterministic component ($R = 0$). In a satellite channel this obviously represents an approximation. (However, it may be physically realizable in radio communications over short distances in which channel propagation delays are negligible compared to a packet transmission time.) A (geostationary) satellite channel has a round-trip propagation delay of 0.27 s, which necessitates a state description consisting of the channel history for at least $R$ consecutive time slots. The difficulty in mathematical analysis using such a state description was illustrated in [10]. However, simulation results have shown that the slotted ALOHA channel performance (in terms of average throughput and delay) is dependent primarily upon the average retransmission delay ($\overline{RD}$) and quite insensitive to the exact probability distributions considered [10]. In order to use the analytic results of the Markovian model here to predict the throughput-delay performance of a slotted ALOHA channel with nonzero $R$, it is necessary to use a value of $p$ in the Markovian model which gives the same $\overline{RD}$. For example, to approximate a slotted ALOHA channel with uniform retransmission randomization, we must let

\[
p = \frac{1}{R + (K+1)/2}
\tag{3}
\]

such that $\overline{RD} = R + (K + 1)/2$ in both cases.

We define the length of time for which a packet is backlogged to be the backlog time of the packet and denote the average backlog time by $D_b$. To obtain the average packet delay $D$ (as defined in [6]), we must add to $D_b$, $R + 1$ time slots, which represent the delay incurred by each successful transmission. Thus, we have

\[
D = D_b + R + 1.
\tag{4}
\]

Numerical results in this paper will be expressed in terms of $K$ (rather than $p$) through use of (3) and (4) for comparison with previous results for channel performance [6].
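The arithmetic behind the channel constants and behind (3) and (4) can be checked directly. The following is a minimal sketch; the function names are illustrative and not taken from the paper, while the numerical inputs (50 KBPS, 1125 bits per packet, 0.27 s round trip, K = 60) are the ones quoted in the text.

```python
def slots_per_second(channel_bps, packet_bits):
    """Slot rate when one slot lasts exactly one packet transmission time."""
    return channel_bps / packet_bits

def retransmission_probability(R, K):
    """Eq. (3): geometric retransmission probability whose mean delay is R + (K + 1)/2."""
    return 1.0 / (R + (K + 1) / 2.0)

def average_packet_delay(backlog_time, R):
    """Eq. (4): average backlog time plus R + 1 slots for the successful transmission."""
    return backlog_time + R + 1

slot_rate = slots_per_second(50_000, 1125)        # about 44.4 slots per second
R_slots = round(0.27 * slot_rate)                 # round-trip delay expressed in slots
p_k60 = retransmission_probability(R_slots, 60)   # value of p equivalent to K = 60
mean_rd = 1.0 / p_k60                             # mean of the geometric delay, in slots

print(slot_rate, R_slots, p_k60, mean_rd)
```

The mean of a geometric delay with per-slot probability p is 1/p, which is why setting p by (3) matches the average retransmission delay of the uniform scheme.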
The Theory

Conditioning on $N^t = n$, the expected channel throughput $S_{\text{out}}(n,\sigma)$ is the probability of exactly one packet transmission in the $t$th time slot. Thus,

\[
S_{\text{out}}(n,\sigma) = (1-p)^{n} (M-n)\sigma (1-\sigma)^{M-n-1} + n p (1-p)^{n-1} (1-\sigma)^{M-n}.
\tag{5}
\]

For the infinite population model, i.e., in the limit as $M \uparrow \infty$ and $\sigma \downarrow 0$ such that $M\sigma = S$ is finite and the channel input is Poisson distributed at the constant rate $S$, the above equation reduces to

\[
S_{\text{out}}(n,S) = (1-p)^{n} S \exp(-S) + n p (1-p)^{n-1} \exp(-S).
\tag{6}
\]

Fig. 3. Throughput surface above the (n,S) plane.
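Equations (5) and (6) are simple enough to evaluate side by side. The sketch below is illustrative only: the function names are assumptions, and the parameter values (K = 60 with R = 12, M = 200, 1/σ = 536.1, backlog n = 15) are borrowed from the numerical example given later in the text, with S in (6) taken as (M - n)σ.

```python
import math

def throughput_finite(n, M, sigma, p):
    """Eq. (5): probability of exactly one transmission, finite population."""
    m = M - n  # number of thinking users
    new_only = (1 - p) ** n * m * sigma * (1 - sigma) ** (m - 1) if m > 0 else 0.0
    retx_only = n * p * (1 - p) ** (n - 1) * (1 - sigma) ** m if n > 0 else 0.0
    return new_only + retx_only

def throughput_poisson(n, S, p):
    """Eq. (6): the same probability with Poisson channel input at rate S."""
    retx = n * p * (1 - p) ** (n - 1) if n > 0 else 0.0
    return ((1 - p) ** n * S + retx) * math.exp(-S)

p = 1.0 / (12 + (60 + 1) / 2.0)      # eq. (3) with R = 12, K = 60
M, sigma, n = 200, 1 / 536.1, 15
exact = throughput_finite(n, M, sigma, p)
approx = throughput_poisson(n, (M - n) * sigma, p)

print(f"eq. (5): {exact:.4f}   eq. (6): {approx:.4f}")
```

For an empty backlog (n = 0), (6) reduces to S exp(-S), whose maximum over S is 1/e at S = 1, consistent with the slotted ALOHA capacity quoted earlier.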
Expression (6) is very accurate even for finite $M$ if $\sigma \ll 1$ and if we replace $S = M\sigma$ by $S = (M - n)\sigma$. We assume that the condition $\sigma \ll 1$ (which implies bursty users) is always satisfied in problems of interest to us.

In Fig. 3, for a fixed $K$ we sketch $S_{\text{out}}(n,S)$ as a three-dimensional surface above the (n,S) plane. Note that there is an equilibrium contour in the (n,S) plane defined as the locus of points on which the channel input rate $S$ is equal to the expected channel throughput $S_{\text{out}}(n,S)$ given by (6). In the crosshatched region enclosed by the equilibrium contour, $S_{\text{out}}(n,S)$ exceeds $S$; elsewhere, $S$ is greater than $S_{\text{out}}(n,S)$. In Fig. 4, a family of equilibrium contours for various $K$ are displayed. We see that if we increase the average retransmission delay (by increasing $K$ or equivalently decreasing $p$), the equilibrium contour moves upwards. We show below that these equilibrium contours play a crucial role in determining the stability behavior of the channel.

Fig. 4. Equilibrium contours in the (n,S) plane. (Horizontal axis: channel input in packets/slot.)

Given an equilibrium contour in the (n,S) plane, we first consider the dynamic behavior of the channel subject to time-varying inputs using a fluid approximation interpretation. The following example serves to illustrate the underlying concepts. Consider the case in which $\sigma$ is constant while $M = M(t)$ is a function of time as shown in Fig. 5. We use the fluid approximation for the trajectory of the channel state vector $(N^t, S^t)$ in the (n,S) plane as sketched in Fig. 6. Recall that $S^t = (M - N^t)\sigma$. The arrows indicate the "fluid" flow direction which depends on the relative magnitudes of the instantaneous channel throughput rate $S_{\text{out}}(n,S)$ and the channel input rate $S$. Two possible cases are shown corresponding to different values of the amplitude of the input pulse in Fig. 5. The solid line (Case 1) represents a trajectory which returns to the original state on the equilibrium contour despite the input pulse. The dashed line (Case 2) represents a less fortunate situation in which the decrease in the channel input rate is not sufficient to bring the trajectory back into the "safe" region (shown shaded) in which $S < S_{\text{out}}(n,S)$; eventually, the channel "fails" as a result of an increasing backlog and a vanishing channel throughput.

Fig. 5. M(t).

Fig. 6. Fluid approximation trajectories.

The above example demonstrates channel saturation due to a time-varying input. Let us now study the conditions under which the slotted ALOHA channel with a stationary input (constant $M$ and $\sigma$) can go into saturation as a result of statistical fluctuations.

Assume that $M$ and $\sigma$ are constant. The trajectory of $(N^t, S^t)$ is constrained to lie on the straight line $S = (M - n)\sigma$ called the channel load line which intercepts the n-axis at $n = M$ and has a slope equal to $-1/\sigma$. We now propose the following definition for characterizing stable and unstable channels.

The Stability Definition: A slotted ALOHA channel is said to be stable if its load line intersects (nontangentially) the equilibrium contour in exactly one place. Otherwise, the channel is said to be unstable.

Examples of stable and unstable channels are shown in Fig. 7. Arrows on the channel load lines indicate directions of fluid flow given by the fluid approximation. In other words, the arrows point in the direction of increasing backlog size if $S > S_{\text{out}}(n,S)$ and in the direction of decreasing backlog size if $S_{\text{out}}(n,S) > S$. Each channel load line may have one or more equilibrium points. A point on the load line is said to be a stable equilibrium point if it acts as a "sink" with respect to fluid flow. It is a globally stable equilibrium point if it is the only stable equilibrium point on the channel load line. Otherwise, it is a locally stable equilibrium point. (Each stable equilibrium point is identified by a dot on channel load lines in Fig. 7 except in Fig. 7(c), where one of the stable equilibrium points is at $n = \infty$.) An equilibrium point is said to be an unstable equilibrium point if fluid flow emanates from it. Thus, the channel state $N^t$ sitting on such a point will drift away from it given the slightest perturbation. The stability definition given above is equivalent to defining a stable channel to be one whose channel load line has a globally stable equilibrium point.

In Fig. 7(a), we show the channel load line of a stable channel.
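The stability definition lends itself to a numerical check: walk along the load line and count how often the sign of S - S_out changes. The sketch below is one possible reading, not the authors' procedure. It evaluates (5) on the integer grid n = 0, ..., M, so a tangential contact or a crossing between grid points very close together could be miscounted, and the two parameter sets are hypothetical values chosen for illustration.

```python
def expected_throughput(n, M, sigma, p):
    """Eq. (5): probability of exactly one transmission given backlog n."""
    m = M - n
    new_only = (1 - p) ** n * m * sigma * (1 - sigma) ** (m - 1) if m > 0 else 0.0
    retx_only = n * p * (1 - p) ** (n - 1) * (1 - sigma) ** m if n > 0 else 0.0
    return new_only + retx_only

def count_equilibrium_crossings(M, sigma, p):
    """Count sign changes of S - S_out along the load line S = (M - n) * sigma."""
    crossings = 0
    previous = None
    for n in range(M + 1):
        input_exceeds = (M - n) * sigma > expected_throughput(n, M, sigma, p)
        if previous is not None and input_exceeds != previous:
            crossings += 1
        previous = input_exceeds
    return crossings

def is_stable(M, sigma, p):
    """Stable in the sense of the definition above: exactly one intersection."""
    return count_equilibrium_crossings(M, sigma, p) == 1

small = count_equilibrium_crossings(10, 0.01, 0.1)       # hypothetical small population
large = count_equilibrium_crossings(500, 0.0006, 0.05)   # hypothetical large population

print(small, large)
```

A count of three corresponds to the picture of an operating point, an unstable equilibrium point, and a saturation point on the same load line. A count of one does not by itself distinguish a well-designed stable channel from an overloaded one whose single equilibrium lies at a large backlog, so the location of the crossing should also be inspected.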
The globally stable equilibrium point on the load line, $(n_0, S_0)$, will be referred to as the channel operating point. If $M$ is finite, a stable channel can always be achieved by using a sufficiently large $K$ (see Fig. 4). Of course, a large $K$ implies that the equilibrium backlog size $n_0$ is large; the corresponding average packet delay may be too large to be acceptable.

Since the Markov chain $N^t$ has a finite state space and is irreducible (assuming $p, \sigma > 0$), a stationary probability distribution always exists [27], [28]. The stationary probability distribution $\{P_n\}_{n=0}^{M}$ of $N^t$ can be computed by solving the following set of linear simultaneous equations

\[
P_j = \sum_{i=0}^{M} P_i \, p_{ij}, \qquad j = 0, 1, \ldots, M
\]

and

\[
\sum_{i=0}^{M} P_i = 1
\]

where the state transition probabilities $p_{ij}$ are given by (1). The steady-state channel throughput rate $\bar{S}_{\text{out}}$ and expected channel backlog $\bar{N}$ can then be obtained from

\[
\bar{S}_{\text{out}} = \sum_{n=0}^{M} S_{\text{out}}(n,\sigma) P_n
\tag{7}
\]

and

\[
\bar{N} = \sum_{n=0}^{M} n P_n.
\tag{8}
\]

Fig. 7. Stable and unstable channels. (a) A stable channel. (b) An unstable channel. (c) An unstable channel. (d) An overloaded channel.

Numerical results have shown that these values of $\bar{S}_{\text{out}}$ and $\bar{N}$ for a stable channel are closely approximated by the equilibrium $S_0$ and $n_0$ at the channel operating point, and also by the equilibrium throughput-delay values in Fig. 2 for the infinite population model. For example, suppose $K = 60$, $M = 200$, and $1/\sigma = 536.1$; the equilibrium channel throughput rate at the channel operating point is $S_0 = 0.346$. In Fig. 9 below (to be described later), we see that the steady-state channel throughput rate computed by using (7) is $\bar{S}_{\text{out}} = 0.344$. For the same example, $\bar{N}$ is calculated to be 15.4 packets. By Little's result [27], the average backlog time is

\[
D_b = \frac{\bar{N}}{\bar{S}_{\text{out}}} = \frac{15.4}{0.344} = 44.8 \text{ slots}.
\]

Applying (4), we get $D = 44.8 + 13 = 57.8$ slots. Now given $S_0 = 0.346$, the $K = 60$ equilibrium throughput-delay contour for the infinite population model [6] gives $D = 56.5$ slots.

In Fig. 7(b), we show the channel load line of an unstable channel. The point $(n_0, S_0)$ is again the desired channel operating point since it yields the larger channel throughput and smaller average packet delay between the two locally stable equilibrium points on the load line. In fact, the other locally stable equilibrium point, having a huge backlog and virtually zero throughput, corresponds
to the channel saturation state; it will be referred to as the channel saturation point. Although it has a stationary probability distribution, $N^t$ will "flip-flop" between the two locally stable equilibrium points in the following manner. Starting from an empty channel ($N^0 = 0$), quasi-stationary conditions will prevail at the operating point $(n_0, S_0)$. The channel, however, cannot maintain equilibrium at this point indefinitely since $N^t$ is a random process; that is, with probability one, the channel backlog $N^t$ crosses the unstable equilibrium point $n_c$ in a finite time, and as soon as it does, the channel input rate $S$ exceeds $S_{\text{out}}(n,S)$. Under this condition, $N^t$ will drift toward the saturation point. Although there is a nonzero probability that $N^t$ may return below $n_c$, all our simulations show that the channel state $N^t$ accelerates up the channel load line producing an increasing backlog and a vanishing throughput rate. Since the saturation point is a locally stable equilibrium point, quasi-stationary conditions will prevail there for some finite (but probably very long) time period. In this state, the communication channel can be regarded as having failed. (In a practical system, external control should be applied at this point to restore proper channel operation.) Thus, the two locally stable equilibrium points on the load line of an unstable channel correspond to the channel being "up" or "down". An unstable channel may be acceptable if the average channel up time is large and external control is available to bring the channel back up whenever it goes down.

In Figs. 8 and 9, we see how, as the number of channel users $M$ increases, an originally stable channel becomes unstable although the channel input rate $S_0$ at the operating point remains constant (by reducing $\sigma$). (These results are obtained by first solving for the stationary probability distribution of $N^t$ and then applying (7) and (8).) For $S_0 = 0.36$ and $K = 10$, we see that as $M$ exceeds 80, the stationary channel throughput rate decreases and the average packet delay increases very rapidly with $M$. Using the $K = 10$ equilibrium contour in Fig. 4, the maximum value of $M$ that is possible without making the channel load line intersect the equilibrium contour more than once is determined (graphically) to be $M_{\max} = 79$, which exactly gives the knees of the curves in Fig. 8. This excellent agreement provides the motivation for the stability definition proposed above. In Fig. 9, by using a larger value of $K$ (= 60), a larger $M_{\max}$ is possible. Note, however, that the average packet delay ($\approx 56$ slots) for $K = 60$ is much larger than the average packet delay ($\approx 36$ slots) for $K = 10$. Given $K$ and $S_0$, $M_{\max}$ can be obtained graphically from the equilibrium contours such as shown in Fig. 4. In Fig. 10 we show $M_{\max}$ as a function of $K$ with $S_0$ fixed at the maximum possible value given $K$. Note the linear relationship between $M_{\max}$ and $K$ for the values shown. In Fig. 11, we illustrate how an originally unstable channel can be rendered stable by using a sufficiently large $K$.

In Fig. 7(c), we show the channel load line of an infinite population model. This is an unstable channel since
Fig. 8. Channel performance versus M at K = 10 and $S_o$ = 0.36.
Fig. 9. Channel performance versus M at K = 60 and $S_o$ = 0.346.
Fig. 10. $M_{max}$ versus K.
Fig. 11. Channel performance versus K at M = 250 and $1/\sigma$ = 675.
$n = \infty$ is a stable equilibrium point. In fact, since $N^t$ has an infinite state space and $S > S_{out}(n,S)$ for $n > n_c$, a stationary probability distribution does not exist for $N^t$. (See, for example, [29, pp. 543-546] for such a proof in a queueing context.) The channel load line shown in Fig. 7(d) is stable according to the stability definition. However, the globally stable equilibrium point in this case is the channel saturation point! Thus, this represents an "overloaded" channel as a result of bad system design. To correct this situation, the number of active users M supported by the channel should be reduced. From now on, a stable channel will always refer to the load line depicted in Fig. 7(a) instead of Fig. 7(d). Let us summarize the major conclusions in the above discussion.
1) The steady-state throughput-delay performance of a stable channel is closely approximated by its globally stable equilibrium point and by the equilibrium throughput-delay results for the infinite population model.
2) In an unstable channel, the throughput-delay performance at a locally stable equilibrium point can be achieved only for some finite time period.

A Stability Measure
From the above discussion and referring to Fig. 7(b), the load line of an unstable channel can be partitioned into two regions: the safe region consisting of the channel states $\{0,1,2,\ldots,n_c\}$ and the unsafe region consisting of the channel states $\{n_c+1,\ldots,M\}$. A good stability measure (for these unstable channels) is the average time to exit into the unsafe region starting from a safe channel state. To be exact, we define FET to be the average first exit time into the unsafe region starting from an initially empty channel ($N^0 = 0$). Thus, FET gives an approximate measure of the average up time of an unstable channel. Below we derive the probability distributions and expected values of such first exit times. The derivations are based upon well-known results of first entrance times in the theory of Markov chains with stationary transition probabilities [28], [30]. Consider the Markovian model with constant M and $\sigma$, where M may be infinite. $N^t$ is a Markov process (chain) with stationary transition probabilities $\{p_{ij}\}$ given by (1) or (2). Define the random variable $T_{ij}$ to be the number of transitions which $N^t$ goes through until it enters state j for the first time starting from state i. The probability distribution of $T_{ij}$ (called the first entrance probabilities from state i to state j) may be defined as

\[
f_{ij}(m) = \mathrm{Prob}[T_{ij} = m] =
\begin{cases}
0, & m = 0 \\
p_{ij}, & m = 1 \\
\mathrm{Prob}[N^{t+m} = j,\ N^{t+h} \neq j,\ h = 1, \ldots, m-1 \mid N^t = i], & m \geq 2.
\end{cases}
\tag{9}
\]
The state space S for $N^t$ consists of the set of nonnegative integers $\{0,1,2,\ldots,n_c,n_c+1,\ldots,M\}$, which is partitioned into the safe region $\{0,1,2,\ldots,n_c\}$ and the unsafe region $\{n_c+1,\ldots,M\}$. Now consider the modified state space $S' = \{0,1,2,\ldots,n_c,n_u\}$, where $n_u$ is an absorbing state such that $N^t$ is now characterized by the transition probabilities
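\[
p'_{ij} =
\begin{cases}
p_{ij}, & i, j = 0, 1, \ldots, n_c \\
\sum_{l=n_c+1}^{M} p_{il}, & i = 0, 1, \ldots, n_c;\ j = n_u \\
0, & i = n_u;\ j = 0, \ldots, n_c \\
1, & i = j = n_u.
\end{cases}
\tag{10}
\]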
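The construction in (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `absorbing_chain`, the argument name `n_c`, the choice of the last index to stand for $n_u$, and the toy 4-state matrix are all invented here and do not come from the paper.

```python
def absorbing_chain(P, n_c):
    """Build the modified transition matrix of (10).

    P   : square list of lists with transition probabilities over states 0..M
    n_c : last state of the safe region (hypothetical argument name)
    Returns an (n_c + 2) x (n_c + 2) matrix whose last index stands for n_u.
    """
    n_u = n_c + 1                      # index chosen here for the absorbing state
    size = n_c + 2
    P_mod = [[0.0] * size for _ in range(size)]
    for i in range(n_c + 1):
        for j in range(n_c + 1):
            P_mod[i][j] = P[i][j]              # safe-to-safe moves are unchanged
        P_mod[i][n_u] = sum(P[i][n_c + 1:])    # all unsafe states lumped together
    P_mod[n_u][n_u] = 1.0                      # once unsafe, the chain stays in n_u
    return P_mod


# Toy 4-state chain (M = 3), invented for illustration only.
P = [
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.0, 0.3, 0.5, 0.2],
    [0.0, 0.0, 0.4, 0.6],
]
P_mod = absorbing_chain(P, n_c=1)
for row in P_mod:
    print([round(x, 3) for x in row])
```

Each row of the result still sums to one, which is a quick sanity check that no probability mass was lost when the unsafe states were merged.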
Define the random variable $T_i$ to be the number of transitions which $N^t$ goes through before it enters the unsafe region for the first time starting from state i in the safe region. $T_i$ is called the first exit time from state i. The probability distribution of $T_i$ is defined to be $\{f_i(m)\}_{m=1}^{\infty}$, which are called the first exit probabilities. It is trivial to show that starting from state i ($0 \le i \le n_c$), the first entrance probabilities into the absorbing state $n_u$ in the modified state space S' are the same as the first exit probabilities into the unsafe region of S. Using (9), such probabilities are given by the following recursive equation [30],

\[
f_{i n_u}(m) = p'_{i n_u}\,\delta(m-1) + \sum_{j=0}^{n_c} p'_{ij}\, f_{j n_u}(m-1)
\]

where $\delta(m-1) = 1$ for $m = 1$ and 0 otherwise. The above equation can be rewritten in terms of the first exit probabilities as

\[
f_i(m) = \sum_{j=n_c+1}^{M} p_{ij}\,\delta(m-1) + \sum_{j=0}^{n_c} p_{ij}\, f_j(m-1), \qquad m \ge 1;\ 0 \le i \le n_c
\tag{11}
\]

where $f_i(m)$ can be solved recursively for $m \ge 1$ starting with $f_i(0) = 0$ for all i. The probability distribution $\{f_i(m)\}_{m=1}^{\infty}$ for the random variable $T_i$ typically has a very long tail and cannot be easily computed. We had defined earlier FET as a stability measure for an unstable channel. By our definition, FET is the same as the expected value of the random variable $T_0$. Let $\bar{T}_i$ be the expected value and $\overline{T_i^2}$ be the second moment of $T_i$. These moments can be obtained by solving a set of linear simultaneous equations. It can easily be shown [30] that

\[
T_i =
\begin{cases}
1 & \text{with probability } p'_{i n_u} \\
1 + T_j & \text{with probability } p_{ij}
\end{cases}
\]

from which we obtain [28], [30]

\[
\bar{T}_i = 1 + \sum_{j=0}^{n_c} p_{ij}\, \bar{T}_j, \qquad i = 0, 1, \ldots, n_c
\tag{12}
\]

\[
\overline{T_i^2} = 2\bar{T}_i - 1 + \sum_{j=0}^{n_c} p_{ij}\, \overline{T_j^2}, \qquad i = 0, 1, \ldots, n_c.
\tag{13}
\]

Equation (12) forms a set of $n_c + 1$ linear simultaneous equations from which $\{\bar{T}_i\}_{i=0}^{n_c}$ can be solved and the stability measure FET ($= \bar{T}_0$) determined. After $\{\bar{T}_i\}_{i=0}^{n_c}$ have been found, (13) can then be solved in a similar manner for $\{\overline{T_i^2}\}_{i=0}^{n_c}$.
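As a rough illustration of (11)-(13), the sketch below solves the two linear systems for a small chain and cross-checks the result against the recursion for the first exit probabilities. It assumes a plain dense matrix and a textbook Gauss-Jordan solver rather than the special-purpose algorithm of the Appendix; the function names and the toy matrix are hypothetical and chosen only for this example.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                for k in range(c, n + 1):
                    M[r][k] -= factor * M[c][k]
    return [M[i][n] / M[i][i] for i in range(n)]


def first_exit_moments(P, n_c):
    """Mean and second moment of T_i for i = 0..n_c, from (12) and (13)."""
    n = n_c + 1
    # (I - Q), where Q is the safe-to-safe block of P
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]
    mean = solve(A, [1.0] * n)                          # equation (12)
    second = solve(A, [2.0 * t - 1.0 for t in mean])    # equation (13)
    return mean, second


def first_exit_pmf(P, n_c, m_max):
    """f[m][i] = f_i(m) from recursion (11), starting from f_i(0) = 0."""
    n = n_c + 1
    exit_prob = [sum(P[i][n:]) for i in range(n)]       # one-step exit probability
    f = [[0.0] * n]
    for m in range(1, m_max + 1):
        prev = f[-1]
        f.append([(exit_prob[i] if m == 1 else 0.0)
                  + sum(P[i][j] * prev[j] for j in range(n))
                  for i in range(n)])
    return f


# Toy 4-state chain with safe region {0, 1}; numbers invented for illustration.
P = [
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.0, 0.3, 0.5, 0.2],
    [0.0, 0.0, 0.4, 0.6],
]
mean, second = first_exit_moments(P, n_c=1)
print("FET estimate (slots):", mean[0])
```

For realistic values of $n_c$ the long tail of $f_i(m)$ makes the recursion slow to converge, which is why the moments are obtained from the linear systems instead.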
Numerical Results
With the stability measure defined above, we are now in a position to examine quantitatively the tradeoff among channel stability, throughput and delay for unstable channels. Below we first give a computational procedure to solve for $\bar{T}_i$ and hence, FET. We then compute these quantities for various values of K, $S_o$, and M (corresponding to different channel load lines). The trading relations among channel stability, throughput, and delay are then illustrated. The solution of the set of simultaneous equations in either (12) or (13) requires inverting the $(n_c + 1)$ by $(n_c + 1)$ matrix of $p_{ij}$ for $i,j = 0, 1, \ldots, n_c$. When $n_c$ is large, this becomes a nontrivial task because of the large number of computational steps and large computer storage requirement for the $[p_{ij}]$ matrix. The fact that $p_{ij} = 0$ for $j \le i - 2$ in (1) and (2) enables us to use an algorithm given in the Appendix which is very efficient in terms of both computer time and space requirements. For our purposes, this algorithm is superior to conventional methods such as Gauss elimination [31] for solving linear simultaneous equations. In this algorithm, each $p_{ij}$ is used exactly once and can be computed using (1) or (2) only when it is needed in the algorithm. This eliminates the need for storing the $[p_{ij}]$ matrix and practically eliminates any computer storage constraint on the dimensionality of the problem. The number of arithmetic operations (+, -, ×, ÷) required by the above algorithm is in the order of $2n_c^2$, which is comparable to that of Gauss elimination. In Fig. 12, we show FET as a function of K for the infinite population model and for fixed values of the channel throughput rate $S_o$ (at the channel operating point). We see that FET can be improved by either decreasing the channel throughput rate $S_o$ or by increasing K (which in turn increases the average packet delay). The infinite population model results give the worst case estimates for channel stability as demonstrated in Fig. 13, in which we show FET as a function of M for K = 10 and four values of $S_o$.
Note that FET increases as M decreases and there is a critical value of M below which the channel is always stable in the sense of Fig. 7(a). As M increases to infinity, FET reaches a limiting value corresponding to the infinite population model with a Poisson channel input. Fig. 14 is similar to Fig. 12 except now the number of users
Fig. 12. FET values for the infinite population model.
Fig. 15. Stability-throughput-delay tradeoff.
M is 150. Recall that if M is finite, the channel will become stable when K is sufficiently large. As an example, we see that in Fig. 14 for M = 150, if the channel throughput rate $S_o$ is kept at approximately 0.28 and K = 10 is used, the channel is estimated to fail once every two days on the average. If this is an acceptable level of channel reliability, then no other channel control procedure is necessary except to restart the channel whenever it goes into saturation. However, if absolute channel reliability is required at the same throughput-delay performance, then dynamic channel control strategies should be adopted. Channel control schemes have been studied [10] and the results will be published in a forthcoming paper [1]. In Fig. 15, we show the optimum performance envelope in Fig. 2 as a lower bound for the throughput-delay tradeoff of the infinite population model. This corresponds to the performance of the channel at the channel operating point. However, from Fig. 7, we see that the
Fig. 13. FET versus M.
Fig. 14. FET values for a finite user population (M = 150).
channel operating point $(n_o, S_o)$ provides no information regarding the stability behavior of the channel. The equilibrium performance given by $(n_o, S_o)$ is achievable in the long run if M is small enough such that the channel is stable; elsewhere it is achievable only for some random time period whose average is estimated by our stability measure FET. In addition to the infinite population model optimum envelope, we also show in Fig. 15 two sets of equilibrium throughput-delay performance curves with guaranteed FET values. The first set consists of three solid curves corresponding to an infinite population model with the stability measure FET $\ge$ 1 day, 1 hour, and 1 minute. Again, these results represent worst case estimates if M is actually finite. The second set consists of two dashed curves corresponding to M = 150 with FET $\ge$ 1 day and 1 hour. These results were obtained by looking up the values of K and $S_o$ in Fig. 12 or Fig. 14 corresponding to a
fixed FET. The average packet delay was then obtained from Fig. 2. This figure illustrates the fundamental tradeoff among channel stability, throughput and delay. In [1], [10], control strategies are devised to dynamically regulate the channel usage to achieve truly stable throughput-delay performance close to the optimum performance envelope.

A Design Example
The designer of a slotted ALOHA channel is faced with the problem of deciding whether he wants 1) a stable channel by limiting its use to a small population of users and sacrificing channel utilization, or 2) an unstable channel which supports a large population of users operating at a certain level of reliability (some value of FET). For example, suppose K is chosen to be 10. (Note in Fig. 2 that K = 10 gives close to optimum equilibrium throughput-delay performance over a wide range of channel throughput rate.) Also, suppose that the channel users have an average think time of 20 s which, for our channel numerical constants, corresponds to 888 time slots. Now if we draw channel load lines in Fig. 4 with a slope equal to -888, the channel is stable up to approximately 110 channel users. For M = 110, the channel throughput rate $S_o$ is about 0.125 packet/slot. From Fig. 2, the average packet delay is roughly 16.5 time slots (= 0.37 s). The same channel can be used (in an unstable mode) to support 220 users at a channel throughput rate of $S_o$ = 0.25 packet/slot. The average packet delay is 21 time slots (= 0.47 s). From Fig. 12, for K = 10 and $S_o$ = 0.25, the average up time (FET) of the channel is approximately two days for the infinite population model. Note that this value represents a lower bound for the FET of M = 220. Thus, we see that if a channel failure rate of once every two days on the average is an acceptable level of reliability, the second channel design is much more attractive than the first since the number of channel users is more than doubled at a modest increase in delay.

CONCLUSIONS
In this paper, the rationale and some advantages for broadcast packet communication have been discussed. A mathematical model was then formulated for a slotted ALOHA random access system. Using this model, a theory was put forth which gives a coherent qualitative interpretation of the system stability behavior. Quantitative estimates for the relative instability of unstable channels were obtained through definition of the stability measure FET. Numerical results were shown illustrating the trading relations among channel stability, throughput and average packet delay. These results establish tools for the performance evaluation and design of an uncontrolled slotted ALOHA system. Further improvement in the system performance may be accomplished through adaptive control techniques studied in [1], [10].

APPENDIX
The algorithm below solves for the variables $\{t_i\}_{i=0}^{I}$ in the following set of (I + 1) linear simultaneous equations,

\[
t_0 = h_0 + \sum_{j=0}^{I} p_{0j}\, t_j
\tag{A1}
\]

\[
t_i = h_i + \sum_{j=i-1}^{I} p_{ij}\, t_j, \qquad i = 1, 2, \ldots, I.
\tag{A2}
\]
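The Algorithm
1) Define

\[
e_I = 1, \qquad f_I = 0, \qquad e_{I-1} = \frac{1 - p_{II}}{p_{I,I-1}}, \qquad f_{I-1} = -\frac{h_I}{p_{I,I-1}}.
\]

2) For i = I - 1, I - 2, ..., 1 solve recursively

\[
e_{i-1} = \frac{1}{p_{i,i-1}} \Big[ e_i - \sum_{j=i}^{I} p_{ij}\, e_j \Big], \qquad
f_{i-1} = \frac{1}{p_{i,i-1}} \Big[ f_i - h_i - \sum_{j=i}^{I} p_{ij}\, f_j \Big].
\]

3) Let

\[
t_I = \frac{f_0 - h_0 - \sum_{j=0}^{I} p_{0j}\, f_j}{\sum_{j=0}^{I} p_{0j}\, e_j - e_0}, \qquad
t_i = e_i\, t_I + f_i, \quad i = 0, 1, 2, \ldots, I - 1.
\]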
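The three steps can be sketched in Python as follows. This is a minimal sketch, not the authors' program: the function name `solve_appendix_system`, the callable `p(i, j)` that produces coefficients on demand, and the toy matrix `Q` are assumptions made for the example. Step 1 is folded into the loop of step 2, since setting i = I in the recursion with $e_I = 1$, $f_I = 0$ reproduces the expressions for $e_{I-1}$ and $f_{I-1}$.

```python
def solve_appendix_system(p, h):
    """Solve (A1)-(A2) for t_0..t_I, assuming p(i, i-1) != 0 for i = 1..I.

    p : callable returning the coefficient p_ij; it is called exactly once
        for every (i, j) with j >= i - 1, so no matrix has to be stored
    h : list [h_0, ..., h_I]
    """
    I = len(h) - 1
    e = [0.0] * (I + 1)
    f = [0.0] * (I + 1)
    e[I], f[I] = 1.0, 0.0                      # step 1: e_I = 1, f_I = 0
    for i in range(I, 0, -1):                  # steps 1-2: rows I, I-1, ..., 1
        sum_e = sum_f = 0.0
        for j in range(i, I + 1):
            pij = p(i, j)
            sum_e += pij * e[j]
            sum_f += pij * f[j]
        sub = p(i, i - 1)                      # subdiagonal coefficient
        e[i - 1] = (e[i] - sum_e) / sub
        f[i - 1] = (f[i] - h[i] - sum_f) / sub
    sum_e = sum_f = 0.0
    for j in range(I + 1):                     # step 3: row 0 fixes t_I
        p0j = p(0, j)
        sum_e += p0j * e[j]
        sum_f += p0j * f[j]
    t_last = (f[0] - h[0] - sum_f) / (sum_e - e[0])
    return [e[i] * t_last + f[i] for i in range(I)] + [t_last]


# Toy coefficients with p_ij = 0 for j <= i - 2; invented for illustration.
Q = [
    [0.5, 0.3, 0.1],
    [0.2, 0.5, 0.2],
    [0.0, 0.3, 0.5],
]
t = solve_appendix_system(lambda i, j: Q[i][j], [1.0, 1.0, 1.0])
print(t)
```

With every $h_i = 1$ and the $p_{ij}$ taken from the safe region of a chain, the returned values are the mean first exit times of (12), so the first entry plays the role of FET.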
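Derivation of the Algorithm
Define

\[
t_i = e_i\, t_I + f_i, \qquad i = 0, 1, 2, \ldots, I - 1
\tag{A3}
\]

and

\[
e_I = 1, \qquad f_I = 0.
\tag{A4}
\]

The last equation in (A2) is

\[
t_I = h_I + p_{I,I-1}\, t_{I-1} + p_{II}\, t_I.
\]

Substituting $t_{I-1} = e_{I-1}\, t_I + f_{I-1}$ into the above equation, we get

\[
t_I = h_I + p_{I,I-1}\, e_{I-1}\, t_I + p_{I,I-1}\, f_{I-1} + p_{II}\, t_I.
\]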
Equating the coefficients of $t_I$ and the constant terms, we have

\[
e_{I-1} = \frac{1 - p_{II}}{p_{I,I-1}}, \qquad f_{I-1} = -\frac{h_I}{p_{I,I-1}}.
\tag{A5}
\]

Equation (A2) can be rewritten as follows,

\[
t_{i-1} = \frac{1}{p_{i,i-1}} \Big[ t_i - h_i - \sum_{j=i}^{I} p_{ij}\, t_j \Big].
\tag{A6}
\]

In each of the above equations, use (A3) to substitute for $t_i$. We then have

\[
e_{i-1}\, t_I + f_{i-1} = \frac{1}{p_{i,i-1}} \Big[ e_i\, t_I + f_i - h_i - \Big( \sum_{j=i}^{I} p_{ij}\, e_j \Big) t_I - \sum_{j=i}^{I} p_{ij}\, f_j \Big].
\]

Equating the coefficients of $t_I$ and the constant terms, we get

\[
e_{i-1} = \frac{1}{p_{i,i-1}} \Big[ e_i - \sum_{j=i}^{I} p_{ij}\, e_j \Big], \qquad
f_{i-1} = \frac{1}{p_{i,i-1}} \Big[ f_i - h_i - \sum_{j=i}^{I} p_{ij}\, f_j \Big].
\tag{A7}
\]

From (A4), (A5), and (A7), $e_i$ and $f_i$ (i = I - 2, I - 3, ..., 1, 0) can then be determined recursively. We next solve for $t_I$. Equation (A3) is used to substitute for $t_i$ in (A1), which then becomes

\[
e_0\, t_I + f_0 = h_0 + \Big( \sum_{j=0}^{I} p_{0j}\, e_j \Big) t_I + \sum_{j=0}^{I} p_{0j}\, f_j.
\]

Solving for $t_I$ in the above equation, we have

\[
t_I = \frac{f_0 - h_0 - \sum_{j=0}^{I} p_{0j}\, f_j}{\sum_{j=0}^{I} p_{0j}\, e_j - e_0}.
\tag{A8}
\]

Finally, $t_i$ (i = 0, 1, 2, ..., I - 1) can be obtained from (A3), since $e_i$, $f_i$, and $t_I$ are all known. The derivation of the algorithm is now complete.

REFERENCES
[1] S. S. Lam and L. Kleinrock, "Packet switching in a multiaccess broadcast channel: dynamic control procedures," IEEE Trans. Commun., to be published; also in IBM Corp., Yorktown Heights, N. Y., Res. Rep. RC-5062, Oct. 1974.
[2] N. Abramson, "The ALOHA system - another alternative for computer communications," in 1970 Fall Joint Comput. Conf., AFIPS Conf. Proc., vol. 37. Montvale, N. J.: AFIPS Press, 1970, pp. 281-285.
[3] W. Crowther, R. Rettberg, D. Walden, S. Ornstein, and F. Heart, "A system for broadcast communication: reservation-ALOHA," in Proc. 6th Hawaii Int. Conf. System Sciences, Univ. Hawaii, Honolulu, Jan. 1973.
[4] R. M. Metcalfe, "Steady-state analysis of a slotted and controlled ALOHA system with blocking," in Proc. 6th Hawaii Int. Conf. System Sciences, Univ. Hawaii, Honolulu, Jan. 1973.
[5] N. Abramson, "Packet switching with satellites," in 1973 Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42. New York: AFIPS Press, 1973, pp. 695-702.
[6] L. Kleinrock and S. S. Lam, "Packet-switching in a slotted satellite channel," in 1973 Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42. New York: AFIPS Press, 1973, pp. 703-710.
[7] L. G. Roberts, "Dynamic allocation of satellite capacity through packet reservation," in 1973 Nat. Comput. Conf., AFIPS Conf. Proc., vol. 42. New York: AFIPS Press, 1973, pp. 711-716.
[8] L. Kleinrock and S. S. Lam, "On stability of packet switching in a random multi-access broadcast channel," in Proc. 7th Hawaii Int. Conf. System Sciences (Special Subconf. Computer Nets), Univ. Hawaii, Honolulu, Jan. 8-10, 1974.
[9] S. Butterfield, R. Rettberg, and D. Walden, "The satellite IMP for the ARPA network," in Proc. 7th Hawaii Int. Conf. System Sciences (Special Subconf. Computer Nets), Univ. Hawaii, Honolulu, Jan. 8-10, 1974.
[10] S. S. Lam, "Packet switching in a multi-access broadcast channel with application to satellite communication in a computer network," Ph.D. dissertation, Dep. Comput. Sci., Univ. Calif., Los Angeles, Mar. 1974; also in Univ. of Calif., Los Angeles, Tech. Rep. UCLA-ENG-7429, Apr. 1974.
[11] L. G. Roberts, "Data by the packet," IEEE Spectrum, vol. 11, pp. 46-51, Feb. 1974.
[12] L. G. Roberts and B. D. Wessler, "Computer network development to achieve resource sharing," in 1970 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 36. Montvale, N. J.: AFIPS Press, 1970, pp. 543-549.
[13] P. E. Jackson and C. D. Stubbs, "A study of multiaccess computer communications," in 1969 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 34. Montvale, N. J.: AFIPS Press, 1969, pp. 491-504.
[14] J. Martin, Systems Analysis for Data Transmission. Englewood Cliffs, N. J.: Prentice-Hall, 1972.
[15] J. R. Pierce, "Network for block switching of data," in IEEE Conv. Rec., New York, Mar. 1971.
[16] W. W. Chu, "A study of asynchronous time division multiplexing for time-sharing computer systems," in 1969 Fall Joint Comput. Conf., AFIPS Conf. Proc., vol. 35. Montvale, N. J.: AFIPS Press, 1969, pp. 669-678.
[17] P. Baran, "On distributed communications: XI. Summary overview," Rand Corp., Santa Monica, Calif., Memo. RM-3767-PR, Aug. 1964.
[18] L. Kleinrock, Communication Nets: Stochastic Message Flow and Delay. New York: McGraw-Hill, 1964 (out of print); reprinted by New York: Dover, 1972.
[19] D. W. Davies, "The principles of a data communication network for computers and remote peripherals," in Proc. Int. Fed. Information Processing Congr., Edinburgh, Scotland, 1968, pp. D11-D15.
[20] P. Wright, "Facing a booming demand for networks," Datamation, vol. 19, pp. 138-139, Nov. 1973.
[21] H. Frank, M. Gerla, and W. Chou, "Issues in the design of large distributed computer communication networks," in Proc. Nat. Telecommunications Conf., Atlanta, Ga., Nov. 26-28, 1973.
[22] L. G. Roberts, "Extensions of packet communication technology to a hand held personal terminal," in 1972 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 40. Montvale, N. J.: AFIPS Press, 1972, pp. 295-298.
[23] In Inst. Elec. Eng. (London) Proc. Int. Conf. Satellite Systems for Mobile Communications and Surveillance, Mar. 13-15, 1973.
[24] N. T. Gaarder, "ARPANET satellite system," ARPA Network Inform. Center, Stanford Res. Inst., Menlo Park, Calif., ASS Note 3 (NIC 11285), Apr. 1972.
[25] L. G. Roberts, "ALOHA packet system with and without slots and capture," ARPA Network Inform. Center, Stanford Res. Inst., Menlo Park, Calif., ASS Note 8 (NIC 11290), June 1972.
[26] L. Kleinrock and F. A. Tobagi, "Carrier-sense multiple access for packet switched radio channels," in Proc. Int. Conf. Communications, Minneapolis, Minn., June 1974.
[27] L. Kleinrock, Queueing Systems, Vol. I, Theory, Vol. II, Computer Applications. New York: Wiley-Interscience, 1975.
[28] E. Parzen, Stochastic Processes. San Francisco, Calif.: Holden-Day, 1962.
[29] J. W. Cohen, The Single Server Queue. New York: Wiley, 1969.
[30] R. Howard, Dynamic Probabilistic Systems, Vol. 1: Markov Models and Vol. 2: Semi-Markov and Decision Processes. New York: Wiley, 1971.
[31] E. J. Craig, Laplace and Fourier Transforms for Electrical Engineers. New York: Holt, Rinehart, and Winston, 1964.
Leonard Kleinrock (S'55-M'64-SM'71-F'73) was born in New York, N. Y., on June 13, 1934. He received the B.E.E. degree from the City College of New York, N. Y., in 1957, and the S.M.E.E. and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1959 and 1963, respectively, while participating in the Lincoln Laboratory Staff Associate Program. From 1951 to 1957, he was employed at the Photobell Company, Inc., New York, N. Y., an industrial electronics firm. He spent the summers from 1957 to 1961 at the M.I.T. Lincoln Laboratory, Lexington, Mass., first in the Digital Computer Group and later in the Systems Analysis Group. At M.I.T. he was a Research Assistant, initially with the Electronic Systems Laboratory, and later with the Research Laboratory for Electronics, where he worked on communication nets in the Information Processing and Transmission Group. After completing his graduate work at the end of 1962, he worked at Lincoln Laboratory on communication nets and on signal detection. In 1963 he accepted a position on the faculty at the University of California, Los Angeles, where he is now Professor of Computer Science. He is a referee for numerous scholarly publications, book reviewer for several publishers, and a consultant for various aerospace, research, and governmental organizations. He is principal investigator of a large contract with the Advanced Research Projects Agency (ARPA) of the Department of Defense. He has published over 60 papers and is the author of Communication Nets: Stochastic Message Flow and Delay (New York: McGraw-Hill, 1964), Queueing Systems, Vol. 1: Theory and Vol. 2: Computer Applications (New York: Wiley-Interscience, 1975). His main interests are in communication nets, computer nets, data compression, priority queueing theory, and theoretical studies of time-shared systems. Dr. Kleinrock is a member of Tau Beta Pi, Eta Kappa Nu, Sigma Xi, the Operations Research Society of America, and the Association for Computing Machinery. He was awarded a Guggenheim Fellowship in 1971.

Simon S. Lam (S'69-M'74) was born in Macao on July 31, 1947. He received the B.S.E.E. degree in electrical engineering from Washington State University, Pullman, in 1969, and the M.S. and Ph.D. degrees in engineering from the University of California, Los Angeles, in 1970 and 1974, respectively. At the University of California, Los Angeles, he held a Phi Kappa Phi Fellowship from 1969 to 1970, and a Chancellor's Teaching Fellowship from 1969 to 1973. He also participated in the ARPA Network project at UCLA as a postgraduate research engineer from 1972 to 1974 and did research on satellite packet communication. Since June 1974 he has been a research staff member with the IBM Thomas J. Watson Research Center, Yorktown Heights, N. Y. His current research interests include computer-communication networks and queueing theory. Dr. Lam is a member of Tau Beta Pi, Sigma Tau, Phi Kappa Phi, Pi Mu Epsilon, and the Association for Computing Machinery.
A Protocol for Packet Network Intercommunication
VINTON G. CERF AND ROBERT E. KAHN, MEMBER, IEEE

Abstract-A protocol that supports the sharing of resources that exist in different packet switching networks is presented. The protocol provides for variation in individual network packet sizes, transmission failures, sequencing, flow control, end-to-end error checking, and the creation and destruction of logical process-to-process connections. Some implementation issues are considered, and problems such as internetwork routing, accounting, and timeouts are exposed.

Paper approved by the Associate Editor for Data Communications of the IEEE Communications Society for publication without oral presentation. Manuscript received November 5, 1973. The research reported in this paper was supported in part by the Advanced Research Projects Agency of the Department of Defense under Contract DAHC 15-73-C-0370. V. G. Cerf is with the Department of Computer Science and Electrical Engineering, Stanford University, Stanford, Calif. R. E. Kahn is with the Information Processing Technology Office, Advanced Research Projects Agency, Department of Defense, Arlington, Va.

INTRODUCTION
IN THE LAST few years considerable effort has been expended on the design and implementation of packet switching networks [1]-[7], [14], [17]. A principal reason for developing such networks has been to facilitate the sharing of computer resources. A packet communication network includes a transportation mechanism for delivering data between computers or between computers and terminals. To make the data meaningful, computers and terminals share a common protocol (i.e., a set of agreed upon conventions). Several protocols have already been developed for this purpose [8]-[12], [16]. However, these protocols have addressed only the problem of communication on the same network. In this paper we present a protocol design and philosophy that supports the sharing of resources that exist in different packet switching networks.

After a brief introduction to internetwork protocol issues, we describe the function of a GATEWAY as an interface between networks and discuss its role in the protocol. We then consider the various details of the protocol, including addressing, formatting, buffering, sequencing, flow control, error control, and so forth. We close with a description of an interprocess communication mechanism and show how it can be supported by the internetwork protocol.

Even though many different and complex problems must be solved in the design of an individual packet switching network, these problems are manifestly compounded when dissimilar networks are interconnected. Issues arise which may have no direct counterpart in an individual network and which strongly influence the way in which internetwork communication can take place.

A typical packet switching network is composed of a set of computer resources called HOSTS, a set of one or more packet switches, and a collection of communication media that interconnect the packet switches. Within each HOST, we assume that there exist processes which must communicate with processes in their own or other HOSTS. Any current definition of a process will be adequate for our purposes [13]. These processes are generally the ultimate source and destination of data in the network. Typically, within an individual network, there exists a protocol for communication between any source and destination process. Only the source and destination processes require knowledge of this convention for communication to take place. Processes in two distinct networks would ordinarily use different protocols for this purpose. The ensemble of packet switches and communication media is called the packet switching subnet. Fig. 1 illustrates these ideas.

In a typical packet switching subnet, data of a fixed maximum size are accepted from a source HOST, together with a formatted destination address which is used to route the data in a store and forward fashion. The transmit time for this data is usually dependent upon internal network parameters such as communication media data rates, buffering and signaling strategies, routing, propagation delays, etc. In addition, some mechanism is generally present for error handling and determination of status of the network's components.

Individual packet switching networks may differ in their implementations as follows.
1) Each network may have distinct ways of addressing the receiver, thus requiring that a uniform addressing scheme be created which can be understood by each individual network.
2) Each network may accept data of different maximum size, thus requiring networks to deal in units of the smallest maximum size (which may be impractically small) or requiring procedures which allow data crossing a network boundary to be reformatted into smaller pieces.
3) The success or failure of a transmission and its performance in each network is governed by different time delays in accepting, delivering, and transporting the data. This requires careful development of internetwork timing procedures to insure that data can be successfully delivered through the various networks.
4) Within each network, communication may be disrupted due to unrecoverable mutation of the data or missing data. End-to-end restoration procedures are desirable to allow complete recovery from these conditions.
Reprinted from IEEE Transactions on Communications, vol. COM-22, no. 5, May 1974. The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
663
664
THE BEST OF THE BEST
intact the internal operation of each individual network This is easily achieved if two networks interconnect as if each were a HOST to the other network, but without utilizing or indeed incorporating any elaborate HOS'] protocol transformations. It is thus apparent that the interface between network. must play a central role in the development of any net work interconnection strategy. We give a special name t< this interface that performs these functions and call it ~ GATEWAY.
THE GATEWAY NOTION PACKET-SWITCHING NETWORK
PS • PACKET SWITCH
Fig. 1.- Typical packet switching network.
In Fig. 2 we illustrate three individual networks labeler A, B, and C which are joined by GATE'VAYS Ivl and N interfaces network A with network B, an: N interfaces network B to network C. W assume that an individual network may have more- thar one GATEWAY (e.g., network B) and that there may b more than one GATEWAY path to use in going bet,veen : pair of networks. The responsibility for properly routiru data resides in the GATEWAY. . In practice, a GATEWAY between two networks may b composed of two halves, each associated with its O\Vl network, It is possible to implement each half of a GATE WAY so it need only embed internetwork packets in loca packet .format or extract them, We propose that th GATEWAYS handle internetwork packets in a standan format, but we are" not proposing any particular trans mission procedure between GATEWAY halves. Let us now trace the flow of data through the inter connected networks. We assume a packet of data fran process X enters network A destined for process Y il network C. The address of Y is initially specified b: process X and the address of GATEWAY ill is derived fron the address of process Y. We make no attempt to specif whether the choice of GATEWAY is made by process X its nosr, or one of the packet switches in network ~4. Th packet traverses network A until it reaches GATEWAY itt At the GATEWAY, the packet is reformatted to meet th requirements of network B, account is taken of this uni of flow between A and B, and the GA'l'EW AY delivers th packet to network B. Again the derivation of the nex GA'rEWAY add~ess is accomplished based on the address 0 the destination Y. In this case, GATE\VAY ...N is the next onr The packet traverses network B until it finally reache GA'l'EWAY N where it is formatted .to meet the requirement of network C. Account is again taken of this unit of flo, between networks Band C. 
Upon entering network C, the packet is routed to the HOST in which process Y resides and there it is delivered to its ultimate destination.
Since the GATEWAY must understand the address of the source and destination HOSTS, this information must be available in a standard format in every packet which arrives at the GATEWAY. This information is contained in an internetwork header prefixed to the packet by the source HOST. The packet format, including the internet
5) Status, information, routing, fault detection, and isolation are typically different in each network. Thus, to obtain verification of certain conditions, such as an inaccessible or dead destination, various kinds of coordination must be invoked between the communicating networks.
It would be extremely convenient if all the differences between networks could be economically resolved by suitable interfacing at the network boundaries. For many of the differences, this objective can be achieved. However, both economic and technical considerations lead us to prefer that the interface be as simple and reliable as possible and deal primarily with passing data between networks that use different packet switching strategies.
The question now arises as to whether the interface ought to account for differences in HOST or process level protocols by transforming the source conventions into the corresponding destination conventions. We obviously want to allow conversion between packet switching strategies at the interface, to permit interconnection of existing and planned networks. However, the complexity and dissimilarity of the HOST or process level protocols makes it desirable to avoid having to transform between them at the interface, even if this transformation were always possible. Rather, compatible HOST and process level protocols must be developed to achieve effective internetwork resource sharing. The unacceptable alternative is for every HOST or process to implement every protocol (a potentially unbounded number) that may be needed to communicate with other networks. We therefore assume that a common protocol is to be used between HOSTS or processes in different networks and that the interface between networks should take as small a role as possible in this protocol.
To allow networks under different ownership to interconnect, some accounting will undoubtedly be needed for traffic that passes across the interface. In its simplest terms, this involves an accounting of packets handled by each net for which charges are passed from net to net until the buck finally stops at the user or his representative. Furthermore, the interconnection must preserve
Fifty Years of Communications and Networking
Fig. 2. Three networks interconnected by two GATEWAYS.
Fig. 3. Internetwork packet format (fields not shown to scale).
work header, is illustrated in Fig. 3. The source and destination entries uniformly and uniquely identify the address of every HOST in the composite network. Addressing is a subject of considerable complexity which is discussed in greater detail in the next section. The next two entries in the header provide a sequence number and a byte count that may be used to properly sequence the packets upon delivery to the destination and may also enable the GATEWAYS to detect fault conditions affecting the packet. The flag field is used to convey specific control information and is discussed in the section on retransmission and duplicate detection later. The remainder of the packet consists of text for delivery to the destination and a trailing check sum used for end-to-end software verification. The GATEWAY does not modify the text and merely forwards the check sum along without computing or recomputing it.
Each network may need to augment the packet format before it can pass through the individual network. We have indicated a local header in the figure which is prefixed to the beginning of the packet. This local header is introduced merely to illustrate the concept of embedding an internetwork packet in the format of the individual network through which the packet must pass. It will obviously vary in its exact form from network to network and may even be unnecessary in some cases. Although not explicitly indicated in the figure, it is also possible that a local trailer may be appended to the end of the packet.
Unless all transmitted packets are legislatively restricted to be small enough to be accepted by every individual network, the GATEWAY may be forced to split a packet into two or more smaller packets. This action is called fragmentation and must be done in such a way that the destination is able to piece together the fragmented packet. It is clear that the internetwork header format imposes a minimum packet size which all networks must carry (obviously all networks will want to carry packets larger than this minimum). We believe the long range growth and development of internetwork communication would be seriously inhibited by specifying how much larger than the minimum a packet size can be, for the following reasons.
1) If a maximum permitted packet size is specified then it becomes impossible to completely isolate the internal packet size parameters of one network from the internal packet size parameters of all other networks.
2) It would be very difficult to increase the maximum permitted packet size in response to new technology (e.g., large memory systems, higher data rate communication facilities, etc.) since this would require the agreement and then implementation by all participating networks.
3) Associative addressing and packet encryption may require the size of a particular packet to expand during transit for incorporation of new information.
Provision for fragmentation (regardless of where it is performed) permits packet size variations to be handled on an individual network basis without global administration and also permits HOSTS and processes to be insulated from changes in the packet sizes permitted in any networks through which their data must pass.
If fragmentation must be done, it appears best to do it upon entering the next network at the GATEWAY since only this GATEWAY (and not the other networks) must be aware of the internal packet size parameters which made the fragmentation necessary.
If a GATEWAY fragments an incoming packet into two or more packets, they must eventually be passed along to the destination HOST as fragments or reassembled for the HOST. It is conceivable that one might desire the GATEWAY to perform the reassembly to simplify the task of the destination HOST (or process) and/or to take advantage of a larger packet size. We take the position that GATEWAYS should not perform this function since GATEWAY reassembly can lead to serious buffering problems, potential deadlocks, the necessity for all fragments of a packet to pass through the same GATEWAY, and increased delay in transmission. Furthermore, it is not sufficient for the GATEWAYS to provide this function since the final GATEWAY may also have to fragment a packet for transmission. Thus the destination HOST must be prepared to do this task.
Let us now turn briefly to the somewhat unusual accounting effect which arises when a packet may be fragmented by one or more GATEWAYS.
We assume, for simplicity, that each network initially charges a fixed rate per packet transmitted, regardless of distance, and if one network can handle a larger packet size than another, it charges a proportionally larger price per packet. We also assume that a subsequent increase in any network's packet size does not result in additional cost per packet to its users. The charge to a user thus remains basically constant through any net which must fragment a packet. The unusual effect occurs when a packet is fragmented into smaller packets which must individually pass through a subsequent network with a larger packet size than the original unfragmented packet. We expect that most networks will naturally select packet sizes close to one another, but in any case, an increase in packet size in one net, even when it causes fragmentation, will not increase the cost of transmission and may actually decrease it. In the event that any other packet charging policies (than
THE BEST OF THE BEST
the one we suggest) are adopted, differences in cost can be used as an economic lever toward optimization of individual network performance.
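As a rough illustration of the accounting effect described above, the following Python sketch works through one reading of the charging assumptions: each net charges a fixed price per packet, proportional to its own maximum packet size, and fragments are never reassembled in transit. The function name `path_charges` and all packet sizes are hypothetical and chosen only for the arithmetic; they do not come from the text.

```python
def path_charges(packet_bytes, net_max_sizes):
    """Return the charge levied by each net along a path.

    Assumption (ours): a net with maximum packet size m charges m units
    per packet carried, whatever the packet's actual length.
    """
    sizes = [packet_bytes]          # pieces currently in flight
    charges = []
    for max_bytes in net_max_sizes:
        next_sizes = []
        for size in sizes:
            # The GATEWAY entering this net fragments oversized pieces.
            while size > max_bytes:
                next_sizes.append(max_bytes)
                size -= max_bytes
            next_sizes.append(size)
        sizes = next_sizes
        charges.append(len(sizes) * max_bytes)
    return charges


# A 1000-byte packet crossing nets with maximum sizes 1000, 250, and 2000:
# the middle net fragments it into four pieces and charges the same total,
# but the last net then charges its larger per-packet price four times.
print(path_charges(1000, [1000, 250, 2000]))   # [1000, 1000, 8000]
# Without the small middle net, the last net carries a single packet.
print(path_charges(1000, [1000, 2000]))        # [1000, 2000]
```

Under these assumed prices the charge stays constant through the net that fragments, and the extra cost appears only in the later net with the larger packet size.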
PROCESS LEVEL COMMUNICATION
We suppose that processes wish to communicate in full duplex with their correspondents using unbounded but finite length messages. A single character might constitute the text of a message from a process to a terminal or vice versa. An entire page of characters might constitute the text of a message from a file to a process. A data stream (e.g., a continuously generated bit string) can be represented as a sequence of finite length messages.
Within a HOST we assume the existence of a transmission control program (TCP) which handles the transmission and acceptance of messages on behalf of the processes it serves. The TCP is in turn served by one or more packet switches connected to the HOST in which the TCP resides. Processes that want to communicate present messages to the TCP for transmission, and TCP's deliver incoming messages to the appropriate destination processes. We allow the TCP to break up messages into segments because the destination may restrict the amount of data that may arrive, because the local network may limit the maximum transmission size, or because the TCP may need to share its resources among many processes concurrently. Furthermore, we constrain the length of a segment to an integral number of 8-bit bytes. This uniformity is most helpful in simplifying the software needed with HOST machines of different natural word lengths. Provision at the process level can be made for padding a message that is not an integral number of bytes and for identifying which of the arriving bytes of text contain information of interest to the receiving process.
Multiplexing and demultiplexing of segments among processes are fundamental tasks of the TCP. On transmission, a TCP must multiplex together segments from different source processes and produce internetwork packets for delivery to one of its serving packet switches. On reception, a TCP will accept a sequence of packets from its serving packet switch(es).
From this sequence of arriving packets (generally from different HOSTS), the TCP must be able to reconstruct and deliver messages to the proper destination processes. We assume that every segment is augmented with additional information that allows transmitting and receiving TCP's to identify destination and source processes, respectively. At this point, we must face a major issue. How should the source TCP format segments destined for the same destination TCP? We consider two cases.
Case 1): If we take the position that segment boundaries are immaterial and that a byte stream can be formed of segments destined for the same TCP, then we may gain improved transmission efficiency and resource sharing by arbitrarily parceling the stream into packets, permitting many segments to share a single internetwork packet header. However, this position results in the need to reconstruct exactly, and in order, the stream of text bytes produced by the source TCP. At the destination, this stream must first be parsed into segments and these in turn must be used to reconstruct messages for delivery to the appropriate processes.
There are fundamental problems associated with this strategy due to the possible arrival of packets out of order at the destination. The most critical problem appears to be the amount of interference that processes sharing the same TCP-TCP byte stream may cause among themselves. This is especially so at the receiving end. First, the TCP may be put to some trouble to parse the stream back into segments and then distribute them to buffers where messages are reassembled. If it is not readily apparent that all of a segment has arrived (remember, it may come as several packets), the receiving TCP may have to suspend parsing temporarily until more packets have arrived. Second, if a packet is missing, it may not be clear whether succeeding segments, even if they are identifiable, can be passed on to the receiving process, unless the TCP has knowledge of some process level sequencing scheme. Such knowledge would permit the TCP to decide whether a succeeding segment could be delivered to its waiting process. Finding the beginning of a segment when there are gaps in the byte stream may also be hard.
Case 2): Alternatively, we might take the position that the destination TCP should be able to determine, upon its arrival and without additional information, for which process or processes a received packet is intended, and if so, whether it should be delivered then.
If the TCP is to determine for which process an arriving packet is intended, every packet must contain a process header (distinct from the internetwork header) that completely identifies the destination process. For simplicity, we assume that each packet contains text from a single process which is destined for a single process.
Thus each packet need contain only one process header. To decide whether the arriving data is deliverable to the destination process, the TCP must be able to determine whether the data is in the proper sequence (we can make provision for the destination process to instruct its TCP to ignore sequencing, but this is considered a special case). With the assumption that each arriving packet contains a process header, the necessary sequencing and destination process identification is immediately available to the destination TCP.
Both Cases 1) and 2) provide for the demultiplexing and delivery of segments to destination processes, but only Case 2) does so without the introduction of potential interprocess interference. Furthermore, Case 1) introduces extra machinery to handle flow control on a HOST-to-HOST basis, since there must also be some provision for process level control, and this machinery is little used since the probability is small that within a given HOST, two processes will be coincidentally scheduled to send messages to the same destination HOST. For this reason, we select the method of Case 2) as a part of the internetwork transmission protocol.
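The Case 2) approach can be sketched in a few lines of Python. The sketch assumes a hypothetical `Packet` record carrying only a destination port and a sequence number taken from its headers; the field names and the dictionary-based grouping are illustrative choices, not part of the protocol description.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Packet:
    dst_port: int   # destination port, read from the process header
    seq: int        # sequence number of the first text byte
    text: bytes     # text from a single source process


def demultiplex(packets):
    """Group arriving packets by destination port.

    Each packet is classified from its own headers alone, so a missing
    or late packet for one port never blocks delivery to another port.
    """
    per_port = defaultdict(list)
    for packet in packets:
        per_port[packet.dst_port].append(packet)
    # Within a port, order the packets by sequence number.
    return {port: sorted(group, key=lambda p: p.seq)
            for port, group in per_port.items()}


arrivals = [Packet(7, 500, b"b"), Packet(9, 0, b"x"), Packet(7, 0, b"a")]
streams = demultiplex(arrivals)
print([p.text for p in streams[7]])   # [b'a', b'b']
```

Because no byte stream is shared between ports, the parsing and gap problems listed under Case 1) do not arise in this arrangement.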
ADDRESS FORMATS
The selection of address formats is a problem between networks because the local network addresses of TCP's may vary substantially in format and size. A uniform internetwork TCP address space, understood by each GATEWAY and TCP, is essential to routing and delivery of internetwork packets.
Similar troubles are encountered when we deal with process addressing and, more generally, port addressing. We introduce the notion of ports in order to permit a process to distinguish between multiple message streams. The port is simply a designator of one such message stream associated with a process. The means for identifying a port are generally different in different operating systems, and therefore, to obtain uniform addressing, a standard port address format is also required. A port address designates a full duplex message stream.
Fig. 4. TCP address (NETWORK, 8 bits; TCP IDENTIFIER, 16 bits).
TCP ADDRESSING
TCP addressing is intimately bound up in routing issues, since a HOST or GATEWAY must choose a suitable destination HOST or GATEWAY for an outgoing internetwork packet. Let us postulate the following address format for the TCP address (Fig. 4). The choice for network identification (8 bits) allows up to 256 distinct networks. This size seems sufficient for the foreseeable future. Similarly, the TCP identifier field permits up to 65 536 distinct TCP's to be addressed, which seems more than sufficient for any given network.
As each packet passes through a GATEWAY, the GATEWAY observes the destination network ID to determine how to route the packet. If the destination network is connected to the GATEWAY, the lower 16 bits of the TCP address are used to produce a local TCP address in the destination network. If the destination network is not connected to the GATEWAY, the upper 8 bits are used to select a subsequent GATEWAY. We make no effort to specify how each individual network shall associate the internetwork TCP identifier with its local TCP address. We also do not rule out the possibility that the local network understands the internetwork addressing scheme and thus relieves the GATEWAY of the routing responsibility.
PORT ADDRESSING
A receiving TCP is faced with the task of demultiplexing the stream of internetwork packets it receives and reconstructing the original messages for each destination process. Each operating system has its own internal means of identifying processes and ports. We assume that 16 bits are sufficient to serve as internetwork port identifiers. A sending process need not know how the destination port identification will be used. The destination TCP will be able to parse this number appropriately to find the proper buffer into which it will place arriving packets. We permit a large port number field to support processes which want to distinguish between many different message streams concurrently.
In reality, we do not care how the 16 bits are sliced up by the TCP's involved.
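The TCP address layout and the GATEWAY's routing decision described under TCP addressing can be sketched as follows. This is a minimal Python illustration, assuming the 24-bit address is held as a single integer; the function names, the set of attached networks, and the next-GATEWAY table are hypothetical, since the text deliberately leaves the routing mechanism unspecified.

```python
NETWORK_BITS = 8     # network identification field
TCP_ID_BITS = 16     # TCP identifier field


def pack_tcp_address(network, tcp_id):
    """Combine an 8-bit network ID and a 16-bit TCP identifier."""
    if not 0 <= network < (1 << NETWORK_BITS):
        raise ValueError("network ID must fit in 8 bits")
    if not 0 <= tcp_id < (1 << TCP_ID_BITS):
        raise ValueError("TCP identifier must fit in 16 bits")
    return (network << TCP_ID_BITS) | tcp_id


def unpack_tcp_address(address):
    """Split an address into (network ID, TCP identifier)."""
    return address >> TCP_ID_BITS, address & ((1 << TCP_ID_BITS) - 1)


def route(address, attached_networks, next_gateway):
    """Decide what a GATEWAY does with a packet for this address.

    attached_networks: IDs of the networks this GATEWAY connects to.
    next_gateway: assumed table mapping a network ID to the next GATEWAY.
    """
    network, tcp_id = unpack_tcp_address(address)
    if network in attached_networks:
        # Lower 16 bits are handed to the destination network.
        return ("deliver", network, tcp_id)
    # Upper 8 bits select a subsequent GATEWAY.
    return ("forward", next_gateway[network])


address = pack_tcp_address(3, 0x1234)
print(route(address, {1, 2}, {3: "GATEWAY N"}))   # ('forward', 'GATEWAY N')
print(route(address, {2, 3}, {}))                 # ('deliver', 3, 4660)
```

How the delivered 16-bit identifier is turned into a local TCP address is left to each network, as the text notes.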
Even though the transmitted port name field is large, it is still a compact external name for the internal representation of the port. The use of short names for port identifiers is often desirable to reduce transmission overhead and possibly reduce packet processing time at the destination TCP. Assigning short names to each port, however, requires an initial negotiation between source and destination to agree on a suitable short name assignment, the subsequent maintenance of conversion tables at both the source and the destination, and a final transaction to release the short name. For dynamic assignment of port names, this negotiation is generally necessary in any case.
SEGMENT AND PACKET FORMATS
As shown in Fig. 5, messages are broken by the TCP into segments whose format is shown in more detail in Fig. 6. The field lengths illustrated are merely suggestive. The first two fields (source port and destination port in the figure) have already been discussed in the preceding section on addressing. The uses of the third and fourth fields (window and acknowledgment in the figure) will be discussed later in the section on retransmission and duplicate detection. We recall from Fig. 3 that an internetwork header contains both a sequence number and a byte count, as well as a flag field and a check sum. The uses of these fields are explained in the following section.
REASSEMBLY AND SEQUENCING
The reconstruction of a message at the receiving TCP clearly requires¹ that each internetwork packet carry a sequence number which is unique to its particular destination port message stream. The sequence numbers must be monotonic increasing (or decreasing) since they are used to reorder and reassemble arriving packets into a message. If the space of sequence numbers were infinite, we could simply assign the next one to each new packet.
Clearly, this space cannot be infinite, and we will consider what problems a finite sequence number space will cause when we discuss retransmission and duplicate detection in the next section. We propose the following scheme for performing the sequencing of packets and hence the reconstruction of messages by the destination TCP.
A pair of ports will exchange one or more messages over a period of time. We could view the sequence of messages produced by one port as if it were embedded in an infinitely long stream of bytes. Each byte of the message has a unique sequence number which we take to be its byte location relative to the beginning of the stream. When a segment is extracted from the message by the source TCP and formatted for internetwork transmission, the relative location of the first byte of segment text is used as the sequence number for the packet. The byte count field in the internetwork header accounts for all the text in the segment (but does not include the check-sum bytes or the bytes in either internetwork or process header). We emphasize that the sequence number associated with a given packet is unique only to the pair of ports that are communicating (see Fig. 7). Arriving packets are examined to determine for which port they are intended. The sequence numbers on each arriving packet are then used to determine the relative location of the packet text in the messages under reconstruction. We note that this allows the exact position of the data in the reconstructed message to be determined even when pieces are still missing.
Every segment produced by a source TCP is packaged in a single internetwork packet and a check sum is computed over the text and process header associated with the segment.
The splitting of messages into segments by the TCP and the potential splitting of segments into smaller pieces by GATEWAYS creates the necessity for indicating to the destination TCP when the end of a segment (ES) has arrived and when the end of a message (EM) has arrived. The flag field of the internetwork header is used for this purpose (see Fig. 8).
The ES flag is set by the source TCP each time it prepares a segment for transmission. If it should happen that the message is completely contained in the segment, then the EM flag would also be set. The EM flag is also set on the last segment of a message, if the message could not be contained in one segment. These two flags are used by the destination TCP, respectively, to discover the presence of a check sum for a given segment and to discover that a complete message has arrived.
The ES and EM flags in the internetwork header are known to the GATEWAY and are of special importance when packets must be split apart for propagation through the next local network. We illustrate their use with an example in Fig. 9.
The original message A in Fig. 9 is shown split into two segments A1 and A2 and formatted by the TCP into a pair of internetwork packets. Packets A1 and A2 have their ES bits set, and A2 has its EM bit set as well. When packet A1 passes through the GATEWAY, it is split into two pieces: packet A11 for which neither EM nor ES bits are set, and packet A12 whose ES bit is set. Similarly, packet A2 is split such that the first piece, packet A21, has neither bit set, but packet A22 has both bits set. The sequence number field (SEQ) and the byte count field (CT) of each packet is modified by the GATEWAY to properly identify the text bytes of each packet. The GATEWAY need only examine the internetwork header to do fragmentation.
The destination TCP, upon reassembling segment A1, will detect the ES flag and will verify the check sum it knows is contained in packet A12. Upon receipt of packet A22, assuming all other packets have arrived, the destination TCP detects that it has reassembled a complete message and can now advise the destination process of its receipt.
¹ In the case of encrypted packets, a preliminary stage of reassembly may be required prior to decryption.
Fig. 5. Creation of segments and packets from messages (LH = local header, IH = internetwork header, PH = process header, CK = checksum).
Fig. 6. Segment format (process header and text; field sizes in bits).
Fig. 7. Assignment of sequence numbers.
Fig. 8. Internetwork header flag field.
Fig. 9. Message splitting and packet splitting.
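The splitting rules just described (byte-offset sequence numbers, a byte count per packet, and ES/EM flags that survive only on the last piece) can be sketched as follows. This Python fragment is a minimal sketch under stated assumptions: the `Packet` class, the function names, the 1000-byte message starting at sequence number 100, and the 250-byte limit of the second network are all illustrative, and the check sum and the modulo-n wraparound of sequence numbers are omitted.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int      # SEQ: sequence number of the first text byte
    es: bool      # end-of-segment flag
    em: bool      # end-of-message flag
    text: bytes

    @property
    def count(self):
        """CT: number of text bytes carried."""
        return len(self.text)


def fragment(packet, max_text):
    """Split a packet as a GATEWAY might, touching header fields only."""
    if max_text <= 0:
        raise ValueError("max_text must be positive")
    if packet.count <= max_text:
        return [packet]
    pieces = []
    for offset in range(0, packet.count, max_text):
        last = offset + max_text >= packet.count
        pieces.append(Packet(seq=packet.seq + offset,
                             es=packet.es and last,   # flags move to the
                             em=packet.em and last,   # final piece only
                             text=packet.text[offset:offset + max_text]))
    return pieces


def reassemble(pieces, first_seq, length):
    """Place each piece by its sequence number, in any arrival order."""
    buffer = bytearray(length)
    for piece in pieces:
        start = piece.seq - first_seq
        buffer[start:start + piece.count] = piece.text
    return bytes(buffer)


message = bytes(i % 256 for i in range(1000))
a1 = Packet(seq=100, es=True, em=False, text=message[:500])
a2 = Packet(seq=600, es=True, em=True, text=message[500:])
pieces = fragment(a1, 250) + fragment(a2, 250)
print([(p.seq, p.count, p.es, p.em) for p in pieces])
print(reassemble(reversed(pieces), 100, 1000) == message)   # True
```

Because every piece carries the sequence number of its own first byte, the destination can place text correctly regardless of arrival order or of how many times a packet was split.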
RETRANSMISSION AND DUPLICATE DETECTION
No transmission can be 100 percent reliable. We propose a timeout and positive acknowledgment mechanism which will allow TCP's to recover from packet losses from one HOST to another. A TCP transmits packets and waits for replies (acknowledgments) that are carried in the reverse packet stream. If no acknowledgment for a particular packet is received, the TCP will retransmit. It is our expectation that the HOST level retransmission mechanism, which is described in the following paragraphs, will not be called upon very often in practice. Evidence already exists² that individual networks can be effectively constructed without this feature. However, the inclusion of a HOST retransmission capability makes it possible to recover from occasional network problems and allows a wide range of HOST protocol strategies to be incorporated. We envision it will occasionally be invoked to allow HOST accommodation to infrequent overdemands for limited buffer resources, and otherwise not used much.
Any retransmission policy requires some means by which the receiver can detect duplicate arrivals. Even if an infinite number of distinct packet sequence numbers were available, the receiver would still have the problem of knowing how long to remember previously received packets in order to detect duplicates. Matters are complicated by the fact that only a finite number of distinct sequence numbers are in fact available, and if they are reused, the receiver must be able to distinguish between new transmissions and retransmissions.
A window strategy, similar to that used by the French CYCLADES system (voie virtuelle transmission mode [8]) and the ARPANET very distant HOST connection [18], is proposed here (see Fig. 10).
Suppose that the sequence number field in the internetwork header permits sequence numbers to range from 0 to n - 1.
We assume that the sender will not transmit more than w bytes without receiving an acknowledgment. The w bytes serve as the window (see Fig. 11). Clearly, w must be less than n. The rules for sender and receiver are as follows.
Sender: Let L be the sequence number associated with the left window edge.
1) The sender transmits bytes from segments whose text lies between L and up to L + w - 1.
2) On timeout (duration unspecified), the sender retransmits unacknowledged bytes.
3) On receipt of acknowledgment consisting of the receiver's current left window edge, the sender's left window edge is advanced over the acknowledged bytes (advancing the right window edge implicitly).
Receiver:
1) Arriving packets whose sequence numbers coincide with the receiver's current left window edge are acknowledged by sending to the source the next sequence number expected. This effectively acknowledges bytes in between. The left window edge is advanced to the next sequence number expected.
2) Packets arriving with a sequence number to the left of the window edge (or, in fact, outside of the window) are discarded, and the current left window edge is returned as acknowledgment.
3) Packets whose sequence numbers lie within the receiver's window but do not coincide with the receiver's left window edge are optionally kept or discarded, but are not acknowledged. This is the case when packets arrive out of order.
We make some observations on this strategy. First, all computations with sequence numbers and window edges must be made modulo n (e.g., byte 0 follows byte n - 1). Second, w must be less than n/2;³ otherwise a retransmission may appear to the receiver to be a new transmission in the case that the receiver has accepted a window's worth of incoming packets, but all acknowledgments have been lost. Third, the receiver can either save or discard arriving packets whose sequence numbers do not coincide with the receiver's left window edge. Thus, in the simplest implementation, the receiver need not buffer more than one packet per message stream if space is critical. Fourth, multiple packets can be acknowledged simultaneously. Fifth, the receiver is able to deliver messages to processes in their proper order as a natural result of the reassembly mechanism. Sixth, when duplicates are detected, the acknowledgment method used naturally works to resynchronize sender and receiver. Furthermore, if the receiver accepts packets whose sequence numbers lie within the current window but which are not coincident with the left window edge, an acknowledgment consisting of the current left window edge would act as a stimulus to cause retransmission of the unacknowledged bytes.
Finally, we mention an overlap problem which results from retransmission, packet splitting, and alternate routing of packets through different GATEWAYS. A 600-byte packet might pass through one GATEWAY and be broken into two 300-byte packets. On retransmission, the same packet might be broken into three 200-byte packets going through a different GATEWAY. Since each byte has a sequence number, there is no confusion at the receiving TCP. We leave for later the issue of initially synchronizing the sender and receiver left window edges and the window size.
² The ARPANET is one such example.
³ Actually n/2 is merely a convenient number to use; it is only required that a retransmission not appear to be a new transmission.
Fig. 10. The window concept.
Fig. 11. Conceptual TCB format (fields: source address; destination address; next packet sequence number; current buffer size; next write position; next read position; end read position; number of retransmissions and maximum retransmissions; timeout and flags; current acknowledgment and window).
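The three receiver rules and the modulo-n arithmetic can be sketched in Python as follows. This is a minimal sketch of the simplest variant mentioned in the observations, in which out-of-order packets inside the window are discarded rather than kept. The class name, method name, and return convention are assumptions made for illustration, and the small values n = 16 and w = 4 are chosen only to make the wraparound visible.

```python
def in_window(seq, left, w, n):
    """True if seq lies in [left, left + w), with arithmetic modulo n."""
    return (seq - left) % n < w


class Receiver:
    def __init__(self, n, w, left=0):
        if not 0 < 2 * w < n:
            raise ValueError("window must satisfy 0 < w < n/2")
        self.n, self.w, self.left = n, w, left

    def receive(self, seq, text):
        """Return (bytes accepted, acknowledgment or None)."""
        if seq == self.left:
            # Rule 1: accept, advance the left edge, acknowledge with
            # the next sequence number expected.
            self.left = (self.left + len(text)) % self.n
            return text, self.left
        if not in_window(seq, self.left, self.w, self.n):
            # Rule 2: outside the window (e.g., a duplicate); discard
            # and return the current left edge as acknowledgment.
            return b"", self.left
        # Rule 3: inside the window but out of order; this sketch
        # discards the packet and sends no acknowledgment.
        return b"", None


r = Receiver(n=16, w=4, left=14)
print(r.receive(14, b"abc"))   # (b'abc', 1): left edge wraps past n - 1
print(r.receive(14, b"abc"))   # (b'', 1): retransmission seen as duplicate
print(r.receive(3, b"x"))      # (b'', None): in window, not at left edge
```

The second call shows why the window must stay well below n: with w under n/2, the retransmitted sequence number 14 falls outside the new window and cannot be mistaken for fresh data.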
FLOW CONTROL
Every segment that arrives at the destination TCP is ultimately acknowledged by returning the sequence number of the next segment which must be passed to the process (it may not yet have arrived).
Earlier we described the use of a sequence number space and window to aid in duplicate detection. Acknowledgments are carried in the process header (see Fig. 6) and along with them there is provision for a "suggested window" which the receiver can use to control the flow of data from the sender. This is intended to be the main component of the process flow control mechanism. The receiver is free to vary the window size according to any algorithm it desires so long as the window size never exceeds half the sequence number space.
This flow control mechanism is exceedingly powerful and flexible and does not suffer from synchronization troubles that may be encountered by incremental buffer allocation schemes [9],[10]. However, it relies heavily on an effective retransmission strategy. The receiver can reduce the window even while packets are en route from the sender whose window is presently larger. The net effect of this reduction will be that the receiver may discard incoming packets (they may be outside the window) and reiterate the current window size along with a current window edge as acknowledgment. By the same token, the sender can, upon occasion, choose to send more than a window's worth of data on the possibility that the receiver will expand the window to accept it (of course, the sender must not send more than half the sequence number space at any time). Normally, we would expect the sender to abide by the window limitation. Expansion of the window by the receiver merely allows more data to be accepted.
For the receiving HOST with a small amount of buffer space, a strategy of discarding all packets whose sequence numbers do not coincide with the current left edge of the window is probably necessary, but it will incur the expense of extra delay and overhead for retransmission.
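The window test described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the size of the sequence number space (`SEQ_SPACE`) and the function names `in_window` and `receive` are hypothetical and do not come from the paper. The sketch shows a receiver that accepts a packet only if its sequence number falls inside the current window, computed modulo the sequence number space, and that always reports its current left edge and window size.

```python
SEQ_SPACE = 2 ** 16          # assumed size of the sequence number space (illustrative)
MAX_WINDOW = SEQ_SPACE // 2  # the text caps the window at half the sequence number space


def in_window(seq: int, left_edge: int, window: int) -> bool:
    """Return True if seq lies in [left_edge, left_edge + window) modulo SEQ_SPACE."""
    if not 0 <= window <= MAX_WINDOW:
        raise ValueError("window must not exceed half the sequence number space")
    return (seq - left_edge) % SEQ_SPACE < window


def receive(seq: int, left_edge: int, window: int) -> tuple[str, int, int]:
    """Accept or discard a packet; either way, report the current edge and window."""
    action = "accept" if in_window(seq, left_edge, window) else "discard"
    return action, left_edge, window
```

If the receiver shrinks its window from 10 to 4 while a packet with sequence number 8 is en route, `in_window(8, 0, 4)` is false and the packet is discarded, which is why the scheme leans on retransmission.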
TCP INPUT/OUTPUT HANDLING
The TCP has a component which handles input/output (I/O) to and from the network.⁴ When a packet has arrived, it validates the addresses and places the packet on a queue. A pool of buffers can be set up to handle arrivals, and if all available buffers are used up, succeeding arrivals can be discarded since unacknowledged packets will be retransmitted.
On output, a smaller amount of buffering is needed, since process buffers can hold the data to be transmitted. Perhaps double buffering will be adequate. We make no attempt to specify how the buffering should be done, except to require that it be able to service the network with as little overhead as possible. Packet sized buffers, one or more ring buffers, or any other combination are possible candidates.
When a packet arrives at the destination TCP, it is placed on a queue which the TCP services frequently. For example, the TCP could be interrupted when a queue placement occurs. The TCP then attempts to place the packet text into the proper place in the appropriate process receive buffer. If the packet terminates a segment, then it can be checksummed and acknowledged. Placement may fail for several reasons.
1) The destination process may not be prepared to receive from the stated source, or the destination port ID may not exist.
2) There may be insufficient buffer space for the text.
3) The beginning sequence number of the text may not coincide with the next sequence number to be delivered to the process (e.g., the packet has arrived out of order).
In the first case, the TCP should simply discard the packet (thus far, no provision has been made for error acknowledgments). In the second and third cases, the packet sequence number can be inspected to determine whether the packet text lies within the legitimate window for reception. If it does, the TCP may optionally keep the packet queued for later processing. If not, the TCP can discard the packet.
In either case the TCP can optionally acknowledge with the current left window edge.
It may happen that the process receive buffer is not present in the active memory of the HOST, but is stored on secondary storage. If this is the case, the TCP can prompt the scheduler to bring in the appropriate buffer and the packet can be queued for later processing.
If there are no more input buffers available to the TCP for temporary queueing of incoming packets, and if the TCP cannot quickly use the arriving data (e.g., a TCP to TCP message), then the packet is discarded. Assuming a sensibly functioning system, no other processes than the one for which the packet was intended should be affected by this discarding. If the delayed processing queue grows excessively long, any packets in it can be safely discarded since none of them have yet been acknowledged. Congestion at the TCP level is flexibly handled owing to the robust retransmission and duplicate detection strategy.
4 This component can serve to handle other protocols whose associated control programs are designated by internetwork destination address.
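The three placement failure cases and their handling can be summarized as a small decision function. The following Python sketch is hypothetical: the `ReceivePort` record, the `place_packet` function, the per-byte advance of the sequence number, and the value of `SEQ_SPACE` are assumptions made for illustration and are not specified in the text.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 16  # assumed size of the sequence number space


@dataclass
class ReceivePort:
    """Hypothetical receiver-side state for one port."""
    expected_source: Optional[str]  # None means the port accepts any source
    next_seq: int                   # left window edge: next sequence number to deliver
    window: int                     # current receive window
    free_bytes: int                 # space left in the process receive buffer


def place_packet(ports, dest_port, source, seq, length):
    """Return 'deliver', 'queue', or 'discard' for an arriving packet."""
    port = ports.get(dest_port)
    # Case 1: unknown port, or the process is not prepared for this source.
    if port is None or port.expected_source not in (None, source):
        return "discard"
    fits = length <= port.free_bytes
    in_order = seq == port.next_seq
    if fits and in_order:
        port.free_bytes -= length
        port.next_seq = (port.next_seq + length) % SEQ_SPACE
        return "deliver"
    # Cases 2 and 3: keep the packet only if it lies inside the legitimate window.
    if (seq - port.next_seq) % SEQ_SPACE < port.window:
        return "queue"
    return "discard"
```

Queueing in the last branch is optional in the text; a HOST with little buffer space could return "discard" there as well and rely on retransmission.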
TCP/PROCESS COMMUNICATION
In order to send a message, a process sets up its text in a buffer region in its own address space, inserts the requisite control information (described in the following list) in a transmit control block (TCB) and passes control to the TCP. The exact form of a TCB is not specified here, but it might take the form of a passed pointer, a pseudointerrupt, or various other forms. To receive a message in its address space, a process sets up a receive buffer, inserts the requisite control information in a receive control block (RCB) and again passes control to the TCP.
In some simple systems, the buffer space may in fact be provided by the TCP. For simplicity we assume that a ring buffer is used by each process, but other structures (e.g., buffer chaining) are not ruled out.
A possible format for the TCB is shown in Fig. 11. The TCB contains information necessary to allow the TCP to extract and send the process data. Some of the information might be implicitly known, but we are not concerned with that level of detail. The various fields in the TCB are described as follows.
1) Source Address: This is the full net/HOST/TCP/port address of the transmitter.
2) Destination Address: This is the full net/HOST/TCP/port of the receiver.
3) Next Packet Sequence Number: This is the sequence number to be used for the next packet the TCP will transmit from this port.
4) Current Buffer Size: This is the present size of the process transmit buffer.
5) Next Write Position: This is the address of the next position in the buffer at which the process can place new data for transmission.
6) Next Read Position: This is the address at which the TCP should begin reading to build the next segment for output.
7) End Read Position: This is the address at which the TCP should halt transmission. Initially 6) and 7) bound the message which the process wishes to transmit.
8) Number of Retransmissions/Maximum Retransmissions: These fields enable the TCP to keep track of the number of times it has retransmitted the data and could be omitted if the TCP is not to give up.
9) Timeout/Flags: The timeout field specifies the delay after which unacknowledged data should be retransmitted. The flag field is used for semaphores and other TCP/process synchronization, status reporting, etc.
10) Current Acknowledgment/Window: The current acknowledgment field identifies the first byte of data still unacknowledged by the destination TCP.
The read and write positions move circularly around the transmit buffer, with the write position always to the left (modulo the buffer size) of the read position.
The next packet sequence number should be constrained to be less than or equal to the sum of the current acknowledgment and the window fields. In any event, the next sequence number should not exceed the sum of the current acknowledgment and half of the maximum possible sequence number (to avoid confusing the receiver's duplicate detection algorithm). A possible buffer layout is shown in Fig. 12.
Fig. 12. Transmit buffer layout. (Labels in the figure: current message; partial next message; sent, not acked; next seq. no.; current ack; next read; end read; next write; window; transmit buffer size.)
The RCB is substantially the same, except that the end read field is replaced by a partial segment check-sum register which permits the receiving TCP to compute and remember partial check sums in the event that a segment arrives in several packets. When the final packet of the segment arrives, the TCP can verify the check sum and if successful, acknowledge the segment.
CONNECTIONS AND ASSOCIATIONS
Much of the thinking about process-to-process communication in packet switched networks has been influenced by the ubiquitous telephone system. The HOST-HOST protocol for the ARPANET deals explicitly with the opening and closing of simplex connections between processes [9],[10]. Evidence has been presented that message-based "connection-free" protocols can be constructed [12], and this leads us to carefully examine the notion of a connection.
The term connection has a wide variety of meanings. It can refer to a physical or logical path between two entities, it can refer to the flow over the path, it can inferentially refer to an action associated with the setting up of a path, or it can refer to an association between two or more entities, with or without regard to any path between them. In this paper, we do not explicitly reject the term connection, since it is in such widespread use, and does connote a meaningful relation, but consider it exclusively in the sense of an association between two or more entities without regard to a path. To be more precise about our intent, we shall define the relationship between two or more ports that are in communication, or are prepared to communicate, to be an association. Ports that are associated with each other are called associates.
It is clear that for any communication to take place between two processes, one must be able to address the other. The two important cases here are that the destination port may have a global and unchanging address or that it may be globally unique but dynamically reassigned. While in either case the sender may have to learn the destination address, given the destination name, only in the second instance is there a requirement for learning the address from the destination (or its representative) each time an association is desired. Only after the source has learned how to address the destination can an association be said to have occurred. But this is not yet sufficient. If
ordering of delivered messages is also desired, both TCP's must maintain sufficient information to allow proper sequencing. When this information is also present at both ends, then an association is said to have occurred.
Note that we have not said anything about a path, nor anything which implies that either end be aware of the condition of the other. Only when both partners are prepared to communicate with each other has an association occurred, and it is possible that neither partner may be able to verify that an association exists until some data flows between them.
CONNECTION-FREE PROTOCOLS WITH ASSOCIATIONS
In the ARPANET, the interface message processors (IMP's) do not have to open and close connections from source to destination. The reason for this is that connections are, in effect, always open, since the address of every source and destination is never⁵ reassigned. When the name and the place are static and unchanging, it is only necessary to label a packet with source and destination to transmit it through the network. In our parlance, every source and destination forms an association.
5 Unless the IMP is physically moved to another site, or the HOST is connected to a different IMP.
In the case of processes, however, we find that port addresses are continually being used and reused. Some ever-present processes could be assigned fixed addresses which do not change (e.g., the logger process). If we supposed, however, that every TCP had an infinite supply of port addresses so that no old address would ever be reused, then any dynamically created port would be assigned the next unused address. In such an environment, there could never be any confusion by source and destination TCP as to the intended recipient or implied source of each message, and all ports would be associates.
Unfortunately, TCP's (or more properly, operating systems) tend not to have an infinite supply of internal port addresses. These internal addresses are reassigned after the demise of each port. Walden [12] suggests that a set of unique uniform external port addresses could be supplied by a central registry. A newly created port could apply to the central registry for an address which the central registry would guarantee to be unused by any HOST system in the network. Each TCP could maintain tables matching external names with internal ones, and use the external ones for communication with other processes. This idea violates the premise that interprocess communication should not require centralized control. One would have to extend the central registry service to include all HOST's in all the interconnected networks to apply this idea to our situation, and we therefore do not attempt to adopt it.
Let us consider the situation from the standpoint of the TCP. In order to send or receive data for a given port, the TCP needs to set up a TCB and RCB and initialize the window size and left window edge for both. On the receive side, this task might even be delayed until the first packet destined for a given port arrives. By convention, the first packet should be marked so that the receiver will synchronize to the received sequence number.
On the send side, the first request to transmit could cause a TCB to be set up with some initial sequence number (say, zero) and an assumed window size. The receiving TCP can reject the packet if it wishes and notify the sending TCP of the correct window size via the acknowledgment mechanism, but only if either
1) we insist that the first packet be a complete segment;
2) an acknowledgment can be sent for the first packet (even if not a segment, as long as the acknowledgment specifies the next sequence number such that the source also understands that no bytes have been accepted).
It is apparent, therefore, that the synchronizing of window size and left window edge can be accomplished without what would ordinarily be called a connection setup.
The first packet referencing a newly created RCB sent from one associate to another can be marked with a bit which requests that the receiver synchronize his left window edge with the sequence number of the arriving packet (see SYN bit in Fig. 8). The TCP can examine the source and destination port addresses in the packet and in the RCB to decide whether to accept or ignore the request.
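As a rough illustration of how the control blocks support synchronization without a connection setup, the following Python sketch models only the sequencing fields of a TCB and an RCB. The class layout, the default window sizes, the per-byte sequence numbering, and `SEQ_SPACE` are assumptions made for this example; the paper deliberately leaves the exact form of both blocks open. The sender limits itself to the smaller of the advertised window and half the sequence number space, and the receiver adopts the sequence number of the first SYN-marked packet as its left window edge.

```python
from dataclasses import dataclass

SEQ_SPACE = 2 ** 16  # assumed size of the sequence number space


@dataclass
class TCB:
    """Sequencing fields of a hypothetical transmit control block."""
    next_seq: int = 0     # next packet sequence number (initial value, say, zero)
    current_ack: int = 0  # first byte still unacknowledged by the destination
    window: int = 256     # assumed initial window size

    def sendable(self) -> int:
        """Bytes that may be sent without passing ack + window or ack + half the space."""
        limit = min(self.window, SEQ_SPACE // 2)
        in_flight = (self.next_seq - self.current_ack) % SEQ_SPACE
        return max(0, limit - in_flight)

    def on_ack(self, ack: int, window: int) -> None:
        """Record the receiver's acknowledgment and its suggested window."""
        self.current_ack = ack % SEQ_SPACE
        self.window = window


@dataclass
class RCB:
    """Sequencing fields of a hypothetical receive control block."""
    left_edge: int = 0
    window: int = 128
    synchronized: bool = False

    def on_packet(self, seq: int, syn: bool, length: int):
        """Synchronize on the first SYN packet, then advance on in-order data."""
        if syn and not self.synchronized:
            self.left_edge = seq
            self.synchronized = True
        if self.synchronized and seq == self.left_edge:
            self.left_edge = (self.left_edge + length) % SEQ_SPACE
        return self.left_edge, self.window  # carried back as the acknowledgment
```

A sender that assumed a window of 256 learns the receiver's actual window of 128 from the first acknowledgment and adjusts `sendable()` accordingly.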
Provision should be made for a destination process to specify that it is willing to LISTEN to a specific port or "any" port. This last idea permits processes such as the logger process to accept data arriving from unspecified sources. This is purely a HOST matter, however.
The initial packet may contain data which can be stored or discarded by the destination, depending on the availability of destination buffer space at the time. In the other direction, acknowledgment is returned for receipt of data which also specifies the receiver's window size.
If the receiving TCP should want to reject the synchronization request, it merely transmits an acknowledgment carrying a release (REL) bit (see Fig. 8) indicating that the destination port address is unknown or inaccessible. The sending HOST waits for the acknowledgment (after accepting or rejecting the synchronization request) before sending the next message or segment. This rejection is quite different from a negative data acknowledgment. We do not have explicit negative acknowledgments. If no acknowledgment is returned, the sending HOST may retransmit without introducing confusion if, for example,
the left window edge is not changed on the retransmission.
Because messages may be broken up into many packets for transmission or during transmission, it will be necessary to ignore the REL flag except in the case that the EM flag is also set. This could be accomplished either by the TCP or by the GATEWAY which could reset the flag on all but the packet containing the set EM flag (see Fig. 9).
At the end of an association, the TCP sends a packet with ES, EM, and REL flags set. The packet sequence number scheme will alert the receiving TCP if there are still outstanding packets in transit which have not yet arrived, so a premature dissociation cannot occur.
To assure that both TCP's are aware that the association has ended, we insist that the receiving TCP respond to the REL by sending a REL acknowledgment of its own.
Suppose now that a process sends a single message to an associate including an REL along with the data. Assuming an RCB has been prepared for the receiving TCP to accept the data, the TCP will accumulate the incoming packets until the one marked ES, EM, REL arrives, at which point a REL is returned to the sender. The association is thereby terminated and the appropriate TCB and RCB are destroyed. If the first packet of a message contains a SYN request bit and the last packet contains ES, EM, and REL bits, then data will flow "one message at a time." This mode is very similar to the scheme described by Walden [12], since each succeeding message can only be accepted at the receiver after a new LISTEN (like Walden's RECEIVE) command is issued by the receiving process to its serving TCP. Note that only if the acknowledgment is received by the sender can the association be terminated properly. It has been pointed out⁶ that the receiver may erroneously accept duplicate transmissions if the sender does not receive the acknowledgment. This may happen if the sender transmits a duplicate message with the SYN and REL bits set and the destination has already destroyed any record of the previous transmission. One way of preventing this problem is to destroy the record of the association at the destination only after some known and suitably chosen timeout. However, this implies that a new association with the same source and destination port identifiers could not be established until this timeout had expired. This problem can occur even with sequences of messages whose SYN and REL bits are separated into different internetwork packets. We recognize that this problem must be solved, but do not go into further detail here.
6 S. Crocker of ARPA/IPT.
Alternatively, both processes can send one message, causing the respective TCP's to allocate RCB/TCB pairs at both ends which rendezvous with the exchanged data and then disappear. If the overhead of creating and destroying RCB's and TCB's is small, such a protocol might be adequate for most low-bandwidth uses. This idea might also form the basis for a relatively secure transmission system. If the communicating processes agree to change their external port addresses in some way known only to each other (i.e., pseudorandom), then each message will appear to the outside world as if it is part of a different association message stream. Even if the data is intercepted by a third party, he will have no way of knowing that the data should in fact be considered part of a sequence of messages.
We have described the way in which processes develop associations with each other, thereby becoming associates for possible exchange of data. These associations need not involve the transmission of data prior to their formation and indeed two associates need not be able to determine that they are associates until they attempt to communicate.
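The timeout remedy mentioned above for duplicate SYN/REL messages is not worked out in the text. The following Python sketch shows one way it might look, purely as an assumption: the `AssociationTable` class, its method names, and the returned strings are hypothetical. The destination keeps a record of each released association for a fixed hold time and refuses a new SYN for the same port pair until that time has passed.

```python
class AssociationTable:
    """Hypothetical destination-side record of associations, keyed by port pair."""

    def __init__(self, hold_time: float):
        self.hold_time = hold_time  # how long a released association is remembered
        self.active = set()
        self.closed_at = {}

    def on_syn(self, key, now: float) -> str:
        """Accept a synchronization request unless the pair was released too recently."""
        closed = self.closed_at.get(key)
        if closed is not None:
            if now - closed < self.hold_time:
                return "reject"  # may be a duplicate of a message already delivered
            del self.closed_at[key]
        self.active.add(key)
        return "accept"

    def on_rel(self, key, now: float) -> str:
        """Release an association and remember when; always answer with a REL."""
        if key in self.active:
            self.active.remove(key)
            self.closed_at[key] = now
        return "rel-ack"
```

The cost noted in the text is visible here: a legitimate new association between the same two ports is also refused until the hold time expires.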
CONCLUSIONS
We have discussed some fundamental issues related to the interconnection of packet switching networks. In particular, we have described a simple but very powerful and flexible protocol which provides for variation in individual network packet sizes, transmission failures, sequencing, flow control, and the creation and destruction of process-to-process associations. We have considered some of the implementation issues that arise and found that the proposed protocol is implementable by HOST's of widely varying capacity.
The next important step is to produce a detailed specification of the protocol so that some initial experiments with it can be performed. These experiments are needed to determine some of the operational parameters (e.g., how often and how far out of order do packets actually arrive; what sort of delay is there between segment acknowledgments; what should retransmission timeouts be?) of the proposed protocol.
ACKNOWLEDGMENT
The authors wish to thank a number of colleagues for helpful comments during early discussions of international network protocols, especially R. Metcalfe, R. Scantlebury, D. Walden, and H. Zimmerman; D. Davies and L. Pouzin who constructively commented on the fragmentation and accounting issues; and S. Crocker who commented on the creation and destruction of associations.
REFERENCES
[1] L. Roberts and B. Wessler, "Computer network development to achieve resource sharing," in 1970 Spring Joint Computer Conf., AFIPS Conf. Proc., vol. 36. Montvale, N. J.: AFIPS Press, 1970, pp. 543-549.
[2] L. Pouzin, "Presentation and major design aspects of the CYCLADES computer network," in Proc. 3rd Data Communications Symp., 1973.
[3] F. R. E. Dell, "Features of a proposed synchronous data network," in Proc. 2nd Symp. Problems in the Optimization of Data Communications Systems, 1971, pp. 50-57.
[4] R. A. Scantlebury and P. T. Wilkinson, "The design of a switching system to allow remote access to computer services by other computers and terminal devices," in Proc. 2nd Symp. Problems in the Optimization of Data Communications Systems, 1971, pp. 160-167.
[5] D. L. A. Barber, "The European computer network project," in Computer Communications: Impacts and Implications, S. Winkler, Ed. Washington, D. C., 1972, pp. 192-200.
[6] R. Despres, "A packet switching network with graceful saturated operation," in Computer Communications: Impacts and Implications, S. Winkler, Ed. Washington, D. C., 1972, pp. 345-351.
[7] R. E. Kahn and W. R. Crowther, "Flow control in a resource-sharing computer network," IEEE Trans. Commun., vol. COM-20, pp. 539-546, June 1972.
[8] J. F. Chambon, M. Elie, J. Le Bihan, G. LeLann, and H. Zimmerman, "Functional specification of transmission station in the CYCLADES network. ST-ST protocol" (in French), I.R.I.A. Tech. Rep. SCH502.3, May 1973.
[9] S. Carr, S. Crocker, and V. Cerf, "HOST-HOST communication protocol in the ARPA network," in Spring Joint Computer Conf., AFIPS Conf. Proc., vol. 36. Montvale, N. J.: AFIPS Press, 1970, pp. 589-597.
[10] A. McKenzie, "HOST/HOST protocol for the ARPA network," in Current Network Protocols, Network Information Cen., Menlo Park, Calif., NIC 8246, Jan. 1972.
[11] L. Pouzin, "Address format in Mitranet," NIC 14497, INWG 20, Jan. 1973.
[12] D. Walden, "A system for interprocess communication in a resource sharing computer network," Commun. Ass. Comput. Mach., vol. 15, pp. 221-230, Apr. 1972.
[13] B. Lampson, "A scheduling philosophy for multiprocessing systems," Commun. Ass. Comput. Mach., vol. 11, pp. 347-360, May 1968.
[14] F. E. Heart, R. E. Kahn, S. Ornstein, W. Crowther, and D. Walden, "The interface message processor for the ARPA computer network," in Proc. Spring Joint Computer Conf., AFIPS Conf. Proc., vol. 36. Montvale, N. J.: AFIPS Press, 1970, pp. 551-567.
[15] N. G. Anslow and J. Hanscoff, "Implementation of international data exchange networks," in Computer Communications: Impacts and Implications, S. Winkler, Ed. Washington, D. C., 1972, pp. 181-184.
[16] A. McKenzie, "HOST/HOST protocol design considerations," INWG Note 16, NIC 13879, Jan. 1973.
[17] R. E. Kahn, "Resource-sharing computer communication networks," Proc. IEEE, vol. 60, pp. 1397-1407, Nov. 1972.
[18] Bolt, Beranek, and Newman, "Specification for the interconnection of a host and an IMP," Bolt Beranek and Newman, Inc., Cambridge, Mass., BBN Rep. 1822 (revised), Apr. 1973.

Vinton G. Cerf was born in New Haven, Conn., in 1943. He did undergraduate work in mathematics at Stanford University, Stanford, Calif., and received the Ph.D. degree in computer science from the University of California at Los Angeles, Los Angeles, Calif., in 1972.
He was with IBM in Los Angeles from 1965 through 1967 and consulted and/or worked part time at UCLA from 1967 through 1972. Currently he is Assistant Professor of Computer Science and Electrical Engineering at Stanford University, and consultant to Cabledata Associates. Most of his current research is supported by the Defense Advanced Research Projects Agency and by the National Science Foundation on the technology and economics of computer networking. He is Chairman of IFIP TC6.1, an international network working group which is studying the problem of packet network interconnection.

Robert E. Kahn (M'65) was born in Brooklyn, N. Y., on December 23, 1938. He received the B.E.E. degree from the City College of New York, New York, in 1960, and the M.A. and Ph.D. degrees from Princeton University, Princeton, N. J., in 1962 and 1964, respectively.
From 1960 to 1962 he was a Member of the Technical Staff of Bell Telephone Laboratories, Murray Hill, N. J., engaged in traffic and communication studies. From 1964 to 1966 he was a Ford Postdoctoral Fellow and an Assistant Professor of Electrical Engineering at the Massachusetts Institute of Technology, Cambridge, where he worked on communications and information theory. From 1966 to 1972 he was a Senior Scientist at Bolt Beranek and Newman, Inc., Cambridge, Mass., where he worked on computer communications network design and techniques for distributed computation. Since 1972 he has been with the Advanced Research Projects Agency, Department of Defense, Arlington, Va.
Dr. Kahn is a member of Tau Beta Pi, Sigma Xi, Eta Kappa Nu, the Institute of Mathematical Statistics, and the Mathematical Association of America. He was selected to serve as a National Lecturer for the Association for Computing Machinery in 1972.
*
On Distributed Communications Networks PAUL BARAN,
SENIOR MEMBER, IEEE
Summary-This paper¹ briefly reviews the distributed communication network concept in which each station is connected to all adjacent stations rather than to a few switching points, as in a centralized system. The payoff for a distributed configuration in terms of survivability in the cases of enemy attack directed against nodes, links or combinations of nodes and links is demonstrated. A comparison is made between diversity of assignment and perfect switching in distributed networks, and the feasibility of using low-cost unreliable communication links, even links so unreliable as to be unusable in present type networks, to form highly reliable networks is discussed. The requirements for a future all-digital data distributed network which provides common user service for a wide range of users having different requirements are considered. The use of a standard format message block permits building relatively simple switching mechanisms using an adaptive store-and-forward routing policy to handle all forms of digital data including digital voice. This network rapidly responds to changes in the network status. Recent history of measured network traffic is used to modify path selection. Simulation results are shown to indicate that highly efficient routing can be performed by local control without the necessity for any central, and therefore vulnerable, control point.
Fig. 1-(a) Centralized. (b) Decentralized. (c) Distributed networks.
INTRODUCTION
Let us consider the synthesis of a communication network which will allow several hundred major communications stations to talk with one another after an enemy attack. As a criterion of survivability we elect to use the percentage of stations both surviving the physical attack and remaining in electrical connection with the largest single group of surviving stations. This criterion is chosen as a conservative measure of the ability of the surviving stations to operate together as a coherent entity after the attack. This means that small groups of stations isolated from the single largest group are considered to be ineffective.
Although one can draw a wide variety of networks, they all factor into two components: centralized (or star) and distributed (or grid or mesh). (See types (a) and (c), respectively, in Fig. 1.) The centralized network is obviously vulnerable as destruction of a single central node destroys communication between the end stations. In practice, a mixture of star and mesh components is used to form communications networks. For example, type (b) in Fig. 1 shows the hierarchical structure of a set of stars connected in the form of a larger star with an additional link forming a loop. Such a network is sometimes called a "decentralized" network, because complete reliance upon a single point is not always required.
EXAMINATION OF A DISTRIBUTED NETWORK
Since destruction of a small number of nodes in a decentralized network can destroy communications, the properties, problems, and hopes of building "distributed" communications networks are of paramount interest.
The term "redundancy level" is used as a measure of connectivity, as defined in Fig. 2. A minimum span network, one formed with the smallest number of links possible, is chosen as a reference point and is called "a network of redundancy level one." If two times as many links are used in a gridded network as in a minimum span network, the network is said to have a redundancy level of two. Fig. 2 defines connectivity of levels 1, 1½, 2, 3, 4, 6 and 8. Redundancy level is equivalent to link-to-node ratio in an infinite size array of stations. Obviously, at levels above three there are alternate methods of constructing the network. However, it was found that there is little difference regardless of which method is used. Such an alternate method is shown for levels three and four, labelled R'. This specific alternate mode is also used for levels six and eight.²
Each node and link in the array of Fig. 2 has the capacity and the switching flexibility to allow transmission between any ith station and any jth station, provided a path can be drawn from the ith to the jth station.
Starting with a network composed of an array of stations connected as in Fig. 3, an assigned percentage of nodes and links is destroyed. If, after this operation, it is still possible to draw a line to connect the ith station to the jth station, the ith and jth stations are said to be connected.
Manuscript received October 9, 1963. This paper was presented at the First Congress of the Information Systems Sciences sponsored by the MITRE Corporation, Bedford, Mass., and the USAF Electronic Systems Division, Hot Springs, Va., November, 1962. The author is with The RAND Corporation, Santa Monica, Calif.
1 Any views expressed in this paper are those of the author. They should not be interpreted as reflecting the views of The RAND Corporation or the official opinion or policy of any of its governmental or private research sponsors.
2 See L. J. Craig and I. S. Reed, "Overlapping Tessellated Communications Networks," The RAND Corporation, Santa Monica, Calif., paper P-2359; July 5, 1961.
Reprinted from IEEE Transactions on Communications Systems, March 1964.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
Fig. 2-Definition of redundancy level.
Fig. 3-An array of stations.
Fig. 4-Perfect switching in a distributed network: sensitivity to node destruction, 100 per cent of links operative.
Node Destruction Fig . 4 indicates network performance as a funct ion of the probability of destruction for each separate node. If the expected " noise" was destruction caused by conventional hardware failure, the failures would be randomly distributed through the network. But if the disturbance were caused by enemy attack, the possible" worst cases" must be considered. To bisect a 32-link network requires direct ion of 288 weapons each with a probability of kill, P» = 0.5, or 160
with a Pk = 0.7, to produce over an 0.9 probability of successfully bisecting the network. If hidden alternative command is allowed, then the largest single group would still have an expected value of almost 50 per cent of the initial stations surviving intact. If this raid misjudges complete availability of weapons, complete knowledge of all links in the cross section, or the effects of the weapons against each and every link, the raid fails. The high risk of such raids against highly parallel structures causes examination of alternative attack policies. Consider the following uniform raid example. Assume that 2000 weapons are deployed against a lOOO-station network. The stations are so spaced that destruct ion of two stations with a single weapon is unlikely. Divide the 2000 weapons into two equal lOOO-weapon salvos. Assume any probability of destruction of a single node from a single weapon less than 1.0; for example, 0.5. Each weapon on the first salvo has a 0.5 probability of destroying its target. But, each weapon of the second salvo has only a 0.25 probability, since one half the targets have already been destroyed. Thus, the uniform attack is felt to represent a worst-case configuration. Such worst-case attacks have been directed against an 18 X 18-array network model of 324 nodes with varying probability of kill and redundancy level, with results shown in Fig. 4. The probability of kill was varied from zero to unity along the abscissa, while the ordinate marks survivability . The criterion of survivability used is the percentage of stations not physically destroye d and remaining in communication with the largest single group of surviving stations. The curves of Fig. 4 demonstrate survivability as a function of attack level for networks of
Fig. 5-Perfect switching in a distributed network: sensitivity to link destruction, 100 per cent of nodes operative.
varying degrees of redundancy. The line labeled "best possible line" marks the upper bound of loss due to the physical failure component alone. For example, if a network underwent an attack of 0.5 probability destruction of each of its nodes, then only 50 per cent of its nodes would be expected to survive, regardless of how perfect its communications. We are primarily interested in the additional system degradation caused by failure of communications. Two key points are to be noticed in the curves of Fig. 4. First, extremely survivable networks can be built using a moderately low redundancy of connectivity level. Redundancy levels on the order of only three permit the withstanding of extremely heavy level attacks with negligible additional loss to communications. Secondly, the survivability curves have sharp break points. A network of this type will withstand an increasing attack level until a certain point is reached, beyond which the network rapidly deteriorates. Thus, the optimum degree of redundancy can be chosen as a function of the expected level of attack. Further redundancy gains little. The redundancy level required to survive even very heavy attacks is not great; it is on the order of only three or four times that of the minimum span network.
Link Destruction
In the previous example we have examined network performance as a function of the destruction of the nodes (which are better targets than links). We shall now reexamine the same network, but using unreliable links. In particular, we want to know how unreliable the links may be without further degrading the performance of the network.
Fig. 6-Perfect switching in a distributed network: sensitivity to link destruction after 40 per cent nodes are destroyed. (Abscissa: single link probability of destruction.)
Fig. 7-Probability density distribution of largest fraction of stations in communication: perfect switching, R = 3, 100 cases, 80 per cent node survival, 65 per cent link survival. (Abscissa: largest fraction of stations in communication with one another.)
Fig. 5 shows the results for the case of perfect nodes; only the links fail. There is little system degradation caused even using extremely unreliable links, on the order of 50 per cent down time, assuming all nodes are working.
Combination Link and Node Destruction
The worst case is the composite effect of failures of both the links and the nodes. Fig. 6 shows the effect of link failure upon a network having 40 per cent of its nodes destroyed. It appears that what would today be regarded as an unreliable link can be used in a distributed network almost as effectively as perfectly reliable links. Fig. 7 examines the result of 100 trial cases in order to estimate the probability density distribution of system performance for a mixture of node and link failures. This is the distribution of cases for 20 per cent nodal damage and 35 per cent link damage.
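The survivability criterion used in Figs. 4 through 7 can be illustrated with a small Monte Carlo sketch. This is only an illustration under stated assumptions, not the simulation program used for the paper: it assumes a plain four-neighbor square grid (which does not reproduce the redundancy levels R of the figures), independent node and link failures, and the hypothetical function names `largest_group_fraction` and `mean_survivability`.

```python
import random

def largest_group_fraction(side, node_survival, link_survival, rng):
    """One trial on a side x side grid (assumed four-neighbor layout).

    Returns the fraction of all original stations that both survive and
    belong to the largest connected group of surviving stations.
    """
    alive = {(r, c) for r in range(side) for c in range(side)
             if rng.random() < node_survival}
    # Decide the fate of each grid link once and record it in both directions.
    links = {node: [] for node in alive}
    for (r, c) in alive:
        for nbr in ((r + 1, c), (r, c + 1)):
            if nbr in alive and rng.random() < link_survival:
                links[(r, c)].append(nbr)
                links[nbr].append((r, c))
    best, seen = 0, set()
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over one connected group
            node = stack.pop()
            size += 1
            for nbr in links[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best / (side * side)

def mean_survivability(side=18, node_survival=0.8, link_survival=0.65,
                       cases=100, seed=1):
    """Average the criterion over a number of independent trial cases."""
    rng = random.Random(seed)
    total = sum(largest_group_fraction(side, node_survival, link_survival, rng)
                for _ in range(cases))
    return total / cases
```

The default arguments mirror the 18 X 18 array, the 100 cases, and the 80 per cent node and 65 per cent link survival quoted for Fig. 7; because the grid here has lower redundancy than R = 3, the numbers it produces should not be expected to match the published curves.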
DIVERSITY OF ASSIGNMENT
Comparison with Present Systems
There is another and more common technique for using redundancy than the method described above, in which each station is assumed to have perfect switching ability. This alternative approach is called "diversity of assignment." In diversity of assignment, switching is not required. Instead, a number of independent paths are selected between each pair of stations in a network which requires reliable communications. However, there are marked differences in performance between distributed switching and redundancy of assignment as revealed by the following Monte Carlo simulation.
Simulation
In the matrix of N separate stations, each ith station is connected to every jth station by three shortest but totally separate independent paths (i = 1, 2, 3, ..., N; j = 1, 2, 3, ..., N; i ≠ j). A raid is laid against the network. Each of the preassigned separate paths from the ith station to the jth station is examined. If one or more of the preassigned paths survive, communication is said to exist between the ith and the jth station. The criterion of survivability used is the mean number of stations connected to each station, averaged over all stations. Unlike the distributed perfect switching case, Fig. 8 shows that there is a marked loss in communications capability with even slightly unreliable nodes or links. The difference can be visualized by remembering that fully flexible switching permits the communicator the privilege of ex post facto decision of paths. Fig. 8 emphasizes a key difference between some present-day networks and the fully flexible distributed network we are discussing.
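Why preassigned paths degrade so quickly can be seen from a back-of-the-envelope calculation. The sketch below is not the Monte Carlo simulation described above; it assumes, for illustration only, that every element on a path survives independently with the same probability and that all three preassigned paths contain the same number of tandem elements. The function names are hypothetical.

```python
def path_survival(element_survival, elements_in_path):
    """Probability that every tandem element on one preassigned path survives."""
    return element_survival ** elements_in_path

def pair_connected(element_survival, elements_in_path, paths=3):
    """Probability that at least one of the preassigned paths survives."""
    all_fail = (1.0 - path_survival(element_survival, elements_in_path)) ** paths
    return 1.0 - all_fail
```

Under these assumptions, with 10 tandem elements per path and each element surviving with probability 0.9, `pair_connected(0.9, 10)` is about 0.72, so even modest element losses break a substantial share of station pairs when the paths are fixed in advance.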
Present conventional switching systems try only a small subset of the potential paths that can be drawn on a gridded network. The greater the percentage of potential paths tested, the closer one approaches the performance of perfect switching. Thus, perfect switching provides an upper bound of expected system performance for a gridded network; the diversity of assignment case provides a lower bound. Between these two limits lie systems composed of a mixture of switched routes and diversity of assignment. Diversity of assignment is useful for short paths, eliminating the need for switching, but requires survivability and reliability for each tandem element in long-haul circuits passing through many nodes. As every component in at least one out of a small number of possible paths must be simultaneously operative, high reliability margins and full standby equipment are usual.
ON FUTURE SYSTEMS
We will soon be living in an era in which we cannot guarantee survivability of any single point. However, we can still design systems in which system destruction requires the enemy to pay the price of destroying n of n stations. If n is made sufficiently large, it can be shown that highly survivable system structures can be built, even in the thermonuclear era. In order to build such networks and systems we will have to use a large number of elements. We are interested in knowing how inexpensive these elements may be and still permit the system to operate reliably. There is a strong relationship between element cost and element reliability. To design a system that must anticipate a worst-case destruction of both enemy attack and normal system failures, one can combine the failures expected by enemy attack together with the failures caused by normal reliability problems, provided the enemy does not know which elements are inoperative. Our future systems design problem is that of building at lowest cost very reliable systems out of the described set of unreliable elements. In choosing the communications links of the future, digital links appear increasingly attractive by permitting low-cost switching and low-cost links. For example, if "perfect switching" is used, digital links are mandatory to permit tandem connection of many separately connected links without cumulative errors reaching an irreducible magnitude. Further, the signaling measures to implement highly flexible switching doctrines always require digits.
Future Low-Cost All-Digital Communications Links
Fig. 8-Diversity of assignment vs perfect switching in a distributed network. (Abscissa: single node probability of kill.)
When one designs an entire system optimized for digits and high redundancy, certain new communications link techniques appear more attractive than those common today. A key attribute of the new media is that it permits cheap formation of new routes, yet allows transmission on the order of a million or so bits per second, high enough to be economic yet low enough to be inexpensively
processed with existing digital computer techniques at the relay station nodes. Reliability and raw error rates are secondary. The network must be built with the expectation of heavy damage anyway. Powerful error removal methods exist. Some of the communication construction methods that look attractive for the near future include pulse regenerative repeater line, minimum-cost or "mini-cost" microwave, TV broadcast station digital transmission and satellites.
Pulse Regenerative Repeater Line: S. F. B. Morse's regenerative repeater invention for amplifying weak telegraphic signals has recently been resurrected and transistorized. Morse's electrical relay permits amplification of weak binary telegraphic signals above a fixed threshold. Experiments by various organizations (primarily the Bell Telephone Laboratories) have shown that digital data rates on the order of 1.5 million bits per second can be transmitted over ordinary telephone line at repeater spacings on the order of 6000 feet for 22-gage pulp paper insulated copper pairs. At present, more than 20 tandemly connected amplifiers have been used without retiming synchronization problems. There appears to be no fundamental reason why either lines of lower loss, with corresponding further repeater spacing, or more powerful resynchronization methods cannot be used to extend link distances to in excess of 200 miles. Such distances would be desired for a possible national distributed network. Power to energize the miniature transistor amplifier is transmitted over the copper circuit itself.
"Mini-Cost" Microwave: While the price of microwave equipment has been declining, there are still untapped major savings. In an analog signal network we require a high degree of reliability and very low distortion for each tandem repeater. However, using digital modulation together with perfect switching we minimize these two expensive considerations from our planning. We would envision the use of low-power, mass-produced microwave receiver/transmitter units mounted on low-cost, short, guyed towers. Relay station spacing would probably be on the order of 20 miles. Further economies can be obtained by only a minimal use of standby equipment and reduction of fading margins. The ability to use alternate paths permits consideration of frequencies normally troubled by rain attenuation problems, reducing the spectrum availability problem. Preliminary indications suggest that this approach appears to be the cheapest way of building large networks of the type to be described.
TV Stations: With proper siting of receiving antennas, broadcast television stations might be used to form additional high data rate links in emergencies.
Satellites: The problem of building a reliable network using satellites is somewhat similar to that of building a communications network with unreliable links. When a satellite is overhead, the link is operative. When a satellite is not overhead, the link is out of service. Thus, such links are highly compatible with the type of system to be described.
Variable Data Rate Links
In a conventional circuit-switched system each of the tandem links requires matched transmission bandwidths. In order to make fullest use of a digital link, the post-error-removal data rate would have to vary, as it is a function of noise level. The problem then is to build a communication network made up of links of variable data rate to use the communication resource most efficiently.
Variable Data Rate Users
We can view both the links and the entry point nodes of a multiple-user all-digital communications system as elements operating at an ever-changing data rate. From instant to instant the demand for transmission will vary. We would like to take advantage of the average demand over all users instead of having to allocate a full peak demand channel to each. Bits can become a common denominator of loading and we would like to efficiently handle both those users who make highly intermittent bit demands on the network and those who make long-term continuous, low-bit demands.
Common User
In communications, as in transportation, it is most economic for many users to share a common resource rather than each to build his own system, particularly when supplying intermittent or occasional service. This intermittency of service is highly characteristic of digital communication requirements. Therefore, we would like to consider one day the interconnection of many all-digital links to provide a resource optimized for the handling of data for many potential intermittent users: a new common-user system.
Fig. 9 demonstrates the basic notion. A wide mixture of different digital transmission links is combined to form a common resource divided among many potential users. But each of these communications links could possibly have a different data rate. How can links of different data rates be interconnected?
Fig. 9-All-digital network composed of mixture of links.
A
A future system, incorporating the features outlined in the preceding section, has been modeled and simulated. The key attribute of the system is in its switching scheme. But prior to considering the way in which the system would work, some thought must be given to message format standardization.
ized message block simplifies construction of very high speed switches. Every user connected to the network can feed data at any rate up to a maximum value. The user's traffic is stored until a full data block is received by the first station. This block is rubber stamped with a heading and return address, plus additional housekeeping information. Then it is transmitted into the network.
Standard Message Block
Switching
Present common carrier communications networks, used for digital transmission, use links and concepts originally designed for another purpose: voice. These systems are built around a frequency division multiplexing link-to-link interface standard. The standard between links is that of data rate. Time division multiplexing appears so natural to data transmission that we might wish to consider an alternative approach, a standardized message block as a network interface standard. While a standardized message block is common in many computer-communications applications, no serious attempt has ever been made to use it as a universal standard. A universally standardized message block would be composed of perhaps 1024 bits. Most of the message block would be reserved for whatever type data is to be transmitted, while the remainder would contain housekeeping information such as error detection and routing data, as in Fig. 10.
In order to build a network with the survivability properties shown in Fig. 4, we must use a switching scheme able to find any possible path that might exist after heavy damage. The routing doctrine should find the shortest possible path and avoid self-oscillatory or "ring-around-the-rosey" switching. We shall explore the possibilities of building a "real-time" data transmission system using store-and-forward techniques. The high data rates of the future carry us into a hybrid zone between store-and-forward and circuit switching. The system to be described is clearly store and forward if one examines the operations at each node singularly. But, the network user who has called up a "virtual connection" to an end station and has transmitted messages across the United States in a fraction of a second might also view the system as a black box providing an apparent circuit connection across the country. There are two requirements that must be met to build such a quasi-real-time system. First, the in-transit storage at each node should be minimized to prevent undesirable time delays. Secondly, the shortest instantaneously available path through the network should be found with the expectation that the status of the network will be rapidly changing. Microwave will be subject to fading interruptions and there will be rapid moment-to-moment variations in input loading. These problems place difficult requirements upon the switching. However, the development of digital computer technology has advanced so rapidly that it now appears possible to satisfy these requirements by a moderate amount of digital equipment. What is envisioned is a network of unmanned digital switches implementing a self-learning policy at each node, without need for a central and possibly vulnerable control point, so that over-all traffic is effectively routed in a changing environment.
One particularly simple routing scheme examined is called the "hot-potato" heuristic routing doctrine and will be described in detail. Torn-tape telegraph repeater stations and our mail system provide examples of conventional store-and-forward switching systems. In these systems, messages are relayed from station to station and stacked until the "best" outgoing link is free. The key feature of store-and-forward transmission is that it allows a high line occupancy factor by storing so many messages at each node that there is a backlog of traffic awaiting transmission. But the price for link efficiency is the price paid in storage capacity and time delay. However, it was found that most of the advantages of store-and-forward switching could be obtained with extremely little storage at the nodes.
MODEL ALL-DIGITAL DISTRIBUTED SYSTEM
Fig. 10-Message block.
As we move to the future, there appears to be an increasing need for a standardized message block for our all-digital communications networks. As data rates increase, the velocity of propagation over long links becomes an increasingly important consideration.³ We soon reach a point where more time is spent setting the switches in a conventional circuit-switched system for short holding-time messages than is required for actual transmission of the data. Most importantly, standardized data blocks permit many simultaneous users, each with widely different bandwidth requirements, to economically share a broad-band network made up of varied data rate links. The standard-
³ 3000 miles at ~150,000 miles/sec ~50 msec transmission time, T. 1024-bit message at 1,500,000 bits/sec ~2/3 msec message time, M. Therefore, T >> M.
Thus, in the system to be described, each node will attempt to get rid of its messages by choosing alternate routes if its preferred route is busy or destroyed. Each message is regarded as a "hot potato," and rather than hold the hot potato, the node tosses the message to its neighbor who will now try to get rid of the message.
The Postman Analogy: The switching process in any store-and-forward system is analogous to a postman sorting mail. A postman sits at each switching node. Messages arrive simultaneously from all links. The postman records bulletins describing the traffic loading status for each of the outgoing links. With proper status information, the postman is able to determine the best direction to send any letters. So far, this mechanism is general and applicable to all store-and-forward communication systems. Assuming symmetrical bidirectional links, the postman can infer the "best" paths to transmit mail to any station merely by looking at the cancellation time or the equivalent handover number tag. If the postman sitting in the center of the United States received letters from San Francisco, he would find that letters from San Francisco arriving from channels to the west would come in with later cancellation dates than if such letters had arrived in a roundabout manner from the east. Each letter carries an implicit indication of its length of transmission path. The astute postman can then deduce that the best channel to send a message to San Francisco is probably the link associated with the latest cancellation dates of messages from San Francisco. By observing the cancellation dates for all letters in transit, information is derived to route future traffic. The return address and cancellation date of recent letters is sufficient to determine the best direction in which to send subsequent letters.
Hot-Potato Heuristic Routing Doctrine: To achieve real-time operation it is desirable to respond to change in network status as quickly as possible, so we shall seek to derive the network status information directly from each message block. Each standardized message block contains a "to" address, a "from" address, a handover number tag and error detecting bits together with other housekeeping data. The message block is analogous to a letter. The "from" address is equivalent to the return address of the letter. The handover number is a tag in each message block set to zero upon initial transmission of the message block into the network. Every time the message block is passed on, the handover number is incremented. The handover number tag on each message block indicates the length of time in the network or path length. This tag is somewhat analogous to the cancellation date of a conventional letter.
The Handover Number Table: While cancellation dates could conceivably be used on digital messages, it is more convenient to think in terms of a simpler digital analogy: a tag affixed to each message and incremented every time the message is relayed. Fig. 11 shows the handover table located in the memory of a single node. A row is reserved for each major station of the network allowed to generate traffic. A column is assigned to each separate link connected to a node. As it was shown that redundancy levels on the order of four can create extremely "tough" networks and that additional redundancy can bring little, only about eight columns are really needed.
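The message block and the handover number table just described can be pictured with a short sketch. This is a minimal illustration, not the format of Fig. 10: the field names, the `relay` and `blank_table` helpers, and the placeholder value used for unfilled entries are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class MessageBlock:
    to_address: str        # the "to" address
    from_address: str      # the "from" (return) address
    handover: int = 0      # set to zero on first transmission into the network
    payload: bytes = b""   # user data; error-detecting bits are omitted here

def relay(block):
    """Pass the block on by one hop, incrementing its handover number tag."""
    block.handover += 1
    return block

def blank_table(stations, links, unfilled=10**6):
    """Handover number table held at one node.

    One row per traffic-generating station, one column per link attached
    to this node. The `unfilled` starting value is an assumption of this sketch.
    """
    return {station: {link: unfilled for link in links} for station in stations}
```

For instance, `blank_table("ABCDEFG", range(1, 9))` gives seven rows of eight columns, the same general shape as the table in Fig. 11.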
Fig. 11-The handover number table. (Rows: stations A through G; columns: link numbers 1 through 8; entries: handover numbers, together with the 1st through 4th choice of best link number for each station.)
Perfect learning: If the network used perfectly reliable, error-free links, we might fill out our table in the following manner. Initially, set entries on the table to high values. Examine the handover number of each message arriving on each line for each station. If the observed handover number is less than the value already entered on the handover number table, change the value to that of the observed handover number. If the handover number of the message is greater than the value on the table, do nothing. After a short time this procedure will shake down the table to indicate the path length to each of the stations over each of the links connected to neighboring stations. This table can now be used to route new traffic. For example, if one wished to send traffic to station C, he would examine the entries for the row listed for station C based on traffic from C, and select the link corresponding to the column with the lowest handover number. This is the shortest path to C. If this preferred link is busy, do not wait; choose the next best link that is free.
Digital Simulation: This basic routing procedure was tested by a Monte Carlo simulation of a 7 X 7 array of stations. All tables were started completely blank to simulate a worst-case starting condition where no station knew the location of any other station. Within one-half second of simulated real-world time, the network had learned the locations of all connected stations and was routing traffic in an efficient manner. The mean measured path length compared very favorably to the absolute shortest possible path length under various traffic loading conditions. Preliminary results indicate that network loadings on the order of 50 per cent of link capacity
could be inserted without undue increase of path length. When local busy spots occur in the network, locally generated traffic is intermittently restrained from entering the busy points while the potential traffic jams clear. Thus, to the node the network appears to be a variable data rate system, which will limit the number of local subscribers that can be handled. If the network were carrying light traffic, any new input line into the network would accept full traffic, perhaps 1.5 million bits per second. But, if every station had heavy traffic and the network became heavily loaded, the total allowable input data rate from any single station in the network might drop to perhaps 0.5 million bits per second. The absolute minimum guaranteed data capacity of the network from any station is a function of the location of the station in the network, the redundancy level and the mean path length of transmitted traffic in the network. The "choking" of input procedure has been simulated in the network and no signs of instability under overload noted. It was found that most of the advantage of store-and-forward transmission can be provided in a system having relatively little memory capacity. The network "guarantees" very rapid delivery of all traffic that it has accepted from a user.
Forgetting and Imperfect Learning
We have briefly considered network behavior when all links are working. But we are also interested in determining network behavior with real-world links, some destroyed, while others are being repaired. The network can be made rapidly responsive to the effects of destruction, repair and transmission fades by a slight modification of the rules for computing the values on the handover number table.
Learning: In the previous example, the lowest handover number ever encountered for a given origination, or "from" station, and over each link, was the value recorded in the handover number table. But if some links had failed, our table would not have responded to the change. Thus, we must be more responsive to recent measurements than old ones. This effect can be included in our calculation by the following policy. Take the most recently measured value of handover number; subtract the previous value found in the handover table; if the difference is positive, add a fractional part of this difference to the table value to form the updated table value. This procedure merely implements a "forgetting" procedure: placing more belief upon more recent measurements and less on old measurements. In the case of network damage, this device would automatically modify the handover number table entry to exponentially and asymptotically approach the true shortest path value. If the difference between measured value minus the table value is negative, the new table value would change by only a fractional portion of the recently measured difference. This implements a form of skeptical learning. Learning will take place even with occasional errors. Thus, by the simple device of using only two separate "learning constants," depending on whether the measured value is greater or less than the table value, we can provide a mechanism that permits the network routing to be responsive to varying loads, breaks and repairs. This learning and forgetting technique has been simulated for a few limited cases and was found to work well.
Adaptation to Environment: This simple simultaneous learning and forgetting mechanism implemented independently at each node causes the entire network to suggest the appearance of an adaptive system responding to gross changes of environment in several respects, without human intervention. For example, consider self-adaptation to station location. A station, Able, normally transmitted from one location in the network, as shown in Fig. 12(a). If Able moved to the location shown in Fig. 12(b), all it need do to announce its new location is to transmit a few seconds of dummy traffic. The network will quickly learn the new location and direct traffic toward Able at its new location. The links could also be cut and altered, yet the network would relearn. Each node sees its environment through myopic eyes by only having links and link-status information to a few neighbors. There is no central control; only a simple local routing policy is performed at each node, yet the over-all system adapts.
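The table update and route selection rules described above can be sketched in a few lines. This is a hedged illustration rather than the simulated program: the class name `HandoverTable`, the parameter names `learn` and `forget`, and their default values are assumptions, since the text does not state the two learning constants. Setting `learn=1.0` and `forget=0.0` reduces the rule to the perfect learning procedure, in which only lower handover numbers are ever recorded.

```python
class HandoverTable:
    """Per-node table: rows are "from" stations, columns are attached links."""

    def __init__(self, stations, links, high=1000.0, learn=0.5, forget=0.1):
        self.learn = learn    # fraction applied when the measurement is lower
        self.forget = forget  # fraction applied when the measurement is higher
        self.rows = {s: {link: high for link in links} for s in stations}

    def observe(self, from_station, link, handover):
        """Update one entry from the handover number of an arriving block."""
        old = self.rows[from_station][link]
        diff = handover - old
        rate = self.forget if diff > 0 else self.learn
        self.rows[from_station][link] = old + rate * diff

    def choose_link(self, to_station, busy=()):
        """Pick the free link with the lowest entry; None if every link is busy."""
        row = self.rows[to_station]
        free = [link for link in row if link not in busy]
        return min(free, key=row.get) if free else None
```

Passing the currently busy links to `choose_link` gives the hot-potato behavior: the node does not wait for its preferred link but takes the next best one that is free.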
Fig. 12-Adaptability to change of user location. (a) Time "T1." (b) Time "T2."
Lowest Cost Path
We seek to provide the lowest cost path for the data to be transmitted between users. When we consider complex networks, perhaps spanning continents, we encounter the problem of building networks with links of widely different data rates. How can paths be taken to encourage most use of the least expensive links? The fundamentally simple adaptation technique can again be used. Instead of incrementing the handover by a fixed amount, each time a message is relayed, set the increment to correspond to the link cost/bit of the transmission link. Thus, instead of the "instantaneously shortest nonbusy path" criterion, the path taken will be that offering the cheapest transportation cost from user to user that is available. The technique can be further extended by placing priority and cost bounds in the message block itself, permitting certain users more of the communication resource during periods of heavy network use.
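The cost-weighted variant changes only the size of the increment. A minimal sketch, assuming hypothetical helper names and arbitrary cost units:

```python
def relay_with_cost(handover, link_cost_per_bit):
    """Handover tag after one relay over a link with the given cost per bit."""
    return handover + link_cost_per_bit

def path_tag(link_costs):
    """Tag carried by a block after traversing the listed links in order."""
    tag = 0.0
    for cost in link_costs:
        tag = relay_with_cost(tag, cost)
    return tag
```

With these helpers, a block relayed over three links of cost 1 arrives with a tag of 3, while a block sent over a single link of cost 5 arrives with a tag of 5, so a table built from such tags would favor the longer but cheaper route.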
WHERE WE STAND TODAY
Although it is premature at this time to know all the problems involved in such a network and understand all costs, there are reasons to suspect that we may not wish to build future digital communication networks exactly the same way the nation has built its analog telephone plant.
There is an increasingly repeated statement made that one day we will require more capacity for data transmission than needed for analog voice transmission. If this statement is correct, then it would appear prudent to broaden our planning consideration to include new concepts for future data network directions. Otherwise, we may stumble into being boxed in with the uncomfortable restraints of communications links and switches originally designed for high-quality analog transmission. New digital computer techniques using redundancy make cheap unreliable links potentially usable. Some sort of switched network compatible with these links appears appropriate to meet this new upcoming demand for digital service. Of course, we could use our existing circuit switching techniques, but a system with greater capacity than the long lines of telephone plants might best be designed for such data transmission and survivability at the outset. Such a system should economically permit switching of very short blocks of data from a large number of users simultaneously with intermittent large volumes among a smaller set of points. Considering the size of the market, there appears to be an incommensurately small amount of thinking about a national data plant, designed primarily around bit transportation.
POSTSCRIPT
This paper was, in essence, written about 18 months ago. Since that time the most critical aspects of the system have been examined and developed in detail, and a series of amplifying RAND Memoranda is in preparation. An idea of the subjects covered can be gained from the following list of tentative titles:
[1] Paul Baran, "Introduction to Distributed Communications Networks."⁴
[2] S. Boehm and P. Baran, "Digital Simulation of Hot-Potato Routing in a Broadband Distributed Communications Network."
[3] J. W. Smith, "Determination of Path-Lengths in a Distributed Network."
[4] P. Baran, "Priority, Precedence, and Overload."
[5] - - , "History, Alternative Approaches, and Comparisons."
[6] - - , "Mini-Cost Microwave."
[7] - - , "Tentative Engineering Specifications and Preliminary Design for a High Data Rate Distributed Network Switching Node."
[8] - - , "The Multiplexing Station."
[9] - - , "Security, Secrecy, and Tamper-Free Considerations."
[10] - - , "Cost Analysis."
[11] - - , "Summary Overview."
Because of the dependence of each of these Memoranda (vols. 2-11) upon one another, we have elected to release the volumes as a set as an aid to the reader.
⁴ This is essentially the present paper.
ACKNOWLEDGMENT
In discussing this work, I received a number of helpful ideas and suggestions. Wherever possible, these acknowledgments are included within the detailed papers amplifying the system. Specific acknowledgments for the present paper include the excellent programming assistance provided by Sharla Boehm, J. Derr, and J. W. Smith. I am also indebted to J. Bower for his suggestions that switching in any store-and-forward system can be described by a model of a postmaster and a blackboard.
Routing Procedures in Communications Networks, Part I: Random Procedures*

REESE T. PROSSER†

Summary-A study is made of possible routing procedures in military communications networks in order to evaluate these procedures in terms of future tactical requirements. In Part I this study is devoted to procedures involving random choices. In such networks each message path is essentially a random walk. Estimates of the average traverse time of each message and average traffic flow through each node are derived by statistical methods under reasonable assumptions on the operating characteristics of the network for various typical random routing procedures. This paper does not purport to present a complete system design. Many design questions common to all network routing problems (response to temporary loss of links or nodes, rules for handling of message priorities, etc.) are not considered here. It is shown that random routing procedures are highly inefficient but extremely stable. A comparison of these theoretical results with the results of an extended computer simulation effort lends support to their reliability, discrepancies being accounted for by the simplifying nature of the statistical assumptions. It is suggested that in circumstances where the need for stability outweighs the need for efficiency, this type of network might be advantageously employed.
* Received September 25, 1962. This work was supported by the U. S. Continental Army Command under Air Force Contract AF 19 (604)-5200.
† M.I.T. Lincoln Laboratory, Lexington, Mass.

A. INTRODUCTION

THIS REPORT grew out of an attempt to describe a military communications system suitable for combat units operating in a hostile environment. The requirements for such a system differ in various respects from those of a civilian system, such as the telephone or telegraph systems, operating in normally favorable environments, and these differences are reflected not only in the operating characteristics of the components of the system but also in the organization of the system itself. This report is devoted to a study of the statistical properties of different kinds of communications system organizations and an evaluation of their relative effectiveness under different types of environment.

The central problem, stated in general terms, appears to be one of efficiency vs reliability: In an entirely favorable environment, the reliability of a communications system is limited only by the characteristics of the components themselves, and the system should be organized for maximum efficiency. In an extremely hostile environment, on the other hand, the system must be organized for maximum reliability, and the efficiency is limited by the characteristics of this organization. Thus in a civilian system, messages normally get through quickly and accurately, but if fire sweeps a central office, the communications failure may be absolute. In a military system, however, degradation must be only gradual: messages must still be delivered (possibly delayed and distorted) even after a loss of half the system.

A closer analysis reveals that these general considerations can be given a theoretical basis, based on the following observations. Any large-scale civilian communications system is essentially a directory system: Every operating station in the system has a directory, or has access to a directory, which contains complete information on how to reach every other station in the network. In most cases there are also "central" stations, whose primary function is to handle, or to assist in handling, the routing of all messages through the system. This type of organization has the obvious advantage that efficient routing procedures can be obtained by all stations from the directory, and the obvious disadvantage that it depends heavily upon this directory. In a hostile environment the disadvantage is likely to outweigh the advantage: any change in the system requires a revision of the directory of every station, and the removal of a central station is likely to paralyze a large segment of the system.

In contrast, any military communications system operating in an extremely hostile environment tends to operate as a random system: there are no directories, and each station sends messages to whatever stations he can, based on whatever information he has, and hopes for the best. This type of organization has the obvious disadvantage that it is likely to be extremely inefficient, but has the advantage that it will continue to operate in the presence of serious degradation. In a favorable environment the disadvantage surely outweighs the advantages: No military establishment would tolerate so high a degree of inefficiency in its communications unless driven to it by truly desperate circumstances.

These observations can be restated in terms of the information available to each station about the rest of the system.
On the one hand, the directory system supposes that each station has access to complete and correct information about the system and routes messages on the basis of this information. Such a system can be made efficient, but is faced with the problem of devising optimum routing procedures from this information, and keeping this information up to date. On the other hand, the random system supposes that no station has access to any information about the system and each station routes messages on the basis of random choices. Such a system can be made reliable, but must determine how to organize itself to ensure best possible efficiency. It is likely that any military communications system suitable for operating under a wide variety of conditions will necessarily lie somewhere between these extremes, incorporating elements of both in a proportion which might well change with circumstances.

In this report we shall attempt to reinforce these qualitative considerations with quantitative information on the statistical properties and behavior of various kinds of communications systems organizations. In Part I we consider essentially random systems in which the routing procedures depend to a large extent on random choices made at each station. Theoretical estimates of average traffic rates, average traverse times, and average storage requirements are derived for various routing procedures and compared with the results of an extensive simulation experiment made on the IBM 709 computer. The behavior of such systems under degradation is then investigated in terms of these parameters. In Part II we consider essentially deterministic systems, in which the routing procedures are determined from directory information at the initial station. Here algorithms for optimum efficiency routing procedures and possible updating schemes are investigated, average traffic rates, traverse times and storage requirements are derived, and behavior under degradation investigated. The report concludes with a table of comparisons and summary of results.

Reprinted from IEEE Transactions on Communications Systems, December 1962.
The Best of the Best. Edited by W. H. Tranter, D. P. Taylor, R. E. Ziemer, N. F. Maxemchuk, and J. W. Mark. Copyright © 2007 The Institute of Electrical and Electronics Engineers, Inc.
B. OPERATING CHARACTERISTICS

An accurate analysis of an arbitrary communications system operating under a random routing procedure is exceedingly difficult to achieve. For this reason we begin by introducing a series of broad assumptions designed to simplify the analysis without adulterating the problem. Specifically, all of the communications systems considered here are assumed to have the following features in common.

Each network may be considered as a connected graph [1] consisting of a fixed number n of nodes of which certain pairs ("neighbors") are joined by links. Since we are dealing with systems with random routing procedures, we shall assume that the network is (approximately) homogeneous in the sense that each node is linked to (approximately) k others (i.e., has k "neighbors").

Messages are supposed to originate at each node of the network with a Poisson distribution in time of mean λ and with an exponential distribution in length of mean 1/μ. Both λ and μ are to be the same for all nodes. Each message is addressed to a node in the network chosen at random. It is convenient to include the (unlikely) possibility that the sender and receiver nodes for a message are identical. Each message is relayed from node to node along existing links in the network subject to the following general rules.

1) Messages arriving at a node, whether relayed or originating there, are placed in storage in order of arrival. Any message arriving while the storage is full is dropped from the network (i.e., "lost").
2) Messages are taken one at a time from storage on a first-come, first-serve basis and handled according to a prescribed procedure (the routing procedure) which varies with the system under study, but always includes the following:
   a) If the message is addressed to the node where it is located, it is dropped from the net (i.e., "received").
   b) If the node has (partial) information on the location of the addressee, it relays the message to one of its k neighbors selected on the basis of that information.
   c) Otherwise the node relays the message to one of its k neighbors selected at random.
3) Each node handles messages one at a time only, and the time required to handle a message (footnote 1) is directly proportional to the length of the message. For convenience we take the constant of proportionality to be 1. It is desirable to avoid specifying the unit of time, and so we shall refer to it as a cycle. The length of time required to send a message from one station to a neighbor station is called a relay. Note that this length depends on the message, but not on the station. The average relay is then 1/μ cycles. We assume 1/μ > 1.

1 i.e., the time required to process and transmit a message. Time spent in storage is not included.

C. MEAN TOTAL TRAFFIC

We now proceed to make a series of rough estimates of the operating characteristics of the system described in the previous section. These estimates are statistical in nature, and are based on assumptions, stated explicitly below, about the statistical properties of the system which are gross over-simplifications of the actual state of affairs. The estimates are consequently to be regarded as indicative, but not descriptive of the actual characteristics of the system. A comparison of these estimates with the results of the simulation experiment described in Section I will give some idea of their reliability.

The first step in our analysis consists in estimating the average total number N(t) of messages in the network at the end of a given cycle t. Roughly speaking, this average is given by 1), the average total number of messages in the network at the end of the previous cycle, plus 2), the average total number of messages added to the network during a single cycle, minus 3), the average total number of messages dropped from the network during a single cycle. Now 1) is simply N(t - 1) and 2) is just λn. 3) is the sum of two terms, one arising from messages lost because of insufficient storage facilities, and the other from messages delivered to their destinations. We shall assume here that the storage facilities at each node are sufficient to justify neglecting the first of these terms. (Estimates of these storage requirements are derived below.) The second term is the product of the average fraction f of N(t) actually handled during a cycle (i.e., not waiting in storage), times the probability p that one of these messages will be relayed to its destination, divided by the expected number of cycles 1/μ required to relay such a message to its destination. Thus we are led to the following equation for N(t):

$$N(t) = N(t-1) + \lambda n - \mu p f N(t-1). \tag{1}$$

This is an ordinary first-order difference equation whose solution is determined once we settle on an initial condition. Since we are principally interested in steady-state estimates, it is convenient to assume that the network is empty at t = 0 and investigate the behavior of N(t) as t → ∞. This leads to

$$N(0) = 0. \tag{2}$$

The solution now depends on assumptions made on p and f. Now p (and essentially only p) depends on the routing procedure of the system, and is investigated in Section E. Tentatively we shall assume here that p is a constant, independent of t or N, and postpone its evaluation until Section E. On the other hand, f is the fraction of N not waiting in storage, and in general depends on N. If we assume that the messages are independently randomly distributed throughout the network, then the methods of Riordan [6] may be used to show that the expected value of f is given by

$$f(N) = \frac{n}{n + N - 1}. \tag{3}$$

This value leads to a troublesome equation for (1). But two cases of interest are immediately accessible. We first note that

$$f(N) \sim \begin{cases} n/N & \text{if } N \gg n, \\ 1 & \text{if } N \ll n. \end{cases} \tag{4}$$

If N ≫ n, we get

$$N(t) = N(t-1) + (\lambda - \mu p)\, n \tag{5}$$

whose solution is easily seen to be

$$N(t) = (\lambda - \mu p)\, n t. \tag{6}$$

In this case N(t) → ∞ as t → ∞, and the system has no steady state. If N ≪ n, we get

$$N(t) = N(t-1) + \lambda n - \mu p N(t-1) \tag{7}$$

whose solution is given by (cf. [4])

$$N(t) = \frac{\lambda n}{\mu p}\left(1 - (1 - \mu p)^{t}\right). \tag{8}$$

In this case N(t) → λn/μp as t → ∞, and we may take this as the expected value for N when the network is operating in steady state. Formally,

$$N(\infty) = \frac{\lambda n}{\mu p}. \tag{9}$$

The requirement that N ≪ n places a condition on the parameters of the system; namely, that λn/μp ≪ n. This leads to

$$\frac{\lambda}{\mu} \ll p. \tag{10}$$

We conclude that when (10) is satisfied, the network will operate at a steady state with (9) as the expected value for N. As the ratio λ/μ increases past p, however, messages originate in the system faster than they can be delivered, and N may be expected to increase without limit, and ultimately behave like (6). These conclusions are borne out reasonably well by the simulation results (cf. Section I).

We have assumed above that the number of messages lost because of insufficient storage is negligible. This assumption places a requirement on the storage facilities at each node which we shall estimate here.

The storage facilities at each node may be considered as forming a queue, with messages arriving with a Poisson distribution of mean which because of the homogeneous character of the net we may take to be nλ, and leaving with an exponential distribution of mean 1/μ. Thus the methods of queueing theory apply (cf. [5], chapter 2). According to this theory, the probability that the queue is empty at any cycle is given by

$$P_0 = \frac{1 - \rho}{1 - \rho^{s+1}} \tag{11}$$

where ρ = nλ/μ, and s is the length of the queue. The probability that the queue contains b messages is

$$P_b = \rho^{b}\, \frac{1 - \rho}{1 - \rho^{s+1}} \qquad (0 \le b \le s). \tag{12}$$

The mean of this distribution over b when ρ ≪ 1 is approximately ρ (see [5], page 18).

Now the probability that a message is lost at a given node during a cycle is certainly no greater than the probability P_s that the storage queue at that node is full. The requirement that this is negligible relative to the probability that the message is delivered during this cycle is simply (assuming f ~ 1)

$$P_s < \alpha \mu p \tag{13}$$

where α is a preassigned fraction (for instance α = 0.01). Using (11) and (12) we get

$$\rho^{s}\, \frac{1 - \rho}{1 - \rho^{s+1}} < \alpha \mu p. \tag{14}$$

This is surely satisfied if

$$\rho^{s} \le \alpha \mu p \tag{15}$$

and this implies

$$s \ge \frac{\log(\alpha \mu p)}{\log \rho}. \tag{16}$$

The condition expressed in (16) places a lower bound on the size of the storage facilities at each node needed to ensure that message loss is negligible relative to message delivery. Thus we must have storage for at least s messages at each node, where s satisfies (16).
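The traffic estimate of Section C can be checked numerically. The following is a minimal sketch, not part of the original report: it iterates the difference equation (1) from the empty-network condition (2), using (3) for f, and evaluates the steady-state value (9) and the storage bound (16). The function names, the cap of f at 1 (the expression in (3) exceeds 1 when N < 1), and the parameter values in the usage line are assumptions made for illustration only.

```python
import math


def mean_traffic(n, lam, mu, p, cycles, low_traffic=False):
    """Iterate N(t) = N(t-1) + lam*n - mu*p*f*N(t-1) from N(0) = 0, eqs. (1)-(2).

    With low_traffic=True, f is set to 1 as in eq. (7); otherwise f follows
    eq. (3), capped at 1 (the cap is an assumption added for this sketch).
    Returns the list [N(0), N(1), ..., N(cycles)].
    """
    N = 0.0
    history = [N]
    for _ in range(cycles):
        f = 1.0 if low_traffic else min(1.0, n / (n + N - 1.0))
        N = N + lam * n - mu * p * f * N
        history.append(N)
    return history


def steady_state(n, lam, mu, p):
    """Eq. (9): N(infinity) = lam*n / (mu*p), meaningful when lam/mu << p, eq. (10)."""
    return lam * n / (mu * p)


def min_storage(n, lam, mu, p, alpha=0.01):
    """Smallest integer s satisfying eq. (16), with rho = n*lam/mu."""
    rho = n * lam / mu
    if not 0.0 < rho < 1.0:
        raise ValueError("the bound (16) needs 0 < rho < 1")
    return math.ceil(math.log(alpha * mu * p) / math.log(rho))


if __name__ == "__main__":
    # Hypothetical parameters: 50 nodes, pure random routing (p = 1/n).
    n, lam, mu, p = 50, 1.3e-3, 0.5, 1.0 / 50
    print(mean_traffic(n, lam, mu, p, 5000)[-1], steady_state(n, lam, mu, p))
    print(min_storage(n, lam, mu, p))
```

With f taken from (3) rather than set to 1, the iteration settles at a somewhat higher level than (9), which is one way to see how much the low-traffic approximation N ≪ n contributes to the estimate.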
D. MEAN TRAVERSE TIMES

The next step in our analysis consists of estimating the average traverse time of a single message, i.e., the average number of cycles required to relay the message from sender to receiver by routing procedures of the type described in Section B.

As in Section C, let p be the probability that a given message, placed at random in the network, is relayed to its destination on the next relay. Again we shall assume tentatively that p is independent of time. Then the probability that the message is still in the network after s relays is simply (1 - p)^s. This gives a familiar distribution over s, whose mean m and variance σ² are readily computed [3]:

$$m = \frac{1 - p}{p} \sim \frac{1}{p} \tag{17}$$

$$\sigma^2 = \frac{1 - p}{p^2} \sim m^2. \tag{18}$$

Note that σ² is larger than m. This may be interpreted as saying that while the mean traverse time is of the order of 1/p, the variance is very large, and some messages may be expected to have very large traverse times.

It remains to express m and σ² in terms of cycles. The number z of cycles required for the average relay is just 1), the average service time in cycles, plus 2), the average time in cycles spent waiting in storage. Now 1) is just 1/μ, and 2) is equal to 1/μ times the average number of messages waiting in storage. The average number of messages waiting in storage, assuming adequate storage facilities, is approximately ρ/(1 - ρ) (cf. [5], page 22). Combining, we get

$$z = \frac{1}{\mu}\left(1 + \frac{\rho}{1 - \rho}\right) = \frac{1}{\mu}\cdot\frac{1}{1 - \rho}. \tag{19}$$

Thus the mean traverse time in cycles, from (17) and (19), is

$$m = \frac{1 - p}{\mu p}\cdot\frac{1}{1 - \rho}. \tag{20}$$

When p ≪ 1 this becomes

$$m \sim \frac{1}{\mu p}\cdot\frac{1}{1 - \rho}. \tag{21}$$

When ρ is negligibly small, the network is essentially empty, and (21) becomes

$$m \sim \frac{1}{\mu p}. \tag{22}$$

Thus the delaying effect caused by the presence of other messages is given by the factor 1/(1 - ρ).

E. EFFECTS OF VARIOUS ROUTING PROCEDURES

It remains to give estimates for the value of p, the probability that a message will be relayed to its destination in a single relay. This probability obviously depends directly upon the choice of routing procedure; and in this section we estimate the value of p for various typical procedures, each of them consistent with the general description given in Section B, and each of them involving random choices.

It is convenient for this purpose to regard a message as having reached its destination as soon as its subsequent route is completely deterministic. Thus a message is regarded as "in the network" only as long as its subsequent path is not completely determined by the routing procedure. In all cases considered here this convention does not materially affect the results of Sections C and D.

With this in mind, we see that the value of p has the following form: it is given by 1), the probability that the message is located at a neighbor of its destination, taken in the sense described above, times 2), the probability that this neighbor will relay it to its destination. The value of 1) is computed assuming that the message is located at random in the network, and 2) is computed on the basis of the routing procedure. We now examine various cases:

a) pure random. In this case we assume each node knows only its own identity. Thus each node relays every message to one of its k neighbors selected at random unless the message is addressed to it. Here the "destination" of the message consists of the addressee, and there are k neighbors of this destination. The probability that a message is located at such a neighbor is k/n, and we get

$$p = \frac{k}{n}\cdot\frac{1}{k} = \frac{1}{n}. \tag{23}$$

b) first-order neighbors. In this case we assume each node knows the identity of itself and each of its neighbors. Messages are relayed at random unless addressed to a neighbor, in which case they are relayed to that neighbor. The "destination" of a message now consists of k + 1 nodes (addressee plus k neighbors) and the number of neighbors of this destination depends somewhat on the details of the network graph. Two cases are accessible when n is large. If neighbors of each node are located at random through the network, then they are relatively independent of each other, and there are approximately k(k - 1) neighbors of the destination (i.e., neighbors of neighbors of the addressee). The probability that a message is located at one of these neighbors is then k(k - 1)/n, and we get

$$p = \frac{k(k-1)}{n}\cdot\frac{1}{k} = \frac{k-1}{n}. \tag{24}$$

If the graph is planar, however, and if neighbors are located among the geographical neighbors of a node, then they are no longer independent of each other; and a study of various regular lattice configurations in the plane shows that when k is small there are approximately 2k neighbors of the destination, each of which is connected to the destination in two different ways. This gives us

$$p = \frac{2k}{n}\cdot\frac{2}{k} = \frac{4}{n}. \tag{25}$$

c) second-order neighbors. In this case each node knows the identity of itself, its neighbors and their neighbors. This case is an extension of the previous one, and the dependence on the details of the graph, already noted there, also applies here. If neighbors are located at random through the network, then the "destination" consists of k² + 1 nodes and there are approximately k(k - 1)² neighbors of the destination. In this case we get

$$p = \frac{k(k-1)^2}{n}\cdot\frac{1}{k} = \frac{(k-1)^2}{n}. \tag{26}$$

If neighbors are located only among geographical neighbors then the "destination" consists of approximately 3k + 1 nodes and there are approximately 3k neighbors each of which is connected to the destination in two different ways. Thus we get

$$p = \frac{3k}{n}\cdot\frac{2}{k} = \frac{6}{n}. \tag{27}$$

It is evident that we can continue this extension until each node has complete information about the net, though the approximations made here become less and less valid as we proceed. When k is large, however, not many steps are required. (Thus it has been said that any pair of persons in the United States are fourth-order neighbors via acquaintance links.)

We may summarize the preceding arguments in the following way: if the "destination" of a message is defined as the set of nodes which know the location of the addressee node and can assure a deterministic path to the addressee, and if there are h neighbors of this destination, each of them connected to the destination in j different ways, then we may estimate the value of p as

$$p = \frac{h}{n}\cdot\frac{j}{k}. \tag{28}$$

We assemble our results in Table I.

TABLE I

Routing Procedure                          Value of p
pure random                                1/n
first-order neighbors (arbitrary)          (k - 1)/n
first-order neighbors (geographic)         4/n
second-order neighbors (arbitrary)         (k - 1)²/n
second-order neighbors (geographic)        6/n

Recall that for large n we have

$$m \sim \frac{1}{p} \tag{17}$$

$$\sigma^2 \sim \frac{1}{p^2} \tag{18}$$

$$N \sim \frac{\rho}{p} \tag{9}$$

where ρ = λn/μ.

From Table I we see that in general, the larger the value of p, the smaller the value of m, σ² and N, and the more efficient the behavior of the network. In this sense the more information a node has, the more efficiently it will perform as a relay station in this type of network.

F. CRITIQUE OF ASSUMPTIONS

In deriving the formulas of the preceding sections we have relied heavily upon several simplifying assumptions, which may well be expected to introduce errors into the results. In this section we examine briefly the nature of these errors.

In the first place, we have assumed that the network is homogeneous, i.e., connected in such a way that, statistically speaking, the nodes are essentially identical, yet statistically independent, in their behavior. In point of fact, such a network is hard to realize. Any graph with a high degree of homogeneity in its connectivity is likely to have a high degree of symmetry as well, and the statistical independence of the different nodes is not easy to justify. This is particularly true for planar graphs, where neighbors are chosen geographically.

It is difficult to get a clear view of the situation without becoming hopelessly entangled in advanced combinatorics. Nevertheless, it seems reasonable to suppose that the dependence of our results upon these assumptions is not critical when n, the number of nodes, is large (n ~ 100) and deviations from homogeneity are not too pronounced (Δk ≲ 2). Thus we shall expect the behavior of such networks to follow at least qualitatively the predictions of our results and actual mean traverse times and mean total traffic to approximate the values we have predicted with at least the right order of magnitude. Excessive reliance on the predicted values, however, is to be avoided.

In the second place, we have assumed throughout that the value of p, the probability that a message reaches its destination on the next relay, is constant in time. Actually, it can be shown that p decreases somewhat in time. This effect tends to make the mean traverse times somewhat longer, and the mean total traffic somewhat larger, than predicted in Section D.

The actual behavior of p is, like everything else, shrouded in combinatorics. In the pure random case a rough idea may be obtained in the following way: the process of relaying a single message through a network may be regarded as a finite Markov process, with the nodes as states, and the routing doctrine determining the transition matrix (cf. [3]). The behavior of the network is then determined by the rules for finite Markov processes.
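The Markov-process view just described can be tried out on the simplest homogeneous graph. The sketch below is an illustration under stated assumptions, not the report's own computation: it takes a ring of n nodes (so k = 2), treats the addressee as an absorbing state, starts the message at a uniformly random node, and propagates the position distribution one relay at a time. The names `relay_once` and `ring_statistics` and the default number of relays are hypothetical.

```python
import math


def relay_once(dist):
    """Advance the position distribution by one relay on a ring.

    dist[i] is the probability that the message is at node i. The last node
    is the addressee and is absorbing; every other node relays to either
    ring neighbor with probability 1/2 (pure random procedure, k = 2).
    """
    n = len(dist)
    nxt = [0.0] * n
    nxt[n - 1] = dist[n - 1]           # delivered messages stay delivered
    for i in range(n - 1):
        half = 0.5 * dist[i]
        nxt[(i - 1) % n] += half
        nxt[(i + 1) % n] += half
    return nxt


def ring_statistics(n, relays=4000):
    """Return (decay, mean_relays) for a message started at a random node.

    decay is the per-relay factor by which the probability of being
    undelivered shrinks after many relays, measured over two relays to
    avoid the even/odd alternation of a walk on a ring. mean_relays sums
    the undelivered probabilities, which gives the expected number of
    relays provided `relays` is large enough for the tail to be negligible.
    """
    dist = [1.0 / n] * n
    undelivered = [sum(dist[: n - 1])]
    for _ in range(relays):
        dist = relay_once(dist)
        undelivered.append(sum(dist[: n - 1]))
    decay = math.sqrt(undelivered[-1] / undelivered[-3])
    return decay, sum(undelivered)


if __name__ == "__main__":
    n = 20
    decay, mean_relays = ring_statistics(n)
    print(decay, 1.0 - 1.0 / n)        # compare with (1 - p) for p = 1/n
    print(mean_relays, float(n))       # compare with 1/p = n
```

Comparing `decay` with 1 - 1/n, and `mean_relays` with n, gives a direct sense of how far the constant-p assumption can depart from the actual random walk on this particular graph.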
As an example we consider the pure random case with n large and k = 2. Here the graph consists of a single closed loop. Suppose a message originates somewhere at random in the network, addressed for node n. Its probable position at time t = 0 is described by the one-column matrix

$$v = \begin{pmatrix} 1/n \\ 1/n \\ \vdots \\ 1/n \end{pmatrix} \tag{29}$$

and its position after one relay is described by A·v, where A is the transition matrix

$$A = \begin{pmatrix}
0 & 1/2 & 0 & \cdots & 0 & 0 & 0 \\
1/2 & 0 & 1/2 & \cdots & 0 & 0 & 0 \\
0 & 1/2 & 0 & \cdots & 0 & 0 & 0 \\
\vdots & & & \ddots & & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1/2 & 0 \\
0 & 0 & 0 & \cdots & 1/2 & 0 & 0 \\
1/2 & 0 & 0 & \cdots & 0 & 1/2 & 1
\end{pmatrix}. \tag{30}$$

It is easy to see that after several relays the probability that the message is located at node i ≠ n is not the same for all i, being least at the nodes closest to node n, and greatest at the nodes farthest away from node n. Nevertheless, the decay of the probability that the message is not at n is still exponential in time, being determined essentially by the largest eigenvalue of A different from 1 ([3]). This eigenvalue depends on n but is always somewhat larger than (1 - p), which is the value used in Section E. Similar arguments can be given for other values of k and other routing logics. The problem in each case is to find the largest eigenvalue (different from 1) of the associated transition matrix, and this depends in general on the details of the network graph. The net result, however, is as stated above: the mean traverse times are increased, and the mean total traffic increased, by the inclusion of this effect.

These observations are borne out by the simulation results described in Section I.

G. EFFECT OF UNRELIABLE LINKS AND UNRELIABLE NODES

Efforts to evaluate the behavior of a random routing system operating in a hostile environment will depend in general on assumptions made on the effect of the environment upon the system.

Perhaps the simplest way to learn something about this behavior is to assume that the graph of the network is unaffected, but that the links are "unreliable", i.e., that each link is operational only a part of the time. To make this precise, we suppose that the probability that any link is operational at any moment is r_l, r_l being the same for all links.

The effect of this assumption is twofold: in the first place, it obviously reduces the probability that a message is delivered to its destination (in the sense of Section E) in the next relay by the probability that the appropriate links are operational. Thus we have

$$p_{\text{unreliable}} = r_l\, p_{\text{reliable}}. \tag{31}$$

In the second place, our assumption increases the number of cycles in a relay and hence the mean traverse time by a factor depending on the (small) probability that a message is delayed at a relay node because no link leading from that node is operational. This probability is just (1 - r_l)^k, and we have

$$z_{\text{unreliable}} = \left(1 + (1 - r_l)^{k}\right) z_{\text{reliable}}. \tag{32}$$

This effect is negligible when r_l ~ 1.

Another type of effect of environment upon the system results from supposing that links are unaffected, but that nodes are "unreliable." In this case we suppose that each node is operational with probability r_n, and further, that each node knows his neighbors, and relays messages only to operational nodes. Again the effect is twofold: in the first place the probability that a message is delivered to its destination in the next relay is reduced by a factor of r_n,

$$p_{\text{unreliable}} = r_n\, p_{\text{reliable}}, \tag{33}$$

and the number of cycles in a relay is increased by a factor of the form 1 + (1 - r_n)^k, so that

$$z_{\text{unreliable}} = \left(1 + (1 - r_n)^{k}\right) z_{\text{reliable}} \tag{34}$$

which is negligible when r_n ~ 1.

From these arguments we see that this type of network is extremely stable in the presence of a hostile environment, in the sense that the principal effect of the environment on the over-all operation of the network is a moderate attenuation of efficiency; in no sense is this operation destroyed.

H. EFFECTS OF SEQUENTIAL RECEPTION

The results of the preceding sections have been derived under the assumption that each node of the network sends messages sequentially (i.e., one at a time) but receives messages simultaneously, being limited only by its storage facilities. In practice it is more likely that each node both sends and receives messages sequentially. This is certainly the case, for example, in any military communications system organized around microwave relay links connecting mobile stations with a single directional antenna. At any given instant, such a station may send or receive a single message, or neither, but not both. The situation is not appreciably changed if each link consists of several channels. For such a system the previous derivations are no longer valid, but a slight modification makes it possible to include this more "realistic" case within our study.

Let us assume, then, that each node of the network operates sequentially, in the sense that it can perform only one operation (send or receive a single message) at a time. The principal effect of this assumption is to render the nodes "unreliable": each node is operational only during that fraction of the time that it is not busy. Thus the results of Section G apply. The value of p must be reduced by a factor r_n and the value of z increased by a factor 1 + (1 - r_n)^k. Here r_n is the probability that a node is operational, i.e., not busy. A crude estimate of r_n may be given along the following lines. The fraction of the time that a node is busy is roughly equal to the average density of messages in the network, i.e., to the average number of messages located at each node. As in Section D we take for this average the value ρ/(1 - ρ). If we neglect ρ² relative to ρ, we find that r_n = 1 - ρ and hence

$$p_{\text{sequential}} = (1 - \rho)\, p_{\text{simultaneous}} \tag{35}$$

$$z_{\text{sequential}} = (1 + \rho^{k})\, z_{\text{simultaneous}}. \tag{36}$$

The factor modifying z may certainly be neglected. Other effects of sequential operation are negligible relative to these.

TABLE II
MEAN TOTAL TRAFFIC: ANALYSIS VS SIMULATION
N = λn/μp

Columns: n; k; λ/μ × 10³; A (pure random case); B (1st-order neighbors case); C (2nd-order neighbors case). Each cell gives a result of analysis and a result of simulation, distinguished by typeface in the printed table.
Rows: n = 50, k = 3, λ/μ × 10³ = 2.60, 3.89, 4.55, 5.20; n = 50, k = 9, λ/μ × 10³ = 4.55, 6.50, 9.09; n = 100, k = 3, λ/μ × 10³ = 0.250, 1.30, 1.82; n = 100, k = 9, λ/μ × 10³ = 1.30, 2.60.
[Cell values for columns A, B and C are scrambled beyond recovery in this copy.]
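The corrections of Sections G and H are simple multiplicative factors, and a worked sketch may make them easier to apply. The code below is an illustration only: the function names and the sample values of p, z, r and ρ are assumptions, while the formulas are (31) through (36) together with the mean traverse time (1 - p)/p relays of z cycles each, as in (20).

```python
def unreliable_links(p, z, r_link, k):
    """Eqs. (31)-(32): each link is operational with probability r_link."""
    return r_link * p, (1.0 + (1.0 - r_link) ** k) * z


def unreliable_nodes(p, z, r_node, k):
    """Eqs. (33)-(34): each node is operational with probability r_node."""
    return r_node * p, (1.0 + (1.0 - r_node) ** k) * z


def sequential(p, z, rho, k):
    """Eqs. (35)-(36): sequential reception treated as r_node = 1 - rho."""
    return unreliable_nodes(p, z, 1.0 - rho, k)


def mean_traverse(p, z):
    """Mean traverse time in cycles: (1 - p)/p relays of z cycles each, cf. eq. (20)."""
    return (1.0 - p) / p * z


if __name__ == "__main__":
    # Hypothetical values: p = 1/50, two cycles per relay, k = 3 neighbors.
    p, z, k = 0.02, 2.0, 3
    for label, (p_new, z_new) in (
        ("reliable", (p, z)),
        ("links, r = 0.9", unreliable_links(p, z, 0.9, k)),
        ("nodes, r = 0.9", unreliable_nodes(p, z, 0.9, k)),
        ("sequential, rho = 0.13", sequential(p, z, 0.13, k)),
    ):
        print(label, p_new, z_new, mean_traverse(p_new, z_new))
```

In each case the change in z is small compared with the change in p, which is the sense in which the text calls the z factor negligible.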
TABLE III MEAN TRAVElRSE TIMES-ANALYSIS VS SIMULATION
I.
m == 1/p.p X 1/(1 - p)2
RESULTS OF A SIMULATION EXPERIMENT
In an effort to verify quantitatively the conclusions described in this report, an extensive simulation experiment was performed by D. F. Clapp at Lincoln on the IBM 709. Details ,of this effort are described elsewhere [2]. Here it suffices to say that the simulation involved typical homogeneous networks with typical values of nand k, and routing logics including the pure random, first-order neighbors, and second-order neighbors cases as described in Section E. The operating characteristics of these networks were made consistent with the general principles laid down in Section B, except that it was felt desirable to include provisions for the sequential reception feature described in Section G, and the actual simulation was run in this mode. Each node was provided with storage facilities for seven messages. The values of the ratio XI J.I, were chosen by trial and error to be as large as possible subject to the requirement that the message loss through storage overflows be negligible. In this sense the networks were operated at the largest ,possible average traffic load. Smaller values of XIJ.I, are expected to result in a better over-all performance. The data recorded in this simulation included values of the mean total traffic and mean traverse time as described in Sections C and D. From the point of view of this study, these are the factors necessary. to evaluate the performance of this type of network. A comparison of their values as determined by analysis and by simulation for typical situations is assembled in Tables II and III. We feel that the results of the simulation tend to bear out the results of the previous sections within the broad range of uncertainty introduced by the complexities of the problem.
n
k
λ/μ × 10^3
50
3
2.60 3.89 4.55 5.20
106 124 135 147
4.55 6.50 9.09
135 51.3 206 94.9 272 95.0
16.9 25.6 34.0 12.1
0.250 1.30 1.82
170 311 213 252 241 250
85.0
1.30 2.60
213 128 294 141
-50
9
-100- -3-100
9
164
187 191 161
Column A: pure random case
Column B: 1st-order neighbors case
Column C: 2nd-order neighbors case
J.
C
B
A
53:0 62.0 67.5 131 73.5
107 121
26.6
36.7
194
50.1
26.5 31.0
33.8
36.8
126
2.11 3.20 4.25 4.0 47.5 53.5
60.5
206
3.33
4.60 22.6
Results of Analysis: 0.000 Results of Simulation: 0.000
J. SUMMARY AND CONCLUSIONS
The quantitative results established in this report tend to bear out and to emphasize the qualitative conclusions drawn in the introduction on the performance characteristics of communications systems depending upon random routing procedures as defined in Section B. From the point of view of military communications requirements, these conclusions may be summarized as follows:
1) The system provides access to the whole network for each station.
2) It does not require the use of directories.
3) It does not require the use of complex routing doctrines.
4) It does not require the use of central offices or trunk lines.
5) Its performance is essentially independent of the actual structure of the network.
6) Its performance is extremely stable in the presence of a hostile environment, in the sense of Section G.
Disadvantages:
1) Average traverse times are long, and variances large.
2) Traffic rates must be kept low.
3) The system is not suitable for operation in real time. In particular, it cannot be employed as a telephone system.
4) The system is not suitable for any but the simplest types of messages. In particular, it cannot be expected to deliver "all points" messages.
5) The system is extremely vulnerable to overloading. Consequently, it cannot be expected to handle nonessential or duplicate messages.
It is evident that from a military point of view the advantages are extremely attractive but that the price to be paid for them is high. Perhaps the severest disadvantage is the limitation on traffic rates. This can be alleviated somewhat by radically increasing the transmission rates, thus reducing the effective message lengths. If transmission rates of the order of a millisecond² can be achieved by high-speed microwave components, then messages may originate at the rate of one per second at each station without overloading a system of one hundred stations. Mean traverse times are then of the order of a tenth of a second. In this case it might be possible to provide a simple, workable, non-real-time communications system which could assume all of the advantages listed above, and which could be supplemented with whatever additional facilities seem desirable.
In Part II we will consider essentially deterministic systems, in which the routing procedures are determined from directory information at the initial station.
BIBLIOGRAPHY
[1] C. Berge, "Théorie des Graphes et ses Applications," Dunod, Paris, France; 1958.
[2] D. Clapp, "On a Communications Network Simulation Program," Lincoln Lab., Lexington, Mass., Group Rept. No. 22G0016; May, 1960.
[3] W. Feller, "An Introduction to Probability Theory and Its Applications," John Wiley and Sons, Inc., New York, N. Y.; 1950.
[4] F. Hildebrand, "Methods of Applied Mathematics," Prentice-Hall Inc., New York, N. Y.; 1952.
[5] P. Morse, "Queues, Inventories and Maintenance," John Wiley and Sons, Inc., New York, N. Y.; 1958.
[6] J. Riordan, "An Introduction to Combinatorial Analysis," John Wiley and Sons, Inc., New York, N. Y.; 1958.
² Per relay.