Multiuser Detection in CDMA Mobile Terminals
For a listing of recent titles in the Artech House Mobile Communications Series, turn to the back of this book.
Piero Castoldi
Artech House
Boston • London
www.artechhouse.com
Library of Congress Cataloging-in-Publication Data
Castoldi, Piero.
Multiuser detection in CDMA mobile terminals / Piero Castoldi.
p. cm. -- (Artech House mobile communications series)
Includes bibliographical references and index.
ISBN 1-58053-330-2 (alk. paper)
1. Code division multiple access. 2. Signal detection. I. Title.
TK5103.452.C37 2002
621.3845'6--dc21 2002016426

British Library Cataloguing in Publication Data
Castoldi, Piero
Multiuser detection in CDMA mobile terminals. -- (Artech House mobile communications series)
1. Signal processing 2. Code division multiple access
I. Title
621.3'822
ISBN 1-58053-330-2

Cover design by Igor Valdman

© 2002 ARTECH HOUSE, INC.
685 Canton Street
Norwood, MA 02062

All rights reserved. Printed and bound in the United States of America. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system without permission in writing from the publisher.

All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Artech House cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

International Standard Book Number: 1-58053-330-2
Library of Congress Catalog Card Number: 2002016426

10 9 8 7 6 5 4 3 2 1
Contents

Preface

1 Introduction and CDMA Models
  1.1 Multiple Access Techniques
  1.2 Code Division Multiple Access
  1.3 Spread Spectrum Techniques for CDMA
    1.3.1 DS-SS
  1.4 Short Codes Versus Long Codes: Orthogonal Versus Random Spreading
  1.5 Synchronous CDMA
    1.5.1 Spreading Sequences for Synchronous Multirate CDMA Systems
    1.5.2 Synchronous Multirate CDMA Model in the Presence of Multipath Propagation
    1.5.3 Time-Discrete Model for the Synchronous System
  1.6 Asynchronous CDMA
    1.6.1 Spreading Sequences for Asynchronous CDMA Systems
    1.6.2 Asynchronous CDMA Model in the Presence of Single-Ray Propagation
    1.6.3 Time-Discrete Model for the Asynchronous System
  1.7 Dichotomies in CDMA
  References

2 Single-User Detection
  2.1 Single-User Versus Multiuser Receivers
  2.2 Performance Measures of CDMA Receivers
  2.3 The Conventional Receiver
    2.3.1 Synchronous Multirate CDMA System Without ISI
    2.3.2 Asynchronous Single-Rate CDMA System
    2.3.3 Synchronous Multirate CDMA System in the Presence of ISI
  2.4 The Rake Receiver
    2.4.1 Observation Model for Rake Detection
    2.4.2 The Rake Principle
  2.5 Limits of Single-User Detection
  References

3 Linear Multiuser Detection
  3.1 Synchronous CDMA, Short Code, Multipath, and Multirate
    3.1.1 The Zero-Forcing Detector for Both MAI and ISI (ZF-MI)
    3.1.2 The Zero-Forcing Detector for ISI Only (ZF-I)
    3.1.3 The MMSE Detector for Mitigation of Both MAI and ISI (MMSE-MI)
    3.1.4 The MMSE Detector for Mitigation of ISI Only (MMSE-I)
  3.2 Sliding Window Formulation
    3.2.1 Validation of the Sliding Window Algorithm
  3.3 MMSE Receivers for CDMA Asynchronous Systems
  3.4 Sliding Window MMSE Receiver
  3.5 Considerations on Linear Detection
  References

4 Structured Versus Unstructured Linear Detection and Interference Mitigation
  4.1 System Model
    4.1.1 Reduced Complexity Single-Cell Models
    4.1.2 Multiple-Cell Models
  4.2 Extended Linear Receivers
  4.3 Sliding Window Formulation
  4.4 Typical Operating Modes of the Linear Receivers
  4.5 Multiuser Detection Versus Interference Mitigation
  References
  Appendix 4A: Exact Expression of the Correlation Matrix Rn(i)

5 Adaptive Linear Multiuser Detection
  5.1 Adaptive MMSE Receivers
  5.2 Trained Adaptive MMSE Receiver
  5.3 Blind Adaptive MMSE Receiver
  5.4 Blind Adaptive MMSE Receiver with Surplus Energy Constraint
  5.5 Advantages of Adaptive Linear Detection
  References

6 Performance of Linear Multiuser Detection
  6.1 Performance of the ZF and MMSE Receivers in Synchronous Systems
    6.1.1 Features of the Synchronous System
    6.1.2 Exact Evaluation of BEP
    6.1.3 Gaussian Approximation for BEP Evaluation
    6.1.4 BER and BEP Performance in a Single-Cell Scenario
    6.1.5 BER Performance in a Multiple-Cell Scenario
  6.2 Performance of the MMSE Receivers in Asynchronous Systems
    6.2.1 Features of the Asynchronous System
    6.2.2 Transient Performance: MSE Convergence
    6.2.3 BER Performance of the MMSE Detectors in the Ideal Case
    6.2.4 BER Performance of the MMSE Detectors in the Presence of a Timing Error
    6.2.5 BER Performance of the MMSE Detectors in the Presence of a Phase or Frequency Offset
    6.2.6 Improving the Performance of Linear Detectors
  References

7 Nonlinear Multiuser Detection
  7.1 Introduction and System Model
  7.2 SIC
  7.3 PIC
  7.4 Decision Feedback Nonlinear Receivers
    7.4.1 A Decision Feedback Multiuser Detector
    7.4.2 Complexity Reduction by Per-Survivor Processing Techniques
  7.5 Linear Versus Nonlinear Multiuser Detection
  References

8 Synchronization Techniques
  8.1 Observation Model
  8.2 Timing Acquisition
    8.2.1 Coarse Timing Acquisition
    8.2.2 Fine Timing Acquisition
  8.3 Performance of the Sliding Correlator
    8.3.1 Coarse Acquisition
    8.3.2 Fine Acquisition
    8.3.3 Complexity Issues of the Sliding Correlator
  8.4 Frequency Offset Estimation
    8.4.1 Front-End Processing of the Observation
    8.4.2 First-Order Statistics of z(i)
    8.4.3 Second-Order Statistics of z(i)
    8.4.4 LMS Frequency Estimator
  8.5 Performance of the Frequency Estimator
  8.6 Coherent Versus Noncoherent Detection
  References
  Appendix 8A: Derivation of Matrix Rz

9 Third-Generation Mobile Radio System
  9.1 General Specifications
  9.2 The Physical Layer of the TDD-HCR Mode
    9.2.1 Physical Channels
    9.2.2 Training Sequences
    9.2.3 Spreading and Scrambling Sequence
  9.3 Radio Interface and Transmitted Signal
  9.4 Cell Search and Synchronization
  9.5 Novelties of the Third-Generation Radio System
  References

10 The ASI-CNIT Communication System
  10.1 System Description
  10.2 Data Frame Format
  10.3 Algorithms for Data Detection and Their Performance
    10.3.1 Trained Adaptive MMSE Detection
    10.3.2 Blind Adaptive MMSE Detection
  10.4 Optimization of the Length of the Training Sequence
  10.5 Signaling Using the Synchronization Sequence
  References

Glossary

About the Author

Index
Preface

The rapidly increasing demand for bandwidth in wireless services has stimulated research towards the development of highly flexible radio interfaces. Code division multiple access (CDMA) has emerged as a promising technique for providing services with different signaling rates (from the user point of view) and with efficient statistical multiplexing of the active users (from the network point of view). Both satellite and terrestrial wireless CDMA systems for cellular services, satellite distribution, local loops, and local area networks (LANs) are currently in operation, since the technological problems of signal processing at chip rate are no longer an issue.

The recent advances in detection schemes for CDMA systems have also allowed network designers to conceive novel radio interface architectures. In particular, enhanced multiuser detection techniques have improved the potential capacity of the system due to their ability to deal with the near-far effect and their interference rejection capability. Moreover, the so-called short-code CDMA design philosophy encourages the use of a multiuser receiver to exploit the dimensional separation of the data streams associated with different codes.

The focus of this book is on low-complexity multiuser detection and interference mitigation techniques to be employed in the mobile terminal. It challenges past literature, which suggests that multiuser detection can only be realized in base stations. In fact, most of the research results included in the book are related to the implementation of multiuser detection in a mobile terminal for the synchronous time division duplex (TDD) standards. Other results are related to another mobile receiver prototype with multiuser detection capability for an asynchronous satellite system, developed by the Italian National Consortium for Telecommunications (CNIT) within a research contract with the Italian Space Agency (ASI). Although the above-mentioned projects have been the main drive for the research herein, the book contains a very general and comprehensive treatment of linear multiuser detection.

The book is organized into 10 chapters. It starts with the basic principles of CDMA transmission and detection and describes the open issues of this type of multiple access. After reviewing single-user detection, the book presents an in-depth description of linear multiuser detectors in terms of underlying philosophy and receiver performance. Both nonadaptive and adaptive versions of these receivers are presented with analytical details and intuitive design motivations. Next is a chapter on nonlinear multiuser detection and another one on synchronization techniques. The last two chapters of the book present the physical layer of two standards: the TDD modes of third-generation mobile radio systems and a satellite asynchronous CDMA system. Design guidelines are given for the receivers to be employed in mobile terminals, highlighting which algorithms of the previous chapters are appropriate for the commercial systems described in the last chapters.

The book is suitable for undergraduate and graduate students of electrical engineering who want to gain more insight into the fundamentals of signal processing involved in the practical implementation of low-complexity CDMA multiuser detectors. Research and development staff of companies that manufacture hardware based on a CDMA wireless interface may also find the book of interest.

Most of the material in this book is the result of my cooperation with Professor Hisashi Kobayashi at Princeton University, whom I sincerely thank for hosting me during the 1996–1997 academic year and the summers of 1999 and 2000. I also gratefully acknowledge the efficient collaboration of my students M. Bertinelli, M. Guerra, M. Coccetti, M. Morani, and D. Mainardi, as well as several useful discussions with my colleague, Professor Giulio Colavolpe. Without their help this book would not have been written. I am also grateful to Dr. Anila Scott-Monkhouse, who helped me edit the English version of this book.

Last but not least, I express my gratitude to Paola for her patience and her encouragement in the realization of this project, to my parents for creating the opportunities for me to reach this goal, and to my dear sister for her special attitude towards life. And finally, a special word of thanks to my friends and colleagues in Parma and Pisa for creating a pleasant work environment.
1 Introduction and CDMA Models

1.1 Multiple Access Techniques

The spectrum available for radio and wired electrical telecommunications is a limited resource, which calls for optimal allotment to the various services. If we have a point-to-point link based on digital communication, the maximization of the throughput (total data rate), given a specified bandwidth and a certain bit error rate, can be achieved in at least two obvious ways [1]. The first is to increase the transmission power so that the average energy received per bit over noise spectral density ($E_b/N_0$) is increased. The second is to use modulation techniques with high spectral efficiency.

In a system whose communication resource in terms of bandwidth is shared by many users, the best way to achieve throughput maximization is to deploy an efficient assignment of the transmitting resource to each user. This task belongs to the area of multiple access in communications. All multiple access schemes provide each user with a channel, defined as a portion of the communication resource, using a deterministic law or according to a random access strategy with possible contention. As the latter schemes are beyond the scope of this book, we will focus only on the former.

The key to all multiple access schemes is that the various signals share the communication resource without creating unmanageable interference with one another in the detection process. The qualitative limit of such interference is that signals on one channel should not significantly increase the probability of error on another channel. For example, if signals on different channels are mutually orthogonal, interference among users is completely avoided. Signal waveforms $s_i(t)$ ($i = 1, 2, \ldots$) are defined as orthogonal in the time domain if they enjoy the property

$$\int_{-\infty}^{\infty} s_i(t)\, s_j(t)\, dt = \begin{cases} K & \text{for } i = j \\ 0 & \text{otherwise} \end{cases} \tag{1.1}$$
where $K$ is a nonzero constant. As an example, time division multiple access (TDMA) realizes orthogonality by partitioning the time axis into slots, each of which is assigned to a different incoming digital stream. TDMA requires that geographically distant users maintain mutual time synchronism. Similarly, we can conceive signals that are orthogonal in the frequency domain if they satisfy

$$\int_{-\infty}^{\infty} S_i(f)\, S_j(f)\, df = \begin{cases} K & \text{for } i = j \\ 0 & \text{otherwise} \end{cases} \tag{1.2}$$

where the functions $S_i(f)$ are the spectra of the signal waveforms $s_i(t)$. Frequency division multiple access (FDMA) realizes orthogonality in the frequency domain by allocating subbands of frequency to each user. As an example, each user can be assigned a different carrier frequency so that the resulting spectra do not overlap. Note that FDMA allows completely uncoordinated transmission in the time domain; no time synchronization among the users is required [2, 3].
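The time-domain condition (1.1) can be checked numerically for TDMA-style signals. The sketch below is only an illustration: it assumes rectangular unit-amplitude pulses, approximates the integral by a Riemann sum, and uses hypothetical names and values (`tdma_pulses`, `inner_product`, the step `dt`) that do not come from the text.

```python
import numpy as np

def inner_product(x, y, dt):
    """Riemann-sum approximation of the integral of x(t) * y(t) dt."""
    return float(np.sum(x * y) * dt)

def tdma_pulses(num_users, slot_samples):
    """One rectangular unit-amplitude pulse per user, each confined to its own slot."""
    pulses = np.zeros((num_users, num_users * slot_samples))
    for i in range(num_users):
        pulses[i, i * slot_samples:(i + 1) * slot_samples] = 1.0
    return pulses

dt = 1e-3  # hypothetical sampling step in seconds (an assumption, not from the text)
pulses = tdma_pulses(num_users=3, slot_samples=100)

# Matrix of all pairwise inner products: K on the diagonal, 0 elsewhere
gram = np.array([[inner_product(p, q, dt) for q in pulses] for p in pulses])
print(gram)
```

With these assumed values the constant $K$ in (1.1) is the energy of one slot pulse, 100 samples times `dt`.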
1.2 Code Division Multiple Access

So far we have considered multiple access schemes where orthogonality is realized by considering nonoverlapping channels, either in the time domain or in the frequency domain. What point is there in considering multiaccess techniques that do not comply with the principle of dividing the communication resource into noninterfering channels? The most important reason is that noninterfering multiaccess strategies usually waste resources when the number of potential users is much larger than the number of simultaneously active users. As a matter of fact, orthogonality can still be realized for signals that overlap both in time and frequency, as shown in Figure 1.1.

Figure 1.1 Example of orthogonal signals $s_1(t)$ and $s_2(t)$, each of duration $T$, overlapping both in the time and in the frequency domain. (Source: [3]. Reprinted with permission.)

Because it is very often impossible to realize perfect orthogonality between different signals, the target is to have as little mutual interference among users as possible, so that the impairment to one user's signal due to the interference from another is tolerable. This is the principle of code division multiple access (CDMA). Users are assigned different signature waveforms, also called spreading waveforms (we will see later that a CDMA signal is usually a spread spectrum signal) or codes. Each transmitter sends its data stream by modulating its own signature waveform as in a single-user digital communication system. With an adequate design of each signature waveform, we are able to achieve near-orthogonality. At the receiver side, multiple access signals are distinguished according to their signature waveform. Digital TDMA and FDMA systems can be seen as special cases of orthogonal CDMA where the signature waveforms do not overlap in the time domain and in the frequency domain, respectively. For systems with antenna arrays, a further dimension (i.e., the spatial diversity) can be exploited for multiple access (MA), and the corresponding MA scheme is generally referred to as space division multiple access (SDMA).

No signal can be both strictly time-limited and strictly band-limited. But adopting a relaxed definition of bandwidth and/or duration (for example, based on the energy within a frequency band $(-B, B)$ or a time interval $(0, T)$), we can evaluate how many orthogonal signals with approximate duration $T$ and bandwidth $B$ can be constructed. It has been shown that, unless the product $BT$ is small, we can construct $2BT$ orthogonal signals [4]. An immediate consequence of this result is that a CDMA system with $J$ orthogonal users, employing antipodal modulation at the rate of $B_r = 1/T$ bits per second, requires a bandwidth approximately equal to

$$B = \frac{1}{2} B_r J \tag{1.3}$$
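As a quick numerical illustration of (1.3), take hypothetical values that do not come from the text: $J = 16$ orthogonal users, each transmitting at $B_r = 10$ kbit/s with antipodal modulation. Setting the number of available orthogonal signals, $2BT$, equal to $J$ gives:

```latex
\begin{align*}
2BT = J \;&\Rightarrow\; B = \frac{J}{2T} = \frac{1}{2} B_r J \\
B &= \frac{1}{2} \times 10^{4}\ \text{s}^{-1} \times 16 = 80\ \text{kHz} \\
BT &= \frac{J}{2} = 8
\end{align*}
```

These figures are only as meaningful as the relaxed definitions of duration and bandwidth behind the $2BT$ count.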
Hence, to achieve the minimum bandwidth, the "soft" duration-bandwidth product of each signature waveform in an orthogonal CDMA system must be $J/2$. In contrast, achieving (1.3) with TDMA or FDMA requires a pulse with minimum duration-bandwidth product (which does not grow with the number of users). It should be noted that, if we limit our attention to practical orthogonal signals used in CDMA, like those obtained by direct sequence spread spectrum (DS-SS) or frequency hopping spread spectrum, the number of orthogonal signals is further reduced and its maximum value becomes $BT$ [3]. The signals on which TDMA and FDMA are based typically have $BT \simeq 1$ and are regarded as narrowband signals, while CDMA requires signals with a large $BT$ product, which are usually called spread spectrum signals. The need for signals with larger duration-bandwidth products in CDMA, however, brings benefits such as robustness against unknown channel distortion and antijamming capabilities.

As mentioned at the beginning of this section, the perfect orthogonality of the signature waveforms is not mandatory for CDMA. Removing this restriction provides some benefits that make CDMA an attractive multiple access technique for practical communication systems. Specifically:

• The users can be asynchronous (that is, their time epochs need not be aligned) and yet "quasi-orthogonality" can be maintained by adequate design of spread spectrum signature waveforms.
• The number of simultaneous users is no longer constrained to twice the duration-bandwidth product of the signature waveform.
• The sharing of the communication resource is dynamic: The reception performance of each user depends on the number of simultaneously active users, rather than on the number of potential users of the system, which is usually much larger. Thus, unlike orthogonal multiaccess techniques, the performance gradually degrades as the number of active users increases. This allows reception quality to be traded off for increased capacity.
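The notion of quasi-orthogonality between asynchronous users can be made concrete with a small experiment. The sketch below rests on two simplifying assumptions that are not from the text: the signatures are random sequences of ±1 chips, and a timing offset is modeled as a cyclic shift by a whole number of chips (which ignores data transitions in the interfering stream). The function name `max_cyclic_crosscorr` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so that runs are repeatable

def max_cyclic_crosscorr(c1, c2):
    """Largest magnitude of the normalized cross-correlation over all cyclic shifts."""
    q = len(c1)
    return max(abs(np.dot(c1, np.roll(c2, shift))) / q for shift in range(q))

# Longer random signatures tend to give smaller worst-case cross-correlation
for q in (16, 64, 256, 1024):
    c1 = rng.choice([-1, 1], size=q)
    c2 = rng.choice([-1, 1], size=q)
    print(q, round(max_cyclic_crosscorr(c1, c2), 3))
```

A value of 1 would mean that one signature is indistinguishable from a shifted copy of the other; practical designs aim for values well below that at every offset.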
1.3 Spread Spectrum Techniques for CDMA

Spread spectrum techniques were initially developed for military applications due to their excellent antijamming performance and later found a wide range of applications in commercial wireless systems [5–7]. The main idea of spread spectrum is to spread a signal over a frequency band that is much larger than the original signal bandwidth and transmit it with low power per unit bandwidth. This is the reason why CDMA is also called power division multiple access (PDMA).

1.3.1 DS-SS

DS-SS realizes the band spreading by amplitude modulation of the information symbol stream with a higher-rate sequence (the chip sequence). Figure 1.2 illustrates the basic principle of a DS-SS signal with a binary information sequence and a binary spreading sequence. As seen, each symbol of duration $T$ is spread into multiple chips of duration $T_c \ll T$. The bandwidth expansion factor, denoted by $Q$,

$$Q = \frac{T}{T_c} \tag{1.4}$$

is often called the spreading factor or processing gain.

Figure 1.2 DS-SS modulation principle, showing the symbol sequence $a(n)$ with symbol duration $T$, the spreading chip sequence $c(k)$ with chip duration $T_c$, and the resulting spread sequence $b(k)$. (Source: [8]. Reprinted with permission.)

A spreading sequence can cover one symbol interval or more than one; the two cases are referred to as short codes and long codes, respectively. For example, so-called pseudorandom or pseudonoise (PN) chip sequences are often employed to randomize the spread signal as much as possible. These sequences can be generated by combining the outputs of feedback shift registers, which typically give the DS-SS signal the properties of white noise, with a whiteness that improves as the sequence length increases [9]. Another way to realize spreading is the use of orthogonal spreading sequences, which enables a simple separation of the users at the receiver side.

We make use of the baseband equivalent notation in order to express the DS-SS modulated signal. The chip sequence is filtered by a shaping filter (i.e., a chip pulse) $p(t)$, which determines the bandwidth of the CDMA system. Mathematically, the superposition of $J$ DS-SS signals can be expressed as

$$s(t) = \sum_{j=1}^{J} \sum_{n=1}^{\infty} a_j(n)\, s_j\bigl(t - (n-1)T\bigr) \tag{1.5}$$

where

$$s_j(t) = \sum_{l=0}^{Q-1} c_j(l)\, p(t - lT_c) \qquad j = 1, 2, \ldots, J \tag{1.6}$$

is the signature waveform.
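Equations (1.4) to (1.6) can be illustrated at chip rate. The sketch below assumes a rectangular chip pulse sampled once per chip, so that each symbol simply becomes $Q$ chips, together with an ideal noiseless channel; the two codes and the helper names `spread` and `despread` are hypothetical choices made for illustration.

```python
import numpy as np

def spread(symbols, code):
    """Chip-rate samples of one user's signal in (1.5): each symbol scales one copy of the code."""
    return np.kron(symbols, code)

def despread(chips, code):
    """Correlate each block of Q chips with the code and normalize by Q."""
    q = len(code)
    return chips.reshape(-1, q) @ code / q

# Two hypothetical users with orthogonal short codes of length Q = 4
codes = np.array([[1, 1, -1, -1],
                  [1, -1, 1, -1]])
symbols = np.array([[1, -1, 1],
                    [-1, -1, 1]])

# Superposition of the J = 2 spread signals
s = sum(spread(a, c) for a, c in zip(symbols, codes))

# Each user's symbols are recovered because the codes are orthogonal
recovered = np.array([despread(s, c) for c in codes])
print(recovered)
```

The correlation in `despread` plays the role of a filter matched to the signature waveform under the assumptions above; with non-orthogonal codes or a dispersive channel the recovered values would contain residual interference.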
In (1.6) we have assumed that the spreading factor is $Q$ and the signature waveform covers exactly one symbol interval. The new signal has an approximate bandwidth of $B = 1/T_c$. The increase in bandwidth, or signal dimensionality, provides the required resistance to interference and noise. If a DS-SS signal has $BT \simeq Q$ and $Q$ is a power of two, we can accommodate $J = Q$ distinct orthogonal spreading waveforms $s_j(t)$, each of which lies in 1 out of $Q$ dimensions [3]. Because of this, the spread spectrum signal is immune to randomly positioned interfering signals. At the receiver side, a matched filter captures the desired signal from the one-dimensional subspace defined by the generic spreading waveform $s_j(t)$, and at the same time any (undesired) interference (narrowband, spread spectrum, impulsive) with nonzero projection on the subspace. For example, the interference coming from a narrowband jammer, which has equal projection on all the system signatures, is reduced by a factor $Q$.

If the spreading sequence is longer than the spreading factor $Q$ (long codes), the signature waveform (1.6) varies from symbol to symbol, and we denote it by

$$s_j(n, t) = \sum_{l=0}^{Q-1} c_j(n, l)\, p(t - lT_c) \tag{1.7}$$

Expression (1.5) still holds, provided that we substitute $s_j(t)$ with the new time-varying signature $s_j(n, t)$. Long codes may be periodic if they span a few symbol intervals, in which case the resulting time-varying signature is periodic as well. If a very long PN sequence is used, the spreading sequence is aperiodic (on a typical time observation window) and the spreading waveform $s_j(n, t)$ can be regarded as random.
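The symbol-dependent signature of (1.7) can be sketched by slicing one long chip sequence into consecutive blocks of $Q$ chips. The example below is an assumption-laden illustration: random ±1 chips stand in for a true shift-register PN sequence, the chip pulse is rectangular and sampled once per chip, and the indexing convention $c(n, l)$ = chip $(n-1)Q + l$ of the long sequence is a choice made here, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

Q = 8              # spreading factor (hypothetical value)
num_symbols = 5

# Stand-in for a long PN sequence: one random +/-1 chip per transmitted chip
long_code = rng.choice([-1, 1], size=Q * num_symbols)

# c[n, l]: row n holds the Q chips that spread the (n+1)-th symbol
c = long_code.reshape(num_symbols, Q)

a = np.array([1, -1, -1, 1, 1])        # data symbols of a single user
chips = (a[:, None] * c).ravel()       # chip-rate samples of the spread signal

# Despreading must apply the same symbol-dependent signature
a_hat = np.sum(chips.reshape(num_symbols, Q) * c, axis=1) / Q
print(a_hat)
```

Unlike the short-code case, the receiver here needs to know which segment of the long sequence belongs to each symbol, which is why the signature carries the symbol index $n$.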
1.4 Short Codes Versus Long Codes: Orthogonal Versus Random Spreading

In some systems (like IS-95) [6], each data stream is spread with an extremely long PN sequence with a period of several weeks. This CDMA approach is termed R-CDMA, which stands for (pseudo)random CDMA [10]. The spreading sequence in this case is not periodic on a short observation scale and changes from symbol to symbol. The received signal is not cyclostationary and, as we will see, this prevents an efficient use of linear interference mitigation techniques. In fact, linear interference cancelers require that each data stream occupy a unique dimension (or switch between a small number of dimensions) in signal space (as will be shown in Chapter 3).
In other systems (like TD-CDMA), the period of the spreading codes is short relative to the symbol period, and in many cases the code period is the same as the symbol duration so that the spreading waveform becomes periodic. In this case, a specific dimension is assigned to each data stream, and according to [10] we name this approach deterministic or dimension-limited CDMA (D-CDMA). The sequences assigned to the different data streams are designed to be orthogonal or to have low cross-correlation.

The difference lies not only in the period of the PN sequence employed. Multiaccess interference is treated differently in the two approaches: R-CDMA is a more naive design strategy and was conceived earlier; D-CDMA was born after the development of multiuser detection theory.

R-CDMA is an approach to system design that is based on the "minimax" risk theory used in statistical signal processing [11]. It is well known from information theory that, when the variance of an additive interferer is constrained, the worst case of additive interference for any user is white Gaussian noise (WGN) [12]. Hence, a robust design methodology makes multiuser interference look like WGN. This is accomplished by assigning users extremely long pseudorandom spreading sequences. Powerful error control coding techniques are used to mitigate this multiaccess noise. Furthermore, since each user is a white noise source from the perspective of other users, it is important that the transmission power of the users is controlled. The conclusion is that in R-CDMA systems the capacity is limited by the total power of the interference, so that the system is designed for average interference characteristics, rather than for the worst case [10].

In contrast to the minimax approach seen above, D-CDMA assigns a unique dimension in the signal space to each user, in a similar way to FDMA and TDMA. The signature sequences of different users are weakly correlated, and multiuser receivers are employed to counteract the mutual interference among users. In linear cancellation schemes, the dimensional separation of users is exploited in order to reduce the interference, unlike R-CDMA, where the coding gain of the error control codes makes the system interference tolerant. Moreover, in D-CDMA, since linear multiuser receivers require that the waveforms assigned to the users be linearly independent, the number of users must be lower than, or equal to, the number of dimensions. These are determined by the total bandwidth, the per-user bit rate, the error control coding redundancy, and the number of bits per modulation symbol of each user [10].

A disadvantage of D-CDMA is that, in the presence of a frequency-selective multipath channel, interference cancellation is affected because any particular interferer will appear in multiple dimensions, thereby increasing the perceived number of users. By contrast, R-CDMA can handle multipath energy effectively, resulting in a negligible capacity degradation.
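The linear-independence requirement of D-CDMA can be checked with a rank test. The sketch below assumes chip-rate signatures of length $Q = 4$ on an ideal channel; the helper `is_linearly_independent` and the fifth signature are hypothetical, introduced only to show what happens when the number of users exceeds the number of dimensions.

```python
import numpy as np

def is_linearly_independent(signatures):
    """signatures: J x Q array with one chip-rate signature per row."""
    num_users = signatures.shape[0]
    return np.linalg.matrix_rank(signatures) == num_users

# Four orthogonal signatures in a Q = 4 dimensional signal space
hadamard = np.array([[1,  1,  1,  1],
                     [1, -1,  1, -1],
                     [1,  1, -1, -1],
                     [1, -1, -1,  1]])
print(is_linearly_independent(hadamard))      # True: J = Q = 4

# A fifth signature cannot be independent of the other four
five_users = np.vstack([hadamard, [1, 1, 1, -1]])
print(is_linearly_independent(five_users))    # False: J = 5 > Q = 4
```

The same test applied to signatures distorted by a multipath channel would use the received, not the transmitted, waveforms, which is where the dimensionality penalty described above shows up.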
1.5 Synchronous CDMA In this section we will describe in detail the system model of a CDMA system in which the timing of all users is aligned. Such a system is called synchronous CDMA. We will account for a possible variable spreading factor transmission, for a multiple propagation channel, and for oversampling and/or space-time processing. 1.5.1 Spreading Sequences for Synchronous Multirate CDMA Systems The spreading sequences are constructed using orthogonal codes with variable spreading factor (OVSF). The variable spreading factor is realized by considering spreading sequences with different lengths. Let us denote the jth sequence as 4 ð1:8Þ cj ¼ cj ð0Þ; cj ð1Þ; . . . ; cj ðQj 1Þ j ¼ 1; 2; . . . ; J where Qj is its length. Using a chip pulse, we can define an OVSF signature sequence for the jth code using (1.6). Typical values of Qj are f1; 2; 4; 8; 16g. These codes enjoy the property of being orthogonal, even when the correlation is carried out among codes with different length, provided that correlation is performed on the window corresponding to the largest spreading factor. For example, if we consider cj and ck with spreading factor Q j > Qk (and by definition Qj is always an integer multiple of Qk ), then the following property holds: Q j 1 X
cj ðiÞck ðjijQk Þ ¼ 0
ð1:9Þ
i¼0 4
where jijL ¼ i mod L. Hence, the family of codes cj is named OVSF because they warrant orthogonality in the presence of multiple spreading factor. The most well-known family of OVSF is the Walsh-Hadamard family. Walsh codes have length 2k and can be generated recursively using the following algorithm: 1 1 Hk1 Hk1 k ¼ 2; 3; . . . ð1:10Þ Hk ¼ H1 ¼ Hk1 Hk1 1 1 Both the rows and columns of an Hk Hadamard matrix give a set of 2k orthogonal codewords. The orthogonality is due to the fact that each pair of words has the same number of digit agreements and disagreements. Figure 1.3
9
Introduction and CDMA Models
cj = (1, 1, 1, 1) cj = (1, 1) cj = (1, 1, –1, –1) cj = (1) cj = (1, –1, 1, –1) cj = (1, –1) cj = (1, –1, –1, 1)
Q=1
Q=2
Q=4
Q=8
Q = 16
Figure 1.3 Tree for generation of OVSF codes.
is a representation of the generation of these codes using a tree. Each level in the tree defines a spreading factor, denoted by $Q_j$. This tree is important because the deployment of OVSF codes in a system is not completely arbitrary. To guarantee orthogonality, the selection of the codes must follow some rules. Specifically, a code can be assigned only if no other code is in use on the path between this code and the root, and if no code belonging to the subtree having the specified code as root is in use. This implies that the number of codes available for a multirate transmission depends on the spreading factors in use and on the number of codes allocated to each spreading factor.

1.5.2 Synchronous Multirate CDMA Model in the Presence of Multipath Propagation

In this section we present an accurate model of a very general synchronous CDMA system. The low-pass equivalent of the transmission system considered is shown in Figure 1.4. The $J$ synchronous intracell codes are sent over a slowly varying multipath fading transmission channel. The received signal is further corrupted by $n(t)$, an additive white Gaussian noise (AWGN) whose power spectral density is $N_0$. We account for OVSF codes by considering a code-dependent data sequence length; specifically, we denote the data sequence spread by the $j$th code with $\{a_j(n)\}_{n=1}^{N_j}$ ($j = 1, 2, \ldots, J$), where $N_j$ is the data sequence length. The chip interval $T_c$ is constant in order to meet a fixed bandwidth requirement, so that the symbol interval $T_j = Q_j T_c$ may differ from user to user, where $Q_j$ denotes the spreading factor for the $j$th code. The product $Q_j N_j = L_d$, which measures the length of the data transmission in multiples of the chip interval $T_c$, is constant.

Figure 1.4 Baseband model of the CDMA synchronous transmission system. © 2000 IEEE [13].

The spreading sequence (Walsh-Hadamard code) for the $j$th user is given by (1.8) and is repeated at each symbol interval. We denote the transmitter and receiver filter impulse responses by $p(t)$ and $q(t)$, respectively. The front-end processing scheme employed (i.e., receiver filter and sampling) complies with that proposed in [14]. Assuming we sample at a rate $W = \beta/T_c$, where $\beta$ is the oversampling factor, the receiving filter $q(t)$ is a low-pass filter whose squared frequency response has vestigial symmetry around $W/2$. An example of a receiving filter scheme is shown in Figure 1.5, where $Q(f)$ is a root raised cosine with roll-off factor $\alpha$ such that the useful signal band is $B < (1-\alpha)W/2$. This receiving filter has the double advantage of acting as a $\delta(t)$ for the received signal and keeping the additive noise white [14]. If we sample at symbol rate, the receiving filter $q(t)$ can be a filter matched to the impulse response $p(t)$. The time-continuous complex baseband signal after the low-pass filter can be expressed as

$$y(t) = \sum_{j=1}^{J} \sum_{n=1}^{N_j} a_j(n) \sum_{k=0}^{Q_j-1} c_j(k)\, f\big(t - (n-1)T_j - kT_c\big) + v(t) \tag{1.11}$$
where $f(t)$ is the total channel impulse response given by the convolution of the chip-shaping pulse $p(t)$ common to all users with the multipath channel response $h(t) = \sum_l h_l\, \delta(t - \tau_l)$. In the definition of $h(t)$, we have implicitly assumed the channel as being constant over the burst duration, while $v(t)$ is the result of the low-pass filtering of $n(t)$.
Figure 1.5 Scheme of the receiving filter employed. (Frequency axis $f$; responses $Q(f)$ and $Q(f - W)$; signal spectrum of bandwidth $B$; marked frequencies $(1-\alpha)W/2$, $W/2$, $(1+\alpha)W/2$, and $W$.)
We further modify (1.11) according to the definition of a chip sequence $\{b_j(k)\}$ corresponding to the data symbol $a_j(n)$ for the $j$th user as [13]

$$b_j(k) \triangleq a_j(n)\, c_j\big(|k-1|_{Q_j}\big) \qquad k = 1, 2, \ldots \qquad n = \left\lceil \frac{k}{Q_j} \right\rceil \tag{1.12}$$

where we have used the operator $|i|_L \triangleq i \bmod L$, and $\lceil x \rceil$ is the smallest integer greater than or equal to $x$ (ceiling function). Therefore, (1.11) becomes

$$y(t) = \sum_{j=1}^{J} \sum_{k=1}^{N_j Q_j} b_j(k)\, f\big(t - (k-1)T_c\big) + v(t) \tag{1.13}$$
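As an illustration of the chip-sequence definition (1.12), the following minimal Python sketch spreads two data streams with different spreading factors but the same burst length $L_d$. The function name `chip_sequence` and the example codes and symbols are assumptions made here for illustration; they are not taken from the text.

```python
def chip_sequence(symbols, code):
    """Chips b_j(k), k = 1, ..., N_j * Q_j, following (1.12).

    symbols holds a_j(1), ..., a_j(N_j); code holds c_j(0), ..., c_j(Q_j - 1).
    """
    q = len(code)
    chips = []
    for k in range(1, len(symbols) * q + 1):
        n = -(-k // q)  # n = ceil(k / Q_j), computed with integers
        chips.append(symbols[n - 1] * code[(k - 1) % q])
    return chips


# Hypothetical streams with Q_1 = 2, N_1 = 4 and Q_2 = 4, N_2 = 2 (L_d = 8 chips each)
b1 = chip_sequence([1, -1, -1, 1], [1, 1])
b2 = chip_sequence([1, -1], [1, -1, 1, -1])
print(b1)
print(b2)
```

Both streams occupy the same number of chips, which is the constraint $Q_j N_j = L_d$ expressed in code.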
The received signal $y(t)$ is sampled at a rate $W = \beta/T_c$, and after defining the polyphase representation [15] of a sampled function $g(t)$ as $g_m(i) = g\big((i\beta - m)/W\big)$ ($m = 1, 2, \ldots, \beta$ and $i = 1, 2, \ldots$), we have

$$\begin{aligned}
y_m(i) &= \sum_{j=1}^{J} \sum_{k} b_j(k)\, f_m(i-k+1) + v_m(i) \\
&= \sum_{j=1}^{J} \sum_{k'} f_m(k'+1)\, b_j(i-k') + v_m(i) \\
&= \sum_{j=1}^{J} \sum_{k=0}^{P-1} f_m(k+1)\, b_j(i-k) + v_m(i)
\end{aligned} \tag{1.14}$$
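where the last identity holds since the function $f(t)$ is causal and has a finite support that spans $P$ chip intervals. The resulting scalar observation model (1.14) describes both intersymbol interference (ISI) and multiple access interference (MAI).

1.5.3 Time-Discrete Model for the Synchronous System

The samples of the cyclostationary received signal, oversampled at $\beta$ times the chip rate, can be stacked together to obtain the vector $\tilde{\mathbf{y}}(i)$, which gathers all the samples belonging to the same chip interval ($i$ ticks at the chip rate). We define the following vectors:

$$\tilde{\mathbf{y}}(i) \triangleq \begin{bmatrix} y_1(i) \\ \vdots \\ y_\beta(i) \end{bmatrix} \qquad
\tilde{\mathbf{f}}(k) \triangleq \begin{bmatrix} f_1(k+1) \\ \vdots \\ f_\beta(k+1) \end{bmatrix} \qquad
\tilde{\mathbf{v}}(i) \triangleq \begin{bmatrix} v_1(i) \\ \vdots \\ v_\beta(i) \end{bmatrix} \tag{1.15}$$

which leads to

$$\tilde{\mathbf{y}}(i) = \sum_{j=1}^{J} \sum_{k=0}^{P-1} \tilde{\mathbf{f}}(k)\, b_j(i-k) + \tilde{\mathbf{v}}(i) \tag{1.16}$$

If we further define the $(\beta \times P)$-dimensional matrix $\tilde{\mathbf{F}}$ and the $P$-dimensional vector $\tilde{\mathbf{b}}_j(i)$ as follows:

$$\tilde{\mathbf{F}} \triangleq \big[\tilde{\mathbf{f}}(P-1),\ \tilde{\mathbf{f}}(P-2),\ \ldots,\ \tilde{\mathbf{f}}(0)\big] \tag{1.17}$$

$$\tilde{\mathbf{b}}_j(i) \triangleq \big[b_j(i-P+1),\ b_j(i-P+2),\ \ldots,\ b_j(i)\big]^T \tag{1.18}$$

we can express (1.16) as

$$\tilde{\mathbf{y}}(i) = \sum_{j=1}^{J} \tilde{\mathbf{F}}\, \tilde{\mathbf{b}}_j(i) + \tilde{\mathbf{v}}(i) \tag{1.19}$$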
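Now using definition (1.12) we can write that

$$\tilde{\mathbf{b}}_j(i) = \left[ a_j\!\left(\left\lceil \frac{i-P+1}{Q_j} \right\rceil\right) c_j\big(|i-P|_{Q_j}\big),\ \ldots,\ a_j\!\left(\left\lceil \frac{i}{Q_j} \right\rceil\right) c_j\big(|i-1|_{Q_j}\big) \right]^T \tag{1.20}$$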
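We recognize that in (1.20), at the most $\left\lceil \frac{P+Q_j-1}{Q_j} \right\rceil$ different symbols are involved [14]. Specifically, the number $l_j$ of symbols involved is

$$l_j \triangleq \begin{cases}
\left\lceil \dfrac{P+Q_j-1}{Q_j} \right\rceil & \text{for } |P+Q_j-1|_{Q_j} = 0 \ \text{or}\ |i-1|_{Q_j} < |P+Q_j-1|_{Q_j} \\[2ex]
\left\lceil \dfrac{P+Q_j-1}{Q_j} \right\rceil - 1 & \text{otherwise}
\end{cases} \tag{1.21}$$

(the first case applies when either condition holds, in the same way as in the analogous expression (1.73) for the asynchronous system). Parameter $l_j$ describes the relevant ISI for the data stream associated with the $j$th code, as we can easily check in Figure 1.6. We can define the $l_j$-dimensional data vector as

$$\tilde{\mathbf{a}}_j(i) \triangleq \big[a_j(n-l_j+1),\ a_j(n-l_j+2),\ \ldots,\ a_j(n)\big]^T \qquad n = \left\lceil \frac{i}{Q_j} \right\rceil \tag{1.22}$$

It should be stressed that the temporal index $n$ is different from user to user (because of the different $Q_j$), although this dependence is not explicit either in (1.22) or in (1.32). We now define the $(P \times l_j)$-dimensional matrix

$$\tilde{\mathbf{C}}_j(i) \triangleq \begin{bmatrix}
c_j\big(|i-P|_{Q_j}\big) & & & & \\
\vdots & & & & \mathbf{0} \\
c_j(Q_j-1) & & & & \\
 & \mathbf{c}_j^T & & & \\
 & & \ddots & & \\
 & & & \mathbf{c}_j^T & \\
 & & & & c_j(0) \\
\mathbf{0} & & & & \vdots \\
 & & & & c_j\big(|i-1|_{Q_j}\big)
\end{bmatrix} \tag{1.23}$$

which is a block diagonal matrix, and the row vector $\mathbf{c}_j$ is given by (1.8). The diagonal blocks of $\tilde{\mathbf{C}}_j(i)$ are column vectors containing a partial (first and last block) or full (inner blocks) spreading sequence, so that $\tilde{\mathbf{b}}_j(i) = \tilde{\mathbf{C}}_j(i)\, \tilde{\mathbf{a}}_j(i)$. Consequently, (1.19) can be rewritten as follows:

$$\tilde{\mathbf{y}}(i) = \sum_{j=1}^{J} \tilde{\mathbf{F}}\, \tilde{\mathbf{C}}_j(i)\, \tilde{\mathbf{a}}_j(i) + \tilde{\mathbf{v}}(i) \tag{1.24}$$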
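The factorization $\tilde{\mathbf{b}}_j(i) = \tilde{\mathbf{C}}_j(i)\,\tilde{\mathbf{a}}_j(i)$ behind (1.24) can be checked numerically. The sketch below is only an illustration under assumed values: the helper names (`ceil_div`, `chip`, `code_matrix`), the code, the symbols, and $P = 3$ are all hypothetical. It builds the code matrix of (1.23) row by row and compares the product with the chips obtained directly from (1.12).

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def chip(symbols, code, k):
    """b_j(k) of (1.12): symbol n = ceil(k / Q_j) times code element |k - 1|_{Q_j}."""
    q = len(code)
    return symbols[ceil_div(k, q) - 1] * code[(k - 1) % q]


def code_matrix(code, i, P):
    """Rows of the P x l_j code matrix of (1.23) and the index of the oldest symbol."""
    q = len(code)
    n_first = ceil_div(i - P + 1, q)
    n_last = ceil_div(i, q)
    rows = []
    for k in range(i - P + 1, i + 1):          # chips covered by the window
        row = [0] * (n_last - n_first + 1)
        row[ceil_div(k, q) - n_first] = code[(k - 1) % q]
        rows.append(row)
    return rows, n_first


code = [1, -1, -1, 1]              # hypothetical code with Q_j = 4
symbols = [1, -1, -1, 1, 1, -1]    # hypothetical data a_j(1), ..., a_j(6)
P = 3                              # assumed channel span in chips

checked = 0
for i in range(P, len(symbols) * len(code) + 1):
    C, n_first = code_matrix(code, i, P)
    a_tilde = symbols[n_first - 1:n_first - 1 + len(C[0])]
    b_from_matrix = [sum(c * a for c, a in zip(row, a_tilde)) for row in C]
    b_direct = [chip(symbols, code, k) for k in range(i - P + 1, i + 1)]
    assert b_from_matrix == b_direct
    # The number of symbols never exceeds ceil((P + Q_j - 1) / Q_j)
    assert len(a_tilde) <= ceil_div(P + len(code) - 1, len(code))
    checked += 1
print("windows checked:", checked)
```

The number of columns of the matrix changes with the chip index $i$, which is why the model carries the explicit dependence on $i$.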
Figure 1.6 Time slotting when codes with different spreading factors are used. The thick lines separate symbol intervals; the thin lines separate chip intervals. (The figure shows five codes, #1 to #5, with $Q_1 = 1$ (no spreading), $Q_2 = 2$, $Q_3 = 4$, $Q_4 = 4$, and $Q_5 = 8$, against an impulse response span of $PT_c$.)
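Then, by further definition of $l_t \triangleq \sum_j l_j$ and

$$\tilde{\mathbf{C}}(i) \triangleq \big[\tilde{\mathbf{C}}_1(i),\ \tilde{\mathbf{C}}_2(i),\ \ldots,\ \tilde{\mathbf{C}}_J(i)\big] \qquad (P \times l_t) \tag{1.25}$$

$$\tilde{\mathbf{a}}(i) \triangleq \big[\tilde{\mathbf{a}}_1^T(i),\ \tilde{\mathbf{a}}_2^T(i),\ \ldots,\ \tilde{\mathbf{a}}_J^T(i)\big]^T \qquad (l_t \times 1) \tag{1.26}$$

we can give different expressions to the observation $\tilde{\mathbf{y}}(i)$:

$$\tilde{\mathbf{y}}(i) = \tilde{\mathbf{F}}\, \tilde{\mathbf{b}}(i) + \tilde{\mathbf{v}}(i) = \tilde{\mathbf{F}}\, \tilde{\mathbf{C}}(i)\, \tilde{\mathbf{a}}(i) + \tilde{\mathbf{v}}(i) = \tilde{\mathbf{G}}(i)\, \tilde{\mathbf{a}}(i) + \tilde{\mathbf{v}}(i) \tag{1.27}$$

with the implicit definition of the $P$-dimensional vector $\tilde{\mathbf{b}}(i) \triangleq \tilde{\mathbf{C}}(i)\, \tilde{\mathbf{a}}(i)$ and the $(\beta \times l_t)$-dimensional matrix $\tilde{\mathbf{G}}(i) \triangleq \tilde{\mathbf{F}}\, \tilde{\mathbf{C}}(i)$. In order to solve the data detection problem, we may need a longer observation vector; so we can stack $N$ vectors $\tilde{\mathbf{y}}(i)$ by defining

$$\mathbf{y}^{(N)}(i) = \big[\tilde{\mathbf{y}}^T(i-N+1),\ \tilde{\mathbf{y}}^T(i-N+2),\ \ldots,\ \tilde{\mathbf{y}}^T(i)\big]^T \qquad (\beta N \times 1) \tag{1.28}$$

In order to express (1.28) in closed form, we first define the $\beta N \times (P+N-1)$-dimensional matrix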
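$$\mathbf{F}^{(N)} = \begin{bmatrix}
\tilde{\mathbf{f}}(P-1) & \cdots & \tilde{\mathbf{f}}(0) & & \mathbf{0} \\
 & \tilde{\mathbf{f}}(P-1) & \cdots & \tilde{\mathbf{f}}(0) & \\
 & & \ddots & & \ddots \\
\mathbf{0} & & \tilde{\mathbf{f}}(P-1) & \cdots & \tilde{\mathbf{f}}(0)
\end{bmatrix} \tag{1.29}$$

Then we let

$$n_j \triangleq \begin{cases}
\left\lceil \dfrac{P+N+Q_j-2}{Q_j} \right\rceil & \text{for } |P+Q_j+N-2|_{Q_j} = 0 \ \text{or}\ |i-1|_{Q_j} < |P+Q_j+N-2|_{Q_j} \\[2ex]
\left\lceil \dfrac{P+N+Q_j-2}{Q_j} \right\rceil - 1 & \text{otherwise}
\end{cases} \tag{1.30}$$

where the first case applies when either condition holds.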
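The closed form (1.30) can be compared with a direct count of the symbols spanned by the $P+N-1$ chips of the observation window. The sketch below is a hedged illustration: the function names are hypothetical, and the first case of (1.30) is read as applying when either of its two conditions holds, as in the analogous asynchronous expression (1.73).

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def n_direct(i, P, N, Q):
    """Count the symbols touched by chips i-N-P+2, ..., i (chip k belongs to symbol ceil(k/Q))."""
    first_chip = i - N - P + 2
    return ceil_div(i, Q) - ceil_div(first_chip, Q) + 1


def n_closed(i, P, N, Q):
    """Closed form (1.30), first case taken when either condition holds."""
    upper = ceil_div(P + N + Q - 2, Q)
    xi = (P + Q + N - 2) % Q
    if xi == 0 or (i - 1) % Q < xi:
        return upper
    return upper - 1


mismatches = [
    (i, P, N, Q)
    for Q in range(1, 9)
    for P in range(1, 7)
    for N in range(1, 7)
    for i in range(1, 41)
    if n_direct(i, P, N, Q) != n_closed(i, P, N, Q)
]
print("mismatches:", len(mismatches))
```

Setting `N = 1` in the same functions gives the per-chip count $l_j$ of (1.21).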
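We now define the block-diagonal matrix

$$\mathbf{C}_j^{(N)}(i) \triangleq \begin{bmatrix}
c_j\big(|i-N-P+1|_{Q_j}\big) & & & & \\
\vdots & & & & \mathbf{0} \\
c_j(Q_j-1) & & & & \\
 & \ddots & & & \\
 & & \mathbf{c}_j^T & & \\
 & & & \ddots & \\
 & & & & c_j(0) \\
\mathbf{0} & & & & \vdots \\
 & & & & c_j\big(|i-1|_{Q_j}\big)
\end{bmatrix} \tag{1.31}$$

where the chips run from $c_j\big(|i-N-P+1|_{Q_j}\big)$ down to $c_j\big(|i-1|_{Q_j}\big)$, and the last $P$ rows start at $c_j\big(|i-P|_{Q_j}\big)$,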
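whose dimension is $(P+N-1) \times n_j$ and whose structure is qualitatively the same as (1.23). Similarly to (1.22), we define a data vector of length $n_j$ as

$$\mathbf{a}_j^{(N)}(i) \triangleq \big[a_j(n-n_j+1),\ \ldots,\ a_j(n-l_j+1),\ \ldots,\ a_j(n)\big]^T \qquad n = \left\lceil \frac{i}{Q_j} \right\rceil \tag{1.32}$$

Note that the lower right $P \times l_j$ minor of $\mathbf{C}_j^{(N)}$ is exactly equal to $\tilde{\mathbf{C}}_j$ (for $j = 1, 2, \ldots, J$), and that for $N = 1$ we have $l_j = n_j$, as expected. After defining a new vector of noise samples, stacked in the same order as (1.28),

$$\mathbf{v}^{(N)}(i) = \big[\tilde{\mathbf{v}}^T(i-N+1),\ \tilde{\mathbf{v}}^T(i-N+2),\ \ldots,\ \tilde{\mathbf{v}}^T(i)\big]^T \qquad (\beta N \times 1) \tag{1.33}$$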
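we can write

$$\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{F}^{(N)}\, \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.34}$$

The final definition of

$$\mathbf{C}^{(N)}(i) \triangleq \big[\mathbf{C}_1^{(N)}(i),\ \mathbf{C}_2^{(N)}(i),\ \ldots,\ \mathbf{C}_J^{(N)}(i)\big] \qquad \big((P+N-1) \times n_t\big) \tag{1.35}$$

$$\mathbf{a}^{(N)}(i) \triangleq \big[\mathbf{a}_1^{(N)T}(i),\ \mathbf{a}_2^{(N)T}(i),\ \ldots,\ \mathbf{a}_J^{(N)T}(i)\big]^T \qquad (n_t \times 1) \tag{1.36}$$

where $n_t \triangleq \sum_j n_j$, leads to

$$\mathbf{y}^{(N)}(i) = \mathbf{F}^{(N)}\, \mathbf{C}^{(N)}(i)\, \mathbf{a}^{(N)}(i) + \mathbf{v}^{(N)}(i) = \mathbf{F}^{(N)}\, \mathbf{b}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.37}$$

with the implicit definition of the $(P+N-1)$-dimensional vector $\mathbf{b}^{(N)}(i) = \mathbf{C}^{(N)}(i)\, \mathbf{a}^{(N)}(i)$. Another useful expression for the observation vector can be given if we define the $(\beta N \times n_t)$-dimensional matrix $\mathbf{G}^{(N)}(i) = \mathbf{F}^{(N)}\, \mathbf{C}^{(N)}(i)$, as suggested in [14]:

$$\mathbf{y}^{(N)}(i) = \mathbf{G}^{(N)}(i)\, \mathbf{a}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.38}$$

The noise term $\mathbf{v}^{(N)}(i)$ is strictly WGN in a single-cell environment, while it can be modeled as an unstructured colored Gaussian noise in a multicell environment, whose unknown covariance matrix is denoted by $\mathbf{R}_{\mathbf{v}^{(N)}} = E\big[\mathbf{v}^{(N)}\, \mathbf{v}^{(N)H}\big]$.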
1.6 Asynchronous CDMA

In this section we describe the features of an asynchronous DS-CDMA system. We also derive a general time-discrete model for this system in the presence of a single-ray propagation channel, which is typical, for example, of satellite communications.

1.6.1 Spreading Sequences for Asynchronous CDMA Systems

Completely orthogonal spreading sequences cannot be designed when the system is asynchronous. In this case, the cross-correlation between any two spreading sequences must be kept as low as possible for any relative delay among signals, in order to keep the MAI as low as possible. One of the most popular families of codes for asynchronous systems is that of the Gold codes. These codes are obtained starting from a maximal length sequence (named m-sequence), generated using a shift register with $r$ stages and having maximum period $N = 2^r - 1$. Figure 1.7 shows an example of binary logic with $r$ stages which yields an m-sequence as an output given a proper initial configuration of the shift register. The tap coefficients are denoted by $g_i \in \{0, 1\}$ and addition is accomplished modulo-2. The sequence $\mathbf{a}$ generated with this procedure consists of 0s and 1s, but the codes which are actually used for CDMA are mapped into antipodal sequences by the rule $c(n) = (-1)^{a(n)}$. From the m-sequence with length $N$, we derive a second sequence through decimation (i.e., by sampling the original m-sequence, extended by replicating it periodically, every $q$ sequence elements). It is possible to prove that after sampling we obtain another m-sequence only if $N$ and $q$ are relatively prime [9].

Figure 1.7 Binary logic with $r$ stages for generation of an m-sequence. (Shift register of $r$ delay elements $T_c$ with feedback taps $g_1, g_2, \ldots, g_{r-1}$ and output $a$.)

We consider two m-sequences $\mathbf{a} = (a(0), a(1), \ldots, a(N-1))$ and $\mathbf{a}_{dec} = (a_{dec}(0), a_{dec}(1), \ldots, a_{dec}(N-1))$, for which we define the circular cross-correlation spectrum of the corresponding antipodal code sequences, defined by $c(n) = (-1)^{a(n)}$, as follows:

$$R_{a a_{dec}}(k) = \frac{1}{N} \sum_{n=0}^{N-1} c(n)\, c_{dec}\big(|n+k|_N\big) \qquad k = 0, 1, \ldots, N-1 \tag{1.39}$$
The m-sequences can have a 3- or 4-valued cross-correlation spectrum. Specifically, Gold codes are derived from a pair of m-sequences with a 3-valued cross-correlation spectrum having the values

$$-\frac{1}{N}\, t(r), \qquad -\frac{1}{N}, \qquad \frac{1}{N}\big[t(r) - 2\big] \tag{1.40}$$

where

$$t(r) = \begin{cases} 1 + 2^{0.5(r+1)} & \text{if } r \text{ is odd} \\ 1 + 2^{0.5(r+2)} & \text{if } r \text{ is even} \end{cases} \tag{1.41}$$
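The three-valued spectrum of (1.40) and (1.41) can be reproduced with a few lines of Python. This is a sketch under assumptions, not the generator of any particular standard: the function names, the register length $r = 5$, the feedback recurrence $a(n) = a(n-2) \oplus a(n-5)$ (a primitive degree-5 recurrence), and the decimation factor $q = 3$ are choices made here for illustration. The correlation is left unnormalized, so the values are $N$ times those of (1.39).

```python
def m_sequence(r, taps):
    """One period (2**r - 1 elements) of a(n) = XOR of a(n - t) for t in taps."""
    period = 2 ** r - 1
    a = [1] * r                      # any nonzero initial register content
    while len(a) < period:
        bit = 0
        for t in taps:
            bit ^= a[-t]
        a.append(bit)
    return a


def decimate(a, q):
    """Sample the periodically extended sequence every q elements."""
    n = len(a)
    return [a[(q * k) % n] for k in range(n)]


def cross_correlation(a, b):
    """Unnormalized circular cross-correlation of the antipodal sequences (-1)**a, (-1)**b."""
    n = len(a)
    c = [(-1) ** x for x in a]
    d = [(-1) ** x for x in b]
    return [sum(c[m] * d[(m + k) % n] for m in range(n)) for k in range(n)]


def t(r):
    """t(r) of (1.41)."""
    return 1 + 2 ** ((r + 1) // 2) if r % 2 else 1 + 2 ** ((r + 2) // 2)


r = 5
a = m_sequence(r, taps=(2, 5))
a_dec = decimate(a, 3)
values = sorted(set(cross_correlation(a, a_dec)))
print(values, t(r))
```

The distinct values can then be compared with $-t(r)$, $-1$, and $t(r) - 2$.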
The set of sequences defined by $(\mathbf{a},\ \mathbf{a}_{dec},\ \mathbf{a} + \mathbf{a}_{dec},\ \mathbf{a} + D\mathbf{a}_{dec},\ \ldots,\ \mathbf{a} + D^{N-1}\mathbf{a}_{dec})$, using the rule $c(n) = (-1)^{a(n)}$, is the set of Gold codes with period $N$. Note that in the above definition, addition is modulo-2 and the notation $D^j \mathbf{a}_{dec}$ denotes a circular shift by $j$ units of the sequence $\mathbf{a}_{dec}$ [9]. As an example, we evaluated the cross-correlation spectrum between an m-sequence $\mathbf{a}$ with length $N = 2^r - 1$ ($r = 6$) and $\mathbf{a}_{dec}$, obtained by decimation with a factor $q = 5$. The cross-correlation is a 3-valued function whose values are $\frac{15}{63}$, $-\frac{17}{63}$, $-\frac{1}{63}$, hence $t(r) = 17$, in agreement with (1.40) and (1.41).

1.6.2 Asynchronous CDMA Model in the Presence of Single-Ray Propagation

We now present a system model of an asynchronous CDMA system where the users transmit without a common time reference. We take into account several nonideal operating conditions, such as a chip pulse that has a time extension larger than the chip interval and that may not satisfy the Nyquist criterion. Moreover, we account for a phase offset and a frequency error in the carrier recovery, and for a possible near-far effect due to different received amplitudes. All users employ the same spreading factor $Q$, which is usually large compared to that employed by synchronous systems, so that possible interchip interference (ICI) causes ISI at the boundary of a symbol interval only. The baseband equivalent system is shown in Figure 1.8.

Figure 1.8 Baseband equivalent of the CDMA asynchronous system.

The $j$th code is periodic with period equal to the spreading factor $Q$, and it is denoted as follows:

$$\mathbf{c}_j = \big[c_j(0),\ c_j(1),\ \ldots,\ c_j(Q-1)\big] \qquad j = 1, 2, \ldots, J \tag{1.42}$$
where $J$ is the number of users. We denote the sequence transmitted by the $j$th user as $\{a_j(n)\}_{n=1}^{N_j}$, where $N_j$ is the length of its sequence. The transmission channel for the $j$th user can be described by its low-pass equivalent:

$$h_j(t) = A_j\, \delta(t - t_j) \tag{1.43}$$

where $t_j$ is the $j$th propagation delay. It is an ideal AWGN channel whose only effects are attenuating and delaying the received signal and superimposing an additive white Gaussian noise with zero mean and given power spectral density. We denote by $p(t)$ the impulse response of the transmitting filter, which is the chip pulse (usually a root raised cosine in the frequency domain); as we plan to use symbol-spaced sampling in this system, $q(t)$, the receiving filter, is a matched filter. The time-continuous complex baseband signal after the receiving filter can be written as

$$y(t) = \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j t + \phi_j)} \sum_{n=0}^{N_j-1} a_j(n) \sum_{k=0}^{Q-1} c_j(k)\, p(t - kT_c - nQT_c - \tau_j) + v(t) \tag{1.44}$$
where $\tau_j$ includes both the delay due to asynchronous transmitting epochs and the possibly different propagation delay $t_j$ of the $j$th user; $f_j$ is the frequency error and $\phi_j$ is the phase offset, both regarded as constant over the observation interval. Let us define $\{b_j(k)\}$ as the chip sequence corresponding to the data sequence $\{a_j(n)\}$ of the $j$th user as

$$b_j(k) \triangleq a_j(n)\, c_j\big(|k|_Q\big) \qquad k = 0, 1, 2, \ldots \qquad n = \left\lfloor \frac{k}{Q} \right\rfloor \tag{1.45}$$

where we have used the operator $|i|_L \triangleq i \bmod L$ and $\lfloor x \rfloor$, which selects the largest integer that does not exceed $x$ (floor function). After this definition, (1.44) becomes

$$y(t) = \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j t + \phi_j)} \sum_{k=0}^{N_j Q - 1} b_j(k)\, p(t - kT_c - \tau_j) + v(t) \tag{1.46}$$
The received signal $y(t)$ is sampled at chip rate, yielding the following time-discrete signal:

$$y(iT_c) = \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \sum_{k=0}^{N_j Q - 1} b_j(k)\, p(iT_c - kT_c - \tau_j) + v(iT_c) \tag{1.47}$$

The delays $\tau_j$ are modeled as follows:

$$\tau_j = l_j T_c \tag{1.48}$$

where the possibly noninteger coefficient is $l_j = d_j + \delta_j$, and $0 \le d_j \le Q-1$ is an integer number (the integer part of the asynchronism), while $0 \le \delta_j < 1$ denotes the fractional part. Equation (1.47) can be written as follows:

$$y(iT_c) = \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \sum_{k} b_j(k)\, p\big((i-k-d_j)T_c - \delta_j T_c\big) + v(iT_c) \tag{1.49}$$

We assume that $p(t)$ is extended over a number of chip intervals equal to $(P-1)$, and we operate a change of variable $k' = i - k$, turning (1.49) into the following:

$$\begin{aligned}
y(iT_c) &= \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \sum_{k'} b_j(i-k')\, p\big((k'-d_j)T_c - \delta_j T_c\big) + v(iT_c) \\
&= \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \sum_{k'=d_j}^{P+d_j-1} b_j(i-k')\, p\big((k'-d_j)T_c - \delta_j T_c\big) + v(iT_c) \\
&= \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \sum_{k=0}^{P-1} b_j(i-k-d_j)\, p\big((k-\delta_j)T_c\big) + v(iT_c)
\end{aligned} \tag{1.50}$$
where the last identity holds since we let $k = k' - d_j$, and $p(t)$ is causal with finite support over $0 \le t \le (P-1)T_c$. Equation (1.50) can be rewritten as follows:

$$y(iT_c) = \sum_{j=1}^{J} A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \sum_{k=0}^{P-1} b_j(i-k-d_j)\, p_j(k) + v(iT_c) \tag{1.51}$$

where $p_j(k) = p\big((k-\delta_j)T_c\big)$.
1.6.3 Time-Discrete Model for the Asynchronous System

To express (1.51) using a compact notation, we define the $P$-dimensional vectors $\tilde{\mathbf{f}}_j(i)$ and $\tilde{\mathbf{b}}_j(i)$ in the following way:

$$\tilde{\mathbf{f}}_j(i) \triangleq A_j\, e^{j(2\pi f_j iT_c + \phi_j)} \big[p_j(P-1),\ p_j(P-2),\ \ldots,\ p_j(0)\big]^T \tag{1.52}$$

$$\tilde{\mathbf{b}}_j(i) \triangleq \big[b_j(i-d_j-P+1),\ b_j(i-d_j-P+2),\ \ldots,\ b_j(i-d_j)\big]^T \tag{1.53}$$

Using (1.45), we also have

$$\tilde{\mathbf{b}}_j(i) = \left[ a_j\!\left(\left\lfloor \frac{i-d_j-P+1}{Q} \right\rfloor\right) c_j\big(|i-d_j-P+1|_Q\big),\ \ldots,\ a_j\!\left(\left\lfloor \frac{i-d_j}{Q} \right\rfloor\right) c_j\big(|i-d_j|_Q\big) \right]^T \tag{1.54}$$

Hence, (1.50) can be expressed as

$$y(i) \triangleq y(iT_c) = \sum_{j=1}^{J} \tilde{\mathbf{f}}_j^T(i)\, \tilde{\mathbf{b}}_j(i) + v(i) \tag{1.55}$$

We emphasize that in (1.54) at the most $l = \left\lceil \frac{P+Q-1}{Q} \right\rceil$ different symbols of each user are involved. Parameter $l$ describes the relevant ISI for each data stream: since in our model $P \ll Q$, we have $l = 2$ in the worst case, which occurs for samples close to the boundaries of the symbol interval. The worst case for ISI is then accounted for by defining a bidimensional vector:
$$\tilde{\mathbf{a}}_j(i) \triangleq \big[a_j(n-1),\ a_j(n)\big]^T \qquad n = \left\lfloor \frac{i-d_j}{Q} \right\rfloor \tag{1.56}$$

It is important to stress that the temporal index $n$ may be different from user to user since the delays $d_j$ are different. We then introduce matrix $\tilde{\mathbf{C}}_j(i)$ with dimension $P \times 2$:

$$\tilde{\mathbf{C}}_j(i) \triangleq \begin{bmatrix}
c_j\big(|i-d_j-P+1|_Q\big) & 0 \\
\vdots & \vdots \\
c_j(Q-1) & 0 \\
0 & c_j(0) \\
\vdots & \vdots \\
0 & c_j\big(|i-d_j|_Q\big)
\end{bmatrix} \tag{1.57}$$
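Hence, the $i$th scalar sample can be expressed as

$$y(i) = \sum_{j=1}^{J} \tilde{\mathbf{f}}_j^T(i)\, \tilde{\mathbf{C}}_j(i)\, \tilde{\mathbf{a}}_j(i) + v(i) \tag{1.58}$$

In order to simplify (1.58), we define the following:

$$\tilde{\mathbf{C}}(i) \triangleq \mathrm{diag}\big[\tilde{\mathbf{C}}_1(i),\ \tilde{\mathbf{C}}_2(i),\ \ldots,\ \tilde{\mathbf{C}}_J(i)\big] \qquad (JP \times 2J) \tag{1.59}$$

$$\tilde{\mathbf{a}}(i) \triangleq \big[\tilde{\mathbf{a}}_1^T(i),\ \tilde{\mathbf{a}}_2^T(i),\ \ldots,\ \tilde{\mathbf{a}}_J^T(i)\big]^T \qquad (2J \times 1) \tag{1.60}$$

$$\tilde{\mathbf{f}}(i) \triangleq \big[\tilde{\mathbf{f}}_1^T(i),\ \tilde{\mathbf{f}}_2^T(i),\ \ldots,\ \tilde{\mathbf{f}}_J^T(i)\big]^T \qquad (JP \times 1) \tag{1.61}$$

We can finally write

$$y(i) = \tilde{\mathbf{f}}^T(i)\, \tilde{\mathbf{C}}(i)\, \tilde{\mathbf{a}}(i) + v(i) \tag{1.62}$$

It is useful to define the bidimensional row vector as

$$\tilde{\mathbf{g}}_j^T(i) \triangleq \tilde{\mathbf{f}}_j^T(i)\, \tilde{\mathbf{C}}_j(i) \tag{1.63}$$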
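and the $(2J \times 1)$-dimensional vector as

$$\tilde{\mathbf{g}}(i) \triangleq \big[\tilde{\mathbf{g}}_1^T(i),\ \tilde{\mathbf{g}}_2^T(i),\ \ldots,\ \tilde{\mathbf{g}}_J^T(i)\big]^T \tag{1.64}$$

Finally, it is easy to see that

$$\tilde{\mathbf{g}}^T(i) = \tilde{\mathbf{f}}^T(i)\, \tilde{\mathbf{C}}(i) \tag{1.65}$$

so that we can rewrite $y(i)$ as follows:

$$y(i) = \tilde{\mathbf{g}}^T(i)\, \tilde{\mathbf{a}}(i) + v(i) \tag{1.66}$$

Using the definition of the $JP$-dimensional vector $\tilde{\mathbf{b}}(i) \triangleq \tilde{\mathbf{C}}(i)\, \tilde{\mathbf{a}}(i)$, we can write

$$y(i) = \tilde{\mathbf{f}}^T(i)\, \tilde{\mathbf{b}}(i) + v(i) \tag{1.67}$$

where it is easy to show that

$$\tilde{\mathbf{b}}(i) = \big[\tilde{\mathbf{b}}_1^T(i),\ \tilde{\mathbf{b}}_2^T(i),\ \ldots,\ \tilde{\mathbf{b}}_J^T(i)\big]^T \tag{1.68}$$

with $\tilde{\mathbf{b}}_j(i) = \tilde{\mathbf{C}}_j(i)\, \tilde{\mathbf{a}}_j(i)$. In order to solve the problem of data detection, we may need a longer observation vector consisting of a generic number $N$ of samples:

$$\mathbf{y}^{(N)}(i) = \big[y(i-N+1),\ y(i-N+2),\ \ldots,\ y(i)\big]^T \qquad (N \times 1) \tag{1.69}$$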
In order to express (1.69) in closed form, we define the following matrix with dimensions $N \times (P+N-1)$:

$$\mathbf{P}_j^{(N)} = \begin{bmatrix}
p_j(P-1) & \cdots & p_j(0) & & 0 \\
 & p_j(P-1) & \cdots & p_j(0) & \\
 & & \ddots & & \ddots \\
0 & & p_j(P-1) & \cdots & p_j(0)
\end{bmatrix} \tag{1.70}$$

and the diagonal matrix with dimensions $N \times N$:

$$\mathbf{A}_j^{(N)}(i) = A_j\, e^{j\phi_j} \begin{bmatrix}
e^{j2\pi f_j (i-N+1)T_c} & & & 0 \\
 & e^{j2\pi f_j (i-N+2)T_c} & & \\
 & & \ddots & \\
0 & & & e^{j2\pi f_j iT_c}
\end{bmatrix} \tag{1.71}$$

and

$$\mathbf{F}_j^{(N)}(i) \triangleq \mathbf{A}_j^{(N)}(i)\, \mathbf{P}_j^{(N)} \qquad N \times (P+N-1) \tag{1.72}$$
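The number of symbols involved in the observation interval referred to the $j$th user is given by

$$n_j(i) = \begin{cases}
\left\lceil \dfrac{P+N+Q-2}{Q} \right\rceil & \text{for } |i-d_j|_Q \le x-1 \ \text{or}\ x = 0 \\[2ex]
\left\lceil \dfrac{P+N+Q-2}{Q} \right\rceil - 1 & \text{for } x \le |i-d_j|_Q \le Q-1
\end{cases} \tag{1.73}$$

where $x \triangleq |P+N-2|_Q$. Let us finally define the block-diagonal matrix

$$\mathbf{C}_j^{(N)}(i) = \begin{bmatrix}
c_j\big(|i-d_j-N-P+2|_Q\big) & & & & \\
\vdots & & & & 0 \\
c_j(Q-1) & & & & \\
 & \ddots & & & \\
 & & \mathbf{c}_j^T & & \\
 & & & \ddots & \\
 & & & & c_j(0) \\
0 & & & & \vdots \\
 & & & & c_j\big(|i-d_j|_Q\big)
\end{bmatrix} \tag{1.74}$$

where the chips run from $c_j\big(|i-d_j-N-P+2|_Q\big)$ down to $c_j\big(|i-d_j|_Q\big)$, and the last $P$ rows start at $c_j\big(|i-d_j-P+1|_Q\big)$,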
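whose dimensions are $(P+N-1) \times n_j(i)$, and a data vector of length $n_j(i)$

$$\mathbf{a}_j^{(N)}(i) \triangleq \big[a_j(n-n_j(i)+1),\ \ldots,\ a_j(n-1),\ a_j(n)\big]^T \qquad n = \left\lfloor \frac{i-d_j}{Q} \right\rfloor \tag{1.75}$$

Let us note that the lower right minor of matrix $\mathbf{C}_j^{(N)}(i)$ is exactly coincident with $\tilde{\mathbf{C}}_j(i)$ ($j = 1, 2, \ldots, J$). After defining the vector of noise samples

$$\mathbf{v}^{(N)}(i) = \big[v(i-N+1),\ v(i-N+2),\ \ldots,\ v(i)\big]^T \qquad (N \times 1) \tag{1.76}$$

we can write

$$\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{A}_j^{(N)}(i)\, \mathbf{P}_j^{(N)}\, \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.77}$$

or

$$\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{F}_j^{(N)}(i)\, \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.78}$$

$$\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{F}_j^{(N)}(i)\, \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.79}$$

where $\mathbf{b}_j^{(N)}(i) = \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i)$ is a vector with length $(P+N-1)$. Finally, if we let $n_t(i) = \sum_{j=1}^{J} n_j(i)$ (total number of symbols involved in the observation window considered, including all users), we can define the following:

$$\mathbf{C}^{(N)}(i) \triangleq \mathrm{diag}\big[\mathbf{C}_1^{(N)}(i),\ \mathbf{C}_2^{(N)}(i),\ \ldots,\ \mathbf{C}_J^{(N)}(i)\big] \qquad J(P+N-1) \times n_t(i) \tag{1.80}$$

$$\mathbf{a}^{(N)}(i) \triangleq \big[\mathbf{a}_1^{(N)T}(i),\ \mathbf{a}_2^{(N)T}(i),\ \ldots,\ \mathbf{a}_J^{(N)T}(i)\big]^T \qquad n_t(i) \times 1 \tag{1.81}$$

$$\mathbf{F}^{(N)}(i) \triangleq \big[\mathbf{F}_1^{(N)}(i),\ \mathbf{F}_2^{(N)}(i),\ \ldots,\ \mathbf{F}_J^{(N)}(i)\big] \qquad N \times J(P+N-1) \tag{1.82}$$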
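This allows us to write

$$\mathbf{y}^{(N)}(i) = \mathbf{F}^{(N)}(i)\, \mathbf{C}^{(N)}(i)\, \mathbf{a}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.83}$$

Introducing vector $\mathbf{b}^{(N)}(i) = \mathbf{C}^{(N)}(i)\, \mathbf{a}^{(N)}(i)$ with length $J(P+N-1)$ and matrix $\mathbf{G}^{(N)}(i) = \mathbf{F}^{(N)}(i)\, \mathbf{C}^{(N)}(i)$ with dimensions $N \times n_t(i)$, we have

$$\mathbf{y}^{(N)}(i) = \mathbf{F}^{(N)}(i)\, \mathbf{b}^{(N)}(i) + \mathbf{v}^{(N)}(i) = \mathbf{G}^{(N)}(i)\, \mathbf{a}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.84}$$

Furthermore, it is easy to show that

$$\mathbf{F}^{(N)}(i) = \mathbf{A}^{(N)}(i)\, \mathbf{P}^{(N)} \tag{1.85}$$

where

$$\mathbf{A}^{(N)}(i) \triangleq \big[\mathbf{A}_1^{(N)}(i),\ \mathbf{A}_2^{(N)}(i),\ \ldots,\ \mathbf{A}_J^{(N)}(i)\big] \qquad N \times JN \tag{1.86}$$

$$\mathbf{P}^{(N)} \triangleq \mathrm{diag}\big[\mathbf{P}_1^{(N)},\ \mathbf{P}_2^{(N)},\ \ldots,\ \mathbf{P}_J^{(N)}\big] \qquad JN \times J(P+N-1) \tag{1.87}$$

From those definitions we obtain

$$\mathbf{y}^{(N)}(i) = \mathbf{A}^{(N)}(i)\, \mathbf{P}^{(N)}\, \mathbf{C}^{(N)}(i)\, \mathbf{a}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.88}$$

which can be rewritten as

$$\mathbf{y}^{(N)}(i) = \mathbf{A}^{(N)}(i)\, \mathbf{P}^{(N)}\, \mathbf{b}^{(N)}(i) + \mathbf{v}^{(N)}(i) = \mathbf{A}^{(N)}(i)\, \mathbf{u}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{1.89}$$

where $\mathbf{u}^{(N)}(i) = \mathbf{P}^{(N)}\, \mathbf{b}^{(N)}(i)$ is a $JN$-dimensional vector. The parameters that appear in (1.89) are listed below:

$\mathbf{A}^{(N)}(i)$ depends on the amplitudes $A_j$, on the frequency errors $f_j$, and on the phase offsets $\phi_j$ of the various users ($j = 1, 2, \ldots, J$).
$\mathbf{P}^{(N)}$ depends on the fractional delays $\delta_j$ of the users ($j = 1, 2, \ldots, J$).
$\mathbf{C}^{(N)}(i)$ and $\mathbf{a}^{(N)}(i)$ depend on the integer delays $d_j$ of the users ($j = 1, 2, \ldots, J$).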
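To make the roles of the per-user factors concrete, the following sketch evaluates the noise-free contribution of a single user in two ways: through the stacked product of the rotation matrix (1.71), the pulse matrix (1.70), and the chip window, and sample by sample through (1.51). All numerical values (code, symbols, pulse samples $p_j(k)$, integer delay, amplitude, frequency error, phase) and the helper names are hypothetical, and $T_c$ is normalized to 1.

```python
import cmath


def pulse_matrix(p, n):
    """Pulse matrix of (1.70): n x (P + n - 1), each row holding p_j(P-1), ..., p_j(0)."""
    big_p = len(p)
    rows = []
    for r in range(n):
        row = [0.0] * (big_p + n - 1)
        for k in range(big_p):
            row[r + big_p - 1 - k] = p[k]
        rows.append(row)
    return rows


def rotation_diagonal(amp, freq, phase, i, n, tc=1.0):
    """Diagonal entries of the amplitude and phase rotation matrix of (1.71)."""
    return [amp * cmath.exp(1j * (2 * cmath.pi * freq * (i - n + 1 + r) * tc + phase))
            for r in range(n)]


def chip(symbols, code, k):
    """b_j(k) of (1.45): a_j(floor(k / Q)) c_j(|k|_Q) for k = 0, 1, ..."""
    q = len(code)
    return symbols[k // q] * code[k % q]


# Hypothetical single-user parameters (noise omitted)
code = [1, -1, 1, 1]
symbols = [1, -1, 1, 1]
p = [0.2, 0.9, 0.3]                 # assumed samples p_j(0), p_j(1), p_j(2), so P = 3
d, amp, freq, phase = 2, 0.8, 0.01, 0.5
N, i = 4, 12
P = len(p)

# Chip window b_j(i - d - N - P + 2), ..., b_j(i - d) of length P + N - 1
first = i - d - N - P + 2
b_window = [chip(symbols, code, k) for k in range(first, i - d + 1)]

Pm = pulse_matrix(p, N)
A = rotation_diagonal(amp, freq, phase, i, N)
y_matrix = [A[r] * sum(Pm[r][c] * b_window[c] for c in range(len(b_window)))
            for r in range(N)]


def y_scalar(t):
    """Single-user, noise-free sample following (1.51)."""
    rot = amp * cmath.exp(1j * (2 * cmath.pi * freq * t + phase))
    return rot * sum(p[k] * chip(symbols, code, t - k - d) for k in range(P))


y_direct = [y_scalar(t) for t in range(i - N + 1, i + 1)]
max_err = max(abs(u - v) for u, v in zip(y_matrix, y_direct))
print(max_err)
```

The separation is visible in the code: the frequency error and phase enter only `rotation_diagonal`, the pulse samples only `pulse_matrix`, and the integer delay only the selection of the chip window.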
1.7 Dichotomies in CDMA

This chapter has introduced the fundamental concepts of code division multiple access, with a special emphasis on its implementation using DS-SS. Both short and long codes can be used for spreading, and the intrinsic differences between these two approaches have been highlighted. The former can easily be coupled to a multiuser detection approach due to an isomorphic geometric separation of the users; the latter is not suitable for multiuser detection and is more similar to old-fashioned systems where the signal of each user is impaired by white noise. Synchronous and asynchronous systems have also been described, and a derivation of an accurate and flexible time-discrete model has been provided. Theoretically and practically, multiuser detection can be applied to any short-code CDMA system regardless of the presence or absence of synchronization among users. For the sake of completeness, we must point out, however, that asynchronous systems, or systems where orthogonality has been compromised, benefit more from multiuser detection schemes than from other detectors.
References

[1] Sklar, B., Digital Communications: Fundamentals and Applications, Upper Saddle River, NJ: Prentice Hall, 1988.
[2] Proakis, J. G., Digital Communications, Third Edition, New York: McGraw-Hill, 1995.
[3] Verdú, S., Multiuser Detection, New York: Cambridge University Press, 1998.
[4] Landau, H. J., and H. O. Pollak, "Prolate Spheroidal Wave Functions, Fourier Analysis and Uncertainty-III: The Dimension of the Space of Essentially Time- and Band-Limited Signals," Bell System Technical Journal, Vol. 41, 1962, pp. 1295–1336.
[5] Simon, M. K., et al., Spread Spectrum Communications Handbook, Revised Edition, New York: McGraw-Hill, 1994.
[6] Salmasi, A., and K. S. Gilhousen, "On the System Design Aspects of Code Division Multiple Access Applied to Digital Cellular and Personal Communications Networks," Proceedings of the IEEE Vehicular Technology Conference (VTC '91), St. Louis, MO, May 1991, pp. 57–62.
[7] 3rd Generation Partnership Project (3GPP), Radio Interface Technical Specifications, http://www.3gpp.org.
[8] Liu, H., Signal Processing Applications in CDMA Communications, Norwood, MA: Artech House, 2000.
[9] Peterson, R. L., R. E. Ziemer, and D. E. Borth, Introduction to Spread Spectrum Communications, Upper Saddle River, NJ: Prentice Hall, 1995.
[10] Vembu, S., and A. J. Viterbi, "Two Different Philosophies in CDMA - A Comparison," Proceedings of the IEEE Vehicular Technology Conference (VTC '96), Atlanta, GA, April–May 1996, pp. 869–873.
[11] Poor, H. V., An Introduction to Signal Detection and Estimation, Second Edition, New York: Springer-Verlag, 1994.
[12] Gallager, R. G., Information Theory and Reliable Communication, New York: John Wiley & Sons, 1968, p. 337.
[13] Castoldi, P., and H. Kobayashi, "Low Complexity Group Detectors for Multirate Transmission in TD-CDMA 3G Systems," IEEE Broadband Wireless Symposium (Globecom 2000), San Francisco, CA, November–December 2000.
[14] Castoldi, P., and R. Raheli, "On Recursive Optimal Detection of Linear Modulation in the Presence of Random Fading," European Transactions on Telecommunications, Vol. 9, No. 2, March/April 1998, pp. 209–220.
[15] Fliege, N. J., Multirate Digital Signal Processing, New York: John Wiley & Sons, 1994.
2 Single-User Detection

2.1 Single-User Versus Multiuser Receivers

CDMA is a multiple access technique that is intrinsically impaired by multiple access interference. In principle, orthogonality of the spreading sequences allows optimal detection in synchronous systems by using a conventional detector (a filter matched to the signature of interest). Practical synchronous and asynchronous CDMA systems do not guarantee perfect orthogonality between users, and residual nonzero cross-correlations might cause significant multiple access interference. In fact, due to the nonorthogonality of practical CDMA spreading sequences, the cross-correlation between the spreading sequence of the user of interest and the signal from a strong interferer can be greater than the correlation with the signal of the desired user, and detection then becomes highly unreliable. This implies that the conventional detector suffers from the near-far problem. The classical way to deal with this problem is power control, whereby all users' transmission power is controlled so that the power received from all users is equal. This adds complexity to the system, and inaccuracies in power control have a detrimental impact on performance.

It has been shown in [1] that DS-CDMA is not fundamentally MAI-limited and can be near-far resistant. The optimum multiuser receiver exists and allows relaxing the constraint of choosing spreading sequences with good correlation properties, at the cost of increased receiver complexity. The practical application of this approach is limited by the complexity of the detector [2], which requires a search on a trellis whose complexity is exponential in the number of users. Furthermore, this detector requires perfect knowledge of the spreading sequences, timing, phase, and frequency offset of all users.
Various types of suboptimal multiuser detectors have appeared in the technical literature. Some of them will be analyzed in detail in the following chapters. They mainly differ in the strategy for MAI elimination (linear or nonlinear) and on the assumed knowledge of the interference parameters. Many of the suboptimal receivers proposed allow adaptive versions, and some of them will be discussed in detail. Table 2.1, reported from [2], highlights the assumed knowledge for the CDMA detectors that we will consider in this book. Specifically, we are going to consider (1) single-user detectors, namely the conventional and Rake detector; (2) multiuser linear detectors: the zero forcing (ZF) and minimum mean square error (MMSE) detectors; (3) multiuser nonlinear detectors: serial and parallel interference cancellation (SIC, PIC) detectors; and (4) adaptive linear multiuser detectors: the blind and trained MMSE detectors.
2.2 Performance Measures of CDMA Receivers

There are standard measures to compare different CDMA receivers, such as the asymptotic efficiency, the near-far resistance [3], and the bit error probability. Since the ultimate goal in most digital communications systems is to decrease the probability of bit error, this performance measure will be considered in this book. Linear detectors usually allow closed-form solutions for the BER evaluation, which are provided in the following chapters.
2.3 The Conventional Receiver

In this section we show the structure of the so-called conventional receiver, a single-user receiver that does not exploit the structure of the multiple access interference for the purpose of detection. These receivers suffer performance impairments when the orthogonality of the codes is either destroyed because of some external effect (such as multipath propagation), or not originally perfect at the transmitting side.

Table 2.1 Complexity Requirements of Some Detection Algorithms for CDMA Systems

| Detector | Signature of Desired User | Signature of Interferers | Timing of Desired User | Timing of Interferers | Relative Amplitudes | Training Sequence |
|---|---|---|---|---|---|---|
| Conventional and Rake | Y | N | Y | N | Y (1) | N |
| Linear ZF receiver | Y | Y | Y | Y | N | N |
| Linear MMSE receiver | Y | Y | Y | Y | Y | N |
| SIC and PIC | Y | Y (2) | Y | Y | Y | N |
| Trained adapt. MMSE | N | N | Y (3) | N | N | Y |
| Blind adapt. MMSE | Y | N | Y | N | N | N |

(1) Strict power control required for adequate performance. (2) Adequate performance with information about the most powerful interferers only. (3) Symbol timing only may be sufficient. From: [2].

2.3.1 Synchronous Multirate CDMA System Without ISI

Let us consider the multirate synchronous model presented in Chapter 1 and assume that no multipath propagation is present but just a single ray is received. The channel can be modeled as $h(t) = h\,\delta(t)$, where $h$ is a constant. For the sake of simplicity, let $h = 1$, which yields $h(t) = \delta(t)$ and $f(t) = p(t) \otimes q(t)$. If the timing of the desired user is perfectly known, we can exploit symbol-spaced Nyquist sampling of $f(t)$. Hence, $\tilde{\mathbf{F}} = f(0)$ is actually a scalar, and the scalar chip observation is given by

$$y(i) = f(0) \sum_{j=1}^{J} b_j(i) + v(i) = f(0) \sum_{j=1}^{J} c_j\big(|i-1|_{Q_j}\big)\, a_j(n) + v(i) \qquad n = \left\lceil \frac{i}{Q_j} \right\rceil \tag{2.1}$$

Therefore, in this case $\mathbf{F}^{(N)} = f(0)\,\mathbf{I}_N$, where $\mathbf{I}_N$ is the identity matrix of order $N$, and the observation vector of length $N$ can be written as

$$\begin{aligned}
\mathbf{y}^{(N)}(i) &= f(0) \sum_{j=1}^{J} \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) = f(0) \sum_{j=1}^{J} \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \\
&= \sum_{j=1}^{J} \mathbf{G}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i)
\end{aligned} \tag{2.2}$$
In this case the optimal observation window has width $N = Q$, where $Q$ is the maximum spreading factor among those in use. Assuming we know the system timing, this window is exactly synchronized with the symbol interval of the slowest data stream, as shown in Figure 2.1.

Figure 2.1 Selection of the optimal observation window for the case of absence of ISI, when four data streams with different spreading factors are active. In this case the optimal window width is $N = Q$, where $Q = Q_3 = Q_4$ is the maximum spreading factor currently used by the system. (The figure shows streams with $Q_1 = 2$, $Q_2 = 4$, $Q_3 = 8$, and $Q_4 = 8$ aligned with the observation windows $\mathbf{y}_D(1)$, $\mathbf{y}_D(2)$, $\mathbf{y}_D(3)$.)

We then define

$$\mathbf{y}_D(m) = \mathbf{y}^{(Q)}(mQ) \tag{2.3}$$

$$\mathbf{C}_j = \mathbf{C}_j^{(Q)}(mQ) \tag{2.4}$$

$$\mathbf{a}_j(m) = \mathbf{a}_j^{(Q)}(mQ) \tag{2.5}$$

$$\mathbf{v}(m) = \mathbf{v}^{(Q)}(mQ) \tag{2.6}$$

($m = 1, 2, \ldots$). This notation allows us to express the generic processing window observation as

$$\mathbf{y}_D(m) = \sum_{j=1}^{J} \mathbf{C}_j\, \mathbf{a}_j(m) + \mathbf{v}(m) \qquad m = 1, 2, \ldots \tag{2.7}$$
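Conventional detection is accomplished by time-discrete matched filtering, as shown in Figure 2.2. Suppose that the $k$th user is that of interest; the conventional matched filtering operated by matrix $\mathbf{C}_k^H$ yields the following soft decision:

$$\begin{aligned}
\mathbf{x}_{k,MF}(m) &= \mathbf{C}_k^H \mathbf{C}_k\, \mathbf{a}_k(m) + \sum_{j=1,\, j \ne k}^{J} \mathbf{C}_k^H \mathbf{C}_j\, \mathbf{a}_j(m) + \mathbf{C}_k^H \mathbf{v}(m) \\
&= \mathbf{a}_k(m) + \mathbf{u}_{MF}(m)
\end{aligned} \tag{2.8}$$

where the last identity holds since $\mathbf{C}_k^H \mathbf{C}_k = \mathbf{I}_N$ and $\mathbf{C}_k^H \mathbf{C}_j = \mathbf{0}$ ($k \ne j$) due to the orthogonality between codes. The new noise process $\mathbf{u}_{MF}(m) \triangleq \mathbf{C}_k^H \mathbf{v}(m)$ remains white. Hard detection is realized by a threshold device that operates a quantization on the soft decision

$$\hat{\mathbf{a}}_{k,MF}(m) = \mathrm{quant}\big[\mathbf{x}_{k,MF}(m)\big] \tag{2.9}$$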
In this case the orthogonality is preserved in the received signal as well and matched filtering is optimal for detection.
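The matched filtering of (2.8) and (2.9) can be illustrated with a small noise-free example. The sketch below rests on assumptions chosen here: three hypothetical OVSF-like codes taken from different branches of the code tree (spreading factors 2, 4, and 4), binary antipodal symbols, a window of $Q = 4$ chips, and unnormalized codes, so that the matched filter output equals $Q_k$ times the symbol rather than the symbol itself. The helper names are not from the text.

```python
def block_code_matrix(code, window):
    """Code matrix for a window aligned with symbol boundaries: window x (window / Q_j)."""
    q = len(code)
    cols = window // q
    return [[code[row % q] if row // q == col else 0 for col in range(cols)]
            for row in range(window)]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# Hypothetical codes: user 1 has Q_1 = 2, users 2 and 3 have Q = 4
codes = {1: [1, 1], 2: [1, -1, 1, -1], 3: [1, -1, -1, 1]}
symbols = {1: [1, -1], 2: [-1], 3: [1]}   # symbols falling in one window of Q = 4 chips
Q = 4

C = {j: block_code_matrix(code, Q) for j, code in codes.items()}

# Noise-free observation window: sum over users of C_j a_j, as in (2.7)
y = [0] * Q
for j in codes:
    contribution = matvec(C[j], symbols[j])
    y = [u + v for u, v in zip(y, contribution)]

soft = {k: matvec(transpose(C[k]), y) for k in codes}        # matched filtering
detected = {k: [1 if x > 0 else -1 for x in soft[k]] for k in codes}  # threshold device
print(soft)
print(detected)
```

Because the three codes are mutually orthogonal over the window, each soft output contains no contribution from the other users, whatever their amplitudes.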
Figure 2.2 Scheme of the matched filtering detection scheme for the ISI-free synchronous multirate system. (Block diagram: $\mathbf{y}_D(m)$ enters the FIR filter $\mathbf{C}_k^H$, giving $\mathbf{x}_{k,MF}(m) = \mathbf{C}_k^H \mathbf{y}_D(m)$, followed by a threshold detector with output $\hat{\mathbf{a}}_k(m)$.)
2.3.2 Asynchronous Single-Rate CDMA System

In asynchronous CDMA systems perfect orthogonality between users cannot be realized. Conventional detection is realized, as in the synchronous case, using a finite impulse response (FIR) filter whose tap coefficients are matched to the spreading sequence of the desired user (i.e., a correlation receiver is used). We assume that no frequency error, no phase offset, and no near-far effect are present. This final condition implies that, without loss of generality, $\mathbf{A}_j^{(N)} = \mathbf{I}_N$ (for $j = 1, 2, \ldots, J$). Let us denote the desired user by index $k$. We further assume that the desired user's timing is perfectly recovered, which allows Nyquist sampling; this can be expressed by $\mathbf{P}_k^{(N)} = \mathbf{I}_N$. As a consequence, using the expression of the observation vector $\mathbf{y}^{(N)}(i)$ for an asynchronous system derived in Chapter 1,

$$\mathbf{y}^{(N)}(i) = \mathbf{C}_k^{(N)}(i)\, \mathbf{a}_k^{(N)}(i) + \sum_{j=1,\, j \ne k}^{J} \mathbf{P}_j^{(N)}\, \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{2.10}$$
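We focus on an observation window of a symbol interval (i.e., $N = Q$):

$$\mathbf{y}_D(m) = \mathbf{y}^{(Q)}(mQ) \tag{2.11}$$

$$\mathbf{C}_j = \mathbf{C}_j^{(Q)}(mQ) \tag{2.12}$$

$$\mathbf{P}_j = \mathbf{P}_j^{(Q)} \tag{2.13}$$

$$\mathbf{a}_j(m) = \mathbf{a}_j^{(Q)}(mQ) \tag{2.14}$$

$$\mathbf{v}(m) = \mathbf{v}^{(Q)}(mQ) \tag{2.15}$$

($m = 1, 2, \ldots$). For the desired user, whose timing is perfectly known, matrix $\mathbf{C}_k$ is actually a $Q$-dimensional vector $\mathbf{c}_k$ containing the desired user signature; $\mathbf{a}_k(m) = a_k(m)$ is the scalar symbol relevant to the $m$th processing window. The detection scheme is shown in Figures 2.3 and 2.4.

Figure 2.3 Conventional detection scheme. (Block diagram: $\mathbf{y}_D(m)$ enters the FIR filter $\mathbf{c}_k^H$, giving $x_k(m) = \mathbf{c}_k^H \mathbf{y}_D(m)$, followed by a threshold detector with output $\hat{a}_k(m)$.)

Figure 2.4 FIR for conventional detection. (Tapped delay line with delays $T_c$ and tap coefficients $c_1^*, c_2^*, c_3^*, \ldots, c_Q^*$, input $\mathbf{y}_D(m)$ and output $x_k(m)$.)

The soft decision is given by

$$x_k(m) = \mathbf{c}_k^H\, \mathbf{y}_D(m) \tag{2.16}$$

$$= \mathbf{c}_k^H \mathbf{c}_k\, a_k(m) + \mathbf{c}_k^H \sum_{j=1,\, j \ne k}^{J} \mathbf{P}_j \mathbf{C}_j\, \mathbf{a}_j(m) + \mathbf{c}_k^H \mathbf{v}(m) \tag{2.17}$$

$$= a_k(m) + \mathbf{c}_k^H \sum_{j=1,\, j \ne k}^{J} \mathbf{P}_j \mathbf{C}_j\, \mathbf{a}_j(m) + u(m) \tag{2.18}$$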
and it is the input for a symbol-by-symbol detector that yields $\hat{a}_k(m)$. Here $u(m) = \mathbf{c}_k^H \mathbf{v}(m)$, and the last step assumes a unit-norm signature, $\mathbf{c}_k^H \mathbf{c}_k = 1$. This detection strategy exploits the autocorrelation and cross-correlation properties of the spreading sequences and allows a partial cancellation of the MAI. While a synchronous system with perfectly orthogonal spreading sequences is not sensitive to differences in power among users, the performance of an asynchronous system is severely impaired by the near-far effect. This means that the performance is degraded if the interfering signals have a much higher power than that of the useful signal.

2.3.3 Synchronous Multirate CDMA System in the Presence of ISI

When we sample at symbol rate or oversample and the Nyquist criterion is not satisfied, ICI (and ISI) arises. Following Chapter 1, we can express the observation vector belonging to the same chip interval as
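$$\tilde{\mathbf{y}}(i) = \tilde{\mathbf{F}} \sum_{j=1}^{J} \tilde{\mathbf{b}}_j(i) + \tilde{\mathbf{v}}(i) = \tilde{\mathbf{F}} \sum_{j=1}^{J} \tilde{\mathbf{C}}_j(i)\, \tilde{\mathbf{a}}_j(i) + \tilde{\mathbf{v}}(i) = \sum_{j=1}^{J} \tilde{\mathbf{G}}_j(i)\, \tilde{\mathbf{a}}_j(i) + \tilde{\mathbf{v}}(i) \tag{2.19}$$

An observation vector of length $N$ can be expressed in the usual way [4]:

$$\mathbf{y}^{(N)}(i) = \mathbf{F}^{(N)} \sum_{j=1}^{J} \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) = \mathbf{F}^{(N)} \sum_{j=1}^{J} \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{G}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{2.20}$$

Matched filtering can be accomplished by considering $\mathbf{G}_k^{(N)H}(i)$ as the matched filter, but MAI is not completely eliminated and the matched filter is no longer optimal. In this case the optimal observation spans $N = Q + P - 1$ chips, since all energy for the symbol with the largest spreading factor is captured, as shown in Figure 2.5. In order to collect the most energy from the received signal, we define ($m = 1, 2, \ldots$) the following:

$$\mathbf{y}_D(m) = \mathbf{y}^{(Q+P-1)}(mQ + P - 1) \tag{2.21}$$

$$\mathbf{C}_j = \mathbf{C}_j^{(Q+P-1)}(mQ + P - 1) \tag{2.22}$$

$$\mathbf{G}_j = \mathbf{G}_j^{(Q+P-1)}(mQ + P - 1) \tag{2.23}$$

$$\mathbf{F}_j = \mathbf{F}_j^{(Q+P-1)}(mQ + P - 1) \tag{2.24}$$

$$\mathbf{a}_j(m) = \mathbf{a}_j^{(Q+P-1)}(mQ + P - 1) \tag{2.25}$$

$$\mathbf{v}(m) = \mathbf{v}^{(Q+P-1)}(mQ + P - 1) \tag{2.26}$$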
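Soft decisions are given by

$$\begin{aligned}
\mathbf{x}'_{MF}(m) = \mathbf{G}_k^H \mathbf{y}_D(m) &= \mathbf{G}_k^H \mathbf{G}_k \mathbf{a}_k(m) + \sum_{j=1,\, j \neq k}^{J} \mathbf{G}_k^H \mathbf{G}_j \mathbf{a}_j(m) + \mathbf{G}_k^H \mathbf{v}(m) \\
&= \mathbf{G}_k^H \mathbf{G}_k \mathbf{a}_k(m) + \sum_{j=1,\, j \neq k}^{J} \mathbf{G}_k^H \mathbf{G}_j \mathbf{a}_j(m) + \mathbf{u}'_{MF}(m) \\
&= \mathbf{C}_k^H \mathbf{F}^H \mathbf{F} \mathbf{C}_k \mathbf{a}_k(m) + \sum_{j=1,\, j \neq k}^{J} \mathbf{C}_k^H \mathbf{F}^H \mathbf{F} \mathbf{C}_j \mathbf{a}_j(m) + \mathbf{u}'_{MF}(m)
\end{aligned} \tag{2.27}$$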
[Figure 2.5: timing diagram of four data streams with spreading factors $Q_1 = 2$, $Q_2 = 4$, $Q_3 = 8$, and $Q_4 = 8$, showing the symbols $a_j(\cdot)$ of each stream, the impulse response span $P T_c$, and the observation windows $\mathbf{y}_D(1)$, $\mathbf{y}_D(2)$, $\mathbf{y}_D(3)$.]

Figure 2.5 Observation windows in the presence of ISI.

In this case, the orthogonality between users is not preserved and matched filtering is not optimal. MAI arises due to the presence of the term $\mathbf{F}^H \mathbf{F}$.
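The loss of orthogonality caused by $\mathbf{F}^H \mathbf{F}$ can be illustrated with a small numerical sketch. It assumes a hypothetical two-tap chip-spaced channel `h` and two orthogonal length-4 signatures; `convolution_matrix` is an illustrative helper, not a function defined in the text.

```python
import numpy as np

def convolution_matrix(h, n):
    """Return the (n + len(h) - 1) x n matrix F such that F @ b == np.convolve(h, b)."""
    f = np.zeros((n + len(h) - 1, n))
    for col in range(n):
        f[col:col + len(h), col] = h
    return f

Q = 4
# Two orthogonal, unit-norm signatures (illustrative choice).
c1 = np.array([1.0, 1.0, -1.0, -1.0]) / 2.0
c2 = np.array([1.0, -1.0, 1.0, -1.0]) / 2.0
h = np.array([1.0, 0.5])             # hypothetical chip-spaced channel with P = 2 taps
F = convolution_matrix(h, Q)

cross_ideal = c1 @ c2                # c_1^H c_2 without the channel
cross_isi = (F @ c1) @ (F @ c2)      # c_1^H F^H F c_2 with the channel
```

With these values `cross_ideal` is zero while `cross_isi` is not, so the matched filter output of one code picks up a contribution from the other code.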
2.4 The Rake Receiver

The Rake receiver is a single-user detector that is particularly efficient in the presence of multipath propagation. It collects the energy of all the replicas contained in the received signal, exploiting its intrinsic diversity. A filter matched to each replica is used, and the filter outputs are then combined, assuming that the channel is perfectly estimated.

2.4.1 Observation Model for Rake Detection

In order to fully understand the structure of the Rake receiver, we need to modify the observation model, accounting for each echo due to multipath propagation. The considered system is a synchronous single-rate CDMA system (i.e., $Q_j = Q$, $j = 1, 2, \ldots, J$). The channel impulse response of the multipath channel is given by

$$h(t) = \sum_{l=0}^{L-1} h_l\, \delta(t - \tau_l) \tag{2.28}$$
where we assume, for simplicity, that the delays are multiples of the sampling interval (i.e., $\tau_l = d_l T_c$, with $d_l$ integer). If no energy is present for a given delay, then the corresponding tap coefficient is zero. The baseband time-continuous signal, after front-end processing, can be written as the superposition of $L$ asynchronous signals:

$$y(t) = \sum_{l=0}^{L-1} y_l(t) + v(t) \tag{2.29}$$

where

$$y_l(t) = h_l \sum_{j=1}^{J} \sum_{k} b_j(k)\, p(t - (k-1)T_c - d_l T_c) \tag{2.30}$$

A finger of the Rake will be dedicated to each of these signals. The chip interval sampled signal becomes

$$y_l(iT_c) = h_l \sum_{j=1}^{J} \sum_{k} b_j(k)\, p((i - k + 1 - d_l)T_c) \tag{2.31}$$

Assuming that $p(t)$ extends for $P - 1$ chip intervals and letting $k' = i - k - d_l$, we obtain

$$y_l(iT_c) = h_l \sum_{j=1}^{J} \sum_{k'} b_j(i - k' - d_l)\, p((k' + 1)T_c) \tag{2.32}$$

If we assume that $p(t)$ is Nyquist sampled, we have

$$y_l(i) = h_l \sum_{j=1}^{J} p(0)\, b_j(i + 1 - d_l) = \sum_{j=1}^{J} f_l\, b_j(i + 1 - d_l) \tag{2.33}$$

where $f_l = h_l\, p(0)$.
Then the chip observation is given by

$$y(i) = \sum_{l=0}^{L-1} y_l(i) + v(i) \tag{2.34}$$

We consider an observation vector of length $N$ for each finger

$$\mathbf{y}_l^{(N)}(i) = \left[ y_l^T(i - N + 1),\; y_l^T(i - N + 2),\; \ldots,\; y_l^T(i) \right]^T \tag{2.35}$$

We consider the $N$-dimensional square matrix $p(0)\mathbf{I}_N$ and the corresponding chip sequence:

$$\mathbf{b}_{j,l}^{(N)}(i) = \left[ a_j\!\left( \left\lceil \frac{i - N - d_l + 1}{Q} \right\rceil \right) c_j\!\left( |i - N - d_l + 1|_Q \right),\; \ldots,\; a_j\!\left( \left\lceil \frac{i - d_l}{Q} \right\rceil \right) c_j\!\left( |i - d_l|_Q \right) \right]^T \tag{2.36}$$

We now define the $j$th code relevant block-diagonal matrix for the $l$th finger:

$$\mathbf{C}_{j,l}^{(N)}(i) = \begin{bmatrix}
c_j(|i - N - d_l + 1|_Q) & & & \\
\vdots & & & \\
c_j(Q - 1) & & & \\
 & c_j(0) & & \\
 & \vdots & & \\
 & c_j(|i - P - d_l|_Q) & & \\
 & \vdots & & \\
 & c_j(Q - 1) & & \\
 & & \ddots & \\
 & & & c_j(0) \\
 & & & \vdots \\
 & & & c_j(|i - d_l|_Q)
\end{bmatrix} \tag{2.37}$$

whose dimensions are $N \times n'$, where $n' = \left\lceil \frac{N + L + Q - 2}{Q} \right\rceil$.
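We consider the associated data string:

$$\mathbf{a}_{j,l}^{(N)}(i) \triangleq \left[ a_j(n - n' + 1),\; \ldots,\; a_j(n) \right]^T, \qquad n = \left\lceil \frac{i - d_l}{Q} \right\rceil \tag{2.38}$$

such that $\mathbf{b}_{j,l}^{(N)}(i) = \mathbf{C}_{j,l}^{(N)}(i)\, \mathbf{a}_{j,l}^{(N)}(i)$. After definition of $\mathbf{F}_l^{(N)} = f_l \mathbf{I}_N$, we have

$$\mathbf{y}_l^{(N)}(i) = \mathbf{F}_l^{(N)} \sum_{j=1}^{J} \mathbf{C}_{j,l}^{(N)}(i)\, \mathbf{a}_{j,l}^{(N)}(i) \tag{2.39}$$

$$= f_l \sum_{j=1}^{J} \mathbf{C}_{j,l}^{(N)}(i)\, \mathbf{a}_{j,l}^{(N)}(i) \tag{2.40}$$

$$= \sum_{j=1}^{J} \mathbf{G}_{j,l}^{(N)}(i)\, \mathbf{a}_{j,l}^{(N)}(i) \tag{2.41}$$

where $\mathbf{G}_{j,l}^{(N)}(i)$ is a ($\beta N \times n_t$)-dimensional matrix. After defining the noise samples vector

$$\mathbf{v}^{(N)}(i) = [v(i),\; v(i + 1),\; \ldots,\; v(i + N - 1)]^T \tag{2.42}$$

the complete observation can be expressed as

$$\mathbf{y}^{(N)}(i) = \sum_{l=0}^{L-1} \mathbf{y}_l^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{2.43}$$

$$= \sum_{l=0}^{L-1} f_l \sum_{j=1}^{J} \mathbf{C}_{j,l}^{(N)}(i)\, \mathbf{a}_{j,l}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{2.44}$$

$$= \sum_{l=0}^{L-1} \sum_{j=1}^{J} \mathbf{G}_{j,l}^{(N)}(i)\, \mathbf{a}_{j,l}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{2.45}$$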
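2.4.2 The Rake Principle

Figure 2.6 illustrates the principle of the Rake when $L < Q$. In order to collect all energy from the received signal, the processing window is defined as

$$\mathbf{y}_D(m) = \mathbf{y}^{(Q+L-1)}(mQ + L - 1) \tag{2.46}$$

$$\mathbf{y}_l(m) = \mathbf{y}_l^{(Q+L-1)}(mQ + L - 1) \tag{2.47}$$

$$\mathbf{C}_{j,l} = \mathbf{C}_{j,l}^{(Q+L-1)}(mQ + L - 1) \tag{2.48}$$

$$\mathbf{G}_{j,l} = \mathbf{G}_{j,l}^{(Q+L-1)}(mQ + L - 1) \tag{2.49}$$

$$\mathbf{a}_{j,l}(m) = \mathbf{a}_{j,l}^{(Q+L-1)}(mQ + L - 1) \tag{2.50}$$

$$\mathbf{v}(m) = \mathbf{v}^{(Q+L-1)}(mQ + L - 1) \tag{2.51}$$

($m = 1, 2, \ldots$).

[Figure 2.6: timing diagram of two code streams (Code 1 with symbols $a_1(1), a_1(2), a_1(3)$; Code 2 with symbols $a_2(1), a_2(2), a_2(3)$) and the observation windows $\mathbf{y}_D(1)$, $\mathbf{y}_D(2)$, $\mathbf{y}_D(3)$.]

Figure 2.6 Model of observation for Rake detection when two streams with spreading factor $Q = 8$ are active, and $L = 4$.

Each finger of the Rake accomplishes a conventional detection by collecting the relevant energy using matrix $\mathbf{G}_{k,l}^H$, yielding $L$ soft decision vectors:

$$\begin{aligned}
\mathbf{x}_{k,l}(m) = \mathbf{G}_{k,l}^H(m)\, \mathbf{y}_D(m) &= \mathbf{G}_{k,l}^H(m)\, \mathbf{G}_{k,l}(m)\, \mathbf{a}_{k,l}(m) + \sum_{l'=0,\, l' \neq l}^{L-1} \mathbf{G}_{k,l}^H(m)\, \mathbf{G}_{k,l'}(m)\, \mathbf{a}_{k,l'}(m) \\
&\quad + \sum_{j=1,\, j \neq k}^{J} \sum_{l'=0}^{L-1} \mathbf{G}_{k,l}^H(m)\, \mathbf{G}_{j,l'}(m)\, \mathbf{a}_{j,l'}(m) + \mathbf{G}_{k,l}^H(m)\, \mathbf{v}(m)
\end{aligned} \tag{2.52}$$

In this case the MAI mitigation depends only on the cross-correlation between the spreading sequences.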
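The structure of a Rake receiver is shown in Figure 2.7. The final soft decision is given by the composition of the soft decisions of each Rake finger:

$$\mathbf{x}_k(m) = \sum_{l=0}^{L-1} \mathbf{x}_{k,l}(m) \tag{2.53}$$

$$= \sum_{l=0}^{L-1} \mathbf{G}_{k,l}^H(m)\, \mathbf{G}_{k,l}(m)\, \mathbf{a}_{k,l}(m) \tag{2.54}$$

$$+ \sum_{l=0}^{L-1} \sum_{l'=0,\, l' \neq l}^{L-1} \mathbf{G}_{k,l}^H(m)\, \mathbf{G}_{k,l'}(m)\, \mathbf{a}_{k,l'}(m) \tag{2.55}$$

$$+ \sum_{l=0}^{L-1} \sum_{j=1,\, j \neq k}^{J} \sum_{l'=0}^{L-1} \mathbf{G}_{k,l}^H(m)\, \mathbf{G}_{j,l'}(m)\, \mathbf{a}_{j,l'}(m) \tag{2.56}$$

$$+ \sum_{l=0}^{L-1} \mathbf{G}_{k,l}^H(m)\, \mathbf{v}(m) \tag{2.57}$$

[Figure 2.7: block diagram. The observation $\mathbf{y}_D(m)$ feeds $L$ parallel finger filters $\mathbf{G}_{k,0}^H, \mathbf{G}_{k,1}^H, \ldots, \mathbf{G}_{k,L-1}^H$, whose outputs are summed over $l = 0, \ldots, L-1$ to give $\mathbf{x}_k(m)$.]

Figure 2.7 Structure of the Rake receiver.

In the last expression, (2.54) represents the useful signal component; (2.55) accounts for self-interference due to multipath; (2.56) represents the cross-multipath MAI; and (2.57) is the noise term due to AWGN. Further expansion of (2.53) gives

$$\mathbf{x}_k(m) = \sum_{l=0}^{L-1} \mathbf{x}_{k,l}(m) \tag{2.58}$$

$$= \sum_{l=0}^{L-1} |f_l|^2\, \mathbf{C}_{k,l}^H(m)\, \mathbf{C}_{k,l}(m)\, \mathbf{a}_{k,l}(m) \tag{2.59}$$

$$+ \sum_{l=0}^{L-1} \sum_{l'=0,\, l' \neq l}^{L-1} f_l^* f_{l'}\, \mathbf{C}_{k,l}^H(m)\, \mathbf{C}_{k,l'}(m)\, \mathbf{a}_{k,l'}(m) \tag{2.60}$$

$$+ \sum_{l=0}^{L-1} \sum_{j=1,\, j \neq k}^{J} \sum_{l'=0}^{L-1} f_l^* f_{l'}\, \mathbf{C}_{k,l}^H(m)\, \mathbf{C}_{j,l'}(m)\, \mathbf{a}_{j,l'}(m) \tag{2.61}$$

$$+ \sum_{l=0}^{L-1} f_l^*\, \mathbf{C}_{k,l}^H(m)\, \mathbf{v}(m) \tag{2.62}$$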
In (2.59) a "weighting" operation has been highlighted. Factor $|f_l|^2$ gives more weight to the strongest paths compared to the weakest. The term (2.60) is mitigated by the use of spreading sequences with low autocorrelation for any time offset, while the term (2.61) is mitigated by the use of spreading sequences with low cross-correlation for any time offset. It can be easily shown that the Rake receiver is equivalent to the modified matched filter detector of (2.27).
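The finger-and-combine structure can be sketched in a few lines. The example below is an illustration under simplifying assumptions: a single user, one isolated symbol (no neighboring symbols in the window), no noise, and perfectly known path gains. The names `rake_soft_decision`, `delays`, and `gains` are hypothetical.

```python
import numpy as np

def rake_soft_decision(y, code, delays, gains):
    """Sum over fingers of conj(f_l) * c^H y[d_l : d_l + Q] (weighted correlations)."""
    q = len(code)
    fingers = [np.conj(f) * np.vdot(code, y[d:d + q]) for d, f in zip(delays, gains)]
    return sum(fingers)

rng = np.random.default_rng(1)
Q = 8
code = rng.choice([-1.0, 1.0], size=Q) / np.sqrt(Q)   # hypothetical unit-norm signature
delays = [0, 1, 3]                                    # d_l in chips (tap positions 0..3)
gains = np.array([1.0, 0.6j, -0.3])                   # f_l, assumed perfectly estimated
a = -1.0                                              # transmitted BPSK symbol

y = np.zeros(Q + max(delays), dtype=complex)          # window of Q + L - 1 chips, L = 4
for d, f in zip(delays, gains):
    y[d:d + Q] += f * a * code                        # one echo per path, noise-free

x = rake_soft_decision(y, code, delays, gains)
useful = np.sum(np.abs(gains) ** 2) * a               # sum_l |f_l|^2 a, as in (2.59)
self_interference = x - useful                        # inter-path term, as in (2.60)
```

The split of `x` into `useful` and `self_interference` mirrors the first two terms of the expansion; how small the second one is depends on the autocorrelation of the chosen signature.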
2.5 Limits of Single-User Detection

In this chapter various types of single-user detectors have been presented and analyzed. It has been shown that single-user detection is optimal only when the spreading sequences are perfectly orthogonal and the transmission channel does not compromise this feature. Otherwise, the conventional detector suffers from the near-far effect and its performance is highly sensitive to the total number of active users and to their power levels. The Rake detector is useful when the transmission channel is affected by multipath propagation. The Rake detector exploits the intrinsic diversity of the received signal in order to obtain a performance gain over plain conventional detection.
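The near-far effect can be made concrete with a two-user toy model. This is a sketch under stated assumptions: noise is ignored, the codes have a hypothetical normalized cross-correlation `rho`, and the matched-filter output is modeled as the desired symbol plus the scaled interfering symbol.

```python
def mf_soft_decision(a_k, a_j, amplitude_j, rho):
    """Noise-free matched-filter output for two users: a_k + A_j * rho * a_j."""
    return a_k + amplitude_j * rho * a_j

rho = 0.25                      # hypothetical normalized cross-correlation
a_k, a_j = 1.0, -1.0            # BPSK symbols of the desired and interfering user
for amplitude_j in (1.0, 2.0, 8.0):
    x = mf_soft_decision(a_k, a_j, amplitude_j, rho)
    decision = 1.0 if x >= 0 else -1.0
    # Report interferer amplitude, soft decision, and whether the hard decision is right.
    print(amplitude_j, x, decision == a_k)
```

In this toy setting the decision is correct for interferer amplitudes 1 and 2 but flips once the interferer is eight times stronger, even though no noise is present.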
References

[1] Verdú, S., "Minimum Probability of Error for Asynchronous Gaussian Multiple-Access Channels," IEEE Transactions on Information Theory, Vol. IT-32, January 1986, pp. 85–96.

[2] Woodward, G., and B. Vucetic, "Adaptive Detection for DS-CDMA," Proceedings of the IEEE, Vol. 86, No. 7, July 1998.

[3] Verdú, S., Multiuser Detection, New York: Cambridge University Press, 1998.

[4] Castoldi, P., and H. Kobayashi, "Low Complexity Group Detectors for Multirate Transmission in TD-CDMA 3G Systems," IEEE Broadband Wireless Symposium (Globecom 2000), San Francisco, CA, November–December 2000.
3 Linear Multiuser Detection

In this chapter we present a general formulation of some linear detectors designed to eliminate or mitigate MAI and ISI, either in synchronous or asynchronous CDMA systems. The common design principle is the minimization of an error function, with the aim of finding a vector of soft decisions for the data associated with the active codes. Only symbol-by-symbol detection is considered, although decision feedback or trellis-based detection can be easily implemented.

In the case of a synchronous system two main criteria are employed for the minimization of a specific error, namely the zero-forcing (ZF) and the minimum mean square error (MMSE) criterion. A first option consists of the deployment of the ZF or MMSE to deal simultaneously with both MAI and ISI (MI), which leads to detectors named ZF-MI or MMSE-MI. A second option consists of the application of the ZF or MMSE to deal with ISI only (I), which leads to the ZF-I or MMSE-I detectors. This second class of detectors attempts to restore the orthogonality of the codes only, so MAI removal is performed by additional signal processing based on conventional detection.

In the case of an asynchronous system, only the MMSE criterion is considered to minimize the error function, as it leads to substantially better performance and can also be implemented easily in adaptive versions, as shown in Chapter 5.
3.1 Synchronous CDMA, Short Code, Multipath, and Multirate

We refer to the system model derived in Section 1.5.3 and present four types of linear detectors that can be operated under the assumption that all $J$ spreading codes in use are known by the mobile user. The detectors operate in the presence of AWGN. The case of colored background noise, which models the interference from other cells, will be dealt with in Chapter 4. The receiving structures are derived as if the data of all the active codes need to be detected, which is an extreme case. These detectors can also be used to detect a subset of the active codes by implementing only the relevant FIR filters of the desired codes (they correspond to the appropriate rows of the matrix that describes the linear detector). Moreover, the detectors presented are inherently block detectors, which need some form of truncation in order to be operated as a sliding window algorithm.

The task of the four detectors in this section is as follows: Given an observation vector $\mathbf{y}^{(N)}(i)$ (with length $\beta N$) whose first element is in the $i$th chip and extends up to chip $(i - N + 1)$, find the best linear detector that detects all the data on which $\mathbf{y}^{(N)}(i)$ depends; this best linear detector can be found according to different criteria [1]. This chapter contains the theoretical background and numerical validation of the receivers; the performance results are presented in Chapter 6. In order to simplify the notation, we neglect the superscript $(N)$ in all vectors and matrices.

3.1.1 The Zero-Forcing Detector for Both MAI and ISI (ZF-MI)

It is well known that the zero-forcing detector task is to perform the following minimization [1–3]:

$$\mathbf{x}_{ZF\text{-}MI}(i) = \arg\min_{\mathbf{x}(i)} \|\mathbf{y}(i) - \mathbf{G}(i)\mathbf{x}(i)\|^2 \tag{3.1}$$
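where the $n_t$-dimensional vector $\mathbf{x}_{ZF\text{-}MI}(i)$ can be regarded as soft decisions for $\mathbf{a}(i)$. It is the best solution (in the least squares sense) to the overdetermined set of equations $\mathbf{y}(i) = \mathbf{G}(i)\mathbf{x}(i)$, for which we suppose $\beta N \geq n_t$, assuming $\mathbf{G}(i)$ has rank $n_t$. We mention that the solution given in (3.3) can be obtained, for example, by resorting to formal methods of vectorial differentiation of the right-hand side of (3.1) [4, pp. 521–522]. We first define the pseudoinverse of $\mathbf{G}(i)$ as

$$\mathbf{G}^{\#}(i) = (\mathbf{G}^H(i)\mathbf{G}(i))^{-1}\mathbf{G}^H(i) \tag{3.2}$$

which satisfies $\mathbf{G}^{\#}(i)\mathbf{G}(i) = \mathbf{I}_{n_t}$. Following [4, pp. 521–522] and recalling the system model $\mathbf{y}(i) = \mathbf{G}(i)\mathbf{a}(i) + \mathbf{v}(i)$ of Chapter 1, it is easy to show that

$$\mathbf{x}_{ZF\text{-}MI}(i) = \mathbf{G}^{\#}(i)\mathbf{y}(i) \tag{3.3}$$

$$= \mathbf{G}^{\#}(i)\mathbf{G}(i)\mathbf{a}(i) + \mathbf{G}^{\#}(i)\mathbf{v}(i) \tag{3.4}$$

$$= \mathbf{a}(i) + \mathbf{u}_{ZF\text{-}MI}(i) \tag{3.5}$$

where $\mathbf{u}_{ZF\text{-}MI}(i) \triangleq \mathbf{G}^{\#}(i)\mathbf{v}(i)$ is a new colored noise process.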
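The ZF-MI steps (3.2) through (3.6) translate almost directly into array operations. The following is a minimal sketch, assuming real-valued signals, BPSK data, and a randomly drawn full-column-rank matrix `G` standing in for the actual system matrix; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_t = 12, 5                               # beta*N observations, n_t symbols
G = rng.standard_normal((n_obs, n_t))            # hypothetical full-column-rank G(i)
a = rng.choice([-1.0, 1.0], size=n_t)            # BPSK data vector a(i) (assumption)
sigma_v = 0.05
v = sigma_v * rng.standard_normal(n_obs)         # white noise v(i)
y = G @ a + v

G_pinv = np.linalg.solve(G.T @ G, G.T)           # (G^H G)^{-1} G^H, as in (3.2)
x_zf = G_pinv @ y                                # soft decisions, as in (3.3)
a_hat = np.sign(x_zf)                            # quant[.] for BPSK, as in (3.6)

# Covariance of u = G^# v for white v: sigma_v^2 (G^H G)^{-1}, generally not diagonal,
# which is why the new noise process is colored.
noise_cov = sigma_v ** 2 * np.linalg.inv(G.T @ G)
```

Solving the normal equations with `np.linalg.solve` avoids forming an explicit inverse for the filter itself; `np.linalg.pinv(G)` gives the same matrix when `G` has full column rank.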
A geometric interpretation of (3.3) stems from the fact that, in general, the $\beta N$-dimensional vector $\mathbf{y}(i)$ does not lie in the column space of $\mathbf{G}(i)$. Hence, the vector $\mathbf{x}_{ZF\text{-}MI}(i)$ that minimizes (3.1) is that for which $\mathbf{G}(i)\mathbf{x}_{ZF\text{-}MI}(i)$ is the orthogonal projection of $\mathbf{y}(i)$ on the subspace generated by the columns of $\mathbf{G}(i)$. This orthogonal projection can be written, by definition, as $\mathbf{G}(i)(\mathbf{G}^H(i)\mathbf{G}(i))^{-1}\mathbf{G}^H(i)\mathbf{y}(i)$ [4, p. 525], and the comparison with $\mathbf{G}(i)\mathbf{x}_{ZF\text{-}MI}(i)$ yields (3.3).

Equation (3.3) is the linear transformation operated by the matrix $\mathbf{T}_{ZF\text{-}MI}(i) = \mathbf{G}^{\#}(i)$ on observation $\mathbf{y}(i)$, which causes a zero-forcing of both ISI and MAI. This transformation yields the new observation vector $\mathbf{x}_{ZF\text{-}MI}(i)$. Hard detection can be accomplished by

$$\hat{\mathbf{a}}_{ZF\text{-}MI}(i) = \mathrm{quant}[\mathbf{x}_{ZF\text{-}MI}(i)] \tag{3.6}$$
Note that due to the constraint $\beta N \geq n_t$, the feasibility of this detector requires a minimum number of samples to be processed, which may be large.

3.1.2 The Zero-Forcing Detector for ISI Only (ZF-I)

An alternative solution based on the zero-forcing principle is to zero-force ISI only and exploit the orthogonality of the spreading codes used for the separation of the data streams. In this case it is better to rely on the following formulation of the observation:

$$\mathbf{y}(i) = \mathbf{F}\mathbf{b}(i) + \mathbf{v}(i) \tag{3.7}$$

The minimization to be performed is

$$\mathbf{x}'_{ZF\text{-}I}(i) = \arg\min_{\mathbf{x}'(i)} \|\mathbf{y}(i) - \mathbf{F}\mathbf{x}'(i)\|^2 \tag{3.8}$$

where the $(N + P - 1)$-dimensional vector $\mathbf{x}'_{ZF\text{-}I}(i)$ can be regarded, in this case, as soft decisions for $\mathbf{b}(i)$. Again, we define the pseudoinverse of $\mathbf{F}$ as

$$\mathbf{F}^{\#} = (\mathbf{F}^H\mathbf{F})^{-1}\mathbf{F}^H \tag{3.9}$$

which satisfies $\mathbf{F}^{\#}\mathbf{F} = \mathbf{I}_{N+P-1}$ if $\beta N \geq N + P - 1$, assuming $\mathbf{F}$ has rank $N + P - 1$. Following [4, pp. 521–522] and recalling the system model $\mathbf{y}(i) = \mathbf{F}\mathbf{C}(i)\mathbf{a}(i) + \mathbf{v}(i)$ of Chapter 1, it is easy to show that

$$\mathbf{x}'_{ZF\text{-}I}(i) = \mathbf{F}^{\#}\mathbf{y}(i) \tag{3.10}$$

$$= \mathbf{F}^{\#}\mathbf{F}\mathbf{C}(i)\mathbf{a}(i) + \mathbf{F}^{\#}\mathbf{v}(i) \tag{3.11}$$

$$= \mathbf{C}(i)\mathbf{a}(i) + \mathbf{u}'_{ZF\text{-}I}(i) \tag{3.12}$$

where $\mathbf{u}'_{ZF\text{-}I}(i) \triangleq \mathbf{F}^{\#}\mathbf{v}(i)$ is a new colored noise process. Equation (3.10) is the linear transformation of $\mathbf{y}(i)$, operated by $\mathbf{T}_{ZF\text{-}I}(i) = \mathbf{F}^{\#}$, which zero-forces ISI only, yielding the new observation vector $\mathbf{x}'_{ZF\text{-}I}(i)$. In order to solve the detection problem, we can now use a time-discrete filter matched to all the users' (structured) codes $\mathbf{C}^H(i)$, whose $n_t$-dimensional output $\mathbf{x}_{ZF\text{-}I}(i)$ is a soft estimate of $\mathbf{a}(i)$ and is given by
$$\mathbf{x}_{ZF\text{-}I}(i) = \mathbf{C}^H(i)\mathbf{x}'_{ZF\text{-}I}(i) \tag{3.13}$$

$$= \mathbf{C}^H(i)\mathbf{C}(i)\mathbf{a}(i) + \mathbf{C}^H(i)\mathbf{u}'_{ZF\text{-}I}(i) \tag{3.14}$$

$$= \mathbf{a}(i) + \mathbf{u}_{ZF\text{-}I}(i) \tag{3.15}$$

where $\mathbf{u}_{ZF\text{-}I}(i) \triangleq \mathbf{C}^H(i)\mathbf{u}'_{ZF\text{-}I}(i)$ and

$$\mathbf{C}^H(i)\mathbf{C}(i) = \mathbf{I}_{n_t} \tag{3.16}$$

due to the orthogonality of the Walsh-Hadamard codes. Note that (3.16) is valid only when the detection window spans the whole block. Otherwise, there are border effects, predictable from the structure of $\mathbf{C}(i)$, which cause residual MAI in the decision variables of the peripheral symbols of each user. Peripheral symbols are those symbols whose symbol interval is not completely spanned by the observation window. Hard detection can be accomplished by

$$\hat{\mathbf{a}}_{ZF\text{-}I}(i) = \mathrm{quant}[\mathbf{x}_{ZF\text{-}I}(i)] \tag{3.17}$$
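The two-stage ZF-I structure (equalize the chips, then despread) can be sketched as follows. The example assumes a hypothetical three-tap chip-rate channel, two orthonormal codes with a common spreading factor, and a window that covers the whole transmitted block including the channel tail, so that no border effects occur. `convolution_matrix` is an illustrative helper.

```python
import numpy as np

def convolution_matrix(h, n):
    """Return the (n + len(h) - 1) x n matrix F such that F @ b == np.convolve(h, b)."""
    f = np.zeros((n + len(h) - 1, n))
    for col in range(n):
        f[col:col + len(h), col] = h
    return f

rng = np.random.default_rng(3)
Q, M = 4, 3                                               # spreading factor, symbols per code
codes = np.array([[1, 1, 1, 1], [1, -1, 1, -1]]) / 2.0    # two orthonormal signatures
C = np.kron(np.eye(M), codes.T)                           # (M*Q) x (M*J) structured code matrix
F = convolution_matrix(np.array([1.0, 0.5, 0.2]), M * Q)  # hypothetical channel matrix

a = rng.choice([-1.0, 1.0], size=C.shape[1])              # BPSK data of both codes
v = 0.01 * rng.standard_normal(F.shape[0])
y = F @ C @ a + v

F_pinv = np.linalg.pinv(F)       # equals (F^H F)^{-1} F^H for full column rank
x_chip = F_pinv @ y              # soft chip estimates, as in (3.10)
x_zf_i = C.T @ x_chip            # despreading, as in (3.13)
```

Only the channel matrix is inverted here; the codes are handled by a plain matched filter, which is why the size constraint involves the chip vector length rather than the number of symbols.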
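Note that the constraint $\beta N \geq N + P - 1$ is in general less demanding than that of the previous equalizer.

3.1.3 The MMSE Detector for Mitigation of Both MAI and ISI (MMSE-MI)

The MMSE detector task is to find an $n_t \times \beta N$-dimensional matrix by performing the following minimization:

$$\mathbf{T}_{MMSE\text{-}MI}(i) = \arg\min_{\mathbf{T}(i)} E_{\mathbf{a}(i),\mathbf{v}(i)}\left\{ \|\mathbf{a}(i) - \mathbf{T}(i)\mathbf{y}(i)\|^2 \right\} \tag{3.18}$$

so that the $n_t$-dimensional vector $\mathbf{x}_{MMSE\text{-}MI}(i) \triangleq \mathbf{T}_{MMSE\text{-}MI}(i)\mathbf{y}(i)$ can be regarded as soft decisions for $\mathbf{a}(i)$, which is the best solution (in the minimum mean square sense) to the problem of the prediction of $\mathbf{a}(i)$, regarded as continuous variables. In other words, the prediction matrix (3.19) is obtained by imposing orthogonality (in a stochastic sense) between the observation $\mathbf{y}(i)$ and the prediction error $[\mathbf{a}(i) - \mathbf{T}_{MMSE\text{-}MI}(i)\mathbf{y}(i)]$ [5, p. 438]. On the other hand, the result given in (3.19) can also be obtained by using the formal method of matrix differentiation [6, p. 331]. After some algebra, it is easy to show that [5]

$$\begin{aligned}
\mathbf{T}_{MMSE\text{-}MI}(i) &= E[\mathbf{a}(i)\mathbf{y}^H(i)] \left( E[\mathbf{y}(i)\mathbf{y}^H(i)] \right)^{-1} \\
&= \mathbf{R}_a(i)\mathbf{G}^H(i) \left[ \mathbf{G}(i)\mathbf{R}_a(i)\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_{\beta N} \right]^{-1} \\
&= \left[ \mathbf{G}^H(i)\mathbf{G}(i) + \sigma_v^2 \mathbf{R}_a^{-1}(i) \right]^{-1} \mathbf{G}^H(i)
\end{aligned} \tag{3.19}$$

where $\mathbf{R}_a(i) = E[\mathbf{a}(i)\mathbf{a}^H(i)]$. As a consequence,

$$\begin{aligned}
\mathbf{x}_{MMSE\text{-}MI}(i) &= \left[ \mathbf{G}^H(i)\mathbf{G}(i) + \sigma_v^2 \mathbf{R}_a^{-1}(i) \right]^{-1} \mathbf{G}^H(i)\mathbf{y}(i) \\
&= \left[ \mathbf{I}_{n_t} + \sigma_v^2 (\mathbf{R}_a(i)\mathbf{G}^H(i)\mathbf{G}(i))^{-1} \right]^{-1} \left[ \mathbf{G}^H(i)\mathbf{G}(i) \right]^{-1} \mathbf{G}^H(i)\mathbf{y}(i) \\
&= \left[ \mathbf{I}_{n_t} + \sigma_v^2 (\mathbf{R}_a(i)\mathbf{G}^H(i)\mathbf{G}(i))^{-1} \right]^{-1} \mathbf{x}_{ZF\text{-}MI}(i) \\
&\triangleq \mathbf{W}_{MI}(i)\, \mathbf{x}_{ZF\text{-}MI}(i)
\end{aligned} \tag{3.20}$$

where the $n_t \times n_t$ square matrix $\mathbf{W}_{MI}(i)$ is a Wiener estimator. This estimator observes $\mathbf{x}_{ZF\text{-}MI}(i)$ and produces the MMSE soft estimate $\mathbf{x}_{MMSE\text{-}MI}(i)$ of $\mathbf{a}(i)$, reducing the performance degradation of the zero-forcing detector, whose decisions do not take into account the noise correlations existing in the decision variables. By recalling the expression of $\mathbf{x}_{ZF\text{-}MI}(i)$, (3.20) can be rewritten as follows:

$$\mathbf{x}_{MMSE\text{-}MI}(i) = \mathrm{diag}(\mathbf{W}_{MI}(i))\mathbf{a}(i) + \overline{\mathrm{diag}}(\mathbf{W}_{MI}(i))\mathbf{a}(i) + \mathbf{u}_{MMSE\text{-}MI}(i) \tag{3.21}$$

where $\mathrm{diag}(\cdot)$ retains the main diagonal of its argument, $\overline{\mathrm{diag}}(\cdot)$ retains the off-diagonal part, and $\mathbf{u}_{MMSE\text{-}MI}(i) \triangleq (\mathbf{G}^H(i)\mathbf{G}(i) + \sigma_v^2 \mathbf{R}_a^{-1}(i))^{-1}\mathbf{G}^H(i)\mathbf{v}(i)$ is the new noise process. The first term on the right-hand side of (3.21) is the useful term, while the second accounts for residual ISI and MAI due to imperfect zero-forcing of the MMSE algorithm.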
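The equivalence of the two closed forms in (3.19), and the factorization of the MMSE filter into a Wiener stage applied to the zero-forcing output as in (3.20), can be verified numerically. This sketch assumes real-valued quantities, a randomly drawn matrix `G`, and uncorrelated unit-power symbols; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_t, sigma2 = 10, 4, 0.1
G = rng.standard_normal((n_obs, n_t))      # hypothetical system matrix G(i)
R_a = np.eye(n_t)                          # unit-power, uncorrelated symbols (assumption)

# Second line of (3.19): R_a G^H (G R_a G^H + sigma^2 I)^{-1}
T1 = R_a @ G.T @ np.linalg.inv(G @ R_a @ G.T + sigma2 * np.eye(n_obs))
# Third line of (3.19): (G^H G + sigma^2 R_a^{-1})^{-1} G^H
T2 = np.linalg.solve(G.T @ G + sigma2 * np.linalg.inv(R_a), G.T)

# Wiener post-filter of (3.20): W = [I + sigma^2 (R_a G^H G)^{-1}]^{-1}
W = np.linalg.inv(np.eye(n_t) + sigma2 * np.linalg.inv(R_a @ G.T @ G))
T_zf = np.linalg.pinv(G)                   # ZF-MI matrix G^#
```

The third form inverts an `n_t` by `n_t` matrix instead of an `n_obs` by `n_obs` one, which is usually the cheaper option when the observation is longer than the data vector.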
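Hard detection can be accomplished by

$$\hat{\mathbf{a}}_{MMSE\text{-}MI}(i) = \mathrm{quant}[\mathbf{x}_{MMSE\text{-}MI}(i)] \tag{3.22}$$

Note that this time there are no constraints on the minimum size of the observation vector $\mathbf{y}(i)$.

3.1.4 The MMSE Detector for Mitigation of ISI Only (MMSE-I)

We refer again to observation model (3.7). In order to mitigate ISI only, the MMSE detector must find an $(N + P - 1) \times \beta N$-dimensional matrix by performing the following minimization:

$$\mathbf{T}_{MMSE\text{-}I}(i) = \arg\min_{\mathbf{T}(i)} E_{\mathbf{b}(i),\mathbf{v}(i)}\left\{ \|\mathbf{b}(i) - \mathbf{T}(i)\mathbf{y}(i)\|^2 \right\} \tag{3.23}$$

so that the $(N + P - 1)$-dimensional vector $\mathbf{x}'_{MMSE\text{-}I}(i) \triangleq \mathbf{T}_{MMSE\text{-}I}(i)\mathbf{y}(i)$ can be regarded as soft decisions for $\mathbf{b}(i)$. Similarly to (3.19), it is easy to show that

$$\mathbf{T}_{MMSE\text{-}I}(i) = \left[ \mathbf{F}^H\mathbf{F} + \sigma_v^2 \mathbf{R}_b^{-1}(i) \right]^{-1} \mathbf{F}^H \tag{3.24}$$

where $\mathbf{R}_b(i) = E[\mathbf{b}(i)\mathbf{b}^H(i)]$. As a consequence,

$$\begin{aligned}
\mathbf{x}'_{MMSE\text{-}I}(i) &= \left[ \mathbf{F}^H\mathbf{F} + \sigma_v^2 \mathbf{R}_b^{-1}(i) \right]^{-1} \mathbf{F}^H\mathbf{y}(i) \\
&= \left[ \mathbf{I}_{N+P-1} + \sigma_v^2 (\mathbf{R}_b(i)\mathbf{F}^H\mathbf{F})^{-1} \right]^{-1} \left[ \mathbf{F}^H\mathbf{F} \right]^{-1} \mathbf{F}^H\mathbf{y}(i) \\
&= \left[ \mathbf{I}_{N+P-1} + \sigma_v^2 (\mathbf{R}_b(i)\mathbf{F}^H\mathbf{F})^{-1} \right]^{-1} \mathbf{x}'_{ZF\text{-}I}(i) \\
&\triangleq \mathbf{W}_I(i)\, \mathbf{x}'_{ZF\text{-}I}(i)
\end{aligned} \tag{3.25}$$

where the $(N + P - 1) \times (N + P - 1)$ square matrix $\mathbf{W}_I(i)$ is a Wiener estimator, which observes $\mathbf{x}'_{ZF\text{-}I}(i)$ and produces the MMSE soft estimate $\mathbf{x}'_{MMSE\text{-}I}(i)$ of $\mathbf{b}(i)$. Although hard decisions are not based on the observation $\mathbf{x}'_{MMSE\text{-}I}(i)$, this term can be interpreted in a similar way to (3.21):

$$\mathbf{x}'_{MMSE\text{-}I}(i) = \mathrm{diag}(\mathbf{W}_I(i))\mathbf{b}(i) + \overline{\mathrm{diag}}(\mathbf{W}_I(i))\mathbf{b}(i) + \mathbf{u}'_{MMSE\text{-}I}(i) \tag{3.26}$$

where $\overline{\mathrm{diag}}(\cdot)$ retains the off-diagonal part of its argument and $\mathbf{u}'_{MMSE\text{-}I}(i) \triangleq (\mathbf{F}^H\mathbf{F} + \sigma_v^2 \mathbf{R}_b^{-1}(i))^{-1}\mathbf{F}^H\mathbf{v}(i)$ is the new noise process. The first term on the right-hand side of (3.26) is the useful chip sequence term, while the second accounts for residual ISI due to imperfect zero-forcing of the MMSE algorithm. Note that the MAI is not suppressed here.

From the new set of observations given by (3.26), we can now use a time-discrete filter matched to the users' (structured) codes $\mathbf{C}^H(i)$, whose $n_t$-dimensional output $\mathbf{x}_{MMSE\text{-}I}(i)$ is a soft estimate of $\mathbf{a}(i)$ and is given by

$$\begin{aligned}
\mathbf{x}_{MMSE\text{-}I}(i) &= \mathbf{C}^H(i)\mathbf{x}'_{MMSE\text{-}I}(i) \\
&= \mathrm{diag}(\mathbf{W}_I(i))\mathbf{C}^H(i)\mathbf{C}(i)\mathbf{a}(i) + \mathbf{C}^H(i)\overline{\mathrm{diag}}(\mathbf{W}_I(i))\mathbf{C}(i)\mathbf{a}(i) + \mathbf{C}^H(i)\mathbf{u}'_{MMSE\text{-}I}(i) \\
&= \mathrm{diag}(\mathbf{W}_I(i))\mathbf{a}(i) + \mathbf{C}^H(i)\overline{\mathrm{diag}}(\mathbf{W}_I(i))\mathbf{C}(i)\mathbf{a}(i) + \mathbf{u}_{MMSE\text{-}I}(i)
\end{aligned} \tag{3.27}$$

where $\mathbf{u}_{MMSE\text{-}I}(i) \triangleq \mathbf{C}^H(i)\mathbf{u}'_{MMSE\text{-}I}(i)$ and $\mathbf{C}^H(i)\mathbf{C}(i) = \mathbf{I}_{n_t}$ due to the orthogonality of the Walsh-Hadamard codes. The same considerations made about the border effects for the ZF-I detector apply here for the first term of (3.27). Hard detection can be accomplished by

$$\hat{\mathbf{a}}_{MMSE\text{-}I}(i) = \mathrm{quant}[\mathbf{x}_{MMSE\text{-}I}(i)] \tag{3.28}$$

Also in this case, there are no constraints on the minimum size of the observation vector $\mathbf{y}(i)$.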
3.2 Sliding Window Formulation

The detectors presented in the previous section are block detectors. Using an observation vector $\mathbf{y}(i)$ of arbitrary length, they estimate the relevant transmitted data with a symbol-by-symbol strategy. Low and finite complexity formulations of these detectors are necessary in order to employ them in practical receivers. The following derivation refers to the linear detector presented in Section 3.1.3 but holds for all other classes of linear detectors. Detectors that only mitigate the ISI can be used too, since the peripheral decision variables affected by border effects are not utilized by the algorithm [1, 3].

The strategy is that the block minimization accomplished by the MMSE can be simplified because vector $\mathbf{y}^{(N)}(i)$ is a stationary vector if $N \geq Q$, where $Q$ is the maximum spreading factor currently used by the system. We again refer to the detection in the first block and keep Figure 3.1 as a reference. The MMSE sliding window algorithm spans a window of $(Q + P - 1)$ chip intervals. Accordingly, we define the following:
[Figure 3.1: timing diagram of three data streams (first stream with $Q_1 = 4$, second with $Q_2 = 8$, third with $Q_3 = 8$) against the symbol index, showing the impulse response span $P T_c$, the symbols $a_j(\cdot)$ of each stream, the symbol indices $m_j^{(k-2)} - \tilde{n}_j + 1$, $m_j^{(k-1)} - \tilde{n}_j + 1$, $m_j^{(k-2)}$, and $m_j^{(k-1)}$, the observation vectors $\mathbf{y}_D(k)$ and $\mathbf{y}_D(k-1)$, and the decision variable sets $\mathbf{x}_D(k)$ and $\mathbf{x}_D(k-1)$.]

Figure 3.1 Symbol-by-symbol detection scheme, assuming three data streams with different data rates (thick lines separate symbol intervals; thin lines separate chips). In this case $Q = 8$ (maximum spreading factor). The figure shows the nominal symbol interval for each data stream, which is shaded, along with the ISI caused by each symbol, which spans $(P - 1)$ additional chips. On the bottom of the figure, two observation vectors $\mathbf{y}_D(k)$, $\mathbf{y}_D(k-1)$ are reported. On the left of the figure, the corresponding sets of decision variables $\mathbf{x}_D(k)$ and $\mathbf{x}_D(k-1)$ are marked, respectively, with circles and triangles. Note that, at each detection step, only those decision variables filled in black are retained and used for the detection.
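$$\mathbf{y}_D(k) = \mathbf{y}^{(Q+P-1)}(kQ + P - 1) \tag{3.29}$$

$$\mathbf{a}_D(k) = \mathbf{a}^{(Q+P-1)}(kQ + P - 1) \tag{3.30}$$

($k = 1, 2, \ldots$). Accordingly, $\tilde{n}_t = \sum_j \tilde{n}_j$ symbols contribute to the window, where $\tilde{n}_j = \left\lceil \frac{2(P - 1) + Q_j + Q}{Q_j} \right\rceil$. First we need to determine a transformation matrix $\mathbf{T}_D$ whose dimensions are $\tilde{n}_t \times \beta(Q + P - 1)$, such that

$$\mathbf{T}_D = \arg\min_{\mathbf{T}} E_{\mathbf{a}_D(k),\mathbf{v}_D(k)}\left\{ \|\mathbf{a}_D(k) - \mathbf{T}\mathbf{y}_D(k)\|^2 \right\}, \qquad k = 1, 2, \ldots \tag{3.31}$$

It should be emphasized that $\mathbf{T}_D$ is independent of detection step $k$ and is a bank of linear filters that will be used throughout the detection process. By further definition of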
$$\mathbf{G}_D = \mathbf{G}^{(Q+P-1)}(kQ + P - 1) \tag{3.32}$$

$$\mathbf{R}_{a_D} = \mathbf{R}_a^{(Q+P-1)}(kQ + P - 1) \tag{3.33}$$

it is easy to show that

$$\mathbf{T}_D = (\mathbf{G}_D^H\mathbf{G}_D + \sigma_v^2 \mathbf{R}_{a_D}^{-1})^{-1}\mathbf{G}_D^H \tag{3.34}$$

We can build $\mathbf{x}_D(k)$, the soft estimate of $\mathbf{a}_D(k)$, as

$$\mathbf{x}_D(k) = \mathbf{T}_D\mathbf{y}_D(k), \qquad k = 1, 2, \ldots \tag{3.35}$$

It is worthwhile to expand $\mathbf{x}_D(k)$ as

$$\begin{aligned}
\mathbf{x}_D(k) = \Big[ & x_1(m_1^{(k-1)} - \tilde{n}_1 + 1),\; x_1(m_1^{(k-1)} - \tilde{n}_1 + 2),\; \ldots,\; x_1(m_1^{(k)}), \\
& \qquad \vdots \\
& x_J(m_J^{(k-1)} - \tilde{n}_J + 1),\; x_J(m_J^{(k-1)} - \tilde{n}_J + 2),\; \ldots,\; x_J(m_J^{(k)}) \Big]
\end{aligned} \tag{3.36}$$

where

$$m_j^{(k)} \triangleq \left\lceil \frac{kQ + P - 1}{Q_j} \right\rceil, \qquad k = 1, 2, \ldots \tag{3.37}$$

is the first symbol of the $j$th data stream for which we obtain a decision variable at the $k$th detection step. The dimension of the vector $\mathbf{x}_D(k)$ is $\tilde{n}_t$. Detection of all symbols could be accomplished by a symbol-by-symbol decision device according to the rule

$$\hat{\mathbf{a}}_D = \mathrm{quant}[\mathbf{x}_D] \tag{3.38}$$

Only a part of these decision variables must be evaluated and used for symbol detection, namely for user $j$, at the $k$th step, the range is from $\frac{(k-1)Q}{Q_j} + 1$ to $\frac{kQ}{Q_j}$, as shown in Figure 3.1. The caption for Figure 3.1 describes how the sliding window algorithm proceeds, according to the above steps. In Table 3.1 we list the major synchronization points of the detection process for the $j$th user, having as reference a time-discrete axis based on its symbol intervals. The table gives indications concerning the use of the decision variables (DV) to realize a continuous detection of the data. The integration of Figure 3.1 and Table 3.1 will be further clarified by a specific example in Section 3.2.1.
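The range of retained decision variables stated above, from $(k-1)Q/Q_j + 1$ to $kQ/Q_j$, can be written as a small helper. This is a sketch, assuming that each spreading factor $Q_j$ divides the maximum spreading factor $Q$; the function name `retained_symbols` is illustrative.

```python
def retained_symbols(k, q_max, q_j):
    """Symbol indices of stream j detected at step k: (k-1)Q/Q_j + 1 ... kQ/Q_j."""
    if q_max % q_j != 0:
        raise ValueError("q_j is assumed to divide the maximum spreading factor")
    per_step = q_max // q_j                 # symbols of stream j per detection step
    return list(range((k - 1) * per_step + 1, k * per_step + 1))

Q_MAX = 8
stream_1 = retained_symbols(3, Q_MAX, 4)    # stream with Q_j = 4 at step k = 3
stream_2 = retained_symbols(3, Q_MAX, 8)    # stream with Q_j = 8 at step k = 3
```

Each step therefore detects $Q/Q_j$ new symbols per stream, so consecutive steps tile the symbol axis without gaps or overlaps.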
Table 3.1 Sequence of the Synchronization Epochs of the Detection Process at the kth Detection Step

Symbol index | Event
⋮ | ⋮
$m_j^{(k-1)} - \tilde{n}_j + 1$ | First DV of $k$th observation block
⋮ | ⋮
$m_j^{(k-2)}$ | Last DV of $(k-1)$-th observation block
⋮ | ⋮
$\frac{(k-1)Q}{Q_j}$ | Detection stops using DVs of $(k-1)$-th observation block
$\frac{(k-1)Q}{Q_j} + 1$ | Detection starts using DVs of $k$th observation block
⋮ | ⋮
$m_j^{(k)} - \tilde{n}_j + 1$ | First DV of $(k+1)$-th observation block
⋮ | ⋮
$m_j^{(k-1)}$ | Last DV of $(k-1)$-th observation block
⋮ | ⋮
$\frac{kQ}{Q_j}$ | Detection stops using the DVs of $k$th observation block
$\frac{kQ}{Q_j} + 1$ | Detection starts using the DVs of the $(k+1)$-th observation block
⋮ | ⋮
$m_j^{(k+1)} - \tilde{n}_j + 1$ | First DV of $(k+2)$-th observation block
⋮ | ⋮
$m_j^{(k)}$ | Last DV of $k$th observation block

Note: The potential decision variables (marked with circles) produced by the kth processing window agree with Figure 3.1. Those filled in black are retained for detection.
3.2.1 Validation of the Sliding Window Algorithm

In order to validate the proposed detection strategy, we have analyzed the receiver performance on each component of the vector $\mathbf{x}_D(k)$ given by (3.36), whose $\tilde{n}_t$ components are the decision variables for as many symbols. To this purpose, it is instructive to read Figures 3.2 and 3.3, keeping Figure 3.1 as a reference, which refers to the same scenario of three active codes ($Q_1 = 4$; $Q_2 = Q_3 = 8$) [3].

We now describe the fine structure of the detector shown in Figure 3.1, which uses at the $k$th detection step vector $\mathbf{y}_D(k)$ with size $N = Q + P - 1 = 14$. The corresponding decision variables, which are the components of vector $\mathbf{x}_D(k-1)$, are $\tilde{n}_t = 12$, more precisely $\tilde{n}_1 = 6$ for the first data stream, and $\tilde{n}_2 = \tilde{n}_3 = 3$ for the second and third data stream (the 12 circles in Figure 3.1) [3]. We report the signal-to-noise-and-interference ratio (SNIR) on each of the 12 decision variables of the proposed detectors, of the ZF-MI and the MMSE-MI in Figure 3.2, and of the ZF-I and the MMSE-I in Figure 3.3. The exact analytic expression of the SNIR for the ZF-MI and MMSE-MI detectors is given by

$$\Gamma_{ZF\text{-}MI}(j) = \frac{E[|a(j)|^2]}{\sigma^2_{u,ZF\text{-}MI}(j)} \tag{3.39}$$

$$\Gamma_{MMSE\text{-}MI}(j) = \frac{|(\tilde{\mathbf{W}}_{MI})_{j,j}|^2\, E[|a(j)|^2]}{\sigma^2_{MAI\text{-}ISI}(j) + \sigma^2_{u,MMSE\text{-}MI}(j)} \tag{3.40}$$

($j = 1, \ldots, \tilde{n}_t$) where

$$\sigma^2_{u,MMSE\text{-}MI}(j) = \sigma^2_{u,ZF\text{-}MI}(j)\, (\tilde{\mathbf{W}}_{MI}\tilde{\mathbf{W}}_{MI}^H)_{j,j}, \qquad j = 1, 2, \ldots, \tilde{n}_t \tag{3.41}$$

and

$$\sigma^2_{MAI\text{-}ISI}(j) \triangleq (\mathbf{W}_{MI}\mathbf{R}_a\mathbf{W}_{MI}^H)_{j,j} - 2\,\mathrm{Re}\{(\mathbf{W}_{MI}\mathbf{R}_a)_{j,j}(\mathbf{W}_{MI}^H)_{j,j}\} + E[|a(j)|^2]\, |(\mathbf{W}_{MI})_{j,j}|^2 \tag{3.42}$$
We invite the reader to derive accordingly the SNIR expressions for the ZF-I and MMSE-I detectors. The variance of the background noise is set to the value of $\sigma_v^2 = -10$ dB. Considering the first stream ($Q_1 = 4$), it can be seen that the $\tilde{n}_1 = 6$ decision variables (six circles of the first stream in Figure 3.1) exhibit a poor SNIR performance for symbols 1, 2, 5, and 6 (empty circles in Figure 3.1). The low SNIR is easily justified by observing in Figure 3.1 that the observation vector $\mathbf{y}_D(k-1)$ does not collect all the received energy available for the detection of those symbols. Only decision variables 3 and 4 (filled circles of stream 1 in Figure 3.1), which have maximum SNIR, are used for detection. A similar behavior occurs for the two other data streams with $Q_2 = Q_3 = 8$, where only the decision variables 8 and 11 are retained for detection. The performance of the ZF is, in general, slightly worse than that of the MMSE. In the figure pairs (Figures 3.4 and 3.5) and (Figures 3.6 and 3.7), where different sets of active codes are considered, we observe a similar behavior for the SNIR. As expected, the inner symbols benefit most from detection, and the small variations in SNIR performance are due to the different codes employed by the different data streams.

[Figure 3.2: plot of SNIR (dB) versus decision variable index (components 1 to 12 of $\mathbf{x}(k)$) for the MMSE-MI and ZF-MI detectors; DVs 1–6 belong to the stream with $Q_1 = 4$, DVs 7–9 to $Q_2 = 8$, and DVs 10–12 to $Q_3 = 8$.]

Figure 3.2 SNIR for MAI and ISI mitigation receivers, as a function of the components of $\mathbf{x}_D(k)$, for $\sigma_v^2 = -10$ dB; three active codes: $Q_1 = 4$, $Q_2 = 8$, $Q_3 = 8$. (Source: [3], © 2000 IEEE, reprinted with permission.)

[Figure 3.3: plot of SNIR (dB) versus decision variable index (components 1 to 12 of $\mathbf{x}(k)$) for the MMSE-I and ZF-I detectors; DVs 1–6 belong to the stream with $Q_1 = 4$, DVs 7–9 to $Q_2 = 8$, and DVs 10–12 to $Q_3 = 8$.]

Figure 3.3 SNIR for ISI mitigation receivers, as a function of the components of $\mathbf{x}_D(k)$, for $\sigma_v^2 = -10$ dB; three active codes: $Q_1 = 4$, $Q_2 = 8$, $Q_3 = 8$.
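The per-variable SNIR of the zero-forcing detector can be evaluated directly from the system matrix. The sketch below assumes white noise of variance $\sigma_v^2$, so that the noise variance on the $j$th decision variable is $\sigma_v^2[(\mathbf{G}^H\mathbf{G})^{-1}]_{j,j}$ (a standard consequence of $\mathbf{u} = \mathbf{G}^{\#}\mathbf{v}$, stated here as an assumption rather than taken from the text). The toy matrix `G` is hypothetical: its first column is weak to mimic a peripheral symbol whose energy mostly falls outside the window.

```python
import numpy as np

def snir_zf_db(G, sigma2_v, symbol_power=1.0):
    """Per-symbol SNIR (dB) after zero forcing, following the form of (3.39)."""
    noise_var = sigma2_v * np.real(np.diag(np.linalg.inv(G.conj().T @ G)))
    return 10.0 * np.log10(symbol_power / noise_var)

G = np.zeros((4, 3))
G[0, 0] = 0.1     # peripheral symbol: only a small part of its energy is in the window
G[1, 1] = 1.0     # inner symbols: fully captured
G[2, 2] = 1.0
snir = snir_zf_db(G, sigma2_v=0.1)   # sigma_v^2 = -10 dB
```

In this toy case the weak column yields an SNIR 20 dB below that of the fully captured columns, which is the same qualitative behavior the plots show for the outer decision variables.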
3.3 MMSE Receivers for CDMA Asynchronous Systems As seen in Chapter 1, an observation vector of N samples can be expressed as follows: y ðN Þ ðiÞ ¼ GðN Þ ðiÞa ðN Þ ðiÞ þ v ðN Þ ðiÞ
ð3:43Þ
20.0 Q2=8 (DV #9–12)
10.0 Q1=4 (DV #1–8)
SNIR (dB)
0.0
Q3=16 (DV #13–15)
–10.0
–20.0 MMSE,MI ZF,MI
–30.0
–40.0
1
2
3
4
5 6 7 8 9 10 11 12 13 14 15 DV: Components of vector x(k)
Figure 3.4 SNIR for MAI and ISI mitigation receivers, as a function of the components of xD ðkÞ, for s2v ¼ 10 dB; three active codes: Q1 ¼ 4, Q2 ¼ 8, Q3 ¼ 16.
58
Multiuser Detection in CDMA Mobile Terminals
20.0 Q1=4 (DV #1–8)
Q2=8 (DV #9–12)
Q3=16 (DV #13–15)
10.0
SNIR (dB)
0.0
–10.0
–20.0 MMSE,I ZF,I
–30.0
–40.0
1
2
3
4
5 6 7 8 9 10 11 12 13 14 15 DV: Components of vector x(k)
Figure 3.5 SNIR for ISI mitigation receivers, as a function of the components of xD ðkÞ, for s2v ¼ 10 dB; three active codes: Q1 ¼ 4, Q2 ¼ 8, Q3 ¼ 16. 20.0 10.0 0.0
[Figure 3.6 plot: SNIR (dB) versus DV index (components of vector x(k), DV 1–12); curves MMSE,MI and ZF,MI; Q1=8 (DV #1–3), Q2=8 (DV #4–6), Q3=8 (DV #7–9), Q4=8.]
Figure 3.6 SNIR for MAI and ISI mitigation receivers, as a function of the components of $\mathbf{x}_D(k)$, for $\sigma_v^2 = 10$ dB; four active codes: $Q_1 = Q_2 = Q_3 = Q_4 = 8$.
[Figure 3.7 plot: SNIR (dB) versus DV index (components of vector x(k), DV 1–12); curves MMSE,I and ZF,I; Q1=8 (DV #1–3), Q2=8 (DV #4–6), Q3=8 (DV #7–9), Q4=8 (DV #10–12).]
Figure 3.7 SNIR for ISI mitigation receivers, as a function of the components of $\mathbf{x}_D(k)$, for $\sigma_v^2 = 10$ dB; four active codes: $Q_1 = Q_2 = Q_3 = Q_4 = 8$.
where $\mathbf{G}^{(N)}(i) = \mathbf{F}^{(N)}(i)\,\mathbf{C}^{(N)}(i)$ is a matrix with dimensions $N \times n_t(i)$. $\mathbf{F}^{(N)}(i)$ depends on the users' amplitudes and on phase and frequency offsets, while $\mathbf{C}^{(N)}(i)$ contains the spreading codes. Vector $\mathbf{a}^{(N)}(i)$, with length $n_t(i)$, comprises all the symbols of the J users spanned by the processing window considered; $\mathbf{v}^{(N)}(i)$ is the N-dimensional noise vector. From now on, we drop the superscript $(N)$ in all vectors and matrices.

The MMSE detector employs matrix $\mathbf{T}_{MMSE}(i)$ (with dimension $n_t(i) \times N$), realizing the following minimization:

$$\mathbf{T}_{MMSE}(i) = \arg\min_{\mathbf{T}(i)} \; E_{\mathbf{a}(i),\mathbf{v}(i)}\left\{ \|\mathbf{a}(i) - \mathbf{T}(i)\,\mathbf{y}(i)\|^2 \right\} \tag{3.44}$$
Vector $\mathbf{x}_{MMSE}(i) \triangleq \mathbf{T}_{MMSE}(i)\,\mathbf{y}(i)$, with length $n_t(i)$, is the vector of soft decisions for $\mathbf{a}(i)$, and it is the best solution (according to the MSE criterion) to the problem of estimating $\mathbf{a}(i)$, seen as a set of continuous variables, given the observation $\mathbf{y}(i)$. Hard decisions can be obtained using a threshold detector as follows:

$$\hat{\mathbf{a}}_{MMSE}(i) = \mathrm{quant}[\mathbf{x}_{MMSE}(i)] \tag{3.45}$$
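Matrix $\mathbf{T}_{MMSE}(i)$ can be obtained by imposing orthogonality between the observation $\mathbf{y}(i)$ and the prediction error $[\mathbf{a}(i) - \mathbf{T}_{MMSE}(i)\,\mathbf{y}(i)]$. It can be shown (also using formal differentiation methods) that

$$\begin{aligned}
\mathbf{T}_{MMSE}(i) &= E[\mathbf{a}(i)\,\mathbf{y}^H(i)]\,\{E[\mathbf{y}(i)\,\mathbf{y}^H(i)]\}^{-1} \\
&= \mathbf{R}_a(i)\,\mathbf{G}^H(i)\,[\mathbf{G}(i)\,\mathbf{R}_a(i)\,\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_N]^{-1} \\
&= [\mathbf{G}^H(i)\,\mathbf{G}(i) + \sigma_v^2 \mathbf{R}_a^{-1}(i)]^{-1}\,\mathbf{G}^H(i)
\end{aligned} \tag{3.46}$$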
where $\mathbf{R}_a(i) = E[\mathbf{a}(i)\,\mathbf{a}^H(i)]$. In order to solve the MMSE problem, knowledge of matrix $\mathbf{G}(i)$ is required: hence, the amplitudes, the phase and frequency offsets, and the delays of all users must be known. Rewriting the MMSE cost function,

$$E_{\mathbf{a}(i),\mathbf{v}(i)}\left\{ \|\mathbf{a}(i) - \mathbf{T}(i)\,\mathbf{y}(i)\|^2 \right\} \tag{3.47}$$
it is easy to recognize that it is the expected squared norm of a vector. Since each component of the error vector depends on a different row of $\mathbf{T}(i)$, the norm is minimized by minimizing each component independently. Hence, the minimization problem can be described as $n_t(i)$ decoupled minimization problems of the following type [7]:

$$\mathbf{t}^H_{(j,k),MMSE}(i) = \arg\min_{\mathbf{t}_{(j,k)}(i)} \; E_{\mathbf{a}(i),\mathbf{v}(i)}\left[ |a_j(n-k) - \mathbf{t}^H_{(j,k)}(i)\,\mathbf{y}(i)|^2 \right] \tag{3.48}$$

with $n = \left\lfloor \frac{i - d_j}{Q} \right\rfloor$, $j = 1, 2, \ldots, J$ and $k = 0, 1, \ldots, n_j(i) - 1$.
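As a numerical illustration of (3.46), the following minimal Python sketch checks that its last two closed forms coincide. It assumes NumPy; the function names, the random matrix standing in for $\mathbf{G}(i)$, and the sizes are illustrative choices, not taken from the text. The first form inverts an $N \times N$ matrix, the second an $n_t \times n_t$ matrix.

```python
import numpy as np

def mmse_matrix_obs_form(G, Ra, sigma2):
    """T = Ra G^H (G Ra G^H + sigma2 I_N)^{-1}; requires an N x N inverse."""
    N = G.shape[0]
    Ry = G @ Ra @ G.conj().T + sigma2 * np.eye(N)
    # Ry is Hermitian, so T = (Ry^{-1} G Ra^H)^H
    return np.linalg.solve(Ry, G @ Ra.conj().T).conj().T

def mmse_matrix_symbol_form(G, Ra, sigma2):
    """T = (G^H G + sigma2 Ra^{-1})^{-1} G^H; requires an n_t x n_t inverse."""
    A = G.conj().T @ G + sigma2 * np.linalg.inv(Ra)
    return np.linalg.solve(A, G.conj().T)

# Illustrative sizes and a random complex stand-in for G(i) (assumptions)
rng = np.random.default_rng(0)
N, n_t, sigma2 = 12, 5, 0.1
G = rng.standard_normal((N, n_t)) + 1j * rng.standard_normal((N, n_t))
Ra = np.eye(n_t)  # uncorrelated, unit-power symbols

T1 = mmse_matrix_obs_form(G, Ra, sigma2)
T2 = mmse_matrix_symbol_form(G, Ra, sigma2)
print(np.allclose(T1, T2))
```

When the number of symbols in the window is smaller than the number of samples, the second form is the cheaper one to evaluate.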
The symbol $a_j(n-k)$ is the $(n-k)$th transmitted symbol from the jth user, and it is the $(n_j(i) - k)$th component of $\mathbf{a}_j(i)$. It should be stressed that n depends on $d_j$ and is different from user to user. Each of the $n_t(i)$ vectors $\mathbf{t}^H_{(j,k),MMSE}(i)$ with length N, the solution of one of the $n_t(i)$ minimization problems, is a row of matrix $\mathbf{T}_{MMSE}(i)$ and represents an FIR filter with N taps. This property of the MMSE detector has an impact on the complexity, which depends on the number of streams to be detected. In general we can write

$$\begin{aligned}
\mathbf{t}^H_{(j,k),MMSE}(i) &= E[a_j(n-k)\,\mathbf{y}^H(i)]\,\{E[\mathbf{y}(i)\,\mathbf{y}^H(i)]\}^{-1} \\
&= E[a_j(n-k)\,\mathbf{a}^H(i)]\,\mathbf{G}^H(i)\,[\mathbf{G}(i)\,\mathbf{R}_a(i)\,\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_N]^{-1}
\end{aligned} \tag{3.49}$$
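The previous result can be further expressed as

$$\begin{aligned}
\mathbf{t}^H_{(j,k),MMSE}(i) &= [0 \ldots 0\,1\,0 \ldots 0]\;\mathbf{G}^H(i)\,[\mathbf{G}(i)\,\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_N]^{-1} \\
&= [0 \ldots 0\,1\,0 \ldots 0]\;\mathbf{T}_{MMSE}(i)
\end{aligned} \tag{3.50}$$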
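where $[0 \ldots 0\,1\,0 \ldots 0]$ is a vector with length $n_t(i)$ and a "1" in the proper position. As a matter of fact, in this case (uncorrelated symbols with unit power) $\mathbf{R}_a(i) = \mathbf{I}_{n_t(i)}$ and [see (3.46)] $\mathbf{T}_{MMSE}(i) = \mathbf{G}^H(i)\,[\mathbf{G}(i)\,\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_N]^{-1}$. Analogously, (3.49) can be rewritten as

$$\mathbf{t}_{(j,k),MMSE}(i) = [\mathbf{G}(i)\,\mathbf{R}_a(i)\,\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_N]^{-1}\,\mathbf{G}(i)\,[0 \ldots 0\,1\,0 \ldots 0]^T \tag{3.51}$$

Starting from (3.50) and (3.51), we can evaluate the minimum of the MSE function:

$$\mathrm{MMSE} = E\left[ |a_j(n-k) - \mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{y}(i)|^2 \right] \tag{3.52}$$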
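We have that

$$\begin{aligned}
\mathrm{MMSE} &= |a_j(n-k)|^2 - 2\,\Re e\left\{ E\left[ a_j^*(n-k)\,\mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{y}(i) \right] \right\} + E\left[ |\mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{y}(i)|^2 \right] \\
&= 1 - 2\,\Re e\left\{ \mathbf{t}^H_{(j,k),MMSE}(i)\,E\left[ a_j^*(n-k)\,\mathbf{y}(i) \right] \right\} + E\left[ \mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{y}(i)\,\mathbf{y}^H(i)\,\mathbf{t}_{(j,k),MMSE}(i) \right] \\
&= 1 - 2\,\Re e\left\{ \mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{G}(i)\,[0 \ldots 0\,1\,0 \ldots 0]^T \right\} + \mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{R}_y\,\mathbf{t}_{(j,k),MMSE}(i) \\
&= 1 - 2\,\Re e\left\{ \mathbf{t}^H_{(j,k),MMSE}(i)\,\mathbf{G}(i)\,[0 \ldots 0\,1\,0 \ldots 0]^T \right\} + [0 \ldots 0\,1\,0 \ldots 0]\;\mathbf{G}^H(i)\,\mathbf{t}_{(j,k),MMSE}(i)
\end{aligned} \tag{3.53}$$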
where we have defined

$$\mathbf{R}_y = E[\mathbf{y}(i)\,\mathbf{y}^H(i)] = \mathbf{G}(i)\,\mathbf{R}_a(i)\,\mathbf{G}^H(i) + \sigma_v^2 \mathbf{I}_N \tag{3.54}$$
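The expansion in (3.53) can be checked numerically. The sketch below is only an illustration under the assumption $\mathbf{R}_a(i) = \mathbf{I}$ (unit-power, uncorrelated symbols); the function name `symbol_mmse` and the random stand-in for $\mathbf{G}(i)$ are hypothetical. It evaluates the third line of (3.53) and the compact value $1 - \mathbf{e}^T \mathbf{G}^H(i)\,\mathbf{t}$, which follows from the last line because $\mathbf{t}^H \mathbf{G}\,\mathbf{e} = \mathbf{e}^T \mathbf{G}^H \mathbf{R}_y^{-1} \mathbf{G}\,\mathbf{e}$ is real.

```python
import numpy as np

def symbol_mmse(G, sigma2, idx):
    """Filter t of (3.51) and its MSE for the symbol at position idx, with R_a = I."""
    N, n_t = G.shape
    e = np.zeros(n_t)
    e[idx] = 1.0                                    # selector vector [0...010...0]^T
    Ry = G @ G.conj().T + sigma2 * np.eye(N)        # (3.54) with R_a = I
    t = np.linalg.solve(Ry, G @ e)                  # (3.51)
    cross = t.conj() @ (G @ e)                      # t^H G e
    quad = (t.conj() @ Ry @ t).real                 # t^H R_y t
    mmse_expanded = 1.0 - 2.0 * cross.real + quad   # third line of (3.53)
    mmse_compact = 1.0 - (e @ G.conj().T @ t).real  # 1 - e^T G^H t
    return t, mmse_expanded, mmse_compact

# Random complex stand-in for G(i); sizes are arbitrary (assumptions)
rng = np.random.default_rng(3)
G = rng.standard_normal((10, 4)) + 1j * rng.standard_normal((10, 4))
t, m_expanded, m_compact = symbol_mmse(G, sigma2=0.2, idx=2)
print(m_expanded, m_compact)
```

Both values agree, and for unit-power symbols the result lies strictly between 0 and 1.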
3.4 Sliding Window MMSE Receiver

The algorithm described operates on the whole vector of samples received. Using an observation vector $\mathbf{y}(i)$ with arbitrary length, it estimates the transmitted symbols with a symbol-by-symbol strategy. A sliding window formulation is necessary to use this receiver. A processing window of length $(Q + P - 1)$ is considered for the decisions (see Figure 3.8):

$$\mathbf{y}_D(m) \triangleq \mathbf{y}^{(Q+P-1)}(mQ + P - 1), \qquad m = 1, 2, \ldots \tag{3.55}$$
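We then define

$$\mathbf{a}_D(m) \triangleq \mathbf{a}^{(Q+P-1)}(mQ + P - 1), \qquad n_t \times 1 \tag{3.56}$$

$$\mathbf{G}_D(m) \triangleq \mathbf{G}^{(Q+P-1)}(mQ + P - 1), \qquad (Q + P - 1) \times n_t \tag{3.57}$$

$$\mathbf{a}_D(m) = \mathbf{a}^{(Q+P-1)}(mQ + P - 1), \qquad 3M \times 1 \tag{3.58}$$

$$\mathbf{v}_D(m) \triangleq \mathbf{v}^{(Q+P-1)}(mQ + P - 1), \qquad (Q + P - 1) \times 1 \tag{3.59}$$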
[Figure 3.8 diagram: observation windows $\mathbf{y}_D(0)$, $\mathbf{y}_D(1)$, $\mathbf{y}_D(2)$ sliding over the symbol streams $a_j(1), \ldots, a_j(4)$ of the first, second, kth, and Jth stations; impulse response span $PT_c$.]
Figure 3.8 Sliding window MMSE detection scheme for an asynchronous system.
We observe that $n_t$ is independent of m since the observation window slides by Q chips at each detection step and the codes are periodic with period Q. The minimization problem with respect to $\mathbf{T}_D$ (a matrix with dimension $3M \times (Q + P - 1)$) can be formulated as follows:

$$\mathbf{T}_{D,MMSE}(m) = \arg\min_{\mathbf{T}_D(m)} \; E_{\mathbf{a}_D(m),\mathbf{v}_D(m)}\left\{ \|\mathbf{a}_D(m) - \mathbf{T}_D(m)\,\mathbf{y}_D(m)\|^2 \right\} \tag{3.60}$$
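We have that

$$\begin{aligned}
\mathbf{T}_{D,MMSE}(m) &= E[\mathbf{a}_D(m)\,\mathbf{y}_D^H(m)]\,\{E[\mathbf{y}_D(m)\,\mathbf{y}_D^H(m)]\}^{-1} \\
&= E[\mathbf{a}_D(m)\,\mathbf{a}_D^H(m)]\,\mathbf{G}_D^H(m)\,[\mathbf{G}_D(m)\,\mathbf{R}_{a_D}(m)\,\mathbf{G}_D^H(m) + \sigma_v^2 \mathbf{I}_{(Q+P-1)}]^{-1}
\end{aligned} \tag{3.61}$$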
where $\mathbf{R}_{a_D}(m) \triangleq \mathbf{R}_a^{(Q+P-1)}(mQ + P - 1)$. Evaluation of matrix $\mathbf{T}_{D,MMSE}(m)$ requires large computational complexity, because a matrix inverse with dimensions $(Q + P - 1) \times (Q + P - 1)$ must be evaluated (Q is usually large in asynchronous systems). This is one of the disadvantages of the MMSE detector in its original formulation.

Defining $\mathbf{x}_D(m) \triangleq \mathbf{T}_{D,MMSE}(m)\,\mathbf{y}_D(m)$, and recalling that J is the total number of users, we see that the 3J elements of such a vector represent soft decisions for the 3J symbols embraced by the mth processing window:

$$\mathbf{x}_D(m) = \begin{bmatrix} x_1(m-2) \\ x_1(m-1) \\ x_1(m) \\ x_2(m-2) \\ \vdots \\ x_J(m-2) \\ x_J(m-1) \\ x_J(m) \end{bmatrix} \tag{3.62}$$
In principle, the detection of all 3J symbols can be obtained using a symbol-by-symbol decision device, which performs the following quantization:

$$\hat{\mathbf{a}}_D(m) = \mathrm{quant}[\mathbf{x}_D(m)] \tag{3.63}$$
In fact, as in the synchronous case, only a part of this decision variable should be utilized for detection; specifically, we use $x_j(m-1)$ only $(j = 1, 2, \ldots, J)$ and decide on symbols $a_j(m-1)$ only $(j = 1, 2, \ldots, J)$. As can be seen from
Figure 3.8, for those symbols the signal-to-interference ratio is maximum because all relevant energy for those symbols is collected by the processing window. Recalling the function to be minimized, we have that

$$E_{\mathbf{a}_D(m),\mathbf{v}_D(m)}\left\{ \|\mathbf{a}_D(m) - \mathbf{T}_D(m)\,\mathbf{y}_D(m)\|^2 \right\} = \sum_{k=0}^{2} \sum_{j=1}^{J} E\left[ |a_j(m-k) - \mathbf{t}^H_{D,(j,k)}(m)\,\mathbf{y}_D(m)|^2 \right] \tag{3.64}$$
In particular, at the mth detection step, we are interested in detecting symbols $a_j(m-1)$ $(j = 1, 2, \ldots, J)$. As a matter of fact, setting $k = 1$ and $\mathbf{t}_{D,(j,1)}(m) = \mathbf{t}_j(m)$, the cost function becomes

$$\sum_{j=1}^{J} E\left[ |a_j(m-1) - \mathbf{t}_j^H(m)\,\mathbf{y}_D(m)|^2 \right] \tag{3.65}$$
Hence, the detection problem at the mth detection step can be decoupled into J transversal filters with length $(Q + P - 1)$ such that

$$\mathbf{t}_{j,MMSE}(m) = \arg\min_{\mathbf{t}_j(m)} \; E_{\mathbf{a}_D(m),\mathbf{v}_D(m)}\left[ |a_j(m-1) - \mathbf{t}_j^H(m)\,\mathbf{y}_D(m)|^2 \right] \tag{3.66}$$
with $j = 1, 2, \ldots, J$. The solution is

$$\mathbf{t}_{j,MMSE}(m) = [\mathbf{G}_D(m)\,\mathbf{R}_{a_D}\,\mathbf{G}_D^H(m) + \sigma_v^2 \mathbf{I}_{(Q+P-1)}]^{-1}\,\mathbf{G}_D(m)\,E[a_j^*(m-1)\,\mathbf{a}_D(m)] \tag{3.67}$$
with $j = 1, 2, \ldots, J$. Vectors $\mathbf{t}^H_{j,MMSE}(m) = E[a_j(m-1)\,\mathbf{y}_D^H(m)]\,\{E[\mathbf{y}_D(m)\,\mathbf{y}_D^H(m)]\}^{-1}$ are the J rows of matrix $\mathbf{T}_{D,MMSE}(m)$ associated with the symbols $a_j(m-1)$. A decision is made on symbol $(m-1)$, instead of m; the delay of the MMSE equalizer is equal to 1. As an example, without loss of generality, if we suppose that the useful signal is identified by index 1 and that we have independent PSK symbols with unit amplitude, it is easy to show that

$$\begin{aligned}
\mathbf{t}_{1,MMSE}(m) &= [\mathbf{G}_D(m)\,\mathbf{R}_{a_D}\,\mathbf{G}_D^H(m) + \sigma_v^2 \mathbf{I}_{(Q+P-1)}]^{-1}\,\mathbf{G}_D(m)\,[0\,1\,0 \ldots 0]^T \\
&= \mathbf{R}_{y_D}^{-1}(m)\,\mathbf{G}_D(m)\,[0\,1\,0 \ldots 0]^T
\end{aligned} \tag{3.68}$$
defining $\mathbf{R}_{y_D}(m) = E[\mathbf{y}_D(m)\,\mathbf{y}_D^H(m)]$. In fact, three symbols of user 1 are embraced, the second symbol being the one with maximum SNIR. The minimum mean square error for user 1 is given by

$$\mathrm{MMSE} = E\left[ |a_1(m-1) - \mathbf{t}^H_{1,MMSE}(m)\,\mathbf{y}_D(m)|^2 \right] \tag{3.69}$$

which can be expanded, following what was done in (3.53):

$$\mathrm{MMSE} = 1 - 2\,\Re e\left\{ \mathbf{t}^H_{1,MMSE}(m)\,\mathbf{G}_D(m)\,[0\,1\,0 \ldots 0]^T \right\} + [0\,1\,0 \ldots 0]\;\mathbf{G}_D^H(m)\,\mathbf{t}_{1,MMSE}(m), \qquad m = 2, 3, \ldots \tag{3.70}$$

Matrix $\mathbf{T}_{D,MMSE}(m)$ and vectors $\mathbf{t}_{j,MMSE}(m)$ $(j = 1, 2, \ldots, J)$ depend on detection step m, because matrix $\mathbf{G}_D(m)$ depends on m. Actually, we have that

$$\mathbf{G}_D(m) = \mathbf{A}_D(m)\,\mathbf{P}_D\,\mathbf{C}_D \tag{3.71}$$
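where we have defined the following:

$$\mathbf{A}_D(m) \triangleq \mathbf{A}^{(Q+P-1)}(mQ + P - 1) \tag{3.72}$$

$$\mathbf{P}_D \triangleq \mathbf{P}^{(Q+P-1)} \tag{3.73}$$

$$\mathbf{C}_D \triangleq \mathbf{C}^{(Q+P-1)}(mQ + P - 1) \tag{3.74}$$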
Both matrices $\mathbf{P}_D$ and $\mathbf{C}_D$ are independent of m. Specifically, the latter is independent of the detection step because the codes are periodic with period Q, and the observation window slides by Q chips at a time. As a consequence, $\mathbf{G}_D(m)$ depends on m only through $\mathbf{A}_D(m)$, where only the frequency offset causes time variance, since the received amplitudes and the phase offset are regarded as slowly varying. These considerations are very important in the definition of an adaptive algorithm for the iterative search of the filters $\mathbf{t}_{j,MMSE}(m)$ $(j = 1, 2, \ldots, J)$. In fact, if the required solution is time-varying, the iterative algorithm, which is updated symbol by symbol, must be able to track the variations. The major problem is the frequency offset, which will be analyzed in Chapter 6. Another important consideration in the adaptive formulation of the MMSE receiver is related to the period of the spreading sequence. The statistics seen by the processing window must be stationary; in other words, the processing window must slide by a shift equal to the period of the received cyclostationary signal.
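The relation between the full matrix of (3.61) and the J filters of (3.68) can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy, independent unit-power symbols ($\mathbf{R}_{a_D} = \mathbf{I}$), the symbol ordering of (3.62) (three symbols per user, the middle one being $a_j(m-1)$), and a random matrix standing in for $\mathbf{G}_D(m)$; the name `central_filters` is hypothetical.

```python
import numpy as np

def central_filters(G_D, sigma2, J):
    """Return T_D of (3.61) and the J filters t_j of (3.68), assuming R_aD = I."""
    W, n_t = G_D.shape                       # W = Q + P - 1, n_t = 3J
    assert n_t == 3 * J
    Ry = G_D @ G_D.conj().T + sigma2 * np.eye(W)
    T_D = np.linalg.solve(Ry, G_D).conj().T  # G_D^H Ry^{-1}, using Ry Hermitian
    filters = []
    for j in range(J):
        e = np.zeros(n_t)
        e[3 * j + 1] = 1.0                   # middle symbol a_j(m-1) of user j
        filters.append(np.linalg.solve(Ry, G_D @ e))
    return T_D, np.array(filters)

# Illustrative sizes: J = 2 users, Q = 8 chips per symbol, P = 3 channel taps
rng = np.random.default_rng(5)
J, Q, P = 2, 8, 3
G_D = rng.standard_normal((Q + P - 1, 3 * J)) + 1j * rng.standard_normal((Q + P - 1, 3 * J))
T_D, filt = central_filters(G_D, sigma2=0.1, J=J)

# Soft decisions x_j(m-1) = t_j^H y_D(m) for one (random) window
y_D = rng.standard_normal(Q + P - 1) + 1j * rng.standard_normal(Q + P - 1)
x_central = filt.conj() @ y_D
print(x_central.shape)
```

Only the J filters need to be applied at each detection step; the remaining 2J rows of $\mathbf{T}_{D,MMSE}(m)$ produce decision variables that are discarded.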
3.5 Considerations on Linear Detection

Four families of linear multiuser detectors operating according to different criteria have been presented. These detectors are very simple, because they can be implemented using a bank of FIR filters that operates directly on the samples of the received signal. They allow a high degree of flexibility in describing the "don't care" part of the received signal (i.e., the contribution due to the codes that do not need to be detected). In this chapter this part of the signal is modeled as structured interference, while in Chapter 4 a coarser unstructured description as noise with known statistics is presented. This flexibility makes these detectors particularly suitable for deployment in mobile terminals.

The only drawback of the optimal linear detectors is that the evaluation of the tap coefficients of the FIR filters involves a matrix inversion. Although this computation must be performed only once, at the beginning of the detection process, reduced-complexity matrix inversion algorithms may be required [8]. The adaptive versions of these detectors presented in Chapter 5 do not need a matrix inversion, although this class of receivers operates with some kind of side information and needs a tap coefficient update at each detection step.
References

[1] Castoldi, P., and H. Kobayashi, "Co-channel Interference Mitigation Detectors for Multirate Transmission in TD-CDMA System," IEEE Journal on Selected Areas in Communications (special issue on multiuser detection for the wireless channel), Vol. 20, No. 2, February 2002, pp. 273–286.
[2] Klein, A., and P. Baier, "Linear Unbiased Data Estimation in Mobile Radio Systems Applying CDMA," IEEE Journal on Selected Areas in Communications, Vol. 11, No. 7, September 1993.
[3] Castoldi, P., and H. Kobayashi, "Low Complexity Group Detectors for Multirate Transmission in TD-CDMA 3G Systems," IEEE Broadband Wireless Symposium (Globecom 2000), San Francisco, CA, November–December 2000.
[4] Therrien, C. W., Discrete Random Signals and Statistical Signal Processing, Englewood Cliffs, NJ: Prentice Hall, 1992.
[5] McDonough, R. N., and A. D. Whalen, Detection of Signals in Noise, Second Edition, Burlington, MA: Academic Press, 1995.
[6] Anderson, B., and J. Moore, Optimal Filtering, Englewood Cliffs, NJ: Prentice Hall, 1979.
[7] Lim, T. J., and S. Roy, "Adaptive Filters in Multiuser (MU) CDMA Detection," Wireless Networks, Vol. 4, 1998, pp. 307–318.
[8] Lei, Z. D., and T. J. Lim, "Simplified Polynomial-Expansion Linear Detectors for DS-CDMA Systems," Electronics Letters, Vol. 34, No. 16, August 1998.
4 Structured Versus Unstructured Linear Detection and Interference Mitigation

In this chapter we address the problem of detection in a CDMA multirate communication system where block-synchronous groups of users are active. The model describes the downlink detection of a mobile radio CDMA system where the MAI is due to both intracell and cochannel intercell interference. This derivation extends the algorithms proposed in Chapter 3 to the case of a multiple-cell system. The proposed detection schemes employ a structured model of the MAI having tunable complexity for the purpose of linear detection and interference mitigation. Specifically, examples of structured and unstructured intracell and intercell interference suppression are considered, and linear multiuser detectors designed on top of those models are presented and compared. This is a "hot" problem at the moment, since time division duplex (TDD),¹ one of the proposed duplexing techniques for the Universal Mobile Telecommunication System (UMTS) third-generation mobile radio system [1], calls for such a capability for the downlink too. Soft hand-over procedures are also proposed, which allow direct suppression of intercell interference and also group detection of the data of neighboring cells.

¹ The derivation applies to both the low chip rate and the high chip rate TDD modes.

4.1 System Model

An intracell signal in the downlink is a CDMA synchronous signal where all codes experience the same propagation channel. A generic interferer coming from another cell in the downlink (intercell interferer) can be described by the same structure as the intracell signal. The intercell interferers are asynchronous to each other and with respect to the intracell signal. In order to avoid direct interference between the same code possibly used in different cells, a base station (BS)-specific scrambling sequence is used to "mask" every cell beam of OVSF channelization codes. The scrambling sequences are short (i.e., their duration is at most a few symbol intervals). The effect is that the overall spreading sequence becomes time-varying, but a short observation window with stationary statistics [2] exists, and a sliding window algorithm can once again be employed for the detection. The signal of each cell can be modeled as a synchronous system identical to that presented in Chapter 1, on which a cell-specific modulation realized using the scrambling sequence is superimposed.

For the derivation of a general multicell time-discrete observation, we start by focusing our attention on the scrambled single-cell model shown in Figure 4.1. Subscript d is appended to the parameters (number of users, data, codes, etc.) that belong to this cell and are regarded as desired intracell parameters. As an example, $a_{J_d,d}(n)$ denotes the symbol of the $J_d$th data stream of the desired cell transmitted at the nth symbol interval.

We consider $J_d$ synchronous intracell data streams carried by as many codes and sent through a slowly varying multipath fading channel $h_d(t)$. By $\{a_{j,d}(n)\}$ we denote the data sequence that is spread by the jth code ($j = 1, 2, \ldots, J_d$), whose spreading factor is $Q_{j,d}$. We follow the system model of [3] in order to account for the user-specific OVSF spreading code followed by scrambling with the BS-specific overlay sequence. We rename the OVSF spreading code (Walsh-Hadamard sequence) for the jth user with
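$$\mathbf{w}_{j,d} = \left[ w_{j,d}(0),\; w_{j,d}(1),\; \ldots,\; w_{j,d}(Q_{j,d} - 1) \right], \qquad j = 1, 2, \ldots, J_d \tag{4.1}$$

[Figure 4.1 block diagram; labels: data $a_{1,d}(n), \ldots, a_{J_d,d}(n)$; spreading factors $Q_{1,d}, \ldots, Q_{J_d,d}$; codes $c_{1,d}(n;k), \ldots, c_{J_d,d}(n;k)$; chips $b_{1,d}(k), \ldots, b_{J_d,d}(k)$; pulse $p(t)$; channel $h_d(t)$; intracell signal $y_d(t)$; noise $v(t)$; front-end filter $q(t)$; sampling rate $\beta/T_c$; samples $y_{m,d}(i)$.]
Figure 4.1 Model of single cell transmission system in the presence of a scrambling sequence.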
This sequence (denoted by $\mathbf{c}_j$ in Chapter 1) is repeated at each symbol interval. The BS-specific scrambling code, whose length is equal to the maximum possible spreading factor $Q_{max} = 16$, is described by referring it to the jth code symbol interval as follows:

$$z_d(n, i), \qquad n = 0, 1, \ldots, \frac{Q_{max}}{Q_{j,d}} - 1 \quad \text{and} \quad i = 0, 1, \ldots, Q_{j,d} - 1 \tag{4.2}$$
where index n counts the symbol intervals of the jth user, while index i runs over chip intervals. We define a structured periodic spreading code as

$$c_{j,d}(n, i) \triangleq w_{j,d}(i)\; z_d\!\left( |n - 1|_{Q_{max}/Q_{j,d}},\; i \right), \qquad n = 1, 2, \ldots \quad \text{and} \quad i = 0, 1, \ldots, Q_{j,d} - 1 \tag{4.3}$$
where we have used the operator $|i|_L \triangleq i \bmod L$. Unlike in the single-cell model, $c_{j,d}(n, i)$ is a periodically time-varying spreading sequence because of the presence of the overlay scrambling sequence. The sequences $c_{j,d}(n, i)$ and $z_d(n, i)$ have the same period, equal to $Q_{max}/Q_{j,d}$. We then define a modulated chip sequence $\{b_{j,d}(k)\}$ corresponding to the data sequence $\{a_{j,d}(n)\}$ of the jth user:

$$b_{j,d}(k) \triangleq a_{j,d}(n)\; c_{j,d}\!\left( n,\; |k - 1|_{Q_{j,d}} \right), \qquad k = 1, 2, \ldots, \quad n = \left\lceil \frac{k}{Q_{j,d}} \right\rceil \tag{4.4}$$
where $\lceil x \rceil$ denotes the smallest integer that equals or exceeds x. The communication system shown in Figure 4.1 has a structure identical to the single-cell counterpart shown in Chapter 1. After front-end filtering, oversampling at $\beta$ times the chip rate is performed to obtain scalar time-discrete observations denoted by $y_{m,d}(i)$, where subscript m counts the samples per chip interval, i runs over the chip intervals themselves, and d indicates that the observation is due to the contribution of the intracell (desired) signal only. Although the expression of $b_{j,d}(k)$ is different from that of $b_j(k)$ derived in Chapter 1, the scalar time-discrete observation can be written in the same way [3, 4]:
$$y_{m,d}(i) = \sum_{j=1}^{J_d} \sum_{k=0}^{P-1} f_{m,d}(k+1)\, b_{j,d}(i-k) + v_m(i), \qquad m = 1, \ldots, \beta; \quad i = 1, 2, \ldots \tag{4.5}$$

where $f_{m,d}(k+1)$ are the time-invariant channel coefficients and $v_m(i)$ are samples of additive white Gaussian noise. The resulting scalar observation model (4.5) describes both ISI and MAI. If we replicate this structure for L interfering cells, we end up with the system model presented in Figure 4.2. The overall scalar observation is given by summing the contributions of the intracell signal and of the interfering cell signals as follows:

$$y_m(i) = y_{m,d}(i) + \sum_{l=1}^{L} y_{m,l}(i) \tag{4.6}$$
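The chain (4.3) through (4.5) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the scrambling chip $z_d(n, i)$ is taken to be element $nQ_{j,d} + i$ of a length-16 sequence, the scrambling sequence, the data, and the channel taps are random placeholders, and all function names (`walsh_hadamard`, `scrambled_code`, `modulated_chips`, `observe`) are hypothetical. Noise is omitted.

```python
import numpy as np

Q_MAX = 16  # maximum spreading factor, equal to the scrambling code length

def walsh_hadamard(Q):
    """Sylvester construction; rows are Walsh-Hadamard codes (Q a power of 2)."""
    H = np.array([[1.0]])
    while Q > H.shape[0]:
        H = np.block([[H, H], [H, -H]])
    return H

def scrambled_code(w, z):
    """c[n, i] = w[i] * z[n*Q + i] over one scrambling period (assumed mapping)."""
    Q = len(w)
    return w[None, :] * z.reshape(Q_MAX // Q, Q)

def modulated_chips(a, w, z):
    """Chips b(k), k = 1..len(a)*Q, following (4.3) and (4.4)."""
    Q = len(w)
    period = Q_MAX // Q
    c = scrambled_code(w, z)
    b = np.empty(len(a) * Q, dtype=complex)
    for k in range(1, len(b) + 1):
        n = -(-k // Q)                      # ceil(k / Q), symbols counted from 1
        b[k - 1] = a[n - 1] * c[(n - 1) % period, (k - 1) % Q]
    return b

def observe(chip_streams, f):
    """Noise-free (4.5): f has shape (beta, P); returns shape (beta, n_chips)."""
    s = np.sum(chip_streams, axis=0)        # all codes share the same channel
    return np.array([np.convolve(s, fm)[:len(s)] for fm in f])

rng = np.random.default_rng(1)
z = rng.choice([-1.0, 1.0], size=Q_MAX)     # placeholder scrambling sequence
w4 = walsh_hadamard(4)[1]                   # code with Q = 4
w8 = walsh_hadamard(8)[3]                   # code with Q = 8
a4 = rng.choice([-1.0, 1.0], size=8)        # 8 symbols x 4 chips = 32 chips
a8 = rng.choice([-1.0, 1.0], size=4)        # 4 symbols x 8 chips = 32 chips
b4 = modulated_chips(a4, w4, z)
b8 = modulated_chips(a8, w8, z)
f = rng.standard_normal((2, 3))             # beta = 2 samples per chip, P = 3 taps
y = observe([b4, b8], f)
print(y.shape)
```

Two streams with different spreading factors occupy the same number of chips, so they can be summed chip by chip before the common channel, which is how the multirate case fits the same observation model.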
We build an observation vector using $\beta N$ observations $y_m(i)$ (starting at the ith chip interval and spanning N chip intervals backward):

$$\mathbf{y}^{(N)}(i) = \left[ y_1(i - N + 1), \ldots, y_\beta(i - N + 1), \ldots, y_1(i), \ldots, y_\beta(i) \right]^T \tag{4.7}$$

It is easy to show [3, 4] that in this multiple-cell scenario, the most general structured expressions for vector (4.7) are given by
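[Figure 4.2 block diagram; labels: intracell signal ("d") with data $a_{1,d}(n), \ldots, a_{J_d,d}(n)$ and channel $h_d(t)$; intercell signals 1 to L with data $a_{1,1}(n), \ldots, a_{J_1,1}(n)$ through $a_{1,L}(n), \ldots, a_{J_L,L}(n)$ and channels $h_1(t), \ldots, h_L(t)$; received signal $y(t)$; noise $v(t)$; front-end filter $q(t)$; sampling rate $\beta/T_c$; samples $y_m(i)$.]
Figure 4.2 Model of the multiple cell system considered.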
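$$\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J_d} \mathbf{F}_d^{(N)}\,\mathbf{C}_{j,d}^{(N)}(i)\,\mathbf{a}_{j,d}^{(N)}(i) + \sum_{l=1}^{L} \sum_{j=1}^{J_l} \mathbf{F}_l^{(N)}\,\mathbf{C}_{j,l}^{(N)}(i)\,\mathbf{a}_{j,l}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{4.8}$$

$$= \sum_{j=1}^{J_d} \mathbf{F}_d^{(N)}\,\mathbf{b}_{j,d}^{(N)}(i) + \sum_{l=1}^{L} \sum_{j=1}^{J_l} \mathbf{F}_l^{(N)}\,\mathbf{b}_{j,l}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{4.9}$$

$$= \sum_{j=1}^{J_d} \mathbf{G}_{j,d}^{(N)}(i)\,\mathbf{a}_{j,d}^{(N)}(i) + \sum_{l=1}^{L} \sum_{j=1}^{J_l} \mathbf{G}_{j,l}^{(N)}(i)\,\mathbf{a}_{j,l}^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{4.10}$$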
where we identify with subscript d the desired intracell signal; the summation over l accounts for L beams of block-synchronous intercell signals; $J_d$ and $J_l$ are the number of active codes in the desired cell and in the lth interfering cell, respectively. Denoting by $*$ a subscript wildcard:

• The matrices $\mathbf{F}_*^{(N)}$ contain the channel samples of the corresponding block-synchronous beam.
• The matrices $\mathbf{C}_{j,*}^{(N)}(i)$ contain the code sequences of the jth code of the corresponding beam.
• The vectors $\mathbf{a}_{j,*}^{(N)}(i)$ contain the corresponding data.
• The vectors $\mathbf{b}_{j,*}^{(N)}(i) = \mathbf{C}_{j,*}^{(N)}(i)\,\mathbf{a}_{j,*}^{(N)}(i)$ contain the modulated chips.
• The matrices $\mathbf{G}_{j,*}^{(N)}(i) = \mathbf{F}_*^{(N)}\,\mathbf{C}_{j,*}^{(N)}(i)$ are the modified code matrices after transmission through the channel.
The noise term $\mathbf{v}^{(N)}(i)$ is AWGN. Several special cases of the above observation model can be derived. In order to simplify the notation in the rest of this section, we drop the superscript $(N)$ in all vectors and matrices.

4.1.1 Reduced Complexity Single-Cell Models

In a single-cell scenario, the intercell interference (subscript l) is absent in (4.8), (4.9), and (4.10). In Chapter 1, the intracell term (subscript d) is described in a totally structured way. In this section we present, instead, a tunable-complexity description of the intracell term. Specifically, the active intracell codes are grouped in two sets: the codes whose contribution to the observation is described in a structured way belong to a set denoted by $S_d$; the complementary set $\bar{S}_d$ contains the codes whose interference is only statistically described by a colored Gaussian noise, denoted by $\mathbf{z}(i)$. Obviously, $|S_d \cup \bar{S}_d| = J_d$, where $|\cdot|$ denotes the cardinality of a set.
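We then define a new global Gaussian colored noise term $\mathbf{n}(i) = \mathbf{z}(i) + \mathbf{v}(i)$, which accounts for both the unstructured part of the intracell interference $\mathbf{z}(i)$ and the standard AWGN noise $\mathbf{v}(i)$. The noise term $\mathbf{n}(i)$ has correlation matrix $\mathbf{R}_n(i) = \sigma_v^2 \mathbf{I} + \mathbf{R}_z(i)$ and contributes to the observation as follows:

$$\mathbf{y}(i) = \mathbf{F}_d\,\mathbf{C}_d(i)\,\mathbf{a}_d(i) + \mathbf{n}(i) \tag{4.11}$$

$$= \mathbf{F}_d\,\mathbf{b}_d(i) + \mathbf{n}(i) \tag{4.12}$$

$$= \mathbf{G}_d(i)\,\mathbf{a}_d(i) + \mathbf{n}(i) \tag{4.13}$$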
The structured part of the received signal (4.11), (4.12), and (4.13) is expressed in terms of properly defined matrices as detailed in the following:

• The $\beta N \times (P + N - 1)$-dimensional matrix $\mathbf{F}_d$ contains the channel samples.
• The $(P + N - 1) \times n_t$-dimensional matrix $\mathbf{C}_d(i) = [\ldots, \mathbf{C}_{j,d}(i), \ldots]$ $(j \in S_d)$ contains the code sequences.
• The $n_t$-dimensional vector $\mathbf{a}_d(i) = [\ldots, \mathbf{a}_{j,d}(i), \ldots]$ $(j \in S_d)$ contains the data.
• The $(N + P - 1)$-dimensional vector $\mathbf{b}_d(i) = \mathbf{C}_d(i)\,\mathbf{a}_d(i)$ contains the modulated chips.
• The $(\beta N \times n_t)$-dimensional matrix $\mathbf{G}_d(i) = \mathbf{F}_d\,\mathbf{C}_d(i)$ represents the modified code matrices.

In the extreme case in which we give a complete structured description of the intracell signal and $\mathbf{z}(i) = \mathbf{0}$, models (4.11), (4.12), and (4.13) are those employed in [5] and described in Chapter 2 (except for the presence of the scrambling sequence) [4].
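The tunable-complexity split can be sketched as follows. This is a minimal Python illustration, not the author's procedure: it assumes that the symbols of the unstructured codes are uncorrelated with unit power, so that their contribution has correlation $\mathbf{R}_z = \sum_{j \in \bar{S}_d} \mathbf{G}_{j,d}\,\mathbf{G}_{j,d}^H$. The function name `split_model` and the random per-code matrices are hypothetical.

```python
import numpy as np

def split_model(G_list, structured, sigma2):
    """Build G_d from the codes in S_d and R_n = sigma2*I + R_z from the others.

    G_list[j] is the (beta*N x n_j) modified code matrix of code j.
    Assumption: unit-power, uncorrelated symbols, so R_z = sum_j G_j G_j^H
    over the codes not listed in `structured`. `structured` must be non-empty.
    """
    rows = G_list[0].shape[0]
    G_d = np.hstack([G_list[j] for j in sorted(structured)])
    R_n = sigma2 * np.eye(rows, dtype=complex)
    for j, G_j in enumerate(G_list):
        if j not in structured:
            R_n = R_n + G_j @ G_j.conj().T
    return G_d, R_n

# Three active codes with 2, 1, and 1 symbols in the window (placeholder values)
rng = np.random.default_rng(7)
rows = 16
G_list = [rng.standard_normal((rows, n)) + 1j * rng.standard_normal((rows, n))
          for n in (2, 1, 1)]

G_full, R_full = split_model(G_list, structured={0, 1, 2}, sigma2=0.1)
G_part, R_part = split_model(G_list, structured={0}, sigma2=0.1)
print(G_full.shape, G_part.shape)
```

Moving a code from the structured set to the unstructured one shrinks $\mathbf{G}_d(i)$ by the corresponding columns and adds a rank-limited term to the noise correlation matrix.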
4.1.2 Multiple-Cell Models

The models in (4.8), (4.9), and (4.10) can be simplified in terms of notation. In accordance with [3, 4], we introduce two new multiple-cell models. The first model envisions a complete structured description $(|S_d| = J_d)$ of the intracell signal and deploys a statistical description of the intercell interference. It is given by
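$$\mathbf{y}(i) = \mathbf{F}_d\,\mathbf{C}_d(i)\,\mathbf{a}_d(i) + \mathbf{n}(i) \tag{4.14}$$

$$= \mathbf{F}_d\,\mathbf{b}_d(i) + \mathbf{n}(i) \tag{4.15}$$

$$= \mathbf{G}_d(i)\,\mathbf{a}_d(i) + \mathbf{n}(i) \tag{4.16}$$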
where, in this case, the overall noise term $\mathbf{n}(i) \triangleq \mathbf{v}(i) + \mathbf{z}(i)$, with correlation matrix $\mathbf{R}_n(i) = \sigma_v^2 \mathbf{I} + \mathbf{R}_z(i)$, includes thermal noise $\mathbf{v}(i)$ and intercell interference $\mathbf{z}(i)$.

The second model [3, 4] is useful when we know all interference parameters, allowing for a fully structured interference mitigation. Recalling that intercell interference consists of the superposition of a few asynchronous yet block-synchronous CDMA signals, we can derive a simplified version of models (4.8), (4.9), and (4.10):
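$$\mathbf{y}(i) = \mathbf{F}_d\,\mathbf{C}_d(i)\,\mathbf{a}_d(i) + \sum_{l=1}^{L} \mathbf{F}_l\,\mathbf{C}_l(i)\,\mathbf{a}_l(i) + \mathbf{v}(i) \tag{4.17}$$

$$= \mathbf{F}_d\,\mathbf{b}_d(i) + \sum_{l=1}^{L} \mathbf{F}_l\,\mathbf{b}_l(i) + \mathbf{v}(i) \tag{4.18}$$

$$= \mathbf{G}_d(i)\,\mathbf{a}_d(i) + \sum_{l=1}^{L} \mathbf{G}_l(i)\,\mathbf{a}_l(i) + \mathbf{v}(i) \tag{4.19}$$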
where for the construction of $\mathbf{C}_d(i)$ and $\mathbf{a}_d(i)$, we have assumed $|S_d| = J_d$. Similarly, we have built $\mathbf{C}_l(i)$ and $\mathbf{a}_l(i)$ assuming $|S_l| = J_l$. Observation models (4.17), (4.18), and (4.19) are related to (4.14), (4.15), and (4.16) as follows:

$$\mathbf{z}(i) = \sum_{l=1}^{L} \mathbf{F}_l\,\mathbf{b}_l(i) = \sum_{l=1}^{L} \mathbf{F}_l\,\mathbf{C}_l(i)\,\mathbf{a}_l(i) = \sum_{l=1}^{L} \mathbf{G}_l(i)\,\mathbf{a}_l(i) \tag{4.20}$$
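By using the definitions of [3, 4],

$$n_t \triangleq \left( n_{t,d} + \sum_{l=1}^{L} n_{t,l} \right) \tag{4.21}$$

$$\mathbf{F} \triangleq [\mathbf{F}_d, \mathbf{F}_1, \ldots, \mathbf{F}_L], \qquad \beta N \times (L + 1)(P + N - 1) \tag{4.22}$$

$$\mathbf{b}(i) \triangleq \left[ \mathbf{b}_d^T(i), \mathbf{b}_1^T(i), \ldots, \mathbf{b}_L^T(i) \right]^T, \qquad (L + 1)(N + P - 1) \times 1 \tag{4.23}$$

$$\mathbf{a}(i) \triangleq \left[ \mathbf{a}_d^T(i), \mathbf{a}_1^T(i), \ldots, \mathbf{a}_L^T(i) \right]^T, \qquad n_t \times 1 \tag{4.24}$$

$$\mathbf{C}(i) \triangleq \mathrm{diag}[\mathbf{C}_d(i), \mathbf{C}_1(i), \ldots, \mathbf{C}_L(i)], \qquad (L + 1)(P + N - 1) \times n_t \tag{4.25}$$

$$\mathbf{G}(i) \triangleq [\mathbf{G}_d(i), \mathbf{G}_1(i), \ldots, \mathbf{G}_L(i)], \qquad \beta N \times n_t \tag{4.26}$$

we can derive the following, extremely compact, notation:

$$\mathbf{y}(i) = \mathbf{F}\,\mathbf{b}(i) + \mathbf{v}(i) \tag{4.27}$$

$$= \mathbf{F}\,\mathbf{C}(i)\,\mathbf{a}(i) + \mathbf{v}(i) \tag{4.28}$$

$$= \mathbf{G}(i)\,\mathbf{a}(i) + \mathbf{v}(i) \tag{4.29}$$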
where $\mathbf{F}$, $\mathbf{b}(i)$, $\mathbf{a}(i)$, $\mathbf{C}(i)$, and $\mathbf{G}(i)$ take into account the parameters of all the interfering cells including the desired one. This model describes, in a compact but structured notation, both the desired intracell signal and the intercell interference.
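The stacking in (4.22) through (4.26) can be checked with a small Python sketch. It assumes NumPy; the helper names `block_diag` and `compact_model`, the sizes, and the random per-cell matrices are illustrative placeholders. The check is that the compact forms (4.28) and (4.29) reproduce the noise-free part of the per-cell sum in (4.17).

```python
import numpy as np

def block_diag(blocks):
    """Block-diagonal matrix built from a list of 2-D arrays."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def compact_model(F_list, C_list, a_list):
    """Stack per-cell quantities (desired cell first) as in (4.22)-(4.26)."""
    F = np.hstack(F_list)                                        # (4.22)
    C = block_diag(C_list)                                       # (4.25)
    a = np.concatenate(a_list)                                   # (4.24)
    G = np.hstack([Fl @ Cl for Fl, Cl in zip(F_list, C_list)])   # (4.26)
    return F, C, a, G

# Desired cell plus L = 1 interfering cell; beta = 2, N = 8, P = 3 (placeholders)
rng = np.random.default_rng(11)
beta, N, P = 2, 8, 3
n_syms = (4, 3)                                # n_{t,d} and n_{t,1}
F_list = [rng.standard_normal((beta * N, P + N - 1)) for _ in n_syms]
C_list = [rng.choice([-1.0, 1.0], size=(P + N - 1, n)) for n in n_syms]
a_list = [rng.choice([-1.0, 1.0], size=n) for n in n_syms]

F, C, a, G = compact_model(F_list, C_list, a_list)
per_cell = sum(Fl @ Cl @ al for Fl, Cl, al in zip(F_list, C_list, a_list))
print(np.allclose(F @ C @ a, per_cell), np.allclose(G @ a, per_cell))
```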
4.2 Extended Linear Receivers

Since multiple codes may be assigned to the same user, the receiver should be able to detect the data carried by a subset of active codes. In this section we extend the application of the four types of linear detectors to the multiple-cell environment [6]. The receiver operates assuming that all the parameters (spreading codes, channel coefficients) in the "structured" part of the observation $\mathbf{y}(i)$ are known or estimated, while the "unstructured" part of the observation (noise in a wide sense) has a known or estimated correlation matrix.

The task of the detectors is the following: given the observation vector $\mathbf{y}(i)$, expressed by any of the single-cell models (4.11), (4.12), (4.13) or multiple-cell models (4.14), (4.15), (4.16) or (4.27), (4.28), (4.29), find the best linear detector that detects all the data on which $\mathbf{y}(i)$ depends [4]. Two types of interference arise in the system under consideration: ISI and MAI, due to the deployment of oversampling and to the presence of multipath propagation. ISI can be alleviated by equalization, and MAI by interference mitigation; the two can be realized either jointly or separately. Formally, the detectors are identical to those presented in Chapter 3, but the underlying system model is completely different [4].
In two of the proposed multiuser detectors, we counteract both MAI and ISI simultaneously utilizing the ZF or MMSE criterion. This strategy is designed to operate with observation models (4.10), (4.16), or (4.29). Through a linear transformation described by a proper matrix $\mathbf{T}_a(i)$ (an $n_t \times \beta N$ matrix), we obtain an $n_t$-dimensional vector of soft decisions $\mathbf{x}_a(i)$ for the transmitted data $\mathbf{a}(i)$:

$$\mathbf{x}_a(i) = \mathbf{T}_a(i)\,\mathbf{y}(i) \tag{4.30}$$
Two other schemes are designed to operate with observation models (4.8), (4.15), or (4.27), and they mitigate ISI only (using the ZF or MMSE criterion) in order to reduce the complexity. Using a different transformation $\mathbf{T}_b(i)$ (a $(P + N - 1) \times \beta N$ matrix), we attempt to restore the orthogonality of the codes by constructing an $(N + P - 1)$-dimensional vector of soft decisions $\mathbf{x}_b(i)$ for the chip sequence $\mathbf{b}(i)$:

$$\mathbf{x}_b(i) = \mathbf{T}_b(i)\,\mathbf{y}(i) \tag{4.31}$$
Then, an additional linear transformation is required to mitigate the MAI. This step is a conventional detection scheme $\mathbf{x}_a(i) = \mathbf{C}^H(i)\,\mathbf{x}_b(i)$, where H means "transposition and complex conjugation." Conventional detection relies on the fact that $\mathbf{C}^H(i)\,\mathbf{C}(i) = \mathbf{I}_{n_t}$ due to the orthonormality of the Walsh-Hadamard codes. Orthogonality is satisfied apart from some possible border effects [5], as already pointed out in Chapter 3. Eventually, a symbol-by-symbol decision device produces hard decisions for the transmitted data, $\hat{\mathbf{a}}(i) = \mathrm{quant}[\mathbf{x}_a(i)]$.

In the following we present the four detection algorithms, which are formulated as if the data of all the cells that are included in the structured part of the observation must be detected. If this is not the case, the linear transformation can be redefined for an appropriate subspace. For example, if we need to detect the intracell codes only, maintaining a structured intercell interference mitigation, only the relevant FIR filters of the desired codes (the appropriate rows of the matrix $\mathbf{T}(i)$ that describes the linear detector) need to be used. All the detection schemes presented below require the Cholesky decomposition $\mathbf{R}_n(i) = \mathbf{L}(i)\,\mathbf{L}^H(i)$ of the noise covariance matrix.

In the following we only provide a summary of the receiver features, since the statement of the problems is identical to those of Chapter 3. Nevertheless, since the detectors operate in the presence of possibly nonwhite Gaussian noise, the results of Chapter 3 are slightly modified. Referring to [3, 4], for each detector, we provide the associated linear transformation $\mathbf{T}(i)$ and the resulting expression for the soft decision.
It is well known that, in this case, the zero-forcing detector for both MAI and ISI (ZF-MI) performs the following minimization:

$$\mathbf{x}_{ZF\text{-}MI}(i) = \arg\min_{\mathbf{x}(i)} \left\| \mathbf{L}^{-1}(i)\left( \mathbf{y}(i) - \mathbf{G}(i)\,\mathbf{x}(i) \right) \right\|^2 \tag{4.32}$$
This detector uses the linear transformation described by

$$\mathbf{T}_{ZF\text{-}MI}(i) = [\mathbf{L}^{-1}(i)\,\mathbf{G}(i)]^{\#}\,\mathbf{L}^{-1}(i) \tag{4.33}$$
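where $\#$ denotes the pseudo-inverse of a matrix. Soft decisions for the data are given by

$$\mathbf{x}_{ZF\text{-}MI}(i) = \mathbf{a}(i) + \mathbf{u}_{ZF\text{-}MI}(i) \tag{4.34}$$

where $\mathbf{u}_{ZF\text{-}MI}(i) \triangleq [\mathbf{L}^{-1}(i)\,\mathbf{G}(i)]^{\#}\,\mathbf{L}^{-1}(i)\,\mathbf{n}(i)$ is a new noise process.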
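A minimal Python sketch of (4.33) follows, assuming NumPy and a $\mathbf{G}(i)$ with full column rank; the function name `zf_mi`, the random matrices, and the way the colored noise correlation is generated are illustrative assumptions. Under the full-rank assumption the transformation satisfies $\mathbf{T}_{ZF\text{-}MI}(i)\,\mathbf{G}(i) = \mathbf{I}$, which is what (4.34) expresses, and it coincides with the weighted least-squares form $(\mathbf{G}^H \mathbf{R}_n^{-1} \mathbf{G})^{-1} \mathbf{G}^H \mathbf{R}_n^{-1}$.

```python
import numpy as np

def zf_mi(G, Rn):
    """T = (L^{-1} G)^# L^{-1}, with Rn = L L^H the Cholesky factorization."""
    L = np.linalg.cholesky(Rn)             # lower-triangular factor of Rn
    Linv_G = np.linalg.solve(L, G)         # whitened code matrix L^{-1} G
    return np.linalg.pinv(Linv_G) @ np.linalg.inv(L)

# Placeholder sizes and random matrices (assumptions)
rng = np.random.default_rng(13)
rows, n_t = 16, 5
G = rng.standard_normal((rows, n_t)) + 1j * rng.standard_normal((rows, n_t))
Z = rng.standard_normal((rows, 3))
Rn = 0.1 * np.eye(rows) + Z @ Z.T          # colored noise: white part plus rank-3 part

T = zf_mi(G, Rn)
a = rng.choice([-1.0, 1.0], size=n_t)
x = T @ (G @ a)                            # noise-free observation
print(np.allclose(x, a))
```

With noise present, the soft decision is the data plus the filtered noise $\mathbf{u}_{ZF\text{-}MI}(i)$, with no residual ISI or MAI.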
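The zero-forcing detector for ISI only (ZF-I) performs, as a first step, the following minimization:

$$\mathbf{x}'_{ZF\text{-}I}(i) = \arg\min_{\mathbf{x}'(i)} \left\| \mathbf{L}^{-1}(i)\left( \mathbf{y}(i) - \mathbf{F}\,\mathbf{x}'(i) \right) \right\|^2 \tag{4.35}$$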
Hence, this detector uses the preliminary linear transformation described by

$$\mathbf{T}_{ZF\text{-}I}(i) = [\mathbf{L}^{-1}(i)\,\mathbf{F}]^{\#}\,\mathbf{L}^{-1}(i) \tag{4.36}$$
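Preliminary soft decisions for the chips are given by

$$\mathbf{x}'_{ZF\text{-}I}(i) = \mathbf{b}(i) + \mathbf{u}'_{ZF\text{-}I}(i) \tag{4.37}$$

where $\mathbf{u}'_{ZF\text{-}I}(i) \triangleq [\mathbf{L}^{-1}(i)\,\mathbf{F}]^{\#}\,\mathbf{L}^{-1}(i)\,\mathbf{n}(i)$ is a new noise process. After conventional detection, we obtain the soft decisions for the data

$$\mathbf{x}_{ZF\text{-}I}(i) = \mathbf{a}(i) + \mathbf{u}_{ZF\text{-}I}(i) \tag{4.38}$$

where $\mathbf{u}_{ZF\text{-}I}(i) \triangleq \mathbf{C}^H(i)\,\mathbf{u}'_{ZF\text{-}I}(i)$.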
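The two-stage ZF-I receiver of (4.36) through (4.38) can be sketched in the same way. The Python fragment below is illustrative only: it assumes a tall channel matrix $\mathbf{F}$ with full column rank (which requires oversampling, here $\beta = 2$) and uses a random matrix with orthonormal columns as a stand-in for the code matrix $\mathbf{C}(i)$, so that $\mathbf{C}^H(i)\,\mathbf{C}(i) = \mathbf{I}$ holds exactly; the name `zf_i` is hypothetical.

```python
import numpy as np

def zf_i(F, C, Rn):
    """Chip-level ZF equalizer (4.36) followed by conventional detection C^H."""
    L = np.linalg.cholesky(Rn)                                    # Rn = L L^H
    T_b = np.linalg.pinv(np.linalg.solve(L, F)) @ np.linalg.inv(L)
    return C.conj().T @ T_b                                       # x_a = C^H x_b

# beta = 2, N = 8, P = 3: F is 16 x 10 (placeholder random channel matrix)
rng = np.random.default_rng(17)
beta, N, P, n_t = 2, 8, 3, 4
F = rng.standard_normal((beta * N, P + N - 1))
C, _ = np.linalg.qr(rng.standard_normal((P + N - 1, n_t)))       # orthonormal columns
Z = rng.standard_normal((beta * N, 2))
Rn = 0.05 * np.eye(beta * N) + Z @ Z.T

T = zf_i(F, C, Rn)
a = rng.choice([-1.0, 1.0], size=n_t)
x = T @ (F @ C @ a)                                               # noise-free observation
print(np.allclose(x, a))
```

The pseudo-inverse here acts on a matrix with $P + N - 1$ columns regardless of the number of active codes, which is the source of the complexity reduction with respect to ZF-MI.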
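The MMSE detector for mitigation of both MAI and ISI (MMSE-MI) performs the following minimization:

$$\mathbf{T}_{MMSE\text{-}MI}(i) = \arg\min_{\mathbf{T}(i)} \; E_{\mathbf{a}(i),\mathbf{v}(i)}\left\{ \|\mathbf{a}(i) - \mathbf{T}(i)\,\mathbf{y}(i)\|^2 \right\} \tag{4.39}$$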
obtaining the following linear transformation:

$$\mathbf{T}_{MMSE\text{-}MI}(i) = [\mathbf{G}^H(i)\,\mathbf{R}_n^{-1}(i)\,\mathbf{G}(i) + \mathbf{R}_a^{-1}(i)]^{-1}\,\mathbf{G}^H(i)\,\mathbf{R}_n^{-1}(i) \tag{4.40}$$
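where $\mathbf{R}_a(i) = E[\mathbf{a}(i)\,\mathbf{a}^H(i)]$. The soft decisions for the data, split into a useful term (the diagonal part of $\mathbf{W}_{MI}(i)$, denoted by $\mathrm{diag}[\cdot]$) and a residual interference term (the off-diagonal part, denoted by $\overline{\mathrm{diag}}[\cdot]$), are given by

$$\mathbf{x}_{MMSE\text{-}MI}(i) = \mathrm{diag}[\mathbf{W}_{MI}(i)]\,\mathbf{a}(i) + \overline{\mathrm{diag}}[\mathbf{W}_{MI}(i)]\,\mathbf{a}(i) + \mathbf{u}_{MMSE\text{-}MI}(i) \tag{4.41}$$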
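where

$$\mathbf{W}_{MI}(i) = \left[ \mathbf{I}_{n_t} + \left( \mathbf{R}_a(i)\,\mathbf{G}^H(i)\,\mathbf{R}_n^{-1}(i)\,\mathbf{G}(i) \right)^{-1} \right]^{-1} \tag{4.42}$$

is a Wiener estimator and

$$\mathbf{u}_{MMSE\text{-}MI}(i) \triangleq [\mathbf{G}^H(i)\,\mathbf{R}_n^{-1}(i)\,\mathbf{G}(i) + \mathbf{R}_a^{-1}(i)]^{-1}\,\mathbf{G}^H(i)\,\mathbf{R}_n^{-1}(i)\,\mathbf{n}(i) \tag{4.43}$$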
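is the new noise process.

The MMSE detector for mitigation of ISI only (MMSE-I) performs first the minimization

$$\mathbf{T}_{MMSE\text{-}I}(i) = \arg\min_{\mathbf{T}(i)} \; E_{\mathbf{b}(i),\mathbf{v}(i)}\left\{ \|\mathbf{b}(i) - \mathbf{T}(i)\,\mathbf{y}(i)\|^2 \right\} \tag{4.44}$$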
obtaining the linear transformation described by

$$\mathbf{T}_{MMSE\text{-}I}(i) = [\mathbf{F}^H\,\mathbf{R}_n^{-1}(i)\,\mathbf{F} + \mathbf{R}_b^{-1}(i)]^{-1}\,\mathbf{F}^H\,\mathbf{R}_n^{-1}(i) \tag{4.45}$$
where $\mathbf{R}_b(i) = E[\mathbf{b}(i)\,\mathbf{b}^H(i)] = \mathbf{C}(i)\,\mathbf{R}_a(i)\,\mathbf{C}^H(i)$. Preliminary soft decisions for the chips are given by
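$$\mathbf{x}'_{\mathrm{MMSE\text{-}I}}(i) = \mathbf{W}_{\mathrm{I}}(i)\,\mathbf{b}(i) + \mathbf{u}'_{\mathrm{MMSE\text{-}I}}(i) \tag{4.46}$$

where

$$\mathbf{W}_{\mathrm{I}}(i) = \left[\mathbf{I}_{N+P-1} + \left(\mathbf{R}_b(i)\,\mathbf{F}^{H}\mathbf{R}_n^{-1}(i)\,\mathbf{F}\right)^{-1}\right]^{-1} \tag{4.47}$$

is a Wiener estimator, and

$$\mathbf{u}'_{\mathrm{MMSE\text{-}I}}(i) \triangleq \left[\mathbf{F}^{H}\mathbf{R}_n^{-1}(i)\,\mathbf{F} + \mathbf{R}_b^{-1}(i)\right]^{-1}\mathbf{F}^{H}\mathbf{R}_n^{-1}(i)\,\mathbf{n}(i) \tag{4.48}$$

is a new noise process. After conventional detection, we obtain the following soft decisions for the data:

$$\mathbf{x}_{\mathrm{MMSE\text{-}I}}(i) = \mathrm{diag}\left(\mathbf{W}_{\mathrm{I}}(i)\right)\mathbf{a}(i) + \mathbf{C}^{H}(i)\,\overline{\mathrm{diag}}\left(\mathbf{W}_{\mathrm{I}}(i)\right)\mathbf{C}(i)\,\mathbf{a}(i) + \mathbf{u}_{\mathrm{MMSE\text{-}I}}(i) \tag{4.49}$$

where $\mathbf{u}_{\mathrm{MMSE\text{-}I}}(i) \triangleq \mathbf{C}^{H}(i)\,\mathbf{u}'_{\mathrm{MMSE\text{-}I}}(i)$ is a new noise process.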
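The link between the MMSE transformation (4.40) and the Wiener estimator (4.42) can be checked numerically: applying $\mathbf{T}_{\mathrm{MMSE\text{-}MI}}$ to the noise-free observation $\mathbf{G}\mathbf{a}$ gives $\mathbf{W}_{\mathrm{MI}}\mathbf{a}$, whose diagonal and off-diagonal parts are the two data terms of (4.41). The following is a minimal sketch under simplifying assumptions (white noise, uncorrelated unit-power symbols, a random system matrix); the function names and numerical values are hypothetical.

```python
import numpy as np

def mmse_transform(G, R_n, R_a):
    """T = [G^H R_n^{-1} G + R_a^{-1}]^{-1} G^H R_n^{-1}, as in (4.40)."""
    GhRn = G.conj().T @ np.linalg.inv(R_n)
    return np.linalg.solve(GhRn @ G + np.linalg.inv(R_a), GhRn)

def wiener_estimator(G, R_n, R_a):
    """W = [I + (R_a G^H R_n^{-1} G)^{-1}]^{-1}, as in (4.42)."""
    M = G.conj().T @ np.linalg.inv(R_n) @ G
    return np.linalg.inv(np.eye(R_a.shape[0]) + np.linalg.inv(R_a @ M))

rng = np.random.default_rng(0)
n_obs, n_sym = 12, 4                       # illustrative sizes (assumption)
G = rng.standard_normal((n_obs, n_sym)) + 1j * rng.standard_normal((n_obs, n_sym))
R_n = 0.1 * np.eye(n_obs)                  # white noise, hypothetical variance
R_a = np.eye(n_sym)                        # uncorrelated unit-power symbols

T = mmse_transform(G, R_n, R_a)
W = wiener_estimator(G, R_n, R_a)

useful = np.diag(np.diag(W))               # diag[W]: gain on each desired symbol
residual = W - useful                      # off-diagonal part: residual interference
```

Unlike the zero-forcing case, `residual` is not zero: the MMSE detector tolerates some residual interference in exchange for lower noise enhancement.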
4.3 Sliding Window Formulation

Finite complexity formulations of the detectors proposed in the previous section are necessary in order to employ them in practical receivers. The sliding window formulation presented in Chapter 3 applies. The guiding principle supporting the sliding window formulation is that the block minimization accomplished by a linear detector can use a finite window, because the vector $\mathbf{y}^{(N)}(i)$ has stationary statistics if $N \geq Q_{\max}$, where $Q_{\max}$ is the maximum potential spreading factor (i.e., the length of the BS-specific scrambling sequences). The width of the observation window is set to $(Q_{\max} + P - 1)$ chip intervals, because this size allows us to collect all samples carrying useful energy for detection of one symbol of the slowest stream. As in [5], we now define

$$\mathbf{y}_{D}(k) \triangleq \mathbf{y}^{(Q_{\max}+P-1)}(kQ_{\max} + P - 1) \tag{4.50}$$

$$\mathbf{a}_{D}(k) \triangleq \mathbf{a}^{(Q_{\max}+P-1)}(kQ_{\max} + P - 1) \tag{4.51}$$

$$\mathbf{T}_{*,D}(k) \triangleq \mathbf{T}_{*}^{(Q_{\max}+P-1)}(kQ_{\max} + P - 1) \tag{4.52}$$

where $k = 1, 2, \ldots, K$ is an index ticking with the rate of the BS-specific scrambling code, and $*$ is a wildcard, indicating that the sliding window algorithm holds for any of the linear detectors proposed. The total number of potential decision variables for all symbols embraced by the processing window $\mathbf{y}_{D}(k)$ is $n_t$. We can then derive the $n_t$-dimensional vector $\mathbf{x}_{a,D}(k)$ as

$$\mathbf{x}_{a,D}(k) = \mathbf{T}_{*,D}\,\mathbf{y}_{D}(k), \qquad k = 1, 2, \ldots \tag{4.53}$$

For the sake of clarity, note that the argument of $\mathbf{y}(\cdot)$ runs over chip intervals, while that of $\mathbf{y}_{D}(\cdot)$, $\mathbf{a}_{D}(\cdot)$, and $\mathbf{x}_{a,D}(\cdot)$ runs over the detection step (which ticks at the rate of the BS-specific scrambling code). For a given processing window $\mathbf{y}_{D}(k)$, there are code-specific central subvectors (with length $Q_{\max}/Q_{j,d}$) of $\mathbf{x}_{a,D}(k)$, which yield an optimal decision variable for the corresponding symbols of the data stream carried by the $j$th code. The other symbols are optimally detected in the following or preceding processing window. This choice was validated in [5].
4.4 Typical Operating Modes of the Linear Receivers

Some level of intracell interference mitigation is always “on,” while the intercell interference is treated differently according to some criteria (e.g., its power level $\sigma_z^2$ compared to the whole signal power $\sigma_y^2$). Some examples of the operating modes are identified and described in [3, 4]. In particular, depending on the level of the intercell interference power and the complexity of the model, we can distinguish between four types of receivers, as detailed in the following:

1. Structured intracell interference mitigation: This is the case of a single-cell situation in a strict or approximate sense, for which the observation models (4.11), (4.12), and (4.13) apply. The data part of the observation is completely structured ($|\mathcal{S}_d| = J_d$), yielding $\mathbf{n} = \mathbf{v}$.

2. Unstructured intracell interference mitigation: For the single-cell scenario, it is based on a purely statistical description of the intracell interference. In fact, the codes that are not intended for the MS considered are modeled as colored noise whose covariance matrix is evaluated in closed form (we have explicit knowledge of all these codes; see Section 4.1). The observation models (4.11), (4.12), and (4.13) apply again, and the receiver of Section 4.2 can be used.

3. Structured intracell and unstructured intercell interference mitigation: This strategy can be adopted when the intercell interference cannot be neglected. In this case, the observation models (4.14), (4.15), and (4.16) apply, and the global noise term is modeled as colored Gaussian noise whose covariance matrix $\mathbf{R}_n$ is either estimated or evaluated in closed form. We can devise a simplified version of the algorithm by modeling the intercell interference as plain white Gaussian noise whose power is added to the thermal noise power.

4. Structured intracell and intercell interference mitigation: This strategy is typical of the soft hand-over procedure. The observation model is given by the block-synchronous observations (4.27), (4.28), and (4.29). The receiver acts as a multicell direct interference suppressor of both the interference coming from the original cell and that coming from the hand-over candidate cell(s). The receiver is able to detect the data of both the desired and the interfering cell.
4.5 Multiuser Detection Versus Interference Mitigation

In this chapter we have presented the model of a CDMA multiple-cell radio mobile system and dealt with the extension of linear detection to this system. Specifically, we have highlighted two ways to counteract interference at different levels of abstraction. By employing a structured description of the data-like interference in the system model, the receiver has the potential ability to detect all relevant data. For those codes that are actually detected, the receiver acts as a multiuser detector; for the other codes, which are included in the structured description of the observation but not detected, the receiver acts as an interference mitigator. The roughest way to realize an interference mitigator, however, is to describe the whole interference term (i.e., the codes that do not need to be detected) as a Gaussian process with known correlation matrix. As will be shown in Chapter 6, this last unstructured description of the interference yields a poor BER performance in a multiple-cell environment.
References

[1] 3rd Generation Partnership Project (3GPP), Radio Interface Technical Specifications, ftp://ftp.3gpp.org.

[2] Verdú, S., Multiuser Detection, New York: Cambridge University Press, 1998.

[3] Castoldi, P., and H. Kobayashi, “Intracell and Intercell Interference Mitigation Detectors for Multirate Transmission in TD-CDMA 3G Systems,” Africom 2001, South Africa, May 2001.

[4] Castoldi, P., and H. Kobayashi, “Co-channel Interference Mitigation Detectors for Multirate Transmission in TD-CDMA Systems,” IEEE Journal on Selected Areas in Communications (special issue on multiuser detection for the wireless channel), Vol. 12, No. 2, February 2002.

[5] Castoldi, P., and H. Kobayashi, “Low Complexity Group Detectors for Multirate Transmission in TD-CDMA 3G Systems,” Wireless Broadband Symposium (Globecom 2000), San Francisco, CA, November 2000.

[6] Holma, H., et al., “Interference Considerations for the Time Division Duplex Mode of the UMTS Terrestrial Radio Access,” IEEE Journal on Selected Areas in Communications, Vol. 18, No. 8, August 2000, pp. 1386–1393.
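Appendix 4A: Exact Expression of the Correlation Matrix $\mathbf{R}_n(i)$

If we assume that the structure of the interference is perfectly known, we can evaluate the autocorrelation of the global interference as follows:

$$\mathbf{R}_n(i) = E\left[\left(\sum_{l=1}^{L}\mathbf{F}_l\,\mathbf{b}_l(i) + \mathbf{v}(i)\right)\left(\sum_{k=1}^{L}\mathbf{F}_k\,\mathbf{b}_k(i) + \mathbf{v}(i)\right)^{H}\right] = \sum_{l=1}^{L}\sum_{k=1}^{L}\mathbf{F}_l\,\mathbf{R}_{b_{l,k}}(i)\,\mathbf{F}_k^{H} + \sigma_v^{2}\,\mathbf{I}_{bN} \tag{4A.1}$$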
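where $\mathbf{R}_{b_{l,k}}(i) \triangleq E\left[\mathbf{b}_l(i)\,\mathbf{b}_k^{H}(i)\right]$ can be evaluated in closed form. After some manipulations, we obtain

$$\mathbf{R}_n(i) = \sum_{l=1}^{L}\sum_{k=1}^{L}\left\{\begin{array}{l}\sigma_{a_l}^{2}\,\mathbf{F}_l\,\mathbf{C}_l(i)\,\mathbf{C}_l^{H}(i)\,\mathbf{F}_l^{H}\,\delta_{l,k}\\[2pt] \mathbf{F}_l\,\mathbf{b}_l(i)\,\mathbf{b}_k^{H}(i)\,\mathbf{F}_k^{H}\end{array}\right\} + \sigma_v^{2}\,\mathbf{I}_{bN} \approx \sum_{l=1}^{L}\left\{\begin{array}{l}\sigma_{a_l}^{2}\,\mathbf{F}_l\,\mathbf{F}_l^{H}\\[2pt] \mathbf{F}_l\,\mathbf{b}_l(i)\,\mathbf{b}_k^{H}(i)\,\mathbf{F}_k^{H}\end{array}\right\} + \sigma_v^{2}\,\mathbf{I}_{bN} \tag{4A.2}$$

where $\delta_{l,k}$ is the Kronecker delta. The upper line of (4A.2) accounts for cross-correlation of data-level described interferers; the second line accounts for cross-correlation of chip-level described interferers.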
5 Adaptive Linear Multiuser Detection

This chapter contains the fundamentals of CDMA adaptive linear multiuser detection. It is well known from detection theory that the adaptive approach is a convenient choice for detection in uncertain environments [1–5]. Uncertainty may have different causes: time-varying parameters are one example, but unknown parameters that do not vary in time may also create problems in the receiver design. In these cases an adaptive approach is a solution to the problem of detection. Adaptivity involves a transient phase, which is followed by steady-state operation if convergence is ensured. Otherwise, the detector diverges and detection becomes highly unreliable.

Two classes of CDMA adaptive detectors are presented in this chapter, namely the trained and blind detectors. We point out that the terms trained and blind refer to the presence or the absence of some knowledge regarding the transmitted data symbols. The trained detector is a robust adaptive detector that does not require knowledge of the spreading code of the desired user (i.e., it is “blind” with respect to this parameter). Yet, the training sequence enables the receiver to learn the spreading code. After the end of the training sequence, the detector is driven by its own decisions. The blind detector is a powerful adaptive detector that does not require any preliminary information about the data sequence. On the other hand, it requires an accurate knowledge of the spreading code of the desired user (i.e., it is “trained” with respect to this parameter). In fact, the knowledge of the spreading sequence allows the detector to realize the despreading with a short transient phase.

The derivation in this chapter refers to the system model of the asynchronous CDMA system presented in Chapter 1. The performance assessment and comparisons with nonadaptive counterparts are presented in Chapter 6.
5.1 Adaptive MMSE Receivers

We have seen in Chapter 3 that in order to use the optimal MMSE receiver, we need a structured knowledge of the interference (phase and frequency offset of all users, delays and amplitudes of all users), which is not always available at the receiver side. Furthermore, MMSE detection requires the inversion of a matrix of large size, which is computationally expensive. Adaptive MMSE detection is a convenient alternative to optimal MMSE detection with a reduced computational burden and an acceptable receiver performance. Optimal MMSE detection based on the sliding window algorithm is realized by a priori evaluation of the matrix $\mathbf{T}_{D,\mathrm{MMSE}}(m)$. Using a proper set of rows of the matrix, a bank of optimal MMSE FIR filters $\mathbf{t}_{j,\mathrm{MMSE}}(m)$ ($j = 1, 2, \ldots, J$) can be built. From now on we will focus on the detection of the data stream associated with one code, because in the adaptive case the iterative search for the minimum of the mean square error function can be accomplished independently user by user [1].
5.2 Trained Adaptive MMSE Receiver

In the evaluation of matrix $\mathbf{T}_{D,\mathrm{MMSE}}(m)$ (see Chapter 3) we use matrix $\mathbf{G}_{D}(m)$, which depends on $\mathbf{A}_{D}(m)$, $\mathbf{P}_{D}$, and $\mathbf{C}_{D}$, reported below for convenience:

$$\mathbf{G}_{D}(m) = \mathbf{A}_{D}(m)\,\mathbf{P}_{D}\,\mathbf{C}_{D} \tag{5.1}$$

Hence, the MMSE optimal solution requires, beyond a matrix inversion, the knowledge of amplitude $A_j$, phase offset $\phi_j$, frequency offset $f_j$, spreading codes, and delays $\tau_j$, for all users ($j = 1, 2, \ldots, J$). The trained MMSE detector does not require any matrix inversion, but only a rough knowledge of the timing (with uncertainty of a few chip intervals) of the desired data stream(s). Unlike the optimal MMSE receiver, it requires a known training sequence at the receiver side, used to train the receiver at the beginning of the detection process. The cost function to be minimized is

$$J(\hat{\mathbf{t}}_1) = E\left\{\left|a_1(m-1) - \hat{\mathbf{t}}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} \tag{5.2}$$

The steepest descent algorithm for the iterative search of the minimum of function (5.2) is

$$\hat{\mathbf{t}}_1(m+1) = \hat{\mathbf{t}}_1(m) - \mu \left.\frac{\partial J(\hat{\mathbf{t}}_1)}{\partial \hat{\mathbf{t}}_1}\right|_{\hat{\mathbf{t}}_1 = \hat{\mathbf{t}}_1(m)} \tag{5.3}$$
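where $\hat{\mathbf{t}}_1(m)$ is the receiver FIR filter at the $m$th detection step. Defining

$$e_1(m) = a_1(m-1) - \hat{\mathbf{t}}_1^{H}(m)\,\mathbf{y}_{D}(m) \tag{5.4}$$

we have that

$$\frac{\partial J(\hat{\mathbf{t}}_1)}{\partial \hat{\mathbf{t}}_1} = \frac{\partial E\left\{|e_1(m)|^{2}\right\}}{\partial \hat{\mathbf{t}}_1} = \frac{\partial E\left[e_1(m)\,e_1^{*}(m)\right]}{\partial \hat{\mathbf{t}}_1} \tag{5.5}$$

We can change the order of the operators $E$ and $\partial/\partial\hat{\mathbf{t}}_1$, obtaining

$$\frac{\partial J(\hat{\mathbf{t}}_1)}{\partial \hat{\mathbf{t}}_1} = E\left\{\frac{\partial\left(e_1(m)\,e_1^{*}(m)\right)}{\partial \hat{\mathbf{t}}_1}\right\} = E\left\{\frac{\partial e_1(m)}{\partial \hat{\mathbf{t}}_1}\,e_1^{*}(m) + e_1(m)\,\frac{\partial e_1^{*}(m)}{\partial \hat{\mathbf{t}}_1}\right\} \tag{5.6}$$

It is easy to show [6]

$$\frac{\partial e_1^{*}(m)}{\partial \hat{\mathbf{t}}_1} = \frac{\partial\left[a_1(m-1) - \hat{\mathbf{t}}_1^{H}\,\mathbf{y}_{D}(m)\right]^{*}}{\partial \hat{\mathbf{t}}_1} = \mathbf{0} \tag{5.7}$$

while

$$\frac{\partial e_1(m)}{\partial \hat{\mathbf{t}}_1} = \frac{\partial\left[a_1(m-1) - \hat{\mathbf{t}}_1^{H}\,\mathbf{y}_{D}(m)\right]}{\partial \hat{\mathbf{t}}_1} = -\mathbf{y}_{D}(m) \tag{5.8}$$

Hence, we have

$$\frac{\partial J(\hat{\mathbf{t}}_1)}{\partial \hat{\mathbf{t}}_1} = -E\left\{e_1^{*}(m)\,\mathbf{y}_{D}(m)\right\} \tag{5.9}$$

and the iterative algorithm to update the filter coefficients becomes (using (5.3))

$$\hat{\mathbf{t}}_1(m+1) = \hat{\mathbf{t}}_1(m) + \mu\,E\left\{e_1^{*}(m)\,\mathbf{y}_{D}(m)\right\} \tag{5.10}$$

We have defined $e_1(m)$ in (5.4). During the training period, (5.4) is the true error, while in the decision-directed mode, the error is estimated in a data-aided fashion using tentative decisions. As a matter of fact, in the decision-directed mode we have

$$\hat{\mathbf{t}}_1(m+1) = \hat{\mathbf{t}}_1(m) + \mu\,E\left\{\hat{e}_1^{*}(m)\,\mathbf{y}_{D}(m)\right\} \tag{5.11}$$

with $\hat{e}_1(m) = \hat{a}_1(m-1) - \hat{\mathbf{t}}_1^{H}(m)\,\mathbf{y}_{D}(m)$. The block diagram of the adaptive receiver is sketched in Figure 5.1. The block that realizes the tap coefficients update has as inputs the observation vector $\mathbf{y}_{D}(m)$ and the error $e_1(m)$ or $\hat{e}_1(m)$. The steepest descent algorithm defined in (5.10) can be approximated using the instantaneous value instead of the stochastic mean, yielding the least mean square (LMS) update:

$$\hat{\mathbf{t}}_1(m+1) = \hat{\mathbf{t}}_1(m) + \mu\,\hat{e}_1^{*}(m)\,\mathbf{y}_{D}(m) \tag{5.12}$$

Figure 5.2 shows the FIR implementation of the receiver: the input to the decision device is $x_1(m-1) = \hat{\mathbf{t}}_1^{H}(m)\,\mathbf{y}_{D}(m)$, i.e., the soft decision for symbol $\hat{a}_1(m-1)$.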
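The LMS recursion (5.12) can be tried on a toy example. The sketch below is not the asynchronous multipath model of this book: it assumes a synchronous, single-path, three-user system with real BPSK symbols and a processing window of exactly one symbol, so that each observation is a sum of amplitude-scaled codes plus white noise. The codes, amplitudes, noise level, and step size `mu` are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
Q = 8                                        # spreading factor (assumption)
codes = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1, 1, -1, -1],
                  [1, -1, 1, -1, 1, -1, 1, 1]], dtype=float).T / np.sqrt(Q)
amps = np.array([1.0, 2.0, 2.0])             # desired user and two stronger interferers
sigma_v, mu = 0.1, 0.01                      # noise std and LMS step size (assumptions)
n_train, n_dd = 3000, 1000

def observe():
    """One observation y = sum_j A_j c_j a_j + v, plus the desired symbol a_1."""
    a = rng.choice([-1.0, 1.0], size=3)
    y = codes @ (amps * a) + sigma_v * rng.standard_normal(Q)
    return a[0], y

t_hat = np.zeros(Q)                          # the receiver does not know c_1
sq_err = []
for m in range(n_train):                     # training mode: true error (5.4)
    a1, y = observe()
    e = a1 - np.vdot(t_hat, y)               # vdot conjugates its first argument: t^H y
    t_hat = t_hat + mu * np.conj(e) * y      # LMS update
    sq_err.append(abs(e) ** 2)

errors = 0
for m in range(n_dd):                        # decision-directed mode, (5.12)
    a1, y = observe()
    x = np.vdot(t_hat, y)                    # soft decision
    a1_hat = 1.0 if x.real >= 0 else -1.0    # threshold detector
    t_hat = t_hat + mu * np.conj(a1_hat - x) * y
    errors += int(a1_hat != a1)

mse_end = float(np.mean(sq_err[-500:]))      # squared error near the end of training
ber_dd = errors / n_dd
```

The filter starts from zero and learns a despreading vector from the training symbols alone, which mirrors the statement that the trained detector does not need the spreading code of the desired user.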
5.3 Blind Adaptive MMSE Receiver

The trained MMSE detector needs a training sequence and a rough acquisition of the timing of the desired data streams. An alternative within the class of adaptive receivers is the blind adaptive MMSE detector [5]. The blind detector does not need a training sequence, but it requires a fine knowledge of the timing (maximum uncertainty is a fraction of a chip interval), the spreading sequence(s) of the desired data stream(s), and, if present, the frequency offset. A modified version of the blind detector will be presented later. This modified algorithm utilizes a constraint on a quantity called the surplus energy, in order to deal with situations where there is a mismatch between the actual and nominal spreading sequence of the desired user.

Figure 5.1 Block diagram of the trained adaptive MMSE receiver.

Figure 5.2 FIR implementation of the trained adaptive MMSE receiver.
The blind detector relies on the fact that the knowledge of the spreading sequence of the desired stream allows one degree of freedom to be fixed in the search for the FIR filter coefficients utilized by the MMSE algorithm. Without loss of generality we can set

$$\mathbf{t}_1 \triangleq \mathbf{c}_1 + \mathbf{x}_1 \quad \text{with} \quad \mathbf{x}_1^{H}\mathbf{c}_1 = 0 \tag{5.13}$$

where $\mathbf{c}_1$ is the spreading sequence of data stream 1 (the desired stream). We will show that the minimization of the mean output energy (MOE) at the output of the searched FIR filter coincides with the minimization of the mean square error between the sample at its output and the transmitted symbol. This means that the MMSE criterion coincides with the minimum mean output energy (MMOE) criterion; that is, with the minimization of the following quantity:

$$\mathrm{MOE} = E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} \tag{5.14}$$

It is crucial to write $\mathbf{t}_1$ in form (5.13); otherwise, (5.14) is minimized trivially by setting $\mathbf{t}_1 = \mathbf{0}$. To prove the coincidence of the two criteria [6, 7], we first evaluate the mean square error cost function:
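$$J(\hat{\mathbf{t}}_1) = E\left\{\left|a_1(m-1) - \mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} = |a_1(m-1)|^{2} - 2\,\mathrm{Re}\left\{E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right]\right\} + E\left[\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right] \tag{5.15}$$

The expression of $\mathbf{y}_{D}(m)$ that we plan to use is

$$\mathbf{y}_{D}(m) = \sum_{j=1}^{J}\mathbf{F}_{D,j}(m)\,\mathbf{C}_{D,j}\,\mathbf{a}_{D,j}(m) + \mathbf{v}_{D}(m) \tag{5.16}$$

where we have defined

$$\mathbf{F}_{D,j}(m) \triangleq \mathbf{F}_j^{(Q+P-1)}(mQ + P - 1) \tag{5.17}$$

$$\mathbf{C}_{D,j} \triangleq \mathbf{C}_j^{(Q+P-1)}(mQ + P - 1) \tag{5.18}$$

$$\mathbf{a}_{D,j}(m) \triangleq \mathbf{a}_j^{(Q+P-1)}(mQ + P - 1) \tag{5.19}$$

Hence, we can write

$$\mathbf{y}_{D}(m) = \mathbf{F}_{D,1}(m)\,\mathbf{C}_{D,1}\,\mathbf{a}_{D,1}(m) + \sum_{j=2}^{J}\mathbf{F}_{D,j}(m)\,\mathbf{C}_{D,j}\,\mathbf{a}_{D,j}(m) + \mathbf{v}_{D}(m) \tag{5.20}$$

In particular, without loss of generality, $f_1 = 0$ (the frequency offset of the desired user must be perfectly compensated), which yields

$$\mathbf{F}_{D,1} = A_1 e^{j\phi_1}\,\mathbf{P}_{D,1} \tag{5.21}$$

with $\mathbf{P}_{D,1} = \mathbf{P}_1^{(Q+P-1)}$. It is easy to see that $\mathbf{F}_{D,1}$ does not depend on the detection step $m$. In order to simplify (5.15), we observe that $|a_1(m-1)|^{2} = 1$ if we consider phase shift keying (PSK) symbols with unitary norm. Hence, this term can be neglected in the minimization process. Focusing on the term $E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right]$, we can see that

$$E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right] = \mathbf{t}_1^{H}\,\mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\,E\left[a_1^{*}(m-1)\,\mathbf{a}_{D,1}(m)\right] + \mathbf{t}_1^{H}\sum_{j=2}^{J}\mathbf{F}_{D,j}(m)\,\mathbf{C}_{D,j}\,E\left[a_1^{*}(m-1)\,\mathbf{a}_{D,j}(m)\right] + \mathbf{t}_1^{H}\,E\left[a_1^{*}(m-1)\,\mathbf{v}_{D}(m)\right] \tag{5.22}$$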
The second term on the right-hand side of (5.22) is equal to 0 because $E\left[a_1^{*}(m-1)\,\mathbf{a}_{D,j}(m)\right] = \mathbf{0}$ for $j \neq 1$, since the PSK symbols are assumed to be uncorrelated and with zero mean value. At the same time, $E\left[a_1^{*}(m-1)\,\mathbf{v}_{D}(m)\right] = \mathbf{0}$, since the symbols and the noise samples are uncorrelated. With regard to the first term of the right-hand side of (5.22), we can express it as

$$E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right] = \mathbf{t}_1^{H}\,\mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\begin{bmatrix}0\\1\\0\end{bmatrix} \tag{5.23}$$

This last identity derives from the fact that the PSK symbols are uncorrelated, with unitary norm, and $\mathbf{a}_{D,1}(m) = \left[a_1(m-2),\, a_1(m-1),\, a_1(m)\right]^{T}$. Now, let us define the following:

$$\tilde{\mathbf{c}}_1 \triangleq \mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\begin{bmatrix}0\\1\\0\end{bmatrix} = A_1 e^{j\phi_1}\,\mathbf{P}_{D,1}\,\mathbf{C}_{D,1}\begin{bmatrix}0\\1\\0\end{bmatrix} \tag{5.24}$$

If we know exactly the delay of the desired stream, and we sample at the Nyquist epoch (absence of interchip interference, ICI), it is easy to show that

$$\tilde{\mathbf{c}}_1 = A_1 e^{j\phi_1}\,p(0)\,\mathbf{c}_1 \tag{5.25}$$

where

$$\mathbf{c}_1 \triangleq \left[c_1(0),\, c_1(1),\, \ldots,\, c_1(Q-1),\, 0,\, \ldots,\, 0\right]^{T} \tag{5.26}$$

and $p(t)$ satisfies the Nyquist criterion if sampled at multiples of $T_c$ ($t = 0, T_c, 2T_c, \ldots$), since it is a raised cosine pulse. Vector $\mathbf{c}_1$ contains the tap coefficients of the FIR filter describing the conventional receiver when we use a processing window wider than a symbol interval. Since $\mathbf{c}_1$ is known (it depends only on the spreading sequence of the desired stream), without loss of generality we can express $\mathbf{t}_1$ in form (5.13), where $\mathbf{c}_1$ and $\mathbf{x}_1$ are orthogonal. In this way,

$$\begin{aligned} E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right] &= \mathbf{t}_1^{H}\,A_1\,p(0)\,e^{j\phi_1}\,\mathbf{c}_1 \\ &= (\mathbf{c}_1 + \mathbf{x}_1)^{H}\,A_1\,p(0)\,e^{j\phi_1}\,\mathbf{c}_1 \\ &= A_1\,p(0)\,e^{j\phi_1}\left(\mathbf{c}_1^{H}\mathbf{c}_1 + \mathbf{x}_1^{H}\mathbf{c}_1\right) \\ &= A_1\,p(0)\,e^{j\phi_1}\,\|\mathbf{c}_1\|^{2} \end{aligned} \tag{5.27}$$
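Amplitude $A_1$ is approximately constant (we consider it as slowly varying), as well as $\phi_1$, $p(0)$, and $\mathbf{c}_1$, so that the term $E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right]$ does not play any role in the minimization process. As a consequence, we have

$$E\left\{\left|a_1(m-1) - \mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} = E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} + \text{const} \tag{5.28}$$

The fact that the mean square error and the output energy differ by a constant, if we set (5.13), has a key effect in the realization of the adaptive MMSE receiver. The MOE and MSE functions are minimized by the same value of the independent variable $\mathbf{t}_1$, which implies that a training sequence is not needed to implement a steepest descent algorithm for the minimization of the MSE. The solution can be expressed as follows:

$$\mathbf{t}_{1,\mathrm{MOE}} = \arg\min_{\mathbf{t}_1}\;\mathop{E}_{\mathbf{a}_{D}(m),\,\mathbf{v}_{D}(m)}\left\{\left|a_1(m-1) - \mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} = \arg\min_{\mathbf{t}_1}\;\mathop{E}_{\mathbf{a}_{D}(m),\,\mathbf{v}_{D}(m)}\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} \tag{5.29}$$

The minimization (5.29) can be realized under the constraint (5.13) and corresponds to the minimization of the MOE of filter $\mathbf{t}_1$. The function to be minimized is a quadratic function of $\mathbf{t}_1$ and has a unique global minimum. The desired solution $\mathbf{t}_{1,\mathrm{MOE}}$ can be obtained by minimization of the Lagrangian function associated with (5.29) under the constraint (5.13):

$$L(\mathbf{t}_1, \lambda) = E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} + \lambda\left(\|\mathbf{c}_1\|^{2} - \mathbf{t}_1^{H}\mathbf{c}_1\right) + \lambda^{*}\left(\|\mathbf{c}_1\|^{2} - \mathbf{t}_1^{T}\mathbf{c}_1^{*}\right) \tag{5.30}$$

where $\lambda$ is a complex Lagrangian multiplier [4]. In order to optimize (5.30), two conditions are necessary:

$$\nabla_{\mathbf{t}_1} L = \mathbf{0} \tag{5.31}$$

$$\nabla_{\lambda} L = 0 \tag{5.32}$$

The latter equation simply regenerates the constraint (5.13) because it is equivalent to $\mathbf{t}_1^{H}\mathbf{c}_1 = \|\mathbf{c}_1\|^{2}$. The former equation (5.31) must be solved to find $\mathbf{t}_{1,\mathrm{MOE}}$. The first term of (5.30) can be rewritten as

$$E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} = E\left\{\left(\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right)\left(\mathbf{y}_{D}^{H}(m)\,\mathbf{t}_1\right)\right\} = \mathbf{t}_1^{H}\,\mathbf{R}_y\,\mathbf{t}_1 \tag{5.33}$$

where $\mathbf{R}_y = E\left\{\mathbf{y}_{D}(m)\,\mathbf{y}_{D}^{H}(m)\right\}$. We then have

$$\nabla_{\mathbf{t}_1} L = \mathbf{R}_y\,\mathbf{t}_1 - \lambda\,\mathbf{c}_1 = \mathbf{0} \tag{5.34}$$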
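from which we obtain

$$\mathbf{t}_{1,\mathrm{MOE}} = \lambda\,\mathbf{R}_y^{-1}\,\mathbf{c}_1 \tag{5.35}$$

The value of $\lambda$ can be found by setting

$$\mathbf{t}_{1,\mathrm{MOE}}^{H}\,\mathbf{c}_1 = \|\mathbf{c}_1\|^{2} \tag{5.36}$$

Placing (5.35) in (5.36) we obtain

$$\lambda^{*}\left(\mathbf{R}_y^{-1}\,\mathbf{c}_1\right)^{H}\mathbf{c}_1 = \|\mathbf{c}_1\|^{2} \tag{5.37}$$

Because $\mathbf{R}_y$ is Hermitian, $\left(\mathbf{R}_y^{-1}\mathbf{c}_1\right)^{H}\mathbf{c}_1 = \mathbf{c}_1^{H}\mathbf{R}_y^{-1}\mathbf{c}_1$ is real, so $\lambda^{*} = \lambda$, which yields the following value of $\lambda$:

$$\lambda = \frac{\|\mathbf{c}_1\|^{2}}{\left(\mathbf{R}_y^{-1}\,\mathbf{c}_1\right)^{H}\mathbf{c}_1} \tag{5.38}$$

Denoting by $x_{\min}$ the minimum output energy, we can write

$$x_{\min} = \mathbf{t}_{1,\mathrm{MOE}}^{H}\,\mathbf{R}_y\,\mathbf{t}_{1,\mathrm{MOE}} = \left(\lambda^{*}\left(\mathbf{R}_y^{-1}\,\mathbf{c}_1\right)^{H}\right)\mathbf{R}_y\left(\lambda\,\mathbf{R}_y^{-1}\,\mathbf{c}_1\right) = \lambda\,\|\mathbf{c}_1\|^{2} \tag{5.39}$$

where the last identity is a direct consequence of (5.38) and

$$\left(\mathbf{R}_y^{-1}\,\mathbf{c}_1\right)^{H}\mathbf{c}_1 = \frac{\|\mathbf{c}_1\|^{2}}{\lambda} \tag{5.40}$$

It can be seen that $\lambda$, which in general is a complex multiplier, in the considered case is purely real. Finally, the solution to the constrained minimization problem is

$$\mathbf{t}_{1,\mathrm{MOE}} = \frac{x_{\min}}{\|\mathbf{c}_1\|^{2}}\,\mathbf{R}_y^{-1}\,\mathbf{c}_1 \tag{5.41}$$

The optimal MMSE solution seen in Section 3.3 is in our case (ideal case of absence of ICI)

$$\mathbf{t}_{1,\mathrm{MMSE}} = E\left\{\mathbf{y}_{D}(m)\,\mathbf{y}_{D}^{H}(m)\right\}^{-1} E\left\{a_1^{*}(m-1)\,\mathbf{y}_{D}(m)\right\} = \mathbf{R}_y^{-1}\,\mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\begin{bmatrix}0\\1\\0\end{bmatrix} = A_1\,p(0)\,e^{j\phi_1}\,\mathbf{R}_y^{-1}\,\mathbf{c}_1 \tag{5.42}$$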
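where we have used (5.21), (5.23), and (5.24). If the phase offset is perfectly known and can be recovered ($\phi_1 = 0$), the MMOE solution is a scaled version of the MMSE solution. In fact,

$$\mathbf{t}_{1,\mathrm{MOE}} = \frac{x_{\min}}{\|\mathbf{c}_1\|^{2}}\,\mathbf{R}_y^{-1}\,\mathbf{c}_1 = \frac{x_{\min}}{A_1\,p(0)\,\|\mathbf{c}_1\|^{2}}\,\mathbf{t}_{1,\mathrm{MMSE}} \tag{5.43}$$

If $\phi_1 \neq 0$, we have that

$$\mathbf{t}_{1,\mathrm{MOE}} = \frac{e^{-j\phi_1}\,x_{\min}}{A_1\,p(0)\,\|\mathbf{c}_1\|^{2}}\,\mathbf{t}_{1,\mathrm{MMSE}} \tag{5.44}$$

and $\mathbf{t}_{1,\mathrm{MOE}}$ is a scaled rotated version of $\mathbf{t}_{1,\mathrm{MMSE}}$. Optimization (5.29) can actually be performed with respect to $\mathbf{x}_1$, since $\mathbf{t}_1 = \mathbf{c}_1 + \mathbf{x}_1$ where $\mathbf{c}_1$ is fixed. Hence, we have

$$\mathbf{x}_{1,\mathrm{MOE}} = \arg\min_{\mathbf{x}_1}\;\mathop{E}_{\mathbf{a}_{D}(m),\,\mathbf{v}_{D}(m)}\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} \tag{5.45}$$

and

$$\mathbf{t}_{1,\mathrm{MOE}} = \mathbf{c}_1 + \mathbf{x}_{1,\mathrm{MOE}} \tag{5.46}$$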
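The closed-form results (5.35) to (5.43) can be checked numerically. The sketch below is an illustration under simplifying assumptions: a synchronous three-user system with real codes, unit-power uncorrelated symbols, $p(0) = 1$, and $\phi_1 = 0$, so that $\mathbf{R}_y$ can be written directly from the amplitude-scaled codes. The code set, amplitudes, and noise variance are hypothetical.

```python
import numpy as np

Q = 8                                        # spreading factor (assumption)
codes = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1, 1, -1, -1],
                  [1, -1, 1, -1, 1, -1, 1, 1]], dtype=float).T / np.sqrt(Q)
amps = np.array([1.0, 2.0, 2.0])             # A_1 and two stronger interferers
sigma_v2 = 0.01                              # noise variance (assumption)

H = codes * amps                             # columns A_j c_j
R_y = H @ H.T + sigma_v2 * np.eye(Q)         # E[y y^H] for unit-power symbols
c1 = codes[:, 0]                             # nominal signature of the desired user

Rinv_c1 = np.linalg.solve(R_y, c1)
lam = (c1 @ c1) / (Rinv_c1 @ c1)             # Lagrange multiplier, (5.38)
t_moe = lam * Rinv_c1                        # constrained MOE solution, (5.35)
x_min = t_moe @ R_y @ t_moe                  # minimum output energy, (5.39)
t_mmse = amps[0] * Rinv_c1                   # MMSE solution (5.42) with p(0)=1, phi_1=0
x1 = t_moe - c1                              # component of t_moe orthogonal to c1
```

In this setting `t_moe` satisfies the constraint $\mathbf{t}_1^{H}\mathbf{c}_1 = \|\mathbf{c}_1\|^{2}$ and is a scaled copy of `t_mmse`, as stated by (5.43).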
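The unconstrained gradient of the MOE function at the $m$th detection step is given by

$$\begin{aligned} \nabla_{\mathbf{x}_1} E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} &= \nabla_{\mathbf{x}_1}\left\{(\mathbf{c}_1 + \mathbf{x}_1)^{H}\,\mathbf{R}_y\,(\mathbf{c}_1 + \mathbf{x}_1)\right\} \\ &= \nabla_{\mathbf{x}_1}\left\{\mathbf{c}_1^{H}\mathbf{R}_y\mathbf{c}_1 + \mathbf{x}_1^{H}\mathbf{R}_y\mathbf{c}_1 + \mathbf{c}_1^{H}\mathbf{R}_y\mathbf{x}_1 + \mathbf{x}_1^{H}\mathbf{R}_y\mathbf{x}_1\right\} \\ &= \mathbf{R}_y^{H}\,\mathbf{c}_1 + \mathbf{R}_y\,\mathbf{x}_1 \end{aligned} \tag{5.47}$$

where the last identity follows from the differentiation rules of matrices [4]. Moreover, $\mathbf{R}_y$ is a Hermitian matrix ($\mathbf{R}_y = \mathbf{R}_y^{H}$), which implies

$$\nabla_{\mathbf{x}_1} E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} = \mathbf{R}_y\,(\mathbf{c}_1 + \mathbf{x}_1) = \mathbf{R}_y\,\mathbf{t}_1 \tag{5.48}$$

The gradient can be approximated with its instantaneous value $\mathbf{y}_{D}(m)\,\mathbf{y}_{D}^{H}(m)\,\hat{\mathbf{t}}_1$ so that

$$\mathbf{y}_{D}(m)\,\mathbf{y}_{D}^{H}(m)\,\hat{\mathbf{t}}_1 = \left[\hat{\mathbf{t}}_1^{T}\,\mathbf{y}_{D}^{*}(m)\right]\mathbf{y}_{D}(m) \tag{5.49}$$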
where $\hat{\mathbf{t}}_1(m) = \mathbf{c}_1 + \hat{\mathbf{x}}_1(m)$ is the MMOE filter at the $m$th detection step. The direction of the instantaneous gradient is that of $\mathbf{y}_{D}(m)$. To update $\hat{\mathbf{x}}_1(m)$ we only need the component of the gradient (5.49) orthogonal to $\mathbf{c}_1$. This component can be easily obtained by substituting $\mathbf{y}_{D}(m)$ with $\mathbf{y}_{D}(m) - \left[\mathbf{y}_{D}^{T}(m)\,\mathbf{c}_1\right]\mathbf{c}_1$. In conclusion, the iterative algorithm for the filter coefficients update is given by

$$\hat{\mathbf{x}}_1(m+1) = \hat{\mathbf{x}}_1(m) - \mu\left[\hat{\mathbf{t}}_1^{T}(m)\,\mathbf{y}_{D}^{*}(m)\right]\left\{\mathbf{y}_{D}(m) - \left[\mathbf{y}_{D}^{T}(m)\,\mathbf{c}_1\right]\mathbf{c}_1\right\} \tag{5.50}$$

We could obtain the same result starting from the expression of $\nabla_{\mathbf{t}_1} L$ given in (5.34), extracting from its instantaneous value the component orthogonal to $\mathbf{c}_1$. The block diagram of the blind adaptive MMSE detector is shown in Figure 5.3. The tap coefficients update block must have both the observation vector $\mathbf{y}_{D}(m)$ and $\mathbf{c}_1$ as inputs.
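The blind update (5.50) can be exercised on a toy example. The sketch below assumes a synchronous, single-path, three-user system with real BPSK symbols, a one-symbol window, and a unit-norm real signature $\mathbf{c}_1$, so that $\left[\mathbf{y}_{D}^{T}\mathbf{c}_1\right]\mathbf{c}_1$ is the projection of the observation onto $\mathbf{c}_1$. No training symbols are used; the transmitted symbol is read only to count errors. Codes, amplitudes, noise level, and step size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
Q = 8                                        # spreading factor (assumption)
codes = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1, 1, -1, -1],
                  [1, -1, 1, -1, 1, -1, 1, 1]], dtype=float).T / np.sqrt(Q)
amps = np.array([1.0, 2.0, 2.0])             # desired user and two stronger interferers
sigma_v, mu = 0.1, 0.005                     # noise std and step size (assumptions)
n_steps, n_eval = 6000, 2000
c1 = codes[:, 0]                             # known, unit-norm nominal signature

x_hat = np.zeros(Q)                          # start from the conventional receiver t = c1
errors = 0
for m in range(n_steps):
    a = rng.choice([-1.0, 1.0], size=3)
    y = codes @ (amps * a) + sigma_v * rng.standard_normal(Q)
    t_hat = c1 + x_hat                       # canonical form (5.13)
    out = np.vdot(t_hat, y)                  # filter output t^H y
    y_perp = y - np.vdot(c1, y) * c1         # component of y orthogonal to c1
    x_hat = x_hat - mu * np.conj(out) * y_perp   # blind update, (5.50)
    if m >= n_steps - n_eval:                # count errors after the transient
        errors += int(np.sign(out.real) != a[0])

ber = errors / n_eval
orth = abs(np.vdot(c1, x_hat))               # x_hat should stay orthogonal to c1
```

Because every correction is orthogonal to $\mathbf{c}_1$, the constraint in (5.13) is preserved at each step without an explicit re-projection.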
Figure 5.3 Block diagram of the blind adaptive MMSE detector.

5.4 Blind Adaptive MMSE Receiver with Surplus Energy Constraint

So far we have considered an ideal situation in which we have perfect knowledge of the delays of the desired streams, which allows Nyquist sampling (absence of ICI). Under those assumptions, $\mathbf{x}_1$, which is chosen to be orthogonal to $\mathbf{c}_1$, is automatically orthogonal to $\tilde{\mathbf{c}}_1$ too, since (5.25) applies. If there is timing uncertainty, (5.25) no longer holds; similarly, it is no longer true that minimization of the MOE coincides with minimization of the MSE. As a matter of fact, the term $E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right]$ is no longer constant. Using (5.13) and (5.24), we can write

$$E\left[a_1^{*}(m-1)\,\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right] = \mathbf{t}_1^{H}\,\tilde{\mathbf{c}}_1 = (\mathbf{c}_1 + \mathbf{x}_1)^{H}\,\tilde{\mathbf{c}}_1 = \mathbf{c}_1^{H}\,\tilde{\mathbf{c}}_1 + \mathbf{x}_1^{H}\,\tilde{\mathbf{c}}_1 \tag{5.51}$$
In the ideal case, (5.25) implies $\mathbf{x}_1^{H}\tilde{\mathbf{c}}_1 = 0$, while in the real case considered here, $\mathbf{x}_1^{H}\tilde{\mathbf{c}}_1 \neq 0$, and it plays a role in the MOE minimization with respect to $\mathbf{x}_1$. The term $\mathbf{c}_1^{H}\tilde{\mathbf{c}}_1$ does not depend on the detection step $m$, because if $f_1 = 0$, even in the presence of ICI, $\tilde{\mathbf{c}}_1$ does not depend on $m$. In this scenario the minimization of the MOE without additional constraints may cause the cancellation of the useful signal carried by $\tilde{\mathbf{c}}_1$. The following derivations prove that for blind detection we need to set a constraint on the quantity $\chi = \|\mathbf{x}_1\|^{2}$ (i.e., on the surplus energy at the output of the detector). We call this excess energy “surplus energy,” since the total energy of the linear transformation (5.14) at the output of the receiver is

$$\|\mathbf{t}_1\|^{2} = \|\mathbf{c}_1\|^{2} + \|\mathbf{x}_1\|^{2} = \|\mathbf{c}_1\|^{2} + \chi \tag{5.52}$$

Hence, $\chi$ is the surplus energy with respect to the conventional receiver $\mathbf{c}_1$. In the case of ICI and $f_1 = 0$, the observation vector can be written as

$$\mathbf{y}_{D}(m) = \tilde{\mathbf{c}}_1\,a_1(m-1) + \mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\begin{bmatrix}1\\0\\0\end{bmatrix}a_1(m-2) + \mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\begin{bmatrix}0\\0\\1\end{bmatrix}a_1(m) + \sum_{j=2}^{J}\mathbf{F}_{D,j}(m)\,\mathbf{C}_{D,j}\,\mathbf{a}_{D,j}(m) + \mathbf{v}_{D}(m) \tag{5.53}$$

We refer to $\tilde{\mathbf{c}}_1$ as the actual signature, because from (5.53) we see that it is the signature that multiplies the symbol $a_1(m-1)$ to be detected. We underline once more that $\tilde{\mathbf{c}}_1$ is no longer the one expressed by (5.25), because of a sampling offset (for example). The second term on the right-hand side of (5.53) is the term due to ISI caused by ICI that appears at the edge of the symbol interval; the term $\mathbf{F}_{D,1}\,\mathbf{C}_{D,1}\left[0,\,0,\,1\right]^{T}a_1(m)$ accounts for the fact that the observation window extends for more than one symbol interval. The multiple access interference term is given by $\sum_{j=2}^{J}\mathbf{F}_{D,j}(m)\,\mathbf{C}_{D,j}\,\mathbf{a}_{D,j}(m)$, and it must be suppressed in order to achieve good detection performance. As $\mathbf{c}_1$ is the nominal signature, $\mathbf{x}_1$ is not orthogonal to the actual signature $\tilde{\mathbf{c}}_1$, because (5.25) does not hold. Hence, we might incur a cancellation of the useful signal $\tilde{\mathbf{c}}_1$. A geometric representation of this phenomenon is shown in Figure 5.4.

In Figure 5.4, we can see that the contribution of the desired code at the output of the linear transformation may be completely canceled, with obvious negative consequences on the detection. By choosing $\mathbf{x}_1$ such that $\mathbf{t}_1 = \mathbf{c}_1 + \mathbf{x}_1$ is a scaled version of $\mathbf{p}_c$ ($\mathbf{p}_c$ is the projection of $\mathbf{c}_1$ on the subspace orthogonal to $\tilde{\mathbf{c}}_1$), $\mathbf{t}_1$ becomes orthogonal to $\tilde{\mathbf{c}}_1$. At the output of the linear filter, the contribution of the desired signal is eliminated, which is highly undesirable. One important issue to stress is that vector $\mathbf{x}_1$ (with squared norm $\chi_c$) in Figure 5.4 is the minimum norm vector that allows a complete cancellation of the desired signal. Hence, $\chi_c$ is the minimum value of the surplus energy necessary for the complete cancellation of $\tilde{\mathbf{c}}_1$. Setting $\|\mathbf{x}_1\|^{2} < \chi_c$ at the receiver, we prevent this complete cancellation.

Figure 5.4 Evaluation of the minimum value $\chi_c$ of the surplus energy which causes complete cancellation of the desired code $\tilde{\mathbf{c}}_1$.

Another goal of detection is to realize a good cancellation of MAI. In Figure 5.5, we denote by $S_I$ the interference space (i.e., the space generated from terms of the type

$$\mathbf{F}_{D,2}(m)\,\mathbf{C}_{D,2}\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix},\quad \mathbf{F}_{D,2}(m)\,\mathbf{C}_{D,2}\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix},\quad \ldots,\quad \mathbf{F}_{D,J}(m)\,\mathbf{C}_{D,J}\begin{bmatrix}0\\0\\\vdots\\1\end{bmatrix} \tag{5.54}$$

which multiply the symbols of the interfering users that are embraced by the observation window). In order to cancel completely the contribution of the interference, we choose $\mathbf{x}_1$ (with squared norm $\chi_I$) such that $\mathbf{t}_1 = \mathbf{c}_1 + \mathbf{x}_1$ is a scaled version of $\mathbf{p}_I$, where $\mathbf{p}_I$ is the projection of $\mathbf{c}_1$ orthogonal to the space of interference. This choice of $\mathbf{x}_1$ is the one with minimum norm that allows complete cancellation of the interference. The conclusion is that at the receiver side, $\|\mathbf{x}_1\|^{2} > \chi_I$ is required for complete interference cancellation. Combining the two constraints on the surplus energy, we must minimize the MSE, keeping the following in mind (in the case that (5.25) does not hold):

$$\chi_I < \chi < \chi_c \tag{5.55}$$

where $\chi_I$ is the minimum value of $\chi = \|\mathbf{x}_1\|^{2}$ necessary for complete cancellation of the interference, and $\chi_c$ is the minimum value for complete cancellation of the desired signal. Obviously, condition (5.55) can be satisfied only if $\chi_I < \chi_c$; otherwise it is not possible to prevent the cancellation of the useful signal while canceling MAI. In fact, relation $\chi_I > \chi_c$ states that $\mathbf{c}_1$ is closer to the interfering space than to the space to which the useful signal $\tilde{\mathbf{c}}_1$ belongs. Using geometric considerations, we can infer from Figure 5.4 that
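$$\chi_c = \|\mathbf{c}_1\|^{2}\tan^{2}\theta_c = \|\mathbf{c}_1\|^{2}\,\frac{1 - \cos^{2}\theta_c}{\cos^{2}\theta_c} = \|\mathbf{c}_1\|^{2}\left(\frac{\|\mathbf{c}_1\|^{2}}{\|\mathbf{p}_c\|^{2}} - 1\right) \tag{5.56}$$

where $\cos\theta_c = \dfrac{\|\mathbf{p}_c\|}{\|\mathbf{c}_1\|}$. From Figure 5.5 we can derive

$$\chi_I = \|\mathbf{c}_1\|^{2}\tan^{2}\theta_I = \|\mathbf{c}_1\|^{2}\left(\frac{\|\mathbf{c}_1\|^{2}}{\|\mathbf{p}_I\|^{2}} - 1\right) \tag{5.57}$$

where $\cos\theta_I = \dfrac{\|\mathbf{p}_I\|}{\|\mathbf{c}_1\|}$.

Figure 5.5 Evaluation of the minimum value $\chi_I$ of the surplus energy which achieves complete cancellation of the MAI.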
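The two thresholds (5.56) and (5.57) can be evaluated directly from projections. The sketch below uses real vectors and a hypothetical mismatched signature `c1_tilde`, built by adding a component orthogonal to the nominal code, together with two hypothetical interferer signatures spanning $S_I$. It also builds the minimum-norm $\mathbf{x}_1$ described in the text, so that the formulas can be cross-checked against the geometry.

```python
import numpy as np

def orth_residual(c, S):
    """Component of c orthogonal to the column space of S (via least squares)."""
    coeffs, *_ = np.linalg.lstsq(S, c, rcond=None)
    return c - S @ coeffs

def surplus_threshold(c, p):
    """chi = ||c||^2 (||c||^2 / ||p||^2 - 1), as in (5.56) and (5.57)."""
    return (c @ c) * ((c @ c) / (p @ p) - 1.0)

def cancelling_x1(c, p):
    """x1 orthogonal to c such that t1 = c + x1 is a scaled version of p."""
    return (c @ c) / (p @ p) * p - c

Q = 8
codes = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1, 1, -1, -1],
                  [1, -1, 1, -1, 1, -1, 1, 1]], dtype=float).T / np.sqrt(Q)
d = np.array([1, -1, -1, 1, 1, -1, -1, 1]) / np.sqrt(Q)   # orthogonal to c1

c1 = codes[:, 0]                             # nominal signature
c1_tilde = 0.9 * c1 + 0.3 * d                # hypothetical actual signature
S_I = codes[:, 1:]                           # hypothetical interference space basis

p_c = orth_residual(c1, c1_tilde[:, None])   # c1 projected orthogonally to c1_tilde
p_I = orth_residual(c1, S_I)                 # c1 projected orthogonally to S_I
chi_c = surplus_threshold(c1, p_c)
chi_I = surplus_threshold(c1, p_I)

x_c = cancelling_x1(c1, p_c)                 # smallest x1 that nulls the desired signal
x_I = cancelling_x1(c1, p_I)                 # smallest x1 that nulls the MAI
feasible = chi_I < chi_c                     # condition needed for (5.55)
```

In this example `feasible` is true, so there is a range of surplus energies that cancels the MAI without cancelling the desired signal.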
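Because of (5.56) and (5.57), the relation $\chi_I < \chi_c$ is equivalent to $\|\mathbf{p}_I\|^{2} > \|\mathbf{p}_c\|^{2}$. Moreover, using the definitions of $\mathbf{p}_c$ and $\mathbf{p}_I$, $\|\mathbf{p}_I\|^{2}$ is the squared distance of $\mathbf{c}_1$ from $S_I$, while $\|\mathbf{p}_c\|^{2}$ is the squared distance of $\mathbf{c}_1$ from the subspace to which $\tilde{\mathbf{c}}_1$ belongs.

So far we have not considered the background noise. We have seen that the surplus energy must be great enough ($\chi > \chi_I$) to allow the cancellation of the interference; at the same time, high values of $\chi$ enhance the noise contribution at the output of the FIR filter ($\|\mathbf{t}_1\|^{2} = \|\mathbf{c}_1\|^{2} + \chi$ and $\|\mathbf{c}_1\|^{2}$ is constant). The optimal value of the surplus energy is determined by this trade-off. Higher levels of background noise require lower values of $\chi$, in order to keep the energy at the output of the linear detector low. This implicit constraint is important because, for low signal-to-noise ratio (SNR), automatically $\chi < \chi_c$. If we consider high SNR, the constraint on the surplus energy must be explicitly set. We must minimize the following cost function:

$$\mathrm{MOE}(\mathbf{x}_1) = E\left\{\left|\left[\mathbf{c}_1 + \mathbf{x}_1\right]^{H}\mathbf{y}_{D}(m)\right|^{2}\right\} \tag{5.58}$$

subject to the constraints

$$\|\mathbf{t}_1\|^{2} = \|\mathbf{c}_1\|^{2} + \chi \tag{5.59}$$

$$\mathbf{x}_1^{H}\mathbf{c}_1 = 0 \tag{5.60}$$

The associated Lagrangian is

$$L(\mathbf{t}_1, \lambda, \nu) = E\left\{\left|\mathbf{t}_1^{H}\,\mathbf{y}_{D}(m)\right|^{2}\right\} + \lambda\left(\|\mathbf{c}_1\|^{2} - \mathbf{t}_1^{H}\mathbf{c}_1\right) + \lambda^{*}\left(\|\mathbf{c}_1\|^{2} - \mathbf{t}_1^{T}\mathbf{c}_1^{*}\right) + \nu\left(\mathbf{t}_1^{H}\mathbf{t}_1 - \|\mathbf{c}_1\|^{2} - \chi\right) \tag{5.61}$$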
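where $\lambda$ is a complex Lagrange multiplier, while $\nu$ is a purely real Lagrange multiplier since constraint (5.59) is purely real. Letting $\nabla_{\lambda} L = 0$ and $\nabla_{\nu} L = 0$, we regenerate constraints (5.60) and (5.59), respectively, while by differentiation of $L(\mathbf{t}_1, \lambda, \nu)$ with respect to $\mathbf{t}_1$, we obtain the following equation:

$$\nabla_{\mathbf{t}_1} L = \mathbf{R}_y\,\mathbf{t}_1 - \lambda\,\mathbf{c}_1 + \nu\,\mathbf{t}_1 = \mathbf{0} \tag{5.62}$$

The iterative algorithm to search for the minimum of the MOE with surplus energy constraint is obtained by modifying (5.34). It is intuitive to see that $\nu$ adds to the projection of the gradient of the MOE orthogonal to $\mathbf{c}_1$ a term proportional to the component of $\mathbf{t}_1$ orthogonal to $\mathbf{c}_1$. This component is equal to $\mathbf{x}_1$. Hence, we have

$$\hat{\mathbf{x}}_1(m+1) = \hat{\mathbf{x}}_1(m) - \mu\left[\hat{\mathbf{t}}_1^{T}(m)\,\mathbf{y}_{D}^{*}(m)\right]\left\{\mathbf{y}_{D}(m) - \left[\mathbf{y}_{D}^{T}(m)\,\mathbf{c}_1\right]\mathbf{c}_1\right\} - \mu\,\nu\,\hat{\mathbf{x}}_1(m) \tag{5.63}$$

The solution to (5.62) is given by

$$\mathbf{t}_{1,\mathrm{MOE}} = \lambda\left(\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)}\right)^{-1}\mathbf{c}_1 \tag{5.64}$$

The value of $\lambda$ is obtained by setting $\mathbf{t}_{1,\mathrm{MOE}}^{H}\,\mathbf{c}_1 = \|\mathbf{c}_1\|^{2}$:

$$\lambda = \frac{\|\mathbf{c}_1\|^{2}}{\mathbf{c}_1^{H}\left(\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)}\right)^{-1}\mathbf{c}_1} \tag{5.65}$$

Finally, if we consider that $\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)}$ is Hermitian, we can further write

$$\mathbf{t}_{1,\mathrm{MOE}} = \frac{\|\mathbf{c}_1\|^{2}\left(\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)}\right)^{-1}\mathbf{c}_1}{\mathbf{c}_1^{H}\left(\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)}\right)^{-1}\mathbf{c}_1} \tag{5.66}$$

Comparing the MOE solution in this case with the MOE solution obtained in the absence of a constraint on the surplus energy, we observe that they differ by the presence of matrix $\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)}$ in the one case and matrix $\mathbf{R}_y$ in the other (letting $\nu = 0$, the two cases coincide). We can write

$$\mathbf{R}_y = \mathbf{G}_{D}\,\mathbf{R}_{a_D}\,\mathbf{G}_{D}^{H} + \sigma_v^{2}\,\mathbf{I}_{(Q+P-1)} \tag{5.67}$$

and

$$\mathbf{R}_y + \nu\,\mathbf{I}_{(Q+P-1)} = \mathbf{G}_{D}\,\mathbf{R}_{a_D}\,\mathbf{G}_{D}^{H} + \left(\sigma_v^{2} + \nu\right)\mathbf{I}_{(Q+P-1)} \tag{5.68}$$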
Hence, $\nu$ can be interpreted as a fictitious noise variance necessary to increase the level of background noise so that $\omega$ does not exceed $\omega_s$. Moreover, the blind adaptive receiver with surplus energy constraint must know or compensate for the frequency offset $f_1$ of the desired user. In fact, if $f_1$ is not equal to 0, matrix $\mathbf{F}_{D,1}$ depends on detection step $m$, and as a consequence, we can define

$$\tilde{\mathbf{c}}_1(m) = \mathbf{F}_{D,1}(m)\,\mathbf{C}_{D,1}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \tag{5.69}$$

which depends on $m$. The values of the independent variable that minimize the MSE function and the MOE function are different because we have

$$E\left[a_1^{*}(m-1)\,\mathbf{t}_1^H \mathbf{y}_D(m)\right] = \mathbf{t}_1^H \tilde{\mathbf{c}}_1(m) = (\mathbf{c}_1 + \mathbf{x}_1)^H \tilde{\mathbf{c}}_1(m) = \mathbf{c}_1^H \tilde{\mathbf{c}}_1(m) + \mathbf{x}_1^H \tilde{\mathbf{c}}_1(m) \tag{5.70}$$

In this case, the term $\mathbf{x}_1^H \tilde{\mathbf{c}}_1(m)$ is not only different from 0 and dependent on $\mathbf{x}_1$, as in the case of ICI and $f_1 = 0$, but it is also time-varying from symbol to symbol. It could be conjectured that by setting a constraint on the surplus energy, the adaptive algorithm tracks these variations while searching for the minimum of the MOE function. This is not true, however, because we must also consider the term $\mathbf{c}_1^H \tilde{\mathbf{c}}_1(m)$, which varies with $m$. The constraint on $\omega$ affects the update of $\mathbf{x}_1$ only and has no effect on the first term of (5.70).
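The leaky update (5.63) can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not in the text: all signals are real-valued (so conjugates disappear), the code $\mathbf{c}_1$ has unit norm, and the function name `blind_moe_update`, the two toy codes, and the parameter values are hypothetical.

```python
import numpy as np

def blind_moe_update(x, y, c1, mu, nu):
    """One step of the blind MOE update with leakage (real signals, ||c1|| = 1)."""
    t = c1 + x                       # current detector t1 = c1 + x1
    z = t @ y                        # detector output t1^T y
    z_mf = c1 @ y                    # matched-filter output c1^T y
    grad_orth = z * (y - z_mf * c1)  # MOE gradient projected orthogonally to c1
    # The last term is the leakage introduced by the multiplier nu.
    return x - mu * grad_orth - mu * nu * x

# Hypothetical toy scenario: desired user plus one stronger interferer.
rng = np.random.default_rng(0)
n = 8
c1 = np.ones(n) / np.sqrt(n)
c2 = np.array([1, 1, 1, 1, 1, 1, -1, -1], dtype=float) / np.sqrt(n)

x = np.zeros(n)
for _ in range(500):
    a1, a2 = rng.choice([-1.0, 1.0], size=2)
    y = a1 * c1 + 2.0 * a2 * c2 + 0.1 * rng.standard_normal(n)
    x = blind_moe_update(x, y, c1, mu=0.01, nu=0.5)

print("c1 . x after adaptation:", c1 @ x)
print("residual correlation with the interferer:", (c1 + x) @ c2)
```

Because the projected gradient is orthogonal to $\mathbf{c}_1$ and the leakage only rescales $\mathbf{x}_1$, the update keeps $\mathbf{x}_1$ orthogonal to $\mathbf{c}_1$ when it starts from an orthogonal vector, which is the constraint (5.60).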
5.5 Advantages of Adaptive Linear Detection

In this chapter we have presented the two most popular versions of adaptive detectors, namely the trained detector and the blind detector. The class of adaptive receivers is an attractive alternative to optimal linear detection, since the performance gap between the former and the latter is limited, as shown in Chapter 6. The complexity gain is significant, since no matrix inversion is required by the adaptive receivers.

The trained detector has no knowledge of the desired user's code and uses a known training sequence to learn the code sequence. In Chapter 6 it is shown that this receiver has a performance close to the optimal one and is robust to slowly varying phase offsets and low values of frequency offset. Only coarse timing acquisition is needed by this adaptive receiver.

The blind detector does not require a training sequence, but it does require accurate knowledge of the desired user's spreading sequence. As a result, the receiver is not robust, since any mismatch between the actual and nominal spreading sequence may cause a dramatic performance degradation, as shown in Chapter 6. Uncompensated phase offsets or frequency offsets and timing errors may cause this mismatch. Chapter 6 shows that, in this case, the enhanced version of the blind detector making use of surplus energy constraints may recover part of the performance loss.
References

[1] Lim, T. J., and S. Roy, "Adaptive Filters in Multiuser (MU) CDMA Detection," Wireless Networks, Vol. 4, 1998, pp. 307–318.
[2] Woodward, G., and B. Vucetic, "Adaptive Detection for DS-CDMA," Proceedings of the IEEE, Vol. 86, No. 7, July 1998.
[3] Moshavi, S., "Multi-User Detection for DS-CDMA Communications," IEEE Communications Magazine, October 1996, pp. 124–136.
[4] Madhow, U., and M. L. Honig, "MMSE Interference Suppression for Direct-Sequence CDMA," IEEE Transactions on Communications, Vol. 42, December 1994, pp. 3178–3188.
[5] Honig, M., "Blind Adaptive Multiuser Detection," IEEE Transactions on Information Theory, Vol. 41, No. 4, July 1995, pp. 944–960.
[6] Haykin, S., Adaptive Filter Theory, Englewood Cliffs, NJ: Prentice Hall, 1991.
[7] Therrien, C. W., Discrete Random Signals and Statistical Signal Processing, Englewood Cliffs, NJ: Prentice Hall, 1992.
6 Performance of Linear Multiuser Detection

This chapter collects all the numerical results regarding the linear detectors presented in Chapters 3, 4, and 5. It is divided into two main sections. Section 6.1 contains the numerical results of the receivers for the synchronous systems presented in Chapters 3 and 4; Section 6.2 contains the numerical results of the receivers for the asynchronous systems presented in Chapters 3 and 5. Steady-state and transient performance is analyzed, as well as the sensitivity to nonideal parameters. On the one hand, the transient performance evaluation is expressed in terms of the mean square error (MSE) of the MMSE receiver, because it clearly demonstrates the different behavior of the receivers considered. On the other hand, the steady-state performance is evaluated in terms of the a priori bit error probability (BEP) when a closed-form expression exists, or the a posteriori bit error rate (BER) obtained by Monte Carlo computer simulation.
6.1 Performance of the ZF and MMSE Receivers in Synchronous Systems

In this section we present some numerical results regarding the four proposed fractionally spaced linear interference mitigation receivers (ZF-MI, ZF-I, MMSE-MI, MMSE-I). Both exact and approximate theoretical methods based on closed-form analysis as well as computer simulations are used to assess the receiver performance. Exact performance analysis is also used to check the validity of the Gaussian approximation for MAI.
6.1.1 Features of the Synchronous System

The system is modeled according to the 3G recommendations [1]. The assumed modulation scheme is binary phase shift keying (BPSK); OVSF codes complying with those suggested in [1] are used. The chip pulse shaping is a root raised cosine with roll-off $\alpha = 0.22$, and the oversampling factor is $\beta = 2$ with respect to a chip rate equal to 3.840 Mchip/s. The multipath channel is a 3-ray channel where the relative powers of the three taps are $\sigma_{h_0}^2 = 1$ dB, $\sigma_{h_1}^2 = 13$ dB, $\sigma_{h_2}^2 = 25$ dB, and the delays are $\tau_0 = 0$, $\tau_1 = 240$ ns, $\tau_2 = 490$ ns. The delay spread of the overall channel (convolution of the multipath channel and chip pulse) is determined by truncation to the value $P = 7$. Due to the short burst used in the transmission, the channel can be regarded as constant over the burst. Unless otherwise specified, the receivers operate with perfect estimates of the channel coefficients. We also introduce the normalized system load $\rho$, defined as

$$\rho = \sum_i \frac{1}{Q_i} \tag{6.1}$$

where $Q_i$ are the spreading factors currently used by the system. The 3G recommendations suggest keeping $\rho < 1/2$.
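The load definition (6.1) is simple enough to check by hand, but a small sketch makes the bookkeeping explicit. The function name `normalized_load` is an assumption for this example; the spreading-factor combinations are the ones listed in the legend of Figure 6.3 and in the single-cell scenario with $Q_1 = 4$, $Q_2 = Q_3 = 8$.

```python
from fractions import Fraction

def normalized_load(spreading_factors):
    """Normalized system load: sum of 1/Q_i over the active codes, as in (6.1)."""
    return sum(Fraction(1, q) for q in spreading_factors)

# Spreading-factor combinations taken from the figures of this chapter.
for combo in ([16, 16], [4, 8, 16], [8, 8, 8, 8], [2, 4, 8], [4, 8, 8]):
    load = normalized_load(combo)
    print(combo, "->", float(load))
```

Exact fractions are used only to avoid rounding noise in the comparison with the quoted load values; plain floating-point arithmetic would serve equally well.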
6.1.2 Exact Evaluation of BEP

The theoretical analysis of this section holds for any of the linear detectors presented in Chapters 3 and 4. As an example, we evaluate the BEP in closed form for the ZF-MI and MMSE-MI, extending the results of [2, 3] to a multirate system with a simultaneous presence of ISI and MAI [4] and employing any modulation format. Using the soft decision vector $\mathbf{x}_{\mathrm{ZF\text{-}MI}}(\cdot)$ reported here [see (3.5) and (4.34)]

$$\mathbf{x}_{\mathrm{ZF\text{-}MI}}(i) = \mathbf{a}(i) + \mathbf{u}_{\mathrm{ZF\text{-}MI}}(i) \tag{6.2}$$

we can express the BEP associated with its $j$th component as

$$P_{e,\mathrm{ZF\text{-}MI}}(j) = Q\left(\sqrt{\frac{E[|a(j)|^2]}{\sigma_{u,\mathrm{ZF\text{-}MI}}^2(j)}}\right), \quad j = 1, 2, \ldots, \tilde{n}_t \tag{6.3}$$
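where $Q(x)$ is the Gaussian $Q$-function (the complementary cumulative distribution function of a standard Gaussian variable), and $\sigma_{u,\mathrm{ZF\text{-}MI}}^2(j) = \left[\left(\mathbf{G}^H \mathbf{R}_n^{-1} \mathbf{G}\right)^{-1}\right]_{j,j}$, with $(\cdot)_{m,n}$ being the operator that returns the element of row $m$ and column $n$ of a matrix. In the soft decision $\mathbf{x}_{\mathrm{MMSE\text{-}MI}}(\cdot)$ given by

$$\mathbf{x}_{\mathrm{MMSE\text{-}MI}}(i) = \mathrm{diag}(\mathbf{W}_{\mathrm{MI}}(i))\,\mathbf{a}(i) + \overline{\mathrm{diag}}(\mathbf{W}_{\mathrm{MI}}(i))\,\mathbf{a}(i) + \mathbf{u}_{\mathrm{MMSE\text{-}MI}}(i) \tag{6.4}$$

there is residual ISI and MAI (see Chapters 3 and 4). Hence, in order to evaluate the BEP, it is necessary to average over all $\tilde{n}_t$ symbols involved:

$$P_{e,\mathrm{MMSE\text{-}MI}}(j) = \frac{1}{M^{\tilde{n}_t}} \sum_{(a(1),\ldots,a(\tilde{n}_t)) \in C^{\tilde{n}_t}} Q\left(\frac{(\mathbf{W}_{\mathrm{MI}})_{j,j}\,a(j) + \sum_{k=1,\,k \neq j}^{\tilde{n}_t} (\mathbf{W}_{\mathrm{MI}})_{j,k}\,a(k)}{\sigma_{u,\mathrm{MMSE\text{-}MI}}(j)}\right) \tag{6.5}$$

where $C^{\tilde{n}_t}$ denotes the set of all the $\tilde{n}_t$-tuples whose components are the symbols of the assumed constellation and

$$\sigma_{u,\mathrm{MMSE\text{-}MI}}^2(j) = \sigma_{u,\mathrm{ZF\text{-}MI}}^2(j)\left(\mathbf{W}_{\mathrm{MI}} \mathbf{W}_{\mathrm{MI}}^H\right)_{j,j}, \quad j = 1, 2, \ldots, \tilde{n}_t \tag{6.6}$$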
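The averaging in (6.5) can be illustrated with a short enumeration. The sketch below is not the book's procedure: it assumes a real-valued BPSK model, conditions on $a(j) = +1$ (which, by the symmetry of BPSK, leaves the average unchanged), and uses hypothetical names (`q_func`, `bep_zf`, `bep_mmse_exact`) and a made-up 2-by-2 matrix in place of $\mathbf{W}_{\mathrm{MI}}$.

```python
import itertools
import math

import numpy as np

def q_func(x):
    """Gaussian Q-function, the tail probability of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bep_zf(sigma2_u, symbol_energy=1.0):
    """BEP of a ZF detector from the output noise variance, as in (6.3)."""
    return q_func(math.sqrt(symbol_energy / sigma2_u))

def bep_mmse_exact(W, sigma_u, j):
    """Average of the Q-function over all BPSK interferer patterns, given a(j) = +1."""
    n = W.shape[0]
    others = [k for k in range(n) if k != j]
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(others)):
        interference = sum(W[j, k] * s for k, s in zip(others, signs))
        total += q_func((W[j, j] + interference) / sigma_u)
    return total / 2 ** len(others)

# Hypothetical example: weak residual coupling between two symbols.
W = np.array([[1.0, 0.2], [0.2, 1.0]])
print("no residual interference:", bep_mmse_exact(np.eye(2), 0.5, 0))
print("with residual interference:", bep_mmse_exact(W, 0.5, 0))
```

The enumeration grows exponentially with the number of symbols involved, which is the motivation for the Gaussian approximation discussed next in the text.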
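6.1.3 Gaussian Approximation for BEP Evaluation

The Gaussian approximation for the MAI/ISI term allows for a simplification in the evaluation of the BEP in closed form. We first express the SNIR as a function of the component $j$ ($j = 1, 2, \ldots, \tilde{n}_t$) of the soft decision vector for both ZF-MI and MMSE-MI detectors:

$$\Gamma_{\mathrm{ZF\text{-}MI}}(j) = \frac{E[|a(j)|^2]}{\sigma_{u,\mathrm{ZF\text{-}MI}}^2(j)} \tag{6.7}$$

$$\Gamma_{\mathrm{MMSE\text{-}MI}}(j) = \frac{E[|a(j)|^2]\,\left|(\tilde{\mathbf{W}}_{\mathrm{MI}})_{j,j}\right|^2}{\sigma_{\mathrm{MAI\text{-}ISI}}^2(j) + \sigma_{u,\mathrm{MMSE\text{-}MI}}^2(j)} \tag{6.8}$$

($j = 1, \ldots, \tilde{n}_t$), where $\sigma_{u,\mathrm{MMSE\text{-}MI}}^2(j)$ is given by (6.6) and

$$\sigma_{\mathrm{MAI\text{-}ISI}}^2(j) \triangleq \left(\mathbf{W}_{\mathrm{MI}} \mathbf{R}_a \mathbf{W}_{\mathrm{MI}}^H\right)_{j,j} - 2\,\mathrm{Re}\left\{\left(\mathbf{W}_{\mathrm{MI}} \mathbf{R}_a\right)_{j,j}\left(\mathbf{W}_{\mathrm{MI}}^H\right)_{j,j}\right\} + E[|a(j)|^2]\,\left|(\mathbf{W}_{\mathrm{MI}})_{j,j}\right|^2 \tag{6.9}$$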
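A minimal sketch of how the SNIR of (6.8) and the interference variance of (6.9) could be evaluated numerically is given below. It rests on assumptions that are not in the text: the total signal power at the detector output is taken as $(\mathbf{W} \mathbf{R}_a \mathbf{W}^H)_{j,j}$, the symbol energy is read from the diagonal of $\mathbf{R}_a$, and the function names and the 2-by-2 example matrix are hypothetical.

```python
import math

import numpy as np

def q_func(x):
    """Gaussian Q-function, the tail probability of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def snir_mmse(W, R_a, sigma2_u, j):
    """SNIR of component j: useful power over residual MAI/ISI plus noise power."""
    symbol_energy = R_a[j, j].real
    w_jj = W[j, j]
    total = (W @ R_a @ W.conj().T)[j, j].real
    cross = 2.0 * ((W @ R_a)[j, j] * np.conj(w_jj)).real
    sigma2_int = total - cross + symbol_energy * abs(w_jj) ** 2
    return symbol_energy * abs(w_jj) ** 2 / (sigma2_int + sigma2_u)

def bep_gaussian(W, R_a, sigma2_u, j):
    """Gaussian approximation of the BEP: Q(sqrt(SNIR))."""
    return q_func(math.sqrt(snir_mmse(W, R_a, sigma2_u, j)))

# Hypothetical example with uncorrelated unit-energy symbols.
W = np.array([[1.0, 0.2], [0.2, 1.0]])
R_a = np.eye(2)
print("SNIR:", snir_mmse(W, R_a, 0.25, 0))
print("approximate BEP:", bep_gaussian(W, R_a, 0.25, 0))
```

For uncorrelated unit-energy symbols the interference variance reduces to the sum of the squared off-diagonal entries of row $j$, which is a quick sanity check on the three terms of (6.9).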
For the ZF-MI detector, the exact and Gaussian analyses coincide, since the noise is given by $\mathbf{u}_{\mathrm{ZF\text{-}MI}}$, which is strictly Gaussian. For the MMSE-MI detector, we can use the Gaussian approximation in the denominator of $\Gamma_{\mathrm{MMSE\text{-}MI}}(j)$, obtaining

$$P_{e,\mathrm{MMSE\text{-}MI}}(j) \approx Q\left(\sqrt{\Gamma_{\mathrm{MMSE\text{-}MI}}(j)}\right), \quad j = 1, 2, \ldots, \tilde{n}_t \tag{6.10}$$

As shown in Figure 6.1, the approximation yields good results even for a low number of active codes, extending the results of [3] to multirate systems.

6.1.4 BER and BEP Performance in a Single-Cell Scenario

The performance measure is either the average a posteriori BER or the average a priori BEP versus $E_b/N_0$ ($E_b$ being the average received energy per bit). Most of the results have been obtained by computer simulation, and some by theoretical analysis using both the exact analysis and the Gaussian approximation.

In Figure 6.1 we present a BEP comparison between the approximate analysis based on the Gaussian approximation (solid line) and the exact analysis (points marked with triangles) for a single-cell system with three active codes: $Q_1 = 4$, $Q_2 = Q_3 = 8$. The theoretical analysis based on the Gaussian approximation is in excellent agreement with the exact BEP evaluation, extending the validity of the Gaussian approach outlined in [3] to multiple-rate systems.

In Figure 6.2 we present the BER performance of the four types of receiver operating in a single-cell scenario where three codes with spreading factors $Q_1 = 4$, $Q_2 = Q_3 = 8$ are active. The BER performance has been evaluated under two extreme situations: (1) a pure Rice channel with deterministic amplitude of the echoes, and (2) a pure Rayleigh fading channel where the echo amplitudes are assumed to be random variables but remain unchanged along the whole burst. Perfect channel estimation has been assumed; hence the performance curve can be regarded as a lower bound for any practical receiver.
For a deterministic channel (curves marked with circles), no difference in performance can be seen between the ZF-MI, MMSE-MI, and MMSE-I, and a slight loss in performance is exhibited by the ZF-I receiver. We can conclude that in the presence of a direct propagation path and for the multipath profile considered, an equalizer for ISI could suffice to restore the orthogonality of the codes. Note that all curves are very close to the single user lower bound. On the contrary, in the presence of a pure Rayleigh channel, the performance is significantly worse than the previous one. The ZF-MI, MMSE-MI, and MMSE-I receivers perform approximately in the same way, equal to the single user bound for a random channel, while the ZF-I has a slightly worse performance.

[Plot: BEP versus $E_b/N_0$ (dB); curves: Gaussian approximation, exact BEP.]
Figure 6.1 Comparison between BEP performance obtained by the Gaussian approximation and the exact evaluation for the MMSE-MI receiver assuming a deterministic Rice channel, with three active codes: $Q_1 = 4$, $Q_2 = Q_3 = 8$.
[Plot: BER versus $E_b/N_0$ (dB); curves: ZF-MI, MMSE-MI, MMSE-I (deterministic channel); ZF-I (deterministic channel); ZF-MI, MMSE-MI, MMSE-I (random channel); ZF-I (random channel); single user lower bound.]
Figure 6.2 BER performance of the receivers proposed for a deterministic and a random channel with $\rho = 0.5$. © 2000 IEEE [5].
In Figure 6.3 we present the performance of the ZF-MI and MMSE-MI receivers, for a constant deterministic channel, under various system loads. The normalized system load is defined as $\rho = \sum_i Q_i^{-1}$, where $i$ runs over all active codes ($0 \le \rho \le 1$). The performance is almost independent of the combination of spreading factors used by the system with light to moderate loads ($\rho \le 0.5$), and it is almost equivalent to the single user lower bound. A loss in performance appears with a greater load; see, for example, the case of $\rho = 0.875$.

In Figure 6.4 we present the BER performance of the MMSE-MI detector as a function of the SNR in the presence of five data streams with spreading factor 16 and oversampling factor $\beta = 2$. Unlike the cases shown in the previous figures, matrix $\mathbf{F}$ is reconstructed using channel estimates obtained from the midamble of the data frame. Two channel estimation algorithms derived from [6] are considered: (1) a low-complexity channel estimation algorithm based on correlation, and (2) the least squares estimator that asymptotically attains the Cramér-Rao lower bound, implemented by a cost-effective DFT technique. We notice that the proposed detectors are sensitive to the channel estimation error caused by the bias inherent in the correlation estimation algorithm. For an equal BER, the performance attained by the correlation estimation algorithm is worse by 1.3 dB with respect to the curves obtained with the least squares channel estimator. In the latter case, the performance is very close to the ideal case where the channel state information is perfectly known.
[Plot: BER versus $E_b/N_0$ (dB); curves labeled by spreading factors and load: 16-16, $\rho = 0.125$; 4-8-16, $\rho = 0.4375$; 8-8-8-8, $\rho = 0.5$; 2-4-8, $\rho = 0.875$; single user lower bound.]
Figure 6.3 BER performance of the ZF-MI and MMSE-MI detectors, under different normalized system loads $\rho$ in the presence of a deterministic Rice channel. © 2000 IEEE [5].
[Plot: BER versus $E_b/N_0$ (dB); curves: MMSE-MI (perfect channel state information); MMSE-MI (ML channel estimation); MMSE-MI (correlation channel estimation).]
Figure 6.4 BER performance comparison for the MMSE-MI detector assuming perfect channel state information and an estimated channel using two different algorithms. There are five active codes ($Q_1 = \cdots = Q_5 = 16$).
6.1.5 BER Performance in a Multiple-Cell Scenario

In this section we present some numerical results to assess the receiver's performance in a multiple-cell scenario. The theoretical analysis reported in Section 6.1.2 still holds, but it may be more complex than a traditional simulation. Hence, all results reported in this section are obtained through computer simulations.

The assumed modulation scheme is BPSK. The chip pulse shaping is a root raised cosine with roll-off $\alpha = 0.22$, and the oversampling factor is $\beta = 2$ with respect to the chip rate equal to 3.840 Mchip/s. The multipath channel is a 3-ray channel where the relative powers of the three taps are $\sigma_{h_0}^2 = 0$ dB, $\sigma_{h_1}^2 = 9.5$ dB, and $\sigma_{h_2}^2 = 33$ dB, and the delays are $\tau_0 = 0$, $\tau_1 = 240$ ns, and $\tau_2 = 485$ ns, respectively. The delay spread of the overall channel (convolution of the multipath channel and the chip pulse) is truncated to the value $P = 7$. The OVSF codes have normalized unit energy such that the system operates with constant energy per bit $E_b$. Due to the short burst used in the transmission [1], the channel can be regarded as constant over the burst.

As a simulation environment, we have considered a two-cell system where the cells are circular with equal radius $r$. The BSs are located at the center of each cell. We have assumed that the mobile station (MS)
considered can roam within the cell of interest along the line connecting the two BSs. We have also assumed that the average power received by an MS from a BS decays as $1/d^4$, where $d$ is the distance between the MS and the BS. Defining $P_d$ and $P_i$ as the average power of the two CDMA signals received from the desired BS and from the interfering BS, respectively, it is easy to show that at the MS we have the following ratio of powers:

$$P_d / P_i = \left(\frac{r + l}{r - l}\right)^4 \tag{6.11}$$

where $l$ is the distance of the MS from the boundary of the cell of interest.

In Figure 6.5 we present the comparison between the performance of the receivers in operating modes 1 and 2, described in Section 4.4, assuming a single-cell scenario. We have considered three active codes, whose spreading factors are $Q_1 = 4$ and $Q_2 = Q_3 = 8$. The performance refers to the detection of the fastest user (i.e., the one with $Q_1 = 4$), modeled using a structured description, whereas the two other codes are treated as colored noise according to the models described in Chapter 4. We can see only a small loss in performance using the unstructured approach (operating mode 2 in Section 4.4), which offers a strong complexity reduction with respect to the fully structured approach (operating mode 1 in Section 4.4).

In Figure 6.6 we present the performance of the MMSE-MI receiver in operating modes 3 and 4, for a constant deterministic channel in a two-cell scenario, for a fixed $E_b/N_0 = 7$ dB. The multiple-cell model comprises, as mentioned, two cells whose radii are $r = 500$ m. Figure 6.6 shows the BER versus the percent penetration $l/r$ (%) of the MS in the cell for three detection strategies: direct intracell and intercell interference mitigation (mode 4); unstructured intercell interference modeled as colored Gaussian noise (mode 3); and unstructured intercell interference modeled as white Gaussian noise (mode 3 simplified). As a performance lower bound, Figure 6.6 also reports the performance of the MMSE-MI receiver in a single-cell environment, for $E_b/N_0 = 7$ dB. It can be seen that when the MS is close to the serving BS, all the receivers perform almost as in a single-cell scenario. At the cell boundaries, the receiver that treats the intercell interference as structured (mode 4) is the best choice. For penetration greater than 15%, the three operating modes considered perform in a comparable way. When the penetration is greater than 25%, there is practically no difference between the performance of the three operating modes.

Figure 6.7 compares the performance of the MMSE-MI receivers operating according to modes 3 and 4 exactly on the border of the cell of interest ($l = 0$, i.e., handover scenario), where the powers of the two BSs are equal. In this case the receiver performing direct mitigation of both intracell and intercell interference (mode 4) is clearly the best. The use of the other receiver is completely impractical in such a situation.

[Plot: BER versus $E_b/N_0$ (dB); curves: unstructured intracell, structured intracell.]
Figure 6.5 BER performance of the MMSE-MI receiver operating in modes 1 and 2 as a function of $E_b/N_0$, in a single-cell scenario.
[Plot: BER versus penetration $l/r$ (%); curves: structured; unstructured, colored noise; unstructured, white noise; single cell.]
Figure 6.6 BER performance of the MMSE-MI receivers operating in modes 3 and 4 for a two-cell scenario, at $E_b/N_0 = 7$ dB, as a function of the percent penetration of the MS in the cell of interest.
[Plot: BER versus $E_b/N_0$ (dB); curves: unstructured, colored noise; structured.]
Figure 6.7 BER performance of the MMSE-MI receivers operating in modes 3 and 4 for a two-cell scenario, as a function of $E_b/N_0$, across the borders of the cell of interest.
Figure 6.8 shows a comparison similar to the previous one, with the MS at a distance of $l = 40$ m from the cell boundary, which yields $l/r = 8\%$. A superior performance of the structured operating mode (mode 4) is still evident, although the receiver of mode 3, based on an unstructured model, performs better than in the previous situation.
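To give a feeling for the power imbalance implied by (6.11) at the two MS positions discussed above, the following small sketch evaluates the ratio in decibels. The function names and the default path-loss exponent argument are assumptions of this example; the cell radius of 500 m and the distances $l = 0$ and $l = 40$ m are the values quoted in the text.

```python
import math

def power_ratio_db(r, l, exponent=4):
    """Desired-to-interfering power ratio of (6.11) in dB, for 0 <= l < r."""
    return 10.0 * exponent * math.log10((r + l) / (r - l))

def penetration_percent(r, l):
    """Penetration of the MS into the cell, l/r, expressed in percent."""
    return 100.0 * l / r

r = 500.0  # cell radius in meters
for l in (0.0, 40.0):
    print(f"l = {l:5.1f} m, l/r = {penetration_percent(r, l):4.1f} %, "
          f"Pd/Pi = {power_ratio_db(r, l):.2f} dB")
```

At the cell border the two base stations are received with equal power, while a penetration of only a few percent already gives the desired signal an advantage of a few decibels, in line with the trend of Figure 6.6.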
[Plot: BER versus $E_b/N_0$ (dB); curves: unstructured, colored noise; structured.]
Figure 6.8 BER performance of the MMSE-MI receiver operating in modes 3 and 4 for a two-cell scenario, as a function of $E_b/N_0$, near the border of the cell of interest.

6.2 Performance of the MMSE Receivers in Asynchronous Systems

This section is dedicated to the performance of the MMSE receivers in CDMA asynchronous systems. We will consider both the optimal MMSE receiver presented in Chapter 3 and the corresponding adaptive versions presented in Chapter 5. Specifically, we have evaluated both transient and steady-state performance in terms of the following:

• MSE convergence for the blind and trained adaptive MMSE detectors. The convergence curves represent the MSE, averaged over 500 independent simulations, between the output sample of the detector and the transmitted symbol. The MSE is plotted as a function of the transmitted symbol index.

• BER performance curves for the optimal, blind, and trained adaptive MMSE detectors. The BER curves present the receiver's performance as a function of $E_b/N_0$, the average energy per bit over noise spectral density. A first set of curves is obtained assuming perfect synchronization at the receiver side, no phase or frequency offset, and no near-far effect. A second set of curves is devoted to the case of nonperfect timing, presence of phase and frequency offset, and near-far effect. The lower bound of the performance refers to the uncoded single-user BPSK or QPSK modulation.

All numerical results presented in this section are obtained by computer simulation.

6.2.1 Features of the Asynchronous System

To assess the performance of the asynchronous case, we have considered a CDMA system employing Gold sequences with spreading factor $Q = 63$. The transmission filter $p(t)$ has a frequency response that is a root raised cosine filter, and $q(t)$ is matched to $p(t)$. The time truncation of the overall channel filter is set to $P = 4$. The modulation format is QPSK. Two types of asynchronous systems are considered for simulations: (1) systems where all users are asynchronous to one another, and (2) systems
where block-synchronous groups of codes are active and the groups are asynchronous to one another. The chip rate considered is 4.032 Mchip/s, and the symbol rate is 64 ksymbol/s. Further specification of the simulation setup is provided case by case.
6.2.2 Transient Performance: MSE Convergence

In this section we assess the transient performance of both the blind and trained adaptive receivers in terms of MSE. As the performance lower bound we consider the lowest achievable MSE (i.e., that of the optimal MMSE receiver). The convergence performance presented refers to the case of a CDMA system with six asynchronous active codes. The delays of the users are random and their amplitudes are all equal (no near-far effect). Perfect timing recovery is also assumed.

The transient behavior of both blind and trained adaptive receivers is dominated by the choice of the step size $\mu$, which affects the stability and the convergence speed of the algorithm. This parameter also affects the steady-state MSE performance, as detailed in the following numerical results. For the trained detector, another critical design parameter is the number of training symbols employed by the receiver (i.e., when we switch from the trained mode to the decision directed mode). Sensitivity to these two design parameters is shown in this section.

Figures 6.9 and 6.10 represent the convergence curves using $\mu = 0.01$ and $\mu = 0.001$ for the trained adaptive MMSE receiver, when $E_b/N_0 = 3$ dB, in the absence of phase and frequency offset. The solid horizontal line marking the level of $2 \cdot 10^{-1}$ represents the theoretical MSE lower bound obtained by the optimal MMSE detector. This lower bound value is shared by Figures 6.9, 6.10, and 6.11. We note that for $\mu = 0.01$, convergence is faster than for $\mu = 0.001$: in the former case the steady-state MSE is reached in about 250 iterations, while in the latter case it takes about 2,000. Faster convergence in the former case, however, involves a greater steady-state MSE compared to the latter case, for which the MSE of the optimal MMSE algorithm is almost approached.
Figures 6.9 and 6.10 are simulations of trained adaptive detectors for which we switch from the training mode to the decision directed mode after 50 information symbols. If we switch to the decision directed mode earlier (for example, after 10 symbols), we suffer a larger steady-state MSE, as shown in Figure 6.11. In this case the distance from the optimal MSE performance curve is much greater, and we should expect a poor BER performance too.
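The interplay between step size, training length, and the switch to decision directed operation can be reproduced qualitatively with a toy LMS experiment. The sketch below is not the simulation setup of this chapter: it assumes a synchronous, real-valued BPSK system with three hypothetical 16-chip codes, a single run instead of an average over 500 runs, and an arbitrary noise level; the function name `lms_trained_dd` and all parameter values are illustrative only.

```python
import numpy as np

def lms_trained_dd(Y, a1, mu, n_train):
    """LMS detector: trained for n_train symbols, decision directed afterwards.

    Y holds one received chip vector per row; a1 holds the true BPSK symbols of
    the desired user (used as reference only during training and to log errors).
    """
    n_symbols, n_chips = Y.shape
    w = np.zeros(n_chips)
    sq_err = np.empty(n_symbols)
    for m in range(n_symbols):
        out = w @ Y[m]
        if m < n_train:
            ref = a1[m]                        # known training symbol
        else:
            ref = 1.0 if out >= 0 else -1.0    # hard decision on the output
        sq_err[m] = (a1[m] - out) ** 2         # error against the true symbol
        w = w + mu * (ref - out) * Y[m]        # LMS tap update
    return w, sq_err

# Hypothetical toy system: three correlated codes of 16 chips each.
n_chips, n_symbols = 16, 3000
codes = np.ones((3, n_chips))
codes[1, 12:] = -1.0
codes[2, 8:] = -1.0
codes /= np.sqrt(n_chips)

rng = np.random.default_rng(1)
symbols = rng.choice([-1.0, 1.0], size=(n_symbols, 3))
Y = symbols @ codes + 0.3 * rng.standard_normal((n_symbols, n_chips))

w, sq_err = lms_trained_dd(Y, symbols[:, 0], mu=0.01, n_train=50)
print("mean squared error, first 50 symbols:", sq_err[:50].mean())
print("mean squared error, last 500 symbols:", sq_err[-500:].mean())
```

Rerunning the sketch with a smaller `mu` or a shorter `n_train` is a simple way to explore the trade-offs described above, keeping in mind that a single run is far noisier than the averaged curves of Figures 6.9 through 6.11.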
[Plot: MSE averaged over 500 independent simulations versus iteration number; curve: trained MMSE.]
Figure 6.9 Convergence curve for the trained adaptive MMSE detector with $\mu = 0.01$ and $E_b/N_0 = 3$ dB. Near-far effect, phase and frequency offset are absent. The receiver switches from the training mode to the decision directed mode after 50 symbols.
[Plot: MSE averaged over 500 independent simulations versus iteration number; curve: trained MMSE.]
Figure 6.10 Convergence curve for the trained adaptive MMSE detector with $\mu = 0.001$ and $E_b/N_0 = 3$ dB. Near-far effect, phase and frequency offset are absent. The receiver switches from the training mode to the decision directed mode after 50 symbols.
[Plot: MSE averaged over 500 independent simulations versus iteration number; curve: trained MMSE.]
Figure 6.11 Convergence curve for the trained adaptive MMSE detector with $\mu = 0.001$ and $E_b/N_0 = 3$ dB. Near-far effect, phase and frequency offset are absent. The receiver switches from the training mode to the decision directed mode after 10 symbols.
Figure 6.12 compares the convergence curve for $E_b/N_0 = 17$ dB of the blind MMSE receiver (with $\mu = 0.002$) with that of the trained MMSE receiver (with $\mu = 0.01$), in the absence of phase and frequency offset. We notice that the optimal MSE lower bound reaches the value of approximately $10^{-2}$, much lower than in the three previous cases, due to the higher value of $E_b/N_0$ considered. In agreement with the results presented in [7], we notice a faster convergence for the blind detector with respect to the trained one, although in the latter case the steady-state mean square error is much lower.

Finally, in Figure 6.13 we have tested the MSE performance of a trained detector operating in a system where all six users are affected by a frequency offset of 50 Hz. In this case the adaptive algorithm has the task of acquiring a time-varying signature due to the presence of frequency offset. Simultaneously, the cross-correlations of the desired user's code with the interfering codes are also time-varying. The trained adaptive algorithm operates in a hostile environment. We expect that, in this case, this data-aided algorithm will fail in the decision directed mode when decisions become sufficiently unreliable. The convergence has been tested for $E_b/N_0 = 6, 7, 8$ dB, with a step size $\mu = 0.05$, switching from the training mode to the decision directed
mode after 500 symbols. We have selected a long training sequence to ensure that the sensitivity to this parameter can be eliminated. For $E_b/N_0 = 6$ dB, as soon as we switch from the training mode to the decision directed mode, the MSE starts increasing and diverges very quickly. The corresponding BER performance will be very poor. This is due to the combined adverse effect of using several wrong decisions while tracking a solution that varies rapidly because of the frequency offset. Convergence is maintained when the receiver operates at larger signal-to-noise ratios because the decisions become more reliable. We can infer that, for the system considered, $E_b/N_0 = 7$ dB sets the threshold of operation for the receiver.

[Plot: MSE averaged over 500 independent simulations versus iteration number; curves: blind MMSE, trained MMSE.]
Figure 6.12 Comparison between the MSE convergence speed of a trained adaptive MMSE detector ($\mu = 0.01$) and that of a blind adaptive MMSE detector ($\mu = 0.002$) for $E_b/N_0 = 17$ dB. The CDMA system has six asynchronous users, and no near-far effect, phase or frequency offset is present.

[Plot: MSE averaged over 500 independent simulations versus iteration number; curves: signal-to-noise ratio = 6 dB, 7 dB, 8 dB.]
Figure 6.13 MSE convergence curve of the trained adaptive MMSE algorithm for $E_b/N_0 = 6, 7, 8$ dB, where each user has a frequency offset of 50 Hz.
6.2.3 BER Performance of the MMSE Detectors in the Ideal Case

In this section, we assess the steady-state BER performance of the conventional, optimal MMSE, blind and trained adaptive MMSE receivers, in the ideal case of perfect parameter recovery (timing, phase, and frequency offset). The performance of the CDMA detectors is compared with that of the single-user lower bound for QPSK uncoded modulation. In all cases the trained detector switches from the training mode to the decision directed mode after 50 symbols. The performance of the conventional detector is expected to degrade when the number of users increases or in the presence of near-far effect, while the MMSE detectors should be more robust to these phenomena.

Figures 6.14 and 6.15 present the BER as a function of $E_b/N_0$ for the conventional receiver, the optimal MMSE receiver, the trained adaptive MMSE detector ($\mu = 0.001$), and the blind adaptive MMSE detector ($\mu = 0.002$). Besides the above general assumptions, we assume the absence of near-far effect. Figure 6.14 shows the BER performance of a system with six asynchronous users, while Figure 6.15 represents the BER performance of a system with 18 codes, synchronous in groups of six.

Figure 6.14 shows that the trained MMSE detector suffers minor performance degradation with respect to the optimal MMSE, and it is very close to the single user lower bound. The blind detector has worse performance than the trained detector, but it is still better than a conventional receiver (e.g., for a BER of approximately $2 \cdot 10^{-4}$, the gain is 2 dB with respect to the conventional receiver). Comparing Figures 6.14 and 6.15, we notice the performance degradation of the conventional detector for an increasing number of users. The performance of the blind
and trained receivers also degrades when the number of users increases, but less notably than in the case of a conventional detector. In particular, the performance of a trained detector is very close to the optimal MMSE even for 18 users.

Finally, Figure 6.16 represents the BER behavior of the receivers considered in a system with six asynchronous codes, when a severe near-far effect is present. Specifically, the amplitude of the five interferers is three times larger than that of the useful signal. In this situation we notice that a conventional receiver is completely useless. The blind detector, though affected by a performance degradation compared to Figure 6.14, maintains acceptable behavior. The trained detector, however, is very robust because the procedure of learning the signature during the training period allows the receiver to become aware of the near-far situation and consequently adjust its tap coefficients.

[Plot: BER versus $E_b/N_0$ (dB); curves: conventional receiver with perfect timing recovery; single user bound; optimal MMSE; blind MMSE with perfect timing recovery ($\mu = 0.002$); trained MMSE with perfect timing recovery ($\mu = 0.001$).]
Figure 6.14 BER performance in a CDMA system with six asynchronous codes, perfect timing recovery, no phase or frequency offset, and no near-far effect.
[Figure 6.15 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 14 dB). Curves: conventional receiver with perfect timing recovery; single-user bound; optimal MMSE; blind MMSE with perfect timing recovery (μ = 0.002); trained MMSE with perfect timing recovery (μ = 0.001).]
Figure 6.15 BER performance in a CDMA system with 18 block-synchronous codes (three sets of six synchronous codes), perfect timing recovery, no phase or frequency offset, and no near-far effect.
6.2.4 BER Performance of the MMSE Detectors in the Presence of a Timing Error
In this section we assess the steady-state BER performance of the conventional, optimal MMSE, blind adaptive MMSE, and trained adaptive MMSE receivers in the presence of a timing error, while the other parameters (phase and frequency offset) are assumed to be perfectly recovered. Specifically, the desired data stream has a possible timing acquisition error whose maximum value is equal to half a chip interval. No near-far effect is considered in this case. Again, the BER performance of the CDMA detectors is compared with the single-user lower bound for uncoded QPSK modulation. In all cases the trained detector switches from the training mode to the decision-directed mode after 50 symbols. Moreover, the sensitivity of the detectors to this less than ideal operating condition is
[Figure 6.16 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 in dB. Curves: conventional receiver with perfect timing recovery; single-user bound; optimal MMSE; blind MMSE with perfect timing recovery (μ = 0.002); trained MMSE with perfect timing recovery (μ = 0.001).]
Figure 6.16 BER performance of the detectors when the interferers have amplitude three times larger than that of the desired code (near-far effect). The CDMA system has six active codes and no phase or frequency offset.
evaluated, and possible countermeasures to avoid performance losses are illustrated.
Figure 6.17 refers to a system with six asynchronous codes, and Figure 6.18 to a system with 18 codes (organized as three blocks of six synchronous codes). The figures present the BER performance of the trained adaptive MMSE receiver when the timing of the useful signal is not perfectly recovered. Fixed timing offsets are considered to test the robustness of the receiver, specifically the values Tc/2 and Tc/4. The performance loss is almost negligible compared to the case of perfect timing recovery. It can be shown that the trained adaptive receiver is robust to timing errors up to a few chip intervals, because of its capability of learning the operating condition in the training mode. On the contrary, the blind adaptive MMSE detector is much more sensitive to timing offsets, in agreement with the conclusions drawn in [7].
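One way to see why a detector anchored to a nominal signature is sensitive to timing offsets is to correlate the nominal signature with a delayed copy of itself. The sketch below is only illustrative: it assumes rectangular chip pulses, eight samples per chip, a hypothetical length-8 code, and it ignores the contribution of the previous symbol. None of these choices is taken from the simulations reported here.

```python
def oversample(code, sps):
    """Repeat each chip `sps` times (rectangular chip pulse, an assumption)."""
    return [c for c in code for _ in range(sps)]

def delayed(signal, d):
    """Delay by d samples, zero-filling the start (previous symbol ignored)."""
    return [0.0] * d + list(signal[:len(signal) - d])

def normalized_correlation(x, y):
    """Correlation of x and y, normalized by the energy of x."""
    num = sum(a * b for a, b in zip(x, y))
    den = sum(a * a for a in x)
    return num / den

code = [1, -1, 1, 1, -1, -1, 1, -1]    # hypothetical length-8 spreading code
SPS = 8                                 # samples per chip: one sample = Tc/8
nominal = oversample(code, SPS)

for d, label in [(0, "0"), (1, "Tc/8"), (2, "Tc/4"), (4, "Tc/2")]:
    rho = normalized_correlation(nominal, delayed(nominal, d))
    print(f"timing error {label:>4}: correlation with nominal signature = {rho:.3f}")
```

The correlation falls steadily as the offset grows, which is the signature mismatch a blind detector cannot see, whereas a trained detector learns the shifted signature directly from the data.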
[Figure 6.17 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 14 dB). Curves: conventional receiver with perfect timing recovery; single-user bound; optimal MMSE; trained MMSE with perfect timing recovery (μ = 0.001); trained MMSE with timing error equal to Tc/4 (μ = 0.001); trained MMSE with timing error equal to Tc/2 (μ = 0.001).]
Figure 6.17 Performance comparison between the trained adaptive MMSE receiver with perfect timing recovery and in the presence of a timing error. The CDMA system has six asynchronous codes, no phase or frequency offset, and no near-far effect.
Furthermore, a timing error causes ICI, which is detrimental to the performance of the blind detector. In fact, a timing error causes a mismatch between the nominal signature used by the blind detector and the actual signature associated with the desired data stream. The effect of this mismatch is highlighted in Figures 6.19 and 6.20, which consider a system with six users and a system with 18 users (synchronous 6 by 6), respectively. In both figures we see that, in the absence of any countermeasure (surplus energy constraint), the blind detector exhibits BER floors for large Eb/N0. The levels of these error floors depend on the number of active users and on the timing error (see, for example, the case of Tc/4 in both figures). As detailed in the previous chapter, to counteract the appearance of this error floor and improve the BER performance of the blind detector for large
[Figure 6.18 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 14 dB). Curves: conventional receiver with perfect timing recovery; single-user bound; optimal MMSE; trained MMSE with perfect timing recovery (μ = 0.001); trained MMSE with timing error equal to Tc/4 (μ = 0.001); trained MMSE with timing error equal to Tc/2 (μ = 0.001).]
Figure 6.18 Performance comparison of the trained adaptive MMSE receiver with perfect timing recovery and in the presence of a timing error on the desired user. The CDMA system has 18 active codes (three blocks of six synchronous codes), no phase or frequency offset, and no near-far effect.
signal-to-noise ratios in the presence of a signature mismatch, we need a constraint on the surplus energy. The resulting BER curves are also reported in Figures 6.19 and 6.20, which show the effectiveness of such a constraint. The value of the parameter ν, which appears in the tap coefficient update, is optimized for best performance by trial and error (ν = 1 in Figure 6.19 and ν = 0.1 in Figure 6.20, according to the figure legends). As shown in Figures 6.19 and 6.20, with a surplus energy constraint the error floors disappear for all the timing errors considered. As an example, for a timing error of Tc/8, the surplus energy constraint brings the BER performance very close to the case of no timing error.
6.2.5 BER Performance of the MMSE Detectors in the Presence of a Phase or Frequency Offset
In this section we assess the steady-state BER performance of the conventional, optimal MMSE, blind adaptive MMSE, and trained adaptive MMSE receivers in the presence
[Figure 6.19 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 14 dB). Curves: blind MMSE with timing error equal to Tc/8 (μ = 0.0001); blind MMSE with timing error equal to Tc/4 (μ = 0.0001); blind MMSE with surplus energy (s.e.) constraint (ν = 1) and timing error equal to Tc/8; blind MMSE with s.e. constraint (ν = 1) and timing error equal to Tc/4; single-user bound; conventional receiver with perfect timing recovery; blind MMSE with perfect timing recovery (μ = 0.002).]
Figure 6.19 BER performance of the blind adaptive MMSE detector in the presence of a timing error. The CDMA system has six asynchronous codes, no phase or frequency offset, and no near-far effect.
of a phase or frequency offset, while perfect timing recovery is assumed. No near-far effect is considered in this case. Again, the BER performance of the CDMA detectors is compared with the single-user lower bound for uncoded QPSK modulation. In all cases the trained detector switches from the training mode to the decision-directed mode after 50 symbols. Critical operating conditions are considered and possible working thresholds are highlighted.
A phase offset, if constant during transmission, does not cause major problems in the convergence of the two adaptive algorithms for the offset values considered. As a general statement, a constant phase offset does not cause the steady-state solution of the MMSE detector to vary from iteration to iteration. Nevertheless, it affects the performance of the blind and trained detectors differently.
[Figure 6.20 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 18 dB). Curves: blind MMSE with timing error equal to Tc/8 (μ = 0.0001); blind MMSE with timing error equal to Tc/4 (μ = 0.0001); blind MMSE with s.e. constraint (ν = 0.1) and timing error equal to Tc/8; blind MMSE with s.e. constraint (ν = 0.1) and timing error equal to Tc/4; single-user bound; conventional receiver with perfect timing recovery; blind MMSE with perfect timing recovery (μ = 0.002).]
Figure 6.20 BER performance of the blind adaptive MMSE detector in the presence of a timing error. The CDMA system has 18 active codes (three sets of six block-synchronous codes), no phase or frequency offset, and no near-far effect.
In Figure 6.21 we show the effect of a phase offset affecting the 18 codes (synchronous 6 by 6) active in the system. Specifically, each group of six codes has a random phase offset between 0° and 10°. For the trained detector, the BER performance is practically the same as with no phase offset. The reason why the trained detector suffers no performance degradation from a phase offset lies, once again, in its learning capability. Larger values of the phase offset would not affect its performance either. The blind detector, instead, is sensitive to a phase offset because it causes a signature mismatch. The minor performance loss visible in Figure 6.21 is due to the low maximum value of the phase offset (10°). Larger values of the phase offset would cause error floors and call for the deployment of a surplus energy constraint.
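The small loss observed for offsets up to 10° can be made plausible with a short calculation. A constant phase offset θ multiplies the actual signature by exp(jθ), so its normalized correlation with the nominal signature is exp(jθ): the in-phase part scales as cos θ and a fraction sin θ leaks into the quadrature branch. The sketch below only illustrates this identity; the signature values and the function name are hypothetical.

```python
import cmath
import math

def rotated_signature_correlation(signature, theta_deg):
    """Normalized correlation between exp(j*theta)*s and the nominal s."""
    theta = math.radians(theta_deg)
    rotated = [cmath.exp(1j * theta) * c for c in signature]
    energy = sum(abs(c) ** 2 for c in signature)
    # Correlate the rotated signature with the conjugated nominal one
    corr = sum(r * complex(c).conjugate() for r, c in zip(rotated, signature))
    return corr / energy

s = [1, -1, 1, 1, -1, -1, 1, -1]        # hypothetical nominal signature
for theta in (0, 5, 10, 30, 45):
    rho = rotated_signature_correlation(s, theta)
    print(f"theta = {theta:2d} deg: in-phase = {rho.real:.3f}, "
          f"quadrature leakage = {rho.imag:.3f}")
```

At 10° the in-phase term is still above 0.98, which is in line with the minor loss noted in the text, while larger offsets produce a mismatch that is no longer negligible.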
[Figure 6.21 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 14 dB). Curves: conventional receiver in the absence of phase offset; single-user bound; trained MMSE in the absence of phase offset (μ = 0.001); trained MMSE with random phase offset for all users (μ = 0.001); blind MMSE in the absence of phase offset (μ = 0.02); blind MMSE with random phase offset for all users (μ = 0.002).]
Figure 6.21 BER performance comparison between the trained and blind adaptive MMSE detectors in the presence of a phase offset on all 18 asynchronous codes (synchronous 6 by 6). The phase offset is randomly generated between 0° and 10°; the frequency offset and the near-far effect are absent.
In Figure 6.22 we test the performance of the trained adaptive CDMA detector operating in a system with six asynchronous users in the presence of a frequency offset. As already stressed in the paragraph dedicated to the convergence curves, the trained adaptive MMSE detector converges in the presence of a frequency offset only for sufficiently large signal-to-noise ratios. For a frequency offset of 10 Hz, we need Eb/N0 ≥ 2 dB; for 50 Hz, we need at least 8 dB; for 100 Hz, the receiver converges from 10 dB on. The figure shows that, although convergence is assured, the BER performance degrades significantly. The trained detector is thus able to track the continuous rotation of the actual signature caused by the frequency offset. This learning ability is maintained in the decision-directed mode too, provided that the "instructor" (i.e., the detected data) is reliable enough. This happens if the signal-to-noise ratio is large enough, depending on the value of the frequency offset.
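The role of the "instructor" can be illustrated with a toy adaptive filter that is trained for 50 symbols and then fed back its own decisions. The sketch below uses a plain LMS update on real-valued BPSK data, two users, and no noise or frequency offset. The update rule, the signatures, the step size, and the function name `lms_detector` are illustrative assumptions, not the detector or the parameters used for the figures.

```python
import random

def lms_detector(signatures, mu, n_train, n_total, seed=0):
    """Train an LMS filter for user 0, then switch to decision-directed mode.

    Returns the final taps and the squared error (a_0 - output)^2 per symbol.
    """
    rng = random.Random(seed)
    n_chips = len(signatures[0])
    w = [0.0] * n_chips                        # taps start from zero
    sq_err = []
    for t in range(n_total):
        bits = [rng.choice((-1, 1)) for _ in signatures]   # BPSK data, all users
        # Noiseless received chip vector: superposition of the users' signatures
        r = [sum(b * s[i] for b, s in zip(bits, signatures)) for i in range(n_chips)]
        out = sum(wi * ri for wi, ri in zip(w, r))          # soft output
        if t < n_train:
            ref = bits[0]                      # training mode: symbol is known
        else:
            ref = 1 if out >= 0 else -1        # decision-directed mode
        err = ref - out
        w = [wi + mu * err * ri for wi, ri in zip(w, r)]    # LMS tap update
        sq_err.append((bits[0] - out) ** 2)
    return w, sq_err

# Two hypothetical unit-energy signatures with cross-correlation 0.5
sigs = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, -0.5]]
taps, sq_err = lms_detector(sigs, mu=0.1, n_train=50, n_total=500)
print("mean squared error, first 50 symbols:", sum(sq_err[:50]) / 50)
print("mean squared error, last 50 symbols: ", sum(sq_err[-50:]) / 50)
```

In this noiseless toy case the taps approach the vector that passes user 0 with unit gain and nulls user 1. If the decisions fed back after the switch were unreliable, the same update would be driven by a wrong reference, which is the failure mode the text associates with low signal-to-noise ratios.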
[Figure 6.22 plot: BER (logarithmic scale, 10⁰ down to 10⁻⁴) versus Eb/N0 (0 to 14 dB). Curves: single-user bound; optimal MMSE; trained MMSE with no frequency offset (μ = 0.001); trained MMSE with frequency offset of each user equal to 10 Hz (μ = 0.01); trained MMSE with frequency offset of each user equal to 50 Hz (μ = 0.05); trained MMSE with frequency offset of each user equal to 100 Hz (μ = 0.05).]
Figure 6.22 BER performance of the trained adaptive MMSE detector in the presence of a frequency offset equal to 10, 50, and 100 Hz for all users. The CDMA system has six asynchronous codes, with no phase offset and no near-far effect.
We emphasize that in this situation a traditional blind detector would not work, because the mismatch between the nominal and the actual signature is time-varying and a fixed constraint on the surplus energy cannot compensate for it.
6.2.6 Improving the Performance of Linear Detectors
In this chapter we have seen some numerical results regarding the linear detection schemes based on the ZF and MMSE criteria for synchronous systems, and the performance of adaptive and nonadaptive MMSE schemes for asynchronous systems.
In Section 6.1, we tested the different levels of abstraction in the description of both the intracell and the intercell interference that affect the downlink received signal of a multiple-cell multirate CDMA system. Specifically, we have proposed a description of the received signal having
tunable complexity by identifying two components: a structured part, containing the most significant part of the signal (either for its detection or for its rejection as interference), and an unstructured part, which can essentially be treated as colored Gaussian noise and is only statistically described. We have concluded that a structured description of the observation always leads to a detector with superior performance, especially in soft handover scenarios.
In Section 6.2, we assessed the performance of adaptive and nonadaptive MMSE detectors in asynchronous systems. We highlighted the advantages and disadvantages of the blind and trained detectors, showing their transient MSE performance and steady-state BER performance. The trained detector is robust, to some extent, to timing errors, phase offsets, and frequency offsets. Its drawback is that it requires a training sequence in order to operate. The blind detector is more fragile: it is not robust to timing errors, phase offsets, or frequency offsets. A constraint on the surplus energy can only compensate for non-time-varying mismatches between the known nominal signature and the actual signature.
In the next chapter we introduce another class of CDMA detectors: the nonlinear detectors, which can be used in conjunction with linear detectors or as stand-alone detectors. The underlying philosophy is that these detectors reconstruct the MAI and subtract it from the received signal. This is particularly useful when there are significant differences in power levels between the data streams of different codes (significant near-far effect).
References
[1] 3rd Generation Partnership Project (3GPP), Radio Interface Technical Specifications: TS 25.221, TS 25.222, TS 25.223, http://www.3gpp.org.
[2] Klein, A., G. Kaleh, and P. Baier, “Zero Forcing and Minimum Mean-Square-Error Equalization for Multiuser Detection in Code Division Multiple-Access Channels,” IEEE Transactions on Vehicular Technology, Vol. 45, No. 2, May 1996.
[3] Poor, H. V., and S. Verdú, “Probability of Error in MMSE Multiuser Detection,” IEEE Transactions on Information Theory, Vol. 43, No. 3, May 1997.
[4] Castoldi, P., and H. Kobayashi, “Co-channel Interference Mitigation Detectors for Multirate Transmission in TD-CDMA Systems,” IEEE Journal on Selected Areas in Communications, special issue on multiuser detection for the wireless channel, Vol. 20, No. 2, February 2002, pp. 273–286.
[5] Castoldi, P., and H. Kobayashi, “Low Complexity Group Detectors for Multirate Transmission in TD-CDMA 3G Systems,” IEEE Broadband Wireless Symposium (Globecom 2000), San Francisco, CA, November–December 2000.
[6] Steiner, B., and P. Jung, “Uplink Channel Estimation in Synchronous CDMA Mobile Radio System with Joint Detection,” 4th International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC ’93), Yokohama, Japan, September 1993.
[7] Honig, M., “Blind Adaptive Multiuser Detection,” IEEE Transactions on Information Theory, Vol. 41, No. 4, July 1995.
7 Nonlinear Multiuser Detection
In this chapter we describe a class of nonlinear multiuser detectors based on subtractive interference cancellation. They rely on estimating and reconstructing the MAI seen by each user, with the objective of canceling it from the received signal. This task is realized in different fashions, often with more than one stage, by means of an iterative procedure in which the estimate of the MAI, and consequently the reliability of the user's signal, improves stage by stage [1]. Interference subtraction is particularly efficient when the receiver operates at a high signal-to-noise ratio and the received powers of the users differ significantly (severe near-far effect) [2].
In Sections 7.1, 7.2, and 7.3 we use a simplified system model to introduce some concepts regarding the nonlinear approach to CDMA detection. The two basic nonlinear detection strategies, namely successive interference cancellation (SIC) and parallel interference cancellation (PIC), are illustrated. The former takes a serial (user by user) approach to interference cancellation. The latter takes a parallel approach, canceling all the interfering users at once, but it usually requires a few stages of parallel interference cancellation to achieve reasonable performance.
In Section 7.4 we present an example of a nonlinear CDMA detection scheme that can be implemented with moderate to high complexity, depending on the operating environment and the modulation format adopted. It is a trellis-based sequence estimator designed to operate in the synchronous multirate system presented in Chapter 1. It can be regarded as a competitor of the linear (nonadaptive) multiuser detectors presented in Chapter 3.
7.1 Introduction and System Model
The underlying philosophy of nonlinear detection is the same as that of the decision feedback mechanism used to counteract ISI in single-user communication systems. Subtractive interference cancellation receivers subtract the reconstructed MAI using tentative decisions on the interfering users. This procedure cancels the interfering user's signal if the decision is correct; otherwise it doubles the contribution of the interferer. The tentative decisions used to reconstruct the MAI can be hard or soft. When soft decisions are employed, the estimate usually includes the amplitude estimate of the interfering users and is easier to implement. The hard-decision approach feeds back a symbol decision and is nonlinear; it also requires an estimate of the relevant signal amplitude for proper reconstruction of its contribution to the MAI. In general, if reliable amplitude estimation is available, hard-decision subtractive interference cancellation outperforms its soft-decision counterpart [1].
In order to illustrate the principle of subtractive interference detectors, we adopt the synchronous multirate model without ISI presented in Chapter 2, with the slight modification of considering a user-dependent channel described by a coefficient $f_j$ ($j = 1, 2, \ldots, J$), where $J$ is the total number of users and $Q_j$ denotes the spreading factor of the $j$th user. The extension to the asynchronous case is straightforward and can be found in [2]. The scalar chip observation is given by
$$
y(i) = \sum_{j=1}^{J} f_j\, b_j(i) + v(i)
     = \sum_{j=1}^{J} f_j\, c_j\bigl(|i-1|_{Q_j}\bigr)\, a_j(n) + v(i),
\qquad n = \left\lceil \frac{i}{Q_j} \right\rceil
\tag{7.1}
$$
By considering $\mathbf{F}_j^{(N)} = f_j \mathbf{I}_N$, where $\mathbf{I}_N$ is the identity matrix, the observation vector of length $N$ can be written as
$$
\begin{aligned}
\mathbf{y}^{(N)}(i) &= \sum_{j=1}^{J} f_j\, \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \\
&= \sum_{j=1}^{J} f_j\, \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \\
&= \sum_{j=1}^{J} \mathbf{G}_j^{(N)}\, \mathbf{a}_j^{(N)}(i) + \mathbf{v}^{(N)}(i)
\end{aligned}
\tag{7.2}
$$
In order to illustrate subtractive interference cancellation in a synchronous system without ISI, we consider a case where the users' signatures are not perfectly orthogonal, such that the correlation matrix $\mathbf{R}_{jk}^{(N)}(i)$ between the $j$th and the $k$th code matrix on the considered processing window
$$
\mathbf{R}_{jk}^{(N)}(i) \triangleq \mathbf{C}_j^{(N)H}(i)\, \mathbf{C}_k^{(N)}(i)
\tag{7.3}
$$
is significantly different from the identity matrix $\mathbf{I}_N$ of order $N$. It can be shown that it is again convenient to consider a sliding window approach for nonlinear detection. Given the simplified system model, the optimal processing window has width $N = Q$, where $Q$ is the maximum spreading factor among those in use, and it is exactly synchronized with the symbol interval of the slowest data stream, as shown in Figure 7.1. For the generic $m$th processing window, we define
$$
\mathbf{y}_D \triangleq \mathbf{y}^{(Q)}(mQ)
\tag{7.4}
$$
$$
\mathbf{C}_j \triangleq \mathbf{C}_j^{(Q)}(mQ)
\tag{7.5}
$$
$$
\mathbf{a}_j \triangleq \mathbf{a}_j^{(Q)}(mQ)
\tag{7.6}
$$
$$
\mathbf{v} \triangleq \mathbf{v}^{(Q)}(mQ)
\tag{7.7}
$$
[Figure 7.1 diagram: symbol timing of four data streams over three consecutive processing windows yD(1), yD(2), and yD(3). Stream 1 (Q1 = 2): symbols a1(1) to a1(12); stream 2 (Q2 = 4): symbols a2(1) to a2(6); stream 3 (Q3 = 8): symbols a3(1) to a3(3); stream 4 (Q4 = 8): symbols a4(1) to a4(3).]
Figure 7.1 Selection of the optimal observation window for the case of no ISI, when four data streams with different spreading factors are active. In this case, the optimal window width is N = Q, where Q = Q3 = Q4 is the maximum spreading factor currently used by the system.
where we have omitted the dependence on $m$ on the left-hand side of the definitions. This notation allows us to express the observation of the generic processing window as
$$
\mathbf{y}_D = \sum_{j=1}^{J} f_j\, \mathbf{C}_j\, \mathbf{a}_j + \mathbf{v}
\tag{7.8}
$$
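To make the structure of (7.8) concrete, the sketch below builds the code matrices and the observation of one processing window for three streams with spreading factors 2, 4, and 8. It assumes that each code matrix holds one shifted copy of the spreading code per symbol in the window; the codes, gains, symbol values, and function names are hypothetical, and the noise is set to zero so that the result can be checked by hand.

```python
def code_matrix(code, Q):
    """Q x (Q/Qj) code matrix: column n carries the code over symbol n."""
    Qj = len(code)
    n_sym = Q // Qj                      # symbols of this stream per window
    return [[code[row % Qj] if row // Qj == col else 0 for col in range(n_sym)]
            for row in range(Q)]

def matvec(M, x):
    """Matrix-vector product for lists of lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def observation(codes, gains, symbols, noise):
    """y_D = sum_j f_j C_j a_j + v for one processing window of width Q."""
    Q = max(len(c) for c in codes)       # window width = largest spreading factor
    y = list(noise)
    for code, f, a in zip(codes, gains, symbols):
        contribution = matvec(code_matrix(code, Q), a)
        y = [yi + f * ci for yi, ci in zip(y, contribution)]
    return y

codes = [[1, -1],                        # Q1 = 2 (hypothetical codes)
         [1, 1, -1, -1],                 # Q2 = 4
         [1, -1, 1, -1, -1, 1, -1, 1]]   # Q3 = 8
gains = [1.0, 0.5, 2.0]                  # channel coefficients f_j
symbols = [[1, -1, 1, 1], [1, -1], [-1]] # a_j: 4, 2, and 1 symbols per window
y_D = observation(codes, gains, symbols, noise=[0.0] * 8)
print(y_D)
```

The fastest stream contributes four symbols to the window and the slowest only one, which is why the window is aligned with the symbol interval of the slowest stream.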
7.2 SIC
The SIC detector [3, 4] takes a serial approach to canceling interference. Each stage of this detector decides on a user's symbol and regenerates its contribution to the MAI. This reconstructed MAI contribution is canceled from the received signal so that the next stage "sees" a smaller amount of MAI. A pitfall of this approach is evident: the cancellation of a user's contribution to the MAI actually takes place only if the decision is correct. Once the subtraction has been done, the receiver assumes that the resulting signal contains one fewer user (i.e., a smaller amount of MAI), and the process can be repeated with another interferer until all but one user have been demodulated.
A diagram of the first stage of an SIC detector is shown in Figure 7.2, where a hard-decision approach is taken. The first stage is preceded by another block (not shown in the figure), which ranks the signals in descending order of received power, assigning index 1 to the strongest and index J to the weakest. Ranking the users in terms of power, however, is rather difficult and does not always give the best order of cancellation, especially when significant cross-correlations between the signatures are present. A better criterion consists of ranking the users from the strongest to the weakest based on the expected value of the squared correlation between the received signal and the considered user's signature [2].
[Figure 7.2 block diagram: the observation yD feeds an FIR filter matched to the code matrix C1 (C1 Hermitian), whose output x1 enters a threshold detector producing the decision â1; the decision is respread by C1, scaled by f1, and subtracted from yD to give yD(1).]
Figure 7.2 First stage of an SIC detector.
The first stage of the SIC detector shown in Figure 7.2 performs the following tasks:
• Detect, with a conventional detector, the data of the strongest user with code matrix $\mathbf{C}_1$:
$$
\mathbf{x}_1 = f_1\, \mathbf{C}_1^H \mathbf{C}_1 \mathbf{a}_1 + \sum_{j=2}^{J} f_j\, \mathbf{C}_1^H \mathbf{C}_j \mathbf{a}_j + \mathbf{C}_1^H \mathbf{v}
\tag{7.9}
$$
• Make a hard decision on $\mathbf{x}_1$, obtaining a decision $\hat{\mathbf{a}}_1$ for user 1 as follows:
$$
\hat{\mathbf{a}}_1 = \operatorname{quant}[\mathbf{x}_1]
\tag{7.10}
$$
• Regenerate an estimate of the received signal for user 1, using the previous data decision $\hat{\mathbf{a}}_1$, its signature sequence $\mathbf{C}_1$, and the amplitude estimate $\hat{f}_1$.
• Subtract $\hat{f}_1 \mathbf{C}_1 \hat{\mathbf{a}}_1$, the reconstructed MAI contribution of user 1, from the total received signal $\mathbf{y}_D$, obtaining a partially clean version of the observation, denoted $\mathbf{y}_D(1)$. The argument "1" indicates that the observation $\mathbf{y}_D$ has been processed by the first stage.
Assuming that the data estimate $\hat{\mathbf{a}}_1$ and the amplitude estimate $\hat{f}_1$ are accurate, the outputs of the first stage are:
• A data decision on the strongest user, $\hat{\mathbf{a}}_1$;
• A modified received signal without the MAI caused by the strongest user, $\mathbf{y}_D(1)$.
This process can be repeated for the other users using a multistage structure. The $j$th stage takes as its input the "partially clean" signal output by the previous stage, $\mathbf{y}_D(j-1)$, and gives one additional data decision $\hat{\mathbf{a}}_j$ and a "cleaner" received signal $\mathbf{y}_D(j)$. It is clear that the SIC detector requires a number of stages equal to the total number of users $J$ to complete its task.
The reasons for canceling the signals in descending order of signal strength are straightforward [1]. First, it is easiest to achieve reliable acquisition and demodulation on the strongest user. Second, the removal of the strongest users provides the greatest benefit for the remaining users. The result of this algorithm is that the strongest user does not benefit from any MAI reduction; the weakest users, however, potentially see a huge reduction in their MAI.
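The stage-by-stage procedure above can be sketched in a few lines. The example below is a simplification rather than the multirate detector of the text: it assumes real BPSK symbols, a single symbol per user per window (equal spreading factors), perfect amplitude knowledge, ordering by amplitude, and no noise. The codes, gains, and function names are hypothetical.

```python
def sign(x):
    """Hard decision (quant[.]) for BPSK."""
    return 1 if x >= 0 else -1

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def conventional_detect(y, codes):
    """Bank of matched filters followed by hard decisions."""
    return [sign(dot(c, y)) for c in codes]

def sic_detect(y, codes, gains):
    """Hard-decision SIC; decisions are returned in the original user order."""
    order = sorted(range(len(codes)), key=lambda j: -abs(gains[j]))  # strongest first
    residual = list(y)
    decisions = [0] * len(codes)
    for j in order:
        x = dot(codes[j], residual)            # matched filter on the cleaned signal
        decisions[j] = sign(x)                 # tentative hard decision
        # Regenerate this user's contribution and subtract it
        residual = [r - gains[j] * decisions[j] * c
                    for r, c in zip(residual, codes[j])]
    return decisions

codes = [[1, 1, 1, 1], [1, 1, 1, -1], [1, -1, 1, 1]]   # non-orthogonal codes
gains = [3.0, 1.0, 1.0]                                # one strong interferer
a_true = [-1, 1, 1]
y = [sum(f * a * c[i] for f, a, c in zip(gains, a_true, codes)) for i in range(4)]

print("conventional:", conventional_detect(y, codes))
print("SIC:         ", sic_detect(y, codes, gains))
```

In this near-far toy case the matched filters of the two weak users are dominated by the strong user, while the serial cancellation removes the strong user first and then recovers the weak ones.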
The drawbacks of the SIC receiver, based on either hard or soft decisions, are that:
• The signals must be continuously reordered before detection if the power profile changes.
• The receiver requires knowledge of the received amplitudes and reliable initial data estimates. An error in the estimation of the received amplitudes, a wrong preliminary decision (hard approach), or a noisy soft decision (soft approach) introduces additional noise into the subsequent decisions.
• The serial processing introduces a delay in the detection process, which grows linearly with the number of users.
• Detection performance is, to some extent, uneven across users; equal-power users are detected with different reliability depending on their order of demodulation.
• Since the detector is nonlinear, nonlinear devices must be used in its practical realization, and sophisticated control logic must be employed to coordinate the various tasks.
The main advantage with respect to (nonadaptive) linear multiuser detectors is that it does not require any matrix inversion, which is usually computationally heavy.
7.3 PIC
Unlike the SIC detector, the PIC detector estimates and subtracts all the MAI of each user in parallel [5]. The first stage of the detector is shown in Figure 7.3, where a hard-decision approach is assumed. The PIC detector is initially fed by a set of preliminary symbol decisions, denoted by $\hat{\mathbf{a}}_j(0)$ ($j = 1, 2, \ldots, J$), obtained using, for example, a conventional detector for each user (not shown in Figure 7.3). We refer to this early stage, which initializes the PIC detector, with index "0."
[Figure 7.3 block diagram: the preliminary decisions â1(0), â2(0), ..., âJ(0) are respread by the code matrices C1, C2, ..., CJ and scaled by f1, f2, ..., fJ; for each user j a partial summer adds the contributions of all the other users, and the result is subtracted from yD; each cleaned signal feeds the matched filter of user j (Cj Hermitian), whose soft output xj(1) enters a threshold detector producing âj(1).]
Figure 7.3 Scheme of a PIC detector. © 1996 IEEE [1].
The first stage of a PIC detector performs the following tasks:
• A bank of modulators reconstructs the contribution of each data stream to the received signal using the amplitude estimates $\hat{f}_j$ ($j = 1, 2, \ldots, J$) and the knowledge of the relevant code matrices $\mathbf{C}_j$ ($j = 1, 2, \ldots, J$).
• A bank of partial summers reconstructs the MAI seen by each user, which is canceled in the subsequent nodes. Assuming perfect amplitude estimation ($\hat{f}_j = f_j$, $j = 1, 2, \ldots, J$), the input to the $j$th matched filter of the matched filter bank is
$$
\mathbf{y}_D - \sum_{k=1,\,k \neq j}^{J} f_k\, \mathbf{C}_k\, \hat{\mathbf{a}}_k(0)
= f_j\, \mathbf{C}_j\, \mathbf{a}_j + \sum_{k=1,\,k \neq j}^{J} f_k\, \mathbf{C}_k \bigl(\mathbf{a}_k - \hat{\mathbf{a}}_k(0)\bigr) + \mathbf{v},
\qquad j = 1, 2, \ldots, J
\tag{7.11}
$$
The second term on the right-hand side of (7.11) is absent if all the decisions of the preceding stage (index 0) are correct.
• Each output of type (7.11) is passed on to a second bank of matched filters, whose soft decisions $\mathbf{x}_j(1)$ ($j = 1, 2, \ldots, J$) feed a bank of hard detectors to produce a new, hopefully better, set of data estimates:
$$
\hat{\mathbf{a}}_j(1) = \operatorname{quant}[\mathbf{x}_j(1)],
\qquad j = 1, 2, \ldots, J
\tag{7.12}
$$
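A single PIC stage following (7.11) and (7.12) can be sketched as follows. As a simplification, the example assumes real BPSK symbols, one symbol per user per window, perfect amplitude knowledge, and no noise; stage 0 is a bank of matched filters. The codes, gains, and function names are hypothetical.

```python
def sign(x):
    """Hard decision (quant[.]) for BPSK."""
    return 1 if x >= 0 else -1

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def pic_stage(y, codes, gains, prev):
    """One PIC stage: cancel all other users' MAI in parallel, then re-detect."""
    new = []
    for j, cj in enumerate(codes):
        cleaned = list(y)
        for k, ck in enumerate(codes):
            if k != j:                          # partial summer: all users but j
                cleaned = [r - gains[k] * prev[k] * c for r, c in zip(cleaned, ck)]
        new.append(sign(dot(cj, cleaned)))      # matched filter + hard decision
    return new

codes = [[1, 1, 1, 1], [1, 1, 1, -1], [1, -1, 1, 1]]   # non-orthogonal codes
gains = [3.0, 1.0, 1.0]
a_true = [-1, 1, 1]
y = [sum(f * a * c[i] for f, a, c in zip(gains, a_true, codes)) for i in range(4)]

stage0 = [sign(dot(c, y)) for c in codes]       # conventional detector (index 0)
stage1 = pic_stage(y, codes, gains, stage0)
stage2 = pic_stage(y, codes, gains, stage1)
print("stage 0:", stage0)
print("stage 1:", stage1)
print("stage 2:", stage2)
```

Here the strong user is already detected correctly at stage 0, so canceling it lets the weak users be recovered at stage 1, and a further stage leaves the decisions unchanged.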
This process can be repeated for multiple stages. Each stage takes as its input the data estimates of the previous stage and produces a new set of estimates at its output. Unlike the SIC detector, the number of stages needed by the PIC detector does not depend on the number of users $J$, but on convergence and performance considerations. If at the $n$th stage all the decisions $\hat{\mathbf{a}}_j(n)$ ($j = 1, 2, \ldots, J$) are correct, there is no point in considering additional stages of PIC, since no performance improvement can be expected.
The drawbacks of the PIC receiver are that:
• The receiver requires knowledge of the received amplitudes and reliable initial data estimates. An error in the estimation of the received amplitudes causes additional noise in the following stages.
• The successive processing introduces a delay in the detection process, which grows linearly with the number of stages.
• This nonlinear detector requires hardware that is even more complex than that of the SIC detector.
With respect to the SIC detector, the PIC detector does not require ranking the users in terms of received power. It also shares with the SIC detector the advantage of not requiring any matrix inversion for its implementation. In [6] it is shown that, in their native versions, the SIC detector performs better than the PIC detector in a fading environment, while the opposite behavior is observed in a power-controlled system (soft-decision approach).
A large number of enhanced versions of the PIC detector have been proposed for improved performance. We list below two of them, among those mentioned in the extensive survey presented in [1]:
• Improve the performance of the first stage: The performance of the PIC detector depends heavily on the initial data estimates $\hat{\mathbf{a}}_j(0)$ ($j = 1, 2, \ldots, J$). These tentative decisions could be provided, for example, by any of the (nonadaptive) linear multiuser detectors presented in Chapter 3 (i.e., ZF- or MMSE-based detectors), which definitely perform better than a conventional detector, especially in the presence of a severe near-far effect. A good initial estimate prevents error propagation from the first stage, which is the most critical for performance. Subsequent stages may again be equipped with a matched filter bank.
• Reliability-weighted MAI cancellation: This technique exploits the fact that the tentative decisions of the early stages are less reliable than those of the late stages. A natural countermeasure is to cancel the MAI only partially in the early stages, by scaling the reconstructed MAI by a factor that accounts for the risk that the cancellation is inaccurate. For the later stages this risk is much lower and the cancellation can be done in full. Large performance gains are attained by this technique, although the weighting operation is not easy to tune.
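The partial cancellation idea can be sketched by scaling the reconstructed MAI with a weight between 0 and 1 before subtracting it. The sketch below uses a single scalar weight per stage, which is only one possible choice and is our assumption; the way the weight is selected or scheduled across stages is not specified here. Real BPSK symbols, one symbol per user, perfect amplitudes, and no noise are assumed, and all names and values are hypothetical.

```python
def sign(x):
    """Hard decision (quant[.]) for BPSK."""
    return 1 if x >= 0 else -1

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def weighted_pic_stage(y, codes, gains, prev, weight):
    """PIC stage in which the reconstructed MAI is scaled by `weight`.

    weight = 0 gives the matched-filter output, weight = 1 full cancellation.
    Returns the soft outputs and the corresponding hard decisions.
    """
    soft = []
    for j, cj in enumerate(codes):
        cleaned = list(y)
        for k, ck in enumerate(codes):
            if k != j:
                cleaned = [r - weight * gains[k] * prev[k] * c
                           for r, c in zip(cleaned, ck)]
        soft.append(dot(cj, cleaned))
    return soft, [sign(x) for x in soft]

codes = [[1, 1, 1, 1], [1, 1, 1, -1], [1, -1, 1, 1]]
gains = [3.0, 1.0, 1.0]
y = [-1.0, -3.0, -1.0, -3.0]             # noiseless observation for a = [-1, 1, 1]
prev = [sign(dot(c, y)) for c in codes]  # tentative decisions from stage 0

for w in (0.0, 0.5, 1.0):
    soft, hard = weighted_pic_stage(y, codes, gains, prev, w)
    print(f"weight = {w:.1f}: soft = {soft}, decisions = {hard}")
```

The soft outputs move linearly with the weight between the matched-filter values and the fully cancelled values, so an unreliable tentative decision does less damage when the weight is small.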
7.4 Decision Feedback Nonlinear Receivers
The receivers presented in this section are derived using a maximum likelihood sequence estimation (MLSE) approach. The system model is the same as that of Chapter 1, allowing for the presence of both MAI and ISI, which arise in a synchronous multirate CDMA system affected by a multipath propagation channel. The receiver is particularly easy to realize when it operates in the presence of AWGN, because in this case it admits a recursive formulation. From a general point of view, the receiver uses the bits detected in the previous window to cancel part of the MAI "seen" by the following observation window. In fact, the receiver can be viewed as a PIC scheme with a trellis-based decision feedback mechanism.
7.4.1 A Decision Feedback Multiuser Detector
We first state the problem of optimum block detection performed using an observation vector $\mathbf{y}^{(N)}(N)$, which spans $N$ chip intervals, from chip epoch 1 to chip epoch $N$. Maximum likelihood sequence detection (MLSD) requires us to perform the following maximization:
$$
\hat{\mathbf{a}}^{(N)}(N) = \arg\max_{\mathbf{a}^{(N)}(N)}\; p\bigl(\mathbf{y}^{(N)}(N) \mid \mathbf{a}^{(N)}(N)\bigr)
\tag{7.13}
$$
where $p(\mathbf{y}^{(N)}(N) \mid \mathbf{a}^{(N)}(N))$ is the probability density function of the observation vector $\mathbf{y}^{(N)}(N)$ given the relevant data vector $\mathbf{a}^{(N)}(N)$. It is well known that if the channel is known and constant during the observation interval, and assuming additive Gaussian noise (AGN) with known covariance matrix $\mathbf{R}_v^{(N)}$, the probability density function $p(\mathbf{y}^{(N)}(N) \mid \mathbf{a}^{(N)}(N))$ is multivariate Gaussian and can be expressed in closed form as follows:
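$$
p\bigl(\mathbf{y}^{(N)}(N) \mid \mathbf{a}^{(N)}(N)\bigr)
= \frac{1}{\pi^{N} \det \mathbf{R}_v^{(N)}}
\exp\Bigl\{ -\bigl[\mathbf{y}^{(N)}(N) - \mathbf{y}_0^{(N)}\bigl(\mathbf{a}^{(N)}(N)\bigr)\bigr]^H
\bigl(\mathbf{R}_v^{(N)}\bigr)^{-1}
\bigl[\mathbf{y}^{(N)}(N) - \mathbf{y}_0^{(N)}\bigl(\mathbf{a}^{(N)}(N)\bigr)\bigr] \Bigr\}
\tag{7.14}
$$
where $\mathbf{y}_0^{(N)}(\mathbf{a}^{(N)}(N)) = \mathbf{G}^{(N)}(N)\, \mathbf{a}^{(N)}(N)$. Without further assumptions, the maximization of (7.14) must be performed by an exhaustive search over all possible $C^{n_t}$ data sequences, where $C$ is the number of constellation symbols. The complexity is astronomical. In the case of AWGN, $\mathbf{R}_v^{(N)} = \sigma_v^2 \mathbf{I}_{\beta N}$, where $\sigma_v^2$ is the noise variance and $\mathbf{I}_{\beta N}$ is the identity matrix of size $\beta N$. Detection is simplified, and the following negative log-likelihood has to be minimized over all possible data sequences $\mathbf{a}^{(N)}(N)$:
$$
L\bigl[\mathbf{a}^{(N)}(N)\bigr]
= \bigl\| \mathbf{y}^{(N)}(N) - \mathbf{y}_0^{(N)}\bigl(\mathbf{a}^{(N)}(N)\bigr) \bigr\|^2
= \sum_{k=1}^{N} \bigl\| \tilde{\mathbf{y}}(k) - \tilde{\mathbf{G}}(k)\, \tilde{\mathbf{a}}(k) \bigr\|^2
\tag{7.15}
$$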
where in the derivation of (7.15), we have used the notation of Chapter 1 for observation $\tilde{\mathbf{y}}(k)$, which spans a chip interval. We recall that the ISI span, measured in multiples of symbol intervals (by the parameter $l_j$), varies from code to code (shorter for slower users and longer for faster users) due to the different spreading factor $Q_j$, as shown in Chapter 1. On the other hand, this ISI span is finite, and it causes an overlapping of data dependence in summation (7.15). Minimization of (7.15) can be performed by a trellis search. However, defining the relevant trellis is not a trivial task.

Let us refer to Figure 7.4, which illustrates the detection of data. Let $Q = \max_j Q_j$ be the maximum spreading factor currently used by the system, which corresponds to the code with the slowest data rate. We then split the summation of (7.15) into a double sum by grouping the samples that belong to the symbol interval of the slowest data stream. It is also useful to identify a parameter $M$ such that $N = MQ$, which is the data block length measured in multiples of the symbol interval of the slowest user. Hence, we can rewrite (7.15) as

\[
L\left[\mathbf{a}^{(N)}(N)\right] = \sum_{k=0}^{M-1} \sum_{q=1}^{Q} \left\| \tilde{\mathbf{y}}(kQ+q) - \tilde{\mathbf{G}}(kQ+q)\,\tilde{\mathbf{a}}(kQ+q) \right\|^2 \tag{7.16}
\]
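To make the cost of the exhaustive search concrete, the following minimal sketch evaluates the squared-distance metric of (7.15) for every candidate sequence of a toy system. It is an illustration under stated assumptions, not the book's receiver: the matrix rows in `G_rows`, the noise samples, the BPSK constellation, and the simplification that every observation sample depends on the whole data vector are all hypothetical choices.

```python
from itertools import product

def metric(y, G_rows, a):
    """Squared Euclidean distance in the spirit of (7.15): sum_k |y(k) - G(k) a|^2."""
    total = 0.0
    for y_k, g_k in zip(y, G_rows):
        y0_k = sum(g * s for g, s in zip(g_k, a))  # noiseless sample for candidate a
        total += abs(y_k - y0_k) ** 2
    return total

def ml_exhaustive(y, G_rows, constellation, n_t):
    """Try all C**n_t candidate sequences and keep the one with the smallest metric."""
    return min(product(constellation, repeat=n_t),
               key=lambda a: metric(y, G_rows, a))

# Hypothetical toy system: 4 observations, n_t = 3 BPSK symbols (C = 2).
G_rows = [
    [1.0, 0.5, 0.0],
    [-1.0, 0.5, 0.2],
    [0.3, -0.5, 1.0],
    [0.0, 0.5, -1.0],
]
a_true = (1, -1, 1)
noise = [0.05, -0.02, 0.03, -0.04]
y = [sum(g * s for g, s in zip(row, a_true)) + n for row, n in zip(G_rows, noise)]

a_hat = ml_exhaustive(y, G_rows, constellation=(-1, 1), n_t=3)
num_candidates = 2 ** 3  # C**n_t grows exponentially with the block length
print(a_hat, num_candidates)
```

The search visits $C^{n_t}$ candidates, which is why the text turns to a trellis formulation next.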
[Figure 7.4: three data streams with $Q_1 = 4$, $Q_2 = 8$, and $Q_3 = 8$; impulse response span $= PT_c$; the observation window used to evaluate $\lambda(\mu_{k-1} \to \mu_k)$ is marked.]

Figure 7.4 Example of trellis-based decision feedback detection for the case of a data block with three data streams with different data rates (thick lines separate symbol intervals, thin lines separate chips). In this case $Q = 8$. The figure shows the nominal symbol interval, which is shaded along with the ISI caused by each symbol. A generic detection step is shown: the symbols of the initial state are marked with triangles, while the symbols of the final state are marked with circles.
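We can define a partial metric from epoch 1 to epoch $k$ as

\[
L_k\left[\mathbf{a}^{(kQ)}(kQ)\right] = \sum_{m=0}^{k-1} \sum_{q=1}^{Q} \left\| \tilde{\mathbf{y}}(mQ+q) - \tilde{\mathbf{G}}(mQ+q)\,\tilde{\mathbf{a}}(mQ+q) \right\|^2 \tag{7.17}
\]

and recognize that

\[
L_{k+1}\left[\mathbf{a}^{((k+1)Q)}((k+1)Q)\right] = L_k\left[\mathbf{a}^{(kQ)}(kQ)\right] + \sum_{q=1}^{Q} \left\| \tilde{\mathbf{y}}(kQ+q) - \tilde{\mathbf{G}}(kQ+q)\,\tilde{\mathbf{a}}(kQ+q) \right\|^2 \tag{7.18}
\]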
The transitions are defined by $\mu_k \to \mu_{k+1}$, and the relevant branch metrics are given by

\[
\lambda(\mu_k \to \mu_{k+1}) = \sum_{q=1}^{Q} \left\| \tilde{\mathbf{y}}(kQ+q) - \tilde{\mathbf{G}}(kQ+q)\,\tilde{\mathbf{a}}(kQ+q) \right\|^2 \tag{7.19}
\]
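The additive structure of (7.18) with branch metrics of the form (7.19) is what makes a trellis (Viterbi) search possible. The sketch below illustrates the idea on a deliberately simplified case: a single BPSK stream over a hypothetical two-tap channel `h`, so that the state is just the previous symbol. The channel taps, noise samples, and known starting symbol are assumptions made for illustration; the multiuser state of this section is larger, but the recursion is the same. A brute-force search is included only to confirm that both minimize the same metric.

```python
from itertools import product

def branch_metric(y_k, prev_sym, new_sym, h):
    """Branch metric for a two-tap channel h = (h0, h1): |y - h0*a(k) - h1*a(k-1)|^2."""
    return abs(y_k - h[0] * new_sym - h[1] * prev_sym) ** 2

def viterbi(y, h, constellation=(-1, 1), start=1):
    """Minimize the additive metric keeping one survivor per state (state = last symbol)."""
    survivors = {start: (0.0, [])}  # state -> (accumulated metric, decided symbols)
    for y_k in y:
        new_survivors = {}
        for new_sym in constellation:
            candidates = [
                (acc + branch_metric(y_k, prev_sym, new_sym, h), path + [new_sym])
                for prev_sym, (acc, path) in survivors.items()
            ]
            new_survivors[new_sym] = min(candidates, key=lambda c: c[0])
        survivors = new_survivors
    return min(survivors.values(), key=lambda c: c[0])

def brute_force(y, h, constellation=(-1, 1), start=1):
    """Reference search: evaluate the full metric for every candidate sequence."""
    def total(seq):
        prev, acc = start, 0.0
        for y_k, s in zip(y, seq):
            acc += branch_metric(y_k, prev, s, h)
            prev = s
        return acc
    best = min(product(constellation, repeat=len(y)), key=total)
    return total(best), list(best)

h = (1.0, 0.6)                      # hypothetical channel taps
a_true = [1, -1, -1, 1, -1, 1]
noise = [0.1, -0.2, 0.15, 0.05, -0.1, 0.2]
prev, y = 1, []
for s, n in zip(a_true, noise):
    y.append(h[0] * s + h[1] * prev + n)
    prev = s

metric_v, path_v = viterbi(y, h)
metric_b, path_b = brute_force(y, h)
print(path_v, metric_v)
```

The Viterbi search performs a number of branch evaluations that grows linearly with the block length, whereas the brute-force reference grows exponentially.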
The state of the relevant trellis to search the above metric is defined by

\[
\mu_k = \mathbf{a}^{(Q)}(kQ) = \left[
\begin{array}{l}
a_1(n_1^{(k-1)}+1),\ a_1(n_1^{(k-1)}+2),\ \ldots,\ a_1(n_1^{(k)}); \\
\qquad \vdots \\
a_J(n_J^{(k-1)}+1),\ a_J(n_J^{(k-1)}+2),\ \ldots,\ a_J(n_J^{(k)})
\end{array}
\right] \tag{7.20}
\]

where

\[
n_j^{(k)} \triangleq \frac{kQ}{Q_j}, \qquad k = 1, 2, \ldots, \quad j = 1, 2, \ldots, J \tag{7.21}
\]
The state $\mu_k$ is composed of $n_t' = \sum_{j=1}^{J} Q/Q_j$ elements. Hence, the number of states of this trellis, which depends on the number of users but not on the sequence length, is $C^{n_t'}$. Table 7.1 shows the numbering of the data symbols of the $j$th stream that belong to the state defined by (7.20).

Table 7.1 Data Symbols that Define the Trellis State

Symbol index            | Role
$\vdots$                |
$(k-2)Q/Q_j$            | Last symbol of the $j$th code belonging to state $\mu_{k-2}$
$(k-2)Q/Q_j + 1$        | First symbol of the $j$th code belonging to state $\mu_{k-1}$
$\vdots$                |
$(k-1)Q/Q_j$            | Last symbol of the $j$th code belonging to state $\mu_{k-1}$
$(k-1)Q/Q_j + 1$        | First symbol of the $j$th code belonging to state $\mu_k$
$\vdots$                |
$kQ/Q_j$                | Last symbol of the $j$th code belonging to state $\mu_k$
$kQ/Q_j + 1$            | First symbol of the $j$th code belonging to state $\mu_{k+1}$
$\vdots$                |

It is shown that state $\mu_{k-1}$ is contributed by the symbols marked with triangles, while state $\mu_k$ is contributed by the symbols marked with circles. In Figure 7.4, the evolution of the decision feedback algorithm is shown using the same notation as the table. The observation vector that is used at the generic detection step for the evaluation of the branch metrics is also shown.
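As a small worked example of (7.20) and (7.21), the sketch below lists the symbol indices that make up a state and counts the trellis states for the three streams of Figure 7.4 ($Q_1 = 4$, $Q_2 = Q_3 = 8$). The function name `state_indices` and the binary constellation ($C = 2$) are assumptions made for illustration, and the code assumes that every $Q_j$ divides $Q$.

```python
def state_indices(k, Q, spreading_factors):
    """Symbol indices n_j^(k-1)+1 .. n_j^(k) of each stream that form state mu_k."""
    indices = []
    for Qj in spreading_factors:
        n_prev = (k - 1) * Q // Qj   # n_j^(k-1), from (7.21)
        n_curr = k * Q // Qj         # n_j^(k)
        indices.append(list(range(n_prev + 1, n_curr + 1)))
    return indices

spreading_factors = [4, 8, 8]        # Q_1, Q_2, Q_3 of Figure 7.4
Q = max(spreading_factors)           # the slowest stream sets Q = 8
state_k2 = state_indices(2, Q, spreading_factors)

n_t_prime = sum(Q // Qj for Qj in spreading_factors)  # symbols per state
num_states = 2 ** n_t_prime          # C**n_t' with an assumed C = 2

print(state_k2, n_t_prime, num_states)
```

For this example the faster stream contributes two symbols per state and each slower stream contributes one, so the state holds four symbols.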
7.4.2 Complexity Reduction by Per-Survivor Processing Techniques

The decision feedback multiuser detector intrinsically detects all the codes active in the system. A useful preliminary interpretation of the receiver is obtained by splitting the complete set of the $J$ codes $\mathcal{S} = \{c_1, c_2, \ldots, c_J\}$ into two subsets: one composed of the $K \le J$ codes of interest, denoted $\mathcal{U} = \{c_1, c_2, \ldots, c_K\}$, and its complementary set with respect to $\mathcal{S}$, denoted $\bar{\mathcal{U}}$, so that $\mathcal{U} \cup \bar{\mathcal{U}} = \mathcal{S}$. In a full complexity solution with the state defined by (7.20), the data carried by all the codes is detected, and eventually only the data carried by the codes belonging to the set $\mathcal{U}$ is retained, while the data carried by the codes belonging to the set $\bar{\mathcal{U}}$ is discarded.

In order to reduce complexity, we lower the number of symbols that define the states and complete the metric using the symbols stored in the survivors' history, according to per-survivor processing (PSP) techniques [7]. PSP is a general technique that allows us to realize reduced state sequence estimation with low performance loss.

We now discuss two different logical strategies to reduce the state size to a fixed length $r_t < n_t'$. The first strategy shortens the description of each code, causing a uniform performance degradation both in the detection of the useful data carried by the codes in $\mathcal{U}$ and in the interference cancellation performed using the data carried by the codes of $\bar{\mathcal{U}}$. The alternative strategy maintains the full complexity description for the data of the codes that belong to $\mathcal{U}$, but drastically reduces the description for the "don't care" data carried by the codes in $\bar{\mathcal{U}}$.
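The PSP idea can be sketched on a single-stream analogue: the channel below has memory two, so a full trellis would need $C^2$ states, but the state is reduced to the last symbol only and the older symbol is read from each survivor's own history when the branch metric is computed. This is a minimal illustration under stated assumptions (hypothetical three-tap channel `h`, BPSK, known starting symbols, mild noise); it is not the multiuser state reduction of this section, only the same mechanism in its simplest form.

```python
def psp_reduced_viterbi(y, h, constellation=(-1, 1), start=(1, 1)):
    """Reduced-state search: state = a(k-1); a(k-2) comes from each survivor's history."""
    # survivors[state] = (accumulated metric, history); history ends with the state symbol
    survivors = {start[1]: (0.0, list(start))}
    for y_k in y:
        new_survivors = {}
        for new_sym in constellation:
            best = None
            for prev_sym, (acc, hist) in survivors.items():
                older = hist[-2]  # per-survivor estimate of a(k-2)
                err = y_k - h[0] * new_sym - h[1] * prev_sym - h[2] * older
                cand = (acc + abs(err) ** 2, hist + [new_sym])
                if best is None or cand[0] < best[0]:
                    best = cand
            new_survivors[new_sym] = best
        survivors = new_survivors
    metric, hist = min(survivors.values(), key=lambda c: c[0])
    return metric, hist[len(start):]

h = (1.0, 0.5, 0.2)                 # hypothetical channel with memory 2
a_true = [1, -1, -1, 1, 1, -1, 1, -1]
noise = [0.05, -0.08, 0.02, 0.06, -0.03, 0.07, -0.05, 0.04]

history, y = [1, 1], []
for s, n in zip(a_true, noise):
    y.append(h[0] * s + h[1] * history[-1] + h[2] * history[-2] + n)
    history.append(s)

metric, decided = psp_reduced_viterbi(y, h)
full_states, reduced_states = 2 ** 2, 2 ** 1
print(decided, metric, full_states, reduced_states)
```

With mild noise the reduced trellis recovers the transmitted symbols while tracking half as many states; in general, the price of the reduction is possible error propagation through the survivor histories.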
7.5 Linear Versus Nonlinear Multiuser Detection

Linear multiuser detectors apply a linear transformation to an observation vector to obtain a new observation vector, which serves as soft decision for the transmitted data. Nonlinear detectors always make use of some sort of preliminary single-user or multiuser detection to obtain an observation for further nontrivial nonlinear processing (a multistage interference cancellation). The SIC detector adopts a serial approach to interference cancellation, while the PIC detector employs a parallel approach.
As a general conclusion, nonlinear detectors are more complex than linear detectors and do not appear to be the most immediate choice for mobile terminals. This is because a nonlinear operation is more difficult to implement on a chip than linear filtering. Another point is that in the downlink of a cellular mobile radio system, no near-far effect is present if the detection involves a single intracell signal sent by the BS. The absence of a near-far effect does not recommend the use of a nonlinear multiuser detection scheme, because no major gain is expected with respect to a linear one. The situation is different in a multicell environment, where a near-far effect between cells may occur. In this case hybrid detection schemes (linear for intracell interference mitigation and nonlinear PIC for intercell interference) have appeared as good alternatives to all-linear schemes [8].
References

[1] Moshavi, S., "Multi-User Detection for DS-CDMA Communications," IEEE Communications Magazine, October 1996, pp. 124–136.
[2] Verdú, S., Multiuser Detection, New York: Cambridge University Press, 1998.
[3] Viterbi, A. J., "Very Low-Rate Convolutional Codes for Maximum Theoretical Performance of Spread-Spectrum Multiple-Access Channels," IEEE Journal on Selected Areas in Communications, Vol. 8, No. 4, May 1990, pp. 641–649.
[4] Holtzman, J. M., "DS/CDMA Successive Interference Cancellation," Proceedings of the International Symposium on Spread Spectrum and Applications (ISSSTA '94), Oulu, Finland, July 1994, pp. 69–78.
[5] Varanasi, M. K., and B. Aazhang, "Multistage Detection in Asynchronous Code-Division Multiple-Access Communications," IEEE Transactions on Communications, Vol. 38, No. 4, April 1990, pp. 509–519.
[6] Patel, P., and J. Holtzman, "Performance Comparison in a DS/CDMA System Using a Successive Interference Cancellation (IC) Scheme and a Parallel IC Scheme under Fading," Proc. ICC '94, New Orleans, LA, May 1994, pp. 510–514.
[7] Raheli, R., A. Polydoros, and C.-K. Tzou, "Per-Survivor Processing: A General Approach to MLSE in Uncertain Environments," IEEE Transactions on Communications, Vol. 43, No. 3 (part 3), February–March–April 1995.
[8] Host-Madsen, A., and K.-S. Cho, "MMSE/PIC Multiuser Detection for DS/CDMA Systems with Inter- and Intra-Cell Interference," IEEE Transactions on Communications, Vol. 47, No. 2, February 1999.
8 Synchronization Techniques

As mentioned in Chapter 6, synchronization is a key feature for proper operation of CDMA receivers. In this chapter, we use the term "synchronization" to refer to problems of timing acquisition, phase error recovery, and frequency offset estimation. These are ancillary functions of a coherent receiver, meaning that some detection schemes (noncoherent) do not need their implementation. A comparison between the two classes of receivers is presented in the concluding remarks of this chapter. For example, Table 2.1 gives an overview of the robustness of the main families of CDMA detectors to some of the above-mentioned parameters. In the previous chapters, we commented on the sensitivity of some multiuser detectors to uncertainties on these parameters.

This chapter focuses on data-aided timing acquisition and frequency offset estimation. Once these parameters are acquired, the problem of phase error recovery is easy to solve, so we will not describe it. The technical literature offers a large variety of timing and frequency recovery schemes, and we intend to provide an overview of some of these. The chapter is divided into two main parts, where the basics of timing acquisition and frequency offset estimation are illustrated. Specifically, the first part investigates the use of a sliding correlator for timing acquisition, while the second part presents a scheme for frequency offset recovery based on a simplified least mean square (LMS) algorithm.
8.1 Observation Model

There are several strategies to provide the receiver with timing information. We assume that the transmitter interleaves data transmission with synchronization sequences, which are normally represented by sequences of unmodulated chips with or without guard periods. The system model we are assuming is an asynchronous system with a unique spreading factor. In Figure 8.1 we show the data frame format consisting of a pattern of unmodulated chips (pilot chip sequence) of user $k$, interfered by other spread spectrum data streams. The pilot chip sequence is usually transmitted with a greater chip power than that devoted to the spread data sequence, as depicted in Figure 8.1. Observation $\{y(i)\}$, taken in correspondence of the pilot chip sequence, can be directly used to acquire timing and estimate the frequency offset. As derived in Chapter 1, the chip-rate sampled observation $y(i)$ can be rewritten as

\[
y(i) = A_k e^{j(2\pi f_k i T_c + \phi_k)}\, \mathbf{p}_k^T \mathbf{b}_k(i) + \sum_{j=1,\, j \neq k}^{J} A_j e^{j(2\pi f_j i T_c + \phi_j)}\, \mathbf{p}_j^T \mathbf{b}_j(i) + v(i) \tag{8.1}
\]
where $T_c$ is the chip interval, $f_j$ is the frequency offset, and $\phi_j$ the phase offset. The vectors $\mathbf{p}_j$, $\mathbf{b}_j(i)$ ($j = 1, 2, \ldots, k, \ldots, J$) are defined as

\[
\mathbf{p}_j \triangleq \left[ p_j(P-1),\ p_j(P-2),\ \ldots,\ p_j(0) \right]^T \tag{8.2}
\]

\[
\mathbf{b}_j(i) \triangleq \left[ b_j(i - d_j - P + 1),\ \ldots,\ b_j(i - d_j) \right]^T \tag{8.3}
\]

where $P$ is the span of the possible ICI, $d_j$ is the integer delay (modulo $Q$), and the fractional delay $\delta_j$ ($0 \le \delta_j < T_c$) is included in $p_j(l) \triangleq p((l - \delta_j) T_c)$. All parameters can be random, but they remain constant during the transmission.

[Figure 8.1: frame format with code 1, ..., code $k-1$, code $k$ (pilot chip sequence between data fields), code $k+1$, ..., code $J$.]

Figure 8.1 Scheme of the signaling system considered.

In (8.1) we separated the desired user contribution ($k$th user) from the contribution of the other users and of the thermal noise. The frequency offset $f_k$ of the desired user is assumed to be deterministic, while those of the interfering users $f_j$ are assumed to be uniformly distributed between 0 and a maximum value $\nu$. All the phases $\phi_j$ are assumed to be uniformly distributed between 0 and $2\pi$. We define the synchronization sequence for the $j$th transmitting sequence as

\[
\left\{ b_j(0),\ b_j(1),\ \ldots,\ b_j(N_s - 1) \right\}, \qquad j = 1, 2, \ldots, J \tag{8.4}
\]

where $N_s$ is its length. If there is no guard period between the synchronization sequence and the data fields, the number of "clean" observations $N$ is given by $N = N_s - P + 1$, as shown in Figure 8.2.

The vectorial observation model of Chapter 1, expanded in terms of $\mathbf{a}_j^{(N)}(i)$ and $\mathbf{C}_j^{(N)}(i)$, no longer applies for the observation relevant to the synchronization sequence. Instead, we define for each user $j$ a column vector $\mathbf{b}_j^{(N)}(i)$, of length $N_s = N + P - 1$, as follows:

\[
\mathbf{b}_j^{(N)}(i) \triangleq \left[ b_j(i - N - P + 2),\ b_j(i - N - P + 3),\ \ldots,\ b_j(i) \right]^T \tag{8.5}
\]
This vector does not depend explicitly on the integer delay $d_j$ ($0 \le d_j \le Q - 1$), because the synchronization sequence is not the outcome of data modulation; rather, it is a sequence of unmodulated chips. The synchronization sequence is periodically interleaved in the data frame, so the maximum acquisition time is the period of the sequence insertion.

[Figure 8.2: a synchronization sequence between two data fields; the first $P - 1$ chips are cross-interfered, and the remaining samples are the useful ones.]

Figure 8.2 Identification of the useful samples for timing acquisition and frequency estimation.

The observation model used to estimate the delay is a modification of the asynchronous system model of Chapter 1:

\[
\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{F}_j^{(N)}(i)\, \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{8.6}
\]
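in which an explicit dependence on the pulse samples is highlighted, as follows:

\[
\mathbf{y}^{(N)}(i) = \sum_{j=1}^{J} \mathbf{A}_j^{(N)}(i)\, \mathbf{P}_j^{(N)}\, \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \tag{8.7}
\]

Unknown parameters in matrix $\mathbf{A}_j^{(N)}(i)$, with dimensions $N \times N$, are amplitudes $A_j$, frequency offsets $f_j$, and phase offsets $\phi_j$ ($j = 1, 2, \ldots, J$). Matrix $\mathbf{P}_j^{(N)}$, with dimensions $N \times (N + P - 1)$, depends on the fractional delays $\delta_j$. Assuming that the desired user is the $k$th, we rewrite observation (8.6), highlighting the contribution of three additive terms (i.e., useful signal, interference, and thermal noise):

\[
\begin{aligned}
\mathbf{y}^{(N)}(i) &= \mathbf{A}_k^{(N)}(i)\, \mathbf{P}_k^{(N)}\, \mathbf{b}_k^{(N)}(i) + \sum_{j=1,\, j \neq k}^{J} \mathbf{A}_j^{(N)}(i)\, \mathbf{P}_j^{(N)}\, \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \\
&= \mathbf{F}_k^{(N)}(i)\, \mathbf{b}_k^{(N)}(i) + \sum_{j=1,\, j \neq k}^{J} \mathbf{F}_j^{(N)}(i)\, \mathbf{b}_j^{(N)}(i) + \mathbf{v}^{(N)}(i) \\
&= \mathbf{F}_k^{(N)}(i)\, \mathbf{b}_k^{(N)}(i) + \mathbf{z}^{(N)}(i) + \mathbf{v}^{(N)}(i)
\end{aligned} \tag{8.8}
\]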
where vector $\mathbf{z}^{(N)}(i)$ represents the interference

\[
\mathbf{z}^{(N)}(i) = \sum_{j=1,\, j \neq k}^{J} \mathbf{A}_j^{(N)}(i)\, \mathbf{P}_j^{(N)}\, \mathbf{b}_j^{(N)}(i) = \sum_{j=1,\, j \neq k}^{J} \mathbf{A}_j^{(N)}\, \mathbf{C}_j^{(N)}(i)\, \mathbf{a}_j^{(N)}(i) \tag{8.9}
\]

For the purpose of timing acquisition, a low value of the frequency error induces a phase rotation along the span of the synchronization sequence that can be regarded as negligible. Under this assumption, $\mathbf{A}_j^{(N)}(i)$ becomes independent of $i$, as does $\mathbf{F}_j^{(N)}(i)$.
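A quick numerical check shows what "negligible" can mean here. The sketch below computes the phase drift $2\pi f_k N_s T_c$ accumulated over the synchronization sequence and the resulting reduction of a coherent correlation. The chip interval and sequence length are the values quoted later in Section 8.3; the 1-kHz frequency offset is a purely hypothetical value chosen for illustration.

```python
import cmath
import math

def phase_rotation(freq_offset_hz, n_chips, chip_interval_s):
    """Total phase drift 2*pi*f*N*Tc accumulated over n_chips chips, in radians."""
    return 2.0 * math.pi * freq_offset_hz * n_chips * chip_interval_s

def coherent_loss(freq_offset_hz, n_chips, chip_interval_s):
    """Magnitude of the normalized coherent sum (1/N) * sum_i exp(j*2*pi*f*i*Tc)."""
    step = 2.0 * math.pi * freq_offset_hz * chip_interval_s
    total = sum(cmath.exp(1j * step * i) for i in range(n_chips))
    return abs(total) / n_chips

Tc = 0.25e-6   # chip interval quoted in Section 8.3
Ns = 63        # shortest synchronization sequence considered in Section 8.3
f_k = 1.0e3    # hypothetical 1 kHz frequency offset (assumption)

rotation = phase_rotation(f_k, Ns, Tc)
loss = coherent_loss(f_k, Ns, Tc)
print(rotation, loss)
```

Under these assumed values the drift is about a tenth of a radian and the correlation magnitude is reduced by well under one percent; larger offsets or longer sequences would need to be checked in the same way.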
8.2 Timing Acquisition

One of the crucial problems of CDMA systems is timing acquisition. In fact, in order to correctly realize despreading, the receiver needs code synchronization (i.e., the timing of the data frame both at the chip and symbol level). The timing acquisition algorithms usually estimate the delay $\tau_j$ of the desired signal referred to an arbitrary time origin. This delay is then evaluated modulo one symbol interval. The signaling system considered is an inband signaling system, as shown in Figure 8.1.

The literature offers various algorithms for the timing recovery of either the desired user or all users. In [1] several timing acquisition algorithms for the estimation of all users' delays are considered. The maximum likelihood estimator is based on the maximization of the probability density function of the observation vector given all unknown parameters (amplitudes, integer and fractional delays, and so on). In [2] an algorithm for timing acquisition based on the maximum probability is presented. This algorithm acquires the timing of a desired user only, and it is particularly efficient when the codes of the interfering signals are unknown. It is based on the approximation of the MAI as a colored Gaussian noise, whitened by a whitening filter. The estimator is near-far resistant but rather complex. In [3] we find four different timing acquisition algorithms, which acquire the delay modulo one symbol interval, with maximum uncertainty dependent on the oversampling factor. Among them, the simplest is the sliding correlator, which is extensively described in this section. It is very simple and instructive, although it is not near-far resistant in terms of performance.

The problem of timing acquisition is translated into the search of two parameters: an integer delay $d_j$ ($0 \le d_j \le Q - 1$) and a fractional delay $\delta_j$ ($0 \le \delta_j < 1$). In other words, coarse acquisition is focused on the estimation of the integer delay $d_j$ of the desired user.
Fine acquisition is then realized by the estimation of the fractional delay $\delta_j$, which is essential in some adaptive algorithms to achieve good performance (e.g., the blind adaptive MMSE receiver).

8.2.1 Coarse Timing Acquisition

In this section we use a sliding window correlator to acquire integer timing [3]. This technique is based on the correlation of the received signal with the reconstructed signal for the synchronization sequence considered; the integer delay estimate is the value of the delay that maximizes this sliding correlation (which operates with a step of $T_c$). In order to ensure adequate performance, we need to use a synchronization sequence with proper length and chip energy. Both these parameters must be optimized in order to trade off between acquisition performance and interference with other users. The reconstructed signal for the $k$th user, denoted by $\hat{\mathbf{Y}}_k^{(N)}$, can be written as follows:

\[
\hat{\mathbf{Y}}_k^{(N)} = \hat{\mathbf{A}}_k^{(N)}\, \hat{\mathbf{P}}_k^{(N)}\, \hat{\mathbf{b}}_k^{(N)} \tag{8.10}
\]

In (8.10), matrices $\hat{\mathbf{A}}_k^{(N)}$ and $\hat{\mathbf{P}}_k^{(N)}$ depend on parameters that we have not estimated yet. Since the focus is on the estimation of the integer delay $d_k$ of the $k$th data stream, as a first approximation matrix $\hat{\mathbf{A}}_k^{(N)}$ is assumed to be an identity matrix with dimensions $N \times N$; in other words, the desired user is assigned a unit amplitude ($A_k = 1$), and a zero phase and frequency offset ($\phi_k = 0$, $f_k = 0$). Matrix $\hat{\mathbf{P}}_k^{(N)}$ is evaluated assuming zero fractional delay $\delta_k = 0$. Vector $\hat{\mathbf{b}}_k^{(N)}$ is the synchronization sequence defined in (8.4). Summarizing, the following specifications apply for the reconstructed signal:

\[
\begin{aligned}
\hat{\mathbf{A}}_k^{(N)} &\triangleq \mathbf{I}_{N \times N} && (N \times N) \\
\hat{\mathbf{P}}_k^{(N)} &\triangleq \left. \mathbf{P}_k^{(N)} \right|_{\delta_k = 0} && (N \times (N + P - 1)) \\
\hat{\mathbf{b}}_k^{(N)} &\triangleq \left[ b_k(0),\ b_k(1),\ \ldots,\ b_k(N + P - 2) \right]^T && ((N + P - 1) \times 1)
\end{aligned} \tag{8.11}
\]
Let us assume that the sliding correlator starts searching the position of the $k$th synchronization sequence at the generic time $t_0$, as shown in Figure 8.3. More exactly, we denote by $t_0$ the epoch at which the receiver starts collecting the first window of samples for the purpose of acquisition; we denote by $d_k$ the number of complete chips between the epoch $t_0$ and the closest edge of the optimal observation window. From time $t_0$ we collect observation samples for a time window equal to $T_w = N T_c = (N_s - P + 1) T_c$, building the first observation vector $\mathbf{y}^{(N)}(1)$. The correlation between the observation vector and the reconstructed signal is then evaluated. At this point, the observation window is shifted chip by chip and the new observation vector $\mathbf{y}^{(N)}(l)$, $l = 2, 3, \ldots$ is correlated with the reconstructed signal. The integer timing estimate $\hat{d}_k$ is the value of $d_k$ that maximizes the correlation, which occurs at chip $\hat{l}$.

[Figure 8.3: timeline of the $k$th stream showing $t_0$, $d_k T_c$, $\delta_k T_c$, $(P-1) T_c$, $T_c/\beta$, $T_w$, $T_{sync}$, $T_s$, and the synchronization sequence.]

Figure 8.3 Structure of one data stream for the desired user: $t_0$ is the starting epoch of the correlation process, $T_s$ is the symbol time, $T_c$ is the chip interval, $\beta$ is the oversampling factor, $T_w$ is the window of "clean" observations, and $T_{sync}$ is the whole length of the synchronization sequence.

In other words, the decision strategy is as follows:

\[
\hat{l} = \arg\max_{l} \frac{\left| \mathbf{y}^{(N)H}(l)\, \hat{\mathbf{A}}_k^{(N)} \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right|^2}{\left\| \hat{\mathbf{A}}_k^{(N)} \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right\|^2}, \qquad \hat{d}_k = d_k(\hat{l}) \tag{8.12}
\]

that is, using definitions (8.11):

\[
\hat{l} = \arg\max_{l} \frac{\left| \mathbf{y}^{(N)H}(l)\, \mathbf{I}_{N \times N} \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right|^2}{\left\| \mathbf{I}_{N \times N} \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right\|^2}, \qquad \hat{d}_k = d_k(\hat{l}) \tag{8.13}
\]

where $\mathbf{y}^{(N)H}(l)$ represents the conjugate transpose of the observation vector of $N$ samples at the $l$th chip. Neglecting matrix $\mathbf{I}_{N \times N}$, which has no effect on the delay,

\[
\hat{l} = \arg\max_{l} \frac{\left| \mathbf{y}^{(N)H}(l)\, \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right|^2}{\left\| \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right\|^2}, \qquad \hat{d}_k = d_k(\hat{l}) \tag{8.14}
\]

To simplify the derivation, we summarize the decision strategy (8.14) as follows:

\[
\hat{l} = \arg\max_{l} \frac{|N(l)|^2}{D}, \qquad \hat{d}_k = d_k(\hat{l}) \tag{8.15}
\]

where

\[
N(l) \triangleq \mathbf{y}^{(N)H}(l)\, \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \tag{8.16}
\]

\[
D \triangleq \left\| \hat{\mathbf{P}}_k^{(N)} \hat{\mathbf{b}}_k^{(N)} \right\|^2 \tag{8.17}
\]

and only $N(l)$ depends on the $l$th chip considered (i.e., on the last window considered).
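The decision rule (8.15) can be sketched in a few lines. The example below slides a window chip by chip over a received stream and evaluates $|N(l)|^2 / D$ for each position. It is a minimal illustration under several assumptions that are not in the text: the pulse is ideal (so the reconstructed signal reduces to the synchronization sequence itself), a Barker-13 sequence stands in for a Gold sequence, there is no noise or interfering user, and window positions are numbered from 0 rather than from 1.

```python
import cmath

def sliding_correlator(y, template):
    """Return (l_hat, values), where values[l] = |N(l)|^2 / D for window start l."""
    n = len(template)
    D = sum(abs(t) ** 2 for t in template)                    # denominator, cf. (8.17)
    values = []
    for l in range(len(y) - n + 1):
        window = y[l:l + n]
        N_l = sum(w.conjugate() * t for w, t in zip(window, template))  # cf. (8.16)
        values.append(abs(N_l) ** 2 / D)
    l_hat = max(range(len(values)), key=values.__getitem__)   # arg max of (8.15)
    return l_hat, values

# Barker-13 sequence used as a stand-in synchronization sequence (assumption).
sync = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
data_before = [1, -1, -1, 1, 1, -1, 1]       # 7 data chips precede the sequence
data_after = [-1, 1, 1, -1, -1, 1]

# Data chips carry less energy than the synchronization chips, as in Figure 8.1.
chips = [0.4 * c for c in data_before] + sync + [0.4 * c for c in data_after]

# Unknown amplitude and phase of the desired user (hypothetical values).
gain = 0.8 * cmath.exp(1j * 1.1)
y = [gain * c for c in chips]

l_hat, values = sliding_correlator(y, sync)
print(l_hat, values[l_hat])
```

Because the squared magnitude is used, the unknown phase of the desired user does not affect the position of the peak, which is why the reconstructed signal can be built with zero phase.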
Using the above strategy, a coarse acquisition can be achieved, since the reconstructed signal has been assigned zero fractional delay and the correlator slides chip by chip. Using this technique, the timing error never exceeds a chip interval $T_c$. In the next section we present a strategy to obtain a fine synchronization, again by using a sliding correlator.

8.2.2 Fine Timing Acquisition

Some detection schemes, such as the blind adaptive MMSE detector, require that the timing error does not exceed a small fraction of the chip interval. Hence, a fine acquisition algorithm must be used. In order to achieve fine acquisition, the strategy is similar to that proposed in (8.14), but it is necessary to operate with an oversampled received signal ($\beta$ denotes the oversampling factor). With this choice we want to estimate a fractional delay $\hat{\delta}_k$, which belongs to a set $D_\beta$, with cardinality $\beta$, defined as

\[
D_\beta = \left\{ \frac{r}{\beta} \;\middle|\; r = 0, 1, \ldots, \beta - 1 \right\} \tag{8.18}
\]

The problem can be formulated as the estimation of the continuous-valued delay

\[
\tau_k = \lambda_k T_c = (d_k + \delta_k) T_c \tag{8.19}
\]

which is the actual delay of the $k$th user referred to time $t_0$. Parameter $d_k$ is the number of integer chips from $t_0$ (trailing edge of the first observation window) to the epoch of the leading edge of the window on which no ICI occurs, while $\delta_k$ is the time interval between $t_0$ and the next leading edge of a chip, in agreement with Figure 8.3. We want to estimate both integer delay $d_k$ and fractional delay $\delta_k$ so that the total delay defined as

\[
\hat{\tau}_k = \hat{\lambda}_k T_c = (\hat{d}_k + \hat{\delta}_k) T_c \tag{8.20}
\]

differs from the true delay $\tau_k$ by a sampling interval at the most. As a matter of fact, in the previous section we always considered a fractional delay $\delta_k = 0$ in the reconstructed signal $\hat{\mathbf{Y}}_k^{(N)}$ [see (8.10)]; using a sliding correlator that moves with steps of one chip, it may happen (especially if the true fractional delay $\delta_k$ is more than 0.5) that the integer delay estimated by the coarse stage is larger by one chip interval than the value $\hat{d}_k$ that appears in (8.20). In other words, the coarse acquisition may be off by one chip interval.
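Let us define the possible estimated delays as

\[
\delta_{k,r} \triangleq \frac{r}{\beta}, \qquad r = 0, 1, \ldots, \beta - 1 \tag{8.21}
\]

We then define $\beta$ matrices $\hat{\mathbf{P}}_{k,r}^{(N)}$, with dimensions $N \times (N + P - 1)$, as follows:

\[
\hat{\mathbf{P}}_{k,r}^{(N)} \triangleq \left. \mathbf{P}_k^{(N)} \right|_{\delta_k = \delta_{k,r}}, \qquad r = 0, 1, \ldots, \beta - 1 \tag{8.22}
\]

The fractional delay $\delta_k$ can be found using the same strategy utilized to estimate the integer delay, correlating the oversampled received signal with the reconstructed signal in which the above defined matrices $\hat{\mathbf{P}}_{k,r}^{(N)}$ appear: maximization is performed over all possible values of the estimated delays $\delta_{k,r} \in D_\beta$. In general, at the generic $l$th chip, the estimation of the fractional delay becomes

\[
\hat{\delta}_k(l) = \arg\max_{\delta_{k,r} \in D_\beta} \frac{\left| \mathbf{y}^{(N)H}(l)\, \hat{\mathbf{P}}_{k,r}^{(N)} \hat{\mathbf{b}}_k^{(N)} \right|^2}{\left\| \hat{\mathbf{P}}_{k,r}^{(N)} \hat{\mathbf{b}}_k^{(N)} \right\|^2} \tag{8.23}
\]

Based on previous considerations, once the $\hat{l}$th chip (which produces the peak in the search of $\hat{d}_k$) has been found, it is sufficient to evaluate function (8.23) for both $l = \hat{l}$ and $l = \hat{l} - 1$. In conclusion, $\hat{\delta}_k$ is the value that maximizes the function for $l \in \{\hat{l} - 1, \hat{l}\}$ and $r = 0, 1, \ldots, \beta - 1$:

\[
\hat{\delta}_k = \arg\max_{\delta_{k,r} \in D_\beta,\ l \in \{\hat{l} - 1,\, \hat{l}\}} \frac{\left| \mathbf{y}^{(N)H}(l)\, \hat{\mathbf{P}}_{k,r}^{(N)} \hat{\mathbf{b}}_k^{(N)} \right|^2}{\left\| \hat{\mathbf{P}}_{k,r}^{(N)} \hat{\mathbf{b}}_k^{(N)} \right\|^2} \tag{8.24}
\]

In the special case for which $\beta = 4$, we have that

\[
\delta_{k,r} = \begin{cases}
0 & \text{if } r = 0 \\
0.25 & \text{if } r = 1 \\
0.5 & \text{if } r = 2 \\
0.75 & \text{if } r = 3
\end{cases} \tag{8.25}
\]

Hence, we search for an estimate of the fractional delay $\delta_{k,r} \in D_4 = \{0, 0.25, 0.5, 0.75\}$. The timing acquisition error $|\tau_k - \hat{\tau}_k|$ will be less than $T_c/4$ in the complete absence of background noise, when we assume an oversampling factor $\beta = 4$.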
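The search over the candidate set $D_\beta$ can be sketched as follows. This is a simplified illustration, not the oversampled receiver described above: it works on chip-rate samples and assumes a triangular chip pulse of width $2T_c$ (instead of a root raised cosine pulse), for which a fractional delay $\delta$ mixes each chip with the previous one with weights $1 - \delta$ and $\delta$. The Barker-13 sequence, the complex gain, and the noiseless observation are likewise assumptions. Only one window position is examined; the rule (8.24) would repeat the same search for $l = \hat{l} - 1$.

```python
import cmath

def template(sync, delta):
    """Chip-rate samples of the sequence through a triangular pulse delayed by delta chips."""
    # Assumed pulse: p((m - delta) Tc) equals 1 - delta for m = 0 and delta for m = 1.
    padded = [0.0] + list(sync)
    return [(1.0 - delta) * padded[i + 1] + delta * padded[i] for i in range(len(sync))]

def fine_delay(y, sync, beta):
    """Pick delta_{k,r} = r / beta maximizing |y^H t_r|^2 / ||t_r||^2, in the spirit of (8.23)."""
    best_r, best_val = 0, -1.0
    for r in range(beta):
        t = template(sync, r / beta)
        num = abs(sum(complex(a).conjugate() * b for a, b in zip(y, t))) ** 2
        den = sum(b * b for b in t)
        if num / den > best_val:
            best_r, best_val = r, num / den
    return best_r / beta

sync = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]   # Barker-13 stand-in (assumption)
gain = 0.8 * cmath.exp(1j * 0.7)                      # unknown amplitude and phase
true_delta = 0.5
y = [gain * x for x in template(sync, true_delta)]    # noiseless observation

beta = 4
delta_hat = fine_delay(y, sync, beta)

Tc = 0.25e-6          # chip interval quoted in Section 8.3
d_hat = 7             # hypothetical integer delay from the coarse stage
tau_hat = (d_hat + delta_hat) * Tc                    # total delay, cf. (8.20)
print(delta_hat, tau_hat)
```

The normalization by the template energy matters here: without it, templates with larger energy would be favored regardless of how well they match the observation.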
8.3 Performance of the Sliding Correlator

The performance of the acquisition algorithm must be judged from a different point of view. First of all, the sliding correlator must be able to give at its output the principal peak of the acquisition, well separated from secondary peaks, whose presence is due to the imperfect orthogonality of the Gold sequences. The features of the CDMA system comply with the following specifications: spreading factor $Q = 63$, chip interval $T_c = 0.25 \times 10^{-6}$ s, root raised cosine elementary pulse with roll-off $\alpha = 0.22$, and oversampling factor $\beta = 4$. The synchronization sequences also belong to the Gold codes family.

8.3.1 Coarse Acquisition

In this section we present some typical outputs of the correlator when we employ the coarse acquisition algorithm given by (8.15). The parameters are the number of interfering users, the average chip energy of the synchronization sequence, and the length of the synchronization sequence itself. What really matters is the correlator output in the neighborhood of the synchronization sequence. The delay of the desired user $k$ has been assumed equal to 0, while all other users have a random delay $d_j$ ($0 \le d_j \le Q - 1$, $j \neq k$). The fractional delay $\delta_j$ ($j = 1, \ldots, J$, $j \neq k$) is random too, uniformly distributed between 0 and 1, for each user. In all simulations the frequency offset has been omitted. A random phase offset $\phi_j$ uniformly distributed between 0 and $2\pi$ has been considered.

Case Study A: Isolated Synchronization Sequence
This synchronization technique is typical of the third-generation mobile radio system, which envisions silent (guard) periods before and after the synchronization sequence. Both guard periods are assumed to have a duration of 63 chips. Hence, the correlation has been evaluated over $2 \times Q = 126$ different shifts of the window. The correlation peak must occur at the 64th position. The operating window is shown in Figure 8.4.

Figure 8.4 Synchronization sequence without any interfering or adjacent signal (Case Study A).
The correlation peak has been obtained for three different synchronization sequences. Specifically, the following lengths have been considered for the (normalized) synchronization sequences: $N_s = 63$ chips, $N_s = 127$ chips, $N_s = 255$ chips. The results are presented in Figure 8.5. The correlator output has not been normalized for any of the synchronization sequences themselves: the longer the synchronization sequence, the higher the correlation peak. In the following plots, instead, the correlator peak is normalized, because the chip energy of the synchronization sequences is normalized.

Case Study B: Synchronization Sequence Periodically Inserted
This case corresponds to a situation in which a single DS modulated data stream is transmitted and a synchronization sequence is periodically inserted in it for the purpose of timing acquisition. The data frame we refer to for the $k$th user (the desired user) is shown in Figure 8.6, where $T_s = Q T_c$ is the symbol interval, and $T_{sync} = N_s T_c$ is the synchronization sequence duration.

[Figure 8.5: correlation value versus number of steps of the sliding window, with one curve each for 63, 127, and 255 chips.]

Figure 8.5 Correlator output for Case Study A, when $N_s = 63$, 127, and 255 chips, $E_b/N_0 = 17$ dB, and $P = 4$.

Figure 8.6 Synchronization sequence without any interfering signal but with an adjacent signal (Case Study B).
We evaluate the correlation starting from three data symbols on the left of the synchronization sequence, and we stop the correlation process three data symbols to the right. In the data fields, a DS-QPSK modulation is present with spreading factor $Q = 63$. The unmodulated synchronization sequence belongs to the set of Gold sequences (again $N_s = 63$, 127, and 255 are assumed), and the spreading sequence used for data modulation belongs to the set of Gold sequences with length $Q = 63$. Given the above specifications, the correlation is evaluated over $2 \times 3 \times Q = 378$ different window shifts; hence, the correlation peak will appear in position 190. Since we are using the coarse acquisition algorithm with uncertainty equal to $T_c$, the maximum correlation peak will be either in position 190 or 191.

The three plots of Figures 8.7, 8.8, and 8.9 represent the correlator output for Case Study B using three synchronization sequences with three different lengths and normalized correlation output. The pictures show an increasing interference rejection capability for increasing synchronization sequence lengths (the lateral noise process has vanishing power, already for $N_s = 63$). The chip energy of the synchronization sequence has been assumed larger than the one associated with the modulated data.
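As a quick check of the window counts quoted for the two case studies, assume that window positions are numbered from 1 and that the first window starts exactly one guard period (Case Study A) or three data symbols (Case Study B) before the synchronization sequence. Under these assumptions:

\[
\begin{aligned}
\text{Case Study A:}\quad & 2Q = 2 \times 63 = 126 \ \text{shifts}, & \text{peak at } Q + 1 &= 64 \\
\text{Case Study B:}\quad & 2 \times 3 \times Q = 6 \times 63 = 378 \ \text{shifts}, & \text{peak at } 3Q + 1 &= 190
\end{aligned}
\]

In both cases the peak index is the number of chips skipped before the sequence plus one, and the one-chip uncertainty of the coarse stage explains why position 191 is also possible in Case Study B.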
[Figures 8.7 to 8.9: normalized correlation value versus number of steps of the sliding window.]

Figure 8.7 Correlator output for Case Study B, when $N_s = 63$, $E_{sync} = 6E_c$, $E_b/N_0 = 17$ dB, and $P = 4$.

Figure 8.8 Correlator output for Case Study B, when $N_s = 127$, $E_{sync} = 6E_c$, $E_b/N_0 = 17$ dB, and $P = 4$.

Figure 8.9 Correlator output for Case Study B, when $N_s = 255$, $E_{sync} = 6E_c$, $E_b/N_0 = 17$ dB, and $P = 4$.
Case Study C: Multiple Asynchronous Data Streams with Periodically Inserted Synchronization Sequence
We want to test the synchronization algorithm in the presence of interfering channels. Hence we assume that six asynchronous data streams are simultaneously active, and each of them has a periodically inserted synchronization sequence, as shown in Figure 8.10. Here we do not consider the event of partial overlap of synchronization sequences, as it has low probability. The parameters used to test the system are as follows: the length of the synchronization sequence $N_s$, the average chip energy of the synchronization sequence $E_{sync}$ compared to the chip energy $E_c$ used by the data stream, and the duration of the chip pulse measured in multiples of the chip interval (parameter $P$).

The first numerical results are obtained by setting $E_{sync} = 6E_c$, $P = 4$, and synchronization sequences with $N_s = 63$ chips (see Figure 8.11), $N_s = 127$ chips (see Figure 8.12), and $N_s = 255$ chips (see Figure 8.13). From Figures 8.11–8.13, we can see that the acquisition improves for increasing values of $N_s$. Unlike the single-stream results of Figure 8.7, in a multistream scenario the values $N_s = 63$ and $E_{sync} = 6E_c$ do not warrant adequate acquisition performance. The general conclusion is that the correlation acquisition algorithm is not MAI resistant. By further increasing the chip energy of the synchronization sequence to $E_{sync} = 12E_c$ and keeping $N_s = 63$ chips, we obtain the results of Figure 8.14.
Ts
Tsync
User 1 User 2 User 3 User 4 User 5 User 6
Figure 8.10 Data format for Case Study C, when six streams are active.
159
Synchronization Techniques
Normalized correlation value
1.0
0.8
0.6
0.4
0.2
0.0
0
50
100 150 200 250 300 # of steps of the sliding window
350
Figure 8.11 Correlator output for Case Study C, when Ns ¼ 63, Esync ¼ 6Ec , Eb =N0 ¼ 17 dB, and P ¼ 4.
Figure 8.12 Correlator output for Case Study C, when $N_s = 127$, $E_{sync} = 6E_c$, $E_b/N_0 = 17$ dB, and $P = 4$. (Plot: normalized correlation value, 0.0 to 1.0, versus number of steps of the sliding window, 0 to 350.)
Figure 8.13 Correlator output for Case Study C, when $N_s = 255$, $E_{sync} = 6E_c$, $E_b/N_0 = 17$ dB, and $P = 4$. (Plot: normalized correlation value, 0.0 to 1.0, versus number of steps of the sliding window, 0 to 350.)
These values appear to be a reasonable trade-off between the length of the synchronization sequence and its energy.
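The behavior discussed in Case Study C can be explored with a small simulation. The following is only a minimal sketch under simplifying assumptions that are not taken from the text: chip-rate sampling with no pulse shaping, real-valued ±1 chips, random (not Gold) synchronization sequences, and a single insertion of the sequence in each stream. All function names and default values are hypothetical.

```python
import random

def make_stream(rng, sync, offset, length, gain):
    """Random +/-1 data chips with one boosted sync sequence inserted at `offset`."""
    chips = [rng.choice((-1.0, 1.0)) for _ in range(length)]
    for i, c in enumerate(sync):
        chips[offset + i] = gain * c
    return chips

def acquire(received, sync):
    """Slide the correlator one chip per step; return (peak position, normalized profile)."""
    n_s = len(sync)
    corr = [abs(sum(received[s + i] * sync[i] for i in range(n_s)))
            for s in range(len(received) - n_s + 1)]
    peak = max(corr)
    return corr.index(peak), [c / peak for c in corr]

def simulate(n_s=127, esync_over_ec=12.0, users=6, length=600,
             noise_std=1.0, seed=1):
    """Superimpose `users` asynchronous streams and acquire the timing of the first one."""
    rng = random.Random(seed)
    gain = esync_over_ec ** 0.5  # amplitude boost of the sync chips (assumed sqrt of energy ratio)
    received = [rng.gauss(0.0, noise_std) for _ in range(length)]
    desired_sync, true_offset = None, None
    for u in range(users):
        sync = [rng.choice((-1.0, 1.0)) for _ in range(n_s)]
        offset = rng.randrange(length - n_s + 1)
        stream = make_stream(rng, sync, offset, length, gain)
        received = [a + b for a, b in zip(received, stream)]
        if u == 0:
            desired_sync, true_offset = sync, offset
    estimate, profile = acquire(received, desired_sync)
    return true_offset, estimate, profile

true_offset, estimate, profile = simulate()
print("true offset:", true_offset, "estimated offset:", estimate)
```

Lowering `n_s` or `esync_over_ec` in this sketch reduces the margin between the correlation peak and the interference-induced sidelobes, which is the qualitative trade-off described above.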
8.3.2 Fine Acquisition

Once the coarse acquisition has been achieved, the fine acquisition algorithm can be used to improve the resolution up to the value of the oversampling factor adopted. We have tested the efficiency of the proposed fine acquisition algorithm by evaluating, for different operating conditions, the acquisition error $|\tau_k - \hat{\tau}_k|$, where $\hat{\tau}_k = (\hat{d}_k + \hat{\delta}_k)T_c$, $\hat{d}_k$ satisfies (8.15), and $\hat{\delta}_k$ has been found with (8.23). The operating environment considered is shown in Figure 8.10 (six asynchronous codes). In Figure 8.15 we have considered a synchronization sequence of length $N_s = 63$ chips and chip energy $E_{sync} = 12E_c$; the SNR is $E_b/N_0 = 17$ dB and $P = 4$. The oversampling factor $\beta$ is set equal to 4. The overall acquisition error has been evaluated over 200 independent simulation runs. It can be seen that the acquisition error very rarely exceeds $T_c/4$.
Figure 8.14 Correlator output for Case Study C, when $N_s = 63$, $E_{sync} = 12E_c$, $E_b/N_0 = 17$ dB, and $P = 4$. (Plot: normalized correlation value, 0.0 to 1.0, versus number of steps of the sliding window, 0 to 350.)
Figure 8.15 Fine acquisition error for 200 independent runs, when $N_s = 63$, $E_{sync} = 12E_c$, $E_b/N_0 = 17$ dB, $P = 4$, and $\beta = 4$. (Plot: acquisition error $|\tau_k - \hat{\tau}_k|$, 0.00 to 1.00, versus run index, 0 to 200.)
In Figure 8.16 we present the same simulation, except that the synchronization sequence has a lower energy per chip: $E_{sync} = 6E_c$. The fine acquisition process is degraded, similarly to what happened for the coarse acquisition when we used a low value of $E_{sync}$. In Figure 8.17 the timing acquisition error has been analyzed for a synchronization sequence with length $N_s = 63$ chips and $E_{sync} = 12E_c$, but for a particularly low $E_b/N_0 = 7$ dB. In this case the error of the fine algorithm exceeds the resolution more often.

8.3.3 Complexity Issues of the Sliding Correlator

The evaluation of the optimal length of the training sequence is the result of a performance trade-off in which the independent variables are the features of the interfering CDMA channels (number, spreading factor, frame format, chip energy) and implementation issues. With regard to this last aspect, we provide some considerations below, while the other optimization aspects are examined in Section 10.4.
Figure 8.16 Fine acquisition error for 200 independent runs, when $N_s = 63$, $E_{sync} = 6E_c$, $E_b/N_0 = 17$ dB, $P = 4$, and $\beta = 4$. (Plot: acquisition error $|\tau_k - \hat{\tau}_k|$, 0.00 to 1.00, versus run index, 0 to 200.)
Figure 8.17 Fine acquisition error for 200 independent runs, when $N_s = 63$, $E_{sync} = 12E_c$, $E_b/N_0 = 7$ dB, $P = 4$, and $\beta = 4$. (Plot: acquisition error $|\tau_k - \hat{\tau}_k|$, 0.00 to 1.00, versus run index, 0 to 200.)
The timing acquisition algorithm presented is based on the use of a sliding correlator, whose observation window spans $N = N_s - P + 1$ samples. The correlator must compute its output during each sliding step (i.e., each chip interval). This is feasible if the length of the synchronization sequence $N_s$ is not too large (for example, equal to 63 chips) or if the signal processing device that realizes the correlation is fast enough. If this is the case, we can acquire the timing after an interval whose upper bound is the insertion period of the synchronization sequence.

If the synchronization sequence is too long, the FIR filter of the correlator is no longer able to produce a correlation output at chip rate. In this case it may be convenient to modify the acquisition strategy, as shown in Figure 8.18 [4]. Let $T_{tot}$ be the length, measured in chip intervals, of the data interleaved between two synchronization sequences, and $T_c$ the chip interval. From time $t_0$, the epoch at which timing acquisition is started, every $(T_{tot} + N_{corr})$ chip intervals we store in a buffer $N + N_{corr} + 1$ samples, corresponding to a wider window than that used by the correlator ($N_{corr}$ is a design parameter). In the remaining time $(T_{tot} - N - 1)T_c$, $N_{corr}$ correlations are evaluated, each time using a different set of observation samples (retrieved from the buffer by sliding a pointer for each correlation). The maximum of the correlator output is found and stored along with its timing information. Then the timing reference is shifted ahead by $N_{corr}T_c$, as highlighted in Figure 8.18 (where $t_1 = t_0 + (T_{tot} + N_{corr})T_c$); the same correlations of the previous step are performed and the maximum is stored.

Figure 8.18 Modified timing acquisition, realized on a longer time span reducing the complexity. (Timing diagram: Step 1 starts at $t_0$ and Step 2 at $t_1$; labeled intervals: $T_{tot}$ and $N_{corr}T_c$; buffer positions 1, $N$, and $N_{corr}$.)

The drawback of this technique is a longer timing acquisition; in fact, it takes $T_{tot}/N_{corr}$ chip intervals to complete timing acquisition. This acquisition time can be reduced by increasing the parameter $N_{corr}$, which in turn requires a faster correlator unit. The fine synchronization can also be applied with the usual strategy. Because of the possible uncertainty of a chip interval, it may be necessary to store one extra sample in the buffer and start the correlation from the second sample in the buffer.
8.4 Frequency Offset Estimation

Frequency offset estimation is essential to warrant good detection performance whenever a coherent receiver scheme is used for detection. Coherent detection is often used in burst-mode transmission schemes, such as satellite communication systems and mobile terrestrial radio systems, based either on TDMA or CDMA [5]. These schemes usually envision a sequence of pilot chips employed for rapid parameter acquisition. Focusing on frequency offset estimation, many techniques have been proposed for the TDMA environment (see [5] for an overview). Among the most popular algorithms is the Rife and Boorstyn algorithm [5], which performs maximum likelihood sequence estimation of phase and frequency offset. This algorithm has been proposed for a single-user environment (AWGN) and can be
interpreted as a spectrum analyzer that searches for the maximum of the Fourier transform of the discrete-time observations. The estimator can perform very close to the modified Cramer-Rao bound (MCRB) [6], even for low signal-to-noise ratios. One of the most popular low-complexity algorithms is the Kay algorithm [5], based on a least squares estimation. It avoids the phase unwrapping problem (i.e., the problem of evaluating the phase modulo $2\pi$ before frequency estimation) of the Tretter estimator [5], since it operates on a differential observation. The Kay algorithm has a threshold effect, but it achieves the MCRB for high enough signal-to-noise ratios.

Frequency estimators for DS-CDMA systems in the presence of MAI are difficult to design. The above algorithms perform poorly in CDMA systems since they have been designed to operate in the presence of AWGN, which hardly ever models MAI accurately. Reference [7], for example, deals with joint ML estimation of a set of parameters (including frequency offset) of a single DS spread spectrum signal in the presence of AWGN. In this section we focus on an efficient, cost-effective least mean square (LMS) estimation of the frequency offset of a DS-SS signal in the presence of MAI [8, 9]. The proposed algorithm preprocesses the observation vector differentially, similarly to the Kay algorithm, and realizes an LMS estimation of the frequency offset of the desired CDMA stream, relying on the transmission of a pilot sequence of unmodulated chips with adequate power. On the one hand, the algorithm does not require knowledge of the delay, amplitude, phase, or frequency offset of the interfering users, eliminating those parameters by averaging over their probability density functions. On the other hand, it requires exact knowledge of the identity of the active codes in the system. Under the above assumptions, we prove that the proposed algorithm is equivalent to the standard Kay algorithm in the presence of AWGN only.
8.4.1 Front-End Processing of the Observation

To derive the frequency estimator, we need to manipulate the observation vector. We first rewrite (8.1) as follows:

$$y(n) = A'_k(n)\, e^{j(2\pi f_k n T_c + \phi_k)} + \sum_{j=1,\, j\neq k}^{J} A'_j(n)\, e^{j(2\pi f_j n T_c + \phi_j)} + v(n) \tag{8.26}$$

where

$$A'_j(n) \triangleq A_j\, \mathbf{p}_j^T \mathbf{b}_j(n), \qquad j = 1, 2, \ldots, J \tag{8.27}$$
is unknown for $j \neq k$ and known for $j = k$, where $k$ denotes the desired user. Furthermore, it is assumed that the $k$th user has perfect timing acquisition, which enables ideal Nyquist sampling such that

$$A'_k(n) = A_k\, \mathbf{p}_k^T \mathbf{b}_k(n) = A_k\, b_k(n) \tag{8.28}$$

We can then define the following:

$$y'(n) \triangleq \frac{y(n)}{A'_k(n)} = e^{j(2\pi f_k n T_c + \phi_k)} + \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi f_j n T_c + \phi_j)} + \frac{v(n)}{A'_k(n)} \tag{8.29}$$

We now introduce $Z(n)$, obtained by applying a derotation to $y'(n)$:

$$Z(n) \triangleq y'(n)\, e^{-j(2\pi f_k n T_c + \phi_k)} = 1 + \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi (f_j - f_k) n T_c + \phi_j - \phi_k)} + \frac{v'(n)}{A'_k(n)} \tag{8.30}$$

where $v'(n) \triangleq v(n)\, e^{-j(2\pi f_k n T_c + \phi_k)}$. Using the polar representation $Z(n) = |Z(n)|\, e^{j \arg[Z(n)]}$ and considering (8.30), (8.29) becomes

$$y'(n) = |Z(n)|\, e^{j(2\pi f_k n T_c + \phi_k + \arg[Z(n)])} \tag{8.31}$$

Making use of the samples $y'(n)$, we can now build a set of differential observations ($n = 1, 2, \ldots, N-1$):

$$y'(n)\, y'^{*}(n-1) = |Z(n)|\,|Z(n-1)|\, e^{j(2\pi f_k T_c + \arg[Z(n)] - \arg[Z(n-1)])} \tag{8.32}$$
The actual observation used for frequency estimation is given by

$$r(n) \triangleq \arg[y'(n)\, y'^{*}(n-1)] = 2\pi f_k T_c + \arg[Z(n)] - \arg[Z(n-1)] \tag{8.33}$$

From (8.30) we derive that

$$\arg[Z(n)] = \arctan \frac{\operatorname{Im}\left\{ \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi (f_j - f_k) n T_c + \phi_j - \phi_k)} + \frac{v'(n)}{A'_k(n)} \right\}}{1 + \operatorname{Re}\left\{ \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi (f_j - f_k) n T_c + \phi_j - \phi_k)} + \frac{v'(n)}{A'_k(n)} \right\}} \simeq \operatorname{Im}\left\{ \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi (f_j - f_k) n T_c + \phi_j - \phi_k)} + \frac{v'(n)}{A'_k(n)} \right\} \tag{8.34}$$
where the approximation (8.34) holds only for large signal-to-noise ratios and for $A'_j(n) \ll A'_k(n)$. The last assumption is usually valid since the pilot chips are usually transmitted with a greater power than that devoted to the modulated chips of the data streams. Finally, defining

$$z(n) \triangleq \arg[Z(n)] - \arg[Z(n-1)] \simeq \operatorname{Im}\left\{ \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi (f_j - f_k) n T_c + \phi_j - \phi_k)} - \sum_{j=1,\, j\neq k}^{J} \frac{A'_j(n-1)}{A'_k(n-1)}\, e^{j(2\pi (f_j - f_k)(n-1) T_c + \phi_j - \phi_k)} + \frac{v'(n)}{A'_k(n)} - \frac{v'(n-1)}{A'_k(n-1)} \right\} \tag{8.35}$$

the observation for frequency estimation is given by

$$r(n) = 2\pi f_k T_c + z(n), \qquad n = 1, 2, \ldots, N-1 \tag{8.36}$$
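The differential preprocessing leading to (8.33) and (8.36) can be illustrated numerically. The sketch below assumes an idealized noise-free, interference-free case (so that $Z(n) = 1$ and $z(n) = 0$) and a normalized chip interval $T_c = 1$; the function name and the test values are hypothetical.

```python
import cmath
import math

def differential_phases(y_prime):
    """r(n) = arg[y'(n) * conj(y'(n-1))] for n = 1, ..., N-1."""
    return [cmath.phase(y_prime[n] * y_prime[n - 1].conjugate())
            for n in range(1, len(y_prime))]

nu = 0.0025   # normalized frequency offset f_k * T_c (illustrative value)
phi = 0.7     # arbitrary carrier phase; it cancels in the differential product
N = 63

# Noise-free normalized samples y'(n) = exp(j(2*pi*nu*n + phi)).
y_prime = [cmath.exp(1j * (2 * math.pi * nu * n + phi)) for n in range(N)]
r = differential_phases(y_prime)

# Every differential observation equals 2*pi*nu up to rounding error.
print(len(r), max(abs(x - 2 * math.pi * nu) for x in r))
```

Because each `r[n]` depends only on the phase increment between adjacent samples, no phase unwrapping is needed as long as the increment stays within $(-\pi, \pi]$.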
In order to estimate $f_k$ using the observation $r(n)$, we need to find the statistics of $z(n)$, namely its covariance matrix $\mathbf{C}_z$, whose generic element is defined as $C_z(n, m) \triangleq E[z(n)\, z(m)]$. The stationarity of the discrete-time process $z(n)$ is then discussed. Some preliminary considerations apply. In (8.35) we note an explicit dependence on $f_k$, the parameter we are estimating. If we define the two new random variables $f'_j = f_j - f_k$ and $\phi'_j = \phi_j - \phi_k$, (8.35) becomes

$$z(n) = \sum_{j=1,\, j\neq k}^{J} \operatorname{Im}\left\{ \frac{A'_j(n)}{A'_k(n)}\, e^{j(2\pi f'_j n T_c + \phi'_j)} - \frac{A'_j(n-1)}{A'_k(n-1)}\, e^{j(2\pi f'_j (n-1) T_c + \phi'_j)} \right\} + \operatorname{Im}\left\{ \frac{v'(n)}{A'_k(n)} - \frac{v'(n-1)}{A'_k(n-1)} \right\} \tag{8.37}$$

Using the entirely scalar version of (8.37), it can be shown that the first- and second-order statistics of $z(n)$ are independent of the parameters $f'_j$ and $\phi'_j$.
The dependence on $f_k$ that appears in $v'(n)$ is unimportant, since a phase rotation does not modify the statistics of thermal noise. It is also assumed that the quantity $A'_k(n)$ is real, since we assume that the unmodulated pilot chips are real, while $A'_j(n)$ is, in general, a complex number (since the adopted modulation may be QPSK or QAM).
8.4.2 First-Order Statistics of z(n)

From (8.37) it is easy to show that the mean value of $z(n)$ is zero, since the thermal noise has zero mean value and the constellation symbols belong to a QPSK or QAM constellation. As a consequence, the covariance matrix $\mathbf{C}_z$ coincides with the correlation matrix $\mathbf{R}_z$, and the first-order statistics of $z(n)$ are stationary.
8.4.3 Second-Order Statistics of z(n)

It can be shown, through a nontrivial proof (see Appendix 8A), that under the above assumptions the second-order statistics of $z(n)$ are stationary, too, and that the correlation matrix $\mathbf{R}_z$ is tridiagonal and given by

$$\mathbf{R}_z = \frac{\sigma_I^2 + \sigma_v^2}{A_k^2} \begin{bmatrix} 2 & -1 & & & 0 \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ 0 & & & -1 & 2 \end{bmatrix} \tag{8.38}$$

where we have exploited the fact that the $k$th user has perfect timing acquisition and $A'^{2}_k(n) = A_k^2$ ($n = 0, 1, \ldots, N-1$), independent of $n$. If we denote by $v'_I(n)$ and $v'_Q(n)$ the in-phase and quadrature components of the noise samples $v'(n)$, then $E[v'^{2}_I(n)] = E[v'^{2}_Q(n)] = \sigma_v^2$. The interference equivalent variance $\sigma_I^2$ is given by

$$\sigma_I^2 \triangleq \frac{1}{A_k^2} \sum_{j=1,\, j\neq k}^{J} E[A_j^2] \sum_{l=0}^{P-1} \sum_{r=0}^{P-1} p_{l,r}\, \frac{Q - |l - r|}{Q^2} \sum_{i=0}^{Q-1} c_j(|i - l|_Q)\, c_j(|i - r|_Q) \tag{8.39}$$

where $p_{l,r} = \int_0^1 p((l - \delta)T_c)\, p((r - \delta)T_c)\, d\delta$.
8.4.4 LMS Frequency Estimator

We first give a vectorial expression to the observation (8.36), by defining

$$\mathbf{r} \triangleq [r(1), r(2), \ldots, r(N-1)]^T \tag{8.40}$$

$$\mathbf{z} \triangleq [z(1), z(2), \ldots, z(N-1)]^T \tag{8.41}$$

and $\mathbf{m} \triangleq 2\pi T_c \mathbf{1}$, where $\mathbf{1} = [1, 1, \ldots, 1]^T$ is a vector of $N-1$ unit elements. At this point, the observation (8.36) can be rewritten as

$$\mathbf{r} = \mathbf{m} f_k + \mathbf{z} \tag{8.42}$$

It is well known that, since $\mathbf{z}$ is a stationary process, the LMS estimate of $f_k$ can be obtained as

$$\hat{f}_k = \mathbf{a}^T \mathbf{r} \tag{8.43}$$

where

$$\mathbf{a} \triangleq \frac{1}{2\pi T_c}\, \frac{\mathbf{R}_z^{-1} \mathbf{1}}{\mathbf{1}^T \mathbf{R}_z^{-1} \mathbf{1}} \tag{8.44}$$

We notice that the weighting vector $\mathbf{a}$ is independent of the coefficient $\frac{1}{A_k^2}(\sigma_I^2 + \sigma_v^2)$, and its components $a_n$ are given by

$$a_n = \frac{1}{2\pi T_c}\, \frac{6n(N - n)}{N(N^2 - 1)} \tag{8.45}$$

This estimator does not require the estimation of the noise variance, nor the estimation of the interference equivalent variance $\sigma_I^2$. The weight vector $\mathbf{a}$ is the same as that of the standard Kay algorithm in the presence of AWGN only.
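As a numerical cross-check of (8.44) and (8.45), the following sketch computes the weights in two ways: from the closed form, and by solving $\mathbf{R}_z \mathbf{x} = \mathbf{1}$ for a matrix proportional to the tridiagonal pattern $(-1, 2, -1)$ of (8.38). It is only an illustration under the assumption $T_c = 1$; the function names are hypothetical, and the scale factor of $\mathbf{R}_z$ is omitted because it cancels in (8.44).

```python
import math

def kay_weights(n_samples, t_c=1.0):
    """Closed-form weights a_n of (8.45), n = 1, ..., N-1."""
    N = n_samples
    return [6.0 * n * (N - n) / (N * (N * N - 1)) / (2 * math.pi * t_c)
            for n in range(1, N)]

def blue_weights(n_samples, t_c=1.0):
    """Weights of (8.44) for R_z proportional to tridiag(-1, 2, -1)."""
    m = n_samples - 1
    # Thomas algorithm: forward elimination for tridiag(-1, 2, -1) x = 1.
    c = [0.0] * m
    d = [0.0] * m
    c[0], d[0] = -0.5, 0.5
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (1.0 + d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    total = sum(x)  # equals 1^T R^{-1} 1 up to the omitted scale factor
    return [v / total / (2 * math.pi * t_c) for v in x]

def estimate_frequency(r, t_c=1.0):
    """f_hat = a^T r, as in (8.43)."""
    a = kay_weights(len(r) + 1, t_c)
    return sum(w * x for w, x in zip(a, r))

N = 63
nu = 0.0025
r = [2 * math.pi * nu] * (N - 1)   # noise-free differential observations
print(estimate_frequency(r))       # prints a value close to nu
```

The agreement between `kay_weights` and `blue_weights` reflects the statement above that the weights do not depend on the noise or interference variances.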
8.5 Performance of the Frequency Estimator

To assess the performance of the proposed estimator, we have considered a CDMA system employing Gold sequences with spreading factor $Q = 63$. The transmission filter $p(t)$ has a root raised cosine frequency response, and $q(t)$ is matched to $p(t)$. The time truncation of the overall
channel filter is set to $P = 4$. The modulation format is QPSK and the modulated chips have energy $E_c$, while the pilot chips (an unmodulated Gold sequence of length $N = Q$) are transmitted with energy $E_{sync} = 12E_c$. The desired data stream has a maximum timing acquisition error equal to $T_c/4$. In this case, at the receiver side we assume Nyquist sampling, but the observation is generated assuming that the last equality in (8.28) does not hold. The test has been performed by assigning the $k$th user a normalized frequency offset $\nu_k = f_k T_c = 0.0025$ (corresponding to a frequency error $f_k \approx 10$ kHz at 4.032 Mchip/s), while the other codes have been assigned a random frequency offset between 0 and $\nu_k$. For each signal a random phase error between 0 and $2\pi$ has been considered. No near-far effect has been considered, and the amplitude is $A_j = 1$ for all $J$ users. The performance of the proposed estimation algorithm is evaluated as a function of the SNR, defined as

$$(\mathrm{SNR})_{dB} \triangleq \left( \frac{E_{sync}}{N_0} \right)_{dB} = 10 \log_{10} \frac{A_k^2}{Q\, \sigma_v^2} \tag{8.46}$$

The variable SNR does not take MAI into account. In order to assess the efficiency of an estimator in fighting MAI, it is convenient to introduce the new variable

$$\mathrm{SNIR} \triangleq \frac{E_{sync}}{N_0 + N_{0,I}} \tag{8.47}$$

where

$$N_{0,I} \simeq (J - 1) E_c = (J - 1) \frac{A_j^2}{Q} = \frac{J - 1}{Q} \tag{8.48}$$

is the equivalent spectral density due to the $J - 1$ interfering signals. The relation between SNR and SNIR can be written as follows:

$$\mathrm{SNIR} = \left( \frac{N_0 + N_{0,I}}{E_{sync}} \right)^{-1} = \frac{1}{\dfrac{1}{\mathrm{SNR}} + \dfrac{J - 1}{A_k^2}} \tag{8.49}$$

and expressed in decibels:

$$(\mathrm{SNIR})_{dB} = -10 \log_{10} \left( \frac{1}{\mathrm{SNR}} + \frac{J - 1}{A_k^2} \right) \tag{8.50}$$
As a performance reference, the MCRB [6] has been considered. The MCRB sets a lower bound on the estimator variance as a function of the SNR and the number of samples used for the parameter estimation. In detail,

$$\mathrm{MCRB} = \frac{3}{2\pi^2 N^3 T^2}\, \frac{1}{E_c/N_0} \tag{8.51}$$
where $N$ is the number of chips used for the estimation, $T = 1$ is the normalized signaling interval, and $E_c/N_0$ is the SNR ($E_c$ is the average energy per chip and $N_0$ is the one-sided noise spectral density). In Figure 8.19 we plot the standard deviation of the frequency estimator as a function of $E_{sync}/N_0$ for different operating conditions. In the case of no MAI, the estimator reaches the MCRB without any threshold effect. The other curves are plotted assuming a timing error uniformly distributed between 0 and $T_c/4$. For a single user, a timing error causes a negligible performance degradation. If we consider other users in the system (MAI caused by two extra users), the standard deviation saturates around the value of $5.5 \times 10^{-4}$. The floor indicates that, in practice, above 0 dB the performance is not sensitive to any reduction of the thermal noise level.

Figure 8.19 Standard deviation of frequency estimate as a function of the signal-to-noise ratio $E_{sync}/N_0$. (Plot: standard deviation $\sigma$, logarithmic scale from $10^{-5}$ to $10^{-1}$, versus $E_{sync}/N_0$ from 0 to 27 dB. Curves: MCRB; perfect timing, single user; timing error, single user; timing error, 3 active codes.)

In Figure 8.20 we plot the standard deviation of the estimation error as a function of the SNIR, which is a more appropriate way to assess the estimator performance. Note that the scale of this figure is enlarged in order to show the exact performance of the two cases considered (i.e., three and six asynchronous users). This curve is the remapping of the curves reported in Figure 8.19 to the plane (SNIR, $\sigma$), using the coordinate transformation given by (8.50). The plotted points accumulate for increasing SNIR values since the curve is obtained by varying the noise power only and keeping the value of the MAI constant (set by the number of active users). The performance is very close to the MCRB, although the standard deviation increases slightly if the number of users is increased from three to six.

From Figure 8.21 we can draw some conclusions about the bias of the proposed estimator. We have plotted the estimated $f_k T_c$ as a function of the true $f_k T_c$ for a synchronization sequence length of $N = 63$, $E_{sync} = 12E_c$, $P = 4$, and $E_{sync}/N_0 = 0$ dB. The estimator is not biased and provides a good performance for the range of $f_k T_c$ considered.
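For reference, the bound (8.51) can be evaluated directly. This is a minimal sketch with a hypothetical function name; it returns the standard deviation (the square root of the MCRB) so that it can be compared with curves such as those of Figures 8.19 and 8.20, and it treats the SNR argument as whatever ratio is used in (8.51).

```python
import math

def mcrb_std(snr_db, n_chips, t=1.0):
    """Square root of the MCRB of (8.51) for a given SNR in dB."""
    snr = 10.0 ** (snr_db / 10.0)
    variance = 3.0 / (2.0 * math.pi ** 2 * n_chips ** 3 * t ** 2) / snr
    return math.sqrt(variance)

# Bound on the standard deviation for N = 63 chips over a range of SNR values.
for snr_db in range(0, 28, 9):
    print(snr_db, mcrb_std(snr_db, 63))
```

The bound decreases as $N^{-3/2}$ in the number of chips and by a factor of $\sqrt{10}$ for every 10 dB of additional SNR.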
Figure 8.20 Standard deviation of frequency estimate as a function of SNIR. (Plot: standard deviation $\sigma$, logarithmic scale from $10^{-4}$ to $10^{-3}$, versus SNIR from 0 to 8 dB. Curves: MCRB; timing error, 3 active codes; timing error, 6 active codes.)
Figure 8.21 Bias of the LMS frequency estimator. (Plot: estimated $f_k T_c$ versus true $f_k T_c$, both from –0.2 to 0.2; timing error, 6 active codes.)
8.6 Coherent Versus Noncoherent Detection

In this chapter we have presented two simple algorithms for timing acquisition and frequency offset estimation. The sliding correlator is a cost-effective solution to the problem of timing acquisition, although it is not robust to either near-far effect or MAI. The performance has been assessed under different operating scenarios, highlighting the trade-off in the synchronization sequence design. With regard to the frequency offset estimator, we have pointed out the difficulties in designing feedforward schemes, since noise (AWGN and MAI) is difficult to model in this context. Hence, approximations should be used, which, however, result in an estimator with good performance, like the one presented in this chapter.

We underline that ancillary parameter estimation (timing, phase, frequency), that is, synchronization in a wide sense, is mandatory for coherent detectors, which are the focus of this book. Nevertheless, most coherent detectors allow for some uncertainties in the estimation of these parameters. For the sake of completeness, we mention that there is a research area on noncoherent detection schemes that investigates self-synchronized detection schemes for both TDMA and CDMA [10–13]. These transmission schemes use differential encoding (at the symbol level or at the chip level), and,
provided that timing is acquired, do not require an explicit preliminary phase and frequency offset estimation, as it is implicitly provided by the detection scheme itself.
References

[1] Ström, E. G., et al., "Propagation Delay Estimation in Asynchronous Direct-Sequence Code-Division Multiple Access Systems," IEEE Transactions on Communications, Vol. 44, No. 1, January 1996, pp. 84–93.

[2] Bensley, S. E., and B. Aazhang, "Maximum-Likelihood Synchronization of a Single User for Code-Division Multiple-Access Communication Systems," IEEE Transactions on Communications, Vol. 46, No. 3, March 1998, pp. 392–399.

[3] Zheng, D., et al., "An Efficient Code-Timing Estimator for DS-CDMA Signals," IEEE Transactions on Signal Processing, Vol. 45, No. 1, January 1997, pp. 82–89.

[4] Conti, A., et al., "DSP-Based CDMA Satellite Modem: CNIT-ASI Project," 12th Tyrrhenian International Workshop on Digital Communications: "Software Radio: Technologies and Services," Isola d'Elba, September 2000.

[5] Morelli, M., and U. Mengali, "Feedforward Frequency Estimation for PSK: A Tutorial Review," European Transactions on Telecommunications, Vol. 9, No. 2, March–April 1998.

[6] D'Andrea, A. N., U. Mengali, and R. Reggiannini, "The Modified Cramer-Rao Bound and Its Application to Synchronization Problems," IEEE Transactions on Communications, Vol. 42, No. 2/3/4, February–April 1994.

[7] Rezeanu, S.-C., R. E. Ziemer, and M. A. Wickert, "Joint Maximum-Likelihood Parameter Estimation for Burst DS Spread-Spectrum Transmission," IEEE Transactions on Communications, Vol. 45, No. 2, April 1997.

[8] Castoldi, P., and G. Colavolpe, "Low-Complexity Least Mean Square Data-aided Frequency Estimation in CDMA Systems," Proc. of Conf. on Information Science and Systems (CISS 2001), Johns Hopkins University, Baltimore, MD, March 2001.

[9] Hu, A. Q., P. C. K. Kwok, and T. S. Ng, "MPSK DS/CDMA Carrier Recovery and Tracking Based on Correlation Technique," IEE Electronics Letters, Vol. 35, No. 3, February 1999.

[10] Colavolpe, G., and R. Raheli, "Noncoherent Sequence Detection," IEEE Transactions on Communications, Vol. 47, September 1999, pp. 1376–1385.

[11] Colavolpe, G., and R. Raheli, "Noncoherent Sequence Detection of Continuous Phase Modulations," IEEE Transactions on Communications, Vol. 47, September 1999, pp. 1303–1307.

[12] Colavolpe, G., and R. Raheli, "Improved Differential Detection of Chip-level Differentially Encoded Direct-sequence Spread-spectrum Signals," IEEE Transactions on Wireless Communications, Vol. 1, No. 1, January 2002.

[13] Cavallini, A., et al., "Chip-level Differential Encoding/Detection of Spread-Spectrum Signals for CDMA Radio Transmissions over Fading Channels," IEEE Transactions on Communications, Vol. 45, April 1997, pp. 456–463.
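Appendix 8A: Derivation of Matrix Rz

In the expression of $z(n)$ given by (8.37) we can isolate two terms: the contribution of interfering signals and the contribution of thermal noise. Hence, the elements $R_z(n, m)$ of the correlation matrix $\mathbf{R}_z$ can be derived by separately evaluating the expected values of the four resulting products. In order to obtain general results, we have averaged over all random parameters, except the number of active users $J$, which is considered deterministic and known. Specifically, the general expression of $R_z(n, m)$ is

$$
\begin{aligned}
R_z(n, m) &\triangleq E[z(n)\, z(m)] \\
&= E\Bigg[ \Bigg( \sum_{j=1,\, j\neq k}^{J} A_j \sum_{l=0}^{P-1} p_j(l) \bigg[ \frac{\operatorname{Re}\{b_j(n - d_j - l)\}}{A'_k(n)} \sin(2\pi f'_j n T_c + \phi'_j) + \frac{\operatorname{Im}\{b_j(n - d_j - l)\}}{A'_k(n)} \cos(2\pi f'_j n T_c + \phi'_j) \\
&\qquad - \frac{\operatorname{Re}\{b_j(n - 1 - d_j - l)\}}{A'_k(n-1)} \sin(2\pi f'_j (n-1) T_c + \phi'_j) - \frac{\operatorname{Im}\{b_j(n - 1 - d_j - l)\}}{A'_k(n-1)} \cos(2\pi f'_j (n-1) T_c + \phi'_j) \bigg] \\
&\qquad + \operatorname{Im}\bigg\{ \frac{v(n)\, e^{-j(2\pi f_k n T_c + \phi_k)}}{A'_k(n)} - \frac{v(n-1)\, e^{-j(2\pi f_k (n-1) T_c + \phi_k)}}{A'_k(n-1)} \bigg\} \Bigg) \\
&\quad \times \Bigg( \sum_{i=1,\, i\neq k}^{J} A_i \sum_{r=0}^{P-1} p_i(r) \bigg[ \frac{\operatorname{Re}\{b_i(m - d_i - r)\}}{A'_k(m)} \sin(2\pi f'_i m T_c + \phi'_i) + \frac{\operatorname{Im}\{b_i(m - d_i - r)\}}{A'_k(m)} \cos(2\pi f'_i m T_c + \phi'_i) \\
&\qquad - \frac{\operatorname{Re}\{b_i(m - 1 - d_i - r)\}}{A'_k(m-1)} \sin(2\pi f'_i (m-1) T_c + \phi'_i) - \frac{\operatorname{Im}\{b_i(m - 1 - d_i - r)\}}{A'_k(m-1)} \cos(2\pi f'_i (m-1) T_c + \phi'_i) \bigg] \\
&\qquad + \operatorname{Im}\bigg\{ \frac{v(m)\, e^{-j(2\pi f_k m T_c + \phi_k)}}{A'_k(m)} - \frac{v(m-1)\, e^{-j(2\pi f_k (m-1) T_c + \phi_k)}}{A'_k(m-1)} \bigg\} \Bigg) \Bigg]
\end{aligned} \tag{8A.1}
$$

Starting from (8A.1), we obtain (8.38) as follows. The noise term correlation is given by

$$E\left[ \left( \frac{v'_Q(n)}{A'_k(n)} - \frac{v'_Q(n-1)}{A'_k(n-1)} \right) \left( \frac{v'_Q(m)}{A'_k(m)} - \frac{v'_Q(m-1)}{A'_k(m-1)} \right) \right] \simeq \begin{cases} \dfrac{2}{A_k^2}\, \sigma_v^2 & \text{if } n = m \\[1ex] -\dfrac{1}{A_k^2}\, \sigma_v^2 & \text{if } n = m \pm 1 \\[1ex] 0 & \text{otherwise} \end{cases} \tag{8A.2}$$

since

$$A'^{2}_k(n) \simeq A_k^2, \qquad n = 0, 1, \ldots, N-1 \tag{8A.3}$$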
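The correlation between interference-like terms can be obtained in a similar manner by exploiting the following facts. As regards the chip correlation:

$$E[b_j(n - d_j - l)\, b_j^{*}(n - d_j - r)] = E\left[ c_j(|n - d_j - l|_Q)\, c_j(|n - d_j - r|_Q) \right] E\left[ a_j\!\left( \left\lfloor \frac{n - d_j - l}{Q} \right\rfloor \right) a_j^{*}\!\left( \left\lfloor \frac{n - d_j - r}{Q} \right\rfloor \right) \right] \tag{8A.4}$$

where

$$
\begin{aligned}
E\left[ c_j(|n - d_j - l|_Q)\, c_j(|n - d_j - r|_Q) \right] &= \sum_{i'=0}^{Q-1} E\left[ c_j(|n - d_j - l|_Q)\, c_j(|n - d_j - r|_Q) \,\middle|\, d_j = i' \right] \mathrm{Prob}\{d_j = i'\} \\
&= \frac{1}{Q} \sum_{i'=0}^{Q-1} E\left[ c_j(|n - i' - l|_Q)\, c_j(|n - i' - r|_Q) \right] \\
&= \frac{1}{Q} \sum_{i=0}^{Q-1} c_j(|i - l|_Q)\, c_j(|i - r|_Q)
\end{aligned} \tag{8A.5}
$$

and

$$
\begin{aligned}
E\left[ a_j\!\left( \left\lfloor \frac{n - d_j - l}{Q} \right\rfloor \right) a_j^{*}\!\left( \left\lfloor \frac{n - d_j - r}{Q} \right\rfloor \right) \right] &= \mathrm{Prob}\left\{ \left\lfloor \frac{n - d_j - l}{Q} \right\rfloor = \left\lfloor \frac{n - d_j - r}{Q} \right\rfloor \right\} \\
&= \mathrm{Prob}\{ n - d_j - l \geq 0,\; n - d_j - r \geq 0 \} + \mathrm{Prob}\{ n - d_j - l < 0,\; n - d_j - r < 0 \} \\
&= \mathrm{Prob}\{ d_j \leq n - \max\{l, r\} \} + \mathrm{Prob}\{ d_j > n - \min\{l, r\} \} \\
&= \frac{1}{Q} \left[ (n - \max\{l, r\} + 1) + (Q - 1 - n + \min\{l, r\}) \right] \\
&= \frac{Q + \min\{l, r\} - \max\{l, r\}}{Q} \\
&= \frac{Q - |l - r|}{Q}
\end{aligned} \tag{8A.6}
$$

As regards the pulse correlation:

$$
\begin{aligned}
E[p_j(l)\, p_j(r)] &= E[p((l - \delta_j)T_c)\, p((r - \delta_j)T_c)] \\
&= \frac{1}{T_c} \int_0^{T_c} p((l - \delta_j)T_c)\, p((r - \delta_j)T_c)\, d(\delta_j T_c) \\
&= \int_0^1 p((l - \delta_j)T_c)\, p((r - \delta_j)T_c)\, d\delta_j
\end{aligned} \tag{8A.7}
$$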
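The counting argument behind (8A.6) can be checked by exhaustive enumeration. The sketch below assumes that the integer delay $d_j$ is uniformly distributed over $0, \ldots, Q-1$ (as used in (8A.5)) and restricts the chip index to $P - 1 \leq n \leq Q - 1$, a range chosen here so that only two adjacent symbols are involved; the function name is hypothetical.

```python
from fractions import Fraction

def same_symbol_probability(n, l, r, q):
    """Fraction of delays d in 0..Q-1 for which chips n-d-l and n-d-r
    belong to the same symbol, i.e. floor((n-d-l)/Q) == floor((n-d-r)/Q)."""
    hits = sum(1 for d in range(q)
               if (n - d - l) // q == (n - d - r) // q)
    return Fraction(hits, q)

Q, P = 63, 4
mismatches = [
    (n, l, r)
    for n in range(P - 1, Q)
    for l in range(P)
    for r in range(P)
    if same_symbol_probability(n, l, r, Q) != Fraction(Q - abs(l - r), Q)
]
print("mismatches:", len(mismatches))
```

Python's `//` operator rounds toward minus infinity, which matches the floor function used in (8A.4) and (8A.6) for negative arguments.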
9 Third-Generation Mobile Radio System

In this chapter we will present the basics of the terrestrial radio interface of the third-generation mobile radio system, UMTS, and outline the status of the standardization process [1, 2]. At present, terrestrial radio access envisions two radio interfaces based on FDD and two based on TDD, summarized in the next section. Of the two FDD modes, the first is based on a direct spread approach (with two carriers, one for the uplink and one for the downlink); the second is based on a multicarrier approach but is still based on an FDD paradigm. The two TDD modes have many specifications in common, among which the most relevant to us is that they employ "short" codes, which allow the deployment of multiuser detection in the downlink [3, 4] and, of course, in the uplink too. In particular, the chip rate difference between the two TDD modes explains why they are named TDD low chip rate (TDD-LCR) and TDD high chip rate (TDD-HCR).

In this chapter we will concentrate on the specifications of the TDD-HCR mode as an example. Nevertheless, since the signal processing techniques for multiuser detection described in this book apply to the detection within a timeslot, their validity can be extended to the TDD-LCR mode, regardless of its frame structure and chip rate.
9.1 General Specifications

The above radio interfaces operate within a minimum spectrum of 2 × 5 MHz for FDD, 5 MHz for TDD-HCR, and 1.6 MHz for TDD-LCR [5, 6]. For this purpose, paired and unpaired frequency bands have been identified in the region of 2 GHz to be used for third-generation mobile radio systems.
The duplexing modes have been harmonized with respect to some basic system parameters so that FDD/TDD dual mode operation is facilitated, thus providing a basis for the development of low-cost terminals. The interworking of UMTS with GSM is also ensured. In UMTS, the different service needs are supported in a spectrally efficient way by a combination of FDD and TDD [7–10].

The FDD modes are intended for applications in public macro- and microcell environments with data rates up to 384 kbps and high mobility. The paired frequency allocation is needed because one band contains the carriers for uplink transmission and the other band those for downlink transmission. Another special feature of the FDD mode is that it uses "long" codes for spreading and scrambling, which means that this standard was not designed for a multiuser detection approach. In fact, as mentioned in Chapter 1, long codes prevent an easy dimensional separation of the users.

The TDD modes, instead, use separate timeslots for the uplink and downlink streams; thus, an unpaired frequency band is sufficient for their operation. TDD is the proposed duplexing technique to be employed for high data rate applications and picocell coverage of low mobility users for data rates up to 2 Mbps. Therefore, the TDD mode is particularly suitable both for environments with high traffic density (e.g., city centers and business areas) and for indoor environments (e.g., wireless office), where the applications tend to create highly asymmetric traffic.

The current standardization process is led by the Operator Harmonization Group (OHG), which in June 1999 proposed a harmonized Global 3rd Generation (G3G) concept by combining the specifications developed within the 3rd Generation Partnership Project (3GPP) [1] and 3GPP2 [2]. The G3G concept is a single standard with the following four modes of operation [7]:

1. CDMA direct spread (CDMA-DS), based on UMTS Terrestrial Radio Access (UTRA FDD), as specified by 3GPP;
2. CDMA multicarrier (CDMA-MC), based on cdma2000 (evolution of IS-95) using FDD, as specified by 3GPP2;
3. TDD-LCR, formerly time division synchronous CDMA (TD-SCDMA), for low mobility and wireless local loop applications;
4. TDD-HCR, based on UTRA-TDD, as specified by 3GPP.

It is worthwhile to give some motivations for the existence of the four modes and some additional details about their relation. The TDD-HCR mode is better at supporting highly asymmetric applications (wireless office applications, for example), which cannot be handled well with the UTRA FDD mode; the latter is optimized for outdoor, high mobility situations, which are presumed to be more symmetric. The TDD-HCR mode could be considered a "partner" of the FDD mode. The other TDD mode (TDD-LCR) seems to be better suited as a stand-alone alternative to the FDD/TDD-HCR pair. This is primarily evident from the non-radio-network-controlled (non-RNC) deployment possibilities in the TDD-LCR mode, which are completely absent in the TDD-HCR mode.
9.2 The Physical Layer of the TDD-HCR Mode

Channels available in the UMTS system are used partly for actual user data transmission and partly to transmit control and signaling information (to perform, for example, cell identification and system and user synchronization). The channel utilized for transmission can be described in two different ways at the physical layer:

• Logical channels: The description is based on the type of information transported;
• Physical channels: The description is based on the data format transmitted.

In this book we are not interested in the network aspects (upper layers); hence, we will only focus on the structure of the physical channels and on the most important parameters involved in the radio interface. We will then link this description to the receiver structures and to the synchronization techniques previously described.
9.2.1 Physical Channels

We now analyze how the information is transmitted over the physical channels. Similarly to the TDMA technique, time is divided into intervals called frames, each with a duration of 10 ms. A sequence of frames forms a further structure called a superframe (see Figure 9.1), whose standardization warrants compatibility with the GSM system currently in use. Each superframe is composed of 72 frames, for an overall duration of 720 ms, six times the duration of the 120-ms multiframe adopted by the GSM system. Each frame is further divided into 15 timeslots (TS) with a duration approximately equal to 666 µs, as shown in Figure 9.1.
Figure 9.1 Format of the TDD-HCR transmission system: a superframe (720 ms) contains frames #0 to #71 (10 ms each), and each frame contains timeslots TS #0 to TS #14 (666 µs each).
A timeslot can be assigned either to the downlink or to the uplink, as shown in the figure. Different configurations of the data frames are then available: the uplink or downlink slots are allocated depending on the local traffic conditions. Figure 9.2 shows some possible allocation patterns. In the first two patterns of Figure 9.2, we have a symmetric allocation of the slots, while in the other two cases most of the slots are allocated to the downlink. We notice, for example, that the second pattern realizes a symmetric allocation of the resources for UL/DL by having a switching point at each new timeslot, while the first pattern has a switching point in the middle of the frame. The switching points in patterns 3 and 4 are not regularly distributed.
Figure 9.2 Possible symmetric and asymmetric allocation of slots to DL and UL in a TDD-HCR system (four patterns: two with symmetric DL/UL allocation and two with asymmetric DL/UL allocation).
Unlike the simple TDMA technique, the CDMA technique allows different users to transmit their own data packets in the same timeslot (see Figure 9.3), using a basic transmission resource called a resource unit. More than one resource unit can be allotted, if necessary, to the same user in a specified timeslot. During the detection process, the data carried by different resource units are separated on the basis of their unique timeslot, frequency, and code. The chip rate is fixed in this system at 3.840 Mchip/s, so that the chip interval is $T_c = 0.26042$ µs. Each timeslot has a duration of 2,560 chip intervals, equivalent to 666 µs. The spreading factor $Q$ may vary from 1 to 16. Assuming $Q = 16$, the symbol interval $T_s$ is obtained as
\[ T_s = Q\,T_c = 4.17\ \mu\text{s} \tag{9.1} \]
which, however, may be different from user to user because of the variable spreading factor. Let us focus on the data packet format. Each timeslot has a duration of 666 µs and its structure is shown in Figure 9.4. Specifically, we have
- Two fields containing the actual user data, that is, the payload (data symbols);
- A midamble that contains the training sequence used at the receiver side to detect the active codes and for channel estimation;
- A guard period (GP) used to compensate for possible small asynchronism due to propagation delays and interference coming from adjacent cells. If we assume a propagation delay of 3 µs/km, a guard period of 96 chips can cope with propagation delays corresponding to distances up to 10 km at the most.
The TDD-HCR standard envisions two types of packets with the structure of Figure 9.4; the two types of data format differ only in the size of the first three fields. The data packet (burst) format of type 1 is shown in Figure 9.5, and the size of the fields is reported in Table 9.1 (the assumed spreading factor is 16). Figure 9.6 and Table 9.2 show the data packet format and the size
Figure 9.3 TDD-HCR multiple access overview (TD/CDMA; axes: energy, frequency, time, and codes; within a 5-MHz carrier and a 10-ms frame, each code in a timeslot carries a data block, a midamble, a data block, and a guard period).
Figure 9.4 Data format of a traffic packet (data symbols, midamble, data symbols, GP).
Figure 9.5 Data packet of type 1 (data symbols: 61 symbols, 976 chips; midamble: 512 chips; data symbols: 61 symbols, 976 chips; GP: 96 chips).
Table 9.1 Data Packet Format of Type 1
Chip #         Length in Chips   Length in Symbols   Length in µs   Description
0–975          976               61                  238.3          Data symbols
976–1,487      512               -                   125.0          Midamble
1,488–2,463    976               61                  238.3          Data symbols
2,464–2,559    96                -                   23.4           Guard period
Figure 9.6 Data packet of type 2 (data symbols: 69 symbols, 1,104 chips; midamble: 256 chips; data symbols: 69 symbols, 1,104 chips; GP: 96 chips).
Table 9.2 Data Packet Format of Type 2
Chip #         Length in Chips   Length in Symbols   Length in µs   Description
0–1,103        1,104             69                  269.5          Data symbols
1,104–1,359    256               -                   62.5           Midamble
1,360–2,463    1,104             69                  269.5          Data symbols
2,464–2,559    96                -                   23.4           Guard period
of the fields for the data traffic packet of type 2, assuming a spreading factor equal to 16. Note that the data packet of type 1 has a midamble of 512 chips, while the data packet of type 2 has a midamble of 256 chips. For this reason, data packet 1 is used mainly for the uplink, where the channel parameters of all mobile users transmitting to the base station must be estimated. Data packet 2 is used mainly in the downlink, but it can also be used in the uplink if the number of active users in the timeslot is lower than four.
The choice of the data burst size is a trade-off between two conflicting requirements. On the one hand, large bursts lead to trunking effects and low granularity (i.e., the transmission of a lot of padding data if not enough actual data is available to fill the resources). On the other hand, small bursts involve too much overhead for physical layer control data that require a minimum length. For instance, the training sequences may not fall below a particular length, in order to allow the estimation of larger delay spreads.
Figure 9.7 shows a zoom of the structure of a slot consisting of the superposition of several traffic channels of either type 1 or type 2. Slots 0 and 8 are shaded, indicating that they are special slots: these two slots may be used for the transmission of the synchronization channel. We mention that explicit signaling about the convolutional encoding of the data can be embedded by means of two fields (not shown in Figures 9.5 and 9.6) of 10 bits immediately to the left and to the right of the midamble. These two fields are called the transport format combination indicator (TFCI) and they are described in detail in [7].
Figure 9.7 Structure of the CDMA of different traffic channels within the same timeslot (a 10-ms frame with slots 0 to 14; within a slot of 0.666 ms, each traffic channel TC #1, TC #2, ... occupies one code and carries data (69/61 symbols, 1,104/976 chips), a midamble (256/512 chips), data (69/61 symbols, 1,104/976 chips), and a GP (96 chips)).
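The timing figures and field sizes quoted in this section can be cross-checked with a few lines of code. The sketch below uses only the chip rate, the slot length, and the field lengths of Tables 9.1 and 9.2; the function and variable names are ours and purely illustrative.

```python
# Values quoted in the text and in Tables 9.1 and 9.2.
CHIP_RATE = 3.84e6            # chip/s
CHIPS_PER_SLOT = 2560
SLOTS_PER_FRAME = 15
BURST_FIELDS = {              # (data, midamble, data, guard) lengths in chips
    1: (976, 512, 976, 96),
    2: (1104, 256, 1104, 96),
}
FIELD_NAMES = ("data 1", "midamble", "data 2", "guard period")

chip_interval = 1.0 / CHIP_RATE                     # Tc, in seconds
slot_duration = CHIPS_PER_SLOT * chip_interval      # about 666.7 us
frame_duration = SLOTS_PER_FRAME * slot_duration    # 10 ms


def symbol_interval(q):
    """Symbol interval Ts = Q * Tc for spreading factor q."""
    return q * chip_interval


def burst_layout(burst_type):
    """Return (field name, first chip, last chip) for each field of a burst."""
    layout, start = [], 0
    for name, length in zip(FIELD_NAMES, BURST_FIELDS[burst_type]):
        layout.append((name, start, start + length - 1))
        start += length
    if start != CHIPS_PER_SLOT:
        raise ValueError("fields do not fill the timeslot")
    return layout


def symbols_per_data_field(burst_type, q):
    """Number of symbols carried by one data field for spreading factor q."""
    return BURST_FIELDS[burst_type][0] // q


for burst_type in (1, 2):
    print(f"burst type {burst_type}:")
    for name, first, last in burst_layout(burst_type):
        print(f"  {name:12s} chips {first:4d} to {last:4d}")
```

With a spreading factor of 16, `symbols_per_data_field` returns 61 and 69 symbols for the two burst types, in agreement with the tables.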
9.2.2 Training Sequences
Training sequences are sequences of unmodulated chips transmitted in the midamble and used by the receiving station for channel estimation and detection of the active codes. On the downlink, the channel is in general the same for all users, as mentioned in the previous chapters. In the uplink, the situation is slightly more complicated, since different user-specific channel impulse responses must be estimated. In order to reduce the complexity of the channel estimation, the training sequences in TDD are based on a particular construction algorithm that allows a joint channel estimation by one single cyclic correlator [11]. The midambles of different users that are active in the same timeslot are time-shifted versions of one single periodic basic code [7]. Different cells use different periodic basic codes. The different channel impulse response estimates are obtained sequentially in time at the output of this correlator and can be separated by simple windowing.
Besides channel estimation, the superposition of training sequences in the midamble can also be used to detect the active codes in a timeslot. A proper midamble sequence generated by the Steiner algorithm is associated with each active code [11]. These midamble sequences have a pronounced autocorrelation peak and low cross-correlation with other sequences. The algorithm for active code detection can be summarized as follows. The midamble observation (i.e., the superposition of all active midambles) is stored by the MS in a circular shift register. Among the active codes, notification of the ones destined to a specific MS is performed by using the broadcast channel (BCH). As the midamble enjoys a circular reconfiguration property, by circulating the received observation the MS can first set a threshold for the detection of the active codes. It can then detect all active codes and estimate the channel by averaging over the estimates associated with each active code.
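The idea of joint channel estimation with shifted versions of one periodic basic code can be illustrated with a toy example. The sketch below is not the midamble construction of the standard: it assumes a length-31 m-sequence as a stand-in basic code, three users whose midambles are cyclic shifts by multiples of a window of 4 chips, a noise-free observation of one period, and a cyclic deconvolution (the least-squares form of the cyclic correlator) computed with a naive DFT. All names are hypothetical.

```python
import cmath


def m_sequence(length=31):
    """Binary (+1/-1) m-sequence from the recurrence s[n+5] = s[n+2] XOR s[n]."""
    bits = [1, 0, 0, 0, 0]
    while len(bits) < length:
        bits.append(bits[-3] ^ bits[-5])
    return [1.0 - 2.0 * b for b in bits]


def dft(x, inverse=False):
    """Naive O(P^2) discrete Fourier transform, adequate for a short period P."""
    p = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(p):
        acc = sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / p) for n in range(p))
        out.append(acc / p if inverse else acc)
    return out


def received_midamble(basic, channels, window):
    """One period of the superposed midambles after the users' channels (no noise)."""
    p = len(basic)
    r = [0.0] * p
    for k, h in enumerate(channels):
        for w, tap in enumerate(h):
            for n in range(p):
                # user k sends the basic code cyclically shifted by k * window chips
                r[n] += tap * basic[(n - k * window - w) % p]
    return r


def estimate_channels(r, basic, n_users, window):
    """Cyclic deconvolution by the basic code, followed by windowing per user."""
    spectrum_r, spectrum_m = dft(r), dft(basic)
    h_cat = dft([a / b for a, b in zip(spectrum_r, spectrum_m)], inverse=True)
    return [[h_cat[k * window + w].real for w in range(window)]
            for k in range(n_users)]


WINDOW = 4
basic_code = m_sequence()
true_h = [[1.0, 0.5, 0.0, -0.2], [0.8, 0.0, 0.3, 0.0], [-0.6, 0.4, 0.1, 0.0]]
observation = received_midamble(basic_code, true_h, WINDOW)
est_h = estimate_channels(observation, basic_code, len(true_h), WINDOW)
```

Because the users' midambles are shifts of the same code, the three impulse responses appear one after the other in a single deconvolution output and are separated simply by taking consecutive windows of `WINDOW` taps.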
9.2.3 Spreading and Scrambling Sequence
Orthogonal variable spreading factor sequences are used as spreading sequences. We consider a code-dependent data sequence length: specifically, we denote the data sequence spread by the $j$th code by $\{a_j(n)\}_{n=1}^{N_j}$ ($j = 1, 2, \ldots, J$), where $N_j$ is the data sequence length. The chip interval $T_c$ is a constant in a TDD system, so that, denoting by $Q_j$ the spreading factor for the $j$th code, the symbol interval $T_j = Q_j T_c$ may be different for different data streams. The product $Q_j N_j = L_d$, which measures the length of the data field in multiples of the chip interval $T_c$, is constant. The UTRA-TDD system employs a structured periodic composite spreading code $c_{j,m}(n,k)$ obtained following the procedure described in Chapter 4:
\[ c_{j,m}(n,k) = w_j(k)\, z_m\!\left(|n-1|_{Q_{\max}/Q_j},\, k\right), \quad m = 1, 2, \ldots;\ \ j = 1, 2, \ldots, J_m;\ \ n = 1, 2, \ldots, N_j;\ \ k = 0, 1, \ldots, Q_j - 1 \tag{9.2} \]
where $(w_j(0), w_j(1), \ldots, w_j(Q_j - 1))$ ($j = 1, 2, \ldots, J$) is the Walsh-Hadamard spreading sequence, and $z_m(n,k)$ is the BS-specific scrambling code ($m$ identifies the BS) as described in Chapter 4. The sequence $c_{j,m}(n,k)$ has the same period as the sequence $z_m(n,k)$. The scrambling sequences are chosen in order to warrant a low correlation between different sequences for any timing offset and between different shifted versions of the same sequence (whiteness of the composite code). The first property offers protection against intercell interference, which is generated with different scrambling sequences and is asynchronous with respect to the desired signal. The second property can fight self-interference due to multipath when using a Rake receiver. In Figure 9.8 we have plotted, as an example, the autocorrelation and cross-correlation for the two codes listed in Table 9.3. In particular, Figure 9.8 shows the autocorrelation of the scrambling sequences "0" and "1" (AC Code 0 and AC Code 1), as well as the circular cross-correlation between code 0 and code 1 (CC Code 0-Code 1) as a function of the circular shift.
We can observe that, in the absence of spreading ($Q_j = 1$), the composite code is just the scrambling sequence, whose elements directly multiply the data symbols. The scrambling sequence repeats every $Q_{\max}$ symbols. At the other extreme, for which $Q_j = Q_{\max}$, the spreading sequence and the scrambling sequence both repeat at each symbol interval. In all other cases we have intermediate situations where the spreading sequence repeats each symbol interval, while the scrambling sequence has a period of $Q_{\max}/Q_j$ symbol intervals.
Figure 9.8 Circular autocorrelation of code 0 and code 1 and cross-correlation as a function of the shift (curves: AC Code 0, AC Code 1, CC Code 0–Code 1; vertical axis: correlation; horizontal axis: shift).
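Curves of the kind shown in Figure 9.8 are obtained by evaluating circular correlations for every shift. The following sketch shows the computation on two hypothetical 16-chip binary sequences, chosen only for illustration; they are not the codes of Table 9.3.

```python
def circular_correlation(a, b, shift):
    """Circular correlation of two equal-length real sequences at a given shift."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n] for i in range(n))


# Hypothetical 16-chip sequences (not the scrambling codes of the standard).
code_a = [1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, -1, 1]
code_b = [1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, -1]

auto_a = [circular_correlation(code_a, code_a, s) for s in range(16)]
auto_b = [circular_correlation(code_b, code_b, s) for s in range(16)]
cross_ab = [circular_correlation(code_a, code_b, s) for s in range(16)]

for s in range(16):
    print(f"shift {s:2d}: AC a = {auto_a[s]:3d}, AC b = {auto_b[s]:3d}, CC = {cross_ab[s]:3d}")
```

A good scrambling sequence shows a peak equal to the sequence length at zero shift and small values of both the autocorrelation at nonzero shifts and the cross-correlation at all shifts.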
The process of spreading and scrambling can be summarized as shown in Table 9.4.
9.3 Radio Interface and Transmitted Signal
In this section we analyze the transmission scheme and focus on the various operations that must be carried out on the information bits. The bits are modulated using a specified spreading sequence, on which we overlay a BS-dependent scrambling sequence. After a proper multiplexing, the signal is modulated and transmitted to comply with the data format scheme.
Table 9.3 Two Scrambling Sequences (chips # 1 to 16 of code 0 and code 1, each of unit magnitude; the signs of the individual chips are not legible in this copy).
Table 9.4 Combination of the Spreading and Scrambling Procedure
Data symbols: $a_j^{(i)}(1),\ a_j^{(i)}(2),\ \ldots,\ a_j^{(i)}(N_j)$
↓ spreading of each bit with the code word $\mathbf{w}_j = (w_j(0), \ldots, w_j(Q_j - 1))$
$a_j^{(i)}(1)\,(w_j(0), \ldots, w_j(Q_j - 1)),\ \ldots,\ a_j^{(i)}(Q_{\max}/Q_j)\,(w_j(0), \ldots, w_j(Q_j - 1))$
↓ chip-by-chip multiplication by the scrambling sequence
$z_m(1,0), \ldots, z_m(1,Q_j - 1),\ z_m(2,0), \ldots, z_m(2,Q_j - 1),\ \ldots,\ z_m(Q_{\max}/Q_j, 0), \ldots, z_m(Q_{\max}/Q_j, Q_j - 1)$
↓ "spread" and "scrambled" data
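The two steps of Table 9.4 can be sketched in a few lines. The example below makes some assumptions that are ours: the scrambling code is a hypothetical 16-chip binary sequence, $z_m(n,k)$ is read as chip $k$ of the $n$th segment of $Q_j$ chips of that sequence, and the index $|n-1|_{Q_{\max}/Q_j}$ of (9.2) is read as a reduction modulo $Q_{\max}/Q_j$.

```python
def walsh_hadamard(q):
    """Rows of the q x q Walsh-Hadamard matrix (q a power of two), entries +1/-1."""
    rows = [[1]]
    while len(rows) < q:
        rows = [r + r for r in rows] + [r + [-c for c in r] for r in rows]
    return rows


def composite_code(w, z, n):
    """Composite code c(n, k) = w(k) * z(segment, k) for symbol index n = 1, 2, ..."""
    q, q_max = len(w), len(z)
    segment = (n - 1) % (q_max // q)      # which Q-chip piece of the scrambling code
    return [w[k] * z[segment * q + k] for k in range(q)]


def spread_and_scramble(symbols, w, z):
    """Spread each symbol with w and multiply chip by chip by the scrambling code."""
    chips = []
    for n, a in enumerate(symbols, start=1):
        chips.extend(a * c for c in composite_code(w, z, n))
    return chips


# Hypothetical scrambling sequence of Qmax = 16 chips (illustration only).
z = [1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, -1]
w4 = walsh_hadamard(4)[1]                 # a spreading code with Q = 4
w16 = walsh_hadamard(16)[5]               # a spreading code with Q = Qmax = 16
chips = spread_and_scramble([1, -1, 1, 1, -1], w4, z)
```

With $Q_j = 4$ the composite code repeats every $Q_{\max}/Q_j = 4$ symbols, so that symbols 1 and 5 are spread with the same chips, while with $Q_j = 16$ it repeats at every symbol.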
The BSs are approximately aligned in time. We have shown in Figure 9.7 the structure of a slot dedicated to traffic channels. The two data fields, as shown, carry the superposition of spread and scrambled data, the midamble is a superposition of unmodulated chips, and the guard period compensates for the small system asynchronism due to transmission misalignment and different propagation delays. We denote the strings of symbols belonging to the data fields on the left and on the right of the midamble with a suffix $i = 1, 2$, respectively:
\[ \mathbf{a}_j^{(i)} = \left(a_j^{(i)}(1),\, a_j^{(i)}(2),\, \ldots,\, a_j^{(i)}(N_j)\right), \quad j = 1, 2, \ldots, J \tag{9.3} \]
where $a_j^{(i)}(n)$ is in general a complex symbol, and $N_j$ is the number of symbols of the data field, which depends on the spreading factor $Q_j$ of the $j$th code. We make use of the composite spreading sequence (9.2), which for the $j$th user and the $m$th cell is
\[ c_{j,m}(n,k) = w_j(k)\, z_m(n,k), \quad n = 1, 2, \ldots, N_j;\ \ k = 0, 1, \ldots, Q_j - 1 \tag{9.4} \]
where $n$ runs over symbol intervals, while $k$ runs over chip intervals.
The low-pass equivalent of the signal transmitted in the various fields of a data traffic packet is as follows.
First data field:
\[ s_{j,m}^{(1)}(t) = \sum_{n=1}^{N_j} a_j^{(1)}(n) \sum_{q=0}^{Q_j-1} c_{j,m}(n,q)\, p\big(t - qT_c - (n-1)Q_jT_c\big) \]
Second data field:
\[ s_{j,m}^{(2)}(t) = \sum_{n=1}^{N_j} a_j^{(2)}(n) \sum_{q=0}^{Q_j-1} c_{j,m}(n,q)\, p\big(t - qT_c - (n-1)Q_jT_c - (L_d + L_m)T_c\big) \]
Midamble:
\[ s_m^{(3)}(t) = \sum_{l=0}^{L_m-1} g_m(l)\, p\big(t - lT_c - L_dT_c\big) \]
where $L_d$ and $L_m$ are, respectively, the nominal durations, expressed in chip intervals, of each data field and of the midamble. The correspondence of the three signals $s_{j,m}^{(1)}(t)$, $s_{j,m}^{(2)}(t)$, and $s_m^{(3)}(t)$ is visually shown in Figure 9.9. Hence, the signal corresponding to a physical traffic channel ("resource unit") can be written as
\[ s_{j,m}(t) = s_{j,m}^{(1)}(t) + s_{j,m}^{(2)}(t) + s_m^{(3)}(t) \tag{9.5} \]
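At chip rate, that is, leaving out the chip pulse $p(t)$, the signal (9.5) of one resource unit is simply the concatenation of the first data field, the midamble, and the second data field, followed by the silent guard period. The sketch below assembles a burst of type 1 under assumptions of ours: a spreading factor equal to $Q_{\max} = 16$ (so that the composite code repeats at every symbol), and hypothetical code, midamble, and data values.

```python
def spread(symbols, code):
    """Spread each symbol with a composite code that repeats at every symbol."""
    return [a * c for a in symbols for c in code]


def build_burst(data1, data2, code, midamble, guard_chips=96):
    """Chip-rate samples of one resource unit: data 1, midamble, data 2, guard."""
    return (spread(data1, code) + list(midamble)
            + spread(data2, code) + [0] * guard_chips)


# Hypothetical values, for illustration only.
code = [1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, -1]   # 16 chips
midamble = [1, -1] * 256                                           # 512 chips
data1 = [1, -1] * 30 + [1]                                         # 61 symbols
data2 = [-1] * 61                                                  # 61 symbols

burst = build_burst(data1, data2, code, midamble)
print(len(burst))
```

The resulting list has one entry per chip interval of the timeslot; a pulse-shaping filter would be applied to it to obtain the continuous-time signal.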
We want to point out that the linear multiuser receivers presented in Chapter 3 can be directly applied to the data format of the TDD system. In particular, the four interference mitigation detectors for single-cell systems presented in Section 3.1 and their extensions to multiple-cell systems presented in Section 4.2 can be used in the receiver of an MS. The sliding window algorithm applies in the two data fields. It is more convenient to start the sliding procedure from the data near the midamble, where the channel estimates are more reliable, moving backwards in the first data field and forwards in the second.
Figure 9.9 Data format and relevant signals.
9.4 Cell Search and Synchronization
A brief overview of the cell search and synchronization for timing acquisition, which are performed before actual data detection, is given in
this section. Two downlink timeslots are used for transmission of the physical synchronization channel (SCH) in the TDD system considered. Note that other physical downlink channels may be transmitted in the same timeslots in parallel. The use of two timeslots with a seven-slot space is necessary for monitoring purposes and to enable a proper intersystem handover from GSM or UTRA FDD to UTRA TDD. To facilitate cell planning, up to 128 cells in a TDD system can be distinguished, each of which has a unique cell-specific scrambling code and two corresponding basic midamble codes used for construction of long and short midamble codes (for data packet format of type 1 and 2, respectively). The 128 cells are grouped into 32 code groups. The primary synchronization channel (PSCH) is designed in such a way that the terminal can acquire frame synchronization and cell parameters (i.e., scrambling codes and basic midamble codes) within a single three-step procedure [7]. The three hierarchical steps are reported below: Slot synchronization: During the first step the mobile terminal uses the primary synchronization sequence Cp of 256 unmodulated chips, unique for the whole system, to acquire the slot synchronization with the strongest BS, without knowledge of its identity. In this case, frame synchronization is acquired with an uncertainty of 1 out of 2, since the PSC is transmitted twice per frame. The simplest way to achieve synchronization is by correlation and peak detection. Frame synchronization and identification of the code group: During the second synchronization step for the cell search, the MS detects three secondary codes Cs;i1 ; Cs;i2 ; Cs;i3 (i1; i2; i3 ¼ 1; 2; . . . ; 15) that are transmitted at the same time and with the same overall power as Cp . This combination of secondary codes allows the discrimination of the 32 code groups. The detection of the secondary synchronization sequence can also be performed by correlation and peak detection. 
The uncertainty over the scrambling code is now 1 out of 4. Scrambling code identification: In the third phase the terminal determines one out of four cells within the code group by correlating with the four different basic midamble codes (transmitted on the BCCH) belonging to the code group identified in the previous step. The synchronization techniques described in Chapter 8 can be used for all these steps.
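The first step of the procedure, correlation followed by peak detection, can be illustrated as follows. This is only a sketch under assumptions of ours: a random binary sequence of 256 chips stands in for the primary synchronization sequence, the received buffer contains that sequence once at an unknown offset plus white Gaussian noise, and no multipath or frequency offset is present.

```python
import random


def correlate_and_detect(received, code):
    """Slide the known code over the received chips; return (best offset, peak)."""
    best_offset, best_value = 0, float("-inf")
    for offset in range(len(received) - len(code) + 1):
        value = abs(sum(received[offset + i] * code[i] for i in range(len(code))))
        if value > best_value:
            best_offset, best_value = offset, value
    return best_offset, best_value


rng = random.Random(1)
sync_code = [rng.choice((-1.0, 1.0)) for _ in range(256)]   # stand-in sequence
true_offset = 300
received = [rng.gauss(0.0, 0.5) for _ in range(1024)]        # noise only
for i, chip in enumerate(sync_code):
    received[true_offset + i] += chip                         # embed the sequence

detected_offset, peak = correlate_and_detect(received, sync_code)
print(detected_offset, round(peak, 1))
```

The same correlate-and-pick-the-peak structure applies to the later steps, with the secondary codes or the basic midamble codes used as the reference in place of the primary sequence.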
9.5 Novelties of the Third-Generation Radio System
Third-generation systems have to support a wide range of services, from voice and low data rate up to high data rate circuit-switched and packet-oriented services. A high degree of asymmetry is expected for data applications. These needs are efficiently supported by a combination of FDD and TDD. They are reflected in the technical requirements of international standardization bodies with respect to the supported data rates and radio environments. In particular, the radio interface of the UTRA-TDD mode is compatible with simple and cost-efficient multiuser detection techniques because it envisions the use of short codes both for spreading and scrambling. The deployment of multiuser detection and interference mitigation techniques is recommended especially when the multipath is significant and the number of active codes is relatively large.
References
[1] 3rd Generation Partnership Project (3GPP), Radio Interface Technical Specifications, http://www.3gpp.org.
[2] 3rd Generation Partnership Project 2 (3GPP2), Radio Interface Technical Specifications, http://www.3gpp2.org.
[3] Castoldi, P., and H. Kobayashi, "Low Complexity Group Detectors for Multirate Transmission in TD-CDMA 3G Systems," IEEE Broadband Wireless Symposium (Globecom 2000), San Francisco, CA, November–December 2000.
[4] Castoldi, P., and H. Kobayashi, "Co-channel Interference Mitigation Detectors for Multirate Transmission in TD-CDMA System," (special issue on multiuser detection for the wireless channel), Vol. 20, No. 2, February 2002, pp. 273–286.
[5] 3rd Generation Partnership Project (3GPP), Radio Interface Technical Specifications, technical specifications 25.928, http://www.3gpp.org.
[6] 3rd Generation Partnership Project (3GPP), Radio Interface Technical Specifications, technical specifications 25.102, http://www.3gpp.org.
[7] Haardt, M., et al., "The Physical Layer of UTRA-TDD," Proceedings of the IEEE Vehicular Technology Conference (VTC 2000, Spring), Tokyo, Japan, May 15–18, 2000.
[8] Haardt, M., et al., "The TD-CDMA Based UTRA TDD Mode," IEEE Journal on Selected Areas in Communications, Vol. 18, No. 8, August 2000.
[9] Chaudhury, P., W. Mohr, and S. Onoe, "The 3GPP Proposal for IMT-2000," IEEE Communications Magazine, December 1999.
[10] Gessner, C., et al., "Layer 2 and Layer 3 of UTRA-TDD," Proceedings of the IEEE Vehicular Technology Conference (VTC 2000, Spring), Tokyo, Japan, May 15–18, 2000.
[11] Steiner, B., and P. Jung, "Uplink Channel Estimation in Synchronous CDMA Mobile Radio System with Joint Detection," 4th International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC '93), Yokohama, Japan, September 1993.
10 The ASI-CNIT Communication System
The system described in this chapter is the result of a collaborative research effort between the Italian Space Agency (ASI) and the Italian National Consortium for Telecommunications (CNIT). The project envisions experiments of teleteaching applications on a satellite network using the satellite ITALSAT II for the space segment. One of its objectives is the design of a CDMA strategy for the access management of the stations participating in the videoconference session. Like third-generation mobile radio systems, the ASI-CNIT system deploys CDMA to allow for the existence of logical channels overlapping in time and frequency [1]. Moreover, CDMA gives the system a high degree of reconfigurability by assigning proper new codes to new users entering the communication, a highly desirable feature in terms of flexibility.
The last few years have witnessed growing interest in the interconnection of local area networks (LANs) using a satellite. It is a current issue from the point of view of transmission and network protocols. One of the main advantages of interconnecting LANs using a satellite is the capability of covering a large geographic area, with a complexity that is independent of the number of receiving stations. The characteristics of a satellite network are very different from those of terrestrial wired and wireless networks [2]. The major disadvantage is the propagation delay due to the long round-trip path of a geostationary satellite. Moreover, transport of multimedia traffic requires high transmission speed, reliability, and low transmission jitter. The use of a satellite link, properly configured in terms of multiple access and bandwidth allocation, is a valid alternative to a terrestrial network [3].
10.1 System Description
The radio interface of the CDMA system is asynchronous. In a possible transmitting scenario, as shown in Figure 10.1, three CDMA modems located in Genoa, Bologna, and Prato, Italy, are active. The three satellite stations interconnect heterogeneous LANs, realized either traditionally (Bologna) or by using hybrid terrestrial links (see, for example, Prato, Pisa, Florence, Genoa, and Parma). Integrated Services Digital Network (ISDN) links or microwave radio frequency (RF) links are used to create extended LANs. The data associated with the CDMA streams are detected by using the MMSE algorithms presented in Chapter 5. The system does not envision any kind of time alignment among transmitting stations; as a consequence, timing acquisition and frequency offset estimation are necessary for the demodulation of the CDMA data streams. The synchronization algorithm that can be used is described in Chapter 8. The CDMA modem has been realized on a DSP board; however, this description is beyond the scope of this chapter. The details of the DSP implementation can be found in [4], while here we will describe only the system specifications and the basic signal processing algorithm designed for correct operation [5].
The system specifications envision a configuration with at most six transmitting stations. Each station has six codes available to transmit its data. The choice of a spreading factor of 63 (Gold sequences) suggests that fewer than 36 codes are employed simultaneously. Since the system is intrinsically asynchronous, each station can start transmitting at an arbitrary epoch, and the receiving stations must be able to lock to the signal that is destined to them, acquiring chip and framing synchronization and tracking it along the detection process.
Figure 10.1 CDMA satellite system with three active stations (three CDMA modems linked through the ITALSAT F2 satellite, Ka band; the LANs of Genoa, Pisa, Prato, Parma, Bologna, and Florence are attached through hubs, routers, ISDN routers, and RF links at rates of 512 Kbps and 2 Mbps).
10.2 Data Frame Format
The data frame proposed envisions the use of synchronization sequences periodically inserted in the data stream. The synchronization sequence allows us to acquire the timing of the useful signal and to estimate other important parameters such as phase, frequency, and amplitude. Estimation of this last parameter is optional when adaptive MMSE receiving algorithms are used, since they are able to converge to and track (to some extent) the optimal MMSE solution. Figure 10.2 shows the superposition of data frame formats of concurrently and asynchronously transmitting stations. We can see that a synchronization sequence of unmodulated chips, whose length and energy have been optimized, is periodically inserted to allow a receiving station to lock to the proper beam of codes. The duration $T_{sync}$ of this sequence has been designed so as to be equal to a symbol interval, and the sequence can be chosen among the set of 63 Gold sequences. This choice was made in order to maintain the cyclostationarity of the received signal; a longer synchronization sequence would destroy this cyclostationarity and induce a transient for the adaptive algorithms, as shown by computer simulations later in the chapter. The power $P_{sync}$ of the synchronization sequence and its insertion rate must be optimized by trading good acquisition performance for a moderate interference on the other data streams.
Figure 10.2 Data frame with synchronization sequence periodically inserted (fields of duration $T_{sync}$, $T_{train}$, and $T_{data}$ for codes #1 to #6 of the first through sixth stations).
Figure 10.3 Length in symbol intervals of the fields of a data frame of a single transmitting station (codes #1 to #6; field lengths $N_{sync}$, $N_{train}$, $N_{data}$, $N_{ov}$, and $N_{tot}$).
The data frame of each transmitting station is detailed in Figure 10.3. The data part of the frame comprises a training sequence,
which is mandatory if we use a trained adaptive algorithm. The actual data field follows the training sequence field and contains the data stream of the transmitting station. The training sequence in the data part of the frame is composed of $N_{train}$ symbols, and the actual data field is made of $N_{data}$ symbols. The length of the periodically repeated pattern, measured in symbol intervals, is $N_{tot} = N_{data} + N_{train} + 1$, where $N_{ov} = N_{train} + 1$ is overhead. If we plan to use a blind adaptive algorithm, the training sequence is not necessary ($N_{train} = 0$). It might be useful to use data frames whose length is a power of 2 [i.e., $2^n$ symbol intervals ($n$ natural)]. We can choose, for example, $N_{tot}$ as a power of 2 with either a trained or a blind algorithm. Furthermore, in order to simplify the realization, it can be convenient to choose $N_{ov}$ as a power of 2 if using a trained algorithm.
The time duration of each field of the data frame gives useful indications to assess the DSP feasibility of the adopted synchronization and detection algorithms: the system bit rate is $R_b = 64$ Kbps and the spreading factor is $Q = 63$. The chip rate is
\[ R_c = R_b\,Q = 4.032\ \text{Mchip/s} \tag{10.1} \]
The symbol interval $T_s$ and the chip interval $T_c$ have the following values:
\[ T_s = \frac{1}{R_b} = \frac{1}{64 \cdot 10^3} \simeq 15.75\ \mu\text{s}, \qquad T_c = \frac{1}{R_c} = \frac{T_s}{Q} \simeq 0.25\ \mu\text{s} \tag{10.2} \]
If $T_{sync}$ is the time duration of the synchronization sequence, $T_{train}$ the duration of the training sequence field, $T_{data}$ the duration of the actual data field, $T_{ov}$ the duration of the overhead symbols, and $T_{tot}$ the entire duration of the repeated pattern, these parameters have the following values:
\[ \begin{aligned} T_{sync} &= N_{sync}\,T_s = T_s = 15.75\ \mu\text{s} \\ T_{train} &= N_{train}\,T_s \\ T_{data} &= N_{data}\,T_s \\ T_{ov} &= N_{ov}\,T_s \\ T_{tot} &= N_{tot}\,T_s \end{aligned} \tag{10.3} \]
A significant parameter that highlights the efficiency of a data frame is the percentage overhead, defined as the ratio between the time spent to transmit the overhead, $T_{ov}$, and the total duration of the frame, $T_{tot}$:
\[ OV = \frac{T_{ov}}{T_{tot}} \cdot 100 = \frac{N_{ov}}{N_{tot}} \cdot 100 \tag{10.4} \]
Tables 10.1 and 10.2 contain a set of values of the above parameters for the two adaptive MMSE detection algorithms.
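The entries of such tables follow directly from (10.1), (10.3), and (10.4). As a minimal sketch, the function below evaluates the frame parameters for a given pair ($N_{data}$, $N_{train}$); it uses the symbol interval of 15.75 µs quoted in (10.2), and its names are ours.

```python
BIT_RATE = 64_000                        # Rb in bit/s
SPREADING_FACTOR = 63                    # Q
CHIP_RATE = BIT_RATE * SPREADING_FACTOR  # Rc in chip/s, see (10.1)
SYMBOL_INTERVAL_US = 15.75               # Ts in microseconds, as quoted in (10.2)


def frame_parameters(n_data, n_train, n_sync=1):
    """Lengths (in symbols), durations, and percentage overhead of one frame."""
    n_ov = n_train + n_sync
    n_tot = n_data + n_ov
    return {
        "N_ov": n_ov,
        "N_tot": n_tot,
        "T_train_us": n_train * SYMBOL_INTERVAL_US,
        "T_data_ms": n_data * SYMBOL_INTERVAL_US / 1000.0,
        "T_tot_ms": n_tot * SYMBOL_INTERVAL_US / 1000.0,
        "OV_percent": 100.0 * n_ov / n_tot,
    }


trained = frame_parameters(n_data=992, n_train=31)   # a frame for a trained detector
blind = frame_parameters(n_data=511, n_train=0)      # a frame for a blind detector
print(trained)
print(blind)
```

For a fixed overhead, doubling the total frame length halves the percentage overhead, which is the trend visible when the frame length grows from 1,024 to 8,192 symbols.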
Table 10.1 Possible Data Frames for Trained Detectors
Nsync   Ntrain   Ndata    Nov   Ntot     Ttrain [µs]   Tdata [ms]   Ttot [ms]   OV %
1       31       992      32    1,024    488           15.62        16.13       3.12
1       31       2,016    32    2,048    488           31.75        32.26       1.56
1       31       4,064    32    4,096    488           64.01        64.51       0.78
1       31       8,160    32    8,192    488           128.52       129.02      0.39
1       63       1,984    64    2,048    992           31.25        32.26       3.12
1       63       4,032    64    4,096    992           63.50        64.51       1.56
1       63       8,128    64    8,192    992           128.02       129.02      0.78
1       63       16,320   64    16,384   992           257.04       258.05      0.39
Table 10.2 Possible Data Frames for Blind Detectors
Nsync   Ndata    Ntot     Tdata [ms]   Ttot [ms]   OV %
1       511      512      8.05         8.065       0.196
1       1,023    1,024    16.11        16.13       0.098
1       2,047    2,048    32.24        32.26       0.049
1       4,095    4,096    64.51        64.51       0.024
1       8,191    8,192    129.00       129.02      0.012
1       16,383   16,384   258.03       258.05      0.006
10.3 Algorithms for Data Detection and Their Performance
Two popular algorithms have been considered for data detection, namely the trained adaptive and the blind adaptive MMSE detectors. The performance of
these algorithms has been assessed by computer simulation in Chapter 6, with reference to specifications very similar to those considered here. Below we summarize the algorithms and then focus on transient performance of the trained algorithm. Specifically, we address the sensitivity of the trained algorithm to the training sequence length and assess its capability of recovering from running into a high-energy synchronization sequence.
10.3.1 Trained Adaptive MMSE Detection
The trained adaptive detector does not require knowledge of any spreading code, but it needs a training sequence (provided by the proposed data frame). Denoting by $a_j(m)$ the $j$th user data and by $\mathbf{y}(m)$ the observation vector, the receiver updates the vector $\hat{\mathbf{t}}_j(m)$ of its FIR coefficients according to the following LMS adaptation rule [6]:
\[ \hat{\mathbf{t}}_j(m+1) = \hat{\mathbf{t}}_j(m) + \mu\, e_j(m)\, \mathbf{y}(m) \tag{10.5} \]
where $e_j(m) = a_j(m) - \hat{\mathbf{t}}_j^H(m)\,\mathbf{y}(m)$ and
\[ a_j(m) = \begin{cases} a_j^{(t)}(m) & \text{in the training mode} \\ \hat{a}_j(m) & \text{in the decision-directed mode} \end{cases} \tag{10.6} \]
Soft decisions are obtained as $x_j(m) = \hat{\mathbf{t}}_j^H(m)\,\mathbf{y}(m)$, and a hard limiter provides the hard decision $\hat{a}_j(m) = \mathrm{quant}[x_j(m)]$.
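The behavior of the rule (10.5) with the mode switch (10.6) can be explored with a small simulation. The sketch below is a toy model and not the system of this chapter: it assumes a synchronous, real-valued (BPSK) channel with three equal-power users, hypothetical unit-energy codes of 16 chips, additive white Gaussian noise, and illustrative values for the step size and the training length.

```python
import random


def simulate_trained_lms(codes, user=0, mu=0.05, n_train=200, n_total=2000,
                         noise_std=0.05, seed=0):
    """LMS update (10.5) with the training / decision-directed switch (10.6)."""
    rng = random.Random(seed)
    q = len(codes[0])
    t = [0.0] * q                        # FIR coefficients, started from zero
    squared_errors = []
    for m in range(n_total):
        bits = [rng.choice((-1.0, 1.0)) for _ in codes]
        y = [sum(b * c[k] for b, c in zip(bits, codes)) + rng.gauss(0.0, noise_std)
             for k in range(q)]
        x = sum(tk * yk for tk, yk in zip(t, y))              # soft decision
        decision = 1.0 if x >= 0 else -1.0                    # hard decision
        reference = bits[user] if m < n_train else decision   # mode switch (10.6)
        e = reference - x
        t = [tk + mu * e * yk for tk, yk in zip(t, y)]        # update (10.5)
        squared_errors.append((bits[user] - x) ** 2)          # error vs. true symbol
    return t, squared_errors


# Hypothetical unit-energy codes of 16 chips (the first two are correlated).
codes = [[0.25] * 16, [0.25] * 12 + [-0.25] * 4, [0.25, -0.25] * 8]
taps, sq_err = simulate_trained_lms(codes)
final_mse = sum(sq_err[-500:]) / 500
print(f"initial squared error {sq_err[0]:.3f}, final MSE {final_mse:.4f}")
```

Starting from all-zero coefficients, the squared error begins at 1 and decays during the training phase; once the decisions are reliable, the decision-directed mode keeps tracking without knowledge of the transmitted symbols.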
10.3.2 Blind Adaptive MMSE Detection
The blind adaptive detector requires knowledge of the desired spreading sequence $\mathbf{c}_j$ ($j = 1, 2, \ldots, J$), but it does not need any training sequence. Denoting by $a_j(m)$ the $j$th user data and by $\mathbf{y}(m)$ the observation vector, the receiver updates the vector $\hat{\mathbf{t}}_j(m) = \mathbf{c}_j + \mathbf{x}_j$ (with $\mathbf{x}_j^H \mathbf{c}_j = 0$) of its FIR coefficients, using the following LMS adaptation rule:
\[ \hat{\mathbf{x}}_j(m+1) = \hat{\mathbf{x}}_j(m) - \mu \left[\hat{\mathbf{t}}_j^T(m)\, \mathbf{y}^*(m)\right] \left( \mathbf{y}(m) - \left[\mathbf{y}^T(m)\, \mathbf{c}_j\right] \mathbf{c}_j \right) \tag{10.7} \]
which updates only $\mathbf{x}_j$ (since $\mathbf{c}_j$ is fixed and known).
If we test the receiver's performance under nonideal conditions (e.g., timing offset), a mismatch between the nominal desired code $\tilde{\mathbf{c}}_j$ and the actual code $\mathbf{c}_j$ may appear. In order to solve this problem and therefore avoid the cancellation of the useful signal, a constraint on the surplus energy $\chi = \|\mathbf{x}_j\|^2$ is required. Especially for high signal-to-noise ratio, a constraint on the surplus energy must be made explicit (by trial and error). The modified LMS adaptation rule ($\chi$ depends on $\nu$) is
\[ \hat{\mathbf{x}}_j(m+1) = \hat{\mathbf{x}}_j(m) - \mu \left[\hat{\mathbf{t}}_j^T(m)\, \mathbf{y}^*(m)\right] \left( \mathbf{y}(m) - \left[\mathbf{y}^T(m)\, \mathbf{c}_j\right] \mathbf{c}_j \right) - \mu \nu\, \hat{\mathbf{x}}_j(m) \tag{10.8} \]
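A toy simulation helps to see how the rule (10.8) suppresses an interferer using only the desired code. The sketch below rests on assumptions of ours: real-valued signals (so that conjugation has no effect), a synchronous channel with the desired user and one stronger interferer, hypothetical unit-energy codes of 16 chips with cross-correlation 0.5, and illustrative values of $\mu$ and $\nu$.

```python
import random


def simulate_blind_lms(c, interferer, amplitude=2.0, mu=0.005, nu=0.01,
                       n_total=5000, noise_std=0.1, seed=0):
    """Blind adaptation (10.8) of x, with t = c + x and x orthogonal to c."""
    rng = random.Random(seed)
    q = len(c)
    x = [0.0] * q
    for _ in range(n_total):
        b0, b1 = rng.choice((-1.0, 1.0)), rng.choice((-1.0, 1.0))
        y = [b0 * c[k] + amplitude * b1 * interferer[k] + rng.gauss(0.0, noise_std)
             for k in range(q)]
        z = sum((ck + xk) * yk for ck, xk, yk in zip(c, x, y))   # output t^T y
        proj = sum(yk * ck for yk, ck in zip(y, c))              # y^T c
        x = [xk - mu * z * (yk - proj * ck) - mu * nu * xk
             for xk, yk, ck in zip(x, y, c)]
    return x


def interference_power(c, x, interferer, amplitude, noise_std):
    """Interference-plus-noise power at the detector output for t = c + x."""
    t = [ck + xk for ck, xk in zip(c, x)]
    leak = sum(tk * ik for tk, ik in zip(t, interferer))
    return (amplitude * leak) ** 2 + noise_std ** 2 * sum(tk * tk for tk in t)


desired = [0.25] * 16                     # hypothetical unit-energy code
other = [0.25] * 12 + [-0.25] * 4         # interferer code, correlation 0.5
x_final = simulate_blind_lms(desired, other)
before = interference_power(desired, [0.0] * 16, other, 2.0, 0.1)
after = interference_power(desired, x_final, other, 2.0, 0.1)
print(f"interference plus noise: {before:.3f} before, {after:.3f} after adaptation")
```

Because the correction term is always orthogonal to the (unit-energy) desired code, the gain on the useful signal stays equal to 1 while the output energy, and with it the interference, is reduced; the term weighted by $\nu$ limits the growth of the surplus energy.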
10.4 Optimization of the Length of the Training Sequence The length of the unmodulated synchronization sequence, which has been set equal to the spreading factor ðNsync ¼ 63Þ, and its energy per chip are the result of an optimization procedure. This optimization relies on the investigations in Chapter 8 on the acquisition performance and on the transient performance of the MSE, which is studied in this section. In all figures of this section we envision the occurrence of a synchronization sequence after 700 chip iterations, we operate with a large signal to noise ratio ðEb =N0 ¼ 27 dB), and the MSE plots are averaged over 50 independent simulation runs. Let us consider, initially, a situation where the synchronization sequence is relatively long with a low energy per chip. Specifically, we have chosen a synchronization sequence with length Ns ¼ 511 chips, with energy per Esync ¼ Ec , equal to the energy devoted to the chips modulated by the data. In Figure 10.4 we present the MSE transient behavior for a step size m ¼ 0:1, with a training sequence of infinite length, to avoid spurious phenomena due to errors in tentative decisions. The insertion of a synchronization sequence on an interfering channel does not only impair the MSE of the detector for the whole duration of the sequence itself (a synchronization sequence with a length of 511 chips affects directly at most 10 symbols), but also several successive symbols induced by a long transient behavior. As a matter of fact, a synchronization sequence with a length that is different from the spreading factor of the CDMA channels affected destroys the cyclostationarity of the received signal with reference to the adopted processing window. It takes a large number of iterations before the MSE converges to its steady-state value, although the step size value has been set to the large value m ¼ 0:1. 
A second investigation considers a situation where the synchronization sequence is shorter but has greater energy per chip, which is necessary to guarantee the same acquisition performance as in the previous case. Specifically, we have considered a synchronization sequence
Figure 10.4 Transient of the MSE for the trained detector with infinite training symbols, a high step size ($\mu = 0.1$), and a synchronization sequence with length $N_s = 511$. (Plot: MSE averaged over 50 runs, logarithmic scale from $10^{0}$ to $10^{-5}$, versus iteration number from 0 to 1,000.)
with length $N_s = 255$ chips and an energy per chip $E_{sync} = 6E_c$. The MSE behavior for a large step size ($\mu = 0.1$) is reported in Figure 10.5. An extended transient still appears, although it is less significant than in the case of $N_s = 511$. Comparing Figures 10.4 and 10.5, we see that a synchronization sequence with greater energy per chip causes a larger peak of the MSE, so that the MSE will be large for a significant portion of the transient. Although numerical results are not reported, it is clear that a lower step size would induce longer transients. Similar behavior can be obtained for a synchronization sequence of length $N_s = 127$ chips. The general conclusion so far is that the synchronization solutions presented affect the detection performance for the duration of the synchronization sequence and for several successive symbols beyond it. A solution that guarantees good acquisition performance with minimum impairment to detection is to choose a synchronization sequence with length $N_s = 63$ chips, exactly equal to the spreading factor $Q$ of the system, and a relatively high
Figure 10.5 Transient of the MSE for the trained detector with infinite training symbols, a high step size ($\mu = 0.1$), and a synchronization sequence with length $N_s = 255$. (Plot: MSE averaged over 50 runs, logarithmic scale from $10^{0}$ to $10^{-5}$, versus iteration number from 0 to 1,000.)
energy per chip $E_{sync} = 12E_c$. In this last case, the synchronization sequence affects at most two symbols of the stream to be detected, and the glitch at iteration 700 is rapidly recovered. In fact, with this choice we maintain the cyclostationarity of the received signal on the processing window of the sliding algorithms. A large energy per chip of the synchronization sequence compared to that of the modulated chips has the drawback of causing a large peak in the glitch; this is compensated, however, by the fact that the transient is recovered extremely rapidly, as shown in Figures 10.6, 10.7, 10.8, and 10.9. Specifically, these figures show the simulated MSE of trained adaptive detectors as a function of the iteration step of the algorithm. In these figures the duration of the training sequence is finite and the receiver switches from the training mode to the decision-directed mode very quickly, since the MSE transient is short. We have considered training sequences of 15 and 31 symbols, and two values of step size ($\mu = 0.1$ and $\mu = 0.01$), which lead, respectively, to a fast but less steady convergence and to a slow but steadier convergence of the MSE. For the
Figure 10.6 Transient of the MSE for the trained detector with 15 training symbols and high step size ($\mu = 0.1$), and a synchronization sequence with length $N_s = 63$. (Plot: MSE averaged over 50 runs, logarithmic scale from $10^{0}$ to $10^{-5}$, versus iteration number from 0 to 1,000.)
transmission scenario considered, the numerical results show how critical the choice of an adequate number of training symbols is. Figures 10.6 and 10.7 show the transient performance of an MMSE trained adaptive detector for step sizes $\mu = 0.1$ and $\mu = 0.01$, respectively. The ratio of the synchronization chip energy to the data chip energy is 12 (i.e., $E_{sync}/E_c = 12$). It is assumed that six stations are simultaneously active, each using one code to transmit its data. In both figures a short training sequence of 15 symbols is used (i.e., the detector switches from the trained mode to the decision-directed mode after 15 iterations). We notice a significant ripple of the MSE, which causes instability in the detection. Specifically, in Figure 10.6 the ripple can be noticed up to iteration 600; after that, the MSE levels out and a minor transient appears in handling the synchronization sequence. In Figure 10.7 we still see an instability of the MSE, which is mitigated by the lower value of the step size. In Figures 10.8 and 10.9 we show the transient performance of an MMSE trained adaptive detector for step sizes corresponding to $\mu = 0.1$
Figure 10.7 Transient of the MSE for the trained detector with 15 training symbols and low step size ($\mu = 0.01$), and a synchronization sequence with length $N_s = 63$. (Plot: MSE averaged over 50 runs, logarithmic scale from $10^{0}$ to $10^{-5}$, versus iteration number from 0 to 1,000.)
and $\mu = 0.01$, respectively. The ratio of the synchronization chip energy to the data chip energy is 12 (i.e., $E_{sync}/E_c = 12$). It is assumed that six stations are simultaneously active, each using one code to transmit its data. The MSE transient is more satisfactory when using a training sequence of 31 symbols, for both values of the step size. As expected, we observe a slower convergence for $\mu = 0.01$, with a minor delay in handling the transient due to the synchronization sequence; the behavior differs from the case $\mu = 0.1$, where the convergence of the algorithm is faster and the glitch due to the synchronization sequence is immediately absorbed.
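The switch from training to decision-directed operation, and the effect of the step size on convergence speed, can be illustrated with a toy example. The sketch below uses the standard LMS recursion and a noiseless single-user signal; both are assumptions made for illustration, not the six-station scenario behind Figures 10.6 to 10.9, and the function name `lms_trained_then_dd` is hypothetical.

```python
import numpy as np

def lms_trained_then_dd(Y, b_true, n_train, mu):
    """Run an LMS detector over the rows of Y.

    Y       : (M, Q) array, one received observation vector per symbol
    b_true  : (M,) array of +/-1 symbols; the first n_train act as the
              training reference, the rest are used only to measure the error
    n_train : number of training symbols before switching to own decisions
    mu      : LMS step size
    Returns the squared error against the true symbol at every iteration.
    """
    M, Q = Y.shape
    w = np.zeros(Q)
    sq_err = np.empty(M)
    for m in range(M):
        out = w @ Y[m]
        if m < n_train:
            ref = b_true[m]                     # training mode
        else:
            ref = 1.0 if out >= 0 else -1.0     # decision-directed mode
        sq_err[m] = (b_true[m] - out) ** 2
        w = w + mu * (ref - out) * Y[m]         # standard LMS update
    return sq_err

# Toy demo: one user, unit-norm random code, no noise.
rng = np.random.default_rng(0)
Q, M = 63, 200
c = rng.choice([-1.0, 1.0], size=Q) / np.sqrt(Q)
b = rng.choice([-1.0, 1.0], size=M)
Y = b[:, None] * c[None, :]
sq_fast = lms_trained_then_dd(Y, b, n_train=15, mu=0.1)
sq_slow = lms_trained_then_dd(Y, b, n_train=15, mu=0.01)
```

In this idealized setting the error decays geometrically, by a factor (1 - mu) per iteration, so the larger step size converges much faster. The ripple and the sensitivity to the number of training symbols discussed above arise only with noise and interference, which this sketch deliberately leaves out.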
10.5 Signaling Using the Synchronization Sequence

The trained detector does not intrinsically need to know the spreading sequence of the desired streams, since the training sequence provides the learning capability. Nevertheless, the transient performance can be greatly improved if this information is available from the first iteration. In fact, in this
Figure 10.8 Transient of the MSE for the trained detector with 31 training symbols and high step size ($\mu = 0.1$), and a synchronization sequence with length $N_s = 63$. (Plot: MSE averaged over 50 runs, logarithmic scale from $10^{0}$ to $10^{-5}$, versus iteration number from 0 to 1,000.)
case, the performance is degraded only by possible errors in the decision-directed mode and by the natural ripple of the adaptive algorithm. In this section we propose a signaling strategy that allows the receiver to detect the codes used in each data beam received. The knowledge of the active codes may be useful to improve the performance of the adaptive receivers and can be used for maximum likelihood estimation of the frequency offset. We operate with a synchronization sequence that has a length of 63 chips, exactly equal to the spreading factor used in the data section. It is known that the maximum number of Gold sequences of length $Q$ is equal to $Q + 2$ [1]; in our case, since $Q = 63$, we can realize 65 different Gold sequences. Each station uses at most six codes; this means that at most $6 \times 6 = 36$ codes are used by the stations for data transmission. The remaining 29 Gold codes can be used as possible synchronization sequences; hence, each of the six transmitting stations has four possible synchronization sequences available (in total we use 24 Gold codes for the synchronization sequences).
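The code budget described above is simple arithmetic, written out here so that each figure can be traced. The function name `code_budget` and the dictionary keys are hypothetical; the numbers are those quoted in the text.

```python
def code_budget(q, n_stations, codes_per_station, sync_per_station):
    """Split the Q + 2 Gold sequences between data codes and sync sequences."""
    total = q + 2                               # Gold sequences of length q
    data = n_stations * codes_per_station       # codes reserved for data
    remaining = total - data                    # left over after data codes
    sync = n_stations * sync_per_station        # sync sequences actually used
    return {"total": total, "data": data, "remaining": remaining,
            "sync": sync, "unused": remaining - sync}

budget = code_budget(q=63, n_stations=6, codes_per_station=6, sync_per_station=4)
print(budget)
# prints: {'total': 65, 'data': 36, 'remaining': 29, 'sync': 24, 'unused': 5}
```

Of the 29 sequences left after the data codes are assigned, 24 serve as synchronization sequences, so five Gold sequences remain unassigned in this scheme.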
Figure 10.9 Transient of the MSE for the trained detector with 31 training symbols and low step size ($\mu = 0.01$), and a synchronization sequence with length $N_s = 63$. (Plot: MSE averaged over 50 runs, logarithmic scale from $10^{0}$ to $10^{-5}$, versus iteration number from 0 to 1,000.)
In this way we can encode four transmitting states for each station. The four states that can be encoded are, for example, the declaration of activity of 1, 2, 4, or 6 logical synchronous codes. Furthermore, a specific synchronization sequence is always associated with a unique set of spreading codes. In order to detect which codes are active, the receiver must use 24 sliding correlators, which exploit the same timing acquisition strategy that is otherwise used only for the desired signal.
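A bank of sliding correlators of this kind can be sketched as follows. This is a simplified illustration under several assumptions: random ±1 sequences stand in for the 24 Gold codes, the received chips are noise-free and of unit amplitude, and the mapping from a sequence index to a (station, state) pair is a hypothetical ordering chosen here, not one specified in the text.

```python
import numpy as np

STATES = (1, 2, 4, 6)   # example activity states quoted in the text

def detect_sync(rx, candidates):
    """Correlate rx against every candidate at every chip offset and
    return (candidate index, chip offset) of the largest peak."""
    peaks = np.array([np.correlate(rx, cand, mode="valid") for cand in candidates])
    k, offset = np.unravel_index(np.argmax(peaks), peaks.shape)
    return int(k), int(offset)

def decode(k, states=STATES):
    """Hypothetical mapping: candidate index -> (station, number of active codes)."""
    station, state = divmod(k, len(states))
    return station, states[state]

# Toy demo: embed candidate 14 at chip offset 200 in a stream of random chips.
rng = np.random.default_rng(1)
Q = 63
candidates = rng.choice([-1.0, 1.0], size=(24, Q))   # stand-ins for Gold codes
rx = rng.choice([-1.0, 1.0], size=500)               # random data chips
true_k, true_offset = 14, 200
rx[true_offset:true_offset + Q] = candidates[true_k]

k, offset = detect_sync(rx, candidates)
station, n_active = decode(k)
```

With unit-amplitude chips a correlation reaches its maximum of 63 only on an exact match, so the demo should recover the embedded index and offset. A practical receiver would additionally compare the peak against a threshold, as in the acquisition strategy used for the desired signal.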
References
[1] Peterson, R. L., R. E. Ziemer, and D. E. Borth, Introduction to Spread Spectrum Communications, Upper Saddle River, NJ: Prentice Hall, 1995.
[2] Proakis, J. G., Digital Communications, Third Edition, New York: McGraw-Hill, 1995.
[3] Gallager, R. G., Information Theory and Reliable Communication, New York: John Wiley & Sons, 1968, p. 337.
[4] Sacchi, C., and L. S. Ronga, “A DSP-based DS/CDMA Modem for Multimedia Applications Over Geo-stationary Satellite Networks,” International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2001), Salt Lake City, UT, May 7–11, 2001.
[5] Conti, A., et al., “DSP-based CDMA Satellite Modem: CNIT-ASI Project,” 12th Tyrrhenian International Workshop on Digital Communications: Software Radio: Technologies and Services, Isola d’Elba, September 2000.
[6] Haykin, S., Adaptive Filter Theory, Upper Saddle River, NJ: Prentice Hall, 1991.
Glossary
3GPP
3rd Generation Partnership Project
3GPP2
3rd Generation Partnership Project 2
ASI
Italian Space Agency
asymptotic efficiency
ratio of the energy in the single-user system to the energy in the multiple-user system for the same BER performance when the additive background noise tends to zero (it lies between 0 and 1)
AWGN
additive white Gaussian noise
BCH
broadcast channel
BEP
bit error probability
BER
bit error rate
BPSK
binary phase shift keying
BS
base station
CDMA
code division multiple access
CDMA-DS
CDMA direct spread
CDMA-MC
CDMA multicarrier
CNIT
Italian National Consortium for Telecommunications
data frame
format of the transmitted data, usually divided into fields
D-CDMA
dimension-limited (or deterministic) CDMA
downlink detection
in a cellular or satellite system, the detection employed by the mobile station when the signal comes from the base station or the satellite, respectively
DSP
digital signal processor
DS-SS
direct sequence spread spectrum
DV
decision variable
FDD
frequency division duplex
FDMA
frequency division multiple access
FIR
finite impulse response
G3G
Global 3rd Generation
GP
guard period
ICI
interchip interference
interference mitigation/cancellation (linear/nonlinear)
procedure of attenuation or elimination (depending on the criterion) of the interfering data streams in a CDMA system in order to improve detection. It can be linear or nonlinear depending on the processing of the observation. A multiuser detection strategy is intrinsically an interference mitigator/canceler.
ISDN
Integrated Services Digital Network
ISI
intersymbol interference
Lagrangian function
in a constrained minimization, the modified cost function that includes the constraints
LAN
local area network
LMS
least mean square
MA
multiple access
MAI
multiple access interference
MC
multicarrier
MCRB
modified Cramér-Rao bound
MLSD
maximum likelihood sequence detection
MLSE
maximum likelihood sequence estimation
MMOE
minimum mean output energy
MMSE
minimum mean square error
MMSE-I detector
detector that realizes mitigation of ISI only using the MMSE criterion
MMSE-MI detector
detector that realizes mitigation of ISI and MAI using the MMSE criterion
MOE
mean output energy
MS
mobile station
MSE
mean square error
multipath channel
a physical channel characterized by different propagation paths, which causes multiple echoes of the transmitted signal
multiuser detector
a detector that realizes the joint demodulation of mutually interfering spread spectrum digital streams (it can be linear or nonlinear). This type of detector makes use of information regarding all active users in the systems.
near-far effect
in CDMA systems without power control, the signals of users located at different distances from the receiving stations are received with different power levels and detected with different reliability
non-RNC
non-radio-network-controlled
OHG
Operator Harmonization Group
overhead
in data transmission, the part of the data that does not carry actual information but only control or signaling information
oversampling factor
in CDMA systems, it is usually defined relative to the chip rate, and it depends on the type of function (detection, synchronization, etc.) realized and on the speed of the digital signal processor employed
OVSF
orthogonal variable spreading factor
parameter acquisition
the function of ‘‘estimating’’ the value of a parameter
parameter tracking
the function of following the value of a parameter after its value has been acquired
PDMA
power division multiple access
PIC
parallel interference cancellation
PN
pseudo-noise
power control
a service, available in many wireless systems, that allows the regulation of the transmitted power of different users so that the received power levels are approximately all equal
processing window
the observation window in which samples are collected for signal processing. A sliding window is used to ensure a continuous detection.
PSCH
primary synchronization channel
PSK
phase shift keying
PSP
per survivor processing
Rake receiver
a receiver that exploits the time diversity of the received signal affected by multipath propagation. Each ‘‘finger’’ of the Rake is devoted to an echo of the signal.
R-CDMA
random CDMA
RF
radio frequency
SCH
synchronization channel
scrambling sequence
a special spreading sequence with masking properties. Scrambling sequences have low mutual cross-correlation for any timing delay and they are used, for example, in TDD systems to reduce intercell interference.
SDMA
space division multiple access
SIC
successive interference cancellation
signature waveform, spreading waveform, spreading sequence
a spreading sequence modulates a chip pulse to create a signature waveform or a spreading waveform. The signature waveforms are mutually orthogonal in synchronous systems, and are nearly orthogonal in asynchronous systems.
single-user detector
a detector that realizes the detection of a single spread spectrum digital stream (it is usually realized by a matched filter). This type of detector makes use of information regarding the user of interest only.
SNIR
signal to noise and interference ratio
SNR
signal-to-noise ratio
soft/hard decisions
soft decisions are continuous-valued estimates of the transmitted data; hard decisions are discrete-valued estimates of the transmitted sequence
SSCH
secondary synchronization channel
synchronization sequence
a known sequence of unmodulated chips, interleaved in the data stream, used for the purpose of timing acquisition and frequency offset estimation
TDD
time division duplex
TDD-HCR
time division duplex, high chip rate
TDD-LCR
time division duplex, low chip rate
TDMA
time division multiple access
TD-SCDMA
time division synchronous CDMA
TFCI
transport format control indicator
trellis diagram
a diagram consisting of states used to search the minimum or maximum cost path related to the process of minimization/maximization of a likelihood function with finite memory
UMTS
Universal Mobile Telecommunications System
UTRA FDD
UMTS Terrestrial Radio Access, frequency division duplex mode
WGN
white Gaussian noise
ZF
zero-forcing
ZF-I detector
detector that realizes mitigation of ISI only using the ZF criterion
ZF-MI detector
detector that realizes mitigation of ISI and MAI using the ZF criterion
About the Author
Piero Castoldi was born in Trento, Italy, in 1966. He received his Dr. Ing. in electrical engineering (with honors) from the University of Bologna, Italy, in 1991 and his Ph.D. in information engineering from the University of Parma, Italy, in 1996. He spent 1 year as a postdoctoral candidate at Princeton University, Princeton, New Jersey, in 1997, and he has had regular summer appointments at Princeton University since 1998. He served as an assistant professor at the University of Parma from 1998 to 2001, when he became an associate professor of telecommunications at the Scuola Superiore Sant’Anna of Pisa, Italy. He is currently coordinating the broadband communications group at the Scuola Superiore Sant’Anna. His research interests include receiver design for mobile radio communications and network architectures for multimedia services. He has contributed to more than 20 technical papers in refereed journals and conferences. He is also managing four national projects in the area of broadband communications and interconnection of heterogeneous networks. Dr. Castoldi is an IEEE associate member and an active member of the IEEE Communications Society.