# Frame synchronization and pilot structure for second generation DVB via satellites

# Feng-Wen Sun<sup>‡</sup>, Yimin Jiang<sup>\*,†</sup> and Lin-Nan Lee<sup>§</sup>

Hughes Network Systems Inc., 11717 Exploration Lane, Germantown, MD 20876, U.S.A.

#### SUMMARY

Due to the advanced coding schemes utilized in DVB-S2, new receivers have to function at unprecedentedly low SNR. From the perspective of synchronization, the major challenges are frame synchronization and carrier recovery. The challenge for frame synchronization arises from the fact that the LDPC frames are rather long, up to 32 400 symbols, and there is no built-in structure in the LDPC code to facilitate frame synchronization. Carrier recovery becomes a major challenge due to the requirement to retain the current outdoor equipment. This implies that the receivers have to work with the same phase noise as specified for the first generation DVB-S, which uses only QPSK modulation. DVB-S2 needs to support QPSK at much lower signal to noise ratios (SNR), and higher order modulation schemes at about the same SNR as DVB-S.

In this paper, we will describe the solutions to frame synchronization and carrier recovery. Low overhead rapid frame synchronization is achieved by utilizing the *physical layer signalling code*. Robust carrier recovery is aided by the *aggregated pilot* structure. The design considerations for the pilot structure and the implementation tradeoffs of the carrier recovery schemes are addressed. Copyright © 2004 John Wiley & Sons, Ltd.

KEY WORDS: digital video broadcasting; synchronization; frame synchronization; carrier recovery

## 1. INTRODUCTION

Second generation digital video broadcasting (DVB-S2) via satellites has been standardized recently [1]. The new channel coding schemes adopted by DVB-S2, namely the low-density parity check (LDPC) codes, impose more stringent requirements on the new receivers since they have to work at much lower signal to noise ratio (SNR) *vis-à-vis* the first generation DVB-S. It took considerable effort for the standards committee to finalize the frame and pilot structure for synchronization purposes. In this paper, we present the design considerations on the structures for frame synchronization and carrier recovery. We further describe efficient algorithms to utilize these structures to achieve rapid and robust synchronization.

<sup>\*</sup>Correspondence to: Yimin Jiang, 11717 Exploration Lane, Germantown, MD 20876, U.S.A.

<sup>†</sup>E-mail: yjiang@hns.com

<sup>‡</sup>E-mail: fsun@hns.com

<sup>§</sup>E-mail: llee@hns.com

Even before the coding scheme was finalized, it had been widely recognized that the two major difficulties in receiver synchronization are frame synchronization and carrier recovery. The first and foremost reason is that the receivers need to operate at rather low SNR, as low as -2 dB  $(E_s/N_0)$ . Besides low SNR, frame synchronization and carrier recovery are subject to other constraints that prevent system designers from directly leveraging known techniques.

Since an LDPC code is a block code, the block (frame) has to be synchronized first before the decoder can proceed to recover the information. In order to approach Shannon capacity, the code adopted by DVB-S2 is rather long. For QPSK modulation, the code can be as long as 32 400 symbols.<sup>¶</sup> Clearly, the longer the code, the more difficult the frame synchronization becomes. Furthermore, DVB-S2 supports different modulation schemes ranging from QPSK to 32-APSK. It is assumed that the receivers have no *a priori* knowledge of which modulation scheme is used. Such knowledge can be gained only after the frame synchronization is achieved, by analyzing the physical layer signalling information. It implies that the receivers cannot estimate many physical layer impairments such as carrier frequency offset (due to lack of information on modulation) before the frame synchronization is acquired. This, in turn, means that the frame synchronization has to be accomplished without the removal of frequency offset, which can be up to 25% of the symbol rate. In addition, there is no built-in structure in the LDPC code to facilitate the frame synchronization. For broadcasting applications, channel switching time, a function of the frame acquisition time, directly affects the viewing experience of an end user.

In terms of carrier recovery, the major impairments are frequency offset and phase noise. As one of the most successful standards in the telecommunication industry, DVB-S enjoys a large installed base. It is clear that system operators would like to preserve their investment as much as possible. One economical way to upgrade end-user equipment from DVB-S to DVB-S2 is to retain the outdoor units and replace the DVB-S set-top boxes with new hybrid DVB-S2/DVB-S set-top boxes. However, the current phase noise specification of the low noise block downconverter (LNB) is based on the performance requirement for QPSK modulation with less powerful coding, and thus higher SNR. Figure 1 shows the single-sided DVB-S/S2 phase noise mask. One distinguishing feature of DVB-S2 is its support for higher order modulation schemes such as 8-PSK, 16-APSK and 32-APSK. In particular, 8-PSK modulation is of special interest to current system operators since the current spacecraft supporting DVB-S have sufficient power to support DVB-S2 with 8-PSK modulation. However, with traditional carrier recovery techniques, it is impossible to maintain carrier synchronization at such low SNR for high order modulation. As an example, Figure 2 shows a scatter plot of 8-PSK modulation at  $E_{\rm s}/N_0 = 6.6$  dB, the operating threshold of the DVB-S2 8-PSK rate  $\frac{2}{3}$  code. The plot does not show any significant concentration around the signal constellation points, indicating that conventional decision feedback tracking loops will not be able to function properly.

In order to facilitate rapid frame synchronization, the DVB-S2 standard embeds a structure into the *physical layer signalling code* to realize a rapid frame synchronization strategy [2]. This structure does not compromise the error correction capability of the code. The code is in fact an optimal error correction code. As an additional benefit, the embedded structure largely simplifies the maximum likelihood (ML) decoding of the *physical layer signalling code*.

<sup>¶</sup>The earlier version of DVB-S2 actually includes BPSK, which is replaced by QPSK modulation except in the lower layer of hierarchical modulation. In this paper, we retain some of the performance data for BPSK for illustration.

Figure 2. 8-PSK scatter plot with the specified phase noise at  $E_s/N_0 = 6.6$  dB.

Generally speaking, carrier recovery can always be achieved with a sufficient number of pilot symbols. Obviously, the design goal is to minimize the total loss, i.e. the sum of the pilot overhead and the performance loss due to imperfect synchronization. It turns out that most of the modes do not need pilots for synchronization at all. Since a receiver has to implement many different modes with different modulation schemes and pilot structures, it is important to maximize the commonalities among synchronization algorithms for different modes. The DVB-S2 pilot structure achieves this goal. At the same time, the performance loss due to imperfect synchronization is negligible.

The rest of the paper is organized as follows. Section 2 introduces the structure of the physical layer signalling code and its applicability to accelerate frame synchronization. Section 3 presents the frame synchronization algorithms and their performance. Section 4 presents the pilot structure and the major considerations and tradeoff for carrier recovery. Section 5 gives major performance benchmarks of the carrier recovery. We conclude the paper in Section 6 with some final remarks.

## 2. STRUCTURED PLSC FOR FRAME SYNCHRONIZATION

Figure 3 illustrates the general structure of DVB-S2 physical layer frames. Each LDPC coded block is preceded by the start of frame (SOF) and the physical layer signalling code (PLSC). SOF is a known 26-symbol pattern. PLSC is a 64-symbol linear binary code, which conveys seven bits of information with a minimum distance of 32. In total, SOF and PLSC occupy one slot (90 symbols). Independently of the modulation scheme of the LDPC coded block that follows, these fields are modulated by  $\pi/2$ -BPSK modulation to reduce the envelope fluctuation in comparison with the classic BPSK scheme.  $\pi/2$ -BPSK modulation rotates the signal constellation by  $90^{\circ}$  for every symbol. The seven bits carried by the PLSC inform receivers of the modulation scheme, code rate, pilot configuration, and the length of the LDPC coded data. In the broadcasting mode, the length of the LDPC codes is 64 800 coded bits regardless of code rates and modulation schemes. In the adaptive coding and modulation (ACM) mode, LDPC codes of length 16 200 can also be used. Once frame and phase synchronization are acquired, the probability of incorrectly decoding the PLSC is negligible. As long as the PLSC is correctly decoded, the next SOF can be located and the frame synchronization can be maintained. In the broadcasting mode, since the modulation and coding scheme will not change for the data stream from a single transponder, maintaining frame synchronization does not require the correct decoding of the PLSC. Instead, frame synchronization can be maintained as long as the symbol timing loop is in lock. In both cases, it is a valid assumption that the frame synchronization can be maintained after it is initially acquired. Therefore, we will focus on the initial frame synchronization.



Figure 3. Frame structure of DVB-S2.


Extensive analysis shows that the SOF is too short to provide reliable and rapid frame synchronization [3–5]. It was also pointed out that the ML decoding of the PLSC is too complicated [4]. This motivated us to investigate an alternative approach that does not increase the overhead. Given that the PLSC is a rather low rate code, it is natural to consider embedding certain structures into this code to assist the initial frame synchronization. There are many ways to construct a linear code of parameters [64,7,32]. For instance, it can be constructed as an extended BCH code, the dual of an extended Hamming code, an extended maximum length code, or a first-order Reed-Muller code [6]. In the sequel, we will present yet another construction that is particularly useful for rapid frame synchronization and efficient ML decoding.

The construction utilizes the first-order Reed-Muller code of parameters [32,6,16]. One exemplary generator matrix for [32,6,16] Reed-Muller code is shown in (1).


The generator matrix can be constructed recursively by the well-known |u|u + v| construction [6]. This notation indicates how to use two codes of length *n* to construct a code of length 2n, i.e. *u* and *v* are drawn from each of the component codes, respectively. In the case of a first-order Reed-Muller code, *v* uses the trivial linear code whose codewords are **0** and **1**, the all-zero and all-one vectors of length *n*, and *u* belongs to a first-order Reed-Muller code of length *n*.
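The recursion can be illustrated with a minimal Python sketch. It builds *a* generator matrix of the first-order Reed-Muller code with the |u|u + v| construction and checks the parameters by brute force. The function names `rm1_generator` and `min_distance` are hypothetical, and the row ordering produced here is an assumption that need not match the matrix in (1).

```python
import itertools
import numpy as np

def rm1_generator(m):
    """Generator matrix of the first-order Reed-Muller code of length 2**m,
    built recursively with the |u|u+v| construction (row order is arbitrary)."""
    if m == 0:
        return np.array([[1]], dtype=np.uint8)      # length-1 code {0, 1}
    g = rm1_generator(m - 1)                         # rows generate u (length n)
    n = g.shape[1]
    top = np.hstack([g, g])                          # |u|u|, i.e. v = all-zero
    bottom = np.hstack([np.zeros(n, np.uint8),
                        np.ones(n, np.uint8)])       # |0|1|, i.e. u = 0, v = all-one
    return np.vstack([top, bottom])

def min_distance(G):
    """Brute-force minimum weight over all non-zero codewords."""
    k = G.shape[0]
    return min(int(((np.array(msg) @ G) % 2).sum())
               for msg in itertools.product([0, 1], repeat=k) if any(msg))

G = rm1_generator(5)
print(G.shape, min_distance(G))   # dimensions and minimum distance of the [32,6,16] code
```

Each recursion step doubles the length and adds one row, so five steps starting from the length-1 code give a 6 x 32 matrix.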

The formulation of a PLSC codeword is shown in Figure 4. For every seven information bits, we encode the first six bits by the [32,6,16] first-order Reed-Muller code to obtain a binary vector **Y**. The vector **Y** is then duplicated into two identical vectors. Every bit in the lower branch is added modulo 2 to the seventh information bit. The upper and the lower branches are multiplexed bit by bit to form a vector of length 64. The mathematical formulation of the construction is as follows. Let  $\mathbf{Y} = (y_0, y_1, \dots, y_{31})$  be a codeword of the first-order Reed-Muller code [32, 6, 16]. Then two codewords of the [64,7,32] code (defined as  $\mathbf{Z} = (z_0, z_1, \dots, z_{63})$ ) can be generated as  $(y_0, y_0, y_1, y_1, \dots, y_{31}, y_{31})$  and  $(y_0, \bar{y}_0, y_1, \bar{y}_1, \dots, y_{31}, \bar{y}_{31})$ , corresponding to the seventh bit being 0 and 1, respectively, where  $\bar{y}$  represents the binary complement of  $y$. If, instead of being multiplexed bit by bit, the upper and lower vectors were concatenated, the result would be the |u|u + v| construction of the first-order Reed-Muller code of parameters [64, 7, 32]. This shows that the PLSC constructed in such a way is actually an interleaved first-order Reed-Muller code of parameters [64, 7, 32], which leads to the following statement.

Figure 4. Construction of PLSC.

The code as constructed in Figure 4 is a binary linear code of parameters [64,7,32]. Therefore, it is an optimal code.

A very useful property of the code for frame synchronization is that  $z_{2i} \oplus z_{2i+1}$  is constant (it equals the seventh information bit) for  $i = 0, 1, \dots, 31$. With the  $\pi/2$ -BPSK modulation,  $z_{2i}$  maps to  $\pm 1$  and  $z_{2i+1}$  maps to  $\pm \mathrm{i}$, where  $\mathrm{i}$  here denotes the imaginary unit rather than the index. Therefore, in the modulated domain, the differential  $z_{2i}z_{2i+1}^*$  is equal to a constant.
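The construction of Figure 4 and the constant-differential property can be checked numerically. The sketch below is illustrative only: the encoder `rm32_encode` uses an assumed bit labelling for the [32,6,16] code (not necessarily the generator matrix in (1)), scrambling is omitted, and all function names are hypothetical.

```python
import numpy as np

def rm32_encode(bits6):
    """Assumed [32,6,16] first-order Reed-Muller encoder:
    y_i = b0 xor <a, binary(i)>, with b0 = bits6[0] and a = bits6[1:]."""
    i = np.arange(32)
    y = np.full(32, bits6[0], dtype=np.uint8)
    for j, aj in enumerate(bits6[1:]):
        if aj:
            y ^= ((i >> j) & 1).astype(np.uint8)
    return y

def plsc_encode(bits7):
    """Duplicate Y, add the seventh bit to the lower branch, interleave."""
    y = rm32_encode(bits7[:6])
    z = np.empty(64, dtype=np.uint8)
    z[0::2] = y                  # upper branch
    z[1::2] = y ^ bits7[6]       # lower branch
    return z

def pi2_bpsk(z):
    """Even-indexed bits map to +/-1, odd-indexed bits to +/-j."""
    k = np.arange(len(z))
    return (1 - 2 * z.astype(int)) * np.where(k % 2 == 0, 1, 1j)

z = plsc_encode([1, 0, 1, 1, 0, 0, 1])
x = pi2_bpsk(z)
print(set((z[0::2] ^ z[1::2]).tolist()))       # a single value: the seventh bit
print(np.unique(x[0::2] * np.conj(x[1::2])))   # a single complex constant
```

Whatever the first six bits are, the 32 pairwise differentials depend only on the seventh bit, which is what the differential correlator exploits.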

As previously mentioned, the frequency offset during the initial acquisition can be as large as 25% of the symbol rate per the DVB-S2 requirement, which implies that the carrier phase can rotate by up to 90° in one symbol interval. There are only two options in such a scenario: frequency offset resistant non-coherent differential detection, or searching through multiple frequency hypotheses. The latter is clearly less preferred since it takes much longer to acquire, up to 2 s, based on the analysis of Reference [4]. That is too long for antenna pointing. The property of differential constantness of the PLSC is bound to be useful for differential detection. In fact, as long as the differential is known *a priori*, receivers can take advantage of it. For this reason, we are able to further scramble the codeword of the PLSC to improve the autocorrelation property. The specific sequence used for the scrambling is as follows.


The scrambling sequence is essentially just an extended *m*-sequence. In the following section, we will present algorithms for rapid frame synchronization by utilizing SOF and PLSC.

## 3. FRAME SYNCHRONIZATION ALGORITHMS AND THEIR PERFORMANCE

Figure 5 illustrates a scheme to correlate on both the SOF and PLSC differentially. The shift register in the circuit can be partitioned into two parts. The first part is associated with the SOF, and the second part is associated with the PLSC. There are in total 57 taps associated with the 89 registers. In the first part, 25 taps are associated with the differentials of the SOF. In the second part, there are 32 non-zero taps associated with the PLSC, since only 32 out of the 64 differentials are known. The taps associated with the shift register for computing the correlation can be obtained as follows. First set all the registers to zero, then shift the modulated SOF and a modulated and scrambled codeword of the PLSC into the circuit. Once the rightmost register becomes non-zero, the tap associated with a register is just the complex conjugate of the content of the corresponding register. Given that the modulated SOF and PLSC take only the values  $\pm 1$ ,  $\pm i$ , the taps only take these four possible values as well. Clearly these are trivial multiplications from an implementation point of view.

When used for frame synchronization, the incoming signal arriving at the correlator is sampled at one sample per symbol. It is first differentially correlated with the delayed sample. The differentially correlated samples are sequentially shifted into a shift register of length 89. The contents of the shift register are multiplied with the taps. The first 25 and the last 32 values at the output of the multipliers are separately summed together. The outputs of the two summers are, respectively, added and subtracted to produce two values. The maximum of the absolute values of the two values is the final output of this correlation circuit.

Figure 5. Differential detection of the SOF and PLSC.
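The correlator described above can be sketched in a few lines of Python. This is a simplified software model under stated assumptions, not the circuit of Figure 5 itself: the SOF pattern and the scrambling sequence are random placeholders (the standardized sequences are not reproduced here), the vector `y` merely stands in for a Reed-Muller codeword, and the function names are hypothetical.

```python
import numpy as np

def pi2_bpsk(bits):
    """Even-indexed symbols map to +/-1, odd-indexed symbols to +/-j."""
    bits = np.asarray(bits).astype(int)
    k = np.arange(len(bits))
    return (1 - 2 * bits) * np.where(k % 2 == 0, 1, 1j)

def make_taps(sof_bits, scramble_bits):
    """Conjugated reference differentials for the 89-stage shift register.
    The scrambled all-zero PLSC codeword equals the scrambling sequence."""
    ref = pi2_bpsk(np.concatenate([sof_bits, scramble_bits]))
    taps = np.conj(ref[1:] * np.conj(ref[:-1]))
    sof_idx = np.arange(25)            # 25 differentials inside the SOF
    plsc_idx = np.arange(26, 89, 2)    # 32 known differentials inside the PLSC
    return taps, sof_idx, plsc_idx

def differential_correlator(r, taps, sof_idx, plsc_idx):
    d = r[1:] * np.conj(r[:-1])        # differential of adjacent samples
    out = np.empty(len(d) - 88)
    for k in range(len(out)):
        w = d[k:k + 89] * taps
        s1, s2 = w[sof_idx].sum(), w[plsc_idx].sum()
        out[k] = max(abs(s1 + s2), abs(s1 - s2))   # covers both seventh-bit values
    return out

rng = np.random.default_rng(0)
sof = rng.integers(0, 2, 26)           # placeholder, NOT the standardized SOF
scr = rng.integers(0, 2, 64)           # placeholder scrambling sequence
y = rng.integers(0, 2, 32)             # stands in for a Reed-Muller codeword
z = np.empty(64, dtype=int)
z[0::2], z[1::2] = y, y ^ 1            # seventh information bit = 1
header = pi2_bpsk(np.concatenate([sof, z ^ scr]))
data = np.exp(1j * np.pi / 4 * (2 * rng.integers(0, 4, 400) + 1))  # QPSK payload
start = 150
tx = np.concatenate([data[:start], header, data[start:]])
rx = tx * np.exp(2j * np.pi * 0.2 * np.arange(len(tx)))  # offset: 20% of symbol rate
out = differential_correlator(rx, *make_taps(sof, scr))
peak = int(np.argmax(out))
print(peak, out[peak])
```

In this noiseless example the frequency offset only rotates all differentials by a common phase, so the magnitude of the aligned correlation is unaffected; the add/subtract step absorbs the unknown sign contributed by the seventh bit.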

The output is further processed by a peak search algorithm. The conventional approach is to compare the output of the correlator with a predetermined threshold. If the value is larger than the threshold, it is declared that a preliminary frame synchronization has been achieved, and the receiver proceeds with post verification. The threshold is designed to balance the probabilities of missed detection and false alarm. The performance of such threshold detection has been studied [3]. It is quite sensitive to the threshold setting.

In the following, we will present an algorithm that is not based on threshold detection and offers significantly better performance than the conventional threshold detector. It is well known that a DVB receiver only tunes into the transponders that are known to carry broadcasting programs. The first transponder to be tuned in upon cold start is typically fixed. Subsequently, the transponders to be tuned into are given by the network information table (NIT). This means that the intended signals are guaranteed to exist. This simple property can be taken into consideration in the detection to eliminate the need to balance the probabilities of missed detection and false alarm. Instead, a simple peak search within a prescribed window will be used.

Clearly, when using a peak search, we need to make sure that the search window is large enough to contain at least one SOF and PLSC. From a performance perspective, it should be as small as possible. Without knowing the particular modulation and coding scheme, a receiver does not know how often the SOF and PLSC appear. The search window length therefore depends on the lowest-order modulation scheme in use. For instance, if QPSK is the lowest order modulation, the window length L should be equal to 32 400 + 90 symbols.

Figure 6 shows the proposed peak search diagram. In Figure 6, L is the length of the search window. It is possible that there are multiple SOFs and PLSCs in each block of L symbols. The location in each block with the maximum peak is declared as a candidate. The post verification circuit will decode the PLSC. Based on the decoded PLSC, it derives the location of next SOF and PLSC. If the correlation at the next SOF and PLSC is of sufficient strength, it will pass the post verification and the receiver will declare a successful frame synchronization. Otherwise, the receiver will proceed to check the next candidate. The post verification can be performed in either a parallel or serial manner, representing the tradeoff between implementation complexity and acquisition speed.
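A serial version of this procedure might look like the following Python sketch. It is a simplified illustration under assumptions: `frame_len` stands for the distance to the next SOF that would be derived from the decoded PLSC, `min_strength` is a hypothetical verification level, and the PLSC decoding step itself is not modelled.

```python
import numpy as np

def window_peaks(corr, L):
    """Index of the largest correlator output in each full window of L samples."""
    n = len(corr) // L
    blocks = np.asarray(corr[:n * L]).reshape(n, L)
    return np.arange(n) * L + blocks.argmax(axis=1)

def acquire(corr, L, frame_len, min_strength):
    """Serial post verification: accept the first candidate whose predicted
    next SOF/PLSC position also shows a sufficiently strong correlation."""
    for cand in window_peaks(corr, L):
        nxt = cand + frame_len           # in a receiver, derived from the decoded PLSC
        if nxt < len(corr) and corr[nxt] >= min_strength:
            return int(cand)
    return None

# Toy correlator output: a spurious peak at 50, true headers at 450 and 750.
corr = np.zeros(1000)
corr[50], corr[450], corr[750] = 12.0, 10.0, 9.0
print(acquire(corr, L=400, frame_len=300, min_strength=5.0))
```

A parallel implementation would verify several candidates at once, trading hardware for acquisition time as noted above.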

Extensive computer simulations have been performed. Tables I and II summarize the mean time and the time with 99.9% confidence to acquire frame synchronization. BPSK and QPSK are particularly interesting since they require the least SNR, respectively, for the non-broadcasting and broadcasting modes, i.e. the worst-case scenarios in terms of frame synchronization. From the tables, it becomes clear that by utilizing the PLSC, significant performance improvements can be obtained.

Figure 6. Peak detector: peak within each window of length L is located.

| Method     | Modulation | $E_s/N_0$ (dB) | Time to acquire (ms) |
|------------|------------|----------------|----------------------|
| SOF only   | BPSK       | -2             | 134                  |
| SOF + PLSC | BPSK       | -2             | 21.7                 |
| SOF + PLSC | QPSK       | 0.7            | 3.53                 |

Table I. Mean time to acquire frame synchronization.

| Method     | Modulation | $E_s/N_0$ (dB) | Time to acquire (ms) |
|------------|------------|----------------|----------------------|
| SOF only   | BPSK       | -2             | 912                  |
| SOF + PLSC | BPSK       | -2             | 138                  |
| SOF + PLSC | QPSK       | 0.7            | 9.6                  |

Table II. Time with 99.9% confidence to acquire frame synchronization.

We would like to point out that the ML decoding of the PLSC is rather simple. There are two possible approaches. The first approach takes advantage of the fact that the PLSC is actually an interleaved first-order Reed-Muller code of length 64. Deinterleaving the received vector results in a codeword of the first-order Reed-Muller code. It is well known that the fast Hadamard transform (FHT) can be used to efficiently decode such a code. The second approach follows the observation that the final comparison in Figure 5 actually decodes the seventh information bit. After the seventh bit is decoded, we can coherently sum the two adjacent received samples, obtaining a vector of length 32, which can be decoded by an FHT of length 32.
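The second approach can be sketched as follows. This is a simplified coherent model with several assumptions: the received values are real soft values after descrambling and phase correction (+1 for bit 0, -1 for bit 1), the seventh bit is decided from the sign of the summed pairwise products rather than from the circuit of Figure 5, and the encoder uses an assumed bit labelling that need not match the generator matrix in (1). All function names are hypothetical.

```python
import numpy as np

def fht(x):
    """Fast Hadamard transform (natural order) of a length-2**m vector."""
    x = np.array(x, dtype=float)
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x

def plsc_encode(b0, a, b7):
    """Assumed labelling: y_i = b0 xor parity(a & i), with 0 <= a < 32."""
    i = np.arange(32)
    parity = np.zeros(32, dtype=int)
    for j in range(5):
        parity ^= ((a >> j) & 1) * ((i >> j) & 1)
    y = b0 ^ parity
    z = np.empty(64, dtype=int)
    z[0::2], z[1::2] = y, y ^ b7
    return z

def plsc_decode(r):
    """r: 64 real soft values; returns (b0, a, b7)."""
    even, odd = r[0::2], r[1::2]
    b7 = int(np.sum(even * odd) < 0)          # differential decision on the seventh bit
    combined = even + (1 - 2 * b7) * odd      # coherent sum of each adjacent pair
    F = fht(combined)                         # length-32 FHT
    a = int(np.argmax(np.abs(F)))             # strongest Hadamard coefficient
    b0 = int(F[a] < 0)                        # its sign carries the remaining bit
    return b0, a, b7

rng = np.random.default_rng(2)
r = 1.0 - 2.0 * plsc_encode(1, 19, 1) + 0.3 * rng.normal(size=64)
print(plsc_decode(r))
```

The decoder needs one length-32 transform plus 32 multiplications, compared with an exhaustive correlation against all 128 codewords.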

## 4. PILOT STRUCTURE AND DESIGN CONSIDERATIONS OF CARRIER RECOVERY

The design goal of DVB-S2 carrier recovery scheme is to deliver channel outputs reliably to the LDPC decoder at very low SNR with small synchronization overhead. The design objectives include:

- Negligible LDPC decoding performance loss due to carrier synchronization impairment (less than 0.1–0.3 dB for most modes);
- Capable of working at extremely low SNR, as low as  $E_s/N_0 = -2.0$  dB;
- Capable of acquiring large carrier frequency offset (up to 5 MHz) with a 30 kHz/s ramp;
- Robust to LNB phase noise characteristics specified by DVB-S which works at higher SNR and allows more implementation margin;
- Rapid initial acquisition;
- Simple implementation.

As mentioned earlier, the DVB-S2 phase noise specification is rather demanding due to the desire to reuse the millions of LNBs that have been deployed for DVB-S set-top boxes. The major challenge for carrier recovery is to handle severe phase noise and large frequency offset at low SNR. In order to achieve negligible overall performance loss in terms of the MPEG packet error rate (PER) at the output of the LDPC decoder, receiver designers have to pay closer attention to performance metrics such as the cycle-slip rate [7–9] and the probability distribution of the phase tracking error, rather than just the mean values, e.g. the root mean squared (RMS) error at the output of phase tracking loops. This is due to the fact that the LDPC PER curve is so steep that a small probability of cycle slip would cause an error floor.

Unlike DVB-S, which supports only QPSK modulation, DVB-S2 supports several modulation schemes, such as QPSK, 8-PSK, 16-APSK, and 32-APSK. In order to expedite carrier recovery, for each modulation type, the standard allows two operating modes: pilotless and pilot, where pilot symbols are inserted to aid carrier synchronization. It is the responsibility of the system operators to choose the operating mode. The set-top boxes are informed of the pilot configuration by the PLSC residing in the PLHEADER. Table III shows the minimal  $E_s/N_0$  to achieve quasi error free (QEF) PER for several modes in the AWGN channel. Thorough investigations show that even with the tough phase noise specification, only a handful of modes, namely 8-PSK rate 2/3, 16-APSK rate 2/3 and 3/4, and 32-APSK rate 3/4, need pilot assistance for carrier recovery. Clearly, for most high rate codes, the operating SNRs are sufficiently high that a traditional second order phase locked loop (PLL) should suffice. Therefore, when implementation complexity is a concern, a generic carrier recovery strategy that is suitable for all the modulation schemes with or without pilots is desired.

| Mode        | Spectral efficiency (bits/symbol) | Ideal $E_s/N_0$ (dB) |
|-------------|-----------------------------------|----------------------|
| BPSK 1/2    | 0.495114                          | -2.00                |
| QPSK 1/2    | 0.988857                          | 1.00                 |
| QPSK 2/3    | 1.322251                          | 3.10                 |
| 8-PSK 2/3   | 1.980633                          | 6.62                 |
| 8-PSK 3/4   | 2.228122                          | 7.91                 |
| 8-PSK 5/6   | 2.478560                          | 9.35                 |
| 16-APSK 2/3 | 2.637197                          | 8.97                 |
| 16-APSK 3/4 | 2.966726                          | 10.21                |
| 32-APSK 3/4 | 3.703293                          | 12.73                |
| 32-APSK 4/5 | 3.951568                          | 13.64                |

Table III.  $E_s/N_0$  performance at quasi error free (QEF) PER =  $10^{-7}$  for 64 800-bit frames in AWGN channel.

Figure 7. Pilot structure for 8-PSK modulation, where UWs (unique words) refer to pilot symbols.

Based on these observations, we proposed the following *aggregated pilot* structure, which was eventually adopted by DVB-S2. Each LDPC coded frame is preceded by a one-slot (90-symbol) PLHEADER containing the SOF and PLSC. Afterward, a block of 36 pilot symbols follows every 16 slots of coded data symbols. If a pilot block would coincide with the PLHEADER of the following frame, it is not inserted. We will show later that this pilot structure is a good balance between the synchronization overhead and performance. Figure 7 shows an example of the pilot structure, which is for 8-PSK modulation. In the sequel, the term '*pilot symbols*' refers to both the 90-symbol PLHEADER and the training pilots if any, unless mentioned otherwise.
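The frame layout implied by these rules can be worked out with a short Python sketch. The function name `pilot_layout` and the returned field names are hypothetical; the numbers follow directly from the slot length (90 symbols), the pilot block length (36 symbols) and the 16-slot spacing stated above.

```python
def pilot_layout(coded_bits, bits_per_symbol, pilots=True):
    """Count symbols in one physical layer frame under the aggregated pilot rules."""
    SLOT, PILOT_BLOCK, SLOTS_PER_SEGMENT = 90, 36, 16
    data_symbols = coded_bits // bits_per_symbol
    slots = data_symbols // SLOT
    # A pilot block follows every 16 data slots, except where it would
    # coincide with the PLHEADER of the next frame.
    blocks = (slots - 1) // SLOTS_PER_SEGMENT if pilots else 0
    sync_symbols = SLOT + blocks * PILOT_BLOCK
    total = data_symbols + sync_symbols
    return {
        "data_slots": slots,
        "pilot_blocks": blocks,
        "pilot_segments": 1 + blocks,      # PLHEADER plus the training pilot blocks
        "frame_symbols": total,
        "overhead": sync_symbols / total,
    }

print(pilot_layout(64800, 3))   # 8-PSK, 64 800-bit frame
print(pilot_layout(64800, 2))   # QPSK, 64 800-bit frame
```

For 8-PSK the 21 600 data symbols occupy 240 slots, so 14 pilot blocks are inserted and the fifteenth position is taken by the next PLHEADER, which gives 15 pilot segments per frame when the PLHEADER is counted.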

The aggregated pilot structure makes it feasible for a receiver to accommodate all the modulation schemes under both pilot and pilotless modes in a uniform manner, as elaborated below.

1. Carrier frequency is acquired through a two-step procedure: coarse estimation and fine tuning (including tracking). The coarse frequency estimate is obtained by a feed-forward estimator that works only on the pilot symbols. In the fine tuning stage, the frequency estimation is achieved by another feed-forward estimator that works only on the pilot symbols in the pilot mode, or, in the pilotless mode, by a simple frequency estimator using both the modulated data and the pilot symbols.
2. Unlike conventional continuous mode receivers, which usually operate continuously and ignore the frame structure of the transmitted data, the new DVB-S2 receiver tracks the carrier phase of the received data on a segment by segment basis, which is more like a burst mode modem. More specifically, the *segment* can be a 16-slot coded data segment in the pilot mode or a whole LDPC frame in the pilotless mode. The phase tracking loop is reinitialized with the phase estimates obtained from the pilot symbols for each segment, making its operation independent of other segments and preventing any error propagation. Therefore, the block processing eliminates the impact of traditional cycle slips. From the phase tracking perspective, the only difference between different modulation schemes is the phase error detector used in the phase tracking loop.

Figure 8. Flow diagram of carrier recovery.

The overall carrier synchronization flow is shown in Figure 8. The data-aided frequency estimators are independent of the modulation scheme and feed-forward, and thus always stable. They work very well with large frequency offset (up to 25% of the symbol rate) under extremely low SNR (as low as -2 dB). It takes the coarse frequency estimator less than 20 frames to bring the frequency offset from 25% to less than  $10^{-4}$  of the symbol rate, and it takes the fine frequency estimator just 1 frame to bring the frequency offset further down to the range of  $10^{-6}$  of the symbol rate, which can be well handled by the phase tracking loop. In the next section, the carrier synchronization algorithms will be briefly explained and their performance will be examined through computer simulations.

## 5. CARRIER RECOVERY ALGORITHMS AND THEIR PERFORMANCE

The top-level block diagram for overall carrier recovery is shown in Figure 9, where CSM refers to the Carrier Synchronization Module that hosts the frequency fine tuning and phase tracking modules. Upon the establishment of LDPC frame synchronization, the coarse frequency estimator gets an initial frequency offset estimation. The fine frequency estimator further refines the estimation to achieve better accuracy. The frequency estimates obtained from both the coarse and fine estimation are fed into a mixer in the receiver front-end, which closes the frequency control loop.

Figure 9. Top-level block diagram for carrier recovery.

#### 5.1. Algorithms

Data-aided frequency estimation is addressed in References [8, 9]. For set-top box applications, the implementation complexity is an important factor when designing the algorithms. We propose the following simple frequency estimator that is based on the work in References [10, 11]. It is well known that the carrier phase of a modulation-removed signal (which is a continuous wave (CW)) has a constant slope proportional to its angular frequency offset. At high SNR, the frequency estimator that performs linear regression on the CW phase values is ML [12, 13]. However, like any non-linear operation, this simple algorithm has a threshold effect: when the SNR of the CW signal is below 9 dB ( $E_s/N_0$ ), its performance deteriorates rapidly [9, p. 93]. Instead of doing a linear regression directly on the CW signal itself, the feed-forward frequency estimators proposed here first increase the energy of the CW signal by either autocorrelation over several frames or cross-correlation over a whole pilot symbol segment, then estimate the frequency through linear regression on the energy-enhanced CW signals.

The coarse frequency estimation algorithm operates on the pilot symbols as follows.

1. In one pilot segment, which can be a 90-symbol PLHEADER or a 36-symbol pilot segment if any, compute the autocorrelation  $R_s(m)$  from the modulation-removed CW signal  $c_k$  as follows:

$$R_s(m) \stackrel{\scriptscriptstyle \Delta}{=} \sum_{k=0}^{N_s - 1 - m} c_{k+m} c_k^* \tag{2}$$


where s is the index of the pilot segment, m is the autocorrelation lag ranging from 1 to  $L_c$ , a design parameter,  $N_s$  is the number of pilot symbols in the pilot segment (90 for PLHEADER, 36 for training pilots).

2. Accumulate the autocorrelation over a number of LDPC frames

$$R(m) \stackrel{\scriptscriptstyle \Delta}{=} \sum_{s} R_s(m) \tag{3}$$

3. Obtain the final frequency estimate from the following weighted sum

$$\Delta \hat{f} = \frac{1}{2\pi T_s} \sum_{m=0}^{L_c - 1} w_m \Delta(m) \tag{4}$$

with

$$w_m = \frac{3((2L_c+1)^2 - (2m+1)^2)}{((2L_c+1)^2 - 1)(2L_c+1)}, \quad m = 0, \dots, L_c - 1$$

and

$$\Delta(m) = \begin{cases} \arg[R(1)], & m = 0\\ \operatorname{mod}[\arg(R(m+1)) - \arg(R(m)), 2\pi], & m = 1, \dots, L_c - 1 \end{cases}$$

where  $T_s$  is the symbol period.

Since the coarse frequency estimator is based on a differential operation (i.e. the autocorrelation), it can handle very large frequency offsets (larger than 25% of the symbol rate). Because the autocorrelation accumulates energy over several frames, it can also work at very low SNR (lower than -2 dB).
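Steps 1 to 3 of the coarse estimator can be sketched in Python as below. This is an illustration under assumptions rather than a reference implementation: the function names are hypothetical, the phase difference is wrapped into $[-\pi, \pi)$ (the assumed meaning of the modulo-$2\pi$ operation), the lag range $L_c = 8$ is an arbitrary choice, and the example input is a noiseless CW so that the result is deterministic.

```python
import numpy as np

def wrap(x):
    """Wrap angles into [-pi, pi); assumed meaning of the modulo-2*pi operation."""
    return (x + np.pi) % (2 * np.pi) - np.pi

def coarse_frequency(segments, Lc, Ts):
    """segments: list of modulation-removed pilot segments c_k (complex arrays)."""
    R = np.zeros(Lc + 1, dtype=complex)
    for c in segments:                           # Eqs. (2) and (3): accumulate R(m)
        for m in range(1, Lc + 1):
            R[m] += np.sum(c[m:] * np.conj(c[:-m]))
    arg = np.angle(R[1:])                        # arg R(1), ..., arg R(Lc)
    delta = np.empty(Lc)
    delta[0] = arg[0]                            # Delta(0) = arg R(1)
    delta[1:] = wrap(np.diff(arg))               # Delta(m), m = 1, ..., Lc - 1
    m = np.arange(Lc)
    w = 3 * ((2 * Lc + 1) ** 2 - (2 * m + 1) ** 2) / (
        ((2 * Lc + 1) ** 2 - 1) * (2 * Lc + 1))  # weights, which sum to one
    return float(np.sum(w * delta) / (2 * np.pi * Ts))   # Eq. (4)

# Noiseless example: 20 PLHEADER segments, offset = 20% of a 25 Mbaud symbol rate.
rng = np.random.default_rng(1)
Ts = 4e-8
f0 = 0.2 / Ts
segs = [np.exp(1j * (2 * np.pi * f0 * Ts * np.arange(90) + rng.uniform(0, 2 * np.pi)))
        for _ in range(20)]
est = coarse_frequency(segs, Lc=8, Ts=Ts)
print(est, f0)
```

Each segment may have an arbitrary starting phase because the autocorrelation removes it, which is why contributions from different frames can be added directly.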

In the pilotless mode, the frequency fine tuning is no different from a traditional PLL, given that the residual frequency offset is in the range of  $10^{-4}$  of the symbol rate. In the pilot mode, the feed-forward fine frequency estimator works on a frame by frame basis as follows.

1. Estimate the carrier phase from the sth pilot segment using the ML phase estimator [8]

$$\phi_s = \arg\left[\sum_{k=0}^{N_s-1} c_k\right] \tag{5}$$

where s ranges from 0 to  $N_p - 1$  with  $N_p$  the number of pilot segments in one frame, e.g. 15 for 8-PSK.

2. Obtain the final frequency estimate from the following weighted sum:

$$\Delta \hat{f} = \frac{1}{2\pi (N_s + N_d) T_s} \sum_{s=0}^{N_p - 2} w_s \operatorname{mod}[\phi_{s+1} - \phi_s, 2\pi] \tag{6}$$

with

$$w_s = \frac{3((2N_p - 1)^2 - (2s + 1)^2)}{((2N_p - 1)^2 - 1)(2N_p - 1)}, \quad s = 0, \dots, N_p - 2$$

where  $N_d$  is the number of data symbols between two pilot segments, i.e. 1440 (16 slots), and  $N_s$  is equal to 36 in DVB-S2.
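
The two steps of the fine estimator can be sketched as follows. This is an illustrative Python sketch only: it assumes equally spaced 36-symbol pilot segments (it does not treat the 90-symbol PLHEADER separately), a noiseless test signal, and phase differences wrapped into $[-\pi, \pi)$. All function and variable names are hypothetical.

```python
import cmath
import math


def wrap(angle):
    """Wrap an angle in radians into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def ml_phase(c):
    """Eq. (5): ML phase estimate from modulation-removed pilot samples."""
    return cmath.phase(sum(c))


def fine_frequency(pilot_segments, spacing, t_s=1.0):
    """Eq. (6): weighted sum of wrapped phase differences between
    consecutive pilot segments; `spacing` is N_s + N_d in symbols."""
    phases = [ml_phase(seg) for seg in pilot_segments]
    n_p = len(phases)
    q = 2 * n_p - 1
    total = 0.0
    for s in range(n_p - 1):
        w = 3.0 * (q * q - (2 * s + 1) ** 2) / ((q * q - 1) * q)
        total += w * wrap(phases[s + 1] - phases[s])
    return total / (2.0 * math.pi * spacing * t_s)


N_S, N_D, N_P = 36, 1440, 15   # values quoted in the text for 8-PSK
TRUE_OFFSET = 2e-4             # normalized to the symbol rate

# Noiseless pilot segments taken from one continuous carrier rotation.
pilots = [
    [cmath.exp(1j * (2.0 * math.pi * TRUE_OFFSET * (s * (N_S + N_D) + k) + 0.7))
     for k in range(N_S)]
    for s in range(N_P)
]
fine_estimate = fine_frequency(pilots, N_S + N_D)
print(fine_estimate)
```

The weights $w_s$ sum to one, so in the noiseless case every phase difference equals $2\pi \Delta f (N_s + N_d) T_s$ and the weighted sum returns the offset directly.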

Clearly, the maximum frequency offset that the fine frequency estimator can handle must be less than $1/(2(N_s + N_d)T_s)$, which is equivalent to $3.3 \times 10^{-4}$ of the symbol rate for DVB-S2. The estimator has a feed-forward structure and is thus always stable. The ML phase estimator (5) accumulates energy over either 90 or 36 pilot symbols, which in effect realizes the cross-correlation between the received symbols and the pilot symbols, making the fine frequency estimator very robust to thermal noise and phase noise. The large denominator $(N_s + N_d)$ in (6) results from the *aggregated* pilot structure shown in Figure 7 and makes the frequency estimate very accurate within one frame. The further apart two pilot segments are, i.e. the bigger $N_d$, the more accurate the frequency estimate; on the other hand, a larger spacing leads to a smaller estimation range. The DVB-S2 pilot structure represents an excellent balance between these two conflicting aspects.
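
As a worked check of the estimation range, note that the phase advance between adjacent pilot segments, $2\pi \Delta f (N_s + N_d) T_s$, must stay within $\pm\pi$ for the wrapped difference in (6) to be unambiguous. With the DVB-S2 values $N_s = 36$ and $N_d = 1440$:

$$|\Delta f| < \frac{1}{2(N_s + N_d)T_s} = \frac{1}{2 \times 1476 \, T_s} \approx \frac{3.39 \times 10^{-4}}{T_s}$$

This agrees with the figure of $3.3 \times 10^{-4}$ quoted above. At the 25 Mbaud symbol rate assumed for the simulations in Section 5.2, it corresponds to roughly 8.5 kHz.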

After the carrier frequency is acquired, the PLL-based tracking loop starts operation. The phase estimate from every aggregated pilot block is used to initialize and re-initialize the tracking loop. This makes the tracking loop more stable and protects it from the catastrophic consequences of conventional cycle slips.

# 5.2. Performance

Computer simulations were conducted to test every aspect of the carrier recovery algorithms based on the DVB-S2 pilot structure. In the sequel, the assumed symbol rate is 25 Mbaud, i.e.  $T_s = 4 \times 10^{-8}$  s. The SNR refers to  $E_s/N_0$  in dB.

The performance of the coarse frequency estimator can be summarized as follows.

- Capable of handling very large frequency offset (up to 25% symbol rate) at very low SNR (less than -2 dB).
- Robust to phase noise.
- Stable due to the feed-forward structure.
- Fast acquisition. The acquisition time is solely determined by the desired estimation accuracy and independent of frequency offset.

The coarse frequency estimator was tested for both the pilot and pilotless modes in the presence of strong AWGN and the phase noise specified by DVB-S2. The received signal has a 5 MHz frequency offset (20% of the symbol rate). Simulations show that the coarse frequency estimator is unbiased even at -2 dB. Figure 10 shows the RMS frequency error of the coarse estimator operating on 8-PSK frames in the pilot mode, with the autocorrelation accumulated over 10 frames. At 0 dB, the RMS error is  $1.2 \times 10^{-4}$ ; at 6 dB, the RMS error is  $5.9 \times 10^{-5}$ , which brings the residual error to less than  $2.4 \times 10^{-4}$  with 99.999% confidence. Figure 11 shows the RMS frequency error of the coarse estimator operating in the pilotless mode, where the autocorrelation is accumulated only on the 90-symbol PLHEADERs. At -2 dB, the RMS error is  $1.4 \times 10^{-4}$  over 20 frames.

The performance of the fine frequency estimator (for the pilot mode) can be summarized as follows.

- Capable of handling sudden frequency change up to  $3.3 \times 10^{-4}$  symbol rate.
- Stable due to the feed-forward structure.
- Fast estimation. It takes only one frame to obtain a very fine frequency estimate. Thus, it can easily handle a 30 kHz/s ramp.
- Robust to both AWGN and phase noise.
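
As a rough check of the ramp claim in the list above, the following sketch computes how far a 30 kHz/s ramp moves the carrier during one 8-PSK frame and compares it with the estimation range. The payload length of 21 600 symbols (64 800 coded bits at 3 bits per symbol) is an assumption not stated in this section; the 14 pilot blocks follow the Figure 12 caption, and the variable names are illustrative.

```python
PLHEADER_SYMBOLS = 90
PILOT_SYMBOLS = 36          # N_s
DATA_SYMBOLS = 1440         # N_d, 16 slots between pilot blocks
PILOT_BLOCKS = 14           # per 8-PSK frame, as in the Figure 12 caption
PAYLOAD_SYMBOLS = 21600     # assumption: 64 800 coded bits / 3 bits per symbol
SYMBOL_RATE = 25e6          # 25 Mbaud, as in the simulations
RAMP_HZ_PER_S = 30e3

# Total frame length in symbols and its duration in seconds.
frame_symbols = PLHEADER_SYMBOLS + PAYLOAD_SYMBOLS + PILOT_BLOCKS * PILOT_SYMBOLS
frame_duration = frame_symbols / SYMBOL_RATE

# Frequency drift accumulated over one frame, in Hz and normalized.
drift_hz = RAMP_HZ_PER_S * frame_duration
drift_normalized = drift_hz / SYMBOL_RATE

# Estimation range of the fine estimator, normalized to the symbol rate.
range_normalized = 1.0 / (2.0 * (PILOT_SYMBOLS + DATA_SYMBOLS))

print(frame_symbols, drift_hz, drift_normalized, range_normalized)
```

Under these assumptions the per-frame drift is a few tens of hertz, more than two orders of magnitude below the estimation range, which is consistent with the statement that one-frame estimation can follow such a ramp.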



Figure 10. Performance of the coarse frequency estimation based on PLHEADER & pilot segments (8-PSK frame structure) over 10 LDPC frames, 20% frequency offset,  $L_c = 32$ .



Figure 11. Performance of the coarse frequency estimation based on PLHEADER only (non-pilot case) over 20 LDPC frames, 20% frequency offset,  $L_c = 32$ .



Figure 12. Performance of the fine frequency estimation based on one 8-PSK LDPC frame (14 pilot segments),  $2 \times 10^{-4}$  frequency offset.

Table IV. Carrier phase tracking performance at 25 Mbaud with the phase noise model specified by DVB-S2 in AWGN channel.

| Mode        | Pilot mode | $E_s/N_0$ (dB) | $\mathrm{SNR}_{\mathrm{PD}}$ (dB) | RMS err. (deg) | Cycle-slip rate |
|-------------|------------|----------------|------------------------|----------------|-----------------|
| BPSK 1/2    | Pilotless  | -2.00          | -1.10                  | 2.72           | $< 10^{-8}$     |
| QPSK 1/2    | Pilotless  | 1.00           | -4.82                  | 3.27           | $< 10^{-8}$     |
| 8-PSK 2/3   | Pilot      | 6.70           | -4.30                  | 3.15           | $< 10^{-8}$     |
| 16-APSK 3/4 | Pilot      | 10.20          | 3.27                   | 2.55           | $< 10^{-8}$     |

The fine frequency estimator was tested given  $2 \times 10^{-4}$  frequency offset (normalized to the symbol rate) with both AWGN only and the phase noise plus AWGN. Figure 12 shows the RMS frequency error of the fine estimator for 8-PSK modulation in the pilot mode. At 0 dB, the RMS error is  $1.0 \times 10^{-6}$  with AWGN only; at 6 dB, the RMS error is  $6.0 \times 10^{-7}$  with AWGN only. In the presence of phase noise, the estimation performance is then dominated by phase noise. The RMS error is  $6.3 \times 10^{-6}$  at 0 dB, which is still very good.

Extensive simulations were also conducted to test the carrier phase tracking algorithm. Table IV summarizes its performance with phase noise at the lowest operating SNR of each modulation scheme. The SNR of a phase detector (PD) is defined as

$$\mathrm{SNR}_{\mathrm{PD}} \stackrel{\scriptscriptstyle \Delta}{=} \frac{A^2}{\sigma_p^2} \tag{7}$$

where $A$ is the PD gain, and  $\sigma_p^2$  is the average noise power of the PD. None of the BPSK and QPSK modes needs pilots for carrier recovery. Their RMS phase tracking errors are 2.72° (-2 dB) and 3.27° (1 dB), respectively. Modes 8-PSK rate 2/3 and 16-APSK rate 3/4 do need pilot symbols for carrier recovery. The cycle-slip rates (per LDPC frame) are all less than  $10^{-8}$ , which is the lowest rate verifiable via simulation.

Figures 13–15 show some phase tracking statistics of 8-PSK at 6.7 dB. Figure 13 shows the phase detector S-curve used in 8-PSK. Figure 14 shows the complementary cumulative distribution function (CCDF) of the ML phase estimator (5) used to initialize the phase tracking loop for each 16-slot data segment. The RMS estimation error is  $3.12^{\circ}$  for 36 pilot symbols at 6.7 dB. The distribution of phase estimate approaches Gaussian. Clearly, the phase estimate can bring the phase tracking loop into the tracking mode directly, which helps reduce the cycle-slip rate. Figure 15 shows the CCDF of the phase estimate at the end of a 16-slot data segment. The maximum phase error is around 19° (with a probability less than  $2 \times 10^{-7}$ ), *vis-à-vis* 22.5°, the threshold for a cycle slip in 8-PSK. Note that all the results are with the severe phase noise.
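
The two numbers quoted above can be sanity-checked with a short calculation. The sketch below evaluates the standard AWGN-only Cramér-Rao bound for data-aided phase estimation with known frequency, $\sigma_\phi^2 \ge 1/(2 N_s E_s/N_0)$, and the decision-region half-width $\pi/M$ of $M$-PSK. Using this bound as the comparison point is an assumption made here for illustration (the simulations in the text also include phase noise), and the function names are hypothetical.

```python
import math


def phase_crb_rms_deg(n_pilots, es_n0_db):
    """RMS phase error (degrees) from the AWGN-only Cramer-Rao bound
    var >= 1 / (2 * N * Es/N0) for a data-aided phase estimate."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    return math.degrees(math.sqrt(1.0 / (2.0 * n_pilots * es_n0)))


def psk_slip_threshold_deg(order):
    """Half-width of an M-PSK decision region, pi / M, in degrees."""
    return 180.0 / order


rms_bound = phase_crb_rms_deg(36, 6.7)   # 36 pilot symbols at 6.7 dB
threshold = psk_slip_threshold_deg(8)    # 8-PSK
print(rms_bound, threshold)
```

For 36 pilots at 6.7 dB the bound comes out at roughly 3.1°, close to the simulated RMS error reported for the ML estimator (5), and the 8-PSK threshold evaluates to 22.5°, matching the value quoted in the text.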

The ultimate goal of the receiver design is not to degrade the performance of the LDPC decoder relative to the ideal AWGN channel. Our receiver, connected to the LDPC decoder, went through extensive simulations to verify the final performance. Figure 16 shows the MPEG PER in the presence of the phase noise in an AWGN channel. The performance losses for all the modes are almost negligible, and no error floor is observed. For instance, the loss for BPSK rate 1/2 (operating in the pilotless mode around -2 dB) is 0.02 dB; the loss for QPSK rate 1/2 (operating in the pilotless mode around 1 dB) is 0.05 dB; the loss for 8-PSK rate 2/3 (operating in the pilot mode around 6.7 dB) is 0.08 dB; the loss for 16-APSK rate 3/4 (operating in the pilot mode around 10.2 dB) is 0.15 dB.



Figure 13. Phase detector S-curve in 8-PSK at 6.7 dB, where  $SNR_{PD} = -4.30 \text{ dB}$ .



Figure 14. CCDF of the ML phase estimate (5) based on 36 pilot symbols at 6.7 dB.



Figure 15. CCDF of the phase estimate at the end of one 16-slot data segment in 8-PSK pilot mode at 6.7 dB.



Figure 16. MPEG PER of HNS receiver in the presence of DVB-S2 phase noise in AWGN channel.

#### 6. CONCLUSIONS

In this paper, we summarized the DVB-S2 embedded structures for frame synchronization and carrier recovery, the most difficult tasks in DVB-S2 receiver design. Algorithms exploiting these structures to achieve rapid and robust synchronization were discussed and their performance was presented. All the design goals of DVB-S2 have been met or exceeded. At the time of publication, Hughes Network Systems has built and demonstrated a DVB-S2 system over a real satellite channel, and all the performance data presented here have been further validated by that system.

#### REFERENCES

- 1. Digital Video Broadcasting (DVB): Second generation framing structure, channel coding and modulation systems for broadcasting, interactive services, news gathering and other broadband satellite applications. DVBS2-74r10, November 2003.
- 2. Hughes Network Systems. LDPC frame synchronization. DVBS2-104, June 2003.
- 3. European Space Agency. DVB-S2 frame synchronization analysis. DVBS2-100, June 2003.
- 4. Alcatel Space. Proposal for a first acquisition. DVBS2-101, June 2003.
- 5. Philips. Comments on the length of SOF. DVBS2-103, June 2003.
- 6. MacWilliams FJ, Sloane NJA. The Theory of Error-Correcting Codes. Elsevier: New York, 1977.
- 7. Meyr H, Ascheid G. Synchronization in Digital Communications, vol. 1. Wiley: New York, 1990.
- 8. Meyr H, Moeneclaey M, Fechtel S. Digital Communication Receivers, Synchronization, Channel Estimation, and Signal Processing. Wiley: New York, 1998.
- 9. Mengali U, D'Andrea AN. Synchronization Techniques for Digital Receivers. Plenum: New York, 1997.
- 10. Jiang Y, et al. Data-aided ML parameter estimators of PSK burst modems and their systolic VLSI implementations. In *Proceedings of the IEEE Globecom'99*, Rio de Janeiro, Brazil, December 1999.

- 11. Jiang Y. Synchronization and channel parameter estimation in wireless communications. *Ph.D. thesis*, University of Maryland at College Park, July 2000.
- 12. Tretter S. Estimating the frequency of a noisy sinusoid by linear regression. *IEEE Transactions on Information Theory* 1985; **IT-31**(6):832-835.
- 13. Kay S. A fast and accurate single frequency estimator. *IEEE Transactions on Acoustics, Speech, and Signal Processing* 1989; **ASSP-37**(12):1987–1990.
- 14. Hughes Network Systems. Carrier synchronization solution for DVB-S2 modem. DVBS2-097, June 2003.

#### AUTHORS' BIOGRAPHIES



**Feng-Wen Sun** received the BS degree from Heilongjiang University, Harbin, China in 1983, the MS degree from Nankai University, Tianjin, China in 1989, and the PhD degree from Eindhoven University of Technology, Eindhoven, the Netherlands in 1994.

Since January 1996, he has been with Hughes Network Systems and currently is a technical director responsible for R&D on signal processing/coding and modem design. He has published extensively on error-correction coding, signal processing and synchronization techniques. He has played the leading role in defining and implementing the key algorithms and VLSI architectures for many commercial products. He was involved in the U.S. and European third generation wideband CDMA standard activities and served as the chair for the Turbo code group in TIA/EIA CDMA2000 standards development.

He was instrumental in the adoption of these coding schemes by the standards bodies. He played a leading role in the recent adoption of low-density parity-check codes by the European Digital Video Broadcasting (DVB) standard. This is the first major standard to adopt this coding scheme. Subsequently, he and his team supplied the only demodulator design to meet the stringent DVB-S2 requirements. He served as the chief architect for the VLSI implementation of the LDPC-based DVB standard. Dr Sun holds 6 U.S. patents and has an additional 24 patents pending. Many of his patents have been productized or licensed, generating millions of dollars in license fees.

Dr Sun was the recipient of the 1998 and 2001 Hughes Electronics Patent Award and the co-recipient of the 1998 CDMA technical achievement award. In 2000, he received a special award for exceptional contributions to third generation wireless technologies. In 2002, he was awarded the Hughes Electronics Chairman's Award. In 2003, he was awarded the P. A. Hyland Patent Award.



**Yimin Jiang** received the BS degree in electronic engineering from Tsinghua University, Beijing, China, in 1996, and the MS and PhD degrees in electrical engineering from the University of Maryland, College Park, MD, in 1998 and 2000, respectively.

Since 1998, he has been with Hughes Network Systems, Germantown, MD. Currently, he is a Senior Member Technical Staff with the Advanced Development Group. His research interests include general areas of modem design, such as synchronization, error-correcting codes, and stochastic and VLSI signal processing. Among the DVB-S2 standard activities at Hughes, he is in charge of the demodulator algorithm design, which is the only viable solution adopted by the standard committee, as well as its VLSI implementation and the overall system verification. Dr Jiang holds one U.S. patent and has an additional seven patent applications pending.

Dr Jiang received the ISR fellowship from the University of Maryland, College Park, in 1996.



**Lin-Nan Lee** is a Vice President of Engineering at Hughes Network Systems (HNS), responsible for advanced technology development. Dr Lee received his BS degree from National Taiwan University and his MS and PhD degrees from the University of Notre Dame, all in Electrical Engineering. He started his career at Linkabit Corporation, where he was a Senior Scientist working on packet communications over satellites at the dawn of the Internet age. He then worked at Communication Satellite Corporation (COMSAT) in various research and development capacities with emphasis on source and channel coding technology development for satellite transmission and eventually assumed the position of Chief Scientist, COMSAT Systems Division. Since joining HNS in late 1992, he has contributed to HNS efforts in the wireless and satellite communications areas, most notably to HNS technical contributions in third generation wireless standardization and Digital Video Broadcast. Dr Lee is a Fellow of IEEE. He was the co-recipient of the COMSAT Exceptional Invention Award, and the 1985 and 1988 COMSAT Research Award. He was also awarded the 2003 Hughes Electronics Chairman's honour award for his outstanding leadership in advanced technology initiatives. He has authored or co-authored more than 20 U.S. patents.