ispLeverCORE™
Turbo Decoder
User’s Guide
November 2008
ipug14_04.4
Lattice Semiconductor
Turbo Decoder User’s Guide
Introduction
Lattice’s Turbo Decoder core is a customizable solution for turbo decoding of data in many system designs. The
core allows designers to focus on the application rather than on the Turbo Decoder, resulting in a faster time to
market.
Turbo coding is an advanced error correction technique widely used in the communications industry. The Turbo
Decoder IP Core from Lattice is compliant with three different standards: 3GPP, 3GPP2 and CCSDS. Lattice’s
Turbo Decoder core was developed in conjunction with Lattice’s Turbo Encoder core to provide a complete solution.
This User’s Guide explains the functionality of the Turbo Decoder core and how it can be implemented to provide
decoding. The Turbo Decoder core comes with the documentation and files listed below:
• Lattice gate level netlist
• RTL simulation model
• Core instantiation template
Features
• Fully compatible with Third Generation Partnership Project (3GPP) standard:
– 3GPP TS 25.212 version 4.2.0
• Fully compatible with CDMA2000/3GPP2
– 3GPP2 C.S002-C, May 2002
• Fully compatible with Consultative Committee for Space Data Systems standard:
– CCSDS 101.0-B-5
• Throughput of 2Mbps for 3GPP at 30MHz, 7 iterations, 6-bit input symbol width
• Two’s complement data/parity input
• Depuncturing supported
• Variable soft-widths for input symbols
• User-defined number of states
• Variable block sizes during runtime
• Programmable number of iterations (1-15)
• Optional hard decision storage
• Selectable Max-Log-Map or Log-Map algorithm
• Optional external memory with programmable pipeline stages
• Optional double buffering
• Bit Error Rate of 10^-6 (at 1.5 dB Eb/No SNR)
General Description
Turbo coding is an advanced error correction technique widely used in the communications industry. Turbo
encoders and decoders are key elements in today’s communication systems to achieve the best possible data
reception with the fewest possible errors. The basis of turbo coding is to introduce redundancy in the data to be
transmitted through a channel. The redundant data helps to recover the original data from the received data. In
data transmission, turbo coding helps achieve performance near the Shannon limit.
Lattice provides a Turbo Decoder IP core that is both flexible and compliant with three different standards: 3GPP,
3GPP2 and CCSDS. 3GPP is widely used in WCDMA and MC-CDMA applications, while CCSDS is most
commonly found in telemetry and space communications. Figure 1 shows the top-level block diagram of this core.
Lattice also supplies a Turbo Encoder core; together, the two cores provide a complete state-of-the-art error
correction solution.
Figure 1. Turbo Decoder I/O Block Diagram
[Figure: the Turbo Decoder block with the signals clk, rstn, sr, din, inpvalid, blocksizeset(ipcfgset), blocksize,
iterations, rate, rfi, rfno, rfo and dout.]
Note: Additional I/O signals are required if either an external memory or double buffer is selected. Please refer to
the Additional Signals for External Memory section of this document for further information.
MAP Algorithm
Turbo decoding is based on comparing the probability that a received soft input symbol is a ‘1’ with the
probability that it is a ‘0’. The Lattice Turbo Decoder uses a decoding scheme called the MAP (Maximum A
Posteriori Probability) algorithm. For each received data symbol, the algorithm determines both the probability
that it is a ‘1’ and the probability that it is a ‘0’. This is done with the help of the data, the parity symbols, and
the decoder’s knowledge of the encoder trellis. A trellis is a form of state transition table of the encoder
input/output. Based on the data and parity information, the MAP decoder computes the probability of the
encoder being in a particular state. Depending on the soft data, the parity value and the weight from the previous
state, the probability that the data is a ‘1’ or ‘0’ can be computed. The MAP decoder computes the weight for each
data symbol in a given block in both the forward and reverse directions, which yields a forward and a reverse
metric. Using these two values, the probabilities are computed. After the probabilities are determined, they are
compared and a decision is made. The Lattice Turbo Decoder IP core works with the logarithm of the
probabilities to reduce computation; the resulting quantity is known as the Log Likelihood Ratio (LLR). The
computation of the probabilities is done iteratively to obtain a reliable result. Once the result is considered
reliable, a final decision can be made as to whether the data symbol is a ‘1’ or a ‘0’. The Lattice Turbo Decoder
can implement either the Log-Map or the Max-Log-Map algorithm. The Log-Map algorithm gives slightly better
performance than the Max-Log-Map algorithm but uses more resources and runs at a lower frequency.
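The difference between the two selectable algorithms can be illustrated with the standard textbook formulation, which may differ in detail from what the core implements. Log-MAP combines two path metrics with the Jacobian logarithm (often written max*), while Max-Log-MAP keeps only the maximum and drops the correction term. The function names below are illustrative and are not part of the core.

```python
import math


def max_star(a: float, b: float) -> float:
    """Log-MAP combine: log(exp(a) + exp(b)), computed in a numerically stable way."""
    # max(a, b) plus a correction term that depends only on |a - b|.
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_only(a: float, b: float) -> float:
    """Max-Log-MAP combine: the same operation with the correction term dropped."""
    return max(a, b)


if __name__ == "__main__":
    # Illustrative metric pairs (hypothetical values, not taken from the core).
    for a, b in [(0.0, 0.0), (1.0, 0.5), (4.0, -2.0)]:
        exact = max_star(a, b)
        approx = max_only(a, b)
        print(f"a={a:+.1f} b={b:+.1f} log-map={exact:.4f} "
              f"max-log-map={approx:.4f} dropped={exact - approx:.4f}")
```

The dropped correction term is largest when the two metrics are equal and shrinks as they move apart, which is consistent with the small performance gap described above. Evaluating the correction term is also the extra work that Log-MAP has to perform.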
The Log Likelihood Ratio is the logarithm of the probability that the received data bit is a ‘0’ divided by the
probability that the received data bit is a ‘1’:

$$L(D) = \log \frac{P(D=0)}{P(D=1)}$$

The value of L(D) is positive if P(D=1) ≤ P(D=0) and negative otherwise. The output data value is ‘1’ if L(D) is
positive and ‘0’ if L(D) is negative. For one complete iteration, the LLR must be computed using the parity for the
non-interleaved data as well as for the interleaved data.
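As a small numeric illustration of the definition above, the sketch below evaluates L(D) = log(P(D=0) / P(D=1)) for a few probabilities, assuming P(D=1) = 1 - P(D=0). The function name is illustrative, and the core itself operates on fixed-point soft values rather than on floating-point probabilities.

```python
import math


def llr(p0: float) -> float:
    """Return L(D) = log(P(D=0) / P(D=1)), assuming P(D=1) = 1 - P(D=0)."""
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    return math.log(p0 / (1.0 - p0))


if __name__ == "__main__":
    # The magnitude grows with confidence; the sign tells which probability is larger.
    for p0 in (0.1, 0.5, 0.9):
        print(f"P(D=0)={p0:.1f}  L(D)={llr(p0):+.4f}")
```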
Block Diagram
Figure 2 shows a block diagram of the Turbo Decoder detailing the key components and the data paths between
these blocks.
Figure 2. Turbo Decoder Functional Block Diagram
[Figure: block diagram components are din, Data/Parity Memory, Map Decoder, LLR Buffer, Hard Decision Storage
(optional), dout, Control and Interleaver, connected by write address, read address and write enable signals.]
Functional Description
The Turbo Decoder consists of four main components: control module, decoder, interleaver and memory buffers.
Control Module
The control module takes care of the interface, pipelining and handshake communication between the various
blocks and I/O pins. Data and parity are read serially into the memory, and it is assumed that the data is received
in the same order as it was transmitted from the encoder. Signal blocksizeset(ipcfgset) initializes the blocksize by
specifying the size of the block to be input to the decoder. Input data can be given only when rfi is asserted. Input
data has to be qualified with inpvalid to be accepted by the core.
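The input handshake described above can be sketched as a simple cycle-by-cycle model. This is an assumption-laden illustration, not the core’s timing specification: it assumes that a symbol on din is accepted in any clock cycle in which both rfi and inpvalid are high, and the function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def accepted_symbols(cycles: Iterable[Tuple[bool, bool, int]]) -> List[int]:
    """Return the din values accepted by the modeled core.

    Each element of `cycles` is (rfi, inpvalid, din) for one clock cycle.
    Assumption: a symbol is accepted only when rfi and inpvalid are both high.
    """
    return [din for rfi, inpvalid, din in cycles if rfi and inpvalid]


if __name__ == "__main__":
    trace = [
        (False, True, 11),   # core not ready: symbol ignored
        (True, False, 12),   # ready, but data not qualified: ignored
        (True, True, 13),    # accepted
        (True, True, 14),    # accepted
    ]
    print(accepted_symbols(trace))
```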
Decoder and Interleaver
Once the data has been entered into the decoder, the decoder starts computing the LLR of each data symbol. The
LLR is computed for the block sequence twice: once using the non-interleaved data and the corresponding parity,
and then using the interleaved data and the corresponding parity. One round of this computation is called an
iteration. Each iteration is divided into two sections, an ODD window and an EVEN window. The LLR using the
non-interleaved data and its parity is computed during the ODD window, and the LLR using the interleaved data
is computed during the EVEN window. When both the ODD and EVEN window computations are done, one
iteration is complete. The user can set the number of iterations for each block on the iterations pin.
Every window makes use of the LLR information computed in the previous window and tries to improve on the
estimate of the LLR. During the second half of the iteration, the EVEN window, the LLR computed in the first half
is therefore refined. The interleaver is used in the second half of the iteration to generate the interleaved address.
This address is used to address the data and parity memory to read the interleaved data, and also to address the
LLR memory unit to fetch the previously computed LLR information. At the end of one iteration, the decoder has
an LLR value for each input data symbol. The second iteration starts again with the non-interleaved data and
parity bits, and uses the previously computed LLR to get a new estimate of the LLR for the data. Once the
decoder completes the required number of iterations, the LLR memory buffer holds the final LLR values. The sign
of each LLR value determines whether the data is a ‘1’ or a ‘0’: a positive sign means the data value is a ‘1’;
otherwise it is a ‘0’.
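The ODD/EVEN window sequence can be sketched as an address schedule. The code below only shows where the memory address comes from in each window (natural order in the ODD window, the interleaver output in the EVEN window); it does not model the MAP computation or its forward and reverse passes. The function name and the four-entry interleaver table are hypothetical and far smaller than any interleaver defined by the supported standards.

```python
from typing import Iterator, Sequence, Tuple


def window_schedule(
    interleaver: Sequence[int], iterations: int
) -> Iterator[Tuple[int, str, int]]:
    """Yield (iteration, window, address) in processing order.

    `interleaver[i]` is the memory address read at step i of the EVEN window.
    """
    block_size = len(interleaver)
    for iteration in range(1, iterations + 1):
        # ODD window: data, parity and LLR memories are addressed in natural order.
        for step in range(block_size):
            yield iteration, "ODD", step
        # EVEN window: the interleaver supplies the address for the same memories.
        for step in range(block_size):
            yield iteration, "EVEN", interleaver[step]


if __name__ == "__main__":
    toy_interleaver = [2, 0, 3, 1]  # hypothetical permutation for illustration only
    for entry in window_schedule(toy_interleaver, iterations=2):
        print(entry)
```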
Output Data Handshaking
After the decoder has completed the specified number of iterations and is ready to output data, signal rfo is
asserted high. The user can then assert signal rfno to read the decoded data, which allows the data to be output
on dout.
A synchronous reset signal, sr, is available to reinitialize the Turbo Decoder in the middle of processing a block.
The block being processed is completely discarded during this reset. The reset can be applied at any point during
operation.
Memory Buffer
The memory buffering for this IP is split into four sections. These sections are described in detail below.
Input Data/Parity Memory
The Turbo Decoder core requires a large amount of memory to store the input data block. An external memory is
therefore recommended so that on-chip memory can be used for other purposes, and an external memory
interface is provided in the IP. Either single or double buffer memory mode may be selected, depending on the
external memory available. Double buffer memory allows one block of data to be processed while another block
is written and read. It delivers better performance than the single buffer selection by minimizing the delay
between the processing of successive blocks.
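The benefit of double buffering can be estimated with a deliberately simplified timing model. The sketch assumes that each block needs a fixed number of cycles to load and a fixed number of cycles to process, that a single buffer forces loading and processing to alternate, and that a double buffer lets the next block load while the current one is processed. Output readout and handshake overhead are ignored, and the cycle counts in the usage lines are hypothetical.

```python
def single_buffer_cycles(n_blocks: int, load: int, process: int) -> int:
    """Total cycles when every block is loaded and then processed in sequence."""
    return n_blocks * (load + process)


def double_buffer_cycles(n_blocks: int, load: int, process: int) -> int:
    """Total cycles when loading block k+1 overlaps with processing block k."""
    if n_blocks == 0:
        return 0
    # First load cannot overlap with anything; the last block is processed alone.
    return load + (n_blocks - 1) * max(load, process) + process


if __name__ == "__main__":
    n, load, process = 10, 1000, 7000  # hypothetical figures for illustration
    print("single buffer:", single_buffer_cycles(n, load, process))
    print("double buffer:", double_buffer_cycles(n, load, process))
```

Under these assumptions the saving approaches one load time per block when processing takes longer than loading.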
Internal Memory
A small amount of internal memory is required to implement the interleaver and other necessary functions of the
Turbo Decoder. For example, the 3GPP configuration uses 4.6Kb spread over four memory blocks.
LLR Memory
After the Turbo Decoder completes the required number of iterations, the LLR memory buffer stores the final LLR
values. The size of the LLR memory buffer is dependent on configuration and block size.
Hard Decision Storage Memory
The Turbo Decoder IP core offers optional hard decision storage. When the LLR memory is used as an output
buffer, the decoder cannot go on to process the next block of data until the LLR values of the previous block
have been completely read out. This results in an extra processing delay of B cycles (B = blocksize). To minimize
this delay, the output data after hard decision can be stored in a separate memory, if memory can be spared, so
that the decoder can start operating on a new data block.
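To give a feel for the size of this B-cycle delay, the sketch below converts it to time. The values are illustrative assumptions: B = 5114 is the largest turbo code block size in 3GPP, and 30 MHz is the clock frequency quoted in the throughput feature of this guide. The function name is hypothetical.

```python
def readout_delay_us(blocksize: int, clk_mhz: float) -> float:
    """Extra delay of B clock cycles, expressed in microseconds."""
    if clk_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    return blocksize / clk_mhz


if __name__ == "__main__":
    # Assumed example values: 3GPP maximum block size at a 30 MHz clock.
    print(f"{readout_delay_us(5114, 30.0):.1f} us per block")
```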
Operational Data Flow
The following flow diagram describes the sequence for every block introduced into the Turbo Decoder core.