PDSP16515A
Stand Alone FFT Processor
Advance Information
DS3922
ISSUE 2.0
April 1999
Features
• Completely self contained FFT Processor
• Pin and functionally compatible with the PDSP16510A
• Expanded width internal RAM supports up to 1024 complex points
• 18 bit internal data bus with block floating point arithmetic for increased dynamic range
• 500 MIP operation gives 87 microsecond transformation times for 1024 points
• Up to 45MHz sampling rates with multiple devices
• Up to 85dB noise rejection
• A choice of internal window operators with no external ROM provide up to 67dB side lobe attenuation
• 84 pin PGA or 132 pin surface mount package

Associated Products
PDSP16330     Pythagoras Processor
PDSP16256     Programmable FIR Filter
PDSP16350     I/Q Splitter / NCO
PDSP16510A    Stand Alone FFT Processor

Ordering Information
PDSP16515A C0 AC      (Commercial - PGA Package)
PDSP16515A C0 GC      (Commercial - Leaded Chip Carrier)
PDSP16515A B0 AC      (Industrial - PGA Package)
PDSP16515A B0 GC      (Industrial - Leaded Chip Carrier)
PDSP16515A A0 AC      (Military - PGA Package)
PDSP16515A A0 GC      (Military - Leaded Chip Carrier)
PDSP16515A/MA/GCPR    (Military - Screened Leaded Chip Carrier. See separate datasheet for details)

The PDSP16515A performs Forward or Inverse Fast
Fourier Transforms on complex or real data sets
containing up to 1024 points. Input data and coefficients
are both represented by 16 bits. Data is expanded
internally to 18 bits and subject to Block Floating Point
arithmetic to preserve a greater dynamic range.
An internal RAM is provided which can hold up to 1024
complex data points. This removes the memory transfer
bottleneck inherent in building block solutions. Its
organisation allows the PDSP16515A to
simultaneously input new data, transform data stored in
the RAM, and to output previous results. No external
buffering is needed for transforms containing up to 256
points, and the PDSP16515A can be directly connected
to an A/D converter to perform continuous transforms.
The user can choose to overlap data blocks by either
0%, 50%, or 75%. Inputs and outputs are synchronous
to the 45MHz system clock used for internal operations.
A 1024 point complex transform can be completed in
approximately 87µs, which is equivalent to a throughput of
500 million operations per second. Multiple devices can
be connected in parallel in order to increase the
sampling rate up to the 45MHz system clock. Six
devices are needed to give the maximum performance
with 1024 point transforms.
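
The six device figure is consistent with a simple time budget. The sketch below assumes, purely for illustration, that a device handling 1024 points loads, transforms, and dumps each block one phase after another, with one point moved per 45MHz clock. That sequential model and the function name `devices_needed` are assumptions and are not taken from this datasheet.

```python
import math

def devices_needed(points, clock_hz, transform_s):
    """Estimate how many parallel devices keep up with one sample per clock.

    Assumed model (illustrative only): each device loads a block, transforms
    it, then dumps it, with no overlap of the three phases, and load/dump
    each move one point per clock period.
    """
    load_s = points / clock_hz            # time to clock one block in
    dump_s = points / clock_hz            # time to clock one block out
    busy_s = load_s + transform_s + dump_s
    # A new block arrives every load_s, so busy_s / load_s devices are needed.
    return math.ceil(busy_s / load_s)

n = devices_needed(points=1024, clock_hz=45e6, transform_s=87e-6)
print(n)
```

Under these assumptions each device is busy for about 132.5µs per block while a new block arrives every 22.8µs, which rounds up to six devices.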
Either a Hamming or a Blackman-Harris window
operator can be internally applied to the incoming real or
complex data. The latter gives 67dB side lobe
attenuation. The operator values are calculated
internally and do not require an external ROM nor do
they incur any time penalty.
The increased internal bus size together with block
floating point arithmetic produces up to 85dB of noise
rejection.
The device outputs the real and imaginary components
of the frequency bins. These can be directly connected
to the PDSP16330 in order to produce magnitude and
phase values from the complex data.
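
As a reference for what magnitude and phase mean here, the conversion from one real/imaginary output pair can be sketched as follows. This is only the mathematical relationship; it is not a description of how the PDSP16330 computes its results, and the sample values are hypothetical.

```python
import math

def magnitude_phase(real, imag):
    """Return (magnitude, phase in radians) for one frequency bin."""
    return math.hypot(real, imag), math.atan2(imag, real)

# Hypothetical bin values, chosen only for illustration.
mag, phase = magnitude_phase(3000, -4000)
print(mag, phase)
```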
Figure 1. Block Diagram (blocks shown: Data Input, 3 Term Window Operator, Coefficient ROM, two Workspace RAMs, Four Data Paths, Output Buffer, Result Output)
Figure 2. Typical 256 Point Real Only System Performing Continuous Transforms (labels shown: Analog Input, A/D, Sample Clock, PDSP16515 with RIN, DIS, DOS, DEN, DAV, S3:0, ROUT and IOUT, PDSP16330 with CLK, X, Y, Magnitude and Phase, GND, Scale Value Available)
N: D9, D10, D12, D14, DIS, VDD, DAV, GND, AUX0, AUX2, AUX4, AUX6, AUX7
M: D8, D11, D13, D15, DEF, INEN, SCLK, AUX1, AUX3, AUX5, AUX8
L: D6, D7, AUX9, AUX10
K: D4, D5, AUX11, AUX12
J: D2, D3, AUX13, AUX14
H: GND, D1, AUX15, GND
G: D0, LFLG, DEN, I15
F: VDD, R0, I14, VDD
E: R1, R2, I12, I13
D: R3, R4, I10, I11
C: R5, R6, I8, I9
B: R7, R10, R12, R14, S0, DOS, S2, I0, I2, I4, I7
A: R8, R9, R11, R13, R15, VDD, S1, GND, S3, I1, I3, I5, I6
Pin Out for 84 PGA Package (AC84 Power) - bottom view
PIN  FUNC     PIN  FUNC     PIN  FUNC     PIN  FUNC     PIN  FUNC     PIN  FUNC
1    VDD      23   AUX13    45   GND      67   D8       89   GND      111  GND
2    GND      24   VDD      46   VDD      68   D7       90   R3       112  S1
3    I7       25   AUX12    47   SCLK     69   D6       91   VDD      113  GND
4    I8       26   GND      48   GND      70   D5       92   R4       114  DOS
5    I9       27   AUX11    49   GND      71   GND      93   GND      115  DOS
6    I10      28   VDD      50   DAV      72   VDD      94   R5       116  VDD
7    VDD      29   GND      51   GND      73   D4       95   R6       117  S2
8    I11      30   AUX10    52   INEN     74   GND      96   R7       118  GND
9    GND      31   AUX9     53   VDD      75   D3       97   R8       119  S3
10   I12      32   AUX8     54   DEF      76   VDD      98   GND      120  GND
11   VDD      33   AUX7     55   GND      77   D2       99   VDD      121  VDD
12   I13      34   VDD      56   DIS      78   GND      100  R9       122  I0
13   GND      35   AUX6     57   VDD      79   D1       101  VDD      123  I1
14   I14      36   VDD      58   D15      80   VDD      102  R10      124  GND
15   VDD      37   AUX5     59   D14      81   D0       103  R11      125  I2
16   I15      38   GND      60   GND      82   LFLG     104  R12      126  I3
17   GND      39   AUX4     61   D13      83   GND      105  R13      127  I4
18   DEN      40   AUX3     62   D12      84   R0       106  GND      128  GND
19   AUX15    41   AUX2     63   D11      85   GND      107  R14      129  VDD
20   GND      42   VDD      64   D10      86   R1       108  R15      130  I5
21   AUX14    43   AUX1     65   VDD      87   VDD      109  DISAB    131  I6
22   GND      44   AUX0     66   D9       88   R2       110  S0       132  VDD
Pin Out for 132 Leaded Chip Carrier (GC132)
SIGNAL    TYPE  DESCRIPTION
D15:0     I     Data input during real only mode. The real component in complex data mode.
AUX15:0   I     When DEF is active AUX15:0 are used to define the operating mode as defined in Table 3.
                When DEF is in-active AUX15:0 either provide the 16 bit imaginary component of complex
                input data, or a second set of real only inputs.
R15:0     O     These pins output the real component of the transformed data when DAV and DEN are active.
                Otherwise they are high impedance.
I15:0     O     These pins output the imaginary component of the transformed data when DAV and DEN are
                active. Otherwise they are high impedance.
DEF       I     The high going edge of DEF is used to internally latch the contents of AUX15:0, which then
                define the operating mode. In the simplest system DEF is a power on reset. When DEF is low
                the internal control logic is reset.
SCLK      I     System clock used for internal computations.
S3:0      O     These pins indicate the number of shifts towards the binary point which have occurred as the
                result of the conditional scaling logic. When the data path right shift is restricted to 2 places
                per pass, state 15 is used to indicate an overflow and only a total of 14 shifts is possible.
LFLG      O     This flag indicates that data is being loaded into the device. It goes active in response to an
                INEN input, and may be programmed to go in-active after the complete, one quarter, or one
                half a data block has been loaded.
INEN      I     The use of this input is mode dependent. It is either used as an active low, load enabling,
                signal for the DIS strobe, or it is used to initiate a new block load operation.
DIS       I     The rising edge of this input is used to load data into the device.
DOS       I     The rising edge of this input is used to dump data from the device. In most applications it may
                be tied to the DIS input, even if the output rate must be higher than the input rate because of
                overlapped data blocks. The DIS input is then internally divided down.
DAV       O     An active low signal that indicates that a transform is complete. Transformed data will then
                be output in normal sequential order using DOS. It may be optionally programmed to be
                delayed by 24 DOS strobes to match the delay through a PDSP16330.
DEN       I     This input is used to enable the data dump operation when DAV has gone active. If it is tied
                low the device will automatically dump data when DAV goes active. Otherwise the device will
                wait for the enabling signal to go low before the dump operation commences.
DISAB     I     Only available in the 132 pin GC package. When high the block floating logic is disabled.
VDD       P     +5V pins
GND       P     Ground pins
NOTE.
References to DEF, INEN, DAV, and DEN within the text omit the bar designator that signifies an active
low signal. The active low sense is implied by the signal name, and the omission is not meant to imply a
change in the signal function.
Functional Operation
The PDSP16515A performs decimation in time, radix 4,
forward or inverse Fast Fourier Transforms. Data is loaded
into an internal workspace RAM in normal sequential order,
processed, and then dumped in the correct order. With real
only input data the processing time can be approximately halved
for a given transform size. Two real inputs then replace a single
complex input, and are processed in parallel.
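
For readers unfamiliar with the algorithm named above, a radix-4 decimation in time transform can be sketched in floating point as follows. This is a textbook recursive formulation written for clarity. It does not model the device's fixed point data paths, block floating point scaling, or internal ordering, and the function name `fft_radix4` is illustrative.

```python
import cmath

def fft_radix4(x, inverse=False):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4.

    Floating point illustration only: no fixed point effects, no scaling,
    and no 1/N normalisation on the inverse transform.
    """
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    # Decimation in time: split the input into four interleaved sub-sequences.
    subs = [fft_radix4(x[r::4], inverse) for r in range(4)]
    quarter = n // 4
    j = 1j * sign                      # the 4 point DFT factor W4
    out = [0j] * n
    for k in range(quarter):
        # Twiddle factors W^(r*k) applied to sub-transforms r = 0..3.
        t = [subs[r][k] * cmath.exp(sign * 2j * cmath.pi * r * k / n)
             for r in range(4)]
        # Radix-4 butterfly: only multiplications by +/-1 and +/-j.
        out[k] = t[0] + t[1] + t[2] + t[3]
        out[k + quarter] = t[0] + j * t[1] - t[2] - j * t[3]
        out[k + 2 * quarter] = t[0] - t[1] + t[2] - t[3]
        out[k + 3 * quarter] = t[0] - j * t[1] - t[2] + j * t[3]
    return out

# Arbitrary 16 point complex test input.
spectrum = fft_radix4([complex(i % 5, i % 3) for i in range(16)])
```

A 256 point transform takes four such passes and a 1024 point transform takes five, since each pass reduces the problem size by a factor of four.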
Either a Blackman-Harris or a Hamming window can be
generated internally, and applied to the incoming real or
complex data with no time penalty. No external ROM is
needed to support these windows. The Blackman-Harris
window gives improved dynamic range over the Hamming
window when two closely spaced frequencies are to be
detected, and one is of smaller magnitude than the other. It
does, however, reduce the actual frequency resolution, and
the Hamming window may then be preferable.
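
Both windows are weighted cosine sums, and the sketch below generates them in software for comparison. The coefficients are the commonly published textbook values (Hamming 0.54/0.46, and the three term Blackman-Harris set usually quoted for 67dB side lobes). They are assumptions here, as is the periodic (divide by N) form, because this text does not list the values the device computes internally.

```python
import math

def cosine_sum_window(n_points, coeffs):
    """Window w[n] = a0 - a1*cos(2*pi*n/N) + a2*cos(4*pi*n/N) - ..."""
    return [
        sum((-1) ** k * a * math.cos(2 * math.pi * k * n / n_points)
            for k, a in enumerate(coeffs))
        for n in range(n_points)
    ]

# Textbook coefficients (assumed; not taken from this datasheet).
HAMMING = (0.54, 0.46)
BLACKMAN_HARRIS_3TERM = (0.42323, 0.49755, 0.07922)

def apply_window(samples, coeffs):
    """Multiply each input sample by the matching window value."""
    window = cosine_sum_window(len(samples), coeffs)
    return [s * w for s, w in zip(samples, window)]

hamming_256 = cosine_sum_window(256, HAMMING)
bh_256 = cosine_sum_window(256, BLACKMAN_HARRIS_3TERM)
```

The Blackman-Harris window tapers much closer to zero at the block edges than the Hamming window, which is what buys the lower side lobes at the cost of a wider main lobe.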
Data in and out of the device is represented by 16 bit real and
imaginary components, with 16 bit sine and cosine values
contained in an internal ROM. Conditional scaling, coupled
with word growth through the butterfly data path, gives
increased dynamic range. Transforms can be computed with
sample sizes of either 256 or 1024 data points. The 256 point
option can alternatively be used to simultaneously execute
either four 64 point transforms, or sixteen 16 point transforms.
The 16 point mode can only be used with a rectangular
window, and no overlapping of data blocks is possible.
The device can be configured either to perform continuous
transforms in a real time application, or as a slave processor to
a more general purpose signal processing system. In the
continuous mode, with transform sizes of 256 points or less,
it contains three internal control units which simultaneously
allow new data to be loaded, present data to be transformed,
and previous results to be dumped. Additional external input/output
buffering is not needed. The internal input buffer also
allows data blocks to be overlapped by either 50% or 75%,
apart from the mode with no overlaps.
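
The effect of the overlap options on a sample stream can be sketched in software as below. The function name `overlapped_blocks` is hypothetical, and the sketch models only which samples fall into each block, not the device's buffering or strobe timing.

```python
def overlapped_blocks(samples, block_size, overlap):
    """Yield successive blocks in which the fraction `overlap` (0, 0.5 or
    0.75) of each block repeats the tail of the previous block."""
    if overlap not in (0, 0.5, 0.75):
        raise ValueError("overlap must be 0, 0.5 or 0.75")
    step = int(block_size * (1 - overlap))   # new samples per block
    for start in range(0, len(samples) - block_size + 1, step):
        yield samples[start:start + block_size]

# Sixteen samples, 8 point blocks, 50% overlap: each block adds 4 new samples.
blocks = list(overlapped_blocks(list(range(16)), block_size=8, overlap=0.5))
print(blocks)
```

With 50% overlap every block of N results corresponds to only N/2 new input samples, and with 75% overlap to N/4, which is why the output rate must exceed the input rate when blocks are overlapped.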
When 1024 point transforms are to be calculated, without loss
of incoming data during the transform time, it is necessary to
use an input buffer. This requirement can be satisfied by an
external buffer memory.
In any of the real or complex modes it is possible to obtain
higher performance by connecting devices in parallel. It is then
possible to increase the sampling rate to that of the system
clock used for internal operations.
The mode of operation of the device is controlled by 16 bits in
a control register. These are loaded through the AUX15:0 port
when the control signal DEF is active low. This port is also used
to provide the imaginary component of complex input data,
and, if complex transforms are to be performed, an external
tristate buffer will be needed to isolate the control information.
This should only be enabled when DEF is active. DEF is also
used to initialise the internal circuitry, and can be a simple
power on reset if control parameters need not be
subsequently changed.
Figure 3. One of Four Data Paths (elements shown: Input Select, RAM, Sin/Cos ROM, Multiplier, First, Second and Third Adders each with a 19 bit result, two Register Files, CR Bit 3 Select; annotation: "Shift left until largest point has one sign bit.")
Data Precision
During each pass of a radix-4 fast Fourier transform it is
possible for either component of a particular result to grow by
a factor of up to four in the first pass, and 5.242 in subsequent
passes. This is between two and three bits in each pass and
the data path must allow for this word growth to avoid any
possibility of overflow. At the end of the data path the word is
preserved at 18 bits and stored in the internal RAM. Any
unnecessary word growth to prevent overflow thus results in loss
of arithmetic precision, and has a detrimental effect on the
dynamic range achievable.
In practice these large word growths only occur when bipolar
complex square waves are transformed, and even then will
not occur on every pass. The PDSP16515A compromises by
allowing a 2 bit word growth during the butterfly calculation in
the first pass. This is equivalent to ignoring the most significant
bit of the 19 bit final result, which is assumed to be an extra sign
bit, and then selecting the next 18 bits for storage. In