PDSP16510A MA
Stand Alone FFT Processor
Advance Information
DS3762
ISSUE 3.0
October 1998
The PDSP16510 performs Forward or Inverse Fast
Fourier Transforms on complex or real data sets containing up
to 1024 points. Data and coefficients are each represented by
16 bits, with block floating point arithmetic for increased
dynamic range.
An internal RAM is provided which can hold up to 1024
complex data points. This removes the memory transfer
bottleneck inherent in building block solutions. Its organisation
allows the PDSP16510 to simultaneously input new data,
transform data stored in the RAM, and output previous
results. No external buffering is needed for transforms
containing up to 256 points, and the PDSP16510 can be directly
connected to an A/D converter to perform continuous
transforms. The user can choose to overlap data blocks by either
0%, 50%, or 75%. Inputs and outputs are asynchronous to the
40MHz system clock used for internal operations.
A 1024 point complex transform can be completed in
approximately 98µs, which is equivalent to a throughput rate of
450 million operations per second. Multiple devices can be
connected in parallel in order to increase the sampling rate up to
the 40MHz system clock. Six devices are needed to give the
maximum performance with 1024 point transforms.
Either a Hamming or a Blackman-Harris window operator
can be internally applied to the incoming real or complex data.
The latter gives 67dB side lobe attenuation. The operator
values are calculated internally and do not require an external
ROM nor do they incur any time penalty.
The device outputs the real and imaginary components of
the frequency bins. These can be directly connected to the
PDSP16330 in order to produce magnitude and phase values
from the complex data.
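As an illustration of the conversion performed downstream of the transform, the following Python sketch computes magnitude and phase from one real/imaginary pair. It is a floating point model only. The function name `magnitude_phase` is hypothetical, and the sketch does not attempt to reproduce the internal arithmetic or output format of the PDSP16330.

```python
import math

def magnitude_phase(re, im):
    """Return (magnitude, phase in radians) for one frequency bin.

    re, im: real and imaginary components, e.g. signed 16 bit values.
    """
    return math.hypot(re, im), math.atan2(im, re)

# Example with one bin from the R15:0 / I15:0 outputs (values are made up).
mag, ph = magnitude_phase(3000, -4000)
print(mag, ph)
```

The phase is returned in radians in the range -pi to +pi; any fixed point phase format would need a further scaling step.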
[Block diagram: data input, 3 term window operator, coefficient ROM, two workspace RAMs, four data paths, output buffer, result output.]
Fig. 1. Block Diagram
FEATURES
Completely self contained FFT Processor
Internal RAM supports up to 1024 complex points
16 bit data and coefficients plus block floating point for increased dynamic range
450 MIP operation gives 98 microsecond transformation times for 1024 points
Up to 40MHz sampling rates with multiple A grade devices
Internal window operator gives 67dB side lobe attenuation and needs no external ROM
132 pin surface mount package
Rev   A  B  C  D
Date  MAR 1993  JAN 1997  OCT 1998
NOTE
Polyimide is used as an inter-layer dielectric and as
glassivation.
Polymeric material is also used for die attach, which, according
to the requirement in paragraph 1.2.1.b. (2), precludes
categorising this device as fully compliant. In every other
respect this device has been manufactured and screened in
full accordance with the requirements of Mil-Std 883 (latest
revision).
CHANGE NOTIFICATION
The change notification requirements of MIL-PRF-38535 will
be implemented on this device type. Known customers will be
notified of any changes since the last buy when ordering
further parts if significant changes have been made.
ORDERING INFORMATION
PDSP16510A MA GCPR
(Power Ceramic QFP Package
- HIREL LEVEL A Screening)
PDSP16510A MA AC1R
(Power Ceramic PGA Package
- HIREL LEVEL A Screening)
[System diagram labels: analog input, sample clock, A/D, PDSP16510 (D15:0, AUX15:0 with configuration word, DIS, DOS, INEN, DEF, DEN, DAV, S3:0, R15:0, I15:0), PDSP16330 (X, Y, CLK; magnitude and phase outputs), RESET, GND, scale value available.]
Fig. 2. Typical 256 Point Real Only System Performing Continuous Transforms
ROW  PINS
N    D9    D10   D12   D14   DIS   VDD   DAV   GND   AUX0  AUX2  AUX4  AUX6  AUX7
M    D8    D11   D13   D15   DEF   INEN  SCLK  AUX1  AUX3  AUX5  AUX8
L    D6    D7    AUX9  AUX10
K    D4    D5    AUX11 AUX12
J    D2    D3    AUX13 AUX14
H    GND   D1    AUX15 GND
G    D0    LFLG  DEN   I15
F    VDD   R0    I14   VDD
E    R1    R2    I12   I13
D    R3    R4    I10   I11
C    R5    R6    I8    I9
B    R7    R10   R12   R14   S0    DOS   S2    I0    I2    I4    I7
A    R8    R9    R11   R13   R15   VDD   S1    GND   S3    I1    I3    I5    I6
Pin Out for 84 PGA Package (AC84) - bottom view
PIN  FUNC    PIN  FUNC    PIN  FUNC    PIN  FUNC    PIN  FUNC    PIN  FUNC
1    VDD     23   AUX13   45   GND     67   D8      89   GND     111  GND
2    GND     24   GND     46   VDD     68   D7      90   R3      112  S1
3    I7      25   AUX12   47   SCLK    69   D6      91   VDD     113  GND
4    I8      26   GND     48   GND     70   D5      92   R4      114  N/C
5    I9      27   AUX11   49   GND     71   GND     93   GND     115  DOS
6    I10     28   VDD     50   DAV     72   VDD     94   R5      116  VDD
7    VDD     29   GND     51   GND     73   D4      95   R6      117  S2
8    I11     30   AUX10   52   INEN    74   GND     96   R7      118  GND
9    GND     31   AUX9    53   VDD     75   D3      97   R8      119  S3
10   I12     32   AUX8    54   DEF     76   GND     98   GND     120  GND
11   VDD     33   AUX7    55   GND     77   D2      99   VDD     121  VDD
12   I13     34   VDD     56   DIS     78   GND     100  R9      122  I0
13   GND     35   AUX6    57   VDD     79   D1      101  VDD     123  I1
14   I14     36   VDD     58   D15     80   VDD     102  R10     124  GND
15   VDD     37   AUX5    59   D14     81   D0      103  R11     125  I2
16   I15     38   GND     60   GND     82   LFLG    104  R12     126  I3
17   GND     39   AUX4    61   D13     83   GND     105  R13     127  I4
18   DEN     40   AUX3    62   D12     84   R0      106  GND     128  GND
19   AUX15   41   AUX2    63   D11     85   GND     107  R14     129  VDD
20   GND     42   VDD     64   D10     86   R1      108  R15     130  I5
21   AUX14   43   AUX1    65   VDD     87   VDD     109  DISAB   131  I6
22   GND     44   AUX0    66   D9      88   R2      110  S0      132  VDD
Pin Out for 132 Leaded Chip Carrier (GC132)
SIGNAL    TYPE  DESCRIPTION
D15:0     I     Data input during real only mode; the real component in complex data mode.
AUX15:0   I     When DEF is active AUX15:0 are used to define the operating mode as defined in Table 3.
                When DEF is inactive AUX15:0 either provide the 16 bit imaginary component of complex
                input data, or a second set of real only inputs.
R15:0     O     These pins output the real component of the transformed data when DAV and DEN are active.
                Otherwise they are high impedance.
I15:0     O     These pins output the imaginary component of the transformed data when DAV and DEN are
                active. Otherwise they are high impedance.
DEF       I     The high going edge of DEF is used to internally latch the contents of AUX15:0, which then
                define the operating mode. In the simplest system DEF is a power on reset. When DEF is low
                the internal control logic is reset.
SCLK      I     System clock used for internal computations.
S3:0      O     These pins indicate the number of shifts towards the binary point which have occurred as the
                result of the conditional scaling logic. When the data path right shift is restricted to 2 places
                per pass, state 15 is used to indicate an overflow and only a total of 14 shifts is possible.
LFLG      O     This flag indicates that data is being loaded into the device. It goes active in response to an
                INEN input, and may be programmed to go inactive after the complete data block, one quarter
                of it, or one half of it has been loaded.
INEN      I     The use of this input is mode dependent. It is either used as an active low load enabling
                signal for the DIS strobe, or it is used to initiate a new block load operation.
DIS       I     The rising edge of this asynchronous input is used to load data into the device.
DOS       I     The rising edge of this asynchronous input is used to dump data from the device. In most
                applications it may be tied to the DIS input, even if the output rate must be higher than the
                input rate because of overlapped data blocks. The DIS input is then internally divided down.
DAV       O     An active low signal that indicates that a transform is complete. Transformed data will then
                be output in normal sequential order using DOS. It may be optionally programmed to be
                delayed by 24 DOS strobes to match the delay through a PDSP16330.
DEN       I     This input is used to enable the data dump operation when DAV has gone active. If it is tied
                low the device will automatically dump data when DAV goes active. Otherwise the device will
                wait for the enabling signal to go low before the dump operation commences.
DISAB     I     Only available in the 132 pin GC package. When high the block floating logic is disabled.
VDD       P     +5V pins
GND       P     Ground pins
NOTE.
All references to DEF, INEN, DAV, and DEN within the text do not contain the bar designator, signifying an active low
signal. This is considered to be implied by the signal name and is not meant to imply a change in the signal function.
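The S3:0 description can be illustrated with a short Python sketch that restores a block of results to a common scale so that successive blocks can be compared. It assumes that each reported shift corresponds to a division of the whole block by two, so the correction is a multiplication by 2 to the power S. The function name `rescale_block` and the `two_place_mode` flag are hypothetical; the treatment of state 15 follows the S3:0 description above.

```python
def rescale_block(real, imag, shifts, two_place_mode=False):
    """Undo block floating point scaling for one transformed block.

    real, imag: lists of signed 16 bit outputs (R15:0 and I15:0).
    shifts: value read from S3:0 for this block (0 to 15).
    two_place_mode: True when right shifts are restricted to 2 places
        per pass, in which case state 15 flags an overflow.
    """
    if not 0 <= shifts <= 15:
        raise ValueError("S3:0 is a 4 bit value")
    if two_place_mode and shifts == 15:
        raise OverflowError("S3:0 state 15 indicates an overflow")
    scale = 1 << shifts  # assumption: one shift equals division by two
    return [complex(r * scale, i * scale) for r, i in zip(real, imag)]

block = rescale_block([1, -2], [3, 4], shifts=3)
print(block)
```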
FUNCTIONAL OPERATION
The PDSP16510 performs decimation in time, radix 4,
forward or inverse Fast Fourier Transforms. Data is loaded
into an internal workspace RAM in normal sequential order,
processed, and then dumped in the correct order. With real
only input data the processing time can approximately be
halved for a given transform size. Two real inputs then replace a
single complex input, and are processed in parallel.
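The text does not say how the two real inputs are separated after the transform. A common textbook technique, shown here only as an assumption about what such a mode could do, packs the two real sequences into the real and imaginary parts of one complex sequence and separates the two spectra using conjugate symmetry. The function names are hypothetical, and a direct DFT is used purely to keep the sketch self contained.

```python
import cmath

def dft(x):
    """Direct O(N^2) discrete Fourier transform, for illustration only."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def two_real_dfts(a, b):
    """Transform two real sequences with a single complex transform."""
    n = len(a)
    z = dft([complex(p, q) for p, q in zip(a, b)])
    fa, fb = [], []
    for k in range(n):
        zk = z[k]
        zc = z[-k % n].conjugate()   # conj(Z[N-k]), with index 0 mapping to 0
        fa.append((zk + zc) / 2)     # spectrum of a
        fb.append((zk - zc) / 2j)    # spectrum of b
    return fa, fb

a = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
b = [0.0, 1.0, 1.0, 0.0, -1.0, -1.0, 2.0, 0.5]
fa, fb = two_real_dfts(a, b)
```

The separation works because the transform of a real sequence satisfies X[N-k] = conj(X[k]), so the sum and difference of Z[k] and conj(Z[N-k]) isolate the two spectra.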
Either a Blackman Harris or a Hamming window can be
generated internally, and applied to the incoming real or complex
data with no time penalty. No external ROM is needed to support
these windows. The Blackman Harris window gives improved
dynamic range over the Hamming window when two closely
spaced frequencies are to be detected, and one is of smaller
magnitude than the other. It does, however, reduce the actual
frequency resolution, and the Hamming window may then be
preferable.
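The two window shapes can be sketched in Python as follows. The device's internal operator values are not given in this document, so the coefficients below are assumptions: the usual Hamming values (0.54, 0.46) and the commonly published 3 term Blackman-Harris set associated with 67dB side lobes (0.42323, 0.49755, 0.07922). The periodic form, with a divisor of N, is also an assumption.

```python
import math

BH3 = (0.42323, 0.49755, 0.07922)  # assumed 3 term Blackman-Harris coefficients

def hamming(n):
    """Hamming window of n points (periodic form)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / n) for i in range(n)]

def blackman_harris_3term(n):
    """3 term Blackman-Harris window of n points (periodic form)."""
    a0, a1, a2 = BH3
    return [a0
            - a1 * math.cos(2 * math.pi * i / n)
            + a2 * math.cos(4 * math.pi * i / n)
            for i in range(n)]

def apply_window(samples, window):
    """Multiply each incoming sample by the matching window value."""
    return [s * w for s, w in zip(samples, window)]

w_ham = hamming(256)
w_bh = blackman_harris_3term(256)
print(w_ham[0], w_bh[0], w_bh[128])
```

The Blackman-Harris window starts much closer to zero than the Hamming window, which is consistent with its lower side lobes and wider main lobe.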
Data in and out of the device is represented by 16 bit real
and imaginary components, with 16 bit sine and cosine values
contained in an internal ROM. Conditional scaling, coupled
with word growth through the butterfly data path, gives
increased dynamic range. Transforms can be computed with
sample sizes of either 256 or 1024 data points. The 256 point
option can alternatively be used to simultaneously execute
either four 64 point transforms, or sixteen 16 point transforms.
The 16 point mode can only be used with a rectangular
window, and no overlapping of data blocks is possible.
The device can be configured either to perform continuous
transforms in a real time application, or as a slave processor
to a more general purpose signal processing system. In the
continuous mode, with transform sizes of 256 points or less,
it contains three internal control units which simultaneously
allow new data to be loaded, present data to be transformed,
and previous results to be dumped. Additional external
input/output buffering is not needed. The internal input buffer also
allows data blocks to be overlapped by either 50% or 75%,
in addition to the mode with no overlap.
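The effect of the overlap options on a continuous sample stream can be sketched in Python. This models only how successive blocks share samples; the function name `overlapped_blocks` is hypothetical and nothing here represents the device's internal buffer organisation.

```python
def overlapped_blocks(samples, block_size, overlap_percent):
    """Split a sample stream into blocks overlapped by 0, 50 or 75 percent."""
    if overlap_percent not in (0, 50, 75):
        raise ValueError("overlap must be 0, 50 or 75 percent")
    # Number of new samples needed before the next block can start.
    step = block_size * (100 - overlap_percent) // 100
    last_start = len(samples) - block_size
    return [samples[s:s + block_size] for s in range(0, last_start + 1, step)]

stream = list(range(16))
print(len(overlapped_blocks(stream, 8, 0)))    # 2
print(len(overlapped_blocks(stream, 8, 50)))   # 3
print(len(overlapped_blocks(stream, 8, 75)))   # 5
```

With 50% overlap each new block needs only half a block of fresh samples, and with 75% only a quarter, which is why results must be dumped at a higher rate than data is loaded.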
When 1024 point transforms are to be calculated, without loss
of incoming data during the transform time, it is necessary to
use an input buffer. This requirement is satisfied by a single
PDSP16540 support device.
In any of the real or complex modes it is possible to obtain
higher performance by connecting devices in parallel. It is then
possible to increase the sampling rate to that of the system
clock used for internal operations.
The mode of operation of the device is controlled by 16
bits in a control register. These are loaded through the
AUX15:0 port when the control signal DEF is active (low). This
port is also used to provide the imaginary component of
complex input data, and, if complex transforms are to be
performed, an external tristate buffer will be needed to isolate
the control information. The buffer should only be enabled when
DEF is active. DEF is also used to initialise the internal
circuitry, and can be a simple power on reset if control
parameters need not be subsequently changed.
[Data path diagram (Fig. 3) labels: input select, RAM, sin/cos ROM, multiplier with 16 bit operands, first adder, register file, second adder, register file, third adder. Each adder gives a 19 bit result. The final 16 bit word is selected from bits 18 - 3 or bits 17 - 2. Diagram note: shift left until largest point has one sign bit.]
DATA PRECISION
During each pass of a radix-4 fast Fourier transform it is
possible for either component of a particular result to grow by
a factor of up to four in the first pass, and 5.242 in subsequent
passes. This is between two and three bits in each pass and
the data path must allow for this word growth to avoid any
possibility of overflow. At the end of the data path the word is
again reduced to 16 bits by discarding least significant bits.
Any unnecessary allowance for word growth, made to prevent
overflow, thus results in a loss of arithmetic precision, and has a
detrimental effect on the dynamic range achievable.
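The quoted growth factors can be checked with a few lines of Python. The derivation used here is the standard worst case argument and is an assumption about where the figures come from: in the first pass the four inputs are only added or subtracted, giving a factor of 4, while in later passes three of the four inputs are rotated by a twiddle factor, which can raise one component by a factor of the square root of 2, giving 1 + 3 * sqrt(2).

```python
import math

first_pass_growth = 4.0
later_pass_growth = 1 + 3 * math.sqrt(2)   # roughly 5.2426

# Word growth expressed in bits is the base 2 logarithm of the factor.
first_pass_bits = math.log2(first_pass_growth)
later_pass_bits = math.log2(later_pass_growth)

print(first_pass_bits, later_pass_bits)
```

The first pass needs exactly 2 bits and later passes need about 2.4 bits, which is why a choice between 2 bit and 3 bit growth arises.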
In practice these large word growths only occur when
bipolar complex square waves are transformed, and even
then will not occur on every pass. The PDSP16510 compromises
by allowing a 2 bit word growth during the butterfly
calculation in the first pass. This is equivalent to ignoring the
most significant bit of the 19 bit final result, which is assumed
to be an extra sign bit, and then selecting the next 16 bits for
storage. In subsequent passes a Control Register Bit allows
the user to continue to select these 16 bits, or instead to use
the 16 most significant bits. The latter option is equivalent to
a 3 bit word growth. The 2 or 3 bit word growth option applies
to ALL subsequent passes and is not a per pass option.
Fig. 3 One of Four Data Paths (selection controlled by Control Register bit 3)
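The bit selection described above can be modelled in Python. This is a behavioural sketch only: the function names are hypothetical, and the wrap-around behaviour on overflow in the 2 bit case is an assumption that follows from simply ignoring the most significant bit.

```python
def select_16_bits(result19, three_bit_growth):
    """Pick the stored 16 bit word from a signed 19 bit adder result.

    three_bit_growth=False keeps bits 17 to 2 (bit 18 is ignored).
    three_bit_growth=True keeps bits 18 to 3 (the 16 most significant).
    """
    if not -(1 << 18) <= result19 < (1 << 18):
        raise ValueError("result does not fit in 19 bits")
    shift = 3 if three_bit_growth else 2
    word = (result19 >> shift) & 0xFFFF       # discard LSBs, keep 16 bits
    return word - 0x10000 if word & 0x8000 else word

def overflows_two_bit_option(result19):
    """True when bit 18 is not a copy of bit 17, so ignoring it loses data."""
    bit18 = (result19 >> 18) & 1
    bit17 = (result19 >> 17) & 1
    return bit18 != bit17

print(select_16_bits(1000, False), select_16_bits(1000, True))
print(select_16_bits(1 << 17, False), overflows_two_bit_option(1 << 17))
```

The 2 bit option keeps one more least significant bit, and therefore more precision, at the cost of a wrong result whenever the top two bits of the 19 bit value differ.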
If the 2 bit option is selected there is a possibility of
overflow occurring in one of the passes. The prediction of
overflow is mathematically difficult, and only occurs with
specific complex square waves. Scaling down the inputs
cannot be guaranteed to prevent overflow because of the