FEATURES
Superscalar IEEE Floating-Point Processor
Off-Chip Harvard Architecture Maximizes Signal Processing Performance
30 ns, 33.3 MIPS Instruction Rate, Single-Cycle Execution
100 MFLOPS Peak, 66 MFLOPS Sustained Performance
1024-Point Complex FFT Benchmark: 0.58 ms
Divide (y/x): 180 ns
Inverse Square Root (1/√x): 270 ns
32-Bit Single-Precision and 40-Bit Extended-Precision IEEE Floating-Point Data Formats
32-Bit Fixed-Point Formats, Integer and Fractional, with 80-Bit Accumulators
IEEE Exception Handling with Interrupt on Exception
Three Independent Computation Units: Multiplier, ALU, and Barrel Shifter
Dual Data Address Generators with Indirect, Immediate, Modulo, and Bit Reverse Addressing Modes
Two Off-Chip Memory Transfers in Parallel with Instruction Fetch and Single-Cycle Multiply & ALU Operations
Multiply with Add & Subtract for FFT Butterfly Computation
Efficient Program Sequencing with Zero-Overhead Looping: Single-Cycle Loop Setup
Single-Cycle Register File Context Switch
15 (or 25) ns External RAM Access Time for Zero-Wait-State, 30 (or 40) ns Instruction Execution
IEEE JTAG Standard 1149.1 Test Access Port and On-Chip Emulation Circuitry
223-Pin PGA Package (Ceramic)
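
The timing figures in the feature list can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes one instruction per cycle at the 30 ns grade, and the reading of the peak rate as three floating-point operations per cycle (a multiply with a parallel add and subtract) is an inference from the feature list, not a statement made by this data sheet. All function and variable names are hypothetical.

```python
# Figures quoted in the FEATURES list for the 30 ns speed grade.
CYCLE_NS = 30.0


def mips(cycle_ns):
    """Millions of instructions per second, assuming one instruction per cycle."""
    return 1000.0 / cycle_ns


def cycles(duration_ns, cycle_ns=CYCLE_NS):
    """Number of instruction cycles that fit in a quoted duration."""
    return duration_ns / cycle_ns


instruction_rate = mips(CYCLE_NS)                    # about 33.3 MIPS
peak_ops_per_cycle = 100.0 / instruction_rate        # about 3 operations per cycle
sustained_ops_per_cycle = 66.0 / instruction_rate    # about 2 operations per cycle
divide_cycles = cycles(180.0)                        # divide (y/x)
rsqrt_cycles = cycles(270.0)                         # inverse square root
fft_cycles = cycles(0.58e6)                          # 1024-point complex FFT, 0.58 ms

print(f"{instruction_rate:.1f} MIPS, divide = {divide_cycles:.0f} cycles, "
      f"1/sqrt = {rsqrt_cycles:.0f} cycles, FFT = {fft_cycles:.0f} cycles")
```

Under these assumptions the divide and inverse square root figures correspond to whole numbers of instruction cycles.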
32/40-Bit IEEE Floating-Point DSP Microprocessor
ADSP-21020

FUNCTIONAL BLOCK DIAGRAM
[Block diagram: data address generators (DAG 1, DAG 2), instruction cache, program sequencer, JTAG test and emulation, timer, register file, and arithmetic units (ALU, multiplier, shifter), connected by the program memory address, data memory address, program memory data, and data memory data buses, which extend to the external address and data buses.]

GENERAL DESCRIPTION
The ADSP-21020 is the first member of Analog Devices’ family
of single-chip IEEE floating-point processors optimized for
digital signal processing applications. Its architecture is similar
to that of Analog Devices’ ADSP-2100 family of fixed-point
DSP processors.

Fabricated in a high-speed, low-power CMOS process, the
ADSP-21020 has a 30 ns instruction cycle time. With a
high-performance on-chip instruction cache, the ADSP-21020 can
execute every instruction in a single cycle.

The ADSP-21020 features:

• Independent Parallel Computation Units
The arithmetic/logic unit (ALU), multiplier and shifter
perform single-cycle instructions. The units are architecturally
arranged in parallel, maximizing computational throughput. A
single multifunction instruction executes parallel ALU and
multiplier operations. These computation units support IEEE
32-bit single-precision floating-point, extended precision
40-bit floating-point, and 32-bit fixed-point data formats.

• Data Register File
A general-purpose data register file is used for transferring
data between the computation units and the data buses, and
for storing intermediate results. This 10-port (16-register)
register file, combined with the ADSP-21020’s Harvard
architecture, allows unconstrained data flow between
computation units and off-chip memory.

• Single-Cycle Fetch of Instruction and Two Operands
The ADSP-21020 uses a modified Harvard architecture in
which data memory stores data and program memory stores
both instructions and data. Because of its separate program
and data memory buses and on-chip instruction cache, the
processor can simultaneously fetch an operand from data
memory, an operand from program memory, and an
instruction from the cache, all in a single cycle.

• Memory Interface
Addressing of external memory devices by the ADSP-21020 is
facilitated by on-chip decoding of high-order address lines to
generate memory bank select signals. Separate control lines
are also generated for simplified addressing of page-mode
DRAM.
The ADSP-21020 provides programmable memory wait
states, and external memory acknowledge controls allow
interfacing to peripheral devices with variable access times.
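
As a rough illustration of how programmable wait states relate to device access time, the sketch below picks the smallest wait-state count for a given memory. It assumes that each wait state extends the access by one full instruction cycle, which this data sheet does not state explicitly; the 15 ns zero-wait-state access time and 30 ns cycle come from the feature list, and the 0 to 7 range is the one given for memory banks in the System Interface description. The function names are hypothetical.

```python
MAX_WAIT_STATES = 7  # banks are programmable for 0-7 wait states


def access_window_ns(zero_wait_ns, cycle_ns, wait_states):
    """Access time available to a device.

    Assumption: every wait state adds exactly one instruction cycle.
    """
    if not 0 <= wait_states <= MAX_WAIT_STATES:
        raise ValueError("wait states must be in the range 0-7")
    return zero_wait_ns + wait_states * cycle_ns


def wait_states_needed(device_ns, zero_wait_ns=15.0, cycle_ns=30.0):
    """Smallest wait-state count that covers device_ns, or None if 7 is not enough."""
    for n in range(MAX_WAIT_STATES + 1):
        if access_window_ns(zero_wait_ns, cycle_ns, n) >= device_ns:
            return n
    return None  # such a device would rely on the acknowledge input instead


print(wait_states_needed(15.0), wait_states_needed(70.0), wait_states_needed(300.0))
```

A device too slow for the maximum programmed count is the case the external memory acknowledge controls are meant to cover.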
REV. C
Information furnished by Analog Devices is believed to be accurate and
reliable. However, no responsibility is assumed by Analog Devices for its
use, nor for any infringements of patents or other rights of third parties
which may result from its use. No license is granted by implication or
otherwise under any patent or patent rights of Analog Devices.
One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106, U.S.A.
Tel: 617/329-4700
Fax: 617/326-8703
• Instruction Cache
The ADSP-21020 includes a high performance instruction
cache that enables three-bus operation for fetching an
instruction and two data values. The cache is selective: only
the instructions whose fetches conflict with program memory
data accesses are cached. This allows full-speed execution
of core, looped operations such as digital filter
multiply-accumulates and FFT butterfly processing.

• Hardware Circular Buffers
The ADSP-21020 provides hardware to implement circular
buffers in memory, which are common in digital filters and
Fourier transform implementations. It handles address
pointer wraparound, reducing overhead (thereby increasing
performance) and simplifying implementation. Circular
buffers can start and end at any location.

• Flexible Instruction Set
The ADSP-21020’s 48-bit instruction word accommodates a
variety of parallel operations, for concise programming. For
example, the ADSP-21020 can conditionally execute a
multiply, an add, a subtract and a branch in a single
instruction.

DEVELOPMENT SYSTEM
The ADSP-21020 is supported with a complete set of software
and hardware development tools. The ADSP-21000 Family
Development System includes development software, an
evaluation board and an in-circuit emulator.

• Assembler
Creates relocatable, COFF (Common Object File Format)
object files from ADSP-21xxx assembly source code. It
accepts standard C preprocessor directives for conditional
assembly and macro processing. The algebraic syntax of the
ADSP-21xxx assembly language facilitates coding and
debugging of DSP algorithms.

• Linker/Librarian
The Linker processes separately assembled object files and
library files to create a single executable program. It assigns
memory locations to code and to data in accordance with a
user-defined architecture file that describes the memory and
I/O configuration of the target system. The Librarian allows
you to group frequently used object files into a single library
file that can be linked with your main program.

• Simulator
The Simulator performs interactive, instruction-level
simulation of ADSP-21xxx code within the hardware
configuration described by a system architecture file. It flags
illegal operations and supports full symbolic disassembly. It
provides an easy-to-use, window oriented, graphical user
interface that is identical to the one used by the ADSP-21020
EZ-ICE Emulator. Commands are accessed from pull-down
menus with a mouse.

• PROM Splitter
Formats an executable file into files that can be used with an
industry-standard PROM programmer.

• C Compiler and Runtime Library
The C Compiler complies with ANSI specifications. It takes
advantage of the ADSP-21020’s high-level language architectural
features and incorporates optimizing algorithms to speed
up the execution of code. It includes an extensive runtime
library with over 100 standard and DSP-specific functions.

• C Source Level Debugger
A full-featured C source level debugger that works with the
simulator or EZ-ICE emulator to allow debugging of
assembler source, C source, or mixed assembler and C.

• Numerical C Compiler
Supports ANSI Standard (X3J11.1) Numerical C as defined
by the Numeric C Extensions Group. The compiler accepts C
source input containing Numerical C extensions for array
selection, vector math operations, complex data types,
circular pointers, and variably dimensioned arrays, and
outputs ADSP-21xxx assembly language source code.

• ADSP-21020 EZ-LAB® Evaluation Board
The EZ-LAB Evaluation Board is a general-purpose, stand-alone
ADSP-21020 system that includes 32K words of
program memory and 32K words of data memory as well as
analog I/O. A PC RS-232 download path enables the user to
download and run programs directly on the EZ-LAB. In
addition, it may be used in conjunction with the EZ-ICE
Emulator to provide a powerful software debug environment.

• ADSP-21020 EZ-ICE® Emulator
This in-circuit emulator provides the system designer with a
PC-based development environment that allows nonintrusive
access to the ADSP-21020’s internal registers through the
processor’s 5-pin JTAG Test Access Port. This use of on-chip
emulation circuitry enables reliable, full-speed performance in
any target. The emulator uses the same graphical user interface
as the ADSP-21020 Simulator, allowing an easy transition
from software to hardware debug. (See “Target System
Requirements for Use of EZ-ICE Emulator” on page 27.)

ADDITIONAL INFORMATION
This data sheet provides a general overview of ADSP-21020
functionality. For additional information on the architecture and
instruction set of the processor, refer to the ADSP-21020 User’s
Manual. For development system and programming reference
information, refer to the ADSP-21000 Family Development
Software Manuals and the ADSP-21020 Programmer’s Quick
Reference. Applications code listings and benchmarks for key
DSP algorithms are available on the DSP Applications BBS; call
(617) 461-4258, 8 data bits, no parity, 1 stop bit, 300/1200/
2400/9600 baud.

ARCHITECTURE OVERVIEW
Figure 1 shows a block diagram of the ADSP-21020. The
processor features:
• Three Computation Units (ALU, Multiplier, and Shifter)
  with a Shared Data Register File
• Two Data Address Generators (DAG 1, DAG 2)
• Program Sequencer with Instruction Cache
• 32-Bit Timer
• Memory Buses and Interface
• JTAG Test Access Port and On-Chip Emulation Support

Computation Units
The ADSP-21020 contains three independent computation
units: an ALU, a multiplier with fixed-point accumulator, and a
shifter. In order to meet a wide variety of processing needs, the
computation units process data in three formats: 32-bit
fixed-point, 32-bit floating-point and 40-bit floating-point. The
floating-point operations are single-precision IEEE-compatible
(IEEE Standard 754/854). The 32-bit floating-point format is
the standard IEEE format, whereas the 40-bit IEEE
extended-precision format has eight additional LSBs of mantissa for
greater accuracy.

[Figure 1. ADSP-21020 Block Diagram: cache memory (32 x 48), DAG 1 (8 x 4 x 32), DAG 2 (8 x 4 x 24), program sequencer, timer, flags, JTAG test and emulation, register file (16 x 40), floating- and fixed-point multiplier with fixed-point accumulator, 32-bit barrel shifter, and floating-point and fixed-point ALU, connected by the PMA (24-bit), DMA (32-bit), PMD (48-bit), and DMD (40-bit) buses and a bus connect.]

EZ-LAB and EZ-ICE are registered trademarks of Analog Devices, Inc.
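
To give a feel for what eight additional mantissa LSBs buy, the sketch below rounds a value to 23 stored fraction bits (IEEE single precision) and to 31 stored fraction bits (23 plus the eight extra LSBs described for the 40-bit format above). It is a numerical illustration only: it models the mantissa width, ignores exponent range limits and special values, and the function name `quantize` is hypothetical.

```python
import math


def quantize(x, fraction_bits):
    """Round x to the nearest value with the given number of stored fraction bits."""
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)   # x = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    scale = 2 ** (fraction_bits + 1)     # +1 accounts for the implicit leading bit
    return math.ldexp(round(mantissa * scale) / scale, exponent)


SINGLE_FRACTION_BITS = 23     # IEEE 754 single precision
EXTENDED_FRACTION_BITS = 31   # assumed: 23 bits plus eight additional LSBs

x = 1.0 / 3.0
err32 = abs(quantize(x, SINGLE_FRACTION_BITS) - x)
err40 = abs(quantize(x, EXTENDED_FRACTION_BITS) - x)
print(f"32-bit error {err32:.3e}, 40-bit error {err40:.3e}")
```

For this input the rounding error shrinks by a factor close to 2^8, as expected from eight extra bits.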
The multiplier performs floating-point and fixed-point
multiplication as well as fixed-point multiply/add and multiply/
subtract operations. Integer products are 64 bits wide, and the
accumulator is 80 bits wide. The ALU performs 45 standard
arithmetic and logic operations, supporting both fixed-point and
floating-point formats. The shifter performs 19 different
operations on 32-bit operands. These operations include logical
and arithmetic shifts, bit manipulation, field deposit, and extract
and derive exponent operations.
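
The gap between the 64-bit product and the 80-bit accumulator leaves 16 bits of headroom. The sketch below models this with Python integers to show how many worst-case products can be summed before the total no longer fits. It assumes signed two's complement 32-bit operands and is a behavioral model with hypothetical names, not a description of the multiplier's actual overflow handling.

```python
PRODUCT_BITS = 64
ACCUMULATOR_BITS = 80
GUARD_BITS = ACCUMULATOR_BITS - PRODUCT_BITS   # 16 bits of headroom


def accumulate(pairs, bits=ACCUMULATOR_BITS):
    """Sum the products of (a, b) pairs and report whether the sum fits in `bits`."""
    total = sum(a * b for a, b in pairs)
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return total, lowest <= total <= highest


# The most negative 32-bit integer squared gives the largest product, 2**62.
WORST = -(1 << 31)

_, fits_80 = accumulate([(WORST, WORST)] * (1 << GUARD_BITS))
_, fits_64 = accumulate([(WORST, WORST)] * 2, bits=PRODUCT_BITS)
print(f"{1 << GUARD_BITS} worst-case products fit in 80 bits: {fits_80}")
print(f"2 worst-case products fit in 64 bits: {fits_64}")
```

In this model 2^16 worst-case products still fit in 80 bits, whereas a 64-bit accumulator would overflow on the second one.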
The computation units perform single-cycle operations; there is
no computation pipeline. The three units are connected in
parallel rather than serially, via multiple-bus connections with
the 10-port data register file. The output of any computation
unit may be used as the input of any unit on the next cycle. In a
multifunction computation, the ALU and multiplier perform
independent, simultaneous operations.
Data Register File
The ADSP-21020’s general-purpose data register file is used for
transferring data between the computation units and the data
buses, and for storing intermediate results. The register file has
two sets (primary and alternate) of sixteen 40-bit registers each,
for fast context switching.
With a large number of buses connecting the registers to the
computation units, data flow between computation units and
from/to off-chip memory is unconstrained and free from
bottlenecks. The 10-port register file and Harvard architecture
of the ADSP-21020 allow the following nine data transfers to be
performed every cycle:
• Off-chip read/write of two operands to or from the register file
• Two operands supplied to the ALU
• Two operands supplied to the multiplier
• Two results received from the ALU and multiplier (three, if
  the ALU operation is a combined addition/subtraction)
The processor’s 48-bit orthogonal instruction word supports
fully parallel data transfer and arithmetic operations in the same
instruction.

Address Generators and Program Sequencer
Two dedicated address generators and a program sequencer
supply addresses for memory accesses. Because of this, the
computation units need never be used to calculate addresses.
Because of its instruction cache, the ADSP-21020 can
simultaneously fetch an instruction and data values from both
off-chip program memory and off-chip data memory in a single
cycle.
The data address generators (DAGs) provide memory addresses
when external memory data is transferred over the parallel
memory ports to or from internal registers. Dual data address
generators enable the processor to output two simultaneous
addresses for dual operand reads and writes. DAG 1 supplies
32-bit addresses to data memory. DAG 2 supplies 24-bit
addresses to program memory for program memory data
accesses.
Each DAG keeps track of up to eight address pointers, eight
modifiers, eight buffer length values and eight base values. A
pointer used for indirect addressing can be modified by a value
in a specified register, either before (premodify) or after
(postmodify) the access. To implement automatic modulo
addressing for circular buffers, the ADSP-21020 provides buffer
length registers that can be associated with each pointer. Base
values for pointers allow circular buffers to be placed at arbitrary
locations. Each DAG register has an alternate register that can
be activated for fast context switching.
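
The modulo addressing described above can be sketched in software as follows. This is a behavioral illustration under simple assumptions: the class and attribute names are hypothetical, only the postmodify case is modeled, and the wraparound is written as a plain modulo relative to the base value rather than as a description of the actual DAG logic.

```python
class CircularPointer:
    """Software model of one address pointer with its modifier, length and base values."""

    def __init__(self, base, length, modifier=1):
        self.base = base          # first address of the circular buffer
        self.length = length      # number of locations in the buffer
        self.modifier = modifier  # signed step applied after each access
        self.index = base         # current pointer value

    def post_modify(self):
        """Return the current address, then step the pointer with wraparound."""
        address = self.index
        offset = (self.index - self.base + self.modifier) % self.length
        self.index = self.base + offset
        return address


forward = CircularPointer(base=0x100, length=4)
print([hex(forward.post_modify()) for _ in range(6)])

backward = CircularPointer(base=0x100, length=4, modifier=-1)
print([hex(backward.post_modify()) for _ in range(3)])
```

Because the wrap is computed relative to the base value, the buffer can sit at any address, which matches the statement that base values allow circular buffers to be placed at arbitrary locations.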
The program sequencer supplies instruction addresses to
program memory. It controls loop iterations and evaluates
conditional instructions. To execute looped code with zero
overhead, the ADSP-21020 maintains an internal loop counter
and loop stack. No explicit jump or decrement instructions are
required to maintain the loop.
The ADSP-21020 derives its high clock rate from pipelined
fetch, decode and execute cycles. Approximately 70% of the
machine cycle is available for memory accesses; consequently,
ADSP-21020 systems can be built using slower and therefore
less expensive memory chips.
Instruction Cache
The program sequencer includes a high performance, selective
instruction cache that enables three-bus operation for fetching
an instruction and two data values. This two-way, set-associative
cache holds 32 instructions. The cache is selective: only the
instructions whose fetches conflict with program memory data
accesses are cached, so the ADSP-21020 can perform a program
memory data access and can execute the corresponding instruction
in the same cycle. The program sequencer fetches the instruction
from the cache instead of from program memory, enabling the
ADSP-21020 to simultaneously access data in both program
memory and data memory.

Context Switching
Many of the ADSP-21020’s registers have alternate register sets
that can be activated during interrupt servicing to facilitate a fast
context switch. The data registers in the register file, DAG
registers and the multiplier result register all have alternate sets.
Registers active at reset are called primary registers; the others
are called alternate registers. Bits in the MODE1 control register
determine which registers are active at any particular time.
The primary/alternate select bits for each half of the register file
(top eight or bottom eight registers) are independent. Likewise,
the top four and bottom four register sets in each DAG have
independent primary/alternate select bits. This scheme allows
passing of data between contexts.

Interrupts
The ADSP-21020 has four external hardware interrupts, nine
internally generated interrupts, and eight software interrupts.
For the external interrupts and the internal timer interrupt, the
ADSP-21020 automatically stacks the arithmetic status and
mode (MODE1) registers when servicing the interrupt, allowing
five nesting levels of fast service for these interrupts.
An interrupt can occur at any time while the ADSP-21020 is
executing a program. Internal events that generate interrupts
include arithmetic exceptions, which allow for fast trap handling
and recovery.

Timer
The programmable interval timer provides periodic interrupt
generation. When enabled, the timer decrements a 32-bit count
register every cycle. When this count register reaches zero, the
ADSP-21020 generates an interrupt and asserts its TIMEXP
output. The count register is automatically reloaded from a
32-bit period register and the count resumes immediately.

System Interface
Figure 2 shows an ADSP-21020 basic system configuration.
The external memory interface supports memory-mapped
peripherals and slower memory with a user-defined combination
of programmable wait states and hardware acknowledge signals.
Both the program memory and data memory interfaces support
addressing of page-mode DRAMs.
The ADSP-21020’s internal functions are supported by four
internal buses: the program memory address (PMA) and data
memory address (DMA) buses are used for addresses associated
with program and data memory. The program memory data
(PMD) and data memory data (DMD) buses are used for data
associated with the two memory spaces. These buses are
extended off chip. Four data memory select (DMS) signals
select one of four user-configurable banks of data memory.
Similarly, two program memory select (PMS) signals select
between two user-configurable banks of program memory. All
banks are independently programmable for 0-7 wait states.
The PX registers permit passing data between program memory
and data memory spaces. They provide a bridge between the
48-bit PMD bus and the 40-bit DMD bus or between the 40-bit
register file and the PMD bus.
The PMA bus is 24 bits wide allowing direct access of up to
16M words of mixed instruction code and data. The PMD is 48
bits wide to accommodate the 48-bit instruction width. For
access of 40-bit data the lower 8 bits are unused. For access of
32-bit data the lower 16 bits are ignored.
The DMA bus is 32 bits wide allowing direct access of up to 4
Gigawords of data. The DMD bus is 40 bits wide. For 32-bit
data, the lower 8 bits are unused. The DMD bus provides a
path for the contents of any register in the processor to be
transferred to any other register or to any external data memory
location in a single cycle. The data memory address comes from
one of two sources: an absolute value specified in the instruction
code (direct addressing) or the output of a data address
generator (indirect addressing).
External devices can gain control of the processor’s memory
buses from the ADSP-21020 by means of the bus request/grant
signals (BR and BG). To grant its buses in response to a bus
request, the ADSP-21020 halts internal operations and places
its program and data memory interfaces in a high impedance
state. In addition, three-state controls (DMTS and PMTS)
allow an external device to place either the program or data
memory interface in a high impedance state without affecting
the other interface and without halting the ADSP-21020 unless
it requires a memory access from the affected interface. The
three-state controls make it easy for an external cache controller
to hold the ADSP-21020 off the bus while it updates an external
cache memory.

JTAG Test and Emulation Support
The ADSP-21020 implements the boundary scan testing
provisions specified by IEEE Standard 1149.1 of the Joint
Testing Action Group (JTAG). The ADSP-21020’s test
access port and on-chip JTAG circuitry is fully compliant with
the IEEE 1149.1 specification. The test access port enables
boundary scan testing of circuitry connected to the
ADSP-21020’s I/O pins.
[Figure 2. Basic System Configuration: the processor, driven by a 1× clock on CLKIN, connected to program memory through PMA (24 bits), PMD (48 bits), PMS1-0, PMRD, and PMWR, and to data memory and peripherals through DMA, DMD, DMS3-0, DMRD, DMWR, and DMACK; also shown are RESET, IRQ3-0, FLAG3-0, TIMEXP, RCOMP, PMTS, PMPAGE, PMACK, DMTS, DMPAGE, BR, BG, and the JTAG port.]
The ADSP-21020 also implements on-chip emulation through
the JTAG test access port. The processor’s eight sets of break-
point range registers enable program execution at full speed
until reaching a desired break-point address range. The
processor can then halt and allow reading/writing of all the
processor’s internal registers and external memories through the
JTAG port.
PIN DESCRIPTIONS
This section describes the pins of the ADSP-21020. When
groups of pins are identified with subscripts, e.g. PMD47–0, the
highest numbered pin is the MSB (in this case, PMD47). Inputs
identified as synchronous (S) must meet timing requirements
with respect to CLKIN (or with respect to TCK for TMS, TDI,
and TRST). Those that are asynchronous (A) can be asserted
asynchronously to CLKIN.

O = Output; I = Input; S = Synchronous; A = Asynchronous;
P = Power Supply; G = Ground.

Pin Name   Type  Function
PMA23–0    O     Program Memory Address. The ADSP-21020 outputs an
                 address in program memory on these pins.
PMD47–0    I/O   Program Memory Data. The ADSP-21020 inputs and
                 outputs data and instructions on these pins. 32-bit
                 fixed-point data and 32-bit single-precision
                 floating-point data is transferred over bits 47-16 of
                 the PMD bus.
PMS1–0     O     Program Memory Select lines. These pins are asserted
                 as chip selects for the corresponding banks of program
                 memory. Memory banks must be defined in the memory
                 control registers. These pins are decoded program
                 memory address lines and provide an early indication
                 of a possible bus cycle.
PMRD       O     Program Memory Read strobe. This pin is asserted when
                 the ADSP-21020 reads from program memory.
PMWR       O     Program Memory Write strobe. This pin is asserted when
                 the ADSP-21020 writes to program memory.
PMACK      I/S   Program Memory Acknowledge. An external device
                 deasserts this input to add wait states to a memory
                 access.
PMPAGE     O     Program Memory Page Boundary. The ADSP-21020 asserts
                 this pin to signal that a program memory page boundary
                 has been crossed. Memory pages must be defined in the
                 memory control registers.
PMTS       I/S   Program Memory Three-State Control. PMTS places the
                 program memory address, data, selects, and strobes in
                 a high-impedance state. If PMTS is asserted while a PM
                 access is occurring, the processor will halt and the
                 memory access will not be completed. PMACK must be
                 asserted for at least one cycle when PMTS is
                 deasserted to allow any pending memory access to
                 complete properly. PMTS should only be asserted (low)
                 during an active memory access cycle.
DMA31–0    O     Data Memory Address. The ADSP-21020 outputs an address
                 in data memory on these pins.
DMD39–0    I/O   Data Memory Data. The ADSP-21020 inputs and outputs
                 data on these pins. 32-bit fixed point data and 32-bit
                 single-precision floating point data is transferred
                 over bits 39-8 of the DMD bus.
DMS3–0     O     Data Memory Select lines. These pins are asserted as
                 chip selects for the corresponding banks of data
                 memory. Memory banks must be defined in the memory
                 control registers. These pins are decoded data memory
                 address lines and provide an early indication of a
                 possible bus cycle.
DMRD       O     Data Memory Read strobe. This pin is asserted when the
                 ADSP-21020 reads from data memory.
DMWR       O     Data Memory Write strobe. This pin is asserted when
                 the ADSP-21020 writes to data memory.
DMACK      I/S   Data Memory Acknowledge. An external device deasserts
                 this input to add wait states to a memory access.
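
The bit positions given for PMD and DMD mean that 32-bit data is left-justified on both data buses. The sketch below shows the corresponding shifts. It is illustrative only: the helper names are hypothetical, and driving zeros on the unused low-order bits is an assumption, since the data sheet only says those bits are unused or ignored.

```python
PMD_WIDTH = 48    # program memory data bus width in bits
DMD_WIDTH = 40    # data memory data bus width in bits
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def place_on_bus(word32, bus_width):
    """Left-justify a 32-bit word on a wider bus (low-order bits assumed zero)."""
    if not 0 <= word32 <= WORD_MASK:
        raise ValueError("expected an unsigned 32-bit value")
    return word32 << (bus_width - WORD_BITS)


def extract_from_bus(bus_value, bus_width):
    """Recover the 32-bit word from the upper bits of a bus value."""
    return (bus_value >> (bus_width - WORD_BITS)) & WORD_MASK


word = 0xDEADBEEF
print(hex(place_on_bus(word, DMD_WIDTH)))   # word occupies bits 39-8
print(hex(place_on_bus(word, PMD_WIDTH)))   # word occupies bits 47-16
```

Whatever appears in the low-order bits is discarded on extraction, consistent with the lower 8 (DMD) or 16 (PMD) bits carrying no 32-bit data.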