SHARC Processors
ADSP-21367/ADSP-21368/ADSP-21369
SUMMARY
High performance 32-bit/40-bit floating-point processor
optimized for high performance audio processing
Single-instruction, multiple-data (SIMD) computational
architecture
On-chip memory—2M bits of on-chip SRAM and 6M bits of
on-chip mask programmable ROM
Code compatible with all other members of the SHARC family
The ADSP-21367/ADSP-21368/ADSP-21369 are available
with a 400 MHz core instruction rate with unique audiocentric
peripherals such as the digital audio interface, S/PDIF
transceiver, serial ports, 8-channel asynchronous sample
rate converter, precision clock generators, and more. For
complete ordering information, see Ordering Guide on Page 55.
[Figure 1. Functional Block Diagram. Blocks shown: core processor (timers, 32 × 48-bit instruction cache, program sequencer, DAG1, DAG2, processing elements PEX and PEY, PX register, JTAG test and emulation); four blocks of on-chip memory (2M bit RAM, 6M bit ROM); PM and DM address buses (32-bit) and PM and DM data buses (64-bit); external port (SDRAM controller, asynchronous memory interface, shared memory interface); I/O processor (memory-mapped IOP registers, 34-channel DMA controller, two memory-to-memory DMA channels); digital audio interface (DAI routing unit, four precision clock generators, 8-channel SRC, eight serial ports, input data port/PDAP, S/PDIF Rx/Tx, DAI pins); digital peripheral interface (DPI routing unit, two SPI ports, 2-wire interface, two UARTs, three timers, DPI pins); FLAGS4-15, PWM, and GPIO flags/IRQ/TIMEXP.]
SHARC and the SHARC logo are registered trademarks of Analog Devices, Inc.
Rev. C
Information furnished by Analog Devices is believed to be accurate and reliable.
However, no responsibility is assumed by Analog Devices for its use, nor for any
infringements of patents or other rights of third parties that may result from its use.
Specifications subject to change without notice. No license is granted by implication
or otherwise under any patent or patent rights of Analog Devices. Trademarks and
registered trademarks are the property of their respective companies.
One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106 U.S.A.
Tel: 781.329.4700  Fax: 781.461.3113  www.analog.com
©2008 Analog Devices, Inc. All rights reserved.
KEY FEATURES—PROCESSOR CORE
At 400 MHz (2.5 ns) core instruction rate, the processors
perform 2.4G FLOPS/800 MMACS
2M bit on-chip SRAM (0.75M bit in blocks 0 and 1, and
0.25M bit in blocks 2 and 3) for simultaneous access by the
core processor and DMA
6M bit on-chip, mask-programmable ROM (3M bit in block 0
and 3M bit in block 1)
Dual data address generators (DAGs) with modulo and
bit-reverse addressing
Zero-overhead looping with single-cycle loop setup,
providing efficient program sequencing
Single-instruction, multiple-data (SIMD) architecture
provides:
Two computational processing elements
Concurrent execution
Code compatibility with other SHARC family members at
the assembly level
Parallelism in buses and computational units allows:
Single-cycle executions (with or without SIMD) of a
multiply operation, an ALU operation, a dual memory read
or write, and an instruction fetch
Transfers between memory and core at a sustained
6.4G bytes/s bandwidth at 400 MHz core instruction rate
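As a quick sanity check, the headline rates in this list can be reproduced from the 400 MHz clock with simple arithmetic. The sketch below is illustrative only. The variable names are invented here, the count of three floating-point operations per processing element per cycle is an assumption (a multiply, an add, and a subtract issued together), and the figure of four 32-bit transfers per core cycle is taken from the PM and DM bus description in the General Description.

```python
# Back-of-the-envelope check of the headline figures (illustrative names).
CORE_CLOCK_HZ = 400_000_000    # 400 MHz core instruction rate
PROCESSING_ELEMENTS = 2        # PEX and PEY, both active in SIMD mode

# Assumption: peak FLOPS counts one multiply, one add, and one subtract
# per processing element per cycle.
FLOPS_PER_PE_PER_CYCLE = 3
peak_flops = CORE_CLOCK_HZ * PROCESSING_ELEMENTS * FLOPS_PER_PE_PER_CYCLE

# One multiply-accumulate per processing element per cycle.
peak_macs = CORE_CLOCK_HZ * PROCESSING_ELEMENTS

# Four 32-bit (4-byte) transfers per core cycle over the PM and DM buses.
TRANSFERS_PER_CYCLE = 4
BYTES_PER_TRANSFER = 4
bandwidth_bytes_per_s = CORE_CLOCK_HZ * TRANSFERS_PER_CYCLE * BYTES_PER_TRANSFER

cycle_time_ns = 1e9 / CORE_CLOCK_HZ

print(f"cycle time : {cycle_time_ns} ns")
print(f"peak FLOPS : {peak_flops / 1e9} G")
print(f"peak MACS  : {peak_macs / 1e6} M")
print(f"bandwidth  : {bandwidth_bytes_per_s / 1e9} G bytes/s")
```

Expressed in bits rather than bytes, the same bandwidth is eight times larger, which is why the unit matters when comparing against serial-port or memory-interface rates.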
INPUT/OUTPUT FEATURES
DMA controller supports:
34 zero-overhead DMA channels for transfers between
internal memory and a variety of peripherals
32-bit DMA transfers at peripheral clock speed, in parallel
with full-speed processor execution
32-bit wide external port provides glueless connection to
both synchronous (SDRAM) and asynchronous memory
devices
Programmable wait state options: 2 SCLK to 31 SCLK cycles
Delay-line DMA engine maintains circular buffers in
external memory with tap-/offset-based reads
SDRAM accesses at 166 MHz and asynchronous accesses at
55 MHz
Shared-memory support allows multiple DSPs to
automatically arbitrate for the bus and gluelessly access a
common memory device
Shared memory interface (ADSP-21368 only) support
provides:
Glueless connection for scalable DSP multiprocessing
architecture
Distributed on-chip bus arbitration for parallel bus
connect of up to four ADSP-21368 processors and global
memory
Four memory select lines allow multiple external memory
devices
Digital audio interface (DAI) includes eight serial ports, four
precision clock generators, an input data port, an S/PDIF
transceiver, an 8-channel asynchronous sample rate
converter, and a signal routing unit
Digital peripheral interface (DPI) includes three timers, two
UARTs, two SPI ports, and a 2-wire interface port
Outputs of PCGs C and D can be driven onto DPI pins
8 dual data line serial ports that operate at up to
50 Mbps on each data line; each has a clock, frame sync,
and two data lines that can be configured as either a
receiver or transmitter pair
TDM support for telecommunications interfaces including
128 TDM channel support for newer telephony interfaces
such as H.100/H.110
Up to 16 TDM stream support, each with 128 channels per
frame
Companding selection on a per channel basis in TDM mode
Input data port, configurable as eight channels of serial data
or seven channels of serial data and up to a 20-bit wide
parallel data channel
Signal routing unit provides configurable and flexible
connections between all DAI/DPI components
2 muxed flag/IRQ lines
1 muxed flag/timer expired line /MS pin
1 muxed flag/IRQ /MS pin
DEDICATED AUDIO COMPONENTS
S/PDIF-compatible digital audio receiver/transmitter
supports EIAJ CP-340 (CP-1201), IEC-958, AES/EBU standards
Left-justified, I²S, or right-justified serial data input with
16-, 18-, 20- or 24-bit word widths (transmitter)
4 independent asynchronous sample rate converters (SRC).
Each converter has separate serial input and output ports,
a de-emphasis filter providing up to –140 dB SNR
performance, stereo sample rate converter and supports
left-justified, I²S, TDM, and right-justified modes and 24-, 20-,
18-, and 16-bit audio data word lengths
Pulse-width modulation provides:
16 PWM outputs configured as four groups of four outputs
Support for center-aligned or edge-aligned PWM waveforms
ROM-based security features include:
JTAG access to memory permitted with a 64-bit key
Protected memory regions that can be assigned to limit
access under program control to sensitive code
PLL has a wide variety of software and hardware
multiplier/divider ratios
Dual voltage: 3.3 V I/O, 1.2 V or 1.3 V core
Available in 256-ball BGA_ED and 208-lead LQFP_EP
packages (see Ordering Guide on Page 55)
Rev. C |
Page 2 of 56 |
January 2008
TABLE OF CONTENTS
Revision History
General Description
  Core Architecture
  Memory Architecture
  External Memory
  Input/Output Features
  System Design
  Development Tools
  Additional Information
Pin Function Descriptions
  Data Modes
  Boot Modes
  Core Instruction Rate to CLKIN Ratio Modes
Specifications
  Operating Conditions
  Electrical Characteristics
  Package Information
  ESD Caution
  Maximum Power Dissipation
  Absolute Maximum Ratings
  Timing Specifications
  Output Drive Currents
  Test Conditions
  Capacitive Loading
  Thermal Characteristics
256-Ball BGA_ED Pinout
208-Lead LQFP_EP Pinout
Package Dimensions
  Surface-Mount Design
Ordering Guide
REVISION HISTORY
1/08—Rev. B to Rev. C
All outstanding document errata from the previous revision
of this data sheet have been corrected.
This revision replaces the MQFP package with the LQFP-EP
package. See Thermal Characteristics for 208-Lead LQFP EPAD
(With Exposed Pad Soldered to PCB) on Page 48 and the
Ordering Guide on Page 55.
GENERAL DESCRIPTION
The ADSP-21367/ADSP-21368/ADSP-21369 SHARC® processors
are members of the SIMD SHARC family of DSPs that
feature Analog Devices’ Super Harvard Architecture. These
processors are source code-compatible with the ADSP-2126x and
ADSP-2116x DSPs as well as with first generation ADSP-2106x
SHARC processors in SISD (single-instruction, single-data)
mode. The processors are 32-bit/40-bit floating-point
processors optimized for high performance automotive audio
applications with their large on-chip SRAM and
mask-programmable ROM, multiple internal buses to eliminate I/O
bottlenecks, and an innovative digital audio interface (DAI).
As shown in the functional block diagram on Page 1, the
processors use two computational units to deliver a significant
performance increase over the previous SHARC processors on a
range of DSP algorithms. Fabricated in a state-of-the-art, high
speed, CMOS process, the ADSP-21367/ADSP-21368/
ADSP-21369 processors achieve an instruction cycle time of
2.5 ns at 400 MHz. With their SIMD computational hardware,
the processors can perform 2.4G FLOPS running at 400 MHz.
Table 1 shows performance benchmarks for these devices.
Table 1. Processor Benchmarks (at 400 MHz)
Benchmark Algorithm                                Speed (at 400 MHz)
1024 Point Complex FFT (Radix 4, with reversal)    23.2 μs
FIR Filter (per tap)¹                              1.25 ns
IIR Filter (per biquad)¹                           5.0 ns
Matrix Multiply (pipelined), [3×3] × [3×1]         11.25 ns
Matrix Multiply (pipelined), [4×4] × [4×1]         20.0 ns
Divide (y/x)                                       8.75 ns
Inverse Square Root                                13.5 ns
¹ Assumes two files in multichannel SIMD mode.
The ADSP-21367/ADSP-21368/ADSP-21369 continue
SHARC’s industry-leading standards of integration for DSPs,
combining a high performance 32-bit DSP core with integrated,
on-chip system features.
The block diagram of the ADSP-21368 on Page 1 illustrates the
following architectural features:
• Two processing elements, each of which comprises an
ALU, multiplier, shifter, and data register file
• Data address generators (DAG1, DAG2)
• Program sequencer with instruction cache
• PM and DM buses capable of supporting four 32-bit data
transfers between memory and the core at every core
processor cycle
• Three programmable interval timers with PWM generation,
PWM capture/pulse width measurement, and
external event counter capabilities
• On-chip SRAM (2M bit)
• On-chip mask-programmable ROM (6M bit)
• JTAG test access port
The block diagram of the ADSP-21368 on Page 1 also illustrates
the following architectural features:
• DMA controller
• Eight full-duplex serial ports
• Digital audio interface that includes four precision clock
generators (PCG), an input data port (IDP), an S/PDIF
receiver/transmitter, eight channels of asynchronous sample
rate conversion, eight serial ports, a 16-bit parallel input
port (PDAP), and a flexible signal routing unit (DAI SRU)
• Digital peripheral interface that includes three timers, an
I²C® interface, two UARTs, two serial peripheral interfaces
(SPI), and a flexible signal routing unit (DPI SRU)
CORE ARCHITECTURE
The ADSP-21367/ADSP-21368/ADSP-21369 are code compatible
at the assembly level with the ADSP-2126x, ADSP-21160,
and ADSP-21161, and with the first generation ADSP-2106x
SHARC processors. The ADSP-21367/ADSP-21368/
ADSP-21369 processors share architectural features with the
ADSP-2126x and ADSP-2116x SIMD SHARC processors, as
detailed in the following sections.
SIMD Computational Engine
The processors contain two computational processing elements
that operate as a single-instruction, multiple-data (SIMD)
engine. The processing elements are referred to as PEX and PEY
and each contains an ALU, multiplier, shifter, and register file.
PEX is always active, and PEY may be enabled by setting the
PEYEN mode bit in the MODE1 register. When this mode is
enabled, the same instruction is executed in both processing
elements, but each processing element operates on different data.
This architecture is efficient at executing math intensive DSP
algorithms.
Entering SIMD mode also has an effect on the way data is
transferred between memory and the processing elements. When in
SIMD mode, twice the data bandwidth is required to sustain
computational operation in the processing elements. Because of
this requirement, entering SIMD mode also doubles the
bandwidth between memory and the processing elements. When
using the DAGs to transfer data in SIMD mode, two data values
are transferred with each access of memory or the register file.
Independent, Parallel Computation Units
Within each processing element is a set of computational units.
The computational units consist of an arithmetic/logic unit
(ALU), multiplier, and shifter. These units perform all
operations in a single cycle. The three units within each processing
element are arranged in parallel, maximizing computational
throughput. Single multifunction instructions execute parallel
ALU and multiplier operations. In SIMD mode, the parallel
ALU and multiplier operations occur in both processing
elements. These computation units support IEEE 32-bit
single-precision floating-point, 40-bit extended precision
floating-point, and 32-bit fixed-point data formats.
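The SIMD and multifunction behavior described here can be pictured with a small software model. The sketch below is not processor code and does not reflect the real instruction encoding: the class, the function names, and the register indices are hypothetical. It only shows one instruction (a multiply issued together with an add and a subtract) being applied to PEX and, when SIMD is enabled, repeated on PEY with that element's own data.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingElement:
    """Toy processing element: 16 data registers (hypothetical model)."""
    regs: list = field(default_factory=lambda: [0.0] * 16)


def multifunction(pe, dst_mul, dst_add, dst_sub, a, b, c, d):
    """One multifunction instruction: a multiply plus an add and a subtract.

    All sources are read before any result is written, so the three
    operations behave as if they complete in the same cycle.
    """
    product = pe.regs[a] * pe.regs[b]
    total = pe.regs[c] + pe.regs[d]
    diff = pe.regs[c] - pe.regs[d]
    pe.regs[dst_mul], pe.regs[dst_add], pe.regs[dst_sub] = product, total, diff


def issue(instruction, args, pex, pey, simd_enabled):
    """Issue one instruction; PEY repeats it on its own data in SIMD mode."""
    instruction(pex, *args)          # PEX is always active
    if simd_enabled:                 # models the PEYEN mode bit
        instruction(pey, *args)


pex, pey = ProcessingElement(), ProcessingElement()
pex.regs[0:4] = [2.0, 3.0, 5.0, 1.0]
pey.regs[0:4] = [4.0, 0.5, 10.0, 4.0]

# Same instruction, different data in each element.
issue(multifunction, (4, 5, 6, 0, 1, 2, 3), pex, pey, simd_enabled=True)
print(pex.regs[4:7], pey.regs[4:7])
```

With `simd_enabled=False` only the PEX registers change, which corresponds to the SISD behavior used for compatibility with earlier family members.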
Data Register File
A general-purpose data register file is contained in each
processing element. The register files transfer data between the
computation units and the data buses, and store intermediate
results. These 10-port, 32-register (16 primary, 16 secondary)
register files, combined with the ADSP-2136x enhanced
Harvard architecture, allow unconstrained data flow between
computation units and internal memory. The registers in PEX
are referred to as R0–R15 and in PEY as S0–S15.
MEMORY ARCHITECTURE
The ADSP-21367/ADSP-21368/ADSP-21369 processors add
the following architectural features to the SIMD SHARC
family core.
On-Chip Memory
The processors contain two megabits of internal RAM and six
megabits of internal mask-programmable ROM. Each block can
be configured for different combinations of code and data
storage (see Table 2 on Page 6). Each memory block supports
single-cycle, independent accesses by the core processor and I/O
processor. The memory architecture, in combination with its
separate on-chip buses, allows two data transfers from the core
and one from the I/O processor, in a single cycle.
The SRAM can be configured as a maximum of 64k words of
32-bit data, 128k words of 16-bit data, 42k words of 48-bit
instructions (or 40-bit data), or combinations of different word
sizes up to two megabits. All of the memory can be accessed as
16-bit, 32-bit, 48-bit, or 64-bit words. A 16-bit floating-point
storage format is supported that effectively doubles the amount
of data that can be stored on-chip. Conversion between the
32-bit floating-point and 16-bit floating-point formats is
performed in a single instruction. While each memory block can
store combinations of code and data, accesses are most efficient
when one block stores data using the DM bus for transfers, and
the other block stores instructions and data using the PM bus
for transfers.
Using the DM bus and PM bus, with one bus dedicated to
each memory block, assures single-cycle execution with two
data transfers. In this case, the instruction must be available in
the cache.
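The word counts quoted for the SRAM follow from dividing two megabits by the word width, taking 1M as 2^20 and 1k as 1024. The sketch below reproduces that arithmetic. It is a simplification that treats the SRAM as a single pool and ignores how the four blocks are physically organized, and the function names are invented for illustration.

```python
SRAM_BITS = 2 * 1024 * 1024   # two megabits of on-chip SRAM
K = 1024


def max_words(word_bits, total_bits=SRAM_BITS):
    """Largest whole number of words of the given width that fit."""
    return total_bits // word_bits


def fits(word_counts, total_bits=SRAM_BITS):
    """Check whether a mix of {word_bits: word_count} fits in the SRAM.

    Assumption: the memory is one flat pool; per-block limits are ignored.
    """
    used = sum(bits * count for bits, count in word_counts.items())
    return used <= total_bits


for width in (16, 32, 48):
    words = max_words(width)
    print(f"{width}-bit words: {words} (about {words // K}k)")

# A hypothetical split: 16k instruction words plus 32k 32-bit data words.
print(fits({48: 16 * K, 32: 32 * K}))
```

The 48-bit case does not divide evenly, which is why the instruction capacity is quoted as a rounded-down 42k rather than an exact power of two.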
Single-Cycle Fetch of Instruction and Four Operands
The ADSP-21367/ADSP-21368/ADSP-21369 feature an
enhanced Harvard architecture in which the data memory
(DM) bus transfers data and the program memory (PM) bus
transfers both instructions and data (see Figure 1 on Page 1).
With separate program and data memory buses and on-chip
instruction cache, the processors can simultaneously fetch four
operands (two over each data bus) and one instruction (from
the cache), all in a single cycle.
Instruction Cache
The processors include an on-chip instruction cache that
enables three-bus operation for fetching an instruction and four
data values. The cache is selective—only the instructions whose
fetches conflict with PM bus data accesses are cached. This
cache allows full-speed execution of core, looped operations
such as digital filter multiply-accumulates, and FFT butterfly
processing.
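The effect of a selective cache on a tight loop can be illustrated with a rough model. Everything in the sketch below is an assumption made for illustration: the one-cycle penalty per miss, the 32-entry capacity (borrowed from the instruction cache label in the block diagram), the absence of any replacement policy, and the function and variable names. It only captures the idea stated above, namely that instructions whose fetch collides with a PM bus data access are the ones that get cached, so a loop pays for those collisions on its first pass only.

```python
def count_stalls(loop_body, iterations, cache_entries=32):
    """Count stall cycles for a loop under a simplified selective cache.

    loop_body is a list of (address, pm_conflict) pairs, where pm_conflict
    is True when fetching that instruction collides with a PM bus data access.
    """
    cached = set()
    stalls = 0
    for _ in range(iterations):
        for address, pm_conflict in loop_body:
            if not pm_conflict:
                continue          # fetched normally; never enters the cache
            if address in cached:
                continue          # served from the cache; PM bus carries data
            stalls += 1           # miss: fetch and data access are serialized
            if len(cached) < cache_entries:
                cached.add(address)
    return stalls


# Hypothetical four-instruction loop in which two fetches conflict.
filter_loop = [(0x100, True), (0x101, False), (0x102, True), (0x103, False)]

print(count_stalls(filter_loop, iterations=100))
print(count_stalls(filter_loop, iterations=100, cache_entries=0))
```

In this model the cached loop stalls only while the conflicting instructions are first loaded, whereas with no cache every conflicting fetch stalls on every pass.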
Data Address Generators with Zero-Overhead Hardware
Circular Buffer Support
The ADSP-21367/ADSP-21368/ADSP-21369 have two data
address generators (DAGs). The DAGs are used for indirect
addressing and implementing circular data buffers in hardware.
Circular buffers allow efficient programming of delay lines and
other data structures required in digital signal processing, and
are commonly used in digital filters and Fourier transforms.
The two DAGs contain sufficient registers to allow the creation
of up to 32 circular buffers (16 primary register sets, 16
secondary). The DAGs automatically handle address pointer
wraparound, reduce overhead, increase performance, and
simplify implementation. Circular buffers can start and end at any
memory location.
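The pointer wraparound that the DAGs perform in hardware can be sketched in software as modulo arithmetic on an offset from the buffer's base address. The class below is an illustrative model, not a description of the DAG register set: the names `base`, `length`, `index`, and `post_modify` are chosen here for readability, and the model handles positive and negative steps alike.

```python
class CircularPointer:
    """Address pointer that wraps inside [base, base + length)."""

    def __init__(self, base, length):
        if length <= 0:
            raise ValueError("length must be positive")
        self.base = base
        self.length = length
        self.index = base

    def post_modify(self, modify):
        """Return the current address, then step by `modify` with wraparound."""
        address = self.index
        offset = (self.index - self.base + modify) % self.length
        self.index = self.base + offset
        return address


# A four-word delay line starting at an arbitrary address (5).
forward = CircularPointer(base=5, length=4)
print([forward.post_modify(1) for _ in range(6)])

# Stepping backwards wraps from the first location to the last.
backward = CircularPointer(base=5, length=4)
print([backward.post_modify(-1) for _ in range(3)])
```

In software this wrap costs a modulo or a compare on every access; performing it in the address generator is what removes that overhead from filter and delay-line inner loops.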
EXTERNAL MEMORY
The external port provides a high performance, glueless
interface to a wide variety of industry-standard memory devices. The
32-bit wide bus can be used to interface to synchronous and/or
asynchronous memory devices through the use of its separate
internal memory controllers. The first is an SDRAM controller
for connection of industry-standard synchronous DRAM
devices and DIMMs (dual inline memory module), while the
second is an asynchronous memory controller intended to
interface to a variety of memory devices. Four memory select
pins enable up to four separate devices to coexist, supporting
any desired combination of synchronous and asynchronous
device types. Non-SDRAM external memory address space is
shown in Table 3.
SDRAM Controller
The SDRAM controller provides an interface of up to four
separate banks of industry-standard SDRAM devices or DIMMs, at
speeds up to f_SCLK. The controller is fully compliant with the
SDRAM standard. Each bank has its own memory select line
(MS0–MS3) and can be configured to contain between 16M bytes
and 128M bytes of memory. SDRAM external memory address
space is shown in Table 4.
Flexible Instruction Set
The 48-bit instruction word accommodates a variety of parallel
operations, for concise programming. For example, the
ADSP-21367/ADSP-21368/ADSP-21369 can conditionally
execute a multiply, an add, and a subtract in both processing
elements while branching and fetching up to four 32-bit values
from memory, all in a single instruction.