Preliminary Technical Data
TigerSHARC® Embedded Processor
ADSP-TS203S

KEY FEATURES
500 MHz, 2.0 ns Instruction Cycle Rate
4M Bits of Internal (On-Chip) DRAM Memory
25 mm × 25 mm (576-Ball) Thermally Enhanced Ball Grid Array Package
Dual Computation Blocks, Each Containing an ALU, a Multiplier, a Shifter, and a Register File
Dual Integer ALUs, Providing Data Addressing and Pointer Manipulation
Integrated I/O Includes 10-Channel DMA Controller, External Port, Two Link Ports, SDRAM Controller, Programmable Flag Pins, Two Timers, and Timer Expired Pin for System Integration
1149.1 IEEE Compliant JTAG Test Access Port for On-Chip Emulation
On-Chip Arbitration for Glueless Multiprocessing

KEY BENEFITS
Provides High-Performance Static Superscalar DSP Operations, Optimized for Large, Demanding Multiprocessor DSP Applications
Performs Exceptionally Well on DSP Algorithm and I/O Benchmarks (See Benchmarks in Table 1)
Supports Low-Overhead DMA Transfers Between Internal Memory, External Memory, Memory-Mapped Peripherals, Link Ports, Host Processors, and Other (Multiprocessor) DSPs
Eases DSP Programming Through Extremely Flexible Instruction Set and High-Level-Language Friendly DSP Architecture
Enables Scalable Multiprocessing Systems With Low Communications Overhead
[Block diagram not reproduced. Labeled blocks: program sequencer (ADDR FETCH, PC, BTB, IAB); data address generation (integer J ALU and integer K ALU); 4M bits internal memory (memory blocks with page cache, 4× crossbar connect); J-, K-, I-, and S-buses, each with address and data lines; computational blocks X and Y (each with a 32 × 32 register file, ALU, multiplier, shifter, and DAB); and the SOC bus with SOC interface, external port (host, multiprocessor, SDRAM controller, C-bus arbitration), external DMA requests, DMA, link ports L0 and L1, and JTAG port.]
Figure 1. Functional Block Diagram
TigerSHARC and the TigerSHARC logo are registered trademarks of Analog Devices, Inc.
Rev. PrB
Information furnished by Analog Devices is believed to be accurate and reliable.
However, no responsibility is assumed by Analog Devices for its use, nor for any
infringements of patents or other rights of third parties that may result from its use.
Specifications subject to change without notice. No license is granted by implication
or otherwise under any patent or patent rights of Analog Devices. Trademarks and
registered trademarks are the property of their respective owners.
One Technology Way, P.O.Box 9106, Norwood, MA 02062-9106 U.S.A.
Tel:781/329-4700
www.analog.com
Fax:781/326-8703
© 2003 Analog Devices, Inc. All rights reserved.
TABLE OF CONTENTS
General Description ................................................. 3
Dual Compute Blocks ............................................ 4
Data Alignment Buffer (DAB) .................................. 4
Dual Integer ALU (IALU) ....................................... 4
Program Sequencer ............................................... 5
Interrupt Controller ........................................... 5
Flexible Instruction Set ........................................ 5
DSP Memory ....................................................... 5
External Port (Off-Chip Memory/Peripherals Interface) . 5
Host Interface ................................................... 6
Multiprocessor Interface ...................................... 7
SDRAM Controller ............................................ 7
EPROM Interface .............................................. 7
DMA Controller ................................................... 7
Link Ports (LVDS) ................................................ 8
Timer and General-Purpose I/O ............................... 9
Reset and Booting ................................................. 9
Clock Domains .................................................... 9
Power Domains .................................................. 10
Filtering Reference Voltage and Clocks .................... 10
Development Tools ............................................. 10
Designing an Emulator-Compatible DSP Board (Target) 11
Additional Information ........................................ 11
Pin Function Descriptions ........................................ 12
Strap Pin Function Descriptions ................................ 19
ADSP-TS203S—Specifications ................................... 21
Recommended Operating Conditions ...................... 21
Electrical Characteristics ....................................... 21
Absolute Maximum Ratings ................................... 22
ESD Sensitivity ................................................... 22
Timing Specifications ........................................... 23
General AC Timing .......................................... 23
Link Port Low-Voltage, Differential-Signal (LVDS)
Electrical Characteristics and Timing ................. 27
Link Port—Data Out Timing ........................... 28
Link Port—Data In Timing .............................. 31
Output Drive Currents ......................................... 32
Test Conditions .................................................. 33
Output Disable Time ......................................... 33
Output Enable Time ......................................... 34
Capacitive Loading ........................................... 34
Environmental Conditions .................................... 36
Thermal Characteristics ..................................... 36
576-Ball BGA_ED Pin Configurations ......................... 36
Outline Dimensions ................................................ 40
Ordering Guide ..................................................... 40
REVISION HISTORY
Revision PrB:
• Applies corrections and additional information to VREF Filtering Scheme (page 10), SCLK_VREF Filtering Scheme (page 10), Drive Strength/Output Impedance Selection (page 18), Recommended Operating Conditions (page 21), Electrical Characteristics (page 21), Power-Up Reset Timing (page 23), AC Signal Specifications (page 25), Link Port—Data Out Timing (page 28), Link Port—Data In Timing (page 31), and Ordering Guide (page 40).
• Provides unused pin termination data in Pin Function Descriptions (page 12).
• Changes pins R2 and R3 to NC in 576-Ball (25 mm × 25 mm) BGA_ED Pin Assignments (page 37).
Rev. PrB
| Page 2 of 40 |
December 2003
GENERAL DESCRIPTION
The ADSP-TS203S TigerSHARC processor is an ultra-high per-
formance, static superscalar processor optimized for large signal
processing tasks and communications infrastructure. The DSP
combines very wide memory widths with dual computation
blocks—supporting 32- and 40-bit floating-point and support-
ing 8-, 16-, 32-, and 64-bit fixed-point processing—to set a new
standard of performance for digital signal processors. The
TigerSHARC static superscalar architecture lets the DSP exe-
cute up to four instructions each cycle, performing twenty-four
16-bit fixed-point operations or six floating-point operations.
Four independent 128-bit wide internal data buses, each con-
necting to the four 1M bit memory banks, enable quad-word
data, instruction, and I/O accesses and provide 28G bytes per
second of internal memory bandwidth. Operating at 500 MHz,
the ADSP-TS203S processor’s core has a 2.0 ns instruction cycle
time. Using its Single-Instruction, Multiple-Data (SIMD) fea-
tures, the ADSP-TS203S processor can perform four billion 40-
bit MACs or one billion 80-bit MACs per second.
Table 1 shows the DSP’s performance benchmarks.

Table 1. General-Purpose Algorithm Benchmarks at 500 MHz
Benchmark                                                   Speed          Clock Cycles
32-bit Algorithm, one billion MACs/s peak performance
  1K Point Complex FFT¹ (Radix 2)                           18.8 µs        9419
  64K Point Complex FFT¹ (Radix 2)                          2.8 ms         1397544
  FIR Filter (per real tap)                                 1 ns           0.5
  [8 × 8][8 × 8] Matrix Multiply (Complex, Floating-point)  2.8 µs         1399
16-bit Algorithm, four billion MACs/s peak performance
  256 Point Complex FFT¹ (Radix 2)                          1.9 µs         928
I/O DMA Transfer Rate
  External port                                             500M bytes/s   n/a
  Link ports (each)                                         500M bytes/s   n/a
¹ Cache preloaded

The ADSP-TS203S processor is code-compatible with the other TigerSHARC processors.
The Functional Block Diagram on page 1 shows the ADSP-TS203S processor’s architectural blocks. These blocks include:
• Dual compute blocks, each consisting of an ALU, multiplier, 64-bit shifter, and 32-word register file and associated Data Alignment Buffers (DABs)
• Dual integer ALUs (IALUs), each with its own 31-word register file for data addressing and a status register
• A program sequencer with Instruction Alignment Buffer (IAB) and Branch Target Buffer (BTB)
• An interrupt controller that supports hardware and software interrupts, supports level- or edge-triggers, and supports prioritized, nested interrupts
• Four 128-bit internal data buses, each connecting to the four 1M bit memory banks
• On-chip DRAM (4M bit)
• An external port that provides the interface to host processors, multiprocessing space (DSPs), off-chip memory-mapped peripherals, and external SRAM and SDRAM
• A 10-channel DMA controller
• Two full-duplex LVDS link ports
• Two 64-bit interval timers and timer expired pin
• A 1149.1 IEEE compliant JTAG test access port for on-chip emulation
Figure 2 on page 3 shows a typical single-processor system with external SRAM and SDRAM. Figure 4 on page 8 shows a typical multiprocessor system.

[System diagram not reproduced. It shows the ADSP-TS203S with its clock and reference inputs (SCLK, SCLKRAT2–0, SCLK_VREF, VREF), reset pins (RST_IN, RST_OUT, POR_IN), optional SDRAM memory, optional boot EPROM, optional memory, an optional host processor interface, optional link devices (2 max), an optional DMA device, and JTAG.]
Figure 2. ADSP-TS203S Single-Processor System With External SDRAM
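The Speed and Clock Cycles columns of Table 1 are related through the 2.0 ns instruction cycle time. The following Python sketch is an illustrative cross-check only; the function name cycles_to_ns and the 3% rounding tolerance are assumptions, not part of the data sheet.

```python
# Cross-check of Table 1: time = clock cycles x instruction cycle time.
CYCLE_NS = 2.0  # instruction cycle time at 500 MHz, from the text


def cycles_to_ns(cycles: float) -> float:
    """Convert a cycle count to nanoseconds at the 500 MHz core clock."""
    return cycles * CYCLE_NS


# (benchmark, clock cycles, quoted speed in ns), copied from Table 1
TABLE_1 = [
    ("1K point complex FFT", 9419, 18.8e3),
    ("64K point complex FFT", 1397544, 2.8e6),
    ("FIR filter (per real tap)", 0.5, 1.0),
    ("8x8 complex matrix multiply", 1399, 2.8e3),
    ("256 point complex FFT", 928, 1.9e3),
]

for name, cycles, quoted_ns in TABLE_1:
    computed_ns = cycles_to_ns(cycles)
    # The table rounds to two or three significant figures, so allow some slack.
    assert abs(computed_ns - quoted_ns) / quoted_ns < 0.03, name
    print(f"{name}: {computed_ns:.1f} ns computed, {quoted_ns:.1f} ns quoted")
```

Every row agrees with the quoted speed to within the rounding used in the table.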
The TigerSHARC DSP uses a Static Superscalar* architecture. This architecture is superscalar in that the ADSP-TS203S processor’s core can execute simultaneously from one to four 32-bit instructions encoded in a Very Large Instruction Word (VLIW) instruction line using the DSP’s dual compute blocks. Because the DSP does not perform instruction re-ordering at runtime (the programmer selects which operations will execute in parallel prior to runtime), the order of instructions is static.
* Static Superscalar™ is a trademark of Analog Devices, Inc.
With few exceptions, an instruction line, whether it contains
one, two, three, or four 32-bit instructions, executes with a
throughput of one cycle in a ten-deep processor pipeline.
For optimal DSP program execution, programmers must follow the DSP’s set of instruction parallelism rules when encoding an instruction line. In general, the selection of instructions that the DSP can execute in parallel each cycle depends on the instruction line resources each instruction requires and on the source and destination registers used in the instructions. The programmer has direct control of three core components: the IALUs, the compute blocks, and the program sequencer.
The ADSP-TS203S processor, in most cases, has a two-cycle execution pipeline that is fully interlocked, so whenever a computation result is unavailable for another operation dependent on it, the DSP automatically inserts one or more stall cycles as needed. Efficient programming with dependency-free instructions can eliminate most computational and memory transfer data dependencies.
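The effect of the interlock can be pictured with a small scheduling model. The Python sketch below is a simplified illustration, not a cycle-accurate description of the processor: it assumes a fixed two-cycle result latency for every instruction, and the register names (r0, r1, and so on) and the function issue_cycles are hypothetical.

```python
# Simplified interlock model (an illustration, not a cycle-accurate simulator).
RESULT_LATENCY = 2  # cycles until a compute result can be used, per the text


def issue_cycles(lines):
    """Return the cycle in which each instruction line issues.

    Each line is (dest, sources). A line stalls until every source
    register written by an earlier line is available.
    """
    ready_at = {}  # register name -> first cycle its value is usable
    cycle = 0
    issued = []
    for dest, sources in lines:
        # Interlock: wait for the latest outstanding source operand.
        earliest = max([ready_at.get(s, 0) for s in sources], default=0)
        cycle = max(cycle, earliest)
        issued.append(cycle)
        ready_at[dest] = cycle + RESULT_LATENCY
        cycle += 1  # the next line can issue no earlier than the following cycle
    return issued


# Hypothetical register names, chosen only to show the dependency pattern.
dependent = [("r0", ["r1", "r2"]), ("r3", ["r0", "r4"])]
independent = [("r0", ["r1", "r2"]), ("r5", ["r6", "r7"]), ("r3", ["r0", "r4"])]

print(issue_cycles(dependent))    # [0, 2]: one stall cycle is inserted
print(issue_cycles(independent))  # [0, 1, 2]: the unrelated line fills the gap
```

In this model, placing an independent instruction line between a producer and its consumer removes the stall, which is the sense in which dependency-free scheduling recovers the lost cycle.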
In addition, the ADSP-TS203S processor supports SIMD operations two ways: SIMD compute blocks and SIMD computations. The programmer can load both compute blocks with the same data (broadcast distribution) or different data (merged distribution).
Using these features, the compute blocks can:
• Provide 8 MACs per cycle peak and 7.1 MACs per cycle sustained 16-bit performance and provide 2 MACs per cycle peak and 1.8 MACs per cycle sustained 32-bit performance (based on FIR)
• Execute six single-precision floating-point or twenty-four 16-bit fixed-point operations per cycle, providing 3 GFLOPS or 12.0 GOPS performance
• Perform two complex 16-bit MACs per cycle
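The per-cycle figures above translate into the per-second rates quoted elsewhere in this data sheet by multiplying by the 500 MHz core clock. The following Python sketch shows that arithmetic; the helper name per_second is an assumption made for illustration.

```python
CORE_HZ = 500e6  # core clock from the text


def per_second(ops_per_cycle: float) -> float:
    """Scale a per-cycle operation count to operations per second."""
    return ops_per_cycle * CORE_HZ


peak_16bit_macs = per_second(8)    # 16-bit MACs per second, peak
peak_32bit_macs = per_second(2)    # 32-bit MACs per second, peak
gflops = per_second(6) / 1e9       # floating-point operations, in GFLOPS
gops = per_second(24) / 1e9        # 16-bit fixed-point operations, in GOPS

print(f"16-bit peak: {peak_16bit_macs / 1e9:.1f} billion MACs/s")
print(f"32-bit peak: {peak_32bit_macs / 1e9:.1f} billion MACs/s")
print(f"16-bit sustained: {per_second(7.1) / 1e9:.2f} billion MACs/s")
print(f"32-bit sustained: {per_second(1.8) / 1e9:.2f} billion MACs/s")
print(f"{gflops:.1f} GFLOPS, {gops:.1f} GOPS")
```

The peak results match the four billion and one billion MACs/s headings of Table 1 and the 3 GFLOPS and 12.0 GOPS figures in the list.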
DATA ALIGNMENT BUFFER (DAB)
The DAB is a quad-word FIFO that enables loading of quad-word data from nonaligned addresses. Normally, load instructions must be aligned to their data size so that quad words are loaded from a quad-aligned address. Using the DAB significantly improves the efficiency of some applications, such as FIR filters.
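The alignment problem the DAB addresses can be illustrated in software. The Python sketch below is a conceptual model only: it assembles a nonaligned quad word from two aligned quad-word loads, and it does not model the FIFO behavior of the actual hardware. The function names and the sample memory contents are assumptions.

```python
# Illustrative model of a nonaligned quad-word read built from aligned loads.
QUAD = 4  # 32-bit words per quad word


def load_aligned_quad(memory, address):
    """Load four words from a quad-aligned word address."""
    assert address % QUAD == 0, "aligned loads need a quad-aligned address"
    return memory[address:address + QUAD]


def load_unaligned_quad(memory, address):
    """Assemble four words starting at any word address.

    The aligned quad word containing the address is fetched, plus the
    next one when the address is not aligned, and the requested window
    is selected from the result.
    """
    base = address - (address % QUAD)
    offset = address - base
    window = load_aligned_quad(memory, base)
    if offset:
        window = window + load_aligned_quad(memory, base + QUAD)
    return window[offset:offset + QUAD]


memory = list(range(100, 116))         # 16 words of sample data
print(load_unaligned_quad(memory, 5))  # [105, 106, 107, 108]
```

A FIR filter that slides its data window by one word per output sample needs this kind of access on most iterations, which is why nonaligned quad-word loads help that class of algorithm.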
DUAL INTEGER ALU (IALU)
The ADSP-TS203S processor has two IALUs that provide powerful address generation capabilities and perform many general-purpose integer operations. The IALUs are referred to as J and K in assembly syntax and have the following features:
• Provide memory addresses for data and update pointers
• Support circular buffering and bit-reverse addressing
• Perform general-purpose integer operations, increasing programming flexibility
• Include a 31-word register file for each IALU
As address generators, the IALUs perform immediate or indirect (pre- and post-modify) addressing. They perform modulus and bit-reverse operations with no constraints placed on memory addresses for the modulus data buffer placement. Each IALU can specify either a single-, dual-, or quad-word access from memory.
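Bit-reverse addressing is typically used to reorder the input or output of a radix-2 FFT. The Python sketch below shows the index mapping in software as a generic illustration; it does not describe the IALU instruction encoding, and the function name bit_reverse is an assumption.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


N = 8  # FFT length, a power of two
order = [bit_reverse(i, N.bit_length() - 1) for i in range(N)]
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Hardware support removes the need for this loop, because the address generator produces the reordered sequence directly.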
The IALUs have hardware support for circular buffers, bit reverse, and zero-overhead looping. Circular buffers facilitate efficient programming of delay lines and other data structures required in digital signal processing, and they are commonly used in digital filters and Fourier transforms. Each IALU provides registers for four circular buffers, so applications can set up a total of eight circular buffers. The IALUs handle address pointer wraparound automatically, reducing overhead, increasing performance, and simplifying implementation. Circular buffers can start and end at any memory location.
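The pointer wraparound described above can be modeled as post-modify addressing with a modulus. The Python sketch below is a software illustration under assumed names (CircularPointer, base, length); it is not the register-level programming model of the IALUs.

```python
class CircularPointer:
    """Post-modify address pointer that wraps inside [base, base + length)."""

    def __init__(self, base: int, length: int, index: int = 0):
        self.base = base
        self.length = length
        self.address = base + index

    def post_modify(self, step: int) -> int:
        """Return the current address, then advance by step with wraparound."""
        current = self.address
        offset = (self.address - self.base + step) % self.length
        self.address = self.base + offset
        return current


# The base is deliberately unaligned: buffers can start at any location.
ptr = CircularPointer(base=0x1003, length=5)
addresses = [ptr.post_modify(2) for _ in range(6)]
print([hex(a) for a in addresses])
```

With a five-word buffer and a step of two, the pointer visits offsets 0, 2, 4, 1, 3 and then returns to offset 0 without any explicit bounds check in the calling code.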
Because the IALU’s computational pipeline is one cycle deep, in most cases integer results are available in the next cycle. Hardware (register dependency check) causes a stall if a result is unavailable in a given cycle.
DUAL COMPUTE BLOCKS
The ADSP-TS203S processor has compute blocks that can execute computations either independently or together as a Single-Instruction, Multiple-Data (SIMD) engine. The DSP can issue up to two compute instructions per compute block each cycle, instructing the ALU, multiplier, or shifter to perform independent, simultaneous operations. Each compute block can execute eight 8-bit, four 16-bit, two 32-bit, or one 64-bit SIMD computations in parallel with the operation in the other block.
The compute blocks are referred to as X and Y in assembly syntax, and each block contains three computational units (an ALU, a multiplier, and a 64-bit shifter) and a 32-word register file.
• Register File: Each compute block has a multiported 32-word, fully orthogonal register file used for transferring data between the computation units and data buses and for storing intermediate results. Instructions can access the registers in the register file individually (word-aligned), in sets of two (dual-aligned), or in sets of four (quad-aligned).
• ALU: The ALU performs a standard set of arithmetic operations in both fixed- and floating-point formats. It also performs logic and PERMUTE operations.
• Multiplier: The multiplier performs both fixed- and floating-point multiplication and fixed-point multiply and accumulate.
• Shifter: The 64-bit shifter performs logical and arithmetic shifts, bit and bitstream manipulation, and field deposit and extraction operations.
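The lane counts quoted for SIMD computations follow from dividing a 64-bit operand into equal fields. The Python sketch below illustrates the idea with a lane-wise add that wraps within each lane; wrapping is an assumption made for simplicity (the instruction set also offers options such as saturation), and packed_add is a hypothetical helper, not a processor instruction.

```python
def packed_add(a: int, b: int, lane_bits: int, width: int = 64) -> int:
    """Add two packed words lane by lane, wrapping inside each lane."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, width, lane_bits):
        lane = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane & mask) << shift  # drop the carry so lanes stay independent
    return result


for lane_bits in (8, 16, 32, 64):
    print(lane_bits, "bit lanes:", 64 // lane_bits, "computations per 64-bit operand")

a = 0x0001_0002_0003_FFFF
b = 0x0010_0020_0030_0001
print(hex(packed_add(a, b, 16)))  # 0x11002200330000
```

The lowest 16-bit lane overflows to zero without disturbing its neighbor, which is the property that lets one operation act as four independent 16-bit computations.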
PROGRAM SEQUENCER
The ADSP-TS203S processor’s program sequencer supports the following:
• A fully interruptible programming model with flexible programming in assembly and C/C++ languages; handles hardware interrupts with high throughput and no aborted instruction cycles
• A ten-cycle instruction pipeline (four-cycle fetch pipe and six-cycle execution pipe), with computation results available two cycles after operands are available
• Supply of instruction fetch memory addresses; the sequencer’s Instruction Alignment Buffer (IAB) caches up to five fetched instruction lines waiting to execute; the program sequencer extracts an instruction line from the IAB and distributes it to the appropriate core component for execution
• Management of program structures and program flow determined according to JUMP, CALL, RTI, RTS instructions, loop structures, conditions, interrupts, and software exceptions
• Branch prediction and a 128-entry branch target buffer (BTB) to reduce branch delays for efficient execution of conditional and unconditional branch instructions and zero-overhead looping; correctly predicted branches that are taken occur with zero overhead cycles, overcoming the five-to-nine stage branch penalty
• Compact code without the requirement to align code in memory; the IAB handles alignment
Interrupt Controller
The DSP supports nested and nonnested interrupts. Each interrupt type has a register in the interrupt vector table. Also, each has a bit in both the interrupt latch register and the interrupt mask register. All interrupts are fixed as either level-sensitive or edge-sensitive, except the IRQ3–0 hardware interrupts, which are programmable.
The DSP distinguishes between hardware interrupts and software exceptions, handling them differently. When a software exception occurs, the DSP aborts all other instructions in the instruction pipe. When a hardware interrupt occurs, the DSP continues to execute instructions already in the instruction pipe.
Flexible Instruction Set
The 128-bit instruction line, which can contain up to four 32-bit instructions, accommodates a variety of parallel operations for concise programming. For example, one instruction line can direct the DSP to conditionally execute a multiply, an add, and a subtract in both computation blocks while it also branches to another location in the program. Some key features of the instruction set include:
• Algebraic assembly language syntax
• Direct support for all DSP, imaging, and video arithmetic types
• Eliminates toggling DSP hardware modes because modes are supported as options (for example, rounding, saturation, and others) within instructions
• Branch prediction encoded in instruction; enables zero-overhead loops
• Parallelism encoded in instruction line
• Conditional execution optional for all instructions
• User defined partitioning between program and data memory
DSP MEMORY
The DSP’s internal and external memory is organized into a unified memory map, which defines the location (address) of all elements in the system, as shown in Figure 3.
The memory map is divided into four memory areas (host space, external memory, multiprocessor space, and internal memory), and each memory space, except host memory, is subdivided into smaller memory spaces.
The ADSP-TS203S processor internal memory has 4M bits of on-chip DRAM memory, divided into four blocks of 1M bits (32K words × 32 bits). Each block (M0, M2, M4, and M6) can store program, data, or both, so applications can configure memory to suit specific needs. Placing program instructions and data in different memory blocks, however, enables the DSP to access data while performing an instruction fetch. Each memory segment contains a 128K bit cache to enable single cycle accesses to internal DRAM.
The four internal memory blocks connect to the four 128-bit wide internal buses through a crossbar connection, enabling the DSP to perform four memory transfers in the same cycle. The DSP’s internal bus architecture provides a total memory bandwidth of 28G bytes per second, enabling the core and I/O to access eight 32-bit data words and four 32-bit instructions each cycle. The DSP’s flexible memory structure enables:
• DSP core and I/O accesses to different memory blocks in the same cycle
• DSP core access to three memory blocks in parallel: one instruction and two data accesses
• Programmable partitioning of program and data memory
• Program access of all memory as 32-, 64-, or 128-bit words, or as 16-bit words with the DAB
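The internal memory sizes given under DSP MEMORY are consistent with one another, as the following Python sketch shows. It is plain arithmetic on figures from the text, with M and K taken as binary multiples (an assumption); the constant names are illustrative.

```python
BLOCKS = 4                   # memory blocks M0, M2, M4, and M6
WORDS_PER_BLOCK = 32 * 1024  # 32K words per block
BITS_PER_WORD = 32
M_BIT = 1024 * 1024          # assumes M means 2**20
CACHE_BITS = 128 * 1024      # 128K bit cache per memory segment

bits_per_block = WORDS_PER_BLOCK * BITS_PER_WORD
total_bits = BLOCKS * bits_per_block
cache_words = CACHE_BITS // BITS_PER_WORD

print(bits_per_block // M_BIT, "M bits per block")
print(total_bits // M_BIT, "M bits in total")
print(cache_words, "words of cache per block")

# Access widths named in the text, expressed in 32-bit words
for width_bits in (32, 64, 128):
    print(width_bits, "bit access =", width_bits // BITS_PER_WORD, "word(s)")
```

Each block works out to exactly 1M bits, and the four blocks together give the 4M bits quoted for the device.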
EXTERNAL PORT (OFF-CHIP MEMORY/PERIPHERALS INTERFACE)
The ADSP-TS203S processor’s external port provides the DSP’s interface to off-chip memory and peripherals. The 4G word address space is included in the DSP’s unified address space. The separate on-chip buses (four 128-bit data buses and four 32-bit address buses) are multiplexed at the SOC interface and transferred to the external port over the SOC bus to create an external system bus transaction. The external system bus provides a single 32-bit data bus and a single 32-bit address bus. The external port supports data transfer rates of 500M bytes per second over the external bus.
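Two of the external port figures can be related by simple arithmetic: the 32-bit address bus determines the 4G word address space, and the 500M bytes per second rate on a 32-bit data bus implies a peak number of bus transfers per second. The Python sketch below is illustrative only; it assumes each transfer carries one full 32-bit word and takes G as a binary multiple.

```python
ADDRESS_BITS = 32
DATA_BUS_BYTES = 32 // 8   # one 32-bit word per transfer (assumed)
PEAK_BYTES_PER_S = 500e6   # external port rate from the text

addressable_words = 2 ** ADDRESS_BITS
print(addressable_words // 2 ** 30, "G words addressable")

transfers_per_s = PEAK_BYTES_PER_S / DATA_BUS_BYTES
print(transfers_per_s / 1e6, "M bus transfers per second")
```

Under these assumptions the peak rate corresponds to 125 million 32-bit transfers per second.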