System Bus Noc
-
8/8/2019 System Bus Noc
System Busses / Networks-on-Chip
EECE 579 - Advanced Topics in VLSI Design
Spring 2009
Brad Quinton
-
Outline
1. Simple system busses
  - Overview
  - AMBA APB
  - Advantages/Limitations
2. Complex system busses
  - Overview
  - AMBA AHB
  - Advantages/Limitations
3. Networks-on-Chip (NoC)
  - Overview
  - AMBA AXI
  - Research Topics: Topology, Protocol, VLSI Implementation...
  - Review: A Generic Architecture for On-Chip Packet-Switched Interconnections
-
Bluetooth Platform SoC
[Figure: block diagram of a Bluetooth platform SoC. Blocks include: ARM7TDMI, DAP I/F, Radio I/F, Speech I/F, Shared Memory Controller, LMC, Bridge, Power & Clock Control, DMA, SMC, PLL Clocks, Shared Memory, TIC, Decoder, Arbiter, AHB, APB, ADC, and low-speed peripherals (USB, UART, GPIO, Watchdog). Groupings: Processor; Memory Controller; Application Specific Logic; Low-speed I/O and Support Logic; System Bus/Hardware I/F.]
-
Simple System Busses
The primary goal of a simple system bus is to allow software (running on a processor) to communicate with other hardware in the SoC
There are many different implementations ... but they are all very similar
-
Embedded Processor I/O
RISC-based embedded processors communicate with external hardware using two simple instructions:
- Load Operation: Copies a word of data from a specific address to a local register
- Store Operation: Copies a word of data from a local register to a specific address
The simple system bus is just a direct extension of this model
-
Embedded Processor I/O
- Software sets up the register with the address and data ...
- Blocks decode addresses to see if they are the targets...
- Data transferred between register and hardware
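The load/store model and the address decode step can be sketched in a few lines of Python. This is only an illustrative model, not part of the lecture: the `Register` and `SimpleBus` classes, the addresses, and the 16-bit word width are all assumptions.

```python
class Register:
    """Hypothetical one-word hardware register living at a fixed bus address."""

    def __init__(self, address):
        self.address = address
        self.value = 0


class SimpleBus:
    """Single-master bus: every access is decoded against all attached blocks."""

    def __init__(self, blocks):
        self.blocks = blocks

    def _decode(self, address):
        # Each block compares the address with its own to see if it is the target.
        for block in self.blocks:
            if block.address == address:
                return block
        raise ValueError(f"no block decodes address {address:#06x}")

    def store(self, address, data):
        # Store: copy a word from a processor register to a specific address.
        self._decode(address).value = data & 0xFFFF  # assumed 16-bit word

    def load(self, address):
        # Load: copy a word from a specific address to a processor register.
        return self._decode(address).value


bus = SimpleBus([Register(0x4000), Register(0x4004)])
bus.store(0x4004, 0x1234)
result = bus.load(0x4004)
```

Only the block whose address matches reacts, so adding a block needs no change to the software model beyond a new address.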
-
AMBA Specification
AMBA: Advanced Microcontroller Bus Architecture
- Created by ARM to enable standardized interfaces to their embedded processors
- Actually three standards: APB, AHB, and AXI
- Very commonly used for commercial IP cores
APB: Simple Bus; AHB: Complex Bus; AXI: NoC
-
AMBA APB: Read Operation
[Figure: APB read timing diagram. Annotations: Target Address; Transaction Type; Address Decode; Optional (for asynchronous implementations ...); Read Data]
-
AMBA APB: Write Operation
[Figure: APB write timing diagram. Annotations: Write Data; Common Signals Between Read and Write]
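Since the timing diagrams are missing from this transcript, here is a rough cycle-level sketch of an APB transfer. The signal names (PADDR, PWRITE, PSEL, PENABLE, PWDATA, PRDATA) come from the APB specification. Mapping them onto the slide annotations (target address, transaction type, address decode, read/write data) is an assumption, and the `apb_transfer` and `ApbSlave` helpers are hypothetical. Wait states are not modeled.

```python
def apb_transfer(paddr, write, pwdata=None):
    """Return the two bus cycles (setup phase, access phase) of one APB transfer."""
    common = {"PADDR": paddr, "PWRITE": int(write), "PSEL": 1}
    if write:
        common["PWDATA"] = pwdata
    setup = dict(common, PENABLE=0)   # address, direction, and select are driven
    access = dict(common, PENABLE=1)  # same values held, enable strobed
    return [setup, access]


class ApbSlave:
    """Hypothetical slave with a small register file."""

    def __init__(self):
        self.regs = {}

    def clock(self, signals):
        """Sample the bus for one cycle; return PRDATA in a read access phase."""
        if not (signals["PSEL"] and signals["PENABLE"]):
            return None
        if signals["PWRITE"]:
            self.regs[signals["PADDR"]] = signals["PWDATA"]
            return None
        return self.regs.get(signals["PADDR"], 0)


slave = ApbSlave()
for cycle in apb_transfer(0x10, write=True, pwdata=0xBEEF):
    slave.clock(cycle)
read_values = [slave.clock(cycle) for cycle in apb_transfer(0x10, write=False)]
```

Read and write share the address, direction, select, and enable signals; only the data signal differs.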
-
Remember Our Case Study
Simple generic processor interface:
- data width: 16 bits
- address width: 16 bits
- read cycle time: 50 ns
- write cycle time: 50 ns
(Figure label: System bus)
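As a quick worked number, the interface parameters above put an upper bound on what this bus can move. The calculation assumes one 16-bit word per 50 ns cycle with no idle cycles, which is a best case and not a figure given in the lecture.

```python
DATA_WIDTH_BITS = 16
ADDRESS_WIDTH_BITS = 16
CYCLE_TIME_NS = 50

# One transfer per cycle is the best case for a bus with one outstanding request.
transfers_per_second = 1e9 / CYCLE_TIME_NS
peak_bits_per_second = DATA_WIDTH_BITS * transfers_per_second
peak_megabytes_per_second = peak_bits_per_second / 8 / 1e6

# Number of distinct addresses the processor can present on the bus.
addressable_locations = 2 ** ADDRESS_WIDTH_BITS
```

Under these assumptions the peak is 40 MB/s across 65536 addressable locations, shared by every block on the bus.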
-
Simple Bus Advantages
- Simple to implement
- Easy to understand
- Simple programming model
- Easy to add new hardware blocks
- Minimal hardware requirements (most of the signals are shared)
-
Simple Bus Limitations
- Single master: limits parallelism
- Scalability: performance suffers as bus is loaded...
- Single outstanding request: poor throughput and multi-threading performance bottleneck
-
Case Study: Single Master
Imagine a new partition:
- APS Bit Error Monitor communicates directly with Switch
- Simple bus doesn't work...
(Figure label: No Path)
This can make software the bottleneck in the system....
-
Single Master Summary
A bus that is limited to a single master:
- Makes inter-block communication inefficient
- Limits parallelism between hardware and software
- Increases reliance on interrupts
- Creates software performance bottlenecks
- Is not compatible with multiple processors
-
Scalability
Blocks are functionally easy to add, but....
Each new block increases the delay on the address and data
-
Single Outstanding Request
[Figure: bus timing diagram. Annotations: Processor is stalled waiting for response...; best-case]
-
Single Outstanding Request Summary
Busses limited to a single outstanding request:
- Reduce software performance since the software must stall on the first transaction
- Are not able to achieve full bus throughput since the data bus is idle during the address phase
-
Complex System Busses
The complex system bus attempts to address some of the issues with the simple bus:
- Multi-master
- Pipelined transactions
There are many different ways to go about this...
-
AMBA AHB
AHB addresses many of the limitations of APB:
- multi-master
- multiple outstanding transactions (sort of...)
- back-to-back transactions
Unfortunately, this adds significant complexity
-
Bring on the complexity...
[Figure: two masters (CPU #1, CPU #2) and several IP Blocks sharing one bus. Annotated steps: Request; Grant; Transaction]
-
Bus Arbitration
- When multiple masters share a bus there must be some central resource to manage the bus: an arbiter
- Once there is competition for the bus, it is possible that it is not ready when you need it: backpressure
- Backpressure adds complexity and hurts performance
-
Request / Grant Protocol
[Figure: request/grant timing diagram, annotated as follows]
- Before a transaction a master makes a request to the central arbiter
- Eventually the request is granted
- Then the transaction proceeds
- Performance Impact
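A central arbiter for the request/grant steps above might look like the sketch below. The slides do not name an arbitration policy; round-robin is used here only as an assumed example of a "fair" scheme, and the `RoundRobinArbiter` class is hypothetical.

```python
class RoundRobinArbiter:
    """Grants the bus to one requesting master per arbitration round."""

    def __init__(self, num_masters):
        self.num_masters = num_masters
        # Start so that master 0 has the highest priority in the first round.
        self.last_granted = num_masters - 1

    def grant(self, requests):
        """requests: set of master indices asserting a request. Returns the winner or None."""
        for offset in range(1, self.num_masters + 1):
            candidate = (self.last_granted + offset) % self.num_masters
            if candidate in requests:
                self.last_granted = candidate
                return candidate
        return None  # nobody is requesting the bus


arbiter = RoundRobinArbiter(num_masters=2)
# Both CPUs request continuously: the grant alternates between them.
grants = [arbiter.grant({0, 1}) for _ in range(4)]
idle = arbiter.grant(set())
```

A master that loses a round simply keeps its request asserted, which is where the performance impact comes from: every transaction can be delayed by the arbitration round trip.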
-
Pipelined Transactions
- To help improve bus efficiency the transactions on the bus can be pipelined
- This is really a simple implementation of multiple outstanding transactions
- The address for one transaction can be presented before the data from the previous transaction has been completed
-
Pipelined Transactions
[Figure: pipelined bus timing diagram. Annotations: Transaction A Starts; Transaction B Starts; Transaction A Completes; Notice backpressure]
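To get a feel for the gain from overlapping the address phase of one transaction with the data phase of the previous one, the cycle counts can be compared directly. This sketch assumes fixed-length phases and no wait states (no backpressure), which is more optimistic than the diagram described above; the function names are illustrative.

```python
def non_pipelined_cycles(n, address_cycles=1, data_cycles=1):
    """Each transaction finishes its data phase before the next address phase starts."""
    return n * (address_cycles + data_cycles)


def pipelined_cycles(n, address_cycles=1, data_cycles=1):
    """The address phase of transaction k+1 overlaps the data phase of transaction k."""
    if n == 0:
        return 0
    slowest_phase = max(address_cycles, data_cycles)
    return address_cycles + data_cycles + (n - 1) * slowest_phase


single = non_pipelined_cycles(8)
overlapped = pipelined_cycles(8)
```

With one-cycle phases, eight transactions take 16 cycles without pipelining and 9 cycles with it, so the data bus is busy almost every cycle.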
-
Advantages
- Relatively easy to add new blocks
- Still has the familiar bus structure
- Low hardware cost
- Bus arbitration solves many ordering problems
-
Disadvantages
- Busses that require arbitration:
  - must route signals to the arbitration logic and back
  - must find a fair way to share the bus
  - slaves are not always available => backpressure
  - difficult to provide performance guarantees...
- Still potentially a bandwidth bottleneck
- Still doesn't scale well when blocks are added
- Multiple outstanding transactions not handled well: no ordering information
-
Networks-on-Chip
It is clear that even with significant design effort the bus-style interconnect is not going to be sufficient for large SoCs:
- the physical implementation does not scale: bus fanout, loading, arbitration depth all reduce operating frequency
- the available bandwidth does not scale: the single bus must be shared by all masters and slaves
Let's start again: Leverage research from data networking
-
What do we want?
The SoCs of the future will:
- have 100s of hardware blocks,
- have billions of transistors,
- have multiple processors,
- have large wire-to-gate delay ratios,
- handle large amounts of high-speed data,
- need to support "plug-and-play" IP blocks
Our NoC needs to be ready for these SoCs...
-
The Ideal Network
What would the ideal network look like?
- Low area overhead
- Simple implementation
- High-speed operation
- Low-latency
- High-bandwidth
- Operate at a constant frequency even with additional blocks
- Increase available bandwidth as blocks are added
- Provide performance guarantees
- Have a universal interface
These are competing requirements: Design a network that is the best fit.
-
What do we need to decide?
- Network Interface
- Network Protocol / Transaction Format
- Network Topology
- VLSI Implementation
-
Network Interface
- We want our network to be "plug-and-play" so industry standardization is key
- However, the standard must be universal enough to address many different needs
- AMBA AXI is an example of an attempt at this
-
AMBA AXI
- ARM added the AXI specification to Version 3.0 of the AMBA standard
- New approach: define the interface and leave the interconnect up to the designers
- Good plan since a specific bus implementation is no longer required
- It is possible to use AXI to build many different NoCs
-
AMBA AXI
Interface divided into 5 channels:
- Write Address
- Write Data
- Write Response
- Read Address
- Read Data/Response
Each channel is independent and uses two-way flow control
-
AMBA AXI Read Channels
[Figure: AXI read channels. Annotations: "Give me some data"; "Here you go"; Independent channels synchronized with ID # or tags]
-
AMBA AXI Write Channels
[Figure: AXI write channels. Annotations: "I'm sending data. Please store it."; "Here is the data."; "I received that data correctly."; Independent channels synchronized with ID # or tags]
-
AMBA AXI Flow-Control
Information moves only when:
- Source is Valid, and
- Destination is Ready
On each channel the master or slave can limit the flow
Very flexible
[Figure: handshake timing diagram. Annotation: Transfer]
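The two-way handshake can be modeled per cycle: an item moves only in a cycle where the source is valid and the destination is ready. The sketch below is a simplified single-channel model; `run_channel` and its arguments are hypothetical names, and the source is assumed to keep VALID high whenever it has an item pending.

```python
from collections import deque


def run_channel(items, ready_pattern):
    """Simulate one channel; return (cycle, item) for every completed transfer."""
    pending = deque(items)
    log = []
    for cycle, ready in enumerate(ready_pattern):
        valid = bool(pending)  # source asserts VALID while it has something to send
        if valid and ready:    # transfer happens only when both are high
            log.append((cycle, pending.popleft()))
    return log


# The destination deasserts READY in cycles 0 and 3 to limit the flow.
log = run_channel(["A", "B", "C"], ready_pattern=[0, 1, 1, 0, 1])
```

The source could throttle the channel in the same way by holding VALID low, which is why either side can limit the flow.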
-
AMBA AXI Flow-Control
- This definition of very independent, fully flow-controlled channels is very useful
- However, there is a potential problem: DEADLOCK
- On a write transaction the master must not wait for AWREADY before asserting WVALID
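The deadlock rule can be illustrated with a toy simulation. The slave modeled here is an assumed example that asserts AWREADY only once write data is being offered; the function name and the cycle limit are illustrative, not taken from the AXI specification.

```python
def write_completes(master_waits_for_awready, max_cycles=10):
    """Return True if both the address and data handshakes finish within max_cycles."""
    aw_done = False
    w_done = False
    for _ in range(max_cycles):
        awvalid = not aw_done
        # A misbehaving master holds WVALID low until the address has been accepted.
        wvalid = (not w_done) and (aw_done or not master_waits_for_awready)
        # Assumed slave behavior: accept the address only once data is on offer.
        awready = wvalid or w_done
        wready = True
        if awvalid and awready:
            aw_done = True
        if wvalid and wready:
            w_done = True
        if aw_done and w_done:
            return True
    return False  # each side is waiting for the other: deadlock


deadlocked = not write_completes(master_waits_for_awready=True)
completes = write_completes(master_waits_for_awready=False)
```

When the master waits for AWREADY and the slave waits for WVALID, neither signal ever rises, which is the circular dependency the rule prevents.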
-
AMBA AXI Read
[Figure: AXI read timing diagram showing the Read Address Channel and the Read Data Channel]
-
AMBA AXI Write
[Figure: AXI write timing diagram showing the Write Address Channel, the Write Data Channel, and the Write Response Channel]
-
A True Interface Specification
Because of the channel independence and the two-way flow-control the interface does not dictate the network protocol, transaction format, network topology, or VLSI implementation
For example: if you want to build a packet-based network, you can backpressure the data channel while you build the packet header from the address channel information, you can use store-and-forward, or cut-through, etc.
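The packet-based example above could be sketched as a small network-interface model that holds the data channel's READY low until a header has been built from the address channel. The `Packetizer` class, the dictionary header layout, and routing on the upper address bits are all assumptions for illustration.

```python
class Packetizer:
    """Hypothetical network interface that turns one write into one packet."""

    def __init__(self, route):
        self.route = route  # maps an address to a destination node
        self.header = None
        self.payload = []

    @property
    def wready(self):
        # Backpressure the data channel until the header exists.
        return self.header is not None

    def on_address(self, addr, ident):
        # Build the packet header from the address channel information.
        self.header = {"dest": self.route(addr), "id": ident}

    def on_data(self, word):
        """Offer one data word; returns True only if it was accepted."""
        if not self.wready:
            return False  # the source must keep holding the word
        self.payload.append(word)
        return True

    def packet(self):
        return {"header": self.header, "payload": list(self.payload)}


p = Packetizer(route=lambda addr: addr >> 12)  # assumed: top bits select the node
early = p.on_data(0xAA)        # offered before the address: not accepted
p.on_address(0x3004, ident=7)
late = p.on_data(0xAA)         # accepted once the header is ready
pkt = p.packet()
```

The master needs no knowledge of the packet format; it only sees the data channel stall for a while, which is what the two-way flow control allows.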
-
Network Protocol / Transaction Format
There are many choices for network protocols and transaction formats:
- circuit-switched: plan and provision a connection before communication starts
- packet-switched: issues packets which compete for network resources
- hybrids: schedule connectivity (dynamic or static)
There is still lots of research here....
-
Network Topology
How should your network elements be interconnected:
- Fully Connected (N^2): high area cost, high performance
- Mesh: low area cost, potential poor performance
- Hypercube: medium area, traffic dependent performance
- Fat-tree: medium area, traffic dependent performance
- Torus: medium area, traffic dependent performance
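The area/performance trade-off in the list above can be made concrete with standard link-count and diameter formulas. This sketch counts each bidirectional link once, uses square 2D meshes and tori, and leaves out the fat-tree because its figures depend on parameters the slides do not give. The function names are illustrative.

```python
def fully_connected(n):
    """n nodes, every pair directly linked: link count grows as N^2."""
    return {"links": n * (n - 1) // 2, "diameter": 1}


def mesh_2d(k):
    """k x k grid without wraparound links."""
    return {"links": 2 * k * (k - 1), "diameter": 2 * (k - 1)}


def torus_2d(k):
    """k x k grid with wraparound links (assumes k >= 3)."""
    return {"links": 2 * k * k, "diameter": 2 * (k // 2)}


def hypercube(d):
    """2**d nodes, each linked to the d nodes that differ in one address bit."""
    return {"links": d * 2 ** (d - 1), "diameter": d}


# Compare the options for a 16-node network.
comparison = {
    "fully connected": fully_connected(16),
    "mesh 4x4": mesh_2d(4),
    "torus 4x4": torus_2d(4),
    "hypercube 4-D": hypercube(4),
}
```

For 16 nodes the mesh needs the fewest links but has the longest worst-case path, while full connectivity gives single-hop paths at several times the wiring.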
-
Network Topology
There is lots of research here....
-
Network Topology - Caveat
- There has been a lot of research on topologies for NoCs, however it is important to realize that the performance of a topology is highly dependent on the traffic patterns!
- Traffic patterns in an SoC that you are designing yourself are NOT random, therefore much of the topology research is not applicable to most SoCs!
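The dependence on traffic patterns shows up even in a very small experiment. The sketch below compares the mean hop count on a 4x4 mesh under uniform random traffic with a made-up "designed" pattern in which each block only talks to its right-hand neighbour. The mesh size and both traffic patterns are assumptions chosen for illustration.

```python
from itertools import product


def hops(src, dst):
    """Hop count between two mesh nodes with dimension-ordered routing."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def mean_hops(pairs):
    pairs = list(pairs)
    return sum(hops(s, d) for s, d in pairs) / len(pairs)


K = 4
nodes = list(product(range(K), repeat=2))

# Uniform random traffic: every ordered pair of distinct nodes is equally likely.
uniform_pairs = [(s, d) for s in nodes for d in nodes if s != d]

# Hypothetical designed traffic: each block sends only to its right-hand neighbour.
local_pairs = [((x, y), (x + 1, y)) for x, y in nodes if x + 1 < K]

uniform_mean = mean_hops(uniform_pairs)
local_mean = mean_hops(local_pairs)
```

The same mesh looks far better under the local pattern than under random traffic, so a topology ranking based on random traffic may not carry over to a specific SoC.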
-
VLSI Implementation
Once you have a topology there is still the matter of implementing it on your SoC
There are many considerations:
- Clocking: Synchronous, Asynchronous
- Buffer Insertion: Trade-off power, area, performance
- Register Insertion/Pipelining: Trade-off clock frequency, area, and latency
- Packet Buffers: Trade-off area, latency and throughput
Again, lots of research on-going...
-
Bluetooth Platform SoC
[Figure: the Bluetooth platform SoC block diagram from the start of the lecture, repeated. Groupings: Processor; Memory Controller; Application Specific Logic; Low-speed I/O and Support Logic; System Bus/Hardware I/F.]
-
Research Paper
Let's look at:
Guerrier, P.; Greiner, A., "A generic architecture for on-chip packet-switched interconnections," Design, Automation and Test in Europe Conference and Exhibition 2000. Proceedings, pp. 250-256, 2000