Introspective 3D Chips
S. Mysore, B. Agrawal, N. Srivastava, S. Lin, K. Banerjee, T. Sherwood (UCSB), ASPLOS 2006
Shimin Chen
(LBA Reading Group Presentation)
Motivation
Focus: run-time monitoring for development
  Tool overhead
  Amount of analysis at test-time
Previous research: specialized on-chip h/w modules
  At odds with economics of consumer microprocessors
  May require significant amount of area
  Often introduce interconnect congestion
  Replicated on every processor whether used or not
Challenge: enabling these techniques with a minimum of impact on typical end-user systems
Solution: Add-On using 3D
Optionally adding a layer to a processor specifically for analysis
Developers: processors with this layer
End users: processors without this layer
Outline
Introduction
Benefits of Introspection in 3D
Quantifying the Technology (Methodology)
Architectural Ramifications (Evaluation)
Conclusion
Benefits of Introspection in 3D
Cutting interconnect impact
Reducing cost for commodity parts
Enabling more powerful software analysis
Cutting Interconnect Impact
Previous: gathering data from all over chip for centralized analysis
Global interconnect
  Cross almost every design block
  Consume significant top metal layer
  Run at high speed
  Require wire buffering & even pipeline latches
  Reserve silicon for buffers
Cutting Interconnect Impact
Previous: global interconnect
3D: area for inter-layer vias localized to positions of taps
Reducing Cost for Commodity Parts
225 million PCs in use vs. 0.7 million programmers
Need to consider two costs:
  Cost of a consumer system: circuit that drives the post and the vertical column of vias
  Cost of a developer system: adding an extra layer
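The economic argument on this slide can be sketched as a small cost model. The population figures come from the slide; the per-chip cost values (MONITOR_COST, STUB_COST) are hypothetical placeholders in arbitrary units, chosen only to show the shape of the comparison, not taken from the paper.

```python
# Population figures quoted on the slide.
PCS_IN_USE = 225e6
PROGRAMMERS = 0.7e6

# Hypothetical per-chip costs in arbitrary units (assumptions, NOT from the paper).
MONITOR_COST = 1.0   # full analysis hardware
STUB_COST = 0.01     # post drivers and via columns only


def fleet_cost(chips, per_chip_cost):
    """Total cost of putting hardware worth per_chip_cost on `chips` chips."""
    return chips * per_chip_cost


# How many machines in use there are for each programmer.
users_per_programmer = PCS_IN_USE / PROGRAMMERS

# Integrated design: every chip carries the full monitor, used or not.
integrated = fleet_cost(PCS_IN_USE, MONITOR_COST)

# Stacked design: every chip carries only the stubs; developer chips
# additionally carry the analysis layer.
stacked = fleet_cost(PCS_IN_USE, STUB_COST) + fleet_cost(PROGRAMMERS, MONITOR_COST)

print(f"machines per programmer: {users_per_programmer:.0f}")
print(f"integrated / stacked cost ratio: {integrated / stacked:.1f}")
```

With any stub cost that is small relative to the monitor cost, the stacked total is dominated by the small developer population rather than by the consumer fleet.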
Enabling More Powerful SW Analysis
More h/w resources allocated to analysis
  Area
  Power
Outline
Introduction
Introspection in 3D
Quantifying the Technology
Architectural Ramifications
Conclusion
Cross Section of 3D Chip
Posts: 5 um x 5 um cross-section, 30-40 um high
(compare normal metal wire: 1 um x 1 um)
Estimating Interconnect Overhead
Optimal buffer size and inter-buffer separation
2D interconnect overhead
3D interconnect overhead
Metalization area
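The slide does not say how the optimal buffer size and separation are derived. As an illustration only, the sketch below uses a standard first-order Elmore delay model for a buffered wire, which may differ from the model used in the paper. All parameter names and values (r, c, r_s, c_0, c_p) are assumptions made for this example.

```python
import math


def delay_per_mm(l, s, r, c, r_s, c_0, c_p):
    """Elmore-style delay per mm of a wire cut into segments of length l (mm),
    each driven by a buffer s times the minimum size.

    r, c     : wire resistance (ohm/mm) and capacitance (F/mm)
    r_s      : output resistance of a minimum-size buffer (ohm)
    c_0, c_p : input and parasitic output capacitance of a minimum-size buffer (F)
    """
    return (0.69 * r_s * (c_0 + c_p) / l   # buffer charging buffer capacitances
            + 0.69 * r_s * c / s           # buffer charging the wire
            + 0.38 * r * c * l             # distributed wire RC
            + 0.69 * r * c_0 * s)          # wire charging the next buffer input


def optimal_buffering(r, c, r_s, c_0, c_p):
    """Closed-form minimum of delay_per_mm over l and s (set both partial
    derivatives to zero)."""
    l_opt = math.sqrt(0.69 * r_s * (c_0 + c_p) / (0.38 * r * c))
    s_opt = math.sqrt(r_s * c / (r * c_0))
    return l_opt, s_opt


# Hypothetical technology parameters, for illustration only.
params = dict(r=100.0, c=0.2e-12, r_s=10e3, c_0=1e-15, c_p=1e-15)
l_opt, s_opt = optimal_buffering(**params)
print(f"inter-buffer separation: {l_opt:.2f} mm, buffer size: {s_opt:.0f}x minimum")
```

In this model the separation depends only on the ratio of buffer RC to wire RC, and the buffer size only on the ratio of buffer and wire impedances, so the two can be optimized independently.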
Number of Vertical Posts
Estimate that 1024 bits of profile data will be generated per cycle (?)
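To put the 1024 bits-per-cycle estimate in perspective, the sketch below converts it to a raw data rate. The clock frequency (ASSUMED_CLOCK_HZ) is an assumption for illustration; the slides do not state one.

```python
BITS_PER_CYCLE = 1024     # estimate from the slide
ASSUMED_CLOCK_HZ = 3e9    # hypothetical clock rate, not from the slides


def profile_bandwidth_bytes_per_s(clock_hz, bits_per_cycle=BITS_PER_CYCLE):
    """Raw profile data rate in bytes per second."""
    return bits_per_cycle * clock_hz / 8


rate = profile_bandwidth_bytes_per_s(ASSUMED_CLOCK_HZ)
print(f"{rate / 1e9:.0f} GB/s")  # prints "384 GB/s" at the assumed 3 GHz
```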
Gathering Profile Data on Pentium 4
Example HW Monitor
[Figure: block diagram of the example monitor; recoverable labels: RISC ARM, 16KB, 16KB, 32KB, 32KB, 32KB]
130nm technology, area: 16 mm2, power: 2.7 W
Outline
Introduction
Introspection in 3D
Quantifying the Technology
Architectural Ramifications
Conclusion
Four Types of Systems to Compare
Basic System (Sbase)
System with integrated profiling HW (Sintegrated)
System with profiling HW stacked (Sstacked)
System with profiling stubs (Sstubs)
Routability
Based on Pentium 4 analysis
Sintegrated:
  Total wire length = 5682.3 mm
  Total buffers = ~20,000
Sstacked:
  Total buffers = 1024 (one per post)
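The routability figures can be compared directly. The short calculation below uses only the numbers on this slide; the "implied spacing" is a derived average (total wire length divided by buffer count), not a value reported in the slides, and the 20,000 buffer count is itself approximate.

```python
TOTAL_WIRE_MM = 5682.3        # Sintegrated total wire length
BUFFERS_INTEGRATED = 20_000   # Sintegrated, approximate
BUFFERS_STACKED = 1024        # Sstacked, one per post

# How many times fewer buffers the stacked design needs.
buffer_reduction = BUFFERS_INTEGRATED / BUFFERS_STACKED

# Average wire length served by each buffer in the integrated design.
implied_spacing_mm = TOTAL_WIRE_MM / BUFFERS_INTEGRATED

print(f"buffer reduction: about {buffer_reduction:.1f}x")
print(f"implied average buffer spacing: about {implied_spacing_mm:.3f} mm")
```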
Area for Wires and Buffers
Power
Thermal
Thermal
Conclusion
Economic argument: cost of specialized H/W is decoupled from consumer market
H/W stubs add only 0.021 mm2 area and 0.9% power
Thank you!