Graphite Power Models - MIT...
Power Models
• Cache & Directory (using McPAT)
– http://www.hpl.hp.com/research/mcpat/
– Uses CACTI for modeling data and tag arrays
• Network (using Orion 2.0)
– http://projects.csail.mit.edu/cgi-bin/wiki/view/LSPgroup/OrionPage
– Upgrading to DSENT [Chen et al., NOCS 2012]
• Core (using McPAT)
– Currently validating against real hardware
Modeling Framework
[Figure: modeling framework block diagram]
• Inputs
– Graphite Config File (Core Params, Network Params, Memory Subsystem)
– Technology Parameters (VDD, T, Wmin)
– Benchmark
• Graphite Models
– Core Models
– Memory Subsystem Models
– Network Models
• Power Tools
– McPAT / CACTI
– Orion 2.0 / DSENT
• Outputs
– Area
– Performance
– Energy
Cache (/Directory) Power Models: Modeled Components
• Data Array
• Tag Array
• Miss Status Buffer
• Fill Buffer
• Write-back Buffer
[Figure: example lookup for "Read 0x1AB9", showing tag array entries (1 S 00 0x1AC9; 1 M 10 0xB456; 0 I) next to the data array (cache lines)]
Core Power Models: Modeled Components
• Execution Unit
– Instruction Window
– Integer ALUs
– Floating Point Units (FPUs)
– Complex ALUs (Mul/Div)
– Results Broadcast Bus
• Instruction Fetch Unit
– Instruction Buffer
– Instruction Decoder
• Load Store Unit
– Load/Store Buffers
• Memory Management Unit
– I-TLB
– D-TLB
• Register Files
– Integer RF
– Floating Point RF
Network Power Models: DSENT Calibration on Network Components
• DSENT fully validated against SPICE
• Energy estimates are within 10% of SPICE, with timing constraints satisfied
Power Models: Current Status
• Cache Model
– Models for L1-I cache, L1-D cache, L2 cache and directory are in place
• Network Model
– Currently uses Orion 2.0
– Integration of Graphite with DSENT is in progress
• Core Model
– Currently validating model against real hardware running multicore applications
Core Architectural Configuration
• General Parameters
– clock_rate
– core_tech_node
– instruction_length
– opcode_width
– machine_type
– num_hardware_threads
– fetch_width
– num_instruction_fetch_ports
– decode_width
– issue_width
– commit_width
– fp_issue_width
– prediction_width
– integer_pipeline_depth
– fp_pipeline_depth
– ALU_per_core
– MUL_per_core
– FPU_per_core
– instruction_buffer_size
– decoded_stream_buffer_size
• Register File
– arch_regs_IRF_size
– arch_regs_FRF_size
– phy_regs_IRF_size
– phy_regs_FRF_size
• Load-Store Unit
– LSU_order
– store_buffer_size
– load_buffer_size
– num_memory_ports
– RAS_size
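As a rough illustration of how a subset of these parameters might be grouped and sanity-checked before being handed to a power model, here is a minimal Python sketch. The field names are taken from the list above; the `CoreConfig` class, the `check_config` helper, the example values, and the checks themselves are assumptions made for illustration and are not part of Graphite or McPAT.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical container for a few of the listed core parameters."""
    fetch_width: int
    decode_width: int
    issue_width: int
    commit_width: int
    arch_regs_IRF_size: int
    phy_regs_IRF_size: int
    arch_regs_FRF_size: int
    phy_regs_FRF_size: int
    load_buffer_size: int
    store_buffer_size: int


def check_config(cfg: CoreConfig) -> List[str]:
    """Return a list of human-readable problems (empty if none found).

    The rules below are illustrative assumptions, not McPAT requirements.
    """
    problems = []
    # Every size or width in this sketch is expected to be a positive integer.
    for name, value in vars(cfg).items():
        if value <= 0:
            problems.append(f"{name} must be positive, got {value}")
    # Assumed rule: physical register files are at least as large as
    # the architectural ones.
    if cfg.phy_regs_IRF_size < cfg.arch_regs_IRF_size:
        problems.append("phy_regs_IRF_size is smaller than arch_regs_IRF_size")
    if cfg.phy_regs_FRF_size < cfg.arch_regs_FRF_size:
        problems.append("phy_regs_FRF_size is smaller than arch_regs_FRF_size")
    return problems


# Example values are made up for demonstration only.
example_cfg = CoreConfig(
    fetch_width=4, decode_width=4, issue_width=4, commit_width=4,
    arch_regs_IRF_size=32, phy_regs_IRF_size=128,
    arch_regs_FRF_size=32, phy_regs_FRF_size=128,
    load_buffer_size=32, store_buffer_size=32,
)
print(check_config(example_cfg))
```

Returning a list of problems rather than raising on the first one makes it easier to report every misconfigured parameter in a single pass.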
Core Event Counters
• Instruction Counters
– total_instructions
– int_instructions
– fp_instructions
– branch_instructions
– branch_mispredictions
– load_instructions
– store_instructions
– committed_instructions
– committed_int_instructions
– committed_fp_instructions
• Cycle Counters
– total_cycles
– idle_cycles
– busy_cycles
• Reg File Access Counters
– ialu_accesses
– mul_accesses
– fpu_accesses
– cdb_alu_accesses
– cdb_mul_accesses
– cdb_fpu_accesses
• Execution Unit Access Counters
– ialu_accesses
– mul_accesses
– fpu_accesses
– cdb_alu_accesses
– cdb_mul_accesses
– cdb_fpu_accesses
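To show how counters like these can be turned into summary statistics, here is a small Python sketch. The counter names come from the list above; the `CoreCounters` class, the `summarize` helper, and the choice of derived metrics (IPC from committed instructions, misprediction rate, busy fraction) are assumptions for illustration, not something the slides define.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CoreCounters:
    """Hypothetical holder for a few of the listed core event counters."""
    committed_instructions: int = 0
    branch_instructions: int = 0
    branch_mispredictions: int = 0
    total_cycles: int = 0
    idle_cycles: int = 0
    busy_cycles: int = 0


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(c: CoreCounters) -> Dict[str, float]:
    """Derive common ratios from raw counters (assumed definitions)."""
    return {
        "ipc": _ratio(c.committed_instructions, c.total_cycles),
        "mispredict_rate": _ratio(c.branch_mispredictions,
                                  c.branch_instructions),
        "busy_fraction": _ratio(c.busy_cycles, c.total_cycles),
    }


# Made-up counter values for demonstration only.
counters = CoreCounters(
    committed_instructions=2000,
    branch_instructions=100,
    branch_mispredictions=10,
    total_cycles=1000,
    idle_cycles=250,
    busy_cycles=750,
)
print(summarize(counters))
```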
Network Power Modeling
• Modeling Tool:
– “Orion: A Power-Performance Simulator for Interconnection Networks”
– http://projects.csail.mit.edu/cgi-bin/wiki/view/LSPgroup/OrionPage
• Tracked Events:
– Link Traversals
– Router Buffer Reads/Writes
– Router Switch Allocator Requests
– Router Crossbar Traversals (Unicast/Multicast)
Cache Power Modeling
• Modeling Tool:
– “McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures”
– http://www.hpl.hp.com/research/mcpat/
• Tracked Events:
– Directory Cache Accesses
– L1/L2 Cache Data Reads
– L1/L2 Cache Data Writes
– L1/L2 Cache Tag Accesses
Core Power Modeling
• Modeling Tool:
– “McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures”
– http://www.hpl.hp.com/research/mcpat/
• Example Events:
– Integer/Floating Point add
– Integer/Floating Point subtract
– Integer/Floating Point multiply
– Integer/Floating Point divide
Power Models
• Activity counters track events
– Total Dynamic Energy = sum over event types of (Event Count x Dynamic Energy per event)
– Total Static Energy = sum over components of (Completion Time x Static Power of the component)
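The two energy formulas above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the event and component names, and all numeric values are hypothetical, and the per-event energies and static powers would in practice come from the modeling tools (McPAT, Orion 2.0 / DSENT).

```python
from typing import Dict, Mapping


def dynamic_energy_j(event_counts: Mapping[str, int],
                     energy_per_event_j: Mapping[str, float]) -> float:
    """Sum of (event count x dynamic energy per event) over event types.

    Raises KeyError if an event has no energy entry, so a missing
    model value is not silently treated as zero.
    """
    return sum(count * energy_per_event_j[event]
               for event, count in event_counts.items())


def static_energy_j(static_power_w: Mapping[str, float],
                    completion_time_s: float) -> float:
    """Sum of (completion time x static power) over components."""
    return completion_time_s * sum(static_power_w.values())


def energy_breakdown_j(event_counts: Mapping[str, int],
                       energy_per_event_j: Mapping[str, float],
                       static_power_w: Mapping[str, float],
                       completion_time_s: float) -> Dict[str, float]:
    """Return dynamic, static, and total energy in joules."""
    dynamic = dynamic_energy_j(event_counts, energy_per_event_j)
    static = static_energy_j(static_power_w, completion_time_s)
    return {"dynamic": dynamic, "static": static, "total": dynamic + static}


# Illustrative numbers only; they are not measurements or tool outputs.
example = energy_breakdown_j(
    event_counts={"l1d_data_read": 1000, "l1d_tag_access": 1200},
    energy_per_event_j={"l1d_data_read": 20e-12, "l1d_tag_access": 5e-12},
    static_power_w={"l1d_cache": 0.01},
    completion_time_s=1e-3,
)
print(example)
```

Keeping the dynamic and static terms separate makes it easy to see which one dominates for a given run length.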
Overall Modeling Flow
[Figure: overall modeling flow diagram]
• Inputs
– Benchmark
– Electrical Technology Parameters
– Optical / Electrical Technology Parameters
• Graphite
– Core Models (produce Core Counters)
– Cache Models (produce Cache Counters)
– Network Models (produce Electrical / Optical Router & Link Counters)
• Tools
– McPAT Core
– McPAT Cache
– Orion 2.0 / DSENT
• Outputs
– Core Energy & Area
– Cache Energy & Area
– Network Router & Link Energy & Area
Core Power Modeling Structure
[Figure: block diagram with the following boxes and labels]
• Graphite Core Model
• Graphite-McPAT Interface
• McPAT Processor Model
– McPAT Core Model
– McPAT Data Structure
• Labels on the connections (each appears twice)
– Architectural Parameters
– Event Counters
Core Power Modeling Process
[Figure: block diagram with the following boxes and labels]
• Graphite Core Model
• Graphite-McPAT Interface
• McPAT Processor Model
– McPAT Core Model
– McPAT Data Structure
• Labels on the connections (each appears twice)
– Architectural Parameters
– Event Counters
• Outputs
– Area
– Static Power
– Dynamic Energy
McPAT Cache Model
• Inputs: Parameters
– Cache Size
– Cache Block Size
– Associativity
– Miss / Writeback / Fill Buffer Size
– Frequency
– Latency
– Throughput
• Inputs: Event Counters
– Tag array reads
– Tag array writes
– Data array reads
– Data array writes
– Miss / Writeback / Fill buffer accesses
• Outputs
– Area
– Leakage Power
– Dynamic Energy
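To make the event counters above more concrete, the following Python sketch shows one way a simulator could increment them per cache access. The counter names mirror the list above, but the `record_access` helper and the mapping from accesses to events are assumptions chosen for illustration (a simple write-back, write-allocate cache); they are not taken from Graphite or McPAT.

```python
from collections import Counter


def record_access(counters: Counter, is_write: bool, hit: bool,
                  evicts_dirty_line: bool = False) -> None:
    """Update event counters for one cache access (assumed policy)."""
    # Every access looks up the tag array.
    counters["tag_array_reads"] += 1
    if hit:
        # A hit touches the data array once, as a read or a write.
        key = "data_array_writes" if is_write else "data_array_reads"
        counters[key] += 1
        return
    # A miss allocates an entry in the miss buffer.
    counters["miss_buffer_accesses"] += 1
    if evicts_dirty_line:
        # The dirty victim is read out and placed in the writeback buffer.
        counters["data_array_reads"] += 1
        counters["writeback_buffer_accesses"] += 1
    # The incoming line passes through the fill buffer and is installed,
    # which writes both the tag array and the data array. In this sketch
    # the requesting access is not counted again after the fill.
    counters["fill_buffer_accesses"] += 1
    counters["tag_array_writes"] += 1
    counters["data_array_writes"] += 1


cache_counters = Counter()
record_access(cache_counters, is_write=False, hit=True)
record_access(cache_counters, is_write=True, hit=False, evicts_dirty_line=True)
print(dict(cache_counters))
```

Counts collected this way would then be multiplied by per-event dynamic energies from the cache model to obtain total dynamic energy.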