![Page 1: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/1.jpg)
Architectural Support for Security in the Many-core Age: Threats and
Opportunities
Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev
Department of Computer Science
SUNY-Binghamton
{nael,dima}@cs.binghamton.edu
![Page 2: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/2.jpg)
Multi-cores-->Many-cores
• Moore's law coming to an end
– Power wall; ILP wall; memory wall
– “End of lazy-boy programming era”
• Multi-cores offer a way out
– New Moore's law: 2x number of cores every 1.5 years
– The many-core era is about to get started
– Will have more cores than can be powered -> likely to have a lot of accelerators, including the ones for security
• How to best support trusted computing?
• Critical to anticipate and defuse security threats
![Page 3: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/3.jpg)
Security Challenges for Many-cores
• Diverse applications, both parallel and sequential
• New vulnerabilities due to resource sharing
• Side-Channel and Denial-of-Service Attacks
• Performance impact is a critical consideration
• Can use spare cores/thread contexts to accelerate security mechanisms
• Can use speculative checks to lower latency
![Page 4: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/4.jpg)
DEFENDING AGAINST ATTACKS ON SHARED RESOURCES
![Page 5: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/5.jpg)
Attacks on Shared Resources
• Resource sharing (specifically the sharing of the cache hierarchy) opens the door for two types of attacks
– Side-Channel Attacks
– Denial-of-Service Attacks
• Our first target: software cache-based side channel attacks.
• First, some cache background...
![Page 6: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/6.jpg)
Background: Set-Associative Caches
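[Figure: 4-way set-associative cache with 256 sets (index 0 to 255). The 32-bit address (bits 31 to 0) supplies a 22-bit tag and an 8-bit index. Each way holds a valid bit (V), a tag, and 32 bits of data; the four tag comparisons drive a 4-to-1 multiplexor that produces the Hit and Data outputs.]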
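The address split in the figure can be sketched in C as follows. This is a minimal illustration, assuming the field widths read off the figure (22-bit tag, 8-bit index) and assuming the remaining 2 low bits are a byte offset; the names `CacheAddr` and `split_address` are hypothetical and not from the talk.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths read off the slide's figure: 22-bit tag, 8-bit index.
   The remaining 2 low bits are assumed to be the byte offset. */
enum { OFFSET_BITS = 2, INDEX_BITS = 8, TAG_BITS = 22 };

typedef struct {
    uint32_t tag;    /* compared against the stored tag in each of the 4 ways */
    uint32_t index;  /* selects one of the 256 sets */
    uint32_t offset; /* byte within the block (assumed field) */
} CacheAddr;

/* Split a 32-bit address into tag, set index, and byte offset. */
static CacheAddr split_address(uint32_t addr) {
    assert(OFFSET_BITS + INDEX_BITS + TAG_BITS == 32);
    CacheAddr a;
    a.offset = addr & ((1u << OFFSET_BITS) - 1u);
    a.index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1u);
    a.tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    return a;
}
```

Two addresses that differ only in their tag bits land in the same set and compete for its four ways, which is the property the attacks in the following slides rely on.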
![Page 7: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/7.jpg)
L1 Cache Sharing in SMT Processor
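[Figure: SMT pipeline block diagram showing the instruction cache, fetch unit, per-thread PCs, decode, register rename, issue queue, load/store queues, register file, execution units, LD/ST units, data cache, re-order buffers, and architectural state, with a legend distinguishing private resources from shared resources.]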
![Page 8: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/8.jpg)
Last-level Cache Sharing on Multicores (Intel Xeon)
2 × quad-core Intel Xeon E5345 (Clovertown)
![Page 9: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/9.jpg)
Advanced Encryption Standard (AES)
• One of the most popular algorithms in symmetric key cryptography
– 16-byte input (plaintext)
– 16-byte output (ciphertext)
– 16-byte secret key (for standard 128-bit encryption)
– Several rounds of 16 XOR operations and 16 table lookups
[Figure: an input byte and a secret key byte are combined to form the index into a lookup table.]
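Why a table lookup leaks key material can be sketched as below. This is an illustration only, assuming the index is the input byte XORed with the key byte (as the slide's figure and the "16 XOR operations" bullet suggest), and assuming 4-byte table entries in 32-byte cache lines with a line-aligned table. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4-byte table entries and 32-byte cache lines. */
enum { ENTRY_BYTES = 4, LINE_BYTES = 32, ENTRIES_PER_LINE = LINE_BYTES / ENTRY_BYTES };

/* Lookup index: input byte XOR secret key byte (assumed first-round form). */
static uint8_t lookup_index(uint8_t input_byte, uint8_t key_byte) {
    return (uint8_t)(input_byte ^ key_byte);
}

/* Cache line of the table touched by a lookup, assuming a line-aligned table. */
static unsigned table_line(uint8_t index) {
    assert(ENTRIES_PER_LINE > 0);
    return (unsigned)index / ENTRIES_PER_LINE;
}

/* What an observer who knows the input byte and sees which line was touched
   can recover: the upper bits of the key byte (low bits are returned as 0). */
static uint8_t key_high_bits(uint8_t input_byte, unsigned observed_line) {
    assert(observed_line < 256u / ENTRIES_PER_LINE);
    uint8_t line_mask = (uint8_t)~(ENTRIES_PER_LINE - 1);
    return (uint8_t)((observed_line * ENTRIES_PER_LINE) ^ (input_byte & line_mask));
}
```

Under these assumptions each observed lookup narrows a key byte to 8 candidates, since only the bits selecting an entry within a line stay hidden.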
![Page 10: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/10.jpg)
Cache-Based Side Channel Attacks
• An attacker and a victim process (e.g. AES) run together using a shared cache
• Access-Driven Attack:
– Attacker occupies the cache, evicting victim’s data
– When victim accesses cache, attacker’s data is evicted
– By timing its accesses, attacker can detect intervening accesses by the victim
• Time-Driven Attack:
– Attacker fills the cache
– Times victim’s execution for various inputs
– Performs correlation analysis
![Page 11: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/11.jpg)
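Attack Example
[Figure: a cache and main memory holding the attacker’s data and AES data. The attacker times accesses to its lines a, b, c, d and observes b > (a ≈ c ≈ d), so the access to b stands out as the slow one.]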
Can exploit knowledge of the cache replacement policy to optimize attack
![Page 12: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/12.jpg)
Simple Attack Code Example
#define ASSOC 8
#define NSETS 128
#define LINESIZE 32
#define ARRAYSIZE (ASSOC*NSETS*LINESIZE/sizeof(int))

static int the_array[ARRAYSIZE];
int fine_grain_timer(void); // implemented as inline assembler

// Time each access to the_array and store the measured latency in place.
void time_cache(void) {
    register int i, time, x;
    for (i = 0; i < ARRAYSIZE; i++) {
        time = fine_grain_timer();
        x = the_array[i];                  // the timed access
        time = fine_grain_timer() - time;
        the_array[i] = time;               // overwrite the element with its latency
    }
}
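The timing loop above leaves one latency per array element. A possible post-processing step, which is not in the slides, maps those latencies back to cache sets. The sketch assumes the same 128-set, 32-byte-line geometry, an array that starts at a cache-size-aligned address, and a caller-chosen `threshold` separating hits from misses; the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed geometry, matching NSETS and LINESIZE in the attack code. */
enum { PROBE_NSETS = 128, PROBE_LINESIZE = 32 };

/* Cache set that element i of an int array maps to, assuming the array
   starts at a cache-size-aligned address (an assumption of this sketch). */
static unsigned set_of_element(size_t i) {
    size_t byte_offset = i * sizeof(int);
    return (unsigned)((byte_offset / PROBE_LINESIZE) % PROBE_NSETS);
}

/* Count, per set, the probe accesses slower than `threshold` timer ticks.
   Only the first element of each line is examined, because later elements
   hit on the line that the first access just brought in. */
static void count_slow_lines(const int *times, size_t n, int threshold,
                             unsigned slow[PROBE_NSETS]) {
    assert(times != NULL && slow != NULL);
    const size_t ints_per_line = PROBE_LINESIZE / sizeof(int);
    for (size_t s = 0; s < (size_t)PROBE_NSETS; s++) slow[s] = 0;
    for (size_t i = 0; i < n; i += ints_per_line) {
        if (times[i] > threshold) slow[set_of_element(i)]++;
    }
}
```

Sets with a nonzero count are the ones where the attacker's lines were evicted between fill and probe, which is the signal an access-driven attack looks for.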
![Page 13: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/13.jpg)
Existing Solutions
• Avoid using pre-computed tables – too slow
• Lock critical data in the cache (Lee, ISCA 07)
– Impacts performance
– Requires OS/ISA support for identifying critical data
• Randomize the victim selection (Lee, ISCA 07)
– Significant cache re-engineering -> impractical
– High complexity
– Requires OS/ISA support to limit the extent to critical data only
![Page 14: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/14.jpg)
Desired Features and Our Proposal
• Desired solution:
– Hardware-only (no OS, ISA or language support)
– Low performance impact
– Low complexity
– Strong security guarantee
– Ability to simultaneously protect against denial-of-service (a by-product of access-driven attack)
• Our solution: Non-Monopolizable (NoMo) Caches
![Page 15: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/15.jpg)
NoMo Caches
Key idea: An application sharing cache cannot use all lines in a set
NoMo invariant: For an N-way cache, a thread can use at most N – Y lines
Y – NoMo degree
Essentially, we reserve Y cache ways for each co-executing application and dynamically share the rest
If Y=N/2, we have static non-overlapping cache partitioning
Implementation is very simple – just need to check the reservation bits at the time of replacement
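The replacement-time check could look roughly like the sketch below. It is an illustration, not the authors' logic: it assumes an 8-way set, LRU as the underlying policy, and a fixed layout in which ways [t*Y, (t+1)*Y) are reserved for thread t. The names `way_allowed` and `nomo_pick_victim` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define N_WAYS 8  /* associativity (assumed, matching the 8-way L1 example) */

/* Hypothetical layout: ways [t*y, (t+1)*y) are reserved for thread t;
   all remaining ways are shared. */
static int way_allowed(int way, int thread, int n_threads, int y) {
    int reserved_total = n_threads * y;
    if (way >= reserved_total) return 1;  /* shared way: anyone may evict */
    return way / y == thread;             /* reserved way: owner only */
}

/* Pick the least recently used way that `thread` is allowed to evict.
   lru_age[w]: a larger value means the way was used longer ago. */
static int nomo_pick_victim(const uint32_t lru_age[N_WAYS], int thread,
                            int n_threads, int y) {
    assert(y >= 0 && n_threads * y <= N_WAYS);
    assert(thread >= 0 && thread < n_threads);
    int victim = -1;
    for (int w = 0; w < N_WAYS; w++) {
        if (!way_allowed(w, thread, n_threads, y)) continue;
        if (victim < 0 || lru_age[w] > lru_age[victim]) victim = w;
    }
    return victim;
}
```

With two threads, each thread can evict from N – Y ways, which matches the invariant above; Y = 0 degenerates to plain LRU and Y = N/2 to static partitioning.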
![Page 16: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/16.jpg)
NoMo Replacement Logic
![Page 17: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/17.jpg)
NoMo example for an 8-way cache
• Showing 4 lines of an 8-way cache with NoMo-2
• X:N means data X from thread N
[Figure: six snapshots of the same four cache lines: initial cache usage; more cache usage; thread 2 enters; NoMo entry (yellow = T1, blue = T2); reserved way usage; shared way usage.]
![Page 18: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/18.jpg)
Why Does NoMo Work?
• Victim’s accesses become visible to attacker only if the victim has accesses outside of its allocated partition between two cache fills by the attacker.
• In this example: NoMo-1
![Page 19: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/19.jpg)
Evaluation Methodology
• We used the M-Sim-3.0 cycle-accurate simulator (a multithreaded and multicore derivative of SimpleScalar) developed at SUNY Binghamton
– http://www.cs.binghamton.edu/~msim
• Evaluated security for AES and Blowfish encryption/decryption
• Ran security benchmarks for 3M blocks of randomly generated input
• Implemented the attacker as a separate thread and ran it alongside the crypto processes
• Assumed that the attacker is able to synchronize at the block encryption boundaries (i.e., it fills the cache after each block encryption and checks the cache after the encryption)
• Evaluated performance on a set of SPEC 2006 benchmarks, using a Pin-based trace-driven simulator with Pintools
![Page 20: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/20.jpg)
Aggregate Exposure of Critical Data
[Figure: bar chart "Aggregate Critical Exposure". Y-axis: exposure rate (0.0 to 1.0); x-axis: NoMo-0 through NoMo-4; series: AES encrypt, AES decrypt, BF encrypt, BF decrypt.]
![Page 21: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/21.jpg)
Sets with Critical Exposure
Configuration  AES enc.  AES dec.  BF enc.  BF dec.
NoMo-0         128       128       128      128
NoMo-1         128       128       128      128
NoMo-2         10        14        22       22
NoMo-3         0         0         1        1
NoMo-4         0         0         0        0
![Page 22: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/22.jpg)
Impact on IPC Throughput (105 2-threaded SPEC 2006 workloads simulated)
![Page 23: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/23.jpg)
Impact on Fair Throughput (105 2-threaded SPEC 2006 workloads simulated)
[Figure: normalized fairness (y-axis, 0.97 to 1.01) across benchmark mixes (x-axis) for NoMo-1, NoMo-2, NoMo-3, and NoMo-4.]
![Page 24: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/24.jpg)
NoMo Design Summary
• Practical and low-overhead hardware-only design for defeating access-driven cache-based side channel attacks
• Can easily adjust security-performance trade-offs by manipulating degree of NoMo
• Can support unrestricted cache usage in single-threaded mode
• Performance impact is very low in all cases
• No OS or ISA support required
![Page 25: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/25.jpg)
NoMo Results Summary (for an 8-way L1 cache)
• NoMo-4 (static partitioning): complete application isolation with 1.2% average (5% max) performance and fairness impact on SPEC 2006 benchmarks
• NoMo-3: No side channel for AES, and 0.07% critical leakage for Blowfish. 0.8% average (4% max) performance impact on SPEC 2006 benchmarks
• NoMo-2: Leaks 0.6% of critical accesses for AES and 1.6% for Blowfish. 0.5% average (3% max) performance impact on SPEC 2006 benchmarks
• NoMo-1: Leaks 15% of critical accesses for AES and 18% for Blowfish. 0.3% average (2% max) performance impact on SPEC 2006 benchmarks
![Page 26: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/26.jpg)
Extending NoMo to Last-level Caches
• Side-channel attacks are possible at the L2/L3 level, especially with cache hierarchies that explicitly guarantee inclusion
• Attacker can invalidate victim’s lines in L2/L3, thus forcing their evictions from private L1s.
• Effect of partitioning is much more profound at that level.
• Have to address the possibility of a multithreaded attack.
• Examine other designs for protecting L2/L3 caches. Latency is less critical there.
• Investigations in progress...
![Page 27: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/27.jpg)
USING EXTRA CORES/THREADS FOR SECURITY
![Page 28: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/28.jpg)
Using Extra Cores/Threads for Security
• Main opportunity: using extra cores and core extensions to support security:
– Improve performance by offloading security-related computations
– Reduce design complexity
• Applications that we consider:
– Dynamic Information Flow Tracking (DIFT)
– Dynamic Bounds Checking (not covered in this talk)
![Page 29: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/29.jpg)
Dynamic Information Flow Tracking
• Basic Idea:
– Attacks come from outside of processor
– Mark data coming from the outside as tainted
– Propagate taint inside processor during program execution
– Flag the use of tainted data in unsafe ways
![Page 30: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/30.jpg)
Security Checking Policies
• Memory address AND data tainted
• Load address is tainted
• Store address is tainted
• Jump destination is tainted*
• Branch condition is tainted*
• System call arguments are tainted*
• Return address register is tainted
• Stack pointer is tainted
• Memory address OR data tainted*
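The mark, propagate, and check steps of DIFT can be sketched in C as follows. This is a minimal model, assuming one taint bit per register and a 32-entry register file, and it covers only register-to-register propagation plus the "jump destination is tainted" policy. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIFT_NUM_REGS 32  /* assumed register count for the sketch */

typedef struct {
    bool reg_taint[DIFT_NUM_REGS];  /* one taint bit per architectural register */
} DiftState;

/* Mark a register as holding data that came from outside the processor. */
static void dift_taint_input(DiftState *s, int reg) {
    assert(reg >= 0 && reg < DIFT_NUM_REGS);
    s->reg_taint[reg] = true;
}

/* Propagation for a two-source ALU instruction such as add dst, src1, src2:
   the result is tainted if either source is tainted. */
static void dift_propagate_alu(DiftState *s, int dst, int src1, int src2) {
    assert(dst >= 0 && dst < DIFT_NUM_REGS);
    assert(src1 >= 0 && src1 < DIFT_NUM_REGS);
    assert(src2 >= 0 && src2 < DIFT_NUM_REGS);
    s->reg_taint[dst] = s->reg_taint[src1] || s->reg_taint[src2];
}

/* Check for the "jump destination is tainted" policy: returns true if an
   indirect jump through target_reg should be flagged. */
static bool dift_check_jump(const DiftState *s, int target_reg) {
    assert(target_reg >= 0 && target_reg < DIFT_NUM_REGS);
    return s->reg_taint[target_reg];
}
```

The other policies in the list follow the same pattern: each is a predicate over the taint bits of the operands involved in a load, store, branch, or system call.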
![Page 31: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/31.jpg)
Existing DIFT Schemes and Limitations
• Hardware solutions:
– Taint propagation with extra busses
– Additional checking units
Limitations
– Intrusive changes to datapath design
• Software solutions:
– More instructions to propagate and check taint
Limitations
– High performance cost
– Source code recompilation
![Page 32: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/32.jpg)
Hardware-based DIFT
[Figure: existing hardware DIFT datapath. The instruction add r3,r1,r4 flows through the IFQ, instruction decode, register file (RF), ALU/MEM, and writeback. Registers and data words carry a taint bit (t); taint computation logic runs alongside the ALU and exception checking logic sits before writeback.]
![Page 33: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/33.jpg)
Software-based DIFT
[Figure: existing software DIFT. The compiler expands add r3,r1,r4 into the original instruction plus taint-tracking instructions, including shr r2,r5,2; shl r0,r5,1; or r0,r2,r0; and r2,r2,16; and r0,r0,16; or r5,r5,r0, which then flow through the IFQ, instruction decode, RF, ALU/MEM, and writeback.]
• r5 is used for storing taint information of the remaining register file
• A small region of memory is used to store taint information of memory
![Page 34: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/34.jpg)
Our Proposal: SIFT (SMT-based DIFT)
• Execute two threads on SMT processor
– Primary thread executes real program
– Security thread executes taint tracking instructions
• Committed instructions from main thread generate taint checking instruction(s) for security thread
• Instruction generation is done in hardware
• Taint tracking instructions are stored into a buffer from where they are fed to the second thread context
• Threads are synchronized at system calls
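The commit-side generation and buffering described above can be sketched as below. This is an assumption-heavy illustration rather than the SIFT hardware: it models only ALU instructions, maps each to a single OR over the source registers, uses a 64-entry FIFO, and simply reports a full buffer to the caller. The types `Insn` and `SiftBuffer` and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_ADD, OP_XOR, OP_AND, OP_OR } Opcode;  /* illustrative subset */

typedef struct {
    Opcode op;
    int dst, src1, src2;  /* register numbers; src2 is ignored if has_imm */
    bool has_imm;         /* second operand is an immediate */
} Insn;

#define SIFT_BUF_CAP 64   /* assumed buffer capacity */

typedef struct {
    Insn slots[SIFT_BUF_CAP];
    size_t head, count;   /* FIFO: head is the oldest entry */
} SiftBuffer;

/* Taint-tracking instruction for a committed ALU instruction: OR the taint
   of the sources into the destination. An immediate carries no taint, so
   the single register source is used for both operands. */
static Insn sift_taint_insn(const Insn *committed) {
    Insn t = { OP_OR, committed->dst, committed->src1,
               committed->has_imm ? committed->src1 : committed->src2, false };
    return t;
}

/* Called when the primary thread commits an instruction.
   Returns false if the buffer is full (handling is left to the caller). */
static bool sift_on_commit(SiftBuffer *b, const Insn *committed) {
    assert(b != NULL && committed != NULL);
    if (b->count == SIFT_BUF_CAP) return false;
    b->slots[(b->head + b->count) % SIFT_BUF_CAP] = sift_taint_insn(committed);
    b->count++;
    return true;
}

/* Called by the security thread context to fetch its next instruction. */
static bool sift_fetch(SiftBuffer *b, Insn *out) {
    assert(b != NULL && out != NULL);
    if (b->count == 0) return false;
    *out = b->slots[b->head];
    b->head = (b->head + 1) % SIFT_BUF_CAP;
    b->count--;
    return true;
}
```

For example, a committed `add r3,r1,r4` would yield `or r3,r1,r4` for the security thread, operating on taint values instead of data.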
![Page 35: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/35.jpg)
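Instruction Flow in SIFT
[Figure: primary thread instructions flow from the instruction cache through decode/dispatch, execute, writeback, and commit. Committed instructions feed SIFT instruction generation, which fills the security thread instruction buffer.]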
![Page 36: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/36.jpg)
SMT Datapath with SIFT Support
[Figure: SMT datapath with SIFT logic. Components shown: PC, fetch unit, instruction cache (addr/inst), IFQ, decode/dispatch, register rename, IQ, ROB, functional units FU 1 to FU N, memory units, LSQ, data cache, architectural state, and the SIFT instruction generator, with a legend distinguishing shared resources from private resources.]
![Page 37: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/37.jpg)
SIFT Example
[Figure: Context-1 executes add r3,r1,r4 from IFQ 1 using register file RF 1 and writes r1+r4. The SIFT generator emits or r3,r1,r4 into IFQ 2 for Context-2 (DIFT), which combines the taint bits of r1 and r4 in RF 2. Both contexts use the shared instruction decode, ALU/MEM, and writeback resources.]
![Page 38: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/38.jpg)
SIFT Instruction Generation Logic
1. Taint Code Generation
2. Security instruction opcodes are read from the COT
3. The rest of the instruction fields are taken from the Register Organizer and stored in the Instruction Buffer
4. Load and Store Instruction’s memory addresses are stored in Address Buffer
![Page 39: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/39.jpg)
Die Floorplan with SIFT Logic
• SUN T1 Open Source Core
• IGL synthesized using Synopsys Design Compiler using a TSMC 90nm standard cell library
• COT, IB and AB implemented using Cadence Virtuoso
• The integrated processor netlist placed and routed using Cadence SoC Encounter
• Cost 4.5% of whole processor area
![Page 40: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/40.jpg)
Benefits of Taint Checking with SMT
• Software is not involved; the scheme is transparent to the user and applications (although the checking code can also be generated in software)
• Hardware instruction generation is faster than software generation
• Additional hardware is at the back end of the pipeline, so it is not on the critical path
• No inter-core communication
![Page 41: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/41.jpg)
Number of Security Instructions per committed Primary Thread Instruction
![Page 42: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/42.jpg)
SIFT Performance Overhead
![Page 43: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/43.jpg)
SIFT Performance Optimizations
• Reduce the number of checking instructions by eliminating the ones that never change the taint state.
• Reduce data dependencies in the checker by preloading taint values into its cache once the main program encounters the corresponding address
• Reduce the number of security instructions depending on taint state of registers and TLB
![Page 44: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/44.jpg)
Eliminating Checking Instructions

Primary Thread
lda r0,24176(r0)
xor r9,3,r2
addq r0,r9,r0
ldah r16,8191(r29)
ldq_u r1,0(r0)
lda r16,-26816(r16)
lda r0,1(r0)
lda r18,8(r16)
extqh r1,r0,r1
sra r1,56,r10
bne r2,0x14
and r9,255,r9
stl r3,32(r30)
ldl r2,64(r2)
lda r16,48(r30)
bic r2,255,r2
bis r3,r2,r2

SIFT Security Thread
bis r0,r0,r0
bis r9,r9,r2
bis r0,r9,r0
bis r29,r29,r16
ldq_u r1,0(r0)
bis r1,r0,r1
bne r1,0xfffffffffffff080
bis r16,r16,r16
bis r0,r0,r0
bis r16,r16,r18
bis r1,r0,r1
bis r1,r1,r10
bne r2,0xfffffffffffff080
bis r9,r9,r9
bis r3,r30,r3
bne r3,0xfffffffffffff080
stl r3,32(r30)
ldl r2,64(r2)
bis r2,r2,r2
bne r2,0xfffffffffffff080
bis r30,r30,r16
bis r2,r2,r2
bis r3,r2,r2

SIFT-F Security Thread
bis r9,r9,r2
bis r0,r9,r0
bis r29,r29,r16
ldq_u r1,0(r0)
bis r1,r0,r1
bne r1,0xfffffffffffff080
bis r16,r16,r18
bis r1,r0,r1
bis r1,r1,r10
bne r2,0xfffffffffffff080
bis r3,r30,r3
bne r3,0xfffffffffffff080
stl r3,32(r30)
ldl r2,64(r30)
bne r2,0xfffffffffffff080
bis r30,r30,r16
bis r3,r2,r2
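Comparing the SIFT and SIFT-F listings, the instructions that disappear are the ORs whose destination and both sources are the same register (for example `bis r0,r0,r0`), since they cannot change the taint state. A minimal sketch of such a filter is shown below, assuming taint instructions are represented as simple register triples; the `TaintOr` type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int dst, src1, src2;  /* taint OR: taint[dst] = taint[src1] | taint[src2] */
} TaintOr;

/* An OR whose destination and both sources are the same register leaves
   the taint state unchanged, so it can be dropped. */
static bool is_taint_noop(const TaintOr *t) {
    return t->dst == t->src1 && t->dst == t->src2;
}

/* Compact `insns` in place, keeping only instructions that can change the
   taint state. Returns the new length. */
static size_t filter_taint_noops(TaintOr *insns, size_t n) {
    assert(insns != NULL || n == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_taint_noop(&insns[i])) insns[kept++] = insns[i];
    }
    return kept;
}
```

Applied to the register-only instructions of the SIFT listing, this rule removes six instructions, which matches the difference in length between the SIFT (23) and SIFT-F (17) columns.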
![Page 45: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/45.jpg)
SIFT Logic with Instruction Elimination
![Page 46: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/46.jpg)
Performance Impact of Eliminating Security Instructions
[Figure: performance loss of SIFT and SIFT-F compared to baseline single-thread execution.]
[Figure: percentage of filtered instructions.]
![Page 47: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/47.jpg)
Performance Impact of Cache Prefetching
![Page 48: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/48.jpg)
SIFT Datapath with TLB Based Optimization
![Page 49: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/49.jpg)
SIFT Logic with TLB Based Optimization
![Page 50: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/50.jpg)
Details of TLB Based Optimization
• For Stores
– tainted register -> clean page: generate instructions
– tainted register -> tainted page: generate instructions
– clean register -> clean page: don't generate instructions
– clean register -> tainted page: generate instructions
• For Loads
– tainted page -> tainted register: generate instructions
– tainted page -> clean register: generate instructions
– clean page -> tainted register: generate instructions
– clean page -> clean register: don't generate instructions
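The eight cases above reduce to one rule: taint-tracking instructions are skipped only when both the register and the page are clean. A sketch of that decision is given below, assuming the register taint and a per-page taint bit (for instance from the TLB entry) are available as booleans; the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { MEM_LOAD, MEM_STORE } MemOp;

/* Decide whether taint-tracking instructions must be generated for a memory
   instruction. The table is symmetric for loads and stores, so `op` does not
   affect the result; it is kept to mirror the two halves of the table. */
static bool sift_needs_taint_insns(MemOp op, bool reg_tainted, bool page_tainted) {
    assert(op == MEM_LOAD || op == MEM_STORE);
    (void)op;
    return reg_tainted || page_tainted;
}
```

The clean-to-tainted cases still need instructions because writing clean data must clear the old taint at the destination.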
![Page 51: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/51.jpg)
SIFT Performance on a 4-way Issue Processor
![Page 52: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/52.jpg)
SIFT Performance on an 8-way Issue Processor
![Page 53: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/53.jpg)
Future Work
• To consider in the future:
• Collapse multiple checking instructions via ISA
• Optimize resource sharing between two threads
• Provide additional execution units for the checker
• Implementation of Register Taint Vector and TLB based instruction elimination
![Page 54: Architectural Support for Security in the Many-core Age: Threats and Opportunities Dr. Nael Abu-Ghazaleh and Dr. Dmitry Ponomarev Department of Computer.](https://reader034.fdocuments.us/reader034/viewer/2022051401/56649c745503460f949280cd/html5/thumbnails/54.jpg)
Thank you!
Any Questions?