Revisiting Resource Partitioning for Multi -core...

Post on 13-Apr-2018


Revisiting Resource Partitioning for Multi-core Chips: Integration of Shared Resource Partitioning on a Commercial RTOS

21 Apr. 2017

PAK, EUNJI

Senior researcher, ETRI (Electronics and Telecommunications Research Institute)

pakeunji@etri.re.kr

CMAAS’2017

Agenda
• Qplus-AIR, a commercial RTOS
• Comprehensive shared resource partitioning implementation on Qplus-AIR

Qplus-AIR

ARINC 653 compliant RTOS, certifiable for DO-178B Level A

Introduction to Qplus-AIR
• Qplus-AIR
  - Developed by ETRI for safety-critical systems (2010~2012)
  - Main operating system for the IFCC (Integrated Flight Control Computer) of a UAV (Unmanned Avionics Vehicle), KAI
    - Integrates MC (Mission Control), FC (Flight Control), and C&C (Communications and Commands) in the IFCC
  - ARINC 653 compliant RTOS*
    - Robust partitioning among applications
    - Spatial and temporal
    - Prevents cross-application influence and error propagation among applications
    - Easy integration of multiple applications with different degrees of criticality

* Airlines Electronic Engineering Committee, Avionics Application Software Standard Interface, ARINC Specification 653 Part 1, 2006

Introduction to Qplus-AIR
• Qplus-AIR
  - Certifiable package for DO-178B Level A
  - Lightweight ARINC 653 support: kernel-level implementation
  - Support for multicore platforms (2014~)

• RTWORKS
  - A commercial version of Qplus-AIR
  - Managed by RTST (2013~), ETRI's spin-off company
  - Started with 4 developers, and now has 11 OS developers
  - Work on AUTOSAR (automotive industry standard) and ISO 26262 ASIL D is in progress

• ETRI focuses on research issues while RTST focuses on commercialization

Application Examples
• Safety-critical industrial applications
  - Integrated flight control computer of unmanned avionics vehicle, 2010~2012
  - Tiltrotor flight control computer, 2012
  - Nuclear power plant control system, 2013
  - HUMS (Health and Usage Monitoring System) for helicopter, 2013~2016
  - Subway screen-door control system, 2016 (export to Brazil)
  - Communication system of self-propelled guns, 2017~
  - (project) Autonomous driving car, 2015~

Comprehensive shared resource partitioning implementation on Qplus-AIR

Contents
• Introduction

• HW platform: P4080

• Comprehensive resource partitioning implementation
  - Memory bus bandwidth partitioning
  - DRAM bank partitioning
  - Shared cache partitioning: set-based / way-based

• Combination of all the techniques on Qplus-AIR

• Evaluations

• Conclusions & Future Work

Introduction [1/2]
• Robust partitioning among applications (partitions)
  - Qplus-AIR supports spatial and temporal partitioning
  - Ensures independent execution of multiple applications with various safety-criticality levels

• Robust partitioning may no longer be valid on multicore
  - Multiple cores share hardware resources such as cache or memory
  - Concurrently executing applications affect each other due to contention on shared resources
  - Major source of timing variability
  - Pessimistic WCET estimation → overprovisioning of hardware resources and low system utilization
  - In safety-critical systems, we had to turn off all but one core

Introduction [2/2]
• We must deal with resource contention properly
  - The WCET of tasks stays guaranteed and tightly bounded
  - Especially for safety-critical applications that require certification

• Requirement of inter-core interference mitigation
  - “The applicant has identified the interference channels that could permit interference to affect the software applications hosted on the MCP cores, and has verified the applicant’s chosen means of mitigation of the interference.” - FAA CAST (Certification Authorities Software Team)-32A Position Paper*

• Comprehensive shared resource partitioning implementation on an ARINC 653 compliant RTOS
  - Integrate a number of resource partitioning schemes, each of which targets a different shared hardware resource, on Qplus-AIR
  - Unique challenges due to the fact that the RTOS did not support Linux-like dynamic paging

* Certification Authorities Software Team, Position Paper CAST-32A: Multi-core Processors, 2016.

HW platform, P4080 [1/2]
• P4080 architecture*
  - Eight PowerPC e500mc cores
  - Each core has private 32KB-I/32KB-D L1 caches and a 128KB L2 cache
  - Two 32-way 1MB L3 caches with cache-line interleaving
  - Two memory controllers for two 2GB DDR DIMM modules (each DIMM module has 16 DRAM banks)
  - CoreNet coherency fabric: interconnects cores and other SoC modules; a high-bandwidth switch that supports several concurrent transactions

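[Figure: P4080 block diagram. PowerPC e500mc cores (each with L1 caches, an L2 cache, and a CoreNet interface) connect through the CoreNet fabric to two L3 caches, each backed by a DDR controller and a DIMM module, and to peripherals such as DUART, GPIO, FMan, BMan, and QMan.]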

* P4080 QorIQ Integrated Processor Hardware Specifications, Feb 2014.

HW platform, P4080 [2/2]
• Partitioning support of recent PowerPC processors*

[Excerpt from "Hardware Support for Robust Partitioning in Freescale QorIQ Multicore SoCs (P4080 and derivatives)", Rev. 0, Freescale Semiconductor, p. 10: "Overall partitioning model", with Figure 4. Example of a Partitioned System]

In this model, there are four distinct partitions, each running on two cores. The main memory is divided into several physical regions:

• Private
• Shared between partitions; accessible at user level
• Shared among partitions; restricted to hypervisor level

This mapping is enforced by the cores’ MMUs accessible only at the hypervisor level. System peripherals (PCIe and sRIO) in this example are not shared -- each is allocated to a partition usage. As such, the hypervisor is able to restrict their DMA-accessible memory range to some part of the memory region assigned to the partition through the MMU.

The shared internal memory (CPC) is partially partitioned, which provides two partition-specific sub-ranges.

NOTE: This CPC allocation can be done per-way. Each way is configured to work either as a cache or as a fixed-address sRAM.

1.7 Hypervisors
Several hypervisor technologies are proposed for the P4080 to address different purposes.

RTOS suppliers, such as GreenHills, SysGo and WindRiver, have developed their own hypervisor technology with particular focus on safety and robust partitioning.

* Hardware Support for Robust Partitioning in Freescale QorIQ Multicore SoCs (P4080 and derivatives)

Main memory is divided into several physical regions
• Private
• Shared between partitions; accessible at user level
• Shared among partitions; restricted to hypervisor level

This mapping is enforced by the cores’ MMUs

System peripherals are not shared
• The hypervisor is able to restrict their DMA-accessible memory range to some part of the memory region (through the MMU)

CPC is partitioned
• Way partition (32KB per way)

Each core is allocated to a partition

Restrict the coherency overhead
• Disable the coherency: prevents snoop overhead
• Specify a group participating in coherency

Resource partitioning mechanisms
• 1. Memory bus (interconnect) bandwidth partitioning

• 2. Memory bank partitioning

• Shared cache partitioning
  - 3. Set-based cache partitioning with page coloring
  - 4. Way-based cache partitioning with the support of P4080 hardware

• Combine all the techniques and integrate them on Qplus-AIR

• Paging
  - Memory bank partitioning and set-based cache partitioning assume that the OS supports Linux-like paging
  - Paging implementation in Qplus-AIR

Resource partitioning mechanisms: Memory bus bandwidth regulator [1/2]
• Bus bandwidth regulator*
  - Limits the bandwidth usage per core

[Figure: core 1 and core 2 share the memory bus (CoreNet fabric); each core has a request counter shown as used/budget (e.g., 10/10 and 3/10). Steps: 1) set the memory bus bandwidth budget; 2) count the number of requests sent to the memory bus; 3) generate an interrupt; 4) throttle the requests from core 1.]

* H. Yun, G. Yao, R. Pellizzoni, M. Caccamo, and L. Sha. Memory bandwidth management for efficient performance isolation in multi-core platforms. IEEE Transactions on Computers, 65:562–576, 2015.

Resource partitioning mechanisms: Memory bus bandwidth regulator [2/2]
• Implementation
  - Set up the budget and configure an interrupt to be generated when a core exhausts its budget
    - Configure the performance monitoring control registers and performance monitoring counters (PMCs)
  - The OS scheduler throttles further execution on that core
    - Implement an interrupt handler for the interrupt that the PMC generates
    - The scheduler de-schedules the tasks on the core

• Period of bandwidth regulator execution
  - If too short, the overhead becomes excessive; if too long, predictability worsens
  - The default period of our implementation is 5 ms
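The regulation loop described above can be sketched as a small simulation. This is only an illustrative model, not the Qplus-AIR implementation: the class `CoreRegulator`, its methods, and the `simulate` helper are hypothetical names, and the performance-counter overflow interrupt is modeled by a simple comparison against the budget.

```python
from dataclasses import dataclass

PERIOD_MS = 5  # default regulation period stated on the slide


@dataclass
class CoreRegulator:
    """Per-core budget state (hypothetical model of the PMC-based regulator)."""
    budget: int              # memory requests allowed per period
    used: int = 0            # requests counted so far in the current period
    throttled: bool = False  # True once the budget is exhausted

    def on_memory_request(self) -> bool:
        """Count one request; return False if the core is currently throttled."""
        if self.throttled:
            return False
        self.used += 1
        if self.used >= self.budget:
            # Stands in for the PMC interrupt: the handler would ask the
            # scheduler to de-schedule the tasks on this core.
            self.throttled = True
        return True

    def on_period_start(self) -> None:
        """Replenish the budget at the start of every regulation period."""
        self.used = 0
        self.throttled = False


def simulate(budgets, demands, periods):
    """Return how many requests each core completes over `periods` periods."""
    cores = [CoreRegulator(b) for b in budgets]
    served = [0] * len(cores)
    for _ in range(periods):
        for core in cores:
            core.on_period_start()
        for i, core in enumerate(cores):
            for _ in range(demands[i]):
                if core.on_memory_request():
                    served[i] += 1
    return served


# Core 1 may issue 10 requests per period, core 2 only 3; both want 8.
print(simulate(budgets=[10, 3], demands=[8, 8], periods=2))  # [16, 6]
```

In this model the period length only determines how often `on_period_start` runs; the trade-off noted above (overhead versus predictability) comes from how often the real system has to take the replenishment timer and overflow interrupts.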

Resource partitioning mechanisms: Bank-aware memory allocation
• DRAM bank-aware memory allocation*
  - Manages memory allocation in such a way that no application shares its memory bank with applications running on other cores

[Figure: 1) Application 1 (core 1) and Application 2 (core 2) request memory; 2) the OS allocates physical memory so that each application's virtual memory maps, through the page table (virtual-to-physical address translation) and the HW MMU, to physical memory in its own DRAM bank (Bank 1 or Bank 2).]

* H. Yun, R. Mancuso, Z.-P. Wu, and R. Pellizzoni. PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In RTAS, 2014.

[Figure: P4080 memory address mapping over address bits 0 to 31 (tick marks at bits 6, 7, 12, 14, 16, and 18), showing the fields that select the banks, the L3 cache sets, and the L2 cache sets.]
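A minimal sketch of bank-aware frame allocation follows. It is not PALLOC or the Qplus-AIR allocator: the bank-index field (`BANK_SHIFT`, `BANK_BITS`) is an assumption chosen only for illustration, and `BankAwareAllocator` is a hypothetical name. The point is only that the allocator filters 4KB frames by the bank encoded in their physical address, so cores with disjoint bank sets never share a bank.

```python
PAGE_SIZE = 4096                # 4KB pages, as used for page coloring
BANK_SHIFT, BANK_BITS = 14, 4   # ASSUMPTION: illustrative bank-index field (16 banks)


def bank_of(phys_addr):
    """Return the DRAM bank index encoded in a physical address."""
    return (phys_addr >> BANK_SHIFT) & ((1 << BANK_BITS) - 1)


class BankAwareAllocator:
    """Hands out 4KB frames so that each core only touches its own banks."""

    def __init__(self, mem_base, mem_size, banks_per_core):
        # banks_per_core maps a core id to the set of banks it owns.
        owned = [b for banks in banks_per_core.values() for b in banks]
        if len(owned) != len(set(owned)):
            raise ValueError("a bank is assigned to more than one core")
        self.free = {core: [] for core in banks_per_core}
        for frame in range(mem_base, mem_base + mem_size, PAGE_SIZE):
            for core, banks in banks_per_core.items():
                if bank_of(frame) in banks:
                    self.free[core].append(frame)
                    break

    def alloc_page(self, core):
        """Return the lowest free frame that lies in one of the core's banks."""
        if not self.free[core]:
            raise MemoryError(f"no free frame left in the banks of core {core}")
        return self.free[core].pop(0)


# 1MB of memory; core 1 owns banks 0 and 1, core 2 owns banks 2 and 3.
allocator = BankAwareAllocator(0, 1 << 20, {1: {0, 1}, 2: {2, 3}})
frame1 = allocator.alloc_page(1)
frame2 = allocator.alloc_page(2)
print(hex(frame1), bank_of(frame1), hex(frame2), bank_of(frame2))
```

Because the assumed bank field starts above the page offset, every 4KB frame falls entirely within one bank, which is what makes page-granularity allocation sufficient here.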

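Resource partitioning mechanisms: Set-based cache partitioning [1/2]
• Set-based partitioning via page coloring*
  - Allocation of physical memory considering the cache set location
  - $\text{number of colors} = \dfrac{\text{cache size}}{\text{page size} \times \text{cache associativity}}$

[Figure: 1) Application 1 (core 1) and Application 2 (core 2) request memory; 2) the RTOS allocates physical memory (DRAM) so that the applications occupy different cache sets. Address map over bits 0 to 31 (tick marks at bits 7, 12, and 16): the bits where the L3 cache set index overlaps the physical page number are the colors.]

* R. Mancuso, R. Dudko, E. Betti, M. Cesati, M. Caccamo, and R. Pellizzoni. Real-time cache management framework for multi-core architectures. In RTAS, 2013.
* M. Chisholm, B. C. Ward, N. Kim, and J. H. Anderson. Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In RTSS, 2015.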
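As a worked instance of the color formula, the sketch below plugs in the P4080 figures quoted earlier in the deck. It assumes the two cache-line-interleaved 1MB L3 caches can be treated as a single 2MB, 32-way cache for coloring purposes; the helper names are illustrative.

```python
KB, MB = 1024, 1024 * 1024


def number_of_colors(cache_size, page_size, associativity):
    """number of colors = cache size / (page size * cache associativity)."""
    return cache_size // (page_size * associativity)


def page_color(phys_addr, colors, page_size=4 * KB):
    """Color of the page containing phys_addr: the low bits of the page number."""
    return (phys_addr // page_size) % colors


# ASSUMPTION: the two interleaved 1MB L3 caches behave as one 2MB, 32-way cache.
l3_colors = number_of_colors(2 * MB, 4 * KB, 32)
print(l3_colors)                       # 16, i.e. four color bits
print(page_color(0x5000, l3_colors))   # 5
```

With 4KB pages, sixteen colors correspond to four address bits just above the page offset, which agrees with the bits [15:12] used for coloring on this platform.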

Resource partitioning mechanisms: Set-based cache partitioning [2/2]
• Implementation
  - Manipulates the virtual-to-physical address mapping: allocates disjoint cache sets to each core
  - Of the address bits [15:7] that form the cache set index, it exploits bits [15:12], which intersect with the physical page number in P4080

• L2 co-partitioning & restrictions of set-based partitioning
  - Co-partitioning of the L2 cache
    - For page coloring, the L3 cache set is determined by bits [15:12] and the L2 cache set by bits [13:6]
    - Using bits [13:12] has the side effect of co-partitioning the L2 cache
  - Only bits [15:14] are allowed for L3 cache set partitioning
    - The number of cache partitions is limited to 4
    - If adopted for 8 cores, some cache sets are inevitably shared by 2 cores

[Figure: P4080 memory address mapping over address bits 0 to 31 (tick marks at bits 6, 7, 12, 14, 16, and 18), showing the fields that select the banks, the L3 cache sets, and the L2 cache sets.]
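The restriction to four partitions can be illustrated with the bit ranges given above. In this sketch the round-robin assignment of cores to partitions is a hypothetical policy (the slides do not say how cores are paired), and the helper names are illustrative.

```python
def bits(addr, hi, lo):
    """Extract address bits [hi:lo], inclusive."""
    return (addr >> lo) & ((1 << (hi - lo + 1)) - 1)


def l3_partition(phys_addr):
    """L3 partition when only bits [15:14] are used for coloring."""
    return bits(phys_addr, 15, 14)


def l2_set(phys_addr):
    """L2 cache set index, bits [13:6]."""
    return bits(phys_addr, 13, 6)


def partition_sharing(num_cores, num_partitions=4):
    """Round-robin mapping of cores to partitions (hypothetical policy)."""
    sharing = {p: [] for p in range(num_partitions)}
    for core in range(num_cores):
        sharing[core % num_partitions].append(core)
    return sharing


# Two addresses that differ only in bits [15:14] land in different L3
# partitions but in the same L2 set, so the private L2 is not co-partitioned.
a, b = 0x4000 | 0x1240, 0x8000 | 0x1240
print(l3_partition(a), l3_partition(b), l2_set(a) == l2_set(b))  # 1 2 True

# With 8 cores and 4 partitions, every partition is shared by 2 cores.
print(partition_sharing(8))
```

Coloring on bits [13:12] as well would give 16 partitions, but those bits are also part of the L2 set index, so each core would then be confined to a fraction of its own private L2.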

Resource partitioning mechanisms: Way-based cache partitioning [1/2]
• Way-based partitioning with hardware-level support
  - Configure main memory with multiple distinct partitions
    - For each partition, register the (memory range, target, and partition ID) in the LAW (Local Access Window) register
  - Partition the L3 cache and allocate disjoint cache ways to each core
    - Configure the L3 cache (CPC) related registers: transactions from the specified partition can allocate blocks in the designated cache ways
    - E.g., transactions from ‘partition 1’ allocate blocks in ‘way 0, 1, 2, 3’

[Figure: e6500 cores, each with an L1 cache and an MMU, share a 2MB banked L2 cache and connect through the CoreNet Coherency Fabric. The Local Access Windows divide physical memory (DDR3 DRAM) into Part. 1 to Part. 4 plus a shared region, and the CPC configuration register divides the CPC (L3 cache) into Part. 1 to Part. 4.]
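The way allocation described above can be modeled as disjoint bit masks over the 32 ways. This is a conceptual sketch only: it does not reproduce the actual LAW or CPC register layout, the bit ordering is arbitrary, and `allocate_ways` is a hypothetical helper.

```python
NUM_WAYS = 32      # the L3 cache (CPC) is 32-way
WAY_SIZE_KB = 32   # each way provides 32KB


def allocate_ways(ways_per_partition):
    """Assign contiguous, disjoint way ranges; return {partition_id: bitmask}."""
    if sum(ways_per_partition.values()) > NUM_WAYS:
        raise ValueError("more ways requested than the cache provides")
    masks, next_way = {}, 0
    for partition_id, count in ways_per_partition.items():
        masks[partition_id] = ((1 << count) - 1) << next_way
        next_way += count
    return masks


def capacity_kb(mask):
    """Cache capacity available to a partition, from its way mask."""
    return bin(mask).count("1") * WAY_SIZE_KB


# Partition 1 gets ways 0-3 (as in the example above), partition 2 ways 4-7.
masks = allocate_ways({1: 4, 2: 4})
print(hex(masks[1]), hex(masks[2]), capacity_kb(masks[1]))  # 0xf 0xf0 128
```

Since the masks never overlap, a block allocated by one partition cannot evict a block belonging to another, which is the isolation property way-based partitioning is meant to provide.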

Resource partitioning mechanisms: Way-based cache partitioning [2/2]
• Relaxed restrictions on the number of cache partitions
  - With set-based cache partitioning, the number of cache partitions is restricted to at most four
  - P4080 supports cache partitioning with per-way granularity, with each way providing 32KB
  - The L3 cache is 32-way and can be partitioned into 32 parts

• Limitations of way-based cache partitioning
  - Way-based cache partitioning cannot be used with set-based cache partitioning or memory bank partitioning
    - Conflicting requirements on memory allocation
    - Sequential vs. interleaving
    - May be relevant to all other PowerPC chip models
  - Cache way locking allows integration
    - Most ARM processors support cache way locking
    - The PowerPC e500mc processor supports cache locking at block granularity

[Figure: sequential allocation (one contiguous region for Part. 1 (core 1) followed by one for Part. 2 (core 2)) vs. interleaved allocation (Part. 1, Part. 2, Part. 1, Part. 2, ...).]

Implementation issues from the perspective of an RTOS [1/4]
• Challenges: paging
  - Page coloring assumes that the OS manages memory with fixed-sized pages (normally, 4KB)
  - Qplus-AIR deliberately avoids paging because timing predictability worsens when a TLB miss occurs within a paging scheme

• Memory management of Qplus-AIR
  - Managed with variable-sized pages rather than fixed 4KB pages
    - Kernel data/code, partition regions
    - Manages each region as one large page: 1 TLB entry for each region
  - The OS locks the entry in the TLB: forces all the mapping data to stay in the TLB
  - The size of memory for each application is configured by developers
  - The MMU is used to prevent cross-application memory accesses

[Figure: memory layout with the regions Kernel data, Kernel code, Partition 1, Partition 2, and Partition 3, annotated with example region sizes of 16MB and 64MB.]

Implementation issues from the perspective of an RTOS [2/4]
• Memory management in P4080
  - Two levels of MMU
    - Hardware-managed L1 MMU
    - Software-managed L2 MMU
  - Each MMU consists of
    - TLB for variable-sized pages (VSP), 11 different page sizes (4KB~4GB)
    - TLB for 4KB fixed-sized pages (FSP)
    - TLB locking for variable-sized pages

• Modified memory management of Qplus-AIR
  - To support page coloring, which is used to implement memory bank partitioning and set-based cache partitioning
  - Manage each application's memory regions with 4KB granularity
  - Management of kernel regions was unchanged, to keep the performance of kernel execution predictable

[ref.] PowerPC e500mc core reference manual

Implementation issues from the perspective of an RTOS [3/4]
• Overhead of paging
  - ‘Latency’ benchmark with changing data size and access pattern
    - Sequential access and random access of a linked list
  - Measure the average memory access latency

[Figure: paging overhead (sequential access). Average memory latency vs. data size (KB, 0 to 10000), no paging vs. paging.]
Up to 6% overhead when data size > 2MB
[note] TLB hit ratio = 98.43%; the L2 TLB has 512 entries

[Figure: paging overhead (random access). Average memory latency vs. data size (KB, 0 to 10000), no paging vs. paging.]
Up to 197% overhead when data size > 2MB
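The 2MB threshold and the quoted hit ratio are consistent with simple TLB arithmetic. The sketch below is one plausible reading, not a statement from the slides: it assumes 4KB pages, the 64B block size used by the benchmark, and that a sequential scan larger than the TLB reach misses once per page.

```python
PAGE_SIZE = 4 * 1024      # 4KB fixed-sized pages
BLOCK_SIZE = 64           # one benchmark access reads/writes a 64B block
L2_TLB_ENTRIES = 512      # from the note on the slide

# Memory covered by the L2 TLB when every entry maps a 4KB page.
tlb_reach = L2_TLB_ENTRIES * PAGE_SIZE
print(tlb_reach // (1024 * 1024), "MB")       # 2 MB

# ASSUMPTION: beyond the TLB reach, a sequential scan misses once per page
# and hits on the remaining blocks of that page.
blocks_per_page = PAGE_SIZE // BLOCK_SIZE     # 64
seq_hit_ratio = (blocks_per_page - 1) / blocks_per_page
print(f"{seq_hit_ratio:.4%}")                 # 98.4375%
```

Under the same assumptions, a random traversal of a data set larger than 2MB can miss in the TLB on most accesses, which fits the much larger overhead reported for random access.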

Implementation issues from the perspective of an RTOS [4/4]
• Analysis of overhead
  - The degradation is due to the MMU architecture of the e500mc core
    - L1 instruction and data TLBs and an L2 unified TLB
    - The L1 MMU is controlled as an inclusive cache of the L2 MMU
    - In the PowerPC e6500 core, the L1 and L2 MMUs are not inclusive

• Requirements for predictable paging
  - Some studies have focused on predictable paging*
  - COTS hardware provides means for implementing predictable paging: software-managed TLB or TLB locking

[Figure: L1 TLB (I-TLB and D-TLB) and L2 TLB contents as the data size increases. TLB entries for data evict (replace out) instruction TLB entries from the L2 TLB; the evicted entries are then invalidated in the L1 I-TLB (inclusion property), causing an L1 I-TLB miss even if the code size is within the L1 I-TLB coverage.]

* D. Hardy and I. Puaut. Predictable code and data paging for real time systems. In ECRTS, 2008.

* T. Ishikawa, T. Kato, S. Honda, and H. Takada. Investigation and improvement on the impact of TLB misses in real-time systems. In OSPERT, 2013.

Resource partitioning mechanisms: Integration of partitioning schemes
• Four techniques with paging
  - Memory bus partitioning (RP_BUS), memory bank partitioning (RP_BANK), set-based cache partitioning (RP_$SET), and way-based cache partitioning (RP_$WAY)

• Integration of memory bus, memory bank, and set-based and way-based cache partitioning mechanisms
  - Note that way-based cache partitioning cannot be integrated with memory bank partitioning or set-based cache partitioning

• Possible integration options
  - Integration option #1: RP_BUS, RP_BANK, and RP_$SET
    - Restrictions on the number of available cache partitions
  - Integration option #2: RP_BUS and RP_$WAY
    - Contention on memory banks is unavoidable
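The two integration options follow mechanically from the stated conflicts. As a small check, the sketch below enumerates the largest conflict-free combinations of the four schemes; the function names are illustrative.

```python
from itertools import combinations

SCHEMES = ["RP_BUS", "RP_BANK", "RP_$SET", "RP_$WAY"]
# Way-based cache partitioning conflicts with bank and set-based partitioning.
CONFLICTS = [{"RP_$WAY", "RP_BANK"}, {"RP_$WAY", "RP_$SET"}]


def is_valid(combo):
    """A combination is valid if it contains no conflicting pair."""
    combo = set(combo)
    return not any(pair <= combo for pair in CONFLICTS)


def maximal_valid_combinations(schemes=SCHEMES):
    """Valid combinations that are not a strict subset of another valid one."""
    valid = [set(c)
             for n in range(1, len(schemes) + 1)
             for c in combinations(schemes, n)
             if is_valid(c)]
    return [c for c in valid if not any(c < other for other in valid)]


for option in maximal_valid_combinations():
    print(sorted(option))
```

The two combinations printed are {RP_BUS, RP_BANK, RP_$SET} and {RP_BUS, RP_$WAY}, matching integration options #1 and #2.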

Evaluations [1/5]
• Evaluation setup
  - Hardware platform
    - P4080 with 4 or 8 of its 8 cores activated
  - Software platform
    - Qplus-AIR
  - Synthetic benchmarks
    - Latency: traverses a linked list to perform a read/write operation on each node; memory requests are made one at a time
    - Bandwidth: accesses memory in sequence with no data dependency between consecutive accesses; the CPU generates multiple memory requests in parallel, maximizing the memory-level parallelism (MLP) available in the memory system
  - Metric
    - Average memory access latency (ns): time to read/write one block (64B)
    - Normalize the average latency to the best case without resource contention

Evaluations [2/5]
• Evaluation setup
  - Two benchmark mixes
    - 4-core MIX: causes contention on all the memory resources, to evaluate each partitioning mechanism and the integrated one
    - 8-core MIX: shows the limitation of set-based cache partitioning
  - Data size configuration

4-core MIX:
Core 1           Latency (512KB)
Core 2, 3        Bandwidth (4MB)
Core 4           Bandwidth (32MB)

8-core MIX:
Core 1, 2        Latency (512KB)
Core 3, 4, 5, 6  Bandwidth (4MB)
Core 7, 8        Bandwidth (32MB)

Data size configuration (example platform: 2MB LLC on a 4-core CPU):
Data size  | Definition                             | Example                                                     | Cache (LLC) hit rate
LLC        | Size of LLC divided by number of cores | 2MB / 4 cores = 512KB                                       | 100%
DRAM/small | Twice the size of LLC                  | 2MB × 2 = 4MB                                               | 0%
DRAM/large | Significantly larger than LLC          | Much larger than 2MB (32MB in our experimental setup)       | 0%

Evaluations [3/5]
• 4-core MIX, integration option #1
  - RP_BANK, RP_$SET, and RP_BUS
  - (b) RP_BANK: all the cores are enabled to access banks in parallel
  - (c) Adding RP_$SET ensures 512KB of L3 cache for the Latency(LLC) app running on core 1 (56% improvement compared to the worst case)
    - Moreover, core 1 issues fewer accesses to main memory, which helps performance on the other cores
  - (d) Adding RP_BUS: performance when all techniques are put together

Normalized performance (1 is the performance w/o interference):
        (a) WORST  (b) RP_BANK  (c) RP_BANK+RP_$SET  (d) RP_BANK+RP_$SET+RP_BUS  (e) BEST
core1   0.41       0.55         0.97                 0.97                        1.00
core2   0.49       0.57         0.62                 0.78                        1.00
core3   0.50       0.57         0.62                 0.79                        1.00
core4   0.93       0.87         0.87                 0.85                        1.00

[Figure: bar chart of the normalized performance of core 1 to core 4 for configurations (a) to (e).]
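The improvement figures can be read directly off the 4-core results for integration option #1. The sketch below recomputes them as differences in normalized performance (percentage points); interpreting the quoted 56% as such a difference is an assumption, although it matches 0.97 - 0.41 for core 1.

```python
CONFIGS = ["WORST", "RP_BANK", "RP_BANK+RP_$SET",
           "RP_BANK+RP_$SET+RP_BUS", "BEST"]

# Normalized performance from the 4-core MIX, integration option #1.
RESULTS = {
    "core1": [0.41, 0.55, 0.97, 0.97, 1.00],
    "core2": [0.49, 0.57, 0.62, 0.78, 1.00],
    "core3": [0.50, 0.57, 0.62, 0.79, 1.00],
    "core4": [0.93, 0.87, 0.87, 0.85, 1.00],
}


def gain_points(core, before, after):
    """Change in normalized performance, in percentage points."""
    row = RESULTS[core]
    delta = row[CONFIGS.index(after)] - row[CONFIGS.index(before)]
    return round(delta * 100)


print(gain_points("core1", "WORST", "RP_BANK+RP_$SET"))         # 56
print(gain_points("core2", "WORST", "RP_BANK+RP_$SET+RP_BUS"))  # 29
print(gain_points("core4", "WORST", "RP_BANK+RP_$SET+RP_BUS"))  # -8
```

The negative value for core 4 reflects the cost of regulation for the 32MB Bandwidth workload: its normalized performance drops slightly as the other cores are protected.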

Evaluations [4/5]
• 4-core MIX, integration option #2
  - RP_$WAY and RP_BUS
  - RP_BANK is inapplicable
    - In this benchmark, memory accesses are not concentrated on one bank, since RP_$WAY allocates memory to each core sequentially
    - However, the worst case could arise depending on task behavior
  - RP_$WAY vs. RP_$SET
    - Paging overhead on the RTOS degrades performance
    - 3%, 16%, 17%, and 13% for the applications on core 1, 2, 3, and 4, respectively

Normalized performance (1 is the performance w/o interference):
        (a) WORST  (b) RP_$WAY  (c) RP_$WAY+RP_BUS  (d) BEST
core1   0.41       1.00         1.00                1.00
core2   0.49       0.78         0.91                1.00
core3   0.50       0.79         0.91                1.00
core4   0.93       1.01         0.89                1.00

[Figure: bar charts of the normalized performance of core 1 to core 4 for configurations (a) to (d), with RP_BANK+RP_$SET also plotted.]

Evaluations [5/5]
• 8-core MIX, integration #1 & #2
  - Restrictions on the number of possible cache partitions
    - RP_$SET: 4 partitions, RP_$WAY: 32 partitions on the P4080 platform
    - Performance of Latency(LLC) is about 64% and 88% with RP_$SET and RP_$WAY, respectively
  - Overhead of paging
    - Compare the performance in (b) and (d), or (c) and (e)

Normalized performance (1 is the performance w/o interference):
(a) WORST, (b) RP_BANK+RP_$SET, (c) RP_BANK+RP_$SET+RP_BUS, (d) RP_$WAY, (e) RP_$WAY+RP_BUS, (f) BEST
        (a)    (b)    (c)    (d)    (e)    (f)
core1   0.37   0.64   0.64   0.88   0.87   1.00
core2   0.37   0.64   0.63   0.88   0.86   1.00
core3   0.30   0.42   0.54   0.52   0.71   1.00
core4   0.30   0.42   0.54   0.52   0.71   1.00
core5   0.30   0.42   0.54   0.53   0.71   1.00
core6   0.30   0.42   0.54   0.53   0.71   1.00
core7   0.82   0.75   0.74   0.94   0.79   1.00
core8   0.82   0.74   0.73   0.94   0.79   1.00

[Figure: bar chart of the normalized performance of core 1 to core 8 for configurations (a) to (f).]

Conclusions & Future Work
• Conclusions
  - Qplus-AIR, an ARINC 653 compliant RTOS
  - Comprehensive shared resource partitioning implementation on an ARINC 653 compliant RTOS, Qplus-AIR
    - Issues in implementing and combining multiple resource partitioning mechanisms
    - The unique challenges we encountered due to the fact that the RTOS did not support Linux-like dynamic paging

• Future Work
  - Predictable paging
  - Evaluation with real-world applications

Thank you for your attention
pakeunji@etri.re.kr

References [1/2]
[1] Airlines Electronic Engineering Committee, Avionics Application Software Standard Interface, ARINC Specification 653 Part 1, 2006.
[2] BIOS and kernel developer's guide for AMD family 15h processors, March 2012.
[3] ARM Cortex-A53 Technical Reference Manual, 2014.
[4] P4080 QorIQ Integrated Processor Hardware Specifications, Feb 2014.
[5] Certification Authorities Software Team, Position Paper CAST-32A: Multi-core Processors, 2016.
[6] QorIQ T2080 Reference Manual, 2016.
[7] M. Chisholm, B. C. Ward, N. Kim, and J. H. Anderson. Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In RTSS, 2015.
[8] J. Flodin, K. Lampka, and W. Yi. Dynamic budgeting for settling DRAM contention of co-running hard and soft real-time tasks. In SIES, 2014.
[9] D. Hardy and I. Puaut. Predictable code and data paging for real time systems. In ECRTS, 2008.
[10] T. Ishikawa, T. Kato, S. Honda, and H. Takada. Investigation and improvement on the impact of TLB misses in real-time systems. In OSPERT, 2013.
[11] H. Kim, A. Kandhalu, and R. Rajkumar. A coordinated approach for practical OS-level cache management in multi-core real-time systems. In ECRTS, 2013.
[12] T. Kim, D. Son, C. Shin, S. Park, D. Lim, H. Lee, B. Kim, and C. Lim. Qplus-AIR: A DO-178B certifiable ARINC 653 RTOS. In The 8th ISET, 2013.

References [2/2]
[13] R. Mancuso, R. Dudko, E. Betti, M. Cesati, M. Caccamo, and R. Pellizzoni. Real-time cache management framework for multi-core architectures. In RTAS, 2013.
[14] M. D. Bennett and N. C. Audsley. Predictable and efficient virtual addressing for safety-critical real-time systems. In ECRTS, 2001.
[15] J. Nowotsch and M. Paulitsch. Leveraging multi-core computing architectures in avionics. In EDCC, 2012.
[16] J. Nowotsch, M. Paulitsch, D. Buhler, H. Theiling, S. Wegener, and M. Schmidt. Multi-core interference-sensitive WCET analysis leveraging runtime resource capacity enforcement. In ECRTS, 2014.
[17] S. A. Panchamukhi and F. Mueller. Providing task isolation via TLB coloring. In RTAS, 2015.
[18] M. K. Qureshi and Y. N. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In MICRO, 2006.
[19] R. E. Kessler and M. D. Hill. Page placement algorithms for large real-indexed caches. In ACM Trans. on Comp. Sys., 1992.
[20] L. Sha, M. Caccamo, R. Mancuso, J.-E. Kim, and M.-K. Yoon. Single core equivalent virtual machines for hard real-time computing on multicore processors, white paper. 2014.
[21] N. Suzuki, H. Kim, D. de Niz, B. Anderson, L. Wrage, M. Klein, and R. Rajkumar. Coordinated bank and cache coloring for temporal protection of memory accesses. In ICCSE, 2013.
[22] H. Yun, R. Mancuso, Z.-P. Wu, and R. Pellizzoni. PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In RTAS, 2014.
[23] H. Yun, G. Yao, R. Pellizzoni, M. Caccamo, and L. Sha. Memory bandwidth management for efficient performance isolation in multi-core platforms. IEEE Transactions on Computers, 65:562–576, 2015.