A survey on exploring memory optimizations in smartphones


Transcript of A survey on exploring memory optimizations in smartphones

Page 1: A survey on exploring memory optimizations in smartphones

A SURVEY ON EXPLORING MEMORY OPTIMIZATIONS IN

SMARTPHONES

-KARTHIKEYAN RAMKUMAR

Page 2: A survey on exploring memory optimizations in smartphones

ABSTRACT

• Many memory optimizations have been explored for computer systems and in this survey we explore

their applicability to smartphone hardware.

• Memory technologies such as Mobile RAM (M-RAM), Power Aware Virtual Memory

(PAVM), Dynamic RAM (DRAM) and On-demand mechanisms such as Immediate Power Down

(IPD) mechanism and Immediate Self Refresh (ISR) mechanism are described in this survey.

• Newly emerging technologies such as Phase Change Memory (PCM) and a hybrid approach consisting

of both Phase Change Memory and Mobile RAM are also surveyed.

Page 3: A survey on exploring memory optimizations in smartphones

INTRODUCTION

• Additional features and improved user experience, provided by fast processors, copious

memory, resource-demanding software, and power-hungry hardware make energy a precious resource.

With hardware continuously improving in performance and price, vendors are able to build systems

with higher-performance, higher-power components to meet users’ ever-increasing demands

and compete for customers.

• However, this results in systems that are over-provisioned with components that provide more

capacity, more throughput, and more processing power than needed for the typical workload, and as a

result, it is becoming more difficult to maintain long battery life in these devices.

• While a smartphone contains many energy hungry components, such as CPU, display, and multiple

radios, the energy consumed by the memory subsystem has received limited consideration.

• Therefore, we explore the efficiency of the existing energy management mechanisms on smartphones.

Page 4: A survey on exploring memory optimizations in smartphones

MEMORY TECHNOLOGIES

The memory technologies discussed in this paper include Dynamic RAM (DRAM) which is the most

widely used memory technology in mobile devices and is otherwise referred to as Mobile RAM (M-

RAM). A recent contender for main memory technology is Phase Change Memory (PCM) which is a type

of non-volatile random-access memory that eliminates idle power due to its non-volatile nature but offers

lower performance than M-RAM. We also describe Power Aware

Virtual Memory (PAVM), an energy management mechanism that reduces the energy consumed by memory as workloads

become increasingly data-centric. This section describes these technologies and how

they are optimized in smartphones for better performance.

Page 5: A survey on exploring memory optimizations in smartphones

1. DYNAMIC RAM (DRAM)

• Dynamic random-access memory (DRAM) is a type of random-access memory that stores each bit of

data in a separate capacitor within an integrated circuit.

• As applications are becoming increasingly data-centric, we expect main memory to remain as a

significant energy consumer because achieving good overall system performance will be more likely to

depend on having higher-performance and larger-capacity DRAM.

• We use the terminology of the Double-Data Rate (DDR) memory simply because DDR is becoming the

most common type of memory used in today's PC and server systems. The approach is not limited to

DDR; it can also be applied to other memory types, e.g., SDR and RDRAM.

Page 6: A survey on exploring memory optimizations in smartphones

1.1 MEMORY TRAFFIC RESHAPING

• To reshape the memory traffic for our benefit, we must make memory access less random and more

controllable.

• We use a 4-rank system wherein memory requests are likely to be randomly distributed among the 4

ranks and this creates a large number of small and medium sized idle periods.

• To elongate idle periods, the concepts of hot and cold ranks are introduced.

• Because cold ranks see longer idle periods, Self Refresh can be used more often on them, which

creates more valuable power-saving opportunities.
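The effect of hot and cold ranks on idle periods can be illustrated with a small simulation. This is only a sketch under stated assumptions: one request arrives per time unit, and the rank weights (uniform versus 97/1/1/1) are hypothetical values chosen for illustration, not figures from the experiments.

```python
import random

def mean_gap(times):
    """Average time between consecutive accesses to one rank."""
    if len(times) < 2:
        return float("inf")
    return (times[-1] - times[0]) / (len(times) - 1)

def simulate(rank_weights, n_requests=100_000, seed=0):
    """Send one request per time unit to a rank chosen by weight.

    Returns the mean interarrival time observed on each rank.
    """
    rng = random.Random(seed)
    ranks = list(range(len(rank_weights)))
    targets = rng.choices(ranks, weights=rank_weights, k=n_requests)
    per_rank = [[] for _ in ranks]
    for t, r in enumerate(targets):
        per_rank[r].append(t)
    return [mean_gap(times) for times in per_rank]

# Unshaped traffic: requests are spread randomly over the 4 ranks.
uniform = simulate([1, 1, 1, 1])
# Reshaped traffic (hypothetical split): rank 0 is hot, ranks 1-3 are cold.
reshaped = simulate([97, 1, 1, 1])

print("unshaped mean gaps:", uniform)
print("reshaped mean gaps:", reshaped)
```

With these assumed weights, the cold ranks see much longer gaps between accesses than any rank does under unshaped traffic, which is the property that lets deeper power-saving states be used.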

Page 7: A survey on exploring memory optimizations in smartphones

1.1 MEMORY TRAFFIC RESHAPING

In the experiments conducted, the average

interarrival time was elongated by almost 2

orders of magnitude on cold ranks.

An example showing that if memory traffic

is left unshaped, power management cannot

take full advantage of deeper power-saving

states since most idle periods are too short.
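Why short idle periods rule out deeper states can be sketched as a simple break-even check. The state names and break-even times below are hypothetical placeholders, not values from the survey; the point is only that a state is worth entering when the idle period is at least as long as its break-even time.

```python
# Hypothetical power states, ordered shallow to deep, with the minimum
# idle time (in ns) needed for entering the state to pay off.
STATES = [("active", 0), ("power_down", 50), ("self_refresh", 1000)]

def deepest_state(idle_ns):
    """Return the deepest state whose break-even time fits the idle period."""
    chosen = STATES[0][0]
    for name, break_even_ns in STATES:
        if idle_ns >= break_even_ns:
            chosen = name
    return chosen

def state_histogram(idle_periods_ns):
    """Count how many idle periods qualify for each state."""
    counts = {name: 0 for name, _ in STATES}
    for idle in idle_periods_ns:
        counts[deepest_state(idle)] += 1
    return counts

# Many short gaps (unshaped traffic) versus a few long ones (reshaped traffic).
short_gaps = [10, 20, 60, 30, 15]
long_gaps = [5000, 12000]
print(state_histogram(short_gaps))
print(state_histogram(long_gaps))
```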

Page 8: A survey on exploring memory optimizations in smartphones

1.2 EFFECT OF RESHAPING ON MEMORY TRAFFIC

• To study the effect of memory traffic reshaping in more detail, we compare the results of migrating

1%, 5%, and 10% of pages.

• Migrating only 1% of pages gives only limited benefits in power reduction. On the other

hand, migrating 10% of pages does not give any additional energy benefit beyond that of migrating

5%. In addition, it incurs a larger performance penalty because more pages must be migrated.

• Therefore, migrating 5% of pages gives the best result for the workloads we ran.

Page 9: A survey on exploring memory optimizations in smartphones

1.2 EFFECT OF RESHAPING ON MEMORY TRAFFIC

• As we can see from the Figure, migrating 1% as

opposed to 5% of pages does not give much

benefit in reducing performance penalty.

• Solving the problem at its root calls for an

alternative main memory design, where we

should use high-performance, highly parallel

memory on hot ranks and low-performance low-

power memory on cold ranks.

• Results show that an additional 35.63-38.87% of

energy can be saved by complementing existing

power management techniques with this

technique.

Effects of actively reshaping memory

traffic by migrating 1%, 5%, and 10% of

pages for the low memory-intensive

workload (above) and high memory-intensive

workload (below).

Page 10: A survey on exploring memory optimizations in smartphones

2. PHASE CHANGE MEMORY (PCM)

• Phase change memory is a type of non-volatile random access memory and provides a non-volatile

storage mechanism amenable to process scaling.

• However, for a DRAM alternative, we must architect PCM for feasibility in main memory within

general-purpose systems.

• Drawing on a rigorous survey of PCM device and circuit prototypes published within the last five

years and comparing against modern DRAM memory subsystems, we examine the following: Buffer

Organization and Partial Writes.

Page 11: A survey on exploring memory optimizations in smartphones

2.1 BUFFER ORGANIZATION

• We examine PCM buffer organizations that satisfy DRAM imposed area constraints.

• PCM buffer reorganizations reduce application execution time from 1.6x to 1.2x and memory energy

from 2.2x to 1.0x, relative to DRAM-based systems.

Evaluation:

On optimizing average delay and energy across the workloads, we find four 512B-wide buffers most

effective. Executing on effectively buffered PCM, more than half the benchmarks achieve within 5

percent of their DRAM performance. Although each PCM array write requires 43.1x more energy than a

DRAM array write, these energy costs are mitigated by narrow buffer widths and additional rows, which

reduce the granularity of buffer evictions and expose opportunities for write coalescing, respectively.

Page 12: A survey on exploring memory optimizations in smartphones

2.2 PARTIAL WRITES

• Partial writes, which track data modifications and write only modified cache lines or words to the PCM

array are utilized. Using an endurance model to estimate lifetime, we expect write coalescing and

partial writes to deliver a memory module average lifetime of 5.6 years.

• Scaling improves PCM endurance, extending lifetimes by four orders of magnitude at 32nm.

Evaluation:

• In a baseline architecture with a single 2048B-wide buffer, average module lifetime is approximately

525 hours.

• For our memory intensive workloads, we observe 32.8 percent memory bus utilization. Scaling by

application-specific write intensity, we find 6.9 percent of memory bus cycles are utilized by writes.

• On average, the four 512B-wide buffers coalesce 38.9 percent of writes emerging from the memory

bus, which is 47.0 percent utilized. Writes alone utilize 11.0 percent of the bus. Buffers use partial

writes so that only a fraction of the buffer’s bits is written to the array.
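Write coalescing and partial writes can be sketched with a toy buffer model. The class and method names are hypothetical and the model tracks modifications per word only; it illustrates the idea and is not the evaluated hardware design.

```python
class RowBuffer:
    """Toy buffer row that remembers which words were modified."""

    def __init__(self, row_words):
        self.words = list(row_words)   # buffered copy of the array row
        self.dirty = set()             # indices of modified words
        self.array_word_writes = 0     # words actually written to the array

    def write(self, index, value):
        # Repeated writes to the same word coalesce in the buffer:
        # only the latest value will reach the array.
        self.words[index] = value
        self.dirty.add(index)

    def evict(self, array_row):
        # Partial write: only modified words are written back.
        for i in sorted(self.dirty):
            array_row[i] = self.words[i]
        written = len(self.dirty)
        self.array_word_writes += written
        self.dirty.clear()
        return written

array_row = [0] * 8
buf = RowBuffer(array_row)
buf.write(2, 7)
buf.write(2, 9)   # coalesces with the previous write to word 2
buf.write(5, 1)
written = buf.evict(array_row)
print(written, array_row)
```

In this example three buffer writes result in two array word writes, and six of the eight words in the row are never rewritten.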

Page 13: A survey on exploring memory optimizations in smartphones

PHASE CHANGE MEMORY (PCM)

• Collectively, these results indicate PCM is a viable DRAM alternative, with architectural solutions

providing competitive performance, comparable energy, and feasible lifetimes.

• When using PCM as an alternative to M-RAM, we need to note that it consumes more energy to

perform I/O operations, particularly write operations, since the cell state has to be changed.

• However, PCM consumes significantly less idle power than M-RAM, especially in the low-power state

where the power consumption is reduced to 0. Therefore, we should leverage the tradeoffs between

performance and energy efficiency to apply PCM technology in mobile devices.

Page 14: A survey on exploring memory optimizations in smartphones

3. CHARACTERIZING MOBILE SOFTWARE

The applications selected for this survey are shown in the table.

This table lists 12 popular Android applications selected from the Android market along with their trace

statistics. A T-Mobile G1 smartphone is used to collect the application traces. Each trace consists of

task intervals with the task execution length and the number of memory I/Os.

Page 15: A survey on exploring memory optimizations in smartphones

3. CHARACTERIZING MOBILE SOFTWARE

• Compared to the CPU speed, human interactions are extremely slow, such that a mobile system is idle and waiting

for user input for the majority of time.

• Prior studies have shown that the human perception threshold is between 50ms and 100ms, and any event shorter than the

perception threshold appears instantaneous to the user.

• Completing task execution earlier than the perception threshold is meaningless since the user will not notice this

amount of time and cannot initiate new tasks any sooner. This observation is the key to enabling energy

optimizations without impacting observed application performance.

• The majority of tasks are very short as more than 90% of all tasks complete within 10ms. Moreover, 95% of all

tasks are shorter than 50ms, indicating that these tasks can be extended to the 50ms perception threshold deadline

without any performance penalty. Similarly, for the remaining 5% of long tasks, any additional extension less than

50ms will not be noticed by the user, avoiding performance degradation.
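The slack implied by the perception threshold can be written down directly. This is a minimal sketch assuming the 50ms lower bound of the threshold as the deadline; the function name and the sample task lengths are hypothetical.

```python
PERCEPTION_THRESHOLD_MS = 50.0  # lower bound of the 50ms-100ms range

def allowed_slack_ms(task_ms, threshold_ms=PERCEPTION_THRESHOLD_MS):
    """Upper bound on how much a task can be extended without being noticed.

    Short tasks may be stretched up to the threshold deadline. For long
    tasks, any extension below the threshold goes unnoticed, so the
    threshold itself is an exclusive upper bound.
    """
    if task_ms < threshold_ms:
        return threshold_ms - task_ms
    return threshold_ms

def fraction_below(task_lengths_ms, limit_ms):
    """Fraction of tasks that complete within limit_ms."""
    return sum(1 for t in task_lengths_ms if t < limit_ms) / len(task_lengths_ms)

# Hypothetical task lengths in milliseconds.
tasks = [2.0, 4.0, 8.0, 9.0, 30.0, 120.0]
for t in tasks:
    print(t, allowed_slack_ms(t))
print(fraction_below(tasks, PERCEPTION_THRESHOLD_MS))
```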

Page 16: A survey on exploring memory optimizations in smartphones

4. MECHANISM COMPARISON

• M-RAM needs to refresh the storage cells regularly for data retention, therefore consuming non-

negligible power even in the low-power state. PCM is able to completely eliminate idle power due to

non-volatile nature. We will evaluate the effectiveness of various energy management mechanisms on

M-RAM and PCM under the same execution environment.

• For this survey, a simulator that models the system configuration of a T-Mobile G1 smartphone is used.

• The memory subsystem consists of a memory controller and three 64MB ranks (192MB in total), for

either M-RAM or PCM. The simulator is fed the traces, determines the memory power state, and

conducts task execution under the current CPU and memory state. The memory controller conducts

memory I/O operations and executes power state transitions for each rank based on the energy

management mechanism.

System Configuration of a T-Mobile G1

Page 17: A survey on exploring memory optimizations in smartphones

4.1 POWER AWARE VIRTUAL MEMORY (PAVM)

• In mobile applications when the smartphone is waiting for user input, idle periods are common and

therefore powering down the memory devices during this period can help in reducing the energy

consumption.

• Power-Aware Virtual Memory (PAVM) is a simple and efficient way to provide energy management. It

keeps the memory devices occupied by the currently running process in the active state while keeping

all other memory devices in a low-power state to save energy. Memory devices used by the newly

scheduled process are powered up during the context switch time to minimize the delays exposed to the

user due to power state transitions.

Figure: Memory energy consumption with a standard system (ON) and the PAVM mechanism. The left two bars for each application show the energy of M-RAM and PCM in the standard system, while the right two bars show the energy for the PAVM mechanism.
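The PAVM policy described above can be sketched as a per-rank state update at context switch time. The class, method, and state names are hypothetical, and the mapping from processes to ranks is assumed to be known; this is an illustration of the policy, not the original implementation.

```python
class PavmController:
    """Toy PAVM policy: only ranks used by the running process stay active."""

    def __init__(self, num_ranks):
        self.state = ["low_power"] * num_ranks
        self.ranks_of = {}  # process id -> set of rank indices it occupies

    def map_process(self, pid, ranks):
        self.ranks_of[pid] = set(ranks)

    def context_switch(self, next_pid):
        # Power up the ranks of the newly scheduled process and put
        # every other rank into the low-power state.
        needed = self.ranks_of.get(next_pid, set())
        for r in range(len(self.state)):
            self.state[r] = "active" if r in needed else "low_power"
        return list(self.state)

ctl = PavmController(num_ranks=3)
ctl.map_process(1, {0})
ctl.map_process(2, {1, 2})
first = ctl.context_switch(1)
second = ctl.context_switch(2)
print(first)
print(second)
```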

Page 18: A survey on exploring memory optimizations in smartphones

5. ON-DEMAND MECHANISMS

• Despite PAVM’s benefits to the standard system, it fails to address the energy efficiency of the active

rank accessed during the process execution.

• Immediate Power Down (IPD) mechanism and Immediate Self Refresh (ISR) mechanism have been

proposed for RAM to provide on-demand power state transitions and improve energy efficiency of

active ranks.

• As soon as an I/O request arrives at the memory controller, the rank to be accessed is transitioned to the

PRE state, and transitioned back to a low-power state immediately after the I/O completes. Each energy bar

is normalized to M-RAM with the PAVM mechanism.
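The difference between keeping the active rank powered throughout execution and the on-demand policy can be sketched with a simple energy model. The power values and I/O intervals are hypothetical, and transition energy and latency are ignored, so this is only a rough illustration.

```python
def rank_energy(io_intervals, total_time, p_active, p_low, on_demand):
    """Energy of one rank over [0, total_time].

    io_intervals: list of (start, end) times during which I/O is served.
    on_demand=False: the rank stays in the active state the whole time.
    on_demand=True: the rank is active only while serving I/O and sits
    in the low-power state otherwise (transition costs ignored).
    """
    busy = sum(end - start for start, end in io_intervals)
    idle = total_time - busy
    idle_power = p_low if on_demand else p_active
    return busy * p_active + idle * idle_power

# Hypothetical numbers: two short I/O bursts in a 10-unit window.
io = [(0, 1), (5, 6)]
always_active = rank_energy(io, 10, p_active=10.0, p_low=1.0, on_demand=False)
on_demand = rank_energy(io, 10, p_active=10.0, p_low=1.0, on_demand=True)
print(always_active, on_demand)
```

Setting `p_low` to 0 in this model corresponds to a memory with no idle power in the low-power state.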

Page 19: A survey on exploring memory optimizations in smartphones

5. ON-DEMAND MECHANISMS

• The first bar shows the energy consumption of PCM with the PAVM mechanism and the other bars show the energy

of on-demand mechanisms.

• The two on-demand mechanisms on M-RAM outperform the PAVM mechanism on PCM: PCM’s inferior I/O

efficiency outweighs its energy savings from idle periods, except for the lightly loaded applications Amazon, Music

and Twidroid.

• The PCM OFF mechanism completely eliminates the active idle energy, resulting in 44% energy reduction over the

PAVM mechanism on PCM. Compared to the IPD and ISR mechanisms on M-RAM, the PCM OFF mechanism

offers 18% and 22% energy savings respectively.

Memory energy consumption for

on-demand mechanisms

normalized to the PAVM

mechanism on M-RAM

Page 20: A survey on exploring memory optimizations in smartphones

5. ON-DEMAND MECHANISMS

The distribution of extended tasks

that expose delays for on-demand

mechanisms

• We can observe that the IPD mechanism achieves the best performance with negligible delays exposed.

The ISR and PCM OFF mechanisms, on the other hand, incur more evident degradation due to the

141.5ns long transition latency.

• If energy is the only concern, the novel PCM technology with on-demand mechanism surpasses the

traditional M-RAM. However, taking performance into account as well, M-RAM can still

beat PCM.

• We therefore need an approach to balance energy and performance more efficiently than any standalone

memory technology.

Page 21: A survey on exploring memory optimizations in smartphones

6. HYBRID MEMORY ARCHITECTURE

• From the previous analysis, we can see that PCM is superior to M-RAM in idle power

consumption, while M-RAM outperforms PCM in I/O speed and I/O energy. Therefore, a

hybrid memory consisting of M-RAM and PCM can improve both the energy efficiency and

performance.

• When an application is invoked and its image does not reside in M-RAM, it is loaded into M-RAM

either from secondary storage or PCM, and the corresponding process identifier is put at the head of the

LRU list.

• When an application is closed, its memory image will stay in M-RAM until it is swapped out.

• The hybrid approach preserves more than 99% of IPD’s performance while achieving the best energy

efficiency among all mechanisms.
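The LRU handling of application images described above can be sketched as follows. The class and method names are hypothetical, M-RAM capacity is counted in whole application images for simplicity, and placing swapped-out images in PCM is an assumption made for this sketch, since the text above does not say where they go.

```python
from collections import OrderedDict

class HybridMemory:
    """Toy model of a hybrid memory with an LRU list of process images."""

    def __init__(self, mram_slots):
        self.mram_slots = mram_slots
        # pid -> image; the last entry is the head of the LRU list
        # (most recently used), the first entry is the eviction victim.
        self.mram = OrderedDict()
        self.pcm = {}

    def launch(self, pid, load_from_storage):
        """Bring an application's image into M-RAM and report its source."""
        if pid in self.mram:
            self.mram.move_to_end(pid)
            return "mram"
        if pid in self.pcm:
            image, source = self.pcm.pop(pid), "pcm"
        else:
            image, source = load_from_storage(pid), "storage"
        if len(self.mram) >= self.mram_slots:
            victim, victim_image = self.mram.popitem(last=False)
            self.pcm[victim] = victim_image  # assumption: swap out to PCM
        self.mram[pid] = image
        return source

def load(pid):
    # Hypothetical stand-in for reading an image from secondary storage.
    return f"image-{pid}"

mem = HybridMemory(mram_slots=2)
sources = [mem.launch(p, load) for p in (1, 2, 1, 3, 2)]
print(sources)
print(list(mem.mram), sorted(mem.pcm))
```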

Page 22: A survey on exploring memory optimizations in smartphones

CONCLUSIONS

• The PAVM mechanism saves more than 90% energy as compared to the standard system with no

energy management.

• Additional energy savings are provided by the on-demand mechanisms which offer around 40% more

savings compared to the PAVM mechanism, for both M-RAM and PCM.

• The energy efficiency can be improved further by a hybrid approach consisting of mixed memory

technologies and mechanisms. This approach provides energy savings of 98% with negligible

performance overheads as compared to the standard system.
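The savings figures above can be related to each other with a short calculation. This sketch assumes that "40% more savings compared to the PAVM mechanism" means a 40% reduction relative to PAVM's remaining energy, and it uses the 90% lower bound for PAVM; both are interpretations, not statements from the results.

```python
def remaining_energy(savings_steps):
    """Fraction of baseline energy left after successive relative savings."""
    remaining = 1.0
    for saving in savings_steps:
        remaining *= 1.0 - saving
    return remaining

# PAVM saves 90% of the standard system's energy; on-demand mechanisms
# then save a further 40% of what PAVM still consumes.
on_demand_total_saving = 1.0 - remaining_energy([0.90, 0.40])

# The hybrid approach saves 98% overall, i.e. 2% of baseline remains.
# Relative to PAVM's remaining 10%, that is a further reduction of:
hybrid_vs_pavm = 1.0 - 0.02 / remaining_energy([0.90])

print(f"on-demand total saving: {on_demand_total_saving:.0%}")
print(f"hybrid saving relative to PAVM: {hybrid_vs_pavm:.0%}")
```

Under these assumptions the on-demand mechanisms correspond to roughly 94% total savings over the standard system, and the hybrid approach to roughly an 80% reduction relative to PAVM alone.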