Mobile System Concerns - McMaster University

126
Mobile System Concerns Mobile System Concerns in the Cloud Age in the Cloud Age Lin Zhong Ri Effi i tC ti G ( ) Rice EfficientComputing Group (recg.org) Rice University

Transcript of Mobile System Concerns - McMaster University

Mobile System ConcernsMobile System Concerns in the Cloud Agein the Cloud Age

Lin ZhongRi Effi i t C ti G ( )Rice Efficient Computing Group (recg.org)

Rice University

rackspacep

• Input• Output• Output• Wireless connectivity

3

Omni directional transmission a key bottleneck

Ongoing project with Ashutosh Sabharwal

ISLPED’10 and MobiCom’10

Two ways to realize directionalityy y

• Passive directional antennas 8 and 5dBi antennasPassive directional antennas– Low costfi d b tt– fixed beam patterns

– MobiCom’10

• Digital beamforming– Flexible beam patterns– Flexible beam patterns– High cost

6Phased-array antenna system from Fidelity Comtech

Passive directional antennas

Mi t i t (5dBi k i )• Microstrip antenna (5dBi peak gain)

3.2 cm

cm3.

2

7

Challengesg

• Multipath effectMultipath effect– Hard to find the best transmit direction

bili d i• Mobility and rotation– Destroys the already found best direction– Rotation is most challenging

8

Questions we try to answery

1) How do smartphone‐like mobile device1) How do smartphone like mobile device rotate during wireless access?

2) H d di i l b h i h2) How do directional antennas behave with indoor and non‐line‐of‐sight (NLOS) propagations?

3) How can a device dynamically select the best3) How can a device dynamically select the best antenna?

9

Orientation estimation using Euler AnglesAngles• θ and φ based on tri‐axis accelerometerθ and φ based on tri axis accelerometer • ψ based on tri‐axis compass and θ and φ

z Sky

Y Device X axisDevice Y axis

y

YX

Device Z axis M i

ψ

θDevice Z axis Magnetic

North

10x

ψ

Ground

User studyy

• Collecting accelerometer/compass dataCollecting accelerometer/compass data• 11 users with G1 android phone• One week of continuous measurement for each user

• Data is made open access– http://www.ruf.rice.edu/~mobile/downloads.htmp

11

Device rotates slowlyy

1

0.1s1s

0.5C

DF

1s10s

CD

F

0

CC

10-4

10-2

100

102

Rotational speed(/s)

Slower than 120°/s for 90% of the time

12

Questions we try to answery

1) How do smartphone‐like mobile device1) How do smartphone like mobile device rotate during wireless access?

2) H d di i l b h i h2) How do directional antennas behave with indoor and non‐line‐of‐sight (NLOS) propagations?

3) How can a device dynamically select the best3) How can a device dynamically select the best antenna?

13

Measurement setupMeasurement setup

1 i di ti l t

3 directional antennas

1 omni‐directional antenna

WARP

Controllable motor

Laptop

14

First measurement

• RSSI measured at both endsRSSI measured at both ends

Data packets

ACK packetsACK packets

Different directions in th h i t l lthe horizontal plane

15

Directional antennas outperform omni in big angular regionsChannel reciprocity holdsin big angular regionsChannel reciprocity holds

NLOS ind. / 5dBi antenna

-30

-20m

)

Dir-ClientDir-APOmni Client

50

-40

SS(d

Bm Omni-Client

Omni-AP

0 60 120 180 240 300 360

-60

-50

RS

0 60 120 180 240 300 360Direction()

16

Second measurement

• RSSI measured at APRSSI measured at AP

PHY symbols

One hour of rotationaccording to user

Sent from omni antenna

Sent from directional antenna

according to userstudy traces

17

Directional outperforms omni for ~50% of the total timeSuperiority intervals > 1 second~50% of the total timeSuperiority intervals > 1 second

305dBi

20

25

e(%

)

10

15

otal

tim

e

[0 0 1) [0 1 1) [1 10) [10 inf)0

5

to

[0,0.1) [0.1,1) [1,10) [10,inf)superiority intervals(s)

18

RSS is predictablep

1005dBi

Zero orderFirst order

1r(

dB)

0.01Erro

r

10ms 100ms 1sPrediction Intervals(s)

Prediction Intervals(s)

Less than 0.1 dB error for short intervals19

Less than 0.1 dB error for short intervals

Multi antenna design (MiDAS)g ( )

• Only one RF chainy– One active antenna at a time

• Directional antennas used only for Data transmission and yAck reception

Omni-directional antenna Antenna switch

Transceiver. . .Directional antennas

Antenna selection

RSSI

20

Questions we try to answery

1) How do smartphone‐like mobile device1) How do smartphone like mobile device rotate during wireless access?

2) H d di i l b h i h2) How do directional antennas behave with indoor and non‐line‐of‐sight (NLOS) propagations?

3) How can a device dynamically select the best3) How can a device dynamically select the best antenna?

21

Packet‐based antenna selection

• Works for legacy networksWorks for legacy networks

• Antenna assessment based on ACK packet– Uses channel reciprocityUses channel reciprocity

• One antenna assessment per packet– Antenna assessment is expensivep

22

Heuristic antenna selection

• Best mode: Uses theBest mode: Uses the previously selected  Best

right antenna• Safe mode: Uses omni

(1) Still the best?

Yes

Safe mode: Uses omni antenna when RSS 

No

Safe

changes rapidlyAssess antennas(2) Should

NoYes

Assess antennaswe assess?

23

Symbol‐based antenna selectiony

• Needs change to the PHY layerNeeds change to the PHY layer• Data packet‐based assessment

– Does not depend on channel reciprocity 

• Assess all antennas per packetp p

A t t i i Regular packet Antenna training symbols

ACK SEL

24

Trace based evaluation

• Rotation traces replayed on the motorp y• RSSI traces collected for all antennas• Algorithms evaluated on traces offline• Algorithms evaluated on traces offline

-45

-50B)

Dir 3

-55RSS

(dB

Dir

Omni

0 5 10 15 20-60

Dir1 Dir

3

0 5 10 15 20time(second)

25

~3dB possible link gainBoth selection methods work well for p gshort average intervals

5

6

Upper bound Symbol-based Packet-based

3

4in

(dB

)

1

2Ga

10ms 100ms 1s 10s0

Average Packet Interval

g

• Poisson traffic• Three directional antennas and one omni

26

• 5dBi directional antennas• NLOS indoor environment

Good performance in all environmentsGood performance in all environments

5

6

Upper bound Symbol-based Packet-based

3

4in

(dB

)

1

2Ga

NLOS-ind. LOS-ind. outd.0

Environment

• Poisson traffic with 10ms average interval• Three directional antennas and one omni

27

• 5dBi directional antennas

Two directional antennas are enoughTwo directional antennas are enough

5

6

Upper bound Symbol-based Packet-based

3

4in

(dB

)

1

2Ga

three two-opp two-adj one0

Antenna Configuration

g

• Poisson traffic with 10ms average interval• 5dBi directional antennas

28

• NLOS indoor environment

More focused beam is not necessarily betterbetter

5

6

Upper bound Symbol-based Packet-based

3

4in

(dB

)

1

2Ga

5dBi 8dBi0

Antenna Gain

• Poisson traffic with 10ms average interval• Three directional antennas and one omni

29

• NLOS indoor environment

MiDAS handles mobility tooy

• MiDAS client rotating according to the tracesMiDAS client rotating according to the traces• Omni AP moves around randomly

6Upper bound

4

5

in(dB)

ppSymbol‐based

2

3

rage G

ai

0

1

2

Aver

30

0

Adding Rate Adaptation & Power ControlAdding Rate Adaptation & Power Control• Why?Why?

– Realize the gain of MiDAS in terms of goodput and power savingpower saving

• How?– SNR‐triggered rate adaptation

• Uses goodput‐rate table of the wireless cardg p

– Pick the highest rate possible, given SNRReduce the transmit power as much as possible– Reduce the transmit power as much as possible not to hurt the chosen rate more than a threshold

31

Goodput gain for weak connectionsPower saving for strong connectionsGoodput gain for weak connectionsPower saving for strong connections

150

GG-Upper bound GG-MiDASPR-Upper bound

100%

PR Upper bound PR-MiDAS

50

0 10 20 300

Omni SNR(dB)

Omni SNR(dB)

Simulated for 802.11a with 8 rates

32

Real‐time experimentp

• WARP goodput‐rate table14

BPSK

8

10

12

Mbp

s)

BPSK QPSK QAM16

4

6

8oo

dput

(

10 20 30 400

2

SNR (dB)

G

SNR (dB)

33

Goodput gain for weak connectionsPower saving for strong connectionsGoodput gain for weak connectionsPower saving for strong connections

90100

Goodput gain85

708090 p g

TX Power reduction85

405060

%

51

102030

7010

Low SNR High SNR

70

34

g

Experiment

Conclusions

• CharacterizationsCharacterizations– Smartphones rotate relatively slowTh h l f di ti l t i i l– The channel of directional antennas is reciprocal and predictable in short intervals

• MiDAS– Effectively employs directional antennas onEffectively employs directional antennas on smartphones to increase link gain by ~3dB

35

Beamformingg

• A group of omni‐directional antennasA group of omni directional antennas• Multiple RF chains• Baseband signal is weighted, and multiplexed to different RF chains and antennas

11RX

22

⋮⋮

,

36

,

Power constraint

• Circuit power increasingly smallCircuit power increasingly small 

DAC Filter Mixer Filter PA1

PPAPCircuit

1200

mW

)

SISO

Baseband Signal

800

1000

nsum

ptio

n ( SISO

2x2 MIMO

FrequencySynthesizer N

PShared 400

600

Pow

er C

onBaseband 

DAC Filter Filter PAN

PShared

2002 2004 2006 2008 20100

200Tr

ansm

itter

Signal Mixer

Data collected from JSSC and ISSCC

2002 2004 2006 2008 2010Year

T

Form factor constraint

6090

120

0.5lambda0.3lambda 7

4 antennas

30150

0.1lambda

5

6

gai

n (d

B)

4 antennas3 antennas2 antennas

180 0

2

3

4

eam

form

ing

210

240 300

330

0

1

2

Pea

k be

270 0 0.1 0.2 0.3 0.4 0.50Antenna spacing (wavelength)

38

Mobility Constrainty

• Beamsteering gain under CSI estimation (perBeamsteering gain under CSI estimation (per 10ms)

)

Indoor

Max )

Outdoor

Max

6

ng g

ain

(dB

)

Static90d/s180d/s

6

ng g

ain

(dB Static

90d/s180d/s

3

Bea

mfo

rmin

3

Bea

mfo

rmin

N=2 N=40

B

N=2 N=40

39

Mobility Constraint (Contd.)y ( )

• Beamsteering gain under CSI estimation (perBeamsteering gain under CSI estimation (per 100ms)

6dB

)

Indoor

MaxStatic 6

B)

Outdoor

MaxStatic

3

6

min

g ga

in (d

Static90d/s180d/s

3

6

min

g ga

in (d

Static90d/s180d/s

0

3

Bea

mfo

rm

0

3

Bea

mfo

rm

N=2 N=40

N=2 N=40

40

Mobility Constraint (Contd.)y ( )

• Key conclusionsKey conclusions– Beamsteering gain is more stable indoorSt ti li t l t l t k th– Static client can always accurately track the channel

– CSI estimation per 10ms guarantees near‐perfect tracking of the channel

• Typical frame length– UMTS/LTE: 10ms– UMTS/LTE: 10ms– 802.11: tens of ms

41

Key tradeoffsy

• Circuit and transmit powerCircuit and transmit power• Device efficiency and network capacity

4

0 K=4K=3K=2

1 2

3 4

90 270

K=1

90 270

180

Tradeoff by Beamsteeringy g

N=1

 

Transmit power

Circuit powerN=2

Circuit power

N=4

• Single‐link scenario• Single‐link scenario• Two‐link scenario

43

Single‐link Scenariog

• An uplink channel between a MC and a BSAn uplink channel between a MC and a BS

BS

1500

ion

(mW

)

N=1N=2N=3

d

1000

rcon

sum

pt N=4

MC

500

ient

pow

er

2 4 60Cl

Uplink capacity (b/s/Hz)

44

Single‐link Scenario (Contd.)g ( )

• Key conclusionsKey conclusions– Beamsteering can be more power efficient than omni directional transmissionomni‐directional transmission

– The larger the uplink capacity, the larger the ti l b t i ioptimal beamsteering size

– Optimal beamsteering size  1 /

45

Two‐link Scenario

• Two uplink channels between two MCs andTwo uplink channels between two MCs and two BSs

) d11/d12 5 C1/C2 5

BS1 BS24000

5000

ptio

n (m

W) d11/d12=5, C1/C2=5

N=1N=2N=3

d11 d22

d12

BS1 BS2

2000

3000

r Con

sum

p

N=4OptBeamSteering

d21MC1 MC2

1000

2000

wor

k P

owe

2 4 6 8 10 120Network Capacity (Bit/s/Hz)

Net

w

46

Two‐link Scenario (Contd.)( )

• Key conclusionsy– Capacity is determined by inter‐link interference– Beamsteering can be more power efficient thanBeamsteering can be more power efficient than omni‐directional transmission

– With the same power beamsteering achievesWith the same power, beamsteering achieves higher network capacity

– Beamsteering can achieve network unattainableBeamsteering can achieve network unattainable by omni‐directional transmission

– The optimal beamsteering sizes need to be jointlyThe optimal beamsteering sizes need to be jointly decided

47

BeamAdaptp

• Optimal use of beamsteering on mobileOptimal use of beamsteering on mobile devices

O ti l t d ff b t ffi i d– Optimal tradeoff between power efficiency and network capacity

• Theoretical formulation• System designSystem design• Evaluation

48

Theoretical Formulation

• Minimum power consumption to achieveMinimum power consumption to achieve certain network capacity

Minimize

, ,

s ts.t. , , 1 , , ∀1

49

Difficult to Solve

• No closed‐form formulation of theNo closed form formulation of the beamsteering gainN• Non‐convex

• Integer constraint of Ng– NP‐hard mixed‐integer‐programming (MIP) problemproblem

• High complexity of brute‐force search

– ,

50

,1

WINDOW TO THE CLOUD51

Mobile browsers are slow

17 secNexus One (N1) HTC Dream (G1)

13 sec12 sec

8 sec

52

Top 10 Mobile Website Top10 Non‐mobile Webpage

• What does the browser show?

• How does the browser work?How does the browser work?

• Where is the bottleneck?

53

What does the browser show?

100%

Non‐mobile webpages Mobile webpages

80%

40%

60%avg

20%

40%

0%

1 251 2525 iPhone 3GS Users

54

How does the browser work?

New resources to loadNew resources to loadInternal Representation (IR)

HTMLParsing

Loadedreso rces

Update

Layout

PaintingStyle 

Formatting

resources IRGraphics

PaintingResourceLoading ......

Scripting

IR operations: Parsing Style Scripting Layout Painting55

IR operations: Parsing, Style, Scripting, Layout, Painting

Where is the bottleneck?

• Existing work on PC browsersExisting work on PC browsers– LayoutSt l f tti– Style formatting

– Scripting

56

1. C. Stockwell, "IE8 Performance," http://blogs.msdn.com/b/ie/archive/2008/08/26/ie8-performance.aspx, 2008.2. L. A. Meyerovich and R. Bodik, "Fast and parallel webpage layout," in Proc. Int. Conf. World Wide Web (WWW) Raleigh, North Carolina, USA: ACM, 2010. 3. K. Zhang, L. Wang, A. Pan, and B. B. Zhu, "Smart caching for web browsers," in Proc. Int. Conf. World Wide Web (WWW) Raleigh, North Carolina, USA: ACM, 2010.

Is it true for mobile browsers?

Layout, Style, Scripting

57

Performance characterization

• Metric: browser delay– Starting point: when the user presses the “GO” button of the browser to open an URL. 

– End point: when the browser’s page loading progress bar indicates 100%indicates 100%. 

58

Performance characterization

• Dependency timeline characterizationp y

• What‐if analysis

59

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

60

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4request

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

61

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4

redirection

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

n

62

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4

Data packets

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

p

63

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

Thi i t f l di

4Res

our

5

6 This is part of resource loading.It is mainly spent on network.

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

64

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

Glue operation4R

esou

r

5

6

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

65

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4 intra-group dependency

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

66

Elapsed time (ms)4000

Dependency timeline characterizationup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

i t

2

3

4 inter-group dependency

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

67

Elapsed time (ms)4000

What overall performance gain will be achieved if a browser operation is accelerated?

68

What‐if analysisup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6 Shrink

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

69

Elapsed time (ms)4000

What‐if analysisup

ScriptingLayout

Resource Loading

PaintingParsing

Style Shift to the left

rce

Gro

7

8

4Res

our

5

6 Shrink

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

70

Elapsed time (ms)4000

What‐if analysisup

ScriptingLayout

Resource Loading

PaintingParsing

Style Shift to the left

rce

Gro

7

8

4Res

our

5

6 Shrink

2

3

4

They will not move

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

71

Elapsed time (ms)4000

What‐if analysisup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

72

Elapsed time (ms)4000

What‐if analysis

Original

up

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

4Res

our

5

6

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

73

Elapsed time (ms)4000

What‐if analysis

Originl

Ne

ScriptingLayout

Resource Loading Paintin

g

ParsingStyle

alw

rce 7

8g

4Res

our

Gro

up

5

6

2

3

4

Elapsed time

1

0 1000 1500 2000 2500 3000 3500 4000

74

Elapsed time (ms)

4000

Experimental setup

Pl tf• Platform: – HTC Dream (G1): 528MHzNexus One (N1) 1GHz– Nexus One (N1): 1GHz

• Operating System• Operating System– Android 2.1 (Eclair)

• Benchmark Webpages:– Top 10 mobile websites– Top 10 mobile websites– Top 10 visited non‐mobile webpages from LiveLab

751.Nielsen.com, "Top mobiel phones, sites and brands for 2009," http://blog.nielsen.com/nielsenwire/online_mobile/top-mobile-phones-sites-and-brands-for-2009/, 2009. 2.C. Shepard, A. Rahmati, C. Tossell, L. Zhong, and P. Kortum, "Live- Lab: Measuring Wireless Networks and Smartphone Users in the Field," in Proc. Workshop on Hot Topics in Measurement & Model-ing of Computer Systems, June 2010.

Experimental setup

W d h k• We used three network conditions:– Emulated enterprise Ethernet– Emulated enterprise Ethernet 

(no traffic control)

– Typical 3G network (T‐mobile)

– Emulated adverse network• First‐hop RTT: 400msp• Bandwidth (downlink/uplink): 500Kbps/100Kbps

76

Logging informationgg g

• Time stamp for browser operations– Overhead: <1%

• Tcpdump( ) ( )– Overhead: <2% (CPU); <0.4% (MEM)

77

Results

two take‐away messagestwo take away messages

78

IR operations do not matter much!

Parsing Style Scripting Layout PaintingParsing, Style, Scripting, Layout, Painting

79

IR operations do not matter much

4%Layout

4%Style

2%

4%

provem

ent

Non‐mobile WebMobile Web

2%

4%

mprovem

ent

Non‐mobile WebMobile Web

0%

0 8 16 24 32

Overall Im

0%Overall Im

0 8 16 24 32

Layout Calculating Speedup0 8 16 24 32Style Formatting Speedup

4%t Script

2%

4%

mprovem

ent

0%Overall Im

Non‐mobile WebMobile Web

0 8 16 24 32Scripting Speedup

80

IR operations do not matter much

4%Layout

4%Style

2%

4%

provem

ent

Non‐mobile WebMobile Web

2%

4%

mprovem

ent

Non‐mobile WebMobile Web

0%

0 8 16 24 32

Overall Im

0%Overall Im

0 8 16 24 32

Layout Calculating Speedup0 8 16 24 32Style Formatting Speedup

4%t Script

2%

4%

mprovem

ent

0%Overall Im

Non‐mobile WebMobile Web

0 8 16 24 32Scripting Speedup

81

IR operations do not matter much

8%

Combined: Parsing, Layout, Style, Scripting, Painting, Glue

6%

vemen

t

4%

all Improv

2%

Overa Non‐mobile Web

Mobile Web

0%

0 8 16 24 32

Speedupp p

82

Resource loading is the bottleneck!

83

Resource loading is the bottleneck

Resource Loading

400%

600%

vemen

t

200%

400%

l Improv

Non‐mobile Web

0%Overal Non mobile Web

Mobile Web

0 8 16 24 32

Resource Loading Speedup

84

Resource loading is the bottleneck

Resource Loading

400%

600%

vemen

t

200%

400%

l Improv

Non‐mobile Web

0%Overal Non mobile Web

Mobile Web

0 8 16 24 32

Resource Loading Speedup

85

Resource loading is the bottleneck

• Network RTT

• Network Bandwidth

• Browser loading procedure• Browser loading procedure

• Processing power

86

Network RTT matters

G1 ‐ Non‐mobile Web N1 ‐ Non‐mobile Web

30G1 ‐Mobile Web N1 ‐Mobile Web

20ay (sec)

wser D

ela

10

Brow

3G0

0 200 400

3G

Injected  RTT (ms)

87

Network bandwidth doesn’t matter

G1 ‐ Non‐mobile Web N1 ‐ Non‐mobile Web

30G1 ‐Mobile Web N1 ‐Mobile Web

20ay (sec)

10wser D

ela

10

Brow

3G

0250/50 500/100 1000/200 1500/400

Bandwidth: Downlink/Uplink (Kbps)88

Browser loading procedureup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8New resources can only be discovered after 

4Res

our

5

6 parsing a loaded resource

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

89

Elapsed time (ms)4000

Browser loading procedureup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8Redirections on the main HTML file further 

4Res

our

5

6 delay the discovering time of later resources

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

90

Elapsed time (ms)4000

Browser loading procedureup

ScriptingLayout

Resource Loading

PaintingParsing

Style

rce

Gro

7

8

Parsing of the HTML file is blocked by JavaScripts

4Res

our

5

6 blocked by JavaScripts

2

3

4

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 3500 4000

91

Elapsed time (ms)4000

Browser loading procedure

• Up to 25 concurrent requests for topUp to 25 concurrent requests for top mobile/non‐mobile webpages

• Constrains on concurrent TCP connections

Mobile Browser Connections/hostname Maximum connectionsMobile Browser Connections/hostname Maximum connections

Android 4 4

iPhone 4.3 6 35

Blackberry 9700 4 16

Opera Mobile 4 4

i i

92

Opera Mini 10 60

http://www.browserscope.org/

Average number of resources

96

2222

Top 10 Mobile Website Top10 Non‐mobile Webpage

93

Average number of network round trips

2727

19

Top 10 Mobile Website Top10 Non‐mobile Webpage

94

Processing power in resource loading

ScriptingResource Loading Parsing

8

9

ScriptingLayout

g

PaintingParsing

Style

urce

p6

7

8

4Res

ouG

roup

5

This is part of resource loading

2

3This is part of resource loading.It is mainly spent on network.

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 Elapsed time (ms)

http://mail.yahoo.com 95

Processing power in resource loading

ScriptingResource Loading Parsingou

p (i)8

9

ScriptingLayout

g

PaintingParsing

Style

urce

Gro

6

7

8

4Res

ou

5 DNS lookup query

2

3

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 Elapsed time (ms)

http://mail.yahoo.com 96

Processing power in resource loading

ScriptingResource Loading Parsingou

p (i) (ii)8

9

ScriptingLayout

g

PaintingParsing

Style

urce

Gro

6

7

8 TCP connection established HTTP GET

sent out

4Res

ou

5 DNS lookup query

2

3

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 Elapsed time (ms)

http://mail.yahoo.com 97

Processing power in resource loading

ScriptingResource Loading Parsingou

p (i) (ii) (iii)8

9

ScriptingLayout

g

PaintingParsing

Style

urce

Gro

6

7

8 TCP connection established HTTP GET

sent out

4Res

ou

5 DNS lookup query

2

3

Elapsed time (ms)

1

0 1000 1500 2000 2500 3000 Elapsed time (ms)

http://mail.yahoo.com 98

Time spent by G1 and N1 for those three cases

126ms

98ms

G1 N1

98ms

40ms

13ms3

18ms

3ms

(i) (ii) (iii)

99

Total time spent in the three cases on average when opening a mobile webpagewhen opening a mobile webpage

2 sec

1 sec

G1 N1

100

Processing power in resource loading

• Other uncategorized processingOther uncategorized processing

Th OS th d t f t k t k t– The OS moves the data from network stack to browser after receiving data packets

– Computation for secure connection (HTTPS)p

101

More powerful hardware improves the browser delay mainly through faster OS services and network stack instead f f t IR tiof faster IR operations.

102

Performance characterization results

• IR operations do not matter muchIR operations do not matter much

• Resource loading is the bottleneck– Network RTT (X)Network RTT (X)– Network bandwidthB l di d (X)– Browser loading procedure (X)

– Processing power (X)

103

How to improve mobile browser’s performance?

Reduce RTT Reduce # of Round Trips• Cloudlet• Data staging

• Web Pre‐fetching• Resource batchingg g g• Data URI scheme• Speculative resource• Speculative resource loading

1.M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, "The Case for VM-Based Cloudlets in Mobile Computing," IEEE Pervasive Computing, vol. 8, pp. 14-23, 2009. 2.J. Flinn, S. Sinnamohideen, N. Tolia, and M. Satyanaryanan, "Data Staging on Untrusted Surrogates," in Proceedings of the 2nd USENIX Conference on File and Storage Technologies San Francisco, CA: USENIX Association, 2003.

104

on File and Storage Technologies San Francisco, CA: USENIX Association, 2003. 3.V. N. Padmanabhan and J. C. Mogul, "Using predictive prefetching to improve World Wide Web latency," SIGCOMM Comput. Commun. Rev., vol. 26, pp. 22-36, 1996. 4.Skyfire: http://www.skyfire.com/. 5.L. Masinter, "The "data" URL scheme," http://tools.ietf.org/html/rfc2397, 1998.

On‐going workg g

• Speculative mobile browser design

• Fully understand the impact of hardwareFully understand the impact of hardware

• OS and network service acceleration

http://www.owlnet.rice.edu/~zw3/projects Tempo.html

105

p // / /p j _ p

Today’s smartphoney p

Application processorprocessor

106

rackspacep

Heterogeneous multiprocessorg p

Application processor

µ‐controller

Turducken like systems

108

Turducken-like systems

Heterogeneous body‐area networkg y

109

Smartphone 2020p

µ‐controllerCloud processor

Cloud CloudCl d

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

processorµ‐controllerprocessorprocessor

µ‐controller

110

Challenges to programmingg p g g

µ‐controllerCloud processor

Cloud CloudCl d

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

processorµ‐controllerprocessorprocessor

µ‐controller

• Resource disparity– ISA disparityp y

111

Challenges to programmingg p g g

µ‐controllerCloud processor

Cloud CloudCl d

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

processorµ‐controllerprocessorprocessor

µ‐controller

• Resource limitation on “small” processors– Virtual machine and coherent memory difficulty

112

Challenges to programmingg p g g

µ‐controllerCloud processor

Cloud CloudCl d

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

processorµ‐controllerprocessorprocessor

µ‐controller

• Separation of hardware vendors, application developers, and users– Developer blind of external computing resources and runtime context

113

Challenges to programmingg p g g

µ‐controllerCloud processor

Cloud CloudCl d

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

processorµ‐controllerprocessorprocessor

µ‐controller

• Established programming model and OS

114

Existing solutionsExisting solutionsCPU+GPU systems

mPlatform etc.

Single ISA

Virtual machine

Turducken-like

Offloading systems (active disk, Hydra etc.)

Single ISA Turducken-like cohort systems

Complete transparency No transparency

Prohibitively expensive High burden on application developerspp p

115

Reflex: Transparent programming ofReflex: Transparent programming of heterogeneous mobile systems

htt // fl i d /http://reflex.recg.rice.edu/

Inspired by the heterogeneous distributed nervous system

Enough transparencyg p yCPU+GPU systems

mPlatform etc.

ReflexSingle ISA Turducken-like

Offloading systems (active disk, Hydra etc.)Virtual machine

Single ISA Turducken-like cohort systems

Complete transparency No transparency

• Ease of programmingff• Execution efficiency

117

Key ideasy

µ‐controllerCloud processor

Cloud CloudCl d

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

processorµ‐controllerprocessorprocessor

µ‐controller

• Light weight virtualization of sensor data acquisition, timer, and memory managementq , , y g

118

Key ideasy

µ‐controllerCloud processor

Cloud CloudCl dReflex runtime

Reflex runtime

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

Reflex runtimeReflex runtime

processorµ‐controllerprocessorprocessorReflex 

µ‐controllerruntime

• Distributed runtime for transparent message passingp g

119

Key ideasy

µ‐controllerCloud processor

Cloud CloudCl dReflex runtime

Reflex runtime

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

Reflex runtimeReflex runtime

processorµ‐controllerprocessorprocessorReflex 

µ‐controllerruntime

• Automatic code partition through a collaboration between runtime and compilerp

120

Key ideasy

µ‐controllerCloud processor

Cloud CloudCl dReflex runtime

Reflex runtime

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

Reflex runtimeReflex runtime

processorµ‐controllerprocessorprocessorReflex 

µ‐controllerruntime

• Identify a small coherent memory segment – Maintain by message passing through the runtimey g p g g

121

Key ideasy

µ‐controllerCloud processor

Cloud CloudCl dReflex runtime

Reflex runtime

Application µ controller

processorprocessorCloud 

processorCloud 

processorCloud 

processorCloud 

processorCloud 

Reflex runtimeReflex runtime

processorµ‐controllerprocessorprocessorReflex 

µ‐controllerruntime

• Type safety for dynamic process migration

122

Reflex Prototype (board integration)f yp g

• Programmable accelerometer (TI MSP430)Programmable accelerometer (TI MSP430)• Wired sensor through UART port

Rice Orbit S

Nokia N810

Sensor

123Serial connection

Fall detection with N810

Average PowerAverage Power 100mW

20 W20mW

Legacy Reflex

Th t d t f ll ftThe secret: we do not fall very often

124

Coded as part of Smartphone programclass SenseletFall : public SenseletBase {public:

SenseletFall () { _avg_energy = 0; };

void OnCreate() { RegisterSensorData(ACCEL, 50); };

void OnData(uint8 t *readings, uint16 t len) {void OnData(uint8_t readings, uint16_t len) {uint16_t energy = readings[0]*readings[0] + \

readings[1]*readings[1] + \readings[2]*readings[2];

//d i l l fil i//do a simple low-pass filtering_avg_energy = _avg_energy / 2 + energy / 2;

// detect fall accident with the filtered energy// detect a acc de t t t e te ed e e gyif (_avg_energy > THRESHOLD) {

theMainBody.FallAlert(); //RMI}

}}

void OnDestroy() { UnRegisterSensorData(ACCEL); };

125

private:uint16_t _avg_energy;

};

Thanks!h //http://www.recg.org