(SDD419) Amazon EC2 Networking Deep Dive and Best Practices | AWS re:Invent 2014

Post on 24-Jun-2015


Amazon EC2 instances give customers a variety of high-bandwidth networking choices. In this session, we discuss how to choose among Amazon EC2 networking technologies and examine how to get the best performance out of Amazon EC2 enhanced networking and cluster networking. We also share best practices and useful tips for success.


November 12, 2014 | Las Vegas, NV

Becky Weiss, Principal Software Engineer, Amazon EC2 Networking

[Figure: elastic network interfaces. Subnet A (us-east-1a, 10.0.1.0/24), Subnet A2 (us-east-1a, 10.0.2.0/24), Subnet C (us-east-1c, 10.0.3.0/24). Interface addresses: 10.0.1.100, 10.0.1.101, 10.0.2.50, 10.0.2.51, 10.0.3.99. Instances 1 to 4.]


Placement group

Subnet A is in us-east-1a
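The subnet layout on the slide can be checked mechanically. The following is a minimal Ruby sketch that maps each listed address to the subnet whose CIDR block contains it, using only the standard `ipaddr` library. The names `SUBNETS` and `subnet_for` are illustrative and do not come from the talk, and the sketch makes no claim about which instance owns which address.

```ruby
require 'ipaddr'

# Subnets as listed on the slide: name => [availability zone, CIDR block].
SUBNETS = {
  'Subnet A'  => ['us-east-1a', '10.0.1.0/24'],
  'Subnet A2' => ['us-east-1a', '10.0.2.0/24'],
  'Subnet C'  => ['us-east-1c', '10.0.3.0/24']
}.freeze

# Returns the name of the listed subnet whose CIDR block contains the
# address, or nil when no listed subnet contains it.
def subnet_for(address)
  ip = IPAddr.new(address)
  match = SUBNETS.find { |_name, (_az, cidr)| IPAddr.new(cidr).include?(ip) }
  match && match.first
end

%w[10.0.1.100 10.0.1.101 10.0.2.50 10.0.2.51 10.0.3.99].each do |addr|
  name = subnet_for(addr)
  az = SUBNETS[name].first
  puts "#{addr} -> #{name} (#{az})"
end
```

Two subnets sharing an Availability Zone (here Subnet A and Subnet A2) is what makes the zone, not the subnet, the relevant boundary when thinking about placement.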

C:> aws ec2 run-instances --image-id ami-b66ed3de --instance-type c3.8xlarge --subnet-id subnet-c03cfb99 --security-group-ids sg-72caf017 --key-name NetworkingTestSSHKey --count 2
---------------------------------------------------------------------------------
| RunInstances |
+----------------------------------------+--------------------------------------+
| OwnerId | 123456789012 |
| ReservationId | r-9f5404b5 |
+----------------------------------------+--------------------------------------+
| Instances |
|+-----------------------------------+-----------------------------------------+|
|| AmiLaunchIndex | 0 ||
|| Architecture | x86_64 ||
|| ClientToken | None ||
|| EbsOptimized | False ||
|| Hypervisor | xen ||
|| ImageId | ami-b66ed3de ||


AMI: more about this choice later…


Big instance type: c3.8xlarge

Avg: 0.167 msec

Placement group NetworkingTestPlacementGroup: state available, strategy cluster

C:> aws ec2 run-instances --image-id ami-b66ed3de --instance-type c3.8xlarge --subnet-id subnet-c03cfb99 --security-group-ids sg-72caf017 --key-name NetworkingTestSSHKey --count 2 --placement GroupName=NetworkingTestPlacementGroup
---------------------------------------------------------------------------------
| RunInstances |
+----------------------------------------+--------------------------------------+
| OwnerId | 123456789012 |
| ReservationId | r-13374839 |
+----------------------------------------+--------------------------------------+
| Instances |
|+-----------------------------------+-----------------------------------------+|
|| AmiLaunchIndex | 0 ||
|| Architecture | x86_64 ||
|| ClientToken | None ||
|| EbsOptimized | False ||
|| Hypervisor | xen ||
|| ImageId | ami-b66ed3de ||

Avg: 0.099 msec
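To put the two averages side by side, here is a small Ruby sketch of the arithmetic. It assumes the 0.167 msec figure belongs to the launch without a placement group and the 0.099 msec figure to the launch with `NetworkingTestPlacementGroup`, which is how the slides are ordered. The constant and method names are illustrative.

```ruby
# Average latencies reported on the slides, in milliseconds.
# Assumption: the first is without a placement group, the second with one.
AVG_WITHOUT_PLACEMENT_GROUP_MS = 0.167
AVG_WITH_PLACEMENT_GROUP_MS = 0.099

# Fractional reduction of `after` relative to `before`.
def relative_reduction(before, after)
  (before - after) / before
end

reduction = relative_reduction(AVG_WITHOUT_PLACEMENT_GROUP_MS,
                               AVG_WITH_PLACEMENT_GROUP_MS)
puts format('Latency reduction: %.1f%%', reduction * 100)
```

Under that assumption the placement group cuts the measured average by roughly 40 percent.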

[Figure: Instance 1 and Instance 2.]

[Figure: virtualization layer; instance virtual NICs (eth0, eth1); physical NIC.]

[Figure: virtualization layer; instance with eth0, and eth1 using the VF driver; physical NIC with VF.]

[ec2-user@ip-10-0-3-70 ~]$ ethtool -i eth0
driver: vif
version:
firmware-version:
bus-info: vif-0

[ec2-user@ip-10-0-3-70 ~]$ ethtool -i eth0
driver: ixgbevf
version: 2.14.2+amzn
firmware-version: N/A
bus-info: 0000:00:03.0
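The difference between the two `ethtool -i eth0` outputs is the `driver` field: `vif` for the standard virtual interface and `ixgbevf` for the VF driver. A minimal Ruby sketch of that check follows. The method names are illustrative, and the sample strings are the two outputs from the slides.

```ruby
# Parses `ethtool -i` output ("key: value" lines) into a Hash.
def parse_ethtool_info(output)
  output.each_line.with_object({}) do |line, info|
    key, value = line.split(':', 2)
    next if value.nil?            # skip lines without a colon
    info[key.strip] = value.strip
  end
end

# True when the interface reports the ixgbevf VF driver shown on the slide.
def enhanced_networking_driver?(output)
  parse_ethtool_info(output)['driver'] == 'ixgbevf'
end

BEFORE = <<~OUT
  driver: vif
  version:
  firmware-version:
  bus-info: vif-0
OUT

AFTER = <<~OUT
  driver: ixgbevf
  version: 2.14.2+amzn
  firmware-version: N/A
  bus-info: 0000:00:03.0
OUT

puts enhanced_networking_driver?(BEFORE)
puts enhanced_networking_driver?(AFTER)
```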

amzn-ami-hvm-2012.03.1.x86_64-ebs
hvm
--attribute sriovNetSupport
InstanceId i-37c5d1d9
Not yet!
OS update: [ec2-user@ip-10-0-3-125 ~]$ sudo yum update
Reboot: reboot-instances
(Not shown here: analogous steps for other Linux distros)
Add to Windows driver store
Stop the instance: stop-instances
Enable SR-IOV: --sriov-net-support simple
Cannot be undone
Start: start-instances
--attribute sriovNetSupport
InstanceId i-37c5d1d9
Value simple
We’re on
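The slides preserve only the highlighted fragments of each command in this sequence. As a hedged sketch, the Ruby below assembles the full AWS CLI invocations those fragments appear to come from, in the order the slides present them. The complete argument lists are an assumption based on the AWS CLI's `describe-instance-attribute`, `reboot-instances`, `stop-instances`, `modify-instance-attribute`, and `start-instances` subcommands, not a copy of the speaker's commands. Nothing is executed; the method only returns argument lists.

```ruby
# Builds the ordered AWS CLI invocations for enabling enhanced networking
# on one instance. Each entry is an argument list; nothing is run here.
# The in-guest OS update (`sudo yum update`) happens on the instance itself
# before the reboot, so it is not part of this list.
def enhanced_networking_commands(instance_id)
  [
    # 1. Check the current attribute value (the slide shows it unset).
    %W[aws ec2 describe-instance-attribute --instance-id #{instance_id} --attribute sriovNetSupport],
    # 2. Reboot after the OS update.
    %W[aws ec2 reboot-instances --instance-ids #{instance_id}],
    # 3. The instance is stopped before the attribute is changed.
    %W[aws ec2 stop-instances --instance-ids #{instance_id}],
    # 4. Enable SR-IOV; the slide notes this cannot be undone.
    %W[aws ec2 modify-instance-attribute --instance-id #{instance_id} --sriov-net-support simple],
    # 5. Start the instance again.
    %W[aws ec2 start-instances --instance-ids #{instance_id}],
    # 6. Verify: the slide shows "Value simple".
    %W[aws ec2 describe-instance-attribute --instance-id #{instance_id} --attribute sriovNetSupport]
  ]
end

enhanced_networking_commands('i-37c5d1d9').each { |argv| puts argv.join(' ') }
```

Returning argument lists rather than shell strings keeps the sketch inspectable and avoids quoting issues if it is later passed to `system(*argv)`.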

modinfo ixgbevf

aws ec2 register-image --name MyEnhancedNetworkingImage --image-location … --sriov-net-support simple

i2.8xlarge: storage-optimized instance

require 'mongo'
… 'randomdb' …
until Time … SECONDS_TO_RUN
  … KEY_MAX …
  … :key …
  … Time …
  if …
    … :times_accessed … :key …
  else
    … :key … :value … :times_accessed …
  end
  … Time …
end

Spin in tight loop: read a random document, then write it back
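Only fragments of the client test script survive in the slides, so the following is a minimal Ruby sketch of the loop they describe, not the speaker's code. It swaps the MongoDB collection for a plain in-memory Hash so that it runs without a database. The values of `SECONDS_TO_RUN` and `KEY_MAX`, the starting value of `:times_accessed`, and the method names are all assumptions.

```ruby
# Assumed values; the slide's actual settings are not recoverable.
SECONDS_TO_RUN = 0.05
KEY_MAX = 1000

# One iteration: read a random document, then write it back.
# `collection` is a Hash standing in for the MongoDB collection.
# Returns the time spent on the write, in seconds.
def read_then_write(collection, rng)
  key = rng.rand(KEY_MAX)
  doc = collection[key]                 # read a random document
  started = Time.now
  if doc
    doc[:times_accessed] += 1           # existing document: bump its counter
    collection[key] = doc
  else
    collection[key] = { :key => key, :value => rng.rand, :times_accessed => 1 }
  end
  Time.now - started
end

# Spins in a tight loop until the deadline passes; returns the iteration count.
def run_loop(collection, seconds, rng = Random.new)
  deadline = Time.now + seconds
  iterations = 0
  until Time.now >= deadline
    read_then_write(collection, rng)
    iterations += 1
  end
  iterations
end

store = {}
puts "iterations: #{run_loop(store, SECONDS_TO_RUN)}"
```

Because every iteration issues one read and one write with no pause, the achievable rate is bounded by round-trip latency, which is why a loop of this shape is sensitive to the networking changes discussed in the talk.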

def add_write_statistic …
  … :sample_count …
  … :sum …
  … :minimum … :minimum …
  … :maximum … :maximum …
end

Aggregating statistics for CloudWatch
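The four fields named on the slide (`:sample_count`, `:sum`, `:minimum`, `:maximum`) are the members of a CloudWatch statistic set, which lets a client report many observations in a single data point. Below is a minimal Ruby sketch of such an aggregator. The method signature and the helper `new_write_stats` are assumptions, since the body of the original `add_write_statistic` is not recoverable.

```ruby
# Returns an empty statistic set with the four fields named on the slide.
def new_write_stats
  { :sample_count => 0, :sum => 0.0, :minimum => nil, :maximum => nil }
end

# Folds one write duration (in seconds) into the running statistic set.
def add_write_statistic(stats, seconds)
  stats[:sample_count] += 1
  stats[:sum] += seconds
  stats[:minimum] = stats[:minimum].nil? ? seconds : [stats[:minimum], seconds].min
  stats[:maximum] = stats[:maximum].nil? ? seconds : [stats[:maximum], seconds].max
  stats
end

stats = new_write_stats
[0.002, 0.001, 0.004].each { |t| add_write_statistic(stats, t) }
puts stats.inspect
```

Aggregating locally and flushing periodically keeps the number of CloudWatch requests independent of the write rate.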

require 'aws-sdk'
… AWS CloudWatch Client …
if Time …
  …
    :namespace => 'NetworkingTest/MongoDemo',
    :metric_data => [{:metric_name => 'WriteTime',
                      :dimensions => [{:name => 'RunId', :value => MY_RUN_ID}],
                      :statistic_values => write_stats,
                      :unit => 'Seconds'}]
  … Time …
  … :sample_count … :sum …
end

CloudWatch PutMetricData: writing a custom metric
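As a sketch of the request the slide is building, the Ruby below assembles the PutMetricData parameters as a plain Hash, with no SDK dependency, so the shape can be inspected or tested offline. The namespace, metric name, dimension name, and unit come from the slide. The method name `write_time_request` and the sample run ID are hypothetical, and placing `:unit` inside the metric datum follows the CloudWatch PutMetricData data model rather than anything visible on the slide.

```ruby
# Builds the parameters for one PutMetricData call as a plain Hash.
# `write_stats` is a statistic set with :sample_count, :sum, :minimum, :maximum.
def write_time_request(run_id, write_stats)
  {
    :namespace => 'NetworkingTest/MongoDemo',
    :metric_data => [{
      :metric_name => 'WriteTime',
      :dimensions => [{ :name => 'RunId', :value => run_id }],
      :statistic_values => write_stats,
      :unit => 'Seconds'            # the unit belongs to the datum
    }]
  }
end

stats = { :sample_count => 3, :sum => 0.007, :minimum => 0.001, :maximum => 0.004 }
puts write_time_request('example-run-1', stats).inspect
```

A Hash of this shape is what would be handed to the SDK client's `put_metric_data` call; the `RunId` dimension keeps separate test runs from being mixed into one series.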

# ec2-run-instances ami-b66ed3de --instance-type c3.large --subnet subnet-c03cfb99 --group sg-72caf017 --placement-group NetworkingTestPlacementGroup --monitor --user-data-file my_startup_script.sh --iam-profile NetworkingTestIAMRole --instance-count 10
RESERVATION r-d13d6f37 123456789012
INSTANCE i-fb6d5352 ami-b66ed3de ip-10-0-1-113.ec2.internal pending NetworkingTestSSHKey 0 c3.large 2014-10-30T13:26:33+0000 us-east-1a monitoring-pending 10.0.1.113 vpc-ca28afaf subnet-c03cfb99 ebs NetworkingTestPlacementGroup hvm xen sg-72caf017 default false arn:aws:iam::123456789012:instance-profile/NetworkingTestIAMRole
NIC eni-b560caed subnet-c03cfb99 vpc-ca28afaf 123456789012 in-use 10.0.1.113 true
NICATTACHMENT eni-attach-fb6ddf9d 0 attaching 2014-10-30T06:26:33-0800 true
GROUP sg-72caf017 default
...


CloudWatch detailed monitoring: 1-minute metrics


Startup script file

# cat startup_script.sh

Download client test script from S3

Then gogogo!


Security best practice: launch instances with IAM roles if they need to access any AWS resources.

# aws iam list-role-policies --role-name NetworkingTestIAMRole
{
    "PolicyNames": [
        "NetworkingTestIAMRole-CloudWatchPolicy",
        "NetworkingTestIAMRole-S3Policy"
    ]
}

# aws iam get-role-policy --role-name NetworkingTestIAMRole --policy-name NetworkingTestIAMRole-S3Policy

Allow retrieving objects from a particular S3 bucket

# aws iam get-role-policy --role-name NetworkingTestIAMRole --policy-name NetworkingTestIAMRole-CloudWatchPolicy

Allow CloudWatch PutMetricData
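The policy documents themselves are not shown on the slides, only their intent. As an illustration, this Ruby sketch generates two minimal IAM policy documents matching those descriptions. The bucket name is hypothetical, the method names are illustrative, and the real policies attached to `NetworkingTestIAMRole` may differ.

```ruby
require 'json'

# Hypothetical bucket name; the slides do not show the real one.
BUCKET = 'example-networking-test-bucket'

# Allows retrieving objects from one particular S3 bucket.
def s3_read_policy(bucket)
  {
    'Version' => '2012-10-17',
    'Statement' => [{
      'Effect' => 'Allow',
      'Action' => 's3:GetObject',
      'Resource' => "arn:aws:s3:::#{bucket}/*"
    }]
  }
end

# Allows publishing custom metrics with CloudWatch PutMetricData.
def cloudwatch_put_policy
  {
    'Version' => '2012-10-17',
    'Statement' => [{
      'Effect' => 'Allow',
      'Action' => 'cloudwatch:PutMetricData',
      'Resource' => '*'
    }]
  }
end

puts JSON.pretty_generate(s3_read_policy(BUCKET))
puts JSON.pretty_generate(cloudwatch_put_policy)
```

Keeping each policy to a single action mirrors the least-privilege intent of the slide: the instances can fetch the test script and publish metrics, and nothing else.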

Label: WriteTime
SampleCount  Timestamp             Unit
389483.0     2014-10-29T02:30:00Z  Seconds
390189.0     2014-10-29T02:33:00Z  Seconds
392373.0     2014-10-29T02:34:00Z  Seconds
392387.0     2014-10-29T02:32:00Z  Seconds
377256.0     2014-10-29T02:31:00Z  Seconds

SampleCount statistic: how many of these WriteTime statistics were written across all instances during each minute?
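Since SampleCount counts the writes recorded in each one-minute period, dividing by 60 gives an approximate per-second rate. The Ruby sketch below does that for the five data points listed in the slides. It assumes each sample corresponds to one write and that every period is exactly 60 seconds; the names `DATAPOINTS` and `per_second` are illustrative.

```ruby
# SampleCount data points from the slide: [value, timestamp], one per minute.
DATAPOINTS = [
  [389483.0, '2014-10-29T02:30:00Z'],
  [390189.0, '2014-10-29T02:33:00Z'],
  [392373.0, '2014-10-29T02:34:00Z'],
  [392387.0, '2014-10-29T02:32:00Z'],
  [377256.0, '2014-10-29T02:31:00Z']
].freeze

# Converts a per-period sample count into an average per-second rate.
def per_second(sample_count, period_seconds = 60)
  sample_count / period_seconds
end

# ISO 8601 timestamps in the same zone sort correctly as strings.
DATAPOINTS.sort_by { |_count, ts| ts }.each do |count, ts|
  puts format('%s  %.0f writes/s', ts, per_second(count))
end
```

On these figures the cluster sustains on the order of 6,300 to 6,500 writes per second across all client instances.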

[Chart: “WriteTime” SampleCount statistic by number of client instances (1 to 19). Series: TPS, regular; TPS, enhanced.]

[Chart: DiskWriteBytes 1-minute Sum statistic by number of client instances (1 to 20). Series: Regular; Enhanced.]

[Figure: summary diagram labeled Placement group, Instance, Virtualization layer, VF driver.]

http://bit.ly/awsevals