Enterprise Monitoring System

AWS and Azure On-Boarding

Page 2 of 13, Issue No: 1 Issue Date: 22/02/2017: CLASSIFIED: CONFIDENTIAL

LogicMonitor – AWS and Azure On-Boarding

Contents

1. AWS Monitoring
1.1 About AWS Monitoring
1.2 AWS Permissions
1.2.1 Read Only Access:
1.2.2 Custom Permissions
1.3 Adding your AWS Account into LogicMonitor
1.4 CloudWatch Costs
2. Azure Monitoring
2.1 Granting LogicMonitor Access to the Azure Insights API
3. Web Service Checks
Appendix A – Collector Server Specifications
Appendix B - Azure DataSources
Azure VMs

1. AWS Monitoring

1.1 About AWS Monitoring

LogicMonitor collects data for your AWS resources in two ways:

1. AWS CloudWatch metrics are collected

2. The AWS SDK is used to collect metrics that are not exposed in CloudWatch

If you're monitoring AWS resources for which only CloudWatch data is being collected, you don't need a local LogicMonitor collector. A LogicMonitor hosted collector is used to collect CloudWatch data for your resources via the CloudWatch API.

If you intend to monitor AWS resources for which LogicMonitor will query the AWS SDK, or if you want to use custom script datasources to monitor your AWS resources, you will need a collector to perform these queries. This collector can be in your AWS environment or in your local environment. For certain AWS resources, LogicMonitor native datasources via a local collector provide more detailed monitoring than the AWS metrics collected; in these situations it would be beneficial to have your collector in your AWS environment (on an EC2 instance). Examples of such resources include EC2 instances running Apache and RDS instances. Appendix A outlines the Collector server specifications if one is required.

LogicMonitor has AWS Datasources to monitor the following AWS services:

• API Gateway

• Application ELB

• AutoScaling

• Billing

• CloudFront

• DynamoDB

• EBS

• EC2

• ECS

• EFS

• ElastiCache

• Elasticsearch

• ELB

• Kinesis

• Lambda

• Redshift

• RDS

• Route53

• S3

• SNS

• SQS

• SWF

1.2 AWS Permissions

In order for LogicMonitor to monitor your AWS environment, you'll need to configure a user from your AWS console that can access CloudWatch and SDK metrics for your AWS resources.

The user you create needs to have permission to access the data for your AWS resources. There are two ways that you can grant this access:

1. Attach the default AWS Read Only Access Policy to your LogicMonitor user & add additional permissions for certain AWS resources as necessary. We recommend this option because updates and changes are less likely to affect the collection of your AWS data.

2. Create and attach a custom policy that includes the minimum permissions necessary for LogicMonitor to collect data for your AWS resources.

1.2.1 Read Only Access:

To grant LogicMonitor read-only permissions, follow these steps in your AWS Console:

• Navigate to the Identity and Access Management (IAM) Section of your AWS Console

• Create a user in the 'Users' section of IAM in your AWS Console (be sure to save the Access Key and Secret Access Key; you will need them later) and then either:

o Attach the 'ReadOnlyAccess' policy to the user

o Add the user to a Group with the 'ReadOnlyAccess' policy attached

If you attach AWS's default read-only policy to your user, there are additional permissions that must be specified in order for LogicMonitor to monitor ElastiCache and AWS Billing. The easiest way to grant these additional permissions is to add an inline policy to your LogicMonitor AWS user (IAM --> select user --> add Inline Policy). ANS will advise on suggested policies and provide more information on the specific permissions required for your AWS resources where necessary.

1.2.2 Custom Permissions

You can alternatively grant custom permissions by creating a new policy that can be attached to a user or a group of users in AWS. In order to grant the minimum level of access needed for LogicMonitor to monitor your environment, the following minimum permissions must be included in the policy:

{

"Statement": [

{

"Action": [

"apigateway:get",

"cloudfront:list*",

"cloudwatch:Describe*",

"cloudwatch:Get*",

"cloudwatch:List*",

"dynamodb:DescribeTable",

"dynamodb:ListTables",

"ec2:Describe*",

"ecs:Describe*",

"ecs:List*",

"elasticfilesystem:Describe*",

"s3:List*",

"s3:GetObject",

"s3:GetObjectVersion",

"s3:getBucketTagging",

"s3:GetBucketLocation",

"swf:list*",

"autoscaling:DescribeAutoScalingGroups",

"elasticache:DescribeCacheClusters",

"elasticache:ListTagsForResource",

"elasticloadbalancing:DescribeLoadBalancers",

"elasticloadbalancing:describeTags",

"elasticmapreduce:Describe*",

"elasticmapreduce:List*",

"es:ListTags",

"es:Describe*",

"es:ListDomainNames",

"kinesis:DescribeStream",

"kinesis:listStreams",

"kinesis:listTagsForStream",

"lambda:List*",

"lambda:getFunctionConfiguration",

"rds:DescribeDBInstances",

"rds:listTagsForResource",

"redshift:DescribeClusters",

"route53:Get*",

"route53:List*",

"sns:listTopics",

"sns:getTopicAttributes",

"ebs:describeVolumes",

"sqs:GetQueueAttributes",

"sqs:GetQueueUrl",

"sqs:listQueues"

],

"Effect": "Allow",

"Resource": "*"

}

],

"Version": "2012-10-17"

}

To create a new policy with the above permissions:

1. Create a new policy in the Policy section of IAM in your AWS Console

2. Select 'Create Your Own Policy' and paste the contents above into the 'Policy Document' field of the Policy

3. Apply the policy to a user or a group of users

Note: You don't need to include all of the permissions listed above if they are not applicable to the resources in your environment.
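As the note above says, the policy can be reduced to the services you actually run. The following is a minimal Python sketch of that trimming, not part of LogicMonitor's or AWS's tooling: the function name `trim_policy` is hypothetical, and `FULL_POLICY` is a shortened copy of the policy above used only for illustration. Because the AWS Datasources rely on the CloudWatch API, the example keeps the `cloudwatch` actions alongside the chosen services.

```python
import json

# Shortened copy of the policy above, for illustration only.
FULL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:Describe*",
                "cloudwatch:Get*",
                "cloudwatch:List*",
                "ec2:Describe*",
                "rds:DescribeDBInstances",
                "rds:listTagsForResource",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:listQueues",
            ],
            "Resource": "*",
        }
    ],
}


def trim_policy(policy, service_prefixes):
    """Return a copy of `policy` keeping only the actions whose service
    prefix (the part before the colon) is in `service_prefixes`.
    Prefixes are compared case-insensitively."""
    wanted = {prefix.lower() for prefix in service_prefixes}
    trimmed = json.loads(json.dumps(policy))  # deep copy via a JSON round trip
    for statement in trimmed["Statement"]:
        statement["Action"] = [
            action
            for action in statement["Action"]
            if action.split(":", 1)[0].lower() in wanted
        ]
    return trimmed


if __name__ == "__main__":
    # Keep CloudWatch plus EC2 and RDS; the SQS actions are dropped.
    print(json.dumps(trim_policy(FULL_POLICY, ["cloudwatch", "ec2", "rds"]), indent=2))
```

The printed JSON can then be pasted into the 'Policy Document' field in place of the full policy.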

1.3 Adding your AWS Account into LogicMonitor

To start monitoring your AWS environment you will need to provide the following information to ANS:

• AWS Account ID - you can find this value from your AWS console by clicking Support in the upper right hand corner & then Support Center.

• AWS Access Key ID – this must correspond with the user configured in section 1.2 of this document - You can find your AWS Access Key ID in your AWS Console in IAM > Users > Manage Keys.

• AWS Secret Access Key – this must correspond with the user configured in section 1.2 of this document - If you don't know your Secret Access Key, you should generate a new key pair.

• AWS Regions – which AWS regions do you wish to discover services from.

• Tag Filter – What tag filters (if any) should be applied. If you specify a Tag Filter, only AWS resources that meet the filter criteria will be monitored. Note that:

o You can use glob expressions with the tag filter (e.g. tag value = prod*)

o Multiple filters will be logically connected with an OR

o The tag filter is case sensitive

• Automatically remove dead instances – choose whether you want to remove dead instances from monitoring, and whether this should happen immediately or after a specified period of time. This will only apply to terminated instances.

• Services – List the AWS services you want to monitor, e.g. EC2, Load Balancers, ElastiCache, etc.
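To help decide which tag filters to request, the three Tag Filter rules listed above (glob expressions, OR between filters, case sensitivity) can be sketched in Python. This is an illustration of the described behaviour only, not LogicMonitor's implementation. The function name `resource_matches` is hypothetical, and the sketch assumes the glob applies to the tag value while the tag name must match exactly.

```python
from fnmatch import fnmatchcase


def resource_matches(tags, filters):
    """Return True if a resource's tags satisfy at least one filter.

    tags    -- dict of tag name -> tag value for one AWS resource
    filters -- list of (tag_name, value_glob) pairs; an empty list means
               no tag filter is configured, so every resource matches
    """
    if not filters:
        return True
    # Filters are OR-ed together; fnmatchcase compares case-sensitively.
    return any(
        name in tags and fnmatchcase(tags[name], pattern)
        for name, pattern in filters
    )


if __name__ == "__main__":
    filters = [("env", "prod*"), ("team", "web")]
    print(resource_matches({"env": "production"}, filters))  # matched by the glob
    print(resource_matches({"env": "Production"}, filters))  # differs in case
```

Under these assumptions, a resource tagged `env=Production` would not be picked up by a `prod*` filter, so it is worth checking the exact casing of your tags before supplying filters.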

1.4 CloudWatch Costs

LogicMonitor's AWS Datasources rely on the CloudWatch API to get data for your AWS resources. AWS offers 1 million free CloudWatch API requests per month, but API requests beyond that free tier cost $0.01 per 1,000 requests. Depending on which AWS services you are monitoring and how many resources you are monitoring, it is possible that you may see CloudWatch costs associated with LogicMonitor's AWS Monitoring. For more information on estimating costs see the following document: https://www.logicmonitor.com/support/monitoring/aws/cloudwatch-costs/
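As a rough guide, the pricing quoted above can be turned into a back-of-the-envelope estimate. The Python sketch below is illustrative only: it uses the free tier and rate stated in this document (current AWS pricing may differ), and it assumes one CloudWatch API request per datapoint per poll, a 5 minute poll interval and a 30 day month. The function name `estimate_monthly_cost` is hypothetical.

```python
FREE_REQUESTS_PER_MONTH = 1_000_000  # free tier quoted in this document
COST_PER_1000_REQUESTS = 0.01        # USD, rate quoted in this document


def estimate_monthly_cost(datapoints, poll_interval_minutes=5, days=30):
    """Return (total_requests, estimated_cost_usd) for one month.

    Assumes one CloudWatch API request per datapoint per poll.
    """
    polls_per_month = days * 24 * 60 // poll_interval_minutes
    requests = datapoints * polls_per_month
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return requests, billable * COST_PER_1000_REQUESTS / 1000


if __name__ == "__main__":
    requests, cost = estimate_monthly_cost(datapoints=200)
    print(f"{requests} requests, approx. ${cost:.2f} per month")
```

For example, 200 datapoints polled every 5 minutes gives 200 x 8,640 = 1,728,000 requests in a 30 day month, of which 728,000 fall outside the free tier, or roughly $7.28 at the quoted rate.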

2. Azure Monitoring

LogicMonitor collects metrics for your Azure resources via the Azure Insights API. As such, you will need to set up an Azure Active Directory (AAD) application that LogicMonitor can use to make requests to the Azure Insights API.

2.1 Granting LogicMonitor Access to the Azure Insights API

All API requests need to be authenticated via AAD, which is why you'll need to create an AAD application for LogicMonitor. That application will need reader permissions associated with the resources you want monitored in LogicMonitor. Typically, these permissions are set at the subscription level. The following steps provide instruction for creating an application and assigning the necessary permission:

1) In the AAD section of your Azure portal, select 'App registrations' under the Manage menu & add a new application:

The Name of the application is how you will see it displayed throughout your Azure portal, but does not have any specific requirements.

The Application Type should be 'Web app / API'

The Sign-on URL doesn't have any significance (and therefore doesn't matter) because the application will only be making API requests; you won't be signing into it. In our example, we just put our LogicMonitor URL.

2) Once you've saved the application, select it and navigate to the Keys section, under the API Access menu. You'll need to add a set of keys by filling in a key description, a duration, and then selecting save:

You'll need to provide the Application ID (Azure Client ID) and an Application Key Value (Azure Secret Key) to ANS, so please make a note of them.

3) Now that you have an application with an API Key, you'll need to give this application access to the resources you want monitored. In Azure, you can assign permission at the resource group level or the subscription level. Usually, assigning permissions at the subscription level is easiest & provides the most value. For each subscription you want to assign the application permission to, you'll need to navigate to the subscription's Access control (IAM) and add the application as a user with a minimum of Reader permissions:

If you have multiple subscriptions that you want to add into monitoring, you'll need to add the application with a reader role to each one. For a large number of subscriptions, you may consider doing this via PowerShell. For example, the following PowerShell script will add an AAD application for LogicMonitor & add the application as a reader to each subscription available to the user that runs the script:

    # Authenticate to all Azure subscriptions that the user has access to
    Login-AzureRmAccount

    # Password for the service principal
    # (named $spPassword because $pwd is PowerShell's automatic current-directory variable)
    $spPassword = "{service-principal-password}"

    # Create a new Azure AD application
    $azureAdApplication = New-AzureRmADApplication `
        -DisplayName "LogicMonitor" `
        -HomePage "https://lmtest.logicmonitor.com" `
        -IdentifierUris "https://lmtest.logicmonitor.com" `
        -Password $spPassword

    # Create a new service principal associated with the designated application
    New-AzureRmADServicePrincipal -ApplicationId $azureAdApplication.ApplicationId

    # Assign Reader role to the newly created service principal for each subscription
    Get-AzureRmSubscription | ForEach-Object {
        Set-AzureRmContext -SubscriptionId $_.SubscriptionId
        New-AzureRmRoleAssignment -RoleDefinitionName Reader `
            -ServicePrincipalName $azureAdApplication.ApplicationId.Guid
    }

LogicMonitor has Azure Datasources to monitor the following Azure services:

• Azure VMs

• Azure SQL Databases

• Azure App Services

• Azure Redis Cache

• Azure Application Gateway

All services are automatically discovered when a subscription is onboarded.

Further details regarding Azure DataSources can be found in Appendix B.

3. Web Service Checks

Service Checks periodically test website performance and availability from outside of your network.

While regular server monitoring can report on website performance, those tests are run from inside your network, and therefore may not adequately reflect your external customers' point of view. Using LogicMonitor Service Checks to test your website from outside of your network can help ensure that your website is accessible to your users.

There are two types of Service Checks:

• Ping Service Checks - periodically ping an IP address from one or more external locations

• Web Service Checks - periodically make HTTP GET, HEAD or POST requests to one or more URLs from one or more external locations (can handle Basic, NTLM or form based authentication)

There are four testing locations:

• Dublin, EU

• Washington DC, US

• Los Angeles, US

• Singapore

In order to configure Service Checks ANS require the following information:

• URL (please specify if HTTP or HTTPS)

• HTTP Method you’d like to check (e.g. GET, HEAD, POST)

• If we're using POST, what data is sent and what format is it in?

• Is there a redirect to follow?

• Is Authentication required – if so is it BASIC or NTLM – and what are the account credentials we should use

• What format will the response to the HTTP request be in? Choose from plain text/string, regular expression, glob expression, XML, JSON and multi-line key-value pairs.

• What criteria should be included in the response?

• Specify the expected HTTP status code(s) response.

• Which of the following locations do you wish to monitor from:

o US – Los Angeles

o US – Washington DC

o Europe – Dublin

o Asia – Singapore

• How often should the check run (Default is every 5 minutes)

• How quickly must the webpage load (In seconds – default is 30 seconds)

• After how many failed checks should an alert be raised (default is 1)

• What number of locations should the check fail at before an alert is triggered, and what severity of alert should be triggered? The options for the number of locations are:

o All

o Half

o More than one

o Any

• What level of alert should be sent should a single location check fail?
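To make the alerting questions above concrete, the following Python sketch shows how the "failed checks" count and the "number of locations" option could combine into an alert decision. It is an illustration under stated assumptions rather than LogicMonitor's actual logic: the function names are hypothetical, and "Half" is assumed to mean at least half of the selected locations, rounded up.

```python
import math


def locations_needed(rule, total_locations):
    """Number of failing locations required to trigger an alert under `rule`."""
    rule = rule.lower()
    if rule == "all":
        return total_locations
    if rule == "half":
        return math.ceil(total_locations / 2)  # assumption: at least half, rounded up
    if rule == "more than one":
        return 2
    if rule == "any":
        return 1
    raise ValueError(f"unknown rule: {rule!r}")


def should_alert(consecutive_failures, rule, failed_checks_threshold=1):
    """Decide whether an alert should be raised.

    consecutive_failures    -- dict of location -> consecutive failed checks
    rule                    -- "All", "Half", "More than one" or "Any"
    failed_checks_threshold -- failed checks needed before a location counts
                               as failing (the document's default is 1)
    """
    failing = sum(
        1 for count in consecutive_failures.values()
        if count >= failed_checks_threshold
    )
    return failing >= locations_needed(rule, len(consecutive_failures))


if __name__ == "__main__":
    state = {"Dublin": 1, "Washington DC": 0, "Los Angeles": 1, "Singapore": 0}
    for rule in ("All", "Half", "More than one", "Any"):
        print(rule, should_alert(state, rule))
```

With two of four locations failing, this sketch alerts under "Half", "More than one" and "Any" but not under "All", which may help when choosing the option that best fits how geographically spread your users are.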

Appendix A – Collector Server Specifications

Monitoring Azure Virtual Machines currently requires the installation of a LogicMonitor collector within your Azure environment. AWS services that require monitoring through the AWS SDK may also require a collector. ANS carry out the installation of the LogicMonitor Collector and the addition of any devices into the LogicMonitor portal.

• A Windows 2008/2012/2016 Virtual Machine.

• A minimum of 1GB of RAM (preferably 2GB if you plan to collect data from more than 100 devices)

• Able to make an outgoing https connection (TLS on port 443) to the LogicMonitor servers (proxies are supported). This can be via standard Internet access or can be locked down to the following:

o If DNS names in firewall access control rules are supported:

▪ account.logicmonitor.com

▪ appproxy.logicmonitor.com

o If DNS names in firewall access control rules are not supported:

▪ 212.118.245.0/24 (UK)

▪ 63.251.201.0/24

▪ 74.201.65.0/24

▪ 69.25.43.0/24

▪ 54.193.15.255

▪ 54.209.7.170

▪ 54.194.232.54

▪ 54.254.224.41

• The collector must be able to reach all the hosts from which it will be collecting data by the appropriate methods, for example, SNMP, WMI, HTTP, JDBC. For reference those ports are:

o ICMP for ping monitoring

o 80 for HTTP monitoring

o 135 and high ports for WMI

o 161 for SNMP

o 162 for SNMP traps

o 443 for HTTPS

o 445 for Perfmon

o 1433 for SQL

o 1521 for Oracle

o 2055 for Netflow

o 3306 for MySQL

o 22 for Router and Switch Config Backups

• Minimize network impediments between the collector and the monitored hosts/devices. The recommendation is one collector per VPC/VNET where possible.

• The collector should have reliable time - thus it should have NTP setup or Windows Time Services to synchronize via NTP. If running on a VMware virtual machine, install VMware tools with VMware tools periodic Time Sync disabled. For further information, see this VMware document.

• If a domain is present, the collector should be joined to the customer's domain, specifically the domain containing the devices we will be monitoring.

• Anti-Virus installed on the collector. This can be provided by the customer or ANS can provide a WebRoot Anti-Virus client if required. Please discuss with your service manager.

• Configure Windows Update to automatically download and install updates at 3am every Sunday.
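Before the collector is installed, it can be useful to confirm the TCP connectivity requirements listed above from the prospective collector server. The Python sketch below is one way to do that; the function names and the example host names are hypothetical. It covers TCP ports only, so ICMP and the UDP-based SNMP ports (161 and 162) need to be verified by other means.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Check a list of (host, port) pairs and return {(host, port): bool}."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host, port in targets
    }


# Example usage with placeholder host names (replace with your own):
#   results = check_targets([
#       ("sql01.example.internal", 1433),   # SQL
#       ("web01.example.internal", 443),    # HTTPS
#   ])
#   for (host, port), ok in results.items():
#       print(host, port, "reachable" if ok else "NOT reachable")
```

A `False` result only shows that the connection could not be established from that server; the cause may be a firewall rule, a stopped service or a DNS issue.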

Appendix B - Azure DataSources

Azure VMs

LogicMonitor currently has one DataSource for monitoring Azure VM performance metrics:

Microsoft_Azure_VMs - collects basic performance data for Azure VMs

Note that installing a Collector within your Azure environment and monitoring your VMs via Collector will provide more comprehensive metrics than those reported via the Azure Monitor API.

Microsoft_Azure_VMs

Source: Azure Insights API

Datapoints:

• DiskReadBytes

• DiskReadOperationsPerSec

• DiskWriteBytes

• DiskWriteOperationsPerSec

• NetworkIn

• PercentageCPU

Azure SQL Databases

LogicMonitor currently has one DataSource for monitoring Azure SQL database performance metrics:

Microsoft_Azure_SQLDatabase - collects performance data for Azure SQL databases

Note that installing a Collector within your Azure environment and monitoring your SQL Databases via Collector does provide more comprehensive metrics than those reported via the Azure Monitor API.

Microsoft_Azure_SQLDatabase

Source: Azure Insights API

Datapoints:

• BlockedByFireWall

• ConnectionFailed

• ConnectionSuccessful

• CPUPercent

• DTUConsumptionPercent

• DTULimit

• DTUUsed

• LogWritePercent

• PhysicalDataReadPercent

• SessionsPercent

• Storage

• StoragePercent

• WorkersPercent

• XTPStoragePercent

Azure App Services

LogicMonitor currently has one DataSource for monitoring Azure App Services:

Microsoft_Azure_WebApplication - collects performance data for Azure Web Apps (under Azure App Services)

Microsoft_Azure_WebApplication

Source: Azure Insights API

Datapoints:

• AverageMemoryWorkingSet

• AverageResponseTime

• BytesReceived

• BytesSent

• CpuTime

• Http101

• Http2xx

• Http3xx

• Http401

• Http403

• Http404

• Http406

• Http4xx

• Http5xx

• MemoryWorkingSet

• Requests

Azure Redis Cache

LogicMonitor currently has one DataSource for monitoring Azure Redis Cache:

Microsoft_Azure_RedisCache - collects performance data for Azure Redis Cache resources

Microsoft_Azure_RedisCache

Source: Azure Monitor (formerly Insights) API

Datapoints:

• CacheHits

• CacheMisses

• CacheRead

• CacheWrite

• ConnectedClients

• CPU

• EvictedKeys

• ExpiredKeys

• GetCommands

• ServerLoad

• SetCommands

• TotalCommandsProcessed

• TotalKeys

• UsedMemory

• UsedMemoryRss