
Perform Big Data Engineering on Microsoft

Cloud Services (beta)

Microsoft 70-776 Dumps Available Here at:

https://www.certification-questions.com/microsoft-exam/70-776-dumps.html

By enrolling now, you will get access to 83 questions in a unique set of 70-776 dumps.

Question 1 Note: This question is part of a series of questions that present the same scenario. Each question in

the series contains a unique solution that might meet the stated goals. Some question sets might

have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these

questions will not appear in the review screen.

You have a table named Table1 that contains 3 billion rows. Table1 contains data from the last 36 months.

At the end of every month, the oldest month of data is removed based on a column named DateTime.

You need to minimize how long it takes to remove the oldest month of data.

Solution: You specify DateTime as the hash distribution column.

Does this meet the goal?

Options:

A. Yes

B. No

Answer: B

Explanation:

A hash-distributed table distributes table rows across the Compute nodes by using a deterministic hash

function to assign each row to one distribution.

Since identical values always hash to the same distribution, the data warehouse has built-in knowledge of

the row locations. SQL Data Warehouse uses this knowledge to minimize data movement during queries,

which improves query performance.

Note: A distributed table appears as a single table, but the rows are actually stored across 60 distributions.

The rows are distributed with a hash or round-robin algorithm.

References: https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
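The rejected solution can be sketched roughly as follows. Table and column names come from the question; the Id column and the data types are assumed for illustration. Because hashing assigns rows by distinct column value, each month's rows are scattered across all 60 distributions, so a hash distribution on DateTime does nothing to localize, and therefore speed up, the monthly delete:

```sql
-- Sketch only: the rejected solution from this question.
-- Hash-distributing on [DateTime] scatters every month's rows across
-- all 60 distributions, so removing the oldest month still requires
-- a large DELETE that touches every distribution.
CREATE TABLE dbo.Table1
(
    Id         BIGINT    NOT NULL,   -- assumed illustrative column
    [DateTime] DATETIME2 NOT NULL
)
WITH ( DISTRIBUTION = HASH ([DateTime]) );
```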


Question 2 Note: This question is part of a series of questions that present the same scenario. Each question in

the series contains a unique solution that might meet the stated goals. Some question sets might

have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these

questions will not appear in the review screen.

You have a table named Table1 that contains 3 billion rows. Table1 contains data from the last 36 months.

At the end of every month, the oldest month of data is removed based on a column named DateTime.

You need to minimize how long it takes to remove the oldest month of data.

Solution: You implement round robin for table distribution.

Does this meet the goal?

Options:

A. Yes

B. No

Answer: B

Explanation:

A distributed table appears as a single table, but the rows are actually stored across 60 distributions. The

rows are distributed with a hash or round-robin algorithm.

A round-robin distributed table distributes table rows evenly across all distributions. The assignment of rows

to distributions is random. Unlike hash-distributed tables, rows with equal values are not guaranteed to be

assigned to the same distribution.

References: https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
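The rejected round-robin solution can be sketched the same way (the Id column and types are assumed for illustration). With ROUND_ROBIN, row placement is effectively random, so the oldest month's rows are again spread across every distribution and the monthly delete is not any faster:

```sql
-- Sketch only: the rejected round-robin solution.
-- Rows are assigned to the 60 distributions evenly but at random,
-- so a month's data is spread everywhere and cannot be removed cheaply.
CREATE TABLE dbo.Table1
(
    Id         BIGINT    NOT NULL,   -- assumed illustrative column
    [DateTime] DATETIME2 NOT NULL
)
WITH ( DISTRIBUTION = ROUND_ROBIN );
```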

Question 3 DRAG DROP

Note: This question is part of a series of questions that use the same scenario. For your

convenience, the scenario is repeated in each question. Each question presents a different goal and

answer choices, but the text of the scenario is exactly the same in each question in this series.

Start of repeated scenario

You are developing a Microsoft Azure SQL data warehouse to perform analytics on the transit system of a

city. The data warehouse will contain data about customers, trips, and community events.

You have two storage accounts named StorageAccount1 and StorageAccount2. StorageAccount1 is

associated to the data warehouse. StorageAccount2 contains weather data files stored in the CSV format.

The files have a naming format of city_state_yyymmdd.csv.

Microsoft SQL Server is installed on an Azure virtual machine named AzureVM1.

You are migrating from an existing on-premises solution that uses Microsoft SQL Server 2016 Enterprise.

The planned schema is shown in the exhibit. (Click the Exhibit button)


The first column of each table will contain unique values. A table named Customer will contain 12 million

rows. A table named Trip will contain 3 billion rows.

You have the following view.


You plan to use Azure Data Factory to perform the following four activities:

- Activity1: Invoke an R script to generate a prediction column.

- Activity2: Import weather data from a set of CSV files in Azure Blob storage.

- Activity3: Execute a stored procedure in the Azure SQL data warehouse.

- Activity4: Copy data from an Amazon Simple Storage Service (S3).

You plan to detect the following two threat patterns:

- Pattern1: A user logs in from two physical locations.

- Pattern2: A user attempts to gain elevated permissions.

End of repeated scenario

Which types of threat detection should you configure for each threat pattern? To answer, drag the

appropriate threat detection types to the correct patterns. Each threat detection type may be used once,

more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Select and Place:


Options:

A.

Answer: A

Explanation:


SQL Threat Detection provides a new layer of security, which enables customers to detect and respond to

potential threats as they occur by providing security alerts on anomalous activities. Users receive an alert

upon suspicious database activities, potential vulnerabilities, and SQL injection attacks, as well as

anomalous database access patterns.

From scenario: You plan to detect the following two threat patterns:

Pattern1: A user logs in from two physical locations.

Pattern2: A user attempts to gain elevated permissions.

References: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-threat-detection


Question 4 Note: This question is part of a series of questions that use the same scenario. For your

convenience, the scenario is repeated in each question. Each question presents a different goal and

answer choices, but the text of the scenario is exactly the same in each question in this series.

Start of repeated scenario

You are developing a Microsoft Azure SQL data warehouse to perform analytics on the transit system of a

city. The data warehouse will contain data about customers, trips, and community events.

You have two storage accounts named StorageAccount1 and StorageAccount2. StorageAccount1 is

associated to the data warehouse. StorageAccount2 contains weather data files stored in the CSV format.

The files have a naming format of city_state_yyymmdd.csv.

Microsoft SQL Server is installed on an Azure virtual machine named AzureVM1.

You are migrating from an existing on-premises solution that uses Microsoft SQL Server 2016 Enterprise.

The planned schema is shown in the exhibit. (Click the Exhibit button)


The first column of each table will contain unique values. A table named Customer will contain 12 million

rows. A table named Trip will contain 3 billion rows.

You have the following view.


You plan to use Azure Data Factory to perform the following four activities:

- Activity1: Invoke an R script to generate a prediction column.

- Activity2: Import weather data from a set of CSV files in Azure Blob storage.

- Activity3: Execute a stored procedure in the Azure SQL data warehouse.

- Activity4: Copy data from an Amazon Simple Storage Service (S3).

You plan to detect the following two threat patterns:

- Pattern1: A user logs in from two physical locations.

- Pattern2: A user attempts to gain elevated permissions.

End of repeated scenario

You plan to create the Azure Data Factory pipeline.

Which activity requires that you create a custom activity?

Options:

A. Activity2

B. Activity4

C. Activity3

D. Activity1

Answer: D

Explanation:


Incorrect Answers:

A: Supported copy activities include copying data in GZip compressed text (CSV) format from Azure Blob storage and writing it to Azure SQL Database.

B: Amazon S3 is supported as a source data store.

C: You can use the SQL Server Stored Procedure activity in a Data Factory pipeline to invoke a stored

procedure in one of the following data stores: Azure SQL Database, Azure SQL Data Warehouse, SQL

Server Database in your enterprise or an Azure VM.

Note: There are two types of activities that you can use in an Azure Data Factory pipeline.

- Data movement activities to move data between supported source and sink data stores.

- Data transformation activities to transform data using compute services such as Azure HDInsight, Azure

Batch, and Azure Machine Learning.

To move data to/from a data store that Data Factory does not support, or to transform/process data in a

way that isn't supported by Data Factory, you can create a Custom activity with your own data movement or

transformation logic and use the activity in a pipeline. The custom activity runs your customized code logic

on an Azure Batch pool of virtual machines.

References: https://docs.microsoft.com/en-us/azure/data-factory/transform-data-using-dotnet-custom-activity

Question 5 Note: This question is part of a series of questions that use the same scenario. For your

convenience, the scenario is repeated in each question. Each question presents a different goal and

answer choices, but the text of the scenario is exactly the same in each question in this series.

Start of repeated scenario

You are developing a Microsoft Azure SQL data warehouse to perform analytics on the transit system of a

city. The data warehouse will contain data about customers, trips, and community events.

You have two storage accounts named StorageAccount1 and StorageAccount2. StorageAccount1 is

associated to the data warehouse. StorageAccount2 contains weather data files stored in the CSV format.

The files have a naming format of city_state_yyymmdd.csv.

Microsoft SQL Server is installed on an Azure virtual machine named AzureVM1.

You are migrating from an existing on-premises solution that uses Microsoft SQL Server 2016 Enterprise.

The planned schema is shown in the exhibit. (Click the Exhibit button)


The first column of each table will contain unique values. A table named Customer will contain 12 million

rows. A table named Trip will contain 3 billion rows.

You have the following view.


You plan to use Azure Data Factory to perform the following four activities:

- Activity1: Invoke an R script to generate a prediction column.

- Activity2: Import weather data from a set of CSV files in Azure Blob storage.

- Activity3: Execute a stored procedure in the Azure SQL data warehouse.

- Activity4: Copy data from an Amazon Simple Storage Service (S3).

You plan to detect the following two threat patterns:

- Pattern1: A user logs in from two physical locations.

- Pattern2: A user attempts to gain elevated permissions.

End of repeated scenario

You need to copy the weather data for June 2016 to StorageAccount1.

Which command should you run on AzureVM1?

Options:

A. azcopy.exe

B. robocopy.exe

C. sqlcmd.exe

D. bcp.exe

Answer: A

Explanation:

AzCopy is a command-line utility designed for copying data to/from Microsoft Azure Blob, File, and Table

storage, using simple commands designed for optimal performance. You can copy data between a file

system and a storage account, or between storage accounts.

From scenario: You have two storage accounts. StorageAccount1 is associated to the data warehouse.


StorageAccount2 contains weather data files stored in the CSV format.

Incorrect Answers:

B: Robocopy is a free and robust file-copy utility included in Windows for large file copies, but it copies between file systems and cannot write directly to Azure Blob storage.

References: https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy

Question 6 HOTSPOT

Note: This question is part of a series of questions that use the same scenario. For your

convenience, the scenario is repeated in each question. Each question presents a different goal and

answer choices, but the text of the scenario is exactly the same in each question in this series.

Start of repeated scenario

You are developing a Microsoft Azure SQL data warehouse to perform analytics on the transit system of a

city. The data warehouse will contain data about customers, trips, and community events.

You have two storage accounts named StorageAccount1 and StorageAccount2. StorageAccount1 is

associated to the data warehouse. StorageAccount2 contains weather data files stored in the CSV format.

The files have a naming format of city_state_yyymmdd.csv.

Microsoft SQL Server is installed on an Azure virtual machine named AzureVM1.

You are migrating from an existing on-premises solution that uses Microsoft SQL Server 2016 Enterprise.

The planned schema is shown in the exhibit. (Click the Exhibit button)


The first column of each table will contain unique values. A table named Customer will contain 12 million

rows. A table named Trip will contain 3 billion rows.

You have the following view.


You plan to use Azure Data Factory to perform the following four activities:

- Activity1: Invoke an R script to generate a prediction column.

- Activity2: Import weather data from a set of CSV files in Azure Blob storage.

- Activity3: Execute a stored procedure in the Azure SQL data warehouse.

- Activity4: Copy data from an Amazon Simple Storage Service (S3).

You plan to detect the following two threat patterns:

- Pattern1: A user logs in from two physical locations.

- Pattern2: A user attempts to gain elevated permissions.

End of repeated scenario

You plan to create a report that will query customer records for a selected ResidenceZip. The report will

return customer trips sorted by TripStartDateTime.

You need to specify the distribution clause for each table. The solution must meet the following

requirements.

- Minimize how long it takes to query the customer information.

- Perform the operation as a pass-through query without data movement.

How should you complete the statement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Hot Area:


Options:

A.


Answer: A

Explanation:


The most common example of when a table distributed by a column will far outperform a round-robin table is when two large fact tables are joined. For example, if you have an orders table distributed by order_id, and a transactions table also distributed by order_id, then joining the orders table to the transactions table on order_id becomes a pass-through query, which eliminates data movement operations. Fewer steps mean a faster query, and less data movement also makes for faster queries.

Incorrect Answers:

Round robin (the best practice is to hash-distribute large tables): By default, tables are round-robin distributed. This makes it easy for users to get started creating tables without having to decide how their tables should be distributed. Round-robin tables may perform sufficiently for some workloads, but in most cases selecting a distribution column will perform much better.

References: https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-best-practices
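The orders/transactions example above can be sketched as follows. The table and column definitions are assumed for illustration; the point is that both tables hash-distribute on the shared join column, so the join resolves locally within each distribution:

```sql
-- Sketch only: co-locating two fact tables on the shared join column.
CREATE TABLE dbo.Orders
(
    order_id   BIGINT NOT NULL,
    order_date DATE   NOT NULL
)
WITH ( DISTRIBUTION = HASH (order_id) );

CREATE TABLE dbo.Transactions
(
    transaction_id BIGINT NOT NULL,
    order_id       BIGINT NOT NULL,
    amount         DECIMAL(18, 2) NOT NULL
)
WITH ( DISTRIBUTION = HASH (order_id) );

-- Because both tables are distributed on order_id, this join is a
-- pass-through query: no shuffle or broadcast data movement is needed.
SELECT o.order_id, t.amount
FROM dbo.Orders o
JOIN dbo.Transactions t
    ON t.order_id = o.order_id;
```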

Question 7 Note: This question is part of a series of questions that present the same scenario. Each question in

the series contains a unique solution that might meet the stated goals. Some question sets might

have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these

questions will not appear in the review screen.

You have a table named Table1 that contains 3 billion rows. Table1 contains data from the last 36 months.

At the end of every month, the oldest month of data is removed based on a column named DateTime.

You need to minimize how long it takes to remove the oldest month of data.

Solution: You implement range partitioning based on the year and the month.

Does this meet the goal?

Options:

A. Yes

B. No

Answer: A

Explanation:

The data from the same time period would be stored in the same partition, which makes it faster to remove one month of data.
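A minimal sketch of this solution, with the Id column, data types, boundary dates, and staging-table name all assumed for illustration: partition Table1 by month on DateTime, then remove the oldest month as a metadata-only partition switch rather than a row-by-row DELETE:

```sql
-- Sketch only: monthly range partitioning on [DateTime].
CREATE TABLE dbo.Table1
(
    Id         BIGINT    NOT NULL,   -- assumed illustrative column
    [DateTime] DATETIME2 NOT NULL
)
WITH
(
    DISTRIBUTION = HASH (Id),
    PARTITION ( [DateTime] RANGE RIGHT FOR VALUES
        ('2016-05-01', '2016-06-01', '2016-07-01') )  -- one boundary per month
);

-- Removing the oldest month is then a fast metadata operation:
-- switch partition 1 out to an empty table with a matching schema
-- (dbo.Table1_Staging, assumed to exist), then truncate or drop it.
ALTER TABLE dbo.Table1 SWITCH PARTITION 1 TO dbo.Table1_Staging PARTITION 1;
```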

Question 8 You plan to deploy a Microsoft Azure virtual machine that will host a data warehouse. The data warehouse

will contain a 10-TB database.

You need to provide the fastest read and writes times for the database.

Which disk configuration should you use?

Options:

A. spanned volumes


B. storage pools with striped disks

C. RAID 5 volumes

D. storage pools with mirrored disks

E. striped volumes

Answer: B

Question 9 You need to connect to a Microsoft Azure SQL data warehouse from an Azure Machine Learning

experiment.

Which data source should you use?

Options:

A. Azure Table

B. SQL Database

C. Web URL via HTTP

D. Data Feed Provider

Answer: B

Explanation:

Use Azure SQL Database as the Data Source.

References: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/import-from-azure-sql-database

Question 10 You have a fact table named PowerUsage that has 10 billion rows. PowerUsage contains data about

customer power usage during the last 12 months. The usage data is collected every minute. PowerUsage

contains the columns configured as shown in the following table.


LocationNumber has a default value of 1. The MinuteOfMonth column contains the relative minute within

each month. The value resets at the beginning of each month.

A sample of the fact table data is shown in the following table.

There is a related table named Customer that joins to the PowerUsage table on the CustomerId column.

Sixty percent of the rows in PowerUsage are associated with less than 10 percent of the rows in Customer.

Most queries do not require the use of the Customer table. Many queries select on a specific month.


You need to minimize how long it takes to find the records for a specific month.

What should you do?

Options:

A. Implement partitioning by using the MonthKey column. Implement hash distribution by using the CustomerId column.

B. Implement partitioning by using the CustomerId column. Implement hash distribution by using the MonthKey column.

C. Implement partitioning by using the MinuteOfMonth column. Implement hash distribution by using the MeasurementId column.

D. Implement partitioning by using the MonthKey column. Implement hash distribution by using the MeasurementId column.

Answer: C
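As given here, answer C combines MinuteOfMonth partitioning with MeasurementId hash distribution. The clause shapes can be sketched as follows; the column types and boundary values are assumptions for illustration, not the exam's exhibit:

```sql
-- Sketch only: the distribution and partition clauses implied by answer C.
CREATE TABLE dbo.PowerUsage
(
    MeasurementId  BIGINT NOT NULL,
    CustomerId     INT    NOT NULL,
    MonthKey       INT    NOT NULL,
    MinuteOfMonth  INT    NOT NULL,
    LocationNumber INT    NOT NULL DEFAULT 1
)
WITH
(
    DISTRIBUTION = HASH (MeasurementId),
    -- Illustrative weekly boundary values for the relative minute within a month.
    PARTITION ( MinuteOfMonth RANGE RIGHT FOR VALUES (10080, 20160, 30240, 40320) )
);
```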

Would you like to see more? Don't miss our 70-776 PDF file at: https://www.certification-questions.com/microsoft-pdf/70-776-pdf.html
