BatchJobService


A new approach to batch mutates

Agenda

● Sync vs. Async

● Batch job flow

● Temporary IDs

● Incremental Uploads

● Best practices

● Limitations

Synchronous vs. Asynchronous

Synchronous Requests

[Sequence diagram: API → services → backend]

● The client sends the operation in a request; the service makes an RPC to the backend and waits for the RPC to return.
● Only then does the response come back: OK, the client receives the status of the operation.

Asynchronous Requests

[Sequence diagram: API → async services → backend]

● The client sends all operations in a single request.
● The response returns immediately: OK, the operations will be processed.
● The operations enter an execution queue, and the request is released.
● In a later request the client checks the operations' status, and the server sends the results.

Async Processing Isn't Free

Asynchronous processes usually incur additional overhead, such as:

● Scheduling

● Management/load-balancing of the job queue

● Retry of failed operations

Synchronous vs. Asynchronous

● Highly responsive user interface: Sync
● Nightly bulk operations job: Async
● Fire-and-forget: Async
● Maximum throughput: Async

AdWords API Batch Job Flow

Batch Processing - Why Use It?

● Automatic retry of operations that encounter transient errors such as:
  ○ CONCURRENT_MODIFICATION
  ○ UNEXPECTED_INTERNAL_API_ERROR

● Automatic retry of operations that encounter rate limits

● Your application won't be blocked waiting for a response.

Limitations of MutateJobService

● Error handling is difficult
  ○ Difficult to find the operation that failed

● Can't create dependent entities in a single job

● MutateJobService has a much lower limit on the number of operations (in the thousands)

Introducing BatchJobService

● Allows setting temporary IDs on new entities for dependent entity creation

● Each result contains an operation index for correlating the result to its operation

● Much higher limit on operations: up to 1 GB of unprocessed operations

● Utilities are available in all client libraries

Farewell, MutateJobService

● BatchJobService is a replacement for MutateJobService

● MutateJobService is going away

Supported Operations

● AdGroupAdOperation, AdGroupAdLabelOperation (AdGroupAdService)
● AdGroupOperation, AdGroupLabelOperation (AdGroupService)
● AdGroupBidModifierOperation (AdGroupBidModifierService)
● AdGroupCriterionOperation, AdGroupCriterionLabelOperation (AdGroupCriterionService)
● BudgetOperation (BudgetService)
● CampaignOperation, CampaignLabelOperation (CampaignService)
● CampaignCriterionOperation (CampaignCriterionService)
● FeedItemOperation (FeedItemService)

Basic Steps

1. Create a BatchJob (BatchJobService returns the job with its uploadUrl)
2. Upload the list of operations to the uploadUrl
3. Poll the batch job's status periodically
4. Download the results of the job from its downloadUrl

Creating the Batch Job

● Create the BatchJob object

BatchJobOperation addOp = new BatchJobOperation();
addOp.setOperator(Operator.ADD);
addOp.setOperand(new BatchJob());

BatchJob batchJob =
    batchJobService.mutate(new BatchJobOperation[] {addOp}).getValue(0);

● Make sure to grab the uploadUrl

// Get the upload URL from the new job.
String uploadUrl = batchJob.getUploadUrl().getUrl();

Creating the Batch Job (Cont.)

● Job status will be AWAITING_FILE

System.out.printf(
    "Created BatchJob with ID %d, status '%s' and upload URL %s.%n",
    batchJob.getId(), batchJob.getStatus(), uploadUrl);

$ Created BatchJob with ID <ID>, status 'AWAITING_FILE' and upload URL <URL>.

● Creating the BatchJob is a synchronous operation!

Uploading the Operations

● First, construct operations like usual

Campaign campaign = new Campaign();
campaign.setName("Batch Campaign " + r.nextInt());
campaign.setStatus(CampaignStatus.PAUSED);
campaign.setAdvertisingChannelType(AdvertisingChannelType.SEARCH);

Budget budget = new Budget();
campaign.setBudget(budget);

BiddingStrategyConfiguration biddingStrategyConfiguration =
    new BiddingStrategyConfiguration(); // Fill the biddingStrategyConfiguration
campaign.setBiddingStrategyConfiguration(biddingStrategyConfiguration);

CampaignOperation operation = new CampaignOperation();
operation.setOperand(campaign);
operation.setOperator(Operator.ADD);
operations.add(operation); // Add to the list of operations

Uploading the Operations (Cont.)

● The next step is to send operations to the uploadUrl

● uploadUrl is valid for one week

● Request method: POST
● URL: the upload URL returned by BatchJobService.mutate
● Content-Type HTTP header: application/xml
● Request body: a mutate element in XML form, as specified in BatchJobOps.xsd (sketched below)
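The table above is enough to build the upload by hand. Purely as illustration (not from the deck), here is a minimal sketch of that POST using only java.net.HttpURLConnection; the upload URL and the mutate XML body are placeholders, and in practice the client-library helpers shown next generate and send this for you.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public final class RawBatchJobUpload {
  public static void main(String[] args) throws Exception {
    // Placeholder: use the uploadUrl returned by BatchJobService.mutate.
    String uploadUrl = "https://REPLACE_WITH_UPLOAD_URL";
    // Placeholder body: a real request must be a mutate element that follows BatchJobOps.xsd.
    String mutateXml =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<mutate><!-- operations go here --></mutate>";
    byte[] body = mutateXml.getBytes(StandardCharsets.UTF_8);

    HttpURLConnection connection =
        (HttpURLConnection) new URL(uploadUrl).openConnection();
    connection.setDoOutput(true);
    connection.setRequestMethod("POST");                              // request method: POST
    connection.setRequestProperty("Content-Type", "application/xml"); // required content type
    try (OutputStream out = connection.getOutputStream()) {
      out.write(body);                                                // mutate element as the body
    }
    System.out.println("Upload response code: " + connection.getResponseCode());
  }
}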

Uploading the Operations (Cont.)

● All libraries have utilities to help build the request!

● BatchJobHelper for Java client library:

// Use a BatchJobHelper to upload all operations.
BatchJobHelper batchJobHelper = new BatchJobHelper(session);
batchJobHelper.uploadBatchJobOperations(operations, uploadUrl);
System.out.printf(
    "Uploaded %d operations for batch job with ID %d.%n",
    operations.size(), batchJob.getId());

Polling the Job Status

● After uploading operations, the BatchJob status will move to ACTIVE.

Status lifecycle: AWAITING_FILE → ACTIVE → DONE or CANCELED

Polling the Job Status (Cont.)

● Check the job status

int pollAttempts = 0;
boolean isPending = true;
Selector selector =
    new SelectorBuilder()
        .fields(
            BatchJobField.Id, BatchJobField.Status, BatchJobField.DownloadUrl,
            BatchJobField.ProcessingErrors, BatchJobField.ProgressStats)
        .equalsId(batchJob.getId())
        .build();
do {
  long sleepSeconds = (long) Math.scalb(30, pollAttempts);
  System.out.printf("Sleeping %d seconds...%n", sleepSeconds);
  Thread.sleep(sleepSeconds * 1000);

  batchJob = batchJobService.get(selector).getEntries(0);
  System.out.printf(
      "Batch job ID %d has status '%s'.%n",
      batchJob.getId(), batchJob.getStatus());

  pollAttempts++;
  isPending = PENDING_STATUSES.contains(batchJob.getStatus());
} while (isPending && pollAttempts < MAX_POLL_ATTEMPTS);
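The loop above references PENDING_STATUSES and MAX_POLL_ATTEMPTS, which are defined elsewhere in the example. A plausible sketch of those definitions (the exact values and the CANCELING member are assumptions, not from the deck):

// Requires java.util.* plus BatchJobStatus from the AdWords client library.
private static final int MAX_POLL_ATTEMPTS = 5; // illustrative value
private static final Set<BatchJobStatus> PENDING_STATUSES =
    new HashSet<>(Arrays.asList(
        BatchJobStatus.AWAITING_FILE, BatchJobStatus.ACTIVE, BatchJobStatus.CANCELING));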

Polling the Job Status (Cont.)

● Poll for completion of the batch job with exponential backoff

● Check the job's status until it is CANCELED or DONE

● Log progressStats for large sets of operations

Downloading Results

● At this stage the job status can be DONE or CANCELED

● Let's check out the details...

Downloading Results - DONE

● Status: DONE

● Description: BatchJobService successfully parsed and attempted each of the uploaded operations

● Actions to take: Download the results for each operation from the batch job's downloadUrl

Downloading Results - CANCELED

● Status: CANCELED

● Description: An unexpected error occurred, or BatchJobService could not parse the uploaded operations

● Actions to take:
  ○ Inspect the list of processingErrors
  ○ Download the results for any successfully parsed operations from the batch job's downloadUrl

(This case is rare.)

Downloading Results - CANCELED

● Some operations may still have been attempted

● Always check the downloadUrl for the results

● Details about the mutateResult object:
  ○ https://goo.gl/pXAhBS

Downloading Results (Cont.)

● downloadUrl returns mutateResults
● There is another utility to help!

if (batchJob.getDownloadUrl() != null
    && batchJob.getDownloadUrl().getUrl() != null) {
  BatchJobMutateResponse mutateResponse =
      batchJobHelper.downloadBatchJobMutateResponse(
          batchJob.getDownloadUrl().getUrl());
  System.out.printf("Downloaded results from %s:%n",
      batchJob.getDownloadUrl().getUrl());

  for (MutateResult mutateResult : mutateResponse.getMutateResults()) {
    String outcome =
        mutateResult.getErrorList() == null ? "SUCCESS" : "FAILURE";
    System.out.printf("  Operation [%d] - %s%n",
        mutateResult.getIndex(), outcome);
  }
}

Downloading Results (Cont.)

● A mutateResult will have either a result or an errorList, but not both
  ○ result - the result of the corresponding successful operation, e.g., a Campaign object
  ○ errorList - the error list for the corresponding failed operation
  ○ index - the zero-based index of the corresponding operation

MutateResult fields: result (Operand), errorList (ErrorList), index (long)

Errors

BatchJob.processingErrors (per job)
● Errors encountered while parsing the uploaded operations - FILE_FORMAT_ERROR
● Unexpected errors, such as issues writing results

MutateResult.errorList (per operation)
● Retrieved from BatchJob.downloadUrl
● Errors encountered while attempting to execute a single successfully parsed operation
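Not shown on the slide: a short sketch of surfacing the per-job errors after polling. It assumes the Axis-generated BatchJob.getProcessingErrors() accessor and the standard ApiError getters, and that ProcessingErrors was included in the selector.

// Per-job errors (parsing problems, unexpected failures writing results).
if (batchJob.getProcessingErrors() != null) {
  for (BatchJobProcessingError processingError : batchJob.getProcessingErrors()) {
    System.out.printf("  Processing error: %s, trigger: %s, field path: %s%n",
        processingError.getApiErrorType(),
        processingError.getTrigger(),
        processingError.getFieldPath());
  }
} else {
  System.out.println("No processing errors found.");
}
// Per-operation errors come from MutateResult.errorList via the downloadUrl, as shown earlier.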

Temporary IDs

Temporary IDs

● Did you ever want to create a complete campaign in a single batch job?

● New feature introduced with BatchJobService
  ○ A negative number of type long
  ○ Just create an ID and reuse it in operations for dependent objects

Common Use Case

Create a Campaign in a single BatchJob:

● Campaign: id = -1
  ○ CampaignLabel: campaignId = -1
  ○ NegativeCampaignCriterion: campaignId = -1
  ○ AdGroup: id = -2, campaignId = -1
    - AdGroupAd: adGroupId = -2
    - BiddableAdGroupCriterion: adGroupId = -2, criterion text = "shoes"
    - BiddableAdGroupCriterion: adGroupId = -2, criterion text = "jackets"
    - ...

Important Notes

● The order of operations is very important!

● Keep this in mind when using temp IDs

● Create the parent object before creating its child objects (see the sketch below)
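To make the ordering rule concrete, here is a minimal sketch (not the deck's full example, and trimmed of required fields such as budget and bidding strategy): the campaign is added first with the temporary ID -1, and the dependent ad group references campaignId = -1 and is added afterwards. It assumes the same AdWords Axis classes and `operations` list used in the upload code earlier.

List<Operation> operations = new ArrayList<>();

// Parent first: a campaign with a temporary (negative) ID.
Campaign campaign = new Campaign();
campaign.setId(-1L);                 // temporary ID
campaign.setName("Temp ID campaign");
CampaignOperation campaignOperation = new CampaignOperation();
campaignOperation.setOperand(campaign);
campaignOperation.setOperator(Operator.ADD);
operations.add(campaignOperation);

// Child second: an ad group that references the campaign's temporary ID.
AdGroup adGroup = new AdGroup();
adGroup.setId(-2L);                  // temporary ID for this ad group's own children
adGroup.setCampaignId(-1L);          // resolved to the real campaign ID by BatchJobService
adGroup.setName("Temp ID ad group");
AdGroupOperation adGroupOperation = new AdGroupOperation();
adGroupOperation.setOperand(adGroup);
adGroupOperation.setOperator(Operator.ADD);
operations.add(adGroupOperation);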

Incremental Uploads

Introduction

● Allows multiple operation upload requests to the same BatchJob

● The job starts only after the last set of operations has been uploaded

● Follows the Google Cloud Storage upload guidelines

Incremental Upload Use Cases

● You build your BatchJob operations in phases or separate processes, so sending all operations at once isn't feasible

● You have a large set of operations and you don't want to send them all in one enormous upload request
  ○ E.g., you may not want to send 500 MB of operations in a single POST request

Request Attributes

● Request method: PUT
● URL: the upload URL returned by BatchJobService.mutate
● Content-Type HTTP header: application/xml
● Content-Length HTTP header: the number of bytes in the contents of the current request
● Content-Range HTTP header: the range of bytes in the request, followed by the total bytes. Total bytes will be * for the first and intermediate requests.
● Request body: operations in XML form, as specified in BatchJobOps.xsd

More on the Request Body...

● BatchJobService will concatenate all the requests

● You just need to send the opening and closing mutate markers in the right requests:
  ○ First request: includes the start <mutate> element
  ○ Intermediate requests: operations only, no <mutate> markers
  ○ Last request: includes the end </mutate> element

More on the Request Body (Cont.)

● All requests will be parsed as a single document

● The concatenation of all requests has to be a complete XML document

● The size of each request body must be a multiple of 256K (262144) bytes
  ○ This rule does not apply to the last request

A rough sketch of one of these PUT requests is shown below, followed by the concrete byte layout.
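As illustration only (not from the deck), here is a rough sketch of sending a single incremental chunk with java.net.HttpURLConnection, following the request attributes above. The offsets and chunk contents are placeholders, the Content-Range value is formatted as in the examples below (without a "bytes" unit), and intermediate chunks are assumed to be already padded to a multiple of 262144 bytes.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public final class IncrementalChunkUpload {
  // Sends one chunk of the mutate document to the batch job's upload URL.
  static void putChunk(String uploadUrl, byte[] chunk, long startByte,
      boolean isLast, long totalBytes) throws Exception {
    long endByte = startByte + chunk.length - 1;
    String total = isLast ? String.valueOf(totalBytes) : "*"; // "*" until the final request
    HttpURLConnection connection =
        (HttpURLConnection) new URL(uploadUrl).openConnection();
    connection.setDoOutput(true);
    connection.setRequestMethod("PUT");                               // request method: PUT
    connection.setRequestProperty("Content-Type", "application/xml");
    connection.setRequestProperty("Content-Range",
        startByte + "-" + endByte + "/" + total);
    connection.setFixedLengthStreamingMode(chunk.length);             // Content-Length
    try (OutputStream out = connection.getOutputStream()) {
      out.write(chunk);
    }
    System.out.println("Chunk response code: " + connection.getResponseCode());
  }
}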

Request 1

Content-Range: 0-262143/*

<?xml version="1.0" encoding="UTF-8"?>
<ns1:mutate xmlns:ns1="https://adwords.google.com/api/adwords/cm/v201509">
<operations xsi:type="ns1:CampaignOperation">
<operator xsi:type="ns1:Operator">ADD</operator>
<operand xsi:type="ns1:Campaign">…
</operations><operations> …</operat

Content length of 262144, where the "t" in the last line is the 262144th byte.

Request 2

Content-Range: 262144-524287/*

ions>
<operations xsi:type="ns1:AdGroupOperation">
<operator xsi:type="ns1:Operator">ADD</operator>
<operand xsi:type="ns1:AdGroup">...
</operations><operations>...
</ope

Content length of 262144, where the "e" in the last line is the 262144th byte.

Request 3

Content-Range: 524288-524304/524305

rations></mutate>

● Content length without padding is 17 bytes, where the closing > on </mutate> is the 17th byte
● Total content length across all requests for the job is 262144 + 262144 + 17 = 524305 bytes

Use the Client Libraries!

● The client libraries have utilities to do all the parsing

● No need to worry about size details

● Check out the online examples
  ○ https://goo.gl/wgywm1

Best Practices

General Guidelines

Improve Throughput

● Prefer fewer, larger jobs over many smaller jobs

● Use exponential backoff when polling

● Don't poll the job status too frequently
  ○ You might hit a rate limit

Dealing with Same Client ID

● Avoid different jobs working on the same objects
  ○ Might result in deadlocks, followed by execution failures

● Wait for the job to be DONE or CANCELED

One Last Tip...

● Avoid multiple mutates of the same object in the same job

Limitations

Regarding Operations Size

● 1 GB of unfinished operations per account at any given time
  ○ Exceeding it will throw a DISK_QUOTA_EXCEEDED error
  ○ Operations from incremental uploads where the last upload has not occurred do not count towards this limit

● Just wait for some jobs to complete and try again

Regarding Shopping Campaigns

● BatchJobService behaves like partialFailure = true

● Partial failure is not supported for Shopping campaign ad group criteria...

● ...so BatchJobService does not support AdGroupCriterionOperations on ad groups of Shopping campaigns
  ○ These will result in a CAMPAIGN_TYPE_NOT_COMPATIBLE_WITH_PARTIAL_FAILURE error

Cancelling the Job

BatchJob status is read-only:

● You can't cancel a job before the operations upload is finished

● You can't cancel a job while it's executing